Science.gov

Sample records for cluster enrichment analysis

  1. Analysis of High Throughput Screening Assays using Cluster Enrichment

    PubMed Central

    Pu, Minya; Hayashi, Tomoko; Cottam, Howard; Mulvaney, Joseph; Arkin, Michelle; Corr, Maripat; Carson, Dennis; Messer, Karen

    2013-01-01

    In this paper we describe implementation and evaluation of a cluster-based enrichment strategy to call hits from a high-throughput screen (HTS), using a typical cell-based assay of 160,000 chemical compounds. Our focus is on statistical properties of the prospective design choices throughout the analysis, including how to choose the number of clusters for optimal power, the choice of test statistic, the significance thresholds for clusters and the activity threshold for candidate hits, how to rank selected hits for carry-forward to the confirmation screen, and how to identify confirmed hits in a data-driven manner. While previously the literature has focused on choice of test statistic or chemical descriptors, our studies suggest cluster size is the more important design choice. We recommend clusters be ranked by enrichment odds ratio, not p-value. Our conceptually simple test statistic is seen to identify the same set of hits as more complex scoring methods proposed in the literature. We prospectively confirm that such a cluster-based approach can outperform the naive top X approach, and estimate that we improved confirmation rates by about 31.5%, from 813 using the Top X approach to 1187 using our cluster-based method. PMID:22763983
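
    The recommendation to rank clusters by enrichment odds ratio rather than by p-value can be illustrated with a minimal sketch. The abstract does not specify the authors' test statistic, so the one-sided Fisher-type (hypergeometric) test, the 0.5 continuity correction, the function names and the toy counts below are all assumptions made for illustration only.

```python
from math import comb

def cluster_enrichment(active_in, size_in, active_total, size_total):
    """Return (odds_ratio, p_value) for enrichment of actives in one cluster.

    Hypothetical helper: a 2x2 table of active/inactive vs. inside/outside.
    """
    a = active_in                       # actives inside the cluster
    b = size_in - active_in             # inactives inside the cluster
    c = active_total - active_in        # actives outside the cluster
    d = (size_total - size_in) - c      # inactives outside the cluster
    # Adding 0.5 to every cell (an assumed convention) keeps the ratio finite
    # when a cell is zero.
    odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    # One-sided hypergeometric tail: P(X >= a) actives in a cluster of this size.
    upper = min(size_in, active_total)
    tail = sum(
        comb(active_total, k) * comb(size_total - active_total, size_in - k)
        for k in range(a, upper + 1)
    )
    return odds_ratio, tail / comb(size_total, size_in)

def rank_clusters(clusters, active_total, size_total):
    """Sort clusters by odds ratio, largest first; clusters maps name -> (actives, size)."""
    rows = [
        (name, *cluster_enrichment(act, size, active_total, size_total))
        for name, (act, size) in clusters.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Made-up counts, not data from the paper.
toy_clusters = {"A": (8, 20), "B": (3, 50), "C": (5, 10)}
ranking = rank_clusters(toy_clusters, active_total=40, size_total=1000)
```

    In this toy input the small cluster C ranks above the larger cluster A by odds ratio even though A contains more actives, which is the kind of difference between ranking criteria that the abstract refers to.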

  2. Analysis of high-throughput screening assays using cluster enrichment.

    PubMed

    Pu, Minya; Hayashi, Tomoko; Cottam, Howard; Mulvaney, Joseph; Arkin, Michelle; Corr, Maripat; Carson, Dennis; Messer, Karen

    2012-12-30

    In this paper, we describe the implementation and evaluation of a cluster-based enrichment strategy to call hits from a high-throughput screen using a typical cell-based assay of 160,000 chemical compounds. Our focus is on statistical properties of the prospective design choices throughout the analysis, including how to choose the number of clusters for optimal power, the choice of test statistic, the significance thresholds for clusters and the activity threshold for candidate hits, how to rank selected hits for carry-forward to the confirmation screen, and how to identify confirmed hits in a data-driven manner. Whereas previously the literature has focused on choice of test statistic or chemical descriptors, our studies suggest that cluster size is the more important design choice. We recommend clusters to be ranked by enrichment odds ratio, not by p-value. Our conceptually simple test statistic is seen to identify the same set of hits as more complex scoring methods proposed in the literature do. We prospectively confirm that such a cluster-based approach can outperform the naive top X approach and estimate that we improved confirmation rates by about 31.5% from 813 using the top X approach to 1187 using our cluster-based method. Copyright © 2012 John Wiley & Sons, Ltd.

  3. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering

    PubMed Central

    Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that combines gene set enrichment with sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy that groups the samples according to the progress of disease from mild to severe. We evaluate the performance of IGSA on gastric, pancreatic and ovarian cancer data sets. We also compared the KEGG pathway enrichment results of IGSA with those of DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering under different similarity measures. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate changes in pathways related to the severity of disease. In addition, IGSA provides a profile of significant gene sets for each sample. PMID:27764138

  4. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

    PubMed

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that combines gene set enrichment with sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy that groups the samples according to the progress of disease from mild to severe. We evaluate the performance of IGSA on gastric, pancreatic and ovarian cancer data sets. We also compared the KEGG pathway enrichment results of IGSA with those of DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering under different similarity measures. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate changes in pathways related to the severity of disease. In addition, IGSA provides a profile of significant gene sets for each sample.

  5. The self-enrichment of globular clusters

    SciTech Connect

    Morgan, S.; Lake, G.

    1989-04-01

    It is shown that protoglobular clusters of primordial gas can confine the supernovae needed to enrich themselves. The required protocluster cloud masses and structural parameters are the same as those currently observed for the clusters. Two causal scenarios for star formation are examined to calculate the initial enrichment of primordial clouds. In the 'Christmas tree' scheme, the maximum final (Fe/H) is about 0.1. Since the time scale for formation and evolution of massive stars at the center of a cluster is nearly an order of magnitude less than the collapse time of the cluster, every globular cluster may have to survive a supernova detonation. If this is the case, the minimum mass of a globular cluster is about 10^4.6 solar masses. 24 refs.

  6. The self-enrichment of globular clusters

    NASA Astrophysics Data System (ADS)

    Morgan, Siobahn; Lake, George

    1989-04-01

    It is shown that protoglobular clusters of primordial gas can confine the supernovae needed to enrich themselves. The required protocluster cloud masses and structural parameters are the same as those currently observed for the clusters. Two causal scenarios for star formation are examined to calculate the initial enrichment of primordial clouds. In the 'Christmas tree' scheme, the maximum final (Fe/H) is about 0.1. Since the time scale for formation and evolution of massive stars at the center of a cluster is nearly an order of magnitude less than the collapse time of the cluster, every globular cluster may have to survive a supernova detonation. If this is the case, the minimum mass of a globular cluster is about 10^4.6 solar masses.

  7. Assessment of air quality in the coastal area of South Korea using principal component analysis combined with cluster analysis and enrichment factor

    NASA Astrophysics Data System (ADS)

    Yoo, H.; Kwack, W.; Ha, H.; Choe, C.; Kim, Y.; Zoh, K.; Yi, S.

    2009-12-01

    The interpretation of ambient air pollution in relation to meteorological parameters is a topic of great interest for maintaining and managing ambient air quality in the coastal region of South Korea. The objectives of this study were: (i) to locate emission sources qualitatively and quantitatively; (ii) to classify air quality by day according to similar air pollution behavior; and (iii) to evaluate the effect of sea salt on the aerosols. Two statistical techniques, principal component analysis (PCA) and cluster analysis (CA), were applied to the concentrations of five pollutants (PM10, SO2, NO2, CO and O3). The enrichment factor (EF), representing the ratio of Cl/Na in aerosol to Cl/Na in sea water, was calculated from two soluble ions (Na+ and Cl-) in PM10 analyzed by ion chromatography in the coastal urban area of Incheon, South Korea, from January to December 2008. PCA results show three emission sources: (i) high PM10, NO2 and CO with low temperature, radiation and wind speed (35.9%); (ii) high O3 with high radiation and wind speed, and low humidity (18.2%); and (iii) high NO2 and O3 with high temperature and radiation, and low wind speed (11.2%). CA results show three groups: (i) Friday (high PM10 and NO2, low O3); (ii) Sunday (low PM10 and NO2, high O3); and (iii) Monday/Tuesday/Wednesday/Thursday/Saturday (medium PM10, NO2 and O3). EF was 1.02, implying a contribution of sea salt to the aerosol level alongside various anthropogenic sources. In conclusion, PCA and CA are suitable for identifying and estimating the sources of air pollution. EF allows investigation of the effect of sea salt on PM10 in regions with sea-land breezes. It was concluded that the sea wind made an important contribution to the variation of air pollutants. Key words: air pollutants, principal component analysis, cluster analysis, enrichment factor. Fig. 1. Dendrogram using average linkage of cluster analysis for PM10 at the coastal urban area of Incheon, South Korea. Table 1. Main results of PCA.
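
    The enrichment factor defined in this abstract is a ratio of ratios, and a short sketch makes the arithmetic explicit. The function name, the default seawater Cl/Na value (an approximate mass ratio assumed here, not taken from the abstract) and the sample concentrations are illustrative assumptions.

```python
def enrichment_factor(cl_aerosol, na_aerosol, cl_na_seawater=1.8):
    """EF = (Cl/Na in aerosol) / (Cl/Na in sea water).

    cl_aerosol and na_aerosol must share the same units (e.g. ug/m3).
    cl_na_seawater is an assumed approximate mass ratio; replace it with
    the reference value appropriate to the study.
    """
    if na_aerosol <= 0 or cl_na_seawater <= 0:
        raise ValueError("Na concentration and reference ratio must be positive")
    return (cl_aerosol / na_aerosol) / cl_na_seawater

# Made-up concentrations: an aerosol whose Cl/Na ratio matches the reference
# ratio gives EF = 1, i.e. a composition consistent with sea salt.
ef = enrichment_factor(cl_aerosol=3.6, na_aerosol=2.0)
```

    An EF close to 1, such as the value of 1.02 reported above, indicates a Cl/Na ratio close to that of sea water; values well below 1 would indicate chloride depletion relative to sea salt.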

  8. MicroRNA-Target Network Inference and Local Network Enrichment Analysis Identify Two microRNA Clusters with Distinct Functions in Head and Neck Squamous Cell Carcinoma.

    PubMed

    Sass, Steffen; Pitea, Adriana; Unger, Kristian; Hess, Julia; Mueller, Nikola S; Theis, Fabian J

    2015-12-18

    MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method "miRlastic", which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. Finally, the local enrichment step identified two functional clusters of miRNAs that…

  9. Supernova Enrichment of Planetary Systems in Low Mass Star Clusters

    NASA Astrophysics Data System (ADS)

    Nicholson, Rhana; Parker, R.

    2017-06-01

    Short-lived radioactive species have been detected in chondritic meteorites from the early epoch of the Solar system. This implies that the Sun formed in the vicinity of the supernovae of one or more massive stars. Massive stars are more likely to form in massive star clusters (1000 Msun) than lower mass clusters (50-200 Msun). We show that direct enrichment of protoplanetary discs via supernovae occurs as frequently in low mass clusters containing one or two massive stars as in more populous clusters. This significantly relaxes the constraints on the birth environment of the Solar System.

  10. Enriched topological learning for cluster detection and visualization.

    PubMed

    Cabanes, Guénaël; Bennani, Younès; Fresneau, Dominique

    2012-08-01

    The exponential growth of data generates terabytes of very large databases. The growing number of data dimensions and data objects presents tremendous challenges for effective data analysis and data exploration methods and tools. Thus, it becomes crucial to have methods able to construct a condensed description of the properties and structure of data, as well as visualization tools capable of representing the data structure from these condensed descriptions. The purpose of our work described in this paper is to develop a method of describing data from enriched and segmented prototypes using a topological clustering algorithm. We then introduce a visualization tool that can enhance the structure within and between groups in data. We show, using some artificial and real databases, the relevance of the proposed approach. Copyright © 2012 Elsevier Ltd. All rights reserved.

  11. Extremely α-Enriched Globular Clusters in Early-Type Galaxies

    NASA Astrophysics Data System (ADS)

    Puzia, T. H.; Kissler-Patig, M.; Goudfrooij, P.

    2005-12-01

    We compare [α/Fe], metallicity, and age distributions of globular clusters in elliptical, lenticular, and spiral galaxies, which we derive from Lick line index measurements. We find a large number of globular clusters in elliptical galaxies that reach significantly higher [α/Fe] values ([α/Fe] > 0.5) than clusters in lenticular and spiral galaxies. Most of these highly α-enriched globular clusters are old (t > 8 Gyr) and exhibit relatively high metallicities up to solar values. A comparison with supernova yield models suggests that the progenitor gas clouds of these globular clusters were predominantly enriched by massive stars (⪆ 20 M⊙) with little contribution from lower-mass stars. The measured [α/Fe] ratios are also consistent with yields of very massive pair-instability supernovae (~130-190 M⊙). This implies that the chemical enrichment of the progenitor gas was completed on extremely short timescales of the order of a few Myr. Given the lower [α/Fe] ratios of the diffuse stellar population in early-type galaxies, our results suggest that the extremely α-enhanced globular clusters are members of the very first generation of star clusters formed, and that their formation epochs likely predate the formation of the majority of stars in giant early-type galaxies.

  12. Implementing Enrichment Clusters in Elementary Schools: Lessons Learned

    ERIC Educational Resources Information Center

    Fiddyment, Gail E.

    2014-01-01

    Enrichment clusters offer a way for schools to encourage a high level of learning as students and adults work together to develop a product, service, or performance by applying advanced knowledge and authentic processes to real-world problems. This study utilized a qualitative research design to examine the perceptions and experiences of two…

  14. Tissue enrichment analysis for C. elegans genomics.

    PubMed

    Angeles-Albores, David; N Lee, Raymond Y; Chan, Juancarlos; Sternberg, Paul W

    2016-09-13

    Over the last ten years, there has been explosive development in methods for measuring gene expression. These methods can identify thousands of genes altered between conditions, but understanding these datasets and forming hypotheses based on them remains challenging. One way to analyze these datasets is to associate ontologies (hierarchical, descriptive vocabularies with controlled relations between terms) with genes and to look for enrichment of specific terms. Although Gene Ontology (GO) is available for Caenorhabditis elegans, it does not include anatomical information. We have developed a tool for identifying enrichment of C. elegans tissues among gene sets and generated a website GUI where users can access this tool. Since a common drawback of ontology enrichment analyses is their verbosity, we developed a very simple filtering algorithm to reduce the ontology size by an order of magnitude. We adjusted these filters and validated our tool using a set of 30 gold standards from Expression Cluster data in WormBase. We show that our tool can even discriminate between embryonic and larval tissues and can identify tissues down to the single-cell level. We used our tool to identify multiple neuronal tissues that are down-regulated due to pathogen infection in C. elegans. Our Tissue Enrichment Analysis (TEA) can be found within WormBase, and can be downloaded using Python's standard pip installer. It tests a slimmed-down C. elegans tissue ontology for enrichment of specific terms and provides users with a text and graphic representation of the results.
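
    Testing an ontology for enrichment of specific terms is commonly done with an over-representation test per term followed by a multiple-testing correction. The abstract does not state which test TEA uses, so the hypergeometric test, the Benjamini-Hochberg adjustment, the function names and the toy term counts below are assumptions for illustration and do not reproduce the TEA package API.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q

def enriched_terms(term_counts, N, n, alpha=0.05):
    """term_counts maps term -> (K genes annotated overall, k annotated in the query set)."""
    terms = list(term_counts)
    pvals = [hypergeom_sf(term_counts[t][1], N, term_counts[t][0], n) for t in terms]
    qvals = benjamini_hochberg(pvals)
    return {t: q for t, q in zip(terms, qvals) if q < alpha}

# Made-up annotation counts for three hypothetical tissue terms.
toy_terms = {"neuron": (200, 30), "intestine": (500, 12), "muscle": (300, 6)}
hits = enriched_terms(toy_terms, N=20000, n=400)
```

    Filtering the ontology before testing, as the abstract describes, reduces the number of terms and therefore the severity of the multiple-testing correction.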

  15. Enrichment by supernovae in globular clusters with multiple populations.

    PubMed

    Lee, Jae-Woo; Kang, Young-Woon; Lee, Jina; Lee, Young-Wook

    2009-11-26

    The most massive globular cluster in the Milky Way, omega Centauri, is thought to be the remaining core of a disrupted dwarf galaxy, as expected within the model of hierarchical merging. It contains several stellar populations having different heavy elemental abundances supplied by supernovae, a process known as metal enrichment. Although M 22 appears to be similar to omega Cen, other peculiar globular clusters do not. Therefore omega Cen and M 22 are viewed as exceptional, and the presence of chemical inhomogeneities in other clusters is seen as 'pollution' from the intermediate-mass asymptotic-giant-branch stars expected in normal globular clusters. Here we report Ca abundances for seven globular clusters and compare them to omega Cen. Calcium and other heavy elements can only be supplied through numerous supernova explosions of massive stars in these stellar systems, but the gravitational potentials of the present-day clusters cannot preserve most of the ejecta from such explosions. We conclude that these globular clusters, like omega Cen, are most probably the relics of more massive primeval dwarf galaxies that merged and disrupted to form the proto-Galaxy.

  16. Chemical enrichment of galaxy clusters from hydrodynamical simulations

    NASA Astrophysics Data System (ADS)

    Tornatore, L.; Borgani, S.; Dolag, K.; Matteucci, F.

    2007-12-01

    We present cosmological hydrodynamical simulations of galaxy clusters aimed at studying the process of metal enrichment of the intra-cluster medium (ICM). These simulations have been performed by implementing a detailed model of chemical evolution in the TREE-PM+SPH GADGET-2 code. This model allows us to follow the metal release from Type II supernovae (SNII), Type Ia supernovae (SNIa) and asymptotic giant branch (AGB) stars by properly accounting for the lifetimes of stars of different mass, as well as to change the stellar initial mass function (IMF), the lifetime function and the stellar yields. As such, our implementation of chemical evolution represents a powerful instrument to follow the cosmic history of metal production. The simulations presented here have been performed with the twofold aim of checking numerical effects, as well as the impact of changing the model of chemical evolution and the efficiency of stellar feedback. In general, we find that the distribution of metals produced by SNII is more clumpy than for the product of low-mass stars, as a consequence of the different time-scales over which they are released. Using a standard Salpeter IMF produces a radial profile of iron abundance which is in fairly good agreement with observations available out to ≃0.6R500. This result holds almost independent of the numerical scheme adopted to distribute metals around star-forming regions. The mean age of enrichment of the ICM corresponds to redshift z ~ 0.5, which progressively increases outside the virial region. Increasing resolution, we improve the description of a diffuse high-redshift enrichment of the inter-galactic medium (IGM). This turns into a progressively more efficient enrichment of the cluster outskirts, while having a smaller impact at R ≲ 0.5R500. As for the effect of the model of chemical evolution, we find that changing the IMF has the strongest impact. Using an IMF, which is top-heavier than the Salpeter one, provides a larger iron…

  17. Cluster Morphology Analysis

    PubMed Central

    Jacquez, Geoffrey M.

    2009-01-01

    Most disease clustering methods assume specific shapes and do not evaluate statistical power using the applicable geography, at-risk population, and covariates. Cluster Morphology Analysis (CMA) conducts power analyses of alternative techniques assuming clusters of different relative risks and shapes. Results are ranked by statistical power and false positives, under the rationale that surveillance should (1) find true clusters while (2) avoiding false clusters. CMA then synthesizes results of the most powerful methods. CMA was evaluated in simulation studies and applied to pancreatic cancer mortality in Michigan, and finds clusters of flexible shape while routinely evaluating statistical power. PMID:20234799

  18. Pathway Enrichment Analysis with Networks.

    PubMed

    Liu, Lu; Wei, Jinmao; Ruan, Jianhua

    2017-09-28

    Detecting associations between an input gene set and annotated gene sets (e.g., pathways) is an important problem in modern molecular biology. In this paper, we propose two algorithms, termed NetPEA and NetPEA', for conducting network-based pathway enrichment analysis. Our algorithms consider not only shared genes but also gene-gene interactions. Both algorithms utilize a protein-protein interaction network and a random walk with a restart procedure to identify hidden relationships between an input gene set and pathways, but both use different randomization strategies to evaluate statistical significance and as a result emphasize different pathway properties. Compared to an over representation-based method, our algorithms can identify more statistically significant pathways. Compared to an existing network-based algorithm, EnrichNet, our algorithms have a higher sensitivity in revealing the true causal pathways while at the same time achieving a higher specificity. A literature review of selected results indicates that some of the novel pathways reported by our algorithms are biologically relevant and important. While the evaluations are performed only with KEGG pathways, we believe the algorithms can be valuable for general functional discovery from high-throughput experiments.
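
    The random walk with restart mentioned in this abstract can be sketched on a small undirected network. This is a generic power-iteration version under stated assumptions: the function name, the restart probability of 0.5 and the four-node toy graph are illustrative, and the randomization strategies that distinguish NetPEA from NetPEA' are not reproduced.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at the seeds.

    adj maps each node to a list of neighbours (assumed symmetric, i.e. undirected);
    seeds is a non-empty set of nodes standing in for the input gene set.
    """
    nodes = sorted(adj)
    seeds = set(seeds)
    if not seeds or not seeds <= set(nodes):
        raise ValueError("seeds must be a non-empty subset of the network nodes")
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the mass always returns to the seeds.
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                # Isolated node: send its walking mass back to the seeds.
                for v in nodes:
                    nxt[v] += (1.0 - restart) * p[u] * p0[v]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Toy path network A - B - C - D with the walk seeded at A.
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy_network, seeds={"A"})
```

    Nodes close to the seed set receive higher scores than distant ones, so a pathway whose genes accumulate a high total score can be related to the input gene set even when the two share no genes.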

  19. Cluster Correspondence Analysis.

    PubMed

    van de Velden, M; D'Enza, A Iodice; Palumbo, F

    2017-03-01

    A method is proposed that combines dimension reduction and cluster analysis for categorical data by simultaneously assigning individuals to clusters and optimal scaling values to categories in such a way that a single between variance maximization objective is achieved. In a unified framework, a brief review of alternative methods is provided and we show that the proposed method is equivalent to GROUPALS applied to categorical data. Performance of the methods is appraised by means of a simulation study. The results of the joint dimension reduction and clustering methods are compared with the so-called tandem approach, a sequential analysis of dimension reduction followed by cluster analysis. The tandem approach is conjectured to perform worse when variables are added that are unrelated to the cluster structure. Our simulation study confirms this conjecture. Moreover, the results of the simulation study indicate that the proposed method also consistently outperforms alternative joint dimension reduction and clustering methods.

  20. Chemical Enrichment RGS cluster Sample (CHEERS): Constraints on turbulence

    NASA Astrophysics Data System (ADS)

    Pinto, Ciro; Sanders, Jeremy S.; Werner, Norbert; de Plaa, Jelle; Fabian, Andrew C.; Zhang, Yu-Ying; Kaastra, Jelle S.; Finoguenov, Alexis; Ahoranta, Jussi

    2015-03-01

    Context: Feedback from active galactic nuclei, galactic mergers, and sloshing are thought to give rise to turbulence, which may prevent cooling in clusters. Aims: We aim to measure the turbulence in clusters of galaxies and compare the measurements to some of their structural and evolutionary properties. Methods: It is possible to measure the turbulence of the hot gas in clusters by estimating the velocity widths of their X-ray emission lines. The Reflection Grating Spectrometers aboard XMM-Newton are currently the only instruments provided with sufficient effective area and spectral resolution in this energy domain. We benefited from excellent 1.6 Ms new data provided by the Chemical Enrichment RGS cluster Sample (CHEERS) project. Results: The new observations improve the quality of the archival data and allow us to place constraints for some clusters, which were not accessible in previous work. One-half of the sample shows upper limits on turbulence less than 500 km s⁻¹. For several sources, our data are consistent with relatively strong turbulence with upper limits on the velocity widths that are larger than 1000 km s⁻¹. The NGC 507 group of galaxies shows transonic velocities, which are most likely associated with the merging phenomena and bulk motions occurring in this object. Where both low- and high-ionization emission lines have good enough statistics, we find larger upper limits for the hot gas, which is partly due to the different spatial extents of the hot and cool gas phases. Our upper limits are larger than the Mach numbers required to balance cooling, suggesting that dissipation of turbulence may prevent cooling, although other heating processes could be dominant. The systematics associated with the spatial profile of the source continuum make this technique very challenging, though still powerful, for current instruments. In a forthcoming paper we will use the resonant-scattering technique to place lower limits on the velocity broadening and provide…

  1. Enrichment Clusters: A Practical Plan for Real-World, Student-Driven Learning.

    ERIC Educational Resources Information Center

    Renzulli, Joseph S.; Gentry, Marcia; Reis, Sally M.

    This guidebook provides a rationale and guidelines for implementing a student-driven learning approach using enrichment clusters. Enrichment clusters allow students who share a common interest to meet each week to produce a product, performance, or targeted service based on that common interest. Chapter 1 discusses different models of learning.…

  2. Probing the Large Magellanic Cloud's recent chemical enrichment history through its star clusters

    NASA Astrophysics Data System (ADS)

    Palma, T.; Clariá, J. J.; Geisler, D.; Gramajo, L. V.; Ahumada, A. V.

    2015-06-01

    We present Washington system colour-magnitude diagrams (CMDs) for 17 practically unstudied star clusters located in the bar as well as in the inner disc and outer regions of the Large Magellanic Cloud (LMC). Cluster sizes were estimated from star counts distributed throughout the entire observed fields. Based on the best fits of theoretical isochrones to the cleaned (C - T1, T1) CMDs, as well as on the δT1 parameter and the standard giant branch method, we derive ages and metallicities for the cluster sample. Four objects are found to be intermediate-age clusters (1.8-2.5 Gyr), with [Fe/H] ranging from -0.66 to -0.84. With the exception of SL 263, a very young cluster (~16 Myr), the remaining 12 objects are aged between 0.32 and 0.89 Gyr, with their [Fe/H] values ranging from -0.19 to -0.50. We combined our results with those for 231 other clusters studied in a similar way using the Washington system. The resulting age-metallicity relationship shows a significant dispersion in metallicities, whatever age is considered. Although there seems to exist a clear tendency for the younger clusters to be more metal rich than the intermediate ones, we believe that none of the chemical evolution models currently available in the literature represents the recent chemical enrichment processes in the LMC clusters reasonably well. The present sample of 17 clusters is part of our ongoing project of generating a database of LMC clusters homogeneously studied using the Washington photometric system and applying the same analysis procedure.

  3. A Detailed Study of Chemical Enrichment History of Galaxy Clusters out to Virial Radius

    NASA Astrophysics Data System (ADS)

    Loewenstein, Michael

    The origin of the metal enrichment of the intracluster medium (ICM) represents a fundamental problem in extragalactic astrophysics, with implications for our understanding of how stars and galaxies form, the nature of Type Ia supernova (SNIa) progenitors, and the thermal history of the ICM. These heavy elements are ultimately synthesized by supernova (SN) explosions; however, the details of the sites of metal production and mechanisms that transport metals to the ICM remain unclear. To make progress, accurate abundance profiles for multiple elements extending from the cluster core out to the virial radius (r180) are required for a significant cluster sample. We propose an X-ray spectroscopic study of a carefully-chosen sample of archival Suzaku and XMM-Newton observations of 23 clusters: XMM-Newton data probe the cluster temperature and abundances out to (0.5-1)r500, while Suzaku data probe the cluster outskirts. A method devised by our team to utilize all elements with emission lines in the X-ray bandpass to measure the relative contributions of supernova explosions by direct modeling of their X-ray spectra will be applied in order to constrain the demographics of the enriching supernova population. In addition we will conduct a stacking analysis of our already existing Suzaku and XMM-Newton cluster spectra to search for weak emission lines that are important SN diagnostics, and to look for trends with cluster mass and redshift. The funding we propose here will also support the data analysis of our recent Suzaku observations of the archetypal cluster A3112 (200 ks each on the core and outskirts). Our data analysis, interpreted using theoretical models we have developed, will enable us to constrain the star formation history, SN demographics, and nature of SNIa progenitors associated with galaxy cluster stellar populations and, hence, directly addresses NASA's Strategic Objective 2.4.2 in Astrophysics that aims to improve the understanding of how the Universe works…

  4. A general framework for a reliable multivariate analysis and pattern recognition in high-dimensional epidemiological data, based on cluster robustness: a tutorial to enrich the epidemiologists' toolkit.

    PubMed

    Lefèvre, T; Chauvin, P

    2015-02-01

    In an epidemiologist's toolbox, three main types of statistical tools can be found: comparisons of means and proportions, linear or logistic regression models and Cox-type regression models. All these techniques have their own multivariate formulations, so that biases can be accounted for. Nonetheless, there is an entire set of natively massive multivariate techniques, which are based on weaker assumptions than classical statistical techniques are, and which seem to be underestimated or remain unknown to most epidemiologists. These techniques are used for pattern recognition or clustering – that is, for retrieving homogeneous groups in data without any a priori assumptions about these groups. They are widely used in related domains such as genetics or biomolecular studies. Most clustering techniques require tuning specific parameters so that groups can be identified in data. A critical parameter to set is the number of groups the technique needs to discover. Different approaches to find the optimal number of groups are available, such as the silhouette approach and the robustness approach. This article presents the key aspects of clustering techniques (how proximity between observations is defined and how to find the number of groups), two archetypal techniques (namely the k-means and PAM algorithms) and how they relate to more classical statistical approaches. Through a theoretical, simple example and a real data application, we provide a complete framework within which classical epidemiological concerns can be reconsidered. We show how to (i) identify whether distinct groups exist in data, (ii) identify the optimal number of groups in data, (iii) label each observation according to its own group and (iv) analyze the groups identified according to separate and explicative data. In addition, how to achieve consistent results while removing sensitivity to initial conditions is explained. Clustering techniques, in conjunction with methods for parameter tuning, provide the
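
    The silhouette approach named in this abstract can be illustrated with a minimal sketch in Python (standard library only). The function name `silhouette_score`, the Euclidean distance, and the convention of scoring singleton clusters as 0 are assumptions made for this illustration; they are not taken from the article.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to the members of a cluster, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention: a singleton cluster scores 0
            continue
        a = mean_dist(i, own)  # cohesion: mean distance within the point's own cluster
        b = min(mean_dist(i, members)  # separation: nearest other cluster
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        widths.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(widths) / len(widths)

# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_groups = [0, 0, 0, 1, 1, 1]
three_groups = [0, 0, 2, 1, 1, 1]  # artificially splits the first group
```

    Comparing `silhouette_score(toy_points, two_groups)` with `silhouette_score(toy_points, three_groups)` favours the two-group labelling, which is how the score is typically used to select the number of groups.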

  5. Isotope enrichment during the formation of water clusters in supersonic free jet expansions.

    SciTech Connect

    Kay, B.D.; Castleman, A.W. Jr.

    1983-03-15

    Presently, investigation of the dynamics, energetics, and structure of microscopic molecular clusters constitutes an active area of research in chemical physics. Herein the RRKM theory of primary isotope effects is extended to qualitatively predict isotope enrichment in water cluster formation. The theoretical model is verified experimentally by neutral free jet expansion modulated molecular beam mass spectrometry of mixed (H2O)m(D2O)n clusters. The results further support the previously presented mechanism for neutral cluster growth in free jet expansions. The observed enrichment factors (approx. 30%) suggest that techniques involving clustering may find practical applications in the area of isotope separation.

  6. ANISOTROPIC METAL-ENRICHED OUTFLOWS DRIVEN BY ACTIVE GALACTIC NUCLEI IN CLUSTERS OF GALAXIES

    SciTech Connect

    Kirkpatrick, C. C.; McNamara, B. R.; Cavagnolo, K. W.

    2011-04-20

    We present an analysis of the spatial distribution of metal-rich gas in 10 galaxy clusters using deep observations from the Chandra X-ray Observatory. The brightest cluster galaxies (BCGs) have experienced recent active galactic nucleus activity in the forms of bright radio emission, cavities, and shock fronts embedded in the hot atmospheres. The heavy elements are distributed anisotropically and are aligned with the large-scale radio and cavity axes. They are apparently being transported from the halo of the BCG into the intracluster medium along large-scale outflows driven by the radio jets. The radial ranges of the metal-enriched outflows are found to scale with jet power as R_Fe ∝ P_jet^0.42, with a scatter of only 0.5 dex. The heavy elements are transported beyond the extent of the inner cavities in all clusters, suggesting that this is a long-lasting effect sustained over multiple generations of outbursts. Black holes in BCGs will likely have difficulty ejecting metal-enriched gas beyond 1 Mpc unless their masses substantially exceed 10^9 M⊙.
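
    To give a feel for the power-law scaling of outflow radius with jet power quoted in this abstract (exponent 0.42), the short Python sketch below computes how the radius changes for a given change in jet power. Only the exponent comes from the abstract; the function name `radius_ratio` is hypothetical, and the 0.5 dex scatter is deliberately not modelled.

```python
JET_POWER_EXPONENT = 0.42  # exponent quoted in the abstract

def radius_ratio(jet_power_ratio, exponent=JET_POWER_EXPONENT):
    """Factor by which the outflow radius changes when jet power changes by jet_power_ratio."""
    if jet_power_ratio <= 0:
        raise ValueError("jet power ratio must be positive")
    return jet_power_ratio ** exponent

# A tenfold increase in jet power corresponds to a radius larger by 10**0.42.
tenfold = radius_ratio(10.0)
```

    Because the exponent is well below 1, the radius grows much more slowly than the jet power: a tenfold increase in power raises the radius by a factor between 2 and 3.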

  7. Comprehensive cluster analysis with Transitivity Clustering.

    PubMed

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

    2011-03-01

    Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.

  8. Enriching Planning through Industry Analysis

    ERIC Educational Resources Information Center

    Martinez, Mario; Wolverton, Mimi

    2009-01-01

    Strategic planning is an important tool, but the sole dependence on it across departments and campuses has resulted in the underutilization of equally important methods of analysis. The evolution of higher and postsecondary education necessitates a systemic industry analysis, as the combination of new providers and delivery mechanisms and changing…

  10. Enrichment analysis applied to disease prognosis

    PubMed Central

    2013-01-01

    Enrichment analysis is well established in the field of transcriptomics, where it is used to identify relevant biological features that characterize a set of genes obtained in an experiment. This article proposes the application of enrichment analysis as a first step in a disease prognosis methodology, in particular of diseases with a strong genetic component. With this analysis the objective is to identify clinical and biological features that characterize groups of patients with a common disease, and that can be used to distinguish between groups of patients associated with disease-related events. Data mining methodologies can then be used to exploit those features, and assist medical doctors in the evaluation of the patients in respect to their predisposition for a specific event. In this work the disease hypertrophic cardiomyopathy (HCM) is used as a case-study, as a first test to assess the feasibility of the application of an enrichment analysis to disease prognosis. To perform this assessment, two groups of patients have been considered: patients that have suffered a sudden cardiac death episode and patients that have not. The results presented were obtained with genetic data and the Gene Ontology, in two enrichment analyses: an enrichment profiling aiming at characterizing a group of patients (e.g. that suffered a disease-related event) based on their mutations; and a differential enrichment aiming at identifying differentiating features between a sub-group of patients and all the patients with the disease. These analyses correspond to an adaptation of the standard enrichment analysis, since multiple sets of genes are being considered, one for each patient. The preliminary results are promising, as the sets of terms obtained reflect the current knowledge about the gene functions commonly altered in HCM patients, thus allowing their characterization. Nevertheless, some factors need to be taken into consideration before the full potential of the enrichment
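
    The abstract does not state which statistical test underlies its enrichment analyses. As an illustration only, the sketch below uses the one-sided hypergeometric test, a common choice for asking whether an annotation term is over-represented in a selected gene set. The function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_annotated, n_selected, n_overlap):
    """One-sided p-value P(X >= n_overlap) for term over-representation.

    n_universe  -- total number of genes considered
    n_annotated -- genes in the universe carrying the annotation term
    n_selected  -- size of the selected gene set
    n_overlap   -- selected genes that carry the term
    """
    max_overlap = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the hypergeometric probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical counts: 5 of 20 genes carry the term, and all 5 selected genes carry it.
p_value = hypergeom_enrichment_p(20, 5, 5, 5)
```

    A small p-value indicates that the term occurs in the selected set more often than expected by chance. When many terms are tested, as with the Gene Ontology, a multiple-testing correction would normally be applied on top of this.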

  11. [Cluster analysis in biomedical researches].

    PubMed

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

    2013-01-01

    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data and groups separate observations by their degree of similarity. The review provides a definition of the basic concepts of cluster analysis and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given.

  12. Uranium enrichment management review: summary of analysis

    SciTech Connect

    Not Available

    1981-01-01

    In May 1980, the Assistant Secretary for Resource Applications within the Department of Energy requested that a group of experienced business executives be assembled to review the operation, financing, and management of the uranium enrichment enterprise as a basis for advising the Secretary of Energy. After extensive investigation, analysis, and discussion, the review group presented its findings and recommendations in a report on December 2, 1980. The following pages contain background material on which that final report was based. This report is arranged in chapters that parallel those of the uranium enrichment management review final report - chapters that contain summaries of the review group's discussion and analyses in six areas: management of operations and construction; long-range planning; marketing of enrichment services; financial management; research and development; and general management. Further information, in-depth analysis, and discussion of suggested alternative management practices are provided in five appendices.

  13. Supernova enrichment of planetary systems in low-mass star clusters

    NASA Astrophysics Data System (ADS)

    Nicholson, Rhana B.; Parker, Richard J.

    2017-02-01

    The presence and abundance of short-lived radioisotopes 26Al and 60Fe in chondritic meteorites implies that the Sun formed in the vicinity of one or more massive stars that exploded as supernovae (SNe). Massive stars are more likely to form in massive star clusters (>1000 M⊙) than lower mass clusters. However, photoevaporation of protoplanetary discs from massive stars and dynamical interactions with passing stars can inhibit planet formation in clusters with radii of ~1 pc. We investigate whether low-mass (50-200 M⊙) star clusters containing one or two massive stars are a more likely avenue for early Solar system enrichment as they are more dynamically quiescent. We analyse N-body simulations of the evolution of these low-mass clusters and find that a similar fraction of stars experience SN enrichment as in high-mass clusters, despite their lower densities. This is due to two-body relaxation, which causes a significant expansion before the first SN even in clusters with relatively low (100 stars pc^-3) initial densities. However, because of the high number of low-mass clusters containing one or two massive stars, the absolute number of enriched stars is the same as, if not higher than, for more populous clusters. Our results show that direct enrichment of protoplanetary discs from SNe occurs as frequently in low-mass clusters containing one or two massive stars (>20 M⊙) as in more populous star clusters (1000 M⊙). This relaxes the constraints on the direct enrichment scenario and therefore the birth environment of the Solar system.

  14. Relation chain based clustering analysis

    NASA Astrophysics Data System (ADS)

    Zhang, Cheng-ning; Zhao, Ming-yang; Luo, Hai-bo

    2011-08-01

    Clustering analysis is currently one of the well-developed branches of data mining technology; it is intended to find hidden structures in the multidimensional space called feature or pattern space. A datum in the space usually takes a vector form, and the elements of the vector represent several specifically selected features. These features are often effective for the problem at hand. Generally, clustering analysis falls into two divisions: one is based on the agglomerative clustering method, and the other is based on the divisive clustering method. The former refers to a bottom-up process which regards each datum as a singleton cluster, while the latter refers to a top-down process which regards the entire data set as a cluster. From the collected literature, it is noted that divisive clustering currently predominates both in application and in research. Although some famous divisive clustering methods are designed and well developed, clustering problems are still far from being solved. The k-means algorithm is the original divisive clustering method; it requires initial assignment of some important index values, such as the number of clusters and the initial cluster prototype positions, which may not be reasonable on some occasions. Beyond the initialization problem, the k-means algorithm may also fall into a local optimum, clusters in a rigid way, and is not suitable for non-Gaussian distributions. One can see that seeking a good or natural clustering result, in fact, originates from one's understanding of the concept of clustering. Thus, confusion or misunderstanding about the definition of clustering always leads to unsatisfactory clustering results. One should consider the definition deeply and seriously. This paper demonstrates the nature of clustering, gives a way of understanding clustering, discusses the methodology of designing a clustering algorithm, and proposes a new clustering method based on relation chains among 2D patterns. In
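
    The bottom-up (agglomerative) process described in this abstract, in which every datum starts as a singleton cluster, can be sketched as follows. This is a generic single-linkage illustration in Python, not the relation-chain method the paper proposes. The function name `single_linkage` and the toy 2D patterns are assumptions.

```python
import math

def single_linkage(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # Bottom-up start: every datum is its own singleton cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge y into x
        del clusters[y]
    return [sorted(c) for c in clusters]

# Hypothetical 2D patterns: two tight pairs and one isolated point.
patterns = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
groups = single_linkage(patterns, 3)
```

    Unlike k-means, this procedure needs no initial prototype positions, although the number of clusters at which to stop merging still has to be chosen.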

  15. Preparation of magnetic hydroxyapatite clusters and their application in the enrichment of phosphopeptides.

    PubMed

    Yu, Qiao; Li, Xiao-Shui; Yuan, Bi-Feng; Feng, Yu-Qi

    2014-03-01

    A novel strategy for the effective enrichment of phosphopeptides based on magnetic hydroxyapatite (HAp) clusters was developed in the current study. The structure of HAp ensures its probable separation capability, including cation exchange with P-sites (negatively charged pairs of crystal phosphates), calcium coordination, and anion exchange with C-sites (positively charged pairs of crystal calcium ions). The prepared magnetic HAp clusters showed good performance on the efficient enrichment of phosphopeptides from the digestion mixture of β-casein and BSA. Compared to commercial HAp particles, the magnetic HAp clusters exhibited better selectivity toward phosphopeptides. In addition, the use of magnetic material greatly simplified the enrichment procedure, which avoided the tedious centrifugation steps in a typical phosphopeptides enrichment protocol. Finally, the material was successfully applied in the enrichment of phosphopeptides from human serum. Taken together, the efficient enrichment of the phosphopeptides by the easily prepared magnetic HAp clusters demonstrated a rapid and convenient strategy for the purification of phosphopeptides from complex samples, which may facilitate protein phosphorylation studies.

  16. STAR CLUSTERS IN M31. V. EVIDENCE FOR SELF-ENRICHMENT IN OLD M31 CLUSTERS FROM INTEGRATED SPECTROSCOPY

    SciTech Connect

    Schiavon, Ricardo P.; Caldwell, Nelson; Conroy, Charlie; Graves, Genevieve J.; Strader, Jay; MacArthur, Lauren A.; Courteau, Stéphane; Harding, Paul

    2013-10-10

    In the past decade, the notion that globular clusters (GCs) are composed of coeval stars with homogeneous initial chemical compositions has been challenged by growing evidence that they host an intricate stellar population mix, likely indicative of a complex history of star formation and chemical enrichment. Several models have been proposed to explain the existence of multiple stellar populations in GCs, but no single model provides a fully satisfactory match to existing data. Correlations between chemistry and global parameters such as cluster mass or luminosity are fundamental clues to the physics of GC formation. In this Letter, we present an analysis of the mean abundances of Fe, Mg, C, N, and Ca for 72 old GCs from the Andromeda galaxy. We show for the first time that there is a correlation between the masses of GCs and the mean stellar abundances of nitrogen, spanning almost two decades in mass. This result sheds new light on the formation of GCs, providing important constraints on their internal chemical evolution and mass loss history.

  17. Hemagglutinin Clusters in the Plasma Membrane Are Not Enriched with Cholesterol and Sphingolipids

    DOE PAGES

    Wilson, Robert L.; Frisz, Jessica F.; Klitzing, Haley A.; ...

    2015-04-07

    The clusters of the influenza envelope protein, hemagglutinin, within the plasma membrane are hypothesized to be enriched with cholesterol and sphingolipids. Here in this paper, we directly tested this hypothesis by using high-resolution secondary ion mass spectrometry to image the distributions of antibody-labeled hemagglutinin and isotope-labeled cholesterol and sphingolipids in the plasma membranes of fibroblast cells that stably express hemagglutinin. We found that the hemagglutinin clusters were neither enriched with cholesterol nor colocalized with sphingolipid domains. Thus, hemagglutinin clustering and localization in the plasma membrane is not controlled by cohesive interactions between hemagglutinin and liquid-ordered domains enriched with cholesterol and sphingolipids, or from specific binding interactions between hemagglutinin, cholesterol, and/or the majority of sphingolipid species in the plasma membrane.

  18. Hemagglutinin Clusters in the Plasma Membrane Are Not Enriched with Cholesterol and Sphingolipids

    PubMed Central

    Wilson, Robert L.; Frisz, Jessica F.; Klitzing, Haley A.; Zimmerberg, Joshua; Weber, Peter K.; Kraft, Mary L.

    2015-01-01

    The clusters of the influenza envelope protein, hemagglutinin, within the plasma membrane are hypothesized to be enriched with cholesterol and sphingolipids. Here, we directly tested this hypothesis by using high-resolution secondary ion mass spectrometry to image the distributions of antibody-labeled hemagglutinin and isotope-labeled cholesterol and sphingolipids in the plasma membranes of fibroblast cells that stably express hemagglutinin. We found that the hemagglutinin clusters were neither enriched with cholesterol nor colocalized with sphingolipid domains. Thus, hemagglutinin clustering and localization in the plasma membrane is not controlled by cohesive interactions between hemagglutinin and liquid-ordered domains enriched with cholesterol and sphingolipids, or from specific binding interactions between hemagglutinin, cholesterol, and/or the majority of sphingolipid species in the plasma membrane. PMID:25863057

  19. Globular clusters in the far-ultraviolet: evidence for He-enriched second populations in extragalactic globular clusters?

    NASA Astrophysics Data System (ADS)

    Peacock, Mark B.; Zepf, Stephen E.; Kundu, Arunav; Chael, Julia

    2017-01-01

    We investigate the integrated far-ultraviolet (FUV) emission from globular clusters. We present new FUV photometry of M87's clusters based on archival Hubble Space Telescope (HST) Wide Field Planetary Camera 2 F170W observations. We use these data to test the reliability of published photometry based on HST space telescope imaging spectrograph FUV-MAMA observations, which are now known to suffer from significant red-leak. We generally confirm these previous FUV detections, but suggest they may be somewhat fainter. We compare the FUV emission from bright (MV < -9.0) clusters in the Milky Way, M31, M81 and M87 to each other and to the predictions from stellar populations models. Metal-rich globular clusters show a large spread in FUV - V, with some clusters in M31, M81 and M87 being much bluer than standard predictions. This requires that some metal-rich clusters host a significant population of blue/extreme horizontal branch (HB) stars. These hot HB stars are not traditionally expected in metal-rich environments, but are a natural consequence of multiple populations in clusters - since the enriched population is observed to be He enhanced and will therefore produce bluer HB stars, even at high metallicity. We conclude that the observed FUV emission from metal-rich clusters in M31, M81 and M87 provides evidence that He-enhanced second populations, similar to those observed directly in the Milky Way, may be a ubiquitous feature of globular clusters in the local Universe. Future HST FUV photometry is required to both confirm our interpretation of these archival data and provide constraints on He-enriched second populations of stars in extragalactic globular clusters.

  20. The colour-magnitude relation of globular clusters in Centaurus and Hydra. Constraints on star cluster self-enrichment with a link to massive Milky Way globular clusters

    NASA Astrophysics Data System (ADS)

    Fensch, J.; Mieske, S.; Müller-Seidlitz, J.; Hilker, M.

    2014-07-01

    Aims: We investigate the colour-magnitude relation of metal-poor globular clusters, the so-called blue tilt, in the Hydra and Centaurus galaxy clusters and constrain the primordial conditions for star cluster self-enrichment. Methods: We analyse U,I photometry for about 2500 globular clusters in the central regions of Hydra and Centaurus, based on VLT/FORS1 data. We measure the relation between mean colour and luminosity for the blue and red subpopulation of the globular cluster samples. We convert these relations into mass-metallicity space and compare the obtained GC mass-metallicity relation with predictions from the star cluster self-enrichment model by Bailin & Harris (2009, ApJ, 695, 1082). For this we include effects of dynamical and stellar evolution and a physically well motivated primordial mass-radius scaling. Results: We obtain a mass-metallicity scaling of Z ∝ M^(0.27 ± 0.05) for Centaurus GCs and Z ∝ M^(0.40 ± 0.06) for Hydra GCs, consistent with the range of observed relations in other environments. We find that the GC mass-metallicity relation already sets in at present-day masses of a few and is well established in the luminosity range of massive MW clusters like ω Centauri. The inclusion of a primordial mass-radius scaling of star clusters significantly improves the fit of the self-enrichment model to the data. The self-enrichment model accurately reproduces the observed relations for average primordial half-light radii rh ~ 1-1.5 pc, star formation efficiencies f⋆ ~ 0.3-0.4, and pre-enrichment levels of [Fe/H] - 1.7 dex. The slightly steeper blue tilt for Hydra can be explained either by a ~30% smaller average rh at fixed f⋆ ~ 0.3, or analogously by a ~20% smaller f⋆ at fixed rh ~ 1.5 pc. Within the self-enrichment scenario, the observed blue tilt implies a correlation between GC mass and width of the stellar metallicity distribution. We find that this implied correlation matches the trend of width with GC mass measured in Galactic GCs

  1. Clusters of Antibiotic Resistance Genes Enriched Together Stay Together in Swine Agriculture

    PubMed Central

    Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong; Cole, James R.; Hashsham, Syed A.; Looft, Torey; Zhu, Yong-Guan

    2016-01-01

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. PMID:27073098

  2. Clusters of antibiotic resistance genes enriched together stay together in swine agriculture

    SciTech Connect

    Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong; Cole, James R.; Hashsham, Syed A.; Looft, Torey; Zhu, Yong-Guan; Tiedje, James M.

    2016-04-12

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. The use of a single antibiotic could select for an entire suite of

  3. Clusters of antibiotic resistance genes enriched together stay together in swine agriculture

    DOE PAGES

    Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong; ...

    2016-04-12

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. The use of a single antibiotic could select for an entire suite of resistance

  4. Clusters of Antibiotic Resistance Genes Enriched Together Stay Together in Swine Agriculture.

    PubMed

    Johnson, Timothy A; Stedtfeld, Robert D; Wang, Qiong; Cole, James R; Hashsham, Syed A; Looft, Torey; Zhu, Yong-Guan; Tiedje, James M

    2016-04-12

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. The use of a single antibiotic could select for an entire suite of resistance genes if

  5. Ranking metrics in gene set enrichment analysis: do they matter?

    PubMed

    Zyla, Joanna; Marczyk, Michal; Weiner, January; Polanska, Joanna

    2017-05-12

    There exist many methods for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter, which could affect the final result, is the choice of a metric for the ranking of genes. Applying a default ranking metric may lead to poor results. In this work 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using k-means clustering algorithm a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate and computational load was established i.e. absolute value of Moderated Welch Test statistic, Minimum Significant Difference, absolute value of Signal-To-Noise ratio and Baumgartner-Weiss-Schindler test statistic. In case of false positive rate estimation, all selected ranking metrics were robust with respect to sample size. In case of sensitivity, the absolute value of Moderated Welch Test statistic and absolute value of Signal-To-Noise ratio gave stable results, while Baumgartner-Weiss-Schindler and Minimum Significant Difference showed better results for larger sample size. Finally, the Gene Set Enrichment Analysis method with all tested ranking metrics was parallelised and implemented in MATLAB, and is available at https://github.com/ZAEDPolSl/MrGSEA . Choosing a ranking metric in Gene Set Enrichment Analysis has critical impact on results of pathway enrichment analysis. The absolute value of Moderated Welch Test has the best overall sensitivity and Minimum Significant Difference has the best overall specificity of gene set analysis. When the number of non-normally distributed genes is high, using Baumgartner

  6. GOMA: functional enrichment analysis tool based on GO modules

    PubMed Central

    Huang, Qiang; Wu, Ling-Yun; Wang, Yong; Zhang, Xiang-Sun

    2013-01-01

    Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results. PMID:23237213

  7. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

    PubMed

    Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi

    2015-11-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.

  8. IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis.

    PubMed

    Zhang, Fan; Drabier, Renee

    2012-01-01

    Next-Generation Sequencing (NGS) technologies and Genome-Wide Association Studies (GWAS) generate millions of reads and hundreds of datasets, and there is an urgent need for a better way to accurately interpret and distill such large amounts of data. Extensive pathway and network analysis allow for the discovery of highly significant pathways from a set of disease vs. healthy samples in the NGS and GWAS. Knowledge of activation of these processes will lead to elucidation of the complex biological pathways affected by drug treatment, to patient stratification studies of new and existing drug treatments, and to understanding the underlying anti-cancer drug effects. There are approximately 141 biological human pathway resources as of Jan 2012 according to the Pathguide database. However, most currently available resources do not contain disease, drug or organ specificity information such as disease-pathway, drug-pathway, and organ-pathway associations. Systematically integrating pathway, disease, drug and organ specificity together becomes increasingly crucial for understanding the interrelationships between signaling, metabolic and regulatory pathway, drug action, disease susceptibility, and organ specificity from high-throughput omics data (genomics, transcriptomics, proteomics and metabolomics). We designed the Integrated Pathway Analysis Database for Systematic Enrichment Analysis (IPAD, http://bioinfo.hsc.unt.edu/ipad), defining inter-association between pathway, disease, drug and organ specificity, based on six criteria: 1) comprehensive pathway coverage; 2) gene/protein to pathway/disease/drug/organ association; 3) inter-association between pathway, disease, drug, and organ; 4) multiple and quantitative measurement of enrichment and inter-association; 5) assessment of enrichment and inter-association analysis with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources; and 6) cross-linking of

  9. IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis

    PubMed Central

    2012-01-01

    Background: Next-Generation Sequencing (NGS) technologies and Genome-Wide Association Studies (GWAS) generate millions of reads and hundreds of datasets, and there is an urgent need for a better way to accurately interpret and distill such large amounts of data. Extensive pathway and network analysis allow for the discovery of highly significant pathways from a set of disease vs. healthy samples in the NGS and GWAS. Knowledge of activation of these processes will lead to elucidation of the complex biological pathways affected by drug treatment, to patient stratification studies of new and existing drug treatments, and to understanding the underlying anti-cancer drug effects. There are approximately 141 biological human pathway resources as of Jan 2012 according to the Pathguide database. However, most currently available resources do not contain disease, drug or organ specificity information such as disease-pathway, drug-pathway, and organ-pathway associations. Systematically integrating pathway, disease, drug and organ specificity together becomes increasingly crucial for understanding the interrelationships between signaling, metabolic and regulatory pathway, drug action, disease susceptibility, and organ specificity from high-throughput omics data (genomics, transcriptomics, proteomics and metabolomics). Results: We designed the Integrated Pathway Analysis Database for Systematic Enrichment Analysis (IPAD, http://bioinfo.hsc.unt.edu/ipad), defining inter-association between pathway, disease, drug and organ specificity, based on six criteria: 1) comprehensive pathway coverage; 2) gene/protein to pathway/disease/drug/organ association; 3) inter-association between pathway, disease, drug, and organ; 4) multiple and quantitative measurement of enrichment and inter-association; 5) assessment of enrichment and inter-association analysis with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources; and 6

  10. On Iron Enrichment, Star Formation, and Type Ia Supernovae in Galaxy Clusters

    NASA Technical Reports Server (NTRS)

    Loewenstein, Michael

    2006-01-01

    The nature of star formation and Type Ia supernovae (SNIa) in galaxies in the field and in rich galaxy clusters are contrasted by juxtaposing the buildup of heavy metals in the universe inferred from observed star formation and supernovae rate histories with data on the evolution of Fe abundances in the intracluster medium (ICM). Models for the chemical evolution of Fe in these environments are constructed, subject to observational constraints, for this purpose. While models with a mean delay for SNIa of 3 Gyr and standard initial mass function (IMF) are fully consistent with observations in the field, cluster Fe enrichment immediately tracked a rapid, top-heavy phase of star formation - although transport of Fe into the ICM may have been more prolonged and star formation likely continued beyond redshift 1. The means of this prompt enrichment consisted of SNII yielding greater than or equal to 0.1 solar mass per explosion (if the SNIa rate normalization is scaled down from its value in the field according to the relative number of candidate progenitor stars in the 3 - 8 solar mass range) and/or SNIa with short delay times originating during the rapid star formation epoch. Star formation is greater than 3 times more efficient in rich clusters than in the field, mitigating the overcooling problem in numerical cluster simulations. Both the fraction of baryons cycled through stars, and the fraction of the total present-day stellar mass in the form of stellar remnants, are substantially greater in clusters than in the field.

  11. FLUORINE VARIATIONS IN THE GLOBULAR CLUSTER NGC 6656 (M22): IMPLICATIONS FOR INTERNAL ENRICHMENT TIMESCALES

    SciTech Connect

    D'Orazi, Valentina; Lucatello, Sara; Gratton, Raffaele G.; Lugaro, Maria; Angelou, George; Bragaglia, Angela; Carretta, Eugenio; Alves-Brito, Alan; Ivans, Inese I.; Masseron, Thomas; Mucciarelli, Alessio

    2013-01-20

    Observed chemical (anti)correlations in proton-capture elements among globular cluster stars are presently recognized as the signature of self-enrichment from now extinct, previous generations of stars. This defines the multiple population scenario. Since fluorine is also affected by proton captures, determining its abundance in globular clusters provides new and complementary clues regarding the nature of these previous generations and supplies strong observational constraints to the chemical enrichment timescales. In this paper, we present our results on near-infrared CRIRES spectroscopic observations of six cool giant stars in NGC 6656 (M22): the main objective is to derive the F content and its internal variation in this peculiar cluster, which exhibits significant changes in both light- and heavy-element abundances. Across our sample, we detected F variations beyond the measurement uncertainties and found that the F abundances are positively correlated with O and anticorrelated with Na, as expected according to the multiple population framework. Furthermore, our observations reveal an increase in the F content between the two different sub-groups, s-process rich and s-process poor, hosted within M22. The comparison with theoretical models suggests that asymptotic giant stars with masses between 4 and 5 M⊙ are responsible for the observed chemical pattern, confirming evidence from previous works: the difference in age between the two sub-components in M22 must be not larger than a few hundred Myr.

  12. Chemical enrichment in the hot intra-cluster medium seen with XMM-Newton/EPIC

    NASA Astrophysics Data System (ADS)

    Mernier, F.; de Plaa, J.; Pinto, C.; Kaastra, J.; Kosec, P.; Zhang, Y.; Mao, J.; Werner, N.

    2016-06-01

    The intra-cluster medium (ICM), permeating the large gravitational potential well of galaxy clusters and groups, is rich in metals, which can be detected via their emission lines in the soft X-ray band. These heavy elements (typically from O to Ni) have been synthesized by Type Ia (SNIa) and core-collapse (SNcc) supernovae within the galaxy members, and continuously enrich the ICM since the cosmic star formation peak (z ≃ 2-3). Because the predicted chemical yields of supernovae depend on either their explosion mechanisms (SNIa) or the initial mass and metallicity of their progenitors (SNcc), measuring the abundances in the ICM can help to constrain supernovae models. In this study, we use XMM-Newton/EPIC to measure the abundances of 9 elements (Mg, Si, S, Ar, Ca, Cr, Mn, Fe and Ni) in a sample of 44 cool-core galaxy clusters, groups and ellipticals (the CHEERS catalog). Combining these results with the O and Ne abundances measured using RGS, we establish an average X/Fe abundance pattern in the ICM, and we determine the best-fit SNIa and SNcc models, as well as the relative fraction of SNIa/SNcc responsible for the enrichment.

  13. Fluorine Variations in the Globular Cluster NGC 6656 (M22): Implications for Internal Enrichment Timescales

    NASA Astrophysics Data System (ADS)

    D'Orazi, Valentina; Lucatello, Sara; Lugaro, Maria; Gratton, Raffaele G.; Angelou, George; Bragaglia, Angela; Carretta, Eugenio; Alves-Brito, Alan; Ivans, Inese I.; Masseron, Thomas; Mucciarelli, Alessio

    2013-01-01

    Observed chemical (anti)correlations in proton-capture elements among globular cluster stars are presently recognized as the signature of self-enrichment from now extinct, previous generations of stars. This defines the multiple population scenario. Since fluorine is also affected by proton captures, determining its abundance in globular clusters provides new and complementary clues regarding the nature of these previous generations and supplies strong observational constraints to the chemical enrichment timescales. In this paper, we present our results on near-infrared CRIRES spectroscopic observations of six cool giant stars in NGC 6656 (M22): the main objective is to derive the F content and its internal variation in this peculiar cluster, which exhibits significant changes in both light- and heavy-element abundances. Across our sample, we detected F variations beyond the measurement uncertainties and found that the F abundances are positively correlated with O and anticorrelated with Na, as expected according to the multiple population framework. Furthermore, our observations reveal an increase in the F content between the two different sub-groups, s-process rich and s-process poor, hosted within M22. The comparison with theoretical models suggests that asymptotic giant stars with masses between 4 and 5 M ⊙ are responsible for the observed chemical pattern, confirming evidence from previous works: the difference in age between the two sub-components in M22 must be not larger than a few hundred Myr. Based on observations taken with ESO telescopes under program 087.0319(A).

  14. Pancreatic islet enhancer clusters enriched in type 2 diabetes risk-associated variants.

    PubMed

    Pasquali, Lorenzo; Gaulton, Kyle J; Rodríguez-Seguí, Santiago A; Mularoni, Loris; Miguel-Escalada, Irene; Akerman, Ildem; Tena, Juan J; Morán, Ignasi; Gómez-Marín, Carlos; van de Bunt, Martijn; Ponsa-Cobas, Joan; Castro, Natalia; Nammo, Takao; Cebola, Inês; García-Hurtado, Javier; Maestro, Miguel Angel; Pattou, François; Piemonti, Lorenzo; Berney, Thierry; Gloyn, Anna L; Ravassard, Philippe; Gómez-Skarmeta, José Luis; Müller, Ferenc; McCarthy, Mark I; Ferrer, Jorge

    2014-02-01

    Type 2 diabetes affects over 300 million people, causing severe complications and premature death, yet the underlying molecular mechanisms are largely unknown. Pancreatic islet dysfunction is central in type 2 diabetes pathogenesis, and understanding islet genome regulation could therefore provide valuable mechanistic insights. We have now mapped and examined the function of human islet cis-regulatory networks. We identify genomic sequences that are targeted by islet transcription factors to drive islet-specific gene activity and show that most such sequences reside in clusters of enhancers that form physical three-dimensional chromatin domains. We find that sequence variants associated with type 2 diabetes and fasting glycemia are enriched in these clustered islet enhancers and identify trait-associated variants that disrupt DNA binding and islet enhancer activity. Our studies illustrate how islet transcription factors interact functionally with the epigenome and provide systematic evidence that the dysregulation of islet enhancers is relevant to the mechanisms underlying type 2 diabetes.

  15. ALMA Reveals Potential Localized Dust Enrichment from Massive Star Clusters in II Zw 40

    NASA Astrophysics Data System (ADS)

    Consiglio, S. Michelle; Turner, Jean L.; Beck, Sara; Meier, David S.

    2016-12-01

    We present subarcsecond images of submillimeter CO and continuum emission from a local galaxy forming massive star clusters: the blue compact dwarf galaxy II Zw 40. At ~0.4″ resolution (20 pc), the CO(3-2), CO(1-0), 3 mm, and 870 μm continuum maps illustrate star formation on the scales of individual molecular clouds. Dust contributes about one-third of the 870 μm continuum emission, with free-free accounting for the rest. On these scales, there is not a good correspondence between gas, dust, and free-free emission. Dust continuum is enhanced toward the star-forming region as compared to the CO emission. We suggest that an unexpectedly low and spatially variable gas-to-dust ratio is the result of rapid and localized dust enrichment of clouds by the massive clusters of the starburst.

  16. EVIDENCE FOR ENRICHMENT BY SUPERNOVAE IN THE GLOBULAR CLUSTER NGC 6273

    SciTech Connect

    Han, Sang-Il; Lim, Dongwook; Seo, Hyunju; Lee, Young-Wook

    2015-11-10

    In our recent investigation, we showed that narrowband photometry can be combined with low-resolution spectroscopy to effectively search for globular clusters (GCs) with supernova (SN) enrichments. Here we apply this technique to the metal-poor bulge GC NGC 6273 and find that the red giant branch stars in this GC are clearly divided into two distinct subpopulations with different calcium abundances. The Ca rich subpopulation in this GC is also enhanced in CN and CH, showing a positive correlation between them. This trend is identical to the result we found in M22, suggesting that this might be a ubiquitous nature of GCs more strongly affected by SNe in their chemical evolution. Our results suggest that NGC 6273 was massive enough to retain SN ejecta, which would place this cluster in the growing group of GCs with Galactic building block characteristics, such as ω Centauri and Terzan 5.

  17. Effects of progressive resistance training combined with a protein-enriched lean red meat diet on health-related quality of life in elderly women: secondary analysis of a 4-month cluster randomised controlled trial.

    PubMed

    Torres, Susan J; Robinson, Sian; Orellana, Liliana; O'Connell, Stella L; Grimes, Carley A; Mundell, Niamh L; Dunstan, David W; Nowson, Caryl A; Daly, Robin M

    2017-06-01

    Resistance training (RT) and increased dietary protein are recommended to attenuate age-related muscle loss in the elderly. This study examined the effect of a lean red meat protein-enriched diet combined with progressive resistance training (RT+Meat) on health-related quality of life (HR-QoL) in elderly women. In this 4-month cluster randomised controlled trial, 100 women aged 60-90 years (mean 73 years) from self-care retirement villages participated in RT twice a week and were allocated either 160 g/d (cooked) lean red meat consumed across 2 meals/d, 6 d/week or ≥1 serving/d (25-30 g) carbohydrates (control group, CRT). HR-QoL (SF-36 Health Survey questionnaire), lower limb maximum muscle strength and lean tissue mass (LTM) (dual-energy X-ray absorptiometry) were assessed at baseline and 4 months. In all, ninety-one women (91 %) completed the study (RT+Meat (n 48); CRT (n 43)). Mean protein intake was greater in RT+Meat than CRT throughout the study (1·3 (sd 0·3) v. 1·1 (sd 0·3) g/kg per d, P<0·05). Exercise compliance (74 %) was not different between groups. After 4 months there was a significant net benefit in the RT+Meat compared with CRT group for overall HR-QoL and the physical component summary (PCS) score (P<0·01), but there were no changes in either group in the mental component summary (MCS) score. Changes in lower limb muscle strength, but not LTM, were positively associated with changes in overall HR-QoL (muscle strength, β: 2·2 (95 % CI 0·1, 4·3), P<0·05). In conclusion, a combination of RT and increased dietary protein led to greater net benefits in overall HR-QoL in elderly women compared with RT alone, which was because of greater improvements in PCS rather than MCS.

  18. ENVIRONMENTAL EFFECTS ON THE METAL ENRICHMENT OF LOW-MASS GALAXIES IN NEARBY CLUSTERS

    SciTech Connect

    Petropoulou, V.; Vilchez, J.; Iglesias-Paramo, J.

    2012-04-20

    In this paper, we study the chemical history of low-mass star-forming (SF) galaxies in the local universe clusters Coma, A1367, A779, and A634. The aim of this work is to search for the imprint of the environment on the chemical evolution of these galaxies. Galaxy chemical evolution is linked to the star formation history, as well as to the gas interchange with the environment, and low-mass galaxies are well known to be vulnerable systems to environmental processes affecting both these parameters. For our study we have used spectra from the SDSS-III DR8. We have examined the spectroscopic properties of SF galaxies of stellar masses 10⁸-10¹⁰ M⊙, located from the core to the cluster's outskirts. The gas-phase O/H and N/O chemical abundances have been derived using the latest empirical calibrations. We have examined the mass-metallicity relation of cluster galaxies, finding well-defined sequences. The slope of these sequences, for galaxies in low-mass clusters and galaxies at large cluster-centric distances, follows the predictions of recent hydrodynamic models. A flattening of this slope has been observed for galaxies located in the core of the two more massive clusters of the sample, principally in Coma, suggesting that the imprint of the cluster environment on the chemical evolution of SF galaxies should be sensitive to both the galaxy mass and the host cluster mass. The H I gas content of Coma and A1367 galaxies indicates that low-mass SF galaxies, located at the core of these clusters, have been severely affected by ram-pressure stripping (RPS). The observed mass-dependent enhancement of the metal content of low-mass galaxies in dense environments seems plausible, according to hydrodynamic simulations. This enhanced metal enrichment could be produced by the combination of effects such as wind reaccretion, due to pressure confinement by the intracluster medium (ICM), and the truncation of gas infall, as a result of the RPS. Thus, the

  19. Functional Gene Networks: R/Bioc package to generate and analyse gene networks derived from functional enrichment and clustering

    PubMed Central

    Aibar, Sara; Fontanillo, Celia; Droste, Conrad; De Las Rivas, Javier

    2015-01-01

    Summary: Functional Gene Networks (FGNet) is an R/Bioconductor package that generates gene networks derived from the results of functional enrichment analysis (FEA) and annotation clustering. The sets of genes enriched with specific biological terms (obtained from a FEA platform) are transformed into a network by establishing links between genes based on common functional annotations and common clusters. The network provides a new view of FEA results revealing gene modules with similar functions and genes that are related to multiple functions. In addition to building the functional network, FGNet analyses the similarity between the groups of genes and provides a distance heatmap and a bipartite network of functionally overlapping genes. The application includes an interface to directly perform FEA queries using different external tools: DAVID, GeneTerm Linker, TopGO or GAGE; and a graphical interface to facilitate the use. Availability and implementation: FGNet is available in Bioconductor, including a tutorial. URL: http://bioconductor.org/packages/release/bioc/html/FGNet.html Contact: jrivas@usal.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25600944

  20. Overview on techniques in cluster analysis.

    PubMed

    Frades, Itziar; Matthiesen, Rune

    2010-01-01

    Clustering is the unsupervised, semisupervised, and supervised classification of patterns into groups. The clustering problem has been addressed in many contexts and disciplines. Cluster analysis encompasses different methods and algorithms for grouping objects of similar kinds into respective categories. In this chapter, we describe a number of methods and algorithms for cluster analysis in a stepwise framework. The steps of a typical clustering analysis process include sequentially pattern representation, the choice of the similarity measure, the choice of the clustering algorithm, the assessment of the output, and the representation of the clusters.
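
    The five steps listed in this abstract (pattern representation, similarity measure, clustering algorithm, assessment of the output, representation of the clusters) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the toy data, the function names, the choice of Euclidean distance, plain k-means with evenly spaced seeds, and the within-cluster sum of squares as the quality measure.

```python
import math

def standardize(points):
    """Step 1, pattern representation: rescale each feature to zero mean, unit variance."""
    n, dims = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    stds = [math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n) or 1.0
            for d in range(dims)]
    return [tuple((p[d] - means[d]) / stds[d] for d in range(dims)) for p in points]

def euclidean(a, b):
    """Step 2, similarity measure: Euclidean distance (smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iterations=20):
    """Step 3, clustering algorithm: plain k-means seeded with evenly spaced points."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

def within_cluster_ss(points, labels, centroids):
    """Step 4, assessment of the output: total within-cluster sum of squares."""
    return sum(euclidean(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical toy data: two visibly separated groups of 2-D patterns.
data = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]
scaled = standardize(data)
labels, centroids = kmeans(scaled, k=2)
quality = within_cluster_ss(scaled, labels, centroids)

# Step 5, representation of the clusters: list the original patterns per cluster.
clusters = {c: [p for p, lab in zip(data, labels) if lab == c] for c in set(labels)}
for c, members in sorted(clusters.items()):
    print(f"cluster {c}: {members}")
print(f"within-cluster sum of squares: {quality:.4f}")
```

    Other distance measures, algorithms (for example hierarchical clustering) or validity indices can be swapped in at the corresponding step without changing the overall flow.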

  1. Cluster analysis in family psychology research.

    PubMed

    Henry, David B; Tolan, Patrick H; Gorman-Smith, Deborah

    2005-03-01

    This article discusses the use of cluster analysis in family psychology research. It provides an overview of potential clustering methods, the steps involved in cluster analysis, hierarchical and nonhierarchical clustering methods, and validation and interpretation of cluster solutions. The article also reviews 5 uses of clustering in family psychology research: (a) deriving family types, (b) studying families over time, (c) as an interface between qualitative and quantitative methods, (d) as an alternative to multivariate interactions in linear models, and (e) as a data reduction technique for small samples. The article concludes with some cautions for using clustering in family psychology research.

  2. SUPERMODEL ANALYSIS OF GALAXY CLUSTERS

    SciTech Connect

    Fusco-Femiano, R.; Cavaliere, A.; Lapi, A.

    2009-11-01

    We present the analysis of the X-ray brightness and temperature profiles for six clusters belonging to both the Cool Core (CC) and Non Cool Core (NCC) classes, in terms of the Supermodel (SM) developed by Cavaliere et al. Based on the gravitational wells set by the dark matter (DM) halos, the SM straightforwardly expresses the equilibrium of the intracluster plasma (ICP) modulated by the entropy deposited at the boundary by standing shocks from gravitational accretion, and injected at the center by outgoing blast waves from mergers or from outbursts of active galactic nuclei. The cluster set analyzed here highlights not only how simply the SM represents the main dichotomy CC versus NCC clusters in terms of a few ICP parameters governing the radial entropy run, but also how accurately it fits even complex brightness and temperature profiles. For CC clusters like A2199 and A2597, the SM with a low level of central entropy straightforwardly yields the characteristic peaked profile of the temperature marked by a decline toward the center, without requiring currently strong radiative cooling and high mass deposition rates. NCC clusters like A1656 require instead a central entropy floor of a substantial level, and some like A2256 and even more A644 feature structured temperature profiles that also call for a definite floor extension; in such conditions the SM accurately fits the observations, and suggests that in these clusters the ICP has been just remolded by a merger event, in the way of a remnant cool core. The SM also predicts that DM halos with high concentration should correlate with flatter entropy profiles and steeper brightness in the outskirts; this is indeed the case with A1689, for which from X-rays we find concentration values c ≈ 10, the hallmark of an early halo formation. Thus, we show the SM to constitute a fast tool not only to provide wide libraries of accurate fits to X-ray temperature and density profiles, but also to retrieve from the ICP

  3. The s-process enrichment of the globular clusters M4 and M22

    SciTech Connect

    Shingles, Luke J.; Karakas, Amanda I.; Fishlock, Cherie K.; Yong, David; Da Costa, Gary S.; Marino, Anna F.; Hirschi, Raphael

    2014-11-01

    We investigate the enrichment in elements produced by the slow neutron-capture process (s-process) in the globular clusters M4 (NGC 6121) and M22 (NGC 6656). Stars in M4 have homogeneous abundances of Fe and neutron-capture elements, but the entire cluster is enhanced in s-process elements (Sr, Y, Ba, Pb) relative to other clusters with a similar metallicity. In M22, two stellar groups exhibit different abundances of Fe and s-process elements. By subtracting the mean abundances of s-poor from s-rich stars, we derive s-process residuals or empirical s-process distributions for M4 and M22. We find that the s-process distribution in M22 is more weighted toward the heavy s-peak (Ba, La, Ce) and Pb than M4, which has been enriched mostly with light s-peak elements (Sr, Y, Zr). We construct simple chemical evolution models using yields from massive star models that include rotation, which dramatically increases s-process production at low metallicity. We show that our massive star models with rotation rates of up to 50% of the critical (break-up) velocity and changes to the preferred ¹⁷O(α, γ)²¹Ne rate produce insufficient heavy s-elements and Pb to match the empirical distributions. For models that incorporate asymptotic giant branch yields, we find that intermediate-mass yields (with a ²²Ne neutron source) alone do not reproduce the light-to-heavy s-element ratios for M4 and M22, and that a small contribution from models with a ¹³C pocket is required. With our assumption that ¹³C pockets form for initial masses below a transition range between 3.0 and 3.5 M⊙, we match the light-to-heavy s-element ratio in the s-process residual of M22 and predict a minimum enrichment timescale of between 240 and 360 Myr. Our predicted value is consistent with the 300 Myr upper limit age difference between the two groups derived from isochrone fitting.
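
    The residual described in this abstract (mean abundances of s-poor stars subtracted from those of s-rich stars) amounts to a per-element difference of group means. The Python sketch below shows only that arithmetic. The abundance values are made-up placeholders, the function names are hypothetical, and the abstract does not say on which abundance scale (logarithmic or linear) the authors perform the subtraction, so the numbers carry no physical meaning.

```python
def mean_abundances(stars):
    """Average each element's abundance over a group of stars."""
    elements = stars[0].keys()
    return {el: sum(star[el] for star in stars) / len(stars) for el in elements}

def s_process_residual(s_rich, s_poor):
    """Per-element difference: mean of the s-rich group minus mean of the s-poor group."""
    rich = mean_abundances(s_rich)
    poor = mean_abundances(s_poor)
    return {el: rich[el] - poor[el] for el in rich}

# Placeholder values for illustration only; these are NOT measurements from the paper.
s_rich_stars = [{"Y": 0.4, "Ba": 0.5}, {"Y": 0.6, "Ba": 0.7}]
s_poor_stars = [{"Y": 0.1, "Ba": 0.0}, {"Y": 0.3, "Ba": 0.2}]

residual = s_process_residual(s_rich_stars, s_poor_stars)
for element, value in residual.items():
    print(f"{element}: {value:.2f}")
```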

  4. Extremely α-Enriched Globular Clusters in Early-Type Galaxies: A Step toward the Dawn of Stellar Populations?

    NASA Astrophysics Data System (ADS)

    Puzia, Thomas H.; Kissler-Patig, Markus; Goudfrooij, Paul

    2006-09-01

    We compare [α/Fe], metallicity, and age distributions of globular clusters in elliptical, lenticular, and spiral galaxies, which we derive from Lick line index measurements. We find a large number of globular clusters in elliptical galaxies that reach significantly higher [α/Fe] values ([α/Fe] > 0.5) than any clusters in lenticular and spiral galaxies. Most of these extremely α-enriched globular clusters are old (t > 8 Gyr), and cover the metallicity range -1 ≲ [Z/H] ≲ 0. A comparison with supernova yield models suggests that the progenitor gas clouds of these globular clusters must have been predominantly enriched by massive stars (≳20 M⊙), with little contribution from lower mass stars. The measured [α/Fe] ratios are also consistent with yields of very massive pair-instability supernovae (~130-190 M⊙). Both scenarios imply that the chemical enrichment of the progenitor gas was completed on extremely short timescales of the order of a few Myr. Given the lower [α/Fe] average ratios of the diffuse stellar population in early-type galaxies, our results suggest that these extremely α-enhanced globular clusters could be members of the very first generation of star clusters formed, and that their formation epochs would predate the formation of the majority of stars in giant early-type galaxies.

  5. Globular cluster formation with multiple stellar populations: self-enrichment in fractal massive molecular clouds

    NASA Astrophysics Data System (ADS)

    Bekki, Kenji

    2017-08-01

    Internal chemical abundance spreads are one of the fundamental properties of globular clusters (GCs) in the Galaxy. In order to understand the origin of such abundance spreads, we numerically investigate GC formation from massive molecular clouds (MCs) with fractal structures using our new hydrodynamical simulations with star formation and feedback effects of core-collapse supernovae (SNe) and asymptotic giant branch (AGB) stars. We particularly investigate star formation from gas chemically contaminated by SNe and AGB stars ('self-enrichment') in forming GCs within MCs with different initial conditions and environments. The principal results are as follows. GCs with multiple generations of stars can be formed from merging of hierarchical star cluster complexes that are developed from high-density regions of fractal MCs. Feedback effects of SNe and AGB stars can control the formation efficiencies of stars formed from original gas of MCs and from gas ejected from AGB stars. The simulated GCs have strong radial gradients of helium abundances within the central 3 pc. The original MC masses need to be as large as 10⁷ M⊙ for a canonical initial stellar mass function (IMF) so that the final masses of stars formed from AGB ejecta can be ∼10⁵ M⊙. Since star formation from AGB ejecta is rather prolonged (∼10⁸ yr), their formation can be strongly suppressed by SNe of the stars themselves. This result implies that the so-called mass budget problem is much more severe than ever thought in the self-enrichment scenario of GC formation and thus that the IMF for the second generation of stars should be 'top-light'.

  6. The SMART CLUSTER METHOD - adaptive earthquake cluster analysis and declustering

    NASA Astrophysics Data System (ADS)

    Schaefer, Andreas; Daniell, James; Wenzel, Friedemann

    2016-04-01

    Earthquake declustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity, with usual applications comprising probabilistic seismic hazard assessments (PSHAs) and earthquake prediction methods. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation. Various methods have been developed by other researchers to address this issue. These range in complexity from rather simple statistical window methods to complex epidemic models. This study introduces the smart cluster method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal identification. An adaptive search algorithm for data point clusters is adopted. It uses the earthquake density in the spatio-temporal neighbourhood of each event to adjust the search properties. The identified clusters are subsequently analysed to determine directional anisotropy, focussing on a strong correlation along the rupture plane, and the search space is adjusted with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010/2011 Darfield-Christchurch events, an adaptive classification procedure is applied to disassemble subsequent ruptures which may have been grouped into an individual cluster using near-field searches, support vector machines and temporal splitting. The steering parameters of the search behaviour are linked to local earthquake properties like magnitude of completeness, earthquake density and Gutenberg-Richter parameters. The method is capable of identifying and classifying earthquake clusters in space and time. It is tested and validated using earthquake data from California and New Zealand.
As a result of the cluster identification process, each event in

  7. miEAA: microRNA enrichment analysis and annotation

    PubMed Central

    Backes, Christina; Khaleeq, Qurratulain T.; Meese, Eckart; Keller, Andreas

    2016-01-01

    Similar to the development of gene set enrichment and gene regulatory network analysis tools over a decade ago, microRNA enrichment tools are currently gaining importance. Building on our experience with the gene set analysis toolkit GeneTrail, we implemented the miRNA Enrichment Analysis and Annotation tool (miEAA). MiEAA is a web-based application that offers a variety of commonly applied statistical tests such as over-representation analysis and miRNA set enrichment analysis, which is similar to Gene Set Enrichment Analysis. Besides the different statistical tests, miEAA also provides rich functionality in terms of miRNA categories. Altogether, over 14 000 miRNA sets have been added, including pathways, diseases, organs and target genes. Importantly, our tool can be applied for miRNA precursors as well as mature miRNAs. To make the tool as useful as possible we additionally implemented supporting tools such as converters between different miRBase versions and converters from miRNA names to precursor names. We evaluated the performance of miEAA on two sets of miRNAs that are affected in lung adenocarcinomas and have been detected by array analysis. The web-based application is freely accessible at: http://www.ccb.uni-saarland.de/mieaa_tool/. PMID:27131362

  8. Analysis of mosses and topsoils for detecting sources of heavy metal pollution: multivariate and enrichment factor analysis.

    PubMed

    Dragović, S; Mihailović, N

    2009-10-01

    In order to assess the contribution of emission sources to the pollution of areas remote from industrial facilities, a combined approach of enrichment factor analysis and multivariate statistics was used for detecting the origin of heavy metal pollution in the Zlatibor ecosystem, in Serbia. Samples of moss (Pleurozium schreberi, Hylocomium splendens, Scleropodium purum, Hypnum cupressiforme and Thuidium delicatulum) and of topsoil (0-5 cm) were collected in 2005. The concentrations of seven heavy metals (Cd, Cr, Cu, Mn, Ni, Pb and Zn) were determined in moss and soil samples by atomic absorption spectrometry. The results obtained by enrichment factor analysis and two multivariate statistical methods, principal component analysis and cluster analysis, enabled discrimination of the lithologic and anthropogenic sources of heavy metals in the mosses. Enrichment factors, calculated to evaluate the contribution to the metal content in moss from anthropogenic sources, revealed pollution of the investigated area by Cd and Pb, originating from long-range transport and fossil fuel burning.
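    The abstract does not state how the enrichment factors were computed. A common definition normalizes the element of interest to a reference element in both the sample and a background material, and the sketch below assumes that form. The function name, the choice of reference element, and all concentrations are hypothetical and are not taken from the study.

```python
def enrichment_factor(c_sample, ref_sample, c_background, ref_background):
    """EF = (C_x / C_ref)_sample / (C_x / C_ref)_background.

    All four arguments are concentrations in the same units.
    The reference element and background values are assumptions
    made for illustration only.
    """
    if min(ref_sample, c_background, ref_background) <= 0:
        raise ValueError("reference and background concentrations must be positive")
    return (c_sample / ref_sample) / (c_background / ref_background)


# Hypothetical concentrations (mg/kg), not data from the study.
ef_pb = enrichment_factor(c_sample=12.0, ref_sample=300.0,
                          c_background=20.0, ref_background=5000.0)
print(f"EF(Pb) is about {ef_pb:.1f}")
```

    Under this definition a value near 1 suggests a mainly lithologic origin, while values well above 1 point to an additional, possibly anthropogenic, source. Any specific cutoff is a convention that the abstract does not give.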

  9. Multi-viewpoint clustering analysis

    NASA Technical Reports Server (NTRS)

    Mehrotra, Mala; Wild, Chris

    1993-01-01

    In this paper, we address the feasibility of partitioning rule-based systems into a number of meaningful units to enhance the comprehensibility, maintainability and reliability of expert systems software. Preliminary results have shown that no single structuring principle or abstraction hierarchy is sufficient to understand complex knowledge bases. We therefore propose the Multi View Point - Clustering Analysis (MVP-CA) methodology to provide multiple views of the same expert system. We present the results of using this approach to partition a deployed knowledge-based system that navigates the Space Shuttle's entry. We also discuss the impact of this approach on verification and validation of knowledge-based systems.

  10. Enabling Enrichment Analysis with the Human Disease Ontology

    PubMed Central

    LePendu, Paea; Musen, Mark A.; Shah, Nigam H.

    2012-01-01

    Advanced statistical methods used to analyze high-throughput data such as gene-expression assays result in long lists of “significant genes.” One way to gain insight into the significance of altered expression levels is to determine whether Gene Ontology (GO) terms associated with a particular biological process, molecular function, or cellular component are over- or under-represented in the set of genes deemed significant. This process, referred to as enrichment analysis, profiles a gene-set, and is widely used to make sense of the results of high-throughput experiments. Our goal is to develop and apply general enrichment analysis methods to profile other sets of interest, such as patient cohorts from the electronic medical record, using a variety of ontologies including SNOMED CT, MedDRA, RxNorm, and others. Although it is possible to perform enrichment analysis using ontologies other than the GO, a key pre-requisite is the availability of a background set of annotations to enable the enrichment calculation. In the case of the GO, this background set is provided by the Gene Ontology Annotations. In the current work, we describe: (i) a general method that uses hand-curated GO annotations as a starting point for creating background datasets for enrichment analysis using other ontologies; and (ii) a gene–disease background annotation set—that enables disease-based enrichment—to demonstrate feasibility of our method. PMID:21550421
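    The role of the background annotation set can be made concrete with a small over-representation test. This is a minimal sketch, assuming a one-sided hypergeometric test, which is one common choice for enrichment analysis; the abstract does not say which test the authors use, and the function name and counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected items are drawn without replacement
    from n_background items, of which n_annotated carry the term."""
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 annotated background items, 200 of them carry
# the term, and 8 of the 100 items of interest carry it.
p = hypergeom_enrichment_p(20000, 200, 100, 8)
print(f"one-sided p-value: {p:.3g}")
```

    The first two arguments come entirely from the background annotations, which is why the calculation cannot be carried out for an ontology that lacks such a set.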

  11. DICON: interactive visual analysis of multidimensional clusters.

    PubMed

    Cao, Nan; Gotz, David; Sun, Jimeng; Qu, Huamin

    2011-12-01

    Clustering as a fundamental data analysis technique has been widely used in many analytic applications. However, it is often difficult for users to understand and evaluate multidimensional clustering results, especially the quality of clusters and their semantics. For large and complex data, high-level statistical information about the clusters is often needed for users to evaluate cluster quality while a detailed display of multidimensional attributes of the data is necessary to understand the meaning of clusters. In this paper, we introduce DICON, an icon-based cluster visualization that embeds statistical information into a multi-attribute display to facilitate cluster interpretation, evaluation, and comparison. We design a treemap-like icon to represent a multidimensional cluster, and the quality of the cluster can be conveniently evaluated with the embedded statistical information. We further develop a novel layout algorithm which can generate similar icons for similar clusters, making comparisons of clusters easier. User interaction and clutter reduction are integrated into the system to help users more effectively analyze and refine clustering results for large datasets. We demonstrate the power of DICON through a user study and a case study in the healthcare domain. Our evaluation shows the benefits of the technique, especially in support of complex multidimensional cluster analysis.

  12. Quantitative mass spectrometric analysis of glycoproteins combined with enrichment methods.

    PubMed

    Ahn, Yeong Hee; Kim, Jin Young; Yoo, Jong Shin

    2015-01-01

    Mass spectrometry (MS) has been a core technology for highly sensitive and high-throughput analysis of the enriched glycoproteome in aspects of quantitative assays as well as qualitative profiling of glycoproteins. Because it has been widely recognized that aberrant glycosylation in a glycoprotein may be involved in the progression of a certain disease, the development of efficient analysis tools for aberrant glycoproteins is very important for a deep understanding of the pathological function of the glycoprotein and for new biomarker development. This review first describes the protein glycosylation-targeting enrichment technologies mainly employing solid-phase extraction methods such as hydrazide-capturing, lectin-specific capturing, and affinity separation techniques based on porous graphitized carbon, hydrophilic interaction chromatography, or immobilized boronic acid. Second, MS-based quantitative analysis strategies coupled with the protein glycosylation-targeting enrichment technologies, by using a label-free MS, stable isotope-labeling, or targeted multiple reaction monitoring (MRM) MS, are summarized with recent published studies.

  13. ProbCD: enrichment analysis accounting for categorization uncertainty.

    PubMed

    Vêncio, Ricardo Z N; Shmulevich, Ilya

    2007-10-12

    As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. In spite of efforts to create probabilistic annotations, especially in the Gene Ontology context, or to deal with uncertainty in high throughput-based datasets, current enrichment methods largely ignore this probabilistic information since they are mainly based on variants of the Fisher Exact Test. We developed open-source R-based software, ProbCD, to deal with probabilistic categorical data analysis; it does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli Scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at: http://xerad.systemsbiology.net/ProbCD/. We present an analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In particular, concerning the enrichment analysis, ProbCD can accommodate: (i) the stochastic nature of the high-throughput experimental techniques and (ii) probabilistic gene annotation.
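    One way to picture a contingency table built from expectations is sketched below. It is not the ProbCD implementation (which is R-based). It assumes that each gene has a probability of belonging to the category and a probability of belonging to the selected list, and that the two memberships are independent for each gene. The function name and inputs are hypothetical.

```python
def expected_contingency(p_category, p_selected):
    """Expected 2x2 table [[in cat & selected, in cat & not selected],
                           [not in cat & selected, neither]].

    p_category[i] and p_selected[i] are membership probabilities for gene i,
    treated as independent Bernoulli variables (an assumption of this sketch).
    """
    if len(p_category) != len(p_selected):
        raise ValueError("both probability lists must cover the same genes")
    n11 = n10 = n01 = n00 = 0.0
    for p, q in zip(p_category, p_selected):
        n11 += p * q
        n10 += p * (1.0 - q)
        n01 += (1.0 - p) * q
        n00 += (1.0 - p) * (1.0 - q)
    return [[n11, n10], [n01, n00]]


# With hard 0/1 memberships the table reduces to ordinary counts.
print(expected_contingency([1, 1, 0, 0], [1, 0, 1, 0]))
# Uncertain annotation spreads each gene across cells.
print(expected_contingency([0.9, 0.5, 0.1], [1.0, 0.8, 0.2]))
```

    The cells always sum to the number of genes, so the expected table can stand in wherever a count table would be used, although the entries are no longer integers.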

  14. Multivariate Analysis of the Globular Clusters in M87

    NASA Astrophysics Data System (ADS)

    Das, Sukanta; Chattopadhayay, Tanuka; Davoust, Emmanuel

    2015-11-01

    An objective classification of 147 globular clusters (GCs) in the inner region of the giant elliptical galaxy M87 is carried out with the help of two methods of multivariate analysis. First, independent component analysis (ICA) is used to determine a set of independent variables that are linear combinations of various observed parameters (mostly Lick indices) of the GCs. Next, K-means cluster analysis (CA) is applied on the independent components (ICs), to find the optimum number of homogeneous groups having an underlying structure. The properties of the four groups of GCs thus uncovered are used to explain the formation mechanism of the host galaxy. It is suggested that M87 formed in two successive phases. First came a monolithic collapse, which gave rise to an inner group of metal-rich clusters with little systematic rotation and an outer group of metal-poor clusters in eccentric orbits. In a second phase, the galaxy accreted low-mass satellites in a dissipationless fashion, from the gas of which the two other groups of GCs formed. Evidence is given for a blue stellar population in the more metal-rich clusters, which we interpret as helium enrichment. Finally, it is found that the clusters of M87 differ in some of their chemical properties (NaD, TiO1, light-element abundances) from GCs in our Galaxy and M31.

  15. Separate enrichment analysis of pathways for up- and downregulated genes.

    PubMed

    Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng

    2014-03-06

    Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
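    The loss of power from pooling can be illustrated with made-up counts. This is a sketch, assuming a one-sided hypergeometric test for each analysis; the abstract does not name the test, and every number below is hypothetical rather than taken from the study.

```python
from math import comb


def upper_tail(n_background, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap)."""
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, min(n_pathway, n_selected) + 1)
    )
    return tail / total


# Hypothetical experiment: 1000 genes measured, a 50-gene pathway,
# 100 upregulated and 300 downregulated DE genes overall.
N_GENES, PATHWAY_SIZE = 1000, 50
N_UP, N_DOWN = 100, 300
# The pathway is imbalanced: its DE genes are almost all upregulated.
UP_IN_PATHWAY, DOWN_IN_PATHWAY = 12, 2

p_up = upper_tail(N_GENES, PATHWAY_SIZE, N_UP, UP_IN_PATHWAY)
p_down = upper_tail(N_GENES, PATHWAY_SIZE, N_DOWN, DOWN_IN_PATHWAY)
p_all = upper_tail(N_GENES, PATHWAY_SIZE, N_UP + N_DOWN,
                   UP_IN_PATHWAY + DOWN_IN_PATHWAY)
print(f"up only: {p_up:.3g}  down only: {p_down:.3g}  all DE: {p_all:.3g}")
```

    In this toy case the pathway stands out among the upregulated genes but not in the pooled list, because the many downregulated genes elsewhere enlarge the selected set without adding pathway members.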

  16. Cluster analysis in phenotyping a Portuguese population.

    PubMed

    Loureiro, C C; Sa-Couto, P; Todo-Bom, A; Bousquet, J

    2015-09-03

    Unbiased cluster analysis using clinical parameters has identified asthma phenotypes. Adding inflammatory biomarkers to this analysis provided a better insight into the disease mechanisms. This approach has not yet been applied to asthmatic Portuguese patients. The aim was to identify phenotypes of asthma using cluster analysis in a Portuguese asthmatic population treated in secondary medical care. Consecutive patients with asthma were recruited from the outpatient clinic. Patients were optimally treated according to GINA guidelines and enrolled in the study. Procedures were performed according to a standard evaluation of asthma. Phenotypes were identified by cluster analysis using Ward's clustering method. Of the 72 patients enrolled, 57 had full data and were included for cluster analysis. Distribution was set in 5 clusters described as follows: cluster (C) 1, early onset mild allergic asthma; C2, moderate allergic asthma, with long evolution, female prevalence and mixed inflammation; C3, allergic brittle asthma in young females with early disease onset and no evidence of inflammation; C4, severe asthma in obese females with late disease onset, highly symptomatic despite low Th2 inflammation; C5, severe asthma with chronic airflow obstruction, late disease onset and eosinophilic inflammation. In our study population, the identified clusters were mainly coincident with those of other larger-scale cluster analyses. Variables such as age at disease onset, obesity, lung function, FeNO (Th2 biomarker) and disease severity were important for cluster distinction. Copyright © 2015. Published by Elsevier España, S.L.U.

  17. The applicability and effectiveness of cluster analysis

    NASA Technical Reports Server (NTRS)

    Ingram, D. S.; Actkinson, A. L.

    1973-01-01

    An insight into the characteristics which determine the performance of a clustering algorithm is presented. In order for the techniques which are examined to accurately cluster data, two conditions must be simultaneously satisfied. First the data must have a particular structure, and second the parameters chosen for the clustering algorithm must be correct. By examining the structure of the data from the Cl flight line, it is clear that no single set of parameters can be used to accurately cluster all the different crops. The effectiveness of either a noniterative or iterative clustering algorithm to accurately cluster data representative of the Cl flight line is questionable. Thus extensive a priori knowledge is required in order to use cluster analysis in its present form for applications like assisting in the definition of field boundaries and evaluating the homogeneity of a field. New or modified techniques are necessary for clustering to be a reliable tool.

  18. Increasing the performance of tritium analysis by electrolytic enrichment.

    PubMed

    Groning, M; Auer, R; Brummer, D; Jaklitsch, M; Sambandam, C; Tanweer, A; Tatzber, H

    2009-06-01

    Several improvements are described for the existing tritium enrichment system at the Isotope Hydrology Laboratory of the International Atomic Energy Agency for processing natural water samples. The improvements include a simple method for pretreatment of electrolytic cells to ensure a high tritium separation factor, an improved design of the exhaust system for explosive gases, and a vacuum distillation line for faster initial preparation of water samples for electrolytic enrichment and for tritium analysis. Achievements included the reduction of variation of individual enrichment parameters of all cells to less than 1% and an improvement of 50% of the stability of the background mean. It resulted in an improved detection limit of less than 0.4 TU (at 2s), important for application of tritium measurements in the future at low concentration levels, and resulted in measurement precisions of ±0.2 TU and ±0.15 TU for liquid scintillation counting and for gas proportional counting, respectively.

  19. Central Elemental Abundance Ratios In the Perseus Cluster: Resonant Scattering or SN Ia Enrichment?

    NASA Technical Reports Server (NTRS)

    Dupke, Renato A.; Arnaud, Keith; White, Nicholas E. (Technical Monitor)

    2001-01-01

    We have determined abundance ratios in the core of the Perseus Cluster for several elements. These ratios indicate a central dominance of Type Ia supernova (SN Ia) ejecta similar to that found for A496, A2199 and A3571. Simultaneous analysis of ASCA spectra from SIS1, GIS2, and GIS3 shows that the ratio of Ni to Fe abundances is approximately 3.4 ± 1.1 times solar within the central 4'. This ratio is consistent with (and more precise than) that observed in other clusters whose central regions are dominated by SN Ia ejecta. Such a large Ni overabundance is predicted by "convective deflagration" explosion models for SNe Ia such as W7 but is inconsistent with delayed detonation models. We note that with current instrumentation the Ni Kα line is confused with Fe Kβ and that the Ni overabundance we observe has been interpreted by others as an anomalously large ratio of Fe Kβ to Fe Kα caused by resonant scattering in the Fe Kα line. We argue that a central enhancement of SN Ia ejecta and hence a high ratio of Ni to Fe abundances are naturally explained by scenarios that include the generation of chemical gradients by suppressed SN Ia winds or ram pressure stripping of cluster galaxies. It is not necessary to suppose that the intracluster gas is optically thick to resonant scattering of the Fe Kα line.

  1. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    PubMed

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.

  2. Anaerobic central metabolic pathways active during polyhydroxyalkanoate production in uncultured cluster 1 Defluviicoccus enriched in activated sludge communities.

    PubMed

    Burow, Luke C; Mabbett, Amanda N; Borrás, Luis; Blackall, Linda L

    2009-09-01

    A glycogen nonpolyphosphate-accumulating organism (GAO) enrichment culture dominated by the Alphaproteobacteria cluster 1 Defluviicoccus was investigated to determine the metabolic pathways involved in the anaerobic formation of polyhydroxyalkanoates, carbon storage polymers important for the proliferation of microorganisms in enhanced biological phosphorus removal processes. FISH-microautoradiography and post-FISH fluorescent chemical staining confirmed acetate assimilation as polyhydroxyalkanoates in cluster 1 Defluviicoccus under anaerobic conditions. Chemical inhibition of glycolysis using iodoacetate, and of isocitrate lyase by 3-nitropropionate and itaconate, indicated that carbon is likely to be channelled through both glycolysis and the glyoxylate cycle in cluster 1 Defluviicoccus. The effect of metabolic inhibitors of aconitase (monofluoroacetate) and succinate dehydrogenase (malonate) suggested that aconitase, but not succinate dehydrogenase, was active, providing further support for the role of the glyoxylate cycle in these GAOs. Metabolic inhibition of fumarate reductase using oxantel decreased polyhydroxyalkanoate production. This indicated reduction of fumarate to succinate and the operation of the reductive branch of the tricarboxylic acid cycle, which is possibly important in the production of the polyhydroxyvalerate component of polyhydroxyalkanoates observed in cluster 1 Defluviicoccus enrichment cultures. These findings were integrated with previous metabolic models for GAOs and enabled an anaerobic central metabolic pathway model for polyhydroxyalkanoate formation in cluster 1 Defluviicoccus to be proposed.

  3. Cluster analysis of multiple planetary flow regimes

    NASA Technical Reports Server (NTRS)

    Mo, Kingtse; Ghil, Michael

    1987-01-01

    A modified cluster analysis method was developed to identify spatial patterns of planetary flow regimes, and to study transitions between them. This method was applied first to a simple deterministic model and second to Northern Hemisphere (NH) 500 mb data. The dynamical model is governed by the fully-nonlinear, equivalent-barotropic vorticity equation on the sphere. Clusters of points in the model's phase space are associated with either a few persistent or with many transient events. Two stationary clusters have patterns similar to unstable stationary model solutions, zonal, or blocked. Transient clusters of wave trains serve as way stations between the stationary ones. For the NH data, cluster analysis was performed in the subspace of the first seven empirical orthogonal functions (EOFs). Stationary clusters are found in the low-frequency band of more than 10 days, and transient clusters in the bandpass frequency window between 2.5 and 6 days. In the low-frequency band three pairs of clusters determine, respectively, EOFs 1, 2, and 3. They exhibit well-known regional features, such as blocking, the Pacific/North American (PNA) pattern and wave trains. Both model and low-pass data show strong bimodality. Clusters in the bandpass window show wave-train patterns in the two jet exit regions. They are related, as in the model, to transitions between stationary clusters.

  4. FDR-FET: an optimizing gene set enrichment analysis method

    PubMed Central

    Ji, Rui-Ru; Ott, Karl-Heinz; Yordanova, Roumyana; Bruccoleri, Robert E

    2011-01-01

    Gene set enrichment analysis for analyzing large profiling and screening experiments can reveal unifying biological schemes based on previously accumulated knowledge represented as “gene sets”. Most of the existing implementations use a fixed fold-change or P value cutoff to generate regulated gene lists. However, the threshold selection in most cases is arbitrary, and has a significant effect on the test outcome and interpretation of the experiment. We developed a new gene set enrichment analysis method, i.e., FDR-FET, which dynamically optimizes the threshold choice and improves the sensitivity and selectivity of gene set enrichment analysis. The procedure translates experimental results into a series of regulated gene lists at multiple false discovery rate (FDR) cutoffs, and computes the P value of the overrepresentation of a gene set using a Fisher’s exact test (FET) in each of these gene lists. The lowest P value is retained to represent the significance of the gene set. We also implemented improved methods to define a more relevant global reference set for the FET. We demonstrate the validity of the method using a published microarray study of three protease inhibitors of the human immunodeficiency virus and compare the results with those from other popular gene set enrichment analysis algorithms. Our results show that combining FDR with multiple cutoffs allows us to control the error while retaining genes that increase information content. We conclude that FDR-FET can selectively identify significantly affected biological processes. Our method can be used for any user-generated gene list in the area of transcriptome, proteome, and other biological and scientific applications. PMID:21918636
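    The procedure described in this abstract can be sketched as follows. This is not the authors' implementation. It assumes Benjamini-Hochberg adjustment as the FDR method, a one-sided Fisher's exact test computed as a hypergeometric tail, and all tested genes as the reference set (the paper describes a more refined reference set that is not reproduced here). The function names and the cutoff grid are hypothetical.

```python
from math import comb


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values for a {gene: p} dict."""
    items = sorted(pvalues.items(), key=lambda kv: kv[1])
    m = len(items)
    q, running_min = {}, 1.0
    for rank in range(m, 0, -1):          # walk from largest p to smallest
        gene, p = items[rank - 1]
        running_min = min(running_min, p * m / rank)
        q[gene] = running_min
    return q


def fisher_upper_tail(n_total, n_set, n_list, n_overlap):
    """One-sided Fisher's exact test P(overlap >= n_overlap)."""
    total = comb(n_total, n_list)
    tail = sum(
        comb(n_set, k) * comb(n_total - n_set, n_list - k)
        for k in range(n_overlap, min(n_set, n_list) + 1)
    )
    return tail / total


def fdr_fet(pvalues, gene_set, cutoffs=(0.01, 0.05, 0.1, 0.2)):
    """Return (lowest P value, FDR cutoff at which it was obtained)."""
    q = bh_qvalues(pvalues)
    in_set = {g for g in gene_set if g in q}
    best_p, best_cutoff = 1.0, None
    for cutoff in cutoffs:
        regulated = {g for g, qv in q.items() if qv <= cutoff}
        if not regulated:
            continue
        p = fisher_upper_tail(len(q), len(in_set), len(regulated),
                              len(regulated & in_set))
        if p < best_p:
            best_p, best_cutoff = p, cutoff
    return best_p, best_cutoff


# Hypothetical input: 20 genes, 4 of them strongly regulated and in the set.
pvals = {f"g{i}": (1e-4 if i < 4 else 0.9) for i in range(20)}
print(fdr_fet(pvals, {"g0", "g1", "g2", "g3"}))
```

    Because the reported value is a minimum over several cutoffs, it is not directly comparable to a P value from a single pre-specified threshold, and this sketch makes no correction for that selection step.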

  5. FDR-FET: an optimizing gene set enrichment analysis method.

    PubMed

    Ji, Rui-Ru; Ott, Karl-Heinz; Yordanova, Roumyana; Bruccoleri, Robert E

    2011-01-01

    Gene set enrichment analysis for analyzing large profiling and screening experiments can reveal unifying biological schemes based on previously accumulated knowledge represented as "gene sets". Most of the existing implementations use a fixed fold-change or P value cutoff to generate regulated gene lists. However, the threshold selection in most cases is arbitrary, and has a significant effect on the test outcome and interpretation of the experiment. We developed a new gene set enrichment analysis method, i.e., FDR-FET, which dynamically optimizes the threshold choice and improves the sensitivity and selectivity of gene set enrichment analysis. The procedure translates experimental results into a series of regulated gene lists at multiple false discovery rate (FDR) cutoffs, and computes the P value of the overrepresentation of a gene set using a Fisher's exact test (FET) in each of these gene lists. The lowest P value is retained to represent the significance of the gene set. We also implemented improved methods to define a more relevant global reference set for the FET. We demonstrate the validity of the method using a published microarray study of three protease inhibitors of the human immunodeficiency virus and compare the results with those from other popular gene set enrichment analysis algorithms. Our results show that combining FDR with multiple cutoffs allows us to control the error while retaining genes that increase information content. We conclude that FDR-FET can selectively identify significantly affected biological processes. Our method can be used for any user-generated gene list in the area of transcriptome, proteome, and other biological and scientific applications.

  6. QUANTITATIVE MASS SPECTROMETRIC ANALYSIS OF GLYCOPROTEINS COMBINED WITH ENRICHMENT METHODS

    PubMed Central

    Ahn, Yeong Hee; Kim, Jin Young; Yoo, Jong Shin

    2015-01-01

    Mass spectrometry (MS) has been a core technology for highly sensitive and high-throughput analysis of the enriched glycoproteome, in quantitative assays as well as in qualitative profiling of glycoproteins. Because it is widely recognized that aberrant glycosylation of a glycoprotein may be involved in the progression of certain diseases, the development of efficient analysis tools for aberrant glycoproteins is very important for a deeper understanding of the pathological function of these glycoproteins and for new biomarker development. This review first describes the protein glycosylation-targeting enrichment technologies, mainly employing solid-phase extraction methods such as hydrazide capturing, lectin-specific capturing, and affinity separation techniques based on porous graphitized carbon, hydrophilic interaction chromatography, or immobilized boronic acid. Second, MS-based quantitative analysis strategies coupled with the protein glycosylation-targeting enrichment technologies, using label-free MS, stable isotope labeling, or targeted multiple reaction monitoring (MRM) MS, are summarized with recently published studies. © 2014 The Authors. Mass Spectrometry Reviews Published by Wiley Periodicals, Inc. Rapid Commun. Mass Spec Rev 34:148–165, 2015. PMID:24889823

  7. Onsite Gaseous Centrifuge Enrichment Plant UF6 Cylinder Destructive Analysis

    SciTech Connect

    Anheier, Norman C.; Cannon, Bret D.; Qiao, Hong; Carter, Jennifer C.; McNamara, Bruce K.; O'Hara, Matthew J.; Phillips, Jon R.; Curtis, Michael M.

    2012-07-17

    The IAEA safeguards approach for gaseous centrifuge enrichment plants (GCEPs) includes measurements of gross, partial, and bias defects in a statistical sampling plan. These safeguards methods consist principally of mass and enrichment nondestructive assay (NDA) verification. Destructive assay (DA) samples are collected from a limited number of cylinders for high-precision offsite mass spectrometer analysis. DA is typically used to quantify bias defects in the GCEP material balance. Under current safeguards measures, the operator collects a DA sample from a sample tap following homogenization. The sample is collected in a small UF6 sample bottle, then sealed and shipped under IAEA chain of custody to an offsite analytical laboratory. Current practice is expensive and resource intensive. We propose a novel approach for performing onsite gaseous UF6 DA that provides rapid and accurate assessment of enrichment bias defects. DA samples are collected using a custom sampling device attached to a conventional sample tap. A few micrograms of gaseous UF6 are chemically adsorbed onto a sampling coupon in a matter of minutes. The collected DA sample is then analyzed onsite using Laser Ablation Absorption Ratio Spectrometry-Destructive Assay (LAARS-DA). DA results are determined in a matter of minutes at sufficient accuracy to support reliable bias defect conclusions, while greatly reducing DA sample volume, analysis time, and cost.

  8. Nursing home care quality: a cluster analysis.

    PubMed

    Grøndahl, Vigdis Abrahamsen; Fagerli, Liv Berit

    2017-02-13

    Purpose: The purpose of this paper is to explore potential differences in how nursing home residents rate care quality and to explore cluster characteristics. Design/methodology/approach: A cross-sectional design was used, with one questionnaire including questions from quality from patients' perspective and Big Five personality traits, together with questions related to socio-demographic aspects and health condition. Residents (n=103) from four Norwegian nursing homes participated (74.1 per cent response rate). Hierarchical cluster analysis identified clusters with respect to care quality perceptions. χ² tests and one-way between-groups ANOVA were performed to characterise the clusters (p<0.05). Findings: Two clusters were identified; Cluster 1 residents (28.2 per cent) had the best care quality perceptions and Cluster 2 (67.0 per cent) had the worst perceptions. The clusters were statistically significant and characterised by personal-related conditions: gender, psychological well-being, preferences, admission, satisfaction with staying in the nursing home, emotional stability and agreeableness, and by external objective care conditions: healthcare personnel and registered nurses. Research limitations/implications: Residents assessed as having no cognitive impairments were included, thus excluding the largest group. By choosing questionnaire design and structured interviews, the number able to participate may increase. Practical implications: Findings may provide healthcare personnel and managers with increased knowledge on which to develop strategies to improve specific care quality perceptions. Originality/value: Cluster analysis can be an effective tool for differentiating between nursing home residents' care quality perceptions.

  9. Evidence for a chemical enrichment coupling of globular clusters and field stars in the Fornax dSph

    NASA Astrophysics Data System (ADS)

    Hendricks, Benjamin; Boeche, Corrado; Johnson, Christian I.; Frank, Matthias J.; Koch, Andreas; Mateo, Mario; Bailey, John I.

    2016-01-01

    The globular cluster H4, located in the center of the Fornax dwarf spheroidal galaxy, is crucial for understanding the formation and chemical evolution of star clusters in low-mass galactic environments. H4 is peculiar because the cluster is significantly more metal-rich than the galaxy's other clusters, is located near the galaxy center, and may also be the youngest cluster in the galaxy. In this study, we present detailed chemical abundances derived from high-resolution (R ~ 28 000) spectroscopy of an isolated H4 member star for comparison with a sample of 22 nearby Fornax field stars. We find the H4 member to be depleted in the alpha-elements Si, Ca, and Ti with [Si/Fe] = -0.35 ± 0.34, [Ca/Fe] = + 0.05 ± 0.08, and [Ti/Fe] = -0.27 ± 0.23, resulting in an average [α/Fe] = -0.19 ± 0.14. If this result is representative of the average cluster properties, H4 is the only known system with a low [α/Fe] ratio and a moderately low metallicity embedded in an intact birth environment. For the field stars we find a clear sequence, seen as an early depletion in [α/Fe] at low metallicities, in good agreement with previous measurements. H4 falls on top of the observed field star [α/Fe] sequence and clearly disagrees with the properties of Milky Way halo stars. We therefore conclude that within a galaxy, the chemical enrichment of globular clusters may be closely linked to the enrichment pattern of the field star population. The low [α/Fe] ratios of H4 and similar metallicity field stars in Fornax give evidence that slow chemical enrichment environments, such as dwarf galaxies, may be the original hosts of alpha-depleted clusters in the halos of the Milky Way and M31. This article includes data gathered with the 6.5 m Magellan Telescopes located at Las Campanas Observatory, Chile.
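
    The quoted average [α/Fe] = -0.19 ± 0.14 can be checked against the three individual ratios. The sketch below assumes an unweighted mean with the three uncertainties added in quadrature; the abstract does not state how the authors combined the values, but this simple choice happens to reproduce the quoted numbers.

```python
from math import sqrt

# (value, uncertainty) pairs as quoted in the abstract.
abundances = {"Si": (-0.35, 0.34), "Ca": (0.05, 0.08), "Ti": (-0.27, 0.23)}


def unweighted_mean_with_error(pairs):
    """Unweighted mean of the values; errors combined in quadrature.

    This combination rule is an assumption, not taken from the paper.
    """
    n = len(pairs)
    mean = sum(value for value, _ in pairs) / n
    error = sqrt(sum(err * err for _, err in pairs)) / n
    return mean, error


mean, err = unweighted_mean_with_error(list(abundances.values()))
print(f"[alpha/Fe] = {mean:.2f} +/- {err:.2f}")  # [alpha/Fe] = -0.19 +/- 0.14
```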

  10. Comparative analysis of affinity-based 5-hydroxymethylation enrichment techniques

    PubMed Central

    Thomson, John P.; Hunter, Jennifer M.; Nestor, Colm E.; Dunican, Donncha S.; Terranova, Rémi; Moggs, Jonathan G.; Meehan, Richard R.

    2013-01-01

    The epigenetic modification of 5-hydroxymethylcytosine (5hmC) is receiving great attention due to its potential role in DNA methylation reprogramming and as a cell state identifier. Given this interest, it is important to identify reliable and cost-effective methods for the enrichment of 5hmC marked DNA for downstream analysis. We tested three commonly used affinity-based enrichment techniques; (i) antibody, (ii) chemical capture and (iii) protein affinity enrichment and assessed their ability to accurately and reproducibly report 5hmC profiles in mouse tissues containing high (brain) and lower (liver) levels of 5hmC. The protein-affinity technique is a poor reporter of 5hmC profiles, delivering 5hmC patterns that are incompatible with other methods. Both antibody and chemical capture-based techniques generate highly similar genome-wide patterns for 5hmC, which are independently validated by standard quantitative PCR (qPCR) and glucosyl-sensitive restriction enzyme digestion (gRES-qPCR). Both antibody and chemical capture generated profiles reproducibly link to unique chromatin modification profiles associated with 5hmC. However, there appears to be a slight bias of the antibody to bind to regions of DNA rich in simple repeats. Ultimately, the increased specificity observed with chemical capture-based approaches makes this an attractive method for the analysis of locus-specific or genome-wide patterns of 5hmC. PMID:24214958

  11. Cluster analysis of multiple planetary flow regimes

    NASA Technical Reports Server (NTRS)

    Mo, Kingtse; Ghil, Michael

    1988-01-01

    A modified cluster analysis method developed for the classification of quasi-stationary events into a few planetary flow regimes and for the examination of transitions between these regimes is described. The method was applied first to a simple deterministic model and then to a 500-mbar data set for the Northern Hemisphere (NH), for which cluster analysis was carried out in the subspace of the first seven empirical orthogonal functions (EOFs). Stationary clusters were found in the low-frequency band of more than 10 days, while transient clusters were found in the band-pass frequency window between 2.5 and 6 days. In the low-frequency band, three pairs of clusters determined EOFs 1, 2, and 3, respectively; they exhibited well-known regional features, such as blocking, the Pacific/North American pattern, and wave trains. Both model and low-pass data exhibited strong bimodality.

  12. Applications of cluster analysis to satellite soundings

    NASA Technical Reports Server (NTRS)

    Munteanu, M. J.; Jakubowicz, O.; Kalnay, E.; Piraino, P.

    1984-01-01

    The advantages of using cluster analysis to improve satellite temperature retrievals were evaluated. Natural clusters, which are associated with atmospheric temperature soundings characteristic of different types of air masses, have the potential to improve stratified regression schemes compared with currently used methods, which stratify soundings by latitude, season, and land/ocean. The method of discriminatory analysis was used. The correct cluster of temperature profiles from satellite measurements was located in 85% of the cases. Considerable improvement was observed at all mandatory levels using regression retrievals derived in the clusters of temperature (weighted and nonweighted) in comparison with the control experiment and with the regression retrievals derived in the clusters of brightness temperatures of 3 MSU and 5 IR channels.

  13. Ultrafast clustering algorithms for metagenomic sequence analysis

    PubMed Central

    Fu, Limin; Niu, Beifang; Wu, Sitao; Wooley, John

    2012-01-01

    The rapid advances of high-throughput sequencing technologies have dramatically promoted metagenomic studies of microbial communities that exist in various environments. Fundamental questions in metagenomics include the identities, composition and dynamics of microbial populations and their functions and interactions. However, the massive quantity and the comprehensive complexity of these sequence data pose tremendous challenges in data analysis. These challenges include but are not limited to ever-increasing computational demand, biased sequence sampling, sequence errors, sequence artifacts and novel sequences. Sequence clustering methods can directly answer many of the fundamental questions by grouping similar sequences into families. In addition, clustering analysis also addresses the challenges in metagenomics. Thus, a large redundant data set can be represented with a small non-redundant set, where each cluster can be represented by a single entry or a consensus. Artifacts can be rapidly detected through clustering. Errors can be identified, filtered or corrected by using consensus from sequences within clusters. PMID:22772836

  14. [Cluster analysis and its application].

    PubMed

    Půlpán, Zdenĕk

    2002-01-01

    The study exploits knowledge-oriented and context-based modifications of well-known (fuzzy) clustering algorithms. Fuzzy sets are also inherently suited to coping with linguistic domain knowledge. We aim to obtain, from rich and diverse data and knowledge, new information about the environment being explored.

  15. Interactive Maximum Reliability Cluster Analysis.

    ERIC Educational Resources Information Center

    Mays, Robert

    1978-01-01

    A FORTRAN program for clustering variables using the alpha coefficient of reliability is described. For batch operation, a rule for stopping the agglomerative procedure is available. The conversational version of the program allows the user to intervene in the process in order to test the final solution for sensitivity to changes. (Author/JKS)
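
    The quantity at the heart of this record, the alpha coefficient of reliability (Cronbach's alpha), can be computed as in the sketch below. This only illustrates the coefficient itself; the program's agglomeration and stopping rules are not described in the abstract and are not reproduced here, and the function name and toy scores are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k variables scored on the same respondents.

    items is a list of k lists, each holding one score per respondent.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two variables")
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sums
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy scores (made up): the first two variables agree closely, the third less so.
v1 = [1, 2, 3, 4, 5]
v2 = [2, 2, 3, 5, 5]
v3 = [3, 1, 4, 2, 5]
print(round(cronbach_alpha([v1, v2]), 3))
print(round(cronbach_alpha([v1, v2, v3]), 3))
```

    An agglomerative scheme built on this coefficient would presumably merge, at each step, the variables or clusters whose combination gives the highest alpha, but that reading is an assumption.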

  16. Cluster Analysis of Adolescent Blogs

    ERIC Educational Resources Information Center

    Liu, Eric Zhi-Feng; Lin, Chun-Hung; Chen, Feng-Yi; Peng, Ping-Chuan

    2012-01-01

    Emerging web applications and networking systems such as blogs have become popular, and they offer unique opportunities and environments for learners, especially for adolescent learners. This study attempts to explore the writing styles and genres used by adolescents in their blogs by employing content, factor, and cluster analyses. Factor…

  17. Constraints on mass loss and self-enrichment scenarios for the globular clusters of the Fornax dSph

    NASA Astrophysics Data System (ADS)

    Larsen, S. S.; Strader, J.; Brodie, J. P.

    2012-08-01

    Recently, high-dispersion spectroscopy has demonstrated conclusively that four of the five globular clusters (GCs) in the Fornax dwarf spheroidal galaxy are very metal-poor with [Fe/H] < -2. The remaining cluster, Fornax 4, has [Fe/H] = -1.4. This is in stark contrast to the field star metallicity distribution which shows a broad peak around [Fe/H] ≈ -1 with only a few percent of the stars having [Fe/H] < -2. If we only consider stars and clusters with [Fe/H] < -2 we thus find an extremely high GC specific frequency, SN ≈ 400, implying by far the highest ratio of GCs to field stars known anywhere. We estimate that about 1/5-1/4 of all stars in the Fornax dSph with [Fe/H] < -2 belong to the four most metal-poor GCs. These GCs could, therefore, at most have been a factor of 4-5 more massive initially. Yet, the Fornax GCs appear to share the same anomalous chemical abundance patterns known from Milky Way GCs, commonly attributed to the presence of multiple stellar generations within the clusters. The extreme ratio of metal-poor GC- versus field stars in the Fornax dSph is difficult to reconcile with scenarios for self-enrichment and early evolution of GCs in which a large fraction (90%-95%) of the first-generation stars have been lost. It also suggests that the GCs may not have formed as part of a larger population of now disrupted clusters with an initial power-law mass distribution. The Fornax dSph may be a rosetta stone for constraining theories of the formation, self-enrichment and early dynamical evolution of star clusters. Based on observations made with ESO Telescopes at the La Silla Paranal Observatory under programme ID 078.B-0631(A).
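
    The arithmetic behind the constraint in this abstract can be made explicit. The sketch assumes the limiting case in which every metal-poor field star was once a cluster member, and it ignores any other mass-loss channel; the function names are illustrative only.

```python
def max_initial_mass_factor(fraction_in_clusters):
    """Upper limit on (initial / present) cluster mass.

    If a fraction f of the metal-poor stars is in clusters today, and all
    remaining metal-poor stars once belonged to those clusters, the
    clusters were at most 1 / f times more massive initially.
    """
    return 1.0 / fraction_in_clusters


def required_mass_factor(lost_fraction):
    """Initial / present mass implied by losing a given fraction of stars."""
    return 1.0 / (1.0 - lost_fraction)


for f in (1 / 5, 1 / 4):
    print(f"fraction in GCs {f:.2f}: at most x{max_initial_mass_factor(f):.0f}")
for lost in (0.90, 0.95):
    print(f"lost fraction {lost:.2f}: needs x{required_mass_factor(lost):.0f}")
```

    Under these assumptions the observed limit (a factor of 4 to 5) falls well short of the factor of 10 to 20 implied by a 90%-95% loss of first-generation stars, which is the tension the abstract points to.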

  18. LISSAT Analysis of a Generic Centrifuge Enrichment Plant

    SciTech Connect

    Lambert, H; Elayat, H A; O'Connell, W J; Szytel, L; Dreicer, M

    2007-05-31

    The U.S. Department of Energy (DOE) is interested in developing tools and methods for use in designing and evaluating safeguards systems for current and future plants in the nuclear power fuel cycle. The DOE is engaging several DOE National Laboratories in efforts applied to safeguards for chemical conversion plants and gaseous centrifuge enrichment plants. As part of the development, Lawrence Livermore National Laboratory has developed an integrated safeguards system analysis tool (LISSAT). This tool provides modeling and analysis of facility and safeguards operations, generation of diversion paths, and evaluation of safeguards system effectiveness. The constituent elements of diversion scenarios, including material extraction and concealment measures, are structured using directed graphs (digraphs) and fault trees. Statistical analysis evaluates the effectiveness of measurement verification plans and randomly timed inspections. Time domain simulations analyze significant scenarios, especially those involving alternate time ordering of events or issues of timeliness. Such simulations can provide additional information to the fault tree analysis and can help identify the range of normal operations and, by extension, identify additional plant operational signatures of diversions. LISSAT analyses can be used to compare the diversion-detection probabilities for individual safeguards technologies and to inform overall strategy implementations for present and future plants. Additionally, LISSAT can be the basis for a rigorous cost-effectiveness analysis of safeguards and design options. This paper will describe the results of a LISSAT analysis of a generic centrifuge enrichment plant. The paper will describe the diversion scenarios analyzed and the effectiveness of various safeguards systems alternatives.

  19. Simple enrichment and analysis of plasma lysophosphatidic acids.

    PubMed

    Wang, Jialu; Sibrian-Vazquez, Martha; Escobedo, Jorge O; Lowry, Mark; Wang, Lei; Chu, Yu-Hsuan; Moore, Richard G; Strongin, Robert M

    2013-11-21

    A simple and highly efficient technique for the analysis of lysophosphatidic acid (LPA) subspecies in human plasma is described. The streamlined sample preparation protocol furnishes the five major LPA subspecies with excellent recoveries. Extensive analysis of the enriched sample reveals only trace levels of other phospholipids. This level of purity not only improves MS analyses, but enables HPLC post-column detection in the visible region with a commercially available fluorescent phospholipids probe. Human plasma samples from different donors were analyzed using the above method and validated by LC-ESI/MS/MS.

  20. Cluster Analysis of the Malaysian Hipposideros

    NASA Astrophysics Data System (ADS)

    Sazali, Siti Nurlydia; Laman, Charlie J.; Abdullah, M. T.

    2008-01-01

    A preliminary study on the morphometric variations among species in the genus Hipposideros was conducted using voucher specimens from the Universiti Malaysia Sarawak (UNIMAS) Zoological Museum and the Department of Wildlife and National Park (DWNP) Kuala Lumpur. A total of 24 individuals from six species of this genus were studied morphologically, and all related body, skull and dental measurements were recorded. Cluster analysis of the data shows that the genus Hipposideros is divided into two major clusters, with each species clearly separated. Cluster analysis among Hipposideros species is useful as an aid to species identification.

  1. Digital image analysis of haematopoietic clusters.

    PubMed

    Benzinou, A; Hojeij, Y; Roudot, A-C

    2005-02-01

    Counting and differentiating cell clusters is a tedious task when performed with a light microscope. Moreover, biased counts and interpretation are difficult to avoid because of the difficulty of evaluating the limits between different types of clusters. Presented here is a computer-based application able to solve these problems. The image analysis system is entirely automatic, from the stage screening to the statistical analysis of the results of each experimental plate. Good correlations are found with measurements made by a specialised technician.

  2. Using cluster analysis to explore survey data.

    PubMed

    Spencer, Llinos; Roberts, Gwerfyl; Irvine, Fiona; Jones, Peter; Baker, Colin

    2007-01-01

    Llinos Haf Spencer reports on the use of the cluster analysis statistical technique in nursing research and uses data from the Welsh Language Awareness in Healthcare Provision in Wales survey as an exemplar. She concludes that cluster analysis is a valuable tool to tease out patterns in data that are not initially evident in bivariate analyses and thus should be considered as a viable option for nursing research.

  3. MAVTgsa: An R Package for Gene Set (Enrichment) Analysis

    DOE PAGES

    Chien, Chih-Yi; Chang, Ching-Wei; Tsai, Chen-An; ...

    2014-01-01

    Gene set analysis methods aim to determine whether an a priori defined set of genes shows statistically significant difference in expression on either categorical or continuous outcomes. Although many methods for gene set analysis have been proposed, a systematic analysis tool for identification of different types of gene set significance modules has not been developed previously. This work presents an R package, called MAVTgsa, which includes three different methods for integrated gene set enrichment analysis. (1) The one-sided OLS (ordinary least squares) test detects coordinated changes of genes in a gene set in one direction, either up- or downregulation. (2) The two-sided MANOVA (multivariate analysis of variance) detects changes in both up- and downregulation for studying two or more experimental conditions. (3) A random forests-based procedure identifies gene sets that can accurately predict samples from different experimental conditions or are associated with continuous phenotypes. MAVTgsa computes the P values and FDR (false discovery rate) q-values for all gene sets in the study. Furthermore, MAVTgsa provides several visualization outputs to support and interpret the enrichment results. This package is available online.

  4. Cluster Analysis and Clinical Asthma Phenotypes

    PubMed Central

    Shaw, Dominic E.; Berry, Michael A.; Thomas, Michael; Brightling, Christopher E.; Wardlaw, Andrew J.

    2014-01-01

    Rationale: Heterogeneity in asthma expression is multidimensional, including variability in clinical, physiologic, and pathologic parameters. Classification requires consideration of these disparate domains in a unified model. Objectives: To explore the application of a multivariate mathematical technique, k-means cluster analysis, for identifying distinct phenotypic groups. Methods: We performed k-means cluster analysis in three independent asthma populations. Clusters of a population managed in primary care (n = 184) with predominantly mild to moderate disease were compared with a refractory asthma population managed in secondary care (n = 187). We then compared differences in asthma outcomes (exacerbation frequency and change in corticosteroid dose at 12 mo) between clusters in a third population of 68 subjects with predominantly refractory asthma, clustered at entry into a randomized trial comparing a strategy of minimizing eosinophilic inflammation (inflammation-guided strategy) with standard care. Measurements and Main Results: Two clusters (early-onset atopic and obese, noneosinophilic) were common to both asthma populations. Two clusters characterized by marked discordance between symptom expression and eosinophilic airway inflammation (early-onset symptom predominant and late-onset inflammation predominant) were specific to refractory asthma. Inflammation-guided management was superior for both discordant subgroups leading to a reduction in exacerbation frequency in the inflammation-predominant cluster (3.53 [SD, 1.18] vs. 0.38 [SD, 0.13] exacerbation/patient/yr, P = 0.002) and a dose reduction of inhaled corticosteroid in the symptom-predominant cluster (mean difference, 1,829 μg beclomethasone equivalent/d [95% confidence interval, 307–3,349 μg]; P = 0.02). Conclusions: Cluster analysis offers a novel multidimensional approach for identifying asthma phenotypes that exhibit differences in clinical response to treatment algorithms. PMID:18480428
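
    For readers unfamiliar with the technique named here, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the study's analysis: the two features, the synthetic points, the naive initialisation from the first k points, and the absence of variable standardisation are all simplifying assumptions made for illustration.

```python
from math import dist


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update.

    Centroids start at the first k points (a naive, deterministic choice
    used for this sketch; real analyses usually use repeated random or
    k-means++ starts).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Synthetic two-feature data (hypothetical symptom and inflammation scores).
subjects = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.9, 1.1), (9.1, 8.9)]
labels, centroids = kmeans(subjects, k=2)
print(labels)
print(centroids)
```

    In practice the variables would be put on a common scale first and the number of clusters chosen by some criterion; the abstract does not say how either was done.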

  5. Origin of central abundances in the hot intra-cluster medium. II. Chemical enrichment and supernova yield models

    NASA Astrophysics Data System (ADS)

    Mernier, F.; de Plaa, J.; Pinto, C.; Kaastra, J. S.; Kosec, P.; Zhang, Y.-Y.; Mao, J.; Werner, N.; Pols, O. R.; Vink, J.

    2016-11-01

    The hot intra-cluster medium (ICM) is rich in metals, which are synthesised by supernovae (SNe) and accumulate over time into the deep gravitational potential well of clusters of galaxies. Since most of the elements visible in X-rays are formed by type Ia (SNIa) and/or core-collapse (SNcc) supernovae, measuring their abundances gives us direct information on the nucleosynthesis products of billions of SNe since the epoch of the star formation peak (z ~ 2-3). In this study, we compare the most accurate average X/Fe abundance ratios (compiled in a previous work from XMM-Newton EPIC and RGS observations of 44 galaxy clusters, groups, and ellipticals), representative of the chemical enrichment in the nearby ICM, to various SNIa and SNcc nucleosynthesis models found in the literature. The use of a SNcc model combined with any favoured standard SNIa model (deflagration or delayed-detonation) fails to reproduce our abundance pattern. In particular, the Ca/Fe and Ni/Fe ratios are significantly underestimated by the models. We show that the Ca/Fe ratio can be reproduced better, either by taking a SNIa delayed-detonation model that matches the observations of the Tycho supernova remnant, or by adding a contribution from the "Ca-rich gap transient" SNe, whose material should easily mix into the hot ICM. On the other hand, the Ni/Fe ratio can be reproduced better by assuming that both deflagration and delayed-detonation SNIa contribute in similar proportions to the ICM enrichment. In either case, the fraction of SNIa over the total number of SNe (SNIa+SNcc) contributing to the ICM enrichment ranges within 29-45%. This fraction is found to be systematically higher than the corresponding SNIa/(SNIa+SNcc) fraction contributing to the enrichment of the proto-solar environment (15-25%). We also discuss and quantify two useful constraints on both SNIa (i.e. the initial metallicity on SNIa progenitors and the fraction of low-mass stars that result in SNIa) and SNcc (i.e. the effect of…

  6. Enrichment and Analysis of Intact Phosphoproteins in Arabidopsis Seedlings.

    PubMed

    Aryal, Uma K; Ross, Andrew R S; Krochko, Joan E

    2015-01-01

    Protein phosphorylation regulates diverse cellular functions and plays a key role in the early development of plants. To complement and expand upon previous investigations of protein phosphorylation in Arabidopsis seedlings we used an alternative approach that combines protein extraction under non-denaturing conditions with immobilized metal-ion affinity chromatography (IMAC) enrichment of intact phosphoproteins in Rubisco-depleted extracts, followed by identification using two-dimensional gel electrophoresis (2-DE) and liquid chromatography-tandem mass spectrometry (LC-MS/MS). In-gel trypsin digestion and analysis of selected gel spots identified 144 phosphorylated peptides and residues, of which only 18 phosphopeptides and 8 phosphosites were found in the PhosPhAt 4.0 and P3DB Arabidopsis thaliana phosphorylation site databases. More than half of the 82 identified phosphoproteins were involved in carbohydrate metabolism, photosynthesis/respiration or oxidative stress response mechanisms. Enrichment of intact phosphoproteins prior to 2-DE and LC-MS/MS appears to enhance detection of phosphorylated threonine and tyrosine residues compared with methods that utilize peptide-level enrichment, suggesting that the two approaches are somewhat complementary in terms of phosphorylation site coverage. Comparing results for young seedlings with those obtained previously for mature Arabidopsis leaves identified five proteins that are differentially phosphorylated in these tissues, demonstrating the potential of this technique for investigating the dynamics of protein phosphorylation during plant development.

  7. Enrichment and single-cell analysis of circulating tumor cells

    PubMed Central

    Song, Yanling; Tian, Tian; Shi, Yuanzhi; Liu, Wenli; Zou, Yuan; Khajvand, Tahereh; Wang, Sili; Zhu, Zhi

    2017-01-01

    Up to 90% of cancer-related deaths are caused by metastatic cancer. Circulating tumor cells (CTCs), a type of cancer cell that spreads through the blood after detaching from a solid tumor, are essential for the establishment of distant metastasis for a given cancer. As a new type of liquid biopsy, analysis of CTCs offers the possibility to avoid invasive tissue biopsy procedures with practical implications for diagnostics. The fundamental challenges of analyzing and profiling CTCs are the extremely low abundances of CTCs in the blood and the intrinsic heterogeneity of CTCs. Various technologies have been proposed for the enrichment and single-cell analysis of CTCs. This review aims to provide in-depth insights into CTC analysis, including various techniques for isolation of CTCs with capture methods based on physical and biochemical principles, and single-cell analysis of CTCs at the genomic, proteomic and phenotypic level, as well as current developmental trends and promising research directions. PMID:28451298

  8. Correcting an analysis of variance for clustering.

    PubMed

    Hedges, Larry V; Rhoads, Christopher H

    2011-02-01

    A great deal of educational and social data arises from cluster sampling designs where clusters involve schools, classrooms, or communities. A mistake that is sometimes encountered in the analysis of such data is to ignore the effect of clustering and analyse the data as if it were based on a simple random sample. This typically leads to an overstatement of the precision of results and too liberal conclusions about precision and statistical significance of mean differences. This paper gives simple corrections to the test statistics that would be computed in an analysis of variance if clustering were (incorrectly) ignored. The corrections are multiplicative factors depending on the total sample size, the cluster size, and the intraclass correlation structure. For example, the corrected F statistic has Fisher's F distribution with reduced degrees of freedom. The corrected statistic reduces to the F statistic computed by ignoring clustering when the intraclass correlations are zero. It reduces to the F statistic computed using cluster means when the intraclass correlations are unity, and it is in between otherwise. A similar adjustment to the usual statistic for testing a linear contrast among group means is described.
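
    The abstract does not give the paper's correction factors, so the sketch below illustrates the general idea with a different, widely used approximation: the Kish design effect for equal-sized clusters, 1 + (n - 1) * ICC. It is a stand-in for illustration, not the corrected F statistic described above, and the function names are made up.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equal cluster sizes: 1 + (n - 1) * ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("intraclass correlation must lie in [0, 1]")
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(total_n, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return total_n / design_effect(cluster_size, icc)


# 400 pupils sampled as 20 classrooms of 20, with several ICC values.
for icc in (0.0, 0.1, 1.0):
    n_eff = effective_sample_size(400, 20, icc)
    print(f"ICC = {icc:.1f}: effective n = {n_eff:.1f}")
```

    The limiting behaviour mirrors what the abstract describes: with an intraclass correlation of zero nothing changes, and with a correlation of one the sample carries only as much information as the number of clusters.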

  9. ASteCA: Automated Stellar Cluster Analysis

    NASA Astrophysics Data System (ADS)

    Perren, G. I.; Vázquez, R. A.; Piatti, A. E.

    2015-04-01

    We present the Automated Stellar Cluster Analysis package (ASteCA), a suite of tools designed to fully automate the standard tests applied on stellar clusters to determine their basic parameters. The set of functions included in the code make use of positional and photometric data to obtain precise and objective values for a given cluster's center coordinates, radius, luminosity function and integrated color magnitude, as well as characterizing through a statistical estimator its probability of being a true physical cluster rather than a random overdensity of field stars. ASteCA incorporates a Bayesian field star decontamination algorithm capable of assigning membership probabilities using photometric data alone. An isochrone fitting process based on the generation of synthetic clusters from theoretical isochrones and selection of the best fit through a genetic algorithm is also present, which allows ASteCA to provide accurate estimates for a cluster's metallicity, age, extinction and distance values along with their uncertainties. To validate the code we applied it on a large set of over 400 synthetic MASSCLEAN clusters with varying degrees of field star contamination as well as a smaller set of 20 observed Milky Way open clusters (Berkeley 7, Bochum 11, Czernik 26, Czernik 30, Haffner 11, Haffner 19, NGC 133, NGC 2236, NGC 2264, NGC 2324, NGC 2421, NGC 2627, NGC 6231, NGC 6383, NGC 6705, Ruprecht 1, Tombaugh 1, Trumpler 1, Trumpler 5 and Trumpler 14) studied in the literature. The results show that ASteCA is able to recover cluster parameters with an acceptable precision even for those clusters affected by substantial field star contamination. ASteCA is written in Python and is made available as an open source code which can be downloaded ready to be used from its official site.

  10. Improved Methods for the Enrichment and Analysis of Glycated Peptides

    SciTech Connect

    Zhang, Qibin; Schepmoes, Athena A; Brock, Jonathan W; Wu, Si; Moore, Ronald J; Purvine, Samuel O; Baynes, John; Smith, Richard D; Metz, Thomas O

    2008-12-15

    Non-enzymatic glycation of tissue proteins has important implications in the development of complications of diabetes mellitus. Herein we report improved methods for the enrichment and analysis of glycated peptides using boronate affinity chromatography and electron transfer dissociation mass spectrometry, respectively. The enrichment of glycated peptides was improved by replacing an off-line desalting step with an on-line wash of column-bound glycated peptides using 50 mM ammonium acetate. The analysis of glycated peptides by MS/MS was improved by considering only higher charged (≥3) precursor-ions during data-dependent acquisition, which increased the number of glycated peptide identifications. Similarly, the use of supplemental collisional activation after electron transfer (ETcaD) resulted in more glycated peptide identifications when the MS survey scan was acquired with enhanced resolution. In general, acquiring ETD-MS/MS data at a normal MS survey scan rate, in conjunction with the rejection of both 1+ and 2+ precursor-ions, increased the number of identified glycated peptides relative to ETcaD or the enhanced MS survey scan rate. Finally, an evaluation of trypsin, Arg-C, and Lys-C showed that tryptic digestion of glycated proteins was comparable to digestion with Lys-C and that both were better than Arg-C in terms of the number of glycated peptides identified by LC-MS/MS.

  11. Preliminary uranium enrichment analysis results using cadmium zinc telluride detectors

    NASA Astrophysics Data System (ADS)

    Lavietes, Anthony D.; McQuaid, James H.; Paulus, T. J.

    1996-10-01

    Lawrence Livermore National Laboratory (LLNL) and EG and G ORTEC have jointly developed a portable ambient-temperature detection system that can be used in a number of application scenarios. The detection system uses a planar cadmium zinc telluride (CZT) detector with custom-designed detector support electronics developed at LLNL and is based on the recently released MicroNOMAD multichannel analyzer (MCA) produced by ORTEC. Spectral analysis is performed using software developed at LLNL that was originally designed for use with high-purity germanium (HPGe) detector systems. In one application, the CZT detection system determines uranium enrichments ranging from less than 3% to over 75% to within accuracies of 20%. The analysis was performed using sample sizes of 200 g or larger and acquisition times of 30 min. We have demonstrated the capabilities of this system by analyzing the spectra gathered by the CZT detection system from uranium sources of several enrichments. These experiments demonstrate that current CZT detectors can, in some cases, approach performance criteria that were previously the exclusive domain of larger HPGe detector systems.

  12. Improved Methods for the Enrichment and Analysis of Glycated Peptides

    PubMed Central

    Zhang, Qibin; Schepmoes, Athena A.; Brock, Jonathan W. C.; Wu, Si; Moore, Ronald J.; Purvine, Samuel O.; Baynes, John W.; Smith, Richard D.; Metz, Thomas O.

    2009-01-01

    Nonenzymatic glycation of tissue proteins has important implications in the development of complications of diabetes mellitus. Herein we report improved methods for the enrichment and analysis of glycated peptides using boronate affinity chromatography and electron-transfer dissociation mass spectrometry, respectively. The enrichment of glycated peptides was improved by replacing an off-line desalting step with an online wash of column-bound glycated peptides using 50 mM ammonium acetate, followed by elution with 100 mM acetic acid. The analysis of glycated peptides by MS/MS was improved by considering only higher charged (≥3) precursor ions during data-dependent acquisition, which increased the number of glycated peptide identifications. Similarly, the use of supplemental collisional activation after electron transfer (ETcaD) resulted in more glycated peptide identifications when the MS survey scan was acquired with enhanced resolution. Acquiring ETD-MS/MS data at a normal MS survey scan rate, in conjunction with the rejection of both 1+ and 2+ precursor ions, increased the number of identified glycated peptides relative to ETcaD or the enhanced MS survey scan rate. Finally, an evaluation of trypsin, Arg-C, and Lys-C showed that tryptic digestion of glycated proteins was comparable to digestion with Lys-C and that both were better than Arg-C in terms of the number of glycated peptides and corresponding glycated proteins identified by LC–MS/MS. PMID:18989935

  13. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies.

    PubMed

    Kofler, Robert; Schlötterer, Christian

    2012-08-01

    An analysis of gene set [e.g. Gene Ontology (GO)] enrichment assumes that all genes are sampled independently from each other with the same probability. These assumptions are violated in genome-wide association (GWA) studies since (i) longer genes typically have more single-nucleotide polymorphisms resulting in a higher probability of being sampled and (ii) overlapping genes are sampled in clusters. Herein, we introduce Gowinda, a software specifically designed to test for enrichment of gene sets in GWA studies. We show that GO tests on GWA data could result in a substantial number of false-positive GO terms. Permutation tests implemented in Gowinda eliminate these biases, but maintain sufficient power to detect enrichment of GO terms. Since sufficient resolution for large datasets requires millions of permutations, we use multi-threading to keep computation times reasonable. Gowinda is implemented in Java (v1.6) and freely available on http://code.google.com/p/gowinda/. Contact: christian.schloetterer@vetmeduni.ac.at. Manual: http://code.google.com/p/gowinda/wiki/Manual. Test data and tutorial: http://code.google.com/p/gowinda/wiki/Tutorial. Validation: http://code.google.com/p/gowinda/wiki/VALIDATION.
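
    The idea of permuting at the SNP level rather than the gene level can be sketched as follows. This is a simplified illustration under stated assumptions and not Gowinda's actual implementation: the function name, the `snp_to_genes` mapping and the toy data are hypothetical, and a single gene set is tested without any multiple-testing correction.

```python
import random


def snp_permutation_pvalue(snp_to_genes, candidate_snps, gene_set,
                           n_perm=1000, seed=0):
    """Empirical p-value for enrichment of `gene_set` among candidate SNPs.

    Random SNP sets of the same size as the candidate set are drawn from
    all SNPs, so long genes (many SNPs) and overlapping genes (shared SNPs)
    are sampled as often as they would be by chance.
    """
    rng = random.Random(seed)
    all_snps = sorted(snp_to_genes)
    gene_set = set(gene_set)

    def genes_hit(snps):
        # Each gene counts once, however many of the SNPs fall into it.
        genes = set()
        for snp in snps:
            genes.update(snp_to_genes[snp])
        return len(genes & gene_set)

    candidates = sorted(set(candidate_snps))
    observed = genes_hit(candidates)
    at_least = sum(
        genes_hit(rng.sample(all_snps, len(candidates))) >= observed
        for _ in range(n_perm)
    )
    # Add-one estimate so the p-value is never exactly zero.
    return (at_least + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy genome: 100 SNPs, one gene per SNP (hypothetical data).
    mapping = {f"snp{i}": {f"gene{i}"} for i in range(100)}
    go_term = {f"gene{i}" for i in range(5)}
    print(snp_permutation_pvalue(mapping, [f"snp{i}" for i in range(5)], go_term))
```

    Counting each gene only once per draw is the design choice that keeps a long gene with many SNPs from inflating the enrichment statistic.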

  14. Using Cluster Analysis to Examine Husband-Wife Decision Making

    ERIC Educational Resources Information Center

    Bonds-Raacke, Jennifer M.

    2006-01-01

    Cluster analysis has a rich history in many disciplines and although cluster analysis has been used in clinical psychology to identify types of disorders, its use in other areas of psychology has been less popular. The purpose of the current experiments was to use cluster analysis to investigate husband-wife decision making. Cluster analysis was…

  15. Using Enrichment Clusters to Address the Needs of Culturally and Linguistically Diverse Learners

    ERIC Educational Resources Information Center

    Allen, Jennifer K.; Robbins, Margaret A.; Payne, Yolanda Denise; Brown, Katherine Backes

    2016-01-01

    Using data from teacher interviews, classroom observations, and a professional development workshop, this article explains how one component of the schoolwide enrichment model (SEM) has been implemented at a culturally diverse elementary school serving primarily Latina/o and African American students. Based on a broadened conception of giftedness,…

  16. Bacterial isolates from polysaccharide enrichments cluster by host origin for Firmicutes but not Bacteroidetes.

    USDA-ARS's Scientific Manuscript database

    The intestinal microbiota allows mammals to recover energy stored in plant biomass through fermentation of plant cell walls, primarily cellulose and hemicellulose. Bacteria were isolated from 8 week continuous culture enrichments with cellulose and xylan/pectin from cow (C, n=4), goat (G, n=4), huma...

  17. Using Enrichment Clusters to Address the Needs of Culturally and Linguistically Diverse Learners

    ERIC Educational Resources Information Center

    Allen, Jennifer K.; Robbins, Margaret A.; Payne, Yolanda Denise; Brown, Katherine Backes

    2016-01-01

    Using data from teacher interviews, classroom observations, and a professional development workshop, this article explains how one component of the schoolwide enrichment model (SEM) has been implemented at a culturally diverse elementary school serving primarily Latina/o and African American students. Based on a broadened conception of giftedness,…

  18. Bayesian Analysis of Multiple Populations in Galactic Globular Clusters

    NASA Astrophysics Data System (ADS)

    Wagner-Kaiser, Rachel A.; Sarajedini, Ata; von Hippel, Ted; Stenning, David; Piotto, Giampaolo; Milone, Antonino; van Dyk, David A.; Robinson, Elliot; Stein, Nathan

    2016-01-01

    We use GO 13297 Cycle 21 Hubble Space Telescope (HST) observations and archival GO 10775 Cycle 14 HST ACS Treasury observations of Galactic Globular Clusters to find and characterize multiple stellar populations. Determining how globular clusters are able to create and retain enriched material to produce several generations of stars is key to understanding how these objects formed and how they have affected the structural, kinematic, and chemical evolution of the Milky Way. We employ a sophisticated Bayesian technique with an adaptive MCMC algorithm to simultaneously fit the age, distance, absorption, and metallicity for each cluster. At the same time, we also fit unique helium values to two distinct populations of the cluster and determine the relative proportions of those populations. Our unique numerical approach allows objective and precise analysis of these complicated clusters, providing posterior distribution functions for each parameter of interest. We use these results to gain a better understanding of multiple populations in these clusters and their role in the history of the Milky Way. Support for this work was provided by NASA through grant numbers HST-GO-10775 and HST-GO-13297 from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS5-26555. This material is based upon work supported by the National Aeronautics and Space Administration under Grant NNX11AF34G issued through the Office of Space Science. This project was supported by the National Aeronautics & Space Administration through the University of Central Florida's NASA Florida Space Grant Consortium.

  19. [Bacterial diversity analysis of moderately thermophilic microflora enriched by different energy sources].

    PubMed

    Liu, Fei-fei; Zhou, Hong-bo; Fu, Bo; Qiu, Guan-zhou

    2007-06-01

    The bacterial diversity of three moderately thermophilic bioleaching microfloras grown at 50 °C on media with pyrite, chalcopyrite, or pure ferrous iron supplemented with sulfur as energy sources was investigated. The 16S rRNA genes of the microorganisms in the culture flasks were PCR amplified and cloned to identify the bacterial species by comparative sequence analysis, and the structural differences of the microfloras enriched by the different energy sources were compared. A total of 303 clones were recovered and evaluated by restriction fragment length polymorphism (RFLP) analysis. Cluster analysis identified 29 unique RFLP patterns, and the inserted 16S rRNA gene sequences were determined and used for phylogenetic analysis. Most of the sequences obtained were similar (89.1%-99.7%) to the 16S rRNA gene sequences of reported bioleaching microorganisms. The species identified from the flasks during bioleaching of pyrite, pure ferrous iron supplemented with sulfur, and chalcopyrite were closely related to Acidithiobacillus caldus, Sulfobacillus thermotolerans, Sulfobacillus thermosulfidooxidans, Leptospirillum ferriphilum, two uncultured forest soil bacterium clones and one uncultured proteobacterium clone. Among these bacteria, Acidithiobacillus caldus, Sulfobacillus thermotolerans and Leptospirillum ferriphilum were the dominant bacterial species. L. ferriphilum was the most dominant species in the microfloras enriched in media with pyrite and with ferrous iron supplemented with sulfur as energy sources, with abundances of 53.8% and 45.9%, respectively. In the culture with chalcopyrite as the energy source, S. thermotolerans had the highest abundance, 70.1%.

  20. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

    PubMed

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

    2014-11-01

    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
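
    The reported stability figures can be reproduced from the counts given above. The sketch below is an illustration only: it assumes clusters are coded as ordered integers (0 = healthy, 1 = moderately healthy, 2 = unhealthy), and the function names are hypothetical rather than taken from the study.

```python
def classify_transitions(baseline, follow_up):
    """Count participants who stayed, moved healthier, or moved unhealthier.

    Assumption: labels are ordered integers where a lower value means a
    healthier dietary-pattern cluster.
    """
    if len(baseline) != len(follow_up):
        raise ValueError("both waves must cover the same participants")
    counts = {"same": 0, "healthier": 0, "unhealthier": 0}
    for before, after in zip(baseline, follow_up):
        if after == before:
            counts["same"] += 1
        elif after < before:
            counts["healthier"] += 1
        else:
            counts["unhealthier"] += 1
    return counts


def shares(counts):
    """Convert transition counts to percentages rounded to one decimal."""
    total = sum(counts.values())
    return {key: round(100.0 * value / total, 1) for key, value in counts.items()}


if __name__ == "__main__":
    # Counts reported in the abstract for the 379 follow-up participants.
    print(shares({"same": 251, "unhealthier": 45, "healthier": 83}))
```

    Applied to the reported counts, the percentages come out as 66.2%, 11.9% and 21.9%, matching the abstract.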

  1. Shotgun proteome analysis of honeybee venom using targeted enrichment strategies.

    PubMed

    Matysiak, Jan; Hajduk, Joanna; Pietrzak, Łukasz; Schmelzer, Christian E H; Kokot, Zenon J

    2014-11-01

    The aim of this study was to explore the honeybee venom proteome by applying a shotgun proteomics approach with different enrichment strategies (combinatorial peptide ligand libraries and solid phase extraction). The studies were conducted using a nano-LC/MALDI-TOF/TOF-MS system. The MS analysis of peptide profiles (in the range of 900-4500 Da) and the virtual gel-image of proteins from the Lab-on-Chip assay (in the range of 10-250 kDa) confirm that the use of targeted enrichment strategies increases detection of honeybee venom components. The gel-free shotgun strategy and sophisticated instrumentation led to a significant increase in sensitivity and a higher number of identified peptides in honeybee venom samples compared with the current literature. Moreover, 11 of 12 known honeybee venom allergens were acknowledged and 4 new, so far uncharacterized proteins were identified. In addition, similarity searches were performed in order to investigate biological relations and homology between the newly identified protein sequences from Apis mellifera and other Hymenoptera. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Analysis of Enriched Uranyl Nitrate in Nested Annular Tank Array

    SciTech Connect

    John D. Bess; James D. Cleaver

    2009-06-01

    Two series of experiments were performed at the Rocky Flats Critical Mass Laboratory during the 1980s using highly enriched (93%) uranyl nitrate solution in annular tanks. [1, 2] Tanks were of typical sizes found in nuclear production plants. Experiments looked at tanks of varying radii in a co-located set of nested tanks, a 1 by 2 array, and a 1 by 3 array. The co-located set of tanks had been analyzed previously [3] as a benchmark for inclusion within the International Handbook of Evaluated Criticality Safety Benchmark Experiments. [4] The current study represents the benchmark analysis of the 1 by 3 array of a series of nested annular tanks. Of the seventeen configurations performed in this set of experiments, twelve were evaluated and nine were judged as acceptable benchmarks.

  3. Globular Cluster Formation at High Density: A Model for Elemental Enrichment with Fast Recycling of Massive-star Debris

    NASA Astrophysics Data System (ADS)

    Elmegreen, Bruce G.

    2017-02-01

    The self-enrichment of massive star clusters by p-processed elements is shown to increase significantly with increasing gas density as a result of enhanced star formation rates and stellar scatterings compared to the lifetime of a massive star. Considering the type of cloud core where a globular cluster (GC) might have formed, we follow the evolution and enrichment of the gas and the time dependence of stellar mass. A key assumption is that interactions between massive stars are important at high density, including interactions between massive stars and massive-star binaries that can shred stellar envelopes. Massive-star interactions should also scatter low-mass stars out of the cluster. Reasonable agreement with the observations is obtained for a cloud-core mass of ∼4 × 10⁶ M⊙ and a density of ∼2 × 10⁶ cm⁻³. The results depend primarily on a few dimensionless parameters, including, most importantly, the ratio of the gas consumption time to the lifetime of a massive star, which has to be low, ∼10%, and the efficiency of scattering low-mass stars per unit dynamical time, which has to be relatively large, such as a few percent. Also for these conditions, the velocity dispersions of embedded GCs should be comparable to the high gas dispersions of galaxies at that time, so that stellar ejection by multistar interactions could cause low-mass stars to leave a dwarf galaxy host altogether. This could solve the problem of missing first-generation stars in the halos of Fornax and WLM.

  4. Cluster and constraint analysis in tetrahedron packings.

    PubMed

    Jin, Weiwei; Lu, Peng; Liu, Lufeng; Li, Shuixiang

    2015-04-01

    The disordered packings of tetrahedra often show no obvious macroscopic orientational or positional order for a wide range of packing densities, and it has been found that the local order in particle clusters is the main order form of tetrahedron packings. Therefore, a cluster analysis is carried out to investigate the local structures and properties of tetrahedron packings in this work. We obtain a cluster distribution of differently sized clusters, and peaks are observed at two special clusters, i.e., dimer and wagon wheel. We then calculate the amounts of dimers and wagon wheels, which are observed to have linear or approximate linear correlations with packing density. Following our previous work, the amount of particles participating in dimers is used as an order metric to evaluate the order degree of the hierarchical packing structure of tetrahedra, and an order map is consequently depicted. Furthermore, a constraint analysis is performed to determine the isostatic or hyperstatic region in the order map. We employ a Monte Carlo algorithm to test jamming and then suggest a new maximally random jammed packing of hard tetrahedra from the order map with a packing density of 0.6337.

  5. Sequence analysis of porothramycin biosynthetic gene cluster.

    PubMed

    Najmanova, Lucie; Ulanova, Dana; Jelinkova, Marketa; Kamenik, Zdenek; Kettnerova, Eliska; Koberska, Marketa; Gazak, Radek; Radojevic, Bojana; Janata, Jiri

    2014-11-01

    The biosynthetic gene cluster of porothramycin, a sequence-selective DNA alkylating compound, was identified in the genome of producing strain Streptomyces albus subsp. albus (ATCC 39897) and sequentially characterized. A 39.7 kb long DNA region contains 27 putative genes, 18 of them revealing high similarity with homologous genes from biosynthetic gene cluster of closely related pyrrolobenzodiazepine (PBD) compound anthramycin. However, considering the structures of both compounds, the number of differences in the gene composition of compared biosynthetic gene clusters was unexpectedly high, indicating participation of alternative enzymes in biosynthesis of both porothramycin precursors, anthranilate, and branched L-proline derivative. Based on the sequence analysis of putative NRPS modules Por20 and Por21, we suppose that in porothramycin biosynthesis, the methylation of anthranilate unit occurs prior to the condensation reaction, while modifications of branched proline derivative, oxidation, and dimethylation of the side chain occur on already condensed PBD core. Corresponding two specific methyltransferase encoding genes por26 and por25 were identified in the porothramycin gene cluster. Surprisingly, also methyltransferase gene por18 homologous to orf19 from anthramycin biosynthesis was detected in porothramycin gene cluster even though the appropriate biosynthetic step is missing, as suggested by ultra high-performance liquid chromatography-diode array detection-mass spectrometry (UHPLC-DAD-MS) analysis of the product in the S. albus culture broth.

  6. Identifying Peer Institutions Using Cluster Analysis

    ERIC Educational Resources Information Center

    Boronico, Jess; Choksi, Shail S.

    2012-01-01

    The New York Institute of Technology's (NYIT) School of Management (SOM) wishes to develop a list of peer institutions for the purpose of benchmarking and monitoring/improving performance against other business schools. The procedure utilizes relevant criteria for the purpose of establishing this peer group by way of a cluster analysis. The…

  7. Systematization of actinides using cluster analysis

    SciTech Connect

    Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.

    1994-11-01

    A representation of the actinides in multidimensional property space is proposed for systematization of these elements using cluster analysis. Literature data for their atomic properties are used. Owing to the wide variation of published ionization potentials, medians are used to estimate them. Vertical dendrograms are used for classification on the basis of distances between the actinides in atomic-property space. The properties of actinium and lawrencium are furthest removed from the main group. Thorium and mendelevium exhibit individualized properties. A cluster based on the einsteinium-fermium pair is joined by californium.

  8. [Visual field progression in glaucoma: cluster analysis].

    PubMed

    Bresson-Dumont, H; Hatton, J; Foucher, J; Fonteneau, M

    2012-11-01

    Visual field progression analysis is one of the key points in glaucoma monitoring, but distinction between true progression and random fluctuation is sometimes difficult. There are several different algorithms but no real consensus for detecting visual field progression. The trend analysis of global indices (MD, sLV) may miss localized deficits or be affected by media opacities. Conversely, point-by-point analysis makes progression difficult to differentiate from physiological variability, particularly when the sensitivity of a point is already low. The goal of our study was to analyse visual field progression with the EyeSuite™ Octopus Perimetry Clusters algorithm in patients with no significant changes in global indices or worsening of the analysis of pointwise linear regression. We analyzed the visual fields of 162 eyes (100 patients - 58 women, 42 men, average age 66.8 ± 10.91) with ocular hypertension or glaucoma. For inclusion, at least six reliable visual fields per eye were required, and the trend analysis (EyeSuite™ Perimetry) of visual field global indices (MD and sLV) had to show no significant progression. The analysis of changes in cluster mode was then performed. In a second step, eyes with statistically significant worsening of at least one of their clusters were analyzed point-by-point with the Octopus Field Analysis (OFA). Fifty-four eyes (33.33%) had a significant worsening in some clusters, while their global indices remained stable over time. In this group of patients, more advanced glaucoma was present than in the stable group (MD 6.41 dB vs. 2.87 dB); 64.82% (35/54) of those eyes in which the clusters progressed, however, had no statistically significant change in the trend analysis by pointwise linear regression. Most software algorithms for analyzing visual field progression are essentially trend analyses of global indices, or point-by-point linear regression. This study shows the potential role of cluster trend analysis. However, for best

  9. Electrical Load Profile Analysis Using Clustering Techniques

    NASA Astrophysics Data System (ADS)

    Damayanti, R.; Abdullah, A. G.; Purnama, W.; Nandiyanto, A. B. D.

    2017-03-01

    Data mining is one of the data processing techniques used to extract information from a set of stored data. Electricity load consumption is recorded every day by the electrical company, usually at intervals of 15 or 30 minutes. This paper uses clustering, one of the data mining techniques, to analyse the electrical load profiles recorded during 2014. Three clustering methods were compared, namely K-Means (KM), Fuzzy C-Means (FCM), and K-Harmonic Means (KHM). The results show that KHM is the most appropriate method for classifying the electrical load profiles. The optimum number of clusters is determined using the Davies-Bouldin Index. Grouping the load profiles makes it possible to analyse demand variation and to estimate energy loss for each group of load profiles with a similar pattern. From each group of load profiles, a cluster load factor and a range for the cluster loss factor can be obtained, which helps to find the range of coefficient values for estimating energy loss without performing load flow studies.
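
    The Davies-Bouldin Index mentioned above can be computed directly from its definition; lower values indicate more compact, better-separated clusters, so the candidate number of clusters with the smallest index would be preferred. The following is a minimal sketch using Euclidean distances and toy points; it is not the authors' code, and the load-profile data from the paper are not reproduced here.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labelled point set (lower is better).

    For each cluster i, take the worst-case ratio (s_i + s_j) / d_ij over
    all other clusters j, where s is the mean distance of members to their
    centroid and d_ij is the distance between centroids; then average.
    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 0), (10, 2)]  # toy data, two obvious groups
    print(davies_bouldin(pts, [0, 0, 1, 1]))  # sensible grouping
    print(davies_bouldin(pts, [0, 1, 0, 1]))  # poor grouping, larger index
```

    In practice the index would be evaluated once per candidate number of clusters, using the labels produced by whichever clustering method is under comparison.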

  10. A Multivariate Analysis of Galaxy Cluster Properties

    NASA Astrophysics Data System (ADS)

    Ogle, P. M.; Djorgovski, S.

    1993-05-01

    We have assembled from the literature a data base on 394 clusters of galaxies, with up to 16 parameters per cluster. They include optical and x-ray luminosities, x-ray temperatures, galaxy velocity dispersions, central galaxy and particle densities, optical and x-ray core radii and ellipticities, etc. In addition, derived quantities, such as the mass-to-light ratios and x-ray gas masses are included. Doubtful measurements have been identified, and deleted from the data base. Our goal is to explore the correlations between these parameters, and interpret them in the framework of our understanding of evolution of clusters and large-scale structure, such as the Gott-Rees scaling hierarchy. Among the simple, monovariate correlations we found, the most significant include those between the optical and x-ray luminosities, x-ray temperatures, cluster velocity dispersions, and central galaxy densities, in various mutual combinations. While some of these correlations have been discussed previously in the literature, generally smaller samples of objects have been used. We will also present the results of a multivariate statistical analysis of the data, including a principal component analysis (PCA). Such an approach has not been used previously for studies of cluster properties, even though it is much more powerful and complete than the simple monovariate techniques which are commonly employed. The observed correlations may lead to powerful constraints for theoretical models of formation and evolution of galaxy clusters. P.M.O. was supported by a Caltech graduate fellowship. S.D. acknowledges partial support from the NASA contract NAS5-31348 and the NSF PYI award AST-9157412.

  11. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    PubMed Central

    2014-01-01

    Background: There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods: We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results: Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions: Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations

  12. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations.

    PubMed

    Corrigan, Neil; Bankart, Michael J G; Gray, Laura J; Smith, Karen L

    2014-05-24

    There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. 
Interim recommendations include avoidance of cluster merges where
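
    The loss of power from merging clusters, as described in the abstract above, can be illustrated with a widely used design-effect approximation for unequal cluster sizes, DE = 1 + ((CV^2 + 1) * m - 1) * ICC, where m is the mean cluster size and CV its coefficient of variation. The sketch below is illustrative only: the cluster sizes, the ICC of 0.05 and the function names are assumptions, not values or methods taken from the paper, and the ICC is held fixed although the abstract notes that the observed ICC may change after a merge.

```python
def design_effect(cluster_sizes, icc):
    """Approximate design effect for clusters of unequal size.

    Uses DE = 1 + ((CV^2 + 1) * mean_size - 1) * ICC with CV based on the
    population standard deviation; under that choice
    (CV^2 + 1) * mean_size equals sum(m^2) / sum(m).
    """
    weighted_size = sum(m * m for m in cluster_sizes) / sum(cluster_sizes)
    return 1.0 + (weighted_size - 1.0) * icc


def effective_sample_size(cluster_sizes, icc):
    """Total number of participants divided by the design effect."""
    return sum(cluster_sizes) / design_effect(cluster_sizes, icc)


# Hypothetical trial arm: ten clusters of 20 participants each.
before_merge = [20] * 10
# Two of those clusters merge into one cluster of 40.
after_merge = [40] + [20] * 8
assumed_icc = 0.05  # illustrative value only

print("effective n before merge:", effective_sample_size(before_merge, assumed_icc))
print("effective n after merge: ", effective_sample_size(after_merge, assumed_icc))
```

    With the same number of participants, the merged configuration has fewer, more variable clusters and therefore a larger design effect, which is one way to see why power falls.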

  13. Geographic atrophy phenotype identification by cluster analysis.

    PubMed

    Monés, Jordi; Biarnés, Marc

    2017-07-20

    To identify ocular phenotypes in patients with geographic atrophy secondary to age-related macular degeneration (GA) using a data-driven cluster analysis. This was a retrospective analysis of data from a prospective, natural history study of patients with GA who were followed for ≥6 months. Cluster analysis was used to identify subgroups within the population based on the presence of several phenotypic features: soft drusen, reticular pseudodrusen (RPD), primary foveal atrophy, increased fundus autofluorescence (FAF), greyish FAF appearance and subfoveal choroidal thickness (SFCT). A comparison of features between the subgroups was conducted, and a qualitative description of the new phenotypes was proposed. The atrophy growth rate between phenotypes was then compared. Data were analysed from 77 eyes of 77 patients with GA. Cluster analysis identified three groups: phenotype 1 was characterised by high soft drusen load, foveal atrophy and slow growth; phenotype 3 showed high RPD load, extrafoveal and greyish FAF appearance and thin SFCT; the characteristics of phenotype 2 were midway between phenotypes 1 and 3. Phenotypes differed in all measured features (p≤0.013), with decreases in the presence of soft drusen, foveal atrophy and SFCT seen from phenotypes 1 to 3 and corresponding increases in high RPD load, high FAF and greyish FAF appearance. Atrophy growth rate differed between phenotypes 1, 2 and 3 (0.63, 1.91 and 1.73 mm²/year, respectively, p=0.0005). Cluster analysis identified three distinct phenotypes in GA. One of them showed a particularly slow growth pattern.

  14. DOSE: an R/Bioconductor package for disease ontology semantic and enrichment analysis.

    PubMed

    Yu, Guangchuang; Wang, Li-Gen; Yan, Guang-Rong; He, Qing-Yu

    2015-02-15

    Disease ontology (DO) annotates human genes in the context of disease. DO is an important annotation for translating molecular findings from high-throughput data to clinical relevance. DOSE is an R package providing semantic similarity computations among DO terms and genes, which allows biologists to explore the similarities of diseases and of gene functions from a disease perspective. Enrichment analyses, including a hypergeometric model and gene set enrichment analysis, are also implemented to support discovering disease associations of high-throughput biological data. This allows biologists to verify disease relevance in a biological experiment and identify unexpected disease associations. Comparison among gene clusters is also supported. DOSE is released under Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/DOSE.html). Supplementary data are available at Bioinformatics online.
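
    The hypergeometric model mentioned in the abstract above is commonly used as an over-representation test: given a gene universe, a set of genes annotated to a term, and a user-selected gene list, it asks how likely an overlap at least as large as the observed one would be by chance. The following is a minimal Python sketch of that calculation under stated assumptions. It is not the DOSE API (DOSE is an R package), and the function name, parameter names and example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(universe, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe  -- total number of genes considered
    annotated -- genes in the universe annotated to the term
    selected  -- genes in the user's list (drawn without replacement)
    overlap   -- selected genes that carry the annotation
    """
    upper = min(annotated, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(annotated, i) * comb(universe - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 genes, 40 annotated to a disease term,
# 50 genes selected, 8 of which carry the annotation.
p_value = hypergeom_enrichment_pvalue(1000, 40, 50, 8)
print("enrichment p-value:", p_value)
```

    In practice the p-values for many terms would also need a multiple-testing correction, which this sketch leaves out.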

  15. Isotopic hybrids of nitrogenase. Mössbauer study of MoFe protein with selective 57Fe enrichment of the P-cluster.

    PubMed

    McLean, P A; Papaefthymiou, V; Orme-Johnson, W H; Münck, E

    1987-09-25

    Previous Mössbauer and EPR studies of the MoFe protein (approximately 30 Fe and 2 Mo) of nitrogenase have revealed the presence of two unique clusters, namely, the P-clusters (presumably of the Fe4S4 type) and the molybdenum- and iron-containing cofactors (or M-clusters). Mössbauer components D (approximately 10-12 Fe) and Fe2+ (approximately 4 Fe) represent subsites of the P-clusters while component S (approximately 2 Fe) appeared to belong to a separate, unidentified cluster. In order to refine the analyses of Mössbauer spectra, we have constructed an isotopic hybrid of the Klebsiella pneumoniae protein which contains 57Fe-enriched P-clusters and 56Fe-enriched M-clusters. The highly resolved 57Fe Mössbauer spectra of this hybrid show that component S behaves spectroscopically like the P-cluster sites D and Fe2+ in oxidized and reduced MoFe protein. This suggests that S is a subset of the P-clusters rather than a different cluster type. The present study shows, for the first time, that the Debye-Waller factors of different P-cluster subsites have a different temperature dependence. Thus, the Fe2+/D absorption ratio is 4.0:10.0 at 4.2 K and 4.0:11.6 at 173 K. We propose that the reduced MoFe protein contains two pairs of P-clusters: one pair containing one Fe2+ and three D-sites and the other one Fe2+, two D, and one S-site. We have argued previously that the oxidized P-clusters occur in pairs as well.

  16. AMOEBA clustering revisited. [cluster analysis, classification, and image display program]

    NASA Technical Reports Server (NTRS)

    Bryant, Jack

    1990-01-01

    A description of the clustering, classification, and image display program AMOEBA is presented. Using a difficult high resolution aircraft-acquired MSS image, the steps the program takes in forming clusters are traced. A number of new features are described here for the first time. Usage of the program is discussed. The theoretical foundation (the underlying mathematical model) is briefly presented. The program can handle images of any size and dimensionality.

  18. ClusterViz: A Cytoscape APP for Cluster Analysis of Biological Network.

    PubMed

    Wang, Jianxin; Zhong, Jiancheng; Chen, Gang; Li, Min; Wu, Fang-xiang; Pan, Yi

    2015-01-01

    Cluster analysis of biological networks is one of the most important approaches for identifying functional modules and predicting protein functions. Furthermore, visualization of clustering results is crucial to uncover the structure of biological networks. In this paper, ClusterViz, an APP of Cytoscape 3 for cluster analysis and visualization, has been developed. In order to reduce complexity and enable extendibility for ClusterViz, we designed the architecture of ClusterViz based on the framework of Open Services Gateway Initiative. According to the architecture, the implementation of ClusterViz is partitioned into three modules: the ClusterViz interface, the clustering algorithms, and visualization and export. ClusterViz facilitates the comparison of the results of different algorithms for further related analysis. Three commonly used clustering algorithms, FAG-EC, EAGLE and MCODE, are included in the current version. Because the clustering-algorithms module adopts an abstract algorithm interface, more clustering algorithms can be added in the future. To illustrate the usability of ClusterViz, we provide three examples with detailed steps from important scientific articles, which show that our tool has helped several research teams in their research on the mechanisms of biological networks.

  19. THE SUPERNOVA DELAY TIME DISTRIBUTION IN GALAXY CLUSTERS AND IMPLICATIONS FOR TYPE-Ia PROGENITORS AND METAL ENRICHMENT

    SciTech Connect

    Maoz, Dan; Sharon, Keren; Avishay Gal-Yam

    2010-10-20

    Knowledge of the supernova (SN) delay time distribution (DTD), the SN rate versus time that would follow a hypothetical brief burst of star formation, can shed light on SN progenitors and physics, as well as on the timescales of chemical enrichment in different environments. We compile recent measurements of the Type-Ia SN (SN Ia) rate in galaxy clusters at redshifts from z = 0 out to z = 1.45, just 2 Gyr after cluster star formation at z ≈ 3. We review the plausible range for the observed total iron-to-stellar mass ratio in clusters, based on the latest data and analyses, and use it to constrain the time-integrated number of SN Ia events in clusters. With these data, we recover the DTD of SNe Ia in cluster environments. The DTD is sharply peaked at the shortest time-delay interval we probe, 0 Gyr < t < 2.2 Gyr, with a low tail out to delays of ≈10 Gyr, and is remarkably consistent with several recent DTD reconstructions based on different methods, applied to different environments. We test DTD models from the literature, requiring that they simultaneously reproduce the observed cluster SN rates and the observed iron-to-stellar mass ratios. A parameterized power-law DTD of the form t^(-1.2±0.3) from t = 400 Myr to a Hubble time can satisfy both constraints. Shallower power laws such as t^(-1/2) cannot, assuming a single DTD, and a single star formation burst (either brief or extended) at high z. This implies that 50%-85% of SNe Ia explode within 1 Gyr of star formation. DTDs from double-degenerate (DD) models, which generically have ≈t^(-1) shapes over a wide range of timescales, match the data, but only if their predictions are scaled up by factors of 5-10. Single-degenerate (SD) DTDs always give poor fits to the data, due to a lack of delayed SNe and overall low numbers of SNe. 
The observations can also be reproduced with a combination of two SN Ia populations: a prompt SD population of SNe Ia that explodes within a few Gyr of star

  20. Advanced analysis of forest fire clustering

    NASA Astrophysics Data System (ADS)

    Kanevski, Mikhail; Pereira, Mario; Golay, Jean

    2017-04-01

    Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index
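
    The grid-based estimate described in the abstract above can be sketched in a few lines. The code below assumes the commonly stated form of the index, I(m) = Q^(m-1) * sum_i n_i(n_i-1)...(n_i-m+1) / (N(N-1)...(N-m+1)), where Q is the number of grid cells, n_i the number of points in cell i and N the total number of points. The function names, the unit-square domain and the synthetic point sets are assumptions made for illustration; no forest fire data are used, and the validity-domain handling of the study is not reproduced.

```python
import random
from collections import Counter


def falling_factorial(n, m):
    """Return n * (n - 1) * ... * (n - m + 1)."""
    result = 1
    for k in range(m):
        result *= n - k
    return result


def multipoint_morisita(points, cells_per_side, m=2):
    """Multi-point Morisita index for points in the unit square [0, 1) x [0, 1).

    The square is covered by a regular grid with cells_per_side**2 cells.
    Values near 1 suggest a random (Poisson-like) pattern, values above 1
    suggest clustering at the chosen grid scale.
    """
    q = cells_per_side ** 2
    counts = Counter(
        (min(int(x * cells_per_side), cells_per_side - 1),
         min(int(y * cells_per_side), cells_per_side - 1))
        for x, y in points
    )
    numerator = sum(falling_factorial(c, m) for c in counts.values())
    denominator = falling_factorial(len(points), m)
    return q ** (m - 1) * numerator / denominator


rng = random.Random(0)
# Synthetic random pattern over the whole square.
random_pattern = [(rng.random(), rng.random()) for _ in range(2000)]
# Synthetic clustered pattern: all points fall in one corner of the square.
clustered_pattern = [(0.2 * rng.random(), 0.2 * rng.random()) for _ in range(2000)]

for side in (2, 5, 10):
    print(side,
          multipoint_morisita(random_pattern, side),
          multipoint_morisita(clustered_pattern, side))
```

    Repeating the calculation for several grid sizes, as in the loop above, is what lets the index describe how clustering changes with spatial scale.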

  1. GAGE: generally applicable gene set enrichment for pathway analysis

    PubMed Central

    Luo, Weijun; Friedman, Michael S; Shedden, Kerby; Hankenson, Kurt D; Woolf, Peter J

    2009-01-01

    Background: Gene set analysis (GSA) is a widely used strategy for gene expression data analysis based on pathway knowledge. GSA focuses on sets of related genes and has established major advantages over individual gene analyses, including greater robustness, sensitivity and biological relevance. However, previous GSA methods have limited usage as they cannot handle datasets of different sample sizes or experimental designs. Results: To address these limitations, we present a new GSA method called Generally Applicable Gene-set Enrichment (GAGE). We successfully apply GAGE to multiple microarray datasets with different sample sizes, experimental designs and profiling techniques. GAGE shows significantly better results when compared to two other commonly used GSA methods of GSEA and PAGE. We demonstrate this improvement in the following three aspects: (1) consistency across repeated studies/experiments; (2) sensitivity and specificity; (3) biological relevance of the regulatory mechanisms inferred. GAGE reveals novel and relevant regulatory mechanisms from both published and previously unpublished microarray studies. From two published lung cancer data sets, GAGE derived a more cohesive and predictive mechanistic scheme underlying lung cancer progress and metastasis. For a previously unpublished BMP6 study, GAGE predicted novel regulatory mechanisms for BMP6 induced osteoblast differentiation, including the canonical BMP-TGF beta signaling, JAK-STAT signaling, Wnt signaling, and estrogen signaling pathways, all of which are supported by the experimental literature. Conclusion: GAGE is generally applicable to gene expression datasets with different sample sizes and experimental designs. GAGE consistently outperformed two most frequently used GSA methods and inferred statistically and biologically more relevant regulatory pathways. The GAGE method is implemented in R in the "gage" package, available under the GNU GPL from . PMID:19473525

  2. Neighbor overlap is enriched in the yeast interaction network: analysis and implications.

    PubMed

    Feiglin, Ariel; Moult, John; Lee, Byungkook; Ofran, Yanay; Unger, Ron

    2012-01-01

    The yeast protein-protein interaction network has been shown to have distinct topological features such as a scale free degree distribution and a high level of clustering. Here we analyze an additional feature which is called Neighbor Overlap. This feature reflects the number of shared neighbors between a pair of proteins. We show that Neighbor Overlap is enriched in the yeast protein-protein interaction network compared with control networks carefully designed to match the characteristics of the yeast network in terms of degree distribution and clustering coefficient. Our analysis also reveals that pairs of proteins with high Neighbor Overlap have higher sequence similarity, more similar GO annotations and stronger genetic interactions than pairs with low ones. Finally, we demonstrate that pairs of proteins with redundant functions tend to have high Neighbor Overlap. We suggest that a combination of three mechanisms is the basis for this feature: The abundance of protein complexes, selection for backup of function, and the need to allow functional variation.
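
    The feature described in the abstract above, the number of neighbors shared by a pair of proteins, can be computed directly from an edge list. The sketch below is a minimal illustration under stated assumptions: the function name and the toy network are hypothetical, and the paper's exact definition (for example any normalization, or the control networks it compares against) may differ.

```python
from itertools import combinations


def neighbor_overlap(edges):
    """Count shared neighbors for every pair of nodes in an undirected graph.

    edges is an iterable of (node_a, node_b) pairs. The result maps each
    sorted node pair (u, v) to the number of nodes adjacent to both u and v.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {
        (u, v): len((neighbors[u] & neighbors[v]) - {u, v})
        for u, v in combinations(sorted(neighbors), 2)
    }


# Toy interaction network with hypothetical protein names.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
overlap = neighbor_overlap(toy_edges)
for pair, shared in sorted(overlap.items()):
    print(pair, shared)
```

    Testing for enrichment would then mean comparing these counts with the same counts in randomized control networks, a step this sketch does not attempt.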

  3. Neighbor Overlap Is Enriched in the Yeast Interaction Network: Analysis and Implications

    PubMed Central

    Feiglin, Ariel; Moult, John; Lee, Byungkook; Ofran, Yanay; Unger, Ron

    2012-01-01

    The yeast protein-protein interaction network has been shown to have distinct topological features such as a scale free degree distribution and a high level of clustering. Here we analyze an additional feature which is called Neighbor Overlap. This feature reflects the number of shared neighbors between a pair of proteins. We show that Neighbor Overlap is enriched in the yeast protein-protein interaction network compared with control networks carefully designed to match the characteristics of the yeast network in terms of degree distribution and clustering coefficient. Our analysis also reveals that pairs of proteins with high Neighbor Overlap have higher sequence similarity, more similar GO annotations and stronger genetic interactions than pairs with low ones. Finally, we demonstrate that pairs of proteins with redundant functions tend to have high Neighbor Overlap. We suggest that a combination of three mechanisms is the basis for this feature: The abundance of protein complexes, selection for backup of function, and the need to allow functional variation. PMID:22761860

  4. Enrichment Analysis - A Technique for Encouraging Better Planning and Better Use of Resources.

    ERIC Educational Resources Information Center

    Andrew, Loyd D.

    The University of Utah in building a planning, programming, and budgeting system has developed an analytical measurement called enrichment analysis that has proved useful in focusing faculty and administration attention during budget setting on long-range planning, objectives and outputs. Enrichment analysis shows not only the rate of increase in…

  5. Equivalent damage validation by variable cluster analysis

    NASA Astrophysics Data System (ADS)

    Drago, Carlo; Ferlito, Rachele; Zucconi, Maria

    2016-06-01

    The main aim of this work is to perform a clustering analysis of the damage recorded in the old center of L'Aquila after the earthquake of April 6, 2009, and to validate an Indicator of Equivalent Damage (ED) that summarizes the information reported on the AeDES card regarding the level of damage and its extent over the surface of the buildings. In particular, we used a sample of 13442 masonry buildings located in an area characterized by a Macroseismic Intensity equal to 8 [1]. The aim is to verify the coherence between the clusters, and their hierarchy, identified in the observed damage data and those identified in the computed ED data.

  6. Chaotic map clustering algorithm for EEG analysis

    NASA Astrophysics Data System (ADS)

    Bellotti, R.; De Carlo, F.; Stramaglia, S.

    2004-03-01

    The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with that obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, chaotic map clustering provides natural evidence of the pathological class without any training or supervision, thus providing a new, efficient methodology for the recognition of patterns affected by Huntington's disease.

  7. Tweets clustering using latent semantic analysis

    NASA Astrophysics Data System (ADS)

    Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul

    2017-04-01

    Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter allows users to broadcast a short message called a 'tweet'. In this study, we extract tweets related to MH370 over a certain period of time. We present an overview of our approach to tweet clustering to analyze users' responses to the MH370 tragedy. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, two types of tweets responding to the MH370 tragedy were found: emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.

  8. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    PubMed

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and the BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App Store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  9. Constructing storyboards based on hierarchical clustering analysis

    NASA Astrophysics Data System (ADS)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of the extracted feature vectors is the key to avoiding repeated, computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.

  10. The isotopic composition of enriched Si: a data analysis

    NASA Astrophysics Data System (ADS)

    Bulska, E.; Drozdov, M. N.; Mana, G.; Pramann, A.; Rienitz, O.; Sennikov, P.; Valkiers, S.

    2011-04-01

    To determine the Avogadro constant by counting the atoms in quasi-perfect spheres made of a silicon crystal highly enriched with 28Si, the isotopic composition of the crystal was measured in different laboratories by different measurement methods. This paper examines the consistency of the measurement results.

  11. Low-resolution Spectroscopy for the Globular Clusters with Signs of Supernova Enrichment: M22, NGC 1851, and NGC 288

    NASA Astrophysics Data System (ADS)

    Lim, Dongwook; Han, Sang-Il; Lee, Young-Wook; Roh, Dong-Goo; Sohn, Young-Jong; Chun, Sang-Hyun; Lee, Jae-Woo; Johnson, Christian I.

    2015-01-01

    There is increasing evidence for the presence of multiple red giant branches (RGBs) in the color-magnitude diagrams of massive globular clusters (GCs). In order to investigate the origin of this split on the RGB, we have performed new narrow-band Ca photometry and low-resolution spectroscopy for M22, NGC 1851, and NGC 288. We find significant differences (more than 4σ) in calcium abundance from the spectroscopic HK' index for M22 and NGC 1851. We also find more than 8σ differences in CN-band strength between the Ca-strong and Ca-weak subpopulations for these GCs. For NGC 288, however, a large difference is detected only in the CN strength. The calcium abundances of RGB stars in this GC are identical to within the errors. This is consistent with the conclusion from our new Ca photometry where the RGB splits are confirmed in M22 and NGC 1851, but not in NGC 288. We also find interesting differences in the CN-CH correlations among these GCs. While CN and CH are anti-correlated in NGC 288, they show a positive correlation in M22. NGC 1851, however, shows no difference in CH between the two groups of stars with different CN strengths. We suggest that all of these systematic differences would be best explained by how strongly Type II supernovae enrichment has contributed to the chemical evolution of these GCs.

  12. LOW-RESOLUTION SPECTROSCOPY FOR THE GLOBULAR CLUSTERS WITH SIGNS OF SUPERNOVA ENRICHMENT: M22, NGC 1851, AND NGC 288

    SciTech Connect

    Lim, Dongwook; Han, Sang-Il; Lee, Young-Wook; Roh, Dong-Goo; Sohn, Young-Jong; Chun, Sang-Hyun; Lee, Jae-Woo; Johnson, Christian I.

    2015-01-01

    There is increasing evidence for the presence of multiple red giant branches (RGBs) in the color-magnitude diagrams of massive globular clusters (GCs). In order to investigate the origin of this split on the RGB, we have performed new narrow-band Ca photometry and low-resolution spectroscopy for M22, NGC 1851, and NGC 288. We find significant differences (more than 4σ) in calcium abundance from the spectroscopic HK' index for M22 and NGC 1851. We also find more than 8σ differences in CN-band strength between the Ca-strong and Ca-weak subpopulations for these GCs. For NGC 288, however, a large difference is detected only in the CN strength. The calcium abundances of RGB stars in this GC are identical to within the errors. This is consistent with the conclusion from our new Ca photometry where the RGB splits are confirmed in M22 and NGC 1851, but not in NGC 288. We also find interesting differences in the CN-CH correlations among these GCs. While CN and CH are anti-correlated in NGC 288, they show a positive correlation in M22. NGC 1851, however, shows no difference in CH between the two groups of stars with different CN strengths. We suggest that all of these systematic differences would be best explained by how strongly Type II supernovae enrichment has contributed to the chemical evolution of these GCs.

  13. Genome editing using FACS enrichment of nuclease-expressing cells and indel detection by amplicon analysis.

    PubMed

    Lonowski, Lindsey A; Narimatsu, Yoshiki; Riaz, Anjum; Delay, Catherine E; Yang, Zhang; Niola, Francesco; Duda, Katarzyna; Ober, Elke A; Clausen, Henrik; Wandall, Hans H; Hansen, Steen H; Bennett, Eric P; Frödin, Morten

    2017-03-01

    This protocol describes methods for increasing and evaluating the efficiency of genome editing based on the CRISPR-Cas9 (clustered regularly interspaced short palindromic repeats-CRISPR-associated 9) system, transcription activator-like effector nucleases (TALENs) or zinc-finger nucleases (ZFNs). First, Indel Detection by Amplicon Analysis (IDAA) determines the size and frequency of insertions and deletions elicited by nucleases in cells, tissues or embryos through analysis of fluorophore-labeled PCR amplicons covering the nuclease target site by capillary electrophoresis in a sequenator. Second, FACS enrichment of cells expressing nucleases linked to fluorescent proteins can be used to maximize knockout or knock-in editing efficiencies or to balance editing efficiency and toxic/off-target effects. The two methods can be combined to form a pipeline for cell-line editing that facilitates the testing of new nuclease reagents and the generation of edited cell pools or clonal cell lines, reducing the number of clones that need to be generated and increasing the ease with which they are screened. The pipeline shortens the time line, but it most prominently reduces the workload of cell-line editing, which may be completed within 4 weeks.

  14. Cluster analysis applied to multiparameter geophysical dataset

    NASA Astrophysics Data System (ADS)

    Di Giuseppe, M. G.; Troiano, A.; Troise, C.; De Natale, G.

    2012-04-01

    Multi-parameter acquisition is common geophysical field practice nowadays. Seismic velocity and attenuation, gravity and electromagnetic datasets are regularly acquired in a given area to obtain a complete characterization of some investigated feature of the subsoil. Such a richness of information is often underexploited, mostly because the handling of distinct parameters and their joint inversion still presents several severe problems, although an integrated analysis could notably improve the imaging of the investigated structures. Post-inversion statistical techniques represent a promising approach to these questions, providing a quick, simple and elegant way to obtain this advantageous but complex integration. We present an approach based on the partition of the analyzed multi-parameter dataset into a number of different classes, identified as localized regions of high correlation. These classes, or 'clusters', are structured in such a way that the observations pertaining to a certain group are more similar to each other than to the observations belonging to a different one, according to an optimal logical criterion. Regions of the subsoil sharing the same physical characteristics are thus identified, without any a priori or empirical relationship linking the distinct measured parameters. The retrieved imaging is highly reliable in a statistical sense, specifically because of this lack of external hypotheses, which are instead indispensable in a full joint inversion, where they act, as a matter of fact, as a real constraint on the inversion process, not seldom of limited consistency. We apply our procedure to a number of experimental datasets, related to several structures at very different scales present in the Campanian district (southern Italy). 
These structures range from the shallow evidence of the active fault zone that originated the M 7.9 Irpinia earthquake to the main features characterizing the Campi Flegrei Caldera and the Mt

  15. Enrichment/isolation of phosphorylated peptides on hafnium oxide prior to mass spectrometric analysis.

    PubMed

    Rivera, José G; Choi, Yong Seok; Vujcic, Stefan; Wood, Troy D; Colón, Luis A

    2009-01-01

    Hafnium oxide (hafnia) exhibits unique enrichment properties towards phosphorylated peptides that are complementary to those of titanium oxide (titania) and zirconium oxide (zirconia) for use with mass spectrometric analysis in the field of proteomics.

  16. WebGestalt 2017: a more comprehensive, powerful, flexible and interactive gene set enrichment analysis toolkit

    PubMed Central

    Wang, Jing; Vasaikar, Suhas; Shi, Zhiao; Greer, Michael

    2017-01-01

    Functional enrichment analysis has played a key role in the biological interpretation of high-throughput omics data. As a long-standing and widely used web application for functional enrichment analysis, WebGestalt has been constantly updated to satisfy the needs of biologists from different research areas. WebGestalt 2017 supports 12 organisms, 324 gene identifiers from various databases and technology platforms, and 150 937 functional categories from public databases and computational analyses. Omics data with gene identifiers not supported by WebGestalt and functional categories not included in the WebGestalt database can also be uploaded for enrichment analysis. In addition to the Over-Representation Analysis in the previous versions, Gene Set Enrichment Analysis and Network Topology-based Analysis have been added to WebGestalt 2017, providing complementary approaches to the interpretation of high-throughput omics data. The new user-friendly output interface and the GOView tool allow interactive and efficient exploration and comparison of enrichment results. Thus, WebGestalt 2017 enables more comprehensive, powerful, flexible and interactive functional enrichment analysis. It is freely available at http://www.webgestalt.org. PMID:28472511

  17. Solid-phase enrichment and analysis of electrophilic natural products

    PubMed Central

    Wesche, Frank; He, Yue

    2017-01-01

    In the search for new natural products, which may lead to the development of new drugs for all kinds of applications, novel methods are needed. Here we describe the identification of electrophilic natural products in crude extracts via their reactivity against azide as a nucleophile followed by their subsequent enrichment using a cleavable azide-reactive resin (CARR). Using this approach, natural products carrying epoxides and α,β-unsaturated enones as well as several unknown compounds were identified in crude extracts from entomopathogenic Photorhabdus bacteria. PMID:28382178

  18. Cluster analysis of word frequency dynamics

    NASA Astrophysics Data System (ADS)

    Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.

    2015-01-01

    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  19. Star formation in the first galaxies - III. Formation, evolution, and characteristics of the first metal-enriched stellar cluster

    NASA Astrophysics Data System (ADS)

    Safranek-Shrader, Chalence; Montgomery, Michael H.; Milosavljević, Miloš; Bromm, Volker

    2016-01-01

    We simulate the formation of a low-metallicity (10^-2 Z⊙) stellar cluster at redshift z ~ 14. Beginning with cosmological initial conditions, the simulation utilizes adaptive mesh refinement and sink particles to follow the collapse and evolution of gas past the opacity limit for fragmentation, thus resolving the formation of individual protostellar cores. A time- and location-dependent protostellar radiation field, which heats the gas by absorption on dust, is computed by integration of protostellar evolutionary tracks. The simulation also includes a robust non-equilibrium chemical network that self-consistently treats gas thermodynamics and dust-gas coupling. The system is evolved for 18 kyr after the first protostellar source has formed. In this time span, 30 sink particles representing protostellar cores form with a total mass of 81 M⊙. Their masses range from ~0.1 to 14.4 M⊙ with a median mass ~0.5-1 M⊙. Massive protostars grow by competitive accretion while lower mass protostars are stunted in growth by close encounters and many-body ejections. In the regime explored here, the characteristic mass scale is determined by the cosmic microwave background temperature floor and the onset of efficient dust-gas coupling. It seems unlikely that host galaxies of the first bursts of metal-enriched star formation will be detectable with the James Webb Space Telescope or other next-generation infrared observatories. Instead, the most promising access route to the dawn of cosmic star formation may lie in the scrutiny of metal-poor, ancient stellar populations in the Galactic neighbourhood. The observable targets corresponding to the system simulated here are ultra-faint dwarf satellite galaxies such as Boötes II and Willman I.

  20. Phytotoxicity and Plant Productivity Analysis of Tar-Enriched Biochars

    NASA Astrophysics Data System (ADS)

    Keller, M. L.; Masiello, C. A.; Dugan, B.; Rudgers, J. A.; Capareda, S. C.

    2008-12-01

    Biochar is one of the three by-products obtained by the pyrolysis of organic material, the other two being syngas and bio-oil. The pyrolysis of biomass has generated a great amount of interest in recent years as all three by-products can be put toward beneficial uses. As part of a larger project designed to evaluate the hydrologic impact of biochar soil amendment, we generated a biochar through fast pyrolysis (less than 2 minutes) of sorghum stock at 600°C. In the initial biochar production run, the char bin was not purged with nitrogen. This inadvertent change in pyrolysis conditions produced a fast-pyrolysis biochar enriched with tars. We chose not to discard this batch, however, and instead used it to test the impact of tar-enriched biochars on plants. A suite of phytotoxicity tests was run to assess the effects of tar-rich biochar on plant germination and plant productivity. We designed the experiment to test for negative effects, using an organic carbon and nutrient-rich, greenhouse-optimized potting medium instead of soil. We used Black Seeded Simpson lettuce (Lactuca sativa) as the test organism. We found that even when tars are present within biochar, biochar amendment up to 10% by weight caused increased lettuce germination rates and increased biomass productivity. In this presentation, we will report the statistical significance of our germination and biomass data, as well as present preliminary data on how biochar amendment affects soil hydrologic properties.

  1. Preparation of Mitochondrial Enriched Fractions for Metabolic Analysis in Drosophila

    PubMed Central

    Villa-Cuesta, Eugenia; Rand, David M.

    2015-01-01

    Since mitochondria play roles in amino acid metabolism, carbohydrate metabolism and fatty acid oxidation, defects in mitochondrial function often compromise the lives of those who suffer from these complex diseases. Detecting mitochondrial metabolic changes is vital to the understanding of mitochondrial disorders and mitochondrial responses to pharmacological agents. Although mitochondrial metabolism is at the core of metabolic regulation, the detection of subtle changes in mitochondrial metabolism may be hindered by the overrepresentation of other cytosolic metabolites obtained using whole organism or whole tissue extractions. Here we describe an isolation method that detected pronounced mitochondrial metabolic changes in Drosophila that were distinct between whole-fly and mitochondrial enriched preparations. To illustrate the sensitivity of this method, we used a set of Drosophila harboring genetically diverse mitochondrial DNAs (mtDNA) and exposed them to the drug rapamycin. Using this method we showed that rapamycin modifies mitochondrial metabolism in a mitochondrial-genotype-dependent manner. However, these changes are much more distinct in metabolomics studies when metabolites were extracted from mitochondrial enriched fractions. In contrast, whole tissue extracts only detected metabolic changes mediated by the drug rapamycin independently of mtDNAs. PMID:26485391

  2. Analysis of High Enriched Uranyl Nitrate Solution Containing Cadmium

    SciTech Connect

    S. S. Kim

    2006-09-01

    A benchmark evaluation has been performed for a set of twenty-one critical experiments involving high enriched uranyl nitrate solution with and without cadmium nitrate as a soluble neutron absorber. The critical experiments analyzed include two types of cylindrical vessels with diameters of 24.18 and 29.16 cm. The vessels were reflected with water and in some cases with water containing dissolved cadmium nitrate. The uranium concentration ranged from 482 to 529 g/l, and cadmium concentration in the uranyl nitrate solution ranged from 0.0 to 11.31 g/l. The cadmium concentration in the reflector solution ranged from 0.0 to 15.16 g/l. Using MCNP and KENO-V.a, complete three-dimensional models were created for the two vessels filled with the uranyl nitrate solution and reflector solution. A series of criticality calculations were performed with KENO-V.a, MCNP4b, and MCNP5. In general, good agreement between KENO-V.a and MCNP4b was observed. However, MCNP5 results show consistently lower values compared with MCNP4b results, with a maximum difference of 1.2%. This ICSBEP-supported evaluation provides valuable data for the effect of soluble neutron absorber (cadmium nitrate) on the criticality safety of high-enriched uranyl nitrate solution. These data can also be used in determining critical controls and for validation of the calculation methods.

  3. The limitations of simple gene set enrichment analysis assuming gene independence.

    PubMed

    Tamayo, Pablo; Steinhardt, George; Liberzon, Arthur; Mesirov, Jill P

    2016-02-01

    Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. (2009) as a serious contender. The argument criticizes Gene Set Enrichment Analysis's nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with the results of Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored due to the significant variance inflation they produced on the enrichment scores and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods. © The Author(s) 2012.
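
    The variance inflation mentioned above can be illustrated with a small calculation. The sketch below assumes a simplified model that is not taken from the paper: every gene score has the same variance and every pair of genes shares the same correlation. Under that assumption the variance of a gene set's mean score grows by the factor 1 + (n - 1) * rho relative to the independent case. The function names and numbers are hypothetical.

```python
def variance_of_mean(n_genes: int, sigma2: float, rho: float) -> float:
    """Variance of the mean of n_genes scores under an equicorrelation model.

    Sums every entry of the (assumed) covariance matrix, which has sigma2 on
    the diagonal and rho * sigma2 elsewhere, then divides by n_genes squared.
    """
    total = 0.0
    for i in range(n_genes):
        for j in range(n_genes):
            total += sigma2 if i == j else rho * sigma2
    return total / n_genes ** 2


def inflation_factor(n_genes: int, rho: float) -> float:
    """Closed form of the same ratio: correlated variance / independent variance."""
    return 1.0 + (n_genes - 1) * rho


# Hypothetical gene set of 50 genes with a modest average correlation of 0.1.
independent = variance_of_mean(50, 1.0, 0.0)
correlated = variance_of_mean(50, 1.0, 0.1)
ratio = correlated / independent
```

    Even a small average correlation inflates the variance several-fold for a set of this size, which is why a null distribution that assumes independence can overstate significance.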

  4. Failure Mode Identification Through Clustering Analysis

    NASA Technical Reports Server (NTRS)

    Arunajadai, Srikesh G.; Stone, Robert B.; Tumer, Irem Y.; Clancy, Daniel (Technical Monitor)

    2002-01-01

    Research has shown that nearly 80% of the costs and problems are created in product development and that cost and quality are essentially designed into products in the conceptual stage. Currently, failure identification procedures (such as FMEA (Failure Modes and Effects Analysis), FMECA (Failure Modes, Effects and Criticality Analysis) and FTA (Fault Tree Analysis)) and design of experiments are being used for quality control and for the detection of potential failure modes during the detail design stage or post-product launch. Though all of these methods have their own advantages, they do not give information as to what are the predominant failures that a designer should focus on while designing a product. This work uses a functional approach to identify failure modes, which hypothesizes that similarities exist between different failure modes based on the functionality of the product/component. In this paper, a statistical clustering procedure is proposed to retrieve information on the set of predominant failures that a function experiences. The various stages of the methodology are illustrated using a hypothetical design example.

  5. Cluster-based exposure variation analysis.

    PubMed

    Samani, Afshin; Mathiassen, Svend Erik; Madeleine, Pascal

    2013-04-04

    Static posture, repetitive movements and lack of physical variation are known risk factors for work-related musculoskeletal disorders, and thus need to be properly assessed in occupational studies. The aims of this study were (i) to investigate the effectiveness of a conventional exposure variation analysis (EVA) in discriminating exposure time lines and (ii) to compare it with a new cluster-based method for analysis of exposure variation. For this purpose, we simulated a repeated cyclic exposure varying within each cycle between "low" and "high" exposure levels in a "near" or "far" range, and with "low" or "high" velocities (exposure change rates). The duration of each cycle was also manipulated by selecting a "small" or "large" standard deviation of the cycle time. These parameters reflected three dimensions of exposure variation, i.e. range, frequency and temporal similarity. Each simulation trace included two realizations of 100 concatenated cycles with either low (ρ = 0.1), medium (ρ = 0.5) or high (ρ = 0.9) correlation between the realizations. These traces were analyzed by conventional EVA, and a novel cluster-based EVA (C-EVA). Principal component analysis (PCA) was applied on the marginal distributions of 1) the EVA of each of the realizations (univariate approach), 2) a combination of the EVA of both realizations (multivariate approach) and 3) C-EVA. The least number of principal components describing more than 90% of variability in each case was selected and the projection of marginal distributions along the selected principal component was calculated. A linear classifier was then applied to these projections to discriminate between the simulated exposure patterns, and the accuracy of classified realizations was determined. C-EVA classified exposures more correctly than univariate and multivariate EVA approaches; classification accuracy was 49%, 47% and 52% for EVA (univariate and multivariate), and C-EVA, respectively (p < 0.001). All three methods

  6. A uniform contribution of core-collapse and type Ia supernovae to the chemical enrichment pattern in the outskirts of the Virgo Cluster

    DOE PAGES

    Simionescu, A.; Werner, N.; Urban, O.; ...

    2015-09-24

    We present the first measurements of the abundances of α-elements (Mg, Si, and S) extending out beyond the virial radius of a cluster of galaxies. Our results, based on Suzaku Key Project observations of the Virgo Cluster, show that the chemical composition of the intracluster medium is consistent with being constant on large scales, with a flat distribution of the Si/Fe, S/Fe, and Mg/Fe ratios as a function of radius and azimuth out to 1.4 Mpc (1.3 r200). Chemical enrichment of the intergalactic medium due solely to core-collapse supernovae (SNcc) is excluded with very high significance; instead, the measured metal abundance ratios are generally consistent with the solar value. The uniform metal abundance ratios observed today are likely the result of an early phase of enrichment and mixing, with both SNcc and SNe Ia contributing to the metal budget during the period of peak star formation activity at redshifts of 2–3. Furthermore, we estimate the ratio between the number of SNe Ia and the total number of supernovae enriching the intergalactic medium to be between 12% and 37%, broadly consistent with the metal abundance patterns in our own Galaxy or with the SN Ia contribution estimated for the cluster cores.

  7. A uniform contribution of core-collapse and type Ia supernovae to the chemical enrichment pattern in the outskirts of the Virgo Cluster

    SciTech Connect

    Simionescu, A.; Werner, N.; Urban, O.; Allen, S. W.; Ichinohe, Y.; Zhuravleva, I.

    2015-09-24

    We present the first measurements of the abundances of α-elements (Mg, Si, and S) extending out beyond the virial radius of a cluster of galaxies. Our results, based on Suzaku Key Project observations of the Virgo Cluster, show that the chemical composition of the intracluster medium is consistent with being constant on large scales, with a flat distribution of the Si/Fe, S/Fe, and Mg/Fe ratios as a function of radius and azimuth out to 1.4 Mpc (1.3 r200). Chemical enrichment of the intergalactic medium due solely to core-collapse supernovae (SNcc) is excluded with very high significance; instead, the measured metal abundance ratios are generally consistent with the solar value. The uniform metal abundance ratios observed today are likely the result of an early phase of enrichment and mixing, with both SNcc and SNe Ia contributing to the metal budget during the period of peak star formation activity at redshifts of 2–3. Furthermore, we estimate the ratio between the number of SNe Ia and the total number of supernovae enriching the intergalactic medium to be between 12% and 37%, broadly consistent with the metal abundance patterns in our own Galaxy or with the SN Ia contribution estimated for the cluster cores.

  8. A UNIFORM CONTRIBUTION OF CORE-COLLAPSE AND TYPE Ia SUPERNOVAE TO THE CHEMICAL ENRICHMENT PATTERN IN THE OUTSKIRTS OF THE VIRGO CLUSTER

    SciTech Connect

    Simionescu, A.; Ichinohe, Y.; Werner, N.; Urban, O.; Allen, S. W.; Zhuravleva, I.

    2015-10-01

    We present the first measurements of the abundances of α-elements (Mg, Si, and S) extending out beyond the virial radius of a cluster of galaxies. Our results, based on Suzaku Key Project observations of the Virgo Cluster, show that the chemical composition of the intracluster medium is consistent with being constant on large scales, with a flat distribution of the Si/Fe, S/Fe, and Mg/Fe ratios as a function of radius and azimuth out to 1.4 Mpc (1.3 r200). Chemical enrichment of the intergalactic medium due solely to core-collapse supernovae (SNcc) is excluded with very high significance; instead, the measured metal abundance ratios are generally consistent with the solar value. The uniform metal abundance ratios observed today are likely the result of an early phase of enrichment and mixing, with both SNcc and SNe Ia contributing to the metal budget during the period of peak star formation activity at redshifts of 2–3. We estimate the ratio between the number of SNe Ia and the total number of supernovae enriching the intergalactic medium to be between 12% and 37%, broadly consistent with the metal abundance patterns in our own Galaxy or with the SN Ia contribution estimated for the cluster cores.

  9. Somatotyping using 3D anthropometry: a cluster analysis.

    PubMed

    Olds, Tim; Daniell, Nathan; Petkov, John; David Stewart, Arthur

    2013-01-01

    Somatotyping is the quantification of human body shape, independent of body size. Hitherto, somatotyping (including the most popular method, the Heath-Carter system) has been based on subjective visual ratings, sometimes supported by surface anthropometry. This study used data derived from three-dimensional (3D) whole-body scans as inputs for cluster analysis to objectively derive clusters of similar body shapes. Twenty-nine dimensions normalised for body size were measured on a purposive sample of 301 adults aged 17-56 years who had been scanned using a Vitus Smart laser scanner. K-means Cluster Analysis with v-fold cross-validation was used to determine shape clusters. Three male and three female clusters emerged, and were visualised using those scans closest to the cluster centroid and a caricature defined by doubling the difference between the average scan and the cluster centroid. The male clusters were decidedly endomorphic (high fatness), ectomorphic (high linearity), and endo-mesomorphic (a mixture of fatness and muscularity). The female clusters were clearly endomorphic, ectomorphic, and ecto-mesomorphic (a mixture of linearity and muscularity). An objective shape quantification procedure combining 3D scanning and cluster analysis yielded shape clusters strikingly similar to traditional somatotyping.

  10. An analysis of hospital brand mark clusters.

    PubMed

    Vollmers, Stacy M; Miller, Darryl W; Kilic, Ozcan

    2010-07-01

    This study analyzed brand mark clusters (i.e., various types of brand marks displayed in combination) used by hospitals in the United States. The brand marks were assessed against several normative criteria for creating brand marks that are memorable and that elicit positive affect. Overall, results show a reasonably high level of adherence to many of these normative criteria. Many of the clusters exhibited pictorial elements that reflected benefits and that were conceptually consistent with the verbal content of the cluster. Also, many clusters featured icons that were balanced and moderately complex. However, only a few contained interactive imagery or taglines communicating benefits.

  11. A hybrid monkey search algorithm for clustering analysis.

    PubMed

    Chen, Xin; Zhou, Yongquan; Luo, Qifang

    2014-01-01

    Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis. Experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis.
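
    The dependence of k-means on its starting point can be seen in a very small example. The sketch below is a plain Lloyd-style k-means written for illustration only; it is not the hybrid monkey algorithm of the paper, and the four data points and both sets of starting centroids are invented.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Run Lloyd's iterations from the given starting centroids.

    Returns the final centroids and the within-cluster sum of squared errors.
    """
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Four corners of a wide rectangle: the natural split is left pair vs right pair.
data = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, sse_good = kmeans(data, [(0, 0.5), (4, 0.5)])  # starts near the natural split
_, sse_poor = kmeans(data, [(2, 0), (2, 1)])      # starts on the wrong axis
```

    Both runs converge immediately, but the second one settles on a top/bottom split with a much larger error, which is the kind of local optimum that restarts or a global search heuristic try to avoid.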

  12. Instantaneous normal mode analysis of melting of finite dust clusters.

    PubMed

    Melzer, André; Schella, André; Schablinski, Jan; Block, Dietmar; Piel, Alexander

    2012-06-01

    The experimental melting transition of finite two-dimensional dust clusters in a dusty plasma is analyzed using the method of instantaneous normal modes. In the experiment, dust clusters are heated in a thermodynamic equilibrium from a solid to a liquid state using a four-axis laser manipulation system. The fluid properties of the dust cluster, such as the diffusion constant, are measured from the instantaneous normal mode analysis. Thereby, the phase transition of these finite clusters is approached from the liquid phase. From the diffusion constants, unique melting temperatures have been assigned to dust clusters of various sizes that very well reflect their dynamical stability properties.

  13. Redefining the Breast Cancer Exosome Proteome by Tandem Mass Tag Quantitative Proteomics and Multivariate Cluster Analysis.

    PubMed

    Clark, David J; Fondrie, William E; Liao, Zhongping; Hanson, Phyllis I; Fulton, Amy; Mao, Li; Yang, Austin J

    2015-10-20

    Exosomes are microvesicles of endocytic origin constitutively released by multiple cell types into the extracellular environment. With evidence that exosomes can be detected in the blood of patients with various malignancies, the development of a platform that uses exosomes as a diagnostic tool has been proposed. However, it has been difficult to truly define the exosome proteome due to the challenge of discerning contaminant proteins that may be identified via mass spectrometry using various exosome enrichment strategies. To better define the exosome proteome in breast cancer, we incorporated a combination of Tandem-Mass-Tag (TMT) quantitative proteomics approach and Support Vector Machine (SVM) cluster analysis of three conditioned media derived fractions corresponding to a 10 000g cellular debris pellet, a 100 000g crude exosome pellet, and an Optiprep enriched exosome pellet. The quantitative analysis identified 2 179 proteins in all three fractions, with known exosomal cargo proteins displaying at least a 2-fold enrichment in the exosome fraction based on the TMT protein ratios. Employing SVM cluster analysis allowed for the classification of 251 proteins as "true" exosomal cargo proteins. This study provides a robust and vigorous framework for the future development of using exosomes as a potential multiprotein marker phenotyping tool that could be useful in breast cancer diagnosis and monitoring disease progression.

  14. Cluster Analysis to Identify Possible Subgroups in Tinnitus Patients.

    PubMed

    van den Berge, Minke J C; Free, Rolien H; Arnold, Rosemarie; de Kleine, Emile; Hofman, Rutger; van Dijk, J Marc C; van Dijk, Pim

    2017-01-01

    In tinnitus treatment, there is a tendency to shift from a "one size fits all" to a more individual, patient-tailored approach. Insight into the heterogeneity of the tinnitus spectrum might improve the management of tinnitus patients in terms of choice of treatment and identification of patients with severe mental distress. The goal of this study was to identify subgroups in a large group of tinnitus patients. Data were collected from patients with severe tinnitus complaints visiting our tertiary referral tinnitus care group at the University Medical Center Groningen. Patient-reported and physician-reported variables were collected during their visit to our clinic. Cluster analyses were used to characterize subgroups. For the selection of the right variables to enter in the cluster analysis, two approaches were used: (1) variable reduction with principal component analysis and (2) variable selection based on expert opinion. Various variables of 1,783 tinnitus patients were included in the analyses. Cluster analysis (1) included 976 patients and resulted in a four-cluster solution. The effect of external influences was the most discriminative between the groups, or clusters, of patients. The "silhouette measure" of the cluster outcome was low (0.2), indicating a "no substantial" cluster structure. Cluster analysis (2) included 761 patients and resulted in a three-cluster solution, comparable to the first analysis. Again, a "no substantial" cluster structure was found (0.2). Two cluster analyses on a large database of tinnitus patients revealed that clusters of patients are mostly formed by a different response of external influences on their disease. However, both cluster outcomes based on this dataset showed a poor stability, suggesting that our tinnitus population comprises a continuum rather than a number of clearly defined subgroups.
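
    The "silhouette measure" quoted above summarises how well separated a set of clusters is. The sketch below shows the standard definition on toy data; it assumes Euclidean distance and at least two clusters, the function name `mean_silhouette` is made up, and nothing here reproduces the study's software or data.

```python
from math import dist


def mean_silhouette(points, labels):
    """Average silhouette width over all points.

    For each point, a is the mean distance to the other members of its own
    cluster and b is the smallest mean distance to any other cluster; the
    point's score is (b - a) / max(a, b). Requires at least two clusters.
    """
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        own = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels)) if l == lab and j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == other) / labels.count(other)
            for other in set(labels)
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy one-dimensional data: two tight, well separated pairs.
toy = [(0,), (1,), (10,), (11,)]
well_separated = mean_silhouette(toy, [0, 0, 1, 1])  # labels follow the structure
mixed_up = mean_silhouette(toy, [0, 1, 0, 1])        # labels cut across it
```

    Values near 1 indicate compact, well separated clusters, values near 0 indicate overlapping clusters, and negative values indicate points that sit closer to another cluster than to their own.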

  15. The Psychology of Yoga Practitioners: A Cluster Analysis.

    PubMed

    Genovese, Jeremy E C; Fondran, Kristine M

    2017-03-30

    Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters: Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A have the most years of yoga experience and members of Cluster B have more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.

  16. Early Hemostatic Responses to Trauma Identified Using Hierarchical Clustering Analysis

    PubMed Central

    White, N.J.; Contaifer, D.; Martin, E.J.; Newton, J.C.; Mohammed, B.M.; Bostic, J.L.; Brophy, G.M.; Spiess, B.D.; Pusateri, A.E.; Ward, K.R.; Brophy, D.F.

    2015-01-01

    Background: Trauma-induced coagulopathy is a complex multifactorial hemostatic response that is poorly understood. Objectives: Identify distinct hemostatic responses to trauma and identify key components of the hemostatic system that vary between responses. Patients/Methods: Cross-sectional observational study of adult trauma patients at an urban Level I trauma center Emergency Department. Hierarchical clustering analysis was used to identify distinct clusters of similar subjects using vital signs, injury/shock severity, and by comprehensive assessment of coagulation, clot formation, platelet function, and thrombin generation. Results: Of 84 total trauma patients included in the model, three distinct trauma clusters were identified. Cluster 1 (N=57) displayed platelet activation, preserved peak thrombin generation, plasma coagulation dysfunction, moderately decreased fibrinogen concentration, and normal clot formation relative to healthy controls. Cluster 2 (N=18) displayed platelet activation, preserved peak thrombin generation, and preserved fibrinogen concentration with normal clot formation. Cluster 3 (N=9) was the most severely injured and shocked and displayed a strong inflammatory and bleeding phenotype. Platelet dysfunction, thrombin inhibition, plasma coagulation dysfunction, and decreased fibrinogen concentration were present in this cluster. Fibrinolytic activation was present in all clusters, but increased more so in Cluster 3. Trauma clusters were different most noticeably in their relative fibrinogen concentration, peak thrombin generation, and platelet-induced clot contraction. Conclusions: Hierarchical clustering analysis identified 3 distinct hemostatic responses to trauma. Further insight into the underlying hemostatic mechanisms responsible for these responses is needed. PMID:25816845
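
    For readers unfamiliar with hierarchical clustering, the sketch below shows one common variant, agglomerative clustering with average linkage, on invented two-dimensional points. The abstract does not state which linkage or distance the authors used, so both choices here are assumptions, as are the function name `agglomerate` and the data.

```python
from math import dist


def agglomerate(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Closeness is average linkage: the mean Euclidean distance over all pairs
    of points drawn from the two clusters. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Score every pair of current clusters and pick the closest pair.
        pairs = [
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        ]
        _, x, y = min(pairs)
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so the merged cluster keeps its position
    return [sorted(c) for c in clusters]


# Hypothetical subjects described by two standardized measurements each.
subjects = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
groups = agglomerate(subjects, n_clusters=3)
```

    Cutting the merge sequence at a different number of clusters gives a coarser or finer grouping, so the number of clusters reported by a study of this kind is itself a design choice.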

  17. A Survey of Popular R Packages for Cluster Analysis

    ERIC Educational Resources Information Center

    Flynt, Abby; Dean, Nema

    2016-01-01

    Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring data sets. This article reviews the following popular libraries/commands in the R software language for applying different types of cluster analysis: from the stats library, the kmeans, and hclust functions; the mclust library; the poLCA…

  18. Using Cluster Analysis for Data Mining in Educational Technology Research

    ERIC Educational Resources Information Center

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

    2012-01-01

    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  20. Simultaneous Two-Way Clustering of Multiple Correspondence Analysis

    ERIC Educational Resources Information Center

    Hwang, Heungsun; Dillon, William R.

    2010-01-01

    A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which k-means is…

  1. A Survey of Popular R Packages for Cluster Analysis

    ERIC Educational Resources Information Center

    Flynt, Abby; Dean, Nema

    2016-01-01

    Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring data sets. This article reviews the following popular libraries/commands in the R software language for applying different types of cluster analysis: from the stats library, the kmeans, and hclust functions; the mclust library; the poLCA…

  2. Effective Enrichment and Mass Spectrometry Analysis of Phosphopeptides Using Mesoporous Metal Oxide Nanomaterials

    PubMed Central

    Nelson, Cory A.; Szczech, Jeannine R.; Dooley, Chad J.; Xu, Qingge; Lawrence, Matthew J.; Zhu, Haoyue; Jin, Song; Ge, Ying

    2010-01-01

    Mass spectrometry (MS)-based phosphoproteomics remains challenging due to the low abundance of phosphoproteins and substoichiometric phosphorylation. This demands better methods to effectively enrich phosphoproteins/peptides prior to MS analysis. We have previously communicated the first use of mesoporous zirconium oxide (ZrO2) nanomaterials for effective phosphopeptide enrichment. Here we present the full report including the synthesis, characterization, and application of mesoporous titanium dioxide (TiO2), ZrO2, and hafnium oxide (HfO2) in phosphopeptide enrichment and MS analysis. Mesoporous ZrO2 and HfO2 are demonstrated to be superior to TiO2 for phosphopeptide enrichment from a complex mixture with high specificity (>99%), which could almost be considered as “a purification”, mainly because of the extremely large active surface area of mesoporous nanomaterials. A single enrichment and Fourier transform MS analysis of phosphopeptides digested from a complex mixture containing 7% of α-casein identified 21 out of 22 phosphorylation sites for α-casein. Moreover, the mesoporous ZrO2 and HfO2 can be reused after a simple solution regeneration procedure with comparable enrichment performance to that of fresh materials. Mesoporous ZrO2 and HfO2 nanomaterials hold great promise for applications in MS-based phosphoproteomics. PMID:20704311

  3. miSEA: microRNA set enrichment analysis.

    PubMed

    Çorapçıoğlu, M Erdem; Oğul, Hasan

    2015-08-01

    We introduce a novel web-based tool, miSEA, for evaluating the enrichment of relevant microRNA sets from microarray and miRNA-Seq experiments on paired samples, e.g. control vs. In addition to a group of previously annotated microRNA sets embedded in the system, this tool enables users to import new microRNA sets obtained from their own research. miSEA allows users to select from a large variety of microRNA grouping categories, such as family classification, disease association, common regulation, and genome coordinates, based on their requirements. miSEA therefore provides a knowledge-driven representation scheme for microRNA experiments. The usability of this platform was discerned with a cancer type-classification task performed on a set of real microRNA expression profiling experiments. The miSEA web server is available at http://www.baskent.edu.tr/~hogul/misea. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  4. Spectral Analysis of Cluster Induced Turbulence

    NASA Astrophysics Data System (ADS)

    Patel, Ravi; Ireland, Peter; Capecelatro, Jesse; Fox, Rodney; Desjardins, Olivier

    2015-11-01

    Particle-laden turbulent flows are an important feature of many industrial processes such as fluidized bed reactors. The study of cluster-induced turbulence (CIT), wherein particles falling under gravity generate turbulence in the carrier gas via fluctuations in particle concentration, may lead to better models for these processes. We present a spectral analysis of a database of statistically stationary CIT simulations. These simulations were previously performed using a two-way coupled Eulerian-Lagrangian approach for various mass loadings and particle-scale Reynolds numbers. The Lagrangian particle data is carefully filtered to obtain Eulerian fields for particle phase volume fraction, velocity, and granular temperature. We perform a spectral decomposition of the particle and fluid turbulent kinetic energy budget. We investigate the contributions to the particle and fluid turbulent kinetic energy by pressure strain, viscous dissipation, drag exchange, viscous exchange, and pressure exchange over the range of wavenumbers. Results from this study may help develop closure models for large eddy simulation of particle-laden turbulent flows.

  5. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool.

    PubMed

    Chen, Edward Y; Tan, Christopher M; Kou, Yan; Duan, Qiaonan; Wang, Zichen; Meirelles, Gabriela Vaz; Clark, Neil R; Ma'ayan, Avi

    2013-04-15

    System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement. Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios. Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr.

  6. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool

    PubMed Central

    2013-01-01

    Background: System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement. Results: Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios. Conclusions: Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr. PMID:23586463

  7. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale

    PubMed Central

    Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Overview: Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. Cluster Quality Metrics: We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Network Clustering Algorithms: Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large
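    To make two of the stand-alone metrics named in this abstract concrete, the following is a minimal sketch using common textbook definitions of modularity and conductance for an undirected, unweighted graph. The abstract does not state the exact variants or scaling the authors used, so the function names, the toy graph, and the formulas here are illustrative assumptions rather than the paper's implementation.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition (textbook form, assumed here).

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    community: dict mapping each node to a community label.
    """
    m = len(edges)
    internal = defaultdict(int)      # edges with both endpoints in one community
    total_degree = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        total_degree[community[u]] += 1
        total_degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)

def conductance(edges, community, label):
    """Cut edges divided by the smaller volume (inside vs. outside the community).

    Assumes both sides have at least one edge endpoint, so the minimum is > 0.
    """
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for node in (u, v):
            if community[node] == label:
                vol_in += 1
            else:
                vol_out += 1
        if (community[u] == label) != (community[v] == label):
            cut += 1
    return cut / min(vol_in, vol_out)

# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, community))        # modularity of the two-triangle split
print(conductance(edges, community, "a"))  # conductance of community "a"
```

    In this form, lower conductance indicates a better-separated community, whereas higher modularity indicates a better partition; a study that reports both on a common "out of 1" scale would need to rescale one of them.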

  8. Cluster analysis of the hot subdwarfs in the PG survey

    NASA Technical Reports Server (NTRS)

    Thejll, Peter; Charache, Darryl; Shipman, Harry L.

    1989-01-01

    Application of cluster analysis to the hot subdwarfs in the Palomar Green (PG) survey of faint blue high-Galactic-latitude objects is assessed, with emphasis on data noise and the number of clusters into which the data should be subdivided. The data used in the study are presented, and cluster analysis, using the CLUSTAN program, is applied to them. Distances are calculated using the Euclidean formula, and clustering is done by Ward's method. The results are discussed, and five groups representing natural divisions of the subdwarfs in the PG survey are presented.
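    The combination described here, Euclidean distances with Ward's method, can be sketched in a few lines. This is a naive illustration and not the CLUSTAN program: the function names and the sample points are hypothetical, and the merge criterion is the standard Ward increase in within-cluster sum of squares.

```python
from itertools import combinations

def centroid(members):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward(points, k):
    """Agglomerate singleton clusters until only k clusters remain."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        # Pick the pair whose merge increases the total variance the least.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for cluster in ward(points, 2):
    print(sorted(cluster))
```

    The loop recomputes every pairwise cost at each step, which is fine for a handful of points but scales poorly; for real data a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would be the usual choice.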

  9. Characterization of population exposure to organochlorines: a cluster analysis application.

    PubMed

    Guimarães, Raphael Mendonça; Asmus, Carmen Ildes Rodrigues Fróes; Burdorf, Alex

    2013-06-01

    This study aimed to show the results of applying cluster analysis to characterize population exposure to organochlorines through variables related to time and exposure dose. Characteristics of 354 subjects in a population exposed to organochlorine pesticide residues, related to time and exposure dose, were subjected to hierarchical cluster analysis to separate the subjects into subgroups. To evaluate classification accuracy, intra-group and inter-group variability were compared by ANOVA for each dimension. Aggregation was performed by Ward's method. Variables associated with exposure and routes of contamination were used to create the clusters. Information on the estimated intake doses of the compounds was used to weight the exposure time for each route, so as to obtain proxy values of exposure intensity. The results showed three clusters: cluster 1 (n = 45), with characteristics of greatest exposure; cluster 2 (n = 103), with intermediate exposure; and cluster 3 (n = 206), with less exposure. Bivariate analyses comparing the groups showed statistically significant differences. This study demonstrated the applicability of cluster analysis to categorizing populations exposed to organochlorines. It also points to the relevance of typological studies, which may contribute to a better classification of subjects exposed to chemical agents, as is typical of environmental epidemiology studies, and to a wider understanding of the etiological, preventive and therapeutic aspects of contamination.

  10. MASSCLEAN: MASSive CLuster Evolution and ANalysis package -- A new tool for stellar clusters

    NASA Astrophysics Data System (ADS)

    Popescu, Bogdan

    2010-11-01

    Stellar clusters are laboratories for stellar evolution. Their stellar content has a uniform age and chemical composition but spans a large mass interval. The majority of stars are born in clusters and end up in the general field population. An accurate characterization of stellar clusters could be used to build better models, from stellar evolution to the evolution of an entire galaxy. Although they are close, many Milky Way clusters are difficult to observe because they are obscured by the dust in the disk of our Galaxy. The clusters in the Local Group and beyond are too distant, so most of the time only their integrated properties can be used. There is one way to analyze the observational data, to search for clusters, and to describe them: simulations. The MASSCLEAN (MASSive CLuster Evolution and ANalysis) package was developed to provide a better characterization of Galactic clusters, to derive selection effects of current surveys, and to provide information about extra-galactic clusters. Simulations of known Galactic clusters are used to get better constraints on their parameters, such as mass, age, extinction, chemical composition and distance. This is the traditional way to describe Galactic clusters: fitting the data using the available models. The difference is that MASSCLEAN simulations provide a consistent set of parameters. The majority of extra-galactic clusters are known only from their integrated properties, that is, integrated magnitudes and colors. The current models for stellar populations are available only in the infinite mass limit. But real clusters have a finite mass, and their integrated colors show a large dispersion (stochastic fluctuations). The description of the variation of integrated colors as a function of mass and age led to the creation of the MASSCLEANcolors database, based on 70 million Monte Carlo simulations. Since the entries in the database form a consistent set of integrated colors, integrated

  11. Investigating Subtypes of Child Development: A Comparison of Cluster Analysis and Latent Class Cluster Analysis in Typology Creation

    ERIC Educational Resources Information Center

    DiStefano, Christine; Kamphaus, R. W.

    2006-01-01

    Two classification methods, latent class cluster analysis and cluster analysis, are used to identify groups of child behavioral adjustment underlying a sample of elementary school children aged 6 to 11 years. Behavioral rating information across 14 subscales was obtained from classroom teachers and used as input for analyses. Both the procedures…

  13. Promoter analysis of intestinal genes induced during iron-deprivation reveals enrichment of conserved SP1-like binding sites

    PubMed Central

    Collins, James F; Hu, Zihua

    2007-01-01

    Background: Iron-deficiency leads to the induction of genes related to intestinal iron absorption and homeostasis. By analyzing a large GeneChip® dataset from the rat intestine, we identified a large cluster of 228 genes that was induced by iron-deprivation. Only 2 of these genes contained 3' iron-response elements, suggesting that other regulation including transcriptional may be involved. We therefore utilized computational methods to test the hypothesis that some of the genes within this large up-regulated cluster are co-ordinately regulated by common transcriptional mechanisms. We thus identified promoters from the up-regulated gene cluster from rat, mouse and human, and performed enrichment analyses with the Clover program and the TRANSFAC database. Results: Surprisingly, we found a strong statistical enrichment for SP1 binding sites in our experimental promoters as compared to background sequences. As the TRANSFAC database cannot distinguish among SP/KLF family members, many of which bind similar GC-rich DNA sequences, we surmise that SP1 or an SP1-like factor could be involved in this response. In fact, we detected induction of SP6/KLF14 in the GeneChip® studies, and confirmed it by real-time PCR. Additional computational analyses suggested that an SP1-like factor may function synergistically with a FOX TF to regulate a subset of these genes. Furthermore, analysis of promoter sequences identified many genes with multiple, conserved SP1 and FOX binding sites, the relative location of which within orthologous promoters was highly conserved. Conclusion: SP1 or a closely related factor may play a primary role in the genetic response to iron-deficiency in the mammalian intestine. PMID:18005439

  14. Cluster analysis of simulated gravitational wave triggers using S-means and constrained validation clustering

    NASA Astrophysics Data System (ADS)

    Tang, Lappoon R.; Lei, Hansheng; Mukherjee, Soma; Mohanty, Soumya

    2008-09-01

    The fifth science run of LIGO (S5) has recently been concluded. The data collected over 2 years of the run call for a thorough analysis of the glitches seen in the gravitational wave channels, as well as in the auxiliary and environmental channels. The study presents two new techniques for cluster analysis of gravitational wave burst triggers. Traditional approaches to clustering treat the problem as an optimization problem in an 'open' search space of clustering models. However, this can lead to problems by producing models that over-fit or under-fit the data as the search is stuck on local minima. The new algorithms tackle local minima by putting constraints in the search process. S-means looks at similarity statistics of burst triggers and builds up clusters that have the advantage of avoiding local minima. Constrained validation clustering tackles the problem by constraining the search in the space of clustering models that are 'non-splittable' models in which the centroids of the left and right child of a cluster (after splitting) are nearest to each other; the region of models that either over-fit or under-fit data (i.e. 'splittable' models) can therefore be effectively avoided when assumptions about data are satisfied. These methods are demonstrated by using simulated data. The results on simulated data are promising and the methods are expected to be useful for LIGO S5 data analysis.

  15. Visual cluster analysis and pattern recognition methods

    DOEpatents

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    2001-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable to, and improve, pattern recognition techniques.

  16. Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.

    PubMed

    Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun

    2017-01-01

    Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.
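    The idea of partial membership in multiple clusters can be illustrated with the membership update of classical fuzzy c-means. This is only a sketch of the general fuzzy clustering concept: it is not the regularized fuzzy k-means the authors adopt, it does not involve MCA, and the function name, the fuzzifier parameter `m`, and the sample values are assumptions made for illustration.

```python
def fuzzy_memberships(point, centroids, m=2.0):
    """Classical fuzzy c-means membership of one point in each cluster.

    point: tuple of coordinates.
    centroids: list of tuples, one per cluster.
    m: fuzzifier (> 1); larger values give softer, more even memberships.
    """
    dists = [sum((x - c) ** 2 for x, c in zip(point, centroid)) ** 0.5
             for centroid in centroids]
    # A point lying exactly on one or more centroids belongs fully to them.
    if any(d == 0 for d in dists):
        on_centroid = [d == 0 for d in dists]
        return [1.0 / sum(on_centroid) if z else 0.0 for z in on_centroid]
    exponent = 2.0 / (m - 1.0)
    # u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)); the memberships sum to 1.
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
            for d_j in dists]

# Hypothetical example: a point twice as far from the second centroid.
print(fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)]))
```

    In hard clustering the same point would simply be assigned to the nearer centroid; here it keeps a smaller but non-zero membership in the farther cluster, which is the assumption the abstract describes relaxing.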

  17. Two-dimensional enrichment analysis for mining high-level imaging genetic associations.

    PubMed

    Yao, Xiaohui; Yan, Jingwen; Kim, Sungeun; Nho, Kwangsik; Risacher, Shannon L; Inlow, Mark; Moore, Jason H; Saykin, Andrew J; Shen, Li

    2017-03-01

    Enrichment analysis has been widely applied in the genome-wide association studies, where gene sets corresponding to biological pathways are examined for significant associations with a phenotype to help increase statistical power and improve biological interpretation. In this work, we expand the scope of enrichment analysis into brain imaging genetics, an emerging field that studies how genetic variation influences brain structure and function measured by neuroimaging quantitative traits (QT). Given the high dimensionality of both imaging and genetic data, we propose to study Imaging Genetic Enrichment Analysis (IGEA), a new enrichment analysis paradigm that jointly considers meaningful gene sets (GS) and brain circuits (BC) and examines whether any given GS-BC pair is enriched in a list of gene-QT findings. Using gene expression data from Allen Human Brain Atlas and imaging genetics data from Alzheimer's Disease Neuroimaging Initiative as test beds, we present an IGEA framework and conduct a proof-of-concept study. This empirical study identifies 25 significant high-level two-dimensional imaging genetics modules. Many of these modules are relevant to a variety of neurobiological pathways or neurodegenerative diseases, showing the promise of the proposal framework for providing insight into the mechanism of complex diseases.

  18. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

    NASA Astrophysics Data System (ADS)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-07-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues play a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  19. The smart cluster method - Adaptive earthquake cluster identification and analysis in strong seismic regions

    NASA Astrophysics Data System (ADS)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-03-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues play a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  20. Distinct clinical phenotypes of airways disease defined by cluster analysis.

    PubMed

    Weatherall, M; Travers, J; Shirtcliffe, P M; Marsh, S E; Williams, M V; Nowitz, M R; Aldington, S; Beasley, R

    2009-10-01

    Airways disease is currently classified using diagnostic labels such as asthma, chronic bronchitis and emphysema. The current definitions of these classifications may not reflect the phenotypes of airways disease in the community, which may have differing disease processes, clinical features or responses to treatment. The aim of the present study was to use cluster analysis to explore clinical phenotypes in a community population with airways disease. A random population sample of 25-75-yr-old adults underwent detailed investigation, including a clinical questionnaire, pulmonary function tests, nitric oxide measurements, blood tests and chest computed tomography. Cluster analysis was performed on the subgroup with current respiratory symptoms or obstructive spirometric results. Subjects with a complete dataset (n = 175) were included in the cluster analysis. Five clusters were identified with the following characteristics: cluster 1: severe and markedly variable airflow obstruction with features of atopic asthma, chronic bronchitis and emphysema; cluster 2: features of emphysema alone; cluster 3: atopic asthma with eosinophilic airways inflammation; cluster 4: mild airflow obstruction without other dominant phenotypic features; and cluster 5: chronic bronchitis in nonsmokers. Five distinct clinical phenotypes of airflow obstruction were identified. If confirmed in other populations, these findings may form the basis of a modified taxonomy for the disorders of airways obstruction.

  1. Genome wide analysis of Silurana (Xenopus) tropicalis development reveals dynamic expression using network enrichment analysis.

    PubMed

    Langlois, Valérie S; Martyniuk, Christopher J

    2013-01-01

    Development involves precise timing of gene expression and coordinated pathways for organogenesis and morphogenesis. Functional and sub-network enrichment analysis provides an integrated approach for identifying networks underlying development. The objectives of this study were to characterize early gene regulatory networks over Silurana tropicalis development from NF stage 2 to 46 using a custom Agilent 4×44K microarray. There were >8000 unique gene probes that were differentially expressed between Nieuwkoop-Faber (NF) stage 2 and stage 16, and >2000 gene probes differentially expressed between NF 34 and 46. Gene ontology revealed that genes involved in nucleosome assembly, cell division, pattern specification, neurotransmission, and general metabolism were increasingly regulated throughout development, consistent with active development. Sub-network enrichment analysis revealed that processes such as membrane hyperpolarisation, retinoic acid, cholesterol, and dopamine metabolic gene networks were activated/inhibited over time. This study identifies RNA transcripts that are potentially maternally inherited in an anuran species, provides evidence that the expression of genes involved in retinoic acid receptor signaling may increase prior to those involved in thyroid receptor signaling, and characterizes novel gene expression networks preceding organogenesis which increases understanding of the spatiotemporal embryonic development in frogs.

  2. Reactome Pathway Analysis to Enrich Biological Discovery in Proteomics Datasets

    PubMed Central

    Haw, Robin; Hermjakob, Henning; D’Eustachio, Peter; Stein, Lincoln

    2012-01-01

    Reactome (http://www.reactome.org) is an open source, expert-authored, peer-reviewed, manually curated database of reactions, pathways and biological processes. We provide an intuitive web-based user interface to pathway knowledge and a suite of data analysis tools. The Pathway Browser is a Systems Biology Graphical Notation (SBGN)-like visualization system that supports manual navigation of pathways by zooming, scrolling and event highlighting, and that exploits PSI Common Query Interface (PSICQUIC) web services to overlay pathways with molecular interaction data from the Reactome Functional Interaction (FI) Network and interaction databases such as IntAct, ChEMBL, and BioGRID. Pathway and Expression Analysis tools employ web services to provide ID mapping, pathway assignment and over-representation analysis of user-supplied datasets. By applying Ensembl Compara to curated human proteins and reactions, Reactome generates pathway inferences for 20 other species. The Species Comparison tool provides a summary of results for each of these species as a table showing numbers of orthologous proteins found by pathway from which users can navigate to inferred details for specific proteins and reactions. Reactome’s diverse pathway knowledge and suite of data analysis tools provide a platform for data mining, modeling and the analysis of large-scale proteomics datasets. PMID:21751369

  3. GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis.

    PubMed

    Zheng, Qi; Wang, Xiu-Jie

    2008-07-01

    Gene Ontology (GO) analysis has become a commonly used approach for functional studies of large-scale genomic or transcriptomic data. Although many software tools offer GO-related analysis functions, new tools are still needed to meet the requirements of data generated by newly developed technologies or of advanced analysis purposes. Here, we present a Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), and therefore provides a better understanding of the correlations among enriched GO terms; (ii) GOEAST supports analysis of data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) one unique feature of GOEAST is that it allows cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them. GOEAST also provides rigorous statistical tests to enhance the reliability of analysis results. GOEAST is freely accessible at http://omicslab.genetics.ac.cn/GOEAST/
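
    The abstract above does not name the statistical test behind "statistically overrepresented GO terms". A common choice for this kind of over-representation analysis is the one-sided hypergeometric test, and the sketch below assumes it. The function name and the gene counts are hypothetical and are not taken from GOEAST.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: genes in the background set
    annotated:  background genes carrying the GO term
    drawn:      genes in the user's gene list
    hits:       genes in the list that carry the GO term
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# Hypothetical toy numbers: 20 background genes, 5 annotated with the term,
# a gene list of 4, of which 3 carry the term.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"p = {p_value:.4f}")
```

    In practice the p-values for all tested terms would also need a multiple-testing correction; that step is left out of this sketch.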

  4. Cluster analysis using multivariate mixed effects models.

    PubMed

    Villarroel, Luis; Marshall, Guillermo; Barón, Anna E

    2009-09-10

    A common situation in the biological and social sciences is to have data on one or more variables measured longitudinally on a sample of individuals. A problem of growing interest in these areas is the grouping of individuals into one of two or more clusters according to their longitudinal behavior. Recently, methods have been proposed to deal with cases where individuals are classified into clusters through a linear model of mixed univariate effects deriving from a longitudinally measured variable. The method proposed in the current work deals with the case of clustering and then classification based on two or more variables measured longitudinally, through the fitting of non-linear multivariate mixed effect models, and with consideration given to parameter estimation for balanced and unbalanced data using an EM algorithm. The application of the method is illustrated with an example in which the clusters are identified and the classification into clusters is compared with the true membership of individuals in one of two groups, which is known at the end of the follow-up period.

  5. Correlation analysis of objectively defined galaxy and cluster catalogues

    NASA Astrophysics Data System (ADS)

    Stevenson, P. R. F.; Fong, R.; Shanks, T.

    1988-10-01

    The authors present further galaxy clustering results from the objective COSMOS/UKST galaxy catalogue of Stevenson et al. They first re-examine the results of SSFM for the galaxy correlation function, wgg(θ), testing the stability of the result against possible systematic effects and extending the analysis to larger angular scales. They then use the method of Turner & Gott to automatically detect groups and clusters in these catalogues. The authors next present the cluster-galaxy cross-correlation function wcg. Finally, the above correlation analyses are carried out on simulated galaxy and cluster catalogues.

  6. A Note on Cluster Effects in Latent Class Analysis

    ERIC Educational Resources Information Center

    Kaplan, David; Keller, Bryan

    2011-01-01

    This article examines the effects of clustering in latent class analysis. A comprehensive simulation study is conducted, which begins by specifying a true multilevel latent class model with varying within- and between-cluster sample sizes, varying latent class proportions, and varying intraclass correlations. These models are then estimated under…

  7. STATISTICAL ANALYSIS OF DWARF GALAXIES AND THEIR GLOBULAR CLUSTERS IN THE LOCAL VOLUME

    SciTech Connect

    Chattopadhyay, Tanuka; Karmakar, Pradip; Sharina, Margarita

    2010-11-20

    Although morphological classification of dwarf galaxies into early and late types can account for some of their origin and characteristics, this does not aid the study of their formation mechanism. Thus an objective classification using principal component analysis together with K means cluster analysis of these dwarf galaxies and their globular clusters (GCs) is carried out to overcome this problem. It is found that the classification of dwarf galaxies in the local volume is irrespective of their morphological indices. The more massive (M_{V0} < -13.7) galaxies evolve through self-enrichment and harbor dynamically less evolved younger GCs, whereas fainter galaxies (M_{V0} > -13.7) are influenced by their environment in the star formation process.

  8. An Economical Method for Static Headspace Enrichment for Arson Analysis

    ERIC Educational Resources Information Center

    Olesen, Bjorn

    2010-01-01

    Static headspace analysis of accelerants from suspected arsons is accomplished by placing an arson sample in a sealed container with a carbon strip suspended above the sample. The sample is heated, cooled to room temperature, and then the organic components are extracted from the carbon strip with carbon disulfide followed by gas chromatography…

  10. First CCD UBVI photometric analysis of six open cluster candidates

    NASA Astrophysics Data System (ADS)

    Piatti, A. E.; Clariá, J. J.; Ahumada, A. V.

    2011-04-01

    We have obtained CCD UBVI_KC photometry down to V ∼ 22 for the open cluster candidates Haffner 3, Haffner 5, NGC 2368, Haffner 25, Hogg 3 and Hogg 4 and their surrounding fields. None of these objects have been photometrically studied so far. Our analysis shows that these stellar groups are not genuine open clusters since no clear main sequences or other meaningful features can be seen in their colour-magnitude and colour-colour diagrams. We checked for possible differential reddening across the studied fields that could be hiding the characteristics of real open clusters. However, the dust in the directions to these objects appears to be uniformly distributed. Moreover, star counts carried out within and outside the open cluster candidate fields do not support the hypothesis that these objects are real open clusters or even open cluster remnants.

  11. Modeling and Analysis Methods for an On-line Enrichment Monitor

    SciTech Connect

    Smith, Leon E.; Jarman, Kenneth D.; Wittman, Richard S.; Zalavadia, Mital A.; March-Leuba, Jose A.

    2016-05-30

    The International Atomic Energy Agency (IAEA) has developed an On-Line Enrichment Monitor (OLEM) as one possible component in a new generation of safeguards measures for uranium enrichment plants. The OLEM measures 235U emissions from the UF6 gas flowing through a unit header pipe using NaI(Tl) spectrometers, and corrects for gas density changes using pressure and temperature sensors in order to determine the enrichment of the gas as a function of time. In parallel with the OLEM instrument development, a Virtual OLEM (VOLEM) software tool has been developed that is capable of producing synthetic gamma-ray, pressure, and temperature data representative of a wide range of enrichment plant operating conditions. VOLEM complements instrument development activities and allows the study of OLEM for scenarios that will be difficult or impossible to evaluate empirically. Uses of VOLEM include: investigation of hardware design options; inter-comparison of candidate gamma-ray spectral analysis and enrichment estimation algorithms; uncertainty budget analysis and performance prediction for typical and atypical operational scenarios; and testing of the OLEM data acquisition, analysis and reporting software. This paper describes the technical foundations of VOLEM and illustrates how it can be used. An overview of the nominal instrument design and deployment scenario for OLEM is provided, with emphasis on the key online-assay measurement challenge: accurately determining the portion of the total 235U signal that comes from a background that includes solid uranium deposits on the piping walls. Monte Carlo modeling tools, data analysis algorithms and uncertainty quantification methods are described. VOLEM is then used to quantitatively explore the uncertainty budgets and predicted instrument performance for a plausible range of typical plant operating parameters, and one set of candidate analysis algorithms. Additionally, a series of VOLEM case studies illustrates how an online

  12. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    PubMed

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
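
    To make two of the stand-alone metrics named above concrete, here is a minimal sketch that computes modularity and conductance using their standard textbook definitions for an undirected, unweighted graph. The toy graph (two triangles joined by one edge) and the function names are assumptions for illustration; the study itself used much larger graphs, and its exact metric variants are not given in the abstract.

```python
def modularity(edges, communities):
    """Newman modularity: sum over communities of L_c/m - (d_c / 2m)^2."""
    m = len(edges)
    label = {v: i for i, nodes in enumerate(communities) for v in nodes}
    internal = [0] * len(communities)    # edges with both ends in community
    degree_sum = [0] * len(communities)  # total degree of community nodes
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(internal, degree_sum))


def conductance(edges, community):
    """Cut edges divided by the smaller of the two side volumes."""
    inside = set(community)
    cut = sum((u in inside) != (v in inside) for u, v in edges)
    volume_in = sum((u in inside) + (v in inside) for u, v in edges)
    volume_out = 2 * len(edges) - volume_in
    return cut / min(volume_in, volume_out)


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
partition = [[0, 1, 2], [3, 4, 5]]
q = modularity(edges, partition)
phi = conductance(edges, partition[0])
print(q, phi)
```

    Higher modularity and lower conductance both indicate a better-separated partition, which is why the two metrics can disagree with information recovery scores computed against a known ground truth.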

  13. A meta-analysis of the effects of nutrient enrichment on litter decomposition in streams.

    PubMed

    Ferreira, Verónica; Castagneyrol, Bastien; Koricheva, Julia; Gulis, Vladislav; Chauvet, Eric; Graça, Manuel A S

    2015-08-01

    The trophic state of many streams is likely to deteriorate in the future due to the continuing increase in human-induced nutrient availability. Therefore, it is of fundamental importance to understand how nutrient enrichment affects plant litter decomposition, a key ecosystem-level process in forest streams. Here, we present a meta-analysis of 99 studies published between 1970 and 2012 that reported the effects of nutrient enrichment on litter decomposition in running waters. When considering the entire database, which consisted of 840 case studies, nutrient enrichment stimulated litter decomposition rate by approximately 50%. The stimulation was higher when the background nutrient concentrations were low and the magnitude of the nutrient enrichment was high, suggesting that oligotrophic streams are most vulnerable to nutrient enrichment. The magnitude of the nutrient-enrichment effect on litter decomposition was higher in the laboratory than in the field experiments, suggesting that laboratory experiments overestimate the effect and their results should be interpreted with caution. Among field experiments, effects of nutrient enrichment were smaller in the correlative than in the manipulative experiments since in the former the effects of nutrient enrichment on litter decomposition were likely confounded by other environmental factors, e.g. pollutants other than nutrients commonly found in streams impacted by human activity. However, primary studies addressing the effect of multiple stressors on litter decomposition are still few and thus it was not possible to consider the interaction between factors in this review. In field manipulative experiments, the effect of nutrient enrichment on litter decomposition depended on the scale at which the nutrients were added: stream reach > streamside channel > litter bag. This may have resulted from a more uniform and continuous exposure of microbes and detritivores to nutrient enrichment at the stream-reach scale. By

  14. The enrichment ratio of atomic contacts in crystals, an indicator derived from the Hirshfeld surface analysis.

    PubMed

    Jelsch, Christian; Ejsmont, Krzysztof; Huder, Loïc

    2014-03-01

    The partitioning of space with Hirshfeld surfaces enables the analysis of fingerprint molecular interactions in crystalline environments. This study uses the decomposition of the crystal contact surface between pairs of interacting chemical species to derive an enrichment ratio. This quantity enables the analysis of the propensity of chemical species to form intermolecular interactions with themselves and other species. The enrichment ratio is obtained by comparing the actual contacts in the crystal with those computed as if all types of contacts had the same probability to form. The enrichments and contact tendencies were analyzed in several families of compounds, based on chemical composition and aromatic character. As expected, the polar contacts of the type H⋯N, H⋯O and H⋯S, which are generally hydrogen bonds, show enrichment values larger than unity. O⋯O and N⋯N contacts are impoverished while H⋯H interactions display enrichment ratios which are generally close to unity or slightly lower. In aromatic compounds, C⋯C contacts can display large enrichment ratios due to extensive π⋯π stacking in the crystal packings of heterocyclic compounds. C⋯C contacts are, however, less enriched in pure (C,H) hydrocarbons as π⋯π stacking is not so favourable from the electrostatic point of view compared with heterocycles. C⋯H contacts are favoured in (C,H) aromatics, but these interactions occur less in compounds containing O, N or S as some H atoms are then involved in hydrogen bonds. The study also highlights the fact that hydrogen is a preferred interaction partner for fluorine.
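
    One way to formalize the comparison described above is sketched below. It assumes that each species' share of the surface is its self-contact fraction plus half of each mixed-contact fraction, and that equiprobable contacts follow from the products of those shares. These formulas are an assumption made for illustration and should be checked against the paper; the contact fractions are hypothetical.

```python
def enrichment_ratios(contacts):
    """Ratio of actual to random-expectation contact fractions.

    contacts maps an unordered species pair, e.g. ("H", "O"), to the fraction
    of the contact surface it occupies; fractions are assumed to sum to 1.
    """
    # Surface share of each species: self contacts count fully,
    # mixed contacts are split equally between the two partners.
    surface = {}
    for (x, y), fraction in contacts.items():
        if x == y:
            surface[x] = surface.get(x, 0.0) + fraction
        else:
            surface[x] = surface.get(x, 0.0) + fraction / 2
            surface[y] = surface.get(y, 0.0) + fraction / 2

    ratios = {}
    for (x, y), fraction in contacts.items():
        # Expected fraction if all contacts were equally likely to form.
        expected = surface[x] ** 2 if x == y else 2 * surface[x] * surface[y]
        ratios[(x, y)] = fraction / expected
    return ratios


# Hypothetical contact fractions for a crystal containing only H and O.
contacts = {("H", "H"): 0.4, ("H", "O"): 0.5, ("O", "O"): 0.1}
ratios = enrichment_ratios(contacts)
for pair, value in ratios.items():
    print(pair, round(value, 3))
```

    With these made-up numbers the H⋯O ratio comes out above unity and the O⋯O ratio below it, which mirrors the qualitative pattern reported in the abstract.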

  15. Obstructive Sleep Apnea: A Cluster Analysis at Time of Diagnosis

    PubMed Central

    Grillet, Yves; Richard, Philippe; Stach, Bruno; Vivodtzev, Isabelle; Timsit, Jean-Francois; Lévy, Patrick; Tamisier, Renaud; Pépin, Jean-Louis

    2016-01-01

    Background: The classification of obstructive sleep apnea is based on sleep study criteria that may not adequately capture disease heterogeneity. Improved phenotyping may improve prognosis prediction and help select therapeutic strategies. Objectives: This study used cluster analysis to investigate the clinical clusters of obstructive sleep apnea. Methods: An ascending hierarchical cluster analysis was performed on baseline symptoms, physical examination, risk factor exposure and co-morbidities from 18,263 participants in the OSFP (French national registry of sleep apnea). The probability for criteria to be associated with a given cluster was assessed using odds ratios, determined by univariate logistic regression. Results: Six clusters were identified, in which patients varied considerably in age, sex, symptoms, obesity, co-morbidities and environmental risk factors. The main significant differences between clusters were minimally symptomatic versus sleepy obstructive sleep apnea patients, lean versus obese, and among obese patients different combinations of co-morbidities and environmental risk factors. Conclusions: Our cluster analysis identified six distinct clusters of obstructive sleep apnea. Our findings underscore the high degree of heterogeneity that exists within obstructive sleep apnea patients regarding clinical presentation, risk factors and consequences. This may help in both research and clinical practice for validating new prevention programs, in diagnosis and in decisions regarding therapeutic strategies. PMID:27314230
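
    For a single binary criterion, the odds ratio from a univariate logistic regression equals the cross-product ratio of the corresponding 2×2 table, so the association step described above can be illustrated without fitting a model. The sketch below assumes that simplification and adds a Wald-type 95% interval; the counts are hypothetical and are not from the OSFP registry.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: in cluster, criterion present    b: in cluster, criterion absent
    c: other clusters, present          d: other clusters, absent
    """
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log odds ratio
    low = exp(log(estimate) - z * se_log)
    high = exp(log(estimate) + z * se_log)
    return estimate, low, high


# Hypothetical counts: 30 of 100 patients in a cluster report a symptom,
# versus 20 of 200 patients in all other clusters.
estimate, low, high = odds_ratio_ci(30, 70, 20, 180)
print(f"OR = {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```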

  16. Analysis of the effectiveness of gas centrifuge enrichment plants advanced safeguards

    SciTech Connect

    Boyer, Brian David; Erpenbeck, Heather H; Miller, Karen A; Swinjoe, Martyn T; Ianakiev, Kiril D; Marlow, Johnna B

    2010-01-01

    Current safeguards approaches used by the International Atomic Energy Agency (IAEA) at gas centrifuge enrichment plants (GCEPs) need enhancement in order to verify declared low-enriched uranium (LEU) production, detect undeclared LEU production and detect highly enriched uranium (HEU) production with adequate detection probability using non destructive assay (NDA) techniques. At present inspectors use attended systems, systems needing the presence of an inspector for operation, during inspections to verify the mass and 235U enrichment of declared UF6 containers used in the process of enrichment at GCEPs. This paper contains an analysis of possible improvements in unattended and attended NDA systems including process monitoring and possible on-site destructive assay (DA) of samples that could reduce the uncertainty of the inspector's measurements. These improvements could reduce the difference between the operator's and inspector's measurements providing more effective and efficient IAEA GCEPs safeguards. We also explore how a few advanced safeguards systems could be assembled for unattended operation. The analysis will focus on how unannounced inspections (UIs), and the concept of information-driven inspections (IDS) can affect probability of detection of the diversion of nuclear materials when coupled to new GCEPs safeguards regimes augmented with unattended systems.

  17. [On National Demonstration Areas: a cluster analysis].

    PubMed

    Mao, F; Jiang, Y Y; Dong, W L; Ji, N; Dong, J Q

    2017-04-10

    Objective: To identify the lagging provinces and the relatively weak areas of work in the construction of National Demonstration Areas, so as to promote communication and future planning among different regions. Methods: Cluster analysis was used to compare the development of National Demonstration Areas in different provinces, including the coverage of National Demonstration Areas and the scores for non-communicable disease (NCD) prevention and control work based on a standardized indicator system. Results: According to the results on the construction of National Demonstration Areas, all the 29 provinces and the Xinjiang Production and Construction Corps (except Tibet and Qinghai) were classified into 6 categories: Shanghai; Beijing, Zhejiang, Chongqing; Tianjin, Shandong, Guangdong and Xinjiang Production and Construction Corps; Hebei, Fujian, Hubei, Jiangsu, Liaoning, Xinjiang, Hunan and Guangxi; Shanxi, Jilin, Henan, Hainan, Sichuan, Anhui and Jiangxi; Inner Mongolia, Shaanxi, Ningxia, Guizhou, Yunnan, Gansu and Heilongjiang. Based on the scores gathered in this study, the 24 items representing achievements in NCD prevention and control were classified into 4 categories: manpower, special day on NCD, information materials development, policy/strategy support, financial support, mass media, enabling environment, community fitness campaign, health promotion for children and teenagers, institutional structure and patient self-management; healthy diet, surveillance of NCD risk factors, tobacco control and community diagnosis; intervention of high-risk groups, identification of high-risk groups, reporting system on cardiovascular and cerebrovascular events, popularization of basic public health service, workplace intervention programs, construction of demonstration units and mortality surveillance; oral hygiene and tumor registration. Contents including oral hygiene, tumor registration, intervention on high-risk groups, identification of

  18. Mission analysis of clusters of satellites

    NASA Astrophysics Data System (ADS)

    Frayssinhes, Eric; Lansard, Erick

    1996-09-01

    An innovative satellite system that provides high precision localisation of beacon positions consists of a cluster of satellites, i.e. a group of satellites that maintain assigned positions at relatively short distances from each other. Compared to a single satellite, the interest of such a cluster lies in its ability to synthesise antenna bases much longer than those that can be physically mounted on one satellite. Each satellite of the cluster measures the time-of-arrival of the signal transmitted by the beacon. The derived time-differences-of-arrival (TDOA) are processed to estimate the beacon position. At first, this paper summarises the investigations performed on the localisation accuracy that have yielded the optimal cluster geometry. In a previous paper [E. Frayssinhes and E. Lansard, AAS paper 95-334 (1995)], Alcatel Espace has proposed a mathematical formulation relying on a strong analogy with GPS geometrical characterisation of navigation performances. The effects of geometry are expressed by geometric dilution of precision (GDOP) parameters. Such parameters are obtained by solving the TDOA measurement equations for the beacon position using an iterated-least-squares procedure. Then, the paper focuses, at the system level, on the particular problems that arise when dealing with such a satellite cluster system, and more particularly the launch and early operations phases, the station-keeping strategies of manoeuvres, and the relative localisation and clock synchronisation of the satellites. In particular, it is shown that even with the "civil" C/A GPS measurements, differential techniques can yield respective accuracies better than 5 m r.m.s. and 15 ns r.m.s.

  19. Revealing gene clusters associated with the development of cholangiocarcinoma, based on a time series analysis.

    PubMed

    Wu, Jianyu; Xiao, Zhifu; Zhao, Xiulei; Wu, Xiangsong

    2015-05-01

    Cholangiocarcinoma (CC) is a rapidly lethal malignancy and currently is considered to be incurable. Biomarkers related to the development of CC remain unclear. The present study aimed to identify differentially expressed genes (DEGs) between normal tissue and intrahepatic CC, as well as specific gene expression patterns that changed together with the development of CC. By using a two‑way analysis of variance test, the biomarkers that could distinguish between normal tissue and intrahepatic CC dissected from different days were identified. A k‑means cluster method was used to identify gene clusters associated with the development of CC according to their changing expression pattern. Functional enrichment analysis was used to infer the function of each of the gene sets. A time series analysis was constructed to reveal gene signatures that were associated with the development of CC based on gene expression profile changes. Genes related to CC were shown to be involved in 'mitochondrion' and 'focal adhesion'. Three interesting gene groups were identified by the k‑means cluster method. Gene clusters with a unique expression pattern are related with the development of CC. The data of this study will facilitate novel discoveries regarding the genetic study of CC by further work.
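
    The k-means step mentioned above groups genes whose expression changes in a similar way over time. The sketch below is a plain Lloyd's-algorithm implementation under assumed inputs: each gene is a tuple of expression values at successive time points, and the starting centroids are supplied by the caller. The profiles and names are hypothetical, not data from the study.

```python
from math import dist


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm: assign each point to the nearest centroid,
    then move every centroid to the mean of its assigned points."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    labels = [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
              for p in points]
    return centroids, labels


# Hypothetical expression profiles over three time points:
# two rising genes followed by two falling genes.
profiles = [(0.0, 1.0, 2.0), (0.1, 1.1, 2.1), (2.0, 1.0, 0.0), (2.1, 0.9, 0.1)]
centroids, labels = kmeans(profiles, [profiles[0], profiles[2]])
print(labels)
```

    Real analyses usually standardize each gene's profile first and try several initializations, since k-means results depend on the starting centroids.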

  20. OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.

    PubMed

    Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A

    2014-10-16

    The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P <0.0001) among the four clusters. The four clusters divided the sample into severity levels: Cluster 1 reflects the lowest average levels across all symptoms, and cluster 4 reflects the highest average levels. Clusters 2 and 3 capture moderate symptoms levels. Clusters 2 and 3 differed mainly in profiles of anxiety and depression, with Cluster 2 having lower levels of depression and anxiety than Cluster 3, despite higher levels of pain. The results of the cluster analysis of the external sample (n = 478) looked very similar to those found in the original cluster analysis, except for a slight difference in sleep problems. This was despite having patients in the validation sample who were significantly younger (P <0.0001) and had more severe symptoms (higher FIQ-R total scores (P = 0.0004)). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a
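
    Hierarchical agglomerative clustering, as used above, starts with every patient in a separate cluster and repeatedly merges the two closest clusters. The abstract does not state the distance measure or linkage rule, so the sketch below assumes Euclidean distance and average linkage. The two-dimensional symptom scores are hypothetical.

```python
from math import dist


def agglomerative(points, k):
    """Merge the two closest clusters (average linkage) until k remain.
    Returns clusters as lists of indices into points."""
    clusters = [[i] for i in range(len(points))]

    def average_linkage(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs,
                   key=lambda ij: average_linkage(clusters[ij[0]],
                                                  clusters[ij[1]]))
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters


# Hypothetical (pain, fatigue) scores for six patients.
scores = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (9.0, 9.0), (8.8, 9.2)]
groups = agglomerative(scores, 3)
print(groups)
```

    Choosing the number of clusters (four in the study) is a separate decision that this sketch takes as an input.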

  1. Design and Analysis Considerations for Cluster Randomized Controlled Trials That Have a Small Number of Clusters.

    PubMed

    Deke, John

    2016-10-25

    Cluster randomized controlled trials (CRCTs) often require a large number of clusters in order to detect small effects with high probability. However, there are contexts where it may be possible to design a CRCT with a much smaller number of clusters (10 or fewer) and still detect meaningful effects. The objective is to offer recommendations for best practices in design and analysis for small CRCTs. I use simulations to examine alternative design and analysis approaches. Specifically, I examine (1) which analytic approaches control Type I errors at the desired rate, (2) which design and analytic approaches yield the most power, (3) what is the design effect of spurious correlations, and (4) examples of specific scenarios under which impacts of different sizes can be detected with high probability. I find that (1) mixed effects modeling and using Ordinary Least Squares (OLS) on data aggregated to the cluster level both control the Type I error rate, (2) randomization within blocks is always recommended, but how best to account for blocking through covariate adjustment depends on whether the precision gains offset the degrees of freedom loss, (3) power calculations can be accurate when design effects from small sample, spurious correlations are taken into account, and (4) it is very difficult to detect small effects with just four clusters, but with six or more clusters, there are realistic circumstances under which small effects can be detected with high probability. © The Author(s) 2016.
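
    One of the analytic approaches named above, OLS on data aggregated to the cluster level, reduces to a comparison of cluster means when the only regressor is the treatment indicator. The sketch below assumes that simplest case (no blocking, no covariates) and a pooled-variance standard error; the outcome values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cluster_level_estimate(clusters):
    """Impact estimate from cluster means.

    clusters: list of (treated, outcomes) pairs, one per cluster.
    Returns (difference in means, standard error, degrees of freedom).
    """
    treated = [mean(y) for is_treated, y in clusters if is_treated]
    control = [mean(y) for is_treated, y in clusters if not is_treated]
    difference = mean(treated) - mean(control)
    # Degrees of freedom depend on the number of clusters, not individuals.
    df = len(treated) + len(control) - 2
    pooled = ((len(treated) - 1) * variance(treated)
              + (len(control) - 1) * variance(control)) / df
    se = sqrt(pooled * (1 / len(treated) + 1 / len(control)))
    return difference, se, df


# Hypothetical trial with six clusters of two individuals each.
trial = [
    (True, [5, 7]), (True, [6, 8]), (True, [7, 9]),
    (False, [3, 5]), (False, [4, 6]), (False, [5, 7]),
]
difference, se, df = cluster_level_estimate(trial)
print(difference, round(se, 3), df)
```

    With only six clusters the test has four degrees of freedom, which illustrates why the number of clusters, rather than the number of individuals, limits power in these designs.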

  2. Human protein cluster analysis using amino acid frequencies.

    PubMed

    Vernone, Annamaria; Berchialla, Paola; Pescarmona, Gianpiero

    2013-01-01

    The paper focuses on the development of a software tool for protein clustering according to their amino acid content. All known human proteins were clustered according to the relative frequencies of their amino acids starting from the UniProtKB/Swiss-Prot reference database and making use of hierarchical cluster analysis. Results were compared to those based on sequence similarities. Proteins display different clustering patterns according to type. Many extracellular proteins with highly specific and repetitive sequences (keratins, collagens etc.) cluster clearly, confirming the accuracy of the clustering method. In our case clustering by sequence and amino acid content overlaps. Proteins with a more complex structure with multiple domains (catalytic, extracellular, transmembrane etc.), even if classified as very similar according to sequence similarity and function (aquaporins, cadherins, steroid 5-alpha reductase etc.), showed different clustering according to amino acid content. Availability of essential amino acids according to local conditions (starvation, low or high oxygen, cell cycle phase etc.) may be a limiting factor in protein synthesis, whatever the mRNA level. This type of protein clustering may therefore prove a valuable tool in identifying so far unknown metabolic connections and constraints.
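
    The feature representation described above can be sketched as a 20-element vector of relative amino acid frequencies per protein, compared pairwise with a distance measure before hierarchical clustering. The abstract does not specify the distance, so Euclidean distance is assumed here. The short sequences are made-up toy strings, not UniProtKB entries.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def aa_frequencies(sequence):
    """Relative frequency of each standard amino acid in a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def euclidean(p, q):
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical toy sequences: two glycine/proline-rich repeats and one
# unrelated stretch.
repeat_a = aa_frequencies("GPPGPPGAP")
repeat_b = aa_frequencies("GPPGAPGAP")
other = aa_frequencies("MKLVEELLKK")

print(round(euclidean(repeat_a, repeat_b), 3))
print(round(euclidean(repeat_a, other), 3))
```

    Because the vector ignores residue order, two proteins with different sequences but similar composition end up close together, which is the property the paper exploits.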

  3. Visual verification and analysis of cluster detection for molecular dynamics.

    PubMed

    Grottel, Sebastian; Reina, Guido; Vrabec, Jadran; Ertl, Thomas

    2007-01-01

    A current research topic in molecular thermodynamics is the condensation of vapor to liquid and the investigation of this process at the molecular level. Condensation is found in many physical phenomena, e.g. the formation of atmospheric clouds or the processes inside steam turbines, where a detailed knowledge of the dynamics of condensation processes will help to optimize energy efficiency and avoid problems with droplets of macroscopic size. The key properties of these processes are the nucleation rate and the critical cluster size. For the calculation of these properties it is essential to make use of a meaningful definition of molecular clusters, which is currently not a completely resolved issue. In this paper a framework capable of interactively visualizing molecular datasets of such nucleation simulations is presented, with an emphasis on the detected molecular clusters. To check the quality of the results of the cluster detection, our framework introduces the concept of flow groups to highlight potential cluster evolution over time which is not detected by the employed algorithm. To confirm the findings of the visual analysis, we coupled the rendering view with a schematic view of the clusters' evolution. This allows rapid assessment of the quality of the molecular cluster detection algorithm and identification of locations in the simulation data, in space as well as in time, where the cluster detection fails. Thus, thermodynamics researchers can eliminate weaknesses in their cluster detection algorithms. Several examples for the effective and efficient usage of our tool are presented.

  4. A Flocking Based algorithm for Document Clustering Analysis

    SciTech Connect

    Cui, Xiaohui; Gao, Jinzhu; Potok, Thomas E

    2006-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithms such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K-means and the Ant clustering algorithms for real document clustering.

  5. Identifying heterogeneity among injection drug users: a cluster analysis approach.

    PubMed

    Shaw, Souradet Y; Shah, Lena; Jolly, Ann M; Wylie, John L

    2008-08-01

    We used cluster analysis to subdivide a population of injection drug users and identify previously unknown behavioral heterogeneity within that population. We applied cluster analysis techniques to data collected in a cross-sectional survey of injection drug users in Winnipeg, Manitoba. The clustering variables we used were based on receptive syringe sharing, ethnicity, and types of drugs injected. Seven clusters were identified for both male and female injection drug users. Some relationships previously revealed in our study setting, such as the known relationship between Talwin (pentazocine) and Ritalin (methylphenidate) use, injection in hotels, and hepatitis C virus prevalence, were confirmed through our cluster analysis approach. Also, relationships between drug use and infection risk not previously observed in our study setting were identified, an example being a cluster of female crystal methamphetamine users who exhibited high-risk behaviors but an absence or low prevalence of blood-borne pathogens. Cluster analysis was useful in both confirming relationships previously identified and identifying new ones relevant to public health research and interventions.

  6. Identification of chronic rhinosinusitis phenotypes using cluster analysis.

    PubMed

    Soler, Zachary M; Hyer, J Madison; Ramakrishnan, Viswanathan; Smith, Timothy L; Mace, Jess; Rudmik, Luke; Schlosser, Rodney J

    2015-05-01

    Current clinical classifications of chronic rhinosinusitis (CRS) have been largely defined based upon preconceived notions of factors thought to be important, such as polyp or eosinophil status. Unfortunately, these classification systems have little correlation with symptom severity or treatment outcomes. Unsupervised clustering can be used to identify phenotypic subgroups of CRS patients, describe clinical differences in these clusters and define simple algorithms for classification. A multi-institutional, prospective study of 382 patients with CRS who had failed initial medical therapy completed the Sino-Nasal Outcome Test (SNOT-22), Rhinosinusitis Disability Index (RSDI), Medical Outcomes Study Short Form-12 (SF-12), Pittsburgh Sleep Quality Index (PSQI), and Patient Health Questionnaire (PHQ-2). Objective measures of CRS severity included Brief Smell Identification Test (B-SIT), CT, and endoscopy scoring. All variables were reduced and unsupervised hierarchical clustering was performed. After clusters were defined, variations in medication usage were analyzed. Discriminant analysis was performed to develop a simplified, clinically useful algorithm for clustering. Clustering was largely determined by age, severity of patient reported outcome measures, depression, and fibromyalgia. CT and endoscopy varied somewhat among clusters. Traditional clinical measures, including polyp/atopic status, prior surgery, B-SIT and asthma, did not vary among clusters. A simplified algorithm based upon productivity loss, SNOT-22 score, and age predicted clustering with 89% accuracy. Medication usage among clusters did vary significantly. A simplified algorithm based upon hierarchical clustering is able to classify CRS patients and predict medication usage. Further studies are warranted to determine if such clustering predicts treatment outcomes. © 2015 ARS-AAOA, LLC.

  7. Atlas-guided cluster analysis of large tractography datasets.

    PubMed

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment.

  8. Differences in Pedaling Technique in Cycling: A Cluster Analysis.

    PubMed

    Lanferdini, Fábio J; Bini, Rodrigo R; Figueiredo, Pedro; Diefenthaeler, Fernando; Mota, Carlos B; Arndt, Anton; Vaz, Marco A

    2016-10-01

    To employ cluster analysis to assess if cyclists would opt for different strategies in terms of neuromuscular patterns when pedaling at the power output of their second ventilatory threshold (POVT2) compared with cycling at their maximal power output (POMAX). Twenty athletes performed an incremental cycling test to determine their power output (POMAX and POVT2; first session), and pedal forces, muscle activation, muscle-tendon unit length, and vastus lateralis architecture (fascicle length, pennation angle, and muscle thickness) were recorded (second session) in POMAX and POVT2. Athletes were assigned to 2 clusters based on the behavior of outcome variables at POVT2 and POMAX using cluster analysis. Clusters 1 (n = 14) and 2 (n = 6) showed similar power output and oxygen uptake. Cluster 1 presented larger increases in pedal force and knee power than cluster 2, without differences for the index of effectiveness. Cluster 1 presented less variation in knee angle, muscle-tendon unit length, pennation angle, and tendon length than cluster 2. However, clusters 1 and 2 showed similar muscle thickness, fascicle length, and muscle activation. When cycling at POVT2 vs POMAX, cyclists could opt for keeping a constant knee power and pedal-force production, associated with an increase in tendon excursion and a constant fascicle length. Increases in power output lead to greater variations in knee angle, muscle-tendon unit length, tendon length, and pennation angle of vastus lateralis for a similar knee-extensor activation and smaller pedal-force changes in cyclists from cluster 2 than in cluster 1.

  9. Atlas-Guided Cluster Analysis of Large Tractography Datasets

    PubMed Central

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment. PMID:24386292

  10. Automated analysis of organic particles using cluster SIMS

    NASA Astrophysics Data System (ADS)

    Gillen, Greg; Zeissler, Cindy; Mahoney, Christine; Lindstrom, Abigail; Fletcher, Robert; Chi, Peter; Verkouteren, Jennifer; Bright, David; Lareau, Richard T.; Boldman, Mike

    2004-06-01

    Cluster primary ion bombardment combined with secondary ion imaging is used on an ion microscope secondary ion mass spectrometer for the spatially resolved analysis of organic particles on various surfaces. Compared to the use of monoatomic primary ion beam bombardment, the use of a cluster primary ion beam (SF5+ or C8-) provides significant improvement in molecular ion yields and a reduction in beam-induced degradation of the analyte molecules. These characteristics of cluster bombardment, along with automated sample stage control and custom image analysis software, are utilized to rapidly characterize the spatial distribution of trace explosive particles, narcotics and inkjet-printed microarrays on a variety of surfaces.

  11. Evaluating the Efficacy of Using Predifferentiated and Enriched Mathematics Curricula for Grade 3 Students: A Multisite Cluster-Randomized Trial

    ERIC Educational Resources Information Center

    McCoach, D. Betsy; Gubbins, E. Jean; Foreman, Jennifer; Rubenstein, Lisa DaVia; Rambo-Hernandez, Karen E.

    2014-01-01

    Despite the potential of differentiated curricula to enhance learning, limited research exists that documents their impact on Grade 3 students of all ability levels. To determine if there was a difference in achievement between students involved in 16 weeks of predifferentiated, enriched mathematics curricula and students using their district's…

  12. Application of biclustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials.

    PubMed

    Williams, Andrew; Halappanavar, Sabina

    2015-01-01

    sets, the present study also identified two novel gene sets: a gene set associated with pulmonary fibrosis and a gene set associated with ROS, underlining the advantage of using a data-driven approach to identify novel, functionally related gene sets. The results can be used in future gene set enrichment analysis studies involving NMs or as features for clustering and classifying NMs of diverse properties.

  13. Assessment of cluster yield components by image analysis.

    PubMed

    Diago, Maria P; Tardaguila, Javier; Aleixos, Nuria; Millan, Borja; Prats-Montalban, Jose M; Cubero, Sergio; Blasco, Jose

    2015-04-01

    Berry weight, berry number and cluster weight are key parameters for yield estimation for the wine and table grape industries. Current yield prediction methods are destructive, labour-demanding and time-consuming. In this work, a new methodology based on image analysis was developed to determine cluster yield components in a fast and inexpensive way. Clusters of seven different red varieties of grapevine (Vitis vinifera L.) were photographed under laboratory conditions and their cluster yield components manually determined after image acquisition. Two algorithms based on the Canny and the logarithmic image processing approaches were tested to find the contours of the berries in the images prior to berry detection performed by means of the Hough Transform. Results were obtained in two ways: by analysing either a single image of the cluster or using four images per cluster from different orientations. The best results (R² between 69% and 95% in berry detection and between 65% and 97% in cluster weight estimation) were achieved using four images and the Canny algorithm. The model's capability based on image analysis to predict berry weight was 84%. The new and low-cost methodology presented here enabled the assessment of cluster yield components, saving time and providing inexpensive information in comparison with current manual methods. © 2014 Society of Chemical Industry.

  14. Cancer symptom clusters: an exploratory analysis of eight statistical techniques.

    PubMed

    Aktas, Aynur; Walsh, Declan; Hu, Bo

    2014-12-01

    Statistical methods to identify symptom clusters (SC) have varied between studies. The optimal statistical method to identify SC is unknown. Our primary objective was to explore whether eight different statistical techniques applied to a single data set produced different SC. A secondary objective was to investigate whether SC identified by these techniques resembled those from our original study. We reanalyzed a symptom data set of 1000 patients with advanced cancer. Eight separate cluster analyses were conducted on both prevalence and severity of 38 symptoms. Hierarchical cluster analysis identified clusters at r-values of 0.6, 0.5, and 0.4. For prevalence and severity, the Spearman correlation and Kendall tau-b correlation, respectively, measured the similarity (distance) between symptom pairs. Sensitivity analysis of the prevalence data was done with Cohen kappa coefficient as a similarity measure. The K-means clustering method validated clusters. Hierarchical cluster analysis identified similar cluster configurations from the 38 symptoms using an r-value of 0.6, 0.5, or 0.4. A cutoff point of 0.6 yielded seven clusters. Five of them were identical at all three r-values used: (1) fatigue/anorexia-cachexia: anorexia, dry mouth, early satiety, fatigue, lack of energy, taste changes, weakness, and weight loss (>10%); (2) gastrointestinal: belching, bloating, dyspepsia, and hiccough; (3) nausea/vomiting: nausea and vomiting; (4) aerodigestive: cough, dysphagia, dyspnea, hoarseness, and wheeze; (5) neurologic: confusion, hallucinations, and memory problems. Regardless of the threshold, there were always some symptoms (e.g., pain) that did not cluster with any others. Seven clusters were validated by K-means analysis. Seven SC identified from both prevalence and severity data were consistently present irrespective of the statistical analysis used. There were only minor variations in the number of clusters and their symptom composition between analytical techniques.
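
    The cutoff-based grouping described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the abstract does not state the linkage rule, so single linkage is assumed here (cutting a single-linkage tree at similarity r is the same as linking every pair with correlation of at least r), and the symptom scores, the `scores` dictionary, and all function names are made up for illustration.

```python
from itertools import combinations

def rank(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))

def clusters_at_cutoff(scores, cutoff):
    """Group symptoms so that any pair with Spearman r >= cutoff is linked
    (assumed single linkage, implemented with a union-find structure)."""
    names = list(scores)
    parent = {name: name for name in names}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in combinations(names, 2):
        if spearman(scores[a], scores[b]) >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical scores for five patients (illustrative values only).
scores = {
    "nausea":   [0, 1, 2, 3, 3],
    "vomiting": [0, 1, 1, 3, 2],
    "cough":    [3, 0, 2, 1, 0],
    "dyspnea":  [3, 1, 2, 0, 0],
    "pain":     [1, 3, 0, 2, 1],
}
print(clusters_at_cutoff(scores, 0.6))
```

    On this toy input, nausea and vomiting form one group, cough and dyspnea another, and pain stays unclustered, which mirrors the abstract's remark that some symptoms never join a cluster at a given threshold.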

  15. Construction of gene/protein interaction networks for primary myelofibrosis and KEGG pathway-enrichment analysis of molecular compounds.

    PubMed

    Sun, C G; Cao, X J; Zhou, C; Liu, L J; Feng, F B; Liu, R J; Zhuang, J; Li, Y J

    2015-12-08

    The objective of this study was the development of a gene/protein interaction network for primary myelofibrosis based on gene expression, and the enrichment analysis of KEGG pathways underlying the molecular complexes in this network. To achieve this, genes involved in primary myelofibrosis were selected from the OMIM database. A gene/protein interaction network for primary myelofibrosis was obtained through Cytoscape with the literature mining performed using the Agilent Literature Search plugin. The molecular complexes in the network were detected by ClusterViz plugin and KEGG pathway enrichment of molecular complexes was performed using DAVID online. We found 75 genes associated with primary myelofibrosis in the OMIM database. The gene/protein interaction network of primary myelofibrosis contained 608 nodes, 2086 edges, and 4 molecular complexes with a correlation integral value greater than 4. Molecular complexes involved in KEGG pathways are related to cytokine regulation, immune function regulation, ECM-receptor interaction, focal adhesion, actin cytoskeleton regulation, cell adhesion molecules, and other biological behavior of tumors, which can provide a reliable direction for the treatment of primary myelofibrosis and the bioinformatic foundation for further understanding the molecular mechanisms of this disease.
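
    Pathway over-representation of this kind is commonly scored with a hypergeometric (one-sided Fisher-type) test. The sketch below shows that generic calculation only; it is not claimed to reproduce the scoring used by DAVID, and the function name and the toy counts are assumptions for illustration.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for X = number of pathway genes in a list of n genes
    drawn without replacement from N background genes, K of which
    belong to the pathway."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Toy numbers (hypothetical): 10 background genes, 4 in the pathway,
# a complex of 3 genes of which 2 fall in the pathway.
p_value = hypergeom_upper_tail(k=2, n=3, K=4, N=10)
print(round(p_value, 4))
```

    A small p-value suggests the pathway is over-represented in the gene list relative to the background; in practice the p-values for many pathways would also need a multiple-testing correction.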

  16. Detection of early glaucomatous progression with octopus cluster trend analysis.

    PubMed

    Naghizadeh, Farzaneh; Holló, Gábor

    2014-01-01

    To compare the ability of Corrected Cluster Trend Analysis (CCTA) and Cluster Trend Analysis (CTA) with event analysis of Octopus visual field series to detect early glaucomatous progression. One eye of 15 healthy, 19 ocular hypertensive, 20 preperimetric, and 51 perimetric glaucoma (PG) patients were investigated with Octopus normal G2 test at 6-month intervals for 1.5 to 3 years. Progression was defined with significant worsening in any of the 10 Octopus clusters with CCTA, and event analysis criteria, respectively. With event analysis, 9 PG eyes showed localized progression and 1 diffuse mean defect (MD) worsening. With CCTA, progression was indicated in 1 normal, 1 ocular hypertensive, and 1 preperimetric glaucoma eye due to vitreous floaters, and 28 PG eyes including all 9 eyes with localized progression with event analysis. The locations of CCTA progression matched those found with event analysis in all 9 cases. In 17 of the remaining 19 eyes, progressing clusters matched the locations that were suspicious but not definitive for progression with event analysis. In the eye with diffuse MD worsening, CTA found significant progression for 7 clusters. For global MD progression rate, eyes that worsened with CCTA only did not differ from the stable eyes but had significantly smaller progression rates than the eyes that progressed with event analysis (P=0.0002). In PG, Octopus CCTA and CTA are clinically useful to identify early progression and areas suspicious for early progression. However, in some eyes with no glaucomatous visual field damage, vitreous floaters may cause progression artifacts.

  17. The Limitations of Simple Gene Set Enrichment Analysis Assuming Gene Independence

    PubMed Central

    Tamayo, Pablo; Steinhardt, George; Liberzon, Arthur; Mesirov, Jill P.

    2013-01-01

    Since its first publication in 2003, the Gene Set Enrichment Analysis (GSEA) method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach, using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. 2009 as a serious contender. The argument criticizes GSEA’s nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with GSEA’s on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored due to the significant variance inflation they produced on the enrichment scores and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods. PMID:23070592
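
    The variance inflation mentioned in this abstract can be made concrete with a short calculation. The sketch assumes a simplified equicorrelation model that the abstract does not state: n genes with unit-variance scores and a common pairwise correlation rho. Under that assumption the variance of a z-like score built as sqrt(n) times the mean is 1 + (n - 1) * rho rather than 1. The function names are hypothetical.

```python
import math

def variance_inflation_factor(n_genes, mean_correlation):
    """Variance of sqrt(n) * mean of n unit-variance scores that share a
    common pairwise correlation, relative to the independent case."""
    return 1.0 + (n_genes - 1) * mean_correlation

def corrected_z(z_independent, n_genes, mean_correlation):
    """Rescale a z-score that was computed assuming gene independence."""
    vif = variance_inflation_factor(n_genes, mean_correlation)
    return z_independent / math.sqrt(vif)

# Illustrative values: a 50-gene set with mean pairwise correlation 0.05.
vif = variance_inflation_factor(50, 0.05)
print(round(vif, 2), round(corrected_z(3.0, 50, 0.05), 3))
```

    Even a modest average correlation inflates the variance several-fold for a 50-gene set in this model, so a score that looks strongly significant under independence can shrink considerably once the correlation is accounted for.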

  18. [Enrichment analysis of Fanconi anemia gene expression profiles in cancer related genesets].

    PubMed

    Zhou, Quan-quan; Wang, Xiao-juan; Zhu, Xiao-fan; Yuan, Wei-ping; Cheng, Tao

    2012-05-01

    To investigate the underlying tumor susceptibility mechanisms and reasons for the high risk of cancer in Fanconi anemia (FA). Gene Set Enrichment Analysis (GSEA) was performed to compare gene expression profiles between bone marrow (BM) mononuclear cells (BMNC) from 21 FA patients and 11 normal controls in cancer related gene sets from the NCBI GEO database, then core enriched genes were identified by further investigation. Through enrichment analysis of biological process gene ontology sets and structural genomic gene sets between FA expression profiles and controls, more details related to its tumor susceptibility were revealed. Compared with normal controls, gene expression in the FA group was significantly enriched in the resistance to Bcl-2 inhibitor gene set, fibroblast growth factor signalling pathways, and insulin and insulin-like growth factor (IGF) signalling pathway induced carcinogenesis gene sets. The high level of D4S234E, SST, FGFs, IGFs, FGFRs and IGFBP expression provided an initiating environment for tumorigenesis and drug resistance. There were significant differences in biogenesis of extracellular molecules and cytomembrane structure organization between FA and controls. Genes with promoter regions around transcription start sites containing either motif RRCAGGTGNCV or CCTNTMAGA were enriched, and the former genes match the annotation for tumorigenic transcription factor 3 (TCF3). The high tumor susceptibility of FA patients may be closely related to the dramatic changes in the cancer related growth factor and hormone environment. This study provides new insights into the tumor susceptibility mechanism in FA patients.

  19. Influence of nuclear data on uranium enrichment results obtained by XKalpha spectral region analysis.

    PubMed

    Morel, Jean; Clark, DeLynn

    2002-01-01

    During the recent international uranium exercise organized by the ESARDA NDA Working Group, several participants determined the uranium enrichment of samples using methods based on analysis of the XKalpha region of the uranium spectrum. For these methods, no calibration with known enrichment standards is required but accurate knowledge of nuclear data is needed. Despite this requirement, it appeared that during the exercise, four different sets of nuclear data were used by the participants. In view of this fact, it was decided to introduce these nuclear data sets into some computer codes in order to check their effects on the enrichment results. Two participants agreed to cooperate, and the main results of this test are presented here. It can be seen that three nuclear data sets, although different, give satisfactory results with no significant bias. Nevertheless, a more accurate characterization of X- and gamma-ray emission from 235U, 238U and their daughters appears necessary.

  20. Using cluster analysis to organize and explore regional GPS velocities

    USGS Publications Warehouse

    Simpson, Robert W.; Thatcher, Wayne; Savage, James C.

    2012-01-01

    Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.

  1. Comparative analysis of genomic signal processing for microarray data clustering.

    PubMed

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  2. Comprehensive assessment of sequence variation within the copy number variable defensin cluster on 8p23 by target enriched in-depth 454 sequencing.

    PubMed

    Taudien, Stefan; Szafranski, Karol; Felder, Marius; Groth, Marco; Huse, Klaus; Raffaelli, Francesca; Petzold, Andreas; Zhang, Xinmin; Rosenstiel, Philip; Hampe, Jochen; Schreiber, Stefan; Platzer, Matthias

    2011-05-18

    In highly copy number variable (CNV) regions such as the human defensin gene locus, comprehensive assessment of sequence variations is challenging. PCR approaches are practically restricted to tiny fractions, and next-generation sequencing (NGS) approaches of whole individual genomes, e.g. by the 1000 Genomes Project, are confined by an affordable sequence depth. Combining target enrichment with NGS may represent a feasible approach. As a proof of principle, we enriched a ~850 kb section comprising the CNV defensin gene cluster DEFB, the invariable DEFA part and 11 control regions from two genomes by sequence capture and sequenced it by 454 technology. 6,651 differences to the human reference genome were found. Comparison to HapMap genotypes revealed sensitivities and specificities in the range of 94% to 99% for the identification of variations. Using error probabilities for rigorous filtering revealed 2,886 unique single nucleotide variations (SNVs) including 358 putative novel ones. DEFB CN determinations by haplotype ratios were in agreement with alternative methods. Although currently labor-intensive and costly, target enriched NGS provides a powerful tool for the comprehensive assessment of SNVs in highly polymorphic CNV regions of individual genomes. Furthermore, it reveals considerable amounts of putative novel variations and simultaneously allows CN estimation.

  3. A Method for Selective Enrichment and Analysis of Nitrotyrosine-Containing Peptides in Complex Proteome Samples

    SciTech Connect

    Zhang, Qibin; Qian, Weijun; Knyushko, Tanya V.; Clauss, Therese RW; Purvine, Samuel O.; Moore, Ronald J.; Sacksteder, Colette A.; Chin, Mark H.; Smith, Desmond J.; Camp, David G.; Bigelow, Diana J.; Smith, Richard D.

    2007-06-01

    Elevated levels of protein tyrosine nitration have been found in various neurodegenerative diseases and aging related pathologies; however, the lack of an efficient enrichment method has prevented the analysis of this important low level protein modification. We have developed an efficient method for specific enrichment of nitrotyrosine containing peptides that permits nitrotyrosine peptides and specific nitration sites to be unambiguously identified with LC-MS/MS. The method is based on the derivatization of nitrotyrosine into free sulfhydryl groups followed by high efficiency enrichment of sulfhydryl-containing peptides with thiopropyl sepharose beads. The derivatization process starts with acetylation with acetic anhydride to block all primary amines, followed by reduction of nitrotyrosine to aminotyrosine, then derivatization of aminotyrosine with N-Succinimidyl S-Acetylthioacetate (SATA), and finally deprotection of S-acetyl on SATA to form free sulfhydryl groups. This method was evaluated using nitrotyrosine containing peptides, in-vitro nitrated human histone 1.2, and bovine serum albumin (BSA). 91% and 62% of the identified peptides from enriched histone and BSA samples were nitrotyrosine derivatized peptides, respectively, suggesting relatively high specificity of the enrichment method. The application of this method to in-vitro nitrated mouse brain homogenate resulted in 35% of identified peptides containing nitrotyrosine (compared to only 5.9% observed from the global analysis of unenriched sample), and a total of 150 unique nitrated peptides covering 102 proteins were identified with a false discovery rate estimated at 3.3% from duplicate LC-MS/MS analyses of a single enriched sample.

  4. [Meta-analysis of stable carbon and nitrogen isotopic enrichment factors for aquatic animals].

    PubMed

    Guo, Liang; Sun, Cui-ping; Ren, Wei-zheng; Zhang, Jian; Tang, Jian-jun; Hu, Liang-liang; Chen, Xin

    2016-02-01

    Isotopic enrichment factor (Δ, the difference between the δ value of food and a consumer tissue) is an important parameter in using stable isotope analysis (SIA) to reconstruct diets, characterize trophic relationships, elucidate patterns of resource allocation, and construct food webs. The isotopic enrichment factor has been considered a constant value across a broad range of animals. However, recent studies showed that the isotopic enrichment factor differed among various types of animals, although the magnitude of variation was not clear. Here, we conducted a meta-analysis to synthesize and compare Δ13C and Δ15N among four types of aquatic animals (teleosts, crustaceans, reptiles and molluscs). We searched for papers published before 2014 on Web of Science and CNKI using the key words "stable isotope or isotopic fractionation or fractionation factor or isotopic enrichment or trophic enrichment". Forty-two publications that contain 140 studies on Δ13C and 159 studies on Δ15N were obtained. We conducted three parallel meta-analyses by using three types of weights (the reciprocal of variance as weights, the sample size as weights, and equal weights). The results showed no significant difference in Δ13C among different animal types (teleosts 1.0‰, crustaceans 1.3‰, reptiles 0.5‰, and molluscs 1.5‰), while Δ15N values were significantly different (teleosts 2.4‰, crustaceans 3.6‰, reptiles 1.0‰ and molluscs 2.5‰). Our results suggested that the overall mean of Δ13C could be used as a general enrichment factor, but Δ15N should be chosen according to the type of aquatic animals in using SIA to analyze trophic relationships, patterns of resource allocation and food webs.
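
    The three weighting schemes named in this abstract can be compared with a minimal sketch. The study values below are invented for illustration, the function and variable names are hypothetical, and the sign convention (consumer tissue minus food) is an assumption, since the abstract only says "difference". A full meta-analysis would also estimate confidence intervals and heterogeneity, which are omitted here.

```python
def enrichment_factor(delta_consumer, delta_food):
    """Enrichment factor in per mil; consumer-minus-food sign is assumed."""
    return delta_consumer - delta_food

def weighted_mean(values, weights):
    """Weighted arithmetic mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def pooled_enrichment(studies):
    """Pool per-study enrichment factors under three weighting schemes.

    studies: list of (enrichment_factor, variance, sample_size) tuples.
    """
    deltas = [delta for delta, _, _ in studies]
    return {
        "inverse_variance": weighted_mean(deltas, [1.0 / var for _, var, _ in studies]),
        "sample_size": weighted_mean(deltas, [n for _, _, n in studies]),
        "equal": weighted_mean(deltas, [1.0] * len(studies)),
    }

# Hypothetical studies: (enrichment factor in per mil, variance, sample size).
studies = [(2.0, 0.25, 10), (3.0, 1.0, 30), (4.0, 0.5, 20)]
pooled = pooled_enrichment(studies)
print({name: round(value, 3) for name, value in pooled.items()})
```

    The three pooled estimates differ on the same toy data because each scheme trusts a different study most, which is one reason to run the analyses in parallel and check that the conclusions agree.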

  5. PEIMAN 1.0: Post-translational modification Enrichment, Integration and Matching ANalysis.

    PubMed

    Nickchi, Payman; Jafari, Mohieddin; Kalantari, Shiva

    2015-01-01

    Conventional proteomics has discovered a wide gap between protein sequences and biological functions. The third generation of proteomics was provoked to bridge this gap. Targeted and untargeted post-translational modification (PTM) studies are the most important parts of today's proteomics. Considering the expensive and time-consuming nature of experimental methods, computational methods are developed to study, analyze, predict, count and compute the PTM annotations on proteins. Enrichment analysis tools are among the common computational biology and bioinformatics software packages. The focus of such tools is to find the probability of occurrence of the desired biological features in any arbitrary list of genes/proteins. We introduce Post-translational modification Enrichment Integration and Matching Analysis (PEIMAN) software to explore more probable and enriched PTMs on proteins. Here, we also present the statistics of detected PTM terms used in enrichment analysis in PEIMAN software based on the latest released version of UniProtKB/Swiss-Prot. These results, in addition to giving insight into any given list of proteins, could be useful to design targeted PTM studies for identification and characterization of special chemical groups. Database URL: http://bs.ipm.ir/softwares/PEIMAN/ © The Author(s) 2015. Published by Oxford University Press.

  6. PEIMAN 1.0: Post-translational modification Enrichment, Integration and Matching ANalysis

    PubMed Central

    Nickchi, Payman; Jafari, Mohieddin; Kalantari, Shiva

    2015-01-01

    Conventional proteomics has discovered a wide gap between protein sequences and biological functions. The third generation of proteomics was provoked to bridge this gap. Targeted and untargeted post-translational modification (PTM) studies are the most important parts of today’s proteomics. Considering the expensive and time-consuming nature of experimental methods, computational methods are developed to study, analyze, predict, count and compute the PTM annotations on proteins. Enrichment analysis tools are among the common computational biology and bioinformatics software packages. The focus of such software is to find the probability of occurrence of the desired biological features in any arbitrary list of genes/proteins. We introduce Post-translational modification Enrichment Integration and Matching Analysis (PEIMAN) software to explore more probable and enriched PTMs on proteins. Here, we also present the statistics of detected PTM terms used in enrichment analysis in PEIMAN software based on the latest released version of UniProtKB/Swiss-Prot. These results, in addition to giving insight into any given list of proteins, could be useful to design targeted PTM studies for identification and characterization of special chemical groups. Database URL: http://bs.ipm.ir/softwares/PEIMAN/ PMID:25911152

  7. CLASH-VLT: Strangulation of cluster galaxies in MACS J0416.1-2403 as seen from their chemical enrichment

    NASA Astrophysics Data System (ADS)

    Maier, C.; Kuchner, U.; Ziegler, B. L.; Verdugo, M.; Balestra, I.; Girardi, M.; Mercurio, A.; Rosati, P.; Fritz, A.; Grillo, C.; Nonino, M.; Sartoris, B.

    2016-05-01

    Aims: Environmental effects gain importance as large scale structures in the Universe develop with time and have become the dominant mechanism for quenching galaxies of intermediate and low stellar masses at lower redshifts. Therefore, clusters of galaxies at z< 0.5 are the sites where environmental effects are expected to be more pronounced and more easily observed with present-day large telescopes. Methods: We explore the Frontier Fields cluster MACS J0416.1-2403 at z = 0.3972 with VIMOS/VLT spectroscopy from the CLASH-VLT survey covering a region that corresponds to almost three virial radii. We measure fluxes of Hβ, [O III]λ 5007, Hα, and [N II]λ 6584 emission lines of cluster members enabling us to unambiguously derive O/H gas metallicities, and also star formation rates from extinction-corrected Hα fluxes. We compare our cluster galaxy sample with a field sample at z ~ 0.4 drawn from zCOSMOS. Results: The 76 galaxies of our cluster sample follow the star-forming metallicity sequence in a diagnostic diagram disentangling ionizing sources. For intermediate masses we find a similar distribution of cluster and field galaxies in the mass vs. metallicity and mass vs. sSFR diagrams. An in-depth investigation furthermore reveals that bulge-dominated cluster galaxies have on average lower sSFRs and higher O/Hs than their disk-dominated counterparts. We use the location of galaxies in the projected velocity vs. position phase-space to separate our cluster sample into a region of objects accreted longer ago and a region of recently accreted and infalling galaxies. We find a higher fraction of accreted metal-rich galaxies (63%) compared to the fraction of 28% of metal-rich galaxies in the infalling regions. Intermediate-mass galaxies (9.2 < log (M/M⊙) < 10.2) falling into the cluster for the first time are found to be in agreement with predictions of the fundamental metallicity relation. In contrast, for already accreted star-forming galaxies of similar masses, we

  8. A Distributed Flocking Approach for Information Stream Clustering Analysis

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E

    2006-01-01

    Intelligence analysts are currently overwhelmed by the volume of information streams generated every day. There is a lack of comprehensive tools that can analyze these information streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to static document collections because they normally require large amounts of computational resources and a long time to produce accurate results. It is very difficult to cluster a dynamically changing text information stream on an individual computer. Our early research has resulted in a dynamic reactive flock clustering algorithm which can continually refine the clustering result and quickly react to changes in document contents. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as text information streams. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase the clustering speed of the algorithm. In this paper, we present a distributed multi-agent flocking approach for text information stream clustering and discuss the decentralized architectures and communication schemes for load balance and status information synchronization in this approach.

  9. Cluster analysis and prediction of treatment outcomes for chronic rhinosinusitis.

    PubMed

    Soler, Zachary M; Hyer, J Madison; Rudmik, Luke; Ramakrishnan, Viswanathan; Smith, Timothy L; Schlosser, Rodney J

    2016-04-01

    Current clinical classifications of chronic rhinosinusitis (CRS) have weak prognostic utility regarding treatment outcomes. Simplified discriminant analysis based on unsupervised clustering has identified novel phenotypic subgroups of CRS, but prognostic utility is unknown. We sought to determine whether discriminant analysis allows prognostication in patients choosing surgery versus continued medical management. A multi-institutional prospective study of patients with CRS in whom initial medical therapy failed who then self-selected continued medical management or surgical treatment was used to separate patients into 5 clusters based on a previously described discriminant analysis using total Sino-Nasal Outcome Test-22 (SNOT-22) score, age, and missed productivity. Patients completed the SNOT-22 at baseline and for 18 months of follow-up. Baseline demographic and objective measures included olfactory testing, computed tomography, and endoscopy scoring. SNOT-22 outcomes for surgical versus continued medical treatment were compared across clusters. Data were available on 690 patients. Baseline differences in demographics, comorbidities, objective disease measures, and patient-reported outcomes were similar to previous clustering reports. Three of 5 clusters identified by means of discriminant analysis had improved SNOT-22 outcomes with surgical intervention when compared with continued medical management (surgery was a mean of 21.2 points better across these 3 clusters at 6 months, P < .05). These differences were sustained at 18 months of follow-up. Two of 5 clusters had similar outcomes when comparing surgery with continued medical management. A simplified discriminant analysis based on 3 common clinical variables is able to cluster patients and provide prognostic information regarding surgical treatment versus continued medical management in patients with CRS. Copyright © 2015 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All

  10. PlantGSEA: a gene set enrichment analysis toolkit for plant community.

    PubMed

    Yi, Xin; Du, Zhou; Su, Zhen

    2013-07-01

    Gene Set Enrichment Analysis (GSEA) is a powerful method for interpreting the biological meaning of a list of genes by computing the overlaps with various previously defined gene sets. As one of the most widely used annotations for defining gene sets, the Gene Ontology (GO) system has been used in many enrichment analysis tools. EasyGO and agriGO, two GO enrichment analysis toolkits developed by our laboratory, have gained extensive usage and citations since their releases because of their effective performance and consistent maintenance. Responding to the increasing demands of more comprehensive analysis from the users, we developed a web server as an important component of our bioinformatics analysis toolkit, named PlantGSEA, which is based on the GSEA method and mainly focuses on plant organisms. In PlantGSEA, 20 290 defined gene sets deriving from different resources were collected and used for GSEA analysis. PlantGSEA currently supports gene locus IDs and Affymetrix microarray probe set IDs from four plant model species (Arabidopsis thaliana, Oryza sativa, Zea mays and Gossypium raimondii). PlantGSEA is an efficient and user-friendly web server, and it is now publicly accessible at http://structuralbiology.cau.edu.cn/PlantGSEA.

  11. Effect of rosemary polyphenols on human colon cancer cells: transcriptomic profiling and functional enrichment analysis.

    PubMed

    Valdés, Alberto; García-Cañas, Virginia; Rocamora-Reverte, Lourdes; Gómez-Martínez, Angeles; Ferragut, José Antonio; Cifuentes, Alejandro

    2013-01-01

    In this work, the effect of rosemary extracts rich in polyphenols obtained using pressurized fluids was investigated on the gene expression of human SW480 and HT29 colon cancer cells. The application of transcriptomic profiling and functional enrichment analysis was done via two computational approaches, Ingenuity Pathway Analysis and Gene Set Enrichment Analysis. These two approaches were used for functional enrichment analysis as a previous step for a reliable interpretation of the data obtained from microarray analysis. Reverse transcription quantitative-PCR was used to confirm relative changes in mRNA levels of selected genes from microarrays. The selection of genes was based on their expression change, adjusted p value, and known biological function. According to genome-wide transcriptomics analysis, rosemary polyphenols altered the expression of ~4 % of the genes covered by the Affymetrix Human Gene 1.0ST chip in both colon cancer cells. However, only ~18 % of the differentially expressed genes were common to both cell lines, indicating markedly different expression profiles in response to the treatment. Differences in induction of G2/M arrest observed by rosemary polyphenols in the two colon adenocarcinoma cell lines suggest that the extract may be differentially effective against tumors with specific mutational patterns. From our results, it is also concluded that rosemary polyphenols induced a low degree of apoptosis, indicating that other multiple signaling pathways may contribute to colon cancer cell death.

  12. Design of an Unattended Environmental Aerosol Sampling and Analysis System for Gaseous Centrifuge Enrichment Plants

    SciTech Connect

    Anheier, Norman C.; Munley, John T.; Alexander, M. L.

    2011-07-19

    The resources of the IAEA continue to be challenged by the rapid, worldwide expansion of nuclear energy production. Gaseous centrifuge enrichment plants (GCEPs) represent an especially formidable dilemma to the application of safeguard measures, as the size and enrichment capacity of GCEPs continue to escalate. During the early part of the 1990's, the IAEA began to lay the foundation to strengthen and make cost-effective its future safeguard regime. Measures under Part II of 'Programme 93+2' specifically sanctioned access to nuclear fuel production facilities and environmental sampling by IAEA inspectors. Today, the Additional Protocol grants inspection and environmental sample collection authority to IAEA inspectors at GCEPs during announced and low frequency unannounced (LFUA) inspections. During inspections, IAEA inspectors collect environmental swipe samples that are then shipped offsite to an analytical laboratory for enrichment assay. This approach has proven to be an effective deterrence to GCEP misuse, but this method has never achieved the timeliness of detection goals set forth by IAEA. Furthermore it is questionable whether the IAEA will have the resources to even maintain pace with the expansive production capacity of the modern GCEP, let alone improve the timeliness in reaching current safeguards conclusions. New safeguards propositions, outside of familiar mainstream safeguard measures, may therefore be required that counteract the changing landscape of nuclear energy fuel production. A new concept is proposed that offers rapid, cost effective GCEP misuse detection, without increasing LFUA inspection access or introducing intrusive access demands on GCEP operations. Our approach is based on continuous onsite aerosol collection and laser enrichment analysis. This approach mitigates many of the constraints imposed by the LFUA protocol, reduces the demand for onsite sample collection and offsite analysis, and overcomes current limitations associated with

  13. Clustering rainfall pattern in Malaysia using functional data analysis

    NASA Astrophysics Data System (ADS)

    Hamdan, Muhammad Fauzee; Suhaila, Jamaludin; Jemain, Abdul Aziz

    2015-02-01

    Understanding rainfall patterns is important for planning and prediction in hydrology, meteorology, water planning and agriculture. There are two important features of rainfall: the rainfall amount and the probability of rainfall occurrence. The discrete raw rainfall data were reconstructed into rainfall amount curves using functional data analysis. Hierarchical clustering with complete linkage was used to search for natural groupings of similar rainfall amount curves. The functional clustering revealed four dominant patterns of rainfall amount curves. In addition, the adaptive Neyman test showed that the clusters are significantly different from each other.
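
    Complete-linkage hierarchical clustering, as named in the abstract, can be sketched in a few lines. The sketch assumes each rainfall curve has already been evaluated on a common grid of time points and uses Euclidean distance between the resulting vectors; the paper works with functional representations, so this discretization, the function name, and the toy curves are all assumptions made for illustration.

```python
from math import dist
from typing import List, Sequence


def complete_linkage(curves: Sequence[Sequence[float]], n_clusters: int) -> List[List[int]]:
    """Agglomerative clustering with complete linkage.

    Returns clusters as lists of curve indices. The distance between two
    clusters is the LARGEST pairwise distance between their members.
    """
    if not 1 <= n_clusters <= len(curves):
        raise ValueError("n_clusters must be between 1 and the number of curves")
    clusters = [[i] for i in range(len(curves))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(
                    dist(curves[i], curves[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy "rainfall amount curves" sampled at three time points (made-up values).
curves = [
    [1.0, 2.0, 1.0],
    [1.1, 2.1, 0.9],
    [5.0, 6.0, 5.0],
    [5.2, 5.9, 5.1],
]
groups = complete_linkage(curves, n_clusters=2)
print(groups)  # [[0, 1], [2, 3]]
```

    This brute-force version recomputes all pairwise cluster distances at every merge, which is fine for a handful of stations but would be replaced by a library routine for larger data sets.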

  14. Metabolomic analysis can detect the composition of pasta enriched with fibre after cooking.

    PubMed

    Beleggia, Romina; Menga, Valeria; Platani, Cristiano; Nigro, Franca; Fragasso, Mariagiovanna; Fares, Clara

    2016-07-01

    Several studies have demonstrated that metabolomics has a definite place in food quality, nutritional value, and safety issues. The aim of the present study was to determine and compare the metabolites in different pasta samples with fibre, and to investigate the modifications induced in these different kinds of pasta during cooking, using a gas chromatography-mass spectrometry-based metabolomics approach. Differences were seen for some of the amino acids, which were absent in control pasta but present in both the commercially available high-fibre pasta (samples A-C) and the enriched pasta (samples D-F). The highest content of reducing sugars was observed in enriched samples in comparison with high-fibre pasta. The presence of stigmasterol in samples enriched with wheat bran was relevant. Cooking decreased all of the metabolites: the high-fibre pasta (A-C) and Control showed losses of amino acids and tocopherols, while for sugars and organic acids, the decrease depended on the pasta sample. The enriched pasta samples (D-F) showed the same decreases with the exception of phytosterols, and in pasta with barley the decrease of saturated fatty acids was not significant as for tocopherols in pasta with oat. Principal component analysis of the metabolites and the pasta discrimination was effective in differentiating the enriched pasta from the commercial pasta, both uncooked and cooked. The study has established that such metabolomic analyses provide useful tools in the evaluation of the changes in nutritional compounds in high-fibre and enriched pasta, both before and after cooking. © 2015 Society of Chemical Industry.

  15. MCAM: Multiple Clustering Analysis Methodology for Deriving Hypotheses and Insights from High-Throughput Proteomic Datasets

    PubMed Central

    Naegle, Kristen M.; Welsch, Roy E.; Yaffe, Michael B.; White, Forest M.; Lauffenburger, Douglas A.

    2011-01-01

    Advances in proteomic technologies continue to substantially accelerate capability for generating experimental data on protein levels, states, and activities in biological samples. For example, studies on receptor tyrosine kinase signaling networks can now capture the phosphorylation state of hundreds to thousands of proteins across multiple conditions. However, little is known about the function of many of these protein modifications, or the enzymes responsible for modifying them. To address this challenge, we have developed an approach that enhances the power of clustering techniques to infer functional and regulatory meaning of protein states in cell signaling networks. We have created a new computational framework for applying clustering to biological data in order to overcome the typical dependence on specific a priori assumptions and expert knowledge concerning the technical aspects of clustering. Multiple clustering analysis methodology (‘MCAM’) employs an array of diverse data transformations, distance metrics, set sizes, and clustering algorithms, in a combinatorial fashion, to create a suite of clustering sets. These sets are then evaluated based on their ability to produce biological insights through statistical enrichment of metadata relating to knowledge concerning protein functions, kinase substrates, and sequence motifs. We applied MCAM to a set of dynamic phosphorylation measurements of the ERBB network to explore the relationships between algorithmic parameters and the biological meaning that could be inferred and report on interesting biological predictions. Further, we applied MCAM to multiple phosphoproteomic datasets for the ERBB network, which allowed us to compare independent and incomplete overlapping measurements of phosphorylation sites in the network. We report specific and global differences of the ERBB network stimulated with different ligands and with changes in HER2 expression. Overall, we offer MCAM as a broadly

  16. The Enhanced Hoshen-Kopelman Algorithm for Cluster Analysis

    NASA Astrophysics Data System (ADS)

    Hoshen, Joseph

    1997-08-01

    In 1976 Hoshen and Kopelman (J. Hoshen and R. Kopelman, Phys. Rev. B, 14, 3438 (1976)) introduced a breakthrough algorithm, known today as the Hoshen-Kopelman algorithm, for cluster analysis. This algorithm revolutionized Monte Carlo cluster calculations in percolation theory as it enables analysis of very large lattices containing 10^11 or more sites. Initially the HK algorithm's primary use was in the domain of pure and basic sciences. Later it began finding applications in diverse fields of technology and applied sciences. Examples of such applications are two- and three-dimensional image analysis, composite material modeling, polymers, remote sensing, brain modeling and food processing. While the original HK algorithm provides cluster size data for only one class of sites, the Enhanced HK (EHK) algorithm, presented in this paper, enables calculations of cluster spatial moments -- characteristics of cluster shapes -- for multiple classes of sites. These enhancements preserve the time and space complexities of the original HK algorithm, such that very large lattices can still be analyzed in a single pass through the lattice for cluster sizes, classes and shapes.
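
    The single-pass labeling idea behind the original HK algorithm can be sketched with a union-find structure on a small two-dimensional lattice. This is a minimal illustration of the basic algorithm for one class of sites with 4-neighbour connectivity; it is not the Enhanced HK algorithm of the paper (it computes no spatial moments and handles no multiple classes), and the function name and example lattice are hypothetical.

```python
from typing import List, Sequence


def hoshen_kopelman(grid: Sequence[Sequence[int]]) -> List[int]:
    """Return the sorted sizes of occupied clusters found in one raster scan.

    grid[r][c] is truthy for an occupied site. Only the upper and left
    neighbours are inspected, as in the original raster-scan formulation.
    """
    parent = [0]  # parent[label]; label 0 is reserved for empty sites
    size = [0]    # size[root] is the number of sites in that cluster

    def find(label: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]

    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if not up and not left:
                # No occupied neighbour seen so far: start a new cluster.
                parent.append(len(parent))
                size.append(1)
                labels[r][c] = len(parent) - 1
            elif up and left:
                # Both neighbours occupied: merge their clusters if distinct.
                root_up, root_left = find(up), find(left)
                if root_up != root_left:
                    parent[root_left] = root_up
                    size[root_up] += size[root_left]
                size[root_up] += 1
                labels[r][c] = root_up
            else:
                root = find(up or left)
                size[root] += 1
                labels[r][c] = root

    return sorted(size[l] for l in range(1, len(parent)) if parent[l] == l)


lattice = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
]
cluster_sizes = hoshen_kopelman(lattice)
print(cluster_sizes)  # [1, 1, 5]
```

    For simplicity the sketch stores labels for the whole lattice; because only the previous row is ever consulted, an implementation aimed at very large lattices could keep just one row of labels in memory.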

  17. Core and peripheral connectivity based cluster analysis over PPI network.

    PubMed

    Ahmed, Hasin A; Bhattacharyya, Dhruba K; Kalita, Jugal K

    2015-12-01

    A number of methods have been proposed in the literature of protein-protein interaction (PPI) network analysis for detection of clusters in the network. Clusters are identified by these methods using various graph theoretic criteria. Most of these methods have been found time consuming due to involvement of preprocessing and post processing tasks. In addition, they do not achieve high precision and recall consistently and simultaneously. Moreover, the existing methods do not employ the idea of core-periphery structural pattern of protein complexes effectively to extract clusters. In this paper, we introduce a clustering method named CPCA based on a recent observation by researchers that a protein complex in a PPI network is arranged as a relatively dense core region and additional proteins weakly connected to the core. CPCA uses two connectivity criterion functions to identify core and peripheral regions of the cluster. To locate initial node of a cluster we introduce a measure called DNQ (Degree based Neighborhood Qualification) index that evaluates tendency of the node to be part of a cluster. CPCA performs well when compared with well-known counterparts. Along with protein complex gold standards, a co-localization dataset has also been used for validation of the results. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. Cluster Analysis of Undergraduate Drinkers Based on Alcohol Expectancy Scores

    PubMed Central

    Leeman, Robert F.; Kulesza, Magdalena; Stewart, Diana W.; Copeland, Amy L.

    2012-01-01

    Objective: Expectancies of alcohol's effects have been associated with problem drinking in undergraduates. If subgroups can be classified based on expectancies, this may facilitate identifying those at highest risk for problem drinking. Method: Undergraduates (N = 612) from two state universities completed a web-based survey. Responses to the Comprehensive Effects of Alcohol scale were analyzed using k-means cluster analysis separately within each university sample. Results: Hartigan's heuristic was used to determine that five was the optimal number of clusters in each sample. Clusters were distinguishable based on their overall magnitude of expectancy endorsement and by a tendency to endorse stronger positive than negative expectancies. Subsequent analyses were conducted to compare clusters on alcohol involvement and trait disinhibition. A cluster characterized by endorsement of positive and negative expectancies (“strong expectancy”) was associated with a particularly problematic risk profile, specifically concerning difficulties with self-control (i.e., trait disinhibition and impaired control over alcohol use). A cluster with higher positive and lower negative expectancies reported frequent heavy drinking but appeared to be at lower risk than the strong expectancy cluster in a number of respects. Negative expectancy endorsement appeared to represent added risk above and beyond positive expectancies. Conclusions: Results suggest that both the magnitude and combination of expectancies endorsed by subgroups of undergraduate drinkers may relate to their risk level in terms of alcohol involvement and personality traits. These findings may have implications for interventions with young adult drinkers. PMID:22333331

  19. Cluster analysis of undergraduate drinkers based on alcohol expectancy scores.

    PubMed

    Leeman, Robert F; Kulesza, Magdalena; Stewart, Diana W; Copeland, Amy L

    2012-03-01

    Expectancies of alcohol's effects have been associated with problem drinking in undergraduates. If subgroups can be classified based on expectancies, this may facilitate identifying those at highest risk for problem drinking. Undergraduates (N = 612) from two state universities completed a web-based survey. Responses to the Comprehensive Effects of Alcohol scale were analyzed using k-means cluster analysis separately within each university sample. Hartigan's heuristic was used to determine that five was the optimal number of clusters in each sample. Clusters were distinguishable based on their overall magnitude of expectancy endorsement and by a tendency to endorse stronger positive than negative expectancies. Subsequent analyses were conducted to compare clusters on alcohol involvement and trait disinhibition. A cluster characterized by endorsement of positive and negative expectancies ("strong expectancy") was associated with a particularly problematic risk profile, specifically concerning difficulties with self-control (i.e., trait disinhibition and impaired control over alcohol use). A cluster with higher positive and lower negative expectancies reported frequent heavy drinking but appeared to be at lower risk than the strong expectancy cluster in a number of respects. Negative expectancy endorsement appeared to represent added risk above and beyond positive expectancies. Results suggest that both the magnitude and combination of expectancies endorsed by subgroups of undergraduate drinkers may relate to their risk level in terms of alcohol involvement and personality traits. These findings may have implications for interventions with young adult drinkers.

  20. Towards eliminating bias in cluster analysis of TB genotyped data.

    PubMed

    van Schalkwyk, Cari; Cule, Madeleine; Welte, Alex; van Helden, Paul; van der Spuy, Gian; Uys, Pieter

    2012-01-01

    The relative contributions of transmission and reactivation of latent infection to TB cases observed clinically has been reported in many situations, but always with some uncertainty. Genotyped data from TB organisms obtained from patients have been used as the basis for heuristic distinctions between circulating (clustered strains) and reactivated infections (unclustered strains). Naïve methods previously applied to the analysis of such data are known to provide biased estimates of the proportion of unclustered cases. The hypergeometric distribution, which generates probabilities of observing clusters of a given size as realized clusters of all possible sizes, is analyzed in this paper to yield a formal estimator for genotype cluster sizes. Subtle aspects of numerical stability, bias, and variance are explored. This formal estimator is seen to be stable with respect to the epidemiologically interesting properties of the cluster size distribution (the number of clusters and the number of singletons) though it does not yield satisfactory estimates of the number of clusters of larger sizes. The problem that even complete coverage of genotyping, in a practical sampling frame, will only provide a partial view of the actual transmission network remains to be explored.
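
    The role of the hypergeometric distribution here can be illustrated with a short sketch: if only part of the case population is genotyped, a true cluster of a given size may be realized in the sample as a smaller cluster, a singleton, or not at all. The code below assumes simple random sampling without replacement; the function name and the numbers are hypothetical, and it shows only the forward sampling probabilities, not the formal estimator developed in the paper.

```python
from math import comb
from typing import List


def realized_size_distribution(population: int, true_size: int, sampled: int) -> List[float]:
    """Probability that k members of a true cluster appear in the sample.

    population: total number of cases
    true_size:  number of cases in one true genotype cluster
    sampled:    number of cases that were genotyped
    Returns a list p where p[k] = P(k cluster members are sampled).
    """
    total = comb(population, sampled)
    probabilities = []
    for k in range(true_size + 1):
        if k > sampled:
            probabilities.append(0.0)  # cannot observe more members than sampled
        else:
            ways = comb(true_size, k) * comb(population - true_size, sampled - k)
            probabilities.append(ways / total)
    return probabilities


# Hypothetical setting: 10 cases in total, a true cluster of 3, 5 cases genotyped.
pmf = realized_size_distribution(population=10, true_size=3, sampled=5)
for k, prob in enumerate(pmf):
    print(f"P(realized size = {k}) = {prob:.4f}")
```

    In this made-up setting the true cluster of three shows up as an apparent singleton with probability 105/252 (about 0.42), which is the kind of bias toward "unclustered" cases that naive counting would introduce.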

  1. Towards Eliminating Bias in Cluster Analysis of TB Genotyped Data

    PubMed Central

    Welte, Alex; van Helden, Paul; van der Spuy, Gian; Uys, Pieter

    2012-01-01

    The relative contributions of transmission and reactivation of latent infection to TB cases observed clinically has been reported in many situations, but always with some uncertainty. Genotyped data from TB organisms obtained from patients have been used as the basis for heuristic distinctions between circulating (clustered strains) and reactivated infections (unclustered strains). Naïve methods previously applied to the analysis of such data are known to provide biased estimates of the proportion of unclustered cases. The hypergeometric distribution, which generates probabilities of observing clusters of a given size as realized clusters of all possible sizes, is analyzed in this paper to yield a formal estimator for genotype cluster sizes. Subtle aspects of numerical stability, bias, and variance are explored. This formal estimator is seen to be stable with respect to the epidemiologically interesting properties of the cluster size distribution (the number of clusters and the number of singletons) though it does not yield satisfactory estimates of the number of clusters of larger sizes. The problem that even complete coverage of genotyping, in a practical sampling frame, will only provide a partial view of the actual transmission network remains to be explored. PMID:22479534

  2. Detection of Functional Change Using Cluster Trend Analysis in Glaucoma.

    PubMed

    Gardiner, Stuart K; Mansberger, Steven L; Demirel, Shaban

    2017-05-01

    Global analyses using mean deviation (MD) assess visual field progression, but can miss localized changes. Pointwise analyses are more sensitive to localized progression, but more variable so require confirmation. This study assessed whether cluster trend analysis, averaging information across subsets of locations, could improve progression detection. A total of 133 test-retest eyes were tested 7 to 10 times. Rates of change and P values were calculated for possible re-orderings of these series to generate global analysis ("MD worsening faster than x dB/y with P < y"), pointwise and cluster analyses ("n locations [or clusters] worsening faster than x dB/y with P < y") with specificity exactly 95%. These criteria were applied to 505 eyes tested over a mean of 10.5 years, to find how soon each detected "deterioration," and compared using survival models. This was repeated including two subsequent visual fields to determine whether "deterioration" was confirmed. The best global criterion detected deterioration in 25% of eyes in 5.0 years (95% confidence interval [CI], 4.7-5.3 years), compared with 4.8 years (95% CI, 4.2-5.1) for the best cluster analysis criterion, and 4.1 years (95% CI, 4.0-4.5) for the best pointwise criterion. However, for pointwise analysis, only 38% of these changes were confirmed, compared with 61% for clusters and 76% for MD. The time until 25% of eyes showed subsequently confirmed deterioration was 6.3 years (95% CI, 6.0-7.2) for global, 6.3 years (95% CI, 6.0-7.0) for pointwise, and 6.0 years (95% CI, 5.3-6.6) for cluster analyses. Although the specificity is still suboptimal, cluster trend analysis detects subsequently confirmed deterioration sooner than either global or pointwise analyses.
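
    The core of cluster trend analysis, averaging values across a subset of visual field locations and then estimating a rate of change, can be sketched as follows. The cluster names, location indices, and measurements are hypothetical, the rate is an ordinary least-squares slope, and the P values and specificity-matched criteria described in the abstract are deliberately left out.

```python
from typing import Dict, List, Sequence


def ols_slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least-squares slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def cluster_slopes(
    times: Sequence[float],
    fields: Sequence[Sequence[float]],
    clusters: Dict[str, List[int]],
) -> Dict[str, float]:
    """Rate of change (per unit time) of the mean value within each cluster.

    fields[visit][location] holds one measurement per location and visit;
    clusters maps a cluster name to the location indices it contains.
    """
    slopes = {}
    for name, locations in clusters.items():
        means = [sum(visit[i] for i in locations) / len(locations) for visit in fields]
        slopes[name] = ols_slope(times, means)
    return slopes


# Made-up series: four visits (years 0-3), four locations (values in dB).
times = [0.0, 1.0, 2.0, 3.0]
fields = [
    [0.0, 0.0, 0.0, 0.0],
    [-1.0, -1.0, 0.0, 0.0],
    [-2.0, -2.0, 0.0, 0.0],
    [-3.0, -3.0, 0.0, 0.0],
]
clusters = {"cluster_a": [0, 1], "cluster_b": [2, 3]}  # hypothetical grouping

slopes = cluster_slopes(times, fields, clusters)
global_slope = cluster_slopes(times, fields, {"all": [0, 1, 2, 3]})["all"]
print(slopes, global_slope)
```

    In this toy series the loss is confined to one cluster, so its slope (-1 dB/y) is twice as steep as the global average (-0.5 dB/y), which hints at why averaging over clusters may pick up localized change that a global index dilutes.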

  3. Key genes and pathways in thyroid cancer based on gene set enrichment analysis.

    PubMed

    He, Wenwu; Qi, Bin; Zhou, Qiuxi; Lu, Chuansen; Huang, Qi; Xian, Lei; Chen, Mingwu

    2013-09-01

    The incidence of thyroid cancer and its associated morbidity has shown the most rapid increase among all cancers since 1982, but the mechanisms involved in thyroid cancer, particularly significant key genes induced in thyroid cancer, remain undefined. In many studies, gene probes have been used to search for key genes involved in causing and facilitating thyroid cancer. As a result, many possible virulence genes and pathways have been identified. However, these studies lack a case contrast for selecting the most possible virulence genes and pathways, as well as conclusive results with which to clarify the mechanisms of cancer development. In the present study, we used gene set enrichment and meta-analysis to select key genes and pathways. Based on gene set enrichment, we identified 5 downregulated and 4 upregulated mixed pathways in 6 tissue datasets. Based on the meta-analysis, there were 17 common pathways in the tissue datasets. One pathway, the p53 signaling pathway, which includes 13 genes, was identified by both the gene set enrichment analysis and meta-analysis. Genes are important elements that form key pathways. These pathways can induce the development of thyroid cancer later in life. The key pathways and genes identified in the present study can be used in the next stage of research, which will involve gene elimination and other methods of experimentation.

  4. Principal Component Clustering Approach to Teaching Quality Discriminant Analysis

    ERIC Educational Resources Information Center

    Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan

    2016-01-01

    Teaching quality is the lifeline of higher education. Many universities have made effective progress in evaluating teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…

  5. Electrochemical and genomic analysis of novel electroactive isolates obtained via potentiostatic enrichment from tropical sediment

    NASA Astrophysics Data System (ADS)

    Doyle, Lucinda E.; Yung, Pui Yi; Mitra, Sumitra D.; Wuertz, Stefan; Williams, Rohan B. H.; Lauro, Federico M.; Marsili, Enrico

    2017-07-01

    Enrichment of electrochemically-active microorganisms (EAM) to date has mostly relied on microbial fuel cells fed with wastewater. This study aims to enrich novel EAM by exposing tropical sediment, not frequently reported in the literature, to sustained anodic potentials. Voltamperometric techniques and electrochemical impedance spectroscopy, performed over a wide range of potentials, characterise extracellular electron transfer (EET) over time. Applied potential is found to affect biofilm electrochemical signature. Geobacter metallireducens is heavily enriched on the electrodes, as determined by metagenomic and metatranscriptomic analysis, in the first report of the species in a lactate-fed system. Two novel isolates are grown in pure culture from the enrichment, identified by 16S rRNA gene sequencing as Aeromonas and Enterobacter, respectively. The names proposed are Aeromonas sp. CL-1 and Enterobacter sp. EA-1. Both isolates are capable of EET on carbon felt and screen-printed carbon electrodes without the addition of exogenous redox mediators. Enterobacter sp. EA-1 can also perform mediated electron transfer using the soluble redox mediator 2-hydroxy-1,4-naphthoquinone (HNQ). Both isolates are able to use acetate and lactate as electron donors. This work outlines a comprehensive methodology for characterising novel EAM from unconventional inocula.

  6. Constraints on the Parental Melts of Enriched Shergottites from Image Analysis and High Pressure Experiments

    NASA Technical Reports Server (NTRS)

    Collinet, M.; Medard, E.; Devouard, B.; Peslier, A.

    2012-01-01

    Martian basalts can be classified in at least two geochemically different families: enriched and depleted shergottites. Enriched shergottites are characterized by higher incompatible element concentrations and initial Sr-87/Sr-86 and lower initial Nd-143/Nd-144 and Hf-176/Hf-177 than depleted shergottites [e.g. 1, 2]. It is now generally admitted that shergottites result from the melting of at least two distinct mantle reservoirs [e.g. 2, 3]. Some of the olivine-phyric shergottites (either depleted or enriched), the most magnesian Martian basalts, could represent primitive melts, which are of considerable interest to constrain mantle sources. Two depleted olivine-phyric shergottites, Yamato (Y) 980459 and Northwest Africa (NWA) 5789, are in equilibrium with their most magnesian olivine (Fig. 1) and their bulk rock compositions are inferred to represent primitive melts [4, 5]. Larkman Nunatak (LAR) 06319 [3, 6, 7] and NWA 1068 [8], the most magnesian enriched basalts, have bulk Mg# that are too high to be in equilibrium with their olivine megacryst cores. Parental melt compositions have been estimated by subtracting the most magnesian olivine from the bulk rock composition, assuming that olivine megacrysts have partially accumulated [3, 9]. However, because this technique does not account for the actual petrography of these meteorites, we used image analysis to study the history of these rocks, reconstruct their parent magma and understand the nature of olivine megacrysts.

  7. Mapping DNA-protein interactions in large genomes by sequence tag analysis of genomic enrichment.

    PubMed

    Kim, Jonghwan; Bhinge, Akshay A; Morgan, Xochitl C; Iyer, Vishwanath R

    2005-01-01

    Identifying the chromosomal targets of transcription factors is important for reconstructing the transcriptional regulatory networks underlying global gene expression programs. We have developed an unbiased genomic method called sequence tag analysis of genomic enrichment (STAGE) to identify the direct binding targets of transcription factors in vivo. STAGE is based on high-throughput sequencing of concatemerized tags derived from target DNA enriched by chromatin immunoprecipitation. We first used STAGE in yeast to confirm that RNA polymerase III genes are the most prominent targets of the TATA-box binding protein. We optimized the STAGE protocol and developed analysis methods to allow the identification of transcription factor targets in human cells. We used STAGE to identify several previously unknown binding targets of human transcription factor E2F4 that we independently validated by promoter-specific PCR and microarray hybridization. STAGE provides a means of identifying the chromosomal targets of DNA-associated proteins in any sequenced genome.

  8. BUFET: boosting the unbiased miRNA functional enrichment analysis using bitsets.

    PubMed

    Zagganas, Konstantinos; Vergoulis, Thanasis; Paraskevopoulou, Maria D; Vlachos, Ioannis S; Skiadopoulos, Spiros; Dalamagas, Theodore

    2017-09-06

    A group of miRNAs can regulate a biological process by targeting genes involved in the process. The unbiased miRNA functional enrichment analysis is the most precise in silico approach to predict the biological processes that may be regulated by a given miRNA group. However, it is computationally intensive and significantly more expensive than its alternatives. We introduce BUFET, a new approach to significantly reduce the time required for the execution of the unbiased miRNA functional enrichment analysis. It derives its strength from the utilization of efficient bitset-based methods and parallel computation techniques. BUFET outperforms the state-of-the-art implementation, in regard to computational efficiency, in all scenarios (both single- and multi-core), being, in some cases, more than one order of magnitude faster.
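
    The bitset idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the BUFET implementation: the function names, the toy miRNA-to-gene targets, and the use of Python integers as bitsets are all assumptions made for illustration. The sketch assumes the enrichment statistic is the number of category genes targeted by at least one miRNA in the group, and that the empirical p-value comes from comparing this count against random miRNA groups of the same size.

```python
import random

def to_bitset(genes, index):
    """Pack gene names into a Python int used as a bitset (one bit per gene)."""
    bits = 0
    for g in genes:
        bits |= 1 << index[g]
    return bits

def overlap(mirna_group, targets, category_bits):
    """Count category genes targeted by at least one miRNA in the group."""
    union = 0
    for m in mirna_group:
        union |= targets[m]          # union of target sets via bitwise OR
    return bin(union & category_bits).count("1")  # intersection size via AND + popcount

def empirical_p(group, targets, category_bits, n_random=1000, seed=0):
    """Share of random same-size miRNA groups whose overlap is >= the observed one."""
    rng = random.Random(seed)
    observed = overlap(group, targets, category_bits)
    names = sorted(targets)
    hits = 0
    for _ in range(n_random):
        if overlap(rng.sample(names, len(group)), targets, category_bits) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)  # add-one correction avoids p = 0

# Toy data (hypothetical, not from the paper).
genes = ["g0", "g1", "g2", "g3", "g4", "g5"]
index = {g: i for i, g in enumerate(genes)}
targets = {
    "miR-a": to_bitset(["g0", "g1"], index),
    "miR-b": to_bitset(["g1", "g2"], index),
    "miR-c": to_bitset(["g4"], index),
    "miR-d": to_bitset(["g5"], index),
}
category = to_bitset(["g0", "g1", "g2", "g3"], index)
observed = overlap(["miR-a", "miR-b"], targets, category)
p_value = empirical_p(["miR-a", "miR-b"], targets, category, n_random=200)
```

    Each random group costs only a few bitwise operations, and the random groups are independent of one another, so the loop could also be split across cores.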

  9. LA-ICP-MS analysis of isolated phosphatic grains indicates selective rare earth element enrichment during reworking and transport processes

    NASA Astrophysics Data System (ADS)

    Auer, Gerald; Reuter, Markus; Hauzenberger, Christoph A.; Piller, Werner E.

    2016-04-01

    water chemistry under certain well constrained circumstances of primary authigenesis. If these conditions are not met, REE patterns are more likely to reflect complex enrichment processes that likely already started to occur during reworking over geologically relatively short time frames. Similarities in the REE patterns of clearly detrital and biogenic phosphate further suggest that the often observed 'hat-shaped' pattern in biogenic phosphates can easily result from increased middle REE (Neodymium to Holmium) scavenging during taphonomic processes prior to final deposition. Finally, cluster analysis coupled with sedimentological considerations proved a valuable tool for the characterization of REE patterns of phosphates in terms of their formation conditions and depositional history, such as the distinction of phosphates formed in situ from reworked and transported phosphate grains.

  10. Semi-supervised Cluster Analysis of Imaging Data

    PubMed Central

    Filipovych, Roman; Resnick, Susan M.; Davatzikos, Christos

    2010-01-01

    In this paper, we present a semi-supervised clustering-based framework for discovering coherent subpopulations in heterogeneous image sets. Our approach involves limited supervision in the form of labeled instances from two distributions that reflect a rough guess about subspace of features that are relevant for cluster analysis. By assuming that images are defined in a common space via registration to a common template, we propose a segmentation-based method for detecting locations that signify local regional differences in the two labeled sets. A PCA model of local image appearance is then estimated at each location of interest, and ranked with respect to its relevance for clustering. We develop an incremental k-means-like algorithm that discovers novel meaningful categories in a test image set. The application of our approach in this paper is in analysis of populations of healthy older adults. We validate our approach on a synthetic dataset, as well as on a dataset of brain images of older adults. We assess our method’s performance on the problem of discovering clusters of MR images of human brain, and present a cluster-based measure of pathology that reflects the deviation of a subject’s MR image from normal (i.e. cognitively stable) state. We analyze the clusters’ structure, and show that clustering results obtained using our approach correlate well with clinical data. PMID:20933091

  11. IDAS: a Windows based software package for cluster analysis

    NASA Astrophysics Data System (ADS)

    Bondarenko, Igor; Treiger, Boris; Van Grieken, René; Van Espen, Pierre

    1996-03-01

    This article is an electronic publication in Spectrochimica Acta Electronica (SAE), the electronic section of Spectrochimica Acta Part B (SAB). The hardcopy text, comprising the main article and one appendix, is accompanied by two installation diskettes with the software package and data files. The main article discusses the chemometric aspects of the package and explains its purpose. The IDAS software package combines three cluster analysis methods (hierarchical, non-hierarchical and fuzzy) and runs under MS Windows. Modified algorithms for non-hierarchical and fuzzy clusterings are described. The interpretation of the clustering results is facilitated by the extensive use of different types of graph. New approaches to the graphical representation of the results of fuzzy clustering are proposed. Two data sets, the Iris data by Fisher and a data set on the chemical composition of tea, are used to demonstrate the capabilities of the software.

  12. Breast cancer clustering in Kanagawa, Japan: a geographic analysis.

    PubMed

    Katayama, Kayoko; Yokoyama, Kazuhito; Yako-Suketomo, Hiroko; Okamoto, Naoyuki; Tango, Toshiro; Inaba, Yutaka

    2014-01-01

    The purpose of the present study was to determine geographic clustering of breast cancer incidence in Kanagawa Prefecture, using cancer registry data. The study also aimed at examining the association between socio-economic factors and any identified cluster. Incidence data were collected for women who were first diagnosed with breast cancer during the period from January to December 2006 in Kanagawa. The data consisted of 2,326 incidence cases extracted from the total of 34,323 Kanagawa Cancer Registration data issued in 2011. To adjust for differences in age distribution, the standardized mortality ratio (SMR) and the standardized incidence ratio (SIR) of breast cancer were calculated for each of 56 municipalities (e.g., city, special ward, town, and village) in Kanagawa by an indirect method using Kanagawa female population data. Spatial scan statistics were used to detect any area of elevated risk as a cluster for breast cancer deaths and/or incidences. The Student t-test was performed to examine differences in socio-economic variables, viz., persons per household, total fertility rate, age at first marriage for women, and marriage rate, between the cluster and other regions. There was a statistically significant cluster of breast cancer incidence (p=0.001) composed of 11 municipalities in the southeastern area of Kanagawa Prefecture, whose SIR was 35 percent higher than that of the remainder of Kanagawa Prefecture. In this cluster, the average age at first marriage for women was significantly higher than in the rest of Kanagawa (p=0.017). No statistically significant clusters of breast cancer deaths were detected (p=0.53). There was a statistically significant cluster of high breast cancer incidence in the southeastern area of Kanagawa Prefecture. It was suggested that the cluster region was related to the tendency to marry later. This study methodology will be helpful in the analysis of geographical disparities in cancer deaths and incidence.
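
    The indirect standardization mentioned in this abstract can be sketched as follows. In the usual textbook form, the expected number of cases in a municipality is the sum over age strata of the reference rate times the local population, and the SIR is observed divided by expected. The age bands, rates, populations and case count below are invented for illustration and are not taken from the Kanagawa data.

```python
def expected_cases(reference_rates, local_population):
    """Expected cases if the local area had the reference age-specific rates."""
    return sum(reference_rates[age] * local_population[age] for age in local_population)

def standardized_incidence_ratio(observed, reference_rates, local_population):
    """SIR by indirect standardization: observed cases / expected cases."""
    return observed / expected_cases(reference_rates, local_population)

# Hypothetical reference rates (cases per woman per year) and local female population.
reference_rates = {"40-59": 0.001, "60+": 0.002}
local_population = {"40-59": 10000, "60+": 5000}

expected = expected_cases(reference_rates, local_population)  # about 20 cases
sir = standardized_incidence_ratio(24, reference_rates, local_population)  # about 1.2
```

    An SIR above 1 means more cases were observed than the reference rates would predict for that age structure. Detecting a spatial cluster of elevated SIR, as the study does with spatial scan statistics, is a separate step not shown here.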

  13. An Empirical Analysis of Rough Set Categorical Clustering Techniques

    PubMed Central

    2017-01-01

    Clustering a set of objects into homogeneous groups is a fundamental operation in data mining. Recently, much attention has been paid to categorical data clustering, where data objects are made up of non-numerical attributes. For categorical data clustering, rough set based approaches such as Maximum Dependency Attribute (MDA) and Maximum Significance Attribute (MSA) have outperformed their predecessor approaches such as Bi-Clustering (BC), Total Roughness (TR) and Min-Min Roughness (MMR). This paper presents the limitations and issues of the MDA and MSA techniques on special types of data sets where both techniques fail to select, or face difficulty in selecting, their best clustering attribute. Therefore, this analysis motivates the need for a better and more general rough set theory approach that can cope with the issues of MDA and MSA. Hence, an alternative technique named Maximum Indiscernible Attribute (MIA) for clustering categorical data using rough set indiscernibility relations is proposed. The novelty of the proposed approach is that, unlike other rough set theory techniques, it uses the domain knowledge of the data set. It is based on the concept of the indiscernibility relation combined with a number of clusters. To show the significance of the proposed approach, the effects of the number of clusters on rough accuracy, purity and entropy are described in the form of propositions. Moreover, ten different data sets from previously utilized research cases and the UCI repository are used for experiments. The results, produced in tabular and graphical forms, show that the proposed MIA technique provides better performance in selecting the clustering attribute in terms of purity, entropy, iterations, time, accuracy and rough accuracy. PMID:28068344

  14. An Empirical Analysis of Rough Set Categorical Clustering Techniques.

    PubMed

    Uddin, Jamal; Ghazali, Rozaida; Deris, Mustafa Mat

    2017-01-01

    Clustering a set of objects into homogeneous groups is a fundamental operation in data mining. Recently, much attention has been paid to categorical data clustering, where data objects are made up of non-numerical attributes. For categorical data clustering, rough set based approaches such as Maximum Dependency Attribute (MDA) and Maximum Significance Attribute (MSA) have outperformed their predecessor approaches such as Bi-Clustering (BC), Total Roughness (TR) and Min-Min Roughness (MMR). This paper presents the limitations and issues of the MDA and MSA techniques on special types of data sets where both techniques fail to select, or face difficulty in selecting, their best clustering attribute. Therefore, this analysis motivates the need for a better and more general rough set theory approach that can cope with the issues of MDA and MSA. Hence, an alternative technique named Maximum Indiscernible Attribute (MIA) for clustering categorical data using rough set indiscernibility relations is proposed. The novelty of the proposed approach is that, unlike other rough set theory techniques, it uses the domain knowledge of the data set. It is based on the concept of the indiscernibility relation combined with a number of clusters. To show the significance of the proposed approach, the effects of the number of clusters on rough accuracy, purity and entropy are described in the form of propositions. Moreover, ten different data sets from previously utilized research cases and the UCI repository are used for experiments. The results, produced in tabular and graphical forms, show that the proposed MIA technique provides better performance in selecting the clustering attribute in terms of purity, entropy, iterations, time, accuracy and rough accuracy.
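
    The indiscernibility relation that rough set methods build on can be shown with a small Python sketch. Under the standard definition, two objects are indiscernible with respect to a set of attributes if they take the same value on every attribute in that set, so the relation partitions the objects into equivalence classes. This shows only that building block, not the MIA attribute-selection procedure; the function name and the toy table are hypothetical.

```python
from collections import defaultdict

def indiscernibility_classes(table, attributes):
    """Group objects that share identical values on all the given attributes."""
    classes = defaultdict(list)
    for name, row in table.items():
        key = tuple(row[a] for a in attributes)  # signature of the object on these attributes
        classes[key].append(name)
    return sorted(sorted(members) for members in classes.values())

# Toy categorical data set (hypothetical).
table = {
    "o1": {"color": "red", "size": "big"},
    "o2": {"color": "red", "size": "small"},
    "o3": {"color": "blue", "size": "big"},
    "o4": {"color": "red", "size": "big"},
}

by_color = indiscernibility_classes(table, ["color"])
by_both = indiscernibility_classes(table, ["color", "size"])
```

    Adding attributes can only split classes further, which is why the choice of clustering attribute matters for the partition that results.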

  15. Characterizing Suicide in Toronto: An Observational Study and Cluster Analysis

    PubMed Central

    Sinyor, Mark; Schaffer, Ayal; Streiner, David L

    2014-01-01

    Objective: To determine whether people who have died from suicide in a large epidemiologic sample form clusters based on demographic, clinical, and psychosocial factors. Method: We conducted a coroner’s chart review for 2886 people who died in Toronto, Ontario, from 1998 to 2010, and whose death was ruled as suicide by the Office of the Chief Coroner of Ontario. A cluster analysis using known suicide risk factors was performed to determine whether suicide deaths separate into distinct groups. Clusters were compared according to person- and suicide-specific factors. Results: Five clusters emerged. Cluster 1 had the highest proportion of females and nonviolent methods, and all had depression and a past suicide attempt. Cluster 2 had the highest proportion of people with a recent stressor and violent suicide methods, and all were married. Cluster 3 had mostly males between the ages of 20 and 64, and all had either experienced recent stressors, suffered from mental illness, or had a history of substance abuse. Cluster 4 had the youngest people and the highest proportion of deaths by jumping from height, few were married, and nearly one-half had bipolar disorder or schizophrenia. Cluster 5 had all unmarried people with no prior suicide attempts, and were the least likely to have an identified mental illness and most likely to leave a suicide note. Conclusions: People who die from suicide assort into different patterns of demographic, clinical, and death-specific characteristics. Identifying and studying subgroups of suicides may advance our understanding of the heterogeneous nature of suicide and help to inform development of more targeted suicide prevention strategies. PMID:24444321

  16. Statistical analysis of bound companions in the Coma cluster

    NASA Astrophysics Data System (ADS)

    Mendelin, Martin; Binggeli, Bruno

    2017-08-01

    Aims: The rich and nearby Coma cluster of galaxies is known to have substructure. We aim to create a more detailed picture of this substructure by searching directly for bound companions around individual giant members. Methods: We have used two catalogs of Coma galaxies, one covering the cluster core for a detailed morphological analysis, another covering the outskirts. The separation limit between possible companions (secondaries) and giants (primaries) is chosen as MB = -19 and MR = -20, respectively for the two catalogs. We have created pseudo-clusters by shuffling positions or velocities of the primaries and search for significant over-densities of possible companions around giants by comparison with the data. This method was developed and applied first to the Virgo cluster. In a second approach we introduced a modified nearest neighbor analysis using several interaction parameters for all galaxies. Results: We find evidence for some excesses due to possible companions for both catalogs. Satellites are typically found among the faintest dwarfs (MB < -16) around high-luminosity primaries. The most significant excesses are found around very luminous late-type giants (spirals) in the outskirts, which is expected in an infall scenario of cluster evolution. A rough estimate for an upper limit of bound galaxies within Coma is 2-4%, to be compared with 7% for Virgo. Conclusions: The results agree well with the expected low frequency of bound companions in a regular cluster such as Coma. To exploit the data more fully and reach more detailed insights into the physics of cluster evolution we suggest applying the method also to model clusters created by N-body simulations for comparison.
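
    The pseudo-cluster comparison described in the Methods can be sketched in Python under simplifying assumptions. Here a "companion" is any secondary within a projected separation and a velocity difference of a primary, and the null distribution comes from shuffling the velocities among the primaries. The thresholds, coordinates and velocities are invented, and the authors' actual selection criteria are not given in the abstract.

```python
import math
import random

def count_companions(primaries, secondaries, max_sep, max_dv):
    """Count (primary, secondary) pairs that are close in projected position and velocity."""
    n = 0
    for px, py, pv in primaries:
        for sx, sy, sv in secondaries:
            if math.hypot(px - sx, py - sy) <= max_sep and abs(pv - sv) <= max_dv:
                n += 1
    return n

def shuffled_null(primaries, secondaries, max_sep, max_dv, n_shuffles=200, seed=0):
    """Companion counts for pseudo-clusters made by shuffling primary velocities."""
    rng = random.Random(seed)
    velocities = [p[2] for p in primaries]
    counts = []
    for _ in range(n_shuffles):
        rng.shuffle(velocities)
        pseudo = [(p[0], p[1], v) for p, v in zip(primaries, velocities)]
        counts.append(count_companions(pseudo, secondaries, max_sep, max_dv))
    return counts

# Toy catalog (hypothetical): entries are (x, y, radial velocity).
primaries = [(0.0, 0.0, 7000.0), (10.0, 10.0, 6000.0), (20.0, 0.0, 8000.0)]
secondaries = [(0.1, 0.0, 7050.0), (10.0, 10.1, 6020.0), (20.1, 0.0, 5000.0)]

observed_count = count_companions(primaries, secondaries, max_sep=0.5, max_dv=300.0)
null_counts = shuffled_null(primaries, secondaries, max_sep=0.5, max_dv=300.0)
excess = observed_count - sum(null_counts) / len(null_counts)
```

    A positive excess over the shuffled pseudo-clusters hints at physically associated pairs rather than chance projections. A real analysis would need many more galaxies before reading anything into the number.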

  17. CN and CH Abundance Analysis in a Sample of Eight Galactic Globular Clusters

    NASA Astrophysics Data System (ADS)

    Smolinski, Jason P.; Lee, Y.; Beers, T. C.; Martell, S. L.; An, D.; Sivarani, T.

    2011-01-01

    Galactic globular clusters exhibit star-to-star variations in their light element abundances that are not predicted by formation and evolution models involving single stellar generations. Recently it has been suggested that internal pollution from early supernovae and AGB winds may have played important roles in forming a second generation of enriched stars. We present updated results of a CN and CH abundance analysis of stars from the base to the tip of the red giant branch, and in some cases down onto the main sequence, for eight globular clusters with available photometric and spectroscopic data from SDSS-I and SDSS-II/SEGUE. These results include a discussion of the radial distribution of CN enrichment and how this may impact the current paradigm. Funding for SDSS-I and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS Web Site is http://www.sdss.org/. This work was supported in part by grants PHY 02-16783 and PHY 08-22648: Physics Frontiers Center/Joint Institute for Nuclear Astrophysics (JINA), awarded by the U.S. National Science Foundation.

  18. Integrated proteomic analysis of post-translational modifications by serial enrichment

    PubMed Central

    Mertins, Philipp; Qiao, Jana W; Patel, Jinal; Udeshi, Namrata D; Clauser, Karl R; Mani, D R; Burgess, Michael W; Gillette, Michael A; Jaffe, Jacob D; Carr, Steven A

    2014-01-01

    We report a mass spectrometry–based method for the integrated analysis of protein expression, phosphorylation, ubiquitination and acetylation by serial enrichments of different post-translational modifications (SEPTM) from the same biological sample. This technology enabled quantitative analysis of nearly 8,000 proteins and more than 20,000 phosphorylation, 15,000 ubiquitination and 3,000 acetylation sites per experiment, generating a holistic view of cellular signal transduction pathways as exemplified by analysis of bortezomib-treated human leukemia cells. PMID:23749302

  19. A novel titanium dioxide-polydimethylsiloxane plate for phosphopeptide enrichment and mass spectrometry analysis.

    PubMed

    Chen, Chao-Jung; Lai, Chien-Chen; Tseng, Mei-Chun; Liu, Yu-Ching; Liu, Yu-Huei; Chiou, Liang-Wei; Tsai, Fuu-Jen

    2014-02-17

    The phosphorylation of proteins is a major post-translational modification that is required for the regulation of many cellular processes and activities. Mass spectrometry signals of low-abundance phosphorylated peptides are commonly suppressed by the presence of abundant non-phosphorylated peptides. Therefore, one of the major challenges in the detection of low-abundance phosphopeptides is their enrichment from complex peptide mixtures. Titanium dioxide (TiO2) has been proven to be a highly efficient approach for phosphopeptide enrichment and is widely applied. In this study, a novel TiO2 plate was developed by coating TiO2 particles onto polydimethylsiloxane (PDMS)-coated MALDI plates, glass, or plastic substrates. The TiO2-PDMS plate (TP plate) could be used for on-target MALDI-TOF analysis, or as a purification plate on which phosphopeptides were eluted out and subjected to MALDI-TOF or nanoLC-MS/MS analysis. The detection limit of the TP plate was ∼10-fold lower than that of a TiO2-packed tip approach. The capacity of the ∼2.5 mm diameter TiO2 spots was estimated to be ∼10 μg of β-casein. Following TiO2 plate enrichment of SCC4 cell lysate digests and nanoLC-MS/MS analysis, ∼82% of the detected proteins were phosphorylated, illustrating the sensitivity and effectiveness of the TP plate for phosphoproteomic study.

  20. An integrated microfluidic analysis microsystems with bacterial capture enrichment and in-situ impedance detection

    NASA Astrophysics Data System (ADS)

    Liu, Hai-Tao; Wen, Zhi-Yu; Xu, Yi; Shang, Zheng-Guo; Peng, Jin-Lan; Tian, Peng

    2017-09-01

    In this paper, an integrated microfluidic analysis microsystem with bacterial capture enrichment and in-situ impedance detection is proposed, based on the microfluidic-chip dielectrophoresis technique and the electrochemical impedance detection principle. The microsystem includes a microfluidic chip, a main control module, a drive and control module, a signal detection and processing module, and a result display unit. The main control module produces the work sequence of the impedance detection system parts and handles data communication. The drive and control circuit generates an AC signal of adjustable amplitude and frequency, which is applied to the foodborne-pathogen impedance analysis microsystem to realize capture enrichment and impedance detection. The signal detection and processing circuit translates the current signal into the impedance of the bacteria and transfers it to a computer, where the final detection result is displayed. The experimental sample was prepared by adding an Escherichia coli standard sample to chicken sample solution, and the samples were tested on the dielectrophoresis-chip capture enrichment and in-situ impedance detection microsystem with micro-array electrode microfluidic chips. The experiments show that the Escherichia coli detection limit of the microsystem is 5 × 10^4 CFU/mL and the detection time is within 6 min under the optimized operating conditions of 10 V detection voltage and 500 kHz detection frequency. The integrated microfluidic analysis microsystem lays a solid foundation for rapid real-time in-situ detection of bacteria.

  1. NEArender: an R package for functional interpretation of 'omics' data via network enrichment analysis.

    PubMed

    Jeggari, Ashwini; Alexeyenko, Andrey

    2017-03-23

    The statistical evaluation of pathway enrichment, i.e. of gene profiles' confluence to the pathway level, allows exploring molecular landscapes using functionally annotated gene sets. However, pathway scores can also be used as predictive features in machine learning. That requires, firstly, increasing statistical power and biological relevance via a network enrichment analysis (NEA) and, secondly, a fast and convenient procedure for rendering the original data into a space of pathway scores. However, previous implementations of NEA involved multiple runs of network randomization and were therefore slow. Here, we present a new R package NEArender which can transform raw 'omics' features of experimental or clinical samples into matrices describing the same samples with many fewer NEA-based pathway scores. This is done via a parametric estimation of the null binomial distribution and is thus much faster and less biased than randomization procedures. Further, we compare estimates from these two alternative procedures and demonstrate that the summarization of individual genes to pathways increases the statistical power compared to both the default differential expression analysis on individual genes and the state-of-the-art gene set enrichment analysis. The package also contains functions for preparing input, modeling null distributions, and evaluating alternative versions of the global network. Beyond the state-of-the-art exploration of molecular data through pathway enrichment, score matrices produced by NEArender can be used in larger bioinformatics pipelines as input for phenotype modeling, predicting disease outcomes etc. This approach is often more sensitive and robust than using the original data. The package NEArender is complementary to the online NEA tool EviNet (https://www.evinet.org) and, unlike the latter, enables high performance of computations off-line. The R package NEArender version 1.4 is available at CRAN repository https://cran.r-project.org/web/packages/NEArender/.

  2. Application of microarray analysis on computer cluster and cloud platforms.

    PubMed

    Bernau, C; Boulesteix, A-L; Knaus, J

    2013-01-01

    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.

  3. Bayesian analysis of two stellar populations in Galactic globular clusters - III. Analysis of 30 clusters

    NASA Astrophysics Data System (ADS)

    Wagner-Kaiser, R.; Stenning, D. C.; Sarajedini, A.; von Hippel, T.; van Dyk, D. A.; Robinson, E.; Stein, N.; Jefferys, W. H.

    2016-12-01

    We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of 30 Galactic globular clusters to characterize two distinct stellar populations. A sophisticated Bayesian technique is employed to simultaneously sample the joint posterior distribution of age, distance, and extinction for each cluster, as well as unique helium values for two populations within each cluster and the relative proportion of those populations. We find the helium differences among the two populations in the clusters fall in the range of ~0.04 to 0.11. Because adequate models varying in carbon, nitrogen, and oxygen are not presently available, we view these spreads as upper limits and present them with statistical rather than observational uncertainties. Evidence supports previous studies suggesting an increase in helium content concurrent with increasing mass of the cluster and we also find that the proportion of the first population of stars increases with mass as well. Our results are examined in the context of proposed globular cluster formation scenarios. Additionally, we leverage our Bayesian technique to shed light on the inconsistencies between the theoretical models and the observed data.

  4. Identifying clinical course patterns in SMS data using cluster analysis.

    PubMed

    Kent, Peter; Kongsted, Alice

    2012-07-02

    Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important subgroups in the outcomes of research studies. Two previous studies have investigated detailed clinical course patterns in SMS data obtained from people seeking care for low back pain. One used a visual analysis approach and the other performed a cluster analysis of SMS data that had first been transformed by spline analysis. However, cluster analysis of SMS data in its original untransformed form may be simpler and offer other advantages. Therefore, the aim of this study was to determine whether cluster analysis could be used for identifying clinical course patterns distinct from the pattern of the whole group, by including all SMS time points in their original form. It was a 'proof of concept' study to explore the potential, clinical relevance, strengths and weakness of such an approach. This was a secondary analysis of longitudinal SMS data collected in two randomised controlled trials conducted simultaneously from a single clinical population (n = 322). Fortnightly SMS data collected over a year on 'days of problematic low back pain' and on 'days of sick leave' were analysed using Two-Step (probabilistic) Cluster Analysis. Clinical course patterns were identified that were clinically interpretable and different from those of the whole group. Similar patterns were obtained when the number of SMS time points was reduced to monthly. The advantages and disadvantages of this method were contrasted to that of first transforming SMS data by spline analysis. This study showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require a high level of statistical skill. However, there

  5. Identifying clinical course patterns in SMS data using cluster analysis

    PubMed Central

    2012-01-01

    Background: Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important subgroups in the outcomes of research studies. Two previous studies have investigated detailed clinical course patterns in SMS data obtained from people seeking care for low back pain. One used a visual analysis approach and the other performed a cluster analysis of SMS data that had first been transformed by spline analysis. However, cluster analysis of SMS data in its original untransformed form may be simpler and offer other advantages. Therefore, the aim of this study was to determine whether cluster analysis could be used for identifying clinical course patterns distinct from the pattern of the whole group, by including all SMS time points in their original form. It was a ‘proof of concept’ study to explore the potential, clinical relevance, strengths and weakness of such an approach. Methods: This was a secondary analysis of longitudinal SMS data collected in two randomised controlled trials conducted simultaneously from a single clinical population (n = 322). Fortnightly SMS data collected over a year on ‘days of problematic low back pain’ and on ‘days of sick leave’ were analysed using Two-Step (probabilistic) Cluster Analysis. Results: Clinical course patterns were identified that were clinically interpretable and different from those of the whole group. Similar patterns were obtained when the number of SMS time points was reduced to monthly. The advantages and disadvantages of this method were contrasted to that of first transforming SMS data by spline analysis. Conclusions: This study showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require

  6. Sun Protection Belief Clusters: Analysis of Amazon Mechanical Turk Data.

    PubMed

    Santiago-Rivas, Marimer; Schnur, Julie B; Jandorf, Lina

    2016-12-01

    This study aimed (i) to determine whether people could be differentiated on the basis of their sun protection belief profiles and individual characteristics and (ii) explore the use of a crowdsourcing web service for the assessment of sun protection beliefs. A sample of 500 adults completed an online survey of sun protection belief items using Amazon Mechanical Turk. A two-phased cluster analysis (i.e., hierarchical and non-hierarchical K-means) was utilized to determine clusters of sun protection barriers and facilitators. Results yielded three distinct clusters of sun protection barriers and three distinct clusters of sun protection facilitators. Significant associations between gender, age, sun sensitivity, and cluster membership were identified. Results also showed an association between barrier and facilitator cluster membership. The results of this study provided a potential alternative approach to developing future sun protection promotion initiatives in the population. Findings add to our knowledge regarding individuals who support, oppose, or are ambivalent toward sun protection and inform intervention research by identifying distinct subtypes that may best benefit from (or have a higher need for) skin cancer prevention efforts.

  7. BayGO: Bayesian analysis of ontology term enrichment in microarray data.

    PubMed

    Vêncio, Ricardo Z N; Koide, Tie; Gomes, Suely L; Pereira, Carlos A de B

    2006-02-23

    The search for enriched (aka over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for a system-level analysis. This procedure tries to summarize the information focussing on classification designs such as Gene Ontology, KEGG pathways, and so on, instead of focussing on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only the former approach has been used to deal with the ontology term enrichment problem. BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source-code is freely available at http://blasto.iq.usp.br/~tkoide/BayGO in three versions: Linux, which can be easily incorporated into pre-existent pipelines; Windows, to be controlled interactively; and as a web-tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. The Bayesian model accounts for the fact that, eventually, not all the genes from a given category are observable in microarray data due to low intensity signal, quality filters, genes that were not spotted and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis.

  8. Bacterial community analysis of cypermethrin enrichment cultures and bioremediation of cypermethrin contaminated soils.

    PubMed

    Akbar, Shamsa; Sultan, Sikander; Kertesz, Michael

    2015-07-01

    Cypermethrin (CM) is widely used for insect control; however, its toxicity toward aquatic life requires its complete removal from contaminated areas, where the natural degradation ability of microbes can be utilized. Agricultural soil with an extensive history of CM application was used to prepare enrichment cultures, with cypermethrin as the sole carbon source, for isolation of cypermethrin-degrading bacteria and for bacterial community analysis using PCR-DGGE of the 16S rRNA gene. DGGE analysis revealed that the dominant members of the CM enrichment culture were associated with α-proteobacteria, followed by γ-proteobacteria, Firmicutes, and Actinobacteria. Three potential CM-degrading isolates, identified as Ochrobactrum anthropi JCm1, Bacillus megaterium JCm2, and Rhodococcus sp. JCm5, degraded 86-100% of CM (100 mg L⁻¹) within 10 days. These isolates were also able to degrade other pyrethroids, carbofuran, and cypermethrin degradation products. Enzyme activity assays revealed that the enzymes involved in CM degradation were inducible and showed activity when strains were grown on cypermethrin. Degradation kinetics of cypermethrin (200 mg kg⁻¹) in soils inoculated with isolates JCm1, JCm2, and JCm5 suggested time-dependent disappearance of cypermethrin with rate constants of 0.0516, 0.0425, and 0.0807 d⁻¹, respectively, following first-order rate kinetics. The isolated bacterial strains were among the dominant genera selected under CM-enriched conditions and represent valuable candidates for in situ bioremediation of contaminated soils and waters.
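
    The first-order rate constants quoted above can be turned into half-lives with t1/2 = ln(2) / k. The following is a minimal sketch of that arithmetic, not the authors' analysis: the rate constants and the 200 mg/kg starting concentration come from the abstract, while the function names and the 30-day horizon are illustrative assumptions.

```python
import math

# First-order decay: C(t) = C0 * exp(-k * t), so the half-life is ln(2) / k.
# Rate constants (per day) are the values reported in the abstract.
RATE_CONSTANTS = {"JCm1": 0.0516, "JCm2": 0.0425, "JCm5": 0.0807}
INITIAL_CONC = 200.0  # mg/kg, the soil concentration quoted in the abstract


def half_life(k):
    """Half-life in days for a first-order rate constant k (1/day)."""
    return math.log(2) / k


def remaining(c0, k, days):
    """Concentration left after `days` under first-order kinetics."""
    return c0 * math.exp(-k * days)


# The 30-day horizon is an arbitrary choice for illustration only.
for isolate, k in RATE_CONSTANTS.items():
    print(f"{isolate}: t1/2 = {half_life(k):.1f} d, "
          f"after 30 d = {remaining(INITIAL_CONC, k, 30):.1f} mg/kg")
```

    The derived half-lives are a consequence of the reported constants under the first-order model; they are not figures stated in the abstract.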

  9. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

    PubMed Central

    Huang, Da Wei; Sherman, Brad T.; Lempicki, Richard A.

    2009-01-01

    Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. The gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood for investigators to identify biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools that are currently available in the community are collected in this survey. Tools are uniquely categorized into three major classes, according to their underlying enrichment algorithms. The comprehensive collections, unique tool classifications and associated questions/issues will provide a more comprehensive and up-to-date view regarding the advantages, pitfalls and recent trends in a simpler tool-class level rather than by a tool-by-tool approach. Thus, the survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests. PMID:19033363
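
    As an illustration of what a gene-annotation enrichment test computes, the sketch below evaluates a one-sided hypergeometric tail probability, a statistic widely used for over-representation analysis. It is an assumption that this exact form matches any particular tool in the survey; the function name and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total, annotated, selected, hits):
    """P(X >= hits) when `selected` genes are drawn without replacement from
    `total` genes, of which `annotated` carry the term of interest."""
    upper = min(annotated, selected)
    denom = comb(total, selected)
    tail = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / denom


# Hypothetical counts: 20 genes on the array, 5 annotated with the term,
# 4 genes in the list of interest, 2 of which carry the term.
p_value = hypergeom_enrichment_p(total=20, annotated=5, selected=4, hits=2)
print(f"enrichment p-value: {p_value:.4f}")
```

    A small p-value indicates that the term appears in the gene list more often than random sampling would explain; in practice the values are also corrected for multiple testing across terms.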

  10. BayGO: Bayesian analysis of ontology term enrichment in microarray data

    PubMed Central

    Vêncio, Ricardo ZN; Koide, Tie; Gomes, Suely L; de B Pereira, Carlos A

    2006-01-01

    Background: The search for enriched (aka over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for a system-level analysis. This procedure tries to summarize the information focussing on classification designs such as Gene Ontology, KEGG pathways, and so on, instead of focussing on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only the former approach has been used to deal with the ontology term enrichment problem. Results: BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source-code is freely available at http://blasto.iq.usp.br/~tkoide/BayGO in three versions: Linux, which can be easily incorporated into pre-existent pipelines; Windows, to be controlled interactively; and as a web-tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. Conclusion: The Bayesian model accounts for the fact that, eventually, not all the genes from a given category are observable in microarray data due to low intensity signal, quality filters, genes that were not spotted and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis. PMID:16504085

  11. Single Molecule Cluster Analysis dissects splicing pathway conformational dynamics.

    PubMed

    Blanco, Mario R; Martin, Joshua S; Kahlscheuer, Matthew L; Krishnan, Ramya; Abelson, John; Laederach, Alain; Walter, Nils G

    2015-11-01

    We report Single Molecule Cluster Analysis (SiMCAn), which utilizes hierarchical clustering of hidden Markov modeling-fitted single-molecule fluorescence resonance energy transfer (smFRET) trajectories to dissect the complex conformational dynamics of biomolecular machines. We used this method to study the conformational dynamics of a precursor mRNA during the splicing cycle as carried out by the spliceosome. By clustering common dynamic behaviors derived from selectively blocked splicing reactions, SiMCAn was able to identify the signature conformations and dynamic behaviors of multiple ATP-dependent intermediates. In addition, it identified an open conformation adopted late in splicing by a 3' splice-site mutant, invoking a mechanism for substrate proofreading. SiMCAn enables rapid interpretation of complex single-molecule behaviors and should prove useful for the comprehensive analysis of a plethora of dynamic cellular machines.

  12. Who are the healthy active seniors? A cluster analysis.

    PubMed

    Lai, Claudia K Y; Chan, Engle Angela; Chin, Kenny C W

    2014-12-01

    This paper reports a cluster analysis of a sample recruited from a randomized controlled trial that explored the effect of using a life story work approach to improve the psychological outcomes of older people in the community. 238 subjects from community centers were included in this analysis. After statistical testing, 169 seniors were assigned to the active ageing (AG) cluster and 69 to the inactive ageing (IG) cluster. Those in the AG were younger and healthier, with fewer chronic diseases and fewer depressive symptoms than those in the IG. They were more satisfied with their lives, and had higher self-esteem. They met with their family members more frequently, they engaged in more leisure activities and were more likely to have the ability to move freely. In summary, active ageing was observed in people with better health and functional performance. Our results echoed the limited findings reported in the literature.

  13. Mokken Scale Analysis Using Hierarchical Clustering Procedures

    ERIC Educational Resources Information Center

    van Abswoude, Alexandra A. H.; Vermunt, Jeroen K.; Hemker, Bas T.; van der Ark, L. Andries

    2004-01-01

    Mokken scale analysis (MSA) can be used to assess and build unidimensional scales from an item pool that is sensitive to multiple dimensions. These scales satisfy a set of scaling conditions, one of which follows from the model of monotone homogeneity. An important drawback of the MSA program is that the sequential item selection and scale…

  14. Cluster Analysis in Minority Group Poverty Studies.

    ERIC Educational Resources Information Center

    Ross, E. Lamar

    This paper, one of a series which arose out of data gathered on Choctaw Indians, Negroes, and whites in a low income area of Mississippi, expands upon one aspect of a recently completed analysis by the author. In the study, an attempt was made to distinguish between the characteristics associated with income levels and those related to ethnic…

  15. Joint Sequence Analysis: Association and Clustering

    ERIC Educational Resources Information Center

    Piccarreta, Raffaella

    2017-01-01

    In its standard formulation, sequence analysis aims at finding typical patterns in a set of life courses represented as sequences. Recently, some proposals have been introduced to jointly analyze sequences defined on different domains (e.g., work career, partnership, and parental histories). We introduce measures to evaluate whether a set of…

  17. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

    PubMed Central

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

    2013-01-01

    Introduction: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results: Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  18. Multilayer interparticle linking hybrid MOF-199 for noninvasive enrichment and analysis of plant hormone ethylene.

    PubMed

    Zhang, Zhuomin; Huang, Yichun; Ding, Weiwei; Li, Gongke

    2014-04-01

    Ethylene is an important plant hormone involved in many developmental processes of plants. However, the efficient enrichment and analysis of trace ethylene remains a challenge. A simple and mild multilayer interparticle linking strategy was proposed to fabricate a novel hybrid MOF-199 enrichment coating. Strong chemical interparticle linkages throughout the coating dramatically improved the durability and reproducibility of the hybrid MOF-199 coating. The coating extracted ethylene significantly better than commonly used commercial coatings, which was attributed to multiple interactions including a "molecular sieving effect", hydrogen bonding, open metal site interaction, and π-π affinity. The hybridization of multiwalled carbon nanotubes (MWCNTs) with MOF-199 further improved the enrichment capability and also acted as a hydrophobic "shield" that prevented the open metal sites of MOF-199 from being occupied by water molecules, which effectively improved the moisture resistance of the MOF-199/CNTs coating. Finally, this novel enrichment method was successfully applied to the noninvasive analysis of trace ethylene, methanol, and ethanol from fruit samples with relatively high humidity. The detection limit for ethylene was 0.016 μg/L, and trace ethylene could be detected from fruit samples by this noninvasive method. Recoveries of spiked grape, wampee, blueberry, and durian husk samples were in the range of 90.0-114%, 79.4-88.6%, 78.5-86.8%, and 85.2-105%, with corresponding relative standard deviations of 4.8-9.8%, 6.9-8.9%, 3.8-8.1%, and 9.3-10.5% (n = 3), respectively.

  19. Learning From Hidden Traits: Joint Factor Analysis and Latent Clustering

    NASA Astrophysics Data System (ADS)

    Yang, Bo; Fu, Xiao; Sidiropoulos, Nicholas D.

    2017-01-01

    Dimensionality reduction techniques play an essential role in data analytics, signal processing and machine learning. Dimensionality reduction is usually performed in a preprocessing stage that is separate from subsequent data analysis, such as clustering or classification. Finding reduced-dimension representations that are well-suited for the intended task is more appealing. This paper proposes a joint factor analysis and latent clustering framework, which aims at learning cluster-aware low-dimensional representations of matrix and tensor data. The proposed approach leverages matrix and tensor factorization models that produce essentially unique latent representations of the data to unravel latent cluster structure -- which is otherwise obscured because of the freedom to apply an oblique transformation in latent space. At the same time, latent cluster structure is used as prior information to enhance the performance of factorization. Specific contributions include several custom-built problem formulations, corresponding algorithms, and discussion of associated convergence properties. Besides extensive simulations, real-world datasets such as Reuters document data and MNIST image data are also employed to showcase the effectiveness of the proposed approaches.

  20. Unsupervised feature relevance analysis applied to improve ECG heartbeat clustering.

    PubMed

    Rodríguez-Sotelo, J L; Peluffo-Ordoñez, D; Cuesta-Frau, D; Castellanos-Domínguez, G

    2012-10-01

    The computer-assisted analysis of biomedical records has become an essential tool in clinical settings. However, current devices provide a growing amount of data that often exceeds the processing capacity of normal computers. As this amount of information rises, new demands for more efficient data extraction methods appear. This paper addresses the task of data mining in physiological records using a feature selection scheme. An unsupervised method based on relevance analysis is described. This scheme uses a least-squares optimization of the input feature matrix in a single iteration. The output of the algorithm is a feature weighting vector. The performance of the method was assessed using a heartbeat clustering test on real ECG records. The quantitative cluster validity measures yielded a correctly classified heartbeat rate of 98.69% (specificity), 85.88% (sensitivity) and 95.04% (general clustering performance), which is even higher than the performance achieved by other similar ECG clustering studies. The number of features was reduced on average from 100 to 18, and the temporal cost was 43% lower than in previous ECG clustering schemes.
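
    For readers unfamiliar with the validity measures quoted above, the sketch below shows how sensitivity and specificity are conventionally computed from a confusion matrix. The counts are hypothetical, and treating the abstract's "general clustering performance" as overall accuracy is an assumption, since the abstract does not define it.

```python
def sensitivity(tp, fn):
    """Fraction of true positives among all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for one heartbeat class versus the rest.
tp, fn, tn, fp = 85, 15, 190, 10
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.2%}")
```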

  1. Phage cluster relationships identified through single gene analysis

    PubMed Central

    2013-01-01

    Background: Phylogenetic comparison of bacteriophages requires whole genome approaches such as dotplot analysis, genome pairwise maps, and gene content analysis. Currently mycobacteriophages, a highly studied phage group, are categorized into related clusters based on the comparative analysis of whole genome sequences. With the recent explosion of phage isolation, a simple method for phage cluster prediction would facilitate analysis of crude or complex samples without whole genome isolation and sequencing. The hypothesis of this study was that mycobacteriophage-cluster prediction is possible using comparison of a single, ubiquitous, semi-conserved gene. Tape Measure Protein (TMP) was selected to test the hypothesis because it is typically the longest gene in mycobacteriophage genomes and because regions within the TMP gene are conserved. Results: A single gene, TMP, identified the known Mycobacteriophage clusters and subclusters using a Gepard dotplot comparison or a phylogenetic tree constructed from global alignment and maximum likelihood comparisons. Gepard analysis of 247 mycobacteriophage TMP sequences appropriately recovered 98.8% of the subcluster assignments that were made by whole-genome comparison. Subcluster-specific primers within TMP allow for PCR determination of the mycobacteriophage subcluster from DNA samples. Using the single-gene comparison approach for siphovirus coliphages, phage groupings by TMP comparison reflected relationships observed in a whole genome dotplot comparison and confirm the potential utility of this approach to another widely studied group of phages. Conclusions: TMP sequence comparison and PCR results support the hypothesis that a single gene can be used for distinguishing phage cluster and subcluster assignments. TMP single-gene analysis can quickly and accurately aid in mycobacteriophage classification. PMID:23777341

  2. Quantitative site-specific analysis of protein glycosylation by LC-MS using different glycopeptide-enrichment strategies.

    PubMed

    Wohlgemuth, Jessica; Karas, Michael; Eichhorn, Thomas; Hendriks, Robertus; Andrecht, Sven

    2009-12-15

    A common technique for analysis of protein glycosylation is HPLC coupled to mass spectrometry (LC-MS). However, analysis is challenging due to a low abundance of glycopeptides in complex protein digests, microheterogeneity at the glycosylation site, ion suppression effects, and competition for ionization by coeluting peptides. Specific sample preparation is necessary for a comprehensive and site-specific glycosylation analysis by MS. In this study we qualitatively compared hydrophilic interaction chromatography (HILIC) and hydrazine chemistry for the enrichment of all N-linked glycopeptides and titanium dioxide for capturing sialylated glycopeptides from a complex peptide mixture. Bare silica, microcrystalline cellulose, amino-, amide- (TSKgel Amide-80), and sulfobetaine-(ZIC-HILIC) bonded phases were evaluated for HILIC enrichment. The experiments revealed that ZIC-HILIC and TSKgel Amide-80 are very specific for capturing glycopeptides under optimized conditions. Quantitative analysis of N-glycosidase F-released and 2-aminobenzamide-labeled glycans of a ZIC-HILIC-enriched monoclonal antibody demonstrated that glycopeptides could be enriched without bias for particular glycan structures and without significant losses. Sialylated glycopeptides could be efficiently enriched by titanium dioxide and in addition to HILIC both methods enable a comprehensive analysis of protein glycosylation by MS. Enrichment of N-linked glycopeptides by hydrazine chemistry resulted in lower peptide recovery using a more complex enrichment scheme.

  3. A Cluster Analysis of Personality Style in Adults with ADHD

    ERIC Educational Resources Information Center

    Robin, Arthur L.; Tzelepis, Angela; Bedway, Marquita

    2008-01-01

    Objective: The purpose of this study was to use hierarchical linear cluster analysis to examine the normative personality styles of adults with ADHD. Method: A total of 311 adults with ADHD completed the Millon Index of Personality Styles, which consists of 24 scales assessing motivating aims, cognitive modes, and interpersonal behaviors. Results:…

  4. Language Learner Motivational Types: A Cluster Analysis Study

    ERIC Educational Resources Information Center

    Papi, Mostafa; Teimouri, Yasser

    2014-01-01

    The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…

  5. Making Sense of Cluster Analysis: Revelations from Pakistani Science Classes

    ERIC Educational Resources Information Center

    Pell, Tony; Hargreaves, Linda

    2011-01-01

    Cluster analysis has been applied to quantitative data in educational research over several decades and has been a feature of Maurice Galton's research in primary and secondary classrooms. It has offered potentially useful insights for teaching yet its implications for practice are rarely implemented. It has been subject also to negative…

  7. Cluster analysis as a prediction tool for pregnancy outcomes.

    PubMed

    Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L

    2015-03-01

    Given the specific physiological changes during gestation, and viewing pregnancy as a "critical window", classification of pregnant women in early pregnancy can be considered crucial. The paper demonstrates the use of cluster analysis, a method drawn from intelligent data mining. Cluster analysis is a statistical method that makes it possible to group individuals based on sets of identifying variables. The method was chosen to determine whether pregnant women can be classified in early pregnancy and to analyze unknown correlations between different variables so that certain outcomes could be predicted. A total of 222 pregnant women from two general obstetric offices were recruited. The focus was on three characteristics of these women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis achieved a 94.1% classification accuracy rate, with three branches or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results show that pregnant women of both older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering a baby of higher birth weight, but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have a significantly higher probability of complications during pregnancy (gestosis) and a higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women in early pregnancy to predict certain outcomes.

  9. A New Analysis of s-process Enrichments in Planetary Nebulae

    NASA Astrophysics Data System (ADS)

    Sterling, Nicholas C.; Porter, Ryan; Dinerstein, Harriet L.

    2015-01-01

    We present a new analysis of selenium and krypton enrichments in planetary nebulae (PNe), using recently determined atomic data for these elements. Se and Kr are the two most widely-detected neutron-capture elements in PNe, and can be enriched by s-process nucleosynthesis in PN progenitor stars. With the photoionization code Cloudy (Ferland et al. 2013, RMxA&A, 49, 1), we computed grids of models that span the range of physical conditions in most PNe to investigate the ionization balance of Se and Kr. The new atomic data were tested by modeling 15 PNe that exhibit emission from multiple Kr ions. We found systematic discrepancies between the modeled and observed Kr lines, which could not be satisfactorily explained by observational uncertainties or approximations in the models. The observed ionization balance is reproduced more accurately by empirically adjusting the photoionization cross sections of Kr⁺–Kr³⁺ within their cited uncertainties, and the dielectronic recombination rate coefficients by slightly larger amounts. We present new analytical ionization correction factors for Se and Kr, based on correlations between the ionic fractions of detected Se and Kr ions and those of routinely observed O, Ar, and S ions. The correction factors are applied to the K band survey of Sterling & Dinerstein (2008, ApJS, 174, 158) to derive improved Se and Kr abundances in 120 PNe. The revised abundances are 0.1–0.3 dex lower than the previous values in most PNe, reducing the estimated fraction of enriched objects from 52% to 37%. However, this figure depends on the assumed initial abundances of Se and Kr in the progenitor stars, which may be subsolar in some cases and may differ for objects belonging to different stellar populations. We find that the primary conclusions of Sterling & Dinerstein still hold: Kr is more strongly enriched than Se in PNe, in accordance with nucleosynthetic predictions; PNe with more massive progenitors show little if any s-process enrichment

  10. Large-Scale Graph Processing Analysis using Supercomputer Cluster

    NASA Astrophysics Data System (ADS)

    Vildario, Alfrido; Fitriyani; Nugraha Nurkahfi, Galih

    2017-01-01

    Graph implementations are widely used in sectors such as automotive, traffic, image processing and many more. They produce large-scale graphs, so processing them requires long computation times and high-specification resources. This research analysed the implementation of large-scale graph processing on a supercomputer cluster. We implemented graph processing using the Breadth-First Search (BFS) algorithm for the single-destination shortest path problem. The parallel BFS implementation with Message Passing Interface (MPI) used the supercomputer cluster at the High Performance Computing Laboratory, Computational Science, Telkom University, and the Stanford Large Network Dataset Collection. The results showed that the implementation gave an average speedup of more than 30 times and an efficiency of almost 90%.
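
    To make the terms concrete, the following is a minimal serial sketch of BFS on an unweighted graph, together with the usual definitions of speedup and parallel efficiency. It is not the authors' MPI implementation; the toy graph, the function names and the timing figures are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop count from `source` to every reachable vertex of an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in dist:  # first visit gives the shortest hop count
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def speedup(serial_seconds, parallel_seconds):
    """Ratio of serial run time to parallel run time."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, processes):
    """Speedup divided by the number of processes used."""
    return speedup(serial_seconds, parallel_seconds) / processes


# Hypothetical toy graph given as an adjacency list.
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
print(bfs_distances(graph, 0))
```

    In a distributed version the vertex set is typically partitioned across processes and frontier vertices are exchanged at each level, but the level-by-level traversal is the same idea as in this serial sketch.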

  11. K-means cluster analysis and seismicity partitioning for Pakistan

    NASA Astrophysics Data System (ADS)

    Rehman, Khaista; Burton, Paul W.; Weatherill, Graeme A.

    2014-07-01

    Pakistan and the western Himalaya is a region of high seismic activity located at the triple junction between the Arabian, Eurasian and Indian plates. Four devastating earthquakes have resulted in significant numbers of fatalities in Pakistan and the surrounding region in the past century (Quetta, 1935; Makran, 1945; Pattan, 1974 and the recent 2005 Kashmir earthquake). It is therefore necessary to develop an understanding of the spatial distribution of seismicity and the potential seismogenic sources across the region. This forms an important basis for the calculation of seismic hazard; a crucial input in seismic design codes needed to begin to effectively mitigate the high earthquake risk in Pakistan. The development of seismogenic source zones for seismic hazard analysis is driven by both geological and seismotectonic inputs. Despite the many developments in seismic hazard in recent decades, the manner in which seismotectonic information feeds the definition of the seismic source can, in many parts of the world including Pakistan and the surrounding regions, remain a subjective process driven primarily by expert judgment. Whilst much research is ongoing to map and characterise active faults in Pakistan, knowledge of the seismogenic properties of the active faults is still incomplete in much of the region. Consequently, seismicity, both historical and instrumental, remains a primary guide to the seismogenic sources of Pakistan. This study utilises a cluster analysis approach for the purposes of identifying spatial differences in seismicity, which can be utilised to form a basis for delineating seismogenic source regions. An effort is made to examine seismicity partitioning for Pakistan with respect to earthquake database, seismic cluster analysis and seismic partitions in a seismic hazard context. A magnitude-homogeneous earthquake catalogue has been compiled using various available earthquake data. The earthquake catalogue covers a time span from 1930 to 2007 and
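
    The k-means partitioning named in the title of this record can be sketched as below. This is a generic Lloyd-style iteration, not the authors' procedure; the epicentre coordinates and starting centroids are hypothetical, and plain Euclidean distance on (longitude, latitude) pairs is a simplifying assumption.

```python
import math


def kmeans(points, centroids, max_iterations=50):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        updated = []
        for k, members in enumerate(clusters):
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[k])  # keep an empty cluster's centroid
        if updated == centroids:              # converged
            break
        centroids = updated
    labels = [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
              for p in points]
    return centroids, labels


# Hypothetical epicentres as (longitude, latitude) in degrees.
epicentres = [(66.0, 25.0), (66.5, 25.5), (67.0, 25.2),
              (73.0, 34.0), (73.5, 34.5), (74.0, 34.2)]
centres, labels = kmeans(epicentres, [(66.0, 25.0), (74.0, 34.2)])
```

    The number of clusters and the starting centroids are inputs here; in practice both choices affect the resulting partition.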

  12. Analysis of microbial community and nitrogen transition with enriched nitrifying soil microbes for organic hydroponics.

    PubMed

    Saijai, Sakuntala; Ando, Akinori; Inukai, Ryuya; Shinohara, Makoto; Ogawa, Jun

    2016-06-27

    Nitrifying microbial consortia were enriched from bark compost in a water system by regulating the amounts of organic nitrogen compounds and by controlling the aeration conditions with addition of CaCO3 for maintaining suitable pH. Repeated enrichment showed reproducible mineralization of organic nitrogen via the conversion of ammonium ions (NH4+) and nitrite ions (NO2-) into nitrate ions (NO3-). The change in microbial composition during the enrichment was investigated by PCR-DGGE analysis with a focus on prokaryote, ammonia-oxidizing bacteria, nitrite-oxidizing bacteria, and eukaryote cell types. The microbial transition had a simple profile and showed a clear relation to the transition of nitrogen ions. Nitrosomonas and Nitrobacter were mainly detected during NH4+ and NO2- oxidation, respectively. These results, revealing representative microorganisms acting in each of the ammonification and nitrification stages, will be valuable for the development of simple artificial microbial consortia for organic hydroponics consisting of identified heterotrophs and autotrophic nitrifying bacteria.

  13. Potentially novel copper resistance genes in copper-enriched activated sludge revealed by metagenomic analysis.

    PubMed

    Li, Li-Guan; Cai, Lin; Zhang, Xu-Xiang; Zhang, Tong

    2014-12-01

    In this study, we utilized the Illumina high-throughput metagenomic approach to investigate diversity and abundance of both microbial community and copper resistance genes (CuRGs) in activated sludge (AS) which was enriched under copper selective stress up to 800 mg/L. The raw datasets (~3.5 Gb for each sample, i.e., the copper-enriched AS and the control AS) were merged and normalized for the BLAST analyses against the SILVA SSU rRNA gene database and self-constructed copper resistance protein database (CuRD). Also, the raw metagenomic sequences were assembled into contigs and analyzed based on Open Reading Frames (ORFs) to identify potentially novel copper resistance genes. Among the different resistance systems for copper detoxification under the high copper stress condition, the Cus system was the most enriched system. The results also indicated that genes encoding multi-copper oxidase played a more important role than those encoding efflux proteins. More significantly, several potentially novel copper resistance ORFs were identified by Pfam search and phylogenic analysis. This study demonstrated a new understanding of microbial-mediated copper resistance under high copper stress using high-throughput shotgun sequencing technique.

  14. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    PubMed

    Hsu, Jessie J; Finkelstein, Dianne M; Schoenfeld, David A

    2015-01-01

    One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
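
    The random effects model described in this abstract can be written out as follows. The notation is an assumption made for illustration: i indexes observations (patients), j genes, c(j) the cluster of gene j, and K the number of clusters. The abstract does not state distributional assumptions or whether the outcome carries its own error term, so none are shown.

```latex
x_{ij} = u_{i,\,c(j)} + \varepsilon_{ij},
\qquad
y_i = \sum_{k=1}^{K} \beta_k \, u_{ik}
```

    Here x_{ij} is the expression of gene j in observation i, u_{ik} is the random effect specific to observation i and cluster k, \varepsilon_{ij} is the independent error term, and y_i is the outcome expressed as a linear combination of the cluster random effects with coefficients \beta_k.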

  15. Enrichment Analysis Identifies Functional MicroRNA-Disease Associations in Humans

    PubMed Central

    Yuan, Dandan; Cui, Xiaomeng; Wang, Yang; Zhao, Yilei; Li, Huiying; Hu, Suangjiu; Chu, Xiaodan; Li, Yan; Li, Qiang; Liu, Qian; Zhu, Wenliang

    2015-01-01

    Substantial evidence has shown that microRNAs (miRNAs) may be causally linked to the occurrence and progression of human diseases. Herein, we conducted an enrichment analysis to identify potential functional miRNA-disease associations (MDAs) in humans by integrating currently known biological data: miRNA-target interactions (MTIs), protein-protein interactions, and gene-disease associations. Two contributing factors to functional miRNA-disease associations were quantitatively considered: the direct effects of miRNA that target disease-related genes, and indirect effects triggered by protein-protein interactions. Ninety-nine miRNAs were scanned for possible functional association with 2223 MeSH-defined human diseases. Each miRNA was experimentally validated to target ≥ 10 mRNA genes. Putative MDAs were identified when at least one MTI was confidently validated for a disease. Overall, 19648 putative MDAs were found, of which 10.0% were experimentally validated. Further results suggest that filtering for miRNAs that target a greater number of disease-related genes (n ≥ 8) can significantly enrich for true MDAs from the set of putative associations (enrichment rate = 60.7%, adjusted hypergeometric p = 2.41×10^-91). Considering the indirect effects of miRNAs further elevated the enrichment rate to 72.6%. By using this method, a novel MDA between miR-24 and ovarian cancer was found. Compared with scrambled miRNA, overexpression of miR-24 was validated to markedly induce apoptosis of ovarian cancer cells. Our study provides novel insight into factors contributing to functional MDAs by integrating large quantities of previously generated biological data, and establishes a feasible method to identify plausible associations with high confidence. PMID:26296081
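
    The hypergeometric enrichment p-value mentioned in this abstract is, in its generic form, an upper-tail probability. The sketch below shows that generic calculation only; the counts are hypothetical, the function name is an assumption, and the multiple-testing adjustment reported by the authors is not reproduced.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` items are sampled without replacement
    from `population` items of which `successes` are true positives."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 20 putative associations, 5 validated overall;
# a filter keeps 4 associations, 2 of which are validated.
p_value = hypergeom_upper_tail(population=20, successes=5, draws=4, observed=2)
enrichment_rate = 2 / 4   # fraction of the filtered set that is validated
```

    A small p-value indicates that the filtered set contains more validated associations than would be expected from random sampling of the putative set.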

  16. MEME-LaB: motif analysis in clusters.

    PubMed

    Brown, Paul; Baxter, Laura; Hickman, Richard; Beynon, Jim; Moore, Jonathan D; Ott, Sascha

    2013-07-01

    Genome-wide expression analysis can result in large numbers of clusters of co-expressed genes. Although there are tools for ab initio discovery of transcription factor-binding sites, most do not provide a quick and easy way to study large numbers of clusters. To address this, we introduce a web tool called MEME-LaB. The tool wraps MEME (an ab initio motif finder), providing an interface for users to input multiple gene clusters, retrieve promoter sequences, run motif finding and then easily browse and condense the results, facilitating better interpretation of the results from large-scale datasets. MEME-LaB is freely accessible at: http://wsbc.warwick.ac.uk/wsbcToolsWebpage/. Supplementary data are available at Bioinformatics online.

  17. MEME-LaB: motif analysis in clusters

    PubMed Central

    Brown, Paul; Baxter, Laura; Hickman, Richard; Beynon, Jim; Moore, Jonathan D.; Ott, Sascha

    2013-01-01

    Summary: Genome-wide expression analysis can result in large numbers of clusters of co-expressed genes. Although there are tools for ab initio discovery of transcription factor-binding sites, most do not provide a quick and easy way to study large numbers of clusters. To address this, we introduce a web tool called MEME-LaB. The tool wraps MEME (an ab initio motif finder), providing an interface for users to input multiple gene clusters, retrieve promoter sequences, run motif finding and then easily browse and condense the results, facilitating better interpretation of the results from large-scale datasets. Availability: MEME-LaB is freely accessible at: http://wsbc.warwick.ac.uk/wsbcToolsWebpage/. Contact: p.e.brown@warwick.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23681125

  18. BiNChE: a web tool and library for chemical enrichment analysis based on the ChEBI ontology.

    PubMed

    Moreno, Pablo; Beisken, Stephan; Harsha, Bhavana; Muthukrishnan, Venkatesh; Tudose, Ilinca; Dekker, Adriano; Dornfeldt, Stefanie; Taruttis, Franziska; Grosse, Ivo; Hastings, Janna; Neumann, Steffen; Steinbeck, Christoph

    2015-02-21

    Ontology-based enrichment analysis aids in the interpretation and understanding of large-scale biological data. Ontologies are hierarchies of biologically relevant groupings. Using ontology annotations, which link ontology classes to biological entities, enrichment analysis methods assess whether there is a significant over- or under-representation of entities for ontology classes. While many tools exist that run enrichment analysis for protein sets annotated with the Gene Ontology, there are only a few that can be used for small-molecule enrichment analysis. We describe BiNChE, an enrichment analysis tool for small molecules based on the ChEBI Ontology. BiNChE displays an interactive graph that can be exported as a high-resolution image or in network formats. The tool provides plain, weighted and fragment analysis based on either the ChEBI Role Ontology or the ChEBI Structural Ontology. BiNChE aids in the exploration of large sets of small molecules produced within Metabolomics or other Systems Biology research contexts. The open-source tool provides easy and highly interactive web access to enrichment analysis with the ChEBI ontology tool and is additionally available as a standalone library.

  19. An optical analysis of the merging cluster Abell 3888

    NASA Astrophysics Data System (ADS)

    Shakouri, S.; Johnston-Hollitt, M.; Dehghan, S.

    2016-05-01

    In this paper we present new AAOmega spectroscopy of 254 galaxies within a 30 arcmin radius around Abell 3888. We combine these data with the existing redshifts measured in a one degree radius around the cluster and perform a substructure analysis. We confirm 71 member galaxies within the core of A3888 and determine a new average redshift and velocity dispersion for the cluster of 0.1535 ± 0.0009 and 1181 ± 197 km s^-1, respectively. The cluster is elongated along an East-West axis and we find the core is bimodal along this axis with two subgroups of 26 and 41 members detected. Our results suggest that A3888 is a merging system, putting to rest the previous conjecture about the morphological status of the cluster derived from X-ray observations. In addition to the results on A3888 we also present six newly detected galaxy overdensities in the field, three of which we classify as new galaxy clusters.

  20. Seismicity monitoring by cluster analysis of moment tensors

    NASA Astrophysics Data System (ADS)

    Cesca, Simone; Şen, Ali Tolga; Dahm, Torsten

    2014-03-01

    We suggest a new clustering approach to classify focal mechanisms from large moment tensor catalogues, with the purpose of automatically identifying families of earthquakes with similar source geometry, recognizing the orientation of the most active faults, and detecting temporal variations of the rupture processes. The approach differs from waveform similarity methods in that clusters are detected even if the events occur at large spatial distances. This approach is particularly helpful for analysing large moment tensor catalogues, as in microseismicity applications, where manual analysis and classification are not feasible. A flexible algorithm is proposed here: it can handle different metrics, norms, and focal mechanism representations. In particular, the method can handle full moment tensor or constrained source model catalogues, for which different metrics are suggested. The method can account for variable uncertainties of different moment tensor components. We verify the method with synthetic catalogues. An application to real data from mining-induced seismicity illustrates possible applications of the method and demonstrates the cluster detection and event classification performance with different moment tensor catalogues. Results show that the main earthquake source types occur on spatially separated faults, and that temporal changes in the number and characterization of focal mechanism clusters are detected. We suggest that moment tensor clustering can help assess time-dependent hazard in mines.
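
    The abstract states that different metrics can be used but does not define them. As one plausible example only, the sketch below computes a distance from the normalized inner product of two full moment tensors stored as 3x3 symmetric matrices. The choice of metric, the scaling to the range 0 to 1, and the function names are assumptions, not the authors' definitions.

```python
import math


def tensor_inner(a, b):
    """Component-wise inner product of two 3x3 tensors."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))


def moment_tensor_distance(a, b):
    """0 for identical mechanisms, 1 for opposite polarity (assumed metric)."""
    cosine = tensor_inner(a, b) / math.sqrt(tensor_inner(a, a) * tensor_inner(b, b))
    cosine = max(-1.0, min(1.0, cosine))   # guard against rounding drift
    return 0.5 * (1.0 - cosine)


# Hypothetical double-couple tensors (arbitrary units and axes).
mt_a = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
mt_reversed = [[0, -1, 0], [-1, 0, 0], [0, 0, 0]]
mt_rotated = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]

d_same = moment_tensor_distance(mt_a, mt_a)
d_opposite = moment_tensor_distance(mt_a, mt_reversed)
d_rotated = moment_tensor_distance(mt_a, mt_rotated)
```

    A pairwise distance matrix built this way could then be passed to any density-based or hierarchical clustering routine.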

  1. Improving Gene-Set Enrichment Analysis of RNA-Seq Data with Small Replicates.

    PubMed

    Yoon, Sora; Kim, Seon-Young; Nam, Dougu

    2016-01-01

    Deregulated pathways identified from transcriptome data of two sample groups have played a key role in many genomic studies. Gene-set enrichment analysis (GSEA) has been commonly used for pathway or functional analysis of microarray data, and it is also being applied to RNA-seq data. However, most RNA-seq data so far have only a small number of replicates. This forces the use of the gene-permuting GSEA method (or preranked GSEA), which results in a great number of false positives due to the inter-gene correlation in each gene-set. We demonstrate that incorporating the absolute gene statistic in one-tailed GSEA considerably improves the false-positive control and the overall discriminatory ability of the gene-permuting GSEA methods for RNA-seq data. To test the performance, a simulation method to generate correlated read counts within a gene-set was newly developed, and a dozen currently available RNA-seq enrichment analysis methods were compared, where the proposed methods outperformed others that do not account for the inter-gene correlation. Analysis of real RNA-seq data also supported the proposed methods in terms of false positive control, ranks of true positives and biological relevance. An efficient R package (AbsFilterGSEA) coded with C++ (Rcpp) is available from CRAN.

  2. Detection of Functional Change Using Cluster Trend Analysis in Glaucoma

    PubMed Central

    Gardiner, Stuart K.; Mansberger, Steven L.; Demirel, Shaban

    2017-01-01

    Purpose: Global analyses using mean deviation (MD) assess visual field progression, but can miss localized changes. Pointwise analyses are more sensitive to localized progression, but more variable so require confirmation. This study assessed whether cluster trend analysis, averaging information across subsets of locations, could improve progression detection. Methods: A total of 133 test–retest eyes were tested 7 to 10 times. Rates of change and P values were calculated for possible re-orderings of these series to generate global analysis (“MD worsening faster than x dB/y with P < y”), pointwise and cluster analyses (“n locations [or clusters] worsening faster than x dB/y with P < y”) with specificity exactly 95%. These criteria were applied to 505 eyes tested over a mean of 10.5 years, to find how soon each detected “deterioration,” and compared using survival models. This was repeated including two subsequent visual fields to determine whether “deterioration” was confirmed. Results: The best global criterion detected deterioration in 25% of eyes in 5.0 years (95% confidence interval [CI], 4.7–5.3 years), compared with 4.8 years (95% CI, 4.2–5.1) for the best cluster analysis criterion, and 4.1 years (95% CI, 4.0–4.5) for the best pointwise criterion. However, for pointwise analysis, only 38% of these changes were confirmed, compared with 61% for clusters and 76% for MD. The time until 25% of eyes showed subsequently confirmed deterioration was 6.3 years (95% CI, 6.0–7.2) for global, 6.3 years (95% CI, 6.0–7.0) for pointwise, and 6.0 years (95% CI, 5.3–6.6) for cluster analyses. Conclusions: Although the specificity is still suboptimal, cluster trend analysis detects subsequently confirmed deterioration sooner than either global or pointwise analyses. PMID:28715580
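
    The core computation behind a cluster trend analysis, averaging values across the locations in a cluster and fitting a rate of change over time, can be sketched as below. The location names, cluster definitions and sensitivity values are hypothetical, ordinary least squares is an assumed choice of regression, and the P value and specificity calibration described in the abstract are omitted.

```python
def ols_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def cluster_trends(times, series_by_location, clusters):
    """Average the series over each cluster's locations, then fit a slope (dB/y)."""
    trends = {}
    for name, locations in clusters.items():
        means = [
            sum(series_by_location[loc][visit] for loc in locations) / len(locations)
            for visit in range(len(times))
        ]
        trends[name] = ols_slope(times, means)
    return trends


# Hypothetical data: years since baseline and values (dB) at three locations.
years = [0, 1, 2, 3]
values_db = {"a": [30, 29, 28, 27], "b": [30, 30, 30, 30], "c": [28, 28, 28, 28]}
trends = cluster_trends(years, values_db, {"superior": ["a", "b"], "inferior": ["c"]})
```

    Averaging within a cluster reduces pointwise variability while still localizing change better than a single global index.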

  3. The ACS survey of Galactic globular clusters - XIV. Bayesian single-population analysis of 69 globular clusters

    NASA Astrophysics Data System (ADS)

    Wagner-Kaiser, R.; Sarajedini, A.; von Hippel, T.; Stenning, D. C.; van Dyk, D. A.; Jeffery, E.; Robinson, E.; Stein, N.; Anderson, J.; Jefferys, W. H.

    2017-06-01

    We use Hubble Space Telescope (HST) imaging from the ACS Treasury Survey to determine fits for single population isochrones of 69 Galactic globular clusters. Using robust Bayesian analysis techniques, we simultaneously determine ages, distances, absorptions and helium values for each cluster under the scenario of a 'single' stellar population on model grids with solar ratio heavy element abundances. The set of cluster parameters is determined in a consistent and reproducible manner for all clusters using the Bayesian analysis suite BASE-9. Our results are used to re-visit the age-metallicity relation. We find correlations with helium and several other parameters such as metallicity, binary fraction and proxies for cluster mass. The helium abundances of the clusters are also considered in the context of carbon, nitrogen, and oxygen abundances and the multiple population scenario.

  4. Cluster analysis of Plasmodium RNA-seq time-course data identifies stage-specific co-regulated biological processes and regulatory elements

    PubMed Central

    Oyelade, Jelili; Adebiyi, Ezekiel

    2016-01-01

    In this study, we interpreted RNA-seq time-course data of three developmental stages of Plasmodium species by clustering genes based on similarities in their expression profile without prior knowledge of the gene function. Functional enrichment of clusters of upregulated genes at specific time-points reveals potential targetable biological processes with information on their timings. We identified common consensus sequences that these clusters shared as potential points of coordinated transcriptional control. Five cluster groups showed upregulated profile patterns of biological interest. This included two clusters from the Intraerythrocytic Developmental Cycle (cluster 4 = 16 genes, and cluster 9 = 32 genes), one from the sexual development stage (cluster 2 = 851 genes), and two from the gamete-fertilization stage in the mosquito host (cluster 4 = 153 genes, and cluster 9 = 258 genes). The IDC expressed the least numbers of genes with only 1448 genes showing any significant activity of the 5020 genes (~29%) in the experiment. Gene ontology (GO) enrichment analysis of these clusters revealed a total of 671 uncharacterized genes implicated in 14 biological processes and components associated with these stages, some of which are currently being investigated as drug targets in on-going research. Five putative transcription regulatory binding motifs shared by members of each cluster were also identified, one of which was also identified in a previous study by separate researchers. Our study shows stage-specific genes and biological processes that may be important in antimalarial drug research efforts. In addition, timed-coordinated control of separate processes may explain the paucity of factors in parasites. PMID:27990252

  5. Nonproliferation analysis of the reduction of excess separated plutonium and high-enriched uranium

    SciTech Connect

    Persiani, P.J.

    1995-08-01

    The purpose of this preliminary investigation is to explore alternatives and strategies aimed at the gradual reduction of the excess inventories of separated plutonium and high-enriched uranium (HEU) in the civilian nuclear power industry. The study attempts to establish a technical and economic basis to assist in the formation of alternative approaches consistent with nonproliferation and safeguards concerns. The analysis addresses several options in reducing the excess separated plutonium and HEU, and the consequences on nonproliferation and safeguards policy assessments resulting from the interacting synergistic effects between fuel cycle processes and isotopic signatures of nuclear materials.

  6. Analysis of civilian processing programs in reduction of excess separated plutonium and high-enriched uranium

    SciTech Connect

    Persiani, P.J.

    1995-12-31

    The purpose of this preliminary investigation is to explore alternatives and strategies aimed at the gradual reduction of the excess inventories of separated plutonium and high-enriched uranium (HEU) in the civilian nuclear power industry. The study attempts to establish a technical and economic basis to assist in the formation of alternative approaches consistent with nonproliferation and safeguards concerns. The analysis addresses several options in reducing the excess separated plutonium and HEU, and the consequences on nonproliferation and safeguards policy assessments resulting from the interacting synergistic effects between fuel cycle processes and isotopic signatures of nuclear materials.

  7. REGIONAL-SCALE WIND FIELD CLASSIFICATION EMPLOYING CLUSTER ANALYSIS

    SciTech Connect

    Glascoe, L G; Glaser, R E; Chin, H S; Loosmore, G A

    2004-06-17

    The classification of time-varying multivariate regional-scale wind fields at a specific location can assist event planning as well as consequence and risk analysis. Further, wind field classification involves data transformation and inference techniques that effectively characterize stochastic wind field variation. Such a classification scheme is potentially useful for addressing overall atmospheric transport uncertainty and meteorological parameter sensitivity issues. Different methods to classify wind fields over a location include the principal component analysis of wind data (e.g., Hardy and Walton, 1978) and the use of cluster analysis for wind data (e.g., Green et al., 1992; Kaufmann and Weber, 1996). The goal of this study is to use a clustering method to classify the winds of a gridded data set, i.e., from meteorological simulations generated by a forecast model.

  8. Multifaceted enrichment analysis of RNA–RNA crosstalk reveals cooperating micro-societies in human colorectal cancer

    PubMed Central

    Mazza, Tommaso; Mazzoccoli, Gianluigi; Fusilli, Caterina; Capocefalo, Daniele; Panza, Anna; Biagini, Tommaso; Castellana, Stefano; Gentile, Annamaria; De Cata, Angelo; Palumbo, Orazio; Stallone, Raffaella; Rubino, Rosa; Carella, Massimo; Piepoli, Ada

    2016-01-01

    Alterations in the balance of mRNA and microRNA (miRNA) expression profiles contribute to the onset and development of colorectal cancer. The regulatory functions of individual miRNA-gene pairs are widely acknowledged, but group effects are largely unexplored. We performed an integrative analysis of mRNA–miRNA and miRNA–miRNA interactions using high-throughput mRNA and miRNA expression profiles obtained from matched specimens of human colorectal cancer tissue and adjacent non-tumorous mucosa. This investigation resulted in a hypernetwork-based model, whose functional backbone was fulfilled by tight micro-societies of miRNAs. These proved to modulate several genes that are known to control a set of significantly enriched cancer-enhancer and cancer-protection biological processes, and that an array of upstream regulatory analyses demonstrated to be dependent on miR-145, a cell cycle and MAPK signaling cascade master regulator. In conclusion, we reveal miRNA-gene clusters and gene families with close functional relationships and highlight the role of miR-145 as potent upstream regulator of a complex RNA–RNA crosstalk, which mechanistically modulates several signaling pathways and regulatory circuits that when deranged are relevant to the changes occurring in colorectal carcinogenesis. PMID:27067546

  9. Multifaceted enrichment analysis of RNA-RNA crosstalk reveals cooperating micro-societies in human colorectal cancer.

    PubMed

    Mazza, Tommaso; Mazzoccoli, Gianluigi; Fusilli, Caterina; Capocefalo, Daniele; Panza, Anna; Biagini, Tommaso; Castellana, Stefano; Gentile, Annamaria; De Cata, Angelo; Palumbo, Orazio; Stallone, Raffaella; Rubino, Rosa; Carella, Massimo; Piepoli, Ada

    2016-05-19

    Alterations in the balance of mRNA and microRNA (miRNA) expression profiles contribute to the onset and development of colorectal cancer. The regulatory functions of individual miRNA-gene pairs are widely acknowledged, but group effects are largely unexplored. We performed an integrative analysis of mRNA-miRNA and miRNA-miRNA interactions using high-throughput mRNA and miRNA expression profiles obtained from matched specimens of human colorectal cancer tissue and adjacent non-tumorous mucosa. This investigation resulted in a hypernetwork-based model, whose functional backbone was fulfilled by tight micro-societies of miRNAs. These proved to modulate several genes that are known to control a set of significantly enriched cancer-enhancer and cancer-protection biological processes, and that an array of upstream regulatory analyses demonstrated to be dependent on miR-145, a cell cycle and MAPK signaling cascade master regulator. In conclusion, we reveal miRNA-gene clusters and gene families with close functional relationships and highlight the role of miR-145 as potent upstream regulator of a complex RNA-RNA crosstalk, which mechanistically modulates several signaling pathways and regulatory circuits that when deranged are relevant to the changes occurring in colorectal carcinogenesis.

  10. Specific on-plate enrichment of phosphorylated peptides for direct MALDI-TOF MS analysis.

    PubMed

    Qiao, Liang; Roussel, Christophe; Wan, Jingjing; Yang, Pengyuan; Girault, Hubert H; Liu, Baohong

    2007-12-01

    An on-plate specific enrichment method is presented for the direct analysis of peptide phosphorylation. An array of sintered TiO2 nanoparticle spots was prepared on a stainless steel plate to provide a porous substrate with a very large specific surface and durable functions. These spots were used to selectively capture phosphorylated peptides from peptide mixtures, and the immobilized phosphopeptides could then be analyzed directly by MALDI MS after washing away the nonphosphorylated peptides. β-Casein and protein mixtures were employed as model samples to investigate the selection efficiency. In this strategy, the steps of phosphopeptide capture, purification, and subsequent mass spectrometry analysis are all successfully accomplished on a single target plate, which greatly reduces sample loss and simplifies analytical procedures. The low detection limit, small sample size, and rapid selective entrapment show that this on-plate strategy is promising for online enrichment of phosphopeptides, which is essential for the analysis of minute amounts of sample in high-throughput proteome research.

  11. Analysis of RXTE data on Clusters of Galaxies

    NASA Technical Reports Server (NTRS)

    Petrosian, Vahe

    2004-01-01

    This grant provided support for the reduction, analysis and interpretation of hard X-ray (HXR, for short) observations of the cluster of galaxies RXJ0658-5557 scheduled for the week of August 23, 2002 under the RXTE Cycle 7 program (PI Vahe Petrosian, Obs. ID 70165). The goal of the observation was to search for and characterize the shape of the HXR component beyond the well established thermal soft X-ray (SXR) component. Such hard components have been detected in several nearby clusters. The planned observation of a relatively distant cluster would provide information on the characteristics of this radiation at a different epoch in the evolution of the universe and shed light on its origin. We (Petrosian, 2001) have argued that thermal bremsstrahlung, as proposed earlier, cannot be the mechanism for the production of the HXRs and that the most likely mechanism is Compton upscattering of the cosmic microwave radiation by relativistic electrons, which are known to be present in the clusters and to be responsible for the observed radio emission. Based on this picture we estimated that this cluster, in spite of its relatively large distance, will have an HXR signal comparable to the other nearby ones. The proposed RXTE observations were carried out and the data have been analyzed. We detect a hard X-ray tail in the spectrum of this cluster with a flux very nearly equal to our predicted value. This has strengthened the case for the Compton scattering model. We intend the data obtained via this observation to be a part of a larger data set. We have identified other clusters of galaxies (in archival RXTE and other instrument data sets) with sufficiently high quality data where we can search for and measure (or at least put meaningful limits on) the strength of the hard component. With these studies we expect to clarify the mechanism for acceleration of particles in the intercluster medium and provide guidance for future observations of this intriguing phenomenon by instrument

  12. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

    PubMed

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

    2016-04-01

    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  13. Full text clustering and relationship network analysis of biomedical publications.

    PubMed

    Guan, Renchu; Yang, Chen; Marchese, Maurizio; Liang, Yanchun; Shi, Xiaohu

    2014-01-01

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.
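
    As a minimal sketch of the cosine coefficient mentioned in the abstract above, assuming each document is represented as a term-weight vector: the function name and the toy vectors below are illustrative assumptions, not taken from the paper, and the SSAP algorithm itself is not reproduced here.

```python
import math


def cosine_coefficient(u, v):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # An all-zero vector has no direction; treat its similarity as 0.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical term-weight vectors for three documents.
doc_a = [2.0, 0.0, 1.0, 0.0]
doc_b = [1.0, 0.0, 0.5, 0.0]  # same direction as doc_a, half the length
doc_c = [0.0, 3.0, 0.0, 1.0]  # shares no terms with doc_a

sim_ab = cosine_coefficient(doc_a, doc_b)
sim_ac = cosine_coefficient(doc_a, doc_c)
```

    Because the measure depends only on the angle between the two vectors, doc_a and doc_b score as maximally similar despite their different lengths, which is one reason it is often preferred to Euclidean distance for sparse text vectors.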

  14. The Productivity Analysis of Chennai Automotive Industry Cluster

    NASA Astrophysics Data System (ADS)

    Bhaskaran, E.

    2014-07-01

    Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining technical efficiency, peer weights, input and output slacks of 100 auto component industries in three estates. The methodology adopted is using Data Envelopment Analysis of Output Oriented Banker Charnes Cooper model by taking net worth, fixed assets, employment as inputs and gross output as outputs. The non-zero represents the weights for efficient clusters. The higher slack obtained reveals the excess net worth, fixed assets, employment and shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease the fixed assets or employment. Moreover for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export market.

  15. Sirenomelia in Argentina: Prevalence, geographic clusters and temporal trends analysis.

    PubMed

    Groisman, Boris; Liascovich, Rosa; Gili, Juan Antonio; Barbero, Pablo; Bidondo, María Paz

    2016-07-01

    Sirenomelia is a severe malformation of the lower body characterized by a single medial lower limb and a variable combination of visceral abnormalities. Given that sirenomelia is a very rare birth defect, epidemiological studies are scarce. The aim of this study is to evaluate prevalence, geographic clusters and time trends of sirenomelia in Argentina, using data from the National Network of Congenital Anomalies of Argentina (RENAC) from November 2009 until December 2014. This is a descriptive study using data from the RENAC, a hospital-based surveillance system for newborns affected with major morphological congenital anomalies. We calculated sirenomelia prevalence throughout the period, searched for geographical clusters, and evaluated time trends. The prevalence of confirmed cases of sirenomelia throughout the period was 2.35 per 100,000 births. Cluster analysis showed no statistically significant geographical aggregates. Time-trends analysis showed that the prevalence was higher in years 2009 to 2010. The observed prevalence was higher than that observed in previous epidemiological studies in other geographic regions. We observed a likely real increase in the initial period of our study. We used strict diagnostic criteria, excluding cases that only had clinical diagnosis of sirenomelia. Therefore, real prevalence could be even higher. This study did not show any geographic clusters. Because etiology of sirenomelia has not yet been established, studies of epidemiological features of this defect may contribute to define its causes. Birth Defects Research (Part A) 106:604-611, 2016. © 2016 Wiley Periodicals, Inc.

  16. Visual Analysis and Processing of Clusters Structures in Multidimensional Datasets

    NASA Astrophysics Data System (ADS)

    Bondarev, A. E.

    2017-05-01

    The article addresses the visual analysis of cluster structures in multidimensional datasets. For the visual analysis, the elastic maps approach [1,2] is applied. This approach is well suited to processing and visualizing multidimensional datasets. To analyze clusters in the original data volume, elastic maps are used to map the original data points onto embedded manifolds of lower dimensionality. By decreasing the elasticity parameters, one can construct a map surface that approximates the multidimensional dataset much more closely. The points of the dataset are then projected onto the map. Unfolding the constructed map onto a flat plane gives insight into the cluster structure of the multidimensional dataset. The elastic maps approach does not require any a priori information about the data and does not depend on the nature or origin of the data. Elastic maps are usually combined with PCA. When presented in the space of the first three principal components, elastic maps provide quite good results. The article describes the results of applying the elastic maps approach to the visual analysis of clusters in different multidimensional datasets, including medical data.

  17. Full Text Clustering and Relationship Network Analysis of Biomedical Publications

    PubMed Central

    Guan, Renchu; Yang, Chen; Marchese, Maurizio; Liang, Yanchun; Shi, Xiaohu

    2014-01-01

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers. PMID:25250864

  18. Enrichment and Molecular Analysis of Breast Cancer Disseminated Tumor Cells from Bone Marrow Using Microfiltration

    PubMed Central

    Siddappa, Chidananda M.; Adams, Daniel L.; Li, Shuhong; Makarova, Olga V.; Amstutz, Pete; Nunley, Ryan; Tang, Cha-Mei; Watson, Mark A.; Aft, Rebecca L.

    2017-01-01

    Purpose: Molecular characterization of disseminated tumor cells (DTCs) in the bone marrow (BM) of breast cancer (BC) patients has been hindered by their rarity. To enrich for these cells using an antigen-independent methodology, we have evaluated a size-based microfiltration device in combination with several downstream biomarker assays. Methods: BM aspirates were collected from healthy volunteers or BC patients. Healthy BM was mixed with a specified number of BC cells to calculate recovery and fold enrichment by microfiltration. Specimens were pre-filtered using a 70 μm mesh sieve and the effluent filtered through CellSieve microfilters. Captured cells were analyzed by immunocytochemistry (ICC), FISH for HER-2/neu gene amplification status, and RNA in situ hybridization (RISH). Cells eluted from the filter were used for RNA isolation and subsequent qRT-PCR analysis for DTC biomarker gene expression. Results: Filtering an average of 14×10^6 nucleated BM cells yielded approximately 17–21×10^3 residual BM cells. In the BC cell spiking experiments, an average of 87% (range 84–92%) of tumor cells were recovered with approximately 170- to 400-fold enrichment. Captured BC cells from patients co-stained for cytokeratin and EpCAM, but not CD45 by ICC. RNA yields from 4 ml of patient BM after filtration averaged 135 ng per 10 million BM cells filtered with an average RNA Integrity Number (RIN) of 5.3. DTC-associated gene expression was detected by both qRT-PCR and RISH in filtered spiked or BC patient specimens but not in control filtered normal BM. Conclusions: We have tested a microfiltration technique for enrichment of BM DTCs. DTC capture efficiency was shown to range from 84.3% to 92.1% with up to 400-fold enrichment using model BC cell lines. In patients, recovered DTCs can be identified and distinguished from normal BM cells using multiple antibody-, DNA-, and RNA-based biomarker assays. PMID:28129357

  19. The Quantitative Analysis of Chennai Automotive Industry Cluster

    NASA Astrophysics Data System (ADS)

    Bhaskaran, Ethirajan

    2016-07-01

    Chennai is also called the Detroit of India because of its automotive industry, which produces over 40% of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in the Ambattur, Thirumalizai and Thirumudivakkam Industrial Estates, Chennai, faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the Quantitative Performance of the Chennai Automotive Industry Cluster before (2001-2002) and after (2008-2009) the Cluster Development Approach (CDA). The methodology adopted is collection of primary data from 100 ACI using a quantitative questionnaire and analysis using Correlation Analysis (CA), Regression Analysis (RA), Friedman Test (FMT), and Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals a high degree of relationship between the variables studied. The RA models constructed establish a strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT shows that there is no significant difference between the three location clusters with respect to Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports the view that each location has contributed uniformly to the development of the automobile component cluster. The FMT shows that there is no significant difference between industrial units with respect to costs such as Production, Infrastructure, Technology, Marketing and Net Profit. To conclude, the Automotive Industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting CDA and are now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This Cluster Development Approach (CDA) model can be implemented in industries of under developed and developing countries for cost reduction and productivity

  20. Criticality Experiments with Subcritical Clusters of 2.35 Wt% and 4.31 Wt% 235U Enriched UO2 Rods in Water at a Water-to-Fuel Volume Ratio of 1.6

    SciTech Connect

    SR Bierman; ED Clayton

    1980-07-01

    The fourth in a series of Nuclear Regulatory Commission funded criticality experiments has provided data for 2.35 wt% and 4.31 wt% 235U enriched UO2 rods at a water-to-fuel volume ratio of 1.6. The results from some 147 critical experiments are presented. They include for each enrichment:
    • The critical size of single lattices or clusters of fuel
    • The critical separation between sub-critical clusters of fuel
    • The critical separation between sub-critical clusters of fuel having fixed neutron absorbers between the fuel clusters
    • The isolation distance between fuel clusters
    • The critical size of fuel clusters containing water holes and voids
    • The critical size of fuel clusters separated by flux traps
    The fixed neutron absorbers for which data were obtained include 304-L steel, borated 304-L steel, copper, copper containing 1 wt% cadmium, cadmium, aluminium, zirconium and two trade name materials containing boron (Boral and Boroflex).

  1. Associations between DNA methylation and schizophrenia-related intermediate phenotypes - a gene set enrichment analysis.

    PubMed

    Hass, Johanna; Walton, Esther; Wright, Carrie; Beyer, Andreas; Scholz, Markus; Turner, Jessica; Liu, Jingyu; Smolka, Michael N; Roessner, Veit; Sponheim, Scott R; Gollub, Randy L; Calhoun, Vince D; Ehrlich, Stefan

    2015-06-03

    Multiple genetic approaches have identified microRNAs as key effectors in psychiatric disorders as they post-transcriptionally regulate expression of thousands of target genes. However, their role in specific psychiatric diseases remains poorly understood. In addition, epigenetic mechanisms such as DNA methylation, which affect the expression of both microRNAs and coding genes, are critical for our understanding of molecular mechanisms in schizophrenia. Using clinical, imaging, genetic, and epigenetic data of 103 patients with schizophrenia and 111 healthy controls of the Mind Clinical Imaging Consortium (MCIC) study of schizophrenia, we conducted gene set enrichment analysis to identify markers for schizophrenia-associated intermediate phenotypes. Genes were ranked based on the correlation between DNA methylation patterns and each phenotype, and then searched for enrichment in 221 predicted microRNA target gene sets. We found the predicted hsa-miR-219a-5p target gene set to be significantly enriched for genes (EPHA4, PKNOX1, ESR1, among others) whose methylation status is correlated with hippocampal volume independent of disease status. Our results were strengthened by significant associations between hsa-miR-219a-5p target gene methylation patterns and hippocampus-related neuropsychological variables. IPA pathway analysis of the respective predicted hsa-miR-219a-5p target genes revealed associated network functions in behavior and developmental disorders. Altered methylation patterns of predicted hsa-miR-219a-5p target genes are associated with a structural aberration of the brain that has been proposed as a possible biomarker for schizophrenia. The (dys)regulation of microRNA target genes by epigenetic mechanisms may confer additional risk for developing psychiatric symptoms. Further study is needed to understand possible interactions between microRNAs and epigenetic changes and their impact on risk for brain-based disorders such as schizophrenia.

  2. Associations between DNA methylation and schizophrenia-related intermediate phenotypes - a gene set enrichment analysis

    PubMed Central

    Hass, Johanna; Walton, Esther; Wright, Carrie; Beyer, Andreas; Scholz, Markus; Turner, Jessica; Liu, Jingyu; Smolka, Michael N.; Roessner, Veit; Sponheim, Scott R.; Gollub, Randy L.; Calhoun, Vince D.; Ehrlich, Stefan

    2015-01-01

    Multiple genetic approaches have identified microRNAs as key effectors in psychiatric disorders as they post-transcriptionally regulate expression of thousands of target genes. However, their role in specific psychiatric diseases remains poorly understood. In addition, epigenetic mechanisms such as DNA methylation, which affect the expression of both microRNAs and coding genes, are critical for our understanding of molecular mechanisms in schizophrenia. Using clinical, imaging, genetic, and epigenetic data of 103 patients with schizophrenia and 111 healthy controls of the Mind Clinical Imaging Consortium (MCIC) study of schizophrenia, we conducted gene set enrichment analysis to identify markers for schizophrenia-associated intermediate phenotypes. Genes were ranked based on the correlation between DNA methylation patterns and each phenotype, and then searched for enrichment in 221 predicted microRNA target gene sets. We found the predicted hsa-miR-219a-5p target gene set to be significantly enriched for genes (EPHA4, PKNOX1, ESR1, amongst others) whose methylation status is correlated with hippocampal volume independent of disease status. Our results were strengthened by significant associations between hsa-miR-219a-5p target gene methylation patterns and hippocampus-related neuropsychological variables. IPA pathway analysis of the respective predicted hsa-miR-219a-5p target genes revealed associated network functions in behaviour and developmental disorders. Altered methylation patterns of predicted hsa-miR-219a-5p target genes are associated with a structural aberration of the brain that has been proposed as a possible biomarker for schizophrenia. The (dys)regulation of microRNA target genes by epigenetic mechanisms may confer additional risk for developing psychiatric symptoms. Further study is needed to understand possible interactions between microRNAs and epigenetic changes and their impact on risk for brain-based disorders such as schizophrenia. PMID

  3. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches.

    PubMed

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
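
    The contrast the abstract above draws between hard and fuzzy assignment can be sketched as follows. This is an illustration only: it assumes the standard fuzzy c-means membership formula with fuzzifier m (the abstract does not name the specific algorithm), one-dimensional scores, and hypothetical cluster centers.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means style membership of one point in each cluster.

    m > 1 is the fuzzifier: larger values allow more cluster overlap.
    """
    dists = [abs(point - c) for c in centers]
    # A point sitting exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
        for d_j in dists
    ]


def hard_assignment(point, centers):
    """K-means style assignment: index of the single nearest center."""
    return min(range(len(centers)), key=lambda j: abs(point - centers[j]))


# Hypothetical centers on a one-dimensional score, and a case between them.
centers = [0.0, 10.0]
case_score = 4.0

memberships = fuzzy_memberships(case_score, centers)  # sums to 1
nearest = hard_assignment(case_score, centers)        # a single cluster index
```

    With these values the case is assigned wholly to the first cluster under hard assignment, while the fuzzy memberships split it between both clusters; raising m spreads the memberships more evenly, which corresponds to allowing more overlap.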

  4. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches

    PubMed Central

    Bolin, Jocelyn H.; Edwards, Julianne M.; Finch, W. Holmes; Cassady, Jerrell C.

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering. PMID:24795683

  5. Parallel Comparison of N-Linked Glycopeptide Enrichment Techniques Reveals Extensive Glycoproteomic Analysis of Plasma Enabled by SAX-ERLIC.

    PubMed

    Totten, Sarah M; Feasley, Christa L; Bermudez, Abel; Pitteri, Sharon J

    2017-03-03

    Protein glycosylation is of increasing interest due to its important roles in protein function and aberrant expression with disease. Characterizing protein glycosylation remains analytically challenging due to its low abundance, ion suppression issues, and microheterogeneity at glycosylation sites, especially in complex samples such as human plasma. In this study, the utility of three common N-linked glycopeptide enrichment techniques is compared using human plasma. By analysis on an LTQ-Orbitrap Elite mass spectrometer, electrostatic repulsion hydrophilic interaction liquid chromatography using strong anion exchange solid-phase extraction (SAX-ERLIC) provided the most extensive N-linked glycopeptide enrichment when compared with multilectin affinity chromatography (M-LAC) and Sepharose-HILIC enrichments. SAX-ERLIC enrichment yielded 191 unique glycoforms across 72 glycosylation sites from 48 glycoproteins, which is more than double that detected using other enrichment techniques. The greatest glycoform diversity was observed in SAX-ERLIC enrichment, with no apparent bias toward specific glycan types. SAX-ERLIC enrichments were additionally analyzed by an Orbitrap Fusion Lumos mass spectrometer to maximize glycopeptide identifications for a more comprehensive assessment of protein glycosylation. In these experiments, 829 unique glycoforms were identified across 208 glycosylation sites from 95 plasma glycoproteins, a significant improvement from the initial method comparison and one of the most extensive site-specific glycosylation analyses in immunodepleted human plasma to date. Data are available via ProteomeXchange with identifier PXD005655.

  6. Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.

    PubMed

    Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed

    2015-11-05

    Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (P<0.001), independent of age, height, weight, and running speed. When these two groups were compared to PFP runners, one cluster exhibited greater while the other exhibited reduced peak knee abduction angles (P<0.05). The variability observed in running patterns across this sample could be the result of different gait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Meta-analysis of transcriptomic datasets identifies genes enriched in the mammalian circadian pacemaker.

    PubMed

    Brown, Laurence A; Williams, John; Taylor, Lewis; Thomson, Ross J; Nolan, Patrick M; Foster, Russell G; Peirson, Stuart N

    2017-09-29

    The master circadian pacemaker in mammals is located in the suprachiasmatic nuclei (SCN) which regulate physiology and behaviour, as well as coordinating peripheral clocks throughout the body. Investigating the function of the SCN has often focused on the identification of rhythmically expressed genes. However, not all genes critical for SCN function are rhythmically expressed. An alternative strategy is to characterize those genes that are selectively enriched in the SCN. Here, we examined the transcriptome of the SCN and whole brain (WB) of mice using meta-analysis of publicly deposited data across a range of microarray platforms and RNA-Seq data. A total of 79 microarrays were used (24 SCN and 55 WB samples, 4 different microarray platforms), alongside 17 RNA-Seq data files (7 SCN and 10 WB). 31 684 MGI gene symbols had data for at least one platform. Meta-analysis using a random effects model for weighting individual effect sizes (derived from differential expression between relevant SCN and WB samples) reliably detected known SCN markers. SCN-enriched transcripts identified in this study provide novel insights into SCN function, including identifying genes which may play key roles in SCN physiology or provide SCN-specific drivers. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
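
    The abstract above mentions a random effects model for weighting individual effect sizes but does not say which estimator was used. As a hedged sketch, the following pools per-platform effect sizes with the DerSimonian-Laird estimator, which is one common choice and is an assumption here; the function name and inputs are hypothetical.

```python
def random_effects_pool(effects, variances):
    """Pool effect sizes with DerSimonian-Laird random-effects weights.

    effects:   per-study (or per-platform) effect sizes
    variances: their within-study sampling variances
    Returns (pooled effect, standard error, between-study variance tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2


# Hypothetical effect sizes for one gene on two platforms.
pooled, se, tau2 = random_effects_pool([0.0, 2.0], [0.1, 0.1])
```

    When the individual estimates disagree by more than their sampling variances explain, tau^2 becomes positive, the weights even out across studies, and the pooled standard error widens.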

  8. Partial Safety Analysis for a Reduced Uranium Enrichment Core for the High Flux Isotope Reactor

    SciTech Connect

    Primm, Trent; Gehin, Jess C

    2009-04-01

    A computational model of the reactor core of the High Flux Isotope Reactor (HFIR) was developed in order to analyze non-destructive accidents caused by transients during reactor operation. The reactor model was built for the latest version of the nuclear analysis software package called Program for the Analysis of Reactor Transients (PARET). Analyses performed with the model constructed were compared with previous data obtained with other tools in order to benchmark the code. Finally, the model was used to analyze the behavior of the reactor under transients using a different nuclear fuel with lower enrichment of uranium (LEU) than the fuel currently used, which has a high enrichment of uranium (HEU). The study shows that the presence of fertile isotopes in LEU fuel, which increases the neutron resonance absorption, reduces the impact of transients on the fuel and enhances the negative reactivity feedback, thus, within the limitations of this study, making LEU fuel appear to be a safe alternative fuel for the reactor core.

  9. Proteomic analysis of the mouse brain following protein enrichment by preparative electrophoresis.

    PubMed

    Xixi, Elena; Dimitraki, Ploumisti; Vougas, Kostantinos; Kossida, Sofia; Lubec, Gert; Fountoulakis, Michael

    2006-04-01

    Proteomics is a powerful technology to study the identity and levels of brain proteins. Changes of protein levels as well as modifications that occur in neurological disorders may be informative for the pathogenesis of these disorders and could result in the identification of potential drug targets and disease markers. To increase the capability of characterizing complex protein profiles, protein mixtures should be separated into simpler fractions, thus increasing the likelihood of detecting low-abundance proteins. Considering that low-abundance proteins are thought to be involved in important biological processes, identification of those low-copy-number gene products appears to be a scientific challenge. In the present study, proteomic analysis of adult mouse brain tissue was performed following enrichment by preparative electrophoresis. This was performed using the PrepCell apparatus in the presence of 0.1% lithium dodecyl sulfate. Samples were electrophoresed in a cylindrical polyacrylamide gel and the proteins of the fractions collected were first analyzed by 1-D and then by 2-DE. Protein identification was performed by MALDI-TOF-MS. The present analysis resulted in the identification of 360 different gene products. Among those were transport proteins, transcription activators, signal transduction molecules as well as proteins with a number of other functions. Preparative electrophoresis is an efficient method for the enrichment of proteins of low molecular mass and may be useful in the investigation of disorders of the central nervous system.

  10. Length bias correction in gene ontology enrichment analysis using logistic regression.

    PubMed

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
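
    One way to write the kind of model the abstract above describes is given below. The symbols are assumptions introduced for illustration, and treating category membership as the response is also an assumption, since the abstract does not state which variable plays that role.

```latex
% For gene i (all symbols are illustrative assumptions):
%   g_i    = 1 if gene i belongs to the Gene Ontology category, else 0
%   s_i    = a measure of significance from the differential expression test
%   \ell_i = transcript length (possibly transformed)
\operatorname{logit}\,\Pr(g_i = 1)
  = \beta_0 + \beta_1 s_i + \beta_2 \ell_i
```

    Under this form, the coefficient on s_i describes the association between category membership and differential expression significance conditional on transcript length, so testing whether it is zero would address enrichment after the length adjustment.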

  11. Length Bias Correction in Gene Ontology Enrichment Analysis Using Logistic Regression

    PubMed Central

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S.; Chang, Jeff H.

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called “length bias”, will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible. PMID:23056249

  12. Analysis of data separation and recovery problems using clustered sparsity

    NASA Astrophysics Data System (ADS)

    King, Emily J.; Kutyniok, Gitta; Zhuang, Xiaosheng

    2011-09-01

    Data often have two or more fundamental components, like cartoon-like and textured elements in images; point, filament, and sheet clusters in astronomical data; and tonal and transient layers in audio signals. For many applications, separating these components is of interest. Another issue in data analysis is that of incomplete data, for example a photograph with scratches or seismic data collected with fewer sensors than necessary. A unified approach to solving both problems is to minimize the l1 norm of the analysis coefficients with respect to particular frame(s). This approach, analyzed using the concept of clustered sparsity, leads to similar theoretical bounds and results, which are presented here. Furthermore, necessary conditions for the frames to lead to sufficiently good solutions are also shown.
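    The l1 minimization mentioned in this abstract is commonly stated as follows. This is a sketch under assumed notation, not the paper's own formulation: x is the observed signal, \Phi_1 and \Phi_2 (or a single \Phi) are the frames, \Phi^{T} x denotes the analysis coefficients, and P_K is a projection onto the known part of the data.

```latex
% Separation: split x into two components, each sparse in its own frame
(x_1^\star, x_2^\star) = \arg\min_{x_1, x_2}
  \|\Phi_1^{T} x_1\|_1 + \|\Phi_2^{T} x_2\|_1
  \quad \text{subject to} \quad x_1 + x_2 = x

% Recovery: fill in missing data while agreeing with the known part
x^\star = \arg\min_{\tilde{x}} \|\Phi^{T} \tilde{x}\|_1
  \quad \text{subject to} \quad P_K \tilde{x} = P_K x
```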

  13. Segment clustering methodology for unsupervised Holter recordings analysis

    NASA Astrophysics Data System (ADS)

    Rodríguez-Sotelo, Jose Luis; Peluffo-Ordoñez, Diego; Castellanos Dominguez, German

    2015-01-01

    Cardiac arrhythmia analysis of Holter recordings is an important issue in clinical settings; however, it also involves handling a large amount of unlabelled data, which implies a high computational cost. In this work, an unsupervised methodology based on a segment framework is presented. It consists of dividing the raw data into a balanced number of segments in order to identify fiducial points and then characterize and cluster the heartbeats in each segment separately. The resulting clusters are merged or split according to an assumed criterion of homogeneity. This framework offsets the high computational cost of Holter analysis, making its implementation possible in future real-time applications. The performance of the method is measured on the records from the MIT/BIH arrhythmia database, using the database labels, and achieves high sensitivity and specificity for a broad range of the heartbeat types recommended by the AAMI.

  14. An Interpretation of the Boshier-Collins Cluster Analysis Testing Houle's Typology.

    ERIC Educational Resources Information Center

    Furst, Edward J.

    1986-01-01

    This article speculates on an underlying order obscured by the details of the Boshier-Collins cluster analysis and the mapping of Houle's types onto it. A table illustrates an interpretation of cluster analysis on Boshier's Education Participation Scale. (CT)

  15. Sensory over responsivity and obsessive compulsive symptoms: A cluster analysis.

    PubMed

    Ben-Sasson, Ayelet; Podoly, Tamar Yonit

    2017-02-01

    Several studies have examined the sensory component in Obsessive Compulsive Disorder (OCD) and described an OCD subtype which has a unique profile, and that Sensory Phenomena (SP) is a significant component of this subtype. SP has some commonalities with Sensory Over Responsivity (SOR) and might be in part a characteristic of this subtype. Although there are some studies that have examined SOR and its relation to Obsessive Compulsive Symptoms (OCS), the literature lacks sufficient data on this interplay. Our first goal was to further examine the correlations between OCS and SOR, and to explore the correlations between SOR modalities (i.e. smell, touch, etc.) and OCS subscales (i.e. washing, ordering, etc.). Our second goal was to investigate the cluster analysis of SOR and OCS dimensions in adults, that is, to classify the sample using the sensory scores to find whether a sensory OCD subtype can be specified. Our third goal was to explore the psychometric features of a new sensory questionnaire: the Sensory Perception Quotient (SPQ). A sample of non-clinical adults (n=350) was recruited via e-mail, social media and social networks. Participants completed questionnaires for measuring SOR, OCS, and anxiety. SOR and OCI-F scores were moderately and significantly correlated (n=274); significant correlations between all SOR modalities and OCS subscales were found, with no specific higher correlation between any one modality and any one OCS subscale. Cluster analysis revealed four distinct clusters: (1) No OC and SOR symptoms (NONE; n=100), (2) High OC and SOR symptoms (BOTH; n=28), (3) Moderate OC symptoms (OCS; n=63), (4) Moderate SOR symptoms (SOR; n=83). The BOTH cluster had significantly higher anxiety levels than the other clusters, and shared OC subscales scores with the OCS cluster. The BOTH cluster also reported higher SOR scores across tactile, vision, taste and olfactory modalities. The SPQ was found to be reliable and suitable for detecting SOR; the sample SPQ scores were normally distributed (n=350). SOR is a

  16. Assessing intraplate volcano compositional similarities with cluster analysis

    NASA Astrophysics Data System (ADS)

    Konter, J. G.

    2012-12-01

    The compositional variation in intraplate volcanoes is commonly assessed as a function of end-members that were recognized as extrema in a 3D space, defined by radiogenic isotope ratios. The specific isotope ratios used are the principal components in the intraplate volcano compositional data set, and by reducing the dimensionality of the data set to 3, groupings and trends in the data can be visually identified. Such groupings can then be used to compare to other geochemical or geophysical data sets (e.g. correlations with seismic models). A complementary approach in examining groupings and trends in a data set is the use of cluster analysis, which can be used to recognize groups of similar intraplate volcanic systems. Since it is not known a priori how many clusters may exist, hierarchical cluster analysis can be used to examine the relationships between individual intraplate volcanic systems. The technique compares the Euclidean distance between the data available at the different locations, and these data can have a large number of dimensions. The results can be visualized as a dendrogram, where individual locations are represented by different branches (or leaves) that join at different distances. We use Matlab to examine the data extracted from pre-compiled GEOROC database files, including location, major elements, large ion lithophile elements, high field strength elements, rare earth elements and radiogenic isotopes. These data do not vary over the same range in values and are therefore first normalized by the total range in the data set for each particular element or isotope ratio. Since multiple samples have been analyzed for most intraplate volcanic systems, we assess the results for the average, the maximum, and the minimum values for each element. In addition, we investigate the robustness of the outcome by removing one element at a time from the data set and recalculating a new dendrogram. One of the outcomes is that the resulting clusters seem to
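    The workflow in this abstract (range normalization, Euclidean distances, hierarchical clustering, dendrogram) can be illustrated with a minimal sketch. The authors report using Matlab; the Python below is only an illustration, the system labels and values are hypothetical, and average linkage is an assumption, since the abstract does not name a linkage rule.

```python
import math

def range_normalize(rows):
    """Rescale every column to [0, 1] by dividing by its total range."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # Guard against constant columns, whose range would be zero.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)]
            for row in rows]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(points, labels):
    """Agglomerative clustering with average linkage (an assumed choice).

    Returns the merge history as (cluster name, merge distance) pairs,
    which is the information a dendrogram draws.
    """
    members = {i: [i] for i in range(len(points))}
    names = dict(enumerate(labels))
    merges = []
    next_id = len(points)
    while len(members) > 1:
        ids = sorted(members)
        best = None
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                pairs = [(i, j) for i in members[a] for j in members[b]]
                dist = sum(euclidean(points[i], points[j])
                           for i, j in pairs) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        members[next_id] = members.pop(a) + members.pop(b)
        names[next_id] = f"({names.pop(a)},{names.pop(b)})"
        merges.append((names[next_id], dist))
        next_id += 1
    return merges

# Hypothetical systems with two made-up variables on very different scales.
labels = ["A", "B", "C", "D"]
raw = [[1.0, 10.0], [1.2, 11.0], [5.0, 30.0], [5.5, 28.0]]
merges = average_linkage(range_normalize(raw), labels)
for name, dist in merges:
    print(f"{name} joined at distance {dist:.3f}")
```

    The robustness check mentioned in the abstract would correspond to deleting one column of `raw` at a time and comparing the resulting merge histories.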

  17. Sensitive and inexpensive digital DNA analysis by microfluidic enrichment of rolling circle amplified single-molecules

    PubMed Central

    Kühnemund, Malte; Hernández-Neuta, Iván; Sharif, Mohd Istiaq; Cornaglia, Matteo; Gijs, Martin A.M.

    2017-01-01

    Single molecule quantification assays provide the ultimate sensitivity and precision for molecular analysis. However, most digital analysis techniques, e.g. droplet PCR, require sophisticated and expensive instrumentation for molecule compartmentalization, amplification and analysis. Rolling circle amplification (RCA) provides a simpler means for digital analysis. Nevertheless, the sensitivity of RCA assays has until now been limited by inefficient detection methods. We have developed a simple microfluidic strategy for enrichment of RCA products into a single field of view of a low magnification fluorescent sensor, enabling ultra-sensitive digital quantification of nucleic acids over a dynamic range from 1.2 aM to 190 fM. We prove the broad applicability of our analysis platform by demonstrating 5-plex detection of as little as ∼1 pg (∼300 genome copies) of pathogenic DNA with simultaneous antibiotic resistance marker detection, and the analysis of rare oncogene mutations. Our method is simpler, more cost-effective and faster than other digital analysis techniques and provides the means to implement digital analysis in any laboratory equipped with a standard fluorescent microscope. PMID:28077562

  18. Sensitive and inexpensive digital DNA analysis by microfluidic enrichment of rolling circle amplified single-molecules.

    PubMed

    Kühnemund, Malte; Hernández-Neuta, Iván; Sharif, Mohd Istiaq; Cornaglia, Matteo; Gijs, Martin A M; Nilsson, Mats

    2017-05-05

    Single molecule quantification assays provide the ultimate sensitivity and precision for molecular analysis. However, most digital analysis techniques, e.g. droplet PCR, require sophisticated and expensive instrumentation for molecule compartmentalization, amplification and analysis. Rolling circle amplification (RCA) provides a simpler means for digital analysis. Nevertheless, the sensitivity of RCA assays has until now been limited by inefficient detection methods. We have developed a simple microfluidic strategy for enrichment of RCA products into a single field of view of a low magnification fluorescent sensor, enabling ultra-sensitive digital quantification of nucleic acids over a dynamic range from 1.2 aM to 190 fM. We prove the broad applicability of our analysis platform by demonstrating 5-plex detection of as little as ∼1 pg (∼300 genome copies) of pathogenic DNA with simultaneous antibiotic resistance marker detection, and the analysis of rare oncogene mutations. Our method is simpler, more cost-effective and faster than other digital analysis techniques and provides the means to implement digital analysis in any laboratory equipped with a standard fluorescent microscope. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis

    PubMed Central

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao

    2015-01-01

    Due to the advancement in sensor technology, the growing volume of medical image data makes it possible to visualize the anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383

  20. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis.

    PubMed

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao

    2015-01-01

    Due to the advancement in sensor technology, the growing volume of medical image data makes it possible to visualize the anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis.

  1. Transcriptomic analysis of the effects of a fish oil enriched diet on murine brains.

    PubMed

    Hammamieh, Rasha; Chakraborty, Nabarun; Gautam, Aarti; Miller, Stacy-Ann; Muhie, Seid; Meyerhoff, James; Jett, Marti

    2014-01-01

    The health benefits of fish oil enriched with high omega-3 polyunsaturated fatty acids (n-3 PUFA) are widely documented. Fish oil as dietary supplements, however, show moderate clinical efficacy, highlighting an immediate scope of systematic in vitro feedback. Our transcriptomic study was designed to investigate the genomic shift of murine brains fed on fish oil enriched diets. A customized fish oil enriched diet (FD) and standard lab diet (SD) were separately administered to two randomly chosen populations of C57BL/6J mice from their weaning age until late adolescence. Statistical analysis mined 1,142 genes of interest (GOI) differentially altered in the hemibrains collected from the FD- and SD-fed mice at the age of five months. The majority of identified GOI (∼ 40%) encodes proteins located in the plasma membrane, suggesting that fish oil primarily facilitated the membrane-oriented biofunctions. FD potentially augmented the nervous system's development and functions by selectively stimulating the Src-mediated calcium-induced growth cascade and the downstream PI3K-AKT-PKC pathways. FD reduced the amyloidal burden, attenuated oxidative stress, and assisted in somatostatin activation-the signatures of attenuation of Alzheimer's disease, Parkinson's disease, and affective disorder. FD induced elevation of FKBP5 and suppression of BDNF, which are often linked with the improvement of anxiety disorder, depression, and post-traumatic stress disorder. Hence we anticipate efficacy of FD in treating illnesses such as depression that are typically triggered by the hypoactivities of dopaminergic, adrenergic, cholinergic, and GABAergic networks. Contrastingly, FD's efficacy could be compromised in treating illnesses such as bipolar disorder and schizophrenia, which are triggered by hyperactivities of the same set of neuromodulators. A more comprehensive investigation is recommended to elucidate the implications of fish oil on disease pathomechanisms, and the result

  2. Transcriptomic Analysis of the Effects of a Fish Oil Enriched Diet on Murine Brains

    PubMed Central

    Gautam, Aarti; Miller, Stacy-Ann; Muhie, Seid; Meyerhoff, James; Jett, Marti

    2014-01-01

    The health benefits of fish oil enriched with high omega-3 polyunsaturated fatty acids (n-3 PUFA) are widely documented. Fish oil as dietary supplements, however, show moderate clinical efficacy, highlighting an immediate scope of systematic in vitro feedback. Our transcriptomic study was designed to investigate the genomic shift of murine brains fed on fish oil enriched diets. A customized fish oil enriched diet (FD) and standard lab diet (SD) were separately administered to two randomly chosen populations of C57BL/6J mice from their weaning age until late adolescence. Statistical analysis mined 1,142 genes of interest (GOI) differentially altered in the hemibrains collected from the FD- and SD-fed mice at the age of five months. The majority of identified GOI (∼40%) encodes proteins located in the plasma membrane, suggesting that fish oil primarily facilitated the membrane-oriented biofunctions. FD potentially augmented the nervous system's development and functions by selectively stimulating the Src-mediated calcium-induced growth cascade and the downstream PI3K-AKT-PKC pathways. FD reduced the amyloidal burden, attenuated oxidative stress, and assisted in somatostatin activation—the signatures of attenuation of Alzheimer's disease, Parkinson's disease, and affective disorder. FD induced elevation of FKBP5 and suppression of BDNF, which are often linked with the improvement of anxiety disorder, depression, and post-traumatic stress disorder. Hence we anticipate efficacy of FD in treating illnesses such as depression that are typically triggered by the hypoactivities of dopaminergic, adrenergic, cholinergic, and GABAergic networks. Contrastingly, FD's efficacy could be compromised in treating illnesses such as bipolar disorder and schizophrenia, which are triggered by hyperactivities of the same set of neuromodulators. A more comprehensive investigation is recommended to elucidate the implications of fish oil on disease pathomechanisms, and the result

  3. Fuzzy cluster analysis of high-field functional MRI data.

    PubMed

    Windischberger, Christian; Barth, Markus; Lamm, Claus; Schroeder, Lee; Bauer, Herbert; Gur, Ruben C; Moser, Ewald

    2003-11-01

    Functional magnetic resonance imaging (fMRI) based on blood-oxygen level dependent (BOLD) contrast is today an established brain research method and is quickly gaining acceptance for complementary clinical diagnosis. However, neither are the basic mechanisms, like the coupling between neuronal activation and haemodynamic response, known exactly, nor can the various artifacts be predicted or controlled. Thus, modeling functional signal changes is non-trivial and exploratory data analysis (EDA) may be rather useful. In particular, identification and separation of artifacts as well as quantification of expected, i.e. stimulus correlated, and novel information on brain activity is important both for new insights in neuroscience and for future developments in functional MRI of the human brain. After an introduction on fuzzy clustering and very high-field fMRI we present several examples where fuzzy cluster analysis (FCA) of fMRI time series helps to identify and locally separate various artifacts. We also present and discuss applications and limitations of fuzzy cluster analysis in very high-field functional MRI: differentiating temporal patterns in MRI using (a) a test object with static and dynamic parts, (b) artifacts due to gross head motion. Using a synthetic fMRI data set we quantitatively examine the influences of relevant FCA parameters on clustering results in terms of receiver-operator characteristics (ROC) and compare them with a commonly used model-based correlation analysis (CA) approach. The application of FCA in analyzing in vivo fMRI data is shown for (a) a motor paradigm, (b) data from multi-echo imaging, and (c) an fMRI study using mental rotation of three-dimensional cubes. We found that differentiation of true "neural" from false "vascular" activation is possible based on echo time dependence and specific activation levels, as well as based on their signal time-course. Exploratory data analysis methods in general and fuzzy cluster analysis in particular may

  4. ICAP - An Interactive Cluster Analysis Procedure for analyzing remotely sensed data

    NASA Technical Reports Server (NTRS)

    Wharton, S. W.; Turner, B. J.

    1981-01-01

    An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. ICAP differs from conventional clustering algorithms by allowing the analyst to optimize the cluster configuration by inspection, rather than by manipulating process parameters. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters, and the analyst, who can evaluate and elect to modify the cluster structure. Clusters can be deleted, or lumped together pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The principal advantage of this approach is that it allows prior information (when available) to be used directly in the analysis, since the analyst interacts with ICAP in a straightforward manner, using basic terms with which he is more likely to be familiar. Results from testing ICAP showed that an informed use of ICAP can improve classification, as compared to an existing cluster analysis procedure.

  5. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.

    PubMed

    Kuleshov, Maxim V; Jones, Matthew R; Rouillard, Andrew D; Fernandez, Nicolas F; Duan, Qiaonan; Wang, Zichen; Koplev, Simon; Jenkins, Sherry L; Jagodnik, Kathleen M; Lachmann, Alexander; McDermott, Michael G; Monteiro, Caroline D; Gundersen, Gregory W; Ma'ayan, Avi

    2016-07-08

    Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180 184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr including the ability to submit fuzzy sets, upload BED files, improved application programming interface and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr.
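    To make "enrichment analysis" concrete, the sketch below computes a generic over-representation p-value from the hypergeometric distribution. It illustrates the general idea only and is not presented as the statistic Enrichr reports; the function name, parameter names, and gene counts are all hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes that belong to the gene set
    selected:   size of the submitted gene list
    overlap:    submitted genes that belong to the gene set
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 background genes, 5 in the set,
# 4 submitted genes of which 3 fall in the set.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"enrichment p-value: {p_value:.4f}")
```

    A small p-value indicates that the overlap is larger than expected if the submitted genes were drawn at random from the background. In practice the result depends on the chosen background and needs correction for testing many gene sets.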

  6. Importance of collection in gene set enrichment analysis of drug response in cancer cell lines

    PubMed Central

    Bateman, Alain R.; El-Hachem, Nehme; Beck, Andrew H.; Aerts, Hugo J. W. L.; Haibe-Kains, Benjamin

    2014-01-01

    Gene set enrichment analysis (GSEA) associates gene sets and phenotypes; its use is predicated on the choice of a pre-defined collection of sets. The de facto standard implementation of GSEA provides seven collections, yet there are no guidelines for the choice of collections, and the impact of such choice, if any, is unknown. Here we compare each of the standard gene set collections in the context of a large dataset of drug response in human cancer cell lines. We define and test a new collection based on gene co-expression in cancer cell lines to compare the performance of the standard collections to an externally derived cell line based collection. The results show that GSEA findings vary significantly depending on the collection chosen for analysis. Henceforth, collections should be carefully selected and reported in studies that leverage GSEA. PMID:24522610

  7. Psychosocial Costs of Racism to Whites: Exploring Patterns through Cluster Analysis

    ERIC Educational Resources Information Center

    Spanierman, Lisa B.; Poteat, V. Paul; Beer, Amanda M.; Armstrong, Patrick Ian

    2006-01-01

    Participants (230 White college students) completed the Psychosocial Costs of Racism to Whites (PCRW) Scale. Using cluster analysis, we identified 5 distinct cluster groups on the basis of PCRW subscale scores: the unempathic and unaware cluster contained the lowest empathy scores; the insensitive and afraid cluster consisted of low empathy and…

  8. A Monte Carlo Analysis of Gas Centrifuge Enrichment Plant Process Load Cell Data

    SciTech Connect

    Garner, James R; Whitaker, J Michael

    2013-01-01

    As uranium enrichment plants increase in number, capacity, and types of separative technology deployed (e.g., gas centrifuge, laser, etc.), more automated safeguards measures are needed to enable the IAEA to maintain safeguards effectiveness in a fiscally constrained environment. Monitoring load cell data can significantly increase the IAEA's ability to efficiently achieve the fundamental safeguards objective of confirming operations as declared (i.e., no undeclared activities), but care must be taken to fully protect the operator's proprietary and classified information related to operations. Staff at ORNL, LANL, JRC/ISPRA, and University of Glasgow are investigating monitoring the process load cells at feed and withdrawal (F/W) stations to improve international safeguards at enrichment plants. A key question that must be resolved is what is the necessary frequency of recording data from the process F/W stations? Several studies have analyzed data collected at a fixed frequency. This paper contributes to load cell process monitoring research by presenting an analysis of Monte Carlo simulations to determine the expected errors caused by low frequency sampling and its impact on material balance calculations.
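    The question of how sampling frequency affects recorded mass can be illustrated with a toy Monte Carlo sketch. The model is an assumption made for illustration and is not the authors' simulation: a cylinder fills at a constant rate until it is removed at a random time, and the last load cell reading before removal is taken as the recorded mass. The fill rate, fill durations, and function name are hypothetical.

```python
import random

def simulate_mean_shortfall(fill_rate_kg_per_h, sample_interval_h, trials, seed=0):
    """Average mass missed when only periodic load cell readings exist."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total_shortfall = 0.0
    for _ in range(trials):
        # Hypothetical fill duration, random relative to the sampling grid.
        fill_duration_h = rng.uniform(10.0, 20.0)
        true_mass = fill_rate_kg_per_h * fill_duration_h
        # Time of the last reading taken before the cylinder is removed.
        last_sample_h = (fill_duration_h // sample_interval_h) * sample_interval_h
        recorded_mass = fill_rate_kg_per_h * last_sample_h
        total_shortfall += true_mass - recorded_mass
    return total_shortfall / trials

# Compare hourly readings with readings every six minutes.
coarse = simulate_mean_shortfall(2.0, 1.0, 20000)
fine = simulate_mean_shortfall(2.0, 0.1, 20000)
print(f"mean shortfall, 1 h interval:   {coarse:.3f} kg")
print(f"mean shortfall, 0.1 h interval: {fine:.3f} kg")
```

    In this toy model the mean shortfall is about half of the fill rate times the sampling interval, so it shrinks in proportion to the interval. A realistic study would also need measurement noise and actual process profiles, which this sketch leaves out.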

  9. Phylogenetic analysis of anaerobic psychrophilic enrichment cultures obtained from a greenland glacier ice core

    NASA Technical Reports Server (NTRS)

    Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.

    2003-01-01

    The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at -9 degrees C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 x 10(7) cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at -2 degrees C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years.

  10. Phylogenetic analysis of anaerobic psychrophilic enrichment cultures obtained from a greenland glacier ice core

    NASA Technical Reports Server (NTRS)

    Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.

    2003-01-01

    The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at -9 degrees C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 x 10(7) cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at -2 degrees C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years.

  11. Phylogenetic analysis of anaerobic psychrophilic enrichment cultures obtained from a greenland glacier ice core.

    PubMed

    Sheridan, Peter P; Miteva, Vanya I; Brenchley, Jean E

    2003-04-01

    The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at -9 degrees C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 x 10(7) cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at -2 degrees C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years.

  12. Phylogenetic Analysis of Anaerobic Psychrophilic Enrichment Cultures Obtained from a Greenland Glacier Ice Core

    PubMed Central

    Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.

    2003-01-01

    The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at −9°C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 × 10^7 cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at −2°C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years. PMID:12676695

  13. Cluster Analysis of the International Stellarator Confinement Database

    NASA Astrophysics Data System (ADS)

    Kus, A.; Dinklage, A.; Preuss, R.; Ascasibar, E.; Harris, J. H.; Okamura, S.; Sano, F.; Stroth, U.; Talmadge, J.; Yamada, H.

    2008-03-01

    Heterogeneous structure of collected data is one of the problems that occur during derivation of scalings for energy confinement time, and its analysis turns out to be a wide and complicated matter. The International Stellarator Confinement Database [1], ISCDB for short, comprises in its latest version 21 a total of 3647 observations from 8 experimental devices, 2067 of which have so far been completed for upcoming analyses. For confinement scaling studies, 1933 observations were chosen as the standard dataset. Here we describe a statistical method of cluster analysis for identification of possible cohesive substructures in ISCDB and present some preliminary results.

  14. ACPA: automated cluster plot analysis of genotype data

    PubMed Central

    2009-01-01

    Genome-wide association studies have become standard in genetic epidemiology. Analyzing hundreds of thousands of markers simultaneously imposes some challenges for statisticians. One issue is the problem of multiplicity, which has been compared with the search for the needle in a haystack. To reduce the number of false-positive findings, a number of quality filters such as exclusion of single-nucleotide polymorphisms (SNPs) with a high missing fraction are employed. Another filter is exclusion of SNPs for which the calling algorithm had difficulties in assigning the genotypes. The only way to do this is the visual inspection of the cluster plots, also termed signal intensity plots, but this approach is often neglected. We developed an algorithm ACPA (automated cluster plot analysis), which performs this task automatically for autosomal SNPs. It is based on counting samples that lie too close to the cluster of a different genotype; SNPs are excluded when a certain threshold is exceeded. We evaluated ACPA using 1,000 randomly selected quality controlled SNPs from the Framingham Heart Study data that were provided for the Genetic Analysis Workshop 16. We compared the decision of ACPA with the decision made by two independent readers. We achieved a sensitivity of 88% (95% CI: 81%-93%) and a specificity of 86% (95% CI: 83%-89%). In a screening setting in which one aims at not losing any good SNP, we achieved 99% (95% CI: 98%-100%) specificity and still detected every second low-quality SNP. PMID:20018051

  15. ACPA: automated cluster plot analysis of genotype data.

    PubMed

    Schillert, Arne; Schwarz, Daniel F; Vens, Maren; Szymczak, Silke; König, Inke R; Ziegler, Andreas

    2009-12-15

    Genome-wide association studies have become standard in genetic epidemiology. Analyzing hundreds of thousands of markers simultaneously imposes some challenges for statisticians. One issue is the problem of multiplicity, which has been compared with the search for the needle in a haystack. To reduce the number of false-positive findings, a number of quality filters such as exclusion of single-nucleotide polymorphisms (SNPs) with a high missing fraction are employed. Another filter is exclusion of SNPs for which the calling algorithm had difficulties in assigning the genotypes. The only way to do this is the visual inspection of the cluster plots, also termed signal intensity plots, but this approach is often neglected. We developed an algorithm ACPA (automated cluster plot analysis), which performs this task automatically for autosomal SNPs. It is based on counting samples that lie too close to the cluster of a different genotype; SNPs are excluded when a certain threshold is exceeded. We evaluated ACPA using 1,000 randomly selected quality controlled SNPs from the Framingham Heart Study data that were provided for the Genetic Analysis Workshop 16. We compared the decision of ACPA with the decision made by two independent readers. We achieved a sensitivity of 88% (95% CI: 81%-93%) and a specificity of 86% (95% CI: 83%-89%). In a screening setting in which one aims at not losing any good SNP, we achieved 99% (95% CI: 98%-100%) specificity and still detected every second low-quality SNP.

  16. Clustering Analysis of Fast-ion Driven Instabilities

    NASA Astrophysics Data System (ADS)

    Gresl, J.; Heidbrink, W. W.; Haskey, S.; Blackwell, B. D.

    2016-10-01

    Beam ions often drive Alfvén eigenmodes and other instabilities in DIII-D. Many of these modes have been unambiguously identified but some frequently occurring features have been neglected. In this work, datamining analysis techniques that successfully analyzed magnetics data from the H-1NF heliac are applied to arrays of magnetic and electron cyclotron emission (ECE) data from DIII-D. The techniques group instabilities with similar magnetic or ECE features into clusters. Once the clusters are found, a database of plasma parameters will facilitate mode identification. Work supported by the US Department of Energy under DE-FC02-04ER54698, DE-FG03-94ER54271, DE-AC02-09CH11466.

  17. Cluster: Mission Overview and End-of-Life Analysis

    NASA Technical Reports Server (NTRS)

    Pallaschke, S.; Munoz, I.; Rodriquez-Canabal, J.; Sieg, D.; Yde, J. J.

    2007-01-01

    The Cluster mission is part of the scientific programme of the European Space Agency (ESA) and its purpose is the analysis of the Earth's magnetosphere. The Cluster project consists of four satellites. The selected polar orbit has a shape of 4.0 and 19.2 Re which is required for performing measurements near the cusp and the tail of the magnetosphere. When crossing these regions the satellites form a constellation which in most of the cases so far has been a regular tetrahedron. The satellite operations are carried out by the European Space Operations Centre (ESOC) at Darmstadt, Germany. The paper outlines the future orbit evolution and the envisaged operations from a Flight Dynamics point of view. In addition a brief summary of the LEOP and routine operations is included beforehand.

  18. Accident patterns for construction-related workers: a cluster analysis

    NASA Astrophysics Data System (ADS)

    Liao, Chia-Wen; Tyan, Yaw-Yauan

    2011-12-01

    The construction industry has been identified as one of the most hazardous industries. The risk to construction-related workers is far greater than that in a manufacturing-based industry. However, some steps can be taken to reduce worker risk through effective injury prevention strategies. In this article, k-means clustering methodology is employed in specifying the factors related to different worker types and in identifying the patterns of industrial occupational accidents. Accident reports during the period 1998 to 2008 are extracted from case reports of the Northern Region Inspection Office of the Council of Labor Affairs of Taiwan. The results show that the cluster analysis can indicate some patterns of occupational injuries in the construction industry. Inspection plans should be proposed according to the type of construction-related workers. The findings provide a direction for more effective inspection strategies and injury prevention programs.

  19. Accident patterns for construction-related workers: a cluster analysis

    NASA Astrophysics Data System (ADS)

    Liao, Chia-Wen; Tyan, Yaw-Yauan

    2012-01-01

    The construction industry has been identified as one of the most hazardous industries. The risk to construction-related workers is far greater than that in a manufacturing-based industry. However, some steps can be taken to reduce worker risk through effective injury prevention strategies. In this article, k-means clustering methodology is employed in specifying the factors related to different worker types and in identifying the patterns of industrial occupational accidents. Accident reports during the period 1998 to 2008 are extracted from case reports of the Northern Region Inspection Office of the Council of Labor Affairs of Taiwan. The results show that the cluster analysis can indicate some patterns of occupational injuries in the construction industry. Inspection plans should be proposed according to the type of construction-related workers. The findings provide a direction for more effective inspection strategies and injury prevention programs.

  20. Where pseudo-hallucinations meet dissociation: a cluster analysis.

    PubMed

    Wearne, Deborah; Curtis, Guy J; Genetti, Amanda; Samuel, Mathew; Sebastian, Justin

    2017-08-01

    The possible link between cognitive areas of perception and integration of consciousness was examined using assessments of hallucinations and derealisation/depersonalization. Sixty-five subjects in three main diagnostic groups - posttraumatic stress disorder (PTSD), borderline personality disorder (BPD) and schizophrenia - identified by their treating psychiatrist as hearing voices were surveyed regarding characteristics of hallucinations, derealisation/depersonalization, delusions and childhood/adult trauma. A cluster analysis produced two clusters predominantly determined by variables of hallucinations measures, childhood sexual abuse and derealisation/depersonalization scores. History of childhood trauma and variability in derealisation/depersonalization scores were better predictors of external, negative, uncontrollable voices than diagnosis of BPD or PTSD. The potential links between dissociative states and pseudo-hallucinations are discussed.

  1. Generalized enrichment analysis improves the detection of adverse drug events from the biomedical literature.

    PubMed

    Winnenburg, Rainer; Shah, Nigam H

    2016-06-23

    Identification of associations between marketed drugs and adverse events from the biomedical literature assists drug safety monitoring efforts. Assessing the significance of such literature-derived associations and determining the granularity at which they should be captured remains a challenge. Here, we assess how defining a selection of adverse event terms from MeSH, based on information content, can improve the detection of adverse events for drugs and drug classes. We analyze a set of 105,354 candidate drug adverse event pairs extracted from article indexes in MEDLINE. First, we harmonize extracted adverse event terms by aggregating them into higher-level MeSH terms based on the terms' information content. Then, we determine statistical enrichment of adverse events associated with drug and drug classes using a conditional hypergeometric test that adjusts for dependencies among associated terms. We compare our results with methods based on disproportionality analysis (proportional reporting ratio, PRR) and quantify the improvement in signal detection with our generalized enrichment analysis (GEA) approach using a gold standard of drug-adverse event associations spanning 174 drugs and four events. For single drugs, the best GEA method (Precision: .92/Recall: .71/F1-measure: .80) outperforms the best PRR based method (.69/.69/.69) on all four adverse event outcomes in our gold standard. For drug classes, our GEA performs similarly (.85/.69/.74) when increasing the level of abstraction for adverse event terms. Finally, on examining the 1609 individual drugs in our MEDLINE set, which map to chemical substances in ATC, we find signals for 1379 drugs (10,122 unique adverse event associations) on applying GEA with p < 0.005. We present an approach based on generalized enrichment analysis that can be used to detect associations between drugs, drug classes and adverse events at a given level of granularity, at the same time correcting for known dependencies among
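    As a rough illustration of the two kinds of statistics this abstract compares, the sketch below computes a plain hypergeometric upper-tail p-value and a proportional reporting ratio (PRR) from counts. The function names and the example counts are hypothetical. The conditional test described in the abstract, which adjusts for dependencies among terms, is not reproduced here.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n items are drawn without replacement from N items,
    K of which are 'successes' (e.g. articles indexed with the event)."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 table:
    a = drug and event,        b = drug and other events,
    c = other drugs and event, d = other drugs and other events."""
    return (a / (a + b)) / (c / (c + d))


if __name__ == "__main__":
    # Hypothetical counts: 10 of 100 drug articles mention the event,
    # versus 5 of 100 articles on other drugs.
    print(prr(10, 90, 5, 95))
    print(hypergeom_upper_tail(10, 15, 100, 200))
```

    A small p-value suggests the event is over-represented among the drug's articles, while the PRR expresses the same contrast as a ratio of proportions.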

  2. Systematic Analysis of a Novel Human Renal Glomerulus-Enriched Gene Expression Dataset

    PubMed Central

    Lindenmeyer, Maja T.; Eichinger, Felix; Sen, Kontheari; Anders, Hans-Joachim; Edenhofer, Ilka; Mattinzoli, Deborah; Kretzler, Matthias; Rastaldi, Maria P.; Cohen, Clemens D.

    2010-01-01

    Glomerular diseases account for the majority of cases with chronic renal failure. Several genes have been identified with key relevance for glomerular function. Quite a few of these genes show a specific or preferential mRNA expression in the renal glomerulus. To identify additional candidate genes involved in glomerular function in humans we generated a human renal glomerulus-enriched gene expression dataset (REGGED) by comparing gene expression profiles from human glomeruli and tubulointerstitium obtained from six transplant living donors using Affymetrix HG-U133A arrays. This analysis resulted in 677 genes with prominent overrepresentation in the glomerulus. Genes with ‘a priori’ known prominent glomerular expression served for validation and were all found in the novel dataset (e.g. CDKN1, DAG1, DDN, EHD3, MYH9, NES, NPHS1, NPHS2, PDPN, PLA2R1, PLCE1, PODXL, PTPRO, SYNPO, TCF21, TJP1, WT1). The mRNA expression of several novel glomerulus-enriched genes in REGGED was validated by qRT-PCR. Gene ontology and pathway analysis identified biological processes previously not reported to be of relevance in glomeruli of healthy human adult kidneys including among others axon guidance. This finding was further validated by assessing the expression of the axon guidance molecules neuritin (NRN1) and roundabout receptor ROBO1 and -2. In diabetic nephropathy, a prevalent glomerulopathy, differential regulation of glomerular ROBO2 mRNA was found. In summary, novel transcripts with predominant expression in the human glomerulus could be identified using a comparative strategy on microdissected nephrons. A systematic analysis of this glomerulus-specific gene expression dataset allows the detection of target molecules and biological processes involved in glomerular biology and renal disease. PMID:20634963

  3. Galactic Pal-eontology: abundance analysis of the disrupting globular cluster Palomar 5

    NASA Astrophysics Data System (ADS)

    Koch, Andreas; Côté, Patrick

    2017-05-01

    We present a chemical abundance analysis of the tidally disrupted globular cluster (GC) Palomar 5. By co-adding high-resolution spectra of 15 member stars from the cluster's main body, taken at low signal-to-noise with the Keck/HIRES spectrograph, we were able to measure integrated abundance ratios of 24 species of 20 elements including all major nucleosynthetic channels (namely the light element Na; α-elements Mg, Si, Ca, Ti; Fe-peak and heavy elements Sc, V, Cr, Mn, Co, Ni, Cu, Zn; and the neutron-capture elements Y, Zr, Ba, La, Nd, Sm, Eu). The mean metallicity of -1.56 ± 0.02 ± 0.06 dex (statistical and systematic errors) agrees well with the values from individual, low-resolution measurements of individual stars, but it is lower than previous high-resolution results of a small number of stars in the literature. Comparison with Galactic halo stars and other disrupted and unperturbed GCs renders Pal 5 a typical representative of the Milky Way halo population, as has been noted before, emphasizing that the early chemical evolution of such clusters is decoupled from their later dynamical history. We also performed a test as to the detectability of light element variations in this co-added abundance analysis technique and found that this approach is not sensitive even in the presence of a broad range in sodium of 0.6 dex, a value typically found in the old halo GCs. Thus, while methods of determining the global abundance patterns of such objects are well suited to study their overall enrichment histories, chemical distinctions of their multiple stellar populations are still best obtained from measurements of individual stars. Full Table 3 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/601/A41

  4. Program Process, Costs and Consequences: A Comparative Analysis of YCCIP Enrichment, and a Guidebook for the Enrichment of Labor-Intensive Work Projects.

    ERIC Educational Resources Information Center

    Osoro and Associates, Bellingham, WA.

    This document contains (1) a monograph investigating and describing conditions under which it is cost-beneficial to operate an enriched YCCIP (Youth Community Conservation and Improvement Project) design and (2) a guidebook to work project enrichment. The first sections of the monograph focus on the attributes of an enriched YCCIP activity in…

  5. Analysis of forest fires spatial clustering using local fractal measure

    NASA Astrophysics Data System (ADS)

    Kanevski, Mikhail; Rochat, Mikael; Timonin, Vadim

    2013-04-01

    The research deals with an application of a local fractal measure (local sandbox counting, or mass counting) for the characterization of patterns of spatial clustering. The main application concerns the simulated (random patterns within validity domain in forest regions) and real data (forest fires in Ticino, Switzerland) case studies. The global patterns of spatial clustering of forest fires were extensively studied using different topological (nearest-neighbours, Voronoi polygons), statistical (Ripley's k-function, Morisita diagram) and fractal/multifractal measures (box-counting, sandbox counting, lacunarity) (Kanevski, 2008). Generalizations of these measures to functional ones can reveal the structure of the phenomena, e.g. burned areas. All these measures are valuable and complementary tools to study spatial clustering. Moreover, application of the validity domain (complex domain where the phenomenon is studied) concept helps in understanding and interpretation of the results. In the present paper a sandbox counting method was applied locally, i.e. each point of ignition was considered as a centre of events counting with an increasing search radius. Then, the local relationships between the radius and the number of ignition points within the given radius were examined. Finally, the results are mapped using an interpolation algorithm for visualization and analytical purposes. Both 2d (X,Y) and 3d (X,Y,Z) cases were studied and compared. Local "fractal" study gives an interesting spatially distributed picture of clustering. The real data case study was compared with a reference homogeneous pattern - complete spatial randomness. The difference between the two patterns clearly indicates the regions with important spatial clustering. An extension to the local functional measure was applied taking into account the surface of burned area, i.e. by analysing only data with the fires above some threshold of burned area. Such analysis is similar to marked point processes and
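    The local sandbox counting described in this abstract can be sketched as follows: for one centre point, count the events within each of several increasing radii, then take the least-squares slope of log N(r) against log r as a local "dimension". This is a minimal sketch under assumptions, not the authors' implementation. The function name, the choice of radii and the synthetic test patterns are hypothetical, and the centre point counts itself.

```python
import math


def local_sandbox_dimension(points, center, radii):
    """Least-squares slope of log N(r) versus log r around `center`,
    where N(r) is the number of points within distance r of the centre."""
    xs, ys = [], []
    for r in radii:
        n = sum(1 for p in points if math.dist(p, center) <= r)
        if n > 0:  # log is undefined for empty neighbourhoods
            xs.append(math.log(r))
            ys.append(math.log(n))
    if len(xs) < 2:
        raise ValueError("need at least two radii with non-empty counts")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic patterns: a filled 2d grid and points along a line.
    grid = [(i, j) for i in range(-40, 41) for j in range(-40, 41)]
    line = [(i, 0) for i in range(-100, 101)]
    radii = [4, 8, 16, 32]
    print(local_sandbox_dimension(grid, (0, 0), radii))  # close to 2
    print(local_sandbox_dimension(line, (0, 0), radii))  # close to 1
```

    Repeating the calculation with each ignition point as the centre gives one local value per point, which could then be interpolated onto a map as the abstract describes.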

  6. Enrichments for phototrophic bacteria and characterization by morphology and pigment analysis

    NASA Technical Reports Server (NTRS)

    Brune, D.

    1985-01-01

    The purpose of this investigation was to examine several sulfide containing environments for the presence of phototrophic bacteria and to obtain enriched cultures of some of the bacteria present. The field sites were Alum Rock State Park, the Palo Alto salt marsh, the bay area salt ponds, and Big Soda Lake (near Fallon, Nevada). Bacteria from these sites were characterized by microscopic examination, measurement of in vitro absorption spectra, and analysis of carotenoid pigments. Field observations at one of the bay area salt ponds, in which the salt concentration was saturating (about 30 percent NaCl) and the sediments along the shore of the pond covered with a gypsum crust, revealed a layer of purple photosynthetic bacteria under a green layer in the gypsum crust. Samples of this gypsum crust were taken to the laboratory to measure light transmission through the crust and to try to identify the purple photosynthetic bacteria present in this extremely saline environment.

  7. Global analysis of asymmetric RNA enrichment in oocytes reveals low conservation between closely related Xenopus species

    PubMed Central

    Claußen, Maike; Lingner, Thomas; Pommerenke, Claudia; Opitz, Lennart; Salinas, Gabriela; Pieler, Tomas

    2015-01-01

    RNAs that localize to the vegetal cortex during Xenopus laevis oogenesis have been reported to function in germ layer patterning, axis determination, and development of the primordial germ cells. Here we report on the genome-wide, comparative analysis of differentially localizing RNAs in Xenopus laevis and Xenopus tropicalis oocytes, revealing a surprisingly weak degree of conservation in respect to the identity of animally as well as vegetally enriched transcripts in these closely related species. Heterologous RNA injections and protein binding studies indicate that the different RNA localization patterns in these two species are due to gain/loss of cis-acting localization signals rather than to differences in the RNA-localizing machinery. PMID:26337391

  8. Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes

    PubMed Central

    Lees, John A.; Vehkala, Minna; Välimäki, Niko; Harris, Simon R.; Chewapreecha, Claire; Croucher, Nicholas J.; Marttinen, Pekka; Davies, Mark R.; Steer, Andrew C.; Tong, Steven Y. C.; Honkela, Antti; Parkhill, Julian; Bentley, Stephen D.; Corander, Jukka

    2016-01-01

    Bacterial genomes vary extensively in terms of both gene content and gene sequence. This plasticity hampers the use of traditional SNP-based methods for identifying all genetic associations with phenotypic variation. Here we introduce a computationally scalable and widely applicable statistical method (SEER) for the identification of sequence elements that are significantly enriched in a phenotype of interest. SEER is applicable to tens of thousands of genomes by counting variable-length k-mers using a distributed string-mining algorithm. Robust options are provided for association analysis that also correct for the clonal population structure of bacteria. Using large collections of genomes of the major human pathogens Streptococcus pneumoniae and Streptococcus pyogenes, SEER identifies relevant previously characterized resistance determinants for several antibiotics and discovers potential novel factors related to the invasiveness of S. pyogenes. We thus demonstrate that our method can answer important biologically and medically relevant questions. PMID:27633831
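    To make the idea of counting sequence elements by phenotype concrete, the sketch below tabulates, for fixed-length k-mers, how many case and control genomes contain each k-mer. It is a simplified illustration only. SEER itself counts variable-length k-mers with a distributed string-mining algorithm and corrects for population structure, neither of which is attempted here, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Set of distinct length-k substrings of `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_presence_by_phenotype(genomes, phenotypes, k):
    """Map each k-mer to (number of case genomes containing it,
    number of control genomes containing it)."""
    cases, controls = Counter(), Counter()
    for seq, is_case in zip(genomes, phenotypes):
        target = cases if is_case else controls
        target.update(kmers(seq, k))  # presence/absence, one count per genome
    return {km: (cases[km], controls[km]) for km in set(cases) | set(controls)}


if __name__ == "__main__":
    # Toy data: two "case" genomes and one "control" genome.
    genomes = ["ACGTAC", "ACGTTT", "GGGTTT"]
    phenotypes = [True, True, False]
    table = kmer_presence_by_phenotype(genomes, phenotypes, 3)
    for km in sorted(table):
        print(km, table[km])
```

    Each (cases, controls) pair could feed an association test of the analyst's choice. A k-mer found only in case genomes, such as "ACG" in the toy data, is the kind of element such a test would flag.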

  9. Analysis of omega 3 fatty acid in natural and enriched chicken eggs by capillary zone electrophoresis.

    PubMed

    Porto, Brenda Lee Simas; Souza, Marcus Vinicius Nora de; Oliveira, Marcone Augusto Leal de

    2011-01-01

    Qualitative differentiation between natural and enriched chicken eggs through omega (ω) 3 fatty acid profiles by capillary zone electrophoresis (CZE) under direct UV detection at 200 nm is proposed. The electrolyte background consisted of 12.0 mmol L(-1) tetraborate buffer (pH 9.2) mixed with 12.0 mmol L(-1) Brij 35, 17% acetonitrile (ACN) and 33% methanol (MeOH). Omega 3 fatty acid profiles in chicken egg samples were analyzed by the CZE system and confirmed by single-quadrupole mass spectrometry with an electrospray ionization probe set to negative ionization mode after sample preparation by the Folch method. The results showed that ω fatty acid profiles analyzed by the CZE approach can be used as chemical markers to monitor fraud, presenting simplicity, short analysis time (10 min) and low cost as advantages.

  10. Genome-wide linkage analysis for uric acid in families enriched for hypertension

    PubMed Central

    Rule, Andrew D.; Fridley, Brooke L.; Hunt, Steven C.; Asmann, Yan; Boerwinkle, Eric; Pankow, James S.; Mosley, Thomas H.; Turner, Stephen T.

    2009-01-01

    Background. Uric acid is heritable and associated with hypertension and insulin resistance. We sought to identify genomic regions influencing serum uric acid in families in which two or more siblings had hypertension. Methods. Uric acid levels and microsatellite markers were assayed in the Genetic Epidemiology Network of Arteriopathy (GENOA) cohort (1075 whites and 1333 blacks) and the Hypertension Genetic Epidemiology Network (HyperGEN) cohort (1542 whites and 1627 blacks). Genome-wide linkage analyses of uric acid and bivariate linkage analyses of uric acid with an additional surrogate of insulin resistance were completed. Pathway analysis explored gene sets enriched at loci influencing uric acid. Results. In the GENOA white cohort, loci influencing uric acid were identified on chromosome 8 at 135 cM [multipoint logarithm of odds score (MLS) = 2.4], on chromosome 9 at 113 cM (MLS = 3.7) and on chromosome 16 at 93 cM (MLS = 2.3), but did not replicate in HyperGEN. At these loci, there was evidence of pleiotropy with other surrogates of insulin resistance and genes in the fructose and mannose metabolism pathway were enriched. In the HyperGEN-black cohort, there was some evidence of a locus for uric acid on chromosome 4 at 135 cM (MLS = 2.4) that had modest replication in GENOA (MLS = 1.2). Conclusions. Several novel loci linked to uric acid were identified but none showed clear replication. Widespread diuretic use, a medication that raises uric acid levels, was an important study limitation. Bivariate linkage analyses and pathway analysis were consistent with genes regulating insulin resistance and fructose metabolism contributing to the heritability of uric acid. PMID:19258383

  11. Comparative analysis of metagenomes from three methanogenic hydrocarbon-degrading enrichment cultures with 41 environmental samples

    PubMed Central

    Tan, Boonfei; Jane Fowler, S; Laban, Nidal Abu; Dong, Xiaoli; Sensen, Christoph W; Foght, Julia; Gieg, Lisa M

    2015-01-01

    Methanogenic hydrocarbon metabolism is a key process in subsurface oil reservoirs and hydrocarbon-contaminated environments and thus warrants greater understanding to improve current technologies for fossil fuel extraction and bioremediation. In this study, three hydrocarbon-degrading methanogenic cultures established from two geographically distinct environments and incubated with different hydrocarbon substrates (added as single hydrocarbons or as mixtures) were subjected to metagenomic and 16S rRNA gene pyrosequencing to test whether these differences affect the genetic potential and composition of the communities. Enrichment of different putative hydrocarbon-degrading bacteria in each culture appeared to be substrate dependent, though all cultures contained both acetate- and H2-utilizing methanogens. Despite differing hydrocarbon substrates and inoculum sources, all three cultures harbored genes for hydrocarbon activation by fumarate addition (bssA, assA, nmsA) and carboxylation (abcA, ancA), along with those for associated downstream pathways (bbs, bcr, bam), though the cultures incubated with hydrocarbon mixtures contained a broader diversity of fumarate addition genes. A comparative metagenomic analysis of the three cultures showed that they were functionally redundant despite their enrichment backgrounds, sharing multiple features associated with syntrophic hydrocarbon conversion to methane. In addition, a comparative analysis of the culture metagenomes with those of 41 environmental samples (containing varying proportions of methanogens) showed that the three cultures were functionally most similar to each other but distinct from other environments, including hydrocarbon-impacted environments (for example, oil sands tailings ponds and oil-affected marine sediments). This study provides a basis for understanding key functions and environmental selection in methanogenic hydrocarbon-associated communities. PMID:25734684

  12. Characterization of the Mouse Brain Proteome Using Global Proteomic Analysis Complemented with Cysteinyl-Peptide Enrichment

    PubMed Central

    Wang, Haixing; Qian, Wei-Jun; Chin, Mark H.; Petyuk, Vladislav A.; Barry, Richard C.; Liu, Tao; Gritsenko, Marina A.; Mottaz, Heather M.; Moore, Ronald J.; Camp, David G.; Khan, Arshad H.; Smith, Desmond J.; Smith, Richard D.

    2007-01-01

    Given the growing interest in applying genomic and proteomic approaches for studying the mammalian brain using mouse models, we hereby present a global proteomic approach for analyzing brain tissue and for the first time a comprehensive characterization of the whole mouse brain proteome. Preparation of the whole brain sample incorporated a highly efficient cysteinyl-peptide enrichment (CPE) technique to complement a global enzymatic digestion method. Both the global and the cysteinyl-enriched peptide samples were analyzed by SCX fractionation coupled with reversed phase LC-MS/MS analysis. A total of 48,328 different peptides were confidently identified (>98% confidence level), covering 7792 non-redundant proteins (∼34% of the predicted mouse proteome). 1564 and 1859 proteins were identified exclusively from the cysteinyl-peptide and the global peptide samples, respectively, corresponding to 25% and 31% improvements in proteome coverage compared to analysis of only the global peptide or cysteinyl-peptide samples. The identified proteins provide a broad representation of the mouse proteome with little bias evident due to protein pI, molecular weight, and/or cellular localization. Approximately 26% of the identified proteins with gene ontology (GO) annotations were membrane proteins, with 1447 proteins predicted to have transmembrane domains, and many of the membrane proteins were found to be involved in transport and cell signaling. The MS/MS spectrum count information for the identified proteins was used to provide a measure of relative protein abundances. The mouse brain peptide/protein database generated from this study represents the most comprehensive proteome coverage for the mammalian brain to date, and the basis for future quantitative brain proteomic studies using mouse models. The proteomic approach presented here may have broad applications for rapid proteomic analyses of various mouse models of human brain diseases. PMID:16457602

  13. Comparative analysis of metagenomes from three methanogenic hydrocarbon-degrading enrichment cultures with 41 environmental samples.

    PubMed

    Tan, Boonfei; Fowler, S Jane; Abu Laban, Nidal; Dong, Xiaoli; Sensen, Christoph W; Foght, Julia; Gieg, Lisa M

    2015-09-01

    Methanogenic hydrocarbon metabolism is a key process in subsurface oil reservoirs and hydrocarbon-contaminated environments and thus warrants greater understanding to improve current technologies for fossil fuel extraction and bioremediation. In this study, three hydrocarbon-degrading methanogenic cultures established from two geographically distinct environments and incubated with different hydrocarbon substrates (added as single hydrocarbons or as mixtures) were subjected to metagenomic and 16S rRNA gene pyrosequencing to test whether these differences affect the genetic potential and composition of the communities. Enrichment of different putative hydrocarbon-degrading bacteria in each culture appeared to be substrate dependent, though all cultures contained both acetate- and H2-utilizing methanogens. Despite differing hydrocarbon substrates and inoculum sources, all three cultures harbored genes for hydrocarbon activation by fumarate addition (bssA, assA, nmsA) and carboxylation (abcA, ancA), along with those for associated downstream pathways (bbs, bcr, bam), though the cultures incubated with hydrocarbon mixtures contained a broader diversity of fumarate addition genes. A comparative metagenomic analysis of the three cultures showed that they were functionally redundant despite their enrichment backgrounds, sharing multiple features associated with syntrophic hydrocarbon conversion to methane. In addition, a comparative analysis of the culture metagenomes with those of 41 environmental samples (containing varying proportions of methanogens) showed that the three cultures were functionally most similar to each other but distinct from other environments, including hydrocarbon-impacted environments (for example, oil sands tailings ponds and oil-affected marine sediments). This study provides a basis for understanding key functions and environmental selection in methanogenic hydrocarbon-associated communities.

  14. Common and Cluster-Specific Simultaneous Component Analysis

    PubMed Central

    De Roover, Kim; Timmerman, Marieke E.; Mesquita, Batja; Ceulemans, Eva

    2013-01-01

    In many fields of research, so-called ‘multiblock’ data are collected, i.e., data containing multivariate observations that are nested within higher-level research units (e.g., inhabitants of different countries). Each higher-level unit (e.g., country) then corresponds to a ‘data block’. For such data, it may be interesting to investigate the extent to which the correlation structure of the variables differs between the data blocks. More specifically, when capturing the correlation structure by means of component analysis, one may want to explore which components are common across all data blocks and which components differ across the data blocks. This paper presents a common and cluster-specific simultaneous component method which clusters the data blocks according to their correlation structure and allows for common and cluster-specific components. Model estimation and model selection procedures are described and simulation results validate their performance. Also, the method is applied to data from cross-cultural values research to illustrate its empirical value. PMID:23667463

  15. Clustered Numerical Data Analysis Using Markov Lie Monoid Based Networks

    NASA Astrophysics Data System (ADS)

    Johnson, Joseph

    2016-03-01

    We have designed and built an optimal numerical standardization algorithm that links numerical values with their associated units, error level, and defining metadata, thus supporting automated data exchange and new levels of artificial intelligence (AI). The software manages all dimensional and error analysis and computational tracing. Tables of entities versus properties of these generalized numbers (called "metanumbers") support a transformation of each table into a network among the entities and another network among their properties, where the network connection matrix is based upon a proximity metric between the two items. We previously proved that every network is isomorphic to the Lie algebra that generates continuous Markov transformations. We have also shown that the eigenvectors of these Markov matrices provide an agnostic clustering of the underlying patterns. We will present this methodology and show how our new work on conversion of scientific numerical data through this process can reveal underlying information clusters ordered by the eigenvalues. We will also show how the linking of clusters from different tables can be used to form a "supernet" of all numerical information supporting new initiatives in AI.

  16. Covariance analysis of differential drag-based satellite cluster flight

    NASA Astrophysics Data System (ADS)

    Ben-Yaacov, Ohad; Ivantsov, Anatoly; Gurfil, Pini

    2016-06-01

    One possibility for satellite cluster flight is to control relative distances using differential drag. The idea is to increase or decrease the drag acceleration on each satellite by changing its attitude, and use the resulting small differential acceleration as a controller. The most significant advantage of the differential drag concept is that it enables cluster flight without consuming fuel. However, any drag-based control algorithm must cope with significant aerodynamical and mechanical uncertainties. The goal of the current paper is to develop a method for examination of the differential drag-based cluster flight performance in the presence of noise and uncertainties. In particular, the differential drag control law is examined under measurement noise, drag uncertainties, and initial condition-related uncertainties. The method used for uncertainty quantification is the Linear Covariance Analysis, which enables us to propagate the augmented state and filter covariance without propagating the state itself. Validation using a Monte-Carlo simulation is provided. The results show that all uncertainties have relatively small effect on the inter-satellite distance, even in the long term, which validates the robustness of the used differential drag controller.
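
    The core idea of propagating a covariance without propagating the state itself can be sketched with the standard discrete-time update P <- Phi P Phi^T + Q. The example below is only a minimal illustration of that update for a hypothetical one-dimensional relative position/velocity pair; the transition matrix, noise levels, and time step are assumed values, and the augmented state and filter covariance used in the paper are not modelled here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def propagate_covariance(p, phi, q):
    """One step of the linear covariance update P <- Phi P Phi^T + Q."""
    pp = matmul(matmul(phi, p), transpose(phi))
    return [[pp[i][j] + q[i][j] for j in range(len(q))]
            for i in range(len(q))]


# Hypothetical values, chosen only for illustration.
dt = 10.0                              # time step
phi = [[1.0, dt], [0.0, 1.0]]          # position/velocity transition matrix
q = [[0.0, 0.0], [0.0, 1e-8]]          # process noise (e.g. drag uncertainty)
p = [[1.0, 0.0], [0.0, 1e-4]]          # initial covariance

history = [p]
for _ in range(5):
    p = propagate_covariance(p, phi, q)
    history.append(p)

for step, cov in enumerate(history):
    print(step, cov[0][0])             # position variance at each step
```

    No state vector appears anywhere in the loop: only the covariance is advanced, which is what makes this kind of analysis cheaper than a Monte-Carlo run over many sampled trajectories.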

  17. Enrichment Clusters for Gifted Learning.

    ERIC Educational Resources Information Center

    Renzulli, Joseph S.

    1999-01-01

    Authentic learning consists of applying relevant knowledge, thinking skills, and interpersonal skills to solving real-world problems. Students assume roles of firsthand investigators, writers, artists, or other practitioners committed to producing a product or a service. A Connecticut high school's video-production company embodies a successful…

  18. Outlier analysis of functional genomic profiles enriches for oncology targets and enables precision medicine.

    PubMed

    Zhu, Zhou; Ihle, Nathan T; Rejto, Paul A; Zarrinkar, Patrick P

    2016-06-13

    Genome-scale functional genomic screens across large cell line panels provide a rich resource for discovering tumor vulnerabilities that can lead to the next generation of targeted therapies. Their data analysis typically has focused on identifying genes whose knockdown enhances response in various pre-defined genetic contexts, which are limited by biological complexities as well as the incompleteness of our knowledge. We thus introduce a complementary data mining strategy to identify genes with exceptional sensitivity in subsets, or outlier groups, of cell lines, allowing an unbiased analysis without any a priori assumption about the underlying biology of dependency. Genes with outlier features are strongly and specifically enriched with those known to be associated with cancer and relevant biological processes, despite no a priori knowledge being used to drive the analysis. Identification of exceptional responders (outliers) may lead not only to new candidates for therapeutic intervention, but also to tumor indications and response biomarkers for companion precision medicine strategies. Several tumor suppressors have an outlier sensitivity pattern, supporting and generalizing the notion that tumor suppressors can play context-dependent oncogenic roles. The novel application of outlier analysis described here demonstrates a systematic and data-driven analytical strategy to decipher large-scale functional genomic data for oncology target and precision medicine discoveries.

  19. OSAnalyzer: A Bioinformatics Tool for the Analysis of Gene Polymorphisms Enriched with Clinical Outcomes

    PubMed Central

    Agapito, Giuseppe; Botta, Cirino; Guzzi, Pietro Hiram; Arbitrio, Mariamena; Di Martino, Maria Teresa; Tassone, Pierfrancesco; Tagliaferri, Pierosandro; Cannataro, Mario

    2016-01-01

    Background: The identification of biomarkers for the estimation of cancer patients’ survival is a crucial problem in modern oncology. Recently, the Affymetrix DMET (Drug Metabolizing Enzymes and Transporters) microarray platform has offered the possibility to determine the ADME (absorption, distribution, metabolism, and excretion) gene variants of a patient and to correlate them with drug-dependent adverse events. Therefore, the analysis of survival distribution of patients starting from their profile obtained using DMET data may reveal important information to clinicians about possible correlations among drug response, survival rate, and gene variants. Methods: In order to provide support to this analysis we developed OSAnalyzer, a software tool able to compute the overall survival (OS) and progression-free survival (PFS) of cancer patients and evaluate their association with ADME gene variants. Results: The tool is able to perform an automatic analysis of DMET data enriched with survival events. Moreover, results are ranked according to statistical significance obtained by comparing the area under the curves that is computed by using the log-rank test, allowing a quick and easy analysis and visualization of high-throughput data. Conclusions: Finally, we present a case study to highlight the usefulness of OSAnalyzer when analyzing a large cohort of patients. PMID:27669316

  20. The REFLEX II galaxy cluster survey: power spectrum analysis

    NASA Astrophysics Data System (ADS)

    Balaguera-Antolínez, A.; Sánchez, Ariel G.; Böhringer, H.; Collins, C.; Guzzo, L.; Phleps, S.

    2011-05-01

    We present the power spectrum of galaxy clusters measured from the new ROSAT-ESO Flux-Limited X-Ray (REFLEX II) galaxy cluster catalogue. This new sample extends the flux limit of the original REFLEX catalogue to 1.8 × 10^-12 erg s^-1 cm^-2, yielding a total of 911 clusters with ≥94 per cent completeness in redshift follow-up. The analysis of the data is improved by creating a set of 100 REFLEX II-catalogue-like mock galaxy cluster catalogues built from a suite of large-volume Λ cold dark matter (ΛCDM) N-body simulations (L-BASICC II). The measured power spectrum is in agreement with the predictions from a ΛCDM cosmological model. The measurements show the expected increase in the amplitude of the power spectrum with increasing X-ray luminosity. On large scales, we show that the shape of the measured power spectrum is compatible with a scale-independent bias and provide a model for the amplitude that allows us to connect our measurements with a cosmological model. By implementing a luminosity-dependent power-spectrum estimator, we observe that the power spectrum measured from the REFLEX II sample is weakly affected by flux-selection effects. The shape of the measured power spectrum is compatible with a featureless power spectrum on scales k > 0.01 h Mpc^-1 and hence no statistically significant signal of baryonic acoustic oscillations can be detected. We show that the measured REFLEX II power spectrum displays signatures of non-linear evolution.

  1. Sequential Fe3O4/TiO2 enrichment for phosphopeptide analysis by liquid chromatography/tandem mass spectrometry.

    PubMed

    Choi, Sunkyu; Kim, Jaeyoon; Cho, Kun; Park, Gunwook; Yoon, Jong Hyuk; Park, Sehoon; Yoo, Jong Shin; Ryu, Sung Ho; Kim, Young Hwan; Kim, Jeongkwon

    2010-05-30

    Protein phosphorylation regulates a wide range of cellular functions and is associated with signaling pathways in cells. Various strategies for enrichment of phosphoproteins or phosphopeptides have been developed. Here, we developed a novel sequential phosphopeptide enrichment method, using magnetic iron oxide (Fe3O4) and titanium dioxide (TiO2) particles, to detect mono- and multi-phosphorylated peptides. In the first step, phosphopeptides were captured on Fe3O4 particles. In a subsequent step, any residual phosphopeptides were captured on TiO2 particles. The particles were eluted and rinsed to yield phosphopeptide-enriched fractions that were combined and analyzed using liquid chromatography/tandem mass spectrometry (LC/MS/MS). The validity of this sequential Fe3O4/TiO2 enrichment strategy was demonstrated by the successful enrichment of bovine alpha-casein phosphopeptides. We then applied the sequential Fe3O4/TiO2 enrichment method to the analysis of phosphopeptides in L6 muscle cell lysates and successfully identified mono- and multi-phosphorylated peptides. Copyright 2010 John Wiley & Sons, Ltd.

  2. Cluster analysis in systems of magnetic spheres and cubes

    NASA Astrophysics Data System (ADS)

    Pyanzina, E. S.; Gudkova, A. V.; Donaldson, J. G.; Kantorovich, S. S.

    2017-06-01

    In the present work we use molecular dynamics simulations and graph-theory based cluster analysis to compare self-assembly in systems of magnetic spheres, and cubes where the dipole moment is oriented along the side of the cube in the [001] crystallographic direction. We show that under the same conditions cubes aggregate far less than their spherical counterparts. This difference can be explained in terms of the volume of phase space in which the formation of the bond is thermodynamically advantageous. It follows that this volume is much larger for a dipolar sphere than for a dipolar cube.
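
    Graph-theory based cluster analysis of a particle configuration generally means treating particles as nodes, connecting any pair that counts as bonded, and reading clusters off as connected components. The sketch below illustrates this with a union-find structure and a purely distance-based bond criterion. The cutoff rule is an assumption made for illustration (the abstract does not state the bonding criterion, which for dipolar particles may also involve the pair energy), and the function name and coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def find_clusters(positions, cutoff):
    """Group particles into clusters of mutually connected neighbours.

    Two particles are treated as bonded when their centre-to-centre
    distance is at most `cutoff` (an assumed, distance-only criterion).
    Returns a list of clusters, each a list of particle indices,
    ordered from largest to smallest.
    """
    parent = list(range(len(positions)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add an edge (merge two components) for every bonded pair.
    for i, j in combinations(range(len(positions)), 2):
        if dist(positions[i], positions[j]) <= cutoff:
            parent[find(i)] = find(j)

    # Collect particles by the root of their component.
    clusters = {}
    for i in range(len(positions)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical configuration: a three-particle chain and one isolated particle.
particles = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(find_clusters(particles, cutoff=1.1))
```

    The all-pairs loop is quadratic in the number of particles, which is fine for a sketch; a production analysis of simulation snapshots would normally use a neighbour list and periodic boundary handling.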

  3. A cluster analysis on road traffic accidents using genetic algorithms

    NASA Astrophysics Data System (ADS)

    Saharan, Sabariah; Baragona, Roberto

    2017-04-01

    The analysis of road traffic accidents is increasingly important because of the cost of accidents and public road safety. The availability of large data sets makes the study of factors that affect the frequency and severity of accidents viable. However, the data are often highly unbalanced and overlapping. We deal with the data set of road traffic accidents recorded in Christchurch, New Zealand, from 2000 to 2009, with a total of 26440 accidents. The data are in binary form, and there are 50 road traffic accident factors with four levels of severity. We used a genetic algorithm for the analysis because we are in the presence of a large unbalanced data set, and standard clustering such as the k-means algorithm may not be suitable for the task. The genetic algorithm based on clustering for unknown K (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accident severity level and suggest that the two main factors that contribute to fatal accidents are "Speed greater than 60 km h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and the independent component analysis is performed to validate the results.

  4. clusterProfiler: an R package for comparing biological themes among gene clusters.

    PubMed

    Yu, Guangchuang; Wang, Li-Gen; Han, Yanyan; He, Qing-Yu

    2012-05-01

    Increasing quantitative data generated from transcriptomics and proteomics require integrative strategies for analysis. Here, we present an R package, clusterProfiler, that automates the process of biological-term classification and the enrichment analysis of gene clusters. The analysis module and visualization module were combined into a reusable workflow. Currently, clusterProfiler supports three species: humans, mice, and yeast. Methods provided in this package can be easily extended to other species and ontologies. The clusterProfiler package is released under the Artistic-2.0 License within the Bioconductor project. The source code and vignette are freely available at http://bioconductor.org/packages/release/bioc/html/clusterProfiler.html.
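
    Enrichment analysis of a gene cluster is commonly framed as an over-representation test: given how many genes in the background carry a term, how surprising is the number of annotated genes found in the cluster? The sketch below computes a one-sided hypergeometric p-value for that question. It is written in Python purely as an illustration of the statistic and is not the package's R interface; the abstract does not state which test is used, and the function name and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, cluster_size, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    population   -- number of background genes
    annotated    -- background genes carrying the term of interest
    cluster_size -- number of genes in the cluster being tested
    overlap      -- cluster genes that carry the term
    """
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    # Sum the hypergeometric probabilities for overlap, overlap + 1, ..., upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 40 of 1000 background genes carry a term,
# and 8 of the 50 genes in a cluster carry it.
p_value = hypergeom_enrichment_p(1000, 40, 50, 8)
print(f"p = {p_value:.3g}")
```

    When many terms are tested for the same cluster, the resulting p-values would normally be adjusted for multiple testing before any term is reported as enriched.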

  5. Genome-wide enrichment analysis between endometriosis and obesity-related traits reveals novel susceptibility loci

    PubMed Central

    Rahmioglu, Nilufer; Macgregor, Stuart; Drong, Alexander W.; Hedman, Åsa K.; Harris, Holly R.; Randall, Joshua C.; Prokopenko, Inga; Nyholt, Dale R.; Morris, Andrew P.; Montgomery, Grant W.; Missmer, Stacey A.; Lindgren, Cecilia M.; Zondervan, Krina T.

    2015-01-01

    Endometriosis is a chronic inflammatory condition in women that results in pelvic pain and subfertility, and has been associated with decreased body mass index (BMI). Genetic variants contributing to the heritable component have started to emerge from genome-wide association studies (GWAS), although the majority remain unknown. Unexpectedly, we observed an intergenic locus on 7p15.2 that was genome-wide significantly associated with both endometriosis and fat distribution (waist-to-hip ratio adjusted for BMI; WHRadjBMI) in an independent meta-GWAS of European ancestry individuals. This led us to investigate the potential overlap in genetic variants underlying the aetiology of endometriosis, WHRadjBMI and BMI using GWAS data. Our analyses demonstrated significant enrichment of common variants between fat distribution and endometriosis (P = 3.7 × 10^−3), which was stronger when we restricted the investigation to more severe (Stage B) cases (P = 4.5 × 10^−4). However, no genetic enrichment was observed between endometriosis and BMI (P = 0.79). In addition to 7p15.2, we identify four more variants with statistically significant evidence of involvement in both endometriosis and WHRadjBMI (in/near KIFAP3, CAB39L, WNT4, GRB14); two of these, KIFAP3 and CAB39L, are novel associations for both traits. KIFAP3, WNT4 and 7p15.2 are associated with the WNT signalling pathway; formal pathway analysis confirmed a statistically significant (P = 6.41 × 10^−4) overrepresentation of shared associations in developmental processes/WNT signalling between the two traits. Our results demonstrate an example of potential biological pleiotropy that was hitherto unknown, and represent an opportunity for functional follow-up of loci and further cross-phenotype comparisons to assess how fat distribution and endometriosis pathogenesis research fields can inform each other. PMID:25296917

  6. Transcriptome analysis identifies genes with enriched expression in the mouse central Extended Amygdala

    PubMed Central

    Becker, Jérôme A. J.; Befort, Katia; Blad, Clara; Filliol, Dominique; Ghate, Aditee; Dembele, Doulaye; Thibault, Christelle; Koch, Muriel; Muller, Jean; Lardenois, Aurélie; Poch, Olivier; Kieffer, Brigitte L.

    2008-01-01

    The central Extended Amygdala (EAc) is an ensemble of highly interconnected limbic structures of the anterior brain, and forms a cellular continuum including the Bed Nucleus of the Stria Terminalis (BNST), the central nucleus of the Amygdala (CeA) and the Nucleus Accumbens shell (AcbSh). This neural network is a key site for interactions between brain reward and stress systems, and has been implicated in several aspects of drug abuse. In order to increase our understanding of EAc function at the molecular level, we undertook a genome-wide screen (Affymetrix) to identify genes whose expression is enriched in the EAc. We focused on the less well-known BNST-CeA areas of the EAc, and identified 121 genes that exhibit more than 2-fold higher expression level in the EAc compared to whole brain. Among these, forty-three genes have never been described to be expressed in the EAc. We mapped these genes throughout the brain, using non-radioactive in situ hybridization, and identified eight genes with a unique and distinct rostro-caudal expression pattern along AcbSh, BNST and CeA. Q-PCR analysis performed in brain and peripheral organ tissues indicated that, with the exception of one (Spata13), all these genes are predominantly expressed in brain. These genes encode signaling proteins (Adora2, GPR88, Arpp21 and Rem2), a transcription factor (Limh6) or proteins of unknown function (Rik130, Spata13 and Wfs1). The identification of genes with enriched expression expands our knowledge of EAc at a molecular level, and provides useful information towards genetic manipulations within the EAc. PMID:18786617

  7. GC-based analysis of plant stanyl fatty acid esters in enriched foods.

    PubMed

    Barnsteiner, Andreas; Lubinus, Tim; di Gianvito, Angelica; Schmid, Wolfgang; Engel, Karl-Heinz

    2011-05-25

    Approaches for the capillary gas chromatographic (GC) based analysis of intact plant stanyl esters in enriched foods were developed. Reference compounds were synthesized by enzyme-catalyzed transesterifications. Their identities were confirmed by means of mass spectrometry. Using a medium polar trifluoropropylmethyl polysiloxane stationary phase, long-chain plant stanyl esters could be separated according to their stanol moieties and their fatty acid chains. Thermal degradation during GC analysis was compensated by determining response factors; calibrations were performed for ten individual plant stanyl esters. For the analysis of low-fat products (skimmed milk drinking yogurts), the GC separation was combined with a "fast extraction" under acidic conditions. For fat-based foods (margarines), online coupled LC-GC offered an elegant and efficient way to avoid time-consuming sample preparation steps. The robust and rapid methods allow conclusions on both the stanol profiles and the fatty acid moieties, and thus provide a basis for the authentication of this type of functional food ingredient.

  8. Analysis of phytosterols and phytostanols in enriched dairy products by Fast gas chromatography with mass spectrometry.

    PubMed

    Inchingolo, Raffaella; Cardenia, Vladimiro; Rodriguez-Estrada, Maria Teresa

    2014-10-01

    A Fast gas chromatography and mass spectrometry method for plant sterols/stanols analysis was developed, using a short capillary gas chromatography column (10 m × 0.1 mm internal diameter × 0.1 μm film thickness) coated with 5% diphenyl-polysiloxane. A silylated mixture of the main plant sterols/stanols standards (β-sitosterol, campesterol, stigmasterol, campestanol, sitostanol) was well separated in 1.5 min, with good peak resolution (>1.4, determined on a critical chromatographic peak pair (β-sitosterol and sitostanol)), repeatability (<13%), and sensitivity (<0.017 ng/mL). The suitability of this Fast chromatography method was tested on plant sterols/stanols-enriched dairy products (yogurt and milk), which were subjected to lipid extraction, cold saponification, and silylation prior to injection. The analytical performance (sensitivity < 0.256 ng/mL and repeatability < 10.36%) and significant reduction of the analysis time and consumables demonstrate that the Fast gas chromatography-mass spectrometry method could also be employed for the plant sterols/stanols analysis in functional dairy products. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Evaluation of tetraglyme for the enrichment and analysis of volatile organic compounds in air.

    PubMed

    Huybrechts, T; Dewulf, J; Van Craeynest, K; Van Langenhove, H

    2001-07-13

    A recently developed method for the sampling and analysis of volatile organic compounds in air has been evaluated. The system is based on the enrichment of analytes in tetraethylene glycol dimethyl ether or tetraglyme, a water-soluble organic liquid. The subsequent analysis consists of dispersion of a sample aliquot in water followed by purge-and-trap and gas chromatographic separation. Physico-chemical data were investigated for 10 volatile organic compounds, providing information on the possibilities and limitations of the tetraglyme method. The target analytes included chlorinated alkanes and alkenes, and monocyclic aromatic hydrocarbons. Air/tetraglyme partition coefficients Kat were determined over an environmentally relevant temperature range of 2-25 degrees C to evaluate sorption efficiencies and estimate breakthrough volumes at the sampling stage. At 2 degrees C breakthrough volumes (allowing 5% of breakthrough) ranged from 5.8 (1,1-dichloroethane) to 312 l (1,1,2-trichloroethane) for 20 ml of tetraglyme. With regard to the desorption stage, the effect of tetraglyme on the air/water partition of organic compounds was investigated through the measurement of air/tetraglyme-water partition coefficients Kat-w for 2-31% (v/v) tetraglyme in water. Finally, a clean-up procedure for tetraglyme was evaluated. Analysis of a blank tetraglyme-water (17:83, v:v) mixture by gas chromatography-flame ionization detection/mass spectrometry showed minor background signals. None of the target compounds were detected.

  10. [Study of the clinical phenotype of symptomatic chronic airways disease by hierarchical cluster analysis and two-step cluster analyses].

    PubMed

    Ning, P; Guo, Y F; Sun, T Y; Zhang, H S; Chai, D; Li, X M

    2016-09-01

    To study the distinct clinical phenotype of chronic airway diseases by hierarchical cluster analysis and two-step cluster analysis. A population sample of adult patients in Donghuamen community, Dongcheng district and Qinghe community, Haidian district, Beijing from April 2012 to January 2015, who had wheeze within the last 12 months, underwent detailed investigation, including a clinical questionnaire, pulmonary function tests, total serum IgE levels, blood eosinophil level and a peak flow diary. Nine variables were chosen as evaluating parameters, including pre-salbutamol forced expired volume in one second(FEV1)/forced vital capacity(FVC) ratio, pre-salbutamol FEV1, percentage of post-salbutamol change in FEV1, residual capacity, diffusing capacity of the lung for carbon monoxide/alveolar volume adjusted for haemoglobin level, peak expiratory flow(PEF) variability, serum IgE level, cumulative tobacco cigarette consumption (pack-years) and respiratory symptoms (cough and expectoration). Subjects' different clinical phenotype by hierarchical cluster analysis and two-step cluster analysis was identified. (1) Four clusters were identified by hierarchical cluster analysis. Cluster 1 was chronic bronchitis in smokers with normal pulmonary function. Cluster 2 was chronic bronchitis or mild chronic obstructive pulmonary disease (COPD) patients with mild airflow limitation. Cluster 3 included COPD patients with heavy smoking, poor quality of life and severe airflow limitation. Cluster 4 recognized atopic patients with mild airflow limitation, elevated serum IgE and clinical features of asthma. Significant differences were revealed regarding pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, maximal mid-expiratory flow curve(MMEF)% pred, carbon monoxide diffusing capacity per liter of alveolar(DLCO)/(VA)% pred, residual volume(RV)% pred, total serum IgE level, smoking history (pack-years), St.George's respiratory questionnaire

  11. Multivariate cluster analysis of forest fire events in Portugal

    NASA Astrophysics Data System (ADS)

    Tonini, Marj; Pereira, Mario; Vega Orozco, Carmen; Parente, Joana

    2015-04-01

    Portugal is one of the major fire-prone European countries, mainly due to its favourable climatic, topographic and vegetation conditions. Compared to the other Mediterranean countries, the number of events registered here from 1980 to the present is the highest; likewise, with respect to the burnt area, Portugal is the third most affected country. Portuguese mapped burnt areas are available from the website of the Institute for the Conservation of Nature and Forests (ICNF). This official geodatabase is the result of satellite measurements starting from the year 1990. The spatial information, delivered in shapefile format, provides a detailed description of the shape and the size of the area burnt by each fire, while the date/time information related to the fire ignition is restricted to the year of occurrence. In terms of a statistical formalism, wildfires can be associated with a stochastic point process, where events are analysed as a set of geographical coordinates corresponding, for example, to the centroid of each burnt area. The analysis of the spatio-temporal pattern of stochastic point processes, including cluster analysis, is a basic procedure for discovering predisposing factors as well as for prevention and forecasting purposes. These kinds of studies are primarily focused on investigating the spatial cluster behaviour of environmental data sequences and/or mapping their distribution at different times. To include both dimensions (space and time), a comprehensive spatio-temporal analysis is needed. In the present study the authors attempt to verify whether, in the case of wildfires in Portugal, space and time act independently or whether, conversely, neighbouring events are also closer in time. We present an application of the spatio-temporal K-function to a long dataset (1990-2012) of mapped burnt areas. Moreover, the multivariate K-function allowed checking for a possible difference in distribution between small and large fires. The final objective is to elaborate a 3D

  12. Cluster analysis of autoantibodies in 852 patients with systemic lupus erythematosus from a single center.

    PubMed

    Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat

    2014-07-01

    Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.

  13. Detection of orphan domains in Drosophila using "hydrophobic cluster analysis".

    PubMed

    Bitard-Feildel, Tristan; Heberlein, Magdalena; Bornberg-Bauer, Erich; Callebaut, Isabelle

    2015-12-01

    Comparative genomics has become an important strategy in life science research. While many genes, and the proteins they code for, can be well characterized by assigning orthologs, a significant number of proteins or domains remain obscure "orphans". Some orphans are overlooked by current computational methods because they rapidly diverged; others emerged relatively recently (de novo). Recent research has demonstrated the importance of orphans, and of de novo proteins and domains, for the development of new phenotypic traits and adaptation. New approaches for detecting novel domains are thus of paramount importance. The hydrophobic cluster analysis (HCA) method delineates globular-like domains from the information of a protein sequence and thereby allows bypassing some of the limitations of established methods based on conserved sequence similarity. In this study, HCA is tested for orphan domain detection on 12 Drosophila genomes. After their detection, the orphan domains are classified into two categories, depending on their presence/absence in distantly related species. The two categories show significantly different physico-chemical properties when compared to previously characterized domains from the Pfam database. The newly detected domains have a higher degree of intrinsic disorder and a particular hydrophobic cluster composition. The older the domains are, the more similar their hydrophobic cluster content is to the cluster content of Pfam domains. The results suggest that, over time, newly created domains acquire a canonical set of hydrophobic clusters but conserve some features of intrinsically disordered regions. Our results agree with previous findings on orphan domains and suggest that the physico-chemical properties of domains change over evolutionarily long time scales. The presented HCA-based method is able to detect domains with unusual properties without relying on prior knowledge, such as the availability of homologs. Therefore, the method has large potential

  14. Gene-Based Analysis of Regionally Enriched Cortical Genes in GWAS Data Sets of Cognitive Traits and Psychiatric Disorders

    PubMed Central

    Ersland, Kari M.; Christoforou, Andrea; Stansberg, Christine; Espeseth, Thomas; Mattheisen, Manuel; Mattingsdal, Morten; Hardarson, Gudmundur A.; Hansen, Thomas; Fernandes, Carla P. D.; Giddaluru, Sudheer; Breuer, René; Strohmaier, Jana; Djurovic, Srdjan; Nöthen, Markus M.; Rietschel, Marcella; Lundervold, Astri J.; Werge, Thomas; Cichon, Sven; Andreassen, Ole A.; Reinvang, Ivar; Steen, Vidar M.; Le Hellard, Stephanie

    2012-01-01

    Background: Despite its estimated high heritability, the genetic architecture leading to differences in cognitive performance remains poorly understood. Different cortical regions play important roles in normal cognitive functioning and impairment. Recently, we reported on sets of regionally enriched genes in three different cortical areas (frontomedial, temporal and occipital cortices) of the adult rat brain. It has been suggested that genes preferentially, or specifically, expressed in one region or organ reflect functional specialisation. Employing a gene-based approach to the analysis, we used the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric test measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n = 3 samples) and bipolar affective disorder (BP) (n = 3 samples), to which cognitive impairment is linked. Principal Findings: At the single gene level, the temporal cortex enriched gene RAR-related orphan receptor B (RORB) showed the strongest overall association, namely to a test of verbal intelligence (Vocabulary, P = 7.7E-04). We also applied gene set enrichment analysis (GSEA) to test the candidate genes, as gene sets, for enrichment of association signal in the NCNG GWAS and in GWASs of BP and of SCZ. We found that genes differentially expressed in the temporal cortex showed a significant enrichment of association signal in a test measure of non-verbal intelligence (Reasoning) in the NCNG sample. Conclusion: Our gene-based approach suggests that RORB could be involved in verbal intelligence differences, while the genes enriched in the temporal cortex might be important to intellectual functions as measured by a test of reasoning in the healthy population. These findings warrant further replication in independent samples on cognitive traits. PMID

  15. High resolution spectroscopic analysis of seven giants in the bulge globular cluster NGC 6723

    NASA Astrophysics Data System (ADS)

    Rojas-Arriagada, A.; Zoccali, M.; Vásquez, S.; Ripepi, V.; Musella, I.; Marconi, M.; Grado, A.; Limatola, L.

    2016-03-01

    Context: Globular clusters associated with the Galactic bulge are important tracers of stellar populations in the inner Galaxy. High resolution analysis of stars in these clusters allows us to characterize them in terms of kinematics, metallicity, and individual abundances, and to compare these fingerprints with those characterizing field populations. Aims: We present iron and element ratios for seven red giant stars in the globular cluster NGC 6723, based on high resolution spectroscopy. Methods: High resolution spectra (R ~ 48 000) of seven K giants belonging to NGC 6723 were obtained with the FEROS spectrograph at the MPG/ESO 2.2 m telescope. Photospheric parameters were derived from ~130 Fe I and Fe II transitions. Abundance ratios were obtained from line-to-line spectrum synthesis calculations on clean selected features. Results: An intermediate metallicity of [Fe/H] = -0.98 ± 0.08 dex and a heliocentric radial velocity of v_hel = -96.6 ± 1.3 km s^-1 were found for NGC 6723. Alpha-element abundances present enhancements of [O/Fe] = 0.29 ± 0.18 dex, [Mg/Fe] = 0.23 ± 0.10 dex, [Si/Fe] = 0.36 ± 0.05 dex, and [Ca/Fe] = 0.30 ± 0.07 dex. Similar overabundance is found for the iron-peak Ti with [Ti/Fe] = 0.24 ± 0.09 dex. Odd-Z elements Na and Al present abundances of [Na/Fe] = 0.00 ± 0.21 dex and [Al/Fe] = 0.31 ± 0.21 dex, respectively. Finally, the s-element Ba is also enhanced by [Ba/Fe] = 0.22 ± 0.21 dex. Conclusions: The enhancement levels of NGC 6723 are comparable to those of other metal-intermediate bulge globular clusters. In turn, these enhancement levels are compatible with the abundance profiles displayed by bulge field stars at that metallicity. This hints at a possible similar chemical evolution with globular clusters and the metal-poor of the bulge going through an early prompt chemical enrichment.

  16. Analysis and Implementation of Graph Clustering for Digital News Using Star Clustering Algorithm

    NASA Astrophysics Data System (ADS)

    Ahdi, A. B.; SW, K. R.; Herdiani, A.

    2017-01-01

    Since the notion of Web 2.0 emerged and came into extensive use by many Internet services, we have seen an unprecedented proliferation of digital news. Digital news is very rich in content and in links to other news and sources, but it lacks category information. As a result, users cannot easily identify or group the news they read. Digital news is naturally linked data, because every news item has a relation to other news items or resources. The most appropriate model for linked data is a graph model, which suits this purpose because of its flexibility in describing relations and its easy-to-understand visualization. To handle the grouping issue, we use a graph clustering approach. Many graph clustering algorithms are available, such as MST Clustering, Chameleon, Makarov Clustering and Star Clustering. From these options we choose Star Clustering because it is easier to understand, more accurate and efficient, and guarantees the quality of the resulting clusters. In this research, we investigate the accuracy of the cluster results by comparing them with expert judgement. We obtained a fairly high accuracy of 80.98% and a promising cluster quality of 62.87%.
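
    The abstract does not give the authors' implementation. As a rough illustration only, the commonly described star clustering procedure (threshold a similarity graph, then repeatedly promote the highest-degree unmarked node to a star center and attach its neighbors as satellites) might be sketched as follows. The node names, similarity scores and the 0.5 threshold are invented for the example.

```python
def star_clustering(nodes, scored_pairs, threshold):
    """Greedy star cover of a thresholded similarity graph (illustrative sketch)."""
    # Keep only edges whose similarity score meets the threshold.
    neighbors = {node: set() for node in nodes}
    for a, b, score in scored_pairs:
        if score >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    unmarked = set(nodes)
    clusters = []
    while unmarked:
        # Star center: the unmarked node with the most neighbors;
        # ties are broken alphabetically so the result is deterministic.
        center = min(unmarked, key=lambda n: (-len(neighbors[n]), n))
        # Satellites are all neighbors of the center, so clusters may overlap.
        cluster = {center} | neighbors[center]
        clusters.append(cluster)
        unmarked -= cluster
    return clusters


# Hypothetical news items and pairwise similarity scores.
nodes = ["n1", "n2", "n3", "n4", "n5"]
scored_pairs = [
    ("n1", "n2", 0.9),
    ("n1", "n3", 0.8),
    ("n2", "n3", 0.3),
    ("n3", "n4", 0.2),
    ("n4", "n5", 0.7),
]
clusters = star_clustering(nodes, scored_pairs, threshold=0.5)
print(clusters)
```

    In this sketch the threshold controls how dense the graph is and therefore how many star centers emerge; how the study chose its threshold is not stated in the abstract.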

  17. Year clustering analysis for modelling olive flowering phenology.

    PubMed

    Oteros, J; García-Mozo, H; Hervás-Martínez, C; Galán, C

    2013-07-01

    It is now widely accepted that weather conditions occurring several months prior to the onset of flowering have a major influence on various aspects of olive reproductive phenology, including flowering intensity. Given the variable characteristics of the Mediterranean climate, we analyse its influence on the registered variations in olive flowering intensity in southern Spain, and relate them to previous climatic parameters using a year-clustering approach, as a first step towards an olive flowering phenology model adapted to different year categories. Phenological data from Cordoba province (Southern Spain) for a 30-year period (1982-2011) were analysed. Meteorological and phenological data were first subjected to both hierarchical and "K-means" clustering analysis, which yielded four year-categories. For this classification purpose, three different models were tested: (1) discriminant analysis; (2) decision-tree analysis; and (3) neural network analysis. Comparison of the results showed that the neural-networks model was the most effective, classifying four different year categories with clearly distinct weather features. Flowering-intensity models were constructed for each year category using the partial least squares regression method. These category-specific models proved to be more effective than general models. They are better suited to the variability of the Mediterranean climate, due to the different response of plants to the same environmental stimuli depending on the previous weather conditions in any given year. The present detailed analysis of the influence of weather patterns of different years on olive phenology will help us to understand the short-term effects of climate change on olive crop in the Mediterranean area that is highly affected by it.
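
    The year-clustering step can be pictured with a minimal K-means sketch. This is not the authors' code: the two-feature points stand in for per-year weather summaries and are invented, k = 2 is used only to keep the toy example small (the study reports four year-categories), and seeding the centroids with the first k points is an assumption made so the run is deterministic.

```python
from math import dist
from statistics import fmean


def k_means(points, k, iterations=100):
    """Plain K-means on tuples of floats (illustrative sketch)."""
    # Assumption: the first k points seed the centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(fmean(axis) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # an empty cluster keeps its centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    return labels, centroids


# Hypothetical per-year features, e.g. two standardized weather summaries.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.8, 1.2), (9.2, 8.8)]
labels, centroids = k_means(points, k=2)
print(labels)
```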

  18. Application of cluster analysis in prevention of coronary heart disease.

    PubMed

    Pereira, Catarina; Vogelaere, Peter

    2005-03-01

    Coronary heart disease is one of the principal causes of death and morbidity in the western world, and particularly in Portugal. This study's aim was to investigate coronary disease risk factors, differentiating lifestyles and behavioral habits which are associated with onset of the disease. The experimental population was divided into two groups: an experimental group (n=30) of male subjects, aged 40-75 years, who suffered a first coronary event in the previous 20 days; and a control group (n=30) of male subjects, aged 40-75 years, who presented no coronary problems. Individuals with a clinical history of any other chronic disease were excluded from the sample. Data were obtained through questionnaires. Data analysis consisted of both traditional statistical analysis (Student's t test) and cluster analysis. The latter technique enables behavioral patterns that will or will not induce coronary heart disease to be distinguished. The Student's t test revealed significant differences (p ≤ 0.05) between the experimental and control groups for the following variables: nutrition and dietary habits, smoking, stress and psychosocial factors, hereditary factors and total risk factors. The risk level of all these factors was higher in the experimental group. Cluster analysis applied to 19 variables enabled three behavioral patterns to be identified that may induce the disease, characterized by high risk indices in specific variables, and one behavioral pattern that tends to prevent development of coronary heart disease, which is characterized by low levels of risk factors. Coronary heart disease appears to be related to lifestyle and habits. Analysis of the three high-risk behavioral patterns enabled priority areas to be established for preventive measures against coronary heart disease. These are: stress, irritability and depression, smoking, sedentary lifestyle and nutrition (excessive consumption of salt, sugar and alcohol).
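
    For readers unfamiliar with the group comparison mentioned above, the pooled-variance Student's t statistic for two independent samples can be computed as in the sketch below. The risk scores are invented and far smaller than the study's groups of 30; converting t to a p-value additionally needs the t distribution, which is left out here.

```python
from math import sqrt
from statistics import fmean, variance


def pooled_t_statistic(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance, plus degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances use n - 1).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (fmean(group_a) - fmean(group_b)) / standard_error
    return t, df


# Hypothetical risk scores for an experimental and a control group.
experimental = [4.0, 5.0, 6.0, 7.0, 8.0]
control = [2.0, 3.0, 4.0, 5.0, 6.0]
t_value, dof = pooled_t_statistic(experimental, control)
print(t_value, dof)
```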

  19. Time series clustering analysis of health-promoting behavior

    NASA Astrophysics Data System (ADS)

    Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng

    2013-10-01

    Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, the ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, the ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, and stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30638 session records collected for 249 elders from January 2012 to March 2013, behavior patterns were identified by a fuzzy c-means time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The results of the data analysis can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology, and have been provided to the Taiwan National Health Insurance Administration to assess elder health-promoting behavior.
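
    The two ingredients named in the abstract can be illustrated separately. The sketch below is an assumption about what an autocorrelation-based representation might look like (sample autocorrelation coefficients at lags 1 to max_lag), followed by the standard fuzzy c-means membership formula for fixed cluster centers. It is not the full iterative algorithm and not the ComCare pipeline; the series, centers and fuzzifier m = 2 are invented.

```python
from math import dist
from statistics import fmean


def autocorrelation(series, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    mean = fmean(series)
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + lag] for t in range(len(series) - lag)) / denom
        for lag in range(1, max_lag + 1)
    ]


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster, for given centers."""
    distances = [dist(point, c) for c in centers]
    zeros = distances.count(0.0)
    if zeros:
        # The point sits exactly on one or more centers: share membership among them.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances) for d_i in distances]


# Hypothetical usage record: a strictly alternating daily pattern.
acf = autocorrelation([1.0, 0.0, 1.0, 0.0, 1.0, 0.0], max_lag=2)
memberships = fuzzy_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
print(acf, memberships)
```

    Representing each series by its autocorrelation vector makes series of different lengths comparable in a fixed-dimensional space, which is one plausible reason to combine it with a distance-based method such as fuzzy c-means.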

  20. Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.

    PubMed

    Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D

    2017-06-01

    Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. The aim was to identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement of the fraction of exhaled nitric oxide. Blood eosinophil count, serum total IgE and periostin levels were determined. A two-step cluster approach, a hierarchical clustering method and k-means analysis were used for identification of the clusters. We have identified four clusters. Cluster 1 (n=14) - late-onset, non-atopic asthma with impaired lung function, Cluster 2 (n=13) - late-onset, atopic asthma, Cluster 3 (n=6) - late-onset, aspirin sensitivity, eosinophilic asthma, and Cluster 4 (n=7) - early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. We identified four clusters. The variables with the greatest power for differentiation in our study were: age of asthma onset, duration of disease, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drug hypersensitivity, baseline FEV1/FVC and symptom severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be a useful tool for phenotyping of disease and a personalized approach to the treatment of patients.

  1. A Meta-Analysis of the Effects of Enrichment Programs on Gifted Students

    ERIC Educational Resources Information Center

    Kim, Mihyeon

    2016-01-01

    Although descriptions of enrichment programs are valuable for practitioners, practices, and services for gifted students, they must be backed by evidence, derived through a synthesis of research. This study examined research on enrichment programs serving gifted students and synthesized the current studies between 1985 and 2014 on the effects of…

  3. An enhanced cluster analysis program with bootstrap significance testing for ecological community analysis

    USGS Publications Warehouse

    McKenna, J.E.

    2003-01-01

    The biosphere is filled with complex living patterns and important questions about biodiversity and community and ecosystem ecology are concerned with structure and function of multispecies systems that are responsible for those patterns. Cluster analysis identifies discrete groups within multivariate data and is an effective method of coping with these complexities, but often suffers from subjective identification of groups. The bootstrap testing method greatly improves objective significance determination for cluster analysis. The BOOTCLUS program makes cluster analysis that reliably identifies real patterns within a data set more accessible and easier to use than previously available programs. A variety of analysis options and rapid re-analysis provide a means to quickly evaluate several aspects of a data set. Interpretation is influenced by sampling design and a priori designation of samples into replicate groups, and ultimately relies on the researcher's knowledge of the organisms and their environment. However, the BOOTCLUS program provides reliable, objectively determined groupings of multivariate data.

  4. IPC two-color analysis of x ray galaxy clusters

    NASA Technical Reports Server (NTRS)

    White, Raymond E., III

    1990-01-01

    The mass distributions of several clusters of galaxies were determined by using X-ray surface brightness data from the Einstein Observatory Imaging Proportional Counter (IPC). Determining cluster mass distributions is important for constraining the nature of the dark matter which dominates the mass of galaxies, galaxy clusters, and the Universe. Galaxy clusters are permeated with hot gas in hydrostatic equilibrium with the gravitational potentials of the clusters. Cluster mass distributions can be determined from X-ray observations of cluster gas by using the equation of hydrostatic equilibrium and knowledge of the density and temperature structure of the gas. The X-ray surface brightness at some distance from the cluster is the result of the volume X-ray emissivity being integrated along the line of sight in the cluster.
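
    The relations the abstract alludes to can be written in their usual textbook form, assuming spherical symmetry and an ideal gas. The symbols are not taken from the abstract and are introduced here for illustration: \rho_g is the gas density, T the gas temperature, \mu the mean molecular weight, m_p the proton mass, \varepsilon the volume emissivity, r the three-dimensional radius and R the projected radius.

```latex
\begin{align}
  \frac{dP}{dr} &= -\frac{G\,M(<r)\,\rho_g(r)}{r^{2}},
  \qquad P = \frac{\rho_g\,k_B T}{\mu m_p}, \\
  M(<r) &= -\frac{k_B T(r)\,r}{G\,\mu m_p}
  \left[\frac{d\ln\rho_g}{d\ln r} + \frac{d\ln T}{d\ln r}\right], \\
  S(R) &= 2\int_{R}^{\infty}
  \frac{\varepsilon(r)\,r\,dr}{\sqrt{r^{2}-R^{2}}}.
\end{align}
```

    The second line follows from substituting the ideal gas law into the first; the third is the line-of-sight integral of the emissivity, which is why the observed surface brightness profile constrains the gas density profile entering the mass estimate.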

  5. Cluster Analysis of Tumor Suppressor Genes in Canine Leukocytes Identifies Activation State

    PubMed Central

    Daly, Julie-Anne; Mortlock, Sally-Anne; Taylor, Rosanne M.; Williamson, Peter

    2015-01-01

    Cells of the immune system undergo activation and subsequent proliferation in the normal course of an immune response. Infrequently, the molecular and cellular events that underlie the mechanisms of proliferation are dysregulated and may lead to oncogenesis, leading to tumor formation. The most common forms of immunological cancers are lymphomas, which in dogs account for 8%–20% of all cancers, affecting up to 1.2% of the dog population. Key genes involved in negatively regulating proliferation of lymphocytes include a group classified as tumor suppressor genes (TSGs). These genes are also known to be associated with progression of lymphoma in humans, mice, and dogs and are potential candidates for pathological grading and diagnosis. The aim of the present study was to analyze TSG profiles in stimulated leukocytes from dogs to identify genes that discriminate an activated phenotype. A total of 554 TSGs and three gene set collections were analyzed from microarray data. Cluster analysis of three subsets of genes discriminated between stimulated and unstimulated cells. These included 20 most upregulated and downregulated TSGs, TSG in hallmark gene sets significantly enriched in active cells, and a selection of candidate TSGs, p15 (CDKN2B), p18 (CDKN2C), p19 (CDKN1A), p21 (CDKN2A), p27 (CDKN1B), and p53 (TP53) in the third set. Analysis of two subsets suggested that these genes or a subset of these genes may be used as a specialized PCR set for additional analysis. PMID:27478369

  6. Defining the optimal animal model for translational research using gene set enrichment analysis.

    PubMed

    Weidner, Christopher; Steinfath, Matthias; Opitz, Elisa; Oelgeschläger, Michael; Schönfelder, Gilbert

    2016-08-01

    The mouse is the main model organism used to study the functions of human genes because most biological processes in the mouse are highly conserved in humans. Recent reports that compared identical transcriptomic datasets of human inflammatory diseases with datasets from mouse models using traditional gene-to-gene comparison techniques resulted in contradictory conclusions regarding the relevance of animal models for translational research. To reduce susceptibility to biased interpretation, all genes of interest for the biological question under investigation should be considered. Thus, standardized approaches for systematic data analysis are needed. We analyzed the same datasets using gene set enrichment analysis focusing on pathways assigned to inflammatory processes in either humans or mice. The analyses revealed a moderate overlap between all human and mouse datasets, with average positive and negative predictive values of 48 and 57% significant correlations. Subgroups of the septic mouse models (i.e., Staphylococcus aureus injection) correlated very well with most human studies. These findings support the applicability of targeted strategies to identify the optimal animal model and protocol to improve the success of translational research. © 2016 The Authors. Published under the terms of the CC BY 4.0 license.
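
    As a reminder of what gene set enrichment analysis measures, the sketch below computes the unweighted, Kolmogorov-Smirnov-like running-sum enrichment score for one gene set over a ranked gene list. This is a simplification: the standard GSEA statistic usually weights hits by their ranking metric and assesses significance by permutation, neither of which is shown, and the gene names are invented.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (maximum deviation from zero)."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up at genes in the set, step down at genes outside it.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking, most upregulated first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})      # set concentrated at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})   # set concentrated at the bottom
print(es_top, es_bottom)
```

    A score near +1 or -1 indicates that the set's genes cluster at one end of the ranking, while a set spread evenly through the list gives a score near zero.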

  7. Metatranscriptomic array analysis of 'Candidatus Accumulibacter phosphatis'-enriched enhanced biological phosphorus removal sludge.

    PubMed

    He, Shaomei; Kunin, Victor; Haynes, Matthew; Martin, Hector Garcia; Ivanova, Natalia; Rohwer, Forest; Hugenholtz, Philip; McMahon, Katherine D

    2010-05-01

    Here we report the first metatranscriptomic analysis of gene expression and regulation of 'Candidatus Accumulibacter'-enriched lab-scale sludge during enhanced biological phosphorus removal (EBPR). Medium density oligonucleotide microarrays were generated with probes targeting most predicted genes hypothesized to be important for the EBPR phenotype. RNA samples were collected at the early stage of anaerobic and aerobic phases (15 min after acetate addition and switching to aeration respectively). We detected the expression of a number of genes involved in the carbon and phosphate metabolisms, as proposed by EBPR models (e.g. polyhydroxyalkanoate synthesis, a split TCA cycle through methylmalonyl-CoA pathway, and polyphosphate formation), as well as novel genes discovered through metagenomic analysis. The comparison between the early stage anaerobic and aerobic gene expression profiles showed that expression levels of most genes were not significantly different between the two stages. The majority of upregulated genes in the aerobic sample are predicted to encode functions such as transcription, translation and protein translocation, reflecting the rapid growth phase of Accumulibacter shortly after being switched to aerobic conditions. Components of the TCA cycle and machinery involved in ATP synthesis were also upregulated during the early aerobic phase. These findings support the predictions of EBPR metabolic models that the oxidation of intracellularly stored carbon polymers through the TCA cycle provides ATP for cell growth when oxygen becomes available. Nitrous oxide reductase was among the very few Accumulibacter genes upregulated in the anaerobic sample, suggesting that its expression is likely induced by the deprivation of oxygen.

  8. Enrichment of Root Endophytic Bacteria from Populus deltoides and Single-Cell-Genomics Analysis

    DOE PAGES

    Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Jr., Michael S.; ...

    2016-07-15

    Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. In this paper, we present a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Finally, comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria.

  9. A lectin-based isolation/enrichment strategy for improved coverage of N-glycan analysis.

    PubMed

    Guan, Feng; Tan, Zengqi; Li, Xiang; Pang, Xingchen; Zhu, Yunlin; Li, Dongliang; Yang, Ganglong

    2015-10-30

    Glycomics provides an increasingly useful research tool as the genomes and proteomes of more and more animal species are elucidated. In view of the general complexity and heterogeneity of glycans, improved depth-of-coverage and sensitivity are required for glycosylation analysis. In this study, we established the lectin-based isolation/enrichment strategy for total glycomic information. Specific lectins are added onto the filter to capture corresponding glycans prior to release of N-glycans by peptide N-glycosidase F (PNGase F). Non-bound glycans and bound glycans are released and analyzed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS), respectively. Application of the strategy to chicken ovalbumin, normal mouse mammary epithelial cells (NMuMG), and human serum resulted in detection of 5, 6, and 11 additional N-glycan structures, respectively. The strategy facilitates identification of intact N-glycans in biological samples, and can be extended to detailed analysis of O-glycome or glycoproteome.

  10. Enrichment of Root Endophytic Bacteria from Populus deltoides and Single-Cell-Genomics Analysis

    SciTech Connect

    Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Jr., Michael S.; Yang, Zamin Koo; Klingeman, Dawn Marie; Land, Miriam L.; Allman, Steve L.; Lu, Tse-Yuan S.; Brown, Steven D.; Schadt, Christopher Warren; Podar, Mircea; Doktycz, Mitchel J.; Pelletier, Dale A.

    2016-07-15

    Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. In this paper, we present a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Finally, comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria.

  11. Do cnidarians have a ParaHox cluster? Analysis of synteny around a Nematostella homeobox gene cluster.

    PubMed

    Hui, Jerome H L; Holland, Peter W H; Ferrier, David E K

    2008-01-01

    The Hox gene cluster is renowned for its role in developmental patterning of embryogenesis along the anterior-posterior axis of bilaterians. Its supposed evolutionary sister or paralog, the ParaHox cluster, is composed of Gsx, Xlox, and Cdx, and also has important roles in anterior-posterior development. There is a debate as to whether the cnidarians, as an outgroup to bilaterians, contain true Hox and ParaHox genes, or instead the Hox-like gene complement of cnidarians arose from independent duplications to those that generated the genes of the bilaterian Hox and ParaHox clusters. A recent whole genome analysis of the cnidarian Nematostella vectensis found conserved synteny between this cnidarian and vertebrates, including a region of synteny between the putative Hox cluster of N. vectensis and the Hox clusters of vertebrates. No syntenic region was identified around a potential cnidarian ParaHox cluster. Here we use different approaches to identify a genomic region in N. vectensis that is syntenic with the bilaterian ParaHox cluster. This proves that the duplication that gave rise to the Hox and ParaHox regions of bilaterians occurred before the origin of cnidarians, and the cnidarian N. vectensis has bona fide Hox and ParaHox loci.

  12. Clustering and Network Analysis of Reverse Phase Protein Array Data.

    PubMed

    Byron, Adam

    2017-01-01

    Molecular profiling of proteins and phosphoproteins using a reverse phase protein array (RPPA) platform, with a panel of target-specific antibodies, enables the parallel, quantitative proteomic analysis of many biological samples in a microarray format. Hence, RPPA analysis can generate a high volume of multidimensional data that must be effectively interrogated and interpreted. A range of computational techniques for data mining can be applied to detect and explore data structure and to form functional predictions from large datasets. Here, two approaches for the computational analysis of RPPA data are detailed: the identification of similar patterns of protein expression by hierarchical cluster analysis and the modeling of protein interactions and signaling relationships by network analysis. The protocols use freely available, cross-platform software, are easy to implement, and do not require any programming expertise. Serving as data-driven starting points for further in-depth analysis, validation, and biological experimentation, these and related bioinformatic approaches can accelerate the functional interpretation of RPPA data.

  13. A Laser-Based Method for On-Site Analysis of UF6 at Enrichment Plants

    SciTech Connect

    Anheier, Norman C.; Cannon, Bret D.; Martinez, Alonzo; Barrett, Christopher A.; Taubman, Matthew S.; Anderson, Kevin K.; Smith, Leon E.

    2014-11-23

    The International Atomic Energy Agency’s (IAEA’s) long-term research and development plan calls for more cost-effective and efficient safeguard methods to detect and deter misuse of gaseous centrifuge enrichment plants (GCEPs). The IAEA’s current safeguards approaches at GCEPs are based on a combination of routine and random inspections that include environmental sampling and destructive assay (DA) sample collection from UF6 in-process material and selected cylinders. Samples are then shipped offsite for subsequent laboratory analysis. In this paper, a new DA sample collection and onsite analysis approach that could help to meet challenges in transportation and chain of custody for UF6 DA samples is introduced. This approach uses a handheld sampler concept and a Laser Ablation, Laser Absorbance Spectrometry (LAARS) analysis instrument, both currently under development at the Pacific Northwest National Laboratory. A LAARS analysis instrument could be temporarily or permanently deployed in the IAEA control room of the facility, in the IAEA data acquisition cabinet, for example. The handheld PNNL DA sampler design collects and stabilizes a much smaller DA sample mass compared to current sampling methods. The significantly lower uranium mass reduces the sample radioactivity and the stabilization approach diminishes the risk of uranium and hydrogen fluoride release. These attributes enable safe sample handling needed during onsite LAARS assay and may help ease shipping challenges for samples to be processed at the IAEA’s offsite laboratory. The LAARS and DA sampler implementation concepts will be described and preliminary technical viability results presented.

  14. Highlights of the Merging Cluster Collaboration's Analysis of 26 Radio Relic Galaxy Cluster Mergers

    NASA Astrophysics Data System (ADS)

    Dawson, William; Golovich, Nathan; Wittman, David M.; Bradac, Marusa; Brüggen, Marcus; Bullock, James; Elbert, Oliver; Jee, James; Kaplinghat, Manoj; Kim, Stacy; Mahdavi, Andisheh; Merten, Julian; Ng, Karen; Annika, Peter; Rocha, Miguel E.; Sobral, David; Stroe, Andra; Van Weeren, Reinout J.; Merging Cluster Collaboration

    2016-01-01

    Merging galaxy clusters are now recognized as multifaceted probes providing unique insight into the properties of dark matter, the environmental impact of plasma shocks on galaxy evolution, and the physics of high energy particle acceleration. The Merging Cluster Collaboration has used the diffuse radio emission associated with the synchrotron radiation of relativistic particles accelerated by shocks generated during major cluster mergers (i.e. radio relics) to identify a homogeneous sample of 26 galaxy cluster mergers. We have confirmed theoretical expectations that radio relics are predominantly associated with mergers occurring near the plane of the sky and at a relatively common merger phase, making them ideal probes of self-interacting dark matter and eliminating much of the dominant uncertainty when relating the observed star formation rates to the event of the major cluster merger. We will highlight a number of the discovered common traits of this sample as well as detailed measurements of individual mergers.

  15. Principal Component Analysis and Cluster Analysis in Profile of Electrical System

    NASA Astrophysics Data System (ADS)

    Iswan; Garniwa, I.

    2017-03-01

    This paper presents an approach for profiling an electrical system that combines principal component analysis (PCA) and cluster analysis, based on data on gross regional domestic product and on electric power and energy use. The profile is set up to show the condition of the region's electrical system and is intended to inform policy for the future spatial development of the electrical system. The paper considers 24 regions in South Sulawesi province as profile center points and uses PCA to assess the regional profile for development. Cluster analysis is then used to group these regions into a few clusters according to the new variables produced by PCA. The general planning of the electrical system of South Sulawesi province can thereby support policy making for electrical system development. Future research could add several variables to the existing ones.

  16. The composite sequential clustering technique for analysis of multispectral scanner data

    NASA Technical Reports Server (NTRS)

    Su, M. Y.

    1972-01-01

    The clustering technique consists of two parts: (1) a sequential statistical clustering which is essentially a sequential variance analysis, and (2) a generalized K-means clustering. In this composite clustering technique, the output of (1) is a set of initial clusters which are input to (2) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum likelihood classification techniques. The mathematical algorithms for the composite sequential clustering program and a detailed computer program description with job setup are given.
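
    The two-part structure described above (a sequential pass that produces initial clusters, followed by an iterative K-means refinement) can be illustrated with a minimal sketch. This is not the program documented in the report: the distance-threshold rule in `sequential_clusters`, the `radius` parameter, and all function names are assumptions made for illustration, standing in for the sequential variance analysis that the abstract mentions but does not specify.

```python
import math


def sequential_clusters(points, radius):
    """Single pass over the data: join the nearest running centroid if it is
    within `radius` (a hypothetical threshold), otherwise open a new cluster."""
    centroids, counts = [], []
    for p in points:
        best, best_d = None, float("inf")
        for i, c in enumerate(centroids):
            d = math.dist(p, c)
            if d < best_d:
                best, best_d = i, d
        if best is not None and best_d <= radius:
            counts[best] += 1
            n = counts[best]
            # Incremental mean update of the running centroid.
            centroids[best] = tuple(c + (x - c) / n for c, x in zip(centroids[best], p))
        else:
            centroids.append(tuple(p))
            counts.append(1)
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points]


def kmeans(points, centroids, iterations=20):
    """Refine the initial centroids by standard K-means iterations."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i, c in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else c)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    seeds = sequential_clusters(pixels, radius=3.0)
    print(kmeans(pixels, seeds))
```

    In this sketch the number of clusters is never given directly; it follows from the threshold used in the first pass, which is one plausible reason for pairing a sequential stage with K-means.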

  17. The methodology of multi-viewpoint clustering analysis

    NASA Technical Reports Server (NTRS)

    Mehrotra, Mala; Wild, Chris

    1993-01-01

    One of the greatest challenges facing the software engineering community is the ability to produce large and complex computer systems, such as ground support systems for unmanned scientific missions, that are reliable and cost effective. In order to build and maintain these systems, it is important that the knowledge in the system be suitably abstracted, structured, and otherwise clustered in a manner which facilitates its understanding, manipulation, testing, and utilization. Development of complex mission-critical systems will require the ability to abstract overall concepts in the system at various levels of detail and to consider the system from different points of view. The Multi-ViewPoint Clustering Analysis (MVP-CA) methodology has been developed to provide multiple views of large, complicated systems. MVP-CA provides an ability to discover significant structures by providing an automated mechanism to structure both hierarchically (from detail to abstract) and orthogonally (from different perspectives). We propose to integrate MVP-CA into an overall software engineering life cycle to support the development and evolution of complex mission-critical systems.

  19. Focused maternity care in Ghana: results of a cluster analysis.

    PubMed

    Ayanore, Martin Amogre; Pavlova, Milena; Groot, Wim

    2016-08-17

    Ghana did not attain Millennium Development Goal 5 in 2015. The provision of adequate prenatal and postnatal care remains problematic, with poor evidence on women's views on met and unmet maternity care needs across all regions in Ghana. This paper examines maternal care utilization in Ghana by applying WHO indicators for focused maternal care utilization. Two-step cluster analysis segregated women into groups based on the components of the maternity care used. Using cluster membership variables as dependent variables, we applied multinomial and binary regression to examine associations of care use with individual, household and regional characteristics. We identified three patterns of care use: adequate, less adequate, and least adequate care. The presence of a female and skilled provider is an indicator of adequate care. Women in Volta, Upper West, Northern and Western regions received less adequate care compared with other regions. Supply-related factors (drug availability, distance/transport, health insurance ownership, rural residence) were associated with adequacy of care. The lack of female autonomy, widowed/divorced status, age and parity were associated with less adequate care. Care patterns were distinctively associated with the quality of health care support (skilled and female attendant) instead of with the number of visits made to the facility. Across regions and within rural settings, disparities exist, often compounded by supply-related factors. Efforts to address skilled workforce shortages, increase accountability for quality and equity, and improve women's motivation for care seeking and active participation are important for maternity care in Ghana.

  20. [Clustering Analysis of Hydatid Disease in Gansu Province].

    PubMed

    Yu, Da-wei; Ding, Guo-wu; Hou, Yan-dong; Feng, Yu; Li, Fan

    2015-08-01

    The prevalence of hydatid disease in the human population and livestock, and the positive rate of echinococcal antigen in canine feces, were analyzed with a sample clustering method, using the survey on hydatid disease in 72 counties in Gansu province from the database of the National Survey on Prevalence of Echinococcosis in 2012. The prevalence of hydatid disease in humans and livestock, and the positive rate of echinococcal antigen in canine feces were 0-1.59%, 0-15.22%, and 0-16.87% respectively. Clustering analysis revealed four types of prevalence in the 72 counties. The first type existed only in Dunhuang city, with the three indicators being 0.27%, 15.22% and 16.87%; the second in four counties, with the three indicators being 0.43%, 6.57% and 1.83%; the third in 22 counties, with the three indicators being 0.22%, 1.15% and 1035%; and the fourth in 45 counties, with the three indicators being 0.16%, 0.58% and 1.69%.

  1. Using n-gram analysis to cluster heartbeat signals

    PubMed Central

    2012-01-01

    Background: Biological signals may carry specific characteristics that reflect basic dynamics of the body. In particular, heartbeat signals carry specific signatures that are related to human physiologic mechanisms. In recent years, many researchers have shown that representations which use non-linear symbolic sequences can often reveal much hidden dynamic information. This kind of symbolization has proved useful for predicting life-threatening cardiac diseases. Methods: This paper presents an improved method called the “Adaptive Interbeat Interval Analysis (AIIA) method”. The AIIA method uses the Simple K-Means algorithm for symbolization, which offers a new way to represent subtle variations between two interbeat intervals without human intervention. After symbolization, it uses the n-gram algorithm to generate different kinds of symbolic sequences. Each symbolic sequence stands for a variation phase. Finally, the symbolic sequences are categorized by classic classifiers. Results: In the experiments presented in this paper, the AIIA method achieved 91% (3-gram, 26 clusters) accuracy in classifying between patients with Atrial Fibrillation (AF), patients with Congestive Heart Failure (CHF), and healthy people. It also achieved 87% (3-gram, 26 clusters) accuracy in classifying patients with apnea. Conclusions: The two experiments presented in this paper demonstrate that the AIIA method can categorize different heart diseases. Both experiments obtained the best classification results when using the Bayesian Network. For future work, the concept of the AIIA method can be extended to the categorization of other physiological signals. More features can be added to improve the accuracy. PMID:22769567
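
    The symbolization and n-gram steps described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' AIIA implementation: the use of successive interval differences as the clustered quantity, the deterministic centroid initialization, the letter alphabet, and the function names are all assumptions, and the final classifier stage is omitted.

```python
from collections import Counter


def kmeans_1d(values, k, iterations=50):
    """Plain 1-D k-means; returns the centroids in ascending order."""
    ordered = sorted(values)
    # Deterministic start (an assumption): spread seeds over the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a group is empty.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def symbolize(values, centroids):
    """Map each value to the letter of its nearest centroid (at most 26 clusters)."""
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    return "".join(
        alphabet[min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))]
        for v in values
    )


def ngram_counts(symbols, n):
    """Count every run of n consecutive symbols."""
    return Counter(symbols[i:i + n] for i in range(len(symbols) - n + 1))


if __name__ == "__main__":
    # Made-up interbeat intervals in milliseconds, for illustration only.
    intervals = [800, 820, 810, 1200, 790, 800, 1220, 810]
    diffs = [b - a for a, b in zip(intervals, intervals[1:])]
    symbols = symbolize(diffs, kmeans_1d(diffs, k=3))
    print(symbols, ngram_counts(symbols, 3).most_common(2))
```

    The resulting n-gram counts would serve as the feature vector passed to a classifier; the abstract reports that a Bayesian Network gave the best results, but any such stage is outside this sketch.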

  2. Fuel handling accident analysis for the University of Missouri Research Reactor's High Enriched Uranium to Low Enriched Uranium fuel conversion initiative

    NASA Astrophysics Data System (ADS)

    Rickman, Benjamin

    In accordance with the 1986 amendment concerning licenses for research and test reactors, the MU Research Reactor (MURR) is planning to convert from using High-Enriched Uranium (HEU) fuel to the use of Low-Enriched Uranium (LEU) fuel. Since the approval of a new LEU fuel that could meet the MURR's performance demands, the next phase of action for the fuel conversion process is to create a new Safety Analysis Report (SAR) with respect to the LEU fuel. A component of the SAR includes the Maximum Hypothetical Accident (MHA) and accidents that qualify under the class of Fuel Handling Accidents (FHA). In this work, the dose to occupational staff at the MURR is calculated for the FHAs. The radionuclide inventory for the proposed LEU fuel was calculated using the ORIGEN2 point-depletion code linked to the MURR neutron spectrum. The MURR spectrum was generated from a Monte Carlo N-Particle (MCNP) simulation. The coupling of these codes creates MONTEBURNS, a time-dependent burnup code. The release fraction from each FHA within this analysis was established by the methodology of the 2006 HEU SAR, which was accepted by the NRC. The actual dose methodology was not recorded in the HEU SAR, so a conservative path was chosen. In compliance with NUREG 1537, when new methodology is used in a HEU to LEU analysis, it is necessary to re-evaluate the HEU accident. The Total Effective Dose Equivalent (TEDE) values were calculated in addition to the whole body dose and thyroid dose to operation personnel. The LEU FHA occupational TEDE dose was 349 mrem, which is under the NRC regulatory occupational dose limit of 5 rem TEDE and under the LEU MHA limit of 403 mrem. The re-evaluated HEU FHA occupational TEDE dose was 235 mrem, which is above the HEU MHA TEDE dose of 132 mrem. Since the new methodology produces a dose that is larger than the HEU MHA, we can safely assume that it is more conservative than the previous, unspecified dose.

  3. Investigating Faculty Familiarity with Assessment Terminology by Applying Cluster Analysis to Interpret Survey Data

    ERIC Educational Resources Information Center

    Raker, Jeffrey R.; Holme, Thomas A.

    2014-01-01

    A cluster analysis was conducted with a set of survey data on chemistry faculty familiarity with 13 assessment terms. Cluster groupings suggest a high, middle, and low overall familiarity with the terminology and an independent high and low familiarity with terms related to fundamental statistics. The six resultant clusters were found to be…

  4. The Use of Cluster Analysis in Typological Research on Community College Students

    ERIC Educational Resources Information Center

    Bahr, Peter Riley; Bielby, Rob; House, Emily

    2011-01-01

    One useful and increasingly popular method of classifying students is known commonly as cluster analysis. The variety of techniques that comprise the cluster analytic family are intended to sort observations (for example, students) within a data set into subsets (clusters) that share similar characteristics and differ in meaningful ways from other…

  7. Cluster analysis of rural, urban, and curbside atmospheric particle size data.

    PubMed

    Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M

    2009-07-01

    Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique among four partitional clustering packages, namely Fuzzy, k-means, k-median, and Model-Based clustering. Using cluster validation indices, k-means clustering was shown to produce clusters with the smallest size, furthest separation, and, importantly, the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced, allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal diameters and average temporal trends showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according to their modal diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred according to the cluster characteristics and correlation to concurrently measured meteorological, gas phase, and particle phase measurements.
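
    The abstract does not name the validation indices that were used. As one common example of an index that rewards compact, well-separated clusters, the mean silhouette width can be computed as in the sketch below; the choice of this index, the Euclidean distance, and the function name are assumptions rather than the study's method.

```python
import math


def silhouette(points, labels):
    """Mean silhouette width over all points; higher means tighter, better
    separated clusters. Points in singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


if __name__ == "__main__":
    spectra = [(0, 0), (0, 1), (10, 0), (10, 1)]  # toy feature vectors
    print(silhouette(spectra, [0, 0, 1, 1]))  # compact partition
    print(silhouette(spectra, [0, 1, 0, 1]))  # poorly separated partition
```

    Comparing the score across partitions produced by different algorithms, or by the same algorithm with different numbers of clusters, is one way to choose among them.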

  8. Clinical Characteristics of Exacerbation-Prone Adult Asthmatics Identified by Cluster Analysis.

    PubMed

    Kim, Mi Ae; Shin, Seung Woo; Park, Jong Sook; Uh, Soo Taek; Chang, Hun Soo; Bae, Da Jeong; Cho, You Sook; Park, Hae Sim; Yoon, Ho Joo; Choi, Byoung Whui; Kim, Yong Hoon; Park, Choon Sik

    2017-11-01

    Asthma is a heterogeneous disease characterized by various types of airway inflammation and obstruction. Therefore, it is classified into several subphenotypes, such as early-onset atopic, obese non-eosinophilic, benign, and eosinophilic asthma, using cluster analysis. A number of asthmatics frequently experience exacerbation over a long-term follow-up period, but the exacerbation-prone subphenotype has rarely been evaluated by cluster analysis. This prompted us to identify clusters reflecting asthma exacerbation. A uniform cluster analysis method was applied to 259 adult asthmatics who were regularly followed-up for over 1 year using 12 variables, selected on the basis of their contribution to asthma phenotypes. After clustering, clinical profiles and exacerbation rates during follow-up were compared among the clusters. Four subphenotypes were identified: cluster 1 was comprised of patients with early-onset atopic asthma with preserved lung function, cluster 2 late-onset non-atopic asthma with impaired lung function, cluster 3 early-onset atopic asthma with severely impaired lung function, and cluster 4 late-onset non-atopic asthma with well-preserved lung function. The patients in clusters 2 and 3 were identified as exacerbation-prone asthmatics, showing a higher risk of asthma exacerbation. Two different phenotypes of exacerbation-prone asthma were identified among Korean asthmatics using cluster analysis; both were characterized by impaired lung function, but the age at asthma onset and atopic status were different between the two.

  9. Cluster stability in the analysis of mass cytometry data.

    PubMed

    Melchiotti, Rossella; Gracio, Filipe; Kordasti, Shahram; Todd, Alan K; de Rinaldis, Emanuele

    2017-01-01

    Manual gating has been traditionally applied to cytometry data sets to identify cells based on protein expression. The advent of mass cytometry allows for a higher number of proteins to be simultaneously measured on cells, therefore providing a means to define cell clusters in a high dimensional expression space. This enhancement, whilst opening unprecedented opportunities for single cell-level analyses, makes the incremental replacement of manual gating with automated clustering a compelling need. To this aim many methods have been implemented and their successful applications demonstrated in different settings. However, the reproducibility of automatically generated clusters is proving challenging and an analytical framework to distinguish spurious clusters from more stable entities, and presumably more biologically relevant ones, is still missing. One way to estimate cell clusters' stability is the evaluation of their consistent re-occurrence within- and between-algorithms, a metric that is commonly used to evaluate results from gene expression. Herein we report the usage and importance of cluster stability evaluations, when applied to results generated from three popular clustering algorithms - SPADE, FLOCK and PhenoGraph - run on four different data sets. These algorithms were shown to generate clusters with various degrees of statistical stability, many of them being unstable. By comparing the results of automated clustering with manually gated populations, we illustrate how information on cluster stability can assist towards a more rigorous and informed interpretation of clustering results. We also explore the relationships between statistical stability and other properties such as clusters' compactness and isolation, demonstrating that whilst cluster stability is linked to other properties it cannot be reliably predicted by any of them. Our study proposes the introduction of cluster stability as a necessary checkpoint for cluster interpretation and

  10. Speeding up the Consensus Clustering methodology for microarray data analysis

    PubMed Central

    2011-01-01

    Background The inference of the number of clusters in a dataset, a fundamental problem in Statistics, Data Analysis and Classification, is usually addressed via internal validation measures. The stated problem is quite difficult, in particular for microarrays, since the inferred prediction must be sensible enough to capture the inherent biological structure in a dataset, e.g., functionally related genes. Despite the rich literature present in that area, the identification of an internal validation measure that is both fast and precise has proved to be elusive. In order to partially fill this gap, we propose a speed-up of Consensus (Consensus Clustering), a methodology whose purpose is the provision of a prediction of the number of clusters in a dataset, together with a dissimilarity matrix (the consensus matrix) that can be used by clustering algorithms. As detailed in the remainder of the paper, Consensus is a natural candidate for a speed-up. Results Since the time-precision performance of Consensus depends on two parameters, our first task is to show that a simple adjustment of the parameters is not enough to obtain a good precision-time trade-off. Our second task is to provide a fast approximation algorithm for Consensus. That is, the closely related algorithm FC (Fast Consensus) that would have the same precision as Consensus with a substantially better time performance. The performance of FC has been assessed via extensive experiments on twelve benchmark datasets that summarize key features of microarray applications, such as cancer studies, gene expression with up and down patterns, and a full spectrum of dimensionality up to over a thousand. Based on their outcome, compared with previous benchmarking results available in the literature, FC turns out to be among the fastest internal validation methods, while retaining the same outstanding precision of Consensus. Moreover, it also provides a consensus matrix that can be used as a dissimilarity matrix

  11. Transmutation Analysis of Enriched Uranium and Deep Burn High Temperature Reactors

    SciTech Connect

    Michael A. Pope

    2012-07-01

    High temperature reactors (HTRs) have been under consideration for production of electricity, process heat, and for destruction of transuranics for decades. As part of the transmutation analysis efforts within the Fuel Cycle Research and Development (FCR&D) campaign, a need was identified for detailed discharge isotopics from HTRs for use in the VISION code. A conventional HTR using enriched uranium in UCO fuel was modeled having discharge burnup of 120 GWd/MTiHM. Also, a deep burn HTR (DB-HTR) was modeled burning transuranic (TRU)-only TRU-O2 fuel to a discharge burnup of 648 GWd/MTiHM. For each of these cases, unit cell depletion calculations were performed with SCALE/TRITON. Unit cells were used to perform this analysis using SCALE 6.1. Because of the long mean free paths (and migration lengths) of neutrons in HTRs, using a unit cell to represent a whole core can be non-trivial. The sizes of these cells were first set by using Serpent calculations to match a spectral index between unit cell and whole core domains. In the case of the DB-HTR, the unit cell which was arrived at in this way conserved the ratio of fuel to moderator found in a single block of fuel. In the conventional HTR case, a larger moderator-to-fuel ratio than that of a single block was needed to simulate the whole core spectrum. Discharge isotopics (for 500 nuclides) and one-group cross-sections (for 1022 nuclides) were delivered to the transmutation analysis team. This report provides documentation for these calculations. In addition to the discharge isotopics, one-group cross-sections were provided for the full list of 1022 nuclides tracked in the transmutation library.

  12. A Hierarchical Bayesian Procedure for Two-Mode Cluster Analysis

    ERIC Educational Resources Information Center

    DeSarbo, Wayne S.; Fong, Duncan K. H.; Liechty, John; Saxton, M. Kim

    2004-01-01

    This manuscript introduces a new Bayesian finite mixture methodology for the joint clustering of row and column stimuli/objects associated with two-mode asymmetric proximity, dominance, or profile data. That is, common clusters are derived which partition both the row and column stimuli/objects simultaneously into the same derived set of clusters.…

  14. High-resolution synchrotron X-ray analysis of bioglass-enriched hydrogels.

    PubMed

    Gorodzha, Svetlana; Douglas, Timothy E L; Samal, Sangram K; Detsch, Rainer; Cholewa-Kowalska, Katarzyna; Braeckmans, Kevin; Boccaccini, Aldo R; Skirtach, Andre G; Weinhardt, Venera; Baumbach, Tilo; Surmeneva, Maria A; Surmenev, Roman A

    2016-05-01

    Enrichment of hydrogels with inorganic particles improves their suitability for bone regeneration by enhancing their mechanical properties, mineralizability, and bioactivity as well as adhesion, proliferation, and differentiation of bone-forming cells, while maintaining injectability. Low aggregation and homogeneous distribution maximize particle surface area, promoting mineralization, cell-particle interactions, and homogeneous tissue regeneration. Hence, determination of the size and distribution of particles/particle agglomerates in the hydrogel is desirable. Commonly used techniques have drawbacks. High-resolution techniques (e.g., SEM) require drying. Distribution in the dry state is not representative of the wet state. Techniques in the wet state (histology, µCT) are of lower resolution. Here, self-gelling, injectable composites of Gellan Gum (GG) hydrogel and two different types of sol-gel-derived bioactive glass (bioglass) particles were analyzed in the wet state using Synchrotron X-ray radiation, enabling high-resolution determination of particle size and spatial distribution. The lower detection limit volume was 9 × 10⁻⁵ mm³. Bioglass particle suspensions were also studied using zeta potential measurements and Coulter analysis. Aggregation of bioglass particles in the GG hydrogels occurred and aggregate distribution was inhomogeneous. Bioglass promoted attachment of rat mesenchymal stem cells (rMSC) and mineralization.

  15. An enriched 1D finite element for the buckling analysis of sandwich beam-columns

    NASA Astrophysics Data System (ADS)

    Sad Saoud, Kahina; Le Grognec, Philippe

    2016-06-01

    Sandwich constructions have been widely used during the last few decades in various practical applications, especially thanks to their attractive compromise between light weight and high mechanical properties. Nevertheless, despite the advances achieved to date, buckling still remains a major failure mode for sandwich materials, one which often fatally leads to collapse. Recently, one of the authors derived closed-form analytical solutions for the buckling analysis of sandwich beam-columns under compression or pure bending. These solutions are based on a specific hybrid formulation where the faces are represented by Euler-Bernoulli beams and the core layer is described as a 2D continuous medium. When considering more complex loadings or non-trivial boundary conditions, closed-form solutions are no longer available and one must resort to numerical models. Instead of using a computationally expensive 2D model, the present paper aims at developing an original enriched beam finite element. It is based on the previous analytical formulation, insofar as the skin layers are modeled by Timoshenko beams whereas the displacement fields in the core layer are described by means of hyperbolic functions, in accordance with the modal displacement fields obtained analytically. By using this 1D finite element, linearized buckling analyses are performed for various loading cases, and the results are compared with either analytical or numerical reference solutions for validation purposes.

  16. Cluster Analysis in Nursing Research: An Introduction, Historical Perspective, and Future Directions.

    PubMed

    Dunn, Heather; Quinn, Laurie; Corbridge, Susan J; Eldeirawi, Kamal; Kapella, Mary; Collins, Eileen G

    2017-05-01

    The use of cluster analysis in the nursing literature is limited to the creation of classifications of homogeneous groups and the discovery of new relationships. As such, it is important to provide clarity regarding its use and potential. The purpose of this article is to provide an introduction to distance-based, partitioning-based, and model-based cluster analysis methods commonly utilized in the nursing literature, provide a brief historical overview on the use of cluster analysis in nursing literature, and provide suggestions for future research. An electronic search included three bibliographic databases, PubMed, CINAHL and Web of Science. Key terms were cluster analysis and nursing. The use of cluster analysis in the nursing literature is increasing and expanding. The increased use of cluster analysis in the nursing literature is positioning this statistical method to result in insights that have the potential to change clinical practice.

  17. Embryonic stem cell-based screen for small molecules: cluster analysis reveals four response patterns in developing neural cells.

    PubMed

    Kern, I; Xu, R; Julien, S; Suter, D M; Preynat-Seauve, O; Baquié, M; Poncet, A; Combescure, C; Stoppini, L; Thriel, C V; Krause, Karl-Heinz

    2013-01-01

    Neural differentiation of embryonic stem cells (ESC) is considered a promising model to perform in vitro testing for neuroactive and neurotoxic compounds. We studied the potential of a dual reporter murine ESC line to identify bioactive and/or toxic compounds. This line expressed firefly luciferase under the control of the neural cell-specific tubulin alpha promoter (TUBA1A), and renilla luciferase under the control of the ubiquitous translation elongation factor 1-alpha-1 (EEF1A1) promoter. During neural differentiation, TUBA1A activity increased, while EEF1A1 activity decreased. We first validated our test system using the known neurotoxin methyl mercury. This compound altered expression of both reporter genes, with ESC-derived neural precursors being affected at markedly lower concentrations than undifferentiated ESCs. Analysis of a library of 1040 bioactive compounds picked up 127 compounds with altered EEF1A1 and/or TUBA1A promoter activity, which were classified in 4 clusters. Cluster 1 (low EEF1A1 and TUBA1A) was the largest cluster, containing many cytostatic drugs, as well as known neurodevelopmental toxicants, psychotropic drugs and endocrine disruptors. Cluster 2 (high EEF1A1, stable TUBA1A) was limited to three sulfonamides. Cluster 3 (high EEF1A1 and TUBA1A) was small, but markedly enriched in neuroactive and neurotoxic compounds. Cluster 4 (stable EEF1A1, high TUBA1A) was heterogeneous, containing endocrine disruptors, neurotoxic and cytostatic drugs. The dual reporter gene assay described here might be a useful addition to in vitro drug testing panels. Our two-dimensional testing strategy provides information on complex response patterns, which could not be achieved by a single marker approach.

  18. Analysis of fMRI data using an integrated principal component analysis and supervised affinity propagation clustering approach.

    PubMed

    Zhang, Jiang; Tuo, Xianguo; Yuan, Zhen; Liao, Wei; Chen, Huafu

    2011-11-01

    Clustering analysis is a promising data-driven method for analyzing functional magnetic resonance imaging (fMRI) time series data. The huge computational load, however, creates practical difficulties for this technique. We present a novel approach, integrating principal component analysis (PCA) and supervised affinity propagation clustering (SAPC). In this method, fMRI data are initially processed by PCA to obtain a preliminary image of brain activation. SAPC is then used to detect different brain functional activation patterns. We used a supervised Silhouette index to optimize clustering quality and automatically search for the optimal parameter p in SAPC, so that the basic affinity propagation clustering is improved by applying SAPC. Four simulation studies and tests with three in vivo fMRI datasets containing data from both block-design and event-related experiments revealed that functional brain activation was effectively detected and different response patterns were distinguished using our integrated method. In addition, the improved SAPC method was superior to the k-centers clustering and hierarchical clustering methods in both block-design and event-related fMRI data, as measured by the average squared error. These results suggest that our proposed novel integrated approach will be useful for detecting brain functional activation in both block-design and event-related experimental fMRI data.
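
    The idea of scoring candidate clusterings with a silhouette index and keeping the best parameter value can be sketched as follows. This is a minimal illustration, not the authors' SAPC: it computes the standard (unsupervised) mean silhouette width with Euclidean distance, and it assumes the label vectors for each candidate value of p were produced elsewhere by a clustering run. The function names and the toy data are hypothetical.

```python
from math import dist
from typing import Dict, Hashable, List, Sequence


def silhouette_score(points: Sequence[Sequence[float]],
                     labels: Sequence[Hashable]) -> float:
    """Mean silhouette width over all points (Euclidean distance)."""
    clusters: Dict[Hashable, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_parameter(points: Sequence[Sequence[float]],
                   candidates: Dict[float, Sequence[Hashable]]) -> float:
    """Return the candidate parameter whose labelling has the highest silhouette."""
    return max(candidates, key=lambda p: silhouette_score(points, candidates[p]))


# Toy data: two well-separated groups on a line.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
toy_candidates = {
    -50.0: [0, 0, 1, 1],  # hypothetical labels from one parameter setting
    -5.0: [0, 1, 0, 1],   # hypothetical labels from another setting
}
```

    With the toy data, `best_parameter(toy_points, toy_candidates)` selects the setting whose labels match the two visible groups, because that labelling has a positive silhouette while the other is negative.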

  19. Microarray Cluster Analysis of Irradiated Growth Plate Zones Following Laser Microdissection

    SciTech Connect

    Damron, Timothy A.; Zhang, Mingliang; Pritchard, Meredith R.; Middleton, Frank A.; Horton, Jason A.; Margulies, Bryan M.; Strauss, Judith A.; Farnum, Cornelia E.; Spadaro, Joseph A.

    2009-07-01

    Purpose: Genes and pathways involved in early growth plate chondrocyte recovery after fractionated irradiation were sought as potential targets for selective radiorecovery modulation. Materials and Methods: Three groups of six 5-week male Sprague-Dawley rats underwent fractionated irradiation to the right tibiae over 5 days, totaling 17.5 Gy, and then were killed at 7, 11, and 16 days after the first radiotherapy fraction. The growth plates were collected from the proximal tibiae bilaterally and subsequently underwent laser microdissection to separate reserve, perichondral, proliferative, and hypertrophic zones. Differential gene expression was analyzed between irradiated right and nonirradiated left tibia using RAE230 2.0 GeneChip microarray, compared between zones and time points and subjected to functional pathway cluster analysis with real-time polymerase chain reaction to confirm selected results. Results: Each zone had a number of pathways showing enrichment after the pattern of hypothesized importance to growth plate recovery, yet few met the strictest criteria. The proliferative and hypertrophic zones showed both the greatest number of genes with a 10-fold right/left change at 7 days after initiation of irradiation and enrichment of the most functional pathways involved in bone, cartilage, matrix, or skeletal development. Six genes confirmed by real-time polymerase chain reaction to have early upregulation included insulin-like growth factor 2, procollagen type I alpha 2, matrix metallopeptidase 9, parathyroid hormone receptor 1, fibromodulin, and aggrecan 1. Conclusions: Nine overlapping pathways in the proliferative and hypertrophic zones (skeletal development, ossification, bone remodeling, cartilage development, extracellular matrix structural constituent, proteinaceous extracellular matrix, collagen, extracellular matrix, and extracellular matrix part) may play key roles in early growth plate radiorecovery.
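
    Pathway enrichment of a selected gene list is often assessed with a one-sided hypergeometric (over-representation) test. The abstract does not say which statistic the authors used, so the sketch below is an assumption about a common approach rather than a description of their method; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, pathway_size: int,
                           selected: int, overlap: int) -> float:
    """One-sided over-representation p-value.

    Returns P(X >= overlap), where X is the number of pathway genes among
    `selected` genes drawn without replacement from `population` genes,
    of which `pathway_size` belong to the pathway.
    """
    if not (0 <= pathway_size <= population and 0 <= selected <= population):
        raise ValueError("pathway_size and selected must lie in [0, population]")
    upper = min(pathway_size, selected)  # largest possible overlap
    if overlap > upper:
        return 0.0
    # Smallest feasible overlap, so that every binomial coefficient is valid.
    lower = max(overlap, 0, selected - (population - pathway_size))
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(lower, upper + 1)
    )
    return tail / total
```

    For example, with 10 genes on the array, 4 of them in a pathway, and 3 genes selected as changed, observing at least 2 pathway genes among the selected has probability (36 + 4) / 120 = 1/3 under the null, so `hypergeom_enrichment_p(10, 4, 3, 2)` is about 0.333. In practice the p-values from many pathways would also need a multiple-testing correction.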

  20. Cluster randomized clinical trials in orthodontics: design, analysis and reporting issues.

    PubMed

    Pandis, Nikolaos; Walsh, Tanya; Polychronopoulou, Argy; Eliades, Theodore

    2013-10-01

    Cluster randomized trials (CRTs) use clusters as the unit of randomization; a cluster is usually defined as a collection of individuals sharing some common characteristics. Common examples of clusters include entire dental practices, hospitals, schools, school classes, villages, and towns. Additionally, several measurements (repeated measurements) taken on the same individual at different time points are also considered to be clusters. In dentistry, CRTs are applicable as patients may be treated as clusters containing several individual teeth. CRTs require certain methodological procedures during sample size calculation, randomization, data analysis, and reporting, which are often ignored in dental research publications. In general, due to similarity of the observations within clusters, each individual within a cluster provides less information compared with an individual in a non-clustered trial. Therefore, clustered designs require larger sample sizes compared with non-clustered randomized designs, and special statistical analyses that account for the fact that observations within clusters are correlated. It is the purpose of this article to highlight with relevant examples the important methodological characteristics of cluster randomized designs as they may be applied in orthodontics and to explain the problems that may arise if clustered observations are erroneously treated and analysed as independent (non-clustered).
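
    The sample size penalty described above is commonly quantified with the design effect, 1 + (m - 1) × ICC, where m is the (equal) cluster size and ICC is the intracluster correlation coefficient. The abstract itself gives no formula, so the sketch below applies this standard textbook expression as an assumption; the function names and the example numbers (5 teeth per patient, ICC of 0.25) are hypothetical.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return 1.0 + (cluster_size - 1.0) * icc


def clustered_sample_size(n_independent: int, cluster_size: float, icc: float) -> int:
    """Inflate a sample size that was computed assuming independent observations."""
    inflated = n_independent * design_effect(cluster_size, icc)
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(inflated, 9))


def effective_sample_size(n_total: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations that n_total clustered ones are worth."""
    return n_total / design_effect(cluster_size, icc)
```

    With 5 teeth per patient and an ICC of 0.25 the design effect is 2.0, so a trial that would need 100 independent teeth needs 200 clustered ones, and 200 clustered teeth carry the information of 100 independent teeth. With one observation per cluster the design effect is 1 and no inflation is needed.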

  1. Ultrafast dynamics in atomic clusters: Analysis and control

    PubMed Central

    Bonačić-Koutecký, Vlasta; Mitrić, Roland; Werner, Ute; Wöste, Ludger; Berry, R. Stephen

    2006-01-01

    We present a study of dynamics and ultrafast observables in the frame of pump–probe negative-to-neutral-to-positive ion (NeNePo) spectroscopy illustrated by the examples of bimetallic trimers Ag2Au−/Ag2Au/Ag2Au+ and silver oxides Ag3O2−/Ag3O2/Ag3O2+ in the context of cluster reactivity. First principle multistate adiabatic dynamics allows us to determine time scales of different ultrafast processes and conditions under which these processes can be experimentally observed. Furthermore, we present a strategy for optimal pump–dump control in complex systems based on the ab initio Wigner distribution approach and apply it to tailor laser fields for selective control of the isomerization process in