Sample records for class cluster analysis

  1. A Note on Cluster Effects in Latent Class Analysis

    ERIC Educational Resources Information Center

    Kaplan, David; Keller, Bryan

    2011-01-01

    This article examines the effects of clustering in latent class analysis. A comprehensive simulation study is conducted, which begins by specifying a true multilevel latent class model with varying within- and between-cluster sample sizes, varying latent class proportions, and varying intraclass correlations. These models are then estimated under…

  2. Investigating Subtypes of Child Development: A Comparison of Cluster Analysis and Latent Class Cluster Analysis in Typology Creation

    ERIC Educational Resources Information Center

    DiStefano, Christine; Kamphaus, R. W.

    2006-01-01

    Two classification methods, latent class cluster analysis and cluster analysis, are used to identify groups of child behavioral adjustment underlying a sample of elementary school children aged 6 to 11 years. Behavioral rating information across 14 subscales was obtained from classroom teachers and used as input for analyses. Both the procedures…

  3. NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways.

    PubMed

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Sand, Olivier; Janky, Rekin's; Vanderstocken, Gilles; Deville, Yves; van Helden, Jacques

    2008-07-01

    The network analysis tools (NeAT) (http://rsat.ulb.ac.be/neat/) provide user-friendly web access to a collection of modular tools for the analysis of networks (graphs) and clusters (e.g. microarray clusters, functional classes, etc.). A first set of tools supports basic operations on graphs (comparison between two graphs, neighborhood of a set of input nodes, path finding and graph randomization). Another set of programs makes the connection between networks and clusters (graph-based clustering, clique discovery and mapping of clusters onto a network). The toolbox also includes programs for detecting significant intersections between clusters/classes (e.g. clusters of co-expression versus functional classes of genes). NeAT is designed to cope with large datasets and provides a flexible toolbox for analyzing biological networks stored in various databases (protein interactions, regulation and metabolism) or obtained from high-throughput experiments (two-hybrid, mass-spectrometry and microarrays). The web interface interconnects the programs in predefined analysis flows, enabling users to address a series of questions about networks of interest. Each tool can also be used separately by entering custom data for a specific analysis. NeAT can also be used as web services (SOAP/WSDL interface), in order to design programmatic workflows and integrate them with other available resources.

  4. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    PubMed Central

    2010-01-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions: The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data. PMID:20937082
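
    The record above scores each clustering against known classes with the adjusted Rand index. As a rough illustration only (not the authors' code), the index can be computed from the contingency table of the two labelings; the function name and the toy labels below are assumptions made for this sketch.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same objects."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same objects")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as perfect agreement

    # Contingency table cells and its row/column totals.
    cell_counts = Counter(zip(labels_true, labels_pred))
    row_counts = Counter(labels_true)
    col_counts = Counter(labels_pred)

    # Numbers of object pairs that are together in both / in each partition.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2          # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical example: known classes versus cluster assignments.
known_classes = [0, 0, 1, 2]
cluster_labels = [0, 0, 1, 1]
score = adjusted_rand_index(known_classes, cluster_labels)
```

    A value of 1 means the clustering reproduces the known classes exactly (up to relabeling), while values near 0 indicate agreement no better than chance.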

  5. Cluster analysis of novel isometric strength measures produces a valid and evidence-based classification structure for wheelchair track racing.

    PubMed

    Connick, Mark J; Beckman, Emma; Vanlandewijck, Yves; Malone, Laurie A; Blomqvist, Sven; Tweedy, Sean M

    2017-11-25

    The Para athletics wheelchair-racing classification system employs best practice to ensure that classes comprise athletes whose impairments cause a comparable degree of activity limitation. However, decision-making is largely subjective and scientific evidence which reduces this subjectivity is required. This study evaluated whether isometric strength tests were valid for the purposes of classifying wheelchair racers and whether cluster analysis of the strength measures produced a valid classification structure. Thirty-two international level, male wheelchair racers from classes T51-54 completed six isometric strength tests evaluating elbow extensors, shoulder flexors, trunk flexors and forearm pronators and two wheelchair performance tests: Top-Speed (0-15 m) and Top-Speed (absolute). Strength tests significantly correlated with wheelchair performance were included in a cluster analysis and the validity of the resulting clusters was assessed. All six strength tests correlated with performance (r=0.54-0.88). Cluster analysis yielded four clusters with reasonable overall structure (mean silhouette coefficient=0.58) and large intercluster strength differences. Six athletes (19%) were allocated to clusters that did not align with their current class. While the mean wheelchair racing performance of the resulting clusters was unequivocally hierarchical, the mean performance of current classes was not, with no difference between current classes T53 and T54. Cluster analysis of isometric strength tests produced classes comprising athletes who experienced a similar degree of activity limitation. The strength tests reported can provide the basis for a new, more transparent, less subjective wheelchair racing classification system, pending replication of these findings in a larger, representative sample. This paper also provides guidance for development of evidence-based systems in other Para sports.
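
    The record above reports cluster quality as a mean silhouette coefficient. A minimal sketch of that statistic follows, assuming Euclidean distance and made-up one-dimensional scores; the abstract does not state which distance or software the authors used, so every name here is illustrative.

```python
from math import dist


def mean_silhouette(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical strength scores (one measure per athlete) and two labelings.
scores_1d = [(0.0,), (1.0,), (10.0,), (11.0,)]
well_separated = mean_silhouette(scores_1d, [0, 0, 1, 1])
poorly_separated = mean_silhouette(scores_1d, [0, 1, 0, 1])
```

    Values close to 1 indicate compact, well-separated clusters; values near 0 or below suggest overlapping or misassigned objects.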

  6. Clustering Educational Digital Library Usage Data: A Comparison of Latent Class Analysis and K-Means Algorithms

    ERIC Educational Resources Information Center

    Xu, Beijie; Recker, Mimi; Qi, Xiaojun; Flann, Nicholas; Ye, Lei

    2013-01-01

    This article examines clustering as an educational data mining method. In particular, two clustering algorithms, the widely used K-means and the model-based Latent Class Analysis, are compared, using usage data from an educational digital library service, the Instructional Architect (IA.usu.edu). Using a multi-faceted approach and multiple data…

  7. Evaluating Mixture Modeling for Clustering: Recommendations and Cautions

    ERIC Educational Resources Information Center

    Steinley, Douglas; Brusco, Michael J.

    2011-01-01

    This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magidson,…

  8. Pathological and non-pathological variants of restrictive eating behaviors in middle childhood: A latent class analysis.

    PubMed

    Schmidt, Ricarda; Vogel, Mandy; Hiemisch, Andreas; Kiess, Wieland; Hilbert, Anja

    2018-08-01

    Although restrictive eating behaviors are very common during early childhood, their precise nature and clinical correlates remain unclear. Especially, there is little evidence on restrictive eating behaviors in older children and their associations with children's shape concern. The present population-based study sought to delineate subgroups of restrictive eating patterns in N = 799 7-14 year old children. Using Latent Class Analysis, children were classified based on six restrictive eating behaviors (for example, picky eating, food neophobia, and eating-related anxiety) and shape concern, separately in three age groups. For cluster validation, sociodemographic and objective anthropometric data, parental feeding practices, and general and eating disorder psychopathology were used. The results showed a 3-cluster solution across all age groups: an asymptomatic class (Cluster 1), a class with restrictive eating behaviors without shape concern (Cluster 2), and a class showing restrictive eating behaviors with prominent shape concern (Cluster 3). The clusters differed in all variables used for validation. Particularly, the proportion of children with symptoms of avoidant/restrictive food intake disorder was greater in Cluster 2 than Clusters 1 and 3. The study underlined the importance of considering shape concern to distinguish between different phenotypes of children's restrictive eating patterns. Longitudinal data are needed to evaluate the clusters' predictive effects on children's growth and development of clinical eating disorders.

  9. A comparison of latent class, K-means, and K-median methods for clustering dichotomous data.

    PubMed

    Brusco, Michael J; Shireman, Emilie; Steinley, Douglas

    2017-09-01

    The problem of partitioning a collection of objects based on their measurements on a set of dichotomous variables is a well-established problem in psychological research, with applications including clinical diagnosis, educational testing, cognitive categorization, and choice analysis. Latent class analysis and K-means clustering are popular methods for partitioning objects based on dichotomous measures in the psychological literature. The K-median clustering method has recently been touted as a potentially useful tool for psychological data and might be preferable to its close neighbor, K-means, when the variable measures are dichotomous. We conducted simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data. Although all 3 methods proved capable of recovering cluster structure, K-median clustering yielded the best average performance, followed closely by latent class analysis. We also report results for the 3 methods within the context of an application to transitive reasoning data, in which it was found that the 3 approaches can exhibit profound differences when applied to real data.
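
    One reason K-median is considered a natural fit for dichotomous measures is that its cluster centers stay binary, whereas K-means centers become proportions. The sketch below is a plain Lloyd-style loop written for illustration, with hypothetical data and fixed starting centers; it is not the simulation design or the algorithm implementation used in the record above.

```python
from statistics import mean, median_low


def center(rows, kind):
    """Per-variable median (K-median) or mean (K-means) of the given rows."""
    columns = list(zip(*rows))
    if kind == "median":
        return tuple(median_low(column) for column in columns)
    return tuple(mean(column) for column in columns)


def distance(row, center_point, kind):
    """City-block distance for K-median, squared Euclidean for K-means."""
    if kind == "median":
        return sum(abs(x - c) for x, c in zip(row, center_point))
    return sum((x - c) ** 2 for x, c in zip(row, center_point))


def cluster(rows, initial_centers, kind, max_iter=100):
    """Alternate assignment and center updates until labels stop changing."""
    centers = list(initial_centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: distance(row, centers[k], kind))
            for row in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [row for row, label in zip(rows, labels) if label == k]
            if members:  # keep the old center if a cluster empties out
                centers[k] = center(members, kind)
    return labels, centers


# Hypothetical dichotomous responses (rows = objects, columns = variables).
responses = [
    (1, 1, 1, 0), (1, 1, 0, 0), (1, 1, 1, 0),
    (0, 0, 0, 1), (0, 0, 1, 1), (0, 0, 0, 1),
]
start = [responses[0], responses[3]]
median_labels, median_centers = cluster(responses, start, "median")
mean_labels, mean_centers = cluster(responses, start, "mean")
```

    On this toy data both variants recover the same two groups, but only the K-median centers remain valid 0/1 response patterns.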

  10. Symptom Cluster Research With Biomarkers and Genetics Using Latent Class Analysis.

    PubMed

    Conley, Samantha

    2017-12-01

    The purpose of this article is to provide an overview of latent class analysis (LCA) and examples from symptom cluster research that includes biomarkers and genetics. A review of LCA with genetics and biomarkers was conducted using Medline, Embase, PubMed, and Google Scholar. LCA is a robust latent variable model used to cluster categorical data and allows for the determination of empirically determined symptom clusters. Researchers should consider using LCA to link empirically determined symptom clusters to biomarkers and genetics to better understand the underlying etiology of symptom clusters. The full potential of LCA in symptom cluster research has not yet been realized because it has been used in limited populations, and researchers have explored limited biologic pathways.

  11. The X-ray luminosity functions of Abell clusters from the Einstein Cluster Survey

    NASA Technical Reports Server (NTRS)

    Burg, R.; Giacconi, R.; Forman, W.; Jones, C.

    1994-01-01

    We have derived the present epoch X-ray luminosity function of northern Abell clusters using luminosities from the Einstein Cluster Survey. The sample is sufficiently large that we can determine the luminosity function for each richness class separately with sufficient precision to study and compare the different luminosity functions. We find that, within each richness class, the range of X-ray luminosity is quite large and spans nearly a factor of 25. Characterizing the luminosity function for each richness class with a Schechter function, we find that the characteristic X-ray luminosity, $L_*$, scales with richness class as $L_* \propto N_*^{\gamma}$, where $N_*$ is the corrected, mean number of galaxies in a richness class, and the best-fitting exponent is $\gamma = 1.3 \pm 0.4$. Finally, our analysis suggests that there is a lower limit to the X-ray luminosity of clusters which is determined by the integrated emission of the cluster member galaxies, and this also scales with richness class. The present sample forms a baseline for testing cosmological evolution of Abell-like clusters when an appropriate high-redshift cluster sample becomes available.
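
    For orientation, the standard Schechter form and the quoted richness scaling can be written out as below. The normalization $\phi_*$ and faint-end slope $\alpha$ are generic symbols of the Schechter function and are not values given in the abstract; the doubling example is simple arithmetic on the quoted exponent.

```latex
% Schechter luminosity function (standard form; \phi_* and \alpha are not quoted in the abstract)
\phi(L)\,dL = \phi_* \left(\frac{L}{L_*}\right)^{\alpha} \exp\!\left(-\frac{L}{L_*}\right) \frac{dL}{L_*}

% Richness scaling reported in the abstract
L_* \propto N_*^{\gamma}, \qquad \gamma = 1.3 \pm 0.4

% Worked example: doubling the mean galaxy count N_*
\frac{L_*(2N_*)}{L_*(N_*)} = 2^{\gamma} \approx 2^{1.3} \approx 2.5
\quad \text{(between } 2^{0.9} \approx 1.9 \text{ and } 2^{1.7} \approx 3.2 \text{ within the quoted uncertainty)}
```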

  12. Exploring the Relationship between Autism Spectrum Disorder and Epilepsy Using Latent Class Cluster Analysis

    ERIC Educational Resources Information Center

    Cuccaro, Michael L.; Tuchman, Roberto F.; Hamilton, Kara L.; Wright, Harry H.; Abramson, Ruth K.; Haines, Jonathan L.; Gilbert, John R.; Pericak-Vance, Margaret

    2012-01-01

    Epilepsy co-occurs frequently in autism spectrum disorders (ASD). Understanding this co-occurrence requires a better understanding of the ASD-epilepsy phenotype (or phenotypes). To address this, we conducted latent class cluster analysis (LCCA) on an ASD dataset (N = 577) which included 64 individuals with epilepsy. We identified a 5-cluster…

  13. Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm

    PubMed Central

    Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong

    2016-01-01

    In single-particle cryo-electron microscopy (cryo-EM), the K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, the traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development of clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alternative to the traditional K-means algorithm in single-particle cryo-EM analysis. PMID:27959895
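
    The abstract does not spell out the adaptive constraint term, so the following is only a stand-in that conveys the general idea of discouraging very unequal class sizes: the assignment cost of a class grows with the number of members it already holds. The penalty form, the function name, and the toy data are all assumptions of this sketch and should not be read as the authors' objective function.

```python
def penalized_assign(points, centers, penalty):
    """Assign 1-D points one by one to the class with the lowest cost.

    cost = squared distance to the class center + penalty * current class size
    (an illustrative size penalty, not the constraint term from the paper).
    """
    sizes = [0] * len(centers)
    labels = []
    for x in points:
        costs = [(x - c) ** 2 + penalty * sizes[k] for k, c in enumerate(centers)]
        best = costs.index(min(costs))
        labels.append(best)
        sizes[best] += 1
    return labels, sizes


# Hypothetical 1-D features and two fixed class centers.
features = [0.0, 0.1, 0.2, 0.3, 0.45, 1.0]
class_centers = [0.0, 1.0]

plain_labels, plain_sizes = penalized_assign(features, class_centers, penalty=0.0)
balanced_labels, balanced_sizes = penalized_assign(features, class_centers, penalty=0.05)
```

    With the penalty set to zero this reduces to the ordinary nearest-center assignment step of K-means; a positive penalty moves borderline points toward the smaller class.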

  14. Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm.

    PubMed

    Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong

    2016-01-01

    In single-particle cryo-electron microscopy (cryo-EM), the K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, the traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development of clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alternative to the traditional K-means algorithm in single-particle cryo-EM analysis.

  15. An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data.

    PubMed

    Hsu, Arthur L; Tang, Sen-Lin; Halgamuge, Saman K

    2003-11-01

    Current Self-Organizing Maps (SOMs) approaches to gene expression pattern clustering require the user to predefine the number of clusters likely to be expected. Hierarchical clustering methods used in this area do not provide unique partitioning of data. We describe an unsupervised dynamic hierarchical self-organizing approach, which suggests an appropriate number of clusters, to perform class discovery and marker gene identification in microarray data. In the process of class discovery, the proposed algorithm identifies corresponding sets of predictor genes that best distinguish one class from other classes. The approach integrates merits of hierarchical clustering with robustness against noise known from self-organizing approaches. The proposed algorithm applied to DNA microarray data sets of two types of cancers has demonstrated its ability to produce the most suitable number of clusters. Further, the corresponding marker genes identified through the unsupervised algorithm also have a strong biological relationship to the specific cancer class. The algorithm tested on leukemia microarray data, which contains three leukemia types, was able to determine three major and one minor cluster. Prediction models built for the four clusters indicate that the prediction strength for the smaller cluster is generally low, therefore labelled as uncertain cluster. Further analysis shows that the uncertain cluster can be subdivided further, and the subdivisions are related to two of the original clusters. Another test performed using colon cancer microarray data has automatically derived two clusters, which is consistent with the number of classes in data (cancerous and normal). JAVA software of dynamic SOM tree algorithm is available upon request for academic use. A comparison of rectangular and hexagonal topologies for GSOM is available from http://www.mame.mu.oz.au/mechatronics/journalinfo/Hsu2003supp.pdf

  16. Benefits of off-campus education for students in the health sciences: a text-mining analysis.

    PubMed

    Nakagawa, Kazumasa; Asakawa, Yasuyoshi; Yamada, Keiko; Ushikubo, Mitsuko; Yoshida, Tohru; Yamaguchi, Haruyasu

    2012-08-28

    In Japan, few community-based approaches have been adopted in health-care professional education, and the appropriate content for such approaches has not been clarified. In establishing community-based education for health-care professionals, clarification of its learning effects is required. A community-based educational program was started in 2009 in the health sciences course at Gunma University, and one of the main elements in this program is conducting classes outside school. The purpose of this study was to investigate using text-analysis methods how the off-campus program affects students. In all, 116 self-assessment worksheets submitted by students after participating in the off-campus classes were decomposed into words. The extracted words were carefully selected from the perspective of contained meaning or content. With the selected terms, the relations to each word were analyzed by means of cluster analysis. Cluster analysis was used to select and divide 32 extracted words into four clusters: cluster 1-"actually/direct," "learn/watch/hear," "how," "experience/participation," "local residents," "atmosphere in community-based clinical care settings," "favorable," "communication/conversation," and "study"; cluster 2-"work of staff member" and "role"; cluster 3-"interaction/communication," "understanding," "feel," "significant/important/necessity," and "think"; and cluster 4-"community," "confusing," "enjoyable," "proactive," "knowledge," "academic knowledge," and "class." The students who participated in the program achieved different types of learning through the off-campus classes. They also had a positive impression of the community-based experience and interaction with the local residents, which is considered a favorable outcome. Off-campus programs could be a useful educational approach for students in health sciences.

  17. Gene features selection for three-class disease classification via multiple orthogonal partial least square discriminant analysis and S-plot using microarray data.

    PubMed

    Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu

    2013-01-01

    DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.

  18. Dynamic transcriptomic analysis in hircine longissimus dorsi muscle from fetal to neonatal development stages.

    PubMed

    Zhan, Siyuan; Zhao, Wei; Song, Tianzeng; Dong, Yao; Guo, Jiazhong; Cao, Jiaxue; Zhong, Tao; Wang, Linjie; Li, Li; Zhang, Hongping

    2018-01-01

    Muscle growth and development from fetal to neonatal stages consist of a series of delicately regulated and orchestrated changes in expression of genes. In this study, we performed whole transcriptome profiling based on RNA-Seq of caprine longissimus dorsi muscle tissue obtained from prenatal stages (days 45, 60, and 105 of gestation) and neonatal stage (the 3-day-old newborn) to identify genes that are differentially expressed and investigate their temporal expression profiles. A total of 3276 differentially expressed genes (DEGs) were identified (Q value < 0.01). Time-series expression profile clustering analysis indicated that DEGs were significantly clustered into eight clusters which can be divided into two classes (Q value < 0.05), class I profiles with downregulated patterns and class II profiles with upregulated patterns. Based on cluster analysis, GO enrichment analysis found 75, 25, and 8 terms to be significantly enriched in the biological process (BP), cellular component (CC), and molecular function (MF) categories in class I profiles, and 35, 21, and 8 terms to be significantly enriched in BP, CC, and MF in class II profiles. KEGG pathway analysis revealed that DEGs from class I profiles were significantly enriched in 22 pathways, and the most enriched pathway was the Rap1 signaling pathway. DEGs from class II profiles were significantly enriched in 17 pathways, and the most enriched pathway was the AMPK signaling pathway. Finally, six selected DEGs from our sequencing results were confirmed by qPCR. Our study provides a comprehensive understanding of the molecular mechanisms during goat skeletal muscle development from fetal to neonatal stages and valuable information for future studies of muscle development in goats.

  19. A Survey of Popular R Packages for Cluster Analysis

    ERIC Educational Resources Information Center

    Flynt, Abby; Dean, Nema

    2016-01-01

    Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring data sets. This article reviews the following popular libraries/commands in the R software language for applying different types of cluster analysis: from the stats library, the kmeans, and hclust functions; the mclust library; the poLCA…

  20. Clustering of lifestyle risk behaviours among residents of forty deprived neighbourhoods in London: lessons for targeting public health interventions.

    PubMed

    Watts, P; Buck, D; Netuveli, G; Renton, A

    2016-06-01

    Clustering of lifestyle risk behaviours is very important in predicting premature mortality. Understanding the extent to which risk behaviours are clustered in deprived communities is vital to most effectively target public health interventions. We examined co-occurrence and associations between risk behaviours (smoking, alcohol consumption, poor diet, low physical activity and high sedentary time) reported by adults living in deprived London neighbourhoods. Associations between sociodemographic characteristics and clustered risk behaviours were examined. Latent class analysis was used to identify underlying clustering of behaviours. Over 90% of respondents reported at least one risk behaviour. Reporting specific risk behaviours predicted reporting of further risk behaviours. Latent class analyses revealed four underlying classes. Membership of a maximal risk behaviour class was more likely for young, white males who were unable to work. Compared with recent national level analysis, there was a weaker relationship between education and clustering of behaviours and a very high prevalence of clustering of risk behaviours in those unable to work. Young, white men who report difficulty managing on income were at high risk of reporting multiple risk behaviours. These groups may be an important target for interventions to reduce premature mortality caused by multiple risk behaviours.

  1. Clustering of Multiple Risk Behaviors Among a Sample of 18-Year-Old Australians and Associations With Mental Health Outcomes: A Latent Class Analysis.

    PubMed

    Champion, Katrina E; Mather, Marius; Spring, Bonnie; Kay-Lambkin, Frances; Teesson, Maree; Newton, Nicola C

    2018-01-01

    Risk behaviors commonly co-occur, typically emerge in adolescence, and become entrenched by adulthood. This study investigated the clustering of established (physical inactivity, diet, smoking, and alcohol use) and emerging (sedentary behavior and sleep) chronic disease risk factors among young Australian adults, and examined how clusters relate to mental health. The sample was derived from the long-term follow-up of a cohort of Australians. Participants were initially recruited at school as part of a cluster randomized controlled trial. A total of 853 participants (mean age = 18.88 years, SD = 0.42) completed an online self-report survey as part of the 5-year follow-up for the RCT. The survey assessed six behaviors (binge drinking and smoking in the past 6 months, moderate-to-vigorous physical activity/week, sitting time/day, fruit and vegetable intake/day, and sleep duration/night). Each behavior was represented by a dichotomous variable reflecting adherence to national guidelines. Exploratory analyses were conducted. Clusters were identified using latent class analysis. Three classes emerged: "moderate risk" (moderately likely to binge drink and not eat enough fruit, high probability of insufficient vegetable intake; Class 1, 52%); "inactive, non-smokers" (high probabilities of not meeting guidelines for physical activity, sitting time and fruit/vegetable consumption, very low probability of smoking; Class 2, 24%), and "smokers and binge drinkers" (high rates of smoking and binge drinking, poor fruit/vegetable intake; Class 3, 24%). There were significant differences between the classes in terms of psychological distress (p = 0.003), depression (p < 0.001), and anxiety (p = 0.003). Specifically, Class 3 ("smokers and binge drinkers") showed higher levels of distress, depression, and anxiety than Class 1 ("moderate risk"), while Class 2 ("inactive, non-smokers") had greater depression than the "moderate risk" group. Results indicate that risk behaviors are prevalent and clustered in 18-year-old Australians. Mental health symptoms were significantly greater among the two classes that were characterized by high probabilities of engaging in multiple risk behaviors (Classes 2 and 3). An examination of the clustering of lifestyle risk behaviors is important to guide the development of preventive interventions. Our findings reinforce the importance of delivering multiple health interventions to reduce disease risk and improve mental well-being.
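
    Latent class analysis on dichotomous indicators, as used in the record above, is commonly fitted with the EM algorithm. The sketch below shows that idea on a tiny made-up data set with two classes and three indicators; the starting values, the data, and the fixed iteration count are assumptions for illustration, and real analyses would rely on dedicated software with multiple random starts and model-fit comparison.

```python
from math import prod


def lca_em(data, class_probs, item_probs, n_iter=200):
    """EM for a latent class model with binary indicators.

    data        : list of 0/1 tuples, one per respondent
    class_probs : starting class proportions
    item_probs  : item_probs[c][j] = starting P(indicator j = 1 | class c)
    """
    n_classes = len(class_probs)
    n_items = len(data[0])
    for _ in range(n_iter):
        # E-step: posterior class membership probabilities per respondent.
        posteriors = []
        for row in data:
            joint = [
                class_probs[c]
                * prod(
                    item_probs[c][j] if x else 1.0 - item_probs[c][j]
                    for j, x in enumerate(row)
                )
                for c in range(n_classes)
            ]
            total = sum(joint)
            posteriors.append([value / total for value in joint])

        # M-step: update class proportions and conditional item probabilities.
        weights = [sum(p[c] for p in posteriors) for c in range(n_classes)]
        class_probs = [w / len(data) for w in weights]
        item_probs = [
            [
                sum(p[c] * row[j] for p, row in zip(posteriors, data)) / weights[c]
                for j in range(n_items)
            ]
            for c in range(n_classes)
        ]
    return class_probs, item_probs, posteriors


# Hypothetical data: 1 = does not meet the guideline for that behavior.
behaviors = [(1, 1, 1)] * 4 + [(1, 1, 0)] + [(0, 0, 0)] * 4 + [(0, 0, 1)]
proportions, conditional_probs, membership = lca_em(
    behaviors,
    class_probs=[0.5, 0.5],
    item_probs=[[0.7, 0.7, 0.7], [0.3, 0.3, 0.3]],
)
```

    Each respondent can then be assigned to the class with the highest posterior probability in `membership`, which is how class sizes such as those quoted in the abstract are typically summarized.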

  2. The X-CLASS-redMaPPer galaxy cluster comparison. I. Identification procedures

    NASA Astrophysics Data System (ADS)

    Sadibekova, T.; Pierre, M.; Clerc, N.; Faccioli, L.; Gastaud, R.; Le Fevre, J.-P.; Rozo, E.; Rykoff, E.

    2014-11-01

    Context: This paper is the first in a series undertaking a comprehensive correlation analysis between optically selected and X-ray-selected cluster catalogues. The rationale of the project is to develop a holistic picture of galaxy clusters utilising optical and X-ray-cluster-selected catalogues with well-understood selection functions. Aims: Unlike most of the X-ray/optical cluster correlations to date, the present paper focuses on the non-matching objects in either waveband. We investigate how the differences observed between the optical and X-ray catalogues may stem from (1) a shortcoming of the detection algorithms; (2) dispersion in the X-ray/optical scaling relations; or (3) substantial intrinsic differences between the cluster populations probed in the X-ray and optical bands. The aim is to inventory and elucidate these effects in order to account for selection biases in the further determination of X-ray/optical cluster scaling relations. Methods: We correlated the X-CLASS serendipitous cluster catalogue extracted from the XMM archive with the redMaPPer optical cluster catalogue derived from the Sloan Digital Sky Survey (DR8). We performed a detailed and, in large part, interactive analysis of the matching output from the correlation. The overlap between the two catalogues has been accurately determined and possible cluster positional errors were manually recovered. The final samples comprise 270 and 355 redMaPPer and X-CLASS clusters, respectively. X-ray cluster matching rates were analysed as a function of optical richness. In the second step, the redMaPPer clusters were correlated with the entire X-ray catalogue, containing point and uncharacterised sources (down to a few 10^-15 erg s^-1 cm^-2 in the [0.5-2] keV band). A stacking analysis was performed for the remaining undetected optical clusters. Results: We find that all rich (λ ≥ 80) clusters are detected in X-rays out to z = 0.6. Below this redshift, the richness threshold for X-ray detection steadily decreases with redshift. Likewise, all X-ray bright clusters are detected by redMaPPer. After correcting for obvious pipeline shortcomings (about 10% of the cases both in optical and X-ray), ~50% of the redMaPPer clusters (down to a richness of 20) are found to coincide with an X-CLASS cluster; when considering X-ray sources of any type, this fraction increases to ~80%; for the remaining objects, the stacking analysis finds a weak signal within 0.5 Mpc around the cluster optical centres. The fraction of clusters totally dominated by AGN-type emission appears to be a few percent. Conversely, ~40% of the X-CLASS clusters are identified with a redMaPPer cluster (down to a richness of 20) - part of the non-matches being due to the X-CLASS sample extending further out than redMaPPer (z < 1.5 vs. z < 0.6), but extending the correlation down to a richness of 5 raises the matching rate to ~65%. Conclusions: This state-of-the-art study involving two well-validated cluster catalogues has shown itself to be complex, and it points to a number of issues inherent to blind cross-matching, owing both to pipeline shortcomings and peculiar cluster properties. These can only be accounted for after a manual check. The combined X-ray and optical scaling relations will be presented in a subsequent article.
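
    The blind positional cross-matching discussed above can be pictured with a small sketch. The abstract does not give the matching radius or the exact procedure, so the function names, the fixed angular radius, and the nearest-neighbour rule below are assumptions chosen for illustration; the manual recovery of positional errors and the richness-dependent analysis are not modelled.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(cat_a, cat_b, radius_deg):
    """For each (ra, dec) in cat_a, return the index of the nearest cat_b
    entry within radius_deg, or None when no counterpart is that close."""
    matches = []
    for ra, dec in cat_a:
        best_idx, best_sep = None, radius_deg
        for j, (rb, db) in enumerate(cat_b):
            sep = angular_separation_deg(ra, dec, rb, db)
            if sep <= best_sep:
                best_idx, best_sep = j, sep
        matches.append(best_idx)
    return matches
```

    Entries returned as `None` correspond to the non-matching objects that the paper concentrates on; in this toy version they are simply flagged rather than inspected.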

  3. On Identifying Clusters Within the C-type Asteroids of the Sloan Digital Sky Survey

    NASA Astrophysics Data System (ADS)

    Poole, Renae; Ziffer, J.; Harvell, T.

    2012-10-01

    We applied AutoClass, a data mining technique based upon Bayesian Classification, to C-group asteroid colors in the Sloan Digital Sky Survey (SDSS). Previous taxonomic studies relied mostly on Principal Component Analysis (PCA) to differentiate asteroids within the C-group (e.g. B, G, F, Ch, Cg and Cb). AutoClass's advantage is that it calculates the most probable classification for us, removing the human factor from this part of the analysis. In our results, AutoClass divided the C-groups into two large classes and six smaller classes. The two large classes (n=4974 and 2033, respectively) display distinct regions with some overlap in color-vs-color plots. Each cluster's average spectrum is compared to 'typical' spectra of the C-group subtypes as defined by Tholen (1989) and each cluster's members are evaluated for consistency with previous taxonomies. Of the 117 asteroids classified as B-type in previous taxonomies, only 12 were found with SDSS colors that matched our criteria of having less than 0.1 magnitude error in u and 0.05 magnitude error in g, r, i, and z colors. Although this is a relatively small group, 11 of the 12 B-types were placed by AutoClass in the same cluster. By determining the C-group sub-classifications in the large SDSS database, this research furthers our understanding of the stratigraphy and composition of the main-belt.

  4. Novel approach to characterising individuals with low back-related leg pain: cluster identification with latent class analysis and 12-month follow-up.

    PubMed

    Stynes, Siobhán; Konstantinou, Kika; Ogollah, Reuben; Hay, Elaine M; Dunn, Kate M

    2018-04-01

    Traditionally, low back-related leg pain (LBLP) is diagnosed clinically as referred leg pain or sciatica (nerve root involvement). However, within the spectrum of LBLP, we hypothesised that there may be other unrecognised patient subgroups. This study aimed to identify clusters of patients with LBLP using latent class analysis and describe their clinical course. The study population was 609 LBLP primary care consulters. Variables from clinical assessment were included in the latent class analysis. Characteristics of the statistically identified clusters were compared, and their clinical course over 1 year was described. A 5-cluster solution was optimal. Cluster 1 (n = 104) had mild leg pain severity and was considered to represent a referred leg pain group with no clinical signs suggesting nerve root involvement (sciatica). Cluster 2 (n = 122), cluster 3 (n = 188), and cluster 4 (n = 69) had mild, moderate, and severe pain and disability, respectively, and response to clinical assessment items suggested categories of mild, moderate, and severe sciatica. Cluster 5 (n = 126) had high pain and disability, longer pain duration, and more comorbidities and was difficult to map to a clinical diagnosis. Most improvement for pain and disability was seen in the first 4 months for all clusters. At 12 months, the proportion of patients reporting recovery ranged from 27% for cluster 5 to 45% for cluster 2 (mild sciatica). This is the first study that empirically shows the variability in profile and clinical course of patients with LBLP including sciatica. More homogenous groups were identified, which could be considered in future clinical and research settings.

  5. The clustering-based case-based reasoning for imbalanced business failure prediction: a hybrid approach through integrating unsupervised process with supervised process

    NASA Astrophysics Data System (ADS)

    Li, Hui; Yu, Jun-Ling; Yu, Le-An; Sun, Jie

    2014-05-01

    Case-based reasoning (CBR) is one of the main forecasting methods in business forecasting, which performs well in prediction and can explain its results. In business failure prediction (BFP), the number of failed enterprises is relatively small compared with the number of non-failed ones. However, the loss is huge when an enterprise fails. Therefore, it is necessary to develop methods (trained on imbalanced samples) that forecast well for this small proportion of failed enterprises while also achieving high total accuracy. Commonly used methods constructed on the assumption of balanced samples do not perform well in predicting minority samples on imbalanced samples consisting of the minority/failed enterprises and the majority/non-failed ones. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to enhance the efficiency of retrieving information from both the minority and the majority classes in CBR. In CBCBR, case classes are first generated through hierarchical clustering of the stored experienced cases, and class centres are calculated by integrating the information of the cases in the same clustered class. When predicting the label of a target case, its nearest clustered case class is first retrieved by ranking similarities between the target case and each clustered case class centre. Then, nearest neighbours of the target case in the determined clustered case class are retrieved. Finally, labels of the nearest experienced cases are used in prediction. In the empirical experiment with two imbalanced samples from China, the performance of CBCBR was compared with the classical CBR, a support vector machine, a logistic regression and a multivariate discriminant analysis. The results show that compared with the other four methods, CBCBR performed significantly better in terms of sensitivity for identifying the minority samples while also achieving high total accuracy. The proposed approach makes CBR useful in imbalanced forecasting.
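
    The retrieval steps described in this abstract (cluster the stored cases, compute class centres, pick the nearest clustered class, then vote among the nearest neighbours inside it) can be sketched as follows. The abstract does not specify the linkage rule, the similarity measure, or how labels are combined, so centroid linkage, Euclidean distance, and a simple majority vote are assumptions, as are the names `agglomerate` and `cbcbr_predict`.

```python
import math
from collections import Counter


def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def agglomerate(cases, n_clusters):
    """Hierarchical clustering: repeatedly merge the two clusters whose
    centroids are closest until n_clusters remain (assumed linkage rule)."""
    clusters = [[i] for i in range(len(cases))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([cases[i] for i in clusters[a]])
                cb = centroid([cases[i] for i in clusters[b]])
                d = math.dist(ca, cb)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters


def cbcbr_predict(cases, labels, target, n_clusters=3, k=3):
    """Predict the label of `target` from the k nearest stored cases
    inside the clustered case class whose centre is closest to it."""
    clusters = agglomerate(cases, n_clusters)
    centres = [centroid([cases[i] for i in c]) for c in clusters]
    nearest = min(range(len(clusters)),
                  key=lambda c: math.dist(centres[c], target))
    members = sorted(clusters[nearest],
                     key=lambda i: math.dist(cases[i], target))[:k]
    return Counter(labels[i] for i in members).most_common(1)[0][0]
```

    Because neighbours are drawn only from the selected cluster, a small group of failed enterprises that forms its own cluster is not outvoted by the non-failed majority, which is one plausible reading of why the approach helps sensitivity on the minority class.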

  6. A Cross-Cultural Comparison of Symptom Reporting and Symptom Clusters in Heart Failure.

    PubMed

    Park, Jumin; Johantgen, Mary E

    2017-07-01

    An understanding of symptoms in heart failure (HF) among different cultural groups has become increasingly important. The purpose of this study was to compare symptom reporting and symptom clusters in HF patients between a Western (the United States) and an Eastern Asian sample (China and Taiwan). A secondary analysis of a cross-sectional observational study was conducted. The data were obtained from a matched HF patient sample from the United States and China/Taiwan (N = 240 in each). Eight selected items related to HF symptoms from the Minnesota Living with Heart Failure Questionnaire were analyzed. Compared with the U.S. sample, HF patients from China/Taiwan reported a lower level of symptom distress. Analysis of the two regional groups using a latent class approach did not result in the same number of clusters: the United States (four classes) and China/Taiwan (three classes). The study demonstrated that symptom reporting and identification of symptom clusters might be influenced by cultural factors.

  7. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.

    PubMed

    Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C

    2015-10-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  8. Nearest clusters based partial least squares discriminant analysis for the classification of spectral data.

    PubMed

    Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar

    2018-06-07

    Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. Seismic Data Analysis through Multi-Class Classification.

    NASA Astrophysics Data System (ADS)

    Anderson, P.; Kappedal, R. D.; Magana-Zook, S. A.

    2017-12-01

    In this research, we conducted twenty experiments of varying time and frequency bands on 5000 seismic signals with the intent of finding a method to classify signals as either an explosion or an earthquake in an automated fashion. We used a multi-class approach by clustering of the data through various techniques. Dimensional reduction was examined through the use of wavelet transforms with the use of the coiflet mother wavelet and various coefficients to explore possible computational time vs accuracy dependencies. Three and four classes were generated from the clustering techniques and examined with the three class approach producing the most accurate and realistic results.

  10. Class imbalance in unsupervised change detection - A diagnostic analysis from urban remote sensing

    NASA Astrophysics Data System (ADS)

    Leichtle, Tobias; Geiß, Christian; Lakes, Tobia; Taubenböck, Hannes

    2017-08-01

    Automatic monitoring of changes on the Earth's surface is an intrinsic capability and simultaneously a persistent methodological challenge in remote sensing, especially regarding imagery with very-high spatial resolution (VHR) and complex urban environments. In order to enable a high level of automatization, the change detection problem is solved in an unsupervised way to alleviate efforts associated with collection of properly encoded prior knowledge. In this context, this paper systematically investigates the nature and effects of class distribution and class imbalance in an unsupervised binary change detection application based on VHR imagery over urban areas. For this purpose, a diagnostic framework for sensitivity analysis of a large range of possible degrees of class imbalance is presented, which is of particular importance with respect to unsupervised approaches where the content of images and thus the occurrence and the distribution of classes are generally unknown a priori. Furthermore, this framework can serve as a general technique to evaluate model transferability in any two-class classification problem. The applied change detection approach is based on object-based difference features calculated from VHR imagery and subsequent unsupervised two-class clustering using k-means, genetic k-means and self-organizing map (SOM) clustering. The results from two test sites with different structural characteristics of the built environment demonstrated that classification performance is generally worse in imbalanced class distribution settings while best results were reached in balanced or close to balanced situations. Regarding suitable accuracy measures for evaluating model performance in imbalanced settings, this study revealed that the Kappa statistics show significant response to class distribution while the true skill statistic was widely insensitive to imbalanced classes. 
In general, the genetic k-means clustering algorithm achieved the most robust results with respect to class imbalance while the SOM clustering exhibited a distinct optimization towards a balanced distribution of classes.
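
    The contrast reported here between the Kappa statistic and the true skill statistic (TSS) can be reproduced with simple arithmetic. The sketch below is an illustration and not the paper's diagnostic framework: the function names and the confusion-matrix counts are made up for the example. Both cases hold sensitivity at 0.8 and specificity at 0.9 and change only the share of the "change" class.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = sensitivity + specificity - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1.0


def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa: agreement beyond what the class margins give by chance."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts: same detector quality, different class balance.
balanced = dict(tp=400, fn=100, fp=50, tn=450)    # 50% of objects changed
imbalanced = dict(tp=40, fn=10, fp=95, tn=855)    # 5% of objects changed
```

    For the balanced counts both measures equal 0.7. For the imbalanced counts TSS is still 0.7, while kappa falls to about 0.39, because the chance-agreement term grows when one class dominates the margins.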

  11. The association between mood state and chronobiological characteristics in bipolar I disorder: a naturalistic, variable cluster analysis-based study.

    PubMed

    Gonzalez, Robert; Suppes, Trisha; Zeitzer, Jamie; McClung, Colleen; Tamminga, Carol; Tohen, Mauricio; Forero, Angelica; Dwivedi, Alok; Alvarado, Andres

    2018-02-19

    Multiple types of chronobiological disturbances have been reported in bipolar disorder, including characteristics associated with general activity levels, sleep, and rhythmicity. Previous studies have focused on examining the individual relationships between affective state and chronobiological characteristics. The aim of this study was to conduct a variable cluster analysis in order to ascertain how mood states are associated with chronobiological traits in bipolar I disorder (BDI). We hypothesized that manic symptomatology would be associated with disturbances of rhythm. Variable cluster analysis identified five chronobiological clusters in 105 BDI subjects. Cluster 1, comprising subjective sleep quality, was associated with both mania and depression. Cluster 2, which comprised variables describing the degree of rhythmicity, was associated with mania. Significant associations between mood state and cluster analysis-identified chronobiological variables were noted. Disturbances of mood were associated with subjectively assessed sleep disturbances as opposed to objectively determined, actigraphy-based sleep variables. No associations with general activity variables were noted. Relationships between gender and medication classes in use and cluster analysis-identified chronobiological characteristics were noted. Exploratory analyses noted that medication class had a larger impact on these relationships than the number of psychiatric medications in use. In a BDI sample, variable cluster analysis was able to group related chronobiological variables. The results support our primary hypothesis that mood state, particularly mania, is associated with chronobiological disturbances. Further research is required in order to define these relationships and to determine the directionality of the associations between mood state and chronobiological characteristics.

  12. Application of Classification Methods for Forecasting Mid-Term Power Load Patterns

    NASA Astrophysics Data System (ADS)

    Piao, Minghao; Lee, Heon Gyu; Park, Jin Hyoung; Ryu, Keun Ho

    An automated methodology based on data mining techniques is presented for the prediction of customer load patterns in long-duration load profiles. The proposed approach consists of three stages: (i) data preprocessing: noise and outliers are removed and the continuous attribute-valued features are transformed to discrete values; (ii) cluster analysis: k-means clustering is used to create load pattern classes and the representative load profiles for each class; and (iii) classification: several supervised learning methods are evaluated in order to select a suitable prediction method. According to the proposed methodology, power load measured from an AMR (automatic meter reading) system, as well as customer indexes, were used as inputs for clustering. The output of clustering was the classification of representative load profiles (or classes). In order to evaluate the result of forecasting load patterns, several classification methods were applied to a set of high-voltage customers of the Korea power system, and the class labels derived from clustering, together with other features, were used as input to build the classifiers. Lastly, the results of our experiments are presented.

  13. FACTOR ANALYTIC MODELS OF CLUSTERED MULTIVARIATE DATA WITH INFORMATIVE CENSORING

    EPA Science Inventory

    This paper describes a general class of factor analytic models for the analysis of clustered multivariate data in the presence of informative missingness. We assume that there are distinct sets of cluster-level latent variables related to the primary outcomes and to the censorin...

  14. Semi-Supervised Clustering for High-Dimensional and Sparse Features

    ERIC Educational Resources Information Center

    Yan, Su

    2010-01-01

    Clustering is one of the most common data mining tasks, used frequently for data organization and analysis in various application domains. Traditional machine learning approaches to clustering are fully automated and unsupervised where class labels are unknown a priori. In real application domains, however, some "weak" form of side…

  15. Chaotic map clustering algorithm for EEG analysis

    NASA Astrophysics Data System (ADS)

    Bellotti, R.; De Carlo, F.; Stramaglia, S.

    2004-03-01

    The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals, in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with that obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, the chaotic map clustering gives natural evidence of the pathological class, without any training or supervision, thus providing a new efficient methodology for the recognition of patterns affected by Huntington's disease.

  16. [On measuring of factors influencing the complex need for cultural entertainments of the inhabitants in geriatric nursing homes (3rd information) (author's transl)].

    PubMed

    Kuhlmey, J; Lautsch, E

    1980-01-01

    In our 2nd information on the investigation of the need for cultural entertainments of inhabitants in geriatric nursing homes, we tested the influence of the factors age, sex, kind of work, and duration of stay in the geriatric nursing home singly and successively for each single indicator of this complex need. In this 3rd information, the influence of these four factors on the indicators was investigated under simultaneous consideration of their contradictory dependency. The contradictory dependency of the factors was represented by typification (cluster analysis). As a result of the cluster analysis, classes arose in which similarly disposed inhabitants belong to the same class. The average coinage in these classes was obtained, and differences were analysed by statistical methods (multidimensional analysis of variance and discriminant analysis).

  17. Exploring the application of latent class cluster analysis for investigating pedestrian crash injury severities in Switzerland.

    PubMed

    Sasidharan, Lekshmi; Wu, Kun-Feng; Menendez, Monica

    2015-12-01

    One of the major challenges in traffic safety analyses is the heterogeneous nature of safety data, due to the sundry factors involved in it. This heterogeneity often leads to difficulties in interpreting results and conclusions due to unrevealed relationships. Understanding the underlying relationship between injury severities and influential factors is critical for the selection of appropriate safety countermeasures. A method commonly employed to address systematic heterogeneity is to focus on any subgroup of data based on the research purpose. However, this need not ensure homogeneity in the data. In this paper, latent class cluster analysis is applied to identify homogenous subgroups for a specific crash type, pedestrian crashes. The manuscript employs data from police-reported pedestrian crashes (2009-2012) in Switzerland. The analyses demonstrate that dividing pedestrian severity data into seven clusters helps to reduce the systematic heterogeneity of the data and to understand the hidden relationships between crash severity levels and socio-demographic, environmental, vehicle, temporal, and traffic factors, and the main reason for the crash. The pedestrian crash injury severity models were developed for the whole data and individual clusters, and were compared using the receiver operating characteristic curve, for which results favored clustering. Overall, the study suggests that a latent class clustered regression approach is suitable for reducing heterogeneity and revealing important hidden relationships in traffic safety analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. Making Sense of Cluster Analysis: Revelations from Pakistani Science Classes

    ERIC Educational Resources Information Center

    Pell, Tony; Hargreaves, Linda

    2011-01-01

    Cluster analysis has been applied to quantitative data in educational research over several decades and has been a feature of Maurice Galton's research in primary and secondary classrooms. It has offered potentially useful insights for teaching yet its implications for practice are rarely implemented. It has been subject also to negative…

  19. A Study of Pupil Control Ideology: A Person-Oriented Approach to Data Analysis

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph

    2010-01-01

    Responses of urban school teachers to the Pupil Control Ideology questionnaire were studied using Latent Class Analysis. The results of the analysis suggest that the best fitting model to the data is a two-cluster solution. In particular, the pupil control ideology of the sample delineates into two clusters of teachers, those with humanistic and…

  20. A Class of Manifold Regularized Multiplicative Update Algorithms for Image Clustering.

    PubMed

    Yang, Shangming; Yi, Zhang; He, Xiaofei; Li, Xuelong

    2015-12-01

    Multiplicative update algorithms are important tools for information retrieval, image processing, and pattern recognition. However, when the graph regularization is added to the cost function, different classes of sample data may be mapped to the same subspace, which leads to the increase of data clustering error rate. In this paper, an improved nonnegative matrix factorization (NMF) cost function is introduced. Based on the cost function, a class of novel graph regularized NMF algorithms is developed, which results in a class of extended multiplicative update algorithms with manifold structure regularization. Analysis shows that in the learning, the proposed algorithms can efficiently minimize the rank of the data representation matrix. Theoretical results presented in this paper are confirmed by simulations. For different initializations and data sets, variation curves of cost functions and decomposition data are presented to show the convergence features of the proposed update rules. Basis images, reconstructed images, and clustering results are utilized to present the efficiency of the new algorithms. Last, the clustering accuracies of different algorithms are also investigated, which shows that the proposed algorithms can achieve state-of-the-art performance in applications of image clustering.

  1. Clustering of trauma and associations with single and co-occurring depression and panic attack over twenty years.

    PubMed

    McCutcheon, Vivia V; Heath, Andrew C; Nelson, Elliot C; Bucholz, Kathleen K; Madden, Pamela A F; Martin, Nicholas G

    2010-02-01

    Individuals who experience one type of trauma often experience other types, yet few studies have examined the clustering of trauma. This study examines the clustering of traumatic events and associations of trauma with risk for single and co-occurring major depressive disorder (MDD) and panic attack for 20 years after first trauma. Lifetime histories of MDD, panic attack, and traumatic events were obtained from participants in an Australian twin sample. Latent class analysis was used to derive trauma classes based on each respondent's trauma history. Associations of the resulting classes and of parental alcohol problems and familial effects with risk for a first onset of single and co-occurring MDD and panic attack were examined from the year of first trauma to 20 years later. Traumatic events clustered into three distinct classes characterized by endorsement of little or no trauma, primarily nonassaultive, and primarily assaultive events. Individuals in the assaultive class were characterized by a younger age at first trauma, a greater number of traumatic events, and high rates of parental alcohol problems. Members of the assaultive trauma class had the strongest and most enduring risk for single and co-occurring lifetime MDD and panic attack. Assaultive trauma outweighed associations of familial effects and nonassaultive trauma with risk for 10 years following first trauma.

  2. A meadow site classification for the Sierra Nevada, California

    Treesearch

    Raymond D. Ratliff

    1982-01-01

    This report describes 14 meadow site classes derived through techniques of agglomerative cluster analysis. The class names are: Carex rostrata (beaked sedge), Poa (Kentucky bluegrass), Heleocharis/Heleocharis (ephemeral-lake), Hypericum/Polygonum/ Viola (hillside bog), Trifolium/...

  3. Modem Signature Analysis.

    DTIC Science & Technology

    1982-10-01

    AD-A127 993 MODEM SIGNATURE ANALYSIS (U) PAR TECHNOLOGY CORP NEW HARTFORD NY V EDWARDS ET AL. OCT 82 RADC-TR-82-269 F30602-80-C-0264 UNCLASSIFIED F/G...as an indication of the class clustering and separation between different classes in the modem data base. It is apparent from the projection that the...that as the clusters disperse, the likelihood of a sample crossing the boundary into an adjacent region and causing a symbol decision error increases. As

  4. Use of LANDSAT imagery for wildlife habitat mapping in northeast and east central Alaska

    NASA Technical Reports Server (NTRS)

    Lent, P. C. (Principal Investigator)

    1975-01-01

    The author has identified the following significant results. Two scenes were analyzed by applying an iterative cluster analysis to a 2% random data sample and then using the resulting clusters as a training set basis for maximum likelihood classification. Twenty-six and twenty-seven categorical classes, respectively, resulted from this process. The majority of classes in each case were quite specific vegetation types; each of these types has specific value as moose habitat.

  5. Lipoprotein lipase S447X variant associated with VLDL, LDL and HDL diameter clustering in the MetS

    USDA-ARS's Scientific Manuscript database

    Previous analysis clustered 1,238 individuals from the general population Genetics of Lipid Lowering Drugs Network (GOLDN) study by the size of their fasting very low-density, low-density and high-density lipoproteins (VLDL, LDL, HDL) using latent class analysis. From two of the eight identified gro...

  6. Use of LANDSAT imagery for wildlife habitat mapping in northeast and eastcentral Alaska

    NASA Technical Reports Server (NTRS)

    Lent, P. C. (Principal Investigator)

    1976-01-01

    The author has identified the following significant results. There is strong indication that spatially rare feature classes may be missed in clustering classifications based on 2% random sampling. Therefore, it seems advisable to augment random sampling for cluster analysis with directed sampling of any spatially rare features which are relevant to the analysis.

  7. Clustering of unhealthy behaviors in the aerobics center longitudinal study.

    PubMed

    Héroux, Mariane; Janssen, Ian; Lee, Duck-chul; Sui, Xuemei; Hebert, James R; Blair, Steven N

    2012-04-01

    Clustering of unhealthy behaviors has been reported in previous studies; however the link with all-cause mortality and differences between those with and without chronic disease requires further investigation. To observe the clustering effects of unhealthy diet, fitness, smoking, and excessive alcohol consumption in adults with and without chronic disease and to assess all-cause mortality risk according to the clustering of unhealthy behaviors. Participants were 13,621 adults (aged 20-84) from the Aerobics Center Longitudinal Study. Four health behaviors were observed (diet, fitness, smoking, and drinking). Baseline characteristics of the study population and bivariate relations between pairs of the health behaviors were evaluated separately for those with and without chronic disease using cross-tabulation and a chi-square test. The odds of partaking in unhealthy behaviors were also calculated. Latent class analysis (LCA) was used to assess clustering. Cox regression was used to assess the relationship between the behaviors and mortality. The four health behaviors were related to each other. LCA results suggested that two classes existed. Participants in class 1 had a higher probability of partaking in each of the four unhealthy behaviors than participants in class 2. No differences in health behavior clustering were found between participants with and without chronic disease. Mortality risk increased relative to the number of unhealthy behaviors participants engaged in. Unhealthy behaviors cluster together irrespective of chronic disease status. Such findings suggest that multi-behavioral intervention strategies can be similar in those with and without chronic disease.

  8. Factors influencing the quality of life of haemodialysis patients according to symptom cluster.

    PubMed

    Shim, Hye Yeung; Cho, Mi-Kyoung

    2018-05-01

    To identify the characteristics in each symptom cluster and factors influencing the quality of life of haemodialysis patients in Korea according to cluster. Despite developments in renal replacement therapy, haemodialysis still restricts the activities of daily living due to pain and impairs physical functioning induced by the disease and its complications. Descriptive survey. Two hundred and thirty dialysis patients aged >18 years. They completed self-administered questionnaires of Dialysis Symptom Index and Kidney Disease Quality of Life instrument-Short Form 1.3. To determine the optimal number of clusters, the collected data were analysed using polytomous variable latent class analysis in R software (poLCA) to estimate the latent class models and the latent class regression models for polytomous outcome variables. Differences in characteristics, symptoms and QOL according to the symptom cluster of haemodialysis patients were analysed using the independent t test and chi-square test. The factors influencing the QOL according to symptom cluster were identified using hierarchical multiple regression analysis. Physical and emotional symptoms were significantly more severe, and the QOL was significantly worse in Cluster 1 than in Cluster 2. The factors influencing the QOL were spouse, job, insurance type and physical and emotional symptoms in Cluster 1, with these variables having an explanatory power of 60.9%. Physical and emotional symptoms were the only influencing factors in Cluster 2, and they had an explanatory power of 37.4%. Mitigating the symptoms experienced by haemodialysis patients and improving their QOL require educational and therapeutic symptom management interventions that are tailored according to the characteristics and symptoms in each cluster. 
    The findings of this study are expected to lead to practical guidelines for addressing the symptoms experienced by haemodialysis patients, and they provide basic information for developing nursing interventions to manage these symptoms and improve the QOL of these patients. © 2017 John Wiley & Sons Ltd.

  9. Three-Level Models for Indirect Effects in School- and Class-Randomized Experiments in Education

    ERIC Educational Resources Information Center

    Pituch, Keenan A.; Murphy, Daniel L.; Tate, Richard L.

    2009-01-01

    Due to the clustered nature of field data, multi-level modeling has become commonly used to analyze data arising from educational field experiments. While recent methodological literature has focused on multi-level mediation analysis, relatively little attention has been devoted to mediation analysis when three levels (e.g., student, class,…

  10. Investigating the effects of climate variations on bacillary dysentery incidence in northeast China using ridge regression and hierarchical cluster analysis

    PubMed Central

    Huang, Desheng; Guan, Peng; Guo, Junqiao; Wang, Ping; Zhou, Baosen

    2008-01-01

    Background: The effects of climate variations on bacillary dysentery incidence have gained more recent concern. However, the multi-collinearity among meteorological factors affects the accuracy of correlation with bacillary dysentery incidence. Methods: As a remedy, a modified method to combine ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results: All weather indicators, temperatures, precipitation, evaporation and relative humidity have shown positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of the bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion: Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations. PMID:18816415

  11. Clustered Stomates in "Begonia": An Exercise in Data Collection & Statistical Analysis of Biological Space

    ERIC Educational Resources Information Center

    Lau, Joann M.; Korn, Robert W.

    2007-01-01

    In this article, the authors present a laboratory exercise in data collection and statistical analysis in biological space using clustered stomates on leaves of "Begonia" plants. The exercise can be done in middle school classes by students making their own slides and seeing imprints of cells, or at the high school level through collecting data of…

  12. Joint model-based clustering of nonlinear longitudinal trajectories and associated time-to-event data analysis, linked by latent class membership: with application to AIDS clinical studies.

    PubMed

    Huang, Yangxin; Lu, Xiaosun; Chen, Jiaqing; Liang, Juan; Zangmeister, Miriam

    2017-10-27

    Longitudinal and time-to-event data are often observed together. Finite mixture models are currently used to analyze nonlinear heterogeneous longitudinal data, which, by releasing the homogeneity restriction of nonlinear mixed-effects (NLME) models, can cluster individuals into one of the pre-specified classes with class membership probabilities. This clustering may have clinical significance, and be associated with clinically important time-to-event data. This article develops a joint modeling approach to a finite mixture of NLME models for longitudinal data and proportional hazard Cox model for time-to-event data, linked by individual latent class indicators, under a Bayesian framework. The proposed joint models and method are applied to a real AIDS clinical trial data set, followed by simulation studies to assess the performance of the proposed joint model and a naive two-step model, in which finite mixture model and Cox model are fitted separately.

  13. A novel unsupervised spike sorting algorithm for intracranial EEG.

    PubMed

    Yadav, R; Shah, A K; Loeb, J A; Swamy, M N S; Agarwal, R

    2011-01-01

    This paper presents a novel, unsupervised spike classification algorithm for intracranial EEG. The method combines template matching and principal component analysis (PCA) for building a dynamic patient-specific codebook without a priori knowledge of the spike waveforms. The problem of misclassification due to overlapping classes is resolved by identifying similar classes in the codebook using hierarchical clustering. Cluster quality is visually assessed by projecting inter- and intra-clusters onto a 3D plot. Intracranial EEG from 5 patients was utilized to optimize the algorithm. The resulting codebook retains 82.1% of the detected spikes in non-overlapping and disjoint clusters. Initial results suggest a definite role of this method for both rapid review and quantitation of interictal spikes that could enhance both clinical treatment and research studies on epileptic patients.

  14. Gene expression pattern recognition algorithm inferences to classify samples exposed to chemical agents

    NASA Astrophysics Data System (ADS)

    Bushel, Pierre R.; Bennett, Lee; Hamadeh, Hisham; Green, James; Ableson, Alan; Misener, Steve; Paules, Richard; Afshari, Cynthia

    2002-06-01

    We present an analysis of pattern recognition procedures used to predict the classes of samples exposed to pharmacologic agents by comparing gene expression patterns from samples treated with two classes of compounds. Rat liver mRNA samples following exposure for 24 hours with phenobarbital or peroxisome proliferators were analyzed using a 1700 rat cDNA microarray platform. Sets of genes that were consistently differentially expressed in the rat liver samples following treatment were stored in the MicroArray Project System (MAPS) database. MAPS identified 238 genes in common that possessed a low probability (P < 0.01) of being randomly detected as differentially expressed at the 95% confidence level. Hierarchical cluster analysis on the 238 genes clustered specific gene expression profiles that separated samples based on exposure to a particular class of compound.

  15. Trajectories of acute low back pain: a latent class growth analysis.

    PubMed

    Downie, Aron S; Hancock, Mark J; Rzewuska, Magdalena; Williams, Christopher M; Lin, Chung-Wei Christine; Maher, Christopher G

    2016-01-01

    Characterising the clinical course of back pain by mean pain scores over time may not adequately reflect the complexity of the clinical course of acute low back pain. We analysed pain scores over 12 weeks for 1585 patients with acute low back pain presenting to primary care to identify distinct pain trajectory groups and baseline patient characteristics associated with membership of each cluster. This was a secondary analysis of the PACE trial that evaluated paracetamol for acute low back pain. Latent class growth analysis determined a 5 cluster model, which comprised 567 (35.8%) patients who recovered by week 2 (cluster 1, rapid pain recovery); 543 (34.3%) patients who recovered by week 12 (cluster 2, pain recovery by week 12); 222 (14.0%) patients whose pain reduced but did not recover (cluster 3, incomplete pain recovery); 167 (10.5%) patients whose pain initially decreased but then increased by week 12 (cluster 4, fluctuating pain); and 86 (5.4%) patients who experienced high-level pain for the whole 12 weeks (cluster 5, persistent high pain). Patients with longer pain duration were more likely to experience delayed recovery or nonrecovery. Belief in greater risk of persistence was associated with nonrecovery, but not delayed recovery. Higher pain intensity, longer duration, and workers' compensation were associated with persistent high pain, whereas older age and increased number of episodes were associated with fluctuating pain. Identification of discrete pain trajectory groups offers the potential to better manage acute low back pain.

  16. [Achene morphology cluster analysis of Taraxacum F. H. Wigg. from northeast China and molecule systematics evidence determined by SRAP].

    PubMed

    Li, Hai-juan; Zhao, Xin; Jia, Qing-fei; Li, Tian-lai; Ning, Wei

    2012-08-01

    The morphological and micro-morphological characteristics of achenes from six species of the genus Taraxacum from northeastern China, together with SRAP cluster analysis, were examined as evidence for classification. The achenes were observed by microscope and EPMA. Cluster analysis was based on the size, shape, cone proportion, color and surface sculpture of the achenes. Differences in achene morphology between Taraxacum species are obvious, particularly in spinule distribution and size, achene color and achene size. According to the achene morphology clustering, T. antungense Kitag. and T. urbanum Kitag. should be combined into the same species. The clustering results from achene morphology and from SRAP molecular systematics agree with the grouping in "the Chinese flora". Achene morphology in Taraxacum is stable and conservative, so combinations of achene characters can be used for species delimitation and for analysis of relationships; both the achene morphology cluster analysis and the SRAP molecular systematics support the dandelion classification of "the Chinese flora".

  17. Exploratory Item Classification Via Spectral Graph Clustering

    PubMed Central

    Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2017-01-01

    Large-scale assessments are supported by a large item pool. An important task in test development is to assign items into scales that measure different characteristics of individuals, and a popular approach is cluster analysis of items. Classical methods in cluster analysis, such as the hierarchical clustering, K-means method, and latent-class analysis, often induce a high computational overhead and have difficulty handling missing data, especially in the presence of high-dimensional responses. In this article, the authors propose a spectral clustering algorithm for exploratory item cluster analysis. The method is computationally efficient, effective for data with missing or incomplete responses, easy to implement, and often outperforms traditional clustering algorithms in the context of high dimensionality. The spectral clustering algorithm is based on graph theory, a branch of mathematics that studies the properties of graphs. The algorithm first constructs a graph of items, characterizing the similarity structure among items. It then extracts item clusters based on the graphical structure, grouping similar items together. The proposed method is evaluated through simulations and an application to the revised Eysenck Personality Questionnaire. PMID:29033476

  18. Comparing Perceived Adequacy of Help Received Among Different Classes of Individuals with Severe Mental Disorders at Five-Year Follow-Up: A Longitudinal Cluster Analysis.

    PubMed

    Fleury, Marie-Josée; Grenier, Guy; Bamvita, Jean-Marie

    2017-11-13

    This study developed a typology describing change in the perceived adequacy of help received among 204 individuals with severe mental disorders, 5 years after transfer to the community following a major mental health reform in Quebec (Canada). Participant typologies were constructed using a two-step cluster analysis. There were significant differences between T0 and T2 for perceived adequacy of help received and other independent variables, including seriousness of needs, help from services or relatives, and care continuity. Five classes emerged from the analysis. Perceived adequacy of help received at T2 increased for Class 1, mainly comprised of older women with mood disorders. Overall, greater care continuity and levels of help from services and relatives related to higher perceived AHR. Changes in perceived adequacy of help received resulting from several combinations of associated variables indicate that MH service delivery should respond to specific profiles and determinants.

  19. Probabilistic cluster labeling of imagery data

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B. (Principal Investigator)

    1980-01-01

    The problem of obtaining the probabilities of class labels for the clusters using spectral and spatial information from a given set of labeled patterns and their neighbors is considered. A relationship is developed between class and clusters conditional densities in terms of probabilities of class labels for the clusters. Expressions are presented for updating the a posteriori probabilities of the classes of a pixel using information from its local neighborhood. Fixed-point iteration schemes are developed for obtaining the optimal probabilities of class labels for the clusters. These schemes utilize spatial information and also the probabilities of label imperfections. Experimental results from the processing of remotely sensed multispectral scanner imagery data are presented.

  20. Collaborative filtering recommendation model based on fuzzy clustering algorithm

    NASA Astrophysics Data System (ADS)

    Yang, Ye; Zhang, Yunhua

    2018-05-01

    As one of the most widely used algorithms in recommender systems, collaborative filtering faces two serious problems: data sparsity and poor recommendation quality in big data environments. In traditional cluster analysis, each object is strictly assigned to one of several classes, and the boundaries of this division are sharp. However, most real-world objects have no strict definition of their form or of the attributes of their class. To address these problems, this paper proposes improving the traditional collaborative filtering model through hybrid optimization of an implicit semantic algorithm and a fuzzy clustering algorithm, used together with the collaborative filtering algorithm. The fuzzy clustering algorithm is applied to project attribute information, so that each project belongs to different project categories with different membership degrees; this increases the density of the data, effectively reduces its sparsity, and addresses the low accuracy that results from inaccurate similarity calculation. Finally, the paper carries out an empirical analysis on the MovieLens dataset and compares the proposed algorithm with the traditional user-based collaborative filtering algorithm. The proposed algorithm greatly improves recommendation accuracy.
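
    The idea of membership degrees in the abstract above can be sketched in a few lines. The abstract does not say which fuzzy clustering algorithm is used; the Python sketch below assumes the standard fuzzy C-means membership formula, and the function name `fuzzy_memberships`, the attribute vector and the category centers are all made up for illustration.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Membership degree of `point` in each cluster, fuzzy C-means style.

    `m` > 1 is the fuzzifier; larger values give softer memberships.
    This is an assumed formula, not necessarily the one used in the paper.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to it (ties share equally).
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        return [1.0 / sum(zeros) if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Made-up attribute vector and two made-up category centers.
item = [1.0, 0.0]
centers = [[0.0, 0.0], [3.0, 0.0]]
memberships = fuzzy_memberships(item, centers)
print(memberships)
```

    The memberships always sum to one, so an item near a boundary is shared between categories rather than forced into a single one.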

  1. Regression analysis of clustered failure time data with informative cluster size under the additive transformation models.

    PubMed

    Chen, Ling; Feng, Yanqin; Sun, Jianguo

    2017-10-01

    This paper discusses regression analysis of clustered failure time data, which occur when the failure times of interest are collected from clusters. In particular, we consider the situation where the correlated failure times of interest may be related to cluster sizes. For inference, we present two estimation procedures, the weighted estimating equation-based method and the within-cluster resampling-based method, when the correlated failure times of interest arise from a class of additive transformation models. The former makes use of the inverse of cluster sizes as weights in the estimating equations, while the latter can be easily implemented by using the existing software packages for right-censored failure time data. An extensive simulation study is conducted and indicates that the proposed approaches work well in both the situations with and without informative cluster size. They are applied to a dental study that motivated this study.
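
    The inverse-cluster-size weighting mentioned in the abstract above can be illustrated on a much simpler quantity than the paper's estimating equations. The Python sketch below uses a hypothetical helper `cluster_weighted_mean` and made-up data; it only shows how weights of 1/(cluster size) make every cluster contribute equally, and is not the authors' estimator.

```python
from collections import defaultdict

def cluster_weighted_mean(values, cluster_ids):
    """Mean of `values` with each observation weighted by 1 / (size of its cluster).

    Hypothetical helper for illustration: with these weights every cluster
    carries the same total weight, however many members it has.
    """
    sizes = defaultdict(int)
    for cid in cluster_ids:
        sizes[cid] += 1
    weights = [1.0 / sizes[cid] for cid in cluster_ids]
    total = sum(w * v for w, v in zip(weights, values))
    return total / sum(weights)

# Made-up data: cluster "a" has 4 members, cluster "b" has 1.
values = [1.0, 1.0, 1.0, 1.0, 5.0]
clusters = ["a", "a", "a", "a", "b"]

unweighted = sum(values) / len(values)   # dominated by the larger cluster
weighted = cluster_weighted_mean(values, clusters)
print(unweighted, weighted)
```

    When cluster size is related to the outcome, the unweighted and weighted summaries can differ noticeably, which is the situation the abstract calls informative cluster size.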

  2. Inference from clustering with application to gene-expression microarrays.

    PubMed

    Dougherty, Edward R; Barrera, Junior; Brun, Marcel; Kim, Seungchan; Cesar, Roberto M; Chen, Yidong; Bittner, Michael; Trent, Jeffrey M

    2002-01-01

    There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to which process they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. 
    Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. A large amount of generated output is available over the web.
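
    The clustering error defined in the abstract above (the number of points clustered incorrectly relative to the generating processes) can be sketched as follows. Because cluster ids are arbitrary, this Python sketch tries every mapping of cluster ids to class ids and keeps the smallest error; the function name `clustering_error` and the labels are made up, and the brute-force search is an assumption suited only to a handful of clusters.

```python
from itertools import permutations

def clustering_error(true_labels, cluster_labels):
    """Smallest number of misassigned points over all mappings of cluster ids to class ids."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    if len(clusters) > len(classes):
        raise ValueError("this sketch assumes no more clusters than classes")
    best = len(true_labels)
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        errors = sum(1 for t, c in zip(true_labels, cluster_labels) if mapping[c] != t)
        best = min(best, errors)
    return best

# Made-up example: six points from two generating processes, one point misplaced.
true_labels = [0, 0, 0, 1, 1, 1]
cluster_labels = ["x", "x", "y", "y", "y", "y"]
error = clustering_error(true_labels, cluster_labels)
print(error)
```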

  3. Understanding Teacher Users of a Digital Library Service: A Clustering Approach

    ERIC Educational Resources Information Center

    Xu, Beijie

    2011-01-01

    This research examined teachers' online behaviors while using a digital library service--the Instructional Architect (IA)--through three consecutive studies. In the first two studies, a statistical model called latent class analysis (LCA) was applied to cluster different groups of IA teachers according to their diverse online behaviors. The third…

  4. Clusters and Correlates of Experiences with Parents and Peers in Early Adolescence

    ERIC Educational Resources Information Center

    Kan, Marni L.; McHale, Susan M.

    2007-01-01

    This study used a person-oriented approach to examine links between adolescents' experiences with parents and peers. Cluster analysis classified 361, White, working- and middle-class youth (mean age = 12.16 years) based on mothers' and fathers' reports of parental acceptance and adolescents' reports of perceived peer competence. Three patterns…

  5. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    NASA Astrophysics Data System (ADS)

    Crawford, I.; Ruske, S.; Topping, D. O.; Gallagher, M. W.

    2015-07-01

    In this paper we present improved methods for discriminating and quantifying Primary Biological Aerosol Particles (PBAP) by applying hierarchical agglomerative cluster analysis to multi-parameter ultra violet-light induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1×10^6 points on a desktop computer, allowing for each fluorescent particle in a dataset to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient dataset. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best performing methods were applied to the BEACHON-RoMBAS ambient dataset where it was found that the z-score and range normalisation methods yield similar results with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP) where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of bacterial aerosol concentration by a factor of 5.
    We suggest that this is likely due to errors arising from misattribution due to poor centroid definition and failure to assign particles to a cluster as a result of the subsampling and comparative attribution method employed by WASP. The methods used here allow for the entire fluorescent population of particles to be analysed, yielding an explicit cluster attribution for each particle, improving cluster centroid definition and our capacity to discriminate and quantify PBAP meta-classes compared to previous approaches.
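
    The combination reported as performing best above, z-score normalisation followed by Ward linkage, can be sketched in plain Python. This is a naive illustration under stated assumptions, not the authors' analysis package: the function names `zscore_columns` and `ward_clusters` are hypothetical, the measurements are made up, and the cubic-time loop would not scale to the data set sizes the abstract mentions.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Scale every column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]   # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def ward_clusters(rows, n_clusters):
    """Naive agglomerative clustering with the Ward merge criterion.

    Returns a sorted list of clusters, each a sorted list of row indices.
    """
    clusters = [([i], list(row)) for i, row in enumerate(rows)]  # (members, centroid)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Ward cost: increase in within-cluster sum of squares if a and b merge.
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        n = len(ma) + len(mb)
        merged = (ma + mb, [(len(ma) * x + len(mb) * y) / n for x, y in zip(ca, cb)])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(sorted(members) for members, _ in clusters)

# Made-up particles: (optical size, fluorescence intensity) on very different scales.
rows = [[1.0, 100.0], [1.2, 110.0], [0.9, 95.0],
        [3.0, 900.0], [3.2, 950.0], [2.9, 880.0]]
groups = ward_clusters(zscore_columns(rows), 2)
print(groups)
```

    Normalising first keeps the large-valued fluorescence column from dominating the Euclidean distances that the Ward criterion is built on.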

  6. A fast learning method for large scale and multi-class samples of SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yu; Guo, Huiming

    2017-06-01

    A fast learning method for multi-class SVM (Support Vector Machine) classification based on a binary tree is presented to address the low learning efficiency of SVM when processing large-scale multi-class samples. The method builds the binary tree hierarchy bottom-up; following this hierarchy, the sub-classifier at each node learns from the corresponding samples. During learning, several class clusters are generated by a first clustering of the training samples. First, central points are extracted from those class clusters that contain only one type of sample. For clusters that contain two types of samples, the numbers of clusters for their positive and negative samples are set according to their degree of mixture, a secondary clustering is then performed, and central points are extracted from the resulting sub-class clusters. Sub-classifiers are obtained by learning from the reduced sample set formed by combining the extracted central points. Simulation experiments show that this fast learning method, based on multi-level clustering, can guarantee higher classification accuracy, greatly reduce the number of samples and effectively improve learning efficiency.

  7. College Textbook Reading Assignments and Class Time Activity

    ERIC Educational Resources Information Center

    Aagaard, Lola; Conner, Timothy W., II.; Skidmore, Ronald L.

    2014-01-01

    A convenient cluster sample of 105 undergraduate students at a regional university in the midsouth completed a survey regarding their use of college textbooks, what strategies might increase the likelihood of their reading textbook assignments, and their preference for how class time was used. Descriptive analysis was conducted on the results and…

  8. A Comparison of Classification Approaches for Cyberbullying and Traditional Bullying Using Data from Six European Countries

    ERIC Educational Resources Information Center

    Schultze-Krumbholz, Anja; Göbel, Kristin; Scheithauer, Herbert; Brighi, Antonella; Guarini, Annalisa; Tsorbatzoudis, Haralambos; Barkoukis, Vassilis; Pyzalski, Jacek; Plichta, Piotr; Del Rey, Rosario; Casas, José A.; Thompson, Fran; Smith, Peter K.

    2015-01-01

    In recently published studies on cyberbullying, students are frequently categorized into distinct (cyber)bully and (cyber)victim clusters based on theoretical assumptions and arbitrary cut-off scores adapted from traditional bullying research. The present study identified involvement classes empirically using latent class analysis (LCA), to…

  9. Three subgroups of pain profiles identified in 227 women with arthritis: a latent class analysis.

    PubMed

    de Luca, Katie; Parkinson, Lynne; Downie, Aron; Blyth, Fiona; Byles, Julie

    2017-03-01

    The objectives were to identify subgroups of women with arthritis based upon the multi-dimensional nature of their pain experience and to compare health and socio-demographic variables between subgroups. A latent class analysis of 227 women with self-reported arthritis was used to identify clusters of women based upon the sensory, affective, and cognitive dimensions of the pain experience. Multivariate multinomial logistic regression analysis was used to determine the relationship between cluster membership and health and sociodemographic characteristics. A three-class cluster model was most parsimonious. 39.5 % of women had a unidimensional pain profile; 38.6 % of women had moderate multidimensional pain profile that included additional pain symptomatology such as sensory qualities and pain catastrophizing; and 21.9 % of women had severe multidimensional pain profile that included prominent pain symptomatology such as sensory and affective qualities of pain, pain catastrophizing, and neuropathic pain. Women with severe multidimensional pain profile have a 30.5 % higher risk of poorer quality of life and a 7.3 % higher risk of suffering depression, and women with moderate multidimensional pain profile have a 6.4 % higher risk of poorer quality of life when compared to women with unidimensional pain. This study identified three distinct subgroups of pain profiles in older women with arthritis. Women had very different experiences of pain, and cluster membership impacted significantly on health-related quality of life. These preliminary findings provide a stronger understanding of profiles of pain and may contribute to the development of tailored treatment options in arthritis.

  10. Latent Class Analysis of Incomplete Data via an Entropy-Based Criterion

    PubMed Central

    Larose, Chantal; Harel, Ofer; Kordas, Katarzyna; Dey, Dipak K.

    2016-01-01

    Latent class analysis is used to group categorical data into classes via a probability model. Model selection criteria then judge how well the model fits the data. When addressing incomplete data, the current methodology restricts the imputation to a single, pre-specified number of classes. We seek to develop an entropy-based model selection criterion that does not restrict the imputation to one number of clusters. Simulations show the new criterion performing well against the current standards of AIC and BIC, while a family studies application demonstrates how the criterion provides more detailed and useful results than AIC and BIC. PMID:27695391
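
    The AIC and BIC standards that the abstract above compares against can be computed directly from a fitted model's log-likelihood. The Python sketch below covers only those two criteria, not the entropy-based criterion the paper proposes; the parameter count assumes an unrestricted latent class model with binary items, and the log-likelihood values, item count and sample size are made up.

```python
import math

def lca_free_parameters(n_classes, n_items):
    """Free parameters of an unrestricted latent class model with binary items:
    (n_classes - 1) class proportions plus one response probability per item per class.
    """
    return (n_classes - 1) + n_classes * n_items

def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical maximised log-likelihoods for 1 to 4 classes (6 binary items, 500 respondents).
log_likelihoods = {1: -2100.0, 2: -1950.0, 3: -1935.0, 4: -1931.0}
n_items, n_obs = 6, 500

scores = {
    k: (aic(ll, lca_free_parameters(k, n_items)),
        bic(ll, lca_free_parameters(k, n_items), n_obs))
    for k, ll in log_likelihoods.items()
}
best_by_aic = min(scores, key=lambda k: scores[k][0])
best_by_bic = min(scores, key=lambda k: scores[k][1])
print(best_by_aic, best_by_bic)
```

    With these made-up numbers AIC favours three classes and BIC two, because BIC penalises each extra parameter by log(500) rather than 2.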

  11. Technical support for creating an artificial intelligence system for feature extraction and experimental design

    NASA Technical Reports Server (NTRS)

    Glick, B. J.

    1985-01-01

    Techniques for classifying objects into groups or classes go under many different names including, most commonly, cluster analysis. Mathematically, the general problem is to find a best mapping of objects into an index set consisting of class identifiers. When an a priori grouping of objects exists, the process of deriving the classification rules from samples of classified objects is known as discrimination. When such rules are applied to objects of unknown class, the process is denoted classification. The specific problem addressed involves the group classification of a set of objects that are each associated with a series of measurements (ratio, interval, ordinal, or nominal levels of measurement). Each measurement produces one variable in a multidimensional variable space. Cluster analysis techniques are reviewed and methods for including geographic location, distance measures, and spatial pattern (distribution) as parameters in clustering are examined. For the case of patterning, measures of spatial autocorrelation are discussed in terms of the kind of data (nominal, ordinal, or interval scaled) to which they may be applied.

  12. Cloning, expression and biochemical characterization of one Epsilon-class (GST-3) and ten Delta-class (GST-1) glutathione S-transferases from Drosophila melanogaster, and identification of additional nine members of the Epsilon class.

    PubMed Central

    Sawicki, Rafał; Singh, Sharda P; Mondal, Ashis K; Benes, Helen; Zimniak, Piotr

    2003-01-01

    From the fruitfly, Drosophila melanogaster, ten members of the cluster of Delta-class glutathione S-transferases (GSTs; formerly denoted as Class I GSTs) and one member of the Epsilon-class cluster (formerly GST-3) have been cloned, expressed in Escherichia coli, and their catalytic properties have been determined. In addition, nine more members of the Epsilon cluster have been identified through bioinformatic analysis but not further characterized. Of the 11 expressed enzymes, seven accepted the lipid peroxidation product 4-hydroxynonenal as substrate, and nine were active in glutathione conjugation of 1-chloro-2,4-dinitrobenzene. Since the enzymically active proteins included the gene products of DmGSTD3 and DmGSTD7 which were previously deemed to be pseudogenes, we investigated them further and determined that both genes are transcribed in Drosophila. Thus our present results indicate that DmGSTD3 and DmGSTD7 are probably functional genes. The existence and multiplicity of insect GSTs capable of conjugating 4-hydroxynonenal, in some cases with catalytic efficiencies approaching those of mammalian GSTs highly specialized for this function, indicates that metabolism of products of lipid peroxidation is a highly conserved biochemical pathway with probable detoxification as well as regulatory functions. PMID:12443531

  13. Applicability of Hydrologic Landscapes for Model Calibration ...

    EPA Pesticide Factsheets

    The Pacific Northwest Hydrologic Landscapes (PNW HL) at the assessment unit scale has provided a solid conceptual classification framework to relate and transfer hydrologically meaningful information between watersheds without access to streamflow time series. A collection of techniques was applied to the HL assessment unit composition in watersheds across the Pacific Northwest to aggregate the hydrologic behavior of the Hydrologic Landscapes from the assessment unit scale to the watershed scale. This non-trivial solution both emphasizes HL classifications within the watershed that provide the majority of moisture surplus/deficit and considers the relative position (upstream vs. downstream) of these HL classifications. A clustering algorithm was applied to the HL-based characterization of assessment units within 185 watersheds to help organize watersheds into nine classes hypothesized to have similar hydrologic behavior. The HL-based classes were used to organize and describe hydrologic behavior information about watershed classes and both predictions and validations were independently performed with regard to the general magnitude of six hydroclimatic signature values. A second cluster analysis was then performed using the independently calculated signature values as similarity metrics, and it was found that the six signature clusters showed substantial overlap in watershed class membership to those in the HL-based classes. One hypothesis set forward from thi

  14. The cluster-cluster correlation function. [of galaxies]

    NASA Technical Reports Server (NTRS)

    Postman, M.; Geller, M. J.; Huchra, J. P.

    1986-01-01

    The clustering properties of the Abell and Zwicky cluster catalogs are studied using the two-point angular and spatial correlation functions. The catalogs are divided into eight subsamples to determine the dependence of the correlation function on distance, richness, and the method of cluster identification. It is found that the Corona Borealis supercluster contributes significant power to the spatial correlation function of the Abell cluster sample with distance class of four or less. The distance-limited catalog of 152 Abell clusters, which is not greatly affected by a single system, has a spatial correlation function consistent with the power law $\xi(r) = 300\,r^{-1.8}$. In both the distance class four or less and distance-limited samples the signal in the spatial correlation function is a power law detectable out to 60/h Mpc. The amplitude of $\xi(r)$ for clusters of richness class two is about three times that for richness class one clusters. The two-point spatial correlation function is sensitive to the use of estimated redshifts.
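
    The power law quoted in the abstract, read as an amplitude of 300 and an exponent of -1.8, can be evaluated directly. The sketch below assumes that r is expressed in units of Mpc/h (the abstract only gives the detection limit in those units) and computes the scale `r0` at which the correlation function falls to 1; the function names are illustrative.

```python
def xi(r, amplitude=300.0, slope=1.8):
    """Power-law two-point correlation function: amplitude * r**(-slope).

    r is assumed to be in Mpc/h; amplitude and slope default to the
    values quoted in the abstract.
    """
    return amplitude * r ** (-slope)


def correlation_length(amplitude=300.0, slope=1.8):
    """Scale r0 at which xi(r0) = 1, i.e. r0 = amplitude**(1/slope)."""
    return amplitude ** (1.0 / slope)


r0 = correlation_length()
print(f"r0 = {r0:.1f} Mpc/h, xi(r0) = {xi(r0):.3f}")
print(f"xi at the 60 Mpc/h detection limit = {xi(60.0):.3f}")
```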

  15. Substance misuse subtypes among women convicted of homicide.

    PubMed

    de Melo Nunes, Adriana; Baltieri, Danilo Antonio

    2013-01-01

    The proportion of women incarcerated is growing at a faster pace than that for men. The reasons for this important increase have been mainly attributed to drug-using lifestyle and drug-related offenses. About half of female inmates have a history of substance misuse and one third demonstrate high impulsiveness levels. The objectives of this study were to (a) identify subtypes of alcohol and drug problems and impulsiveness among women convicted of homicide, and (b) examine the association between psychosocial and criminological features and the resulting clusters. Data come from 158 female inmates serving a sentence for homicide in the Penitentiary of Sant'Ana in São Paulo State, Brazil. Latent class analysis was used to group participants into substance misuse and impulsiveness classes. Two classes were identified: nonproblematic (cluster 1: 54.43%, n = 86) and problematic (cluster 2: 45.57%, n = 72) ones. After controlling for several psychosocial and criminological variables, cluster 2 inmates showed an earlier beginning of criminal activities and a lower educational level than their counterparts. To recognize the necessities of specific groups of female offenders is crucial for the development of an adequate system of health policies and for the decrease of criminal recidivism among those offenders who have shown higher risk.

  16. Iterative Stable Alignment and Clustering of 2D Transmission Electron Microscope Images

    PubMed Central

    Yang, Zhengfan; Fang, Jia; Chittuluru, Johnathan; Asturias, Francisco J.; Penczek, Pawel A.

    2012-01-01

    SUMMARY Identification of homogeneous subsets of images in a macromolecular electron microscopy (EM) image data set is a critical step in single-particle analysis. The task is handled by iterative algorithms, whose performance is compromised by the compounded limitations of image alignment and K-means clustering. Here we describe an approach, iterative stable alignment and clustering (ISAC) that, relying on a new clustering method and on the concepts of stability and reproducibility, can extract validated, homogeneous subsets of images. ISAC requires only a small number of simple parameters and, with minimal human intervention, can eliminate bias from two-dimensional image clustering and maximize the quality of group averages that can be used for ab initio three-dimensional structural determination and analysis of macromolecular conformational variability. Repeated testing of the stability and reproducibility of a solution within ISAC eliminates heterogeneous or incorrect classes and introduces critical validation to the process of EM image clustering. PMID:22325773

  17. Covering #SAE: A Mobile Reporting Class's Changing Patterns of Interaction on Twitter over Time

    ERIC Educational Resources Information Center

    Jones, Julie

    2015-01-01

    This study examined the social network that emerged on Twitter surrounding a mobile reporting class as they covered a national breaking news event. The work introduces pedagogical strategies that enhance students' learning opportunities. Through NodeXL and social network cluster analysis, six groups emerged from the Twitter interactions tied to…

  18. The Relationship of Academic Self-Efficacy to Class Participation and Exam Performance

    ERIC Educational Resources Information Center

    Galyon, Charles E.; Blondin, Carolyn A.; Yaw, Jared S.; Nalls, Meagan L.; Williams, Robert L.

    2012-01-01

    This study examined the relationship of academic self-efficacy to engagement in class discussion and performance on major course exams among students (N = 165) in an undergraduate human development course. Cluster analysis was used to identify three levels of academic self-efficacy: high (n = 34), medium (n = 91), and low (n = 40). Results…

  19. Motivational and emotional profiles in university undergraduates: a self-determination theory perspective.

    PubMed

    González, Antonio; Paoloni, Verónica; Donolo, Danilo; Rinaudo, Cristina

    2012-11-01

    Previous research has focused on specific forms of self-determined motivation or discrete class-related emotions, but few studies have simultaneously examined both constructs. The aim of this study on 472 undergraduates was twofold: to perform cluster analysis to identify homogeneous groups of motivation in the sample; and to determine the profile of each cluster for emotions and academic achievement. Cluster analysis configured four groups in terms of motivation: controlled, autonomous, both high, and both low. Each cluster revealed a distinct emotional profile, autonomous motivation being the most adaptable with high scores for academic achievement and pleasant emotions and low values for unpleasant emotions. The results are discussed in the light of their implications for academic adjustment.

  20. Longitudinal patterns of gambling activities and associated risk factors in college students

    PubMed Central

    Goudriaan, Anna E.; Slutske, Wendy S.; Krull, Jennifer L.; Sher, Kenneth J.

    2009-01-01

    Aims To investigate which clusters of gambling activities exist within a longitudinal study of college health, how membership in gambling clusters changes over time and whether particular clusters of gambling are associated with unhealthy risk behaviour. Design Four-year longitudinal study (2002–2006). Setting Large, public university. Participants Undergraduate college students. Measurements Ten common gambling activities were measured during 4 consecutive college years (years 1–4). Clusters of gambling activities were examined using latent class analyses. Relations between gambling clusters and gender, Greek membership, alcohol use, drug use, personality indicators of behavioural undercontrol and psychological distress were examined. Findings Four latent gambling classes were identified: (1) a low-gambling class, (2) a card gambling class, (3) a casino/slots gambling class and (4) an extensive gambling class. Over the first college years a high probability of transitioning from the low-gambling class and the card gambling class into the casino/slots gambling class was present. Membership in the card, casino/slots and extensive gambling classes was associated with higher scores on alcohol/drug use, novelty seeking and self-identified gambling problems compared to the low-gambling class. The extensive gambling class scored higher than the other gambling classes on risk factors. Conclusions Extensive gamblers and card gamblers are at higher risk for problem gambling and other risky health behaviours. Prospective examinations of class membership suggested that being in the extensive and the low gambling classes was highly stable across the 4 years of college. PMID:19438422

  1. An application of cluster detection to scene analysis

    NASA Technical Reports Server (NTRS)

    Rosenfeld, A. H.; Lee, Y. H.

    1971-01-01

    Certain arrangements of local features in a scene tend to group together and to be seen as units. It is suggested that in some instances, this phenomenon might be interpretable as a process of cluster detection in a graph-structured space derived from the scene. This idea is illustrated using a class of scenes that contain only horizontal and vertical line segments.

  2. Cluster analysis of S. Cerevisiae nucleosome binding sites

    NASA Astrophysics Data System (ADS)

    Suvorova, Y.; Korotkov, E.

    2017-12-01

    It is well known that a major part of a eukaryotic genome is wrapped around histone proteins forming nucleosomes. It was also demonstrated that the DNA sequence itself plays an important role in the nucleosome positioning process. In this work, a cluster analysis of 67,517 nucleosome binding sites from the S. cerevisiae genome was carried out. The classification method is based on the self-adjusting dinucleotide position weight matrix. As a result, 135 significant clusters were discovered that contain 43,225 sequences (which constitutes 64% of the initial set). The meaning of the classes found is discussed, as well as the possibility of their further use.

  3. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    NASA Astrophysics Data System (ADS)

    Crawford, I.; Ruske, S.; Topping, D. O.; Gallagher, M. W.

    2015-11-01

    In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio-hydro-atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen-Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results.
The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP) where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of bacterial aerosol concentration by a factor of 5. We suggest that this is likely due to errors arising from misattribution due to poor centroid definition and failure to assign particles to a cluster as a result of the subsampling and comparative attribution method employed by WASP. The methods used here allow for the entire fluorescent population of particles to be analysed, yielding an explicit cluster attribution for each particle and improving cluster centroid definition and our capacity to discriminate and quantify PBAP meta-classes compared to previous approaches.
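
    To make the combination of z-score normalisation and Ward linkage concrete, here is a small pure-Python sketch. It is not the authors' analysis package: it uses a naive greedy search that is far too slow for the data set sizes mentioned above, and the six "particles" (size, asymmetry, fluorescence) are invented values. At each step it merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares, which is the Ward criterion.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Scale every column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a["members"]), len(b["members"])
    d2 = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
    return na * nb / (na + nb) * d2


def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward linkage, stopped at k clusters."""
    clusters = [{"members": [i], "centroid": list(p)} for i, p in enumerate(points)]
    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            # Size-weighted mean of the two centroids.
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(a["centroid"], b["centroid"])],
        }
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return [sorted(c["members"]) for c in clusters]


# Hypothetical particles: (optical size, asymmetry factor, fluorescence).
particles = [
    (1.0, 10.0, 5.0), (1.2, 12.0, 7.0), (0.9, 11.0, 6.0),
    (3.0, 30.0, 400.0), (3.2, 28.0, 420.0), (2.9, 31.0, 390.0),
]
labels = ward_clusters(zscore_columns(particles), k=2)
print(sorted(labels))
```

    Normalising first matters because the fluorescence column spans hundreds of units while the size column spans only a few; without scaling, the largest-valued column would dominate the squared distances.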

  4. Yellow evolved stars in open clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sowell, J.R.

    1987-05-01

    This paper describes a program in which Galactic cluster post-AGB candidates were first identified and then analyzed for cluster membership via radial velocities, monitored for possible photometric variations, examined for evidence of mass loss, and classified as completely as possible in terms of their basic stellar parameters. The intrinsically brightest supergiants are found in the youngest clusters. With increasing cluster age, the absolute luminosities attained by the supergiants decline. It appears that the evolutionary tracks of luminosity class II stars are more similar to those of class I than of class III. Only two superluminous giant star candidates are found in open clusters. 154 references.

  5. Inductive Approaches to Improving Diagnosis and Design for Diagnosability

    NASA Technical Reports Server (NTRS)

    Fisher, Douglas H. (Principal Investigator)

    1995-01-01

    The first research area under this grant addresses the problem of classifying time series according to their morphological features in the time domain. A supervised learning system called CALCHAS, which induces a classification procedure for signatures from preclassified examples, was developed. For each of several signature classes, the system infers a model that captures the class's morphological features using Bayesian model induction and the minimum message length approach to assign priors. After induction, a time series (signature) is classified in one of the classes when there is enough evidence to support that decision. Time series with sufficiently novel features, belonging to classes not present in the training set, are recognized as such. A second area of research assumes two sources of information about a system: a model or domain theory that encodes aspects of the system under study and data from actual system operations over time. A model, when it exists, represents strong prior expectations about how a system will perform. Our work with a diagnostic model of the RCS (Reaction Control System) of the Space Shuttle motivated the development of SIG, a system which combines information from a model (or domain theory) and data. As it tracks RCS behavior, the model computes quantitative and qualitative values. Induction is then performed over the data represented by both the 'raw' features and the model-computed high-level features. Finally, work on clustering for operating mode discovery motivated some important extensions to the clustering strategy we had used. One modification appends an iterative optimization technique onto the clustering system; this optimization strategy appears to be novel in the clustering literature. A second modification improves the noise tolerance of the clustering system. 
In particular, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus making post-clustering analysis easier.

  6. Parallel and Scalable Clustering and Classification for Big Data in Geosciences

    NASA Astrophysics Data System (ADS)

    Riedel, M.

    2015-12-01

    Machine learning, data mining, and statistical computing are common techniques to perform analysis in earth sciences. This contribution will focus on two concrete and widely used data analytics methods suitable to analyse 'big data' in the context of geoscience use cases: clustering and classification. From the broad class of available clustering methods we focus on the density-based spatial clustering of applications with noise (DBSCAN) algorithm that enables the identification of outliers or interesting anomalies. A new open source parallel and scalable DBSCAN implementation will be discussed in the light of a scientific use case that detects water mixing events in the Koljoefjords. The second technique we cover is classification, with a focus set on the support vector machines algorithm (SVMs), as one of the best out-of-the-box classification algorithms. A parallel and scalable SVM implementation will be discussed in the light of a scientific use case in the field of remote sensing with 52 different classes of land cover types.
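
    The way DBSCAN separates dense clusters from outliers can be sketched in a few lines. The version below is a serial, brute-force illustration of the algorithm, not the parallel implementation discussed in the abstract; the points and the `eps` and `min_pts` values are made up. A point is a core point when at least `min_pts` points (itself included) lie within `eps`; clusters grow outward from core points, and anything unreachable is labelled as noise.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE          # may later be relabelled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Two dense squares of points and one isolated outlier (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

    The brute-force neighbour search makes this quadratic in the number of points, which is the cost that spatial indexing and parallelisation are meant to reduce.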

  7. MHC class I loci of the Bar-Headed goose (Anser indicus)

    PubMed Central

    2010-01-01

    MHC class I proteins mediate functions in anti-pathogen defense. MHC diversity has already been investigated by many studies in model avian species, but here we chose the bar-headed goose, a worldwide migrant bird, as a non-model avian species. Sequences from exons encoding the peptide-binding region (PBR) of MHC class I molecules were isolated from liver genomic DNA, to investigate variation in these genes. These are the first MHC class I partial sequences of the bar-headed goose to be reported. A preliminary analysis suggests the presence of at least four MHC class I genes, which share great similarity with those of the goose and duck. A phylogenetic analysis of bar-headed goose, goose and duck MHC class I sequences using the NJ method supports the idea that they all cluster within the anseriforms clade. PMID:21637434

  8. Statewide land cover derived from multiseasonal Landsat TM data: A retrospective of the WISCLAND project

    USGS Publications Warehouse

    Reese, H.M.; Lillesand, T.M.; Nagel, D.E.; Stewart, J.S.; Goldmann, R.A.; Simmons, T.E.; Chipman, J.W.; Tessar, P.A.

    2002-01-01

    Landsat Thematic Mapper (TM) data were the basis in production of a statewide land cover data set for Wisconsin, undertaken in partnership with U.S. Geological Survey's (USGS) Gap Analysis Program (GAP). The data set contained seven classes comparable to Anderson Level I and 24 classes comparable to Anderson Level II/III. Twelve scenes of dual-date TM data were processed with methods that included principal components analysis, stratification into spectrally consistent units, separate classification of upland, wetland, and urban areas, and a hybrid supervised/unsupervised classification called "guided clustering." The final data had overall accuracies of 94% for Anderson Level I upland classes, 77% for Level II/III upland classes, and 84% for Level II/III wetland classes. Classification accuracies for deciduous and coniferous forest were 95% and 93%, respectively, and forest species' overall accuracies ranged from 70% to 84%. Limited availability of acceptable imagery necessitated use of an early May date in a majority of scene pairs, perhaps contributing to lower accuracy for upland deciduous forest species. The mixed deciduous/coniferous forest class had the lowest accuracy, most likely due to distinctly classifying a purely mixed class. Mixed forest signatures containing oak were often confused with pure oak. Guided clustering was seen as an efficient classification method, especially at the tree species level, although its success relied in part on image dates, accurate ground truth, and some analyst intervention. © 2002 Elsevier Science Inc. All rights reserved.

  9. Reference Values of Within-District Intraclass Correlations of Academic Achievement by District Characteristics: Results from a Meta-Analysis of District-Specific Values

    ERIC Educational Resources Information Center

    Hedberg, E. C.; Hedges, Larry V.

    2014-01-01

    Randomized experiments are often considered the strongest designs to study the impact of educational interventions. Perhaps the most prevalent class of designs used in large scale education experiments is the cluster randomized design in which entire schools are assigned to treatments. In cluster randomized trials (CRTs) that assign schools to…

  10. Robust Classification of Small-Molecule Mechanism of Action Using a Minimalist High-Content Microscopy Screen and Multidimensional Phenotypic Trajectory Analysis

    PubMed Central

    Twarog, Nathaniel R.; Low, Jonathan A.; Currier, Duane G.; Miller, Greg; Chen, Taosheng; Shelat, Anang A.

    2016-01-01

    Phenotypic screening through high-content automated microscopy is a powerful tool for evaluating the mechanism of action of candidate therapeutics. Despite more than a decade of development, however, high content assays have yielded mixed results, identifying robust phenotypes in only a small subset of compound classes. This has led to a combinatorial explosion of assay techniques, analyzing cellular phenotypes across dozens of assays with hundreds of measurements. Here, using a minimalist three-stain assay and only 23 basic cellular measurements, we developed an analytical approach that leverages informative dimensions extracted by linear discriminant analysis to evaluate similarity between the phenotypic trajectories of different compounds in response to a range of doses. This method enabled us to visualize biologically-interpretable phenotypic tracks populated by compounds of similar mechanism of action, cluster compounds according to phenotypic similarity, and classify novel compounds by comparing them to phenotypically active exemplars. Hierarchical clustering applied to 154 compounds from over a dozen different mechanistic classes demonstrated tight agreement with published compound mechanism classification. Using 11 phenotypically active mechanism classes, classification was performed on all 154 compounds: 78% were correctly identified as belonging to one of the 11 exemplar classes or to a different unspecified class, with accuracy increasing to 89% when less phenotypically active compounds were excluded. Importantly, several apparent clustering and classification failures, including rigosertib and 5-fluoro-2’-deoxycytidine, instead revealed more complex mechanisms or off-target effects verified by more recent publications. 
These results show that a simple, easily replicated, minimalist high-content assay can reveal subtle variations in the cellular phenotype induced by compounds and can correctly predict mechanism of action, as long as the appropriate analytical tools are used. PMID:26886014

  11. Robust Classification of Small-Molecule Mechanism of Action Using a Minimalist High-Content Microscopy Screen and Multidimensional Phenotypic Trajectory Analysis.

    PubMed

    Twarog, Nathaniel R; Low, Jonathan A; Currier, Duane G; Miller, Greg; Chen, Taosheng; Shelat, Anang A

    2016-01-01

    Phenotypic screening through high-content automated microscopy is a powerful tool for evaluating the mechanism of action of candidate therapeutics. Despite more than a decade of development, however, high content assays have yielded mixed results, identifying robust phenotypes in only a small subset of compound classes. This has led to a combinatorial explosion of assay techniques, analyzing cellular phenotypes across dozens of assays with hundreds of measurements. Here, using a minimalist three-stain assay and only 23 basic cellular measurements, we developed an analytical approach that leverages informative dimensions extracted by linear discriminant analysis to evaluate similarity between the phenotypic trajectories of different compounds in response to a range of doses. This method enabled us to visualize biologically-interpretable phenotypic tracks populated by compounds of similar mechanism of action, cluster compounds according to phenotypic similarity, and classify novel compounds by comparing them to phenotypically active exemplars. Hierarchical clustering applied to 154 compounds from over a dozen different mechanistic classes demonstrated tight agreement with published compound mechanism classification. Using 11 phenotypically active mechanism classes, classification was performed on all 154 compounds: 78% were correctly identified as belonging to one of the 11 exemplar classes or to a different unspecified class, with accuracy increasing to 89% when less phenotypically active compounds were excluded. Importantly, several apparent clustering and classification failures, including rigosertib and 5-fluoro-2'-deoxycytidine, instead revealed more complex mechanisms or off-target effects verified by more recent publications. 
These results show that a simple, easily replicated, minimalist high-content assay can reveal subtle variations in the cellular phenotype induced by compounds and can correctly predict mechanism of action, as long as the appropriate analytical tools are used.

  12. [A cloud detection algorithm for MODIS images combining Kmeans clustering and multi-spectral threshold method].

    PubMed

    Wang, Wei; Song, Wei-Guo; Liu, Shi-Xing; Zhang, Yong-Ming; Zheng, Hong-Yang; Tian, Wei

    2011-04-01

    An improved method for detecting cloud that combines K-means clustering with the multi-spectral threshold approach is described. On the basis of landmark spectrum analysis, MODIS data are first categorized into two major types by the K-means method. The first class includes clouds, smoke and snow, and the second class includes vegetation, water and land. A multi-spectral threshold detection is then applied to the first class to eliminate interference such as smoke and snow. The method is tested with MODIS data acquired at different times under different underlying surface conditions. Visual inspection of the results showed that the algorithm can effectively detect small areas of cloud pixels and exclude interference from the underlying surface, which provides a good foundation for subsequent fire detection.
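
    The two-stage idea described above (a coarse two-class K-means split followed by a threshold test on the bright class) can be illustrated with a toy example. Everything specific here is an assumption made for illustration: the `reflectance` and `snow_index` fields, the pixel values, and the `SNOW_INDEX_MAX` threshold are hypothetical and are not the MODIS bands or thresholds used in the paper.

```python
def kmeans_1d_two_classes(values, iterations=20):
    """Split scalar values into a low class (0) and a high class (1) with K-means."""
    lo, hi = min(values), max(values)  # initial centres at the extremes
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest centre wins (ties go to the low class).
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        group0 = [v for v, l in zip(values, labels) if l == 0]
        group1 = [v for v, l in zip(values, labels) if l == 1]
        # Update step: each centre moves to the mean of its members.
        new_lo = sum(group0) / len(group0) if group0 else lo
        new_hi = sum(group1) / len(group1) if group1 else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels


# Hypothetical pixels; field names and values are illustrative only.
pixels = [
    {"reflectance": 0.80, "snow_index": 0.10},  # cloud-like
    {"reflectance": 0.75, "snow_index": 0.70},  # snow-like
    {"reflectance": 0.10, "snow_index": 0.05},  # vegetation-like
    {"reflectance": 0.05, "snow_index": 0.00},  # water-like
    {"reflectance": 0.85, "snow_index": 0.15},  # cloud-like
]

# Stage 1: K-means separates bright pixels (cloud, smoke, snow) from dark ones.
bright = kmeans_1d_two_classes([p["reflectance"] for p in pixels])

# Stage 2: a threshold on a second quantity removes snow from the bright class.
SNOW_INDEX_MAX = 0.4  # hypothetical threshold
cloud_mask = [b == 1 and p["snow_index"] < SNOW_INDEX_MAX
              for b, p in zip(bright, pixels)]
print(cloud_mask)
```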

  13. Broad phonetic class definition driven by phone confusions

    NASA Astrophysics Data System (ADS)

    Lopes, Carla; Perdigão, Fernando

    2012-12-01

    Intermediate representations between the speech signal and phones may be used to improve discrimination among phones that are often confused. These representations are usually found according to broad phonetic classes, which are defined by a phonetician. This article proposes an alternative data-driven method to generate these classes. Phone confusion information from the analysis of the output of a phone recognition system is used to find clusters at high risk of mutual confusion. A metric is defined to compute the distance between phones. The results, using TIMIT data, show that the proposed confusion-driven phone clustering method is an attractive alternative to the approaches based on human knowledge. A hierarchical classification structure to improve phone recognition is also proposed using a discriminative weight training method. Experiments show improvements in phone recognition on the TIMIT database compared to a baseline system.

  14. A tripartite clustering analysis on microRNA, gene and disease model.

    PubMed

    Shen, Chengcheng; Liu, Ying

    2012-02-01

    Alteration of gene expression in response to regulatory molecules or mutations could lead to different diseases. MicroRNAs (miRNAs) have been discovered to be involved in regulation of gene expression and a wide variety of diseases. In a tripartite biological network of human miRNAs, their predicted target genes and the diseases caused by altered expressions of these genes, valuable knowledge about the pathogenicity of miRNAs, involved genes and related disease classes can be revealed by co-clustering miRNAs, target genes and diseases simultaneously. Tripartite co-clustering can lead to more informative results than traditional co-clustering with only two kinds of members and pass the hidden relational information along the relation chain by considering multi-type members. Here we report a spectral co-clustering algorithm for k-partite graph to find clusters with heterogeneous members. We use the method to explore the potential relationships among miRNAs, genes and diseases. The clusters obtained from the algorithm have significantly higher density than randomly selected clusters, which means members in the same cluster are more likely to have common connections. Results also show that miRNAs in the same family based on the hairpin sequences tend to belong to the same cluster. We also validate the clustering results by checking the correlation of enriched gene functions and disease classes in the same cluster. Finally, widely studied miR-17-92 and its paralogs are analyzed as a case study to reveal that genes and diseases co-clustered with the miRNAs are in accordance with current research findings.

  15. Cognitive competence of graduates, oriented to work in the knowledge management system in the state corporation “Rosatom”

    NASA Astrophysics Data System (ADS)

    Kireev, V.; Silenko, A.; Guseva, A.

    2017-01-01

    This article describes an approach to determining the level of competence formation among university graduates oriented to work in the knowledge management system of the state corporation “Rosatom”. Cluster analysis identified classes of graduates focused on knowledge transfer, on analysis and the search for new knowledge, and on the creative transformation of knowledge. In addition, a class of innovators was identified, in whom the necessary cognitive competences were fully formed.

  16. Graduation of fertility schedules: an analysis of fertility patterns in London in the 1980s and an application to fertility forecasts.

    PubMed

    Congdon, P

    1990-08-01

    London's average total fertility rate (TFR) stood at 1.75. Using a cluster analysis to compare the 1985-1987 fertility patterns of different boroughs of London, demographers learned that 5 natural groupings occurred. 4 boroughs in a central London cluster have the distinction of having a low TFR (1.38) and late fertility (average age of 29.58 years). The researchers attributed these occurrences to the high levels of employment and career attachment and low rates of marriage among women in this cluster. 2 inner city boroughs constituted the smallest cluster and had the largest TFR (2.37), mainly due to high numbers of births to ethnic minorities. The largest cluster consisted of 12 boroughs located mainly along the periphery with 2 centrally located boroughs (TFR, 1.79). Some of the upper class outer boroughs characterized another cluster with a TFR of 1.61. Another cluster made up of inner and outer boroughs in east and southeast London had an ample proportion of manual workers (TFR, 2.04). Social class most likely accounted for the contrast in TFRs between the 2 aforementioned clusters. Demographers observed that cyclical fluctuation of fertility occurred as opposed to secular trends. Due to these fluctuations, demographers fitted autoregressive moving average forecast models to time series of the fertility variables in London since 1952. They also applied structural time series models which included regression variables and the influence of cyclical and/or trend behavior. The results showed that large cohorts and the increase in female economic activity caused a delay in the modal age of births and a reduction in the number of births.

  17. Spiral Arm Morphology in Cluster Environment

    NASA Astrophysics Data System (ADS)

    Choi, Isaac Yeoun-Gyu; Ann, Hong Bae

    2011-10-01

    We examine the dependence of the morphology of spiral galaxies on the environment using the KIAS Value Added Galaxy Catalog (VAGC) which is derived from the Sloan Digital Sky Survey (SDSS) DR7. Our goal is to understand whether the local environment or global conditions dominate in determining the morphology of spiral galaxies. For the analysis, we conduct a morphological classification of galaxies in 20 X-ray selected Abell clusters up to z~0.06, using SDSS color images and the X-ray data from the Northern ROSAT All-Sky (NORAS) catalog. We analyze the distribution of arm classes along the clustercentric radius as well as that of Hubble types. To segregate the effect of local environment from the global environment, we compare the morphological distribution of galaxies in two X-ray luminosity groups, the low-Lx clusters (Lx < 0.15×10^44 erg/s) and high-Lx clusters (Lx > 1.8×10^44 erg/s). We find that the morphology-clustercentric relation prevails in the cluster environment although there is a break near the cluster virial radius. The grand design arms comprise about 40% of the cluster spiral galaxies with a weak morphology-clustercentric radius relation for the arm classes, in the sense that flocculent galaxies tend to increase outward, regardless of the X-ray luminosity. From the cumulative radial distribution of cluster galaxies, we found that the low-Lx clusters are fully virialized while the high-Lx clusters are not.

  18. Impact of the Choice of Normalization Method on Molecular Cancer Class Discovery Using Nonnegative Matrix Factorization.

    PubMed

    Yang, Haixuan; Seoighe, Cathal

    2016-01-01

    Nonnegative Matrix Factorization (NMF) has proved to be an effective method for unsupervised clustering analysis of gene expression data. By the nonnegativity constraint, NMF provides a decomposition of the data matrix into two matrices that have been used for clustering analysis. However, the decomposition is not unique. This allows different clustering results to be obtained, resulting in different interpretations of the decomposition. To alleviate this problem, some existing methods directly enforce uniqueness to some extent by adding regularization terms in the NMF objective function. Alternatively, various normalization methods have been applied to the factor matrices; however, the effects of the choice of normalization have not been carefully investigated. Here we investigate the performance of NMF for the task of cancer class discovery, under a wide range of normalization choices. After extensive evaluations, we observe that the maximum norm showed the best performance, although the maximum norm has not previously been used for NMF. Matlab codes are freely available from: http://maths.nuigalway.ie/~haixuanyang/pNMF/pNMF.htm.
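
    The non-uniqueness mentioned in this abstract can be shown with a tiny worked example: rescaling the factors leaves the product W·H unchanged but can change which factor each sample is assigned to. The sketch below is an illustration only. It assumes that samples are clustered by the largest coefficient in each column of H and that the maximum norm is applied to the columns of W; the abstract does not say which matrix the authors normalize, and the matrices are toy values.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def max_norm_rescale(w, h):
    """Scale each column of W to maximum 1 and compensate in the rows of H."""
    k = len(h)
    scales = [max(row[c] for row in w) for c in range(k)]
    scales = [s if s > 0 else 1.0 for s in scales]  # leave all-zero factors alone
    w_new = [[row[c] / scales[c] for c in range(k)] for row in w]
    h_new = [[value * scales[c] for value in h[c]] for c in range(k)]
    return w_new, h_new


def assign_clusters(h):
    """Assign each sample (column of H) to the factor with the largest coefficient."""
    return [max(range(len(h)), key=lambda c: h[c][j]) for j in range(len(h[0]))]


# Toy factorization (hypothetical values): 2 features, 2 factors, 2 samples.
W = [[4.0, 0.5], [2.0, 1.0]]
H = [[1.0, 0.2], [2.0, 3.0]]
W2, H2 = max_norm_rescale(W, H)

print(assign_clusters(H))   # [1, 1]
print(assign_clusters(H2))  # [0, 1]
```

    Both factor pairs reproduce the same data matrix, yet the first sample changes cluster, which is why the choice of normalization matters for class discovery.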

  19. Comparative genomic analysis by microbial COGs self-attraction rate.

    PubMed

    Santoni, Daniele; Romano-Spica, Vincenzo

    2009-06-21

    Whole genome analysis provides new perspectives to determine phylogenetic relationships among microorganisms. The availability of whole nucleotide sequences allows different levels of comparison among genomes by several approaches. In this work, self-attraction rates were considered for each cluster of orthologous groups of proteins (COGs) class in order to analyse gene aggregation levels in physical maps. Phylogenetic relationships among microorganisms were obtained by comparing self-attraction coefficients. Eighteen-dimensional vectors were computed for a set of 168 completely sequenced microbial genomes (19 archaea, 149 bacteria). The components of the vector represent the aggregation rate of the genes belonging to each of 18 COGs classes. Genes involved in nonessential functions or related to environmental conditions showed the highest aggregation rates. In contrast, genes involved in basic cellular tasks showed a more uniform distribution along the genome, except for translation genes. The self-attraction clustering approach allowed classification of Proteobacteria, Bacilli and other species belonging to Firmicutes. Rearrangement and Lateral Gene Transfer events may influence divergences from classical taxonomy. Each set of COG classes' aggregation values represents an intrinsic property of the microbial genome. This novel approach provides a new point of view for whole genome analysis and bacterial characterization.

  20. Methods of Conceptual Clustering and their Relation to Numerical Taxonomy.

    DTIC Science & Technology

    1985-07-22

    the conceptual clustering problem is to first solve the aggregation problem, and then the characterization problem. In machine learning, the...clusterings by first generating some number of possible clusterings. For each clustering generated, one calls a learning from examples subroutine, which...class 1 from class 2, and vice versa, only the first combination implies a partition over the set of theoretically possible objects. The first

  1. A fuzzy adaptive network approach to parameter estimation in cases where independent variables come from an exponential distribution

    NASA Astrophysics Data System (ADS)

    Dalkilic, Turkan Erbay; Apaydin, Aysen

    2009-11-01

    In a regression analysis, it is assumed that the observations come from a single class in a data cluster and the simple functional relationship between the dependent and independent variables can be expressed using the general model Y = f(X) + ε. However, a data cluster may consist of a combination of observations that have different distributions that are derived from different clusters. When faced with issues of estimating a regression model for fuzzy inputs that have been derived from different distributions, this regression model has been termed the 'switching regression model' and it is expressed with . Here li indicates the class number of each independent variable and p is indicative of the number of independent variables [J.R. Jang, ANFIS: Adaptive-network-based fuzzy inference system, IEEE Transaction on Systems, Man and Cybernetics 23 (3) (1993) 665-685; M. Michel, Fuzzy clustering and switching regression models using ambiguity and distance rejects, Fuzzy Sets and Systems 122 (2001) 363-399; E.Q. Richard, A new approach to estimating switching regressions, Journal of the American Statistical Association 67 (338) (1972) 306-310]. In this study, adaptive networks have been used to construct a model that has been formed by gathering obtained models. There are methods that suggest the class numbers of independent variables heuristically. Alternatively, in defining the optimal class number of independent variables, the use of suggested validity criterion for fuzzy clustering has been aimed. In the case that independent variables have an exponential distribution, an algorithm has been suggested for defining the unknown parameter of the switching regression model and for obtaining the estimated values after obtaining an optimal membership function, which is suitable for exponential distribution.

  2. A SPITZER VIEW OF STAR FORMATION IN THE CYGNUS X NORTH COMPLEX

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beerer, I. M.; Koenig, X. P.; Hora, J. L.

    2010-09-01

    We present new images and photometry of the massive star-forming complex Cygnus X obtained with the Infrared Array Camera (IRAC) and the Multiband Imaging Photometer for Spitzer (MIPS) on board the Spitzer Space Telescope. A combination of IRAC, MIPS, UKIRT Deep Infrared Sky Survey, and Two Micron All Sky Survey data are used to identify and classify young stellar objects (YSOs). Of the 8231 sources detected exhibiting infrared excess in Cygnus X North, 670 are classified as class I and 7249 are classified as class II. Using spectra from the FAST Spectrograph at the Fred L. Whipple Observatory and Hectospec on the MMT, we spectrally typed 536 sources in the Cygnus X complex to identify the massive stars. We find that YSOs tend to be grouped in the neighborhoods of massive B stars (spectral types B0 to B9). We present a minimal spanning tree analysis of clusters in two regions in Cygnus X North. The fraction of infrared excess sources that belong to clusters with ≥10 members is found to be 50%-70%. Most class II objects lie in dense clusters within blown out H II regions, while class I sources tend to reside in more filamentary structures along the bright-rimmed clouds, indicating possible triggered star formation.
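
    A minimal spanning tree cluster analysis of the kind mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the tree is built with Prim's algorithm on 2-D positions, branches longer than a chosen `cut_length` are removed, and only groups with at least `min_members` sources are kept. The function names, the cut length, and the toy coordinates are hypothetical.

```python
import math


def minimum_spanning_tree(points):
    """Return MST edges as (length, i, j) using Prim's algorithm, O(n^2)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_from = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest point that is not yet part of the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_dist[u], best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def mst_clusters(points, cut_length, min_members):
    """Cut MST branches longer than cut_length and keep sufficiently large groups."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for length, i, j in minimum_spanning_tree(points):
        if length <= cut_length:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) >= min_members]


# Toy positions: two tight groups and one isolated source (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
clusters = mst_clusters(points, cut_length=2.0, min_members=3)
clustered_fraction = sum(len(c) for c in clusters) / len(points)
print(clusters, round(clustered_fraction, 2))
```

    With a threshold of at least 10 members per group, the same `clustered_fraction` calculation corresponds to the fraction of sources in clusters that the abstract reports; how the cut length is chosen is left open here.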

  3. SU-F-T-312: Identifying Distinct Radiation Therapy Plan Classes Through Multi-Dimensional Analysis of Plan Complexity Metrics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Desai, V; Labby, Z; Culberson, W

    Purpose: To determine whether body site-specific treatment plans form unique “plan class” clusters in a multi-dimensional analysis of plan complexity metrics such that a single beam quality correction determined for a representative plan could be universally applied within the “plan class”, thereby increasing the dosimetric accuracy of a detector’s response within a subset of similarly modulated nonstandard deliveries. Methods: We collected 95 clinical volumetric modulated arc therapy (VMAT) plans from four body sites (brain, lung, prostate, and spine). The lung data was further subdivided into SBRT and non-SBRT data for a total of five plan classes. For each control point in each plan, a variety of aperture-based complexity metrics were calculated and stored as unique characteristics of each patient plan. A multiple comparison of means analysis was performed such that every plan class was compared to every other plan class for every complexity metric in order to determine which groups could be considered different from one another. Statistical significance was assessed after correcting for multiple hypothesis testing. Results: Six out of a possible 10 pairwise plan class comparisons were uniquely distinguished based on at least nine out of 14 of the proposed metrics (Brain/Lung, Brain/SBRT lung, Lung/Prostate, Lung/SBRT Lung, Lung/Spine, Prostate/SBRT Lung). Eight out of 14 of the complexity metrics could distinguish at least six out of the possible 10 pairwise plan class comparisons. Conclusion: Aperture-based complexity metrics could prove to be useful tools to quantitatively describe a distinct class of treatment plans. Certain plan-averaged complexity metrics could be considered unique characteristics of a particular plan. A new approach to generating plan-class specific reference (pcsr) fields could be established through a targeted preservation of select complexity metrics or a clustering algorithm that identifies plans exhibiting similar modulation characteristics. Measurements and simulations will better elucidate potential plan-class specific dosimetry correction factors.

  4. Determining the Optimal Number of Clusters with the Clustergram

    NASA Technical Reports Server (NTRS)

    Fluegemann, Joseph K.; Davies, Misty D.; Aguirre, Nathan D.

    2011-01-01

    Cluster analysis aids research in many different fields, from business to biology to aerospace. It consists of using statistical techniques to group objects in large sets of data into meaningful classes. However, this process of ordering data points presents much uncertainty because it involves several steps, many of which are subject to researcher judgment as well as inconsistencies depending on the specific data type and research goals. These steps include the method used to cluster the data, the variables on which the cluster analysis will be operating, the number of resulting clusters, and parts of the interpretation process. In most cases, the number of clusters must be guessed or estimated before employing the clustering method. Many remedies have been proposed, but none is unassailable and certainly not for all data types. Thus, the aim of current research for better techniques of determining the number of clusters is generally confined to demonstrating that the new technique excels other methods in performance for several disparate data types. Our research makes use of a new cluster-number-determination technique based on the clustergram: a graph that shows how the number of objects in the cluster and the cluster mean (the ordinate) change with the number of clusters (the abscissa). We use the features of the clustergram to make the best determination of the cluster-number.

  5. Clustering of longitudinal data by using an extended baseline: A new method for treatment efficacy clustering in longitudinal data.

    PubMed

    Schramm, Catherine; Vial, Céline; Bachoud-Lévi, Anne-Catherine; Katsahian, Sandrine

    2018-01-01

    Heterogeneity in treatment efficacy is a major concern in clinical trials. Clustering may help to identify the treatment responders and the non-responders. In the context of longitudinal cluster analyses, sample size and variability of the times of measurements are the main issues with the current methods. Here, we propose a new two-step method for the Clustering of Longitudinal data by using an Extended Baseline. The first step relies on a piecewise linear mixed model for repeated measurements with a treatment-time interaction. The second step clusters the random predictions and considers several parametric (model-based) and non-parametric (partitioning, ascendant hierarchical clustering) algorithms. A simulation study compares all options of the clustering of longitudinal data by using an extended baseline method with the latent-class mixed model. The clustering of longitudinal data by using an extended baseline method with the two model-based algorithms was the more robust model. The clustering of longitudinal data by using an extended baseline method with all the non-parametric algorithms failed when there were unequal variances of treatment effect between clusters or when the subgroups had unbalanced sample sizes. The latent-class mixed model failed when the between-patients slope variability is high. Two real data sets on neurodegenerative disease and on obesity illustrate the clustering of longitudinal data by using an extended baseline method and show how clustering may help to identify the marker(s) of the treatment response. The application of the clustering of longitudinal data by using an extended baseline method in exploratory analysis as the first stage before setting up stratified designs can provide a better estimation of treatment effect in future clinical trials.

  6. Studying the Therapeutic Process by Observing Clinicians' In-Session Behaviour.

    PubMed

    Montaño-Fidalgo, Montserrat; Ruiz, Elena M; Calero-Elvira, Ana; Froján-Parga, María Xesús

    2015-01-01

    This paper presents a further step in the use and validation of a systematic, functional-analytic method of describing psychologists' verbal behaviour during therapy. We observed recordings from 92 clinical sessions of 19 adults (14 women and 5 men of Caucasian origin, with ages ranging from 19 to 51 years) treated by nine cognitive-behavioural therapists (eight women and one man, Caucasian as well, with ages ranging from 25 to 48 years). The therapists' verbal behaviour was codified and then classified according to its possible functionality. A cluster analysis of the data, followed by a discriminant analysis, showed that the therapists' verbal behaviour tended to aggregate around four types of session differentiated by their clinical objective (assessment, explanation, treatment and consolidation). These results confirm the validity of our method and enable us to further describe clinical phenomena by distinguishing psychologists' classes of clinically relevant activities. Specific learning mechanisms may be responsible for clinical change within each class. These issues should be analysed more closely when explaining therapeutic phenomena and when developing more effective forms of clinical intervention. We described therapists' verbal behaviour in a focused fashion so as to develop new research methods that evaluate psychological work moment by moment. We performed a cluster analysis in order to evaluate how the therapists' verbal behaviour was distributed throughout the intervention. A discriminant analysis gave us further information about the statistical significance and possible nature of the clusters we observed. The therapists' verbal behaviour depended on current clinical objectives and could be classified into four classes of clinically relevant activities: evaluation, explanation, treatment and consolidation. Some of the therapist's verbalizations were more important than others when carrying out these clinically relevant activities. 
The distribution of the therapists' verbal behaviour across classes may provide us with clues regarding the functionality of their in-session verbal behaviour. Copyright © 2014 John Wiley & Sons, Ltd.

  7. Assessment of repeatability of composition of perfumed waters by high-performance liquid chromatography combined with numerical data analysis based on cluster analysis (HPLC UV/VIS - CA).

    PubMed

    Ruzik, L; Obarski, N; Papierz, A; Mojski, M

    2015-06-01

    High-performance liquid chromatography (HPLC) with UV/VIS spectrophotometric detection combined with the chemometric method of cluster analysis (CA) was used for the assessment of repeatability of composition of nine types of perfumed waters. In addition, the chromatographic method of separating components of the perfume waters under analysis was subjected to an optimization procedure. The chromatograms thus obtained were used as sources of data for the chemometric method of cluster analysis (CA). The result was a classification of a set comprising 39 perfumed water samples with a similar composition at a specified level of probability (level of agglomeration). A comparison of the classification with the manufacturer's declarations reveals a good degree of consistency and demonstrates similarity between samples in different classes. A combination of the chromatographic method with cluster analysis (HPLC UV/VIS - CA) makes it possible to quickly assess the repeatability of composition of perfumed waters at selected levels of probability. © 2014 Society of Cosmetic Scientists and the Société Française de Cosmétologie.

  8. Predicting the decision to pursue mediation in civil disputes: a hierarchical classes analysis.

    PubMed

    Reich, Warren A; Kressel, Kenneth; Scanlon, Kathleen M; Weiner, Gary A

    2007-11-01

    Clients (N = 185) involved in civil court cases completed the CPR Institute's Mediation Screen, which is designed to assist in making a decision about pursuing mediation. The authors modeled data using hierarchical classes analysis (HICLAS), a clustering algorithm that places clients into 1 set of classes and CPRMS items into another set of classes. HICLAS then links the sets of classes so that any class of clients can be identified in terms of the classes of items they endorsed. HICLAS-derived item classes reflected 2 underlying themes: (a) suitability of the dispute for a problem-solving process and (b) potential benefits of mediation. All clients who perceived that mediation would be beneficial also believed that the context of their conflict was favorable to mediation; however, not all clients who saw a favorable context believed they would benefit from mediation. The majority of clients who agreed to pursue mediation endorsed items reflecting both contextual suitability and perceived benefits of mediation.

  9. Clustering Methods with Qualitative Data: A Mixed Methods Approach for Prevention Research with Small Samples

    PubMed Central

    Henry, David; Dymnicki, Allison B.; Mohatt, Nathaniel; Allen, James; Kelly, James G.

    2016-01-01

    Qualitative methods potentially add depth to prevention research, but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data, but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-Means clustering, and latent class analysis produced similar levels of accuracy with binary data, and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities. PMID:25946969

  10. Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples.

    PubMed

    Henry, David; Dymnicki, Allison B; Mohatt, Nathaniel; Allen, James; Kelly, James G

    2015-10-01

    Qualitative methods potentially add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed-methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a "real-world" example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.

  11. Tri-city study of Ecstasy use problems: a latent class analysis.

    PubMed

    Scheier, Lawrence M; Ben Abdallah, Arbi; Inciardi, James A; Copeland, Jan; Cottler, Linda B

    2008-12-01

    This study used latent class analysis to examine distinctive subtypes of Ecstasy users based on 24 abuse and dependence symptoms underlying standard DSM-IV criteria. Data came from a three site, population-based, epidemiological study to examine diagnostic nosology for Ecstasy use. Subject inclusion criteria included lifetime Ecstasy use exceeding five times and once in the past year, with participants ranging in age between 16 and 47 years of age from St. Louis, Miami, U.S. and Sydney, Australia. A satisfactory model typified four latent classes representing clearly differentiated diagnostic clusters including: (1) a group of sub-threshold users endorsing few abuse and dependence symptoms (negatives), (2) a group of 'diagnostic orphans' who had characteristic features of dependence for a select group of symptoms (mild dependent), (3) a 'transitional group' mimicking the orphans with regard to their profile of dependence also but reporting some abuse symptoms (moderate dependent), and (4) a 'severe dependent' group with a distinct profile of abuse and dependence symptoms. A multinomial logistic regression model indicated that certain latent classes showed unique associations with external non-diagnostic markers. Controlling for demographic characteristics and lifetime quantity of Ecstasy pill use, criminal behavior and motivational cues for Ecstasy use were the most efficient predictors of cluster membership. This study reinforces the heuristic utility of DSM-IV criteria applied to Ecstasy but with a different collage of symptoms that produced four distinct classes of Ecstasy users.

  12. Consistency of Cluster Analysis for Cognitive Diagnosis: The Reduced Reparameterized Unified Model and the General Diagnostic Model.

    PubMed

    Chiu, Chia-Yi; Köhn, Hans-Friedrich

    2016-09-01

    The asymptotic classification theory of cognitive diagnosis (ACTCD) provided the theoretical foundation for using clustering methods that do not rely on a parametric statistical model for assigning examinees to proficiency classes. Like general diagnostic classification models, clustering methods can be useful in situations where the true diagnostic classification model (DCM) underlying the data is unknown and possibly misspecified, or the items of a test conform to a mix of multiple DCMs. Clustering methods can also be an option when fitting advanced and complex DCMs encounters computational difficulties. These can range from the use of excessive CPU times to plain computational infeasibility. However, the propositions of the ACTCD have only been proven for the Deterministic Input Noisy Output "AND" gate (DINA) model and the Deterministic Input Noisy Output "OR" gate (DINO) model. For other DCMs, there does not exist a theoretical justification to use clustering for assigning examinees to proficiency classes. But if clustering is to be used legitimately, then the ACTCD must cover a larger number of DCMs than just the DINA model and the DINO model. Thus, the purpose of this article is to prove the theoretical propositions of the ACTCD for two other important DCMs, the Reduced Reparameterized Unified Model and the General Diagnostic Model.

  13. Intra-class correlation estimates for assessment of vitamin A intake in children.

    PubMed

    Agarwal, Girdhar G; Awasthi, Shally; Walter, Stephen D

    2005-03-01

    In many community-based surveys, multi-level sampling is inherent in the design. In the design of these studies, especially to calculate the appropriate sample size, investigators need good estimates of intra-class correlation coefficient (ICC), along with the cluster size, to adjust for variation inflation due to clustering at each level. The present study used data on the assessment of clinical vitamin A deficiency and intake of vitamin A-rich food in children in a district in India. For the survey, 16 households were sampled from 200 villages nested within eight randomly-selected blocks of the district. ICCs and components of variances were estimated from a three-level hierarchical random effects analysis of variance model. Estimates of ICCs and variance components were obtained at village and block levels. Between-cluster variation was evident at each level of clustering. In these estimates, ICCs were inversely related to cluster size, but the design effect could be substantial for large clusters. At the block level, most ICC estimates were below 0.07. At the village level, many ICC estimates ranged from 0.014 to 0.45. These estimates may provide useful information for the design of epidemiological studies in which the sampled (or allocated) units range in size from households to large administrative zones.
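
    The adjustment for clustering that this abstract refers to is commonly computed with the standard design-effect formula for equal-sized clusters, DEFF = 1 + (m - 1) × ICC, where m is the cluster size. The sketch below applies that formula; reading the design as 16 households per village and pairing that cluster size with the reported ICC values is an assumption made for illustration, and the function names are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if icc > 1.0:
        raise ValueError("icc cannot exceed 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations that carry the same information."""
    return n_total / design_effect(cluster_size, icc)


# Assumed illustration: 200 villages x 16 households, with ICC values
# taken from the range quoted in the abstract.
n_total = 200 * 16
for icc in (0.014, 0.07, 0.45):
    deff = design_effect(16, icc)
    n_eff = effective_sample_size(n_total, 16, icc)
    print(f"ICC={icc:.3f}  design effect={deff:.2f}  effective n={n_eff:.0f}")
```

    Even a small ICC inflates the required sample size noticeably once clusters are large, which matches the abstract's remark that the design effect can be substantial for large clusters.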

  14. Longitudinal analysis of latent classes of psychopathology and patterns of class migration in survivors of severe injury.

    PubMed

    Forbes, David; Nickerson, Angela; Alkemade, Nathan; Bryant, Richard A; Creamer, Mark; Silove, Derrick; McFarlane, Alexander C; Van Hooff, Miranda; Fletcher, Susan L; O'Donnell, Meaghan

    2015-09-01

    Little research to date has explored the typologies of psychopathology following trauma, beyond development of particular diagnoses such as posttraumatic stress disorder (PTSD). The objective of this study was to determine the longitudinal patterns of these typologies, especially the movement of persons across clusters of psychopathology. In this 6-year longitudinal study, 1,167 hospitalized severe injury patients who were recruited between April 2004-February 2006 were analyzed, with repeated measures at baseline, 3 months, 12 months, and 72 months after injury. All patients met the DSM-IV criterion A1 for PTSD. Structured clinical interviews were used to assess psychiatric disorders at each follow-up point. Latent class analysis and latent transition analysis were applied to assess clusters of individuals determined by psychopathology. The Mini International Neuropsychiatric Interview (MINI) and Clinician-Administered PTSD Scale (CAPS) were employed to complete diagnoses. Four latent classes were identified at each time point: (1) Alcohol/Depression class (3 months, 2.1%; 12 months, 1.3%; and 72 months, 1.1%), (2) Alcohol class (3 months, 3.3%; 12 months, 3.7%; and 72 months, 5.4%), (3) PTSD/Depression class (3 months, 10.3%; 12 months, 11.5%; and 72 months, 6.4%), and (4) No Disorder class (3 months, 84.2%; 12 months, 83.5%; and 72 months, 87.1%). Latent transition analyses conducted across the 2 transition points (12 months and 72 months) found consistently high levels of stability in the No Disorder class (90.9%, 93.0%, respectively) but lower and reducing levels of consistency in the PTSD/Depression class (81.3%, 46.6%), the Alcohol/Depression class (59.7%, 21.5%), and the Alcohol class (61.0%, 36.5%), demonstrating high levels of between-class migration. Despite the array of psychiatric disorders that may develop following severe injury, a 4-class model best described the data with excellent classification certainty. 
The high levels of migration across classes indicate a complex pattern of psychopathology expression over time. The findings have considerable implications for tailoring multifocused interventions to class type, as well as flexible stepped care models, and for the potential development and delivery of transdiagnostic interventions targeting underlying mechanisms. © Copyright 2015 Physicians Postgraduate Press, Inc.

  15. Attachment typologies and posttraumatic stress disorder (PTSD), depression and anxiety: a latent profile analysis approach

    PubMed Central

    Armour, Cherie; Elklit, Ask; Shevlin, Mark

    2011-01-01

    Background: Bartholomew (1990) proposed a four-category adult attachment model based on Bowlby's (1973) proposal that attachment is underpinned by an individual's view of the self and others. Previous cluster analytic techniques have identified four and two attachment styles based on the Revised Adult Attachment Scale (RAAS). In addition, attachment styles have been proposed to mediate the association between stressful life events and subsequent psychiatric status. Objective: The current study aimed to empirically test the attachment typology proposed by Collins and Read (1990). Specifically, latent profile analysis (LPA) was used to determine whether the proposed four styles can be derived from scores on the dimensions of closeness/dependency and anxiety. In addition, we aimed to test whether the resultant attachment styles predicted the severity of psychopathology in response to a whiplash trauma. Method: A large sample of Danish trauma victims (N=1577) participated. A latent profile analysis was conducted, using Mplus 5.1, on scores from the RAAS to ascertain whether there were underlying homogeneous attachment classes/subgroups. Class membership was used in a series of one-way ANOVA tests to determine whether classes differed significantly in mean scores on measures of psychopathology. Results: The three-class solution was considered optimal. Class one was termed Fearful (18.6%), Class two Preoccupied (34.5%), and Class three Secure (46.9%). The secure class evidenced significantly lower mean scores on PTSD, depression, and anxiety measures compared to other classes, whereas the fearful class evidenced significantly higher mean scores compared to other classes. Conclusions: The results demonstrated evidence of three discrete classes of attachment styles, which were labelled secure, preoccupied, and fearful. This is in contrast to previous cluster analytic techniques, which have identified four and two attachment styles based on the RAAS. In addition, securely attached individuals display lower levels of psychopathology post whiplash trauma. PMID:22893805

  16. A new scoring system in Cystic Fibrosis: statistical tools for database analysis - a preliminary report.

    PubMed

    Hafen, G M; Hurst, C; Yearwood, J; Smith, J; Dzalilov, Z; Robinson, P J

    2008-10-05

    Cystic fibrosis is the most common fatal genetic disorder in the Caucasian population. Scoring systems for assessment of Cystic fibrosis disease severity have been used for almost 50 years, without being adapted to the milder phenotype of the disease in the 21st century. The aim of this current project is to develop a new scoring system using a database and employing various statistical tools. This study protocol reports the development of the statistical tools in order to create such a scoring system. The evaluation is based on the Cystic Fibrosis database from the cohort at the Royal Children's Hospital in Melbourne. Initially, unsupervised clustering of all the data records was performed using a range of clustering algorithms. In particular, incremental clustering algorithms were used. The clusters obtained were characterised using rules from decision trees and the results examined by clinicians. In order to obtain a clearer definition of classes, expert opinion of each individual's clinical severity was sought. After data preparation, including expert opinion of an individual's clinical severity on a 3-point scale (mild, moderate and severe disease), two multivariate techniques were used throughout the analysis to establish a method that would have better success in feature selection and model derivation: 'Canonical Analysis of Principal Coordinates' (CAP) and 'Linear Discriminant Analysis' (DA). A 3-step procedure was performed with (1) selection of features, (2) extraction of 5 severity classes from the 3 severity classes defined by expert opinion and (3) establishment of calibration datasets.
(1) Feature selection: CAP has a more effective "modelling" focus than DA. (2) Extraction of 5 severity classes: after variables were identified as important in discriminating contiguous CF severity groups on the 3-point scale as mild/moderate and moderate/severe, Discriminant Function (DF) was used to determine the new groups mild, intermediate moderate, moderate, intermediate severe and severe disease. (3) Generated confusion tables showed a misclassification rate of 19.1% for males and 16.5% for females, with a majority of misallocations into adjacent severity classes, particularly for males. Our preliminary data show that using CAP for feature selection and Linear DA to derive the actual model in a CF database might be helpful in developing a scoring system. However, there are several limitations: in particular, more data entry points are needed to finalize a score, and the statistical tools need to be further refined and validated by re-running the statistical methods on the larger dataset.

  17. Clustering of modifiable biobehavioral risk factors for chronic disease in US adults: a latent class analysis.

    PubMed

    Leventhal, Adam M; Huh, Jimi; Dunton, Genevieve F

    2014-11-01

    Examining the co-occurrence patterns of modifiable biobehavioral risk factors for deadly chronic diseases (e.g. cancer, cardiovascular disease, diabetes) can elucidate the etiology of risk factors and guide disease-prevention programming. The aims of this study were to (1) identify latent classes based on the clustering of five key biobehavioral risk factors among US adults who reported at least one risk factor and (2) explore the demographic correlates of the identified latent classes. Participants were respondents of the National Epidemiologic Survey of Alcohol and Related Conditions (2004-2005) with at least one of the following disease risk factors in the past year (N = 22,789), which were also the latent class indicators: (1) alcohol abuse/dependence, (2) drug abuse/dependence, (3) nicotine dependence, (4) obesity, and (5) physical inactivity. Housing sample units were selected to match the US National Census in location and demographic characteristics, with young adults oversampled. Participants were administered surveys by trained interviewers. Five latent classes were yielded: 'obese, active non-substance abusers' (23%); 'nicotine-dependent, active, and non-obese' (19%); 'active, non-obese alcohol abusers' (6%); 'inactive, non-substance abusers' (50%); and 'active, polysubstance abusers' (3.7%). Four classes were characterized by a 100% likelihood of having one risk factor coupled with a low or moderate likelihood of having the other four risk factors. The five classes exhibited unique demographic profiles. Risk factors may cluster together in a non-monotonic fashion, with the majority of the at-risk population of US adults expected to have a high likelihood of endorsing only one of these five risk factors. © Royal Society for Public Health 2013.

  18. Accessing and constructing driving data to develop fuel consumption forecast model

    NASA Astrophysics Data System (ADS)

    Yamashita, Rei-Jo; Yao, Hsiu-Hsen; Hung, Shih-Wei; Hackman, Acquah

    2018-02-01

    In this study, we develop a forecasting model to estimate fuel consumption from driving behavior when the vehicles and routes are known. First, driving data are collected via telematics and OBD-II. Then, a driving fuel consumption formula is used to calculate the estimated fuel consumption, and driving behavior indicators (DBIs) are generated for analysis. The driving fuel consumption forecasting model is then constructed using statistical analysis. Field experiments were conducted in this study to generate hundreds of DBIs. Following a data mining approach, Pearson correlation analysis is used to select the DBIs that are highly correlated with fuel consumption; only these highly correlated DBIs are used in the model. These DBIs are divided into four classes: a speed class, an acceleration class, a Left/Right/U-turn class, and an "other" category. We then use K-means cluster analysis to form driver classes and route classes. Finally, more than 12 aggregate models (AMs) are generated from the highly correlated DBIs using neural network models and regression analysis. The AMs are evaluated by Mean Absolute Percentage Error (MAPE); the best MAPE among them is below 5%.
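
As a rough illustration of two steps named in the abstract above (filtering indicators by Pearson correlation with fuel consumption, then scoring a forecast by MAPE), the following minimal sketch may help. The indicator names, data values, forecast, and the 0.8 correlation threshold are all hypothetical; the abstract does not state the threshold or the models the authors used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical measured fuel consumption for four trips.
fuel = [5.0, 6.0, 7.0, 8.0]

# Hypothetical driving behavior indicators (DBIs) for the same trips.
indicators = {
    "mean_speed": [40.0, 50.0, 60.0, 70.0],
    "u_turns": [1.0, 3.0, 1.0, 3.0],
}

THRESHOLD = 0.8  # assumed cut-off for "highly correlated"
selected = [name for name, values in indicators.items()
            if abs(pearson(values, fuel)) >= THRESHOLD]

# Hypothetical forecast from some model built on the selected DBIs.
forecast = [5.1, 5.7, 7.0, 8.4]
error = mape(fuel, forecast)

print(selected, round(error, 2))
```

With these made-up numbers only `mean_speed` passes the filter, and the forecast would meet the "below 5%" level reported for the best aggregate models.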

  19. Effect Sizes in Three-Level Cluster-Randomized Experiments

    ERIC Educational Resources Information Center

    Hedges, Larry V.

    2011-01-01

    Research designs involving cluster randomization are becoming increasingly important in educational and behavioral research. Many of these designs involve two levels of clustering or nesting (students within classes and classes within schools). Researchers would like to compute effect size indexes based on the standardized mean difference to…

  20. COVARIATE-ADAPTIVE CLUSTERING OF EXPOSURES FOR AIR POLLUTION EPIDEMIOLOGY COHORTS

    PubMed Central

    Keller, Joshua P.; Drton, Mathias; Larson, Timothy; Kaufman, Joel D.; Sandler, Dale P.; Szpiro, Adam A.

    2017-01-01

    Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge. Specifically, we present a method that uses geographic covariate information to cluster multi-pollutant observations and predict cluster membership at cohort locations. Our predictive k-means procedure identifies centers using a mixture model and is followed by multi-class spatial prediction. In simulations, we demonstrate that predictive k-means can reduce misclassification error by over 50% compared to ordinary k-means, with minimal loss in cluster representativeness. The improved prediction accuracy results in large gains of 30% or more in power for detecting effect modification by cluster in a simulated health analysis. In an analysis of the NIEHS Sister Study cohort using predictive k-means, we find that the association between systolic blood pressure (SBP) and long-term fine particulate matter (PM2.5) exposure varies significantly between different clusters of PM2.5 component profiles. Our cluster-based analysis shows that for subjects assigned to a cluster located in the Midwestern U.S., a 10 μg/m3 difference in exposure is associated with 4.37 mmHg (95% CI, 2.38, 6.35) higher SBP. PMID:28572869

  1. DNA-methylation dependent regulation of embryo-specific 5S ribosomal DNA cluster transcription in adult tissues of sea urchin Paracentrotus lividus.

    PubMed

    Bellavia, Daniele; Dimarco, Eufrosina; Naselli, Flores; Caradonna, Fabio

    2013-10-01

    We have previously reported a molecular and cytogenetic characterization of three different 5S rDNA clusters in the sea urchin Paracentrotus lividus and recently, demonstrated the presence of high heterogeneity in functional 5S rRNA. In this paper, we show some important distinctive data on 5S rRNA transcription for this organism. Using single strand conformation polymorphism (SSCP) analysis, we demonstrate the existence of two classes of 5S rRNA, one which is embryo-specific and encoded by the smallest (700 bp) cluster and the other which is expressed at every stage and encoded by longer clusters (900 and 950 bp). We also demonstrate that the embryo-specific class of 5S rRNA is expressed in oocytes and embryonic stages and is silenced in adult tissue and that this phenomenon appears to be due exclusively to DNA methylation, as indicated by sensitivity to 5-azacytidine, unlike Xenopus where this mechanism is necessary but not sufficient to maintain the silenced status. © 2013 Elsevier Inc. All rights reserved.

  2. Diversity amongst trigeminal neurons revealed by high throughput single cell sequencing

    PubMed Central

    Nguyen, Minh Q.; Wu, Youmei; Bonilla, Lauren S.; von Buchholtz, Lars J.

    2017-01-01

    The trigeminal ganglion contains somatosensory neurons that detect a range of thermal, mechanical and chemical cues and innervate unique sensory compartments in the head and neck including the eyes, nose, mouth, meninges and vibrissae. We used single-cell sequencing and in situ hybridization to examine the cellular diversity of the trigeminal ganglion in mice, defining thirteen clusters of neurons. We show that clusters are well conserved in dorsal root ganglia suggesting they represent distinct functional classes of somatosensory neurons and not specialization associated with their sensory targets. Notably, functionally important genes (e.g. the mechanosensory channel Piezo2 and the capsaicin gated ion channel Trpv1) segregate into multiple clusters and often are expressed in subsets of cells within a cluster. Therefore, the 13 genetically-defined classes are likely to be physiologically heterogeneous rather than highly parallel (i.e., redundant) lines of sensory input. Our analysis harnesses the power of single-cell sequencing to provide a unique platform for in silico expression profiling that complements other approaches linking gene-expression with function and exposes unexpected diversity in the somatosensory system. PMID:28957441

  3. Modified vegetation indices for Ganoderma disease detection in oil palm from field spectroradiometer data

    NASA Astrophysics Data System (ADS)

    Shafri, Helmi Z. M.; Anuar, M. Izzuddin; Saripan, M. Iqbal

    2009-10-01

    High-resolution field spectroradiometers are important for spectral analysis and mobile inspection of vegetation disease. The biggest challenges in using this technology for automated vegetation disease detection are spectral signature pre-processing, band selection, and the generation of reflectance indices that improve the ability of hyperspectral data to detect disease early. In this paper, new indices for oil palm Ganoderma disease detection were generated using band ratio and different band combination techniques. An unsupervised clustering method was used to cluster the values of each class resulting from each index. The quality of the band combinations was assessed using the Optimum Index Factor (OIF), while cluster validation was performed using Average Silhouette Width (ASW). Eleven modified reflectance indices were generated in this study, and the indices were ranked according to their ASW values. These modified indices were also compared with several existing and new indices. The results showed that the combination of spectral values at 610.5 nm and 738 nm was the best for clustering the three classes of infection levels in the determination of the best spectral index for early detection of Ganoderma disease.
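
The ranking of indices by Average Silhouette Width described above can be sketched as follows. This is only an illustration: the two index names and their values are invented, the labels stand for three assumed infection levels, and the abstract does not give the actual index formulas or the clustering method.

```python
def average_silhouette_width(values, labels):
    """ASW for one-dimensional values with given cluster labels.

    Assumes at least two clusters and that not all values coincide.
    Points in singleton clusters contribute a silhouette of 0.
    """
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)

    total = 0.0
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point itself adds 0 to the sum, hence the len - 1).
        a = sum(abs(v - w) for w in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(abs(v - w) for w in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        total += (b - a) / max(a, b)
    return total / len(values)

# Assumed infection levels for six hypothetical samples.
labels = ["healthy", "healthy", "mild", "mild", "severe", "severe"]

# Hypothetical values of two candidate reflectance indices for those samples.
candidate_indices = {
    "index_a": [2.0, 2.1, 1.5, 1.6, 1.0, 1.1],    # classes well separated
    "index_b": [1.0, 2.0, 1.1, 1.9, 1.05, 2.05],  # classes overlap
}

asw = {name: average_silhouette_width(vals, labels)
       for name, vals in candidate_indices.items()}
ranked = sorted(asw, key=asw.get, reverse=True)
print(ranked)
```

An index whose values separate the classes cleanly gets an ASW near 1, while overlapping classes push it toward 0 or below, which is what makes ASW usable as a ranking criterion.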

  4. Wildlife management by habitat units: A preliminary plan of action

    NASA Technical Reports Server (NTRS)

    Frentress, C. D.; Frye, R. G.

    1975-01-01

    Procedures for yielding vegetation type maps were developed using LANDSAT data and a computer assisted classification analysis (LARSYS) to assist in managing populations of wildlife species by defined area units. Ground cover in Travis County, Texas was classified on two occasions using a modified version of the unsupervised approach to classification. The first classification produced a total of 17 classes. Examination revealed that further grouping was justified. A second analysis produced 10 classes which were displayed on printouts which were later color-coded. The final classification was 82 percent accurate. While the classification map appeared to satisfactorily depict the existing vegetation, two classes were determined to contain significant error. The major sources of error could have been eliminated by stratifying cluster sites more closely among previously mapped soil associations that are identified with particular plant associations and by precisely defining class nomenclature using established criteria early in the analysis.

  5. [Temporal-spatial analysis of bacillary dysentery in the Three Gorges Area of China, 2005-2016].

    PubMed

    Zhang, P; Zhang, J; Chang, Z R; Li, Z J

    2018-01-10

    Objective: To analyze the spatial and temporal distributions of bacillary dysentery in Chongqing, Yichang and Enshi (the Three Gorges Area) from 2005 to 2016, and provide evidence for disease prevention and control. Methods: The incidence data of bacillary dysentery in the Three Gorges Area during this period were collected from the National Notifiable Infectious Disease Reporting System. The spatial-temporal scan statistic was computed with SaTScan 9.4, and bacillary dysentery clusters were visualized with ArcGIS 10.3. Results: A total of 126 196 cases were reported in the Three Gorges Area during 2005-2016, with an average incidence rate of 29.67/100 000. The overall incidence showed a downward trend, with an average annual decline rate of 4.74%. Cases occurred all year round but with an obvious seasonal increase between May and October. Among the reported cases, 44.71% (56 421/126 196) were children under 5 years old; children outside child care settings accounted for 41.93% (52 918/126 196) of the total. The incidence rates in the districts of Yuzhong, Dadukou, Jiangbei, Shapingba, Jiulongpo, Nanan, Yubei and Chengkou of Chongqing and the districts of Xiling and Wujiagang of Yichang city, Hubei province, were high, ranging from 60.20/100 000 to 114.81/100 000. The spatial-temporal scan statistic revealed that cases clustered temporally during May-October, and there were 12 class Ⅰ clusters, 35 class Ⅱ clusters, and 9 clusters without statistical significance in counties with high incidence. All the class Ⅰ clusters were in the urban area of Chongqing (Yuzhong, Dadukou, Jiangbei, Shapingba, Jiulongpo, Nanan, Beibei, Yubei, Banan) and surrounding counties, and the class Ⅱ clusters shifted from a concentrated distribution to a scattered distribution.
Conclusions: Temporal and spatial clustering of bacillary dysentery incidence existed in the Three Gorges Area during 2005-2016. It is necessary to strengthen bacillary dysentery prevention and control in the urban areas of Chongqing and Yichang.

  6. Barrios, ghettos, and residential racial composition: Examining the racial makeup of neighborhood profiles and their relationship to self-rated health.

    PubMed

    Booth, Jaime M; Teixeira, Samantha; Zuberi, Anita; Wallace, John M

    2018-01-01

    Racial/ethnic disparities in self-rated health persist and according to the social determinants of health framework, may be partially explained by residential context. The relationship between neighborhood factors and self-rated health has been examined in isolation but a more holistic approach is needed to understand how these factors may cluster together and how these neighborhood typologies relate to health. To address this gap, we conducted a latent profile analysis using data from the Chicago Community Adult Health Study (CCAHS; N = 2969 respondents in 342 neighborhood clusters) to identify neighborhood profiles, examined differences in neighborhood characteristics among the identified typologies and tested their relationship to self-rated health. Results indicated four distinct classes of neighborhoods that vary significantly on most neighborhood-level social determinants of health and can be defined by racial/ethnic composition and class. Residents in Hispanic, majority black disadvantaged, and majority black non-poor neighborhoods all had significantly poorer self-rated health when compared to majority white neighborhoods. The difference between black non-poor and white neighborhoods in self-rated health was not significant when controlling for individual race/ethnicity. The results indicate that neighborhood factors do cluster by race and class of the neighborhood and that this clustering is related to poorer self-rated health. Copyright © 2017. Published by Elsevier Inc.

  7. Identification of temporal variations in mental workload using locally-linear-embedding-based EEG feature reduction and support-vector-machine-based clustering and classification techniques.

    PubMed

    Yin, Zhong; Zhang, Jianhua

    2014-07-01

    Identifying abnormal changes of mental workload (MWL) over time is crucial for preventing accidents due to cognitive overload and inattention of human operators in safety-critical human-machine systems. It is known that various neuroimaging technologies can be used to identify MWL variations. In order to classify MWL into a few discrete levels using representative MWL indicators and small-sized training samples, a novel EEG-based approach combining locally linear embedding (LLE), support vector clustering (SVC) and support vector data description (SVDD) techniques is proposed and evaluated using experimentally measured data. The MWL indicators from different cortical regions are first elicited using the LLE technique. Then, the SVC approach is used to find the clusters of these MWL indicators and thereby to detect MWL variations. It is shown that the clusters can be interpreted as binary-class MWL. Furthermore, a trained binary SVDD classifier is shown to be capable of detecting slight variations of those indicators. By combining the two schemes, an SVC-SVDD framework is proposed, where the clear-cut (smaller) cluster is detected by SVC first and then a subsequent SVDD model is utilized to divide the overlapped (larger) cluster into two classes. Finally, three-class MWL levels (low, normal and high) can be identified automatically. The experimental data analysis results are compared with those of several existing methods. It has been demonstrated that the proposed framework can lead to acceptable computational accuracy and has the advantages of both unsupervised and supervised training strategies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  8. Processes and subdivisions in diogenites, a multivariate statistical analysis

    NASA Technical Reports Server (NTRS)

    Harriott, T. A.; Hewins, R. H.

    1984-01-01

    Multivariate statistical techniques used on diogenite orthopyroxene analyses show the relationships that occur within diogenites and the two orthopyroxenite components (class I and II) in the polymict diogenite Garland. Cluster analysis shows that only Peckelsheim is similar to Garland class I (Fe-rich) and the other diogenites resemble Garland class II. The unique diogenite Y 75032 may be related to type I by fractionation. Factor analysis confirms the subdivision and shows that Fe does not correlate with the weakly incompatible elements across the entire pyroxene composition range, indicating that igneous fractionation is not the process controlling total diogenite composition variation. The occurrence of two groups of diogenites is interpreted as the result of sampling or mixing of two main sequences of orthopyroxene cumulates with slightly different compositions.

  9. A possibilistic approach to clustering

    NASA Technical Reports Server (NTRS)

    Krishnapuram, Raghu; Keller, James M.

    1993-01-01

    Fuzzy clustering has been shown to be advantageous over crisp (or traditional) clustering methods in that total commitment of a vector to a given class is not required at each image pattern recognition iteration. Recently fuzzy clustering methods have shown spectacular ability to detect not only hypervolume clusters, but also clusters which are actually 'thin shells', i.e., curves and surfaces. Most analytic fuzzy clustering approaches are derived from the 'Fuzzy C-Means' (FCM) algorithm. The FCM uses the probabilistic constraint that the memberships of a data point across classes sum to one. This constraint was used to generate the membership update equations for an iterative algorithm. Recently, we cast the clustering problem into the framework of possibility theory using an approach in which the resulting partition of the data can be interpreted as a possibilistic partition, and the membership values may be interpreted as degrees of possibility of the points belonging to the classes. We show the ability of this approach to detect linear and quartic curves in the presence of considerable noise.
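
The contrast drawn above between the probabilistic FCM constraint and possibilistic memberships can be made concrete with a small sketch. The membership formulas used here are the ones commonly given for FCM and for possibilistic c-means in the literature; the abstract itself does not state them, so treat them as an assumption. The centers, the test point, and the scale parameters `etas` are hypothetical.

```python
def squared_distances(point, centers):
    """Squared Euclidean distance from a point to each cluster center."""
    return [sum((p - c) ** 2 for p, c in zip(point, ctr)) for ctr in centers]

def fcm_memberships(point, centers, m=2.0):
    """FCM memberships; they sum to one across clusters by construction.

    Assumes the point does not coincide with any center (no zero distance).
    """
    d2 = squared_distances(point, centers)
    power = 1.0 / (m - 1.0)
    return [1.0 / sum((d2_i / d2_k) ** power for d2_k in d2) for d2_i in d2]

def possibilistic_memberships(point, centers, etas, m=2.0):
    """Possibilistic memberships; each depends only on its own cluster."""
    d2 = squared_distances(point, centers)
    power = 1.0 / (m - 1.0)
    return [1.0 / (1.0 + (d2_i / eta) ** power) for d2_i, eta in zip(d2, etas)]

centers = [(0.0, 0.0), (4.0, 0.0)]   # hypothetical cluster centers
etas = [1.0, 1.0]                    # hypothetical per-cluster scale parameters
outlier = (2.0, 10.0)                # a noise point far from both clusters

fcm = fcm_memberships(outlier, centers)
pcm = possibilistic_memberships(outlier, centers, etas)
print(fcm, pcm)
```

Under the sum-to-one constraint the distant noise point still receives membership 0.5 in each cluster, whereas the possibilistic memberships are both close to zero, which is consistent with the robustness to noise reported in the abstract.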

  10. Characterizing the course of back pain after osteoporotic vertebral fracture: a hierarchical cluster analysis of a prospective cohort study.

    PubMed

    Toyoda, Hiromitsu; Takahashi, Shinji; Hoshino, Masatoshi; Takayama, Kazushi; Iseki, Kazumichi; Sasaoka, Ryuichi; Tsujio, Tadao; Yasuda, Hiroyuki; Sasaki, Takeharu; Kanematsu, Fumiaki; Kono, Hiroshi; Nakamura, Hiroaki

    2017-09-23

    This study demonstrated four distinct patterns in the course of back pain after osteoporotic vertebral fracture (OVF). Greater angular instability in the first 6 months after the baseline was one factor affecting back pain after OVF. Understanding the natural course of symptomatic acute OVF is important in deciding the optimal treatment strategy. We used latent class analysis to classify the course of back pain after OVF and identify the risk factors associated with persistent pain. This multicenter cohort study included 218 consecutive patients with ≤ 2-week-old OVFs who were enrolled at 11 institutions. Dynamic x-rays and back pain assessment with a visual analog scale (VAS) were obtained at enrollment and at 1-, 3-, and 6-month follow-ups. The VAS scores were used to characterize patient groups, using hierarchical cluster analysis. VAS scores for 128 patients were used for hierarchical cluster analysis. Analysis yielded four clusters representing different patterns of back pain progression. Cluster 1 patients (50.8%) had stable, mild pain. Cluster 2 patients (21.1%) started with moderate pain and progressed quickly to very low pain. Patients in cluster 3 (10.9%) had moderate pain that initially improved but worsened after 3 months. Cluster 4 patients (17.2%) had persistent severe pain. Patients in cluster 4 showed significantly higher baseline pain intensity, a higher degree of angular instability, and a higher number of previous OVFs, and tended to lack regular exercise. In contrast, patients in cluster 2 had significantly lower baseline VAS and less angular instability. We identified four distinct groups of OVF patients with different patterns of back pain progression. Understanding the course of back pain after OVF may help in its management and contribute to future treatment trials.
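
To show what grouping patients by their pain trajectories might look like in practice, here is a minimal agglomerative clustering sketch. The VAS values are invented, and the Euclidean distance and average linkage are assumptions made for illustration; the abstract does not specify which distance or linkage the authors used.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerate(trajectories, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of indices."""
    clusters = [[i] for i in range(len(trajectories))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(euclidean(trajectories[i], trajectories[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, a, b = min((linkage(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical VAS scores (0-100) at enrollment, 1, 3, and 6 months.
vas = [
    [30, 25, 20, 20],  # stable, mild pain
    [32, 24, 22, 18],
    [60, 20, 10, 5],   # moderate pain improving quickly
    [62, 22, 8, 6],
    [80, 78, 75, 77],  # persistent severe pain
    [82, 80, 76, 75],
]

groups = agglomerate(vas, n_clusters=3)
print(groups)
```

Each resulting group collects patients whose whole trajectory is similar, which is the sense in which the clusters in the abstract represent patterns of pain progression rather than single time points.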

  11. Classification of California streams using combined deductive and inductive approaches: Setting the foundation for analysis of hydrologic alteration

    USGS Publications Warehouse

    Pyne, Matthew I.; Carlisle, Daren M.; Konrad, Christopher P.; Stein, Eric D.

    2017-01-01

    Regional classification of streams is an early step in the Ecological Limits of Hydrologic Alteration framework. Many stream classifications are based on an inductive approach using hydrologic data from minimally disturbed basins, but this approach may underrepresent streams from heavily disturbed basins or sparsely gaged arid regions. An alternative is a deductive approach, using watershed climate, land use, and geomorphology to classify streams, but this approach may miss important hydrological characteristics of streams. We classified all stream reaches in California using both approaches. First, we used Bayesian and hierarchical clustering to classify reaches according to watershed characteristics. Streams were clustered into seven classes according to elevation, sedimentary rock, and winter precipitation. Permutation-based analysis of variance and random forest analyses were used to determine which hydrologic variables best separate streams into their respective classes. Stream typology (i.e., the class that a stream reach is assigned to) is shaped mainly by patterns of high and mean flow behavior within the stream's landscape context. Additionally, random forest was used to determine which hydrologic variables best separate minimally disturbed reference streams from non-reference streams in each of the seven classes. In contrast to stream typology, deviation from reference conditions is more difficult to detect and is largely defined by changes in low-flow variables, average daily flow, and duration of flow. Our combined deductive/inductive approach allows us to estimate flow under minimally disturbed conditions based on the deductive analysis and compare to measured flow based on the inductive analysis in order to estimate hydrologic change.

  12. Multivariate Analysis and Its Applications

    DTIC Science & Technology

    1989-02-14

    defined in situations where measurements are taken on natural clusters of individuals like brothers in a family. A number of problems arise in the study of...intraclass correlations. How do we estimate it when observations are available on clusters of different sizes? How do we test the hypothesis that the...the random variable $y(X) = \theta_1 X_1 + \theta_2 X_2 + \dots + \theta_m X_m$ follows an exponential distribution with mean unity. Such a class of life distributions, has a

  13. Clustering of short and long-term co-movements in international financial and commodity markets in wavelet domain

    NASA Astrophysics Data System (ADS)

    Lahmiri, Salim; Uddin, Gazi Salah; Bekiros, Stelios

    2017-11-01

    We propose a general framework for measuring short- and long-term dynamics in asset classes based on the wavelet presentation of clustering analysis. The empirical results show strong evidence of instability of the financial system in the aftermath of the global financial crisis. Indeed, both short- and long-term dynamics have changed significantly after the global financial crisis. This study provides interesting insights into the complex structure of the global financial and economic system.

  14. Dynamic cluster generation for a fuzzy classifier with ellipsoidal regions.

    PubMed

    Abe, S

    1998-01-01

    In this paper, we discuss a fuzzy classifier with ellipsoidal regions that dynamically generates clusters. First, for the data belonging to a class we define a fuzzy rule with an ellipsoidal region. Namely, using the training data for each class, we calculate the center and the covariance matrix of the ellipsoidal region for the class. Then we tune the fuzzy rules, i.e., the slopes of the membership functions, successively until there is no improvement in the recognition rate of the training data. Then, if the number of data belonging to a class that are misclassified into another class exceeds a prescribed number, we define a new cluster to which those data belong and the associated fuzzy rule. Then we tune the newly defined fuzzy rules in a similar way to that stated above, fixing the already obtained fuzzy rules. We iterate generation of clusters and tuning of the newly generated fuzzy rules until the number of data belonging to a class that are misclassified into another class does not exceed the prescribed number. We evaluate our method using thyroid data, Japanese Hiragana data of vehicle license plates, and blood cell data. By dynamic cluster generation, the generalization ability of the classifier is improved and the recognition rate of the fuzzy classifier for the test data is the best among the neural network classifiers and other fuzzy classifiers if there are no discrete input variables.
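
The first step described above (one ellipsoidal region per class, defined by the center and covariance matrix of that class's training data) can be sketched for two-dimensional data as follows. The exponential membership function with a slope parameter `alpha` is an assumed form chosen for illustration, and the training points are invented; the tuning of slopes and the dynamic generation of new clusters are not shown.

```python
from math import exp

def fit_ellipsoid(points):
    """Center and covariance (sxx, sxy, syy) of a set of 2-D points."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - cx) ** 2 for p in points) / n
    syy = sum((p[1] - cy) ** 2 for p in points) / n
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points) / n
    return (cx, cy), (sxx, sxy, syy)

def membership(x, center, cov, alpha=1.0):
    """Membership from the squared Mahalanobis distance to the center.

    alpha plays the role of the tunable slope (assumed form).
    Requires a non-singular covariance matrix (points not collinear).
    """
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - center[0], x[1] - center[1]
    # Quadratic form with the inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return exp(-d2 / alpha)

# Hypothetical training data for two classes.
training = {
    "A": [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)],
    "B": [(5.0, 5.0), (7.0, 5.0), (5.0, 7.0), (7.0, 7.0)],
}
rules = {label: fit_ellipsoid(points) for label, points in training.items()}

def classify(x):
    """Assign x to the class whose rule gives the highest membership."""
    return max(rules, key=lambda label: membership(x, *rules[label]))

predicted = classify((1.5, 1.0))
print(predicted)
```

In the method summarized above, misclassified training data would then trigger the definition of additional clusters, each with its own center, covariance matrix, and rule.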

  15. Atlas-guided cluster analysis of large tractography datasets.

    PubMed

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment.

  16. Final Report of the Evaluation of the 1969-1970 Benjamin Franklin Cluster Program: Programs and Patterns for Disadvantaged High School Students. ESEA Title I.

    ERIC Educational Resources Information Center

    Hoffman, Louis J.

    The Cluster Program at Benjamin Franklin High School, funded under Title I of the 1965 Elementary Secondary Education Act, is designed to be a school within a school in which 249 ninth grade students attend classes in two separate clusters. Each cluster is formulated such that all students receive instruction from five teachers in classes whose…

  17. Freshman Health Topics

    ERIC Educational Resources Information Center

    Hovde, Karen

    2011-01-01

    This article examines a cluster of health topics that are frequently selected by students in lower division classes. Topics address issues relating to addictive substances, including alcohol and tobacco, eating disorders, obesity, and dieting. Analysis of the topics examines their interrelationships and organization in the reference literature.…

  18. Novel approach to classifying patients with pulmonary arterial hypertension using cluster analysis.

    PubMed

    Parikh, Kishan S; Rao, Youlan; Ahmad, Tariq; Shen, Kai; Felker, G Michael; Rajagopal, Sudarshan

    2017-01-01

    Pulmonary arterial hypertension (PAH) patients have distinct disease courses and responses to treatment, but current diagnostic and treatment schemes provide limited insight. We aimed to see if cluster analysis could distinguish clinical phenotypes in PAH. An unbiased cluster analysis was performed on 17 baseline clinical variables of PAH patients from the FREEDOM-M, FREEDOM-C, and FREEDOM-C2 randomized trials of oral treprostinil versus placebo. Participants were either treatment-naïve (FREEDOM-M) or on background therapy (FREEDOM-C, FREEDOM-C2). We tested for association of clusters with outcomes and interaction with respect to treatment. Primary outcome was 6-minute walking distance (6MWD) change. We included 966 participants with 12-week (FREEDOM-M) or 16-week (FREEDOM-C and FREEDOM-C2) follow-up. Four patient clusters were identified. Compared with Clusters 1 (n = 131) and 2 (n = 496), Clusters 3 (n = 246) and 4 (n = 93) patients were older, heavier, had worse baseline functional class, 6MWD, Borg Dyspnea Index, and fewer years since PAH diagnosis. Clusters also differed by PAH etiology and background therapies, but not gender or race. Mean treatment effect of oral treprostinil differed across Clusters 1-4, increasing in a monotonic fashion (Cluster 1: 10.9 m; Cluster 2: 13.0 m; Cluster 3: 25.0 m; Cluster 4: 50.9 m; interaction P value = 0.048). We identified four distinct clusters of PAH patients based on common patient characteristics. Patients who were older, diagnosed with PAH for a shorter period, and had worse baseline symptoms and exercise capacity had the greatest response to oral treprostinil treatment.

  19. Cluster analysis of the national weight control registry to identify distinct subgroups maintaining successful weight loss.

    PubMed

    Ogden, Lorraine G; Stroebele, Nanette; Wyatt, Holly R; Catenacci, Victoria A; Peters, John C; Stuht, Jennifer; Wing, Rena R; Hill, James O

    2012-10-01

    The National Weight Control Registry (NWCR) is the largest ongoing study of individuals successful at maintaining weight loss; the registry enrolls individuals maintaining a weight loss of at least 13.6 kg (30 lb) for a minimum of 1 year. The current report uses multivariate latent class cluster analysis to identify unique clusters of individuals within the NWCR that have distinct experiences, strategies, and attitudes with respect to weight loss and weight loss maintenance. The cluster analysis considers weight and health history, weight control behaviors and strategies, effort and satisfaction with maintaining weight, and psychological and demographic characteristics. The analysis includes 2,228 participants enrolled between 1998 and 2002. Cluster 1 (50.5%) represents a weight-stable, healthy, exercise conscious group who are very satisfied with their current weight. Cluster 2 (26.9%) has continuously struggled with weight since childhood; they rely on the greatest number of resources and strategies to lose and maintain weight, and report higher levels of stress and depression. Cluster 3 (12.7%) represents a group successful at weight reduction on the first attempt; they were least likely to be overweight as children, are maintaining the longest duration of weight loss, and report the least difficulty maintaining weight. Cluster 4 (9.9%) represents a group less likely to use exercise to control weight; they tend to be older, eat fewer meals, and report more health problems. Further exploration of the unique characteristics of these clusters could be useful for tailoring future weight loss and weight maintenance programs to the specific characteristics of an individual.

  20. Percolation and epidemics in random clustered networks

    NASA Astrophysics Data System (ADS)

    Miller, Joel C.

    2009-08-01

    The social networks that infectious diseases spread along are typically clustered. Because of the close relation between percolation and epidemic spread, the behavior of percolation in such networks gives insight into infectious disease dynamics. A number of authors have studied percolation or epidemics in clustered networks, but the networks often contain preferential contacts in high degree nodes. We introduce a class of random clustered networks and a class of random unclustered networks with the same preferential mixing. Percolation in the clustered networks reduces the component sizes and increases the epidemic threshold compared to the unclustered networks.
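
    The link between percolation and epidemic spread mentioned in this abstract can be illustrated with a small bond-percolation sketch: each contact (edge) is kept with probability p, and the size of the largest connected component that remains is a rough proxy for how far an outbreak could reach. This is an illustrative toy under stated assumptions, not the network model of the paper: the function name, the example graph (two triangles joined by one edge) and the use of a union-find structure are choices made for the example.

```python
import random

def largest_component_after_percolation(edges, n_nodes, p, seed=0):
    """Keep each edge independently with probability p and return the size
    of the largest connected component (tracked with union-find)."""
    rng = random.Random(seed)
    parent = list(range(n_nodes))
    size = [1] * n_nodes

    def find(u):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u, v in edges:
        if rng.random() < p:          # the edge survives percolation
            ru, rv = find(u), find(v)
            if ru != rv:
                if size[ru] < size[rv]:
                    ru, rv = rv, ru
                parent[rv] = ru       # merge the smaller tree into the larger
                size[ru] += size[rv]
    return max(size[find(u)] for u in range(n_nodes))

# Hypothetical clustered toy graph: two triangles linked by the edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
full = largest_component_after_percolation(toy_edges, 6, p=1.0)
none = largest_component_after_percolation(toy_edges, 6, p=0.0)
```

    With p = 1.0 every edge is kept and the whole toy graph forms one component; with p = 0.0 no edge is kept and every node is isolated. Intermediate values of p depend on the random seed.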

  1. A 1400-MHz survey of 1478 Abell clusters of galaxies

    NASA Technical Reports Server (NTRS)

    Owen, F. N.; White, R. A.; Hilldrup, K. C.; Hanisch, R. J.

    1982-01-01

    Observations of 1478 Abell clusters of galaxies with the NRAO 91-m telescope at 1400 MHz are reported. The measured beam shape was deconvolved from the measured source Gaussian fits in order to estimate the source size and position angle. All detected sources within 0.5 corrected Abell cluster radii are listed, including the cluster number, richness class, distance class, magnitude of the tenth brightest galaxy, redshift estimate, corrected cluster radius in arcmin, right ascension and error, declination and error, total flux density and error, and angular structure for each source.

  2. Variations in students' perceived reasons for, sources of, and forms of in-school discrimination: A latent class analysis.

    PubMed

    Byrd, Christy M; Carter Andrews, Dorinda J

    2016-08-01

    Although there exists a healthy body of literature related to discrimination in schools, this research has primarily focused on racial or ethnic discrimination as perceived and experienced by students of color. Few studies examine students' perceptions of discrimination from a variety of sources, such as adults and peers, their descriptions of the discrimination, or the frequency of discrimination in the learning environment. Middle and high school students in a Midwestern school district (N=1468) completed surveys identifying whether they experienced discrimination from seven sources (e.g., peers, teachers, administrators), for seven reasons (e.g., gender, race/ethnicity, religion), and in eight forms (e.g., punished more frequently, called names, excluded from social groups). The sample was 52% White, 15% Black/African American, 14% Multiracial, and 17% Other. Latent class analysis was used to cluster individuals based on reported sources of, reasons for, and forms of discrimination. Four clusters were found, and ANOVAs were used to test for differences between clusters on perceptions of school climate, relationships with teachers, perceptions that the school was a "good school," and engagement. The Low Discrimination cluster experienced the best outcomes, whereas an intersectional cluster experienced the most discrimination and the worst outcomes. The results confirm existing research on the negative effects of discrimination. Additionally, the paper adds to the literature by highlighting the importance of an intersectional approach to examining students' perceptions of in-school discrimination. Copyright © 2016 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  3. Modeling sports highlights using a time-series clustering framework and model interpretation

    NASA Astrophysics Data System (ADS)

    Radhakrishnan, Regunathan; Otsuka, Isao; Xiong, Ziyou; Divakaran, Ajay

    2005-01-01

    In our past work on sports highlights extraction, we have shown the utility of detecting audience reaction using an audio classification framework. The audio classes in the framework were chosen based on intuition. In this paper, we present a systematic way of identifying the key audio classes for sports highlights extraction using a time series clustering framework. We treat the low-level audio features as a time series and model the highlight segments as "unusual" events in a background of a "usual" process. The set of audio classes to characterize the sports domain is then identified by analyzing the consistent patterns in each of the clusters output from the time series clustering framework. The distribution of features from the training data so obtained for each of the key audio classes is parameterized by a Minimum Description Length Gaussian Mixture Model (MDL-GMM). We also interpret the meaning of each of the mixture components of the MDL-GMM for the key audio class (the "highlight" class) that is correlated with highlight moments. Our results show that the "highlight" class is a mixture of audience cheering and commentator's excited speech. Furthermore, we show that the precision-recall performance for highlights extraction based on this "highlight" class is better than that of our previous approach, which uses only audience cheering as the key highlight class.

  4. Polymorphism in magic-sized Au144(SR)60 clusters

    NASA Astrophysics Data System (ADS)

    Jensen, Kirsten M. Ø.; Juhas, Pavol; Tofanelli, Marcus A.; Heinecke, Christine L.; Vaughan, Gavin; Ackerson, Christopher J.; Billinge, Simon J. L.

    2016-06-01

    Ultra-small, magic-sized metal nanoclusters represent an important new class of materials with properties between molecules and particles. However, their small size challenges the conventional methods for structure characterization. Here we present the structure of ultra-stable Au144(SR)60 magic-sized nanoclusters obtained from atomic pair distribution function analysis of X-ray powder diffraction data. The study reveals structural polymorphism in these archetypal nanoclusters. In addition to confirming the theoretically predicted icosahedral-cored cluster, we also find samples with a truncated decahedral core structure, with some samples exhibiting a coexistence of both cluster structures. Although the clusters are monodisperse in size, structural diversity is apparent. The discovery of polymorphism may open up a new dimension in nanoscale engineering.

  5. Doubly stochastic Poisson process models for precipitation at fine time-scales

    NASA Astrophysics Data System (ADS)

    Ramesh, Nadarajah I.; Onof, Christian; Xie, Dichao

    2012-09-01

    This paper considers a class of stochastic point process models, based on doubly stochastic Poisson processes, in the modelling of rainfall. We examine the application of this class of models, a neglected alternative to the widely-known Poisson cluster models, in the analysis of fine time-scale rainfall intensity. These models are mainly used to analyse tipping-bucket raingauge data from a single site but an extension to multiple sites is illustrated which reveals the potential of this class of models to study the temporal and spatial variability of precipitation at fine time-scales.
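
    One common example of a doubly stochastic Poisson process is a Poisson process whose rate is switched by a hidden two-state Markov chain. The sketch below simulates such a process as a loose illustration of the model class named in this abstract. The two-state structure, the parameter names and the numerical values are assumptions made for the example and are not taken from the paper.

```python
import random

def simulate_mmpp(rates, switch_rates, t_end, seed=0):
    """Simulate a two-state Markov-modulated Poisson process on [0, t_end).

    rates[s]        : event rate (e.g. bucket tips per hour) in hidden state s
    switch_rates[s] : rate at which the hidden chain leaves state s
    Returns the event times in increasing order.
    """
    rng = random.Random(seed)
    t, state, events = 0.0, 0, []
    while t < t_end:
        # Exponential sojourn time in the current hidden state.
        end = min(t + rng.expovariate(switch_rates[state]), t_end)
        # Within the sojourn, events form a homogeneous Poisson process.
        if rates[state] > 0:
            s = t + rng.expovariate(rates[state])
            while s < end:
                events.append(s)
                s += rng.expovariate(rates[state])
        t = end
        state = 1 - state  # switch to the other hidden state
    return events

# Hypothetical parameters: state 0 is "dry" (no events), state 1 is "wet".
tips = simulate_mmpp(rates=(0.0, 12.0), switch_rates=(0.2, 1.0), t_end=48.0)
```

    Because the event rate itself is random, counts from such a process are more variable than those of a plain Poisson process with the same mean rate, which is one reason this family is considered for bursty records such as fine time-scale rainfall.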

  6. Multi-class ERP-based BCI data analysis using a discriminant space self-organizing map.

    PubMed

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    Emotional or non-emotional image stimuli have recently been applied to event-related potential (ERP) based brain computer interfaces (BCI). Though the classification performance is over 80% in a single trial, discrimination between those ERPs has not been considered. In this research we tried to clarify the discriminability of four-class ERP-based BCI target data elicited by desk, seal, spider images and letter intensifications. A conventional self-organizing map (SOM) and a newly proposed discriminant space SOM (ds-SOM) were applied, then the discriminabilities were visualized. We also classified all pairs of those ERPs by stepwise linear discriminant analysis (SWLDA) and verified the visualization of discriminabilities. As a result, the ds-SOM showed understandable visualization of the data with a shorter computational time than the traditional SOM. We also confirmed the clear boundary between the letter cluster and the other clusters. The result was coherent with the classification performances by SWLDA. The method might be helpful not only for developing a new BCI paradigm, but also for big data analysis.

  7. Internet Gamblers Differ on Social Variables: A Latent Class Analysis.

    PubMed

    Khazaal, Yasser; Chatton, Anne; Achab, Sophia; Monney, Gregoire; Thorens, Gabriel; Dufour, Magali; Zullino, Daniele; Rothen, Stephane

    2017-09-01

    Online gambling has gained popularity in the last decade, leading to an important shift in how consumers engage in gambling and in the factors related to problem gambling and prevention. Indebtedness and loneliness have previously been associated with problem gambling. The current study aimed to characterize online gamblers in relation to indebtedness, loneliness, and several in-game social behaviors. The data set was obtained from 584 Internet gamblers recruited online through gambling websites and forums. Of these gamblers, 372 participants completed all study assessments and were included in the analyses. Questionnaires included those on sociodemographics and social variables (indebtedness, loneliness, in-game social behaviors), as well as the Gambling Motives Questionnaire, Gambling Related Cognitions Scale, Internet Addiction Test, Problem Gambling Severity Index, Short Depression-Happiness Scale, and UPPS-P Impulsive Behavior Scale. Social variables were explored with a latent class model. The clusters obtained were compared for psychological measures and three clusters were found: lonely indebted gamblers (cluster 1: 6.5%), not lonely not indebted gamblers (cluster 2: 75.4%), and not lonely indebted gamblers (cluster 3: 18%). Participants in clusters 1 and 3 (particularly in cluster 1) were at higher risk of problem gambling than were those in cluster 2. The three groups differed on most assessed variables, including the Problem Gambling Severity Index, the Short Depression-Happiness Scale, and the UPPS-P subscales (except the sensation seeking subscore). Results highlight significant between-group differences, suggesting that Internet gamblers are not a homogeneous group. Specific intervention strategies could be implemented for groups at risk.

  8. Bullied youth: the impact of bullying through lesbian, gay, and bisexual name calling.

    PubMed

    Evans, Caroline B R; Chapman, Mimi V

    2014-11-01

    Bullying is a common experience for many school-aged youth, but the majority of bullying research and intervention does not address the content of bullying behavior, particularly teasing. Understanding the various forms of bullying as well as the language used in bullying is important given that bullying can have persistent consequences, particularly for victims who are bullied through biased-based bullying, such as being called gay, lesbian, or queer. This study examines bullying experiences in a racially and ethnically diverse sample of 3,379 rural elementary-, middle-, and high-school youth. We use latent class analysis to establish clusters of bullying behaviors, including forms of biased-based bullying. The resulting classes are examined to ascertain if and how bullying by biased-based labeling is clustered with other forms of bullying behavior. This analysis identifies 3 classes of youth: youth who experience no bullying victimization, youth who experience social and emotional bullying, and youth who experience all forms of social and physical bullying, including being bullied by being called gay, lesbian, or queer. Youth in Classes 2 and 3 labeled their experiences as bullying. Results indicate that youth bullied by being called gay, lesbian, or queer are at a high risk of experiencing all forms of bullying behavior, highlighting the importance of increased support for this vulnerable group. (c) 2014 APA, all rights reserved.

  9. Classification and Clustering Methods for Multiple Environmental Factors in Gene-Environment Interaction: Application to the Multi-Ethnic Study of Atherosclerosis.

    PubMed

    Ko, Yi-An; Mukherjee, Bhramar; Smith, Jennifer A; Kardia, Sharon L R; Allison, Matthew; Diez Roux, Ana V

    2016-11-01

    There has been an increased interest in identifying gene-environment interaction (G × E) in the context of multiple environmental exposures. Most G × E studies analyze one exposure at a time, but we are exposed to multiple exposures in reality. Efficient analysis strategies for complex G × E with multiple environmental factors in a single model are still lacking. Using the data from the Multiethnic Study of Atherosclerosis, we illustrate a two-step approach for modeling G × E with multiple environmental factors. First, we utilize common clustering and classification strategies (e.g., k-means, latent class analysis, classification and regression trees, Bayesian clustering using Dirichlet Process) to define subgroups corresponding to distinct environmental exposure profiles. Second, we illustrate the use of an additive main effects and multiplicative interaction model, instead of the conventional saturated interaction model using product terms of factors, to study G × E with the data-driven exposure subgroups defined in the first step. We demonstrate useful analytical approaches to translate multiple environmental exposures into one summary class. These tools not only allow researchers to consider several environmental exposures in G × E analysis but also provide some insight into how genes modify the effect of a comprehensive exposure profile instead of examining effect modification for each exposure in isolation.

  10. A WISE Survey of New Star Clusters in the Central Plane Region of the Milky Way

    NASA Astrophysics Data System (ADS)

    Ryu, Jinhyuk; Lee, Myung Gyoon

    2018-04-01

    We present the discovery of new star clusters in the central plane region (|l| < 30° and |b| < 6°) of the Milky Way. In order to overcome the extinction problem and the spatial limit of previous surveys, we use the Wide-field Infrared Survey Explorer (WISE) data to find clusters. We also use other infrared survey data in the archive for additional analysis. We find 923 new clusters, of which 202 clusters are embedded clusters. These clusters are concentrated toward the Galactic plane and show a symmetric distribution with respect to the Galactic latitude. The embedded clusters show a stronger concentration to the Galactic plane than the nonembedded clusters. The new clusters are found more in the first Galactic quadrant, while previously known clusters are found more in the fourth Galactic quadrant. The spatial distribution of the combined sample of known clusters and new clusters is approximately symmetric with respect to the Galactic longitude. We estimate reddenings, distances, and relative ages of the 15 class A clusters using theoretical isochrones. Ten of them are relatively old (age >800 Myr) and five are young (age ≈4 Myr).

  11. Statistical Analyses of Femur Parameters for Designing Anatomical Plates.

    PubMed

    Wang, Lin; He, Kunjin; Chen, Zhengming

    2016-01-01

    Femur parameters are key prerequisites for scientifically designing anatomical plates. Meanwhile, individual differences in femurs present a challenge to designing well-fitting anatomical plates. Therefore, to design anatomical plates more scientifically, analyses of femur parameters with statistical methods were performed in this study. The specific steps were as follows. First, taking eight anatomical femur parameters as variables, 100 femur samples were classified into three classes with factor analysis and Q-type cluster analysis. Second, based on the mean parameter values of the three classes of femurs, three sizes of average anatomical plates corresponding to the three classes of femurs were designed. Finally, based on Bayes discriminant analysis, a new femur could be assigned to the proper class. Thereafter, the average anatomical plate suitable for that new femur was selected from the three available sizes of plates. Experimental results showed that the classification of femurs was quite reasonable based on the anatomical aspects of the femurs. For instance, three sizes of condylar buttress plates were designed. Meanwhile, 20 new femurs were assigned to the classes to which they belong. Thereafter, suitable condylar buttress plates were determined and selected.

  12. Somatosensory nociceptive characteristics differentiate subgroups in people with chronic low back pain: a cluster analysis.

    PubMed

    Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne

    2015-10-01

    The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking <300 minutes per week of moderate activity was significantly greater in cluster 1 than in clusters 2 and 3. Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups.

  13. Atlas-Guided Cluster Analysis of Large Tractography Datasets

    PubMed Central

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment. PMID:24386292

  14. [Poverty profile regarding households participating in a food assistance program].

    PubMed

    Álvarez-Uribe, Martha C; Aguirre-Acevedo, Daniel C

    2012-06-01

    This study aimed to establish subgroups with specific socioeconomic characteristics, using latent class analysis to segment the target population of the MANA-ICBF supplementary food program in the Antioquia department of Colombia, and to determine their differences regarding poverty and health conditions so that pertinent resources, programs and policies can be addressed efficiently. The target population consisted of 200,000 children and their households involved in the MANA food assistance program; a representative sample by region was used. Latent class analysis was used, as were the expectation-maximization and Newton-Raphson algorithms for identifying the appropriate number of classes. The final model classified the households into four clusters or classes, differing according to well-defined socio-demographic conditions affecting children's health. Some homes had a greater depth of poverty, therefore lowering the families' quality of life and affecting the health of the children in this age group.

  15. Chandra/ACIS-I Study of the X-Ray Properties of the NGC 6611 and M16 Stellar Populations

    NASA Astrophysics Data System (ADS)

    Guarcello, M. G.; Caramazza, M.; Micela, G.; Sciortino, S.; Drake, J. J.; Prisinzano, L.

    2012-07-01

    Mechanisms regulating the origin of X-rays in young stellar objects and the correlation with their evolutionary stage are under debate. Studies of the X-ray properties in young clusters allow us to understand these mechanisms. One ideal target for this analysis is the Eagle Nebula (M16), with its central cluster NGC 6611. At 1750 pc from the Sun, it harbors 93 OB stars, together with a population of low-mass stars from embedded protostars to disk-less Class III objects, with age ≤3 Myr. We study an archival 78 ks Chandra/ACIS-I observation of NGC 6611 and two new 80 ks observations of the outer region of M16, one centered on the Column V and the other on a region of the molecular cloud with ongoing star formation. We detect 1755 point sources with 1183 candidate cluster members (219 disk-bearing and 964 disk-less). We study the global X-ray properties of M16 and compare them with those of the Orion Nebula Cluster. We also compare the level of X-ray emission of Class II and Class III stars and analyze the X-ray spectral properties of OB stars. Our study supports the lower level of X-ray activity for the disk-bearing stars with respect to the disk-less members. The X-ray luminosity function (XLF) of M16 is similar to that of Orion, supporting the universality of the XLF in young clusters. Eighty-five percent of the O stars of NGC 6611 have been detected in X-rays. With only one possible exception, they show soft spectra with no hard components, indicating that mechanisms for the production of hard X-ray emission in O stars are not operating in NGC 6611.

  16. A Latent Class Multidimensional Scaling Model for Two-Way One-Mode Continuous Rating Dissimilarity Data

    ERIC Educational Resources Information Center

    Vera, J. Fernando; Macias, Rodrigo; Heiser, Willem J.

    2009-01-01

    In this paper, we propose a cluster-MDS model for two-way one-mode continuous rating dissimilarity data. The model aims at partitioning the objects into classes and simultaneously representing the cluster centers in a low-dimensional space. Under the normal distribution assumption, a latent class model is developed in terms of the set of…

  17. An iterative approach to optimize change classification in SAR time series data

    NASA Astrophysics Data System (ADS)

    Boldt, Markus; Thiele, Antje; Schulz, Karsten; Hinz, Stefan

    2016-10-01

    The detection of changes using remote sensing imagery has become a broad field of research with many approaches for many different applications. Besides the simple detection of changes between at least two images acquired at different times, analyses which aim at the change type or category are at least equally important. In this study, an approach for a semi-automatic classification of change segments is presented. A sparse dataset is considered to ensure fast and simple applicability to practical issues. The dataset is given by 15 high resolution (HR) TerraSAR-X (TSX) amplitude images acquired over a time period of one year (11/2013 to 11/2014). The scenery contains the airport of Stuttgart (GER) and its surroundings, including urban, rural, and suburban areas. Time series imagery offers the advantage of analyzing the change frequency of selected areas. In this study, the focus is set on the analysis of small-sized, frequently changing regions like parking areas, construction sites and collecting points consisting of high activity (HA) change objects. For each HA change object, suitable features are extracted and a k-means clustering is applied as the categorization step. Resulting clusters are finally compared to a previously introduced knowledge-based class catalogue, which is modified until an optimal class description results. In other words, the subjective understanding of the scenery semantics is optimized by the reality given by the data. In this way, even a sparse dataset containing only amplitude imagery can be evaluated without requiring comprehensive training datasets. Falsely defined classes might be rejected. Furthermore, classes which were defined too coarsely might be divided into sub-classes. Consequently, classes which were initially defined too narrowly might be merged. An optimal classification results when the combination of previously defined key indicators (e.g., number of clusters per class) reaches an optimum.
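
    The categorization step named in this abstract, k-means clustering of per-object feature vectors, can be sketched as follows. This is a generic, minimal k-means (Lloyd's algorithm) with a deterministic farthest-point initialisation chosen for reproducibility. The two-dimensional feature values are hypothetical, and the comparison of clusters against the class catalogue is not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Plain k-means; returns (labels, centers)."""
    # Farthest-point initialisation: start from the first point, then
    # repeatedly add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Hypothetical feature vectors for six change objects (two obvious groups).
features = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
            (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
labels, centers = kmeans(features, k=2)
```

    On this toy input the first three objects end up in cluster 0 and the last three in cluster 1. In practice k has to be chosen by the analyst, which fits the iterative refinement of the class catalogue that the abstract describes.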

  18. Characterization of Erwinia chrysanthemi by pectinolytic isozyme polymorphism and restriction fragment length polymorphism analysis of PCR-amplified fragments of pel genes.

    PubMed Central

    Nassar, A; Darrasse, A; Lemattre, M; Kotoujansky, A; Dervin, C; Vedel, R; Bertheau, Y

    1996-01-01

    Conserved regions about 420 bp long of the pelADE cluster specific to Erwinia chrysanthemi were amplified by PCR and used to differentiate 78 strains of E. chrysanthemi that were obtained from different hosts and geographical areas. No PCR products were obtained from DNA samples extracted from other pectinolytic and nonpectinolytic species and genera. The pel fragments amplified from the E. chrysanthemi strains studied were compared by performing a restriction fragment length polymorphism (RFLP) analysis. On the basis of similarity coefficients derived from the RFLP analysis, the strains were separated into 16 PCR RFLP patterns grouped in six clusters. These clusters appeared to be correlated with other infraspecific levels of E. chrysanthemi classification, such as pathovar and biovar, and occasionally with geographical origin. Moreover, the clusters correlated well with the polymorphism of pectate lyase and pectin methylesterase isoenzymes. While the pectin methylesterase profiles correlated with host monocot-dicot classification, the pectate lyase polymorphism might reflect the cell wall microdomains of the plants belonging to these classes. PMID:8779560

  19. Parental Involvement in Adolescent Romantic Relationships: Patterns and Correlates

    ERIC Educational Resources Information Center

    Kan, Marni L.; McHale, Susan M.; Crouter, Ann C.

    2008-01-01

    This study examined dimensions of mothers' and fathers' involvement in adolescents' romantic relationships when offspring were age 17. Using cluster analysis, parents from 105 White, working and middle class families were classified as positively involved, negatively involved, or autonomy-oriented with respect to their adolescents' romantic…

  20. Perception of aesthetics and personality traits in orthognathic surgery patients: A comparison of still and moving images

    PubMed Central

    Tran, Ulrich S.; Wutzl, Arno; Seemann, Rudolf; Millesi, Gabriele; Jagsch, Reinhold

    2018-01-01

    It is common in orthognathic surgery practice to evaluate faces with retruded or protruded chins (dysgnathic faces) using photographs. Because motion may alter how the face is perceived, we investigated the perception of faces presented via photographs and videos. Two hundred naïve raters (lay persons without a maxillofacial surgery background) evaluated 12 subjects with varying chin anatomy [so-called skeletal Class I (normal chin), Class II (retruded chin), and Class III (protruded chin)]. Starting from eight traits, factor analysis yielded a two-factor solution, i.e. an "aesthetics associated traits cluster" and a "personality traits cluster", which appeared to be uncorrelated. Internal consistency of the factors found for photographs and videos was excellent. Generally, female raters delivered better ratings than males, but the effect sizes were small. We analyzed differences and the respective effect magnitude between photograph and video perception. For each skeletal class the aesthetics associated dimensions were rated similarly between photographs and video clips. In contrast, specific personality traits were rated differently. Differences in the class-specific personality traits seen on photographs were "smoothed" in the assessment of videos, which implies that photos enhance stereotypes commonly attributed to a retruded or protruded chin. PMID:29775466

  1. Korean immigrants' knowledge of heart attack symptoms and risk factors.

    PubMed

    Hwang, Seon Y; Ryan, Catherine J; Zerwic, Julie Johnson

    2008-02-01

    This study assessed the knowledge of heart attack symptoms and risk factors in a convenience sample of Korean immigrants. A total of 116 Korean immigrants in a Midwestern metropolitan area were recruited through Korean churches and markets. Knowledge was assessed using both open-ended questions and a structured questionnaire. Latent class cluster analysis and Chi-square tests were used to analyze the data. About 76% of the sample had at least one self-reported risk factor for cardiovascular disease. Using an open-ended question, the majority of subjects could only identify one symptom. In the structured questionnaire, subjects identified a mean of 5 out of 10 heart attack symptoms and a mean of 5 out of 9 heart attack risk factors. Latent class cluster analysis showed that subjects clustered into two groups for both risk factors and symptoms: a high knowledge group and a low knowledge group. Subjects who clustered into the risk factor low knowledge group (48%) were more likely than the risk factor high knowledge group to be older than 65 years, to have lower education, to not know to use 911 when a heart attack occurred, and to not have a family history of heart attack. Korean immigrants' knowledge of heart attack symptoms and risk factors was variable, ranging from high to very low. Education should be focused on those at highest risk for a heart attack, which includes the elderly and those with risk factors.

  2. Multi-angle backscatter classification and sub-bottom profiling for improved seafloor characterization

    NASA Astrophysics Data System (ADS)

    Alevizos, Evangelos; Snellen, Mirjam; Simons, Dick; Siemes, Kerstin; Greinert, Jens

    2018-06-01

    This study applies three classification methods exploiting the angular dependence of acoustic seafloor backscatter along with high resolution sub-bottom profiling for seafloor sediment characterization in the Eckernförde Bay, Baltic Sea, Germany. This area is well suited for acoustic backscatter studies due to its shallowness, its smooth bathymetry and the presence of a wide range of sediment types. Backscatter data were acquired using a Seabeam1180 (180 kHz) multibeam echosounder and sub-bottom profiler data were recorded using a SES-2000 parametric sonar transmitting 6 and 12 kHz. The high density of seafloor soundings allowed extracting backscatter layers for five beam angles over a large part of the surveyed area. A Bayesian probability method was employed for sediment classification based on the backscatter variability at a single incidence angle, whereas Maximum Likelihood Classification (MLC) and Principal Components Analysis (PCA) were applied to the multi-angle layers. The Bayesian approach was used for identifying the optimum number of acoustic classes because cluster validation is carried out prior to class assignment and class outputs are ordinal categorical values. The method is based on the principle that backscatter values from a single incidence angle express a normal distribution for a particular sediment type. The resulting Bayesian classes were well correlated to median grain sizes and the percentage of coarse material. The MLC method uses angular response information from five layers of training areas extracted from the Bayesian classification map. The subsequent PCA analysis is based on the transformation of these five layers into two principal components that comprise most of the data variability. These principal components were clustered in five classes after running an external cluster validation test. In general, both methods, MLC and PCA, separated the various sediment types effectively, showing good agreement (kappa > 0.7) with the Bayesian approach, which also correlates well with ground truth data (r² > 0.7). In addition, sub-bottom data were used in conjunction with the Bayesian classification results to characterize acoustic classes with respect to their geological and stratigraphic interpretation. The joint interpretation of seafloor and sub-seafloor data sets proved to be an efficient approach for a better understanding of seafloor backscatter patchiness and for discriminating acoustically similar classes in different geological/bathymetric settings.
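
    The agreement figure quoted in this abstract (kappa above 0.7 between classification maps) is a chance-corrected agreement statistic. As a hedged illustration only, the sketch below computes Cohen's kappa for two label sequences. The sediment labels and the function name cohens_kappa are invented for this example, and the abstract does not state which kappa variant the authors used.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label sequences must be non-empty and of equal length")
    # Observed agreement: share of positions where both maps give the same class.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from the marginal class frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both maps contain one identical class only; treated here as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical class labels for ten map cells from two classification methods.
map_a = ["sand"] * 4 + ["mud"] * 3 + ["gravel"] * 3
map_b = ["sand"] * 3 + ["mud"] * 4 + ["gravel"] * 3

kappa = cohens_kappa(map_a, map_b)
```

    With these made-up labels the two maps agree on nine of ten cells, the chance agreement is 0.33, and kappa is therefore 0.57 / 0.67, or roughly 0.85.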

  3. Support Vector Data Descriptions and k-Means Clustering: One Class?

    PubMed

    Gornitz, Nico; Lima, Luiz Alberto; Muller, Klaus-Robert; Kloft, Marius; Nakajima, Shinichi

    2017-09-27

    We present ClusterSVDD, a methodology that unifies support vector data descriptions (SVDDs) and k-means clustering into a single formulation. This allows both methods to benefit from one another, i.e., by adding flexibility using multiple spheres for SVDDs and increasing anomaly resistance and flexibility through kernels to k-means. In particular, our approach leads to a new interpretation of k-means as a regularized mode seeking algorithm. The unifying formulation further allows for deriving new algorithms by transferring knowledge from one-class learning settings to clustering settings and vice versa. As a showcase, we derive a clustering method for structured data based on a one-class learning scenario. Additionally, our formulation can be solved via a particularly simple optimization scheme. We evaluate our approach empirically to highlight some of the proposed benefits on artificially generated data, as well as on real-world problems, and provide a Python software package comprising various implementations of primal and dual SVDD as well as our proposed ClusterSVDD.

  4. An evaluation of ISOCLS and CLASSY clustering algorithms for forest classification in northern Idaho. [Elk River quadrangle of the Clearwater National Forest]

    NASA Technical Reports Server (NTRS)

    Werth, L. F. (Principal Investigator)

    1981-01-01

    Both the iterative self-organizing clustering system (ISOCLS) and the CLASSY algorithms were applied to forest and nonforest classes for one 1:24,000 quadrangle map of northern Idaho and the classification and mapping accuracies were evaluated with 1:30,000 color infrared aerial photography. Confusion matrices for the two clustering algorithms were generated and studied to determine which is most applicable to forest and rangeland inventories in future projects. In an unsupervised mode, ISOCLS requires many trial-and-error runs to find the proper parameters to separate desired information classes. CLASSY tells more in a single run concerning the classes that can be separated, shows more promise for forest stratification than ISOCLS, and shows more promise for consistency. One major drawback to CLASSY is that important forest and range classes that are smaller than a minimum cluster size will be combined with other classes. The algorithm requires so much computer storage that only data sets as small as a quadrangle can be used at one time.

  5. Genome-wide identification and expression analysis of the ClTCP transcription factors in Citrullus lanatus.

    PubMed

    Shi, Pibiao; Guy, Kateta Malangisha; Wu, Weifang; Fang, Bingsheng; Yang, Jinghua; Zhang, Mingfang; Hu, Zhongyuan

    2016-04-12

    The plant-specific TCP transcription factor family, which is involved in the regulation of cell growth and proliferation, performs diverse functions in multiple aspects of plant growth and development. However, no comprehensive analysis of the TCP family in watermelon (Citrullus lanatus) has been undertaken previously. A total of 27 watermelon TCP encoding genes distributed on nine chromosomes were identified. Phylogenetic analysis clustered the genes into 11 distinct subgroups. Furthermore, phylogenetic and structural analyses distinguished two homology classes within the ClTCP family, designated Class I and Class II. The Class II genes were differentiated into two subclasses, the CIN subclass and the CYC/TB1 subclass. The expression patterns of all members were determined by semi-quantitative PCR. The functions of two ClTCP genes, ClTCP14a and ClTCP15, in regulating plant height were confirmed by ectopic expression in Arabidopsis wild-type and ortholog mutants. This study represents the first genome-wide analysis of the watermelon TCP gene family, which provides valuable information for understanding the classification and functions of the TCP genes in watermelon.

  6. The W40 region in the Gould Belt: An embedded cluster and H II region at the junction of filaments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mallick, K. K.; Ojha, D. K.; Kumar, M. S. N.

    We present a multiwavelength study of the W40 star-forming region using infrared (IR) observations in the UKIRT JHK bands, Spitzer Infrared Array Camera bands, and Herschel PACS bands, 2.12 μm H₂ narrowband imaging, and radio continuum observations from GMRT (610 and 1280 MHz), in a field of view (FoV) of ∼34' × 40'. Archival Spitzer observations in conjunction with near-IR observations are used to identify 1162 Class II/III and 40 Class I sources in the FoV. The nearest-neighbor stellar surface density analysis shows that the majority of these young stellar objects (YSOs) constitute the embedded cluster centered on the high-mass source IRS 1A South. Some YSOs, predominantly the younger population, are distributed along and trace the filamentary structures at lower stellar surface density. The cluster radius is measured to be 0.44 pc, matching well with the extent of radio emission, with a peak density of 650 pc⁻². The JHK data are used to map the extinction in the region, which is subsequently used to compute the cloud mass: 126 M☉ and 71 M☉ for the central cluster and the northern IRS 5 region, respectively. H₂ narrowband imaging shows significant emission, which prominently resembles fluorescent emission arising at the borders of dense regions. Radio continuum analysis shows that this region has a blister morphology, with the radio peak coinciding with a protostellar source. Free-free emission spectral energy distribution analysis is used to obtain physical parameters of the overall photoionized region and the IRS 5 sub-region. This multiwavelength scenario is suggestive of star formation having resulted from the merging of multiple filaments to form a hub. Star formation seems to have taken place in two successive epochs, with the first epoch traced by the central cluster and the high-mass star(s), followed by a second epoch that is spreading into the filaments as uncovered by the Class I sources and even younger protostellar sources along the filaments. The IRS 5 H II region displays indications of swept-up material that has possibly led to the formation of protostars.

  7. Editing ERTS-1 data to exclude land aids cluster analysis of water targets

    NASA Technical Reports Server (NTRS)

    Erb, R. B. (Principal Investigator)

    1973-01-01

    The author has identified the following significant results. It has been determined that an increase in the number of spectrally distinct coastal water types is achieved when data values over the adjacent land areas are excluded from the processing routine. This finding resulted from an automatic clustering analysis of ERTS-1 system corrected MSS scene 1002-18134 of 25 July 1972 over Monterey Bay, California. When the entire study area data set was submitted to the clustering only two distinct water classes were extracted. However, when the land area data points were removed from the data set and resubmitted to the clustering routine, four distinct groupings of water features were identified. Additionally, unlike the previous separation, the four types could be correlated to features observable in the associated ERTS-1 imagery. This exercise demonstrates that by proper selection of data submitted to the processing routine, based upon the specific application of study, additional information may be extracted from the ERTS-1 MSS data.

  8. Applying Machine Learning to Star Cluster Classification

    NASA Astrophysics Data System (ADS)

    Fedorenko, Kristina; Grasha, Kathryn; Calzetti, Daniela; Mahadevan, Sridhar

    2016-01-01

    Catalogs describing populations of star clusters are essential in investigating a range of important issues, from star formation to galaxy evolution. Star cluster catalogs are typically created in a two-step process: in the first step, a catalog of sources is automatically produced; in the second step, each of the extracted sources is visually inspected by 3-to-5 human classifiers and assigned a category. Classification by humans is labor-intensive and time-consuming; it thus creates a bottleneck and substantially slows down progress in star cluster research. We seek to automate the process of labeling star clusters (the second step) through applying supervised machine learning techniques. This will provide a fast, objective, and reproducible classification. Our data is HST (WFC3 and ACS) images of galaxies in the distance range of 3.5-12 Mpc, with a few thousand star clusters already classified by humans as a part of the LEGUS (Legacy ExtraGalactic UV Survey) project. The classification is based on 4 labels (Class 1 - symmetric, compact cluster; Class 2 - concentrated object with some degree of asymmetry; Class 3 - multiple peak system, diffuse; and Class 4 - spurious detection). We start by looking at basic machine learning methods such as decision trees. We then proceed to evaluate performance of more advanced techniques, focusing on convolutional neural networks and other Deep Learning methods. We analyze the results, and suggest several directions for further improvement.

  9. Polymorphism in magic-sized Au144(SR)60 clusters

    DOE PAGES

    Jensen, Kirsten M. O.; Juhas, Pavol; Tofanelli, Marcus A.; ...

    2016-06-14

    Ultra-small, magic-sized metal nanoclusters represent an important new class of materials with properties between molecules and particles. However, their small size challenges the conventional methods for structure characterization. We present the structure of ultra-stable Au144(SR)60 magic-sized nanoclusters obtained from atomic pair distribution function analysis of X-ray powder diffraction data. Our study reveals structural polymorphism in these archetypal nanoclusters. In addition to confirming the theoretically predicted icosahedral-cored cluster, we also find samples with a truncated decahedral core structure, with some samples exhibiting a coexistence of both cluster structures. Although the clusters are monodisperse in size, structural diversity is apparent. Finally, the discovery of polymorphism may open up a new dimension in nanoscale engineering.

  10. Clustering of adversity in young adults on disability pension due to mental disorders: a latent class analysis.

    PubMed

    Joensuu, Matti; Mattila-Holappa, Pauliina; Ahola, Kirsi; Ervasti, Jenni; Kivimäki, Mika; Kivekäs, Teija; Koskinen, Aki; Vahtera, Jussi; Virtanen, Marianna

    2016-02-01

    Mental disorders are the leading cause of work disability among young adults. This study examined whether distinct classes could be identified among young adults on the basis of medical history before receiving a disability pension due to a mental disorder. Medical history was obtained from pension applications and attached medical certificates for 1163 individuals aged 18-34 years who, in 2008, received a disability pension due to a mental disorder. Using latent class analysis, 10 clinical and individual adversities and their associations with sex, age and diagnostic category were examined. Three classes were identified: childhood adversity (prevalence, 33%), comorbidity (23%), and undefined (44%). The childhood adversity class was characterized by adverse events and symptoms reported during childhood and it associated with depressive disorders. The comorbidity class was characterized by comorbid mental disorders, suicide attempts and substance abuse and associated with younger age and bipolar disorder. The undefined class formed no distinct profile; individuals in this class had the lowest number of adversities and it associated with psychotic disorders. The identification of subgroups characterized by childhood circumstances and comorbidity may help planning of prevention and support practices for young adults with mental disorders and risk of work disability.

  11. Co-occurrence and clustering of health conditions at age 11: cross-sectional findings from the Millennium Cohort Study

    PubMed Central

    Hesketh, Kathryn R; Fagg, James; Muniz-Terrera, Graciela; Law, Catherine; Hope, Steven

    2016-01-01

    Objectives To identify patterns of co-occurrence and clustering of 6 common adverse health conditions in 11-year-old children and explore differences by sociodemographic factors. Design Nationally representative prospective cohort study. Setting Children born in the UK between 2000 and 2002. Participants 11 399 11-year-old singleton children for whom data on all 6 health conditions and sociodemographic information were available (complete cases). Main outcome measures Prevalence, co-occurrence and clustering of 6 common health conditions: wheeze; eczema; long-standing illness (excluding wheeze and eczema); injury; socioemotional difficulties (measured using Strengths and Difficulties Questionnaire) and unfavourable weight (thin/overweight/obese vs normal). Results 42.4% of children had 2 or more adverse health conditions (co-occurrence). Co-occurrence was more common in boys and children from lower income households. Latent class analysis identified 6 classes: ‘normative’ (57.4%); ‘atopic burdened’ (14.0%); ‘socioemotional burdened’ (11.0%); ‘unfavourable weight/injury’ (7.7%); ‘eczema/injury’ (6.0%) and ‘eczema/unfavourable weight’ (3.9%). As with co-occurrence, class membership differed by sociodemographic factors: boys, children of mothers with lower educational attainment and children from lower income households were more likely to be in the ‘socioemotional burdened’ class. Children of mothers with higher educational attainment were more likely to be in the ‘normative’ and ‘eczema/unfavourable weight’ classes. Conclusions Co-occurrence of adverse health conditions at age 11 is common and is associated with adverse socioeconomic circumstances. Holistic, child focused care, particularly in boys and those in lower income groups, may help to prevent and reduce co-occurrence in later childhood and adolescence. PMID:27881529

  12. An analysis of genetic architecture in populations of Ponderosa Pine

    Treesearch

    Yan B. Linhart; Jeffry B. Mitton; Kareen B. Sturgeon; Martha L. Davis

    1981-01-01

    Patterns of genetic variation were studied in three populations of ponderosa pine in Colorado by using electrophoretically variable protein loci. Significant genetic differences were found between separate clusters of trees and between age classes within populations. In addition, data indicate that differential cone production and differential animal damage have...

  13. Dyadic Parenting and Children's Externalizing Symptoms

    ERIC Educational Resources Information Center

    Meteyer, Karen B.; Perry-Jenkins, Maureen

    2009-01-01

    We explore dyadic parenting styles and their association with first-grade children's externalizing behavior symptoms in a sample of 85 working-class, dual-earner families. Cluster analysis is used to create a typology of parenting types, reflecting the parental warmth, overreactivity, and laxness of both mothers and fathers in two-parent families.…

  14. Evaluation of SLAR and thematic mapper MSS data for forest cover mapping using computer-aided analysis techniques

    NASA Technical Reports Server (NTRS)

    Hoffer, R. M. (Principal Investigator)

    1980-01-01

    Several possibilities were considered for defining the data set in which the same test areas could be used for each of the four different spatial resolutions being evaluated. The LARSYS CLUSTER was used to sort the vectors into spectral classes to reduce the within-spectral class variability in an effort to develop training statistics. A data quality test was written to determine the basic signal to noise characteristics within the data set being used. Because preliminary analysis of the LANDSAT MSS data revealed the presence of high cirrus clouds, other data sets are being sought.

  15. OGLE II Eclipsing Binaries In The LMC: Analysis With Class

    NASA Astrophysics Data System (ADS)

    Devinney, Edward J.; Prsa, A.; Guinan, E. F.; DeGeorge, M.

    2011-01-01

    The Eclipsing Binaries (EBs) via Artificial Intelligence (EBAI) Project is applying machine learning techniques to elucidate the nature of EBs. Previously, Prsa, et al. applied artificial neural networks (ANNs) trained on physically-realistic Wilson-Devinney models to solve the light curves of the 1882 detached EBs in the LMC discovered by the OGLE II Project (Wyrzykowski, et al.) fully automatically, bypassing the need for manually-derived starting solutions. A curious result is the non-monotonic distribution of the temperature ratio parameter T2/T1, featuring a subsidiary peak noted previously by Mazeh, et al. in an independent analysis using the EBOP EB solution code (Tamuz, et al.). To explore this and to gain a fuller understanding of the multivariate EBAI LMC observational plus solutions data, we have employed automatic clustering and advanced visualization (CAV) techniques. Clustering the OGLE II data aggregates objects that are similar with respect to many parameter dimensions. Measures of similarity could include, for example, the multidimensional Euclidean distance between data objects, although other measures may be appropriate. Applying clustering, we find good evidence that the T2/T1 subsidiary peak is due to evolved binaries, in support of Mazeh et al.'s speculation. Further, clustering suggests that the LMC detached EBs occupying the main sequence region belong to two distinct classes. Also identified as a separate cluster in the multivariate data are stars having a Period-I band relation. Derekas et al. had previously found a Period-K band relation for LMC EBs discovered by the MACHO Project (Alcock, et al.). We suggest such CAV techniques will prove increasingly useful for understanding the large, multivariate datasets increasingly being produced in astronomy. We are grateful for the support of this research from NSF/RUI Grant AST-05-75042 f.

  16. A Spectroscopic Analysis of the Galactic Globular Cluster NGC 6273 (M19)

    NASA Astrophysics Data System (ADS)

    Johnson, Christian I.; Rich, R. Michael; Pilachowski, Catherine A.; Caldwell, Nelson; Mateo, Mario; Bailey, John I., III; Crane, Jeffrey D.

    2015-08-01

    A combined effort utilizing spectroscopy and photometry has revealed the existence of a new globular cluster class. These “anomalous” clusters, which we refer to as “iron-complex” clusters, are differentiated from normal clusters by exhibiting large (≳0.10 dex) intrinsic metallicity dispersions, complex sub-giant branches, and correlated [Fe/H] and s-process enhancements. In order to further investigate this phenomenon, we have measured radial velocities and chemical abundances for red giant branch stars in the massive, but scarcely studied, globular cluster NGC 6273. The velocities and abundances were determined using high resolution (R ≈ 27,000) spectra obtained with the Michigan/Magellan Fiber System (M2FS) and MSpec spectrograph on the Magellan-Clay 6.5 m telescope at Las Campanas Observatory. We find that NGC 6273 has an average heliocentric radial velocity of +144.49 km s⁻¹ (σ = 9.64 km s⁻¹) and an extended metallicity distribution ([Fe/H] = -1.80 to -1.30) composed of at least two distinct stellar populations. Although the two dominant populations have similar [Na/Fe], [Al/Fe], and [α/Fe] abundance patterns, the more metal-rich stars exhibit significant [La/Fe] enhancements. The [La/Eu] data indicate that the increase in [La/Fe] is due to almost pure s-process enrichment. A third more metal-rich population with low [X/Fe] ratios may also be present. Therefore, NGC 6273 joins clusters such as ω Centauri, M2, M22, and NGC 5286 as a new class of iron-complex clusters exhibiting complicated star formation histories. This paper includes data gathered with the 6.5 m Magellan Telescopes located at Las Campanas Observatory, Chile.

  17. Clustering of Genetically Defined Allele Classes in the Caenorhabditis elegans DAF-2 Insulin/IGF-1 Receptor

    PubMed Central

    Patel, Dhaval S.; Garza-Garcia, Acely; Nanji, Manoj; McElwee, Joshua J.; Ackerman, Daniel; Driscoll, Paul C.; Gems, David

    2008-01-01

    The DAF-2 insulin/IGF-1 receptor regulates development, metabolism, and aging in the nematode Caenorhabditis elegans. However, complex differences among daf-2 alleles complicate analysis of this gene. We have employed epistasis analysis, transcript profile analysis, mutant sequence analysis, and homology modeling of mutant receptors to understand this complexity. We define an allelic series of nonconditional daf-2 mutants, including nonsense and deletion alleles, and a putative null allele, m65. The most severe daf-2 alleles show incomplete suppression by daf-18(0) and daf-16(0) and have a range of effects on early development. Among weaker daf-2 alleles there exist distinct mutant classes that differ in epistatic interactions with mutations in other genes. Mutant sequence analysis (including 11 newly sequenced alleles) reveals that class 1 mutant lesions lie only in certain extracellular regions of the receptor, while class 2 (pleiotropic) and nonconditional missense mutants have lesions only in the ligand-binding pocket of the receptor ectodomain or the tyrosine kinase domain. Effects of equivalent mutations on the human insulin receptor suggest an altered balance of intracellular signaling in class 2 alleles. These studies consolidate and extend our understanding of the complex genetics of daf-2 and its underlying molecular biology. PMID:18245374

  18. Yellow supergiants in open clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sowell, J.R.

    1986-01-01

    Superluminous giant stars (SLGs) have been reported in young globular clusters in the Large Magellanic Cloud (LMC). These stars appear to be in the post-asymptotic-giant-branch phase of evolution. This program was an investigation of galactic SLG candidates in open clusters, which are more like the LMC young globular clusters. These were chosen because luminosity, mass, and age determinations can be made for members since cluster distances and interstellar reddenings are known. Color magnitude diagrams were searched for candidates, using the same selection criteria as for SLGs in the LMC. Classification spectra were obtained of 115 program stars from McGraw-Hill Observatory and of 68 stars from Cerro Tololo Inter-American Observatory, Chile. These stars were visually classified on the MK system using spectral scans of standard stars taken at the respective observations. Published information was combined with this program's data for 83 stars in 30 clusters. Membership probabilities were assigned to these stars, and the clusters were analyzed according to age. It was seen that the intrinsically brightest supergiants are found in the youngest clusters. With increasing cluster age, the absolute luminosities attained by the supergiants decline. Also, it appears that the evolutionary tracks of luminosity class II stars are more similar to those of class I than of class III.

  19. Transient regulation of three clustered tomato class-I small heat-shock chaperone genes by ethylene is mediated by SIMADS-RIN transcription factor

    USDA-ARS's Scientific Manuscript database

    An intronless cluster of three class I small heat shock protein (sHSP) chaperone genes, Sl17.6, Sl20.0 and Sl20.1, resident on the short arm of chromosome 6 in tomato, was previously characterized (Goyal et al., 2012). This shsp chaperone gene cluster was found decorated with cis sequences known to ...

  20. Phylogenetic Tree Analysis of the Cold-Hot Nature of Traditional Chinese Marine Medicine for Possible Anticancer Activity

    PubMed Central

    Song, Xuxia; Li, Xuebo; Zhang, Fengcong; Wang, Changyun

    2017-01-01

    Traditional Chinese Marine Medicine (TCMM) represents one of the medicinal resources for research and development of novel anticancer drugs. In this study, to investigate the presence of anticancer activity (AA) displayed by cold or hot nature of TCMM, we analyzed the association relationship and the distribution regularity of TCMMs with different nature (613 TCMMs originated from 1,091 species of marine organisms) via association rules mining and phylogenetic tree analysis. The screened association rules were collected from three taxonomy groups: (1) Bacteria superkingdom, Phaeophyceae class, Fucales order, Sargassaceae family, and Sargassum genus; (2) Viridiplantae kingdom, Streptophyta phylum, Malpighiales class, and Rhizophoraceae family; (3) Holothuroidea class, Aspidochirotida order, and Holothuria genus. Our analyses showed that TCMMs with closer taxonomic relationship were more likely to possess anticancer bioactivity. We found that the cluster pattern of marine organisms with reported AA tended to cluster with cold nature TCMMs. Moreover, TCMMs with salty-cold nature demonstrated properties for softening hard mass and removing stasis to treat cancers, and species within Metazoa or Viridiplantae kingdom of cold nature were more likely to contain AA properties. We propose that TCMMs from these marine groups may enable focused bioprospecting for discovery of novel anticancer drugs derived from marine bioresources. PMID:28191021

  1. EClerize: A customized force-directed graph drawing algorithm for biological graphs with EC attributes.

    PubMed

    Danaci, Hasan Fehmi; Cetin-Atalay, Rengul; Atalay, Volkan

    2018-03-26

    Visualizing large-scale data produced by high-throughput experiments as a biological graph leads to better understanding and analysis. This study describes a customized force-directed layout algorithm, EClerize, for biological graphs that represent pathways in which the nodes are associated with Enzyme Commission (EC) attributes. The nodes with the same EC class numbers are treated as members of the same cluster. Positions of nodes are then determined based on both the biological similarity and the connection structure. EClerize minimizes the intra-cluster distance, that is, the distance between the nodes of the same EC cluster, and maximizes the inter-cluster distance, that is, the distance between two distinct EC clusters. EClerize is tested on a number of biological pathways and the improvement brought in is presented with respect to the original algorithm. EClerize is available as a plug-in to Cytoscape (http://apps.cytoscape.org/apps/eclerize).
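
    The two quantities named in this abstract can be measured on any finished layout. The sketch below is not the EClerize algorithm itself; it only computes a mean intra-cluster and a mean inter-cluster distance for given node positions. The node names, coordinates, and the choice of averaging node-pair distances (rather than, say, distances between cluster centroids) are assumptions made for illustration.

```python
import math
from itertools import combinations


def intra_inter_distances(positions, ec_class):
    """Return (mean intra-cluster, mean inter-cluster) node-pair distance.

    positions: dict mapping node name -> (x, y) layout coordinates.
    ec_class:  dict mapping node name -> EC class label (hypothetical labels).
    """
    intra, inter = [], []
    for a, b in combinations(positions, 2):
        d = math.dist(positions[a], positions[b])
        # Same EC class -> the pair lies inside one cluster.
        if ec_class[a] == ec_class[b]:
            intra.append(d)
        else:
            inter.append(d)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(intra), mean(inter)


# Hypothetical four-node layout with two EC classes.
positions = {"n1": (0, 0), "n2": (1, 0), "n3": (10, 0), "n4": (11, 0)}
ec_class = {"n1": "1", "n2": "1", "n3": "2", "n4": "2"}
intra, inter = intra_inter_distances(positions, ec_class)
print(intra, inter)
```

    A layout that suits the stated goal would show a small first value and a large second value.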

  2. Cloud classification from satellite data using a fuzzy sets algorithm: A polar example

    NASA Technical Reports Server (NTRS)

    Key, J. R.; Maslanik, J. A.; Barry, R. G.

    1988-01-01

    Where spatial boundaries between phenomena are diffuse, classification methods which construct mutually exclusive clusters seem inappropriate. The Fuzzy c-means (FCM) algorithm assigns each observation to all clusters, with membership values as a function of distance to the cluster center. The FCM algorithm is applied to AVHRR data for the purpose of classifying polar clouds and surfaces. Careful analysis of the fuzzy sets can provide information on which spectral channels are best suited to the classification of particular features, and can help determine likely areas of misclassification. General agreement in the resulting classes and cloud fraction was found between the FCM algorithm, a manual classification, and an unsupervised maximum likelihood classifier.
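
    The membership rule described above can be sketched as follows, using the standard fuzzy c-means membership formula. The fuzzifier m = 2 and the two-channel example values are assumptions for illustration; the abstract does not state the settings used on the AVHRR data.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one observation in every cluster.

    point:   feature vector of the observation (e.g. channel values).
    centers: list of cluster-center feature vectors.
    m:       fuzzifier (> 1); m = 2 is an assumed, commonly used value.
    """
    dists = [math.dist(point, c) for c in centers]
    # An observation lying exactly on a center belongs fully to it.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)); memberships sum to 1.
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


# Hypothetical observation and two cluster centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
print(u)
```

    Every observation receives a nonzero membership in every cluster, so ambiguous pixels can be flagged by looking for memberships that are close to each other.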

  3. Phylogenetic analysis of Newcastle disease viruses from Bangladesh suggests continuing evolution of genotype XIII.

    PubMed

    Barman, Lalita Rani; Nooruzzaman, Mohammed; Sarker, Rahul Deb; Rahman, Md Tazinur; Saife, Md Rajib Bin; Giasuddin, Mohammad; Das, Bidhan Chandra; Das, Priya Mohan; Chowdhury, Emdadul Haque; Islam, Mohammad Rafiqul

    2017-10-01

    A total of 23 Newcastle disease virus (NDV) isolates from Bangladesh taken between 2010 and 2012 were characterized on the basis of partial F gene sequences. All the isolates belonged to genotype XIII of class II NDV but segregated into three sub-clusters. One sub-cluster with 17 isolates aligned with sub-genotype XIIIc. The other two sub-clusters were phylogenetically distinct from the previously described sub-genotypes XIIIa, XIIIb and XIIIc and could be candidates of new sub-genotypes; however, that needs to be validated through full-length F gene sequence data. The results of the present study suggest that genotype XIII NDVs are under continuing evolution in Bangladesh.

  4. Systematic study of rapidity dispersion parameter in high energy nucleus-nucleus interactions

    NASA Astrophysics Data System (ADS)

    Bhattacharyya, Swarnapratim; Haiduc, Maria; Neagu, Alina Tania; Firu, Elena

    2014-03-01

    A systematic study of rapidity dispersion parameter as a quantitative measure of clustering of particles has been carried out in the interactions of 16O, 28Si and 32S projectiles at 4.5 A GeV/c with heavy (AgBr) and light (CNO) groups of targets present in the nuclear emulsion. For all the interactions, the total ensemble of events has been divided into four overlapping multiplicity classes depending on the number of shower particles. For all the interactions and for each multiplicity class, the rapidity dispersion parameter values indicate the occurrence of clusterization during the multiparticle production at Dubna energy. The measured rapidity dispersion parameter values are found to decrease with the increase of average multiplicity for all the interactions. The dependence of rapidity dispersion parameter on the average multiplicity can be successfully described by a relation $D(\eta) = a + b\langle n_s \rangle + c\langle n_s \rangle^2$, where $\langle n_s \rangle$ is the average multiplicity. The experimental results have been compared with the results obtained from the analysis of Monte Carlo simulated (MC-RAND) events. MC-RAND events show weaker clusterization among the pions in comparison to the experimental data.

  5. Neighbourhood socioeconomic deprivation and health-related quality of life: A multilevel analysis

    PubMed Central

    Ribeiro, Ana Isabel; Severo, Milton; Barros, Henrique; Fraga, Sílvia

    2017-01-01

    Objective: To assess the relationship between socioeconomic deprivation and health-related quality of life (HRQoL) in urban neighbourhoods, using a multilevel approach. Methods: Of the population-based cohort EPIPorto, 1154 georeferenced participants completed the 36-Item Short-Form Health Survey. Neighbourhood socioeconomic deprivation classes were estimated using latent-class analysis. Multilevel models measured clustering and contextual effects of neighbourhood deprivation on physical and mental HRQoL. Results: Residents from the least deprived neighbourhoods had higher physical HRQoL. Neighbourhood socioeconomic deprivation together with individual-level variables (age, gender and education) and health-related factors (smoking, alcohol consumption, sedentariness and chronic diseases) explained 98% of the total between-neighbourhood variance. Neighbourhood socioeconomic deprivation was significantly associated with physical health when comparing least and most deprived neighbourhoods (class 2, beta coefficient: -0.60; 95% confidence interval: -1.76; -0.56; class 3, beta coefficient: -2.28; 95% confidence interval: -3.96; -0.60), and as neighbourhood deprivation increases, a decrease in all values of physical health dimensions (physical functioning, role physical, bodily pain and general health) was also observed. Regarding the mental health dimension, no neighbourhood clustering or contextual effects were found. However, as neighbourhood deprivation increases, the values of vitality and role emotional dimensions significantly decreased. Conclusion: Neighbourhood socioeconomic deprivation is associated with HRQoL, affecting particularly physical health. This study suggests that to improve HRQoL, people and places should be targeted simultaneously. PMID:29236719

  6. Preliminary Comparisons of the Information Content and Utility of TM Versus MSS Data

    NASA Technical Reports Server (NTRS)

    Markham, B. L.

    1984-01-01

    Comparisons were made between subscenes from the first TM scene acquired of the Washington, D.C. area and a MSS scene acquired approximately one year earlier. Three types of analyses were conducted to compare TM and MSS data: a water body analysis, a principal components analysis and a spectral clustering analysis. The water body analysis compared the capability of the TM to the MSS for detecting small uniform targets. Of the 59 ponds located on aerial photographs 34 (58%) were detected by the TM with six commission errors (15%) and 13 (22%) were detected by the MSS with three commission errors (19%). The smallest water body detected by the TM was 16 meters; the smallest detected by the MSS was 40 meters. For the principal components analysis, means and covariance matrices were calculated for each subscene, and principal components images generated and characterized. In the spectral clustering comparison each scene was independently clustered and the clusters were assigned to informational classes. The preliminary comparison indicated that TM data provides enhancements over MSS in terms of (1) small target detection and (2) data dimensionality (even with 4-band data). The extra dimension, partially resultant from TM band 1, appears useful for built-up/non-built-up area separation.
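
    The percentages in the water body analysis can be checked with a few lines of arithmetic. The sketch below assumes that the commission error rate is the number of false detections divided by all detections (true plus false); the abstract does not state the definition, but this one reproduces the reported 15% and 19%.

```python
def detection_summary(n_targets, n_detected, n_commission):
    """Return (detection rate, commission error rate) as fractions.

    n_targets:    reference targets (ponds located on aerial photographs).
    n_detected:   reference targets found by the sensor.
    n_commission: false detections (commission errors).
    Assumed definition: commission rate = false / (true + false) detections.
    """
    detection_rate = n_detected / n_targets
    commission_rate = n_commission / (n_detected + n_commission)
    return detection_rate, commission_rate


tm = detection_summary(59, 34, 6)    # Thematic Mapper counts from the abstract
mss = detection_summary(59, 13, 3)   # Multispectral Scanner counts from the abstract

for name, (det, com) in (("TM", tm), ("MSS", mss)):
    print(f"{name}: detected {det:.0%}, commission errors {com:.0%}")
```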

  7. Cluster analysis of fasciolosis in dairy cow herds in Munster province of Ireland and detection of major climatic and environmental predictors of the exposure risk.

    PubMed

    Selemetas, Nikolaos; Phelan, Paul; O'Kiely, Padraig; de Waal, Theo

    2015-03-19

    Fasciolosis caused by Fasciola hepatica is a widespread parasitic disease in cattle farms. The aim of this study was to detect clusters of fasciolosis in dairy cow herds in Munster Province, Ireland and to identify significant climatic and environmental predictors of the exposure risk. In total, 1,292 dairy herds across Munster were sampled in September 2012, each providing a single bulk tank milk (BTM) sample. The analysis of samples by an in-house antibody-detection enzyme-linked immunosorbent assay (ELISA) showed that 65% of the dairy herds (n = 842) had been exposed to F. hepatica. Using the Getis-Ord Gi* statistic, 16 high-risk and 24 low-risk (P < 0.01) clusters of fasciolosis were identified. The spatial distribution of high-risk clusters was more dispersed and mainly located in the northern and western regions of Munster compared to the low-risk clusters that were mostly concentrated in the southern and eastern regions. The most significant classes of variables that could reflect the difference between high-risk and low-risk clusters were the total number of wet-days and rain-days, rainfall, the normalized difference vegetation index (NDVI), temperature and soil type. There was a bigger proportion of well-drained soils among the low-risk clusters, whereas poorly drained soils were more common among the high-risk clusters. These results stress the role of precipitation, grazing, temperature and drainage on the life cycle of F. hepatica in the temperate Irish climate. The findings of this study highlight the importance of cluster analysis for identifying significant differences in climatic and environmental variables between high-risk and low-risk clusters of fasciolosis in Irish dairy herds.
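
    A minimal sketch of the Getis-Ord Gi* statistic in its usual z-score form is given below. The five herd values and the binary neighbour weights are hypothetical, and the study's actual distance band and software are not stated in the abstract. Each weight row includes the location itself, which is what distinguishes Gi* from Gi.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every location.

    values:  observed value at each of the n locations.
    weights: n x n spatial weights; weights[i][i] is included (the "star").
    A row whose weights are all equal and nonzero gives a zero denominator,
    so each location must have at least one non-neighbour.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for row in weights:
        w_sum = sum(row)
        w_sq = sum(w * w for w in row)
        numerator = sum(w * v for w, v in zip(row, values)) - mean * w_sum
        denominator = s * math.sqrt((n * w_sq - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical exposure levels for five herds along a transect.
values = [10, 9, 8, 1, 0]
# Binary weights: each herd is a neighbour of itself and of adjacent herds.
weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
scores = getis_ord_gi_star(values, weights)
print([round(z, 2) for z in scores])
```

    Large positive scores mark candidate high-value clusters and large negative scores mark low-value clusters; a two-sided P < 0.01 corresponds to roughly |z| > 2.58 under a normal approximation.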

  8. Multimorbidity and survival for patients with acute myocardial infarction in England and Wales: Latent class analysis of a nationwide population-based cohort.

    PubMed

    Hall, Marlous; Dondo, Tatendashe B; Yan, Andrew T; Mamas, Mamas A; Timmis, Adam D; Deanfield, John E; Jernberg, Tomas; Hemingway, Harry; Fox, Keith A A; Gale, Chris P

    2018-03-01

    There is limited knowledge of the scale and impact of multimorbidity for patients who have had an acute myocardial infarction (AMI). Therefore, this study aimed to determine the extent to which multimorbidity is associated with long-term survival following AMI. This national observational study included 693,388 patients (median age 70.7 years, 452,896 [65.5%] male) from the Myocardial Ischaemia National Audit Project (England and Wales) who were admitted with AMI between 1 January 2003 and 30 June 2013. There were 412,809 (59.5%) patients with multimorbidity at the time of admission with AMI, i.e., having at least 1 of the following long-term health conditions: diabetes, chronic obstructive pulmonary disease or asthma, heart failure, renal failure, cerebrovascular disease, peripheral vascular disease, or hypertension. Those with heart failure, renal failure, or cerebrovascular disease had the worst outcomes (39.5 [95% CI 39.0-40.0], 38.2 [27.7-26.8], and 26.6 [25.2-26.4] deaths per 100 person-years, respectively). Latent class analysis revealed 3 multimorbidity phenotype clusters: (1) a high multimorbidity class, with concomitant heart failure, peripheral vascular disease, and hypertension, (2) a medium multimorbidity class, with peripheral vascular disease and hypertension, and (3) a low multimorbidity class. Patients in class 1 were less likely to receive pharmacological therapies compared with class 2 and 3 patients (including aspirin, 83.8% versus 87.3% and 87.2%, respectively; β-blockers, 74.0% versus 80.9% and 81.4%; and statins, 80.6% versus 85.9% and 85.2%). Flexible parametric survival modelling indicated that patients in class 1 and class 2 had a 2.4-fold (95% CI 2.3-2.5) and 1.5-fold (95% CI 1.4-1.5) increased risk of death and a loss in life expectancy of 2.89 and 1.52 years, respectively, compared with those in class 3 over the 8.4-year follow-up period. 
    The study was limited to all-cause mortality due to the lack of available cause-specific mortality data. However, we isolated the disease-specific association with mortality by providing the loss in life expectancy following AMI according to multimorbidity phenotype cluster compared with the general age-, sex-, and year-matched population. Multimorbidity among patients with AMI was common, and conferred an accumulative increased risk of death. Three multimorbidity phenotype clusters that were significantly associated with loss in life expectancy were identified and should be a concomitant treatment target to improve cardiovascular outcomes. ClinicalTrials.gov NCT03037255.

  9. Detection of sunn pest-damaged wheat samples using visible/near-infrared spectroscopy based on pattern recognition.

    PubMed

    Basati, Zahra; Jamshidi, Bahareh; Rasekh, Mansour; Abbaspour-Gilandeh, Yousef

    2018-05-30

    The presence of sunn pest-damaged grains in wheat mass reduces the quality of flour and bread produced from it. Therefore, it is essential to assess the quality of the samples in collecting and storage centers of wheat and flour mills. In this research, the capability of visible/near-infrared (Vis/NIR) spectroscopy combined with pattern recognition methods was investigated for discrimination of wheat samples with different percentages of sunn pest damage. To this end, various samples belonging to five classes (healthy and 5%, 10%, 15% and 20% unhealthy) were analyzed using Vis/NIR spectroscopy (wavelength range of 350-1000 nm) based on both supervised and unsupervised pattern recognition methods. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used as the unsupervised techniques, and soft independent modeling of class analogies (SIMCA) and partial least squares-discriminant analysis (PLS-DA) as the supervised methods. The results showed that Vis/NIR spectra of healthy samples were correctly clustered using both PCA and HCA. Due to the high overlap between the four unhealthy classes (5%, 10%, 15% and 20%), it was not possible to discriminate all the unhealthy samples into individual classes. However, when considering only the two main categories of healthy and unhealthy, an acceptable degree of separation between the classes can be obtained after classification with the supervised pattern recognition methods SIMCA and PLS-DA. SIMCA based on PCA modeling correctly classified samples into the two classes of healthy and unhealthy with a classification accuracy of 100%. Moreover, the wavelengths of 839 nm, 918 nm and 995 nm had more power than other wavelengths to discriminate the two classes of healthy and unhealthy. It was also concluded that PLS-DA provides excellent classification results for healthy and unhealthy samples (R² = 0.973 and RMSECV = 0.057).
    Therefore, Vis/NIR spectroscopy based on pattern recognition techniques can be useful for rapidly distinguishing healthy wheat samples from those damaged by sunn pest in the maintenance and processing centers. Copyright © 2018 Elsevier B.V. All rights reserved.

  10. Utility of Metabolomics toward Assessing the Metabolic Basis of Quality Traits in Apple Fruit with an Emphasis on Antioxidants

    PubMed Central

    Cuthbertson, Daniel; Andrews, Preston K.; Reganold, John P.; Davies, Neal M.; Lange, B. Markus

    2012-01-01

    A gas chromatography–mass spectrometry approach was employed to evaluate the use of metabolite patterns to differentiate fruit from six commercially grown apple cultivars harvested in 2008. Principal component analysis (PCA) of apple fruit peel and flesh data indicated that individual cultivar replicates clustered together and were separated from all other cultivar samples. An independent metabolomics investigation with fruit harvested in 2003 confirmed the separate clustering of fruit from different cultivars. Further evidence for cultivar separation was obtained using a hierarchical clustering analysis. An evaluation of PCA component loadings revealed specific metabolite classes that contributed the most to each principal component, whereas a correlation analysis demonstrated that specific metabolites correlate directly with quality traits such as antioxidant activity, total phenolics, and total anthocyanins, which are important parameters in the selection of breeding germplasm. These data sets lay the foundation for elucidating the metabolic basis of commercially important fruit quality traits. PMID:22881116

  11. A clustering-based graph Laplacian framework for value function approximation in reinforcement learning.

    PubMed

    Xu, Xin; Huang, Zhenhua; Graves, Daniel; Pedrycz, Witold

    2014-12-01

    In order to deal with the sequential decision problems with large or continuous state spaces, feature representation and function approximation have been a major research topic in reinforcement learning (RL). In this paper, a clustering-based graph Laplacian framework is presented for feature representation and value function approximation (VFA) in RL. By making use of clustering-based techniques, that is, K-means clustering or fuzzy C-means clustering, a graph Laplacian is constructed by subsampling in Markov decision processes (MDPs) with continuous state spaces. The basis functions for VFA can be automatically generated from spectral analysis of the graph Laplacian. The clustering-based graph Laplacian is integrated with a class of approximation policy iteration algorithms called representation policy iteration (RPI) for RL in MDPs with continuous state spaces. Simulation and experimental results show that, compared with previous RPI methods, the proposed approach needs fewer sample points to compute an efficient set of basis functions and the learning control performance can be improved for a variety of parameter settings.

  12. Design of partially supervised classifiers for multispectral image data

    NASA Technical Reports Server (NTRS)

    Jeon, Byeungwoo; Landgrebe, David

    1993-01-01

    A partially supervised classification problem is addressed, especially when the class definition and corresponding training samples are provided a priori only for just one particular class. In practical applications of pattern classification techniques, a frequently observed characteristic is the heavy, often nearly impossible requirements on representative prior statistical class characteristics of all classes in a given data set. Considering the effort in both time and man-power required to have a well-defined, exhaustive list of classes with a corresponding representative set of training samples, this 'partially' supervised capability would be very desirable, assuming adequate classifier performance can be obtained. Two different classification algorithms are developed to achieve simplicity in classifier design by reducing the requirement of prior statistical information without sacrificing significant classifying capability. The first one is based on optimal significance testing, where the optimal acceptance probability is estimated directly from the data set. In the second approach, the partially supervised classification is considered as a problem of unsupervised clustering with initially one known cluster or class. A weighted unsupervised clustering procedure is developed to automatically define other classes and estimate their class statistics. The operational simplicity thus realized should make these partially supervised classification schemes very viable tools in pattern classification.

  13. Ergatis: a web interface and scalable software system for bioinformatics workflows

    PubMed Central

    Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.

    2010-01-01

    Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634

  14. The effectivenes of science domain-based science learning integrated with local potency

    NASA Astrophysics Data System (ADS)

    Kurniawati, Arifah Putri; Prasetyo, Zuhdan Kun; Wilujeng, Insih; Suryadarma, I. Gusti Putu

    2017-08-01

    This research aimed to determine the significant effect of science domain-based science learning integrated with local potency on science process skills. The research method was a quasi-experiment with a nonequivalent control group design. The population of this research was all students of class VII at SMP Negeri 1 Muntilan. The sample was selected through cluster random sampling, namely class VII B as the experiment class (24 students) and class VII C as the control class (24 students). This research used a test instrument that was adapted from Agus Dwianto's research. The aspects of science process skills in this research were observation, classification, interpretation and communication. The data were analyzed using one-factor ANOVA at the 0.05 significance level and the normalized gain score. The significance value for science process skills from the one-factor ANOVA is 0.000, which is below alpha (0.05). This means that there was a significant effect of science domain-based science learning integrated with local potency on science process skills. The analysis shows that the normalized gain scores are 0.29 (low category) in the control class and 0.67 (medium category) in the experiment class.
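
    The normalized gain categories quoted above can be illustrated with a short sketch. It assumes the widely used Hake definition, g = (post - pre) / (max - pre), with cut-offs at 0.3 and 0.7; the abstract does not spell out its formula or thresholds, and the pre/post scores below are hypothetical.

```python
def normalized_gain(pre, post, max_score=100.0):
    """Hake-style normalized gain: the fraction of possible improvement achieved."""
    return (post - pre) / (max_score - pre)


def gain_category(g):
    """Assumed cut-offs: g < 0.3 low, 0.3 <= g < 0.7 medium, otherwise high."""
    if g < 0.3:
        return "low"
    if g < 0.7:
        return "medium"
    return "high"


# The two gain scores reported in the abstract.
print(gain_category(0.29))  # control class
print(gain_category(0.67))  # experiment class

# Hypothetical class averages: pre-test 40, post-test 80, maximum 100.
example = normalized_gain(40, 80)
print(round(example, 2), gain_category(example))
```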

  15. Entanglement enhancement through multirail noise reduction for continuous-variable measurement-based quantum-information processing

    NASA Astrophysics Data System (ADS)

    Su, Yung-Chao; Wu, Shin-Tza

    2017-09-01

    We study theoretically the teleportation of a controlled-phase (cz) gate through measurement-based quantum-information processing for continuous-variable systems. We examine the degree of entanglement in the output modes of the teleported cz-gate for two classes of resource states: the canonical cluster states that are constructed via direct implementations of two-mode squeezing operations and the linear-optical version of cluster states which are built from linear-optical networks of beam splitters and phase shifters. In order to reduce the excess noise arising from finite-squeezed resource states, teleportation through resource states with different multirail designs will be considered and the enhancement of entanglement in the teleported cz gates will be analyzed. For multirail cluster with an arbitrary number of rails, we obtain analytical expressions for the entanglement in the output modes and analyze in detail the results for both classes of resource states. At the same time, we also show that for uniformly squeezed clusters the multirail noise reduction can be optimized when the excess noise is allocated uniformly to the rails. To facilitate the analysis, we develop a trick with manipulations of quadrature operators that can reveal rather efficiently the measurement sequence and corrective operations needed for the measurement-based gate teleportation, which will also be explained in detail.

  16. Ethnic differences in the clustering and outcomes of health behaviours during pregnancy: results from the Born in Bradford cohort.

    PubMed

    Petherick, E S; Fairley, L; Parslow, R C; McEachan, R; Tuffnell, D; Pickett, K E; Leon, D; Lawlor, D A; Wright, J

    2017-09-01

    Pregnancy is a time of optimal motivation for many women to make positive behavioural changes. We aim to describe pregnant women with similar patterns of self-reported health behaviours and examine associations with birth outcomes. We examined the clustering of multiple health behaviours during pregnancy in the Born in Bradford cohort, including smoking, physical inactivity, vitamin D supplementation and exposure to second-hand smoke. Latent class analysis was used to identify groups of individuals with similar patterns of health behaviours separately for White British (WB) and Pakistani mothers. Multinomial regression was then used to examine the association between group membership and birth outcomes, which included preterm birth and mean birthweight. For WB mothers, offspring of those in the 'Unhealthiest' group had lower mean birthweight than those in the 'Mostly healthy but inactive' class, although no association was observed for preterm birth. For Pakistani mothers, group membership was not associated with birthweight differences, although the odds of preterm birth were higher in 'Inactive smokers' compared to the 'Mostly healthy but inactive' group. The use of latent class methods provides important information about the clustering of health behaviours which can be used to target population segments requiring behaviour change interventions considering multiple risk factors. Given the dominant negative association of smoking with the birth outcomes investigated, latent class groupings of other health behaviours may not confer additional risk information for these outcomes. © The Author 2016. Published by Oxford University Press on behalf of Faculty of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  17. Quantum cluster algebras and quantum nilpotent algebras.

    PubMed

    Goodearl, Kenneth R; Yakimov, Milen T

    2014-07-08

    A major direction in the theory of cluster algebras is to construct (quantum) cluster algebra structures on the (quantized) coordinate rings of various families of varieties arising in Lie theory. We prove that all algebras in a very large axiomatically defined class of noncommutative algebras possess canonical quantum cluster algebra structures. Furthermore, they coincide with the corresponding upper quantum cluster algebras. We also establish analogs of these results for a large class of Poisson nilpotent algebras. Many important families of coordinate rings are subsumed in the class we are covering, which leads to a broad range of applications of the general results to the above-mentioned types of problems. As a consequence, we prove the Berenstein-Zelevinsky conjecture [Berenstein A, Zelevinsky A (2005) Adv Math 195:405-455] for the quantized coordinate rings of double Bruhat cells and construct quantum cluster algebra structures on all quantum unipotent groups, extending the theorem of Geiß et al. [Geiß C, et al. (2013) Selecta Math 19:337-397] for the case of symmetric Kac-Moody groups. Moreover, we prove that the upper cluster algebras of Berenstein et al. [Berenstein A, et al. (2005) Duke Math J 126:1-52] associated with double Bruhat cells coincide with the corresponding cluster algebras.

  18. Quantum cluster algebras and quantum nilpotent algebras

    PubMed Central

    Goodearl, Kenneth R.; Yakimov, Milen T.

    2014-01-01

    A major direction in the theory of cluster algebras is to construct (quantum) cluster algebra structures on the (quantized) coordinate rings of various families of varieties arising in Lie theory. We prove that all algebras in a very large axiomatically defined class of noncommutative algebras possess canonical quantum cluster algebra structures. Furthermore, they coincide with the corresponding upper quantum cluster algebras. We also establish analogs of these results for a large class of Poisson nilpotent algebras. Many important families of coordinate rings are subsumed in the class we are covering, which leads to a broad range of applications of the general results to the above-mentioned types of problems. As a consequence, we prove the Berenstein–Zelevinsky conjecture [Berenstein A, Zelevinsky A (2005) Adv Math 195:405–455] for the quantized coordinate rings of double Bruhat cells and construct quantum cluster algebra structures on all quantum unipotent groups, extending the theorem of Geiß et al. [Geiß C, et al. (2013) Selecta Math 19:337–397] for the case of symmetric Kac–Moody groups. Moreover, we prove that the upper cluster algebras of Berenstein et al. [Berenstein A, et al. (2005) Duke Math J 126:1–52] associated with double Bruhat cells coincide with the corresponding cluster algebras. PMID:24982197

  19. Investigation of correlation classification techniques

    NASA Technical Reports Server (NTRS)

    Haskell, R. E.

    1975-01-01

    A two-step classification algorithm for processing multispectral scanner data was developed and tested. The first step is a single pass clustering algorithm that assigns each pixel, based on its spectral signature, to a particular cluster. The output of that step is a cluster tape in which a single integer is associated with each pixel. The cluster tape is used as the input to the second step, where ground truth information is used to classify each cluster using an iterative method of potentials. Once the clusters have been assigned to classes the cluster tape is read pixel-by-pixel and an output tape is produced in which each pixel is assigned to its proper class. In addition to the digital classification programs, a method of using correlation clustering to process multispectral scanner data in real time by means of an interactive color video display is also described.
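
    The two-step flow described above can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: the distance threshold, the running-mean center update, and the pixel values are hypothetical, and a simple majority vote over ground-truth pixels stands in for the iterative method of potentials used in the second step.

```python
import math
from collections import Counter


def single_pass_cluster(pixels, radius):
    """Step 1: assign each pixel to the nearest cluster center in one pass.

    A pixel farther than `radius` (assumed threshold) from every center
    starts a new cluster. Returns one integer cluster label per pixel,
    playing the role of the "cluster tape".
    """
    centers, counts, labels = [], [], []
    for p in pixels:
        best, best_d = None, None
        for k, c in enumerate(centers):
            d = math.dist(p, c)
            if best_d is None or d < best_d:
                best, best_d = k, d
        if best is None or best_d > radius:
            centers.append(list(p))
            counts.append(1)
            labels.append(len(centers) - 1)
        else:
            counts[best] += 1
            n = counts[best]
            # Running mean keeps the center at the average of its members.
            centers[best] = [c + (x - c) / n for c, x in zip(centers[best], p)]
            labels.append(best)
    return labels


def classify_clusters(labels, ground_truth):
    """Step 2 (simplified): label each cluster by majority vote of ground truth.

    ground_truth maps pixel index -> known class name.
    """
    votes = {}
    for idx, cls in ground_truth.items():
        votes.setdefault(labels[idx], Counter())[cls] += 1
    return {k: v.most_common(1)[0][0] for k, v in votes.items()}


# Hypothetical two-band spectral signatures.
pixels = [(10, 12), (11, 13), (50, 48), (49, 51), (12, 11)]
labels = single_pass_cluster(pixels, radius=10.0)
cluster_to_class = classify_clusters(labels, {0: "water", 2: "forest"})
# Final pass: read the cluster labels pixel by pixel and emit the class.
output = [cluster_to_class.get(k, "unknown") for k in labels]
print(labels, output)
```

    Separating the two steps means the expensive clustering pass runs once, while the cluster-to-class mapping can be redone whenever new ground truth becomes available.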

  20. Probing cluster surface morphology by cryo spectroscopy of N2 on cationic nickel clusters

    NASA Astrophysics Data System (ADS)

    Dillinger, Sebastian; Mohrbach, Jennifer; Niedner-Schatteburg, Gereon

    2017-11-01

    We present the cryogenic (26 K) IR spectra of selected [Nin(N2)m]+ (n = 5-20, m = 1 - mmax), which strongly reveal n- and m-dependent features in the N2 stretching region, in conjunction with density functional theory modeling of some of these findings. The observed spectral features allow us to refine the kinetic classification [cf. J. Mohrbach, S. Dillinger, and G. Niedner-Schatteburg, J. Chem. Phys. 147, 184304 (2017)] and to define four classes of structure related surface adsorption behavior: Class (1) of Ni6+, Ni13+, and Ni19+ are highly symmetrical clusters with all smooth surfaces of equally coordinated Ni atoms that entertain stepwise N2 adsorption up to stoichiometric N2:Nisurface saturation. Class (2) of Ni12+ and Ni18+ are highly symmetrical clusters minus one. Their relaxed smooth surfaces reorganize by enhanced N2 uptake toward some low coordinated Ni surface atoms with double N2 occupation. Class (3) of Ni5+ and Ni7+ through Ni11+ are small clusters of rough surfaces with low coordinated Ni surface atoms, and some reveal semi-internal Ni atoms of high next-neighbor coordination. Surface reorganization upon N2 uptake turns rough into rough surface by Ni atom migration and turns octahedral based structures into pentagonal bipyramidal structures. Class (4) of Ni14+ through Ni17+ and Ni20+ are large clusters with rough and smooth surface areas. They possess smooth icosahedral surfaces with some proximate capping atom(s) on one hemisphere of the icosahedron with the other one largely unaffected.

  1. Quantification of plaque area and characterization of plaque biochemical composition with atherosclerosis progression in ApoE/LDLR(-/-) mice by FT-IR imaging.

    PubMed

    Wrobel, Tomasz P; Mateuszuk, Lukasz; Kostogrys, Renata B; Chlopicki, Stefan; Baranska, Malgorzata

    2013-11-07

    In this work the quantitative determination of atherosclerotic lesion area (ApoE/LDLR(-/-) mice) by FT-IR imaging is presented and validated by comparison with atherosclerotic lesion area determination by classic Oil Red O staining. Cluster analysis of FT-IR-based measurements in the 2800-3025 cm(-1) range allowed for quantitative analysis of the atherosclerosis plaque area, the results of which were highly correlated with those of Oil Red O histological staining (R(2) = 0.935). Moreover, a specific class obtained from a second cluster analysis of the aortic cross-section samples at different stages of disease progression (3, 4 and 6 months old) seemed to represent the macrophages (CD68) area within the atherosclerotic plaque.

  2. Cluster Analysis for Cognitive Diagnosis: Theory and Applications

    ERIC Educational Resources Information Center

    Chiu, Chia-Yi; Douglas, Jeffrey A.; Li, Xiaodong

    2009-01-01

    Latent class models for cognitive diagnosis often begin with specification of a matrix that indicates which attributes or skills are needed for each item. Then by imposing restrictions that take this into account, along with a theory governing how subjects interact with items, parametric formulations of item response functions are derived and…

  3. A Systematic Evaluation of ADHD and Comorbid Psychopathology in a Population-Based Twin Sample

    ERIC Educational Resources Information Center

    Volk, Heather E.; Neuman, Rosalind J.; Todd, Richard D.

    2005-01-01

    Objective: Clinical and population samples demonstrate that attention-deficit/hyperactivity disorder (ADHD) occurs with other disorders. Comorbid disorder clustering within ADHD subtypes is not well studied. Method: Latent class analysis (LCA) examined the co-occurrence of DSM-IV ADHD, oppositional defiant disorder (ODD), conduct disorder (CD),…

  4. Classic fungal natural products in the genomic age: the molecular legacy of Harold Raistrick.

    PubMed

    Schor, Raissa; Cox, Russell

    2018-03-01

    Covering: 1893 to 2017. Harold Raistrick was involved in the discovery of many of the most important classes of fungal metabolites during the 20th century. This review focusses on how these discoveries led to developments in isotopic labelling, biomimetic chemistry and the discovery, analysis and exploitation of biosynthetic gene clusters for major classes of fungal metabolites including: alternariol; geodin and metabolites of the emodin pathway; maleidrides; citrinin and the azaphilones; dehydrocurvularin; mycophenolic acid; and the tropolones. Key recent advances in the molecular understanding of these important pathways, including the discovery of biosynthetic gene clusters, the investigation of the molecular and chemical aspects of key biosynthetic steps, and the reengineering of key components of the pathways are reviewed and compared. Finally, discussion of key relationships between metabolites and pathways and the most important recent advances and opportunities for future research directions are given.

  5. Using spatial analysis to demonstrate the heterogeneity of the cardiovascular drug-prescribing pattern in Taiwan

    PubMed Central

    2011-01-01

    Background: Geographic Information Systems (GIS) combined with spatial analytical methods could be helpful in examining patterns of drug use. Little attention has been paid to geographic variation of cardiovascular prescription use in Taiwan. The main objective was to use local spatial association statistics to test whether or not the cardiovascular medication-prescribing pattern is homogenous across 352 townships in Taiwan. Methods: The statistical methods used were the global measures of Moran's I and Local Indicators of Spatial Association (LISA). While Moran's I provides information on the overall spatial distribution of the data, LISA provides information on types of spatial association at the local level. LISA statistics can also be used to identify influential locations in spatial association analysis. The major classes of prescription cardiovascular drugs were taken from Taiwan's National Health Insurance Research Database (NHIRD), which has a coverage rate of over 97%. The dosage of each prescription was converted into defined daily doses to measure the consumption of each class of drugs. Data were analyzed with ArcGIS and GeoDa at the township level. Results: The LISA statistics showed an unusual use of cardiovascular medications in the southern townships with high local variation. Patterns of drug use also showed more low-low spatial clusters (cold spots) than high-high spatial clusters (hot spots), and those low-low associations were clustered in the rural areas. Conclusions: The cardiovascular drug prescribing patterns were heterogeneous across Taiwan. In particular, a clear pattern of north-south disparity exists. Such spatial clustering helps prioritize the target areas that require better education concerning drug use. PMID:21609462
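To make the distinction between the global and local statistics in this record concrete, here is a minimal sketch of Moran's I and the local Moran statistic (one common LISA). It uses the textbook definitions, not the ArcGIS or GeoDa implementations the study used; the weight matrix, the function names, and the toy data in the note below are assumptions for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij:
    I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z_i = x_i - mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)          # total weight
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den

def local_morans(values, weights):
    """Local Moran I_i = (z_i / m2) * sum_j w_ij z_j, with m2 = sum_i z_i^2 / n.
    Positive I_i marks a high-high or low-low neighbourhood."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(zi * zi for zi in z) / n
    return [z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
            for i in range(n)]
```

With these definitions the local values sum to the global statistic times the total weight, so each township's contribution to the overall pattern can be read off directly. For four hypothetical townships in a row with binary neighbour weights, a steadily increasing series gives a positive I, while an alternating low/high series gives a negative I.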

  6. New data clustering for RBF classifier of agriculture products from x-ray images

    NASA Astrophysics Data System (ADS)

    Casasent, David P.; Chen, Xuewen

    1999-08-01

    Classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a subsystem for automated non-invasive detection of defective product items on a conveyor belt. We discuss the use of clustering and how it is vital to achieve useful classification. New clustering methods using class identity and new cluster classes are advanced and shown to be of use for this application. Radial basis function neural net classifiers are emphasized. We expect our results to be of use for other classifiers and applications.

  7. Revealing cell cycle control by combining model-based detection of periodic expression with novel cis-regulatory descriptors

    PubMed Central

    Andersson, Claes R; Hvidsten, Torgeir R; Isaksson, Anders; Gustafsson, Mats G; Komorowski, Jan

    2007-01-01

    Background: We address the issue of explaining the presence or absence of phase-specific transcription in budding yeast cultures under different conditions. To this end we use a model-based detector of gene expression periodicity to divide genes into classes depending on their behavior in experiments using different synchronization methods. While computational inference of gene regulatory circuits typically relies on expression similarity (clustering) in order to find classes of potentially co-regulated genes, this method instead takes advantage of known time profile signatures related to the studied process. Results: We explain the regulatory mechanisms of the inferred periodic classes with cis-regulatory descriptors that combine upstream sequence motifs with experimentally determined binding of transcription factors. By systematic statistical analysis we show that periodic classes are best explained by combinations of descriptors rather than single descriptors, and that different combinations correspond to periodic expression in different classes. We also find evidence for additive regulation in that the combinations of cis-regulatory descriptors associated with genes periodically expressed in fewer conditions are frequently subsets of combinations associated with genes periodically expressed in more conditions. Finally, we demonstrate that our approach retrieves combinations that are more specific towards known cell-cycle related regulators than the frequently used clustering approach. Conclusion: The results illustrate how a model-based approach to expression analysis may be particularly well suited to detect biologically relevant mechanisms. Our new approach makes it possible to provide more refined hypotheses about regulatory mechanisms of the cell cycle and it can easily be adjusted to reveal regulation of other, non-periodic, cellular processes. PMID:17939860

  8. On the analysis of large data sets

    NASA Astrophysics Data System (ADS)

    Ruch, Gerald T., Jr.

    We present a set of tools and techniques for performing detailed comparisons between computational models with high dimensional parameter spaces and large sets of archival data. By combining a principal component analysis of a large grid of samples from the model with an artificial neural network, we create a powerful data visualization tool as well as a way to robustly recover physical parameters from a large set of experimental data. Our techniques are applied in the context of circumstellar disks, the likely sites of planetary formation. An analysis is performed applying the two layer approximation of Chiang et al. (2001) and Dullemond et al. (2001) to the archive created by the Spitzer Space Telescope Cores to Disks Legacy program. We find two populations of disk sources. The first population is characterized by the lack of a puffed up inner rim while the second population appears to contain an inner rim which casts a shadow across the disk. The first population also exhibits a trend of increasing spectral index while the second population exhibits a decreasing trend in the strength of the 20 μm silicate emission feature. We also present images of the giant molecular cloud W3 obtained with the Infrared Array Camera (IRAC) and the Multiband Imaging Photometer (MIPS) on board the Spitzer Space Telescope. The images encompass the star forming regions W3 Main, W3(OH), and a region that we refer to as the Central Cluster which encloses the emission nebula IC 1795. We present a star count analysis of the point sources detected in W3. The star count analysis shows that the stellar population of the Central Cluster, when compared to that in the background, contains an overdensity of sources. The Central Cluster also contains an excess of sources with colors consistent with Class II Young Stellar Objects (YSOs). An analysis of the color-color diagrams also reveals a large number of Class II YSOs in the Central Cluster.
Our results suggest that an earlier epoch of star formation created the Central Cluster, created a cavity, and triggered the active star formation in the W3 Main and W3(OH) regions. We also detect a new outflow and its candidate exciting star.

  9. Students' Perceptions of Motivational Climate and Enjoyment in Finnish Physical Education: A Latent Profile Analysis.

    PubMed

    Jaakkola, Timo; Wang, C K John; Soini, Markus; Liukkonen, Jarmo

    2015-09-01

    The purpose of this study was to identify student clusters with homogenous profiles in perceptions of task- and ego-involving, autonomy, and social relatedness supporting motivational climate in school physical education. Additionally, we investigated whether different motivational climate groups differed in their enjoyment in PE. Participants of the study were 2,594 girls and 1,803 boys, aged 14-15 years. Students responded to questionnaires assessing their perception of motivational climate and enjoyment in physical education. Latent profile analyses produced a five-cluster solution labeled 1) 'low autonomy, relatedness, task, and moderate ego climate' group, 2) 'low autonomy, relatedness, and high task and ego climate' group, 3) 'moderate autonomy, relatedness, task and ego climate' group, 4) 'high autonomy, relatedness, task, and moderate ego climate' group, and 5) 'high relatedness and task but moderate autonomy and ego climate' group. Analyses of variance showed that students in clusters 4 and 5 perceived the highest level of enjoyment whereas students in cluster 1 experienced the lowest level of enjoyment. The results showed that the students' perceptions of various motivational climates created differential levels of enjoyment in PE classes. Key points: Latent profile analyses produced a five-cluster solution labeled 1) 'low autonomy, relatedness, task, and moderate ego climate' group, 2) 'low autonomy, relatedness, and high task and ego climate' group, 3) 'moderate autonomy, relatedness, task and ego climate' group, 4) 'high autonomy, relatedness, task, and moderate ego climate' group, and 5) 'high relatedness and task but moderate autonomy and ego climate' group. Analyses of variance showed that clusters 4 and 5 perceived the highest level of enjoyment whereas cluster 1 experienced the lowest level of enjoyment. The results showed that the students' perceptions of motivational climate create differential levels of enjoyment in PE classes.

  10. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences.

    PubMed

    Medema, Marnix H; Blin, Kai; Cimermancic, Peter; de Jager, Victor; Zakrzewski, Piotr; Fischbach, Michael A; Weber, Tilmann; Takano, Eriko; Breitling, Rainer

    2011-07-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide variety of microbes. However, rapidly and reliably pinpointing all the potential gene clusters for secondary metabolites in dozens of newly sequenced genomes has been extremely challenging, due to their biochemical heterogeneity, the presence of unknown enzymes and the dispersed nature of the necessary specialized bioinformatics tools and resources. Here, we present antiSMASH (antibiotics & Secondary Metabolite Analysis Shell), the first comprehensive pipeline capable of identifying biosynthetic loci covering the whole range of known secondary metabolite compound classes (polyketides, non-ribosomal peptides, terpenes, aminoglycosides, aminocoumarins, indolocarbazoles, lantibiotics, bacteriocins, nucleosides, beta-lactams, butyrolactones, siderophores, melanins and others). It aligns the identified regions at the gene cluster level to their nearest relatives from a database containing all other known gene clusters, and integrates or cross-links all previously available secondary-metabolite specific gene analysis methods in one interactive view. antiSMASH is available at http://antismash.secondarymetabolites.org.

  11. Integrative analysis of signaling pathways and diseases associated with the miR-106b/25 cluster and their function study in berberine-induced multiple myeloma cells.

    PubMed

    Gu, Chunming; Li, Tianfu; Yin, Zhao; Chen, Shengting; Fei, Jia; Shen, Jianping; Zhang, Yuan

    2017-05-01

    Berberine (BBR), a traditional Chinese herbal medicine compound, has emerged as a novel class of anti-tumor agent. Our previous microRNA (miRNA) microarray demonstrated that miR-106b/25 was significantly down-regulated in BBR-treated multiple myeloma (MM) cells. Here, systematic integration showed that the miR-106b/25 cluster is involved in multiple cancer-related signaling pathways and tumorigenesis. The MiREnvironment database revealed that multiple environmental factors (drug, ionizing radiation, hypoxia) affected miR-106b/25 cluster expression. By targeting the seed region in the miRNA, tiny anti-mir106b/25 cluster (t-anti-mir106b/25 cluster) significantly suppressed cell viability and colony formation. Western blot validated that t-anti-miR-106b/25 cluster effectively inhibited the expression of P38 MAPK and phospho-P38 MAPK in MM cells. These findings indicate that the miR-106b/25 cluster functions as an oncogene and might provide a novel molecular insight into MM.

  12. Possibilistic clustering for shape recognition

    NASA Technical Reports Server (NTRS)

    Keller, James M.; Krishnapuram, Raghu

    1993-01-01

    Clustering methods have been used extensively in computer vision and pattern recognition. Fuzzy clustering has been shown to be advantageous over crisp (or traditional) clustering in that total commitment of a vector to a given class is not required at each iteration. Recently fuzzy clustering methods have shown spectacular ability to detect not only hypervolume clusters, but also clusters which are actually 'thin shells', i.e., curves and surfaces. Most analytic fuzzy clustering approaches are derived from Bezdek's Fuzzy C-Means (FCM) algorithm. The FCM uses the probabilistic constraint that the memberships of a data point across classes sum to one. This constraint was used to generate the membership update equations for an iterative algorithm. Unfortunately, the memberships resulting from FCM and its derivatives do not correspond to the intuitive concept of degree of belonging, and moreover, the algorithms have considerable trouble in noisy environments. Recently, the clustering problem was cast into the framework of possibility theory. Our approach was radically different from the existing clustering methods in that the resulting partition of the data can be interpreted as a possibilistic partition, and the membership values may be interpreted as degrees of possibility of the points belonging to the classes. An appropriate objective function whose minimum will characterize a good possibilistic partition of the data was constructed, and the membership and prototype update equations from necessary conditions for minimization of our criterion function were derived. The ability of this approach to detect linear and quartic curves in the presence of considerable noise is shown.
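The contrast this abstract draws between the probabilistic (sum-to-one) constraint of FCM and possibilistic memberships can be illustrated with the membership formulas alone. The sketch below uses the commonly published forms of the two updates, which the abstract itself does not spell out: the fuzzifier `m`, the per-cluster scale parameters `etas`, and the function names should be read as assumptions.

```python
def fcm_memberships(sq_dists, m=2.0):
    """Fuzzy C-Means memberships of one point, given its squared distance to
    each prototype. Memberships are forced to sum to one across clusters.
    Assumes all distances are nonzero."""
    exponent = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** exponent for d in sq_dists]
    total = sum(inv)
    return [v / total for v in inv]

def pcm_memberships(sq_dists, etas, m=2.0):
    """Possibilistic memberships: each cluster is scored on its own, using a
    scale eta_i (roughly, the squared distance at which membership is 0.5)."""
    exponent = 1.0 / (m - 1.0)
    return [1.0 / (1.0 + (d / eta) ** exponent)
            for d, eta in zip(sq_dists, etas)]
```

A point lying far from two prototypes but equally far from both gets FCM memberships of 0.5 and 0.5, the same as a point sitting close between them, whereas its possibilistic memberships are both near zero. That difference is one way to read the abstract's remarks about degree of belonging and behaviour in noise.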

  13. Possibilistic clustering for shape recognition

    NASA Technical Reports Server (NTRS)

    Keller, James M.; Krishnapuram, Raghu

    1992-01-01

    Clustering methods have been used extensively in computer vision and pattern recognition. Fuzzy clustering has been shown to be advantageous over crisp (or traditional) clustering in that total commitment of a vector to a given class is not required at each iteration. Recently fuzzy clustering methods have shown spectacular ability to detect not only hypervolume clusters, but also clusters which are actually 'thin shells', i.e., curves and surfaces. Most analytic fuzzy clustering approaches are derived from Bezdek's Fuzzy C-Means (FCM) algorithm. The FCM uses the probabilistic constraint that the memberships of a data point across classes sum to one. This constraint was used to generate the membership update equations for an iterative algorithm. Unfortunately, the memberships resulting from FCM and its derivatives do not correspond to the intuitive concept of degree of belonging, and moreover, the algorithms have considerable trouble in noisy environments. Recently, we cast the clustering problem into the framework of possibility theory. Our approach was radically different from the existing clustering methods in that the resulting partition of the data can be interpreted as a possibilistic partition, and the membership values may be interpreted as degrees of possibility of the points belonging to the classes. We constructed an appropriate objective function whose minimum will characterize a good possibilistic partition of the data, and we derived the membership and prototype update equations from necessary conditions for minimization of our criterion function. In this paper, we show the ability of this approach to detect linear and quartic curves in the presence of considerable noise.

  14. Real Time Intelligent Target Detection and Analysis with Machine Vision

    NASA Technical Reports Server (NTRS)

    Howard, Ayanna; Padgett, Curtis; Brown, Kenneth

    2000-01-01

    We present an algorithm for detecting a specified set of targets for an Automatic Target Recognition (ATR) application. ATR involves processing images for detecting, classifying, and tracking targets embedded in a background scene. We address the problem of discriminating between targets and nontarget objects in a scene by evaluating 40x40 image blocks belonging to an image. Each image block is first projected onto a set of templates specifically designed to separate images of targets embedded in a typical background scene from those background images without targets. These filters are found using directed principal component analysis which maximally separates the two groups. The projected images are then clustered into one of n classes based on a minimum distance to a set of n cluster prototypes. These cluster prototypes have previously been identified using a modified clustering algorithm based on prior sensed data. Each projected image pattern is then fed into the associated cluster's trained neural network for classification. A detailed description of our algorithm will be given in this paper. We outline our methodology for designing the templates, describe our modified clustering algorithm, and provide details on the neural network classifiers. Evaluation of the overall algorithm demonstrates that our detection rates approach 96% with a false positive rate of less than 0.03%.
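The pipeline outlined in this abstract (project a block onto templates, assign it to the nearest of n cluster prototypes, then hand it to that cluster's classifier) might be sketched as below. Everything concrete here is a placeholder: the templates, prototypes, and per-cluster classifiers are hypothetical callables and lists, not the directed principal component filters or trained neural networks of the paper.

```python
import math

def project(block, templates):
    """Project a flattened image block onto each template (dot products)."""
    return [sum(b * t for b, t in zip(block, tpl)) for tpl in templates]

def nearest_prototype(features, prototypes):
    """Index of the cluster prototype at minimum Euclidean distance."""
    return min(range(len(prototypes)),
               key=lambda k: math.dist(features, prototypes[k]))

def detect(block, templates, prototypes, classifiers):
    """Route the projected block to the classifier of its nearest cluster.
    `classifiers[k]` is any callable taking the feature vector (assumed)."""
    features = project(block, templates)
    k = nearest_prototype(features, prototypes)
    return k, classifiers[k](features)
```

Routing by cluster means each classifier only has to separate targets from background within one region of feature space, which is presumably why the authors train one network per cluster.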

  15. Job Satisfaction among Health-Care Staff in Township Health Centers in Rural China: Results from a Latent Class Analysis.

    PubMed

    Wang, Haipeng; Tang, Chengxiang; Zhao, Shichao; Meng, Qingyue; Liu, Xiaoyun

    2017-09-22

    Background: Lower job satisfaction among health-care staff leads to more brain drain, worse work performance, and poorer health-care outcomes. The aim of this study was to identify patterns of job satisfaction among health-care staff in rural China, and to investigate the association between the latent clusters and health-care staff's personal and professional features. Methods: We selected 12 items of five-point Likert scale questions to measure job satisfaction. A latent-class analysis was performed to identify subgroups based on the items of job satisfaction. Results: Four latent classes of job satisfaction were identified: 8.9% had high job satisfaction, belonging to "satisfied class"; 38.2% had low job satisfaction, named as "unsatisfied class"; 30.5% were categorized into "unsatisfied class with the exception of interpersonal relationships"; 22.4% were identified as "pseudo-satisfied class", only satisfied with management-oriented items. Low job satisfaction was associated with specialty, training opportunity, and income inequality. Conclusions: Only a minority of health-care staff belong to the "satisfied class". Three of the four subgroups are not satisfied with income, benefit, training, and career development. Targeted policy interventions should be implemented to improve the items of job satisfaction based on the patterns and health-care staff's features.

  16. High- and low-level hierarchical classification algorithm based on source separation process

    NASA Astrophysics Data System (ADS)

    Loghmari, Mohamed Anis; Karray, Emna; Naceur, Mohamed Saber

    2016-10-01

    High-dimensional data applications have earned great attention in recent years. We focus on remote sensing data analysis on high-dimensional space like hyperspectral data. From a methodological viewpoint, remote sensing data analysis is not a trivial task. Its complexity is caused by many factors, such as large spectral or spatial variability as well as the curse of dimensionality. The latter describes the problem of data sparseness. In this particular ill-posed problem, a reliable classification approach requires appropriate modeling of the classification process. The proposed approach is based on a hierarchical clustering algorithm in order to deal with remote sensing data in high-dimensional space. Indeed, one obvious method to perform dimensionality reduction is to use the independent component analysis process as a preprocessing step. The first particularity of our method is the special structure of its cluster tree. Most of the hierarchical algorithms associate leaves to individual clusters, and start from a large number of individual classes equal to the number of pixels; however, in our approach, leaves are associated with the most relevant sources which are represented according to mutually independent axes to specifically represent some land covers associated with a limited number of clusters. These sources contribute to the refinement of the clustering by providing complementary rather than redundant information. The second particularity of our approach is that at each level of the cluster tree, we combine both a high-level divisive clustering and a low-level agglomerative clustering. This approach reduces the computational cost since the high-level divisive clustering is controlled by a simple Boolean operator, and optimizes the clustering results since the low-level agglomerative clustering is guided by the most relevant independent sources. 
Then at each new step we obtain a new finer partition that will participate in the clustering process to enhance semantic capabilities and give good identification rates.

  17. Evaluation of SLAR and thematic mapper MSS data for forest cover mapping using computer-aided analysis techniques

    NASA Technical Reports Server (NTRS)

    Hoffer, R. M. (Principal Investigator); Knowlton, D. J.; Dean, M. E.

    1981-01-01

    Supervised and cluster block training statistics were used to analyze the thematic mapper simulation MSS data (both 1979 and 1980 data sets). Cover information classes identified on SAR imagery include: hardwood, pine, mixed pine hardwood, clearcut, pasture, crops, emergent crops, bare soil, urban, and water. Preliminary analysis of the HH and HV polarized SAR data indicate a high variance associated with each information class except for water and bare soil. The large variance for most spectral classes suggests that while the means might be statistically separable, an overlap may exist between the classes which could introduce a significant classification error. The quantitative values of many cover types are much larger on the HV polarization than on the HH, thereby indicating the relative nature of the digitized data values. The mean values of the spectral classes in the areas with larger look angles are greater than the means of the same cover type in other areas having steeper look angles. Difficulty in accurately overlaying the dual polarization of the SAR data was resolved.

  18. Uncertainties in the cluster-cluster correlation function

    NASA Astrophysics Data System (ADS)

    Ling, E. N.; Frenk, C. S.; Barrow, J. D.

    1986-12-01

    The bootstrap resampling technique is applied to estimate sampling errors and significance levels of the two-point correlation functions determined for a subset of the CfA redshift survey of galaxies and a redshift sample of 104 Abell clusters. The angular correlation function for a sample of 1664 Abell clusters is also calculated. The standard errors in xi(r) for the Abell data are found to be considerably larger than quoted 'Poisson errors'. The best estimate for the ratio of the correlation length of Abell clusters (richness class R ≥ 1, distance class D ≤ 4) to that of CfA galaxies is 4.2 (+1.4, -1.0; 68 percentile error). The enhancement of cluster clustering over galaxy clustering is statistically significant in the presence of resampling errors. The uncertainties found do not include the effects of possible systematic biases in the galaxy and cluster catalogs and could be regarded as lower bounds on the true uncertainty range.
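As a rough illustration of the bootstrap procedure this abstract relies on, the sketch below resamples a data set with replacement, recomputes a statistic each time, and reports a standard error together with a central 68 percent interval. It is generic: the statistic is any function of the sample (the paper's statistic is a correlation function, which is not reproduced here), and the function name, resample count, and seed are assumptions.

```python
import random
import statistics

def bootstrap(data, statistic, n_boot=2000, seed=0):
    """Bootstrap standard error and 16th/84th percentile bounds of
    `statistic` evaluated on resamples of `data` drawn with replacement."""
    rng = random.Random(seed)          # fixed seed for repeatable results
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    std_err = statistics.stdev(estimates)
    lower = estimates[int(0.16 * n_boot)]
    upper = estimates[int(0.84 * n_boot)]
    return std_err, lower, upper
```

Because the percentile bounds come straight from the resampled distribution, they need not be symmetric about the point estimate, which is how asymmetric errors such as the +1.4 and -1.0 quoted above can arise.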

  19. From dust to light: a study of star formation in NGC2264

    NASA Astrophysics Data System (ADS)

    Teixeira, P. S.

    2008-10-01

    The goal of this dissertation is to characterize the star formation history of the young cluster NGC2264 using the unique observational capabilities of the Spitzer Space Telescope. The motivation to conduct this study stems from the fact that most stars are formed within clusters, so the formation and evolution of the latter will affect the stellar mass distribution in the field. Detailed observational studies of young stellar clusters are therefore crucial to provide necessary constraints for theoretical models of cloud and cluster formation and evolution. This study also addresses the evolution of circumstellar disks in NGC2264; empirical knowledge of protoplanetary disk evolution is required for the understanding of how planetary systems such as our own form. The first result obtained from this study was both completely new and unexpected. A dense region within NGC2264 was found to be teeming with bright 24 μm Class I protostars; these sources are embedded within dense submillimeter cores and are spatially distributed along dense filamentary fingers of gas and dust that radially converge on a B-type binary Class I source. This cluster of protostars was baptized the "Spokes cluster" and its analysis provided further insight into the role of thermal support during core formation, collapse and fragmentation. The nearest neighbor projected separation distribution of these Class I sources shows a characteristic spacing that is similar to the Jeans length for the region, indicating that the dusty filaments may have undergone thermal fragmentation. The submillimeter cores of the Spokes cluster were observed at 230 GHz using the SubMillimeter Array (SMA) and the resulting high resolution (~1.3") continuum observations revealed a dense grouping of 7 Class 0 sources embedded within a particular core, D-MM1 (~20"x20"). The compact sources have masses ranging between 0.4 M☉ and 1.2 M☉, and radii of ~600 AU.
The mean separation of the Class 0 sources within D-MM1 is considerably smaller than the characteristic spacing between the Class I sources in the larger Spokes cluster and is consistent with hierarchical thermal fragmentation of the dense molecular gas in this region. The results obtained by the study of the Spokes cluster show that the spatial substructuring of a cluster or subcluster is correlated with age, i.e., groupings of very young protostars have clearly more concentrated and substructured spatial distributions. The Spokes cluster could thus be one of several building blocks of NGC2264, and will likely expand and disperse its members through the surrounding region, adding to the rest of NGC2264's stellar population. To further explore this scenario, I identified Pre-Main Sequence (PMS) disk bearing sources in the whole region of NGC2264, as surveyed by the InfraRed Array Camera (IRAC), analyzing both their spatial distributions and ages. Of the 1404 sources detected in all four IRAC bands, 116 sources were found to have anemic IRAC disks and 217 sources were found to have thick IRAC disks; the disk fraction was calculated to be 37.5%±6.3% and found to be a function of spectral type, increasing for later type sources. I identified 4 candidate sources with transition disks (disks with inner holes), as well as 6 sources with anemic inner disks and thick outer disks that could be the immediate precursors of transition disks. This is a relevant result for it suggests planet formation may be occurring in the inner disk at very early ages. I found that the spatial distribution of the disk-bearing sources was a function of both disk type and amount of reddening. This spatial analysis enabled the identification of three groups of sources, namely, (i) embedded (A_V > 3 magnitudes) sources with thick disks, (ii) unembedded sources with thick disks, and (iii) sources with anemic disks.
The first group was found to have a median age of 1 Myr and its spatial distribution is highly concentrated and substructured. The second group, (ii), has a median age of 2 Myr and its spatial distribution is less concentrated and substructured than group (i), but more than the group of sources with anemic disks - the spatial distribution of this third group (age ~ 2 Myr) is not substructured and is more distributed, showing no particular peak or concentration. The star formation history of NGC2264 appears to be as follows: the northern region appears to have undergone the first epoch or episode of star formation, while the second epoch is currently occurring in the center (Spokes cluster) and south (near Allen's source).
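
    The nearest neighbor projected separation statistic mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the dissertation's code: the function name, the coordinate convention (flat projected x, y positions in one consistent unit) and the demo positions are all assumptions made for illustration. In practice the peak or median of this distribution would be compared with an independently estimated Jeans length.

```python
import math
import statistics


def nearest_neighbor_separations(positions):
    """Return each source's projected distance to its nearest neighbor.

    positions: list of (x, y) tuples in one consistent linear unit
    (hypothetical projected coordinates, not data from the abstract).
    """
    if len(positions) < 2:
        raise ValueError("need at least two sources")
    separations = []
    for i, (xi, yi) in enumerate(positions):
        # Brute-force search over all other sources; fine for small samples.
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(positions)
            if j != i
        )
        separations.append(nearest)
    return separations


# Made-up positions for illustration only.
demo_positions = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
demo_separations = nearest_neighbor_separations(demo_positions)
print(demo_separations, statistics.median(demo_separations))
```

    The brute-force loop is quadratic in the number of sources, which is acceptable for samples of a few dozen protostars; a spatial index would be the usual choice for larger catalogs.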

  20. Using targeted active-learning exercises and diagnostic question clusters to improve students' understanding of carbon cycling in ecosystems.

    PubMed

    Maskiewicz, April Cordero; Griscom, Heather Peckham; Welch, Nicole Turrill

    2012-01-01

    In this study, we used targeted active-learning activities to help students improve their ways of reasoning about carbon flow in ecosystems. The results of a validated ecology conceptual inventory (diagnostic question clusters [DQCs]) provided us with information about students' understanding of and reasoning about transformation of inorganic and organic carbon-containing compounds in biological systems. These results helped us identify specific active-learning exercises that would be responsive to students' existing knowledge. The effects of the active-learning interventions were then examined through analysis of students' pre- and postinstruction responses on the DQCs. The biology and non-biology majors participating in this study attended a range of institutions and the instructors varied in their use of active learning; one lecture-only comparison class was included. Changes in pre- to postinstruction scores on the DQCs showed that an instructor's teaching method had a highly significant effect on student reasoning following course instruction, especially for questions pertaining to cellular-level, carbon-transforming processes. We conclude that using targeted in-class activities had a beneficial effect on student learning regardless of major or class size, and argue that using diagnostic questions to identify effective learning activities is a valuable strategy for promoting learning, as gains from lecture-only classes were minimal.

  1. Using Targeted Active-Learning Exercises and Diagnostic Question Clusters to Improve Students' Understanding of Carbon Cycling in Ecosystems

    PubMed Central

    Maskiewicz, April Cordero; Griscom, Heather Peckham; Welch, Nicole Turrill

    2012-01-01

    In this study, we used targeted active-learning activities to help students improve their ways of reasoning about carbon flow in ecosystems. The results of a validated ecology conceptual inventory (diagnostic question clusters [DQCs]) provided us with information about students' understanding of and reasoning about transformation of inorganic and organic carbon-containing compounds in biological systems. These results helped us identify specific active-learning exercises that would be responsive to students' existing knowledge. The effects of the active-learning interventions were then examined through analysis of students' pre- and postinstruction responses on the DQCs. The biology and non–biology majors participating in this study attended a range of institutions and the instructors varied in their use of active learning; one lecture-only comparison class was included. Changes in pre- to postinstruction scores on the DQCs showed that an instructor's teaching method had a highly significant effect on student reasoning following course instruction, especially for questions pertaining to cellular-level, carbon-transforming processes. We conclude that using targeted in-class activities had a beneficial effect on student learning regardless of major or class size, and argue that using diagnostic questions to identify effective learning activities is a valuable strategy for promoting learning, as gains from lecture-only classes were minimal. PMID:22383618

  2. Visualization of heterogeneity and regional grading of gliomas by multiple features using magnetic resonance-based clustered images.

    PubMed

    Inano, Rika; Oishi, Naoya; Kunieda, Takeharu; Arakawa, Yoshiki; Kikuchi, Takayuki; Fukuyama, Hidenao; Miyamoto, Susumu

    2016-07-26

    Preoperative glioma grading is important for therapeutic strategies and influences prognosis. Intratumoral heterogeneity can cause an underestimation of grading because of the sampling error in biopsies. We developed a voxel-based unsupervised clustering method with multiple magnetic resonance imaging (MRI)-derived features using a self-organizing map followed by K-means. This method produced novel magnetic resonance-based clustered images (MRcIs) that enabled the visualization of glioma grades in 36 patients. The 12-class MRcIs revealed the highest classification performance for the prediction of glioma grading (area under the receiver operating characteristic curve = 0.928; 95% confidence interval = 0.920-0.936). Furthermore, we also created 12-class MRcIs in four new patients using the previous data from the 36 patients as training data and obtained tissue sections of classes 11 and 12, which were significantly higher in high-grade gliomas (HGGs), and those of classes 4, 5 and 9, which were not significantly different between HGGs and low-grade gliomas (LGGs), according to a MRcI-based navigational system. The tissues of classes 11 and 12 showed features of malignant glioma, whereas those of classes 4, 5 and 9 showed LGGs without anaplastic features. These results suggest that the proposed voxel-based clustering method provides new insights into preoperative regional glioma grading.

  3. The Four U's: Latent Classes of Hookup Motivations Among College Students.

    PubMed

    Uecker, Jeremy E; Pearce, Lisa D; Andercheck, Brita

    2015-06-01

    College students' "hookups" have been the subject of a great deal of research in recent years. Motivations for hooking up have been linked to differences in well-being after the hookup, but studies detailing college students' motivations for engaging in hookups focus on single motivations. Using data from the 2010 Duke Hookup Survey, we consider how motivations for hooking up cluster to produce different classes, or profiles, of students who hook up, and how these classes are related to hookup regret. Four distinct classes of motivations emerged from our latent class analysis: Utilitarians (50%), Uninhibiteds (27%), Uninspireds (19%), and Unreflectives (4%). We find a number of differences in hookup motivation classes across social characteristics, including gender, year in school, race-ethnicity, self-esteem, and attitudes about sexual behavior outside committed relationships. Additionally, Uninspireds regret hookups more frequently than members of the other classes, and Uninhibiteds report regret less frequently than Utilitarians and Uninspireds. These findings reveal the complexity of motivations for hooking up and the link between motivations and regret.

  4. Customized recommendations for production management clusters of North American automatic milking systems.

    PubMed

    Tremblay, Marlène; Hess, Justin P; Christenson, Brock M; McIntyre, Kolby K; Smink, Ben; van der Kamp, Arjen J; de Jong, Lisanne G; Döpfer, Dörte

    2016-07-01

    Automatic milking systems (AMS) are implemented in a variety of situations and environments. Consequently, there is a need to characterize individual farming practices and regional challenges to streamline management advice and objectives for producers. Benchmarking is often used in the dairy industry to compare farms by computing percentile ranks of the production values of groups of farms. Grouping for conventional benchmarking is commonly limited to the use of a few factors such as farms' geographic region or breed of cattle. We hypothesized that herds' production data and management information could be clustered in a meaningful way using cluster analysis and that this clustering approach would yield better peer groups of farms than benchmarking methods based on criteria such as country, region, breed, or breed and region. By applying mixed latent-class model-based cluster analysis to 529 North American AMS dairy farms with respect to 18 significant risk factors, 6 clusters were identified. Each cluster (i.e., peer group) represented unique management styles, challenges, and production patterns. When compared with peer groups based on criteria similar to the conventional benchmarking standards, the 6 clusters better predicted milk produced (kilograms) per robot per day. Each cluster represented a unique management and production pattern that requires specialized advice. For example, cluster 1 farms were those that recently installed AMS robots, whereas cluster 3 farms (the most northern farms) fed high amounts of concentrates through the robot to compensate for low-energy feed in the bunk. In addition to general recommendations for farms within a cluster, individual farms can generate their own specific goals by comparing themselves to farms within their cluster. This is very comparable to benchmarking but adds the specific characteristics of the peer group, resulting in better farm management advice. 
The improvement that cluster analysis allows for is characterized by the multivariable approach and the fact that comparisons between production units can be accomplished within a cluster and between clusters as a choice. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
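
    The idea of benchmarking a farm against its own cluster rather than against all farms can be sketched as follows. This is a minimal illustration, not the authors' method: the record layout, the function name and the mid-rank tie convention are assumptions, and the cluster labels are taken as given (the latent-class clustering step itself is not reproduced here).

```python
from collections import defaultdict


def percentile_ranks_within_groups(records):
    """Percentile rank (0-100) of each unit among units in the same group.

    records: list of (unit_id, group_label, value) tuples, e.g. a farm id,
    its cluster label, and milk per robot per day (hypothetical layout).
    Ties use the mid-rank convention: half of the tied peers count as below.
    """
    values_by_group = defaultdict(list)
    for _unit_id, group, value in records:
        values_by_group[group].append(value)

    ranks = {}
    for unit_id, group, value in records:
        peers = values_by_group[group]
        below = sum(1 for v in peers if v < value)
        equal = sum(1 for v in peers if v == value)  # includes the unit itself
        ranks[unit_id] = 100.0 * (below + 0.5 * equal) / len(peers)
    return ranks


# Made-up example: three farms in cluster 1, one farm in cluster 2.
demo = [("a", 1, 10.0), ("b", 1, 20.0), ("c", 1, 30.0), ("d", 2, 5.0)]
print(percentile_ranks_within_groups(demo))
```

    Farm "d" has the lowest value overall but sits at the middle of its own (single-member) peer group, which is the kind of difference between global and within-cluster comparison that the abstract describes.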

  5. Galaxy evolution in the densest environments: HST imaging

    NASA Astrophysics Data System (ADS)

    Jorgensen, Inger

    2013-10-01

    We propose to process in a consistent fashion all available HST/ACS and WFC3 imaging of seven rich clusters of galaxies at z=1.2-1.6. The clusters are part of our larger project aimed at constraining models for galaxy evolution in dense environments from observations of stellar populations in rich z=1.2-2 galaxy clusters. The main objective is to establish the star formation (SF) history and structural evolution over this epoch during which large changes in SF rates and galaxy structure are expected to take place in cluster galaxies. The observational data required to meet our main objective are deep HST imaging and high S/N spectroscopy of individual cluster members. The HST imaging already exists for the seven rich clusters at z=1.2-1.6 included in this archive proposal. However, the data have not been consistently processed to derive colors, magnitudes, sizes and morphological parameters for all potential cluster members bright enough to be suitable for spectroscopic observations with 8-m class telescopes. We propose to carry out this processing and make all derived parameters publicly available. We will use the parameters derived from the HST imaging to (1) study the structural evolution of the galaxies, (2) select clusters and galaxies for spectroscopic observations, and (3) use the photometry and spectroscopy together for a unified analysis aimed at the SF history and structural changes. The analysis will also utilize data from the Gemini/HST Cluster Galaxy Project, which covers rich clusters at z=0.2-1.0 and for which we have similar HST imaging and high S/N spectroscopy available.

  6. Response to traumatic brain injury neurorehabilitation through an artificial intelligence and statistics hybrid knowledge discovery from databases methodology.

    PubMed

    Gibert, Karina; García-Rudolph, Alejandro; García-Molina, Alberto; Roig-Rovira, Teresa; Bernabeu, Montse; Tormos, José María

    2008-01-01

    Develop a classificatory tool to identify different populations of patients with Traumatic Brain Injury based on the characteristics of deficit and response to treatment. A KDD framework was used in which, first, descriptive statistics of every variable were computed, followed by data cleaning and selection of relevant variables. Then the data were mined using a generalization of Clustering based on rules (CIBR), a hybrid AI and Statistics technique which combines inductive learning (AI) and clustering (Statistics). A prior Knowledge Base (KB) is considered to properly bias the clustering; semantic constraints implied by the KB hold in the final clusters, guaranteeing interpretability of the results. A generalization (Exogenous Clustering based on rules, ECIBR) is presented, allowing the KB to be defined in terms of variables which will not be considered in the clustering process itself, to get more flexibility. Several tools, such as the Class panel graph, are introduced in the methodology to assist final interpretation. A set of 5 classes was recommended by the system and interpretation permitted profile labeling. From the medical point of view, the composition of the classes corresponds well with different patterns of increasing level of response to rehabilitation treatments. All the patients initially assessable form a single group. Severely impaired patients are subdivided into four profiles with clearly distinct response patterns. Particularly interesting is the partial response profile, in which patients could not improve executive functions. Meaningful classes were obtained and, from a semantics point of view, the results were notably improved relative to classical clustering, in line with our view that hybrid AI & Stats techniques are more powerful for KDD than pure ones.

  7. Validating clustering of molecular dynamics simulations using polymer models.

    PubMed

    Phillips, Joshua L; Colvin, Michael E; Newsam, Shawn

    2011-11-14

    Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. 
To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers.

  8. Validating clustering of molecular dynamics simulations using polymer models

    PubMed Central

    2011-01-01

    Background: Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. Results: We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. Conclusions: We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. 
To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers. PMID:22082218

  9. Latent Profile Analysis and Conversion to Psychosis: Characterizing Subgroups to Enhance Risk Prediction.

    PubMed

    Healey, Kristin M; Penn, David L; Perkins, Diana; Woods, Scott W; Keefe, Richard S E; Addington, Jean

    2018-02-15

    Groups at clinical high risk (CHR) of developing psychosis are heterogeneous, composed of individuals with different clusters of symptoms. It is likely that there exist subgroups, each associated with different symptom constellations and probabilities of conversion. The present study used latent profile analysis (LPA) to ascertain subgroups in a combined sample of CHR (n = 171) and help-seeking controls (HSCs; n = 100; PREDICT study). Indicators in the LPA model included baseline Scale of Prodromal Symptoms (SOPS), Calgary Depression Scale for Schizophrenia (CDSS), and neurocognitive performance as measured by multiple instruments, including category instances (CAT). Subgroups were further characterized using covariates measuring demographic and clinical features. Three classes emerged: class 1 (mild, transition rate 5.6%), lowest SOPS and depression scores, intact neurocognitive performance; class 2 (paranoid-affective, transition rate 14.2%), highest suspiciousness, mild negative symptoms, moderate depression; and class 3 (negative-neurocognitive, transition rate 29.3%), highest negative symptoms, neurocognitive impairment, social cognitive impairment. Classes 2 and 3 evidenced poor social functioning. Results support a subgroup approach to research, assessment, and treatment of help-seeking individuals. Class 3 may be an early risk stage of developing schizophrenia.

  10. Automated method to differentiate between native and mirror protein models obtained from contact maps.

    PubMed

    Kurczynska, Monika; Kotulska, Malgorzata

    2018-01-01

    Mirror protein structures are often considered as artifacts in modeling protein structures. However, they may soon become a new branch of biochemistry. Moreover, methods of protein structure reconstruction, based on their residue-residue contact maps, need methodology to differentiate between models of native and mirror orientation, especially regarding the reconstructed backbones. We analyzed 130 500 structural protein models obtained from contact maps of 1 305 SCOP domains belonging to all 7 structural classes. On average, the same numbers of native and mirror models were obtained among 100 models generated for each domain. Since their structural features are often not sufficient for differentiating between the two types of model orientations, we proposed to apply various energy terms (ETs) from PyRosetta to separate native and mirror models. To automate the procedure for differentiating these models, the k-means clustering algorithm was applied. Using total energy did not yield appropriate clusters; the accuracy of the clustering for class A (all helices) was no more than 0.52. Therefore, we tested a series of different k-means clusterings based on various combinations of ETs. Finally, applying the two most differentiating ETs for each class yielded satisfactory results. To unify the method for differentiating between native and mirror models, independent of their structural class, the two best ETs for each class were considered. Finally, the k-means clustering algorithm used three common ETs: probability of amino acid assuming certain values of dihedral angles Φ and Ψ, Ramachandran preferences and Coulomb interactions. The accuracies of clustering with these ETs were in the range between 0.68 and 0.76, with sensitivity and selectivity in the range between 0.68 and 0.87, depending on the structural class. 
The method can be applied to all fully-automated tools for protein structure reconstruction based on contact maps, especially those analyzing big sets of models.
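
    The evaluation described in this abstract (k-means with two clusters, then an accuracy against known native/mirror labels) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the feature tuples stand in for energy terms and are made up, no PyRosetta call is shown, the farthest-point initialization is a choice made here for determinism, and scoring the better of the two cluster-to-class mappings is one reasonable way to define clustering accuracy.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_two_clusters(points, max_iterations=100):
    """Lloyd's algorithm with k=2 on equal-length feature tuples.

    Initialization (an assumption of this sketch): the first point and the
    point farthest from it, which keeps the result deterministic.
    """
    centers = [points[0], max(points, key=lambda p: squared_distance(p, points[0]))]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its closest center.
        new_labels = [
            0 if squared_distance(p, centers[0]) <= squared_distance(p, centers[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def two_class_accuracy(labels, truth):
    """Fraction correct under the better of the two cluster-to-class mappings."""
    agree = sum(1 for lab, t in zip(labels, truth) if lab == t)
    return max(agree, len(labels) - agree) / len(labels)


# Made-up two-feature "energy term" values; truth: 0 = native, 1 = mirror.
demo_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
demo_truth = [0, 0, 0, 1, 1, 1]
demo_labels = kmeans_two_clusters(demo_points)
print(demo_labels, two_class_accuracy(demo_labels, demo_truth))
```

    Because cluster indices are arbitrary, an accuracy near 0.5 (such as the 0.52 reported for total energy) indicates that the clusters carry almost no information about orientation.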

  11. Evolution of homeobox genes.

    PubMed

    Holland, Peter W H

    2013-01-01

    Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78 The author declares that he has no conflicts of interest. Copyright © 2012 Wiley Periodicals, Inc.

  12. Calcisponges have a ParaHox gene and dynamic expression of dispersed NK homeobox genes.

    PubMed

    Fortunato, Sofia A V; Adamski, Marcin; Ramos, Olivia Mendivil; Leininger, Sven; Liu, Jing; Ferrier, David E K; Adamska, Maja

    2014-10-30

    Sponges are simple animals with few cell types, but their genomes paradoxically contain a wide variety of developmental transcription factors, including homeobox genes belonging to the Antennapedia (ANTP) class, which in bilaterians encompass Hox, ParaHox and NK genes. In the genome of the demosponge Amphimedon queenslandica, no Hox or ParaHox genes are present, but NK genes are linked in a tight cluster similar to the NK clusters of bilaterians. It has been proposed that Hox and ParaHox genes originated from NK cluster genes after divergence of sponges from the lineage leading to cnidarians and bilaterians. On the other hand, synteny analysis lends support to the notion that the absence of Hox and ParaHox genes in Amphimedon is a result of secondary loss (the ghost locus hypothesis). Here we analysed complete suites of ANTP-class homeoboxes in two calcareous sponges, Sycon ciliatum and Leucosolenia complicata. Our phylogenetic analyses demonstrate that these calcisponges possess orthologues of bilaterian NK genes (Hex, Hmx and Msx), a varying number of additional NK genes and one ParaHox gene, Cdx. Despite the generation of scaffolds spanning multiple genes, we find no evidence of clustering of Sycon NK genes. All Sycon ANTP-class genes are developmentally expressed, with patterns suggesting their involvement in cell type specification in embryos and adults, metamorphosis and body plan patterning. These results demonstrate that ParaHox genes predate the origin of sponges, thus confirming the ghost locus hypothesis, and highlight the need to analyse the genomes of multiple sponge lineages to obtain a complete picture of the ancestral composition of the first animal genome.

  13. Remote Sensing techniques used to characterize soil erosion in southwestern Sao Paulo state. M.S. Thesis - 29 Sep. 1982; [Brazil

    NASA Technical Reports Server (NTRS)

    Parada, N. D. J. (Principal Investigator); Pinto, S. D. A. F.

    1983-01-01

    Within randomly sampled squares of a 1 km x 1 km grid, rill/gullies frequency, land cover/land use type and shape of the slopes were extracted from aerial photographs of the Ribeirao Anhumas drainage basin. Mean slope gradient, stream frequency and slope length were calculated on topographic maps. Ground truth data on fine sand/coarse sand ratio and vegetation cover densities were obtained. The MSS-LANDSAT-2 data (CCTs) were analyzed using single-cell, cluster synthesis and slicer algorithms. Graphical and statistical analyses of the data indicate that different slope gradients and land cover/land use types are the most significant factors related to the soil erosion process. The digital analysis of MSS data allowed the association among gray level classes and vegetation cover classes, which defined seven classes. These gray level classes and slope gradient classes were used to rank erosion risk.

  14. CHANDRA/ACIS-I STUDY OF THE X-RAY PROPERTIES OF THE NGC 6611 AND M16 STELLAR POPULATIONS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guarcello, M. G.; Drake, J. J.; Caramazza, M.

    2012-07-10

    Mechanisms regulating the origin of X-rays in young stellar objects and the correlation with their evolutionary stage are under debate. Studies of the X-ray properties in young clusters allow us to understand these mechanisms. One ideal target for this analysis is the Eagle Nebula (M16), with its central cluster NGC 6611. At 1750 pc from the Sun, it harbors 93 OB stars, together with a population of low-mass stars from embedded protostars to disk-less Class III objects, with age ≤3 Myr. We study an archival 78 ks Chandra/ACIS-I observation of NGC 6611 and two new 80 ks observations of the outer region of M16, one centered on the Column V and the other on a region of the molecular cloud with ongoing star formation. We detect 1755 point sources with 1183 candidate cluster members (219 disk-bearing and 964 disk-less). We study the global X-ray properties of M16 and compare them with those of the Orion Nebula Cluster. We also compare the level of X-ray emission of Class II and Class III stars and analyze the X-ray spectral properties of OB stars. Our study supports the lower level of X-ray activity for the disk-bearing stars with respect to the disk-less members. The X-ray luminosity function (XLF) of M16 is similar to that of Orion, supporting the universality of the XLF in young clusters. Eighty-five percent of the O stars of NGC 6611 have been detected in X-rays. With only one possible exception, they show soft spectra with no hard components, indicating that mechanisms for the production of hard X-ray emission in O stars are not operating in NGC 6611.

  15. Molecular reclassification of Crohn's disease: a cautionary note on population stratification.

    PubMed

    Maus, Bärbel; Jung, Camille; Mahachie John, Jestinah M; Hugot, Jean-Pierre; Génin, Emmanuelle; Van Steen, Kristel

    2013-01-01

    Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn's disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn's disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification in order to avoid that detected clusters are strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals.

  16. Molecular Reclassification of Crohn’s Disease: A Cautionary Note on Population Stratification

    PubMed Central

    Maus, Bärbel; Jung, Camille; Mahachie John, Jestinah M.; Hugot, Jean-Pierre; Génin, Emmanuelle; Van Steen, Kristel

    2013-01-01

    Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn’s disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn’s disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification in order to avoid that detected clusters are strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals. PMID:24147066

  17. Two-Year Predictive Validity of Conduct Disorder Subtypes in Early Adolescence: A Latent Class Analysis of a Canadian Longitudinal Sample

    ERIC Educational Resources Information Center

    Lacourse, Eric; Baillargeon, Raymond; Dupere, Veronique; Vitaro, Frank; Romano, Elisa; Tremblay, Richard

    2010-01-01

    Background: Investigating the latent structure of conduct disorder (CD) can help clarify how symptoms related to aggression, property destruction, theft, and serious violations of rules cluster in individuals with this disorder. Discovering homogeneous subtypes can be useful for etiologic, treatment, and prevention purposes depending on the…

  18. Intergroup Stereotypes of Working Class Blacks and Whites: Implications for Stereotype Threat.

    ERIC Educational Resources Information Center

    Niemann, Yolanda Flores; O'Connor, Elizabeth; McClorie, Randall

    1998-01-01

    Examined stereotypes of urban blacks and whites at a flea market with 68 black respondents, and at another flea market with 20 white respondents. Cluster-analysis results show that blacks have a relatively complex, multidimensional representation of themselves and of whites, while whites seem to have a more simplistic and negative view of blacks.…

  19. Fingerprint analysis of Hibiscus mutabilis L. leaves based on ultra performance liquid chromatography with photodiode array detector combined with similarity analysis and hierarchical clustering analysis methods

    PubMed Central

    Liang, Xianrui; Ma, Meiling; Su, Weike

    2013-01-01

    Background: A method for chemical fingerprint analysis of Hibiscus mutabilis L. leaves was developed based on ultra performance liquid chromatography with photodiode array detector (UPLC-PAD) combined with similarity analysis (SA) and hierarchical clustering analysis (HCA). Materials and Methods: 10 batches of Hibiscus mutabilis L. leaf samples were collected from different regions of China. UPLC-PAD was employed to collect chemical fingerprints of Hibiscus mutabilis L. leaves. Results: The relative standard deviations (RSDs) of the relative retention times (RRT) and relative peak areas (RPA) of 10 characteristic peaks (one of them was identified as rutin) in the precision, repeatability and stability tests were less than 3%, and the fingerprint analysis method was validated as suitable for Hibiscus mutabilis L. leaves. Conclusions: Similarity analysis, based on the correlation coefficients between each pair of fingerprints, showed abundant qualitative diversity of chemical constituents among the 10 batches of Hibiscus mutabilis L. leaf samples from different locations. Moreover, the HCA method clustered the samples into four classes, and the HCA dendrogram showed the close or distant relations among the 10 samples, which was consistent with the SA result to some extent. PMID:23930008
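
    The similarity analysis step described in this abstract (correlation coefficients between each pair of fingerprints) can be sketched as follows. This is only an illustration, not the authors' procedure: the batch names and relative peak areas are hypothetical, and the helper names `pearson` and `similarity_matrix` are assumptions introduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def similarity_matrix(fingerprints):
    """Pairwise correlation for every ordered pair of named fingerprints."""
    names = list(fingerprints)
    return {
        (a, b): pearson(fingerprints[a], fingerprints[b])
        for a in names
        for b in names
    }


# Hypothetical relative peak areas for four characteristic peaks per batch.
fingerprints = {
    "batch_1": [1.00, 0.42, 0.15, 0.80],
    "batch_2": [1.00, 0.40, 0.18, 0.77],
    "batch_3": [1.00, 0.10, 0.95, 0.20],
}
sim = similarity_matrix(fingerprints)
for (a, b), r in sorted(sim.items()):
    if a < b:
        print(f"{a} vs {b}: r = {r:.3f}")
```

    A matrix of this kind (or 1 - r as a distance) is also the usual input to a hierarchical clustering step, so the two analyses would be expected to agree at least in part.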

  20. Authentication of monofloral Yemeni Sidr honey using ultraviolet spectroscopy and chemometric analysis.

    PubMed

    Roshan, Abdul-Rahman A; Gad, Haidy A; El-Ahmady, Sherweit H; Khanbash, Mohamed S; Abou-Shoer, Mohamed I; Al-Azizi, Mohamed M

    2013-08-14

    This work describes a simple model developed for the authentication of monofloral Yemeni Sidr honey using UV spectroscopy together with chemometric techniques of hierarchical cluster analysis (HCA), principal component analysis (PCA), and soft independent modeling of class analogy (SIMCA). The model was constructed using 13 genuine Sidr honey samples and challenged with 25 honey samples of different botanical origins. HCA and PCA were successfully able to present a preliminary clustering pattern to segregate the genuine Sidr samples from the lower priced local polyfloral and non-Sidr samples. The SIMCA model presented a clear demarcation of the samples and was used to identify genuine Sidr honey samples as well as detect admixture with lower priced polyfloral honey by detection limits >10%. The constructed model presents a simple and efficient method of analysis and may serve as a basis for the authentication of other honey types worldwide.

  1. Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome.

    PubMed

    Tothill, Richard W; Tinker, Anna V; George, Joshy; Brown, Robert; Fox, Stephen B; Lade, Stephen; Johnson, Daryl S; Trivett, Melanie K; Etemadmoghadam, Dariush; Locandro, Bianca; Traficante, Nadia; Fereday, Sian; Hung, Jillian A; Chiew, Yoke-Eng; Haviv, Izhak; Gertig, Dorota; DeFazio, Anna; Bowtell, David D L

    2008-08-15

    The study aimed to identify novel molecular subtypes of ovarian cancer by gene expression profiling with linkage to clinical and pathologic features. Microarray gene expression profiling was done on 285 serous and endometrioid tumors of the ovary, peritoneum, and fallopian tube. K-means clustering was applied to identify robust molecular subtypes. Statistical analysis identified differentially expressed genes, pathways, and gene ontologies. Laser capture microdissection, pathology review, and immunohistochemistry validated the array-based findings. Patient survival within k-means groups was evaluated using Cox proportional hazards models. Class prediction validated k-means groups in an independent dataset. A semisupervised survival analysis of the array data was used to compare against unsupervised clustering results. Optimal clustering of array data identified six molecular subtypes. Two subtypes represented predominantly serous low malignant potential and low-grade endometrioid subtypes, respectively. The remaining four subtypes represented higher grade and advanced stage cancers of serous and endometrioid morphology. A novel subtype of high-grade serous cancers reflected a mesenchymal cell type, characterized by overexpression of N-cadherin and P-cadherin and low expression of differentiation markers, including CA125 and MUC1. A poor prognosis subtype was defined by a reactive stroma gene expression signature, correlating with extensive desmoplasia in such samples. A similar poor prognosis signature could be found using a semisupervised analysis. Each subtype displayed distinct levels and patterns of immune cell infiltration. Class prediction identified similar subtypes in an independent ovarian dataset with similar prognostic trends. Gene expression profiling identified molecular subtypes of ovarian cancer of biological and clinical importance.

  2. Functionally relevant protein motions: Extracting basin-specific collective coordinates from molecular dynamics trajectories

    NASA Astrophysics Data System (ADS)

    Pan, Patricia Wang; Dickson, Russell J.; Gordon, Heather L.; Rothstein, Stuart M.; Tanaka, Shigenori

    2005-01-01

    Functionally relevant motion of proteins has been associated with a number of atoms moving in a concerted fashion along so-called "collective coordinates." We present an approach to extract collective coordinates from conformations obtained from molecular dynamics simulations. The power of this technique for differentiating local structural fluctuations between classes of conformers obtained by clustering is illustrated by analyzing nanosecond-long trajectories for the response regulator protein Spo0F of Bacillus subtilis, generated both in vacuo and using an implicit-solvent representation. Conformational clustering is performed using automated histogram filtering of the inter-Cα distances. Orthogonal (varimax) rotation of the vectors obtained by principal component analysis of these interresidue distances for the members of individual clusters is key to the interpretation of collective coordinates dominating each conformational class. The rotated loadings plots isolate significant variation in interresidue distances, and these are associated with entire mobile secondary structure elements. From this we infer concerted motions of these structural elements. For the Spo0F simulations employing an implicit-solvent representation, collective coordinates obtained in this fashion are consistent with the location of the protein's known active sites and experimentally determined mobile regions.

  3. Classification of neocortical interneurons using affinity propagation.

    PubMed

    Santana, Roberto; McGarry, Laura M; Bielza, Concha; Larrañaga, Pedro; Yuste, Rafael

    2013-01-01

    In spite of over a century of research on cortical circuits, it is still unknown how many classes of cortical neurons exist. In fact, neuronal classification is a difficult problem because it is unclear how to designate a neuronal cell class and what are the best characteristics to define them. Recently, unsupervised classifications using cluster analysis based on morphological, physiological, or molecular characteristics, have provided quantitative and unbiased identification of distinct neuronal subtypes, when applied to selected datasets. However, better and more robust classification methods are needed for increasingly complex and larger datasets. Here, we explored the use of affinity propagation, a recently developed unsupervised classification algorithm imported from machine learning, which gives a representative example or exemplar for each cluster. As a case study, we applied affinity propagation to a test dataset of 337 interneurons belonging to four subtypes, previously identified based on morphological and physiological characteristics. We found that affinity propagation correctly classified most of the neurons in a blind, non-supervised manner. Affinity propagation outperformed Ward's method, a current standard clustering approach, in classifying the neurons into 4 subtypes. Affinity propagation could therefore be used in future studies to validly classify neurons, as a first step to help reverse engineer neural circuits.

  4. Epidemiology of multiple childhood traumatic events: child abuse, parental psychopathology, and other family-level stressors.

    PubMed

    Menard, C B; Bandeen-Roche, K J; Chilcoat, H D

    2004-11-01

    Multiple family-level childhood stressors are common and are correlated. It is unknown if clusters of commonly co-occurring stressors are identifiable. The study was designed to explore family-level stressor clustering in the general population, to estimate the prevalence of exposure classes, and to examine the correlation of sociodemographic characteristics with class prevalence. Data were collected from an epidemiological sample and analyzed using latent class regression. A six-class solution was identified. Classes were characterized by low risk (prevalence=23%), universal high risk (7%), family conflict (11%), household substance problems (22%), non-nuclear family structure (24%), and parent's mental illness (13%). Class prevalence varied with race and welfare status, not gender. Interventions for childhood stressors are person-focused; the analytic approach may uniquely inform resource allocation.

  5. Graph-Based Object Class Discovery

    NASA Astrophysics Data System (ADS)

    Xia, Shengping; Hancock, Edwin R.

    We are interested in the problem of discovering the set of object classes present in a database of images using a weakly supervised graph-based framework. Rather than making use of the "Bag-of-Features (BoF)" approach widely used in current work on object recognition, we represent each image by a graph using a group of selected local invariant features. Using local feature matching and iterative Procrustes alignment, we perform graph matching and compute a similarity measure. Borrowing the idea of query expansion, we develop a similarity propagation based graph clustering (SPGC) method. Using this method, class-specific clusters of the graphs can be obtained. Such a cluster can generally be represented by a higher level graph model whose vertices are the clustered graphs and whose edge weights are determined by the pairwise similarity measure. Experiments are performed on a dataset in which the number of images increases from 1 to 50K and the number of objects increases from 1 to over 500. Some objects have been discovered with total recall and a precision of 1 in a single cluster.

  6. Patterns of HIV Risks and Related Factors among People Who Inject Drugs in Kermanshah, Iran: A Latent Class Analysis.

    PubMed

    Sharifi, Hamid; Mirzazadeh, Ali; Noroozi, Alireza; Marshall, Brandon D L; Farhoudian, Ali; Higgs, Peter; Vameghi, Meroe; Mohhamadi Shahboulaghi, Farahnaz; Qorbani, Mostafa; Massah, Omid; Armoon, Bahram; Noroozi, Mehdi

    2017-01-01

    The objective of this study was to explore patterns of drug use and sexual risk behaviors among people who inject drugs (PWID) in Iran. We surveyed 500 PWID in Kermanshah concerning demographic characteristics, sexual risk behaviors, and drug-related risk behaviors in the month prior to study. We used latent class analysis (LCA) to establish a baseline model of risk profiles and to identify the optimal number of latent classes, and we used ordinal regression to identify factors associated with class membership. Three classes of multiple HIV risk were identified. The probability of membership in the high-risk class was 0.33, compared to 0.26 and 0.40 for the low- and moderate-risk classes, respectively. Compared to members in the lowest-risk class (reference group), the highest-risk class members had higher odds of being homeless (OR = 4.5, CI: 1.44-8.22; p = 0.001) in the past 12 months. Members of the high-risk class had lower odds of regularly visiting a needle and syringe exchange program as compared to the lowest-risk class members (AOR = 0.42, CI: 0.2-0.81; p = 0.01). Findings show the sexual and drug-related HIV risk clusters among PWID in Iran, and emphasize the importance of developing targeted prevention and harm reduction programs for all domains of risk behaviors, both sexual and drug use related.
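
    Once a latent class model of this kind has been fitted, an individual is usually assigned to a class through posterior membership probabilities computed with Bayes' rule under the local independence assumption. The sketch below illustrates that calculation only. The class prevalences (0.26, 0.40, 0.33) are taken from the abstract, but the three binary risk items and their item-response probabilities are hypothetical, and the function name is an assumption introduced here.

```python
def posterior_class_probabilities(priors, item_probs, responses):
    """Posterior P(class | responses) for binary items under local independence.

    priors:     {class_name: prevalence}
    item_probs: {class_name: [P(item_j = 1 | class), ...]}
    responses:  list of 0/1 answers, one per item
    """
    joint = {}
    for cls, prior in priors.items():
        likelihood = 1.0
        for p_yes, answered_yes in zip(item_probs[cls], responses):
            likelihood *= p_yes if answered_yes else 1.0 - p_yes
        joint[cls] = prior * likelihood
    total = sum(joint.values())
    if total == 0:
        raise ValueError("responses have zero probability under every class")
    # Dividing by the total also absorbs rounding in the reported prevalences.
    return {cls: value / total for cls, value in joint.items()}


# Prevalences as reported in the abstract; they sum to 0.99 because of rounding.
priors = {"low": 0.26, "moderate": 0.40, "high": 0.33}

# Hypothetical item-response probabilities for three binary risk behaviors.
item_probs = {
    "low": [0.1, 0.1, 0.2],
    "moderate": [0.5, 0.3, 0.4],
    "high": [0.9, 0.8, 0.7],
}

posterior = posterior_class_probabilities(priors, item_probs, [1, 1, 1])
most_likely = max(posterior, key=posterior.get)
print(most_likely, round(posterior[most_likely], 3))
```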

  7. Classification of asteroid spectra using a neural network

    NASA Technical Reports Server (NTRS)

    Howell, E. S.; Merenyi, E.; Lebofsky, L. A.

    1994-01-01

    The 52-color asteroid survey (Bell et al., 1988) together with the 8-color asteroid survey (Zellner et al., 1985) provide a data set of asteroid spectra spanning 0.3-2.5 micrometers. An artificial neural network clusters these asteroid spectra based on their similarity to each other. We have also trained the neural network with a categorization learning output layer in a supervised mode to associate the established clusters with taxonomic classes. Results of our classification agree with Tholen's classification based on the 8-color data alone. When extending the spectral range using the 52-color survey data, we find that some modification of the Tholen classes is indicated to produce a cleaner, self-consistent set of taxonomic classes. After supervised training using our modified classes, the network correctly classifies both the training examples, and additional spectra into the correct class with an average of 90% accuracy. Our classification supports the separation of the K class from the S class, as suggested by Bell et al. (1987), based on the near-infrared spectrum. We define two end-member subclasses which seem to have compositional significance within the S class: the So class, which is olivine-rich and red, and the Sp class, which is pyroxene-rich and less red. The remaining S-class asteroids have intermediate compositions of both olivine and pyroxene and moderately red continua. The network clustering suggests some additional structure within the E-, M-, and P-class asteroids, even in the absence of albedo information, which is the only discriminant between these in the Tholen classification. New relationships are seen between the C class and related G, B, and F classes. However, in both cases, the number of spectra is too small to interpret or determine the significance of these separations.

  8. The association between school exclusion, delinquency and subtypes of cyber- and F2F-victimizations: identifying and predicting risk profiles and subtypes using latent class analysis.

    PubMed

    Barboza, Gia Elise

    2015-01-01

    The purpose of this paper is to identify risk profiles of youth who are victimized by on- and offline harassment and to explore the consequences of victimization on school outcomes. Latent class analysis is used to explore the overlap and co-occurrence of different clusters of victims and to examine the relationship between class membership and school exclusion and delinquency. Participants were a random sample of youth between the ages of 12 and 18 selected for inclusion to participate in the 2011 National Crime Victimization Survey: School Supplement. The latent class analysis resulted in four categories of victims: approximately 3.1% of students were highly victimized by both bullying and cyberbullying behaviors; 11.6% of youth were classified as being victims of relational bullying, verbal bullying and cyberbullying; a third class of students were victims of relational bullying, verbal bullying and physical bullying but were not cyberbullied (8%); the fourth and final class, characteristic of the majority of students (77.3%), was comprised of non-victims. The inclusion of covariates to the latent class model indicated that gender, grade and race were significant predictors of at least one of the four victim classes. School delinquency measures were included as distal outcomes to test for both overall and pairwise associations between classes. With one exception, the results were indicative of a significant relationship between school delinquency and the victim subtypes. Implications for these findings are discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Latent Cognitive Phenotypes in De Novo Parkinson's Disease: A Person-Centered Approach.

    PubMed

    LaBelle, Denise R; Walsh, Ryan R; Banks, Sarah J

    2017-08-01

    Cognitive impairment is an important aspect of Parkinson's disease (PD), but there is considerable heterogeneity in its presentation. This investigation aims to identify and characterize latent cognitive phenotypes in early PD. Latent class analysis, a data-driven, person-centered, cluster analysis was performed on cognitive data from the Parkinson's Progressive Markers Initiative baseline visit. This analytic method facilitates identification of naturally occurring endophenotypes. Resulting classes were compared across biomarker, symptom, and demographic data. Six cognitive phenotypes were identified. Three demonstrated consistent performance across indicators, representing poor ("Weak-Overall"), average ("Typical-Overall"), and strong ("Strong-Overall") cognition. The remaining classes demonstrated unique patterns of cognition, characterized by "Strong-Memory," "Weak-Visuospatial," and "Amnestic" profiles. The Amnestic class evidenced greater tremor severity and anosmia, but was unassociated with biomarkers linked with Alzheimer's disease. The Weak-Overall class was older and reported more non-motor features associated with cognitive decline, including anxiety, depression, autonomic dysfunction, anosmia, and REM sleep behaviors. The Strong-Overall class was younger, more female, and reported less dysautonomia and anosmia. Classes were unrelated to disease duration, functional independence, or available biomarkers. Latent cognitive phenotypes with focal patterns of impairment were observed in recently diagnosed individuals with PD. Cognitive profiles were found to be independent of traditional biomarkers and motoric indices of disease progression. Only the globally impaired class was associated with previously reported indicators of cognitive decline, suggesting this group may drive the effects reported in studies using variable-based analysis. Longitudinal and neuroanatomical characterization of classes will yield further insight into the evolution of cognitive change in the disease. (JINS, 2017, 23, 551-563).

  10. Constructivism in Practice: an Exploratory Study of Teaching Patterns and Student Motivation in Physics Classrooms in Finland, Germany and Switzerland

    NASA Astrophysics Data System (ADS)

    Beerenwinkel, Anne; von Arx, Matthias

    2017-04-01

    For the last three decades, moderate constructivism has become an increasingly prominent perspective in science education. Researchers have defined characteristics of constructivist-oriented science classrooms, but the implementation of such science teaching in daily classroom practice seems difficult. Against this background, we conducted a sub-study within the tri-national research project Quality of Instruction in Physics (QuIP) analysing 60 videotaped physics classes involving a large sample of students (N = 1192) from Finland, Germany and Switzerland in order to investigate the kinds of constructivist components and teaching patterns that can be found in regular classrooms without any intervention. We applied a newly developed coding scheme to capture constructivist facets of science teaching and conducted principal component and cluster analyses to explore which components and patterns were most prominent in the classes observed. Two underlying components were found, resulting in two scales—Structured Knowledge Acquisition and Fostering Autonomy—which describe key aspects of constructivist teaching. Only the first scale was rather well established in the lessons investigated. Classes were clustered based on these scales. The analysis of the different clusters suggested that teaching physics in a structured way combined with fostering students' autonomy contributes to students' motivation. However, our regression models indicated that content knowledge is a more important predictor for students' motivation, and there was no homogeneous pattern for all gender- and country-specific subgroups investigated. The results are discussed in light of recent discussions on the feasibility of constructivism in practice.

  11. The extracellular Leucine-Rich Repeat superfamily; a comparative survey and analysis of evolutionary relationships and expression patterns

    PubMed Central

    Dolan, Jackie; Walshe, Karen; Alsbury, Samantha; Hokamp, Karsten; O'Keeffe, Sean; Okafuji, Tatsuya; Miller, Suzanne FC; Tear, Guy; Mitchell, Kevin J

    2007-01-01

    Background: Leucine-rich repeats (LRRs) are highly versatile and evolvable protein-ligand interaction motifs found in a large number of proteins with diverse functions, including innate immunity and nervous system development. Here we catalogue all of the extracellular LRR (eLRR) proteins in worms, flies, mice and humans. We use convergent evidence from several transmembrane-prediction and motif-detection programs, including a customised algorithm, LRRscan, to identify eLRR proteins, and a hierarchical clustering method based on TribeMCL to establish their evolutionary relationships. Results: This yields a total of 369 proteins (29 in worm, 66 in fly, 135 in mouse and 139 in human), many of them of unknown function. We group eLRR proteins into several classes: those with only LRRs, those that cluster with Toll-like receptors (Tlrs), those with immunoglobulin or fibronectin-type 3 (FN3) domains and those with some other domain. These groups show differential patterns of expansion and diversification across species. Our analyses reveal several clusters of novel genes, including two Elfn genes, encoding transmembrane proteins with eLRRs and an FN3 domain, and six genes encoding transmembrane proteins with eLRRs only (the Elron cluster). Many of these are expressed in discrete patterns in the developing mouse brain, notably in the thalamus and cortex. We have also identified a number of novel fly eLRR proteins with discrete expression in the embryonic nervous system. Conclusion: This study provides the necessary foundation for a systematic analysis of the functions of this class of genes, which are likely to include prominently innate immunity, inflammation and neural development, especially the specification of neuronal connectivity. PMID:17868438

  12. Focused Crawling of the Deep Web Using Service Class Descriptions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rocco, D; Liu, L; Critchlow, T

    2004-06-21

    Dynamic Web data sources--sometimes known collectively as the Deep Web--increase the utility of the Web by providing intuitive access to data repositories anywhere that Web access is available. Deep Web services provide access to real-time information, like entertainment event listings, or present a Web interface to large databases or other data repositories. Recent studies suggest that the size and growth rate of the dynamic Web greatly exceed that of the static Web, yet dynamic content is often ignored by existing search engine indexers owing to the technical challenges that arise when attempting to search the Deep Web. To address these challenges, we present DynaBot, a service-centric crawler for discovering and clustering Deep Web sources offering dynamic content. DynaBot has three unique characteristics. First, DynaBot utilizes a service class model of the Web implemented through the construction of service class descriptions (SCDs). Second, DynaBot employs a modular, self-tuning system architecture for focused crawling of the Deep Web using service class descriptions. Third, DynaBot incorporates methods and algorithms for efficient probing of the Deep Web and for discovering and clustering Deep Web sources and services through SCD-based service matching analysis. Our experimental results demonstrate the effectiveness of the service class discovery, probing, and matching algorithms and suggest techniques for efficiently managing service discovery in the face of the immense scale of the Deep Web.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hajian, Amir; Alvarez, Marcelo A.; Bond, J. Richard, E-mail: ahajian@cita.utoronto.ca, E-mail: malvarez@cita.utoronto.ca, E-mail: bond@cita.utoronto.ca

    Making mock simulated catalogs is an important component of astrophysical data analysis. Selection criteria for observed astronomical objects are often too complicated to be derived from first principles. However, the existence of an observed group of objects is a well-suited problem for machine learning classification. In this paper we use one-class classifiers to learn the properties of an observed catalog of clusters of galaxies from ROSAT and to pick clusters from mock simulations that resemble the observed ROSAT catalog. We show how this method can be used to study the cross-correlations of thermal Sunyaev-Zeldovich signals with number density maps of X-ray selected cluster catalogs. The method reduces the bias due to hand-tuning the selection function and is readily scalable to large catalogs with a high-dimensional space of astrophysical features.

  14. Relation between lifespan polytrauma typologies and post-trauma mental health.

    PubMed

    Contractor, Ateka A; Brown, Lily A; Weiss, Nicole H

    2018-01-01

    Most individuals experience more than one trauma. Hence, it is important to consider the count and types of traumas (polytraumatization) in relation to post-trauma mental health. The current study examined the relation of polytraumatization patterns to PTSD clusters (intrusions, avoidance, negative alterations in cognitions and mood [NACM], and alterations in arousal and reactivity [AAR]), depression, and impulsivity facets (lack of perseverance, lack of premeditation, negative urgency, sensation seeking) using a web-based sample of 346 participants. Age, gender, race, and ethnicity were covariates. Results of latent class analyses indicated a three-class solution: Low Experience, Moderate Experience - Predominant Threat/Indirect PTEs (Moderate Experience), and High Experience - Predominant Interpersonal PTEs (High/Interpersonal). Multinomial logistic regression results indicated that ethnicity and gender were significant covariates in predicting Low versus High/Interpersonal Class, and Moderate Experience versus High/Interpersonal Class membership, respectively. The High/Interpersonal Class had higher scores on most PTSD clusters, depression, and the impulsivity facets of lack of perseverance and negative urgency compared to the other classes. The Low and Moderate Experience Classes differed on PTSD's avoidance and AAR clusters (lower in the former). Individuals exposed to multiple PTE types, particularly interpersonal traumas, may be at risk for more severe post-trauma symptoms. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Social network analysis of duplicative prescriptions: One-month analysis of medical facilities in Japan.

    PubMed

    Takahashi, Yoshimitsu; Ishizaki, Tatsuro; Nakayama, Takeo; Kawachi, Ichiro

    2016-03-01

    Duplicative prescriptions refer to situations in which patients receive medications for the same condition from two or more sources. Health officials in Japan have expressed concern about medical "waste" resulting from this practice. We sought to conduct descriptive analysis of duplicative prescriptions using social network analysis and to report their prevalence across ages. We analyzed a health insurance claims database including 1.24 million people from December 2012. Through social network analysis, we examined the duplicative prescription networks, representing each medical facility as a node and individual prescriptions for patients as edges. The prevalence of duplicative prescription for any drug class was strongly correlated with its frequency of prescription (r=0.90). Among patients aged 0-19, cough and cold drugs showed the highest prevalence of duplicative prescriptions (10.8%). Among people aged 65 and over, antihypertensive drugs had the highest frequency of prescriptions, but the prevalence of duplicative prescriptions was low (0.2-0.3%). Social network analysis revealed clusters of facilities connected via duplicative prescriptions, e.g., psychotropic drugs showed clustering due to a few patients receiving drugs from 10 or more facilities. Overall, the prevalence of duplicative prescriptions was quite low (less than 10%), although the extent of the problem varied by drug class and age group. Our approach illustrates the potential utility of using a social network approach to understand these practices. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
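
    The network construction described in this abstract (facilities as nodes, duplicative prescriptions as edges) can be sketched in a few lines. This is a minimal illustration rather than the authors' method: the claim records are hypothetical, "same drug class" is used here as an assumed stand-in for "same condition", and the function names are introduced only for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_network(claims):
    """Build weighted facility-facility edges from claim records.

    claims: iterable of (patient_id, facility_id, drug_class).
    Two facilities are linked once for every patient who received the
    same drug class from both of them (an assumed working definition).
    """
    facilities_by_key = defaultdict(set)
    for patient, facility, drug_class in claims:
        facilities_by_key[(patient, drug_class)].add(facility)

    edges = defaultdict(int)
    for facilities in facilities_by_key.values():
        for a, b in combinations(sorted(facilities), 2):
            edges[(a, b)] += 1
    return dict(edges)


def clusters(edges):
    """Connected components of the facility network (depth-first search)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components


# Hypothetical claims: (patient, facility, drug class).
claims = [
    ("p1", "A", "psychotropic"),
    ("p1", "B", "psychotropic"),
    ("p1", "C", "psychotropic"),
    ("p2", "A", "cough_cold"),
    ("p2", "D", "antihypertensive"),
    ("p3", "E", "cough_cold"),
    ("p3", "F", "cough_cold"),
]
edges = build_network(claims)
print(edges)
print(clusters(edges))
```

    In this toy data a single patient visiting three facilities for the same drug class already produces a fully connected cluster of three nodes, which mirrors the abstract's remark that a few patients can drive the clustering.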

  16. Landsat-4 MSS and Thematic Mapper data quality and information content analysis

    NASA Technical Reports Server (NTRS)

    Anuta, P. E.; Bartolucci, L. A.; Dean, M. E.; Lozano, D. F.; Malaret, E.; Mcgillem, C. D.; Valdes, J. A.; Valenzuela, C. R.

    1984-01-01

    Landsat-4 Thematic Mapper and Multispectral Scanner data were analyzed to obtain information on data quality and information content. Geometric evaluations were performed to test band-to-band registration accuracy. Thematic Mapper overall system resolution was evaluated using scene objects which demonstrated sharp high contrast edge responses. Radiometric evaluation included detector relative calibration, effects of resampling, and coherent noise effects. Information content evaluation was carried out using clustering, principal components, transformed divergence separability measure, and numerous supervised classifiers on data from Iowa and Illinois. A detailed spectral class analysis (multispectral classification) was carried out on data from the Des Moines, IA area to compare the information content of the MSS and TM for a large number of scene classes.

  17. Variability Survey of ω Centauri in the Near-IR: Period-Luminosity Relations

    NASA Astrophysics Data System (ADS)

    Navarrete, Camila; Catelan, Márcio; Contreras Ramos, Rodrigo; Gran, Felipe; Alonso-García, Javier; Dékány, István

    2015-08-01

    ω Centauri (NGC 5139) is by far the most massive globular star cluster in the Milky Way, and has even been suggested to be the remnant of a dwarf galaxy. As such, it contains a large number of variable stars of different classes. Here we report on a deep, wide-field, near-infrared variability survey of ω Cen, carried out by our team using ESO's 4.1m VISTA telescope. Our time-series data comprise 42 and 100 epochs in J and Ks, respectively. This unique dataset has allowed us to derive complete light curves for hundreds of variable stars in the cluster, and thereby perform a detailed analysis of the near-infrared period-luminosity (PL) relations for different variability classes, including type II Cepheids, SX Phoenicis, and RR Lyrae stars. In this contribution, in addition to describing our survey and presenting the derived light curves, we present the resulting PL relations for each of these variability classes, including the first calibration of this sort for the SX Phoenicis stars. Based on these relations, we also provide an updated (pulsational) distance modulus for ω Cen, compare with results based on independent techniques, and discuss possible sources of systematic errors.

  18. Length-independent structural similarities enrich the antibody CDR canonical class model.

    PubMed

    Nowak, Jaroslaw; Baker, Terry; Georges, Guy; Kelm, Sebastian; Klostermann, Stefan; Shi, Jiye; Sridharan, Sudharsan; Deane, Charlotte M

    2016-01-01

    Complementarity-determining regions (CDRs) are antibody loops that make up the antigen binding site. Here, we show that all CDR types have structurally similar loops of different lengths. Based on these findings, we created length-independent canonical classes for the non-H3 CDRs. Our length variable structural clusters show strong sequence patterns suggesting either that they evolved from the same original structure or result from some form of convergence. We find that our length-independent method not only clusters a larger number of CDRs, but also predicts canonical class from sequence better than the standard length-dependent approach. To demonstrate the usefulness of our findings, we predicted cluster membership of CDR-L3 sequences from 3 next-generation sequencing datasets of the antibody repertoire (over 1,000,000 sequences). Using the length-independent clusters, we can structurally classify an additional 135,000 sequences, which represents a ∼20% improvement over the standard approach. This suggests that our length-independent canonical classes might be a highly prevalent feature of antibody space, and could substantially improve our ability to accurately predict the structure of novel CDRs identified by next-generation sequencing.

  19. Maximization of the Supportable Number of Sensors in QoS-Aware Cluster-Based Underwater Acoustic Sensor Networks

    PubMed Central

    Nguyen, Thi-Tham; Van Le, Duc; Yoon, Seokhoon

    2014-01-01

    This paper proposes a practical low-complexity MAC (medium access control) scheme for quality of service (QoS)-aware and cluster-based underwater acoustic sensor networks (UASN), in which the provision of differentiated QoS is required. In such a network, underwater sensors (U-sensor) in a cluster are divided into several classes, each of which has a different QoS requirement. The major problem considered in this paper is the maximization of the number of nodes that a cluster can accommodate while still providing the required QoS for each class in terms of the PDR (packet delivery ratio). In order to address the problem, we first estimate the packet delivery probability (PDP) and use it to formulate an optimization problem to determine the optimal value of the maximum packet retransmissions for each QoS class. The custom greedy and interior-point algorithms are used to find the optimal solutions, which are verified by extensive simulations. The simulation results show that, by solving the proposed optimization problem, the supportable number of underwater sensor nodes can be maximized while satisfying the QoS requirements for each class. PMID:24608009

  20. Maximization of the supportable number of sensors in QoS-aware cluster-based underwater acoustic sensor networks.

    PubMed

    Nguyen, Thi-Tham; Le, Duc Van; Yoon, Seokhoon

    2014-03-07

    This paper proposes a practical low-complexity MAC (medium access control) scheme for quality of service (QoS)-aware and cluster-based underwater acoustic sensor networks (UASN), in which the provision of differentiated QoS is required. In such a network, underwater sensors (U-sensor) in a cluster are divided into several classes, each of which has a different QoS requirement. The major problem considered in this paper is the maximization of the number of nodes that a cluster can accommodate while still providing the required QoS for each class in terms of the PDR (packet delivery ratio). In order to address the problem, we first estimate the packet delivery probability (PDP) and use it to formulate an optimization problem to determine the optimal value of the maximum packet retransmissions for each QoS class. The custom greedy and interior-point algorithms are used to find the optimal solutions, which are verified by extensive simulations. The simulation results show that, by solving the proposed optimization problem, the supportable number of underwater sensor nodes can be maximized while satisfying the QoS requirements for each class.

  1. Spatial distribution of 12 class B notifiable infectious diseases in China: A retrospective study.

    PubMed

    Zhu, Bin; Fu, Yang; Liu, Jinlin; Mao, Ying

    2018-01-01

    China is the largest developing country with a relatively developed public health system. To further prevent and eliminate the spread of infectious diseases, China has listed 39 notifiable infectious diseases characterized by wide prevalence or great harm, and classified them into classes A, B, and C, with severity decreasing across classes. Class A diseases have been almost eradicated in China, thus making class B diseases a priority in infectious disease prevention and control. In this retrospective study, we analyze the spatial distribution patterns of 12 class B notifiable infectious diseases that remain active all over China. Global and local Moran's I and corresponding graphic tools are adopted to explore and visualize the global and local spatial distribution of the incidence of the selected epidemics, respectively. Inter-correlations of clustering patterns of each pair of diseases and a cumulative summary of the high/low cluster frequency of the provincial units are also provided by means of figures and maps. Of the 12 most commonly notifiable class B infectious diseases, viral hepatitis and tuberculosis show high incidence rates and account for more than half of the reported cases. Almost all the diseases, except pertussis, exhibit positive spatial autocorrelation at the provincial level. All diseases feature varying spatial concentrations. Nevertheless, associations exist between spatial distribution patterns, with some provincial units displaying the same type of cluster features for two or more infectious diseases. Overall, high-low (unit with high incidence surrounded by units with low incidence, the same below) and high-high spatial cluster areas tend to be prevalent in the provincial units located in western and southwest China, whereas low-low and low-high spatial cluster areas abound in provincial units in north and east China.
Despite the various distribution patterns of 12 class B notifiable infectious diseases, certain similarities between their spatial distributions are present. Substantial evidence is available to support disease-specific, location-specific, and disease-combined interventions. Regarding provinces that show high-high/high-low patterns of multiple diseases, comprehensive interventions targeting different diseases should be established. As to the adjacent provincial units revealing similar patterns, coordinated actions need to be taken across borders.
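
    The global Moran's I statistic and the high-high/high-low labelling used in this record can be illustrated with a minimal sketch. The four-unit adjacency matrix and the incidence values below are hypothetical, the weights are simple binary contiguity weights, and the quadrant rule (sign of a unit's deviation versus the sign of its neighbours' summed deviations) is the usual Moran scatterplot convention rather than the authors' exact procedure.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def quadrant(values, weights, i):
    """Label unit i by its own deviation and its neighbours' lagged deviation."""
    n = len(values)
    mean = sum(values) / n
    own = values[i] - mean
    lag = sum(weights[i][j] * (values[j] - mean) for j in range(n))
    return ("high" if own > 0 else "low") + "-" + ("high" if lag > 0 else "low")


# Hypothetical data: four units in a row, neighbours share a border.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
incidence = [10.0, 12.0, 30.0, 33.0]    # similar values sit next to each other
alternating = [1.0, 10.0, 1.0, 10.0]    # high and low values alternate

print(morans_i(incidence, w))           # positive: spatial clustering
print(morans_i(alternating, w))         # negative: spatial dispersion
print([quadrant(incidence, w, i) for i in range(4)])
```

    A positive value indicates that similar incidence rates tend to be adjacent, which is the sense of "positive spatial autocorrelation" in the abstract.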

  2. Perceived risk associated with ecstasy use: a latent class analysis approach

    PubMed Central

    Martins, SS; Carlson, RG; Alexandre, PK; Falck, RS

    2011-01-01

    This study aims to define categories of perceived health problems among ecstasy users based on observed clustering of their perceptions of ecstasy-related health problems. Data from a community sample of ecstasy users (n=402) aged 18 to 30, in Ohio, was used in this study. Data was analyzed via Latent Class Analysis (LCA) and Regression. This study identified five different subgroups of ecstasy users based on their perceptions of health problems they associated with their ecstasy use. Almost one third of the sample (28.9%) belonged to a class with “low level of perceived problems” (Class 4). About one fourth (25.6%) of the sample (Class 2), had high probabilities of “perceiving problems on sexual-related items”, but generally low or moderate probabilities of perceiving problems in other areas. Roughly one-fifth of the sample (21.1%, Class 1) had moderate probabilities of perceiving ecstasy health-related problems in all areas. A small proportion of respondents (11.9%, Class 5) had high probabilities of reporting “perceived memory and cognitive problems”, and of perceiving “ecstasy-related problems in all areas” (12.4%, Class 3). A large proportion of ecstasy users perceive either low or moderate risk associated with their ecstasy use. It is important to further investigate whether lower levels of risk perception are associated with persistence of ecstasy use. PMID:21296504

  3. Classes and continua of hippocampal CA1 inhibitory neurons revealed by single-cell transcriptomics.

    PubMed

    Harris, Kenneth D; Hochgerner, Hannah; Skene, Nathan G; Magno, Lorenza; Katona, Linda; Bengtsson Gonzales, Carolina; Somogyi, Peter; Kessaris, Nicoletta; Linnarsson, Sten; Hjerling-Leffler, Jens

    2018-06-18

    Understanding any brain circuit will require a categorization of its constituent neurons. In hippocampal area CA1, at least 23 classes of GABAergic neuron have been proposed to date. However, this list may be incomplete; additionally, it is unclear whether discrete classes are sufficient to describe the diversity of cortical inhibitory neurons or whether continuous modes of variability are also required. We studied the transcriptomes of 3,663 CA1 inhibitory cells, revealing 10 major GABAergic groups that divided into 49 fine-scale clusters. All previously described and several novel cell classes were identified, with three previously described classes unexpectedly found to be identical. A division into discrete classes, however, was not sufficient to describe the diversity of these cells, as continuous variation also occurred between and within classes. Latent factor analysis revealed that a single continuous variable could predict the expression levels of several genes, which correlated similarly with it across multiple cell types. Analysis of the genes correlating with this variable suggested it reflects a range from metabolically highly active faster-spiking cells that proximally target pyramidal cells to slower-spiking cells targeting distal dendrites or interneurons. These results elucidate the complexity of inhibitory neurons in one of the simplest cortical structures and show that characterizing these cells requires continuous modes of variation as well as discrete cell classes.

  4. Identification of piecewise affine systems based on fuzzy PCA-guided robust clustering technique

    NASA Astrophysics Data System (ADS)

    Khanmirza, Esmaeel; Nazarahari, Milad; Mousavi, Alireza

    2016-12-01

    Hybrid systems are a class of dynamical systems whose behaviors are based on the interaction between discrete and continuous dynamical behaviors. Since a general method for the analysis of hybrid systems is not available, some researchers have focused on specific types of hybrid systems. Piecewise affine (PWA) systems are one of the subsets of hybrid systems. The identification of PWA systems includes the estimation of the parameters of affine subsystems and the coefficients of the hyperplanes defining the partition of the state-input domain. In this paper, we have proposed a PWA identification approach based on a modified clustering technique. By using a fuzzy PCA-guided robust k-means clustering algorithm along with neighborhood outlier detection, the two main drawbacks of the well-known clustering algorithms, i.e., the poor initialization and the presence of outliers, are eliminated. Furthermore, this modified clustering technique enables us to determine the number of subsystems without any prior knowledge about the system. In addition, applying the structure of the state-input domain, that is, considering the time sequence of input-output pairs, provides a more efficient clustering algorithm, which is the other novelty of this work. Finally, the proposed algorithm has been evaluated by parameter identification of an IGV servo actuator. Simulation together with experimental analysis has proved the effectiveness of the proposed method.

  5. The sirodesmin biosynthetic gene cluster of the plant pathogenic fungus Leptosphaeria maculans.

    PubMed

    Gardiner, Donald M; Cozijnsen, Anton J; Wilson, Leanne M; Pedras, M Soledade C; Howlett, Barbara J

    2004-09-01

    Sirodesmin PL is a phytotoxin produced by the fungus Leptosphaeria maculans, which causes blackleg disease of canola (Brassica napus). This phytotoxin belongs to the epipolythiodioxopiperazine (ETP) class of toxins produced by fungi including mammalian and plant pathogens. We report the cloning of a cluster of genes with predicted roles in the biosynthesis of sirodesmin PL and show via gene disruption that one of these genes (encoding a two-module non-ribosomal peptide synthetase) is essential for sirodesmin PL biosynthesis. Of the nine genes in the cluster tested, all are co-regulated with the production of sirodesmin PL in culture. A similar cluster is present in the genome of the opportunistic human pathogen Aspergillus fumigatus and is most likely responsible for the production of gliotoxin, which is also an ETP. Homologues of the genes in the cluster were also identified in expressed sequence tags of the ETP producing fungus Chaetomium globosum. Two other fungi with publicly available genome sequences, Magnaporthe grisea and Fusarium graminearum, had similar gene clusters. A comparative analysis of all four clusters is presented. This is the first report of the genes responsible for the biosynthesis of an ETP. Copyright 2004 Blackwell Publishing Ltd

  6. [A phylogenetic analysis of plant communities of Teberda Biosphere Reserve].

    PubMed

    Shulakov, A A; Egorov, A V; Onipchenko, V G

    2016-01-01

    Phylogenetic analysis of communities is based on comparing the distances on the phylogenetic tree between species of a community under study with the same distances in random samples drawn from the local flora. It makes it possible to determine to what extent a community's composition is formed by more closely related species (i.e., is "clustered") or, on the contrary, is more even and includes species that are less closely related to each other. The first case is usually interpreted as the result of a strong influence of abiotic factors, under which species with similar ecology, a priori more closely related, would remain. In the second case, biotic factors, such as competition, may come to the fore and lead to the formation of a community from distant clades due to divergence of their ecological niches. The aim of this study was to explore the phylogenetic structure of communities of the northwestern Caucasus at two spatial scales: the scale of an area from 4 to 100 m2 and the smaller scale within a community. The list of the local flora of the alpine belt was compiled using the database of geobotanic descriptions carried out in Teberda Biosphere Reserve at altitudes exceeding 1800 m. It includes 585 species of flowering plants belonging to 57 families. Basal groups of flowering plants are not represented in the list. At the scale of communities, three classes, namely Thlaspietea rotundifolii (communities formed on screes and pebbles), Calluno-Ulicetea (alpine meadows), and Mulgedio-Aconitetea (subalpine meadows), did not demonstrate a significant distinction in phylogenetic structure. At the intra-community scale, a larger share of closely related species (a clustered community) was detected for alpine meadows. Communities developing on rocks (class Asplenietea trichomanis) and alpine communities of the class Juncetea trifidi were significantly clustered. At the same time, alpine lichen communities proved to have an even phylogenetic structure at the small scale.
    Alpine communities (class Salicetea herbaceae) that develop under conditions of winter snow accumulation were more even at both scales, i.e., contained more diverse and distantly related plant species compared with random samples. Communities of the class Scheuchzerio-Caricetea fuscae, cold-water aquatic communities (Montio-Cardaminetea), sedge meadows (Carici rupestris-Kobresietea bellardii), and communities in which shrubs predominated (juniper and rhododendron elfin woods, class Loiseleurio-Vaccinietea) were studied only at the larger scale and showed significant evenness of species composition, i.e., were phylogenetically more diverse compared with random samples.
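
    The comparison of observed phylogenetic distances with random samples from the local flora can be sketched as a standardized effect size of the mean pairwise distance (MPD). This is a minimal illustration, not the authors' procedure: the six-species pool, the two-clade distance table, and the function names are hypothetical, and the null model simply draws equally sized species sets at random from the pool.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def mean_pairwise_distance(species, dist):
    """Average phylogenetic distance over all species pairs in a community."""
    return mean(dist[frozenset(pair)] for pair in combinations(species, 2))


def ses_mpd(community, pool, dist, n_random=999, seed=0):
    """Standardized effect size of MPD against random draws from the pool.

    Negative values suggest clustering (closer relatives than expected),
    positive values suggest evenness (more distant relatives than expected).
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(community, dist)
    null = [mean_pairwise_distance(rng.sample(pool, len(community)), dist)
            for _ in range(n_random)]
    return (observed - mean(null)) / pstdev(null)


# Hypothetical local flora: two clades, A and B, of three species each.
pool = ["A1", "A2", "A3", "B1", "B2", "B3"]
# Toy distances: 2.0 within a clade, 10.0 between clades.
dist = {frozenset((a, b)): (2.0 if a[0] == b[0] else 10.0)
        for a, b in combinations(pool, 2)}

clustered = ses_mpd(["A1", "A2", "A3"], pool, dist)  # one clade only
even = ses_mpd(["A1", "B1"], pool, dist)             # one species per clade
print(clustered, even)
```

    In this toy case the single-clade community yields a negative effect size and the mixed community a positive one, matching the "clustered" versus "even" interpretation in the abstract.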

  7. American and Chinese Students' Profiles Based on Spanish-Learning Strategies: A Transcultural Comparison

    ERIC Educational Resources Information Center

    Bernardo, Aránzazu; Amérigo, María; García, Juan A.

    2017-01-01

    This paper presents a study on the use of learning strategies in foreign languages, and more specifically Spanish. The study was conducted with 376 Chinese and American students who were studying Spanish in their countries of origin. The results obtained from a latent class cluster analysis identified five groups of participants based on the…

  8. Beyond Academic Tracking: Using Cluster Analysis and Self-Organizing Maps to Investigate Secondary Students' Chemistry Self-Concept

    ERIC Educational Resources Information Center

    Nielsen, Sara E.; Yezierski, Ellen J.

    2016-01-01

    Academic tracking, placing students in different classes based on past performance, is a common feature of the American secondary school system. A longitudinal study of secondary students' chemistry self-concept scores was conducted, and one feature of the study was the presence of academic tracking. Though academic tracking is one way to group…

  9. Complete Genome Sequence of a Highly Virulent Newcastle Disease Virus Currently Circulating in Mexico

    PubMed Central

    Xiao, Sa; Paldurai, Anandan; Nayak, Baibaswata; Mirande, Armando; Collins, Peter L.

    2013-01-01

    The complete genome sequence was determined for a highly virulent Newcastle disease virus strain from vaccinated chicken farms in Mexico during outbreaks in 2010. On the basis of phylogenetic analysis this strain was classified into genotype V in the class II cluster that was closely related to Mexican strains that appeared in 2004–2006. PMID:23409252

  10. The clustering of diet, physical activity and sedentary behavior in children and adolescents: a review.

    PubMed

    Leech, Rebecca M; McNaughton, Sarah A; Timperio, Anna

    2014-01-22

    Diet, physical activity (PA) and sedentary behavior are important, yet modifiable, determinants of obesity. Recent research into the clustering of these behaviors suggests that children and adolescents have multiple obesogenic risk factors. This paper reviews studies using empirical, data-driven methodologies, such as cluster analysis (CA) and latent class analysis (LCA), to identify clustering patterns of diet, PA and sedentary behavior among children or adolescents and their associations with socio-demographic indicators, and overweight and obesity. A literature search of electronic databases was undertaken to identify studies which have used data-driven methodologies to investigate the clustering of diet, PA and sedentary behavior among children and adolescents aged 5-18 years old. Eighteen studies (62% of potential studies) were identified that met the inclusion criteria, of which eight examined the clustering of PA and sedentary behavior and eight examined diet, PA and sedentary behavior. Studies were mostly cross-sectional and conducted in older children and adolescents (≥ 9 years). Findings from the review suggest that obesogenic cluster patterns are complex with a mixed PA/sedentary behavior cluster observed most frequently, but healthy and unhealthy patterning of all three behaviors was also reported. Cluster membership was found to differ according to age, gender and socio-economic status (SES). The tendency for older children/adolescents, particularly females, to comprise clusters defined by low PA was the most robust finding. Findings to support an association between obesogenic cluster patterns and overweight and obesity were inconclusive, with longitudinal research in this area limited. Diet, PA and sedentary behavior cluster together in complex ways that are not well understood. Further research, particularly in younger children, is needed to understand how cluster membership differs according to socio-demographic profile. 
Longitudinal research is also essential to establish how different cluster patterns track over time and their influence on the development of overweight and obesity.

  11. The clustering of diet, physical activity and sedentary behavior in children and adolescents: a review

    PubMed Central

    2014-01-01

    Diet, physical activity (PA) and sedentary behavior are important, yet modifiable, determinants of obesity. Recent research into the clustering of these behaviors suggests that children and adolescents have multiple obesogenic risk factors. This paper reviews studies using empirical, data-driven methodologies, such as cluster analysis (CA) and latent class analysis (LCA), to identify clustering patterns of diet, PA and sedentary behavior among children or adolescents and their associations with socio-demographic indicators, and overweight and obesity. A literature search of electronic databases was undertaken to identify studies which have used data-driven methodologies to investigate the clustering of diet, PA and sedentary behavior among children and adolescents aged 5–18 years old. Eighteen studies (62% of potential studies) were identified that met the inclusion criteria, of which eight examined the clustering of PA and sedentary behavior and eight examined diet, PA and sedentary behavior. Studies were mostly cross-sectional and conducted in older children and adolescents (≥9 years). Findings from the review suggest that obesogenic cluster patterns are complex with a mixed PA/sedentary behavior cluster observed most frequently, but healthy and unhealthy patterning of all three behaviors was also reported. Cluster membership was found to differ according to age, gender and socio-economic status (SES). The tendency for older children/adolescents, particularly females, to comprise clusters defined by low PA was the most robust finding. Findings to support an association between obesogenic cluster patterns and overweight and obesity were inconclusive, with longitudinal research in this area limited. Diet, PA and sedentary behavior cluster together in complex ways that are not well understood. Further research, particularly in younger children, is needed to understand how cluster membership differs according to socio-demographic profile. 
Longitudinal research is also essential to establish how different cluster patterns track over time and their influence on the development of overweight and obesity. PMID:24450617

  12. Short-term droughts forecast using Markov chain model in Victoria, Australia

    NASA Astrophysics Data System (ADS)

    Rahmat, Siti Nazahiyah; Jayasuriya, Niranjali; Bhuiyan, Muhammed A.

    2017-07-01

    A comprehensive risk management strategy for dealing with drought should include both short-term and long-term planning. The objective of this paper is to present an early warning method to forecast drought using the Standardised Precipitation Index (SPI) and a non-homogeneous Markov chain model. A model such as this is useful for short-term planning. The developed method has been used to forecast droughts at a number of meteorological monitoring stations that have been regionalised into six (6) homogeneous clusters with similar drought characteristics based on SPI. The non-homogeneous Markov chain model was used to estimate drought probabilities and drought predictions up to 3 months ahead. The drought severity classes defined using the SPI were computed at a 12-month time scale. The drought probabilities and the predictions were computed for six clusters that depict similar drought characteristics in Victoria, Australia. Overall, the drought severity class predicted was quite similar for all the clusters, with the non-drought class probabilities ranging from 49 to 57 %. For all clusters, the near normal class had a probability of occurrence varying from 27 to 38 %. For the more moderate and severe classes, the probabilities ranged from 2 to 13 % and 1 to 3 %, respectively. The developed model predicted drought situations 1 month ahead reasonably well. However, 2 and 3 months ahead predictions should be used with caution until the models are developed further.
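
    The idea of forecasting drought classes a few months ahead with a Markov chain can be illustrated with a short sketch. It is deliberately simplified: the study uses a non-homogeneous chain (transition probabilities that vary over time), whereas the code below assumes a single time-homogeneous transition matrix estimated from one hypothetical monthly sequence with two classes, "N" (non-drought) and "D" (drought).

```python
def transition_matrix(sequence, states):
    """Estimate transition probabilities from consecutive class pairs."""
    index = {s: i for i, s in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for current, following in zip(sequence, sequence[1:]):
        counts[index[current]][index[following]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state never observed as a starting point gets a uniform row.
        matrix.append([c / total if total else 1 / len(states) for c in row])
    return matrix


def forecast(matrix, current_index, steps):
    """Class probabilities `steps` months ahead, given the current class."""
    n = len(matrix)
    probs = [0.0] * n
    probs[current_index] = 1.0
    for _ in range(steps):
        probs = [sum(probs[i] * matrix[i][j] for i in range(n))
                 for j in range(n)]
    return probs


# Hypothetical monthly drought classes derived from an SPI series.
states = ["N", "D"]
history = ["N", "N", "D", "D", "N", "N", "N", "D"]

P = transition_matrix(history, states)
print(P)                  # rows: from N, from D
print(forecast(P, 0, 1))  # 1 month ahead, starting from non-drought
print(forecast(P, 0, 3))  # 3 months ahead
```

    Because each additional step multiplies by the same estimated matrix, estimation errors compound with lead time, which is consistent with the abstract's caution about 2 and 3 month predictions.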

  13. Catchment classification by runoff behaviour with self-organizing maps (SOM)

    NASA Astrophysics Data System (ADS)

    Ley, R.; Casper, M. C.; Hellebrand, H.; Merz, R.

    2011-09-01

    Catchments show a wide range of response behaviour, even if they are adjacent. For many purposes it is necessary to characterise and classify them, e.g. for regionalisation, prediction in ungauged catchments, model parameterisation. In this study, we investigate hydrological similarity of catchments with respect to their response behaviour. We analyse more than 8200 event runoff coefficients (ERCs) and flow duration curves of 53 gauged catchments in Rhineland-Palatinate, Germany, for the period from 1993 to 2008, covering a huge variability of weather and runoff conditions. The spatio-temporal variability of event-runoff coefficients and flow duration curves are assumed to represent how different catchments "transform" rainfall into runoff. From the runoff coefficients and flow duration curves we derive 12 signature indices describing various aspects of catchment response behaviour to characterise each catchment. Hydrological similarity of catchments is defined by high similarities of their indices. We identify, analyse and describe hydrologically similar catchments by cluster analysis using Self-Organizing Maps (SOM). As a result of the cluster analysis we get five clusters of similarly behaving catchments where each cluster represents one differentiated class of catchments. As catchment response behaviour is supposed to be dependent on its physiographic and climatic characteristics, we compare groups of catchments clustered by response behaviour with clusters of catchments based on catchment properties. Results show an overlap of 67% between these two pools of clustered catchments which can be improved using the topologic correctness of SOMs.

  14. Catchment classification by runoff behaviour with self-organizing maps (SOM)

    NASA Astrophysics Data System (ADS)

    Ley, R.; Casper, M. C.; Hellebrand, H.; Merz, R.

    2011-03-01

    Catchments show a wide range of response behaviour, even if they are adjacent. For many purposes it is necessary to characterise and classify them, e.g. for regionalisation, prediction in ungauged catchments, model parameterisation. In this study, we investigate hydrological similarity of catchments with respect to their response behaviour. We analyse more than 8200 event runoff coefficients (ERCs) and flow duration curves of 53 gauged catchments in Rhineland-Palatinate, Germany, for the period from 1993 to 2008, covering a huge variability of weather and runoff conditions. The spatio-temporal variability of event-runoff coefficients and flow duration curves are assumed to represent how different catchments "transform" rainfall into runoff. From the runoff coefficients and flow duration curves we derive 12 signature indices describing various aspects of catchment response behaviour to characterise each catchment. Hydrological similarity of catchments is defined by high similarities of their indices. We identify, analyse and describe hydrologically similar catchments by cluster analysis using Self-Organizing Maps (SOM). As a result of the cluster analysis we get five clusters of similarly behaving catchments where each cluster represents one differentiated class of catchments. As catchment response behaviour is supposed to be dependent on its physiographic and climatic characteristics, we compare groups of catchments clustered by response behaviour with clusters of catchments based on catchment properties. Results show an overlap of 67% between these two pools of clustered catchments which can be improved using the topologic correctness of SOMs.

  15. Sensory Characterization of Odors in Used Disposable Absorbent Incontinence Products

    PubMed Central

    Widén, Heléne; Forsgren-Brusk, Ulla; Hall, Gunnar

    2017-01-01

    PURPOSE: The objectives of this study were to characterize the odors of used incontinence products by descriptive analysis and to define attributes to be used in the analysis. A further objective was to investigate to what extent the odor profiles of used incontinence products differed from each other and, if possible, to group these profiles into classes. SUBJECTS AND SETTING: Used incontinence products were collected from 14 residents with urinary incontinence living in geriatric nursing homes in the Gothenburg area, Sweden. METHODS: Pieces were cut from the wet area of used incontinence products. They were placed in glass bottles and kept frozen until odor analysis was completed. A trained panel consisting of 8 judges experienced in this area of investigation defined terminology for odor attributes. The intensities of these attributes in the used products were determined by descriptive odor analysis. Data were analyzed both by analysis of variance (ANOVA) followed by the Tukey post hoc test and by principal component analysis and cluster analysis. RESULTS: An odor wheel, with 10 descriptive attributes, was developed. The total odor intensity, and the intensities of the attributes, varied considerably between different, used incontinence products. The typical odors varied from “sweetish” to “urinal,” “ammonia,” and “smoked.” Cluster analysis showed that the used products, based on the quantitative odor data, could be divided into 5 odor classes with different profiles. CONCLUSIONS: The used products varied considerably in odor character and intensity. Findings suggest that odors in used absorptive products are caused by different types of compounds that may vary in concentration. PMID:28328646

  16. Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning

    PubMed Central

    Wu, Jiayi; Ma, Yong-Bei; Congdon, Charles; Brett, Bevin; Chen, Shuobing; Xu, Yaofang; Ouyang, Qi

    2017-01-01

    Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization. PMID:28786986

  17. Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning.

    PubMed

    Wu, Jiayi; Ma, Yong-Bei; Congdon, Charles; Brett, Bevin; Chen, Shuobing; Xu, Yaofang; Ouyang, Qi; Mao, Youdong

    2017-01-01

    Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.

  18. Changes of humoral anti-endotoxin immunity and low-intensity inflammation in diabetes mellitus type 1 and 2.

    PubMed

    Gordienko, A I; Beloglazov, V A; Kubyshkin, A V

    2016-01-01

    Purpose: To investigate the levels of serum anti-endotoxin antibodies of different classes in patients with diabetes mellitus type 1 and 2, and to perform a cluster analysis of the relationship between the individual levels of these antibodies and the concentration of C-reactive protein in the blood. We examined 51 patients with diabetes mellitus type 1 and 60 patients with diabetes mellitus type 2. Diabetes mellitus type 1 or type 2 was diagnosed in accordance with the criteria of the World Health Organization. The control group included 49 healthy people with no history of chronic disease and no clinical manifestations of acute disease at the time of the survey. The control group was matched by sex and age to the groups of patients with type 1 and type 2 diabetes. The concentration of C-reactive protein in the blood and the levels of serum anti-endotoxin antibodies of different classes (A, M and G) were determined by ELISA. Cluster analysis revealed that in 40.8% of patients with type 1 diabetes an increased concentration of C-reactive protein in the blood was associated with a significant reduction in the levels of serum anti-endotoxin antibodies of classes A, M and G. In 56.7% of patients with type 2 diabetes with a high concentration of C-reactive protein in the blood, the levels of serum anti-endotoxin antibodies of classes A and M were not significantly different from normal values, but the levels of class G antibodies were significantly increased. Activation of inflammation with a further increase of C-reactive protein in the blood of patients with type 2 diabetes mellitus was accompanied by a significant increase in the levels of serum anti-endotoxin antibodies of classes A and G, and by a tendency toward reduced levels of class M anti-endotoxin antibodies. The results suggest a relationship between low-intensity inflammation and the immune response to enterobacterial endotoxins in patients with diabetes mellitus type 1 and 2.

  19. Providing Multi-Page Data Extraction Services with XWRAPComposer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Ling; Zhang, Jianjun; Han, Wei

    2008-04-30

    Dynamic Web data sources – sometimes known collectively as the Deep Web – increase the utility of the Web by providing intuitive access to data repositories anywhere that Web access is available. Deep Web services provide access to real-time information, like entertainment event listings, or present a Web interface to large databases or other data repositories. Recent studies suggest that the size and growth rate of the dynamic Web greatly exceed that of the static Web, yet dynamic content is often ignored by existing search engine indexers owing to the technical challenges that arise when attempting to search the Deep Web. To address these challenges, we present DYNABOT, a service-centric crawler for discovering and clustering Deep Web sources offering dynamic content. DYNABOT has three unique characteristics. First, DYNABOT utilizes a service class model of the Web implemented through the construction of service class descriptions (SCDs). Second, DYNABOT employs a modular, self-tuning system architecture for focused crawling of the Deep Web using service class descriptions. Third, DYNABOT incorporates methods and algorithms for efficient probing of the Deep Web and for discovering and clustering Deep Web sources and services through SCD-based service matching analysis. Our experimental results demonstrate the effectiveness of the service class discovery, probing, and matching algorithms and suggest techniques for efficiently managing service discovery in the face of the immense scale of the Deep Web.

  20. [Analysis of commercial specifications and grades of wild and cultivated Gentianae Macrophyllae Radix based on multi-indicative constituents].

    PubMed

    Yang, Yan-Mei; Lin, Li; Lu, You-Yuan; Ma, Xiao-Hui; Jin, Ling; Zhu, Tian-Tian

    2016-03-01

    The aim of this study was to analyze the commercial specifications and grades of wild and cultivated Gentianae Macrophyllae Radix based on multi-indicative constituents. Seven main chemical components of Gentianae Macrophyllae Radix were determined by UPLC, and the quality levels of the chemical components were then clustered and classified by modern statistical methods (canonical correspondence analysis, Fisher discriminant analysis and so on). Quality indices were selected and their correlations were analyzed. Lastly, a comprehensive quantitative grade division of quality was carried out for wild and cultivated material, both across commercial specifications and across grades within the same specification. The results provide a basis for a reasonable division of specifications and grades of commercial Gentianae Macrophyllae Radix. A quality evaluation range for the main index components (gentiopicrin, loganic acid and swertiamarin) was proposed, and the Herbal Quality Index (HQI) was introduced. A rank discriminant function was established from the quality data by Fisher discriminant analysis. According to the analysis, among the commercial specifications of Gentianae Macrophyllae Radix, the quality of wild and cultivated Luobojiao was the best, Mahuajiao was average, and Xiaoqinjiao was inferior. Among grades, first-class cultivated Luobojiao was of the worst quality, second-class was intermediate, and third-class was the best; for wild Luobojiao, first-class was second best and second-class the best; for Mahuajiao, second-class was second best and first-class the best; for Xiaoqinjiao, first-class was second best and second-class was the better of the two grades, although the difference was not significant. The method provides a new approach for the comprehensive quantitative evaluation of the quality of Gentianae Macrophyllae Radix. Copyright© by the Chinese Pharmaceutical Association.

  1. Validation of gait analysis with dynamic radiostereometric analysis (RSA) in patients operated with total hip arthroplasty.

    PubMed

    Zügner, Roland; Tranberg, Roy; Lisovskaja, Vera; Shareghi, Bita; Kärrholm, Johan

    2017-07-01

    We simultaneously examined 14 patients with OTS and dynamic radiostereometric analysis (RSA) to evaluate the accuracy of both a skin-marker and a cluster-marker model. The mean differences between the OTS and RSA system in hip flexion, abduction, and rotation varied up to 9.5° for the skin-marker and up to 11.3° for the cluster-marker models, respectively. Both models tended to underestimate the amount of flexion and abduction, but a significant systematic difference between the marker and RSA evaluations could only be established for recordings of hip abduction using cluster markers (p = 0.04). The intra-class correlation coefficient (ICC) was 0.7 or higher during flexion for both models and during abduction using skin markers, but decreased to 0.5-0.6 when abduction motion was studied with cluster markers. During active hip rotation, the two marker models tended to deviate from the RSA recordings in different ways with poor correlations at the end of the motion (ICC ≤0.4). During active hip motions soft tissue displacements occasionally induced considerable differences when compared to skeletal motions. The best correlation between RSA recordings and the skin- and cluster-marker model was found for studies of hip flexion and abduction with the skin-marker model. Studies of hip abduction with use of cluster markers were associated with a constant underestimation of the motion. Recordings of skeletal motions with use of skin or cluster markers during hip rotation were associated with high mean errors amounting up to about 10° at certain positions. © 2016 Orthopaedic Research Society. Published by Wiley Periodicals, Inc. J Orthop Res 35:1515-1522, 2017.

  2. Female-to-male transmasculine adult health: a mixed-methods community-based needs assessment.

    PubMed

    Reisner, Sari L; Gamarel, Kristi E; Dunham, Emilia; Hopwood, Ruben; Hwahng, Sel

    2013-01-01

    There is a dearth of health research about transgender people. This mixed-methods study sought to formatively investigate the health and perceived health needs of female-to-male transmasculine adults. A cross-sectional quantitative needs assessment (n = 73) and qualitative open-ended input (n = 19) were conducted in June 2011. A latent class analysis modeled six binary health indicators (depression, alcohol use, current smoking, asthma, physical inactivity, overweight status) to identify clusters of presenting health issues. Four clusters of health indicators emerged: (a) depression; (b) syndemic (all indicators); (c) alcohol use, overweight status; and (d) smoking, physical inactivity, overweight status. Transphobic discrimination in health care and avoiding care were each associated with membership in the syndemic class. Qualitative themes included personal health care needs, community needs, and resilience and protective factors. Findings fill an important gap about the health of transmasculine communities, including the need for public health efforts that holistically address concomitant health concerns.

  3. Horizontal transfer of a large and highly toxic secondary metabolic gene cluster between fungi.

    PubMed

    Slot, Jason C; Rokas, Antonis

    2011-01-25

    Genes involved in intermediary and secondary metabolism in fungi are frequently physically linked or clustered. For example, in Aspergillus nidulans the entire pathway for the production of sterigmatocystin (ST), a highly toxic secondary metabolite and a precursor to the aflatoxins (AF), is located in a ∼54 kb, 23 gene cluster. We discovered that a complete ST gene cluster in Podospora anserina was horizontally transferred from Aspergillus. Phylogenetic analysis shows that most Podospora cluster genes are adjacent to or nested within Aspergillus cluster genes, although the two genera belong to different taxonomic classes. Furthermore, the Podospora cluster is highly conserved in content, sequence, and microsynteny with the Aspergillus ST/AF clusters and its intergenic regions contain 14 putative binding sites for AflR, the transcription factor required for activation of the ST/AF biosynthetic genes. Examination of ∼52,000 Podospora expressed sequence tags identified transcripts for 14 genes in the cluster, with several expressed at multiple life cycle stages. The presence of putative AflR-binding sites and the expression evidence for several cluster genes, coupled with the recent independent discovery of ST production in Podospora [1], suggest that this HGT event probably resulted in a functional cluster. Given the abundance of metabolic gene clusters in fungi, our finding that one of the largest known metabolic gene clusters moved intact between species suggests that such transfers might have significantly contributed to fungal metabolic diversity. Copyright © 2011 Elsevier Ltd. All rights reserved.

  4. Geomorphometric comparative analysis of Latin-American volcanoes

    NASA Astrophysics Data System (ADS)

    Camiz, Sergio; Poscolieri, Maurizio; Roverato, Matteo

    2017-07-01

    The geomorphometric classifications of three groups of volcanoes situated in the Andes Cordillera, Central America, and Mexico are performed and compared. Input data are eight local topographic gradients (i.e. elevation differences) obtained by processing each volcano raster ASTER-GDEM data. The pixels of each volcano DEM have been classified into 17 classes through a K-means clustering procedure following principal component analysis of the gradients. The spatial distribution of the classes, representing homogeneous terrain units, is shown on thematic colour maps, where colours are assigned according to mean slope and aspect class values. The interpretation of the geomorphometric classification of the volcanoes is based on the statistics of both gradients and morphometric parameters (slope, aspect and elevation). The latter were used for a comparison of the volcanoes, performed through classes' slope/aspect scatterplots and multidimensional methods. In this paper, we apply the mentioned methodology on 21 volcanoes, randomly chosen from Mexico to Patagonia, to show how it may contribute to detect geomorphological similarities and differences among them. As such, both its descriptive and graphical abilities may be a useful complement to future volcanological studies.
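
    The K-means step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: it is plain Lloyd's algorithm in standard-library Python, the function name `kmeans` and the toy data are hypothetical, the initialization (first k points) is a simplifying assumption, and the principal component analysis that precedes clustering in the article is omitted. The article clusters eight gradients per pixel into 17 classes; the toy example uses two dimensions and two classes.

```python
import math


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means (illustrative sketch, not the article's implementation).

    Initialization is deterministic: the first k points serve as starting centroids.
    Returns (labels, centroids).
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    if n_iter < 1:
        raise ValueError("n_iter must be at least 1")
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical 2-D "gradient" vectors forming two obvious groups.
    toy = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
    labels, centroids = kmeans(toy, k=2)
    print(labels)     # [0, 1, 0, 1]
    print(centroids)  # [[0.0, 0.5], [10.0, 10.5]]
```

    In practice the result depends on the initialization, so production analyses typically use several random restarts; the fixed initialization here only keeps the sketch reproducible.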

  5. A priori evaluation of two-stage cluster sampling for accuracy assessment of large-area land-cover maps

    USGS Publications Warehouse

    Wickham, J.D.; Stehman, S.V.; Smith, J.H.; Wade, T.G.; Yang, L.

    2004-01-01

    Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, within-cluster correlation may reduce the precision of the accuracy estimates. The detailed population information to quantify a priori the effect of within-cluster correlation on precision is typically unavailable. Consequently, a convenient, practical approach to evaluate the likely performance of a two-stage cluster sample is needed. We describe such an a priori evaluation protocol focusing on the spatial distribution of the sample by land-cover class across different cluster sizes and costs of different sampling options, including options not imposing clustering. This protocol also assesses the two-stage design's adequacy for estimating the precision of accuracy estimates for rare land-cover classes. We illustrate the approach using two large-area, regional accuracy assessments from the National Land-Cover Data (NLCD), and describe how the a priori evaluation was used as a decision-making tool when implementing the NLCD design.

  6. Investigations of Potential Phenotypes of Foot Osteoarthritis: Cross‐Sectional Analysis From the Clinical Assessment Study of the Foot

    PubMed Central

    Marshall, Michelle; Thomas, Martin J.; Menz, Hylton B.; Myers, Helen L.; Thomas, Elaine; Downes, Thomas; Peat, George; Roddy, Edward

    2016-01-01

    Objective To investigate the existence of distinct foot osteoarthritis (OA) phenotypes based on pattern of joint involvement and comparative symptom and risk profiles. Methods Participants ages ≥50 years reporting foot pain in the previous year were drawn from a population‐based cohort. Radiographs were scored for OA in the first metatarsophalangeal (MTP) joint, first and second cuneometatarsal, navicular first cuneiform, and talonavicular joints according to a published atlas. Chi‐square tests established clustering, and odds ratios (ORs) examined symmetry and pairwise associations of radiographic OA in the feet. Distinct underlying classes of foot OA were investigated by latent class analysis (LCA) and their association with symptoms and risk factors was assessed. Results In 533 participants (mean age 64.9 years, 55.9% female) radiographic OA clustered across both feet (P < 0.001) and was highly symmetrical (adjusted OR 3.0, 95% confidence interval 2.1, 4.2). LCA identified 3 distinct classes of foot OA: no or minimal foot OA (64%), isolated first MTP joint OA (22%), and polyarticular foot OA (15%). After adjustment for age and sex, polyarticular foot OA was associated with nodal OA, increased body mass index, and more pain and functional limitation compared to the other classes. Conclusion Patterning of radiographic foot OA has provided insight into the existence of 2 forms of foot OA: isolated first MTP joint OA and polyarticular foot OA. The symptom and risk factor profiles in individuals with polyarticular foot OA indicate a possible distinctive phenotype of foot OA, but further research is needed to explore the characteristics of isolated first MTP joint and polyarticular foot OA. PMID:26238801

  7. The Four U's: Latent Classes of Hookup Motivations Among College Students

    PubMed Central

    Uecker, Jeremy E.; Pearce, Lisa D.; Andercheck, Brita

    2016-01-01

    College students’ “hookups” have been the subject of a great deal of research in recent years. Motivations for hooking up have been linked to differences in well-being after the hookup, but studies detailing college students’ motivations for engaging in hookups focus on single motivations. Using data from the 2010 Duke Hookup Survey, we consider how motivations for hooking up cluster to produce different classes, or profiles, of students who hook up, and how these classes are related to hookup regret. Four distinct classes of motivations emerged from our latent class analysis: Utilitarians (50%), Uninhibiteds (27%), Uninspireds (19%), and Unreflectives (4%). We find a number of differences in hookup motivation classes across social characteristics, including gender, year in school, race-ethnicity, self-esteem, and attitudes about sexual behavior outside committed relationships. Additionally, Uninspireds regret hookups more frequently than members of the other classes, and Uninhibiteds report regret less frequently than Utilitarians and Uninspireds. These findings reveal the complexity of motivations for hooking up and the link between motivations and regret. PMID:27066516

  8. Prediction models for clustered data: comparison of a random intercept and standard regression model

    PubMed Central

    2013-01-01

    Background When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Methods Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. Results The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. 
Conclusion The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters. PMID:23414436
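
    The standard c-index reported in the abstract above is, for a binary outcome, the proportion of event/non-event pairs in which the event received the higher predicted risk, with ties counted as one half. A minimal sketch follows; the function name `c_index` and the example data are hypothetical, and the cluster-adapted performance measures used in the study are not reproduced here.

```python
def c_index(y_true, y_score):
    """Concordance (c) index for a binary outcome.

    y_true  : iterable of 0/1 outcomes (1 = event)
    y_score : iterable of predicted risks, same length as y_true
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


if __name__ == "__main__":
    # Hypothetical outcomes and predicted risks for four patients.
    outcomes = [0, 0, 1, 1]
    risks = [0.10, 0.40, 0.35, 0.80]
    print(c_index(outcomes, risks))  # 0.75
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which gives some context for the reported values of 0.66 to 0.69.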

  9. Prediction models for clustered data: comparison of a random intercept and standard regression model.

    PubMed

    Bouwmeester, Walter; Twisk, Jos W R; Kappen, Teus H; van Klei, Wilton A; Moons, Karel G M; Vergouwe, Yvonne

    2013-02-15

    When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. 
The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters.

  10. Valid statistical approaches for analyzing sholl data: Mixed effects versus simple linear models.

    PubMed

    Wilson, Machelle D; Sethi, Sunjay; Lein, Pamela J; Keil, Kimberly P

    2017-03-01

    The Sholl technique is widely used to quantify dendritic morphology. Data from such studies, which typically sample multiple neurons per animal, are often analyzed using simple linear models. However, simple linear models fail to account for intra-class correlation that occurs with clustered data, which can lead to faulty inferences. Mixed effects models account for intra-class correlation that occurs with clustered data; thus, these models more accurately estimate the standard deviation of the parameter estimate, which produces more accurate p-values. While mixed models are not new, their use in neuroscience has lagged behind their use in other disciplines. A review of the published literature illustrates common mistakes in analyses of Sholl data. Analysis of Sholl data collected from Golgi-stained pyramidal neurons in the hippocampus of male and female mice using both simple linear and mixed effects models demonstrates that the p-values and standard deviations obtained using the simple linear models are biased downwards and lead to erroneous rejection of the null hypothesis in some analyses. The mixed effects approach more accurately models the true variability in the data set, which leads to correct inference. Mixed effects models avoid faulty inference in Sholl analysis of data sampled from multiple neurons per animal by accounting for intra-class correlation. Given the widespread practice in neuroscience of obtaining multiple measurements per subject, there is a critical need to apply mixed effects models more widely. Copyright © 2017 Elsevier B.V. All rights reserved.
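
    Why ignoring intra-class correlation biases standard errors downward can be seen from the classical design effect for equal-sized clusters, 1 + (m - 1) × ICC, where m is the number of measurements per cluster. The sketch below is an illustration under that textbook assumption (exchangeable correlation, equal cluster sizes), not the analysis performed in the article; the function names and the numbers (10 neurons per animal, ICC of 0.3) are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation for a mean from equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be >= 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    return n_total / design_effect(cluster_size, icc)


def se_inflation(cluster_size, icc):
    """Factor by which a naive (independence-assuming) standard error is too small."""
    return math.sqrt(design_effect(cluster_size, icc))


if __name__ == "__main__":
    # Hypothetical design: 10 animals x 10 neurons each, ICC = 0.3.
    print(design_effect(10, 0.3))
    print(effective_sample_size(100, 10, 0.3))
    print(se_inflation(10, 0.3))
```

    With these assumed values the design effect is 3.7, so 100 neurons behave like roughly 27 independent observations, which is the kind of overstated precision that a mixed effects model is meant to correct.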

  11. Chandra Detection of an Evolved Population of Young Stars in Serpens South

    NASA Astrophysics Data System (ADS)

    Winston, E.; Wolk, S. J.; Gutermuth, R.; Bourke, T. L.

    2018-06-01

    We present a Chandra study of the deeply embedded Serpens South star-forming region, examining cluster structure and disk properties at the earliest stages. In total, 152 X-ray sources are detected. Combined with Spitzer and 2MASS photometry, 66 X-ray sources are reliably matched to an IR counterpart. We identify 21 class I, 6 flat spectrum, 16 class II, and 18 class III young stars; 5 were unclassified. Eighteen sources were variable in X-rays, 8 exhibiting flare-like emission and one source being periodic. The cluster’s X-ray luminosity distance was estimated: the best match was to the nearer distance of 260 pc for the front of the Aquila Rift complex. The ratio of N_H to A_K is found to be ∼0.68 × 10^22, similar to that measured in other young low-mass regions, but lower than that measured in the interstellar medium and high-mass clusters (∼(1.6–2) × 10^22). We find that the spatial distribution closely follows that of the dense filament from which the stars have formed, with the class II population still strongly associated with the filament. There are four subclusters in the field, with three forming knots in the filament, and a fourth to the west, which may not be associated but may be contributing to the distributed class III population. A high percentage of diskless class IIIs (upper limit 30% of classified X-ray sources) in such a young cluster could indicate that processing of disks is influenced by the cluster environment and is not solely dependent on timescale.

  12. Evaluation of diagnostic tools that tertiary teachers can apply to profile their students' conceptions

    NASA Astrophysics Data System (ADS)

    Schultz, Madeleine; Lawrie, Gwendolyn A.; Bailey, Chantal H.; Bedford, Simon B.; Dargaville, Tim R.; O'Brien, Glennys; Tasker, Roy; Thompson, Christopher D.; Williams, Mark; Wright, Anthony H.

    2017-03-01

    A multi-institution collaborative team of Australian chemistry education researchers, teaching a total of over 3000 first year chemistry students annually, has explored a tool for diagnosing students' prior conceptions as they enter tertiary chemistry courses. Five core topics were selected and clusters of diagnostic items were assembled linking related concepts in each topic together. An ordered multiple choice assessment strategy was adopted to enable provision of formative feedback to students through combination of the specific distractors that they chose. Concept items were either sourced from existing research instruments or developed by the project team. The outcome is a diagnostic tool consisting of five topic clusters of five concept items that has been delivered in large introductory chemistry classes at five Australian institutions. Statistical analysis of data has enabled exploration of the composition and validity of the instrument including a comparison between delivery of the complete 25 item instrument with subsets of five items, clustered by topic. This analysis revealed that most items retained their validity when delivered in small clusters. Tensions between the assembly, validation and delivery of diagnostic instruments for the purposes of acquiring robust psychometric research data versus their pragmatic use are considered in this study.

  13. Diversity in Older Adults' Use of the Internet: Identifying Subgroups Through Latent Class Analysis.

    PubMed

    van Boekel, Leonieke C; Peek, Sebastiaan Tm; Luijkx, Katrien G

    2017-05-24

    As for all individuals, the Internet is important in the everyday life of older adults. Research on older adults' use of the Internet has merely focused on users versus nonusers and consequences of Internet use and nonuse. Older adults are a heterogeneous group, which may implicate that their use of the Internet is diverse as well. Older adults can use the Internet for different activities, and this usage can be of influence on benefits the Internet can have for them. The aim of this paper was to describe the diversity or heterogeneity in the activities for which older adults use the Internet and determine whether diversity is related to social or health-related variables. We used data of a national representative Internet panel in the Netherlands. Panel members aged 65 years and older and who have access to and use the Internet were selected (N=1418). We conducted a latent class analysis based on the Internet activities that panel members reported to spend time on. Second, we described the identified clusters with descriptive statistics and compared the clusters using analysis of variance (ANOVA) and chi-square tests. Four clusters were distinguished. Cluster 1 was labeled as the "practical users" (36.88%, n=523). These respondents mainly used the Internet for practical and financial purposes such as searching for information, comparing products, and banking. Respondents in Cluster 2, the "minimizers" (32.23%, n=457), reported lowest frequency on most Internet activities, are older (mean age 73 years), and spent the smallest time on the Internet. Cluster 3 was labeled as the "maximizers" (17.77%, n=252); these respondents used the Internet for various activities, spent most time on the Internet, and were relatively younger (mean age below 70 years). Respondents in Cluster 4, the "social users," mainly used the Internet for social and leisure-related activities such as gaming and social network sites. 
The identified clusters significantly differed in age (P<.001, ω²=0.07), time spent on the Internet (P<.001, ω²=0.12), and frequency of downloading apps (P<.001, ω²=0.14), with medium to large effect sizes. Social and health-related variables were significantly different between the clusters, except social and emotional loneliness. However, effect sizes were small. The minimizers scored significantly lower on psychological well-being, instrumental activities of daily living (iADL), and experienced health compared with the practical users and maximizers. Older adults are a diverse group in terms of their activities on the Internet. This underlines the importance to look beyond use versus nonuse when studying older adults' Internet use. The clusters we have identified in this study can help tailor the development and deployment of eHealth intervention to specific segments of the older population. ©Leonieke C van Boekel, Sebastiaan TM Peek, Katrien G Luijkx. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 24.05.2017.

  14. Diversity in Older Adults’ Use of the Internet: Identifying Subgroups Through Latent Class Analysis

    PubMed Central

    van Boekel, Leonieke C; Peek, Sebastiaan TM; Luijkx, Katrien G

    2017-01-01

    Background: As for all individuals, the Internet is important in the everyday life of older adults. Research on older adults’ use of the Internet has focused mainly on users versus nonusers and on the consequences of Internet use and nonuse. Older adults are a heterogeneous group, which may imply that their use of the Internet is diverse as well. Older adults can use the Internet for different activities, and this usage can influence the benefits the Internet can have for them. Objective: The aim of this paper was to describe the diversity or heterogeneity in the activities for which older adults use the Internet and to determine whether diversity is related to social or health-related variables. Methods: We used data from a nationally representative Internet panel in the Netherlands. Panel members aged 65 years and older who have access to and use the Internet were selected (N=1418). We conducted a latent class analysis based on the Internet activities that panel members reported spending time on. Second, we described the identified clusters with descriptive statistics and compared the clusters using analysis of variance (ANOVA) and chi-square tests. Results: Four clusters were distinguished. Cluster 1 was labeled as the “practical users” (36.88%, n=523). These respondents mainly used the Internet for practical and financial purposes such as searching for information, comparing products, and banking. Respondents in Cluster 2, the “minimizers” (32.23%, n=457), reported the lowest frequency on most Internet activities, were older (mean age 73 years), and spent the least time on the Internet. Cluster 3 was labeled as the “maximizers” (17.77%, n=252); these respondents used the Internet for various activities, spent the most time on the Internet, and were relatively younger (mean age below 70 years). Respondents in Cluster 4, the “social users,” mainly used the Internet for social and leisure-related activities such as gaming and social network sites. The identified clusters significantly differed in age (P<.001, ω²=0.07), time spent on the Internet (P<.001, ω²=0.12), and frequency of downloading apps (P<.001, ω²=0.14), with medium to large effect sizes. Social and health-related variables were significantly different between the clusters, except social and emotional loneliness. However, effect sizes were small. The minimizers scored significantly lower on psychological well-being, instrumental activities of daily living (iADL), and experienced health compared with the practical users and maximizers. Conclusions: Older adults are a diverse group in terms of their activities on the Internet. This underlines the importance of looking beyond use versus nonuse when studying older adults’ Internet use. The clusters we have identified in this study can help tailor the development and deployment of eHealth interventions to specific segments of the older population. PMID:28539302
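
The ω² values quoted in this abstract are effect sizes for the one-way ANOVA comparisons between clusters. As a rough illustration of how such a value is computed, the sketch below applies the standard formula ω² = (SS_between − (k − 1)·MS_within) / (SS_total + MS_within). The function name and the toy numbers are hypothetical and are not taken from the study.

```python
def omega_squared(groups):
    """Omega-squared effect size for a one-way ANOVA.

    `groups` is a list of lists of numeric observations, one list per cluster.
    """
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ss_total = ss_between + ss_within
    ms_within = ss_within / (n_total - k)

    return (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)


# Toy ages for three made-up clusters (illustrative only, not study data).
toy_ages = [[66, 68, 70], [72, 73, 75], [67, 69, 71]]
print(round(omega_squared(toy_ages), 3))
```

Unlike eta-squared, this estimate can fall slightly below zero when the group means barely differ, which is usually read as "no effect".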

  15. WEBGIS based CropWatch online agriculture monitoring system

    NASA Astrophysics Data System (ADS)

    Zhang, X.; Wu, B.; Zeng, H.; Zhang, M.; Yan, N.

    2015-12-01

    CropWatch, which was developed by the Institute of Remote Sensing and Digital Earth (RADI), Chinese Academy of Sciences (CAS), has achieved breakthrough results in the integration of methods, independence of the assessments and support to emergency response by periodically releasing global agricultural information. Taking advantage of multi-source remote sensing data and open data sharing policies, the CropWatch group reports its monitoring results in four bulletins per year. To better analyze the data and generate the bulletins, and to provide an alternative way to access agricultural monitoring indicators and results in CropWatch, the CropWatch online system based on WEBGIS techniques has been developed. Figure 1 shows the CropWatch online system structure and the system UI in Clustering mode. Data visualization is sorted into three different modes: Vector mode, Raster mode and Clustering mode. Vector mode provides the statistical values of all the indicators for each monitoring unit, which allows users to compare the current situation with historical values (average, maximum, etc.). Users can compare the profiles of each indicator over the current growing season with the historical data in a chart by selecting the region of interest (ROI). Raster mode provides pixel-based anomalies of CropWatch indicators globally. In this mode, users are able to zoom in to the regions where a notable anomaly was identified from the statistical values in Vector mode. Data from remote sensing image series at high temporal and low spatial resolution provide key information in agriculture monitoring. Clustering mode provides integrated information on different classes in maps, the corresponding profiles for each class and the percentage of area of each class relative to the total area of all classes. The time series data are categorized into a limited number of types by the ISODATA algorithm. For each clustering type, the pixels on the map, the profiles, and the percentage legend are all linked together. All three visualization methods are applied at four scales, including 65 monitoring and reporting units (MRUs), 7 major production zones (MPZs), 173 countries and sub-countries for 9 large countries. Agro-climatic information, agronomic information and indicators related to crop area, crop yield and crop production are provided.

  16. MAPS OF MASSIVE CLUMPS IN THE EARLY STAGE OF CLUSTER FORMATION: TWO MODES OF CLUSTER FORMATION, COEVAL OR NON-COEVAL?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Higuchi, Aya E.; Saito, Masao; Mauersberger, Rainer

    2013-03-10

    We present maps of seven young massive molecular clumps within five target regions in C¹⁸O (J = 1-0) line emission, using the Nobeyama 45 m telescope. These clumps, which are not associated with clusters, lie at distances between 0.7 and 2.1 kpc. We find C¹⁸O clumps with radii of 0.5-1.7 pc, masses of 470-4200 M⊙, and velocity widths of 1.4-3.3 km s⁻¹. All of the clumps are massive and approximately in virial equilibrium, suggesting they will potentially form clusters. Three of our target regions are associated with H II regions (CWHRs), while the other two are unassociated with H II regions (CWOHRs). The C¹⁸O clumps can be classified into two morphological types: CWHRs with a filamentary or shell-like structure and spherical CWOHRs. The two CWOHRs have systematic velocity gradients. Using the publicly released WISE database, Class I and Class II protostellar candidates are identified within the C¹⁸O clumps. The fraction of Class I candidates among all YSO candidates (Class I+Class II) is ≥50% in CWHRs and ≤50% in CWOHRs. We conclude that effects from the H II regions can be seen in (1) the spatial distributions of the clumps: filamentary or shell-like structure running along the H II regions; (2) the velocity structures of the clumps: large velocity dispersion along shells; and (3) the small age spreads of YSOs. The small spreads in age of the YSOs show that the presence of H II regions tends to trigger coeval cluster formation.

  17. A new approach for computing a flood vulnerability index using cluster analysis

    NASA Astrophysics Data System (ADS)

    Fernandez, Paulo; Mourato, Sandra; Moreira, Madalena; Pereira, Luísa

    2016-08-01

    A Flood Vulnerability Index (FloodVI) was developed using Principal Component Analysis (PCA) and a new aggregation method based on Cluster Analysis (CA). PCA simplifies a large number of variables into a few uncorrelated factors representing the social, economic, physical and environmental dimensions of vulnerability. CA groups areas that have the same characteristics in terms of vulnerability into vulnerability classes. The grouping of the areas determines their classification contrary to other aggregation methods in which the areas' classification determines their grouping. While other aggregation methods distribute the areas into classes, in an artificial manner, by imposing a certain probability for an area to belong to a certain class, as determined by the assumption that the aggregation measure used is normally distributed, CA does not constrain the distribution of the areas by the classes. FloodVI was designed at the neighbourhood level and was applied to the Portuguese municipality of Vila Nova de Gaia where several flood events have taken place in the recent past. The FloodVI sensitivity was assessed using three different aggregation methods: the sum of component scores, the first component score and the weighted sum of component scores. The results highlight the sensitivity of the FloodVI to different aggregation methods. Both sum of component scores and weighted sum of component scores have shown similar results. The first component score aggregation method classifies almost all areas as having medium vulnerability and finally the results obtained using the CA show a distinct differentiation of the vulnerability where hot spots can be clearly identified. The information provided by records of previous flood events corroborate the results obtained with CA, because the inundated areas with greater damages are those that are identified as high and very high vulnerability areas by CA. This supports the fact that CA provides a reliable FloodVI.
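
The three score-based aggregation methods compared in this abstract (sum of component scores, first component score, weighted sum of component scores) can be sketched in a few lines. This is only an illustration: the function name, the example scores and the weights are hypothetical, and the choice of each component's share of explained variance as its weight is an assumption, since the abstract does not say how the weights were defined.

```python
def aggregate(scores, method, weights=None):
    """Collapse one area's PCA component scores into a single index value.

    scores  -- list of component scores for one area (first component first)
    method  -- "sum", "first" or "weighted"
    weights -- one weight per component, required for "weighted"
               (assumed here to be explained-variance shares)
    """
    if method == "sum":
        return sum(scores)
    if method == "first":
        return scores[0]
    if method == "weighted":
        if weights is None or len(weights) != len(scores):
            raise ValueError("weighted aggregation needs one weight per component")
        return sum(w * s for w, s in zip(weights, scores))
    raise ValueError(f"unknown method: {method}")


# Made-up component scores for one neighbourhood and made-up weights.
area_scores = [1.0, -0.5, 2.0]
variance_share = [0.5, 0.25, 0.25]
for name in ("sum", "first", "weighted"):
    print(name, aggregate(area_scores, name, variance_share))
```

The cluster-analysis aggregation described in the abstract differs in kind: it groups areas by their full score vectors instead of first reducing each area to one number.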

  18. Automated method to differentiate between native and mirror protein models obtained from contact maps

    PubMed Central

    Kurczynska, Monika

    2018-01-01

    Mirror protein structures are often considered artifacts in protein structure modeling. However, they may soon become a new branch of biochemistry. Moreover, methods of protein structure reconstruction based on residue-residue contact maps need a methodology to differentiate between models of native and mirror orientation, especially regarding the reconstructed backbones. We analyzed 130 500 structural protein models obtained from contact maps of 1 305 SCOP domains belonging to all 7 structural classes. On average, the same numbers of native and mirror models were obtained among the 100 models generated for each domain. Since their structural features are often not sufficient for differentiating between the two types of model orientations, we proposed to apply various energy terms (ETs) from PyRosetta to separate native and mirror models. To automate the procedure for differentiating these models, the k-means clustering algorithm was applied. Using total energy did not yield appropriate clusters: the accuracy of the clustering for class A (all helices) was no more than 0.52. Therefore, we tested a series of different k-means clusterings based on various combinations of ETs. Finally, applying the two most differentiating ETs for each class yielded satisfactory results. To unify the method for differentiating between native and mirror models, independent of their structural class, the two best ETs for each class were considered. Finally, the k-means clustering algorithm used three common ETs: probability of amino acid assuming certain values of dihedral angles Φ and Ψ, Ramachandran preferences and Coulomb interactions. The accuracies of clustering with these ETs were in the range between 0.68 and 0.76, with sensitivity and selectivity in the range between 0.68 and 0.87, depending on the structural class. The method can be applied to all fully-automated tools for protein structure reconstruction based on contact maps, especially those analyzing big sets of models. PMID:29787567
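
The procedure described in this abstract, two-cluster k-means on a few energy terms followed by an accuracy check against the known orientation, can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the feature tuples stand in for energy terms (no PyRosetta call is made), the deterministic initialisation is a convenience, and scoring accuracy under the better of the two cluster-to-label assignments is an assumption, since the abstract does not say how clusters were matched to native and mirror labels.

```python
import math


def kmeans_two_clusters(points, iterations=50):
    """Plain k-means with k=2 on a list of equal-length feature tuples."""
    # Deterministic start: the first point and the point farthest from it.
    first = points[0]
    second = max(points, key=lambda p: math.dist(p, first))
    centroids = [list(first), list(second)]

    labels = None
    for _step in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            0 if math.dist(p, centroids[0]) <= math.dist(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def clustering_accuracy(labels, truth):
    """Fraction of points labelled correctly, allowing the two cluster ids to be swapped."""
    agree = sum(1 for a, b in zip(labels, truth) if a == b)
    return max(agree, len(truth) - agree) / len(truth)


# Hypothetical two-feature "energy term" vectors: 0 = native, 1 = mirror.
models = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
orientation = [0, 0, 0, 1, 1, 1]
print(clustering_accuracy(kmeans_two_clusters(models), orientation))
```

With balanced classes, as in the abstract, an accuracy near 0.5 under this scoring means the clusters carry almost no information about orientation, which is how the reported 0.52 for total energy reads.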

  19. Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

    PubMed Central

    Bushel, Pierre R; Wolfinger, Russell D; Gibson, Greg

    2007-01-01

    Background: Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology. Results: We present a more formal approach, the modk-prototypes algorithm, for clustering biological samples based on simultaneously considering microarray gene expression data and classes of known phenotypic variables such as clinical chemistry evaluations and histopathologic observations. The strategy involves constructing an objective function with the sum of the squared Euclidean distances for numeric microarray and clinical chemistry data and simple matching for histopathology categorical values in order to measure dissimilarity of the samples. Separate weighting terms are used for microarray, clinical chemistry and histopathology measurements to control the influence of each data domain on the clustering of the samples. The dynamic validity index for numeric data was modified with a category utility measure for determining the number of clusters in the data sets. A cluster's prototype, formed from the mean of the values for numeric features and the mode of the categorical values of all the samples in the group, is representative of the phenotype of the cluster members. The approach is shown to work well with a simulated mixed data set and two real data examples containing numeric and categorical data types: one from a heart disease study and another from acetaminophen (an analgesic) exposure in rat liver that causes centrilobular necrosis. Conclusion: The modk-prototypes algorithm partitioned the simulated data into clusters with samples in their respective class group and the heart disease samples into two groups (sick and buff denoting samples having pain type representative of angina and non-angina respectively) with an accuracy of 79%. This is on par with, or better than, the assignment accuracy of the heart disease samples by several well-known and successful clustering algorithms. Following modk-prototypes clustering of the acetaminophen-exposed samples, informative genes from the cluster prototypes were identified that are descriptive of, and phenotypically anchored to, levels of necrosis of the centrilobular region of the rat liver. The biological processes cell growth and/or maintenance, amine metabolism, and stress response were shown to discern between no and moderate levels of acetaminophen-induced centrilobular necrosis. The use of well-known and traditional measurements directly in the clustering provides some guarantee that the resulting clusters will be meaningfully interpretable. PMID:17408499
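
The dissimilarity and prototype definitions given in this abstract (squared Euclidean distance for numeric data, simple matching for categorical data, one weight per data domain, prototype from means and modes) can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the dictionary layout, key names, function names, default weights and sample values are all hypothetical.

```python
from statistics import mean, mode


def mixed_dissimilarity(sample, prototype, w_expr=1.0, w_chem=1.0, w_histo=1.0):
    """Weighted dissimilarity between a sample and a cluster prototype.

    Both arguments are dicts with keys "expr" and "chem" (lists of floats)
    and "histo" (list of category labels).
    """
    # Squared Euclidean distance for the two numeric domains.
    d_expr = sum((a - b) ** 2 for a, b in zip(sample["expr"], prototype["expr"]))
    d_chem = sum((a - b) ** 2 for a, b in zip(sample["chem"], prototype["chem"]))
    # Simple matching for categorical values: count the mismatches.
    d_histo = sum(1 for a, b in zip(sample["histo"], prototype["histo"]) if a != b)
    return w_expr * d_expr + w_chem * d_chem + w_histo * d_histo


def cluster_prototype(samples):
    """Mean of each numeric feature and mode of each categorical feature."""
    return {
        "expr": [mean(col) for col in zip(*(s["expr"] for s in samples))],
        "chem": [mean(col) for col in zip(*(s["chem"] for s in samples))],
        "histo": [mode(col) for col in zip(*(s["histo"] for s in samples))],
    }


# Three made-up samples (values are illustrative only).
s1 = {"expr": [1.0, 2.0], "chem": [10.0], "histo": ["none", "mild"]}
s2 = {"expr": [3.0, 2.0], "chem": [14.0], "histo": ["none", "moderate"]}
s3 = {"expr": [2.0, 5.0], "chem": [12.0], "histo": ["none", "mild"]}
proto = cluster_prototype([s1, s2, s3])
print(proto)
print(mixed_dissimilarity(s2, proto, w_chem=0.5, w_histo=2.0))
```

Raising or lowering one weight changes how strongly that data domain drives the grouping, which is the role the abstract assigns to the separate weighting terms.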

  20. The terrain signatures of administrative units: a tool for environmental assessment.

    PubMed

    Miliaresis, George Ch

    2009-03-01

    The quantification of knowledge related to the terrain and the landuse/landcover of administrative units in Southern Greece (Peloponnesus) is performed from the CGIAR-CSI SRTM digital elevation model and the CORINE landuse/landcover database. Each administrative unit is parametrically represented by a set of attributes related to its relief. Administrative units are classified on the basis of K-means cluster analysis in an attempt to see how they are organized into groups and cluster derived geometric signatures are defined. Finally each cluster is parametrically represented on the basis of the occurrence of the Corine landuse/landcover classes included and thus, landcover signatures are derived. The geometric and the landuse/landcover signatures revealed a terrain dependent landuse/landcover organization that was used in the assessment of the forest fires impact at moderate resolution scale.

  1. The genetic heterogeneity of Arab populations as inferred from HLA genes

    PubMed Central

    Almawi, Wassim Y.; Arnaiz-Villena, Antonio; Hattab, Lasmar; Hmida, Slama

    2018-01-01

    This is the first genetic anthropology study on Arabs in the MENA (Middle East and North Africa) region. The present meta-analysis included 100 populations from 36 Arab and non-Arab communities, comprising 16,006 individuals, and evaluates the genetic profile of Arabs using HLA class I (A, B) and class II (DRB1, DQB1) genes. A total of 56 Arab populations comprising 10,283 individuals were selected from several databases, and were compared with 44 Mediterranean, Asian, and sub-Saharan populations. The most frequent alleles in Arabs are A*01, A*02, B*35, B*51, DRB1*03:01, DRB1*07:01, DQB1*02:01, and DQB1*03:01, while DRB1*03:01-DQB1*02:01 and DRB1*07:01-DQB1*02:02 are the most frequent class II haplotypes. Dendrograms, correspondence analyses, genetic distances, and haplotype analysis indicate that Arabs could be stratified into four groups. The first consists of North Africans (Algerians, Tunisians, Moroccans, and Libyans), and the first Arabian Peninsula cluster (Saudis, Kuwaitis, and Yemenis), who appear to be related to Western Mediterraneans, including Iberians; this might be explained by a massive migration into these areas when the Sahara underwent a relatively rapid desiccation, starting about 10,000 years BC. The second includes Levantine Arabs (Palestinians, Jordanians, Lebanese, and Syrians), along with Iraqis and Egyptians, who are related to Eastern Mediterraneans. The third comprises Sudanese and Comorians, who tend to cluster with Sub-Saharans. The fourth comprises the second Arabian Peninsula cluster, made up of Omanis, Emiratis, and Bahrainis. It is noteworthy that the two large minorities (Berbers and Kurds) are indigenous (autochthonous), and are not genetically different from “host” and neighboring populations. In conclusion, this study confirmed high genetic heterogeneity among present-day Arabs, and especially those of the Arabian Peninsula. PMID:29522542

  2. Semantic segmentation of 3D textured meshes for urban scene analysis

    NASA Astrophysics Data System (ADS)

    Rouhani, Mohammad; Lafarge, Florent; Alliez, Pierre

    2017-01-01

    Classifying 3D measurement data has become a core problem in photogrammetry and 3D computer vision, since the rise of modern multiview geometry techniques, combined with affordable range sensors. We introduce a Markov Random Field-based approach for segmenting textured meshes generated via multi-view stereo into urban classes of interest. The input mesh is first partitioned into small clusters, referred to as superfacets, from which geometric and photometric features are computed. A random forest is then trained to predict the class of each superfacet as well as its similarity with the neighboring superfacets. Similarity is used to assign the weights of the Markov Random Field pairwise-potential and to account for contextual information between the classes. The experimental results illustrate the efficacy and accuracy of the proposed framework.

  3. Probing dark matter physics with galaxy clusters

    NASA Astrophysics Data System (ADS)

    Dalal, Neal

    2016-10-01

    We propose a theoretical investigation of the effects of a class of dark matter (DM) self-interactions on the properties of galaxy clusters and their host dark matter halos. Recent work using HST has claimed the detection of a particular form of DM self-interaction, which can lead to observable displacements between satellite galaxies within clusters and the DM subhalos hosting them. This form of self-interaction is highly anisotropic, favoring forward scattering with low momentum transfer, unlike isotropically scattering self-interacting dark matter (SIDM) models. This class of models has not been simulated numerically, clouding the interpretation of the claimed offsets between galaxies and lensing peaks observed by HST. We propose to perform high resolution simulations of cosmological structure formation for this class of SIDM model, focusing on three observables accessible to existing HST observations of clusters. First, we will quantify the extent to which offsets between baryons and DM can arise in these models, as a function of the cross section. Secondly, we will also quantify the effects of this type of DM self-interaction on halo concentrations, to determine the range of cross-sections allowed by existing stringent constraints from HST. Finally we will compute the so-called splashback feature in clusters, specifically focusing on whether SIDM can resolve the current discrepancy between observed values of splashback radii in clusters compared to theoretical predictions for CDM. The proposed investigations will add value to all existing deep HST observations of galaxy clusters by allowing them to probe dark matter physics in three independent ways.

  4. Is There a Typology of Teacher and Leader Responders to Call and Do They Cluster in Different Types of Schools? A Two-Level Latent Class Analysis of Call Survey Data

    ERIC Educational Resources Information Center

    Bowers, Alex J.; Blitz, Mark; Modeste, Marsha; Salisbury, Jason; Halverson, Richard R.

    2017-01-01

    Background: Across the recent research on school leadership, leadership for learning has emerged as a strong framework for integrating current theories, such as instructional, transformational, and distributed leadership as well as effective human resource practices, instructional evaluation, and resource allocation. Yet, questions remain as to…

  5. Capacity Analysis of Multihop Packet Radio Networks under a General Class of Channel Access Protocols and Capture Models

    DTIC Science & Technology

    1987-03-01

    Gitman in [Gitm75]. The system considered consisted of a set of clusters (each with an infinite population of users) that communicate with a central...30, no. 5, pp. 985-995, May 1982. [Gitm75] I. Gitman, "On the Capacity of Slotted ALOHA Networks and Some Design Problems," IEEE Trans. Comm., vol

  6. Positive selection on MHC class II DRB and DQB genes in the bank vole (Myodes glareolus).

    PubMed

    Scherman, Kristin; Råberg, Lars; Westerdahl, Helena

    2014-05-01

    The major histocompatibility complex (MHC) class IIB genes show considerable sequence similarity between loci. The MHC class II DQB and DRB genes are known to exhibit a high level of polymorphism, most likely maintained by parasite-mediated selection. Studies of the MHC in wild rodents have focused on DRB, whilst DQB has been given much less attention. Here, we characterised DQB genes in Swedish bank voles Myodes glareolus, using full-length transcripts. We then designed primers that specifically amplify exon 2 from DRB (202 bp) and DQB (205 bp) and investigated molecular signatures of natural selection on DRB and DQB alleles. The presence of two separate gene clusters was confirmed using BLASTN and phylogenetic analysis, where our seven transcripts clustered according to either DQB or DRB homologues. These gene clusters were again confirmed on exon 2 data from 454-amplicon sequencing. Our DRB primers amplify a similar number of alleles per individual as previously published DRB primers, though our reads are longer. Traditional dN/dS analyses of DRB sequences in the bank vole have not found a conclusive signal of positive selection. Using a more advanced substitution model (the Kumar method) we found positive selection in the peptide binding region (PBR) of both DRB and DQB genes. Maximum likelihood models of codon substitutions detected positively selected sites located in the PBR of both DQB and DRB. Interestingly, these analyses detected at least twice as many positively selected sites in DQB as in DRB, suggesting that DQB has been under stronger positive selection than DRB over evolutionary time.

  7. Identification of Chiari Type I Malformation subtypes using whole genome expression profiles and cranial base morphometrics

    PubMed Central

    2014-01-01

    Background: Chiari Type I Malformation (CMI) is characterized by herniation of the cerebellar tonsils through the foramen magnum at the base of the skull, resulting in significant neurologic morbidity. As CMI patients display a high degree of clinical variability and multiple mechanisms have been proposed for tonsillar herniation, it is hypothesized that this heterogeneous disorder is due to multiple genetic and environmental factors. The purpose of the present study was to gain a better understanding of what factors contribute to this heterogeneity by using an unsupervised statistical approach to define disease subtypes within a case-only pediatric population. Methods: A collection of forty-four pediatric CMI patients were ascertained to identify disease subtypes using whole genome expression profiles generated from patient blood and dura mater tissue samples, and radiological data consisting of posterior fossa (PF) morphometrics. Sparse k-means clustering and an extension to accommodate multiple data sources were used to cluster patients into more homogeneous groups using biological and radiological data both individually and collectively. Results: All clustering analyses resulted in the significant identification of patient classes, with the pure biological classes derived from patient blood and dura mater samples demonstrating the strongest evidence. Those patient classes were further characterized by identifying enriched biological pathways, as well as correlated cranial base morphological and clinical traits. Conclusions: Our results implicate several strong biological candidates warranting further investigation from the dura expression analysis and also identified a blood gene expression profile corresponding to a global down-regulation in protein synthesis. PMID:24962150

  8. Predictors of fibromyalgia: a population-based twin cohort study.

    PubMed

    Markkula, Ritva A; Kalso, Eija A; Kaprio, Jaakko A

    2016-01-15

    Fibromyalgia (FM) is a pain syndrome, the mechanisms and predictors of which are still unclear. We have earlier validated a set of FM-symptom questions for detecting possible FM in an epidemiological survey and thereby identified a cluster with "possible FM". This study prospectively explores predictors for membership of that FM-symptom cluster. A population-based sample of 8343 subjects of the older Finnish Twin Cohort replied to health questionnaires in 1975, 1981, and 1990. Their answers to the set of FM-symptom questions in 1990 classified them into three latent classes (LC): LC1 with no or few symptoms, LC2 with some symptoms, and LC3 with many FM symptoms. We analysed putative predictors for these symptom classes using baseline (1975 and 1981) data on regional pain, headache, migraine, sleeping, body mass index (BMI), physical activity, smoking, and zygosity, adjusted for age, gender, and education. Those with a high likelihood of having fibromyalgia at baseline were excluded from the analysis. In the final multivariate regression model, regional pain, sleeping problems, and overweight were all predictors for membership in the class with many FM symptoms. The strongest non-genetic predictor was frequent headache (OR 8.6, CI 95% 3.8-19.2), followed by persistent back pain (OR 4.7, CI 95% 3.3-6.7) and persistent neck pain (OR 3.3, CI 95% 1.8-6.0). Regional pain, frequent headache, persistent back or neck pain, sleeping problems, and overweight are predictors for having a cluster of symptoms consistent with fibromyalgia.

  9. Ensemble Clustering Classification compete SVM and One-Class classifiers applied on plant microRNAs Data.

    PubMed

    Yousef, Malik; Khalifa, Waleed; AbedAllah, Loai

    2016-12-22

    The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points with respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved the highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.

  10. Ensemble Clustering Classification Applied to Competing SVM and One-Class Classifiers Exemplified by Plant MicroRNAs Data.

    PubMed

    Yousef, Malik; Khalifa, Waleed; AbdAllah, Loai

    2016-12-01

    The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points with respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved the highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.
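
One plausible reading of the co-clustering distance described in this abstract is a co-association distance: one minus the fraction of clusterings in the ensemble that place two points in the same cluster. The sketch below plugs that distance into a plain kNN vote. It is an illustration under assumptions, not the published EC-kNN code: the function names, the transductive setup (query points are clustered together with the labeled points) and the toy ensemble are all hypothetical.

```python
from collections import Counter


def co_cluster_distance(ensemble, i, j):
    """1 minus the fraction of clusterings in which points i and j share a cluster.

    `ensemble` is a list of clusterings; each clustering is a list of cluster
    ids indexed by point.
    """
    together = sum(1 for labels in ensemble if labels[i] == labels[j])
    return 1.0 - together / len(ensemble)


def ec_knn_predict(ensemble, train_labels, query, k=3):
    """Majority vote among the k labeled points closest to `query`.

    `train_labels` maps a point index to its class label; `query` is the
    index of an unlabeled point that was clustered along with the others.
    """
    # Sort by distance, breaking ties by index so the result is deterministic.
    neighbours = sorted(
        train_labels,
        key=lambda idx: (co_cluster_distance(ensemble, idx, query), idx),
    )
    votes = Counter(train_labels[idx] for idx in neighbours[:k])
    return votes.most_common(1)[0][0]


# Three made-up clusterings of six points; points 2 and 5 are unlabeled.
toy_ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [2, 2, 2, 0, 0, 0],
]
known = {0: "pos", 1: "pos", 3: "neg", 4: "neg"}
print(ec_knn_predict(toy_ensemble, known, query=2))
print(ec_knn_predict(toy_ensemble, known, query=5))
```

Because the distance depends only on cluster co-membership, it is insensitive to how cluster ids are numbered in each individual clustering.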

  11. Evolution of Chemical Diversity in a Group of Non-Reduced Polyketide Gene Clusters: Using Phylogenetics to Inform the Search for Novel Fungal Natural Products

    PubMed Central

    Throckmorton, Kurt; Wiemann, Philipp; Keller, Nancy P.

    2015-01-01

    Fungal polyketides are a diverse class of natural products, or secondary metabolites (SMs), with a wide range of bioactivities often associated with toxicity. Here, we focus on a group of non-reducing polyketide synthases (NR-PKSs) in the fungal phylum Ascomycota that lack a thioesterase domain for product release, group V. Although widespread in ascomycete taxa, this group of NR-PKSs is notably absent in the mycotoxigenic genus Fusarium and, surprisingly, found in genera not known for their secondary metabolite production (e.g., the mycorrhizal genus Oidiodendron, the powdery mildew genus Blumeria, and the causative agent of white-nose syndrome in bats, Pseudogymnoascus destructans). This group of NR-PKSs, in association with the other enzymes encoded by their gene clusters, produces a variety of different chemical classes including naphthacenediones, anthraquinones, benzophenones, grisandienes, and diphenyl ethers. We discuss the modification of and transitions between these chemical classes, the requisite enzymes, and the evolution of the SM gene clusters that encode them. Integrating this information, we predict the likely products of related but uncharacterized SM clusters, and we speculate upon the utility of these classes of SMs as virulence factors or chemical defenses to various plant, animal, and insect pathogens, as well as mutualistic fungi. PMID:26378577

  12. The Future of Classification in Wheelchair Sports; Can Data Science and Technological Advancement Offer an Alternative Point of View?

    PubMed

    van der Slikke, Rienk M A; Bregman, Daan J J; Berger, Monique A M; de Witte, Annemarie M H; Veeger, Dirk-Jan H E J

    2017-11-01

    Classification is a defining factor for competition in wheelchair sports, but it is a delicate and time-consuming process with often questionable validity. New inertial sensor based measurement methods, applied in match play and field tests, allow for more precise and objective estimates of the impairment effect on wheelchair mobility performance. It was evaluated if these measures could offer an alternative point of view for classification. Six standard wheelchair mobility performance outcomes of different classification groups were measured in match play (n=29), as well as best possible performance in a field test (n=47). In match results a clear relationship between classification and performance level is shown, with increased performance outcomes in each adjacent higher classification group. Three outcomes differed significantly between the low and mid-class groups, and one between the mid and high-class groups. In best performance (field test), a split between the low and mid-class groups shows (5 out of 6 outcomes differed significantly) but hardly any difference between the mid and high-class groups. This observed split was confirmed by cluster analysis, revealing the existence of only two performance based clusters. The use of inertial sensor technology to get objective measures of wheelchair mobility performance, combined with a standardized field-test, brought alternative views for evidence based classification. The results of this approach provided arguments for a reduced number of classes in wheelchair basketball. Future use of inertial sensors in match play and in field testing could enhance evaluation of classification guidelines as well as individual athlete performance.

  13. Four- and eight-membered rings carbon nanotubes: A new class of carbon nanomaterials

    NASA Astrophysics Data System (ADS)

    Li, Fangfang; Lu, Junzhe; Zhu, Hengjiang; Lin, Xiang

    2018-06-01

    A new class of carbon nanomaterials composed of alternating four- and eight-membered rings is studied by density functional theory (DFT), including single-walled carbon nanotubes (SWCNTs), double-walled carbon nanotubes (DWCNTs) and triple-walled CNTs (TWCNTs). The analysis of geometrical structure shows that the carbon atoms in the novel carbon tubular clusters (CTCs) and the corresponding carbon nanotubes (CNTs) are both sp2 hybridized. The thermal properties exhibit the high stability of these new CTCs. The results of energy band and density of states (DOS) calculations indicate that the electronic properties of the CNTs are independent of their diameter, number of walls and chirality, and exhibit obvious metallic properties.

  14. Java implementation of Class Association Rule algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tamura, Makio

    2007-08-30

    Java implementation of three Class Association Rule mining algorithms: NETCAR, CARapriori, and clustering based rule mining. The NETCAR algorithm is a novel algorithm developed by Makio Tamura. The algorithm is discussed in a paper, UCRL-JRNL-232466-DRAFT, and would be published in a peer review scientific journal. The software is used to extract combinations of genes relevant to a phenotype from a phylogenetic profile and a phenotype profile. The phylogenetic profile is represented by a binary matrix and the phenotype profile is represented by a binary vector. The present application of this software will be in genome analysis; however, it could be applied more generally.

  15. Hierarchical Adaptive Means (HAM) clustering for hardware-efficient, unsupervised and real-time spike sorting.

    PubMed

    Paraskevopoulou, Sivylla E; Wu, Di; Eftekhar, Amir; Constandinou, Timothy G

    2014-09-30

    This work presents a novel unsupervised algorithm for real-time adaptive clustering of neural spike data (spike sorting). The proposed Hierarchical Adaptive Means (HAM) clustering method combines centroid-based clustering with hierarchical cluster connectivity to classify incoming spikes using groups of clusters. It is described how the proposed method can adaptively track the incoming spike data without requiring any past history, iteration or training and autonomously determines the number of spike classes. Its performance (classification accuracy) has been tested using multiple datasets (both simulated and recorded) achieving a near-identical accuracy compared to k-means (using 10-iterations and provided with the number of spike classes). Also, its robustness in applying to different feature extraction methods has been demonstrated by achieving classification accuracies above 80% across multiple datasets. Last but crucially, its low complexity, that has been quantified through both memory and computation requirements makes this method hugely attractive for future hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Supervised Clustering Based on DPClusO: Prediction of Plant-Disease Relations Using Jamu Formulas of KNApSAcK Database

    PubMed Central

    Husnawati, Husnawati; Afendi, Farit Mochamad; Darusman, Latifah K.; Altaf-Ul-Amin, Md.; Sato, Tetsuo; Ono, Naoaki; Sugiura, Tadao; Kanaya, Shigehiko

    2014-01-01

    Indonesia has the largest medicinal plant species in the world and these plants are used as Jamu medicines. Jamu medicines are popular traditional medicines from Indonesia and we need to systemize the formulation of Jamu and develop basic scientific principles of Jamu to meet the requirement of Indonesian Healthcare System. We propose a new approach to predict the relation between plant and disease using network analysis and supervised clustering. At the preliminary step, we assigned 3138 Jamu formulas to 116 diseases of International Classification of Diseases (ver. 10) which belong to 18 classes of disease from National Center for Biotechnology Information. The correlation measures between Jamu pairs were determined based on their ingredient similarity. Networks are constructed and analyzed by selecting highly correlated Jamu pairs. Clusters were then generated by using the network clustering algorithm DPClusO. By using matching score of a cluster, the dominant disease and high frequency plant associated to the cluster are determined. The plant to disease relations predicted by our method were evaluated in the context of previously published results and were found to produce around 90% successful predictions. PMID:24804251

  17. The Formation of Cluster Populations Through Direct Galaxy Collisions

    NASA Astrophysics Data System (ADS)

    Peterson, Bradley W.; Smith, Beverly J.; Struck, Curtis

    2016-01-01

    Much progress has been made on the question of how globular clusters form. In particular, the study of extragalactic populations of young, high-mass clusters ("super star clusters") has revealed a class of objects that can evolve into globular clusters. The process by which these clusters form, and how many survive long enough to become globular clusters, is not wholly understood. Here, we use new data on the colliding galaxy system Arp 261 to investigate the possibility that young, massive clusters form in greater numbers during direct galaxy collisions, compared to less direct tidal collisions.

  18. Time-Series Analysis: Assessing the Effects of Multiple Educational Interventions in a Small-Enrollment Course

    NASA Astrophysics Data System (ADS)

    Warren, Aaron R.

    2009-11-01

    Time-series designs are an alternative to pretest-posttest methods that are able to identify and measure the impacts of multiple educational interventions, even for small student populations. Here, we use an instrument employing standard multiple-choice conceptual questions to collect data from students at regular intervals. The questions are modified by asking students to distribute 100 Confidence Points among the options in order to indicate the perceived likelihood of each answer option being the correct one. Tracking the class-averaged ratings for each option produces a set of time-series. ARIMA (autoregressive integrated moving average) analysis is then used to test for, and measure, changes in each series. In particular, it is possible to discern which educational interventions produce significant changes in class performance. Cluster analysis can also identify groups of students whose ratings evolve in similar ways. A brief overview of our methods and an example are presented.

  19. Nottingham Prognostic Index Plus (NPI+): a modern clinical decision making tool in breast cancer.

    PubMed

    Rakha, E A; Soria, D; Green, A R; Lemetre, C; Powe, D G; Nolan, C C; Garibaldi, J M; Ball, G; Ellis, I O

    2014-04-02

    Current management of breast cancer (BC) relies on risk stratification based on well-defined clinicopathologic factors. Global gene expression profiling studies have demonstrated that BC comprises distinct molecular classes with clinical relevance. In this study, we hypothesised that molecular features of BC are a key driver of tumour behaviour and when coupled with a novel and bespoke application of established clinicopathologic prognostic variables can predict both clinical outcome and relevant therapeutic options more accurately than existing methods. In the current study, a comprehensive panel of biomarkers with relevance to BC was applied to a large and well-characterised series of BC, using immunohistochemistry and different multivariate clustering techniques, to identify the key molecular classes. Subsequently, each class was further stratified using a set of well-defined prognostic clinicopathologic variables. These variables were combined in formulae to prognostically stratify different molecular classes, collectively known as the Nottingham Prognostic Index Plus (NPI+). The NPI+ was then used to predict outcome in the different molecular classes. Seven core molecular classes were identified using a selective panel of 10 biomarkers. Incorporation of clinicopathologic variables in a second-stage analysis resulted in identification of distinct prognostic groups within each molecular class (NPI+). Outcome analysis showed that using the bespoke NPI formulae for each biological BC class provides improved patient outcome stratification superior to the traditional NPI. This study provides proof-of-principle evidence for the use of NPI+ in supporting improved individualised clinical decision making.

  20. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

    PubMed

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. 
    As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world. Copyright © 2015 Hadjithomas et al.

  1. Clustering determines the dynamics of complex contagions in multiplex networks

    NASA Astrophysics Data System (ADS)

    Zhuang, Yong; Arenas, Alex; Yaǧan, Osman

    2017-01-01

    We present the mathematical analysis of generalized complex contagions in a class of clustered multiplex networks. The model is intended to understand spread of influence, or any other spreading process implying a threshold dynamics, in setups of interconnected networks with significant clustering. The contagion is assumed to be general enough to account for a content-dependent linear threshold model, where each link type has a different weight (for spreading influence) that may depend on the content (e.g., product, rumor, political view) that is being spread. Using the generating functions formalism, we determine the conditions, probability, and expected size of the emergent global cascades. This analysis provides a generalization of previous approaches and is especially useful in problems related to spreading and percolation. The results present nontrivial dependencies between the clustering coefficient of the networks and their average degree. In particular, several phase transitions are shown to occur depending on these descriptors. Generally speaking, our findings reveal that increasing clustering decreases the probability of having global cascades and their size; however, this tendency changes with the average degree. Above a certain average degree, clustering instead favors the probability and size of the contagion. By comparing the dynamics of complex contagions over multiplex networks and their monoplex projections, we demonstrate that ignoring link types and aggregating network layers may lead to inaccurate conclusions about contagion dynamics, particularly when the correlation of degrees between layers is high.
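
    A content-dependent linear threshold rule of the kind described here can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's analytical method (which uses generating functions): the activation rule assumed below is that an inactive node becomes active once the weighted share of its active neighbors, with one weight per link type, reaches a threshold. The function name, the edge format, and the example weights are hypothetical.

```python
def run_threshold_cascade(edges, weights, threshold, seeds):
    """Spread activation until no further node crosses the threshold.

    edges:     iterable of (u, v, link_type) undirected links
    weights:   dict mapping link_type to its influence weight for this content
    threshold: required weighted fraction of active neighbors
    seeds:     initially active nodes
    """
    neighbors = {}
    for u, v, link_type in edges:
        neighbors.setdefault(u, []).append((v, link_type))
        neighbors.setdefault(v, []).append((u, link_type))

    active = set(seeds)
    changed = True
    while changed:  # activation is permanent, so this loop must terminate
        changed = False
        for node, nbrs in neighbors.items():
            if node in active:
                continue
            total = sum(weights[t] for _, t in nbrs)
            influence = sum(weights[t] for n, t in nbrs if n in active)
            if total > 0 and influence / total >= threshold:
                active.add(node)
                changed = True
    return active

# Hypothetical two-layer network: a "friend" triangle plus a chain of "work" links.
example_edges = [
    (0, 1, "friend"), (0, 2, "friend"), (1, 2, "friend"),
    (2, 3, "work"), (3, 4, "work"),
]
```

    With seeds {0, 1} and a threshold of 0.6 on this example, node 2 activates when work links weigh 0.5 (weighted share 2/2.5) but not when they weigh 2.0 (weighted share 2/4), which shows how the same topology can carry or block a cascade depending on the content-specific weights.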

  2. Spatial distribution of 12 class B notifiable infectious diseases in China: A retrospective study

    PubMed Central

    Zhu, Bin; Fu, Yang; Liu, Jinlin

    2018-01-01

    Background: China is the largest developing country with a relatively developed public health system. To further prevent and eliminate the spread of infectious diseases, China has listed 39 notifiable infectious diseases characterized by wide prevalence or great harm, and classified them into classes A, B, and C, with severity decreasing across classes. Class A diseases have been almost eradicated in China, thus making class B diseases a priority in infectious disease prevention and control. In this retrospective study, we analyze the spatial distribution patterns of 12 class B notifiable infectious diseases that remain active all over China. Methods: Global and local Moran’s I and corresponding graphic tools are adopted to explore and visualize the global and local spatial distribution of the incidence of the selected epidemics, respectively. Inter-correlations of clustering patterns of each pair of diseases and a cumulative summary of the high/low cluster frequency of the provincial units are also provided by means of figures and maps. Results: Of the 12 most commonly notifiable class B infectious diseases, viral hepatitis and tuberculosis show high incidence rates and account for more than half of the reported cases. Almost all the diseases, except pertussis, exhibit positive spatial autocorrelation at the provincial level. All diseases feature varying spatial concentrations. Nevertheless, associations exist between spatial distribution patterns, with some provincial units displaying the same type of cluster features for two or more infectious diseases. Overall, high–low (unit with high incidence surrounded by units with low incidence, the same below) and high–high spatial cluster areas tend to be prevalent in the provincial units located in western and southwest China, whereas low–low and low–high spatial cluster areas abound in provincial units in north and east China. Conclusion: Despite the various distribution patterns of 12 class B notifiable infectious diseases, certain similarities between their spatial distributions are present. Substantial evidence is available to support disease-specific, location-specific, and disease-combined interventions. Regarding provinces that show high–high/high–low patterns of multiple diseases, comprehensive interventions targeting different diseases should be established. As to the adjacent provincial units revealing similar patterns, coordinated actions need to be taken across borders. PMID:29621351
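
    The global Moran's I statistic and the four local cluster types named in this abstract can be sketched as below. This is an illustrative computation only: the toy incidence values and the chain-shaped adjacency matrix are made up, the function names are hypothetical, and the local step only labels each unit's quadrant (own deviation versus mean neighbor deviation) without the permutation-based significance testing that a full local Moran's I analysis would include.

```python
import numpy as np

def global_morans_i(values, W):
    """Global Moran's I: (n / sum(W)) * (z' W z) / (z' z), z = deviations from the mean."""
    x = np.asarray(values, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()
    return len(x) / W.sum() * (z @ W @ z) / (z @ z)

def local_cluster_types(values, W):
    """Label each unit as high-high, high-low, low-high or low-low.

    The first word describes the unit itself, the second its neighbors
    (mean deviation of the neighbors from the overall mean).
    """
    x = np.asarray(values, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()
    lag = (W @ z) / W.sum(axis=1)
    own = np.where(z > 0, "high", "low")
    nbr = np.where(lag > 0, "high", "low")
    return [f"{a}-{b}" for a, b in zip(own, nbr)]

# Toy example: four units in a row, each adjacent to its immediate neighbors.
chain_W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
```

    A positive global value indicates that similar incidence levels tend to sit next to each other, while the local labels show where a unit agrees with its surroundings (high-high, low-low) or stands out from them (high-low, low-high).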

  3. A Search for Pulsation in Young Brown Dwarfs and Very Low Mass Stars

    NASA Astrophysics Data System (ADS)

    Cody, Ann Marie

    2012-05-01

    In 2005, Palla and Baraffe proposed that brown dwarfs and very low mass stars (<0.1 solar masses) may be unstable to radial oscillations during the pre-main-sequence deuterium burning phase. With associated oscillation periods of 1–4 hours, this potentially new class of pulsation offers unprecedented opportunities to probe the interiors and evolution of low-mass objects in the 1–15 million year age range. Furthermore, several previous reports of short-period variability have suggested that deuterium-burning pulsation is in fact at work in young clusters. For my dissertation, I developed a photometric monitoring campaign to search for low-amplitude periodic variability in young brown dwarfs and very low mass stars using meter-class telescopes from both the ground and space. The resulting high-precision, high-cadence time-series photometry targeted four young clusters and achieved sensitivity to periodic oscillations with photometric amplitudes down to several millimagnitudes. This unprecedented variability census probed timescales ranging from minutes to weeks in a sample of 200 young, low-mass cluster members of IC 348, Sigma Orionis, Chamaeleon I, and Upper Scorpius. While I find a dearth of photometric periods under 10 hours, the campaign's high time resolution and precision have enabled detailed study of diverse light curve behavior in the clusters: rotational spot modulation, accretion signatures, and occultations by surrounding disk material. Analysis of the data has led to the establishment of a lower limit for the timescale of periodic photometric variability in young low-mass and substellar objects, an extension of the rotation period distribution to the brown dwarf regime, as well as insights into the connection between variability and circumstellar disks in the Sigma Orionis and Chamaeleon I clusters.

  4. Hydrometeor classification through statistical clustering of polarimetric radar measurements: a semi-supervised approach

    NASA Astrophysics Data System (ADS)

    Besic, Nikola; Ventura, Jordi Figueras i.; Grazioli, Jacopo; Gabella, Marco; Germann, Urs; Berne, Alexis

    2016-09-01

    Polarimetric radar-based hydrometeor classification is the procedure of identifying different types of hydrometeors by exploiting polarimetric radar observations. The main drawback of the existing supervised classification methods, mostly based on fuzzy logic, is a significant dependency on a presumed electromagnetic behaviour of different hydrometeor types. Namely, the results of the classification largely rely upon the quality of scattering simulations. When it comes to the unsupervised approach, it lacks the constraints related to the hydrometeor microphysics. The idea of the proposed method is to compensate for these drawbacks by combining the two approaches in a way that microphysical hypotheses can, to a degree, adjust the content of the classes obtained statistically from the observations. This is done by means of an iterative approach, performed offline, which, in a statistical framework, examines clustered representative polarimetric observations by comparing them to the presumed polarimetric properties of each hydrometeor class. Aside from comparing, a routine alters the content of clusters by encouraging further statistical clustering in case of non-identification. By merging all identified clusters, the multi-dimensional polarimetric signatures of various hydrometeor types are obtained for each of the studied representative datasets, i.e. for each radar system of interest. These are depicted by sets of centroids which are then employed in operational labelling of different hydrometeors. The method has been applied on three C-band datasets, each acquired by different operational radar from the MeteoSwiss Rad4Alp network, as well as on two X-band datasets acquired by two research mobile radars. The results are discussed through a comparative analysis which includes a corresponding supervised and unsupervised approach, emphasising the operational potential of the proposed method.

  5. Improved Test Planning and Analysis Through the Use of Advanced Statistical Methods

    NASA Technical Reports Server (NTRS)

    Green, Lawrence L.; Maxwell, Katherine A.; Glass, David E.; Vaughn, Wallace L.; Barger, Weston; Cook, Mylan

    2016-01-01

    The goal of this work is, through computational simulations, to provide statistically-based evidence to convince the testing community that a distributed testing approach is superior to a clustered testing approach for most situations. For clustered testing, numerous, repeated test points are acquired at a limited number of test conditions. For distributed testing, only one or a few test points are requested at many different conditions. The statistical techniques of Analysis of Variance (ANOVA), Design of Experiments (DOE) and Response Surface Methods (RSM) are applied to enable distributed test planning, data analysis and test augmentation. The D-Optimal class of DOE is used to plan an optimally efficient single- and multi-factor test. The resulting simulated test data are analyzed via ANOVA and a parametric model is constructed using RSM. Finally, ANOVA can be used to plan a second round of testing to augment the existing data set with new data points. The use of these techniques is demonstrated through several illustrative examples. To date, many thousands of comparisons have been performed and the results strongly support the conclusion that the distributed testing approach outperforms the clustered testing approach.

  6. The enigma of the open cluster M29 (NGC 6913) solved

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Straižys, V.; Milašius, K.; Černis, K.

    2014-11-01

    Determining the distance to the open cluster M29 (NGC 6913) has proven difficult, with distances determined by various authors differing by a factor of two or more. To solve this problem, we have initiated a new photometric investigation of the cluster in the Vilnius seven-color photometric system, supplementing it with available data in the BV and JHK_s photometric systems and spectra of the nine brightest stars of spectral classes O and B. Photometric spectral classes and luminosities of 260 stars in a 15' × 15' area down to V = 19 mag are used to investigate the interstellar extinction run with distance and to estimate the distance of the Great Cygnus Rift, ∼800 pc. The interstellar reddening law in the optical and near-infrared regions is found to be close to normal, with the ratio of extinction to color excess R_BV = 2.87. The extinction A_V of cluster members is between 2.5 and 3.8 mag, with a mean value of 2.97 mag, or E_B–V = 1.03. The average distance of eight stars of spectral types O9-B2 is 1.54 ± 0.15 kpc. Two stars from the seven brightest stars are field stars: HDE 229238 is a background B0.5 supergiant and HD 194378 is a foreground F star. In the intrinsic color-magnitude diagram, seven fainter stars of spectral classes B3-B8 are identified as possible members of the cluster. The 15 selected members of the cluster of spectral classes O9-B8 plotted on the log L/L_☉ versus log T_eff diagram, together with the isochrones from the Padova database, give the age of the cluster as 5 ± 1 Myr.
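
    The quantities quoted in this abstract are tied together by two standard relations: visual extinction equals the extinction-to-color-excess ratio times the color excess, and distance follows from the extinction-corrected distance modulus, V − M_V = 5 log10(d) − 5 + A_V. The short sketch below only illustrates that arithmetic; the function names are hypothetical and no stellar magnitudes from the paper are used.

```python
def extinction_from_reddening(color_excess, ratio):
    """Visual extinction in magnitudes: A_V = R * E(B-V)."""
    return ratio * color_excess

def distance_pc(apparent_mag, absolute_mag, extinction_av):
    """Distance in parsecs from the extinction-corrected distance modulus.

    Solves V - M_V = 5 * log10(d) - 5 + A_V for d.
    """
    return 10 ** ((apparent_mag - absolute_mag - extinction_av + 5) / 5)

# Using the ratio (2.87) and mean color excess (1.03) quoted in the abstract:
mean_av = extinction_from_reddening(1.03, 2.87)  # about 2.96 mag
```

    The product comes out at about 2.96 mag, close to the quoted mean extinction of 2.97 mag. Because extinction enters the exponent, ignoring about 3 mag of extinction would overestimate a star's distance by roughly a factor of four, which helps explain why earlier distance estimates for this heavily reddened cluster disagreed so widely.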

  7. Large-Scale Phylogenetic Classification of Fungal Chitin Synthases and Identification of a Putative Cell-Wall Metabolism Gene Cluster in Aspergillus Genomes

    PubMed Central

    Pacheco-Arjona, Jose Ramon; Ramirez-Prado, Jorge Humberto

    2014-01-01

    The cell wall is a protective and versatile structure distributed in all fungi. The component responsible for its rigidity is chitin, a product of chitin synthase (Chsp) enzymes. There are seven classes of chitin synthase genes (CHS) and the number and type encoded in fungal genomes vary considerably from one species to another. Previous Chsp sequence analyses focused on their study as individual units, regardless of genomic context. The identification of blocks of conserved genes between genomes can provide important clues about the interactions and localization of chitin synthases. In the present study, we carried out an in silico search of all putative Chsp encoded in 54 full fungal genomes, encompassing 21 orders from five phyla. Phylogenetic studies of these Chsp were able to confidently classify 347 out of the 369 Chsp identified (94%). Patterns in the distribution of Chsp related to taxonomy were identified, the most prominent being related to the type of fungal growth. More importantly, a synteny analysis for genomic blocks centered on class IV Chsp (the most abundant and widely distributed Chsp class) identified a putative cell wall metabolism gene cluster in members of the genus Aspergillus, the first such association reported for any fungal genome. PMID:25148134

  8. Automated thematic mapping and change detection of ERTS-A images. [farmlands, cities, and mountain identification in Utah, Washington, Arizona, and California

    NASA Technical Reports Server (NTRS)

    Gramenopoulos, N. (Principal Investigator)

    1974-01-01

    The author has identified the following significant results. A diffraction pattern analysis of MSS images led to the development of spatial signatures for farm land, urban areas and mountains. Four spatial features are employed to describe the spatial characteristics of image cells in the digital data. Three spectral features are combined with the spatial features to form a seven dimensional vector describing each cell. Then, the classification of the feature vectors is accomplished by using the maximum likelihood criterion. It was determined that the recognition accuracy with the maximum likelihood criterion depends on the statistics of the feature vectors. It was also determined that for a given geographic area the statistics of the classes remain invariable for a period of a month, but vary substantially between seasons. Three ERTS-1 images from the Phoenix, Arizona area were processed, and recognition rates between 85% and 100% were obtained for the terrain classes of desert, farms, mountains, and urban areas. To eliminate the need for training data, a new clustering algorithm has been developed. Seven ERTS-1 images from four test sites have been processed through the clustering algorithm, and high recognition rates have been achieved for all terrain classes.
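
    Classification of feature vectors by the maximum likelihood criterion can be sketched as follows. The abstract does not state the form of the class statistics, so this example assumes Gaussian class-conditional densities with equal priors, which is a common choice for this kind of classifier; the function names and the synthetic seven-dimensional data used to exercise it are hypothetical.

```python
import numpy as np

def fit_class_statistics(X, y):
    """Estimate a mean vector and covariance matrix per class from training cells."""
    stats = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        # A small ridge keeps the covariance invertible (an assumption of this sketch).
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        stats[label] = (mean, np.linalg.inv(cov), np.linalg.slogdet(cov)[1])
    return stats

def classify(X, stats):
    """Assign each row of X to the class with the highest Gaussian log-likelihood."""
    labels = list(stats)
    scores = []
    for label in labels:
        mean, inv_cov, logdet = stats[label]
        diff = X - mean
        maha = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)  # squared Mahalanobis distance
        scores.append(-0.5 * (maha + logdet))  # constant terms dropped
    return np.array(labels)[np.argmax(scores, axis=0)]
```

    Since the decision depends entirely on the estimated means and covariances, statistics that drift between seasons would move the decision boundaries, which is consistent with the reported dependence of recognition accuracy on the feature-vector statistics.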

  9. antiSMASH 2.0--a versatile platform for genome mining of secondary metabolite producers.

    PubMed

    Blin, Kai; Medema, Marnix H; Kazempour, Daniyal; Fischbach, Michael A; Breitling, Rainer; Takano, Eriko; Weber, Tilmann

    2013-07-01

    Microbial secondary metabolites are a potent source of antibiotics and other pharmaceuticals. Genome mining of their biosynthetic gene clusters has become a key method to accelerate their identification and characterization. In 2011, we developed antiSMASH, a web-based analysis platform that automates this process. Here, we present the highly improved antiSMASH 2.0 release, available at http://antismash.secondarymetabolites.org/. For the new version, antiSMASH was entirely re-designed using a plug-and-play concept that allows easy integration of novel predictor or output modules. antiSMASH 2.0 now supports input of multiple related sequences simultaneously (multi-FASTA/GenBank/EMBL), which allows the analysis of draft genomes comprising multiple contigs. Moreover, direct analysis of protein sequences is now possible. antiSMASH 2.0 has also been equipped with the capacity to detect additional classes of secondary metabolites, including oligosaccharide antibiotics, phenazines, thiopeptides, homo-serine lactones, phosphonates and furans. The algorithm for predicting the core structure of the cluster end product is now also covering lantipeptides, in addition to polyketides and non-ribosomal peptides. The antiSMASH ClusterBlast functionality has been extended to identify sub-clusters involved in the biosynthesis of specific chemical building blocks. The new features currently make antiSMASH 2.0 the most comprehensive resource for identifying and analyzing novel secondary metabolite biosynthetic pathways in microorganisms.

  10. Clustering and group selection of multiple criteria alternatives with application to space-based networks.

    PubMed

    Malakooti, Behnam; Yang, Ziyong

    2004-02-01

    In many real-world problems, the ranges of consequences of different alternatives differ considerably. In addition, sometimes, selection of a group of alternatives (instead of only one best alternative) is necessary. Traditional decision making approaches treat the set of alternatives with the same method of analysis and selection. In this paper, we propose clustering alternatives into different groups so that different methods of analysis, selection, and implementation for each group can be applied. As an example, consider the selection of a group of functions (or tasks) to be processed by a group of processors. The set of tasks can be grouped according to their similar criteria, and hence, each cluster of tasks can be processed by one processor. The selection of the best alternative for each clustered group can be performed using existing methods; however, the process of selecting groups is different than the process of selecting alternatives within a group. We develop theories and procedures for clustering discrete multiple criteria alternatives. We also demonstrate how the set of alternatives is clustered into mutually exclusive groups based on 1) similar features among alternatives; 2) ideal (or most representative) alternatives given by the decision maker; and 3) other preferential information of the decision maker. The clustering of multiple criteria alternatives also has the following advantages. 1) It decreases the set of alternatives to be considered by the decision maker (for example, different decision makers are assigned to different groups of alternatives). 2) It decreases the number of criteria. 3) It may provide a different approach for analyzing multiple decision makers problems. Each decision maker may cluster alternatives differently, and hence, clustering of alternatives may provide a basis for negotiation. The developed approach is applicable for solving a class of telecommunication networks problems where a set of objects (such as routers, processors, or intelligent autonomous vehicles) are to be clustered into similar groups. Objects are clustered based on several criteria and the decision maker's preferences.

  11. Sugar Lego: gene composition of bacterial carbohydrate metabolism genomic loci.

    PubMed

    Kaznadzey, Anna; Shelyakin, Pavel; Gelfand, Mikhail S

    2017-11-25

    Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to the same metabolic pathway are often co-localized in the chromosome, but it is not a strict rule. Gene co-localization is linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to carbohydrate metabolism. We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized in bacterial genomes with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class ranges from zero to eight, demonstrating varying preferences of respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. Moreover, in most cases such co-localization does not originate from local duplication events.
Overall, we describe a complex web formed by evolutionary relationships of bacterial carbohydrate metabolism genes, manifested as co-localization patterns. This article was reviewed by Daria V. Dibrova (A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia), nominated by Armen Mulkidjanian (University of Osnabrück, Germany), Igor Rogozin (NCBI, NLM, NIH, USA) and Yuri Wolf (NCBI, NLM, NIH, USA).

  12. Cascades on a class of clustered random networks

    NASA Astrophysics Data System (ADS)

    Hackett, Adam; Melnik, Sergey; Gleeson, James P.

    2011-05-01

    We present an analytical approach to determining the expected cascade size in a broad range of dynamical models on the class of random networks with arbitrary degree distribution and nonzero clustering introduced previously in [M. E. J. Newman, Phys. Rev. Lett. 103, 058701 (2009)]. A condition for the existence of global cascades is derived as well as a general criterion that determines whether increasing the level of clustering will increase, or decrease, the expected cascade size. Applications, examples of which are provided, include site percolation, bond percolation, and Watts’ threshold model; in all cases analytical results give excellent agreement with numerical simulations.

  13. Hydrogenases and H(+)-reduction in primary energy conservation.

    PubMed

    Vignais, Paulette M

    2008-01-01

    Hydrogenases are metalloenzymes subdivided into two classes that contain iron-sulfur clusters and catalyze the reversible oxidation of hydrogen gas (H(2) ⇌ 2H(+) + 2e(-)). Two metal atoms are present at their active center: either a Ni and an Fe atom in the [NiFe]hydrogenases, or two Fe atoms in the [FeFe]hydrogenases. They are phylogenetically distinct classes of proteins. The catalytic core of [NiFe]hydrogenases is a heterodimeric protein associated with additional subunits in many of these enzymes. The catalytic core of [FeFe]hydrogenases is a domain of about 350 residues that accommodates the active site (H cluster). Many [FeFe]hydrogenases are monomeric but possess additional domains that contain redox centers, mostly Fe-S clusters. A third class of hydrogenase, characterized by a specific iron-containing cofactor and by the absence of Fe-S clusters, is found in some methanogenic archaea; this Hmd hydrogenase has catalytic properties different from those of [NiFe]- and [FeFe]hydrogenases. The [NiFe]hydrogenases can be subdivided into four subgroups: (1) the H(2) uptake [NiFe]hydrogenases (group 1); (2) the cyanobacterial uptake hydrogenases and the cytoplasmic H(2) sensors (group 2); (3) the bidirectional cytoplasmic hydrogenases able to bind soluble cofactors (group 3); and (4) the membrane-associated, energy-converting, H(2) evolving hydrogenases (group 4). Unlike the [NiFe]hydrogenases, the [FeFe]hydrogenases form a homogeneous group and are primarily involved in H(2) evolution.
This review recapitulates the classification of hydrogenases based on phylogenetic analysis and the correlation with hydrogenase function of the different phylogenetic groupings, discusses the possible role of the [FeFe]hydrogenases in the genesis of the eukaryotic cell, and emphasizes the structural and functional relationships of hydrogenase subunits with those of complex I of the respiratory electron transport chain.

  14. Sexuality Generates Diversity in the Aflatoxin Gene Cluster: Evidence on a Global Scale

    PubMed Central

    Moore, Geromy G.; Elliott, Jacalyn L.; Singh, Rakhi; Horn, Bruce W.; Dorner, Joe W.; Stone, Eric A.; Chulze, Sofia N.; Barros, German G.; Naik, Manjunath K.; Wright, Graeme C.; Hell, Kerstin; Carbone, Ignazio

    2013-01-01

    Aflatoxins are produced by Aspergillus flavus and A. parasiticus in oil-rich seed and grain crops and are a serious problem in agriculture, with aflatoxin B1 being the most carcinogenic natural compound known. Sexual reproduction in these species occurs between individuals belonging to different vegetative compatibility groups (VCGs). We examined natural genetic variation in 758 isolates of A. flavus, A. parasiticus and A. minisclerotigenes sampled from single peanut fields in the United States (Georgia), Africa (Benin), Argentina (Córdoba), Australia (Queensland) and India (Karnataka). Analysis of DNA sequence variation across multiple intergenic regions in the aflatoxin gene clusters of A. flavus, A. parasiticus and A. minisclerotigenes revealed significant linkage disequilibrium (LD) organized into distinct blocks that are conserved across different localities, suggesting that genetic recombination is nonrandom and a global occurrence. To assess the contributions of asexual and sexual reproduction to fixation and maintenance of toxin chemotype diversity in populations from each locality/species, we tested the null hypothesis of an equal number of MAT1-1 and MAT1-2 mating-type individuals, which is indicative of a sexually recombining population. All samples were clone-corrected using multi-locus sequence typing which associates closely with VCG. For both A. flavus and A. parasiticus, when the proportions of MAT1-1 and MAT1-2 were significantly different, there was more extensive LD in the aflatoxin cluster and populations were fixed for specific toxin chemotype classes, either the non-aflatoxigenic class in A. flavus or the B1-dominant and G1-dominant classes in A. parasiticus. A mating type ratio close to 1∶1 in A. flavus, A. parasiticus and A. minisclerotigenes was associated with higher recombination rates in the aflatoxin cluster and less pronounced chemotype differences in populations. 
This work shows that the reproductive nature of the population (more sexual versus more asexual) is predictive of aflatoxin chemotype diversity in these agriculturally important fungi. PMID:24009506
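
The abstract above tests the null hypothesis of equal numbers of MAT1-1 and MAT1-2 individuals. It does not say which test was used, so the following is only a minimal sketch of one common choice, an exact two-sided binomial test against a 1:1 ratio. The function name and the counts are hypothetical and are not taken from the study.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k (assumed test choice)."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    total = sum(pr for pr in probs if pr <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical clone-corrected samples of 20 isolates each.
balanced_p = binomial_two_sided_p(10, 20)  # 10 MAT1-1 vs 10 MAT1-2
skewed_p = binomial_two_sided_p(18, 20)    # 18 MAT1-1 vs 2 MAT1-2
```

A large p-value is compatible with the 1:1 ratio expected of a sexually recombining population, while a small one suggests a skewed, more clonal population.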

  15. REMOVING COOL CORES AND CENTRAL METALLICITY PEAKS IN GALAXY CLUSTERS WITH POWERFUL ACTIVE GALACTIC NUCLEUS OUTBURSTS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guo Fulai; Mathews, William G., E-mail: fulai@ucolick.or

    2010-07-10

    Recent X-ray observations of galaxy clusters suggest that cluster populations are bimodally distributed according to central gas entropy and are separated into two distinct classes: cool core (CC) and non-cool core (NCC) clusters. While it is widely accepted that active galactic nucleus (AGN) feedback plays a key role in offsetting radiative losses and maintaining many clusters in the CC state, the origin of NCC clusters is much less clear. At the same time, a handful of extremely powerful AGN outbursts have recently been detected in clusters, with a total energy ~10^61-10^62 erg. Using two-dimensional hydrodynamic simulations, we show that if a large fraction of this energy is deposited near the centers of CC clusters, which is likely common due to dense cores, these AGN outbursts can completely remove CCs, transforming them to NCC clusters. Our model also has interesting implications for cluster abundance profiles, which usually show a central peak in CC systems. Our calculations indicate that during the CC to NCC transformation, AGN outbursts efficiently mix metals in cluster central regions and may even remove central abundance peaks if they are not broad enough. For CC clusters with broad central abundance peaks, AGN outbursts decrease peak abundances, but cannot effectively destroy the peaks. Our model may simultaneously explain the contradictory (possibly bimodal) results of abundance profiles in NCC clusters, some of which are nearly flat, while others have strong central peaks similar to those in CC clusters. A statistical analysis of the sizes of central abundance peaks and their redshift evolution may offer interesting insights into the origin of both types of NCC clusters and the evolution history of thermodynamics and AGN activity in clusters.

  16. Clustering eating habits: frequent consumption of different dietary patterns among the Italian general population in the association with obesity, physical activity, sociocultural characteristics and psychological factors.

    PubMed

    Denoth, Francesca; Scalese, Marco; Siciliano, Valeria; Di Renzo, Laura; De Lorenzo, Antonino; Molinaro, Sabrina

    2016-06-01

    (a) To identify clusters of eating patterns among the Italian population aged 15-64 years, focusing on consumption of typical Mediterranean diet (Med-diet) items; (b) to examine the distribution of eating habits, as identified clusters, among age classes and genders; (c) to evaluate the impact of belonging to a specific eating cluster, level of physical activity (PA), and sociocultural and psychological factors as elements determining weight abnormalities. Data for this cross-sectional study were collected using self-reporting questionnaires administered to a sample of 33,127 subjects participating in the Italian population survey on alcohol and other drugs (IPSAD®2011). The cluster analysis was performed on a subsample (n = 5278 subjects) which provided information on eating habits, and adapted to identify categories of eating patterns. Stepwise multinomial regression analysis was performed to evaluate the associations between weight categories and eating clusters, adjusted for the following background variables: PA levels, sociocultural and psychological factors. Three clusters were identified: "Mediterranean-like", "Western-like" and "low fruit/vegetables". Frequent consumption of Med-diet patterns was more common among females and the elderly. The relationship between overweight/obesity and male gender, educational level, PA, depression and eating disorders (p < 0.05) was confirmed. Belonging to a cluster other than "Mediterranean-like" was significantly associated with obesity. The low consumption of Med-diet patterns among youth, and the frequent association of sociocultural, psychological issues and inappropriate lifestyle with overweight/obesity, highlight the need for an interdisciplinary approach including market policies, to promote a wider awareness of the Mediterranean eating habit benefits in combination with an appropriate lifestyle.

  17. The complete mitochondrial genome of Pallisentis celatus (Acanthocephala) with phylogenetic analysis of acanthocephalans and rotifers.

    PubMed

    Pan, Ting Shuang; Nie, Pin

    2013-07-01

    Acanthocephalans are a small group of obligate endoparasites. They and rotifers have recently been placed in a group called Syndermata. However, phylogenetic relationships within classes of acanthocephalans, and between them and rotifers, have not been well resolved, possibly due to the lack of molecular data suitable for such analysis. In this study, the mitochondrial (mt) genome was sequenced from Pallisentis celatus (Van Cleave, 1928), an acanthocephalan in the class Eoacanthocephala, an intestinal parasite of rice-field eel, Monopterus albus (Zuiew, 1793), in China. The complete mt genome sequence of P. celatus is 13 855 bp long, containing 36 genes including 12 protein-coding genes, 22 transfer RNAs (tRNAs) and 2 ribosomal RNAs (rRNAs) as reported for other acanthocephalan species. All genes are encoded on the same strand and in the same direction. Phylogenetic analysis indicated that acanthocephalans are closely related to a clade containing bdelloids, which in turn groups with the clade containing monogononts. The class Eoacanthocephala, containing P. celatus and Paratenuisentis ambiguus (Van Cleave, 1921), was closely related to the Palaeacanthocephala. This indicates that acanthocephalans may simply cluster among groups of rotifers. However, resolving the phylogenetic relationships among all classes of acanthocephalans and between them and rotifers may require further sampling and more molecular data.

  18. Clustering analysis of line indices for LAMOST spectra with AstroStat

    NASA Astrophysics Data System (ADS)

    Chen, Shu-Xin; Sun, Wei-Min; Yan, Qi

    2018-06-01

    The application of data mining in astronomical surveys, such as the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) survey, provides an effective approach to automatically analyze a large amount of complex survey data. Unsupervised clustering could help astronomers find the associations and outliers in a big data set. In this paper, we employ the k-means method to perform clustering for the line index of LAMOST spectra with the powerful software AstroStat. Implementing the line index approach for analyzing astronomical spectra is an effective way to extract spectral features for low resolution spectra, which can represent the main spectral characteristics of stars. A total of 144 340 line indices for A type stars is analyzed through calculating their intra and inter distances between pairs of stars. For intra distance, we use the definition of Mahalanobis distance to explore the degree of clustering for each class, while for outlier detection, we define a local outlier factor for each spectrum. AstroStat furnishes a set of visualization tools for illustrating the analysis results. Checking the spectra detected as outliers, we find that most of them are problematic data and only a few correspond to rare astronomical objects. We show two examples of these outliers, a spectrum with abnormal continuum and a spectrum with emission lines. Our work demonstrates that line index clustering is a good method for examining data quality and identifying rare objects.
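
The k-means step mentioned in the abstract above can be illustrated with a minimal sketch of the standard Lloyd iteration. This is not AstroStat's implementation: the function name, the fixed starting centers, and the two-dimensional "line index" vectors are all hypothetical, and plain Euclidean distance is assumed rather than the Mahalanobis distance the authors use for their intra-class analysis.

```python
import math

def kmeans(points, centers, iterations=20):
    """Lloyd's algorithm: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:  # converged: assignments no longer change
            break
        centers = new_centers
    return labels, centers

# Hypothetical 2-D line-index vectors forming two well-separated groups.
points = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
```

Starting centers are passed in explicitly so the sketch is deterministic; a real analysis would typically use random or k-means++ style initialization and several restarts.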

  19. Oxidation catalysis by polyoxometalates fundamental electron-transfer phenomena

    Treesearch

    Yurii V. Geletii; Rajai H. Atalla; Alan J. Bailey; Laurent Delannoy; Craig L. Hill; Ira A. Weinstock

    2002-01-01

    Early transition-metal oxygen-anion clusters (polyoxometalates, POMs) are a large and rapidly growing class of versatile and tunable oxidation catalysts. All key molecular properties of these clusters (composition, size, shape, charge density, reduction potential, solubility, etc.) can be systematically altered, and the clusters themselves can serve as tunable ligands...

  20. The "p"-Median Model as a Tool for Clustering Psychological Data

    ERIC Educational Resources Information Center

    Kohn, Hans-Friedrich; Steinley, Douglas; Brusco, Michael J.

    2010-01-01

    The "p"-median clustering model represents a combinatorial approach to partition data sets into disjoint, nonhierarchical groups. Object classes are constructed around "exemplars", that is, manifest objects in the data set, with the remaining instances assigned to their closest cluster centers. Effective, state-of-the-art implementations of…
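
The "p"-median idea described above (choose p exemplars from the data so that the total distance from every object to its nearest exemplar is minimal) can be sketched by brute force for tiny data sets. This exhaustive search is only an illustration under that assumption and is not one of the state-of-the-art implementations the abstract refers to. The function name and the scores are hypothetical.

```python
from itertools import combinations

def p_median(points, p, distance):
    """Exhaustive p-median: try every set of p exemplars (indices into
    `points`) and keep the set with the smallest total assignment cost."""
    best_cost, best_exemplars = None, None
    for exemplars in combinations(range(len(points)), p):
        cost = sum(
            min(distance(pt, points[e]) for e in exemplars) for pt in points
        )
        if best_cost is None or cost < best_cost:
            best_cost, best_exemplars = cost, exemplars
    # Assign each object to its closest chosen exemplar.
    assignment = [
        min(best_exemplars, key=lambda e: distance(pt, points[e]))
        for pt in points
    ]
    return best_exemplars, assignment, best_cost

# Hypothetical one-dimensional test scores.
scores = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
exemplars, assignment, cost = p_median(scores, 2, lambda a, b: abs(a - b))
```

The number of candidate exemplar sets grows combinatorially, so this approach is only practical for very small inputs.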

  1. Dolidze-35: Results for a Possible Open Cluster

    NASA Astrophysics Data System (ADS)

    Gulledge, Deborah J.; Borges, Richard A.; Juelfs, Elizabeth; Allyn Smith, J.; Olive, Mary E.; McDonald, Christopher P.; Williams, Sarah M.; Cohen, Eden M.; Gawel, Jason D.; McCole, Bambi A.; Robertson, Jacob M.; Wilson, Tyler; Young, William J.; Buckner, Spencer L.; Allen, Nic R.; Head, H. Hope

    2016-01-01

    Dolidze-35 is an under-observed northern hemisphere open cluster. It is noted in WEBDA as "No data available for this cluster". As such, we chose this cluster as an undergraduate class project to investigate its existence. We present SDSS-ugriz magnitudes for the possible cluster and cross these with existing JHK data obtained from 2MASS. Selection of possible members is aided by the proper motion study of Krone-Martins (2010).

  2. Identifying poor metabolic adaptation during early lactation in dairy cows using cluster analysis.

    PubMed

    Tremblay, M; Kammer, M; Lange, H; Plattner, S; Baumgartner, C; Stegeman, J A; Duda, J; Mansfeld, R; Döpfer, D

    2018-05-02

    Currently, cows with poor metabolic adaptation during early lactation, or poor metabolic adaptation syndrome (PMAS), are often identified based on detection of hyperketonemia. Unfortunately, elevated blood ketones do not manifest consistently with indications of PMAS. Expected indicators of PMAS include elevated liver enzymes and bilirubin, decreased rumen fill, reduced rumen contractions, and a decrease in milk production. Cows with PMAS typically are higher producing, older cows that are earlier in lactation and have greater body condition score at the start of lactation. It was our aim to evaluate commonly used measures of metabolic health (input variables) that were available [i.e., blood β-hydroxybutyrate acid, milk fat:protein ratio, blood nonesterified fatty acids (NEFA)] to characterize PMAS. Bavarian farms (n = 26) with robotic milking systems were enrolled for weekly visits for an average of 6.7 wk. Physical examinations of the cows (5-50 d in milk) were performed by veterinarians during each visit, and blood and milk samples were collected. Resulting data included 790 observations from 312 cows (309 Simmental, 1 Red Holstein, 2 Holstein). Principal component analysis was conducted on the 3 input variables, followed by K-means cluster analysis of the first 2 orthogonal components. The 5 resulting clusters were then ascribed to low, intermediate, or high PMAS classes based on their degree of agreement with expected PMAS indicators and characteristics in comparison with other clusters. Results revealed that PMAS classes were most significantly associated with blood NEFA levels. Next, we evaluated NEFA values that classify observations into appropriate PMAS classes in this data set, which we called separation values. 
Our resulting NEFA separation values [<0.39 mmol/L (95% confidence limits = 0.360-0.410) to identify low PMAS observations and ≥0.7 mmol/L (95% confidence limits = 0.650-0.775) to identify high PMAS observations] were similar to values determined for Holsteins in conventional milking settings diagnosed with hyperketonemia and clinical symptoms such as anorexia and a reduction in milk yield, as reported in the literature. Future studies evaluating additional clinical and laboratory data, breeds, and milking systems are needed to validate these findings. The aim of future studies would be to build a PMAS prediction model to alert producers of cows needing attention and help evaluate on-farm metabolic health management at the herd level. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
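
The NEFA separation values reported above can be applied as a simple threshold rule. In this minimal sketch the two cutoffs come from the abstract, while the function name and the treatment of everything in between as "intermediate" are assumptions made for illustration.

```python
LOW_CUTOFF = 0.39   # mmol/L, below this: low PMAS (value from the abstract)
HIGH_CUTOFF = 0.70  # mmol/L, at or above this: high PMAS (value from the abstract)

def pmas_class(nefa_mmol_per_l):
    """Map a blood NEFA concentration to a PMAS class.
    Labelling the middle range 'intermediate' is an assumption."""
    if nefa_mmol_per_l < LOW_CUTOFF:
        return "low"
    if nefa_mmol_per_l >= HIGH_CUTOFF:
        return "high"
    return "intermediate"

# Hypothetical observations.
classes = [pmas_class(v) for v in (0.20, 0.39, 0.55, 0.70, 1.10)]
```

As the authors note, these values were derived from one data set and still need validation in other breeds and milking systems.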

  3. Structural genes for thiamine biosynthetic enzymes (thiCEFGH) in Escherichia coli K-12.

    PubMed Central

    Vander Horn, P B; Backstrom, A D; Stewart, V; Begley, T P

    1993-01-01

    Escherichia coli K-12 synthesizes thiamine pyrophosphate (vitamin B1) de novo. Two precursors [4-methyl-5-(beta-hydroxyethyl)thiazole monophosphate and 4-amino-5-hydroxymethyl-2-methylpyrimidine pyrophosphate] are coupled to form thiamine monophosphate, which is then phosphorylated to make thiamine pyrophosphate. Previous studies have identified two classes of thi mutations, clustered at 90 min on the genetic map, which result in requirements for the thiazole or the hydroxymethylpyrimidine. We report here our initial molecular genetic analysis of the thi cluster. We cloned the thi cluster genes and examined their organization, structure, and function by a combination of phenotypic testing, complementation analysis, polypeptide expression, and DNA sequencing. We found five tightly linked genes, designated thiCEFGH. The thiC gene product is required for the synthesis of the hydroxymethylpyrimidine. The thiE, thiF, thiG, and thiH gene products are required for synthesis of the thiazole. These mutants did not respond to 1-deoxy-D-threo-2-pentulose, indicating that they are blocked in the conversion of this precursor compound to the thiazole itself. PMID:8432721

  4. A strategy for analysis of (molecular) equilibrium simulations: Configuration space density estimation, clustering, and visualization

    NASA Astrophysics Data System (ADS)

    Hamprecht, Fred A.; Peter, Christine; Daura, Xavier; Thiel, Walter; van Gunsteren, Wilfred F.

    2001-02-01

    We propose an approach for summarizing the output of long simulations of complex systems, affording a rapid overview and interpretation. First, multidimensional scaling techniques are used in conjunction with dimension reduction methods to obtain a low-dimensional representation of the configuration space explored by the system. A nonparametric estimate of the density of states in this subspace is then obtained using kernel methods. The free energy surface is calculated from that density, and the configurations produced in the simulation are then clustered according to the topography of that surface, such that all configurations belonging to one local free energy minimum form one class. This topographical cluster analysis is performed using basin spanning trees which we introduce as subgraphs of Delaunay triangulations. Free energy surfaces obtained in dimensions lower than four can be visualized directly using iso-contours and -surfaces. Basin spanning trees also afford a glimpse of higher-dimensional topographies. The procedure is illustrated using molecular dynamics simulations on the reversible folding of peptide analoga. Finally, we emphasize the intimate relation of density estimation techniques to modern enhanced sampling algorithms.
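
The step from a kernel density estimate to a free energy surface can be sketched in one dimension. The abstract does not give the formula, so this sketch assumes the usual Boltzmann relation F(x) = -kT ln ρ(x), shifted so that the lowest minimum is zero. The function names, bandwidth, and samples are hypothetical.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a function of x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def free_energy_profile(samples, grid, bandwidth, kT=1.0):
    """Assumed relation F(x) = -kT * ln(rho(x)), reported relative to the
    lowest value on the grid."""
    density = gaussian_kde(samples, bandwidth)
    raw = [-kT * math.log(density(x)) for x in grid]
    lowest = min(raw)
    return [f - lowest for f in raw]

# Hypothetical reduced coordinate visiting two basins near -1 and +1.
samples = [-1.1, -1.0, -0.9, -1.0, 1.0, 0.9, 1.1]
grid = [-1.0, 0.0, 1.0]
profile = free_energy_profile(samples, grid, bandwidth=0.3)
```

In this toy example the more populated basin near -1 has the lowest free energy, the basin near +1 lies slightly higher, and the sparsely sampled region at 0 forms a barrier between them.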

  5. Spatial clusters of daytime sleepiness and association with nighttime noise levels in a Swiss general population (GeoHypnoLaus).

    PubMed

    Joost, Stéphane; Haba-Rubio, José; Himsl, Rebecca; Vollenweider, Peter; Preisig, Martin; Waeber, Gérard; Marques-Vidal, Pedro; Heinzer, Raphaël; Guessous, Idris

    2018-05-31

    Daytime sleepiness is highly prevalent in the general adult population and has been linked to an increased risk of workplace and vehicle accidents, lower professional performance and poorer health. Despite the established relationship between noise and daytime sleepiness, little research has explored the individual-level spatial distribution of noise-related sleep disturbances. We assessed the spatial dependence of daytime sleepiness and tested whether clusters of individuals exhibiting higher daytime sleepiness were characterized by higher nocturnal noise levels than other clusters. This was a population-based cross-sectional study in the city of Lausanne, Switzerland. Sleepiness was measured using the Epworth Sleepiness Scale (ESS) for 3697 georeferenced individuals from the CoLaus|PsyCoLaus cohort (period = 2009-2012). We used the sonBASE georeferenced database produced by the Swiss Federal Office for the Environment to characterize nighttime road traffic noise exposure throughout the city. We used the GeoDa software program to calculate the Getis-Ord Gi* statistics for unadjusted and adjusted ESS in order to detect spatial clusters of high and low ESS values. Modeled nighttime noise exposure from road and rail traffic was compared across ESS clusters. Daytime sleepiness was not randomly distributed and showed a significant spatial dependence. The median nighttime traffic noise exposure was significantly different across the three ESS Getis cluster classes (p < 0.001). The mean nighttime noise exposure in the high ESS cluster class was 47.6 dB(A), 5.2 dB(A) higher than in low clusters (p < 0.001) and 2.1 dB(A) higher than in the neutral class (p < 0.001). These associations were independent of major potential confounders including body mass index and neighborhood income level. Clusters of higher daytime sleepiness in adults are associated with higher median nighttime noise levels. The identification of these clusters can guide tailored public health interventions.
Copyright © 2018 The Authors. Published by Elsevier GmbH. All rights reserved.
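
For readers unfamiliar with the Getis-Ord Gi* statistic used above, here is a minimal sketch of its standard z-score form (Ord and Getis, 1995). It is not GeoDa's code, and GeoDa's exact normalization and permutation-based inference may differ. The function name, the ESS values, and the binary weights matrix are hypothetical.

```python
import math

def getis_ord_gi_star(values, weights):
    """Gi* z-score for every location. weights[i][j] is the spatial weight
    between i and j, with weights[i][i] > 0 so each location counts itself."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        numerator = sum(wij * xj for wij, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w**2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical ESS scores at six locations along a street; each location is
# a neighbour of itself and of the locations directly next to it.
ess = [9, 8, 9, 2, 1, 2]
n = len(ess)
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]
gi_star = getis_ord_gi_star(ess, weights)
```

Large positive scores mark clusters of high values (hot spots) and large negative scores mark clusters of low values (cold spots). The denominator vanishes if a location's neighbourhood covers every observation, so such weight rows need separate handling.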

  6. Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity.

    PubMed

    Schneider, Nadine; Lowe, Daniel M; Sayle, Roger A; Landrum, Gregory A

    2015-01-26

    Fingerprint methods applied to molecules have proven to be useful for similarity determination and as inputs to machine-learning models. Here, we present the development of a new fingerprint for chemical reactions and validate its usefulness in building machine-learning models and in similarity assessment. Our final fingerprint is constructed as the difference of the atom-pair fingerprints of products and reactants and includes agents via calculated physicochemical properties. We validated the fingerprints on a large data set of reactions text-mined from granted United States patents from the last 40 years that have been classified using a substructure-based expert system. We applied machine learning to build a 50-class predictive model for reaction-type classification that correctly predicts 97% of the reactions in an external test set. Impressive accuracies were also observed when applying the classifier to reactions from an in-house electronic laboratory notebook. The performance of the novel fingerprint for assessing reaction similarity was evaluated by a cluster analysis that recovered 48 out of 50 of the reaction classes with a median F-score of 0.63 for the clusters. The data sets used for training and primary validation as well as all python scripts required to reproduce the analysis are provided in the Supporting Information.
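
The core construction described above, a reaction fingerprint formed as the difference between product and reactant fingerprints, can be sketched with plain feature counts. This sketch does not use the authors' atom-pair fingerprints or any cheminformatics toolkit. The function name and the bond-type feature labels are hypothetical stand-ins, and the agent-derived physicochemical properties are omitted.

```python
from collections import Counter

def difference_fingerprint(reactant_features, product_features):
    """Count-based difference fingerprint: features gained in the products
    are positive, features lost from the reactants are negative."""
    fp = Counter(product_features)
    fp.subtract(Counter(reactant_features))  # keeps negative counts
    return {feature: count for feature, count in fp.items() if count != 0}

# Hypothetical feature lists for an esterification-like transformation.
reactants = ["C-O", "C=O", "O-H", "O-H"]
products = ["C-O", "C-O", "C=O", "O-H"]
reaction_fp = difference_fingerprint(reactants, products)
```

Features that are unchanged by the reaction cancel out, so the resulting vector focuses on what the transformation creates and destroys.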

  7. Cosmology with XMM galaxy clusters: the X-CLASS/GROND catalogue and photometric redshifts

    NASA Astrophysics Data System (ADS)

    Ridl, J.; Clerc, N.; Sadibekova, T.; Faccioli, L.; Pacaud, F.; Greiner, J.; Krühler, T.; Rau, A.; Salvato, M.; Menzel, M.-L.; Steinle, H.; Wiseman, P.; Nandra, K.; Sanders, J.

    2017-06-01

    The XMM Cluster Archive Super Survey (X-CLASS) is a serendipitously detected X-ray-selected sample of 845 galaxy clusters based on 2774 XMM archival observations and covering approximately 90 deg^2 spread across the high-Galactic latitude (|b| > 20°) sky. The primary goal of this survey is to produce a well-selected sample of galaxy clusters on which cosmological analyses can be performed. This paper presents the photometric redshift follow-up of a high signal-to-noise ratio subset of 265 of these clusters with declination δ < +20° with Gamma-Ray Burst Optical and Near-Infrared Detector (GROND), a 7-channel (grizJHK) simultaneous imager on the MPG 2.2-m telescope at the ESO La Silla Observatory. We use a newly developed technique based on the red sequence colour-redshift relation, enhanced with information coming from the X-ray detection to provide photometric redshifts for this sample. We determine photometric redshifts for 232 clusters, finding a median redshift of z = 0.39 with an accuracy of Δz = 0.02(1 + z) when compared to a sample of 76 spectroscopically confirmed clusters. We also compute X-ray luminosities for the entire sample and find a median bolometric luminosity of 7.2 × 10^43 erg s^-1 and a median temperature of 2.9 keV. We compare our results to those of the XMM-XCS and XMM-XXL surveys, finding good agreement in both samples. The X-CLASS catalogue is available online at http://xmm-lss.in2p3.fr:8080/l4sdb/.

  8. Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.

    PubMed

    Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens

    2016-01-01

    MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.

  9. Patterns of multiple health risk-behaviours in university students and their association with mental health: application of latent class analysis.

    PubMed

    Kwan, M Y; Arbour-Nicitopoulos, K P; Duku, E; Faulkner, G

    2016-08-01

    University and college campuses may be the last setting where it is possible to comprehensively address the health of a large proportion of the young adult population. It is important that health promoters understand the collective challenges students are facing, and to better understand the broader lifestyle behavioural patterning evident during this life stage. The purpose of this study was to examine the clustering of modifiable health-risk behaviours and to explore the relationship between these identified clusters and mental health outcomes among a large Canadian university sample. Undergraduate students (n = 837; mean age = 21 years) from the University of Toronto completed the National College Health Assessment survey. The survey consists of approximately 300 items, including assessments of student health status, mental health and health-risk behaviours. Latent class analysis was used to identify patterning based on eight salient health-risk behaviours (marijuana use, other illicit drug use, risky sex, smoking, binge drinking, poor diet, physical inactivity, and insufficient sleep). A three-class model based on student behavioural patterns emerged: "typical," "high-risk" and "moderately healthy." Results also found high-risk students reporting significantly higher levels of stress than typical students (χ2(1671) = 7.26, p < .01). Students with the highest likelihood of engaging in multiple health-risk behaviours reported poorer mental health, particularly as it relates to stress. Although these findings should be interpreted with caution due to the 28% response rate, they do suggest that interventions targeting specific student groups with similar patterning of multiple health-risk behaviours may be needed.

  10. Aryl Polyenes, a Highly Abundant Class of Bacterial Natural Products, Are Functionally Related to Antioxidative Carotenoids.

    PubMed

    Schöner, Tim A; Gassel, Sören; Osawa, Ayako; Tobias, Nicholas J; Okuno, Yukari; Sakakibara, Yui; Shindo, Kazutoshi; Sandmann, Gerhard; Bode, Helge B

    2016-02-02

    Bacterial pigments of the aryl polyene type are structurally similar to the well-known carotenoids with respect to their polyene systems. Their biosynthetic gene cluster is widespread in taxonomically distant bacteria, and four classes of such pigments have been found. Here we report the structure elucidation of the aryl polyene/dialkylresorcinol hybrid pigments of Variovorax paradoxus B4 by HPLC-UV-MS, MALDI-MS and NMR. Furthermore, we show for the first time that this pigment class protects the bacterium from reactive oxygen species, similarly to what is known for carotenoids. An analysis of the distribution of biosynthetic genes for aryl polyenes and carotenoids in bacterial genomes is presented; it shows a complementary distribution of these protective pigments in bacteria. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Utilizing geogebra in financial mathematics problems: didactic experiment in vocational college

    NASA Astrophysics Data System (ADS)

    Ghozi, Saiful; Yuniarti, Suci

    2017-12-01

    GeoGebra offers users tools for solving real problems in geometry, statistics, and algebra. This study determines the effect of using GeoGebra on students' understanding of financial mathematics. This didactic experiment used a pre-test-post-test control group design. The population was vocational college students in the Banking and Finance Program of Balikpapan State Polytechnic. Two first-semester classes were chosen by cluster random sampling, one as the experiment group and one as the control group. Data were analysed using an independent-samples t-test. The analysis showed that students who learned with GeoGebra had better understanding than students who received conventional instruction. This result supports the view that using GeoGebra in learning can help students enhance their ability and depth of understanding in mathematics.
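
    The independent-samples comparison named in this abstract can be sketched as follows. This is a minimal illustration, assuming the pooled-variance (Student) form of the test; the record does not say which variant the authors used, and the scores and the function name are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Student's independent-samples t statistic with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, dof

# Hypothetical gain scores, not data from the study.
experiment_gain = [12, 15, 11, 14, 13]
control_gain = [9, 10, 8, 11, 7]
t_value, dof = pooled_t_statistic(experiment_gain, control_gain)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```

    The statistic would then be compared against a t distribution with the returned degrees of freedom to obtain a p-value.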

  12. Social class and body weight among Chinese urban adults: the role of the middle classes in the nutrition transition.

    PubMed

    Bonnefond, Céline; Clément, Matthieu

    2014-07-01

    While a plethoric empirical literature addresses the relationship between socio-economic status and body weight, little is known about the influence of social class on nutritional outcomes, particularly in developing countries. The purpose of this article is to contribute to the analysis of the social determinants of adult body weight in urban China by taking into account the influence of social class. More specifically, we propose to analyse the position of the Chinese urban middle class in terms of being overweight or obese. The empirical investigations conducted as part of this research are based on a sample of 1320 households and 2841 adults from the China Health and Nutrition Survey for 2009. For the first step, we combine an economic approach and a sociological approach to identify social classes at household level. First, households with an annual per capita income between 10,000 Yuan and the 95th income percentile are considered as members of the middle class. Second, we strengthen the characterization of the middle class using information on education and employment. By applying clustering methods, we identify four groups: the elderly and inactive middle class, the old middle class, the lower middle class and the new middle class. For the second step, we implement an econometric analysis to assess the influence of social class on adult body mass index and on the probability of being overweight or obese. We use multinomial treatment regressions to deal with the endogeneity of the social class variable. Our results show that among the four subgroups of the urban middle class, the new middle class is the only one to be relatively well-protected against obesity. We suggest that this group plays a special role in adopting healthier food consumption habits and seems to be at a more advanced stage of the nutrition transition. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. Co-clustering phenome–genome for phenotype classification and disease gene discovery

    PubMed Central

    Hwang, TaeHyun; Atluri, Gowtham; Xie, MaoQiang; Dey, Sanjoy; Hong, Changjin; Kumar, Vipin; Kuang, Rui

    2012-01-01

    Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways. PMID:22735708

  14. Profiles in coping: responses to sexual harassment across persons, organizations, and cultures.

    PubMed

    Cortina, Lilia M; Wasti, S Arzu

    2005-01-01

    This study explicates the complexity of sexual harassment coping behavior among 4 diverse samples of working women: (a) working-class Hispanic Americans, (b) working-class Anglo Americans, (c) professional Turks, and (d) professional Anglo Americans. K-means cluster analysis revealed 3 common harassment coping profiles: (a) detached, (b) avoidant negotiating, and (c) support seeking. The authors then tested an integrated framework of coping profile determinants, involving social power, stressor severity, social support, and culture. Analysis of variance, chi-square, and discriminant function results identified significant determinants at each of the 4 levels of this ecological model. These findings underscore the importance of focusing on whole patterns of experience--and considering influences at the level of the individual employee and multiple levels of the surrounding context--when studying how women cope with workplace sexual harassment.
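
    A K-means analysis of the kind named in this abstract can be sketched in a few lines. This is an illustrative sketch only: the three score dimensions, the data, and the choice to seed centroids with the first k points are assumptions, not details taken from the study.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical respondents scored on (avoidance, negotiation, support seeking).
responses = [
    (0.1, 0.2, 0.1), (0.9, 0.8, 0.2), (0.2, 0.3, 0.9),
    (0.2, 0.1, 0.0), (0.8, 0.9, 0.1), (0.1, 0.2, 0.8),
]
labels, centroids = kmeans(responses, k=3)
print(labels)
```

    In practice the result depends on the seeding, so analyses of this kind are usually repeated from several starting configurations.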

  15. EEG microstates during resting represent personality differences.

    PubMed

    Schlegel, Felix; Lehmann, Dietrich; Faber, Pascal L; Milz, Patricia; Gianotti, Lorena R R

    2012-01-01

    We investigated the spontaneous brain electric activity of 13 skeptics and 16 believers in paranormal phenomena; they were university students assessed with a self-report scale about paranormal beliefs. 33-channel EEG recordings during no-task resting were processed as sequences of momentary potential distribution maps. Based on the maps at peak times of Global Field Power, the sequences were parsed into segments of quasi-stable potential distribution, the 'microstates'. The microstates were clustered into four classes of map topographies (A-D). Analysis of the microstate parameters time coverage, occurrence frequency and duration as well as the temporal sequence (syntax) of the microstate classes revealed significant differences: Believers had a higher coverage and occurrence of class B, tended to decreased coverage and occurrence of class C, and showed a predominant sequence of microstate concatenations from A to C to B to A that was reversed in skeptics (A to B to C to A). Microstates of different topographies, putative "atoms of thought", are hypothesized to represent different types of information processing. The study demonstrates that personality differences can be detected in resting EEG microstate parameters and microstate syntax. Microstate analysis yielded no conclusive evidence for the hypothesized relation between paranormal belief and schizophrenia.
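
    The "syntax" analysis described above amounts to counting which microstate class follows which. The sketch below shows one simple way to tabulate such transitions; the label sequence and the function name are hypothetical, and the study's own statistics are not reproduced here.

```python
from collections import Counter

def transition_counts(sequence):
    """Count transitions between successive, distinct microstate classes."""
    # Collapse runs of the same label so each segment contributes one entry.
    segments = [s for i, s in enumerate(sequence) if i == 0 or s != sequence[i - 1]]
    return Counter(zip(segments, segments[1:]))

# Hypothetical per-sample class labels over time; not data from the study.
labels = "AACCBBAACBA"
counts = transition_counts(labels)
print(counts)
```

    A sequence dominated by A to C, C to B, and B to A transitions, as in this toy input, would correspond to the pattern the abstract reports for believers.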

  16. Using multivariate techniques to assess the effects of urbanization on surface water quality: a case study in the Liangjiang New Area, China.

    PubMed

    Luo, Kun; Hu, Xuebin; He, Qiang; Wu, Zhengsong; Cheng, Hao; Hu, Zhenlong; Mazumder, Asit

    2017-04-01

    Rapid urbanization in China has been causing dramatic deterioration in the water quality of rivers and threatening aquatic ecosystem health. In this paper, multivariate techniques, such as factor analysis (FA) and cluster analysis (CA), were applied to analyze the water quality datasets for 19 rivers in Liangjiang New Area (LJNA), China, collected in April (dry season) and September (wet season) of 2014 and 2015. In most sampled rivers, total phosphorus, total nitrogen, and fecal coliform exceeded the Class V guideline (GB3838-2002), which could thereby threaten the water quality in the Yangtze and Jialing Rivers. FA clearly identified five groups of water quality variables, which explain the majority of the experimental data. Nutrient pollution, seasonal changes, and construction activities were three key factors influencing rivers' water quality in LJNA. CA grouped the 19 sampling sites into two clusters, which were located in sub-catchments with high- and low-level urbanization, respectively. One-way ANOVA showed that the nutrients (total phosphorus, soluble reactive phosphorus, total nitrogen, ammonium nitrogen, and nitrite), fecal coliform, and conductivity in cluster 1 were significantly greater than in cluster 2. Thus, catchment urbanization degraded rivers' water quality in Liangjiang New Area. Identifying effective buffer zones at riparian scale to weaken the negative impacts of catchment urbanization was recommended.
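
    The cluster comparison mentioned above rests on the one-way ANOVA F statistic, which can be computed directly as the ratio of between-group to within-group mean squares. The following is a minimal sketch with hypothetical concentrations; it is not the authors' code or data.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical total nitrogen values (mg/L) for sites in two clusters.
cluster_1 = [4, 5, 6]
cluster_2 = [1, 2, 3]
f_value = one_way_anova_f([cluster_1, cluster_2])
print(f"F = {f_value:.1f}")
```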

  17. The Fe-S cluster-containing NEET proteins mitoNEET and NAF-1 as chemotherapeutic targets in breast cancer.

    PubMed

    Bai, Fang; Morcos, Faruck; Sohn, Yang-Sung; Darash-Yahana, Merav; Rezende, Celso O; Lipper, Colin H; Paddock, Mark L; Song, Luhua; Luo, Yuting; Holt, Sarah H; Tamir, Sagi; Theodorakis, Emmanuel A; Jennings, Patricia A; Onuchic, José N; Mittler, Ron; Nechushtai, Rachel

    2015-03-24

    Identification of novel drug targets and chemotherapeutic agents is a high priority in the fight against cancer. Here, we report that MAD-28, a designed cluvenone (CLV) derivative, binds to and destabilizes two members of a unique class of mitochondrial and endoplasmic reticulum (ER) 2Fe-2S proteins, mitoNEET (mNT) and nutrient-deprivation autophagy factor-1 (NAF-1), recently implicated in cancer cell proliferation. Docking analysis of MAD-28 to mNT/NAF-1 revealed that in contrast to CLV, which formed a hydrogen bond network that stabilized the 2Fe-2S clusters of these proteins, MAD-28 broke the coordinative bond between the His ligand and the cluster's Fe of mNT/NAF-1. Analysis of MAD-28 performed with control (Michigan Cancer Foundation; MCF-10A) and malignant (M.D. Anderson-metastatic breast; MDA-MB-231 or MCF-7) human epithelial breast cells revealed that MAD-28 had a high specificity in the selective killing of cancer cells, without any apparent effects on normal breast cells. MAD-28 was found to target the mitochondria of cancer cells and displayed a surprising similarity in its effects to the effects of mNT/NAF-1 shRNA suppression in cancer cells, causing a decrease in respiration and mitochondrial membrane potential, as well as an increase in mitochondrial iron content and glycolysis. As expected, if the NEET proteins are targets of MAD-28, cancer cells with suppressed levels of NAF-1 or mNT were less susceptible to the drug. Taken together, our results suggest that NEET proteins are a novel class of drug targets in the chemotherapeutic treatment of breast cancer, and that MAD-28 can now be used as a template for rational drug design for NEET Fe-S cluster-destabilizing anticancer drugs.

  18. Identification and classification of cathinone unknowns by statistical analysis processing of direct analysis in real time-high resolution mass spectrometry-derived "neutral loss" spectra.

    PubMed

    Fowble, Kristen L; Shepard, Jason R E; Musah, Rabi A

    2018-03-01

    An approach to the rapid determination of the structures of novel synthetic cathinone designer drugs, also known as bath salts, is reported. While cathinones fragment so extensively by electron impact mass spectrometry that their mass spectra often cannot be used to identify the structure, collision-induced dissociation (CID) direct analysis in real time-high resolution mass spectrometry (DART-HRMS) experiments furnished spectra that provided diagnostic fragmentation patterns for the analyzed cathinones. From these data, neutral loss spectra, which reflect the presence of specific chemical moieties, could be acquired. These spectra showed striking similarities between cathinones sharing structural features such as pyrrolidine rings and methylenedioxy moieties. Principal component analysis (PCA) of the neutral loss spectra of nine synthetic cathinones of various types including ethcathinones, those containing a methylenedioxy moiety appended to the benzene ring, and pyrrolidine-containing structures, illustrated that cathinones falling within the same class clustered together and could be distinguished from those of other classes. Furthermore, hierarchical clustering analysis of the neutral loss data of a model set derived from 44 synthetic cathinones furnished a dendrogram in which structurally similar cathinones clustered together. The ability of this model system to facilitate structure determination was tested using 4-fluoroethcathinone, 3,4-methylenedioxy-α-pyrrolidinohexanophenone (MDPHP), and ethylone, which fall into the ethcathinone, pyrrolidine-containing, and methylenedioxy-containing subclasses respectively. The results showed that their neutral loss spectra correctly fell within the ethcathinone, pyrrolidine-containing and methylenedioxy-containing cathinone clades of the dendrogram, and that the neutral loss information could be used to infer the structures of these compounds. The analysis and data processing steps are rapid and samples can be analyzed in their native form without any sample processing steps. The robustness of the dendrogram dataset can be readily increased by continued addition of newly discovered structures. The approach can be broadly applied to structure determination of unknowns, and would be particularly useful for analyses where sample amounts are limited. Copyright © 2017 Elsevier B.V. All rights reserved.
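
    A neutral loss spectrum records the mass differences between a precursor ion and its fragments, so two compounds that shed the same moieties share entries even when their absolute masses differ. The sketch below illustrates the idea with hypothetical m/z values; the Jaccard overlap is a stand-in similarity measure chosen for brevity, not the PCA or hierarchical clustering used in the study.

```python
def neutral_losses(precursor_mz, fragment_mzs, decimals=2):
    """Return the mass differences between a precursor ion and its fragments."""
    return sorted(
        round(precursor_mz - mz, decimals) for mz in fragment_mzs if mz < precursor_mz
    )

def shared_loss_fraction(losses_a, losses_b):
    """Jaccard overlap of two neutral loss lists (a stand-in similarity measure)."""
    a, b = set(losses_a), set(losses_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical precursor and fragment m/z values for two related compounds.
losses_a = neutral_losses(200.10, [182.09, 129.03])
losses_b = neutral_losses(214.12, [196.11, 143.05, 186.09])
similarity = shared_loss_fraction(losses_a, losses_b)
print(losses_a, losses_b, round(similarity, 2))
```

    A pairwise similarity of this kind is the sort of input from which a dendrogram can be built.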

  19. DYNER: A DYNamic ClustER for Education and Research

    ERIC Educational Resources Information Center

    Kehagias, Dimitris; Grivas, Michael; Mamalis, Basilis; Pantziou, Grammati

    2006-01-01

    Purpose: The purpose of this paper is to evaluate the use of a non-expensive dynamic computing resource, consisting of a Beowulf class cluster and a NoW, as an educational and research infrastructure. Design/methodology/approach: Clusters, built using commodity-off-the-shelf (COTS) hardware components and free, or commonly used, software, provide…

  20. Branching points in the low-temperature dipolar hard sphere fluid

    NASA Astrophysics Data System (ADS)

    Rovigatti, Lorenzo; Kantorovich, Sofia; Ivanov, Alexey O.; Tavares, José Maria; Sciortino, Francesco

    2013-10-01

    In this contribution, we investigate the low-temperature, low-density behaviour of dipolar hard-sphere (DHS) particles, i.e., hard spheres with dipoles embedded in their centre. We aim at describing the DHS fluid in terms of a network of chains and rings (the fundamental clusters) held together by branching points (defects) of different nature. We first introduce a systematic way of classifying inter-cluster connections according to their topology, and then employ this classification to analyse the geometric and thermodynamic properties of each class of defects, as extracted from state-of-the-art equilibrium Monte Carlo simulations. By computing the average density and energetic cost of each defect class, we find that the relevant contribution to inter-cluster interactions is indeed provided by (rare) three-way junctions and by four-way junctions arising from parallel or anti-parallel locally linear aggregates. All other (numerous) defects are either intra-cluster or associated to low cluster-cluster interaction energies, suggesting that these defects do not play a significant part in the thermodynamic description of the self-assembly processes of dipolar hard spheres.

  1. "A Richness Study of 14 Distant X-Ray Clusters from the 160 Square Degree Survey"

    NASA Technical Reports Server (NTRS)

    Jones, Christine; West, Donald (Technical Monitor)

    2001-01-01

    We have measured the surface density of galaxies toward 14 X-ray-selected cluster candidates at redshifts z ≳ 0.46, and we show that they are associated with rich galaxy concentrations. These clusters, having X-ray luminosities of L_X(0.5-2 keV) ≈ (0.5-2.6) × 10^44 erg s^-1, are among the most distant and luminous in our 160 deg^2 ROSAT Position Sensitive Proportional Counter cluster survey. We find that the clusters range between Abell richness classes 0 and 2 and have a most probable richness class of 1. We compare the richness distribution of our distant clusters to those for three samples of nearby clusters with similar X-ray luminosities. We find that the nearby and distant samples have similar richness distributions, which shows that clusters have apparently not evolved substantially in richness since redshift z = 0.5. There is, however, a marginal tendency for the distant clusters to be slightly poorer than nearby clusters, although deeper multicolor data for a large sample would be required to confirm this trend. We compare the distribution of distant X-ray clusters in the L_X-richness plane to the distribution of optically selected clusters from the Palomar Distant Cluster Survey. The optically selected clusters appear overly rich for their X-ray luminosities, when compared to X-ray-selected clusters. Apparently, X-ray and optical surveys do not necessarily sample identical mass concentrations at large redshifts. This may indicate the existence of a population of optically rich clusters with anomalously low X-ray emission. More likely, however, it reflects the tendency for optical surveys to select unvirialized mass concentrations, as might be expected when peering along large-scale filaments.

  2. Monothiol glutaredoxins and A-type proteins: partners in Fe-S cluster trafficking.

    PubMed

    Mapolelo, Daphne T; Zhang, Bo; Randeniya, Sajini; Albetel, Angela-Nadia; Li, Haoran; Couturier, Jérémy; Outten, Caryn E; Rouhier, Nicolas; Johnson, Michael K

    2013-03-07

    Monothiol glutaredoxins (Grxs) are proposed to function in Fe-S cluster storage and delivery, based on their ability to exist as apo monomeric forms and dimeric forms containing a subunit-bridging [Fe(2)S(2)](2+) cluster, and to accept [Fe(2)S(2)](2+) clusters from primary scaffold proteins. In addition yeast cytosolic monothiol Grxs interact with Fra2 (Fe repressor of activation-2), to form a heterodimeric complex with a bound [Fe(2)S(2)](2+) cluster that plays a key role in iron sensing and regulation of iron homeostasis. In this work, we report on in vitro UV-visible CD studies of cluster transfer between homodimeric monothiol Grxs and members of the ubiquitous A-type class of Fe-S cluster carrier proteins ((Nif)IscA and SufA). The results reveal rapid, unidirectional, intact and quantitative cluster transfer from the [Fe(2)S(2)](2+) cluster-bound forms of A. thaliana GrxS14, S. cerevisiae Grx3, and A. vinelandii Grx-nif homodimers to A. vinelandii(Nif)IscA and from A. thaliana GrxS14 to A. thaliana SufA1. Coupled with in vivo evidence for interaction between monothiol Grxs and A-type Fe-S cluster carrier proteins, the results indicate that these two classes of proteins work together in cellular Fe-S cluster trafficking. However, cluster transfer is reversed in the presence of Fra2, since the [Fe(2)S(2)](2+) cluster-bound heterodimeric Grx3-Fra2 complex can be formed by intact [Fe(2)S(2)](2+) cluster transfer from (Nif)IscA. The significance of these results for Fe-S cluster biogenesis or repair and the cellular regulation of the Fe-S cluster status are discussed.

  3. Sequence determination and analysis of S-adenosyl-L-homocysteine hydrolase from yellow lupine (Lupinus luteus).

    PubMed

    Brzeziński, K; Janowski, R; Podkowiński, J; Jaskólski, M

    2001-01-01

    The coding sequences of two S-adenosyl-L-homocysteine hydrolases (SAHases) were identified in yellow lupine by screening of a cDNA library. One of them, corresponding to the complete protein, was sequenced and compared with 52 other SAHase sequences. Phylogenetic analysis of these proteins identified three groups of the enzymes. Group A comprises only bacterial sequences. Group B is subdivided into two subgroups, one of which (B1) is formed by animal sequences. Subgroup B2 consists of two distinct clusters, B2a and B2b. Cluster B2b comprises all known plant sequences, including the yellow lupine enzyme, which are distinguished by a 50-residue insert. Group C is heterogeneous and contains SAHases from Archaea as well as a new class of animal enzymes, distinctly different from those in group B1.

  4. High-Resolution Metabolomics for Nutrition and Health Assessment of Armed Forces Personnel.

    PubMed

    Accardi, Carolyn Jonas; Walker, Douglas I; Uppal, Karan; Quyyumi, Arshed A; Rohrbeck, Patricia; Pennell, Kurt D; Mallon, Col Timothy M; Jones, Dean P

    2016-08-01

    The aim of this study was to test the utility of high-resolution metabolomics (HRM) for analysis of nutritional status and health indicators in military personnel. Serum samples from 400 military personnel were obtained from the Department of Defense Serum Repository (DoDSR) and analyzed for metabolites related to nutrition and health status. Metabolic profile organization was studied using modulated modularity clustering (MMC). HRM provided quantitative measures of 61 metabolites across chemical classes for use as nutritional and clinical biomarkers. Levels were comparable to reported values except for arginine and glutamine, which were above and below reference ranges, respectively. MMC generated five clusters, three of which were associated and contained amino acids. The others contained lipids and mitochondria-related metabolites. HRM analysis of serum is suitable for real-time and/or retrospective evaluation of nutrition and health status of specific military cohorts.

  5. Tongue Color Analysis for Medical Application

    PubMed Central

    Wang, Xingzheng; You, Jane

    2013-01-01

    An in-depth systematic tongue color analysis system for medical applications is proposed. Using the tongue color gamut, tongue foreground pixels are first extracted and assigned to one of 12 colors representing this gamut. The ratio of each color for the entire image is calculated and forms a tongue color feature vector. Experimenting on a large dataset consisting of 143 Healthy and 902 Disease (13 groups of more than 10 samples and one miscellaneous group), a given tongue sample can be classified into one of these two classes with an average accuracy of 91.99%. Further testing showed that Disease samples can be split into three clusters, and within each cluster most if not all the illnesses are distinguished from one another. In total 11 illnesses have a classification rate greater than 70%. This demonstrates a relationship between the state of the human body and its tongue color. PMID:23737824
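
    The feature vector described above, the share of foreground pixels assigned to each gamut color, can be sketched as a nearest-color histogram. The palette below has three hypothetical RGB entries for brevity (the paper uses 12 gamut colors), and the Euclidean RGB distance is an assumption.

```python
def color_feature_vector(pixels, palette):
    """Fraction of pixels assigned to each palette color (nearest in RGB)."""
    counts = [0] * len(palette)
    for pixel in pixels:
        nearest = min(
            range(len(palette)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, palette[i])),
        )
        counts[nearest] += 1
    return [c / len(pixels) for c in counts]

# Hypothetical palette and foreground pixels; not values from the paper.
palette = [(200, 60, 60), (230, 150, 150), (120, 40, 90)]
pixels = [(198, 62, 58), (228, 148, 152), (205, 65, 66), (118, 45, 88)]
features = color_feature_vector(pixels, palette)
print(features)
```

    A vector of this form, one ratio per palette color, is what a downstream classifier would take as input.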

  6. Metabolite Profiling Reveals Developmental Inequalities in Pinot Noir Berry Tissues Late in Ripening.

    PubMed

    Vondras, Amanda M; Commisso, Mauro; Guzzo, Flavia; Deluc, Laurent G

    2017-01-01

    Uneven ripening in Vitis vinifera is increasingly recognized as a phenomenon of interest, with substantial implications for fruit and wine composition and quality. This study sought to determine whether variation late in ripening (∼Modified Eichhorn-Lorenz stage 39) was associated with developmental differences that were observable as fruits within a cluster initiated ripening (véraison). Four developmentally distinct ripening classes of berries were tagged at cluster véraison, sampled at three times late in ripening, and subjected to untargeted HPLC-MS to measure variation in amino acids, sugars, organic acids, and phenolic metabolites in skin, pulp, and seed tissues separately. Variability was described using predominantly two strategies. In the first, multivariate analysis (Orthogonal Projections to Latent Structures-Discriminant Analysis, OPLS-DA) was used to determine whether fruits were still distinguishable per their developmental position at véraison and to identify which metabolites accounted for these distinctions. The same technique was used to assess changes in each tissue over time. In a second strategy and for each annotated metabolite, the variance across the ripening classes at each time point was measured to show whether intra-cluster variance (ICV) was growing, shrinking, or constant over the period observed. Indeed, berries could be segregated by OPLS-DA late in ripening based on their developmental position at véraison, though the four ripening classes were aggregated into two larger ripening groups. Further, not all tissues were dynamic over the period examined. Although pulp tissues could be segregated by time sampled, this was not true for seed and only moderately so for skin. Ripening group differences in seed and skin, rather than the time fruit was sampled, were better able to define berries. Metabolites also experienced significant reductions in ICV between single pairs of time points, but never across the entire experiment. Metabolites often exhibited a combination of ICV expansion, contraction and persistence. Finally, we observed significant differences in the abundance of some metabolites between ripening classes that suggest the berries that initiated ripening first remained developmentally ahead of the lagging fruit even late in the ripening phase. This presents a challenge to producers who would seek to harvest at uniformity or at a predefined level of variation.
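
    The second strategy in this abstract, tracking the variance across ripening classes at each time point, can be sketched as below. This is a simplified illustration: the abundances are hypothetical, population variance is an assumed choice, and the direct comparison of consecutive values stands in for the significance testing the study would have applied.

```python
from statistics import pvariance

def icv_trend(abundance_by_time):
    """Variance across ripening classes at each time point, plus pairwise trends."""
    icv = [pvariance(values) for values in abundance_by_time]
    trends = []
    for earlier, later in zip(icv, icv[1:]):
        if later < earlier:
            trends.append("contraction")
        elif later > earlier:
            trends.append("expansion")
        else:
            trends.append("persistence")
    return icv, trends

# Hypothetical abundances of one metabolite:
# one row per time point, one value per ripening class.
abundance = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.5, 3.0, 3.5],
    [1.0, 2.0, 3.0, 4.0],
]
icv, trends = icv_trend(abundance)
print(icv, trends)
```

    In this toy input the variance contracts between the first pair of time points and expands again between the second, so it shows no net reduction across the whole series.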

  7. The search for person-related information in general practice: a qualitative study.

    PubMed

    Schrans, Diego; Avonts, Dirk; Christiaens, Thierry; Willems, Sara; de Smet, Kaat; van Boven, Kees; Boeckxstaens, Pauline; Kühlein, Thomas

    2016-02-01

    General practice is person-focused. Contextual information influences the clinical decision-making process in primary care. Currently, person-related information (PeRI) is neither recorded in a systematic way nor coded in the electronic medical record (EMR), and therefore not usable for scientific use. To search for classes of PeRI influencing the process of care. GPs, from nine countries worldwide, were asked to write down narrative case histories where personal factors played a role in decision-making. In an inductive process, the case histories were consecutively coded according to classes of PeRI. The classes found were deductively applied to the following cases and refined, until saturation was reached. Then, the classes were grouped into code-families and further clustered into domains. The inductive analysis of 32 case histories resulted in 33 defined PeRI codes, classifying all personal-related information in the cases. The 33 codes were grouped in the following seven mutually exclusive code-families: 'aspects between patient and formal care provider', 'social environment and family', 'functioning/behaviour', 'life history/non-medical experiences', 'personal medical information', 'socio-demographics' and 'work-/employment-related information'. The code-families were clustered into four domains: 'social environment and extended family', 'medicine', 'individual' and 'work and employment'. As PeRI is used in the process of decision-making, it should be part of the EMR. The PeRI classes we identified might form the basis of a new contextual classification mainly for research purposes. This might help to create evidence of the person-centredness of general practice. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  8. A class of compact dwarf galaxies from disruptive processes in galaxy clusters.

    PubMed

    Drinkwater, M J; Gregg, M D; Hilker, M; Bekki, K; Couch, W J; Ferguson, H C; Jones, J B; Phillipps, S

    2003-05-29

    Dwarf galaxies have attracted increased attention in recent years, because of their susceptibility to galaxy transformation processes within rich galaxy clusters. Direct evidence for these processes, however, has been difficult to obtain, with a small number of diffuse light trails and intra-cluster stars being the only signs of galaxy disruption. Furthermore, our current knowledge of dwarf galaxy populations may be very incomplete, because traditional galaxy surveys are insensitive to extremely diffuse or compact galaxies. Aware of these concerns, we recently undertook an all-object survey of the Fornax galaxy cluster. This revealed a new population of compact members, overlooked in previous conventional surveys. Here we demonstrate that these 'ultra-compact' dwarf galaxies are structurally and dynamically distinct from both globular star clusters and known types of dwarf galaxy, and thus represent a new class of dwarf galaxy. Our data are consistent with the interpretation that these are the remnant nuclei of disrupted dwarf galaxies, making them an easily observed tracer of galaxy disruption.

  9. 49 CFR 192.5 - Class locations.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... nearest building with four or more stories above ground. (2) When a cluster of buildings intended for... the nearest building in the cluster. [Amdt. 192-78, 61 FR 28783, June 6, 1996; 61 FR 35139, July 5...

  10. Mineral constituents profile of biochar derived from diversified waste biomasses: implications for agricultural applications.

    PubMed

    Zhao, Ling; Cao, Xinde; Wang, Qun; Yang, Fan; Xu, Shi

    2013-01-01

    The wide distribution and high heterogeneity of different elements in biochars derived from diverse feedstocks make it difficult to regulate their application in soil and to evaluate the maximum potential contribution of the nutrients and trace metals as well as the potential risk of toxic metals. This study classified 20 biochars, covering six typical categories, into three clusters according to their similarity and distance on nutrients and minerals using cluster analysis. Four principal components (PCs) were extracted using factor analysis to reduce dimension and clearly characterize the mineral profile of these biochars. The contribution of each group of elements in the PCs to every cluster was clarified. PC1 had a high loading for Mg, Cu, Zn, Al, and Fe; PC2 was related to N, K, and Mn; and PC3 and PC4 mainly represented P and Ca. Cluster 1 included bone dregs and eggshell biochars with PC3 and PC4 as the main contributors. Cluster 2 included waterweeds and waste paper biochars, which were close to shrimp hull and chlorella biochars, with the main contributions being from PC2 and PC4. Cluster 3 included biochars with PC1 as the main contributor. At a soil biochar amendment rate of 50 t ha^-1, the soil nutrients were significantly elevated, whereas the rise in toxic metals was negligible compared with Class I of the China Environmental Quality Standards for Soil. Biochar can potentially supply soil nutrients and trace metals, and different cluster biochars can be applied appropriately to different soils so that excessive or deficient nutrient and metal applications can be avoided. Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.

  11. Simultaneous alignment and clustering of peptide data using a Gibbs sampling approach.

    PubMed

    Andreatta, Massimo; Lund, Ole; Nielsen, Morten

    2013-01-01

    Proteins recognizing short peptide fragments play a central role in cellular signaling. As a result of high-throughput technologies, peptide-binding protein specificities can be studied using large peptide libraries at dramatically lower cost and time. Interpretation of such large peptide datasets, however, is a complex task, especially when the data contain multiple receptor binding motifs, and/or the motifs are found at different locations within distinct peptides. The algorithm presented in this article, based on Gibbs sampling, identifies multiple specificities in peptide data by performing two essential tasks simultaneously: alignment and clustering of peptide data. We apply the method to de-convolute binding motifs in a panel of peptide datasets with different degrees of complexity spanning from the simplest case of pre-aligned fixed-length peptides to cases of unaligned peptide datasets of variable length. Example applications described in this article include mixtures of binders to different MHC class I and class II alleles, distinct classes of ligands for SH3 domains and sub-specificities of the HLA-A*02:01 molecule. The Gibbs clustering method is available online as a web server at http://www.cbs.dtu.dk/services/GibbsCluster.

  12. Subtyping depression by clinical features: the Australasian database.

    PubMed

    Parker, G; Roy, K; Hadzi-Pavlovic, D; Mitchell, P; Wilhelm, K; Menkes, D B; Snowdon, J; Loo, C; Schweitzer, I

    2000-01-01

    To distinguish psychotic, melancholic and a residual non-melancholic class on the basis of clinical features alone. Previous studies at our Mood Disorders Unit (MDU) favour a hierarchical model, with the classes able to be distinguished by two specific clinical features, but any such intramural study risks rater bias and requires external replication. This replication study involved 27 Australasian psychiatrist raters, thus extending the sample and raters beyond the MDU facility. They collected clinical feature data using a standardized assessment with precoded rating options. A psychotic depression (PD) class was derived by respecting DSM-IV decision rules while a cluster analysis distinguished melancholic (MEL) and non-melancholic classes. The MELs were distinguished virtually entirely by the presence of significant psychomotor disturbance (PMD), as rated by the observationally based CORE measure, with over-representation on only three of an extensive set of 'endogeneity symptoms'. In comparison to PMD, endogeneity symptoms appear to be poor indicators of 'melancholic' type, confounding typology with severity. Results again support the hierarchical model.

  13. Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution

    PubMed Central

    Lo, Kenneth

    2011-01-01

    Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components. PMID:22125375

  14. Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution.

    PubMed

    Lo, Kenneth; Gottardo, Raphael

    2012-01-01

    Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.
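
    The Box-Cox transformation named in this abstract can be sketched as follows. This is only the standard one-parameter transform applied to a single positive value; the function names and the toy values are assumptions for illustration, and the authors' EM-based estimation and transformation selection are not reproduced here.

```python
import math

def box_cox(x, lam):
    """Standard Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if abs(lam) < 1e-12:
        return math.log(x)            # limiting case as lam -> 0
    return (x ** lam - 1.0) / lam

def inverse_box_cox(y, lam):
    """Map a transformed value back to the original scale."""
    if abs(lam) < 1e-12:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)

# Right-skewed toy values (hypothetical, not data from the paper).
skewed = [0.5, 1.0, 2.0, 4.0, 8.0]
# With lam = 0 the transform is the natural log, so these become evenly spaced.
transformed = [box_cox(v, 0.0) for v in skewed]
```

    In a mixture-model setting, lam would be estimated together with the component parameters; here it is fixed by hand purely to show how the transform removes skewness.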

  15. Joint Clustering and Component Analysis of Correspondenceless Point Sets: Application to Cardiac Statistical Modeling.

    PubMed

    Gooya, Ali; Lekadir, Karim; Alba, Xenia; Swift, Andrew J; Wild, Jim M; Frangi, Alejandro F

    2015-01-01

    Construction of Statistical Shape Models (SSMs) from arbitrary point sets is a challenging problem due to significant shape variation and lack of explicit point correspondence across the training data set. In medical imaging, point sets can generally represent different shape classes that span healthy and pathological exemplars. In such cases, the constructed SSM may not generalize well, largely because the probability density function (pdf) of the point sets deviates from the underlying assumption of Gaussian statistics. To this end, we propose a generative model for unsupervised learning of the pdf of point sets as a mixture of distinctive classes. A Variational Bayesian (VB) method is proposed for making joint inferences on the labels of point sets, and the principal modes of variations in each cluster. The method provides a flexible framework to handle point sets with no explicit point-to-point correspondences. We also show that by maximizing the marginalized likelihood of the model, the optimal number of clusters of point sets can be determined. We illustrate this work in the context of understanding the anatomical phenotype of the left and right ventricles in heart. To this end, we use a database containing hearts of healthy subjects, patients with Pulmonary Hypertension (PH), and patients with Hypertrophic Cardiomyopathy (HCM). We demonstrate that our method can outperform traditional PCA in both generalization and specificity measures.

  16. Universal dynamical properties preclude standard clustering in a large class of biochemical data.

    PubMed

    Gomez, Florian; Stoop, Ralph L; Stoop, Ruedi

    2014-09-01

    Clustering of chemical and biochemical data based on observed features is a central cognitive step in the analysis of chemical substances, in particular in combinatorial chemistry, or of complex biochemical reaction networks. Often, for reasons unknown to the researcher, this step produces disappointing results. Once the sources of the problem are known, improved clustering methods might revitalize the statistical approach of compound and reaction search and analysis. Here, we present a generic mechanism that may be at the origin of many clustering difficulties. The variety of dynamical behaviors that can be exhibited by complex biochemical reactions on variation of the system parameters are fundamental system fingerprints. In parameter space, shrimp-like or swallow-tail structures separate parameter sets that lead to stable periodic dynamical behavior from those leading to irregular behavior. We work out the genericity of this phenomenon and demonstrate novel examples for their occurrence in realistic models of biophysics. Although we elucidate the phenomenon by considering the emergence of periodicity in dependence on system parameters in a low-dimensional parameter space, the conclusions from our simple setting are shown to continue to be valid for features in a higher-dimensional feature space, as long as the feature-generating mechanism is not too extreme and the dimension of this space is not too high compared with the amount of available data. For online versions of super-paramagnetic clustering see http://stoop.ini.uzh.ch/research/clustering. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  17. GeoGebra Assist Discovery Learning Model for Problem Solving Ability and Attitude toward Mathematics

    NASA Astrophysics Data System (ADS)

    Murni, V.; Sariyasa, S.; Ardana, I. M.

    2017-09-01

    This study aims to describe the effect of GeoGebra utilization in the discovery learning model on mathematical problem solving ability and students’ attitude toward mathematics. This research was quasi-experimental, using a post-test-only control group design. The population in this study was 181 students. The sampling technique used was cluster random sampling, so the sample in this study was 120 students divided into 4 classes, 2 classes for the experimental class and 2 classes for the control class. Data were analyzed by using one-way MANOVA. The results of data analysis showed that the utilization of GeoGebra in discovery learning can lead to better problem solving ability and attitudes toward mathematics. This is because the presentation of problems using GeoGebra can assist students in identifying and solving problems and attract students’ interest, because GeoGebra provides an immediate response process to students. The results of the research indicate that the utilization of GeoGebra in discovery learning can be applied in learning and teaching wider subject matter, beyond the subject matter in this study.

  18. Multiple Service System Involvement and Later Offending Behavior: Implications for Prevention and Early Intervention.

    PubMed

    Bright, Charlotte Lyn; Jonson-Reid, Melissa

    2015-07-01

    We investigated patterns of childhood and adolescent experiences that correspond to later justice system entry, including persistence into adulthood, and explored whether timing of potential supports to the child or onset of family poverty, according to developmental periods and gender, would distinguish among latent classes. We constructed a database containing records for 8587 youths from a Midwestern metropolitan region, born between 1982 and 1991, with outcomes. We used data from multiple publicly funded systems (child welfare, income maintenance, juvenile and criminal justice, mental health, Medicaid, vital statistics). We applied a latent class analysis and interpreted a 7-class model. Classes with higher rates of offending persisting into adulthood were characterized by involvement with multiple publicly funded systems in childhood and adolescence, with the exception of 1 less-urban, predominantly female class that had similarly high system involvement coupled with lower rates of offending. Poverty and maltreatment appear to play a critical role in offending trajectories. Identifying risk factors that cluster together may help program and intervention staff best target those most in need of more intensive intervention.

  19. Cellulose synthase 'class specific regions' are intrinsically disordered and functionally undifferentiated.

    PubMed

    Scavuzzo-Duggan, Tess R; Chaves, Arielle M; Singh, Abhishek; Sethaphong, Latsavongsakda; Slabaugh, Erin; Yingling, Yaroslava G; Haigler, Candace H; Roberts, Alison W

    2018-06-01

    Cellulose synthases (CESAs) are glycosyltransferases that catalyze formation of cellulose microfibrils in plant cell walls. Seed plant CESA isoforms cluster in six phylogenetic clades, whose non-interchangeable members play distinct roles within cellulose synthesis complexes (CSCs). A 'class specific region' (CSR), with higher sequence similarity within versus between functional CESA classes, has been suggested to contribute to specific activities or interactions of different isoforms. We investigated CESA isoform specificity in the moss, Physcomitrella patens (Hedw.) B. S. G. to gain evolutionary insights into CESA structure/function relationships. Like seed plants, P. patens has oligomeric rosette-type CSCs, but the PpCESAs diverged independently and form a separate CESA clade. We showed that P. patens has two functionally distinct CESA classes, based on the ability to complement the gametophore-negative phenotype of a ppcesa5 knockout line. Thus, non-interchangeable CESA classes evolved separately in mosses and seed plants. However, testing of chimeric moss CESA genes for complementation demonstrated that functional class-specificity is not determined by the CSR. Sequence analysis and computational modeling showed that the CSR is intrinsically disordered and contains predicted molecular recognition features, consistent with a possible role in CESA oligomerization and explaining the evolution of class-specific sequences without selection for class-specific function. © 2018 Institute of Botany, Chinese Academy of Sciences.

  20. A class of spherical, truncated, anisotropic models for application to globular clusters

    NASA Astrophysics Data System (ADS)

    de Vita, Ruggero; Bertin, Giuseppe; Zocchi, Alice

    2016-05-01

    Recently, a class of non-truncated, radially anisotropic models (the so-called f(ν)-models), originally constructed in the context of violent relaxation and modelling of elliptical galaxies, has been found to possess interesting qualities in relation to observed and simulated globular clusters. In view of new applications to globular clusters, we improve this class of models along two directions. To make them more suitable for the description of small stellar systems hosted by galaxies, we introduce a "tidal" truncation by means of a procedure that guarantees full continuity of the distribution function. The new fT(ν)-models are shown to provide a better fit to the observed photometric and spectroscopic profiles for a sample of 13 globular clusters studied earlier by means of non-truncated models; interestingly, the best-fit models also perform better with respect to the radial-orbit instability. Then, we design a flexible but simple two-component family of truncated models to study the separate issues of mass segregation and multiple populations. We do not aim at a fully realistic description of globular clusters to compete with the description currently obtained by means of dedicated simulations. The goal here is to try to identify the simplest models, that is, those with the smallest number of free parameters that still have the capacity to provide a reasonable description for clusters that are evidently beyond the reach of one-component models. With this tool, we aim at identifying the key factors that characterize mass segregation or the presence of multiple populations. To reduce the relevant parameter space, we formulate a few physical arguments based on recent observations and simulations. A first application to two well-studied globular clusters is briefly described and discussed.

  1. Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification.

    PubMed

    Li, Jinyan; Fong, Simon; Sung, Yunsick; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L

    2016-01-01

    An imbalanced dataset is defined as a training dataset that has imbalanced proportions of data in both interesting and uninteresting classes. Often in biomedical applications, samples from the stimulating class are rare in a population, such as medical anomalies, positive clinical tests, and particular diseases. Although the target samples in the primitive dataset are small in number, the induction of a classification model over such training data leads to poor prediction performance due to insufficient training from the minority class. In this paper, we use a novel class-balancing method named adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique (ASCB_DmSMOTE) to solve this imbalanced dataset problem, which is common in biomedical applications. The proposed method combines under-sampling and over-sampling into a swarm optimisation algorithm. It adaptively selects suitable parameters for the rebalancing algorithm to find the best solution. Compared with the other versions of the SMOTE algorithm, significant improvements, which include higher accuracy and credibility, are observed with ASCB_DmSMOTE. Our proposed method tactfully combines two rebalancing techniques together. It reasonably re-allocates the majority class in the details and dynamically optimises the two parameters of SMOTE to synthesise a reasonable scale of minority class for each clustered sub-imbalanced dataset. The proposed method ultimately outperforms other conventional methods and attains higher credibility with even greater accuracy of the classification model.
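
    The over-sampling step that SMOTE-style methods build on can be sketched as below. This is the basic interpolation idea only, not ASCB_DmSMOTE: the swarm optimisation, clustering and under-sampling described in the abstract are omitted, and the function name, parameters and toy points are assumptions for illustration.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature minority class (not data from the paper).
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (3.0, 3.0)]
new_points = smote_sketch(minority, n_new=6)
```

    Here k (number of neighbours) and n_new (amount of over-sampling) stand in for the two SMOTE parameters that the abstract says are tuned dynamically; in this sketch they are simply fixed by the caller.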

  2. Could the clinical interpretability of subgroups detected using clustering methods be improved by using a novel two-stage approach?

    PubMed

    Kent, Peter; Stochkendahl, Mette Jensen; Christensen, Henrik Wulff; Kongsted, Alice

    2015-01-01

    Recognition of homogeneous subgroups of patients can usefully improve prediction of their outcomes and the targeting of treatment. There are a number of research approaches that have been used to recognise homogeneity in such subgroups and to test their implications. One approach is to use statistical clustering techniques, such as Cluster Analysis or Latent Class Analysis, to detect latent relationships between patient characteristics. Influential patient characteristics can come from diverse domains of health, such as pain, activity limitation, physical impairment, social role participation, psychological factors, biomarkers and imaging. However, such 'whole person' research may result in data-driven subgroups that are complex, difficult to interpret and challenging to recognise clinically. This paper describes a novel approach to applying statistical clustering techniques that may improve the clinical interpretability of derived subgroups and reduce sample size requirements. This approach involves clustering in two sequential stages. The first stage involves clustering within health domains and therefore requires creating as many clustering models as there are health domains in the available data. This first stage produces scoring patterns within each domain. The second stage involves clustering using the scoring patterns from each health domain (from the first stage) to identify subgroups across all domains. We illustrate this using chest pain data from the baseline presentation of 580 patients. The new two-stage clustering resulted in two subgroups that approximated the classic textbook descriptions of musculoskeletal chest pain and atypical angina chest pain. The traditional single-stage clustering resulted in five clusters that were also clinically recognisable but displayed less distinct differences. In this paper, a new approach to using clustering techniques to identify clinically useful subgroups of patients is suggested. Research designs, statistical methods and outcome metrics suitable for performing that testing are also described. This approach has potential benefits but requires broad testing, in multiple patient samples, to determine its clinical value. The usefulness of the approach is likely to be context-specific, depending on the characteristics of the available data and the research question being asked of it.

  3. Two distinct phenotypes of asthma in elite athletes identified by latent class analysis.

    PubMed

    Couto, Mariana; Stang, Julie; Horta, Luís; Stensrud, Trine; Severo, Milton; Mowinckel, Petter; Silva, Diana; Delgado, Luís; Moreira, André; Carlsen, Kai-Håkon

    2015-01-01

    Clusters of asthma in athletes have been insufficiently studied. Therefore, the present study aimed to characterize asthma phenotypes in elite athletes using latent class analysis (LCA) and to evaluate its association with the type of sport practiced. In the present cross-sectional study, an analysis of athletes' records was carried out in databases of the Portuguese National Anti-Doping Committee and the Norwegian School of Sport Sciences. Athletes with asthma, diagnosed according to criteria given by the International Olympic Committee, were included for LCA. Sports practiced were categorized into water, winter and other sports. Of 324 files screened, 150 files belonged to asthmatic athletes (91 Portuguese; 59 Norwegian). LCA retrieved two clusters: "atopic asthma" defined by allergic sensitization, rhinitis and allergic co-morbidities and increased exhaled nitric oxide levels; and "sports asthma", defined by exercise-induced respiratory symptoms and airway hyperresponsiveness without allergic features. The risk of developing the phenotype "sports asthma" was significantly increased in athletes practicing water (OR = 2.87; 95% CI [1.82-4.51]) and winter (OR = 8.65; 95% CI [2.67-28.03]) sports, when compared with other athletes. Two asthma phenotypes were identified in elite athletes: "atopic asthma" and "sports asthma". The type of sport practiced was associated with different phenotypes: water and winter sport athletes had three- and ninefold increased risk of "sports asthma". Recognizing different phenotypes is clinically relevant as it would lead to distinct targeted treatments.

  4. Short communication: cheminformatics analysis to identify predictors of antiviral drug penetration into the female genital tract.

    PubMed

    Thompson, Corbin G; Sedykh, Alexander; Nicol, Melanie R; Muratov, Eugene; Fourches, Denis; Tropsha, Alexander; Kashuba, Angela D M

    2014-11-01

    The exposure of oral antiretroviral (ARV) drugs in the female genital tract (FGT) is variable and almost unpredictable. Identifying an efficient method to find compounds with high tissue penetration would streamline the development of regimens for both HIV preexposure prophylaxis and viral reservoir targeting. Here we describe the cheminformatics investigation of diverse drugs with known FGT penetration using cluster analysis and quantitative structure-activity relationships (QSAR) modeling. A literature search over the 1950-2012 period identified 58 compounds (including 21 ARVs and representing 13 drug classes) associated with their actual concentration data for cervical or vaginal tissue, or cervicovaginal fluid. Cluster analysis revealed significant trends in the penetrative ability for certain chemotypes. QSAR models to predict genital tract concentrations normalized to blood plasma concentrations were developed with two machine learning techniques utilizing drugs' molecular descriptors and pharmacokinetic parameters as inputs. The QSAR model with the highest predictive accuracy had a test-set R² of 0.47. High volume of distribution, high MRP1 substrate probability, and low MRP4 substrate probability were associated with FGT concentrations ≥1.5-fold plasma concentrations. However, due to the limited FGT data available, prediction performances of all models were low. Despite this limitation, we were able to support our findings by correctly predicting the penetration class of rilpivirine and dolutegravir. With more data to enrich the models, we believe these methods could potentially enhance the current approach of clinical testing.
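
    For readers unfamiliar with the test-set R² reported in this abstract, a minimal sketch of one common definition (the coefficient of determination on held-out data) follows. The abstract does not state which R² variant the authors used, so this definition is an assumption, and the observed and predicted values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical held-out tissue:plasma ratios (not data from the study).
observed = [0.5, 1.0, 1.5, 2.0]
predicted = [0.7, 0.9, 1.4, 2.2]
score = r_squared(observed, predicted)
```

    A value of 1 means the predictions reproduce the held-out observations exactly, and a value near 0 means the model does no better than predicting the mean.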

  5. The Effect of Cluster-Based Instruction on Mathematic Achievement in Inclusive Schools

    ERIC Educational Resources Information Center

    Gunarhadi, Sunardi; Anwar, Mohammad; Andayani, Tri Rejeki; Shaari, Abdull Sukor

    2016-01-01

    The research aimed to investigate the effect of Cluster-Based Instruction (CBI) on the academic achievement of Mathematics in inclusive schools. The sample was 68 students in two intact classes, including those with learning disabilities, selected using a cluster random technique among 17 inclusive schools in the regency of Surakarta. The two…

  6. A novel mechanism for creating double pulsars

    NASA Technical Reports Server (NTRS)

    Sigurdsson, Steinn; Hernquist, Lars

    1992-01-01

    Simulations of encounters between pairs of hard binaries, each containing a neutron star and a main-sequence star, reveal a new formation mechanism for double pulsars in dense cores of globular clusters. In many cases, the two normal stars are disrupted to form a common envelope around the pair of neutron stars, both of which will be spun up to become millisecond pulsars. We predict that a new class of pulsars, double millisecond pulsars, will be discovered in the cores of dense globular clusters. The genesis proceeds through a short-lived double-core common envelope phase, with the envelope ejected in a fast wind. It is possible that the progenitor may also undergo a double X-ray binary phase. Any circular, short-period double pulsar found in the galaxy would necessarily come from disrupted disk clusters, unlike Hulse-Taylor class pulsars or low-mass X-ray binaries which may be ejected from clusters or formed in the galaxy.

  7. Maturation of nitrogenase cofactor—the role of a class E radical SAM methyltransferase NifB

    PubMed Central

    Hu, Yilin; Ribbe, Markus W.

    2016-01-01

    Nitrogenase catalyzes the important reactions of N2-, CO- and CO2-reduction at its active cofactor site. Designated the M-cluster, this complex metallocofactor is assembled through the generation of a characteristic 8Fe-core prior to the insertion of Mo and homocitrate that completes the stoichiometry of the M-cluster. NifB catalyzes the critical step of radical SAM-dependent carbide insertion that occurs concomitant with the insertion of a “9th” sulfur and the rearrangement/coupling of two 4Fe-clusters into a complete 8Fe-core of the M-cluster. Further categorization of a family of NifB proteins as a new class of radical SAM methyltransferases suggests a general function of these proteins in complex metallocofactor assembly and provides a new platform for unveiling unprecedented chemical reactions catalyzed by biological systems. PMID:26969410

  8. Automatic classification of canine PRG neuronal discharge patterns using K-means clustering.

    PubMed

    Zuperku, Edward J; Prkic, Ivana; Stucke, Astrid G; Miller, Justin R; Hopp, Francis A; Stuth, Eckehard A

    2015-02-01

    Respiratory-related neurons in the parabrachial-Kölliker-Fuse (PB-KF) region of the pons play a key role in the control of breathing. The neuronal activities of these pontine respiratory group (PRG) neurons exhibit a variety of inspiratory (I), expiratory (E), phase spanning and non-respiratory related (NRM) discharge patterns. Due to the variety of patterns, it can be difficult to classify them into distinct subgroups according to their discharge contours. This report presents a method that automatically classifies neurons according to their discharge patterns and derives an average subgroup contour of each class. It is based on the K-means clustering technique and it is implemented via SigmaPlot User-Defined transform scripts. The discharge patterns of 135 canine PRG neurons were classified into seven distinct subgroups. Additional methods for choosing the optimal number of clusters are described. Analysis of the results suggests that the K-means clustering method offers a robust objective means of both automatically categorizing neuron patterns and establishing the underlying archetypical contours of subtypes based on the discharge patterns of a group of neurons. Published by Elsevier B.V.
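
    The K-means idea described in this abstract can be sketched as below. The authors implemented their method as SigmaPlot transform scripts; this Python version is a separate illustration of plain Lloyd-style K-means, and the function name, the explicit initial centroids and the toy "contours" are assumptions. Each returned centroid plays the role of an average subgroup contour.

```python
def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm; `centroids` are the initial guesses."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: mean of the points assigned to each centroid.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical discharge contours sampled at three phases (not study data):
# two rising patterns and two falling patterns.
contours = [(0.1, 0.5, 0.9), (0.2, 0.6, 1.0), (0.9, 0.5, 0.1), (1.0, 0.6, 0.2)]
labels, centroids = kmeans(contours, centroids=[contours[0], contours[2]])
```

    The number of clusters is fixed here by the number of initial centroids; the abstract notes that the paper describes additional methods for choosing it, which this sketch does not attempt.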

  9. Statistical analyses and characteristics of volcanic tremor on Stromboli Volcano (Italy)

    NASA Astrophysics Data System (ADS)

    Falsaperla, S.; Langer, H.; Spampinato, S.

    A study of volcanic tremor on Stromboli is carried out on the basis of data recorded daily between 1993 and 1995 by a permanent seismic station (STR) located 1.8 km away from the active craters. We also consider the signal of a second station (TF1), which operated for a shorter time span. Changes in the spectral tremor characteristics can be related to modifications in volcanic activity, particularly to lava effusions and explosive sequences. Statistical analyses were carried out on a set of spectra calculated daily from seismic signals where explosion quakes were present or excluded. Principal component analysis and cluster analysis were applied to identify different classes of spectra. Three clusters of spectra are associated with two different states of volcanic activity. One cluster corresponds to a state of low to moderate activity, whereas the two other clusters are present during phases with a high magma column as inferred from the occurrence of lava fountains or effusions. We therefore conclude that variations in volcanic activity at Stromboli are usually linked to changes in the spectral characteristics of volcanic tremor. Site effects are evident when comparing the spectra calculated from signals synchronously recorded at STR and TF1. However, some major spectral peaks at both stations may reflect source properties. Statistical considerations and polarization analysis are in favor of a prevailing presence of P-waves in the tremor signal along with a position of the source northwest of the craters and at shallow depth.

  10. Crystal structure of an Fe-S cluster-containing fumarate hydratase enzyme from Leishmania major reveals a unique protein fold.

    PubMed

    Feliciano, Patricia R; Drennan, Catherine L; Nonato, M Cristina

    2016-08-30

    Fumarate hydratases (FHs) are essential metabolic enzymes grouped into two classes. Here, we present the crystal structure of a class I FH, the cytosolic FH from Leishmania major, which reveals a previously undiscovered protein fold that coordinates a catalytically essential [4Fe-4S] cluster. Our 2.05 Å resolution data further reveal a dimeric architecture for this FH that resembles a heart, with each lobe comprised of two domains that are arranged around the active site. Besides the active site, where the substrate S-malate is bound bidentate to the unique iron of the [4Fe-4S] cluster, other binding pockets are found near the dimeric enzyme interface, some of which are occupied by malonate, shown here to be a weak inhibitor of this enzyme. Taken together, these data provide a framework both for investigations of the class I FH catalytic mechanism and for drug design aimed at fighting neglected tropical diseases.

  11. Two Polyhydroxyalkanoate Synthases from Distinct Classes from the Aromatic Degrader Cupriavidus pinatubonensis JMP134 Exhibit the Same Substrate Preference.

    PubMed

    Jiang, Xuan; Luo, Xi; Zhou, Ning-Yi

    2015-01-01

    Cupriavidus pinatubonensis JMP134 utilizes a variety of aromatic substrates as sole carbon sources, including meta-nitrophenol (MNP). Two polyhydroxyalkanoate (PHA) synthase genes, phaC1 and phaC2, were annotated and categorized as class I and class II PHA synthase genes, respectively. In this study, both His-tagged purified PhaC1 and PhaC2 were shown to exhibit typical class I PHA synthase substrate specificity to make short-chain-length (SCL) PHA from 3-hydroxybutyryl-CoA and failed to make medium-chain-length (MCL) PHA from 3-hydroxyoctanoyl-CoA. The phaC1 or phaC2 deletion strain could also produce SCL PHA when grown in fructose or octanoate, but the double mutant of phaC1 and phaC2 lost this ability. The PhaC2 also exhibited substrate preference towards SCL substrates when expressed in Pseudomonas aeruginosa PAO1 phaC mutant strain. On the other hand, the transcriptional level of phaC1 was 70-fold higher than that of phaC2 in MNP-grown cells, but 240-fold lower in octanoate-grown cells. Further study demonstrated that only phaC1 was involved in PHA synthesis in MNP-grown cells. These findings suggested that phaC1 and phaC2 genes were differentially regulated under different growth conditions in this strain. Within the phaC2-containing gene cluster, a single copy of PHA synthase gene was present clustering with genes encoding enzymes in the biosynthesis of PHA precursors. This is markedly different from the genetic organization of all other previously reported class II PHA synthase gene clusters and this cluster likely comes from a distinct evolutionary path.

  12. Two Polyhydroxyalkanoate Synthases from Distinct Classes from the Aromatic Degrader Cupriavidus pinatubonensis JMP134 Exhibit the Same Substrate Preference

    PubMed Central

    Jiang, Xuan; Luo, Xi; Zhou, Ning-Yi

    2015-01-01

    Cupriavidus pinatubonensis JMP134 utilizes a variety of aromatic substrates as sole carbon sources, including meta-nitrophenol (MNP). Two polyhydroxyalkanoate (PHA) synthase genes, phaC1 and phaC2, were annotated and categorized as class I and class II PHA synthase genes, respectively. In this study, both His-tagged purified PhaC1 and PhaC2 were shown to exhibit typical class I PHA synthase substrate specificity to make short-chain-length (SCL) PHA from 3-hydroxybutyryl-CoA and failed to make medium-chain-length (MCL) PHA from 3-hydroxyoctanoyl-CoA. The phaC1 or phaC2 deletion strain could also produce SCL PHA when grown in fructose or octanoate, but the double mutant of phaC1 and phaC2 lost this ability. The PhaC2 also exhibited substrate preference towards SCL substrates when expressed in Pseudomonas aeruginosa PAO1 phaC mutant strain. On the other hand, the transcriptional level of phaC1 was 70-fold higher than that of phaC2 in MNP-grown cells, but 240-fold lower in octanoate-grown cells. Further study demonstrated that only phaC1 was involved in PHA synthesis in MNP-grown cells. These findings suggested that phaC1 and phaC2 genes were differentially regulated under different growth conditions in this strain. Within the phaC2-containing gene cluster, a single copy of PHA synthase gene was present clustering with genes encoding enzymes in the biosynthesis of PHA precursors. This is markedly different from the genetic organization of all other previously reported class II PHA synthase gene clusters and this cluster likely comes from a distinct evolutionary path. PMID:26544851

  13. The lipidome in major depressive disorder: Shared genetic influence for ether-phosphatidylcholines, a plasma-based phenotype related to inflammation, and disease risk.

    PubMed

    Knowles, E E M; Huynh, K; Meikle, P J; Göring, H H H; Olvera, R L; Mathias, S R; Duggirala, R; Almasy, L; Blangero, J; Curran, J E; Glahn, D C

    2017-06-01

    The lipidome is rapidly garnering interest in the field of psychiatry. Recent studies have implicated lipidomic changes across numerous psychiatric disorders. In particular, there is growing evidence that the concentrations of several classes of lipids are altered in those diagnosed with MDD. However, for lipidomic abnormalities to be considered potential treatment targets for MDD (rather than secondary manifestations of the disease), a shared etiology between lipid concentrations and MDD should be demonstrated. In a sample of 567 individuals from 37 extended pedigrees (average size 13.57 people, range=3-80), we used mass spectrometry lipidomic measures to evaluate the genetic overlap between twenty-three biologically distinct lipid classes and a dimensional scale of MDD. We found that the lipid class with the largest endophenotype ranking value (ERV, a standardized parametric measure of pleiotropy) was ether-phosphatidylcholines (the alkylphosphatidylcholine, PC(O), and alkenylphosphatidylcholine, PC(P), subclasses). Furthermore, we examined the cluster structure of the twenty-five species within the top-ranked lipid class, and the relationship of those clusters with MDD. This analysis revealed that species containing arachidonic acid generally exhibited the greatest degree of genetic overlap with MDD. This study is the first to demonstrate a shared genetic etiology between MDD and ether-phosphatidylcholine species containing arachidonic acid, an omega-6 fatty acid that is a precursor to inflammatory mediators, such as prostaglandins. The study highlights the potential utility of the well-characterized linoleic/arachidonic acid inflammation pathway as a diagnostic marker and/or treatment target for MDD. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  14. LANDSAT-4 MSS and Thematic Mapper data quality and information content analysis

    NASA Technical Reports Server (NTRS)

    Anuta, P.; Bartolucci, L.; Dean, E.; Lozano, F.; Malaret, E.; Mcgillem, C. D.; Valdes, J.; Valenzuela, C.

    1984-01-01

    LANDSAT-4 thematic mapper (TM) and multispectral scanner (MSS) data were analyzed to obtain information on data quality and information content. Geometric evaluations were performed to test band-to-band registration accuracy. Thematic mapper overall system resolution was evaluated using scene objects which demonstrated sharp high contrast edge responses. Radiometric evaluation included detector relative calibration, effects of resampling, and coherent noise effects. Information content evaluation was carried out using clustering, principal components, transformed divergence separability measure, and supervised classifiers on test data. A detailed spectral class analysis (multispectral classification) was carried out to compare the information content of the MSS and TM for a large number of scene classes. A temperature-mapping experiment was carried out for a cooling pond to test the quality of thermal-band calibration. Overall TM data quality is very good. The MSS data are noisier than previous LANDSAT results.

  15. Quantification and statistical significance analysis of group separation in NMR-based metabonomics studies

    PubMed Central

    Goodpaster, Aaron M.; Kennedy, Michael A.

    2015-01-01

    Currently, no standard metrics are used to quantify cluster separation in PCA or PLS-DA scores plots for metabonomics studies or to determine if cluster separation is statistically significant. Lack of such measures makes it virtually impossible to compare independent or inter-laboratory studies and can lead to confusion in the metabonomics literature when authors putatively identify metabolites distinguishing classes of samples based on visual and qualitative inspection of scores plots that exhibit marginal separation. While previous papers have addressed quantification of cluster separation in PCA scores plots, none have advocated routine use of a quantitative measure of separation that is supported by a standard and rigorous assessment of whether or not the cluster separation is statistically significant. Here quantification and statistical significance of separation of group centroids in PCA and PLS-DA scores plots are considered. The Mahalanobis distance is used to quantify the distance between group centroids, and the two-sample Hotelling's T2 test is computed for the data, related to an F-statistic, and then an F-test is applied to determine if the cluster separation is statistically significant. We demonstrate the value of this approach using four datasets containing various degrees of separation, ranging from groups that had no apparent visual cluster separation to groups that had no visual cluster overlap. Widespread adoption of such concrete metrics to quantify and evaluate the statistical significance of PCA and PLS-DA cluster separation would help standardize reporting of metabonomics data. PMID:26246647
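
    The procedure described above (Mahalanobis distance between group centroids, two-sample Hotelling's T2, conversion to an F-statistic) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes two groups plotted on two score axes (for example PC1 and PC2), a pooled covariance matrix, and hypothetical toy coordinates. The function name and data are invented for the example, and the final p-value lookup against an F distribution is left out to keep the sketch free of third-party libraries.

```python
from math import sqrt

def mean_vec(rows):
    """Column-wise mean of a list of (x, y) score pairs."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mu):
    """2x2 sum of outer products of deviations from the group centroid."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def hotelling_two_sample(g1, g2):
    """Return (Mahalanobis distance, T2, F, (df1, df2)) for two 2-D groups."""
    n1, n2, p = len(g1), len(g2), 2
    m1, m2 = mean_vec(g1), mean_vec(g2)
    s1, s2 = scatter(g1, m1), scatter(g2, m2)
    # Pooled covariance matrix (assumes both groups share one covariance).
    sp = [[(s1[a][b] + s2[a][b]) / (n1 + n2 - 2) for b in range(2)]
          for a in range(2)]
    det = sp[0][0] * sp[1][1] - sp[0][1] * sp[1][0]
    inv = [[sp[1][1] / det, -sp[0][1] / det],
           [-sp[1][0] / det, sp[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    # Squared Mahalanobis distance between the two centroids.
    d2 = sum(diff[a] * inv[a][b] * diff[b] for a in range(2) for b in range(2))
    t2 = (n1 * n2 / (n1 + n2)) * d2
    # Scale T2 so that it follows an F distribution with (p, n1+n2-p-1) df.
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return sqrt(d2), t2, f_stat, (p, n1 + n2 - p - 1)

# Hypothetical scores for two well-separated groups.
group_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
group_b = [(3, 3), (4, 3), (3, 4), (4, 4)]
distance, t2, f_stat, dof = hotelling_two_sample(group_a, group_b)
print(distance, t2, f_stat, dof)
```

    For these toy groups the sketch gives T2 = 108 and F = 45 on (2, 5) degrees of freedom, up to floating-point rounding; comparing F with the critical value of the corresponding F distribution then decides whether the separation is significant.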

  16. Influence of Ganglioside GM1 Concentration on Lipid Clustering and Membrane Properties and Curvature.

    PubMed

    Patel, Dhilon S; Park, Soohyung; Wu, Emilia L; Yeom, Min Sun; Widmalm, Göran; Klauda, Jeffery B; Im, Wonpil

    2016-11-01

    Gangliosides are a class of glycosphingolipids (GSLs) with amphiphilic character that are found at the outer leaflet of the cell membranes, where their ability to organize into special domains makes them vital cell membrane components. However, a molecular understanding of GSL-rich membranes in terms of their clustered organization, stability, and dynamics is still elusive. To gain molecular insight into the organization and dynamics of GSL-rich membranes, we performed all-atom molecular-dynamics simulations of bicomponent ganglioside GM1 in 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) phospholipid bilayers with varying concentrations of GM1 (10%, 20%, and 30%). Overall, the simulations show very good agreement with available experimental data, including x-ray electron density profiles along the membrane normal, NMR carbohydrate proton-proton distances, and x-ray crystal structures. This validates the quality of our model systems for investigating GM1 clustering through an ordered-lipid-cluster analysis. The increase in GM1 concentration induces tighter lipid packing, driven mainly by inter-GM1 carbohydrate-carbohydrate interactions, leading to a greater preference for the positive curvature of GM1-containing membranes and larger cluster sizes of ordered-lipid clusters (with a composite of GM1 and POPC). These clusters tend to segregate and form a large percolated cluster at a 30% GM1 concentration at 293 K. At a higher temperature of 330 K, however, the segregation is not maintained. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  17. Optimization-Based Model Fitting for Latent Class and Latent Profile Analyses

    ERIC Educational Resources Information Center

    Huang, Guan-Hua; Wang, Su-Mei; Hsu, Chung-Chu

    2011-01-01

    Statisticians typically estimate the parameters of latent class and latent profile models using the Expectation-Maximization algorithm. This paper proposes an alternative two-stage approach to model fitting. The first stage uses the modified k-means and hierarchical clustering algorithms to identify the latent classes that best satisfy the…

  18. A gene network bioinformatics analysis for pemphigoid autoimmune blistering diseases.

    PubMed

    Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo

    2015-07-01

    In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked into classes (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes numbered 13, whereas the orphans numbered 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immune system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.

  19. IMG-ABC. A knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites

    DOE PAGES

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; ...

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world.

  20. Procedure of Partitioning Data Into Number of Data Sets or Data Group - A Review

    NASA Astrophysics Data System (ADS)

    Kim, Tai-Hoon

    The goal of clustering is to decompose a dataset into groups of similar items based on an objective function. Several well-established algorithms already exist for data clustering. Their objective is to divide the data points of the feature space into a number of groups (or classes) so that a predefined set of criteria is satisfied. The article presents a comparative study of the effectiveness and efficiency of traditional data clustering algorithms. To evaluate the performance of the clustering algorithms, the Minkowski score is used on different data sets.
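
    The abstract does not define the Minkowski score. The sketch below assumes the commonly used pair-counting form, sqrt((n01 + n10) / (n11 + n10)), where n11 counts point pairs grouped together in both the reference and the computed partition, n10 counts pairs together only in the reference, and n01 counts pairs together only in the computed partition. Lower values indicate closer agreement, with 0 for a perfect match. The function name and toy labels are hypothetical.

```python
from itertools import combinations
from math import sqrt

def minkowski_score(true_labels, pred_labels):
    """Pair-counting Minkowski score between a reference and a computed partition."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    n11 = n10 = n01 = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            n11 += 1      # together in both partitions
        elif same_true:
            n10 += 1      # together only in the reference
        elif same_pred:
            n01 += 1      # together only in the computed partition
    if n11 + n10 == 0:
        raise ValueError("reference partition has no co-clustered pairs")
    return sqrt((n01 + n10) / (n11 + n10))

# Cluster names differ but the grouping is identical, so the score is 0.
print(minkowski_score([0, 0, 1, 1], [1, 1, 0, 0]))
# Merging everything into one cluster is penalised.
print(minkowski_score([0, 0, 1, 1], [0, 0, 0, 0]))
```

    Because the score counts pairs rather than matching label names, it can compare algorithms that produce different numbers of clusters or arbitrary cluster identifiers.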

  1. Deep, wide-field, multi-band imaging of z approximately equal to 0.4 clusters and their environs

    NASA Technical Reports Server (NTRS)

    Silva, David R.; Pierce, Michael J.

    1993-01-01

    The existence of an excess population of blue galaxies in the cores of distant, rich clusters of galaxies, commonly referred to as the 'Butcher-Oemler' effect, is now well established. Spectroscopy of clusters at z = 0.2-0.4 has confirmed that the luminous blue populations comprise as much as 20 percent of these clusters. This fraction is much higher than the 2 percent blue fraction found for nearby rich clusters, such as Coma, indicating that rapid galaxy evolution has occurred on a relatively short time scale. Spectroscopy has also shown that the 'blue' galaxies can basically be divided into three classes: 'starburst' galaxies with large [O II] equivalent widths, 'post-starburst' E+A galaxies (i.e. galaxies with strong Balmer lines shortward of 4000A but elliptical-like colors), and normal spiral/irregulars. Unfortunately, it is difficult to obtain enough spectra of individual galaxies in these intermediate redshift clusters to say anything statistically meaningful. Thus, limited information is available about the relative numbers of these three classes of 'blue' galaxies and the associated E/S0 population in these intermediate redshift clusters. More statistically meaningful results can be derived from deep imaging of these clusters. However, the best published data to date (e.g. MacLaren et al. 1988; Dressler & Gunn 1992) are limited to the cluster cores and do not sample the galaxy luminosity functions very deeply at the bluest wavelengths. Furthermore, only limited spectral energy distribution data are available below 4000A in the observed cluster rest frame, providing limited sensitivity to 'recent' star formation activity. To improve this situation, we are currently obtaining deep, wide-field UBRI images of all known rich clusters at z approximately equal to 0.4. Our main objective is to obtain the necessary color information to distinguish between the E+S0, 'E+A', and spiral/irregular galaxy populations throughout the cluster/supercluster complex. At this redshift, UBRI correspond to rest-frame 2500A/UVR bandpasses. The rest-frame UVR system provides a powerful 'blue' galaxy discriminant given the expected color distribution. Moreover, since 'hot' stars peak near 2500A, that bandpass is a powerful probe of recent star formation activity in all classes of galaxies. In particular, it is sensitive to ellipticals with 'UV excess' populations (MacLaren et al. 1988).

  2. Detection of major climatic and environmental predictors of liver fluke exposure risk in Ireland using spatial cluster analysis.

    PubMed

    Selemetas, Nikolaos; de Waal, Theo

    2015-04-30

    Fasciolosis caused by Fasciola hepatica (liver fluke) can cause significant economic and production losses in dairy cow farms. The aim of the current study was to identify important weather and environmental predictors of the exposure risk to liver fluke by detecting clusters of fasciolosis in Ireland. During autumn 2012, bulk-tank milk samples from 4365 dairy farms were collected throughout Ireland. Using an in-house antibody-detection ELISA, the analysis of BTM samples showed that 83% (n=3602) of dairy farms had been exposed to liver fluke. The Getis-Ord Gi* statistic identified 74 high-risk and 130 low-risk significant (P<0.01) clusters of fasciolosis. The low-risk clusters were mostly located in the southern regions of Ireland, whereas the high-risk clusters were mainly situated in the western part. Several climatic variables (monthly and seasonal mean rainfall and temperatures, total wet days and rain days) and environmental datasets (soil types, enhanced vegetation index and normalised difference vegetation index) were used to investigate dissimilarities in the exposure to liver fluke between clusters. Rainfall, total wet days and rain days, and soil type were the significant classes of climatic and environmental variables explaining the differences between significant clusters. A discriminant function analysis was used to predict the exposure risk to liver fluke using 80% of data for modelling and the remaining subset of 20% for post hoc model validation. The most significant predictors of the model risk function were total rainfall in August and September and total wet days. The risk model presented 100% sensitivity and 91% specificity and an accuracy of 95% correctly classified cases. A risk map of exposure to liver fluke was constructed with higher probability of exposure in western and north-western regions. 
The results of this study identified differences between clusters of fasciolosis in Ireland regarding climatic and environmental variables and detected significant predictors of the exposure risk to liver fluke. Copyright © 2015 Elsevier B.V. All rights reserved.
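
    The abstract names the Getis-Ord Gi* statistic without stating its form. The following is a minimal sketch of the standard Gi* z-score, assuming a binary neighbour weight matrix in which each site is also its own neighbour. The five sites, their values, and the weights are hypothetical and are not the study's farm data. Large positive scores mark clusters of high values (high-risk hot spots), large negative scores mark clusters of low values.

```python
from math import sqrt

def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every site.

    values  -- observed value per site (e.g. an exposure measure)
    weights -- weights[i][j] is the spatial weight between sites i and j,
               with weights[i][i] included (this is what the star denotes)
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation over all sites.
    s = sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * sqrt((n * sum_w2 - sum_w * sum_w) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Five hypothetical sites on a line; neighbours are adjacent sites plus self.
site_values = [10, 9, 2, 1, 1]
neighbour_weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
for site, z in enumerate(getis_ord_gi_star(site_values, neighbour_weights)):
    print(site, round(z, 2))
```

    Each score is read against the standard normal distribution, so a threshold such as P<0.01 corresponds to an absolute z-score above roughly 2.58. The sketch does not guard against degenerate inputs (all values equal, or a site whose neighbourhood covers every site), where the denominator becomes zero.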

  3. Patterns of multiple health risk–behaviours in university students and their association with mental health: application of latent class analysis

    PubMed Central

    Kwan, M. Y.; Arbour-Nicitopoulos, K. P.; Duku, E.; Faulkner, G.

    2016-01-01

    Abstract Introduction: University and college campuses may be the last setting where it is possible to comprehensively address the health of a large proportion of the young adult population. It is important that health promoters understand the collective challenges students are facing, and to better understand the broader lifestyle behavioural patterning evident during this life stage. The purpose of this study was to examine the clustering of modifiable health-risk behaviours and to explore the relationship between these identified clusters and mental health outcomes among a large Canadian university sample. Methods: Undergraduate students (n = 837; mean age = 21 years) from the University of Toronto completed the National College Health Assessment survey. The survey consists of approximately 300 items, including assessments of student health status, mental health and health-risk behaviours. Latent class analysis was used to identify patterning based on eight salient health-risk behaviours (marijuana use, other illicit drug use, risky sex, smoking, binge drinking, poor diet, physical inactivity, and insufficient sleep). Results: A three-class model based on student behavioural patterns emerged: “typical,” “high-risk” and “moderately healthy.” Results also found high-risk students reporting significantly higher levels of stress than typical students (χ2(1671) = 7.26, p < .01). Conclusion: Students with the highest likelihood of engaging in multiple health-risk behaviours reported poorer mental health, particularly as it relates to stress. Although these findings should be interpreted with caution due to the 28% response rate, they do suggest that interventions targeting specific student groups with similar patterning of multiple health-risk behaviours may be needed. PMID:27556920
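
    Latent class models of yes/no indicators such as the health-risk behaviours above are typically estimated with the Expectation-Maximization algorithm. The sketch below is a minimal EM fit for binary items, offered as an illustration of the technique only: the eight respondents, three items, two classes, and starting values are hypothetical, whereas the study itself used eight behaviours and arrived at three classes with dedicated software.

```python
from math import log

def fit_lca(data, n_classes, init_theta, n_iter=50):
    """Fit a latent class model to binary data by EM.

    data       -- list of respondents, each a list of 0/1 item responses
    init_theta -- init_theta[k][j]: starting P(item j = 1 | class k)
    Returns (class proportions, item probabilities, log-likelihood trace).
    """
    n, m = len(data), len(data[0])
    pi = [1.0 / n_classes] * n_classes
    theta = [row[:] for row in init_theta]
    loglik_trace = []
    for _ in range(n_iter):
        # E-step: posterior class membership for every respondent.
        resp = []
        loglik = 0.0
        for x in data:
            joint = []
            for k in range(n_classes):
                p = pi[k]
                for j in range(m):
                    p *= theta[k][j] if x[j] else 1.0 - theta[k][j]
                joint.append(p)
            total = sum(joint)
            loglik += log(total)
            resp.append([p / total for p in joint])
        loglik_trace.append(loglik)
        # M-step: re-estimate proportions and item probabilities.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / n
            for j in range(m):
                est = sum(r[k] * x[j] for r, x in zip(resp, data)) / nk
                # Keep probabilities off the boundary to avoid log(0).
                theta[k][j] = min(max(est, 1e-6), 1.0 - 1e-6)
    return pi, theta, loglik_trace

# Hypothetical responses: 1 = engages in the risk behaviour, 0 = does not.
responses = [
    [1, 1, 1], [1, 1, 0], [1, 1, 1], [1, 0, 1],
    [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 1, 0],
]
start = [[0.7, 0.7, 0.7], [0.3, 0.3, 0.3]]
proportions, item_probs, trace = fit_lca(responses, 2, start)
print(proportions)
print(item_probs)
```

    In practice the number of classes is chosen by comparing fits across several candidate models, and multiple random starts are used because EM can settle in local optima; neither step is shown here.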

  4. ADHD latent class clusters: DSM-IV subtypes and comorbidity

    PubMed Central

    Elia, Josephine; Arcos-Burgos, Mauricio; Bolton, Kelly L.; Ambrosini, Paul J.; Berrettini, Wade; Muenke, Maximilian

    2014-01-01

    ADHD (Attention Deficit Hyperactivity Disorder) has a complex, heterogeneous phenotype only partially captured by Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) criteria. In this report, latent class analyses (LCA) are used to identify ADHD phenotypes using K-SADS-IVR (Schedule for Affective Disorders & Schizophrenia for School Age Children-IV-Revised) symptoms and symptom severity data from a clinical sample of 500 ADHD subjects, ages 6–18, participating in an ADHD genetic study. Results show that LCA identified six separate ADHD clusters, some corresponding to specific DSM-IV subtypes while others included several subtypes. DSM-IV comorbid anxiety and mood disorders were generally similar across all clusters, and subjects without comorbidity did not aggregate within any one cluster. Age and gender composition also varied. These results support findings from population-based LCA studies. The six clusters provide additional homogenous groups that can be used to define ADHD phenotypes in genetic association studies. The limited age ranges aggregating in the different clusters may prove to be a particular advantage in genetic studies where candidate gene expression may vary during developmental phases. DSM-IV comorbid mood and anxiety disorders also do not appear to increase cluster heterogeneity; however, longitudinal studies that cover period of risk are needed to support this finding. PMID:19900717

  5. Symptom Cluster Trajectories During Chemotherapy in Breast Cancer Outpatients.

    PubMed

    Hsu, Hsin-Tien; Lin, Kuan-Chia; Wu, Li-Min; Juan, Chiung-Hui; Hou, Ming-Feng; Hwang, Shiow-Li; Liu, Yi; Dodd, Marylin J

    2017-06-01

    Breast cancer patients often experience multiple symptoms and substantial discomfort. Some symptoms may occur simultaneously and throughout the duration of chemotherapy treatment. The aim of this study was to investigate symptom severity and symptom cluster trajectories during chemotherapy in outpatients with breast cancer in Taiwan. This prospective, longitudinal, repeated measures study administered a standardized questionnaire (M. D. Anderson Symptom Inventory Taiwan version) to 103 breast cancer patients during each day of the third 21-day cycle of chemotherapy. Latent class growth analysis was performed to examine symptom cluster trajectories. Three symptom clusters were identified within the first 14 days of the 21-day chemotherapy cycle: the neurocognition cluster (pain, shortness of breath, vomiting, memory problems, and numbness/tingling) with a trajectory of Y = 2.09 - 0.11 (days), the emotion-nausea cluster (nausea, disturbed sleep, distress/upset, drowsiness, and sadness) with a trajectory of Y = 3.57 - 0.20 (days), and the fatigue-anorexia cluster (fatigue, lack of appetite, and dry mouth) with a trajectory of Y = 4.22 - 0.21 (days). The "fatigue-anorexia cluster" and "emotion-nausea cluster" peaked at moderate levels on chemotherapy days 3-5, and then gradually decreased to mild levels within the first 14 days of the 21-day chemotherapy cycle. Distinct symptom clusters were observed during the third cycle of chemotherapy. Systematic and ongoing evaluation of symptom cluster trajectories during cancer treatment is essential. Healthcare providers can use these findings to enhance communication with their breast cancer patients and to prioritize symptoms that require attention and intervention. Copyright © 2017 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.

  6. Topological structures in the equities market network

    PubMed Central

    Leibon, Gregory; Pauls, Scott; Rockmore, Daniel; Savell, Robert

    2008-01-01

    We present a new method for articulating scale-dependent topological descriptions of the network structure inherent in many complex systems. The technique is based on “partition decoupled null models,” a new class of null models that incorporate the interaction of clustered partitions into a random model and generalize the Gaussian ensemble. As an application, we analyze a correlation matrix derived from 4 years of close prices of equities in the New York Stock Exchange (NYSE) and National Association of Securities Dealers Automated Quotation (NASDAQ). In this example, we expose (i) a natural structure composed of 2 interacting partitions of the market that both agrees with and generalizes standard notions of scale (e.g., sector and industry) and (ii) structure in the first partition that is a topological manifestation of a well-known pattern of capital flow called “sector rotation.” Our approach gives rise to a natural form of multiresolution analysis of the underlying time series that naturally decomposes the basic data in terms of the effects of the different scales at which it clusters. We support our conclusions and show the robustness of the technique with a successful analysis on a simulated network with an embedded topological structure. The equities market is a prototypical complex system, and we expect that our approach will be of use in understanding a broad class of complex systems in which correlation structures are resident.

  7. Internet gamblers: a latent class analysis of their behaviours and health experiences.

    PubMed

    Lloyd, Joanne; Doll, Helen; Hawton, Keith; Dutton, William H; Geddes, John R; Goodwin, Guy M; Rogers, Robert D

    2010-09-01

    In order to learn about the behaviours and health experiences of people who gamble on the Internet, we conducted an international online survey with respondents recruited via gambling and gambling-related websites. The mean (SD) age of the 4,125 respondents completing the survey was 35.5 (11.8) years, with 79.1% being male and 68.8% UK residents. Respondents provided demographic details and completed validated psychometric screening instruments for problem gambling, mood disturbances, as well as alcohol and substance misuse, and history of deliberate self harm. We applied latent class analysis to respondents' patterns of regular online gambling activities, and identified subgroups of individuals who used the Internet to gamble in different ways (L² = 44.27, bootstrap P = 0.07). We termed the characteristic profiles as 'non-to-minimal gamblers'; 'sports bettors'; 'casino & sports gamblers'; 'lottery players'; and 'multi-activity gamblers'. Furthermore, these subgroups of respondents differed on other demographic and psychological dimensions, with significant inter-cluster differences in proportion of individuals scoring above threshold for problem gambling, mood disorders and substance misuse, and history of deliberate self harm (all χ² values > 23.4, all P-values < 0.001). The 'casino & sports' and 'multi-activity-gamblers' clusters had the highest prevalence of mental disorder. Internet gamblers appear to be heterogeneous but composed of several subgroups, differing markedly on both demographic and clinical characteristics.

  8. An analysis of the currently available calibrations in Strömgren photometry by using open clusters

    NASA Astrophysics Data System (ADS)

    Jordi, C.; Masana, E.; Figueras, F.; Torra, J.

    1997-05-01

    In recent years, several authors have revised the calibrations used to compute physical parameters (Mv, Teff, log g, [Fe/H]) from intrinsic colours in the uvby Hβ photometric system. For reddened stars, these intrinsic colours can be computed through the standard relations among colour indices for each of the regions defined by Strömgren (1966) on the HR diagram. We present a discussion of the coherence of these calibrations for main-sequence stars. Stars from open clusters are used to carry out this analysis. Assuming that individual reddening values and distances should be similar for all the members of a given open cluster, systematic differences among the calibrations used in each of the photometric regions might arise when comparing mean reddening values and distances for the members of each region. To classify the stars into Strömgren's regions we extended the algorithm presented by Figueras et al. (1991) to a wider range of spectral types and luminosity classes. The observational ZAMS are compared with the theoretical ZAMS from stellar evolutionary models, in the range 6500-30000 K. The discrepancies are also discussed.

  9. Identification and Analysis of the Biosynthetic Gene Cluster Encoding the Thiopeptide Antibiotic Cyclothiazomycin in Streptomyces hygroscopicus 10-22

    PubMed Central

    Wang, Jiang; Yu, Yi; Tang, Kexuan; Liu, Wen; He, Xinyi; Huang, Xi; Deng, Zixin

    2010-01-01

    Thiopeptide antibiotics are an important class of natural products resulting from posttranslational modifications of ribosomally synthesized peptides. Cyclothiazomycin is a typical thiopeptide antibiotic that has a unique bridged macrocyclic structure derived from an 18-amino-acid structural peptide. Here we report the cloning, sequencing, and heterologous expression of the cyclothiazomycin biosynthetic gene cluster from Streptomyces hygroscopicus 10-22. Remarkably, successful heterologous expression of a 22.7-kb gene cluster in Streptomyces lividans 1326 suggested that there is a minimum set of 15 open reading frames that includes all of the functional genes required for cyclothiazomycin production. Six of these genes, cltBCDEFG, flanking the structural gene cltA, were predicted to encode the enzymes required for the main framework of cyclothiazomycin, and two enzymes encoded by a putative operon, cltMN, were hypothesized to participate in the tailoring step that generates the tertiary thioether, leading to the final cyclization of the bridged macrocyclic structure. This rigorous bioinformatics analysis based on heterologous expression of cyclothiazomycin resulted in an ideal biosynthetic model for understanding the biosynthesis of thiopeptides. PMID:20154110

  10. Wide-Field Infrared Survey Explorer Observations of Young Stellar Objects in the Lynds 1509 Dark Cloud in Auriga

    NASA Technical Reports Server (NTRS)

    Liu, Wilson M.; Padgett, Deborah L.; Terebey, Susan; Angione, John; Rebull, Luisa M.; McCollum, Bruce; Fajardo-Acosta, Sergio; Leisawitz, David

    2015-01-01

    The Wide-Field Infrared Survey Explorer (WISE) has uncovered a striking cluster of young stellar object (YSO) candidates associated with the L1509 dark cloud in Auriga. The WISE observations, at 3.4, 4.6, 12, and 22 microns, show a number of objects with colors consistent with YSOs, and their spectral energy distributions suggest the presence of circumstellar dust emission, including numerous Class I, flat spectrum, and Class II objects. In general, the YSOs in L1509 are much more tightly clustered than YSOs in other dark clouds in the Taurus-Auriga star forming region, with Class I and flat spectrum objects confined to the densest aggregates, and Class II objects more sparsely distributed. We estimate a most probable distance of 485-700 pc, and possibly as far as the previously estimated distance of 2 kpc.

  11. Numerical trials of HISSE

    NASA Technical Reports Server (NTRS)

    Peters, C.; Kampe, F. (Principal Investigator)

    1980-01-01

    The mathematical description and implementation of the statistical estimation procedure known as the Houston integrated spatial/spectral estimator (HISSE) is discussed. HISSE is based on a normal mixture model and is designed to take advantage of spectral and spatial information of LANDSAT data pixels, utilizing the initial classification and clustering information provided by the AMOEBA algorithm. The HISSE calculates parametric estimates of class proportions which reduce the error inherent in estimates derived from typical classify and count procedures common to nonparametric clustering algorithms. It also singles out spatial groupings of pixels which are most suitable for labeling classes. These calculations are designed to aid the analyst/interpreter in labeling patches with a crop class label. Finally, HISSE's initial performance on an actual LANDSAT agricultural ground truth data set is reported.

  12. Clustering by soft-constraint affinity propagation: applications to gene-expression data.

    PubMed

    Leone, Michele; Sumedha; Weigt, Martin

    2007-10-15

    Similarity-measure-based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called Affinity Propagation (AP), based on message-passing techniques, was proposed by Frey and Dueck (2007a). In AP, each cluster is identified by a common exemplar to which all other data points of the same cluster refer, and exemplars have to refer to themselves. Despite its proven power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters, and leads to suboptimal performance, e.g. in analyzing gene expression data. This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints compared to the aim of maximizing the overall similarity, and allows interpolation between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) becomes more informative and accurate, and leads to more stable clustering. Even though a new a priori free parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Furthermore, it allows sparse gene expression signatures to be extracted for each cluster.
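
    The abstract above mentions a simple limiting case in which each data point selects its closest neighbor as an exemplar. The sketch below illustrates only that limiting case, not AP or SCAP themselves. The function name and the use of negative squared Euclidean distance as the similarity measure are assumptions made for illustration.

```python
def nearest_neighbor_exemplars(points):
    """For each point, return the index of its most similar *other* point.

    This is only the simple limiting case named in the abstract, not the
    message-passing algorithm (AP) or its soft-constraint variant (SCAP).
    """
    def similarity(a, b):
        # Assumed similarity: negative squared Euclidean distance.
        return -sum((x - y) ** 2 for x, y in zip(a, b))

    exemplars = []
    for i, p in enumerate(points):
        candidates = [j for j in range(len(points)) if j != i]
        exemplars.append(max(candidates, key=lambda j: similarity(p, points[j])))
    return exemplars


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (9.0, 0.0)]
print(nearest_neighbor_exemplars(points))  # prints [1, 0, 3, 2, 2]
```

    In this toy data set the first two points choose each other, as do the third and fourth, while the isolated fifth point attaches to the third.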

  13. COOL CORE CLUSTERS FROM COSMOLOGICAL SIMULATIONS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rasia, E.; Borgani, S.; Murante, G.

    2015-11-01

    We present results obtained from a set of cosmological hydrodynamic simulations of galaxy clusters, aimed at comparing predictions with observational data on the diversity between cool-core (CC) and non-cool-core (NCC) clusters. Our simulations include the effects of stellar and active galactic nucleus (AGN) feedback and are based on an improved version of the smoothed particle hydrodynamics code GADGET-3, which ameliorates gas mixing and better captures gas-dynamical instabilities by including a suitable artificial thermal diffusion. In this Letter, we focus our analysis on the entropy profiles, the primary diagnostic we used to classify the degree of cool-coreness of clusters, and the iron profiles. In keeping with observations, our simulated clusters display a variety of behaviors in entropy profiles: they range from steadily decreasing profiles at small radii, characteristic of CC systems, to nearly flat core isentropic profiles, characteristic of NCC systems. Using observational criteria to distinguish between the two classes of objects, we find that they occur in similar proportions in both simulations and observations. Furthermore, we also find that simulated CC clusters have profiles of iron abundance that are steeper than those of NCC clusters, which is also in agreement with observational results. We show that the capability of our simulations to generate a realistic CC structure in the cluster population is due to AGN feedback and artificial thermal diffusion: their combined action allows us to naturally distribute the energy extracted from super-massive black holes and to compensate for the radiative losses of low-entropy gas with short cooling time residing in the cluster core.

  14. Cool Core Clusters from Cosmological Simulations

    NASA Astrophysics Data System (ADS)

    Rasia, E.; Borgani, S.; Murante, G.; Planelles, S.; Beck, A. M.; Biffi, V.; Ragone-Figueroa, C.; Granato, G. L.; Steinborn, L. K.; Dolag, K.

    2015-11-01

    We present results obtained from a set of cosmological hydrodynamic simulations of galaxy clusters, aimed at comparing predictions with observational data on the diversity between cool-core (CC) and non-cool-core (NCC) clusters. Our simulations include the effects of stellar and active galactic nucleus (AGN) feedback and are based on an improved version of the smoothed particle hydrodynamics code GADGET-3, which ameliorates gas mixing and better captures gas-dynamical instabilities by including a suitable artificial thermal diffusion. In this Letter, we focus our analysis on the entropy profiles, the primary diagnostic we used to classify the degree of cool-coreness of clusters, and the iron profiles. In keeping with observations, our simulated clusters display a variety of behaviors in entropy profiles: they range from steadily decreasing profiles at small radii, characteristic of CC systems, to nearly flat core isentropic profiles, characteristic of NCC systems. Using observational criteria to distinguish between the two classes of objects, we find that they occur in similar proportions in both simulations and observations. Furthermore, we also find that simulated CC clusters have profiles of iron abundance that are steeper than those of NCC clusters, which is also in agreement with observational results. We show that the capability of our simulations to generate a realistic CC structure in the cluster population is due to AGN feedback and artificial thermal diffusion: their combined action allows us to naturally distribute the energy extracted from super-massive black holes and to compensate for the radiative losses of low-entropy gas with short cooling time residing in the cluster core.

  15. Patterns and predictors of violence against children in Uganda: a latent class analysis

    PubMed Central

    Clarke, Kelly; Patalay, Praveetha; Allen, Elizabeth; Knight, Louise; Naker, Dipak; Devries, Karen

    2016-01-01

    Objective: To explore patterns of physical, emotional and sexual violence against Ugandan children. Design: Latent class and multinomial logistic regression analysis of cross-sectional data. Setting: Luwero District, Uganda. Participants: In all, 3706 primary 5, 6 and 7 students attending 42 primary schools. Main outcome and measure: To measure violence, we used the International Society for the Prevention of Child Abuse and Neglect Child Abuse Screening Tool-Child Institutional. We used the Strengths and Difficulties Questionnaire to assess mental health and administered reading, spelling and maths tests. Results: We identified three violence classes. Class 1 (N=696, 18.8%) was characterised by emotional and physical violence by parents and relatives, and sexual and emotional abuse by boyfriends, girlfriends and unrelated adults outside school. Class 2 (N=975, 26.3%) was characterised by physical, emotional and sexual violence by peers (male and female students). Children in Classes 1 and 2 also had a high probability of exposure to emotional and physical violence by school staff. Class 3 (N=2035, 54.9%) was characterised by physical violence by school staff and a lower probability of all other forms of violence compared to Classes 1 and 2. Children in Classes 1 and 2 were more likely to have worked for money (Class 1 Relative Risk Ratio 1.97, 95% CI 1.54 to 2.51; Class 2 1.55, 1.29 to 1.86), been absent from school in the previous week (Class 1 1.31, 1.02 to 1.67; Class 2 1.34, 1.10 to 1.63) and to have more mental health difficulties (Class 1 1.09, 1.07 to 1.11; Class 2 1.11, 1.09 to 1.13) compared to children in Class 3. Female sex (3.44, 2.48 to 4.78) and number of children sharing a sleeping area predicted being in Class 1. Conclusions: Childhood violence in Uganda forms distinct patterns, clustered by perpetrator and setting. Research is needed to understand experiences of victimised children, and to develop mental health interventions for those with severe violence exposures. Trial registration number: NCT01678846. PMID:27221125

  16. Quantifying spatial patterns in the Yakama Nation Tribal Forest and Okanogan-Wenatchee National Forest to assess forest health

    NASA Astrophysics Data System (ADS)

    Wilder, T. F.

    2013-05-01

    Over the past century the western United States has experienced drastic anthropogenic land use change from practices such as agriculture, fire exclusion, and timber harvesting. These changes have complex social, cultural, economic, and ecological interactions and consequences. This research studied landscape patterns of watersheds with similar LANDFIRE potential vegetation in the Southern Washington Cascades physiographic province, within the Yakama Nation Tribal Forest (YTF) and Okanogan-Wenatchee National Forest, Naches Ranger District (NRD). In the selected watersheds, vegetation-mapping units were delineated and populated based on physiognomy of homogeneous areas of vegetative composition and structure using high-resolution aerial photos. Cover types and structural classes were derived from the raw, photo-interpreted vegetation attributes for individual vegetation mapping units and served as individual and composite response variables to quantify and assess spatial patterns and forest health conditions between the two ownerships. Structural classes in both the NRD and YTF were spatially clustered (Z-score 3.1, p-value 0.01; Z-score 2.3, p-value 0.02, respectively); however, ownership and logging type both explained a significant amount of variance in structural class composition. Based on FRAGSTATS landscape metrics, structural classes in the NRD displayed greater clustering and fragmentation with lower interspersion relative to the YTF. The NRD landscape comprised 47.4% understory reinitiation structural class type, and the associated high FRAGSTATS class metrics demonstrated high aggregation with moderate interspersion. Stem exclusion open canopy displayed the greatest dispersal of structural class types throughout the NRD, but adjacencies were correlated to other class types. In the YTF, stem exclusion open canopy comprised 37.7% of the landscape and displayed a high degree of aggregation and interspersion about clusters throughout the YTF. Composite cover type-structural class spatial autocorrelation was clustered in the NRD (Z-score 5.1, p-value 0.01), while the YTF exhibited a random spatial pattern. After accounting for location effects, logging type was the most significant factor explaining variation in composite cover-structure composition. FRAGSTATS landscape metrics indicated that composite cover-structure classes in the NRD displayed greater aggregation and fragmentation with lower interspersion relative to the YTF. The NRD landscape comprised 30.5% Pinus ponderosa-understory reinitiation, and the associated class metrics demonstrated a high degree of aggregation and fragmentation with low interspersion. Pinus ponderosa-stem exclusion open canopy comprised 24.6% of the YTF landscape, and the associated class metrics displayed moderate aggregation and fragmentation with high interspersion. A discussion integrating the results and the existing relevant literature was composed to assess management regime influences on landscape patterns and, in turn, forest health attributes. This discussion is intended to enhance collaboration and to optimize forest-health restoration activities across ownerships throughout the study area.

  17. Content Themes of Alcohol Advertising in U.S. Television-Latent Class Analysis.

    PubMed

    Morgenstern, Matthis; Schoeppe, Franziska; Campbell, Julie; Braam, Marloes W G; Stoolmiller, Michael; Sargent, James D

    2015-09-01

    There is little alcohol research that reports on the thematic contents of contemporary alcohol advertisements in U.S. television. Studies of alcohol ads from 2 decades ago did not identify "Partying" as a social theme. The aim of this study was to describe and classify alcohol advertisements aired in national television in terms of contents, airing times, and channel placements and to identify different marketing strategies of alcohol brands. Content analysis of all ads from the top 20 U.S. beer and spirit brands aired between July 2009 and June 2011. These were 581 unique alcohol ads accounting for 272,828 (78%) national television airings. Ads were coded according to predefined definitions of 13 content areas. A latent class analysis (LCA) was conducted to define content cluster themes and determine alcoholic brands that were more likely to exploit these themes. About half of the advertisements (46%) were aired between 3 am and 8 pm, and the majority were placed in either Entertainment (40%) or Sports (38%) channels. Beer ads comprised 64% of the sample, with significant variation in airing times and channels between types of products and brands. LCA revealed 5 content classes that exploited the "Partying," "Quality," "Sports," "Manly," and "Relax" themes. The partying class, indicative of ad messages surrounding partying, love, and sex, was the dominant theme, comprising 42% of all advertisements. Ads for alcopops, flavored spirits, and liqueur were more likely to belong to the party class, but there were also some beer brands (Corona, Heineken) where more than 67% of ads exploited this theme. This is the first analysis to identify a partying theme in contemporary alcohol advertising. Future analyses can now determine whether exposure to that or other themes predicts alcohol misuse among youth audiences. Copyright © 2015 by the Research Society on Alcoholism.
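
    Several records in this listing rely on latent class analysis of coded indicators. As a rough illustration of the idea, the sketch below fits a latent class model to binary (present/absent) indicators with a basic expectation-maximization loop. It is not the procedure used in any of the studies listed here: the binary coding, the function name fit_lca, the random initialization and the fixed iteration count are all assumptions.

```python
import random


def fit_lca(data, n_classes, n_iter=200, seed=0):
    """Fit a latent class model to rows of 0/1 indicators with plain EM.

    Returns (class weights, per-class item probabilities, posteriors).
    Illustrative sketch only: no convergence check, no model selection.
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random starting values break the symmetry between classes.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    post = []
    for _ in range(n_iter):
        # E-step: posterior class membership probabilities for each row.
        post = []
        for row in data:
            joint = []
            for k in range(n_classes):
                like = weights[k]
                for j, x in enumerate(row):
                    like *= probs[k][j] if x else 1.0 - probs[k][j]
                joint.append(like)
            total = sum(joint)
            post.append([v / total for v in joint])
        # M-step: update class sizes and item endorsement probabilities.
        for k in range(n_classes):
            nk = max(sum(p[k] for p in post), 1e-12)
            weights[k] = nk / len(data)
            for j in range(n_items):
                num = sum(p[k] * row[j] for p, row in zip(post, data))
                # Clamp away from 0 and 1 so likelihoods stay positive.
                probs[k][j] = min(max(num / nk, 1e-6), 1.0 - 1e-6)
    return weights, probs, post


# Toy data: two response patterns over six hypothetical binary indicators.
data = [[1, 1, 1, 0, 0, 0]] * 20 + [[0, 0, 0, 1, 1, 1]] * 20
weights, probs, post = fit_lca(data, n_classes=2)
print([round(w, 3) for w in weights])
```

    In practice the number of classes is chosen by comparing fit statistics across models, which this sketch omits.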

  18. ALMA Survey of Class II Disks in the Young Stellar Cluster IC 348

    NASA Astrophysics Data System (ADS)

    Ruiz, Dary; Cieza, Lucas; Williams, Jonathan; Andrews, Sean; Principe, David

    2018-01-01

    We present a 1.3 mm continuum survey of the young (2-3 Myr) stellar cluster IC 348 at a distance of 270 pc, which is dominated by low-mass stars. We observed 146 Class II sources (disks that are optically thick in the infrared) at 0.8" (200 au) resolution with a 3σ sensitivity of 0.2 M_Earth. We detect 46 of the targets and construct a disk luminosity function. We compare the disk mass distribution in IC 348 to those of younger and older regions, taking into account the dependence on stellar mass. We find a clear evolution in disk masses from 1 to 5-10 Myr. The disk masses in IC 348 are significantly lower than those in Taurus (1-2 Myr) and Lupus (1-3 Myr), similar to those of Chamaeleon I (2-3 Myr) and σ-Ori (3-5 Myr), and significantly higher than in Upper Scorpius (5-10 Myr). About 20 disks in our sample (~5% of the cluster members) have estimated masses (dust + gas) of >1 M_Jup and might be the precursors of giant planets in the cluster. Some of the most massive disks include transition objects with inner opacity holes based on their infrared SEDs. From a stacking analysis of the 90 non-detections, we find that these disks have a typical dust mass of just ≤ 0.1 M_Earth, even though the vast majority of their infrared SEDs remain optically thick and show little sign of evolution. Such low-mass disks are likely to be the precursors of the small rocky planets found by Kepler around M-type stars.

  19. A comparison of three clustering methods for finding subgroups in MRI, SMS or clinical data: SPSS TwoStep Cluster analysis, Latent Gold and SNOB.

    PubMed

    Kent, Peter; Jensen, Rikke K; Kongsted, Alice

    2014-10-02

    There are various methodological approaches to identifying clinically important subgroups and one method is to identify clusters of characteristics that differentiate people in cross-sectional and/or longitudinal data using Cluster Analysis (CA) or Latent Class Analysis (LCA). There is a scarcity of head-to-head comparisons that can inform the choice of which clustering method might be suitable for particular clinical datasets and research questions. Therefore, the aim of this study was to perform a head-to-head comparison of three commonly available methods (SPSS TwoStep CA, Latent Gold LCA and SNOB LCA). The performance of these three methods was compared: (i) quantitatively using the number of subgroups detected, the classification probability of individuals into subgroups, the reproducibility of results, and (ii) qualitatively using subjective judgments about each program's ease of use and interpretability of the presentation of results. We analysed five real datasets of varying complexity in a secondary analysis of data from other research projects. Three datasets contained only MRI findings (n = 2,060 to 20,810 vertebral disc levels), one dataset contained only pain intensity data collected for 52 weeks by text (SMS) messaging (n = 1,121 people), and the last dataset contained a range of clinical variables measured in low back pain patients (n = 543 people). Four artificial datasets (n = 1,000 each) containing subgroups of varying complexity were also analysed, testing the ability of these clustering methods to detect subgroups and correctly classify individuals when subgroup membership was known. The results from the real clinical datasets indicated that the number of subgroups detected varied, the certainty of classifying individuals into those subgroups varied, the findings had perfect reproducibility, some programs were easier to use and the interpretability of the presentation of their findings also varied. The results from the artificial datasets indicated that all three clustering methods showed a near-perfect ability to detect known subgroups and correctly classify individuals into those subgroups. Our subjective judgement was that Latent Gold offered the best balance of sensitivity to subgroups, ease of use and presentation of results with these datasets, but we recognise that different clustering methods may suit other types of data and clinical research questions.

  20. Percolation on fitness landscapes: effects of correlation, phenotype, and incompatibilities

    PubMed Central

    Gravner, Janko; Pitman, Damien; Gavrilets, Sergey

    2009-01-01

    We study how correlations in the random fitness assignment may affect the structure of fitness landscapes, in three classes of fitness models. The first is a phenotype space in which individuals are characterized by a large number n of continuously varying traits. In a simple model of random fitness assignment, viable phenotypes are likely to form a giant connected cluster percolating throughout the phenotype space provided the viability probability is larger than 1/2^n. The second model explicitly describes genotype-to-phenotype and phenotype-to-fitness maps, allows for neutrality at both phenotype and fitness levels, and results in a fitness landscape with tunable correlation length. Here, phenotypic neutrality and correlation between fitnesses can reduce the percolation threshold, and correlations at the point of phase transition between local and global are most conducive to the formation of the giant cluster. In the third class of models, particular combinations of alleles or values of phenotypic characters are “incompatible” in the sense that the resulting genotypes or phenotypes have zero fitness. This setting can be viewed as a generalization of the canonical Bateson-Dobzhansky-Muller model of speciation and is related to K-SAT problems, prominent in computer science. We analyze the conditions for the existence of viable genotypes, their number, as well as the structure and the number of connected clusters of viable genotypes. We show that analysis based on expected values can easily lead to wrong conclusions, especially when fitness correlations are strong. We focus on pairwise incompatibilities between diallelic loci, but we also address multiple alleles, complex incompatibilities, and continuous phenotype spaces. In the case of diallelic loci, the number of clusters is stochastically bounded and each cluster contains a very large sub-cube. 
Finally, we demonstrate that the discrete NK model shares some signature properties of models with high correlations. PMID:17692873
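
    The notion of connected clusters of viable genotypes at diallelic loci can be made concrete with a short sketch. Genotypes are encoded as integers whose bits are alleles, and two genotypes are treated as neighbors when they differ at exactly one locus. Both function names are hypothetical, and the random generator assigns viability independently at each genotype, which is an assumed simplification and not the correlated or incompatibility-based models the abstract describes.

```python
import random
from collections import deque


def viable_clusters(viable, n_loci):
    """Return the sizes of connected clusters of viable genotypes.

    Genotypes are integers in [0, 2**n_loci); neighbors differ at one locus.
    Sizes are returned in decreasing order.
    """
    unseen = set(viable)
    sizes = []
    while unseen:
        start = unseen.pop()
        queue = deque([start])
        size = 1
        while queue:
            genotype = queue.popleft()
            for locus in range(n_loci):
                neighbor = genotype ^ (1 << locus)  # flip one allele
                if neighbor in unseen:
                    unseen.remove(neighbor)
                    queue.append(neighbor)
                    size += 1
        sizes.append(size)
    return sorted(sizes, reverse=True)


def random_viable_set(n_loci, p, rng):
    """Mark each genotype viable independently with probability p (assumed)."""
    return {g for g in range(2 ** n_loci) if rng.random() < p}


rng = random.Random(1)
viable = random_viable_set(10, 0.3, rng)
sizes = viable_clusters(viable, 10)
print(len(viable), len(sizes), sizes[:3])
```

    Raising p in this toy setting shows the largest cluster absorbing a growing share of the viable genotypes, which is the qualitative behavior a percolation analysis formalizes.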

  1. An initial perspective of S-asteroid subtypes within asteroid families

    NASA Technical Reports Server (NTRS)

    Kelley, M. S.; Gaffey, M. J.

    1993-01-01

    Many main belt asteroids cluster around certain values of semi-major axis (a), inclination (i), and eccentricity (e). Hirayama was the first to notice these concentrations, which he interpreted as evidence of disruptions of larger parent bodies. He called these clusters 'asteroid families'. The term 'families' is increasingly reserved for genetic associations to distinguish them from clusters of unknown or purely dynamical origin (e.g. the Phocaea cluster). Members of a genetic asteroid family represent fragments derived from various depths within the original parent planetesimal. Thus, family members offer the potential for direct examination of the interiors of parent bodies which have undergone metamorphism and differentiation similar to that occurring in the inaccessible interiors of terrestrial planets. The condition that genetic family members represent the fragments of a parent object provides a critical test of whether an association (cluster in proper element space) is a genetic family. Compositions (types and relative abundances of materials) of family members must permit the reconstruction of a compositionally plausible parent body. The compositions of proposed family members can be utilized to test the genetic reality of the family and to determine the type and degree of internal differentiation within the parent planetesimal. The interpretation of the S-class mineralogy provides a preliminary evaluation of family memberships. Detailed mineralogical and petrological analysis was done based on the reflectance spectra of 39 S-type asteroids. The result is a division of the S-asteroid class into seven subtypes based on compositional differences. These subtypes, designated S(I) to S(VII), correspond to surface silicate assemblages ranging from monomineralic olivine (dunites) through olivine-pyroxene mixtures to pure pyroxene or pyroxene-feldspar mixtures (basalts). The most general conclusion is that the S-asteroids cannot be treated as a single group of objects without greatly oversimplifying their properties. Each S-subtype needs to be treated as an independent group with a distinct evolutionary history.

  2. Evolution of major histocompatibility complex class I and class II genes in the brown bear

    PubMed Central

    2012-01-01

    Background: Major histocompatibility complex (MHC) proteins constitute an essential component of the vertebrate immune response, and are coded by the most polymorphic of the vertebrate genes. Here, we investigated sequence variation and evolution of MHC class I and class II DRB, DQA and DQB genes in the brown bear Ursus arctos to characterise the level of polymorphism, estimate the strength of positive selection acting on them, and assess the extent of gene orthology and trans-species polymorphism in Ursidae. Results: We found 37 MHC class I, 16 MHC class II DRB, four DQB and two DQA alleles. We confirmed the expression of several loci: three MHC class I, two DRB, two DQB and one DQA. MHC class I also contained two clusters of non-expressed sequences. MHC class I and DRB allele frequencies differed between northern and southern populations of the Scandinavian brown bear. The rate of nonsynonymous substitutions (dN) exceeded the rate of synonymous substitutions (dS) at putative antigen binding sites of DRB and DQB loci and, marginally significantly, at MHC class I loci. Models of codon evolution supported positive selection at DRB and MHC class I loci. Both MHC class I and MHC class II sequences showed orthology to gene clusters found in the giant panda Ailuropoda melanoleuca. Conclusions: Historical positive selection has acted on MHC class I, class II DRB and DQB, but not on the DQA locus. The signal of historical positive selection on the DRB locus was particularly strong, which may be a general feature of caniforms. The presence of MHC class I pseudogenes may indicate faster gene turnover in this class through the birth-and-death process. South–north population structure at MHC loci probably reflects origin of the populations from separate glacial refugia. PMID:23031405

  3. Evolution of major histocompatibility complex class I and class II genes in the brown bear.

    PubMed

    Kuduk, Katarzyna; Babik, Wiesław; Bojarska, Katarzyna; Sliwińska, Ewa B; Kindberg, Jonas; Taberlet, Pierre; Swenson, Jon E; Radwan, Jacek

    2012-10-02

    Major histocompatibility complex (MHC) proteins constitute an essential component of the vertebrate immune response, and are coded by the most polymorphic of the vertebrate genes. Here, we investigated sequence variation and evolution of MHC class I and class II DRB, DQA and DQB genes in the brown bear Ursus arctos to characterise the level of polymorphism, estimate the strength of positive selection acting on them, and assess the extent of gene orthology and trans-species polymorphism in Ursidae. We found 37 MHC class I, 16 MHC class II DRB, four DQB and two DQA alleles. We confirmed the expression of several loci: three MHC class I, two DRB, two DQB and one DQA. MHC class I also contained two clusters of non-expressed sequences. MHC class I and DRB allele frequencies differed between northern and southern populations of the Scandinavian brown bear. The rate of nonsynonymous substitutions (dN) exceeded the rate of synonymous substitutions (dS) at putative antigen binding sites of DRB and DQB loci and, marginally significantly, at MHC class I loci. Models of codon evolution supported positive selection at DRB and MHC class I loci. Both MHC class I and MHC class II sequences showed orthology to gene clusters found in the giant panda Ailuropoda melanoleuca. Historical positive selection has acted on MHC class I, class II DRB and DQB, but not on the DQA locus. The signal of historical positive selection on the DRB locus was particularly strong, which may be a general feature of caniforms. The presence of MHC class I pseudogenes may indicate faster gene turnover in this class through the birth-and-death process. South-north population structure at MHC loci probably reflects origin of the populations from separate glacial refugia.

  4. On the Unusually High Temperature of the Cluster of Galaxies 1E 0657-56

    NASA Technical Reports Server (NTRS)

    Yaqoob, Tahir

    1999-01-01

    A recent X-ray observation of the cluster 1E 0657-56 (z = 0.296) with ASCA implied an unusually high temperature of approx. 17 keV. Such a high temperature would make it the hottest known cluster and severely constrain cosmological models since, in a Universe with critical density (Omega = 1), the probability of observing such a cluster is only approx. 4 × 10^-5. Here we test the robustness of this observational result since it has such important implications. We analysed the data using a variety of different data analysis methods and spectral analysis assumptions and find a temperature of approx. 11-12 keV in all cases, except for one class of spectral fits. These are fits in which the absorbing column density is fixed at the Galactic value. Using simulated data for a 12 keV cluster, we show that a high temperature of approx. 17 keV is artificially obtained if the true spectrum has a stronger low-energy cut-off than that for Galactic absorption only. The apparent extra absorption may be astrophysical in origin (either intrinsic or line-of-sight), or it may be a problem with the low-energy CCD efficiency. Although significantly lower than previous measurements, this temperature of kT approx. 11-12 keV is still relatively high since only a few clusters have been found to have temperatures higher than 10 keV, and the data therefore still present some difficulty for an Omega = 1 Universe. Our results will also be useful to anyone who wants to estimate the systematic errors involved in different methods of background subtraction of ASCA data for sources with similar signal-to-noise to that of the 1E 0657-56 data reported here.

  5. Phylogenetic and Pathotypic Characterization of Newcastle Disease Viruses Circulating in West Africa and Efficacy of a Current Vaccine

    PubMed Central

    Samuel, Arthur; Nayak, Baibaswata; Paldurai, Anandan; Xiao, Sa; Aplogan, Gilbert L.; Awoume, Kodzo A.; Webby, Richard J.; Ducatez, Mariette F.; Collins, Peter L.

    2013-01-01

    Newcastle disease (ND) is a deadly avian disease worldwide. In Africa, ND is enzootic and causes large economic losses, but little is known about the Newcastle disease virus (NDV) strains circulating in African countries. In this study, 27 NDV isolates collected from apparently healthy chickens in live-bird markets of the West African countries Benin and Togo in 2009 were characterized. All isolates had polybasic fusion (F)-protein cleavage sites and were shown to be highly virulent in standard pathogenicity assays. Infection of 2-week-old chickens with two of the isolates resulted in 100% mortality within 4 days. Phylogenetic analysis of the 27 isolates based on a partial F-protein gene sequence identified three clusters: one containing all the isolates from Togo and one from Benin (cluster 2), one containing most isolates from Benin (cluster 3), and an outlier isolate from Benin (cluster 1). All the three clusters are related to genotype VII strains of NDV. In addition, the cluster of viruses from Togo contained a recently identified 6-nucleotide insert between the hemagglutinin-neuraminidase (HN) and large polymerase (L) genes in a complete genome of an NDV isolate from this geographical region. Multiple strains that include this novel element suggest local emergence of a new genome length class. These results reveal genetic diversity within and among local NDV populations in Africa. Sequence analysis showed that the F and HN proteins of six West African isolates share 83.2 to 86.6% and 86.5 to 87.9% identities, respectively, with vaccine strain LaSota, indicative of considerable diversity. A vaccine efficacy study showed that the LaSota vaccine protected birds from morbidity and mortality but did not prevent shedding of West African challenge viruses. PMID:23254128

  6. Poisson Mixture Regression Models for Heart Disease Prediction.

    PubMed

    Mufudza, Chipo; Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
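    The model comparison described in this abstract rests on the Bayesian information criterion. A minimal sketch of such a comparison is given below; the log-likelihoods, parameter counts and sample size are hypothetical placeholders, not values from the paper, and the function name `bic` is introduced here only for illustration.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical fits: (maximized log-likelihood, number of free parameters).
candidates = {
    "single Poisson regression": (-520.0, 3),
    "standard 2-component mixture": (-495.0, 7),
    "concomitant 2-component mixture": (-480.0, 9),
}
n_obs = 300  # hypothetical sample size

scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # model with the smallest BIC
for name, value in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: BIC = {value:.1f}")
print("preferred:", best)
```

    The penalty term k * ln(n) grows with the number of parameters, so a richer mixture model is preferred only when its gain in log-likelihood outweighs the added complexity.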

  7. Poisson Mixture Regression Models for Heart Disease Prediction

    PubMed Central

    Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model. PMID:27999611

  8. Characterization of fluorescence in heat-treated silver-exchanged zeolites.

    PubMed

    De Cremer, Gert; Coutiño-Gonzalez, Eduardo; Roeffaers, Maarten B J; Moens, Bart; Ollevier, Jeroen; Van der Auweraer, Mark; Schoonheydt, Robert; Jacobs, Pierre A; De Schryver, Frans C; Hofkens, Johan; De Vos, Dirk E; Sels, Bert F; Vosch, Tom

    2009-03-04

    Thermal treatment of Ag(+)-exchanged zeolites yields discrete highly photostable luminescent clusters without formation of metallic nanoparticles. Different types of emitters with characteristic luminescence colors are observed, depending on the nature of the cocation, the amount of exchanged silver, and the host topology. The dominant emission bands in LTA samples are situated around 550 and 690 nm for the samples with, respectively, low and high silver content, while in FAU-type materials only a broad band around 550 nm is observed, regardless of the degree of exchange. Analysis of the fluorescent properties in combination with ESR spectroscopy suggests that a Ag(6)(+) cluster with doublet electronic ground state is associated with the appearance of the 690-nm emitter, having a decay of a few hundred microseconds. Tentatively, the nanosecond-decaying 550-nm emitter is assigned to the Ag(3)(+) cluster. This new class of photostable luminescent particles with tunable emission colors offers interesting perspectives for various applications such as biocompatible labels for intracellular imaging.

  9. Haemophilus ducreyi Cutaneous Ulcer Strains Are Nearly Identical to Class I Genital Ulcer Strains

    PubMed Central

    Gangaiah, Dharanesh; Webb, Kristen M.; Humphreys, Tricia L.; Fortney, Kate R.; Toh, Evelyn; Tai, Albert; Katz, Samantha S.; Pillay, Allan; Chen, Cheng-Yen; Roberts, Sally A.; Munson, Robert S.; Spinola, Stanley M.

    2015-01-01

    Background: Although cutaneous ulcers (CU) in the tropics are frequently attributed to Treponema pallidum subspecies pertenue, the causative agent of yaws, Haemophilus ducreyi has emerged as a major cause of CU in yaws-endemic regions of the South Pacific islands and Africa. H. ducreyi is generally susceptible to macrolides, but CU strains persist after mass drug administration of azithromycin for yaws or trachoma. H. ducreyi also causes genital ulcers (GU) and was thought to be exclusively transmitted by microabrasions that occur during sex. In human volunteers, the GU strain 35000HP does not infect intact skin; wounds are required to initiate infection. These data led to several questions: Are CU strains a new variant of H. ducreyi or did they evolve from GU strains? Do CU strains contain additional genes that could allow them to infect intact skin? Are CU strains susceptible to azithromycin? Methodology/Principal Findings: To address these questions, we performed whole-genome sequencing and antibiotic susceptibility testing of 5 CU strains obtained from Samoa and Vanuatu and 9 archived class I and class II GU strains. Except for single nucleotide polymorphisms, the CU strains were genetically almost identical to the class I strain 35000HP and had no additional genetic content. Phylogenetic analysis showed that class I and class II strains formed two separate clusters and CU strains evolved from class I strains. Class I strains diverged from class II strains ~1.95 million years ago (mya) and CU strains diverged from the class I strain 35000HP ~0.18 mya. CU and GU strains evolved under similar selection pressures. Like 35000HP, the CU strains were highly susceptible to antibiotics, including azithromycin. Conclusions/Significance: These data suggest that CU strains are derivatives of class I strains that were not recognized until recently. These findings require confirmation by analysis of CU strains from other regions. PMID:26147869

  10. Causal effects of socioeconomic status on central adiposity risks: Evidence using panel data from urban Mexico.

    PubMed

    Levasseur, Pierre

    2015-07-01

    Associated with overweight, obesity and chronic diseases, the nutrition transition process reveals important socioeconomic issues in Mexico. Using panel data from the Mexican Family Life Survey, the purpose of the study is to estimate the causal effect of household socioeconomic status (SES) on nutritional outcomes among urban adults. We divide the analysis into two steps. First, using a mixed clustering procedure, we distinguish four socioeconomic classes based on income, educational and occupational dimensions: (i) a poor class; (ii) a lower-middle class; (iii) an upper-middle class; (iv) a rich class. Second, using an econometric framework adapted to our study (the Hausman-Taylor estimator), we measure the impact of belonging to these socioeconomic groups on individual anthropometric indicators, based on the body-mass index (BMI) and the waist-to-height ratio (WHtR). Our results make several contributions: (i) we show that a new middle class, rising out of poverty, is the most exposed to the risks of adiposity; (ii) as individuals from the upper class seem to be fatter than individuals from the upper-middle class, we can reject the assumption of an inverted U-shaped relationship between socioeconomic and anthropometric status as commonly suggested in emerging economies; (iii) the influence of SES on central adiposity appears to be particularly strong for men. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Some approaches to optimal cluster labeling of aerospace imagery

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1980-01-01

    Some approaches are presented to the problem of labeling clusters using information from a given set of labeled and unlabeled aerospace imagery patterns. The assignment of class labels to the clusters is formulated as the determination of the best assignment over all possible ones with respect to some criterion. Cluster labeling is also viewed as the probability of correct labeling with a maximization of likelihood function. Results of the application of these techniques in the processing of remotely sensed multispectral scanner imagery data are presented.
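    The formulation of cluster labeling as a search for the best assignment over all possible ones can be sketched as follows. This is only an illustration under stated assumptions: the criterion used here (the number of labeled patterns whose cluster receives their class) is an assumed stand-in for the criteria in the report, and the function name, class names and data are hypothetical.

```python
from itertools import product

def best_cluster_labeling(cluster_ids, known_labels, classes):
    """Exhaustively search class-label assignments for clusters.

    cluster_ids  -- cluster index of every pattern
    known_labels -- class label of each pattern, or None if unlabeled
    classes      -- candidate class labels
    Criterion (an assumption): count of labeled patterns whose cluster
    is assigned the same class as their known label.
    """
    clusters = sorted(set(cluster_ids))
    best_score, best_map = -1, None
    # Every cluster may receive any class, so enumerate the full product.
    for combo in product(classes, repeat=len(clusters)):
        mapping = dict(zip(clusters, combo))
        score = sum(
            1
            for c, y in zip(cluster_ids, known_labels)
            if y is not None and mapping[c] == y
        )
        if score > best_score:
            best_score, best_map = score, mapping
    return best_map, best_score

# Hypothetical toy data: eight patterns in three clusters, some unlabeled.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2]
known_labels = ["wheat", None, "wheat", "soil", None, "water", "soil", "water"]
mapping, score = best_cluster_labeling(
    cluster_ids, known_labels, ["soil", "water", "wheat"]
)
print(mapping, score)
```

    With this particular criterion the search reduces to a majority vote within each cluster; the exhaustive loop is kept so that a different criterion can be swapped in without changing the structure.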

  12. Revisiting the monster: the mass profile of the galaxy cluster Abell 3827 using dynamical and strong lensing constrains

    NASA Astrophysics Data System (ADS)

    Rodrigo Carrasco Damele, Eleazar; Verdugo, Tomas

    2018-01-01

    The galaxy cluster Abell 3827 is one of the most massive clusters known at z ≤ 0.1 (Richness class 2, BM type I, X-ray LX = 2.4 x 10^44 erg s^-1). The Brightest Cluster Galaxy (BCG) in Abell 3827 is perhaps the most extreme example of ongoing galaxy cannibalism. The multi-component BCG hosts the stellar remnant nuclei of at least four bright elliptical galaxies embedded in a common asymmetric halo extending up to 15 kpc. The most notable characteristic of the BCG is the existence of a unique strong gravitational lens system located within the inner 15 kpc region. A mass estimate of the galaxy based on a strong lensing model was presented in Carrasco et al. (2010, ApJL, 715, 160). Moreover, the exceptional strong lensing system in Abell 3827 and the location of the four bright galaxies have been used to measure for the first time small physical separations between dark and ordinary matter (Williams et al. 2011, MNRAS, 415, 448; Massey et al. 2015, MNRAS, 449, 3393). In this contribution, we present a detailed strong lensing and dynamical analysis of the cluster Abell 3827 based on spectroscopic redshifts of the lensed features and of ~70 spectroscopically confirmed member galaxies inside 0.5 x 0.5 Mpc from the cluster center.

  13. Querying Co-regulated Genes on Diverse Gene Expression Datasets Via Biclustering.

    PubMed

    Deveci, Mehmet; Küçüktunç, Onur; Eren, Kemal; Bozdağ, Doruk; Kaya, Kamer; Çatalyürek, Ümit V

    2016-01-01

    Rapid development and increasing popularity of gene expression microarrays have resulted in a number of studies on the discovery of co-regulated genes. One important way of discovering such co-regulations is the query-based search since gene co-expressions may indicate a shared role in a biological process. Although there exist promising query-driven search methods adapting clustering, they fail to capture many genes that function in the same biological pathway because microarray datasets are fraught with spurious samples or samples of diverse origin, or the pathways might be regulated under only a subset of samples. On the other hand, a class of clustering algorithms known as biclustering algorithms which simultaneously cluster both the items and their features are useful while analyzing gene expression data, or any data in which items are related in only a subset of their samples. This means that genes need not be related in all samples to be clustered together. Because many genes only interact under specific circumstances, biclustering may recover the relationships that traditional clustering algorithms can easily miss. In this chapter, we briefly summarize the literature using biclustering for querying co-regulated genes. Then we present a novel biclustering approach and evaluate its performance by a thorough experimental analysis.

  14. Use of phylogenetical analysis to predict susceptibility of pathogenic Candida spp. to antifungal drugs.

    PubMed

    Maheux, Andrée F; Sellam, Adnane; Piché, Yves; Boissinot, Maurice; Pelletier, René; Boudreau, Dominique K; Picard, François J; Trépanier, Hélène; Boily, Marie-Josée; Ouellette, Marc; Roy, Paul H; Bergeron, Michel G

    2016-12-01

    Successful treatment of a Candida infection relies on 1) an accurate identification of the pathogenic fungus and 2) on its susceptibility to antifungal drugs. In the present study we investigated the level of correlation between phylogenetical evolution and susceptibility of pathogenic Candida spp. to antifungal drugs. For this, we compared a phylogenetic tree, assembled with the concatenated sequences (2475-bp) of the ATP2, TEF1, and TUF1 genes from 20 representative Candida species, with published minimal inhibitory concentrations (MIC) of the four principal antifungal drug classes commonly used in the treatment of candidiasis: polyenes, triazoles, nucleoside analogues, and echinocandins. The phylogenetic tree revealed three distinct phylogenetic clusters among Candida species. Species within a given phylogenetic cluster have generally similar susceptibility profiles to antifungal drugs and species within Clusters II and III were less sensitive to antifungal drugs than Cluster I species. These results showed that phylogenetical relationship between clusters and susceptibility to several antifungal drugs could be used to guide therapy when only species identification is available prior to information pertaining to its resistance profile. An extended study comprising a large panel of clinical samples should be conducted to confirm the efficiency of this approach in the treatment of candidiasis. Copyright © 2016. Published by Elsevier B.V.

  15. AUTOMATED UNSUPERVISED CLASSIFICATION OF THE SLOAN DIGITAL SKY SURVEY STELLAR SPECTRA USING k-MEANS CLUSTERING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanchez Almeida, J.; Allende Prieto, C., E-mail: jos@iac.es, E-mail: callende@iac.es

    2013-01-20

    Large spectroscopic surveys require automated methods of analysis. This paper explores the use of k-means clustering as a tool for automated unsupervised classification of massive stellar spectral catalogs. The classification criteria are defined by the data and the algorithm, with no prior physical framework. We work with a representative set of stellar spectra associated with the Sloan Digital Sky Survey (SDSS) SEGUE and SEGUE-2 programs, which consists of 173,390 spectra from 3800 to 9200 A sampled on 3849 wavelengths. We classify the original spectra as well as the spectra with the continuum removed. The second set only contains spectral lines, and it is less dependent on uncertainties of the flux calibration. The classification of the spectra with continuum renders 16 major classes. Roughly speaking, stars are split according to their colors, with enough finesse to distinguish dwarfs from giants of the same effective temperature, but with difficulties to separate stars with different metallicities. There are classes corresponding to particular MK types, intrinsically blue stars, dust-reddened, stellar systems, and also classes collecting faulty spectra. Overall, there is no one-to-one correspondence between the classes we derive and the MK types. The classification of spectra without continuum renders 13 classes, the color separation is not so sharp, but it distinguishes stars of the same effective temperature and different metallicities. Some classes thus obtained present a fairly small range of physical parameters (200 K in effective temperature, 0.25 dex in surface gravity, and 0.35 dex in metallicity), so that the classification can be used to estimate the main physical parameters of some stars at a minimum computational cost. We also analyze the outliers of the classification. Most of them turn out to be failures of the reduction pipeline, but there are also high redshift QSOs, multiple stellar systems, dust-reddened stars, galaxies, and, finally, odd spectra whose nature we have not deciphered. The template spectra representative of the classes are publicly available in the online journal.
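    As a rough illustration of the k-means procedure named in this abstract, the sketch below runs Lloyd's algorithm on a handful of toy "spectra". It is not the authors' pipeline: the three-bin flux vectors, the choice of initial centroids and the function name `kmeans` are all assumptions made for the example.

```python
import math

def kmeans(points, centroids, n_iter=20):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy "spectra": flux in three wavelength bins (hypothetical values).
spectra = [
    (1.0, 0.9, 0.8), (1.1, 1.0, 0.7), (0.9, 0.8, 0.9),  # brighter at short wavelengths
    (0.2, 0.5, 1.0), (0.3, 0.6, 1.1), (0.1, 0.4, 0.9),  # brighter at long wavelengths
]
labels, centroids = kmeans(spectra, centroids=[spectra[0], spectra[3]])
print(labels)
```

    Each resulting centroid plays the role of a template spectrum for its class; a real application would use thousands of wavelength samples per spectrum and a more careful initialization.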

  16. Unsupervised Anomaly Detection Based on Clustering and Multiple One-Class SVM

    NASA Astrophysics Data System (ADS)

    Song, Jungsuk; Takakura, Hiroki; Okabe, Yasuo; Kwon, Yongjin

    Intrusion detection systems (IDSs) have played an important role as devices to defend our networks from cyber attacks. However, since they are unable to detect unknown attacks, i.e., 0-day attacks, the ultimate challenge in the intrusion detection field is how to identify such attacks exactly in an automated manner. Over the past few years, several studies on solving these problems have been made on anomaly detection using unsupervised learning techniques such as clustering, one-class support vector machine (SVM), etc. Although they enable one to construct intrusion detection models at low cost and effort, and are able to detect unforeseen attacks, they still have two main problems in intrusion detection: a low detection rate and a high false positive rate. In this paper, we propose a new anomaly detection method based on clustering and multiple one-class SVM in order to improve the detection rate while maintaining a low false positive rate. We evaluated our method using the KDD Cup 1999 data set. Evaluation results show that our approach outperforms the existing algorithms reported in the literature, especially in detection of unknown attacks.

  17. GEANT4 distributed computing for compact clusters

    NASA Astrophysics Data System (ADS)

    Harrawood, Brian P.; Agasthya, Greeshma A.; Lakshmanan, Manu N.; Raterman, Gretchen; Kapadia, Anuj J.

    2014-11-01

    A new technique for distribution of GEANT4 processes is introduced to simplify running a simulation in a parallel environment such as a tightly coupled computer cluster. Using a new C++ class derived from the GEANT4 toolkit, multiple runs forming a single simulation are managed across a local network of computers with a simple inter-node communication protocol. The class is integrated with the GEANT4 toolkit and is designed to scale from a single symmetric multiprocessing (SMP) machine to compact clusters ranging in size from tens to thousands of nodes. User designed 'work tickets' are distributed to clients using a client-server work flow model to specify the parameters for each individual run of the simulation. The new g4DistributedRunManager class was developed and well tested in the course of our Neutron Stimulated Emission Computed Tomography (NSECT) experiments. It will be useful for anyone running GEANT4 for large discrete data sets such as covering a range of angles in computed tomography, calculating dose delivery with multiple fractions or simply speeding the through-put of a single model.

  18. Spectroscopic characterization of galaxy clusters in RCS-1: spectroscopic confirmation, redshift accuracy, and dynamical mass-richness relation

    NASA Astrophysics Data System (ADS)

    Gilbank, David G.; Barrientos, L. Felipe; Ellingson, Erica; Blindert, Kris; Yee, H. K. C.; Anguita, T.; Gladders, M. D.; Hall, P. B.; Hertling, G.; Infante, L.; Yan, R.; Carrasco, M.; Garcia-Vergara, Cristina; Dawson, K. S.; Lidman, C.; Morokuma, T.

    2018-05-01

    We present follow-up spectroscopic observations of galaxy clusters from the first Red-sequence Cluster Survey (RCS-1). This work focuses on two samples, a lower redshift sample of ~30 clusters ranging in redshift from z ~ 0.2-0.6 observed with multiobject spectroscopy (MOS) on 4-6.5-m class telescopes and a z ~ 1 sample of ~10 clusters with 8-m class telescope observations. We examine the detection efficiency and redshift accuracy of the now widely used red-sequence technique for selecting clusters via overdensities of red-sequence galaxies. Using both these data and extended samples including previously published RCS-1 spectroscopy and spectroscopic redshifts from SDSS, we find that the red-sequence redshift using simple two-filter cluster photometric redshifts is accurate to σz ≈ 0.035(1 + z) in RCS-1. This accuracy can potentially be improved with better survey photometric calibration. For the lower redshift sample, ~5 per cent of clusters show some (minor) contamination from secondary systems with the same red-sequence intruding into the measurement aperture of the original cluster. At z ~ 1, the rate rises to ~20 per cent. Approximately ten per cent of projections are expected to be serious, where the two components contribute significant numbers of their red-sequence galaxies to another cluster. Finally, we present a preliminary study of the mass-richness calibration using velocity dispersions to probe the dynamical masses of the clusters. We find a relation broadly consistent with that seen in the local universe from the WINGS sample at z ~ 0.05.
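    The quoted redshift accuracy, σz ≈ 0.035(1 + z), scales with redshift. The short sketch below simply evaluates that expression at a few redshifts inside the ranges mentioned in the abstract; the function name `photoz_sigma` is introduced here for illustration only.

```python
def photoz_sigma(z, sigma0=0.035):
    """Red-sequence redshift scatter sigma_z = sigma0 * (1 + z), as quoted for RCS-1."""
    return sigma0 * (1.0 + z)

# Evaluate the scatter across the redshift range covered by the two samples.
for z in (0.2, 0.6, 1.0):
    print(f"z = {z:.1f}: sigma_z ~ {photoz_sigma(z):.3f}")
```

    Under this scaling the absolute uncertainty at z = 1 is about 0.07, two thirds larger than at z = 0.2.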

  19. Digital modelling of landscape and soil in a mountainous region: A neuro-fuzzy approach

    NASA Astrophysics Data System (ADS)

    Viloria, Jesús A.; Viloria-Botello, Alvaro; Pineda, María Corina; Valera, Angel

    2016-01-01

    Research on genetic relationships between soil and landforms has largely improved soil mapping. Recent technological advances have created innovative methods for modelling the spatial soil variation from digital elevation models (DEMs) and remote sensors. This generates new opportunities for the application of geomorphology to soil mapping. This study applied a method based on artificial neural networks and fuzzy clustering to recognize digital classes of land surfaces in a mountainous area in north-central Venezuela. The spatial variation of the fuzzy memberships exposed the areas where each class predominates, while the class centres helped to recognize the topographic attributes and vegetation cover of each class. The obtained classes of terrain revealed the structure of the land surface, which showed regional differences in climate, vegetation, and topography and landscape stability. The land-surface classes were subdivided on the basis of the geological substratum to produce landscape classes that additionally considered the influence of soil parent material. These classes were used as a framework for soil sampling. A redundancy analysis confirmed that changes of landscape classes explained the variation in soil properties (p = 0.01), and a Kruskal-Wallis test showed significant differences (p = 0.01) in clay, hydraulic conductivity, soil organic carbon, base saturation, and exchangeable Ca and Mg between classes. Thus, the produced landscape classes correspond to three-dimensional bodies that differ in soil conditions. Some changes of land-surface classes coincide with abrupt boundaries in the landscape, such as ridges and thalwegs. However, as the model is continuous, it disclosed the remaining variation between those boundaries.

  20. Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering.

    PubMed

    Suraj; Tiwari, Purnendu; Ghosh, Subhojit; Sinha, Rakesh Kumar

    2015-01-01

    Transferring the brain computer interface (BCI) from laboratory conditions to real world applications requires the BCI to be applied asynchronously without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal motivates us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real time BCI application. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR based features are extracted, and the feature vector is formed relying on the concept of event related synchronization (ERS) and desynchronization (ERD).

  1. Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering

    PubMed Central

    Suraj; Tiwari, Purnendu; Ghosh, Subhojit; Sinha, Rakesh Kumar

    2015-01-01

    Transferring the brain computer interface (BCI) from laboratory conditions to real world applications requires the BCI to be applied asynchronously without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal motivates us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real time BCI application. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR based features are extracted, and the feature vector is formed relying on the concept of event related synchronization (ERS) and desynchronization (ERD). PMID:25972896

  2. Identification and manipulation of the pleuromutilin gene cluster from Clitopilus passeckerianus for increased rapid antibiotic production

    NASA Astrophysics Data System (ADS)

    Bailey, Andy M.; Alberti, Fabrizio; Kilaru, Sreedhar; Collins, Catherine M.; de Mattos-Shipley, Kate; Hartley, Amanda J.; Hayes, Patrick; Griffin, Alison; Lazarus, Colin M.; Cox, Russell J.; Willis, Christine L.; O'Dwyer, Karen; Spence, David W.; Foster, Gary D.

    2016-05-01

    Semi-synthetic derivatives of the tricyclic diterpene antibiotic pleuromutilin from the basidiomycete Clitopilus passeckerianus are important in combatting bacterial infections in human and veterinary medicine. These compounds belong to the only new class of antibiotics for human applications, with novel mode of action and lack of cross-resistance, representing a class with great potential. Basidiomycete fungi, being dikaryotic, are not generally amenable to strain improvement. We report identification of the seven-gene pleuromutilin gene cluster and verify that using various targeted approaches aimed at increasing antibiotic production in C. passeckerianus, no improvement in yield was achieved. The seven-gene pleuromutilin cluster was reconstructed within Aspergillus oryzae giving production of pleuromutilin in an ascomycete, with a significant increase (2106%) in production. This is the first gene cluster from a basidiomycete to be successfully expressed in an ascomycete, and paves the way for the exploitation of a metabolically rich but traditionally overlooked group of fungi.

  3. DYADIC PARENTING AND CHILDREN’S EXTERNALIZING SYMPTOMS

    PubMed Central

    Meteyer, Karen B.; Perry-Jenkins, Maureen

    2010-01-01

    We explore dyadic parenting styles and their association with first-grade children’s externalizing behavior symptoms in a sample of 85 working-class, dual-earner families. Cluster analysis is used to create a typology of parenting types, reflecting the parental warmth, overreactivity, and laxness of both mothers and fathers in two-parent families. Three distinct groups emerged: Supportive Parents, Mixed-Support Parents and Unsupportive Parents. Results indicate that dyadic parenting styles were related to teacher-reported externalizing symptoms for boys but not for girls. PMID:20221305

  4. Laser-excited luminescence and absorption study of mixed valence for K2Pt(CN)4-K2Pt(CN)6 crystals

    NASA Astrophysics Data System (ADS)

    Kasi Viswanath, A.; Smith, Wayne L.; Patterson, H.

    1982-04-01

    Crystals of K2Pt(CN)6 doped with Pt(CN)4(2-) show an absorption band at 337 nm which is assigned as a mixed-valence (MV) transition from Pt(II) to Pt(IV). From a Hush model analysis, the absorption band is interpreted to be class II in the Day-Robin scheme. When the MV band is laser excited at 337 nm, emission is observed from Pt(CN)4(2-) clusters.

  5. DYADIC PARENTING AND CHILDREN'S EXTERNALIZING SYMPTOMS.

    PubMed

    Meteyer, Karen B; Perry-Jenkins, Maureen

    2009-07-01

    We explore dyadic parenting styles and their association with first-grade children's externalizing behavior symptoms in a sample of 85 working-class, dual-earner families. Cluster analysis is used to create a typology of parenting types, reflecting the parental warmth, overreactivity, and laxness of both mothers and fathers in two-parent families. Three distinct groups emerged: Supportive Parents, Mixed-Support Parents and Unsupportive Parents. Results indicate that dyadic parenting styles were related to teacher-reported externalizing symptoms for boys but not for girls.

  6. Chronic Disease Risk Typologies among Young Adults in Community College.

    PubMed

    Jeffries, Jayne K; Lytle, Leslie; Sotres-Alvarez, Daniela; Golden, Shelley; Aiello, Allison E; Linnan, Laura

    2018-03-01

    To address chronic disease risk holistically from a behavioral perspective, insights are needed to refine understanding of the covariance of key health behaviors. This study aims to identify distinct typologies of young adults based on 4 modifiable risk factors of chronic disease using a latent class analysis approach, and to describe patterns of class membership based on demographic characteristics, living arrangements, and weight. Overall, 441 young adults aged 18-35 attending community colleges in the Minnesota Twin Cities area completed a baseline questionnaire for the Choosing Healthy Options in College Environments and Settings study, a RCT. Behavioral items were used to create indicators for latent classes, and individuals were classified using maximum-probability assignment. Three latent classes were identified: 'active, binge-drinkers with a healthy dietary intake' (13.1%); 'non-active, moderate-smokers and non-drinkers with poor dietary intake' (38.2%); 'moderately active, non-smokers and non-drinkers with moderately healthy dietary intake' (48.7%). Classes exhibited unique demographic and weight-related profiles. This study may contribute to the literature on health behaviors among young adults and provides evidence that there are weight and age differences among subgroups. Understanding how behaviors cluster is important for identifying groups for targeted interventions in community colleges.
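    The maximum-probability assignment step mentioned in this abstract can be sketched as follows: each individual is placed in the latent class with the highest posterior membership probability, and class shares are then tabulated. The posterior values below are hypothetical, and the function names are introduced for illustration only; the posteriors themselves would come from a fitted latent class model, which is not shown.

```python
def assign_classes(posteriors):
    """Return, for each individual, the index of the class with the highest posterior."""
    return [max(range(len(row)), key=row.__getitem__) for row in posteriors]

def class_shares(assignments, n_classes):
    """Fraction of individuals assigned to each class."""
    n = len(assignments)
    return [assignments.count(k) / n for k in range(n_classes)]

# Hypothetical posterior membership probabilities (one row per individual).
posteriors = [
    [0.80, 0.15, 0.05],
    [0.10, 0.30, 0.60],
    [0.20, 0.70, 0.10],
    [0.05, 0.25, 0.70],
]
assignments = assign_classes(posteriors)
shares = class_shares(assignments, n_classes=3)
print(assignments, shares)
```

    Hard assignment of this kind discards the uncertainty in the posteriors, so shares computed this way can differ slightly from the model-estimated class proportions.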

  7. Revealing interdyad differences in naturally occurring staff reactions to challenging behaviour of clients with severe or profound intellectual disabilities by means of Clusterwise Hierarchical Classes Analysis (HICLAS).

    PubMed

    Wilderjans, T F; Lambrechts, G; Maes, B; Ceulemans, E

    2014-11-01

    Investigating interdyad (i.e. couples of a client and their usual caregiver) differences in naturally occurring patterns of staff reactions to challenging behaviour (e.g. self-injurious, stereotyped and aggressive/destructive behaviour) of clients with severe or profound intellectual disabilities is important to optimise client-staff interactions. Most studies, however, fail to combine a naturalistic setup with a person-level analysis, in that they do not involve a careful inspection of the interdyad differences and similarities. In this study, the recently proposed Clusterwise Hierarchical Classes Analysis (HICLAS) method is adopted and applied to data in which video fragments (recorded in a naturalistic setting) of a client showing challenging behaviour and the staff reacting to it were analysed. In a Clusterwise HICLAS analysis, the staff-client dyads are grouped into a number of clusters and the prototypical behaviour-reaction patterns that are specific for each cluster (i.e. interdyad differences and similarities) are revealed. Clusterwise HICLAS discloses clear interdyad differences (and similarities) in the prototypical patterns of clients' challenging behaviour and the associated staff reactions, complementing and qualifying the results of earlier studies in which only general patterns were disclosed. The usefulness and clinical relevance of Clusterwise HICLAS is demonstrated. In particular, Clusterwise HICLAS may capture idiosyncratic aspects of staff-client interactions, which may stimulate direct support workers to adopt person-centred support practices that take the specific abilities of the client into account. © 2013 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.

  8. Malignant pleural mesothelioma and mesothelial hyperplasia: A new molecular tool for the differential diagnosis.

    PubMed

    Bruno, Rossella; Alì, Greta; Giannini, Riccardo; Proietti, Agnese; Lucchi, Marco; Chella, Antonio; Melfi, Franca; Mussi, Alfredo; Fontanini, Gabriella

    2017-01-10

    Malignant pleural mesothelioma (MPM) is a rare asbestos related cancer, aggressive and unresponsive to therapies. Histological examination of pleural lesions is the gold standard of MPM diagnosis, although it is sometimes hard to discriminate the epithelioid type of MPM from benign mesothelial hyperplasia (MH). This work aims to define a new molecular tool for the differential diagnosis of MPM, using the expression profile of 117 genes deregulated in this tumour. The gene expression analysis was performed by nanoString System on tumour tissues from 36 epithelioid MPM and 17 MH patients, and on 14 mesothelial pleural samples analysed in a blind way. Data analysis included raw nanoString data normalization, unsupervised cluster analysis by Pearson correlation, non-parametric Mann Whitney U-test and molecular classification by the Uncorrelated Shrunken Centroid (USC) Algorithm. The Mann-Whitney U-test found 35 genes upregulated and 31 downregulated in MPM. The unsupervised cluster analysis revealed two clusters, one composed only of MPM and one only of MH samples, thus revealing class-specific gene profiles. The Uncorrelated Shrunken Centroid algorithm identified two classifiers, one including 22 genes and the other 40 genes, able to properly classify all the samples as benign or malignant using gene expression data; both classifiers were also able to correctly determine, in a blind analysis, the diagnostic categories of all the 14 unknown samples. In conclusion we delineated a diagnostic tool combining molecular data (gene expression) and computational analysis (USC algorithm), which can be applied in the clinical practice for the differential diagnosis of MPM.
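
    As a rough illustration of what clustering "by Pearson correlation" involves, the sketch below computes the correlation distance (1 minus Pearson's r) between expression profiles, which is the dissimilarity usually handed to a hierarchical clustering routine. The function names and the toy profiles are hypothetical, and the abstract does not state which linkage or software the authors used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_distance_matrix(profiles):
    """Pairwise 1 - r distances: 0 for perfectly correlated, 2 for anti-correlated."""
    n = len(profiles)
    return [[1.0 - pearson(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]

# Toy expression profiles (rows = samples, columns = genes), invented for the example.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # same shape as sample 0, different scale
    [4.0, 3.0, 2.0, 1.0],   # reversed pattern
]
dist = correlation_distance_matrix(profiles)
```

    Because the distance depends only on the shape of a profile, samples 0 and 1 end up at distance close to 0 despite their different magnitudes, while sample 2 sits at distance close to 2 from both.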

  9. Chemical Composition and Crystal Morphology of Epicuticular Wax in Mature Fruits of 35 Pear (Pyrus spp.) Cultivars

    PubMed Central

    Wu, Xiao; Yin, Hao; Shi, Zebin; Chen, Yangyang; Qi, Kaijie; Qiao, Xin; Wang, Guoming; Cao, Peng; Zhang, Shaoling

    2018-01-01

    An evaluation of fruit wax components will provide us with valuable information for pear breeding and enhancing fruit quality. Here, we dissected the epicuticular wax concentration, composition and structure of mature fruits from 35 pear cultivars belonging to five different species and hybrid interspecies. A total of 146 epicuticular wax compounds were detected, and the wax composition and concentration varied dramatically among species, with the highest level of 1.53 mg/cm² in Pyrus communis and the lowest level of 0.62 mg/cm² in Pyrus pyrifolia. Field emission scanning electron microscopy (FESEM) analysis showed amorphous structures of the epicuticular wax crystals of different pear cultivars. Cluster analysis revealed that the Pyrus bretschneideri cultivars were grouped much closer to Pyrus pyrifolia and Pyrus ussuriensis, and the Pyrus sinkiangensis cultivars were clustered into a distant group. Based on the principal component analysis (PCA), the cultivars could be divided into three groups and five groups according to seven main classes of epicuticular wax compounds and 146 wax compounds, respectively. PMID:29875784

  10. Detection of a variable number of ribosomal DNA loci by fluorescent in situ hybridization in Populus species.

    PubMed

    Prado, E A; Faivre-Rampant, P; Schneider, C; Darmency, M A

    1996-10-01

    Fluorescent in situ hybridization (FISH) was applied to related Populus species (2n = 19) in order to detect rDNA loci. An interspecific variability in the number of hybridization sites was revealed using as probe an homologous 25S clone from Populus deltoides. The application of image analysis methods to measure fluorescence intensity of the hybridization signals has enabled us to characterize major and minor loci in the 18S-5.8S-25S rDNA. We identified one pair of such rDNA clusters in Populus alba; two pairs, one major and one minor, in both Populus nigra and P. deltoides; and three pairs in Populus balsamifera, (two major and one minor) and Populus euroamericana (one major and two minor). FISH results are in agreement with those based on RFLP analysis. The pBG13 probe containing 5S sequence from flax detected two separate clusters corresponding to the two size classes of units that coexist within 5S rDNA of most Populus species. Key words : Populus spp., fluorescent in situ hybridization, FISH, rDNA variability, image analysis.

  11. DNA Microarray Data Analysis: A Novel Biclustering Algorithm Approach

    NASA Astrophysics Data System (ADS)

    Tchagang, Alain B.; Tewfik, Ahmed H.

    2006-12-01

    Biclustering algorithms refer to a distinct class of clustering algorithms that perform simultaneous row-column clustering. Biclustering problems arise in DNA microarray data analysis, collaborative filtering, market research, information retrieval, text mining, electoral trends, exchange analysis, and so forth. When dealing with DNA microarray experimental data for example, the goal of biclustering algorithms is to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. In this study, we develop novel biclustering algorithms using basic linear algebra and arithmetic tools. The proposed biclustering algorithms can be used to search for all biclusters with constant values, biclusters with constant values on rows, biclusters with constant values on columns, and biclusters with coherent values from a set of data in a timely manner and without solving any optimization problem. We also show how one of the proposed biclustering algorithms can be adapted to identify biclusters with coherent evolution. The algorithms developed in this study discover all valid biclusters of each type, while almost all previous biclustering approaches will miss some.
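
    The bicluster types listed in this abstract can be made concrete with a small Python sketch that classifies a given submatrix. This is only a checker for the standard definitions (with an additive model assumed for "coherent values"); it is not the authors' search algorithm, which the abstract does not describe in detail, and the function name is hypothetical.

```python
def bicluster_type(sub, tol=1e-9):
    """Classify a submatrix (list of equal-length rows) by bicluster type.

    Assumes the additive model a[i][j] = mu + alpha[i] + beta[j]
    for "coherent values".
    """
    n_rows, n_cols = len(sub), len(sub[0])
    # Each row holds a single repeated value.
    const_rows = all(abs(v - row[0]) <= tol for row in sub for v in row)
    # Each column holds a single repeated value.
    const_cols = all(abs(sub[i][j] - sub[0][j]) <= tol
                     for i in range(n_rows) for j in range(n_cols))
    if const_rows and const_cols:
        return "constant values"
    if const_rows:
        return "constant values on rows"
    if const_cols:
        return "constant values on columns"
    # Additive coherence: every 2x2 residual against the first row/column is zero.
    additive = all(
        abs(sub[i][j] - sub[i][0] - sub[0][j] + sub[0][0]) <= tol
        for i in range(n_rows) for j in range(n_cols)
    )
    return "coherent values" if additive else "not a bicluster"

print(bicluster_type([[2, 2], [2, 2]]))
print(bicluster_type([[1, 1, 1], [3, 3, 3]]))
print(bicluster_type([[1, 2], [1, 2]]))
print(bicluster_type([[1, 2, 4], [3, 4, 6]]))
print(bicluster_type([[1, 2], [3, 7]]))
```

    A tolerance parameter is used because real expression values are noisy, and exact equality would almost never hold on measured data.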

  12. DEFINITION OF MULTIVARIATE GEOCHEMICAL ASSOCIATIONS WITH POLYMETALLIC MINERAL OCCURRENCES USING A SPATIALLY DEPENDENT CLUSTERING TECHNIQUE AND RASTERIZED STREAM SEDIMENT DATA - AN ALASKAN EXAMPLE.

    USGS Publications Warehouse

    Jenson, Susan K.; Trautwein, C.M.

    1984-01-01

    The application of an unsupervised, spatially dependent clustering technique (AMOEBA) to interpolated raster arrays of stream sediment data has been found to provide useful multivariate geochemical associations for modeling regional polymetallic resource potential. The technique is based on three assumptions regarding the compositional and spatial relationships of stream sediment data and their regional significance. These assumptions are: (1) compositionally separable classes exist and can be statistically distinguished; (2) the classification of multivariate data should minimize the pair probability of misclustering to establish useful compositional associations; and (3) a compositionally defined class represented by three or more contiguous cells within an array is a more important descriptor of a terrane than a class represented by spatial outliers.

  13. Spectroscopic Studies of the Iron and Manganese Reconstituted Tyrosyl Radical in Bacillus Cereus Ribonucleotide Reductase R2 Protein

    PubMed Central

    Tomter, Ane B.; Zoppellaro, Giorgio; Bell, Caleb B.; Barra, Anne-Laure; Andersen, Niels H.; Solomon, Edward I.; Andersson, K. Kristoffer

    2012-01-01

    Ribonucleotide reductase (RNR) catalyzes the rate limiting step in DNA synthesis where ribonucleotides are reduced to the corresponding deoxyribonucleotides. Class Ib RNRs consist of two homodimeric subunits: R1E, which houses the active site; and R2F, which contains a metallo cofactor and a tyrosyl radical that initiates the ribonucleotide reduction reaction. We studied the R2F subunit of B. cereus reconstituted with iron or alternatively with manganese ions, then subsequently reacted with molecular oxygen to generate two tyrosyl-radicals. The two similar X-band EPR spectra did not change significantly over 4 to 50 K. From the 285 GHz EPR spectrum of the iron form, a g1-value of 2.0090 for the tyrosyl radical was extracted. This g1-value is similar to that observed in class Ia E. coli R2 and class Ib R2Fs with iron-oxygen cluster, suggesting the absence of hydrogen bond to the phenoxyl group. This was confirmed by resonance Raman spectroscopy, where the stretching vibration associated to the radical (C-O, ν7a = 1500 cm−1) was found to be insensitive to deuterium-oxide exchange. Additionally, the 18O-sensitive Fe-O-Fe symmetric stretching (483 cm−1) of the metallo-cofactor was also insensitive to deuterium-oxide exchange indicating no hydrogen bonding to the di-iron-oxygen cluster, and thus, different from mouse R2 with a hydrogen bonded cluster. The HF-EPR spectrum of the manganese reconstituted RNR R2F gave a g1-value of ∼2.0094. The tyrosyl radical microwave power saturation behavior of the iron-oxygen cluster form was as observed in class Ia R2, with diamagnetic di-ferric cluster ground state, while the properties of the manganese reconstituted form indicated a magnetic ground state of the manganese-cluster. The recent activity measurements (Crona et al., (2011) J Biol Chem 286: 33053–33060) indicate that both the manganese and iron reconstituted RNR R2F could be functional. The manganese form might be very important, as it has 8 times higher activity. PMID:22432022

  14. Upon the opportunity to apply ART2 Neural Network for clusterization of biodiesel fuels

    NASA Astrophysics Data System (ADS)

    Petkov, T.; Mustafa, Z.; Sotirov, S.; Milina, R.; Moskovkina, M.

    2016-03-01

    A chemometric approach using an artificial neural network for clusterization of biodiesels was developed. It is based on the ART2 artificial neural network. Gas chromatography (GC) and gas chromatography-mass spectrometry (GC-MS) were used for quantitative and qualitative analysis of biodiesels produced from different feedstocks, and FAME (fatty acid methyl ester) profiles were determined. In total, 96 analytical results for 7 different classes of biofuel plants (sunflower, rapeseed, corn, soybean, palm, peanut, "unknown") were used as objects. The analysis of biodiesels showed the content of five major FAME (C16:0, C18:0, C18:1, C18:2, C18:3), and those components were used as inputs to the model. After training with 6 samples for which the origin was known, the ANN was verified and tested with ninety "unknown" samples. The present research demonstrated the successful application of a neural network for recognition of biodiesels according to their feedstock, which gives information on their properties and handling.

  15. Using cluster analysis to examine the combinations of motivation regulations of physical education students.

    PubMed

    Ullrich-French, Sarah; Cox, Anne

    2009-06-01

    According to self-determination theory, motivation is multidimensional, with motivation regulations lying along a continuum of self-determination (Ryan & Deci, 2007). Accounting for the different types of motivation in physical activity research presents a challenge. This study used cluster analysis to identify motivation regulation profiles and examined their utility by testing profile differences in relative levels of self-determination (i.e., self-determination index), and theoretical antecedents (i.e., competence, autonomy, relatedness) and consequences (i.e., enjoyment, worry, effort, value, physical activity) of physical education motivation. Students (N = 386) in 6th- through 8th-grade physical education classes completed questionnaires of the variables listed above. Five profiles emerged, including average (n = 81), motivated (n = 82), self-determined (n = 91), low motivation (n = 73), and external (n = 59). Group difference analyses showed that students with greater levels of self-determined forms of motivation, regardless of non-self-determined motivation levels, reported the most adaptive physical education experiences.

  16. Patterns of media use and alcohol brand consumption among underage drinking youth in the United States.

    PubMed

    Borzekowski, Dina L G; Ross, Craig S; Jernigan, David H; DeJong, William; Siegel, Michael

    2015-01-01

    This study investigated whether underage drinkers with varied media use patterns differentially consume popular brands of alcohol. A survey was conducted with a national online panel of 1,032 underage youth 13-20 years of age who had consumed at least 1 drink in the past 30 days. A latent class analysis identified four distinct media use patterns. Further analyses explored whether these media use groups differentially consumed the most frequently used alcohol brands. The results showed that past 30-day consumption of specific alcohol brands differed significantly across the four media use clusters, even after controlling for sex, race/ethnicity, household income, U.S. geographic region, frequency of parent's alcohol overconsumption, cigarette smoking, and seatbelt use. This study shows that youth use media in different ways, and this differential use is significantly associated with the consumption of specific alcohol brands. The media clusters revealed in this analysis may inform future research about the association between specific alcohol media exposures and individual brand consumption.

  17. Parent, Sibling and Peer Associations with Subtypes of Psychiatric and Substance Use Disorder Comorbidity in Offspring

    PubMed Central

    McCutcheon, Vivia V.; Scherrer, Jeffrey F.; Grant, Julia D.; Xian, Hong; Haber, Jon Randolph; Jacob, Theodore; Bucholz, Kathleen K

    2012-01-01

    BACKGROUND Parental substance use disorder (SUD) is associated with a range of negative offspring outcomes and psychopathology, but the clustering of these outcomes into subtypes has seldom been examined, nor have the familial and environmental contexts of these subtypes been reported. The present study examines the clustering of offspring lifetime substance use and psychiatric disorders into subtypes and characterizes them in terms of familial and non-familial influences using an offspring-of-twins design. METHOD Telephone-administered diagnostic interviews were used to collect data on psychiatric disorders and SUD from 488 twin fathers, 420 biological mothers and 831 offspring. Latent class analysis (LCA) was used to derive subtypes of lifetime comorbidity in offspring. Familial risk and environmental variables associated with each subtype (i.e. parenting, childhood physical or sexual abuse, perceived sibling and peer substance use) were identified using multinomial logistic regression. RESULTS Four classes identified by LCA were characterized as 1) unaffected, 2) alcohol abuse/dependence, 3) alcohol abuse/dependence comorbid with anxiety and depression, and 4) alcohol, cannabis abuse/dependence and nicotine dependence comorbid with conduct disorder. Inconsistent parenting, childhood physical/sexual abuse, and perceived sibling and peer substance use were significantly associated with profiles of offspring comorbidity after adjusting for familial vulnerability. Some associations were specific (i.e. perceived peer alcohol use to the AUD class), while others were general (peer smoking to all 3 comorbidity classes). CONCLUSIONS We observed distinct subtypes of psychiatric and SUD comorbidity in adolescents and young adults. Subtypes of offspring psychopathology have varied associations with parental psychopathology, family environment, and sibling and peer behaviors. PMID:22921146

  18. Latent Class Detection and Class Assignment: A Comparison of the MAXEIG Taxometric Procedure and Factor Mixture Modeling Approaches

    ERIC Educational Resources Information Center

    Lubke, Gitta; Tueller, Stephen

    2010-01-01

    Taxometric procedures such as MAXEIG and factor mixture modeling (FMM) are used in latent class clustering, but they have very different sets of strengths and weaknesses. Taxometric procedures, popular in psychiatric and psychopathology applications, do not rely on distributional assumptions. Their sole purpose is to detect the presence of latent…

  19. The Best of Both Worlds: Building on the COPUS and RTOP Observation Protocols to Easily and Reliably Measure Various Levels of Reformed Instructional Practice

    PubMed Central

    Lund, Travis J.; Pilarz, Matthew; Velasco, Jonathan B.; Chakraverty, Devasmita; Rosploch, Kaitlyn; Undersander, Molly; Stains, Marilyne

    2015-01-01

    Researchers, university administrators, and faculty members are increasingly interested in measuring and describing instructional practices provided in science, technology, engineering, and mathematics (STEM) courses at the college level. Specifically, there is keen interest in comparing instructional practices between courses, monitoring changes over time, and mapping observed practices to research-based teaching. While increasingly common observation protocols (Reformed Teaching Observation Protocol [RTOP] and Classroom Observation Protocol in Undergraduate STEM [COPUS]) at the postsecondary level help achieve some of these goals, they also suffer from weaknesses that limit their applicability. In this study, we leverage the strengths of these protocols to provide an easy method that enables the reliable and valid characterization of instructional practices. This method was developed empirically via a cluster analysis using observations of 269 individual class periods, corresponding to 73 different faculty members, 28 different research-intensive institutions, and various STEM disciplines. Ten clusters, called COPUS profiles, emerged from this analysis; they represent the most common types of instructional practices enacted in the classrooms observed for this study. RTOP scores were used to validate the alignment of the 10 COPUS profiles with reformed teaching. Herein, we present a detailed description of the cluster analysis method, the COPUS profiles, and the distribution of the COPUS profiles across various STEM courses at research-intensive universities. PMID:25976654

  20. Clustering of health behaviours in adult survivors of childhood cancer and the general population.

    PubMed

    Rebholz, C E; Rueegg, C S; Michel, G; Ammann, R A; von der Weid, N X; Kuehni, C E; Spycher, B D

    2012-07-10

    Little is known about engagement in multiple health behaviours in childhood cancer survivors. Using latent class analysis, we identified health behaviour patterns in 835 adult survivors of childhood cancer (age 20-35 years) and 1670 age- and sex-matched controls from the general population. Behaviour groups were determined from replies to questions on smoking, drinking, cannabis use, sporting activities, diet, sun protection and skin examination. The model identified four health behaviour patterns: 'risk-avoidance', with a generally healthy behaviour; 'moderate drinking', with higher levels of sporting activities, but moderate alcohol-consumption; 'risk-taking', engaging in several risk behaviours; and 'smoking', smoking but not drinking. Similar proportions of survivors and controls fell into the 'risk-avoiding' (42% vs 44%) and the 'risk-taking' cluster (14% vs 12%), but more survivors were in the 'moderate drinking' (39% vs 28%) and fewer in the 'smoking' cluster (5% vs 16%). Determinants of health behaviour clusters were gender, migration background, income and therapy. A comparable proportion of childhood cancer survivors as in the general population engage in multiple health-compromising behaviours. Because of increased vulnerability of survivors, multiple risk behaviours should be addressed in targeted health interventions.

  1. A sequential-move game for enhancing safety and security cooperation within chemical clusters.

    PubMed

    Pavlova, Yulia; Reniers, Genserik

    2011-02-15

    The present paper provides a game theoretic analysis of strategic cooperation on safety and security among chemical companies within a chemical industrial cluster. We suggest a two-stage sequential move game between adjacent chemical plants and the so-called Multi-Plant Council (MPC). The MPC is considered in the game as a leader player who makes the first move, and the individual chemical companies are the followers. The MPC's objective is to achieve full cooperation among players through establishing a subsidy system at minimum expense. The rest of the players rationally react to the subsidies proposed by the MPC and play Nash equilibrium. We show that such a case of conflict between safety and security, and social cooperation, belongs to the 'coordination with assurance' class of games, and we explore the role of cluster governance (fulfilled by the MPC) in achieving a full cooperative outcome in domino effects prevention negotiations. The paper proposes an algorithm that can be used by the MPC to develop the subsidy system. Furthermore, a stepwise plan to improve cross-company safety and security management in a chemical industrial cluster is suggested and an illustrative example is provided. Copyright © 2010 Elsevier B.V. All rights reserved.

  2. Clusters of Antibiotic Resistance Genes Enriched Together Stay Together in Swine Agriculture

    PubMed Central

    Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong; Cole, James R.; Hashsham, Syed A.; Looft, Torey; Zhu, Yong-Guan

    2016-01-01

    ABSTRACT   Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. PMID:27073098

  3. Higher frequencies of HLA DQB1*05:01 and anti-glycosphingolipid antibodies in a cluster of severe Guillain-Barré syndrome.

    PubMed

    Schirmer, L; Worthington, V; Solloch, U; Loleit, V; Grummel, V; Lakdawala, N; Grant, D; Wassmuth, R; Schmidt, A H; Gebhardt, F; Andlauer, T F M; Sauter, J; Berthele, A; Lunn, M P; Hemmer, Bernhard

    2016-10-01

    Few regional and seasonal Guillain-Barré syndrome (GBS) clusters have been reported so far. It is unknown whether patients suffering from sporadic GBS differ from GBS clusters with respect to clinical and paraclinical parameters, HLA association and antibody response to glycosphingolipids and Campylobacter jejuni (Cj). We examined 40 consecutive patients with GBS from the greater Munich area in Germany with 14 of those admitted within a period of 3 months in fall 2010 defining a cluster of GBS. Sequencing-based HLA typing of the HLA genes DRB1, DQB1, and DPB1 was performed, and ELISA for anti-glycosphingolipid antibodies was carried out. Clinical and paraclinical findings (Cj seroreactivity, cerebrospinal fluid parameters, and electrophysiology) were obtained and analyzed. GBS cluster patients were characterized by a more severe clinical phenotype with more patients requiring mechanical ventilation and higher frequencies of autoantibodies against sulfatide, GalC and certain ganglioside epitopes (54 %) as compared to sporadic GBS cases (13 %, p = 0.017). Cj seropositivity tended to be higher within GBS cluster patients (69 %) as compared to sporadic cases (46 %, p = 0.155). We noted higher frequencies of HLA class II allele DQB1*05:01 in the cluster cohort (23 %) as compared to sporadic GBS patients (3 %, p = 0.019). Cluster of severe GBS was defined by higher frequencies of autoantibodies against glycosphingolipids. HLA class II allele DQB1*05:01 might contribute to clinical worsening in the cluster patients.

  4. Glaucomatous patterns in Frequency Doubling Technology (FDT) perimetry data identified by unsupervised machine learning classifiers.

    PubMed

    Bowd, Christopher; Weinreb, Robert N; Balasubramanian, Madhusudhanan; Lee, Intae; Jang, Giljin; Yousefi, Siamak; Zangwill, Linda M; Medeiros, Felipe A; Girkin, Christopher A; Liebmann, Jeffrey M; Goldbaum, Michael H

    2014-01-01

    The variational Bayesian independent component analysis-mixture model (VIM), an unsupervised machine-learning classifier, was used to automatically separate Matrix Frequency Doubling Technology (FDT) perimetry data into clusters of healthy and glaucomatous eyes, and to identify axes representing statistically independent patterns of defect in the glaucoma clusters. FDT measurements were obtained from 1,190 eyes with normal FDT results and 786 eyes with abnormal FDT results from the UCSD-based Diagnostic Innovations in Glaucoma Study (DIGS) and African Descent and Glaucoma Evaluation Study (ADAGES). For all eyes, VIM input was 52 threshold test points from the 24-2 test pattern, plus age. FDT mean deviation was -1.00 dB (S.D. = 2.80 dB) and -5.57 dB (S.D. = 5.09 dB) in FDT-normal eyes and FDT-abnormal eyes, respectively (p<0.001). VIM identified meaningful clusters of FDT data and positioned a set of statistically independent axes through the mean of each cluster. The optimal VIM model separated the FDT fields into 3 clusters. Cluster N contained primarily normal fields (1109/1190, specificity 93.1%) and clusters G1 and G2 combined, contained primarily abnormal fields (651/786, sensitivity 82.8%). For clusters G1 and G2 the optimal number of axes were 2 and 5, respectively. Patterns automatically generated along axes within the glaucoma clusters were similar to those known to be indicative of glaucoma. Fields located farther from the normal mean on each glaucoma axis showed increasing field defect severity. VIM successfully separated FDT fields from healthy and glaucoma eyes without a priori information about class membership, and identified familiar glaucomatous patterns of loss.

  5. Variance-Based Cluster Selection Criteria in a K-Means Framework for One-Mode Dissimilarity Data.

    PubMed

    Vera, J Fernando; Macías, Rodrigo

    2017-06-01

    One of the main problems in cluster analysis is that of determining the number of groups in the data. In general, the approach taken depends on the cluster method used. For K-means, some of the most widely employed criteria are formulated in terms of the decomposition of the total point scatter, regarding a two-mode data set of N points in p dimensions, which are optimally arranged into K classes. This paper addresses the formulation of criteria to determine the number of clusters, in the general situation in which the available information for clustering is a one-mode N × N dissimilarity matrix describing the objects. In this framework, p and the coordinates of points are usually unknown, and the application of criteria originally formulated for two-mode data sets is dependent on their possible reformulation in the one-mode situation. The decomposition of the variability of the clustered objects is proposed in terms of the corresponding block-shaped partition of the dissimilarity matrix. Within-block and between-block dispersion values for the partitioned dissimilarity matrix are derived, and variance-based criteria are subsequently formulated in order to determine the number of groups in the data. A Monte Carlo experiment was carried out to study the performance of the proposed criteria. For simulated clustered points in p dimensions, greater efficiency in recovering the number of clusters is obtained when the criteria are calculated from the related Euclidean distances instead of the known two-mode data set, in general, for unequal-sized clusters and for low dimensionality situations. For simulated dissimilarity data sets, the proposed criteria always outperform the results obtained when these criteria are calculated from their original formulation, using dissimilarities instead of distances.
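
    The idea of computing within-block and between-block dispersion directly from a dissimilarity matrix can be sketched in Python as follows. The sketch relies on the standard identity that, for Euclidean distances, a cluster's sum of squares about its centroid equals the sum of its squared pairwise distances divided by twice the cluster size. The Calinski-Harabasz ratio is used here only as a representative variance-based criterion. The paper's exact criteria are not given in the abstract, so the function names and the choice of criterion are assumptions.

```python
def within_between_dispersion(d, labels):
    """Within- and between-cluster dispersion from a symmetric dissimilarity matrix.

    d:      N x N list of lists of dissimilarities (zero diagonal)
    labels: cluster label for each of the N objects
    Uses sum_{i,j in block} d[i][j]**2 / (2 * block_size), which equals the
    sum of squares about the centroid when d holds Euclidean distances.
    """
    n = len(d)
    total = sum(d[i][j] ** 2 for i in range(n) for j in range(n)) / (2.0 * n)
    within = 0.0
    for k in set(labels):
        members = [i for i, lab in enumerate(labels) if lab == k]
        block = sum(d[i][j] ** 2 for i in members for j in members)
        within += block / (2.0 * len(members))
    return within, total - within

def calinski_harabasz(d, labels):
    """Variance-ratio criterion; larger values suggest a better partition.

    Requires at least two clusters and non-zero within-cluster dispersion.
    """
    n, k = len(d), len(set(labels))
    within, between = within_between_dispersion(d, labels)
    return (between / (k - 1)) / (within / (n - k))

# Four objects on a line at 0, 1, 10 and 11; only their distances are used.
points = [0.0, 1.0, 10.0, 11.0]
d = [[abs(a - b) for b in points] for a in points]

good = calinski_harabasz(d, [0, 0, 1, 1])   # the two tight pairs
bad = calinski_harabasz(d, [0, 1, 0, 1])    # pairs split across clusters
print(good > bad)
```

    In practice the criterion would be evaluated for partitions with different numbers of clusters K, and the K giving the best value would be retained.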

  6. Enhancing interdisciplinary mathematics and biology education: a microarray data analysis course bridging these disciplines.

    PubMed

    Tra, Yolande V; Evans, Irene M

    2010-01-01

    BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course.

  7. Enhancing Interdisciplinary Mathematics and Biology Education: A Microarray Data Analysis Course Bridging These Disciplines

    PubMed Central

    Evans, Irene M.

    2010-01-01

    BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course. PMID:20810954

  8. Principal component analysis on molecular descriptors as an alternative point of view in the search of new Hsp90 inhibitors.

    PubMed

    Lauria, Antonino; Ippolito, Mario; Almerico, Anna Maria

    2009-10-01

    Inhibiting a protein that regulates multiple signal transduction pathways in cancer cells is an attractive goal for cancer therapy. Heat shock protein 90 (Hsp90) is one of the most promising molecular targets for such an approach. In fact, Hsp90 is a ubiquitous molecular chaperone protein that is involved in folding, activating and assembling of many key mediators of signal transduction, cellular growth, differentiation, stress-response and apoptotic pathways. With the aim of analyzing which molecular descriptors have the highest importance in the binding interactions of these classes, we first performed molecular docking experiments on the 187 Hsp90 inhibitors included in the BindingDB, a public database of measured binding affinities. Further, for each frozen conformation obtained from the docking, a set of 250 molecular descriptors was calculated, and the resulting Structure/Descriptors matrix was submitted to Principal Component Analysis. From the factor scores, a good clustering emerged among similar compounds both in terms of structural class and activity spectrum, while examination of the loadings of the first two factors also made it possible to study the classes of descriptors which mainly contribute to each one.

  9. Wood-inhabiting fungi in southern Italy forest stands: morphogroups, vegetation types and decay classes.

    PubMed

    Granito, Vito Mario; Lunghini, Dario; Maggi, Oriana; Persiani, Anna Maria

    2015-01-01

    The authors conducted an ecological study of forests subjected to varying management. The aim of the study was to extend and integrate, within a multivariate context, knowledge of how saproxylic fungal communities behave along altitudinal/vegetational gradients in response to the varying features and quality of coarse woody debris (CWD). The intra-annual seasonal monitoring of saproxylic fungi, based on sporocarp inventories, was used to investigate saproxylic fungi in relation to vegetation types and management categories. We analyzed fungal species occurrence, recorded according to the presence/absence and frequency of sporocarps, on the basis of the harvest season, of coarse woody debris decay classes as well as other environmental and ecological variables. Two-way cluster analysis, DCA and Spearman's rank correlations, for indirect gradient analysis, were performed to identify any patterns of seasonality and decay. Most of the species were found on CWD in an intermediate decay stage. The first DCA axis revealed the vegetational/microclimate gradient as the main driver of fungal community composition, while the second axis corresponded to a strong gradient of CWD decay classes. © 2015 by The Mycological Society of America.

  10. Theory for electron transfer from a mixed-valence dimer with paramagnetic sites to a mononuclear acceptor

    NASA Astrophysics Data System (ADS)

    Bominaar, E. L.; Achim, C.; Borshch, S. A.

    1999-06-01

    Polynuclear transition-metal complexes, such as Fe-S clusters, are the prosthetic groups in a large number of metalloproteins and serve as temporary electron storage units in a number of important redox-based biological processes. Polynuclearity distinguishes clusters from mononuclear centers and confers upon them unique properties, such as spin ordering and the presence of thermally accessible excited spin states in clusters with paramagnetic sites, and fractional valencies in clusters of the mixed-valence type. In an earlier study we presented an effective-mode (EM) analysis of electron transfer from a binuclear mixed-valence donor with paramagnetic sites to a mononuclear acceptor which revealed that the cluster-specific attributes have an important impact on the kinetics of long-range electron transfer. In the present study, the validity of these results is tested in the framework of more detailed theories which we have termed the multimode semiclassical (SC) model and the quantum-mechanical (QM) model. It is found that the qualitative trends in the rate constant are the same in all treatments and that the semiclassical models provide a good approximation of the more rigorous quantum-mechanical description of electron transfer under physiologically relevant conditions. In particular, the present results corroborate the importance of electron transfer via excited spin states in reactions with a low driving force and justify the use of semiclassical theory in cases in which the QM model is computationally too demanding. We consider cases in which either one or two donor sites of a dimer are electronically coupled to the acceptor. In the case of multiconnectivity, the rate constant for electron transfer from a valence-delocalized (class-III) donor is nonadditive with respect to transfer from individual metal sites of the donor and undergoes an order-of-magnitude change by reversing the sign of the intradimer metal-metal resonance parameter (β). In the case of single connectivity, the rate constant for electron transfer from a valence-localized (class-II) donor can readily be tuned over several orders of magnitude by introducing differences in the electronic potentials at the two metal sites of the donor. These results indicate that theories of cluster-based electron transfer, in order to be realistic, need to consider both intrinsic electronic structure and extrinsic interactions of the cluster with the protein environment.

  11. The effect of team accelerated instruction on students’ mathematics achievement and learning motivation

    NASA Astrophysics Data System (ADS)

    Sri Purnami, Agustina; Adi Widodo, Sri; Charitas Indra Prahmana, Rully

    2018-01-01

    This study aimed to measure the improvement in mathematics achievement and learning motivation obtained with Team Accelerated Instruction. The research used a descriptive pre-test/post-test experimental design. The population was all class VIII junior high school students in Jogjakarta, and the sample was drawn by cluster random sampling. The instruments were a questionnaire and a test, and the data were analyzed with the Wilcoxon test. The study concluded that the motivation and achievement of class VII students on linear equation system material increased under the Team Accelerated Instruction learning model. Based on these results, Team Accelerated Instruction can be used as an alternative model for teaching mathematics.

  12. Structure of a gene encoding a murine thymus leukemia antigen, and organization of Tla genes in the BALB/c mouse

    PubMed Central

    1985-01-01

    We have determined the DNA sequence of a gene encoding a thymus leukemia (TL) antigen in the BALB/c mouse, and have more definitively mapped the cloned BALB/c Tla-region class I gene clusters. Analysis of the sequence shows that the Tla gene is less closely related to the H-2 genes than H-2 genes are to one another or to the Qa-2,3-region genes. The Tla gene, 17.3A, contains an apparent gene conversion. Comparison of the BALB/c Tla genes with those from C57BL shows that BALB/c has more Tla-region class I genes, and that one of the genes absent in C57BL is gene 17.3A. PMID:3894562

  13. Carbon nuclear magnetic resonance spectroscopic fingerprinting of commercial gasoline: pattern-recognition analyses for screening quality control purposes.

    PubMed

    Flumignan, Danilo Luiz; Boralle, Nivaldo; Oliveira, José Eduardo de

    2010-06-30

    In this work, the combination of carbon nuclear magnetic resonance ((13)C NMR) fingerprinting with pattern-recognition analyses provides an original and alternative approach to screening commercial gasoline quality. Soft Independent Modelling of Class Analogy (SIMCA) was performed on spectroscopic fingerprints to classify representative commercial gasoline samples, which were selected by Hierarchical Cluster Analyses (HCA) over several months at retail gas stations, into previously quality-defined classes. With the optimized (13)C NMR-SIMCA algorithm, sensitivity values were obtained for the training set (99.0%), with leave-one-out cross-validation, and for the external prediction set (92.0%). Governmental laboratories could employ this method as a rapid screening analysis to discourage adulteration practices. Copyright 2010 Elsevier B.V. All rights reserved.

  14. Studies of the Virgo Cluster. II - A catalog of 2096 galaxies in the Virgo Cluster area. V - Luminosity functions of Virgo Cluster galaxies

    NASA Technical Reports Server (NTRS)

    Binggeli, B.; Tammann, G. A.; Sandage, A.

    1985-01-01

    The present catalog of 2096 galaxies within an area of about 140 sq deg approximately centered on the Virgo cluster should be an essentially complete listing of all certain and possible cluster members, independent of morphological type. Cluster membership is essentially decided by galaxy morphology; for giants and the rare class of high surface brightness dwarfs, membership rests on velocity data. While 1277 of the catalog entries are considered members of the Virgo cluster, 574 are possible members and 245 appear to be background Zwicky galaxies. Major-to-minor axis ratios are given for all galaxies brighter than B(T) = 18, as well as for many fainter ones.

  15. Using Fuzzy Clustering for Real-time Space Flight Safety

    NASA Technical Reports Server (NTRS)

    Lee, Charles; Haskell, Richard E.; Hanna, Darrin; Alena, Richard L.

    2004-01-01

    To ensure space flight safety, it is necessary to monitor myriad sensor readings on the ground and in flight. Since a space shuttle has many sensors, monitoring data and drawing conclusions from information contained within the data in real time is challenging. The nature of the information can be critical to the success of the mission and safety of the crew and therefore must be processed with minimal data-processing time. Data analysis algorithms could be used to synthesize sensor readings and compare data associated with normal operation with the data obtained that contain fault patterns to draw conclusions. Detecting abnormal operation during early stages in the transition from safe to unsafe operation requires a large amount of historical data that can be categorized into different classes (non-risk, risk). Even though 40 years of the shuttle flight program have accumulated volumes of historical data, these data do not comprehensively represent all possible fault patterns since fault patterns are usually unknown before the fault occurs. This paper presents a method that uses a similarity measure between fuzzy clusters to detect possible faults in real time. A clustering technique based on a fuzzy equivalence relation is used to characterize temporal data. Data collected during an initial time period are separated into clusters. These clusters are characterized by their centroids. Clusters formed during subsequent time periods are either merged with an existing cluster or added to the cluster list. The resulting list of cluster centroids, called a cluster group, characterizes the behavior of a particular set of temporal data. The degree to which new clusters formed in a subsequent time period are similar to the cluster group is characterized by a similarity measure, q. This method is applied to downlink data from Columbia flights. The results show that this technique can detect an unexpected fault that was not present in the training data set.
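
    The merge-or-append bookkeeping of the cluster group can be sketched roughly as follows. This is a simplified illustration rather than the authors' method: it replaces the fuzzy equivalence relation with a plain Euclidean distance test, and the similarity function is a hypothetical stand-in for the measure q, which the abstract does not define. The names update_group, similarity, merge_radius and scale are all assumptions.

```python
import math


def nearest(centroid, group):
    """Return (index, distance) of the stored centroid closest to centroid."""
    dists = [math.dist(centroid, stored) for stored, _ in group]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, dists[idx]


def update_group(group, centroid, size, merge_radius):
    """Merge a newly formed cluster into the closest stored cluster, or append it.

    group is a list of (centroid, size) pairs and is modified in place.
    Returns True when the new cluster was merged, False when it was appended.
    Assumption: a crisp distance threshold stands in for the fuzzy relation.
    """
    if group:
        idx, dist = nearest(centroid, group)
        if dist <= merge_radius:
            old, old_size = group[idx]
            total = old_size + size
            merged = tuple((o * old_size + c * size) / total
                           for o, c in zip(old, centroid))
            group[idx] = (merged, total)
            return True
    group.append((tuple(centroid), size))
    return False


def similarity(centroid, group, scale):
    """Hypothetical similarity in (0, 1]; 1 means the new cluster coincides
    with a stored centroid, values near 0 flag behavior not seen before."""
    _, dist = nearest(centroid, group)
    return math.exp(-dist / scale)


group = []
update_group(group, (0.0, 0.0), 10, merge_radius=1.0)  # first cluster
update_group(group, (0.5, 0.0), 10, merge_radius=1.0)  # close: merged
update_group(group, (5.0, 5.0), 3, merge_radius=1.0)   # far: appended
```

    A monitoring loop under these assumptions would compute the similarity for each cluster formed in a new time window and raise a flag when the value falls below a chosen threshold.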

  16. Electron transfer from alpha-keggin anions to dioxygen

    Treesearch

    Yurii V. Geletii; Rajai H. Atalla; Craig L. Hill; Ira A. Weinstock

    2004-01-01

    Polyoxometalates (POMs), of which alpha-Keggin anions are representative, are a diverse and rapidly growing class of water-soluble cluster-anion structures with applications ranging from molecular catalysis to materials. [1] POMs are inexpensive, minimally or non-toxic, negatively charged clusters comprised of early transition-metals, usually in their d0 electronic...

  17. Estimating accuracy of land-cover composition from two-stage cluster sampling

    USGS Publications Warehouse

    Stehman, S.V.; Wickham, J.D.; Fattorini, L.; Wade, T.D.; Baffetta, F.; Smith, J.H.

    2009-01-01

    Land-cover maps are often used to compute land-cover composition (i.e., the proportion or percent of area covered by each class), for each unit in a spatial partition of the region mapped. We derive design-based estimators of mean deviation (MD), mean absolute deviation (MAD), root mean square error (RMSE), and correlation (CORR) to quantify accuracy of land-cover composition for a general two-stage cluster sampling design, and for the special case of simple random sampling without replacement (SRSWOR) at each stage. The bias of the estimators for the two-stage SRSWOR design is evaluated via a simulation study. The estimators of RMSE and CORR have small bias except when sample size is small and the land-cover class is rare. The estimator of MAD is biased for both rare and common land-cover classes except when sample size is large. A general recommendation is that rare land-cover classes require large sample sizes to ensure that the accuracy estimators have small bias. © 2009 Elsevier Inc.
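
    The four accuracy measures named in this abstract can be illustrated with their plain, equally weighted sample forms. The sketch below does not reproduce the paper's design-based estimators, which would weight units according to the two-stage sampling design; it only assumes that each sampled unit has a map proportion and a reference proportion for one land-cover class. The function name and the toy values are hypothetical.

```python
import math


def composition_accuracy(map_props, ref_props):
    """Equally weighted MD, MAD, RMSE and CORR between map and reference
    proportions of one land-cover class across the sampled units.

    Assumption: no sampling weights; every unit counts the same.
    """
    n = len(map_props)
    devs = [m - r for m, r in zip(map_props, ref_props)]
    md = sum(devs) / n                             # signed bias of the map
    mad = sum(abs(x) for x in devs) / n            # typical size of the error
    rmse = math.sqrt(sum(x * x for x in devs) / n)
    mean_m = sum(map_props) / n
    mean_r = sum(ref_props) / n
    cov = sum((m - mean_m) * (r - mean_r)
              for m, r in zip(map_props, ref_props))
    var_m = sum((m - mean_m) ** 2 for m in map_props)
    var_r = sum((r - mean_r) ** 2 for r in ref_props)
    corr = cov / math.sqrt(var_m * var_r)          # Pearson correlation
    return {"MD": md, "MAD": mad, "RMSE": rmse, "CORR": corr}


# Toy example: proportion of one class in three units (map vs. reference).
result = composition_accuracy([0.2, 0.4, 0.6], [0.1, 0.5, 0.6])
```

    In the toy example the positive and negative deviations cancel, so MD is close to zero while MAD and RMSE are not, which is why the measures are reported together.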

  18. Time-Frequency Analysis And Pattern Recognition Using Singular Value Decomposition Of The Wigner-Ville Distribution

    NASA Astrophysics Data System (ADS)

    Boashash, Boualem; Lovell, Brian; White, Langford

    1988-01-01

    Time-Frequency analysis based on the Wigner-Ville Distribution (WVD) is shown to be optimal for a class of signals where the variation of instantaneous frequency is the dominant characteristic. Spectral resolution and instantaneous frequency tracking are substantially improved by using a Modified WVD (MWVD) based on an Autoregressive spectral estimator. Enhanced signal-to-noise ratio may be achieved by using 2D windowing in the Time-Frequency domain. The WVD provides a tool for deriving descriptors of signals which highlight their FM characteristics. These descriptors may be used for pattern recognition and data clustering using the methods presented in this paper.

  19. Genomic Characterization of USA300 Methicillin-Resistant Staphylococcus aureus (MRSA) to Evaluate Intraclass Transmission and Recurrence of Skin and Soft Tissue Infection (SSTI) Among High-Risk Military Trainees.

    PubMed

    Millar, Eugene V; Rice, Gregory K; Elassal, Emad M; Schlett, Carey D; Bennett, Jason W; Redden, Cassie L; Mor, Deepika; Law, Natasha N; Tribble, David R; Hamilton, Theron; Ellis, Michael W; Bishop-Lilly, Kimberly A

    2017-08-01

    Military trainees are at increased risk for methicillin-resistant Staphylococcus aureus (MRSA) skin and soft tissue infection (SSTI). Whole genome sequencing (WGS) can refine our understanding of MRSA transmission and microevolution in congregate settings. We conducted a prospective case-control study of SSTI among US Army infantry trainees at Fort Benning, Georgia, from July 2012 to December 2014. We identified clusters of USA300 MRSA SSTI within select training classes and performed WGS on clinical isolates. We then linked genomic, phylogenetic, epidemiologic, and clinical data in order to evaluate intra- and interclass disease transmission. Furthermore, among cases of recurrent MRSA SSTI, we evaluated the intrahost relatedness of infecting strains. Nine training classes with ≥5 cases of USA300 MRSA SSTI were selected. Eighty USA300 MRSA clinical isolates from 74 trainees, 6 (8.1%) of whom had recurrent infection, were subjected to WGS. We identified 2719 single nucleotide variants (SNVs). The overall median (range) SNV difference between isolates was 173 (1-339). Intraclass median SNV differences ranged from 23 to 245. Two phylogenetic clusters were suggestive of interclass MRSA transmission. One of these clusters stemmed from 2 classes that were separated by a 13-month period but housed in the same barracks. Among trainees with recurrent MRSA SSTI, the intrahost median SNV difference was 7.5 (1-48). Application of WGS revealed intra- and interclass transmission of MRSA among military trainees. An interclass cluster between 2 noncontemporaneous classes suggests a long-term reservoir for MRSA in this setting. Published by Oxford University Press for the Infectious Diseases Society of America 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  20. TMEM88, CCL14 and CLEC3B as prognostic biomarkers for prognosis and palindromia of human hepatocellular carcinoma.

    PubMed

    Zhang, Xin; Wan, Jin-Xiang; Ke, Zun-Ping; Wang, Feng; Chai, Hai-Xia; Liu, Jia-Qiang

    2017-07-01

    Hepatocellular carcinoma is one of the most mortal and prevalent cancers with increasing incidence worldwide. Elucidating genetic driver genes for prognosis and palindromia of hepatocellular carcinoma helps managing clinical decisions for patients. In this study, the high-throughput RNA sequencing data on platform IlluminaHiSeq of hepatocellular carcinoma were downloaded from The Cancer Genome Atlas with 330 primary hepatocellular carcinoma patient samples. Stable key genes with differential expressions were identified with which Kaplan-Meier survival analysis was performed using Cox proportional hazards test in R language. Driver genes influencing the prognosis of this disease were determined using clustering analysis. Functional analysis of driver genes was performed by literature search and Gene Set Enrichment Analysis. Finally, the selected driver genes were verified using external dataset GSE40873. A total of 5781 stable key genes were identified, including 156 genes definitely related to prognoses of hepatocellular carcinoma. Based on the significant key genes, samples were grouped into five clusters which were further integrated into high- and low-risk classes based on clinical features. TMEM88, CCL14, and CLEC3B were selected as driver genes which clustered high-/low-risk patients successfully (generally, p = 0.0005124445). Finally, survival analysis of the high-/low-risk samples from external database illustrated significant difference with p value 0.0198. In conclusion, TMEM88, CCL14, and CLEC3B genes were stable and available in predicting the survival and palindromia time of hepatocellular carcinoma. These genes could function as potential prognostic genes contributing to improve patients' outcomes and survival.

  1. Association between Pesticide Profiles Used on Agricultural Fields near Maternal Residences during Pregnancy and IQ at Age 7 Years.

    PubMed

    Coker, Eric; Gunier, Robert; Bradman, Asa; Harley, Kim; Kogut, Katherine; Molitor, John; Eskenazi, Brenda

    2017-05-09

    We previously showed that potential prenatal exposure to agricultural pesticides was associated with adverse neurodevelopmental outcomes in children, yet the effects of joint exposure to multiple pesticides is poorly understood. In this paper, we investigate associations between the joint distribution of agricultural use patterns of multiple pesticides (denoted as "pesticide profiles") applied near maternal residences during pregnancy and Full-Scale Intelligence Quotient (FSIQ) at 7 years of age. Among a cohort of children residing in California's Salinas Valley, we used Pesticide Use Report (PUR) data to characterize potential exposure from use within 1 km of maternal residences during pregnancy for 15 potentially neurotoxic pesticides from five different chemical classes. We used Bayesian profile regression (BPR) to examine associations between clustered pesticide profiles and deficits in childhood FSIQ. BPR identified eight distinct clusters of prenatal pesticide profiles. Two of the pesticide profile clusters exhibited some of the highest cumulative pesticide use levels and were associated with deficits in adjusted FSIQ of -6.9 (95% credible interval: -11.3, -2.2) and -6.4 (95% credible interval: -13.1, 0.49), respectively, when compared with the pesticide profile cluster that showed the lowest level of pesticides use. Although maternal residence during pregnancy near high agricultural use of multiple neurotoxic pesticides was associated with FSIQ deficit, the magnitude of the associations showed potential for sub-additive effects. Epidemiologic analysis of pesticides and their potential health effects can benefit from a multi-pollutant approach to analysis.

  2. An Analysis of Rich Cluster Redshift Survey Data for Large Scale Structure Studies

    NASA Astrophysics Data System (ADS)

    Slinglend, K.; Batuski, D.; Haase, S.; Hill, J.

    1994-12-01

    The results from the COBE satellite show the existence of structure on scales on the order of 10% or more of the horizon scale of the universe. Rich clusters of galaxies from Abell's catalog show evidence of structure on scales of 100 Mpc and may hold the promise of confirming structure on the scale of the COBE result. However, many Abell clusters have zero or only one measured redshift, so present knowledge of their three dimensional distribution has quite large uncertainties. The shortage of measured redshifts for these clusters may also mask a problem of projection effects corrupting the membership counts for the clusters. Our approach in this effort has been to use the MX multifiber spectrometer on the Steward 2.3m to measure redshifts of at least ten galaxies in each of 80 Abell cluster fields with richness class R ≥ 1 and mag10 ≤ 16.8 (estimated z ≤ 0.12) and zero or one measured redshifts. This work will result in a deeper, more complete (and reliable) sample of positions of rich clusters. Our primary intent for the sample is for two-point correlation and other studies of the large scale structure traced by these clusters in an effort to constrain theoretical models for structure formation. We are also obtaining enough redshifts per cluster so that a much better sample of reliable cluster velocity dispersions will be available for other studies of cluster properties. To date, we have collected such data for 64 clusters, and for most of them, we have seven or more cluster members with redshifts, allowing for reliable velocity dispersion calculations. Velocity histograms and stripe density plots for several interesting cluster fields are presented, along with summary tables of cluster redshift results. Also, with 10 or more redshifts in most of our cluster fields (30′ square, just about an 'Abell diameter' at z ~ 0.1) we have investigated the extent of projection effects within the Abell catalog in an effort to quantify and understand how this may affect the Abell sample.

  3. Job Satisfaction among Health-Care Staff in Township Health Centers in Rural China: Results from a Latent Class Analysis

    PubMed Central

    Wang, Haipeng; Tang, Chengxiang; Zhao, Shichao; Meng, Qingyue; Liu, Xiaoyun

    2017-01-01

    Background: The lower job satisfaction of health-care staff will lead to more brain drain, worse work performance, and poorer health-care outcomes. The aim of this study was to identify patterns of job satisfaction among health-care staff in rural China, and to investigate the association between the latent clusters and health-care staff’s personal and professional features; Methods: We selected 12 items of five-point Likert scale questions to measure job satisfaction. A latent-class analysis was performed to identify subgroups based on the items of job satisfaction; Results: Four latent classes of job satisfaction were identified: 8.9% had high job satisfaction, belonging to “satisfied class”; 38.2% had low job satisfaction, named as “unsatisfied class”; 30.5% were categorized into “unsatisfied class with the exception of interpersonal relationships”; 22.4% were identified as “pseudo-satisfied class”, only satisfied with management-oriented items. Low job satisfaction was associated with specialty, training opportunity, and income inequality. Conclusions: The minority of health-care staff belong to the “satisfied class”. Three among four subgroups are not satisfied with income, benefit, training, and career development. Targeting policy interventions should be implemented to improve the items of job satisfaction based on the patterns and health-care staff’s features. PMID:28937609

  4. An algorithm for spatial heirarchy clustering

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Velasco, F. R. D.

    1981-01-01

    A method for utilizing both spectral and spatial redundancy in compacting and preclassifying images is presented. In multispectral satellite images, a high correlation exists between neighboring image points which tend to occupy dense and restricted regions of the feature space. The image is divided into windows of the same size where the clustering is made. The classes obtained in several neighboring windows are clustered, and then again successively clustered until only one region corresponding to the whole image is obtained. By employing this algorithm only a few points are considered in each clustering, thus reducing computational effort. The method is illustrated as applied to LANDSAT images.
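
    The window-then-merge scheme described in this abstract can be sketched in a few lines. The sketch makes several assumptions that are not in the abstract: pixels are single-band scalars, a greedy leader clustering with a fixed tolerance stands in for whatever clustering the author used, regions are combined in 2 x 2 blocks, and the image side equals the window size times a power of two. All names are hypothetical.

```python
def cluster_values(items, tol):
    """Greedy leader clustering of (value, weight) pairs: an item joins the
    first class whose mean lies within tol, otherwise it starts a new class.
    Returns a list of [weighted_mean, total_weight] classes."""
    classes = []
    for value, weight in items:
        for cls in classes:
            if abs(cls[0] - value) <= tol:
                total = cls[1] + weight
                cls[0] = (cls[0] * cls[1] + value * weight) / total
                cls[1] = total
                break
        else:
            classes.append([value, weight])
    return classes


def hierarchy_cluster(image, window, tol):
    """Cluster a square 2-D list of pixel values level by level.
    Assumption: the image side equals window * 2**m for some m >= 0."""
    side = len(image)
    # Level 0: cluster the pixels inside each window independently.
    grid = [[cluster_values([(image[r][c], 1)
                             for r in range(r0, r0 + window)
                             for c in range(c0, c0 + window)], tol)
             for c0 in range(0, side, window)]
            for r0 in range(0, side, window)]
    # Higher levels: cluster the classes found in each 2 x 2 block of
    # neighbouring regions, so only a few class means are handled per step.
    while len(grid) > 1:
        half = len(grid) // 2
        grid = [[cluster_values(grid[2 * i][2 * j] + grid[2 * i][2 * j + 1]
                                + grid[2 * i + 1][2 * j]
                                + grid[2 * i + 1][2 * j + 1], tol)
                 for j in range(half)]
                for i in range(half)]
    return grid[0][0]


# Toy 4 x 4 image: a dark left half and a bright right half.
image = [[10, 11, 50, 51],
         [10, 12, 49, 50],
         [9, 10, 50, 52],
         [11, 10, 51, 50]]
classes = hierarchy_cluster(image, window=2, tol=5)
```

    At every level above the first, the clustering step sees only class means rather than raw pixels, which is the source of the reduced computational effort that the abstract mentions.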

  5. Patterns and predictors of violence against children in Uganda: a latent class analysis.

    PubMed

    Clarke, Kelly; Patalay, Praveetha; Allen, Elizabeth; Knight, Louise; Naker, Dipak; Devries, Karen

    2016-05-24

    To explore patterns of physical, emotional and sexual violence against Ugandan children. Latent class and multinomial logistic regression analysis of cross-sectional data. Luwero District, Uganda. In all, 3706 primary 5, 6 and 7 students attending 42 primary schools. To measure violence, we used the International Society for the Prevention of Child Abuse and Neglect Child Abuse Screening Tool-Child Institutional. We used the Strengths and Difficulties Questionnaire to assess mental health and administered reading, spelling and maths tests. We identified three violence classes. Class 1 (N=696 18.8%) was characterised by emotional and physical violence by parents and relatives, and sexual and emotional abuse by boyfriends, girlfriends and unrelated adults outside school. Class 2 (N=975 26.3%) was characterised by physical, emotional and sexual violence by peers (male and female students). Children in Classes 1 and 2 also had a high probability of exposure to emotional and physical violence by school staff. Class 3 (N=2035 54.9%) was characterised by physical violence by school staff and a lower probability of all other forms of violence compared to Classes 1 and 2. Children in Classes 1 and 2 were more likely to have worked for money (Class 1 Relative Risk Ratio 1.97, 95% CI 1.54 to 2.51; Class 2 1.55, 1.29 to 1.86), been absent from school in the previous week (Class 1 1.31, 1.02 to 1.67; Class 2 1.34, 1.10 to 1.63) and to have more mental health difficulties (Class 1 1.09, 1.07 to 1.11; Class 2 1.11, 1.09 to 1.13) compared to children in Class 3. Female sex (3.44, 2.48 to 4.78) and number of children sharing a sleeping area predicted being in Class 1. Childhood violence in Uganda forms distinct patterns, clustered by perpetrator and setting. Research is needed to understand experiences of victimised children, and to develop mental health interventions for those with severe violence exposures. NCT01678846; Results. Published by the BMJ Publishing Group Limited. 

  6. Joint spatial-spectral hyperspectral image clustering using block-diagonal amplified affinity matrix

    NASA Astrophysics Data System (ADS)

    Fan, Lei; Messinger, David W.

    2018-03-01

    The large number of spectral channels in a hyperspectral image (HSI) produces a fine spectral resolution to differentiate between materials in a scene. However, difficult classes that have similar spectral signatures are often confused while merely exploiting information in the spectral domain. Therefore, in addition to spectral characteristics, the spatial relationships inherent in HSIs should also be considered for incorporation into classifiers. The growing availability of high spectral and spatial resolution of remote sensors provides rich information for image clustering. Besides the discriminating power in the rich spectrum, contextual information can be extracted from the spatial domain, such as the size and the shape of the structure to which one pixel belongs. In recent years, spectral clustering has gained popularity compared to other clustering methods due to the difficulty of accurate statistical modeling of data in high dimensional space. The joint spatial-spectral information could be effectively incorporated into the proximity graph for spectral clustering approach, which provides a better data representation by discovering the inherent lower dimensionality from the input space. We embedded both spectral and spatial information into our proposed local density adaptive affinity matrix, which is able to handle multiscale data by automatically selecting the scale of analysis for every pixel according to its neighborhood of the correlated pixels. Furthermore, we explored the "conductivity method," which aims at amplifying the block diagonal structure of the affinity matrix to further improve the performance of spectral clustering on HSI datasets.

  7. Understanding the Molecular Basis of Multiple Mitochondrial Dysfunctions Syndrome 1 (MMDS1)-Impact of a Disease-Causing Gly208Cys Substitution on Structure and Activity of NFU1 in the Fe/S Cluster Biosynthetic Pathway.

    PubMed

    Wachnowsky, Christine; Wesley, Nathaniel A; Fidai, Insiya; Cowan, J A

    2017-03-24

    Iron-sulfur (Fe/S)-cluster-containing proteins constitute one of the largest protein classes, with varied functions that include electron transport, regulation of gene expression, substrate binding and activation, and radical generation. Consequently, the biosynthetic machinery for Fe/S clusters is evolutionarily conserved, and mutations in a variety of putative intermediate Fe/S cluster scaffold proteins can cause disease states, including multiple mitochondrial dysfunctions syndrome (MMDS), sideroblastic anemia, and mitochondrial encephalomyopathy. Herein, we have characterized the impact of defects occurring in the MMDS1 disease state that result from a point mutation (Gly208Cys) near the active site of NFU1, an Fe/S scaffold protein, via an in vitro investigation into the structural and functional consequences. Analysis of protein stability and oligomeric state demonstrates that the mutant increases the propensity to dimerize and perturbs the secondary structure composition. These changes appear to underlie the severely decreased ability of mutant NFU1 to accept an Fe/S cluster from physiologically relevant sources. Therefore, the point mutation on NFU1 impairs downstream cluster trafficking and results in the disease phenotype, because there does not appear to be an alternative in vivo reconstitution path, most likely due to greater protein oligomerization from a minor structural change. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. An Intercomparison Between Radar Reflectivity and the IR Cloud Classification Technique for the TOGA-COARE Area

    NASA Technical Reports Server (NTRS)

    Carvalho, L. M. V.; Rickenbach, T.

    1999-01-01

    Satellite infrared (IR) and visible (VIS) images from the Tropical Ocean Global Atmosphere - Coupled Ocean Atmosphere Response Experiment (TOGA-COARE) are investigated using cluster analysis. The clusters are obtained from the values of IR and VIS counts and the local variance for both channels. The clustering procedure is based on the standardized histogram of each variable obtained from 179 pairs of images. A new approach to classify high clouds using only IR and the clustering technique is proposed. This method allows the separation of the enhanced convection into two main classes: convective tops, more closely related to the most active core of the storm, and convective systems, which produce regions of merged, thick anvil clouds. The resulting classification of different portions of cloudiness is compared to the radar reflectivity field for intensive events. Convective systems and convective tops are followed during their life cycle using the IR clustering method. The areal coverage of precipitation and features related to convective and stratiform rain is obtained from the radar for each stage of the evolving Mesoscale Convective Systems (MCS). In order to compare the IR clustering method with a simple threshold technique, two IR thresholds (Tir) were used to identify different portions of cloudiness: Tir = 240 K, which roughly defines the extent of all cloudiness associated with the MCS, and Tir = 220 K, which indicates the presence of deep convection. It is shown that the IR clustering technique can be used as a simple alternative to identify the actual portion of convective and stratiform rainfall.
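
    The simple threshold technique mentioned in this abstract can be sketched as below. Only the 240 K and 220 K values come from the abstract; the function names, the class labels, the inclusive comparisons, and the sample temperatures are assumptions made for illustration.

```python
def classify_ir_pixel(t_ir_kelvin, cloud_threshold=240.0, deep_threshold=220.0):
    """Label one pixel from its IR brightness temperature in kelvin.
    Colder cloud tops are higher, so the colder threshold is tested first."""
    if t_ir_kelvin <= deep_threshold:
        return "deep convection"
    if t_ir_kelvin <= cloud_threshold:
        return "MCS cloudiness"
    return "outside MCS"

def class_fractions(temperatures):
    """Fraction of pixels falling in each class for one image."""
    labels = [classify_ir_pixel(t) for t in temperatures]
    return {name: labels.count(name) / len(labels) for name in set(labels)}

# Hypothetical brightness temperatures for eight pixels.
sample = [210.0, 218.5, 225.0, 239.9, 240.0, 250.0, 265.0, 290.0]
fractions = class_fractions(sample)
```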

  9. Predicting lower mantle heterogeneity from 4-D Earth models

    NASA Astrophysics Data System (ADS)

    Flament, Nicolas; Williams, Simon; Müller, Dietmar; Gurnis, Michael; Bower, Dan J.

    2016-04-01

    The Earth's lower mantle is characterized by two large-low-shear velocity provinces (LLSVPs), approximately 15000 km in diameter and 500-1000 km high, located under Africa and the Pacific Ocean. The spatial stability and chemical nature of these LLSVPs are debated. Here, we compare the lower mantle structure predicted by forward global mantle flow models constrained by tectonic reconstructions (Bower et al., 2015) to an analysis of five global tomography models. In the dynamic models, spanning 230 million years, slabs subducting deep into the mantle deform an initially uniform basal layer containing 2% of the volume of the mantle. Basal density, convective vigour (Rayleigh number Ra), mantle viscosity, absolute plate motions, and relative plate motions are varied in a series of model cases. We use cluster analysis to classify a set of equally-spaced points (average separation ~0.45°) on the Earth's surface into two groups of points with similar variations in present-day temperature between 1000-2800 km depth, for each model case. Below ~2400 km depth, this procedure reveals a high-temperature cluster in which mantle temperature is significantly larger than ambient and a low-temperature cluster in which mantle temperature is lower than ambient. The spatial extent of the high-temperature cluster is in first-order agreement with the outlines of the African and Pacific LLSVPs revealed by a similar cluster analysis of five tomography models (Lekic et al., 2012). Model success is quantified by computing the accuracy and sensitivity of the predicted temperature clusters in predicting the low-velocity cluster obtained from tomography (Lekic et al., 2012). In these cases, the accuracy varies between 0.61-0.80, where a value of 0.5 represents the random case, and the sensitivity ranges between 0.18-0.83. The largest accuracies and sensitivities are obtained for models with Ra ≈ 5 × 10^7, no asthenosphere (or an asthenosphere restricted to the oceanic domain), and a basal layer ~4% denser than ambient mantle. Increasing convective vigour (Ra ≈ 5 × 10^8) or decreasing the density of the basal layer decreases both the accuracy and sensitivity of the predicted lower mantle structure. References: D. J. Bower, M. Gurnis, N. Flament, Assimilating lithosphere and slab history in 4-D Earth models. Phys. Earth Planet. Inter. 238, 8-22 (2015). V. Lekic, S. Cottaar, A. Dziewonski, B. Romanowicz, Cluster analysis of global lower mantle tomography: A new class of structure and implications for chemical heterogeneity. Earth Planet. Sci. Lett. 357, 68-77 (2012).
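
    The two scores quoted in this abstract can be illustrated with the usual confusion-matrix definitions: accuracy = (TP + TN) / N and sensitivity = TP / (TP + FN). The sketch below assumes those standard definitions and unweighted points; the abstract does not say whether the authors weight points by area, and the labels here are invented toy values rather than model output.

```python
def accuracy_and_sensitivity(predicted, reference):
    """Score a predicted binary cluster map against a reference map.
    1 marks the cluster of interest (e.g. high temperature or low velocity)."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity

# Hypothetical cluster membership at eight surface points.
predicted_hot = [1, 1, 0, 0, 1, 0, 0, 1]
reference_slow = [1, 0, 0, 0, 1, 1, 0, 1]
accuracy, sensitivity = accuracy_and_sensitivity(predicted_hot, reference_slow)
```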

  10. NGC 6273: Towards Defining A New Class of Galactic Globular Clusters?

    NASA Astrophysics Data System (ADS)

    Johnson, Christian I.; Rich, Robert Michael; Pilachowski, Catherine A.; Caldwell, Nelson; Mateo, Mario L.; Ira Bailey, John; Crane, Jeffrey D.

    2016-01-01

    A growing number of observations have found that several Galactic globular clusters exhibit abundance dispersions beyond the well-known light element (anti-)correlations. These clusters tend to be very massive, have >0.1 dex intrinsic metallicity dispersions, have complex sub-giant branch morphologies, and have correlated [Fe/H] and s-process element enhancements. Interestingly, nearly all of these clusters discovered so far have [Fe/H]~-1.7. In this context, we have examined the chemical composition of 18 red giant branch (RGB) stars in the massive, metal-poor Galactic bulge globular cluster NGC 6273 using high signal-to-noise, high resolution (R~27,000) spectra obtained with the Michigan/Magellan Fiber System (M2FS) and MSpec spectrograph mounted on the Magellan-Clay 6.5m telescope at Las Campanas Observatory. We find that the cluster exhibits a metallicity range from [Fe/H]=-1.80 to -1.30 and is composed of two dominant populations separated in [Fe/H] and [La/Fe] abundance. The increase in [La/Eu] as a function of [La/H] suggests that the increase in [La/Fe] with [Fe/H] is due to almost pure s-process enrichment. The most metal-rich star in our sample is not strongly La-enhanced, but is α-poor and may belong to a third "anomalous" stellar population. The two dominant populations exhibit the same [Na/Fe]-[Al/Fe] correlation found in other "normal" globular clusters. Therefore, NGC 6273 joins ω Centauri, M 22, M 2, and NGC 5286 as a possible new class of Galactic globular clusters.

  11. Structure and Function of 4-Hydroxyphenylacetate Decarboxylase and Its Cognate Activating Enzyme.

    PubMed

    Selvaraj, Brinda; Buckel, Wolfgang; Golding, Bernard T; Ullmann, G Matthias; Martins, Berta M

    2016-01-01

    4-Hydroxyphenylacetate decarboxylase (4Hpad) is the prototype of a new class of Fe-S cluster-dependent glycyl radical enzymes (Fe-S GREs) acting on aromatic compounds. The two-enzyme component system comprises a decarboxylase responsible for substrate conversion and a dedicated activating enzyme (4Hpad-AE). The decarboxylase uses a glycyl/thiyl radical dyad to convert 4-hydroxyphenylacetate into p-cresol (4-methylphenol) by a biologically unprecedented Kolbe-type decarboxylation. In addition to the radical dyad prosthetic group, the decarboxylase unit contains two [4Fe-4S] clusters coordinated by an extra small subunit of unknown function. 4Hpad-AE reductively cleaves S-adenosylmethionine (SAM or AdoMet) at a site-differentiated [4Fe-4S]^(2+/+) cluster (RS cluster) generating a transient 5'-deoxyadenosyl radical that produces a stable glycyl radical in the decarboxylase by the abstraction of a hydrogen atom. 4Hpad-AE binds up to two auxiliary [4Fe-4S] clusters coordinated by a ferredoxin-like insert that is C-terminal to the RS cluster-binding motif. The ferredoxin-like domain with its two auxiliary clusters is not vital for SAM-dependent glycyl radical formation in the decarboxylase, but facilitates a longer lifetime for the radical. This review describes the 4Hpad and cognate AE families and focuses on the recent advances and open questions concerning the structure, function and mechanism of this novel Fe-S-dependent class of GREs. © 2016 S. Karger AG, Basel.

  12. Recapitulation of Ayurveda constitution types by machine learning of phenotypic traits.

    PubMed

    Tiwari, Pradeep; Kutum, Rintu; Sethi, Tavpritesh; Shrivastava, Ankita; Girase, Bhushan; Aggarwal, Shilpi; Patil, Rutuja; Agarwal, Dhiraj; Gautam, Pramod; Agrawal, Anurag; Dash, Debasis; Ghosh, Saurabh; Juvekar, Sanjay; Mukerji, Mitali; Prasher, Bhavana

    2017-01-01

    In Ayurveda system of medicine individuals are classified into seven constitution types, "Prakriti", for assessing disease susceptibility and drug responsiveness. Prakriti evaluation involves clinical examination including questions about physiological and behavioural traits. A need was felt to develop models for accurately predicting Prakriti classes that have been shown to exhibit molecular differences. The present study was carried out on data of phenotypic attributes in 147 healthy individuals of three extreme Prakriti types, from a genetically homogeneous population of Western India. Unsupervised and supervised machine learning approaches were used to infer inherent structure of the data, and for feature selection and building classification models for Prakriti respectively. These models were validated in a North Indian population. Unsupervised clustering led to emergence of three natural clusters corresponding to three extreme Prakriti classes. The supervised modelling approaches could classify individuals, with distinct Prakriti types, in the training and validation sets. This study is the first to demonstrate that Prakriti types are distinct verifiable clusters within a multidimensional space of multiple interrelated phenotypic traits. It also provides a computational framework for predicting Prakriti classes from phenotypic attributes. This approach may be useful in precision medicine for stratification of endophenotypes in healthy and diseased populations.

  13. Detecting grizzly bear use of ungulate carcasses using global positioning system telemetry and activity data

    USGS Publications Warehouse

    Ebinger, Michael R.; Haroldson, Mark A.; van Manen, Frank T.; Costello, Cecily M.; Bjornlie, Daniel D.; Thompson, Daniel J.; Gunther, Kerry A.; Fortin, Jennifer K.; Teisberg, Justin E.; Pils, Shannon R; White, P J; Cain, Steven L.; Cross, Paul C.

    2016-01-01

    Global positioning system (GPS) wildlife collars have revolutionized wildlife research. Studies of predation by free-ranging carnivores have particularly benefited from the application of location clustering algorithms to determine when and where predation events occur. These studies have changed our understanding of large carnivore behavior, but the gains have concentrated on obligate carnivores. Facultative carnivores, such as grizzly/brown bears (Ursus arctos), exhibit a variety of behaviors that can lead to the formation of GPS clusters. We combined clustering techniques with field site investigations of grizzly bear GPS locations (n = 732 site investigations; 2004–2011) to produce 174 GPS clusters where documented behavior was partitioned into five classes (large-biomass carcass, small-biomass carcass, old carcass, non-carcass activity, and resting). We used multinomial logistic regression to predict the probability of clusters belonging to each class. Two cross-validation methods—leaving out individual clusters, or leaving out individual bears—showed that correct prediction of bear visitation to large-biomass carcasses was 78–88%, whereas the false-positive rate was 18–24%. As a case study, we applied our predictive model to a GPS data set of 266 bear-years in the Greater Yellowstone Ecosystem (2002–2011) and examined trends in carcass visitation during fall hyperphagia (September–October). We identified 1997 spatial GPS clusters, of which 347 were predicted to be large-biomass carcasses. We used the clustered data to develop a carcass visitation index, which varied annually, but more than doubled during the study period. Our study demonstrates the effectiveness and utility of identifying GPS clusters associated with carcass visitation by a facultative carnivore.
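
    The second cross-validation scheme in this abstract, leaving out individual bears, amounts to grouped cross-validation: every cluster from one animal is held out together so that the model is never tested on a bear it was trained on. A minimal sketch of the fold construction follows; the function name and the bear identifiers are hypothetical, and the regression model itself is not shown.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices), one fold per group."""
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train, test

# Hypothetical bear identifier for each of six GPS clusters.
bear_ids = ["bear_A", "bear_A", "bear_B", "bear_C", "bear_C", "bear_C"]
folds = list(leave_one_group_out(bear_ids))
```

    Leaving out individual clusters is the special case in which every cluster is its own group, for example by passing list(range(len(bear_ids))) as the groups.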

  14. Detecting grizzly bear use of ungulate carcasses using global positioning system telemetry and activity data.

    PubMed

    Ebinger, Michael R; Haroldson, Mark A; van Manen, Frank T; Costello, Cecily M; Bjornlie, Daniel D; Thompson, Daniel J; Gunther, Kerry A; Fortin, Jennifer K; Teisberg, Justin E; Pils, Shannon R; White, P J; Cain, Steven L; Cross, Paul C

    2016-07-01

    Global positioning system (GPS) wildlife collars have revolutionized wildlife research. Studies of predation by free-ranging carnivores have particularly benefited from the application of location clustering algorithms to determine when and where predation events occur. These studies have changed our understanding of large carnivore behavior, but the gains have concentrated on obligate carnivores. Facultative carnivores, such as grizzly/brown bears (Ursus arctos), exhibit a variety of behaviors that can lead to the formation of GPS clusters. We combined clustering techniques with field site investigations of grizzly bear GPS locations (n = 732 site investigations; 2004-2011) to produce 174 GPS clusters where documented behavior was partitioned into five classes (large-biomass carcass, small-biomass carcass, old carcass, non-carcass activity, and resting). We used multinomial logistic regression to predict the probability of clusters belonging to each class. Two cross-validation methods (leaving out individual clusters, or leaving out individual bears) showed that correct prediction of bear visitation to large-biomass carcasses was 78-88%, whereas the false-positive rate was 18-24%. As a case study, we applied our predictive model to a GPS data set of 266 bear-years in the Greater Yellowstone Ecosystem (2002-2011) and examined trends in carcass visitation during fall hyperphagia (September-October). We identified 1997 spatial GPS clusters, of which 347 were predicted to be large-biomass carcasses. We used the clustered data to develop a carcass visitation index, which varied annually, but more than doubled during the study period. Our study demonstrates the effectiveness and utility of identifying GPS clusters associated with carcass visitation by a facultative carnivore.

  15. Optimizing Scheme for Remote Preparation of Four-particle Cluster-like Entangled States

    NASA Astrophysics Data System (ADS)

    Wang, Dong; Ye, Liu

    2011-09-01

    Recently, Ma et al. (Opt. Commun. 283:2640, 2010) have proposed a novel scheme for preparing a class of cluster-like entangled states based on a four-particle projective measurement. In this paper, we put forward a new and optimal scheme to realize the remote preparation for this class of cluster-like states with the aid of two bipartite partially entangled channels. Different from the previous scheme, we employ a two-particle projective measurement instead of the four-particle projective measurement during the preparation. Besides, the resource consumptions are computed in our scheme, which include classical communication cost and quantum resource consumptions. Moreover, we have some discussions on the features of our scheme and make some comparisons on resource consumptions and operation complexity between the previous scheme and ours. The results show that our scheme is more economic and feasible compared with the previous.

  16. Clustering the lexicon in the brain: a meta-analysis of the neurofunctional evidence on noun and verb processing

    PubMed Central

    Crepaldi, Davide; Berlingeri, Manuela; Cattinelli, Isabella; Borghese, Nunzio A.; Luzzatti, Claudio; Paulesu, Eraldo

    2013-01-01

    Although it is widely accepted that nouns and verbs are functionally independent linguistic entities, it is less clear whether their processing recruits different brain areas. This issue is particularly relevant for those theories of lexical semantics (and, more in general, of cognition) that suggest the embodiment of abstract concepts, i.e., based strongly on perceptual and motoric representations. This paper presents a formal meta-analysis of the neuroimaging evidence on noun and verb processing in order to address this dichotomy more effectively at the anatomical level. We used a hierarchical clustering algorithm that grouped fMRI/PET activation peaks solely on the basis of spatial proximity. Cluster specificity for grammatical class was then tested on the basis of the noun-verb distribution of the activation peaks included in each cluster. Thirty-two clusters were identified: three were associated with nouns across different tasks (in the right inferior temporal gyrus, the left angular gyrus, and the left inferior parietal gyrus); one with verbs across different tasks (in the posterior part of the right middle temporal gyrus); and three showed verb specificity in some tasks and noun specificity in others (in the left and right inferior frontal gyrus and the left insula). These results do not support the popular tenets that verb processing is predominantly based in the left frontal cortex and noun processing relies specifically on temporal regions; nor do they support the idea that verb lexical-semantic representations are heavily based on embodied motoric information. Our findings suggest instead that the cerebral circuits deputed to noun and verb processing lie in close spatial proximity in a wide network including frontal, parietal, and temporal regions. The data also indicate a predominant—but not exclusive—left lateralization of the network. PMID:23825451
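
    The two steps this abstract describes, grouping activation peaks by spatial proximity alone and then inspecting the noun-verb distribution inside each cluster, can be sketched as follows. The abstract does not state the linkage rule, so single linkage with a distance cutoff is swapped in here for brevity; the function names, the 6 mm cutoff, and the peak coordinates are invented for illustration, and no statistical test of specificity is included.

```python
import math
from collections import Counter

def single_linkage_clusters(points, cutoff):
    """Label points so that any two points at most `cutoff` apart share a
    cluster (the same result as cutting a single-linkage tree at `cutoff`)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cutoff:
                parent[find(i)] = find(j)
    roots = [find(i) for i in range(len(points))]
    relabel = {}  # root index -> 0, 1, 2, ... in order of first appearance
    return [relabel.setdefault(r, len(relabel)) for r in roots]

def class_counts(labels, classes):
    """Count how many peaks of each grammatical class fall in each cluster."""
    counts = {}
    for label, cls in zip(labels, classes):
        counts.setdefault(label, Counter())[cls] += 1
    return counts

# Hypothetical activation peaks (x, y, z in mm) and the class each came from.
peaks = [(-44, 20, 10), (-46, 22, 8), (-45, 18, 12), (50, -60, 0), (52, -58, 2)]
peak_class = ["verb", "noun", "verb", "noun", "noun"]
labels = single_linkage_clusters(peaks, cutoff=6.0)
counts = class_counts(labels, peak_class)
```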

  17. Structures in magnetohydrodynamic turbulence: Detection and scaling

    NASA Astrophysics Data System (ADS)

    Uritsky, V. M.; Pouquet, A.; Rosenberg, D.; Mininni, P. D.; Donovan, E. F.

    2010-11-01

    We present a systematic analysis of statistical properties of turbulent current and vorticity structures at a given time using cluster analysis. The data stem from numerical simulations of decaying three-dimensional magnetohydrodynamic turbulence in the absence of an imposed uniform magnetic field; the magnetic Prandtl number is taken equal to unity, and we use a periodic box with grids of up to 1536^3 points and with Taylor Reynolds numbers up to 1100. The initial conditions are either an X-point configuration embedded in three dimensions, the so-called Orszag-Tang vortex, or an Arn’old-Beltrami-Childress configuration with a fully helical velocity and magnetic field. In each case two snapshots are analyzed, separated by one turn-over time, starting just after the peak of dissipation. We show that the algorithm is able to select a large number of structures (in excess of 8000) for each snapshot and that the statistical properties of these clusters are remarkably similar for the two snapshots as well as for the two flows under study in terms of scaling laws for the cluster characteristics, with the structures in the vorticity and in the current behaving in the same way. We also study the effect of Reynolds number on cluster statistics, and we finally analyze the properties of these clusters in terms of their velocity-magnetic-field correlation. Self-organized criticality features have been identified in the dissipative range of scales. A different scaling arises in the inertial range, which cannot be identified for the moment with a known self-organized criticality class consistent with magnetohydrodynamics. We suggest that this range can be governed by turbulence dynamics as opposed to criticality and propose an interpretation of intermittency in terms of propagation of local instabilities.

  18. Dyspnea descriptors developed in Brazil: application in obese patients and in patients with cardiorespiratory diseases.

    PubMed

    Teixeira, Christiane Aires; Rodrigues Júnior, Antonio Luiz; Straccia, Luciana Cristina; Vianna, Elcio Dos Santos Oliveira; Silva, Geruza Alves da; Martinez, José Antônio Baddini

    2011-01-01

    To develop a set of descriptive terms applied to the sensation of dyspnea (dyspnea descriptors) for use in Brazil and to investigate the usefulness of these descriptors in four distinct clinical conditions that can be accompanied by dyspnea. We collected 111 dyspnea descriptors from 67 patients and 10 health professionals. These descriptors were analyzed and reduced to 15 based on their frequency of use, similarity of meaning, and potential pathophysiological value. Those 15 descriptors were applied in 50 asthma patients, 50 COPD patients, 30 patients with heart failure, and 50 patients with class II or III obesity. The three best descriptors, as selected by the patients, were studied by cluster analysis. Potential associations between the identified clusters and the four clinical conditions were also investigated. The use of this set of descriptors led to a solution with seven clusters, designated sufoco (suffocating), aperto (tight), rápido (rapid), fadiga (fatigue), abafado (stuffy), trabalho/inspiração (work/inhalation), and falta de ar (shortness of breath). Overlapping of descriptors was quite common among the patients, regardless of their clinical condition. Asthma was significantly associated with the sufoco and trabalho/inspiração clusters, whereas COPD and heart failure were associated with the sufoco, trabalho/inspiração, and falta de ar clusters. Obesity was associated only with the falta de ar cluster. In Brazil, patients who are accustomed to perceiving dyspnea employ various descriptors in order to describe the symptom, and these descriptors can be grouped into similar clusters. In our study sample, such clusters showed no usefulness in differentiating among the four clinical conditions evaluated.

  19. Electronic medical records and physician stress in primary care: results from the MEMO Study

    PubMed Central

    Babbott, Stewart; Manwell, Linda Baier; Brown, Roger; Montague, Enid; Williams, Eric; Schwartz, Mark; Hess, Erik; Linzer, Mark

    2014-01-01

    Background: Little has been written about physician stress that may be associated with electronic medical records (EMR). Objective: We assessed relationships between the number of EMR functions, primary care work conditions, and physician satisfaction, stress and burnout. Design and participants: 379 primary care physicians and 92 managers at 92 clinics from New York City and the upper Midwest participating in the 2001–5 Minimizing Error, Maximizing Outcome (MEMO) Study. A latent class analysis identified clusters of physicians within clinics with low, medium and high EMR functions. Main measures: We assessed physician-reported stress, burnout, satisfaction, and intent to leave the practice, and predictors including time pressure during visits. We used a two-level regression model to estimate the mean response for each physician cluster to each outcome, adjusting for physician age, sex, specialty, work hours and years using the EMR. Effect sizes (ES) of these relationships were considered small (0.14), moderate (0.39), and large (0.61). Key results: Compared to the low EMR cluster, physicians in the moderate EMR cluster reported more stress (ES 0.35, p=0.03) and lower satisfaction (ES −0.45, p=0.006). Physicians in the high EMR cluster indicated lower satisfaction than low EMR cluster physicians (ES −0.39, p=0.01). Time pressure was associated with significantly more burnout, dissatisfaction and intent to leave only within the high EMR cluster. Conclusions: Stress may rise for physicians with a moderate number of EMR functions. Time pressure was associated with poor physician outcomes mainly in the high EMR cluster. Work redesign may address these stressors. PMID:24005796

  20. Comparative Study of Broadband Photometry Relations for Ultra-Diffuse and Normal Galaxies in the Coma Cluster

    NASA Astrophysics Data System (ADS)

    Stone, Maria Babakhanyan

    Ultra-diffuse galaxies are a novel type of galaxies discovered first in the Coma cluster. These objects are characterized simultaneously by large sizes and by very low counts of constituent stars. Conflicting theories have been proposed to explain how these large diffuse galaxies could have survived in the harsh environment of clusters. To date, thousands of these new galaxies have been identified in cluster environments. However, further studies are required to understand their relationship to the known giant and dwarf classes of galaxies. The purpose of this study is to compare the trends of inner and outer populations of normal members of the Coma cluster and ultra-diffuse galaxies in color-magnitude space. The present work used several astronomical catalogs to identify the member galaxies based on the coordinates of their positions and to extract available colors and magnitudes. We obtained correlations to convert colors and magnitudes from different systems into the common Sloan Digital Sky Survey system to facilitate the comparative analysis. We showed the quantitative relations describing the color-magnitude trends of galaxies in the core and the outskirts of the cluster. We confirmed that the inner and outer populations of ultra-diffuse galaxies exhibit an offset similar to the normal red sequence galaxies. We presented an initial assessment of stellar population ages and metallicities which correspond to the obtained color offsets. We surveyed the available images of the cluster for outliers, merger candidates, and candidate ultra-diffuse galaxies. We conclude that ultra-diffuse galaxies are an important part of the Coma cluster evolutionary history and future work is needed especially in obtaining spectroscopic data of a larger number of these dim galaxies.
