Science.gov

Sample records for mining approach identifies

  1. A novel pattern mining approach for identifying cognitive activity in EEG based functional brain networks.

    PubMed

    Thilaga, M; Vijayalakshmi, R; Nadarajan, R; Nandagopal, D

    2016-06-01

    The complex nature of neuronal interactions of the human brain has posed many challenges to the research community. To explore the underlying mechanisms of neuronal activity of cohesive brain regions during different cognitive activities, many innovative mathematical and computational models are required. This paper presents a novel Common Functional Pattern Mining approach to demonstrate the similar patterns of interactions due to common behavior of certain brain regions. The electrode sites of EEG-based functional brain network are modeled as a set of transactions and node-based complex network measures as itemsets. These itemsets are transformed into a graph data structure called Functional Pattern Graph. By mining this Functional Pattern Graph, the common functional patterns due to specific brain functioning can be identified. The empirical analyses show the efficiency of the proposed approach in identifying the extent to which the electrode sites (transactions) are similar during various cognitive load states. PMID:27401999
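
    As a rough illustration of the transaction/itemset modeling described in this abstract, the sketch below treats each electrode site as a transaction whose items are discretized node-level network measures, and scores how similar two sites are by Jaccard overlap. The electrode names, the measure labels, and the choice of Jaccard similarity are illustrative assumptions; this is not the authors' Functional Pattern Graph algorithm.

```python
from itertools import combinations

# Hypothetical transactions: electrode site -> set of discretized node measures.
transactions = {
    "Fz": {"degree=high", "clustering=high", "betweenness=low"},
    "Cz": {"degree=high", "clustering=high", "betweenness=high"},
    "Pz": {"degree=low", "clustering=low", "betweenness=low"},
}


def jaccard(a, b):
    """Share of items common to both itemsets (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(tx):
    """Return {(site_a, site_b): similarity} for every unordered pair of sites."""
    return {
        (a, b): jaccard(tx[a], tx[b])
        for a, b in combinations(sorted(tx), 2)
    }


similarity = pairwise_similarity(transactions)
```

    In this toy data, Cz and Fz share two of four distinct items, so they score higher than any pair involving Pz.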

  3. Computational Approaches for Mining GRO-Seq Data to Identify and Characterize Active Enhancers.

    PubMed

    Nagari, Anusha; Murakami, Shino; Malladi, Venkat S; Kraus, W Lee

    2017-01-01

    Transcriptional enhancers are DNA regulatory elements that are bound by transcription factors and act to positively regulate the expression of nearby or distally located target genes. Enhancers have many features that have been discovered using genomic analyses. Recent studies have shown that active enhancers recruit RNA polymerase II (Pol II) and are transcribed, producing enhancer RNAs (eRNAs). GRO-seq, a method for identifying the location and orientation of all actively transcribing RNA polymerases across the genome, is a powerful approach for monitoring nascent enhancer transcription. Furthermore, the unique pattern of enhancer transcription can be used to identify enhancers in the absence of any information about the underlying transcription factors. Here, we describe the computational approaches required to identify and analyze active enhancers using GRO-seq data, including data pre-processing, alignment, and transcript calling. In addition, we describe protocols and computational pipelines for mining GRO-seq data to identify active enhancers, as well as known transcription factor binding sites that are transcribed. Furthermore, we discuss approaches for integrating GRO-seq-based enhancer data with other genomic data, including target gene expression and function. Finally, we describe molecular biology assays that can be used to confirm and explore further the function of enhancers that have been identified using genomic assays. Together, these approaches should allow the user to identify and explore the features and biological functions of new cell type-specific enhancers. PMID:27662874

  4. An integrative data mining approach to identifying adverse outcome pathway signatures.

    PubMed

    Oki, Noffisat O; Edwards, Stephen W

    2016-03-28

    The Adverse Outcome Pathway (AOP) framework is a tool for making biological connections and summarizing key information across different levels of biological organization to connect biological perturbations at the molecular level to adverse outcomes for an individual or population. Computational approaches to explore and determine these connections can accelerate the assembly of AOPs. By leveraging the wealth of publicly available data covering chemical effects on biological systems, computationally-predicted AOPs (cpAOPs) were assembled via data mining of high-throughput screening (HTS) in vitro data, in vivo data and other disease phenotype information. Frequent Itemset Mining (FIM) was used to find associations between the gene targets of ToxCast HTS assays and disease data from Comparative Toxicogenomics Database (CTD) by using the chemicals as the common aggregators between datasets. The method was also used to map gene expression data to disease data from CTD. A cpAOP network was defined by considering genes and diseases as nodes and FIM associations as edges. This network contained 18,283 gene to disease associations for the ToxCast data and 110,253 for CTD gene expression. Two case studies show the value of the cpAOP network by extracting subnetworks focused either on fatty liver disease or the Aryl Hydrocarbon Receptor (AHR). The subnetwork surrounding fatty liver disease included many genes known to play a role in this disease. When querying the cpAOP network with the AHR gene, an interesting subnetwork including glaucoma was identified. While substantial literature exists to support the potential for AHR ligands to elicit glaucoma, it was not explicitly captured in the public annotation information in CTD. The subnetwork from this analysis suggests a cpAOP that includes changes in CYP1B1 expression, which has been previously established in the literature as a primary cause of glaucoma. These case studies highlight the value in integrating multiple data
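
    The idea of using chemicals as the common aggregator between gene and disease annotations can be sketched as a support count over gene-disease pairs, which is the two-item special case of frequent itemset mining. The chemical names, annotations, and the `min_support` threshold below are toy assumptions, not ToxCast or CTD data, and the sketch is not the authors' pipeline.

```python
from collections import Counter
from itertools import product

# Hypothetical chemical annotations (toy values, not ToxCast or CTD data).
chem_to_genes = {
    "chemA": {"AHR", "CYP1B1"},
    "chemB": {"AHR"},
    "chemC": {"PPARG"},
}
chem_to_diseases = {
    "chemA": {"glaucoma"},
    "chemB": {"glaucoma", "fatty liver"},
    "chemC": {"fatty liver"},
}


def gene_disease_edges(genes_by_chem, diseases_by_chem, min_support=2):
    """Count chemicals supporting each (gene, disease) pair; keep frequent pairs."""
    support = Counter()
    # Only chemicals annotated in both datasets can link a gene to a disease.
    for chem in genes_by_chem.keys() & diseases_by_chem.keys():
        for pair in product(genes_by_chem[chem], diseases_by_chem[chem]):
            support[pair] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}


# Each surviving pair would become an edge between a gene node and a disease node.
edges = gene_disease_edges(chem_to_genes, chem_to_diseases)
```

    With these toy annotations only the AHR/glaucoma pair is supported by two chemicals, so it is the only edge retained.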

  5. Towards an efficient computational mining approach to identify EST-SSR markers.

    PubMed

    Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Barooah, Madhumita; Modi, Mahendra Kumar; Talukdar, Anupam Das

    2012-01-01

    Microsatellites are the markers of choice due to their high abundance, reproducibility, degree of polymorphism and co-dominant nature. They are mainly used for studying genetic variability in different species and for marker-assisted selection. Expressed Sequence Tags (ESTs) serve as the main resource for Simple Sequence Repeats (SSRs). The computational approach for detecting SSRs and developing SSR markers from EST-SSRs is preferred over conventional methods as it reduces time and cost to a great extent. The available EST sequence databases, various web interfaces and standalone tools provide a platform for easy analysis of EST sequences, leading to the development of potential EST-SSR markers. This paper is an overview of the in silico approach to developing SSR markers from EST sequences using some of the most efficient tools that are freely available for academic purposes.
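
    To make the SSR detection step concrete, here is a minimal sketch that scans a sequence for tandem repeats with a regular expression. The motif length range (2 to 6 bases), the minimum of four repeats, and the example sequence are assumptions chosen for illustration; the tools surveyed in the paper use their own, configurable criteria.

```python
import re


def find_ssrs(sequence, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    Motifs are 2-6 bases long; the lazy quantifier prefers the shortest motif.
    """
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical EST fragment: an (AT)5 repeat and a (CAG)4 repeat with flanks.
est = "GGCTAG" + "AT" * 5 + "CCGTTT" + "CAG" * 4 + "TTA"
ssrs = find_ssrs(est)
```

    The returned start positions could then be used to pick flanking regions for primer design, which is the step that turns a detected repeat into a candidate marker.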

  6. Data mining approach identifies research priorities and data requirements for resolving the red algal tree of life

    PubMed Central

    2010-01-01

    Background: The assembly of the tree of life has seen significant progress in recent years but algae and protists have been largely overlooked in this effort. Many groups of algae and protists have ancient roots and it is unclear how much data will be required to resolve their phylogenetic relationships for incorporation in the tree of life. The red algae, a group of primary photosynthetic eukaryotes more than a billion years old, provide the earliest fossil evidence for eukaryotic multicellularity and sexual reproduction. Despite this evolutionary significance, their phylogenetic relationships are understudied. This study aims to infer a comprehensive red algal tree of life at the family level from a supermatrix containing data mined from GenBank. We aim to locate remaining regions of low support in the topology, evaluate their causes and estimate the amount of data required to resolve them. Results: Phylogenetic analysis of a supermatrix of 14 loci and 98 red algal families yielded the most complete red algal tree of life to date. Visualization of statistical support showed the presence of five poorly supported regions. Causes for low support were identified with statistics about the age of the region, data availability and node density, showing that poor support has different origins in different parts of the tree. Parametric simulation experiments yielded optimistic estimates of how much data will be needed to resolve the poorly supported regions (ca. 10^3 to ca. 10^4 nucleotides for the different regions). Nonparametric simulations gave a markedly more pessimistic image, some regions requiring more than 2.8 × 10^5 nucleotides or not achieving the desired level of support at all. The discrepancies between parametric and nonparametric simulations are discussed in light of our dataset and known attributes of both approaches. Conclusions: Our study takes the red algae one step closer to meaningful inclusion in the tree of life. In addition to the recovery of stable

  7. SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics.

    PubMed

    Bertone, P; Kluger, Y; Lan, N; Zheng, D; Christendat, D; Yee, A; Edwards, A M; Arrowsmith, C H; Montelione, G T; Gerstein, M

    2001-07-01

    High-throughput structural proteomics is expected to generate considerable amounts of data on the progress of structure determination for many proteins. For each protein this includes information about cloning, expression, purification, biophysical characterization and structure determination via NMR spectroscopy or X-ray crystallography. It will be essential to develop specifications and ontologies for standardizing this information to make it amenable to retrospective analysis. To this end we created the SPINE database and analysis system for the Northeast Structural Genomics Consortium. SPINE, which is available at bioinfo.mbb.yale.edu/nesg or nesg.org, is specifically designed to enable distributed scientific collaboration via the Internet. It was designed not just as an information repository but as an active vehicle to standardize proteomics data in a form that would enable systematic data mining. The system features an intuitive user interface for interactive retrieval and modification of expression construct data, query forms designed to track global project progress and external links to many other resources. Currently the database contains experimental data on 985 constructs, of which 740 are drawn from Methanobacterium thermoautotrophicum, 123 from Saccharomyces cerevisiae, 93 from Caenorhabditis elegans and the remainder from other organisms. We developed a comprehensive set of data mining features for each protein, including several related to experimental progress (e.g. expression level, solubility and crystallization) and 42 based on the underlying protein sequence (e.g. amino acid composition, secondary structure and occurrence of low complexity regions). We demonstrate in detail the application of a particular machine learning approach, decision trees, to the tasks of predicting a protein's solubility and propensity to crystallize based on sequence features. We are able to extract a number of key rules from our trees, in particular that soluble

  8. Federal Mine Safety and Health Act - identifying opportunities for partnership

    SciTech Connect

    Beverage, L.

    1996-12-31

    Opportunities for partnership provided by the Federal Mine Safety and Health Act (FMSHA) are identified. These opportunities include: putting the FMSHA into perspective; criticisms of the Mine Act: legislative reform initiatives; understanding FMSHA's inspections and investigation system; understanding FMSHA's enforcement tools; developing a partnership built on mutual respect; and post-enforcement challenges.

  9. A data mining approach to intelligence operations

    NASA Astrophysics Data System (ADS)

    Memon, Nasrullah; Hicks, David L.; Harkiolakis, Nicholas

    2008-03-01

    In this paper we examine the latest thinking, approaches and methodologies in use for finding the nuggets of information and subliminal (and perhaps intentionally hidden) patterns and associations that are critical to identify criminal activity and suspects to private and government security agencies. An emphasis in the paper is placed on Social Network Analysis and Investigative Data Mining, and the use of these technologies in the counterterrorism domain. Tools and techniques from both areas are described, along with the important tasks for which they can be used to assist with the investigation and analysis of terrorist organizations. The process of collecting data about these organizations is also considered along with the inherent difficulties that are involved.

  10. Implementation of an original approach on the Mines-Douai Comparative Reactivity Method (MD-CRM) instrument to identify part of the missing OH reactivity at an urban site

    NASA Astrophysics Data System (ADS)

    Dusanter, S.; Michoud, V.; Leonardis, T.; Riffault, V.; Zhang, S.; Locoge, N.

    2015-12-01

    Due to the large number of Volatile Organic Compounds (VOCs) expected in the atmosphere (10^4 to 10^5) (Goldstein and Galbally, ES&T, 2007), exhaustive measurements of VOCs appear to be currently unfeasible using common analytical techniques. In this context, measurements of the total sink of OH, referred to as total OH reactivity, can provide a critical test to assess the completeness of trace gas measurements during field campaigns. This can be done by comparing the measured total OH reactivity to values calculated from trace gas measurements. Indeed, large discrepancies are usually found between measured and calculated OH reactivity values, revealing the presence of important unmeasured reactive species, which have yet to be identified. A Comparative Reactivity Method (CRM) instrument has been set up at Mines Douai to allow sequential measurements of VOCs and OH reactivity using the same Proton Transfer Reaction-Time of Flight Mass Spectrometer. This approach aims at identifying unmeasured reactive VOCs based on a method proposed by Kato et al. (Atmos. Environ., 2011), taking advantage of VOC oxidations occurring in the CRM sampling reactor. MD-CRM was deployed at an urban site in Dunkirk (France) during July 2014 to test this new approach. During this campaign, a large fraction of the OH reactivity was not explained by collocated measurements of trace gases (67% on average). In this presentation, we will first describe the approach that was implemented in the CRM instrument to identify part of the observed missing OH reactivity and we will then discuss the OH reactivity budget regarding the origin of air masses reaching the measurement site.
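
    The comparison between measured and calculated OH reactivity can be written as a short calculation: the calculated value is the sum of rate constant times concentration over the measured species, and the missing fraction is the unexplained share of the measured total. The species names, rate constants, and concentrations below are placeholder values for illustration only, not data from the Dunkirk campaign.

```python
def calculated_oh_reactivity(species):
    """Sum k_i * [X_i] over measured species.

    species maps a name to (k in cm^3 molecule^-1 s^-1, concentration in
    molecule cm^-3); the result is in s^-1.
    """
    return sum(k * conc for k, conc in species.values())


def missing_fraction(measured, calculated):
    """Fraction of the measured total OH reactivity left unexplained."""
    if measured <= 0:
        raise ValueError("measured reactivity must be positive")
    return (measured - calculated) / measured


# Placeholder species with illustrative rate constants and concentrations.
species = {
    "X1": (1.0e-10, 2.0e10),  # contributes about 2 s^-1
    "X2": (2.0e-13, 5.0e12),  # contributes about 1 s^-1
}

calculated = calculated_oh_reactivity(species)
fraction = missing_fraction(measured=9.0, calculated=calculated)
```

    Here about two thirds of the assumed 9 s^-1 total is unexplained by the two placeholder species.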

  11. Mining for Murder-Suicide: An Approach to Identifying Cases of Murder-Suicide in the National Violent Death Reporting System Restricted Access Database.

    PubMed

    McNally, Matthew R; Patton, Christina L; Fremouw, William J

    2016-01-01

    The National Violent Death Reporting System (NVDRS) is a United States Centers for Disease Control and Prevention (CDC) database of violent deaths from 2003 to the present. The NVDRS collects information from 32 states on several types of violent deaths, including suicides, homicides, homicides followed by suicides, and deaths resulting from child maltreatment or intimate partner violence, as well as legal intervention and accidental firearm deaths. Despite the availability of data from police narratives, medical examiner reports, and other sources, reliably finding the cases of murder-suicide in the NVDRS has proven problematic due to the lack of a unique code for murder-suicide incidents and outdated descriptions of case-finding procedures from previous researchers. By providing a description of the methods used to access the NVDRS and the coding procedures used to decipher these data, the authors seek to assist future researchers in correctly identifying cases of murder-suicide deaths while avoiding false positives.

  12. Identifying Engineering Students' English Sentence Reading Comprehension Errors: Applying a Data Mining Technique

    ERIC Educational Resources Information Center

    Tsai, Yea-Ru; Ouyang, Chen-Sen; Chang, Yukon

    2016-01-01

    The purpose of this study is to propose a diagnostic approach to identify engineering students' English reading comprehension errors. Student data were collected during the process of reading texts of English for science and technology on a web-based cumulative sentence analysis system. For the analysis, the association-rule, data mining technique…

  13. Identifying the Cause of Toxicity of a Saline Mine Water

    PubMed Central

    van Dam, Rick A.; Harford, Andrew J.; Lunn, Simon A.; Gagnon, Marthe M.

    2014-01-01

    Elevated major ions (or salinity) are recognised as being a key contributor to the toxicity of many mine waste waters, but the complex interactions between the major ions and large inter-species variability in response to salinity make it difficult to relate toxicity to causal factors. This study aimed to determine if the toxicity of a typical saline seepage water was solely due to its major ion constituents, and to determine which major ions were the leading contributors to the toxicity. Standardised toxicity tests using two tropical freshwater species, Chlorella sp. (alga) and Moinodaphnia macleayi (cladoceran), were used to compare the toxicity of 1) mine and synthetic seepage water; 2) key major ions (e.g. Na, Cl, SO4 and HCO3); 3) synthetic seepage waters that were modified by excluding key major ions. For Chlorella sp., the toxicity of the seepage water was not solely due to its major ion concentrations because there were differences in effects caused by the mine seepage and synthetic seepage. However, for M. macleayi this hypothesis was supported because similar effects were caused by mine seepage and synthetic seepage. Sulfate was identified as a major ion that could predict the toxicity of the synthetic waters, which might be expected as it was the dominant major ion in the seepage water. However, sulfate was not the primary cause of toxicity in the seepage water and electrical conductivity was a better predictor of effects. Ultimately, the results show that specific major ions do not clearly drive the toxicity of saline seepage waters and the effects are probably due to the electrical conductivity of the mine waste waters. PMID:25180579

  15. Data Mining and Pattern Recognition Models for Identifying Inherited Diseases: Challenges and Implications

    PubMed Central

    Iddamalgoda, Lahiru; Das, Partha S.; Aponso, Achala; Sundararajan, Vijayaraghava S.; Suravajhala, Prashanth; Valadi, Jayaraman K.

    2016-01-01

    Data mining and pattern recognition methods reveal interesting findings in genetic studies, especially on how the genetic makeup is associated with inherited diseases. Although researchers have proposed various data mining models for biomedical approaches, there remains a challenge in accurately prioritizing the single nucleotide polymorphisms (SNPs) associated with the disease. In this commentary, we review the state-of-the-art data mining and pattern recognition models for identifying inherited diseases and deliberate on the need for binary classification- and scoring-based prioritization methods in determining causal variants. While we discuss the known pros and cons associated with these methods, we argue that the gene prioritization methods and the protein interaction (PPI) methods, in conjunction with K nearest neighbors, could be used in accurately categorizing the genetic factors in disease causation. PMID:27559342

  18. Microarray data analysis and mining approaches.

    PubMed

    Cordero, Francesca; Botta, Marco; Calogero, Raffaele A

    2007-12-01

    Microarray based transcription profiling is now a consolidated methodology and has widespread use in areas such as pharmacogenomics, diagnostics and drug target identification. Large-scale microarray studies are also becoming crucial to a new way of conceiving experimental biology. A main issue in microarray transcription profiling is data analysis and mining. When microarrays became a methodology of general use, considerable effort was made to produce algorithms and methods for the identification of differentially expressed genes. More recently, the focus has switched to algorithms and database development for microarray data mining. Furthermore, the evolution of microarray technology is allowing researchers to grasp the regulative nature of transcription, integrating basic expression analysis with mRNA characteristics, i.e. exon-based arrays, and with DNA characteristics, i.e. comparative genomic hybridization, single nucleotide polymorphism, tiling and promoter structure. In this article, we will review approaches used to detect differentially expressed genes and to link differential expression to specific biological functions.

  19. Pennsylvania's approach to underground coal mine permitting and long-term mine pool management

    SciTech Connect

    Callaghan, T.; Koricich, J.

    1999-07-01

    Pennsylvania's underground coal mine permitting process has two goals: first, to ensure that the mining and reclamation plan is designed to minimize adverse environmental impacts; and second, to minimize interference with the applicant's recovery of coal. A successful review process includes the consistent evaluation of mine site hydrology through scrutiny of key indicators of mining-induced, adverse hydrologic consequences. This allows the regulatory agency to assess the potential for mining-related impacts as well as cumulative impacts throughout the proposed mine area and adjacent area. General trends have been identified regarding quality of underground mine drainage versus coal seam mined. However, the large number of factors controlling the final mine pool chemistry along with the lack of focused research have combined to stunt the development of reliable methodologies for the prediction of postmining water quality. Absent reliable predictive methodologies, mine layout has become the best demonstrated technology for pollution prevention. Strategies include: (1) promotion of postmining inundation by down-dip development with proper location of mine openings and sizing and location of barriers; (2) restriction of mining to zones within the groundwater system where flow is relatively lethargic and time of travel is great when compared to natural mine pool amelioration time frames; and (3) mining in zones remote from groundwater discharge areas and features which may serve to short-circuit mine water to nearby existing water-supply aquifers or to the surface. This paper discusses Pennsylvania's application process for underground bituminous coal mines. It briefly outlines Pennsylvania's statutory history relating to mine discharges, touches on some of the tools permit reviewers use to evaluate the hydrology of proposed underground mining sites, and discusses the key factors that permit reviewers consider in assessing potential postmining mine pool levels.

  20. Mines and human casualties: a robotics approach toward mine clearing

    NASA Astrophysics Data System (ADS)

    Ghaffari, Masoud; Manthena, Dinesh; Ghaffari, Alireza; Hall, Ernest L.

    2004-10-01

    An estimated 100 million landmines planted in more than 60 countries kill or maim thousands of civilians every year. Millions of people live in vast dangerous areas and are unable to access basic human services because of the threat of landmines. This problem has affected many third world countries and poor nations which are not able to afford high-cost solutions. This paper presents some experiences with landmine victims and solutions for mine clearing. It studies the current situation of this crisis as well as state-of-the-art robotics technology for mine clearing. It also introduces a survey robot which is suitable for mine clearing applications. The results show that, in addition to technical aspects, this problem has many socio-economic issues. The significance of this study is to draw robotics researchers toward this topic and to pursue the technical and humanitarian facets of this issue.

  1. Systematic evaluation of satellite remote sensing for identifying uranium mines and mills.

    SciTech Connect

    Blair, Dianna Sue; Stork, Christopher Lyle; Smartt, Heidi Anne; Smith, Jody Lynn

    2006-01-01

    In this report, we systematically evaluate the ability of current-generation, satellite-based spectroscopic sensors to distinguish uranium mines and mills from other mineral mining and milling operations. We perform this systematic evaluation by (1) outlining the remote, spectroscopic signal generation process, (2) documenting the capabilities of current commercial satellite systems, (3) systematically comparing the uranium mining and milling process to other mineral mining and milling operations, and (4) identifying the most promising observables associated with uranium mining and milling that can be identified using satellite remote sensing. The Ranger uranium mine and mill in Australia serves as a case study where we apply and test the techniques developed in this systematic analysis. Based on literature research of mineral mining and milling practices, we develop a decision tree which utilizes the information contained in one or more observables to determine whether uranium is possibly being mined and/or milled at a given site. Promising observables associated with uranium mining and milling at the Ranger site included in the decision tree are uranium ore, sulfur, the uranium pregnant leach liquor, ammonia, and uranyl compounds and sulfate ion disposed of in the tailings pond. Based on the size, concentration, and spectral characteristics of these promising observables, we then determine whether these observables can be identified using current commercial satellite systems, namely Hyperion, ASTER, and Quickbird. We conclude that the only promising observables at Ranger that can be uniquely identified using a current commercial satellite system (notably Hyperion) are magnesium chlorite in the open pit mine and the sulfur stockpile. Based on the identified magnesium chlorite and sulfur observables, the decision tree narrows the possible mineral candidates at Ranger to uranium, copper, zinc, manganese, vanadium, the rare earths, and phosphorus, all of which are

  2. Design approaches in quarrying and pit-mining reclamation

    USGS Publications Warehouse

    Arbogast, Belinda F.

    1999-01-01

    Reclaimed mine sites have been evaluated so that the public, industry, and land planners may recognize there are innovative designs available for consideration and use. People tend to see cropland, range, and road cuts as a necessary part of their everyday life, not as disturbed areas despite their high visibility. Mining also generates a disturbed landscape, unfortunately one that many consider waste until reclaimed by human beings. The development of mining provides an economic base and use of a natural resource to improve the quality of human life. Equally important is a sensitivity to the geologic origin and natural pattern of the land. Wisely shaping our environment requires a design plan and product that responds to a site's physiography, ecology, function, artistic form, and public perception. An examination of selected sites for their landscape design suggested nine approaches for mining reclamation. The oldest design approach around is nature itself. Humans may sometimes do more damage going to an area in the attempt to repair it. Given enough geologic time, a small-site area, and stable adjacent ecosystems, disturbed areas recover without mankind's input. Visual screens and buffer zones conceal the facility in a camouflage approach. Typically, earth berms, fences, and plantings are used to disguise the mining facility. Restoration targets social or economic benefits by reusing the site for public amenities, most often in urban centers with large populations. A mitigation approach attempts to protect the environment and return mined areas to use with scientific input. The reuse of cement, building rubble, and macadam meets only about 10% of the demand for aggregate. Recognizing the limited supply of mineral resources and encouraging recycling efforts are steps in a renewable resource approach. An educative design approach effectively communicates mining information through outreach, land stewardship, and community service. Mine sites used for

  3. Experimental approaches for identifying schizophrenia risk genes.

    PubMed

    Mantripragada, Kiran K; Carroll, Liam S; Williams, Nigel M

    2010-01-01

    Schizophrenia is a severe, debilitating and common psychiatric disorder, which directly affects approximately 1% of the population worldwide. Although previous studies have unequivocally shown that schizophrenia has a strong genetic component, our understanding of its pathophysiology remains limited. The precise genetic architecture of schizophrenia remains elusive and is likely to be complex. It is believed that multiple genetic variants, each contributing a modest effect on disease risk, interact with environmental factors to produce the phenotype. In this chapter, we summarise the main molecular genetic approaches that have been utilised in identifying susceptibility genes for schizophrenia and discuss the advantages and disadvantages of each approach. First, we detail the findings of linkage mapping in pedigrees (affected families), which analyses the co-segregation of polymorphic genetic markers with disease phenotype. Second, the contribution of targeted and genome-wide association studies, which compare differential allelic frequencies in schizophrenia cases and matched controls, is presented. Third, we discuss the identification of susceptibility genes through analysis of chromosomal structural variation (gains and losses of genetic material). Lastly, we introduce the concept of re-sequencing, where the entire genome/exome is sequenced both in affected and unaffected individuals. This approach has the potential to provide a clarified picture of the majority of the genetic variation underlying disease pathogenesis. PMID:21312414

  4. Graduates employment classification using data mining approach

    NASA Astrophysics Data System (ADS)

    Aziz, Mohd Tajul Rizal Ab; Yusof, Yuhanis

    2016-08-01

    Data mining is a platform for extracting hidden knowledge from a collection of data. This study investigates a suitable classification model for classifying graduates' employment at one of the MARA Professional Colleges (KPM) in Malaysia. The aim is to classify the graduates as employed, unemployed, or in further study. Five data mining algorithms offered in WEKA were used: Naïve Bayes, logistic regression, multilayer perceptron, k-nearest neighbor, and decision tree J48. Based on the results obtained, logistic regression produces the highest classification accuracy, at 92.5%. This result was obtained using 80% of the data for training and 20% for testing. The resulting classification model will benefit the management of the college, as it provides insight into the quality of the graduates the college produces and how its curriculum can be improved to cater to the needs of industry.
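
The 80%/20% hold-out evaluation mentioned in this abstract can be sketched as follows. This is a minimal illustration only: the toy records, the random seed, and the majority-class "classifier" are all assumptions made here, not the WEKA models or data used in the study.

```python
import random
from collections import Counter


def holdout_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and split it into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical toy data: (graduate id, label); labels follow the three classes
# named in the abstract.
records = (
    [(i, "employed") for i in range(6)]
    + [(i, "unemployed") for i in range(6, 8)]
    + [(i, "further study") for i in range(8, 10)]
)

train, test = holdout_split(records)

# Stand-in model (an assumption): always predict the most common training label.
majority_label = Counter(label for _, label in train).most_common(1)[0][0]
predictions = [majority_label for _ in test]

print(len(train), len(test))
print(accuracy([label for _, label in test], predictions))
```

Any real comparison of classifiers would replace the majority-label stand-in with trained models and report accuracy on the same held-out 20%.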

  5. A proactive approach to sustainable management of mine tailings

    NASA Astrophysics Data System (ADS)

    Edraki, Mansour; Baumgartl, Thomas

    2015-04-01

    The reactive strategies to manage mine tailings, i.e. containment of slurries of tailings in tailings storage facilities (TSFs) and remediation of tailings solids or tailings seepage water after the decommissioning of those facilities, can be technically inefficient at eliminating environmental risks (e.g. preventing dispersion of contaminants and catastrophic dam wall failures), pose a long-term economic burden for companies, governments and society after mine closure, and often fail to meet community expectations. Most preventive environmental management practices promote proactive integrated approaches to waste management whereby the sources of environmental issues are identified to help make more informed decisions. They often use life cycle assessment to find the "hot spots" of environmental burdens. This kind of approach is often based on generic data and has rarely been used for tailings. Besides, life cycle assessments are less useful for designing operations or simulating changes in the process and consequent environmental outcomes. It is evident that an integrated approach for tailings research linked to better processing options is needed. A literature review revealed that there are only a few examples of integrated approaches. The aim of this project is to develop new tailings management models by streamlining orebody characterization, process optimization and rehabilitation. The approach is based on continuous fingerprinting of geochemical processes from orebody to tailings storage facility, and on benchmarking the success of such proactive initiatives by evidence of no impacts and no future projected impacts on receiving environments. We present an approach for developing such a framework and preliminary results from a case study where combined grinding and flotation models developed using geometallurgical data from the orebody were constructed to predict the properties of tailings produced under various processing scenarios.
The modelling scenarios based on the

  6. Current approaches for mitigating acid mine drainage.

    PubMed

    Sahoo, Prafulla Kumar; Kim, Kangjoo; Equeenuddin, Sk Md; Powell, Michael A

    2013-01-01

    AMD is one of the critical environmental problems that causes acidification and metal contamination of surface and ground water bodies when mine materials and/or overburden containing metal sulfides are exposed to oxidizing conditions. The best option to limit AMD is early avoidance of sulfide oxidation. Several techniques are available to achieve this. In this paper, we review all of the major methods now used to limit sulfide oxidation. These fall into five categories: (1) physical barriers, (2) bacterial inhibition, (3) chemical passivation, (4) electrochemical, and (5) desulfurization. We describe the processes underlying each method by category and then address aspects relating to effectiveness, cost, and environmental impact. This paper may help researchers and environmental engineers to select suitable methods for addressing site-specific AMD problems. Irrespective of the mechanism by which each method works, all share one common feature, i.e., they delay or prevent oxidation. In addition, all have limitations. Physical barriers such as wet or dry cover have retarded sulfide oxidation in several studies; however, both wet and dry barriers exhibit only short-term effectiveness. Wet cover is suitable at specific sites where complete inundation is established, but this approach requires high maintenance costs. When employing dry cover, plastic liners are expensive and rarely used for large volumes of waste. Bactericides can suppress oxidation, but are only effective on fresh tailings and short-lived, and do not serve as a permanent solution to AMD. In addition, application of bactericides may be toxic to aquatic organisms. Encapsulation or passivation of sulfide surfaces (applying organic and/or inorganic coatings) is simple and effective in preventing AMD. Among inorganic coatings, silica is the most promising, stable, acid-resistant and long lasting, as compared to phosphate and other inorganic coatings. Permanganate passivation is also promising because it

  9. Wastewater treatment polymers identified as the toxic component of a diamond mine effluent.

    PubMed

    De Rosemond, Simone J C; Liber, Karsten

    2004-09-01

    The Ekati Diamond Mine, located approximately 300 km northeast of Yellowknife in Canada's Northwest Territories, uses mechanical crushing and washing processes to extract diamonds from kimberlite ore. The processing plant's effluent contains kimberlite ore particles (≤0.5 mm), wastewater, and two wastewater treatment polymers, a cationic polydiallyldimethylammonium chloride (DADMAC) polymer and an anionic sodium acrylate polyacrylamide (PAM) polymer. A series of acute (48-h) and chronic (7-d) toxicity tests determined the processed kimberlite effluent (PKE) was chronically, but not acutely, toxic to Ceriodaphnia dubia. Reproduction of C. dubia was inhibited significantly at concentrations as low as 12.5% PKE. Toxicity identification evaluations (TIE) were initiated to identify the toxic component of PKE. Ethylenediaminetetraacetic acid (EDTA), sodium thiosulfate, aeration, and solid phase extraction with C-18 manipulations failed to reduce PKE toxicity. Toxicity was reduced significantly by pH adjustments to pH 3 or 11 followed by filtration. Toxicity testing with C. dubia determined that the cationic DADMAC polymer had a 48-h median lethal concentration (LC50) of 0.32 mg/L and 7-d median effective concentration (EC50) of 0.014 mg/L. The anionic PAM polymer had a 48-h LC50 of 218 mg/L. A weight-of-evidence approach, using the data obtained from the TIE, the polymer toxicity experiments, the estimated concentration of the cationic polymer in the kimberlite effluent, and the behavior of kimberlite minerals in pH-adjusted solutions, provided sufficient evidence to identify the cationic DADMAC polymer as the toxic component of the diamond mine PKE. PMID:15379002

  10. Mining the Metabiome: Identifying Novel Natural Products from Microbial Communities

    PubMed Central

    Milshteyn, Aleksandr; Schneider, Jessica S.; Brady, Sean F.

    2014-01-01

    Summary Microbial-derived natural products provide the foundation for most of the chemotherapeutic arsenal available to contemporary medicine. In the face of a dwindling pipeline of new lead structures identified by traditional culturing techniques and an increasing need for new therapeutics, surveys of microbial biosynthetic diversity across environmental metabiomes have revealed enormous reservoirs of as yet untapped natural products chemistry. In this review we touch on the historical context of microbial natural product discovery and discuss innovations and technological advances that are facilitating culture-dependent and culture-independent access to new chemistry from environmental microbiomes with the goal of re-invigorating the small molecule therapeutics discovery pipeline. We highlight the successful strategies that have emerged and some of the challenges that must be overcome to enable the development of high-throughput methods for natural product discovery from complex microbial communities. PMID:25237864

  11. Multisensor neural network approach to mine detection

    NASA Astrophysics Data System (ADS)

    Iler, Amber L.; Marble, Jay A.; Rauss, Patrick J.

    2001-10-01

    A neural network is applied to data collected by the close-in detector for the Mine Hunter Killer (MHK) project with promising results. We use the ground penetrating radar (GPR) and metal detector to create three channels (two from the GPR) and train a basic, two layer (single hidden layer), feed-forward neural network. By experimenting with the number of hidden nodes and training goals, we were able to surpass the performance of the single sensors when we fused the three channels via our neural network and applied the trained net to different data. The fused sensors exceeded the best single sensor performance above 95 percent detection by providing a lower, but still high, false alarm rate. And though our three channel neural net worked best, we saw an increase in performance with fewer than three channels, as well.
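
As a rough illustration of the architecture this abstract describes (three input channels, a single hidden layer, one fused output), a forward pass might look like the sketch below. The hidden-layer size, the weights, and the sigmoid activation are assumptions made for illustration; the abstract does not specify them, and training is omitted entirely.

```python
import math


def sigmoid(x):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a feed-forward net with one hidden layer and one output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical inputs: two GPR-derived channels and one metal-detector channel.
channels = [0.7, 0.2, 0.9]

# Hypothetical, untrained weights for two hidden nodes.
hidden_weights = [[0.5, -0.3, 0.8], [-0.6, 0.4, 0.1]]
hidden_biases = [0.0, 0.1]
output_weights = [1.2, -0.7]
output_bias = -0.2

score = forward(channels, hidden_weights, hidden_biases, output_weights, output_bias)
print(score)  # a fused score in (0, 1); a threshold would turn it into a detection
```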

  12. Data mining approach to model the diagnostic service management.

    PubMed

    Lee, Sun-Mi; Lee, Ae-Kyung; Park, Il-Su

    2006-01-01

    Korea has a National Health Insurance Program operated by the government-owned National Health Insurance Corporation, and diagnostic services are provided every two years for the insured and their family members. Developing a customer relationship management (CRM) system using data mining technology would be useful to improve the performance of diagnostic service programs. Under these circumstances, this study developed a model for diagnostic service management taking into account the characteristics of subjects using a data mining approach. This study could be further used to develop an automated CRM system contributing to the increase in the rate of receiving diagnostic services. PMID:17102454

  13. A Node Linkage Approach for Sequential Pattern Mining

    PubMed Central

    Navarro, Osvaldo; Cumplido, René; Villaseñor-Pineda, Luis; Feregrino-Uribe, Claudia; Carrasco-Ochoa, Jesús Ariel

    2014-01-01

    Sequential Pattern Mining is a widely addressed problem in data mining, with applications such as analyzing Web usage, examining purchase behavior, and text mining, among others. Nevertheless, with the dramatic increase in data volume, the current approaches prove inefficient when dealing with large input datasets, a large number of different symbols and low minimum supports. In this paper, we propose a new sequential pattern mining algorithm, which follows a pattern-growth scheme to discover sequential patterns. Unlike most pattern growth algorithms, our approach does not build a data structure to represent the input dataset, but instead accesses the required sequences through pseudo-projection databases, achieving better runtime and reducing memory requirements. Our algorithm traverses the search space in a depth-first fashion and only preserves in memory a pattern node linkage and the pseudo-projections required for the branch being explored at the time. Experimental results show that our new approach, the Node Linkage Depth-First Traversal algorithm (NLDFT), has better performance and scalability in comparison with state of the art algorithms. PMID:24933123
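
The general idea of pattern growth over pseudo-projections, as described in this abstract, can be sketched as below. This is a generic PrefixSpan-style illustration for sequences of single symbols, not the NLDFT algorithm itself, whose node-linkage structure the abstract does not detail. The function name and the toy sequences are assumptions.

```python
def mine_sequences(sequences, min_support):
    """Return {pattern tuple: support} for all frequent sequential patterns.

    A pseudo-projection is a (sequence index, start offset) pair: instead of
    copying suffixes, only pointers into the original sequences are kept.
    """
    results = {}

    def grow(prefix, projections):
        # For each symbol, collect one pointer per supporting sequence: the
        # position just after the symbol's first occurrence in the suffix.
        first_positions = {}
        for seq_id, start in projections:
            seen = set()
            seq = sequences[seq_id]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    seen.add(item)
                    first_positions.setdefault(item, []).append((seq_id, pos + 1))
        for item, next_projections in sorted(first_positions.items()):
            support = len(next_projections)
            if support >= min_support:
                pattern = prefix + (item,)
                results[pattern] = support
                # Depth-first: extend this pattern before trying its siblings.
                grow(pattern, next_projections)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy dataset of three symbol sequences.
toy = [["a", "b", "c"], ["a", "c", "b"], ["a", "b"]]
patterns = mine_sequences(toy, min_support=2)
for pattern, support in sorted(patterns.items()):
    print(pattern, support)
```

Only the pointers for the branch currently being explored are alive on the call stack, which is the memory-saving property the abstract attributes to depth-first traversal with pseudo-projections.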

  14. Using Helicopter Electromagnetic Surveys to Identify Potential Hazards at Mine Waste Impoundments

    SciTech Connect

    Hammack, R.W.

    2008-01-01

    In July 2003, helicopter electromagnetic surveys were conducted at 14 coal waste impoundments in southern West Virginia. The purpose of the surveys was to detect conditions that could lead to impoundment failure either by structural failure of the embankment or by the flooding of adjacent or underlying mine works. Specifically, the surveys attempted to: 1) identify saturated zones within the mine waste, 2) delineate filtrate flow paths through the embankment or into adjacent strata and receiving streams, and 3) identify flooded mine workings underlying or adjacent to the waste impoundment. Data from the helicopter surveys were processed to generate conductivity/depth images. Conductivity/depth images were then spatially linked to georeferenced air photos or topographic maps for interpretation. Conductivity/depth images were found to provide a snapshot of the hydrologic conditions that exist within the impoundment. This information can be used to predict potential areas of failure within the embankment because of its ability to image the phreatic zone. Also, the electromagnetic survey can identify areas of unconsolidated slurry in the decant basin and beneath the embankment. Although shallow, flooded mineworks beneath the impoundment were identified by this survey, it cannot be assumed that electromagnetic surveys can detect all underlying mines. A preliminary evaluation of the data implies that helicopter electromagnetic surveys can provide a better understanding of the phreatic zone than the piezometer arrays that are typically used.

  15. Data Mining Approaches for Modeling Complex Electronic Circuit Design Activities

    SciTech Connect

    Kwon, Yongjin; Omitaomu, Olufemi A; Wang, Gi-Nam

    2008-01-01

    A printed circuit board (PCB) is an essential part of modern electronic circuits. It is made of a flat panel of insulating materials with patterned copper foils that act as electric pathways for various components such as ICs, diodes, capacitors, resistors, and coils. The size of PCBs has been shrinking over the years, while the number of components mounted on these boards has increased considerably. This trend makes the design and fabrication of PCBs ever more difficult. At the beginning of design cycles, it is important to accurately estimate the time needed to complete the required steps, based on many factors such as the required parts, approximate board size and shape, and a rough sketch of schematics. The current approach uses a multiple linear regression (MLR) technique for time and cost estimations. However, the need for accurate predictive models continues to grow as the technology becomes more advanced. In this paper, we analyze a large volume of historical PCB design data, extract some important variables, and develop predictive models based on the extracted variables using a data mining approach. The data mining approach uses an adaptive support vector regression (ASVR) technique; the benchmark model used is the MLR technique currently being used in the industry. The strengths of SVR for this data include its ability to represent data in high-dimensional space through kernel functions. The computational results show that a data mining approach is a better prediction technique for this data. Our approach reduces computation time and enhances the practical applications of the SVR technique.

  17. Mining Clinicians' Electronic Documentation to Identify Heart Failure Patients with Ineffective Self-Management: A Pilot Text-Mining Study.

    PubMed

    Topaz, Maxim; Radhakrishnan, Kavita; Lei, Victor; Zhou, Li

    2016-01-01

    Effective self-management can decrease up to 50% of heart failure hospitalizations. Unfortunately, self-management by patients with heart failure remains poor. This pilot study aimed to explore the use of text-mining to identify heart failure patients with ineffective self-management. We first built a comprehensive self-management vocabulary based on the literature and clinical notes review. We then randomly selected 545 heart failure patients treated within Partners Healthcare hospitals (Boston, MA, USA) and conducted a regular expression search with the compiled vocabulary within 43,107 interdisciplinary clinical notes of these patients. We found that 38.2% (n = 208) of patients had documentation of ineffective heart failure self-management in the domains of poor diet adherence (28.4%), missed medical encounters (26.4%), poor medication adherence (20.2%) and non-specified self-management issues (e.g., "compliance issues", 34.6%). We showed the feasibility of using text-mining to identify patients with ineffective self-management. More natural language processing algorithms are needed to help busy clinicians identify these patients. PMID:27332377
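
A regular expression search of clinical notes against a compiled vocabulary, as this abstract describes, could be sketched along the following lines. The vocabulary phrases, domain names, and sample notes here are hypothetical; the study's actual vocabulary is not given in the abstract.

```python
import re

# Hypothetical vocabulary: domain -> phrases suggesting ineffective self-management.
VOCABULARY = {
    "diet": ["non-adherent to diet", "not following diet"],
    "medication": ["missed doses", "not taking medications"],
    "encounters": ["missed appointment", "no show"],
}

# One case-insensitive pattern per domain; phrases are escaped so they match literally.
PATTERNS = {
    domain: re.compile("|".join(re.escape(p) for p in phrases), re.IGNORECASE)
    for domain, phrases in VOCABULARY.items()
}


def flag_domains(note):
    """Return the sorted list of domains whose vocabulary appears in one note."""
    return sorted(d for d, pattern in PATTERNS.items() if pattern.search(note))


def flag_patients(notes_by_patient):
    """Map each patient with at least one matching note to the matched domains."""
    flagged = {}
    for patient_id, notes in notes_by_patient.items():
        domains = set()
        for note in notes:
            domains.update(flag_domains(note))
        if domains:
            flagged[patient_id] = sorted(domains)
    return flagged


# Hypothetical notes for two patients.
notes = {
    "p1": ["Patient reports Missed Doses this week.", "Follow-up scheduled."],
    "p2": ["Doing well, weight stable."],
}
flagged = flag_patients(notes)
print(flagged)

# The prevalence reported in the abstract is 208 flagged patients out of 545.
print(round(100 * 208 / 545, 1))
```

A literal phrase match like this cannot handle negation ("denies missed doses"), which is one reason the abstract calls for further natural language processing.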

  18. Identifying MMORPG Bots: A Traffic Analysis Approach

    NASA Astrophysics Data System (ADS)

    Chen, Kuan-Ta; Jiang, Jhih-Wei; Huang, Polly; Chu, Hao-Hua; Lei, Chin-Laung; Chen, Wen-Chin

    2008-12-01

    Massively multiplayer online role playing games (MMORPGs) have become extremely popular among network gamers. Despite their success, one of MMORPGs' greatest challenges is the increasing use of game bots, that is, autoplaying game clients. The use of game bots is considered unsportsmanlike and is therefore forbidden. To keep games in order, game police, played by actual human players, often patrol game zones and question suspicious players. This practice, however, is labor-intensive and ineffective. To address this problem, we analyze the traffic generated by human players versus game bots and propose general solutions to identify game bots. Taking Ragnarok Online as our subject, we study the traffic generated by human players and game bots. We find that their traffic is distinguishable by 1) the regularity in the release time of client commands, 2) the trend and magnitude of traffic burstiness in multiple time scales, and 3) the sensitivity to different network conditions. Based on these findings, we propose four strategies and two ensemble schemes to identify bots. Finally, we discuss the robustness of the proposed methods against countermeasures of bot developers, and consider a number of possible ways to manage the increasingly serious bot problem.
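
The first distinguishing feature listed in this abstract, regularity in the release time of client commands, can be illustrated with a simple statistic. The sketch below uses the coefficient of variation of inter-arrival times and an arbitrary threshold; both are assumptions chosen for illustration and are not the four strategies proposed by the authors.

```python
from statistics import mean, pstdev


def interarrival_times(timestamps):
    """Gaps between consecutive command timestamps (in seconds)."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def coefficient_of_variation(values):
    """Standard deviation divided by the mean; low values mean regular timing."""
    average = mean(values)
    return pstdev(values) / average if average else float("inf")


def looks_periodic(timestamps, threshold=0.1):
    """Flag a client whose command timing is suspiciously regular.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    gaps = interarrival_times(timestamps)
    if len(gaps) < 2:
        return False  # too little traffic to judge
    return coefficient_of_variation(gaps) < threshold


# Hypothetical traces: a timer-driven bot and an irregular human player.
bot_trace = [0.0, 1.0, 2.0, 3.0, 4.0]
human_trace = [0.0, 0.4, 2.9, 3.1, 7.5]

print(looks_periodic(bot_trace), looks_periodic(human_trace))
```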

  19. Mining Functional Modules in Heterogeneous Biological Networks Using Multiplex PageRank Approach

    PubMed Central

    Li, Jun; Zhao, Patrick X.

    2016-01-01

    Identification of functional modules/sub-networks in large-scale biological networks is one of the important research challenges in current bioinformatics and systems biology. Approaches have been developed to identify functional modules in single-class biological networks; however, methods for systematically and interactively mining multiple classes of heterogeneous biological networks are lacking. In this paper, we present a novel algorithm (called mPageRank) that utilizes the Multiplex PageRank approach to mine functional modules from two classes of biological networks. We demonstrate the capabilities of our approach by successfully mining functional biological modules through integrating expression-based gene-gene association networks and protein-protein interaction networks. We first compared the performance of our method with that of other methods using simulated data. We then applied our method to identify the cell division cycle related functional module and plant signaling defense-related functional module in the model plant Arabidopsis thaliana. Our results demonstrated that the mPageRank method is effective for mining sub-networks in both expression-based gene-gene association networks and protein-protein interaction networks, and has the potential to be adapted for the discovery of functional modules/sub-networks in other heterogeneous biological networks. The mPageRank executable program, source code, the datasets and results of the presented two case studies are publicly and freely available at http://plantgrn.noble.org/MPageRank/. PMID:27446133
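
For readers unfamiliar with the ranking scheme that the Multiplex PageRank approach builds on, the following is a sketch of ordinary single-network PageRank by power iteration. It is not the mPageRank algorithm, which combines two classes of networks in a way the abstract does not spell out; the damping factor, iteration count, and toy graph are assumptions.

```python
def pagerank(adjacency, damping=0.85, iterations=100):
    """Plain PageRank by power iteration on one directed network.

    `adjacency` maps every node to the list of nodes it links to; every
    target must also appear as a key.
    """
    nodes = sorted(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same "teleport" share each round.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = adjacency[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank uniformly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy network; node names are placeholders, not real genes.
toy_network = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_network)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 3))
```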

  1. Mining Functional Modules in Heterogeneous Biological Networks Using Multiplex PageRank Approach.

    PubMed

    Li, Jun; Zhao, Patrick X

    2016-01-01

    Identification of functional modules/sub-networks in large-scale biological networks is one of the important research challenges in current bioinformatics and systems biology. Approaches have been developed to identify functional modules in single-class biological networks; however, methods for systematically and interactively mining multiple classes of heterogeneous biological networks are lacking. In this paper, we present a novel algorithm (called mPageRank) that utilizes the Multiplex PageRank approach to mine functional modules from two classes of biological networks. We demonstrate the capabilities of our approach by successfully mining functional biological modules through integrating expression-based gene-gene association networks and protein-protein interaction networks. We first compared the performance of our method with that of other methods using simulated data. We then applied our method to identify the cell division cycle related functional module and plant signaling defense-related functional module in the model plant Arabidopsis thaliana. Our results demonstrated that the mPageRank method is effective for mining sub-networks in both expression-based gene-gene association networks and protein-protein interaction networks, and has the potential to be adapted for the discovery of functional modules/sub-networks in other heterogeneous biological networks. The mPageRank executable program, source code, the datasets and results of the presented two case studies are publicly and freely available at http://plantgrn.noble.org/MPageRank/. PMID:27446133

  2. Data Mining for Identifying Novel Associations and Temporal Relationships with Charcot Foot

    PubMed Central

    Munson, Michael E.; Wrobel, James S.; Holmes, Crystal M.; Hanauer, David A.

    2014-01-01

    Introduction. Charcot foot is a rare and devastating complication of diabetes. While some risk factors are known, debate continues regarding etiology. Elucidating other associated disorders and their temporal occurrence could lead to a better understanding of its pathogenesis. We applied a large data mining approach to Charcot foot for elucidating novel associations. Methods. We conducted an association analysis using ICD-9 diagnosis codes for every patient in our health system (n = 1.6 million with 41.2 million time-stamped ICD-9 codes). For the current analysis, we focused on the 388 patients with Charcot foot (ICD-9 713.5). Results. We found 710 associations, 676 (95.2%) of which had a P value for the association less than 1.0 × 10⁻⁵ and 603 (84.9%) of which had an odds ratio > 5.0. There were 111 (15.6%) associations with a significant temporal relationship (P < 1.0 × 10⁻³). The three novel associations with the strongest temporal component were cardiac dysrhythmia, pulmonary eosinophilia, and volume depletion disorder. Conclusion. We identified novel associations with Charcot foot in the context of pathogenesis models that include neurotrophic, neurovascular, and microtraumatic factors mediated through inflammatory cytokines. Future work should focus on confirmatory analyses. These novel areas of investigation could lead to prevention or earlier diagnosis. PMID:24868558
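
The odds ratio used as a filter in this abstract comes from a 2×2 table of diagnosis-code co-occurrence. The sketch below shows the standard calculation and re-derives the percentages reported above. The table counts are hypothetical (only the 388 Charcot foot patients and the 710/676/603/111 association counts come from the abstract), and the authors' significance testing is not reproduced.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio for a 2x2 table: (a * d) / (b * c)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts: of 388 Charcot foot patients, 40 also carry some other
# diagnosis code; among a hypothetical 100,000 comparison patients, 1,000 do.
example = odds_ratio(
    exposed_cases=40,
    unexposed_cases=388 - 40,
    exposed_controls=1_000,
    unexposed_controls=100_000 - 1_000,
)
print(round(example, 2))

# Shares of the 710 associations reported in the abstract.
total_associations = 710
for label, count in [("P < 1e-5", 676), ("odds ratio > 5", 603), ("temporal", 111)]:
    print(label, round(100 * count / total_associations, 1))
```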

  3. A Pattern Mining Approach for Classifying Multivariate Temporal Data

    PubMed Central

    Batal, Iyad; Valizadegan, Hamed; Cooper, Gregory F.; Hauskrecht, Milos

    2012-01-01

    We study the problem of learning classification models from complex multivariate temporal data encountered in electronic health record systems. The challenge is to define a good set of features that are able to represent well the temporal aspect of the data. Our method relies on temporal abstractions and temporal pattern mining to extract the classification features. Temporal pattern mining usually returns a large number of temporal patterns, most of which may be irrelevant to the classification task. To address this problem, we present the minimal predictive temporal patterns framework to generate a small set of predictive and non-spurious patterns. We apply our approach to the real-world clinical task of predicting patients who are at risk of developing heparin induced thrombocytopenia. The results demonstrate the benefit of our approach in learning accurate classifiers, which is a key step for developing intelligent clinical monitoring systems. PMID:22267987

  4. A geomorphological approach to the management of rivers contaminated by metal mining

    NASA Astrophysics Data System (ADS)

    Macklin, M. G.; Brewer, P. A.; Hudson-Edwards, K. A.; Bird, G.; Coulthard, T. J.; Dennis, I. A.; Lechler, P. J.; Miller, J. R.; Turner, J. N.

    2006-09-01

    As the result of current and historical metal mining, river channels and floodplains in many parts of the world have become contaminated by metal-rich waste in concentrations that may pose a hazard to human livelihoods and sustainable development. Environmental and human health impacts commonly arise because of the prolonged residence time of heavy metals in river sediments and alluvial soils and their bioaccumulatory nature in plants and animals. This paper considers how an understanding of the processes of sediment-associated metal dispersion in rivers, and the space and timescales over which they operate, can be used in a practical way to help river basin managers more effectively control and remediate catchments affected by current and historical metal mining. A geomorphological approach to the management of rivers contaminated by metals is outlined and four emerging research themes are highlighted and critically reviewed. These are: (1) response and recovery of river systems following the failures of major tailings dams; (2) effects of flooding on river contamination and the sustainable use of floodplains; (3) new developments in isotopic fingerprinting, remote sensing and numerical modelling for identifying the sources of contaminant metals and for mapping the spatial distribution of contaminants in river channels and floodplains; and (4) current approaches to the remediation of river basins affected by mining, appraised in light of the European Union's Water Framework Directive (2000/60/EC). Future opportunities for geomorphologically-based assessments of mining-affected catchments are also identified.

  5. Efflorescent sulfates from Baia Sprie mining area (Romania)--Acid mine drainage and climatological approach.

    PubMed

    Buzatu, Andrei; Dill, Harald G; Buzgar, Nicolae; Damian, Gheorghe; Maftei, Andreea Elena; Apopei, Andrei Ionuț

    2016-01-15

    The Baia Sprie epithermal system, a deposit well known for its impressive mineralogical associations, shows the proper conditions for acid mine drainage and can be considered a general example for affected mining areas around the globe. Efflorescent samples from the abandoned open pit Minei Hill have been analyzed by X-ray diffraction (XRD), scanning electron microscopy (SEM), Raman and near-infrared (NIR) spectrometry. The identified phases represent mostly iron sulfates with different hydration degrees (szomolnokite, rozenite, melanterite, coquimbite, ferricopiapite), Zn and Al sulfates (gunningite, alunogen, halotrichite). The samples were heated at different temperatures in order to establish the phase transformations among the studied sulfates. The dehydration temperatures and intermediate phases upon decomposition were successfully identified for each of the mineral phases. Gunningite was the single sulfate that showed no transformations during the heating experiment. All the other sulfates started to dehydrate within the 30-90 °C temperature range. Acid mine drainage is the main cause of sulfate formation, triggered by pyrite oxidation as the major source for the abundant iron sulfates. Based on the dehydration temperatures, the climatological interpretation indicated that melanterite formation and long-term presence is related to continental and temperate climates. Coquimbite and rozenite are attributed also to the dry arid/semi-arid areas, in addition to the above mentioned ones. The more stable sulfates, alunogen, halotrichite, szomolnokite, ferricopiapite and gunningite, can form and persist in all climate regimes, from dry continental to even tropical humid. PMID:26544892

  6. Magnetic signature of overbank sediment in industry impacted floodplains identified by data mining methods

    NASA Astrophysics Data System (ADS)

    Chudaničová, Monika; Hutchinson, Simon M.

    2016-11-01

    Our study attempts to identify a characteristic magnetic signature of overbank sediments exhibiting anthropogenically induced magnetic enhancement and thereby to distinguish them from unenhanced sediments with weak magnetic background values, using a novel approach based on data mining methods, thus providing a means of rapid pollution determination. Data were obtained from 539 bulk samples from vertical profiles through overbank sediment, collected on seven rivers in the eastern Czech Republic and three rivers in northwest England. k-Means clustering and hierarchical clustering methods, paired group (UPGMA) and Ward's method, were used to divide the samples into natural groups according to their attributes. The interparametric ratios SIRM/χ, SIRM/ARM, and S-0.1T were chosen as attributes for the analyses, making the resultant model more widely applicable, as magnetic concentration values can differ by two orders of magnitude. Division into three clusters appeared to be optimal and corresponded to inherent clusters in the data scatter. Clustering managed to separate samples with relatively weak anthropogenically induced enhancement, relatively strong anthropogenically induced enhancement and samples lacking enhancement. To describe the clusters explicitly and thus obtain a discrete magnetic signature, classification rules (JRip method) and decision trees (J4.8 and Simple Cart methods) were used. Samples lacking anthropogenic enhancement typically exhibited an S-0.1T < c. 0.5, SIRM/ARM < c. 150 and SIRM/χ < c. 6000 A m⁻¹. Samples with magnetic enhancement all exhibited an S-0.1T > 0.5. Samples with relatively stronger anthropogenic enhancement were unequivocally distinguished from the samples with weaker enhancement by an SIRM/ARM > c. 150. Samples with SIRM/ARM in a range c. 126-150 were classified as relatively strongly enhanced when their SIRM/χ > 18 000 A m⁻¹ and relatively less enhanced when their SIRM/χ < 18 000 A m⁻¹. An additional rule was arbitrarily added to exclude samples with
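
    The threshold rules quoted in this abstract can be written as a small decision function. This is only a sketch under stated assumptions: the function and label names are hypothetical, the S-0.1T threshold alone is used to separate unenhanced samples, samples with SIRM/ARM below c. 126 are assumed to fall in the weaker-enhancement class, boundary values are handled arbitrarily, and the additional exclusion rule is omitted because the abstract is cut off before stating it.

```python
def classify_sample(s_ratio, sirm_arm, sirm_chi):
    """Classify one overbank sediment sample from three magnetic ratios.

    s_ratio  : S-0.1T (dimensionless)
    sirm_arm : SIRM/ARM (dimensionless)
    sirm_chi : SIRM/chi in A/m
    Labels and boundary handling are illustrative assumptions.
    """
    # Enhanced samples all showed S-0.1T > 0.5 according to the abstract.
    if s_ratio <= 0.5:
        return "unenhanced"
    # SIRM/ARM above c. 150 marks the relatively stronger enhancement.
    if sirm_arm > 150:
        return "strongly enhanced"
    # In the c. 126-150 band, SIRM/chi decides between the two classes.
    if 126 <= sirm_arm <= 150:
        return "strongly enhanced" if sirm_chi > 18000 else "weakly enhanced"
    # Assumption: anything below c. 126 counts as weaker enhancement.
    return "weakly enhanced"


# Hypothetical sample values, not taken from the study.
for sample in [(0.3, 100, 4000), (0.8, 200, 5000), (0.8, 140, 20000)]:
    print(sample, classify_sample(*sample))
```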

  7. Identifying and Describing a Seismogenic Zone in a Sublevel Caving Mine

    NASA Astrophysics Data System (ADS)

    Abolfazlzadeh, Yousef; Hudyma, Marty

    2016-09-01

    Analysis of caving-induced seismicity can aid in the understanding of rock mass behaviour in the different stages of the caving process. A detailed analysis of caving-induced seismicity at the Telfer sublevel caving mine was undertaken. Interpretation of seismic data in the Telfer mine showed the influence of the major geological features on cave behaviour and helped to identify the phases of cave evolution. Two geological zones with unique seismic characteristics (the M50 and M30 stiff reefs) and four key caving phases (initial undercut blasting, cave initiation, cave propagation and breakthrough) were defined through seismic data analysis. Movement of the seismogenic zone was significantly affected by the stiff reefs within the cave column. Seismic source parameter analysis was used to investigate caving mechanisms at Telfer.

  8. Mining Patterns of Disease Progression: A Topic-Model-Based Approach.

    PubMed

    Zhang, Lingxiao; Zhao, Junfeng; Wang, Yasha; Xie, Bing

    2016-01-01

    Knowledge of how diseases progress and transform is crucial for clinical decision making. Frequent pattern mining techniques, such as sequential pattern mining (SPM) algorithms, can automatically extract such knowledge from large collections of electronic medical records (EMR). However, EMR data are usually unorganized and highly noisy. Finding meaningful disease patterns often calls for manual manipulation such as cohort and feature selection on EMR data by medical professionals. In this paper, we propose a topic-model-based SPM approach to find disease progression patterns from diagnostic records. We improve the traditional SPM algorithms by filtering and grouping the diagnosis sequences according to different clinical topics. These topics represent certain clinical conditions with closely related diagnoses, and are detected without prior medical knowledge. The experiment on real-world EMR data shows that our approach is able to find meaningful progression patterns with less noise, and can help quickly identify interesting patterns related to a certain clinical condition with less human effort. PMID:27577403

  9. WHAT INNOVATIVE APPROACHES CAN BE DEVELOPED FOR MINING SITES?

    EPA Science Inventory

    Mining is essential to maintain our way of life. However, based upon industry's reporting in the most recent Toxic Release Inventory (TRI), the primary sources of heavy metal releases to the environment are mining and mining related activities. The hard rock mining industry rel...

  10. Identifying Understudied Nuclear Reactions by Text-mining the EXFOR Experimental Nuclear Reaction Library

    NASA Astrophysics Data System (ADS)

    Hirdt, J. A.; Brown, D. A.

    2016-01-01

    The EXFOR library contains the largest collection of experimental nuclear reaction data available as well as the data's bibliographic information and experimental details. We text-mined the REACTION and MONITOR fields of the ENTRYs in the EXFOR library in order to identify understudied reactions and quantities. Using the results of the text-mining, we created an undirected graph from the EXFOR datasets with each graph node representing a single reaction and quantity and graph links representing the various types of connections between these reactions and quantities. This graph is an abstract representation of the connections in EXFOR, similar to graphs of social networks, authorship networks, etc. We use various graph theoretical tools to identify important yet understudied reactions and quantities in EXFOR. Although we identified a few cross sections relevant for shielding applications and isotope production, mostly we identified charged particle fluence monitor cross sections. As a side effect of this work, we learn that our abstract graph is typical of other real-world graphs.

  11. Approaches to Post-Mining Land Reclamation in Polish Open-Cast Lignite Mining

    NASA Astrophysics Data System (ADS)

    Kasztelewicz, Zbigniew

    2014-06-01

    The paper presents the state of post-mining land reclamation at individual lignite mines in Poland up to 2012, against the background of opencast mining as a whole. It discusses the process of purchasing land for mining operations and selling it after reclamation. It also presents the achievements of the mines in reclaiming and regenerating post-mining land, which, after development carried out according to European standards, now serves the inhabitants as recreational areas that increase the attractiveness of the regions.

  12. Development and application of the Safe Performance Index as a risk-based methodology for identifying major hazard-related safety issues in underground coal mines

    NASA Astrophysics Data System (ADS)

    Kinilakodi, Harisha

    The underground coal mining industry has been under constant watch due to the high risk involved in its activities, and scrutiny increased because of the disasters that occurred in 2006-07. In the aftermath of the incidents, the U.S. Congress passed the Mine Improvement and New Emergency Response Act of 2006 (MINER Act), which strengthened the existing regulations and mandated new laws to address the various issues related to a safe working environment in the mines. Risk analysis in any form should be done on a regular basis to tackle the possibility of unwanted major hazard-related events such as explosions, outbursts, airbursts, inundations, spontaneous combustion, and roof fall instabilities. One of the responses by the Mine Safety and Health Administration (MSHA) in 2007 involved a new pattern of violations (POV) process to target mines with a poor safety performance, specifically to improve their safety. However, the 2010 disaster (worst in 40 years) gave an impression that the collective effort of the industry, federal/state agencies, and researchers to achieve the goal of zero fatalities and serious injuries has gone awry. The Safe Performance Index (SPI) methodology developed in this research is a straightforward, effective, transparent, and reproducible approach that can help in identifying and addressing some of the existing issues while targeting mines with poor safety performance that need help. It combines three injury and three citation measures that are scaled to have an equal mean (5.0) in a balanced way with proportionate weighting factors (0.05, 0.15, 0.30) and overall normalizing factor (15) into a mine safety performance evaluation tool. It can be used to assess the relative safety-related risk of mines, including by mine-size category. Using 2008 and 2009 data, comparisons were made of SPI-associated, normalized safety performance measures across mine-size categories, with emphasis on small-mine safety performance as compared to large- and

  13. Using Frequent Item Set Mining and Feature Selection Methods to Identify Interacted Risk Factors - The Atrial Fibrillation Case Study.

    PubMed

    Li, Xiang; Liu, Haifeng; Du, Xin; Hu, Gang; Xie, Guotong; Zhang, Ping

    2016-01-01

    Disease risk prediction is highly important for early intervention and treatment, and identification of predictive risk factors is the key to achieving accurate prediction. In addition to original independent features in a dataset, some interacted features, such as comorbidities and combination therapies, may have non-additive influence on the disease outcome and can also be used in risk prediction to improve the prediction performance. However, it is usually difficult to manually identify the possible interacted risk factors due to the combinatorial explosion of features. In this paper, we propose an automatic approach to identify predictive risk factors with interactions using frequent item set mining and feature selection methods. The proposed approach was applied in the real world case study of predicting ischemic stroke and thromboembolism (TE) for atrial fibrillation (AF) patients on the Chinese atrial fibrillation registry dataset, and the results show that our approach can not only improve the prediction performance, but also identify the comorbidities and combination therapies that have potential influences on TE occurrence for AF. PMID:27577446
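
    To make the idea of frequent item set mining over patient features concrete, here is a minimal brute-force sketch. It enumerates all item combinations up to a given size and keeps those whose support meets a threshold; it is not Apriori or FP-growth, and the abstract does not say which algorithm the authors used. The function name, the feature names, and the toy patient records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset (sorted tuple): support} for itemsets meeting min_support."""
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


# Toy records: each set lists the comorbidities of one hypothetical patient.
patients = [
    {"hypertension", "diabetes"},
    {"hypertension", "diabetes", "heart_failure"},
    {"hypertension"},
    {"diabetes", "heart_failure"},
]

for itemset, support in sorted(frequent_itemsets(patients, 0.5).items()):
    print(itemset, support)
```

    Each frequent pair could then be added to the dataset as a candidate interacted feature and passed to a feature selection step, which is the general pipeline the abstract describes.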

  14. An Integrated Assessment Approach to Address Artisanal and Small-Scale Gold Mining in Ghana.

    PubMed

    Basu, Niladri; Renne, Elisha P; Long, Rachel N

    2015-09-17

    Artisanal and small-scale gold mining (ASGM) is growing in many regions of the world including Ghana. The problems in these communities are complex and multi-faceted. To help increase understanding of such problems, and to enable consensus-building and effective translation of scientific findings to stakeholders, help inform policies, and ultimately improve decision making, we utilized an Integrated Assessment approach to study artisanal and small-scale gold mining activities in Ghana. Though Integrated Assessments have been used in the fields of environmental science and sustainable development, their use in addressing specific matter in public health, and in particular, environmental and occupational health is quite limited despite their many benefits. The aim of the current paper was to describe specific activities undertaken and how they were organized, and the outputs and outcomes of our activity. In brief, three disciplinary workgroups (Natural Sciences, Human Health, Social Sciences and Economics) were formed, with 26 researchers from a range of Ghanaian institutions plus international experts. The workgroups conducted activities in order to address the following question: What are the causes, consequences and correctives of small-scale gold mining in Ghana? More specifically: What alternatives are available in resource-limited settings in Ghana that allow for gold-mining to occur in a manner that maintains ecological health and human health without hindering near- and long-term economic prosperity? Several response options were identified and evaluated, and are currently being disseminated to various stakeholders within Ghana and internationally.

  15. An Integrated Assessment Approach to Address Artisanal and Small-Scale Gold Mining in Ghana

    PubMed Central

    Basu, Niladri; Renne, Elisha P.; Long, Rachel N.

    2015-01-01

    Artisanal and small-scale gold mining (ASGM) is growing in many regions of the world including Ghana. The problems in these communities are complex and multi-faceted. To help increase understanding of such problems, and to enable consensus-building and effective translation of scientific findings to stakeholders, help inform policies, and ultimately improve decision making, we utilized an Integrated Assessment approach to study artisanal and small-scale gold mining activities in Ghana. Though Integrated Assessments have been used in the fields of environmental science and sustainable development, their use in addressing specific matter in public health, and in particular, environmental and occupational health is quite limited despite their many benefits. The aim of the current paper was to describe specific activities undertaken and how they were organized, and the outputs and outcomes of our activity. In brief, three disciplinary workgroups (Natural Sciences, Human Health, Social Sciences and Economics) were formed, with 26 researchers from a range of Ghanaian institutions plus international experts. The workgroups conducted activities in order to address the following question: What are the causes, consequences and correctives of small-scale gold mining in Ghana? More specifically: What alternatives are available in resource-limited settings in Ghana that allow for gold-mining to occur in a manner that maintains ecological health and human health without hindering near- and long-term economic prosperity? Several response options were identified and evaluated, and are currently being disseminated to various stakeholders within Ghana and internationally. PMID:26393627

  16. Identifying Catchment-Scale Predictors of Coal Mining Impacts on New Zealand Stream Communities

    NASA Astrophysics Data System (ADS)

    Clapcott, Joanne E.; Goodwin, Eric O.; Harding, Jon S.

    2016-03-01

    Coal mining activities can have severe and long-term impacts on freshwater ecosystems. At the individual stream scale, these impacts have been well studied; however, few attempts have been made to determine the predictors of mine impacts at a regional scale. We investigated whether catchment-scale measures of mining impacts could be used to predict biological responses. We collated data from multiple studies and analyzed algae, benthic invertebrate, and fish community data from 186 stream sites, including un-mined streams, and those associated with 620 mines on the West Coast of the South Island, New Zealand. Algal, invertebrate, and fish richness responded to mine impacts and were significantly higher in un-mined compared to mine-impacted streams. Changes in community composition toward more acid- and metal-tolerant species were evident for algae and invertebrates, whereas changes in fish communities were significant and driven by a loss of nonmigratory native species. Consistent catchment-scale predictors of mining activities affecting biota included the time post mining (years), mining density (the number of mines upstream per catchment area), and mining intensity (tons of coal production per catchment area). Mining was associated with a decline in stream biodiversity irrespective of catchment size, and recovery was not evident until at least 30 years after mining activities have ceased. These catchment-scale predictors can provide managers and regulators with practical metrics to focus on management and remediation decisions.

  17. Identifying Catchment-Scale Predictors of Coal Mining Impacts on New Zealand Stream Communities.

    PubMed

    Clapcott, Joanne E; Goodwin, Eric O; Harding, Jon S

    2016-03-01

    Coal mining activities can have severe and long-term impacts on freshwater ecosystems. At the individual stream scale, these impacts have been well studied; however, few attempts have been made to determine the predictors of mine impacts at a regional scale. We investigated whether catchment-scale measures of mining impacts could be used to predict biological responses. We collated data from multiple studies and analyzed algae, benthic invertebrate, and fish community data from 186 stream sites, including un-mined streams, and those associated with 620 mines on the West Coast of the South Island, New Zealand. Algal, invertebrate, and fish richness responded to mine impacts and were significantly higher in un-mined compared to mine-impacted streams. Changes in community composition toward more acid- and metal-tolerant species were evident for algae and invertebrates, whereas changes in fish communities were significant and driven by a loss of nonmigratory native species. Consistent catchment-scale predictors of mining activities affecting biota included the time post mining (years), mining density (the number of mines upstream per catchment area), and mining intensity (tons of coal production per catchment area). Mining was associated with a decline in stream biodiversity irrespective of catchment size, and recovery was not evident until at least 30 years after mining activities have ceased. These catchment-scale predictors can provide managers and regulators with practical metrics to focus on management and remediation decisions.

  18. Identifying Liver Cancer and Its Relations with Diseases, Drugs, and Genes: A Literature-Based Approach

    PubMed Central

    Song, Min

    2016-01-01

    In biomedicine, scientific literature is a valuable source for knowledge discovery. Mining knowledge from textual data has become an increasingly important task as the volume of scientific literature grows at an unprecedented rate. In this paper, we propose a framework for examining a certain disease based on existing information provided by scientific literature. Disease-related entities that include diseases, drugs, and genes are systematically extracted and analyzed using a three-level network-based approach. A paper-entity network and an entity co-occurrence network (macro-level) are explored and used to construct six entity-specific networks (meso-level). Important diseases, drugs, and genes as well as salient entity relations (micro-level) are identified from these networks. Results obtained from the literature-based literature mining can serve to assist clinical applications. PMID:27195695

  19. Detecting buried mines in ground-penetrating radar using a Hough transform approach

    NASA Astrophysics Data System (ADS)

    Carlotto, Mark J.

    2002-08-01

    A method for detecting buried mines in ground penetrating radar (GPR) data using a Hough transform approach is described. GPR is one of three sensors used in the Mine Hunter/Killer (MH/K) system for detecting buried mines. A buried mine modeled as a point scatterer in object space gives rise to a hyperbolic response in GPR measurement space. Our approach uses the Hough transform to recover the object space representation (i.e., the location of mines in x, y, and depth) from the GPR data, in effect 'deconvolving' the response of the radar. This is done by having each point in measurement space vote for all points in object space where the mine could be located. Against a baseline energy detector, the Hough algorithm shows a one half order reduction in false alarm rate at a fixed probability of detection for low metal, metal, and non metal mines.
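
    The voting step described in this abstract can be sketched in a reduced setting. The sketch below assumes a 2-D geometry (along-track position x and depth only, rather than x, y, and depth), a known constant propagation velocity, and noise-free picks of the hyperbolic response; all names and numbers are hypothetical and this is not the MH/K implementation.

```python
import math
from collections import Counter


def travel_time(x, x0, depth, v):
    """Two-way travel time from an antenna at x to a point scatterer at (x0, depth)."""
    return 2.0 * math.hypot(x - x0, depth) / v


def hough_vote(points, x_candidates, depth_step, v):
    """Each measured (x, t) point votes for every (x0, depth bin) that could explain it."""
    acc = Counter()
    for x, t in points:
        r = v * t / 2.0  # one-way range implied by the travel time
        for x0 in x_candidates:
            d_sq = r * r - (x - x0) ** 2
            if d_sq <= 0:
                continue  # this candidate position cannot explain the point
            d_bin = round(math.sqrt(d_sq) / depth_step)
            acc[(x0, d_bin)] += 1
    return acc


# Synthetic hyperbola from one scatterer at x0 = 1.0 m, depth = 0.5 m.
v = 0.1  # assumed velocity in m/ns
antenna_x = [i * 0.25 for i in range(9)]
points = [(x, travel_time(x, 1.0, 0.5, v)) for x in antenna_x]

acc = hough_vote(points, antenna_x, depth_step=0.05, v=v)
(best_x0, best_bin), votes = acc.most_common(1)[0]
print(best_x0, best_bin * 0.05, votes)
```

    The accumulator cell with the most votes is taken as the recovered object-space location; in real data the votes would typically be weighted by signal energy and the peak compared against a detection threshold.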

  20. A data-mining approach to predict influent quality.

    PubMed

    Kusiak, Andrew; Verma, Anoop; Wei, Xiupeng

    2013-03-01

    In wastewater treatment plants, predicting influent water quality is important for energy management. The influent water quality is measured by metrics such as carbonaceous biochemical oxygen demand (CBOD), pH, and total suspended solids. In this paper, a data-driven approach for time-ahead prediction of CBOD is presented. Due to limitations in the industrial data acquisition system, CBOD is not recorded at regular time intervals, which causes gaps in the time-series data. Numerous experiments have been performed to approximate the functional relationship between the input and output parameters and thereby fill in the missing CBOD data. Models incorporating seasonality effects are investigated. Four data-mining algorithms (multilayered perceptron, classification and regression tree, multivariate adaptive regression spline, and random forest) are employed to construct prediction models with a maximum prediction horizon of 5 days.

  1. Online Discourse on Fibromyalgia: Text-Mining to Identify Clinical Distinction and Patient Concerns

    PubMed Central

    Park, Jungsik; Ryu, Young Uk

    2014-01-01

    Background: The purpose of this study was to evaluate the possibility of using text-mining to identify clinical distinctions and patient concerns in online memoirs posted by patients with fibromyalgia (FM). Material/Methods: A total of 399 memoirs were collected from an FM group website. The unstructured data of memoirs associated with FM were collected through a crawling process and converted into structured data with a concordance, part-of-speech tagging, and word frequency. We also conducted a lexical analysis and phrase pattern identification. After examining the data, a set of FM-related keywords was obtained and phrase net relationships were set through a web-based visualization tool. Results: The clinical distinction of FM was verified. Pain is the biggest issue for FM patients. The pain affected body parts including ‘muscles,’ ‘leg,’ ‘neck,’ ‘back,’ ‘joints,’ and ‘shoulders,’ with accompanying symptoms such as ‘spasms,’ ‘stiffness,’ and ‘aching,’ and was described as ‘severe,’ ‘chronic,’ and ‘constant.’ This study also demonstrated that it was possible to understand the interests and concerns of FM patients through text-mining. FM patients wanted to escape from the pain and symptoms, so they were interested in medical treatment and help. They also seemed to have an interest in their work and occupation, and hoped to continue to live life through their relationships with the people around them. Conclusions: This research shows the potential for extracting keywords to confirm the clinical distinction of a certain disease, and text-mining can help objectively understand the concerns of patients by generalizing their large number of subjective illness experiences. However, it is believed that there are limitations to the processes and methods for organizing and classifying large amounts of text, so these limits have to be considered when analyzing the results. The development of research methodology to overcome
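
    The word-frequency step mentioned in the methods can be illustrated with a minimal sketch using only the Python standard library. The tokenization rule, the stopword list, and the two example sentences are assumptions made for illustration; the study's actual concordance and part-of-speech tooling is not described in the abstract.

```python
import re
from collections import Counter


def keyword_frequencies(texts, stopwords=frozenset()):
    """Count lowercase word tokens across texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


# Invented example sentences, not taken from the collected memoirs.
memoirs = [
    "My back pain is constant.",
    "Chronic pain in my neck and back.",
]
stop = frozenset({"my", "is", "in", "and"})

for word, count in keyword_frequencies(memoirs, stop).most_common(3):
    print(word, count)
```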

  2. A New Approach in Coal Mine Exploration Using Cosmic Ray Muons

    NASA Astrophysics Data System (ADS)

    Darijani, Reza; Negarestani, Ali; Rezaie, Mohammad Reza; Fatemi, Syed Jalil; Akhond, Ahmad

    2016-08-01

    Muon radiography is a technique that uses cosmic ray muons to image the interior of large scale geological structures. The muon absorption in matter is the most important parameter in cosmic ray muon radiography. Cosmic ray muon radiography is similar to X-ray radiography. The main aim in this survey is the simulation of the muon radiography for exploration of mines. So, the production source, tracking, and detection of cosmic ray muons were simulated by MCNPX code. For this purpose, the input data of the source card in MCNPX code were extracted from the muon energy spectrum at sea level. In addition, the other input data such as average density and thickness of layers that were used in this code are the measured data from Pabdana (Kerman, Iran) coal mines. The average thickness and density of these layers in the coal mines are from 2 to 4 m and 1.3 g/cm³, respectively. To increase the spatial resolution, a detector was placed inside the mountain. The results indicated that, using this approach, layers with a minimum thickness of about 2.5 m can be identified.

  3. Identifying the Educationally Influential Physician: A Systematic Review of Approaches

    ERIC Educational Resources Information Center

    Kronberger, Matthew P.; Bakken, Lori L.

    2011-01-01

    Introduction: Previous studies have indicated that educationally influential physicians' (EIPs) interactions with peers can lead to practice changes and improved patient outcomes. However, multiple approaches have been used to identify and investigate EIPs' informal or formal influence on practice, which creates study outcomes that are difficult…

  4. A Tools-Based Approach to Teaching Data Mining Methods

    ERIC Educational Resources Information Center

    Jafar, Musa J.

    2010-01-01

    Data mining is an emerging field of study in Information Systems programs. Although the course content has been streamlined, the underlying technology is still in a state of flux. The purpose of this paper is to describe how we utilized Microsoft Excel's data mining add-ins as a front-end to Microsoft's Cloud Computing and SQL Server 2008 Business…

  5. A data mining approach to finding relationships between reservoir properties and oil production for CHOPS

    NASA Astrophysics Data System (ADS)

    Cai, Yongxiang; Wang, Xin; Hu, Kezhen; Dong, Mingzhe

    2014-12-01

    Cold heavy oil production with sand (CHOPS) is a primary oil extraction process for heavy crude oil and reservoir properties are key factors that contribute to the effectiveness of CHOPS. However, identification of the key reservoir properties and quantification of the relationships between the reservoir properties and the oil production are still challenging tasks. In this paper, we propose the use of a data mining approach for finding quantitative relationships between various reservoir properties and oil production for CHOPS. The approach includes four steps. Firstly, a set of reservoir properties is identified to describe reservoir characteristics through a petrophysical analysis. In addition to common parameters, such as porosity and permeability, two new parameters - a fluid mobility factor and the maximum inscribed rectangular of net pay (MIRNP) - are proposed. Secondly, three new parameters to describe the production performance of wells are proposed: the peak value, effective life cycle and effective yield. Next, the fuzzy ranking method is used to rank the importance of the identified reservoir properties in terms of oil production. Finally, association rule mining is used to obtain quantitative relationships between reservoir property variables and the production performance of wells. The proposed methods have been applied to 118 wells in the Sparky Formation of the Lloydminster heavy oil field in Alberta. The results show that the production performance of wells in the area could be described and predicted by using the found quantitative relations.
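
    The final step, association rule mining, scores rules of the form "reservoir property in some range implies production performance in some range". The sketch below computes the two standard rule measures, support and confidence, over records whose continuous values are assumed to have already been binned into labels. The field names, bin labels, and well records are hypothetical and are not taken from the Lloydminster data.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    records    : list of dicts mapping attribute name to a binned label
    antecedent : dict of attribute -> required label (the "if" part)
    consequent : dict of attribute -> required label (the "then" part)
    """
    matches_a = [
        r for r in records
        if all(r.get(k) == v for k, v in antecedent.items())
    ]
    matches_both = [
        r for r in matches_a
        if all(r.get(k) == v for k, v in consequent.items())
    ]
    support = len(matches_both) / len(records)
    confidence = len(matches_both) / len(matches_a) if matches_a else 0.0
    return support, confidence


# Hypothetical wells with binned reservoir properties and production peak.
wells = [
    {"porosity": "high", "mobility": "high", "peak": "high"},
    {"porosity": "high", "mobility": "low", "peak": "high"},
    {"porosity": "high", "mobility": "low", "peak": "low"},
    {"porosity": "low", "mobility": "low", "peak": "low"},
]

print(rule_metrics(wells, {"porosity": "high"}, {"peak": "high"}))
```

    Rules whose support and confidence exceed chosen thresholds would be kept as candidate quantitative relationships between reservoir properties and well performance.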

  6. Risk evaluation of uranium mining: A geochemical inverse modelling approach

    NASA Astrophysics Data System (ADS)

    Rillard, J.; Zuddas, P.; Scislewski, A.

    2011-12-01

    It is well known that uranium extraction operations can increase risks linked to radiation exposure. The toxicity of uranium and associated heavy metals is the main environmental concern regarding exploitation and processing of U-ore. In areas where U mining is planned, a careful assessment of toxic and radioactive element concentrations is recommended before the start of mining activities. A background evaluation of harmful elements is important in order to prevent and/or quantify future water contamination resulting from possible migration of toxic metals coming from ore and waste water interaction. Controlled leaching experiments were carried out to investigate processes of ore and waste (leached ore) degradation, using samples from the uranium exploitation site located in Caetité-Bahia, Brazil. In experiments in which the reaction of waste with water was tested, we found that the water had low pH and high levels of sulphates and aluminium. On the other hand, in experiments in which ore was tested, the water had a chemical composition comparable to natural water found in the region of Caetité. On the basis of our experiments, we suggest that waste resulting from sulphuric acid treatment can induce acidification and salinization of surface and ground water. For this reason proper storage of waste is imperative. As a tool to evaluate the risks, a geochemical inverse modelling approach was developed to estimate the water-mineral interaction involving the presence of toxic elements. We used a method earlier described by Scislewski and Zuddas 2010 (Geochim. Cosmochim. Acta 74, 6996-7007) in which the reactive surface area of mineral dissolution can be estimated. We found that the reactive surface area of rock parent minerals is not constant over time but varies by several orders of magnitude in only two months of interaction. We propose that parent mineral heterogeneity and, particularly, neogenic phase formation may explain the observed variation of the

  7. An Integrative Proteomic Approach Identifies Novel Cellular SMYD2 Substrates.

    PubMed

    Ahmed, Hazem; Duan, Shili; Arrowsmith, Cheryl H; Barsyte-Lovejoy, Dalia; Schapira, Matthieu

    2016-06-01

    Protein methylation is a post-translational modification with important roles in transcriptional regulation and other biological processes, but the enzyme-substrate relationship between the 68 known human protein methyltransferases and the thousands of reported methylation sites is poorly understood. Here, we propose a bioinformatic approach that integrates structural, biochemical, cellular, and proteomic data to identify novel cellular substrates of the lysine methyltransferase SMYD2. Of the 14 novel putative SMYD2 substrates identified by our approach, six were confirmed in cells by immunoprecipitation: MAPT, CCAR2, EEF2, NCOA3, STUB1, and UTP14A. Treatment with the selective SMYD2 inhibitor BAY-598 abrogated the methylation signal, indicating that methylation of these novel substrates was dependent on the catalytic activity of the enzyme. We believe that our integrative approach can be applied to other protein lysine methyltransferases, and help understand how lysine methylation participates in wider signaling processes. PMID:27163177

  8. Proteomic and Genetic Approaches Identify Syk as an AML Target

    PubMed Central

    Hahn, Cynthia K.; Berchuck, Jacob E.; Ross, Kenneth N.; Kakoza, Rose M.; Clauser, Karl; Schinzel, Anna C.; Ross, Linda; Galinsky, Ilene; Davis, Tina N.; Silver, Serena J.; Root, David E.; Stone, Richard M.; DeAngelo, Daniel J.; Carroll, Martin; Hahn, William C.; Carr, Steven A.; Golub, Todd R.; Kung, Andrew L.; Stegmaier, Kimberly

    2009-01-01

    SUMMARY Cell-based screening can facilitate rapid identification of compounds inducing complex cellular phenotypes. Advancing a compound toward the clinic, however, generally requires identification of precise mechanisms of action. We previously found that epidermal growth factor receptor (EGFR) inhibitors induce acute myeloid leukemia (AML) differentiation via a non-EGFR mechanism. In this report, we integrated proteomic and RNAi-based strategies to identify their off-target anti-AML mechanism. These orthogonal approaches identified Syk as a target in AML. Genetic and pharmacological inactivation of Syk with a drug in clinical trial for other indications promoted differentiation of AML cells and attenuated leukemia growth in vivo. These results demonstrate the power of integrating diverse chemical, proteomic, and genomic screening approaches to identify therapeutic strategies for cancer. PMID:19800574

  9. Identifying the Uncertainty in Physician Practice Location through Spatial Analytics and Text Mining

    PubMed Central

    Shi, Xuan; Xue, Bowei; Xierali, Imam M.

    2016-01-01

    In response to the widespread concern about the adequacy, distribution, and disparity of access to a health care workforce, the correct identification of physicians’ practice locations is critical to access to public health services. In prior literature, little effort has been made to detect and resolve the uncertainty about whether the address provided by a physician in the survey is a practice address or a home address. This paper introduces how to identify the uncertainty in a physician’s practice location through spatial analytics, text mining, and visual examination. While land use and zoning codes, embedded within the parcel datasets, help to differentiate residential areas from other types, spatial analytics may have certain limitations in matching and comparing physician and parcel datasets with different uncertainty issues, which may lead to unforeseen results. Handling and matching the string components between physicians’ addresses and the addresses of the parcels could identify the spatial uncertainty and instability to derive a more reasonable relationship between different datasets. Visual analytics and examination further help to clarify the undetectable patterns. This research will have a broader impact on federal and state initiatives and policies to address both insufficiency and maldistribution of a health care workforce to improve the accessibility of public health services. PMID:27657100

  10. Identifying woody vegetation on coal surface mines using phenological indicators with multitemporal Landsat imagery

    NASA Astrophysics Data System (ADS)

    Oliphant, A. J.; Li, J.; Wynne, R. H.; Donovan, P. F.; Zipper, C. E.

    2014-11-01

    Surface mining for coal has disturbed large land areas in the Appalachian Mountains. Better information on mined lands' ecosystem recovery status is necessary for effective environmental management in mining-impacted regions. Because record quality varies between state mining agencies and much mining occurred prior to widespread use of geospatial technologies, accurate maps of mining extents, durations, and land cover effects are often not available. Landsat data are well suited to mapping and characterizing land cover and forest recovery on former coal surface mines. Past mine reclamation techniques have often failed to restore premining forest vegetation, but natural processes may enable native forests to re-establish on mined areas with time. However, the invasive species autumn olive (Elaeagnus umbellata) is proliferating widely on former coal surface mines, often inhibiting reestablishment of native forests. Autumn olive outcompetes native vegetation because it fixes atmospheric nitrogen and benefits from a longer growing season than native deciduous trees. This longer growing season, along with Landsat 8's high signal-to-noise ratio, has enabled species-level classification of autumn olive using multitemporal Landsat 8 data at accuracy levels usually only obtainable using higher spatial or spectral resolution sensors. We have used classification and regression tree (CART®) and support vector machine (SVM) models to classify five counties in the coal mining region of Virginia for presence and absence of autumn olive. The best model found was a CART® model with 36 nodes, which had an overall accuracy of 84% and a kappa of 0.68. Autumn olive had a conditional kappa of 0.65 and producer's and user's accuracies of 86% and 83%, respectively. The best SVM model used a second-order polynomial kernel and had an overall accuracy of 77%, an overall kappa of 0.54, and producer's and user's accuracies of 60% and 90%, respectively.
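
    The accuracy measures quoted above (overall accuracy, kappa, producer's and user's accuracy) all derive from a confusion matrix. The sketch below uses a made-up 2x2 matrix, not the study's data, and a hypothetical helper named `accuracy_metrics`, to show how such figures are typically computed.

```python
# Hypothetical 2x2 confusion matrix (rows = reference class, columns = mapped class).
# Class order: [autumn olive present, autumn olive absent]. Counts are made up.
matrix = [
    [40, 10],
    [5, 45],
]


def accuracy_metrics(cm):
    """Return overall accuracy, Cohen's kappa, and per-class producer's/user's accuracy."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(k)) / n
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    # Producer's accuracy: correct / reference total; user's accuracy: correct / mapped total.
    producers = [cm[i][i] / row_totals[i] for i in range(k)]
    users = [cm[i][i] / col_totals[i] for i in range(k)]
    return observed, kappa, producers, users


overall, kappa, producers, users = accuracy_metrics(matrix)
print(f"overall={overall:.2f}, kappa={kappa:.2f}")
print(f"producer's={producers[0]:.2f}, user's={users[0]:.2f}")
```

    Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for the same matrix.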

  11. National Conference on Mining-Influenced Waters: Approaches for Characterization, Source Control and Treatment

    EPA Science Inventory

    The conference goal was to provide a forum for the exchange of scientific information on current and emerging approaches to assessing characterization, monitoring, source control, treatment and/or remediation on mining-influenced waters. The conference was aimed at mining remedi...

  12. TOXICITY APPROACHES TO ASSESSING MINING IMPACTS AND MINE WASTE TREATMENT EFFECTIVENESS

    EPA Science Inventory

    The USEPA Office of Research and Development's National Exposure Research Laboratory and National Risk Management Research Laboratory have been evaluating the impact of mining sites on receiving streams and the effectiveness of waste treatment technologies in removing toxicity fo...

  13. IDENTIFYING RECENT SURFACE MINING ACTIVITIES USING A NORMALIZED DIFFERENCE VEGETATION INDEX (NDVI) CHANGE DETECTION METHOD

    EPA Science Inventory

    Coal mining is a major resource extraction activity on the Appalachian Mountains. The increased size and frequency of a specific type of surface mining, known as mountain top removal-valley fill, has in recent years raised various environmental concerns. During mountainto...

  14. Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution

    ERIC Educational Resources Information Center

    Kinnebrew, John S.; Biswas, Gautam

    2012-01-01

    Our learning-by-teaching environment, Betty's Brain, captures a wealth of data on students' learning interactions as they teach a virtual agent. This paper extends an exploratory data mining methodology for assessing and comparing students' learning behaviors from these interaction traces. The core algorithm employs sequence mining techniques to…

  15. GTA: a game theoretic approach to identifying cancer subnetwork markers.

    PubMed

    Farahmand, S; Goliaei, S; Ansari-Pour, N; Razaghi-Moghadam, Z

    2016-03-01

    The identification of genetic markers (e.g. genes, pathways and subnetworks) for cancer has been one of the most challenging research areas in recent years. A subset of these studies attempt to analyze genome-wide expression profiles to identify markers with high reliability and reusability across independent whole-transcriptome microarray datasets. Therefore, the functional relationships of genes are integrated with their expression data. However, for a more accurate representation of the functional relationships among genes, utilization of the protein-protein interaction network (PPIN) seems to be necessary. Herein, a novel game theoretic approach (GTA) is proposed for the identification of cancer subnetwork markers by integrating genome-wide expression profiles and PPIN. The GTA method was applied to three distinct whole-transcriptome breast cancer datasets to identify the subnetwork markers associated with metastasis. To evaluate the performance of our approach, the identified subnetwork markers were compared with gene-based, pathway-based and network-based markers. We show that GTA is not only capable of identifying robust metastatic markers but also provides a higher classification performance. In addition, based on these GTA-based subnetworks, we identified a new bona fide candidate gene for breast cancer susceptibility. PMID:26750920

  16. Acid mine drainage risks - A modeling approach to siting mine facilities in Northern Minnesota USA

    NASA Astrophysics Data System (ADS)

    Myers, Tom

    2016-02-01

    Most watershed-scale planning for mine-caused contamination concerns remediation of past problems while future planning relies heavily on engineering controls. As an alternative, a watershed scale groundwater fate and transport model for the Rainy Headwaters, a northeastern Minnesota watershed, has been developed to examine the risks of leaks or spills to a pristine downstream watershed. The model shows that the risk depends on the location and whether the source of the leak is on the surface or from deeper underground facilities. Underground sources cause loads that last longer but arrive at rivers after a longer travel time and have lower concentrations due to dilution and attenuation. Surface contaminant sources could cause much more short-term damage to the resource. Because groundwater dominates baseflow, mine contaminant seepage would cause the most damage during low flow periods. Groundwater flow and transport modeling is a useful tool for decreasing the risk to downgradient sources by aiding in the placement of mine facilities. Although mines are located based on the minerals, advance planning and analysis could avoid siting mine facilities where failure or leaks would cause too much natural resource damage. Watershed scale transport modeling could help locate the facilities or decide in advance that the mine should not be constructed due to the risk to downstream resources.

  17. A novel approach to generating CER hypotheses based on mining clinical data.

    PubMed

    Zhang, Shuo; Li, Lin; Yu, Yiqin; Sun, Xingzhi; Xu, Linhao; Zhao, Wei; Teng, Xiaofei; Pan, Yue

    2013-01-01

    Comparative effectiveness research (CER) is a scientific method of investigating the effectiveness of alternative intervention methods. In a CER study, clinical researchers typically start with a CER hypothesis and aim to evaluate it by applying a series of medical statistical methods. Traditionally, CER hypotheses are defined manually by clinical researchers. This makes the task of hypothesis generation very time-consuming and the quality of the hypotheses heavily dependent on the researchers' skills. Recently, with more electronic medical data being collected, it has become highly promising to apply computerized methods for discovering CER hypotheses from clinical data sets. In this poster, we propose a novel approach to automatically generating CER hypotheses based on mining clinical data, and present a case study showing that the approach can help clinical researchers identify potentially valuable hypotheses and eventually define high-quality CER studies.

  18. Large screen approaches to identify novel malaria vaccine candidates.

    PubMed

    Davies, D Huw; Duffy, Patrick; Bodmer, Jean-Luc; Felgner, Philip L; Doolan, Denise L

    2015-12-22

    Until recently, malaria vaccine development efforts have focused almost exclusively on a handful of well characterized Plasmodium falciparum antigens. Despite dedicated work by many researchers on different continents spanning more than half a century, a successful malaria vaccine remains elusive. Sequencing of the P. falciparum genome has revealed more than five thousand genes, providing the foundation for systematic approaches to discover candidate vaccine antigens. We are taking advantage of this wealth of information to discover new antigens that may be more effective vaccine targets. Herein, we describe different approaches to large-scale screening of the P. falciparum genome to identify targets of either antibody responses or T cell responses using human specimens collected in Controlled Human Malaria Infections (CHMI) or under conditions of natural exposure in the field. These genome, proteome and transcriptome based approaches offer enormous potential for the development of an efficacious malaria vaccine. PMID:26428458

  19. Trend Motif: A Graph Mining Approach for Analysis of Dynamic Complex Networks

    SciTech Connect

    Jin, R; McCallen, S; Almaas, E

    2007-05-28

    Complex networks have been used successfully in scientific disciplines ranging from sociology to microbiology to describe systems of interacting units. Until recently, studies of complex networks have mainly focused on their network topology. However, in many real-world applications, the edges and vertices have associated attributes that are frequently represented as vertex or edge weights. Furthermore, these weights are often not static, instead changing with time and forming a time series. Hence, to fully understand the dynamics of the complex network, we have to consider both network topology and related time series data. In this work, we propose a motif mining approach to identify trend motifs for such purposes. Simply stated, a trend motif describes a recurring subgraph where each of its vertices or edges displays similar dynamics over a user-defined period. Given this, each trend motif occurrence can help reveal significant events in a complex system; frequent trend motifs may aid in uncovering dynamic rules of change for the system, and the distribution of trend motifs may characterize the global dynamics of the system. Here, we have developed efficient mining algorithms to extract trend motifs. Our experimental validation using three disparate empirical datasets, ranging from the stock market and world trade to a protein interaction network, has demonstrated the efficiency and effectiveness of our approach.

  20. Determining the familial risk distribution of colorectal cancer: a data mining approach.

    PubMed

    Chau, Rowena; Jenkins, Mark A; Buchanan, Daniel D; Ait Ouakrim, Driss; Giles, Graham G; Casey, Graham; Gallinger, Steven; Haile, Robert W; Le Marchand, Loic; Newcomb, Polly A; Lindor, Noralane M; Hopper, John L; Win, Aung Ko

    2016-04-01

    This study aimed to characterize the distribution of colorectal cancer risk using family history of cancers by data mining. Family histories for 10,066 colorectal cancer cases recruited to population cancer registries of the Colon Cancer Family Registry were analyzed using a data mining framework. A novel index was developed to quantify familial cancer aggregation. An artificial neural network was used to identify distinct categories of familial risk. Standardized incidence ratios (SIRs) and corresponding 95% confidence intervals (CIs) of colorectal cancer were calculated for each category. We identified five major and 66 minor categories of familial risk for developing colorectal cancer. The distribution of the major risk categories was: (1) 7% of families (SIR = 7.11; 95% CI 6.65-7.59) had a strong family history of colorectal cancer; (2) 13% of families (SIR = 2.94; 95% CI 2.78-3.10) had a moderate family history of colorectal cancer; (3) 11% of families (SIR = 1.23; 95% CI 1.12-1.36) had a strong family history of breast cancer and a weak family history of colorectal cancer; (4) 9% of families (SIR = 1.06; 95% CI 0.96-1.18) had a strong family history of prostate cancer and a weak family history of colorectal cancer; and (5) 60% of families (SIR = 0.61; 95% CI 0.57-0.65) had a weak family history of all cancers. There is a wide variation of colorectal cancer risk that can be categorized by family history of cancer, with a strong gradient of colorectal cancer risk between the highest and lowest risk categories. The risk of colorectal cancer for people in the highest risk category of family history (7% of the population) was 12 times that for people in the lowest risk category (60% of the population). Data mining proved to be an effective approach for gaining insight into the underlying cancer aggregation patterns and for categorizing familial risk of colorectal cancer.
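
    As a quick arithmetic check of the figures above, the short sketch below assumes that the roughly 12-fold gradient is the ratio of the SIRs of the highest- and lowest-risk major categories; the abstract does not state how the figure was derived, so this is an assumption. It also confirms that the five major category shares sum to 100%.

```python
# SIRs reported in the abstract for the highest- and lowest-risk major categories.
sir_highest = 7.11
sir_lowest = 0.61

# Assumption: the "12 times" gradient is the ratio of these two SIRs.
relative_risk = sir_highest / sir_lowest
print(f"ratio of SIRs: {relative_risk:.1f}")

# Shares of families (in percent) in the five major categories, as reported.
shares = [7, 13, 11, 9, 60]
print(f"total share: {sum(shares)}%")
```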

  1. A novel approach to tag and identify geranylgeranylated proteins

    PubMed Central

    Chan, Lai N.; Hart, Courtenay; Guo, Lea; Nyberg, Tamara; Davies, Brandon S.J.; Fong, Loren G.; Young, Stephen G.; Agnew, Brian J.; Tamanoi, Fuyuhiko

    2010-01-01

    A recently developed proteomic strategy, the “GG-azide”-labeling approach, is described for the detection and proteomic analysis of geranylgeranylated proteins. This approach involves metabolic incorporation of a synthetic azido-geranylgeranyl analog and chemoselective derivatization of azido-geranylgeranyl-modified proteins by “click” chemistry, using a tetramethylrhodamine-alkyne. The resulting conjugated proteins can be separated by 1-D or 2-D and pH fractionation, and detected by fluorescence imaging. This method is compatible with downstream LC-MS/MS analysis. Proteomic analysis of conjugated proteins by this approach identified several known geranylgeranylated proteins as well as Rap2c, a novel member of the Ras family. Furthermore, prenylation of progerin in mouse embryonic fibroblast cells was examined using this approach, demonstrating that this strategy can be used to study prenylation of specific proteins. The “GG-azide”-labeling approach provides a new tool for the detection and proteomic analysis of geranylgeranylated proteins, and it can readily be extended to other post-translational modifications. PMID:19784953

  2. Hazards identified and the need for health risk assessment in the South African mining industry.

    PubMed

    Utembe, W; Faustman, E M; Matatiele, P; Gulumian, M

    2015-12-01

    Although mining plays a prominent role in the economy of South Africa, it is associated with many chemical hazards. Exposure to dust from mining can lead to many pathological effects depending on mineralogical composition, size, shape, and levels and duration of exposure. Mining and processing of minerals also result in occupational exposure to toxic substances such as platinum, chromium, vanadium, manganese, mercury, cyanide and diesel particulate. South Africa has set occupational exposure limits (OELs) for some hazards, but mine workers are still at risk. Since the hazard posed by a mineral depends on its physicochemical properties, it is recommended that South Africa should not simply adopt OELs from other countries but rather set her own standards based on local toxicity studies. The limits should take into account the issue of mixtures to which workers could be exposed as well as the health status of the workers. The mining industry is also a source of contamination of the environment, due inter alia to the large areas of tailings dams and dumps left behind. Therefore, there is a need to develop guidelines for safe land uses of contaminated lands after mine closure.

  3. Practical Approaches for Mining Frequent Patterns in Molecular Datasets.

    PubMed

    Naulaerts, Stefan; Moens, Sandy; Engelen, Kristof; Berghe, Wim Vanden; Goethals, Bart; Laukens, Kris; Meysman, Pieter

    2016-01-01

    Pattern detection is an inherent task in the analysis and interpretation of complex and continuously accumulating biological data. Numerous itemset mining algorithms have been developed in the last decade to efficiently detect specific pattern classes in data. Although many of these have proven their value for addressing bioinformatics problems, several factors still slow down promising algorithms from gaining popularity in the life science community. Many of these issues stem from the low user-friendliness of these tools and the complexity of their output, which is often large, static, and consequently hard to interpret. Here, we apply three software implementations on common bioinformatics problems and illustrate some of the advantages and disadvantages of each, as well as inherent pitfalls of biological data mining. Frequent itemset mining exists in many different flavors, and users should decide their software choice based on their research question, programming proficiency, and added value of extra features. PMID:27168722

  4. Practical Approaches for Mining Frequent Patterns in Molecular Datasets

    PubMed Central

    Naulaerts, Stefan; Moens, Sandy; Engelen, Kristof; Berghe, Wim Vanden; Goethals, Bart; Laukens, Kris; Meysman, Pieter

    2016-01-01

    Pattern detection is an inherent task in the analysis and interpretation of complex and continuously accumulating biological data. Numerous itemset mining algorithms have been developed in the last decade to efficiently detect specific pattern classes in data. Although many of these have proven their value for addressing bioinformatics problems, several factors still slow down promising algorithms from gaining popularity in the life science community. Many of these issues stem from the low user-friendliness of these tools and the complexity of their output, which is often large, static, and consequently hard to interpret. Here, we apply three software implementations on common bioinformatics problems and illustrate some of the advantages and disadvantages of each, as well as inherent pitfalls of biological data mining. Frequent itemset mining exists in many different flavors, and users should decide their software choice based on their research question, programming proficiency, and added value of extra features. PMID:27168722

  5. A genomic approach to identify hybrid incompatibility genes

    PubMed Central

    Cooper, Jacob C.; Phadnis, Nitin

    2016-01-01

    ABSTRACT Uncovering the genetic and molecular basis of barriers to gene flow between populations is key to understanding how new species are born. Intrinsic postzygotic reproductive barriers such as hybrid sterility and hybrid inviability are caused by deleterious genetic interactions known as hybrid incompatibilities. The difficulty in identifying these hybrid incompatibility genes remains a rate-limiting step in our understanding of the molecular basis of speciation. We recently described how whole genome sequencing can be applied to identify hybrid incompatibility genes, even from genetically terminal hybrids. Using this approach, we discovered a new hybrid incompatibility gene, gfzf, between Drosophila melanogaster and Drosophila simulans, and found that it plays an essential role in cell cycle regulation. Here, we discuss the history of the hunt for incompatibility genes between these species, discuss the molecular roles of gfzf in cell cycle regulation, and explore how intragenomic conflict drives the evolution of fundamental cellular mechanisms that lead to the developmental arrest of hybrids. PMID:27230814

  6. Quantiles Regression Approach to Identifying the Determinant of Breastfeeding Duration

    NASA Astrophysics Data System (ADS)

    Mahdiyah; Norsiah Mohamed, Wan; Ibrahim, Kamarulzaman

    In this study, a quantile regression approach is applied to data from the Malaysian Family Life Survey (MFLS) to identify factors that are significantly related to the different conditional quantiles of breastfeeding duration. Classical linear regression methods are based on minimizing the residual sum of squares, whereas quantile regression uses a mechanism based on the conditional median function and the full range of other conditional quantile functions. Overall, it is found that the period of breastfeeding is significantly related to place of living, religion and total number of children in the family.
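
    The contrast with least squares can be made concrete with the check (pinball) loss that quantile regression minimizes. The sketch below is a minimal illustration without covariates: the durations are hypothetical, not MFLS data, and the helper names are assumptions. It only shows that minimizing the check loss over a constant recovers the corresponding sample quantile.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau in (0, 1)."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def best_constant(values, tau):
    """Return the observed value that minimizes the total pinball loss."""
    return min(values, key=lambda q: sum(pinball_loss(v - q, tau) for v in values))


# Hypothetical breastfeeding durations in months (illustrative only, not MFLS data).
durations = [1, 2, 3, 4, 6, 9, 12, 18, 24]

median_fit = best_constant(durations, 0.5)   # tau = 0.5 gives the median
upper_fit = best_constant(durations, 0.75)   # tau = 0.75 gives the upper quartile
print(median_fit, upper_fit)
```

    A full quantile regression replaces the constant with a linear function of covariates (for example, place of living or number of children) and minimizes the same loss.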

  7. The Usage of Association Rule Mining to Identify Influencing Factors on Deafness After Birth

    PubMed Central

    Shahraki, Azimeh Danesh; Safdari, Reza; Gahfarokhi, Hamid Habibi; Tahmasebian, Shahram

    2015-01-01

    Background: Providing complete, high-quality health care services plays a very important role in enabling people to understand the factors related to personal and social health and to choose suitable healthy behaviors in order to achieve a healthy life. For this reason, demographic and clinical data on individuals are collected; this huge volume of data is a valuable resource for analyzing, exploring and discovering valuable information and communication. This study used association rule techniques in data mining to identify the factors affecting hearing loss after birth in Iran. Materials and Methods: This is a data-oriented study. The study population consisted of questionnaires from several provinces of the country. First, all questionnaire data were stored as information tables in SQL Server, with data entry performed using software written in C# .NET; then the Association algorithm in SQL Server Data Tools and Clementine software was applied to determine the rules and hidden patterns in the gathered data. Findings: Two factors, the number of deaf brothers and the degree of consanguinity of the parents, have a significant impact on the severity of deafness of individuals. Also, when the severity of hearing loss is greater than or equal to moderately severe hearing loss, people use hearing aids, and men are less interested in the use of hearing aids. Conclusion: In families in which the parents are in a consanguineous marriage between first-degree (girl/boy cousins) or second-degree relatives (girl/boy cousins), and especially first-degree relatives, the number of people with severe hearing loss or deafness is higher; in the use of hearing aids, the gender of the patient is more important than the severity of the hearing loss. PMID:26862245

  8. Experimental approaches to identify non-coding RNAs

    PubMed Central

    Hüttenhofer, Alexander; Vogel, Jörg

    2006-01-01

    Cellular RNAs that do not function as messenger RNAs (mRNAs), transfer RNAs (tRNAs) or ribosomal RNAs (rRNAs) comprise a diverse class of molecules that are commonly referred to as non-protein-coding RNAs (ncRNAs). These molecules have been known for quite a while, but their importance was not fully appreciated until recent genome-wide searches discovered thousands of these molecules and their genes in a variety of model organisms. Some of these screens were based on biocomputational prediction of ncRNA candidates within entire genomes of model organisms. Alternatively, direct biochemical isolation of expressed ncRNAs from cells, tissues or entire organisms has been shown to be a powerful approach to identify ncRNAs both at the level of individual molecules and at a global scale. In this review, we will survey several such wet-lab strategies, i.e. direct sequencing of ncRNAs, shotgun cloning of small-sized ncRNAs (cDNA libraries), microarray analysis and genomic SELEX to identify novel ncRNAs, and discuss the advantages and limits of these approaches. PMID:16436800

  9. A data mining approach to evolutionary optimisation of noisy multi-objective problems

    NASA Astrophysics Data System (ADS)

    Chia, J. Y.; Goh, C. K.; Shim, V. A.; Tan, K. C.

    2012-07-01

    Many real-world optimisation problems have opposing objective functions which are subject to the influence of noise. Noise in the objective functions can adversely affect the stability, performance and convergence of evolutionary optimisers. This article proposes a Bayesian frequent data mining (DM) approach to identify optimal regions to guide the population amidst the presence of noise. The aggregated information provided by all the solutions helps to average out the effects of noise. This article proposes a DM crossover operator to make use of the rules mined. After implementation of this operator, better convergence to the true Pareto front is achieved at the expense of the diversity of the solutions. Consequently, an ExtremalExploration operator is proposed in the later part of this article to help curb the loss in diversity caused by the DM operator. The result is a more directive search with a faster convergence rate. The search is effective in decision spaces where the Pareto set is in a tight cluster. The performance of the proposed algorithm in noisy and noiseless environments is also investigated with respect to non-convexity, discontinuity, multi-modality and uniformity. The proposed algorithm is evaluated on ZDT and other benchmark problems. The results of the simulations indicate that the proposed method is effective in handling noise and is competitive against the other noise-tolerant algorithms.

  10. Identifying Subgroups among Hardcore Smokers: a Latent Profile Approach

    PubMed Central

    Bommelé, Jeroen; Kleinjan, Marloes; Schoenmakers, Tim M.; Burk, William J.; van den Eijnden, Regina; van de Mheen, Dike

    2015-01-01

    Introduction: Hardcore smokers are smokers who have little to no intention to quit. Previous research suggests that there are distinct subgroups among hardcore smokers and that these subgroups vary in the perceived pros and cons of smoking and quitting. Identifying these subgroups could help to develop individualized messages for the group of hardcore smokers. In this study we therefore used the perceived pros and cons of smoking and quitting to identify profiles among hardcore smokers. Methods: A sample of 510 hardcore smokers completed an online survey on the perceived pros and cons of smoking and quitting. We used these perceived pros and cons in a latent profile analysis to identify possible subgroups among hardcore smokers. To validate the profiles identified among hardcore smokers, we analysed data from a sample of 338 non-hardcore smokers in a similar way. Results: We found three profiles among hardcore smokers. ‘Receptive’ hardcore smokers (36%) perceived many cons of smoking and many pros of quitting. ‘Ambivalent’ hardcore smokers (59%) were rather undecided towards quitting. ‘Resistant’ hardcore smokers (5%) saw few cons of smoking and few pros of quitting. Among non-hardcore smokers, we found similar groups of ‘receptive’ smokers (30%) and ‘ambivalent’ smokers (54%). However, a third group consisted of ‘disengaged’ smokers (16%), who saw few pros and cons of both smoking and quitting. Discussion: Among hardcore smokers, we found three distinct profiles based on perceived pros and cons of smoking. This indicates that hardcore smokers are not a homogenous group. Each profile might require a different tobacco control approach. Our findings may help to develop individualized tobacco control messages for the particularly hard-to-reach group of hardcore smokers. PMID:26207829

  11. An enhanced stream mining approach for network anomaly detection

    NASA Astrophysics Data System (ADS)

    Bellaachia, Abdelghani; Bhatt, Rajat

    2005-03-01

    Network anomaly detection is one of the hot topics in the market today. Currently, researchers are trying to find a way in which machines could automatically learn both normal and anomalous behavior and thus detect anomalies if and when they occur. The most important applications that could spring out of these systems are intrusion detection and spam mail detection. In this paper, the primary focus is on the problem and solution of "real time" network intrusion detection, although the underlying theory discussed may also be used for other applications of anomaly detection (such as spam detection or spyware detection). Since a machine needs a learning process of its own, data mining has been chosen as the preferred technique. The object of this paper is to present a real-time clustering system, which we call Enhanced Stream Mining (ESM), that could analyze packet information (headers and data) to determine intrusions.

  12. PM2: a partitioning-mining-measuring method for identifying progressive changes in older adults' sleeping activity.

    PubMed

    Lin, Qiang; Zhang, Daqing; Connelly, Kay; Zhou, Xingshe; Ni, Hongbo

    2014-01-01

    As people age, their health typically declines, resulting in difficulty in performing daily activities. Sleep-related problems are common issues with older adults, including shifts in circadian rhythms. A detection method is proposed to identify progressive changes in sleeping activity using a three-step process: partitioning, mining, and measuring. Specifically, the original spatiotemporal representation of each sleeping activity instance was first transformed into a sequence of equal-sized segments, or symbols, via a partitioning process. A data-mining-based algorithm was proposed to find symbols that are not present in all instances of a sleeping activity. Finally, a measuring process was responsible for evaluating the changes in these symbols. Experimental evaluation conducted on a group of datasets of older adults showed that the proposed method is able to identify progressive changes in sleeping activity.
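
    The three steps named in this abstract (partitioning, mining, measuring) can be illustrated with a minimal sketch. It is not the authors' PM2 algorithm: the segment size, the letter-coded readings and the helper names (partition, non_universal_symbols, measure) are assumptions made purely for illustration.

```python
from collections import Counter


def partition(readings, size):
    """Cut a sequence of readings into equal-sized segments; each tuple is one symbol."""
    return [tuple(readings[i:i + size])
            for i in range(0, len(readings) - size + 1, size)]


def non_universal_symbols(instances):
    """Mining step: symbols found in at least one instance but not in all of them."""
    sets = [set(inst) for inst in instances]
    return set.union(*sets) - set.intersection(*sets)


def measure(instances, symbols):
    """Measuring step: per-instance occurrence count of each selected symbol."""
    return {s: [Counter(inst)[s] for inst in instances] for s in symbols}


# Hypothetical sleeping-activity instances (one list of coded readings per night).
nights = [
    ["A", "A", "B", "B", "A", "A"],
    ["A", "A", "B", "B", "C", "C"],
    ["A", "A", "C", "C", "C", "C"],
]
instances = [partition(night, 2) for night in nights]
changing = non_universal_symbols(instances)
trend = measure(instances, changing)
```

    In this toy data the symbol ("A", "A") occurs every night and is therefore ignored, while the counts for ("B", "B") and ("C", "C") across nights give a simple picture of a progressive change.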

  13. Novel approaches to identify protein adducts produced by lipid peroxidation.

    PubMed

    Codreanu, S G; Liebler, D C

    2015-01-01

    Lipid peroxidation is responsible for the generation of chemically reactive, diffusible lipid-derived electrophiles (LDEs) that covalently modify cellular protein targets. These protein modifications modulate protein activity and macromolecular interactions and induce adaptive and toxic cell signaling. Protein modifications induced by LDEs can be identified and quantified by affinity enrichment and liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based techniques. Tagged LDE analog probes with different electrophilic groups can be covalently captured by click chemistry for LC-MS/MS analyses, thereby enabling in-depth studies of proteome damage at the protein and peptide sequence levels. Conversely, click-reactive, thiol-directed probes can be used to evaluate thiol damage caused by LDEs by difference. These analytical approaches permit systematic study of the dynamics of protein damage caused by LDEs and the mechanisms by which oxidative stress contributes to toxicity and diseases. PMID:25819163

  14. Integrative biology approach identifies cytokine targeting strategies for psoriasis.

    PubMed

    Perera, Gayathri K; Ainali, Chrysanthi; Semenova, Ekaterina; Hundhausen, Christian; Barinaga, Guillermo; Kassen, Deepika; Williams, Andrew E; Mirza, Muddassar M; Balazs, Mercedesz; Wang, Xiaoting; Rodriguez, Robert Sanchez; Alendar, Andrej; Barker, Jonathan; Tsoka, Sophia; Ouyang, Wenjun; Nestle, Frank O

    2014-02-12

    Cytokines are critical checkpoints of inflammation. The treatment of human autoimmune disease has been revolutionized by targeting inflammatory cytokines as key drivers of disease pathogenesis. Despite this, there exist numerous pitfalls when translating preclinical data into the clinic. We developed an integrative biology approach combining human disease transcriptome data sets with clinically relevant in vivo models in an attempt to bridge this translational gap. We chose interleukin-22 (IL-22) as a model cytokine because of its potentially important proinflammatory role in epithelial tissues. Injection of IL-22 into normal human skin grafts produced marked inflammatory skin changes resembling human psoriasis. Injection of anti-IL-22 monoclonal antibody in a human xenotransplant model of psoriasis, developed specifically to test potential therapeutic candidates, efficiently blocked skin inflammation. Bioinformatic analysis integrating both the IL-22 and anti-IL-22 cytokine transcriptomes and mapping them onto a psoriasis disease gene coexpression network identified key cytokine-dependent hub genes. Using knockout mice and small-molecule blockade, we show that one of these hub genes, the so far unexplored serine/threonine kinase PIM1, is a critical checkpoint for human skin inflammation and potential future therapeutic target in psoriasis. Using in silico integration of human data sets and biological models, we were able to identify a new target in the treatment of psoriasis.

  15. An Approach to Realizing Process Control for Underground Mining Operations of Mobile Machines.

    PubMed

    Song, Zhen; Schunnesson, Håkan; Rinne, Mikael; Sturgul, John

    2015-01-01

    Excavation and production in underground mines are complicated processes which consist of many different operations. The process of underground mining is considerably constrained by the geometry and geology of the mine. The various mining operations are normally performed in series at each working face. A delay in a single operation leads to a domino effect, delaying the start of the next process and the completion of the entire process. This paper presents a new approach to process control for underground mining operations, e.g. drilling, bolting, mucking. This approach can estimate the working time and its probability for each operation more efficiently and objectively by improving the existing PERT (Program Evaluation and Review Technique) and CPM (Critical Path Method). If the delay of a critical operation (one that is on a critical path) inevitably affects the productivity of mined ore, the approach can rapidly assign mucking machines new jobs to increase this amount to a maximum level by using a new mucking algorithm under external constraints.
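
    For orientation, the classical PERT three-point estimate that this work builds on can be sketched as follows. This is the textbook formula, not the improved method of the paper; the Operation class, the durations and the assumption of independent operations performed in series are illustrative only.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Operation:
    name: str
    optimistic: float   # shortest plausible duration (hours)
    most_likely: float  # most likely duration (hours)
    pessimistic: float  # longest plausible duration (hours)

    def expected(self) -> float:
        # Classical PERT three-point estimate of the mean duration.
        return (self.optimistic + 4 * self.most_likely + self.pessimistic) / 6

    def variance(self) -> float:
        # Classical PERT variance estimate.
        return ((self.pessimistic - self.optimistic) / 6) ** 2


def series_estimate(ops):
    """For independent operations in series, expected times and variances add."""
    total = sum(op.expected() for op in ops)
    std = sqrt(sum(op.variance() for op in ops))
    return total, std


# Hypothetical durations for one working face, for illustration only.
cycle = [
    Operation("drilling", 2.0, 3.0, 4.0),
    Operation("bolting", 1.0, 2.0, 3.0),
    Operation("mucking", 1.0, 1.5, 5.0),
]
total, std = series_estimate(cycle)
```

    Because the operations run in series, every one of them lies on the critical path, so a delay in any single operation shifts the completion time of the whole cycle.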

  16. An Approach to Realizing Process Control for Underground Mining Operations of Mobile Machines

    PubMed Central

    Song, Zhen; Schunnesson, Håkan; Rinne, Mikael; Sturgul, John

    2015-01-01

    Excavation and production in underground mines are complicated processes which consist of many different operations. The process of underground mining is considerably constrained by the geometry and geology of the mine. The various mining operations are normally performed in series at each working face. A delay in a single operation leads to a domino effect, delaying the start of the next process and the completion of the entire process. This paper presents a new approach to process control for underground mining operations, e.g. drilling, bolting, mucking. This approach can estimate the working time and its probability for each operation more efficiently and objectively by improving the existing PERT (Program Evaluation and Review Technique) and CPM (Critical Path Method). If the delay of a critical operation (one that is on a critical path) inevitably affects the productivity of mined ore, the approach can rapidly assign mucking machines new jobs to increase this amount to a maximum level by using a new mucking algorithm under external constraints. PMID:26062092

  18. THE FUTURE OF COMPUTER-BASED TOXICITY PREDICTION: MECHANISM-BASED MODELS VS. INFORMATION MINING APPROACHES

    EPA Science Inventory


    When we speak of computer-based toxicity prediction, we are generally referring to a broad array of approaches which rely primarily upon chemical structure ...

  19. Dual-band, infrared buried mine detection using a statistical pattern recognition approach

    SciTech Connect

    Buhl, M.R.; Hernandez, J.E.; Clark, G.A.; Sengupta, S.K.

    1993-08-01

    The main objective of this work was to detect surrogate land mines, which were buried in clay and sand, using dual-band, infrared images. A statistical pattern recognition approach was used to achieve this objective. This approach is discussed and results of applying it to real images are given.

  20. Assessing Weather-Yield Relationships in Rice at Local Scale Using Data Mining Approaches.

    PubMed

    Delerce, Sylvain; Dorado, Hugo; Grillon, Alexandre; Rebolledo, Maria Camila; Prager, Steven D; Patiño, Victor Hugo; Garcés Varón, Gabriel; Jiménez, Daniel

    2016-01-01

    Seasonal and inter-annual climate variability have become important issues for farmers, and climate change has been shown to increase them. Simultaneously farmers and agricultural organizations are increasingly collecting observational data about in situ crop performance. Agriculture thus needs new tools to cope with changing environmental conditions and to take advantage of these data. Data mining techniques make it possible to extract embedded knowledge associated with farmer experiences from these large observational datasets in order to identify best practices for adapting to climate variability. We introduce new approaches through a case study on irrigated and rainfed rice in Colombia. Preexisting observational datasets of commercial harvest records were combined with in situ daily weather series. Using Conditional Inference Forest and clustering techniques, we assessed the relationships between climatic factors and crop yield variability at the local scale for specific cultivars and growth stages. The analysis showed clear relationships in the various location-cultivar combinations, with climatic factors explaining 6 to 46% of spatiotemporal variability in yield, and with crop responses to weather being non-linear and cultivar-specific. Climatic factors affected cultivars differently during each stage of development. For instance, one cultivar was affected by high nighttime temperatures in the reproductive stage but responded positively to accumulated solar radiation during the ripening stage. Another was affected by high nighttime temperatures during both the vegetative and reproductive stages. Clustering of the weather patterns corresponding to individual cropping events revealed different groups of weather patterns for irrigated and rainfed systems with contrasting yield levels. Best-suited cultivars were identified for some weather patterns, making weather-site-specific recommendations possible. This study illustrates the potential of data mining for

  2. Assessing Weather-Yield Relationships in Rice at Local Scale Using Data Mining Approaches

    PubMed Central

    Delerce, Sylvain; Dorado, Hugo; Grillon, Alexandre; Rebolledo, Maria Camila; Prager, Steven D.; Patiño, Victor Hugo; Garcés Varón, Gabriel; Jiménez, Daniel

    2016-01-01

    Seasonal and inter-annual climate variability have become important issues for farmers, and climate change has been shown to increase them. Simultaneously farmers and agricultural organizations are increasingly collecting observational data about in situ crop performance. Agriculture thus needs new tools to cope with changing environmental conditions and to take advantage of these data. Data mining techniques make it possible to extract embedded knowledge associated with farmer experiences from these large observational datasets in order to identify best practices for adapting to climate variability. We introduce new approaches through a case study on irrigated and rainfed rice in Colombia. Preexisting observational datasets of commercial harvest records were combined with in situ daily weather series. Using Conditional Inference Forest and clustering techniques, we assessed the relationships between climatic factors and crop yield variability at the local scale for specific cultivars and growth stages. The analysis showed clear relationships in the various location-cultivar combinations, with climatic factors explaining 6 to 46% of spatiotemporal variability in yield, and with crop responses to weather being non-linear and cultivar-specific. Climatic factors affected cultivars differently during each stage of development. For instance, one cultivar was affected by high nighttime temperatures in the reproductive stage but responded positively to accumulated solar radiation during the ripening stage. Another was affected by high nighttime temperatures during both the vegetative and reproductive stages. Clustering of the weather patterns corresponding to individual cropping events revealed different groups of weather patterns for irrigated and rainfed systems with contrasting yield levels. Best-suited cultivars were identified for some weather patterns, making weather-site-specific recommendations possible. This study illustrates the potential of data mining for

  3. Social Network Analysis and Mining to Monitor and Identify Problems with Large-Scale Information and Communication Technology Interventions.

    PubMed

    da Silva, Aleksandra do Socorro; de Brito, Silvana Rossy; Vijaykumar, Nandamudi Lankalapalli; da Rocha, Cláudio Alex Jorge; Monteiro, Maurílio de Abreu; Costa, João Crisóstomo Weyl Albuquerque; Francês, Carlos Renato Lisboa

    2016-01-01

    The published literature reveals several arguments concerning the strategic importance of information and communication technology (ICT) interventions for developing countries where the digital divide is a challenge. Large-scale ICT interventions can be an option for countries whose regions, both urban and rural, present a high number of digitally excluded people. Our goal was to monitor and identify problems in interventions aimed at certification for a large number of participants in different geographical regions. Our case study is the training at Telecentros.BR, a program created in Brazil to install telecenters and certify individuals to use ICT resources. We propose an approach that applies social network analysis and mining techniques to data collected from the Telecentros.BR dataset and from the socioeconomic and telecommunications infrastructure indicators of the participants' municipalities. We found that (i) the analysis of interactions in different time periods reflects the objectives of each phase of training, highlighting the increased density in the phase in which participants develop and disseminate their projects; (ii) analysis according to the roles of participants (i.e., tutors or community members) reveals that the interactions were influenced by the center (or region) to which the participant belongs (that is, a community contained mainly members of the same region and always with the presence of tutors, contradicting expectations of the training project, which aimed for intense collaboration of the participants, regardless of the geographic region); (iii) the social network of participants influences the success of the training: that is, given evidence that the degree of the community member is in the highest range, the probability of this individual concluding the training is 0.689; (iv) the North region presented the lowest probability of participant certification, whereas the Northeast, which served municipalities with similar

  4. Social Network Analysis and Mining to Monitor and Identify Problems with Large-Scale Information and Communication Technology Interventions

    PubMed Central

    da Silva, Aleksandra do Socorro; de Brito, Silvana Rossy; Vijaykumar, Nandamudi Lankalapalli; da Rocha, Cláudio Alex Jorge; Monteiro, Maurílio de Abreu; Costa, João Crisóstomo Weyl Albuquerque; Francês, Carlos Renato Lisboa

    2016-01-01

    The published literature reveals several arguments concerning the strategic importance of information and communication technology (ICT) interventions for developing countries where the digital divide is a challenge. Large-scale ICT interventions can be an option for countries whose regions, both urban and rural, present a high number of digitally excluded people. Our goal was to monitor and identify problems in interventions aimed at certification for a large number of participants in different geographical regions. Our case study is the training at Telecentros.BR, a program created in Brazil to install telecenters and certify individuals to use ICT resources. We propose an approach that applies social network analysis and mining techniques to data collected from the Telecentros.BR dataset and from the socioeconomic and telecommunications infrastructure indicators of the participants’ municipalities. We found that (i) the analysis of interactions in different time periods reflects the objectives of each phase of training, highlighting the increased density in the phase in which participants develop and disseminate their projects; (ii) analysis according to the roles of participants (i.e., tutors or community members) reveals that the interactions were influenced by the center (or region) to which the participant belongs (that is, a community contained mainly members of the same region and always with the presence of tutors, contradicting expectations of the training project, which aimed for intense collaboration of the participants, regardless of the geographic region); (iii) the social network of participants influences the success of the training: that is, given evidence that the degree of the community member is in the highest range, the probability of this individual concluding the training is 0.689; (iv) the North region presented the lowest probability of participant certification, whereas the Northeast, which served municipalities with similar

  6. Mining 3D genome structure populations identifies major factors governing the stability of regulatory communities

    PubMed Central

    Dai, Chao; Li, Wenyuan; Tjong, Harianto; Hao, Shengli; Zhou, Yonggang; Li, Qingjiao; Chen, Lin; Zhu, Bing; Alber, Frank; Jasmine Zhou, Xianghong

    2016-01-01

    Three-dimensional (3D) genome structures vary from cell to cell even in an isogenic sample. Unlike protein structures, genome structures are highly plastic, posing a significant challenge for structure-function mapping. Here we report an approach to comprehensively identify 3D chromatin clusters that each occur frequently across a population of genome structures, either deconvoluted from ensemble-averaged Hi-C data or from a collection of single-cell Hi-C data. Applying our method to a population of genome structures (at the macrodomain resolution) of lymphoblastoid cells, we identify an atlas of stable inter-chromosomal chromatin clusters. A large number of these clusters are enriched in binding of specific regulatory factors and are therefore defined as ‘Regulatory Communities.’ We reveal two major factors, centromere clustering and transcription factor binding, which significantly stabilize such communities. Finally, we show that the regulatory communities differ substantially from cell to cell, indicating that expression variability could be impacted by genome structures. PMID:27240697

  7. What are the important flood damage-influencing parameters? A data mining approach

    NASA Astrophysics Data System (ADS)

    Merz, B.; Kreibich, H.; Lall, U.

    2012-04-01

    Today's approaches for assessing and modeling direct flood damages are not very advanced. The usual approach consists of stage-damage functions which relate the relative or absolute damage for a certain class of objects to the inundation depth. Other characteristics of the flooding situation and of the flooded object are rarely taken into account, although flood damage is influenced by a variety of factors. In this contribution we apply a group of data-mining techniques, known as tree-structured models, to flood damage assessment. Tree-structured models are attractive candidates for identifying important damage-influencing parameters in large damage data sets and for describing quantitatively the non-linear interactions between damage and damage-influencing parameters. A very comprehensive data set of more than 2000 damage records of private households in Germany is used. Each record contains details about a variety of potential damage-influencing characteristics, such as hydrological and hydraulic aspects of the flooding situation, state of precaution of the household, early warning and emergency measures undertaken, socio-economic status of the household. Tree-structured models are used to derive the dominating damage-influencing variables and their (non-linear) interactions. We show that they are a flexible and powerful alternative to traditional damage assessment approaches.
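
    To make the idea of a tree-structured model concrete, the sketch below finds the single binary split that a regression tree would choose first, by minimising the summed squared error of the two resulting groups. It is a generic illustration rather than the authors' model; the field names (depth_m, precaution, damage_ratio) and the four records are hypothetical.

```python
def sse(values):
    """Sum of squared errors of the values around their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(records, features, target):
    """Return (feature, threshold, sse) of the binary split with the lowest total SSE."""
    best = None
    for f in features:
        levels = sorted({r[f] for r in records})
        # Candidate thresholds: midpoints between consecutive observed values.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [r[target] for r in records if r[f] <= t]
            right = [r[target] for r in records if r[f] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


# Hypothetical damage records, for illustration only.
records = [
    {"depth_m": 0.2, "precaution": 1, "damage_ratio": 0.05},
    {"depth_m": 0.4, "precaution": 0, "damage_ratio": 0.10},
    {"depth_m": 1.5, "precaution": 1, "damage_ratio": 0.40},
    {"depth_m": 1.8, "precaution": 0, "damage_ratio": 0.50},
]
split = best_split(records, ["depth_m", "precaution"], "damage_ratio")
```

    A full tree repeats this search recursively inside each group, which is how interactions between variables (for example, precaution mattering only at shallow depths) can emerge from the data.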

  8. A quantitative approach to identifying predators from nest remains

    USGS Publications Warehouse

    Anthony, R.M.; Grand, J.B.; Fondell, T.F.; Manly, B.F.

    2004-01-01

    Nesting success of Dusky Canada Geese (Branta canadensis occidentalis) has declined greatly since a major earthquake affected southern Alaska in 1964. To identify nest predators, we collected predation data at goose nests and photographs of predators at natural nests containing artificial eggs in 1997-2000. To document feeding behavior by nest predators, we compiled the evidence from destroyed nests with known predators on our study site and from previous studies. We constructed a profile for each predator group and compared the evidence from 895 nests with unknown predators to our predator profiles using mixture-model analysis. This analysis indicated that 72% of destroyed nests were depredated by Bald Eagles and 13% by brown bears, and also yielded the probability that each nest was correctly assigned to a predator group based on model fit. Model testing using simulations indicated that the proportion estimated for eagle predation was unbiased and the proportion for bear predation was slightly overestimated. This approach may have application whenever there are adequate data on nests destroyed by known predators and predators exhibit different feeding behavior at nests.

  9. Towards a Numerical Approach to Identify Hydrogeological Landscape Units

    NASA Astrophysics Data System (ADS)

    Akter, F.; Vervoort, R. W.; Bishop, T. F. A.

    2014-12-01

    Groundwater salinity remains a major issue due to its impact on agriculture and infrastructure. In Australia, it is recognized that groundwater salinity varies significantly across space and time. In NSW, the state government has developed a landscape classification for management based on hydrogeology, land use and landscape aspects, mainly derived from GIS overlays and operator experience. In this study, we use historical water quality data, geology and drilling logs to develop a more rigorous numerical approach to landscape classification. A combination of statistical methods (a generalised additive model (GAM) and semi-variogram analysis) was used to identify the significant spatio-temporal factors that induce the variability of groundwater salinity across the Muttama catchment (1059 km²) in the southern part of NSW, Australia. The statistical model explained 57% of the variance in the electrical conductivity levels in the groundwater across the landscape. Geology and lag rainfall were the key factors that explained overall catchment groundwater salinity, thus defining the hydrogeological landscape units. Semi-variogram analysis revealed that the remaining residuals did not indicate further spatial organisation. Current work also focuses on predicting groundwater response times. The results of this study therefore highlight a framework for numerically developing hydrogeological units based on the geological landscape characteristics.
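
    The semi-variogram check mentioned in this abstract can be illustrated with a minimal empirical estimator, which averages half the squared difference between all pairs of values whose separation falls in the same distance bin. This is a generic sketch, not the authors' workflow; the coordinates, residual values and bin settings are hypothetical.

```python
from math import hypot


def empirical_semivariogram(points, bin_width, n_bins):
    """points: list of (x, y, z). Returns one (bin centre, gamma, pair count) per bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            b = int(hypot(xi - xj, yi - yj) // bin_width)
            if b < n_bins:
                sums[b] += (zi - zj) ** 2
                counts[b] += 1
    result = []
    for b in range(n_bins):
        # gamma(h) = sum of squared differences / (2 * number of pairs); None if no pairs.
        gamma = sums[b] / (2 * counts[b]) if counts[b] else None
        result.append(((b + 0.5) * bin_width, gamma, counts[b]))
    return result


# Hypothetical model residuals at three bore locations (x, y in km).
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0)]
variogram = empirical_semivariogram(points, bin_width=1.0, n_bins=3)
```

    A semi-variogram of residuals that stays roughly flat with distance suggests that no spatial structure is left unexplained, which is the kind of evidence the abstract refers to.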

  10. Identifying predictors of physics item difficulty: A linear regression approach

    NASA Astrophysics Data System (ADS)

    Mesic, Vanes; Muratovic, Hasnija

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge.
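
    The "variance explained" figure quoted above is the coefficient of determination of a regression model. The study's model used several item-level predictors; the sketch below is a single-predictor simplification with invented data, intended only to show how such a figure could be computed. The function name and the numbers are assumptions, not taken from the study.

```python
def fit_and_r_squared(x_values, y_values):
    """Ordinary least squares with one predictor; returns (slope, intercept, R^2)."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    s_xx = sum((x - mean_x) ** 2 for x in x_values)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give the share of variance explained.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(x_values, y_values))
    ss_tot = sum((y - mean_y) ** 2 for y in y_values)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical data: a coded item feature (x) against Rasch difficulty (y).
complexity = [1, 2, 3, 4, 5]
difficulty = [1.1, 1.9, 3.2, 3.8, 5.0]
slope, intercept, r_squared = fit_and_r_squared(complexity, difficulty)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```

    With several predictors the same ratio of residual to total sum of squares applies; only the fitting step changes.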

  11. A fluorescent approach for identifying P2X1 ligands

    PubMed Central

    Ruepp, Marc-David; Brozik, James A.; de Esch, Iwan J.P.; Farndale, Richard W.; Murrell-Lagnado, Ruth D.; Thompson, Andrew J.

    2015-01-01

    There are no commercially available, small, receptor-specific P2X1 ligands. There are several synthetic derivatives of the natural agonist ATP and some structurally-complex antagonists including compounds such as PPADS, NTP-ATP, suramin and its derivatives (e.g. NF279, NF449). NF449 is the most potent and selective ligand, but potencies of many others are not particularly high and they can also act at other P2X, P2Y and non-purinergic receptors. While there is clearly scope for further work on P2X1 receptor pharmacology, screening can be difficult owing to rapid receptor desensitisation. To reduce desensitisation substitutions can be made within the N-terminus of the P2X1 receptor, but these could also affect ligand properties. An alternative is the use of fluorescent voltage-sensitive dyes that respond to membrane potential changes resulting from channel opening. Here we utilised this approach in conjunction with fragment-based drug-discovery. Using a single concentration (300 μM) we identified 46 novel leads from a library of 1443 fragments (hit rate = 3.2%). These hits were independently validated by measuring concentration-dependence with the same voltage-sensitive dye, and by visualising the competition of hits with an Alexa-647-ATP fluorophore using confocal microscopy; confocal yielded kon (1.142 × 10⁶ M⁻¹ s⁻¹) and koff (0.136 s⁻¹) for Alexa-647-ATP (Kd = 119 nM). The identified hit fragments had promising structural diversity. In summary, the measurement of functional responses using voltage-sensitive dyes was flexible and cost-effective because labelled competitors were not needed, effects were independent of a specific binding site, and both agonist and antagonist actions were probed in a single assay. The method is widely applicable and could be applied to all P2X family members, as well as other voltage-gated and ligand-gated ion channels. This article is part of the Special Issue entitled ‘Fluorescent Tools in Neuropharmacology
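
    The hit rate and dissociation constant quoted above can be checked with a few lines of arithmetic. The sketch below assumes the usual relation Kd = koff / kon for a simple one-step binding model; the variable names are illustrative and only the numbers come from the abstract.

```python
# Screening hit rate: leads found divided by fragments screened.
hits = 46
library_size = 1443
hit_rate_percent = 100.0 * hits / library_size

# Equilibrium dissociation constant from the kinetic rate constants,
# assuming simple one-step bimolecular binding (Kd = koff / kon).
k_on = 1.142e6   # association rate constant, M^-1 s^-1
k_off = 0.136    # dissociation rate constant, s^-1
k_d_nanomolar = (k_off / k_on) * 1e9   # convert M to nM

print(f"hit rate = {hit_rate_percent:.1f}%")   # hit rate = 3.2%
print(f"Kd = {k_d_nanomolar:.0f} nM")          # Kd = 119 nM
```

    Both values agree with the figures reported in the abstract.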

  12. Identifying Heterogeneous Anisotropic Properties in Cerebral Aneurysms: A Pointwise Approach

    PubMed Central

    Zhao, Xuefeng; Raghavan, Madhavan L.; Lu, Jia

    2014-01-01

    The traditional approaches of estimating heterogeneous properties in a soft tissue structure using optimization based inverse methods often face difficulties because of the large number of unknowns to be simultaneously determined. This article proposes a new method for identifying the heterogeneous anisotropic nonlinear elastic properties in cerebral aneurysms. In this method, the local properties are determined directly from the pointwise stress-strain data, thus avoiding the need for simultaneously optimizing for the property values at all points/regions in the aneurysm. The stress distributions needed for a pointwise identification are computed using an inverse elastostatic method without invoking the material properties in question. This paradigm is tested numerically through simulated inflation tests on an image-based cerebral aneurysm sac. The wall tissue is modeled as an eight-ply laminate whose constitutive behavior is described by an anisotropic hyperelastic strain-energy function containing four parameters. The parameters are assumed to vary continuously in the sac. Deformed configurations generated from forward finite element analysis are taken as input to inversely establish the parameter distributions. The delineated and the assigned distributions are in excellent agreement. A forward verification is conducted by comparing the displacement solutions obtained from the delineated and the assigned material parameters at a different pressure. The deviations in nodal displacements are found to be within 0.2% in most parts of the sac. The study highlights some distinct features of the proposed method, and demonstrates the feasibility of organ level identification of the distributive anisotropic nonlinear properties in cerebral aneurysms. PMID:20490886

  13. A Lagrangian approach to identifying vortex pinch-off.

    PubMed

    O'Farrell, Clara; Dabiri, John O

    2010-03-01

    A criterion for identifying vortex ring pinch-off based on the Lagrangian coherent structures (LCSs) in the flow is proposed and demonstrated for a piston-cylinder arrangement with a piston stroke to diameter (L/D) ratio of approximately 12. It is found that the appearance of a new disconnected LCS and the termination of the original LCS are indicative of the initiation of vortex pinch-off. The subsequent growth of new LCSs, which tend to roll into spirals, indicates the formation of new vortex cores in the trailing shear layer. Using this criterion, the formation number is found to be 4.1 ± 0.1, which is consistent with the predicted formation number of approximately 4 of Gharib et al. [Gharib et al. J. Fluid Mech. 360, 121 (1998)]. The results obtained using the proposed LCS criterion are compared with those obtained using the circulation criterion of Gharib et al. and are found to be in excellent agreement. The LCS approach is also compared against other metrics, both Lagrangian and Eulerian, and is found to yield insight into the pinch-off process that these do not. Furthermore, the LCS analysis reveals a consistent pattern of coalescing or "pairing" of adjacent vortices in the trailing shear layer, a process which has been extensively documented in circular jets. Given that LCSs are objective and insensitive to local errors in the velocity field, the proposed criterion has the potential to be a robust tool for pinch-off identification. In particular, it may prove useful in the study of unsteady and low Reynolds number flows, where conventional methods based on vorticity prove difficult to use. PMID:20370303

  14. EST mining of the UniGene dataset to identify retina-specific genes.

    PubMed

    Stöhr, H; Mah, N; Schulz, H L; Gehrig, A; Fröhlich, S; Weber, B H

    2000-01-01

    Age-related macular degeneration (AMD) is a multifactorial disorder affecting the visual system with a high prevalence among the elderly population but with no effective therapy available at present. To better understand the pathogenesis of this disorder, the identification of the genetic factors and the determination of their contribution to AMD is needed. Towards this goal, we are pursuing a strategy that makes use of the EST data processed in the UniGene database and aims at the generation of a comprehensive catalogue of genes preferentially active in the human retina. Subsequently, these genes will be systematically assessed in AMD. We performed a retina EST sampling and obtained a total of 673 clusters containing only retina ESTs as well as 568 clusters with at least 30% of the ESTs in each cluster originating from retina cDNA libraries. Of these, 180 representative EST clusters with varying retina and non-retina EST contents were analyzed for their in vitro expression. This approach identified 39 transcripts with retina-specific expression. One of these genes (C18orf2) mapping to chromosome 18 was further characterized. Multiple C18orf2 transcripts display a complex pattern of differential splicing in the human retina. The various isoforms encode hypothetical polypeptides with no homologies to known proteins or protein motifs.

  15. Approaches to Identifying Synthetic Lethal Interactions in Cancer

    PubMed Central

    Thompson, Jordan M.; Nguyen, Quy H.; Singh, Manpreet; Razorenova, Olga V.

    2015-01-01

    Targeting synthetic lethal interactions is a promising new therapeutic approach to exploit specific changes that occur within cancer cells. Multiple approaches to investigate these interactions have been developed and successfully implemented, including chemical, siRNA, shRNA, and CRISPR library screens. Genome-wide computational approaches, such as DAISY, also have been successful in predicting synthetic lethal interactions from both cancer cell lines and patient samples. Each approach has its advantages and disadvantages that need to be considered depending on the cancer type and its molecular alterations. This review discusses these approaches and examines case studies that highlight their use. PMID:26029013

  16. A Hybrid Approach for Efficient Modeling of Medium-Frequency Propagation in Coal Mines

    PubMed Central

    Brocker, Donovan E.; Sieber, Peter E.; Waynert, Joseph A.; Li, Jingcheng; Werner, Pingjuan L.; Werner, Douglas H.

    2015-01-01

    An efficient procedure for modeling medium frequency (MF) communications in coal mines is introduced. In particular, a hybrid approach is formulated and demonstrated utilizing ideal transmission line equations to model MF propagation in combination with full-wave sections used for accurate simulation of local antenna-line coupling and other near-field effects. This work confirms that the hybrid method accurately models signal propagation from a source to a load for various system geometries and material compositions, while significantly reducing computation time. With such dramatic improvement to solution times, it becomes feasible to perform large-scale optimizations with the primary motivation of improving communications in coal mines both for daily operations and emergency response. Furthermore, it is demonstrated that the hybrid approach is suitable for modeling and optimizing large communication networks in coal mines that may otherwise be intractable to simulate using traditional full-wave techniques such as moment methods or finite-element analysis. PMID:26478686

  17. A Hybrid Data Mining Approach for Credit Card Usage Behavior Analysis

    NASA Astrophysics Data System (ADS)

    Tsai, Chieh-Yuan

    Credit cards are one of the most popular e-payment methods in current online e-commerce. To retain valuable customers, card issuers invest heavily in maintaining good relationships with them. Although several studies have examined card usage motivation, few have focused on analyzing how credit card usage behavior changes from time period t to t+1. To address this issue, an integrated data mining approach is proposed in this paper. First, customer profiles and their transaction data at time period t are retrieved from databases. Second, a LabelSOM neural network groups customers into segments and identifies critical characteristics for each group. Third, a fuzzy decision tree algorithm is used to construct usage behavior rules for the customer groups of interest. Finally, these rules are used to analyze the behavior changes between time periods t and t+1. An implementation case using a real credit card database provided by a commercial bank in Taiwan is presented to show the benefits of the proposed framework.

  18. A data-mining approach to biomarker identification from protein profiles using discrete stationary wavelet transform

    PubMed Central

    Montazery-Kordy, Hussain; Miran-Baygi, Mohammad Hossein; Moradi, Mohammad Hassan

    2008-01-01

    Objective: To develop a new bioinformatic tool based on a data-mining approach for extraction of the most informative proteins that could be used to find the potential biomarkers for the detection of cancer. Methods: Two independent datasets from serum samples of 253 ovarian cancer and 167 breast cancer patients were used. The samples were examined by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS). The datasets were used to extract the informative proteins using a data-mining method in the discrete stationary wavelet transform domain. As a dimensionality reduction procedure, the hard thresholding method was applied to reduce the number of wavelet coefficients. Also, a distance measure was used to select the most discriminative coefficients. To find the potential biomarkers using the selected wavelet coefficients, we applied the inverse discrete stationary wavelet transform combined with a two-sided t-test. Results: From the ovarian cancer dataset, a set of five proteins were detected as potential biomarkers that could be used to identify the cancer patients from the healthy cases with accuracy, sensitivity, and specificity of 100%. Also, from the breast cancer dataset, a set of eight proteins were found as the potential biomarkers that could separate the healthy cases from the cancer patients with accuracy of 98.26%, sensitivity of 100%, and specificity of 95.6%. Conclusion: The results have shown that the new bioinformatic tool can be used in combination with the high-throughput proteomic data such as SELDI-TOF MS to find the potential biomarkers with high discriminative power. PMID:18988305
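
    Two of the steps named above lend themselves to a short illustration: hard thresholding of wavelet coefficients, and the accuracy, sensitivity and specificity figures used to report the results. The sketch below is a minimal version under stated assumptions: the coefficients, threshold and labels are invented, the function names are hypothetical, and the wavelet transform and t-test steps are not reproduced.

```python
def hard_threshold(coefficients, threshold):
    """Keep coefficients whose magnitude exceeds the threshold; zero the rest."""
    return [c if abs(c) > threshold else 0.0 for c in coefficients]


def classification_metrics(true_labels, predicted_labels):
    """Return (accuracy, sensitivity, specificity); 1 = cancer, 0 = healthy.

    Assumes both classes occur in true_labels, so no denominator is zero.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)   # share of cancer cases detected
    specificity = tn / (tn + fp)   # share of healthy cases recognised
    return accuracy, sensitivity, specificity


# Invented wavelet coefficients and an arbitrary threshold.
kept = hard_threshold([0.05, -1.2, 0.3, -0.02, 2.4], threshold=0.5)
print(kept)

# Invented labels: three cancer cases, four healthy cases, one false positive.
accuracy, sensitivity, specificity = classification_metrics(
    [1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1],
)
print(accuracy, sensitivity, specificity)
```

    In this toy case every cancer sample is detected (sensitivity 1.0) while one healthy sample is misclassified, the same pattern as the breast cancer result reported above, where sensitivity is 100% and specificity is lower.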

  19. Cluster Analysis-Based Approaches for Geospatiotemporal Data Mining of Massive Data Sets for Identification of Forest Threats

    SciTech Connect

    Mills, Richard T; Hoffman, Forrest M; Kumar, Jitendra; HargroveJr., William Walter

    2011-01-01

    We investigate methods for geospatiotemporal data mining of multi-year land surface phenology data (250 m2 Normalized Difference Vegetation Index (NDVI) values derived from the Moderate Resolution Imaging Spectrometer (MODIS) in this study) for the conterminous United States (CONUS) as part of an early warning system for detecting threats to forest ecosystems. The approaches explored here are based on k-means cluster analysis of this massive data set, which provides a basis for defining the bounds of the expected or normal phenological patterns that indicate healthy vegetation at a given geographic location. We briefly describe the computational approaches we have used to make cluster analysis of such massive data sets feasible, describe approaches we have explored for distinguishing between normal and abnormal phenology, and present some examples in which we have applied these approaches to identify various forest disturbances in the CONUS.
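
    The k-means step described above can be sketched in a few lines. This is a minimal, single-machine illustration using Lloyd's algorithm with a deterministic start, not the authors' large-scale implementation. The toy NDVI trajectories, the function names and the 0.2 distance cutoff for flagging abnormal phenology are all assumptions made for the example.

```python
import math


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda j: squared_distance(point, centroids[j]))


def k_means(points, k, iterations=20):
    """Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        assignments = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # leave a centroid unchanged if its cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p, centroids) for p in points]


# Toy annual NDVI trajectories (spring, summer, autumn) for four map cells.
trajectories = [
    (0.20, 0.80, 0.30),
    (0.25, 0.75, 0.35),
    (0.60, 0.65, 0.60),
    (0.62, 0.66, 0.58),
]
centroids, assignments = k_means(trajectories, k=2)
print(assignments)

# Flag a new observation as abnormal when it lies far from every cluster centre.
observation = (0.20, 0.40, 0.30)
distance = math.sqrt(squared_distance(observation, centroids[nearest(observation, centroids)]))
is_abnormal = distance > 0.2  # hypothetical cutoff
print(is_abnormal)
```

    The cluster centres stand in for the "expected or normal" phenological patterns; how far an observation may stray before it counts as a disturbance is a design choice the abstract does not specify.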

  20. North American Bats and Mines Project: A cooperative approach for integrating bat conservation and mine-land reclamation

    SciTech Connect

    Ducummon, S.L.

    1997-12-31

    Inactive underground mines now provide essential habitat for more than half of North America's 44 bat species, including some of the largest remaining populations. Thousands of abandoned mines have already been closed or are slated for safety closures, and many are destroyed during renewed mining in historic districts. The available evidence suggests that millions of bats have already been lost due to these closures. Bats are primary predators of night-flying insects that cost American farmers and foresters billions of dollars annually; therefore, threats to bat survival are cause for serious concern. Fortunately, mine closure methods exist that protect both bats and humans. Bat Conservation International (BCI) and the USDI-Bureau of Land Management founded the North American Bats and Mines Project to provide national leadership and coordination to minimize the loss of mine-roosting bats. This partnership has involved federal and state mine-land and wildlife managers and the mining industry. BCI has trained hundreds of mine-land and wildlife managers nationwide in mine assessment techniques for bats and bat-compatible closure methods, published technical information on bats and mine-land management, presented papers on bats and mines at national mining and wildlife conferences, and collaborated with numerous federal, state, and private partners to protect some of the most important mine-roosting bat populations. Our new mining industry initiative, Mining for Habitat, is designed to develop bat habitat conservation and enhancement plans for active mining operations. It includes the creation of cost-effective artificial underground bat roosts using surplus mining materials such as old mine-truck tires and culverts buried beneath waste rock.

  1. Quantitative risk-based approach for improving water quality management in mining.

    PubMed

    Liu, Wenying; Moran, Chris J; Vink, Sue

    2011-09-01

    The potential environmental threats posed by freshwater withdrawal and mine water discharge are some of the main drivers for the mining industry to improve water management. The use of multiple sources of water supply and introducing water reuse into the mine site water system have been part of the operating philosophies employed by the mining industry to realize these improvements. However, a barrier to implementation of such good water management practices is concomitant water quality variation and the resulting impacts on the efficiency of mineral separation processes, and an increased environmental consequence of noncompliant discharge events. There is an increasing appreciation that conservative water management practices, production efficiency, and environmental consequences are intimately linked through the site water system. It is therefore essential to consider water management decisions and their impacts as an integrated system as opposed to dealing with each impact separately. This paper proposes an approach that could assist mine sites to manage water quality issues in a systematic manner at the system level. This approach can quantitatively forecast the risk related with water quality and evaluate the effectiveness of management strategies in mitigating the risk by quantifying implications for production and hence economic viability. PMID:21797262

  3. Integrated approach of environmental impact and risk assessment of Rosia Montana Mining Area, Romania.

    PubMed

    Stefănescu, Lucrina; Robu, Brînduşa Mihaela; Ozunu, Alexandru

    2013-11-01

    The environmental impact assessment of mining sites is currently a topic of great interest in Romania. Historical pollution in the Rosia Montana mining area of Romania caused extensive damage to environmental media. This paper has two goals: to investigate the environmental pollution induced by mining activities in the Rosia Montana area and to quantify the environmental impacts and associated risks by means of an integrated approach. Thus, a new method was developed and applied for quantifying the impact of mining activities, taking account of the quality of environmental media in the mining area, and used as a case study in the present paper. The associated risks are a function of the environmental impacts and the probability of their occurrence. The results show that the environmental impacts and quantified risks, based on quality indicators to characterize the environmental quality, are of a higher order, and thus measures for pollution remediation and control need to be considered in the investigated area. The conclusion drawn is that an integrated approach for the assessment of environmental impact and associated risks is a valuable and more objective method, and is an important tool that can be applied in the decision-making process for national authorities in the prioritization of emergency action.

  4. Identifying Similarities in Cognitive Subtest Functional Requirements: An Empirical Approach

    ERIC Educational Resources Information Center

    Frisby, Craig L.; Parkin, Jason R.

    2007-01-01

    In the cognitive test interpretation literature, a Rational/Intuitive, Indirect Empirical, or Combined approach is typically used to construct conceptual taxonomies of the functional (behavioral) similarities between subtests. To address shortcomings of these approaches, the functional requirements for 49 subtests from six individually…

  5. Data mining approach to web application intrusions detection

    NASA Astrophysics Data System (ADS)

    Kalicki, Arkadiusz

    2011-10-01

    Web applications have become the most popular medium on the Internet. Their popularity, the ease of use of web application scripting languages and frameworks, and careless development result in a high number of web application vulnerabilities and attacks. Improper input validation makes several types of attack possible: SQL injection, cross-site scripting, Cross-Site Request Forgery (CSRF), web spam in blogs and others. To secure web applications, intrusion detection systems (IDS) and intrusion prevention systems (IPS) are used. Intrusion detection systems fall into two groups: misuse detection (traditional IDS) and anomaly detection. This paper presents a data mining based algorithm for anomaly detection. The principle of the method is to compare incoming HTTP traffic with a previously built profile that contains a representation of the "normal" or expected web application usage sequence patterns. The frequent sequence patterns are found with the GSP algorithm. A previously presented detection method was rewritten and improved. Some tests show that the software catches malicious requests, especially long attack sequences; results are quite good for medium-length sequences, while for short sequences the method must be complemented with other methods.
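
    The profile-comparison idea described above can be illustrated with a deliberately simplified sketch. It does not implement GSP: as a stand-in, it counts contiguous pairs of requested paths (bigrams) across training sessions and keeps those seen in at least a minimum number of sessions. The session data, function names and support threshold are all hypothetical.

```python
from collections import Counter


def ngrams(session, n):
    """Contiguous runs of n requests within one session."""
    return [tuple(session[i:i + n]) for i in range(len(session) - n + 1)]


def build_profile(sessions, n=2, min_support=2):
    """Keep the n-grams that occur in at least min_support training sessions."""
    support = Counter()
    for session in sessions:
        support.update(set(ngrams(session, n)))  # count each session once
    return {gram for gram, count in support.items() if count >= min_support}


def anomaly_score(session, profile, n=2):
    """Fraction of the session's n-grams that are absent from the profile."""
    grams = ngrams(session, n)
    if not grams:  # sessions shorter than n cannot be scored this way
        return 0.0
    return sum(1 for g in grams if g not in profile) / len(grams)


# Invented "normal" browsing sessions used to build the profile.
training = [
    ["/login", "/home", "/cart", "/checkout"],
    ["/login", "/home", "/search", "/cart", "/checkout"],
    ["/login", "/home", "/cart"],
]
profile = build_profile(training)

normal_score = anomaly_score(["/login", "/home", "/cart", "/checkout"], profile)
attack_score = anomaly_score(["/login", "/admin", "/admin/export", "/admin/export"], profile)
print(normal_score, attack_score)
```

    A session shorter than the pattern length yields no n-grams and so cannot be scored, which is consistent with the abstract's remark that short sequences need complementary methods.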

  6. Ultrabroadband photonic Internet: data mining approach to security aspects

    NASA Astrophysics Data System (ADS)

    Kalicki, Arkadiusz

    2009-06-01

    Web applications have become the most popular medium on the Internet. Their popularity, the ease of use of web application frameworks, and careless development result in a high number of vulnerabilities and attacks. Improper input validation makes several types of attack possible. SQL injection is the ability to execute arbitrary SQL queries in a database through an existing application. Cross-site scripting is a vulnerability that allows malicious web users to inject code into the web pages viewed by other users. Cross-Site Request Forgery (CSRF) is an attack that tricks the victim into loading a page that contains a malicious request. Web spam in blogs is a further example. To secure web applications, intrusion detection systems (IDS) and intrusion prevention systems (IPS) are used. Intrusion detection systems fall into two groups: misuse detection (traditional IDS) and anomaly detection. Misuse detection systems are signature based and have high accuracy in detecting many kinds of known attacks, but cannot detect unknown and emerging attacks. They can be complemented with anomaly based intrusion detection and prevention systems. This paper presents an anomaly driven proxy as an IPS and a data mining based algorithm that was used to detect anomalies. The principle of the method is to compare incoming HTTP traffic with a previously built profile that contains a representation of the "normal" or expected web application usage sequence patterns. The frequent sequence patterns are found with the GSP algorithm. Some basic tests show that the software catches malicious requests.

  7. EVALUATION OF A TWO-STAGE PASSIVE TREATMENT APPROACH FOR MINING INFLUENCE WATERS

    EPA Science Inventory

    A two-stage passive treatment approach was assessed at bench-scale using two Colorado Mining Influenced Waters (MIWs). The first-stage was a limestone drain with the purpose of removing iron and aluminum and mitigating the potential effects of mineral acidity. The second stage w...

  8. DNA enrichment approaches to identify unauthorized genetically modified organisms (GMOs).

    PubMed

    Arulandhu, Alfred J; van Dijk, Jeroen P; Dobnik, David; Holst-Jensen, Arne; Shi, Jianxin; Zel, Jana; Kok, Esther J

    2016-07-01

    With the increased global production of different genetically modified (GM) plant varieties, chances increase that unauthorized GM organisms (UGMOs) may enter the food chain. At the same time, the detection of UGMOs is a challenging task because of the limited sequence information that will generally be available. PCR-based methods are available to detect and quantify known UGMOs in specific cases. If this approach is not feasible, DNA enrichment of the unknown adjacent sequences of known GMO elements is one way to detect the presence of UGMOs in a food or feed product. These enrichment approaches are also known as chromosome walking or gene walking (GW). In recent years, enrichment approaches have been coupled with next generation sequencing (NGS) analysis and implemented in, amongst others, the medical and microbiological fields. The present review will provide an overview of these approaches and an evaluation of their applicability in the identification of UGMOs in complex food or feed samples. PMID:27086015

  10. Abandoned mined land reclamation on the Wayne National Forest - an interdisciplinary approach

    SciTech Connect

    Moss, R.G.

    1982-12-01

    The Wayne National Forest contains several thousand acres of abandoned surface-mined lands, many of which are in need of reclamation. The Forest Service has developed a systematic interdisciplinary approach to planning and implementing reclamation projects. An environmental assessment report is prepared before the project is designed which provides decision makers the information needed to select a preferred reclamation alternative. A case study known as the Yost II Abandoned Mined Land Reclamation Project is presented. The abandoned mine, basically a double contour configuration, presented designers with a difficult mosaic of barren, toxic areas, well-revegetated areas, and acid ponds. The reclamation technique employed utilized burial of toxic soil, pond underdrains, crushed limestone filter strips, and topsoiling.

  11. Data Mining: A Systems Approach to Formative Assessment

    ERIC Educational Resources Information Center

    Schmid, Dale

    2012-01-01

    This article describes how using raw data and information from reliable assessments can inform teachers' decisions leading to improved instruction. The primary aim is to use a systems approach to provide evidence of what students know and how they demonstrate mastery. Such evidence can empower teachers to reach all students. The pedagogic…

  12. Data mining approaches to high-throughput crystal structure and compound prediction.

    PubMed

    Hautier, Geoffroy

    2014-01-01

    Predicting unknown inorganic compounds and their crystal structure is a critical step of high-throughput computational materials design and discovery. One way to achieve efficient compound prediction is to use data mining or machine learning methods. In this chapter we present a few algorithms for data mining compound prediction and their applications to different materials discovery problems. In particular, the patterns or correlations governing phase stability for experimental or computational inorganic compound databases are statistically learned and used to build probabilistic or regression models to identify novel compounds and their crystal structures. The stability of those compound candidates is then assessed using ab initio techniques. Finally, we report a few cases where data mining driven computational predictions were experimentally confirmed through inorganic synthesis.

  13. An Approach for Identifying Benefit Segments among Prospective College Students.

    ERIC Educational Resources Information Center

    Miller, Patrick; And Others

    1990-01-01

    A study investigated the importance to 578 applicants of various benefits offered by a moderately selective private university. Applicants rated the institution on 43 academic, social, financial, religious, and curricular attributes. The objective was to test the efficacy of one approach to college market segmentation. Results support the utility…

  14. Identifying Predictors of Physics Item Difficulty: A Linear Regression Approach

    ERIC Educational Resources Information Center

    Mesic, Vanes; Muratovic, Hasnija

    2011-01-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinary…

  15. Terrestrial Loads of Mercury in a River Catchment Impacted by Former Mercury Mining Activity - GIS Approach

    NASA Astrophysics Data System (ADS)

    Kocman, D.; Horvat, M.

    2010-12-01

    Mercury (Hg) mining activities in Idrija, Slovenia, have resulted in significant contamination to surrounding local environments. Due to the abundant precipitation, steep slopes and highly erodible underlying lithology, this region is especially prone to erosion. Hence, even today, due to the lack of proper remediation actions, Hg contaminated soils, Hg-laden material and tailings continue to supply Hg to the local river system and consequently to the Gulf of Trieste in N Adriatic. In addition, significant amounts of elemental Hg from these areas are emitted to the atmosphere. In this contribution, environmental modeling of mercury in the Idrijca River catchment (640 km²) as a tool for sound remediation planning will be discussed. Based on the experimental results, a mass-balance model of sources, sinks and mercury transport process on the catchment scale was developed within the GIS environment. The model is based upon mercury mass-balance calculations as a combination of different approaches (WCS mercury tool and Erosion Potential Model coupled with newly developed GIS emission model) adopted for site specific conditions. In this way, accounting for Hg entering and leaving the catchment, its mass flows, flux rates, and turnover times were identified. Moreover, spatial distribution and significance of most polluted sites that need to be properly managed were assessed. The developed model can be used as an efficient tool for planning remediation interventions as well as to check the effects of management choices which would minimize the uncertainties in the process of decision making.

  16. Genomic approaches to identifying transcriptional regulators of osteoblast differentiation

    NASA Technical Reports Server (NTRS)

    Stains, Joseph P.; Civitelli, Roberto

    2003-01-01

    Recent microarray studies of mouse and human osteoblast differentiation in vitro have identified novel transcription factors that may be important in the establishment and maintenance of differentiation. These findings help unravel the pattern of gene-expression changes that underlie the complex process of bone formation.

  17. Identifying the "Truly Disadvantaged": A Comprehensive Biosocial Approach

    ERIC Educational Resources Information Center

    Barnes, J. C.; Beaver, Kevin M.; Connolly, Eric J.; Schwartz, Joseph A.

    2016-01-01

    There has been significant interest in examining the developmental factors that predispose individuals to chronic criminal offending. This body of research has identified some social-environmental risk factors as potentially important. At the same time, the research producing these results has generally failed to employ genetically sensitive…

  18. Systems Approaches to Identifying Gene Regulatory Networks in Plants

    PubMed Central

    Long, Terri A.; Brady, Siobhan M.; Benfey, Philip N.

    2009-01-01

    Complex gene regulatory networks are composed of genes, noncoding RNAs, proteins, metabolites, and signaling components. The availability of genome-wide mutagenesis libraries; large-scale transcriptome, proteome, and metabolome data sets; and new high-throughput methods that uncover protein interactions underscores the need for mathematical modeling techniques that better enable scientists to synthesize these large amounts of information and to understand the properties of these biological systems. Systems biology approaches can allow researchers to move beyond a reductionist approach and to both integrate and comprehend the interactions of multiple components within these systems. Descriptive and mathematical models for gene regulatory networks can reveal emergent properties of these plant systems. This review highlights methods that researchers are using to obtain large-scale data sets, and examples of gene regulatory networks modeled with these data. Emergent properties revealed by the use of these network models and perspectives on the future of systems biology are discussed. PMID:18616425

  19. Multidisciplinary approach to identify aquifer-peatland connectivity

    NASA Astrophysics Data System (ADS)

    Larocque, Marie; Pellerin, Stéphanie; Cloutier, Vincent; Ferlatte, Miryane; Munger, Julie; Quillet, Anne; Paniconi, Claudio

    2015-04-01

    In southern Quebec (Canada), wetlands sustain increasing pressures from agriculture, urban development, and peat exploitation. To protect both groundwater and ecosystems, it is important to be able to identify how, where, and to what extent shallow aquifers and wetlands are connected. This study focuses on peatlands which are especially abundant in Quebec. The objective of this research was to better understand aquifer-peatland connectivity and to identify easily measured indicators of this connectivity. Geomorphology, hydrogeochemistry, and vegetation were selected as key indicators of connectivity. Twelve peatland transects were instrumented and monitored in the Abitibi (slope peatlands associated with eskers) and Centre-du-Quebec (depression peatlands) regions of Quebec (Canada). Geomorphology, geology, water levels, water chemistry, and vegetation species were identified/measured on all transects. Flow conditions were simulated numerically on two typical transects. Results show that a majority of peatland transects receives groundwater from a shallow aquifer. In slope peatlands, groundwater flows through the organic deposits towards the peatland center. In depression peatlands, groundwater flows only 100-200 m within the peatland before being redirected through surface routes towards the outlet. Flow modeling and sensitivity analysis have identified that the thickness and hydraulic conductivity of permeable deposits close to the peatland and beneath the organic deposits influence flow directions within the peatland. Geochemical data have confirmed the usefulness of total dissolved solids (TDS) exceeding 14 mg/L as an indicator of the presence of groundwater within the peatland. Vegetation surveys have allowed the identification of species and groups of species that occur mostly when groundwater is present, for instance Carex limosa and Sphagnum russowii. Geomorphological conditions (slope or depression peatland), TDS, and vegetation can be measured

  20. Identification of hydrologic indicators related to fish diversity and abundance: A data mining approach for fish community analysis

    NASA Astrophysics Data System (ADS)

    Yang, Yi-Chen E.; Cai, Ximing; Herricks, Edwin E.

    2008-04-01

    This paper develops a new approach to identify hydrologic indicators related to fish community and generate a quantitative function between an ecological target index and the identified hydrologic indicators. The approach is based on genetic programming (GP), a data mining method. Using the Shannon Index (a fish community diversity index) or the number of individuals (total abundance) of a fish community as an ecological target, the GP identified the most ecologically relevant hydrologic indicators (ERHIs) from 32 indicators of hydrologic alteration for the case study site, the upper Illinois River. Robustness analysis showed that different GP runs found a similar set of ERHIs; each of the ERHIs identified in different GP runs had a consistent relationship with the target index. By comparing the GP results with those from principal component analysis and autecology matrix, the three approaches identified a small number (six) of common ERHIs. Particularly, the timing of low flow (Dmin) seems to be more relevant to the diversity of the fish community, while the magnitude of the low flow (Qb) is more relevant to the total fish abundance; large rising rates result in a significant improvement of fish diversity, which is counterintuitive and against previous findings. The quantitative function developed by GP was further used to construct an indicator impact matrix (IIM), which was demonstrated as a potentially useful tool for streamflow restoration design.
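
    The two ecological targets named in the abstract above, the Shannon Index and total abundance, can be computed from per-species counts. The following is a minimal sketch, assuming the standard natural-log form H = -sum(p_i ln p_i); the function names and the sample counts are hypothetical and do not come from the paper.

```python
import math


def shannon_index(counts):
    """Shannon diversity index H = -sum(p_i * ln(p_i)).

    `counts` holds the number of individuals observed per species;
    species with a zero count contribute nothing to the sum.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def total_abundance(counts):
    """Total number of individuals across all species."""
    return sum(counts)


# Hypothetical sample: four species, ten individuals each.
even_sample = [10, 10, 10, 10]
diversity = shannon_index(even_sample)
abundance = total_abundance(even_sample)
```

    For a perfectly even community of S species the index equals ln(S), which gives a quick sanity check on any implementation.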

  1. A data mining based approach to predict spatiotemporal changes in satellite images

    NASA Astrophysics Data System (ADS)

    Boulila, W.; Farah, I. R.; Ettabaa, K. Saheb; Solaiman, B.; Ghézala, H. Ben

    2011-06-01

    The interpretation of remotely sensed images in a spatiotemporal context is becoming a valuable research topic. However, the constant growth of data volume in remote sensing imaging makes reaching conclusions based on collected data a challenging task. Recently, data mining has emerged as a promising research field leading to several interesting discoveries in various areas such as marketing, surveillance, fraud detection and scientific discovery. By integrating data mining and image interpretation techniques, accurate and relevant information (i.e. functional relation between observed parcels and a set of informational contents) can be automatically elicited. This study presents a new approach to predict spatiotemporal changes in satellite image databases. The proposed method exploits fuzzy sets and data mining concepts to build predictions and decisions for several remote sensing fields. It takes into account imperfections related to the spatiotemporal mining process in order to provide more accurate and reliable information about land cover changes in satellite images. The proposed approach is validated using SPOT images representing the Saint-Denis region, capital of Reunion Island. Results show good performance of the proposed framework in predicting change for the urban zone.

  2. New Seasonal Shift in In-Stream Diurnal Nitrate Cycles Identified by Mining High-Frequency Data.

    PubMed

    Aubert, Alice H; Breuer, Lutz

    2016-01-01

    The recent development of in-situ monitoring devices, such as UV-spectrometers, makes the study of short-term stream chemistry variation relevant, especially the study of diurnal cycles, which are not yet fully understood. Our study is based on high-frequency data from an agricultural catchment (Studienlandschaft Schwingbachtal, Germany). We propose a novel approach, i.e. the combination of cluster analysis and Linear Discriminant Analysis, to mine nitrate behavior patterns from these data. As a result, we observe a seasonality of nitrate diurnal cycles that differs from the most common cycle seasonality described in the literature, i.e. pre-dawn peaks in spring. Our cycles appear in summer and the maximum and minimum shift to a later time in late summer/autumn. This is observed both for water- and energy-limited years, thus potentially stressing the role of evapotranspiration. This concluding hypothesis on the role of evapotranspiration on nitrate stream concentration, which was obtained through data mining, broadens the perspective on the diurnal cycling of stream nitrate concentrations. PMID:27073838

  3. New Seasonal Shift in In-Stream Diurnal Nitrate Cycles Identified by Mining High-Frequency Data

    PubMed Central

    2016-01-01

    The recent development of in-situ monitoring devices, such as UV-spectrometers, makes the study of short-term stream chemistry variation relevant, especially the study of diurnal cycles, which are not yet fully understood. Our study is based on high-frequency data from an agricultural catchment (Studienlandschaft Schwingbachtal, Germany). We propose a novel approach, i.e. the combination of cluster analysis and Linear Discriminant Analysis, to mine nitrate behavior patterns from these data. As a result, we observe a seasonality of nitrate diurnal cycles that differs from the most common cycle seasonality described in the literature, i.e. pre-dawn peaks in spring. Our cycles appear in summer and the maximum and minimum shift to a later time in late summer/autumn. This is observed both for water- and energy-limited years, thus potentially stressing the role of evapotranspiration. This concluding hypothesis on the role of evapotranspiration on nitrate stream concentration, which was obtained through data mining, broadens the perspective on the diurnal cycling of stream nitrate concentrations. PMID:27073838

  4. Using Data Mining to Identify Actionable Information: Breaking New Ground in Data-Driven Decision Making

    ERIC Educational Resources Information Center

    Streifer, Philip A.; Schumann, Jeffrey A.

    2005-01-01

    The implementation of No Child Left Behind (NCLB) presents important challenges for schools across the nation to identify problems that lead to poor performance. Yet schools must intervene with instructional programs that can make a difference and evaluate the effectiveness of such programs. New advances in artificial intelligence (AI) data-mining…

  5. Identifying the Factors Affecting Science and Mathematics Achievement Using Data Mining Methods

    ERIC Educational Resources Information Center

    Kiray, S. Ahmet; Gok, Bilge; Bozkir, A. Selman

    2015-01-01

    The purpose of this article is to identify the order of significance of the variables that affect science and mathematics achievement in middle school students. For this aim, the study deals with the relationship between science and math in terms of different angles using the perspectives of multiple causes-single effect and of multiple…

  6. Mining a Written Values Affirmation Intervention to Identify the Unique Linguistic Features of Stigmatized Groups

    ERIC Educational Resources Information Center

    Riddle, Travis; Bhagavatula, Sowmya Sree; Guo, Weiwei; Muresan, Smaranda; Cohen, Geoff; Cook, Jonathan E.; Purdie-Vaughns, Valerie

    2015-01-01

    Social identity threat refers to the process through which an individual underperforms in some domain due to their concern with confirming a negative stereotype held about their group. Psychological research has identified this as one contributor to the underperformance and underrepresentation of women, Blacks, and Latinos in STEM fields. Over the…

  7. Timely approaches to identify probiotic species of the genus Lactobacillus

    PubMed Central

    2013-01-01

    Over the past decades the use of probiotics in food has increased largely due to the manufacturer’s interest in placing “healthy” food on the market based on the consumer’s ambitions to live healthy. Due to this trend, health benefits of products containing probiotic strains such as lactobacilli are promoted and probiotic strains have been established in many different products with their numbers increasing steadily. Probiotics are used as starter cultures in dairy products such as cheese or yoghurts and in addition they are also utilized in non-dairy products such as fermented vegetables, fermented meat and pharmaceuticals, thereby, covering a large variety of products. To assure quality management, several pheno-, physico- and genotyping methods have been established to unambiguously identify probiotic lactobacilli. These methods are often specific enough to identify the probiotic strains at genus and species levels. However, the probiotic ability is often strain dependent and it is impossible to distinguish strains by basic microbiological methods. Therefore, this review aims to critically summarize and evaluate conventional identification methods for the genus Lactobacillus, complemented by techniques that are currently being developed. PMID:24063519

  8. A new approach to estimate fugitive methane emissions from coal mining in China.

    PubMed

    Ju, Yiwen; Sun, Yue; Sa, Zhanyou; Pan, Jienan; Wang, Jilin; Hou, Quanlin; Li, Qingguang; Yan, Zhifeng; Liu, Jie

    2016-02-01

    Developing a more accurate greenhouse gas (GHG) emissions inventory has drawn much attention. Because of its resource endowment and technical status, China has made coal-related GHG emissions a big part of its inventory. Lacking a stoichiometric carbon conversion coefficient and influenced by geological conditions and mining technologies, previous efforts to estimate fugitive methane emissions from coal mining in China have led to disagreeing results. This paper proposes a new calculation methodology to determine fugitive methane emissions from coal mining based on the domestic analysis of gas geology, gas emission features, and the merits and demerits of existing estimation methods. This new approach involves four main parameters: in-situ original gas content, gas remaining post-desorption, raw coal production, and mining influence coefficient. The case studies in Huaibei-Huainan Coalfield and Jincheng Coalfield show that the new method obtains the smallest error, +9.59% and 7.01% respectively, compared with other methods, Tier 1 and Tier 2 (with two samples) in this study, which resulted in +140.34%, +138.90%, and -18.67% in Huaibei-Huainan Coalfield, and +64.36%, +47.07%, and -14.91% in Jincheng Coalfield. Compared with the predominantly used methods, the new one not only is a comparably simpler process with lower uncertainty than the "emission factor method" (IPCC recommended Tier 1 and Tier 2), but also has easier data accessibility, similar uncertainty, and additional post-mining emissions compared to the "absolute gas emission method" (IPCC recommended Tier 3). Therefore, methane emissions dissipated from most of the producing coal mines worldwide could be more accurately and more easily estimated.

  9. A novel meta-analytic approach: mining frequent co-activation patterns in neuroimaging databases.

    PubMed

    Caspers, Julian; Zilles, Karl; Beierle, Christoph; Rottschy, Claudia; Eickhoff, Simon B

    2014-04-15

    In recent years, coordinate-based meta-analyses have become a powerful and widely used tool to study co-activity across neuroimaging experiments, a development that was supported by the emergence of large-scale neuroimaging databases like BrainMap. However, the evaluation of co-activation patterns is constrained by the fact that previous coordinate-based meta-analysis techniques like Activation Likelihood Estimation (ALE) and Multilevel Kernel Density Analysis (MKDA) reveal all brain regions that show convergent activity within a dataset without taking into account actual within-experiment co-occurrence patterns. To overcome this issue we here propose a novel meta-analytic approach named PaMiNI that utilizes a combination of two well-established data-mining techniques, Gaussian mixture modeling and the Apriori algorithm. By this, PaMiNI enables a data-driven detection of frequent co-activation patterns within neuroimaging datasets. The feasibility of the method is demonstrated by means of several analyses on simulated data as well as a real application. The analyses of the simulated data show that PaMiNI identifies the brain regions underlying the simulated activation foci and perfectly separates the co-activation patterns of the experiments in the simulations. Furthermore, PaMiNI still yields good results when activation foci of distinct brain regions become closer together or if they are non-Gaussian distributed. For the further evaluation, a real dataset on working memory experiments is used, which was previously examined in an ALE meta-analysis and hence allows a cross-validation of both methods. In this latter analysis, PaMiNI revealed a fronto-parietal "core" network of working memory and furthermore indicates a left-lateralization in this network. Finally, to encourage a widespread usage of this new method, the PaMiNI approach was implemented into a publicly available software system. PMID:24365675
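
    The Apriori step named in the abstract above can be illustrated on toy data. The following is a minimal sketch of generic Apriori frequent-itemset mining, not the PaMiNI implementation: it assumes each experiment has already been reduced to a set of region labels (the abstract attributes that step to Gaussian mixture modeling, which is omitted here), and the labels "A" to "D", the function name, and the support threshold are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is at least min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical experiments, each listing the regions it activated.
experiments = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "D"},
    {"C", "D"},
]
patterns = apriori(experiments, min_support=0.5)
```

    In this toy data set, regions "A" and "B" co-occur in three of four experiments, so the pair is the only multi-region pattern that reaches the 0.5 support threshold.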

  10. Identifying pathogenicity islands in bacterial pathogenomics using computational approaches.

    PubMed

    Che, Dongsheng; Hasan, Mohammad Shabbir; Chen, Bernard

    2014-01-13

    High-throughput sequencing technologies have made it possible to study bacteria through analyzing their genome sequences. For instance, comparative genome sequence analyses can reveal phenomena such as gene loss, gene gain, or gene exchange in a genome. By analyzing pathogenic bacterial genomes, we can discover that pathogenic genomic regions in many pathogenic bacteria are horizontally transferred from other bacteria, and these regions are also known as pathogenicity islands (PAIs). PAIs have some detectable properties, such as having different genomic signatures than the rest of the host genomes, and containing mobility genes so that they can be integrated into the host genome. In this review, we will discuss various pathogenicity island-associated features and current computational approaches for the identification of PAIs. Existing pathogenicity island databases and related computational resources will also be discussed, so that researchers may find them useful for studies of bacterial evolution and pathogenicity mechanisms.

  11. Utilizing Soize's Approach to Identify Parameter and Model Uncertainties

    SciTech Connect

    Bonney, Matthew S.; Brake, Matthew Robert

    2014-10-01

    Quantifying uncertainty in model parameters is a challenging task for analysts. Soize has derived a method that is able to characterize both model and parameter uncertainty independently. This method is explained with the assumption that some experimental data is available, and is divided into seven steps. Monte Carlo analyses are performed to select the optimal dispersion variable to match the experimental data. Along with the nominal approach, an alternative distribution can be used along with corrections that can be utilized to expand the scope of this method. This method is one of a very few methods that can quantify uncertainty in the model form independently of the input parameters. Two examples are provided to illustrate the methodology, and example code is provided in the Appendix.

  12. Multimodal Approach to Identifying Malingered Posttraumatic Stress Disorder: A Review

    PubMed Central

    Jabeen, Shagufta; Alam, Farzana

    2015-01-01

    The primary aim of this article is to aid clinicians in differentiating true posttraumatic stress disorder from malingered posttraumatic stress disorder. Posttraumatic stress disorder and malingering are defined, and prevalence rates are explored. Similarities and differences in diagnostic criteria between the fourth and fifth editions of the Diagnostic and Statistical Manual of Mental Disorders are described for posttraumatic stress disorder. Possible motivations for malingering posttraumatic stress disorder are discussed, and common characteristics of malingered posttraumatic stress disorder are described. A multimodal approach is described for evaluating posttraumatic stress disorder, including interview techniques, collection of collateral data, and psychometric and physiologic testing, that should allow clinicians to distinguish between those patients who are truly suffering from posttraumatic disorder and those who are malingering the illness. PMID:25852974

  13. Identifying Pathogenicity Islands in Bacterial Pathogenomics Using Computational Approaches

    PubMed Central

    Che, Dongsheng; Hasan, Mohammad Shabbir; Chen, Bernard

    2014-01-01

    High-throughput sequencing technologies have made it possible to study bacteria through analyzing their genome sequences. For instance, comparative genome sequence analyses can reveal phenomena such as gene loss, gene gain, or gene exchange in a genome. By analyzing pathogenic bacterial genomes, we can discover that pathogenic genomic regions in many pathogenic bacteria are horizontally transferred from other bacteria, and these regions are also known as pathogenicity islands (PAIs). PAIs have some detectable properties, such as having different genomic signatures than the rest of the host genomes, and containing mobility genes so that they can be integrated into the host genome. In this review, we will discuss various pathogenicity island-associated features and current computational approaches for the identification of PAIs. Existing pathogenicity island databases and related computational resources will also be discussed, so that researchers may find them useful for studies of bacterial evolution and pathogenicity mechanisms. PMID:25437607

  14. A new approach to identify, classify and count drug-related events

    PubMed Central

    Bürkle, Thomas; Müller, Fabian; Patapovas, Andrius; Sonst, Anja; Pfistermeister, Barbara; Plank-Kiegele, Bettina; Dormann, Harald; Maas, Renke

    2013-01-01

    Aims: The incidence of clinical events related to medication errors and/or adverse drug reactions reported in the literature varies by a degree that cannot solely be explained by the clinical setting, the varying scrutiny of investigators or varying definitions of drug-related events. Our hypothesis was that the individual complexity of many clinical cases may pose relevant limitations for current definitions and algorithms used to identify, classify and count adverse drug-related events. Methods: Based on clinical cases derived from an observational study we identified and classified common clinical problems that cannot be adequately characterized by the currently used definitions and algorithms. Results: It appears that some key models currently used to describe the relation of medication errors (MEs), adverse drug reactions (ADRs) and adverse drug events (ADEs) can easily be misinterpreted or contain logical inconsistencies that limit their accurate use to all but the simplest clinical cases. A key limitation of current models is the inability to deal with complex interactions such as one drug causing two clinically distinct side effects or multiple drugs contributing to a single clinical event. Using a large set of clinical cases we developed a revised model of the interdependence between MEs, ADEs and ADRs and extended current event definitions when multiple medications cause multiple types of problems. We propose algorithms that may help to improve the identification, classification and counting of drug-related events. Conclusions: The new model may help to overcome some of the limitations that complex clinical cases pose to current paper- or software-based drug therapy safety. PMID:24007453

  15. PedMine – A simulated annealing algorithm to identify maximally unrelated individuals in population isolates

    PubMed Central

    Douglas, Julie A.; Sandefur, Conner I.

    2010-01-01

    Summary: In family-based genetic studies, it is often useful to identify a subset of unrelated individuals. When such studies are conducted in population isolates, however, most if not all individuals are often detectably related to each other. To identify a set of maximally unrelated (or equivalently, minimally related) individuals, we have implemented simulated annealing, a general-purpose algorithm for solving difficult combinatorial optimization problems. We illustrate our method on data from a genetic study in the Old Order Amish of Lancaster County, Pennsylvania, a population isolate derived from a modest number of founders. Given one or more pedigrees, our program automatically and rapidly extracts a fixed number of maximally unrelated individuals. PMID:18321883
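
    The idea in the abstract above, selecting a fixed number of minimally related individuals by simulated annealing, can be sketched generically. This is not the PedMine program: it assumes a precomputed symmetric matrix of pairwise kinship coefficients, uses the sum of pairwise kinship within the chosen subset as the cost, and swaps one member per step. The function names, cooling schedule, and toy families are all hypothetical.

```python
import math
import random


def subset_cost(kin, subset):
    """Sum of pairwise kinship coefficients within the subset."""
    members = sorted(subset)
    return sum(kin[a][b] for i, a in enumerate(members) for b in members[i + 1:])


def anneal_unrelated(kin, k, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Pick k individuals with low total pairwise kinship by simulated annealing."""
    rng = random.Random(seed)
    n = len(kin)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of individuals")
    current = set(rng.sample(range(n), k))
    cost = subset_cost(kin, current)
    best, best_cost = set(current), cost
    if k == n:
        return best, best_cost  # nothing to swap
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose swapping one selected individual for one unselected individual.
        out_id = rng.choice(sorted(current))
        in_id = rng.choice(sorted(set(range(n)) - current))
        candidate = (current - {out_id}) | {in_id}
        cand_cost = subset_cost(kin, candidate)
        delta = cand_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cost = candidate, cand_cost
            if cost < best_cost:
                best, best_cost = set(current), cost
    return best, best_cost


# Hypothetical toy data: two sibships and one unrelated individual.
families = [[0, 1, 2], [3, 4], [5]]
kin = [[0.0] * 6 for _ in range(6)]
for fam in families:
    for a in fam:
        for b in fam:
            if a != b:
                kin[a][b] = 0.25  # kinship coefficient of full siblings

chosen, chosen_cost = anneal_unrelated(kin, k=3)
```

    Tracking the best subset seen so far, rather than returning the final state, keeps the result from degrading if the last accepted move happened to be uphill.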

  16. Newer Approaches to Identify Potential Untoward Effects in Functional Foods.

    PubMed

    Marone, Palma Ann; Birkenbach, Victoria L; Hayes, A Wallace

    2016-01-01

    Globalization has greatly accelerated the numbers and variety of food and beverage products available worldwide. The exchange among greater numbers of countries, manufacturers, and products in the United States and worldwide has necessitated enhanced quality measures for nutritional products for larger populations increasingly reliant on functionality. These functional foods, those that provide benefit beyond basic nutrition, are increasingly being used for their potential to alleviate food insufficiency while enhancing quality and longevity of life. In the United States alone, a steady import increase of greater than 15% per year or 24 million shipments, over 70% of which are food-related products, is regulated under the Food and Drug Administration (FDA). This unparalleled growth has resulted in the need for faster, cheaper, and better safety and efficacy screening methods in the form of harmonized guidelines and recommendations for product standardization. In an effort to meet this need, the in vitro toxicology testing market has similarly grown with an anticipated 15% increase between 2010 and 2015 of US$1.3 to US$2.7 billion. Although traditionally occupying a small fraction of the market behind pharmaceuticals and cosmetic/household products, the scope of functional food testing, including additives/supplements, ingredients, residues, contact/processing, and contaminants, is potentially expansive. Similarly, as functional food testing has progressed, so has the need to identify potential adverse factors that threaten the safety and quality of these products. PMID:26657815

  17. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.

  18. Enhanced Approaches for Identifying Amadori Products: Application to Peanut Allergens.

    PubMed

    Johnson, Katina L; Williams, Jason G; Maleki, Soheila J; Hurlburt, Barry K; London, Robert E; Mueller, Geoffrey A

    2016-02-17

    The dry roasting of peanuts is suggested to influence allergic sensitization as a result of the formation of advanced glycation end products (AGEs) on peanut proteins. Identifying AGEs is technically challenging. The AGEs of a peanut allergen were probed with nano-scale liquid chromatography-electrospray ionization-mass spectrometry (nanoLC-ESI-MS) and tandem mass spectrometry (MS/MS) analyses. Amadori product ions matched to expected peptides and yielded fragments that included a loss of three waters and HCHO. As a result of the paucity of b and y ions in the MS/MS spectrum, standard search algorithms do not perform well. Reactions with isotopically labeled sugars confirmed that the peptides contained Amadori products. An algorithm was developed on the basis of information content (Shannon entropy) and the loss of water and HCHO. Results with test data show that the algorithm finds the correct spectra with high precision, reducing the time needed to manually inspect data. Computational and technical improvements allowed for better identification of the chemical differences between modified and unmodified proteins. PMID:26811263

  20. A Novel Approach for Mining Polymorphic Microsatellite Markers In Silico

    PubMed Central

    Hoffman, Joseph I.; Nichols, Hazel J.

    2011-01-01

    An important emerging application of high-throughput 454 sequencing is the isolation of molecular markers such as microsatellites from genomic DNA. However, few studies have developed microsatellites from cDNA despite the added potential for targeting candidate genes. Moreover, to develop microsatellites usually requires the evaluation of numerous primer pairs for polymorphism in the focal species. This can be time-consuming and wasteful, particularly for taxa with low genetic diversity where the majority of primers often yield monomorphic polymerase chain reaction (PCR) products. Transcriptome assemblies provide a convenient solution, functional annotation of transcripts allowing markers to be targeted towards candidate genes, while high sequence coverage in principle permits the assessment of variability in silico. Consequently, we evaluated fifty primer pairs designed to amplify microsatellites, primarily residing within transcripts related to immunity and growth, identified from an Antarctic fur seal (Arctocephalus gazella) transcriptome assembly. In silico visualization was used to classify each microsatellite as being either polymorphic or monomorphic and to quantify the number of distinct length variants, each taken to represent a different allele. The majority of loci (n = 36, 72.0%) yielded interpretable PCR products, 23 of which were polymorphic in a sample of 24 fur seal individuals. Loci that appeared variable in silico were significantly more likely to yield polymorphic PCR products, even after controlling for microsatellite length measured in silico. We also found a significant positive relationship between inferred and observed allele number. This study not only demonstrates the feasibility of generating modest panels of microsatellites targeted towards specific classes of gene, but also suggests that in silico microsatellite variability may provide a useful proxy for PCR product polymorphism. PMID:21853104
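    The in silico classification described above (count the distinct length variants among reads covering a locus, and treat each as an allele) can be sketched as follows. The authors worked by visual inspection, so this is only an illustrative automation. The function name and the `min_reads_per_allele` filter are hypothetical and do not come from the paper.

```python
from collections import Counter


def classify_locus(repeat_lengths, min_reads_per_allele=2):
    """Count distinct microsatellite length variants among aligned reads.

    repeat_lengths: repeat-tract length (bp) observed in each read at the locus.
    A length variant counts as an allele only if at least
    min_reads_per_allele reads support it (an assumed guard against
    sequencing errors, not a rule stated in the source).
    """
    counts = Counter(repeat_lengths)
    alleles = sorted(
        length for length, n in counts.items() if n >= min_reads_per_allele
    )
    status = "polymorphic" if len(alleles) > 1 else "monomorphic"
    return status, len(alleles)
```

    For example, `classify_locus([20, 20, 22, 22, 24])` treats the single 24 bp read as unsupported and reports two alleles.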

  1. A landscape ecology approach identifies important drivers of urban biodiversity.

    PubMed

    Turrini, Tabea; Knop, Eva

    2015-04-01

    Cities are growing rapidly worldwide, yet a mechanistic understanding of the impact of urbanization on biodiversity is lacking. We assessed the impact of urbanization on arthropod diversity (species richness and evenness) and abundance in a study of six cities and nearby intensively managed agricultural areas. Within the urban ecosystem, we disentangled the relative importance of two key landscape factors affecting biodiversity, namely the amount of vegetated area and patch isolation. To do so, we a priori selected sites that independently varied in the amount of vegetated area in the surrounding landscape at the 500-m scale and patch isolation at the 100-m scale, and we held local patch characteristics constant. As indicator groups, we used bugs, beetles, leafhoppers, and spiders. Compared to intensively managed agricultural ecosystems, urban ecosystems supported a higher abundance of most indicator groups, a higher number of bug species, and a lower evenness of bug and beetle species. Within cities, a high amount of vegetated area increased species richness and abundance of most arthropod groups, whereas evenness showed no clear pattern. Patch isolation played only a limited role in urban ecosystems, which contrasts with findings from agro-ecological studies. Our results show that urban areas can harbor a similar arthropod diversity and abundance compared to intensively managed agricultural ecosystems. Further, negative consequences of urbanization on arthropod diversity can be mitigated by providing sufficient vegetated space in the urban area, while patch connectivity is less important in an urban context. This highlights the need for applying a landscape ecological approach to understand the mechanisms shaping urban biodiversity and underlines the potential of appropriate urban planning for mitigating biodiversity loss.

  2. A Bayesian Approach to Identifying New Risk Factors for Dementia

    PubMed Central

    Wen, Yen-Hsia; Wu, Shihn-Sheng; Lin, Chun-Hung Richard; Tsai, Jui-Hsiu; Yang, Pinchen; Chang, Yang-Pei; Tseng, Kuan-Hua

    2016-01-01

    Dementia is one of the most disabling and burdensome health conditions worldwide. In this study, we identified new potential risk factors for dementia from nationwide longitudinal population-based data by using Bayesian statistics. We first tested the consistency of the results obtained using Bayesian statistics with those obtained using classical frequentist probability for 4 recognized risk factors for dementia, namely severe head injury, depression, diabetes mellitus, and vascular diseases. Then, we used Bayesian statistics to verify 2 new potential risk factors for dementia, namely hearing loss and senile cataract, determined from Taiwan's National Health Insurance Research Database. We included a total of 6546 (6.0%) patients diagnosed with dementia. We observed older age, female sex, and lower income as independent risk factors for dementia. Moreover, we verified the 4 recognized risk factors for dementia in the older Taiwanese population; their odds ratios (ORs) ranged from 1.207 to 3.469. Furthermore, we observed that hearing loss (OR = 1.577) and senile cataract (OR = 1.549) were associated with an increased risk of dementia. We found that the results obtained using Bayesian statistics for assessing risk factors for dementia, such as head injury, depression, DM, and vascular diseases, were consistent with those obtained using classical frequentist probability. Moreover, hearing loss and senile cataract were found to be potential risk factors for dementia in the older Taiwanese population. Bayesian statistics could help clinicians explore other potential risk factors for dementia and develop appropriate treatment strategies for these patients. PMID:27227925

  3. Identifying new targets in leukemogenesis using computational approaches.

    PubMed

    Jayaraman, Archana; Jamil, Kaiser; Khan, Haseeb A

    2015-09-01

    There is a need to identify novel targets in Acute Lymphoblastic Leukemia (ALL), a hematopoietic cancer affecting children, both to improve our understanding of disease biology and to support the development of new therapeutics. Hence, the aim of our study was to find new genes as targets using in silico studies. For this we retrieved the top 10% overexpressed genes from the Oncomine public domain microarray expression database; 530 overexpressed genes were short-listed. Then, using the online prioritization tools ENDEAVOUR, DIR and TOPPGene, we found fifty-four genes common to the three tools, which formed our candidate leukemogenic genes for this study. As per the protocol we selected thirty training genes from PubMed. The prioritized and training genes were then used to construct a STRING functional association network, which was further analyzed using the cytoHubba hub analysis tool to investigate new genes which could form drug targets in leukemia. Analysis of the STRING protein network built from these prioritized and training genes led to identification of two hub genes, SMAD2 and CDK9, which were not implicated in leukemogenesis earlier. Filtering from several hundred genes in the network, we also found MEN1, HDAC1 and LCK genes, which re-emphasized the important role of these genes in leukemogenesis. This is the first report on these five additional signature genes in leukemogenesis. We propose these as new targets for developing novel therapeutics and also as biomarkers in leukemogenesis, which could be important for prognosis and diagnosis.

  4. An ecosystem approach to evaluate restoration measures in the lignite mining district of Lusatia/Germany

    NASA Astrophysics Data System (ADS)

    Schaaf, Wolfgang

    2015-04-01

    Lignite mining in Lusatia has a history of over 100 years. Open-cast mining directly affected an area of 1000 km². For 20 years we have applied an ecosystem-oriented approach to evaluate the development and site characteristics of post-mining areas mainly restored for agricultural and silvicultural land use. Water and element budgets of afforested sites were studied under different geochemical settings in a chronosequence approach (Schaaf 2001), as well as the effect of soil amendments like sewage sludge or compost in restoration (Schaaf & Hüttl 2006). For 10 years we have also studied the development of natural site regeneration in the constructed catchment Chicken Creek at the watershed scale (Schaaf et al. 2011, 2013). One of the striking characteristics of post-mining sites is a very large small-scale soil heterogeneity that has to be taken into account with respect to soil forming processes and element cycling. Results from these studies in combination with smaller-scale process studies make it possible to evaluate the long-term effect of restoration measures and adapted land use options. In addition, it is crucial to compare these results with data from undisturbed, i.e. non-mined sites. Schaaf, W., 2001: What can element budgets of false-time series tell us about ecosystem development on post-lignite mining sites? Ecological Engineering 17, 241-252. Schaaf, W. and Hüttl, R. F., 2006: Direct and indirect effects of soil pollution by lignite mining. Water, Air and Soil Pollution - Focus 6, 253-264. Schaaf, W., Bens, O., Fischer, A., Gerke, H.H., Gerwin, W., Grünewald, U., Holländer, H.M., Kögel-Knabner, I., Mutz, M., Schloter, M., Schulin, R., Veste, M., Winter, S. & Hüttl, R.F., 2011: Patterns and processes of initial terrestrial-ecosystem development. Journal of Plant Nutrition and Soil Science, 174, 229-239. Schaaf, W., Elmer, M., Fischer, A., Gerwin, W., Nenov, R., Pretsch, H. and Zaplate, M.K., 2013: Feedbacks between vegetation, surface structures and hydrology

  5. MbtH homology codes to identify gifted microbes for genome mining.

    PubMed

    Baltz, Richard H

    2014-02-01

    Advances in DNA sequencing technologies have made it possible to sequence large numbers of microbial genomes rapidly and inexpensively. In recent years, genome sequencing initiatives have demonstrated that actinomycetes with large genomes generally have the genetic potential to produce many secondary metabolites, most of which remain cryptic. Since the numbers of new and novel pathways vary considerably among actinomycetes, and the correct assembly of secondary metabolite pathways containing type I polyketide synthase or nonribosomal peptide synthetase (NRPS) genes is costly and time consuming, it would be advantageous to have simple genetic predictors for the number and potential novelty of secondary metabolite pathways in targeted microorganisms. For secondary metabolite pathways that utilize NRPS mechanisms, the small chaperone-like proteins related to MbtH encoded by Mycobacterium tuberculosis offer unique probes or beacons to identify gifted microbes encoding large numbers of diverse NRPS pathways because of their unique function(s) and small size. The small size of the mbtH-homolog genes makes surveying large numbers of genomes straight-forward with less than ten-fold sequencing coverage. Multiple MbtH orthologs and paralogs have been coupled to generate a 24-mer multiprobe to assign numerical codes to individual MbtH homologs by BLASTp analysis. This multiprobe can be used to identify gifted microbes encoding new and novel secondary metabolites for further focused exploration by extensive DNA sequencing, pathway assembly and annotation, and expression studies in homologous or heterologous hosts.

  6. Radon releases from Australian uranium mining and milling projects: assessing the UNSCEAR approach.

    PubMed

    Mudd, Gavin M

    2008-02-01

    The release of radon gas and progeny from the mining and milling of uranium-bearing ores has long been recognised as a potential radiological health hazard. The standards for exposure to radon and progeny have decreased over time as the understanding of their health risk has improved. In recent years there has been debate on the long-term releases (10,000 years) of radon from uranium mining and milling sites, focusing on abandoned, operational and rehabilitated sites. The primary purpose has been estimates of the radiation exposure of both local and global populations. Although there has been an increasing number of radon release studies over recent years in the USA, Australia, Canada and elsewhere, a systematic evaluation of this work has yet to be published in the international literature. This paper presents a detailed compilation and analysis of Australian studies. In order to quantify radon sources, a review of data on uranium mining and milling wastes in Australia, as they influence radon releases, is presented. An extensive compilation of the available radon release data is then assembled for the various projects, including a comparison to predictions of radon behaviour where available. An analysis of cumulative radon releases is then developed and compared to the UNSCEAR approach. The implications for the various assessments of long-term releases of radon are discussed, including aspects such as the need for ongoing monitoring of rehabilitation at uranium mining and milling sites and life-cycle accounting.

  7. Evaluation of the approach to respirable quartz exposure control in U.S. coal mines.

    PubMed

    Joy, Gerald J

    2012-01-01

    Occupational exposure to high levels of respirable quartz can result in respiratory and other diseases in humans. The Mine Safety and Health Administration (MSHA) regulates exposure to respirable quartz in coal mines indirectly through reductions in the respirable coal mine dust exposure limit based on the content of quartz in the airborne respirable dust. This reduction is implemented when the quartz content of airborne respirable dust exceeds 5% by weight. The intent of this dust standard reduction is to restrict miners' exposure to respirable quartz to a time-weighted average concentration of 100 μg/m³. The effectiveness of this indirect approach to control quartz exposure was evaluated by analyzing respirable dust samples collected by MSHA inspectors from 1995 through 2008. The performance of the current regulatory approach was found to be lacking due to the use of a variable property (quartz content in airborne dust) to establish a standard for subsequent exposures. In one situation, 11.7% (4370/37,346) of samples that were below the applicable respirable coal mine dust exposure limit exceeded 100 μg/m³ quartz. In a second situation, 4.4% (895/20,560) of samples with 5% or less quartz content in the airborne respirable dust exceeded 100 μg/m³ quartz. In these two situations, the samples exceeding 100 μg/m³ quartz were not subject to any potential compliance action. Therefore, the current respirable quartz exposure control approach does not reliably maintain miner exposure below 100 μg/m³ quartz. A separate and specific respirable quartz exposure standard may improve control of coal miners' occupational exposure to respirable quartz.
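    A short worked example may help show why a dust limit derived from a previously measured quartz percentage can miss later quartz overexposures. The abstract gives only the 5% trigger and the 100 μg/m³ target. The baseline limit of 2.0 mg/m³ and the reduced limit of 10 divided by the quartz percentage are assumptions brought in for this illustration, as are all function names.

```python
BASE_DUST_LIMIT = 2.0  # mg/m^3; assumed baseline respirable coal mine dust limit


def dust_limit(prior_quartz_pct):
    """Dust limit (mg/m^3) set from a previously measured quartz percentage.

    Assumed rule: above 5% quartz the limit becomes 10 / (% quartz), which
    caps quartz at 0.1 mg/m^3 (100 ug/m^3) if the percentage stays the same.
    """
    if prior_quartz_pct > 5.0:
        return 10.0 / prior_quartz_pct
    return BASE_DUST_LIMIT


def quartz_exposure_ug(dust_mg_m3, quartz_pct):
    """Quartz concentration (ug/m^3) for a sample's dust level and quartz content."""
    return dust_mg_m3 * quartz_pct / 100.0 * 1000.0


def check_sample(dust_mg_m3, quartz_pct, prior_quartz_pct):
    """Compare dust compliance with the actual quartz exposure of one sample."""
    limit = dust_limit(prior_quartz_pct)
    return {
        "dust_compliant": dust_mg_m3 <= limit,
        "quartz_over_100": quartz_exposure_ug(dust_mg_m3, quartz_pct) > 100.0,
    }
```

    With these assumptions, a sample at 1.8 mg/m³ dust and 8% quartz, judged against a limit set when quartz was 4%, is dust compliant yet carries about 144 μg/m³ quartz.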

  8. Forecasting Precipitation over the MENA Region: A Data Mining and Remote Sensing Based Approach

    NASA Astrophysics Data System (ADS)

    Elkadiri, R.; Sultan, M.; Elbayoumi, T.; Chouinard, K.

    2015-12-01

    We developed and applied an integrated approach to construct predictive tools with lead times of 1 to 12 months to forecast precipitation amounts over the Middle East and North Africa (MENA) region. The following steps were conducted: (1) acquire and analyze temporal remote sensing-based precipitation datasets (i.e. Tropical Rainfall Measuring Mission [TRMM]) over five main water source regions in the MENA area (i.e. Atlas Mountains in Morocco, Southern Sudan, Red Sea Hills of Yemen, and Blue Nile and White Nile source areas) throughout the investigation period (1998 to 2015); (2) acquire and extract monthly values for all of the climatic indices that are likely to influence the climatic patterns over the MENA region (e.g., Northern Atlantic Oscillation [NOI], Southern Oscillation Index [SOI], and Tropical North Atlantic Index [TNA]); and (3) apply data mining methods to extract relationships between the observed precipitation and the controlling factors (climatic indices) and use predictive tools to forecast monthly precipitation over each of the identified pilot study areas. Preliminary results indicate that by using the period from January 1998 until August 2012 for model training and the period from September 2012 to January 2015 for testing, precipitation can be successfully predicted with a three-month lead over South West Yemen, Atlas Mountains in Morocco, Southern Sudan, Blue Nile sources and White Nile sources with confidence (Pearson correlation coefficient: 0.911, 0.823, 0.807, 0.801 and 0.895 respectively). Future work will focus on applying this technique for prediction of precipitation over each of the climatically contiguous areas of the MENA region. If our efforts are successful, our findings will lead the way to the development and implementation of sound water management scenarios for the MENA countries.
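    The evaluation scheme in this abstract (train on an early period, test on a later one, score with the Pearson correlation at a fixed lead time) can be sketched as below. The abstract does not name its data mining method, so a single-predictor least-squares line stands in for it here. That substitution and every function name are assumptions for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_pairs(index_series, precip_series, lead_months):
    """Pair each climate-index value with precipitation lead_months later.

    lead_months must be at least 1.
    """
    return list(zip(index_series[:-lead_months], precip_series[lead_months:]))


def fit_line(pairs):
    """Ordinary least-squares slope and intercept for (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(index_series, precip_series, lead_months, n_train):
    """Fit on the first n_train lagged pairs, score the rest with Pearson r."""
    pairs = lagged_pairs(index_series, precip_series, lead_months)
    train, test = pairs[:n_train], pairs[n_train:]
    slope, intercept = fit_line(train)
    predicted = [slope * x + intercept for x, _ in test]
    observed = [y for _, y in test]
    return pearson_r(predicted, observed)
```

    Splitting chronologically, as the abstract does, keeps future observations out of the training period.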

  9. Intrusion detection: a novel approach that combines boosting genetic fuzzy classifier and data mining techniques

    NASA Astrophysics Data System (ADS)

    Ozyer, Tansel; Alhajj, Reda; Barker, Ken

    2005-03-01

    This paper proposes an intelligent intrusion detection system (IDS), an integrated approach that employs fuzziness and two well-known data mining techniques, namely classification and association rule mining. Using these two techniques, we adopted iterative rule learning to extract rules from the data set. Our final intention is to predict different behaviors in networked computers. To achieve this, we propose to use a fuzzy rule based genetic classifier. Our approach has two main stages. First, fuzzy association rule mining is applied and a large number of candidate rules are generated for each class. Then the rules pass through a pre-screening mechanism in order to reduce the fuzzy rule search space. Candidate rules obtained after pre-screening are used in the genetic fuzzy classifier to generate rules for the specified classes. Classes are defined as Normal, PRB-probe, DOS-denial of service, U2R-user to root and R2L-remote to local. Second, an iterative rule learning mechanism is employed for each class to find the fuzzy rules required to classify data; each time, a fuzzy rule is extracted and included in the system. A boosting mechanism evaluates the weight of each data item in order to help the rule extraction mechanism focus more on data having relatively higher weight. Finally, extracted fuzzy rules having the corresponding weight values are aggregated on a class basis to find the vote of each class label for each data item.
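    The final aggregation step, in which weighted fuzzy rules vote per class for each data item, can be sketched as follows. The abstract does not specify membership functions or how a rule's firing strength is computed. The triangular memberships, the minimum over antecedent features, the feature name `conn_rate`, and the rule weights in the usage note are all hypothetical.

```python
from collections import defaultdict


def triangular(x, left, peak, right):
    """Triangular fuzzy membership degree of x, in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def classify(item, rules):
    """Sum weight * firing strength per class and return the top-voted class.

    item:  dict mapping feature name to numeric value.
    rules: list of (class_label, weight, {feature: (left, peak, right)}).
    A rule's firing strength is taken as the minimum membership over its
    antecedent features (an assumed choice of fuzzy AND).
    """
    votes = defaultdict(float)
    for label, weight, antecedent in rules:
        strength = min(
            triangular(item[feature], *tri) for feature, tri in antecedent.items()
        )
        votes[label] += weight * strength
    return max(votes, key=votes.get), dict(votes)
```

    With rules `("Normal", 0.9, {"conn_rate": (0, 10, 50)})`, `("DOS", 1.2, {"conn_rate": (40, 100, 160)})` and `("DOS", 0.5, {"conn_rate": (30, 60, 90)})`, an item with `conn_rate` of 80 collects votes only for DOS.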

  10. EST mining identifies proteins putatively secreted by the anthracnose pathogen Colletotrichum truncatum

    PubMed Central

    2011-01-01

    Background Colletotrichum truncatum is a haploid, hemibiotrophic, ascomycete fungal pathogen that causes anthracnose disease on many economically important leguminous crops. This pathogen exploits sequential biotrophic- and necrotrophic- infection strategies to colonize the host. Transition from biotrophy to a destructive necrotrophic phase called the biotrophy-necrotrophy switch is critical in symptom development. C. truncatum likely secretes an arsenal of proteins that are implicated in maintaining a compatible interaction with its host. Some of them might be transition specific. Results A directional cDNA library was constructed from mRNA isolated from infected Lens culinaris leaflet tissues displaying the biotrophy-necrotrophy switch of C. truncatum and 5000 expressed sequence tags (ESTs) with an average read of > 600 bp from the 5-prime end were generated. Nearly 39% of the ESTs were predicted to encode proteins of fungal origin and among these, 162 ESTs were predicted to contain N-terminal signal peptides (SPs) in their deduced open reading frames (ORFs). The 162 sequences could be assembled into 122 tentative unigenes comprising 32 contigs and 90 singletons. Sequence analyses of unigenes revealed four potential groups: hydrolases, cell envelope associated proteins (CEAPs), candidate effectors and other proteins. Eleven candidate effector genes were identified based on features common to characterized fungal effectors, i.e. they encode small, soluble (lack of transmembrane domain), cysteine-rich proteins with a putative SP. For a selected subset of CEAPs and candidate effectors, semiquantitative RT-PCR showed that these transcripts were either expressed constitutively in both in vitro and in planta or induced during plant infection. Using potato virus X (PVX) based transient expression assays, we showed that one of the candidate effectors, i.e. contig 8 that encodes a cerato-platanin (CP) domain containing protein, unlike CP proteins from other fungal

  11. An Approach to Identify Site Response Directivity of Accelerometer Sites and Application to the Iranian Area

    NASA Astrophysics Data System (ADS)

    Del Gaudio, Vincenzo; Pierri, Pierpaolo; Rajabi, Ali M.

    2015-06-01

    In recent years, several workers have found numerous cases of sites characterised by significant azimuthal variation of dynamic response to seismic shaking. The causes of this phenomenon are still unclear, but are possibly related to combinations of geological and geomorphological factors determining a polarisation of resonance effects. To improve their comprehension, it would be desirable to extend the database of observations on this phenomenon. Thus, considering that unrevealed cases of site response directivity can be "hidden" among the sites of accelerometer networks, we developed a two-stage approach of data mining from existing strong motion databases to identify sites affected by directional amplification. The proposed procedure first calculates Arias Intensity tensor components from accelerometer recordings of each site to determine mean directional variations of total shaking energy. Then, at the sites where a significant anisotropy appears in ground motion, azimuthal variations of HVSR values (spectral ratios between horizontal and vertical components of recordings) are analysed to confirm the occurrence of site resonance conditions. We applied this technique to a database of recordings acquired by accelerometer stations in the Iranian area. The results of this investigation pointed out some sites affected by directional resonance that appear to be correlated to the orientation of local tectonic lineaments, these being mostly transversal to the direction of maximum shaking. Comparing Arias Intensities observed at these sites with theoretical estimates provided by ground motion prediction equations, the presence of significant site amplifications was confirmed. The magnitude of the amplification factors appears to be correlated to the results of HVSR analysis, even though the pattern of dispersion of HVSR values suggests that while high peak values of spectral ratios are indicative of strong amplifications, lower values do not necessarily imply lower

  12. Development of a data-mining algorithm to identify ages at reproductive milestones in electronic medical records.

    PubMed

    Malinowski, Jennifer; Farber-Eger, Eric; Crawford, Dana C

    2014-01-01

    Electronic medical records (EMRs) are becoming more widely implemented following directives from the federal government and incentives for supplemental reimbursements for Medicare and Medicaid claims. Replete with rich phenotypic data, EMRs offer a unique opportunity for clinicians and researchers to identify potential research cohorts and perform epidemiologic studies. Notable limitations to the traditional epidemiologic study include cost, time to complete the study, and limited ancestral diversity; EMR-based epidemiologic studies offer an alternative. The Epidemiologic Architecture for Genes Linked to Environment (EAGLE) Study, as part of the Population Architecture using Genomics and Epidemiology (PAGE) I Study, has genotyped more than 15,000 patients of diverse ancestry in BioVU, the Vanderbilt University Medical Center's biorepository linked to the EMR (EAGLE BioVU). We report here the development and performance of data-mining techniques used to identify the age at menarche (AM) and age at menopause (AAM), important milestones in the reproductive lifespan, in women from EAGLE BioVU for genetic association studies. In addition, we demonstrate the ability to discriminate age at naturally-occurring menopause (ANM) from medically-induced menopause. Unusual timing of these events may indicate underlying pathologies and increased risk for some complex diseases and cancer; however, they are not consistently recorded in the EMR. Our algorithm offers a mechanism by which to extract these data for clinical and research goals.

  13. Identifying high-cost patients using data mining techniques and a small set of non-trivial attributes.

    PubMed

    Izad Shenas, Seyed Abdolmotalleb; Raahemi, Bijan; Hossein Tekieh, Mohammad; Kuziemsky, Craig

    2014-10-01

    In this paper, we use data mining techniques, namely neural networks and decision trees, to build predictive models to identify very high-cost patients in the top 5 percentile among the general population. A large empirical dataset from the Medical Expenditure Panel Survey with 98,175 records was used in our study. After pre-processing, partitioning and balancing the data, the refined dataset of 31,704 records was modeled by Decision Trees (including C5.0 and CHAID), and Neural Networks. The performances of the models are analyzed using various measures including accuracy, G-mean, and Area under ROC curve. We concluded that the CHAID classifier returns the best G-mean and AUC measures for top performing predictive models ranging from 76% to 85%, and 0.812 to 0.942 units, respectively. We also identify a small set of 5 non-trivial attributes among a primary set of 66 attributes to identify the top 5% of the high cost population. The attributes are the individual's overall health perception, age, history of blood cholesterol check, history of physical/sensory/mental limitations, and history of colonic prevention measures. We call this small set of attributes non-trivial because it does not include visits to care providers, doctors or hospitals, which are highly correlated with expenditures and do not offer new insight into the data. The results of this study can be used by healthcare data analysts, policy makers, insurers, and healthcare planners to improve the delivery of health services.
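    Because the target class is only the top 5% of patients, accuracy alone can be misleading, which is presumably why the abstract also reports G-mean. The sketch below computes both from a binary confusion matrix, using the standard definition of G-mean as the geometric mean of sensitivity and specificity. The function name and the counts in the usage note are illustrative and not taken from the study.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity and G-mean from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of high-cost patients correctly flagged
    specificity = tn / (tn + fp)  # share of other patients correctly passed over
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }
```

    A classifier that labels every one of 1000 patients as not high-cost (`tp=0, fn=50, fp=0, tn=950`) reaches 95% accuracy but a G-mean of 0, so G-mean exposes a failure that accuracy hides.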

  14. Biogeometallurgical pre-mining characterization of ore deposits: an approach to increase sustainability in the mining process.

    PubMed

    Dold, Bernhard; Weibel, Leyla

    2013-11-01

    Based on the knowledge obtained from acid mine drainage formation in mine waste environments (tailings impoundments and waste rock dumps), a new methodology is applied to characterize new ore deposits before exploitation starts. This gives the opportunity to design optimized processes for metal recovery of the different mineral assemblages in an ore deposit and at the same time to minimize the environmental impact and costs downstream for mine waste management. Additionally, the whole economic potential is evaluated including strategic elements. The methodology integrates high-resolution geochemistry by sequential extractions and quantitative mineralogy in combination with kinetic bioleach tests. The resulting data set makes it possible to define biogeometallurgical units in the ore deposit and to predict the behavior of each element, economically or environmentally relevant, along the mining process.

  15. Text Mining.

    ERIC Educational Resources Information Center

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  17. Pattern recognition and data mining techniques to identify factors in wafer processing and control determining overlay error

    NASA Astrophysics Data System (ADS)

    Lam, Auguste; Ypma, Alexander; Gatefait, Maxime; Deckers, David; Koopman, Arne; van Haren, Richard; Beltman, Jan

    2015-03-01

    On-product overlay can be improved through the use of context data from the fab and the scanner. Continuous improvements in lithography and processing performance over the past years have resulted in consequent overlay performance improvement for critical layers. Identification of the remaining factors causing systematic disturbances and inefficiencies will further reduce overlay. By building a context database, mappings between context, fingerprints and alignment & overlay metrology can be learned through techniques from pattern recognition and data mining. We relate structure ('patterns') in the metrology data to relevant contextual factors. Once understood, these factors could be moved to the known effects (e.g. the presence of systematic fingerprints from reticle writing error or lens and reticle heating). Hence, we build up a knowledge base of known effects based on data. Outcomes from such an integral ('holistic') approach to lithography data analysis may be exploited in a model-based predictive overlay controller that combines feedback and feedforward control [1]. Hence, the available measurements from scanner, fab and metrology equipment are combined to reveal opportunities for further overlay improvement which would otherwise go unnoticed.

  18. Identifying and overcoming the constraints that prevent the full implementation of decommissioning and remediation programs in uranium mining sites.

    PubMed

    Franklin, Mariza Ramalho; Fernandes, Horst Monken

    2013-05-01

    Environmental remediation of radioactive contamination is about achieving appropriate reduction of exposures to ionizing radiation. This goal can be achieved by means of isolation or removal of the contamination source(s) or by breaking the exposure pathways. Ideally, environmental remediation is part of the planning phase of any industrial operation with the potential to cause environmental contamination. This concept is even more important in mining operations due to the significant impacts produced. This approach has not been considered in several operations developed in the past. Therefore many legacy sites face the challenge to implement appropriate remediation plans. One of the first barriers to remediation works is the lack of financial resources as environmental issues used to be taken in the past as marginal costs and were not included in the overall budget of the company. This paper analyses the situation of the former uranium production site of Poços de Caldas in Brazil. It is demonstrated that in addition to the lack of resources, other barriers such as the lack of information on site characteristics, appropriate regulatory framework, funding mechanisms, stakeholder involvement, policy and strategy, technical experience and mechanism for the appropriation of adequate technical expertise will play key roles in preventing the implementation of remediation programs. All these barriers are discussed and some solutions are suggested. It is expected that lessons learned from the Poços de Caldas legacy site may stimulate advancement of more sustainable options in the development of future uranium production centers. PMID:21955840

  19. Metal dispersion resulting from mining activities in coastal environments: a pathways approach

    USGS Publications Warehouse

    Koski, Randolph A.

    2012-01-01

    Acid rock drainage (ARD) and disposal of tailings that result from mining activities impact coastal areas in many countries. The dispersion of metals from mine sites that are both proximal and distal to the shoreline can be examined using a pathways approach in which physical and chemical processes guide metal transport in the continuum from sources (sulfide minerals) to bioreceptors (marine biota). Large amounts of metals can be physically transported to the coastal environment by intentional or accidental release of sulfide-bearing mine tailings. Oxidation of sulfide minerals results in elevated dissolved metal concentrations in surface waters on land (producing ARD) and in pore waters of submarine tailings. Changes in pH, adsorption by insoluble secondary minerals (e.g., Fe oxyhydroxides), and precipitation of soluble salts (e.g., sulfates) affect dissolved metal fluxes. Evidence for bioaccumulation includes anomalous metal concentrations in bivalves and reef corals, and overlapping Pb isotope ratios for sulfides, shellfish, and seaweed in contaminated environments. Although bioavailability and potential toxicity are, to a large extent, functions of metal speciation, specific uptake pathways, such as adsorption from solution and ingestion of particles, also play important roles. Recent emphasis on broader ecological impacts has led to complementary methodologies involving laboratory toxicity tests and field studies of species richness and diversity.

  20. An experimental approach to assessing the effects of mining subsidence on a flood meadow community

    SciTech Connect

    Benyon, P.R.; Humphries, R.N.; Gregson, K.; Marshall, S.; Peace, S.W.

    1998-12-31

    The Lower Derwent Valley (LDV) is a candidate Special Area of Conservation (SAC) under the provisions of the UK 1994 Conservation Regulations for its internationally important Alopecurus pratense-Sanguisorba officinalis flood meadow vegetation. Mining from RJB's Selby Complex (UK's largest mine) has taken place around and under the LDV since the 1980s. Under the provisions of the Regulations the potential effects of mining subsidence have been recently reviewed. From field data and models it has been predicted that the resulting small amount of subsidence is unlikely to have a deleterious effect on the composition and extent of the key community. While the proposed long-term monitoring will verify the prediction, it will be some years before the results will be available. In order to identify incipient changes in grassland community and to implement any necessary mitigation measures before significant changes occur, a field experiment was set up in late 1996 to assess the effects of increased wetness and inundation which might be induced by subsidence. This involved the transplantation of turves from the different grassland communities within and along a previously defined gradient of relative wetness and inundation. The response of the communities to the different conditions is being monitored. The background studies and the results of the transplantation so far will be presented.

  2. Soil quality assessment using GIS-based chemometric approach and pollution indices: Nakhlak mining district, Central Iran.

    PubMed

    Moore, Farid; Sheykhi, Vahideh; Salari, Mohammad; Bagheri, Adel

    2016-04-01

    This paper is a comprehensive assessment of the quality of soil in the Nakhlak mining district in Central Iran with special reference to potentially toxic metals. In this regard, an integrated approach involving geostatistical, correlation matrix, pollution indices, and chemical fractionation measurement is used to evaluate selected potentially toxic metals in soil samples. The fractionation of metals indicated a relatively high variability. Some metals (Mo, Ag, and Pb) showed important enrichment in the bioavailable fractions (i.e., exchangeable and carbonate), whereas the residual fraction mostly comprised Sb and Cr. The Cd, Zn, Co, Ni, Mo, Cu, and As were retained in Fe-Mn oxide and oxidizable fractions, suggesting that they may be released to the environment by changes in physicochemical conditions. The spatial variability patterns of 11 soil heavy metals (Ag, As, Cd, Co, Cr, Cu, Mo, Ni, Pb, Sb, and Zn) were identified and mapped. The results demonstrated that Ag, As, Cd, Mo, Cu, Pb, Sb, and Zn pollution are associated with mineralized veins and mining operations in this area. Further environmental monitoring and remedial actions are required for management of soil heavy metals in the study area. The present study not only enhanced our knowledge regarding soil pollution in the study area but also introduced a better technique to analyze pollution indices by multivariate geostatistical methods. PMID:26956012

  3. A Control Chart Approach for Representing and Mining Data Streams with Shape Based Similarity

    SciTech Connect

    Omitaomu, Olufemi A

    2014-01-01

    The mining of data streams for online condition monitoring is a challenging task in several domains including (electric) power grid systems, intelligent manufacturing, and consumer science. Consider a power grid application in which thousands of sensors, called phasor measurement units, are deployed on the power grid network to continuously collect streams of digital data for real-time situational awareness and system management. Depending on design, each sensor could stream between ten and sixty data samples per second. The myriad of sensory data captured could convey deeper insights about the sequence of events in real time and before major damage is done. However, the timely processing and analysis of these high-velocity and high-volume data streams is a challenge. Hence, a new data processing and transformation approach, based on the concept of control charts, for representing sequences of data streams from sensors is proposed. In addition, an application of the proposed approach for enhancing data mining tasks such as clustering using real-world power grid data streams is presented. The results indicate that the proposed approach is very efficient for data stream storage and manipulation.
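
    The abstract does not spell out how the control-chart representation is built, so the following is only a minimal sketch of the general idea, assuming Shewhart-style limits (mean plus or minus k standard deviations) estimated from a baseline window. The function names, the three-symbol alphabet, and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits estimated from baseline samples.

    Assumption: classic Shewhart limits at center +/- k sample standard deviations.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center, center + k * spread


def encode_stream(samples, limits):
    """Map each sample to one symbol: 'L' below the lower limit,
    'H' above the upper limit, 'N' (in control) otherwise."""
    lower, _, upper = limits
    return "".join(
        "L" if x < lower else "H" if x > upper else "N" for x in samples
    )


# Hypothetical frequency readings (Hz) from one sensor.
baseline = [60.0, 60.1, 59.9, 60.0, 60.2, 59.8]
limits = control_limits(baseline)
encoded = encode_stream([60.0, 61.5, 58.0, 60.1], limits)
print(encoded)
```

    Reducing each sample to a symbol yields a compact string that may be cheaper to store and compare than the raw stream, which is the kind of benefit the abstract reports for clustering.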

  4. The adaptive approach for storage assignment by mining data of warehouse management system for distribution centres

    NASA Astrophysics Data System (ADS)

    Ming-Huang Chiang, David; Lin, Chia-Ping; Chen, Mu-Chen

    2011-05-01

    Among distribution centre operations, order picking has been reported to be the most labour-intensive activity. Sophisticated storage assignment policies adopted to reduce the travel distance of order picking have been explored in the literature. Unfortunately, previous research has been devoted to locating entire products from scratch. Instead, this study proposes an adaptive approach, a Data Mining-based Storage Assignment approach (DMSA), to find the optimal storage assignment for newly delivered products that need to be put away when there is vacant shelf space in a distribution centre. In the DMSA, a new association index (AIX) is developed to evaluate the fitness between the put-away products and the unassigned storage locations by applying association rule mining. With AIX, the storage location assignment problem (SLAP) can be formulated and solved as a binary integer programming problem. To evaluate the performance of DMSA, a real-world order database of a distribution centre is obtained and used to compare the results from DMSA with a random assignment approach. It turns out that DMSA outperforms random assignment as the number of put-away products and the proportion of put-away products with high turnover rates increase.
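
    The abstract does not define the AIX, so it is not reproduced here. As a hedged illustration of the association rule mining it builds on, the sketch below computes the standard support, confidence, and lift measures for product pairs in an order database. The function name and the toy orders are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_metrics(orders):
    """Compute support, confidence (a -> b), and lift for every product pair
    that appears together in at least one order."""
    n = len(orders)
    # Count each product once per order, regardless of quantity.
    item_counts = Counter(item for order in orders for item in set(order))
    pair_counts = Counter(
        pair
        for order in orders
        for pair in combinations(sorted(set(order)), 2)
    )
    metrics = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        metrics[(a, b)] = {
            "support": support,
            "confidence_a_to_b": count / item_counts[a],
            "lift": support / ((item_counts[a] / n) * (item_counts[b] / n)),
        }
    return metrics


# Hypothetical order database: each order is a list of product codes.
orders = [["A", "B"], ["A", "B", "C"], ["A", "C"], ["B"]]
metrics = pair_metrics(orders)
for pair, values in sorted(metrics.items()):
    print(pair, values)
```

    Pairs with lift above 1 are ordered together more often than independence would predict, so a storage policy might place them close together; how the paper turns such measures into the AIX is not stated in the abstract.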

  5. Knowledge Discovery using Domain-Concept Mining Approach for the Behavioral Risk Factor Surveillance System (BRFSS) Data

    PubMed Central

    Mahamaneerat, Wannapa Kay; Shyu, Chi-Ren

    2006-01-01

    The publicly available Behavioral Risk Factor Surveillance System (BRFSS) data is the largest telephone survey data set in the world. Often, the data set is under-utilized due to its size and the difficulty of comprehending and exploring the relationships among variables. With a traditional data mining approach, such as association rule (AR) mining, it is still not possible to discover valuable information with the existing computational power. To promote the usefulness of this rich data set efficiently, we propose a novel data mining approach called Domain-Concept Mining (DCM) that partitions data into groups of relevant domain concepts, then extracts associations among variables from each partition. The findings from the DCM show that it can efficiently discover relevant information from the BRFSS with respect to the previously published literature. PMID:17238640

  6. Data mining of protein-binding profiling data identifies structural modifications that distinguish selective and promiscuous compounds.

    PubMed

    Yongye, Austin B; Medina-Franco, José L

    2012-09-24

    Activity profiling of compound collections across multiple targets is increasingly being used in probe and drug discovery. Herein, we discuss an approach to systematically analyzing the structure-activity relationships of large screening profile data with emphasis on identifying structural changes that have a significant impact on the number of proteins to which a compound binds. As a case study, we analyzed a recently released public data set of more than 15 000 compounds screened across 100 sequence-unrelated proteins. The screened compounds have different origins and include natural products, synthetic molecules from academic groups, and commercial compounds. Similar synthetic structures from academic groups showed, overall, greater promiscuity differences than did natural products and commercial compounds. The method implemented in this work readily identified structural changes that differentiated highly specific from promiscuous compounds. This approach is general and can be applied to analyze any other large-scale protein-binding profile data.

  7. VALUING ACID MINE DRAINAGE REMEDIATION OF IMPAIRED WATERWAYS IN WEST VIRGINIA: A HEDONIC MODELING APPROACH

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD), the metal rich runoff flowing primarily from abandoned mines and surface deposits of mine waste. AMD can lower stream and river pH ...

  8. A Data Mining Approach to Predict In Situ Detoxification Potential of Chlorinated Ethenes.

    PubMed

    Lee, Jaejin; Im, Jeongdae; Kim, Ungtae; Löffler, Frank E

    2016-05-17

    Despite advances in physicochemical remediation technologies, in situ bioremediation treatment based on Dehalococcoides mccartyi (Dhc) reductive dechlorination activity remains a cornerstone approach to remedy sites impacted with chlorinated ethenes. Selecting the best remedial strategy is challenging due to uncertainties and complexity associated with biological and geochemical factors influencing Dhc activity. Guidelines based on measurable biogeochemical parameters have been proposed, but contemporary efforts fall short of meaningfully integrating the available information. Extensive groundwater monitoring data sets have been collected for decades, but have not been systematically analyzed and used for developing tools to guide decision-making. In the present study, geochemical and microbial data sets collected from 35 wells at five contaminated sites were used to demonstrate that a data mining prediction model using the classification and regression tree (CART) algorithm can provide improved predictive understanding of a site's reductive dechlorination potential. The CART model successfully predicted the 3-month-ahead reductive dechlorination potential with 75.8% and 69.5% true positive rate (i.e., sensitivity) for the training set and the test set, respectively. The machine learning algorithm ranked parameters by relative importance for assessing in situ reductive dechlorination potential. The abundance of Dhc 16S rRNA genes, CH₄, Fe²⁺, NO₃⁻, NO₂⁻, and SO₄²⁻ concentrations, total organic carbon (TOC) amounts, and oxidation-reduction potential (ORP) displayed significant correlations (p < 0.01) with dechlorination potential, with NO₃⁻, NO₂⁻, and Fe²⁺ concentrations exhibiting precedence over other parameters. Contrary to prior efforts, the power of data mining approaches lies in the ability to discern synergetic effects between multiple parameters that affect reductive dechlorination activity. Overall, these findings demonstrate that data mining ...
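
    The true positive rate (sensitivity) quoted in this abstract is the fraction of actual positive cases that the model labels positive, TP / (TP + FN). A minimal sketch of that calculation follows; the function name and the labels are hypothetical and do not reproduce the study's data.

```python
def sensitivity(y_true, y_pred):
    """Return the true positive rate: TP / (TP + FN).

    y_true and y_pred are equal-length sequences of 0/1 labels,
    where 1 marks a positive case.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive cases")
    return tp / (tp + fn)


# Hypothetical labels: 1 = dechlorination potential present, 0 = absent.
observed = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 1]
rate = sensitivity(observed, predicted)
print(f"sensitivity = {rate:.3f}")
```

    Sensitivity ignores false positives, so it is usually read alongside specificity or precision when judging a classifier.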

  10. A multi-disciplinary approach to understanding the impacts of mines on traditional uses of water in Northern Mongolia.

    PubMed

    McIntyre, Neil; Bulovic, Nevenka; Cane, Isabel; McKenna, Phill

    2016-07-01

    Mongolia is an example of a nation where the rapidity of mining development is outpacing capacity to manage the potential land and water resources impacts. Further, Mongolia has a particular social and economic reliance on traditional uses of land and water, principally livestock herding. While some mining operations are setting high standards in protecting the natural resources surrounding the mine site, others have less incentive and capacity to do so and therefore are having adverse effects on surrounding communities. The paper describes a case study of the Sharyn Gol Soum in northern Mongolia where a range of mining types, from artisanal, small-scale mining to a large coal mine, operate alongside traditional herding lifestyles. A multi-disciplinary approach is taken to observe and attribute causes to the water resources impacts in the area. Surveys of the herding household community, land use mapping, and monitoring the spatial variations in water quality indicate deterioration of water resources. Collectively, the different sources of evidence suggest that the deterioration is mainly due to small-scale gold mining. The evidence included the perception of 78% of the interviewed herders that water quality had changed due to mining; a change in the footprint of small-scale gold mining from 2.8 to 15.2 km² during the period 1999 to 2015; and pH and sulphate values in 2015 consistently outside the ranges observed at a baseline site in the same region. It is concluded that the lack of baseline data and effective governance mechanisms are fundamental challenges that need to be addressed if Mongolia's transition to a mining economy is to be managed alongside sustainability of herder lifestyles. PMID:27016688

  11. A comparison of fractal methods and probability plots in identifying and mapping soil metal contamination near an active mining area, Iran.

    PubMed

    Geranian, Hamid; Mokhtari, Ahmad Reza; Cohen, David R

    2013-10-01

    Mining activities may contribute significant amounts of metals to surrounding soils. Assessing the potential effects and extent of metal contamination requires the differentiation between geogenic and additional anthropogenic sources. This study compares the use of conventional probability plots with two forms of fractal analysis (number-size and concentration-area) to separate geochemical populations of ore-related elements in agricultural area soils adjacent to Pb-Zn mining operations in the Irankuh Mountains, central Iran. The two general approaches deliver similar spatial groupings of univariate geochemical populations, but the fractal methods provide more distinct separation between populations and require less data manipulation and modeling than the probability plots. The concentration-area fractal approach was more effective than the number-size fractal and probability plotting methods at separating sub-populations within the samples affected by contamination from the mining operations. There is a general lack of association between major elements and ore-related metals in the soils. The background populations display higher relative variation in the major elements than the ore-related metals whereas near the mining operations there is far greater relative variation in the ore-related metals. The extent of the transport of contaminants away from the mine site is partly a function of the greater dispersion of Zn compared with Pb and As, however, the patterns indicate dispersion of contaminants from the mine site is via dust and not surface/groundwater. A combination of geochemical and graphical assessment, with different methods of threshold determination, is shown to be effective in separating geogenic and anthropogenic geochemical patterns.

  12. Using Data Mining and Computational Approaches to Study Intermediate Filament Structure and Function.

    PubMed

    Parry, David A D

    2016-01-01

    Experimental and theoretical research aimed at determining the structure and function of the family of intermediate filament proteins has made significant advances over the past 20 years. Much of this has either contributed to or relied on the amino acid sequence databases that are now available online, and the data mining approaches that have been developed to analyze these sequences. As the quality of sequence data is generally high, it follows that it is the design of the computational and graphical methodologies that are of especial importance to researchers who aspire to gain a greater understanding of those sequence features that specify both function and structural hierarchy. However, these techniques are necessarily subject to limitations and it is important that these be recognized. In addition, no single method is likely to be successful in solving a particular problem, and a coordinated approach using a suite of methods is generally required. A final step in the process involves the interpretation of the results obtained and the construction of a working model or hypothesis that suggests further experimentation. While such methods allow meaningful progress to be made it is still important that the data are interpreted correctly and conservatively. New data mining methods are continually being developed, and it can be expected that even greater understanding of the relationship between structure and function will be gleaned from sequence data in the coming years.

  13. A Comparison of Educational Statistics and Data Mining Approaches to Identify Characteristics That Impact Online Learning

    ERIC Educational Resources Information Center

    Miller, L. Dee; Soh, Leen-Kiat; Samal, Ashok; Kupzyk, Kevin; Nugent, Gwen

    2015-01-01

    Learning objects (LOs) are important online resources for both learners and instructors and usage for LOs is growing. Automatic LO tracking collects large amounts of metadata about individual students as well as data aggregated across courses, learning objects, and other demographic characteristics (e.g. gender). The challenge becomes identifying…

  14. Novel data-mining approach identifies biomarkers for diagnosis of Kawasaki disease

    PubMed Central

    Tremoulet, Adriana H.; Dutkowski, Janusz; Sato, Yuichiro; Kanegaye, John T.; Ling, Xuefeng B.; Burns, Jane C.

    2015-01-01

    Background: As Kawasaki disease (KD) shares many clinical features with other more common febrile illnesses and misdiagnosis, leading to a delay in treatment, increases the risk of coronary artery damage, a diagnostic test for KD is urgently needed. We sought to develop a panel of biomarkers that could distinguish between acute KD patients and febrile controls (FC) with sufficient accuracy to be clinically useful. Methods: Plasma samples were collected from three independent cohorts of FC and acute KD patients who met the American Heart Association definition for KD and presented within the first 10 days of fever. The levels of 88 biomarkers associated with inflammation were assessed by Luminex bead technology. Unsupervised clustering followed by supervised clustering using a Random Forest model was used to find a panel of candidate biomarkers. Results: A panel of biomarkers commonly available in the hospital laboratory (absolute neutrophil count, erythrocyte sedimentation rate, alanine aminotransferase, gamma glutamyl transferase, concentrations of alpha-1-antitrypsin, C-reactive protein, and fibrinogen, and platelet count) accurately diagnosed 81 to 96% of KD patients in a series of three independent cohorts. Conclusions: After prospective validation, this 8-biomarker panel may improve the recognition of KD. PMID:26237629

  15. Use of lead isotopes to identify sources of metal and metalloid contaminants in atmospheric aerosol from mining operations.

    PubMed

    Félix, Omar I; Csavina, Janae; Field, Jason; Rine, Kyle P; Sáez, A Eduardo; Betterton, Eric A

    2015-03-01

    Mining operations are a potential source of metal and metalloid contamination by atmospheric particulate generated from smelting activities, as well as from erosion of mine tailings. In this work, we show how lead isotopes can be used for source apportionment of metal and metalloid contaminants from the site of an active copper mine. Analysis of atmospheric aerosol shows two distinct isotopic signatures: one prevalent in fine particles (<1 μm aerodynamic diameter) while the other corresponds to coarse particles as well as particles in all size ranges from a nearby urban environment. The lead isotopic ratios found in the fine particles are equal to those of the mine that provides the ore to the smelter. Topsoil samples at the mining site show concentrations of Pb and As decreasing with distance from the smelter. Isotopic ratios for the sample closest to the smelter (650 m) and from topsoil at all sample locations, extending to more than 1 km from the smelter, were similar to those found in fine particles in atmospheric dust. The results validate the use of lead isotope signatures for source apportionment of metal and metalloid contaminants transported by atmospheric particulate. PMID:25496740

  16. Lead isotope ratio measurements by ICP-QMS to identify metal accumulation in vegetation specimens growing in mining environments.

    PubMed

    Marguí, E; Iglesias, M; Queralt, I; Hidalgo, M

    2006-08-31

    The use of variations in stable Pb isotope ratios has become a well-established diagnostic technique for characterising sources of lead contamination. In this work, lead isotope ratios in mining wastes (lead content 320-130,000 mg kg⁻¹) and vegetation specimens (lead concentration 7-650 mg kg⁻¹) have been determined by inductively coupled plasma quadrupole-based mass spectrometry (ICP-QMS) in order to investigate lead bioaccumulation in Buddleia davidii growing on wastes from two abandoned Pb/Zn mining areas in Spain. The accuracy of the isotope ratio measurements was evaluated by analysing a certified isotopic standard NIST SRM 981. Good agreement was obtained between the lead isotope ratios measured and the certified values (deviations within 0.01-0.2%). The results indicate that the lead isotopic ratios in vegetation samples collected in the mining areas differed from those of a specimen from an uncontaminated site (control sample). However, close lead isotope ratio values were found between vegetation specimens and mining tailings. Therefore, the results suggest that lead in the collected vegetation specimens is most likely related to the influence of mining activities rather than to other sources like past leaded-petrol emissions.

  18. Use of Lead Isotopes to Identify Sources of Metal and Metalloid Contaminants in Atmospheric Aerosol from Mining Operations

    PubMed Central

    Félix, Omar I.; Csavina, Janae; Field, Jason; Rine, Kyle P.; Sáez, A. Eduardo; Betterton, Eric A.

    2014-01-01

    Mining operations are a potential source of metal and metalloid contamination by atmospheric particulate generated from smelting activities, as well as from erosion of mine tailings. In this work, we show how lead isotopes can be used for source apportionment of metal and metalloid contaminants from the site of an active copper mine. Analysis of atmospheric aerosol shows two distinct isotopic signatures: one prevalent in fine particles (< 1 μm aerodynamic diameter) while the other corresponds to coarse particles as well as particles in all size ranges from a nearby urban environment. The lead isotopic ratios found in the fine particles are equal to those of the mine that provides the ore to the smelter. Topsoil samples at the mining site show concentrations of Pb and As decreasing with distance from the smelter. Isotopic ratios for the sample closest to the smelter (650 m) and from topsoil at all sample locations, extending to more than 1 km from the smelter, were similar to those found in fine particles in atmospheric dust. The results validate the use of lead isotope signatures for source apportionment of metal and metalloid contaminants transported by atmospheric particulate. PMID:25496740

  19. The first step in the development of text mining technology for cancer risk assessment: identifying and organizing scientific evidence in risk assessment literature

    PubMed Central

    Korhonen, Anna; Silins, Ilona; Sun, Lin; Stenius, Ulla

    2009-01-01

    Background: One of the most neglected areas of biomedical Text Mining (TM) is the development of systems based on carefully assessed user needs. We have recently investigated the user needs of an important task yet to be tackled by TM -- Cancer Risk Assessment (CRA). Here we take the first step towards the development of TM technology for the task: identifying and organizing the scientific evidence required for CRA in a taxonomy which is capable of supporting extensive data gathering from biomedical literature. Results: The taxonomy is based on expert annotation of 1297 abstracts downloaded from relevant PubMed journals. It classifies 1742 unique keywords found in the corpus to 48 classes which specify core evidence required for CRA. We report promising results with inter-annotator agreement tests and automatic classification of PubMed abstracts to taxonomy classes. A simple user test is also reported in a near real-world CRA scenario which demonstrates along with other evaluation that the resources we have built are well-defined, accurate, and applicable in practice. Conclusion: We present our annotation guidelines and a tool which we have designed for expert annotation of PubMed abstracts. A corpus annotated for keywords and document relevance is also presented, along with the taxonomy which organizes the keywords into classes defining core evidence for CRA. As demonstrated by the evaluation, the materials we have constructed provide a good basis for classification of CRA literature along multiple dimensions. They can support current manual CRA as well as facilitate the development of an approach based on TM. We discuss extending the taxonomy further via manual and machine learning approaches and the subsequent steps required to develop TM technology for the needs of CRA. PMID:19772619

  20. Citation Mining: Integrating Text Mining and Bibliometrics for Research User Profiling.

    ERIC Educational Resources Information Center

    Kostoff, Ronald N.; del Rio, J. Antonio; Humenik, James A.; Garcia, Esther Ofilia; Ramirez, Ana Maria

    2001-01-01

    Discusses the importance of identifying the users and impact of research, and describes an approach for identifying the pathways through which research can impact other research, technology development, and applications. Describes a study that used citation mining, an integration of citation bibliometrics and text mining, on articles from the…

  1. Using Synchrotron-based X-ray Absorption Spectrometry to Identify the Arsenic Chemical Forms in Mine Waste Materials

    SciTech Connect

    Matanitobua, Vitukawalu P.; Noller, Barry N.; Chiswell, Barry; Ng, Jack C.; Bruce, Scott L.; Huang, Daphne; Riley, Mark; Harris, Hugh H.

    2007-01-19

    X-ray Absorption Near Edge Spectroscopy (XANES) gives arsenic form directly in the solid phase and has lower detection limits than extraction techniques. An important and common application of XANES is to use the shift of the edge position to determine the valence state. XANES speciation analysis is based on fitting linear combinations of known spectra from model compounds to determine the ratios of valence states and/or phases present. As(V)/As(III) ratios were determined for various Australian mine waste samples and dispersed mine waste samples from river/creek sediments in Vatukoula, Fiji.
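
    The linear-combination fitting mentioned above can be sketched for the simplest case of two model compounds. The Python example below is a minimal illustration, not the authors' procedure: it assumes the sample and reference spectra are already normalized and sampled on the same energy grid, and the function names and spectra values are hypothetical.

```python
def fit_two_component_fraction(sample, ref_a, ref_b):
    """Least-squares fraction f such that sample ~ f * ref_a + (1 - f) * ref_b.

    All three spectra are lists of absorbance values on the same energy grid.
    The result is clipped to the physically meaningful range [0, 1].
    """
    if not (len(sample) == len(ref_a) == len(ref_b)):
        raise ValueError("spectra must have the same number of points")
    # Minimizing sum((s - b - f * (a - b))**2) over f gives a closed-form solution.
    numerator = sum((s - b) * (a - b) for s, a, b in zip(sample, ref_a, ref_b))
    denominator = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if denominator == 0:
        raise ValueError("reference spectra are identical; fraction is undefined")
    return min(1.0, max(0.0, numerator / denominator))


def component_ratio(fraction_a):
    """Ratio of component A to component B, e.g. As(V)/As(III)."""
    if fraction_a >= 1.0:
        raise ValueError("ratio is undefined when component B is absent")
    return fraction_a / (1.0 - fraction_a)


# Hypothetical reference spectra and a synthetic 25:75 mixture for illustration.
ref_as5 = [0.0, 0.2, 1.0, 0.8]
ref_as3 = [0.0, 0.6, 0.9, 1.0]
mixture = [0.25 * a + 0.75 * b for a, b in zip(ref_as5, ref_as3)]
fraction_as5 = fit_two_component_fraction(mixture, ref_as5, ref_as3)
```

    Real fits typically involve more than two model compounds and a restricted energy window; the two-component closed form is used here only to keep the idea visible.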

  2. Pharmacogenomic responses of rat liver to methylprednisolone: an approach to mining a rich microarray time series.

    PubMed

    Almon, Richard R; Dubois, Debra C; Jin, Jin Y; Jusko, William J

    2005-08-18

    A data set was generated to examine global changes in gene expression in rat liver over time in response to a single bolus dose of methylprednisolone. Four control animals and 43 drug-treated animals were humanely killed at 16 different time points following drug administration. Total RNA preparations from the livers of these animals were hybridized to 47 individual Affymetrix RU34A gene chips, generating data for 8799 different probe sets for each chip. Data mining techniques that are applicable to gene array time series data sets in order to identify drug-regulated changes in gene expression were applied to this data set. A series of 4 sequentially applied filters were developed that were designed to eliminate probe sets that were not expressed in the tissue, were not regulated by the drug treatment, or did not meet defined quality control standards. These filters eliminated 7287 probe sets of the 8799 total (82%) from further consideration. Application of judiciously chosen filters is an effective tool for data mining of time series data sets. The remaining data can then be further analyzed by clustering and mathematical modeling techniques.
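
    The sequential filtering described above can be sketched as a small pipeline that applies one filter at a time and records how many probe sets each one removes. This is an illustrative Python sketch only: the abstract does not give the filter definitions, so the field names (`present_calls`, `max_fold_change`, `qc_pass`) and thresholds below are hypothetical, and only three example filters are shown.

```python
def apply_filters(probe_sets, filters):
    """Apply (name, keep_function) filters in order.

    Returns the surviving probe sets and a list of (name, number_removed) pairs.
    """
    remaining = list(probe_sets)
    removed_per_filter = []
    for name, keep in filters:
        kept = [p for p in remaining if keep(p)]
        removed_per_filter.append((name, len(remaining) - len(kept)))
        remaining = kept
    return remaining, removed_per_filter


# Hypothetical filters mirroring the three stated reasons for elimination.
EXAMPLE_FILTERS = [
    ("expressed", lambda p: p["present_calls"] >= 4),      # expressed in the tissue
    ("regulated", lambda p: p["max_fold_change"] >= 1.5),  # changed by the drug
    ("quality", lambda p: p["qc_pass"]),                   # meets QC standards
]

# Made-up probe sets for illustration.
example_probes = [
    {"id": "p1", "present_calls": 0, "max_fold_change": 3.0, "qc_pass": True},
    {"id": "p2", "present_calls": 10, "max_fold_change": 1.1, "qc_pass": True},
    {"id": "p3", "present_calls": 12, "max_fold_change": 2.5, "qc_pass": False},
    {"id": "p4", "present_calls": 15, "max_fold_change": 2.2, "qc_pass": True},
]
survivors, removed = apply_filters(example_probes, EXAMPLE_FILTERS)
```

    With the figures reported in the abstract, eliminating 7287 of 8799 probe sets leaves 8799 - 7287 = 1512 probe sets for clustering and modeling.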

  3. Selection of remedial alternatives for mine sites: a multicriteria decision analysis approach.

    PubMed

    Betrie, Getnet D; Sadiq, Rehan; Morin, Kevin A; Tesfamariam, Solomon

    2013-04-15

    The selection of remedial alternatives for mine sites is a complex task because it involves multiple criteria, often with conflicting objectives. However, an existing framework used to select remedial alternatives lacks multicriteria decision analysis (MCDA) aids and does not consider uncertainty in the selection of alternatives. The objective of this paper is to improve the existing framework by introducing deterministic and probabilistic MCDA methods. The Preference Ranking Organization Method for Enrichment Evaluation (PROMETHEE) methods have been implemented in this study. The MCDA analysis involves three stages: preparing the inputs to the PROMETHEE methods (identifying the alternatives, defining the criteria, defining the criteria weights using the analytic hierarchy process (AHP), defining the probability distribution of the criteria weights, and conducting Monte Carlo simulation (MCS)); running the PROMETHEE methods using these inputs; and conducting a sensitivity analysis. A case study was presented to demonstrate the improved framework at a mine site. The results showed that the improved framework provides a reliable way of selecting remedial alternatives as well as quantifying the impact of different criteria on selecting alternatives.
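
    The deterministic and probabilistic steps above can be sketched as follows. This Python example is a simplified illustration, not the paper's implementation: it computes PROMETHEE II net outranking flows with the "usual" (strict) preference function, and the Monte Carlo step draws weights uniformly and normalizes them, which is an assumption because the abstract does not state the weight distributions. The alternative names and scores are hypothetical.

```python
import random


def promethee_net_flows(scores, weights):
    """PROMETHEE II net flows with the 'usual' preference function.

    scores: dict mapping alternative name -> list of criterion values (higher is better).
    weights: list of criterion weights that sum to 1.
    """
    names = list(scores)
    n = len(names)
    if n < 2:
        raise ValueError("need at least two alternatives")

    def preference(a, b):
        # Weighted share of criteria on which a strictly beats b.
        return sum(w for w, x, y in zip(weights, scores[a], scores[b]) if x > y)

    flows = {}
    for a in names:
        positive = sum(preference(a, b) for b in names if b != a) / (n - 1)
        negative = sum(preference(b, a) for b in names if b != a) / (n - 1)
        flows[a] = positive - negative
    return flows


def monte_carlo_first_rank(scores, n_criteria, trials=1000, seed=0):
    """Fraction of random weight draws in which each alternative ranks first."""
    rng = random.Random(seed)
    wins = {name: 0 for name in scores}
    for _ in range(trials):
        # Assumed weight distribution: uniform draws, then normalized to sum to 1.
        raw = [rng.uniform(0.05, 1.0) for _ in range(n_criteria)]
        total = sum(raw)
        flows = promethee_net_flows(scores, [r / total for r in raw])
        wins[max(flows, key=flows.get)] += 1
    return {name: count / trials for name, count in wins.items()}


# Hypothetical remedial alternatives scored on two criteria.
example_scores = {"cover": [3, 1], "treat": [2, 2], "no_action": [1, 3]}
example_flows = promethee_net_flows(example_scores, [0.7, 0.3])
example_frequencies = monte_carlo_first_rank(example_scores, n_criteria=2)
```

    Ranking alternatives by net flow gives the deterministic result, while the first-rank frequencies show how sensitive that ranking is to the criteria weights.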

  4. Assessing the pollution potential of non-point mine wastes on surface water using a geo-spatial modeling approach

    NASA Astrophysics Data System (ADS)

    Xiao, Huaguo

    Abandoned mine lands (or inactive and abandoned mines) have received increasing attention because they may cause severe environmental and public health problems. Most previous studies characterizing mine waste pollution potential focused on screening-level investigations. The issues related to the pollution potential of mine waste were poorly addressed from the perspective of non-point source pollution, and few efforts have been made to study the effect of spatial characteristics of mine wastes on water quality using spatial technology such as GIS, remote sensing and spatial modeling. This research develops a geo-spatial approach to assessing mine waste pollution of surface water, which integrates GIS, remote sensing and watershed modeling techniques in order to effectively address the effects of spatial characteristics of pollutants. The study area is the Tri-State Mining District, which is located at the junction of Missouri, Kansas and Oklahoma. This district was the most important lead and zinc mining area in the U.S. The historic mining left behind a huge area of mine wastes. Satellite remote sensing data (Landsat MSS and TM) were acquired, processed and classified at decadal intervals to generate land use/land cover (LULC) data for the entire district. Watersheds within the district were delineated by using USGS DEM data and a newly developed GIS tool. Water quality indicators were selected and relevant water quality data between 1970 and 2002 were retrieved from USGS and USEPA databases. With the classified LULC data as a data source, landscape metrics (composition and spatial configuration indices) for each water quality station in mine waste-located watersheds were calculated. Statistical analyses were performed to quantify the relationship between landscape and surface water quality and to evaluate the impacts of landscape characteristics on surface water quality. Related GIS data layers were then created and a cell-based watershed modeling was conducted

  5. Identifying Low-Effort Examinees on Student Learning Outcomes Assessment: A Comparison of Two Approaches

    ERIC Educational Resources Information Center

    Rios, Joseph A.; Liu, Ou Lydia; Bridgeman, Brent

    2014-01-01

    This chapter describes a study that compares two approaches (self-reported effort [SRE] and response time effort [RTE]) for identifying low-effort examinees in student learning outcomes assessment. Although both approaches equally discriminated from measures of ability (e.g., SAT scores), RTE was found to have a stronger relationship with test…

  6. Using a Text-Mining Approach to Evaluate the Quality of Nursing Records.

    PubMed

    Chang, Hsiu-Mei; Chiou, Shwu-Fen; Liu, Hsiu-Yun; Yu, Hui-Chu

    2016-01-01

    Nursing records in Taiwan have been computerized, but their quality has rarely been discussed. Therefore, this study employed a text-mining approach and a cross-sectional retrospective research design to evaluate the quality of electronic nursing records at a medical center in Northern Taiwan. SAS Text Miner software Version 13.2 was employed to analyze unstructured nursing event records. The results show that SAS Text Miner is suitable for developing a text-mining model for validating nursing records. The sensitivity of SAS Text Miner was approximately 0.94, and the specificity and accuracy were 0.99. Thus, SAS Text Miner software is an effective tool for auditing unstructured electronic nursing records. PMID:27332355
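
    The sensitivity, specificity and accuracy reported above are standard confusion-matrix measures. The Python sketch below shows how they are computed; the counts are made-up placeholders chosen for illustration, not the study's data.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),   # share of true problems that were flagged
        "specificity": tn / (tn + fp),   # share of sound records that were passed
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Placeholder counts for illustration only.
example_metrics = classification_metrics(tp=94, fp=1, tn=99, fn=6)
```

    Accuracy depends on the mix of positive and negative records, so it cannot be derived from sensitivity and specificity alone without knowing that mix.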

  7. A data mining approach to predict in situ chlorinated ethene detoxification potential

    NASA Astrophysics Data System (ADS)

    Lee, J.; Im, J.; Kim, U.; Loeffler, F. E.

    2015-12-01

    Despite major advances in physicochemical remediation technologies, in situ biostimulation and bioaugmentation treatment aimed at stimulating Dehalococcoides mccartyi (Dhc) reductive dechlorination activity remains a cornerstone approach to remedy sites impacted with chlorinated ethenes. In practice, selecting the best remedial strategy is challenging due to uncertainties associated with the microbiology (e.g., presence and activity of Dhc) and geochemical factors influencing Dhc activity. Extensive groundwater datasets collected over decades of monitoring exist, but have not been systematically analyzed. In the present study, geochemical and microbial data sets collected from 35 wells at 5 contaminated sites were used to develop a predictive empirical model using a machine learning algorithm (i) to rank the relative importance of parameters that affect in situ reductive dechlorination potential, and (ii) to provide recommendations for selecting the optimal remediation strategy at a specific site. Classification and regression tree (CART) analysis was applied, and a representative classification tree model was developed that allowed short-term prediction of dechlorination potential. Indirect indicators for low dissolved oxygen (e.g., low NO3- and NO2-, high Fe2+ and CH4) were the most influential factors for predicting dechlorination potential, followed by total organic carbon content (TOC) and Dhc cell abundance. These findings indicate that machine learning-based data mining techniques applied to groundwater monitoring data can lead to the development of predictive groundwater remediation models. A major need for improving the predictive capabilities of the data mining approach is a curated, up-to-date and comprehensive collection of groundwater monitoring data.
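
    The core step of a classification tree such as CART is choosing the threshold on a variable that best separates the classes, commonly by minimizing Gini impurity. The Python sketch below shows that single step for one numeric variable and binary labels; it is a simplified illustration, not the study's model, and the nitrate values and labels are invented.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Threshold on one variable that minimizes the weighted Gini impurity.

    Returns (threshold, weighted_impurity); the threshold is None if no split exists.
    """
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Invented data: nitrate concentration per well and whether dechlorination was observed (1) or not (0).
nitrate = [0.2, 0.4, 0.5, 2.0, 3.0, 4.0]
dechlorination = [1, 1, 1, 0, 0, 0]
split_threshold, split_impurity = best_split(nitrate, dechlorination)
```

    A full tree repeats this search over every variable and then recurses on each side of the chosen split, and variables selected near the root are the ones usually reported as most influential.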

  8. Integrating Communication into Engineering Curricula: An Interdisciplinary Approach to Facilitating Transfer at New Mexico Institute of Mining and Technology

    ERIC Educational Resources Information Center

    Ford, Julie Dyke

    2012-01-01

    This program profile describes a new approach towards integrating communication within Mechanical Engineering curricula. The author, who holds a joint appointment between Technical Communication and Mechanical Engineering at New Mexico Institute of Mining and Technology, has been collaborating with Mechanical Engineering colleagues to establish a…

  9. An Approach to Developing Independent Learning and Non-Technical Skills Amongst Final Year Mining Engineering Students

    ERIC Educational Resources Information Center

    Knobbs, C. G.; Grayson, D. J.

    2012-01-01

    There is mounting evidence to show that engineers need more than technical skills to succeed in industry. This paper describes a curriculum innovation in which so-called "soft" skills, specifically inter-personal and intra-personal skills, were integrated into a final year mining engineering course. The instructional approach was designed to…

  10. A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells

    PubMed Central

    Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antzack, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J.; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco

    2016-01-01

    The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in-depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication

  11. A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells.

    PubMed

    Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antzack, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J; Guindani, Michele; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco

    2016-04-01

    The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in-depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication networks

  13. The Voice of Chinese Health Consumers: A Text Mining Approach to Web-Based Physician Reviews

    PubMed Central

    Zhang, Kunpeng

    2016-01-01

    experience of finding doctors, doctors’ technical skills and bedside manner, general appreciation from patients, and description of various symptoms. Conclusions To the best of our knowledge, our work is the first study using an automated text-mining approach to analyze a large amount of unstructured textual data of Web-based physician reviews in China. Based on our analysis, we found that Chinese reviewers mainly concentrate on a few popular topics. This is consistent with the goal of Chinese online health platforms and demonstrates the health care focus in China’s health care system. Our text-mining approach reveals a new research area on how to use big data to help health care providers, health care administrators, and policy makers hear patient voices, target patient concerns, and improve the quality of care in this age of patient-centered care. Also, on the health care consumer side, our text mining technique helps patients make more informed decisions about which specialists to see without reading thousands of reviews, which is simply not feasible. In addition, our comparison analysis of Web-based physician reviews in China and the United States also indicates some cultural differences. PMID:27165558

  14. A Multi-Atlas Labeling Approach for Identifying Subject-Specific Functional Regions of Interest.

    PubMed

    Huang, Lijie; Zhou, Guangfu; Liu, Zhaoguo; Dang, Xiaobin; Yang, Zetian; Kong, Xiang-Zhen; Wang, Xu; Song, Yiying; Zhen, Zonglei; Liu, Jia

    2016-01-01

    The functional region of interest (fROI) approach has increasingly become a favored methodology in functional magnetic resonance imaging (fMRI) because it can circumvent inter-subject anatomical and functional variability, and thus increase the sensitivity and functional resolution of fMRI analyses. The standard fROI method requires human experts to meticulously examine and identify subject-specific fROIs within activation clusters. This process is time-consuming and heavily dependent on experts' knowledge. Several algorithmic approaches have been proposed for identifying subject-specific fROIs; however, these approaches cannot easily incorporate prior knowledge of inter-subject variability. In the present study, we improved the multi-atlas labeling approach for defining subject-specific fROIs. In particular, we used a classifier-based atlas-encoding scheme and an atlas selection procedure to account for the large spatial variability across subjects. Using a functional atlas database for face recognition, we showed that with these two features, our approach efficiently circumvented inter-subject anatomical and functional variability and thus improved labeling accuracy. Moreover, in comparison with a single-atlas approach, our multi-atlas labeling approach showed better performance in identifying subject-specific fROIs.

  15. Identifying diagnostically-relevant resting state brain functional connectivity in the ventral posterior complex via genetic data mining in autism spectrum disorder.

    PubMed

    Baldwin, Philip R; Curtis, Kaylah N; Patriquin, Michelle A; Wolf, Varina; Viswanath, Humsini; Shaw, Chad; Sakai, Yasunari; Salas, Ramiro

    2016-05-01

    Exome sequencing and copy number variation analyses continue to provide novel insight into the biological bases of autism spectrum disorder (ASD). The growing speed at which massive genetic data are produced causes serious lags in analysis and interpretation of the data. Thus, there is a need to develop systematic genetic data mining processes that facilitate efficient analysis of large datasets. We report a new genetic data mining system, ProcessGeneLists, and integrated a list of ASD-related genes with currently available resources in gene expression and functional connectivity of the human brain. Our data-mining program successfully identified three primary regions of interest (ROIs) in the mouse brain: inferior colliculus, ventral posterior complex of the thalamus (VPC), and parafascicular nucleus (PFn). To understand their pathogenic relevance in ASD, we examined the resting state functional connectivity (RSFC) of the homologous ROIs in human brain with other brain regions that were previously implicated in the neuro-psychiatric features of ASD. Among them, the RSFC of the VPC with the medial frontal gyrus (MFG) was significantly more anticorrelated, whereas the RSFC of the PFn with the globus pallidus was significantly increased in children with ASD compared with healthy children. Moreover, greater values of RSFC between VPC and MFG were correlated with severity index and repetitive behaviors in children with ASD. No significant RSFC differences were detected in adults with ASD. Together, these data demonstrate the utility of our data-mining program through identifying the aberrant connectivity of thalamo-cortical circuits in children with ASD. Autism Res 2016, 9: 553-562. © 2015 International Society for Autism Research, Wiley Periodicals, Inc.

  16. Model-based approach to the detection and classification of mines in sidescan sonar.

    PubMed

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-10

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis.

  17. Model-based approach to the detection and classification of mines in sidescan sonar.

    PubMed

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-10

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis. PMID:14735943
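
    The final classification step relies on Dempster-Shafer theory, whose central operation is Dempster's rule of combination for fusing evidence from independent sources. The Python sketch below implements that rule for two mass functions; it is an illustration of the rule only, and the frame of discernment, the two evidence sources and the mass values are hypothetical rather than taken from the paper.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass; masses sum to 1.
    Mass assigned to conflicting (disjoint) pairs is discarded and the rest renormalized.
    """
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination is undefined")
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Hypothetical frame of discernment and evidence from two features.
MINE = frozenset({"mine"})
ROCK = frozenset({"rock"})
EITHER = frozenset({"mine", "rock"})  # ignorance: mass not committed to either class

shadow_evidence = {MINE: 0.6, EITHER: 0.4}
highlight_evidence = {MINE: 0.5, ROCK: 0.2, EITHER: 0.3}
fused = combine_dempster(shadow_evidence, highlight_evidence)
```

    Keeping mass on the whole frame (`EITHER`) lets a source express ignorance instead of forcing a choice between classes.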

  18. Systematic Analysis of the Molecular Mechanism Underlying Decidualization Using a Text Mining Approach

    PubMed Central

    Liu, Ji-Long; Wang, Tong-Song

    2015-01-01

    Decidualization is a crucial process for successful embryo implantation and pregnancy in humans. Defects in decidualization during early pregnancy are associated with several pregnancy complications, such as pre-eclampsia, intrauterine growth restriction and recurrent pregnancy loss. However, the mechanism underlying decidualization remains poorly understood. In the present study, we performed a systematic analysis of decidualization-related genes using text mining. We identified 286 genes for humans and 287 genes for mice, with an overlap of 111 genes shared by both species. Through enrichment tests, we demonstrated that although divergence was observed, the majority of enriched gene ontology terms and pathways were shared by both species, suggesting that functional categories were more conserved than individual genes. We further constructed a decidualization-related protein-protein interaction network consisting of 344 nodes connected via 1,541 edges. We prioritized genes in this network and identified 12 genes that may be key regulators of decidualization. These findings provide some clues for further research on the mechanism underlying decidualization. PMID:26222155

  19. Systematic Analysis of the Molecular Mechanism Underlying Decidualization Using a Text Mining Approach.

    PubMed

    Liu, Ji-Long; Wang, Tong-Song

    2015-01-01

    Decidualization is a crucial process for successful embryo implantation and pregnancy in humans. Defects in decidualization during early pregnancy are associated with several pregnancy complications, such as pre-eclampsia, intrauterine growth restriction and recurrent pregnancy loss. However, the mechanism underlying decidualization remains poorly understood. In the present study, we performed a systematic analysis of decidualization-related genes using text mining. We identified 286 genes for humans and 287 genes for mice respectively, with an overlap of 111 genes shared by both species. Through enrichment tests, we demonstrated that although divergence was observed, the majority of enriched gene ontology terms and pathways were shared by both species, suggesting that functional categories were more conserved than individual genes. We further constructed a decidualization-related protein-protein interaction network consisting of 344 nodes connected via 1,541 edges. We prioritized genes in this network and identified 12 genes that may be key regulators of decidualization. These findings provide some clues for further research on the mechanism underlying decidualization. PMID:26222155

  20. Optimizing data collection for public health decisions: a data mining approach

    PubMed Central

    2014-01-01

    Background: Collecting data can be cumbersome and expensive. Lack of relevant, accurate and timely data for research to inform policy may negatively impact public health. The aim of this study was to test if the careful removal of items from two community nutrition surveys guided by a data mining technique called feature selection, can (a) identify a reduced dataset, while (b) not damaging the signal inside that data. Methods: The Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed on 885 retail food outlets in two counties in West Virginia between May and November of 2011. A reduced dataset was identified for each outlet type using feature selection. Coefficients from linear regression modeling were used to weight items in the reduced datasets. Weighted item values were summed with the error term to compute reduced item survey scores. Scores produced by the full survey were compared to the reduced item scores using a Wilcoxon rank-sum test. Results: Feature selection identified 9 store and 16 restaurant survey items as significant predictors of the score produced from the full survey. The linear regression models built from the reduced feature sets had R² values of 92% and 94% for restaurant and grocery store data, respectively. Conclusions: While there are many potentially important variables in any domain, the most useful set may only be a small subset. The use of feature selection in the initial phase of data collection to identify the most influential variables may be a useful tool to greatly reduce the amount of data needed thereby reducing cost. PMID:24919484

  1. The impact of vascular diameter ratio on hemodialysis maturation time: Evidence from data mining approaches and thermodynamics law

    PubMed Central

    Rezapour, Mohammad; Taran, Somayeh; Balin Parast, Mahmood; Khavanin Zadeh, Morteza

    2016-01-01

    Background: Vascular Access (VA) is an important aspect of blood circulation in Hemodialysis (HD). Arteriovenous Fistula (AVF) is a suitable procedure to gain VA. Maturation is the state in which the AVF can be cannulated for HD. This study aimed to discover the parameters that effectively reduce the duration between VA and start of HD, which represents the maturation time (MT). Methods: Ninety-six patients who underwent AVF creation were selected for this study. The decision tree method was used based on CART/C4.5 algorithm, which is one of the data mining approaches for data classification. Vascular diameter ratio (VDR) coefficient was obtained (VDR=Artery/Vein diameters). Results: We investigated the relationship between the VDR and MT in this study and found that MT is inversely related to VDR in elderly patients, while this relation was direct in younger patients. Conclusion: The analysis revealed a Spearman's correlation coefficient for Vein diameter with MT. MT decreases when diameters of vein and artery are close to one another. This study can help surgeons to identify high-risk patients with prolonged MT for HD. PMID:27453889
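
    To make the two quantities named above concrete, the following is a minimal sketch of computing VDR (artery diameter divided by vein diameter) and a Spearman rank correlation between VDR and MT. The patient values are invented toy numbers, and the helper names `average_ranks`, `pearson` and `spearman` are assumptions for this illustration; nothing here reproduces the study's data or its decision-tree analysis.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy measurements (hypothetical, in mm and days)
artery_mm = [2.0, 2.5, 3.0, 3.5]
vein_mm = [4.0, 3.0, 3.0, 2.5]
maturation_days = [60, 50, 40, 30]

vdr = [a / v for a, v in zip(artery_mm, vein_mm)]
rho = spearman(vdr, maturation_days)
```

    In this toy set MT falls strictly as VDR rises, so `rho` is at the negative extreme; real patient data would give a value somewhere between -1 and 1.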

  2. APPLICATION OF A TIERED SURROGATE APPROACH TO IDENTIFY TOXICITY SURROGATES FOR HUMAN HEALTH RISK ASSESSMENT

    EPA Science Inventory

    APPLICATION OF A TIERED SURROGATE APPROACH TO IDENTIFY TOXICITY SURROGATES FOR HUMAN HEALTH RISK ASSESSMENT. P.R. Dodmane1, L.E. Lizarraga1, J.P. Kaiser2, S.C. Wesselkamper2, Q.J. Zhao2. 1ORISE Participant, U.S. EPA, National Center for Environmental Assessment (NCEA), Cincinnati...

  3. An Information Theoretic Approach for Identifying Shared Information and Asymmetric Relationships among Variables.

    ERIC Educational Resources Information Center

    Golden, Linda L.; And Others

    1990-01-01

    The general-information-theoretic approach was used to identify informational overlap and asymmetry between variables, using affective, cognitive, and behavioral measures. Using the chi-squared test, no significant differences were found in response rates, demographics, or patronage frequency of three stores between numerical (n=453) and graphic…

  4. Identifying Core Mobile Learning Faculty Competencies Based Integrated Approach: A Delphi Study

    ERIC Educational Resources Information Center

    Elbarbary, Rafik Said

    2015-01-01

    This study is based on the integrated approach as a conceptual framework to identify, categorize, and rank a key component of mobile learning core competencies for Egyptian faculty members in higher education. The field investigation framework used a four-round Delphi technique to determine the importance rate of each component of core competencies…

  5. A Comprehensive Approach to Identifying Intervention Targets for Patient-Safety Improvement in a Hospital Setting

    ERIC Educational Resources Information Center

    Cunningham, Thomas R.; Geller, E. Scott

    2012-01-01

    Despite differences in approaches to organizational problem solving, healthcare managers and organizational behavior management (OBM) practitioners share a number of practices, and connecting healthcare management with OBM may lead to improvements in patient safety. A broad needs-assessment methodology was applied to identify patient-safety…

  6. Identifying Useful Auxiliary Variables for Incomplete Data Analyses: A Note on a Group Difference Examination Approach

    ERIC Educational Resources Information Center

    Raykov, Tenko; Marcoulides, George A.

    2014-01-01

    This research note contributes to the discussion of methods that can be used to identify useful auxiliary variables for analyses of incomplete data sets. A latent variable approach is discussed, which is helpful in finding auxiliary variables with the property that if included in subsequent maximum likelihood analyses they may enhance considerably…

  7. The Baby TALK Model: An Innovative Approach to Identifying High-Risk Children and Families

    ERIC Educational Resources Information Center

    Villalpando, Aimee Hilado; Leow, Christine; Hornstein, John

    2012-01-01

    This research report examines the Baby TALK model, an innovative early childhood intervention approach used to identify, recruit, and serve young children who are at-risk for developmental delays, mental health needs, and/or school failure, and their families. The report begins with a description of the model. This description is followed by an…

  8. VALUING ACID MINE DRAINAGE REMEDIATION IN WEST VIRGINIA: A HEDONIC MODELING APPROACH

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD). Appalachian states have an especially large number of contaminated streams and rivers, and the USGS places AMD as the primary source...

  9. A Data Mining Approach to Reveal Representative Collaboration Indicators in Open Collaboration Frameworks

    ERIC Educational Resources Information Center

    Anaya, Antonio R.; Boticario, Jesus G.

    2009-01-01

    Data mining methods are successful in educational environments to discover new knowledge or learner skills or features. Unfortunately, they have not been used in depth with collaboration. We have developed a scalable data mining method, whose objective is to infer information on the collaboration during the collaboration process in a…

  10. VALUING ACID MINE DRAINAGE REMEDIATION IN WEST VIRGINIA: A HEDONIC MODELING APPROACH INCORPORATING GEOGRAPHIC INFORMATION SYSTEMS

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD). Appalachian states have an especially large number of contaminated streams and rivers, and the USGS places AMD as the primary source...

  11. Identifying Bioaccumulative Halogenated Organic Compounds Using a Nontargeted Analytical Approach: Seabirds as Sentinels

    PubMed Central

    Millow, Christopher J.; Mackintosh, Susan A.; Lewison, Rebecca L.; Dodder, Nathan G.; Hoh, Eunha

    2015-01-01

    Persistent organic pollutants (POPs) are typically monitored via targeted mass spectrometry, which potentially identifies only a fraction of the contaminants actually present in environmental samples. With new anthropogenic compounds continuously introduced to the environment, novel and proactive approaches that provide a comprehensive alternative to targeted methods are needed in order to more completely characterize the diversity of known and unknown compounds likely to cause adverse effects. Nontargeted mass spectrometry attempts to extensively screen for compounds, providing a feasible approach for identifying contaminants that warrant future monitoring. We employed a nontargeted analytical method using comprehensive two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GC×GC/TOF-MS) to characterize halogenated organic compounds (HOCs) in California Black skimmer (Rynchops niger) eggs. Our study identified 111 HOCs; 84 of these compounds were regularly detected via targeted approaches, while 27 were classified as typically unmonitored or unknown. Typically unmonitored compounds of note in bird eggs included tris(4-chlorophenyl)methane (TCPM), tris(4-chlorophenyl)methanol (TCPMOH), triclosan, permethrin, heptachloro-1'-methyl-1,2'-bipyrrole (MBP), as well as four halogenated unknown compounds that could not be identified through database searching or the literature. The presence of these compounds in Black skimmer eggs suggests they are persistent, bioaccumulative, potentially biomagnifying, and maternally transferring. Our results highlight the utility and importance of employing nontargeted analytical tools to assess true contaminant burdens in organisms, as well as to demonstrate the value in using environmental sentinels to proactively identify novel contaminants. PMID:26020245

  12. Application of techniques to identify coal-mine and power-generation effects on surface-water quality, San Juan River basin, New Mexico and Colorado

    USGS Publications Warehouse

    Goetz, C.L.; Abeyta, Cynthia G.; Thomas, E.V.

    1987-01-01

    Numerous analytical techniques were applied to determine water quality changes in the San Juan River basin upstream of Shiprock, New Mexico. Eight techniques were used to analyze hydrologic data such as precipitation, water quality, and streamflow. The eight methods used are: (1) Piper diagram, (2) time-series plot, (3) frequency distribution, (4) box-and-whisker plot, (5) seasonal Kendall test, (6) Wilcoxon rank-sum test, (7) SEASRS procedure, and (8) analysis of flow adjusted, specific conductance data and smoothing. Post-1963 changes in dissolved solids concentration, dissolved potassium concentration, specific conductance, suspended sediment concentration, or suspended sediment load in the San Juan River downstream from the surface coal mines were examined to determine if coal mining was having an effect on the quality of surface water. None of the analytical methods used to analyze the data showed any increase in dissolved solids concentration, dissolved potassium concentration, or specific conductance in the river downstream from the mines; some of the analytical methods used showed a decrease in dissolved solids concentration and specific conductance. Chaco River, an ephemeral stream tributary to the San Juan River, undergoes changes in water quality due to effluent from a power generation facility. The discharge in the Chaco River contributes about 1.9% of the average annual discharge at the downstream station, San Juan River at Shiprock, NM. The changes in water quality detected at the Chaco River station were not detected at the downstream Shiprock station. It was not possible, with the available data, to identify any effects of the surface coal mines on water quality that were separable from those of urbanization, agriculture, and other cultural and natural changes. In order to determine the specific causes of changes in water quality, it would be necessary to collect additional data at strategically located stations. (Author's abstract)

  13. A Temporal Approach to Monitor Surface Mine Reclamation Progress via LANDSAT

    NASA Technical Reports Server (NTRS)

    Davis, A. L.; Bloemer, H. L.; Brumfield, J. O.

    1982-01-01

    Using LANDSAT satellite imagery, the mine reclamation process can be studied on a temporal and continuing basis. Not only can the progress of reclamation be readily monitored, but also a breakdown in the mining reclamation process can be detected. In viewing reclamation, it is important to monitor the mined site well past initial revegetation stages. With present mining law and bonding procedures, fast revegetational growth is encouraged, often leading to poor soil fertilizing and inappropriate stabilizing species. As a result, the initial reclamation may exhibit good qualities for one or two years but then may experience vegetational deterioration after the state has released the mining company from its responsibility. It is this small-scale breakdown in the reclamation process that was detected using an unsupervised classification technique with eight-year temporal LANDSAT imagery coverage.

  14. Quantitative and qualitative approaches to identifying migration chronology in a continental migrant

    USGS Publications Warehouse

    Beatty, William S.; Kesler, Dylan C.; Webb, Elisabeth B.; Raedeke, Andrew H.; Naylor, Luke W.; Humburg, Dale D.

    2013-01-01

    The degree to which extrinsic factors influence migration chronology in North American waterfowl has not been quantified, particularly for dabbling ducks. Previous studies have examined waterfowl migration using various methods; however, quantitative approaches to define avian migration chronology over broad spatio-temporal scales are limited, and the implications for using different approaches have not been assessed. We used movement data from 19 female adult mallards (Anas platyrhynchos) equipped with solar-powered global positioning system satellite transmitters to evaluate two individual level approaches for quantifying migration chronology. The first approach defined migration based on individual movements among geopolitical boundaries (state, provincial, international), whereas the second method modeled net displacement as a function of time using nonlinear models. Differences in migration chronologies identified by each of the approaches were examined with analysis of variance. The geopolitical method identified mean autumn migration midpoints at 15 November 2010 and 13 November 2011, whereas the net displacement method identified midpoints at 15 November 2010 and 14 November 2011. The mean midpoints for spring migration were 3 April 2011 and 20 March 2012 using the geopolitical method and 31 March 2011 and 22 March 2012 using the net displacement method. The duration, initiation date, midpoint, and termination date for both autumn and spring migration did not differ between the two individual level approaches. Although we did not detect differences in migration parameters between the different approaches, the net displacement metric offers broad potential to address questions in movement ecology for migrating species. Ultimately, an objective definition of migration chronology will allow researchers to obtain a comprehensive understanding of the extrinsic factors that drive migration at the individual and population levels. As a result, targeted

  15. Missing defects? A comparison of microscopic and macroscopic approaches to identifying linear enamel hypoplasia.

    PubMed

    Hassett, Brenna R

    2014-03-01

    Linear enamel hypoplasia (LEH), the presence of linear defects of dental enamel formed during periods of growth disruption, is frequently analyzed in physical anthropology as evidence for childhood health in the past. However, a wide variety of methods for identifying and interpreting these defects in archaeological remains exists, preventing easy cross-comparison of results from disparate studies. This article compares a standard approach to identifying LEH using the naked eye to the evidence of growth disruption observed microscopically from the enamel surface. This comparison demonstrates that what is interpreted as evidence of growth disruption microscopically is not uniformly identified with the naked eye, and provides a reference for the level of consistency between the number and timing of defects identified using microscopic versus macroscopic approaches. This is done for different tooth types using a large sample of unworn permanent teeth drawn from several post-medieval London burial assemblages. The resulting schematic diagrams showing where macroscopic methods achieve more or less similar results to microscopic methods are presented here and clearly demonstrate that "naked-eye" methods of identifying growth disruptions do not identify LEH as often as microscopic methods in areas where perikymata are more densely packed.

  16. Identifying inhibitory compounds in lignocellulosic biomass hydrolysates using an exometabolomics approach

    PubMed Central

    2014-01-01

    Background: Inhibitors that reduce the fermentation performance of fermenting yeast are formed during the pretreatment process of lignocellulosic biomass. An exometabolomics approach was applied to systematically identify inhibitors in lignocellulosic biomass hydrolysates. Results: We studied the composition and fermentability of 24 different biomass hydrolysates. To create diversity, the 24 hydrolysates were prepared from six different biomass types, namely sugar cane bagasse, corn stover, wheat straw, barley straw, willow wood chips and oak sawdust, and with four different pretreatment methods, i.e. dilute acid, mild alkaline, alkaline/peracetic acid and concentrated acid. Their composition and that of fermentation samples generated with these hydrolysates were analyzed with two GC-MS methods. Either ethyl acetate extraction or ethyl chloroformate derivatization was used before conducting GC-MS to prevent sugars from overloading the chromatograms, which would obscure the detection of less abundant compounds. Using multivariate PLS-2CV and nPLS-2CV data analysis models, potential inhibitors were identified by establishing relationships between fermentability and composition of the hydrolysates. These identified compounds were tested for their effects on the growth of the model yeast, Saccharomyces cerevisiae CEN.PK 113-7D, confirming that the majority of the identified compounds were indeed inhibitors. Conclusion: Inhibitory compounds in lignocellulosic biomass hydrolysates were successfully identified using a non-targeted systematic approach: metabolomics. The identified inhibitors include both known ones, such as furfural, HMF and vanillin, and novel inhibitors, namely sorbic acid and phenylacetaldehyde. PMID:24655423

  17. Missing defects? A comparison of microscopic and macroscopic approaches to identifying linear enamel hypoplasia.

    PubMed

    Hassett, Brenna R

    2014-03-01

    Linear enamel hypoplasia (LEH), the presence of linear defects of dental enamel formed during periods of growth disruption, is frequently analyzed in physical anthropology as evidence for childhood health in the past. However, a wide variety of methods for identifying and interpreting these defects in archaeological remains exists, preventing easy cross-comparison of results from disparate studies. This article compares a standard approach to identifying LEH using the naked eye to the evidence of growth disruption observed microscopically from the enamel surface. This comparison demonstrates that what is interpreted as evidence of growth disruption microscopically is not uniformly identified with the naked eye, and provides a reference for the level of consistency between the number and timing of defects identified using microscopic versus macroscopic approaches. This is done for different tooth types using a large sample of unworn permanent teeth drawn from several post-medieval London burial assemblages. The resulting schematic diagrams showing where macroscopic methods achieve more or less similar results to microscopic methods are presented here and clearly demonstrate that "naked-eye" methods of identifying growth disruptions do not identify LEH as often as microscopic methods in areas where perikymata are more densely packed. PMID:24323494

  18. Multi-variate flood damage assessment: a tree-based data-mining approach

    NASA Astrophysics Data System (ADS)

    Merz, B.; Kreibich, H.; Lall, U.

    2013-01-01

    The usual approach for flood damage assessment consists of stage-damage functions which relate the relative or absolute damage for a certain class of objects to the inundation depth. Other characteristics of the flooding situation and of the flooded object are rarely taken into account, although flood damage is influenced by a variety of factors. We apply a group of data-mining techniques, known as tree-structured models, to flood damage assessment. A very comprehensive data set of more than 1000 records of direct building damage of private households in Germany is used. Each record contains details about a large variety of potential damage-influencing characteristics, such as hydrological and hydraulic aspects of the flooding situation, early warning and emergency measures undertaken, state of precaution of the household, building characteristics and socio-economic status of the household. Regression trees and bagging decision trees are used to select the more important damage-influencing variables and to derive multi-variate flood damage models. It is shown that these models outperform existing models, and that tree-structured models are a promising alternative to traditional damage models.

  19. A cross-species bi-clustering approach to identifying conserved co-regulated genes

    PubMed Central

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-01-01

    Motivation: A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective at dealing with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. Results: We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on

  20. A comparison of approaches for finding minimum identifying codes on graphs

    NASA Astrophysics Data System (ADS)

    Horan, Victoria; Adachi, Steve; Bak, Stanley

    2016-05-01

    In order to formulate mathematical conjectures likely to be true, a number of base cases must be determined. However, many combinatorial problems are NP-hard and the computational complexity makes this research approach difficult using a standard brute force approach on a typical computer. One sample problem explored is that of finding a minimum identifying code. To work around the computational issues, a variety of methods are explored and consist of a parallel computing approach using MATLAB, an adiabatic quantum optimization approach using a D-Wave quantum annealing processor, and lastly using satisfiability modulo theory (SMT) and corresponding SMT solvers. Each of these methods requires the problem to be formulated in a unique manner. In this paper, we address the challenges of computing solutions to this NP-hard problem with respect to each of these methods.
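
    For orientation, the "standard brute force approach" mentioned above can be sketched in a few lines. This assumes the usual definition of an identifying code: a vertex subset C such that the sets N[v] ∩ C (closed neighbourhood of v intersected with C) are non-empty and pairwise distinct over all vertices v. The function names are illustrative and not from the paper, and the search is exponential in the number of vertices, which is exactly the limitation the paper's other methods aim to work around.

```python
from itertools import combinations


def closed_neighborhoods(n, edges):
    """Return N[v] (v plus its neighbours) for each vertex 0..n-1."""
    nbrs = [{v} for v in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def is_identifying_code(code, nbrs):
    """Check that every vertex gets a non-empty, unique signature N[v] & code."""
    code = frozenset(code)
    signatures = [frozenset(nb & code) for nb in nbrs]
    return all(signatures) and len(set(signatures)) == len(signatures)


def minimum_identifying_code(n, edges):
    """Try subsets in order of increasing size; None if no code exists."""
    nbrs = closed_neighborhoods(n, edges)
    for size in range(1, n + 1):
        for candidate in combinations(range(n), size):
            if is_identifying_code(candidate, nbrs):
                return set(candidate)
    return None


# Path on three vertices 0 - 1 - 2
path3 = minimum_identifying_code(3, [(0, 1), (1, 2)])
# Path on four vertices 0 - 1 - 2 - 3
path4 = minimum_identifying_code(4, [(0, 1), (1, 2), (2, 3)])
# A single edge: both endpoints share the same closed neighbourhood
single_edge = minimum_identifying_code(2, [(0, 1)])
```

    Graphs containing two vertices with identical closed neighbourhoods, such as the single edge, admit no identifying code at all, so the function returns `None` for them.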

  1. Ab initio thermodynamic approach to identify mixed solid sorbents for CO2 capture technology

    DOE PAGES

    Duan, Yuhua

    2015-10-15

    Because the current technologies for capturing CO2 are still too energy intensive, new materials must be developed that can capture CO2 reversibly with acceptable energy costs. At a given CO2 pressure, the turnover temperature (Tt) of the reaction of an individual solid that can capture CO2 is fixed. Such Tt may be outside the operating temperature range (ΔTo) for a practical capture technology. To adjust Tt to fit the practical ΔTo, in this study, three scenarios of mixing schemes are explored by combining thermodynamic database mining with first principles density functional theory and phonon lattice dynamics calculations. Our calculated results demonstrate that by mixing different types of solids, it is possible to shift Tt to the range of practical operating temperature conditions. According to the requirements imposed by the pre- and post-combustion technologies and based on our calculated thermodynamic properties for the CO2 capture reactions by the mixed solids of interest, we were able to identify the mixing ratios of two or more solids to form new sorbent materials for which lower capture energy costs are expected at the desired pressure and temperature conditions.

  2. Support for information management in critical care: a new approach to identify needs.

    PubMed Central

    Rosenal, T. W.; Forsythe, D. E.; Musen, M. A.; Seiver, A.

    1995-01-01

    Managing information is necessary to support clinical decision making and action in critical care. By understanding the nature of information management and its relationship to sound clinical practice, we should come to use technology more wisely. We demonstrated that a new approach inspired by ethnographic research methods could identify useful and unexpected findings about clinical information management. In this approach, a clinician experienced in a specific domain (critical care), with advice from a medical anthropologist, made short-term observations of information management in that domain. We identified 8 areas in a critical care Unit in which information management was seriously in need of better support. We also found interesting differences in how these needs were viewed by nurses and physicians. Our interest in this approach was at two levels: 1. Identify and describe representative instances of sub-optimal information management in a critical care Unit. 2. Investigate the effectiveness of such short-term observations by clinicians. Our long-range goal is to explore the use of this approach and the information it reveals to optimize the process of developing and selecting new information support tools, preparing for their introduction, and optimizing clinical outcomes. PMID:8563267

  3. An information-theoretic approach to assess practical identifiability of parametric dynamical systems.

    PubMed

    Pant, Sanjay; Lombardi, Damiano

    2015-10-01

    A new approach for assessing parameter identifiability of dynamical systems in a Bayesian setting is presented. The concept of Shannon entropy is employed to measure the inherent uncertainty in the parameters. The expected reduction in this uncertainty is seen as the amount of information one expects to gain about the parameters due to the availability of noisy measurements of the dynamical system. Such expected information gain is interpreted in terms of the variance of a hypothetical measurement device that can measure the parameters directly, and is related to practical identifiability of the parameters. If the individual parameters are unidentifiable, correlation between parameter combinations is assessed through conditional mutual information to determine which sets of parameters can be identified together. The information theoretic quantities of entropy and information are evaluated numerically through a combination of Monte Carlo and k-nearest neighbour methods in a non-parametric fashion. Unlike many methods to evaluate identifiability proposed in the literature, the proposed approach takes the measurement-noise into account and is not restricted to any particular noise-structure. Whilst computationally intensive for large dynamical systems, it is easily parallelisable and is non-intrusive as it does not necessitate re-writing of the numerical solvers of the dynamical system. The application of such an approach is presented for a variety of dynamical systems--ranging from systems governed by ordinary differential equations to partial differential equations--and, where possible, validated against results previously published in the literature.
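
    The expected information gain described above is the mutual information between parameters and measurements, I(θ; y) = H(θ) + H(y) − H(θ, y). The sketch below evaluates it for a small discrete joint probability table; this is a deliberate simplification for illustration, since the paper estimates these quantities for continuous systems with Monte Carlo and k-nearest-neighbour methods. The function names and the example tables are assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(joint):
    """Mutual information I(theta; y) from a joint table joint[theta][y]."""
    p_theta = [sum(row) for row in joint]          # marginal over measurements
    p_y = [sum(col) for col in zip(*joint)]        # marginal over parameters
    h_joint = entropy(p for row in joint for p in row)
    return entropy(p_theta) + entropy(p_y) - h_joint


# Measurement fully determines the parameter: one full bit is gained
informative = [[0.5, 0.0],
               [0.0, 0.5]]
# Measurement is independent of the parameter: nothing is gained
uninformative = [[0.25, 0.25],
                 [0.25, 0.25]]

gain_informative = expected_information_gain(informative)
gain_uninformative = expected_information_gain(uninformative)
```

    A gain close to zero for a parameter corresponds to the practically unidentifiable case: the measurements leave the uncertainty about that parameter essentially unchanged.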

  4. Identifying potential adverse effects using the web: a new approach to medical hypothesis generation

    PubMed Central

    Benton, Adrian; Ungar, Lyle; Hill, Shawndra; Hennessy, Sean; Mao, Jun; Chung, Annie; Leonard, Charles E.; Holmes, John H.

    2011-01-01

    Medical message boards are online resources where users with a particular condition exchange information, some of which they might not otherwise share with medical providers. Many of these boards contain a large number of posts and contain patient opinions and experiences that would be potentially useful to clinicians and researchers. We present an approach that is able to collect a corpus of medical message board posts, de-identify the corpus, and extract information on potential adverse drug effects discussed by users. Using a corpus of posts to breast cancer message boards, we identified drug event pairs using co-occurrence statistics. We then compared the identified drug event pairs with adverse effects listed on the package labels of tamoxifen, anastrozole, exemestane, and letrozole. Of the pairs identified by our system, 75–80% were documented on the drug labels. Some of the undocumented pairs may represent previously unidentified adverse drug effects. PMID:21820083
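
    The abstract does not say which co-occurrence statistic was used, so the following is only one plausible sketch: it counts how often a drug term and an event term appear in the same post and scores each pair with pointwise mutual information (PMI). The scoring choice, the whitespace tokenisation, the function name, and the four toy posts are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def cooccurrence_pmi(posts, drugs, events):
    """Score (drug, event) pairs by PMI over post-level co-occurrence."""
    n = len(posts)
    term_counts = Counter()
    pair_counts = Counter()
    for text in posts:
        tokens = set(text.lower().split())
        present_drugs = tokens & drugs
        present_events = tokens & events
        # Count each term and each pair at most once per post
        term_counts.update(present_drugs | present_events)
        pair_counts.update(product(present_drugs, present_events))
    scores = {}
    for (drug, event), joint in pair_counts.items():
        # PMI = log2( P(drug, event) / (P(drug) * P(event)) )
        scores[(drug, event)] = math.log2(
            joint * n / (term_counts[drug] * term_counts[event])
        )
    return scores


# Toy posts, invented for this example
posts = [
    "tamoxifen nausea",
    "tamoxifen nausea again",
    "letrozole fatigue",
    "letrozole fatigue today",
]
scores = cooccurrence_pmi(posts, {"tamoxifen", "letrozole"}, {"nausea", "fatigue"})
```

    Pairs that never co-occur are simply absent from `scores`; a real pipeline would also need de-identification, multi-word term matching, and a minimum-count threshold, none of which are shown here.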

  5. New approach for reduction of diesel consumption by comparing different mining haulage configurations.

    PubMed

    Rodovalho, Edmo da Cunha; Lima, Hernani Mota; de Tomi, Giorgio

    2016-05-01

    The mining operations of loading and haulage have an energy source that is highly dependent on fossil fuels. In mining companies that select trucks for haulage, this input is the main component of mining costs. How can the impact of the operational aspects on the diesel consumption of haulage operations in surface mines be assessed? There are many studies relating the consumption of fuel trucks to several variables, but a methodology that prioritizes higher-impact variables under each specific condition is not available. Generic models may not apply to all operational settings presented in the mining industry. This study aims to create a method of analysis, identification, and prioritization of variables related to fuel consumption of haul trucks in open pit mines. For this purpose, statistical analysis techniques and mathematical modelling tools using multiple linear regressions will be applied. The model is shown to be suitable because the results generate a good description of the fuel consumption behaviour. In the practical application of the method, the reduction of diesel consumption reached 10%. The implementation requires no large-scale investments or very long deadlines and can be applied to mining haulage operations in other settings. PMID:26946166

  6. Occupational safety risk management in Australian mining.

    PubMed

    Joy, J

    2004-08-01

    In the past 15 years, there has been a major safety improvement in the Australian mining industry. Part of this change can be attributed to the development and application of risk assessment methods. These systematic, team-based techniques identify, assess and control unacceptable risks to people, assets, the environment and production. The outcomes have improved mine management systems. This paper discusses the risk assessment approach applied to equipment design and mining operations, as well as the specific risk assessment methodology. The paper also discusses the reactive side of risk management, incident and accident investigation. Systematic analytical methods have also been adopted by regulatory authorities and mining companies to investigate major losses.

  7. A Critical Study on the Underground Environment of Coal Mines in India-an Ergonomic Approach

    NASA Astrophysics Data System (ADS)

    Dey, Netai Chandra; Sharma, Gourab Dhara

    2013-04-01

    The application of ergonomics to underground miners' health plays a major role in miners' efficiency. Work in underground mines is still physically demanding, and continuous stress due to certain postures or movements during work leads to localized muscle fatigue, creating musculoskeletal disorders. A good working environment can change the degree of job heaviness, and thermal stress (WBGT values) can directly affect miners' stretch of work. Among the many unit operations in an underground mine, roof bolting makes an important contribution to the safety of the mine and miners. The occupational stress of roof bolters is discussed in the paper from an ergonomic perspective.

  8. Floodplain storage of mine tailings in the Belle Fourche river system: a sediment budget approach

    USGS Publications Warehouse

    Marron, D.C.

    1992-01-01

    Arsenic-contaminated mine tailings that were discharged into Whitewood Creek at Lead, South Dakota, from 1876 to 1978, were deposited along the floodplains of Whitewood Creek and the Belle Fourche River. The resulting arsenic-contaminated floodplain deposit consists mostly of overbank sediments and filled abandoned meanders along Whitewood Creek, and overbank and point-bar sediments along the Belle Fourche River. Arsenic concentrations of the contaminated sediments indicate the degree of dilution of mine tailings by uncontaminated alluvium. About 13% of the 110 × 10^6 Mg of mine tailings that were discharged at Lead were deposited along the Whitewood Creek floodplain. -from Author

  9. A novel approach to identify genes that determine grain protein deviation in cereals.

    PubMed

    Mosleth, Ellen F; Wan, Yongfang; Lysenko, Artem; Chope, Gemma A; Penson, Simon P; Shewry, Peter R; Hawkesford, Malcolm J

    2015-06-01

    Grain yield and protein content were determined for six wheat cultivars grown over 3 years at multiple sites and at multiple nitrogen (N) fertilizer inputs. Although grain protein content was negatively correlated with yield, some grain samples had higher protein contents than expected based on their yields, a trait referred to as grain protein deviation (GPD). We used novel statistical approaches to identify gene transcripts significantly related to GPD across environments. The yield and protein content were initially adjusted for nitrogen fertilizer inputs and then adjusted for yield (to remove the negative correlation with protein content), resulting in a parameter termed corrected GPD. Significant genetic variation in corrected GPD was observed for six cultivars grown over a range of environmental conditions (a total of 584 samples). Gene transcript profiles were determined in a subset of 161 samples of developing grain to identify transcripts contributing to GPD. Principal component analysis (PCA), analysis of variance (ANOVA) and means of scores regression (MSR) were used to identify individual principal components (PCs) correlating with GPD alone. Scores of the selected PCs, which were significantly related to GPD and protein content but not to the yield and significantly affected by cultivar, were identified as reflecting a multivariate pattern of gene expression related to genetic variation in GPD. Transcripts with consistent variation along the selected PCs were identified by an approach hereby called one-block means of scores regression (one-block MSR).

  10. An approach to identifying drug resistance associated mutations in bacterial strains

    PubMed Central

    2012-01-01

    Background Drug resistance in bacterial pathogens is an increasing problem, which stimulates research. However, our understanding of drug resistance mechanisms remains incomplete. Fortunately, the fast-growing number of fully sequenced bacterial strains now enables us to develop new methods to identify mutations associated with drug resistance. Results We present a new comparative approach to identify genes and mutations that are likely to be associated with drug resistance mechanisms. In order to test the approach, we collected genotype and phenotype data of 100 fully sequenced strains of S. aureus and 10 commonly used drugs. Then, applying the method, we re-discovered the most common genetic determinants of drug resistance and identified some novel putative associations. Conclusions Firstly, the collected data may help other researchers to develop and verify similar techniques. Secondly, the proposed method is successful in identifying drug resistance determinants. Thirdly, the in-silico identified genetic mutations, which are putatively involved in drug resistance mechanisms, may increase our understanding of the drug resistance mechanisms. PMID:23281931

  11. An innovative and integrated approach based on DNA walking to identify unauthorised GMOs.

    PubMed

    Fraiture, Marie-Alice; Herman, Philippe; Taverniers, Isabel; De Loose, Marc; Deforce, Dieter; Roosens, Nancy H

    2014-03-15

    In the coming years, the frequency of unauthorised genetically modified organisms (GMOs) being present in the European food and feed chain will increase significantly. Therefore, we have developed a strategy to identify unauthorised GMOs containing a pCAMBIA family vector, frequently present in transgenic plants. This integrated approach is performed in two successive steps on Bt rice grains. First, the potential presence of unauthorised GMOs is assessed by the qPCR SYBR®Green technology targeting the terminator 35S pCAMBIA element. Second, its presence is confirmed via the characterisation of the junction between the transgenic cassette and the rice genome. To this end, a DNA walking strategy is applied using a first reverse primer followed by two semi-nested PCR rounds using primers that are each time nested to the previous reverse primer. This approach allows rapid identification of the transgene flanking region and can easily be implemented by enforcement laboratories. PMID:24206686

  13. An integrated remote sensing approach for identifying ecological range sites. [parker mountain

    NASA Technical Reports Server (NTRS)

    Jaynes, R. A.

    1983-01-01

    A model approach for identifying ecological range sites was applied to high elevation sagebrush-dominated rangelands on Parker Mountain, in south-central Utah. The approach utilizes map information derived from both high altitude color infrared photography and LANDSAT digital data, integrated with soils, geological, and precipitation maps. Identification of the ecological range site for a given area requires an evaluation of all relevant environmental factors which combine to give that site the potential to produce characteristic types and amounts of vegetation. A table is presented which allows the user to determine ecological range site based upon an integrated use of the maps which were prepared. The advantages of identifying ecological range sites through an integrated photo interpretation/LANDSAT analysis are discussed.

  15. A new approach to hazardous materials transportation risk analysis: decision modeling to identify critical variables.

    PubMed

    Clark, Renee M; Besterfield-Sacre, Mary E

    2009-03-01

    We take a novel approach to analyzing hazardous materials transportation risk in this research. Previous studies analyzed this risk from an operations research (OR) or quantitative risk assessment (QRA) perspective by minimizing or calculating risk along a transport route. Further, even though the majority of incidents occur when containers are unloaded, the research has not focused on transportation-related activities, including container loading and unloading. In this work, we developed a decision model of a hazardous materials release during unloading using actual data and an exploratory data modeling approach. Previous studies have had a theoretical perspective in terms of identifying and advancing the key variables related to this risk, and there has not been a focus on probability and statistics-based approaches for doing this. Our decision model empirically identifies the critical variables using an exploratory methodology for a large, highly categorical database involving latent class analysis (LCA), loglinear modeling, and Bayesian networking. Our model identified the most influential variables and countermeasures for two consequences of a hazmat incident, dollar loss and release quantity, and is one of the first models to do this. The most influential variables were found to be related to the failure of the container. In addition to analyzing hazmat risk, our methodology can be used to develop data-driven models for strategic decision making in other domains involving risk. PMID:19087232

  16. Microbial populations identified by fluorescence in situ hybridization in a constructed wetland treating acid coal mine drainage

    SciTech Connect

    Nicomrat, D.; Dick, W.A.; Tuovinen, O.H.

    2006-07-15

    Microorganisms are an integral part of the biogeochemical processes in wetlands, yet microbial communities in sediments within constructed wetlands receiving acid mine drainage (AMD) are only poorly understood. The purpose of this study was to characterize the microbial diversity and abundance in a wetland receiving AMD using fluorescence in situ hybridization (FISH) analysis. Seasonal samples of oxic surface sediments, comprised of Fe(III) precipitates, were collected from two treatment cells of the constructed wetland system. The pH of the bulk samples ranged between pH 2.1 and 3.9. Viable counts of acidophilic Fe and S oxidizers and heterotrophs were determined with a most probable number (MPN) method. The MPN counts were only a fraction of the corresponding FISH counts. The sediment samples contained microorganisms in the Bacteria (including the subgroups of acidophilic Fe- and S-oxidizing bacteria and Acidiphilium spp.) and Eukarya domains. Archaea were present in the sediment surface samples at < 0.01% of the total microbial community. The most numerous bacterial species in this wetland system was Acidithiobacillus ferrooxidans, comprising up to 37% of the bacterial population. Acidithiobacillus thiooxidans was also abundant.

  17. Improved landmine detection capability (ILDC): systematic approach to the detection of buried mines using passive IR imaging

    NASA Astrophysics Data System (ADS)

    Simard, Jean-Robert

    1996-05-01

    In order to reduce the serious problem associated with the mining of important supply/communication roads by hostile parties during peacekeeping operations, the Canadian Department of National Defense has recently begun the development of a multi-sensor teleoperated mine detection vehicle, the Improved Landmine Detection Capability. One sensor identified as a serious candidate for that project is a passive IR camera. In the past, many organizations have assessed the efficiency of this technique of detection and reported widely fluctuating results. It is believed that the main reason for these fluctuations is associated with the ad hoc interpretations used by different researchers. In this paper, a more systematic analysis is presented which takes into account variables such as time of the day, time of the year, weather conditions, type of road and many others. A working model is proposed in order to facilitate the prediction of the IR signature of the buried land-mine and is compared with data acquired from multiple trials. These trials were done with live mines (without fuzes) and surrogates buried in different types of road (packed gravel and sand) and during different times of the day and different times of the year.

  18. A Chemical Screening Approach to Identify Novel Key Mediators of Erythroid Enucleation

    PubMed Central

    Wölwer, Christina B.; Pase, Luke B.; Pearson, Helen B.; Gödde, Nathan J.; Lackovic, Kurt; Huang, David C. S.; Russell, Sarah M.; Humbert, Patrick O.

    2015-01-01

    Erythroid enucleation is critical for terminal differentiation of red blood cells, and involves extrusion of the nucleus by orthochromatic erythroblasts to produce reticulocytes. Due to the difficulty of synchronizing erythroblasts, the molecular mechanisms underlying the enucleation process remain poorly understood. To elucidate the cellular program governing enucleation, we utilized a novel chemical screening approach whereby orthochromatic cells primed for enucleation were enriched ex vivo and subjected to a functional drug screen using a 324 compound library consisting of structurally diverse, medicinally active and cell permeable drugs. Using this approach, we have confirmed the role of HDACs, proteasomal regulators and MAPK in erythroid enucleation and introduce a new role for Cyclin-dependent kinases, in particular CDK9, in this process. Importantly, we demonstrate that when coupled with imaging analysis, this approach provides a powerful means to identify and characterize rate limiting steps involved in the erythroid enucleation process. PMID:26569102

  19. An approach for identifying cytokines based on a novel ensemble classifier.

    PubMed

    Zou, Quan; Wang, Zhen; Guan, Xinjun; Liu, Bin; Wu, Yunfeng; Lin, Ziyu

    2013-01-01

    Identifying cytokines and investigating their various functions and biochemical mechanisms is biologically meaningful and important. However, several issues remain, including the large scale of benchmark datasets, serious imbalance of data, and discovery of new gene families. In this paper, we employ a machine learning approach based on a novel ensemble classifier to predict cytokines. We directly selected amino acid sequences as research objects. First, we pretreated the benchmark data accurately. Next, we analyzed the physicochemical properties and distribution of whole amino acids and then extracted a group of 120-dimensional (120D) valid features to represent sequences. Third, in view of the serious imbalance in benchmark datasets, we utilized a sampling approach based on the synthetic minority oversampling technique algorithm and K-means clustering undersampling algorithm to rebuild the training set. Finally, we built a library for dynamic selection and circulating combination based on clustering (LibD3C) and employed the new training set to realize cytokine classification. Experiments showed that the geometric mean of sensitivity and specificity obtained through our approach is as high as 93.3%, which indicates that our approach is effective for identifying cytokines.
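
    The evaluation metric named above, the geometric mean of sensitivity and specificity, is commonly used on imbalanced data because it stays low unless both classes are classified well. A minimal sketch follows; the function name and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math


def geometric_mean_sens_spec(tp, fn, tn, fp):
    """Return sqrt(sensitivity * specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts: sensitivity 0.9, specificity 0.8.
g = geometric_mean_sens_spec(tp=90, fn=10, tn=80, fp=20)
print(round(g, 4))
```

    A classifier that labels every sample as the majority class would have a sensitivity or specificity of zero, so its geometric mean would also be zero even if its raw accuracy looked high.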

  20. Culling a clinical terminology: a systematic approach to identifying problematic content.

    PubMed

    Sable, J H; Nash, S K; Wang, A Y

    2001-01-01

    The College of American Pathologists and the National Health Service (NHS) in the United Kingdom are merging their respective clinical terminologies, SNOMED RT and Clinical Terms Version 3, into a new terminology, SNOMED CT. This requires mapping concept descriptions between the two existing terminologies. During the mapping process, many descriptions were identified as being potentially problematic. They require further review by the SNOMED editorial process before either (1) being incorporated into SNOMED CT, or (2) retired from active use. This article presents data on the concept descriptions that were identified as needing further review during the early phases of SNOMED CT development. Based on this work, we describe fourteen types of problematic terminology content. Identifying problematic terminology content can be approached in a systematic manner.

  1. A virtual screening approach for identifying plants with anti H5N1 neuraminidase activity.

    PubMed

    Ikram, Nur Kusaira Khairul; Durrant, Jacob D; Muchtaridi, Muchtaridi; Zalaludin, Ayunni Salihah; Purwitasari, Neny; Mohamed, Nornisah; Rahim, Aisyah Saad Abdul; Lam, Chan Kit; Normi, Yahaya M; Rahman, Noorsaadah Abd; Amaro, Rommie E; Wahab, Habibah A

    2015-02-23

    Recent outbreaks of highly pathogenic and occasional drug-resistant influenza strains have highlighted the need to develop novel anti-influenza therapeutics. Here, we report computational and experimental efforts to identify influenza neuraminidase inhibitors from among the 3000 natural compounds in the Malaysian-Plants Natural-Product (NADI) database. These 3000 compounds were first docked into the neuraminidase active site. The five plants with the largest number of top predicted ligands were selected for experimental evaluation. Twelve specific compounds isolated from these five plants were shown to inhibit neuraminidase, including two compounds with IC50 values less than 92 μM. Furthermore, four of the 12 isolated compounds had also been identified in the top 100 compounds from the virtual screen. Together, these results suggest an effective new approach for identifying bioactive plant species that will further the identification of new pharmacologically active compounds from diverse natural-product resources. PMID:25555059

  2. A Virtual Screening Approach For Identifying Plants with Anti H5N1 Neuraminidase Activity

    PubMed Central

    2016-01-01

    Recent outbreaks of highly pathogenic and occasional drug-resistant influenza strains have highlighted the need to develop novel anti-influenza therapeutics. Here, we report computational and experimental efforts to identify influenza neuraminidase inhibitors from among the 3000 natural compounds in the Malaysian-Plants Natural-Product (NADI) database. These 3000 compounds were first docked into the neuraminidase active site. The five plants with the largest number of top predicted ligands were selected for experimental evaluation. Twelve specific compounds isolated from these five plants were shown to inhibit neuraminidase, including two compounds with IC50 values less than 92 μM. Furthermore, four of the 12 isolated compounds had also been identified in the top 100 compounds from the virtual screen. Together, these results suggest an effective new approach for identifying bioactive plant species that will further the identification of new pharmacologically active compounds from diverse natural-product resources. PMID:25555059

  3. Shifting species ranges and changing phenology: A new approach to mining social media for ecosystems observations

    NASA Astrophysics Data System (ADS)

    Fuka, M. Z.; Osborne-Gowey, J. D.; Fuka, D. R.

    2013-12-01

    Geoscientists & ecologists are increasingly using social media to solicit 'citizen scientists' to participate in the data collection process. However, social media users are also a largely untapped resource of spontaneous, unsolicited observations of the natural world. Of particular interest are observations of species phenology & range to better develop a predictive understanding of how ecosystems are affected by a changing climate and human-mediated influences. Social media users' observations include information on phenological & biological phenomena such as flowers blooming, native & invasive species sightings, unusual behaviors, animal tracks, droppings, damage, feeding, nesting, etc. Our AGU2011 pilot study on the North American armadillo suggests that useful observational data can be extracted from Twitter to map current species ranges to compare with past ranges. We have expanded that work by mining Twitter for a number of North American species and ecosystem observations to determine usefulness for environmental applications such as: 1) supplementing existing databases, 2) identifying outlier phenomena, 3) guiding additional crowd-sourced studies and data collection efforts, 4) recruiting citizen scientists, 5) gauging sentiment about the observations and 6) informing ecosystems policy-making and education. We present the results for our evaluation of a representative sample from a list of 200+ species for which we've collected data since August 2011. Our results include frequency of reports and sightings by day, week and month, where the number of observations range from a few per month to ten or more per day. We discuss challenges, best practices and tools for distilling information from crowd-sourced observations gathered via Twitter in the form of 140-character 'tweets'. For example, geolocation is a critical issue. Despite the prevalence of smart phones, specific latitudinal and longitudinal coordinates are included in fewer than 10% of the

  4. Data mining the NCI cancer cell line compound GI(50) values: identifying quinone subtypes effective against melanoma and leukemia cell classes.

    PubMed

    Marx, Kenneth A; O'Neil, Philip; Hoffman, Patrick; Ujwal, M L

    2003-01-01

    Using data mining techniques, we have studied a subset (1400) of compounds from the large public National Cancer Institute (NCI) compounds data repository. We first carried out a functional class identity assignment for the 60 NCI cancer testing cell lines via hierarchical clustering of gene expression data. Comprised of nine clinical tissue types, the 60 cell lines were placed into six classes: melanoma, leukemia, renal, lung, and colorectal, and the sixth class was comprised of mixed tissue cell lines not found in any of the other five classes. We then carried out supervised machine learning, using the GI(50) values tested on a panel of 60 NCI cancer cell lines. For separate 3-class and 2-class problem clustering, we successfully carried out clear cell line class separation at high stringency, p < 0.01 (Bonferroni corrected t-statistic), using feature reduction clustering algorithms embedded in RadViz, an integrated high dimensional analytic and visualization tool. We started with the 1400 compound GI(50) values as input and selected only those compounds, or features, significant in carrying out the classification. With this approach, we identified two small sets of compounds that were most effective in carrying out complete class separation of the melanoma, non-melanoma classes and leukemia, non-leukemia classes. To validate these results, we showed that these two compound sets' GI(50) values were highly accurate classifiers using five standard analytical algorithms. One compound set was most effective against the melanoma class cell lines (14 compounds), and the other set was most effective against the leukemia class cell lines (30 compounds). The two compound classes were both significantly enriched in two different types of substituted p-quinones. The melanoma cell line class of 14 compounds was comprised of 11 compounds that were internal substituted p-quinones, and the leukemia cell line class of 30 compounds was comprised of 6 compounds that were external

  5. Improvement Evaluation on Ceramic Roof Extraction Using WORLDVIEW-2 Imagery and Geographic Data Mining Approach

    NASA Astrophysics Data System (ADS)

    Brum-Bastos, V. S.; Ribeiro, B. M. G.; Pinho, C. M. D.; Korting, T. S.; Fonseca, L. M. G.

    2016-06-01

    Advances in geotechnologies and in remote sensing have improved analysis of urban environments. The new sensors are increasingly suited to urban studies, due to the enhancement in spatial, spectral and radiometric resolutions. Urban environments present high heterogeneity, which cannot be tackled using pixel-based approaches on high resolution images. Geographic Object-Based Image Analysis (GEOBIA) has been consolidated as a methodology for urban land use and cover monitoring; however, classification of high resolution images is still troublesome. This study aims to assess the improvement in ceramic roof classification using WorldView-2 images due to the addition of 4 new bands besides the standard "Blue-Green-Red-Near Infrared" bands. Our methodology combines GEOBIA, the C4.5 classification tree algorithm, Monte Carlo simulation and statistical tests for classification accuracy. Two sample groups were considered: 1) eight multispectral and panchromatic bands, and 2) four multispectral and panchromatic bands, representing previous high-resolution sensors. The C4.5 algorithm generates a decision tree that can be used for classification; smaller decision trees are closer to the semantic networks produced by experts on GEOBIA, while bigger trees are not straightforward to implement manually but are more accurate. The choice of a big or small tree depends on the user's skill in implementing it. This study aims to determine for what kind of user the addition of the 4 new bands might be beneficial: 1) the common user (smaller trees) or 2) a more skilled user with coding and/or data mining abilities (bigger trees). Overall, the classification was improved by the addition of the four new bands for both types of users.

  7. Smart-card-based automatic meal record system intervention tool for analysis using data mining approach.

    PubMed

    Zenitani, Satoko; Nishiuchi, Hiromu; Kiuchi, Takahiro

    2010-04-01

    The Smart-card-based Automatic Meal Record system for company cafeterias (AutoMealRecord system) was recently developed and used to monitor employee eating habits. The system could be a unique nutrition assessment tool for automatically monitoring the meal purchases of all employees, although it only focuses on company cafeterias and has never been validated. Before starting an interventional study, we tested the reliability of the data collected by the system using the data mining approach. The AutoMealRecord data were examined to determine whether they could predict current obesity. All data used in this study (n = 899) were collected by a major electric company based in Tokyo, which has been operating the AutoMealRecord system for several years. We analyzed dietary patterns by principal component analysis using data from the system and extracted 5 major dietary patterns: healthy, traditional Japanese, Chinese, Japanese noodles, and pasta. The ability to predict current body mass index (BMI) with dietary preference was assessed with multiple linear regression analyses, and in the current study, BMI was positively correlated with male gender, preference for "Japanese noodles," mean energy intake, protein content, and frequency of body measurement at a body measurement booth in the cafeteria. There was a negative correlation with age, dietary fiber, and lunchtime cafeteria use (R² = 0.22). This regression model predicted "would-be obese" participants (BMI ≥ 23) with 68.8% accuracy by leave-one-out cross validation. This shows that there was sufficient predictability of BMI based on data from the AutoMealRecord System. We conclude that the AutoMealRecord system is valuable for further consideration as a health care intervention tool. PMID:20534329
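
The leave-one-out check described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' model: it assumes a single hypothetical predictor (mean energy intake) in place of the multiple regression used in the study, and the numbers are made up for demonstration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loocv_accuracy(xs, ys, threshold=23.0):
    """Leave-one-out accuracy for classifying the outcome as >= threshold."""
    hits = 0
    for i in range(len(xs)):
        # Fit on every record except the i-th, then predict the held-out record.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predicted = slope * xs[i] + intercept
        hits += (predicted >= threshold) == (ys[i] >= threshold)
    return hits / len(xs)


# Made-up values for illustration only (not study data).
energy_kcal = [550.0, 600.0, 650.0, 700.0, 750.0, 800.0]
bmi = [20.5, 21.5, 22.5, 23.5, 24.5, 25.5]
accuracy = loocv_accuracy(energy_kcal, bmi)
```

Each participant is scored by a model that never saw that participant, so the reported accuracy is an out-of-sample estimate rather than a fit statistic such as R².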

  8. Identifying comorbid depression and disruptive behavior disorders: Comparison of two approaches used in adolescent studies

    PubMed Central

    Stoep, Ann Vander; Adrian, Molly C.; Rhew, Isaac C.; McCauley, Elizabeth; Herting, Jerald R.; Kraemer, Helena C.

    2013-01-01

    Interest in commonly co-occurring depression and disruptive behavior disorders in children has yielded a small body of research that estimates the prevalence of this comorbid condition and compares children with the comorbid condition and children with depression or disruptive behavior disorders alone with respect to antecedents and outcomes. Prior studies have used one of two different approaches to measure comorbid disorders: 1) meeting criteria for two DSM or ICD diagnoses or 2) scoring 0.5 SD above the mean or higher on two dimensional scales. This study compares two snapshots of comorbidity taken simultaneously in the same sample with each of the measurement approaches. The Developmental Pathways Project administered structured diagnostic interviews as well as dimensional scales to a community-based sample of 521 11- to 12-year-olds to assess depression and disruptive behavior disorders. Clinical caseness indicators of children identified as “comorbid” by each method were examined concurrently and 3 years later. Cross-classification of adolescents via the two approaches revealed low agreement. When other indicators of caseness, including functional impairment, need for services, and clinical elevations on other symptom scales were examined, adolescents identified as comorbid via dimensional scales only were similar to those who were identified as comorbid via DSM-IV diagnostic criteria. Findings suggest that when relying solely on DSM diagnostic criteria for comorbid depression and disruptive behavior disorders, many adolescents with significant impairment will be overlooked. Findings also suggest that lower dimensional scale thresholds can be set when comorbid conditions, rather than single forms of psychopathology, are being identified. PMID:22575333
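
The dimensional rule and the cross-classification described above can be sketched as follows. The scores and diagnostic flags are hypothetical, the population standard deviation is an assumption, and Cohen's kappa is used here as one common agreement statistic; the abstract does not say which statistic the authors used.

```python
from statistics import mean, pstdev


def above_cutoff(scores, k=0.5):
    """Flag scores at or above mean + k * SD (population SD assumed)."""
    cutoff = mean(scores) + k * pstdev(scores)
    return [s >= cutoff for s in scores]


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of booleans."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    # Undefined when both raters are constant and identical (expected == 1).
    return (observed - expected) / (1 - expected)


# Hypothetical scale scores for eight children (illustration only).
depression = [2, 4, 6, 8, 10, 12, 14, 16]
disruptive = [16, 3, 5, 9, 11, 2, 15, 13]
# Comorbid by the dimensional rule: elevated on both scales.
dimensional = [d and b for d, b in zip(above_cutoff(depression), above_cutoff(disruptive))]
# Hypothetical flags for children meeting criteria for both diagnoses.
diagnostic = [False, False, False, False, True, False, True, False]
kappa = cohen_kappa(dimensional, diagnostic)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when a condition is rare and most children are classified as non-comorbid by both methods.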

  9. A practical approach to identifying maternal deaths missed from routine hospital reports: lessons from Indonesia

    PubMed Central

    Qomariyah, Siti Nurul; Bell, Jacqueline S.; Pambudi, Eko S.; Anggondowati, Trisari; Latief, Kamaluddin; Achadi, Endang L.; Graham, Wendy J.

    2009-01-01

    Background: Accurate estimates of the number of maternal deaths in both the community and facility are important, in order to allocate adequate resources to address such deaths. On the other hand, current studies show that routine methods of identifying maternal deaths in facilities underestimate the number by more than one-half. Objective: To assess the utility of a new approach to identifying maternal deaths in hospitals. Method: Deaths of women of reproductive age were retrospectively identified from registers in two district hospitals in Indonesia over a 24-month period. Based on information retrieved, deaths were classified as ‘maternal’ or ‘non-maternal’ where possible. For deaths that remained unclassified, a detailed case note review was undertaken and the extracted data were used to facilitate classification. Results: One hundred and fifty-five maternal deaths were identified, mainly from the register review. Only 67 maternal deaths were recorded in the hospitals’ routine reports over the same period. This underestimation of maternal deaths was partly due to the incomplete coverage of the routine reporting system; however, even in the wards where routine reports were made, the study identified twice as many deaths. Conclusion: The RAPID method is a practical method that provides a more complete estimate of hospital maternal mortality than routine reporting systems. PMID:20027272

  10. Importance of multi-modal approaches to effectively identify cataract cases from electronic health records

    PubMed Central

    Rasmussen, Luke V; Berg, Richard L; Linneman, James G; McCarty, Catherine A; Waudby, Carol; Chen, Lin; Denny, Joshua C; Wilke, Russell A; Pathak, Jyotishman; Carrell, David; Kho, Abel N; Starren, Justin B

    2012-01-01

    Objective: There is increasing interest in using electronic health records (EHRs) to identify subjects for genomic association studies, due in part to the availability of large amounts of clinical data and the expected cost efficiencies of subject identification. We describe the construction and validation of an EHR-based algorithm to identify subjects with age-related cataracts. Materials and methods: We used a multi-modal strategy consisting of structured database querying, natural language processing on free-text documents, and optical character recognition on scanned clinical images to identify cataract subjects and related cataract attributes. Extensive validation on 3657 subjects compared the multi-modal results to manual chart review. The algorithm was also implemented at participating electronic MEdical Records and GEnomics (eMERGE) institutions. Results: An EHR-based cataract phenotyping algorithm was successfully developed and validated, resulting in positive predictive values (PPVs) >95%. The multi-modal approach increased the identification of cataract subject attributes by a factor of three compared to single-mode approaches while maintaining high PPV. Components of the cataract algorithm were successfully deployed at three other institutions with similar accuracy. Discussion: A multi-modal strategy incorporating optical character recognition and natural language processing may increase the number of cases identified while maintaining similar PPVs. Such algorithms, however, require that the needed information be embedded within clinical documents. Conclusion: We have demonstrated that algorithms to identify and characterize cataracts can be developed utilizing data collected via the EHR. These algorithms provide a high level of accuracy even when implemented across multiple EHRs and institutional boundaries. PMID:22319176

  11. A novel approach for identifying causal models of complex diseases from family data.

    PubMed

    Park, Leeyoung; Kim, Ju H

    2015-04-01

    Causal models including genetic factors are important for understanding the presentation mechanisms of complex diseases. Familial aggregation and segregation analyses based on polygenic threshold models have been the primary approach to fitting genetic models to the family data of complex diseases. In the current study, an advanced approach to obtaining appropriate causal models for complex diseases based on the sufficient component cause (SCC) model involving combinations of traditional genetics principles was proposed. The probabilities for the entire population, i.e., normal-normal, normal-disease, and disease-disease, were considered for each model for the appropriate handling of common complex diseases. The causal model in the current study included the genetic effects from single genes involving epistasis, complementary gene interactions, gene-environment interactions, and environmental effects. Bayesian inference using a Markov chain Monte Carlo (MCMC) algorithm was used to assess the proportions of each component for a given population lifetime incidence. This approach is flexible, allowing both common and rare variants within a gene and across multiple genes. An application to schizophrenia data confirmed the complexity of the causal factors. An analysis of diabetes data demonstrated that environmental factors and gene-environment interactions are the main causal factors for type II diabetes. The proposed method is effective and useful for identifying causal models, which can accelerate the development of efficient strategies for identifying causal factors of complex diseases. PMID:25701286

  12. A systematic approach to identify functional motifs within vertebrate developmental enhancers

    PubMed Central

    Li, Qiang; Ritter, Deborah; Yang, Nan; Dong, Zhiqiang; Li, Hao; Chuang, Jeffrey H.; Guo, Su

    2012-01-01

    Uncovering the cis-regulatory logic of developmental enhancers is critical to understanding the role of non-coding DNA in development. However, it is cumbersome to identify functional motifs within enhancers, and thus few vertebrate enhancers have their core functional motifs revealed. Here we report a combined experimental and computational approach for discovering regulatory motifs in developmental enhancers. Making use of the zebrafish gene expression database, we computationally identified conserved non-coding elements (CNEs) likely to have a desired tissue-specificity based on the expression of nearby genes. Through a high throughput and robust enhancer assay, we tested the activity of ~100 such CNEs and efficiently uncovered developmental enhancers with desired spatial and temporal expression patterns in the zebrafish brain. Application of de novo motif prediction algorithms on a group of forebrain enhancers identified five top-ranked motifs, all of which were experimentally validated as critical for forebrain enhancer activity. These results demonstrate a systematic approach to discover important regulatory motifs in vertebrate developmental enhancers. Moreover, this dataset provides a useful resource for further dissection of vertebrate brain development and function. PMID:19850031

  13. Identification of gefitinib off-targets using a structure-based systems biology approach; their validation with reverse docking and retrospective data mining

    PubMed Central

    Verma, Nidhi; Rai, Amit Kumar; Kaushik, Vibha; Brünnert, Daniela; Chahar, Kirti Raj; Pandey, Janmejay; Goyal, Pankaj

    2016-01-01

    Gefitinib, an EGFR tyrosine kinase inhibitor, is used as an FDA-approved drug in breast cancer and non-small cell lung cancer treatment. However, this drug has certain side effects and complications for which the underlying molecular mechanisms are not well understood. By systems biology based in silico analysis, we identified off-targets of gefitinib that might explain side effects of this drug. The crystal structure of the EGFR-gefitinib complex was used for binding pocket similarity searches on a druggable proteome database (Sc-PDB) by using IsoMIF Finder. The top 128 hits of putative off-targets were validated by a reverse docking approach. The results showed that the identified off-targets bind efficiently to gefitinib. The identified human-specific off-targets were confirmed and further analyzed for their links with biological processes and clinical disease pathways using retrospective studies and literature mining, respectively. Noticeably, many of the off-targets identified in this study were reported in previous high-throughput screenings. Interestingly, the present study reveals that gefitinib may have positive effects in reducing brain and bone metastasis, and may be useful in defining a novel gefitinib-based treatment regimen. We propose that a system-wide approach could be useful during new drug development and to minimize side effects of the prospective drug. PMID:27653775

  14. Impacts of mountaintop mining on terrestrial ecosystem integrity: Identifying landscape thresholds for avian species in the central Appalachians, United States

    USGS Publications Warehouse

    Becker, Douglas A.; Wood, Petra Bohall; Strager, Michael P.; Mazzarella, Christine

    2014-01-01

    Because of little overlap in habitat requirements, managing landscapes simultaneously to maximally benefit both guilds may not be possible. Our avian thresholds identify single community management targets accounting for scarce species. Guild or individual species thresholds allow for species-specific management.

  15. Mining and biodiversity offsets: a transparent and science-based approach to measure "no-net-loss".

    PubMed

    Virah-Sawmy, Malika; Ebeling, Johannes; Taplin, Roslyn

    2014-10-01

    Mining and associated infrastructure developments can present themselves as economic opportunities that are difficult to forego for developing and industrialised countries alike. Almost inevitably, however, they lead to biodiversity loss. This trade-off can be greatest in economically poor but highly biodiverse regions. Biodiversity offsets have, therefore, increasingly been promoted as a mechanism to help achieve both the aims of development and biodiversity conservation. Accordingly, this mechanism is emerging as a key tool for multinational mining companies to demonstrate good environmental stewardship. Relying on offsets to achieve "no-net-loss" of biodiversity, however, requires certainty in their ecological integrity where they are used to sanction habitat destruction. Here, we discuss real-world practices in biodiversity offsetting by assessing how well some leading initiatives internationally integrate critical aspects of biodiversity attributes, net loss accounting and project management. With the aim of improving, rather than merely critiquing the approach, we analyse different aspects of biodiversity offsetting. Further, we analyse the potential pitfalls of developing counterfactual scenarios of biodiversity loss or gains in a project's absence. In this, we draw on insights from experience with carbon offsetting. This informs our discussion of realistic projections of project effectiveness and permanence of benefits to ensure no net losses, and the risk of displacing, rather than avoiding biodiversity losses ("leakage"). We show that the most prominent existing biodiversity offset initiatives employ broad and somewhat arbitrary parameters to measure habitat value and do not sufficiently consider real-world challenges in compensating losses in an effective and lasting manner. We propose a more transparent and science-based approach, supported with a new formula, to help design biodiversity offsets to realise their potential in enabling more responsible

  17. Translational informatics approach for identifying the functional molecular communicators linking coronary artery disease, infection and inflammation

    PubMed Central

    SHARMA, ANKIT; GHATGE, MADANKUMAR; MUNDKUR, LAKSHMI; VANGALA, RAJANI KANTH

    2016-01-01

    Translational informatics approaches are required for the integration of diverse and accumulating data to enable the administration of effective translational medicine specifically in complex diseases such as coronary artery disease (CAD). In the current study, a novel approach for elucidating the association between infection, inflammation and CAD was used. Genes for CAD were collected from the CAD-gene database and those for infection and inflammation were collected from the UniProt database. The cytomegalovirus (CMV)-induced genes were identified from the literature and the CAD-associated clinical phenotypes were obtained from the Unified Medical Language System. A total of 55 gene ontologies (GO) termed functional communicator ontologies were identified in the gene sets linking clinical phenotypes in the diseasome network. The network topology analysis suggested that important functions including viral entry, cell adhesion, apoptosis, inflammatory and immune responses networked with clinical phenotypes. Microarray data were extracted from the Gene Expression Omnibus (dataset: GSE48060) for the highly networked disease myocardial infarction. Further analysis of differentially expressed genes and their GO terms suggested that CMV infection may trigger a xenobiotic response, oxidative stress, inflammation and immune modulation. Notably, the current study identified γ-glutamyl transferase (GGT)-5 as a potential biomarker with an odds ratio of 1.947, which increased to 2.561 following the addition of CMV and CMV-neutralizing antibody (CMV-NA) titers. The C-statistics increased from 0.530 for conventional risk factors (CRFs) to 0.711 for GGT in combination with the above mentioned infections and CRFs. Therefore, the translational informatics approach used in the current study identified a potential molecular mechanism for CMV infection in CAD, and a potential biomarker for risk prediction. PMID:27035874

  18. Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches

    PubMed Central

    Havemann, Frank; Gläser, Jochen; Heinz, Michael; Struck, Alexander

    2012-01-01

    The aim of this paper is to introduce and assess three algorithms for the identification of overlapping thematic structures in networks of papers. We implemented three recently proposed approaches to the identification of overlapping and hierarchical substructures in graphs and applied the corresponding algorithms to a network of 492 information-science papers coupled via their cited sources. The thematic substructures obtained and overlaps produced by the three hierarchical cluster algorithms were compared to a content-based categorisation, which we based on the interpretation of titles, abstracts, and keywords. We defined sets of papers dealing with three topics located on different levels of aggregation: h-index, webometrics, and bibliometrics. We identified these topics with branches in the dendrograms produced by the three cluster algorithms and compared the overlapping topics they detected with one another and with the three predefined paper sets. We discuss the advantages and drawbacks of applying the three approaches to paper networks in research fields. PMID:22479376

  19. Leveraging Concept-based Approaches to Identify Potential Phyto-therapies

    PubMed Central

    Sharma, Vivekanand; Sarkar, Indra Neil

    2013-01-01

    The potential of plant-based remedies has been documented in both traditional and contemporary biomedical literature. Such types of text sources may thus be sources from which one might identify potential plant-based therapies (“phyto-therapies”). Concept-based analytic approaches have been shown to uncover knowledge embedded within biomedical literature. However, to date there has been limited attention towards leveraging such techniques for the identification of potential phyto-therapies. This study presents concept-based analytic approaches for the retrieval and ranking of associations between plants and human diseases. Focusing on identification of phyto-therapies described in MEDLINE, both MeSH descriptors used for indexing and MetaMap inferred UMLS concepts are considered. Furthermore, the identification and ranking consider both direct (i.e., plant concepts directly correlated with disease concepts) and inferred (i.e., plant concepts associated with disease concepts based on shared signs and symptoms) relationships. Based on the two scoring methodologies used in this study, it was found that a vector space model approach outperformed probabilistic reliability based inferences. An evaluation of the approach is provided based on therapeutic interventions catalogued in both ClinicalTrials.gov and NDF-RT. The promising findings from this feasibility study highlight the challenges and applicability of concept-based analytic strategies for distilling phyto-therapeutic knowledge from text based knowledge sources like MEDLINE. PMID:23665360
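
A vector space model ranking of the kind mentioned in this abstract can be sketched as follows. The concept names, the weights and the choice of cosine similarity over shared sign and symptom concepts are assumptions made for illustration; the abstract does not specify the authors' exact weighting or scoring.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {concept: weight} dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_plants(disease_vec, plant_vecs):
    """Return plant names ordered by decreasing similarity to the disease vector."""
    return sorted(plant_vecs, key=lambda p: cosine(plant_vecs[p], disease_vec), reverse=True)


# Hypothetical co-occurrence weights between concepts (illustration only).
disease = {"fever": 2.0, "cough": 1.0}
plants = {
    "plant_a": {"fever": 2.0, "cough": 1.0},
    "plant_b": {"cough": 3.0, "nausea": 1.0},
    "plant_c": {"nausea": 1.0},
}
ranking = rank_plants(disease, plants)
```

Because cosine similarity depends only on direction, a plant mentioned rarely but with the same symptom profile as the disease can still rank above a frequently mentioned plant with a different profile.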

  20. Leveraging concept-based approaches to identify potential phyto-therapies.

    PubMed

    Sharma, Vivekanand; Sarkar, Indra Neil

    2013-08-01

    The potential of plant-based remedies has been documented in both traditional and contemporary biomedical literature. Such types of text sources may thus be sources from which one might identify potential plant-based therapies ("phyto-therapies"). Concept-based analytic approaches have been shown to uncover knowledge embedded within biomedical literature. However, to date there has been limited attention towards leveraging such techniques for the identification of potential phyto-therapies. This study presents concept-based analytic approaches for the retrieval and ranking of associations between plants and human diseases. Focusing on identification of phyto-therapies described in MEDLINE, both MeSH descriptors used for indexing and MetaMap inferred UMLS concepts are considered. Furthermore, the identification and ranking consider both direct (i.e., plant concepts directly correlated with disease concepts) and inferred (i.e., plant concepts associated with disease concepts based on shared signs and symptoms) relationships. Based on the two scoring methodologies used in this study, it was found that a Vector Space Model approach outperformed probabilistic reliability based inferences. An evaluation of the approach is provided based on therapeutic interventions catalogued in both ClinicalTrials.gov and NDF-RT. The promising findings from this feasibility study highlight the challenges and applicability of concept-based analytic strategies for distilling phyto-therapeutic knowledge from text based knowledge sources like MEDLINE.

  1. A multi-criteria decision making approach to identify a vaccine formulation.

    PubMed

    Dewé, Walthère; Durand, Christelle; Marion, Sandie; Oostvogels, Lidia; Devaster, Jeanne-Marie; Fourneau, Marc

    2016-01-01

    This article illustrates the use of a multi-criteria decision making approach, based on desirability functions, to identify an appropriate adjuvant composition for an influenza vaccine to be used in the elderly. The proposed adjuvant system contained two main elements: monophosphoryl lipid and α-tocopherol with squalene in an oil/water emulsion. The objective was to elicit a stronger immune response while maintaining an acceptable reactogenicity and safety profile. The study design, the statistical models, the choice of the desirability functions, the computation of the overall desirability index, and the assessment of the robustness of the ranking are all detailed in this manuscript.
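
A desirability-based ranking can be sketched as follows. The linear desirability functions, the geometric-mean overall index (a common choice in the desirability literature), the limits and the formulation values are all assumptions for illustration; the abstract does not give the functions or data used in the study.

```python
from math import prod


def larger_is_better(y, low, high):
    """Linear desirability: 0 at or below `low`, 1 at or above `high`."""
    return min(1.0, max(0.0, (y - low) / (high - low)))


def smaller_is_better(y, low, high):
    """Linear desirability: 1 at or below `low`, 0 at or above `high`."""
    return min(1.0, max(0.0, (high - y) / (high - low)))


def overall_desirability(ds):
    """Geometric mean of individual desirabilities; any zero gives an overall zero."""
    return prod(ds) ** (1.0 / len(ds))


# Hypothetical formulations: (immune response score, reactogenicity rate in %).
formulations = {"A": (80.0, 30.0), "B": (70.0, 10.0), "C": (95.0, 55.0)}
scores = {
    name: overall_desirability([
        larger_is_better(immune, 40.0, 100.0),
        smaller_is_better(reacto, 0.0, 50.0),
    ])
    for name, (immune, reacto) in formulations.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
```

The geometric mean makes the criteria non-compensatory at the extremes: formulation C has the strongest immune response in this toy data, but its unacceptable reactogenicity drives its overall index to zero.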

  2. Demonstrating a Market-Based Approach to the Reclamation of Mined Lands in West Virginia

    SciTech Connect

    John W. Goodrich-Mahoney; Paul Ziemkiewicz

    2006-07-19

    This is the third quarter progress report of Phase II of a three-phase project to develop and evaluate the efficacy of developing multiple environmental market trading credits on a partially reclaimed surface mined site near Valley Point, Preston County, WV. Construction of the passive acid mine drainage (AMD) treatment system was completed but several modifications from the original design had to be made following the land survey and during construction to compensate for unforeseen circumstances. We continued to collect baseline quality data from the Conner Run AMD seeps to confirm the conceptual and final design for the passive AMD treatment system.

  3. An Integrated Human/Murine Transcriptome and Pathway Approach To Identify Prenatal Treatments For Down Syndrome

    PubMed Central

    Guedj, Faycal; Pennings, Jeroen LA; Massingham, Lauren J.; Wick, Heather C.; Siegel, Ashley E.; Tantravahi, Umadevi; Bianchi, Diana W.

    2016-01-01

    Anatomical and functional brain abnormalities begin during fetal life in Down syndrome (DS). We hypothesize that novel prenatal treatments can be identified by targeting signaling pathways that are consistently perturbed in cell types/tissues obtained from human fetuses with DS and mouse embryos. We analyzed transcriptome data from fetuses with trisomy 21, age and sex-matched euploid controls, and embryonic day 15.5 forebrains from Ts1Cje, Ts65Dn, and Dp16 mice. The new datasets were compared to other publicly available datasets from humans with DS. We used the human Connectivity Map (CMap) database and created a murine adaptation to identify FDA-approved drugs that can rescue affected pathways. USP16 and TTC3 were dysregulated in all affected human cells and two mouse models. DS-associated pathway abnormalities were either the result of gene dosage specific effects or the consequence of a global cell stress response with activation of compensatory mechanisms. CMap analyses identified 56 molecules with high predictive scores to rescue abnormal gene expression in both species. Our novel integrated human/murine systems biology approach identified commonly dysregulated genes and pathways. This can help to prioritize therapeutic molecules on which to further test safety and efficacy. Additional studies in human cells are ongoing prior to pre-clinical prenatal treatment in mice. PMID:27586445

  4. An Integrated Human/Murine Transcriptome and Pathway Approach To Identify Prenatal Treatments For Down Syndrome.

    PubMed

    Guedj, Faycal; Pennings, Jeroen LA; Massingham, Lauren J; Wick, Heather C; Siegel, Ashley E; Tantravahi, Umadevi; Bianchi, Diana W

    2016-01-01

    Anatomical and functional brain abnormalities begin during fetal life in Down syndrome (DS). We hypothesize that novel prenatal treatments can be identified by targeting signaling pathways that are consistently perturbed in cell types/tissues obtained from human fetuses with DS and mouse embryos. We analyzed transcriptome data from fetuses with trisomy 21, age and sex-matched euploid controls, and embryonic day 15.5 forebrains from Ts1Cje, Ts65Dn, and Dp16 mice. The new datasets were compared to other publicly available datasets from humans with DS. We used the human Connectivity Map (CMap) database and created a murine adaptation to identify FDA-approved drugs that can rescue affected pathways. USP16 and TTC3 were dysregulated in all affected human cells and two mouse models. DS-associated pathway abnormalities were either the result of gene dosage specific effects or the consequence of a global cell stress response with activation of compensatory mechanisms. CMap analyses identified 56 molecules with high predictive scores to rescue abnormal gene expression in both species. Our novel integrated human/murine systems biology approach identified commonly dysregulated genes and pathways. This can help to prioritize therapeutic molecules on which to further test safety and efficacy. Additional studies in human cells are ongoing prior to pre-clinical prenatal treatment in mice. PMID:27586445

  5. A Multiple-Tracer Approach for Identifying Sewage Sources to an Urban Stream System

    USGS Publications Warehouse

    Hyer, Kenneth Edward

    2007-01-01

    The presence of human-derived fecal coliform bacteria (sewage) in streams and rivers is recognized as a human health hazard. The source of these human-derived bacteria, however, is often difficult to identify and eliminate, because sewage can be delivered to streams through a variety of mechanisms, such as leaking sanitary sewers or private lateral lines, cross-connected pipes, straight pipes, sewer-line overflows, illicit dumping of septic waste, and vagrancy. A multiple-tracer study was conducted to identify site-specific sources of sewage in Accotink Creek, an urban stream in Fairfax County, Virginia, that is listed on the Commonwealth's priority list of impaired streams for violations of the fecal coliform bacteria standard. Beyond developing this multiple-tracer approach for locating sources of sewage inputs to Accotink Creek, the second objective of the study was to demonstrate how the multiple-tracer approach can be applied to other streams affected by sewage sources. The tracers used in this study were separated into indicator tracers, which are relatively simple and inexpensive to apply, and confirmatory tracers, which are relatively difficult and expensive to analyze. Indicator tracers include fecal coliform bacteria, surfactants, boron, chloride, chloride/bromide ratio, specific conductance, dissolved oxygen, turbidity, and water temperature. Confirmatory tracers include 13 organic compounds that are associated with human waste, including caffeine, cotinine, triclosan, a number of detergent metabolites, several fragrances, and several plasticizers. To identify sources of sewage to Accotink Creek, a detailed investigation of the Accotink Creek main channel, tributaries, and flowing storm drains was undertaken from 2001 to 2004. Sampling was conducted in a series of eight synoptic sampling events, each of which began at the most downstream site and extended upstream through the watershed and into the headwaters of each tributary. Using the synoptic

  6. Biomarker Identification Using Text Mining

    PubMed Central

    Li, Hui; Liu, Chunmei

    2012-01-01

    Identifying molecular biomarkers has become one of the important tasks for scientists to assess the different phenotypic states of cells or organisms correlated to the genotypes of diseases from large-scale biological data. In this paper, we propose a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary, and then we use a finite state machine to identify the biomarkers. Our text-mining method provides a highly reliable approach to discovering biomarkers in the PubMed database. PMID:23197989
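
    The abstract does not describe the dictionary or the state machine in detail, so the following is only a minimal sketch of the general idea: dictionary terms are compiled into a token-level finite state machine (a trie), and text is scanned for the longest match at each position. The function names, the example terms and the example sentence are all hypothetical and are not taken from the paper.

```python
import re

def build_fsm(terms):
    """Compile dictionary terms into a token-level state machine (a trie)."""
    transitions = [{}]   # transitions[state][token] -> next state
    accepting = {}       # accepting[state] -> dictionary term ending there
    for term in terms:
        state = 0
        for token in term.lower().split():
            if token not in transitions[state]:
                transitions.append({})
                transitions[state][token] = len(transitions) - 1
            state = transitions[state][token]
        accepting[state] = term
    return transitions, accepting

def find_terms(text, fsm):
    """Return the longest dictionary match starting at each token position."""
    transitions, accepting = fsm
    tokens = re.findall(r"[a-z0-9\-]+", text.lower())
    found, i = [], 0
    while i < len(tokens):
        state, j, last = 0, i, None
        # Follow transitions as far as the tokens allow, remembering
        # the most recent accepting state (longest match so far).
        while j < len(tokens) and tokens[j] in transitions[state]:
            state = transitions[state][tokens[j]]
            j += 1
            if state in accepting:
                last = (accepting[state], j)
        if last:
            found.append(last[0])
            i = last[1]      # resume after the matched term
        else:
            i += 1
    return found

# Hypothetical dictionary and sentence, for illustration only.
fsm = build_fsm(["prostate-specific antigen", "HER2", "C-reactive protein"])
hits = find_terms("Elevated C-reactive protein and HER2 were reported.", fsm)
print(hits)
```

    A trie keeps lookup cost proportional to the length of the text rather than the size of the dictionary, which is one reason a state machine is a natural fit for dictionary-based tagging.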

  7. Approach for Identifying Human Leukocyte Antigen (HLA)-DR Bound Peptides from Scarce Clinical Samples.

    PubMed

    Heyder, Tina; Kohler, Maxie; Tarasova, Nataliya K; Haag, Sabrina; Rutishauser, Dorothea; Rivera, Natalia V; Sandin, Charlotta; Mia, Sohel; Malmström, Vivianne; Wheelock, Åsa M; Wahlström, Jan; Holmdahl, Rikard; Eklund, Anders; Zubarev, Roman A; Grunewald, Johan; Ytterberg, A Jimmy

    2016-09-01

    Immune-mediated diseases strongly associating with human leukocyte antigen (HLA) alleles are likely linked to specific antigens. These antigens are presented to T cells in the form of peptides bound to HLA molecules on antigen presenting cells, e.g. dendritic cells, macrophages or B cells. The identification of HLA-DR-bound peptides presents a valuable tool to investigate the human immunopeptidome. The lung is likely a key player in the activation of potentially auto-aggressive T cells prior to entering target tissues and inducing autoimmune disease. This makes the lung of exceptional interest and presents an ideal paradigm to study the human immunopeptidome and to identify antigenic peptides. Our previous investigation of HLA-DR peptide presentation in the lung required high numbers of cells (800 × 10^6 bronchoalveolar lavage (BAL) cells). Because BAL from healthy nonsmokers typically contains 10-15 × 10^6 cells, there is a need for a highly sensitive approach to study immunopeptides in the lungs of individual patients and controls. In this work, we analyzed the HLA-DR immunopeptidome in the lung by an optimized methodology to identify HLA-DR-bound peptides from low cell numbers. We used an Epstein-Barr Virus (EBV) immortalized B cell line and bronchoalveolar lavage (BAL) cells obtained from patients with sarcoidosis, an inflammatory T cell driven disease mainly occurring in the lung. Specifically, membrane complexes were isolated prior to immunoprecipitation, eluted peptides were identified by nanoLC-MS/MS and processed using the in-house developed ClusterMHCII software. With the optimized procedure we were able to identify peptides from 10 × 10^6 cells, which on average correspond to 10.9 peptides/million cells in EBV-B cells and 9.4 peptides/million cells in BAL cells. This work presents an optimized approach designed to identify HLA-DR-bound peptides from low numbers of cells, enabling the investigation of the BAL immunopeptidome from individual patients

  8. Deformation Prediction and Geometrical Modeling of Head and Neck Cancer Tumor: A Data Mining Approach

    NASA Astrophysics Data System (ADS)

    Azimi, Maryam

    Radiation therapy has been used in the treatment of cancer tumors for several years and many cancer patients receive radiotherapy. It may be used as primary therapy or in combination with surgery or other kinds of therapy such as chemotherapy, hormone therapy or some mixture of the three. The treatment objective is to destroy cancer cells or shrink the tumor by planning an adequate radiation dose to the desired target without damaging the normal tissues. By using the pre-treatment Computed Tomography (CT) images, most of the radiotherapy planning systems design the target and assume that the size of the tumor will not change throughout the treatment course, which takes 5 to 7 weeks. Based on this assumption, the total amount of radiation is planned and fractionated for the daily dose required to be delivered to the patient's body. However, this assumption is flawed because the patients receiving radiotherapy have marked changes in tumor geometry during the treatment period. Therefore, there is a critical need to understand the changes of the tumor shape and size over time during the course of radiotherapy in order to prevent significant effects of inaccuracy in the planning. In this research, a methodology is proposed in order to monitor and predict daily (fraction day) tumor volume and surface changes of head and neck cancer tumors during the entire treatment period. In the proposed method, geometrical modeling and data mining techniques will be used rather than repetitive CT scan data to predict the tumor deformation for radiation planning. Clinical patient data were obtained from the University of Texas-MD Anderson Cancer Center (MDACC). In the first step, by using CT scan data, the tumor's progressive geometric changes during the treatment period are quantified. The next step relates to using regression analysis in order to develop predictive models for tumor geometry based on the geometric analysis results and the patients' selected attributes (age, weight

  9. Energy-optimised pharmacophore approach to identify potential hotspots during inhibition of Class II HDAC isoforms.

    PubMed

    Ganai, Shabir Ahmad; Shanmugam, Karthi; Mahadevan, Vijayalakshmi

    2015-01-01

    Histone deacetylases (HDACs) are conjugated enzymes that modulate chromatin architecture by deacetylating lysine residues on the histone tails leading to transcriptional repression. Pharmacological interventions of these enzymes with small molecule inhibitors called Histone deacetylase inhibitors (HDACi) have shown enhanced acetylation of the genome and are hence emerging as potential targets at the clinic. Type-specific inhibition of Class II HDACs has shown enhanced therapeutic benefits against developmental and neurodegenerative disorders. However, the structural identity of class-specific isoforms limits the potential of their inhibitors in precise targeting of their enzymes. Diverse strategies have been implemented to recognise the features in HDAC enzymes which may help in identifying isoform specificity factors. This work attempts a computational approach that combines in silico docking and energy-optimised pharmacophore (E-pharmacophore) mapping of 18 known HDAC inhibitors and has identified structural variations that regulate their interactions against the six Class II HDAC enzymes considered for the study. This combined approach establishes that inhibitors possessing higher number of aromatic rings in different structural regions might function as potent inhibitors, while inhibitors with scarce ring structures might point to compromised potency. This would aid the rationale for chemical optimisation and design of isoform selective HDAC inhibitors with enhanced affinity and therapeutic efficiency.

  10. A Mutant Library Approach to Identify Improved Meningococcal Factor H Binding Protein Vaccine Antigens

    PubMed Central

    Konar, Monica; Rossi, Raffaella; Walter, Helen; Pajon, Rolando; Beernink, Peter T.

    2015-01-01

    Factor H binding protein (FHbp) is a virulence factor used by meningococci to evade the host complement system. FHbp elicits bactericidal antibodies in humans and is part of two recently licensed vaccines. Using human complement Factor H (FH) transgenic mice, we previously showed that binding of FH decreased the protective antibody responses to FHbp vaccination. Therefore, in the present study we devised a library-based method to identify mutant FHbp antigens with very low binding of FH. Using an FHbp sequence variant in one of the two licensed vaccines, we displayed an error-prone PCR mutant FHbp library on the surface of Escherichia coli. We used fluorescence-activated cell sorting to isolate FHbp mutants with very low binding of human FH and preserved binding of control anti-FHbp monoclonal antibodies. We sequenced the gene encoding FHbp from selected clones and introduced the mutations into a soluble FHbp construct. Using this approach, we identified several new mutant FHbp vaccine antigens that had very low binding of FH as measured by ELISA and surface plasmon resonance. The new mutant FHbp antigens elicited protective antibody responses in human FH transgenic mice that were up to 20-fold higher than those elicited by the wild-type FHbp antigen. This approach offers the potential to discover mutant antigens that might not be predictable even with protein structural information and potentially can be applied to other microbial vaccine antigens that bind host proteins. PMID:26057742

  11. Data mining for water resource management part 2 - methods and approaches to solving contemporary problems

    USGS Publications Warehouse

    Roehl, Edwin A.; Conrads, Paul A.

    2010-01-01

    This is the second of two papers that describe how data mining can aid natural-resource managers with the difficult problem of controlling the interactions between hydrologic and man-made systems. Data mining is a new science that assists scientists in converting large databases into knowledge, and is uniquely able to leverage the large amounts of real-time, multivariate data now being collected for hydrologic systems. Part 1 gives a high-level overview of data mining, and describes several applications that have addressed major water resource issues in South Carolina. This Part 2 paper describes how various data mining methods are integrated to produce predictive models for controlling surface- and groundwater hydraulics and quality. The methods include:
    - signal processing to remove noise and decompose complex signals into simpler components;
    - time series clustering that optimally groups hundreds of signals into "classes" that behave similarly for data reduction and (or) divide-and-conquer problem solving;
    - classification which optimally matches new data to behavioral classes;
    - artificial neural networks which optimally fit multivariate data to create predictive models;
    - model response surface visualization that greatly aids in understanding data and physical processes; and,
    - decision support systems that integrate data, models, and graphics into a single package that is easy to use.
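
    As a rough illustration of the first method listed (signal processing to remove noise and decompose a signal into simpler components), a signal can be split into a smoothed trend and a residual with a centered moving average. This is a generic sketch, not the filtering used in the paper; the function names, the window size and the sample readings are assumptions made for the example.

```python
def moving_average(signal, window):
    """Centered moving average; the window shrinks at the edges of the signal."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

def decompose(signal, window=5):
    """Split a signal into a slow trend and a fast residual (signal = trend + residual)."""
    trend = moving_average(signal, window)
    residual = [s - t for s, t in zip(signal, trend)]
    return trend, residual

# Hypothetical water-level readings, for illustration only.
readings = [2.1, 2.3, 2.0, 2.6, 2.4, 2.9, 2.7, 3.1, 3.0, 3.4]
trend, residual = decompose(readings, window=3)
print([round(t, 2) for t in trend])
print([round(r, 2) for r in residual])
```

    The two components can then be modeled separately, which is the divide-and-conquer idea the abstract refers to.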

  12. Early Prediction of Students' Grade Point Averages at Graduation: A Data Mining Approach

    ERIC Educational Resources Information Center

    Tekin, Ahmet

    2014-01-01

    Problem Statement: There has recently been interest in educational databases containing a variety of valuable but sometimes hidden data that can be used to help less successful students to improve their academic performance. The extraction of hidden information from these databases often implements aspects of the educational data mining (EDM)…

  13. General Purpose 2D and 3D Similarity Approach to Identify hERG Blockers.

    PubMed

    Schyman, Patric; Liu, Ruifeng; Wallqvist, Anders

    2016-01-25

    Screening compounds for human ether-à-go-go-related gene (hERG) channel inhibition is an important component of early stage drug development and assessment. In this study, we developed a high-confidence (p-value < 0.01) hERG prediction model based on a combined two-dimensional (2D) and three-dimensional (3D) modeling approach. We developed a 3D similarity conformation approach (SCA) based on examining a limited fixed number of pairwise 3D similarity scores between a query molecule and a set of known hERG blockers. By combining 3D SCA with 2D similarity ensemble approach (SEA) methods, we achieved a maximum sensitivity in hERG inhibition prediction with an accuracy not achieved by either method separately. The combined model achieved 69% sensitivity and 95% specificity on an independent external data set. Further validation showed that the model correctly picked up documented hERG inhibition or interactions among the Food and Drug Administration-approved drugs with the highest similarity scores, with 18 of 20 correctly identified. The combination of ascertaining 2D and 3D similarity of compounds allowed us to synergistically use 2D fingerprint matching with 3D shape and chemical complementarity matching. PMID:26718126
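
    For readers less familiar with the two metrics reported above, the sketch below shows how sensitivity and specificity are computed from binary labels (1 = blocker, 0 = non-blocker). The labels are made up for illustration and do not reproduce the paper's data; the function name is likewise an assumption, and the sketch assumes both classes are present.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: 4 known blockers followed by 6 non-blockers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

    A model tuned for high specificity, as reported here, raises few false alarms at the cost of missing some true blockers.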

  15. A recursive network approach can identify constitutive regulatory circuits in gene expression data

    NASA Astrophysics Data System (ADS)

    Blasi, Monica Francesca; Casorelli, Ida; Colosimo, Alfredo; Blasi, Francesco Simone; Bignami, Margherita; Giuliani, Alessandro

    2005-03-01

    The activity of the cell is often coordinated by the organisation of proteins into regulatory circuits that share a common function. Genome-wide expression profiles might contain important information on these circuits. Current approaches for the analysis of gene expression data include clustering the individual expression measurements and relating them to biological functions as well as modelling and simulation of gene regulation processes by additional computer tools. The identification of the regulative programmes from microarray experiments is limited, however, by the intrinsic difficulty of linear methods to detect low-variance signals and by the sensitivity of the different approaches. Here we face the problem of recognising invariant patterns of correlations among gene expression reminiscent of regulation circuits. We demonstrate that a recursive neural network approach can identify genetic regulation circuits from expression data for ribosomal and genome stability genes. The proposed method, by greatly enhancing the sensitivity of microarray studies, allows the identification of important aspects of genetic regulation networks and might be useful for the discrimination of the different players involved in regulation circuits. Our results suggest that the constitutive regulatory networks involved in the generic organisation of the cell display a high degree of clustering depending on a modular architecture.

  16. Identifying functional reorganization of spelling networks: an individual peak probability comparison approach

    PubMed Central

    Purcell, Jeremy J.; Rapp, Brenda

    2013-01-01

    Previous research has shown that damage to the neural substrates of orthographic processing can lead to functional reorganization during reading (Tsapkini et al., 2011); in this research we ask if the same is true for spelling. To examine the functional reorganization of spelling networks we present a novel three-stage Individual Peak Probability Comparison (IPPC) analysis approach for comparing the activation patterns obtained during fMRI of spelling in a single brain-damaged individual with dysgraphia to those obtained in a set of non-impaired control participants. The first analysis stage characterizes the convergence in activations across non-impaired control participants by applying a technique typically used for characterizing activations across studies: Activation Likelihood Estimate (ALE) (Turkeltaub et al., 2002). This method was used to identify locations that have a high likelihood of yielding activation peaks in the non-impaired participants. The second stage provides a characterization of the degree to which the brain-damaged individual's activations correspond to the group pattern identified in Stage 1. This involves performing a Mahalanobis distance statistics analysis (Tsapkini et al., 2011) that compares each of a control group's peak activation locations to the nearest peak generated by the brain-damaged individual. The third stage evaluates the extent to which the brain-damaged individual's peaks are atypical relative to the range of individual variation among the control participants. This IPPC analysis allows for a quantifiable, statistically sound method for comparing an individual's activation pattern to the patterns observed in a control group and, thus, provides a valuable tool for identifying functional reorganization in a brain-damaged individual with impaired spelling. Furthermore, this approach can be applied more generally to compare any individual's activation pattern with that of a set of other individuals. PMID:24399981
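
    The second stage relies on the Mahalanobis distance, which measures how far a point lies from a cloud of points in units of that cloud's own spread. The sketch below is one plausible reading, assuming the individual's peak is compared against the control participants' peaks for a single location; it is not the authors' implementation, and the coordinates and function names are hypothetical.

```python
import math

def mean_and_cov(points):
    """Sample mean and (n - 1)-normalized covariance of a list of points."""
    n, d = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(d)]
    cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / (n - 1)
            for b in range(d)] for a in range(d)]
    return mean, cov

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def mahalanobis(point, points):
    """sqrt(diff^T * cov^-1 * diff), where diff = point - mean(points)."""
    mean, cov = mean_and_cov(points)
    diff = [point[k] - mean[k] for k in range(len(mean))]
    weighted = solve(cov, diff)   # cov^-1 @ diff without forming the inverse
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))

# Hypothetical peak coordinates (mm) for six control participants and one individual.
control_peaks = [(-44, -56, -14), (-42, -58, -12), (-46, -54, -16),
                 (-40, -60, -10), (-45, -57, -11), (-43, -55, -15)]
individual_peak = (-38, -50, -8)
print(round(mahalanobis(individual_peak, control_peaks), 2))
```

    Unlike Euclidean distance, this accounts for the controls varying more along some axes than others, so a displacement along a tightly clustered axis counts as more atypical.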

  17. Identifying Quality Indicators Used by Patients to Choose Secondary Health Care Providers: A Mixed Methods Approach

    PubMed Central

    Zaman, Saman Sara; Kahlon, Gurnaaz Kaur; Naik, Aditi; Jessel, Amar Singh; Nanavati, Niraj; Shah, Akash; Cox, Benita; Darzi, Ara

    2015-01-01

    Background: Patients in health systems across the world can now choose between different health care providers. Patients are increasingly using websites and apps to compare the quality of health care services available in order to make a choice of provider. In keeping with many patient-facing platforms, most services currently providing comparative information on different providers do not take account of end-user requirements or the available evidence base. Objective: To investigate what factors were considered most important when choosing nonemergency secondary health care providers in the United Kingdom with the purpose of translating these insights into a ratings platform delivered through a consumer mHealth app. Methods: A mixed methods approach was used to identify key indicators incorporating a literature review to identify and categorize existing quality indicators, a questionnaire survey to formulate a ranked list of performance indicators, and focus groups to explore rationales behind the rankings. Findings from qualitative and quantitative methodologies were mapped onto each other under the four categories identified by the literature review. Results: Quality indicators were divided into four categories. Hospital access was the least important category. The mean differences between the other three categories, hospital statistics, hospital staff, and hospital facilities, were not statistically significant. Staff competence was the most important indicator in the hospital staff category; cleanliness and up-to-date facilities were equally important in hospital facilities; ease of travel to the hospital was found to be most important in hospital access. All quality indicators within the hospital statistics category were equally important. Focus groups elaborated that users find it difficult to judge staff competence despite its importance. Conclusions: A mixed methods approach is presented, which supported a patient-centered development and evaluation of a

  18. A Likelihood-Based Approach to Identifying Contaminated Food Products Using Sales Data: Performance and Challenges

    PubMed Central

    Kaufman, James; Lessler, Justin; Harry, April; Edlund, Stefan; Hu, Kun; Douglas, Judith; Thoens, Christian; Appel, Bernd; Käsbohrer, Annemarie; Filter, Matthias

    2014-01-01

    Foodborne disease outbreaks of recent years demonstrate that due to increasingly interconnected supply chains these types of crisis situations have the potential to affect thousands of people, leading to significant healthcare costs, loss of revenue for food companies, and, in the worst cases, death. When a disease outbreak is detected, identifying the contaminated food quickly is vital to minimize suffering and limit economic losses. Here we present a likelihood-based approach that has the potential to reduce the time needed to identify possibly contaminated food products, which is based on exploitation of food products sales data and the distribution of foodborne illness case reports. Using a real world food sales data set and artificially generated outbreak scenarios, we show that this method performs very well for contamination scenarios originating from a single “guilty” food product. As it is neither always possible nor necessary to identify the single offending product, the method has been extended such that it can be used as a binary classifier. With this extension it is possible to generate a set of potentially “guilty” products that contains the real outbreak source with very high accuracy. Furthermore we explore the patterns of food distributions that lead to “hard-to-identify” foods, the possibility of identifying these food groups a priori, and the extent to which the likelihood-based method can be used to quantify uncertainty. We find that high spatial correlation of sales data between products may be a useful indicator for “hard-to-identify” products. PMID:24992565
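
    The abstract does not give the likelihood function, so the following sketch assumes a simple multinomial model: if a product were the source, case reports would fall across regions in proportion to that product's sales. Products are then ranked by the log-likelihood of the observed case counts. The product names, regions, counts and the smoothing constant are all hypothetical.

```python
import math

def log_likelihood(case_counts, sales, smoothing=0.5):
    """Multinomial log-likelihood (up to a constant) of cases given sales shares.

    Additive smoothing keeps regions with zero sales from producing log(0).
    """
    regions = set(case_counts) | set(sales)
    total = sum(sales.values()) + smoothing * len(regions)
    return sum(
        cases * math.log((sales.get(region, 0) + smoothing) / total)
        for region, cases in case_counts.items()
    )

def rank_products(case_counts, sales_by_product):
    """Order products from most to least likely outbreak source."""
    scores = {name: log_likelihood(case_counts, sales)
              for name, sales in sales_by_product.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical sales volumes per region and reported cases per region.
sales_by_product = {
    "sprouts": {"north": 80, "south": 15, "east": 5},
    "lettuce": {"north": 30, "south": 40, "east": 30},
    "cheese": {"north": 10, "south": 10, "east": 80},
}
case_counts = {"north": 12, "south": 3, "east": 1}
ranking = rank_products(case_counts, sales_by_product)
print(ranking)
```

    Under this assumed model, two products with nearly identical regional sales patterns receive nearly identical scores, which is consistent with the abstract's remark that high spatial correlation of sales makes products hard to tell apart.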

  19. A Sparse Reconstruction Approach for Identifying Gene Regulatory Networks Using Steady-State Experiment Data

    PubMed Central

    Zhang, Wanhong; Zhou, Tong

    2015-01-01

    Motivation: Identifying gene regulatory networks (GRNs) which consist of a large number of interacting units has become a problem of paramount importance in systems biology. Causal interactions among these units often must be reconstructed from measured expression data and other a priori information. Though numerous classical methods have been developed to unravel the interactions of GRNs, these methods have either higher computational complexity or lower estimation accuracy. Note that great similarities exist between identification of genes that directly regulate a specific gene and a sparse vector reconstruction, which often relates to the determination of the number, location and magnitude of nonzero entries of an unknown vector by solving an underdetermined system of linear equations y = Φx. Based on these similarities, we propose a novel framework of sparse reconstruction to identify the structure of a GRN, so as to increase accuracy of causal regulation estimations, as well as to reduce their computational complexity. Results: In this paper, a sparse reconstruction framework is proposed on the basis of steady-state experiment data to identify GRN structure. Unlike traditional methods, the approach is well suited to large-scale underdetermined problems in which a sparse vector must be inferred. We investigate how to combine the noisy steady-state experiment data and a sparse reconstruction algorithm to identify causal relationships. Efficiency of this method is tested by an artificial linear network, a mitogen-activated protein kinase (MAPK) pathway network and the in silico networks of the DREAM challenges. The performance of the suggested approach is compared with two state-of-the-art algorithms, the widely adopted total least-squares (TLS) method and the available results from the DREAM project. Actual results show that, with a lower computational cost, the proposed method can
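
    The abstract does not name the sparse reconstruction algorithm, so the sketch below substitutes a standard one, iterative soft thresholding (ISTA), to show what recovering a sparse x from an underdetermined system y = Φx can look like. It minimizes 0.5 * ||y - Φx||^2 + lam * ||x||_1. The matrix, the measurements and the parameter values are hypothetical, and recovery of the true support is not guaranteed for a matrix this small.

```python
def soft_threshold(value, threshold):
    """Shrink a value toward zero by threshold; small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def ista(phi, y, lam=0.1, iterations=2000):
    """Iterative soft thresholding for 0.5*||y - phi x||^2 + lam*||x||_1."""
    rows, cols = len(phi), len(phi[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of
    # phi^T phi, so 1 / lipschitz is a safe (conservative) step size.
    lipschitz = sum(v * v for row in phi for v in row)
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [y[i] - sum(phi[i][j] * x[j] for j in range(cols))
                    for i in range(rows)]
        correlation = [sum(phi[i][j] * residual[i] for i in range(rows))
                       for j in range(cols)]
        x = [soft_threshold(x[j] + correlation[j] / lipschitz, lam / lipschitz)
             for j in range(cols)]
    return x

# Hypothetical underdetermined system: 3 measurements, 5 candidate regulators.
phi = [[1.0, 0.2, 0.0, 0.5, 0.1],
       [0.0, 1.0, 0.3, 0.2, 0.4],
       [0.3, 0.1, 1.0, 0.0, 0.6]]
y = [0.3, 1.5, 0.15]   # generated from x = [0, 1.5, 0, 0, 0]
estimate = ista(phi, y, lam=0.05)
print([round(v, 3) for v in estimate])
```

    In the GRN setting described above, the nonzero entries of the estimate would be read as the candidate direct regulators of the gene whose measurements form y.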

  20. A new simplex chemometric approach to identify olive oil blends with potentially high traceability.

    PubMed

    Semmar, N; Laroussi-Mezghani, S; Grati-Kamoun, N; Hammami, M; Artaud, J

    2016-10-01

    Olive oil blends (OOBs) are complex matrices combining different cultivars at variable proportions. Although qualitative determinations of OOBs have been subjected to several chemometric works, quantitative evaluations of their contents remain poorly developed because of traceability difficulties concerning co-occurring cultivars. Around this question, we recently published an original simplex approach helping to develop predictive models of the proportions of co-occurring cultivars from chemical profiles of resulting blends (Semmar & Artaud, 2015). Beyond predictive model construction and validation, this paper presents an extension based on prediction errors' analysis to statistically define the blends with the highest predictability among all the possible ones that can be made by mixing cultivars at different proportions. This provides an interesting way to identify a priori labeled commercial products with potentially high traceability taking into account the natural chemical variability of different constitutive cultivars.

  1. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach

    PubMed Central

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented, through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members. PMID:27611313

  2. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach.

    PubMed

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented, through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members. PMID:27611313

  3. Cocrystal dissociation in the presence of water: a general approach for identifying stable cocrystal forms.

    PubMed

    Eddleston, Mark D; Madusanka, Nadeesh; Jones, William

    2014-09-01

    In previous studies, cocrystals have been shown to be susceptible to dissociation at high humidity because of differences in the solubilities of the two coformer molecules, especially when these molecules can form hydrates. Contrastingly, however, the propensity of the pharmaceutically active compound caffeine to hydrate formation is reduced by cocrystallization with oxalic acid. Here, the stability of the oxalic acid cocrystal of caffeine is investigated from a thermodynamic perspective through the use of aqueous slurries of caffeine hydrate and oxalic acid dihydrate. Conversion to the anhydrous caffeine-oxalic acid cocrystal occurred under these conditions confirming that this form is thermodynamically stable in an aqueous environment. The slurry methodology was further developed as a general approach to screening for cocrystals that are not susceptible to dissociation at high humidity. In this manner, cocrystals of the hydrate-forming molecules theophylline, carbamazepine, and piroxicam that are stable at high humidity, indefinitely avoiding hydrate formation, were identified.

  4. An integrated genomic and proteomic approach to identify signatures of endosulfan exposure in hepatocellular carcinoma cells.

    PubMed

    Gandhi, Deepa; Tarale, Prashant; Naoghare, Pravin K; Bafana, Amit; Krishnamurthi, Kannan; Arrigo, Patrizio; Saravanadevi, Sivanesan

    2015-11-01

    The present study reports the identification of genomic and proteomic signatures of endosulfan exposure in hepatocellular carcinoma cells (HepG2). HepG2 cells were exposed to a sublethal concentration (15 μM) of endosulfan for 24 h. DNA microarray and MALDI-TOF-MS analyses revealed that endosulfan induced significant alterations in the expression level of genes and proteins involved in multiple cellular pathways (apoptosis, transcription, immune/inflammatory response, carbohydrate metabolism, etc.). Furthermore, downregulation of PHLDA gene, upregulation of ACIN1 protein and caspase-3 activation in exposed cells indicated that endosulfan can trigger apoptotic cascade in hepatocellular carcinoma cells. In total, 135 transcripts and 19 proteins were differentially expressed. This study presents an integrated approach to identify the alteration of biological/cellular pathways in HepG2 cells upon endosulfan exposure.

  5. A new simplex chemometric approach to identify olive oil blends with potentially high traceability.

    PubMed

    Semmar, N; Laroussi-Mezghani, S; Grati-Kamoun, N; Hammami, M; Artaud, J

    2016-10-01

    Olive oil blends (OOBs) are complex matrices combining different cultivars at variable proportions. Although qualitative determinations of OOBs have been subjected to several chemometric works, quantitative evaluations of their contents remain poorly developed because of traceability difficulties concerning co-occurring cultivars. Around this question, we recently published an original simplex approach helping to develop predictive models of the proportions of co-occurring cultivars from chemical profiles of resulting blends (Semmar & Artaud, 2015). Beyond predictive model construction and validation, this paper presents an extension based on prediction errors' analysis to statistically define the blends with the highest predictability among all the possible ones that can be made by mixing cultivars at different proportions. This provides an interesting way to identify a priori labeled commercial products with potentially high traceability taking into account the natural chemical variability of different constitutive cultivars. PMID:27132835

  6. Identifying the critical financial ratios for stocks evaluation: A fuzzy delphi approach

    NASA Astrophysics Data System (ADS)

    Mokhtar, Mazura; Shuib, Adibah; Mohamad, Daud

    2014-12-01

    Stocks evaluation has always been an interesting and challenging problem for both researchers and practitioners. Generally, the evaluation can be made based on a set of financial ratios. Nevertheless, there are a variety of financial ratios that can be considered and if all ratios in the set are placed into the evaluation process, data collection would be more difficult and time consuming. Thus, the objective of this paper is to identify the most important financial ratios upon which to focus in order to evaluate the stock's performance. For this purpose, a survey was carried out using an approach which is based on an expert judgement, namely the Fuzzy Delphi Method (FDM). The results of this study indicated that return on equity, return on assets, net profit margin, operating profit margin, earnings per share and debt to equity are the most important ratios.

  7. A mass spectrometry-guided genome mining approach for natural product peptidogenomics.

    PubMed

    Kersten, Roland D; Yang, Yu-Liang; Xu, Yuquan; Cimermancic, Peter; Nam, Sang-Jip; Fenical, William; Fischbach, Michael A; Moore, Bradley S; Dorrestein, Pieter C

    2011-10-09

    Peptide natural products show broad biological properties and are commonly produced by orthogonal ribosomal and nonribosomal pathways in prokaryotes and eukaryotes. To harvest this large and diverse resource of bioactive molecules, we introduce here natural product peptidogenomics (NPP), a new MS-guided genome-mining method that connects the chemotypes of peptide natural products to their biosynthetic gene clusters by iteratively matching de novo tandem MS (MS(n)) structures to genomics-based structures following biosynthetic logic. In this study, we show that NPP enabled the rapid characterization of over ten chemically diverse ribosomal and nonribosomal peptide natural products of previously unidentified composition from Streptomycete bacteria as a proof of concept to begin automating the genome-mining process. We show the identification of lantipeptides, lasso peptides, linardins, formylated peptides and lipopeptides, many of which are from well-characterized model Streptomycetes, highlighting the power of NPP in the discovery of new peptide natural products from even intensely studied organisms.

  8. A reclamation approach for mined prime farmland by adding organic wastes and lime to the subsoil

    SciTech Connect

    Zhai, Qiang; Barnhisel, R.I.

    1996-12-31

    Surface mined prime farmland may be reclaimed by adding organic wastes and lime to the subsoil, thus improving conditions in the root zone. In this study, sewage sludge, poultry manure, horse bedding, and lime were applied to subsoil (15-30 cm) during reclamation. Soil properties and plant growth were measured over two years. All organic amendments tended to lower the subsoil bulk density and increase organic matter and total nitrogen. Liming raised exchangeable calcium, slightly increased pH, but decreased exchangeable magnesium and potassium. Corn ear-leaf and forage tissue nitrogen, yields, and nitrogen removal increased in treatments amended with sewage sludge and poultry manure, but not horse bedding. Subsoil application of sewage sludge or poultry manure seems like a promising method in the reclamation of surface mined prime farmland based on the improvements observed in the root zone environment.

  9. Perspective: NanoMine: A material genome approach for polymer nanocomposites analysis and design

    NASA Astrophysics Data System (ADS)

    Zhao, He; Li, Xiaolin; Zhang, Yichi; Schadler, Linda S.; Chen, Wei; Brinson, L. Catherine

    2016-05-01

    Polymer nanocomposites are a designer class of materials where nanoscale particles, functional chemistry, and polymer resin combine to provide materials with unprecedented combinations of physical properties. In this paper, we introduce NanoMine, a data-driven web-based platform for analysis and design of polymer nanocomposite systems under the material genome concept. This open data resource strives to curate experimental and computational data on nanocomposite processing, structure, and properties, as well as to provide analysis and modeling tools that leverage curated data for material property prediction and design. With a continuously expanding dataset and toolkit, NanoMine encourages community feedback and input to construct a sustainable infrastructure that benefits nanocomposite material research and development.

  10. Objective Definition of Rosette Shape Variation Using a Combined Computer Vision and Data Mining Approach

    PubMed Central

    Camargo, Anyela; Papadopoulou, Dimitra; Spyropoulou, Zoi; Vlachonasios, Konstantinos; Doonan, John H.; Gay, Alan P.

    2014-01-01

    Computer-vision based measurements of phenotypic variation have implications for crop improvement and food security because they are intrinsically objective. It should be possible therefore to use such approaches to select robust genotypes. However, plants are morphologically complex and identification of meaningful traits from automatically acquired image data is not straightforward. Bespoke algorithms can be designed to capture and/or quantitate specific features but this approach is inflexible and is not generally applicable to a wide range of traits. In this paper, we have used industry-standard computer vision techniques to extract a wide range of features from images of genetically diverse Arabidopsis rosettes growing under non-stimulated conditions, and then used statistical analysis to identify those features that provide good discrimination between ecotypes. This analysis indicates that almost all the observed shape variation can be described by 5 principal components. We describe an easily implemented pipeline including image segmentation, feature extraction and statistical analysis. This pipeline provides a cost-effective and inherently scalable method to parameterise and analyse variation in rosette shape. The acquisition of images does not require any specialised equipment and the computer routines for image processing and data analysis have been implemented using open source software. Source code for data analysis is written using the R package. The equations to calculate image descriptors have been also provided. PMID:24804972

  11. Data mining approach to the evaluation of diagnostic tests in Wilson disease

    NASA Astrophysics Data System (ADS)

    Plutecki, Michal M.; Dądalski, Maciej; Socha, Piotr; Mulawka, Jan J.

    2009-06-01

    The purpose of this paper is to develop a new method, better than those known so far, for evaluating diagnostic tests in Wilson disease. To find the most interesting classification models, various data mining techniques were applied to a real set of patients suffering from Wilson disease. It turned out that a combination of two classification algorithms, as implemented in the Weka environment, may significantly increase classification ability.

  12. Comparison of approaches to classifier fusion for improving mine detection/classification performance

    NASA Astrophysics Data System (ADS)

    Bello, Martin G.

    2002-08-01

    We describe here the current form of Alphatech's image processing and neural network based algorithms for detection and classification of mines in side-scan sonar imagery, and results obtained from their application. In particular, drawing on the Machine Learning literature, we contrast here results obtained from employing the bagging and boosting methods for classifier fusion, in the attempt to obtain more desirable performance characteristics than that achieved with single classifiers.

  13. A New Approach to Identifying the Drivers of Regulation Compliance Using Multivariate Behavioural Models

    PubMed Central

    Thomas, Alyssa S.; Milfont, Taciano L.; Gavin, Michael C.

    2016-01-01

    Non-compliance with fishing regulations can undermine management effectiveness. Previous bivariate approaches were unable to untangle the complex mix of factors that may influence fishers’ compliance decisions, including enforcement, moral norms, perceived legitimacy of regulations and the behaviour of others. We compared seven multivariate behavioural models of fisher compliance decisions using structural equation modeling. An online survey of over 300 recreational fishers tested the ability of each model to best predict their compliance with two fishing regulations (daily and size limits). The best fitting model for both regulations was composed solely of psycho-social factors, with social norms having the greatest influence on fishers’ compliance behaviour. Fishers’ attitude also directly affected compliance with size limit, but to a lesser extent. On the basis of these findings, we suggest behavioural interventions to target social norms instead of increasing enforcement for the focal regulations in the recreational blue cod fishery in the Marlborough Sounds, New Zealand. These interventions could include articles in local newspapers and fishing magazines highlighting the extent of regulation compliance as well as using respected local fishers to emphasize the benefits of compliance through public meetings or letters to the editor. Our methodological approach can be broadly applied by natural resource managers as an effective tool to identify drivers of compliance that can then guide the design of interventions to decrease illegal resource use. PMID:27727292

  14. Membrane Glycoproteins Associated with Breast Tumor Cell Progression Identified by a Lectin Affinity Approach

    PubMed Central

    Wang, Yanfei; Ao, Xiaoping; Vuong, Huy; Konanur, Meghana; Miller, Fred R.; Goodison, Steve; Lubman, David M.

    2008-01-01

    The membrane glycoprotein component of the cellular proteome represents a promising source for potential disease biomarkers and therapeutic targets. Here we describe the development of a method that facilitates the analysis of membrane glycoproteins and apply it to the differential analysis of breast tumor cells with distinct malignant phenotypes. The approach combines two membrane extraction procedures, and enrichment using ConA and WGA lectin affinity columns, prior to digestion and analysis by LC–MS/MS. The glycoproteins are identified and quantified by spectral counting. Although the distribution of glycoprotein expression as a function of MW and pI was very similar between the two related cell lines tested, the approach enabled the identification of several distinct membrane glycoproteins with an expression index correlated with either a precancerous (MCF10AT1), or a malignant, metastatic cellular phenotype (MCF10CA1a). Among the proteins associated with the malignant phenotype, Gamma-glutamyl hydrolase, CD44, Galectin-3-binding protein, and Syndecan-1 protein have been reported as potential biomarkers of breast cancer. PMID:18729497

  15. Identifying Potential Areas for Siting Interim Nuclear Waste Facilities Using Map Algebra and Optimization Approaches

    SciTech Connect

    Omitaomu, Olufemi A; Liu, Cheng; Cetiner, Sacit M; Belles, Randy; Mays, Gary T; Tuttle, Mark A

    2013-01-01

    The renewed interest in siting new nuclear power plants in the United States has brought to center stage the need to site interim facilities for long-term management of spent nuclear fuel (SNF). In this paper, a two-stage approach for identifying potential areas for siting interim SNF facilities is presented. In the first stage, the land area is discretized into grids of uniform size (e.g., 100 m x 100 m grids). For the continental United States, this process resulted in a data matrix of about 700 million cells. Each cell of the matrix is then characterized as a binary decision variable to indicate whether an exclusion criterion is satisfied or not. A binary data matrix is created for each of the 25 siting criteria considered in this study. Using a map algebra approach, cells that satisfy all criteria are clustered and regarded as potential siting areas. In the second stage, an optimization problem is formulated as a p-median problem on a rail network such that the sum of the shortest distances between nuclear power plants with SNF and the potential storage sites from the first stage is minimized. The implications of the obtained results for energy policies are presented and discussed.
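
    The two stages described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny grids, the distance matrix, and the function names `suitable_cells` and `p_median` are hypothetical, the layers are assumed to be coded so that 1 means a cell passes a criterion, and an exhaustive search stands in for whatever solver the study used on the rail network.

```python
from itertools import combinations

def suitable_cells(layers):
    """Map algebra step: cell-wise AND over binary criterion layers.

    Assumption: 1 means the cell passes that criterion, 0 means it is excluded.
    """
    rows, cols = len(layers[0]), len(layers[0][0])
    return [[int(all(layer[r][c] for layer in layers)) for c in range(cols)]
            for r in range(rows)]

def p_median(dist, p):
    """Brute-force p-median: choose p sites minimizing the summed distance
    from every plant to its nearest chosen site.

    dist[i][j] is the (hypothetical) shortest distance from plant i to site j.
    """
    n_sites = len(dist[0])
    best_sites, best_cost = None, float("inf")
    for sites in combinations(range(n_sites), p):
        cost = sum(min(row[j] for j in sites) for row in dist)
        if cost < best_cost:
            best_sites, best_cost = sites, cost
    return best_sites, best_cost

# Two hypothetical criterion layers on a 2 x 3 grid.
layers = [
    [[1, 1, 0], [1, 0, 1]],
    [[1, 0, 0], [1, 1, 1]],
]
mask = suitable_cells(layers)

# Three hypothetical plants and three candidate sites.
dist = [[2, 9, 7], [8, 3, 6], [7, 5, 1]]
sites, cost = p_median(dist, 2)
print(mask, sites, cost)
```

    Exhaustive search is only workable for toy inputs like these; at the scale the abstract mentions (about 700 million cells), a dedicated optimization method would be needed.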

  16. Multivariate statistical and GIS-based approach to identify heavy metal sources in soils.

    PubMed

    Facchinelli, A; Sacchi, E; Mallen, L

    2001-01-01

    The knowledge of the regional variability, the background values and the anthropic vs. natural origin for potentially harmful elements in soils is of critical importance to assess human impact and to fix guide values and quality standards. The present study was undertaken as a preliminary survey on soil contamination on a regional scale in Piemonte (NW Italy). The aims of the study were: (1) to determine average regional concentrations of some heavy metals (Cr, Co, Ni, Cu, Zn, Pb); (2) to find out their large-scale variability; (3) to define their natural or artificial origin; and (4) to identify possible non-point sources of contamination. Multivariate statistic approaches (Principal Component Analysis and Cluster Analysis) were adopted for data treatment, allowing the identification of three main factors controlling the heavy metal variability in cultivated soils. Geostatistics were used to construct regional distribution maps, to be compared with the geographical, geologic and land use regional database using GIS software. This approach, evidencing spatial relationships, proved very useful to the confirmation and refinement of geochemical interpretations of the statistical output. Cr, Co and Ni were associated with and controlled by parent rocks, whereas Cu together with Zn, and Pb alone were controlled by anthropic activities. The study indicates that background values and realistic mandatory guidelines are impossible to fix without an extensive data collection and without a correct geochemical interpretation of the data. PMID:11584630

  17. Chemical proteomics approaches for identifying the cellular targets of natural products.

    PubMed

    Wright, M H; Sieber, S A

    2016-05-01

    Covering: 2010 up to 2016 Deconvoluting the mode of action of natural products and drugs remains one of the biggest challenges in chemistry and biology today. Chemical proteomics is a growing area of chemical biology that seeks to design small molecule probes to understand protein function. In the context of natural products, chemical proteomics can be used to identify the protein binding partners or targets of small molecules in live cells. Here, we highlight recent examples of chemical probes based on natural products and their application for target identification. The review focuses on probes that can be covalently linked to their target proteins (either via intrinsic chemical reactivity or via the introduction of photocrosslinkers), and can be applied "in situ" - in living systems rather than cell lysates. We also focus here on strategies that employ a click reaction, the copper-catalysed azide-alkyne cycloaddition reaction (CuAAC), to allow minimal functionalisation of natural product scaffolds with an alkyne or azide tag. We also discuss 'competitive mode' approaches that screen for natural products that compete with a well-characterised chemical probe for binding to a particular set of protein targets. Fuelled by advances in mass spectrometry instrumentation and bioinformatics, many modern strategies are now embracing quantitative proteomics to help define the true interacting partners of probes, and we highlight the opportunities this rapidly evolving technology provides in chemical proteomics. Finally, some of the limitations and challenges of chemical proteomics approaches are discussed.

  18. A non-target approach to identify disinfection byproducts of structurally similar sulfonamide antibiotics.

    PubMed

    Wang, Mian; Helbling, Damian E

    2016-10-01

    There is growing concern over the formation of new types of disinfection byproducts (DBPs) from pharmaceuticals and other emerging contaminants during drinking water production. Free chlorine is a widely used disinfectant that reacts non-selectively with organic molecules to form a variety of byproducts. In this research, we aimed to investigate the DBPs formed from three structurally similar sulfonamide antibiotics (sulfamethoxazole, sulfathiazole, and sulfadimethoxine) to determine how chemical structure influences the types of chlorination reactions observed. We conducted free chlorination experiments and developed a non-target approach to extract masses from the experimental dataset that represent the masses of candidate DBPs. Structures were assigned to the candidate DBPs based on analytical data and knowledge of chlorine chemistry. Confidence levels were assigned to each proposed structure according to conventions in the field. In total, 11, 12, and 15 DBP structures were proposed for sulfamethoxazole, sulfathiazole, and sulfadimethoxine, respectively. The structures of the products suggest a variety of reaction types including chlorine substitution, S-C cleavage, S-N hydrolysis, desulfonation, oxidation/hydroxylation, and conjugation reactions. Some reaction types were common to all of the sulfonamide antibiotics, but unique reaction types were also observed for each sulfonamide antibiotic suggesting that selective prediction of DBP structures of other sulfonamide antibiotics based on chemical structure is unlikely to be possible based on these data alone. This research offers an approach to comprehensively identify DBPs of organic molecules and fills in much needed data on the formation of specific DBPs from three environmentally relevant sulfonamide antibiotics.

  19. Chemical proteomics approaches for identifying the cellular targets of natural products

    PubMed Central

    Sieber, S. A.

    2016-01-01

    Covering: 2010 up to 2016 Deconvoluting the mode of action of natural products and drugs remains one of the biggest challenges in chemistry and biology today. Chemical proteomics is a growing area of chemical biology that seeks to design small molecule probes to understand protein function. In the context of natural products, chemical proteomics can be used to identify the protein binding partners or targets of small molecules in live cells. Here, we highlight recent examples of chemical probes based on natural products and their application for target identification. The review focuses on probes that can be covalently linked to their target proteins (either via intrinsic chemical reactivity or via the introduction of photocrosslinkers), and can be applied “in situ” – in living systems rather than cell lysates. We also focus here on strategies that employ a click reaction, the copper-catalysed azide–alkyne cycloaddition reaction (CuAAC), to allow minimal functionalisation of natural product scaffolds with an alkyne or azide tag. We also discuss ‘competitive mode’ approaches that screen for natural products that compete with a well-characterised chemical probe for binding to a particular set of protein targets. Fuelled by advances in mass spectrometry instrumentation and bioinformatics, many modern strategies are now embracing quantitative proteomics to help define the true interacting partners of probes, and we highlight the opportunities this rapidly evolving technology provides in chemical proteomics. Finally, some of the limitations and challenges of chemical proteomics approaches are discussed. PMID:27098809

  20. RaSH, a rapid subtraction hybridization approach for identifying and cloning differentially expressed genes

    PubMed Central

    Jiang, Hongping; Kang, Dong-chul; Alexandre, Deborah; Fisher, Paul B.

    2000-01-01

    Human melanoma cells growth-arrest irreversibly and terminally differentiate on treatment with a combination of fibroblast interferon and the protein kinase C activator mezerein. This experimental protocol also results in a loss of tumorigenic potential and profound changes in gene expression. Various cloning and cDNA microarray strategies are being used to determine the complete spectrum of gene expression changes underlying these alterations in human melanoma cells. An efficient approach, Rapid Subtraction Hybridization (RaSH), has been developed that is permitting the identification of genes of potential relevance to cancer growth control and terminal cell differentiation. RaSH cDNA libraries are prepared from double-stranded cDNAs that are enzymatically digested into small fragments, ligated to adapters, and PCR amplified followed by incubation of tester and driver PCR fragments. This subtraction hybridization scheme is technically simple and results in the identification of a high proportion of differentially expressed sequences, including known genes and those not described in current DNA databases. The RaSH approach represents an efficient methodology for identifying and cloning genes displaying differential expression that associate with and potentially regulate complex biological processes. PMID:11058161

  1. Modelling Creativity: Identifying Key Components through a Corpus-Based Approach

    PubMed Central

    2016-01-01

    Creativity is a complex, multi-faceted concept encompassing a variety of related aspects, abilities, properties and behaviours. If we wish to study creativity scientifically, then a tractable and well-articulated model of creativity is required. Such a model would be of great value to researchers investigating the nature of creativity and in particular, those concerned with the evaluation of creative practice. This paper describes a unique approach to developing a suitable model of how creative behaviour emerges that is based on the words people use to describe the concept. Using techniques from the field of statistical natural language processing, we identify a collection of fourteen key components of creativity through an analysis of a corpus of academic papers on the topic. Words are identified which appear significantly often in connection with discussions of the concept. Using a measure of lexical similarity to help cluster these words, a number of distinct themes emerge, which collectively contribute to a comprehensive and multi-perspective model of creativity. The components provide an ontology of creativity: a set of building blocks which can be used to model creative practice in a variety of domains. The components have been employed in two case studies to evaluate the creativity of computational systems and have proven useful in articulating achievements of this work and directions for further research. PMID:27706185

  2. Calibrated photostimulated luminescence is an effective approach to identify irradiated orange during storage

    NASA Astrophysics Data System (ADS)

    Jo, Yunhee; Sanyal, Bhaskar; Chung, Namhyeok; Lee, Hyun-Gyu; Park, Yunji; Park, Hae-Jun; Kwon, Joong-Ho

    2015-06-01

    Photostimulated luminescence (PSL) has been employed as a fast screening method for various irradiated foods. In this study the potential use of PSL was evaluated to identify oranges irradiated with gamma ray, electron beam and X-ray (0-2 kGy) and stored under different conditions for 6 weeks. The effects of light conditions (natural light, artificial light, and dark) and storage temperatures (4 and 20 °C) on PSL photon counts (PCs) during post-irradiation periods were studied. Non-irradiated samples always showed negative values of PCs, while irradiated oranges exhibited intermediate results after first PSL measurements. However, the irradiated samples had much higher PCs. The PCs of all the samples declined as the storage time increased. Calibrated second PSL measurements showed PSL ratio <10 for the irradiated samples after 3 weeks of irradiation confirming their irradiation status in all the storage conditions. Calibrated PSL and sample storage in the dark at 4 °C were found to be the most suitable approaches to identify irradiated oranges during storage.

  3. A Proteomics and Transcriptomics Approach to Identify Leukemic Stem Cell (LSC) Markers

    PubMed Central

    Bonardi, Francesco; Fusetti, Fabrizia; Deelen, Patrick; van Gosliga, Djoke; Vellenga, Edo; Schuringa, Jan Jacob

    2013-01-01

    Interactions between hematopoietic stem cells and their niche are mediated by proteins within the plasma membrane (PM) and changes in these interactions might alter hematopoietic stem cell fate and ultimately result in acute myeloid leukemia (AML). Here, using nano-LC/MS/MS, we set out to analyze the PM profile of two leukemia patient samples. We identified 867 and 610 unique CD34+ PM (-associated) proteins in these AML samples respectively, including previously described proteins such as CD47, CD44, CD135, CD96, and ITGA5, but also novel ones like CD82, CD97, CD99, PTH2R, ESAM, MET, and ITGA6. Further validation by flow cytometry and functional studies indicated that long-term self-renewing leukemic stem cells reside within the CD34+/ITGA6+ fraction, at least in a subset of AML cases. Furthermore, we combined proteomics with transcriptomics approaches using a large panel of AML CD34+ (n = 60) and normal bone marrow CD34+ (n = 40) samples. Thus, we identified eight subgroups of AML patients based on their specific PM expression profile. GSEA analysis revealed that these eight subgroups are enriched for specific cellular processes. PMID:23233446

  4. A neural network approach for identifying particle pitch angle distributions in Van Allen Probes data

    NASA Astrophysics Data System (ADS)

    Souza, V. M.; Vieira, L. E. A.; Medeiros, C.; Da Silva, L. A.; Alves, L. R.; Koga, D.; Sibeck, D. G.; Walsh, B. M.; Kanekal, S. G.; Jauer, P. R.; Rockenbach, M.; Dal Lago, A.; Silveira, M. V. D.; Marchezi, J. P.; Mendes, O.; Gonzalez, W. D.; Baker, D. N.

    2016-04-01

    Analysis of particle pitch angle distributions (PADs) has been used as a means to comprehend a multitude of different physical mechanisms that lead to flux variations in the Van Allen belts and also to particle precipitation into the upper atmosphere. In this work we developed a neural network-based data clustering methodology that automatically identifies distinct PAD types in an unsupervised way using particle flux data. One can promptly identify and locate three well-known PAD types in both time and radial distance, namely, 90° peaked, butterfly, and flattop distributions. In order to illustrate the applicability of our methodology, we used relativistic electron flux data from the whole month of November 2014, acquired from the Relativistic Electron-Proton Telescope instrument on board the Van Allen Probes, but it is emphasized that our approach can also be used with multiplatform spacecraft data. Our PAD classification results are in reasonably good agreement with those obtained by standard statistical fitting algorithms. The proposed methodology has a potential use for Van Allen belt's monitoring.

  5. Multi-compartment approach to identify minimal flow and maximal recreational use of a lowland river

    NASA Astrophysics Data System (ADS)

    Pusch, Martin; Lorenz, Stefan

    2013-04-01

    Most approaches to establish a minimum flow rate for river sections subjected to water abstraction focus on flow requirements of fish and benthic invertebrates. However, artificial reduction of river flow will also always affect additional key ecosystem features, such as sediment properties and the metabolism of matter in these ecosystems, and may even influence adjacent floodplains. Thus, significant effects, e.g., on the dissolved oxygen content of river water, on habitat conditions in the benthic zone, and on water levels in the floodplain are to be expected. Thus, we chose a multiple compartment method to identify minimum flow requirements in a lowland river in northern Germany (Spree River), selecting the minimal required flow level out of all compartments studied. Results showed that minimal flow levels necessary to keep key ecosystem features at a 'good' state depended significantly on actual water quality and on river channel morphology. Thereby, water quality of the Spree is potentially influenced by recreational boating activity, which causes mussels to stop filter-feeding, and thus impedes self-purification. Disturbance of mussel feeding was shown to directly depend on boat type and speed, with substantial differences among mussel species. Thus, a maximal recreational boating intensity could be derived that does not significantly affect self-purification. We conclude that minimal flow levels should be identified not only based on flow preferences of target species, but also considering channel morphology, ecological functions, and the intensity of other human uses of the river section.

  6. A genome-wide approach identifies that the aspartate metabolism pathway contributes to asparaginase sensitivity

    PubMed Central

    Chen, Shih-Hsiang; Yang, Wenjian; Fan, Yiping; Stocco, Gabriele; Crews, Kristine R.; Yang, Jun J.; Paugh, Steven W.; Pui, Ching-Hon; Evans, William E.; Relling, Mary V.

    2011-01-01

    Asparaginase is an important component of treatment for childhood acute lymphoblastic leukemia (ALL). The basis for interindividual differences in asparaginase sensitivity remains unclear. To comprehensively identify genetic variants important in the cytotoxicity of asparaginase, we employed a genome-wide association approach using the HapMap lymphoblastoid cell lines (87 CEU trio members) and 54 primary ALL leukemic blast samples at diagnosis. Asparaginase sensitivity was assessed as the drug concentration necessary to inhibit 50% of growth (IC50). In CEU lines, we tested 2,390,203 SNP genotypes at the individual SNP (p < 0.001) and the gene level (p < 0.05) and identified 329 SNPs representing 94 genes that were associated with asparaginase IC50. The aspartate metabolism pathway was the most over-represented among 199 pathways evaluated (p = 8.1 × 10^−3), with primary involvement of ADSL and DARS genes. We validated that SNPs in the aspartate metabolism pathway were also associated with asparaginase sensitivity in primary ALL leukemic blast samples (p = 5.5 × 10^−5). Our genome-wide interrogation of CEU cell lines and primary ALL blasts revealed that inherited genomic interindividual variation in a plausible candidate pathway can contribute to asparaginase sensitivity. PMID:21072045
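
    The IC50 definition used in this abstract (the drug concentration that inhibits 50% of growth) can be made concrete with a small Python sketch. The abstract does not say how the dose-response curves were fitted, so the log-linear interpolation below, the function name `ic50`, and the example concentrations and growth values are all assumptions made for illustration only.

```python
import math

def ic50(concentrations, growth_percent):
    """Estimate the concentration at which growth falls to 50% of control.

    Assumption: concentrations are ascending and positive, and the estimate
    is a straight-line interpolation on a log10 concentration axis between
    the two measured doses that bracket 50% growth.
    """
    points = list(zip(concentrations, growth_percent))
    for (c_lo, g_lo), (c_hi, g_hi) in zip(points, points[1:]):
        if g_lo >= 50.0 >= g_hi and g_lo != g_hi:
            frac = (g_lo - 50.0) / (g_lo - g_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None  # 50% growth is never crossed in the measured range

# Hypothetical dose-response data (arbitrary concentration units).
doses = [0.01, 0.1, 1.0, 10.0]
growth = [95.0, 80.0, 20.0, 5.0]
print(round(ic50(doses, growth), 4))  # about 0.3162
```

    Returning `None` when the curve never crosses 50% avoids extrapolating beyond the tested doses.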

  7. Prevalence of Heart Failure Signs and Symptoms in a Large Primary Care Population Identified Through the Use of Text and Data Mining of the Electronic Health Record

    PubMed Central

    Vijayakrishnan, Rajakrishnan; Steinhubl, Steven R.; Ng, Kenney; Sun, Jimeng; Byrd, Roy J.; Daar, Zahra; Williams, Brent A.; deFilippi, Christopher; Ebadollahi, Shahram; Stewart, Walter F.

    2014-01-01

    Background: The electronic health record contains a tremendous amount of data that if appropriately detected can lead to earlier identification of disease states such as heart failure (HF). Using a novel text and data analytic tool we explored the longitudinal EHR of over 50,000 primary care patients to identify the documentation of the signs and symptoms of HF in the years preceding its diagnosis. Methods and Results: Retrospective analysis consisting of 4,644 incident HF cases and 45,981 group-matched controls. Documentation of Framingham HF signs and symptoms within encounter notes was identified using a previously validated natural language processing procedure. A total of 892,805 affirmed criteria were documented over an average observation period of 3.4 years. Among eventual HF cases, 85% had at least one criterion within a year prior to their HF diagnosis (as did 55% of controls). Substantial variability in the prevalence of individual signs and symptoms was found in both cases and controls. Conclusions: HF signs and symptoms are frequently documented in a primary care population as identified through automated text and data mining of EHRs. Their frequent identification demonstrates the rich data available within EHRs that will allow for future work on automated criterion identification to help develop predictive models for HF. PMID:24709663
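
    The case and control figures in this abstract lend themselves to a short worked calculation in Python. It is an illustrative reading rather than a result reported by the authors: it assumes the rounded 85% and 55% rates apply to the full groups of 4,644 cases and 45,981 controls, and the function name `flagged_case_share` is hypothetical.

```python
def flagged_case_share(n_cases, n_controls, case_rate, control_rate):
    """Share of patients with at least one documented criterion who are cases."""
    flagged_cases = n_cases * case_rate
    flagged_controls = n_controls * control_rate
    return flagged_cases / (flagged_cases + flagged_controls)

# Counts and rounded percentages as quoted in the abstract.
share = flagged_case_share(4644, 45981, 0.85, 0.55)
print(f"{share:.1%}")  # 13.5%
```

    Under these assumptions, only about one in seven patients with a documented criterion in this matched sample went on to an HF diagnosis, which is consistent with the abstract's point that further work on criterion identification is needed before predictive models can be built.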

  8. An Integrated Multiomics Approach to Identify Candidate Antigens for Serodiagnosis of Human Onchocerciasis.

    PubMed

    McNulty, Samantha N; Rosa, Bruce A; Fischer, Peter U; Rumsey, Jeanne M; Erdmann-Gilmore, Petra; Curtis, Kurt C; Specht, Sabine; Townsend, R Reid; Weil, Gary J; Mitreva, Makedonka

    2015-12-01

    Improved diagnostic methods are needed to support ongoing efforts to eliminate onchocerciasis (river blindness). This study used an integrated approach to identify adult female Onchocerca volvulus antigens that can be explored for developing serodiagnostic tests. The first step was to develop a detailed multi-omics database of all O. volvulus proteins deduced from the genome, gene transcription data for different stages of the parasite including eight individual female worms (providing gene expression information for 94.8% of all protein coding genes), and the adult female worm proteome (detecting 2126 proteins). Next, female worm proteins were purified with IgG antibodies from onchocerciasis patients and identified using LC-MS with a high-resolution hybrid quadrupole-time-of-flight mass spectrometer. A total of 241 immunoreactive proteins were identified among those bound by IgG from infected individuals but not IgG from uninfected controls. These included most of the major diagnostic antigens described over the past 25 years plus many new candidates. Proteins of interest were prioritized for further study based on a lack of conservation with orthologs in the human host and other helminthes, their expression pattern across the life cycle, and their consistent expression among individual female worms. Based on these criteria, we selected 33 proteins that should be carried forward for testing as serodiagnostic antigens to supplement existing diagnostic tools. These candidates, together with the extensive pan-omics dataset generated in this study are available to the community (http://nematode.net) to facilitate basic and translational research on onchocerciasis. PMID:26472727

  9. Age-related changes in mesenchymal stem cells identified using a multi-omics approach.

    PubMed

    Peffers, M J; Collins, J; Fang, Y; Goljanek-Whysall, K; Rushton, M; Loughlin, J; Proctor, C; Clegg, P D

    2016-01-01

    Mesenchymal stem cells (MSC) are capable of multipotent differentiation into connective tissues and as such are an attractive source for autologous cell-based treatments for many clinical diseases and injuries. Ageing is associated with various altered cellular phenotypes coupled with a variety of transcriptional, epigenetic and translational changes. Furthermore, the regeneration potential of MSCs is reduced with increasing age and is correlated with changes in cellular functions. This study used a systems biology approach to investigate the transcriptomic (RNASeq), epigenetic (miRNASeq and DNA methylation) and protein alterations in ageing MSCs in order to understand the age-related functional and biological variations, which may affect their applications to regenerative medicine. We identified no change in expression of the cellular senescence markers. Alterations were evident at both the transcriptional and post-transcriptional level in a number of transcription factors. There was enrichment in genes involved in developmental disorders at mRNA and differential methylated loci (DML) level. Alterations in energy metabolism were apparent at the DML and protein level. The microRNA miR-199b-5p, whose expression was reduced in old MSCs, had predicted gene targets involved in energy metabolism and cell survival. Additionally, enrichment of DML and proteins in cell survival was evident. Enrichment in metabolic processes was revealed at the protein level and in genes identified as undergoing alternate splicing. Overall, an altered phenotype in MSC ageing at a number of levels implicated roles for inflamm-ageing and mitochondrial ageing. Identified changes represent novel insights into the ageing process, with implications for stem cell therapies in older patients. PMID:26853623

  10. A novel approach identifying hybrid sterility QTL on the autosomes of Drosophila simulans and D. mauritiana.

    PubMed

    Dickman, Christopher T D; Moehring, Amanda J

    2013-01-01

    When species interbreed, the hybrid offspring that are produced are often sterile. If only one hybrid sex is sterile, it is almost always the heterogametic (XY or ZW) sex. Taking this trend into account, the predominant model used to explain the genetic basis of F1 sterility involves a deleterious interaction between recessive sex-linked loci from one species and dominant autosomal loci from the other species. This model is difficult to evaluate, however, as only a handful of loci influencing interspecies hybrid sterility have been identified, and their autosomal genetic interactors have remained elusive. One hindrance to their identification has been the overwhelming effect of the sex chromosome in mapping studies, which could 'mask' the ability to accurately map autosomal factors. Here, we use a novel approach employing attached-X chromosomes to create reciprocal backcross interspecies hybrid males that have a non-recombinant sex chromosome and recombinant autosomes. The heritable variation in phenotype is thus solely caused by differences in the autosomes, thereby allowing us to accurately identify the number and location of autosomal sterility loci. In one direction of backcross, all males were sterile, indicating that sterility could be entirely induced by the sex chromosome complement in these males. In the other direction, we identified nine quantitative trait loci that account for a surprisingly large amount (56%) of the autosome-induced phenotypic variance in sterility, with a large contribution of autosome-autosome epistatic interactions. These loci are capable of acting dominantly, and thus could contribute to F1 hybrid sterility.

  11. An algorithmic calibration approach to identify globally optimal parameters for constraining the DayCent model

    SciTech Connect

    Rafique, Rashid; Kumar, Sandeep; Luo, Yiqi; Kiely, Gerard; Asrar, Ghassem R.

    2015-02-01

    The accurate calibration of complex biogeochemical models is essential for the robust estimation of soil greenhouse gases (GHG) as well as other environmental conditions and parameters that are used in research and policy decisions. DayCent is a popular biogeochemical model used both nationally and internationally for this purpose. Despite DayCent’s popularity, its complex parameter estimation is often based on experts’ knowledge, which is somewhat subjective. In this study we used the inverse modelling parameter estimation software (PEST) to calibrate the DayCent model based on sensitivity and identifiability analysis. Using previously published N2O and crop yield data as a basis of our calibration approach, we found that half of the 140 parameters used in this study were the primary drivers of calibration differences (i.e. the most sensitive) and the remaining parameters could not be identified given the data set and parameter ranges we used in this study. The post-calibration results showed improvement over the pre-calibration parameter set, based on a decrease in residual differences of 79% for N2O fluxes and 84% for crop yield, and an increase in the coefficient of determination of 63% for N2O fluxes and 72% for corn yield. The results of our study suggest that future studies need to better characterize germination temperature, number of degree-days and temperature dependency of plant growth; these processes were highly sensitive and could not be adequately constrained by the data used in our study. Furthermore, the sensitivity and identifiability analysis was helpful in providing deeper insight for important processes and associated parameters that can lead to further improvement in calibration of the DayCent model.

  12. SU-E-J-212: Identifying Bones From MRI: A Dictionary Learning and Sparse Regression Approach

    SciTech Connect

    Ruan, D; Yang, Y; Cao, M; Hu, P; Low, D

    2014-06-01

    Purpose: To develop an efficient and robust scheme to identify bony anatomy based on MRI-only simulation images. Methods: MRI offers important soft tissue contrast and functional information, yet its lack of correlation to electron-density has placed it as an auxiliary modality to CT in radiotherapy simulation and adaptation. An effective scheme to identify bony anatomy is an important first step towards an MR-only simulation/treatment paradigm and would satisfy most practical purposes. We utilize a UTE acquisition sequence to achieve visibility of the bone. By contrast to manual + bulk or registration-based approaches to identify bones, we propose a novel learning-based approach for improved robustness to MR artefacts and environmental changes. Specifically, local information is encoded with an MR image patch, and the corresponding label is extracted (during training) from simulation CT aligned to the UTE. Within each class (bone vs. nonbone), an overcomplete dictionary is learned so that typical patches within the proper class can be represented as a sparse combination of the dictionary entries. For testing, an acquired UTE-MRI is divided into patches using a sliding scheme, where each patch is sparsely regressed against both bone and nonbone dictionaries, and subsequently claimed to be associated with the class with the smaller residual. Results: The proposed method has been applied to the pilot site of brain imaging and it has shown generally good performance, with a Dice similarity coefficient greater than 0.9 in a cross-validation study using 4 datasets. Importantly, it is robust towards consistent foreign objects (e.g., headset) and the artefacts related to Gibbs and field heterogeneity. Conclusion: A learning perspective has been developed for inferring bone structures based on UTE MRI. The imaging setting is subject to minimal motion effects and the post-processing is efficient. The improved efficiency and robustness enable a first translation to MR-only routine. The scheme
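    The residual-based decision rule described in this record (regress a patch sparsely against a bone dictionary and a nonbone dictionary, then assign the class with the smaller residual) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the abstract does not name the sparse solver, so a plain greedy orthogonal matching pursuit is swapped in, the function names `omp` and `classify_patch` are hypothetical, and the two toy dictionaries are halves of an identity matrix rather than learned, overcomplete dictionaries.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit (assumed solver, not the authors').

    Approximates x with at most n_nonzero columns (atoms) of D and
    returns the coefficient vector and the final residual.
    """
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Least-squares refit on all atoms selected so far.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef, residual

def classify_patch(patch, D_bone, D_nonbone, n_nonzero=3):
    """Assign the class whose dictionary leaves the smaller residual."""
    _, r_bone = omp(D_bone, patch, n_nonzero)
    _, r_nonbone = omp(D_nonbone, patch, n_nonzero)
    if np.linalg.norm(r_bone) < np.linalg.norm(r_nonbone):
        return "bone"
    return "nonbone"

# Toy dictionaries (hypothetical): each class spans a different subspace.
I = np.eye(8)
D_bone, D_nonbone = I[:, :4], I[:, 4:]

bone_like = 2.0 * I[:, 0] + 0.5 * I[:, 1]   # lies in the bone subspace
nonbone_like = 1.5 * I[:, 5]                # lies in the nonbone subspace
print(classify_patch(bone_like, D_bone, D_nonbone))
print(classify_patch(nonbone_like, D_bone, D_nonbone))
```

    In a realistic setting the dictionaries would be learned from training patches labelled by the aligned CT, and the patches would come from a sliding window over the UTE image; neither step is shown here.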

  13. An Integrated Multiomics Approach to Identify Candidate Antigens for Serodiagnosis of Human Onchocerciasis*

    PubMed Central

    McNulty, Samantha N.; Rosa, Bruce A.; Fischer, Peter U.; Rumsey, Jeanne M.; Erdmann-Gilmore, Petra; Curtis, Kurt C.; Specht, Sabine; Townsend, R. Reid; Weil, Gary J.; Mitreva, Makedonka

    2015-01-01

    Improved diagnostic methods are needed to support ongoing efforts to eliminate onchocerciasis (river blindness). This study used an integrated approach to identify adult female Onchocerca volvulus antigens that can be explored for developing serodiagnostic tests. The first step was to develop a detailed multi-omics database of all O. volvulus proteins deduced from the genome, gene transcription data for different stages of the parasite including eight individual female worms (providing gene expression information for 94.8% of all protein coding genes), and the adult female worm proteome (detecting 2126 proteins). Next, female worm proteins were purified with IgG antibodies from onchocerciasis patients and identified using LC-MS with a high-resolution hybrid quadrupole-time-of-flight mass spectrometer. A total of 241 immunoreactive proteins were identified among those bound by IgG from infected individuals but not IgG from uninfected controls. These included most of the major diagnostic antigens described over the past 25 years plus many new candidates. Proteins of interest were prioritized for further study based on a lack of conservation with orthologs in the human host and other helminthes, their expression pattern across the life cycle, and their consistent expression among individual female worms. Based on these criteria, we selected 33 proteins that should be carried forward for testing as serodiagnostic antigens to supplement existing diagnostic tools. These candidates, together with the extensive pan-omics dataset generated in this study are available to the community (http://nematode.net) to facilitate basic and translational research on onchocerciasis. PMID:26472727

  14. Engineering Approach to Identifying Patients with Colon Tumors on the Basis of Electrophotonic Imaging Technique Data

    PubMed Central

    Yakovleva, E.G.; Korotkov, K.G.; Fedorov, E.D.; Ivanova, E.V.; Plahov, R.V.; Belonosov, S.S.

    2016-01-01

    Background: Colonic neoplasms are quite a serious problem today. Screening methods play an important role in diagnosing the disease. Colorectal cancer screening is a complex undertaking, having various options, which require a lot of effort both from the doctor and from the patient, including the use of sedatives and the necessity of the presence of an assistant for some procedures such as colonoscopy. This is why it is very important to find a method by which one can make a diagnosis quickly, easily, and painlessly. Methods: The ability to identify patients with tumors of the colon using the Electrophotonic Imaging (EPI) technique, as well as using it for differential diagnosis of tumors of the colon by their morphology, size and quantity was investigated. Selection of the most significant parameters of the EPI-graphy for the separation of the control group and the group of patients with tumors of the colon was developed. 137 people were studied with the EPI camera, with ages ranging from 16 to 86 years, including 49 males and 88 females. Based on the results of the colonoscopy and histological findings all subjects were divided into 2 groups: control group of 55 people, 9 males, 46 females; and patients with tumors (benign or malignant) of the colon - 82 people; 40 males and 42 females. Then all subjects were divided into smaller groups based on morphology, size, number of tumors and localization. Results: Based on the identified indicators, decision rules to identify the patients with tumors of the colon were constructed. The specificity of the resulting function was 80.0% and sensitivity 75.6%. A decision rule was also built with logistic regression. The specificity of the resulting function was 78.2% and sensitivity 90.0%. The accuracy of this approach was higher than using discriminant analysis. Conclusions: The results of this study have proven the ability to identify patients with tumors of the colon using EPI technology, as well as use it for
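    As a reading aid for the specificity and sensitivity figures in this record, the short sketch below shows how the two measures are computed from classification counts. The group sizes (55 controls, 82 patients) come from the abstract; the individual counts of correctly classified subjects are not reported there and are back-calculated here as an assumption, chosen only because they reproduce the first pair of reported percentages.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of patients with tumors that the rule flags correctly."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of tumor-free controls that the rule clears correctly."""
    return true_neg / (true_neg + false_pos)

# Group sizes taken from the abstract.
n_controls, n_patients = 55, 82

# Hypothetical counts (assumption, not reported in the abstract):
# back-calculated so that they match the first reported pair of rates.
true_neg = 44   # controls classified as tumor-free
true_pos = 62   # patients classified as having tumors

spec_pct = round(100 * specificity(true_neg, n_controls - true_neg), 1)
sens_pct = round(100 * sensitivity(true_pos, n_patients - true_pos), 1)
print(f"specificity = {spec_pct}%, sensitivity = {sens_pct}%")
```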

  15. Improving mine safety technology and training: establishing US global leadership

    SciTech Connect

    2006-12-15

    In 2006, the USA's record of mine safety was interrupted by fatalities that rocked the industry and caused the National Mining Association and its members to recommit to returning the US underground coal mining industry to a global mine safety leadership role. This report details a comprehensive approach to increase the odds of survival for miners in emergency situations and to create a culture of prevention of accidents. Among its 75 recommendations are a need to improve communications, mine rescue training, and escape and protection of miners. Section headings of the report are: Introduction; Review of mine emergency situations in the past 25 years: identifying and addressing the issues and complexities; Risk-based design and management; Communications technology; Escape and protection strategies; Emergency response and mine rescue procedures; Training for preparedness; Summary of recommendations; and Conclusions. 37 refs., 3 figs., 5 apps.

  16. 30 CFR 77.1200 - Mine map.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... SAFETY STANDARDS, SURFACE COAL MINES AND SURFACE WORK AREAS OF UNDERGROUND COAL MINES Maps § 77.1200 Mine...) The location of railroad tracks and public highways leading to the mine, and mine buildings of a permanent nature with identifying names shown; (k) Underground mine workings underlying and within...

  17. Identifying typical patterns of vulnerability: A 5-step approach based on cluster analysis

    NASA Astrophysics Data System (ADS)

    Sietz, Diana; Lüdeke, Matthias; Kok, Marcel; Lucas, Paul; Carsten, Walther; Janssen, Peter

    2013-04-01

    Specific processes that shape the vulnerability of socio-ecological systems to climate, market and other stresses derive from diverse background conditions. Within the multitude of vulnerability-creating mechanisms, distinct processes recur in various regions inspiring research on typical patterns of vulnerability. The vulnerability patterns display typical combinations of the natural and socio-economic properties that shape a system's vulnerability to particular stresses. Based on the identification of a limited number of vulnerability patterns, pattern analysis provides an efficient approach to improving our understanding of vulnerability and decision-making for vulnerability reduction. However, current pattern analyses often miss explicit descriptions of their methods and pay insufficient attention to the validity of their groupings. Therefore, the question arises: how do we identify typical vulnerability patterns in order to enhance our understanding of a system's vulnerability to stresses? A cluster-based pattern recognition applied at global and local levels is scrutinised with a focus on an applicable methodology and practicable insights. Taking the example of drylands, this presentation demonstrates the conditions necessary to identify typical vulnerability patterns. They are summarised in five methodological steps comprising the elicitation of relevant cause-effect hypotheses and the quantitative indication of mechanisms as well as an evaluation of robustness, a validation and a ranking of the identified patterns. Reflecting scale-dependent opportunities, a global study is able to support decision-making with insights into the up-scaling of interventions when available funds are limited. In contrast, local investigations encourage an outcome-based validation. This constitutes a crucial step in establishing the credibility of the patterns and hence their suitability for informing extension services and individual decisions. In this respect, working at

  18. Rehabilitation prioritization of abandoned mines and its application to Nyala Magnesite Mine

    NASA Astrophysics Data System (ADS)

    Mhlongo, Sphiwe Emmanuel; Amponsah-Dacosta, Francis; Mphephu, Nndweleni Fredrick

    2013-12-01

    The issue of abandoned mine sites is a major environmental and social problem for the mining industry, communities and governments. Historical mine sites are characterized by significant environmental, health and safety problems. The aim of this study was to develop hazard maps that can assist in the prioritization of rehabilitation at Nyala Mine. The approach used involved site examination and characterization to establish the environmental conditions of the mine. Hazards at the mine were identified, scored, and rated using modified Historic Mine Site Scoring System. The scoring focused on source and exposure pathways. The developed hazard maps showed that the best approach of effectively reducing the physical and environmental hazards at Nyala Mine was to give priority to extremely and moderately hazardous pits; surface infrastructure and spoil dumps, and then to tailings dumps characterized with less physical hazards but extremely high environmental hazards. Pits and spoil materials which were found to be relatively less problematic in terms of physical hazards were to receive least attention. The use of this hazard-scoring and risk-ranking methodology coupled with the hazard maps would provide a more robust scientific basis for making sound decisions and prioritize actions that need to be taken to minimize or manage risks associated with various areas of the mine site.

  19. Ask and Ye Shall Receive? Automated Text Mining of Michigan Capital Facility Finance Bond Election Proposals to Identify Which Topics Are Associated with Bond Passage and Voter Turnout

    ERIC Educational Resources Information Center

    Bowers, Alex J.; Chen, Jingjing

    2015-01-01

    The purpose of this study is to bring together recent innovations in the research literature around school district capital facility finance, municipal bond elections, statistical models of conditional time-varying outcomes, and data mining algorithms for automated text mining of election ballot proposals to examine the factors that influence the…

  20. Demonstrating a Market-Based Approach to the Reclamation of Mined Lands in West Virginia

    SciTech Connect

    Goodrich-Mahoney, John; Donnelly, Ellen

    2009-12-31

    This project demonstrated that developing environmental credits on private land, including abandoned mined lands, is dependent on a number of factors, some of them beyond the control of the project team. In this project, acid mine drainage (AMD) was successfully remediated through the construction of a passive AMD treatment system. Extensive water quality sampling both before and after the installation of the passive AMD treatment system showed that the system achieved removal efficiencies and pollutant loading reductions for acidity, iron, aluminum and manganese that were consistent with systems of similar size and design. The success of the passive AMD treatment system should have resulted in water credits if the project had not been terminated. Developing carbon sequestration credits, however, was much more complex and was not achieved in this project. The primary challenge that the project team encountered in meeting the full project objectives was the unsuccessful attempt to have the landowner sign a conservation easement for his property. This would have allowed the project team to clear and reforest the site, monitor the progress of the newly planted trees, and eventually realize carbon sequestration credits once the forest was mature. The delays caused by the lack of a conservation easement, as well as other factors, eventually resulted in the reforestation portion of the project being cancelled. The information in this report will help the public make more informed decisions regarding the potential of using water, carbon, and other credits to support the remediation of mined lands throughout the United States. The hope is that, by using credits, more mined lands will be remediated.

  1. A Computational Approach to Identifying Gene-microRNA Modules in Cancer

    PubMed Central

    Jin, Daeyong; Lee, Hyunju

    2015-01-01

    MicroRNAs (miRNAs) play key roles in the initiation and progression of various cancers by regulating genes. Regulatory interactions between genes and miRNAs are complex, as multiple miRNAs can regulate multiple genes. In addition, these interactions vary from patient to patient and even among patients with the same cancer type, as cancer development is a heterogeneous process. These relationships are more complicated because transcription factors and other regulatory molecules can also regulate miRNAs and genes. Hence, it is important to identify the complex relationships between genes and miRNAs in cancer. In this study, we propose a computational approach to constructing modules that represent these relationships by integrating the expression data of genes and miRNAs with gene-gene interaction data. First, we used a biclustering algorithm to construct modules consisting of a subset of genes and a subset of samples to incorporate the heterogeneity of cancer cells. Second, we combined gene-gene interactions to include genes that play important roles in cancer-related pathways. Then, we selected miRNAs that are closely associated with genes in the modules based on a Gaussian Bayesian network and Bayesian Information Criteria. When we applied our approach to ovarian cancer and glioblastoma (GBM) data sets, 33 and 54 modules were constructed, respectively. In these modules, 91% and 94% of ovarian cancer and GBM modules, respectively, were explained either by direct regulation between genes and miRNAs or by indirect relationships via transcription factors. In addition, 48.4% and 74.0% of modules from ovarian cancer and GBM, respectively, were enriched with cancer-related pathways, and 51.7% and 71.7% of miRNAs in modules were ovarian cancer-related miRNAs and GBM-related miRNAs, respectively. Finally, we extensively analyzed significant modules and showed that most genes in these modules were related to ovarian cancer and GBM. PMID:25611546

  2. Neuroimaging and Neuromodulation: Complementary Approaches for Identifying the Neuronal Correlates of Tinnitus

    PubMed Central

    Langguth, Berthold; Schecklmann, Martin; Lehner, Astrid; Landgrebe, Michael; Poeppl, Timm Benjamin; Kreuzer, Peter Michal; Schlee, Winfried; Weisz, Nathan; Vanneste, Sven; De Ridder, Dirk

    2012-01-01

    An inherent limitation of functional imaging studies is their correlational approach. More information about critical contributions of specific brain regions can be gained by focal transient perturbation of neural activity in specific regions with non-invasive focal brain stimulation methods. Functional imaging studies have revealed that tinnitus is related to alterations in neuronal activity of central auditory pathways. Modulation of neuronal activity in auditory cortical areas by repetitive transcranial magnetic stimulation (rTMS) can reduce tinnitus loudness and, if applied repeatedly, exerts therapeutic effects, confirming the relevance of auditory cortex activation for tinnitus generation and persistence. Measurements of oscillatory brain activity before and after rTMS demonstrate that the same stimulation protocol has different effects on brain activity in different patients, presumably related to interindividual differences in baseline activity in the clinically heterogeneous study cohort. In addition to alterations in auditory pathways, imaging techniques also indicate the involvement of non-auditory brain areas, such as the fronto-parietal “awareness” network and the non-tinnitus-specific distress network consisting of the anterior cingulate cortex, anterior insula, and amygdala. Involvement of the hippocampus and the parahippocampal region putatively reflects the relevance of memory mechanisms in the persistence of the phantom percept and the associated distress. Preliminary studies targeting the dorsolateral prefrontal cortex, the dorsal anterior cingulate cortex, and the parietal cortex with rTMS and with transcranial direct current stimulation confirm the relevance of the mentioned non-auditory networks. Available data indicate the important value added by brain stimulation as a complementary approach to neuroimaging for identifying the neuronal correlates of the various clinical aspects of tinnitus. PMID:22509155

  3. Long-range prediction of Indian summer monsoon rainfall using data mining and statistical approaches

    NASA Astrophysics Data System (ADS)

    H, Vathsala; Koolagudi, Shashidhar G.

    2016-07-01

    This paper presents a hybrid model to better predict Indian summer monsoon rainfall. The algorithm considers suitable techniques for processing dense datasets. The proposed three-step algorithm comprises closed itemset generation-based association rule mining for feature selection, cluster membership for dimensionality reduction, and simple logistic function for prediction. The application of predicting rainfall into flood, excess, normal, deficit, and drought based on 36 predictors consisting of land and ocean variables is presented. Results show good accuracy in the considered study period of 37 years (1969-2005).

  4. A metabolite profiling approach to identify biomarkers of flavonoid intake in humans.

    PubMed

    Loke, Wai Mun; Jenner, Andrew M; Proudfoot, Julie M; McKinley, Allan J; Hodgson, Jonathan M; Halliwell, Barry; Croft, Kevin D

    2009-12-01

    Flavonoids are phytochemicals that are widespread in the human diet. Despite limitations in their bioavailability, experimental and epidemiological data suggest health benefits of flavonoid consumption. Valid biomarkers of flavonoid intake may be useful for estimating exposure in a range of settings. However, to date, few useful flavonoid biomarkers have been identified. In this study, we used a metabolite profiling approach to examine the aromatic and phenolic profile of plasma and urine of healthy men after oral consumption of 200 mg of the pure flavonoids, quercetin, (-)-epicatechin, and epigallocatechin gallate, which represent major flavonoid constituents in the diet. Following enzymatic hydrolysis, 71 aromatic compounds were quantified in plasma and urine at 2 and 5 h, respectively, after flavonoid ingestion. Plasma concentrations of different aromatic compounds ranged widely, from 0.01 to 10 micromol/L, with variation among volunteers. None of the aromatic compounds was significantly elevated in plasma 2 h after consumption of either flavonoid compared with water placebo. This indicates that flavonoid-derived aromatic compounds are not responsible for the acute physiological effects reported within 2 h in previous human intervention studies involving flavonoids or flavonoid-rich food consumption. These effects are more likely due to absorption of the intact flavonoid. Our urine analysis suggested that urinary 4-ethylphenol, benzoic acid, and 4-ethylbenzoic acid may be potential biomarkers of quercetin intake and 1,3,5-trimethoxybenzene, 4-O-methylgallic acid, 3-O-methylgallic acid, and gallic acid may be potential markers of epigallocatechin gallate intake. Potential biomarkers of (-)-epicatechin were not identified. These urinary biomarkers may provide an accurate indication of flavonoid exposure. PMID:19812218

  5. A Genome-Wide Methylation Approach Identifies a New Hypermethylated Gene Panel in Ulcerative Colitis

    PubMed Central

    Kang, Keunsoo; Bae, Jin-Han; Han, Kyudong; Kim, Eun Soo; Kim, Tae-Oh; Yi, Joo Mi

    2016-01-01

    The cause of inflammatory bowel disease (IBD) is still unknown, but there is growing evidence that environmental factors such as epigenetic changes can contribute to the disease etiology. The aim of this study was to identify newly hypermethylated genes in ulcerative colitis (UC) using a genome-wide DNA methylation approach. Using an Infinium HumanMethylation450 BeadChip array, we screened the DNA methylation changes in three normal colon controls and eight UC patients. Using these methylation profiles, 48 probes associated with CpG promoter methylation showed differential hypermethylation between UC patients and normal controls. Technical validations for methylation analyses in a larger series of UC patients (n = 79) were performed by methylation-specific PCR (MSP) and bisulfite sequencing analysis. We ultimately found three genes (FAM217B, KIAA1614 and RIBC2) with significantly elevated promoter methylation levels in UC compared to normal controls. Interestingly, we confirmed that these three genes were transcriptionally silenced in UC patient samples by qRT-PCR, suggesting that their silencing is correlated with the promoter hypermethylation. Pathway analyses were performed using GO and KEGG databases with differentially hypermethylated genes in UC. Our results highlight aberrant hypermethylation in UC patients, which may serve as a potential biomarker for detecting UC. Moreover, pathway-enriched hypermethylated genes possibly implicate important cellular functions in the pathogenesis of UC. Overall, this study describes a newly hypermethylated gene panel in UC patients and provides new clinical information that can be used for the diagnosis and therapeutic treatment of IBD. PMID:27517910

  6. A Genome-Wide Methylation Approach Identifies a New Hypermethylated Gene Panel in Ulcerative Colitis.

    PubMed

    Kang, Keunsoo; Bae, Jin-Han; Han, Kyudong; Kim, Eun Soo; Kim, Tae-Oh; Yi, Joo Mi

    2016-01-01

    The cause of inflammatory bowel disease (IBD) is still unknown, but there is growing evidence that environmental factors such as epigenetic changes can contribute to the disease etiology. The aim of this study was to identify newly hypermethylated genes in ulcerative colitis (UC) using a genome-wide DNA methylation approach. Using an Infinium HumanMethylation450 BeadChip array, we screened the DNA methylation changes in three normal colon controls and eight UC patients. Using these methylation profiles, 48 probes associated with CpG promoter methylation showed differential hypermethylation between UC patients and normal controls. Technical validations for methylation analyses in a larger series of UC patients (n = 79) were performed by methylation-specific PCR (MSP) and bisulfite sequencing analysis. We ultimately found three genes (FAM217B, KIAA1614 and RIBC2) with significantly elevated promoter methylation levels in UC compared to normal controls. Interestingly, we confirmed that these three genes were transcriptionally silenced in UC patient samples by qRT-PCR, suggesting that their silencing is correlated with the promoter hypermethylation. Pathway analyses were performed using GO and KEGG databases with differentially hypermethylated genes in UC. Our results highlight aberrant hypermethylation in UC patients, which may serve as a potential biomarker for detecting UC. Moreover, pathway-enriched hypermethylated genes possibly implicate important cellular functions in the pathogenesis of UC. Overall, this study describes a newly hypermethylated gene panel in UC patients and provides new clinical information that can be used for the diagnosis and therapeutic treatment of IBD. PMID:27517910

  7. On the Way to New Possible Na-Ion Conductors: The Voronoi-Dirichlet Approach, Data Mining and Symmetry Considerations in Ternary Na Oxides.

    PubMed

    Meutzner, Falk; Münchgesang, Wolfram; Kabanova, Natalya A; Zschornak, Matthias; Leisegang, Tilmann; Blatov, Vladislav A; Meyer, Dirk C

    2015-11-01

    With the constant growth of the lithium battery market and the introduction of electric vehicles and stationary energy storage solutions, the low abundance and high price of lithium will greatly impact its availability in the future. Thus, a diversification of electrochemical energy storage technologies based on other source materials is of great relevance. Sodium is energetically similar to lithium but cheaper and more abundant, which results in some already established stationary concepts, such as Na-S and ZEBRA cells. The most significant bottleneck for these technologies is to find effective solid ionic conductors. Thus, the goal of this work is to identify new ionic conductors for Na ions in ternary Na oxides. For this purpose, the Voronoi-Dirichlet approach has been applied to the Inorganic Crystal Structure Database and some new procedures are introduced to the algorithm implemented in the programme package ToposPro. The main new features are the use of data mined values, which are then used for the evaluation of void spaces, and a new method of channel size calculation. 52 compounds have been identified to be high-potential candidates for solid ionic conductors. The results were analysed from a crystallographic point of view in combination with phenomenological requirements for ionic conductors and intercalation hosts. Of the most promising candidates, previously reported compounds have also been successfully identified by using the employed algorithm, which shows the reliability of the method. PMID:26395985

  8. An observation-based approach to identify local natural dust events from routine aerosol ground monitoring

    NASA Astrophysics Data System (ADS)

    Tong, D. Q.; Dan, M.; Wang, T.; Lee, P.

    2012-02-01

    Dust is a major component of atmospheric aerosols in many parts of the world. Although there exist many routine aerosol monitoring networks, it is often difficult to obtain dust records from these networks, because these monitors are either deployed far away from dust active regions (most likely collocated with dense population) or contaminated by anthropogenic sources and other natural sources, such as wildfires and vegetation detritus. Here we propose a new approach to identify local dust events relying solely on aerosol mass and composition from general-purpose aerosol measurements. Through analyzing the chemical and physical characteristics of aerosol observations during satellite-detected dust episodes, we select five indicators to be used to identify local dust records: (1) high PM10 concentrations; (2) low PM2.5/PM10 ratio; (3) higher concentrations and percentage of crustal elements; (4) lower percentage of anthropogenic pollutants; and (5) low enrichment factors of anthropogenic elements. After establishing these identification criteria, we conduct hierarchical cluster analysis for all validated aerosol measurement data over 68 IMPROVE sites in the Western United States. A total of 182 local dust events were identified over 30 of the 68 locations from 2000 to 2007. These locations are either close to the four US Deserts, namely the Great Basin Desert, the Mojave Desert, the Sonoran Desert, and the Chihuahuan Desert, or in the high wind power region (Colorado). During the eight-year study period, the total number of dust events displays an interesting four-year activity cycle (one in 2000-2003 and the other in 2004-2007). The years of 2003, 2002 and 2007 are the three most active dust periods, with 46, 31 and 24 recorded dust events, respectively, while the years of 2000, 2004 and 2005 are the calmest periods, all with single digit dust records. Among these deserts, the Chihuahuan Desert (59 cases) and the Sonoran Desert (62 cases) are by far the most active

  9. Novel and Unexpected Microbial Diversity in Acid Mine Drainage in Svalbard (78° N), Revealed by Culture-Independent Approaches

    PubMed Central

    García-Moyano, Antonio; Austnes, Andreas Erling; Lanzén, Anders; González-Toril, Elena; Aguilera, Ángeles; Øvreås, Lise

    2015-01-01

    Svalbard, situated in the high Arctic, is an important past and present coal mining area. Dozens of abandoned waste rock piles can be found in the proximity of Longyearbyen. This environment offers a unique opportunity for studying the biological control over the weathering of sulphide rocks at low temperatures. Although the extension and impact of acid mine drainage (AMD) in this area is known, the native microbial communities involved in this process are still scarcely studied and uncharacterized. Several abandoned mining areas were explored in the search for active AMD and a culture-independent approach was applied with samples from two different runoffs for the identification and quantification of the native microbial communities. The results obtained revealed two distinct microbial communities. One of the runoffs was more extreme with regards to pH and higher concentration of soluble iron and heavy metals. These conditions favored the development of algal-dominated microbial mats. Typical AMD microorganisms related to known iron-oxidizing bacteria (Acidithiobacillus ferrivorans, Acidobacteria and Actinobacteria) dominated the bacterial community although some unexpected populations related to Chloroflexi were also significant. No microbial mats were found in the second area. The geochemistry here showed less extreme drainage, most likely in direct contact with the ore under the waste pile. Large deposits of secondary minerals were found and the presence of iron stalks was revealed by microscopy analysis. Although typical AMD microorganisms were also detected here, the microbial community was dominated by other populations, some of them new to this type of system (Saccharibacteria, Gallionellaceae). These were absent or lowered in numbers the farther from the spring source and they could represent native populations involved in the oxidation of sulphide rocks within the waste rock pile. This environment appears thus as a highly interesting field of potential

  11. Novel and Unexpected Microbial Diversity in Acid Mine Drainage in Svalbard (78° N), Revealed by Culture-Independent Approaches.

    PubMed

    García-Moyano, Antonio; Austnes, Andreas Erling; Lanzén, Anders; González-Toril, Elena; Aguilera, Ángeles; Øvreås, Lise

    2015-10-13

    Svalbard, situated in the high Arctic, is an important past and present coal mining area. Dozens of abandoned waste rock piles can be found in the proximity of Longyearbyen. This environment offers a unique opportunity for studying the biological control over the weathering of sulphide rocks at low temperatures. Although the extension and impact of acid mine drainage (AMD) in this area is known, the native microbial communities involved in this process are still scarcely studied and uncharacterized. Several abandoned mining areas were explored in the search for active AMD and a culture-independent approach was applied with samples from two different runoffs for the identification and quantification of the native microbial communities. The results obtained revealed two distinct microbial communities. One of the runoffs was more extreme with regards to pH and higher concentration of soluble iron and heavy metals. These conditions favored the development of algal-dominated microbial mats. Typical AMD microorganisms related to known iron-oxidizing bacteria (Acidithiobacillus ferrivorans, Acidobacteria and Actinobacteria) dominated the bacterial community although some unexpected populations related to Chloroflexi were also significant. No microbial mats were found in the second area. The geochemistry here showed less extreme drainage, most likely in direct contact with the ore under the waste pile. Large deposits of secondary minerals were found and the presence of iron stalks was revealed by microscopy analysis. Although typical AMD microorganisms were also detected here, the microbial community was dominated by other populations, some of them new to this type of system (Saccharibacteria, Gallionellaceae). These were absent or lowered in numbers the farther from the spring source and they could represent native populations involved in the oxidation of sulphide rocks within the waste rock pile. This environment appears thus as a highly interesting field of potential

  12. Perception of Air Pollution in the Jinchuan Mining Area, China: A Structural Equation Modeling Approach.

    PubMed

    Li, Zhengtao; Folmer, Henk; Xue, Jianhong

    2016-07-21

    Studies on the perception of air pollution in China are very limited. The aim of this paper is to help to fill this gap by analyzing a cross-sectional dataset of 759 residents of the Jinchuan mining area, Gansu Province, China. The estimations suggest that perception of air pollution is two-dimensional. The first dimension is the perceived intensity of air pollution and the second is the perceived hazardousness of the pollutants. Both dimensions are influenced by environmental knowledge. Perceived intensity is furthermore influenced by socio-economic status and proximity to the pollution source; perceived hazardousness is influenced by socio-economic status, family health experience, family size and proximity to the pollution source. There are no reverse effects from perception on environmental knowledge. The main conclusion is that virtually all Jinchuan residents perceive high intensity and hazardousness of air pollution despite the fact that public information on air pollution and its health impacts is classified to a great extent. It is suggested that, to assist the residents to take appropriate preventive action, the local government should develop counseling and educational campaigns and institutionalize disclosure of air quality conditions. These programs should pay special attention to young residents who have limited knowledge of air pollution in the Jinchuan mining area.

  13. Perception of Air Pollution in the Jinchuan Mining Area, China: A Structural Equation Modeling Approach

    PubMed Central

    Li, Zhengtao; Folmer, Henk; Xue, Jianhong

    2016-01-01

    Studies on the perception of air pollution in China are very limited. The aim of this paper is to help to fill this gap by analyzing a cross-sectional dataset of 759 residents of the Jinchuan mining area, Gansu Province, China. The estimations suggest that perception of air pollution is two-dimensional. The first dimension is the perceived intensity of air pollution and the second is the perceived hazardousness of the pollutants. Both dimensions are influenced by environmental knowledge. Perceived intensity is furthermore influenced by socio-economic status and proximity to the pollution source; perceived hazardousness is influenced by socio-economic status, family health experience, family size and proximity to the pollution source. There are no reverse effects from perception on environmental knowledge. The main conclusion is that virtually all Jinchuan residents perceive high intensity and hazardousness of air pollution despite the fact that public information on air pollution and its health impacts is classified to a great extent. It is suggested that, to assist the residents to take appropriate preventive action, the local government should develop counseling and educational campaigns and institutionalize disclosure of air quality conditions. These programs should pay special attention to young residents who have limited knowledge of air pollution in the Jinchuan mining area. PMID:27455291

  14. A New Approach for Mining Order-Preserving Submatrices Based on All Common Subsequences.

    PubMed

    Xue, Yun; Liao, Zhengling; Li, Meihang; Luo, Jie; Kuang, Qiuhua; Hu, Xiaohui; Li, Tiechen

    2015-01-01

    Order-preserving submatrices (OPSMs) have been applied in many fields, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems, as an important unsupervised learning model. Unfortunately, because the problem is NP-complete, most existing methods are heuristic algorithms that cannot reveal all OPSMs. In particular, deep OPSMs, corresponding to long patterns with few supporting sequences, incur explosive computational costs and are completely pruned by most popular methods. In this paper, we propose an exact method to discover all OPSMs based on frequent sequential pattern mining. First, an existing algorithm was adjusted to disclose all common subsequences (ACS) between every two row sequences, so that no deep OPSM is missed. Then, an improved prefix tree data structure was used to store and traverse the ACS, and the Apriori principle was employed to efficiently mine the frequent sequential patterns. Finally, experiments were carried out on gene and synthetic datasets. Results demonstrated the effectiveness and efficiency of this method. PMID:26161131
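
    The notion of an order-preserving submatrix can be illustrated with a minimal sketch: a set of rows together with a column ordering under which every row's values strictly increase. The helper names (`column_order`, `supports`, `support_rows`) and the toy matrix are hypothetical and introduced only for illustration. This brute-force support check is not the ACS and prefix-tree method described in the abstract.

```python
from typing import List, Sequence


def column_order(row: Sequence[float]) -> List[int]:
    """Return column indices sorted by ascending value (the row's sequence)."""
    return sorted(range(len(row)), key=lambda j: row[j])


def supports(row: Sequence[float], pattern: Sequence[int]) -> bool:
    """True if the row's values strictly increase along the column pattern."""
    return all(row[a] < row[b] for a, b in zip(pattern, pattern[1:]))


def support_rows(matrix: Sequence[Sequence[float]], pattern: Sequence[int]) -> List[int]:
    """Indices of rows that support the pattern; together they form an OPSM."""
    return [i for i, row in enumerate(matrix) if supports(row, pattern)]


# Toy expression matrix (made-up values): rows are genes, columns are conditions.
toy = [
    [1.0, 4.0, 2.0, 9.0],
    [0.5, 3.0, 1.5, 2.0],
    [7.0, 8.0, 6.0, 9.5],
    [2.0, 1.0, 3.0, 4.0],
]
pattern = [0, 2, 3]  # candidate ordering: column 0 < column 2 < column 3
rows = support_rows(toy, pattern)
print(rows)
```

    A pattern is "deep" in the abstract's sense when it is long but `support_rows` returns only a few rows, which is why support-based pruning tends to discard it.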

  15. Perception of Air Pollution in the Jinchuan Mining Area, China: A Structural Equation Modeling Approach.

    PubMed

    Li, Zhengtao; Folmer, Henk; Xue, Jianhong

    2016-01-01

    Studies on the perception of air pollution in China are very limited. The aim of this paper is to help to fill this gap by analyzing a cross-sectional dataset of 759 residents of the Jinchuan mining area, Gansu Province, China. The estimations suggest that perception of air pollution is two-dimensional. The first dimension is the perceived intensity of air pollution and the second is the perceived hazardousness of the pollutants. Both dimensions are influenced by environmental knowledge. Perceived intensity is furthermore influenced by socio-economic status and proximity to the pollution source; perceived hazardousness is influenced by socio-economic status, family health experience, family size and proximity to the pollution source. There are no reverse effects from perception on environmental knowledge. The main conclusion is that virtually all Jinchuan residents perceive high intensity and hazardousness of air pollution despite the fact that public information on air pollution and its health impacts is classified to a great extent. It is suggested that, to assist the residents to take appropriate preventive action, the local government should develop counseling and educational campaigns and institutionalize disclosure of air quality conditions. These programs should pay special attention to young residents who have limited knowledge of air pollution in the Jinchuan mining area. PMID:27455291

  16. Data mining approach to evaluating the use of skin surface electropotentials for breast cancer detection.

    PubMed

    Sree, S Vinitha; Ng, E Y K; Acharya, U Rajendra

    2010-02-01

    The Biofield Diagnostic System (BDS) uses a score formed with measured skin surface electropotentials and a prior Level Of Suspicion (LOS) value (predicted by the physician based on the patient's ultrasound or mammography results) to calculate a revised Post-BDS LOS to indicate the presence of breast cancer. The demographic details, BDS test results, and the recorded electropotential values form a potentially useful dataset, which can be further explored with data mining tools to extract important information that can be used to improve the current predictive accuracy of the device. According to the proposed data mining framework, the BDS dataset with 291 cases was first pre-processed to remove outliers and then used to select relevant and informative features for classifier development and finally to evaluate the capability of the built classifiers in detecting the presence of the disease. Two popular feature selection techniques, namely, the filter and wrapper methods, were used in parallel for feature selection. A few statistical inference based classifiers and neural networks were used for classification. The proposed technique significantly improved the BDS prediction accuracy. Also, the use of prior LOS and, hence, the Post-BDS LOS, associates a mild subjective interpretation to the current prediction methodology used by BDS. However, the feature subset selected in our analysis that gave the best accuracy did not use either of these features. This result indicates the possibility of using BDS as a better objective assessment tool for breast cancer detection.

  17. Analysis of Maintenance Service Contracts for Dump Trucks Used in Mining Industry with Simulation Approach

    NASA Astrophysics Data System (ADS)

    Dymasius, A.; Wangsaputra, R.; Iskandar, B. P.

    2016-02-01

    A mining company needs high availability of the dump trucks used to haul mining materials. As a result, effective maintenance is required to keep the dump trucks in good condition and hence reduce their failures and downtime. Carrying out maintenance in-house requires a highly intensive maintenance facility and highly skilled maintenance specialists. Often, outsourcing maintenance is an economic option for the company. An external agent takes proactive action by offering some maintenance contract options to the owner. The decision problem for the owner is to choose the best option, and for the agent it is to determine the optimal price for each option offered. Non-cooperative game theory is used to formulate the decision problems for the owner and the agent. We consider that the failure pattern of each truck follows a non-homogeneous Poisson process (NHPP), and queueing theory with multiple servers is used to estimate the downtime. As modelling downtime with queueing theory involves high complexity, in this paper we use a simulation method. Furthermore, we conduct experiments to find the best number of maintenance facilities (servers), which minimises the maintenance and penalty costs incurred by the agent.
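
    As a hedged illustration of the simulation idea, the sketch below draws failure times from an NHPP by thinning a homogeneous Poisson process. The abstract does not state the intensity function, so the power-law form, the parameter names `beta` and `eta`, and the numeric values are assumptions made only for this example.

```python
import random
from typing import List


def power_law_intensity(t: float, beta: float, eta: float) -> float:
    """Assumed failure intensity: lambda(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


def simulate_nhpp(horizon: float, beta: float, eta: float, rng: random.Random) -> List[float]:
    """Draw failure times on (0, horizon] by thinning a homogeneous Poisson process."""
    if beta < 1:
        raise ValueError("this sketch assumes a non-decreasing intensity (beta >= 1)")
    # For beta >= 1 the intensity is non-decreasing, so its value at the horizon
    # is an upper bound on the whole interval.
    lam_max = power_law_intensity(horizon, beta, eta)
    t = 0.0
    failures: List[float] = []
    while True:
        t += rng.expovariate(lam_max)  # next candidate event of the bounding process
        if t > horizon:
            return failures
        # Accept the candidate with probability lambda(t) / lam_max.
        if rng.random() <= power_law_intensity(t, beta, eta) / lam_max:
            failures.append(t)


rng = random.Random(42)  # fixed seed so the run is reproducible
failures = simulate_nhpp(horizon=1000.0, beta=2.0, eta=200.0, rng=rng)
print(len(failures), "simulated failures")
```

    In a fuller model, each simulated failure time would feed a multi-server repair queue, and the accumulated waiting plus repair time would give the downtime used in the cost comparison.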

  18. A systems approach to accident causation in mining: an application of the HFACS method.

    PubMed

    Lenné, Michael G; Salmon, Paul M; Liu, Charles C; Trotter, Margaret

    2012-09-01

    This project aimed to provide a greater understanding of the systemic factors involved in mining accidents, and to examine those organisational and supervisory failures that are predictive of sub-standard performance at operator level. A sample of 263 significant mining incidents in Australia across 2007-2008 were analysed using the Human Factors Analysis and Classification System (HFACS). Two human factors specialists independently undertook the analysis. Incidents occurred more frequently in operations concerning the use of surface mobile equipment (38%) and working at heights (21%), however injury was more frequently associated with electrical operations and vehicles and machinery. Several HFACS categories appeared frequently: skill-based errors (64%) and violations (57%), issues with the physical environment (56%), and organisational processes (65%). Focussing on the overall system, several factors were found to predict the presence of failures in other parts of the system, including planned inappropriate operations and team resource management; inadequate supervision and team resource management; and organisational climate and inadequate supervision. It is recommended that these associations deserve greater attention in future attempts to develop accident countermeasures, although other significant associations should not be ignored. In accordance with findings from previous HFACS-based analyses of aviation and medical incidents, efforts to reduce the frequency of unsafe acts or operations should be directed to a few critical HFACS categories at the higher levels: organisational climate, planned inadequate operations, and inadequate supervision. While remedial strategies are proposed it is important that future efforts evaluate the utility of the measures proposed in studies of system safety.

  19. An integrated systems biology approach identifies positive cofactor 4 as a factor that increases reprogramming efficiency

    PubMed Central

    Jo, Junghyun; Hwang, Sohyun; Kim, Hyung Joon; Hong, Soomin; Lee, Jeoung Eun; Lee, Sung-Geum; Baek, Ahmi; Han, Heonjong; Lee, Jin Il; Lee, Insuk; Lee, Dong Ryul

    2016-01-01

    Spermatogonial stem cells (SSCs) can spontaneously dedifferentiate into embryonic stem cell (ESC)-like cells, which are designated as multipotent SSCs (mSSCs), without ectopic expression of reprogramming factors. Interestingly, SSCs express key pluripotency genes such as Oct4, Sox2, Klf4 and Myc. Therefore, molecular dissection of mSSC reprogramming may provide clues about novel endogenous reprogramming or pluripotency regulatory factors. Our comparative transcriptome analysis of mSSCs and induced pluripotent stem cells (iPSCs) suggests that they have similar pluripotency states but are reprogrammed via different transcriptional pathways. We identified 53 genes as putative pluripotency regulatory factors using an integrated systems biology approach. We demonstrated a selected candidate, Positive cofactor 4 (Pc4), can enhance the efficiency of somatic cell reprogramming by promoting and maintaining transcriptional activity of the key reprograming factors. These results suggest that Pc4 has an important role in inducing spontaneous somatic cell reprogramming via up-regulation of key pluripotency genes. PMID:26740582

  20. An approach to identify the novel miRNA encoded from H. Annuus EST sequences.

    PubMed

    Gupta, Hemant; Tiwari, Tanushree; Patel, Maulik; Mehta, Aditya; Ghosh, Arpita

    2015-12-01

    MicroRNAs are a newly discovered class of non-protein-coding small RNAs with 22-24 nucleotides. They play multiple roles in biological processes including development, cell proliferation, apoptosis, stress responses and many other cell functions. In this research, several approaches were combined to make a computational prediction of potential miRNAs and their targets in Helianthus annuus (H. annuus). The already available information on the plant miRNAs present in miRBase v21 was used against expressed sequence tags (ESTs). A total of three miRNAs were detected, from which one potential novel miRNA was identified following a range of strict filtering criteria. Target prediction was carried out for these three miRNAs, which have various targets. These targets were functionally annotated and GO terms were assigned. To study the conserved nature of the predicted miRNAs, phylogenetic analysis was carried out. These findings provide a broader picture for understanding miRNA functions in H. annuus. PMID:26697356

  1. Identifying human disease genes: advances in molecular genetics and computational approaches.

    PubMed

    Bakhtiar, S M; Ali, A; Baig, S M; Barh, D; Miyoshi, A; Azevedo, V

    2014-07-04

    The human genome project is one of the significant achievements that have provided detailed insight into our genetic legacy. During the last two decades, biomedical investigations have gathered a considerable body of evidence by detecting more than 2000 disease genes. Despite the important advances in the genetic understanding of various diseases, the pathogenesis of many others remains obscure. With recent advances, the laborious methodologies used to identify DNA variations are being replaced by direct sequencing of genomic DNA to detect genetic changes. The ability to perform such studies depends equally on the development of high-throughput and economical genotyping methods. Currently, for essentially every disease whose origin is still unknown, genetic approaches are available, which may be pedigree-dependent or -independent and have the capacity to elucidate fundamental disease mechanisms. Computer algorithms and programs for linkage analysis have formed the foundation for many disease gene detection projects. Similarly, databases of clinical findings have been widely used to support diagnostic decisions in dysmorphology and general human disease. For every disease type, genome sequence variations, particularly single nucleotide polymorphisms, are mapped by comparing the genetic makeup of case and control groups. Methods that predict the effects of polymorphisms on protein stability are useful for the identification of possible disease associations, whereas structural effects can be assessed using methods to predict stability changes in proteins using sequence and/or structural information.

  2. Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies

    PubMed Central

    Delmont, Tom O.

    2016-01-01

    High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results against potential contamination of non-target organisms requires advanced bioinformatics approaches and practices. Here, we re-analyzed the sequencing data generated for the tardigrade Hypsibius dujardini, and created a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries. By using bacterial single-copy genes, k-mer frequencies, and coverage values of scaffolds we could identify and characterize multiple near-complete bacterial genomes from the raw assembly, and curate a 182 Mbp draft genome for H. dujardini supported by RNA-Seq data. Our results indicate that most contaminant scaffolds were assembled from Moleculo long-read libraries, and most of these contaminants differed between library preparations. Our re-analysis shows that visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the needs of today’s microbiologists, who are constantly challenged by the difficulties associated with the identification of distinct microbial genomes in complex environmental metagenomes. PMID:27069789

  3. Identifying medication error chains from critical incident reports: a new analytic approach.

    PubMed

    Huckels-Baumgart, Saskia; Manser, Tanja

    2014-10-01

    Research into the distribution of medication errors usually focuses on isolated stages within the medication use process. Our study aimed to provide a novel process-oriented approach to medication incident analysis focusing on medication error chains. Our study was conducted across a 900-bed teaching hospital in Switzerland. All 1,591 medication errors reported in 2009-2012 were categorized using the Medication Error Index NCC MERP and the WHO Classification for Patient Safety Methodology. In order to identify medication error chains, each reported medication incident was allocated to the relevant stage of the hospital medication use process. Only 25.8% of the reported medication errors were detected before they propagated through the medication use process. The majority of medication errors (74.2%) formed an error chain encompassing two or more stages. The most frequent error chain comprised preparation up to and including medication administration (45.2%). "Non-consideration of documentation/prescribing" during drug preparation was the most frequent contributor to "wrong dose" during the administration of medication. Medication error chains provide important insights for detecting and stopping medication errors before they reach the patient. Existing and new safety barriers need to be extended to interrupt error chains and to improve patient safety.
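
    The chain definition above (an incident spanning two or more stages) can be sketched as follows. The stage names in `STAGES`, the function names, and the four toy incidents are hypothetical and chosen only for illustration; the study's own stage taxonomy and coding rules may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical ordering of stages in a hospital medication use process.
STAGES = ["prescribing", "transcription", "preparation", "administration", "monitoring"]


def is_error_chain(stages_involved: List[str]) -> bool:
    """An incident counts as a chain if it spans two or more distinct stages."""
    return len(set(stages_involved)) >= 2


def chain_share(incidents: List[List[str]]) -> float:
    """Fraction of incidents that form an error chain."""
    if not incidents:
        return 0.0
    return sum(is_error_chain(inc) for inc in incidents) / len(incidents)


def chain_counts(incidents: List[List[str]]) -> Dict[Tuple[str, str], int]:
    """Count chains by (first stage, last stage) in process order."""
    counts: Dict[Tuple[str, str], int] = {}
    for inc in incidents:
        if is_error_chain(inc):
            ordered = sorted(set(inc), key=STAGES.index)
            key = (ordered[0], ordered[-1])
            counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up incident reports, each listing the stages the error passed through.
toy_incidents = [
    ["preparation", "administration"],
    ["prescribing"],
    ["preparation", "administration"],
    ["prescribing", "preparation", "administration"],
]
share = chain_share(toy_incidents)
counts = chain_counts(toy_incidents)
print(share, counts)
```

    Grouping chains by their first and last stage is one simple way to see where a safety barrier would need to sit to interrupt the most frequent chain.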

  4. Mixed Integer Linear Programming based machine learning approach identifies regulators of telomerase in yeast.

    PubMed

    Poos, Alexandra M; Maicher, André; Dieckmann, Anna K; Oswald, Marcus; Eils, Roland; Kupiec, Martin; Luke, Brian; König, Rainer

    2016-06-01

    Understanding telomere length maintenance mechanisms is central in cancer biology as their dysregulation is one of the hallmarks for immortalization of cancer cells. Important for this well-balanced control is the transcriptional regulation of the telomerase genes. We integrated Mixed Integer Linear Programming models into a comparative machine learning based approach to identify regulatory interactions that best explain the discrepancy of telomerase transcript levels in yeast mutants with deleted regulators showing aberrant telomere length, when compared to mutants with normal telomere length. We uncover novel regulators of telomerase expression, several of which affect histone levels or modifications. In particular, our results point to the transcription factors Sum1, Hst1 and Srb2 as being important for the regulation of EST1 transcription, and we validated the effect of Sum1 experimentally. We compiled our machine learning method leading to a user friendly package for R which can straightforwardly be applied to similar problems integrating gene regulator binding information and expression profiles of samples of e.g. different phenotypes, diseases or treatments. PMID:26908654

  5. Identifying an appropriate measurement modeling approach for the Mini-Mental State Examination.

    PubMed

    Rubright, Jonathan D; Nandakumar, Ratna; Karlawish, Jason

    2016-02-01

    The Mini-Mental State Examination (MMSE) is a 30-item, dichotomously scored test of general cognition. A number of benefits could be gained by modeling the MMSE in an item response theory (IRT) framework, as opposed to the currently used classical additive approach. However, the test, which is built from groups of items related to separate cognitive subdomains, may violate a key assumption of IRT: local item independence. This study aimed to identify the most appropriate measurement model for the MMSE: a unidimensional IRT model, a testlet response theory model, or a bifactor model. Local dependence analysis using nationally representative data showed a meaningful violation of the local item independence assumption, indicating multidimensionality. In addition, the testlet and bifactor models displayed superior fit indices over a unidimensional IRT model. Statistical comparisons showed that the bifactor model fit MMSE respondent data significantly better than the other models considered. These results suggest that application of a traditional unidimensional IRT model is inappropriate in this context. Instead, a bifactor model is suggested for future modeling of MMSE data as it more accurately represents the multidimensional nature of the scale.

  6. Mixed Integer Linear Programming based machine learning approach identifies regulators of telomerase in yeast

    PubMed Central

    Poos, Alexandra M.; Maicher, André; Dieckmann, Anna K.; Oswald, Marcus; Eils, Roland; Kupiec, Martin; Luke, Brian; König, Rainer

    2016-01-01

    Understanding telomere length maintenance mechanisms is central in cancer biology as their dysregulation is one of the hallmarks for immortalization of cancer cells. Important for this well-balanced control is the transcriptional regulation of the telomerase genes. We integrated Mixed Integer Linear Programming models into a comparative machine learning based approach to identify regulatory interactions that best explain the discrepancy of telomerase transcript levels in yeast mutants with deleted regulators showing aberrant telomere length, when compared to mutants with normal telomere length. We uncover novel regulators of telomerase expression, several of which affect histone levels or modifications. In particular, our results point to the transcription factors Sum1, Hst1 and Srb2 as being important for the regulation of EST1 transcription, and we validated the effect of Sum1 experimentally. We compiled our machine learning method leading to a user friendly package for R which can straightforwardly be applied to similar problems integrating gene regulator binding information and expression profiles of samples of e.g. different phenotypes, diseases or treatments. PMID:26908654

  7. Rapid in vivo forward genetic approach for identifying axon death genes in Drosophila

    PubMed Central

    Neukomm, Lukas J.; Burdett, Thomas C.; Gonzalez, Michael A.; Züchner, Stephan; Freeman, Marc R.

    2014-01-01

    Axons damaged by acute injury, toxic insults, or neurodegenerative diseases execute a poorly defined autodestruction signaling pathway leading to widespread fragmentation and functional loss. Here, we describe an approach to study Wallerian degeneration in the Drosophila L1 wing vein that allows for analysis of axon degenerative phenotypes with single-axon resolution in vivo. This method allows for the axotomy of specific subsets of axons followed by examination of progressive axonal degeneration and debris clearance alongside uninjured control axons. We developed new Flippase (FLP) reagents using proneural gene promoters to drive FLP expression very early in neural lineages. These tools allow for the production of mosaic clone populations with high efficiency in sensory neurons in the wing. We describe a collection of lines optimized for forward genetic mosaic screens using MARCM (mosaic analysis with a repressible cell marker; i.e., GFP-labeled, homozygous mutant) on all major autosomal arms (∼95% of the fly genome). Finally, as a proof of principle we screened the X chromosome and identified a collection of eight recessive and two dominant alleles of highwire, a ubiquitin E3 ligase required for axon degeneration. Similar unbiased forward genetic screens should help rapidly delineate axon death genes, thereby providing novel potential drug targets for therapeutic intervention to prevent axonal and synaptic loss. PMID:24958874

  8. A simple methodological approach for counting and identifying culturable viruses adsorbed to cellulose nitrate membrane filters.

    PubMed

    Papageorgiou, G T; Mocé-Llivina, L; Christodoulou, C G; Lucena, F; Akkelidou, D; Ioannou, E; Jofre, J

    2000-01-01

    We identified conditions under which Buffalo green monkey cells grew on the surfaces of cellulose nitrate membrane filters in such a way that they covered the entire surface of each filter and penetrated through the pores. When such conditions were used, poliovirus that had previously been adsorbed on the membranes infected the cells and replicated. A plaque assay method and a quantal method (most probable number of cytopathic units) were used to detect and count the viruses adsorbed on the membrane filters. Polioviruses in aqueous suspensions were then concentrated by adsorption to cellulose membrane filters and were subsequently counted without elution, a step which is necessary when the commonly used methods are employed. The pore size of the membrane filter, the sample contents, and the sample volume were optimized for tap water, seawater, and a 0.25 M glycine buffer solution. The numbers of viruses recovered under the optimized conditions were more than 50% greater than the numbers counted by the standard plaque assay. When ceftazidime was added to the assay medium in addition to the antibiotics which are typically used, the method could be used to study natural samples with low and intermediate levels of microbial pollution without decontamination of the samples. This methodological approach also allowed plaque hybridization either directly on cellulose nitrate membranes or on Hybond N+ membranes after the preparations were transferred.

  9. Novel Vaccine Candidates against Brucella melitensis Identified through Reverse Vaccinology Approach.

    PubMed

    Vishnu, Udayakumar S; Sankarasubramanian, Jagadesan; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

    2015-11-01

    Global health therapeutics is a rapidly emerging facet of postgenomics medicine. In this connection, Brucella melitensis is an intracellular bacterium that causes the zoonotic infectious disease, brucellosis. Presently, no licensed vaccines are available for human brucellosis. Here, we report the identification of potential vaccine candidates against B. melitensis using a reverse vaccinology approach. Based on a systematic screening of exoproteome and secretome of B. melitensis 16 M, we identified eight proteins as potential vaccine candidates, including LPS-assembly protein LptD, a polysaccharide export protein, a cell surface protein, heme transporter BhuA, flagellin FliC, 7-alpha-hydroxysteroid dehydrogenase, immunoglobulin-binding protein EIBE, and hemagglutinin. Among these, the roles of BhuA and hemagglutinin in the virulence of Brucella are essential to establish infection. Roles of other proteins in the virulence are yet to be studied. Prediction of protein-protein interactions revealed that these proteins can interact with other proteins involved in virulence, secretion system, metabolism, and transport. From these eight potential vaccine candidates, we predicted three surface exposed novel antigenic epitopes that can induce both B-cell and T-cell immune responses. These peptides can be used for the development of either exclusive peptide vaccines or multi-component vaccines against human brucellosis. Reverse vaccinology is an important strategy for discovery of novel global health therapeutics. PMID:26479901

  10. Synoptic monitoring as an approach to discriminating between point and diffuse source contributions to zinc loads in mining impacted catchments.

    PubMed

    Banks, V J; Palumbo-Roe, B

    2010-09-01

    One of the global legacies of industrialisation is the environmental impacts of historic mineral exploitation. Recent national initiatives to manage the impacts on ground and surface waters have driven the need to develop better techniques for assessing understanding of the catchment-scale distribution and characterisation of the relative contribution of point and diffuse contaminant sources. The benefits of a detailed, multidisciplinary investigation are highlighted through a case study focused on the Rookhope Burn, a tributary of the River Wear, which falls within a significantly mine impacted area of the North Pennines Orefield, UK. Zinc (Zn) has been identified as the contaminant of concern within this catchment, which is judged by the Environment Agency to be at risk of failing to achieve good water quality status in the context of the Water Framework Directive. The results of synoptic flow monitoring and sampling for chemical determinations of major and trace elements have been used to calculate mass balances of instream and inflow chemical loads in the Rookhope Burn. Despite a dominant impact on the water quality from a mine outburst (especially Zn [1.45 to 2.42 mg/l], Fe [2.18 to 3.97 mg/l], Mn [3.69 to 6.77 mg/l], F [3.99 to 4.80 mg/l] and SO(4) [178 to 299 mg/l]), mass balance calculations combined with geological mapping have facilitated the identification of significant, previously unknown, subsurface contributions of Zn contaminated groundwater (with Zn concentrations in excess of 0.4 to 0.9 mg/l and 0.18 to 0.36 mg/l) to the Burn. The subsurface contributions exhibit spatial correspondence to mine workings with associated mineral veins and adits, or to points of suspected karst groundwater resurgence. These findings reiterate the challenges posed in decision making with respect to remediation, in this case in the context of the management of significant subsurface contributions.
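
    The mass-balance reasoning in this abstract can be illustrated with a minimal sketch. It assumes that an instantaneous load is concentration multiplied by discharge, and that any downstream load not accounted for by the upstream load plus measured surface inflows points to an unmeasured (for example subsurface) contribution. The function names and all numbers below are hypothetical and are not taken from the study.

```python
def load_mg_per_s(conc_mg_per_l, flow_l_per_s):
    """Instantaneous load (mg/s) = concentration (mg/L) x discharge (L/s)."""
    return conc_mg_per_l * flow_l_per_s


def unexplained_gain(upstream, downstream, inflows):
    """Downstream load minus the upstream load and all measured inflow loads.

    Each site is a (concentration mg/L, flow L/s) pair; `inflows` is a list
    of such pairs. A clearly positive result suggests an unmeasured source
    along the reach.
    """
    measured_in = load_mg_per_s(*upstream) + sum(
        load_mg_per_s(*site) for site in inflows
    )
    return load_mg_per_s(*downstream) - measured_in


# Hypothetical Zn values for one stream reach (not data from the study).
upstream = (0.10, 200.0)      # 0.10 mg/L at 200 L/s
tributary = (2.00, 10.0)      # one measured surface inflow
downstream = (0.30, 220.0)    # next synoptic sampling site

gain = unexplained_gain(upstream, downstream, [tributary])
print(f"Unexplained Zn load gain: {gain:.1f} mg/s")
```

    In a synoptic survey all sites are sampled over a short interval, so that loads along the stream can be compared under the same flow conditions.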

  11. A multi-proxy approach to identifying short-lived marine incursions in the Early Carboniferous

    NASA Astrophysics Data System (ADS)

    Bennett, Carys; Davies, Sarah; Leng, Melanie; Snelling, Andrea; Millward, David; Kearsey, Timothy; Marshall, John; Reves, Emma

    2015-04-01

    This study is a contribution to the TW:eed Project (Tetrapod World: early evolution and diversification), which examines the rebuilding of Carboniferous ecosystems following a mass extinction at the end of the Devonian. The project focuses on the Tournaisian Ballagan Formation of Scotland and the Borders, which contains rare fish and tetrapod material. The Ballagan Formation is characterised by sandstones, dolomitic cementstones, paleosols, siltstones and gypsum deposits. The depositional environment ranges from fluvial, alluvial-plain to marginal-marine environments, with fluvial, floodplain and lacustrine deposition dominant. A multi-proxy approach combining sedimentology, palaeontology, micropalaeontology, palynology and geochemistry is used to identify short-lived marine transgressions onto the floodplain environment. Rare marginal marine fossils are: Chondrites-Phycosiphon, Spirorbis, Serpula, certain ostracod species, rare orthocones, brachiopods and putative marine sharks. More common non-marine fauna include Leiocopida and Podocopida ostracods, Mytilida and Myalinida bivalves, plants, eurypterids, gastropods and fish. Thin carbonate-bearing dolomitic cementstones and siltstones are the sedimentary deposits of marine incursions and occur throughout the formation. Over 600 bulk carbon isotope samples were taken from the 500 metre thick Norham Core (located near Berwick-Upon-Tweed), encompassing a time interval of around 13 million years. The results range from -26‰ to -19 δ13Corg, with an average of -19‰, much lighter than the average value for Early Carboniferous marine bulk organic matter (δ13C of -28 to -30). The isotope results correspond to broad-scale changes in the depositional setting, with more positive δ13C in pedogenic sediments and more negative δ13C in un-altered grey siltstones. They may also relate to cryptic (short-lived) marine incursions. A comparison of δ13C values from specific plant/wood fragments, palynology and bulk

  12. Outbreaks source: A new mathematical approach to identify their possible location

    NASA Astrophysics Data System (ADS)

    Buscema, Massimo; Grossi, Enzo; Breda, Marco; Jefferson, Tom

    2009-11-01

    Classical epidemiology has generally relied on the description and explanation of the occurrence of infectious diseases in relation to time occurrence of events rather than to place of occurrence. In recent times, computer generated dot maps have facilitated the modeling of the spread of infectious epidemic diseases either with classical statistics approaches or with artificial “intelligent systems”. Few attempts, however, have been made so far to identify the origin of the epidemic spread rather than its evolution by mathematical topology methods. We report on the use of a new artificial intelligence method (the H-PST Algorithm) and we compare this new technique with other well known algorithms to identify the source of three examples of infectious disease outbreaks derived from literature. The H-PST algorithm is a new system able to project a distances matrix of points (events) into a bi-dimensional space, with the generation of a new point, named hidden unit. This new hidden unit deforms the original Euclidean space and transforms it into a new space (cognitive space). The cost function of this transformation is the minimization of the differences between the original distance matrix among the assigned points and the distance matrix of the same points projected into the bi-dimensional map (or any different set of constraints). For many reasons we will discuss, the position of the hidden unit shows to target the outbreak source in many epidemics much better than the other classic algorithms specifically targeted for this task. Compared with main algorithms known in the location theory, the hidden unit was within yards of the outbreak source in the first example (the 2007 epidemic of Chikungunya fever in Italy). The hidden unit was located in the river between the two village epicentres of the spread exactly where the index case was living. Equally in the second (the 1967 foot and mouth disease epidemic in England), and the third (1854 London Cholera epidemic

  13. Identifying diffused nitrate sources in a stream in an agricultural field using a dual isotopic approach.

    PubMed

    Ding, Jingtao; Xi, Beidou; Gao, Rutai; He, Liansheng; Liu, Hongliang; Dai, Xuanli; Yu, Yijun

    2014-06-15

    Nitrate (NO3(-)) pollution is a severe problem in aquatic systems in Taihu Lake Basin in China. A dual isotope approach (δ(15)NNO3(-) and δ(18)ONO3(-)) was applied to identify diffused NO3(-) inputs in a stream in an agricultural field at the basin in 2013. The site-specific isotopic characteristics of five NO3(-) sources (atmospheric deposition, AD; NO3(-) derived from soil organic matter nitrification, NS; NO3(-) derived from chemical fertilizer nitrification, NF; groundwater, GW; and manure and sewage, M&S) were identified. NO3(-) concentrations in the stream during the rainy season [mean±standard deviation (SD)=2.5±0.4mg/L] were lower than those during the dry season (mean±SD=4.0±0.5mg/L), whereas the δ(18)ONO3(-) values during the rainy season (mean±SD=+12.3±3.6‰) were higher than those during the dry season (mean±SD=+0.9±1.9‰). Both chemical and isotopic characteristics indicated that mixing with atmospheric NO3(-) resulted in the high δ(18)O values during the rainy season, whereas NS and M&S were the dominant NO3(-) sources during the dry season. A Bayesian model was used to determine the contribution of each NO3(-) source to total stream NO3(-). Results showed that reduced N nitrification in soil zones (including soil organic matter and fertilizer) was the main NO3(-) source throughout the year. M&S contributed more NO3(-) during the dry season (22.4%) than during the rainy season (17.8%). AD generated substantial amounts of NO3(-) in May (18.4%), June (29.8%), and July (24.5%). With the assessment of temporal variation of diffused NO3(-) sources in agricultural field, improved agricultural management practices can be implemented to protect the water resource and avoid further water quality deterioration in Taihu Lake Basin. PMID:24686140

  14. A Systems Biology Approach Identifies a Regulatory Network in Parotid Acinar Cell Terminal Differentiation

    PubMed Central

    Metzler, Melissa A.; Venkatesh, Srirangapatnam G.; Lakshmanan, Jaganathan; Carenbauer, Anne L.; Perez, Sara M.; Andres, Sarah A.; Appana, Savitri; Brock, Guy N.; Wittliff, James L.; Darling, Douglas S.

    2015-01-01

    Objective The transcription factor networks that drive parotid salivary gland progenitor cells to terminally differentiate remain largely unknown and are vital to understanding the regeneration process. Methodology A systems biology approach was taken to measure mRNA and microRNA expression in vivo across acinar cell terminal differentiation in the rat parotid salivary gland. Laser capture microdissection (LCM) was used to specifically isolate acinar cell RNA at times spanning the month-long period of parotid differentiation. Results Clustering of microarray measurements suggests that expression occurs in four stages. mRNA expression patterns suggest a novel role for Pparg which is transiently increased during mid postnatal differentiation in concert with several target gene mRNAs. 79 microRNAs are significantly differentially expressed across time. Profiles of statistically significant changes of mRNA expression, combined with reciprocal correlations of microRNAs and their target mRNAs, suggest a putative network involving Klf4, a differentiation inhibiting transcription factor, which decreases as several targeting microRNAs increase late in differentiation. The network suggests a molecular switch (involving Prdm1, Sox11, Pax5, miR-200a, and miR-30a) progressively decreases repression of Xbp1 gene transcription, in concert with decreased translational repression by miR-214. The transcription factor Xbp1 mRNA is initially low, increases progressively, and may be maintained by a positive feedback loop with Atf6. Transfection studies show that Xbp1 activates the Mist1 promoter. In addition, Xbp1 and Mist1 each activate the parotid secretory protein (Psp) gene, which encodes an abundant salivary protein, and is a marker of terminal differentiation. Conclusion This study identifies novel expression patterns of Pparg, Klf4, and Sox11 during parotid acinar cell differentiation, as well as numerous differentially expressed microRNAs. Network analysis identifies a novel stemness arm, a

  15. Heavy metal contamination and its indexing approach for groundwater of Goa mining region, India

    NASA Astrophysics Data System (ADS)

    Singh, Gurdeep; Kamal, Rakesh Kant

    2016-06-01

    The objective of the study is to reveal the seasonal variations in the groundwater quality with respect to heavy metal contamination. To get the extent of the heavy metals contamination, groundwater samples were collected from 45 different locations in and around Goa mining area during the monsoon and post-monsoon seasons. The concentration of heavy metals, such as lead, copper, manganese, zinc, cadmium, iron, and chromium, were determined using atomic absorption spectrophotometer. Most of the samples were found within limit except for Fe content during the monsoon season at two sampling locations which is above desirable limit, i.e., 300 µg/L as per Indian drinking water standard. The data generated were used to calculate the heavy metal pollution index (HPI) for groundwater. The mean values of HPI were 1.5 in the monsoon season and 2.1 in the post-monsoon season, and these values are well below the critical index limit of 100.
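
    The abstract does not state how the HPI was computed. The sketch below assumes the commonly cited weighted-average formulation, in which each metal's sub-index is Q = 100 × |M − I| / (S − I) (M measured, S standard, I ideal concentration), its unit weight is W = 1/S, and HPI = Σ(W·Q) / ΣW. The function name and all sample concentrations are hypothetical and are not values from the study.

```python
def heavy_metal_pollution_index(metals):
    """Weighted-average HPI over a list of (measured, standard, ideal) tuples.

    All concentrations share one unit (here µg/L). The weight of each metal
    is assumed to be the reciprocal of its standard value.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for measured, standard, ideal in metals:
        weight = 1.0 / standard
        sub_index = 100.0 * abs(measured - ideal) / (standard - ideal)
        weighted_sum += weight * sub_index
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical sample: (measured, standard, ideal) in µg/L, not study data.
sample = [
    (30.0, 300.0, 0.0),   # a metal measured at one tenth of its standard
    (5.0, 50.0, 0.0),     # another metal measured at one tenth of its standard
]
print(f"HPI = {heavy_metal_pollution_index(sample):.1f}")
```

    Under this formulation a sample in which every metal sits exactly at its standard value scores 100, which is consistent with the critical index limit of 100 quoted in the abstract.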

  16. [Combined approach to the assessment of new forms of work organization at Kuzbass coal mines].

    PubMed

    Davydova, N N; Diatlova, L A

    1991-02-01

    A physiological and hygienic assessment was made of working conditions, the difficulty and intensity of labour, miners' fatigue during the work shift, health status indices, and the sociopsychological climate under the new team form of labour organization for miners. The transition to the contract and piece-rate and premium system of labour organization was found to lead to higher labour productivity, longer use of mining equipment, and stability of the collective. At the same time, however, working conditions deteriorate and the work becomes more difficult and intense, which leads to more rapid development of fatigue during the work shift and to chronic morbidity. Prophylaxis should be aimed at rationalizing work and rest regimes, normalizing the psychological climate in brigades, and strengthening treatment and prophylactic work. PMID:2055504

  17. Selection Effects in Identifying Magnetic Clouds and the Importance of the Closest Approach Parameter

    NASA Technical Reports Server (NTRS)

    Lepping, R. P.; Wu, Chin-Chun

    2010-01-01

    This study is motivated by the unusually low number of magnetic clouds (MCs) that are strictly identified within interplanetary coronal mass ejections (ICMEs), as observed at 1 AU; this is usually estimated to be around 30% or lower. But a looser definition of MCs may significantly increase this percentage. Another motivation is the unexpected shape of the occurrence distribution of the observers' "closest approach distances" (measured from a MC's axis, and called CA) which drops off somewhat rapidly as |CA| (in % of MC radius) approaches 100%, based on earlier studies. We suggest, for various geometrical and physical reasons, that the |CA|-distribution should be somewhere between a uniform one and the one actually observed, and therefore the 30% estimate should be higher. So we ask, When there is a failure to identify a MC within an ICME, is it occasionally due to a large |CA| passage, making MC identification more difficult, i.e., is it due to an event selection effect? In attempting to answer this question we examine WIND data to obtain an accurate distribution of the number of MCs vs. |CA| distance, whether the event is ICME-related or not, where initially a large number of cases (N=98) are considered. This gives a frequency distribution that is far from uniform, confirming earlier studies. This, along with the fact that there are many ICME identification parameters that do not depend on |CA|, suggests that an MC event selection effect may indeed explain at least part of the low ratio of (No. MCs)/(No. ICMEs). We also show that there is an acceptable geometrical and physical consistency in the relationships for both average "normalized" magnetic field intensity change and field direction change vs. |CA| within a MC, suggesting that our estimates of |CA|, B(sub 0) (magnetic field intensity on the axis), and choice of a proper "cloud coordinate" system (all needed in the analysis) are acceptably accurate. Therefore the MC fitting model (Lepping et al., 1990) is

  18. Selection effects in identifying magnetic clouds and the importance of the closest approach parameter

    NASA Astrophysics Data System (ADS)

    Lepping, R. P.; Wu, C.-C.

    2010-08-01

    This study is motivated by the unusually low number of magnetic clouds (MCs) that are strictly identified within interplanetary coronal mass ejections (ICMEs), as observed at 1 AU; this is usually estimated to be around 30% or lower. But a looser definition of MCs may significantly increase this percentage. Another motivation is the unexpected shape of the occurrence distribution of the observers' "closest approach distances" (measured from a MC's axis, and called CA) which drops off somewhat rapidly as |CA| (in % of MC radius) approaches 100%, based on earlier studies. We suggest, for various geometrical and physical reasons, that the |CA|-distribution should be somewhere between a uniform one and the one actually observed, and therefore the 30% estimate should be higher. So we ask, When there is a failure to identify a MC within an ICME, is it occasionally due to a large |CA| passage, making MC identification more difficult, i.e., is it due to an event selection effect? In attempting to answer this question we examine WIND data to obtain an accurate distribution of the number of MCs vs. |CA| distance, whether the event is ICME-related or not, where initially a large number of cases (N=98) are considered. This gives a frequency distribution that is far from uniform, confirming earlier studies. This, along with the fact that there are many ICME identification parameters that do not depend on |CA|, suggests that an MC event selection effect may indeed explain at least part of the low ratio of (No. MCs)/(No. ICMEs). We also show that there is an acceptable geometrical and physical consistency in the relationships for both average "normalized" magnetic field intensity change and field direction change vs. |CA| within a MC, suggesting that our estimates of |CA|, B0 (magnetic field intensity on the axis), and choice of a proper "cloud coordinate" system (all needed in the analysis) are acceptably accurate. Therefore, the MC fitting model (Lepping et al., 1990) is adequate

  19. A Comprehensive Regression-Based Approach for Identifying Sources of Person Misfit in Typical-Response Measures

    ERIC Educational Resources Information Center

    Ferrando, Pere J.; Lorenzo-Seva, Urbano

    2016-01-01

    This article proposes a general parametric item response theory approach for identifying sources of misfit in response patterns that have been classified as potentially inconsistent by a global person-fit index. The approach, which is based on the weighted least squared regression of the observed responses on the model-expected responses, can be…

  20. Predicting Fish Growth Potential and Identifying Water Quality Constraints: A Spatially-Explicit Bioenergetics Approach

    NASA Astrophysics Data System (ADS)

    Budy, Phaedra; Baker, Matthew; Dahle, Samuel K.

    2011-10-01

    Anthropogenic impairment of water bodies represents a global environmental concern, yet few attempts have successfully linked fish performance to thermal habitat suitability and fewer have distinguished co-varying water quality constraints. We interfaced fish bioenergetics, field measurements, and Thermal Remote Imaging to generate a spatially-explicit, high-resolution surface of fish growth potential, and next employed a structured hypothesis to detect relationships among measures of fish performance and co-varying water quality constraints. Our thermal surface of fish performance captured the amount and spatial-temporal arrangement of thermally-suitable habitat for three focal species in an extremely heterogeneous reservoir, but interpretation of this pattern was initially confounded by seasonal covariation of water residence time and water quality. Subsequent path analysis revealed that in terms of seasonal patterns in growth potential, catfish and walleye responded to temperature, positively and negatively, respectively; crappie and walleye responded to eutrophy (negatively). At the high eutrophy levels observed in this system, some desired fishes appear to suffer from excessive cultural eutrophication within the context of elevated temperatures whereas others appear to be largely unaffected or even enhanced. Our overall findings do not lead to the conclusion that this system is degraded by pollution; however, they do highlight the need to use a sensitive focal species in the process of determining allowable nutrient loading and as integrators of habitat suitability across multiple spatial and temporal scales. We provide an integrated approach useful for quantifying fish growth potential and identifying water quality constraints on fish performance at spatial scales appropriate for whole-system management.

  1. A molecular screening approach to identify and characterize inhibitors of glioblastoma stem cells.

    PubMed

    Visnyei, Koppany; Onodera, Hideyuki; Damoiseaux, Robert; Saigusa, Kuniyasu; Petrosyan, Syuzanna; De Vries, David; Ferrari, Denise; Saxe, Jonathan; Panosyan, Eduard H; Masterman-Smith, Michael; Mottahedeh, Jack; Bradley, Kenneth A; Huang, Jing; Sabatti, Chiara; Nakano, Ichiro; Kornblum, Harley I

    2011-10-01

    Glioblastoma (GBM) is among the most lethal of all cancers. GBM consist of a heterogeneous population of tumor cells among which a tumor-initiating and treatment-resistant subpopulation, here termed GBM stem cells, have been identified as primary therapeutic targets. Here, we describe a high-throughput small molecule screening approach that enables the identification and characterization of chemical compounds that are effective against GBM stem cells. The paradigm uses a tissue culture model to enrich for GBM stem cells derived from human GBM resections and combines a phenotype-based screen with gene target-specific screens for compound identification. We used 31,624 small molecules from 7 chemical libraries that we characterized and ranked based on their effect on a panel of GBM stem cell-enriched cultures and their effect on the expression of a module of genes whose expression negatively correlates with clinical outcome: MELK, ASPM, TOP2A, and FOXM1b. Of the 11 compounds meeting criteria for exerting differential effects across cell types used, 4 compounds showed selectivity by inhibiting multiple GBM stem cells-enriched cultures compared with nonenriched cultures: emetine, n-arachidonoyl dopamine, n-oleoyldopamine (OLDA), and n-palmitoyl dopamine. ChemBridge compounds #5560509 and #5256360 inhibited the expression of the 4 mitotic module genes. OLDA, emetine, and compounds #5560509 and #5256360 were chosen for more detailed study and inhibited GBM stem cells in self-renewal assays in vitro and in a xenograft model in vivo. These studies show that our screening strategy provides potential candidates and a blueprint for lead compound identification in larger scale screens or screens involving other cancer types. PMID:21859839

  2. A molecular screening approach to identify and characterize inhibitors of glioblastoma stem cells

    PubMed Central

    Visnyei, Koppany; Onodera, Hideyuki; Damoiseaux, Robert; Saigusa, Kuniyasu; Petrosyan, Syuzanna; De Vries, David; Ferrari, Denise; Saxe, Jonathan; Panosyan, Eduard H.; Masterman-Smith, Michael; Mottahedeh, Jack; Bradley, Kenneth A.; Huang, Jing; Sabatti, Chiara; Nakano, Ichiro; Kornblum, Harley I.

    2011-01-01

    Glioblastoma multiforme (GBM) is amongst the most lethal of all cancers. GBM consist of a heterogeneous population of tumor cells amongst which a tumor initiating and treatment-resistant subpopulation, here termed GBM stem cells (GSC), have been identified as primary therapeutic targets. Here, we describe a high-throughput small molecule screening approach that enables the identification and characterization of chemical compounds that are effective against GSC. The paradigm uses a tissue culture model to enrich for GSC derived from human GBM resections and combines a phenotype-based screen with gene target-specific screens for compound identification. We used 31,624 small molecules from seven chemical libraries that we characterized and ranked based on their effect on a panel of GSC-enriched cultures as well as their effect on the expression of a module of genes whose expression negatively correlates with clinical outcome: MELK, ASPM, TOP2A and FOXM1b. Of the 11 compounds meeting criteria for exerting differential effects across cell types used, 4 compounds demonstrated selectivity by inhibiting multiple GSC-enriched cultures compared to non-enriched cultures: Emetine, N-Arachidonoyldopamine (NADA), N-Oleoyldopamine (OLDA), and N-Palmitoyldopamine (PALDA). ChemBridge compounds #5560509 and #5256360 inhibited the expression of the 4 mitotic module genes. OLDA, Emetine, and compounds #5560509 and #5256360 were chosen for more detailed study and inhibited GSC in self-renewal assays in vitro and in a xenograft model in vivo. These studies demonstrate that our screening strategy provides potential candidates as well as a blueprint for lead compound identification in larger scale screens or screens involving other cancer types. PMID:21859839

  3. A Machine Learning Approach To Identify Hydrogenosomal Proteins in Trichomonas vaginalis

    PubMed Central

    Burstein, David; Gould, Sven B.; Zimorski, Verena; Kloesges, Thorsten; Kiosse, Fuat; Major, Peter; Martin, William F.; Pupko, Tal

    2012-01-01

    The protozoan parasite Trichomonas vaginalis is the causative agent of trichomoniasis, the most widespread nonviral sexually transmitted disease in humans. It possesses hydrogenosomes—anaerobic mitochondria that generate H2, CO2, and acetate from pyruvate while converting ADP to ATP via substrate-level phosphorylation. T. vaginalis hydrogenosomes lack a genome and translation machinery; hence, they import all their proteins from the cytosol. To date, however, only 30 imported proteins have been shown to localize to the organelle. A total of 226 nuclear-encoded proteins inferred from the genome sequence harbor a characteristic short N-terminal presequence, reminiscent of mitochondrial targeting peptides, which is thought to mediate hydrogenosomal targeting. Recent studies suggest, however, that the presequences might be less important than previously thought. We sought to identify new hydrogenosomal proteins within the 59,672 annotated open reading frames (ORFs) of T. vaginalis, independent of the N-terminal targeting signal, using a machine learning approach. Our training set included 57 gene and protein features determined for all 30 known hydrogenosomal proteins and 576 nonhydrogenosomal proteins. Several classifiers were trained on this set to yield an import score for all proteins encoded by T. vaginalis ORFs, predicting the likelihood of hydrogenosomal localization. The machine learning results were tested through immunofluorescence assay and immunodetection in isolated cell fractions of 14 protein predictions using hemagglutinin constructs expressed under the homologous SCSα promoter in transiently transformed T. vaginalis cells. Localization of 6 of the 10 top predicted hydrogenosome-localized proteins was confirmed, and two of these were found to lack an obvious N-terminal targeting signal. PMID:22140228

  4. A spatial modeling approach to identify potential butternut restoration sites in Mammoth Cave National Park

    USGS Publications Warehouse

    Thompson, L.M.; Van Manen, F.T.; Schlarbaum, S.E.; DePoy, M.

    2006-01-01

    Incorporation of disease resistance is nearly complete for several important North American hardwood species threatened by exotic fungal diseases. The next important step toward species restoration would be to develop reliable tools to delineate ideal restoration sites on a landscape scale. We integrated spatial modeling and remote sensing techniques to delineate potential restoration sites for Butternut (Juglans cinerea L.) trees, a hardwood species being decimated by an exotic fungus, in Mammoth Cave National Park (MCNP), Kentucky. We first developed a multivariate habitat model to determine optimum Butternut habitats within MCNP. Habitat characteristics of 54 known Butternut locations were used in combination with eight topographic and land use data layers to calculate an index of habitat suitability based on Mahalanobis distance (D²). We used a bootstrapping technique to test the reliability of model predictions. Based on a threshold value for the D² statistic, 75.9% of the Butternut locations were correctly classified, indicating that the habitat model performed well. Because Butternut seedlings require extensive amounts of sunlight to become established, we used canopy cover data to refine our delineation of favorable areas for Butternut restoration. Areas with the most favorable conditions to establish Butternut seedlings were limited to 291.6 ha. Our study provides a useful reference on the amount and location of favorable Butternut habitat in MCNP and can be used to identify priority areas for future Butternut restoration. Given the availability of relevant habitat layers and accurate location records, our approach can be applied to other tree species and areas. © 2006 Society for Ecological Restoration International.
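
    The habitat index described above rests on the squared Mahalanobis distance between a candidate site and the distribution of known sites. The following is a minimal sketch of that calculation, not the authors' model: it assumes only two habitat variables (the study used eight data layers), and the function name and the sample values are made up for illustration.

```python
from statistics import mean

def mahalanobis_d2(point, reference):
    """Squared Mahalanobis distance of a 2-feature point from reference rows."""
    xs = [r[0] for r in reference]
    ys = [r[1] for r in reference]
    mx, my = mean(xs), mean(ys)
    n = len(reference)
    # Sample covariance matrix entries (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # (dx, dy) * inverse(cov) * (dx, dy)^T, written out for a 2x2 matrix.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical known sites described by two invented habitat variables.
known_sites = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(mahalanobis_d2((1, 1), known_sites))  # site at the centroid
print(mahalanobis_d2((3, 1), known_sites))  # site farther from the centroid
```

    Smaller distances indicate sites more similar to the known locations; a threshold on the distance, as in the abstract, would then split sites into suitable and unsuitable.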

  5. Predicting fish growth potential and identifying water quality constraints: a spatially-explicit bioenergetics approach.

    PubMed

    Budy, Phaedra; Baker, Matthew; Dahle, Samuel K

    2011-10-01

    Anthropogenic impairment of water bodies represents a global environmental concern, yet few attempts have successfully linked fish performance to thermal habitat suitability and fewer have distinguished co-varying water quality constraints. We interfaced fish bioenergetics, field measurements, and Thermal Remote Imaging to generate a spatially-explicit, high-resolution surface of fish growth potential, and next employed a structured hypothesis to detect relationships among measures of fish performance and co-varying water quality constraints. Our thermal surface of fish performance captured the amount and spatial-temporal arrangement of thermally-suitable habitat for three focal species in an extremely heterogeneous reservoir, but interpretation of this pattern was initially confounded by seasonal covariation of water residence time and water quality. Subsequent path analysis revealed that in terms of seasonal patterns in growth potential, catfish and walleye responded to temperature, positively and negatively, respectively; crappie and walleye responded to eutrophy (negatively). At the high eutrophy levels observed in this system, some desired fishes appear to suffer from excessive cultural eutrophication within the context of elevated temperatures whereas others appear to be largely unaffected or even enhanced. Our overall findings do not lead to the conclusion that this system is degraded by pollution; however, they do highlight the need to use a sensitive focal species in the process of determining allowable nutrient loading and as integrators of habitat suitability across multiple spatial and temporal scales. We provide an integrated approach useful for quantifying fish growth potential and identifying water quality constraints on fish performance at spatial scales appropriate for whole-system management.

  6. Employment among Working-Age Adults with Multiple Sclerosis: A Data-Mining Approach to Identifying Employment Interventions

    ERIC Educational Resources Information Center

    Bishop, Malachy; Chan, Fong; Rumrill, Phillip D., Jr.; Frain, Michael P.; Tansey, Timothy N.; Chiu, Chung-Yi; Strauser, David; Umeasiegbu, Veronica I.

    2015-01-01

    Purpose: To examine demographic, functional, and clinical multiple sclerosis (MS) variables affecting employment status in a national sample of adults with MS in the United States. Method: The sample included 4,142 working-age (20-65 years) Americans with MS (79.1% female) who participated in a national survey. The mean age of participants was…

  7. Genetic Programming and Frequent Itemset Mining to Identify Feature Selection Patterns of iEEG and fMRI Epilepsy Data

    PubMed Central

    Smart, Otis; Burrell, Lauren

    2014-01-01

    Pattern classification for intracranial electroencephalogram (iEEG) and functional magnetic resonance imaging (fMRI) signals has furthered epilepsy research toward understanding the origin of epileptic seizures and localizing dysfunctional brain tissue for treatment. Prior research has demonstrated that implicitly selecting features with a genetic programming (GP) algorithm more effectively determined the proper features to discern biomarker and non-biomarker interictal iEEG and fMRI activity than conventional feature selection approaches. However, for each of the iEEG and fMRI modalities, it is still uncertain whether the stochastic properties of indirect feature selection with a GP yield (a) consistent results within a patient data set and (b) features that are specific or universal across multiple patient data sets. We examined the reproducibility of implicitly selecting features to classify interictal activity using a GP algorithm by performing several selection trials and subsequent frequent itemset mining (FIM) for separate iEEG and fMRI epilepsy patient data. We observed within-subject consistency and across-subject variability with some small similarity for selected features, indicating a clear need for patient-specific features and possible need for patient-specific feature selection and/or classification. For the fMRI, using nearest-neighbor classification and 30 GP generations, we obtained over 60% median sensitivity and over 60% median selectivity. For the iEEG, using nearest-neighbor classification and 30 GP generations, we obtained over 65% median sensitivity and over 65% median selectivity except for one patient. PMID:25580059
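
    Frequent itemset mining over repeated selection trials can be illustrated with a small sketch. It treats each trial's selected features as one transaction and reports every feature combination whose support meets a threshold. This is a brute-force enumeration for illustration only, not the authors' procedure; the feature names and the threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets seen in >= min_support of transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every non-empty subset of this transaction once.
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    total = len(transactions)
    return {combo: n / total for combo, n in counts.items() if n / total >= min_support}

# Hypothetical features selected in four independent trials.
trials = [
    {"power", "entropy"},
    {"power", "entropy", "variance"},
    {"power", "variance"},
    {"power", "entropy"},
]
for itemset, support in sorted(frequent_itemsets(trials, 0.75).items()):
    print(itemset, support)
```

    Enumerating all subsets grows exponentially with transaction size, so this form only suits small feature sets; dedicated algorithms prune the search instead.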

  8. A systems biology and proteomics-based approach identifies SRC and VEGFA as biomarkers in risk factor mediated coronary heart disease.

    PubMed

    V, Alexandar; Nayar, Pradeep G; Murugesan, R; S, Shajahan; Krishnan, Jayalakshmi; Ahmed, Shiek S S J

    2016-07-19

    Coronary heart disease (CHD) is the most common cause of death worldwide. The burden of CHD increases with risk factors such as smoking, hypertension, obesity and diabetes. Several studies have demonstrated the association of these classical risk factors with CHD. However, the mechanisms of these associations remain largely unclear due to the complexity of disease pathophysiology and the lack of an integrative approach that provides a definite understanding of the molecular linkage. To overcome these problems, we propose a novel systems biology approach that relates causative genes, interactomes and pathways to elucidate the risk factors mediating the molecular mechanisms and biomarkers for feasible diagnosis. The literature was mined to retrieve the causative genes of each risk factor and CHD to construct protein interactomes. The interactomes were examined to identify 298 common molecular signatures. The common signatures were mapped to the tissue network to synthesize a sub-network consisting of 82 proteins. Further, the dissection of the sub-network provides functional modules representing a diverse range of molecular functions, including the AKT/PI3K, MAPK and Wnt pathways. Also, the prioritization of functional modules identifies SRC, VEGFA and HIF1A as potential candidate markers. Further, we validate these candidates with the existing markers CRP, NOS3 and VCAM1 in the serum of 63 individuals, 33 with CHD and 30 controls, using ELISA. SRC, VEGFA, HIF1A, CRP and NOS3 were significantly altered in patients compared to controls. These results support the utility of these candidate markers for the diagnosis of CHD. Overall, our molecular observations indicate the influence of risk factors in the pathophysiology of CHD and identify serum markers for diagnosis. PMID:27279347

  9. A multivariate and stochastic approach to identify key variables to rank dairy farms on profitability.

    PubMed

    Atzori, A S; Tedeschi, L O; Cannas, A

    2013-05-01

    The economic efficiency of dairy farms is the main goal of farmers. The objective of this work was to use routinely available information at the dairy farm level to develop an index of profitability to rank dairy farms and to assist the decision-making process of farmers to increase the economic efficiency of the entire system. A stochastic modeling approach was used to study the relationships between inputs and profitability (i.e., income over feed cost; IOFC) of dairy cattle farms. The IOFC was calculated as: milk revenue + value of male calves + culling revenue - herd feed costs. Two databases were created. The first one was a development database, which was created from technical and economic variables collected in 135 dairy farms. The second one was a synthetic database (sDB) created from 5,000 synthetic dairy farms using the Monte Carlo technique and based on the characteristics of the development database data. The sDB was used to develop a ranking index as follows: (1) principal component analysis (PCA), excluding IOFC, was used to identify principal components (sPC); and (2) coefficient estimates of a multiple regression of the IOFC on the sPC were obtained. Then, the eigenvectors of the sPC were used to compute the principal component values for the original 135 dairy farms that were used with the multiple regression coefficient estimates to predict IOFC (dRI; ranking index from development database). The dRI was used to rank the original 135 dairy farms. The PCA explained 77.6% of the sDB variability and 4 sPC were selected. The sPC were associated with herd profile, milk quality and payment, poor management, and reproduction based on the significant variables of the sPC. The mean IOFC in the sDB was 0.1377 ± 0.0162 euros per liter of milk (€/L). The dRI explained 81% of the variability of the IOFC calculated for the 135 original farms. When the number of farms below and above 1 standard deviation (SD) of the dRI were calculated, we found that 21
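
    The IOFC definition given in the abstract can be written out directly. The sketch below covers only that formula and a simple sort; it does not reproduce the PCA-based ranking index (dRI). Dividing by milk volume to express IOFC per liter is an assumption based on the reported unit, and the farm records and field names are invented.

```python
def iofc(milk_revenue, male_calf_value, culling_revenue, herd_feed_costs):
    """Income over feed cost, following the formula stated in the abstract."""
    return milk_revenue + male_calf_value + culling_revenue - herd_feed_costs

def rank_by_iofc_per_liter(farms):
    """Return farm names sorted from highest to lowest IOFC per liter of milk."""
    scores = {
        name: iofc(
            f["milk_revenue"], f["male_calf_value"],
            f["culling_revenue"], f["herd_feed_costs"],
        ) / f["milk_liters"]
        for name, f in farms.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical farms (euros per year, liters per year).
farms = {
    "farm_a": {"milk_revenue": 400000, "male_calf_value": 10000,
               "culling_revenue": 15000, "herd_feed_costs": 285000,
               "milk_liters": 1000000},
    "farm_b": {"milk_revenue": 200000, "male_calf_value": 5000,
               "culling_revenue": 5000, "herd_feed_costs": 160000,
               "milk_liters": 500000},
}
print(rank_by_iofc_per_liter(farms))
```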

  11. Novel approach to identifying the hepatitis B virus pre-S deletions associated with hepatocellular carcinoma

    PubMed Central

    Zhao, Zhi-Mei; Jin, Yan; Gan, Yu; Zhu, Yu; Chen, Tao-Yang; Wang, Jin-Bing; Sun, Yan; Cao, Zhi-Gang; Qian, Geng-Sun; Tu, Hong

    2014-01-01

    AIM: To develop a novel non-sequencing method for the detection of hepatitis B virus (HBV) pre-S deletion mutants in HBV carriers. METHODS: The entire region of HBV pre-S1 and pre-S2 was amplified by polymerase chain reaction (PCR). The size of PCR products was subsequently determined by capillary gel electrophoresis (CGE). CGE was carried out in a PACE-MDQ instrument equipped with a UV detector set at 254 nm. The samples were separated in 50 μm ID eCAP Neutral Coated Capillaries using a voltage of 6 kV for 30 min. Data acquisition and analysis were performed using the 32 Karat Software. A total of 114 DNA clones containing different sizes of the HBV pre-S gene were used to determine the accuracy of the CGE method. One hundred and fifty-seven hepatocellular carcinoma (HCC) and 160 non-HCC patients were recruited into the study to assess the association between HBV pre-S deletion and HCC by using the newly-established CGE method. Nine HCC cases with HBV pre-S deletion at the diagnosis year were selected to conduct a longitudinal observation using serial serum samples collected 2-9 years prior to HCC diagnosis. RESULTS: CGE allowed the separation of PCR products differing in size > 3 bp and was able to identify 10% of the deleted DNA in a background of wild-type DNA. The accuracy rate of CGE-based analysis was 99.1% compared with the clone sequencing results. Using this assay, pre-S deletion was more frequently found in HCC patients than in non-HCC controls (47.1% vs 28.1%, P < 0.001). Interestingly, the increased risk of HCC was mainly contributed by the short deletion of pre-S. While the deletion ≤ 99 bp was associated with a 2.971-fold increased risk of HCC (95%CI: 1.723-5.122, P < 0.001), large deletion (> 99 bp) did not show any association with HCC (P = 0.918, OR = 0.966, 95%CI: 0.501-1.863). Of the 9 patients who carried pre-S deletions at the stage of HCC, 88.9% (8/9) had deletions 2-5 years prior to HCC, while only 44.4% (4/9) contained such deletions 6

  12. An Unsupervised Opinion Mining Approach for Japanese Weblog Reputation Information Using an Improved SO-PMI Algorithm

    NASA Astrophysics Data System (ADS)

    Wang, Guangwei; Araki, Kenji

    In this paper, we propose an improved SO-PMI (Semantic Orientation Using Pointwise Mutual Information) algorithm for use in Japanese Weblog Opinion Mining. SO-PMI is an unsupervised approach proposed by Turney that has been shown to work well for English. When this algorithm was applied naively to Japanese, most phrases, whether positive or negative in meaning, received a negative SO. To deal with this slanting phenomenon, we propose three improvements: to expand the reference words to sets of words, to introduce a balancing factor and to detect neutral expressions. In our experiments, the proposed improvements obtained a well-balanced result: both positive and negative accuracy exceeded 62% when evaluated on 1,200 opinion sentences sampled from three different domains (reviews of Electronic Products, Cars and Travels from Kakaku.com). In a comparative experiment on the same corpus, a supervised approach (SA-Demo) achieved a very similar accuracy to our method. This shows that our proposed approach effectively adapted SO-PMI for Japanese, and it also shows the generality of SO-PMI.
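
    A rough sketch of how an SO-PMI score can be computed from co-occurrence counts follows. The base ratio follows Turney's definition as it is commonly stated (hits near a positive reference versus hits near a negative reference, normalized by the reference word frequencies). The `balance` multiplier and the `neutral_band` threshold are hypothetical stand-ins for the balancing factor and neutral detection mentioned in the abstract, whose exact form is not given there; all counts are invented.

```python
import math

def so_pmi(near_pos, near_neg, hits_pos, hits_neg, balance=1.0, smoothing=0.01):
    """Semantic orientation from hit counts; positive values lean positive.

    near_pos / near_neg: co-occurrence counts of the phrase with the positive /
    negative reference words. hits_pos / hits_neg: counts of the reference
    words alone. balance is an assumed corrective multiplier.
    """
    ratio = ((near_pos + smoothing) * hits_neg) / ((near_neg + smoothing) * hits_pos)
    return math.log2(balance * ratio)

def label(score, neutral_band=0.5):
    """Map a score to a class, treating small magnitudes as neutral (assumed rule)."""
    if abs(score) <= neutral_band:
        return "neutral"
    return "positive" if score > 0 else "negative"

score = so_pmi(near_pos=40, near_neg=10, hits_pos=1000, hits_neg=1000, smoothing=0)
print(score, label(score))
```

    With a corpus in which most phrases score negative, a `balance` value above 1 would shift scores upward; how the paper actually derives its factor is not stated in the abstract.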

  13. Identifying Behavioral Barriers to Campus Sustainability: A Multi-Method Approach

    ERIC Educational Resources Information Center

    Horhota, Michelle; Asman, Jenni; Stratton, Jeanine P.; Halfacre, Angela C.

    2014-01-01

    Purpose: The purpose of this paper is to assess the behavioral barriers to sustainable action in a campus community. Design/methodology/approach: This paper reports three different methodological approaches to the assessment of behavioral barriers to sustainable actions on a college campus. Focus groups and surveys were used to assess campus…

  14. Modeling Approach/Strategy for Corrective Action Unit 97, Yucca Flat and Climax Mine, Revision 0

    SciTech Connect

    Janet Willie

    2003-08-01

    The objectives of the UGTA corrective action strategy are to predict the location of the contaminant boundary for each CAU, develop and implement a corrective action, and close each CAU. The process for achieving this strategy includes modeling to define the maximum extent of contaminant transport within a specified time frame. Modeling is a method of forecasting how the hydrogeologic system, including the underground test cavities, will behave over time with the goal of assessing the migration of radionuclides away from the cavities and chimneys. Use of flow and transport models to achieve the objectives of the corrective action strategy is specified in the FFACO. In the Yucca Flat/Climax Mine system, radionuclide migration will be governed by releases from the cavities and chimneys, and transport in alluvial aquifers, fractured and partially fractured volcanic rock aquifers and aquitards, the carbonate aquifers, and in intrusive units. Additional complexity is associated with multiple faults in Yucca Flat and the need to consider reactive transport mechanisms that both reduce and enhance the mobility of radionuclides. A summary of the data and information that form the technical basis for the model is provided in this document.

  15. Geologic considerations in underground coal mining system design

    NASA Technical Reports Server (NTRS)

    Camilli, F. A.; Maynard, D. P.; Mangolds, A.; Harris, J.

    1981-01-01

    Geologic characteristics of coal resources which may impact new extraction technologies are identified and described to aid system designers and planners in their task of designing advanced coal extraction systems for the central Appalachian region. These geologic conditions are then organized into a matrix identified as the baseline mine concept. A sample region, eastern Kentucky, is analyzed using both the developed baseline mine concept and the traditional geologic investigative approach.

  16. A Hybrid Knowledge-Based and Data-Driven Approach to Identifying Semantically Similar Concepts

    PubMed Central

    Pivovarov, Rimma; Elhadad, Noémie

    2012-01-01

    An open research question when leveraging ontological knowledge is when to treat different concepts separately from each other and when to aggregate them. For instance, concepts for the terms "paroxysmal cough" and "nocturnal cough" might be aggregated in a kidney disease study, but should be left separate in a pneumonia study. Determining whether two concepts are similar enough to be aggregated can help build better datasets for data mining purposes and avoid signal dilution. Quantifying the similarity among concepts is a difficult task, however, in part because such similarity is context-dependent. We propose a comprehensive method, which computes a similarity score for a concept pair by combining data-driven and ontology-driven knowledge. We demonstrate our method on concepts from SNOMED-CT and on a corpus of clinical notes of patients with chronic kidney disease. By combining information from usage patterns in clinical notes and from ontological structure, the method can prune out concepts that are simply related from those which are semantically similar. When evaluated against a list of concept pairs annotated for similarity, our method reaches an AUC (area under the curve) of 92%. PMID:22289420

  17. Identifying Creatively Gifted Students: Necessity of a Multi-Method Approach

    ERIC Educational Resources Information Center

    Ambrose, Laura; Machek, Greg R.

    2015-01-01

    The process of identifying students as creatively gifted provides numerous challenges for educators. Although many schools assess for creativity in identifying students for gifted and talented services, the relationship between creativity and giftedness is often not fully understood. This article reviews commonly used methods of creativity…

  18. Integrated approach to assess the environmental impact of mining activities: estimation of the spatial distribution of soil contamination (Panasqueira mining area, Central Portugal).

    PubMed

    Candeias, Carla; Ávila, Paula F; Ferreira da Silva, Eduardo; Teixeira, João Paulo

    2015-03-01

    Through the years, mining and beneficiation processes in Panasqueira Sn-W mine (Central Portugal) produced large amounts of As-rich mine wastes laid up in huge tailings and open-air impoundments (Barroca Grande and Rio tailings) that are the main source of pollution in the surrounding area once they are exposed to the weathering conditions leading to the formation of acid mine drainage (AMD) and consequently to the contamination of the surrounding environments, particularly soils. The active mine started the exploration during the nineteenth century. This study aims to look at the extension of the soil pollution due to mining activities and tailing erosion by combining data on the degree of soil contamination that allows a better understanding of the dynamics inherent to leaching, transport, and accumulation of some potential toxic elements in soil and their environmental relevance. Soil samples were collected in the surrounding soils of the mine, were digested in aqua regia, and were analyzed for 36 elements by inductively coupled plasma mass spectrometry (ICP-MS). Selected results are that (a) an association of elements like Ag, As, Bi, Cd, Cu, W, and Zn strongly correlated and controlled by the local sulfide mineralization geochemical signature was revealed; (b) the global area discloses significant concentrations of As, Bi, Cd, and W linked to the exchangeable and acid-soluble bearing phases; and (c) wind promotes the mechanical dispersion of the rejected materials, from the milled waste rocks and the mineral processing plant, with subsequent deposition on soils and waters. Arsenic- and sulfide-related heavy metals (such as Cu and Cd) are associated to the fine materials that are transported in suspension by surface waters or associated to the acidic waters, draining these sites and contaminating the local soils. Part of this fraction, especially for As, Cd, and Cu, is temporally retained in solid phases by precipitation of soluble secondary minerals (through

  20. Ongoing soil arsenic exposure of children living in an historical gold mining area in regional Victoria, Australia: Identifying risk factors associated with uptake

    NASA Astrophysics Data System (ADS)

    Martin, Rachael; Dowling, Kim; Pearce, Dora; Bennett, John; Stopic, Attila

    2013-11-01

    Elevated levels of arsenic have been observed in some mine wastes and soils around historical gold mining areas in regional Victoria, Australia. Arsenic uptake from soil by children living in these areas has been demonstrated using toenail arsenic concentration as a biomarker, with evidence of some systemic absorption associated with periodic exposures. We conducted a follow-up study to ascertain if toenail arsenic concentrations, and risk factors for exposure, had changed over a five year period in an historical gold mining region in western regional Victoria, Australia. Residential soil samples (N = 14) and toenail clippings (N = 24) were analyzed for total arsenic using instrumental neutron activation analysis, including 19 toenail clippings samples that were obtained from the same study cohort in 2006. Toenail arsenic concentrations in 2011 (geometric mean, 0.171 μg/g; range, 0.030-0.540 μg/g) were significantly lower than those in 2006 (geometric mean, 0.464 μg/g; range, 0.150-2.10 μg/g; p < 0.001). However, toenail arsenic concentrations were again correlated with soil arsenic levels (Spearman's rho = 0.630; p = 0.001). Spending time outdoors more often and for longer periods correlates with increased arsenic uptake (p < 0.05). Mining-influenced residential soils represent a long-term continuing source for potential arsenic exposure for children living in this historical mining region.

  1. LeadMine: a grammar and dictionary driven approach to entity recognition

    PubMed Central

    2015-01-01

    Background: Chemical entity recognition has traditionally been performed by machine learning approaches. Here we describe an approach using grammars and dictionaries. This approach has the advantage that the entities found can be directly related to a given grammar or dictionary, which allows the type of an entity to be known and, if an entity is misannotated, indicates which resource should be corrected. As recognition is driven by what is expected, if spelling errors occur, they can be corrected. Correcting such errors is highly useful when attempting to look up an entity in a database or, in the case of chemical names, converting them to structures. Results: Our system uses a mixture of expertly curated grammars and dictionaries, as well as dictionaries automatically derived from public resources. We show that the heuristics developed to filter our dictionary of trivial chemical names (from PubChem) yield a better performing dictionary than the previously published Jochem dictionary. Our final system performs post-processing steps to modify the boundaries of entities and to detect abbreviations. These steps are shown to significantly improve performance (2.6% and 4.0% F1-score respectively). Our complete system, with incremental post-BioCreative workshop improvements, achieves 89.9% precision and 85.4% recall (87.6% F1-score) on the CHEMDNER test set. Conclusions: Grammar and dictionary approaches can produce results at least as good as the current state of the art in machine learning approaches. While machine learning approaches are commonly thought of as "black box" systems, our approach directly links the output entities to the input dictionaries and grammars. Our approach also allows correction of errors in detected entities, which can assist with entity resolution. PMID:25810776
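
    The reported F1-score is the harmonic mean of precision and recall, so it can be checked from the two figures given in the abstract. The short sketch below does only that arithmetic; the function name is chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported for the CHEMDNER test set.
print(round(f1_score(0.899, 0.854), 3))  # → 0.876
```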

  2. Data Mining Approach for Evaluating Vegetation Dynamics in Earth System Models (ESMs) Using Satellite Remote Sensing Products

    NASA Astrophysics Data System (ADS)

    Shu, S.; Hoffman, F. M.; Kumar, J.; Hargrove, W. W.; Jain, A. K.

    2014-12-01

    biome types. However, Mapcurves results showed a relatively low goodness of fit score for modeled phenology projected onto observations. This study demonstrates the utility of a data mining approach for cross-validation of observations and evaluation of model performance.

  3. Identifying a Diverse Student Body: Selective College Admissions and Alternative Approaches

    ERIC Educational Resources Information Center

    Bial, Deborah; Rodriguez, Alba

    2007-01-01

    This chapter explores alternative solutions for selective institutions of higher education to reach beyond their traditional admission measures and identify diverse students who might otherwise not be selected by traditional admission criteria.

  4. Using a watershed-centric approach to identify potentially impacted beaches

    EPA Science Inventory

    Beaches can be affected by a variety of contaminants. Of particular concern are beaches impacted by human fecal contamination and urban runoff. This poster demonstrates a methodology to identify potentially impacted beaches using Geographic Information Systems (GIS). Since h...

  5. A Spatio-temporal Data Mining Approach to Global scale Burned Area Monitoring

    NASA Astrophysics Data System (ADS)

    Mithal, V.; Khandelwal, A.; Nayak, G.; Kumar, V.; Nemani, R. R.; Oza, N.

    2014-12-01

    We present a novel technique for burned area mapping in forests using the Enhanced Vegetation Index (EVI) from the MODIS 16-day Level 3 1km Vegetation Indices (MOD13A2) and the Active Fire (AF) from the MODIS 8-day Level 3 1km Thermal Anomalies and Fire products (MOD14A2). The proposed method leverages the spatial and temporal co-occurrence of thermal anomalies and vegetation loss caused by forest fires to detect burned areas. Our approach derives features from the Enhanced Vegetation Index that target locations which show an abrupt change in their vegetation time series and take at least several months to recover. One unique aspect of our approach is that it uses data from multiple months around the fire event and is therefore more robust to issues in data quality. Comparison with other burned area products shows that our approach detects several large previously undetected burned areas across multiple geographical regions. In particular, we found that our approach detects several large burned regions in the tropical forests of Indonesia and South America that had been missed by the state-of-the-art burned area approaches. For example, using our approach in Indonesia we discovered that the state-of-the-art MODIS Burned area product had missed around 20,000 sq. km. of burned area (nearly as much burned area as it has reported). We show that all these previously unreported burned areas detected by our approach correspond to significant fires in which vegetation suffered a large, abrupt loss at the time of the fire event and took at least several months to recover to its normal level. To evaluate these burned areas we compared the Landsat-based composites before and after the date of the event. Our Landsat analysis shows that the burned areas detected by the proposed approach are true burns with a very small error of commission. We believe our work has the potential to provide a scalable approach to global forest monitoring as well as reduce the

  6. A data mining approach for grouping and analyzing trajectories of care using claim data: the example of breast cancer

    PubMed Central

    2013-01-01

    Background: With the increasing burden of chronic diseases, analyzing and understanding trajectories of care is essential for efficient planning and fair allocation of resources. We propose an approach based on mining claim data to support the exploration of trajectories of care. Methods: A clustering of trajectories of care for breast cancer was performed with Formal Concept Analysis. We exported data from the French national casemix system, covering all inpatient admissions in the country. Patients admitted for breast cancer surgery in 2009 were selected and their trajectory of care was recomposed with all hospitalizations occurring within one year after surgery. The main diagnoses of hospitalizations were used to produce morbidity profiles. Cumulative hospital costs were computed for each profile. Results: 57,552 patients were automatically grouped into 19 classes. The resulting profiles were clinically meaningful and economically relevant. The mean cost per trajectory was 9,600€. Severe conditions were generally associated with higher costs. The lowest costs (6,957€) were observed for patients with in situ carcinoma of the breast, the highest for patients hospitalized for palliative care (26,139€). Conclusions: Formal Concept Analysis can be applied to claim data to produce an automatic classification of care trajectories. This flexible approach takes advantage of routinely collected data and can be used to set up cost-of-illness studies. PMID:24289668

  7. A Bayesian approach to identifying structural nonlinearity using free-decay response: Application to damage detection in composites

    USGS Publications Warehouse

    Nichols, J.M.; Link, W.A.; Murphy, K.D.; Olson, C.C.

    2010-01-01

    This work discusses a Bayesian approach to approximating the distribution of parameters governing nonlinear structural systems. Specifically, we use a Markov Chain Monte Carlo method for sampling the posterior parameter distributions, thus producing both point and interval estimates for parameters. The method is first used to identify both linear and nonlinear parameters in multiple-degree-of-freedom structural systems using free-decay vibrations. The approach is then applied to the problem of identifying the location, size, and depth of delamination in a model composite beam. The influence of additive Gaussian noise on the response data is explored with respect to the quality of the resulting parameter estimates.
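
    As a rough illustration of the sampling idea in this abstract, the sketch below runs a random-walk Metropolis sampler on a much simpler problem than the one studied: a linear single-degree-of-freedom oscillator with one unknown damping ratio. The model, the fixed natural frequency, the flat prior, the noise level, and all function names are assumptions made for illustration; none are taken from the paper.

```python
import math
import random

OMEGA = 2 * math.pi  # assumed natural frequency (rad/s), illustrative only


def free_decay(t, zeta):
    """Free-decay response of a linear single-degree-of-freedom oscillator."""
    omega_d = OMEGA * math.sqrt(1.0 - zeta ** 2)
    return math.exp(-zeta * OMEGA * t) * math.cos(omega_d * t)


def log_posterior(zeta, times, data, sigma):
    """Gaussian log-likelihood plus a flat prior on 0 < zeta < 1."""
    if not 0.0 < zeta < 1.0:
        return -math.inf
    sse = sum((y - free_decay(t, zeta)) ** 2 for t, y in zip(times, data))
    return -0.5 * sse / sigma ** 2


def metropolis(times, data, sigma, n_steps=5000, step=0.003, start=0.1, seed=0):
    """Random-walk Metropolis sampler for the damping ratio zeta."""
    rng = random.Random(seed)
    zeta, lp = start, log_posterior(start, times, data, sigma)
    samples = []
    for _ in range(n_steps):
        proposal = zeta + rng.gauss(0.0, step)
        lp_new = log_posterior(proposal, times, data, sigma)
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < lp_new - lp:
            zeta, lp = proposal, lp_new
        samples.append(zeta)
    return samples


def summarize(samples, burn_in=1000):
    """Point estimate (posterior mean) and a 95% credible interval."""
    kept = sorted(samples[burn_in:])
    mean = sum(kept) / len(kept)
    lo = kept[int(0.025 * len(kept))]
    hi = kept[int(0.975 * len(kept)) - 1]
    return mean, (lo, hi)


# Synthetic data: true zeta = 0.05 with additive Gaussian noise.
noise = random.Random(1)
times = [0.02 * i for i in range(200)]
data = [free_decay(t, 0.05) + noise.gauss(0.0, 0.05) for t in times]
mean, interval = summarize(metropolis(times, data, sigma=0.05))
```

    The retained samples approximate the posterior, so the same list yields both the point estimate and the interval estimate. Raising the noise level in the synthetic data should widen the interval, which is the kind of effect the abstract reports exploring.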

  8. A statistical approach to evaluate the relation of coal mining, land reclamation, and surface-water quality in Ohio

    USGS Publications Warehouse

    Hren, Janet; Wilson, K.S.; Helsel, D.R.

    1984-01-01

    Base-flow data from 779 sites in Ohio's coal region were analyzed statistically to relate land use to selected water-quality characteristics. Sites were classified into five categories: unmined (100 percent unmined land), abandoned (50 percent or more abandoned surface mines), reclaimed (50 percent or more reclaimed surface mines), deep-mined (50 percent or more underground mines), and mixed (all others). Specific conductance, pH, alkalinity, acidity, sulfate, dissolved iron, total iron, and total manganese in streams draining basins in the coal region were the eight characteristics selected for analysis. (USGS)
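
    The five-way site classification described above can be written as a short rule cascade. This is a minimal sketch: the function name, the argument names, and the order in which the 50-percent rules are checked (which only matters for ties) are assumptions, not details from the report.

```python
def classify_site(unmined, abandoned, reclaimed, deep_mined):
    """Assign a land-use category from percent land use in the drainage basin."""
    if unmined == 100:
        return "unmined"
    if abandoned >= 50:
        return "abandoned"
    if reclaimed >= 50:
        return "reclaimed"
    if deep_mined >= 50:
        return "deep-mined"
    return "mixed"  # every basin that meets none of the thresholds


# Hypothetical basins, given as percentages of basin area.
examples = [
    classify_site(100, 0, 0, 0),    # unmined
    classify_site(40, 60, 0, 0),    # abandoned
    classify_site(30, 20, 30, 20),  # mixed
]
```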

  9. Development of Novel Random Network Theory-Based Approaches to Identify Network Interactions among Nitrifying Bacteria

    SciTech Connect

    Shi, Cindy

    2015-07-17

    The interactions among different microbial populations in a community could play more important roles in determining ecosystem functioning than species numbers and their abundances, but very little is known about such network interactions at a community level. The goal of this project is to develop novel framework approaches and associated software tools to characterize the network interactions in microbial communities based on large-scale, high-throughput metagenomics data and apply these approaches to understand the impacts of environmental changes (e.g., climate change, contamination) on network interactions among different nitrifying populations and associated microbial communities.

  10. Visual Data Mining: An Exploratory Approach to Analyzing Temporal Patterns of Eye Movements

    ERIC Educational Resources Information Center

    Yu, Chen; Yurovsky, Daniel; Xu, Tian

    2012-01-01

    Infant eye movements are an important behavioral resource to understand early human development and learning. But the complexity and amount of gaze data recorded from state-of-the-art eye-tracking systems also pose a challenge: how does one make sense of such dense data? Toward this goal, this article describes an interactive approach based on…

  11. Functional Analysis of Problem Behavior: A Systematic Approach for Identifying Idiosyncratic Variables

    PubMed Central

    Roscoe, Eileen M.; Schlichenmeyer, Kevin J.; Dube, William V.

    2015-01-01

    When inconclusive functional analysis (FA) outcomes occur, a number of modifications have been made to enhance the putative establishing operation or consequence associated with behavioral maintenance. However, a systematic method for identifying relevant events to test during modified FAs has not been evaluated. The purpose of this study was to develop and evaluate a technology for systematically identifying events to test in a modified FA after an initial FA led to inconclusive outcomes. Six individuals whose initial FA showed little or no responding or high levels only in the control condition participated. An indirect assessment (IA) questionnaire developed for identifying idiosyncratic variables was administered, and a descriptive analysis (DA) was conducted. Results from the IA only or a combination of the IA and DA were used to inform modified FA test and control conditions. Conclusive FA outcomes were obtained with five of the six participants during the modified FA phase. PMID:25930176

  12. Identifying individual competency in emerging areas of practice: an applied approach.

    PubMed

    Gebbie, Kristine; Merrill, Jacqueline; Hwang, Inseon; Gupta, Meera; Btoush, Rula; Wagner, Monte

    2002-09-01

    Competency designation is important for any discipline to define individual performance expectations. Although public health (PH) agencies have always responded to emergencies, individual expectations have not been specified. The authors identified individual competencies necessary for organizations to meet performance standards. In the first stage, a Delphi survey served to identify competencies needed by staff to respond to any emergency, including bio-terrorism, yielding competency sets for four levels of workers. In the second stage, focus groups were used to assess the competencies with public health agencies. This feedback validated the Delphi-identified competencies as accurate and necessary for emergency response. The authors demonstrate the feasibility of using these methods to arrive at statements of value to PH practice at a reasonable investment of resources.

  13. Root productivity of deciduous and evergreen species identified using a molecular approach

    NASA Astrophysics Data System (ADS)

    Ellsworth, P.; Sternberg, L. O.

    2012-12-01

    The linkage between leaf traits and root structure may explain how plants integrate above and belowground traits into whole plant adaptations to environmental stresses. In dry seasonal forests, the lack of dry season precipitation dries out the relatively nutrient-rich shallow soil, leaving shallow soil water and nutrients inaccessible to uptake until the wet season. In tropical or subtropical seasonal dry forests, deciduousness may allow for the survival of shallow fine roots during the dry season. Losing leaves during the dry season reduces aboveground plant water demand, and a greater proportion of water extracted from deep soil can be used to maintain shallow roots until the wet season. Higher shallow root survival through the dry season than evergreen species means that deciduous species can take advantage of the nutrient pulse associated with the onset of the wet season. To test the above hypothesis, fine roots were collected from soil cores in a seasonally dry forest during the dry season, onset of the wet season, and the wet season and were identified to selected evergreen and deciduous study species. The fine roots of two of the selected species (Lyonia ferruginea and Carya floridana) could be identified from visual characteristics. The other three study species, which were all from the genus Quercus (Q. geminata, Q. myrtifolia, and Q. laevis), were impossible to separate visually. We developed a PCR-based restriction fragment length polymorphism (PCR-RFLP) technique, which provided a quick, simple, low-cost way to identify the species of all fine roots of our study species. We extracted DNA from all roots that were not visually identified, amplified the internal transcribed spacer region (ITS), digested the ITS region with the restriction enzyme TaqαI, and used gel electrophoresis to separate DNA fragments. Using a PCR-RFLP based root identification key that we developed for the species at Archbold Biological Station, all species that could not be

  14. Quantitative high-throughput screening: A titration-based approach that efficiently identifies biological activities in large chemical libraries

    PubMed Central

    Inglese, James; Auld, Douglas S.; Jadhav, Ajit; Johnson, Ronald L.; Simeonov, Anton; Yasgar, Adam; Zheng, Wei; Austin, Christopher P.

    2006-01-01

    High-throughput screening (HTS) of chemical compounds to identify modulators of molecular targets is a mainstay of pharmaceutical development. Increasingly, HTS is being used to identify chemical probes of gene, pathway, and cell functions, with the ultimate goal of comprehensively delineating relationships between chemical structures and biological activities. Achieving this goal will require methodologies that efficiently generate pharmacological data from the primary screen and reliably profile the range of biological activities associated with large chemical libraries. Traditional HTS, which tests compounds at a single concentration, is not suited to this task, because HTS is burdened by frequent false positives and false negatives and requires extensive follow-up testing. We have developed a paradigm, quantitative HTS (qHTS), tested with the enzyme pyruvate kinase, to generate concentration–response curves for >60,000 compounds in a single experiment. We show that this method is precise, refractory to variations in sample preparation, and identifies compounds with a wide range of activities. Concentration–response curves were classified to rapidly identify pyruvate kinase activators and inhibitors with a variety of potencies and efficacies and elucidate structure–activity relationships directly from the primary screen. Comparison of qHTS with traditional single-concentration HTS revealed a high prevalence of false negatives in the single-point screen. This study demonstrates the feasibility of qHTS for accurately profiling every compound in large chemical libraries (>10⁵ compounds). qHTS produces rich data sets that can be immediately mined for reliable biological activities, thereby providing a platform for chemical genomics and accelerating the identification of leads for drug discovery. PMID:16864780
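
    To make the contrast between a titration and a single-point screen concrete, the sketch below evaluates a four-parameter Hill curve, a common concentration–response model that is assumed here and not confirmed by the abstract. The compound parameters, the 10 µM screening concentration, the 5-fold dilution series, and the −50% hit threshold are all hypothetical.

```python
def hill(conc, bottom, top, ac50, slope):
    """Four-parameter Hill response at concentration conc (must be > 0)."""
    return bottom + (top - bottom) / (1.0 + (ac50 / conc) ** slope)


def titration(concs, **params):
    """Responses across a concentration series, i.e. one response curve."""
    return [hill(c, **params) for c in concs]


# Hypothetical inhibitor: 0% effect at low dose, -100% at saturation, AC50 = 30 uM.
inhibitor = dict(bottom=0.0, top=-100.0, ac50=30e-6, slope=1.0)

# Single-point screen at 10 uM with a hypothetical -50% hit threshold.
single_point = hill(10e-6, **inhibitor)  # roughly -25% activity
called_hit = single_point <= -50.0       # False, so this compound is missed

# Seven-point titration: 10 nM upward in 5-fold steps (about 156 uM at the top).
concs = [1e-8 * 5 ** i for i in range(7)]
curve = titration(concs, **inhibitor)
```

    In this toy case the single measurement falls short of the threshold even though the compound is a full inhibitor, while the titration series reaches well past −50% at its highest concentrations. That is one way a single-concentration screen can produce a false negative.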

  15. Approaches to Identify Exceedances of Water Quality Thresholds Associated with Ocean Conditions

    EPA Science Inventory

    WED scientists have developed a method to help distinguish whether failures to meet water quality criteria are associated with natural coastal upwelling by using the statistical approach of logistic regression. Estuaries along the west coast of the United States periodically ha...

  16. A LANDSCAPE ECOLOGY APPROACH TO IDENTIFYING ECOLOGICAL VULNERABILITY IN GEOGRAPHICALLY ISOLATED WETLANDS

    EPA Science Inventory

    U.S. EPA's Office of Research and Development is using a landscape approach to assess the ecological/hydrologic functions of geographically isolated wetlands in the mid-western, southern, and western regions of the United States. Geographically isolated wetlands are considered t...

  17. Identifying Country-Specific Cultures of Physics Education: A Differential Item Functioning Approach

    ERIC Educational Resources Information Center

    Mesic, Vanes

    2012-01-01

    In international large-scale assessments of educational outcomes, student achievement is often represented by unidimensional constructs. This approach allows for drawing general conclusions about country rankings with respect to the given achievement measure, but it typically does not provide specific diagnostic information which is necessary for…

  18. The Natural Selection: Identifying Student Misconceptions through an Inquiry-Based, Critical Approach to Evolution

    ERIC Educational Resources Information Center

    Robbins, Jennifer R.; Roy, Pamela

    2007-01-01

    We invited 141 non-science major undergraduates to share and then challenge their preconceptions about evolution in a four-lesson inquiry lab unit that integrated diverse topics with rigorous assessment. Our experience suggests that an inquiring approach to evolutionary theory can be highly persuasive.

  19. Mining Students' Learning Patterns and Performance in Web-Based Instruction: A Cognitive Style Approach

    ERIC Educational Resources Information Center

    Chen, Sherry Y.; Liu, Xiaohui

    2011-01-01

    Personalization has been widely used in Web-based instruction (WBI). To deliver effective personalization, there is a need to understand different preferences of each student. Cognitive style has been identified as one of the most pertinent factors that affect students' learning preferences. Therefore, it is essential to investigate how learners…

  20. A Data Mining Approach to Improve Re-Accessibility and Delivery of Learning Knowledge Objects

    ERIC Educational Resources Information Center

    Sabitha, Sai; Mehrotra, Deepti; Bansal, Abhay

    2014-01-01

    Today Learning Management Systems (LMS) have become an integral part of learning mechanism of both learning institutes and industry. A Learning Object (LO) can be one of the atomic components of LMS. A large amount of research is conducted into identifying benchmarks for creating Learning Objects. Some of the major concerns associated with LO are…

  1. Identifying Underrepresented Disadvantaged Gifted and Talented Children: A Multifaceted Approach. (Volumes 1 and 2.)

    ERIC Educational Resources Information Center

    Saccuzzo, Dennis P.; And Others

    The primary purpose of this study was to determine if a model for identifying gifted and talented students could be developed which would provide equal access to gifted programs for children of all ethnic and economic backgrounds. The culturally and ethnically diverse San Diego City School District provided a pool of over 35,000 children referred…

  2. A Statistical Approach to Identifying Schools Demonstrating Substantial Improvement in Student Learning

    ERIC Educational Resources Information Center

    Meyers, Coby; Lindsay, Jim; Condon, Chris; Wan, Yinmei

    2012-01-01

    The rising tide behind the school turnaround movement is significant, as national education leaders continue to call for the rapid improvement of the nation's lowest-performing schools. To date, little work has been done to identify schools that are drastically improving their performance. Using publicly available school-level student…

  3. Identifying Ghanaian Pre-Service Teachers' Readiness for Computer Use: A Technology Acceptance Model Approach

    ERIC Educational Resources Information Center

    Gyamfi, Stephen Adu

    2016-01-01

    This study extends the technology acceptance model to identify factors that influence technology acceptance among pre-service teachers in Ghana. Data from 380 usable questionnaires were tested against the research model. Utilising the extended technology acceptance model (TAM) as a research framework, the study found that: pre-service teachers'…

  4. CONCEPTUAL APPROACHES TO IDENTIFY AND ASSESS MULTIPLE STRESSORS, SECTION 1.1

    EPA Science Inventory

    Every ecosystem is subject to multiple stressors arising from the interactions of biological, physical, and socioeconomic processes (e.g. exploitation and development). These stressors and their interactions need to be identified if risks associated with a planned activity are to...

  5. Candidate fire blight resistance genes in Malus identified with the use of genomic tools and approaches

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The goal of this research is to utilize current advances in Rosaceae genomics to identify DNA markers for use in marker-assisted selection of durable resistance to fire blight. Candidate fire blight resistance genes were selected and ranked based upon differential expression after inoculation with ...

  6. Enhanced approaches for identifying Amadori products: application to peanut allergens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The dry roasting of peanuts is suggested to influence allergenic sensitization due to formation of advanced glycation end products (AGE) on peanut proteins. Identifying AGEs is technically challenging. The AGE composition of peanut proteins was probed with nanoLC-ESI-MS and MS/MS analyses. Amadori ...

  7. Constructing New Theory for Identifying Students with Emotional Disturbance: A Grounded Theory Approach

    ERIC Educational Resources Information Center

    Barnett, Dori A.

    2010-01-01

    The problem area explored by this study is the identification of students with emotional and behavioral difficulties for special education supports and services under the criteria for emotional disturbance (ED). A review of the literature indicated that the problem of identifying students with ED was compounded by subjectivity and ambiguity…

  8. Identifying whole grain foods: a comparison of different approaches for selecting more healthful whole grain products

    PubMed Central

    Mozaffarian, Rebecca S; Lee, Rebekka M; Kennedy, Mary A; Ludwig, David S; Mozaffarian, Dariush; Gortmaker, Steven L

    2015-01-01

    Objective: Eating whole grains (WG) is recommended for health, but multiple conflicting definitions exist for identifying WG products, limiting the ability of consumers and organizations to select such products. We investigated how five recommended WG criteria relate to healthfulness and price of grain products. Design: We categorized grain products by different WG criteria including: the industry-sponsored Whole Grain stamp (WG-Stamp); WG as the first ingredient (WG-first); WG as the first ingredient without added sugars (WG-first-no-added-sugars); the word ‘whole’ before any grain in the ingredients (‘whole’-anywhere); and a content of total carbohydrate to fibre of ≤10:1 (10:1-ratio). We investigated associations of each criterion with health-related characteristics including fibre, sugars, sodium, energy, trans-fats and price. Setting: Two major grocery store chains. Subjects: Five hundred and forty-five grain products. Results: Each WG criterion identified products with higher fibre than products considered non-WG; the 10:1-ratio exhibited the largest differences (+3.15 g/serving, P<0.0001). Products achieving the 10:1-ratio also contained less sugar (−1.28 g/serving, P=0.01), less sodium (−15.4 mg/serving, P=0.04) and a lower likelihood of trans-fats (OR=0.14, P<0.0001), without energy differences. WG-first-no-added-sugars performed similarly, but identified many fewer products as WG and did not show a lower likelihood of containing trans-fats. The WG-Stamp, WG-first and ‘whole’-anywhere criteria identified products with a lower likelihood of trans-fats, but also significantly more sugars and energy (P<0.05 each). Products meeting the WG-Stamp or 10:1-ratio criterion were more expensive than products that did not (+$US 0.04/serving, P=0.009 and +$US 0.05/serving, P=0.003, respectively). Conclusions: Among proposed WG criteria, the 10:1-ratio identified the most healthful WG products. Other criteria performed less well, including the industry

  10. Differentially Private Frequent Subgraph Mining

    PubMed Central

    Xu, Shengzhi; Xiong, Li; Cheng, Xiang; Xiao, Ke

    2016-01-01

    Mining frequent subgraphs from a collection of input graphs is an important topic in data mining research. However, if the input graphs contain sensitive information, releasing frequent subgraphs may pose considerable threats to individuals' privacy. In this paper, we study the problem of frequent subgraph mining (FGM) under the rigorous differential privacy model. We introduce a novel differentially private FGM algorithm, which is referred to as DFG. In this algorithm, we first privately identify frequent subgraphs from input graphs, and then compute the noisy support of each identified frequent subgraph. In particular, to privately identify frequent subgraphs, we present a frequent subgraph identification approach which can improve the utility of frequent subgraph identifications through candidate pruning. Moreover, to compute the noisy support of each identified frequent subgraph, we devise a lattice-based noisy support derivation approach, where a series of methods has been proposed to improve the accuracy of the noisy supports. Through formal privacy analysis, we prove that our DFG algorithm satisfies ε-differential privacy. Extensive experimental results on real datasets show that the DFG algorithm can privately find frequent subgraphs with high data utility. PMID:27616876
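
    For readers unfamiliar with noisy supports, the sketch below shows the plain Laplace mechanism applied to support counts. It is a baseline, not the paper's lattice-based derivation, and it omits the private identification step entirely. It assumes neighboring datasets differ in one input graph, so each support count has sensitivity 1, and it splits the privacy budget evenly across subgraphs. The function names, subgraph labels, and counts are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_supports(supports, epsilon, rng):
    """Perturb each support count, splitting epsilon evenly across subgraphs."""
    eps_each = epsilon / len(supports)  # sequential composition of the budget
    scale = 1.0 / eps_each              # sensitivity of one support count is 1
    return {g: c + laplace_noise(scale, rng) for g, c in supports.items()}


rng = random.Random(0)
supports = {"A-B": 120, "A-B-C": 45}    # hypothetical subgraph labels and counts
released = noisy_supports(supports, epsilon=1.0, rng=rng)
```

    Because the budget is divided among all released counts, the noise scale grows with the number of subgraphs. This loss of accuracy is the kind of problem that more refined derivation schemes aim to reduce.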

  11. New approaches and omics tools for mining of vaccine candidates against vector-borne diseases.

    PubMed

    Kuleš, Josipa; Horvatić, Anita; Guillemin, Nicolas; Galan, Asier; Mrljak, Vladimir; Bhide, Mangesh

    2016-08-16

    Vector-borne diseases (VBDs) present a major threat to human and animal health, as well as place a substantial burden on livestock production. As a way of sustainable VBD control, focus is set on vaccine development. Advances in genomics and other "omics" over the past two decades have given rise to a "third generation" of vaccines based on technologies such as reverse vaccinology, functional genomics, immunomics, structural vaccinology and the systems biology approach. The application of omics approaches is shortening the time required to develop the vaccines and increasing the probability of discovery of potential vaccine candidates. Herein, we review the development of new generation vaccines for VBDs, and discuss technological advancement and overall challenges in the vaccine development pipeline. Special emphasis is placed on the development of anti-tick vaccines that can quell both vectors and pathogens.

  12. Microarray data mining: A novel optimization-based approach to uncover biologically coherent structures

    PubMed Central

    Tan, Meng P; Smith, Erin N; Broach, James R; Floudas, Christodoulos A

    2008-01-01

    Background: DNA microarray technology allows for the measurement of genome-wide expression patterns. Within the resultant mass of data lies the problem of analyzing and presenting information on this genomic scale, and a first step towards the rapid and comprehensive interpretation of this data is gene clustering with respect to the expression patterns. Classifying genes into clusters can lead to interesting biological insights. In this study, we describe an iterative clustering approach to uncover biologically coherent structures from DNA microarray data based on a novel clustering algorithm EP_GOS_Clust. Results: We apply our proposed iterative algorithm to three sets of experimental DNA microarray data from experiments with the yeast Saccharomyces cerevisiae and show that the proposed iterative approach improves biological coherence. Comparison with other clustering techniques suggests that our iterative algorithm provides superior performance with regard to biological coherence. An important consequence of our approach is that an increasing proportion of genes find membership in clusters of high biological coherence and that the average cluster specificity improves. Conclusion: The results from these clustering experiments provide a robust basis for extracting motifs and trans-acting factors that determine particular patterns of expression. In addition, the biological coherence of the clusters is iteratively assessed independently of the clustering. Thus, this method will not be severely impacted by functional annotations that are missing, inaccurate, or sparse. PMID:18538024

  13. Geologic considerations in underground coal mining system design

    SciTech Connect

    Camilli, F.A.; Maynard, D.P.; Mangolds, A.; Harris, J.

    1981-10-01

    Geologic characteristics of coal resources which may impact new extraction technologies are identified and described to aid system designers and planners in their task of designing advanced coal extraction systems for the central Appalachian region. These geologic conditions are then organized into a matrix identified as the baseline mine concept. A sample region, eastern Kentucky, is next analyzed, using both the new baseline mine concept and the traditional geologic investigative approach. The baseline mine concept presented is intended as a framework, providing a consistent basis for further analyses to be subsequently conducted in other geographic regions. The baseline mine concept is intended as a tool to give system designers a more realistic feel of the mine environment and will hopefully lead to acceptable alternatives for advanced coal extraction systems.

  14. Heat–Health Warning Systems: A Comparison of the Predictive Capacity of Different Approaches to Identifying Dangerously Hot Days

    PubMed Central

    Sheridan, Scott C.; Allen, Michael J.; Pascal, Mathilde; Laaidi, Karine; Yagouti, Abderrahmane; Bickis, Ugis; Tobias, Aurelio; Bourque, Denis; Armstrong, Ben G.; Kosatsky, Tom

    2010-01-01

    Objectives. We compared the ability of several heat–health warning systems to predict days of heat-associated mortality using common data sets. Methods. Heat–health warning systems initiate emergency public health interventions once forecasts have identified weather conditions that breach predetermined trigger levels. We examined 4 commonly used trigger-setting approaches: (1) synoptic classification, (2) epidemiologic assessment of the temperature–mortality relationship, (3) temperature–humidity index, and (4) physiologic classification. We applied each approach in Chicago, Illinois; London, United Kingdom; Madrid, Spain; and Montreal, Canada, to identify days expected to be associated with the highest heat-related mortality. Results. We found little agreement across the approaches in which days were identified as most dangerous. In general, days identified by temperature–mortality assessment were associated with the highest excess mortality. Conclusions. Triggering of alert days and ultimately the initiation of emergency responses by a heat–health warning system vary significantly across approaches adopted to establish triggers. PMID:20395585

  15. A distance difference matrix approach to identifying transcription factors that regulate differential gene expression

    PubMed Central

    De Bleser, Pieter; Hooghe, Bart; Vlieghe, Dominique; van Roy, Frans

    2007-01-01

    We introduce a method that considers target genes of a transcription factor, and searches for transcription factor binding sites (TFBSs) of secondary factors responsible for differential responses among these targets. Based on the distance difference matrix concept, the method simultaneously integrates statistical overrepresentation and co-occurrence of TFBSs. Our approach is validated on datasets of differentially regulated human genes and is shown to be highly effective in detecting TFBSs responsible for the observed differential gene expression. PMID:17504544

  16. Identifying active foraminifera in the Sea of Japan using metatranscriptomic approach

    NASA Astrophysics Data System (ADS)

    Lejzerowicz, Franck; Voltsky, Ivan; Pawlowski, Jan

    2013-02-01

    Metagenetics represents an efficient and rapid tool to describe environmental diversity patterns of microbial eukaryotes based on ribosomal DNA sequences. However, the results of metagenetic studies are often biased by the presence of extracellular DNA molecules that are persistent in the environment, especially in deep-sea sediment. As an alternative, short-lived RNA molecules constitute a good proxy for the detection of active species. Here, we used a metatranscriptomic approach based on RNA-derived (cDNA) sequences to study the diversity of the deep-sea benthic foraminifera and compared it to the metagenetic approach. We analyzed 257 ribosomal DNA and cDNA sequences obtained from seven sediment samples collected in the Sea of Japan at depths ranging from 486 to 3665 m. The DNA and RNA-based approaches gave a similar view of the taxonomic composition of the foraminiferal assemblage, but differed in some important points. First, the cDNA dataset was dominated by sequences of rotaliids and robertiniids, suggesting that these calcareous species, some of which have been observed in Rose Bengal stained samples, are the most active component of the foraminiferal community. Second, the richness of monothalamous (single-chambered) foraminifera was particularly high in DNA extracts from the deepest samples, confirming that this group of foraminifera is abundant but not necessarily very active in the deep-sea sediments. Finally, the high divergence of undetermined sequences in the cDNA dataset indicates the limits of our database and lack of knowledge about some active but possibly rare species. Our study demonstrates the capability of the metatranscriptomic approach to detect active foraminiferal species and prompts its use in future high-throughput sequencing-based environmental surveys.

  17. A novel approach to identifying regulatory motifs in distantly related genomes

    PubMed Central

    Van Hellemont, Ruth; Monsieurs, Pieter; Thijs, Gert; De Moor, Bart; Van de Peer, Yves; Marchal, Kathleen

    2005-01-01

    Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size. PMID:16420672

  18. Pattern Recognition-Based Approach for Identifying Metabolites in Nuclear Magnetic Resonance-Based Metabolomics.

    PubMed

    Dubey, Abhinav; Rangarajan, Annapoorni; Pal, Debnath; Atreya, Hanudatta S

    2015-07-21

    Identification and assignment of metabolites is an important step in metabolomics and is necessary for the discovery of new biomarkers. In nuclear magnetic resonance (NMR) spectroscopy-based studies, the conventional approach involves a database search, wherein chemical shifts are assigned to specific metabolites by use of a tolerance limit. This is inefficient because deviation in chemical shifts associated with pH or temperature variations, as well as missing peaks, impairs a robust comparison with the database. We propose here a novel method based on matching the pattern of peaks rather than absolute tolerance thresholds, using a combination of geometric hashing and similarity scoring techniques. Tests with 719 metabolites from the Human Metabolome Database (HMDB) show that 100% of the metabolites can be assigned correctly when accurate data are available. A high success rate is obtained even in the presence of large chemical shift deviations such as 0.5 ppm in (1)H and 3 ppm in (13)C and missing peaks (up to 50%), compared to nearly no assignments obtained under these conditions with existing methods that employ a direct database search approach. The method was evaluated on experimental data on a mixture of 16 metabolites at eight different combinations of pH and temperature conditions. The pattern recognition approach thus helps in identification and assignment of metabolites independent of the pH, temperature, and ionic strength used, thereby obviating the need for spectral calibration with internal or external standards.

  19. A review of current approaches to identifying human genes involved in myopia.

    PubMed

    Tang, Wing Chun; Yap, Maurice K H; Yip, Shea Ping

    2008-01-01

    The prevalence of myopia is high in many parts of the world, particularly among the Orientals such as Chinese and Japanese. Like other complex diseases such as diabetes and hypertension, myopia is likely to be caused by both genetic and environmental factors, and possibly their interactions. Owing to multiple genes with small effects, genetic heterogeneity and phenotypic complexity, the study of the genetics of myopia poses a complex challenge. This paper reviews the current approaches to the genetic analysis of complex diseases and how these can be applied to the identification of genes that predispose humans to myopia. These approaches include parametric linkage analysis, non-parametric linkage analysis like allele-sharing methods and genetic association studies. Basic concepts, advantages and disadvantages of these approaches are discussed and explained using examples from the literature on myopia. Microsatellites and single nucleotide polymorphisms are common genetic markers in the human genome and are indispensable tools for gene mapping. High throughput genotyping of millions of such markers has become feasible and efficient with recent technological advances. In turn, this makes the identification of myopia susceptibility genes a reality.

  20. Identifying risk of hospital readmission among Medicare aged patients: an approach using routinely collected data.

    PubMed

    Navarro, Adria E; Enguídanos, Susan; Wilber, Kathleen H

    2012-01-01

    Readmission provisions in the Patient Protection and Affordable Care Act of March 2010 have created urgent fiscal accountability requirements for hospitals, dependent upon a better understanding of their specific populations, along with development of mechanisms to easily identify these at-risk patients. Readmissions are disruptive and costly to both patients and the health care system. Effectively addressing hospital readmissions among Medicare aged patients offers promising targets for resources aimed at improved quality of care for older patients. Routinely collected data, accessible via electronic medical records, were examined using logistic models of sociodemographic, clinical, and utilization factors to identify predictors among patients who required rehospitalization within 30 days. Specific comorbidities and discharge care orders in this urban, nonprofit hospital had significantly greater odds of predicting a Medicare aged patient's risk of readmission within 30 days. PMID:22656916

  1. A DNA barcoding approach to identify plant species in multiflower honey.

    PubMed

    Bruni, I; Galimberti, A; Caridi, L; Scaccabarozzi, D; De Mattia, F; Casiraghi, M; Labra, M

    2015-03-01

    The purpose of this study was to test the ability of DNA barcoding to identify the plant origins of processed honey. Four multifloral honeys produced at different sites in a floristically rich area in the northern Italian Alps were examined by using the rbcL and trnH-psbA plastid regions as barcode markers. An extensive reference database of barcode sequences was generated for the local flora to determine the taxonomic composition of honey. Thirty-nine plant species were identified in the four honey samples, each of which originated from a mix of common plants belonging to Castanea, Quercus, Fagus and several herbaceous taxa. Interestingly, at least one endemic plant was found in all four honey samples, providing a clear signature for the geographic identity of these products. DNA of the toxic plant Atropa belladonna was detected in one sample, illustrating the usefulness of DNA barcoding for evaluating the safety of honey.

  3. MAS C-Terminal Tail Interacting Proteins Identified by Mass Spectrometry-Based Proteomic Approach

    PubMed Central

    Tirupula, Kalyan C.; Zhang, Dongmei; Osbourne, Appledene; Chatterjee, Arunachal; Desnoyer, Russ; Willard, Belinda; Karnik, Sadashiva S.

    2015-01-01

    Propagation of signals from G protein-coupled receptors (GPCRs) in cells is primarily mediated by protein-protein interactions. MAS is a GPCR that was initially discovered as an oncogene and is now known to play an important role in cardiovascular physiology. Current literature suggests that MAS interacts with common heterotrimeric G-proteins, but MAS interaction with proteins which might mediate G protein-independent or atypical signaling is unknown. In this study we hypothesized that the MAS C-terminal tail (Ct) is a major determinant of receptor-scaffold protein interactions mediating MAS signaling. Mass-spectrometry based proteomic analysis was used to comprehensively identify the proteins that interact with MAS Ct comprising the PDZ-binding motif (PDZ-BM). We identified both PDZ and non-PDZ proteins from a human embryonic kidney cell line, a mouse atrial cardiomyocyte cell line and human heart tissue that interact specifically with MAS Ct. For the first time our study provides a panel of PDZ and other proteins that potentially interact with MAS with high significance. A ‘cardiac-specific finger print’ of MAS interacting PDZ proteins was identified which includes DLG1, MAGI1 and SNTA. Cell based experiments with wild-type and mutant MAS lacking the PDZ-BM validated MAS interaction with PDZ proteins DLG1 and TJP2. Bioinformatics analysis suggested well-known multi-protein scaffold complexes involved in nitric oxide signaling (NOS), cell-cell signaling of neuromuscular junctions, synapses and epithelial cells. The majority of these protein hits were predicted to be part of disease categories comprising cancers and malignant tumors. We propose a ‘MAS-signalosome’ model to stimulate further research in understanding the molecular mechanism of MAS function. Identifying the hierarchy of interactions of ‘signalosome’ components with MAS will be a necessary step in future to fully understand the physiological and pathological functions of this enigmatic receptor. PMID

  4. Knowledge-Assisted Approach to Identify Pathways with Differential Dependencies | Office of Cancer Genomics

    Cancer.gov

    We have previously developed a statistical method to identify gene sets enriched with condition-specific genetic dependencies. The method constructs gene dependency networks from bootstrapped samples in one condition and computes the divergence between distributions of network likelihood scores from different conditions. It was shown to be capable of sensitive and specific identification of pathways with phenotype-specific dysregulation, i.e., rewiring of dependencies between genes in different conditions.

  5. Proceedings: Fourth Workshop on Mining Scientific Datasets

    SciTech Connect

    Kamath, C

    2001-07-24

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the KDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is

  6. Identifying Predictors, Moderators, and Mediators of Antidepressant Response in Major Depressive Disorder: Neuroimaging Approaches

    PubMed Central

    Phillips, Mary L.; Chase, Henry W.; Sheline, Yvette I.; Etkin, Amit; Almeida, Jorge R.C.; Deckersbach, Thilo; Trivedi, Madhukar H.

    2015-01-01

    Objective Despite significant advances in neuroscience and treatment development, no widely accepted biomarkers are available to inform diagnostics or identify preferred treatments for individuals with major depressive disorder. Method In this critical review, the authors examine the extent to which multimodal neuroimaging techniques can identify biomarkers reflecting key pathophysiologic processes in depression and whether such biomarkers may act as predictors, moderators, and mediators of treatment response that might facilitate development of personalized treatments based on a better understanding of these processes. Results The authors first highlight the most consistent findings from neuroimaging studies using different techniques in depression, including structural and functional abnormalities in two parallel neural circuits: serotonergically modulated implicit emotion regulation circuitry, centered on the amygdala and different regions in the medial prefrontal cortex; and dopaminergically modulated reward neural circuitry, centered on the ventral striatum and medial prefrontal cortex. They then describe key findings from the relatively small number of studies indicating that specific measures of regional function and, to a lesser extent, structure in these neural circuits predict treatment response in depression. Conclusions Limitations of existing studies include small sample sizes, use of only one neuroimaging modality, and a focus on identifying predictors rather than moderators and mediators of differential treatment response. By addressing these limitations and, most importantly, capitalizing on the benefits of multimodal neuroimaging, future studies can yield moderators and mediators of treatment response in depression to facilitate significant improvements in shorter- and longer-term clinical and functional outcomes. PMID:25640931

  7. A proteomic approach to identify proteins from Trichuris trichiura extract with immunomodulatory effects.

    PubMed

    Santos, L N; Gallo, M B C; Silva, E S; Figueiredo, C A V; Cooper, P J; Barreto, M L; Loureiro, S; Pontes-de-Carvalho, L C; Alcantara-Neves, N M

    2013-01-01

    Infections with Trichuris trichiura and other trichurid nematodes have been reported to display protective effects against atopy, allergic and autoimmune diseases. The aims of the present study were to investigate the immunomodulatory properties of T. trichiura adult worm extract (TtE) and its fractions (TtEFs) on the production of cytokines by peripheral blood mononuclear cells and to identify their proteinaceous components. Fourteen TtEFs were obtained by ion exchange chromatography and tested for effects on cytokine production by peripheral blood mononuclear cells. The molecular constituents of the six most active fractions were evaluated using nano-LC/mass spectrometry. The homology between T. trichiura and the related nematode Trichinella spiralis was used to identify 12 proteins in TtEFs. Among those identified, fructose biphosphate aldolase, a homologue of macrophage migration inhibitory factor and heat-shock protein 70 may contribute to the immunomodulatory effects of TtEFs. The identification of such proteins could lead to the development of novel drugs for the therapy of allergic and other inflammatory diseases.

  8. Identifying Vortex-Core-Line using a tetrahedral satellite configuration: Field Topology Approach

    NASA Astrophysics Data System (ADS)

    Jiang, Yao; Lembege, Bertrand; Nishikawa, Ken-ichi; Cai, DongSheng; Hasegawa, Hiroshi

    2016-04-01

    Identifying vortices is key to understanding the turbulence in plasma shear layers. Here, the term 'vortex' or 'vortex core' is associated with a region of Galilean invariance [Jeong and Hussain, 1995]. Unfortunately, no single precise definition of a vortex is currently universally accepted, despite the fact that many space plasma authors claim that many observations have detected "vortices" (such as Kelvin-Helmholtz vortices at/around the magnetopause). Using the velocity data from the four satellites and a Taylor series, we expand the velocity field around the satellites, calculate its first-order tensor, and linearly approximate the field. We can identify the vortex structures by using various vortex identification criteria as follows: (i) the first criterion is the Q-criterion, which defines vortices as regions in which the vorticity energy prevails over other energies; (ii) the second criterion is the lambda2-criterion, which is related to the negative of the Hessian matrix of the pressure-related term; and (iii) the third criterion requires the existence of vortex-core-lines that are Galilean invariant inside the four-satellite tetrahedral region. Using these methods, we can identify and analyze the 3D vortex more precisely using a tetrahedral satellite configuration.
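
    As a rough illustration of criteria (i) and (ii), the sketch below estimates a linear velocity gradient tensor from four point measurements and evaluates the Q and lambda2 quantities in their standard textbook form (after Jeong and Hussain, 1995). It is not the authors' code: the function names, the use of NumPy, and the sample positions and velocities are assumptions made for the example.

```python
import numpy as np

def velocity_gradient(positions, velocities):
    """Linear estimate of G[i, j] = d v_i / d x_j from four measurement points.

    positions, velocities: array-likes of shape (4, 3). The four points must
    not be coplanar, otherwise the separation matrix is singular.
    """
    r = np.asarray(positions, dtype=float)
    v = np.asarray(velocities, dtype=float)
    dr = r[1:] - r[0]  # separation vectors relative to the first point
    dv = v[1:] - v[0]  # velocity differences relative to the first point
    # First-order Taylor expansion: dv_k = G @ dr_k, i.e. dv = dr @ G.T
    return np.linalg.solve(dr, dv).T

def q_criterion(grad):
    """Q = 0.5 * (|Omega|^2 - |S|^2); Q > 0 means rotation dominates strain."""
    s = 0.5 * (grad + grad.T)      # symmetric part (rate of strain)
    omega = 0.5 * (grad - grad.T)  # antisymmetric part (rotation)
    return 0.5 * (np.sum(omega**2) - np.sum(s**2))

def lambda2(grad):
    """Middle eigenvalue of S^2 + Omega^2; negative values indicate a vortex."""
    s = 0.5 * (grad + grad.T)
    omega = 0.5 * (grad - grad.T)
    return np.linalg.eigvalsh(s @ s + omega @ omega)[1]

# Hypothetical tetrahedron sampling a solid-body rotation v = (-y, x, 0).
positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
velocities = [(0, 0, 0), (0, 1, 0), (-1, 0, 0), (0, 0, 0)]
G = velocity_gradient(positions, velocities)
print(q_criterion(G), lambda2(G))
```

    For this idealized rotation, Q is positive and lambda2 is negative, so both criteria flag the region as a vortex. Real multi-spacecraft data would also require handling measurement noise and nearly coplanar configurations, which this sketch ignores.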

  9. Two-step web-mining approach to study geology/geophysics-related open-source software projects

    NASA Astrophysics Data System (ADS)

    Behrends, Knut; Conze, Ronald

    2013-04-01

    Geology/geophysics is a highly interdisciplinary science, overlapping with, for instance, physics, biology and chemistry. In today's software-intensive work environments, geoscientists often encounter new open-source software from scientific fields that are only remotely related to their own field of expertise. We show how web-mining techniques can help to carry out systematic discovery and evaluation of such software. In a first step, we downloaded ~500 abstracts (each consisting of ~1 kb UTF-8 text) from agu-fm12.abstractcentral.com. This web site hosts the abstracts of all publications presented at AGU Fall Meeting 2012, the world's largest annual geology/geophysics conference. All abstracts belonged to the category "Earth and Space Science Informatics", an interdisciplinary label cross-cutting many disciplines such as "deep biosphere", "atmospheric research", and "mineral physics". Each publication was represented by a highly structured record with ~20 short data attributes, the largest authorship-record being the unstructured "abstract" field. We processed texts of the abstracts with the statistics software "R" to calculate a corpus and a term-document matrix. Using R package "tm", we applied text-mining techniques to filter data and develop hypotheses about software-development activities happening in various geology/geophysics fields. By analyzing the term-document matrix with basic techniques (e.g., word frequencies, co-occurrences, weighting) as well as more complex methods (clustering, classification), several key pieces of information were extracted. For example, text-mining can be used to identify scientists who are also developers of open-source scientific software, and the names of their programming projects and codes can also be identified. In a second step, based on the intermediate results found by processing the conference-abstracts, any new hypotheses can be tested in another webmining subproject: by merging the dataset with open data from github
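
    To make the term-document matrix mentioned in this record concrete, here is a minimal sketch in plain Python. The authors used R and the "tm" package; this is only an analogous illustration, and the function name, the simple lowercase tokenizer, and the two sample texts are assumptions made for the example.

```python
import re
from collections import Counter

def term_document_matrix(documents):
    """Return (vocabulary, matrix) where matrix[d][t] counts term t in document d."""
    # Tokenize each document into lowercase alphabetic words and count them.
    counts = [Counter(re.findall(r"[a-z]+", doc.lower())) for doc in documents]
    # The vocabulary is the sorted set of all terms seen in any document.
    vocabulary = sorted(set().union(*counts)) if counts else []
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix

# Two made-up abstract snippets, used only to exercise the function.
docs = ["Open source software", "Software for mineral physics"]
vocab, tdm = term_document_matrix(docs)
for term, column in zip(vocab, zip(*tdm)):
    print(term, column)
```

    Word frequencies are the column sums of this matrix, and terms whose columns are nonzero in the same rows co-occur in the same documents.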

  10. Multifunctional greenway approach for landscape planning and reclamation of a post-mining district: Cartagena-La Unión, SE Spain

    NASA Astrophysics Data System (ADS)

    Acosta, Jose A.; Faz, Ángel; Zornoza, Raúl; Martínez-Martínez, Silvia; Kabas, Sebla; Bech, Jaume

    2015-04-01

    Fragmented structures create metaphorical wounds in the landscape, altering the ecological and cultural processes associated with it, as can be seen in many mine areas. Therefore it is advisable to organize the reclamation plan at the beginning of mine operation to provide spatial and functional integration of the landscape based on scientific arguments and with all possible legal and administrative means, which is generally the case of the Strategic Environmental Assessment. However, there are many abandoned mine areas where no reclamation plan has been carried out, such as the Mining District of Sierra Minera Cartagena-La Unión, SE Spain. In these cases it is vital to respond in a sustainable manner to heal the landscape wounds of post-mining activities. Reclamation activities in a post-mining district include not only the mine soils but also all land uses around them; for this reason it is necessary to create practical solutions for restoring the ecological and cultural processes of the area. The greenway approach shows the main veins which are crucial for keeping alive and sustaining the mentioned processes of the area. Therefore the main objectives of this study are to 1) develop an integrated local greenway network to be able to preserve significant resources and values of the district, and to 2) develop this greenway network as a part of the reclamation process for degraded areas. Landscape assessments revealed the most valuable and potential connectivity resources of the area. These clustering and linear patterns of resource concentrations include mountain ranges and valleys, the natural drainage network, legally protected areas and cultural-historical resources. Conservation areas, cultural-educational resources of post-mining activities and the riverbeds have been the main building stones for the greenway corridor. The multifunctional greenway approach serves as landscape reclamation and planning tool in a degraded area by showing the priority zones for

  11. A novel approach for identifying the true temperature sensitivity from soil respiration measurements

    SciTech Connect

    Gu, Lianhong; Hanson, Paul J; Liu, Qing; Post, Wilfred M

    2008-01-01

    We propose a novel approach, called the localized ratio fitting (LRF), to estimating the true temperature sensitivity from soil respiration measurements, a task crucial to modeling terrestrial carbon cycle and climate but so far hindered by the inadequate conventional regression approach. LRF takes advantage of the different timescales of the pool dynamics-induced and environmental variation-induced changes in soil CO2 efflux. It first transforms the expression for soil respiration into a form suppressing the influence of soil carbon pool dynamics and then uses the transformed expression to infer the parameters of environmental sensitivities. LRF works best for high-frequency soil respiration measurements and thus is particularly suitable for analyzing time series produced by automated soil chambers and from soil incubation experiments. We evaluated the validity of LRF with both simulated (with a multipool soil organic carbon model driven by realistic plant litter input scenarios) and measured (with automated soil chambers) time series of soil respiration. LRF accurately retrieved the true temperature sensitivity from the simulated heterotrophic soil respiration while the conventional approach failed to do so. The simulation also revealed that LRF performed better than the conventional approach when a direct photosynthetic signal existed in the time series of soil respiration although even LRF could not completely eliminate the interference of photosynthetic contribution for estimating the true temperature sensitivity. Importantly, the simulation on the photosynthetic influence reproduced a typical seasonal pattern of apparent temperature sensitivity reported in the literature: higher sensitivity in winter (dormant season) and lower sensitivity in summer (growing season). Such a pattern has been interpreted as an indication of temperature acclimation of soil respiration by previous studies. Our simulation now indicated that that interpretation may be incorrect. The

  12. Application of a PCR-based approach to identify sex in Hawaiian honeycreepers (Drepanidinae)

    USGS Publications Warehouse

    Jarvi, S.I.; Banko, P.C.

    2000-01-01

    The application of molecular techniques to conservation genetics issues can provide important guidance criteria for management of endangered species. The results from this study establish that PCR-based approaches for sex determination developed in other bird species (Griffiths and Tiwari 1995; Griffiths et al. 1996, 1998; Ellegren 1996) can be applied with a high degree of confidence to at least four species of Hawaiian honeycreepers. This provides a rapid, reliable method with which population managers can optimize sex ratios within populations of endangered species that are subject to artificial manipulation through captive breeding programmes or geographic translocation.

  14. A stable isotope approach and its application for identifying nitrate source and transformation process in water.

    PubMed

    Xu, Shiguo; Kang, Pingping; Sun, Ya

    2016-01-01

    Nitrate contamination of water is a worldwide environmental problem. Recent studies have demonstrated that the nitrogen (N) and oxygen (O) isotopes of nitrate (NO3(-)) can be used to trace nitrogen dynamics including identifying nitrate sources and nitrogen transformation processes. This paper analyzes the current state of identifying nitrate sources and nitrogen transformation processes using N and O isotopes of nitrate. With regard to nitrate sources, δ(15)N-NO3(-) and δ(18)O-NO3(-) values typically vary between sources, allowing the sources to be isotopically fingerprinted. δ(15)N-NO3(-) is often effective at tracing NO3(-) sources from areas with different land use. δ(18)O-NO3(-) is more useful to identify NO3(-) from atmospheric sources. Isotopic data can be combined with statistical mixing models to quantify the relative contributions of NO3(-) from multiple delineated sources. With regard to N transformation processes, N and O isotopes of nitrate can be used to decipher the degree of nitrogen transformation by such processes as nitrification, assimilation, and denitrification. In some cases, however, isotopic fractionation may alter the isotopic fingerprint associated with the delineated NO3(-) source(s). This problem may be addressed by combining the N and O isotopic data with other types of data, including the concentration of selected conservative elements, e.g., chloride (Cl(-)), boron isotope (δ(11)B), and sulfur isotope (δ(34)S) data. Future studies should focus on improving stable isotope mixing models and furthering our understanding of isotopic fractionation by conducting laboratory and field experiments in different environments. PMID:26541149
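
    The idea behind source apportionment can be illustrated with the simplest possible case: a two-source, single-isotope linear mass balance. This is far simpler than the statistical mixing models the review refers to and ignores fractionation entirely. The function name and the delta values below are hypothetical and are not taken from the paper.

```python
def two_source_fraction(delta_mix, delta_a, delta_b):
    """Fraction of nitrate from source A under a linear two-end-member balance.

    Solves delta_mix = f * delta_a + (1 - f) * delta_b for f.
    All delta values are in per mil; fractionation is assumed negligible.
    """
    if delta_a == delta_b:
        raise ValueError("end members must have distinct isotopic signatures")
    f = (delta_mix - delta_b) / (delta_a - delta_b)
    if not 0.0 <= f <= 1.0:
        raise ValueError("mixture lies outside the range spanned by the end members")
    return f

# Illustrative numbers only: source A at +20 per mil, source B at 0 per mil.
print(two_source_fraction(delta_mix=10.0, delta_a=20.0, delta_b=0.0))
```

    With more than two sources or with both δ(15)N and δ(18)O measured, the balance becomes a system of equations, which is where the statistical mixing models mentioned in the abstract come in.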

  15. An inverse docking approach for identifying new potential anti-cancer targets.

    PubMed

    Grinter, Sam Z; Liang, Yayun; Huang, Sheng-You; Hyder, Salman M; Zou, Xiaoqin

    2011-04-01

    Inverse docking is a relatively new technique that has been used to identify potential receptor targets of small molecules. Our docking software package MDock is well suited for such an application as it is computationally efficient yet shows adequate results in binding affinity predictions and enrichment tests. As a validation study, we present the first stage results of an inverse-docking study which seeks to identify potential direct targets of PRIMA-1. PRIMA-1 is well known for its ability to restore mutant p53's tumor suppressor function, leading to apoptosis in several types of cancer cells. For this reason, we believe that potential direct targets of PRIMA-1 identified in silico should be experimentally screened for their ability to inhibit cancer cell growth. The highest-ranked human protein of our PRIMA-1 docking results is oxidosqualene cyclase (OSC), which is part of the cholesterol synthetic pathway. The results of two follow-up experiments which treat OSC as a possible anti-cancer target are promising. We show that both PRIMA-1 and Ro 48-8071, a known potent OSC inhibitor, significantly reduce the viability of BT-474 and T47-D breast cancer cells relative to normal mammary cells. In addition, we find that Ro 48-8071, like PRIMA-1, results in increased binding of p53 to DNA in BT-474 cells (which express mutant p53). For the first time, Ro 48-8071 is shown as a potent agent in killing human breast cancer cells. The potential of OSC as a new target for developing anticancer therapies is worth further investigation.

  16. An approach to developing independent learning and non-technical skills amongst final year mining engineering students

    NASA Astrophysics Data System (ADS)

    Knobbs, C. G.; Grayson, D. J.

    2012-06-01

    There is mounting evidence to show that engineers need more than technical skills to succeed in industry. This paper describes a curriculum innovation in which so-called 'soft' skills, specifically inter-personal and intra-personal skills, were integrated into a final year mining engineering course. The instructional approach was designed to promote independent learning and to develop non-technical skills, essential for students on the threshold of becoming practising engineers. Three psychometric tests were administered at the beginning of the course to make students aware of their own and their classmates' characteristics. Substantial prescribed reading assignments preceded weekly group discussions. Several projects during the course required team work skills and application of content knowledge to real-world contexts. Results obtained from students' reflection papers, assignments related to 'soft' skills and end of course evaluations suggest that students' appreciation of the need for these skills, as well as their own perceived competence, increased during the course. Their ability to function as independent learners also increased.

  17. FXR antagonism of NSAIDs contributes to drug-induced liver injury identified by systems pharmacology approach

    PubMed Central

    Lu, Weiqiang; Cheng, Feixiong; Jiang, Jing; Zhang, Chen; Deng, Xiaokang; Xu, Zhongyu; Zou, Shien; Shen, Xu; Tang, Yun; Huang, Jin

    2015-01-01

    Non-steroidal anti-inflammatory drugs (NSAIDs) are used worldwide as analgesic, antipyretic, and anti-inflammatory therapeutics. However, NSAIDs often cause serious liver injuries, such as drug-induced liver injury (DILI), and the molecular mechanisms of DILI have not been clearly elucidated. In this study, we developed a systems pharmacology approach to explore the mechanism-of-action of NSAIDs. Through systematic network analysis and in vitro assays, we found that the Farnesoid X Receptor (FXR) antagonism of NSAIDs is a potential molecular mechanism of DILI. Specifically, the quantitative real-time PCR assay reveals that indomethacin and ibuprofen regulate FXR downstream target gene expression in HepG2 cells. Furthermore, the western blot shows that FXR antagonism by indomethacin induces the phosphorylation of STAT3 (signal transducer and activator of transcription 3), promotes the activation of caspase9, and finally causes DILI. In summary, our systems pharmacology approach provided novel insights into molecular mechanisms of DILI for NSAIDs, which may pave the way toward the design of novel anti-inflammatory pharmacotherapeutics. PMID:25631039

  18. MartiTracks: A Geometrical Approach for Identifying Geographical Patterns of Distribution

    PubMed Central

    Echeverría-Londoño, Susy; Miranda-Esquivel, Daniel Rafael

    2011-01-01

    Panbiogeography represents an evolutionary approach to biogeography, using rational cost-efficient methods to reduce initial complexity to locality data and depict general distribution patterns. However, few quantitative and automated panbiogeographic methods exist. In this study, we propose a new algorithm, within a quantitative, geometrical framework, to perform panbiogeographical analyses as an alternative to more traditional methods. The algorithm first calculates a minimum spanning tree for each species, which serves as its individual track in a panbiogeographic context. Then the spatial congruence among segments of the minimum spanning trees is calculated using five congruence parameters, producing a general distribution pattern. In addition, the algorithm removes the ambiguity and subjectivity often present in a manual panbiogeographic analysis. Results from two empirical examples using 61 species of the genus Bomarea (2340 records) and 1031 genera of both plants and animals (100118 records) distributed across the Northern Andes demonstrated that a geometrical approach to panbiogeography is a feasible quantitative method to determine general distribution patterns for taxa, reducing complexity and the time needed for managing large data sets. PMID:21533259
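
    The first step described above, building a minimum spanning tree over the locality records of one species, can be sketched as follows. This is a minimal illustration and not the MartiTracks implementation: the function name `minimum_spanning_tree` is hypothetical, Prim's algorithm is one of several ways to build such a tree, and plain Euclidean distance on coordinate pairs is an assumption made for brevity (a real analysis would likely use geographic distances).

```python
import math


def minimum_spanning_tree(points):
    """Return MST edges (i, j, distance) over 2-D points using Prim's algorithm.

    `points` is a list of (x, y) tuples, e.g. locality records of one species.
    Euclidean distance is an illustrative assumption.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each point
    best_from = [-1] * n         # tree vertex that provides that link
    best_dist[0] = 0.0           # start growing the tree from the first point
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Update the cheapest links for the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges
```

    Each returned edge is one segment of the individual track; the congruence comparison among tracks described in the abstract is not covered by this sketch.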

  19. Identifying differential expression in multiple SAGE libraries: an overdispersed log-linear model approach

    PubMed Central

    Lu, Jun; Tomfohr, John K; Kepler, Thomas B

    2005-01-01

    Background: In testing for differential gene expression involving multiple serial analysis of gene expression (SAGE) libraries, it is critical to account for both between- and within-library variation. Several methods have been proposed, including the t test, tw test, and an overdispersed logistic regression approach. The merits of these tests, however, have not been fully evaluated. Questions still remain on whether further improvements can be made. Results: In this article, we introduce an overdispersed log-linear model approach to analyzing SAGE; we evaluate and compare its performance with three other tests: the two-sample t test, the tw test, and another based on overdispersed logistic linear regression. Analysis of simulated and real datasets shows that both the log-linear and logistic overdispersion methods generally perform better than the t and tw tests; the log-linear method is further found to have better performance than the logistic method, showing equal or higher statistical power over a range of parameter values and with different data distributions. Conclusion: Overdispersed log-linear models provide an attractive and reliable framework for analyzing SAGE experiments involving multiple libraries. For convenience, the implementation of this method is available through a user-friendly web-interface available at . PMID:15987513

  20. Mass spectrometric approach for identifying putative plasma membrane proteins of Arabidopsis leaves associated with cold acclimation.

    PubMed

    Kawamura, Yukio; Uemura, Matsuo

    2003-10-01

    Although enhancement of freezing tolerance in plants during cold acclimation is closely associated with an increase in the cryostability of plasma membrane, the molecular mechanism for the increased cryostability of plasma membrane is still to be elucidated. In Arabidopsis, enhanced freezing tolerance was detectable after cold acclimation at 2 °C for as little as 1 day, and maximum freezing tolerance was attained after 1 week. To identify the plasma membrane proteins that change in quantity in response to cold acclimation, a highly purified plasma membrane fraction was isolated from leaves before and during cold acclimation, and the proteins in the fraction were separated with gel electrophoresis. We found that there were substantial changes in the protein profiles after as little as 1 day of cold acclimation. Subsequently, using matrix-assisted laser desorption-ionization time-of-flight mass spectrometry (MALDI-TOF MS), we identified 38 proteins that changed in quantity during cold acclimation. The proteins that changed in quantity during the first day of cold acclimation include those that are associated with membrane repair by membrane fusion, protection of the membrane against osmotic stress, enhancement of CO2 fixation, and proteolysis.

  1. Genetic Susceptibility to Vitiligo: GWAS Approaches for Identifying Vitiligo Susceptibility Genes and Loci.

    PubMed

    Shen, Changbing; Gao, Jing; Sheng, Yujun; Dou, Jinfa; Zhou, Fusheng; Zheng, Xiaodong; Ko, Randy; Tang, Xianfa; Zhu, Caihong; Yin, Xianyong; Sun, Liangdan; Cui, Yong; Zhang, Xuejun

    2016-01-01

    Vitiligo is an autoimmune disease with a strong genetic component, characterized by areas of depigmented skin resulting from loss of epidermal melanocytes. Genetic factors are known to play key roles in vitiligo through discoveries in association studies and family studies. Previously, vitiligo susceptibility genes were mainly revealed through linkage analysis and candidate gene studies. Recently, our understanding of the genetic basis of vitiligo has been rapidly advancing through genome-wide association study (GWAS). More than 40 robust susceptible loci have been identified and confirmed to be associated with vitiligo by using GWAS. Most of these associated genes participate in important pathways involved in the pathogenesis of vitiligo. Many susceptible loci with unknown functions in the pathogenesis of vitiligo have also been identified, indicating that additional molecular mechanisms may contribute to the risk of developing vitiligo. In this review, we summarize the key loci that are of genome-wide significance, which have been shown to influence vitiligo risk. These genetic loci may help build the foundation for genetic diagnosis and personalize treatment for patients with vitiligo in the future. However, substantial additional studies, including gene-targeted and functional studies, are required to confirm the causality of the genetic variants and their biological relevance in the development of vitiligo.

  2. A Model-Based Approach for Identifying Signatures of Ancient Balancing Selection in Genetic Data

    PubMed Central

    DeGiorgio, Michael; Lohmueller, Kirk E.; Nielsen, Rasmus

    2014-01-01

    While much effort has focused on detecting positive and negative directional selection in the human genome, relatively little work has been devoted to balancing selection. This lack of attention is likely due to the paucity of sophisticated methods for identifying sites under balancing selection. Here we develop two composite likelihood ratio tests for detecting balancing selection. Using simulations, we show that these methods outperform competing methods under a variety of assumptions and demographic models. We apply the new methods to whole-genome human data, and find a number of previously-identified loci with strong evidence of balancing selection, including several HLA genes. Additionally, we find evidence for many novel candidates, the strongest of which is FANK1, an imprinted gene that suppresses apoptosis, is expressed during meiosis in males, and displays marginal signs of segregation distortion. We hypothesize that balancing selection acts on this locus to stabilize the segregation distortion and negative fitness effects of the distorter allele. Thus, our methods are able to reproduce many previously-hypothesized signals of balancing selection, as well as discover novel interesting candidates. PMID:25144706

  3. A Proteomic Approach Identifies Candidate Early Biomarkers to Predict Severe Dengue in Children

    PubMed Central

    Nhi, Dang My; Huy, Nguyen Tien; Ohyama, Kaname; Kimura, Daisuke; Lan, Nguyen Thi Phuong; Uchida, Leo; Thuong, Nguyen Van; Nhon, Cao Thi My; Phuc, Le Hong; Mai, Nguyen Thi; Mizukami, Shusaku; Bao, Lam Quoc; Doan, Nguyen Ngoc; Binh, Nguyen Van Thanh; Quang, Luong Chan; Karbwang, Juntra; Yui, Katsuyuki; Morita, Kouichi; Huong, Vu Thi Que; Hirayama, Kenji

    2016-01-01

    Background: Severe dengue with severe plasma leakage (SD-SPL) is the most frequent severe form of dengue. Plasma biomarkers for early predictive diagnosis of SD-SPL are required in primary clinics for the prevention of dengue death. Methodology: Among 63 confirmed pediatric dengue patients recruited, a hospital-based longitudinal study detected six with SD-SPL and ten with dengue with warning signs (DWS). To identify the specific proteins increased or decreased in SD-SPL plasma obtained 6–48 hours before shock compared with DWS, isobaric tags for relative and absolute quantification (iTRAQ) technology was applied using four patients in each group. Validation was undertaken in 6 SD-SPL and 10 DWS patients. Principal findings: Nineteen plasma proteins exhibited significantly different relative concentrations (p<0.05), with five over-expressed and fourteen under-expressed in SD-SPL compared with DWS. The individual proteins were classified into blood coagulation, vascular regulation, cellular transport-related processes, or immune response. The immunoblot quantification showed that angiotensinogen and antithrombin III were significantly increased in early-stage SD-SPL whole plasma compared with DWS subjects. Even using this small number of samples, antithrombin III predicted SD-SPL before shock occurrence with accuracy. Conclusion: Proteins identified here may serve as candidate predictive markers to diagnose SD-SPL for timely clinical management. Because the number of subjects is small, further studies are needed to confirm all these biomarkers. PMID:26895439

  4. Proteomics Approaches to Identify Mono(ADP-ribosyl)ated and Poly(ADP-ribosyl)ated proteins

    PubMed Central

    Vivelo, Christina A.; Leung, Anthony K. L.

    2015-01-01

    ADP-ribosylation refers to the addition of one or more ADP-ribose units onto protein substrates and this protein modification has been implicated in various cellular processes including DNA damage repair, RNA metabolism, transcription and cell cycle regulation. This review focuses on a compilation of large-scale proteomics studies that identify ADP-ribosylated proteins and their associated proteins by mass spectrometry using a variety of enrichment strategies. Some methods, such as the use of a poly(ADP-ribose)-specific antibody and boronate affinity chromatography and NAD+ analogues, have been employed for decades while others, such as the use of protein microarrays and recombinant proteins that bind ADP-ribose moieties (such as macrodomains), have only recently been developed. The advantages and disadvantages of each method and whether these methods are specific for identifying mono(ADP-ribosyl)ated and poly(ADP-ribosyl)ated proteins will be discussed. Lastly, since poly(ADP-ribose) is heterogeneous in length, it has been difficult to attain a mass signature associated with the modification sites. Several strategies on how to reduce polymer chain length heterogeneity for site identification will be reviewed. PMID:25263235

  5. Unique drug screening approach for prion diseases identifies tacrolimus and astemizole as antiprion agents.

    PubMed

    Karapetyan, Yervand Eduard; Sferrazza, Gian Franco; Zhou, Minghai; Ottenberg, Gregory; Spicer, Timothy; Chase, Peter; Fallahi, Mohammad; Hodder, Peter; Weissmann, Charles; Lasmézas, Corinne Ida

    2013-04-23

    Prion diseases such as Creutzfeldt-Jakob disease (CJD) are incurable and rapidly fatal neurodegenerative diseases. Because prion protein (PrP) is necessary for prion replication but dispensable for the host, we developed the PrP-FRET-enabled high throughput assay (PrP-FEHTA) to screen for compounds that decrease PrP expression. We screened a collection of drugs approved for human use and identified astemizole and tacrolimus, which reduced cell-surface PrP and inhibited prion replication in neuroblastoma cells. Tacrolimus reduced total cellular PrP levels by a nontranscriptional mechanism. Astemizole stimulated autophagy, a hitherto unreported mode of action for this pharmacophore. Astemizole, but not tacrolimus, prolonged the survival time of prion-infected mice. Astemizole is used in humans to treat seasonal allergic rhinitis in a chronic setting. Given the absence of any treatment option for CJD patients and the favorable drug characteristics of astemizole, including its ability to cross the blood-brain barrier, it may be considered as therapy for CJD patients and for prophylactic use in familial prion diseases. Importantly, our results validate PrP-FEHTA as a method to identify antiprion compounds and, more generally, FEHTA as a unique drug discovery platform. PMID:23576755

  6. A molecular approach to identify active microbes in environmental eukaryote clone libraries.

    PubMed

    Stoeck, Thorsten; Zuendorf, Alexandra; Breiner, Hans-Werner; Behnke, Anke

    2007-02-01

    A rapid method for the simultaneous extraction of RNA and DNA from eukaryote plankton samples was developed in order to discriminate between indigenous active cells and signals from inactive or even dead organisms. The method was tested using samples from below the chemocline of an anoxic Danish fjord. The simple protocol yielded RNA and DNA of a purity suitable for amplification by reverse transcription-polymerase chain reaction (RT-PCR) and PCR, respectively. We constructed an rRNA-derived and an rDNA-derived clone library to assess the composition of the microeukaryote assemblage under study and to identify physiologically active constituents of the community. We retrieved nearly 600 protistan target clones, which grouped into 84 different phylotypes (98% sequence similarity). Of these phylotypes, 27% occurred in both libraries, 25% exclusively in the rRNA library, and 48% exclusively in the rDNA library. Both libraries revealed good correspondence of the general community composition in terms of higher taxonomic ranks. They were dominated by anaerobic ciliates and heterotrophic stramenopile flagellates thriving below the fjord's chemocline. The high abundance of these bacterivore organisms points out their role as a major trophic link in anoxic marine systems. A comparison of the two libraries identified phototrophic dinoflagellates, "uncultured marine alveolates group I," and different parasites, which were exclusively detected with the rDNA-derived library, as nonindigenous members of the anoxic microeukaryote community under study.
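
    The comparison of the two clone libraries reported above (phylotypes shared by both libraries versus those found in only one) amounts to a set comparison. The sketch below is an illustration under stated assumptions, not the authors' analysis pipeline: the function name `library_overlap` and the phylotype labels used when calling it are hypothetical, and fractions are computed relative to the union of both libraries.

```python
def library_overlap(rrna_phylotypes, rdna_phylotypes):
    """Return the fraction of phylotypes shared by, or exclusive to, each library.

    Both arguments are iterables of phylotype identifiers (hypothetical labels).
    Fractions are relative to the union of the two libraries.
    """
    rrna = set(rrna_phylotypes)
    rdna = set(rdna_phylotypes)
    total = len(rrna | rdna)
    if total == 0:
        raise ValueError("both libraries are empty")
    return {
        "shared": len(rrna & rdna) / total,      # detected in both libraries
        "rrna_only": len(rrna - rdna) / total,   # detected only via rRNA
        "rdna_only": len(rdna - rrna) / total,   # detected only via rDNA
    }
```

    In the study's reading, phylotypes that appear only in the rDNA-derived library are the candidates for inactive or nonindigenous members of the community.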

  7. Genetic Susceptibility to Vitiligo: GWAS Approaches for Identifying Vitiligo Susceptibility Genes and Loci

    PubMed Central

    Shen, Changbing; Gao, Jing; Sheng, Yujun; Dou, Jinfa; Zhou, Fusheng; Zheng, Xiaodong; Ko, Randy; Tang, Xianfa; Zhu, Caihong; Yin, Xianyong; Sun, Liangdan; Cui, Yong; Zhang, Xuejun

    2016-01-01

    Vitiligo is an autoimmune disease with a strong genetic component, characterized by areas of depigmented skin resulting from loss of epidermal melanocytes. Genetic factors are known to play key roles in vitiligo through discoveries in association studies and family studies. Previously, vitiligo susceptibility genes were mainly revealed through linkage analysis and candidate gene studies. Recently, our understanding of the genetic basis of vitiligo has been rapidly advancing through genome-wide association study (GWAS). More than 40 robust susceptible loci have been identified and confirmed to be associated with vitiligo by using GWAS. Most of these associated genes participate in important pathways involved in the pathogenesis of vitiligo. Many susceptible loci with unknown functions in the pathogenesis of vitiligo have also been identified, indicating that additional molecular mechanisms may contribute to the risk of developing vitiligo. In this review, we summarize the key loci that are of genome-wide significance, which have been shown to influence vitiligo risk. These genetic loci may help build the foundation for genetic diagnosis and personalize treatment for patients with vitiligo in the future. However, substantial additional studies, including gene-targeted and functional studies, are required to confirm the causality of the genetic variants and their biological relevance in the development of vitiligo. PMID:26870082

  8. Proteomics approaches to identify mono-(ADP-ribosyl)ated and poly(ADP-ribosyl)ated proteins.

    PubMed

    Vivelo, Christina A; Leung, Anthony K L

    2015-01-01

    ADP-ribosylation refers to the addition of one or more ADP-ribose units onto protein substrates and this protein modification has been implicated in various cellular processes including DNA damage repair, RNA metabolism, transcription, and cell cycle regulation. This review focuses on a compilation of large-scale proteomics studies that identify ADP-ribosylated proteins and their associated proteins by MS using a variety of enrichment strategies. Some methods, such as the use of a poly(ADP-ribose)-specific antibody and boronate affinity chromatography and NAD(+) analogues, have been employed for decades while others, such as the use of protein microarrays and recombinant proteins that bind ADP-ribose moieties (such as macrodomains), have only recently been developed. The advantages and disadvantages of each method and whether these methods are specific for identifying mono(ADP-ribosyl)ated and poly(ADP-ribosyl)ated proteins will be discussed. Lastly, since poly(ADP-ribose) is heterogeneous in length, it has been difficult to attain a mass signature associated with the modification sites. Several strategies on how to reduce polymer chain length heterogeneity for site identification will be reviewed. PMID:25263235

  9. Identifying the greatest team and captain—A complex network approach to cricket matches

    NASA Astrophysics Data System (ADS)

    Mukherjee, Satyam

    2012-12-01

    We consider all Test matches played between 1877 and 2010 and One Day International (ODI) matches played between 1971 and 2010. We form directed and weighted networks of teams and also of their captains. The success of a team (or captain) is determined by the ‘quality’ of the wins, not simply by the number of wins. We apply the diffusion-based PageRank algorithm to the networks to assess the importance of the wins, and rank the respective teams and captains. Our analysis identifies Australia as the best team in both forms of cricket, Test and ODI. Steve Waugh is identified as the best captain in Test cricket and Ricky Ponting is the best captain in the ODI format. We also compare our ranking scheme with an existing ranking scheme, the Reliance ICC ranking. Our method does not depend on ‘external’ criteria in the ranking of teams (captains). The purpose of this paper is to introduce a revised ranking of cricket teams and to quantify the success of the captains.
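
    The ranking idea above, applying PageRank to a directed, weighted network of teams, can be sketched with a plain power iteration. This is a minimal illustration, not the paper's code: directing each edge from the loser to the winner with the number of wins as its weight, the damping factor of 0.85, and the function name `pagerank` are all assumptions made for this example.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=1000):
    """Weighted PageRank by power iteration.

    `edges` maps (source, target) -> positive weight. In this sketch an edge
    points from the losing team to the winning team (an assumption).
    Returns {node: score}; the scores sum to 1.
    """
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_weight = {u: 0.0 for u in nodes}
    for (u, _v), w in edges.items():
        out_weight[u] += w
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Nodes without outgoing edges spread their score uniformly.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0.0)
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for (u, v), w in edges.items():
            if w > 0:
                new_rank[v] += damping * rank[u] * w / out_weight[u]
        change = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank
```

    Because a team's score depends on the scores of the teams it has beaten, a win against a highly ranked opponent counts for more than a win against a weak one, which matches the abstract's emphasis on the quality of wins over their number.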

  10. What's Inside That Seed We Brew? A New Approach To Mining the Coffee Microbiome.

    PubMed

    Vaughan, Michael Joe; Mitchell, Thomas; McSpadden Gardener, Brian B

    2015-10-01

    Coffee is a critically important agricultural commodity for many tropical states and is a beverage enjoyed by millions of people worldwide. Recent concerns over the sustainability of coffee production have prompted investigations of the coffee microbiome as a tool to improve crop health and bean quality. This review synthesizes literature informing our knowledge of the coffee microbiome, with an emphasis on applications of fruit- and seed-associated microbes in coffee production and processing. A comprehensive inventory of microbial species cited in association with coffee fruits and seeds is presented as a reference tool for researchers investigating coffee-microbe associations. It concludes with a discussion of the approaches and techniques that provide a path forward to improve our understanding of the coffee microbiome and its utility, as a whole and as individual components, to help ensure the future sustainability of coffee production.

  11. What's Inside That Seed We Brew? A New Approach To Mining the Coffee Microbiome.

    PubMed

    Vaughan, Michael Joe; Mitchell, Thomas; McSpadden Gardener, Brian B

    2015-10-01

    Coffee is a critically important agricultural commodity for many tropical states and is a beverage enjoyed by millions of people worldwide. Recent concerns over the sustainability of coffee production have prompted investigations of the coffee microbiome as a tool to improve crop health and bean quality. This review synthesizes literature informing our knowledge of the coffee microbiome, with an emphasis on applications of fruit- and seed-associated microbes in coffee production and processing. A comprehensive inventory of microbial species cited in association with coffee fruits and seeds is presented as a reference tool for researchers investigating coffee-microbe associations. It concludes with a discussion of the approaches and techniques that provide a path forward to improve our understanding of the coffee microbiome and its utility, as a whole and as individual components, to help ensure the future sustainability of coffee production. PMID:26162877

  12. What's Inside That Seed We Brew? A New Approach To Mining the Coffee Microbiome

    PubMed Central

    Mitchell, Thomas; McSpadden Gardener, Brian B.

    2015-01-01

    Coffee is a critically important agricultural commodity for many tropical states and is a beverage enjoyed by millions of people worldwide. Recent concerns over the sustainability of coffee production have prompted investigations of the coffee microbiome as a tool to improve crop health and bean quality. This review synthesizes literature informing our knowledge of the coffee microbiome, with an emphasis on applications of fruit- and seed-associated microbes in coffee production and processing. A comprehensive inventory of microbial species cited in association with coffee fruits and seeds is presented as a reference tool for researchers investigating coffee-microbe associations. It concludes with a discussion of the approaches and techniques that provide a path forward to improve our understanding of the coffee microbiome and its utility, as a whole and as individual components, to help ensure the future sustainability of coffee production. PMID:26162877

  13. Prediction of possible CaMnO3 modifications using an ab initio minimization data-mining approach.

    PubMed

    Zagorac, Jelena; Zagorac, Dejan; Zarubica, Aleksandra; Schön, J Christian; Djuris, Katarina; Matovic, Branko

    2014-10-01

    We have performed a crystal structure prediction study of CaMnO3 focusing on structures generated by octahedral tilting according to group-subgroup relations from the ideal perovskite type ($Pm\overline{3}m$), which is the aristotype of the experimentally known CaMnO3 compound in the Pnma space group. Furthermore, additional structure candidates have been obtained using data mining. For each of the structure candidates, a local optimization on the ab initio level using density-functional theory (LDA, hybrid B3LYP) and the Hartree-Fock (HF) method was performed, and we find that several of the modifications may be experimentally accessible. In the high-pressure regime, we identify a post-perovskite phase in the CaIrO3 type, not previously observed in CaMnO3. Similarly, calculations at effective negative pressure predict a phase transition from the orthorhombic perovskite to an ilmenite-type (FeTiO3) modification of CaMnO3.

  14. Detecting a Weak Association by Testing its Multiple Perturbations: a Data Mining Approach

    NASA Astrophysics Data System (ADS)

    Lo, Min-Tzu; Lee, Wen-Chung

    2014-05-01

    Many risk factors/interventions in epidemiologic/biomedical studies are of minuscule effects. To detect such weak associations, one needs a study with a very large sample size (the number of subjects, n). The n of a study can be increased but unfortunately only to an extent. Here, we propose a novel method which hinges on increasing sample size in a different direction: the total number of variables (p). We construct a p-based 'multiple perturbation test', and conduct power calculations and computer simulations to show that it can achieve a very high power to detect weak associations when p can be made very large. As a demonstration, we apply the method to analyze a genome-wide association study on age-related macular degeneration and identify two novel genetic variants that are significantly associated with the disease. The p-based method may set a stage for a new paradigm of statistical tests.

  15. Ab initio thermodynamic approach to identify mixed solid sorbents for CO2 capture technology

    SciTech Connect

    Duan, Yuhua

    2015-10-15

    Because the current technologies for capturing CO2 are still too energy intensive, new materials must be developed that can capture CO2 reversibly with acceptable energy costs. At a given CO2 pressure, the turnover temperature (Tt) of the reaction of an individual solid that can capture CO2 is fixed. Such Tt may be outside the operating temperature range (ΔTo) for a practical capture technology. To adjust Tt to fit the practical ΔTo, in this study, three scenarios of mixing schemes are explored by combining thermodynamic database mining with first principles density functional theory and phonon lattice dynamics calculations. Our calculated results demonstrate that by mixing different types of solids, it is possible to shift Tt to the range of practical operating temperature conditions. According to the requirements imposed by the pre- and post-combustion technologies and based on our calculated thermodynamic properties for the CO2 capture reactions by the mixed solids of interest, we were able to identify the mixing ratios of two or more solids to form new sorbent materials for which lower capture energy costs are expected at the desired pressure and temperature conditions.

  16. Identifying dominant controls on hydrologic parameter transfer from gauged to ungauged catchments: a comparative hydrology approach

    USGS Publications Warehouse

    Singh, R.; Archfield, S.A.; Wagener, T.

    2014-01-01

    Daily streamflow information is critical for solving various hydrologic problems, though observations of continuous streamflow for model calibration are available at only a small fraction of the world’s rivers. One approach to estimate daily streamflow at an ungauged location is to transfer rainfall–runoff model parameters calibrated at a gauged (donor) catchment to an ungauged (receiver) catchment of interest. Central to this approach is the selection of a hydrologically similar donor. No single metric or set of metrics of hydrologic similarity has been demonstrated to consistently select a suitable donor catchment. We design an experiment to diagnose the dominant controls on successful hydrologic model parameter transfer. We calibrate a lumped rainfall–runoff model to 83 stream gauges across the United States. All locations are USGS reference gauges with minimal human influence. Parameter sets from the calibrated models are then transferred to each of the other catchments and the performance of the transferred parameters is assessed. This transfer experiment is carried out both at the scale of the entire US and then for six geographic regions. We use classification and regression tree (CART) analysis to determine the relationship between catchment similarity and performance of transferred parameters. Similarity is defined using physical/climatic catchment characteristics, as well as streamflow response characteristics (signatures such as baseflow index and runoff ratio). Across the entire US, successful parameter transfer is governed by similarity in elevation and climate, and high similarity in streamflow signatures. Controls vary across geographic regions, however. Geology followed by drainage, topography and climate constitute the dominant similarity metrics in forested eastern mountains and plateaus, whereas agricultural land use relates most strongly to successful parameter transfer in the humid plains.
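
    One of the streamflow signatures named above, the runoff ratio, is commonly defined as total streamflow divided by total precipitation over the same period. The sketch below illustrates that definition and a simple absolute difference between a donor and a receiver catchment; it is not the study's code, the function names `runoff_ratio` and `signature_difference` are hypothetical, and using an absolute difference as the similarity measure is an assumption for illustration only.

```python
def runoff_ratio(streamflow_mm, precipitation_mm):
    """Return total streamflow divided by total precipitation.

    Both series must cover the same period and use the same depth units
    (for example, mm per day).
    """
    if len(streamflow_mm) != len(precipitation_mm):
        raise ValueError("series must have the same length")
    total_precipitation = sum(precipitation_mm)
    if total_precipitation <= 0:
        raise ValueError("total precipitation must be positive")
    return sum(streamflow_mm) / total_precipitation


def signature_difference(donor_value, receiver_value):
    """Absolute difference in one signature; smaller means more similar."""
    return abs(donor_value - receiver_value)
```

    A signature can only be computed where streamflow is observed, which is why the study evaluates transfers among gauged catchments before drawing conclusions for ungauged ones.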

  17. Identifying dominant controls on hydrologic parameter transfer from gauged to ungauged catchments - A comparative hydrology approach

    NASA Astrophysics Data System (ADS)

    Singh, R.; Archfield, S. A.; Wagener, T.

    2014-09-01

    Daily streamflow information is critical for solving various hydrologic problems, though observations of continuous streamflow for model calibration are available at only a small fraction of the world's rivers. One approach to estimate daily streamflow at an ungauged location is to transfer rainfall-runoff model parameters calibrated at a gauged (donor) catchment to an ungauged (receiver) catchment of interest. Central to this approach is the selection of a hydrologically similar donor. No single metric or set of metrics of hydrologic similarity has been demonstrated to consistently select a suitable donor catchment. We design an experiment to diagnose the dominant controls on successful hydrologic model parameter transfer. We calibrate a lumped rainfall-runoff model to 83 stream gauges across the United States. All locations are USGS reference gauges with minimal human influence. Parameter sets from the calibrated models are then transferred to each of the other catchments and the performance of the transferred parameters is assessed. This transfer experiment is carried out both at the scale of the entire US and then for six geographic regions. We use classification and regression tree (CART) analysis to determine the relationship between catchment similarity and performance of transferred parameters. Similarity is defined using physical/climatic catchment characteristics, as well as streamflow response characteristics (signatures such as baseflow index and runoff ratio). Across the entire US, successful parameter transfer is governed by similarity in elevation and climate, and high similarity in streamflow signatures. Controls vary across geographic regions, however. Geology followed by drainage, topography and climate constitute the dominant similarity metrics in forested eastern mountains and plateaus, whereas agricultural land use relates most strongly to successful parameter transfer in the humid plains.

  18. One Health approach to identify research needs in bovine and human babesioses: workshop report

    PubMed Central

    2010-01-01

    Background: Babesia are emerging health threats to humans and animals in the United States. A collaborative effort of multiple disciplines to attain optimal health for people, animals and our environment, otherwise known as the One Health concept, was undertaken during a research workshop held in April 2009 to identify gaps in scientific knowledge regarding babesioses. The impetus for this analysis was the increased risk for outbreaks of bovine babesiosis, also known as Texas cattle fever, associated with the re-infestation of the U.S. by cattle fever ticks. Results: The involvement of wildlife in the ecology of cattle fever ticks jeopardizes the ability of state and federal agencies to keep the national herd free of Texas cattle fever. Similarly, there has been a progressive increase in the number of cases of human babesiosis over the past 25 years due to an increase in the white-tailed deer population. Human babesiosis due to cattle-associated Babesia divergens and Babesia divergens-like organisms has begun to appear in residents of the United States. Research needs for human and bovine babesioses were identified and are presented herein. Conclusions: The translation of this research is expected to provide veterinary and public health systems with the tools to mitigate the impact of bovine and human babesioses. However, economic, political, and social commitments are urgently required, including increased national funding for animal and human Babesia research, to prevent the re-establishment of cattle fever ticks and the increasing problem of human babesiosis in the United States. PMID:20377902

  20. Impact of trace metals from past mining on the aquatic ecosystem: a multi-proxy approach in the Morvan (France).

    PubMed

    Camizuli, E; Monna, F; Scheifler, R; Amiotte-Suchet, P; Losno, R; Beis, P; Bohard, B; Chateau, C; Alibert, P

    2014-10-01

    This study seeks to determine to what extent trace metals resulting from past mining activities are transferred to the aquatic ecosystem, and whether such trace metals still exert deleterious effects on biota. Concentrations of Cd, Cu, Pb and Zn were measured in streambed sediments, transplanted bryophytes and wild brown trout. This study was conducted at two scales: (i) the entire Morvan Regional Nature Park and (ii) three small watersheds selected for their degree of contamination, based on the presence or absence of past mining sites. The overall quality of streambed sediments was assessed using Sediment Quality Indices (SQIs). According to these standard guidelines, more than 96% of the sediments sampled should not represent a threat to biota. Nonetheless, in watersheds where past mining occurred, SQIs are significantly lower. Transplanted bryophytes at these sites consistently present higher trace metal concentrations. For wild brown trout, the scaled mass and liver indices appear to be negatively correlated with liver Pb concentrations, but there are no obvious relationships between past mining and liver metal concentrations or the developmental instability of specimens. Although the impact of past mining and metallurgical works is apparently not as strong as that usually observed in modern mining sites, it is still traceable. For this reason, past mining sites should be monitored, particularly in protected areas erroneously thought to be free of anthropogenic contamination. PMID:25255284

  1. Evaluation of an innovative approach based on prototype engineered wetland to control and manage boron (B) mine effluent pollution.

    PubMed

    Türker, Onur Can; Türe, Cengiz; Böcük, Harun; Yakar, Anıl; Chen, Yi

    2016-10-01

    A major environmental problem associated with boron (B) mining in many parts of the world is B pollution, which can become a point source of B mine effluent pollution to aquatic habitats. In this study, a cost-effective, environment-friendly, and sustainable prototype engineered wetland was evaluated and tested to prevent B mine effluent from spilling into adjoining waterways in the largest B reserve in the world. According to the results, average B concentrations in mine effluent significantly decreased from 17.5 to 5.7 mg/L after passing through the prototype with a hydraulic retention time of 14 days. The results of the present experiment, in which different doses of B had been introduced into the prototype, also demonstrated that Typha latifolia (selected as donor species in the prototype) showed good resistance to changes in B mine effluent loading rates. Moreover, we found that soil enzyme activities gradually decreased with increasing B dosages during the experiment. The boron mass balance model further showed that 60% of total B was stored in the filtration media, and only 7% of B was removed by plant uptake. Consequently, we suggest that application of the prototype in the vicinity of mining sites may potentially become an innovative model and integral part of the overall landscape plan of B mine reserve areas worldwide. PMID:27364490
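
    As a worked check on the concentrations reported above, the following sketch computes a percentage removal efficiency from the inlet and outlet values. The formula (inlet minus outlet, divided by inlet) is the conventional definition and is assumed here; the abstract does not state how the authors computed removal, and the function name is illustrative.

```python
def removal_efficiency(c_in, c_out):
    """Percentage of the inlet concentration removed between inlet and outlet."""
    return 100.0 * (c_in - c_out) / c_in


# Average B concentrations reported in the abstract, in mg/L
eff = removal_efficiency(17.5, 5.7)
print(f"{eff:.1f} %")
```

    With these two values the removal works out to roughly two thirds of the inlet concentration.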

  2. Profiling animal toxicants by automatically mining public bioassay data: a big data approach for computational toxicology.

    PubMed

    Zhang, Jun; Hsieh, Jui-Hua; Zhu, Hao

    2014-01-01

    In vitro bioassays have been developed and are currently being evaluated as potential alternatives to traditional animal toxicity models. Already, the progress of high throughput screening techniques has resulted in an enormous amount of publicly available bioassay data having been generated for a large collection of compounds. When a compound is tested using a collection of various bioassays, all the testing results can be considered as providing a unique bio-profile for this compound, which records the responses induced when the compound interacts with different cellular systems or biological targets. Profiling compounds of environmental or pharmaceutical interest using useful toxicity bioassay data is a promising method to study complex animal toxicity. In this study, we developed an automatic virtual profiling tool to evaluate potential animal toxicants. First, we automatically acquired all PubChem bioassay data for a set of 4,841 compounds with publicly available rat acute toxicity results. Next, we developed a scoring system to evaluate the relevance between these extracted bioassays and animal acute toxicity. Finally, the top ranked bioassays were selected to profile the compounds of interest. The resulting response profiles proved to be useful to prioritize untested compounds for their animal toxicity potentials and form a potential in vitro toxicity testing panel. The protocol developed in this study could be combined with structure-activity approaches and used to explore additional publicly available bioassay datasets for modeling a broader range of animal toxicities.

  3. A pattern mining approach to enhance the accuracy of collaborative filtering in sparse data domains

    NASA Astrophysics Data System (ADS)

    Ramezani, Mohsen; Moradi, Parham; Akhlaghian, Fardin

    2014-08-01

    Recommender systems seek to find interesting items by filtering out worthless ones. Collaborative filtering is one of the most successful recommendation approaches. It typically associates a user with a group of like-minded users based on their preferences over all the items and recommends to the user the items that are welcomed by others in the group. However, challenges such as sparsity and computational cost remain. In this paper, to overcome these challenges, we propose a novel method to find the neighbor users based on the users’ interest patterns. The main idea is that users who are interested in the same set of items share similar interest patterns. Therefore, the non-redundant item subspaces are extracted to indicate the different patterns of interest. Then, a user’s tree structure is created based on the patterns he has in common with the active user. Moreover, a novel recommendation method is presented to predict a new rating value for unseen items. Experimental results on the MovieLens and the Jester datasets show that in most cases the proposed method achieves better results than widely used existing methods.
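
    The idea that users interested in the same items make good neighbors can be sketched as follows. This is not the subspace and tree method of the paper; it swaps in plain Jaccard overlap between sets of liked items as a stand-in, and the user names, item IDs and function names are all hypothetical.

```python
def jaccard(a, b):
    """Overlap of two item sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_neighbors(active, liked_items, k=2):
    """Rank the other users by item-set overlap with the active user."""
    scores = [
        (jaccard(liked_items[active], items), user)
        for user, items in liked_items.items()
        if user != active
    ]
    # Highest overlap first; ties broken alphabetically for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [user for _, user in scores[:k]]


# Toy data: each user maps to the set of items they showed interest in.
liked_items = {
    "alice": {"m1", "m2", "m3"},
    "bob": {"m1", "m2"},
    "carol": {"m3", "m4"},
    "dave": {"m5"},
}
neighbors = nearest_neighbors("alice", liked_items)
print(neighbors)
```

    Because the comparison uses only items that users actually interacted with, it does not require ratings over all items, which is the property that matters in sparse data.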

  4. Renewed mining and reclamation: Impacts on bats and potential mitigation

    SciTech Connect

    Brown, P.E.; Berry, R.D.

    1997-12-31

    Historic mining created new roosting habitat for many bat species. Now the same industry has the potential to adversely impact bats. Contemporary mining operations usually occur in historic districts; consequently the old workings are destroyed by open pit operations. Occasionally, underground techniques are employed, resulting in the enlargement or destruction of the original workings. Even during exploratory operations, historic mine openings can be covered as drill roads are bulldozed, or drills can penetrate and collapse underground workings. Nearby blasting associated with mine construction and operation can disrupt roosting bats. Bats can also be disturbed by the entry of mine personnel to collect ore samples or by recreational mine explorers, since the creation of roads often results in easier access. In addition to roost disturbance, other aspects of renewed mining can have adverse impacts on bat populations, and affect even those bats that do not live in mines. Open cyanide ponds, or other water in which toxic chemicals accumulate, can poison bats and other wildlife. The creation of the pits, roads and processing areas often destroys critical foraging habitat or changes drainage patterns. Finally, at the completion of mining, any historic mines still open may be sealed as part of closure and reclamation activities. The net result can be a loss of bats and bat habitat. Conversely, in some contemporary underground operations, future roosting habitat for bats can be fabricated. An experimental approach to the creation of new roosting habitat is to bury culverts or old tires beneath waste rock. Mining companies can mitigate impacts on bats by surveying to identify bat-roosting habitat, removing bats prior to renewed mining or closure, protecting non-impacted roost sites with gates and fences, researching to identify habitat requirements and creating new artificial roosts.

  5. Targeted approach to identify genetic loci associated with evolved dioxin tolerance in Atlantic Killifish (Fundulus heteroclitus)

    PubMed Central

    2014-01-01

    Background: The most toxic aromatic hydrocarbon pollutants are categorized as dioxin-like compounds (DLCs), to which extreme tolerance has evolved independently and contemporaneously in (at least) four populations of Atlantic killifish (Fundulus heteroclitus). Surprisingly, the magnitude and phenotype of DLC tolerance are similar among these killifish populations that have adapted to varied but highly aromatic hydrocarbon-contaminated urban/industrialized estuaries of the US Atlantic coast. Multiple tolerant and neighboring sensitive killifish populations were compared with the expectation that genetic loci associated with DLC tolerance would be revealed. Results: Since the aryl hydrocarbon receptor (AHR) pathway partly or fully mediates DLC toxicity in vertebrates, single nucleotide polymorphisms (SNPs) from 42 genes associated with the AHR pathway were identified to serve as targeted markers. Wild fish (N = 36/37) from four highly tolerant killifish populations and four nearby sensitive populations were genotyped using 59 SNP markers. As in other killifish population genetic analyses, strong genetic differentiation among populations was detected, consistent with isolation-by-distance models. When DLC-sensitive populations were pooled and compared to pooled DLC-tolerant populations, multi-locus analyses did not distinguish the two groups. However, pairwise comparisons of nearby tolerant and sensitive populations revealed high differentiation among sensitive and tolerant populations at these specific loci: AHR 1 and 2, cathepsin Z, the cytochrome P450s (CYP1A and 3A30), and the NADH dehydrogenase subunits. In addition, significant shifts in minor allele frequency were observed at AHR2 and CYP1A loci across most sensitive/tolerant pairs, but only AHR2 exhibited shifts in the same direction across all pairs. Conclusions: The observed differences in allelic composition at the AHR2 and CYP1A SNP loci were identified as significant among paired sensitive

  6. An experimental approach to identify dynamical models of transcriptional regulation in living cells

    NASA Astrophysics Data System (ADS)

    Fiore, G.; Menolascina, F.; di Bernardo, M.; di Bernardo, D.

    2013-06-01

    We describe an innovative experimental approach, and a proof of principle investigation, for the application of System Identification techniques to derive quantitative dynamical models of transcriptional regulation in living cells. Specifically, we constructed an experimental platform for System Identification based on a microfluidic device, a time-lapse microscope, and a set of automated syringes all controlled by a computer. The platform allows delivering a time-varying concentration of any molecule of interest to the cells trapped in the microfluidic device (input) and real-time monitoring of a fluorescent reporter protein (output) at a high sampling rate. We tested this platform on the GAL1 promoter in the yeast Saccharomyces cerevisiae driving expression of a green fluorescent protein (Gfp) fused to the GAL1 gene. We demonstrated that the System Identification platform enables accurate measurements of the input (sugar concentrations in the medium) and output (Gfp fluorescence intensity) signals, thus making it possible to apply System Identification techniques to obtain a quantitative dynamical model of the promoter. We explored and compared linear and nonlinear model structures in order to select the most appropriate to derive a quantitative model of the promoter dynamics. Our platform can be used to quickly obtain quantitative models of eukaryotic promoters, currently a complex and time-consuming process.
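
    To make the notion of identifying a dynamical model from input and output signals concrete, here is a minimal sketch of the simplest linear case. It assumes a first-order discrete-time model y[k+1] = a*y[k] + b*u[k] and estimates a and b by ordinary least squares from a simulated, noise-free record. The model structure, the parameter values and the square-wave input are assumptions made for illustration and are not taken from the paper.

```python
def simulate(a, b, u, y0=0.0):
    """Simulate y[k+1] = a*y[k] + b*u[k]; returns len(u) + 1 output samples."""
    y = [y0]
    for u_k in u:
        y.append(a * y[-1] + b * u_k)
    return y


def fit_first_order(u, y):
    """Least-squares estimate of (a, b) via the 2x2 normal equations."""
    y_prev, y_next = y[:-1], y[1:]
    syy = sum(p * p for p in y_prev)
    syu = sum(p * q for p, q in zip(y_prev, u))
    suu = sum(q * q for q in u)
    sny = sum(n * p for n, p in zip(y_next, y_prev))
    snu = sum(n * q for n, q in zip(y_next, u))
    det = syy * suu - syu * syu  # nonzero when the input is sufficiently varied
    a = (sny * suu - snu * syu) / det
    b = (syy * snu - syu * sny) / det
    return a, b


# Square-wave input: a stand-in for a time-varying inducer concentration.
u = ([1.0] * 5 + [0.0] * 5) * 4
y = simulate(0.8, 0.5, u)  # hypothetical "true" parameters
a_hat, b_hat = fit_first_order(u, y)
print(round(a_hat, 6), round(b_hat, 6))
```

    With real fluorescence data the fit would be inexact because of measurement noise and unmodeled dynamics, which is one reason to compare several linear and nonlinear model structures.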

  7. Multi-omics approach identifies molecular mechanisms of plant-fungus mycorrhizal interaction

    DOE PAGES

    Larsen, Peter E.; Sreedasyam, Avinash; Trivedi, Geetika; Desai, Shalaka D.; Dai, Yang; Cseke, Leland; Collart, Frank R.

    2016-01-19

    In mycorrhizal symbiosis, plant roots form close, mutually beneficial interactions with soil fungi. Before this mycorrhizal interaction can be established, however, plant roots must be capable of detecting potential beneficial fungal partners and initiating the gene expression patterns necessary to begin symbiosis. To predict plant root – mycorrhizal fungus sensor systems, we analyzed in vitro experiments of Populus tremuloides (aspen tree) and Laccaria bicolor (mycorrhizal fungus) interaction and leveraged over 200 previously published transcriptomic experimental data sets, 159 experimentally validated plant