Science.gov

Sample records for mining approach identifies

  1. A novel pattern mining approach for identifying cognitive activity in EEG based functional brain networks.

    PubMed

    Thilaga, M; Vijayalakshmi, R; Nadarajan, R; Nandagopal, D

    2016-06-01

    The complex nature of neuronal interactions in the human brain has posed many challenges to the research community. To explore the underlying mechanisms of neuronal activity of cohesive brain regions during different cognitive activities, many innovative mathematical and computational models are required. This paper presents a novel Common Functional Pattern Mining approach to demonstrate the similar patterns of interactions due to common behavior of certain brain regions. The electrode sites of an EEG-based functional brain network are modeled as a set of transactions and node-based complex network measures as itemsets. These itemsets are transformed into a graph data structure called a Functional Pattern Graph. By mining this Functional Pattern Graph, the common functional patterns due to specific brain functioning can be identified. The empirical analyses show the efficiency of the proposed approach in identifying the extent to which the electrode sites (transactions) are similar during various cognitive load states. PMID:27401999
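The transaction and itemset representation described in this abstract can be illustrated with a minimal sketch. It is not the authors' Common Functional Pattern Mining algorithm and builds no Functional Pattern Graph; it only counts itemsets shared by several transactions by brute force. The electrode names, the discretized measure labels, and the function `frequent_itemsets` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    counts = Counter()
    for items in transactions.values():
        # Enumerate every itemset of size 1..max_size inside this transaction.
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical data: each electrode site is a transaction, each discretized
# node-based network measure is an item.
transactions = {
    "Fz": {"degree=high", "clustering=high", "betweenness=low"},
    "Cz": {"degree=high", "clustering=high", "betweenness=high"},
    "Pz": {"degree=low", "clustering=high", "betweenness=low"},
}

common = frequent_itemsets(transactions, min_support=2)
for itemset, n in sorted(common.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), n)
```

Brute-force enumeration is only practical for small item counts; a production implementation would typically use a pruning algorithm such as Apriori or FP-growth.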

  2. An integrative data mining approach to identifying adverse outcome pathway signatures.

    PubMed

    Oki, Noffisat O; Edwards, Stephen W

    2016-03-28

    The Adverse Outcome Pathway (AOP) framework is a tool for making biological connections and summarizing key information across different levels of biological organization to connect biological perturbations at the molecular level to adverse outcomes for an individual or population. Computational approaches to explore and determine these connections can accelerate the assembly of AOPs. By leveraging the wealth of publicly available data covering chemical effects on biological systems, computationally-predicted AOPs (cpAOPs) were assembled via data mining of high-throughput screening (HTS) in vitro data, in vivo data and other disease phenotype information. Frequent Itemset Mining (FIM) was used to find associations between the gene targets of ToxCast HTS assays and disease data from Comparative Toxicogenomics Database (CTD) by using the chemicals as the common aggregators between datasets. The method was also used to map gene expression data to disease data from CTD. A cpAOP network was defined by considering genes and diseases as nodes and FIM associations as edges. This network contained 18,283 gene to disease associations for the ToxCast data and 110,253 for CTD gene expression. Two case studies show the value of the cpAOP network by extracting subnetworks focused either on fatty liver disease or the Aryl Hydrocarbon Receptor (AHR). The subnetwork surrounding fatty liver disease included many genes known to play a role in this disease. When querying the cpAOP network with the AHR gene, an interesting subnetwork including glaucoma was identified. While substantial literature exists to support the potential for AHR ligands to elicit glaucoma, it was not explicitly captured in the public annotation information in CTD. The subnetwork from this analysis suggests a cpAOP that includes changes in CYP1B1 expression, which has been previously established in the literature as a primary cause of glaucoma. 
These case studies highlight the value in integrating multiple data

  3. USING PharmGKB TO TRAIN TEXT MINING APPROACHES FOR IDENTIFYING POTENTIAL GENE TARGETS FOR PHARMACOGENOMIC STUDIES

    PubMed Central

    PAKHOMOV, S.; MCINNES, B.T.; LAMBA, J.; LIU, Y.; MELTON, G.B.; GHODKE, Y.; BHISE, N.; LAMBA, V.; BIRNBAUM, A.K.

    2012-01-01

    The main objective of this study was to investigate the feasibility of using PharmGKB, a pharmacogenomic database, as a source of training data in combination with text of MEDLINE abstracts for a text mining approach to identification of potential gene targets for pathway-driven pharmacogenomics research. We used the manually curated relations between drugs and genes in the PharmGKB database to train a support vector machine predictive model and applied this model prospectively to MEDLINE abstracts. The gene targets suggested by this approach were subsequently manually reviewed. Our quantitative analysis showed that support vector machine classifiers trained on MEDLINE abstracts, with single words (unigrams) used as features and PharmGKB relations used for supervision, achieve an overall sensitivity of 85% and specificity of 69%. The subsequent qualitative analysis showed that gene targets “suggested” by the automatic classifier were not anticipated by expert reviewers but were subsequently found to be relevant to the three drugs that were investigated: carbamazepine, lamivudine and zidovudine. Our results show that this approach is not only feasible but may also find new gene targets not identifiable by other methods, thus making it a valuable tool for pathway-driven pharmacogenomics research. PMID:22564551
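The sensitivity and specificity figures quoted above follow the standard confusion-matrix definitions. A minimal sketch, assuming hypothetical counts chosen only so that the arithmetic reproduces 85% and 69% (the study's actual counts are not given in the abstract):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)  # share of true drug-gene relations that were found
    specificity = tn / (tn + fp)  # share of non-relations that were correctly rejected
    return sensitivity, specificity


# Hypothetical confusion-matrix counts, not taken from the paper.
sens, spec = sensitivity_specificity(tp=85, fn=15, tn=69, fp=31)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```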

  4. Data mining approach identifies research priorities and data requirements for resolving the red algal tree of life

    PubMed Central

    2010-01-01

    Background: The assembly of the tree of life has seen significant progress in recent years but algae and protists have been largely overlooked in this effort. Many groups of algae and protists have ancient roots and it is unclear how much data will be required to resolve their phylogenetic relationships for incorporation in the tree of life. The red algae, a group of primary photosynthetic eukaryotes more than a billion years old, provide the earliest fossil evidence for eukaryotic multicellularity and sexual reproduction. Despite this evolutionary significance, their phylogenetic relationships are understudied. This study aims to infer a comprehensive red algal tree of life at the family level from a supermatrix containing data mined from GenBank. We aim to locate remaining regions of low support in the topology, evaluate their causes and estimate the amount of data required to resolve them. Results: Phylogenetic analysis of a supermatrix of 14 loci and 98 red algal families yielded the most complete red algal tree of life to date. Visualization of statistical support showed the presence of five poorly supported regions. Causes for low support were identified with statistics about the age of the region, data availability and node density, showing that poor support has different origins in different parts of the tree. Parametric simulation experiments yielded optimistic estimates of how much data will be needed to resolve the poorly supported regions (ca. 10^3 to ca. 10^4 nucleotides for the different regions). Nonparametric simulations gave a markedly more pessimistic image, some regions requiring more than 2.8 × 10^5 nucleotides or not achieving the desired level of support at all. The discrepancies between parametric and nonparametric simulations are discussed in light of our dataset and known attributes of both approaches. Conclusions: Our study takes the red algae one step closer to meaningful inclusion in the tree of life. In addition to the recovery of stable

  5. Developing Isotope Tools for Identifying Mercury Mining Sources

    NASA Astrophysics Data System (ADS)

    Koster van Groos, P. G.; Esser, B. K.; Williams, R. W.; Hunt, J. R.

    2009-12-01

    Mining operations in California during the past two centuries have resulted in widespread mercury contamination. Source control strategies are difficult and expensive to implement, in part because links between specific mercury sources and exposures are often uncertain. Examination of mercury’s stable isotopes can help resolve this issue. Sources with distinct isotope compositions may be traced through the environment. Mercury mining operations are predicted to have led to waste tailings, mercury metal products, and air emissions with different isotope compositions as a result of inefficient mercury extraction and recovery from ores. The predicted differences in isotope composition, based on estimated kinetic and diffusion isotope effects, are greater than the precision of current analytical methods using multi-collector inductively coupled plasma mass-spectrometers (MC-ICP-MS). As such, mercury isotope measurements may help identify mercury originating from different mining operations. To support a mechanistic approach to mercury isotope fractionation, the isotope effects of diffusion through solids and gases are being investigated experimentally. Besides demonstrating the utility of mercury isotope analysis for source identification, this work is providing a mechanistic basis for differences in isotope compositions.

  6. A data mining approach to intelligence operations

    NASA Astrophysics Data System (ADS)

    Memon, Nasrullah; Hicks, David L.; Harkiolakis, Nicholas

    2008-03-01

    In this paper we examine the latest thinking, approaches and methodologies in use for finding the nuggets of information and subliminal (and perhaps intentionally hidden) patterns and associations that are critical to identify criminal activity and suspects to private and government security agencies. An emphasis in the paper is placed on Social Network Analysis and Investigative Data Mining, and the use of these technologies in the counterterrorism domain. Tools and techniques from both areas are described, along with the important tasks for which they can be used to assist with the investigation and analysis of terrorist organizations. The process of collecting data about these organizations is also considered along with the inherent difficulties that are involved.

  7. Implementation of an original approach on the Mines-Douai Comparative Reactivity Method (MD-CRM) instrument to identify part of the missing OH reactivity at an urban site

    NASA Astrophysics Data System (ADS)

    Dusanter, S.; Michoud, V.; Leonardis, T.; Riffault, V.; Zhang, S.; Locoge, N.

    2015-12-01

    Due to the large number of Volatile Organic Compounds (VOCs) expected in the atmosphere (10^4-10^5) (Goldstein and Galbally, ES&T, 2007), exhaustive measurements of VOCs appear to be currently unfeasible using common analytical techniques. In this context, measurements of the total sink of OH, referred to as total OH reactivity, can provide a critical test to assess the completeness of trace gas measurements during field campaigns. This can be done by comparing the measured total OH reactivity to values calculated from trace gas measurements. Indeed, large discrepancies are usually found between measured and calculated OH reactivity values, revealing the presence of important unmeasured reactive species, which have yet to be identified. A Comparative Reactivity Method (CRM) instrument has been set up at Mines Douai to allow sequential measurements of VOCs and OH reactivity using the same Proton Transfer Reaction-Time of Flight Mass Spectrometer. This approach aims at identifying unmeasured reactive VOCs based on a method proposed by Kato et al. (Atmos. Environ., 2011), taking advantage of VOC oxidations occurring in the CRM sampling reactor. MD-CRM was deployed at an urban site in Dunkirk (France) during July 2014 to test this new approach. During this campaign, a large fraction of the OH reactivity was not explained by collocated measurements of trace gases (67% on average). In this presentation, we will first describe the approach that was implemented in the CRM instrument to identify part of the observed missing OH reactivity and we will then discuss the OH reactivity budget regarding the origin of air masses reaching the measurement site.
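The comparison of measured and calculated OH reactivity can be sketched as follows, assuming the usual definition of calculated reactivity as the sum over measured species of rate constant times concentration. The species names, rate constants, concentrations, and the measured value are hypothetical placeholders, not data from the Dunkirk campaign; they are chosen so that the missing fraction comes out near two thirds.

```python
def calculated_oh_reactivity(species):
    """Sum k_i * [X_i] over measured species; result in s^-1."""
    return sum(k * conc for k, conc in species.values())


def missing_fraction(measured, calculated):
    """Fraction of the measured OH reactivity not explained by measured species."""
    return (measured - calculated) / measured


# Hypothetical inputs: (rate constant in cm^3 molecule^-1 s^-1,
#                       concentration in molecule cm^-3).
species = {
    "species_A": (1.0e-11, 2.0e11),
    "species_B": (5.0e-12, 4.0e11),
}
measured_reactivity = 12.0  # s^-1, hypothetical

calculated = calculated_oh_reactivity(species)
fraction = missing_fraction(measured_reactivity, calculated)
print(f"calculated={calculated:.1f} s^-1, missing fraction={fraction:.2f}")
```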

  8. Mining for Murder-Suicide: An Approach to Identifying Cases of Murder-Suicide in the National Violent Death Reporting System Restricted Access Database.

    PubMed

    McNally, Matthew R; Patton, Christina L; Fremouw, William J

    2016-01-01

    The National Violent Death Reporting System (NVDRS) is a United States Centers for Disease Control and Prevention (CDC) database of violent deaths from 2003 to the present. The NVDRS collects information from 32 states on several types of violent deaths, including suicides, homicides, homicides followed by suicides, and deaths resulting from child maltreatment or intimate partner violence, as well as legal intervention and accidental firearm deaths. Despite the availability of data from police narratives, medical examiner reports, and other sources, reliably finding the cases of murder-suicide in the NVDRS has proven problematic due to the lack of a unique code for murder-suicide incidents and outdated descriptions of case-finding procedures from previous researchers. By providing a description of the methods used to access the NVDRS and the coding procedures used to decipher these data, the authors seek to assist future researchers in correctly identifying cases of murder-suicide deaths while avoiding false positives. PMID:26258816

  9. Identifying Engineering Students' English Sentence Reading Comprehension Errors: Applying a Data Mining Technique

    ERIC Educational Resources Information Center

    Tsai, Yea-Ru; Ouyang, Chen-Sen; Chang, Yukon

    2016-01-01

    The purpose of this study is to propose a diagnostic approach to identify engineering students' English reading comprehension errors. Student data were collected during the process of reading texts of English for science and technology on a web-based cumulative sentence analysis system. For the analysis, the association-rule, data mining technique…

  10. Identifying the Cause of Toxicity of a Saline Mine Water

    PubMed Central

    van Dam, Rick A.; Harford, Andrew J.; Lunn, Simon A.; Gagnon, Marthe M.

    2014-01-01

    Elevated major ions (or salinity) are recognised as being a key contributor to the toxicity of many mine waste waters, but the complex interactions between the major ions and large inter-species variability in response to salinity make it difficult to relate toxicity to causal factors. This study aimed to determine if the toxicity of a typical saline seepage water was solely due to its major ion constituents, and to determine which major ions were the leading contributors to the toxicity. Standardised toxicity tests using two tropical freshwater species, Chlorella sp. (alga) and Moinodaphnia macleayi (cladoceran), were used to compare the toxicity of 1) mine and synthetic seepage water; 2) key major ions (e.g. Na, Cl, SO4 and HCO3); 3) synthetic seepage waters that were modified by excluding key major ions. For Chlorella sp., the toxicity of the seepage water was not solely due to its major ion concentrations because there were differences in effects caused by the mine seepage and synthetic seepage. However, for M. macleayi this hypothesis was supported because similar effects were caused by mine seepage and synthetic seepage. Sulfate was identified as a major ion that could predict the toxicity of the synthetic waters, which might be expected as it was the dominant major ion in the seepage water. However, sulfate was not the primary cause of toxicity in the seepage water and electrical conductivity was a better predictor of effects. Ultimately, the results show that specific major ions do not clearly drive the toxicity of saline seepage waters and the effects are probably due to the electrical conductivity of the mine waste waters. PMID:25180579

  11. Data Mining and Pattern Recognition Models for Identifying Inherited Diseases: Challenges and Implications.

    PubMed

    Iddamalgoda, Lahiru; Das, Partha S; Aponso, Achala; Sundararajan, Vijayaraghava S; Suravajhala, Prashanth; Valadi, Jayaraman K

    2016-01-01

    Data mining and pattern recognition methods reveal interesting findings in genetic studies, especially on how the genetic makeup is associated with inherited diseases. Although researchers have proposed various data mining models for biomedical approaches, there remains a challenge in accurately prioritizing the single nucleotide polymorphisms (SNP) associated with the disease. In this commentary, we review the state-of-the-art data mining and pattern recognition models for identifying inherited diseases and deliberate on the need for binary classification- and scoring-based prioritization methods in determining causal variants. While we discuss the known pros and cons associated with these methods, we argue that the gene prioritization methods and the protein interaction (PPI) methods in conjunction with K nearest neighbors could be used in accurately categorizing the genetic factors in disease causation. PMID:27559342

  12. Data Mining and Pattern Recognition Models for Identifying Inherited Diseases: Challenges and Implications

    PubMed Central

    Iddamalgoda, Lahiru; Das, Partha S.; Aponso, Achala; Sundararajan, Vijayaraghava S.; Suravajhala, Prashanth; Valadi, Jayaraman K.

    2016-01-01

    Data mining and pattern recognition methods reveal interesting findings in genetic studies, especially on how the genetic makeup is associated with inherited diseases. Although researchers have proposed various data mining models for biomedical approaches, there remains a challenge in accurately prioritizing the single nucleotide polymorphisms (SNP) associated with the disease. In this commentary, we review the state-of-the-art data mining and pattern recognition models for identifying inherited diseases and deliberate on the need for binary classification- and scoring-based prioritization methods in determining causal variants. While we discuss the known pros and cons associated with these methods, we argue that the gene prioritization methods and the protein interaction (PPI) methods in conjunction with K nearest neighbors could be used in accurately categorizing the genetic factors in disease causation. PMID:27559342

  13. Pennsylvania's approach to underground coal mine permitting and long-term mine pool management

    SciTech Connect

    Callaghan, T.; Koricich, J.

    1999-07-01

    Pennsylvania's underground coal mine permitting process has two goals: first, to ensure that the mining and reclamation plan is designed to minimize adverse environmental impacts; and second, to minimize interference with the applicant's recovery of coal. A successful review process includes the consistent evaluation of mine site hydrology through scrutiny of key indicators of mining-induced, adverse hydrologic consequences. This allows the regulatory agency to assess the potential for mining-related impacts as well as cumulative impacts throughout the proposed mine area and adjacent area. General trends have been identified regarding quality of underground mine drainage versus coal seam mined. However, the large number of factors controlling the final mine pool chemistry along with the lack of focused research have combined to stunt the development of reliable methodologies for the prediction of postmining water quality. Absent reliable predictive methodologies, mine layout has become the best demonstrated technology for pollution prevention. Strategies include: (1) promotion of postmining inundation by down-dip development with proper location of mine openings and sizing and location of barriers; (2) restriction of mining to zones within the groundwater system where flow is relatively lethargic and time of travel is great when compared to natural mine pool amelioration time frames; and (3) mining in zones remote from groundwater discharge areas and features which may serve to short-circuit mine water to nearby existing water-supply aquifers or to the surface. This paper discusses Pennsylvania's application process for underground bituminous coal mines. It briefly outlines Pennsylvania's statutory history relating to mine discharges, touches on some of the tools permit reviewers use to evaluate the hydrology of proposed underground mining sites, and discusses the key factors that permit reviewers consider in assessing potential postmining mine pool levels.

  14. HDB-Subdue: A Scalable Approach to Graph Mining

    NASA Astrophysics Data System (ADS)

    Padmanabhan, Srihari; Chakravarthy, Sharma

    Transactional data mining (association rules, decision trees etc.) has been effectively used to find non-trivial patterns in categorical and unstructured data. For applications that have an inherent structure (e.g., social networks, proteins), graph mining is useful since mapping the structured data into a transactional representation will lead to loss of information. Graph mining is used for identifying interesting or frequent subgraphs. Database mining uses SQL and relational representation to overcome limitations of main memory algorithms and to achieve scalability.

  15. Innovative approaches to mined land reclamation

    SciTech Connect

    Carlson, C.L.; Swisher, J.H.

    1987-01-01

    This is the proceedings of a conference held on mined land reclamation. The thrust of the meeting was coal-related, although applications provided in this conference are relevant to other noncoal mining programs. The main topics were methods to forecast acid-forming materials to preclude acid mine drainage; methods to correct acid mine drainage after formation; soil conservation and reconstruction; uses of waste materials in land reclamation; and methods of ensuring vegetation survival on mined lands. Many papers are presented with regard to the regulatory aspects of these areas.

  16. Screening and prioritisation of chemical risks from metal mining operations, identifying exposure media of concern.

    PubMed

    Pan, Jilang; Oates, Christopher J; Ihlenfeld, Christian; Plant, Jane A; Voulvoulis, Nikolaos

    2010-04-01

    Metals have been central to the development of human civilisation from the Bronze Age to modern times, although in the past, metal mining and smelting have been the cause of serious environmental pollution with the potential to harm human health. Despite problems from artisanal mining in some developing countries, modern mining to Western standards now uses the best available mining technology combined with environmental monitoring, mitigation and remediation measures to limit emissions to the environment. This paper develops risk screening and prioritisation methods previously used for contaminated land on military and civilian sites and engineering systems for the analysis and prioritisation of chemical risks from modern metal mining operations. It uses hierarchical holographic modelling and multi-criteria decision making to analyse and prioritise the risks from potentially hazardous inorganic chemical substances released by mining operations. A case study of an active platinum group metals mine in South Africa is used to demonstrate the potential of the method. This risk-based methodology for identifying, filtering and ranking mining-related environmental and human health risks can be used to identify exposure media of greatest concern to inform risk management. It also provides a practical decision-making tool for mine acquisition and helps to communicate risk to all members of mining operation teams. PMID:19353294

  17. Systematic evaluation of satellite remote sensing for identifying uranium mines and mills.

    SciTech Connect

    Blair, Dianna Sue; Stork, Christopher Lyle; Smartt, Heidi Anne; Smith, Jody Lynn

    2006-01-01

    In this report, we systematically evaluate the ability of current-generation, satellite-based spectroscopic sensors to distinguish uranium mines and mills from other mineral mining and milling operations. We perform this systematic evaluation by (1) outlining the remote, spectroscopic signal generation process, (2) documenting the capabilities of current commercial satellite systems, (3) systematically comparing the uranium mining and milling process to other mineral mining and milling operations, and (4) identifying the most promising observables associated with uranium mining and milling that can be identified using satellite remote sensing. The Ranger uranium mine and mill in Australia serves as a case study where we apply and test the techniques developed in this systematic analysis. Based on literature research of mineral mining and milling practices, we develop a decision tree which utilizes the information contained in one or more observables to determine whether uranium is possibly being mined and/or milled at a given site. Promising observables associated with uranium mining and milling at the Ranger site included in the decision tree are uranium ore, sulfur, the uranium pregnant leach liquor, ammonia, and uranyl compounds and sulfate ion disposed of in the tailings pond. Based on the size, concentration, and spectral characteristics of these promising observables, we then determine whether these observables can be identified using current commercial satellite systems, namely Hyperion, ASTER, and Quickbird. We conclude that the only promising observables at Ranger that can be uniquely identified using a current commercial satellite system (notably Hyperion) are magnesium chlorite in the open pit mine and the sulfur stockpile. Based on the identified magnesium chlorite and sulfur observables, the decision tree narrows the possible mineral candidates at Ranger to uranium, copper, zinc, manganese, vanadium, the rare earths, and phosphorus, all of which are

  18. Design approaches in quarrying and pit-mining reclamation

    USGS Publications Warehouse

    Arbogast, Belinda F.

    1999-01-01

    Reclaimed mine sites have been evaluated so that the public, industry, and land planners may recognize there are innovative designs available for consideration and use. People tend to see cropland, range, and road cuts as a necessary part of their everyday life, not as disturbed areas despite their high visibility. Mining also generates a disturbed landscape, unfortunately one that many consider waste until reclaimed by human beings. The development of mining provides an economic base and use of a natural resource to improve the quality of human life. Equally important is a sensitivity to the geologic origin and natural pattern of the land. Wisely shaping our environment requires a design plan and product that responds to a site's physiography, ecology, function, artistic form, and public perception. An examination of selected sites for their landscape design suggested nine approaches for mining reclamation. The oldest design approach around is nature itself. Humans may sometimes do more damage going to an area in the attempt to repair it. Given enough geologic time, a small-site area, and stable adjacent ecosystems, disturbed areas recover without mankind's input. Visual screens and buffer zones conceal the facility in a camouflage approach. Typically, earth berms, fences, and plantings are used to disguise the mining facility. Restoration targets social or economic benefits by reusing the site for public amenities, most often in urban centers with large populations. A mitigation approach attempts to protect the environment and return mined areas to use with scientific input. The reuse of cement, building rubble, and macadam meets only about 10% of the demand for aggregate. Recognizing the limited supply of mineral resources and encouraging recycling efforts are steps in a renewable resource approach. An educative design approach effectively communicates mining information through outreach, land stewardship, and community service. 
Mine sites used for

  19. Graduates employment classification using data mining approach

    NASA Astrophysics Data System (ADS)

    Aziz, Mohd Tajul Rizal Ab; Yusof, Yuhanis

    2016-08-01

    Data mining is a platform for extracting hidden knowledge from a collection of data. This study investigates a suitable classification model to classify graduate employment for one of the MARA Professional Colleges (KPM) in Malaysia. The aim is to classify the graduates as employed, unemployed, or in further study. Five data mining algorithms offered in WEKA were used: Naïve Bayes, Logistic regression, Multilayer perceptron, k-nearest neighbor and Decision tree J48. Based on the results obtained, Logistic regression produces the highest classification accuracy, at 92.5%. This result was obtained using 80% of the data for training and 20% for testing. The resulting classification model will benefit the management of the college, as it provides insight into the quality of the graduates they produce and how their curriculum can be improved to cater to the needs of industry.
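The 80%/20% evaluation protocol mentioned above can be sketched in a few lines. This is a generic holdout split with an accuracy measure, not the WEKA workflow used in the study; the records, the helper names `holdout_split` and `accuracy`, and the fixed seed are assumptions for illustration.

```python
import random


def holdout_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and split it into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical records: (graduate id, employment status).
labels = ["employed", "unemployed", "further study"] * 10
records = list(enumerate(labels))

train, test = holdout_split(records)
print(len(train), len(test))
```

A classifier would be fitted on `train` and scored with `accuracy` on the labels in `test`; a single holdout split gives a noisier estimate than cross-validation.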

  20. Current approaches for mitigating acid mine drainage.

    PubMed

    Sahoo, Prafulla Kumar; Kim, Kangjoo; Equeenuddin, Sk Md; Powell, Michael A

    2013-01-01

    Acid mine drainage (AMD) is one of the critical environmental problems that causes acidification and metal contamination of surface and ground water bodies when mine materials and/or overburden containing metal sulfides are exposed to oxidizing conditions. The best option to limit AMD is early avoidance of sulfide oxidation. Several techniques are available to achieve this. In this paper, we review all of the major methods now used to limit sulfide oxidation. These fall into five categories: (1) physical barriers, (2) bacterial inhibition, (3) chemical passivation, (4) electrochemical, and (5) desulfurization. We describe the processes underlying each method by category and then address aspects relating to effectiveness, cost, and environmental impact. This paper may help researchers and environmental engineers to select suitable methods for addressing site-specific AMD problems. Irrespective of the mechanism by which each method works, all share one common feature, i.e., they delay or prevent oxidation. In addition, all have limitations. Physical barriers such as wet or dry cover have retarded sulfide oxidation in several studies; however, both wet and dry barriers exhibit only short-term effectiveness. Wet cover is suitable at specific sites where complete inundation is established, but this approach requires high maintenance costs. When employing dry cover, plastic liners are expensive and rarely used for large volumes of waste. Bactericides can suppress oxidation, but are short-lived and only effective on fresh tailings, and do not serve as a permanent solution to AMD. In addition, application of bactericides may be toxic to aquatic organisms. Encapsulation or passivation of sulfide surfaces (applying organic and/or inorganic coatings) is simple and effective in preventing AMD. Among inorganic coatings, silica is the most promising, stable, acid-resistant and long lasting, as compared to phosphate and other inorganic coatings. Permanganate passivation is also promising because it

  1. A proactive approach to sustainable management of mine tailings

    NASA Astrophysics Data System (ADS)

    Edraki, Mansour; Baumgartl, Thomas

    2015-04-01

    The reactive strategies to manage mine tailings, i.e. containment of slurries of tailings in tailings storage facilities (TSFs) and remediation of tailings solids or tailings seepage water after the decommissioning of those facilities, can be technically inefficient at eliminating environmental risks (e.g. preventing dispersion of contaminants and catastrophic dam wall failures), pose a long-term economic burden for companies, governments and society after mine closure, and often fail to meet community expectations. Most preventive environmental management practices promote proactive integrated approaches to waste management whereby the sources of environmental issues are identified to help make more informed decisions. They often use life cycle assessment to find the "hot spots" of environmental burdens. This kind of approach is often based on generic data and has rarely been used for tailings. Besides, life cycle assessments are less useful for designing operations or simulating changes in the process and consequent environmental outcomes. It is evident that an integrated approach for tailings research linked to better processing options is needed. A literature review revealed that there are only a few examples of integrated approaches. The aim of this project is to develop new tailings management models by streamlining orebody characterization, process optimization and rehabilitation. The approach is based on continuous fingerprinting of geochemical processes from orebody to tailings storage facility, and on benchmarking the success of such proactive initiatives by evidence of no impacts and no future projected impacts on receiving environments. We present an approach for developing such a framework and preliminary results from a case study where combined grinding and flotation models developed using geometallurgical data from the orebody were constructed to predict the properties of tailings produced under various processing scenarios. 
The modelling scenarios based on the

  2. An approach to automated longwall mining

    NASA Technical Reports Server (NTRS)

    Palowitch, E. R.; Broussard, P. H., Jr.

    1979-01-01

    The longwall system of mining coal, providing advantages in the areas of productivity as well as health and safety, is described, and technological developments leading to a full automation of the system are discussed. In the longwall system large blocks of coal (up to 600 feet wide and up to 5000 feet long) are developed, with each block mined out by taking successive slices across the short dimension of the block and loading the broken coal onto a conveyor. A self-advancing system supports the roof over the length of the face throughout cutting and loading, with the supports advanced with the face, and the roof allowed to collapse behind them. A double-ranging drum longwall shearer provides the system with an efficient yaw, roll, and variable-thickness vertical control. Currently two machine operators function as error detectors and controllers. It is shown that electronic sensors can lead to a fully automated vertical control system, and automatic roll control is achievable with available instruments and machine tilt actuators.

  3. Mining the Metabiome: Identifying Novel Natural Products from Microbial Communities

    PubMed Central

    Milshteyn, Aleksandr; Schneider, Jessica S.; Brady, Sean F.

    2014-01-01

    Summary Microbial-derived natural products provide the foundation for most of the chemotherapeutic arsenal available to contemporary medicine. In the face of a dwindling pipeline of new lead structures identified by traditional culturing techniques and an increasing need for new therapeutics, surveys of microbial biosynthetic diversity across environmental metabiomes have revealed enormous reservoirs of as yet untapped natural products chemistry. In this review we touch on the historical context of microbial natural product discovery and discuss innovations and technological advances that are facilitating culture-dependent and culture-independent access to new chemistry from environmental microbiomes with the goal of re-invigorating the small molecule therapeutics discovery pipeline. We highlight the successful strategies that have emerged and some of the challenges that must be overcome to enable the development of high-throughput methods for natural product discovery from complex microbial communities. PMID:25237864

  4. Wastewater treatment polymers identified as the toxic component of a diamond mine effluent.

    PubMed

    De Rosemond, Simone J C; Liber, Karsten

    2004-09-01

    The Ekati Diamond Mine, located approximately 300 km northeast of Yellowknife in Canada's Northwest Territories, uses mechanical crushing and washing processes to extract diamonds from kimberlite ore. The processing plant's effluent contains kimberlite ore particles (≤0.5 mm), wastewater, and two wastewater treatment polymers, a cationic polydiallyldimethylammonium chloride (DADMAC) polymer and an anionic sodium acrylate polyacrylamide (PAM) polymer. A series of acute (48-h) and chronic (7-d) toxicity tests determined the processed kimberlite effluent (PKE) was chronically, but not acutely, toxic to Ceriodaphnia dubia. Reproduction of C. dubia was inhibited significantly at concentrations as low as 12.5% PKE. Toxicity identification evaluations (TIE) were initiated to identify the toxic component of PKE. Ethylenediaminetetraacetic acid (EDTA), sodium thiosulfate, aeration, and solid phase extraction with C-18 manipulations failed to reduce PKE toxicity. Toxicity was reduced significantly by pH adjustments to pH 3 or 11 followed by filtration. Toxicity testing with C. dubia determined that the cationic DADMAC polymer had a 48-h median lethal concentration (LC50) of 0.32 mg/L and 7-d median effective concentration (EC50) of 0.014 mg/L. The anionic PAM polymer had a 48-h LC50 of 218 mg/L. A weight-of-evidence approach, using the data obtained from the TIE, the polymer toxicity experiments, the estimated concentration of the cationic polymer in the kimberlite effluent, and the behavior of kimberlite minerals in pH-adjusted solutions, provided sufficient evidence to identify the cationic DADMAC polymer as the toxic component of the diamond mine PKE. PMID:15379002

  5. Lignite mine spoil characterization and approaches for its rehabilitation

    SciTech Connect

    Praveen-Kumar; Kumar, S.; Sharma, K.D.; Choudhary, A.; Gehlot, K.

    2005-01-15

    Open cast mining of lignite leaves behind stockpiles of excavated materials (dumps) and refilled mining pits (spoils). Physicochemical and biochemical properties of both kinds of sites were estimated to identify the reasons for their barrenness. Subsequently, surface modifications were attempted, first in a greenhouse and later in the field, to develop a suitable approach for their rehabilitation. Dumps had low pH (4.8) and high Na⁺ (2.5 mg g⁻¹), spoils high pH (8.7) and high Na⁺ (1.59 mg g⁻¹ soil). Both sites had low available nitrogen and phosphorus and showed very low dehydrogenase and phosphatase activity but no nitrification. The extreme physicochemical conditions and inert nature of dumps and spoils explained their barrenness. In the greenhouse experiment, 14 plant species sown in surface materials of dumps and spoils after spreading a 0.15 m thick layer of dune sand germinated (>85%), and their seedlings survived for two months. This technique was followed at a spoil site (modified spoil site). After three years of stabilization the modified spoil site had only one-fifth of the Na⁺ present in the spoil surface at the beginning and also showed higher dehydrogenase and phosphatase activity and nitrification. Pearl millet and Cenchrus ciliaris grown in modified spoil produced 128 to 394 kg and 2.25 to 3.50 Mg dry matter ha⁻¹. Addition of farmyard manure with N and P fertilizers increased pearl millet yields.

  6. Data mining approach to model the diagnostic service management.

    PubMed

    Lee, Sun-Mi; Lee, Ae-Kyung; Park, Il-Su

    2006-01-01

    Korea has a National Health Insurance Program operated by the government-owned National Health Insurance Corporation, and diagnostic services are provided every two years for the insured and their family members. Developing a customer relationship management (CRM) system using data mining technology would be useful to improve the performance of diagnostic service programs. Under these circumstances, this study developed a model for diagnostic service management that takes into account the characteristics of subjects, using a data mining approach. This study could be further used to develop an automated CRM system contributing to the increase in the rate of receiving diagnostic services. PMID:17102454

  7. A Node Linkage Approach for Sequential Pattern Mining

    PubMed Central

    Navarro, Osvaldo; Cumplido, René; Villaseñor-Pineda, Luis; Feregrino-Uribe, Claudia; Carrasco-Ochoa, Jesús Ariel

    2014-01-01

    Sequential Pattern Mining is a widely addressed problem in data mining, with applications such as analyzing Web usage, examining purchase behavior, and text mining, among others. Nevertheless, with the dramatic increase in data volume, the current approaches prove inefficient when dealing with large input datasets, a large number of different symbols and low minimum supports. In this paper, we propose a new sequential pattern mining algorithm, which follows a pattern-growth scheme to discover sequential patterns. Unlike most pattern growth algorithms, our approach does not build a data structure to represent the input dataset, but instead accesses the required sequences through pseudo-projection databases, achieving better runtime and reducing memory requirements. Our algorithm traverses the search space in a depth-first fashion and only preserves in memory a pattern node linkage and the pseudo-projections required for the branch being explored at the time. Experimental results show that our new approach, the Node Linkage Depth-First Traversal algorithm (NLDFT), has better performance and scalability in comparison with state of the art algorithms. PMID:24933123
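
    The pattern-growth and pseudo-projection ideas mentioned in this abstract can be illustrated with a minimal sketch. This is not the NLDFT algorithm itself, whose node linkage structure is not described here; it is a generic PrefixSpan-style depth-first search, simplified to sequences of single items. The function name and the toy database are assumptions made for illustration. A projected database is stored as (sequence index, start offset) pointers, so no suffix is ever copied.

```python
from collections import defaultdict


def mine_sequential_patterns(sequences, min_support):
    """Depth-first pattern growth over sequences of single items.

    A projected database is kept as (sequence index, start offset) pairs,
    so suffixes are never copied (pseudo-projection).
    """
    results = {}

    def grow(prefix, projection):
        # For each item, record where the suffix starts after its first
        # occurrence in every sequence of the current projection.
        occurrences = defaultdict(list)
        for seq_id, start in projection:
            seen = set()
            seq = sequences[seq_id]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    seen.add(item)
                    occurrences[item].append((seq_id, pos + 1))
        for item in sorted(occurrences):
            pointers = occurrences[item]
            if len(pointers) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(pointers)
                grow(pattern, pointers)  # explore this branch before the next

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Toy database, invented for illustration.
toy_db = [list("abcb"), list("acb"), list("bca"), list("abb")]
patterns = mine_sequential_patterns(toy_db, min_support=3)
print(patterns)
```

    Only the pointers for the branch currently being explored are alive on the call stack, which is the memory-saving property the abstract attributes to depth-first traversal with pseudo-projections.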

  8. Using Helicopter Electromagnetic Surveys to Identify Potential Hazards at Mine Waste Impoundments

    SciTech Connect

    Hammack, R.W.

    2008-01-01

    In July 2003, helicopter electromagnetic surveys were conducted at 14 coal waste impoundments in southern West Virginia. The purpose of the surveys was to detect conditions that could lead to impoundment failure either by structural failure of the embankment or by the flooding of adjacent or underlying mine works. Specifically, the surveys attempted to: 1) identify saturated zones within the mine waste, 2) delineate filtrate flow paths through the embankment or into adjacent strata and receiving streams, and 3) identify flooded mine workings underlying or adjacent to the waste impoundment. Data from the helicopter surveys were processed to generate conductivity/depth images. Conductivity/depth images were then spatially linked to georeferenced air photos or topographic maps for interpretation. Conductivity/depth images were found to provide a snapshot of the hydrologic conditions that exist within the impoundment. This information can be used to predict potential areas of failure within the embankment because of its ability to image the phreatic zone. Also, the electromagnetic survey can identify areas of unconsolidated slurry in the decant basin and beneath the embankment. Although shallow, flooded mineworks beneath the impoundment were identified by this survey, it cannot be assumed that electromagnetic surveys can detect all underlying mines. A preliminary evaluation of the data implies that helicopter electromagnetic surveys can provide a better understanding of the phreatic zone than the piezometer arrays that are typically used.

  9. Mining Clinicians' Electronic Documentation to Identify Heart Failure Patients with Ineffective Self-Management: A Pilot Text-Mining Study.

    PubMed

    Topaz, Maxim; Radhakrishnan, Kavita; Lei, Victor; Zhou, Li

    2016-01-01

    Effective self-management can prevent up to 50% of heart failure hospitalizations. Unfortunately, self-management by patients with heart failure remains poor. This pilot study aimed to explore the use of text-mining to identify heart failure patients with ineffective self-management. We first built a comprehensive self-management vocabulary based on the literature and clinical notes review. We then randomly selected 545 heart failure patients treated within Partners Healthcare hospitals (Boston, MA, USA) and conducted a regular expression search with the compiled vocabulary within 43,107 interdisciplinary clinical notes of these patients. We found that 38.2% (n = 208) of patients had documentation of ineffective heart failure self-management in the domains of poor diet adherence (28.4%), missed medical encounters (26.4%), poor medication adherence (20.2%) and non-specified self-management issues (e.g., "compliance issues", 34.6%). We showed the feasibility of using text-mining to identify patients with ineffective self-management. More natural language processing algorithms are needed to help busy clinicians identify these patients. PMID:27332377
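
    A regular expression search of clinical notes against a domain vocabulary, as described above, might look roughly like the following sketch. The vocabulary terms, patient identifiers and note texts are all hypothetical; the study's actual vocabulary is not given in the abstract.

```python
import re

# Hypothetical vocabulary, grouped by the four domains named in the abstract.
VOCABULARY = {
    "diet": [r"poor diet", r"high[- ]salt diet"],
    "missed_encounters": [r"missed (?:\w+ )?appointment", r"no[- ]show"],
    "medication": [
        r"not taking (?:\w+ )?(?:medications?|meds)",
        r"skipp(?:ed|ing) (?:\w+ )?doses?",
    ],
    "non_specified": [r"compliance issues?", r"non-?complian(?:t|ce)"],
}

# One compiled, case-insensitive pattern per domain.
PATTERNS = {
    domain: re.compile("|".join(terms), re.IGNORECASE)
    for domain, terms in VOCABULARY.items()
}


def flag_patients(notes):
    """Map patient id -> set of domains with at least one matching note."""
    flagged = {}
    for patient_id, text in notes:
        for domain, pattern in PATTERNS.items():
            if pattern.search(text):
                flagged.setdefault(patient_id, set()).add(domain)
    return flagged


# Invented notes for illustration only.
notes = [
    ("p1", "Pt reports poor diet, eating salty foods daily."),
    ("p1", "Missed appointment with cardiology last week."),
    ("p2", "Taking all medications as prescribed. Weight stable."),
    ("p3", "Ongoing compliance issues noted by nursing."),
]
flagged = flag_patients(notes)
print(flagged)
```

    A plain pattern match does not handle negation ("no compliance issues" would still be flagged), which is one reason the authors call for more natural language processing.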

  10. Data Mining Approaches for Modeling Complex Electronic Circuit Design Activities

    SciTech Connect

    Kwon, Yongjin; Omitaomu, Olufemi A; Wang, Gi-Nam

    2008-01-01

    A printed circuit board (PCB) is an essential part of modern electronic circuits. It is made of a flat panel of insulating materials with patterned copper foils that act as electric pathways for various components such as ICs, diodes, capacitors, resistors, and coils. The size of PCBs has been shrinking over the years, while the number of components mounted on these boards has increased considerably. This trend makes the design and fabrication of PCBs ever more difficult. At the beginning of design cycles, it is important to accurately estimate the time required to complete the design steps, based on many factors such as the required parts, approximate board size and shape, and a rough sketch of schematics. The current approach uses a multiple linear regression (MLR) technique for time and cost estimations. However, the need for accurate predictive models continues to grow as the technology becomes more advanced. In this paper, we analyze a large volume of historical PCB design data, extract some important variables, and develop predictive models based on the extracted variables using a data mining approach. The data mining approach uses an adaptive support vector regression (ASVR) technique; the benchmark model used is the MLR technique currently being used in the industry. The strengths of SVR for this data include its ability to represent data in high-dimensional space through kernel functions. The computational results show that a data mining approach is a better prediction technique for this data. Our approach reduces computation time and enhances the practical applications of the SVR technique.

  11. Application of data mining approaches to drug delivery.

    PubMed

    Ekins, Sean; Shimada, Jun; Chang, Cheng

    2006-11-30

    Computational approaches play a key role in all areas of the pharmaceutical industry from data mining, experimental and clinical data capture to pharmacoeconomics and adverse events monitoring. They will likely continue to be indispensable assets along with a growing library of software applications. This is primarily due to the increasingly massive amount of biology, chemistry and clinical data, which is now entering the public domain mainly as a result of NIH and commercially funded projects. We are therefore in need of new methods for mining this mountain of data in order to enable new hypothesis generation. The computational approaches include, but are not limited to, database compilation, quantitative structure activity relationships (QSAR), pharmacophores, network visualization models, decision trees, machine learning algorithms and multidimensional data visualization software that could be used to improve drug delivery after mining public and/or proprietary data. We will discuss some areas of unmet needs in the area of data mining for drug delivery that can be addressed with new software tools or databases of relevance to future pharmaceutical projects. PMID:17081647

  12. Data Mining for Identifying Novel Associations and Temporal Relationships with Charcot Foot

    PubMed Central

    Munson, Michael E.; Wrobel, James S.; Holmes, Crystal M.; Hanauer, David A.

    2014-01-01

    Introduction. Charcot foot is a rare and devastating complication of diabetes. While some risk factors are known, debate continues regarding etiology. Elucidating other associated disorders and their temporal occurrence could lead to a better understanding of its pathogenesis. We applied a large data mining approach to Charcot foot for elucidating novel associations. Methods. We conducted an association analysis using ICD-9 diagnosis codes for every patient in our health system (n = 1.6 million with 41.2 million time-stamped ICD-9 codes). For the current analysis, we focused on the 388 patients with Charcot foot (ICD-9 713.5). Results. We found 710 associations, 676 (95.2%) of which had a P value for the association less than 1.0 × 10⁻⁵ and 603 (84.9%) of which had an odds ratio > 5.0. There were 111 (15.6%) associations with a significant temporal relationship (P < 1.0 × 10⁻³). The three novel associations with the strongest temporal component were cardiac dysrhythmia, pulmonary eosinophilia, and volume depletion disorder. Conclusion. We identified novel associations with Charcot foot in the context of pathogenesis models that include neurotrophic, neurovascular, and microtraumatic factors mediated through inflammatory cytokines. Future work should focus on confirmatory analyses. These novel areas of investigation could lead to prevention or earlier diagnosis. PMID:24868558
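
    For readers unfamiliar with the odds ratio threshold used above, the following sketch shows how an odds ratio and an approximate 95% confidence interval are commonly computed from a 2x2 table of diagnosis co-occurrence. The abstract does not state which estimator or test the authors used, so the Wald interval and the zero-cell correction are assumptions, and the counts are invented for illustration.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio and 95% Wald confidence interval for a 2x2 table.

    a: cases with the co-diagnosis      b: cases without it
    c: non-cases with the co-diagnosis  d: non-cases without it
    """
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction avoids division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(estimate) - 1.96 * se_log)
    high = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, low, high


# Invented counts, not taken from the study.
estimate, low, high = odds_ratio(20, 80, 10, 890)
print(f"OR = {estimate:.2f}, 95% CI ({low:.2f}, {high:.2f})")
```

    An association would pass the abstract's threshold when the estimate exceeds 5.0; the study additionally required a small P value, which this sketch does not compute.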

  13. Identifying MMORPG Bots: A Traffic Analysis Approach

    NASA Astrophysics Data System (ADS)

    Chen, Kuan-Ta; Jiang, Jhih-Wei; Huang, Polly; Chu, Hao-Hua; Lei, Chin-Laung; Chen, Wen-Chin

    2008-12-01

    Massively multiplayer online role playing games (MMORPGs) have become extremely popular among network gamers. Despite their success, one of MMORPG's greatest challenges is the increasing use of game bots, that is, autoplaying game clients. The use of game bots is considered unsportsmanlike and is therefore forbidden. To keep games in order, game police, played by actual human players, often patrol game zones and question suspicious players. This practice, however, is labor-intensive and ineffective. To address this problem, we analyze the traffic generated by human players versus game bots and propose general solutions to identify game bots. Taking Ragnarok Online as our subject, we study the traffic generated by human players and game bots. We find that their traffic is distinguishable by 1) the regularity in the release time of client commands, 2) the trend and magnitude of traffic burstiness in multiple time scales, and 3) the sensitivity to different network conditions. Based on these findings, we propose four strategies and two ensemble schemes to identify bots. Finally, we discuss the robustness of the proposed methods against countermeasures of bot developers, and consider a number of possible ways to manage the increasingly serious bot problem.
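
    The first distinguishing feature listed above, regularity in the release time of client commands, can be approximated by the coefficient of variation of the gaps between consecutive commands. This is a simple proxy chosen for illustration, not the statistic or any of the four strategies used by the authors, which the abstract does not specify. The threshold and the two traces are hypothetical.

```python
import statistics


def interarrival_cv(timestamps):
    """Coefficient of variation of gaps between consecutive command times."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    return statistics.stdev(gaps) / statistics.mean(gaps)


def looks_like_bot(timestamps, cv_threshold=0.1):
    # Hypothetical threshold; a real detector would calibrate it on
    # labelled traces.
    return interarrival_cv(timestamps) < cv_threshold


# Invented traces (seconds): near-periodic commands versus irregular ones.
bot_trace = [0.00, 0.50, 1.01, 1.50, 2.00, 2.51, 3.00]
human_trace = [0.00, 0.42, 1.90, 2.05, 3.80, 4.10, 6.30]
print(looks_like_bot(bot_trace), looks_like_bot(human_trace))
```

    A low coefficient of variation means commands are released at nearly fixed intervals, which is what a timer-driven client tends to produce.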

  14. Mining Functional Modules in Heterogeneous Biological Networks Using Multiplex PageRank Approach.

    PubMed

    Li, Jun; Zhao, Patrick X

    2016-01-01

    Identification of functional modules/sub-networks in large-scale biological networks is one of the important research challenges in current bioinformatics and systems biology. Approaches have been developed to identify functional modules in single-class biological networks; however, methods for systematically and interactively mining multiple classes of heterogeneous biological networks are lacking. In this paper, we present a novel algorithm (called mPageRank) that utilizes the Multiplex PageRank approach to mine functional modules from two classes of biological networks. We demonstrate the capabilities of our approach by successfully mining functional biological modules through integrating expression-based gene-gene association networks and protein-protein interaction networks. We first compared the performance of our method with that of other methods using simulated data. We then applied our method to identify the cell division cycle related functional module and plant signaling defense-related functional module in the model plant Arabidopsis thaliana. Our results demonstrated that the mPageRank method is effective for mining sub-networks in both expression-based gene-gene association networks and protein-protein interaction networks, and has the potential to be adapted for the discovery of functional modules/sub-networks in other heterogeneous biological networks. The mPageRank executable program, source code, the datasets and results of the presented two case studies are publicly and freely available at http://plantgrn.noble.org/MPageRank/. PMID:27446133

  15. Mining Functional Modules in Heterogeneous Biological Networks Using Multiplex PageRank Approach

    PubMed Central

    Li, Jun; Zhao, Patrick X.

    2016-01-01

    Identification of functional modules/sub-networks in large-scale biological networks is one of the important research challenges in current bioinformatics and systems biology. Approaches have been developed to identify functional modules in single-class biological networks; however, methods for systematically and interactively mining multiple classes of heterogeneous biological networks are lacking. In this paper, we present a novel algorithm (called mPageRank) that utilizes the Multiplex PageRank approach to mine functional modules from two classes of biological networks. We demonstrate the capabilities of our approach by successfully mining functional biological modules through integrating expression-based gene-gene association networks and protein-protein interaction networks. We first compared the performance of our method with that of other methods using simulated data. We then applied our method to identify the cell division cycle related functional module and plant signaling defense-related functional module in the model plant Arabidopsis thaliana. Our results demonstrated that the mPageRank method is effective for mining sub-networks in both expression-based gene-gene association networks and protein-protein interaction networks, and has the potential to be adapted for the discovery of functional modules/sub-networks in other heterogeneous biological networks. The mPageRank executable program, source code, the datasets and results of the presented two case studies are publicly and freely available at http://plantgrn.noble.org/MPageRank/. PMID:27446133

  16. A geomorphological approach to the management of rivers contaminated by metal mining

    NASA Astrophysics Data System (ADS)

    Macklin, M. G.; Brewer, P. A.; Hudson-Edwards, K. A.; Bird, G.; Coulthard, T. J.; Dennis, I. A.; Lechler, P. J.; Miller, J. R.; Turner, J. N.

    2006-09-01

    As the result of current and historical metal mining, river channels and floodplains in many parts of the world have become contaminated by metal-rich waste in concentrations that may pose a hazard to human livelihoods and sustainable development. Environmental and human health impacts commonly arise because of the prolonged residence time of heavy metals in river sediments and alluvial soils and their bioaccumulatory nature in plants and animals. This paper considers how an understanding of the processes of sediment-associated metal dispersion in rivers, and the space and timescales over which they operate, can be used in a practical way to help river basin managers more effectively control and remediate catchments affected by current and historical metal mining. A geomorphological approach to the management of rivers contaminated by metals is outlined and four emerging research themes are highlighted and critically reviewed. These are: (1) response and recovery of river systems following the failures of major tailings dams; (2) effects of flooding on river contamination and the sustainable use of floodplains; (3) new developments in isotopic fingerprinting, remote sensing and numerical modelling for identifying the sources of contaminant metals and for mapping the spatial distribution of contaminants in river channels and floodplains; and (4) current approaches to the remediation of river basins affected by mining, appraised in light of the European Union's Water Framework Directive (2000/60/EC). Future opportunities for geomorphologically-based assessments of mining-affected catchments are also identified.

  17. Efflorescent sulfates from Baia Sprie mining area (Romania)--Acid mine drainage and climatological approach.

    PubMed

    Buzatu, Andrei; Dill, Harald G; Buzgar, Nicolae; Damian, Gheorghe; Maftei, Andreea Elena; Apopei, Andrei Ionuț

    2016-01-15

    The Baia Sprie epithermal system, a deposit well known for its impressive mineralogical associations, shows the proper conditions for acid mine drainage and can be considered a general example for affected mining areas around the globe. Efflorescent samples from the abandoned open pit Minei Hill have been analyzed by X-ray diffraction (XRD), scanning electron microscopy (SEM), Raman and near-infrared (NIR) spectrometry. The identified phases represent mostly iron sulfates with different hydration degrees (szomolnokite, rozenite, melanterite, coquimbite, ferricopiapite), Zn and Al sulfates (gunningite, alunogen, halotrichite). The samples were heated at different temperatures in order to establish the phase transformations among the studied sulfates. The dehydration temperatures and intermediate phases upon decomposition were successfully identified for each of the mineral phases. Gunningite was the only sulfate that showed no transformations during the heating experiment. All the other sulfates started to dehydrate within the 30-90 °C temperature range. Acid mine drainage is the main cause of sulfate formation, triggered by pyrite oxidation as the major source for the abundant iron sulfates. Based on the dehydration temperatures, the climatological interpretation indicated that melanterite formation and long-term presence is related to continental and temperate climates. Coquimbite and rozenite are also attributed to dry arid/semi-arid areas, in addition to the above-mentioned ones. The more stable sulfates, alunogen, halotrichite, szomolnokite, ferricopiapite and gunningite, can form and persist in all climate regimes, from dry continental to even tropical humid. PMID:26544892

  18. Mining Patterns of Disease Progression: A Topic-Model-Based Approach.

    PubMed

    Zhang, Lingxiao; Zhao, Junfeng; Wang, Yasha; Xie, Bing

    2016-01-01

    Knowledge of how diseases progress and transform is crucial for clinical decision making. Frequent pattern mining techniques, such as sequential pattern mining (SPM) algorithms, can automatically extract such knowledge from large collections of electronic medical records (EMR). However, EMR data are usually unorganized and highly noisy. Finding meaningful disease patterns often calls for manual manipulation such as cohort and feature selection on EMR data by medical professionals. In this paper, we propose a topic-model-based SPM approach to find disease progression patterns from diagnostic records. We improve the traditional SPM algorithms by filtering and grouping the diagnosis sequences according to different clinical topics. These topics represent certain clinical conditions with closely related diagnoses, and are detected without prior medical knowledge. The experiment on real-world EMR data shows that our approach is able to find meaningful progression patterns with less noise, and can help quickly identify interesting patterns related to a certain clinical condition with less human effort. PMID:27577403

  19. Clustering-based approaches to SAGE data mining.

    PubMed

    Wang, Haiying; Zheng, Huiru; Azuaje, Francisco

    2008-01-01

    Serial analysis of gene expression (SAGE) is one of the most powerful tools for global gene expression profiling. It has led to several biological discoveries and biomedical applications, such as the prediction of new gene functions and the identification of biomarkers in human cancer research. Clustering techniques have become fundamental approaches in these applications. This paper reviews relevant clustering techniques specifically designed for this type of data. It places an emphasis on current limitations and opportunities in this area for supporting biologically-meaningful data mining and visualisation. PMID:18822151

  20. A Practical Approach for Content Mining of Tweets

    PubMed Central

    Yoon, Sunmoo; Elhadad, Noémie; Bakken, Suzanne

    2013-01-01

    Use of data generated through social media for health studies is gradually increasing. Twitter is a short-text message system developed 6 years ago, now with more than 100 million users generating over 300 million Tweets every day. Twitter may be used to gain real-world insights to promote healthy behaviors. The purposes of this paper are to describe a practical approach to analyzing Tweet contents and to illustrate an application of the approach to the topic of physical activity. The approach includes five steps: (1) selecting keywords to gather an initial set of Tweets to analyze; (2) importing data; (3) preparing data; (4) analyzing data (topic, sentiment, and ecologic context); and (5) interpreting data. The steps are implemented using tools that are publicly available and free of charge and designed for use by researchers with limited programming skills. Content mining of Tweets can contribute to addressing challenges in health behavior research. PMID:23790998

  1. Identifying Understudied Nuclear Reactions by Text-mining the EXFOR Experimental Nuclear Reaction Library

    NASA Astrophysics Data System (ADS)

    Hirdt, J. A.; Brown, D. A.

    2016-01-01

    The EXFOR library contains the largest collection of experimental nuclear reaction data available as well as the data's bibliographic information and experimental details. We text-mined the REACTION and MONITOR fields of the ENTRYs in the EXFOR library in order to identify understudied reactions and quantities. Using the results of the text-mining, we created an undirected graph from the EXFOR datasets with each graph node representing a single reaction and quantity and graph links representing the various types of connections between these reactions and quantities. This graph is an abstract representation of the connections in EXFOR, similar to graphs of social networks, authorship networks, etc. We use various graph theoretical tools to identify important yet understudied reactions and quantities in EXFOR. Although we identified a few cross sections relevant for shielding applications and isotope production, mostly we identified charged particle fluence monitor cross sections. As a side effect of this work, we learn that our abstract graph is typical of other real-world graphs.

  2. Identifying underground coal mine displacement through field and laboratory laser scanning

    NASA Astrophysics Data System (ADS)

    Slaker, Brent; Westman, Erik

    2014-01-01

    The ability to identify ground movements in the unique environment of an underground coal mine is explored through the use of laser scanning. Time-lapse scans were performed in an underground coal mine to detect rib surface change after different volumes of coal were removed from the mine ribs. Surface changes in the rib as small as 57 cm³ were detected through analysis of surface differences between triangulated surfaces created from point clouds. Results suggest that the uneven geometry, coal reflectance, and small movements of objects and references in the scene due to ventilation air do not significantly influence monitoring ability. Time-lapse scans were also performed on an artificial coal rib constructed to allow the researchers to control deformation and error precisely. A test of displacement measurement precision showed relative standard deviations of <0.1% are attainable with point cloud densities of >3200 pts/m². Changing the distance and angle of incidence of the artificial coal rib to the scanner had little impact on the accuracy of results beyond the expected reduction due to a smaller point density of the target area. The results collected in this study suggest that laser scanning can be a useful, comprehensive tool for measuring ground change in an underground coal mining environment.

  3. WHAT INNOVATIVE APPROACHES CAN BE DEVELOPED FOR MINING SITES?

    EPA Science Inventory

    Mining is essential to maintain our way of life. However, based upon industry's reporting in the most recent Toxic Release Inventory (TRI), the primary sources of heavy metal releases to the environment are mining and mining related activities. The hard rock mining industry rel...

  4. Using Frequent Item Set Mining and Feature Selection Methods to Identify Interacted Risk Factors - The Atrial Fibrillation Case Study.

    PubMed

    Li, Xiang; Liu, Haifeng; Du, Xin; Hu, Gang; Xie, Guotong; Zhang, Ping

    2016-01-01

    Disease risk prediction is highly important for early intervention and treatment, and identification of predictive risk factors is the key to achieving accurate prediction. In addition to the original independent features in a dataset, some interacted features, such as comorbidities and combination therapies, may have a non-additive influence on the disease outcome and can also be used in risk prediction to improve the prediction performance. However, it is usually difficult to manually identify the possible interacted risk factors due to the combinatorial explosion of features. In this paper, we propose an automatic approach to identify predictive risk factors with interactions using frequent item set mining and feature selection methods. The proposed approach was applied in a real-world case study of predicting ischemic stroke and thromboembolism (TE) for atrial fibrillation (AF) patients on the Chinese atrial fibrillation registry dataset, and the results show that our approach can not only improve the prediction performance, but also identify the comorbidities and combination therapies that have potential influences on TE occurrence for AF. PMID:27577446
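
    The frequent item set step described above can be sketched as a small level-wise (Apriori-style) search whose multi-item results become candidate interaction features. The abstract does not name the mining algorithm or the feature selection method the authors used, so this is a generic illustration; the function name, the support threshold and the patient records are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Level-wise search for itemsets present in at least min_support
    transactions, up to max_size items per set."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    current = [frozenset([item]) for item in items]
    size = 1
    while current and size <= max_size:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Candidates of the next size: unions of frequent sets whose every
        # subset of the current size is itself frequent.
        size += 1
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


# Hypothetical patient records: each set lists recorded conditions/therapies.
patients = [
    {"hypertension", "diabetes", "warfarin"},
    {"hypertension", "diabetes"},
    {"hypertension", "warfarin"},
    {"diabetes", "hypertension", "aspirin"},
    {"aspirin"},
]
itemsets = frequent_itemsets(patients, min_support=3)
interaction_features = [sorted(s) for s in itemsets if len(s) > 1]
print(interaction_features)
```

    Each surviving multi-item set would be encoded as one binary feature (all items present or not) and passed, together with the original features, to a feature selection step, which this sketch leaves out.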

  5. Development and application of the Safe Performance Index as a risk-based methodology for identifying major hazard-related safety issues in underground coal mines

    NASA Astrophysics Data System (ADS)

    Kinilakodi, Harisha

    The underground coal mining industry has been under constant watch due to the high risk involved in its activities, and scrutiny increased because of the disasters that occurred in 2006-07. In the aftermath of the incidents, the U.S. Congress passed the Mine Improvement and New Emergency Response Act of 2006 (MINER Act), which strengthened the existing regulations and mandated new laws to address the various issues related to a safe working environment in the mines. Risk analysis in any form should be done on a regular basis to tackle the possibility of unwanted major hazard-related events such as explosions, outbursts, airbursts, inundations, spontaneous combustion, and roof fall instabilities. One of the responses by the Mine Safety and Health Administration (MSHA) in 2007 involved a new pattern of violations (POV) process to target mines with poor safety performance, specifically to improve their safety. However, the 2010 disaster (worst in 40 years) gave an impression that the collective effort of the industry, federal/state agencies, and researchers to achieve the goal of zero fatalities and serious injuries has gone awry. The Safe Performance Index (SPI) methodology developed in this research is a straightforward, effective, transparent, and reproducible approach that can help in identifying and addressing some of the existing issues while targeting mines with poor safety performance that need help. It combines three injury and three citation measures that are scaled to have an equal mean (5.0) in a balanced way with proportionate weighting factors (0.05, 0.15, 0.30) and overall normalizing factor (15) into a mine safety performance evaluation tool. It can be used to assess the relative safety-related risk of mines, including by mine-size category. Using 2008 and 2009 data, comparisons were made of SPI-associated, normalized safety performance measures across mine-size categories, with emphasis on small-mine safety performance as compared to large- and

  6. Approaches to Post-Mining Land Reclamation in Polish Open-Cast Lignite Mining

    NASA Astrophysics Data System (ADS)

    Kasztelewicz, Zbigniew

    2014-06-01

    The paper presents the situation regarding the reclamation of post-mining land at particular lignite mines in Poland up to 2012 against the background of opencast mining as a whole. It discusses the process of land purchase for mining operations and its sale after reclamation. It presents the achievements of mines in the reclamation and regeneration of post-mining land, which, after development processes carried out according to European standards, now serves the inhabitants as a recreational area that increases the attractiveness of the regions.

  7. Identifying Catchment-Scale Predictors of Coal Mining Impacts on New Zealand Stream Communities

    NASA Astrophysics Data System (ADS)

    Clapcott, Joanne E.; Goodwin, Eric O.; Harding, Jon S.

    2016-03-01

    Coal mining activities can have severe and long-term impacts on freshwater ecosystems. At the individual stream scale, these impacts have been well studied; however, few attempts have been made to determine the predictors of mine impacts at a regional scale. We investigated whether catchment-scale measures of mining impacts could be used to predict biological responses. We collated data from multiple studies and analyzed algae, benthic invertebrate, and fish community data from 186 stream sites, including un-mined streams, and those associated with 620 mines on the West Coast of the South Island, New Zealand. Algal, invertebrate, and fish richness responded to mine impacts and were significantly higher in un-mined compared to mine-impacted streams. Changes in community composition toward more acid- and metal-tolerant species were evident for algae and invertebrates, whereas changes in fish communities were significant and driven by a loss of nonmigratory native species. Consistent catchment-scale predictors of mining activities affecting biota included the time post mining (years), mining density (the number of mines upstream per catchment area), and mining intensity (tons of coal production per catchment area). Mining was associated with a decline in stream biodiversity irrespective of catchment size, and recovery was not evident until at least 30 years after mining activities have ceased. These catchment-scale predictors can provide managers and regulators with practical metrics to focus on management and remediation decisions.
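
    The two area-normalized predictors defined above reduce to simple ratios. The sketch below only restates those definitions; the function names, the choice of square kilometres as the area unit, and the example numbers are assumptions made for illustration and are not taken from the study.

```python
def mining_density(n_mines_upstream, catchment_area_km2):
    """Number of mines upstream per unit catchment area."""
    return n_mines_upstream / catchment_area_km2

def mining_intensity(coal_tons, catchment_area_km2):
    """Tons of coal produced per unit catchment area."""
    return coal_tons / catchment_area_km2

# Hypothetical catchment: 4 upstream mines, 50,000 t of coal, 20 km² area.
density = mining_density(4, 20.0)
intensity = mining_intensity(50_000, 20.0)
```

    Because both metrics divide by catchment area, sites draining catchments of very different sizes can be compared on the same scale.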

  8. Identifying Catchment-Scale Predictors of Coal Mining Impacts on New Zealand Stream Communities.

    PubMed

    Clapcott, Joanne E; Goodwin, Eric O; Harding, Jon S

    2016-03-01

    Coal mining activities can have severe and long-term impacts on freshwater ecosystems. At the individual stream scale, these impacts have been well studied; however, few attempts have been made to determine the predictors of mine impacts at a regional scale. We investigated whether catchment-scale measures of mining impacts could be used to predict biological responses. We collated data from multiple studies and analyzed algae, benthic invertebrate, and fish community data from 186 stream sites, including un-mined streams, and those associated with 620 mines on the West Coast of the South Island, New Zealand. Algal, invertebrate, and fish richness responded to mine impacts and were significantly higher in un-mined compared to mine-impacted streams. Changes in community composition toward more acid- and metal-tolerant species were evident for algae and invertebrates, whereas changes in fish communities were significant and driven by a loss of nonmigratory native species. Consistent catchment-scale predictors of mining activities affecting biota included the time post mining (years), mining density (the number of mines upstream per catchment area), and mining intensity (tons of coal production per catchment area). Mining was associated with a decline in stream biodiversity irrespective of catchment size, and recovery was not evident until at least 30 years after mining activities have ceased. These catchment-scale predictors can provide managers and regulators with practical metrics to focus on management and remediation decisions. PMID:26467674

  9. An Integrated Assessment Approach to Address Artisanal and Small-Scale Gold Mining in Ghana

    PubMed Central

    Basu, Niladri; Renne, Elisha P.; Long, Rachel N.

    2015-01-01

    Artisanal and small-scale gold mining (ASGM) is growing in many regions of the world including Ghana. The problems in these communities are complex and multi-faceted. To help increase understanding of such problems, and to enable consensus-building and effective translation of scientific findings to stakeholders, help inform policies, and ultimately improve decision making, we utilized an Integrated Assessment approach to study artisanal and small-scale gold mining activities in Ghana. Though Integrated Assessments have been used in the fields of environmental science and sustainable development, their use in addressing specific matter in public health, and in particular, environmental and occupational health is quite limited despite their many benefits. The aim of the current paper was to describe specific activities undertaken and how they were organized, and the outputs and outcomes of our activity. In brief, three disciplinary workgroups (Natural Sciences, Human Health, Social Sciences and Economics) were formed, with 26 researchers from a range of Ghanaian institutions plus international experts. The workgroups conducted activities in order to address the following question: What are the causes, consequences and correctives of small-scale gold mining in Ghana? More specifically: What alternatives are available in resource-limited settings in Ghana that allow for gold-mining to occur in a manner that maintains ecological health and human health without hindering near- and long-term economic prosperity? Several response options were identified and evaluated, and are currently being disseminated to various stakeholders within Ghana and internationally. PMID:26393627

  10. An Integrated Assessment Approach to Address Artisanal and Small-Scale Gold Mining in Ghana.

    PubMed

    Basu, Niladri; Renne, Elisha P; Long, Rachel N

    2015-09-01

    Artisanal and small-scale gold mining (ASGM) is growing in many regions of the world including Ghana. The problems in these communities are complex and multi-faceted. To help increase understanding of such problems, and to enable consensus-building and effective translation of scientific findings to stakeholders, help inform policies, and ultimately improve decision making, we utilized an Integrated Assessment approach to study artisanal and small-scale gold mining activities in Ghana. Though Integrated Assessments have been used in the fields of environmental science and sustainable development, their use in addressing specific matter in public health, and in particular, environmental and occupational health is quite limited despite their many benefits. The aim of the current paper was to describe specific activities undertaken and how they were organized, and the outputs and outcomes of our activity. In brief, three disciplinary workgroups (Natural Sciences, Human Health, Social Sciences and Economics) were formed, with 26 researchers from a range of Ghanaian institutions plus international experts. The workgroups conducted activities in order to address the following question: What are the causes, consequences and correctives of small-scale gold mining in Ghana? More specifically: What alternatives are available in resource-limited settings in Ghana that allow for gold-mining to occur in a manner that maintains ecological health and human health without hindering near- and long-term economic prosperity? Several response options were identified and evaluated, and are currently being disseminated to various stakeholders within Ghana and internationally. PMID:26393627

  11. Identifying Liver Cancer and Its Relations with Diseases, Drugs, and Genes: A Literature-Based Approach

    PubMed Central

    Song, Min

    2016-01-01

    In biomedicine, scientific literature is a valuable source for knowledge discovery. Mining knowledge from textual data has become an increasingly important task as the volume of scientific literature grows at an unprecedented rate. In this paper, we propose a framework for examining a certain disease based on existing information provided by scientific literature. Disease-related entities that include diseases, drugs, and genes are systematically extracted and analyzed using a three-level network-based approach. A paper-entity network and an entity co-occurrence network (macro-level) are explored and used to construct six entity-specific networks (meso-level). Important diseases, drugs, and genes as well as salient entity relations (micro-level) are identified from these networks. Results obtained from the literature-based mining can serve to assist clinical applications. PMID:27195695
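
    An entity co-occurrence network of the kind mentioned above can be derived from a paper-entity mapping by counting, for each pair of entities, the papers that mention both. The following is a minimal sketch under that assumption; the function name and the toy paper records are hypothetical, and entity extraction itself is not shown.

```python
from itertools import combinations
from collections import Counter

def cooccurrence_edges(paper_entities):
    """Count how many papers mention each unordered pair of entities."""
    edges = Counter()
    for entities in paper_entities.values():
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical paper-entity network: paper ID -> entities extracted from it.
papers = {
    "paper1": ["liver cancer", "sorafenib", "TP53"],
    "paper2": ["liver cancer", "TP53"],
    "paper3": ["hepatitis B", "liver cancer"],
}
edges = cooccurrence_edges(papers)
```

    The edge weights can then be filtered by entity type (disease, drug, gene) to obtain type-specific subnetworks.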

  12. A systematic approach to identify cellular auxetic materials

    NASA Astrophysics Data System (ADS)

    Körner, Carolin; Liebold-Ribeiro, Yvonne

    2015-02-01

    Auxetics are materials showing a negative Poisson’s ratio. This characteristic leads to unusual mechanical properties that make this an interesting class of materials. So far no systematic approach for generating auxetic cellular materials has been reported. In this contribution, we present a systematic approach to identifying auxetic cellular materials based on eigenmode analysis. The fundamental mechanism generating auxetic behavior is identified as rotation. With this knowledge, a variety of complex two-dimensional (2D) and three-dimensional (3D) auxetic structures based on simple unit cells can be identified.

  13. Text Mining approaches for automated literature knowledge extraction and representation.

    PubMed

    Nuzzo, Angelo; Mulas, Francesca; Gabetta, Matteo; Arbustini, Eloisa; Zupan, Blaz; Larizza, Cristiana; Bellazzi, Riccardo

    2010-01-01

    Due to the overwhelming volume of published scientific papers, information tools for automated literature analysis are essential to support current biomedical research. We have developed a knowledge extraction tool to help researchers discover useful information that can support their reasoning process. The tool is composed of a search engine based on Text Mining and Natural Language Processing techniques, and an analysis module which processes the search results in order to build annotation similarity networks. We tested our approach on the available knowledge about the genetic mechanism of cardiac diseases, where the target is to find both known and possible hypothetical relations between specific candidate genes and the trait of interest. We show that the system i) is able to effectively retrieve medical concepts and genes and ii) plays a relevant role in assisting researchers in the formulation and evaluation of novel literature-based hypotheses. PMID:20841825

  14. Application of numerical simulation using a progressive failure approach to underground-coal-mine stability analysis

    SciTech Connect

    Ash, N.F.

    1987-01-01

    Stability in underground coal mines is of major concern to the coal industry due to its effect on both safety and productivity. Consequently, this can have a great influence on the design of efficient mine systems. In this work a progressive failure approach was used to simulate underground coal mine stability at two different mines. The two mines considered have different characteristics. Two- and three-dimensional finite element models were created to model different areas of a longwall mine. Different chain pillar configurations were considered and the resulting stress distributions were comparable to field measurements. A complete mine section was successfully modeled taking into consideration face advancement. The roof above entry intersections was also modeled using laminated composite simulation and the finite element method. The results showed trends similar to field observations. In addition, the progressive development of subsidence for the two different mines was simulated. The same variation in subsidence behavior recorded at the mine was realized in the finite element simulation. The progressive failure approach used in this work can successfully simulate underground coal mine stability. It can also be a helpful tool in the design of more efficient mine systems which can increase productivity and maintain a high level of safety.

  15. A genetic algorithm approach to recognition and data mining

    SciTech Connect

    Punch, W.F.; Goodman, E.D.; Min, Pei

    1996-12-31

    We review here our use of genetic algorithm (GA) and genetic programming (GP) techniques to perform "data mining," the discovery of particular/important data within large datasets, by finding optimal data classifications using known examples. Our first experiments concentrated on the use of a K-nearest neighbor algorithm in combination with a GA. The GA selected weights for each feature so as to optimize knn classification based on a linear combination of features. This combined GA-knn approach was successfully applied to both generated and real-world data. We later extended this work by substituting a GP for the GA. The GP-knn could not only optimize data classification via linear combinations of features but also determine functional relationships among the features. This allowed for improved performance and new information on important relationships among features. We review the effectiveness of the overall approach on examples from biology and compare the effectiveness of the GA and GP.
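
    The GA-knn combination can be illustrated with a small sketch in which a genetic algorithm searches for per-feature weights that maximize leave-one-out k-NN accuracy. This is not the authors' implementation: the weighted squared distance, truncation selection, one-point crossover, Gaussian mutation, all parameter values, and the synthetic data are assumptions chosen to keep the example short.

```python
import random

def knn_accuracy(weights, X, y, k=3):
    """Leave-one-out accuracy of k-NN under a feature-weighted squared distance."""
    correct = 0
    for i, xi in enumerate(X):
        dists = []
        for j, xj in enumerate(X):
            if i != j:
                d = sum(w * (a - b) ** 2 for w, a, b in zip(weights, xi, xj))
                dists.append((d, y[j]))
        dists.sort(key=lambda pair: pair[0])
        votes = [label for _, label in dists[:k]]
        if max(set(votes), key=votes.count) == y[i]:
            correct += 1
    return correct / len(X)

def evolve_weights(X, y, pop_size=20, generations=15, seed=0):
    """Evolve a weight vector in [0, 1]^n that maximizes knn_accuracy."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: knn_accuracy(w, X, y), reverse=True)
        parents = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0   # one-point crossover
            child = a[:cut] + b[cut:]
            idx = rng.randrange(n)                      # mutate a single weight
            child[idx] = min(1.0, max(0.0, child[idx] + rng.gauss(0, 0.2)))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: knn_accuracy(w, X, y))

# Synthetic data: feature 0 tracks the class label, feature 1 is pure noise.
data_rng = random.Random(1)
y = [0, 1] * 10
X = [[label + data_rng.gauss(0, 0.1), data_rng.uniform(-5, 5)] for label in y]
best = evolve_weights(X, y)
```

    On data like this one would expect the search to favor weight vectors that emphasize the informative feature, although the exact result depends on the random seed.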

  16. Online Discourse on Fibromyalgia: Text-Mining to Identify Clinical Distinction and Patient Concerns

    PubMed Central

    Park, Jungsik; Ryu, Young Uk

    2014-01-01

    Background: The purpose of this study was to evaluate the possibility of using text-mining to identify clinical distinctions and patient concerns in online memoirs posted by patients with fibromyalgia (FM). Material/Methods: A total of 399 memoirs were collected from an FM group website. The unstructured data of memoirs associated with FM were collected through a crawling process and converted into structured data with a concordance, parts of speech tagging, and word frequency. We also conducted a lexical analysis and phrase pattern identification. After examining the data, a set of FM-related keywords was obtained and phrase net relationships were set through a web-based visualization tool. Results: The clinical distinction of FM was verified. Pain is the biggest issue for FM patients. The pain affected body parts including ‘muscles,’ ‘leg,’ ‘neck,’ ‘back,’ ‘joints,’ and ‘shoulders’ with accompanying symptoms such as ‘spasms,’ ‘stiffness,’ and ‘aching,’ and was described as ‘severe,’ ‘chronic,’ and ‘constant.’ This study also demonstrated that it was possible to understand the interests and concerns of FM patients through text-mining. FM patients wanted to escape from the pain and symptoms, so they were interested in medical treatment and help. Also, they seemed to have interest in their work and occupation, and hoped to continue to live life through their relationships with the people around them. Conclusions: This research shows the potential for extracting keywords to confirm the clinical distinction of a certain disease, and text-mining can help objectively understand the concerns of patients by generalizing their large number of subjective illness experiences. However, it is believed that there are limitations to the processes and methods for organizing and classifying large amounts of text, so these limits have to be considered when analyzing the results. The development of research methodology to overcome
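
    The word-frequency step described in the methods can be approximated with a simple keyword count. The sketch below is only an illustration: the tokenizer, the keyword list, and the two sample sentences are hypothetical, and the concordance, part-of-speech tagging, and phrase-net steps are not shown.

```python
import re
from collections import Counter

def keyword_counts(texts, keywords):
    """Count occurrences of the given keywords across a collection of texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())   # crude lowercase tokenizer
        counts.update(t for t in tokens if t in keywords)
    return counts

# Hypothetical sample sentences standing in for crawled memoir text.
memoirs = [
    "Chronic pain in my neck and back.",
    "The pain is constant; my back aches.",
]
counts = keyword_counts(memoirs, {"pain", "neck", "back", "chronic", "constant"})
```

    Calling `counts.most_common()` would then rank the keywords by frequency.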

  17. A data-mining approach for multiple structural alignment of proteins.

    PubMed

    Siu, Wing-Yan; Mamoulis, Nikos; Yiu, Siu-Ming; Chan, Ho-Leung

    2010-01-01

    Comparing the 3D structures of proteins is an important but computationally hard problem in bioinformatics. In this paper, we propose studying the problem when much less information or assumptions are available. We model the structural alignment of proteins as a combinatorial problem. In the problem, each protein is simply a set of points in the 3D space, without sequence order information, and the objective is to discover all large enough alignments for any subset of the input. We propose a data-mining approach for this problem. We first perform geometric hashing of the structures such that points with similar locations in the 3D space are hashed into the same bin in the hash table. The novelty is that we consider each bin as a coincidence group and mine for frequent patterns, which is a well-studied technique in data mining. We observe that these frequent patterns are already potentially large alignments. Then a simple heuristic is used to extend the alignments if possible. We implemented the algorithm and tested it using real protein structures. The results were compared with existing tools. They showed that the algorithm is capable of finding conserved substructures that do not preserve sequence order, especially those existing in protein interfaces. The algorithm can also identify conserved substructures of functionally similar structures within a mixture with dissimilar ones. The running time of the program was smaller or comparable to that of the existing tools. PMID:21079664
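
    The hashing step can be illustrated with a heavily simplified sketch: points are binned on a fixed grid, and groups of structures that share enough bins are reported as candidate alignments. This assumes the structures are already in a common coordinate frame, which the real method does not; the function names, bin size, and toy coordinates are all hypothetical, and the pattern-mining and extension steps are reduced to a subset check.

```python
from collections import defaultdict
from itertools import combinations

def hash_points(structures, bin_size=1.0):
    """Map each grid bin to the set of structures that have a point in it."""
    bins = defaultdict(set)
    for name, points in structures.items():
        for x, y, z in points:
            key = (int(x // bin_size), int(y // bin_size), int(z // bin_size))
            bins[key].add(name)
    return bins

def shared_bins(bins, min_size, group_size=2):
    """For each group of structures, list the bins where all of them have a
    point, keeping only groups that share at least min_size bins."""
    names = sorted({n for members in bins.values() for n in members})
    result = {}
    for group in combinations(names, group_size):
        common = sorted(k for k, members in bins.items() if set(group) <= members)
        if len(common) >= min_size:
            result[group] = common
    return result

# Hypothetical toy structures: name -> list of 3D points.
structures = {
    "A": [(0.2, 0.1, 0.3), (1.4, 0.2, 0.1), (2.5, 2.5, 2.5)],
    "B": [(0.7, 0.6, 0.9), (1.1, 0.9, 0.5), (5.0, 5.0, 5.0)],
    "C": [(0.5, 0.5, 0.5), (9.1, 9.1, 9.1)],
}
candidates = shared_bins(hash_points(structures), min_size=2)
```

    Here only structures A and B share two bins, so they form the single candidate group.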

  18. Pharmacogenomic Approach to Identify Drug Sensitivity in Small-Cell Lung Cancer

    PubMed Central

    Wildey, Gary; Chen, Yanwen; Lent, Ian; Stetson, Lindsay; Pink, John; Barnholtz-Sloan, Jill S.; Dowlati, Afshin

    2014-01-01

    There are currently no molecular targeted approaches to treat small-cell lung cancer (SCLC) similar to those used successfully against non-small-cell lung cancer. This failure is attributable to our inability to identify clinically-relevant subtypes of this disease. Thus, a more systematic approach to drug discovery for SCLC is needed. In this regard, two comprehensive studies recently published in Nature, the Cancer Cell Line Encyclopedia and the Cancer Genome Project, provide a wealth of data regarding the drug sensitivity and genomic profiles of many different types of cancer cells. In the present study we have mined these two studies for new therapeutic agents for SCLC and identified heat shock proteins, cyclin-dependent kinases and polo-like kinases (PLK) as attractive molecular targets with little current clinical trial activity in SCLC. Remarkably, our analyses demonstrated that most SCLC cell lines clustered into a single, predominant subgroup by either gene expression or CNV analyses, leading us to take a pharmacogenomic approach to identify subgroups of drug-sensitive SCLC cells. Using PLK inhibitors as an example, we identified and validated a gene signature for drug sensitivity in SCLC cell lines. This gene signature could distinguish subpopulations among human SCLC tumors, suggesting its potential clinical utility. Finally, circos plots were constructed to yield a comprehensive view of how transcriptional, copy number and mutational elements affect PLK sensitivity in SCLC cell lines. Taken together, this study outlines an approach to predict drug sensitivity in SCLC to novel targeted therapeutics. PMID:25198282

  19. An Integrative Proteomic Approach Identifies Novel Cellular SMYD2 Substrates.

    PubMed

    Ahmed, Hazem; Duan, Shili; Arrowsmith, Cheryl H; Barsyte-Lovejoy, Dalia; Schapira, Matthieu

    2016-06-01

    Protein methylation is a post-translational modification with important roles in transcriptional regulation and other biological processes, but the enzyme-substrate relationship between the 68 known human protein methyltransferases and the thousands of reported methylation sites is poorly understood. Here, we propose a bioinformatic approach that integrates structural, biochemical, cellular, and proteomic data to identify novel cellular substrates of the lysine methyltransferase SMYD2. Of the 14 novel putative SMYD2 substrates identified by our approach, six were confirmed in cells by immunoprecipitation: MAPT, CCAR2, EEF2, NCOA3, STUB1, and UTP14A. Treatment with the selective SMYD2 inhibitor BAY-598 abrogated the methylation signal, indicating that methylation of these novel substrates was dependent on the catalytic activity of the enzyme. We believe that our integrative approach can be applied to other protein lysine methyltransferases, and help understand how lysine methylation participates in wider signaling processes. PMID:27163177

  20. Proteomic and Genetic Approaches Identify Syk as an AML Target

    PubMed Central

    Hahn, Cynthia K.; Berchuck, Jacob E.; Ross, Kenneth N.; Kakoza, Rose M.; Clauser, Karl; Schinzel, Anna C.; Ross, Linda; Galinsky, Ilene; Davis, Tina N.; Silver, Serena J.; Root, David E.; Stone, Richard M.; DeAngelo, Daniel J.; Carroll, Martin; Hahn, William C.; Carr, Steven A.; Golub, Todd R.; Kung, Andrew L.; Stegmaier, Kimberly

    2009-01-01

    Summary: Cell-based screening can facilitate rapid identification of compounds inducing complex cellular phenotypes. Advancing a compound toward the clinic, however, generally requires identification of precise mechanisms of action. We previously found that epidermal growth factor receptor (EGFR) inhibitors induce acute myeloid leukemia (AML) differentiation via a non-EGFR mechanism. In this report, we integrated proteomic and RNAi-based strategies to identify their off-target anti-AML mechanism. These orthogonal approaches identified Syk as a target in AML. Genetic and pharmacological inactivation of Syk with a drug in clinical trial for other indications promoted differentiation of AML cells and attenuated leukemia growth in vivo. These results demonstrate the power of integrating diverse chemical, proteomic, and genomic screening approaches to identify therapeutic strategies for cancer. PMID:19800574

  1. A data mining approach to finding relationships between reservoir properties and oil production for CHOPS

    NASA Astrophysics Data System (ADS)

    Cai, Yongxiang; Wang, Xin; Hu, Kezhen; Dong, Mingzhe

    2014-12-01

    Cold heavy oil production with sand (CHOPS) is a primary oil extraction process for heavy crude oil, and reservoir properties are key factors that contribute to the effectiveness of CHOPS. However, identification of the key reservoir properties and quantification of the relationships between the reservoir properties and the oil production are still challenging tasks. In this paper, we propose the use of a data mining approach for finding quantitative relationships between various reservoir properties and oil production for CHOPS. The approach includes four steps. First, a set of reservoir properties is identified to describe reservoir characteristics through a petrophysical analysis. In addition to common parameters, such as porosity and permeability, two new parameters - a fluid mobility factor and the maximum inscribed rectangular of net pay (MIRNP) - are proposed. Second, three new parameters to describe the production performance of wells are proposed: the peak value, effective life cycle and effective yield. Third, the fuzzy ranking method is used to rank the importance of the identified reservoir properties in terms of oil production. Finally, association rule mining is used to obtain quantitative relationships between reservoir property variables and the production performance of wells. The proposed methods have been applied to 118 wells in the Sparky Formation of the Lloydminster heavy oil field in Alberta. The results show that the production performance of wells in the area could be described and predicted by using the quantitative relations found.
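
    The final step, association rule mining, scores rules of the form "property X implies performance Y" by their support and confidence. The sketch below computes those two standard measures for a single rule; the function name, the discretized attribute labels, and the four toy wells are hypothetical and do not come from the Lloydminster data.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_ac / n                           # fraction of wells matching both sides
    confidence = n_ac / n_a if n_a else 0.0      # fraction of antecedent wells matching
    return support, confidence

# Hypothetical wells with discretized reservoir and production attributes.
wells = [
    {"porosity=high", "permeability=high", "yield=high"},
    {"porosity=high", "permeability=low", "yield=high"},
    {"porosity=low", "permeability=low", "yield=low"},
    {"porosity=high", "permeability=high", "yield=low"},
]
support, confidence = rule_metrics(wells, {"porosity=high"}, {"yield=high"})
```

    In this toy set the rule holds for two of the three high-porosity wells, so its support is 0.5 and its confidence is 2/3.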

  2. A Tools-Based Approach to Teaching Data Mining Methods

    ERIC Educational Resources Information Center

    Jafar, Musa J.

    2010-01-01

    Data mining is an emerging field of study in Information Systems programs. Although the course content has been streamlined, the underlying technology is still in a state of flux. The purpose of this paper is to describe how we utilized Microsoft Excel's data mining add-ins as a front-end to Microsoft's Cloud Computing and SQL Server 2008 Business…

  3. Risk evaluation of uranium mining: A geochemical inverse modelling approach

    NASA Astrophysics Data System (ADS)

    Rillard, J.; Zuddas, P.; Scislewski, A.

    2011-12-01

    It is well known that uranium extraction operations can increase risks linked to radiation exposure. The toxicity of uranium and associated heavy metals is the main environmental concern regarding exploitation and processing of U-ore. In areas where U mining is planned, a careful assessment of toxic and radioactive element concentrations is recommended before the start of mining activities. A background evaluation of harmful elements is important in order to prevent and/or quantify future water contamination resulting from possible migration of toxic metals coming from ore and waste water interaction. Controlled leaching experiments were carried out to investigate processes of ore and waste (leached ore) degradation, using samples from the uranium exploitation site located in Caetité-Bahia, Brazil. In experiments in which the reaction of waste with water was tested, we found that the water had low pH and high levels of sulphates and aluminium. On the other hand, in experiments in which ore was tested, the water had a chemical composition comparable to natural water found in the region of Caetité. On the basis of our experiments, we suggest that waste resulting from sulphuric acid treatment can induce acidification and salinization of surface and ground water. For this reason proper storage of waste is imperative. As a tool to evaluate the risks, a geochemical inverse modelling approach was developed to estimate the water-mineral interaction involving the presence of toxic elements. We used a method described earlier by Scislewski and Zuddas (2010; Geochim. Cosmochim. Acta 74, 6996-7007) in which the reactive surface area of mineral dissolution can be estimated. We found that the reactive surface area of rock parent minerals is not constant over time but varies by several orders of magnitude in only two months of interaction. We propose that parent mineral heterogeneity and, particularly, neogenic phase formation may explain the observed variation of the

  4. InSAR Identifies Mine-Dewatering Associated Bedrock Compaction and Subsidence in North-Central Nevada

    NASA Astrophysics Data System (ADS)

    Katzenstein, K. W.; Bell, J. W.; Watters, R. J.

    2007-12-01

    During the last decade, InSAR has been used extensively for the delineation of aquifer-system response to heavy groundwater pumping. A number of studies have demonstrated the vastly improved spatial resolution afforded by InSAR relative to traditional surveying techniques in detecting groundwater-related effects, including subsidence. This has allowed for further understanding of the complexity of subsidence bowls and the role of secondary factors such as structure, aquifer material properties and other previously unforeseen factors. In the western U.S., ground subsidence related to mine dewatering is a common occurrence due to the very large volumes of water (as high as 100,000 acre-ft/yr) that are typically pumped in order to lower the local groundwater table to facilitate the excavation of open pit and underground mines. Several gold mines located along the Carlin Trend of Central Nevada have produced distinct InSAR-identified subsidence signals of greater areal extent and magnitude than most municipal groundwater signals, including signals partly or entirely within bedrock. One signal in particular shows a minimum of 54 cm of cumulative dewatering-related subsidence between June 1, 1992 and September 21, 2000. Our study has produced many (>50) interferograms, each covering different time intervals, allowing a better understanding of how the subsidence signal has evolved in response to varied pumping rates from dewatering wells. Since the spatial resolution of the InSAR is much better than that of the monitoring well locations, the complexity of the signal is better delineated. The areal extent of the subsidence feature is impressive, as it extends as far as 20 km away from the location of the extraction wells used for dewatering. The area of maximum subsidence correlates well with the area of maximum groundwater drawdown; however, the subsidence signal extends well beyond (by as much as 8-10 km) the observed groundwater drawdown pattern. This suggests a much

  5. Data Mining Approaches for Genome-Wide Association of Mood Disorders

    PubMed Central

    Pirooznia, Mehdi; Seifuddin, Fayaz; Judy, Jennifer; Mahon, Pamela B.; Potash, James B.; Zandi, Peter P.

    2012-01-01

    Mood disorders are highly heritable forms of major mental illness. A major breakthrough in elucidating the genetic architecture of mood disorders was anticipated with the advent of genome-wide association studies (GWAS). However, to date few susceptibility loci have been conclusively identified. The genetic etiology of mood disorders appears to be quite complex, and as a result, alternative approaches for analyzing GWAS data are needed. Recently, a polygenic scoring approach that captures the effects of alleles across multiple loci was successfully applied to the analysis of GWAS data in schizophrenia and bipolar disorder (BP). However, this method may be overly simplistic in its approach to the complexity of genetic effects. Data mining methods are available that may be applied to analyze the high dimensional data generated by GWAS of complex psychiatric disorders. We sought to compare the performance of five data mining methods, namely, Bayesian Networks (BN), Support Vector Machine (SVM), Random Forest (RF), Radial Basis Function network (RBF), and Logistic Regression (LR), against the polygenic scoring approach in the analysis of GWAS data on BP. The different classification methods were trained on GWAS datasets from the Bipolar Genome Study (2,191 cases with BP and 1,434 controls) and their ability to accurately classify case/control status was tested on a GWAS dataset from the Wellcome Trust Case Control Consortium. The performance of the classifiers in the test dataset was evaluated by comparing area under the receiver operating characteristic curves (AUC). BN performed the best of all the data mining classifiers, but none of these did significantly better than the polygenic score approach. We further examined a subset of SNPs in genes that are expressed in the brain, under the hypothesis that these might be most relevant to BP susceptibility, but all the classifiers performed worse with this reduced set of SNPs. The discriminative accuracy of all of these

  6. IDENTIFYING RECENT SURFACE MINING ACTIVITIES USING A NORMALIZED DIFFERENCE VEGETATION INDEX (NDVI) CHANGE DETECTION METHOD

    EPA Science Inventory



    Coal mining is a major resource extraction activity on the Appalachian Mountains. The increased size and frequency of a specific type of surface mining, known as mountain top removal-valley fill, has in recent years raised various environmental concerns. During mountainto...
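
    The abstract above is truncated, so the study's actual procedure is not visible here. As a minimal sketch of the general idea named in the title, the standard index NDVI = (NIR - Red) / (NIR + Red) can be computed for two dates and differenced per pixel. The function names, the 0.3 threshold, and the reflectance values below are illustrative assumptions, not values from the record.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red) for one pixel; 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def ndvi_loss_mask(nir_before, red_before, nir_after, red_after, threshold=0.3):
    """Flag pixels whose NDVI dropped by more than `threshold` between two dates.

    The threshold is a hypothetical choice for illustration only.
    """
    mask = []
    for nb, rb, na, ra in zip(nir_before, red_before, nir_after, red_after):
        drop = ndvi(nb, rb) - ndvi(na, ra)
        mask.append(drop > threshold)
    return mask


# Hypothetical reflectances for three pixels (before and after imagery).
nir_before = [0.50, 0.45, 0.40]
red_before = [0.10, 0.09, 0.10]
nir_after = [0.48, 0.20, 0.41]
red_after = [0.10, 0.18, 0.10]

print(ndvi_loss_mask(nir_before, red_before, nir_after, red_after))
# -> [False, True, False]
```

    Only the second pixel loses enough vegetation signal to be flagged; a real workflow would operate on co-registered raster bands rather than short lists.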

  7. Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution

    ERIC Educational Resources Information Center

    Kinnebrew, John S.; Biswas, Gautam

    2012-01-01

    Our learning-by-teaching environment, Betty's Brain, captures a wealth of data on students' learning interactions as they teach a virtual agent. This paper extends an exploratory data mining methodology for assessing and comparing students' learning behaviors from these interaction traces. The core algorithm employs sequence mining techniques to…

  8. National Conference on Mining-Influenced Waters: Approaches for Characterization, Source Control and Treatment

    EPA Science Inventory

    The conference goal was to provide a forum for the exchange of scientific information on current and emerging approaches to assessing characterization, monitoring, source control, treatment and/or remediation on mining-influenced waters. The conference was aimed at mining remedi...

  9. GTA: a game theoretic approach to identifying cancer subnetwork markers.

    PubMed

    Farahmand, S; Goliaei, S; Ansari-Pour, N; Razaghi-Moghadam, Z

    2016-03-01

    The identification of genetic markers (e.g. genes, pathways and subnetworks) for cancer has been one of the most challenging research areas in recent years. A subset of these studies attempt to analyze genome-wide expression profiles to identify markers with high reliability and reusability across independent whole-transcriptome microarray datasets. Therefore, the functional relationships of genes are integrated with their expression data. However, for a more accurate representation of the functional relationships among genes, utilization of the protein-protein interaction network (PPIN) seems to be necessary. Herein, a novel game theoretic approach (GTA) is proposed for the identification of cancer subnetwork markers by integrating genome-wide expression profiles and PPIN. The GTA method was applied to three distinct whole-transcriptome breast cancer datasets to identify the subnetwork markers associated with metastasis. To evaluate the performance of our approach, the identified subnetwork markers were compared with gene-based, pathway-based and network-based markers. We show that GTA is not only capable of identifying robust metastatic markers, it also provides a higher classification performance. In addition, based on these GTA-based subnetworks, we identified a new bona fide candidate gene for breast cancer susceptibility. PMID:26750920

  10. TOXICITY APPROACHES TO ASSESSING MINING IMPACTS AND MINE WASTE TREATMENT EFFECTIVENESS

    EPA Science Inventory

    The USEPA Office of Research and Development's National Exposure Research Laboratory and National Risk Management Research Laboratory have been evaluating the impact of mining sites on receiving streams and the effectiveness of waste treatment technologies in removing toxicity fo...

  11. Acid mine drainage risks - A modeling approach to siting mine facilities in Northern Minnesota USA

    NASA Astrophysics Data System (ADS)

    Myers, Tom

    2016-02-01

    Most watershed-scale planning for mine-caused contamination concerns remediation of past problems while future planning relies heavily on engineering controls. As an alternative, a watershed scale groundwater fate and transport model for the Rainy Headwaters, a northeastern Minnesota watershed, has been developed to examine the risks of leaks or spills to a pristine downstream watershed. The model shows that the risk depends on the location and whether the source of the leak is on the surface or from deeper underground facilities. Underground sources cause loads that last longer but arrive at rivers after a longer travel time and have lower concentrations due to dilution and attenuation. Surface contaminant sources could cause much more short-term damage to the resource. Because groundwater dominates baseflow, mine contaminant seepage would cause the most damage during low flow periods. Groundwater flow and transport modeling is a useful tool for decreasing the risk to downgradient sources by aiding in the placement of mine facilities. Although mines are located based on the minerals, advance planning and analysis could avoid siting mine facilities where failure or leaks would cause too much natural resource damage. Watershed scale transport modeling could help locate the facilities or decide in advance that the mine should not be constructed due to the risk to downstream resources.

  12. Large screen approaches to identify novel malaria vaccine candidates.

    PubMed

    Davies, D Huw; Duffy, Patrick; Bodmer, Jean-Luc; Felgner, Philip L; Doolan, Denise L

    2015-12-22

    Until recently, malaria vaccine development efforts have focused almost exclusively on a handful of well characterized Plasmodium falciparum antigens. Despite dedicated work by many researchers on different continents spanning more than half a century, a successful malaria vaccine remains elusive. Sequencing of the P. falciparum genome has revealed more than five thousand genes, providing the foundation for systematic approaches to discover candidate vaccine antigens. We are taking advantage of this wealth of information to discover new antigens that may be more effective vaccine targets. Herein, we describe different approaches to large-scale screening of the P. falciparum genome to identify targets of either antibody responses or T cell responses using human specimens collected in Controlled Human Malaria Infections (CHMI) or under conditions of natural exposure in the field. These genome, proteome and transcriptome based approaches offer enormous potential for the development of an efficacious malaria vaccine. PMID:26428458

  13. Identifying Prolonged Grief Reactions in Children: Dimensional and Diagnostic Approaches

    PubMed Central

    Melhem, Nadine M.; Porta, Giovanna; Payne, Monica Walker; Brent, David A.

    2013-01-01

    Objective Children with prolonged grief reactions (PGR) have been found to be at increased risk for depression and functional impairment. Identifying and diagnosing PGR in children is challenging, as there are no available dimensional measures with established thresholds and no diagnostic criteria in the DSM-IV. We examine thresholds for the Inventory for Complicated Grief–Revised for Children (ICG-RC) and compare this dimensional approach to the proposed DSM-5 criteria for Persistent Complex Bereavement-Related Disorder. We also identify a screening tool for PGR. Method Parentally bereaved children, 8–17 years of age, were assessed at 9, 21, and 33 months after parental death. Receiver Operator Characteristics were used to establish the “best threshold” that would identify children with PGR and evaluate the proposed DSM-5 criteria cross-sectionally and longitudinally. Results A score of 68 or higher on the ICG-RC was found to have high sensitivity (0.942) and specificity (0.965) in differentiating cases with PGR from noncases at 9 months. We also identify a 6-item screening tool that consists of longing and yearning for the deceased, inability to accept the death, shock, disbelief, loneliness, and a changed world view. The proposed DSM-5 criteria only correctly identified 20% to 41.7% of cases with PGR at different timepoints. Conclusions For the identification of youth at risk for PGR, the dimensional approach outperformed the proposed categorical diagnostic criteria. We propose a brief screening scale that, if validated, can help clinicians identify bereaved children at risk for PGR, and guide the development of prevention and intervention strategies. PMID:23702449

  14. Trend Motif: A Graph Mining Approach for Analysis of Dynamic Complex Networks

    SciTech Connect

    Jin, R; McCallen, S; Almaas, E

    2007-05-28

    Complex networks have been used successfully in scientific disciplines ranging from sociology to microbiology to describe systems of interacting units. Until recently, studies of complex networks have mainly focused on their network topology. However, in many real world applications, the edges and vertices have associated attributes that are frequently represented as vertex or edge weights. Furthermore, these weights are often not static, instead changing with time and forming a time series. Hence, to fully understand the dynamics of the complex network, we have to consider both network topology and related time series data. In this work, we propose a motif mining approach to identify trend motifs for such purposes. Simply stated, a trend motif describes a recurring subgraph where each of its vertices or edges displays similar dynamics over a user-defined period. Given this, each trend motif occurrence can help reveal significant events in a complex system; frequent trend motifs may aid in uncovering dynamic rules of change for the system, and the distribution of trend motifs may characterize the global dynamics of the system. Here, we have developed efficient mining algorithms to extract trend motifs. Our experimental validation using three disparate empirical datasets, ranging from the stock market, world trade, to a protein interaction network, has demonstrated the efficiency and effectiveness of our approach.

  15. A novel approach to tag and identify geranylgeranylated proteins

    PubMed Central

    Chan, Lai N.; Hart, Courtenay; Guo, Lea; Nyberg, Tamara; Davies, Brandon S.J.; Fong, Loren G.; Young, Stephen G.; Agnew, Brian J.; Tamanoi, Fuyuhiko

    2010-01-01

    A recently developed proteomic strategy, the “GG-azide”-labeling approach, is described for the detection and proteomic analysis of geranylgeranylated proteins. This approach involves metabolic incorporation of a synthetic azido-geranylgeranyl analog and chemoselective derivatization of azido-geranylgeranyl-modified proteins by the “click” chemistry, using a tetramethylrhodamine-alkyne. The resulting conjugated proteins can be separated by 1-D or 2-D and pH fractionation, and detected by fluorescence imaging. This method is compatible with downstream LC-MS/MS analysis. Proteomic analysis of conjugated proteins by this approach identified several known geranylgeranylated proteins as well as Rap2c, a novel member of the Ras family. Furthermore, prenylation of progerin in mouse embryonic fibroblast cells was examined using this approach, demonstrating that this strategy can be used to study prenylation of specific proteins. The “GG-azide”-labeling approach provides a new tool for the detection and proteomic analysis of geranylgeranylated proteins, and it can readily be extended to other post-translational modifications. PMID:19784953

  16. Determining the familial risk distribution of colorectal cancer: a data mining approach.

    PubMed

    Chau, Rowena; Jenkins, Mark A; Buchanan, Daniel D; Ait Ouakrim, Driss; Giles, Graham G; Casey, Graham; Gallinger, Steven; Haile, Robert W; Le Marchand, Loic; Newcomb, Polly A; Lindor, Noralane M; Hopper, John L; Win, Aung Ko

    2016-04-01

    This study aimed to characterize the distribution of colorectal cancer risk using family history of cancers by data mining. Family histories for 10,066 colorectal cancer cases recruited to population cancer registries of the Colon Cancer Family Registry were analyzed using a data mining framework. A novel index was developed to quantify familial cancer aggregation. An artificial neural network was used to identify distinct categories of familial risk. Standardized incidence ratios (SIRs) and corresponding 95% confidence intervals (CIs) of colorectal cancer were calculated for each category. We identified five major and 66 minor categories of familial risk for developing colorectal cancer. The distribution of the major risk categories was: (1) 7% of families (SIR = 7.11; 95% CI 6.65-7.59) had a strong family history of colorectal cancer; (2) 13% of families (SIR = 2.94; 95% CI 2.78-3.10) had a moderate family history of colorectal cancer; (3) 11% of families (SIR = 1.23; 95% CI 1.12-1.36) had a strong family history of breast cancer and a weak family history of colorectal cancer; (4) 9% of families (SIR = 1.06; 95% CI 0.96-1.18) had a strong family history of prostate cancer and a weak family history of colorectal cancer; and (5) 60% of families (SIR = 0.61; 95% CI 0.57-0.65) had a weak family history of all cancers. There is a wide variation of colorectal cancer risk that can be categorized by family history of cancer, with a strong gradient of colorectal cancer risk between the highest and lowest risk categories. The risk of colorectal cancer for people in the highest risk category of family history (7% of the population) was 12 times that for people in the lowest risk category (60% of the population). Data mining proved an effective approach for gaining insight into the underlying cancer aggregation patterns and for categorizing familial risk of colorectal cancer. PMID:26681340
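
    A standardized incidence ratio is the number of observed cases divided by the number expected from reference rates. The sketch below shows that ratio with an approximate 95% confidence interval based on a log-scale normal approximation, SIR × exp(±1.96/√observed). This approximation is an assumption chosen for illustration; the abstract does not say how the study computed its intervals, and the counts used here are hypothetical.

```python
import math


def sir_with_ci(observed, expected, z=1.96):
    """Return (SIR, lower, upper) using a log-normal approximation to the Poisson CI.

    Assumes observed > 0 and expected > 0; exact Poisson limits would differ
    slightly, especially for small counts.
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must both be positive")
    sir = observed / expected
    half_width = z / math.sqrt(observed)
    return sir, sir * math.exp(-half_width), sir * math.exp(half_width)


# Hypothetical counts: 100 observed cases where 50 were expected.
sir, lower, upper = sir_with_ci(observed=100, expected=50)
print(f"SIR = {sir:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

    The roughly 12-fold contrast reported in the abstract corresponds to the ratio of the highest and lowest category SIRs, 7.11 / 0.61 ≈ 11.7.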

  17. Hazards identified and the need for health risk assessment in the South African mining industry.

    PubMed

    Utembe, W; Faustman, E M; Matatiele, P; Gulumian, M

    2015-12-01

    Although mining plays a prominent role in the economy of South Africa, it is associated with many chemical hazards. Exposure to dust from mining can lead to many pathological effects depending on mineralogical composition, size, shape and levels and duration of exposure. Mining and processing of minerals also result in occupational exposure to toxic substances such as platinum, chromium, vanadium, manganese, mercury, cyanide and diesel particulate. South Africa has set occupational exposure limits (OELs) for some hazards, but mine workers are still at risk. Since the hazard posed by a mineral depends on its physicochemical properties, it is recommended that South Africa should not simply adopt OELs from other countries but rather set her own standards based on local toxicity studies. The limits should take into account the issue of mixtures to which workers could be exposed as well as the health status of the workers. The mining industry is also a source of contamination of the environment, due inter alia to the large areas of tailings dams and dumps left behind. Therefore, there is a need to develop guidelines for safe land-uses of contaminated lands after mine closure. PMID:26614808

  18. The Usage of Association Rule Mining to Identify Influencing Factors on Deafness After Birth

    PubMed Central

    Shahraki, Azimeh Danesh; Safdari, Reza; Gahfarokhi, Hamid Habibi; Tahmasebian, Shahram

    2015-01-01

    Background: Providing complete, high-quality health care services plays a very important role in enabling people to understand the factors related to personal and social health and to choose suitable healthy behaviors in order to achieve a healthy life. For this reason, demographic and clinical data are collected from individuals, and this large volume of data is a valuable resource for analysis, exploration, and the discovery of useful information and relationships. This study used association rule techniques from data mining to identify the factors affecting hearing loss after birth in Iran. Materials and Methods: This was a data-oriented study. The study population consisted of questionnaires collected in several provinces of the country. First, all questionnaire data were stored as tables in SQL Server, with data entry performed through custom software written in C# .NET; the association algorithm was then run in SQL Server Data Tools and Clementine to determine the rules and hidden patterns in the gathered data. Findings: Two factors, the number of deaf brothers and the degree of consanguinity of the parents, have a significant impact on the severity of an individual's deafness. In addition, when hearing loss is moderately severe or worse, people use hearing aids, and men are less interested in using hearing aids. Conclusion: In families where the parents are first-degree (girl/boy cousins) or second-degree relatives (girl/boy cousins), and especially first-degree relatives, more people have severe hearing loss or deafness; in the use of hearing aids, the patient's gender is more important than the severity of the hearing loss. PMID:26862245

  19. A genomic approach to identify hybrid incompatibility genes

    PubMed Central

    Cooper, Jacob C.; Phadnis, Nitin

    2016-01-01

    ABSTRACT Uncovering the genetic and molecular basis of barriers to gene flow between populations is key to understanding how new species are born. Intrinsic postzygotic reproductive barriers such as hybrid sterility and hybrid inviability are caused by deleterious genetic interactions known as hybrid incompatibilities. The difficulty in identifying these hybrid incompatibility genes remains a rate-limiting step in our understanding of the molecular basis of speciation. We recently described how whole genome sequencing can be applied to identify hybrid incompatibility genes, even from genetically terminal hybrids. Using this approach, we discovered a new hybrid incompatibility gene, gfzf, between Drosophila melanogaster and Drosophila simulans, and found that it plays an essential role in cell cycle regulation. Here, we discuss the history of the hunt for incompatibility genes between these species, discuss the molecular roles of gfzf in cell cycle regulation, and explore how intragenomic conflict drives the evolution of fundamental cellular mechanisms that lead to the developmental arrest of hybrids. PMID:27230814

  20. Quantiles Regression Approach to Identifying the Determinant of Breastfeeding Duration

    NASA Astrophysics Data System (ADS)

    Mahdiyah; Norsiah Mohamed, Wan; Ibrahim, Kamarulzaman

    In this study, a quantile regression approach is applied to data from the Malaysian Family Life Survey (MFLS) to identify factors that are significantly related to the different conditional quantiles of breastfeeding duration. Classical linear regression methods are based on minimizing the residual sum of squares, whereas quantile regression uses a mechanism based on the conditional median function and the full range of other conditional quantile functions. Overall, the period of breastfeeding is found to be significantly related to place of living, religion and the total number of children in the family.
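
    The contrast with least squares can be made concrete through the "pinball" (check) loss that quantile methods minimize in place of squared residuals. The sketch below handles only the simplest case with no covariates, where the minimizer is a sample quantile; a full quantile regression fits a linear predictor under the same loss. The duration values are hypothetical and are not taken from the MFLS.

```python
def pinball_loss(q, values, tau):
    """Mean check loss of candidate quantile q at level tau (0 < tau < 1)."""
    total = 0.0
    for y in values:
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(values)


def empirical_quantile(values, tau):
    """Return the sample value that minimizes the pinball loss at level tau."""
    return min(sorted(values), key=lambda q: pinball_loss(q, values, tau))


# Hypothetical breastfeeding durations in months.
durations = [2, 4, 6, 9, 12, 18, 24]

print(empirical_quantile(durations, 0.5))  # -> 9 (the sample median)
print(empirical_quantile(durations, 0.9))  # -> 24
```

    At tau = 0.5 the loss reduces to half the mean absolute deviation, which is why the median takes the place that the mean holds in least squares.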

  1. Genetic approaches for identifying kinetochore components in Saccharomyces cerevisiae

    SciTech Connect

    Doheny, K.F.; Puziss, J.; Spencer, F.; Hieter, P.

    1993-12-31

    A fundamental aspect of the cell division cycle is the chromosome cycle in which each of the chromosomal DNA molecules undergoes a series of morphological changes and complex movements to ensure faithful distribution at mitosis. The gene products responsible for execution of the chromosome cycle include structural components, such as those that assemble into the mitotic spindle apparatus, and regulatory components, such as those that coordinate the ordered series of events leading to chromosome segregation within the cell cycle. We have been taking several genetic approaches to identify genes encoding determinants critical to the chromosome cycle in the budding yeast, S. cerevisiae.

  2. Practical Approaches for Mining Frequent Patterns in Molecular Datasets.

    PubMed

    Naulaerts, Stefan; Moens, Sandy; Engelen, Kristof; Berghe, Wim Vanden; Goethals, Bart; Laukens, Kris; Meysman, Pieter

    2016-01-01

    Pattern detection is an inherent task in the analysis and interpretation of complex and continuously accumulating biological data. Numerous itemset mining algorithms have been developed in the last decade to efficiently detect specific pattern classes in data. Although many of these have proven their value for addressing bioinformatics problems, several factors still slow down promising algorithms from gaining popularity in the life science community. Many of these issues stem from the low user-friendliness of these tools and the complexity of their output, which is often large, static, and consequently hard to interpret. Here, we apply three software implementations on common bioinformatics problems and illustrate some of the advantages and disadvantages of each, as well as inherent pitfalls of biological data mining. Frequent itemset mining exists in many different flavors, and users should decide their software choice based on their research question, programming proficiency, and added value of extra features. PMID:27168722
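
    For readers unfamiliar with the task, frequent itemset mining looks for sets of items that co-occur in at least a chosen fraction of transactions (the minimum support). The sketch below is a brute-force enumeration written only for illustration; it is not one of the three software implementations compared in the paper, and the gene labels and support threshold are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset tuple: support} for itemsets meeting min_support.

    Brute-force enumeration of all candidate itemsets up to max_size;
    practical tools prune candidates (for example, Apriori-style) instead.
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Hypothetical transactions: sets of genes observed together in five samples.
transactions = [
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneB"},
    {"geneA", "geneC"},
    {"geneB", "geneC"},
    {"geneA", "geneB", "geneC"},
]

for itemset, support in frequent_itemsets(transactions, min_support=0.6).items():
    print(itemset, support)
```

    With these inputs every single gene and every pair meets the 0.6 threshold, while the full triple (support 0.4) does not.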

  3. Practical Approaches for Mining Frequent Patterns in Molecular Datasets

    PubMed Central

    Naulaerts, Stefan; Moens, Sandy; Engelen, Kristof; Berghe, Wim Vanden; Goethals, Bart; Laukens, Kris; Meysman, Pieter

    2016-01-01

    Pattern detection is an inherent task in the analysis and interpretation of complex and continuously accumulating biological data. Numerous itemset mining algorithms have been developed in the last decade to efficiently detect specific pattern classes in data. Although many of these have proven their value for addressing bioinformatics problems, several factors still slow down promising algorithms from gaining popularity in the life science community. Many of these issues stem from the low user-friendliness of these tools and the complexity of their output, which is often large, static, and consequently hard to interpret. Here, we apply three software implementations on common bioinformatics problems and illustrate some of the advantages and disadvantages of each, as well as inherent pitfalls of biological data mining. Frequent itemset mining exists in many different flavors, and users should decide their software choice based on their research question, programming proficiency, and added value of extra features. PMID:27168722

  4. Functional epigenetic approach identifies frequently methylated genes in Ewing sarcoma.

    PubMed

    Alholle, Abdullah; Brini, Anna T; Gharanei, Seley; Vaiyapuri, Sumathi; Arrigoni, Elena; Dallol, Ashraf; Gentle, Dean; Kishida, Takeshi; Hiruma, Toru; Avigad, Smadar; Grimer, Robert; Maher, Eamonn R; Latif, Farida

    2013-11-01

    Using a candidate gene approach we recently identified frequent methylation of the RASSF2 gene associated with poor overall survival in Ewing sarcoma (ES). To identify effective biomarkers in ES on a genome-wide scale, we used a functionally proven epigenetic approach, in which gene expression was induced in ES cell lines by treatment with a demethylating agent followed by hybridization onto high density gene expression microarrays. After following a strict selection criterion, 34 genes were selected for expression and methylation analysis in ES cell lines and primary ES. Eight genes (CTHRC1, DNAJA4, ECHDC2, NEFH, NPTX2, PHF11, RARRES2, TSGA14) showed methylation frequencies of >20% in ES tumors (range 24-71%); these genes were expressed in human bone marrow derived mesenchymal stem cells (hBMSC), and hypermethylation was associated with transcriptional silencing. Methylation of NPTX2 or PHF11 was associated with poorer prognosis in ES. In addition, six of the above genes also showed methylation frequency of >20% (range 36-50%) in osteosarcomas. Identification of these genes may provide insights into bone cancer tumorigenesis and development of epigenetic biomarkers for prognosis and detection of these rare tumor types. PMID:24005033

  5. Experimental approaches to identify non-coding RNAs

    PubMed Central

    Hüttenhofer, Alexander; Vogel, Jörg

    2006-01-01

    Cellular RNAs that do not function as messenger RNAs (mRNAs), transfer RNAs (tRNAs) or ribosomal RNAs (rRNAs) comprise a diverse class of molecules that are commonly referred to as non-protein-coding RNAs (ncRNAs). These molecules have been known for quite a while, but their importance was not fully appreciated until recent genome-wide searches discovered thousands of these molecules and their genes in a variety of model organisms. Some of these screens were based on biocomputational prediction of ncRNA candidates within entire genomes of model organisms. Alternatively, direct biochemical isolation of expressed ncRNAs from cells, tissues or entire organisms has been shown to be a powerful approach to identify ncRNAs both at the level of individual molecules and at a global scale. In this review, we will survey several such wet-lab strategies, i.e. direct sequencing of ncRNAs, shotgun cloning of small-sized ncRNAs (cDNA libraries), microarray analysis and genomic SELEX to identify novel ncRNAs, and discuss the advantages and limits of these approaches. PMID:16436800

  6. Detecting Structural Damage of Nuclear Power Plant by Interactive Data Mining Approach

    SciTech Connect

    Yufei Shu

    2006-07-01

    This paper presents a nonlinear structural damage identification technique, based on an interactive data mining approach, which integrates a human cognitive model in a data mining loop. A mining control agent emulating human analysts is developed, which directly interacts with the data miner, analyzing and verifying the output of the data miner and controlling the data mining process. Additionally, an artificial neural network method, which is adopted as a core component of the proposed interactive data mining method, is evolved by adding a novelty detecting and retraining function for handling complicated nuclear power plant quake-proof data. Plant quake-proof testing data has been applied to the system to show the validation of the proposed method. (author)

  7. An online approach for mining collective behaviors from molecular dynamics simulations.

    PubMed

    Ramanathan, Arvind; Agarwal, Pratul K; Kurnikova, Maria; Langmead, Christopher J

    2010-03-01

    Collective behavior involving distally separate regions in a protein is known to widely affect its function. In this article, we present an online approach to study and characterize collective behavior in proteins as molecular dynamics (MD) simulations progress. Our representation of MD simulations as a stream of continuously evolving data allows us to succinctly capture spatial and temporal dependencies that may exist and analyze them efficiently using data mining techniques. By using tensor analysis we identify (a) collective motions (i.e., dynamic couplings) and (b) time-points during the simulation where the collective motions suddenly change. We demonstrate the applicability of this method on two different protein simulations for barnase and cyclophilin A. We characterize the collective motions in these proteins using our method and analyze sudden changes in these motions. Taken together, our results indicate that tensor analysis is well suited to extracting information from MD trajectories in an online fashion. PMID:20377447

  8. An Integrated Approach to Identifying International Foodborne Norovirus Outbreaks

    PubMed Central

    Kouyos, Roger D.; Vennema, Harry; Kroneman, Annelies; Siebenga, Joukje; van Pelt, Wilfrid; Koopmans, Marion

    2011-01-01

    International foodborne norovirus outbreaks can be difficult to recognize when using standard outbreak investigation methods. In a novel approach, we provide step-wise selection criteria to identify clusters of outbreaks that may involve an internationally distributed common foodborne source. After computerized linking of epidemiologic data to aligned sequences, we retrospectively identified 100 individually reported outbreaks that potentially represented 14 international common source events in Europe during 1999–2008. Analysis of capsid sequences of outbreak strains (n = 1,456), showed that ≈7% of outbreaks reported to the Foodborne Viruses in Europe database were part of an international event (range 2%–9%), compared with 0.4% identified through standard epidemiologic investigations. Our findings point to a critical gap in surveillance and suggest that international collaboration could have increased the number of recognized international foodborne outbreaks. Real-time exchange of combined epidemiologic and molecular data is needed to validate our findings through timely trace-backs of clustered outbreaks. PMID:21392431

  9. Diagnosis of cardiovascular abnormalities from compressed ECG: a data mining-based approach.

    PubMed

    Sufi, Fahim; Khalil, Ibrahim

    2011-01-01

    Usage of compressed ECG for fast and efficient telecardiology application is crucial, as ECG signals are enormously large in size. However, conventional ECG diagnosis algorithms require the compressed ECG packets to be decompressed before diagnosis can be performed. This added step of decompression before performing diagnosis for every ECG packet introduces unnecessary delay, which is undesirable for cardiovascular diseased (CVD) patients. In this paper, we demonstrate an innovative technique that performs real-time classification of CVD. With the help of this real-time classification of CVD, the emergency personnel or the hospital can automatically be notified via SMS/MMS/e-mail when a life-threatening cardiac abnormality of the CVD affected patient is detected. Our proposed system initially uses data mining techniques, such as attribute selection (i.e., selects only a few features from the compressed ECG) and expectation maximization (EM)-based clustering. These data mining techniques running on a hospital server generate a set of constraints for representing each of the abnormalities. Then, the patient's mobile phone receives this set of constraints and employs a rule-based system that can identify each of the abnormal beats in real time. Our experimentation results on 50 MIT-BIH ECG entries reveal that the proposed approach can successfully detect cardiac abnormalities (e.g., ventricular flutter/fibrillation, premature ventricular contraction, atrial fibrillation, etc.) with 97% accuracy on average. This innovative data mining technique on compressed ECG packets enables faster identification of cardiac abnormality directly from the compressed ECG, helping to build an efficient telecardiology diagnosis system. PMID:21097383

  10. Identifying Subgroups among Hardcore Smokers: a Latent Profile Approach

    PubMed Central

    Bommelé, Jeroen; Kleinjan, Marloes; Schoenmakers, Tim M.; Burk, William J.; van den Eijnden, Regina; van de Mheen, Dike

    2015-01-01

    Introduction Hardcore smokers are smokers who have little to no intention to quit. Previous research suggests that there are distinct subgroups among hardcore smokers and that these subgroups vary in the perceived pros and cons of smoking and quitting. Identifying these subgroups could help to develop individualized messages for the group of hardcore smokers. In this study we therefore used the perceived pros and cons of smoking and quitting to identify profiles among hardcore smokers. Methods A sample of 510 hardcore smokers completed an online survey on the perceived pros and cons of smoking and quitting. We used these perceived pros and cons in a latent profile analysis to identify possible subgroups among hardcore smokers. To validate the profiles identified among hardcore smokers, we analysed data from a sample of 338 non-hardcore smokers in a similar way. Results We found three profiles among hardcore smokers. ‘Receptive’ hardcore smokers (36%) perceived many cons of smoking and many pros of quitting. ‘Ambivalent’ hardcore smokers (59%) were rather undecided towards quitting. ‘Resistant’ hardcore smokers (5%) saw few cons of smoking and few pros of quitting. Among non-hardcore smokers, we found similar groups of ‘receptive’ smokers (30%) and ‘ambivalent’ smokers (54%). However, a third group consisted of ‘disengaged’ smokers (16%), who saw few pros and cons of both smoking and quitting. Discussion Among hardcore smokers, we found three distinct profiles based on perceived pros and cons of smoking. This indicates that hardcore smokers are not a homogenous group. Each profile might require a different tobacco control approach. Our findings may help to develop individualized tobacco control messages for the particularly hard-to-reach group of hardcore smokers. PMID:26207829

  11. A Visualization System Using Data Mining Techniques for Identifying Information Sources.

    ERIC Educational Resources Information Center

    Fowler, Richard H.; Karadayi, Tarkan; Chen, Zhixiang; Meng, Xiannong; Fowler, Wendy A. Lawrence

    The Visual Analysis System (VAS) was developed to couple emerging successes in data mining with information visualization techniques in order to create a richly interactive environment for information retrieval from the World Wide Web. VAS's retrieval strategy operates by first using a conventional search engine to form a core set of retrieved…

  12. Novel approaches to identify protein adducts produced by lipid peroxidation.

    PubMed

    Codreanu, S G; Liebler, D C

    2015-01-01

    Lipid peroxidation is responsible for the generation of chemically reactive, diffusible lipid-derived electrophiles (LDEs) that covalently modify cellular protein targets. These protein modifications modulate protein activity and macromolecular interactions and induce adaptive and toxic cell signaling. Protein modifications induced by LDEs can be identified and quantified by affinity enrichment and liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based techniques. Tagged LDE analog probes with different electrophilic groups can be covalently captured by click chemistry for LC-MS/MS analyses, thereby enabling in-depth studies of proteome damage at the protein and peptide sequence levels. Conversely, click-reactive, thiol-directed probes can be used to evaluate thiol damage caused by LDEs by difference. These analytical approaches permit systematic study of the dynamics of protein damage caused by LDEs and the mechanisms by which oxidative stress contributes to toxicity and diseases. PMID:25819163

  13. A metabolomics approach to characterise and identify various Mycobacterium species.

    PubMed

    Olivier, Ilse; Loots, Du Toit

    2012-03-01

    We investigated the potential use of gas chromatography mass spectrometry (GC-MS), in combination with multivariate statistical data processing, to build a model for the classification of various tuberculosis (TB) causing, and non-TB Mycobacterium species, on the basis of their characteristic metabolite profiles. A modified Bligh-Dyer extraction procedure was used to extract lipid components from Mycobacterium tuberculosis, Mycobacterium avium, Mycobacterium bovis, and Mycobacterium kansasii cultures. Principal component analyses (PCA) of the GC-MS generated data showed a clear differentiation between all the Mycobacterium species tested. Subsequently, the 12 compounds best describing the variation between the sample groups were identified as potential metabolite markers, using PCA and partial least-squares discriminant analysis (PLS-DA). These metabolite markers were then used to build a discriminant classification model based on Bayes' theorem, in conjunction with multivariate kernel density estimation. This model subsequently correctly classified 2 "unknown" samples for each of the Mycobacterium species analysed, with probabilities ranging from 72 to 100%. Furthermore, Mycobacterium species classification could be achieved in less than 16 h, and the detection limit for this approach was 1×10^3 bacteria mL^-1. This study proves the capacity of a GC-MS, metabolomics pattern recognition approach for its possible use in TB diagnostics and disease characterisation. PMID:22301369

  14. Data-mining the FlyAtlas online resource to identify core functional motifs across transporting epithelia

    PubMed Central

    2013-01-01

    Background Comparative analysis of tissue-specific transcriptomes is a powerful technique to uncover tissue functions. Our FlyAtlas.org provides authoritative gene expression levels for multiple tissues of Drosophila melanogaster (1). Although the main use of such resources is single gene lookup, there is the potential for powerful meta-analysis to address questions that could not easily be framed otherwise. Here, we illustrate the power of data-mining of FlyAtlas data by comparing epithelial transcriptomes to identify a core set of highly-expressed genes, across the four major epithelial tissues (salivary glands, Malpighian tubules, midgut and hindgut) of both adults and larvae. Method Parallel hypothesis-led and hypothesis-free approaches were adopted to identify core genes that underpin insect epithelial function. In the former, gene lists were created from transport processes identified in the literature, and their expression profiles mapped from the flyatlas.org online dataset. In the latter, gene enrichment lists were prepared for each epithelium, and genes (both transport related and unrelated) consistently enriched in transporting epithelia identified. Results A key set of transport genes, comprising V-ATPases, cation exchangers, aquaporins, potassium and chloride channels, and carbonic anhydrase, was found to be highly enriched across the epithelial tissues, compared with the whole fly. Additionally, a further set of genes that had not been predicted to have epithelial roles, were co-expressed with the core transporters, extending our view of what makes a transporting epithelium work. Further insights were obtained by studying the genes uniquely overexpressed in each epithelium; for example, the salivary gland expresses lipases, the midgut organic solute transporters, the tubules specialize for purine metabolism and the hindgut overexpresses still unknown genes. 
    Conclusion Taken together, these data provide a unique insight into epithelial function in this

  15. Using Bioinformatic Approaches to Identify Pathways Targeted by Human Leukemogens

    PubMed Central

    Thomas, Reuben; Phuong, Jimmy; McHale, Cliona M.; Zhang, Luoping

    2012-01-01

    We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other. PMID:22851955
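
    The "76% chance of distinguishing a random leukemogen/non-leukemogen pair" reads like the pairwise interpretation of a classifier's ranking quality: the fraction of (positive, negative) pairs in which the positive item receives the higher score. The abstract does not say how the estimate was computed, so the sketch below only illustrates that interpretation. The function name and the scores are hypothetical, not data from the study.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This equals the area under the
    ROC curve for the same scores.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores (higher = more leukemogen-like).
leukemogen_scores = [0.9, 0.8, 0.6, 0.4]
non_leukemogen_scores = [0.7, 0.5, 0.3]

auc = pairwise_auc(leukemogen_scores, non_leukemogen_scores)
print(f"chance of ordering a random pair correctly: {auc:.2f}")
```

    A value of 0.5 would mean the scores carry no information about class membership; 1.0 would mean every pair is ordered correctly.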

  16. Data mining approaches for information retrieval from genomic databases

    NASA Astrophysics Data System (ADS)

    Liu, Donglin; Singh, Gautam B.

    2000-04-01

    Sequence retrieval in genomic databases is used for finding sequences related to a query sequence specified by a user. Comparison is the main part of the retrieval system in genomic databases. An efficient sequence comparison algorithm is critical in bioinformatics. There are several different algorithms to perform sequence comparison, such as the suffix array based database search, divergence measurement, methods that rely upon the existence of a local similarity between the query sequence and sequences in the database, or common mutual information between query and sequences in DB. In this paper we have described a new method for DNA sequence retrieval based on data mining techniques. Data mining tools generally find patterns among data and have been successfully applied in industries to improve marketing, sales, and customer support operations. We have applied the descriptive data mining techniques to find relevant patterns that are significant for comparing genetic sequences. Relevance feedback score based on common patterns is developed and employed to compute distance between sequences. The contigs of human chromosomes are used to test the retrieval accuracy and the experimental results are presented.

  17. An Approach to Realizing Process Control for Underground Mining Operations of Mobile Machines

    PubMed Central

    Song, Zhen; Schunnesson, Håkan; Rinne, Mikael; Sturgul, John

    2015-01-01

    The excavation and production in underground mines are complicated processes which consist of many different operations. The process of underground mining is considerably constrained by the geometry and geology of the mine. The various mining operations are normally performed in series at each working face. A delay in a single operation leads to a domino effect, delaying the start of the next operation and the completion of the entire process. This paper presents a new approach to the process control for underground mining operations, e.g. drilling, bolting, mucking. This approach can estimate the working time and its probability for each operation more efficiently and objectively by improving the existing PERT (Program Evaluation and Review Technique) and CPM (Critical Path Method). If the delay of the critical operation (which is on a critical path) inevitably affects the productivity of mined ore, the approach can rapidly assign mucking machines new jobs to maximize this amount, using a new mucking algorithm under external constraints. PMID:26062092
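
    For orientation, the classical PERT calculation that the paper sets out to improve can be sketched as follows. This is the textbook three-point estimate applied to operations performed in series, not the authors' modified method; the operation durations, the function names, and the normal approximation for the cycle time are assumptions made for illustration.

```python
from statistics import NormalDist


def pert_estimate(optimistic, most_likely, pessimistic):
    """Classical PERT expected duration and variance for one operation."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    variance = ((pessimistic - optimistic) / 6) ** 2
    return expected, variance


def cycle_estimate(ops):
    """Sum means and variances of operations performed in series."""
    total_mean = 0.0
    total_var = 0.0
    for optimistic, most_likely, pessimistic in ops.values():
        mean, var = pert_estimate(optimistic, most_likely, pessimistic)
        total_mean += mean
        total_var += var
    return total_mean, total_var


def probability_within(deadline, mean, variance):
    """Normal-approximation probability that the cycle ends by `deadline`."""
    return NormalDist(mean, variance ** 0.5).cdf(deadline)


# Hypothetical durations in minutes: (optimistic, most likely, pessimistic).
operations = {
    "drilling": (50, 60, 82),
    "bolting": (30, 40, 56),
    "mucking": (40, 45, 62),
}

mean, var = cycle_estimate(operations)
print(f"expected cycle time: {mean:.1f} min")
print(f"P(done within 160 min): {probability_within(160, mean, var):.2f}")
```

    Because the operations run in series at a working face, every operation is on the critical path in this sketch, so the means and variances simply add.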

  18. Dual-band, infrared buried mine detection using a statistical pattern recognition approach

    SciTech Connect

    Buhl, M.R.; Hernandez, J.E.; Clark, G.A.; Sengupta, S.K.

    1993-08-01

    The main objective of this work was to detect surrogate land mines, which were buried in clay and sand, using dual-band, infrared images. A statistical pattern recognition approach was used to achieve this objective. This approach is discussed and results of applying it to real images are given.

  19. THE FUTURE OF COMPUTER-BASED TOXICITY PREDICTION: MECHANISM-BASED MODELS VS. INFORMATION MINING APPROACHES

    EPA Science Inventory


    When we speak of computer-based toxicity prediction, we are generally referring to a broad array of approaches which rely primarily upon chemical structure ...

  20. A Feature Mining Based Approach for the Classification of Text Documents into Disjoint Classes.

    ERIC Educational Resources Information Center

    Nieto Sanchez, Salvador; Triantaphyllou, Evangelos; Kraft, Donald

    2002-01-01

    Proposes a new approach for classifying text documents into two disjoint classes. Highlights include a brief overview of document clustering; a data mining approach called the One Clause at a Time (OCAT) algorithm which is based on mathematical logic; vector space model (VSM); and comparing the OCAT to the VSM. (Author/LRW)

  1. A review of approaches to identifying patient phenotype cohorts using electronic health records

    PubMed Central

    Shivade, Chaitanya; Raghavan, Preethi; Fosler-Lussier, Eric; Embi, Peter J; Elhadad, Noemie; Johnson, Stephen B; Lai, Albert M

    2014-01-01

    Objective To summarize literature describing approaches aimed at automatically identifying patients with a common phenotype. Materials and methods We performed a review of studies describing systems or reporting techniques developed for identifying cohorts of patients with specific phenotypes. Every full text article published in (1) Journal of American Medical Informatics Association, (2) Journal of Biomedical Informatics, (3) Proceedings of the Annual American Medical Informatics Association Symposium, and (4) Proceedings of Clinical Research Informatics Conference within the past 3 years was assessed for inclusion in the review. Only articles using automated techniques were included. Results Ninety-seven articles met our inclusion criteria. Forty-six used natural language processing (NLP)-based techniques, 24 described rule-based systems, 41 used statistical analyses, data mining, or machine learning techniques, while 22 described hybrid systems. Nine articles described the architecture of large-scale systems developed for determining cohort eligibility of patients. Discussion We observe that there is a rise in the number of studies associated with cohort identification using electronic medical records. Statistical analyses or machine learning, followed by NLP techniques, are gaining popularity over the years in comparison with rule-based systems. Conclusions There are a variety of approaches for classifying patients into a particular phenotype. Different techniques and data sources are used, and good performance is reported on datasets at respective institutions. However, no system makes comprehensive use of electronic medical records addressing all of their known weaknesses. PMID:24201027

  2. Mining 3D genome structure populations identifies major factors governing the stability of regulatory communities

    PubMed Central

    Dai, Chao; Li, Wenyuan; Tjong, Harianto; Hao, Shengli; Zhou, Yonggang; Li, Qingjiao; Chen, Lin; Zhu, Bing; Alber, Frank; Jasmine Zhou, Xianghong

    2016-01-01

    Three-dimensional (3D) genome structures vary from cell to cell even in an isogenic sample. Unlike protein structures, genome structures are highly plastic, posing a significant challenge for structure-function mapping. Here we report an approach to comprehensively identify 3D chromatin clusters that each occurs frequently across a population of genome structures, either deconvoluted from ensemble-averaged Hi-C data or from a collection of single-cell Hi-C data. Applying our method to a population of genome structures (at the macrodomain resolution) of lymphoblastoid cells, we identify an atlas of stable inter-chromosomal chromatin clusters. A large number of these clusters are enriched in binding of specific regulatory factors and are therefore defined as ‘Regulatory Communities’. We reveal two major factors, centromere clustering and transcription factor binding, which significantly stabilize such communities. Finally, we show that the regulatory communities differ substantially from cell to cell, indicating that expression variability could be impacted by genome structures. PMID:27240697

  3. Social Network Analysis and Mining to Monitor and Identify Problems with Large-Scale Information and Communication Technology Interventions

    PubMed Central

    da Silva, Aleksandra do Socorro; de Brito, Silvana Rossy; Vijaykumar, Nandamudi Lankalapalli; da Rocha, Cláudio Alex Jorge; Monteiro, Maurílio de Abreu; Costa, João Crisóstomo Weyl Albuquerque; Francês, Carlos Renato Lisboa

    2016-01-01

    The published literature reveals several arguments concerning the strategic importance of information and communication technology (ICT) interventions for developing countries where the digital divide is a challenge. Large-scale ICT interventions can be an option for countries whose regions, both urban and rural, present a high number of digitally excluded people. Our goal was to monitor and identify problems in interventions aimed at certification for a large number of participants in different geographical regions. Our case study is the training at the Telecentros.BR, a program created in Brazil to install telecenters and certify individuals to use ICT resources. We propose an approach that applies social network analysis and mining techniques to data collected from Telecentros.BR dataset and from the socioeconomics and telecommunications infrastructure indicators of the participants’ municipalities. We found that (i) the analysis of interactions in different time periods reflects the objectives of each phase of training, highlighting the increased density in the phase in which participants develop and disseminate their projects; (ii) analysis according to the roles of participants (i.e., tutors or community members) reveals that the interactions were influenced by the center (or region) to which the participant belongs (that is, a community contained mainly members of the same region and always with the presence of tutors, contradicting expectations of the training project, which aimed for intense collaboration of the participants, regardless of the geographic region); (iii) the social network of participants influences the success of the training: that is, given evidence that the degree of the community member is in the highest range, the probability of this individual concluding the training is 0.689; (iv) the North region presented the lowest probability of participant certification, whereas the Northeast, which served municipalities with similar

  4. Social Network Analysis and Mining to Monitor and Identify Problems with Large-Scale Information and Communication Technology Interventions.

    PubMed

    da Silva, Aleksandra do Socorro; de Brito, Silvana Rossy; Vijaykumar, Nandamudi Lankalapalli; da Rocha, Cláudio Alex Jorge; Monteiro, Maurílio de Abreu; Costa, João Crisóstomo Weyl Albuquerque; Francês, Carlos Renato Lisboa

    2016-01-01

    The published literature reveals several arguments concerning the strategic importance of information and communication technology (ICT) interventions for developing countries where the digital divide is a challenge. Large-scale ICT interventions can be an option for countries whose regions, both urban and rural, present a high number of digitally excluded people. Our goal was to monitor and identify problems in interventions aimed at certification for a large number of participants in different geographical regions. Our case study is the training at the Telecentros.BR, a program created in Brazil to install telecenters and certify individuals to use ICT resources. We propose an approach that applies social network analysis and mining techniques to data collected from Telecentros.BR dataset and from the socioeconomics and telecommunications infrastructure indicators of the participants' municipalities. We found that (i) the analysis of interactions in different time periods reflects the objectives of each phase of training, highlighting the increased density in the phase in which participants develop and disseminate their projects; (ii) analysis according to the roles of participants (i.e., tutors or community members) reveals that the interactions were influenced by the center (or region) to which the participant belongs (that is, a community contained mainly members of the same region and always with the presence of tutors, contradicting expectations of the training project, which aimed for intense collaboration of the participants, regardless of the geographic region); (iii) the social network of participants influences the success of the training: that is, given evidence that the degree of the community member is in the highest range, the probability of this individual concluding the training is 0.689; (iv) the North region presented the lowest probability of participant certification, whereas the Northeast, which served municipalities with similar

  5. A text mining approach to the prediction of disease status from clinical discharge summaries.

    PubMed

    Yang, Hui; Spasic, Irena; Keane, John A; Nenadic, Goran

    2009-01-01

    OBJECTIVE The authors present a system developed for the Challenge in Natural Language Processing for Clinical Data (the i2b2 obesity challenge), whose aim was to automatically identify the status of obesity and 15 related co-morbidities in patients using their clinical discharge summaries. The challenge consisted of two tasks, textual and intuitive. The textual task was to identify explicit references to the diseases, whereas the intuitive task focused on the prediction of the disease status when the evidence was not explicitly asserted. DESIGN The authors assembled a set of resources to lexically and semantically profile the diseases and their associated symptoms, treatments, etc. These features were explored in a hybrid text mining approach, which combined dictionary look-up, rule-based, and machine-learning methods. MEASUREMENTS The methods were applied on a set of 507 previously unseen discharge summaries, and the predictions were evaluated against a manually prepared gold standard. The overall ranking of the participating teams was primarily based on the macro-averaged F-measure. RESULTS The implemented method achieved the macro-averaged F-measure of 81% for the textual task (which was the highest achieved in the challenge) and 63% for the intuitive task (ranked 7th out of 28 teams; the highest was 66%). The micro-averaged F-measure showed an average accuracy of 97% for textual and 96% for intuitive annotations. CONCLUSIONS The performance achieved was in line with the agreement between human annotators, indicating the potential of text mining for accurate and efficient prediction of disease statuses from clinical discharge summaries. PMID:19390098
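
    The gap between the macro-averaged (81%, 63%) and micro-averaged (97%, 96%) F-measures is what one would expect when some classes are rare and hard to predict. The sketch below shows the two averages side by side. The class names and counts are hypothetical and are not taken from the challenge data.

```python
def f1(tp, fp, fn):
    """F-measure (F1) from true-positive, false-positive, false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts):
    """Unweighted mean of per-class F1: every class counts equally."""
    return sum(f1(*c) for c in counts.values()) / len(counts)


def micro_f1(counts):
    """F1 over pooled counts: every individual decision counts equally."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn)


# Hypothetical per-class counts: (tp, fp, fn).
counts = {
    "present": (90, 5, 5),
    "absent": (80, 5, 5),
    "questionable": (1, 2, 4),  # rare class, poorly predicted
}

print(f"macro F1 = {macro_f1(counts):.3f}")
print(f"micro F1 = {micro_f1(counts):.3f}")
```

    The rare class pulls the macro average down sharply while barely moving the micro average, which is why a ranking based on the macro-averaged F-measure rewards systems that handle infrequent statuses well.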

  6. Assessing Weather-Yield Relationships in Rice at Local Scale Using Data Mining Approaches

    PubMed Central

    Delerce, Sylvain; Dorado, Hugo; Grillon, Alexandre; Rebolledo, Maria Camila; Prager, Steven D.; Patiño, Victor Hugo; Garcés Varón, Gabriel; Jiménez, Daniel

    2016-01-01

    Seasonal and inter-annual climate variability have become important issues for farmers, and climate change has been shown to increase them. Simultaneously farmers and agricultural organizations are increasingly collecting observational data about in situ crop performance. Agriculture thus needs new tools to cope with changing environmental conditions and to take advantage of these data. Data mining techniques make it possible to extract embedded knowledge associated with farmer experiences from these large observational datasets in order to identify best practices for adapting to climate variability. We introduce new approaches through a case study on irrigated and rainfed rice in Colombia. Preexisting observational datasets of commercial harvest records were combined with in situ daily weather series. Using Conditional Inference Forest and clustering techniques, we assessed the relationships between climatic factors and crop yield variability at the local scale for specific cultivars and growth stages. The analysis showed clear relationships in the various location-cultivar combinations, with climatic factors explaining 6 to 46% of spatiotemporal variability in yield, and with crop responses to weather being non-linear and cultivar-specific. Climatic factors affected cultivars differently during each stage of development. For instance, one cultivar was affected by high nighttime temperatures in the reproductive stage but responded positively to accumulated solar radiation during the ripening stage. Another was affected by high nighttime temperatures during both the vegetative and reproductive stages. Clustering of the weather patterns corresponding to individual cropping events revealed different groups of weather patterns for irrigated and rainfed systems with contrasting yield levels. Best-suited cultivars were identified for some weather patterns, making weather-site-specific recommendations possible. This study illustrates the potential of data mining for

  7. Assessing Weather-Yield Relationships in Rice at Local Scale Using Data Mining Approaches.

    PubMed

    Delerce, Sylvain; Dorado, Hugo; Grillon, Alexandre; Rebolledo, Maria Camila; Prager, Steven D; Patiño, Victor Hugo; Garcés Varón, Gabriel; Jiménez, Daniel

    2016-01-01

    Seasonal and inter-annual climate variability have become important issues for farmers, and climate change has been shown to increase them. Simultaneously farmers and agricultural organizations are increasingly collecting observational data about in situ crop performance. Agriculture thus needs new tools to cope with changing environmental conditions and to take advantage of these data. Data mining techniques make it possible to extract embedded knowledge associated with farmer experiences from these large observational datasets in order to identify best practices for adapting to climate variability. We introduce new approaches through a case study on irrigated and rainfed rice in Colombia. Preexisting observational datasets of commercial harvest records were combined with in situ daily weather series. Using Conditional Inference Forest and clustering techniques, we assessed the relationships between climatic factors and crop yield variability at the local scale for specific cultivars and growth stages. The analysis showed clear relationships in the various location-cultivar combinations, with climatic factors explaining 6 to 46% of spatiotemporal variability in yield, and with crop responses to weather being non-linear and cultivar-specific. Climatic factors affected cultivars differently during each stage of development. For instance, one cultivar was affected by high nighttime temperatures in the reproductive stage but responded positively to accumulated solar radiation during the ripening stage. Another was affected by high nighttime temperatures during both the vegetative and reproductive stages. Clustering of the weather patterns corresponding to individual cropping events revealed different groups of weather patterns for irrigated and rainfed systems with contrasting yield levels. Best-suited cultivars were identified for some weather patterns, making weather-site-specific recommendations possible. This study illustrates the potential of data mining for

  8. A quantitative approach to identifying predators from nest remains

    USGS Publications Warehouse

    Anthony, R.M.; Grand, J.B.; Fondell, T.F.; Manly, B.F.

    2004-01-01

    Nesting success of Dusky Canada Geese (Branta canadensis occidentalis) has declined greatly since a major earthquake affected southern Alaska in 1964. To identify nest predators, we collected predation data at goose nests and photographs of predators at natural nests containing artificial eggs in 1997-2000. To document feeding behavior by nest predators, we compiled the evidence from destroyed nests with known predators on our study site and from previous studies. We constructed a profile for each predator group and compared the evidence from 895 nests with unknown predators to our predator profiles using mixture-model analysis. This analysis indicated that 72% of destroyed nests were depredated by Bald Eagles and 13% by brown bears, and also yielded the probability that each nest was correctly assigned to a predator group based on model fit. Model testing using simulations indicated that the proportion estimated for eagle predation was unbiased and the proportion for bear predation was slightly overestimated. This approach may have application whenever there are adequate data on nests destroyed by known predators and predators exhibit different feeding behavior at nests.

  9. A geometric approach to identify cavities in particle systems

    NASA Astrophysics Data System (ADS)

    Voyiatzis, Evangelos; Böhm, Michael C.; Müller-Plathe, Florian

    2015-11-01

    The implementation of a geometric algorithm to identify cavities in particle systems in an open-source Python program is presented. The algorithm makes use of the Delaunay space tessellation. The present Python software is based on platform-independent tools, leading to a portable program. Its successful execution provides information concerning the accessible volume fraction of the system, the size and shape of the cavities and the group of atoms forming each of them. The program can be easily incorporated into the LAMMPS software. An advantage of the present algorithm is that no a priori assumption on the cavity shape has to be made. As an example, the cavity size and shape distributions in a polyethylene melt system are presented for three spherical probe particles. This paper serves also as an introductory manual to the script. It summarizes the algorithm, its implementation, the required user-defined parameters as well as the format of the input and output files. Additionally, we demonstrate possible applications of our approach and compare its capability with that of well-documented cavity size estimators.

  10. Identifying predictors of physics item difficulty: A linear regression approach

    NASA Astrophysics Data System (ADS)

    Mesic, Vanes; Muratovic, Hasnija

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge

  11. A fluorescent approach for identifying P2X1 ligands

    PubMed Central

    Ruepp, Marc-David; Brozik, James A.; de Esch, Iwan J.P.; Farndale, Richard W.; Murrell-Lagnado, Ruth D.; Thompson, Andrew J.

    2015-01-01

    There are no commercially available, small, receptor-specific P2X1 ligands. There are several synthetic derivatives of the natural agonist ATP and some structurally-complex antagonists including compounds such as PPADS, NTP-ATP, suramin and its derivatives (e.g. NF279, NF449). NF449 is the most potent and selective ligand, but potencies of many others are not particularly high and they can also act at other P2X, P2Y and non-purinergic receptors. While there is clearly scope for further work on P2X1 receptor pharmacology, screening can be difficult owing to rapid receptor desensitisation. To reduce desensitisation, substitutions can be made within the N-terminus of the P2X1 receptor, but these could also affect ligand properties. An alternative is the use of fluorescent voltage-sensitive dyes that respond to membrane potential changes resulting from channel opening. Here we utilised this approach in conjunction with fragment-based drug-discovery. Using a single concentration (300 μM) we identified 46 novel leads from a library of 1443 fragments (hit rate = 3.2%). These hits were independently validated by measuring concentration-dependence with the same voltage-sensitive dye, and by visualising the competition of hits with an Alexa-647-ATP fluorophore using confocal microscopy; confocal microscopy yielded kon (1.142 × 10^6 M^-1 s^-1) and koff (0.136 s^-1) for Alexa-647-ATP (Kd = 119 nM). The identified hit fragments had promising structural diversity. In summary, the measurement of functional responses using voltage-sensitive dyes was flexible and cost-effective because labelled competitors were not needed, effects were independent of a specific binding site, and both agonist and antagonist actions were probed in a single assay. The method is widely applicable and could be applied to all P2X family members, as well as other voltage-gated and ligand-gated ion channels. This article is part of the Special Issue entitled ‘Fluorescent Tools in Neuropharmacology
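
    The reported figures can be cross-checked with two lines of arithmetic, reading the association rate as 1.142 × 10^6 M^-1 s^-1 (the exponent is flattened in the extracted text). The dissociation constant follows from Kd = koff / kon, and the hit rate from 46 hits out of 1443 fragments. The variable names below are illustrative only.

```python
# Rate constants as reported for Alexa-647-ATP.
k_on = 1.142e6   # association rate, M^-1 s^-1
k_off = 0.136    # dissociation rate, s^-1

# Equilibrium dissociation constant, converted from M to nM.
kd_nM = k_off / k_on * 1e9

# Screening hit rate as a percentage.
hits = 46
library_size = 1443
hit_rate_pct = 100 * hits / library_size

print(f"Kd = {kd_nM:.0f} nM")
print(f"hit rate = {hit_rate_pct:.1f}%")
```

    Both values agree with the abstract (Kd = 119 nM, hit rate = 3.2%), which also confirms the 10^6 reading of the association rate.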

  12. An Efficient Leave One Block Out approach to identify outliers

    NASA Astrophysics Data System (ADS)

    Biagi, Ludovico; Caldera, Stefano

    2013-03-01

    In Least Squares (LS), the linearized functional model between M observables and N unknown parameters is given. LS provides estimates of parameters, observables, residuals and a posteriori variance. To identify outliers and to estimate accuracies and reliabilities, tests on the model and on the individual residuals can be performed at different levels of significance and power. However, LS is not robust: one outlier could be spread into all the residuals and its identification is difficult. A possible solution to this problem is given by a Leave One Block Out approach. Let us suppose that the observation vector can be decomposed into m sub-vectors (blocks) that are reciprocally uncorrelated: in the case of completely uncorrelated observations, m = M. A suspected block is excluded from the adjustment, whose results are used to check it. Clearly, the check is more robust, because one outlier in the excluded block does not affect the adjustment results. The process can be repeated on all the blocks, but can be very slow, because m adjustments must be computed. To efficiently apply Leave One Block Out, an algorithm has been studied. The usual LS adjustment is performed on all the observations to obtain the 'batch' results. The contribution of each block is subtracted from the batch results by algebraic decompositions, with a minimal computational effort: this holds for parameters, a posteriori residuals and variance. Therefore all the blocks can be checked. In the paper, the algorithm is discussed. Two examples of ELOBO application are presented: the first demonstrates ELOBO's reliability against classical LS tests. In the second, ELOBO's numerical efficiency is analyzed.
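
    The idea of subtracting one block's contribution from the batch results can be illustrated on a toy problem. The sketch below is not the authors' ELOBO algorithm: it assumes a two-parameter line model y = a + b·t with unit weights, accumulates the batch normal equations once, and for each block subtracts that block's share before re-solving the small 2×2 system. The data, block layout and function names are all hypothetical.

```python
def normal_equations(rows):
    """Accumulate N = A^T A and u = A^T y for the line model y = a + b*t."""
    n00 = n01 = n11 = u0 = u1 = 0.0
    for t, y in rows:
        n00 += 1.0
        n01 += t
        n11 += t * t
        u0 += y
        u1 += t * y
    return (n00, n01, n11), (u0, u1)


def solve(normal, rhs):
    """Solve the symmetric 2x2 system by Cramer's rule; returns (a, b)."""
    n00, n01, n11 = normal
    u0, u1 = rhs
    det = n00 * n11 - n01 * n01
    return ((n11 * u0 - n01 * u1) / det, (n00 * u1 - n01 * u0) / det)


def leave_one_block_out(blocks):
    """For each block, fit without it and report its worst predicted residual."""
    batch_n, batch_u = normal_equations([row for b in blocks for row in b])
    results = []
    for block in blocks:
        blk_n, blk_u = normal_equations(block)
        # Subtract the block's contribution from the batch normal equations.
        reduced_n = tuple(x - y for x, y in zip(batch_n, blk_n))
        reduced_u = tuple(x - y for x, y in zip(batch_u, blk_u))
        a, b = solve(reduced_n, reduced_u)
        worst = max(abs(y - (a + b * t)) for t, y in block)
        results.append(((a, b), worst))
    return results


# Hypothetical data: y = 1 + 2 t exactly, except one gross error in block 2.
blocks = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
    [(4.0, 9.0), (5.0, 21.0)],  # 21.0 should be 11.0
    [(6.0, 13.0), (7.0, 15.0)],
]
results = leave_one_block_out(blocks)
suspect = max(range(len(blocks)), key=lambda k: results[k][1])
print("most suspicious block:", suspect)
```

    Because the excluded block never enters the reduced system, its outlier cannot leak into the parameters used to check it, which is the robustness argument made in the abstract. A real implementation would also update the a posteriori variance and apply a statistical test rather than comparing raw residuals.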

  13. Identifying Heterogeneous Anisotropic Properties in Cerebral Aneurysms: A Pointwise Approach

    PubMed Central

    Zhao, Xuefeng; Raghavan, Madhavan L.; Lu, Jia

    2014-01-01

    The traditional approaches of estimating heterogeneous properties in a soft tissue structure using optimization based inverse methods often face difficulties because of the large number of unknowns to be simultaneously determined. This article proposes a new method for identifying the heterogeneous anisotropic nonlinear elastic properties in cerebral aneurysms. In this method, the local properties are determined directly from the pointwise stress-strain data, thus avoiding the need for simultaneously optimizing for the property values at all points/regions in the aneurysm. The stress distributions needed for a pointwise identification are computed using an inverse elastostatic method without invoking the material properties in question. This paradigm is tested numerically through simulated inflation tests on an image-based cerebral aneurysm sac. The wall tissue is modeled as an eight-ply laminate whose constitutive behavior is described by an anisotropic hyperelastic strain-energy function containing four parameters. The parameters are assumed to vary continuously in the sac. Deformed configurations generated from forward finite element analysis are taken as input to inversely establish the parameter distributions. The delineated and the assigned distributions are in excellent agreement. A forward verification is conducted by comparing the displacement solutions obtained from the delineated and the assigned material parameters at a different pressure. The deviations in nodal displacements are found to be within 0.2% in most parts of the sac. The study highlights some distinct features of the proposed method, and demonstrates the feasibility of organ level identification of the distributive anisotropic nonlinear properties in cerebral aneurysms. PMID:20490886

  14. Approaches to identifying synthetic lethal interactions in cancer.

    PubMed

    Thompson, Jordan M; Nguyen, Quy H; Singh, Manpreet; Razorenova, Olga V

    2015-06-01

    Targeting synthetic lethal interactions is a promising new therapeutic approach to exploit specific changes that occur within cancer cells. Multiple approaches to investigate these interactions have been developed and successfully implemented, including chemical, siRNA, shRNA, and CRISPR library screens. Genome-wide computational approaches, such as DAISY, also have been successful in predicting synthetic lethal interactions from both cancer cell lines and patient samples. Each approach has its advantages and disadvantages that need to be considered depending on the cancer type and its molecular alterations. This review discusses these approaches and examines case studies that highlight their use. PMID:26029013

  15. Approaches to Identifying Synthetic Lethal Interactions in Cancer

    PubMed Central

    Thompson, Jordan M.; Nguyen, Quy H.; Singh, Manpreet; Razorenova, Olga V.

    2015-01-01

    Targeting synthetic lethal interactions is a promising new therapeutic approach to exploit specific changes that occur within cancer cells. Multiple approaches to investigate these interactions have been developed and successfully implemented, including chemical, siRNA, shRNA, and CRISPR library screens. Genome-wide computational approaches, such as DAISY, also have been successful in predicting synthetic lethal interactions from both cancer cell lines and patient samples. Each approach has its advantages and disadvantages that need to be considered depending on the cancer type and its molecular alterations. This review discusses these approaches and examines case studies that highlight their use. PMID:26029013

  16. A Hybrid Data Mining Approach for Credit Card Usage Behavior Analysis

    NASA Astrophysics Data System (ADS)

    Tsai, Chieh-Yuan

    Credit cards are one of the most popular e-payment methods in current online e-commerce. To retain valuable customers, card issuers invest a lot of money in maintaining good relationships with them. Although several efforts have been made to study card usage motivation, few studies emphasize credit card usage behavior analysis when time periods change from t to t+1. To address this issue, an integrated data mining approach is proposed in this paper. First, the customer profiles and their transaction data at time period t are retrieved from databases. Second, a LabelSOM neural network groups customers into segments and identifies critical characteristics for each group. Third, a fuzzy decision tree algorithm is used to construct usage behavior rules of interesting customer groups. Finally, these rules are used to analyze the behavior changes between time periods t and t+1. An implementation case using a practical credit card database provided by a commercial bank in Taiwan is illustrated to show the benefits of the proposed framework.

  17. A Hybrid Approach for Efficient Modeling of Medium-Frequency Propagation in Coal Mines

    PubMed Central

    Brocker, Donovan E.; Sieber, Peter E.; Waynert, Joseph A.; Li, Jingcheng; Werner, Pingjuan L.; Werner, Douglas H.

    2015-01-01

    An efficient procedure for modeling medium frequency (MF) communications in coal mines is introduced. In particular, a hybrid approach is formulated and demonstrated utilizing ideal transmission line equations to model MF propagation in combination with full-wave sections used for accurate simulation of local antenna-line coupling and other near-field effects. This work confirms that the hybrid method accurately models signal propagation from a source to a load for various system geometries and material compositions, while significantly reducing computation time. With such dramatic improvement to solution times, it becomes feasible to perform large-scale optimizations with the primary motivation of improving communications in coal mines both for daily operations and emergency response. Furthermore, it is demonstrated that the hybrid approach is suitable for modeling and optimizing large communication networks in coal mines that may otherwise be intractable to simulate using traditional full-wave techniques such as moment methods or finite-element analysis. PMID:26478686

  18. Cluster Analysis-Based Approaches for Geospatiotemporal Data Mining of Massive Data Sets for Identification of Forest Threats

    SciTech Connect

    Mills, Richard T; Hoffman, Forrest M; Kumar, Jitendra; Hargrove Jr., William Walter

    2011-01-01

    We investigate methods for geospatiotemporal data mining of multi-year land surface phenology data (250 m² Normalized Difference Vegetation Index (NDVI) values derived from the Moderate Resolution Imaging Spectrometer (MODIS) in this study) for the conterminous United States (CONUS) as part of an early warning system for detecting threats to forest ecosystems. The approaches explored here are based on k-means cluster analysis of this massive data set, which provides a basis for defining the bounds of the expected or normal phenological patterns that indicate healthy vegetation at a given geographic location. We briefly describe the computational approaches we have used to make cluster analysis of such massive data sets feasible, describe approaches we have explored for distinguishing between normal and abnormal phenology, and present some examples in which we have applied these approaches to identify various forest disturbances in the CONUS.
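
    A toy version of the cluster-then-flag idea can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: each observation is a hypothetical three-value NDVI trajectory (spring, summer, autumn), clustering is plain Lloyd-style k-means with hand-picked initial centroids, and an observation is called abnormal when its distance to the nearest centroid exceeds an arbitrary threshold. All data values, names and the threshold are invented for the example.

```python
import math


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    index = min(range(len(centroids)), key=distances.__getitem__)
    return index, distances[index]


def k_means(points, initial_centroids, iterations=20):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)[0]].append(p)
        # New centroid = mean of its group; keep the old one if the group is empty.
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Hypothetical historical NDVI trajectories for two vegetation types.
deciduous = [(0.30, 0.80, 0.50), (0.32, 0.82, 0.48), (0.28, 0.78, 0.52)]
evergreen = [(0.60, 0.70, 0.65), (0.62, 0.72, 0.63), (0.58, 0.68, 0.67)]
history = deciduous + evergreen
centroids = k_means(history, [history[0], history[3]])

THRESHOLD = 0.1  # hypothetical bound on "normal" distance to a centroid


def is_abnormal(observation):
    """Flag an observation that lies far from every learned phenology pattern."""
    return nearest(observation, centroids)[1] > THRESHOLD


print(is_abnormal((0.30, 0.45, 0.40)))  # depressed summer greenness
print(is_abnormal((0.31, 0.79, 0.51)))  # close to the deciduous pattern
```

    At continental scale the clustering step itself is the bottleneck, which is why the abstract stresses the computational approaches needed to make it feasible.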

  19. Meta-control of combustion performance with a data mining approach

    NASA Astrophysics Data System (ADS)

    Song, Zhe

    Large-scale combustion processes are complex and pose challenges for performance optimization. Traditional approaches based on thermodynamics have limitations in finding optimal operational regions due to the time-shift nature of the process. Recent advances in information technology enable people to collect large volumes of process data easily and continuously. The collected process data contain rich information about the process and, to some extent, represent a digital copy of the process over time. Although large volumes of data exist in industrial combustion processes, they are not fully utilized to the level where the process can be optimized. Data mining is an emerging science which finds patterns or models from large data sets. It has found many successful applications in business marketing, medical and manufacturing domains. The focus of this dissertation is on applying data mining to industrial combustion processes, and ultimately optimizing the combustion performance. However, the philosophy, methods and frameworks discussed in this research can also be applied to other industrial processes. Optimizing an industrial combustion process has two major challenges. One is that the underlying process model changes over time and obtaining an accurate process model is nontrivial. The other is that a process model with high fidelity is usually highly nonlinear, so solving the optimization problem needs efficient heuristics. This dissertation sets out to solve these two major challenges. The major contribution of this 4-year research is the data-driven solution to optimize the combustion process, where the process model or knowledge is identified based on the process data, then optimization is executed by evolutionary algorithms to search for optimal operating regions.

  20. North American Bats and Mines Project: A cooperative approach for integrating bat conservation and mine-land reclamation

    SciTech Connect

    Ducummon, S.L.

    1997-12-31

    Inactive underground mines now provide essential habitat for more than half of North America's 44 bat species, including some of the largest remaining populations. Thousands of abandoned mines have already been closed or are slated for safety closures, and many are destroyed during renewed mining in historic districts. The available evidence suggests that millions of bats have already been lost due to these closures. Bats are primary predators of night-flying insects that cost American farmers and foresters billions of dollars annually; therefore, threats to bat survival are cause for serious concern. Fortunately, mine closure methods exist that protect both bats and humans. Bat Conservation International (BCI) and the USDI-Bureau of Land Management founded the North American Bats and Mines Project to provide national leadership and coordination to minimize the loss of mine-roosting bats. This partnership has involved federal and state mine-land and wildlife managers and the mining industry. BCI has trained hundreds of mine-land and wildlife managers nationwide in mine assessment techniques for bats and bat-compatible closure methods, published technical information on bats and mine-land management, presented papers on bats and mines at national mining and wildlife conferences, and collaborated with numerous federal, state, and private partners to protect some of the most important mine-roosting bat populations. Our new mining industry initiative, Mining for Habitat, is designed to develop bat habitat conservation and enhancement plans for active mining operations. It includes the creation of cost-effective artificial underground bat roosts using surplus mining materials such as old mine-truck tires and culverts buried beneath waste rock.

  1. Quantitative risk-based approach for improving water quality management in mining.

    PubMed

    Liu, Wenying; Moran, Chris J; Vink, Sue

    2011-09-01

    The potential environmental threats posed by freshwater withdrawal and mine water discharge are some of the main drivers for the mining industry to improve water management. The use of multiple sources of water supply and introducing water reuse into the mine site water system have been part of the operating philosophies employed by the mining industry to realize these improvements. However, a barrier to implementation of such good water management practices is concomitant water quality variation and the resulting impacts on the efficiency of mineral separation processes, and an increased environmental consequence of noncompliant discharge events. There is an increasing appreciation that conservative water management practices, production efficiency, and environmental consequences are intimately linked through the site water system. It is therefore essential to consider water management decisions and their impacts as an integrated system as opposed to dealing with each impact separately. This paper proposes an approach that could assist mine sites to manage water quality issues in a systematic manner at the system level. This approach can quantitatively forecast the risk related with water quality and evaluate the effectiveness of management strategies in mitigating the risk by quantifying implications for production and hence economic viability. PMID:21797262

  2. Data mining approach to web application intrusions detection

    NASA Astrophysics Data System (ADS)

    Kalicki, Arkadiusz

    2011-10-01

    Web applications have become the most popular medium on the Internet. Their popularity, the ease of use of web application scripting languages and frameworks, and careless development result in a high number of web application vulnerabilities and attacks. Several types of attack are possible because of improper input validation: SQL injection, cross-site scripting, Cross-Site Request Forgery (CSRF), web spam in blogs, and others. To secure web applications, intrusion detection systems (IDS) and intrusion prevention systems (IPS) are used. Intrusion detection systems fall into two groups: misuse detection (traditional IDS) and anomaly detection. This paper presents a data mining based algorithm for anomaly detection. The principle of this method is the comparison of incoming HTTP traffic with a previously built profile that contains a representation of the "normal" or expected web application usage sequence patterns. The frequent sequence patterns are found with the GSP algorithm. A previously presented detection method was rewritten and improved. Tests show that the software catches malicious requests, especially long attack sequences; results are quite good for medium-length sequences, while for short sequences the method must be complemented with other methods.
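
    The profile-comparison principle can be illustrated with a much simpler stand-in for GSP. The sketch below does not implement GSP: it assumes that a "pattern" is just a pair of consecutive request paths, keeps the pairs seen in at least two training sessions as the normal profile, and scores a new session by the fraction of its pairs that are missing from that profile. The session data, support threshold and function names are hypothetical.

```python
from collections import Counter


def ngrams(session, n):
    """Return all runs of n consecutive requests in a session."""
    return [tuple(session[i:i + n]) for i in range(len(session) - n + 1)]


def build_profile(sessions, n=2, min_support=2):
    """Keep n-grams that occur in at least min_support training sessions."""
    support = Counter()
    for session in sessions:
        support.update(set(ngrams(session, n)))
    return {gram for gram, count in support.items() if count >= min_support}


def anomaly_score(session, profile, n=2):
    """Fraction of the session's n-grams that are absent from the profile."""
    grams = ngrams(session, n)
    if not grams:
        return 0.0  # too short to judge with this method
    return sum(g not in profile for g in grams) / len(grams)


# Hypothetical "normal" traffic used to build the profile.
normal_sessions = [
    ["/login", "/home", "/cart", "/checkout"],
    ["/login", "/home", "/search", "/cart", "/checkout"],
    ["/login", "/home", "/cart", "/checkout"],
]
profile = build_profile(normal_sessions)

usual = anomaly_score(["/login", "/home", "/cart", "/checkout"], profile)
odd = anomaly_score(["/admin", "/config", "/admin", "/checkout"], profile)
single = anomaly_score(["/admin"], profile)
print(usual, odd, single)
```

    The single-request session receives a score of zero because it contains no pairs at all, which mirrors the abstract's remark that short sequences need to be covered by other methods.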

  3. Ultrabroadband photonic Internet: data mining approach to security aspects

    NASA Astrophysics Data System (ADS)

    Kalicki, Arkadiusz

    2009-06-01

    Web applications have become the most popular medium on the Internet. Their popularity, the ease of use of web application frameworks, and careless development result in a high number of vulnerabilities and attacks. Several types of attack are possible because of improper input validation. SQL injection is the ability to execute arbitrary SQL queries in a database through an existing application. Cross-site scripting is a vulnerability that allows malicious web users to inject code into the web pages viewed by other users. Cross-Site Request Forgery (CSRF) is an attack that tricks the victim into loading a page that contains a malicious request. Web spam in blogs is a further example. To secure web applications, intrusion detection systems (IDS) and intrusion prevention systems (IPS) are used. Intrusion detection systems fall into two groups: misuse detection (traditional IDS) and anomaly detection. Misuse detection systems are signature based and have high accuracy in detecting many kinds of known attacks, but cannot detect unknown and emerging attacks. They can be complemented with anomaly based intrusion detection and prevention systems. This paper presents an anomaly driven proxy as an IPS and the data mining based algorithm that was used to detect anomalies. The principle of this method is the comparison of incoming HTTP traffic with a previously built profile that contains a representation of the "normal" or expected web application usage sequence patterns. The frequent sequence patterns are found with the GSP algorithm. Some basic tests show that the software catches malicious requests.

  4. EVALUATION OF A TWO-STAGE PASSIVE TREATMENT APPROACH FOR MINING INFLUENCE WATERS

    EPA Science Inventory

    A two-stage passive treatment approach was assessed at bench-scale using two Colorado Mining Influenced Waters (MIWs). The first-stage was a limestone drain with the purpose of removing iron and aluminum and mitigating the potential effects of mineral acidity. The second stage w...

  5. DNA enrichment approaches to identify unauthorized genetically modified organisms (GMOs).

    PubMed

    Arulandhu, Alfred J; van Dijk, Jeroen P; Dobnik, David; Holst-Jensen, Arne; Shi, Jianxin; Zel, Jana; Kok, Esther J

    2016-07-01

    With the increased global production of different genetically modified (GM) plant varieties, chances increase that unauthorized GM organisms (UGMOs) may enter the food chain. At the same time, the detection of UGMOs is a challenging task because of the limited sequence information that will generally be available. PCR-based methods are available to detect and quantify known UGMOs in specific cases. If this approach is not feasible, DNA enrichment of the unknown adjacent sequences of known GMO elements is one way to detect the presence of UGMOs in a food or feed product. These enrichment approaches are also known as chromosome walking or gene walking (GW). In recent years, enrichment approaches have been coupled with next generation sequencing (NGS) analysis and implemented in, amongst others, the medical and microbiological fields. The present review will provide an overview of these approaches and an evaluation of their applicability in the identification of UGMOs in complex food or feed samples. PMID:27086015

  6. Abandoned mined land reclamation on the Wayne National Forest - an interdisciplinary approach

    SciTech Connect

    Moss, R.G.

    1982-12-01

    The Wayne National Forest contains several thousand acres of abandoned surface-mined lands, many of which are in need of reclamation. The Forest Service has developed a systematic interdisciplinary approach to planning and implementing reclamation projects. An environmental assessment report is prepared before the project is designed which provides decision makers the information needed to select a preferred reclamation alternative. A case study known as the Yost II Abandoned Mined Land Reclamation Project is presented. The abandoned mine, basically a double contour configuration, presented designers with a difficult mosaic of barren, toxic areas, well-revegetated areas, and acid ponds. The reclamation technique employed utilized burial of toxic soil, pond underdrains, crushed limestone filter strips, and topsoiling.

  7. Data Mining: A Systems Approach to Formative Assessment

    ERIC Educational Resources Information Center

    Schmid, Dale

    2012-01-01

    This article describes how using raw data and information from reliable assessments can inform teachers' decisions leading to improved instruction. The primary aim is to use a systems approach to provide evidence of what students know and how they demonstrate mastery. Such evidence can empower teachers to reach all students. The pedagogic…

  8. An Approach for Identifying Benefit Segments among Prospective College Students.

    ERIC Educational Resources Information Center

    Miller, Patrick; And Others

    1990-01-01

    A study investigated the importance to 578 applicants of various benefits offered by a moderately selective private university. Applicants rated the institution on 43 academic, social, financial, religious, and curricular attributes. The objective was to test the efficacy of one approach to college market segmentation. Results support the utility…

  9. Identifying Predictors of Physics Item Difficulty: A Linear Regression Approach

    ERIC Educational Resources Information Center

    Mesic, Vanes; Muratovic, Hasnija

    2011-01-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinary…

  10. Identifying the "Truly Disadvantaged": A Comprehensive Biosocial Approach

    ERIC Educational Resources Information Center

    Barnes, J. C.; Beaver, Kevin M.; Connolly, Eric J.; Schwartz, Joseph A.

    2016-01-01

    There has been significant interest in examining the developmental factors that predispose individuals to chronic criminal offending. This body of research has identified some social-environmental risk factors as potentially important. At the same time, the research producing these results has generally failed to employ genetically sensitive…

  11. Genomic approaches to identifying transcriptional regulators of osteoblast differentiation

    NASA Technical Reports Server (NTRS)

    Stains, Joseph P.; Civitelli, Roberto

    2003-01-01

    Recent microarray studies of mouse and human osteoblast differentiation in vitro have identified novel transcription factors that may be important in the establishment and maintenance of differentiation. These findings help unravel the pattern of gene-expression changes that underlie the complex process of bone formation.

  12. Novel LanT Associated Lantibiotic Clusters Identified by Genome Database Mining

    PubMed Central

    Singh, Mangal; Sareen, Dipti

    2014-01-01

    Background: Frequent use of antibiotics has led to the emergence of antibiotic resistance in bacteria. Lantibiotic compounds are ribosomally synthesized antimicrobial peptides against which bacteria are not able to develop resistance, making them a good alternative to antibiotics. Nisin is the oldest and most widely used lantibiotic in food preservation, and no significant resistance to it has developed. Given their antimicrobial potential and their limited number, there is a need to identify novel lantibiotics. Methodology/Findings: Identification of novel lantibiotic biosynthetic clusters from an ever increasing database of bacterial genomes can provide a major lead in this direction. To achieve this, a strategy was adopted to identify novel lantibiotic biosynthetic clusters by screening the sequenced genomes for homologs of LanT, a conserved lantibiotic transporter specific to type IB clusters. This strategy resulted in the identification of 54 bacterial strains containing LanT homologs that are not known lantibiotic producers. Of these, 24 strains were subjected to a detailed bioinformatic analysis to identify genes encoding precursor peptides, modification enzymes, immunity and quorum sensing proteins. Eight clusters having two LanM determinants, similar to haloduracin and lichenicidin, were identified, along with 13 clusters having a single LanM determinant as in the mersacidin biosynthetic cluster. Besides these, orphan LanT homologs were also identified, which might be associated with novel bacteriocins encoded elsewhere in the genome. Three identified gene clusters had a C39 domain containing LanT transporter associated with the LanBC proteins and double glycine type precursor peptides; the only known example of such a cluster is that of salivaricin. Conclusion: This study led to the identification of 8 novel putative two-component lantibiotic clusters along with 13 having a single LanM and 3 with LanBC genes

  13. A network approach for identifying and delimiting biogeographical regions.

    PubMed

    Vilhena, Daril A; Antonelli, Alexandre

    2015-01-01

    Biogeographical regions (geographically distinct assemblages of species and communities) constitute a cornerstone for ecology, biogeography, evolution and conservation biology. Species turnover measures are often used to quantify spatial biodiversity patterns, but algorithms based on similarity can be sensitive to common sampling biases in species distribution data. Here we apply a community detection approach from network theory that incorporates complex, higher-order presence-absence patterns. We demonstrate the performance of the method by applying it to all amphibian species in the world (c. 6,100 species), all vascular plant species of the USA (c. 17,600) and a hypothetical data set containing a zone of biotic transition. In comparison with current methods, our approach tackles the challenges posed by transition zones and succeeds in retrieving a larger number of commonly recognized biogeographical regions. This method can be applied to generate objective, data-derived identification and delimitation of the world's biogeographical regions. PMID:25907961

  14. New Seasonal Shift in In-Stream Diurnal Nitrate Cycles Identified by Mining High-Frequency Data.

    PubMed

    Aubert, Alice H; Breuer, Lutz

    2016-01-01

    The recent development of in-situ monitoring devices, such as UV-spectrometers, makes the study of short-term stream chemistry variation relevant, especially the study of diurnal cycles, which are not yet fully understood. Our study is based on high-frequency data from an agricultural catchment (Studienlandschaft Schwingbachtal, Germany). We propose a novel approach, i.e. the combination of cluster analysis and Linear Discriminant Analysis, to mine nitrate behavior patterns from these data. As a result, we observe a seasonality of nitrate diurnal cycles that differs from the most common cycle seasonality described in the literature, i.e. pre-dawn peaks in spring. Our cycles appear in summer and the maximum and minimum shift to a later time in late summer/autumn. This is observed both for water- and energy-limited years, thus potentially stressing the role of evapotranspiration. This concluding hypothesis on the role of evapotranspiration on nitrate stream concentration, which was obtained through data mining, broadens the perspective on the diurnal cycling of stream nitrate concentrations. PMID:27073838

  15. New Seasonal Shift in In-Stream Diurnal Nitrate Cycles Identified by Mining High-Frequency Data

    PubMed Central

    2016-01-01

    The recent development of in-situ monitoring devices, such as UV-spectrometers, makes the study of short-term stream chemistry variation relevant, especially the study of diurnal cycles, which are not yet fully understood. Our study is based on high-frequency data from an agricultural catchment (Studienlandschaft Schwingbachtal, Germany). We propose a novel approach, i.e. the combination of cluster analysis and Linear Discriminant Analysis, to mine nitrate behavior patterns from these data. As a result, we observe a seasonality of nitrate diurnal cycles that differs from the most common cycle seasonality described in the literature, i.e. pre-dawn peaks in spring. Our cycles appear in summer and the maximum and minimum shift to a later time in late summer/autumn. This is observed both for water- and energy-limited years, thus potentially stressing the role of evapotranspiration. This concluding hypothesis on the role of evapotranspiration on nitrate stream concentration, which was obtained through data mining, broadens the perspective on the diurnal cycling of stream nitrate concentrations. PMID:27073838

  16. Multidisciplinary approach to identify aquifer-peatland connectivity

    NASA Astrophysics Data System (ADS)

    Larocque, Marie; Pellerin, Stéphanie; Cloutier, Vincent; Ferlatte, Miryane; Munger, Julie; Quillet, Anne; Paniconi, Claudio

    2015-04-01

    In southern Quebec (Canada), wetlands sustain increasing pressures from agriculture, urban development, and peat exploitation. To protect both groundwater and ecosystems, it is important to be able to identify how, where, and to what extent shallow aquifers and wetlands are connected. This study focuses on peatlands which are especially abundant in Quebec. The objective of this research was to better understand aquifer-peatland connectivity and to identify easily measured indicators of this connectivity. Geomorphology, hydrogeochemistry, and vegetation were selected as key indicators of connectivity. Twelve peatland transects were instrumented and monitored in the Abitibi (slope peatlands associated with eskers) and Centre-du-Quebec (depression peatlands) regions of Quebec (Canada). Geomorphology, geology, water levels, water chemistry, and vegetation species were identified/measured on all transects. Flow conditions were simulated numerically on two typical transects. Results show that a majority of peatland transects receives groundwater from a shallow aquifer. In slope peatlands, groundwater flows through the organic deposits towards the peatland center. In depression peatlands, groundwater flows only 100-200 m within the peatland before being redirected through surface routes towards the outlet. Flow modeling and sensitivity analysis have identified that the thickness and hydraulic conductivity of permeable deposits close to the peatland and beneath the organic deposits influence flow directions within the peatland. Geochemical data have confirmed the usefulness of total dissolved solids (TDS) exceeding 14 mg/L as an indicator of the presence of groundwater within the peatland. Vegetation surveys have allowed the identification of species and groups of species that occur mostly when groundwater is present, for instance Carex limosa and Sphagnum russowii. Geomorphological conditions (slope or depression peatland), TDS, and vegetation can be measured

  17. Computational approaches to identify functional genetic variants in cancer genomes

    PubMed Central

    Gonzalez-Perez, Abel; Mustonen, Ville; Reva, Boris; Ritchie, Graham R.S.; Creixell, Pau; Karchin, Rachel; Vazquez, Miguel; Fink, J. Lynn; Kassahn, Karin S.; Pearson, John V.; Bader, Gary; Boutros, Paul C.; Muthuswamy, Lakshmi; Ouellette, B.F. Francis; Reimand, Jüri; Linding, Rune; Shibata, Tatsuhiro; Valencia, Alfonso; Butler, Adam; Dronov, Serge; Flicek, Paul; Shannon, Nick B.; Carter, Hannah; Ding, Li; Sander, Chris; Stuart, Josh M.; Stein, Lincoln D.; Lopez-Bigas, Nuria

    2014-01-01

    The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor, but only a minority drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype. PMID:23900255

  18. A data mining based approach to predict spatiotemporal changes in satellite images

    NASA Astrophysics Data System (ADS)

    Boulila, W.; Farah, I. R.; Ettabaa, K. Saheb; Solaiman, B.; Ghézala, H. Ben

    2011-06-01

    The interpretation of remotely sensed images in a spatiotemporal context is becoming a valuable research topic. However, the constant growth of data volume in remote sensing imaging makes reaching conclusions based on collected data a challenging task. Recently, data mining has emerged as a promising research field, leading to several interesting discoveries in areas such as marketing, surveillance, fraud detection and scientific discovery. By integrating data mining and image interpretation techniques, accurate and relevant information (i.e. the functional relation between observed parcels and a set of informational contents) can be automatically elicited. This study presents a new approach to predict spatiotemporal changes in satellite image databases. The proposed method exploits fuzzy sets and data mining concepts to build predictions and decisions for several remote sensing fields. It takes into account imperfections related to the spatiotemporal mining process in order to provide more accurate and reliable information about land cover changes in satellite images. The proposed approach is validated using SPOT images representing the Saint-Denis region, capital of Reunion Island. Results show good performance of the proposed framework in predicting change for the urban zone.

  19. Mining a Written Values Affirmation Intervention to Identify the Unique Linguistic Features of Stigmatized Groups

    ERIC Educational Resources Information Center

    Riddle, Travis; Bhagavatula, Sowmya Sree; Guo, Weiwei; Muresan, Smaranda; Cohen, Geoff; Cook, Jonathan E.; Purdie-Vaughns, Valerie

    2015-01-01

    Social identity threat refers to the process through which an individual underperforms in some domain due to their concern with confirming a negative stereotype held about their group. Psychological research has identified this as one contributor to the underperformance and underrepresentation of women, Blacks, and Latinos in STEM fields. Over the…

  20. Identifying the Factors Affecting Science and Mathematics Achievement Using Data Mining Methods

    ERIC Educational Resources Information Center

    Kiray, S. Ahmet; Gok, Bilge; Bozkir, A. Selman

    2015-01-01

    The purpose of this article is to identify the order of significance of the variables that affect science and mathematics achievement in middle school students. For this aim, the study deals with the relationship between science and math in terms of different angles using the perspectives of multiple causes-single effect and of multiple…

  1. Using Data Mining to Identify Actionable Information: Breaking New Ground in Data-Driven Decision Making

    ERIC Educational Resources Information Center

    Streifer, Philip A.; Schumann, Jeffrey A.

    2005-01-01

    The implementation of No Child Left Behind (NCLB) presents important challenges for schools across the nation to identify problems that lead to poor performance. Yet schools must intervene with instructional programs that can make a difference and evaluate the effectiveness of such programs. New advances in artificial intelligence (AI) data-mining…

  2. Timely approaches to identify probiotic species of the genus Lactobacillus

    PubMed Central

    2013-01-01

    Over the past decades the use of probiotics in food has increased largely due to the manufacturer’s interest in placing “healthy” food on the market based on the consumer’s ambitions to live healthy. Due to this trend, health benefits of products containing probiotic strains such as lactobacilli are promoted and probiotic strains have been established in many different products with their numbers increasing steadily. Probiotics are used as starter cultures in dairy products such as cheese or yoghurts and in addition they are also utilized in non-dairy products such as fermented vegetables, fermented meat and pharmaceuticals, thereby, covering a large variety of products. To assure quality management, several pheno-, physico- and genotyping methods have been established to unambiguously identify probiotic lactobacilli. These methods are often specific enough to identify the probiotic strains at genus and species levels. However, the probiotic ability is often strain dependent and it is impossible to distinguish strains by basic microbiological methods. Therefore, this review aims to critically summarize and evaluate conventional identification methods for the genus Lactobacillus, complemented by techniques that are currently being developed. PMID:24063519

  3. Proteomic Approach to Identify Nuclear Proteins in Wheat Grain.

    PubMed

    Bancel, Emmanuelle; Bonnot, Titouan; Davanture, Marlène; Branlard, Gérard; Zivy, Michel; Martre, Pierre

    2015-10-01

    The nuclear proteome of the grain of the two cultivated wheat species Triticum aestivum (hexaploid wheat; genomes A, B, and D) and T. monococcum (diploid wheat; genome A) was analyzed in two early stages of development using shotgun-based proteomics. A procedure was optimized to purify nuclei, and an improved protein sample preparation was developed to efficiently remove nonprotein substances (starch and nucleic acids). A total of 797 proteins corresponding to 528 unique proteins were identified, 36% of which were classified in functional groups related to DNA and RNA metabolism. A large number (107 proteins) of unknown functions and hypothetical proteins were also found. Some identified proteins may be multifunctional and may present multiple localizations. On the basis of the MS/MS analysis, 368 proteins were present in the two species and in two stages of development; some qualitative differences between species and stages of development were also found. All of these data illustrate the dynamic function of the grain nucleus in the early stages of development. PMID:26228564

  4. PedMine – A simulated annealing algorithm to identify maximally unrelated individuals in population isolates

    PubMed Central

    Douglas, Julie A.; Sandefur, Conner I.

    2010-01-01

    Summary: In family-based genetic studies, it is often useful to identify a subset of unrelated individuals. When such studies are conducted in population isolates, however, most if not all individuals are often detectably related to each other. To identify a set of maximally unrelated (or equivalently, minimally related) individuals, we have implemented simulated annealing, a general-purpose algorithm for solving difficult combinatorial optimization problems. We illustrate our method on data from a genetic study in the Old Order Amish of Lancaster County, Pennsylvania, a population isolate derived from a modest number of founders. Given one or more pedigrees, our program automatically and rapidly extracts a fixed number of maximally unrelated individuals. PMID:18321883
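
The selection problem described in this abstract can be illustrated with a minimal simulated annealing sketch. This is not the PedMine program itself. It assumes a precomputed symmetric pairwise kinship matrix, takes the sum of pairwise kinship within the chosen subset as the cost, and uses a geometric cooling schedule. The function name `select_unrelated` and all parameter values are hypothetical.

```python
import math
import random


def select_unrelated(kinship, k, steps=20000, t_start=1.0, t_end=1e-3, seed=0):
    """Pick k individuals with low total pairwise kinship (illustrative only).

    kinship is assumed to be a symmetric n x n matrix of pairwise kinship
    coefficients; the diagonal is never read.
    """
    rng = random.Random(seed)
    n = len(kinship)
    chosen = rng.sample(range(n), k)
    outside = [i for i in range(n) if i not in chosen]

    def cost(subset):
        # Sum of kinship over all unordered pairs in the subset.
        return sum(kinship[a][b] for i, a in enumerate(subset) for b in subset[i + 1:])

    current = cost(chosen)
    best, best_cost = list(chosen), current
    for step in range(steps):
        if not outside:
            break  # k == n, nothing to swap
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        i = rng.randrange(k)
        j = rng.randrange(len(outside))
        old, new = chosen[i], outside[j]
        # Cost change if `old` is swapped out for `new`.
        delta = sum(kinship[new][m] - kinship[old][m] for m in chosen if m != old)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            chosen[i], outside[j] = new, old
            current += delta
            if current < best_cost:
                best, best_cost = list(chosen), current
    return sorted(best), best_cost


# Toy data: two sibships of three (kinship 0.25 within each) plus one unrelated person.
toy = [[0.0] * 7 for _ in range(7)]
for family in ([0, 1, 2], [3, 4, 5]):
    for a in family:
        for b in family:
            if a != b:
                toy[a][b] = 0.25
selected, total = select_unrelated(toy, k=3)
```

On the toy data, any subset with zero total kinship must take one person from each sibship plus the unrelated individual. Real pedigrees would need kinship coefficients computed from the pedigree structure, which this sketch does not attempt.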

  5. A novel meta-analytic approach: Mining frequent co-activation patterns in neuroimaging databases

    PubMed Central

    Caspers, Julian; Zilles, Karl; Beierle, Christoph; Rottschy, Claudia; Eickhoff, Simon B.

    2016-01-01

    In recent years, coordinate-based meta-analyses have become a powerful and widely used tool to study coactivity across neuroimaging experiments, a development that was supported by the emergence of large-scale neuroimaging databases like BrainMap. However, the evaluation of co-activation patterns is constrained by the fact that previous coordinate-based meta-analysis techniques like Activation Likelihood Estimation (ALE) and Multilevel Kernel Density Analysis (MKDA) reveal all brain regions that show convergent activity within a dataset without taking into account actual within-experiment co-occurrence patterns. To overcome this issue we here propose a novel meta-analytic approach named PaMiNI that utilizes a combination of two well-established data-mining techniques, Gaussian mixture modeling and the Apriori algorithm. By this, PaMiNI enables a data-driven detection of frequent co-activation patterns within neuroimaging datasets. The feasibility of the method is demonstrated by means of several analyses on simulated data as well as a real application. The analyses of the simulated data show that PaMiNI identifies the brain regions underlying the simulated activation foci and perfectly separates the co-activation patterns of the experiments in the simulations. Furthermore, PaMiNI still yields good results when activation foci of distinct brain regions become closer together or if they are non-Gaussian distributed. For the further evaluation, a real dataset on working memory experiments is used, which was previously examined in an ALE meta-analysis and hence allows a cross-validation of both methods. In this latter analysis, PaMiNI revealed a fronto-parietal “core” network of working memory and furthermore indicates a left-lateralization in this network. Finally, to encourage a widespread usage of this new method, the PaMiNI approach was implemented into a publicly available software system. PMID:24365675
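
The Apriori step named in this abstract can be sketched as follows. This is a generic frequent-itemset miner, not the PaMiNI implementation. It skips the Gaussian mixture modeling stage and assumes each experiment has already been reduced to a set of region labels. The region names and the support threshold are made up for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)

    def support(items):
        # Fraction of transactions that contain every item in `items`.
        return sum(items <= t for t in sets) / n

    level = [frozenset([i]) for i in sorted({i for t in sets for i in t})]
    frequent = {}
    size = 1
    while level:
        kept = {}
        for candidate in level:
            s = support(candidate)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(sub) in kept for sub in combinations(c, size - 1))
        ]
    return frequent


# Hypothetical experiments, each listing the regions it reported as active.
experiments = [
    {"dlPFC", "IPS", "preSMA"},
    {"dlPFC", "IPS"},
    {"dlPFC", "IPS", "insula"},
    {"V1"},
]
patterns = apriori(experiments, min_support=0.5)
```

Here only `dlPFC`, `IPS`, and the pair `{dlPFC, IPS}` reach the 0.5 support threshold, so the pair is the one co-activation pattern reported.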

  6. Identifying Pathogenicity Islands in Bacterial Pathogenomics Using Computational Approaches

    PubMed Central

    Che, Dongsheng; Hasan, Mohammad Shabbir; Chen, Bernard

    2014-01-01

    High-throughput sequencing technologies have made it possible to study bacteria through analyzing their genome sequences. For instance, comparative genome sequence analyses can reveal phenomena such as gene loss, gene gain, or gene exchange in a genome. By analyzing pathogenic bacterial genomes, we can discover that pathogenic genomic regions in many pathogenic bacteria are horizontally transferred from other bacteria; these regions are known as pathogenicity islands (PAIs). PAIs have some detectable properties, such as having different genomic signatures than the rest of the host genome, and containing mobility genes so that they can be integrated into the host genome. In this review, we will discuss various pathogenicity island-associated features and current computational approaches for the identification of PAIs. Existing pathogenicity island databases and related computational resources will also be discussed, so that researchers may find them useful for studies of bacterial evolution and pathogenicity mechanisms. PMID:25437607

  7. Multimodal Approach to Identifying Malingered Posttraumatic Stress Disorder: A Review

    PubMed Central

    Jabeen, Shagufta; Alam, Farzana

    2015-01-01

    The primary aim of this article is to aid clinicians in differentiating true posttraumatic stress disorder from malingered posttraumatic stress disorder. Posttraumatic stress disorder and malingering are defined, and prevalence rates are explored. Similarities and differences in diagnostic criteria between the fourth and fifth editions of the Diagnostic and Statistical Manual of Mental Disorders are described for posttraumatic stress disorder. Possible motivations for malingering posttraumatic stress disorder are discussed, and common characteristics of malingered posttraumatic stress disorder are described. A multimodal approach is described for evaluating posttraumatic stress disorder, including interview techniques, collection of collateral data, and psychometric and physiologic testing, that should allow clinicians to distinguish between those patients who are truly suffering from posttraumatic disorder and those who are malingering the illness. PMID:25852974

  8. A new approach to estimate fugitive methane emissions from coal mining in China.

    PubMed

    Ju, Yiwen; Sun, Yue; Sa, Zhanyou; Pan, Jienan; Wang, Jilin; Hou, Quanlin; Li, Qingguang; Yan, Zhifeng; Liu, Jie

    2016-02-01

    Developing a more accurate greenhouse gas (GHG) emissions inventory draws much attention. Because of its resource endowment and technical status, China has made coal-related GHG emissions a big part of its inventory. Lacking a stoichiometric carbon conversion coefficient and influenced by geological conditions and mining technologies, previous efforts to estimate fugitive methane emissions from coal mining in China have led to disagreeing results. This paper proposes a new calculation methodology to determine fugitive methane emissions from coal mining based on the domestic analysis of gas geology, gas emission features, and the merits and demerits of existing estimation methods. This new approach involves four main parameters: in-situ original gas content, gas remaining post-desorption, raw coal production, and mining influence coefficient. The case studies in Huaibei-Huainan Coalfield and Jincheng Coalfield show that the new method obtains the smallest error, +9.59% and 7.01% respectively, compared with the other methods in this study, Tier 1 and Tier 2 (with two samples), which resulted in +140.34%, +138.90%, and -18.67% in Huaibei-Huainan Coalfield and +64.36%, +47.07%, and -14.91% in Jincheng Coalfield. Compared with the predominantly used methods, the new one not only is a comparatively simpler process with lower uncertainty than the "emission factor method" (IPCC recommended Tier 1 and Tier 2), but also offers easier data accessibility, similar uncertainty, and additional post-mining emissions compared to the "absolute gas emission method" (IPCC recommended Tier 3). Therefore, methane emissions dissipated from most of the producing coal mines worldwide could be more accurately and more easily estimated. PMID:26605831

  9. Identifying hosts of families of viruses: a machine learning approach.

    PubMed

    Raj, Anil; Dewar, Michael; Palacios, Gustavo; Rabadan, Raul; Wiggins, Christopher H

    2011-01-01

    Identifying emerging viral pathogens and characterizing their transmission is essential to developing effective public health measures in response to an epidemic. Phylogenetics, though currently the most popular tool used to characterize the likely host of a virus, can be ambiguous when studying species very distant to known species and when there is very little reliable sequence information available in the early stages of the outbreak of disease. Motivated by an existing framework for representing biological sequence information, we learn sparse, tree-structured models, built from decision rules based on subsequences, to predict viral hosts from protein sequence data using popular discriminative machine learning tools. Furthermore, the predictive motifs robustly selected by the learning algorithm are found to show strong host-specificity and occur in highly conserved regions of the viral proteome. PMID:22174744

  10. A new approach to identify, classify and count drug-related events

    PubMed Central

    Bürkle, Thomas; Müller, Fabian; Patapovas, Andrius; Sonst, Anja; Pfistermeister, Barbara; Plank-Kiegele, Bettina; Dormann, Harald; Maas, Renke

    2013-01-01

    Aims: The incidence of clinical events related to medication errors and/or adverse drug reactions reported in the literature varies by a degree that cannot solely be explained by the clinical setting, the varying scrutiny of investigators or varying definitions of drug-related events. Our hypothesis was that the individual complexity of many clinical cases may pose relevant limitations for current definitions and algorithms used to identify, classify and count adverse drug-related events. Methods: Based on clinical cases derived from an observational study we identified and classified common clinical problems that cannot be adequately characterized by the currently used definitions and algorithms. Results: It appears that some key models currently used to describe the relation of medication errors (MEs), adverse drug reactions (ADRs) and adverse drug events (ADEs) can easily be misinterpreted or contain logical inconsistencies that limit their accurate use to all but the simplest clinical cases. A key limitation of current models is the inability to deal with complex interactions such as one drug causing two clinically distinct side effects or multiple drugs contributing to a single clinical event. Using a large set of clinical cases we developed a revised model of the interdependence between MEs, ADEs and ADRs and extended current event definitions when multiple medications cause multiple types of problems. We propose algorithms that may help to improve the identification, classification and counting of drug-related events. Conclusions: The new model may help to overcome some of the limitations that complex clinical cases pose to current paper- or software-based drug therapy safety. PMID:24007453

  11. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.

  12. Enhanced Approaches for Identifying Amadori Products: Application to Peanut Allergens.

    PubMed

    Johnson, Katina L; Williams, Jason G; Maleki, Soheila J; Hurlburt, Barry K; London, Robert E; Mueller, Geoffrey A

    2016-02-17

    The dry roasting of peanuts is suggested to influence allergic sensitization as a result of the formation of advanced glycation end products (AGEs) on peanut proteins. Identifying AGEs is technically challenging. The AGEs of a peanut allergen were probed with nano-scale liquid chromatography-electrospray ionization-mass spectrometry (nanoLC-ESI-MS) and tandem mass spectrometry (MS/MS) analyses. Amadori product ions matched to expected peptides and yielded fragments that included a loss of three waters and HCHO. As a result of the paucity of b and y ions in the MS/MS spectrum, standard search algorithms do not perform well. Reactions with isotopically labeled sugars confirmed that the peptides contained Amadori products. An algorithm was developed on the basis of information content (Shannon entropy) and the loss of water and HCHO. Results with test data show that the algorithm finds the correct spectra with high precision, reducing the time needed to manually inspect data. Computational and technical improvements allowed for better identification of the chemical differences between modified and unmodified proteins. PMID:26811263

  13. Newer Approaches to Identify Potential Untoward Effects in Functional Foods.

    PubMed

    Marone, Palma Ann; Birkenbach, Victoria L; Hayes, A Wallace

    2016-01-01

    Globalization has greatly accelerated the numbers and variety of food and beverage products available worldwide. The exchange among greater numbers of countries, manufacturers, and products in the United States and worldwide has necessitated enhanced quality measures for nutritional products for larger populations increasingly reliant on functionality. These functional foods, those that provide benefit beyond basic nutrition, are increasingly being used for their potential to alleviate food insufficiency while enhancing quality and longevity of life. In the United States alone, a steady import increase of greater than 15% per year or 24 million shipments, over 70% products of which are food related, is regulated under the Food and Drug Administration (FDA). This unparalleled growth has resulted in the need for faster, cheaper, and better safety and efficacy screening methods in the form of harmonized guidelines and recommendations for product standardization. In an effort to meet this need, the in vitro toxicology testing market has similarly grown with an anticipatory 15% increase between 2010 and 2015 of US$1.3 to US$2.7 billion. Although traditionally occupying a small fraction of the market behind pharmaceuticals and cosmetic/household products, the scope of functional food testing, including additives/supplements, ingredients, residues, contact/processing, and contaminants, is potentially expansive. Similarly, as functional food testing has progressed, so has the need to identify potential adverse factors that threaten the safety and quality of these products. PMID:26657815

  14. Configurational approach to identifying the earliest hominin butchers.

    PubMed

    Domínguez-Rodrigo, Manuel; Pickering, Travis Rayne; Bunn, Henry T

    2010-12-01

    The announcement of two approximately 3.4-million-y-old purportedly butchered fossil bones from the Dikika paleoanthropological research area (Lower Awash Valley, Ethiopia) could profoundly alter our understanding of human evolution. Butchering damage on the Dikika bones would imply that tool-assisted meat-eating began approximately 800,000 y before previously thought, based on butchered bones from 2.6- to 2.5-million-y-old sites at the Ethiopian Gona and Bouri localities. Further, the only hominin currently known from Dikika at approximately 3.4 Ma is Australopithecus afarensis, a temporally and geographically widespread species unassociated previously with any archaeological evidence of butchering. Our taphonomic configurational approach to assess the claims of A. afarensis butchery at Dikika suggests the claims of unexpectedly early butchering at the site are not warranted. The Dikika research group focused its analysis on the morphology of the marks in question but failed to demonstrate, through recovery of similarly marked in situ fossils, the exact provenience of the published fossils, and failed to note occurrences of random striae on the cortices of the published fossils (incurred through incidental movement of the defleshed specimens across and/or within their abrasive encasing sediments). The occurrence of such random striae (sometimes called collectively "trampling" damage) on the two fossils provides the configurational context for rejection of the claimed butchery marks. The earliest best evidence for hominin butchery thus remains at 2.6 to 2.5 Ma, presumably associated with more derived species than A. afarensis. PMID:21078985

  15. A landscape ecology approach identifies important drivers of urban biodiversity.

    PubMed

    Turrini, Tabea; Knop, Eva

    2015-04-01

    Cities are growing rapidly worldwide, yet a mechanistic understanding of the impact of urbanization on biodiversity is lacking. We assessed the impact of urbanization on arthropod diversity (species richness and evenness) and abundance in a study of six cities and nearby intensively managed agricultural areas. Within the urban ecosystem, we disentangled the relative importance of two key landscape factors affecting biodiversity, namely the amount of vegetated area and patch isolation. To do so, we a priori selected sites that independently varied in the amount of vegetated area in the surrounding landscape at the 500-m scale and patch isolation at the 100-m scale, and we held local patch characteristics constant. As indicator groups, we used bugs, beetles, leafhoppers, and spiders. Compared to intensively managed agricultural ecosystems, urban ecosystems supported a higher abundance of most indicator groups, a higher number of bug species, and a lower evenness of bug and beetle species. Within cities, a high amount of vegetated area increased species richness and abundance of most arthropod groups, whereas evenness showed no clear pattern. Patch isolation played only a limited role in urban ecosystems, which contrasts with findings from agro-ecological studies. Our results show that urban areas can harbor arthropod diversity and abundance similar to those of intensively managed agricultural ecosystems. Further, negative consequences of urbanization on arthropod diversity can be mitigated by providing sufficient vegetated space in the urban area, while patch connectivity is less important in an urban context. This highlights the need for applying a landscape ecological approach to understand the mechanisms shaping urban biodiversity and underlines the potential of appropriate urban planning for mitigating biodiversity loss. PMID:25620599

  16. Genetic Approaches To Identifying Novel Osteoporosis Drug Targets.

    PubMed

    Brommage, Robert

    2015-10-01

    During the past two decades effective drugs for treating osteoporosis have been developed, including anti-resorptives inhibiting bone resorption (estrogens, the SERM raloxifene, four bisphosphonates, RANKL inhibitor denosumab) and the anabolic bone forming daily injectable peptide teriparatide. Two potential drugs (odanacatib and romosozumab) are in late stage clinical development. The most pressing unmet need is for orally active anabolic drugs. This review describes the basic biological studies involved in developing these drugs, including the animal models employed for osteoporosis drug development. The genomics revolution continues to identify potential novel osteoporosis drug targets. Studies include human GWAS studies and identification of mutant genes in subjects having abnormal bone mass, mouse QTL and gene knockouts, and gene expression studies. Multiple lines of evidence indicate that Wnt signaling plays a major role in regulating bone formation and continued study of this complex pathway is likely to lead to key discoveries. In addition to the classic Wnt signaling targets DKK1 and sclerostin, LRP4, LRP5/LRP6, SFRP4, WNT16, and NOTUM can potentially be targeted to modulate Wnt signaling. Next-generation whole genome and exome sequencing, RNA-sequencing and CRISPR/CAS9 gene editing are new experimental techniques contributing to understanding the genome. The International Knockout Mouse Consortium efforts to knockout and phenotype all mouse genes are poised to accelerate. Accumulating knowledge will focus attention on readily accessible databases (Big Data). Efforts are underway by the International Bone and Mineral Society to develop an annotated Skeletome database providing information on all genes directly influencing bone mass, architecture, mineralization or strength. PMID:25833316

  17. A Bayesian Approach to Identifying New Risk Factors for Dementia

    PubMed Central

    Wen, Yen-Hsia; Wu, Shihn-Sheng; Lin, Chun-Hung Richard; Tsai, Jui-Hsiu; Yang, Pinchen; Chang, Yang-Pei; Tseng, Kuan-Hua

    2016-01-01

    Dementia is one of the most disabling and burdensome health conditions worldwide. In this study, we identified new potential risk factors for dementia from nationwide longitudinal population-based data by using Bayesian statistics. We first tested the consistency of the results obtained using Bayesian statistics with those obtained using classical frequentist probability for 4 recognized risk factors for dementia, namely severe head injury, depression, diabetes mellitus, and vascular diseases. Then, we used Bayesian statistics to verify 2 new potential risk factors for dementia, namely hearing loss and senile cataract, determined from Taiwan's National Health Insurance Research Database. We included a total of 6546 (6.0%) patients diagnosed with dementia. We observed older age, female sex, and lower income as independent risk factors for dementia. Moreover, we verified the 4 recognized risk factors for dementia in the older Taiwanese population; their odds ratios (ORs) ranged from 3.469 to 1.207. Furthermore, we observed that hearing loss (OR = 1.577) and senile cataract (OR = 1.549) were associated with an increased risk of dementia. We found that the results obtained using Bayesian statistics for assessing risk factors for dementia, such as head injury, depression, DM, and vascular diseases, were consistent with those obtained using classical frequentist probability. Moreover, hearing loss and senile cataract were found to be potential risk factors for dementia in the older Taiwanese population. Bayesian statistics could help clinicians explore other potential risk factors for dementia and develop appropriate treatment strategies for these patients. PMID:27227925

  18. Identifying new targets in leukemogenesis using computational approaches

    PubMed Central

    Jayaraman, Archana; Jamil, Kaiser; Khan, Haseeb A.

    2015-01-01

    There is a need to identify novel targets in Acute Lymphoblastic Leukemia (ALL), a hematopoietic cancer affecting children, both to improve our understanding of disease biology and to support the development of new therapeutics. Hence, the aim of our study was to find new genes as targets using in silico studies. For this we retrieved the top 10% overexpressed genes from the Oncomine public domain microarray expression database; 530 overexpressed genes were short-listed. Then, using the online prioritization tools ENDEAVOUR, DIR and TOPPGene, we found fifty-four genes common to the three tools, which formed our candidate leukemogenic genes for this study. As per the protocol we selected thirty training genes from PubMed. The prioritized and training genes were then used to construct a STRING functional association network, which was further analyzed using the cytoHubba hub analysis tool to investigate new genes which could form drug targets in leukemia. Analysis of the STRING protein network built from these prioritized and training genes led to identification of two hub genes, SMAD2 and CDK9, which were not implicated in leukemogenesis earlier. Filtering out from several hundred genes in the network we also found MEN1, HDAC1 and LCK genes, which re-emphasized the important role of these genes in leukemogenesis. This is the first report on these five additional signature genes in leukemogenesis. We propose these as new targets for developing novel therapeutics and also as biomarkers in leukemogenesis, which could be important for prognosis and diagnosis. PMID:26288567

  19. A text mining approach to detect mentions of protein glycosylation in biomedical text

    PubMed Central

    Shukla, Daksha; Jayaraman, Valadi K

    2012-01-01

    Protein glycosylation is an important post-translational event that plays a pivotal role in protein folding and protein trafficking. We describe a dictionary-based and a rule-based approach to mine ‘mentions’ of protein glycosylation in text. The dictionary-based approach relies on a set of manually curated dictionaries specially constructed to address this task. Abstracts are screened for ‘mentions’ of words from these dictionaries; the mentions are then scored, and each abstract is classified on the basis of a threshold. The rule-based approach also relies on the words in the dictionary to arrive at the features which are used for classification. The performance of the system using both approaches has been evaluated using a manually curated corpus of 3133 abstracts. The evaluation suggests that the rule-based approach outperforms the dictionary-based approach. PMID:23055626
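
The dictionary-based pipeline in this abstract (screen for dictionary words, score, classify by threshold) might look roughly like the sketch below. The abstract does not give the authors' dictionaries, scoring scheme, or threshold, so the term list, weights, and cutoff here are all hypothetical placeholders.

```python
import re

# Hypothetical term weights; the curated dictionaries from the paper are not reproduced here.
GLYCO_TERMS = {
    "glycosylation": 2.0,
    "glycosylated": 2.0,
    "n-linked": 1.5,
    "o-linked": 1.5,
    "glycan": 1.0,
}


def score_abstract(text, dictionary=GLYCO_TERMS):
    """Sum the weights of dictionary terms found among the lowercased tokens."""
    tokens = re.findall(r"[a-z0-9\-]+", text.lower())
    return sum(dictionary.get(token, 0.0) for token in tokens)


def mentions_glycosylation(text, threshold=2.0, dictionary=GLYCO_TERMS):
    """Classify an abstract as a glycosylation mention if its score meets the threshold."""
    return score_abstract(text, dictionary) >= threshold


positive = mentions_glycosylation("The protein is N-linked glycosylated at Asn297.")
negative = mentions_glycosylation("Kinase activity was measured in vitro.")
```

A rule-based variant would derive features from the same dictionary hits (for example, co-occurrence of a glycosylation term with a protein name) rather than summing weights. That design is only hinted at in the abstract and is not sketched here.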

  20. An ecosystem approach to evaluate restoration measures in the lignite mining district of Lusatia/Germany

    NASA Astrophysics Data System (ADS)

    Schaaf, Wolfgang

    2015-04-01

    Lignite mining in Lusatia has a history of over 100 years. Open-cast mining directly affected an area of 1000 km2. Over the past 20 years we have established an ecosystem-oriented approach to evaluate the development and site characteristics of post-mining areas mainly restored for agricultural and silvicultural land use. Water and element budgets of afforested sites were studied under different geochemical settings in a chronosequence approach (Schaaf 2001), as well as the effect of soil amendments like sewage sludge or compost in restoration (Schaaf & Hüttl 2006). For the past 10 years we have also studied the development of natural site regeneration in the constructed catchment Chicken Creek at the watershed scale (Schaaf et al. 2011, 2013). One of the striking characteristics of post-mining sites is a very large small-scale soil heterogeneity that has to be taken into account with respect to soil forming processes and element cycling. Results from these studies in combination with smaller-scale process studies make it possible to evaluate the long-term effect of restoration measures and adapted land use options. In addition, it is crucial to compare these results with data from undisturbed, i.e. non-mined sites. Schaaf, W., 2001: What can element budgets of false-time series tell us about ecosystem development on post-lignite mining sites? Ecological Engineering 17, 241-252. Schaaf, W. and Hüttl, R. F., 2006: Direct and indirect effects of soil pollution by lignite mining. Water, Air and Soil Pollution - Focus 6, 253-264. Schaaf, W., Bens, O., Fischer, A., Gerke, H.H., Gerwin, W., Grünewald, U., Holländer, H.M., Kögel-Knabner, I., Mutz, M., Schloter, M., Schulin, R., Veste, M., Winter, S. & Hüttl, R.F., 2011: Patterns and processes of initial terrestrial-ecosystem development. Journal of Plant Nutrition and Soil Science, 174, 229-239. Schaaf, W., Elmer, M., Fischer, A., Gerwin, W., Nenov, R., Pretsch, H. and Zaplate, M.K., 2013: Feedbacks between vegetation, surface structures and hydrology

  1. Evaluation of the approach to respirable quartz exposure control in U.S. coal mines.

    PubMed

    Joy, Gerald J

    2012-01-01

    Occupational exposure to high levels of respirable quartz can result in respiratory and other diseases in humans. The Mine Safety and Health Administration (MSHA) regulates exposure to respirable quartz in coal mines indirectly through reductions in the respirable coal mine dust exposure limit based on the content of quartz in the airborne respirable dust. This reduction is implemented when the quartz content of airborne respirable dust exceeds 5% by weight. The intent of this dust standard reduction is to restrict miners' exposure to respirable quartz to a time-weighted average concentration of 100 μg/m(3). The effectiveness of this indirect approach to control quartz exposure was evaluated by analyzing respirable dust samples collected by MSHA inspectors from 1995 through 2008. The performance of the current regulatory approach was found to be lacking due to the use of a variable property (quartz content in airborne dust) to establish a standard for subsequent exposures. In one situation, 11.7% (4370/37,346) of samples that were below the applicable respirable coal mine dust exposure limit exceeded 100 μg/m(3) quartz. In a second situation, 4.4% (895/20,560) of samples with 5% or less quartz content in the airborne respirable dust exceeded 100 μg/m(3) quartz. In these two situations, the samples exceeding 100 μg/m(3) quartz were not subject to any potential compliance action. Therefore, the current respirable quartz exposure control approach does not reliably maintain miner exposure below 100 μg/m(3) quartz. A separate and specific respirable quartz exposure standard may improve control of coal miners' occupational exposure to respirable quartz. PMID:22181563

  2. EST mining identifies proteins putatively secreted by the anthracnose pathogen Colletotrichum truncatum

    PubMed Central

    2011-01-01

    Background: Colletotrichum truncatum is a haploid, hemibiotrophic, ascomycete fungal pathogen that causes anthracnose disease on many economically important leguminous crops. This pathogen exploits sequential biotrophic and necrotrophic infection strategies to colonize the host. Transition from biotrophy to a destructive necrotrophic phase, called the biotrophy-necrotrophy switch, is critical in symptom development. C. truncatum likely secretes an arsenal of proteins that are implicated in maintaining a compatible interaction with its host. Some of them might be transition specific. Results: A directional cDNA library was constructed from mRNA isolated from infected Lens culinaris leaflet tissues displaying the biotrophy-necrotrophy switch of C. truncatum and 5000 expressed sequence tags (ESTs) with an average read of > 600 bp from the 5-prime end were generated. Nearly 39% of the ESTs were predicted to encode proteins of fungal origin and among these, 162 ESTs were predicted to contain N-terminal signal peptides (SPs) in their deduced open reading frames (ORFs). The 162 sequences could be assembled into 122 tentative unigenes comprising 32 contigs and 90 singletons. Sequence analyses of unigenes revealed four potential groups: hydrolases, cell envelope associated proteins (CEAPs), candidate effectors and other proteins. Eleven candidate effector genes were identified based on features common to characterized fungal effectors, i.e. they encode small, soluble (lack of transmembrane domain), cysteine-rich proteins with a putative SP. For a selected subset of CEAPs and candidate effectors, semiquantitative RT-PCR showed that these transcripts were either expressed constitutively both in vitro and in planta or induced during plant infection. Using potato virus X (PVX) based transient expression assays, we showed that one of the candidate effectors, i.e. contig 8 that encodes a cerato-platanin (CP) domain containing protein, unlike CP proteins from other fungal

  3. Forecasting Precipitation over the MENA Region: A Data Mining and Remote Sensing Based Approach

    NASA Astrophysics Data System (ADS)

    Elkadiri, R.; Sultan, M.; Elbayoumi, T.; Chouinard, K.

    2015-12-01

    We developed and applied an integrated approach to construct predictive tools with lead times of 1 to 12 months to forecast precipitation amounts over the Middle East and North Africa (MENA) region. The following steps were conducted: (1) acquire and analyze temporal remote sensing-based precipitation datasets (i.e. Tropical Rainfall Measuring Mission [TRMM]) over five main water source regions in the MENA area (i.e. Atlas Mountains in Morocco, Southern Sudan, Red Sea Hills of Yemen, and Blue Nile and White Nile source areas) throughout the investigation period (1998 to 2015); (2) acquire and extract monthly values for all of the climatic indices that are likely to influence the climatic patterns over the MENA region (e.g., Northern Atlantic Oscillation [NOI], Southern Oscillation Index [SOI], and Tropical North Atlantic Index [TNA]); and (3) apply data mining methods to extract relationships between the observed precipitation and the controlling factors (climatic indices) and use predictive tools to forecast monthly precipitation over each of the identified pilot study areas. Preliminary results indicate that by using the period from January 1998 until August 2012 for model training and the period from September 2012 to January 2015 for testing, precipitation can be successfully predicted with a three-month lead over South West Yemen, Atlas Mountains in Morocco, Southern Sudan, Blue Nile sources and White Nile sources with confidence (Pearson correlation coefficient: 0.911, 0.823, 0.807, 0.801 and 0.895 respectively). Future work will focus on applying this technique for prediction of precipitation over each of the climatically contiguous areas of the MENA region. If our efforts are successful, our findings will lead the way to the development and implementation of sound water management scenarios for the MENA countries.
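
    As an illustration of the evaluation step, the sketch below pairs a climatic index with precipitation observed a fixed number of months later and computes the Pearson correlation coefficient between observed and predicted series. The function names and the pairing scheme are assumptions made for this example; the abstract does not describe the actual data mining models used.

```python
import math


def lagged_pairs(index_values, precipitation, lead):
    """Pair the index at month t with precipitation at month t + lead."""
    return list(zip(index_values[:len(index_values) - lead], precipitation[lead:]))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)
```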

  4. An Approach to Identify Site Response Directivity of Accelerometer Sites and Application to the Iranian Area

    NASA Astrophysics Data System (ADS)

    Del Gaudio, Vincenzo; Pierri, Pierpaolo; Rajabi, Ali M.

    2015-06-01

    In recent years, several workers have found numerous cases of sites characterised by significant azimuthal variation of dynamic response to seismic shaking. The causes of this phenomenon are still unclear, but are possibly related to combinations of geological and geomorphological factors determining a polarisation of resonance effects. To improve our understanding of these causes, it would be desirable to extend the database of observations on this phenomenon. Thus, considering that unrevealed cases of site response directivity can be "hidden" among the sites of accelerometer networks, we developed a two-stage approach of data mining from existing strong motion databases to identify sites affected by directional amplification. The proposed procedure first calculates Arias Intensity tensor components from accelerometer recordings of each site to determine mean directional variations of total shaking energy. Then, at the sites where a significant anisotropy appears in ground motion, azimuthal variations of HVSR values (spectral ratios between horizontal and vertical components of recordings) are analysed to confirm the occurrence of site resonance conditions. We applied this technique to a database of recordings acquired by accelerometer stations in the Iranian area. The results of this investigation pointed out some sites affected by directional resonance that appear to be correlated to the orientation of local tectonic lineaments, these being mostly transversal to the direction of maximum shaking. Comparing Arias Intensities observed at these sites with theoretical estimates provided by ground motion prediction equations, the presence of significant site amplifications was confirmed. The magnitude of the amplification factors appears to be correlated to the results of HVSR analysis, even though the pattern of dispersion of HVSR values suggests that while high peak values of spectral ratios are indicative of strong amplifications, lower values do not necessarily imply lower
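
    The first stage can be illustrated with the standard definition of the Arias Intensity tensor for two horizontal components, I_ij = (pi / 2g) * integral of a_i(t) a_j(t) dt. The sketch below is a minimal example under that definition, using trapezoidal integration; the function names are hypothetical and the authors' own processing (averaging over events, significance tests) is not reproduced.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def arias_tensor(ax, ay, dt):
    """Return (Ixx, Ixy, Iyy) for two horizontal acceleration records in m/s^2."""
    def integral(u, v):
        prod = [a * b for a, b in zip(u, v)]
        # Trapezoidal rule over uniformly sampled products.
        return dt * sum(0.5 * (prod[i] + prod[i + 1]) for i in range(len(prod) - 1))

    scale = math.pi / (2.0 * G)
    return scale * integral(ax, ax), scale * integral(ax, ay), scale * integral(ay, ay)


def directional_intensity(ixx, ixy, iyy, theta):
    """Arias Intensity along the horizontal direction at angle theta (radians) from x."""
    c, s = math.cos(theta), math.sin(theta)
    return ixx * c * c + 2.0 * ixy * s * c + iyy * s * s


def principal_azimuth(ixx, ixy, iyy):
    """Angle (radians from x) of maximum directional intensity."""
    return 0.5 * math.atan2(2.0 * ixy, ixx - iyy)
```

    Comparing the maximum and minimum of directional_intensity over all angles gives a simple measure of anisotropy in shaking energy at a site.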

  5. Mining and characterization of two amidase signature family amidases from Brevibacterium epidermidis ZJB-07021 by an efficient genome mining approach.

    PubMed

    Ruan, Li-Tao; Zheng, Ren-Chao; Zheng, Yu-Guo

    2016-10-01

    Amidases have received increasing attention for their significant potential in the production of valuable carboxylic acids. In this study, two amidases belonging to the amidase signature family (BeAmi2 and BeAmi4) were identified and mined from genomic DNA of Brevibacterium epidermidis ZJB-07021 by an efficient strategy combining comparative analysis of genomes and identification of unknown regions by high-efficiency thermal asymmetric interlaced PCR (HiTAIL-PCR). The deduced amino acid sequences of BeAmi2 and BeAmi4 showed low identity (< 40%) with other reported amidases. The two amidases displayed optimum activity toward a wide spectrum of substrates at a mild alkaline pH and 45 °C. Both of them were remarkably inactivated by serine-directed inhibitor and sulfhydryl-reducing agent. Kinetic analysis revealed that nicotinamide was the preferable substrate for both amidases and the chlorine substitutions on the pyridine ring had a negative effect on activity. The bioprocesses for hydrolysis of 100 mM nicotinamide, isonicotinamide, 2-chloronicotinamide and 5-chloronicotinamide with purified BeAmi2 (6 U mL(-1)) were complete in 60 min with full conversion except for 2-chloronicotinamide. These results indicated BeAmi2 was an effective catalyst for hydrolysis of several nicotinamide derivatives. PMID:27180252

  6. A Data Mining Approach for Examining Predictors of Physical Activity among Older Urban Adults

    PubMed Central

    Yoon, Sunmoo; Suero-Tejeda, Niurka; Bakken, Suzanne

    2015-01-01

    This study applied innovative data mining techniques to a community survey dataset to develop prediction models for two aspects of physical activity (active transport and screen time) in a sample of older, primarily Hispanic, urban adults (N=2,514). Main predictors for active transport (accuracy=69.29%, precision .67, recall .69) were immigrant status, high level of anxiety, having a place for physical activity, and willingness to make time for physical activity. The main predictors for screen time (accuracy=63.13%, precision .60, recall .63) were willingness to make time for exercise, having a place for exercise, age, and availability of family support to look up health information on the Internet. Data mining methods were useful to identify intervention targets and inform design of customized interventions. PMID:25941800

  7. A Data Mining Approach for Examining Predictors of Physical Activity Among Urban Older Adults.

    PubMed

    Yoon, Sunmoo; Suero-Tejeda, Niurka; Bakken, Suzanne

    2015-07-01

    The current study applied innovative data mining techniques to a community survey dataset to develop prediction models for two aspects of physical activity (i.e., active transport and screen time) in a sample of urban, primarily Hispanic, older adults (N=2,514). Main predictors for active transport (accuracy=69.29%, precision=0.67, recall=0.69) were immigrant status, high level of anxiety, having a place for physical activity, and willingness to make time for physical activity. The main predictors for screen time (accuracy=63.13%, precision=0.60, recall=0.63) were willingness to make time for exercise, having a place for exercise, age, and availability of family support to access health information on the Internet. Data mining methods were useful to identify intervention targets and inform design of customized interventions. PMID:25941800
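
    For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, precision and recall are conventionally computed from a binary confusion matrix. The counts in the usage note are invented for illustration and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    return accuracy, precision, recall
```

    With hypothetical counts of 6 true positives, 2 false positives, 3 false negatives and 9 true negatives, the function returns an accuracy of 0.75, a precision of 0.75 and a recall of 6/9.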

  8. Text Mining.

    ERIC Educational Resources Information Center

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  9. Identifying and overcoming the constraints that prevent the full implementation of decommissioning and remediation programs in uranium mining sites.

    PubMed

    Franklin, Mariza Ramalho; Fernandes, Horst Monken

    2013-05-01

    Environmental remediation of radioactive contamination is about achieving appropriate reduction of exposures to ionizing radiation. This goal can be achieved by means of isolation or removal of the contamination source(s) or by breaking the exposure pathways. Ideally, environmental remediation is part of the planning phase of any industrial operation with the potential to cause environmental contamination. This concept is even more important in mining operations due to the significant impacts produced. This approach has not been considered in several operations developed in the past. Therefore many legacy sites face the challenge to implement appropriate remediation plans. One of the first barriers to remediation works is the lack of financial resources as environmental issues used to be taken in the past as marginal costs and were not included in the overall budget of the company. This paper analyses the situation of the former uranium production site of Poços de Caldas in Brazil. It is demonstrated that in addition to the lack of resources, other barriers such as the lack of information on site characteristics, appropriate regulatory framework, funding mechanisms, stakeholder involvement, policy and strategy, technical experience and mechanism for the appropriation of adequate technical expertise will play key roles in preventing the implementation of remediation programs. All these barriers are discussed and some solutions are suggested. It is expected that lessons learned from the Poços de Caldas legacy site may stimulate advancement of more sustainable options in the development of future uranium production centers. PMID:21955840

  10. Pattern recognition and data mining techniques to identify factors in wafer processing and control determining overlay error

    NASA Astrophysics Data System (ADS)

    Lam, Auguste; Ypma, Alexander; Gatefait, Maxime; Deckers, David; Koopman, Arne; van Haren, Richard; Beltman, Jan

    2015-03-01

    On-product overlay can be improved through the use of context data from the fab and the scanner. Continuous improvements in lithography and processing performance over the past years have resulted in consequent overlay performance improvement for critical layers. Identification of the remaining factors causing systematic disturbances and inefficiencies will further reduce overlay. By building a context database, mappings between context, fingerprints and alignment & overlay metrology can be learned through techniques from pattern recognition and data mining. We relate structure ('patterns') in the metrology data to relevant contextual factors. Once understood, these factors could be moved to the known effects (e.g. the presence of systematic fingerprints from reticle writing error or lens and reticle heating). Hence, we build up a knowledge base of known effects based on data. Outcomes from such an integral ('holistic') approach to lithography data analysis may be exploited in a model-based predictive overlay controller that combines feedback and feedforward control [1]. Hence, the available measurements from scanner, fab and metrology equipment are combined to reveal opportunities for further overlay improvement which would otherwise go unnoticed.

  11. A multi-isotope approach to characterize acid mine drainage in a hardrock alpine mine, Chaffee Co., Colorado.

    NASA Astrophysics Data System (ADS)

    Cordalis, D.; Williams, M. W.; Wireman, M.; Michel, R. L.; Manning, A.

    2004-12-01

    Here we present information from an innovative suite of stable, radiogenic, and cosmogenic isotopes to better understand groundwater flowpaths and groundwater-surface water interactions in an applied acid mine drainage system. Stable water isotopes, tritium, helium-tritium, sulfur-35, and uranium 234/238 ratios were analyzed from precipitation, groundwater wells, interior mine drainages, and surface waters at the Mary Murphy Mine in Colorado to determine hydrologic transport mechanisms responsible for contaminated zinc releases. Hydrometric measurements suggested a snowmelt-driven pulse of elevated zinc in adit outflow. However, mixing models using stable water isotopes showed a regional groundwater signal in the adit outflow. Tritium values of 11 to 13 TU showed a slight enrichment of bomb spike water compared to snow values of about 9 TU, suggesting an older water source as well. Helium/tritium ratios on a subset of groundwater wells suggested that average residence times of alluvial wells ranged from 2.5 to 8 years. The combination of stable water isotopes and sulfur-35 (half-life of 87 days) showed that zinc-rich waters within the mine derived from infiltrating snowmelt more than a year old. However, measurement of sulfur-35 using low-level scintillation counts was compromised at times by the presence of uranium. We were able to remove the uranium through wet chemistry procedures, improving the accuracy of S-35 measurements. The U234/U238 ratio shows promise in discriminating between acid mine drainage and acid rock drainage. Acid rock drainage shows an unaltered ratio of 1:1, while acid mine drainage is enriched relative to the 1:1 equilibrium ratio. The combination of cosmogenic and stable isotopes within and near the Mary Murphy Mine may provide a useful tool for studying interactions between groundwater and surface water in a fractured rock setting. Remediation techniques can be directed more appropriately, and cost effectively, by the characterization of
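
    The abstract does not state which mixing model was used. A common choice for stable water isotopes is the two-component (end-member) mixing model, sketched below under that assumption; the function name and the delta values in the usage note are hypothetical.

```python
def end_member_fraction(delta_mix, delta_a, delta_b):
    """Fraction of end member A in a mixture, from isotope delta values (per mil).

    Solves delta_mix = f * delta_a + (1 - f) * delta_b for f.
    """
    if delta_a == delta_b:
        raise ValueError("end members must have distinct isotopic signatures")
    return (delta_mix - delta_b) / (delta_a - delta_b)
```

    For example, with hypothetical delta-18O values of -20 per mil for snowmelt, -16 per mil for regional groundwater and -17 per mil for adit outflow, end_member_fraction(-17.0, -20.0, -16.0) gives a snowmelt fraction of 0.25.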

  12. Web Mining

    NASA Astrophysics Data System (ADS)

    Fürnkranz, Johannes

    The World-Wide Web provides every internet citizen with access to an abundance of information, but it becomes increasingly difficult to identify the relevant pieces of information. Research in web mining tries to address this problem by applying techniques from data mining and machine learning to Web data and documents. This chapter provides a brief overview of web mining techniques and research areas, most notably hypertext classification, wrapper induction, recommender systems and web usage mining.

  13. Stochastic Modeling Approach for the Evaluation of Backbreak due to Blasting Operations in Open Pit Mines

    NASA Astrophysics Data System (ADS)

    Sari, Mehmet; Ghasemi, Ebrahim; Ataei, Mohammad

    2014-03-01

    Backbreak is an undesirable side effect of bench blasting operations in open pit mines. A large number of parameters affect backbreak, including controllable parameters (such as blast design parameters and explosive characteristics) and uncontrollable parameters (such as rock and discontinuity properties). The complexity of the backbreak phenomenon and the uncertainty in the impact of various parameters make its prediction very difficult. The aim of this paper is to determine the suitability of the stochastic modeling approach for the prediction of backbreak and to assess the influence of controllable parameters on the phenomenon. To achieve this, a database containing actual measured backbreak occurrences and the major effective controllable parameters on backbreak (i.e., burden, spacing, stemming length, powder factor, and geometric stiffness ratio) was created from 175 blasting events in the Sungun copper mine, Iran. From this database, first, a new site-specific empirical equation for predicting backbreak was developed using multiple regression analysis. Then, the backbreak phenomenon was simulated by the Monte Carlo (MC) method. The results reveal that stochastic modeling is a good means of modeling and evaluating the effects of the variability of blasting parameters on backbreak. Thus, the developed model is suitable for practical use in the Sungun copper mine. Finally, a sensitivity analysis showed that stemming length is the most important parameter in controlling backbreak.
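
    The two-step procedure (a regression equation, then Monte Carlo sampling of its inputs) can be sketched as below. The linear form, the coefficients and the normal input distributions are hypothetical placeholders chosen for illustration; the site-specific equation fitted for the Sungun mine is not given in the abstract.

```python
import random

# Hypothetical regression coefficients (placeholders, not the published equation).
COEFFS = {
    "intercept": 1.0,
    "burden": 0.8,
    "spacing": 0.3,
    "stemming": 0.9,
    "powder_factor": -2.0,
    "stiffness": -0.5,
}


def predict_backbreak(x, coeffs=COEFFS):
    """Evaluate the (assumed linear) regression equation for one blast design."""
    return coeffs["intercept"] + sum(coeffs[name] * value for name, value in x.items())


def simulate_backbreak(inputs, n=10000, seed=0, coeffs=COEFFS):
    """Monte Carlo simulation: draw each input from Normal(mean, sd) and predict.

    `inputs` maps a parameter name to a (mean, standard deviation) tuple.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        draw = {name: rng.gauss(mean, sd) for name, (mean, sd) in inputs.items()}
        results.append(predict_backbreak(draw, coeffs))
    return results
```

    Sorting the simulated values gives an empirical distribution from which percentiles of backbreak can be read, and repeating the simulation while varying one input at a time gives a simple sensitivity analysis.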

  14. Soil quality assessment using GIS-based chemometric approach and pollution indices: Nakhlak mining district, Central Iran.

    PubMed

    Moore, Farid; Sheykhi, Vahideh; Salari, Mohammad; Bagheri, Adel

    2016-04-01

    This paper is a comprehensive assessment of the quality of soil in the Nakhlak mining district in Central Iran with special reference to potentially toxic metals. In this regard, an integrated approach involving geostatistical, correlation matrix, pollution indices, and chemical fractionation measurement is used to evaluate selected potentially toxic metals in soil samples. The fractionation of metals indicated a relatively high variability. Some metals (Mo, Ag, and Pb) showed important enrichment in the bioavailable fractions (i.e., exchangeable and carbonate), whereas the residual fraction mostly comprised Sb and Cr. The Cd, Zn, Co, Ni, Mo, Cu, and As were retained in Fe-Mn oxide and oxidizable fractions, suggesting that they may be released to the environment by changes in physicochemical conditions. The spatial variability patterns of 11 soil heavy metals (Ag, As, Cd, Co, Cr, Cu, Mo, Ni, Pb, Sb, and Zn) were identified and mapped. The results demonstrated that Ag, As, Cd, Mo, Cu, Pb, Sb, and Zn pollution are associated with mineralized veins and mining operations in this area. Further environmental monitoring and remedial actions are required for management of soil heavy metals in the study area. The present study not only enhanced our knowledge regarding soil pollution in the study area but also introduced a better technique to analyze pollution indices by multivariate geostatistical methods. PMID:26956012

  15. An experimental approach to assessing the effects of mining subsidence on a flood meadow community

    SciTech Connect

    Benyon, P.R.; Humphries, R.N.; Gregson, K.; Marshall, S.; Peace, S.W.

    1998-12-31

    The Lower Derwent Valley (LDV) is a candidate Special Area of Conservation (SAC) under the provisions of the UK 1994 Conservation Regulations for its internationally important Alopecurus pratensis-Sanguisorba officinalis flood meadow vegetation. Mining from RJB's Selby Complex (UK's largest mine) has taken place around and under the LDV since the 1980s. Under the provisions of the Regulations the potential effects of mining subsidence have been recently reviewed. From field data and models it has been predicted that the resulting small amount of subsidence is unlikely to have a deleterious effect on the composition and extent of the key community. While the proposed long-term monitoring will verify the prediction, it will be some years before the results will be available. In order to identify incipient changes in grassland community and to implement any necessary mitigation measures before significant changes occur, a field experiment was set up in late 1996 to assess the effects of increased wetness and inundation which might be induced by subsidence. This involved the transplantation of turves from the different grassland communities within and along a previously defined gradient of relative wetness and inundation. The response of the communities to the different conditions is being monitored. The background studies and the results of the transplantation so far will be presented.

  16. The adaptive approach for storage assignment by mining data of warehouse management system for distribution centres

    NASA Astrophysics Data System (ADS)

    Ming-Huang Chiang, David; Lin, Chia-Ping; Chen, Mu-Chen

    2011-05-01

    Among distribution centre operations, order picking has been reported to be the most labour-intensive activity. Sophisticated storage assignment policies adopted to reduce the travel distance of order picking have been explored in the literature. Unfortunately, previous research has been devoted to locating entire products from scratch. Instead, this study proposes an adaptive approach, a Data Mining-based Storage Assignment approach (DMSA), to find the optimal storage assignment for newly delivered products that need to be put away when there is vacant shelf space in a distribution centre. In the DMSA, a new association index (AIX) is developed to evaluate the fitness between the put away products and the unassigned storage locations by applying association rule mining. With AIX, the storage location assignment problem (SLAP) can be formulated and solved as a binary integer programming problem. To evaluate the performance of DMSA, a real-world order database of a distribution centre is obtained and used to compare the results from DMSA with a random assignment approach. It turns out that DMSA outperforms random assignment as the number of put away products and the proportion of put away products with high turnover rates increase.
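
    The AIX itself is not defined in the abstract. As a rough illustration of the association rule mining it builds on, the sketch below computes the lift of each product pair from a list of orders; products with high lift are picked together more often than chance would suggest, which is the kind of signal a storage assignment could exploit. The function name and the sample orders are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_lift(orders):
    """Return {(a, b): lift} for every product pair that co-occurs in an order.

    lift(a, b) = support(a and b) / (support(a) * support(b)).
    """
    n = len(orders)
    item_count = Counter()
    pair_count = Counter()
    for order in orders:
        items = sorted(set(order))  # ignore duplicates within one order
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    return {
        (a, b): (count / n) / ((item_count[a] / n) * (item_count[b] / n))
        for (a, b), count in pair_count.items()
    }
```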

  17. A Control Chart Approach for Representing and Mining Data Streams with Shape Based Similarity

    SciTech Connect

    Omitaomu, Olufemi A

    2014-01-01

    The mining of data streams for online condition monitoring is a challenging task in several domains including (electric) power grid systems, intelligent manufacturing, and consumer science. Consider a power grid application in which thousands of sensors, called phasor measurement units, are deployed on the power grid network to continuously collect streams of digital data for real-time situational awareness and system management. Depending on design, each sensor could stream between ten and sixty data samples per second. The myriad of sensory data captured could convey deeper insights about sequences of events in real time and before major damage is done. However, the timely processing and analysis of these high-velocity and high-volume data streams is a challenge. Hence, a new data processing and transformation approach, based on the concept of control charts, for representing sequences of data streams from sensors is proposed. In addition, an application of the proposed approach for enhancing data mining tasks such as clustering using real-world power grid data streams is presented. The results indicate that the proposed approach is very efficient for data stream storage and manipulation.
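
    The abstract does not detail the transformation. One simple reading of a control-chart representation is sketched below: Shewhart-style limits are estimated from a baseline window, and each later sample is encoded by whether it falls below, within or above those limits. The function names, the 3-sigma default and the three-symbol encoding are assumptions made for this example.

```python
import statistics


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits as mean +/- k standard deviations."""
    center = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)
    return center - k * sigma, center, center + k * sigma


def encode(stream, lower, upper):
    """Encode each sample as -1 (below limits), 0 (in control) or 1 (above limits)."""
    return [1 if x > upper else -1 if x < lower else 0 for x in stream]
```

    The encoded sequence is far more compact than the raw samples and can be fed to shape-based similarity measures or clustering.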

  18. A collaborative approach for mine waste cleanup -- the Animas River experience

    SciTech Connect

    Broetzman, G.; Parsons, G.

    1996-11-01

    An innovative, collaborative approach is underway in the Animas River Basin for addressing a myriad of inactive mine waste sites using a watershed framework. A group composed of all vested interests in the Basin, including the regulatory agencies, is evaluating all sites. Their intent is to select those sites that will lead to a cost-effective attainment of State-defined water quality improvements in the Animas River. This paper will address process, methodology, regulatory, and related issues associated with this overall effort.

  19. Knowledge Discovery using Domain-Concept Mining Approach for the Behavioral Risk Factor Surveillance System (BRFSS) Data

    PubMed Central

    Mahamaneerat, Wannapa Kay; Shyu, Chi-Ren

    2006-01-01

    The publicly available Behavioral Risk Factor Surveillance System (BRFSS) data is the largest telephone survey data set in the world. Oftentimes, the data set is under-utilized due to its size and the difficulty of comprehending and exploring the relationships among variables. With a traditional data mining approach, such as association rule (AR) mining, it is still not possible to discover valuable information under the existing computational power. To promote the usefulness of this rich data set efficiently, we propose a novel data mining approach called Domain-Concept Mining (DCM) that partitions data into groups of relevant domain-concepts, then extracts associations among variables from each partition. The findings from the DCM show that it can efficiently discover relevant information from the BRFSS with respect to the previously published literature. PMID:17238640

  20. VALUING ACID MINE DRAINAGE REMEDIATION OF IMPAIRED WATERWAYS IN WEST VIRGINIA: A HEDONIC MODELING APPROACH

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD), the metal rich runoff flowing primarily from abandoned mines and surface deposits of mine waste. AMD can lower stream and river pH ...

  1. A Data Mining Approach to Predict In Situ Detoxification Potential of Chlorinated Ethenes.

    PubMed

    Lee, Jaejin; Im, Jeongdae; Kim, Ungtae; Löffler, Frank E

    2016-05-17

    Despite advances in physicochemical remediation technologies, in situ bioremediation treatment based on Dehalococcoides mccartyi (Dhc) reductive dechlorination activity remains a cornerstone approach to remedy sites impacted with chlorinated ethenes. Selecting the best remedial strategy is challenging due to uncertainties and complexity associated with biological and geochemical factors influencing Dhc activity. Guidelines based on measurable biogeochemical parameters have been proposed, but contemporary efforts fall short of meaningfully integrating the available information. Extensive groundwater monitoring data sets have been collected for decades, but have not been systematically analyzed and used for developing tools to guide decision-making. In the present study, geochemical and microbial data sets collected from 35 wells at five contaminated sites were used to demonstrate that a data mining prediction model using the classification and regression tree (CART) algorithm can provide improved predictive understanding of a site's reductive dechlorination potential. The CART model successfully predicted the 3-month-ahead reductive dechlorination potential with 75.8% and 69.5% true positive rate (i.e., sensitivity) for the training set and the test set, respectively. The machine learning algorithm ranked parameters by relative importance for assessing in situ reductive dechlorination potential. The abundance of Dhc 16S rRNA genes, CH4, Fe(2+), NO3(-), NO2(-), and SO4(2-) concentrations, total organic carbon (TOC) amounts, and oxidation-reduction potential (ORP) displayed significant correlations (p < 0.01) with dechlorination potential, with NO3(-), NO2(-), and Fe(2+) concentrations exhibiting precedence over other parameters. Contrary to prior efforts, the power of data mining approaches lies in the ability to discern synergetic effects between multiple parameters that affect reductive dechlorination activity. Overall, these findings demonstrate that data mining

  2. A multi-disciplinary approach to understanding the impacts of mines on traditional uses of water in Northern Mongolia.

    PubMed

    McIntyre, Neil; Bulovic, Nevenka; Cane, Isabel; McKenna, Phill

    2016-07-01

    Mongolia is an example of a nation where the rapidity of mining development is outpacing capacity to manage the potential land and water resources impacts. Further, Mongolia has a particular social and economic reliance on traditional uses of land and water, principally livestock herding. While some mining operations are setting high standards in protecting the natural resources surrounding the mine site, others have less incentive and capacity to do so and therefore are having adverse effects on surrounding communities. The paper describes a case study of the Sharyn Gol Soum in northern Mongolia where a range of mining types, from artisanal, small-scale mining to a large coal mine, operate alongside traditional herding lifestyles. A multi-disciplinary approach is taken to observe and attribute causes to the water resources impacts in the area. Surveys of the herding household community, land use mapping, and monitoring the spatial variations in water quality indicate deterioration of water resources. Collectively, the different sources of evidence suggest that the deterioration is mainly due to small-scale gold mining. The evidence included the perception of 78% of the interviewed herders that water quality had changed due to mining; a change in the footprint of small-scale gold mining from 2.8 to 15.2km(2) during the period 1999 to 2015; and pH and sulphate values in 2015 consistently outside the ranges observed at a baseline site in the same region. It is concluded that the lack of baseline data and effective governance mechanisms are fundamental challenges that need to be addressed if Mongolia's transition to a mining economy is to be managed alongside sustainability of herder lifestyles. PMID:27016688

  3. A data mining approach to optimize pellets manufacturing process based on a decision tree algorithm.

    PubMed

    Ronowicz, Joanna; Thommes, Markus; Kleinebudde, Peter; Krysiński, Jerzy

    2015-06-20

    The present study is focused on the thorough analysis of cause-effect relationships between pellet formulation characteristics (pellet composition as well as process parameters) and the selected quality attribute of the final product. Pellet quality was expressed by shape, measured as the aspect ratio. A data matrix for chemometric analysis consisted of 224 pellet formulations prepared with eight different active pharmaceutical ingredients and several excipients, using different extrusion/spheronization process conditions. The data set contained 14 input variables (both formulation and process variables) and one output variable (pellet aspect ratio). A tree regression algorithm consistent with the Quality by Design concept was applied to obtain deeper understanding and knowledge of formulation and process parameters affecting the final pellet sphericity. A clear, interpretable set of decision rules was generated. The spheronization speed, spheronization time, number of holes and water content of extrudate were recognized as the key factors influencing pellet aspect ratio. The most spherical pellets were achieved by using a large number of holes during extrusion, a high spheronizer speed and longer time of spheronization. The described data mining approach enhances knowledge about the pelletization process and simultaneously facilitates searching for the optimal process conditions which are necessary to achieve ideal spherical pellets, resulting in good flow characteristics. This data mining approach can be taken into consideration by industrial formulation scientists to support rational decision making in the field of pellets technology. PMID:25835791

  4. Novel data-mining approach identifies biomarkers for diagnosis of Kawasaki disease

    PubMed Central

    Tremoulet, Adriana H.; Dutkowski, Janusz; Sato, Yuichiro; Kanegaye, John T.; Ling, Xuefeng B.; Burns, Jane C.

    2015-01-01

    Background As Kawasaki disease (KD) shares many clinical features with other more common febrile illnesses and misdiagnosis, leading to a delay in treatment, increases the risk of coronary artery damage, a diagnostic test for KD is urgently needed. We sought to develop a panel of biomarkers that could distinguish between acute KD patients and febrile controls (FC) with sufficient accuracy to be clinically useful. Methods Plasma samples were collected from three independent cohorts of FC and acute KD patients who met the American Heart Association definition for KD and presented within the first 10 days of fever. The levels of 88 biomarkers associated with inflammation were assessed by Luminex bead technology. Unsupervised clustering followed by supervised clustering using a Random Forest model was used to find a panel of candidate biomarkers. Results A panel of biomarkers commonly available in the hospital laboratory (absolute neutrophil count, erythrocyte sedimentation rate, alanine aminotransferase, gamma glutamyl transferase, concentrations of alpha-1-antitrypsin, C-reactive protein, and fibrinogen, and platelet count) accurately diagnosed 81 to 96% of KD patients in a series of three independent cohorts. Conclusions After prospective validation, this 8-biomarker panel may improve the recognition of KD. PMID:26237629

  5. Use of lead isotopes to identify sources of metal and metalloid contaminants in atmospheric aerosol from mining operations.

    PubMed

    Félix, Omar I; Csavina, Janae; Field, Jason; Rine, Kyle P; Sáez, A Eduardo; Betterton, Eric A

    2015-03-01

    Mining operations are a potential source of metal and metalloid contamination by atmospheric particulate generated from smelting activities, as well as from erosion of mine tailings. In this work, we show how lead isotopes can be used for source apportionment of metal and metalloid contaminants from the site of an active copper mine. Analysis of atmospheric aerosol shows two distinct isotopic signatures: one prevalent in fine particles (<1μm aerodynamic diameter) while the other corresponds to coarse particles as well as particles in all size ranges from a nearby urban environment. The lead isotopic ratios found in the fine particles are equal to those of the mine that provides the ore to the smelter. Topsoil samples at the mining site show concentrations of Pb and As decreasing with distance from the smelter. Isotopic ratios for the sample closest to the smelter (650m) and from topsoil at all sample locations, extending to more than 1km from the smelter, were similar to those found in fine particles in atmospheric dust. The results validate the use of lead isotope signatures for source apportionment of metal and metalloid contaminants transported by atmospheric particulate. PMID:25496740

  6. Use of Lead Isotopes to Identify Sources of Metal and Metalloid Contaminants in Atmospheric Aerosol from Mining Operations

    PubMed Central

    Félix, Omar I.; Csavina, Janae; Field, Jason; Rine, Kyle P.; Sáez, A. Eduardo; Betterton, Eric A.

    2014-01-01

    Mining operations are a potential source of metal and metalloid contamination by atmospheric particulate generated from smelting activities, as well as from erosion of mine tailings. In this work, we show how lead isotopes can be used for source apportionment of metal and metalloid contaminants from the site of an active copper mine. Analysis of atmospheric aerosol shows two distinct isotopic signatures: one prevalent in fine particles (< 1 μm aerodynamic diameter) while the other corresponds to coarse particles as well as particles in all size ranges from a nearby urban environment. The lead isotopic ratios found in the fine particles are equal to those of the mine that provides the ore to the smelter. Topsoil samples at the mining site show concentrations of Pb and As decreasing with distance from the smelter. Isotopic ratios for the sample closest to the smelter (650 m) and from topsoil at all sample locations, extending to more than 1 km from the smelter, were similar to those found in fine particles in atmospheric dust. The results validate the use of lead isotope signatures for source apportionment of metal and metalloid contaminants transported by atmospheric particulate. PMID:25496740

  7. tmVar: a text mining approach for extracting sequence variants in biomedical literature

    PubMed Central

    Wei, Chih-Hsuan; Harris, Bethany R.; Kao, Hung-Yu; Lu, Zhiyong

    2013-01-01

    Motivation: Text-mining mutation information from the literature has become a critical part of the bioinformatics approach for the analysis and interpretation of sequence variations in complex diseases in the post-genomic era. It has also been used for assisting the creation of disease-related mutation databases. Most existing approaches are rule-based and focus on limited types of sequence variations, such as protein point mutations. Thus, extending their extraction scope requires significant manual effort in examining new instances and developing corresponding rules. As such, new automatic approaches are greatly needed for extracting different kinds of mutations with high accuracy. Results: Here, we report tmVar, a text-mining approach based on conditional random field (CRF) for extracting a wide range of sequence variants described at protein, DNA and RNA levels according to a standard nomenclature developed by the Human Genome Variation Society. By doing so, we cover several important types of mutations that were not considered in past studies. Using a novel CRF label model and feature set, our method achieves higher performance than a state-of-the-art method on both our corpus (91.4 versus 78.1% in F-measure) and their own gold standard (93.9 versus 89.4% in F-measure). These results suggest that tmVar is a high-performance method for mutation extraction from biomedical literature. Availability: tmVar software and its corpus of 500 manually curated abstracts are available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/pub/tmVar. Contact: zhiyong.lu@nih.gov PMID:23564842

  8. Order Batching in Warehouses by Minimizing Total Tardiness: A Hybrid Approach of Weighted Association Rule Mining and Genetic Algorithms

    PubMed Central

    Taheri, Shahrooz; Mat Saman, Muhamad Zameri; Wong, Kuan Yew

    2013-01-01

    One of the cost-intensive issues in managing warehouses is the order picking problem, which deals with the retrieval of items from their storage locations in order to meet customer requests. Many solution approaches have been proposed to minimize traveling distance in the process of order picking. However, in practice, customer orders have to be completed by certain due dates in order to avoid tardiness, which is neglected in most of the related scientific papers. Consequently, we proposed a novel four-phase solution approach to minimize tardiness. First, weighted association rule mining was used to calculate associations between orders with respect to their due dates. Next, a batching model based on binary integer programming was formulated to maximize the associations between orders within each batch. Subsequently, in the order picking phase, a Genetic Algorithm integrated with the Traveling Salesman Problem was used to identify the most suitable travel path. Finally, the Genetic Algorithm was applied to sequence the constructed batches in order to minimize tardiness. Illustrative examples and comparisons are presented to demonstrate the proficiency and solution quality of the proposed approach. PMID:23864823

  9. Citation Mining: Integrating Text Mining and Bibliometrics for Research User Profiling.

    ERIC Educational Resources Information Center

    Kostoff, Ronald N.; del Rio, J. Antonio; Humenik, James A.; Garcia, Esther Ofilia; Ramirez, Ana Maria

    2001-01-01

    Discusses the importance of identifying the users and impact of research, and describes an approach for identifying the pathways through which research can impact other research, technology development, and applications. Describes a study that used citation mining, an integration of citation bibliometrics and text mining, on articles from the…

  10. Using a Text-Mining Approach to Evaluate the Quality of Nursing Records.

    PubMed

    Chang, Hsiu-Mei; Chiou, Shwu-Fen; Liu, Hsiu-Yun; Yu, Hui-Chu

    2016-01-01

    Nursing records in Taiwan have been computerized, but their quality has rarely been discussed. Therefore, this study employed a text-mining approach and a cross-sectional retrospective research design to evaluate the quality of electronic nursing records at a medical center in Northern Taiwan. SAS Text Miner software Version 13.2 was employed to analyze unstructured nursing event records. The results show that SAS Text Miner is suitable for developing a text-mining model for validating nursing records. The sensitivity of SAS Text Miner was approximately 0.94, and the specificity and accuracy were 0.99. Thus, SAS Text Miner software is an effective tool for auditing unstructured electronic nursing records. PMID:27332355
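
    The sensitivity, specificity, and accuracy quoted in the record above are standard confusion-matrix ratios. The sketch below shows how they are computed; the function name and the counts are hypothetical illustrations and are not taken from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # share of true problems that were flagged
    specificity = tn / (tn + fp)                # share of sound records that were passed
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all records classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen only for illustration (not data from the study).
sens, spec, acc = confusion_metrics(tp=47, fp=5, tn=495, fn=3)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

    When negatives heavily outnumber positives, as in these made-up counts, accuracy tracks specificity closely, which is one reason sensitivity is usually reported alongside it.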

  11. Colorado School of Mines behavioral approach to the 1995 UGR competition

    NASA Astrophysics Data System (ADS)

    Murphy, Robin R.; Hoff, William A.; Blitch, John; Gough, Val; Hawkins, Dale; Hoffman, James C.; Krosley, Ramon; Lyons, Torsten; Mali, Amol; MacMillan, James; Warshawsky, Steven

    1995-12-01

    The Colorado School of Mines (CSM) entry placed fourth in the 1995 International Unmanned Ground Robotics Competition sponsored by the Association for Unmanned Vehicles (AUVS). Clementine 2, a battery powered children's jeep outfitted with a 100 MHz Pentium field computer, a camcorder, and a panning ultrasonic range finder served as the platform. The objectives of the CSM team were to gain familiarity with the CSM architecture by applying it to a well defined problem, evaluate existing computer vision based road following techniques, and gain practical experience in using multiple sensing modalities. The entry used the behavioral portion of the CSM hybrid deliberative/reactive architecture, which divided robot activities into four strategic and tactical behaviors: vision based follow-path, ultrasonic based avoid-obstacle, pan-camera, and speed-control using inclinometers. This paper details the motivation behind the CSM entry, the approach taken, and lessons learned.

  12. A data mining approach to predict in situ chlorinated ethene detoxification potential

    NASA Astrophysics Data System (ADS)

    Lee, J.; Im, J.; Kim, U.; Loeffler, F. E.

    2015-12-01

    Despite major advances in physicochemical remediation technologies, in situ biostimulation and bioaugmentation treatment aimed at stimulating Dehalococcoides mccartyi (Dhc) reductive dechlorination activity remains a cornerstone approach to remedy sites impacted with chlorinated ethenes. In practice, selecting the best remedial strategy is challenging due to uncertainties associated with the microbiology (e.g., presence and activity of Dhc) and geochemical factors influencing Dhc activity. Extensive groundwater datasets collected over decades of monitoring exist, but have not been systematically analyzed. In the present study, geochemical and microbial data sets collected from 35 wells at 5 contaminated sites were used to develop a predictive empirical model using a machine learning algorithm (i) to rank the relative importance of parameters that affect in situ reductive dechlorination potential, and (ii) to provide recommendations for selecting the optimal remediation strategy at a specific site. Classification and regression tree (CART) analysis was applied, and a representative classification tree model was developed that allowed short-term prediction of dechlorination potential. Indirect indicators for low dissolved oxygen (e.g., low NO3- and NO2-, high Fe2+ and CH4) were the most influential factors for predicting dechlorination potential, followed by total organic carbon content (TOC) and Dhc cell abundance. These findings indicate that machine learning-based data mining techniques applied to groundwater monitoring data can lead to the development of predictive groundwater remediation models. A major need for improving the predictive capabilities of the data mining approach is a curated, up-to-date and comprehensive collection of groundwater monitoring data.

  13. EVALUATION OF FUGITIVE DUST EMISSIONS FROM MINING

    EPA Science Inventory

    This evaluation of fugitive dust air pollution from mining operations identifies and compiles currently available information on emission sources and rates, regulatory approaches, control techniques, measuring and monitoring techniques, health and welfare effects, and research pr...

  14. Identifying medical terms in patient-authored text: a crowdsourcing-based approach

    PubMed Central

    MacLean, Diana Lynn; Heer, Jeffrey

    2013-01-01

    Background and objective As people increasingly engage in online health-seeking behavior and contribute to health-oriented websites, the volume of medical text authored by patients and other medical novices grows rapidly. However, we lack an effective method for automatically identifying medical terms in patient-authored text (PAT). We demonstrate that crowdsourcing PAT medical term identification tasks to non-experts is a viable method for creating large, accurately-labeled PAT datasets; moreover, such datasets can be used to train classifiers that outperform existing medical term identification tools. Materials and methods To evaluate the viability of using non-expert crowds to label PAT, we compare expert (registered nurses) and non-expert (Amazon Mechanical Turk workers; Turkers) responses to a PAT medical term identification task. Next, we build a crowd-labeled dataset comprising 10 000 sentences from MedHelp. We train two models on this dataset and evaluate their performance, as well as that of MetaMap, Open Biomedical Annotator (OBA), and NaCTeM's TerMINE, against two gold standard datasets: one from MedHelp and the other from CureTogether. Results When aggregated according to a corroborative voting policy, Turker responses predict expert responses with an F1 score of 84%. A conditional random field (CRF) trained on 10 000 crowd-labeled MedHelp sentences achieves an F1 score of 78% against the CureTogether gold standard, widely outperforming OBA (47%), TerMINE (43%), and MetaMap (39%). A failure analysis of the CRF suggests that misclassified terms are likely to be either generic or rare. Conclusions Our results show that combining statistical models sensitive to sentence-level context with crowd-labeled data is a scalable and effective technique for automatically identifying medical terms in PAT. PMID:23645553
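
    The record above aggregates crowd responses with a "corroborative voting policy" and scores the result with F1 against expert labels. The abstract does not define the policy, so the sketch below assumes one plausible reading: a term is kept when at least `min_votes` workers flagged it. The function names, the threshold, and the sample terms are all hypothetical.

```python
from collections import Counter


def aggregate_labels(worker_labels, min_votes=2):
    """Keep terms flagged by at least min_votes workers (assumed policy)."""
    # set(labels) makes sure each worker contributes one vote per term.
    counts = Counter(term for labels in worker_labels for term in set(labels))
    return {term for term, votes in counts.items() if votes >= min_votes}


def f1_score(predicted, gold):
    """F1 of a predicted term set against a gold-standard term set."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical worker responses for one sentence (not data from the study).
workers = [
    ["migraine", "aura", "tired"],
    ["migraine", "aura"],
    ["migraine", "ibuprofen"],
]
expert = {"migraine", "aura", "ibuprofen"}

crowd = aggregate_labels(workers, min_votes=2)
print(sorted(crowd), round(f1_score(crowd, expert), 2))
```

    Raising `min_votes` trades recall for precision: terms that only one worker flagged are dropped, even when, as with "ibuprofen" here, the expert labels include them.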

  15. A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells.

    PubMed

    Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antzack, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J; Guindani, Michele; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco

    2016-04-01

    The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. 
Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication networks

  16. A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells

    PubMed Central

    Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antzack, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J.; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco

    2016-01-01

    Abstract The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. 
Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication

  17. Mining quasi-bicliques from HIV-1-human protein interaction network: a multiobjective biclustering approach.

    PubMed

    Maulik, Ujjwal; Mukhopadhyay, Anirban; Bhattacharyya, Malay; Kaderali, Lars; Brors, Benedikt; Bandyopadhyay, Sanghamitra; Eils, Roland

    2013-01-01

    In this work, we model the problem of mining quasi-bicliques from weighted viral-host protein-protein interaction network as a biclustering problem for identifying strong interaction modules. In this regard, a multiobjective genetic algorithm-based biclustering technique is proposed that simultaneously optimizes three objective functions to obtain dense biclusters having high mean interaction strengths. The performance of the proposed technique has been compared with that of other existing biclustering methods on an artificial data. Subsequently, the proposed biclustering method is applied on the records of biologically validated and predicted interactions between a set of HIV-1 proteins and a set of human proteins to identify strong interaction modules. For this, the entire interaction information is realized as a bipartite graph. We have further investigated the biological significance of the obtained biclusters. The human proteins involved in the strong interaction module have been found to share common biological properties and they are identified as the gateways of viral infection leading to various diseases. These human proteins can be potential drug targets for developing anti-HIV drugs. PMID:23929866

  18. Mining Quasi-Bicliques from HIV-1--Human Protein Interaction Network: A Multiobjective Biclustering Approach.

    PubMed

    Maulik, Ujjwal; Mukhopadhyay, Anirban; Bhattacharyya, Malay; Kaderali, Lars; Brors, Benedikt; Bandyopadhyay, Sanghamitra; Eils, Roland

    2012-11-28

    In this work, we model the problem of mining quasi-bicliques from weighted viral-host protein-protein interaction network as a biclustering problem for identifying strong interaction modules. In this regard, a multiobjective genetic algorithm based biclustering technique is proposed that simultaneously optimizes three objective functions to obtain dense biclusters having high mean interaction strengths. The performance of the proposed technique has been compared with that of other existing biclustering methods on an artificial data. Subsequently, the proposed biclustering method is applied on the records of biologically validated and predicted interactions between a set of HIV-1 proteins and a set of human proteins to identify strong interaction modules. For this, the entire interaction information is realized as a bipartite graph. We have further investigated the biological significance of the obtained biclusters. The human proteins involved in the strong interaction module have been found to share common biological properties and they are identified as the gateways of viral infection leading to various diseases. These human proteins can be potential drug targets for developing anti-HIV drugs. PMID:23209057

  19. Integrating Communication into Engineering Curricula: An Interdisciplinary Approach to Facilitating Transfer at New Mexico Institute of Mining and Technology

    ERIC Educational Resources Information Center

    Ford, Julie Dyke

    2012-01-01

    This program profile describes a new approach towards integrating communication within Mechanical Engineering curricula. The author, who holds a joint appointment between Technical Communication and Mechanical Engineering at New Mexico Institute of Mining and Technology, has been collaborating with Mechanical Engineering colleagues to establish a…

  20. An Approach to Developing Independent Learning and Non-Technical Skills Amongst Final Year Mining Engineering Students

    ERIC Educational Resources Information Center

    Knobbs, C. G.; Grayson, D. J.

    2012-01-01

    There is mounting evidence to show that engineers need more than technical skills to succeed in industry. This paper describes a curriculum innovation in which so-called "soft" skills, specifically inter-personal and intra-personal skills, were integrated into a final year mining engineering course. The instructional approach was designed to…

  1. The Voice of Chinese Health Consumers: A Text Mining Approach to Web-Based Physician Reviews

    PubMed Central

    Zhang, Kunpeng

    2016-01-01

    experience of finding doctors, doctors’ technical skills and bedside manner, general appreciation from patients, and description of various symptoms. Conclusions To the best of our knowledge, our work is the first study using an automated text-mining approach to analyze a large amount of unstructured textual data of Web-based physician reviews in China. Based on our analysis, we found that Chinese reviewers mainly concentrate on a few popular topics. This is consistent with the goal of Chinese online health platforms and demonstrates the health care focus in China’s health care system. Our text-mining approach reveals a new research area on how to use big data to help health care providers, health care administrators, and policy makers hear patient voices, target patient concerns, and improve the quality of care in this age of patient-centered care. Also, on the health care consumer side, our text mining technique helps patients make more informed decisions about which specialists to see without reading thousands of reviews, which is simply not feasible. In addition, our comparison analysis of Web-based physician reviews in China and the United States also indicates some cultural differences. PMID:27165558

  2. Identifying diagnostically-relevant resting state brain functional connectivity in the ventral posterior complex via genetic data mining in autism spectrum disorder.

    PubMed

    Baldwin, Philip R; Curtis, Kaylah N; Patriquin, Michelle A; Wolf, Varina; Viswanath, Humsini; Shaw, Chad; Sakai, Yasunari; Salas, Ramiro

    2016-05-01

    Exome sequencing and copy number variation analyses continue to provide novel insight to the biological bases of autism spectrum disorder (ASD). The growing speed at which massive genetic data are produced causes serious lags in analysis and interpretation of the data. Thus, there is a need to develop systematic genetic data mining processes that facilitate efficient analysis of large datasets. We report a new genetic data mining system, ProcessGeneLists, and integrated a list of ASD-related genes with currently available resources in gene expression and functional connectivity of the human brain. Our data-mining program successfully identified three primary regions of interest (ROIs) in the mouse brain: inferior colliculus, ventral posterior complex of the thalamus (VPC), and parafascicular nucleus (PFn). To understand their pathogenic relevance in ASD, we examined the resting state functional connectivity (RSFC) of the homologous ROIs in human brain with other brain regions that were previously implicated in the neuro-psychiatric features of ASD. Among them, the RSFC of the VPC with the medial frontal gyrus (MFG) was significantly more anticorrelated, whereas the RSFC of the PFn with the globus pallidus was significantly increased in children with ASD compared with healthy children. Moreover, greater values of RSFC between VPC and MFG were correlated with severity index and repetitive behaviors in children with ASD. No significant RSFC differences were detected in adults with ASD. Together, these data demonstrate the utility of our data-mining program through identifying the aberrant connectivity of thalamo-cortical circuits in children with ASD. Autism Res 2016, 9: 553-562. © 2015 International Society for Autism Research, Wiley Periodicals, Inc. PMID:26451751

  3. Correlation of HIV protease structure with Indinavir resistance: a data mining and neural networks approach

    NASA Astrophysics Data System (ADS)

    Draghici, Sorin; Cumberland, Lonnie T., Jr.; Kovari, Ladislau C.

    2000-04-01

    This paper presents some results of data mining HIV genotypic and structural data. Our aim is to try to relate structural features of HIV enzymes essential to its reproductive abilities to the drug resistance phenomenon. This paper concentrates on the HIV protease enzyme and Indinavir, which is one of the FDA-approved protease inhibitors. Our starting point was the current list of HIV mutations related to drug resistance. We used the fact that some molecular structures determined through high-resolution X-ray crystallography were available for the protease-Indinavir complex. Starting with these structures and the known mutations, we modelled the mutant proteases and studied the pattern of atomic contacts between the protease and the drug. After suitable preprocessing, these patterns have been used as the input of our data mining process. We have used both supervised and unsupervised learning techniques with the aim of understanding the relationship between structural features at a molecular level and resistance to Indinavir. The supervised learning was aimed at predicting IC90 values for arbitrary mutants. The self-organizing feature map (SOFM) was aimed at identifying those structural features that are important for drug resistance and discovering a classifier based on such features. We have used validation and cross validation to test the generalization abilities of the learning paradigm we have designed. The straightforward supervised learning was able to learn very successfully, but validation results are less than satisfactory. This is due to the insufficient number of patterns in the training set, which in turn is due to the scarcity of the available data. The data mining using SOFM was very successful. We have managed to distinguish between resistant and non-resistant mutants using structural features. 
We have been able to divide all reported HIV mutants into several categories based on their 3-dimensional molecular structures and the pattern of contacts between the mutant protease and

  4. Identifying Key Priorities for Future Palliative Care Research Using an Innovative Analytic Approach

    PubMed Central

    Riffin, Catherine; Pillemer, Karl; Chen, Emily K.; Warmington, Marcus; Adelman, Ronald D.; Reid, M. C.

    2015-01-01

    Using an innovative approach, we identified research priorities in palliative care to guide future research initiatives. We searched 7 databases (2005–2012) for review articles published on the topics of palliative and hospice–end-of-life care. The identified research recommendations (n = 648) fell into 2 distinct categories: (1) ways to improve methodological approaches and (2) specific topic areas in need of future study. The most commonly cited priority within the theme of methodological approaches was the need for enhanced rigor. Specific topics in need of future study included perspectives and needs of patients, relatives, and providers; underrepresented populations; decision-making; cost-effectiveness; provider education; spirituality; service use; and inter-disciplinary approaches to delivering palliative care. This review underscores the need for additional research on specific topics and methodologically rigorous research to inform health policy and practice. PMID:25393169

  5. BIOADI: a machine learning approach to identifying abbreviations and definitions in biological literature

    PubMed Central

    2009-01-01

    Background To automatically process large quantities of biological literature for knowledge discovery and information curation, text mining tools are becoming essential. Abbreviation recognition is related to NER and can be considered as a pair recognition task of a terminology and its corresponding abbreviation from free text. The successful identification of an abbreviation and its corresponding definition is not only a prerequisite to index terms of text databases to produce articles of related interests, but also a building block to improve existing gene mention tagging and gene normalization tools. Results Our approach to abbreviation recognition (AR) is based on machine learning, which exploits a novel set of rich features to learn rules from training data. Tested on the AB3P corpus, our system demonstrated an F-score of 89.90% with 95.86% precision at 84.64% recall, higher than the result achieved by the existing best AR performance system. We also annotated a new corpus of 1200 PubMed abstracts which was derived from the BioCreative II gene normalization corpus. On our annotated corpus, our system achieved an F-score of 86.20% with 93.52% precision at 79.95% recall, which also outperforms all tested systems. Conclusion By applying our system to extract all short form-long form pairs from all available PubMed abstracts, we have constructed BIOADI. Mining BIOADI reveals many interesting trends of bio-medical research. In addition, we provide offline AR software in the download section on http://bioagent.iis.sinica.edu.tw/BIOADI/. PMID:19958517
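
    The reported scores can be checked against the reported precision and recall. The sketch below assumes the abstract's F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-score" is F1; it does not say so explicitly.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs quoted in the abstract, in percent.
ab3p = f_score(95.86, 84.64)        # AB3P corpus
new_corpus = f_score(93.52, 79.95)  # newly annotated 1200-abstract corpus

print(f"AB3P: {ab3p:.2f}%  new corpus: {new_corpus:.2f}%")
```

    Under that assumption, both values round to the figures given in the abstract (89.90% and 86.20%).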

  6. Model-based approach to the detection and classification of mines in sidescan sonar.

    PubMed

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-10

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis. PMID:14735943
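
    The final classification step relies on Dempster-Shafer theory. As a rough illustration of how two sources of evidence are fused in that framework, the sketch below implements Dempster's rule of combination. The class labels, the two evidence sources and all mass values are hypothetical and are not taken from the paper, whose actual feature set and mass assignments are not given in the abstract.

```python
from itertools import product


def combine(m1: dict, m2: dict) -> dict:
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1].
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise so the remaining masses sum to one.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical frame of discernment and evidence (illustrative values only).
MINE = frozenset({"mine"})
ROCK = frozenset({"rock"})
EITHER = MINE | ROCK  # mass on the whole frame expresses ignorance

shadow_evidence = {MINE: 0.6, EITHER: 0.4}
highlight_evidence = {MINE: 0.5, ROCK: 0.2, EITHER: 0.3}

fused = combine(shadow_evidence, highlight_evidence)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

    Assigning mass to the whole frame (`EITHER`) lets a source express partial ignorance, which is one reason the framework suits features extracted from noisy imagery.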

  7. Model-based approach to the detection and classification of mines in sidescan sonar

    NASA Astrophysics Data System (ADS)

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-01

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis.

  8. Systematic Analysis of the Molecular Mechanism Underlying Decidualization Using a Text Mining Approach.

    PubMed

    Liu, Ji-Long; Wang, Tong-Song

    2015-01-01

    Decidualization is a crucial process for successful embryo implantation and pregnancy in humans. Defects in decidualization during early pregnancy are associated with several pregnancy complications, such as pre-eclampsia, intrauterine growth restriction and recurrent pregnancy loss. However, the mechanism underlying decidualization remains poorly understood. In the present study, we performed a systematic analysis of decidualization-related genes using text mining. We identified 286 genes for humans and 287 genes for mice, with an overlap of 111 genes shared by both species. Through enrichment tests, we demonstrated that although divergence was observed, the majority of enriched gene ontology terms and pathways were shared by both species, suggesting that functional categories were more conserved than individual genes. We further constructed a decidualization-related protein-protein interaction network consisting of 344 nodes connected via 1,541 edges. We prioritized genes in this network and identified 12 genes that may be key regulators of decidualization. These findings provide some clues for further research on the mechanism underlying decidualization. PMID:26222155

  9. Systematic Analysis of the Molecular Mechanism Underlying Decidualization Using a Text Mining Approach

    PubMed Central

    Liu, Ji-Long; Wang, Tong-Song

    2015-01-01

    Decidualization is a crucial process for successful embryo implantation and pregnancy in humans. Defects in decidualization during early pregnancy are associated with several pregnancy complications, such as pre-eclampsia, intrauterine growth restriction and recurrent pregnancy loss. However, the mechanism underlying decidualization remains poorly understood. In the present study, we performed a systematic analysis of decidualization-related genes using text mining. We identified 286 genes for humans and 287 genes for mice, with an overlap of 111 genes shared by both species. Through enrichment tests, we demonstrated that although divergence was observed, the majority of enriched gene ontology terms and pathways were shared by both species, suggesting that functional categories were more conserved than individual genes. We further constructed a decidualization-related protein-protein interaction network consisting of 344 nodes connected via 1,541 edges. We prioritized genes in this network and identified 12 genes that may be key regulators of decidualization. These findings provide some clues for further research on the mechanism underlying decidualization. PMID:26222155

  10. Optimizing data collection for public health decisions: a data mining approach

    PubMed Central

    2014-01-01

    Background Collecting data can be cumbersome and expensive. Lack of relevant, accurate and timely data for research to inform policy may negatively impact public health. The aim of this study was to test if the careful removal of items from two community nutrition surveys guided by a data mining technique called feature selection, can (a) identify a reduced dataset, while (b) not damaging the signal inside that data. Methods The Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed on 885 retail food outlets in two counties in West Virginia between May and November of 2011. A reduced dataset was identified for each outlet type using feature selection. Coefficients from linear regression modeling were used to weight items in the reduced datasets. Weighted item values were summed with the error term to compute reduced item survey scores. Scores produced by the full survey were compared to the reduced item scores using a Wilcoxon rank-sum test. Results Feature selection identified 9 store and 16 restaurant survey items as significant predictors of the score produced from the full survey. The linear regression models built from the reduced feature sets had R2 values of 92% and 94% for restaurant and grocery store data, respectively. Conclusions While there are many potentially important variables in any domain, the most useful set may only be a small subset. The use of feature selection in the initial phase of data collection to identify the most influential variables may be a useful tool to greatly reduce the amount of data needed thereby reducing cost. PMID:24919484

  11. The impact of vascular diameter ratio on hemodialysis maturation time: Evidence from data mining approaches and thermodynamics law

    PubMed Central

    Rezapour, Mohammad; Taran, Somayeh; Balin Parast, Mahmood; Khavanin Zadeh, Morteza

    2016-01-01

    Background: Vascular Access (VA) is an important aspect of blood circulation in Hemodialysis (HD). Arteriovenous Fistula (AVF) is a suitable procedure to gain VA. Maturation of the AVF is the state in which the AVF can be cannulated for HD. This study aimed to discover the parameters that effectively reduce the duration between VA and the start of HD, which represents the maturation time (MT). Methods: Ninety-six patients who underwent AVF creation were selected for this study. The decision tree method was used based on the CART/C4.5 algorithm, which is one of the data mining approaches for data classification. The vascular diameter ratio (VDR) coefficient was obtained (VDR=Artery/Vein diameters). Results: We investigated the relationship between the VDR and MT in this study and found that MT is inversely related to VDR in elderly patients, while this relation was direct in younger patients. Conclusion: The analysis revealed a Spearman's correlation coefficient for Vein diameter with MT. MT decreases when diameters of vein and artery are close to one another. This study can help surgeons to identify high-risk patients with prolonged MT for HD. PMID:27453889

  12. Using a Linkage Mapping Approach to Identify QTL for Day-Neutrality in the Octoploid Strawberry

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A linkage mapping approach was used to identify quantitative trait loci (QTL) associated with day-neutrality in the commercial strawberry, Fragaria ×ananassa (Duch ex Rozier). Amplified Fragment Length Polymorphic (AFLP) markers were used to build a genetic map with a population of 127 lines develo...

  13. Identifying Useful Auxiliary Variables for Incomplete Data Analyses: A Note on a Group Difference Examination Approach

    ERIC Educational Resources Information Center

    Raykov, Tenko; Marcoulides, George A.

    2014-01-01

    This research note contributes to the discussion of methods that can be used to identify useful auxiliary variables for analyses of incomplete data sets. A latent variable approach is discussed, which is helpful in finding auxiliary variables with the property that if included in subsequent maximum likelihood analyses they may enhance considerably…

  14. APPLICATION OF A TIERED SURROGATE APPROACH TO IDENTIFY TOXICITY SURROGATES FOR HUMAN HEALTH RISK ASSESSMENT

    EPA Science Inventory

    APPLICATION OF A TIERED SURROGATE APPROACH TO IDENTIFY TOXICITY SURROGATES FOR HUMAN HEALTH RISK ASSESSMENT. P.R. Dodmane1, L.E. Lizarraga1, J.P. Kaiser2, S.C. Wesselkamper2, Q.J. Zhao2. 1ORISE Participant, U.S. EPA, National Center for Environmental Assessment (NCEA), Cincinnati...

  15. Doing the Work of Extension: Three Approaches to Identify, Amplify, and Implement Outreach

    ERIC Educational Resources Information Center

    Raison, Brian

    2014-01-01

    This article explores the literature and practice of how the Cooperative Extension Service does its work and asks if traditional outreach and engagement models have room for innovative delivery mechanisms that may identify emerging trends and help meet community needs. It considers three innovative approaches to the educational mission:…

  16. A Comprehensive Approach to Identifying Intervention Targets for Patient-Safety Improvement in a Hospital Setting

    ERIC Educational Resources Information Center

    Cunningham, Thomas R.; Geller, E. Scott

    2012-01-01

    Despite differences in approaches to organizational problem solving, healthcare managers and organizational behavior management (OBM) practitioners share a number of practices, and connecting healthcare management with OBM may lead to improvements in patient safety. A broad needs-assessment methodology was applied to identify patient-safety…

  17. The Baby TALK Model: An Innovative Approach to Identifying High-Risk Children and Families

    ERIC Educational Resources Information Center

    Villalpando, Aimee Hilado; Leow, Christine; Hornstein, John

    2012-01-01

    This research report examines the Baby TALK model, an innovative early childhood intervention approach used to identify, recruit, and serve young children who are at-risk for developmental delays, mental health needs, and/or school failure, and their families. The report begins with a description of the model. This description is followed by an…

  18. Identifying Core Mobile Learning Faculty Competencies Based Integrated Approach: A Delphi Study

    ERIC Educational Resources Information Center

    Elbarbary, Rafik Said

    2015-01-01

    This study is based on the integrated approach as a conceptual framework to identify, categorize, and rank the key components of mobile learning core competencies for Egyptian faculty members in higher education. The field investigation framework used a four-round Delphi technique to determine the importance rate of each component of core competencies…

  19. An Information Theoretic Approach for Identifying Shared Information and Asymmetric Relationships among Variables.

    ERIC Educational Resources Information Center

    Golden, Linda L.; And Others

    1990-01-01

    The general-information-theoretic approach was used to identify informational overlap and asymmetry between variables, using affective, cognitive, and behavioral measures. Using the chi-squared test, no significant differences were found in response rates, demographics, or patronage frequency of three stores between numerical (n=453) and graphic…

  20. A Function-First Approach to Identifying Formulaic Language in Academic Writing

    ERIC Educational Resources Information Center

    Durrant, Philip; Mathews-Aydinli, Julie

    2011-01-01

    There is currently much interest in creating pedagogically-oriented descriptions of formulaic language. Research in this area has typically taken what we call a "form-first" approach, in which formulas are identified as the most frequent recurrent forms in a relevant corpus. While this research continues to yield valuable results, the present…

  1. Identifying Bioaccumulative Halogenated Organic Compounds Using a Nontargeted Analytical Approach: Seabirds as Sentinels

    PubMed Central

    Millow, Christopher J.; Mackintosh, Susan A.; Lewison, Rebecca L.; Dodder, Nathan G.; Hoh, Eunha

    2015-01-01

    Persistent organic pollutants (POPs) are typically monitored via targeted mass spectrometry, which potentially identifies only a fraction of the contaminants actually present in environmental samples. With new anthropogenic compounds continuously introduced to the environment, novel and proactive approaches that provide a comprehensive alternative to targeted methods are needed in order to more completely characterize the diversity of known and unknown compounds likely to cause adverse effects. Nontargeted mass spectrometry attempts to extensively screen for compounds, providing a feasible approach for identifying contaminants that warrant future monitoring. We employed a nontargeted analytical method using comprehensive two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GC×GC/TOF-MS) to characterize halogenated organic compounds (HOCs) in California Black skimmer (Rynchops niger) eggs. Our study identified 111 HOCs; 84 of these compounds were regularly detected via targeted approaches, while 27 were classified as typically unmonitored or unknown. Typically unmonitored compounds of note in bird eggs included tris(4-chlorophenyl)methane (TCPM), tris(4-chlorophenyl)methanol (TCPMOH), triclosan, permethrin, heptachloro-1'-methyl-1,2'-bipyrrole (MBP), as well as four halogenated unknown compounds that could not be identified through database searching or the literature. The presence of these compounds in Black skimmer eggs suggests they are persistent, bioaccumulative, potentially biomagnifying, and maternally transferring. Our results highlight the utility and importance of employing nontargeted analytical tools to assess true contaminant burdens in organisms, as well as to demonstrate the value in using environmental sentinels to proactively identify novel contaminants. PMID:26020245

  2. Application of techniques to identify coal-mine and power-generation effects on surface-water quality, San Juan River basin, New Mexico and Colorado

    USGS Publications Warehouse

    Goetz, C.L.; Abeyta, Cynthia G.; Thomas, E.V.

    1987-01-01

    Numerous analytical techniques were applied to determine water quality changes in the San Juan River basin upstream of Shiprock, New Mexico. Eight techniques were used to analyze hydrologic data such as precipitation, water quality, and streamflow. The eight methods used are: (1) Piper diagram, (2) time-series plot, (3) frequency distribution, (4) box-and-whisker plot, (5) seasonal Kendall test, (6) Wilcoxon rank-sum test, (7) SEASRS procedure, and (8) analysis of flow-adjusted specific conductance data and smoothing. Post-1963 changes in dissolved solids concentration, dissolved potassium concentration, specific conductance, suspended sediment concentration, or suspended sediment load in the San Juan River downstream from the surface coal mines were examined to determine if coal mining was having an effect on the quality of surface water. None of the analytical methods used to analyze the data showed any increase in dissolved solids concentration, dissolved potassium concentration, or specific conductance in the river downstream from the mines; some of the analytical methods used showed a decrease in dissolved solids concentration and specific conductance. Chaco River, an ephemeral stream tributary to the San Juan River, undergoes changes in water quality due to effluent from a power generation facility. The discharge in the Chaco River contributes about 1.9% of the average annual discharge at the downstream station, San Juan River at Shiprock, NM. The changes in water quality detected at the Chaco River station were not detected at the downstream Shiprock station. It was not possible, with the available data, to identify any effects of the surface coal mines on water quality that were separable from those of urbanization, agriculture, and other cultural and natural changes. In order to determine the specific causes of changes in water quality, it would be necessary to collect additional data at strategically located stations. (Author's abstract)

  3. VALUING ACID MINE DRAINAGE REMEDIATION IN WEST VIRGINIA: A HEDONIC MODELING APPROACH

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD). Appalachian states have an especially large number of contaminated streams and rivers, and the USGS places AMD as the primary source...

  4. VALUING ACID MINE DRAINAGE REMEDIATION IN WEST VIRGINIA: A HEDONIC MODELING APPROACH INCORPORATING GEOGRAPHIC INFORMATION SYSTEMS

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD). Appalachian states have an especially large number of contaminated streams and rivers, and the USGS places AMD as the primary source...

  5. A Data Mining Approach to Reveal Representative Collaboration Indicators in Open Collaboration Frameworks

    ERIC Educational Resources Information Center

    Anaya, Antonio R.; Boticario, Jesus G.

    2009-01-01

    Data mining methods are successful in educational environments to discover new knowledge or learner skills or features. Unfortunately, they have not been used in depth with collaboration. We have developed a scalable data mining method, whose objective is to infer information on the collaboration during the collaboration process in a…

  6. Identifying inhibitory compounds in lignocellulosic biomass hydrolysates using an exometabolomics approach

    PubMed Central

    2014-01-01

    Background Inhibitors that reduce the fermentation performance of fermenting yeast are formed during the pretreatment process of lignocellulosic biomass. An exometabolomics approach was applied to systematically identify inhibitors in lignocellulosic biomass hydrolysates. Results We studied the composition and fermentability of 24 different biomass hydrolysates. To create diversity, the 24 hydrolysates were prepared from six different biomass types, namely sugar cane bagasse, corn stover, wheat straw, barley straw, willow wood chips and oak sawdust, and with four different pretreatment methods, i.e. dilute acid, mild alkaline, alkaline/peracetic acid and concentrated acid. Their composition and that of fermentation samples generated with these hydrolysates were analyzed with two GC-MS methods. Either ethyl acetate extraction or ethyl chloroformate derivatization was used before conducting GC-MS to prevent sugars from overloading the chromatograms, which would obscure the detection of less abundant compounds. Using multivariate PLS-2CV and nPLS-2CV data analysis models, potential inhibitors were identified by establishing the relationship between fermentability and composition of the hydrolysates. These identified compounds were tested for their effects on the growth of the model yeast, Saccharomyces cerevisiae CEN.PK 113-7D, confirming that the majority of the identified compounds were indeed inhibitors. Conclusion Inhibitory compounds in lignocellulosic biomass hydrolysates were successfully identified using a non-targeted systematic approach: metabolomics. The identified inhibitors include both known ones, such as furfural, HMF and vanillin, and novel inhibitors, namely sorbic acid and phenylacetaldehyde. PMID:24655423

  7. Missing defects? A comparison of microscopic and macroscopic approaches to identifying linear enamel hypoplasia.

    PubMed

    Hassett, Brenna R

    2014-03-01

    Linear enamel hypoplasia (LEH), the presence of linear defects of dental enamel formed during periods of growth disruption, is frequently analyzed in physical anthropology as evidence for childhood health in the past. However, a wide variety of methods for identifying and interpreting these defects in archaeological remains exists, preventing easy cross-comparison of results from disparate studies. This article compares a standard approach to identifying LEH using the naked eye to the evidence of growth disruption observed microscopically from the enamel surface. This comparison demonstrates that what is interpreted as evidence of growth disruption microscopically is not uniformly identified with the naked eye, and provides a reference for the level of consistency between the number and timing of defects identified using microscopic versus macroscopic approaches. This is done for different tooth types using a large sample of unworn permanent teeth drawn from several post-medieval London burial assemblages. The resulting schematic diagrams showing where macroscopic methods achieve more or less similar results to microscopic methods are presented here and clearly demonstrate that "naked-eye" methods of identifying growth disruptions do not identify LEH as often as microscopic methods in areas where perikymata are more densely packed. PMID:24323494

  8. Quantitative and qualitative approaches to identifying migration chronology in a continental migrant

    USGS Publications Warehouse

    Beatty, William S.; Kesler, Dylan C.; Webb, Elisabeth B.; Raedeke, Andrew H.; Naylor, Luke W.; Humburg, Dale D.

    2013-01-01

    The degree to which extrinsic factors influence migration chronology in North American waterfowl has not been quantified, particularly for dabbling ducks. Previous studies have examined waterfowl migration using various methods, however, quantitative approaches to define avian migration chronology over broad spatio-temporal scales are limited, and the implications for using different approaches have not been assessed. We used movement data from 19 female adult mallards (Anas platyrhynchos) equipped with solar-powered global positioning system satellite transmitters to evaluate two individual level approaches for quantifying migration chronology. The first approach defined migration based on individual movements among geopolitical boundaries (state, provincial, international), whereas the second method modeled net displacement as a function of time using nonlinear models. Differences in migration chronologies identified by each of the approaches were examined with analysis of variance. The geopolitical method identified mean autumn migration midpoints at 15 November 2010 and 13 November 2011, whereas the net displacement method identified midpoints at 15 November 2010 and 14 November 2011. The mean midpoints for spring migration were 3 April 2011 and 20 March 2012 using the geopolitical method and 31 March 2011 and 22 March 2012 using the net displacement method. The duration, initiation date, midpoint, and termination date for both autumn and spring migration did not differ between the two individual level approaches. Although we did not detect differences in migration parameters between the different approaches, the net displacement metric offers broad potential to address questions in movement ecology for migrating species. Ultimately, an objective definition of migration chronology will allow researchers to obtain a comprehensive understanding of the extrinsic factors that drive migration at the individual and population levels. As a result, targeted
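
    The second approach in the abstract models net displacement as a function of time. The sketch below shows only the first half of that idea, computing net displacement (great-circle distance from the first recorded location) for a sequence of GPS fixes. The track format, function names and the use of the haversine formula are assumptions made for illustration; the nonlinear model fitting described in the abstract is not reproduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def net_displacement(track):
    """Return (elapsed_days, distance_from_start_km) for each fix.

    `track` is a list of (day, latitude, longitude) tuples in time order;
    this layout is hypothetical, not the study's data format.
    """
    day0, lat0, lon0 = track[0]
    return [(day - day0, haversine_km(lat0, lon0, lat, lon)) for day, lat, lon in track]


# Made-up fixes for a bird moving due south along one meridian.
example_track = [(0, 50.0, -95.0), (10, 49.0, -95.0), (20, 40.0, -95.0)]
series = net_displacement(example_track)
for elapsed, dist in series:
    print(elapsed, round(dist, 1))
```

    A curve fitted to such a series would yield timing parameters such as a migration midpoint, which is the kind of quantity the abstract compares between the two approaches.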

  9. Quantitative and Qualitative Approaches to Identifying Migration Chronology in a Continental Migrant

    PubMed Central

    Beatty, William S.; Kesler, Dylan C.; Webb, Elisabeth B.; Raedeke, Andrew H.; Naylor, Luke W.; Humburg, Dale D.

    2013-01-01

    The degree to which extrinsic factors influence migration chronology in North American waterfowl has not been quantified, particularly for dabbling ducks. Previous studies have examined waterfowl migration using various methods, however, quantitative approaches to define avian migration chronology over broad spatio-temporal scales are limited, and the implications for using different approaches have not been assessed. We used movement data from 19 female adult mallards (Anas platyrhynchos) equipped with solar-powered global positioning system satellite transmitters to evaluate two individual level approaches for quantifying migration chronology. The first approach defined migration based on individual movements among geopolitical boundaries (state, provincial, international), whereas the second method modeled net displacement as a function of time using nonlinear models. Differences in migration chronologies identified by each of the approaches were examined with analysis of variance. The geopolitical method identified mean autumn migration midpoints at 15 November 2010 and 13 November 2011, whereas the net displacement method identified midpoints at 15 November 2010 and 14 November 2011. The mean midpoints for spring migration were 3 April 2011 and 20 March 2012 using the geopolitical method and 31 March 2011 and 22 March 2012 using the net displacement method. The duration, initiation date, midpoint, and termination date for both autumn and spring migration did not differ between the two individual level approaches. Although we did not detect differences in migration parameters between the different approaches, the net displacement metric offers broad potential to address questions in movement ecology for migrating species. Ultimately, an objective definition of migration chronology will allow researchers to obtain a comprehensive understanding of the extrinsic factors that drive migration at the individual and population levels. As a result, targeted

  10. A cross-species bi-clustering approach to identifying conserved co-regulated genes

    PubMed Central

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-01-01

    Motivation: A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective at dealing with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. Results: We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector, where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages in which the clustered genes are co-regulated. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on

  11. Integrative network-based approach identifies key genetic elements in breast invasive carcinoma

    PubMed Central

    2015-01-01

    Background Breast cancer is a genetically heterogeneous type of cancer that belongs to the most prevalent types with a high mortality rate. Treatment and prognosis of breast cancer would profit largely from a correct classification and identification of genetic key drivers and major determinants driving the tumorigenesis process. In the light of the availability of tumor genomic and epigenomic data from different sources and experiments, new integrative approaches are needed to boost the probability of identifying such genetic key drivers. We present here an integrative network-based approach that is able to associate regulatory network interactions with the development of breast carcinoma by integrating information from gene expression, DNA methylation, miRNA expression, and somatic mutation datasets. Results Our results showed strong association between regulatory elements from different data sources in terms of the mutual regulatory influence and genomic proximity. By analyzing different types of regulatory interactions, TF-gene, miRNA-mRNA, and proximity analysis of somatic variants, we identified 106 genes, 68 miRNAs, and 9 mutations that are candidate drivers of oncogenic processes in breast cancer. Moreover, we unraveled regulatory interactions among these key drivers and the other elements in the breast cancer network. Intriguingly, about one third of the identified driver genes are targeted by known anti-cancer drugs and the majority of the identified key miRNAs are implicated in cancerogenesis of multiple organs. Also, the identified driver mutations likely cause damaging effects on protein functions. The constructed gene network and the identified key drivers were compared to well-established network-based methods. Conclusion The integrated molecular analysis enabled by the presented network-based approach substantially expands our knowledge base of prospective genomic drivers of genes, miRNAs, and mutations. 
    For a good part of the identified key drivers

  12. A comparison of approaches for finding minimum identifying codes on graphs

    NASA Astrophysics Data System (ADS)

    Horan, Victoria; Adachi, Steve; Bak, Stanley

    2016-05-01

    In order to formulate mathematical conjectures likely to be true, a number of base cases must be determined. However, many combinatorial problems are NP-hard, and the computational complexity makes this research approach difficult when using standard brute force on a typical computer. One sample problem explored is that of finding a minimum identifying code. To work around the computational issues, a variety of methods are explored: a parallel computing approach using MATLAB, an adiabatic quantum optimization approach using a D-Wave quantum annealing processor, and satisfiability modulo theory (SMT) with corresponding SMT solvers. Each of these methods requires the problem to be formulated in a unique manner. In this paper, we address the challenges of computing solutions to this NP-hard problem with respect to each of these methods.
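
To make the problem concrete, here is a minimal brute-force sketch for small graphs. It assumes the standard definition (a vertex subset C is an identifying code if every vertex's closed neighbourhood meets C in a non-empty set, and these sets are pairwise distinct). The function names are hypothetical and none of this is taken from the paper, which used MATLAB, a D-Wave annealer and SMT solvers.

```python
from itertools import combinations


def closed_neighborhoods(n, edges):
    """Closed neighbourhood N[v] = {v} plus the neighbours of v, for v in 0..n-1."""
    nbrs = [{v} for v in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def is_identifying_code(code, nbrs):
    """True if every N[v] & code is non-empty and no two vertices share it."""
    code = set(code)
    seen = set()
    for nb in nbrs:
        ident = frozenset(nb & code)
        if not ident or ident in seen:
            return False
        seen.add(ident)
    return True


def minimum_identifying_code(n, edges):
    """Try subsets in order of increasing size; exponential in n."""
    nbrs = closed_neighborhoods(n, edges)
    for k in range(1, n + 1):
        for code in combinations(range(n), k):
            if is_identifying_code(code, nbrs):
                return code
    return None  # no code exists (two vertices have equal closed neighbourhoods)


# Path on four vertices 0-1-2-3
print(minimum_identifying_code(4, [(0, 1), (1, 2), (2, 3)]))  # (0, 1, 2)
```

The search visits up to 2^n subsets, which is the blow-up that motivates the alternative formulations described in the abstract.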

  13. Multi-variate flood damage assessment: a tree-based data-mining approach

    NASA Astrophysics Data System (ADS)

    Merz, B.; Kreibich, H.; Lall, U.

    2013-01-01

    The usual approach for flood damage assessment consists of stage-damage functions which relate the relative or absolute damage for a certain class of objects to the inundation depth. Other characteristics of the flooding situation and of the flooded object are rarely taken into account, although flood damage is influenced by a variety of factors. We apply a group of data-mining techniques, known as tree-structured models, to flood damage assessment. A very comprehensive data set of more than 1000 records of direct building damage of private households in Germany is used. Each record contains details about a large variety of potential damage-influencing characteristics, such as hydrological and hydraulic aspects of the flooding situation, early warning and emergency measures undertaken, state of precaution of the household, building characteristics and socio-economic status of the household. Regression trees and bagging decision trees are used to select the more important damage-influencing variables and to derive multi-variate flood damage models. It is shown that these models outperform existing models, and that tree-structured models are a promising alternative to traditional damage models.
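
The core step of a regression tree is choosing the variable and threshold that best split the damage records. The sketch below shows that single step on invented toy records. The feature names `water_depth_m` and `precaution` and all values are assumptions for illustration, not the German data set, and the bagging used in the paper is omitted.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, damage):
    """Return (cost, feature, threshold) minimising the summed SSE of both sides.

    rows   -- list of dicts mapping feature name to a numeric value
    damage -- list of observed damage values, one per row
    """
    best = None
    for feat in rows[0]:
        values = sorted({r[feat] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [d for r, d in zip(rows, damage) if r[feat] <= thr]
            right = [d for r, d in zip(rows, damage) if r[feat] > thr]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, feat, thr)
    return best


# Invented toy records: damage depends on precaution, not on depth.
rows = [
    {"water_depth_m": 0.5, "precaution": 0},
    {"water_depth_m": 1.5, "precaution": 0},
    {"water_depth_m": 0.5, "precaution": 1},
    {"water_depth_m": 1.5, "precaution": 1},
]
damage = [20.0, 20.0, 5.0, 5.0]
print(best_split(rows, damage))  # (0.0, 'precaution', 0.5)
```

A stage-damage function would split on depth alone. In this toy case the tree picks the precaution variable instead, which is the kind of multi-variate effect the abstract refers to.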

  14. Identifying potential adverse effects using the web: a new approach to medical hypothesis generation

    PubMed Central

    Benton, Adrian; Ungar, Lyle; Hill, Shawndra; Hennessy, Sean; Mao, Jun; Chung, Annie; Leonard, Charles E.; Holmes, John H.

    2011-01-01

    Medical message boards are online resources where users with a particular condition exchange information, some of which they might not otherwise share with medical providers. Many of these boards contain a large number of posts and contain patient opinions and experiences that would be potentially useful to clinicians and researchers. We present an approach that is able to collect a corpus of medical message board posts, de-identify the corpus, and extract information on potential adverse drug effects discussed by users. Using a corpus of posts to breast cancer message boards, we identified drug event pairs using co-occurrence statistics. We then compared the identified drug event pairs with adverse effects listed on the package labels of tamoxifen, anastrozole, exemestane, and letrozole. Of the pairs identified by our system, 75–80% were documented on the drug labels. Some of the undocumented pairs may represent previously unidentified adverse drug effects. PMID:21820083
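
The abstract does not say which co-occurrence statistic was used. As one plausible sketch, the code below counts the posts that mention both a drug and an event term and scores each pair by pointwise mutual information (PMI). The PMI choice, the event lexicon, the function name and the toy posts are all assumptions for illustration. Only the drug names come from the abstract.

```python
import math
import re
from collections import Counter
from itertools import product


def cooccurrence_scores(posts, drugs, events):
    """Return {(drug, event): (posts_with_both, PMI in bits)}.

    drugs and events are sets of lower-case single-word terms.
    """
    n = len(posts)
    drug_n, event_n, pair_n = Counter(), Counter(), Counter()
    for post in posts:
        tokens = set(re.findall(r"[a-z]+", post.lower()))
        d_found, e_found = tokens & drugs, tokens & events
        drug_n.update(d_found)
        event_n.update(e_found)
        pair_n.update(product(d_found, e_found))
    return {
        (d, e): (c, math.log2(c * n / (drug_n[d] * event_n[e])))
        for (d, e), c in pair_n.items()
    }


# Invented toy posts, not data from the study.
posts = [
    "tamoxifen gave me hot flashes and nausea",
    "nausea again after tamoxifen",
    "letrozole and joint pain",
    "no problems this week",
]
scores = cooccurrence_scores(posts, {"tamoxifen", "letrozole"}, {"nausea", "pain"})
print(scores[("tamoxifen", "nausea")])  # (2, 1.0)
```

Pairs scored this way could then be compared against the adverse effects on the package labels, as the abstract describes. Multi-word event terms and de-identification are left out of this sketch.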

  15. An information-theoretic approach to assess practical identifiability of parametric dynamical systems.

    PubMed

    Pant, Sanjay; Lombardi, Damiano

    2015-10-01

    A new approach for assessing parameter identifiability of dynamical systems in a Bayesian setting is presented. The concept of Shannon entropy is employed to measure the inherent uncertainty in the parameters. The expected reduction in this uncertainty is seen as the amount of information one expects to gain about the parameters due to the availability of noisy measurements of the dynamical system. Such expected information gain is interpreted in terms of the variance of a hypothetical measurement device that can measure the parameters directly, and is related to practical identifiability of the parameters. If the individual parameters are unidentifiable, correlation between parameter combinations is assessed through conditional mutual information to determine which sets of parameters can be identified together. The information theoretic quantities of entropy and information are evaluated numerically through a combination of Monte Carlo and k-nearest neighbour methods in a non-parametric fashion. Unlike many methods to evaluate identifiability proposed in the literature, the proposed approach takes the measurement-noise into account and is not restricted to any particular noise-structure. Whilst computationally intensive for large dynamical systems, it is easily parallelisable and is non-intrusive as it does not necessitate re-writing of the numerical solvers of the dynamical system. The application of such an approach is presented for a variety of dynamical systems--ranging from systems governed by ordinary differential equations to partial differential equations--and, where possible, validated against results previously published in the literature. PMID:26292167
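
For intuition only, consider the simplest special case: a scalar parameter with a Gaussian prior, observed directly with additive Gaussian noise. There the expected information gain has a closed form, and it can be turned back into the noise variance of an equivalent direct measurement device. This is a sketch under those assumptions with hypothetical function names. The paper's method is numerical (Monte Carlo and k-nearest neighbour estimators) and is not restricted to Gaussian noise.

```python
import math


def gaussian_information_gain(prior_var, noise_var):
    """Mutual information (nats) between theta ~ N(0, prior_var)
    and y = theta + noise, with noise ~ N(0, noise_var)."""
    return 0.5 * math.log(1.0 + prior_var / noise_var)


def equivalent_device_variance(prior_var, gain_nats):
    """Noise variance of a direct measurement of theta that would
    yield the same information gain (inverse of the formula above)."""
    return prior_var / math.expm1(2.0 * gain_nats)


gain = gaussian_information_gain(prior_var=4.0, noise_var=1.0)
print(round(gain, 4))  # 0.8047, i.e. 0.5 * ln(5)
print(round(equivalent_device_variance(4.0, gain), 6))  # 1.0
```

A small gain maps to a very noisy hypothetical device, which corresponds to a parameter that is poorly identifiable in practice.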

  16. Ab initio thermodynamic approach to identify mixed solid sorbents for CO2 capture technology

    DOE PAGESBeta

    Duan, Yuhua

    2015-10-15

    Because the current technologies for capturing CO2 are still too energy intensive, new materials must be developed that can capture CO2 reversibly with acceptable energy costs. At a given CO2 pressure, the turnover temperature (Tt) of the reaction of an individual solid that can capture CO2 is fixed. Such Tt may be outside the operating temperature range (ΔTo) for a practical capture technology. To adjust Tt to fit the practical ΔTo, in this study, three scenarios of mixing schemes are explored by combining thermodynamic database mining with first principles density functional theory and phonon lattice dynamics calculations. Our calculated results demonstrate that by mixing different types of solids, it’s possible to shift Tt to the range of practical operating temperature conditions. According to the requirements imposed by the pre- and post- combustion technologies and based on our calculated thermodynamic properties for the CO2 capture reactions by the mixed solids of interest, we were able to identify the mixing ratios of two or more solids to form new sorbent materials for which lower capture energy costs are expected at the desired pressure and temperature conditions.

  17. Identifying Prognostic Features by Bottom-Up Approach and Correlating to Drug Repositioning

    PubMed Central

    Li, Wei; Yu, Jian; Lian, Baofeng; Sun, Han; Li, Jing; Zhang, Menghuan; Li, Ling; Li, Yixue; Liu, Qian; Xie, Lu

    2015-01-01

    Background Traditionally, a top-down method has been used to identify prognostic features in cancer research: genes differentially expressed, usually in cancer versus normal tissue, are identified and then tested for survival prediction power. The problem is that prognostic features identified from one set of patient samples can rarely be transferred to other datasets. We apply a bottom-up approach in this study: survival-correlated or clinical-stage-correlated genes were selected first and then prioritized by their network topology, so that a small set of features can be used as a prognostic signature. Methods Gene expression profiles of a cohort of 221 hepatocellular carcinoma (HCC) patients were used as a training set. The ‘bottom-up’ approach was applied to discover gene-expression signatures associated with survival in both tumor and adjacent non-tumor tissues, and was compared with the ‘top-down’ approach. The results were validated in a second cohort of 82 patients, which was used as a testing set. Results Two sets of gene signatures, separately identified in tumor and adjacent non-tumor tissues by the bottom-up approach, were developed in the training cohort. These two signatures were associated with overall survival times of HCC patients, the robustness of each was validated in the testing set, and the predictive performance of each was better than that of previously reported gene expression signatures. Moreover, genes in these two prognostic signatures gave some indications for drug repositioning in HCC: some approved drugs targeting these markers have alternative indications in hepatocellular carcinoma. Conclusion Using the bottom-up approach, we have developed two prognostic gene signatures with a limited number of genes that are associated with overall survival times of patients with HCC. Furthermore, prognostic markers in these two signatures have the potential to be therapeutic targets. PMID:25738841

  18. A rule-based approach for identifying obesity and its comorbidities in medical discharge summaries.

    PubMed

    Mishra, Ninad K; Cummo, David M; Arnzen, James J; Bonander, Jason

    2009-01-01

    OBJECTIVE Evaluate the effectiveness of a simple rule-based approach in classifying medical discharge summaries according to indicators for obesity and 15 associated co-morbidities as part of the 2008 i2b2 Obesity Challenge. METHODS The authors applied a rule-based approach that looked for occurrences of morbidity-related keywords and identified the types of assertions in which those keywords occurred. The documents were then classified using a simple scoring algorithm based on a mapping of the assertion types to possible judgment categories. MEASUREMENTS Results for the challenge were evaluated based on macro F-measure. We report micro and macro F-measure results for all morbidities combined and for each morbidity separately. RESULTS Our rule-based approach achieved micro and macro F-measures of 0.97 and 0.77, respectively, ranking fifth out of the entries submitted by 28 teams participating in the classification task based on textual judgments and substantially outperforming the average for the challenge. CONCLUSIONS As shown by its ranking in the challenge results, this approach performed relatively well under conditions in which limited training data existed for some judgment categories. Further, the approach held up well in relation to more complex approaches applied to this classification task. The approach could be enhanced by the addition of expert rules to model more complex medical reasoning. PMID:19390102
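
The gap between the micro and macro F-measures (0.97 versus 0.77) is typical when some judgment categories are rare. A minimal sketch of the two averages follows. The labels and the toy predictions are invented for illustration and are not the challenge data.

```python
def f1(tp, fp, fn):
    """F-measure from counts; defined as 0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Micro F1 pools counts over classes; macro F1 averages per-class F1."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = []
    for lab in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == lab and p == lab for t, p in pairs)
        fp = sum(t != lab and p == lab for t, p in pairs)
        fn = sum(t == lab and p != lab for t, p in pairs)
        counts.append((tp, fp, fn))
    macro = sum(f1(*c) for c in counts) / len(labels)
    micro = f1(*(sum(col) for col in zip(*counts)))
    return micro, macro


# Eight "Y" documents, one "N", one rare "Q" that the classifier misses.
y_true = ["Y"] * 8 + ["N", "Q"]
y_pred = ["Y"] * 8 + ["N", "Y"]
micro, macro = micro_macro_f1(y_true, y_pred)
print(round(micro, 3), round(macro, 3))  # 0.9 0.647
```

One missed document in a rare class barely moves the micro average but removes a full class from the macro average.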

  19. A cellular genetics approach identifies gene-drug interactions and pinpoints drug toxicity pathway nodes

    PubMed Central

    Suzuki, Oscar T.; Frick, Amber; Parks, Bethany B.; Trask, O. Joseph; Butz, Natasha; Steffy, Brian; Chan, Emmanuel; Scoville, David K.; Healy, Eric; Benton, Cristina; McQuaid, Patricia E.; Thomas, Russell S.; Wiltshire, Tim

    2014-01-01

    New approaches to toxicity testing have incorporated high-throughput screening across a broad-range of in vitro assays to identify potential key events in response to chemical or drug treatment. To date, these approaches have primarily utilized repurposed drug discovery assays. In this study, we describe an approach that combines in vitro screening with genetic approaches for the experimental identification of genes and pathways involved in chemical or drug toxicity. Primary embryonic fibroblasts isolated from 32 genetically-characterized inbred mouse strains were treated in concentration-response format with 65 compounds, including pharmaceutical drugs, environmental chemicals, and compounds with known modes-of-action. Integrated cellular responses were measured at 24 and 72 h using high-content imaging and included cell loss, membrane permeability, mitochondrial function, and apoptosis. Genetic association analysis of cross-strain differences in the cellular responses resulted in a collection of candidate loci potentially underlying the variable strain response to each chemical. As a demonstration of the approach, one candidate gene involved in rotenone sensitivity, Cybb, was experimentally validated in vitro and in vivo. Pathway analysis on the combined list of candidate loci across all chemicals identified a number of over-connected nodes that may serve as core regulatory points in toxicity pathways. PMID:25221565

  20. A novel approach to identify genes that determine grain protein deviation in cereals.

    PubMed

    Mosleth, Ellen F; Wan, Yongfang; Lysenko, Artem; Chope, Gemma A; Penson, Simon P; Shewry, Peter R; Hawkesford, Malcolm J

    2015-06-01

    Grain yield and protein content were determined for six wheat cultivars grown over 3 years at multiple sites and at multiple nitrogen (N) fertilizer inputs. Although grain protein content was negatively correlated with yield, some grain samples had higher protein contents than expected based on their yields, a trait referred to as grain protein deviation (GPD). We used novel statistical approaches to identify gene transcripts significantly related to GPD across environments. The yield and protein content were initially adjusted for nitrogen fertilizer inputs and then adjusted for yield (to remove the negative correlation with protein content), resulting in a parameter termed corrected GPD. Significant genetic variation in corrected GPD was observed for six cultivars grown over a range of environmental conditions (a total of 584 samples). Gene transcript profiles were determined in a subset of 161 samples of developing grain to identify transcripts contributing to GPD. Principal component analysis (PCA), analysis of variance (ANOVA) and means of scores regression (MSR) were used to identify individual principal components (PCs) correlating with GPD alone. Scores of the selected PCs, which were significantly related to GPD and protein content but not to the yield and significantly affected by cultivar, were identified as reflecting a multivariate pattern of gene expression related to genetic variation in GPD. Transcripts with consistent variation along the selected PCs were identified by an approach hereby called one-block means of scores regression (one-block MSR). PMID:25400203

  1. New approach for reduction of diesel consumption by comparing different mining haulage configurations.

    PubMed

    Rodovalho, Edmo da Cunha; Lima, Hernani Mota; de Tomi, Giorgio

    2016-05-01

    The mining operations of loading and haulage have an energy source that is highly dependent on fossil fuels. In mining companies that select trucks for haulage, this input is the main component of mining costs. How can the impact of the operational aspects on the diesel consumption of haulage operations in surface mines be assessed? There are many studies relating the fuel consumption of trucks to several variables, but a methodology that prioritizes higher-impact variables under each specific condition is not available. Generic models may not apply to all operational settings presented in the mining industry. This study aims to create a method of analysis, identification, and prioritization of variables related to fuel consumption of haul trucks in open pit mines. For this purpose, statistical analysis techniques and mathematical modelling tools using multiple linear regressions will be applied. The model is shown to be suitable because the results generate a good description of the fuel consumption behaviour. In the practical application of the method, the reduction of diesel consumption reached 10%. The implementation requires no large-scale investments or very long deadlines and can be applied to mining haulage operations in other settings. PMID:26946166
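
A multiple linear regression of the kind mentioned can be sketched in plain Python. The predictors (`payload`, `grade`), the data and the ranking rule (absolute coefficient times the predictor's standard deviation) are assumptions made for illustration. The abstract does not list the variables or the prioritization criterion that were actually used.

```python
from statistics import pstdev


def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Augmented normal-equation system [X'X | X'y].
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, k):
            f = M[i][col] / M[col][col]
            for j in range(col, k + 1):
                M[i][j] -= f * M[col][j]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        s = sum(M[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (M[i][k] - s) / M[i][i]
    return beta


def rank_variables(names, X, beta):
    """Rank predictors by |coefficient| * standard deviation of the predictor."""
    effects = [
        (name, abs(b) * pstdev(col))
        for name, b, col in zip(names, beta[1:], zip(*X))
    ]
    return sorted(effects, key=lambda e: e[1], reverse=True)


# Invented toy data generated from fuel = 10 + 2 * payload + 3 * grade.
X = [(1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
fuel = [12, 13, 15, 17, 18]
beta = fit_ols(X, fuel)
print([round(b, 6) for b in beta])
print(rank_variables(["payload", "grade"], X, beta)[0][0])  # grade
```

The ranking step is what turns a fitted model into a priority list for operational changes. A real analysis would also check residuals and collinearity before trusting it.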

  2. Floodplain storage of mine tailings in the Belle Fourche river system: a sediment budget approach

    USGS Publications Warehouse

    Marron, D.C.

    1992-01-01

    Arsenic-contaminated mine tailings that were discharged into Whitewood Creek at Lead, South Dakota, from 1876 to 1978, were deposited along the floodplains of Whitewood Creek and the Belle Fourche River. The resulting arsenic-contaminated floodplain deposit consists mostly of overbank sediments and filled abandoned meanders along Whitewood Creek, and overbank and point-bar sediments along the Belle Fourche River. Arsenic concentrations of the contaminated sediments indicate the degree of dilution of mine tailings by uncontaminated alluvium. About 13% of the 110 × 10^6 Mg of mine tailings that were discharged at Lead were deposited along the Whitewood Creek floodplain. -from Author

  3. A Critical Study on the Underground Environment of Coal Mines in India-an Ergonomic Approach

    NASA Astrophysics Data System (ADS)

    Dey, Netai Chandra; Sharma, Gourab Dhara

    2013-04-01

    Applying ergonomics to the health of underground miners plays a great role in controlling their efficiency. Work in underground mines is still physically demanding, and continuous stress due to certain postures or movements during work leads to localized muscle fatigue, creating musculo-skeletal disorders. A good working environment can change the degree of job heaviness, and thermal stress (WBGT values) can directly affect the miners' stretch of work. Out of many unit operations in an underground mine, roof bolting makes an important contribution to the safety of the mine and miners. Occupational stress of roof bolters from an ergonomic perspective is discussed in the paper.

  4. Identifying Pigment Mixtures in Art Using SERS: A Treatment Flowchart Approach.

    PubMed

    Roh, Joo Yeon; Matecki, Mary K; Svoboda, Shelley A; Wustholz, Kristin L

    2016-02-16

    A novel treatment flowchart approach for surface-enhanced Raman scattering (SERS) is used to identify both blue and yellow organic pigments in a single microscopic sample from a series of reference oil paints as well as an actual 18th century oil painting. In particular, several treatment strategies using acids and solvents are integrated into a specific flowchart designed to enable the minimally invasive identification of unknown blue (i.e., indigo, Prussian blue) and yellow organic (i.e., Reseda lake, Stil de Grain, gamboge) pigments in one sample. We demonstrate the first successful identification of a yellow lake pigment in a historic painting using SERS as well as the utility of our treatment flowchart approach for identifying pigments of varying resonance conditions, surface affinities, and treatment requirements in a single microscopic sample from a historic oil painting. PMID:26799174

  5. An innovative and integrated approach based on DNA walking to identify unauthorised GMOs.

    PubMed

    Fraiture, Marie-Alice; Herman, Philippe; Taverniers, Isabel; De Loose, Marc; Deforce, Dieter; Roosens, Nancy H

    2014-03-15

    In the coming years, the frequency of unauthorised genetically modified organisms (GMOs) being present in the European food and feed chain will increase significantly. Therefore, we have developed a strategy to identify unauthorised GMOs containing a pCAMBIA family vector, frequently present in transgenic plants. This integrated approach is performed in two successive steps on Bt rice grains. First, the potential presence of unauthorised GMOs is assessed by the qPCR SYBR®Green technology targeting the terminator 35S pCAMBIA element. Second, its presence is confirmed via the characterisation of the junction between the transgenic cassette and the rice genome. To this end, a DNA walking strategy is applied using a first reverse primer followed by two semi-nested PCR rounds using primers that are each time nested to the previous reverse primer. This approach allows rapid identification of the transgene flanking region and can easily be implemented by the enforcement laboratories. PMID:24206686

  6. Systems approaches to unraveling plant metabolism: identifying biosynthetic genes of secondary metabolic pathways.

    PubMed

    Spiering, Martin J; Kaur, Bhavneet; Parsons, James F; Eisenstein, Edward

    2014-01-01

    The diversity of useful compounds produced by plant secondary metabolism has stimulated broad systems biology approaches to identify the genes involved in their biosynthesis. Systems biology studies in non-model plants pose interesting but addressable challenges, and have been greatly facilitated by the ability to grow and maintain plants, develop laboratory culture systems, and profile key metabolites in order to identify critical genes involved in their biosynthesis. In this chapter we describe a suite of approaches that have been useful in Actaea racemosa (L.; syn. Cimicifuga racemosa, Nutt., black cohosh), a non-model medicinal plant with no genome sequence and little horticultural information available, that have led to the development of initial gene-metabolite relationships for the production of several bioactive metabolites in this multicomponent botanical therapeutic, and that can be readily applied to a wide variety of under-characterized medicinal plants. PMID:24218220

  7. An integrated remote sensing approach for identifying ecological range sites. [parker mountain

    NASA Technical Reports Server (NTRS)

    Jaynes, R. A.

    1983-01-01

    A model approach for identifying ecological range sites was applied to high elevation sagebrush-dominated rangelands on Parker Mountain, in south-central Utah. The approach utilizes map information derived from both high altitude color infrared photography and LANDSAT digital data, integrated with soils, geological, and precipitation maps. Identification of the ecological range site for a given area requires an evaluation of all relevant environmental factors which combine to give that site the potential to produce characteristic types and amounts of vegetation. A table is presented which allows the user to determine ecological range site based upon an integrated use of the maps which were prepared. The advantages of identifying ecological range sites through an integrated photo interpretation/LANDSAT analysis are discussed.

  8. Microbial populations identified by fluorescence in situ hybridization in a constructed wetland treating acid coal mine drainage

    SciTech Connect

    Nicomrat, D.; Dick, W.A.; Tuovinen, O.H.

    2006-07-15

    Microorganisms are an integral part of the biogeochemical processes in wetlands, yet microbial communities in sediments within constructed wetlands receiving acid mine drainage (AMD) are only poorly understood. The purpose of this study was to characterize the microbial diversity and abundance in a wetland receiving AMD using fluorescence in situ hybridization (FISH) analysis. Seasonal samples of oxic surface sediments, comprised of Fe(III) precipitates, were collected from two treatment cells of the constructed wetland system. The pH of the bulk samples ranged between pH 2.1 and 3.9. Viable counts of acidophilic Fe and S oxidizers and heterotrophs were determined with a most probable number (MPN) method. The MPN counts were only a fraction of the corresponding FISH counts. The sediment samples contained microorganisms in the Bacteria (including the subgroups of acidophilic Fe- and S-oxidizing bacteria and Acidiphilium spp.) and Eukarya domains. Archaea were present in the sediment surface samples at < 0.01% of the total microbial community. The most numerous bacterial species in this wetland system was Acidithiobacillus ferrooxidans, comprising up to 37% of the bacterial population. Acidithiobacillus thiooxidans was also abundant.

  9. A new approach to hazardous materials transportation risk analysis: decision modeling to identify critical variables.

    PubMed

    Clark, Renee M; Besterfield-Sacre, Mary E

    2009-03-01

    We take a novel approach to analyzing hazardous materials transportation risk in this research. Previous studies analyzed this risk from an operations research (OR) or quantitative risk assessment (QRA) perspective by minimizing or calculating risk along a transport route. Further, even though the majority of incidents occur when containers are unloaded, the research has not focused on transportation-related activities, including container loading and unloading. In this work, we developed a decision model of a hazardous materials release during unloading using actual data and an exploratory data modeling approach. Previous studies have had a theoretical perspective in terms of identifying and advancing the key variables related to this risk, and there has not been a focus on probability and statistics-based approaches for doing this. Our decision model empirically identifies the critical variables using an exploratory methodology for a large, highly categorical database involving latent class analysis (LCA), loglinear modeling, and Bayesian networking. Our model identified the most influential variables and countermeasures for two consequences of a hazmat incident, dollar loss and release quantity, and is one of the first models to do this. The most influential variables were found to be related to the failure of the container. In addition to analyzing hazmat risk, our methodology can be used to develop data-driven models for strategic decision making in other domains involving risk. PMID:19087232

  10. An Approach for Identifying Cytokines Based on a Novel Ensemble Classifier

    PubMed Central

    Zou, Quan; Wang, Zhen; Guan, Xinjun; Liu, Bin; Wu, Yunfeng; Lin, Ziyu

    2013-01-01

    Identifying cytokines and investigating their various functions and biochemical mechanisms is meaningful and important in biology. However, several issues remain, including the large scale of benchmark datasets, serious imbalance of data, and discovery of new gene families. In this paper, we employ the machine learning approach based on a novel ensemble classifier to predict cytokines. We directly selected amino acid sequences as research objects. First, we pretreated the benchmark data accurately. Next, we analyzed the physicochemical properties and distribution of whole amino acids and then extracted a group of 120-dimensional (120D) valid features to represent sequences. Third, in view of the serious imbalance in benchmark datasets, we utilized a sampling approach based on the synthetic minority oversampling technique algorithm and K-means clustering undersampling algorithm to rebuild the training set. Finally, we built a library for dynamic selection and circulating combination based on clustering (LibD3C) and employed the new training set to realize cytokine classification. Experiments showed that the geometric mean of sensitivity and specificity obtained through our approach is as high as 93.3%, which proves that our approach is effective for identifying cytokines. PMID:24027761
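
The reported metric, the geometric mean of sensitivity and specificity, is a common choice for imbalanced data because a classifier that ignores the minority class scores zero. A minimal sketch follows, with confusion-matrix counts invented for illustration (they are not the paper's results).

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity tp/(tp+fn) and specificity tn/(tn+fp)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# 100 positives (cytokines) against 1000 negatives, invented counts.
print(round(g_mean(tp=90, fn=10, tn=960, fp=40), 4))  # 0.9295
# Predicting "negative" for everything gives about 91% accuracy but a G-mean of 0.
print(g_mean(tp=0, fn=100, tn=1000, fp=0))  # 0.0
```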

  11. A Virtual Screening Approach For Identifying Plants with Anti H5N1 Neuraminidase Activity

    PubMed Central

    2016-01-01

    Recent outbreaks of highly pathogenic and occasional drug-resistant influenza strains have highlighted the need to develop novel anti-influenza therapeutics. Here, we report computational and experimental efforts to identify influenza neuraminidase inhibitors from among the 3000 natural compounds in the Malaysian-Plants Natural-Product (NADI) database. These 3000 compounds were first docked into the neuraminidase active site. The five plants with the largest number of top predicted ligands were selected for experimental evaluation. Twelve specific compounds isolated from these five plants were shown to inhibit neuraminidase, including two compounds with IC50 values less than 92 μM. Furthermore, four of the 12 isolated compounds had also been identified in the top 100 compounds from the virtual screen. Together, these results suggest an effective new approach for identifying bioactive plant species that will further the identification of new pharmacologically active compounds from diverse natural-product resources. PMID:25555059
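
The plant-selection step (keep the plants that contribute the most top-ranked ligands) can be sketched as a count over the best docking scores. The record layout, the convention that lower scores are better, the function name and the toy values are assumptions for illustration. They do not come from the NADI database or the docking software used in the study.

```python
from collections import Counter


def rank_plants(docked, top_n, n_plants):
    """Count how many of the top_n best-scoring compounds come from each plant.

    docked -- iterable of (compound_id, plant, docking_score) tuples;
              a lower (more negative) score is assumed to be better.
    """
    best = sorted(docked, key=lambda rec: rec[2])[:top_n]
    counts = Counter(plant for _, plant, _ in best)
    return counts.most_common(n_plants)


# Invented toy records.
docked = [
    ("c1", "plantA", -9.1),
    ("c2", "plantA", -8.7),
    ("c3", "plantB", -8.9),
    ("c4", "plantC", -5.0),
    ("c5", "plantB", -4.2),
    ("c6", "plantA", -8.0),
]
print(rank_plants(docked, top_n=4, n_plants=2))  # [('plantA', 3), ('plantB', 1)]
```

In the study the analogous selection kept five plants. The `top_n` cutoff here is a free parameter of the sketch.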

  12. Identifying Patients in the Acute Psychiatric Hospital Who May Benefit From a Palliative Care Approach.

    PubMed

    Burton, M Caroline; Warren, Mark; Cha, Stephen S; Stevens, Maria; Blommer, Megan; Kung, Simon; Lapid, Maria I

    2016-04-01

    Identifying patients who will benefit from a palliative care approach is the first critical step in integrating palliative with curative therapy. Criteria are established that identify hospitalized medical patients who are near end of life, yet there are no criteria with respect to hospitalized patients with psychiatric disorders. The records of 276 consecutive patients admitted to a dedicated inpatient psychiatric unit were reviewed to identify prognostic criteria predictive of mortality. Mortality predictors were 2 or more admissions in the past year (P = .0114) and older age (P = .0006). Twenty-two percent of patients met National Hospice and Palliative Care Organization noncancer criteria for dementia. Palliative care intervention should be considered when treating inpatients with psychiatric disorders, especially older patients who have a previous hospitalization or history of dementia. PMID:25318929

  13. A virtual screening approach for identifying plants with anti H5N1 neuraminidase activity.

    PubMed

    Ikram, Nur Kusaira Khairul; Durrant, Jacob D; Muchtaridi, Muchtaridi; Zalaludin, Ayunni Salihah; Purwitasari, Neny; Mohamed, Nornisah; Rahim, Aisyah Saad Abdul; Lam, Chan Kit; Normi, Yahaya M; Rahman, Noorsaadah Abd; Amaro, Rommie E; Wahab, Habibah A

    2015-02-23

    Recent outbreaks of highly pathogenic and occasional drug-resistant influenza strains have highlighted the need to develop novel anti-influenza therapeutics. Here, we report computational and experimental efforts to identify influenza neuraminidase inhibitors from among the 3000 natural compounds in the Malaysian-Plants Natural-Product (NADI) database. These 3000 compounds were first docked into the neuraminidase active site. The five plants with the largest number of top predicted ligands were selected for experimental evaluation. Twelve specific compounds isolated from these five plants were shown to inhibit neuraminidase, including two compounds with IC50 values less than 92 μM. Furthermore, four of the 12 isolated compounds had also been identified in the top 100 compounds from the virtual screen. Together, these results suggest an effective new approach for identifying bioactive plant species that will further the identification of new pharmacologically active compounds from diverse natural-product resources. PMID:25555059

  14. Improved landmine detection capability (ILDC): systematic approach to the detection of buried mines using passive IR imaging

    NASA Astrophysics Data System (ADS)

    Simard, Jean-Robert

    1996-05-01

    In order to reduce the serious problem associated with the mining of important supply/communication roads by hostile parties during peacekeeping operations, the Canadian Department of National Defense has recently begun the development of a multi-sensor teleoperated mine detection vehicle, the Improved Landmine Detection Capability. One sensor identified as a serious candidate for that project is a passive IR camera. In the past, many organizations have assessed the efficiency of this technique of detection and reported widely fluctuating results. It is believed that the main reason for these fluctuations is associated with the ad hoc interpretations used by different researchers. In this paper, a more systematic analysis is presented which takes into account variables such as time of the day, time of the year, weather conditions, type of road and many others. A working model is proposed in order to facilitate the prediction of the IR signature of the buried land-mine and is compared with data acquired from multiple trials. These trials were done with live mines (without fuzes) and surrogates buried in different types of road (packed gravel and sand) and during different times of the day and different times of the year.

  15. Nuclear waste repositories in salt mines: a new approach to safety assessment.

    PubMed

    Memmert, G

    1996-08-01

    The long-term safety of radioactive waste repositories in rock-salt mines in the deep underground benefits significantly from the barrier effect of overlying rocks. The concentrations of radioactive substances released from the repository and migrating in the aquifer up to the biosphere are greatly reduced during passage through these rocks. In former safety analyses of waste repositories this transport has generally been modelled as a combination of the involved phenomena, e.g. convection, dispersion, adsorption, etc. The data required for a numerical evaluation of the overall effect are obtained either as (conservative) estimates based on experience or are empirical, based mainly on laboratory experiments. The approach presented here is much simpler and entirely empirical, and therefore more transparent. It makes use of the fact that the groundwater in the overlying rocks always contains dissolved salt from the salt formation and carries it continuously into the receiving channels or the drainage system. The relation between the total amount of dissolved solids present in a certain subsurface catchment area and their steady-state concentration in the receiving channels is assumed to be equivalent to the relation between the given amount of radionuclides released from the repository and their concentration in the receiving channels, the latter leading to a certain radiation exposure of the population. Two versions of this approach are discussed: version (a) assumes a continuous stream of radionuclides released from the repository, and version (b) assumes a pulse release of radionuclides from the repository. A simple calculation using data from the Gorleben exploration leads to the inequality [equation: see text] where Cmax is the maximum radionuclide concentration (with respect to time) in the receiving channels and W (Bq) is the amount of radionuclides released from the repository in a very short time. Cmax, obtained from (1), is supposed to be an upper limit of

  16. Shifting species ranges and changing phenology: A new approach to mining social media for ecosystems observations

    NASA Astrophysics Data System (ADS)

    Fuka, M. Z.; Osborne-Gowey, J. D.; Fuka, D. R.

    2013-12-01

    Geoscientists & ecologists are increasingly using social media to solicit 'citizen scientists' to participate in the data collection process. However, social media users are also a largely untapped resource of spontaneous, unsolicited observations of the natural world. Of particular interest are observations of species phenology & range to better develop a predictive understanding of how ecosystems are affected by a changing climate and human-mediated influences. Social media users' observations include information on phenological & biological phenomena such as flowers blooming, native & invasive species sightings, unusual behaviors, animal tracks, droppings, damage, feeding, nesting, etc. Our AGU2011 pilot study on the North American armadillo suggests that useful observational data can be extracted from Twitter to map current species ranges to compare with past ranges. We have expanded that work by mining Twitter for a number of North American species and ecosystem observations to determine usefulness for environmental applications such as: 1) supplementing existing databases, 2) identifying outlier phenomena, 3) guiding additional crowd-sourced studies and data collection efforts, 4) recruiting citizen scientists, 5) gauging sentiment about the observations and 6) informing ecosystems policy-making and education. We present the results for our evaluation of a representative sample from a list of 200+ species for which we've collected data since August 2011. Our results include frequency of reports and sightings by day, week and month, where the number of observations ranges from a few per month to ten or more per day. We discuss challenges, best practices and tools for distilling information from crowd-sourced observations gathered via Twitter in the form of 140-character 'tweets'. For example, geolocation is a critical issue. Despite the prevalence of smart phones, specific latitudinal and longitudinal coordinates are included in fewer than 10% of the

  17. Microbial populations identified by fluorescence in situ hybridization in a constructed wetland treating acid coal mine drainage.

    PubMed

    Nicomrat, Duongruitai; Dick, Warren A; Tuovinen, Olli H

    2006-01-01

    Microorganisms are an integral part of the biogeochemical processes in wetlands, yet microbial communities in sediments within constructed wetlands receiving acid mine drainage (AMD) are only poorly understood. The purpose of this study was to characterize the microbial diversity and abundance in a wetland receiving AMD using fluorescence in situ hybridization (FISH) analysis. Seasonal samples of oxic surface sediments, comprised of Fe(III) precipitates, were collected from two treatment cells of the constructed wetland system. The pH of the bulk samples ranged between pH 2.1 and 3.9. Viable counts of acidophilic Fe and S oxidizers and heterotrophs were determined with a most probable number (MPN) method. The MPN counts were only a fraction of the corresponding FISH counts. The sediment samples contained microorganisms in the Bacteria (including the subgroups of acidophilic Fe- and S-oxidizing bacteria and Acidiphilium spp.) and Eukarya domains. Archaea were present in the sediment surface samples at < 0.01% of the total microbial community. The most numerous bacterial species in this wetland system was Acidithiobacillus ferrooxidans, comprising up to 37% of the bacterial population. Acidithiobacillus thiooxidans was also abundant. Heterotrophs in the Acidiphilium genus totaled 20% of the bacterial population. Leptospirillum ferrooxidans was below the level of detection in the bacterial community. The results from the FISH technique from this field study are consistent with results from other experiments involving enumeration by most probable number, dot-blot hybridization, and denaturing gradient gel electrophoresis analyses and with the geochemistry of the site. PMID:16825452

  18. Improvement Evaluation on Ceramic Roof Extraction Using WORLDVIEW-2 Imagery and Geographic Data Mining Approach

    NASA Astrophysics Data System (ADS)

    Brum-Bastos, V. S.; Ribeiro, B. M. G.; Pinho, C. M. D.; Korting, T. S.; Fonseca, L. M. G.

    2016-06-01

    Advances in geotechnologies and in remote sensing have improved analysis of urban environments. The new sensors are increasingly suited to urban studies, due to the enhancement in spatial, spectral and radiometric resolutions. Urban environments present high heterogeneity, which cannot be tackled using pixel-based approaches on high resolution images. Geographic Object-Based Image Analysis (GEOBIA) has been consolidated as a methodology for urban land use and cover monitoring; however, classification of high resolution images is still troublesome. This study aims to assess the improvement on ceramic roof classification using WorldView-2 images due to the addition of 4 new bands besides the standard "Blue-Green-Red-Near Infrared" bands. Our methodology combines GEOBIA, C4.5 classification tree algorithm, Monte Carlo simulation and statistical tests for classification accuracy. Two sample groups were considered: 1) eight multispectral and panchromatic bands, and 2) four multispectral and panchromatic bands, representing previous high-resolution sensors. The C4.5 algorithm generates a decision tree that can be used for classification; smaller decision trees are closer to the semantic networks produced by experts on GEOBIA, while bigger trees are not straightforward to implement manually but are more accurate. The choice between a big and a small tree depends on the user's skills to implement it. This study aims to determine for what kind of user the addition of the 4 new bands might be beneficial: 1) the common user (smaller trees) or 2) a more skilled user with coding and/or data mining abilities (bigger trees). Overall, the classification was improved by the addition of the four new bands for both types of users.
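
    C4.5 chooses the attribute to split on by gain ratio (information gain divided by the split information). The sketch below shows that criterion on a toy example; the attribute values ("high"/"low" for a discretized band response) and the class labels are hypothetical and are not taken from the study. C4.5 also handles continuous attributes by searching for thresholds, which this sketch omits.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


labels = ["roof", "roof", "other", "other"]
informative_band = ["high", "high", "low", "low"]    # separates the classes
uninformative_band = ["high", "low", "high", "low"]  # does not separate them
print(gain_ratio(informative_band, labels), gain_ratio(uninformative_band, labels))
```

    An attribute with a higher gain ratio is preferred at a node, which is one way an added spectral band could end up in the tree or be ignored.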

  19. Translational informatics approach for identifying the functional molecular communicators linking coronary artery disease, infection and inflammation.

    PubMed

    Sharma, Ankit; Ghatge, Madankumar; Mundkur, Lakshmi; Vangala, Rajani Kanth

    2016-05-01

    Translational informatics approaches are required for the integration of diverse and accumulating data to enable the administration of effective translational medicine specifically in complex diseases such as coronary artery disease (CAD). In the current study, a novel approach for elucidating the association between infection, inflammation and CAD was used. Genes for CAD were collected from the CAD-gene database and those for infection and inflammation were collected from the UniProt database. The cytomegalovirus (CMV)-induced genes were identified from the literature and the CAD-associated clinical phenotypes were obtained from the Unified Medical Language System. A total of 55 gene ontologies (GO) termed functional communicator ontologies were identified in the gene sets linking clinical phenotypes in the diseasome network. The network topology analysis suggested that important functions including viral entry, cell adhesion, apoptosis, inflammatory and immune responses networked with clinical phenotypes. Microarray data was extracted from the Gene Expression Omnibus (dataset: GSE48060) for the highly networked disease myocardial infarction. Further analysis of differentially expressed genes and their GO terms suggested that CMV infection may trigger a xenobiotic response, oxidative stress, inflammation and immune modulation. Notably, the current study identified γ-glutamyl transferase (GGT)-5 as a potential biomarker with an odds ratio of 1.947, which increased to 2.561 following the addition of CMV and CMV-neutralizing antibody (CMV-NA) titers. The C-statistics increased from 0.530 for conventional risk factors (CRFs) to 0.711 for GGT in combination with the above mentioned infections and CRFs. Therefore, the translational informatics approach used in the current study identified a potential molecular mechanism for CMV infection in CAD, and a potential biomarker for risk prediction. PMID:27035874

  20. Identifying comorbid depression and disruptive behavior disorders: Comparison of two approaches used in adolescent studies

    PubMed Central

    Stoep, Ann Vander; Adrian, Molly C.; Rhew, Isaac C.; McCauley, Elizabeth; Herting, Jerald R.; Kraemer, Helena C.

    2013-01-01

    Interest in commonly co-occurring depression and disruptive behavior disorders in children has yielded a small body of research that estimates the prevalence of this comorbid condition and compares children with the comorbid condition and children with depression or disruptive behavior disorders alone with respect to antecedents and outcomes. Prior studies have used one of two different approaches to measure comorbid disorders: 1) meeting criteria for two DSM or ICD diagnoses or 2) scoring 0.5 SD above the mean or higher on two dimensional scales. This study compares two snapshots of comorbidity taken simultaneously in the same sample with each of the measurement approaches. The Developmental Pathways Project administered structured diagnostic interviews as well as dimensional scales to a community-based sample of 521 11-12 year olds to assess depression and disruptive behavior disorders. Clinical caseness indicators of children identified as “comorbid” by each method were examined concurrently and 3 years later. Cross-classification of adolescents via the two approaches revealed low agreement. When other indicators of caseness, including functional impairment, need for services, and clinical elevations on other symptom scales were examined, adolescents identified as comorbid via dimensional scales only were similar to those who were identified as comorbid via DSM-IV diagnostic criteria. Findings suggest that when relying solely on DSM diagnostic criteria for comorbid depression and disruptive behavior disorders, many adolescents with significant impairment will be overlooked. Findings also suggest that lower dimensional scale thresholds can be set when comorbid conditions, rather than single forms of psychopathology, are being identified. PMID:22575333
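
    The dimensional-scale rule described above (flag a child as comorbid when both scale scores are at least 0.5 SD above the sample mean) can be illustrated with a short sketch. The scores below are invented, the function names are hypothetical, and the use of the sample standard deviation is an assumption; the abstract does not say which estimator was used.

```python
from statistics import mean, stdev


def elevated(scores, k=0.5):
    """Flag scores at or above mean + k * SD (sample SD; an assumption)."""
    threshold = mean(scores) + k * stdev(scores)
    return [x >= threshold for x in scores]


def comorbid_by_scales(depression, disruptive, k=0.5):
    """A case counts as comorbid only if it is elevated on both scales."""
    return [a and b for a, b in zip(elevated(depression, k), elevated(disruptive, k))]


# Invented scores for five children
depression_scores = [1, 2, 3, 4, 10]
disruptive_scores = [2, 2, 2, 2, 9]
print(comorbid_by_scales(depression_scores, disruptive_scores))
```

    Cross-tabulating such flags against diagnosis-based flags is one way to examine the agreement between the two approaches.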

  1. Importance of multi-modal approaches to effectively identify cataract cases from electronic health records

    PubMed Central

    Rasmussen, Luke V; Berg, Richard L; Linneman, James G; McCarty, Catherine A; Waudby, Carol; Chen, Lin; Denny, Joshua C; Wilke, Russell A; Pathak, Jyotishman; Carrell, David; Kho, Abel N; Starren, Justin B

    2012-01-01

    Objective: There is increasing interest in using electronic health records (EHRs) to identify subjects for genomic association studies, due in part to the availability of large amounts of clinical data and the expected cost efficiencies of subject identification. We describe the construction and validation of an EHR-based algorithm to identify subjects with age-related cataracts. Materials and methods: We used a multi-modal strategy consisting of structured database querying, natural language processing on free-text documents, and optical character recognition on scanned clinical images to identify cataract subjects and related cataract attributes. Extensive validation on 3657 subjects compared the multi-modal results to manual chart review. The algorithm was also implemented at participating electronic MEdical Records and GEnomics (eMERGE) institutions. Results: An EHR-based cataract phenotyping algorithm was successfully developed and validated, resulting in positive predictive values (PPVs) >95%. The multi-modal approach increased the identification of cataract subject attributes by a factor of three compared to single-mode approaches while maintaining high PPV. Components of the cataract algorithm were successfully deployed at three other institutions with similar accuracy. Discussion: A multi-modal strategy incorporating optical character recognition and natural language processing may increase the number of cases identified while maintaining similar PPVs. Such algorithms, however, require that the needed information be embedded within clinical documents. Conclusion: We have demonstrated that algorithms to identify and characterize cataracts can be developed utilizing data collected via the EHR. These algorithms provide a high level of accuracy even when implemented across multiple EHRs and institutional boundaries. PMID:22319176
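
    The positive predictive value reported above is the fraction of algorithm-identified cases that manual chart review confirms, PPV = TP / (TP + FP). A minimal sketch of that calculation follows; the function name and the toy labels are hypothetical and are not study data.

```python
def positive_predictive_value(predicted, reference):
    """PPV = true positives / all positive predictions.

    predicted: booleans from the phenotyping algorithm.
    reference: booleans from the gold standard (e.g. manual chart review).
    """
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    if tp + fp == 0:
        raise ValueError("PPV is undefined when there are no positive predictions")
    return tp / (tp + fp)


# Toy labels for four subjects
algorithm_says_case = [True, True, True, False]
chart_review_says_case = [True, True, False, True]
print(positive_predictive_value(algorithm_says_case, chart_review_says_case))
```

    Note that PPV ignores missed cases (the fourth subject here), which is why the abstract reports the number of cases identified alongside it.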

  2. A Novel Approach for Identifying Causal Models of Complex Diseases from Family Data

    PubMed Central

    Park, Leeyoung; Kim, Ju H.

    2015-01-01

    Causal models including genetic factors are important for understanding the presentation mechanisms of complex diseases. Familial aggregation and segregation analyses based on polygenic threshold models have been the primary approach to fitting genetic models to the family data of complex diseases. In the current study, an advanced approach to obtaining appropriate causal models for complex diseases based on the sufficient component cause (SCC) model involving combinations of traditional genetics principles was proposed. The probabilities for the entire population, i.e., normal–normal, normal–disease, and disease–disease, were considered for each model for the appropriate handling of common complex diseases. The causal model in the current study included the genetic effects from single genes involving epistasis, complementary gene interactions, gene–environment interactions, and environmental effects. Bayesian inference using a Markov chain Monte Carlo algorithm (MCMC) was used to assess the proportions of each component for a given population lifetime incidence. This approach is flexible, allowing both common and rare variants within a gene and across multiple genes. An application to schizophrenia data confirmed the complexity of the causal factors. An analysis of diabetes data demonstrated that environmental factors and gene–environment interactions are the main causal factors for type II diabetes. The proposed method is effective and useful for identifying causal models, which can accelerate the development of efficient strategies for identifying causal factors of complex diseases. PMID:25701286

  3. A systematic approach to identify functional motifs within vertebrate developmental enhancers

    PubMed Central

    Li, Qiang; Ritter, Deborah; Yang, Nan; Dong, Zhiqiang; Li, Hao; Chuang, Jeffrey H.; Guo, Su

    2012-01-01

    Uncovering the cis-regulatory logic of developmental enhancers is critical to understanding the role of non-coding DNA in development. However, it is cumbersome to identify functional motifs within enhancers, and thus few vertebrate enhancers have their core functional motifs revealed. Here we report a combined experimental and computational approach for discovering regulatory motifs in developmental enhancers. Making use of the zebrafish gene expression database, we computationally identified conserved non-coding elements (CNEs) likely to have a desired tissue-specificity based on the expression of nearby genes. Through a high throughput and robust enhancer assay, we tested the activity of ~100 such CNEs and efficiently uncovered developmental enhancers with desired spatial and temporal expression patterns in the zebrafish brain. Application of de novo motif prediction algorithms on a group of forebrain enhancers identified five top-ranked motifs, all of which were experimentally validated as critical for forebrain enhancer activity. These results demonstrate a systematic approach to discover important regulatory motifs in vertebrate developmental enhancers. Moreover, this dataset provides a useful resource for further dissection of vertebrate brain development and function. PMID:19850031

  4. Impacts of mountaintop mining on terrestrial ecosystem integrity: Identifying landscape thresholds for avian species in the central Appalachians, United States

    USGS Publications Warehouse

    Becker, Douglas A.; Wood, Petra Bohall; Strager, Michael P.; Mazzarella, Christine

    2014-01-01

    Because of little overlap in habitat requirements, managing landscapes simultaneously to maximally benefit both guilds may not be possible. Our avian thresholds identify single community management targets accounting for scarce species. Guild or individual species thresholds allow for species-specific management.

  5. Mining for Candidate Genes Related to Pancreatic Cancer Using Protein-Protein Interactions and a Shortest Path Approach

    PubMed Central

    Yuan, Fei; Zhang, Yu-Hang; Wan, Sibao; Wang, ShaoPeng; Kong, Xiang-Yin

    2015-01-01

    Pancreatic cancer (PC) is a highly malignant tumor derived from pancreas tissue and is one of the leading causes of death from cancer. Its molecular mechanism has been partially revealed by validating its oncogenes and tumor suppressor genes; however, the available data remain insufficient for medical workers to design effective treatments. Large-scale identification of PC-related genes can promote studies on PC. In this study, we propose a computational method for mining new candidate PC-related genes. A large network was constructed using protein-protein interaction information, and a shortest path approach was applied to mine new candidate genes based on validated PC-related genes. In addition, a permutation test was adopted to further select key candidate genes. Finally, for all discovered candidate genes, the likelihood that the genes are novel PC-related genes is discussed based on their currently known functions. PMID:26613085
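
    The shortest-path idea described above can be sketched as follows: genes lying on shortest paths between pairs of validated disease genes in an interaction network become candidates. This is a simplified illustration under stated assumptions, not the authors' implementation: the network is treated as unweighted and searched with breadth-first search, only one shortest path per pair is used, the permutation-test filter is omitted, and all node names are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal (BFS), or None if unreachable."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr not in parents:
                parents[nbr] = node
                if nbr == goal:
                    # Walk back through the parent links to rebuild the path.
                    path = [nbr]
                    while parents[path[-1]] is not None:
                        path.append(parents[path[-1]])
                    return path[::-1]
                queue.append(nbr)
    return None


def candidate_genes(graph, known_genes):
    """Collect interior nodes of shortest paths between pairs of known genes."""
    known = sorted(set(known_genes))
    candidates = set()
    for i, a in enumerate(known):
        for b in known[i + 1:]:
            path = shortest_path(graph, a, b)
            if path:
                candidates.update(n for n in path[1:-1] if n not in known)
    return candidates


# Toy undirected network as an adjacency list (hypothetical names)
toy_ppi = {
    "K1": ["X"],
    "X": ["K1", "K2", "Y"],
    "K2": ["X"],
    "Y": ["X"],
}
print(sorted(candidate_genes(toy_ppi, ["K1", "K2"])))
```

    With interaction confidence scores as edge weights, Dijkstra's algorithm would take the place of breadth-first search.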

  6. Translational informatics approach for identifying the functional molecular communicators linking coronary artery disease, infection and inflammation

    PubMed Central

    SHARMA, ANKIT; GHATGE, MADANKUMAR; MUNDKUR, LAKSHMI; VANGALA, RAJANI KANTH

    2016-01-01

    Translational informatics approaches are required for the integration of diverse and accumulating data to enable the administration of effective translational medicine specifically in complex diseases such as coronary artery disease (CAD). In the current study, a novel approach for elucidating the association between infection, inflammation and CAD was used. Genes for CAD were collected from the CAD-gene database and those for infection and inflammation were collected from the UniProt database. The cytomegalovirus (CMV)-induced genes were identified from the literature and the CAD-associated clinical phenotypes were obtained from the Unified Medical Language System. A total of 55 gene ontologies (GO) termed functional communicator ontologies were identified in the gene sets linking clinical phenotypes in the diseasome network. The network topology analysis suggested that important functions including viral entry, cell adhesion, apoptosis, inflammatory and immune responses networked with clinical phenotypes. Microarray data was extracted from the Gene Expression Omnibus (dataset: GSE48060) for the highly networked disease myocardial infarction. Further analysis of differentially expressed genes and their GO terms suggested that CMV infection may trigger a xenobiotic response, oxidative stress, inflammation and immune modulation. Notably, the current study identified γ-glutamyl transferase (GGT)-5 as a potential biomarker with an odds ratio of 1.947, which increased to 2.561 following the addition of CMV and CMV-neutralizing antibody (CMV-NA) titers. The C-statistics increased from 0.530 for conventional risk factors (CRFs) to 0.711 for GGT in combination with the above mentioned infections and CRFs. Therefore, the translational informatics approach used in the current study identified a potential molecular mechanism for CMV infection in CAD, and a potential biomarker for risk prediction. PMID:27035874

  7. Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches

    PubMed Central

    Havemann, Frank; Gläser, Jochen; Heinz, Michael; Struck, Alexander

    2012-01-01

    The aim of this paper is to introduce and assess three algorithms for the identification of overlapping thematic structures in networks of papers. We implemented three recently proposed approaches to the identification of overlapping and hierarchical substructures in graphs and applied the corresponding algorithms to a network of 492 information-science papers coupled via their cited sources. The thematic substructures obtained and overlaps produced by the three hierarchical cluster algorithms were compared to a content-based categorisation, which we based on the interpretation of titles, abstracts, and keywords. We defined sets of papers dealing with three topics located on different levels of aggregation: h-index, webometrics, and bibliometrics. We identified these topics with branches in the dendrograms produced by the three cluster algorithms and compared the overlapping topics they detected with one another and with the three predefined paper sets. We discuss the advantages and drawbacks of applying the three approaches to paper networks in research fields. PMID:22479376

  8. Identifying overlapping and hierarchical thematic structures in networks of scholarly papers: a comparison of three approaches.

    PubMed

    Havemann, Frank; Gläser, Jochen; Heinz, Michael; Struck, Alexander

    2012-01-01

    The aim of this paper is to introduce and assess three algorithms for the identification of overlapping thematic structures in networks of papers. We implemented three recently proposed approaches to the identification of overlapping and hierarchical substructures in graphs and applied the corresponding algorithms to a network of 492 information-science papers coupled via their cited sources. The thematic substructures obtained and overlaps produced by the three hierarchical cluster algorithms were compared to a content-based categorisation, which we based on the interpretation of titles, abstracts, and keywords. We defined sets of papers dealing with three topics located on different levels of aggregation: h-index, webometrics, and bibliometrics. We identified these topics with branches in the dendrograms produced by the three cluster algorithms and compared the overlapping topics they detected with one another and with the three predefined paper sets. We discuss the advantages and drawbacks of applying the three approaches to paper networks in research fields. PMID:22479376

  9. Leveraging Concept-based Approaches to Identify Potential Phyto-therapies

    PubMed Central

    Sharma, Vivekanand; Sarkar, Indra Neil

    2013-01-01

    The potential of plant-based remedies has been documented in both traditional and contemporary biomedical literature. Such types of text sources may thus be sources from which one might identify potential plant-based therapies (“phyto-therapies”). Concept-based analytic approaches have been shown to uncover knowledge embedded within biomedical literature. However, to date there has been limited attention towards leveraging such techniques for the identification of potential phyto-therapies. This study presents concept-based analytic approaches for the retrieval and ranking of associations between plants and human diseases. Focusing on identification of phyto-therapies described in MEDLINE, both MeSH descriptors used for indexing and MetaMap inferred UMLS concepts are considered. Furthermore, the identification and ranking consider both direct (i.e., plant concepts directly correlated with disease concepts) and inferred (i.e., plant concepts associated with disease concepts based on shared signs and symptoms) relationships. Based on the two scoring methodologies used in this study, it was found that a vector space model approach outperformed probabilistic reliability based inferences. An evaluation of the approach is provided based on therapeutic interventions catalogued in both ClinicalTrials.gov and NDF-RT. The promising findings from this feasibility study highlight the challenges and applicability of concept-based analytic strategies for distilling phyto-therapeutic knowledge from text based knowledge sources like MEDLINE. PMID:23665360
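
    A vector space model of the kind mentioned above represents each plant and each disease as a weighted vector of concepts and ranks plants by cosine similarity to the disease. The sketch below illustrates the inferred (shared signs and symptoms) case; the concept names, weights and plant names are hypothetical, and the weighting scheme actually used in the study is not given in the abstract.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_plants(disease_vec, plant_vecs):
    """Return (score, plant) pairs, most similar plant first."""
    scored = ((cosine(disease_vec, vec), name) for name, vec in plant_vecs.items())
    return sorted(scored, reverse=True)


# Hypothetical concept weights, e.g. co-occurrence counts with sign/symptom concepts
disease = {"fever": 2.0, "cough": 1.0}
plants = {
    "plant_a": {"fever": 2.0, "cough": 1.0},
    "plant_b": {"rash": 3.0},
}
for score, name in rank_plants(disease, plants):
    print(name, round(score, 3))
```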

  10. Mining and biodiversity offsets: a transparent and science-based approach to measure "no-net-loss".

    PubMed

    Virah-Sawmy, Malika; Ebeling, Johannes; Taplin, Roslyn

    2014-10-01

    Mining and associated infrastructure developments can present themselves as economic opportunities that are difficult to forego for developing and industrialised countries alike. Almost inevitably, however, they lead to biodiversity loss. This trade-off can be greatest in economically poor but highly biodiverse regions. Biodiversity offsets have, therefore, increasingly been promoted as a mechanism to help achieve both the aims of development and biodiversity conservation. Accordingly, this mechanism is emerging as a key tool for multinational mining companies to demonstrate good environmental stewardship. Relying on offsets to achieve "no-net-loss" of biodiversity, however, requires certainty in their ecological integrity where they are used to sanction habitat destruction. Here, we discuss real-world practices in biodiversity offsetting by assessing how well some leading initiatives internationally integrate critical aspects of biodiversity attributes, net loss accounting and project management. With the aim of improving, rather than merely critiquing the approach, we analyse different aspects of biodiversity offsetting. Further, we analyse the potential pitfalls of developing counterfactual scenarios of biodiversity loss or gains in a project's absence. In this, we draw on insights from experience with carbon offsetting. This informs our discussion of realistic projections of project effectiveness and permanence of benefits to ensure no net losses, and the risk of displacing, rather than avoiding biodiversity losses ("leakage"). We show that the most prominent existing biodiversity offset initiatives employ broad and somewhat arbitrary parameters to measure habitat value and do not sufficiently consider real-world challenges in compensating losses in an effective and lasting manner. We propose a more transparent and science-based approach, supported with a new formula, to help design biodiversity offsets to realise their potential in enabling more responsible

  11. Identifying Inhibitors of Epithelial-Mesenchymal Transition by Connectivity-Map Based Systems Approach

    PubMed Central

    Reka, Ajaya Kumar; Kuick, Rork; Kurapati, Himabindu; Standiford, Theodore J.; Omenn, Gilbert S.; Keshamouni, Venkateshwar G.

    2011-01-01

    Background: Acquisition of a mesenchymal phenotype by epithelial cells by means of epithelial-mesenchymal transition (EMT) is considered an early event in the multi-step process of tumor metastasis. Therefore, inhibition of EMT might be a rational strategy to prevent metastasis. Methods: Utilizing the global gene expression profile from a cell culture model of TGF-β-induced EMT, we identified potential EMT inhibitors. We used a publicly available database (www.broad.mit.edu/cmap) comprising gene expression profiles obtained from multiple different cell lines in response to various drugs to derive negative correlations to the EMT gene expression profile using Connectivity Map (C-Map), a pattern matching tool. Results: Experimental validation of the identified compounds showed rapamycin as a novel inhibitor of TGF-β signaling along with 17-AAG, a known modulator of the TGF-β pathway. Both of these compounds completely blocked EMT and the associated migratory and invasive phenotype. The other identified compound, LY294002, demonstrated a selective inhibition of mesenchymal markers, cell migration and invasion, without affecting the loss of E-cadherin expression or Smad phosphorylation. Conclusions: Collectively, our data reveal that rapamycin is a novel modulator of TGF-β signaling and, along with 17-AAG and LY294002, could be used as a therapeutic agent for inhibiting EMT. Also, this analysis demonstrates the potential of a systems approach in identifying novel modulators of a complex biological process. PMID:21964532
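
    The idea of searching for drugs whose expression profiles are negatively correlated with a disease signature can be illustrated with a small sketch. It uses a plain Pearson correlation as a simplified stand-in; the Connectivity Map tool itself uses its own rank-based pattern-matching score, which is not reproduced here. Gene names, drug names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_reversers(signature, drug_profiles):
    """Sort drugs so that the most negatively correlated profile comes first.

    signature: dict gene -> expression change in the disease model.
    drug_profiles: dict drug -> dict gene -> expression change under the drug.
    """
    genes = sorted(signature)
    query = [signature[g] for g in genes]
    scores = {
        drug: pearson(query, [profile[g] for g in genes])
        for drug, profile in drug_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical log fold changes
emt_signature = {"g1": 2.0, "g2": -1.0, "g3": 0.5}
profiles = {
    "drug_A": {"g1": -2.0, "g2": 1.0, "g3": -0.5},  # opposes the signature
    "drug_B": {"g1": 2.0, "g2": -1.0, "g3": 0.5},   # mimics the signature
}
print(rank_reversers(emt_signature, profiles))
```

    Drugs at the top of such a list are only hypotheses; in the study the candidates were then validated experimentally.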

  12. An Integrated Human/Murine Transcriptome and Pathway Approach To Identify Prenatal Treatments For Down Syndrome.

    PubMed

    Guedj, Faycal; Pennings, Jeroen LA; Massingham, Lauren J; Wick, Heather C; Siegel, Ashley E; Tantravahi, Umadevi; Bianchi, Diana W

    2016-01-01

    Anatomical and functional brain abnormalities begin during fetal life in Down syndrome (DS). We hypothesize that novel prenatal treatments can be identified by targeting signaling pathways that are consistently perturbed in cell types/tissues obtained from human fetuses with DS and mouse embryos. We analyzed transcriptome data from fetuses with trisomy 21, age and sex-matched euploid controls, and embryonic day 15.5 forebrains from Ts1Cje, Ts65Dn, and Dp16 mice. The new datasets were compared to other publicly available datasets from humans with DS. We used the human Connectivity Map (CMap) database and created a murine adaptation to identify FDA-approved drugs that can rescue affected pathways. USP16 and TTC3 were dysregulated in all affected human cells and two mouse models. DS-associated pathway abnormalities were either the result of gene dosage specific effects or the consequence of a global cell stress response with activation of compensatory mechanisms. CMap analyses identified 56 molecules with high predictive scores to rescue abnormal gene expression in both species. Our novel integrated human/murine systems biology approach identified commonly dysregulated genes and pathways. This can help to prioritize therapeutic molecules on which to further test safety and efficacy. Additional studies in human cells are ongoing prior to pre-clinical prenatal treatment in mice. PMID:27586445

  13. An Integrated Human/Murine Transcriptome and Pathway Approach To Identify Prenatal Treatments For Down Syndrome

    PubMed Central

    Guedj, Faycal; Pennings, Jeroen LA; Massingham, Lauren J.; Wick, Heather C.; Siegel, Ashley E.; Tantravahi, Umadevi; Bianchi, Diana W.

    2016-01-01

    Anatomical and functional brain abnormalities begin during fetal life in Down syndrome (DS). We hypothesize that novel prenatal treatments can be identified by targeting signaling pathways that are consistently perturbed in cell types/tissues obtained from human fetuses with DS and mouse embryos. We analyzed transcriptome data from fetuses with trisomy 21, age and sex-matched euploid controls, and embryonic day 15.5 forebrains from Ts1Cje, Ts65Dn, and Dp16 mice. The new datasets were compared to other publicly available datasets from humans with DS. We used the human Connectivity Map (CMap) database and created a murine adaptation to identify FDA-approved drugs that can rescue affected pathways. USP16 and TTC3 were dysregulated in all affected human cells and two mouse models. DS-associated pathway abnormalities were either the result of gene dosage specific effects or the consequence of a global cell stress response with activation of compensatory mechanisms. CMap analyses identified 56 molecules with high predictive scores to rescue abnormal gene expression in both species. Our novel integrated human/murine systems biology approach identified commonly dysregulated genes and pathways. This can help to prioritize therapeutic molecules on which to further test safety and efficacy. Additional studies in human cells are ongoing prior to pre-clinical prenatal treatment in mice. PMID:27586445

  14. A Multiple-Tracer Approach for Identifying Sewage Sources to an Urban Stream System

    USGS Publications Warehouse

    Hyer, Kenneth Edward

    2007-01-01

    The presence of human-derived fecal coliform bacteria (sewage) in streams and rivers is recognized as a human health hazard. The source of these human-derived bacteria, however, is often difficult to identify and eliminate, because sewage can be delivered to streams through a variety of mechanisms, such as leaking sanitary sewers or private lateral lines, cross-connected pipes, straight pipes, sewer-line overflows, illicit dumping of septic waste, and vagrancy. A multiple-tracer study was conducted to identify site-specific sources of sewage in Accotink Creek, an urban stream in Fairfax County, Virginia, that is listed on the Commonwealth's priority list of impaired streams for violations of the fecal coliform bacteria standard. Beyond developing this multiple-tracer approach for locating sources of sewage inputs to Accotink Creek, the second objective of the study was to demonstrate how the multiple-tracer approach can be applied to other streams affected by sewage sources. The tracers used in this study were separated into indicator tracers, which are relatively simple and inexpensive to apply, and confirmatory tracers, which are relatively difficult and expensive to analyze. Indicator tracers include fecal coliform bacteria, surfactants, boron, chloride, chloride/bromide ratio, specific conductance, dissolved oxygen, turbidity, and water temperature. Confirmatory tracers include 13 organic compounds that are associated with human waste, including caffeine, cotinine, triclosan, a number of detergent metabolites, several fragrances, and several plasticizers. To identify sources of sewage to Accotink Creek, a detailed investigation of the Accotink Creek main channel, tributaries, and flowing storm drains was undertaken from 2001 to 2004. Sampling was conducted in a series of eight synoptic sampling events, each of which began at the most downstream site and extended upstream through the watershed and into the headwaters of each tributary. Using the synoptic

  15. Demonstrating a Market-Based Approach to the Reclamation of Mined Lands in West Virginia

    SciTech Connect

    John W. Goodrich-Mahoney; Paul Ziemkiewicz

    2006-07-19

    This is the third quarter progress report of Phase II of a three-phase project to develop and evaluate the efficacy of developing multiple environmental market trading credits on a partially reclaimed surface mined site near Valley Point, Preston County, WV. Construction of the passive acid mine drainage (AMD) treatment system was completed but several modifications from the original design had to be made following the land survey and during construction to compensate for unforeseen circumstances. We continued to collect baseline quality data from the Conner Run AMD seeps to confirm the conceptual and final design for the passive AMD treatment system.

  16. Biomarker Identification Using Text Mining

    PubMed Central

    Li, Hui; Liu, Chunmei

    2012-01-01

    Identifying molecular biomarkers has become one of the important tasks for scientists to assess the different phenotypic states of cells or organisms correlated to the genotypes of diseases from large-scale biological data. In this paper, we propose a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary, and then we use a finite state machine to identify the biomarkers. Our method of text mining provides a highly reliable approach to discover the biomarkers in the PubMed database. PMID:23197989
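    A dictionary-driven finite state machine of the general kind mentioned above might look like the following minimal sketch. The abstract does not describe the authors' dictionary or state machine, so everything here is an assumption: the dictionary terms are hypothetical, and the machine is a simple token-level trie in which each state is a prefix of a dictionary term.

```python
import re

ACCEPT = "<accept>"  # marker key; the tokenizer never produces '<' or '>'

def build_fsm(terms):
    """Build a token-level state machine (a trie) from dictionary terms."""
    root = {}
    for term in terms:
        state = root
        for token in term.lower().split():
            state = state.setdefault(token, {})
        state[ACCEPT] = term  # reaching this state means the full term matched
    return root

def find_terms(text, fsm):
    """Scan the text left to right and return the longest dictionary matches."""
    tokens = re.findall(r"[A-Za-z0-9\-]+", text.lower())
    hits = []
    i = 0
    while i < len(tokens):
        state, j, last = fsm, i, None
        # Follow transitions while the next token is allowed in this state.
        while j < len(tokens) and tokens[j] in state:
            state = state[tokens[j]]
            j += 1
            if ACCEPT in state:
                last = (state[ACCEPT], j)
        if last:
            hits.append(last[0])
            i = last[1]  # resume after the matched term
        else:
            i += 1
    return hits

# Hypothetical dictionary and sentence, for illustration only.
fsm = build_fsm(["PSA", "HER2", "C-reactive protein"])
sentence = "Elevated C-reactive protein and PSA levels were reported; HER2 status was unknown."
found = find_terms(sentence, fsm)
print(found)
```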

  17. Approach for Identifying Human Leukocyte Antigen (HLA)-DR Bound Peptides from Scarce Clinical Samples.

    PubMed

    Heyder, Tina; Kohler, Maxie; Tarasova, Nataliya K; Haag, Sabrina; Rutishauser, Dorothea; Rivera, Natalia V; Sandin, Charlotta; Mia, Sohel; Malmström, Vivianne; Wheelock, Åsa M; Wahlström, Jan; Holmdahl, Rikard; Eklund, Anders; Zubarev, Roman A; Grunewald, Johan; Ytterberg, A Jimmy

    2016-09-01

    Immune-mediated diseases strongly associating with human leukocyte antigen (HLA) alleles are likely linked to specific antigens. These antigens are presented to T cells in the form of peptides bound to HLA molecules on antigen presenting cells, e.g. dendritic cells, macrophages or B cells. The identification of HLA-DR-bound peptides presents a valuable tool to investigate the human immunopeptidome. The lung is likely a key player in the activation of potentially auto-aggressive T cells prior to entering target tissues and inducing autoimmune disease. This makes the lung of exceptional interest and presents an ideal paradigm to study the human immunopeptidome and to identify antigenic peptides. Our previous investigation of HLA-DR peptide presentation in the lung required high numbers of cells (800 × 10^6 bronchoalveolar lavage (BAL) cells). Because BAL from healthy nonsmokers typically contains 10-15 × 10^6 cells, there is a need for a highly sensitive approach to study immunopeptides in the lungs of individual patients and controls. In this work, we analyzed the HLA-DR immunopeptidome in the lung by an optimized methodology to identify HLA-DR-bound peptides from low cell numbers. We used an Epstein-Barr Virus (EBV) immortalized B cell line and bronchoalveolar lavage (BAL) cells obtained from patients with sarcoidosis, an inflammatory T cell driven disease mainly occurring in the lung. Specifically, membrane complexes were isolated prior to immunoprecipitation, eluted peptides were identified by nanoLC-MS/MS and processed using the in-house developed ClusterMHCII software. With the optimized procedure we were able to identify peptides from 10 × 10^6 cells, which on average correspond to 10.9 peptides/million cells in EBV-B cells and 9.4 peptides/million cells in BAL cells. This work presents an optimized approach designed to identify HLA-DR-bound peptides from low numbers of cells, enabling the investigation of the BAL immunopeptidome from individual patients

  18. A Mutant Library Approach to Identify Improved Meningococcal Factor H Binding Protein Vaccine Antigens

    PubMed Central

    Konar, Monica; Rossi, Raffaella; Walter, Helen; Pajon, Rolando; Beernink, Peter T.

    2015-01-01

    Factor H binding protein (FHbp) is a virulence factor used by meningococci to evade the host complement system. FHbp elicits bactericidal antibodies in humans and is part of two recently licensed vaccines. Using human complement Factor H (FH) transgenic mice, we previously showed that binding of FH decreased the protective antibody responses to FHbp vaccination. Therefore, in the present study we devised a library-based method to identify mutant FHbp antigens with very low binding of FH. Using an FHbp sequence variant in one of the two licensed vaccines, we displayed an error-prone PCR mutant FHbp library on the surface of Escherichia coli. We used fluorescence-activated cell sorting to isolate FHbp mutants with very low binding of human FH and preserved binding of control anti-FHbp monoclonal antibodies. We sequenced the gene encoding FHbp from selected clones and introduced the mutations into a soluble FHbp construct. Using this approach, we identified several new mutant FHbp vaccine antigens that had very low binding of FH as measured by ELISA and surface plasmon resonance. The new mutant FHbp antigens elicited protective antibody responses in human FH transgenic mice that were up to 20-fold higher than those elicited by the wild-type FHbp antigen. This approach offers the potential to discover mutant antigens that might not be predictable even with protein structural information and potentially can be applied to other microbial vaccine antigens that bind host proteins. PMID:26057742

  19. A Mutant Library Approach to Identify Improved Meningococcal Factor H Binding Protein Vaccine Antigens.

    PubMed

    Konar, Monica; Rossi, Raffaella; Walter, Helen; Pajon, Rolando; Beernink, Peter T

    2015-01-01

    Factor H binding protein (FHbp) is a virulence factor used by meningococci to evade the host complement system. FHbp elicits bactericidal antibodies in humans and is part of two recently licensed vaccines. Using human complement Factor H (FH) transgenic mice, we previously showed that binding of FH decreased the protective antibody responses to FHbp vaccination. Therefore, in the present study we devised a library-based method to identify mutant FHbp antigens with very low binding of FH. Using an FHbp sequence variant in one of the two licensed vaccines, we displayed an error-prone PCR mutant FHbp library on the surface of Escherichia coli. We used fluorescence-activated cell sorting to isolate FHbp mutants with very low binding of human FH and preserved binding of control anti-FHbp monoclonal antibodies. We sequenced the gene encoding FHbp from selected clones and introduced the mutations into a soluble FHbp construct. Using this approach, we identified several new mutant FHbp vaccine antigens that had very low binding of FH as measured by ELISA and surface plasmon resonance. The new mutant FHbp antigens elicited protective antibody responses in human FH transgenic mice that were up to 20-fold higher than those elicited by the wild-type FHbp antigen. This approach offers the potential to discover mutant antigens that might not be predictable even with protein structural information and potentially can be applied to other microbial vaccine antigens that bind host proteins. PMID:26057742

  20. A novel integrative approach to identify lncRNAs associated with the survival of melanoma patients.

    PubMed

    Guo, Linna; Yao, Lijie; Jiang, Yang

    2016-07-10

    Long non-coding RNAs (lncRNAs) play important roles in diagnosis and prognosis of human cancers. With the development of microarray and RNA-seq, gene expression has been measured in more and more tumor types for identification of prognostic markers. However, lncRNA expression profiles of tumor patients with follow-up information are rare. In this study, we developed a novel, simple computational approach, which does not use lncRNA expression, to identify lncRNAs associated with the survival of melanoma patients through integrating gene expression and lncRNA-target networks. First, we calculated the significance of associations between gene expression and patients' survival. Second, we constructed the experimentally validated lncRNA-target gene networks. Next, the significance of each lncRNA was obtained by combining the p-values of its neighbor genes. Finally, we identified 15 lncRNAs that were significantly associated with the survival of melanoma patients (p<0.05), which were supported by functional analysis and literature review. Collectively, this study provides an effective approach to predict the lncRNA signatures for outcomes of tumor patients without lncRNA expression profiles. PMID:27016304
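    The step of combining neighbor-gene p-values into one lncRNA-level p-value can be sketched as follows. The abstract does not say which combination rule was used; Fisher's method is assumed here purely for illustration, and the lncRNA names, gene names and p-values are hypothetical. Fisher's statistic X = -2 Σ ln p follows a chi-square distribution with 2k degrees of freedom under the null hypothesis, and for even degrees of freedom its tail probability has the closed form used below.

```python
from math import exp, factorial, log

def fisher_combine(p_values):
    """Combine independent p-values (each > 0) with Fisher's method.

    For k p-values, X = -2 * sum(ln p) ~ chi-square with 2k degrees of freedom,
    and P(chi2_{2k} > X) = exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # this is X / 2
    return exp(-half_stat) * sum(half_stat ** i / factorial(i) for i in range(k))

# Hypothetical survival-association p-values for individual genes.
gene_p = {"GENE1": 0.01, "GENE2": 0.02, "GENE3": 0.04, "GENE4": 0.5, "GENE5": 0.5}

# Hypothetical lncRNA-target network: each lncRNA maps to its neighbor genes.
lncrna_targets = {
    "lncRNA_X": ["GENE1", "GENE2", "GENE3"],
    "lncRNA_Y": ["GENE4", "GENE5"],
}

lncrna_p = {
    lnc: fisher_combine([gene_p[g] for g in genes])
    for lnc, genes in lncrna_targets.items()
}
significant = sorted(lnc for lnc, p in lncrna_p.items() if p < 0.05)
print(lncrna_p, significant)
```

    Fisher's method assumes the combined p-values are independent, which neighboring genes in a regulatory network may not be, so a real analysis might prefer a rule that tolerates correlation.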

  1. Identifying locations of recent TB transmission in rural Uganda: a multidisciplinary approach

    PubMed Central

    Chamie, Gabriel; Wandera, Bonnie; Marquez, Carina; Kato-Maeda, Midori; Kamya, Moses R.; Havlir, Diane V.; Charlebois, Edwin D.

    2015-01-01

    Objectives: Targeting high TB transmission sites may offer a novel approach to TB prevention in sub-Saharan Africa. We sought to characterize TB transmission sites in a rural Ugandan township. Methods: We recruited adults starting TB treatment in Tororo, Uganda over one year. 54 TB cases provided names of frequent contacts, sites of residence, health care, work and social activities, and two sputum samples. Mycobacterium tuberculosis (MTB) culture-positive specimens underwent spoligotyping to identify strains with shared genotypes. We visualized TB case social networks, and obtained, mapped and geo-coded global positioning system measures for every location that cases reported frequenting one month before treatment. Locations of spatial overlap among genotype-clustered cases were considered potential transmission sites. Results: Six distinct genotypic clusters were identified involving 21/33 (64%) MTB culture-positive, genotyped cases; none shared a home. Although 18/54 (33%) TB cases shared social network ties, none of the genotype-clustered cases shared social ties. Using spatial analysis, we identified potential sites of within-cluster TB transmission for five of six genotypic clusters. All sites but one were health care and social venues, including sites of drinking, worship and marketplaces. Cases reported spending the largest proportion of pre-treatment person-time (22.4%) at drinking venues. Conclusions: Using molecular epidemiology, geospatial and social network data from adult TB cases identified at clinics, we quantified person-time spent at high-risk locations across a rural Ugandan community, and determined the most likely sites of recent TB transmission to be health care and social venues. These sites may not have been identified using contact investigation alone. PMID:25583212
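    The rule that locations shared by cases with the same genotype are potential transmission sites can be sketched in a simplified form. The study worked with geo-coded GPS measurements; this sketch assumes, for illustration only, that each location has already been reduced to a named site identifier. All case records and site names are hypothetical.

```python
from collections import Counter, defaultdict

def shared_sites(cases):
    """Return, per genotype cluster, the sites visited by two or more cluster members.

    cases: list of dicts with keys 'id', 'genotype' and 'sites' (a set of site ids).
    Genotypes seen in only one case are not clusters and are skipped.
    """
    by_genotype = defaultdict(list)
    for case in cases:
        by_genotype[case["genotype"]].append(case)

    overlaps = {}
    for genotype, members in by_genotype.items():
        if len(members) < 2:
            continue
        visits = Counter(site for case in members for site in case["sites"])
        overlaps[genotype] = sorted(site for site, n in visits.items() if n >= 2)
    return overlaps

# Hypothetical cases: g1 and g3 are clusters, g2 is a unique genotype.
cases = [
    {"id": "A", "genotype": "g1", "sites": {"bar_1", "clinic"}},
    {"id": "B", "genotype": "g1", "sites": {"bar_1", "market"}},
    {"id": "C", "genotype": "g2", "sites": {"church"}},
    {"id": "D", "genotype": "g3", "sites": {"clinic"}},
    {"id": "E", "genotype": "g3", "sites": {"market"}},
]

overlaps = shared_sites(cases)
print(overlaps)
```

    A cluster with an empty list, like `g3` here, corresponds to the one cluster in the abstract for which no potential within-cluster site was found.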

  2. Deformation Prediction and Geometrical Modeling of Head and Neck Cancer Tumor: A Data Mining Approach

    NASA Astrophysics Data System (ADS)

    Azimi, Maryam

    Radiation therapy has been used in the treatment of cancer tumors for several years and many cancer patients receive radiotherapy. It may be used as primary therapy or in combination with surgery or other kinds of therapy such as chemotherapy, hormone therapy or some mixture of the three. The treatment objective is to destroy cancer cells or shrink the tumor by planning an adequate radiation dose to the desired target without damaging the normal tissues. By using the pre-treatment Computed Tomography (CT) images, most of the radiotherapy planning systems design the target and assume that the size of the tumor will not change throughout the treatment course, which takes 5 to 7 weeks. Based on this assumption, the total amount of radiation is planned and fractionated for the daily dose required to be delivered to the patient's body. However, this assumption is flawed because the patients receiving radiotherapy have marked changes in tumor geometry during the treatment period. Therefore, there is a critical need to understand the changes of the tumor shape and size over time during the course of radiotherapy in order to prevent significant effects of inaccuracy in the planning. In this research, a methodology is proposed in order to monitor and predict daily (fraction day) tumor volume and surface changes of head and neck cancer tumors during the entire treatment period. In the proposed method, geometrical modeling and data mining techniques will be used rather than repetitive CT scan data to predict the tumor deformation for radiation planning. Clinical patient data were obtained from the University of Texas-MD Anderson Cancer Center (MDACC). In the first step, by using CT scan data, the tumor's progressive geometric changes during the treatment period are quantified. The next step relates to using regression analysis in order to develop predictive models for tumor geometry based on the geometric analysis results and the patients' selected attributes (age, weight

  3. Early Prediction of Students' Grade Point Averages at Graduation: A Data Mining Approach

    ERIC Educational Resources Information Center

    Tekin, Ahmet

    2014-01-01

    Problem Statement: There has recently been interest in educational databases containing a variety of valuable but sometimes hidden data that can be used to help less successful students to improve their academic performance. The extraction of hidden information from these databases often implements aspects of the educational data mining (EDM)…

  4. Data mining for water resource management part 2 - methods and approaches to solving contemporary problems

    USGS Publications Warehouse

    Roehl, Edwin A.; Conrads, Paul A.

    2010-01-01

    This is the second of two papers that describe how data mining can aid natural-resource managers with the difficult problem of controlling the interactions between hydrologic and man-made systems. Data mining is a new science that assists scientists in converting large databases into knowledge, and is uniquely able to leverage the large amounts of real-time, multivariate data now being collected for hydrologic systems. Part 1 gives a high-level overview of data mining, and describes several applications that have addressed major water resource issues in South Carolina. This Part 2 paper describes how various data mining methods are integrated to produce predictive models for controlling surface- and groundwater hydraulics and quality. The methods include:

    - signal processing to remove noise and decompose complex signals into simpler components;
    - time series clustering that optimally groups hundreds of signals into "classes" that behave similarly for data reduction and (or) divide-and-conquer problem solving;
    - classification which optimally matches new data to behavioral classes;
    - artificial neural networks which optimally fit multivariate data to create predictive models;
    - model response surface visualization that greatly aids in understanding data and physical processes; and,
    - decision support systems that integrate data, models, and graphics into a single package that is easy to use.

  5. Identifying functional reorganization of spelling networks: an individual peak probability comparison approach

    PubMed Central

    Purcell, Jeremy J.; Rapp, Brenda

    2013-01-01

    Previous research has shown that damage to the neural substrates of orthographic processing can lead to functional reorganization during reading (Tsapkini et al., 2011); in this research we ask if the same is true for spelling. To examine the functional reorganization of spelling networks we present a novel three-stage Individual Peak Probability Comparison (IPPC) analysis approach for comparing the activation patterns obtained during fMRI of spelling in a single brain-damaged individual with dysgraphia to those obtained in a set of non-impaired control participants. The first analysis stage characterizes the convergence in activations across non-impaired control participants by applying a technique typically used for characterizing activations across studies: Activation Likelihood Estimate (ALE) (Turkeltaub et al., 2002). This method was used to identify locations that have a high likelihood of yielding activation peaks in the non-impaired participants. The second stage provides a characterization of the degree to which the brain-damaged individual's activations correspond to the group pattern identified in Stage 1. This involves performing a Mahalanobis distance statistics analysis (Tsapkini et al., 2011) that compares each of a control group's peak activation locations to the nearest peak generated by the brain-damaged individual. The third stage evaluates the extent to which the brain-damaged individual's peaks are atypical relative to the range of individual variation among the control participants. This IPPC analysis allows for a quantifiable, statistically sound method for comparing an individual's activation pattern to the patterns observed in a control group and, thus, provides a valuable tool for identifying functional reorganization in a brain-damaged individual with impaired spelling. Furthermore, this approach can be applied more generally to compare any individual's activation pattern with that of a set of other individuals. PMID:24399981
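    The Stage 2 comparison, in which each control-group peak is compared with the nearest peak of the brain-damaged individual using a Mahalanobis distance, can be sketched as below. Several details are assumptions not given in the abstract: "nearest" is taken to mean nearest in Euclidean distance, the covariance matrix is taken to describe the spread of control participants' peaks around the group peak, and all coordinates (in mm) are hypothetical.

```python
from math import dist, sqrt

def solve(matrix, vector):
    """Solve matrix @ x = vector by Gauss-Jordan elimination with partial pivoting."""
    n = len(vector)
    aug = [list(row) + [v] for row, v in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def mahalanobis(point, center, cov):
    """sqrt(d^T cov^{-1} d) with d = point - center, without forming the inverse."""
    d = [p - c for p, c in zip(point, center)]
    y = solve(cov, d)  # y = cov^{-1} d
    return sqrt(sum(a * b for a, b in zip(d, y)))

def nearest_peak_distance(control_peak, cov, patient_peaks):
    """Pick the patient peak closest to the control peak and score it."""
    nearest = min(patient_peaks, key=lambda p: dist(p, control_peak))
    return nearest, mahalanobis(nearest, control_peak, cov)

# Hypothetical control-group peak, its covariance, and two patient peaks.
control_peak = (0.0, 0.0, 0.0)
cov = [[4.0, 0.0, 0.0],
       [0.0, 9.0, 0.0],
       [0.0, 0.0, 16.0]]
patient_peaks = [(2.0, 3.0, 4.0), (20.0, 20.0, 20.0)]

nearest, distance = nearest_peak_distance(control_peak, cov, patient_peaks)
print(nearest, round(distance, 3))
```

    Unlike a plain Euclidean distance, this score shrinks along directions in which control peaks vary a lot, so a patient peak is flagged as atypical only when it falls outside the normal range of variation.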

  6. Identifying Quality Indicators Used by Patients to Choose Secondary Health Care Providers: A Mixed Methods Approach

    PubMed Central

    Zaman, Saman Sara; Kahlon, Gurnaaz Kaur; Naik, Aditi; Jessel, Amar Singh; Nanavati, Niraj; Shah, Akash; Cox, Benita; Darzi, Ara

    2015-01-01

    Background: Patients in health systems across the world can now choose between different health care providers. Patients are increasingly using websites and apps to compare the quality of health care services available in order to make a choice of provider. In keeping with many patient-facing platforms, most services currently providing comparative information on different providers do not take account of end-user requirements or the available evidence base. Objective: To investigate what factors were considered most important when choosing nonemergency secondary health care providers in the United Kingdom with the purpose of translating these insights into a ratings platform delivered through a consumer mHealth app. Methods: A mixed methods approach was used to identify key indicators incorporating a literature review to identify and categorize existing quality indicators, a questionnaire survey to formulate a ranked list of performance indicators, and focus groups to explore rationales behind the rankings. Findings from qualitative and quantitative methodologies were mapped onto each other under the four categories identified by the literature review. Results: Quality indicators were divided into four categories. Hospital access was the least important category. The mean differences between the other three categories, hospital statistics, hospital staff, and hospital facilities, were not statistically significant. Staff competence was the most important indicator in the hospital staff category; cleanliness and up-to-date facilities were equally important in hospital facilities; ease of travel to the hospital was found to be most important in hospital access. All quality indicators within the hospital statistics category were equally important. Focus groups elaborated that users find it difficult to judge staff competence despite its importance. Conclusions: A mixed methods approach is presented, which supported a patient-centered development and evaluation of a

  7. A recursive network approach can identify constitutive regulatory circuits in gene expression data

    NASA Astrophysics Data System (ADS)

    Blasi, Monica Francesca; Casorelli, Ida; Colosimo, Alfredo; Blasi, Francesco Simone; Bignami, Margherita; Giuliani, Alessandro

    2005-03-01

    The activity of the cell is often coordinated by the organisation of proteins into regulatory circuits that share a common function. Genome-wide expression profiles might contain important information on these circuits. Current approaches for the analysis of gene expression data include clustering the individual expression measurements and relating them to biological functions as well as modelling and simulation of gene regulation processes by additional computer tools. The identification of the regulative programmes from microarray experiments is limited, however, by the intrinsic difficulty of linear methods to detect low-variance signals and by the sensitivity of the different approaches. Here we face the problem of recognising invariant patterns of correlations among gene expression reminiscent of regulation circuits. We demonstrate that a recursive neural network approach can identify genetic regulation circuits from expression data for ribosomal and genome stability genes. The proposed method, by greatly enhancing the sensitivity of microarray studies, allows the identification of important aspects of genetic regulation networks and might be useful for the discrimination of the different players involved in regulation circuits. Our results suggest that the constitutive regulatory networks involved in the generic organisation of the cell display a high degree of clustering depending on a modular architecture.

  8. A Novel Proteomics Approach to Identify SUMOylated Proteins and Their Modification Sites in Human Cells

    PubMed Central

    Galisson, Frederic; Mahrouche, Louiza; Courcelles, Mathieu; Bonneil, Eric; Meloche, Sylvain; Chelbi-Alix, Mounira K.; Thibault, Pierre

    2011-01-01

    The small ubiquitin-related modifier (SUMO) is a small group of proteins that are reversibly attached to protein substrates to modify their functions. The large scale identification of protein SUMOylation and their modification sites in mammalian cells represents a significant challenge because of the relatively small number of in vivo substrates and the dynamic nature of this modification. We report here a novel proteomics approach to selectively enrich and identify SUMO conjugates from human cells. We stably expressed different SUMO paralogs in HEK293 cells, each containing a His6 tag and a strategically located tryptic cleavage site at the C terminus to facilitate the recovery and identification of SUMOylated peptides by affinity enrichment and mass spectrometry. Tryptic peptides with short SUMO remnants offer significant advantages in large scale SUMOylome experiments including the generation of paralog-specific fragment ions following CID and ETD activation, and the identification of modified peptides using conventional database search engines such as Mascot. We identified 205 unique protein substrates together with 17 precise SUMOylation sites present in 12 SUMO protein conjugates including three new sites (Lys-380, Lys-400, and Lys-497) on the protein promyelocytic leukemia. Label-free quantitative proteomics analyses on purified nuclear extracts from untreated and arsenic trioxide-treated cells revealed that all identified SUMOylated sites of promyelocytic leukemia were differentially SUMOylated upon stimulation. PMID:21098080

  9. Whole genome approaches to identify early meiotic gene candidates in cereals.

    PubMed

    Bovill, William D; Deveshwar, Priyanka; Kapoor, Sanjay; Able, Jason A

    2009-05-01

    Early events during meiotic prophase I underpin not only viability but the variation of a species from generation to generation. Understanding and manipulating processes such as chromosome pairing and recombination are integral for improving plant breeding. This study uses comparative genetics, quantitative trait locus (QTL) analysis and a transcriptomics-based approach to identify genes that might have a role in genome-wide recombination control. Comparative genetics and the analysis of the yeast and Arabidopsis sequenced genomes has allowed the identification of early meiotic candidates that are conserved in wheat, rice and barley. Secondly, scoring recombination frequency as a phenotype for QTL analysis across wheat, rice and barley mapping populations has enabled us to identify genomic regions and candidate genes that could be involved in genome-wide recombination. Transcriptome data for candidate genes indicate that they are expressed in meiotic tissues. Candidates identified included a non-annotated expressed protein, a DNA topoisomerase 2-like candidate, RecG, RuvB and RAD54 homologues. PMID:18836753

  10. Integrated genomic approaches identify major pathways and upstream regulators in late onset Alzheimer’s disease

    PubMed Central

    Li, Xinzhong; Long, Jintao; He, Taigang; Belshaw, Robert; Scott, James

    2015-01-01

    Previous studies have evaluated gene expression in Alzheimer’s disease (AD) brains to identify mechanistic processes, but have been limited by the size of the datasets studied. Here we have implemented a novel meta-analysis approach to identify differentially expressed genes (DEGs) in published datasets comprising 450 late onset AD (LOAD) brains and 212 controls. We found 3124 DEGs, many of which were highly correlated with Braak stage and cerebral atrophy. Pathway Analysis revealed the most perturbed pathways to be (a) nitric oxide and reactive oxygen species in macrophages (NOROS), (b) NFkB and (c) mitochondrial dysfunction. NOROS was also up-regulated, and mitochondrial dysfunction down-regulated, in healthy ageing subjects. Upstream regulator analysis predicted the TLR4 ligands, STAT3 and NFKBIA, for activated pathways and RICTOR for mitochondrial genes. Protein-protein interaction network analysis emphasised the role of NFKB; identified a key interaction of CLU with complement; and linked TYROBP, TREM2 and DOK3 to modulation of LPS signalling through TLR4 and to phosphatidylinositol metabolism. We suggest that NEUROD6, ZCCHC17, PPEF1 and MANBAL are potentially implicated in LOAD, with predicted links to calcium signalling and protein mannosylation. Our study demonstrates a highly injurious combination of TLR4-mediated NFKB signalling, NOROS inflammatory pathway activation, and mitochondrial dysfunction in LOAD. PMID:26202100

  11. Selected Approaches for Rational Drug Design and High Throughput Screening to Identify Anti-Cancer Molecules

    PubMed Central

    Hedvat, Michael; Emdad, Luni; Das, Swadesh K.; Kim, Keetae; Dasgupta, Santanu; Thomas, Shibu; Hu, Bin; Zhu, Shan; Dash, Rupesh; Quinn, Bridget A.; Oyesanya, Regina A.; Kegelman, Timothy P.; Sokhi, Upneet K.; Sarkar, Siddik; Erdogan, Eda; Menezes, Mitchell E.; Bhoopathi, Praveen; Wang, Xiang-Yang; Pomper, Martin G.; Wei, Jun; Wu, Bainan; Stebbins, John L.; Diaz, Paul W.; Reed, John C.; Pellecchia, Maurizio; Sarkar, Devanand; Fisher, Paul B.

    2013-01-01

    Structure-based modeling combined with rational drug design, and high throughput screening approaches offer significant potential for identifying and developing lead compounds with therapeutic potential. The present review focuses on these two approaches using explicit examples based on specific derivatives of Gossypol generated through rational design and applications of a cancer-specific-promoter derived from Progression Elevated Gene-3. The Gossypol derivative Sabutoclax (BI-97C1) displays potent anti-tumor activity against a diverse spectrum of human tumors. The model of the docked structure of Gossypol bound to Bcl-XL provided a virtual structure-activity-relationship where appropriate modifications were predicted on a rational basis. These structure-based studies led to the isolation of Sabutoclax, an optically pure isomer of Apogossypol displaying superior efficacy and reduced toxicity. These studies illustrate the power of combining structure-based modeling with rational design to predict appropriate derivatives of lead compounds to be empirically tested and evaluated for bioactivity. Another approach to cancer drug discovery utilizes a cancer-specific promoter as readouts of the transformed state. The promoter region of Progression Elevated Gene-3 is such a promoter with cancer-specific activity. The specificity of this promoter has been exploited as a means of constructing cancer terminator viruses that selectively kill cancer cells and as a systemic imaging modality that specifically visualizes in vivo cancer growth with no background from normal tissues. Screening of small molecule inhibitors that suppress the Progression Elevated Gene-3-promoter may provide relevant lead compounds for cancer therapy that can be combined with further structure-based approaches leading to the development of novel compounds for cancer therapy. PMID:22931411

  12. A Likelihood-Based Approach to Identifying Contaminated Food Products Using Sales Data: Performance and Challenges

    PubMed Central

    Kaufman, James; Lessler, Justin; Harry, April; Edlund, Stefan; Hu, Kun; Douglas, Judith; Thoens, Christian; Appel, Bernd; Käsbohrer, Annemarie; Filter, Matthias

    2014-01-01

    Foodborne disease outbreaks of recent years demonstrate that, due to increasingly interconnected supply chains, these types of crisis situations have the potential to affect thousands of people, leading to significant healthcare costs, loss of revenue for food companies, and, in the worst cases, death. When a disease outbreak is detected, identifying the contaminated food quickly is vital to minimize suffering and limit economic losses. Here we present a likelihood-based approach that has the potential to shorten the time needed to identify possibly contaminated food products, based on food product sales data and the distribution of foodborne illness case reports. Using a real world food sales data set and artificially generated outbreak scenarios, we show that this method performs very well for contamination scenarios originating from a single “guilty” food product. As it is neither always possible nor necessary to identify the single offending product, the method has been extended such that it can be used as a binary classifier. With this extension it is possible to generate a set of potentially “guilty” products that contains the real outbreak source with very high accuracy. Furthermore, we explore the patterns of food distributions that lead to “hard-to-identify” foods, the possibility of identifying these food groups a priori, and the extent to which the likelihood-based method can be used to quantify uncertainty. We find that high spatial correlation of sales data between products may be a useful indicator for “hard-to-identify” products. PMID:24992565
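
    The idea of scoring products by how well their sales distribution explains the case reports can be illustrated with a minimal sketch. It is not the authors' implementation: the multinomial scoring rule, the function name rank_products, the smoothing constant eps, and the toy sales and case figures are all assumptions made for illustration.

```python
import math

def rank_products(sales, cases, eps=1e-9):
    """Rank products by the multinomial log-likelihood of the case counts.

    sales: {product: {region: units sold}}   (hypothetical toy structure)
    cases: {region: number of reported illness cases}
    """
    scores = {}
    for product, by_region in sales.items():
        total = sum(by_region.values())
        loglik = 0.0
        for region, n_cases in cases.items():
            # Share of this product's sales in the region, smoothed so that
            # a region with zero sales does not produce log(0).
            p = (by_region.get(region, 0) + eps) / (total + eps * len(cases))
            loglik += n_cases * math.log(p)
        scores[product] = loglik
    # Highest log-likelihood first: the most plausible source leads the list.
    return sorted(scores, key=scores.get, reverse=True)

# Invented example data: most cases occur where "sprouts" sell best.
sales = {
    "sprouts": {"north": 90, "south": 10},
    "cheese": {"north": 50, "south": 50},
    "salami": {"north": 10, "south": 90},
}
cases = {"north": 18, "south": 2}
ranking = rank_products(sales, cases)
print(ranking)
```

    Taking the top few entries of such a ranking, rather than only the first, corresponds loosely to the set of potentially "guilty" products mentioned in the abstract.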

  13. A Sparse Reconstruction Approach for Identifying Gene Regulatory Networks Using Steady-State Experiment Data

    PubMed Central

    Zhang, Wanhong; Zhou, Tong

    2015-01-01

    Motivation: Identifying gene regulatory networks (GRNs), which consist of a large number of interacting units, has become a problem of paramount importance in systems biology. In many situations, causal interactions among these units must be reconstructed from measured expression data and other a priori information. Though numerous classical methods have been developed to unravel the interactions of GRNs, these methods suffer from either high computational complexity or low estimation accuracy. Note that great similarities exist between identifying the genes that directly regulate a specific gene and sparse vector reconstruction, which often relates to determining the number, location and magnitude of the nonzero entries of an unknown vector by solving an underdetermined system of linear equations y = Φx. Based on these similarities, we propose a novel sparse reconstruction framework to identify the structure of a GRN, so as to increase the accuracy of causal regulation estimates and to reduce their computational complexity. Results: In this paper, a sparse reconstruction framework is proposed on the basis of steady-state experiment data to identify GRN structure. Unlike traditional methods, this approach is well suited to large-scale underdetermined problems in which a sparse vector is inferred. We investigate how to combine noisy steady-state experiment data with a sparse reconstruction algorithm to identify causal relationships. The efficiency of this method is tested on an artificial linear network, a mitogen-activated protein kinase (MAPK) pathway network and the in silico networks of the DREAM challenges. The performance of the suggested approach is compared with two state-of-the-art algorithms, the widely adopted total least-squares (TLS) method and the available results of the DREAM project. Actual results show that, with a lower computational cost, the proposed method can

  14. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach.

    PubMed

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented, through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members. PMID:27611313

  15. Identifying the critical financial ratios for stocks evaluation: A fuzzy delphi approach

    NASA Astrophysics Data System (ADS)

    Mokhtar, Mazura; Shuib, Adibah; Mohamad, Daud

    2014-12-01

    Stocks evaluation has always been an interesting and challenging problem for both researchers and practitioners. Generally, the evaluation can be made based on a set of financial ratios. Nevertheless, there are a variety of financial ratios that can be considered and if all ratios in the set are placed into the evaluation process, data collection would be more difficult and time consuming. Thus, the objective of this paper is to identify the most important financial ratios upon which to focus in order to evaluate the stock's performance. For this purpose, a survey was carried out using an approach which is based on an expert judgement, namely the Fuzzy Delphi Method (FDM). The results of this study indicated that return on equity, return on assets, net profit margin, operating profit margin, earnings per share and debt to equity are the most important ratios.

  16. A new simplex chemometric approach to identify olive oil blends with potentially high traceability.

    PubMed

    Semmar, N; Laroussi-Mezghani, S; Grati-Kamoun, N; Hammami, M; Artaud, J

    2016-10-01

    Olive oil blends (OOBs) are complex matrices combining different cultivars at variable proportions. Although qualitative determinations of OOBs have been subjected to several chemometric works, quantitative evaluations of their contents remain poorly developed because of traceability difficulties concerning co-occurring cultivars. Around this question, we recently published an original simplex approach helping to develop predictive models of the proportions of co-occurring cultivars from chemical profiles of resulting blends (Semmar & Artaud, 2015). Beyond predictive model construction and validation, this paper presents an extension based on prediction errors' analysis to statistically define the blends with the highest predictability among all the possible ones that can be made by mixing cultivars at different proportions. This provides an interesting way to identify a priori labeled commercial products with potentially high traceability taking into account the natural chemical variability of different constitutive cultivars. PMID:27132835

  17. A similarity based approach to identify homogeneous regions for seasonal forecasting

    NASA Astrophysics Data System (ADS)

    Schick, Simon; Rössler, Ole; Weingartner, Rolf

    2015-04-01

    Seasonal runoff forecasting using statistical models is challenged by a large number of candidate predictors and a generally weak predictor-predictand relationship. As the area of the target basin increases, the available data sets often grow as well, reinforcing the predictor selection challenge. We propose an approach which follows the idea of 'divide and conquer' as developed in computational sciences and machine learning: First, the macroscale target basin is partitioned into homogeneous regions using all its gauged mesoscale subbasins. Second, one representative subbasin per homogeneous region is identified, for which models are fitted and applied. Third, the resulting forecasts are combined at the scale of the macroscale target basin. This approach requires a suitable method to identify homogeneous regions and representative subbasins. We suggest a way based on hydrological similarity, as catchment similarity estimated with respect to physiographic-climatic descriptors does not necessarily imply similar runoff response. Each descriptor is derived from daily runoff series and aims to reflect a specific catchment characteristic: autocorrelation coefficient, parameters of a fitted Gamma distribution and low/high flow indices (based on daily runoff values); fluctuation of the standard deviation within the yearly cycle (based on weekly runoff values); dominant harmonics obtained from the discrete Fourier transform (based on monthly runoff values); long term trend (based on yearly runoff values). Where necessary, the runoff series first need to be standardized, aggregated, detrended or deseasonalized. As a preliminary study we present the results of a cluster analysis for the Swiss Rhine River as macroscale target basin, which leads to about 40 mesoscale subbasins with runoff series for the period 1991-2010. Problems we have to address include the choice of a clustering algorithm, the identification of an appropriate number of regions and the selection of representative

  18. Objective Definition of Rosette Shape Variation Using a Combined Computer Vision and Data Mining Approach

    PubMed Central

    Camargo, Anyela; Papadopoulou, Dimitra; Spyropoulou, Zoi; Vlachonasios, Konstantinos; Doonan, John H.; Gay, Alan P.

    2014-01-01

    Computer-vision based measurements of phenotypic variation have implications for crop improvement and food security because they are intrinsically objective. It should be possible therefore to use such approaches to select robust genotypes. However, plants are morphologically complex and identification of meaningful traits from automatically acquired image data is not straightforward. Bespoke algorithms can be designed to capture and/or quantitate specific features but this approach is inflexible and is not generally applicable to a wide range of traits. In this paper, we have used industry-standard computer vision techniques to extract a wide range of features from images of genetically diverse Arabidopsis rosettes growing under non-stimulated conditions, and then used statistical analysis to identify those features that provide good discrimination between ecotypes. This analysis indicates that almost all the observed shape variation can be described by 5 principal components. We describe an easily implemented pipeline including image segmentation, feature extraction and statistical analysis. This pipeline provides a cost-effective and inherently scalable method to parameterise and analyse variation in rosette shape. The acquisition of images does not require any specialised equipment and the computer routines for image processing and data analysis have been implemented using open source software. Source code for data analysis is written using the R package. The equations to calculate image descriptors have been also provided. PMID:24804972

  19. Objective definition of rosette shape variation using a combined computer vision and data mining approach.

    PubMed

    Camargo, Anyela; Papadopoulou, Dimitra; Spyropoulou, Zoi; Vlachonasios, Konstantinos; Doonan, John H; Gay, Alan P

    2014-01-01

    Computer-vision based measurements of phenotypic variation have implications for crop improvement and food security because they are intrinsically objective. It should be possible therefore to use such approaches to select robust genotypes. However, plants are morphologically complex and identification of meaningful traits from automatically acquired image data is not straightforward. Bespoke algorithms can be designed to capture and/or quantitate specific features but this approach is inflexible and is not generally applicable to a wide range of traits. In this paper, we have used industry-standard computer vision techniques to extract a wide range of features from images of genetically diverse Arabidopsis rosettes growing under non-stimulated conditions, and then used statistical analysis to identify those features that provide good discrimination between ecotypes. This analysis indicates that almost all the observed shape variation can be described by 5 principal components. We describe an easily implemented pipeline including image segmentation, feature extraction and statistical analysis. This pipeline provides a cost-effective and inherently scalable method to parameterise and analyse variation in rosette shape. The acquisition of images does not require any specialised equipment and the computer routines for image processing and data analysis have been implemented using open source software. Source code for data analysis is written using the R package. The equations to calculate image descriptors have been also provided. PMID:24804972

  20. A reclamation approach for mined prime farmland by adding organic wastes and lime to the subsoil

    SciTech Connect

    Zhai, Qiang; Barnhisel, R.I.

    1996-12-31

    Surface mined prime farmland may be reclaimed by adding organic wastes and lime to the subsoil, thus improving conditions in the root zone. In this study, sewage sludge, poultry manure, horse bedding, and lime were applied to the subsoil (15-30 cm) during reclamation. Soil properties and plant growth were measured over two years. All organic amendments tended to lower the subsoil bulk density and increase organic matter and total nitrogen. Liming raised exchangeable calcium and slightly increased pH, but decreased exchangeable magnesium and potassium. Corn ear-leaf and forage tissue nitrogen, yields, and nitrogen removal increased in treatments amended with sewage sludge and poultry manure, but not horse bedding. Subsoil application of sewage sludge or poultry manure appears to be a promising method for the reclamation of surface mined prime farmland, based on the improvements observed in the root zone environment.

  1. A novel approach of mining strong jumping emerging patterns based on BSC-tree

    NASA Astrophysics Data System (ADS)

    Liu, Quanzhong; Shi, Peng; Hu, Zhengguo; Zhang, Yang

    2014-03-01

    It is a great challenge to discover strong jumping emerging patterns (SJEPs) from a high-dimensional dataset because of the huge pattern space. In this article, we propose a dynamically growing contrast pattern tree (DGCP-tree) structure to store grown patterns and their path codes arrays with 1-bit counts, which are from the constructed bit string compression tree. A method of mining SJEPs based on DGCP-tree is developed. In order to reduce the pattern search space, we introduce a novel pattern pruning method, which dramatically reduces non-minimal jumping emerging patterns (JEPs) during the mining process. Experiments are performed on three real cancer datasets and three datasets from the University of California, Irvine machine-learning repository. Compared with the well-known CP-tree method, the results show that the proposed method is substantially faster, able to handle higher-dimensional datasets and to prune more non-minimal JEPs.
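
    For orientation, the pattern class being mined can be shown with a brute-force sketch. This is not the DGCP-tree method of the abstract; it is a naive enumeration under the usual definition, in which a jumping emerging pattern is an itemset that occurs in one class and never in the other, and a strong one is additionally minimal and meets a support threshold. The function names, the min_count and max_len parameters, and the toy transactions are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def strong_jeps(positive, negative, min_count=1, max_len=3):
    """Naive search for minimal itemsets present in `positive` only."""
    items = sorted(set().union(*positive))
    found = []
    for k in range(1, max_len + 1):          # shorter itemsets first
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            # A superset of an already found pattern is not minimal.
            if any(prev < candidate for prev in found):
                continue
            if (support(candidate, positive) >= min_count
                    and support(candidate, negative) == 0):
                found.append(candidate)
    return found

# Invented toy data: each transaction is a set of items.
positive = [{"a", "b", "c"}, {"a", "b"}, {"b", "d"}]
negative = [{"a", "c"}, {"b", "c"}, {"c", "d"}]
patterns = strong_jeps(positive, negative)
print([sorted(p) for p in patterns])
```

    The enumeration grows exponentially with the number of items, which is the pattern-space problem that tree-based structures and pruning are meant to address.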

  2. Perspective: NanoMine: A material genome approach for polymer nanocomposites analysis and design

    NASA Astrophysics Data System (ADS)

    Zhao, He; Li, Xiaolin; Zhang, Yichi; Schadler, Linda S.; Chen, Wei; Brinson, L. Catherine

    2016-05-01

    Polymer nanocomposites are a designer class of materials where nanoscale particles, functional chemistry, and polymer resin combine to provide materials with unprecedented combinations of physical properties. In this paper, we introduce NanoMine, a data-driven web-based platform for analysis and design of polymer nanocomposite systems under the material genome concept. This open data resource strives to curate experimental and computational data on nanocomposite processing, structure, and properties, as well as to provide analysis and modeling tools that leverage curated data for material property prediction and design. With a continuously expanding dataset and toolkit, NanoMine encourages community feedback and input to construct a sustainable infrastructure that benefits nanocomposite material research and development.

  3. An integrated functional genomics approach identifies the regulatory network directed by brachyury (T) in chordoma.

    PubMed

    Nelson, Andrew C; Pillay, Nischalan; Henderson, Stephen; Presneau, Nadège; Tirabosco, Roberto; Halai, Dina; Berisha, Fitim; Flicek, Paul; Stemple, Derek L; Stern, Claudio D; Wardle, Fiona C; Flanagan, Adrienne M

    2012-11-01

    Chordoma is a rare malignant tumour of bone, the molecular marker of which is the expression of the transcription factor, brachyury. Having recently demonstrated that silencing brachyury induces growth arrest in a chordoma cell line, we now seek to identify its downstream target genes. Here we use an integrated functional genomics approach involving shRNA-mediated brachyury knockdown, gene expression microarray, ChIP-seq experiments, and bioinformatics analysis to achieve this goal. We confirm that the T-box binding motif of human brachyury is identical to that found in mouse, Xenopus, and zebrafish development, and that brachyury acts primarily as an activator of transcription. Using human chordoma samples for validation purposes, we show that brachyury binds 99 direct targets and indirectly influences the expression of 64 other genes, thereby acting as a master regulator of an elaborate oncogenic transcriptional network encompassing diverse signalling pathways including components of the cell cycle, and extracellular matrix components. Given the wide repertoire of its active binding and the relative specific localization of brachyury to the tumour cells, we propose that an RNA interference-based gene therapy approach is a plausible therapeutic avenue worthy of investigation. PMID:22847733

  4. Membrane Glycoproteins Associated with Breast Tumor Cell Progression Identified by a Lectin Affinity Approach

    PubMed Central

    Wang, Yanfei; Ao, Xiaoping; Vuong, Huy; Konanur, Meghana; Miller, Fred R.; Goodison, Steve; Lubman, David M.

    2008-01-01

    The membrane glycoprotein component of the cellular proteome represents a promising source for potential disease biomarkers and therapeutic targets. Here we describe the development of a method that facilitates the analysis of membrane glycoproteins and apply it to the differential analysis of breast tumor cells with distinct malignant phenotypes. The approach combines two membrane extraction procedures, and enrichment using ConA and WGA lectin affinity columns, prior to digestion and analysis by LC–MS/MS. The glycoproteins are identified and quantified by spectral counting. Although the distribution of glycoprotein expression as a function of MW and pI was very similar between the two related cell lines tested, the approach enabled the identification of several distinct membrane glycoproteins with an expression index correlated with either a precancerous (MCF10AT1), or a malignant, metastatic cellular phenotype (MCF10CA1a). Among the proteins associated with the malignant phenotype, Gamma-glutamyl hydrolase, CD44, Galectin-3-binding protein, and Syndecan-1 protein have been reported as potential biomarkers of breast cancer. PMID:18729497

  5. Identifying Potential Areas for Siting Interim Nuclear Waste Facilities Using Map Algebra and Optimization Approaches

    SciTech Connect

    Omitaomu, Olufemi A; Liu, Cheng; Cetiner, Sacit M; Belles, Randy; Mays, Gary T; Tuttle, Mark A

    2013-01-01

    The renewed interest in siting new nuclear power plants in the United States has brought to center stage the need to site interim facilities for long-term management of spent nuclear fuel (SNF). In this paper, a two-stage approach for identifying potential areas for siting interim SNF facilities is presented. In the first stage, the land area is discretized into grid cells of uniform size (e.g., 100 m x 100 m). For the continental United States, this process resulted in a data matrix of about 700 million cells. Each cell of the matrix is then characterized as a binary decision variable to indicate whether an exclusion criterion is satisfied or not. A binary data matrix is created for each of the 25 siting criteria considered in this study. Using a map algebra approach, cells that satisfy all criteria are clustered and regarded as potential siting areas. In the second stage, an optimization problem is formulated as a p-median problem on a rail network such that the sum of the shortest distances between nuclear power plants with SNF and the potential storage sites from the first stage is minimized. The implications of the obtained results for energy policies are presented and discussed.
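
    The two stages can be illustrated with a small sketch, which is not the authors' code. Three invented 3 x 3 criterion layers stand in for the 25 real ones, a value of 1 is assumed to mean that a cell passes the criterion, Manhattan distance replaces shortest paths on a rail network, and only the p = 1 case of the p-median problem is solved by exhaustive search. All names and data are hypothetical.

```python
def combine_criteria(layers):
    """Map algebra step: cell-wise logical AND of equally sized binary grids."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [int(all(layer[r][c] for layer in layers)) for c in range(cols)]
        for r in range(rows)
    ]

def best_site(plants, candidates, dist):
    """1-median by brute force: candidate with the smallest total distance."""
    return min(candidates, key=lambda s: sum(dist(p, s) for p in plants))

def manhattan(a, b):
    # Stand-in for a shortest-path distance on a rail network (assumption).
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# Invented criterion layers: 1 = cell passes, 0 = cell is excluded.
not_wetland = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
low_population = [[1, 0, 1], [1, 1, 1], [1, 1, 0]]
stable_slope = [[1, 1, 1], [0, 1, 1], [1, 1, 1]]

suitable = combine_criteria([not_wetland, low_population, stable_slope])
candidates = [(r, c) for r, row in enumerate(suitable)
              for c, ok in enumerate(row) if ok]
plants = [(0, 2), (2, 2)]  # hypothetical plant locations as grid cells
site = best_site(plants, candidates, manhattan)
print(suitable, site)
```

    At the scale described in the abstract, the cell-wise AND would be run on array or raster tooling rather than nested lists, and the p-median stage would need a proper solver.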

  6. A non-target approach to identify disinfection byproducts of structurally similar sulfonamide antibiotics.

    PubMed

    Wang, Mian; Helbling, Damian E

    2016-10-01

    There is growing concern over the formation of new types of disinfection byproducts (DBPs) from pharmaceuticals and other emerging contaminants during drinking water production. Free chlorine is a widely used disinfectant that reacts non-selectively with organic molecules to form a variety of byproducts. In this research, we aimed to investigate the DBPs formed from three structurally similar sulfonamide antibiotics (sulfamethoxazole, sulfathiazole, and sulfadimethoxine) to determine how chemical structure influences the types of chlorination reactions observed. We conducted free chlorination experiments and developed a non-target approach to extract masses from the experimental dataset that represent the masses of candidate DBPs. Structures were assigned to the candidate DBPs based on analytical data and knowledge of chlorine chemistry. Confidence levels were assigned to each proposed structure according to conventions in the field. In total, 11, 12, and 15 DBP structures were proposed for sulfamethoxazole, sulfathiazole, and sulfadimethoxine, respectively. The structures of the products suggest a variety of reaction types including chlorine substitution, S-C cleavage, S-N hydrolysis, desulfonation, oxidation/hydroxylation, and conjugation reactions. Some reaction types were common to all of the sulfonamide antibiotics, but unique reaction types were also observed for each sulfonamide antibiotic suggesting that selective prediction of DBP structures of other sulfonamide antibiotics based on chemical structure is unlikely to be possible based on these data alone. This research offers an approach to comprehensively identify DBPs of organic molecules and fills in much needed data on the formation of specific DBPs from three environmentally relevant sulfonamide antibiotics. PMID:27348196

  7. A computational approach to identify genes for functional RNAs in genomic sequences

    PubMed Central

    Carter, Richard J.; Dubchak, Inna; Holbrook, Stephen R.

    2001-01-01

    Currently there is no successful computational approach for identification of genes encoding novel functional RNAs (fRNAs) in genomic sequences. We have developed a machine learning approach using neural networks and support vector machines to extract common features among known RNAs for prediction of new RNA genes in the unannotated regions of prokaryotic and archaeal genomes. The Escherichia coli genome was used for development, but we have applied this method to several other bacterial and archaeal genomes. Networks based on nucleotide composition were 80–90% accurate in jackknife testing experiments for bacteria and 90–99% for hyperthermophilic archaea. We also achieved a significant improvement in accuracy by combining these predictions with those obtained using a second set of parameters consisting of known RNA sequence motifs and the calculated free energy of folding. Several known fRNAs not included in the training datasets were identified as well as several hundred predicted novel RNAs. These studies indicate that there are many unidentified RNAs in simple genomes that can be predicted computationally as a precursor to experimental study. Public access to our RNA gene predictions and an interface for user predictions is available via the web. PMID:11574674

  8. Data mining approach to the evaluation of diagnostic tests in Wilson disease

    NASA Astrophysics Data System (ADS)

    Plutecki, Michal M.; Dądalski, Maciej; Socha, Piotr; Mulawka, Jan J.

    2009-06-01

    The purpose of this paper is to find a new method for evaluating diagnostic tests in Wilson disease that improves on those known so far. To find the most interesting classification models, various data mining techniques were applied to a real set of patients suffering from Wilson disease. It turned out that a combination of two classification algorithms, as implemented in the Weka environment, may significantly increase classification ability.

  9. A neural network approach for identifying particle pitch angle distributions in Van Allen Probes data

    NASA Astrophysics Data System (ADS)

    Souza, V. M.; Vieira, L. E. A.; Medeiros, C.; Da Silva, L. A.; Alves, L. R.; Koga, D.; Sibeck, D. G.; Walsh, B. M.; Kanekal, S. G.; Jauer, P. R.; Rockenbach, M.; Dal Lago, A.; Silveira, M. V. D.; Marchezi, J. P.; Mendes, O.; Gonzalez, W. D.; Baker, D. N.

    2016-04-01

    Analysis of particle pitch angle distributions (PADs) has been used as a means to comprehend a multitude of different physical mechanisms that lead to flux variations in the Van Allen belts and also to particle precipitation into the upper atmosphere. In this work we developed a neural network-based data clustering methodology that automatically identifies distinct PAD types in an unsupervised way using particle flux data. One can promptly identify and locate three well-known PAD types in both time and radial distance, namely, 90° peaked, butterfly, and flattop distributions. In order to illustrate the applicability of our methodology, we used relativistic electron flux data from the whole month of November 2014, acquired from the Relativistic Electron-Proton Telescope instrument on board the Van Allen Probes, but it is emphasized that our approach can also be used with multiplatform spacecraft data. Our PAD classification results are in reasonably good agreement with those obtained by standard statistical fitting algorithms. The proposed methodology has a potential use for Van Allen belt's monitoring.

  10. Integrative screening approach identifies regulators of polyploidization and targets for acute megakaryocytic leukemia

    PubMed Central

    Wen, Qiang; Goldenson, Benjamin; Silver, Serena J.; Schenone, Monica; Dancik, Vladimir; Huang, Zan; Wang, Ling-Zhi; Lewis, Timothy; An, W. Frank; Li, Xiaoyu; Bray, Mark-Anthony; Thiollier, Clarisse; Diebold, Lauren; Gilles, Laure; Vokes, Martha S.; Moore, Christopher B.; Bliss-Moreau, Meghan; VerPlank, Lynn; Tolliday, Nicola J.; Mishra, Rama; Vemula, Sasidhar; Shi, Jianjian; Wei, Lei; Kapur, Reuben; Lopez, Cécile K.; Gerby, Bastien; Ballerini, Paola; Pflumio, Francoise; Gilliland, D. Gary; Goldberg, Liat; Birger, Yehudit; Izraeli, Shai; Gamis, Alan S.; Smith, Franklin O.; Woods, William G.; Taub, Jeffrey; Scherer, Christina A.; Bradner, James; Goh, Boon-Cher; Mercher, Thomas; Carpenter, Anne E.; Gould, Robert J.; Clemons, Paul A.; Carr, Steven A.; Root, David E.; Schreiber, Stuart L.; Stern, Andrew M.; Crispino, John D.

    2012-01-01

    Summary: The mechanism by which cells decide to skip mitosis to become polyploid is largely undefined. Here we used a high-content image-based screen to identify small-molecule probes that induce polyploidization of megakaryocytic leukemia cells and serve as perturbagens to help understand this process. We found that dimethylfasudil (diMF, H-1152P) selectively increased polyploidization, mature cell-surface marker expression, and apoptosis of malignant megakaryocytes. A broadly applicable, highly integrated target identification approach employing proteomic and shRNA screening revealed that a major target of diMF is Aurora A kinase (AURKA), which has not been studied extensively in megakaryocytes. Moreover, we discovered that MLN8237 (Alisertib), a selective inhibitor of AURKA, induced polyploidization and expression of mature megakaryocyte markers in AMKL blasts and displayed potent anti-AMKL activity in vivo. This research provides the rationale to support clinical trials of MLN8237 and other inducers of polyploidization in AMKL. Finally, we have identified five networks of kinases that regulate the switch to polyploidy. PMID:22863010

  11. Multi-compartment approach to identify minimal flow and maximal recreational use of a lowland river

    NASA Astrophysics Data System (ADS)

    Pusch, Martin; Lorenz, Stefan

    2013-04-01

    Most approaches to establishing a minimum flow rate for river sections subject to water abstraction focus on the flow requirements of fish and benthic invertebrates. However, artificial reduction of river flow will always affect additional key ecosystem features, such as sediment properties and the metabolism of matter in these ecosystems, and may even influence adjacent floodplains. Significant effects are therefore to be expected, e.g. on the dissolved oxygen content of river water, on habitat conditions in the benthic zone, and on water levels in the floodplain. We thus chose a multiple compartment method to identify minimum flow requirements in a lowland river in northern Germany (Spree River), selecting the minimal required flow level out of all compartments studied. Results showed that the minimal flow levels necessary to keep key ecosystem features in a 'good' state depended significantly on actual water quality and on river channel morphology. In turn, the water quality of the Spree is potentially influenced by recreational boating activity, which causes mussels to stop filter-feeding and thus impedes self-purification. Disturbance of mussel feeding was shown to depend directly on boat type and speed, with substantial differences among mussel species. Thus, a maximal recreational boating intensity could be derived that does not significantly affect self-purification. We conclude that minimal flow levels should be identified not only based on flow preferences of target species, but also considering channel morphology, ecological functions, and the intensity of other human uses of the river section.

  12. Computational approaches to identifying and characterizing protein binding sites for ligand design.

    PubMed

    Henrich, Stefan; Salo-Ahen, Outi M H; Huang, Bingding; Rippmann, Friedrich F; Cruciani, Gabriele; Wade, Rebecca C

    2010-01-01

    Given the three-dimensional structure of a protein, how can one find the sites where other molecules might bind to it? Do these sites have the properties necessary for high affinity binding? Is this protein a suitable target for drug design? Here, we discuss recent developments in computational methods to address these and related questions. Geometric methods to identify pockets on protein surfaces have been developed over many years but, with new algorithms, their performance is still improving. Simulation methods show promise in accounting for protein conformational variability to identify transient pockets but lack the ease of use of many of the (rigid) shape-based tools. Sequence and structure comparison approaches are benefiting from the constantly increasing size of sequence and structure databases. Energetic methods can aid identification and characterization of binding pockets, and have undergone recent improvements in the treatment of solvation and hydrophobicity. The "druggability" of a binding site is still difficult to predict with an automated procedure. The methodologies available for this purpose range from simple shape and hydrophobicity scores to computationally demanding free energy simulations. PMID:19746440

  13. A regression tree approach to identifying subgroups with differential treatment effects.

    PubMed

    Loh, Wei-Yin; He, Xu; Man, Michael

    2015-05-20

    In the fight against hard-to-treat diseases such as cancer, it is often difficult to discover new treatments that benefit all subjects. For regulatory agency approval, it is more practical to identify subgroups of subjects for whom the treatment has an enhanced effect. Regression trees are natural for this task because they partition the data space. We briefly review existing regression tree algorithms. Then, we introduce three new ones that are practically free of selection bias and are applicable to data from randomized trials with two or more treatments, censored response variables, and missing values in the predictor variables. The algorithms extend the generalized unbiased interaction detection and estimation (GUIDE) approach by using three key ideas: (i) treatment as a linear predictor, (ii) chi-squared tests to detect residual patterns and lack of fit, and (iii) proportional hazards modeling via Poisson regression. Importance scores with thresholds for identifying influential variables are obtained as by-products. A bootstrap technique is used to construct confidence intervals for the treatment effects in each node. The methods are compared using real and simulated data. PMID:25656439
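
    As a rough illustration of why a partition of the data space suits subgroup identification, the sketch below searches a single predictor for the cut point that maximizes the difference in estimated treatment effect between the two resulting subgroups. It is not the GUIDE extension described in the abstract (it has no chi-squared tests, no handling of censoring or missing values, and no bias correction); the function names and the toy data are hypothetical.

```python
from statistics import mean


def treatment_effect(rows):
    """Mean response of treated minus mean response of controls.

    rows: list of (x, treatment, response); returns None if an arm is empty.
    """
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    if not treated or not control:
        return None
    return mean(treated) - mean(control)


def best_split(rows):
    """Return (cut, left_effect, right_effect) for the cut on x that
    maximizes the absolute gap between the two subgroup effects."""
    best = None
    values = sorted({x for x, _, _ in rows})
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2  # candidate cut halfway between neighbours
        left = [r for r in rows if r[0] <= cut]
        right = [r for r in rows if r[0] > cut]
        e_left, e_right = treatment_effect(left), treatment_effect(right)
        if e_left is None or e_right is None:
            continue
        gap = abs(e_left - e_right)
        if best is None or gap > best[0]:
            best = (gap, cut, e_left, e_right)
    return None if best is None else best[1:]


# Hypothetical toy data: the treatment only helps subjects with x > 2.
rows = [
    (1, 0, 1.0), (1, 1, 1.0),
    (2, 0, 1.0), (2, 1, 1.0),
    (3, 0, 1.0), (3, 1, 3.0),
    (4, 0, 1.0), (4, 1, 3.0),
]
cut, e_left, e_right = best_split(rows)
```

    A greedy search like this is prone to the selection bias that the abstract says the new algorithms avoid, which is one reason it should be read only as an intuition aid.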

  14. Calibrated photostimulated luminescence is an effective approach to identify irradiated orange during storage

    NASA Astrophysics Data System (ADS)

    Jo, Yunhee; Sanyal, Bhaskar; Chung, Namhyeok; Lee, Hyun-Gyu; Park, Yunji; Park, Hae-Jun; Kwon, Joong-Ho

    2015-06-01

    Photostimulated luminescence (PSL) has been employed as a fast screening method for various irradiated foods. In this study the potential use of PSL was evaluated to identify oranges irradiated with gamma ray, electron beam and X-ray (0-2 kGy) and stored under different conditions for 6 weeks. The effects of light conditions (natural light, artificial light, and dark) and storage temperatures (4 and 20 °C) on PSL photon counts (PCs) during post-irradiation periods were studied. Non-irradiated samples always showed negative values of PCs, while irradiated oranges exhibited intermediate results after first PSL measurements. However, the irradiated samples had much higher PCs. The PCs of all the samples declined as the storage time increased. Calibrated second PSL measurements showed PSL ratio <10 for the irradiated samples after 3 weeks of irradiation, confirming their irradiation status in all the storage conditions. Calibrated PSL and sample storage in the dark at 4 °C were found to be the most suitable approaches to identify irradiated oranges during storage.

  15. A regression tree approach to identifying subgroups with differential treatment effects

    PubMed Central

    Loh, Wei-Yin; He, Xu; Man, Michael

    2015-01-01

    In the fight against hard-to-treat diseases such as cancer, it is often difficult to discover new treatments that benefit all subjects. For regulatory agency approval, it is more practical to identify subgroups of subjects for whom the treatment has an enhanced effect. Regression trees are natural for this task because they partition the data space. We briefly review existing regression tree algorithms. Then we introduce three new ones that are practically free of selection bias and are applicable to data from randomized trials with two or more treatments, censored response variables, and missing values in the predictor variables. The algorithms extend the GUIDE approach by using three key ideas: (i) treatment as a linear predictor, (ii) chi-squared tests to detect residual patterns and lack of fit, and (iii) proportional hazards modeling via Poisson regression. Importance scores with thresholds for identifying influential variables are obtained as by-products. A bootstrap technique is used to construct confidence intervals for the treatment effects in each node. The methods are compared using real and simulated data. PMID:25656439

  16. An algorithmic calibration approach to identify globally optimal parameters for constraining the DayCent model

    SciTech Connect

    Rafique, Rashid; Kumar, Sandeep; Luo, Yiqi; Kiely, Gerard; Asrar, Ghassem R.

    2015-02-01

    The accurate calibration of complex biogeochemical models is essential for the robust estimation of soil greenhouse gases (GHG) as well as other environmental conditions and parameters that are used in research and policy decisions. DayCent is a popular biogeochemical model used both nationally and internationally for this purpose. Despite DayCent’s popularity, its complex parameter estimation is often based on experts’ knowledge, which is somewhat subjective. In this study we used the inverse modelling parameter estimation software (PEST) to calibrate the DayCent model based on sensitivity and identifiability analysis. Using previously published N2O and crop yield data as a basis of our calibration approach, we found that half of the 140 parameters used in this study were the primary drivers of calibration differences (i.e. the most sensitive) and the remaining parameters could not be identified given the data set and parameter ranges we used in this study. The post-calibration results showed improvement over the pre-calibration parameter set based on a decrease in residual differences of 79% for N2O fluxes and 84% for crop yield, and an increase in coefficient of determination of 63% for N2O fluxes and 72% for corn yield. The results of our study suggest that future studies need to better characterize germination temperature, number of degree-days and temperature dependency of plant growth; these processes were highly sensitive and could not be adequately constrained by the data used in our study. Furthermore, the sensitivity and identifiability analysis was helpful in providing deeper insight for important processes and associated parameters that can lead to further improvement in calibration of DayCent model.

  17. An Integrated Multiomics Approach to Identify Candidate Antigens for Serodiagnosis of Human Onchocerciasis

    PubMed Central

    McNulty, Samantha N.; Rosa, Bruce A.; Fischer, Peter U.; Rumsey, Jeanne M.; Erdmann-Gilmore, Petra; Curtis, Kurt C.; Specht, Sabine; Townsend, R. Reid; Weil, Gary J.; Mitreva, Makedonka

    2015-01-01

    Improved diagnostic methods are needed to support ongoing efforts to eliminate onchocerciasis (river blindness). This study used an integrated approach to identify adult female Onchocerca volvulus antigens that can be explored for developing serodiagnostic tests. The first step was to develop a detailed multi-omics database of all O. volvulus proteins deduced from the genome, gene transcription data for different stages of the parasite including eight individual female worms (providing gene expression information for 94.8% of all protein coding genes), and the adult female worm proteome (detecting 2126 proteins). Next, female worm proteins were purified with IgG antibodies from onchocerciasis patients and identified using LC-MS with a high-resolution hybrid quadrupole-time-of-flight mass spectrometer. A total of 241 immunoreactive proteins were identified among those bound by IgG from infected individuals but not IgG from uninfected controls. These included most of the major diagnostic antigens described over the past 25 years plus many new candidates. Proteins of interest were prioritized for further study based on a lack of conservation with orthologs in the human host and other helminthes, their expression pattern across the life cycle, and their consistent expression among individual female worms. Based on these criteria, we selected 33 proteins that should be carried forward for testing as serodiagnostic antigens to supplement existing diagnostic tools. These candidates, together with the extensive pan-omics dataset generated in this study are available to the community (http://nematode.net) to facilitate basic and translational research on onchocerciasis. PMID:26472727

  18. An Integrated Multiomics Approach to Identify Candidate Antigens for Serodiagnosis of Human Onchocerciasis.

    PubMed

    McNulty, Samantha N; Rosa, Bruce A; Fischer, Peter U; Rumsey, Jeanne M; Erdmann-Gilmore, Petra; Curtis, Kurt C; Specht, Sabine; Townsend, R Reid; Weil, Gary J; Mitreva, Makedonka

    2015-12-01

    Improved diagnostic methods are needed to support ongoing efforts to eliminate onchocerciasis (river blindness). This study used an integrated approach to identify adult female Onchocerca volvulus antigens that can be explored for developing serodiagnostic tests. The first step was to develop a detailed multi-omics database of all O. volvulus proteins deduced from the genome, gene transcription data for different stages of the parasite including eight individual female worms (providing gene expression information for 94.8% of all protein coding genes), and the adult female worm proteome (detecting 2126 proteins). Next, female worm proteins were purified with IgG antibodies from onchocerciasis patients and identified using LC-MS with a high-resolution hybrid quadrupole-time-of-flight mass spectrometer. A total of 241 immunoreactive proteins were identified among those bound by IgG from infected individuals but not IgG from uninfected controls. These included most of the major diagnostic antigens described over the past 25 years plus many new candidates. Proteins of interest were prioritized for further study based on a lack of conservation with orthologs in the human host and other helminthes, their expression pattern across the life cycle, and their consistent expression among individual female worms. Based on these criteria, we selected 33 proteins that should be carried forward for testing as serodiagnostic antigens to supplement existing diagnostic tools. These candidates, together with the extensive pan-omics dataset generated in this study are available to the community (http://nematode.net) to facilitate basic and translational research on onchocerciasis. PMID:26472727

  19. SU-E-J-212: Identifying Bones From MRI: A Dictionary Learning and Sparse Regression Approach

    SciTech Connect

    Ruan, D; Yang, Y; Cao, M; Hu, P; Low, D

    2014-06-01

    Purpose: To develop an efficient and robust scheme to identify bony anatomy based on MRI-only simulation images. Methods: MRI offers important soft tissue contrast and functional information, yet its lack of correlation to electron density has placed it as an auxiliary modality to CT in radiotherapy simulation and adaptation. An effective scheme to identify bony anatomy is an important first step towards an MR-only simulation/treatment paradigm and would satisfy most practical purposes. We utilize a UTE acquisition sequence to achieve visibility of the bone. In contrast to manual + bulk or registration-based approaches to identify bones, we propose a novel learning-based approach for improved robustness to MR artefacts and environmental changes. Specifically, local information is encoded with an MR image patch, and the corresponding label is extracted (during training) from simulation CT aligned to the UTE. Within each class (bone vs. nonbone), an overcomplete dictionary is learned so that typical patches within the proper class can be represented as a sparse combination of the dictionary entries. For testing, an acquired UTE-MRI is divided into patches using a sliding scheme, where each patch is sparsely regressed against both bone and nonbone dictionaries, and subsequently claimed to be associated with the class with the smaller residual. Results: The proposed method has been applied to the pilot site of brain imaging and it has shown generally good performance, with a Dice similarity coefficient of greater than 0.9 in a cross-validation study using 4 datasets. Importantly, it is robust towards consistent foreign objects (e.g., headset) and the artefacts related to Gibbs and field heterogeneity. Conclusion: A learning perspective has been developed for inferring bone structures based on UTE MRI. The imaging setting is subject to minimal motion effects and the post-processing is efficient. The improved efficiency and robustness enables a first translation to MR-only routine. The scheme

  20. GROUNDWATER QUALITY MONITORING OF WESTERN COAL STRIP MINING: PRELIMINARY DESIGNS FOR ACTIVE MINE SOURCES OF POLLUTION

    EPA Science Inventory

    Three potential pollution source categories have been identified for Western coal strip mines. These sources include mine stockpiles, mine waters, and miscellaneous active mine sources. TEMPO's stepwise monitoring methodology (Todd et al., 1976) is used to develop groundwater qua...

  1. Engineering Approach to Identifying Patients with Colon Tumors on the Basis of Electrophotonic Imaging Technique Data

    PubMed Central

    Yakovleva, E.G.; Korotkov, K.G.; Fedorov, E.D.; Ivanova, E.V.; Plahov, R.V.; Belonosov, S.S.

    2016-01-01

    Background: Colonic neoplasms are quite a serious problem today. Screening methods play an important role in diagnosing the disease. Colorectal cancer screening is a complex undertaking, having various options, which require a lot of efforts both from the doctor and from the patient, including the use of sedatives and the necessity of the presence of an assistant for some procedures such as colonoscopy. This is why it is very important to find a method by which one can make a diagnosis quickly, easily, and painlessly. Methods: The ability to identify patients with tumors of the colon using the Electrophotonic Imaging (EPI) technique, as well as using it for differential diagnosis of tumors of the colon by their morphology, size and quantity was investigated. Selection of the most significant parameters of the EPI-graphy for the separation of the control group and the group of patients with tumors of the colon was developed. 137 people were studied with the EPI camera, with ages ranging from 16 to 86 years, including 49 males and 88 females. Based on the results of the colonoscopy and histological findings all subjects were divided into 2 groups: control group of 55 people, 9 males, 46 females; and patients with tumors (benign or malignant) of the colon - 82 people; 40 males and 42 females. Then all subjects were divided into smaller groups based on morphology, size, number of tumors and localization. Results: Based on the identified indicators decision rules to determine the patients with tumors of the colon were constructed. The specificity of the resulting function was 80.0% and sensitivity 75.6%. Decision rule was built as well with logistic regression. The specificity of the resulting function was 78.2% and sensitivity 90.0%. The accuracy of this approach was higher than using discriminant analysis. Conclusions: The results of this study have proven the ability to identify patients with tumors of the colon using EPI technology, as well as use it for
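
    The first pair of figures reported above can be checked against the stated group sizes. The sketch below assumes counts that are back-calculated from the percentages (44 of 55 controls classified correctly, 62 of 82 patients classified correctly); the abstract gives only the percentages and group sizes, so these counts are an inference, not reported data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual patients that the rule flags as patients."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of actual controls that the rule leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Assumed counts, inferred from the reported percentages and group sizes.
controls, patients = 55, 82
true_neg, true_pos = 44, 62

spec_pct = round(100 * specificity(true_neg, controls - true_neg), 1)
sens_pct = round(100 * sensitivity(true_pos, patients - true_pos), 1)
print(spec_pct, sens_pct)
```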

  2. An approach to identify microRNAs involved in neuropathic pain following a peripheral nerve injury

    PubMed Central

    Norcini, Monica; Sideris, Alexandra; Martin Hernandez, Lourdes A.; Zhang, Jin; Blanck, Thomas J. J.; Recio-Pinto, Esperanza

    2014-01-01

    Peripheral nerve injury alters the expression of hundreds of proteins in dorsal root ganglia (DRG). Targeting some of these proteins has led to successful treatments for acute pain, but not for sustained post-operative neuropathic pain. The latter may require targeting multiple proteins. Since a single microRNA (miR) can affect the expression of multiple proteins, here, we describe an approach to identify chronic neuropathic pain-relevant miRs. We used two variants of the spared nerve injury (SNI): Sural-SNI and Tibial-SNI and found distinct pain phenotypes between the two. Both models induced strong mechanical allodynia, but only Sural-SNI rats maintained strong mechanical and cold allodynia, as previously reported. In contrast, we found that Tibial-SNI rats recovered from mechanical allodynia and never developed cold allodynia. Since both models involve nerve injury, we increased the probability of identifying differentially regulated miRs that correlated with the quality and magnitude of neuropathic pain and decreased the probability of detecting miRs that are solely involved in neuronal regeneration. We found seven such miRs in L3-L5 DRG. The expression of these miRs increased in Tibial-SNI. These miRs displayed a lower level of expression in Sural-SNI, with four having levels lower than those in sham animals. Bioinformatic analysis of how these miRs could affect the expression of some ion channels supports the view that, following a peripheral nerve injury, the increase of the seven miRs may contribute to the recovery from neuropathic pain while the decrease of four of them may contribute to the development of chronic neuropathic pain. The approach used resulted in the identification of a small number of potentially neuropathic pain relevant miRs. Additional studies are required to investigate whether manipulating the expression of the identified miRs in primary sensory neurons can prevent or ameliorate chronic neuropathic pain following peripheral nerve

  3. Improving mine safety technology and training: establishing US global leadership

    SciTech Connect

    2006-12-15

    In 2006, the USA's record of mine safety was interrupted by fatalities that rocked the industry and caused the National Mining Association and its members to recommit to returning the US underground coal mining industry to a global mine safety leadership role. This report details a comprehensive approach to increase the odds of survival for miners in emergency situations and to create a culture of prevention of accidents. Among its 75 recommendations are a need to improve communications, mine rescue training, and escape and protection of miners. Section headings of the report are: Introduction; Review of mine emergency situations in the past 25 years: identifying and addressing the issues and complexities; Risk-based design and management; Communications technology; Escape and protection strategies; Emergency response and mine rescue procedures; Training for preparedness; Summary of recommendations; and Conclusions. 37 refs., 3 figs., 5 apps.

  4. Computational Approaches to Identify Promoters and cis-Regulatory Elements in Plant Genomes

    PubMed Central

    Rombauts, Stephane; Florquin, Kobe; Lescot, Magali; Marchal, Kathleen; Rouzé, Pierre; Van de Peer, Yves

    2003-01-01

    The identification of promoters and their regulatory elements is one of the major challenges in bioinformatics and integrates comparative, structural, and functional genomics. Many different approaches have been developed to detect conserved motifs in a set of genes that are either coregulated or orthologous. However, although recent approaches seem promising, in general, unambiguous identification of regulatory elements is not straightforward. The delineation of promoters is even harder, due to its complex nature, and in silico promoter prediction is still in its infancy. Here, we review the different approaches that have been developed for identifying promoters and their regulatory elements. We discuss the detection of cis-acting regulatory elements using word-counting or probabilistic methods (so-called “search by signal” methods) and the delineation of promoters by considering both sequence content and structural features (“search by content” methods). As an example of search by content, we explored in greater detail the association of promoters with CpG islands. However, due to differences in sequence content, the parameters used to detect CpG islands in humans and other vertebrates cannot be used for plants. Therefore, a preliminary attempt was made to define parameters that could possibly define CpG and CpNpG islands in Arabidopsis, by exploring the compositional landscape around the transcriptional start site. To this end, a data set of more than 5,000 gene sequences was built, including the promoter region, the 5′-untranslated region, and the first introns and coding exons. Preliminary analysis shows that promoter location based on the detection of potential CpG/CpNpG islands in the Arabidopsis genome is not straightforward. Nevertheless, because the landscape of CpG/CpNpG islands differs considerably between promoters and introns on the one side and exons (whether coding or not) on the other, more sophisticated approaches can probably be
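
    For readers unfamiliar with "search by content" on CpG islands, the sketch below computes two quantities commonly used for that purpose over a sliding window: GC fraction and the observed-to-expected CpG ratio, (number of CpG x window length) / (number of C x number of G). The abstract does not say which parameters or thresholds the authors explored, so this is one widely used formulation offered as an assumption; the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (gc_fraction, cpg_observed_over_expected) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides cannot overlap each other
    gc_fraction = (c + g) / n if n else 0.0
    expected = c * g / n if n else 0.0
    ratio = cpg / expected if expected else 0.0
    return gc_fraction, ratio


def sliding_windows(seq, size, step):
    """Yield (start, cpg_stats) for each full window along the sequence."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, cpg_stats(seq[start:start + size])


# Hypothetical toy sequence: a CpG-rich half followed by an AT-only half.
windows = list(sliding_windows("CGCGATAT", size=4, step=4))
```

    Windows whose two values exceed chosen cutoffs would be candidate islands; as the abstract notes, cutoffs tuned for vertebrates cannot simply be reused for plants.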

  5. 30 CFR 77.1200 - Mine map.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... SAFETY STANDARDS, SURFACE COAL MINES AND SURFACE WORK AREAS OF UNDERGROUND COAL MINES Maps § 77.1200 Mine...) The location of railroad tracks and public highways leading to the mine, and mine buildings of a permanent nature with identifying names shown; (k) Underground mine workings underlying and within...

  6. 30 CFR 77.1200 - Mine map.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... SAFETY STANDARDS, SURFACE COAL MINES AND SURFACE WORK AREAS OF UNDERGROUND COAL MINES Maps § 77.1200 Mine...) The location of railroad tracks and public highways leading to the mine, and mine buildings of a permanent nature with identifying names shown; (k) Underground mine workings underlying and within...

  7. 30 CFR 77.1200 - Mine map.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... SAFETY STANDARDS, SURFACE COAL MINES AND SURFACE WORK AREAS OF UNDERGROUND COAL MINES Maps § 77.1200 Mine...) The location of railroad tracks and public highways leading to the mine, and mine buildings of a permanent nature with identifying names shown; (k) Underground mine workings underlying and within...

  8. 30 CFR 77.1200 - Mine map.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... SAFETY STANDARDS, SURFACE COAL MINES AND SURFACE WORK AREAS OF UNDERGROUND COAL MINES Maps § 77.1200 Mine...) The location of railroad tracks and public highways leading to the mine, and mine buildings of a permanent nature with identifying names shown; (k) Underground mine workings underlying and within...

  9. 30 CFR 77.1200 - Mine map.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... SAFETY STANDARDS, SURFACE COAL MINES AND SURFACE WORK AREAS OF UNDERGROUND COAL MINES Maps § 77.1200 Mine...) The location of railroad tracks and public highways leading to the mine, and mine buildings of a permanent nature with identifying names shown; (k) Underground mine workings underlying and within...

  10. Identifying typical patterns of vulnerability: A 5-step approach based on cluster analysis

    NASA Astrophysics Data System (ADS)

    Sietz, Diana; Lüdeke, Matthias; Kok, Marcel; Lucas, Paul; Carsten, Walther; Janssen, Peter

    2013-04-01

    Specific processes that shape the vulnerability of socio-ecological systems to climate, market and other stresses derive from diverse background conditions. Within the multitude of vulnerability-creating mechanisms, distinct processes recur in various regions inspiring research on typical patterns of vulnerability. The vulnerability patterns display typical combinations of the natural and socio-economic properties that shape a system's vulnerability to particular stresses. Based on the identification of a limited number of vulnerability patterns, pattern analysis provides an efficient approach to improving our understanding of vulnerability and decision-making for vulnerability reduction. However, current pattern analyses often miss explicit descriptions of their methods and pay insufficient attention to the validity of their groupings. Therefore, the question arises as to how do we identify typical vulnerability patterns in order to enhance our understanding of a system's vulnerability to stresses? A cluster-based pattern recognition applied at global and local levels is scrutinised with a focus on an applicable methodology and practicable insights. Taking the example of drylands, this presentation demonstrates the conditions necessary to identify typical vulnerability patterns. They are summarised in five methodological steps comprising the elicitation of relevant cause-effect hypotheses and the quantitative indication of mechanisms as well as an evaluation of robustness, a validation and a ranking of the identified patterns. Reflecting scale-dependent opportunities, a global study is able to support decision-making with insights into the up-scaling of interventions when available funds are limited. In contrast, local investigations encourage an outcome-based validation. This constitutes a crucial step in establishing the credibility of the patterns and hence their suitability for informing extension services and individual decisions. In this respect, working at

  11. Ask and Ye Shall Receive? Automated Text Mining of Michigan Capital Facility Finance Bond Election Proposals to Identify Which Topics Are Associated with Bond Passage and Voter Turnout

    ERIC Educational Resources Information Center

    Bowers, Alex J.; Chen, Jingjing

    2015-01-01

    The purpose of this study is to bring together recent innovations in the research literature around school district capital facility finance, municipal bond elections, statistical models of conditional time-varying outcomes, and data mining algorithms for automated text mining of election ballot proposals to examine the factors that influence the…

  12. Rehabilitation prioritization of abandoned mines and its application to Nyala Magnesite Mine

    NASA Astrophysics Data System (ADS)

    Mhlongo, Sphiwe Emmanuel; Amponsah-Dacosta, Francis; Mphephu, Nndweleni Fredrick

    2013-12-01

    The issue of abandoned mine sites is a major environmental and social problem for the mining industry, communities and governments. Historical mine sites are characterized by significant environmental, health and safety problems. The aim of this study was to develop hazard maps that can assist in the prioritization of rehabilitation at Nyala Mine. The approach used involved site examination and characterization to establish the environmental conditions of the mine. Hazards at the mine were identified, scored, and rated using modified Historic Mine Site Scoring System. The scoring focused on source and exposure pathways. The developed hazard maps showed that the best approach of effectively reducing the physical and environmental hazards at Nyala Mine was to give priority to extremely and moderately hazardous pits; surface infrastructure and spoil dumps, and then to tailings dumps characterized with less physical hazards but extremely high environmental hazards. Pits and spoil materials which were found to be relatively less problematic in terms of physical hazards were to receive least attention. The use of this hazard-scoring and risk-ranking methodology coupled with the hazard maps would provide a more robust scientific basis for making sound decisions and prioritize actions that need to be taken to minimize or manage risks associated with various areas of the mine site.

  13. Neuroimaging and Neuromodulation: Complementary Approaches for Identifying the Neuronal Correlates of Tinnitus

    PubMed Central

    Langguth, Berthold; Schecklmann, Martin; Lehner, Astrid; Landgrebe, Michael; Poeppl, Timm Benjamin; Kreuzer, Peter Michal; Schlee, Winfried; Weisz, Nathan; Vanneste, Sven; De Ridder, Dirk

    2012-01-01

    An inherent limitation of functional imaging studies is their correlational approach. More information about critical contributions of specific brain regions can be gained by focal transient perturbation of neural activity in specific regions with non-invasive focal brain stimulation methods. Functional imaging studies have revealed that tinnitus is related to alterations in neuronal activity of central auditory pathways. Modulation of neuronal activity in auditory cortical areas by repetitive transcranial magnetic stimulation (rTMS) can reduce tinnitus loudness and, if applied repeatedly, exerts therapeutic effects, confirming the relevance of auditory cortex activation for tinnitus generation and persistence. Measurements of oscillatory brain activity before and after rTMS demonstrate that the same stimulation protocol has different effects on brain activity in different patients, presumably related to interindividual differences in baseline activity in the clinically heterogeneous study cohort. In addition to alterations in auditory pathways, imaging techniques also indicate the involvement of non-auditory brain areas, such as the fronto-parietal “awareness” network and the non-tinnitus-specific distress network consisting of the anterior cingulate cortex, anterior insula, and amygdala. Involvement of the hippocampus and the parahippocampal region putatively reflects the relevance of memory mechanisms in the persistence of the phantom percept and the associated distress. Preliminary studies targeting the dorsolateral prefrontal cortex, the dorsal anterior cingulate cortex, and the parietal cortex with rTMS and with transcranial direct current stimulation confirm the relevance of the mentioned non-auditory networks. Available data indicate the important value added by brain stimulation as a complementary approach to neuroimaging for identifying the neuronal correlates of the various clinical aspects of tinnitus. PMID:22509155

  14. A Computational Approach to Identifying Gene-microRNA Modules in Cancer

    PubMed Central

    Jin, Daeyong; Lee, Hyunju

    2015-01-01

    MicroRNAs (miRNAs) play key roles in the initiation and progression of various cancers by regulating genes. Regulatory interactions between genes and miRNAs are complex, as multiple miRNAs can regulate multiple genes. In addition, these interactions vary from patient to patient and even among patients with the same cancer type, as cancer development is a heterogeneous process. These relationships are more complicated because transcription factors and other regulatory molecules can also regulate miRNAs and genes. Hence, it is important to identify the complex relationships between genes and miRNAs in cancer. In this study, we propose a computational approach to constructing modules that represent these relationships by integrating the expression data of genes and miRNAs with gene-gene interaction data. First, we used a biclustering algorithm to construct modules consisting of a subset of genes and a subset of samples to incorporate the heterogeneity of cancer cells. Second, we combined gene-gene interactions to include genes that play important roles in cancer-related pathways. Then, we selected miRNAs that are closely associated with genes in the modules based on a Gaussian Bayesian network and Bayesian Information Criteria. When we applied our approach to ovarian cancer and glioblastoma (GBM) data sets, 33 and 54 modules were constructed, respectively. In these modules, 91% and 94% of ovarian cancer and GBM modules, respectively, were explained either by direct regulation between genes and miRNAs or by indirect relationships via transcription factors. In addition, 48.4% and 74.0% of modules from ovarian cancer and GBM, respectively, were enriched with cancer-related pathways, and 51.7% and 71.7% of miRNAs in modules were ovarian cancer-related miRNAs and GBM-related miRNAs, respectively. Finally, we extensively analyzed significant modules and showed that most genes in these modules were related to ovarian cancer and GBM. PMID:25611546

  15. A computational approach to identifying gene-microRNA modules in cancer.

    PubMed

    Jin, Daeyong; Lee, Hyunju

    2015-01-01

    MicroRNAs (miRNAs) play key roles in the initiation and progression of various cancers by regulating genes. Regulatory interactions between genes and miRNAs are complex, as multiple miRNAs can regulate multiple genes. In addition, these interactions vary from patient to patient and even among patients with the same cancer type, as cancer development is a heterogeneous process. These relationships are more complicated because transcription factors and other regulatory molecules can also regulate miRNAs and genes. Hence, it is important to identify the complex relationships between genes and miRNAs in cancer. In this study, we propose a computational approach to constructing modules that represent these relationships by integrating the expression data of genes and miRNAs with gene-gene interaction data. First, we used a biclustering algorithm to construct modules consisting of a subset of genes and a subset of samples to incorporate the heterogeneity of cancer cells. Second, we combined gene-gene interactions to include genes that play important roles in cancer-related pathways. Then, we selected miRNAs that are closely associated with genes in the modules based on a Gaussian Bayesian network and Bayesian Information Criteria. When we applied our approach to ovarian cancer and glioblastoma (GBM) data sets, 33 and 54 modules were constructed, respectively. In these modules, 91% and 94% of ovarian cancer and GBM modules, respectively, were explained either by direct regulation between genes and miRNAs or by indirect relationships via transcription factors. In addition, 48.4% and 74.0% of modules from ovarian cancer and GBM, respectively, were enriched with cancer-related pathways, and 51.7% and 71.7% of miRNAs in modules were ovarian cancer-related miRNAs and GBM-related miRNAs, respectively. Finally, we extensively analyzed significant modules and showed that most genes in these modules were related to ovarian cancer and GBM. PMID:25611546
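
    The miRNA selection step described above scores candidate regulators with a Gaussian Bayesian network and the Bayesian information criterion (BIC). The following is only a minimal sketch of that idea, not the authors' implementation: it assumes a single gene with at most one candidate miRNA parent, a linear-Gaussian model, and invented toy expression values. The function name gaussian_bic is hypothetical.

```python
import math

def gaussian_bic(child, parent=None):
    """BIC (up to an additive constant) of a linear-Gaussian model for `child`.

    With parent=None the model is a plain Gaussian (mean and variance, k = 2).
    With a parent it is a simple linear regression (intercept, slope and
    variance, k = 3). Lower values indicate a better trade-off between fit
    and model size.
    """
    n = len(child)
    mean_c = sum(child) / n
    if parent is None:
        rss = sum((c - mean_c) ** 2 for c in child)
        k = 2
    else:
        mean_p = sum(parent) / n
        sxx = sum((p - mean_p) ** 2 for p in parent)
        sxy = sum((p - mean_p) * (c - mean_c) for p, c in zip(parent, child))
        slope = sxy / sxx  # assumes the parent is not constant
        rss = sum((c - mean_c - slope * (p - mean_p)) ** 2
                  for p, c in zip(parent, child))
        k = 3
    rss = max(rss, 1e-12)  # guard against log(0) on a perfect fit
    return n * math.log(rss / n) + k * math.log(n)

# Toy data (assumption): the miRNA represses the gene, plus small noise.
mirna = [float(m) for m in range(10)]
gene = [10.0 - 0.8 * m + (0.1 if i % 2 == 0 else -0.1)
        for i, m in enumerate(mirna)]

# The miRNA would be kept as a regulator if adding it lowers the BIC.
keep_mirna = gaussian_bic(gene, mirna) < gaussian_bic(gene)
```

    In a real module, the same comparison would be repeated over many genes and candidate miRNAs, and over multi-parent models; the sketch only illustrates the scoring rule.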

  16. Identifying children in need of ancillary and enabling services: a population approach.

    PubMed

    Benedict, Ruth E; Farel, Anita M

    2003-12-01

    Children with chronic or disabling conditions use health, education and social services at a higher rate than their healthy peers. Estimates of the number of children in need of these specialized services are widely varied and often depend on categorical definitions that do not account for either the diversity or commonality of their experiences. Developing methods for identifying the population in need of services, particularly children likely to use long-term ancillary (audiology, occupational, physical or speech therapy, or social work) and/or enabling services (special equipment, personal care assistance, respite care, transportation, or environmental modifications), is essential for effective policy and program implementation. This study examines several recent attempts to operationalize definitions of children with chronic conditions using a noncategorical classification approach. Particular emphasis is placed on the subgroup of children identified as having functional limitations. Proposed operational definitions of children with functional limitations are compared using data from the 1994-1995 Disability Supplement to the US National Health Interview Survey. Estimates of the number of children reported to be using ancillary and enabling services are generated and compared across operational definitions of functional limitation as well as by the number, severity, and type (i.e. mobility, self-care, communication/sensory, social cognition/learning ability) of limitation. Depending on the operational definition selected, 9-14% of US community-dwelling children are estimated to have functional limitations. Among children with limitations, 26-30% regularly use ancillary services and 11-14% use enabling services. The strengths, limitations, and potential applications for each operational definition are discussed. PMID:14512235

  17. A Genome-Wide Methylation Approach Identifies a New Hypermethylated Gene Panel in Ulcerative Colitis.

    PubMed

    Kang, Keunsoo; Bae, Jin-Han; Han, Kyudong; Kim, Eun Soo; Kim, Tae-Oh; Yi, Joo Mi

    2016-01-01

    The cause of inflammatory bowel disease (IBD) is still unknown, but there is growing evidence that environmental factors such as epigenetic changes can contribute to the disease etiology. The aim of this study was to identify newly hypermethylated genes in ulcerative colitis (UC) using a genome-wide DNA methylation approach. Using an Infinium HumanMethylation450 BeadChip array, we screened the DNA methylation changes in three normal colon controls and eight UC patients. Using these methylation profiles, 48 probes associated with CpG promoter methylation showed differential hypermethylation between UC patients and normal controls. Technical validations for methylation analyses in a larger series of UC patients (n = 79) were performed by methylation-specific PCR (MSP) and bisulfite sequencing analysis. We ultimately found three genes (FAM217B, KIAA1614 and RIBC2) with significantly elevated promoter methylation levels in UC compared to normal controls. Interestingly, we confirmed that these three genes were transcriptionally silenced in UC patient samples by qRT-PCR, suggesting that their silencing is correlated with the promoter hypermethylation. Pathway analyses were performed using GO and KEGG databases with differentially hypermethylated genes in UC. Our results highlight aberrant hypermethylation in UC patients, which can be a potential biomarker for detecting UC. Moreover, pathway-enriched hypermethylated genes possibly implicate important cellular functions in the pathogenesis of UC. Overall, this study describes a newly hypermethylated gene panel in UC patients and provides new clinical information that can be used for the diagnosis and therapeutic treatment of IBD. PMID:27517910

  18. A Genome-Wide Methylation Approach Identifies a New Hypermethylated Gene Panel in Ulcerative Colitis

    PubMed Central

    Kang, Keunsoo; Bae, Jin-Han; Han, Kyudong; Kim, Eun Soo; Kim, Tae-Oh; Yi, Joo Mi

    2016-01-01

    The cause of inflammatory bowel disease (IBD) is still unknown, but there is growing evidence that environmental factors such as epigenetic changes can contribute to the disease etiology. The aim of this study was to identify newly hypermethylated genes in ulcerative colitis (UC) using a genome-wide DNA methylation approach. Using an Infinium HumanMethylation450 BeadChip array, we screened the DNA methylation changes in three normal colon controls and eight UC patients. Using these methylation profiles, 48 probes associated with CpG promoter methylation showed differential hypermethylation between UC patients and normal controls. Technical validations for methylation analyses in a larger series of UC patients (n = 79) were performed by methylation-specific PCR (MSP) and bisulfite sequencing analysis. We ultimately found three genes (FAM217B, KIAA1614 and RIBC2) with significantly elevated promoter methylation levels in UC compared to normal controls. Interestingly, we confirmed that these three genes were transcriptionally silenced in UC patient samples by qRT-PCR, suggesting that their silencing is correlated with the promoter hypermethylation. Pathway analyses were performed using GO and KEGG databases with differentially hypermethylated genes in UC. Our results highlight aberrant hypermethylation in UC patients, which can be a potential biomarker for detecting UC. Moreover, pathway-enriched hypermethylated genes possibly implicate important cellular functions in the pathogenesis of UC. Overall, this study describes a newly hypermethylated gene panel in UC patients and provides new clinical information that can be used for the diagnosis and therapeutic treatment of IBD. PMID:27517910

  19. Demonstrating a Market-Based Approach to the Reclamation of Mined Lands in West Virginia

    SciTech Connect

    Goodrich-Mahoney, John; Donnelly, Ellen

    2009-12-31

    This project demonstrated that developing environmental credits on private land, including abandoned mined lands, is dependent on a number of factors, some of them beyond the control of the project team. In this project, acid mine drainage (AMD) was successfully remediated through the construction of a passive AMD treatment system. Extensive water quality sampling both before and after the installation of the passive AMD treatment system showed that the system achieved removal efficiencies and pollutant loading reductions for acidity, iron, aluminum and manganese that were consistent with systems of similar size and design. The success of the passive AMD treatment system should have resulted in water credits if the project had not been terminated. Developing carbon sequestration credits, however, was much more complex and was not achieved in this project. The primary challenge that the project team encountered in meeting the full project objectives was the unsuccessful attempt to have the landowner sign a conservation easement for his property. This would have allowed the project team to clear and reforest the site, monitor the progress of the newly planted trees, and eventually realize carbon sequestration credits once the forest was mature. The delays caused by the lack of a conservation easement, as well as other factors, eventually resulted in the reforestation portion of the project being cancelled. The information in this report will help the public make more informed decisions regarding the potential of using water, carbon, and other credits to support the remediation of mined lands throughout the United States. The hope is that by using credits, more mined lands will be remediated.

  20. Long-range prediction of Indian summer monsoon rainfall using data mining and statistical approaches

    NASA Astrophysics Data System (ADS)

    H, Vathsala; Koolagudi, Shashidhar G.

    2016-07-01

    This paper presents a hybrid model to better predict Indian summer monsoon rainfall. The algorithm considers suitable techniques for processing dense datasets. The proposed three-step algorithm comprises closed itemset generation-based association rule mining for feature selection, cluster membership for dimensionality reduction, and simple logistic function for prediction. The application of predicting rainfall into flood, excess, normal, deficit, and drought based on 36 predictors consisting of land and ocean variables is presented. Results show good accuracy in the considered study period of 37 years (1969-2005).

  1. On the Way to New Possible Na-Ion Conductors: The Voronoi-Dirichlet Approach, Data Mining and Symmetry Considerations in Ternary Na Oxides.

    PubMed

    Meutzner, Falk; Münchgesang, Wolfram; Kabanova, Natalya A; Zschornak, Matthias; Leisegang, Tilmann; Blatov, Vladislav A; Meyer, Dirk C

    2015-11-01

    With the constant growth of the lithium battery market and the introduction of electric vehicles and stationary energy storage solutions, the low abundance and high price of lithium will greatly impact its availability in the future. Thus, a diversification of electrochemical energy storage technologies based on other source materials is of great relevance. Sodium is energetically similar to lithium but cheaper and more abundant, which results in some already established stationary concepts, such as Na-S and ZEBRA cells. The most significant bottleneck for these technologies is to find effective solid ionic conductors. Thus, the goal of this work is to identify new ionic conductors for Na ions in ternary Na oxides. For this purpose, the Voronoi-Dirichlet approach has been applied to the Inorganic Crystal Structure Database and some new procedures are introduced to the algorithm implemented in the programme package ToposPro. The main new features are the use of data mined values, which are then used for the evaluation of void spaces, and a new method of channel size calculation. 52 compounds have been identified to be high-potential candidates for solid ionic conductors. The results were analysed from a crystallographic point of view in combination with phenomenological requirements for ionic conductors and intercalation hosts. Of the most promising candidates, previously reported compounds have also been successfully identified by using the employed algorithm, which shows the reliability of the method. PMID:26395985

  2. An integrated systems biology approach identifies positive cofactor 4 as a factor that increases reprogramming efficiency

    PubMed Central

    Jo, Junghyun; Hwang, Sohyun; Kim, Hyung Joon; Hong, Soomin; Lee, Jeoung Eun; Lee, Sung-Geum; Baek, Ahmi; Han, Heonjong; Lee, Jin Il; Lee, Insuk; Lee, Dong Ryul

    2016-01-01

    Spermatogonial stem cells (SSCs) can spontaneously dedifferentiate into embryonic stem cell (ESC)-like cells, which are designated as multipotent SSCs (mSSCs), without ectopic expression of reprogramming factors. Interestingly, SSCs express key pluripotency genes such as Oct4, Sox2, Klf4 and Myc. Therefore, molecular dissection of mSSC reprogramming may provide clues about novel endogenous reprogramming or pluripotency regulatory factors. Our comparative transcriptome analysis of mSSCs and induced pluripotent stem cells (iPSCs) suggests that they have similar pluripotency states but are reprogrammed via different transcriptional pathways. We identified 53 genes as putative pluripotency regulatory factors using an integrated systems biology approach. We demonstrated a selected candidate, Positive cofactor 4 (Pc4), can enhance the efficiency of somatic cell reprogramming by promoting and maintaining transcriptional activity of the key reprograming factors. These results suggest that Pc4 has an important role in inducing spontaneous somatic cell reprogramming via up-regulation of key pluripotency genes. PMID:26740582

  3. Rapid in vivo forward genetic approach for identifying axon death genes in Drosophila

    PubMed Central

    Neukomm, Lukas J.; Burdett, Thomas C.; Gonzalez, Michael A.; Züchner, Stephan; Freeman, Marc R.

    2014-01-01

    Axons damaged by acute injury, toxic insults, or neurodegenerative diseases execute a poorly defined autodestruction signaling pathway leading to widespread fragmentation and functional loss. Here, we describe an approach to study Wallerian degeneration in the Drosophila L1 wing vein that allows for analysis of axon degenerative phenotypes with single-axon resolution in vivo. This method allows for the axotomy of specific subsets of axons followed by examination of progressive axonal degeneration and debris clearance alongside uninjured control axons. We developed new Flippase (FLP) reagents using proneural gene promoters to drive FLP expression very early in neural lineages. These tools allow for the production of mosaic clone populations with high efficiency in sensory neurons in the wing. We describe a collection of lines optimized for forward genetic mosaic screens using MARCM (mosaic analysis with a repressible cell marker; i.e., GFP-labeled, homozygous mutant) on all major autosomal arms (∼95% of the fly genome). Finally, as a proof of principle we screened the X chromosome and identified a collection of eight recessive and two dominant alleles of highwire, a ubiquitin E3 ligase required for axon degeneration. Similar unbiased forward genetic screens should help rapidly delineate axon death genes, thereby providing novel potential drug targets for therapeutic intervention to prevent axonal and synaptic loss. PMID:24958874

  4. AN INTEGRATED NETWORK APPROACH TO IDENTIFYING BIOLOGICAL PATHWAYS AND ENVIRONMENTAL EXPOSURE INTERACTIONS IN COMPLEX DISEASES

    PubMed Central

    DARABOS, CHRISTIAN; QIU, JINGYA; MOORE, JASON H.

    2015-01-01

    Complex diseases are the result of intricate interactions between genetic, epigenetic and environmental factors. In previous studies, we used epidemiological and genetic data linking environmental exposure or genetic variants to phenotypic disease to construct Human Phenotype Networks and separately analyze the effects of both environment and genetic factors on disease interactions. To better capture the intricacies of the interactions between environmental exposure and the biological pathways in complex disorders, we integrate both aspects into a single “tripartite” network. Despite extensive research, the mechanisms by which chemical agents disrupt biological pathways are still poorly understood. In this study, we use our integrated network model to identify specific biological pathway candidates possibly disrupted by environmental agents. We conjecture that a higher number of co-occurrences between an environmental substance and biological pathway pair can be associated with a higher likelihood that the substance is involved in disrupting that pathway. We validate our model by demonstrating its ability to detect known arsenic and signal transduction pathway interactions and speculate on candidate cell-cell junction organization pathways disrupted by cadmium. The validation was supported by distinct publications of cell biology and genetic studies that associated environmental exposure to pathway disruption. The integrated network approach is a novel method for detecting the biological effects of environmental exposures. A better understanding of the molecular processes associated with specific environmental exposures will help in developing targeted molecular therapies for patients who have been exposed to the toxicity of environmental chemicals. PMID:26776169
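
    The co-occurrence conjecture in this abstract can be illustrated with a short counting sketch. It is not the authors' code: it assumes that diseases form the middle layer of the tripartite network, linking environmental substances on one side to biological pathways on the other, and all names and toy links below are invented for illustration.

```python
from collections import Counter
from itertools import product

def cooccurrence_counts(disease_to_substances, disease_to_pathways):
    """Count, for each (substance, pathway) pair, the number of diseases
    that are linked to both the substance and the pathway."""
    counts = Counter()
    for disease, substances in disease_to_substances.items():
        pathways = disease_to_pathways.get(disease, ())
        # Every substance-pathway pair sharing this disease co-occurs once.
        for pair in product(set(substances), set(pathways)):
            counts[pair] += 1
    return counts

# Hypothetical toy network (not data from the study).
disease_to_substances = {
    "disease_A": ["arsenic", "cadmium"],
    "disease_B": ["arsenic"],
    "disease_C": ["cadmium"],
}
disease_to_pathways = {
    "disease_A": ["signal transduction"],
    "disease_B": ["signal transduction", "cell-cell junction organization"],
    "disease_C": ["cell-cell junction organization"],
}

counts = cooccurrence_counts(disease_to_substances, disease_to_pathways)
# Pairs with the highest counts would be the first candidates to examine.
ranked_pairs = counts.most_common()
```

    Under the conjecture, a pair with a higher count is a more likely candidate for a substance that disrupts the pathway; a real analysis would also need to assess whether a count is higher than expected by chance.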

  5. Mixed Integer Linear Programming based machine learning approach identifies regulators of telomerase in yeast.

    PubMed

    Poos, Alexandra M; Maicher, André; Dieckmann, Anna K; Oswald, Marcus; Eils, Roland; Kupiec, Martin; Luke, Brian; König, Rainer

    2016-06-01

    Understanding telomere length maintenance mechanisms is central in cancer biology as their dysregulation is one of the hallmarks for immortalization of cancer cells. Important for this well-balanced control is the transcriptional regulation of the telomerase genes. We integrated Mixed Integer Linear Programming models into a comparative machine learning based approach to identify regulatory interactions that best explain the discrepancy of telomerase transcript levels in yeast mutants with deleted regulators showing aberrant telomere length, when compared to mutants with normal telomere length. We uncover novel regulators of telomerase expression, several of which affect histone levels or modifications. In particular, our results point to the transcription factors Sum1, Hst1 and Srb2 as being important for the regulation of EST1 transcription, and we validated the effect of Sum1 experimentally. We compiled our machine learning method leading to a user friendly package for R which can straightforwardly be applied to similar problems integrating gene regulator binding information and expression profiles of samples of e.g. different phenotypes, diseases or treatments. PMID:26908654

  6. Mixed Integer Linear Programming based machine learning approach identifies regulators of telomerase in yeast

    PubMed Central

    Poos, Alexandra M.; Maicher, André; Dieckmann, Anna K.; Oswald, Marcus; Eils, Roland; Kupiec, Martin; Luke, Brian; König, Rainer

    2016-01-01

    Understanding telomere length maintenance mechanisms is central in cancer biology as their dysregulation is one of the hallmarks for immortalization of cancer cells. Important for this well-balanced control is the transcriptional regulation of the telomerase genes. We integrated Mixed Integer Linear Programming models into a comparative machine learning based approach to identify regulatory interactions that best explain the discrepancy of telomerase transcript levels in yeast mutants with deleted regulators showing aberrant telomere length, when compared to mutants with normal telomere length. We uncover novel regulators of telomerase expression, several of which affect histone levels or modifications. In particular, our results point to the transcription factors Sum1, Hst1 and Srb2 as being important for the regulation of EST1 transcription, and we validated the effect of Sum1 experimentally. We compiled our machine learning method leading to a user friendly package for R which can straightforwardly be applied to similar problems integrating gene regulator binding information and expression profiles of samples of e.g. different phenotypes, diseases or treatments. PMID:26908654

  7. Identifying human disease genes: advances in molecular genetics and computational approaches.

    PubMed

    Bakhtiar, S M; Ali, A; Baig, S M; Barh, D; Miyoshi, A; Azevedo, V

    2014-01-01

    The human genome project is one of the significant achievements that have provided detailed insight into our genetic legacy. During the last two decades, biomedical investigations have gathered a considerable body of evidence by detecting more than 2000 disease genes. Despite the imperative advances in the genetic understanding of various diseases, the pathogenesis of many others remains obscure. With recent advances, the laborious methodologies used to identify DNA variations are being replaced by direct sequencing of genomic DNA to detect genetic changes. The ability to perform such studies depends equally on the development of high-throughput and economical genotyping methods. Currently, for essentially every disease whose origin is still unknown, genetic approaches are available, which may be pedigree-dependent or -independent, with the capacity to elucidate fundamental disease mechanisms. Computer algorithms and programs for linkage analysis have formed the foundation for many disease gene detection projects; similarly, databases of clinical findings have been widely used to support diagnostic decisions in dysmorphology and general human disease. For every disease type, genome sequence variations, particularly single nucleotide polymorphisms, are mapped by comparing the genetic makeup of case and control groups. Methods that predict the effects of polymorphisms on protein stability are useful for the identification of possible disease associations, whereas structural effects can be assessed using methods to predict stability changes in proteins using sequence and/or structural information. PMID:25061732

  8. An approach to identify the novel miRNA encoded from H. Annuus EST sequences.

    PubMed

    Gupta, Hemant; Tiwari, Tanushree; Patel, Maulik; Mehta, Aditya; Ghosh, Arpita

    2015-12-01

    MicroRNAs are a newly discovered class of non-protein small RNAs with 22-24 nucleotides. They play multiple roles in biological processes including development, cell proliferation, apoptosis, stress responses and many other cell functions. In this research, several approaches were combined to make a computational prediction of potential miRNAs and their targets in Helianthus annuus (H. annuus). The already available information of the plant miRNAs present in miRBase v21 was used against expressed sequence tags (ESTs). A total of three miRNAs were detected from which one potential novel miRNA was identified following a range of strict filtering criteria. The target prediction was carried out for these three miRNAs having various targets. These targets were functionally annotated and GO terms were assigned. To study the conserved nature of the miRNAs, predicted phylogenetic analysis was carried out. These findings will significantly provide the broader picture for understanding the functions in H. annuus. PMID:26697356

  9. Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies

    PubMed Central

    Delmont, Tom O.

    2016-01-01

    High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results against potential contamination of non-target organisms requires advanced bioinformatics approaches and practices. Here, we re-analyzed the sequencing data generated for the tardigrade Hypsibius dujardini, and created a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries. By using bacterial single-copy genes, k-mer frequencies, and coverage values of scaffolds we could identify and characterize multiple near-complete bacterial genomes from the raw assembly, and curate a 182 Mbp draft genome for H. dujardini supported by RNA-Seq data. Our results indicate that most contaminant scaffolds were assembled from Moleculo long-read libraries, and most of these contaminants differed between library preparations. Our re-analysis shows that visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the needs of today’s microbiologists, who are constantly challenged by the difficulties associated with the identification of distinct microbial genomes in complex environmental metagenomes. PMID:27069789

  10. Tandem immunoprecipitation approach to identify HIV-1 Gag associated host factors.

    PubMed

    Gao, Wei; Li, Min; Zhang, Jingxin

    2014-07-01

    HIV-1 Gag by itself is able to assemble and release from host cells and thus serves as a simplified model to identify host factors involved in this stage of the HIV-1 life cycle. In this study, a tandem immunoprecipitation approach is taken to immunoprecipitate Gag-interacting host proteins from transfected 293T cells. It is demonstrated that with the tandem immunoprecipitation method Gag-interacting host factors can be precipitated more efficiently than by single-step immunoprecipitation. Gag proteins are found to interact with multiple RNA-binding proteins such as hnRNPs, nucleolin, EF1a and ribosomal proteins. Such interactions are mediated by cellular RNAs and the Gag Nuclear Capsid (NC) domain. Deletion of the NC domain results in removal of most of the RNA-binding proteins, as well as a reduction of the Gag releasing capability, which can be restored by replacing the deleted NC domain with another multimerization motif. Importantly, interactions between Gag and host factors are relevant functionally, as evidenced by significantly increased nucleolin protein in the cytoplasm where it is recruited into the Gag complex, and enhanced Gag release when nucleolin is over-expressed. PMID:24690621

  11. Novel Vaccine Candidates against Brucella melitensis Identified through Reverse Vaccinology Approach.

    PubMed

    Vishnu, Udayakumar S; Sankarasubramanian, Jagadesan; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

    2015-11-01

    Global health therapeutics is a rapidly emerging facet of postgenomics medicine. In this connection, Brucella melitensis is an intracellular bacterium that causes the zoonotic infectious disease, brucellosis. Presently, no licensed vaccines are available for human brucellosis. Here, we report the identification of potential vaccine candidates against B. melitensis using a reverse vaccinology approach. Based on a systematic screening of exoproteome and secretome of B. melitensis 16 M, we identified eight proteins as potential vaccine candidates, including LPS-assembly protein LptD, a polysaccharide export protein, a cell surface protein, heme transporter BhuA, flagellin FliC, 7-alpha-hydroxysteroid dehydrogenase, immunoglobulin-binding protein EIBE, and hemagglutinin. Among these, the roles of BhuA and hemagglutinin in the virulence of Brucella are essential to establish infection. Roles of other proteins in the virulence are yet to be studied. Prediction of protein-protein interactions revealed that these proteins can interact with other proteins involved in virulence, secretion system, metabolism, and transport. From these eight potential vaccine candidates, we predicted three surface exposed novel antigenic epitopes that can induce both B-cell and T-cell immune responses. These peptides can be used for the development of either exclusive peptide vaccines or multi-component vaccines against human brucellosis. Reverse vaccinology is an important strategy for discovery of novel global health therapeutics. PMID:26479901

  12. A new approach: role of data mining in prediction of survival of burn patients.

    PubMed

    Patil, Bankat Madhavrao; Joshi, Ramesh C; Toshniwal, Durga; Biradar, Siddeshwar

    2011-12-01

    Predicting burn patient survivability remains a difficult problem. In the present study, a prediction model for patients with burns was built, and its capability to accurately predict survivability was assessed. We compared different data mining techniques to assess the performance of various algorithms based on the different measures used in the analysis of information pertaining to the medical domain. The obtained results were evaluated for correctness with the help of registered medical practitioners. The dataset was collected from SRT (Swami Ramanand Tirth) Hospital in India, which is one of Asia's largest rural hospitals. The dataset contains records of 180 patients, mainly suffering from burn injuries, collected during the period from 2002 to 2006. Features include patients' age, sex and percentage of burn received for eight different parts of the body. Prediction models were developed through a rigorous comparative study of important and relevant data mining classification techniques, namely naive Bayes, decision tree, support vector machine and back propagation. A performance comparison was also carried out to obtain an unbiased estimate of the prediction models using the 10-fold cross-validation method. Using the analysis of the obtained results, we show that naive Bayes is the best predictor, with an accuracy of 97.78% on the holdout samples; further, both the decision tree and support vector machine (SVM) techniques demonstrated an accuracy of 96.12%, and the back propagation technique achieved an accuracy of 95%. PMID:20703764
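
    The 10-fold cross-validation used for the comparison can be sketched in plain Python as follows. This is an illustrative outline only: the function names and the fit/predict callables are assumptions made here, and no classifier or data from the study is reproduced. With 180 records, each fold would hold out 18 records and train on the remaining 162.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds, like dealing cards.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

def cross_validated_accuracy(X, y, fit, predict, k=10, seed=0):
    """Fraction of samples classified correctly while held out.

    `fit(X_train, y_train)` returns a model and `predict(model, x)` returns
    a label; both are supplied by the caller (hypothetical interface).
    """
    correct = 0
    for train, test in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(X)
```

    Because every record is held out exactly once, the resulting accuracy is computed only on records the model did not see during fitting, which is why the procedure gives a less biased estimate than accuracy on the training data.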

  13. Analysis of Maintenance Service Contracts for Dump Trucks Used in Mining Industry with Simulation Approach

    NASA Astrophysics Data System (ADS)

    Dymasius, A.; Wangsaputra, R.; Iskandar, B. P.

    2016-02-01

    A mining company needs high availability of the dump trucks used to haul mining materials. As a result, effective maintenance is required to keep the dump trucks in good condition and hence reduce their failures and downtime. Carrying out maintenance in-house requires a highly intensive maintenance facility and highly skilled maintenance specialists. Often, outsourcing maintenance is an economic option for the company. An external agent takes proactive action by offering some maintenance contract options to the owner. The decision problem for the owner is to decide the best option, and for the agent it is to determine the optimal price for each option offered. Non-cooperative game theory is used to formulate the decision problems for the owner and the agent. We consider that the failure pattern of each truck follows a non-homogeneous Poisson process (NHPP), and queueing theory with multiple servers is used to estimate the downtime. Because modelling downtime with queueing theory is highly complex, in this paper we use a simulation method. Furthermore, we conduct experiments to find the number of maintenance facilities (servers) that minimises the maintenance and penalty costs incurred by the agent.
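
    The failure process mentioned in this abstract can be simulated with a few lines of Python. The sketch below is not the authors' simulation: it covers only the generation of failure times for a single truck, and it assumes a power-law intensity with cumulative intensity (t/eta)^beta, a common choice for repairable systems that the abstract does not specify. The parameter values are invented.

```python
import random

def simulate_nhpp_failures(horizon, beta, eta, rng):
    """Failure times on [0, horizon] for an NHPP with cumulative
    intensity Lambda(t) = (t / eta) ** beta (assumed power-law form).

    Uses time transformation: if s_1 < s_2 < ... are arrivals of a
    unit-rate Poisson process, then t_i = eta * s_i ** (1 / beta)
    are arrivals of the NHPP.
    """
    times = []
    s = 0.0
    while True:
        s += rng.expovariate(1.0)      # next unit-rate arrival
        t = eta * s ** (1.0 / beta)    # map back through Lambda^-1
        if t > horizon:
            return times
        times.append(t)

# Hypothetical parameters: beta > 1 means failures become more frequent
# as the truck ages.
rng = random.Random(42)
failures = simulate_nhpp_failures(horizon=1000.0, beta=1.5, eta=200.0, rng=rng)
```

    The expected number of failures over the horizon is (horizon/eta)^beta, about 11.2 for these values. A fuller simulation in the spirit of the abstract would feed such failure times for a fleet of trucks into a multi-server repair queue to estimate downtime and penalty costs.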

  14. Perception of Air Pollution in the Jinchuan Mining Area, China: A Structural Equation Modeling Approach.

    PubMed

    Li, Zhengtao; Folmer, Henk; Xue, Jianhong

    2016-01-01

    Studies on the perception of air pollution in China are very limited. The aim of this paper is to help to fill this gap by analyzing a cross-sectional dataset of 759 residents of the Jinchuan mining area, Gansu Province, China. The estimations suggest that perception of air pollution is two-dimensional. The first dimension is the perceived intensity of air pollution and the second is the perceived hazardousness of the pollutants. Both dimensions are influenced by environmental knowledge. Perceived intensity is furthermore influenced by socio-economic status and proximity to the pollution source; perceived hazardousness is influenced by socio-economic status, family health experience, family size and proximity to the pollution source. There are no reverse effects from perception on environmental knowledge. The main conclusion is that virtually all Jinchuan residents perceive high intensity and hazardousness of air pollution despite the fact that public information on air pollution and its health impacts is classified to a great extent. It is suggested that, to assist the residents to take appropriate preventive action, the local government should develop counseling and educational campaigns and institutionalize disclosure of air quality conditions. These programs should pay special attention to young residents who have limited knowledge of air pollution in the Jinchuan mining area. PMID:27455291

  15. Data mining approach to evaluating the use of skin surface electropotentials for breast cancer detection.

    PubMed

    Sree, S Vinitha; Ng, E Y K; Acharya, U Rajendra

    2010-02-01

    The Biofield Diagnostic System (BDS) uses a score formed with measured skin surface electropotentials and a prior Level Of Suspicion (LOS) value (predicted by the physician based on the patient's ultrasound or mammography results) to calculate a revised Post-BDS LOS to indicate the presence of breast cancer. The demographic details, BDS test results, and the recorded electropotential values form a potentially useful dataset, which can be further explored with data mining tools to extract important information that can be used to improve the current predictive accuracy of the device. According to the proposed data mining framework, the BDS dataset with 291 cases was first pre-processed to remove outliers and then used to select relevant and informative features for classifier development and finally to evaluate the capability of the built classifiers in detecting the presence of the disease. Two popular feature selection techniques, namely, the filter and wrapper methods, were used in parallel for feature selection. A few statistical inference based classifiers and neural networks were used for classification. The proposed technique significantly improved the BDS prediction accuracy. Also, the use of prior LOS and, hence, the Post-BDS LOS, associates a mild subjective interpretation to the current prediction methodology used by BDS. However, the feature subset selected in our analysis that gave the best accuracy did not use either of these features. This result indicates the possibility of using BDS as a better objective assessment tool for breast cancer detection. PMID:20082535

  16. A New Approach for Mining Order-Preserving Submatrices Based on All Common Subsequences

    PubMed Central

    Xue, Yun; Liao, Zhengling; Li, Meihang; Luo, Jie; Kuang, Qiuhua; Hu, Xiaohui; Li, Tiechen

    2015-01-01

    Order-preserving submatrices (OPSMs) have been applied in many fields, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems, as an important unsupervised learning model. Unfortunately, most existing methods are heuristic algorithms that cannot reveal all OPSMs, because the problem is NP-complete. In particular, deep OPSMs, corresponding to long patterns with few supporting sequences, incur explosive computational costs and are completely pruned by most popular methods. In this paper, we propose an exact method to discover all OPSMs based on frequent sequential pattern mining. First, an existing algorithm was adjusted to find all common subsequences (ACS) between every two row sequences, so that no deep OPSMs are missed. Then, an improved prefix-tree data structure was used to store and traverse the ACS, and the Apriori principle was employed to efficiently mine the frequent sequential patterns. Finally, experiments were carried out on gene and synthetic datasets. The results demonstrated the effectiveness and efficiency of this method. PMID:26161131
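
    The core idea in this abstract, turning each row into a column sequence and looking for column orders shared by several rows, can be sketched as follows. This is a brute-force illustration for tiny inputs, not the authors' algorithm: it omits the prefix tree and Apriori pruning, it assumes distinct values within each row, and the function names and the example matrix are made up.

```python
def row_to_sequence(row):
    """Column indices of a row, ordered by increasing value."""
    return tuple(sorted(range(len(row)), key=lambda c: row[c]))

def is_subsequence(pattern, seq):
    """True if `pattern` occurs in `seq` in order (not necessarily adjacent)."""
    it = iter(seq)
    return all(c in it for c in pattern)

def support(pattern, sequences):
    """Number of row sequences that preserve the column order in `pattern`."""
    return sum(is_subsequence(pattern, s) for s in sequences)

def all_common_subsequences(a, b):
    """All non-empty common subsequences of two sequences without repeated
    elements (exponential in the worst case, so small inputs only)."""
    pos_b = {c: j for j, c in enumerate(b)}
    results = set()

    def extend(prefix, i_start, j_min):
        for i in range(i_start, len(a)):
            j = pos_b.get(a[i])
            if j is not None and j > j_min:
                new = prefix + (a[i],)
                results.add(new)
                extend(new, i + 1, j)

    extend((), 0, -1)
    return results

# Made-up 3 x 3 matrix: rows 0 and 2 rank the columns as 0 < 1 < 2,
# row 1 ranks them as 0 < 2 < 1.
matrix = [[1, 5, 9], [2, 8, 4], [0, 3, 7]]
sequences = [row_to_sequence(r) for r in matrix]
shared = all_common_subsequences(sequences[0], sequences[1])
```

    Each common subsequence is a candidate column order; counting its support over all rows shows how many rows would belong to the corresponding OPSM.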

  17. Screening for posttraumatic stress disorder using verbal features in self narratives: a text mining approach.

    PubMed

    He, Qiwei; Veldkamp, Bernard P; de Vries, Theo

    2012-08-15

    Much evidence has shown that people's physical and mental health can be predicted by the words they use. However, such verbal information is seldom used in the screening and diagnosis process probably because the procedure to handle these words is rather difficult with traditional quantitative methods. The first challenge would be to extract robust information from diversified expression patterns, the second to transform unstructured text into a structuralized dataset. The present study developed a new textual assessment method to screen the posttraumatic stress disorder (PTSD) patients using lexical features in the self narratives with text mining techniques. Using 300 self narratives collected online, we extracted highly discriminative keywords with the Chi-square algorithm and constructed a textual assessment model to classify individuals with the presence or absence of PTSD. This resulted in a high agreement between computer and psychiatrists' diagnoses for PTSD and revealed some expressive characteristics in the writings of PTSD patients. Although the results of text analysis are not completely analogous to the results of structured interviews in PTSD diagnosis, the application of text mining is a promising addition to assessing PTSD in clinical and research settings. PMID:22464046
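
    The chi-square keyword selection mentioned in this abstract can be illustrated with the standard 2 x 2 chi-square statistic, computed per word from its presence or absence in each class of documents. The sketch below is an assumption about the general technique rather than the authors' exact procedure, and the toy documents and labels are invented.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2 x 2 table
    [[a, b], [c, d]] = [[pos with word, pos without], [neg with word, neg without]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # the word appears in every document or in none
    return n * (a * d - b * c) ** 2 / denom

def rank_keywords(docs, labels):
    """Rank words by chi-square score.

    docs: list of sets of tokens; labels: list of bools (True = positive class).
    """
    vocab = set().union(*docs)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    scores = {}
    for w in vocab:
        a = sum(1 for d, y in zip(docs, labels) if y and w in d)
        c = sum(1 for d, y in zip(docs, labels) if not y and w in d)
        scores[w] = chi_square_2x2(a, n_pos - a, c, n_neg - c)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Invented toy narratives, already tokenized.
docs = [{"nightmare", "fear"}, {"nightmare"}, {"work", "fear"}, {"work"}]
labels = [True, True, False, False]
ranking = rank_keywords(docs, labels)
```

    Words that occur equally often in both classes, like "fear" in this toy data, score zero and would not be selected as discriminative keywords.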

  18. A New Approach for Mining Order-Preserving Submatrices Based on All Common Subsequences.

    PubMed

    Xue, Yun; Liao, Zhengling; Li, Meihang; Luo, Jie; Kuang, Qiuhua; Hu, Xiaohui; Li, Tiechen

    2015-01-01

    Order-preserving submatrices (OPSMs) have been applied in many fields, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems, as an important unsupervised learning model. Unfortunately, most existing methods are heuristic algorithms that cannot reveal all OPSMs, because the problem is NP-complete. In particular, deep OPSMs, corresponding to long patterns with few supporting sequences, incur explosive computational costs and are completely pruned by most popular methods. In this paper, we propose an exact method to discover all OPSMs based on frequent sequential pattern mining. First, an existing algorithm was adjusted to find all common subsequences (ACS) between every two row sequences, so that no deep OPSMs are missed. Then, an improved prefix-tree data structure was used to store and traverse the ACS, and the Apriori principle was employed to efficiently mine the frequent sequential patterns. Finally, experiments were carried out on gene and synthetic datasets. The results demonstrated the effectiveness and efficiency of this method. PMID:26161131

  19. Perception of Air Pollution in the Jinchuan Mining Area, China: A Structural Equation Modeling Approach

    PubMed Central

    Li, Zhengtao; Folmer, Henk; Xue, Jianhong

    2016-01-01

    Studies on the perception of air pollution in China are very limited. The aim of this paper is to help to fill this gap by analyzing a cross-sectional dataset of 759 residents of the Jinchuan mining area, Gansu Province, China. The estimations suggest that perception of air pollution is two-dimensional. The first dimension is the perceived intensity of air pollution and the second is the perceived hazardousness of the pollutants. Both dimensions are influenced by environmental knowledge. Perceived intensity is furthermore influenced by socio-economic status and proximity to the pollution source; perceived hazardousness is influenced by socio-economic status, family health experience, family size and proximity to the pollution source. There are no reverse effects from perception on environmental knowledge. The main conclusion is that virtually all Jinchuan residents perceive high intensity and hazardousness of air pollution despite the fact that public information on air pollution and its health impacts is classified to a great extent. It is suggested that, to assist the residents to take appropriate preventive action, the local government should develop counseling and educational campaigns and institutionalize disclosure of air quality conditions. These programs should pay special attention to young residents who have limited knowledge of air pollution in the Jinchuan mining area. PMID:27455291

  20. Parametric analysis of the biomechanical response of head subjected to the primary blast loading - a data mining approach.

    PubMed

    Zhu, Feng; Kalra, Anil; Saif, Tal; Yang, Zaihan; Yang, King H; King, Albert I

    2016-08-01

    Traumatic brain injury due to primary blast loading has become a signature injury in recent military conflicts and terrorist activities. Extensive experimental and computational investigations have been conducted to study the interrelationships between intracranial pressure response and intrinsic or 'input' parameters such as the head geometry and loading conditions. However, these relationships are very complicated and are usually implicit and 'hidden' in a large amount of simulation/test data. In this study, a data mining method is proposed to explore such underlying information from the numerical simulation results. The heads of different species are described as a highly simplified two-part (skull and brain) finite element model with varying geometric parameters. The parameters considered include peak incident pressure, skull thickness, brain radius and snout length. Their interrelationship and coupling effect are discovered by developing a decision tree based on the large simulation data-set. The results show that the proposed data-driven method is superior to the conventional linear regression method and is comparable to the nonlinear regression method. Considering its capability of exploring implicit information and the relatively simple relationships between response and input variables, the data mining method is considered to be a good tool for an in-depth understanding of the mechanisms of blast-induced brain injury. As a general method, this approach can also be applied to other nonlinear complex biomechanical systems. PMID:26442779

  1. A complementary bioinformatics approach to identify potential plant cell wall glycosyltransferase-encoding genes.

    PubMed

    Egelund, Jack; Skjøt, Michael; Geshi, Naomi; Ulvskov, Peter; Petersen, Bent Larsen

    2004-09-01

    Plant cell wall (CW) synthesizing enzymes can be divided into the glycan (i.e. cellulose and callose) synthases, which are multimembrane spanning proteins located at the plasma membrane, and the glycosyltransferases (GTs), which are Golgi localized single membrane spanning proteins, believed to participate in the synthesis of hemicellulose, pectin, mannans, and various glycoproteins. At the Carbohydrate-Active enZYmes (CAZy) database where e.g. glucoside hydrolases and GTs are classified into gene families primarily based on amino acid sequence similarities, 415 Arabidopsis GTs have been classified. Although much is known with regard to composition and fine structures of the plant CW, only a handful of CW biosynthetic GT genes-all classified in the CAZy system-have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics approach was adopted. First, the entire Arabidopsis proteome was run through the Transmembrane Hidden Markov Model 2.0 server and proteins containing one or, more rarely, two transmembrane domains within the N-terminal 150 amino acids were collected. Second, these sequences were submitted to the SUPERFAMILY prediction server, and sequences that were predicted to belong to the superfamilies NDP-sugartransferase, UDP-glycosyltransferase/glucogen-phosphorylase, carbohydrate-binding domain, Gal-binding domain, or Rossman fold were collected, yielding a total of 191 sequences. Fifty-two accessions already classified in CAZy were discarded. The resulting 139 sequences were then analyzed using the Three-Dimensional-Position-Specific Scoring Matrix and mGenTHREADER servers, and 27 sequences with similarity to either the GT-A or the GT-B fold were obtained. Proof of concept of the present approach has to some extent been provided by our recent demonstration that two members of this pool of 27 non-CAZy-classified putative GTs are xylosyltransferases involved in synthesis of pectin

  2. A Systems Biology Approach Identifies a Regulatory Network in Parotid Acinar Cell Terminal Differentiation

    PubMed Central

    Metzler, Melissa A.; Venkatesh, Srirangapatnam G.; Lakshmanan, Jaganathan; Carenbauer, Anne L.; Perez, Sara M.; Andres, Sarah A.; Appana, Savitri; Brock, Guy N.; Wittliff, James L.; Darling, Douglas S.

    2015-01-01

    Objective The transcription factor networks that drive parotid salivary gland progenitor cells to terminally differentiate remain largely unknown and are vital to understanding the regeneration process. Methodology A systems biology approach was taken to measure mRNA and microRNA expression in vivo across acinar cell terminal differentiation in the rat parotid salivary gland. Laser capture microdissection (LCM) was used to specifically isolate acinar cell RNA at times spanning the month-long period of parotid differentiation. Results Clustering of microarray measurements suggests that expression occurs in four stages. mRNA expression patterns suggest a novel role for Pparg which is transiently increased during mid postnatal differentiation in concert with several target gene mRNAs. 79 microRNAs are significantly differentially expressed across time. Profiles of statistically significant changes of mRNA expression, combined with reciprocal correlations of microRNAs and their target mRNAs, suggest a putative network involving Klf4, a differentiation inhibiting transcription factor, which decreases as several targeting microRNAs increase late in differentiation. The network suggests a molecular switch (involving Prdm1, Sox11, Pax5, miR-200a, and miR-30a) progressively decreases repression of Xbp1 gene transcription, in concert with decreased translational repression by miR-214. The transcription factor Xbp1 mRNA is initially low, increases progressively, and may be maintained by a positive feedback loop with Atf6. Transfection studies show that Xbp1 activates the Mist1 promoter. In addition, Xbp1 and Mist1 each activate the parotid secretory protein (Psp) gene, which encodes an abundant salivary protein, and is a marker of terminal differentiation. Conclusion This study identifies novel expression patterns of Pparg, Klf4, and Sox11 during parotid acinar cell differentiation, as well as numerous differentially expressed microRNAs. Network analysis identifies a novel stemness arm, a

  3. Identifying diffused nitrate sources in a stream in an agricultural field using a dual isotopic approach.

    PubMed

    Ding, Jingtao; Xi, Beidou; Gao, Rutai; He, Liansheng; Liu, Hongliang; Dai, Xuanli; Yu, Yijun

    2014-06-15

    Nitrate (NO3(-)) pollution is a severe problem in aquatic systems in Taihu Lake Basin in China. A dual isotope approach (δ(15)NNO3(-) and δ(18)ONO3(-)) was applied to identify diffused NO3(-) inputs in a stream in an agricultural field at the basin in 2013. The site-specific isotopic characteristics of five NO3(-) sources (atmospheric deposition, AD; NO3(-) derived from soil organic matter nitrification, NS; NO3(-) derived from chemical fertilizer nitrification, NF; groundwater, GW; and manure and sewage, M&S) were identified. NO3(-) concentrations in the stream during the rainy season [mean±standard deviation (SD)=2.5±0.4mg/L] were lower than those during the dry season (mean±SD=4.0±0.5mg/L), whereas the δ(18)ONO3(-) values during the rainy season (mean±SD=+12.3±3.6‰) were higher than those during the dry season (mean±SD=+0.9±1.9‰). Both chemical and isotopic characteristics indicated that mixing with atmospheric NO3(-) resulted in the high δ(18)O values during the rainy season, whereas NS and M&S were the dominant NO3(-) sources during the dry season. A Bayesian model was used to determine the contribution of each NO3(-) source to total stream NO3(-). Results showed that reduced N nitrification in soil zones (including soil organic matter and fertilizer) was the main NO3(-) source throughout the year. M&S contributed more NO3(-) during the dry season (22.4%) than during the rainy season (17.8%). AD generated substantial amounts of NO3(-) in May (18.4%), June (29.8%), and July (24.5%). With the assessment of temporal variation of diffused NO3(-) sources in agricultural field, improved agricultural management practices can be implemented to protect the water resource and avoid further water quality deterioration in Taihu Lake Basin. PMID:24686140

  4. A multi-proxy approach to identifying short-lived marine incursions in the Early Carboniferous

    NASA Astrophysics Data System (ADS)

    Bennett, Carys; Davies, Sarah; Leng, Melanie; Snelling, Andrea; Millward, David; Kearsey, Timothy; Marshall, John; Reves, Emma

    2015-04-01

    This study is a contribution to the TW:eed Project (Tetrapod World: early evolution and diversification), which examines the rebuilding of Carboniferous ecosystems following a mass extinction at the end of the Devonian. The project focuses on the Tournaisian Ballagan Formation of Scotland and the Borders, which contains rare fish and tetrapod material. The Ballagan Formation is characterised by sandstones, dolomitic cementstones, paleosols, siltstones and gypsum deposits. The depositional environment ranges from fluvial, alluvial-plain to marginal-marine environments, with fluvial, floodplain and lacustrine deposition dominant. A multi-proxy approach combining sedimentology, palaeontology, micropalaeontology, palynology and geochemistry is used to identify short-lived marine transgressions onto the floodplain environment. Rare marginal marine fossils are: Chondrites-Phycosiphon, Spirorbis, Serpula, certain ostracod species, rare orthocones, brachiopods and putative marine sharks. More common non-marine fauna include Leiocopida and Podocopida ostracods, Mytilida and Myalinida bivalves, plants, eurypterids, gastropods and fish. Thin carbonate-bearing dolomitic cementstones and siltstones are the sedimentary deposits of marine incursions and occur throughout the formation. Over 600 bulk carbon isotope samples were taken from the 500 metre thick Norham Core (located near Berwick-Upon-Tweed), encompassing a time interval of around 13 million years. The results range from -26‰ to -19‰ δ13Corg, with an average of -19‰, much lighter than the average value for Early Carboniferous marine bulk organic matter (δ13C of -28 to -30). The isotope results correspond to broad-scale changes in the depositional setting, with more positive δ13C in pedogenic sediments and more negative δ13C in un-altered grey siltstones. They may also relate to cryptic (short-lived) marine incursions. A comparison of δ13C values from specific plant/wood fragments, palynology and bulk

  5. Outbreaks source: A new mathematical approach to identify their possible location

    NASA Astrophysics Data System (ADS)

    Buscema, Massimo; Grossi, Enzo; Breda, Marco; Jefferson, Tom

    2009-11-01

    Classical epidemiology has generally relied on the description and explanation of the occurrence of infectious diseases in relation to time occurrence of events rather than to place of occurrence. In recent times, computer generated dot maps have facilitated the modeling of the spread of infectious epidemic diseases either with classical statistics approaches or with artificial “intelligent systems”. Few attempts, however, have been made so far to identify the origin of the epidemic spread rather than its evolution by mathematical topology methods. We report on the use of a new artificial intelligence method (the H-PST Algorithm) and we compare this new technique with other well known algorithms to identify the source of three examples of infectious disease outbreaks derived from literature. The H-PST algorithm is a new system able to project a distances matrix of points (events) into a bi-dimensional space, with the generation of a new point, named hidden unit. This new hidden unit deforms the original Euclidean space and transforms it into a new space (cognitive space). The cost function of this transformation is the minimization of the differences between the original distance matrix among the assigned points and the distance matrix of the same points projected into the bi-dimensional map (or any different set of constraints). For many reasons we will discuss, the position of the hidden unit shows to target the outbreak source in many epidemics much better than the other classic algorithms specifically targeted for this task. Compared with main algorithms known in the location theory, the hidden unit was within yards of the outbreak source in the first example (the 2007 epidemic of Chikungunya fever in Italy). The hidden unit was located in the river between the two village epicentres of the spread exactly where the index case was living. Equally in the second (the 1967 foot and mouth disease epidemic in England), and the third (1854 London Cholera epidemic

  6. An Automated Approach for the Determination of the Seismic Moment Tensor in Mining Environments

    NASA Astrophysics Data System (ADS)

    Wamboldt, Lawrence R.

    A study was undertaken to evaluate an automated process to invert for seismic moment tensors from seismic data recorded in mining environments. The data for this study was recorded at Nickel Rim South mine, Sudbury, Ontario. The mine has a seismic monitoring system manufactured by ESG Solutions that performs continuous monitoring of seismicity. On average, approximately 400 seismic events are recorded each day. Currently, data are automatically processed by ESG Solution's software suite during acquisition. The automatic processors pick the P- and/or S-wave arrivals, locate the events and solve for certain source parameters, excluding the seismic moment tensor. In order to solve for the moment tensor, data must be manually processed, which is laborious and therefore seldom performed. This research evaluates an automatic seismic moment tensor inversion method and demonstrates some of the difficulties (through inversions of real and synthetic seismic data) of the inversion process. Results using the method are also compared to the inversion method currently available from ESG Solutions, which requires the manual picking of first-motion polarities for every event. As a result of the extensive synthetic testing of the automatic inversion program, as well as the inversion of real seismic data, it is apparent that there are key parameters requiring greater accuracy in order to increase the reliability of the automation. These parameters include the source time function definition, source location (in turn requiring more accurate and precise knowledge of the earth media), arrival time picks and an attenuation model to account for ray-path dependent filtering of the source time function. In order to improve the automatic method three key pieces of research are needed: (1) studying various location algorithms (and the effects of increasing earth model intricacy) and automatic time picking to improve source location methods, (2) studying how the source time pulse can be

  7. A Comprehensive Regression-Based Approach for Identifying Sources of Person Misfit in Typical-Response Measures

    ERIC Educational Resources Information Center

    Ferrando, Pere J.; Lorenzo-Seva, Urbano

    2016-01-01

    This article proposes a general parametric item response theory approach for identifying sources of misfit in response patterns that have been classified as potentially inconsistent by a global person-fit index. The approach, which is based on the weighted least squared regression of the observed responses on the model-expected responses, can be…

  8. Selection Effects in Identifying Magnetic Clouds and the Importance of the Closest Approach Parameter

    NASA Technical Reports Server (NTRS)

    Lepping, R. P.; Wu, Chin-Chun

    2010-01-01

    This study is motivated by the unusually low number of magnetic clouds (MCs) that are strictly identified within interplanetary coronal mass ejections (ICMEs), as observed at 1 AU; this is usually estimated to be around 30% or lower. But a looser definition of MCs may significantly increase this percentage. Another motivation is the unexpected shape of the occurrence distribution of the observers' "closest approach distances" (measured from a MC's axis, and called CA) which drops off somewhat rapidly as |CA| (in % of MC radius) approaches 100%, based on earlier studies. We suggest, for various geometrical and physical reasons, that the |CA|-distribution should be somewhere between a uniform one and the one actually observed, and therefore the 30% estimate should be higher. So we ask, When there is a failure to identify a MC within an ICME, is it occasionally due to a large |CA| passage, making MC identification more difficult, i.e., is it due to an event selection effect? In attempting to answer this question we examine WIND data to obtain an accurate distribution of the number of MCs vs. |CA| distance, whether the event is ICME-related or not, where initially a large number of cases (N=98) are considered. This gives a frequency distribution that is far from uniform, confirming earlier studies. This, along with the fact that there are many ICME identification-parameters that do not depend on |CA|, suggests that an MC event selection effect may indeed explain at least part of the low ratio of (No. MCs)/(No. ICMEs). We also show that there is an acceptable geometrical and physical consistency in the relationships for both average "normalized" magnetic field intensity change and field direction change vs. |CA| within a MC, suggesting that our estimates of |CA|, B(sub 0) (magnetic field intensity on the axis), and choice of a proper "cloud coordinate" system (all needed in the analysis) are acceptably accurate. Therefore the MC fitting model (Lepping et al., 1990) is

  9. Assessment of a Novel Approach to Identify Trichiasis Cases Using Community Treatment Assistants in Tanzania

    PubMed Central

    Greene, Gregory S.; West, Sheila K.; Mkocha, Harran; Munoz, Beatriz; Merbs, Shannath L.

    2015-01-01

    Background Simple surgical intervention advocated by the World Health Organization can alleviate trachomatous trichiasis (TT) and prevent subsequent blindness. A large backlog of TT cases remain unidentified and untreated. To increase identification and referral of TT cases, a novel approach using standard screening questions, a card, and simple training for Community Treatment Assistants (CTAs) to use during Mass Drug Administration (MDA) was developed and evaluated in Kongwa District, a trachoma-endemic area of central Tanzania. Methodology/Principal Findings A community randomized trial was conducted in 36 communities during MDA. CTAs in intervention villages received an additional half-day of training and a TT screening card in addition to the training received by CTAs in villages assigned to usual care. All MDA participants 15 years and older were screened for TT, and senior TT graders confirmed case status by evaluating all screened-positive cases. A random sample of those screened negative for TT and those who did not present at MDA were also evaluated by the master graders. Intervention CTAs identified 5.6 times as many cases (n = 50) as those assigned to usual care (n = 9, p < 0.05). While specificity was above 90% for both groups, the sensitivity for the novel screening tool was 31.2% compared to 5.6% for the usual care group (p < 0.05). Conclusions/Significance CTAs appear to be viable resources for the identification of TT cases. Additional training and use of a TT screening card significantly increased the ability of CTAs to recognize and refer TT cases during MDA; however, further efforts are needed to improve case detection and reduce the number of false positive cases. PMID:26658938
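
    The sensitivity and specificity figures quoted in this abstract follow the usual screening-test definitions, sketched below. The counts in the example are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual cases that the screen flags as positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of non-cases that the screen correctly leaves unflagged."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts for illustration only.
example_sensitivity = sensitivity(true_pos=5, false_neg=15)
example_specificity = specificity(true_neg=95, false_pos=5)
```

    A screen can therefore combine high specificity with low sensitivity, which is the pattern the abstract reports for both study arms.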

  10. A multiplexed analysis approach identifies new association of inflammatory proteins in patients with overactive bladder.

    PubMed

    Ma, Emily; Vetter, Joel; Bliss, Laura; Lai, H Henry; Mysorekar, Indira U; Jain, Sanjay

    2016-07-01

    Overactive bladder (OAB) is a common debilitating bladder condition with unknown etiology and limited diagnostic modalities. Here, we explored a novel high-throughput and unbiased multiplex approach with cellular and molecular components in a well-characterized patient cohort to identify biomarkers that could be reliably used to distinguish OAB from controls or provide insights into underlying etiology. As a secondary analysis, we determined whether this method could discriminate between OAB and other chronic bladder conditions. We analyzed plasma samples from healthy volunteers (n = 19) and patients diagnosed with OAB, interstitial cystitis/bladder pain syndrome (IC/BPS), or urinary tract infections (UTI; n = 51) for proinflammatory, chemokine, cytokine, angiogenesis, and vascular injury factors using Meso Scale Discovery (MSD) analysis and urinary cytological analysis. Wilcoxon rank-sum tests were used to perform univariate and multivariate comparisons between patient groups (controls, OAB, IC/BPS, and UTI). Multivariate logistic regression models were fit for each MSD analyte on 1) OAB patients and controls, 2) OAB and IC/BPS patients, and 3) OAB and UTI patients. Age, race, and sex were included as independent variables in all multivariate analysis. Receiver operating characteristic (ROC) curves were generated to determine the diagnostic potential of a given analyte. Our findings demonstrate that five analytes, i.e., interleukin 4, TNF-α, macrophage inflammatory protein-1β, serum amyloid A, and Tie2 can reliably differentiate OAB relative to controls and can be used to distinguish OAB from the other conditions. Together, our pilot study suggests a molecular imbalance in inflammatory proteins may contribute to OAB pathogenesis. PMID:27029431
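
    The diagnostic potential summarized by an ROC curve is often reported as the area under the curve (AUC), which equals the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below computes it by direct pairwise comparison; it is a generic illustration, not the authors' analysis, and the analyte values are invented.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case has the
    higher score; ties count as half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Invented analyte levels for illustration only.
auc_separated = roc_auc([3.0, 4.0], [1.0, 2.0])
auc_overlapping = roc_auc([1.0, 3.0], [2.0, 4.0])
```

    An AUC of 0.5 corresponds to no discrimination between groups and 1.0 to perfect separation; the pairwise count used here is the same quantity that underlies the Wilcoxon rank-sum statistic mentioned in the abstract.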

  11. A Machine Learning Approach To Identify Hydrogenosomal Proteins in Trichomonas vaginalis

    PubMed Central

    Burstein, David; Gould, Sven B.; Zimorski, Verena; Kloesges, Thorsten; Kiosse, Fuat; Major, Peter; Martin, William F.; Pupko, Tal

    2012-01-01

    The protozoan parasite Trichomonas vaginalis is the causative agent of trichomoniasis, the most widespread nonviral sexually transmitted disease in humans. It possesses hydrogenosomes—anaerobic mitochondria that generate H2, CO2, and acetate from pyruvate while converting ADP to ATP via substrate-level phosphorylation. T. vaginalis hydrogenosomes lack a genome and translation machinery; hence, they import all their proteins from the cytosol. To date, however, only 30 imported proteins have been shown to localize to the organelle. A total of 226 nuclear-encoded proteins inferred from the genome sequence harbor a characteristic short N-terminal presequence, reminiscent of mitochondrial targeting peptides, which is thought to mediate hydrogenosomal targeting. Recent studies suggest, however, that the presequences might be less important than previously thought. We sought to identify new hydrogenosomal proteins within the 59,672 annotated open reading frames (ORFs) of T. vaginalis, independent of the N-terminal targeting signal, using a machine learning approach. Our training set included 57 gene and protein features determined for all 30 known hydrogenosomal proteins and 576 nonhydrogenosomal proteins. Several classifiers were trained on this set to yield an import score for all proteins encoded by T. vaginalis ORFs, predicting the likelihood of hydrogenosomal localization. The machine learning results were tested through immunofluorescence assay and immunodetection in isolated cell fractions of 14 protein predictions using hemagglutinin constructs expressed under the homologous SCSα promoter in transiently transformed T. vaginalis cells. Localization of 6 of the 10 top predicted hydrogenosome-localized proteins was confirmed, and two of these were found to lack an obvious N-terminal targeting signal. PMID:22140228

  12. Identifying Country-Specific Cultures of Physics Education: A differential item functioning approach

    NASA Astrophysics Data System (ADS)

    Mesic, Vanes

    2012-11-01

    In international large-scale assessments of educational outcomes, student achievement is often represented by unidimensional constructs. This approach allows for drawing general conclusions about country rankings with respect to the given achievement measure, but it typically does not provide the specific diagnostic information that is necessary for systematic comparisons and improvements of educational systems. Useful information could be obtained by exploring the differences in national profiles of student achievement between low-achieving and high-achieving countries. In this study, we aimed to identify the relative weaknesses and strengths of eighth graders' physics achievement in Bosnia and Herzegovina in comparison to the achievement of their peers from Slovenia. For this purpose, we ran a secondary analysis of Trends in International Mathematics and Science Study (TIMSS) 2007 data. The student sample consisted of 4,220 students from Bosnia and Herzegovina and 4,043 students from Slovenia. After analysing the cognitive demands of TIMSS 2007 physics items, the corresponding differential item functioning (DIF)/differential group functioning contrasts were estimated. Approximately 40% of items exhibited large DIF contrasts, indicating significant differences between cultures of physics education in Bosnia and Herzegovina and Slovenia. The relative strength of students from Bosnia and Herzegovina proved to be mainly associated with the topic area 'Electricity and magnetism'. Classes of items which required knowledge of the experimental method, counterintuitive thinking, proportional reasoning and/or the use of complex knowledge structures proved to be differentially easier for students from Slovenia. In the light of the presented results, the common practice of ranking countries with respect to universally established cognitive categories seems to be potentially misleading.

  13. Predicting Fish Growth Potential and Identifying Water Quality Constraints: A Spatially-Explicit Bioenergetics Approach

    NASA Astrophysics Data System (ADS)

    Budy, Phaedra; Baker, Matthew; Dahle, Samuel K.

    2011-10-01

    Anthropogenic impairment of water bodies represents a global environmental concern, yet few attempts have successfully linked fish performance to thermal habitat suitability and fewer have distinguished co-varying water quality constraints. We interfaced fish bioenergetics, field measurements, and Thermal Remote Imaging to generate a spatially-explicit, high-resolution surface of fish growth potential, and next employed a structured hypothesis to detect relationships among measures of fish performance and co-varying water quality constraints. Our thermal surface of fish performance captured the amount and spatial-temporal arrangement of thermally-suitable habitat for three focal species in an extremely heterogeneous reservoir, but interpretation of this pattern was initially confounded by seasonal covariation of water residence time and water quality. Subsequent path analysis revealed that in terms of seasonal patterns in growth potential, catfish and walleye responded to temperature, positively and negatively, respectively; crappie and walleye responded to eutrophy (negatively). At the high eutrophy levels observed in this system, some desired fishes appear to suffer from excessive cultural eutrophication within the context of elevated temperatures whereas others appear to be largely unaffected or even enhanced. Our overall findings do not lead to the conclusion that this system is degraded by pollution; however, they do highlight the need to use a sensitive focal species in the process of determining allowable nutrient loading and as integrators of habitat suitability across multiple spatial and temporal scales. We provide an integrated approach useful for quantifying fish growth potential and identifying water quality constraints on fish performance at spatial scales appropriate for whole-system management.

  14. A spatial modeling approach to identify potential butternut restoration sites in Mammoth Cave National Park

    USGS Publications Warehouse

    Thompson, L.M.; Van Manen, F.T.; Schlarbaum, S.E.; DePoy, M.

    2006-01-01

    Incorporation of disease resistance is nearly complete for several important North American hardwood species threatened by exotic fungal diseases. The next important step toward species restoration would be to develop reliable tools to delineate ideal restoration sites on a landscape scale. We integrated spatial modeling and remote sensing techniques to delineate potential restoration sites for Butternut (Juglans cinerea L.) trees, a hardwood species being decimated by an exotic fungus, in Mammoth Cave National Park (MCNP), Kentucky. We first developed a multivariate habitat model to determine optimum Butternut habitats within MCNP. Habitat characteristics of 54 known Butternut locations were used in combination with eight topographic and land use data layers to calculate an index of habitat suitability based on Mahalanobis distance (D2). We used a bootstrapping technique to test the reliability of model predictions. Based on a threshold value for the D2 statistic, 75.9% of the Butternut locations were correctly classified, indicating that the habitat model performed well. Because Butternut seedlings require extensive amounts of sunlight to become established, we used canopy cover data to refine our delineation of favorable areas for Butternut restoration. Areas with the most favorable conditions to establish Butternut seedlings were limited to 291.6 ha. Our study provides a useful reference on the amount and location of favorable Butternut habitat in MCNP and can be used to identify priority areas for future Butternut restoration. Given the availability of relevant habitat layers and accurate location records, our approach can be applied to other tree species and areas. © 2006 Society for Ecological Restoration International.
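
    The habitat index in the record above is based on the squared Mahalanobis distance (D2) between a candidate site and the known Butternut locations. The following is only a minimal sketch of that statistic for two habitat variables; the variable choice (elevation, slope) and all sample values are hypothetical and are not taken from the study, which used eight data layers.

```python
from statistics import mean

def mahalanobis_d2(point, reference):
    """Squared Mahalanobis distance of a 2-variable point from a reference sample.

    `reference` is a list of (x, y) habitat measurements at known locations.
    Smaller values mean the point is more similar to the reference habitat.
    """
    xs = [r[0] for r in reference]
    ys = [r[1] for r in reference]
    mx, my = mean(xs), mean(ys)
    n = len(reference)
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^T S^-1 d, with the inverse of the 2x2 matrix written out explicitly.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical (elevation in m, slope in degrees) at known tree locations.
known_sites = [(200, 5), (220, 8), (210, 6), (230, 9), (190, 4)]

if __name__ == "__main__":
    print(mahalanobis_d2((212, 6.5), known_sites))  # close to the reference habitat
    print(mahalanobis_d2((300, 2.0), known_sites))  # far from the reference habitat
```

    A site would then be classed as suitable when its D2 falls at or below a chosen threshold; the threshold used in the study is not given in the abstract.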

  15. [Combined approach to the assessment of new forms of work organization at Kuzbass coal mines].

    PubMed

    Davydova, N N; Diatlova, L A

    1991-02-01

    A physiological and hygienic assessment was made of working conditions, the difficulty and tension of labour, miners' fatigue during the work shift, indices of health status, and the sociopsychological climate under the new team form of labour organization. It was found that the transition to the contract, piece-rate and premium system of labour organization leads to higher labour productivity, longer use of mining equipment, and greater stability of the collective. At the same time, however, working conditions worsen and the work becomes more difficult and tense, which leads to more rapid development of fatigue among miners during the work shift and to chronic morbidity. Prophylaxis should be aimed at rationalizing work and rest regimes, normalizing the psychological climate in brigades, and strengthening treatment and prophylactic work. PMID:2055504

  16. Heavy metal contamination and its indexing approach for groundwater of Goa mining region, India

    NASA Astrophysics Data System (ADS)

    Singh, Gurdeep; Kamal, Rakesh Kant

    2016-06-01

    The objective of the study is to reveal the seasonal variations in groundwater quality with respect to heavy metal contamination. To assess the extent of heavy metal contamination, groundwater samples were collected from 45 different locations in and around the Goa mining area during the monsoon and post-monsoon seasons. The concentrations of heavy metals, such as lead, copper, manganese, zinc, cadmium, iron, and chromium, were determined using an atomic absorption spectrophotometer. Most of the samples were within the limits, except for Fe content during the monsoon season at two sampling locations, which was above the desirable limit of 300 µg/L set by the Indian drinking water standard. The data generated were used to calculate the heavy metal pollution index (HPI) for groundwater. The mean values of HPI were 1.5 in the monsoon season and 2.1 in the post-monsoon season, and these values are well below the critical index limit of 100.

  17. A geostatistical approach to estimate mining efficiency indicators with flexible meshes

    NASA Astrophysics Data System (ADS)

    Freixas, Genis; Garriga, David; Fernàndez-Garcia, Daniel; Sanchez-Vila, Xavier

    2014-05-01

    Geostatistics is a branch of statistics developed originally to predict probability distributions of ore grades for mining operations by considering the attributes of a geological formation at unknown locations as a set of correlated random variables. Mining exploitations typically aim to maintain acceptable mineral grades to produce commercial products based upon demand. In this context, we present a new geostatistical methodology to estimate strategic efficiency maps that incorporate hydraulic test data, the evolution of concentrations with time obtained from chemical analysis (packer tests and production wells), and hydraulic head variations. The methodology is applied to a salt basin in South America. The exploitation is based on the extraction of brines through vertical and horizontal wells. Thereafter, brines are precipitated in evaporation ponds to obtain target potassium and magnesium salts of economic interest. Lithium carbonate is obtained as a byproduct of the production of potassium chloride. Aside from providing an assemblage of traditional geostatistical methods, the strength of this study lies in the new methodology developed, which focuses on finding the best sites to exploit the brines while maintaining efficiency criteria. Thus, some strategic indicator efficiency maps have been developed under the specific criteria imposed by exploitation standards to incorporate new extraction wells in new areas that would allow production to be maintained or improved. Results show that the uncertainty quantification of the efficiency plays a dominant role and that the use of flexible meshes, which properly describe the curvilinear features associated with vertical stratification, provides a more consistent estimation of the geological processes. Moreover, we demonstrate that the vertical correlation structure at the given salt basin is essentially linked to variations in the formation thickness, which calls for flexible meshes and non-stationary stochastic processes.

  18. Genetic Programming and Frequent Itemset Mining to Identify Feature Selection Patterns of iEEG and fMRI Epilepsy Data

    PubMed Central

    Smart, Otis; Burrell, Lauren

    2014-01-01

    Pattern classification for intracranial electroencephalogram (iEEG) and functional magnetic resonance imaging (fMRI) signals has furthered epilepsy research toward understanding the origin of epileptic seizures and localizing dysfunctional brain tissue for treatment. Prior research has demonstrated that implicitly selecting features with a genetic programming (GP) algorithm more effectively determined the proper features to discern biomarker and non-biomarker interictal iEEG and fMRI activity than conventional feature selection approaches. However, for each of the iEEG and fMRI modalities, it is still uncertain whether the stochastic properties of indirect feature selection with a GP yield (a) consistent results within a patient data set and (b) features that are specific or universal across multiple patient data sets. We examined the reproducibility of implicitly selecting features to classify interictal activity using a GP algorithm by performing several selection trials and subsequent frequent itemset mining (FIM) for separate iEEG and fMRI epilepsy patient data. We observed within-subject consistency and across-subject variability with some small similarity for selected features, indicating a clear need for patient-specific features and a possible need for patient-specific feature selection and/or classification. For the fMRI, using nearest-neighbor classification and 30 GP generations, we obtained over 60% median sensitivity and over 60% median selectivity. For the iEEG, using nearest-neighbor classification and 30 GP generations, we obtained over 65% median sensitivity and over 65% median selectivity, except for one patient. PMID:25580059
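
    The record above applies frequent itemset mining to the feature sets chosen in repeated selection trials. As a rough illustration only, the sketch below counts how often each small feature subset recurs across trials by brute-force enumeration (not an optimized algorithm such as Apriori, and not the authors' implementation). The feature names, the trials, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(trials, min_support, max_size=3):
    """Return {itemset: support} for feature subsets that recur across trials.

    `trials` is a list of feature collections, one per selection trial.
    Support is the fraction of trials that contain every feature in the itemset.
    """
    counts = Counter()
    for selected in trials:
        features = sorted(set(selected))  # canonical order, no duplicates
        for size in range(1, max_size + 1):
            for itemset in combinations(features, size):
                counts[itemset] += 1
    n = len(trials)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

# Hypothetical features selected in four trials for one patient.
example_trials = [
    {"energy", "line_length", "entropy"},
    {"energy", "line_length"},
    {"energy", "kurtosis"},
    {"energy", "line_length", "kurtosis"},
]

if __name__ == "__main__":
    for itemset, support in sorted(frequent_itemsets(example_trials, 0.75).items()):
        print(itemset, support)
```

    Itemsets with high support within one patient but low support across patients would correspond to the within-subject consistency and across-subject variability described in the abstract.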

  19. A systems biology and proteomics-based approach identifies SRC and VEGFA as biomarkers in risk factor mediated coronary heart disease.

    PubMed

    V, Alexandar; Nayar, Pradeep G; Murugesan, R; S, Shajahan; Krishnan, Jayalakshmi; Ahmed, Shiek S S J

    2016-07-19

    Coronary heart disease (CHD) is the most common cause of death worldwide. The burden of CHD increases with risk factors such as smoking, hypertension, obesity and diabetes. Several studies have demonstrated the association of these classical risk factors with CHD. However, the mechanisms of these associations remain largely unclear due to the complexity of disease pathophysiology and the lack of an integrative approach that could provide a definite understanding of the molecular linkage. To overcome these problems, we propose a novel systems biology approach that relates causative genes, interactomes and pathways to elucidate the risk factors mediating the molecular mechanisms and biomarkers for feasible diagnosis. The literature was mined to retrieve the causative genes of each risk factor and CHD to construct protein interactomes. The interactomes were examined to identify 298 common molecular signatures. The common signatures were mapped to the tissue network to synthesize a sub-network consisting of 82 proteins. Further, the dissection of the sub-network provides functional modules representing a diverse range of molecular functions, including the AKT/PI3K, MAPK and Wnt pathways. Also, the prioritization of functional modules identifies SRC, VEGFA and HIF1A as potential candidate markers. Further, we validate these candidates with the existing markers CRP, NOS3 and VCAM1 in the serum of 63 individuals, 33 with CHD and 30 controls, using ELISA. SRC, VEGFA, HIF1A, CRP and NOS3 were significantly altered in patients compared to controls. These results support the utility of these candidate markers for the diagnosis of CHD. Overall, our molecular observations indicate the influence of risk factors in the pathophysiology of CHD and identify serum markers for diagnosis. PMID:27279347

  20. Employment among Working-Age Adults with Multiple Sclerosis: A Data-Mining Approach to Identifying Employment Interventions

    ERIC Educational Resources Information Center

    Bishop, Malachy; Chan, Fong; Rumrill, Phillip D., Jr.; Frain, Michael P.; Tansey, Timothy N.; Chiu, Chung-Yi; Strauser, David; Umeasiegbu, Veronica I.

    2015-01-01

    Purpose: To examine demographic, functional, and clinical multiple sclerosis (MS) variables affecting employment status in a national sample of adults with MS in the United States. Method: The sample included 4,142 working-age (20-65 years) Americans with MS (79.1% female) who participated in a national survey. The mean age of participants was…

  1. A multivariate and stochastic approach to identify key variables to rank dairy farms on profitability.

    PubMed

    Atzori, A S; Tedeschi, L O; Cannas, A

    2013-05-01

    The economic efficiency of dairy farms is the main goal of farmers. The objective of this work was to use routinely available information at the dairy farm level to develop an index of profitability to rank dairy farms and to assist the decision-making process of farmers to increase the economic efficiency of the entire system. A stochastic modeling approach was used to study the relationships between inputs and profitability (i.e., income over feed cost; IOFC) of dairy cattle farms. The IOFC was calculated as: milk revenue + value of male calves + culling revenue - herd feed costs. Two databases were created. The first one was a development database, which was created from technical and economic variables collected in 135 dairy farms. The second one was a synthetic database (sDB) created from 5,000 synthetic dairy farms using the Monte Carlo technique and based on the characteristics of the development database data. The sDB was used to develop a ranking index as follows: (1) principal component analysis (PCA), excluding IOFC, was used to identify principal components (sPC); and (2) coefficient estimates of a multiple regression of the IOFC on the sPC were obtained. Then, the eigenvectors of the sPC were used to compute the principal component values for the original 135 dairy farms that were used with the multiple regression coefficient estimates to predict IOFC (dRI; ranking index from development database). The dRI was used to rank the original 135 dairy farms. The PCA explained 77.6% of the sDB variability and 4 sPC were selected. The sPC were associated with herd profile, milk quality and payment, poor management, and reproduction based on the significant variables of the sPC. The mean IOFC in the sDB was 0.1377 ± 0.0162 euros per liter of milk (€/L). The dRI explained 81% of the variability of the IOFC calculated for the 135 original farms. When the number of farms below and above 1 standard deviation (SD) of the dRI were calculated, we found that 21
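
    The IOFC definition quoted in the record above (milk revenue + value of male calves + culling revenue - herd feed costs) can be written out directly. This is a minimal sketch: the per-liter normalization by milk volume is an assumption based on the €/L units reported in the abstract, and the farm figures are hypothetical.

```python
def income_over_feed_cost(milk_revenue, male_calf_value, culling_revenue, herd_feed_costs):
    """IOFC as defined in the abstract, in the same currency as the inputs."""
    return milk_revenue + male_calf_value + culling_revenue - herd_feed_costs

def iofc_per_liter(milk_revenue, male_calf_value, culling_revenue,
                   herd_feed_costs, milk_liters):
    """IOFC divided by milk volume (assumed normalization for a EUR/L figure)."""
    if milk_liters <= 0:
        raise ValueError("milk_liters must be positive")
    total = income_over_feed_cost(milk_revenue, male_calf_value,
                                  culling_revenue, herd_feed_costs)
    return total / milk_liters

if __name__ == "__main__":
    # Hypothetical farm: all monetary values in euros, milk volume in liters.
    print(iofc_per_liter(1000, 50, 30, 600, 3000))
```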

  2. Novel approach to identifying the hepatitis B virus pre-S deletions associated with hepatocellular carcinoma

    PubMed Central

    Zhao, Zhi-Mei; Jin, Yan; Gan, Yu; Zhu, Yu; Chen, Tao-Yang; Wang, Jin-Bing; Sun, Yan; Cao, Zhi-Gang; Qian, Geng-Sun; Tu, Hong

    2014-01-01

    AIM: To develop a novel non-sequencing method for the detection of hepatitis B virus (HBV) pre-S deletion mutants in HBV carriers. METHODS: The entire region of HBV pre-S1 and pre-S2 was amplified by polymerase chain reaction (PCR). The size of PCR products was subsequently determined by capillary gel electrophoresis (CGE). CGE was carried out in a PACE-MDQ instrument equipped with a UV detector set at 254 nm. The samples were separated in 50 μm ID eCAP Neutral Coated Capillaries using a voltage of 6 kV for 30 min. Data acquisition and analysis were performed using the 32 Karat Software. A total of 114 DNA clones containing different sizes of the HBV pre-S gene were used to determine the accuracy of the CGE method. One hundred and fifty seven hepatocellular carcinoma (HCC) and 160 non-HCC patients were recruited into the study to assess the association between HBV pre-S deletion and HCC by using the newly-established CGE method. Nine HCC cases with HBV pre-S deletion at the diagnosis year were selected to conduct a longitudinal observation using serial serum samples collected 2-9 years prior to HCC diagnosis. RESULTS: CGE allowed the separation of PCR products differing in size > 3 bp and was able to identify 10% of the deleted DNA in a background of wild-type DNA. The accuracy rate of CGE-based analysis was 99.1% compared with the clone sequencing results. Using this assay, pre-S deletion was more frequently found in HCC patients than in non-HCC controls (47.1% vs 28.1%, P < 0.001). Interestingly, the increased risk of HCC was mainly contributed by the short deletion of pre-S. While the deletion ≤ 99 bp was associated with a 2.971-fold increased risk of HCC (95%CI: 1.723-5.122, P < 0.001), large deletion (> 99 bp) did not show any association with HCC (P = 0.918, OR = 0.966, 95%CI: 0.501-1.863). Of the 9 patients who carried pre-S deletions at the stage of HCC, 88.9% (8/9) had deletions 2-5 years prior to HCC, while only 44.4% (4/9) contained such deletions 6

  3. Identifying Behavioral Barriers to Campus Sustainability: A Multi-Method Approach

    ERIC Educational Resources Information Center

    Horhota, Michelle; Asman, Jenni; Stratton, Jeanine P.; Halfacre, Angela C.

    2014-01-01

    Purpose: The purpose of this paper is to assess the behavioral barriers to sustainable action in a campus community. Design/methodology/approach: This paper reports three different methodological approaches to the assessment of behavioral barriers to sustainable actions on a college campus. Focus groups and surveys were used to assess campus…

  4. Geologic considerations in underground coal mining system design

    NASA Technical Reports Server (NTRS)

    Camilli, F. A.; Maynard, D. P.; Mangolds, A.; Harris, J.

    1981-01-01

    Geologic characteristics of coal resources which may impact new extraction technologies are identified and described to aid system designers and planners in their task of designing advanced coal extraction systems for the central Appalachian region. These geologic conditions are then organized into a matrix identified as the baseline mine concept. A sample region, eastern Kentucky, is analyzed using both the developed baseline mine concept and the traditional geologic investigative approach.

  5. 75 FR 44973 - Report: A New Approach to Targeting Inspection Resources and Identifying Patterns of Adulteration...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-30

    ... Identifying Patterns of Adulteration: The Reportable Food Registry; Availability AGENCY: Food and Drug... Identifying Patterns of Adulteration: The Reportable Food Registry.'' The report presents FDA's experience... Registry is to help FDA better protect public health by tracking patterns of food and feed adulteration...

  6. Identifying Creatively Gifted Students: Necessity of a Multi-Method Approach

    ERIC Educational Resources Information Center

    Ambrose, Laura; Machek, Greg R.

    2015-01-01

    The process of identifying students as creatively gifted provides numerous challenges for educators. Although many schools assess for creativity in identifying students for gifted and talented services, the relationship between creativity and giftedness is often not fully understood. This article reviews commonly used methods of creativity…

  7. Modeling Approach/Strategy for Corrective Action Unit 97, Yucca Flat and Climax Mine, Revision 0

    SciTech Connect

    Janet Willie

    2003-08-01

    The objectives of the UGTA corrective action strategy are to predict the location of the contaminant boundary for each CAU, develop and implement a corrective action, and close each CAU. The process for achieving this strategy includes modeling to define the maximum extent of contaminant transport within a specified time frame. Modeling is a method of forecasting how the hydrogeologic system, including the underground test cavities, will behave over time with the goal of assessing the migration of radionuclides away from the cavities and chimneys. Use of flow and transport models to achieve the objectives of the corrective action strategy is specified in the FFACO. In the Yucca Flat/Climax Mine system, radionuclide migration will be governed by releases from the cavities and chimneys, and transport in alluvial aquifers, fractured and partially fractured volcanic rock aquifers and aquitards, the carbonate aquifers, and in intrusive units. Additional complexity is associated with multiple faults in Yucca Flat and the need to consider reactive transport mechanisms that both reduce and enhance the mobility of radionuclides. A summary of the data and information that form the technical basis for the model is provided in this document.

  8. Using a Data Mining Approach to Develop a Student Engagement-Based Institutional Typology. IR Applications, Volume 18, February 8, 2009

    ERIC Educational Resources Information Center

    Luan, Jing; Zhao, Chun-Mei; Hayek, John C.

    2009-01-01

    Data mining provides both systematic and systemic ways to detect patterns of student engagement among students at hundreds of institutions. Using traditional statistical techniques alone, the task would be significantly difficult--if not impossible--considering the size and complexity in both data and analytical approaches necessary for this…

  9. An integrated approach to identify protein complex based on best neighbour and modularity increment.

    PubMed

    Shen, Xianjun; Zhao, Yanli; Li, Yanan; Yi, Yang; He, Tingting; Yang, Jincai

    2015-01-01

    In order to overcome the limitations of global modularity and the deficiency of local modularity, we propose a hybrid modularity measure, Local-Global Quantification (LGQ), which considers global modularity and local modularity together. LGQ adopts an adjustable module feature parameter to control the balance between global detection capability and local search capability in a Protein-Protein Interaction (PPI) network. Furthermore, we develop a new protein complex mining algorithm called Best Neighbour and Local-Global Quantification (BN-LGQ), which integrates the best neighbour node and modularity increment. BN-LGQ expands the protein complex by quickly searching for the best neighbour node of the current cluster and by calculating the modularity increment as a metric to determine whether the best neighbour node can join the current cluster. The experimental results show that BN-LGQ achieves better accuracy in predicting protein complexes and a higher match with the reference protein complexes than the MCL and MCODE algorithms. Moreover, BN-LGQ can effectively discover protein complexes with better biological significance in the PPI network. PMID:26336669

  10. A Hybrid Knowledge-Based and Data-Driven Approach to Identifying Semantically Similar Concepts

    PubMed Central

    Pivovarov, Rimma; Elhadad, Noémie

    2012-01-01

    An open research question when leveraging ontological knowledge is when to treat different concepts separately from each other and when to aggregate them. For instance, concepts for the terms "paroxysmal cough" and "nocturnal cough" might be aggregated in a kidney disease study, but should be left separate in a pneumonia study. Determining whether two concepts are similar enough to be aggregated can help build better datasets for data mining purposes and avoid signal dilution. Quantifying the similarity among concepts is a difficult task, however, in part because such similarity is context-dependent. We propose a comprehensive method, which computes a similarity score for a concept pair by combining data-driven and ontology-driven knowledge. We demonstrate our method on concepts from SNOMED-CT and on a corpus of clinical notes of patients with chronic kidney disease. By combining information from usage patterns in clinical notes and from ontological structure, the method can prune out concepts that are simply related from those which are semantically similar. When evaluated against a list of concept pairs annotated for similarity, our method reaches an AUC (area under the curve) of 92%. PMID:22289420

  11. Ongoing soil arsenic exposure of children living in an historical gold mining area in regional Victoria, Australia: Identifying risk factors associated with uptake

    NASA Astrophysics Data System (ADS)

    Martin, Rachael; Dowling, Kim; Pearce, Dora; Bennett, John; Stopic, Attila

    2013-11-01

    Elevated levels of arsenic have been observed in some mine wastes and soils around historical gold mining areas in regional Victoria, Australia. Arsenic uptake from soil by children living in these areas has been demonstrated using toenail arsenic concentration as a biomarker, with evidence of some systemic absorption associated with periodic exposures. We conducted a follow-up study to ascertain if toenail arsenic concentrations, and risk factors for exposure, had changed over a five year period in an historical gold mining region in western regional Victoria, Australia. Residential soil samples (N = 14) and toenail clippings (N = 24) were analyzed for total arsenic using instrumental neutron activation analysis, including 19 toenail clippings samples that were obtained from the same study cohort in 2006. Toenail arsenic concentrations in 2011 (geometric mean, 0.171 μg/g; range, 0.030-0.540 μg/g) were significantly lower than those in 2006 (geometric mean, 0.464 μg/g; range, 0.150-2.10 μg/g; p < 0.001). However, toenail arsenic concentrations were again correlated with soil arsenic levels (Spearman's rho = 0.630; p = 0.001). Spending time outdoors more often and for longer periods correlates with increased arsenic uptake (p < 0.05). Mining-influenced residential soils represent a long-term continuing source for potential arsenic exposure for children living in this historical mining region.

  12. Testing alternative decision approaches for identifying cleanup priorities at contaminated sites.

    PubMed

    Arvai, Joseph; Gregory, Robin

    2003-04-15

    This exploratory study compares two approaches for involving nonexpert stakeholders in difficult policy choices. Both approaches have as their goal informing members of the public about contaminated sites and involving them in decisions regarding their cleanup. The first approach focuses on technical information and seeks to improve the available knowledge base so that participants can make choices informed by detailed scientific data. This approach is similar in intent to many of the science-based initiatives in public involvement now being undertaken by EPA, DOE, and other federal or state agencies. The second approach, in contrast, focuses on values-oriented information and seeks to improve stakeholders' ability to make difficult choices in light of required tradeoffs across a variety of technical and nontechnical concerns. The results demonstrate that although both approaches help to increase participants' knowledge level, a values-based approach is more successful in terms of helping nonexpert participants to make decisions about what have historically been viewed as primarily technical problems. PMID:12731826

  13. Expert Mining for Solving Social Harmony Problems

    NASA Astrophysics Data System (ADS)

    Gu, Jifa; Song, Wuqi; Zhu, Zhengxiang; Liu, Yijun

    Social harmony problems exist in the social system, which is an open giant complex system. To solve such problems, the Meta-synthesis system approach proposed by Qian XS et al. is applied. In this approach, data, information, knowledge, models, experience and wisdom are integrated and synthesized. Data mining, text mining and web mining are good techniques for using data, information and knowledge. Model mining, psychology mining and expert mining are new techniques for mining ideas, opinions, experiences and wisdom. In this paper we introduce expert mining, which is based on mining experience, knowledge and wisdom directly from experts, managers and leaders.

  14. Integrated approach to assess the environmental impact of mining activities: estimation of the spatial distribution of soil contamination (Panasqueira mining area, Central Portugal).

    PubMed

    Candeias, Carla; Ávila, Paula F; Ferreira da Silva, Eduardo; Teixeira, João Paulo

    2015-03-01

    Over the years, mining and beneficiation processes at the Panasqueira Sn-W mine (Central Portugal) have produced large amounts of As-rich mine wastes, deposited in large tailings and open-air impoundments (Barroca Grande and Rio tailings). These are the main source of pollution in the surrounding area: once exposed to weathering, they lead to the formation of acid mine drainage (AMD) and consequently to the contamination of the surrounding environments, particularly soils. The active mine started exploration during the nineteenth century. This study aims to assess the extent of the soil pollution due to mining activities and tailing erosion by combining data on the degree of soil contamination, allowing a better understanding of the dynamics of leaching, transport, and accumulation of some potentially toxic elements in soil and their environmental relevance. Soil samples were collected around the mine, digested in aqua regia, and analyzed for 36 elements by inductively coupled plasma mass spectrometry (ICP-MS). Selected results are that (a) an association of elements such as Ag, As, Bi, Cd, Cu, W, and Zn, strongly correlated and controlled by the geochemical signature of the local sulfide mineralization, was revealed; (b) the area as a whole shows significant concentrations of As, Bi, Cd, and W linked to the exchangeable and acid-soluble bearing phases; and (c) wind promotes the mechanical dispersion of the rejected materials, from the milled waste rocks and the mineral processing plant, with subsequent deposition on soils and waters. Arsenic- and sulfide-related heavy metals (such as Cu and Cd) are associated with the fine materials that are transported in suspension by surface waters or with the acidic waters draining these sites and contaminating the local soils. Part of this fraction, especially for As, Cd, and Cu, is temporarily retained in solid phases by precipitation of soluble secondary minerals (through

  15. LeadMine: a grammar and dictionary driven approach to entity recognition

    PubMed Central

    2015-01-01

    Background Chemical entity recognition has traditionally been performed by machine learning approaches. Here we describe an approach using grammars and dictionaries. This approach has the advantage that the entities found can be directly related to a given grammar or dictionary, which allows the type of an entity to be known and, if an entity is misannotated, indicates which resource should be corrected. As recognition is driven by what is expected, if spelling errors occur, they can be corrected. Correcting such errors is highly useful when attempting to look up an entity in a database or, in the case of chemical names, converting them to structures. Results Our system uses a mixture of expertly curated grammars and dictionaries, as well as dictionaries automatically derived from public resources. We show that the heuristics developed to filter our dictionary of trivial chemical names (from PubChem) yield a better performing dictionary than the previously published Jochem dictionary. Our final system performs post-processing steps to modify the boundaries of entities and to detect abbreviations. These steps are shown to significantly improve performance (2.6% and 4.0% F1-score respectively). Our complete system, with incremental post-BioCreative workshop improvements, achieves 89.9% precision and 85.4% recall (87.6% F1-score) on the CHEMDNER test set. Conclusions Grammar and dictionary approaches can produce results at least as good as the current state of the art in machine learning approaches. While machine learning approaches are commonly thought of as "black box" systems, our approach directly links the output entities to the input dictionaries and grammars. Our approach also allows correction of errors in detected entities, which can assist with entity resolution. PMID:25810776
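
    The F1-score quoted in this abstract is the harmonic mean of precision and recall, so the three reported figures can be checked against each other. The snippet below is a minimal illustrative check, not part of LeadMine; the function name is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for the CHEMDNER test set.
precision, recall = 0.899, 0.854
print(f"F1 = {100 * f1_score(precision, recall):.1f}%")  # prints "F1 = 87.6%"
```

    The result agrees with the 87.6% F1-score stated in the abstract.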

  16. Using a watershed-centric approach to identify potentially impacted beaches

    EPA Science Inventory

    Beaches can be affected by a variety of contaminants. Of particular concern are beaches impacted by human fecal contamination and urban runoff. This poster demonstrates a methodology to identify potentially impacted beaches using Geographic Information Systems (GIS). Since h...

  17. Data Mining Approach for Evaluating Vegetation Dynamics in Earth System Models (ESMs) Using Satellite Remote Sensing Products

    NASA Astrophysics Data System (ADS)

    Shu, S.; Hoffman, F. M.; Kumar, J.; Hargrove, W. W.; Jain, A. K.

    2014-12-01

    biome types. However, Mapcurves results showed a relatively low goodness of fit score for modeled phenology projected onto observations. This study demonstrates the utility of a data mining approach for cross-validation of observations and evaluation of model performance.

  18. A Bayesian approach to identifying structural nonlinearity using free-decay response: Application to damage detection in composites

    USGS Publications Warehouse

    Nichols, J.M.; Link, W.A.; Murphy, K.D.; Olson, C.C.

    2010-01-01

    This work discusses a Bayesian approach to approximating the distribution of parameters governing nonlinear structural systems. Specifically, we use a Markov Chain Monte Carlo method for sampling the posterior parameter distributions thus producing both point and interval estimates for parameters. The method is first used to identify both linear and nonlinear parameters in a multiple degree-of-freedom structural systems using free-decay vibrations. The approach is then applied to the problem of identifying the location, size, and depth of delamination in a model composite beam. The influence of additive Gaussian noise on the response data is explored with respect to the quality of the resulting parameter estimates.

  19. A Spatio-temporal Data Mining Approach to Global scale Burned Area Monitoring

    NASA Astrophysics Data System (ADS)

    Mithal, V.; Khandelwal, A.; Nayak, G.; Kumar, V.; Nemani, R. R.; Oza, N.

    2014-12-01

    We present a novel technique for burned area mapping in forests using the Enhanced Vegetation Index (EVI) from the MODIS 16-day Level 3 1km Vegetation Indices (MOD13A2) and the Active Fire (AF) from the MODIS 8-day Level 3 1km Thermal Anomalies and Fire products (MOD14A2). The proposed method leverages the spatial and temporal co-occurrence of thermal anomalies and vegetation loss caused by forest fires to detect burned areas. Our approach derives features from the Enhanced Vegetation Index that target locations which show an abrupt change in their vegetation time series that takes at least several months to recover. One unique aspect of our approach is that it uses data from multiple months around the fire event and is therefore more robust to issues in data quality. Comparison with other burned area products shows that our approach detects several large previously undetected burned areas across multiple geographical regions. In particular, we found that our approach detects several large burned regions in the tropical forests of Indonesia and South America that had been missed by the state-of-the-art burned area approaches. For example, using our approach in Indonesia we discovered that the state-of-the-art MODIS Burned area product had missed around 20,000 sq. km. of burned area (nearly as much burned area as it has reported). We show that all these previously unreported burned areas detected by our approach are actually significant fires which suffered a large, abrupt loss in their vegetation at the time of the fire event and take at least several months to recover to their normal vegetation. To evaluate these burned areas we compared the Landsat-based composites before and after the date of the event. Our Landsat analysis shows that the burned areas detected by the proposed approach are true burns with a very small error of commission. We believe our work has the potential to provide a scalable approach to global forest monitoring as well as reduce the

  20. A Statistical Approach to Identifying Compact Objects in X-ray Binaries

    NASA Astrophysics Data System (ADS)

    Vrtilek, Saeqa D.

    2013-04-01

    A standard approach towards statistical inferences in astronomy has been the application of Principal Components Analysis (PCA) to reduce dimensionality. However, for non-linear distributions this is not always an effective approach. A non-linear technique called "diffusion maps" (Freema et al. 2009; Richard et al. 2009; Lee & Waterman 2010), a robust eigenmode-based framework, allows retention of the full "connectivity" of the data points. Through this approach we define the highly non-linear geometry of X-ray binaries in a color-color-intensity diagram in an efficient and statistically sound manner providing a broadly applicable means of distinguishing between black holes and neutron stars in Galactic X-ray binaries.

  1. Development of Novel Random Network Theory-Based Approaches to Identify Network Interactions among Nitrifying Bacteria

    SciTech Connect

    Shi, Cindy

    2015-07-17

    The interactions among different microbial populations in a community could play more important roles in determining ecosystem functioning than species numbers and their abundances, but very little is known about such network interactions at a community level. The goal of this project is to develop novel framework approaches and associated software tools to characterize the network interactions in microbial communities based on large-scale, high-throughput metagenomics data and apply these approaches to understand the impacts of environmental changes (e.g., climate change, contamination) on network interactions among different nitrifying populations and associated microbial communities.

  2. Integrative systems medicine approaches to identify molecular targets in lymphoid malignancies.

    PubMed

    Frazzi, Raffaele; Auffray, Charles; Ferrari, Angela; Filippini, Perla; Rutella, Sergio; Cesario, Alfredo

    2016-01-01

    Although survival rates for lymphoproliferative disorders are steadily increasing both in the US and in Europe, there is need for optimizing front-line therapies and developing more effective salvage strategies. Recent advances in molecular genetics have highlighted the biological diversity of lymphoproliferative disorders. In particular, integrative approaches including whole genome sequencing, whole exome sequencing, and transcriptome or RNA sequencing have been instrumental to the identification of molecular targets for treatment. Herein, we will discuss how genomic, epigenomic and proteomic approaches in lymphoproliferative disorders have supported the discovery of molecular lesions and their therapeutic targeting in the clinic. PMID:27580852

  3. Functional analysis of problem behavior: A systematic approach for identifying idiosyncratic variables.

    PubMed

    Roscoe, Eileen M; Schlichenmeyer, Kevin J; Dube, William V

    2015-01-01

    When inconclusive functional analysis (FA) outcomes occur, a number of modifications have been made to enhance the putative establishing operation or consequence associated with behavioral maintenance. However, a systematic method for identifying relevant events to test during modified FAs has not been evaluated. The purpose of this study was to develop and evaluate a technology for systematically identifying events to test in a modified FA after an initial FA led to inconclusive outcomes. Six individuals, whose initial FA showed little or no responding or high levels only in the control condition, participated. An indirect assessment (IA) questionnaire developed for identifying idiosyncratic variables was administered, and a descriptive analysis (DA) was conducted. Results from the IA only or a combination of the IA and DA were used to inform modified FA test and control conditions. Conclusive FA outcomes were obtained with 5 of the 6 participants during the modified FA phase. PMID:25930176

  4. Functional Analysis of Problem Behavior: A Systematic Approach for Identifying Idiosyncratic Variables

    PubMed Central

    Roscoe, Eileen M.; Schlichenmeyer, Kevin J.; Dube, William V.

    2015-01-01

    When inconclusive functional analysis (FA) outcomes occur, a number of modifications have been made to enhance the putative establishing operation or consequence associated with behavioral maintenance. However, a systematic method for identifying relevant events to test during modified FAs has not been evaluated. The purpose of this study was to develop and evaluate a technology for systematically identifying events to test in a modified FA after an initial FA led to inconclusive outcomes. Six individuals whose initial FA showed little or no responding or high levels only in the control condition participated. An indirect assessment (IA) questionnaire developed for identifying idiosyncratic variables was administered, and a descriptive analysis (DA) was conducted. Results from the IA only or a combination of the IA and DA were used to inform modified FA test and control conditions. Conclusive FA outcomes were obtained with five of the six participants during the modified FA phase. PMID:25930176

  5. A statistical approach to evaluate the relation of coal mining, land reclamation, and surface-water quality in Ohio

    USGS Publications Warehouse

    Hren, Janet; Wilson, K.S.; Helsel, D.R.

    1984-01-01

    Base-flow data from 779 sites in Ohio's coal region were analyzed statistically to relate land use to selected water-quality characteristics. Sites were classified into five categories: unmined (100 percent unmined land), abandoned (50 percent or more abandoned surface mines), reclaimed (50 percent or more reclaimed surface mines), deep-mined (50 percent or more underground mines), and mixed (all others). Specific conductance, pH, alkalinity, acidity, sulfate, dissolved iron, total iron, and total manganese in streams draining basins in the coal region were the eight characteristics selected for analysis. (USGS)
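
    The five-way site classification described in this abstract can be sketched as a small rule function. This is a minimal sketch, not the authors' code: the function name, the percentage arguments, and the order in which the 50-percent rules are checked (which matters only for an exact 50/50 split) are assumptions.

```python
def classify_site(unmined: float, abandoned: float,
                  reclaimed: float, deep_mined: float) -> str:
    """Assign a land-use category from percentages of basin area (0 to 100).

    Thresholds follow the abstract; the tie-breaking order is an assumption.
    """
    if unmined == 100:
        return "unmined"
    if abandoned >= 50:
        return "abandoned"
    if reclaimed >= 50:
        return "reclaimed"
    if deep_mined >= 50:
        return "deep-mined"
    return "mixed"  # all other sites


# Hypothetical basins, for illustration only.
print(classify_site(100, 0, 0, 0))    # unmined
print(classify_site(30, 60, 10, 0))   # abandoned
print(classify_site(40, 20, 20, 20))  # mixed
```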

  6. Quantitative high-throughput screening: A titration-based approach that efficiently identifies biological activities in large chemical libraries

    PubMed Central

    Inglese, James; Auld, Douglas S.; Jadhav, Ajit; Johnson, Ronald L.; Simeonov, Anton; Yasgar, Adam; Zheng, Wei; Austin, Christopher P.

    2006-01-01

    High-throughput screening (HTS) of chemical compounds to identify modulators of molecular targets is a mainstay of pharmaceutical development. Increasingly, HTS is being used to identify chemical probes of gene, pathway, and cell functions, with the ultimate goal of comprehensively delineating relationships between chemical structures and biological activities. Achieving this goal will require methodologies that efficiently generate pharmacological data from the primary screen and reliably profile the range of biological activities associated with large chemical libraries. Traditional HTS, which tests compounds at a single concentration, is not suited to this task, because HTS is burdened by frequent false positives and false negatives and requires extensive follow-up testing. We have developed a paradigm, quantitative HTS (qHTS), tested with the enzyme pyruvate kinase, to generate concentration–response curves for >60,000 compounds in a single experiment. We show that this method is precise, refractory to variations in sample preparation, and identifies compounds with a wide range of activities. Concentration–response curves were classified to rapidly identify pyruvate kinase activators and inhibitors with a variety of potencies and efficacies and elucidate structure–activity relationships directly from the primary screen. Comparison of qHTS with traditional single-concentration HTS revealed a high prevalence of false negatives in the single-point screen. This study demonstrates the feasibility of qHTS for accurately profiling every compound in large chemical libraries (>10^5 compounds). qHTS produces rich data sets that can be immediately mined for reliable biological activities, thereby providing a platform for chemical genomics and accelerating the identification of leads for drug discovery. PMID:16864780
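
    To illustrate what a concentration–response curve encodes, and why a single-concentration test can miss an active compound, the sketch below evaluates a four-parameter Hill curve. The abstract does not state which curve model was fitted, so the Hill form, the function name, and all parameter values here are assumptions chosen for illustration.

```python
def hill_response(conc: float, bottom: float, top: float,
                  ac50: float, hill_slope: float) -> float:
    """Four-parameter Hill curve (an assumed model, not taken from the paper).

    conc and ac50 share the same units; the response runs from `bottom`
    at low concentration towards `top` at high concentration.
    """
    return bottom + (top - bottom) / (1.0 + (ac50 / conc) ** hill_slope)


# Hypothetical inhibitor: activity falls from 0 to -80 (percent), AC50 = 10 uM.
concentrations_um = [0.1, 1.0, 10.0, 100.0]
for c in concentrations_um:
    r = hill_response(c, bottom=0.0, top=-80.0, ac50=10.0, hill_slope=1.0)
    print(f"{c:>6.1f} uM -> {r:6.1f}%")
```

    Under these assumed parameters a compound screened only at 1 uM would show a small response, while the full titration reveals the inhibition at higher concentrations.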

  7. Root productivity of deciduous and evergreen species identified using a molecular approach

    NASA Astrophysics Data System (ADS)

    Ellsworth, P.; Sternberg, L. O.

    2012-12-01

    The linkage between leaf traits and root structure may explain how plants integrate above and belowground traits into whole plant adaptations to environmental stresses. In dry seasonal forests, the lack of dry season precipitation dries out the relatively nutrient-rich shallow soil, leaving shallow soil water and nutrients inaccessible to uptake until the wet season. In tropical or subtropical seasonal dry forests, deciduousness may allow for the survival of shallow fine roots during the dry season. Losing leaves during the dry season reduces aboveground plant water demand, and a greater proportion of water extracted from deep soil can be used to maintain shallow roots until the wet season. Higher shallow root survival through the dry season than evergreen species means that deciduous species can take advantage of the nutrient pulse associated with the onset of the wet season. To test the above hypothesis, fine roots were collected from soil cores in a seasonally dry forest during the dry season, onset of the wet season, and the wet season and were identified to selected evergreen and deciduous study species. The fine roots of two of the selected species (Lyonia ferruginea and Carya floridana) could be identified from visual characteristics. The other three study species, which were all from the genus Quercus (Q. geminata, Q. myrtifolia, and Q. laevis), were impossible to separate visually. We developed a PCR-based restriction fragment length polymorphism (PCR-RFLP) technique, which provided a quick, simple, low-cost way to identify the species of all fine roots of our study species. We extracted DNA from all roots that were not visually identified, amplified the internal transcribed spacer region (ITS), digested the ITS region with the restriction enzyme TaqαI, and used gel electrophoresis to separate DNA fragments. Using a PCR-RFLP based root identification key that we developed for the species at Archbold Biological Station, all species that could not be

  8. Visual Data Mining: An Exploratory Approach to Analyzing Temporal Patterns of Eye Movements

    ERIC Educational Resources Information Center

    Yu, Chen; Yurovsky, Daniel; Xu, Tian

    2012-01-01

    Infant eye movements are an important behavioral resource to understand early human development and learning. But the complexity and amount of gaze data recorded from state-of-the-art eye-tracking systems also pose a challenge: how does one make sense of such dense data? Toward this goal, this article describes an interactive approach based on…

  9. A LANDSCAPE ECOLOGY APPROACH TO IDENTIFYING ECOLOGICAL VULNERABILITY IN GEOGRAPHICALLY ISOLATED WETLANDS

    EPA Science Inventory

    U.S. EPA's Office of Research and Development is using a landscape approach to assess the ecological/hydrologic functions of geographically isolated wetlands in the mid-western, southern, and western regions of the United States. Geographically isolated wetlands are considered t...

  10. Approaches to Identify Exceedances of Water Quality Thresholds Associated with Ocean Conditions

    EPA Science Inventory

    WED scientists have developed a method to help distinguish whether failures to meet water quality criteria are associated with natural coastal upwelling by using the statistical approach of logistic regression. Estuaries along the west coast of the United States periodically ha...

  11. An approach to identify time consistent model parameters: sub-period calibration

    NASA Astrophysics Data System (ADS)

    Gharari, S.; Hrachowitz, M.; Fenicia, F.; Savenije, H. H. G.

    2013-01-01

    Conceptual hydrological models rely on calibration for the identification of their parameters. As these models are typically designed to reflect real catchment processes, a key objective of an appropriate calibration strategy is the determination of parameter sets that reflect a "realistic" model behavior. Previous studies have shown that parameter estimates for different calibration periods can be significantly different. This questions model transposability in time, which is one of the key conditions for the set-up of a "realistic" model. This paper presents a new approach that selects parameter sets that provide a consistent model performance in time. The approach consists of testing model performance in different periods, and selecting parameter sets that are as close as possible to the optimum of each individual sub-period. While aiding model calibration, the approach is also useful as a diagnostic tool, illustrating tradeoffs in the identification of time-consistent parameter sets. The approach is applied to a case study in Luxembourg using the HyMod hydrological model as an example.
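
    The selection step described in this abstract (choosing parameter sets that stay close to the optimum of every sub-period) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the score table is hypothetical, higher scores are assumed to be better, and Euclidean distance to the per-period optima is an assumed way of measuring "as close as possible".

```python
import math


def select_time_consistent(scores: dict[str, list[float]]) -> str:
    """Return the parameter set closest to the best score of each sub-period.

    scores maps a parameter-set label to its performance in each sub-period
    (higher is better). Distance is Euclidean, which is an assumption.
    """
    n_periods = len(next(iter(scores.values())))
    # Best performance achieved by any parameter set in each sub-period.
    best = [max(s[p] for s in scores.values()) for p in range(n_periods)]

    def distance(s: list[float]) -> float:
        return math.sqrt(sum((b - v) ** 2 for b, v in zip(best, s)))

    return min(scores, key=lambda name: distance(scores[name]))


# Hypothetical efficiency scores for three parameter sets over two sub-periods.
scores = {
    "A": [0.90, 0.40],  # best in period 1, poor in period 2
    "B": [0.50, 0.88],  # best in period 2, poor in period 1
    "C": [0.80, 0.80],  # near-optimal in both
}
print(select_time_consistent(scores))  # prints "C"
```

    Sets A and B each win one sub-period, but C is selected because it gives up the least across both, which is the tradeoff the abstract refers to.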

  12. Identifying Country-Specific Cultures of Physics Education: A Differential Item Functioning Approach

    ERIC Educational Resources Information Center

    Mesic, Vanes

    2012-01-01

    In international large-scale assessments of educational outcomes, student achievement is often represented by unidimensional constructs. This approach allows for drawing general conclusions about country rankings with respect to the given achievement measure, but it typically does not provide specific diagnostic information which is necessary for…

  13. The Natural Selection: Identifying Student Misconceptions through an Inquiry-Based, Critical Approach to Evolution

    ERIC Educational Resources Information Center

    Robbins, Jennifer R.; Roy, Pamela

    2007-01-01

    We invited 141 non-science major undergraduates to share and then challenge their preconceptions about evolution in a four-lesson inquiry lab unit that integrated diverse topics with rigorous assessment. Our experience suggests that an inquiring approach to evolutionary theory can be highly persuasive.

  14. A Polar Coordinate Approach to Identify and Remove Higher Mode Rayleigh Waves

    NASA Astrophysics Data System (ADS)

    Gribler, G.; Liberty, L. M.; Michaels, P.; Mikesell, T. D.

    2015-12-01

    We present an approach to isolate and separate higher mode Rayleigh wave signals using active source multicomponent seismic data. Our approach allows for improved subsurface shear wave velocity estimates compared to established single component, multi-channel (MASW) methods. We show that the phase velocity vs. frequency relationship of the fundamental Rayleigh wave mode can become contaminated when higher mode Rayleigh waves interfere with the fundamental mode dispersion. Under many geological models, we observe higher mode contamination and these higher velocity modes can lead to low relative coherence along the fundamental mode dispersion path or an overestimation of shear wave velocities with depth. For a typical range of frequencies utilized in active source surface wave analysis (5-100 Hz), the fundamental mode propagates in retrograde motion at the surface. For many earth models, higher mode Rayleigh waves can propagate in prograde motion. By utilizing vertical and horizontal inline seismic components, we can measure particle motion direction and selectively remove the prograde higher mode Rayleigh wave signals via our polar mute approach. We show with numerical models and field results that by removing these higher modes, we can better isolate the fundamental Rayleigh wave dispersion to improve our confidence of shear wave velocity estimates with depth compared to a single channel approach.

  15. Mining Students' Learning Patterns and Performance in Web-Based Instruction: A Cognitive Style Approach

    ERIC Educational Resources Information Center

    Chen, Sherry Y.; Liu, Xiaohui

    2011-01-01

    Personalization has been widely used in Web-based instruction (WBI). To deliver effective personalization, there is a need to understand different preferences of each student. Cognitive style has been identified as one of the most pertinent factors that affect students' learning preferences. Therefore, it is essential to investigate how learners…

  16. A Data Mining Approach to Improve Re-Accessibility and Delivery of Learning Knowledge Objects

    ERIC Educational Resources Information Center

    Sabitha, Sai; Mehrotra, Deepti; Bansal, Abhay

    2014-01-01

    Today Learning Management Systems (LMS) have become an integral part of learning mechanism of both learning institutes and industry. A Learning Object (LO) can be one of the atomic components of LMS. A large amount of research is conducted into identifying benchmarks for creating Learning Objects. Some of the major concerns associated with LO are…

  17. Differentially Private Frequent Subgraph Mining

    PubMed Central

    Xu, Shengzhi; Xiong, Li; Cheng, Xiang; Xiao, Ke

    2016-01-01

    Mining frequent subgraphs from a collection of input graphs is an important topic in data mining research. However, if the input graphs contain sensitive information, releasing frequent subgraphs may pose considerable threats to individuals' privacy. In this paper, we study the problem of frequent subgraph mining (FGM) under the rigorous differential privacy model. We introduce a novel differentially private FGM algorithm, which is referred to as DFG. In this algorithm, we first privately identify frequent subgraphs from input graphs, and then compute the noisy support of each identified frequent subgraph. In particular, to privately identify frequent subgraphs, we present a frequent subgraph identification approach which can improve the utility of frequent subgraph identifications through candidate pruning. Moreover, to compute the noisy support of each identified frequent subgraph, we devise a lattice-based noisy support derivation approach, where a series of methods has been proposed to improve the accuracy of the noisy supports. Through formal privacy analysis, we prove that our DFG algorithm satisfies ε-differential privacy. Extensive experimental results on real datasets show that the DFG algorithm can privately find frequent subgraphs with high data utility.

  18. Constructing New Theory for Identifying Students with Emotional Disturbance: A Grounded Theory Approach

    ERIC Educational Resources Information Center

    Barnett, Dori A.

    2010-01-01

    The problem area explored by this study is the identification of students with emotional and behavioral difficulties for special education supports and services under the criteria for emotional disturbance (ED). A review of the literature indicated that the problem of identifying students with ED was compounded by subjectivity and ambiguity…

  19. A NEW APPROACH TO IDENTIFYING THE MOST POWERFUL GRAVITATIONAL LENSING TELESCOPES

    SciTech Connect

    Wong, Kenneth C.; Zabludoff, Ann I.; Ammons, S. Mark; Keeton, Charles R.; Hogg, David W.; Gonzalez, Anthony H.

    2013-05-20

    The best gravitational lenses for detecting distant galaxies are those with the largest mass concentrations and the most advantageous configurations of that mass along the line of sight. Our new method for finding such gravitational telescopes uses optical data to identify projected concentrations of luminous red galaxies (LRGs). LRGs are biased tracers of the underlying mass distribution, so lines of sight with the highest total luminosity in LRGs are likely to contain the largest total mass. We apply this selection technique to the Sloan Digital Sky Survey and identify the 200 fields with the highest total LRG luminosities projected within a 3.5 arcmin radius over the redshift range 0.1 ≤ z ≤ 0.7. The redshift and angular distributions of LRGs in these fields trace the concentrations of non-LRG galaxies. These fields are diverse; 22.5% contain one known galaxy cluster and 56.0% contain multiple known clusters previously identified in the literature. Thus, our results confirm that these LRGs trace massive structures and that our selection technique identifies fields with large total masses. These fields contain two to three times higher total LRG luminosities than most known strong-lensing clusters and will be among the best gravitational lensing fields for the purpose of detecting the highest redshift galaxies.

  20. Enhanced approaches for identifying Amadori products: application to peanut allergens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The dry roasting of peanuts is suggested to influence allergenic sensitization due to formation of advanced glycation end products (AGE) on peanut proteins. Identifying AGEs is technically challenging. The AGE composition of peanut proteins was probed with nanoLC-ESI-MS and MS/MS analyses. Amadori ...

  1. Candidate fire blight resistance genes in Malus identified with the use of genomic tools and approaches

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The goal of this research is to utilize current advances in Rosaceae genomics to identify DNA markers for use in marker-assisted selection of durable resistance to fire blight. Candidate fire blight resistance genes were selected and ranked based upon differential expression after inoculation with ...

  2. An Integrative Pharmacogenomic Approach Identifies Two-drug Combination Therapies for Personalized Cancer Medicine

    PubMed Central

    Liu, Yin; Fei, Teng; Zheng, Xiaoqi; Brown, Myles; Zhang, Peng; Liu, X. Shirley; Wang, Haiyun

    2016-01-01

    An individual tumor harbors multiple molecular alterations that promote cell proliferation and prevent apoptosis and differentiation. Drugs that target specific molecular alterations have been introduced into personalized cancer medicine, but their effects can be modulated by the activities of other genes or molecules. Previous studies aiming to identify multiple molecular alterations for combination therapies are limited by available data. Given the recent large scale of available pharmacogenomic data, it is possible to systematically identify multiple biomarkers that contribute jointly to drug sensitivity, and to identify combination therapies for personalized cancer medicine. In this study, we used pharmacogenomic profiling data provided from two independent cohorts in a systematic in silico investigation of perturbed genes cooperatively associated with drug sensitivity. Our study predicted many pairs of molecular biomarkers that may benefit from the use of combination therapies. One of our predicted biomarker pairs, a mutation in the BRAF gene and upregulated expression of the PIM1 gene, was experimentally validated to benefit from a therapy combining BRAF inhibitor and PIM1 inhibitor in lung cancer. This study demonstrates how pharmacogenomic data can be used to systematically identify potentially cooperative genes and provide novel insights to combination therapies in personalized cancer medicine. PMID:26916442

  3. A Statistical Approach to Identifying Schools Demonstrating Substantial Improvement in Student Learning

    ERIC Educational Resources Information Center

    Meyers, Coby; Lindsay, Jim; Condon, Chris; Wan, Yinmei

    2012-01-01

    The rising tide behind the school turnaround movement is significant, as national education leaders continue to call for the rapid improvement of the nation's lowest-performing schools. To date, little work has been done to identify schools that are drastically improving their performance. Using publicly available school-level student…

  4. Identifying Subtypes of Peer Status by Combining Popularity and Preference: A Cohort-Sequential Approach

    ERIC Educational Resources Information Center

    van den Berg, Yvonne H. M.; Burk, William J.; Cillessen, Antonius H. N.

    2015-01-01

    The purpose of this study was to identify and validate subtypes of peer status by integrating preference and popularity into a single framework. Person-oriented analyses were performed among 3,630 children and adolescents of different cohorts in primary and secondary education. In the young age groups (Grade 3/4 to Grade 7), three clusters were…

  5. An Integrative Pharmacogenomic Approach Identifies Two-drug Combination Therapies for Personalized Cancer Medicine.

    PubMed

    Liu, Yin; Fei, Teng; Zheng, Xiaoqi; Brown, Myles; Zhang, Peng; Liu, X Shirley; Wang, Haiyun

    2016-01-01

    An individual tumor harbors multiple molecular alterations that promote cell proliferation and prevent apoptosis and differentiation. Drugs that target specific molecular alterations have been introduced into personalized cancer medicine, but their effects can be modulated by the activities of other genes or molecules. Previous studies aiming to identify multiple molecular alterations for combination therapies are limited by available data. Given the recent large scale of available pharmacogenomic data, it is possible to systematically identify multiple biomarkers that contribute jointly to drug sensitivity, and to identify combination therapies for personalized cancer medicine. In this study, we used pharmacogenomic profiling data provided from two independent cohorts in a systematic in silico investigation of perturbed genes cooperatively associated with drug sensitivity. Our study predicted many pairs of molecular biomarkers that may benefit from the use of combination therapies. One of our predicted biomarker pairs, a mutation in the BRAF gene and upregulated expression of the PIM1 gene, was experimentally validated to benefit from a therapy combining BRAF inhibitor and PIM1 inhibitor in lung cancer. This study demonstrates how pharmacogenomic data can be used to systematically identify potentially cooperative genes and provide novel insights to combination therapies in personalized cancer medicine. PMID:26916442

  6. Searching for an Alternate Way To Identify Young Creative Minds: A Classroom-Based Observation Approach.

    ERIC Educational Resources Information Center

    Han, Ki-Soon; Marvin, Chris; Walden, Ann

    2003-01-01

    In a sample of 45 kindergarten children, significant, but relatively weak correlations were found between the Nebraska Starry Night Observation Protocol (NSNO) and the originality and elaboration scores of the concurrent measure, the Torrance Tests of Creative Thinking. Implications of using the NSNO to identify young creative children are…

  7. Comparison of Two Approaches for Identifying Reinforcers in Teaching Figure Coloring to Students with Down Syndrome

    ERIC Educational Resources Information Center

    Erbas, Dilek; Ozen, Arzu; Acar, Cimen

    2004-01-01

    The purpose of the present study was to extend previous research on reinforcer assessment by comparing effectiveness of stimuli identified by two preference procedures on teaching figure coloring to three children with disabilities in Turkey; and to find out what special education teachers think about social validation of the two preference…

  8. CONCEPTUAL APPROACHES TO IDENTIFY AND ASSESS MULTIPLE STRESSORS, SECTION 1.1

    EPA Science Inventory

    Every ecosystem is subject to multiple stressors arising from the interactions of biological, physical, and socioeconomic processes (e.g. exploitation and development). These stressors and their interactions need to be identified if risks associated with a planned activity are to...

  9. Identifying whole grain foods: a comparison of different approaches for selecting more healthful whole grain products

    PubMed Central

    Mozaffarian, Rebecca S; Lee, Rebekka M; Kennedy, Mary A; Ludwig, David S; Mozaffarian, Dariush; Gortmaker, Steven L

    2015-01-01

    Objective Eating whole grains (WG) is recommended for health, but multiple conflicting definitions exist for identifying whole grain (WG) products, limiting the ability of consumers and organizations to select such products. We investigated how five recommended WG criteria relate to healthfulness and price of grain products. Design We categorized grain products by different WG criteria including: the industry-sponsored Whole Grain stamp (WG-Stamp); WG as the first ingredient (WG-first); WG as the first ingredient without added sugars (WG-first-no-added-sugars); the word ‘whole’ before any grain in the ingredients (‘whole’-anywhere); and a content of total carbohydrate to fibre of ≤10:1 (10:1-ratio). We investigated associations of each criterion with health-related characteristics including fibre, sugars, sodium, energy, trans-fats and price. Setting Two major grocery store chains. Subjects Five hundred and forty-five grain products. Results Each WG criterion identified products with higher fibre than products considered non-WG; the 10:1-ratio exhibited the largest differences (+3.15 g/serving, P<0.0001). Products achieving the 10:1-ratio also contained lower sugar (−1.28 g/serving, P=0.01), sodium (−15.4 mg/serving, P=0.04) and likelihood of trans-fats (OR=0.14, P<0.0001), without energy differences. WG-first-no-added-sugars performed similarly, but identified many fewer products as WG and also not a lower likelihood of containing trans-fats. The WG-Stamp, WG-first and ‘whole’-anywhere criteria identified products with a lower likelihood of trans-fats, but also significantly more sugars and energy (P<0.05 each). Products meeting the WG-Stamp or 10:1-ratio criterion were more expensive than products that did not (+$US 0.04/serving, P=0.009 and +$US 0.05/serving, P=0.003, respectively). Conclusions Among proposed WG criteria, the 10:1-ratio identified the most healthful WG products. Other criteria performed less well, including the industry

  10. Geologic considerations in underground coal mining system design

    SciTech Connect

    Camilli, F.A.; Maynard, D.P.; Mangolds, A.; Harris, J.

    1981-10-01

    Geologic characteristics of coal resources which may impact new extraction technologies are identified and described to aid system designers and planners in their task of designing advanced coal extraction systems for the central Appalachian region. These geologic conditions are then organized into a matrix identified as the baseline mine concept. A sample region, eastern Kentucky, is next analyzed, using both the new baseline mine concept and a traditional geologic investigative approach. The baseline mine concept presented is intended as a framework, providing a consistent basis for further analyses to be subsequently conducted in other geographic regions. The baseline mine concept is intended as a tool to give system designers a more realistic feel of the mine environment and will hopefully lead to acceptable alternatives for advanced coal extraction systems.

  11. Mine ventilation and air conditioning. 3. edition

    SciTech Connect

    Hartman, H.L.; Mutmansky, J.M.; Ramani, R.V.; Wang, Y.J.

    1998-12-31

    This revised edition presents an engineering design approach to ventilation and air conditioning as part of the comprehensive environmental control of the mine atmosphere. It provides an in-depth look, for practitioners who design and operate mines, into the health and safety aspects of environmental conditions in the underground workplace. The contents include: Environmental control of the mine atmosphere; Properties and behavior of air; Mine air-quality control; Mine gases; Dusts and other mine aerosols; Mine ventilation; Airflow through mine openings and ducts; Mine ventilation circuits and networks; Natural ventilation; Fan application to mines; Auxiliary ventilation and controlled recirculation; Economics of airflow; Control of mine fires and explosions; Mine air conditioning; Heat sources and effect in mines; Mine air conditioning systems; Appendices; References; Answers to selected problems; and Index.

  12. Heat–Health Warning Systems: A Comparison of the Predictive Capacity of Different Approaches to Identifying Dangerously Hot Days

    PubMed Central

    Sheridan, Scott C.; Allen, Michael J.; Pascal, Mathilde; Laaidi, Karine; Yagouti, Abderrahmane; Bickis, Ugis; Tobias, Aurelio; Bourque, Denis; Armstrong, Ben G.; Kosatsky, Tom

    2010-01-01

    Objectives. We compared the ability of several heat–health warning systems to predict days of heat-associated mortality using common data sets. Methods. Heat–health warning systems initiate emergency public health interventions once forecasts have identified weather conditions to breach predetermined trigger levels. We examined 4 commonly used trigger-setting approaches: (1) synoptic classification, (2) epidemiologic assessment of the temperature–mortality relationship, (3) temperature–humidity index, and (4) physiologic classification. We applied each approach in Chicago, Illinois; London, United Kingdom; Madrid, Spain; and Montreal, Canada, to identify days expected to be associated with the highest heat-related mortality. Results. We found little agreement across the approaches in which days were identified as most dangerous. In general, days identified by temperature–mortality assessment were associated with the highest excess mortality. Conclusions. Triggering of alert days and ultimately the initiation of emergency responses by a heat–health warning system varies significantly across approaches adopted to establish triggers. PMID:20395585

  13. New approaches and omics tools for mining of vaccine candidates against vector-borne diseases.

    PubMed

    Kuleš, Josipa; Horvatić, Anita; Guillemin, Nicolas; Galan, Asier; Mrljak, Vladimir; Bhide, Mangesh

    2016-08-16

    Vector-borne diseases (VBDs) present a major threat to human and animal health, as well as place a substantial burden on livestock production. As a way of sustainable VBD control, focus is set on vaccine development. Advances in genomics and other "omics" over the past two decades have given rise to a "third generation" of vaccines based on technologies such as reverse vaccinology, functional genomics, immunomics, structural vaccinology and the systems biology approach. The application of omics approaches is shortening the time required to develop the vaccines and increasing the probability of discovery of potential vaccine candidates. Herein, we review the development of new generation vaccines for VBDs, and discuss technological advancement and overall challenges in the vaccine development pipeline. Special emphasis is placed on the development of anti-tick vaccines that can quell both vectors and pathogens. PMID:27384976

  14. Prediction of Solar Radiation on Building Rooftops: A Data-Mining Approach

    SciTech Connect

    Omitaomu, Olufemi A; Bhaduri, Budhendra L; Kodysh, Jeffrey B

    2012-01-01

    Solar energy technologies offer a clean, renewable, and domestic energy source, and are essential components of a sustainable energy future. The accurate measurement of solar radiation data is essential for optimum site selection of future distributed solar power plants as well as sizing photovoltaic systems. However, solar radiation data are not readily available because measured sequences of radiation values are obtained for only a few locations in a country. When the data are available, they are usually at different time periods and spatial scales. The availability of solar radiation data at hourly or daily time scale will enhance the integration of solar energy into electricity generation and promote a sustainable energy future. The ability to generate approximate solar radiation values is often the only practical way to obtain radiation data at hourly or daily time scale. As a result, several models have been developed for estimating solar radiation values based on analytical, numerical simulation, and statistical approaches. However, these models have inherent challenges. We will discuss some of those challenges in this paper. To enhance the prediction of solar radiation values, a novel approach is presented for estimating solar radiation values using a support vector machine technique. The approach accounts for unique characteristics that influence solar radiation values. The preliminary results obtained offer useful insights for model enhancements.

  15. Microarray data mining: A novel optimization-based approach to uncover biologically coherent structures

    PubMed Central

    Tan, Meng P; Smith, Erin N; Broach, James R; Floudas, Christodoulos A

    2008-01-01

    Background DNA microarray technology allows for the measurement of genome-wide expression patterns. Within the resultant mass of data lies the problem of analyzing and presenting information on this genomic scale, and a first step towards the rapid and comprehensive interpretation of this data is gene clustering with respect to the expression patterns. Classifying genes into clusters can lead to interesting biological insights. In this study, we describe an iterative clustering approach to uncover biologically coherent structures from DNA microarray data based on a novel clustering algorithm EP_GOS_Clust. Results We apply our proposed iterative algorithm to three sets of experimental DNA microarray data from experiments with the yeast Saccharomyces cerevisiae and show that the proposed iterative approach improves biological coherence. Comparison with other clustering techniques suggests that our iterative algorithm provides superior performance with regard to biological coherence. An important consequence of our approach is that an increasing proportion of genes find membership in clusters of high biological coherence and that the average cluster specificity improves. Conclusion The results from these clustering experiments provide a robust basis for extracting motifs and trans-acting factors that determine particular patterns of expression. In addition, the biological coherence of the clusters is iteratively assessed independently of the clustering. Thus, this method will not be severely impacted by functional annotations that are missing, inaccurate, or sparse. PMID:18538024

  16. Identifying active foraminifera in the Sea of Japan using metatranscriptomic approach

    NASA Astrophysics Data System (ADS)

    Lejzerowicz, Franck; Voltsky, Ivan; Pawlowski, Jan

    2013-02-01

    Metagenetics represents an efficient and rapid tool to describe environmental diversity patterns of microbial eukaryotes based on ribosomal DNA sequences. However, the results of metagenetic studies are often biased by the presence of extracellular DNA molecules that are persistent in the environment, especially in deep-sea sediment. As an alternative, short-lived RNA molecules constitute a good proxy for the detection of active species. Here, we used a metatranscriptomic approach based on RNA-derived (cDNA) sequences to study the diversity of the deep-sea benthic foraminifera and compared it to the metagenetic approach. We analyzed 257 ribosomal DNA and cDNA sequences obtained from seven sediment samples collected in the Sea of Japan at depths ranging from 486 to 3665 m. The DNA and RNA-based approaches gave a similar view of the taxonomic composition of the foraminiferal assemblage, but differed in some important points. First, the cDNA dataset was dominated by sequences of rotaliids and robertiniids, suggesting that these calcareous species, some of which have been observed in Rose Bengal stained samples, are the most active component of the foraminiferal community. Second, the richness of monothalamous (single-chambered) foraminifera was particularly high in DNA extracts from the deepest samples, confirming that this group of foraminifera is abundant but not necessarily very active in the deep-sea sediments. Finally, the high divergence of undetermined sequences in the cDNA dataset indicates the limits of our database and a lack of knowledge about some active but possibly rare species. Our study demonstrates the capability of the metatranscriptomic approach to detect active foraminiferal species and prompts its use in future high-throughput sequencing-based environmental surveys.

  17. A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities.

    PubMed

    Fisher, W; Piazza, C C; Bowman, L G; Hagopian, L P; Owens, J C; Slevin, I

    1992-01-01

    The development of effective training programs for persons with profound mental retardation remains one of the greatest challenges for behavior analysts working in the field of developmental disabilities. One significant advancement for this population has been the reinforcer assessment procedure developed by Pace, Ivancic, Edwards, Iwata, and Page (1985), which involves repeatedly presenting a variety of stimuli to the client and then measuring approach behaviors to differentiate preferred from nonpreferred stimuli. One potential limitation of this procedure is that some clients consistently approach most or all of the stimuli on each presentation, making it difficult to differentiate among these stimuli. In this study, we used a concurrent operants paradigm to compare the Pace et al. (1985) procedure with a modified procedure wherein clients were presented with two stimuli simultaneously and were given access only to the first stimulus approached. The results revealed that this forced-choice stimulus preference assessment resulted in greater differentiation among stimuli and better predicted which stimuli would result in higher levels of responding when presented contingently in a concurrent operants paradigm. PMID:1634435

  18. Identifying risk of hospital readmission among Medicare aged patients: an approach using routinely collected data.

    PubMed

    Navarro, Adria E; Enguídanos, Susan; Wilber, Kathleen H

    2012-01-01

    Readmission provisions in the Patient Protection and Affordable Care Act of March 2010 have created urgent fiscal accountability requirements for hospitals, dependent upon a better understanding of their specific populations, along with development of mechanisms to easily identify these at-risk patients. Readmissions are disruptive and costly to both patients and the health care system. Effectively addressing hospital readmissions among Medicare aged patients offers promising targets for resources aimed at improved quality of care for older patients. Routinely collected data, accessible via electronic medical records, were examined using logistic models of sociodemographic, clinical, and utilization factors to identify predictors among patients who required rehospitalization within 30 days. Specific comorbidities and discharge care orders in this urban, nonprofit hospital had significantly greater odds of predicting a Medicare aged patient's risk of readmission within 30 days. PMID:22656916

  19. Data-mining approaches reveal hidden families of proteases in the genome of malaria parasite.

    PubMed

    Wu, Yimin; Wang, Xiangyun; Liu, Xia; Wang, Yufeng

    2003-04-01

    The search for novel antimalarial drug targets is urgent due to the growing resistance of Plasmodium falciparum parasites to available drugs. Proteases are attractive antimalarial targets because of their indispensable roles in parasite infection and development, especially in the processes of host erythrocyte rupture/invasion and hemoglobin degradation. However, to date, only a small number of proteases have been identified and characterized in Plasmodium species. Using an extensive sequence similarity search, we have identified 92 putative proteases in the P. falciparum genome. A set of putative proteases including calpain, metacaspase, and signal peptidase I have been implicated to be central mediators for essential parasitic activity and distantly related to the vertebrate host. Moreover, of the 92, at least 88 have been demonstrated to code for gene products at the transcriptional levels, based upon the microarray and RT-PCR results, and the publicly available microarray and proteomics data. The present study represents an initial effort to identify a set of expressed, active, and essential proteases as targets for inhibitor-based drug design. PMID:12671001

  20. Knowledge-Assisted Approach to Identify Pathways with Differential Dependencies | Office of Cancer Genomics

    Cancer.gov

    We have previously developed a statistical method to identify gene sets enriched with condition-specific genetic dependencies. The method constructs gene dependency networks from bootstrapped samples in one condition and computes the divergence between distributions of network likelihood scores from different conditions. It was shown to be capable of sensitive and specific identification of pathways with phenotype-specific dysregulation, i.e., rewiring of dependencies between genes in different conditions.

  1. MAS C-Terminal Tail Interacting Proteins Identified by Mass Spectrometry-Based Proteomic Approach

    PubMed Central

    Tirupula, Kalyan C.; Zhang, Dongmei; Osbourne, Appledene; Chatterjee, Arunachal; Desnoyer, Russ; Willard, Belinda; Karnik, Sadashiva S.

    2015-01-01

    Propagation of signals from G protein-coupled receptors (GPCRs) in cells is primarily mediated by protein-protein interactions. MAS is a GPCR that was initially discovered as an oncogene and is now known to play an important role in cardiovascular physiology. Current literature suggests that MAS interacts with common heterotrimeric G-proteins, but MAS interaction with proteins which might mediate G protein-independent or atypical signaling is unknown. In this study we hypothesized that MAS C-terminal tail (Ct) is a major determinant of receptor-scaffold protein interactions mediating MAS signaling. Mass-spectrometry based proteomic analysis was used to comprehensively identify the proteins that interact with MAS Ct comprising the PDZ-binding motif (PDZ-BM). We identified both PDZ and non-PDZ proteins from human embryonic kidney cell line, mouse atrial cardiomyocyte cell line and human heart tissue to interact specifically with MAS Ct. For the first time our study provides a panel of PDZ and other proteins that potentially interact with MAS with high significance. A ‘cardiac-specific finger print’ of MAS interacting PDZ proteins was identified which includes DLG1, MAGI1 and SNTA. Cell based experiments with wild-type and mutant MAS lacking the PDZ-BM validated MAS interaction with PDZ proteins DLG1 and TJP2. Bioinformatics analysis suggested well-known multi-protein scaffold complexes involved in nitric oxide signaling (NOS), cell-cell signaling of neuromuscular junctions, synapses and epithelial cells. Majority of these protein hits were predicted to be part of disease categories comprising cancers and malignant tumors. We propose a ‘MAS-signalosome’ model to stimulate further research in understanding the molecular mechanism of MAS function. Identifying hierarchy of interactions of ‘signalosome’ components with MAS will be a necessary step in future to fully understand the physiological and pathological functions of this enigmatic receptor. PMID

  2. Proceedings: Fourth Workshop on Mining Scientific Datasets

    SciTech Connect

    Kamath, C

    2001-07-24

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the KDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/.
This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is

  3. Identifying Vortex-Core-Line using a tetrahedral satellite configuration: Field Topology Approach

    NASA Astrophysics Data System (ADS)

    Jiang, Yao; Lembege, Bertrand; Nishikawa, Ken-ichi; Cai, DongSheng; Hasegawa, Hiroshi

    2016-04-01

    Identifying vortices is the key to understanding the turbulence in plasma shear layers. Here, the term 'vortex' or 'vortex core' is associated with a region of Galilean invariance [Jeong and Hussain, 1995]. Unfortunately, no single precise definition of a vortex is currently universally accepted, despite the fact that many space plasma authors claim that many observations have detected "vortices" (such as Kelvin-Helmholtz vortices at or around the magnetopause). Using velocity data from the four satellites, we expand the velocity field in a Taylor series around the satellites, calculate its first-order tensor, and linearly approximate the field. We can then identify the vortex structures by using various vortex identification criteria, as follows: (i) the first criterion is the Q-criterion, which defines vortices as regions in which the vorticity energy prevails over other energies; (ii) the second criterion is the lambda2-criterion, which is related to the negative of the Hessian matrix of the pressure-related term; and (iii) the third criterion requires the existence of vortex-core-lines that are Galilean invariant inside the four-satellite tetrahedral region. Using these methods, we can identify and analyze the 3D vortex more precisely using a tetrahedral satellite configuration.
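
    The linear approximation and the Q-criterion described in this record can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (velocity_gradient, q_criterion) and the sample positions are hypothetical, and Q is assumed to take the commonly used form Q = 0.5 (|Omega|^2 - |S|^2), where S and Omega are the symmetric and antisymmetric parts of the velocity gradient tensor and Q > 0 flags a vortex.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve a x = b for a 3x3 matrix a using Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("satellites are coplanar; gradient is undetermined")
    x = []
    for col in range(3):
        # Replace column `col` of a with the right-hand side b.
        m = [[b[i] if j == col else a[i][j] for j in range(3)] for i in range(3)]
        x.append(det3(m) / d)
    return x


def velocity_gradient(positions, velocities):
    """Return G[j][k] = d v_j / d x_k from four (position, velocity) samples.

    Assumes the field is linear inside the tetrahedron, so that
    v_i - v_0 = G (r_i - r_0) for satellites i = 1, 2, 3.
    """
    r0, v0 = positions[0], velocities[0]
    dr = [[p[k] - r0[k] for k in range(3)] for p in positions[1:]]
    grad = []
    for j in range(3):
        dv_j = [v[j] - v0[j] for v in velocities[1:]]
        grad.append(solve3(dr, dv_j))  # row j is the gradient of component v_j
    return grad


def q_criterion(grad):
    """Q = 0.5 * (|Omega|^2 - |S|^2) using Frobenius norms."""
    q = 0.0
    for j in range(3):
        for k in range(3):
            s = 0.5 * (grad[j][k] + grad[k][j])  # strain-rate part
            w = 0.5 * (grad[j][k] - grad[k][j])  # rotation part
            q += 0.5 * (w * w - s * s)
    return q


if __name__ == "__main__":
    # Hypothetical satellite positions forming a non-degenerate tetrahedron.
    sats = [(1.0, 1.0, 1.0), (2.0, 1.0, 1.0), (2.0, 3.0, 1.0), (1.0, 2.0, 4.0)]
    # Synthetic rigid rotation about the z axis: v = (-y, x, 0).
    vels = [(-y, x, 0.0) for x, y, z in sats]
    print(q_criterion(velocity_gradient(sats, vels)))  # positive: rotation dominates
```

    With real measurements the linear fit is only as good as the assumption that the field varies linearly across the tetrahedron, so the sign of Q near zero should be read with caution.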

  4. Identifying Predictors, Moderators, and Mediators of Antidepressant Response in Major Depressive Disorder: Neuroimaging Approaches

    PubMed Central

    Phillips, Mary L.; Chase, Henry W.; Sheline, Yvette I.; Etkin, Amit; Almeida, Jorge R.C.; Deckersbach, Thilo; Trivedi, Madhukar H.

    2015-01-01

    Objective Despite significant advances in neuroscience and treatment development, no widely accepted biomarkers are available to inform diagnostics or identify preferred treatments for individuals with major depressive disorder. Method In this critical review, the authors examine the extent to which multimodal neuroimaging techniques can identify biomarkers reflecting key pathophysiologic processes in depression and whether such biomarkers may act as predictors, moderators, and mediators of treatment response that might facilitate development of personalized treatments based on a better understanding of these processes. Results The authors first highlight the most consistent findings from neuroimaging studies using different techniques in depression, including structural and functional abnormalities in two parallel neural circuits: serotonergically modulated implicit emotion regulation circuitry, centered on the amygdala and different regions in the medial prefrontal cortex; and dopaminergically modulated reward neural circuitry, centered on the ventral striatum and medial prefrontal cortex. They then describe key findings from the relatively small number of studies indicating that specific measures of regional function and, to a lesser extent, structure in these neural circuits predict treatment response in depression. Conclusions Limitations of existing studies include small sample sizes, use of only one neuroimaging modality, and a focus on identifying predictors rather than moderators and mediators of differential treatment response. By addressing these limitations and, most importantly, capitalizing on the benefits of multimodal neuroimaging, future studies can yield moderators and mediators of treatment response in depression to facilitate significant improvements in shorter- and longer-term clinical and functional outcomes. PMID:25640931

  5. Sequencing-based approach identified three new susceptibility loci for psoriasis.

    PubMed

    Sheng, Yujun; Jin, Xin; Xu, Jinhua; Gao, Jinping; Du, Xiaoqing; Duan, Dawei; Li, Bing; Zhao, Jinhua; Zhan, Wenying; Tang, Huayang; Tang, Xianfa; Li, Yang; Cheng, Hui; Zuo, Xianbo; Mei, Junpu; Zhou, Fusheng; Liang, Bo; Chen, Gang; Shen, Changbing; Cui, Hongzhou; Zhang, Xiaoguang; Zhang, Change; Wang, Wenjun; Zheng, Xiaodong; Fan, Xing; Wang, Zaixing; Xiao, Fengli; Cui, Yong; Li, Yingrui; Wang, Jun; Yang, Sen; Xu, Lei; Sun, Liangdan; Zhang, Xuejun

    2014-01-01

    In a previous large-scale exome sequencing analysis for psoriasis, we discovered seven common and low-frequency missense variants within six genes with genome-wide significance. Here we describe an in-depth analysis of noncoding variants based on sequencing data (10,727 cases and 10,582 controls) with replication in an independent cohort of Han Chinese individuals consisting of 4,480 cases and 6,521 controls to identify additional psoriasis susceptibility loci. We confirmed four known psoriasis susceptibility loci (IL12B, IFIH1, ERAP1 and RNF114; 2.30 × 10(-20)≤P≤2.41 × 10(-7)) and identified three new susceptibility loci: 4q24 (NFKB1) at rs1020760 (P=2.19 × 10(-8)), 12p13.3 (CD27-LAG3) at rs758739 (P=4.08 × 10(-8)) and 17q12 (IKZF3) at rs10852936 (P=1.96 × 10(-8)). Two suggestive loci, 3p21.31 and 17q25, are also identified with P<1.00 × 10(-6). The results of this study increase the number of confirmed psoriasis risk loci and provide novel insight into the pathogenesis of psoriasis. PMID:25006012

  6. Probabilistic approach to identify sensitive parameter distributions in multimedia pathway analysis.

    SciTech Connect

    Kamboj, S.; Gnanapragasam, E.; LePoire, D.; Biwer, B. M.; Cheng, J.; Arnish, J.; Yu, C.; Chen, S. Y.; Mo, T.; Abu-Eid, R.; Thaggard, M.; Environmental Assessment; NRC

    2002-01-01

    Sensitive parameter distributions were identified with the use of probabilistic analysis in the RESRAD computer code. RESRAD is a multimedia pathway analysis code designed to evaluate radiological exposures resulting from radiological contamination in soil. The dose distribution was obtained by using a set of default parameter distribution/values. Most of the variations in the output dose distribution could be attributed to uncertainty in a small set of input parameters that could be considered as sensitive parameter distributions. The identification of the sensitive parameters is a first step in the prioritization of future research and information gathering. When site-specific parameter distribution/values are available for an actual site, the same process should be used with these site-specific data. Regression analysis used to identify sensitive parameters indicated that the dominant pathways depended on the radionuclide and source configurations. However, two parameter distributions were sensitive for many radionuclides: the external shielding factor when external exposure was the dominant pathway and the plant transfer factor when plant ingestion was the dominant pathway. No single correlation or regression coefficient can be used alone to identify sensitive parameters in all the cases. The coefficients are useful guides, but they have to be used in conjunction with other aids, such as scatter plots, and should undergo further analysis.
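
    The idea of ranking input parameters by how strongly they correlate with the output dose can be sketched with a toy Monte Carlo run. This is only an illustration under stated assumptions: the model, the parameter names (shielding_factor, plant_transfer_factor), and their uniform distributions are invented for the example and have nothing to do with the actual RESRAD equations or defaults.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def rank_sensitivity(model, distributions, n_samples=2000, seed=0):
    """Sample every input, run the model, and rank inputs by |correlation|."""
    rng = random.Random(seed)
    samples = {name: [draw(rng) for _ in range(n_samples)]
               for name, draw in distributions.items()}
    outputs = [model({name: samples[name][i] for name in samples})
               for i in range(n_samples)]
    coeffs = {name: pearson(vals, outputs) for name, vals in samples.items()}
    return sorted(coeffs.items(), key=lambda kv: abs(kv[1]), reverse=True)


def toy_dose(params):
    """Hypothetical dose model in which external exposure dominates."""
    return 5.0 * params["shielding_factor"] + 0.2 * params["plant_transfer_factor"]


if __name__ == "__main__":
    dists = {
        "shielding_factor": lambda rng: rng.uniform(0.0, 1.0),
        "plant_transfer_factor": lambda rng: rng.uniform(0.0, 1.0),
    }
    for name, coeff in rank_sensitivity(toy_dose, dists):
        print(name, round(coeff, 3))
```

    As the abstract notes, a single coefficient is only a guide; in practice the ranking would be checked against scatter plots and other diagnostics.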

  7. Predicting the Interplanetary Magnetic Field using Approaches Based on Data Mining and Physical Models

    NASA Astrophysics Data System (ADS)

    Riley, P.; Russell, C. T.; de Koning, C. A.; Biesecker, D. A.; Linker, J.; Owens, M. J.; Lugaz, N.; Martens, P.; Angryk, R.; Reinard, A.; Ulrich, R. K.; Horbury, T. S.; Pizzo, V. J.; Liu, Y.; Hoeksema, T.

    2015-12-01

    An accurate prediction of the interplanetary magnetic field, and, in particular, its z-component (Bz) is a crucial capability for any space weather forecasting system, and yet, thus far, it has remained largely elusive (a point exemplified by the fact that no prediction center currently provides a forecast for Bz). In this presentation, we discuss the various physical processes that can produce non-zero values of Bz and summarize a selection of promising approaches that may ultimately lead to reliable forecasts of Bz. We describe the first steps we have taken to develop a framework for assessing these techniques, and show preliminary results of their efficacy.

  8. Two-step web-mining approach to study geology/geophysics-related open-source software projects

    NASA Astrophysics Data System (ADS)

    Behrends, Knut; Conze, Ronald

    2013-04-01

    Geology/geophysics is a highly interdisciplinary science, overlapping with, for instance, physics, biology and chemistry. In today's software-intensive work environments, geoscientists often encounter new open-source software from scientific fields that are only remotely related to their own field of expertise. We show how web-mining techniques can help to carry out systematic discovery and evaluation of such software. In a first step, we downloaded ~500 abstracts (each consisting of ~1 kb UTF-8 text) from agu-fm12.abstractcentral.com. This web site hosts the abstracts of all publications presented at AGU Fall Meeting 2012, the world's largest annual geology/geophysics conference. All abstracts belonged to the category "Earth and Space Science Informatics", an interdisciplinary label cross-cutting many disciplines such as "deep biosphere", "atmospheric research", and "mineral physics". Each publication was represented by a highly structured record with ~20 short data attributes, the largest authorship-record being the unstructured "abstract" field. We processed texts of the abstracts with the statistics software "R" to calculate a corpus and a term-document matrix. Using R package "tm", we applied text-mining techniques to filter data and develop hypotheses about software-development activities happening in various geology/geophysics fields. By analyzing the term-document matrix with basic techniques (e.g., word frequencies, co-occurrences, weighting) as well as more complex methods (clustering, classification), several key pieces of information were extracted. For example, text-mining can be used to identify scientists who are also developers of open-source scientific software, and the names of their programming projects and codes can also be identified. In a second step, based on the intermediate results found by processing the conference-abstracts, any new hypotheses can be tested in another webmining subproject: by merging the dataset with open data from github
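
    The term-document matrix step mentioned in this record can be sketched in a few lines. The authors worked in R with the "tm" package; the sketch below is a plain-Python stand-in for illustration only, and the tokenizer, function names, and sample texts are assumptions rather than anything taken from the study.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_document_matrix(docs):
    """Return (terms, matrix) where matrix[i][j] counts terms[i] in docs[j]."""
    counts = [Counter(tokenize(d)) for d in docs]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for c in counts] for t in terms]
    return terms, matrix


def term_frequencies(terms, matrix):
    """Total count of each term across all documents (row sums)."""
    return {t: sum(row) for t, row in zip(terms, matrix)}


if __name__ == "__main__":
    # Made-up stand-ins for conference abstracts.
    abstracts = [
        "Open-source software for geology",
        "Geology software and open data",
    ]
    terms, matrix = term_document_matrix(abstracts)
    for term, row in zip(terms, matrix):
        print(term, row)
```

    Word frequencies fall out as row sums of the matrix, and co-occurrence or clustering analyses can be built on the same rows.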

  9. Application of a PCR-based approach to identify sex in Hawaiian honeycreepers (Drepanidinae)

    USGS Publications Warehouse

    Jarvi, S.I.; Banko, P.C.

    2000-01-01

    The application of molecular techniques to conservation genetics issues can provide important guidance criteria for management of endangered species. The results from this study establish that PCR-based approaches for sex determination developed in other bird species (Griffiths and Tiwari 1995; Griffiths et al. 1996, 1998; Ellegren 1996) can be applied with a high degree of confidence to at least four species of Hawaiian honeycreepers. This provides a rapid, reliable method with which population managers can optimize sex ratios within populations of endangered species that are subject to artificial manipulation through captive breeding programmes or geographic translocation.

  10. A novel approach for identifying the true temperature sensitivity from soil respiration measurements

    SciTech Connect

    Gu, Lianhong; Hanson, Paul J; Liu, Qing; Post, Wilfred M

    2008-01-01

    We propose a novel approach, called the localized ratio fitting (LRF), to estimating the true temperature sensitivity from soil respiration measurements, a task crucial to modeling terrestrial carbon cycle and climate but so far hindered by the inadequate conventional regression approach. LRF takes advantage of the different timescales of the pool dynamics-induced and environmental variation-induced changes in soil CO2 efflux. It first transforms the expression for soil respiration into a form suppressing the influence of soil carbon pool dynamics and then uses the transformed expression to infer the parameters of environmental sensitivities. LRF works best for high-frequency soil respiration measurements and thus is particularly suitable for analyzing time series produced by automated soil chambers and from soil incubation experiments. We evaluated the validity of LRF with both simulated (with a multipool soil organic carbon model driven by realistic plant litter input scenarios) and measured (with automated soil chambers) time series of soil respiration. LRF accurately retrieved the true temperature sensitivity from the simulated heterotrophic soil respiration while the conventional approach failed to do so. The simulation also revealed that LRF performed better than the conventional approach when a direct photosynthetic signal existed in the time series of soil respiration although even LRF could not completely eliminate the interference of photosynthetic contribution for estimating the true temperature sensitivity. Importantly, the simulation on the photosynthetic influence reproduced a typical seasonal pattern of apparent temperature sensitivity reported in the literature: higher sensitivity in winter (dormant season) and lower sensitivity in summer (growing season). Such pattern has been interpreted as an indication of temperature acclimation of soil respiration by previous studies. Our simulation now indicated that that interpretation may be incorrect.
The

  11. Multifunctional greenway approach for landscape planning and reclamation of a post-mining district: Cartagena-La Unión, SE Spain

    NASA Astrophysics Data System (ADS)

    Acosta, Jose A.; Faz, Ángel; Zornoza, Raúl; Martínez-Martínez, Silvia; Kabas, Sebla; Bech, Jaume

    2015-04-01

    Fragmented structures create metaphorical wounds in the landscape, altering the ecological and cultural processes associated with it, as can be seen in many mine areas. Therefore it is advisable to organize the reclamation plan at the beginning of mine operation to provide spatial and functional integration of the landscape based on scientific arguments and with all possible legal and administrative means, which is generally the case of the Strategic Environmental Assessment. However, there are many abandoned mine areas where no reclamation plan has been carried out, such as the Mining District of Sierra Minera Cartagena-La Unión, SE Spain. In these cases it is vital to respond in a sustainable manner to heal the landscape wounds of post-mining activities. Reclamation activities in a post-mining district include not only the mine soils but also all land uses around them; for this reason it is necessary to create practical solutions for restoring the ecological and cultural processes of the area. The greenway approach shows the main veins which are crucial for keeping alive and sustaining these processes. Therefore the main objectives of this study are to 1) develop an integrated local greenway network to preserve significant resources and values of the district, and 2) develop this greenway network as part of the reclamation process for degraded areas. Landscape assessments revealed the most valuable and potential connectivity resources of the area. These clustering and linear patterns of resource concentrations include mountain ranges and valleys, the natural drainage network, legally protected areas and cultural-historical resources. Conservation areas, cultural-educational resources of post-mining activities and the riverbeds have been the main building blocks for the greenway corridor.
The multifunctional greenway approach serves as landscape reclamation and planning tool in a degraded area by showing the priority zones for

  12. A stable isotope approach and its application for identifying nitrate source and transformation process in water.

    PubMed

    Xu, Shiguo; Kang, Pingping; Sun, Ya

    2016-01-01

    Nitrate contamination of water is a worldwide environmental problem. Recent studies have demonstrated that the nitrogen (N) and oxygen (O) isotopes of nitrate (NO3(-)) can be used to trace nitrogen dynamics including identifying nitrate sources and nitrogen transformation processes. This paper analyzes the current state of identifying nitrate sources and nitrogen transformation processes using N and O isotopes of nitrate. With regard to nitrate sources, δ(15)N-NO3(-) and δ(18)O-NO3(-) values typically vary between sources, allowing the sources to be isotopically fingerprinted. δ(15)N-NO3(-) is often effective at tracing NO3(-) sources from areas with different land use. δ(18)O-NO3(-) is more useful to identify NO3(-) from atmospheric sources. Isotopic data can be combined with statistical mixing models to quantify the relative contributions of NO3(-) from multiple delineated sources. With regard to N transformation processes, N and O isotopes of nitrate can be used to decipher the degree of nitrogen transformation by such processes as nitrification, assimilation, and denitrification. In some cases, however, isotopic fractionation may alter the isotopic fingerprint associated with the delineated NO3(-) source(s). This problem may be addressed by combining the N and O isotopic data with other types of data, including the concentration of selected conservative elements, e.g., chloride (Cl(-)), boron isotope (δ(11)B), and sulfur isotope (δ(35)S) data. Future studies should focus on improving stable isotope mixing models and furthering our understanding of isotopic fractionation by conducting laboratory and field experiments in different environments. PMID:26541149
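The mixing-model idea mentioned in this abstract can be illustrated with the simplest possible case, a two-source linear mass balance on a single isotope. This is only a sketch and not the statistical models the review refers to: it assumes no isotopic fractionation, and the function name and all delta values are hypothetical.

```python
def two_source_fraction(delta_mix, delta_a, delta_b):
    """Fraction of nitrate from source A under a linear two-source mass balance.

    Solves delta_mix = f * delta_a + (1 - f) * delta_b for f.
    Assumes no fractionation between source and sample (a strong assumption).
    """
    if delta_a == delta_b:
        raise ValueError("sources are isotopically indistinguishable")
    f = (delta_mix - delta_b) / (delta_a - delta_b)
    if not 0.0 <= f <= 1.0:
        raise ValueError("sample lies outside the range spanned by the sources")
    return f


# Hypothetical delta(15)N values in per mil, chosen only for illustration.
fraction_a = two_source_fraction(delta_mix=8.0, delta_a=12.0, delta_b=2.0)
```

With more than two sources, or when fractionation matters, a single isotope no longer determines the fractions, which is why the abstract points to statistical mixing models and additional tracers.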

  13. Identifying important nodes in weighted functional brain networks: A comparison of different centrality approaches

    NASA Astrophysics Data System (ADS)

    Kuhnert, Marie-Therese; Geier, Christian; Elger, Christian E.; Lehnertz, Klaus

    2012-06-01

    We compare different centrality metrics which aim at an identification of important nodes in complex networks. We investigate weighted functional brain networks derived from multichannel electroencephalograms recorded from 23 healthy subjects under resting-state eyes-open or eyes-closed conditions. Although we observe the metrics strength, closeness, and betweenness centrality to be related to each other, they capture different spatial and temporal aspects of important nodes in these networks associated with behavioral changes. Identifying and characterizing these nodes thus benefits from the application of several centrality metrics.
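Two of the metrics named in this abstract, strength and closeness centrality, can be sketched on a small weighted network. The conventions below are assumptions rather than the authors' definitions: edge length is taken as the reciprocal of edge weight, closeness is the number of reachable nodes divided by the summed shortest-path length, and the three-node network with its weights is hypothetical.

```python
import heapq


def strength(weights):
    """Sum of the weights of all edges attached to each node."""
    return {u: sum(nbrs.values()) for u, nbrs in weights.items()}


def closeness(weights):
    """Closeness centrality using Dijkstra with edge length = 1 / weight."""
    result = {}
    for source in weights:
        dist = {source: 0.0}
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale heap entry
            for v, w in weights[u].items():
                nd = d + 1.0 / w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# Hypothetical symmetric network of three electrode sites.
network = {
    "Fz": {"Cz": 0.8, "Pz": 0.2},
    "Cz": {"Fz": 0.8, "Pz": 0.5},
    "Pz": {"Fz": 0.2, "Cz": 0.5},
}
node_strength = strength(network)
node_closeness = closeness(network)
```

In this toy network both metrics single out the same node, but on larger networks they need not agree, which is the point the abstract makes about using several metrics.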

  14. FXR antagonism of NSAIDs contributes to drug-induced liver injury identified by systems pharmacology approach

    PubMed Central

    Lu, Weiqiang; Cheng, Feixiong; Jiang, Jing; Zhang, Chen; Deng, Xiaokang; Xu, Zhongyu; Zou, Shien; Shen, Xu; Tang, Yun; Huang, Jin

    2015-01-01

    Non-steroidal anti-inflammatory drugs (NSAIDs) are used worldwide as analgesic, antipyretic, and anti-inflammatory therapeutics. However, NSAIDs often cause several serious liver injuries, such as drug-induced liver injury (DILI), and the molecular mechanisms of DILI have not been clearly elucidated. In this study, we developed a systems pharmacology approach to explore the mechanism-of-action of NSAIDs. We found that the Farnesoid X Receptor (FXR) antagonism of NSAIDs is a potential molecular mechanism of DILI through systematic network analysis and in vitro assays. Specifically, the quantitative real-time PCR assay reveals that indomethacin and ibuprofen regulate FXR downstream target gene expression in HepG2 cells. Furthermore, the western blot shows that FXR antagonism by indomethacin induces the phosphorylation of STAT3 (signal transducer and activator of transcription 3), promotes the activation of caspase9, and finally causes DILI. In summary, our systems pharmacology approach provided novel insights into molecular mechanisms of DILI for NSAIDs, which may propel the ways toward the design of novel anti-inflammatory pharmacotherapeutics. PMID:25631039

  15. MartiTracks: a geometrical approach for identifying geographical patterns of distribution.

    PubMed

    Echeverría-Londoño, Susy; Miranda-Esquivel, Daniel Rafael

    2011-01-01

    Panbiogeography represents an evolutionary approach to biogeography, using rational cost-efficient methods to reduce initial complexity to locality data, and depict general distribution patterns. However, few quantitative, and automated panbiogeographic methods exist. In this study, we propose a new algorithm, within a quantitative, geometrical framework, to perform panbiogeographical analyses as an alternative to more traditional methods. The algorithm first calculates a minimum spanning tree, an individual track for each species in a panbiogeographic context. Then the spatial congruence among segments of the minimum spanning trees is calculated using five congruence parameters, producing a general distribution pattern. In addition, the algorithm removes the ambiguity, and subjectivity often present in a manual panbiogeographic analysis. Results from two empirical examples using 61 species of the genus Bomarea (2340 records), and 1031 genera of both plants and animals (100118 records) distributed across the Northern Andes, demonstrated that a geometrical approach to panbiogeography is a feasible quantitative method to determine general distribution patterns for taxa, reducing complexity, and the time needed for managing large data sets. PMID:21533259
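The first step described in this abstract, building a minimum spanning tree over a species' localities to obtain its individual track, can be sketched with Prim's algorithm. This is not the MartiTracks implementation: it assumes planar Euclidean distance between coordinate pairs (a real analysis would likely use a geodesic distance), the coordinates are hypothetical, and the five congruence parameters are not reproduced.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on a complete graph; returns (i, j, distance) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each point
    best_from = [-1] * n         # tree endpoint of that cheapest link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


# Hypothetical locality records for one species, as (x, y) pairs.
localities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
track = minimum_spanning_tree(localities)
track_length = sum(d for _, _, d in track)
```

The resulting edge list is the kind of object whose segments would then be compared across species to look for spatial congruence.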

  16. A genomics approach to identify susceptibilities of breast cancer cells to “fever-range” hyperthermia

    PubMed Central

    2014-01-01

    Background Preclinical and clinical studies have shown for decades that tumor cells demonstrate significantly greater sensitivity to “fever range” hyperthermia (increasing the intratumoral temperature to 42-45°C) than normal cells, although it is unknown why cancer cells exhibit this distinctive susceptibility. Methods To address this issue, mammary epithelial cells and three malignant breast cancer lines were subjected to hyperthermic shock and microarray, bioinformatics, and network analysis of the global transcription changes was subsequently performed. Results Bioinformatics analysis differentiated the gene expression patterns that distinguish the heat shock response of normal cells from malignant breast cancer cells, revealing that the gene expression profiles of mammary epithelial cells are completely distinct from malignant breast cancer lines following this treatment. Using gene network analysis, we identified altered expression of transcripts involved in mitotic regulators, histones, and non-protein coding RNAs as the significant processes that differed between the hyperthermic response of mammary epithelial cells and breast cancer cells. We confirmed our data via qPCR and flow cytometric analysis to demonstrate that hyperthermia specifically disrupts the expression of key mitotic regulators and G2/M phase progression in the breast cancer cells. Conclusion These data have identified molecular mechanisms by which breast cancer lines may exhibit enhanced susceptibility to hyperthermic shock. PMID:24511912

  17. Proteomics Approaches to Identify Mono(ADP-ribosyl)ated and Poly(ADP-ribosyl)ated proteins

    PubMed Central

    Vivelo, Christina A.; Leung, Anthony K. L.

    2015-01-01

    ADP-ribosylation refers to the addition of one or more ADP-ribose units onto protein substrates and this protein modification has been implicated in various cellular processes including DNA damage repair, RNA metabolism, transcription and cell cycle regulation. This review focuses on a compilation of large-scale proteomics studies that identify ADP-ribosylated proteins and their associated proteins by mass spectrometry using a variety of enrichment strategies. Some methods, such as the use of a poly(ADP-ribose)-specific antibody and boronate affinity chromatography and NAD+ analogues, have been employed for decades while others, such as the use of protein microarrays and recombinant proteins that bind ADP-ribose moieties (such as macrodomains), have only recently been developed. The advantages and disadvantages of each method and whether these methods are specific for identifying mono(ADP-ribosyl)ated and poly(ADP-ribosyl)ated proteins will be discussed. Lastly, since poly(ADP-ribose) is heterogeneous in length, it has been difficult to attain a mass signature associated with the modification sites. Several strategies on how to reduce polymer chain length heterogeneity for site identification will be reviewed. PMID:25263235

  18. A novel transcriptomic approach to identify candidate genes for grain quality traits in wheat.

    PubMed

    Wan, Yongfang; Underwood, Claudia; Toole, Geraldine; Skeggs, Peter; Zhu, Tong; Leverington, Michelle; Griffiths, Simon; Wheeler, Tim; Gooding, Mike; Poole, Rebecca; Edwards, Keith J; Gezan, Salvador; Welham, Sue; Snape, John; Mills, E N Clare; Mitchell, Rowan A C; Shewry, Peter R

    2009-06-01

    A novel methodology is described in which transcriptomics is combined with the measurement of bread-making quality and other agronomic traits for wheat genotypes grown in different environments (wet and cool or hot and dry conditions) to identify transcripts associated with these traits. Seven doubled haploid lines from the Spark x Rialto mapping population were selected to be matched for development and known alleles affecting quality. These were grown in polytunnels with different environments applied 14 days post-anthesis, and the whole experiment was repeated over 2 years. Transcriptomics using the wheat Affymetrix chip was carried out on whole caryopsis samples at two stages during grain filling. Transcript abundance was correlated with the traits for approximately 400 transcripts. About 30 of these were selected as being of most interest, and markers were derived from them and mapped using the population. Expression was identified as being under cis control for 11 of these and under trans control for 18. These transcripts are candidates for involvement in the biological processes which underlie genotypic variation in these traits. PMID:19490503

  19. A Proteomic Approach Identifies Candidate Early Biomarkers to Predict Severe Dengue in Children

    PubMed Central

    Nhi, Dang My; Huy, Nguyen Tien; Ohyama, Kaname; Kimura, Daisuke; Lan, Nguyen Thi Phuong; Uchida, Leo; Thuong, Nguyen Van; Nhon, Cao Thi My; Phuc, Le Hong; Mai, Nguyen Thi; Mizukami, Shusaku; Bao, Lam Quoc; Doan, Nguyen Ngoc; Binh, Nguyen Van Thanh; Quang, Luong Chan; Karbwang, Juntra; Yui, Katsuyuki; Morita, Kouichi; Huong, Vu Thi Que; Hirayama, Kenji

    2016-01-01

    Background Severe dengue with severe plasma leakage (SD-SPL) is the most frequent severe form of dengue. Plasma biomarkers for early predictive diagnosis of SD-SPL are required in the primary clinics for the prevention of dengue death. Methodology Among 63 confirmed pediatric dengue patients recruited, a hospital-based longitudinal study detected six with SD-SPL and ten with dengue with warning signs (DWS). To identify the specific proteins increased or decreased in the SD-SPL plasma obtained 6–48 hours before the shock compared with the DWS, the isobaric tags for relative and absolute quantification (iTRAQ) technology was applied using four patients in each group. Validation was undertaken in 6 SD-SPL and 10 DWS patients. Principal findings Nineteen plasma proteins exhibited significantly different relative concentrations (p<0.05), with five over-expressed and fourteen under-expressed in SD-SPL compared with DWS. Each protein was classified into blood coagulation, vascular regulation, cellular transport-related processes or immune response. The immunoblot quantification showed angiotensinogen and antithrombin III significantly increased in SD-SPL whole plasma of early stage compared with DWS subjects. Even using this small number of samples, antithrombin III predicted SD-SPL before shock occurrence with accuracy. Conclusion Proteins identified here may serve as candidate predictive markers to diagnose SD-SPL for timely clinical management. Because the number of subjects is small, further studies are needed to confirm these biomarkers. PMID:26895439

  20. A Model-Based Approach for Identifying Signatures of Ancient Balancing Selection in Genetic Data

    PubMed Central

    DeGiorgio, Michael; Lohmueller, Kirk E.; Nielsen, Rasmus

    2014-01-01

    While much effort has focused on detecting positive and negative directional selection in the human genome, relatively little work has been devoted to balancing selection. This lack of attention is likely due to the paucity of sophisticated methods for identifying sites under balancing selection. Here we develop two composite likelihood ratio tests for detecting balancing selection. Using simulations, we show that these methods outperform competing methods under a variety of assumptions and demographic models. We apply the new methods to whole-genome human data, and find a number of previously-identified loci with strong evidence of balancing selection, including several HLA genes. Additionally, we find evidence for many novel candidates, the strongest of which is FANK1, an imprinted gene that suppresses apoptosis, is expressed during meiosis in males, and displays marginal signs of segregation distortion. We hypothesize that balancing selection acts on this locus to stabilize the segregation distortion and negative fitness effects of the distorter allele. Thus, our methods are able to reproduce many previously-hypothesized signals of balancing selection, as well as discover novel interesting candidates. PMID:25144706

  1. Genetic Susceptibility to Vitiligo: GWAS Approaches for Identifying Vitiligo Susceptibility Genes and Loci

    PubMed Central

    Shen, Changbing; Gao, Jing; Sheng, Yujun; Dou, Jinfa; Zhou, Fusheng; Zheng, Xiaodong; Ko, Randy; Tang, Xianfa; Zhu, Caihong; Yin, Xianyong; Sun, Liangdan; Cui, Yong; Zhang, Xuejun

    2016-01-01

    Vitiligo is an autoimmune disease with a strong genetic component, characterized by areas of depigmented skin resulting from loss of epidermal melanocytes. Genetic factors are known to play key roles in vitiligo through discoveries in association studies and family studies. Previously, vitiligo susceptibility genes were mainly revealed through linkage analysis and candidate gene studies. Recently, our understanding of the genetic basis of vitiligo has been rapidly advancing through genome-wide association study (GWAS). More than 40 robust susceptible loci have been identified and confirmed to be associated with vitiligo by using GWAS. Most of these associated genes participate in important pathways involved in the pathogenesis of vitiligo. Many susceptible loci with unknown functions in the pathogenesis of vitiligo have also been identified, indicating that additional molecular mechanisms may contribute to the risk of developing vitiligo. In this review, we summarize the key loci that are of genome-wide significance, which have been shown to influence vitiligo risk. These genetic loci may help build the foundation for genetic diagnosis and personalize treatment for patients with vitiligo in the future. However, substantial additional studies, including gene-targeted and functional studies, are required to confirm the causality of the genetic variants and their biological relevance in the development of vitiligo. PMID:26870082

  2. A molecular approach to identify active microbes in environmental eukaryote clone libraries.

    PubMed

    Stoeck, Thorsten; Zuendorf, Alexandra; Breiner, Hans-Werner; Behnke, Anke

    2007-02-01

    A rapid method for the simultaneous extraction of RNA and DNA from eukaryote plankton samples was developed in order to discriminate between indigenous active cells and signals from inactive or even dead organisms. The method was tested using samples from below the chemocline of an anoxic Danish fjord. The simple protocol yielded RNA and DNA of a purity suitable for amplification by reverse transcription-polymerase chain reaction (RT-PCR) and PCR, respectively. We constructed an rRNA-derived and an rDNA-derived clone library to assess the composition of the microeukaryote assemblage under study and to identify physiologically active constituents of the community. We retrieved nearly 600 protistan target clones, which grouped into 84 different phylotypes (98% sequence similarity). Of these phylotypes, 27% occurred in both libraries, 25% exclusively in the rRNA library, and 48% exclusively in the rDNA library. Both libraries revealed good correspondence of the general community composition in terms of higher taxonomic ranks. They were dominated by anaerobic ciliates and heterotrophic stramenopile flagellates thriving below the fjord's chemocline. The high abundance of these bacterivore organisms points out their role as a major trophic link in anoxic marine systems. A comparison of the two libraries identified phototrophic dinoflagellates, "uncultured marine alveolates group I," and different parasites, which were exclusively detected with the rDNA-derived library, as nonindigenous members of the anoxic microeukaryote community under study. PMID:17264997

  3. Proteomics approaches to identify mono-(ADP-ribosyl)ated and poly(ADP-ribosyl)ated proteins.

    PubMed

    Vivelo, Christina A; Leung, Anthony K L

    2015-01-01

    ADP-ribosylation refers to the addition of one or more ADP-ribose units onto protein substrates and this protein modification has been implicated in various cellular processes including DNA damage repair, RNA metabolism, transcription, and cell cycle regulation. This review focuses on a compilation of large-scale proteomics studies that identify ADP-ribosylated proteins and their associated proteins by MS using a variety of enrichment strategies. Some methods, such as the use of a poly(ADP-ribose)-specific antibody and boronate affinity chromatography and NAD(+) analogues, have been employed for decades while others, such as the use of protein microarrays and recombinant proteins that bind ADP-ribose moieties (such as macrodomains), have only recently been developed. The advantages and disadvantages of each method and whether these methods are specific for identifying mono(ADP-ribosyl)ated and poly(ADP-ribosyl)ated proteins will be discussed. Lastly, since poly(ADP-ribose) is heterogeneous in length, it has been difficult to attain a mass signature associated with the modification sites. Several strategies on how to reduce polymer chain length heterogeneity for site identification will be reviewed. PMID:25263235

  4. Towards identifying Brassica proteins involved in mediating resistance to Leptosphaeria maculans: a proteomics-based approach.

    PubMed

    Sharma, Nidhi; Hotte, Naomi; Rahman, Muhammad H; Mohammadi, Mohsen; Deyholos, Michael K; Kav, Nat N V

    2008-09-01

    To better understand the pathogen-stress response of Brassica species against the ubiquitous hemi-biotroph fungus Leptosphaeria maculans, we conducted a comparative proteomic analysis between blackleg-susceptible Brassica napus and blackleg-resistant Brassica carinata following pathogen inoculation. We examined temporal changes (6, 12, 24, 48 and 72 h) in protein profiles of both species subjected to pathogen-challenge using two-dimensional gel electrophoresis. A total of 64 proteins were found to be significantly affected by the pathogen in the two species, out of which 51 protein spots were identified using tandem mass spectrometry. The proteins identified included antioxidant enzymes, photosynthetic and metabolic enzymes, and those involved in protein processing and signaling. Specifically, we observed that in the tolerant B. carinata, enzymes involved in the detoxification of free radicals increased in response to the pathogen whereas no such increase was observed in the susceptible B. napus. The expression of genes encoding four selected proteins was validated using quantitative real-time PCR and an additional one by Western blotting. Our findings are discussed with respect to tolerance or susceptibility of these species to the pathogen. PMID:18668695

  5. Identifying the greatest team and captain—A complex network approach to cricket matches

    NASA Astrophysics Data System (ADS)

    Mukherjee, Satyam

    2012-12-01

    We consider all Test matches played between 1877 and 2010 and One Day International (ODI) matches played between 1971 and 2010. We form directed and weighted networks of teams and also of their captains. The success of a team (or captain) is determined by the ‘quality’ of the wins, not simply by the number of wins. We apply the diffusion-based PageRank algorithm to the networks to assess the importance of the wins, and rank the respective teams and captains. Our analysis identifies Australia as the best team in both forms of cricket, Test and ODI. Steve Waugh is identified as the best captain in Test cricket and Ricky Ponting is the best captain in the ODI format. We also compare our ranking scheme with an existing ranking scheme, the Reliance ICC ranking. Our method does not depend on ‘external’ criteria in the ranking of teams (captains). The purpose of this paper is to introduce a revised ranking of cricket teams and to quantify the success of the captains.
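The ranking idea in this abstract can be sketched as a weighted PageRank by power iteration. The abstract does not specify how edges are directed or weighted, so the sketch assumes an edge from the losing team to the winning team weighted by the number of such results, a damping factor of 0.85, and hypothetical team names and counts.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=1000):
    """Weighted PageRank; edges maps (loser, winner) -> number of results."""
    nodes = sorted({name for pair in edges for name in pair})
    n = len(nodes)
    out_weight = {name: 0.0 for name in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w
    rank = {name: 1.0 / n for name in nodes}
    for _ in range(max_iter):
        # Score held by teams with no losses is redistributed uniformly.
        dangling = sum(rank[x] for x in nodes if out_weight[x] == 0.0)
        new = {x: (1.0 - damping) / n + damping * dangling / n for x in nodes}
        for (src, dst), w in edges.items():
            new[dst] += damping * rank[src] * w / out_weight[src]
        delta = sum(abs(new[x] - rank[x]) for x in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical results: A beat B 3 times and C 3 times, B beat C twice, C beat A once.
results = {("B", "A"): 3, ("C", "A"): 3, ("C", "B"): 2, ("A", "C"): 1}
scores = pagerank(results)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy example C ends up ranked above B even though B has more wins, because C's single win came against the strongest team. That is the sense in which such a scheme rewards the quality of wins rather than their number.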

  6. An approach to developing independent learning and non-technical skills amongst final year mining engineering students

    NASA Astrophysics Data System (ADS)

    Knobbs, C. G.; Grayson, D. J.

    2012-06-01

    There is mounting evidence to show that engineers need more than technical skills to succeed in industry. This paper describes a curriculum innovation in which so-called 'soft' skills, specifically inter-personal and intra-personal skills, were integrated into a final year mining engineering course. The instructional approach was designed to promote independent learning and to develop non-technical skills, essential for students on the threshold of becoming practising engineers. Three psychometric tests were administered at the beginning of the course to make students aware of their own and their classmates' characteristics. Substantial prescribed reading assignments preceded weekly group discussions. Several projects during the course required team work skills and application of content knowledge to real-world contexts. Results obtained from students' reflection papers, assignments related to 'soft' skills and end of course evaluations suggest that students' appreciation of the need for these skills, as well as their own perceived competence, increased during the course. Their ability to function as independent learners also increased.

  7. Ab initio thermodynamic approach to identify mixed solid sorbents for CO2 capture technology

    SciTech Connect

    Duan, Yuhua

    2015-10-15

    Because the current technologies for capturing CO2 are still too energy intensive, new materials must be developed that can capture CO2 reversibly with acceptable energy costs. At a given CO2 pressure, the turnover temperature (Tt) of the reaction of an individual solid that can capture CO2 is fixed. Such Tt may be outside the operating temperature range (ΔTo) for a practical capture technology. To adjust Tt to fit the practical ΔTo, in this study, three scenarios of mixing schemes are explored by combining thermodynamic database mining with first principles density functional theory and phonon lattice dynamics calculations. Our calculated results demonstrate that by mixing different types of solids, it’s possible to shift Tt to the range of practical operating temperature conditions. According to the requirements imposed by the pre- and post- combustion technologies and based on our calculated thermodynamic properties for the CO2 capture reactions by the mixed solids of interest, we were able to identify the mixing ratios of two or more solids to form new sorbent materials for which lower capture energy costs are expected at the desired pressure and temperature conditions.

  8. What's Inside That Seed We Brew? A New Approach To Mining the Coffee Microbiome.

    PubMed

    Vaughan, Michael Joe; Mitchell, Thomas; McSpadden Gardener, Brian B

    2015-10-01

    Coffee is a critically important agricultural commodity for many tropical states and is a beverage enjoyed by millions of people worldwide. Recent concerns over the sustainability of coffee production have prompted investigations of the coffee microbiome as a tool to improve crop health and bean quality. This review synthesizes literature informing our knowledge of the coffee microbiome, with an emphasis on applications of fruit- and seed-associated microbes in coffee production and processing. A comprehensive inventory of microbial species cited in association with coffee fruits and seeds is presented as reference tool for researchers investigating coffee-microbe associations. It concludes with a discussion of the approaches and techniques that provide a path forward to improve our understanding of the coffee microbiome and its utility, as a whole and as individual components, to help ensure the future sustainability of coffee production. PMID:26162877

  9. What's Inside That Seed We Brew? A New Approach To Mining the Coffee Microbiome

    PubMed Central

    Mitchell, Thomas; McSpadden Gardener, Brian B.

    2015-01-01

    Coffee is a critically important agricultural commodity for many tropical states and is a beverage enjoyed by millions of people worldwide. Recent concerns over the sustainability of coffee production have prompted investigations of the coffee microbiome as a tool to improve crop health and bean quality. This review synthesizes literature informing our knowledge of the coffee microbiome, with an emphasis on applications of fruit- and seed-associated microbes in coffee production and processing. A comprehensive inventory of microbial species cited in association with coffee fruits and seeds is presented as reference tool for researchers investigating coffee-microbe associations. It concludes with a discussion of the approaches and techniques that provide a path forward to improve our understanding of the coffee microbiome and its utility, as a whole and as individual components, to help ensure the future sustainability of coffee production. PMID:26162877

  10. Mining the Unknown: A Systems Approach to Metabolite Identification Combining Genetic and Metabolic Information

    PubMed Central

    Krumsiek, Jan; Suhre, Karsten; Evans, Anne M.; Mitchell, Matthew W.; Mohney, Robert P.; Milburn, Michael V.; Wägele, Brigitte; Römisch-Margl, Werner; Illig, Thomas; Adamski, Jerzy; Gieger, Christian; Theis, Fabian J.; Kastenmüller, Gabi

    2012-01-01

    Recent genome-wide association studies (GWAS) with metabolomics data linked genetic variation in the human genome to differences in individual metabolite levels. A strong relevance of this metabolic individuality for biomedical and pharmaceutical research has been reported. However, a considerable amount of the molecules currently quantified by modern metabolomics techniques are chemically unidentified. The identification of these “unknown metabolites” is still a demanding and intricate task, limiting their usability as functional markers of metabolic processes. As a consequence, previous GWAS largely ignored unknown metabolites as metabolic traits for the analysis. Here we present a systems-level approach that combines genome-wide association analysis and Gaussian graphical modeling with metabolomics to predict the identity of the unknown metabolites. We apply our method to original data of 517 metabolic traits, of which 225 are unknowns, and genotyping information on 655,658 genetic variants, measured in 1,768 human blood samples. We report previously undescribed genotype–metabotype associations for six distinct gene loci (SLC22A2, COMT, CYP3A5, CYP2C18, GBA3, UGT3A1) and one locus not related to any known gene (rs12413935). Overlaying the inferred genetic associations, metabolic networks, and knowledge-based pathway information, we derive testable hypotheses on the biochemical identities of 106 unknown metabolites. As a proof of principle, we experimentally confirm nine concrete predictions. We demonstrate the benefit of our method for the functional interpretation of previous metabolomics biomarker studies on liver detoxification, hypertension, and insulin resistance. Our approach is generic in nature and can be directly transferred to metabolomics data from different experimental platforms. PMID:23093944

  11. A Systematic Approach to Identify Candidate Transcription Factors that Control Cell Identity

    PubMed Central

    D’Alessio, Ana C.; Fan, Zi Peng; Wert, Katherine J.; Baranov, Petr; Cohen, Malkiel A.; Saini, Janmeet S.; Cohick, Evan; Charniga, Carol; Dadon, Daniel; Hannett, Nancy M.; Young, Michael J.; Temple, Sally; Jaenisch, Rudolf; Lee, Tong Ihn; Young, Richard A.

    2015-01-01

    Summary: Hundreds of transcription factors (TFs) are expressed in each cell type, but cell identity can be induced through the activity of just a small number of core TFs. Systematic identification of these core TFs for a wide variety of cell types is currently lacking and would establish a foundation for understanding the transcriptional control of cell identity in development, disease, and cell-based therapy. Here, we describe a computational approach that generates an atlas of candidate core TFs for a broad spectrum of human cells. The potential impact of the atlas was demonstrated via cellular reprogramming efforts where candidate core TFs proved capable of converting human fibroblasts to retinal pigment epithelial-like cells. These results suggest that candidate core TFs from the atlas will prove a useful starting point for studying transcriptional control of cell identity and reprogramming in many human cell types. PMID:26603904

  12. A Systematic Approach to Identify Candidate Transcription Factors that Control Cell Identity.

    PubMed

    D'Alessio, Ana C; Fan, Zi Peng; Wert, Katherine J; Baranov, Petr; Cohen, Malkiel A; Saini, Janmeet S; Cohick, Evan; Charniga, Carol; Dadon, Daniel; Hannett, Nancy M; Young, Michael J; Temple, Sally; Jaenisch, Rudolf; Lee, Tong Ihn; Young, Richard A

    2015-11-10

    Hundreds of transcription factors (TFs) are expressed in each cell type, but cell identity can be induced through the activity of just a small number of core TFs. Systematic identification of these core TFs for a wide variety of cell types is currently lacking and would establish a foundation for understanding the transcriptional control of cell identity in development, disease, and cell-based therapy. Here, we describe a computational approach that generates an atlas of candidate core TFs for a broad spectrum of human cells. The potential impact of the atlas was demonstrated via cellular reprogramming efforts where candidate core TFs proved capable of converting human fibroblasts to retinal pigment epithelial-like cells. These results suggest that candidate core TFs from the atlas will prove a useful starting point for studying transcriptional control of cell identity and reprogramming in many human cell types. PMID:26603904

  13. Identifying dominant controls on hydrologic parameter transfer from gauged to ungauged catchments - A comparative hydrology approach

    NASA Astrophysics Data System (ADS)

    Singh, R.; Archfield, S. A.; Wagener, T.

    2014-09-01

    Daily streamflow information is critical for solving various hydrologic problems, though observations of continuous streamflow for model calibration are available at only a small fraction of the world's rivers. One approach to estimate daily streamflow at an ungauged location is to transfer rainfall-runoff model parameters calibrated at a gauged (donor) catchment to an ungauged (receiver) catchment of interest. Central to this approach is the selection of a hydrologically similar donor. No single metric or set of metrics of hydrologic similarity has been demonstrated to consistently select a suitable donor catchment. We design an experiment to diagnose the dominant controls on successful hydrologic model parameter transfer. We calibrate a lumped rainfall-runoff model to 83 stream gauges across the United States. All locations are USGS reference gauges with minimal human influence. Parameter sets from the calibrated models are then transferred to each of the other catchments and the performance of the transferred parameters is assessed. This transfer experiment is carried out both at the scale of the entire US and then for six geographic regions. We use classification and regression tree (CART) analysis to determine the relationship between catchment similarity and performance of transferred parameters. Similarity is defined using physical/climatic catchment characteristics, as well as streamflow response characteristics (signatures such as baseflow index and runoff ratio). Across the entire US, successful parameter transfer is governed by similarity in elevation and climate, and high similarity in streamflow signatures. Controls vary for different geographic regions though. Geology followed by drainage, topography and climate constitute the dominant similarity metrics in forested eastern mountains and plateaus, whereas agricultural land use relates most strongly with successful parameter transfer in the humid plains.

  14. Identifying dominant controls on hydrologic parameter transfer from gauged to ungauged catchments: a comparative hydrology approach

    USGS Publications Warehouse

    Singh, R.; Archfield, S.A.; Wagener, T.

    2014-01-01

    Daily streamflow information is critical for solving various hydrologic problems, though observations of continuous streamflow for model calibration are available at only a small fraction of the world’s rivers. One approach to estimate daily streamflow at an ungauged location is to transfer rainfall–runoff model parameters calibrated at a gauged (donor) catchment to an ungauged (receiver) catchment of interest. Central to this approach is the selection of a hydrologically similar donor. No single metric or set of metrics of hydrologic similarity has been demonstrated to consistently select a suitable donor catchment. We design an experiment to diagnose the dominant controls on successful hydrologic model parameter transfer. We calibrate a lumped rainfall–runoff model to 83 stream gauges across the United States. All locations are USGS reference gauges with minimal human influence. Parameter sets from the calibrated models are then transferred to each of the other catchments and the performance of the transferred parameters is assessed. This transfer experiment is carried out both at the scale of the entire US and then for six geographic regions. We use classification and regression tree (CART) analysis to determine the relationship between catchment similarity and performance of transferred parameters. Similarity is defined using physical/climatic catchment characteristics, as well as streamflow response characteristics (signatures such as baseflow index and runoff ratio). Across the entire US, successful parameter transfer is governed by similarity in elevation and climate, and high similarity in streamflow signatures. Controls vary for different geographic regions though. Geology followed by drainage, topography and climate constitute the dominant similarity metrics in forested eastern mountains and plateaus, whereas agricultural land use relates most strongly with successful parameter transfer in the humid plains.

  15. Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach

    PubMed Central

    Ren, Xiang; El-Kishky, Ahmed; Wang, Chi; Han, Jiawei

    2015-01-01

    In today’s computerized and information-based society, we are soaked with vast amounts of text data, ranging from news articles, scientific publications, product reviews, to a wide range of textual information from social media. To unlock the value of these unstructured text data from various domains, it is of great importance to gain an understanding of entities and their relationships. In this tutorial, we introduce data-driven methods to recognize typed entities of interest in massive, domain-specific text corpora. These methods can automatically identify token spans as entity mentions in documents and label their types (e.g., people, product, food) in a scalable way. We demonstrate on real datasets including news articles and tweets how these typed entities aid in knowledge discovery and management. PMID:26705508

  16. Detecting a Weak Association by Testing its Multiple Perturbations: a Data Mining Approach

    NASA Astrophysics Data System (ADS)

    Lo, Min-Tzu; Lee, Wen-Chung

    2014-05-01

    Many risk factors/interventions in epidemiologic/biomedical studies are of minuscule effects. To detect such weak associations, one needs a study with a very large sample size (the number of subjects, n). The n of a study can be increased but unfortunately only to an extent. Here, we propose a novel method which hinges on increasing sample size in a different direction: the total number of variables (p). We construct a p-based 'multiple perturbation test', and conduct power calculations and computer simulations to show that it can achieve a very high power to detect weak associations when p can be made very large. As a demonstration, we apply the method to analyze a genome-wide association study on age-related macular degeneration and identify two novel genetic variants that are significantly associated with the disease. The p-based method may set a stage for a new paradigm of statistical tests.

  17. A literature mining-based approach for identification of cellular pathways associated with chemoresistance in cancer.

    PubMed

    Oh, Jung Hun; Deasy, Joseph O

    2016-05-01

    Chemoresistance is a major obstacle to the successful treatment of many human cancer types. Increasing evidence has revealed that chemoresistance involves many genes and multiple complex biological mechanisms including cancer stem cells, drug efflux mechanism, autophagy and epithelial-mesenchymal transition. Many studies have been conducted to investigate the possible molecular mechanisms of chemoresistance. However, understanding of the biological mechanisms in chemoresistance still remains limited. We surveyed the literature on chemoresistance-related genes and pathways of multiple cancer types. We then used a curated pathway database to investigate significant chemoresistance-related biological pathways. In addition, to investigate the importance of chemoresistance-related markers in protein-protein interaction networks identified using the curated database, we used a gene-ranking algorithm designed based on a graph-based scoring function in our previous study. Our comprehensive survey and analysis provide a systems biology-based overview of the underlying mechanisms of chemoresistance. PMID:26220932

  18. A cationic poly(2-oxazoline) with high in vitro transfection efficiency identified by a library approach.

    PubMed

    Rinkenauer, Alexandra C; Tauhardt, Lutz; Wendler, Felix; Kempe, Kristian; Gottschaldt, Michael; Traeger, Anja; Schubert, Ulrich S

    2015-03-01

    To date, cationic polymers with high transfection efficiencies (TE) often have a high cytotoxicity. By screening an 18-membered library of cationic 2-oxazoline-based polymers, a polymer with similar TE as linear poly(ethylene imine) but no detectable cytotoxicity at the investigated concentrations could be identified. The influence of the polymer side chain hydrophobicity and the type and content of amino groups on the pDNA condensation, the TE, the cytotoxicity, the cellular membrane interaction as well as the size, charge, and stability of the polyplexes was studied. Primary amines and an amine content of at least 40% were required for an efficient TE. While polymers with short side chains were non-toxic up to an amine content of 40%, long hydrophobic side chains induced a high cytotoxicity. PMID:25403084

  19. An improved approach to identify irradiated dog feed by electron paramagnetic resonance study and thermoluminescence measurements

    NASA Astrophysics Data System (ADS)

    Sanyal, Bhaskar; Chawla, S. P.; Sharma, Arun

    2011-05-01

    In the present study, probably for the first time, a detailed analysis of the radiation induced radical species and thermoluminescence measurements of irradiated dog feed are reported. The EPR spectrum of non-irradiated ready-to-eat dog feed was characterized by singlet g=2.0047±0.0003. Irradiated samples exhibited a complex EPR spectrum. During high power (50.0 mW) EPR spectroscopy, a visible change in the shape of the EPR spectrum was observed and characterized by EPR spectrum simulation technique. An axially symmetric anisotropic signal with g║=2.0028 and g┴=1.9976 was identified. However, a negligible change in the matrix of irradiated edible dog chew was observed using EPR spectroscopy. Therefore, thermoluminescence study of the isolated minerals from dog chew was carried out. The composition of the poly-minerals was studied using SEM and EDX analysis and a complete verdict on identification of irradiation is proposed.

  20. An integrated approach towards identifying age-related mechanisms of slip initiated falls

    PubMed Central

    Lockhart, Thurmon E.

    2008-01-01

    The causes of slip and fall accidents, both in terms of extrinsic and intrinsic factors and their associations are not yet fully understood. Successful intervention solutions for reducing slip and fall accidents require a more complete understanding of the mechanisms involved. Before effective fall prevention strategies can be put into practice, it is central to examine the chain of events in an accident, comprising the exposure to hazards, initiation of events and the final outcome leading to injury and disability. These events can be effectively identified and analyzed by applying epidemiological, psychophysical, biomechanical and tribological research principles and methodologies. In this manuscript, various methods available to examine fall accidents and their underlying mechanisms are presented to provide a comprehensive array of information to help pinpoint the needs and requirements of new interventions aimed at reducing the risk of falls among the growing elderly population. PMID:17768070

  1. A synthetic biology approach identifies the mammalian UPR RNA ligase RtcB

    PubMed Central

    Lu, Yanyan; Liang, Feng-Xia; Wang, Xiaozhong

    2014-01-01

    Summary: Signaling in the ancestral branch of the unfolded protein response (UPR) is initiated by unconventional splicing of HAC1/XBP1 mRNA during endoplasmic reticulum (ER) stress. In mammals, IRE1α has been known to cleave the XBP1 intron. However, the enzyme responsible for ligation of two XBP1 exons remains unknown. Using an XBP1 splicing-based synthetic circuit, we identify RtcB as the primary UPR RNA ligase. In RtcB knockout cells, XBP1 mRNA splicing is defective during ER stress. Genetic rescue and in vitro splicing show that the RNA ligase activity of RtcB is directly required for the splicing of XBP1 mRNA. Taken together, these data demonstrate that RtcB is the long-sought RNA ligase that catalyzes unconventional RNA splicing during the mammalian UPR. PMID:25087875

  2. One Health approach to identify research needs in bovine and human babesioses: workshop report

    PubMed Central

    2010-01-01

    Background: Babesia are emerging health threats to humans and animals in the United States. A collaborative effort of multiple disciplines to attain optimal health for people, animals and our environment, otherwise known as the One Health concept, was taken during a research workshop held in April 2009 to identify gaps in scientific knowledge regarding babesioses. The impetus for this analysis was the increased risk for outbreaks of bovine babesiosis, also known as Texas cattle fever, associated with the re-infestation of the U.S. by cattle fever ticks. Results: The involvement of wildlife in the ecology of cattle fever ticks jeopardizes the ability of state and federal agencies to keep the national herd free of Texas cattle fever. Similarly, there has been a progressive increase in the number of cases of human babesiosis over the past 25 years due to an increase in the white-tailed deer population. Human babesiosis due to cattle-associated Babesia divergens and Babesia divergens-like organisms has begun to appear in residents of the United States. Research needs for human and bovine babesioses were identified and are presented herein. Conclusions: The translation of this research is expected to provide veterinary and public health systems with the tools to mitigate the impact of bovine and human babesioses. However, economic, political, and social commitments are urgently required, including increased national funding for animal and human Babesia research, to prevent the re-establishment of cattle fever ticks and the increasing problem of human babesiosis in the United States. PMID:20377902

  3. Genomic Approach to Identifying the Putative Target of and Mechanisms of Resistance to Mefloquine in Mycobacteria

    PubMed Central

    Danelishvili, Lia; Wu, Martin; Young, Lowell S.; Bermudez, Luiz E.

    2005-01-01

    The emergence of mycobacterial resistance to multiple antimicrobials emphasizes the need for new compounds. The antimycobacterial activity of mefloquine has been recently described. Mycobacterium avium, Mycobacterium smegmatis, and Mycobacterium tuberculosis are susceptible to mefloquine in vitro, and activity was evidenced in vivo against M. avium. Attempts to obtain resistant mutants by both in vitro and in vivo selection have failed. To identify mycobacterial genes regulated in response to mefloquine, we employed DNA microarray and green fluorescent protein (GFP) promoter library techniques. Following mefloquine treatment, RNA was harvested from M. tuberculosis H37Rv, labeled with 32P, and hybridized against a DNA array. Exposure to 4× MIC resulted in a significant stress response, while exposure to a subinhibitory concentration of mefloquine triggered the expression of genes coding for enzymes involved in fatty acid synthesis, the metabolic pathway, and transport across the membrane and other proteins of unknown function. Evaluation of gene expression using an M. avium GFP promoter library exposed to subinhibitory concentrations of mefloquine revealed more than threefold upregulation of 24 genes. To complement the microarray results, we constructed an M. avium genomic library under the control of a strong sigma-70 (G13) promoter in M. smegmatis. Resistant clones were selected in 32 μg/ml of mefloquine (wild-type M. avium, M. tuberculosis, and M. smegmatis are inhibited by 8 μg/ml), and the M. avium genes associated with M. smegmatis resistant to mefloquine were sequenced. Two groups of genes were identified: one affecting membrane transport and one gene that apparently is involved in regulation of cellular replication. PMID:16127044

  4. Renewed mining and reclamation: Impacts on bats and potential mitigation

    SciTech Connect

    Brown, P.E.; Berry, R.D.

    1997-12-31

    Historic mining created new roosting habitat for many bat species. Now the same industry has the potential to adversely impact bats. Contemporary mining operations usually occur in historic districts; consequently the old workings are destroyed by open pit operations. Occasionally, underground techniques are employed, resulting in the enlargement or destruction of the original workings. Even during exploratory operations, historic mine openings can be covered as drill roads are bulldozed, or drills can penetrate and collapse underground workings. Nearby blasting associated with mine construction and operation can disrupt roosting bats. Bats can also be disturbed by the entry of mine personnel to collect ore samples or by recreational mine explorers, since the creation of roads often results in easier access. In addition to roost disturbance, other aspects of renewed mining can have adverse impacts on bat populations, and affect even those bats that do not live in mines. Open cyanide ponds, or other water in which toxic chemicals accumulate, can poison bats and other wildlife. The creation of the pits, roads and processing areas often destroys critical foraging habitat or changes drainage patterns. Finally, at the completion of mining, any historic mines still open may be sealed as part of closure and reclamation activities. The net result can be a loss of bats and bat habitat. Conversely, in some contemporary underground operations, future roosting habitat for bats can be fabricated. An experimental approach to the creation of new roosting habitat is to bury culverts or old tires beneath waste rock. Mining companies can mitigate for impacts to bats by surveying to identify bat-roosting habitat, removing bats prior to renewed mining or closure, protecting non-impacted roost sites with gates and fences, researching to identify habitat requirements and creating new artificial roosts.

  5. Targeted approach to identify genetic loci associated with evolved dioxin tolerance in Atlantic Killifish (Fundulus heteroclitus)

    PubMed Central

    2014-01-01

    Background: The most toxic aromatic hydrocarbon pollutants are categorized as dioxin-like compounds (DLCs) to which extreme tolerance has evolved independently and contemporaneously in (at least) four populations of Atlantic killifish (Fundulus heteroclitus). Surprisingly, the magnitude and phenotype of DLC tolerance is similar among these killifish populations that have adapted to varied, but highly aromatic hydrocarbon-contaminated urban/industrialized estuaries of the US Atlantic coast. Multiple tolerant and neighboring sensitive killifish populations were compared with the expectation that genetic loci associated with DLC tolerance would be revealed. Results: Since the aryl hydrocarbon receptor (AHR) pathway partly or fully mediates DLC toxicity in vertebrates, single nucleotide polymorphisms (SNPs) from 42 genes associated with the AHR pathway were identified to serve as targeted markers. Wild fish (N = 36/37) from four highly tolerant killifish populations and four nearby sensitive populations were genotyped using 59 SNP markers. Similar to other killifish population genetic analyses, strong genetic differentiation among populations was detected, consistent with isolation by distance models. When DLC-sensitive populations were pooled and compared to pooled DLC-tolerant populations, multi-locus analyses did not distinguish the two groups. However, pairwise comparisons of nearby tolerant and sensitive populations revealed high differentiation among sensitive and tolerant populations at these specific loci: AHR 1 and 2, cathepsin Z, the cytochrome P450s (CYP1A and 3A30), and the NADH dehydrogenase subunits. In addition, significant shifts in minor allele frequency were observed at AHR2 and CYP1A loci across most sensitive/tolerant pairs, but only AHR2 exhibited shifts in the same direction across all pairs. Conclusions: The observed differences in allelic composition at the AHR2 and CYP1A SNP loci were identified as significant among paired sensitive

  6. Profiling Animal Toxicants by Automatically Mining Public Bioassay Data: A Big Data Approach for Computational Toxicology

    PubMed Central

    Zhang, Jun; Hsieh, Jui-Hua; Zhu, Hao

    2014-01-01

    In vitro bioassays have been developed and are currently being evaluated as potential alternatives to traditional animal toxicity models. Already, the progress of high throughput screening techniques has resulted in an enormous amount of publicly available bioassay data having been generated for a large collection of compounds. When a compound is tested using a collection of various bioassays, all the testing results can be considered as providing a unique bio-profile for this compound, which records the responses induced when the compound interacts with different cellular systems or biological targets. Profiling compounds of environmental or pharmaceutical interest using useful toxicity bioassay data is a promising method to study complex animal toxicity. In this study, we developed an automatic virtual profiling tool to evaluate potential animal toxicants. First, we automatically acquired all PubChem bioassay data for a set of 4,841 compounds with publicly available rat acute toxicity results. Next, we developed a scoring system to evaluate the relevance between these extracted bioassays and animal acute toxicity. Finally, the top ranked bioassays were selected to profile the compounds of interest. The resulting response profiles proved to be useful to prioritize untested compounds for their animal toxicity potentials and form a potential in vitro toxicity testing panel. The protocol developed in this study could be combined with structure-activity approaches and used to explore additional publicly available bioassay datasets for modeling a broader range of animal toxicities. PMID:24950175

  7. Multi-Omics Approach Identifies Molecular Mechanisms of Plant-Fungus Mycorrhizal Interaction

    PubMed Central

    Larsen, Peter E.; Sreedasyam, Avinash; Trivedi, Geetika; Desai, Shalaka; Dai, Yang; Cseke, Leland J.; Collart, Frank R.

    2016-01-01

    In mycorrhizal symbiosis, plant roots form close, mutually beneficial interactions with soil fungi. Before this mycorrhizal interaction can be established, however, plant roots must be capable of detecting potential beneficial fungal partners and initiating the gene expression patterns necessary to begin symbiosis. To predict plant root-mycorrhizal fungi sensor systems, we analyzed in vitro experiments of Populus tremuloides (aspen tree) and Laccaria bicolor (mycorrhizal fungi) interaction and leveraged over 200 previously published transcriptomic experimental data sets, 159 experimentally validated plant transcription factor binding motifs, and more than 120-thousand experimentally validated protein-protein interactions to generate models of pre-mycorrhizal sensor systems in aspen root. These sensor mechanisms link extracellular signaling molecules with gene regulation through a network comprised of membrane receptors, signal cascade proteins, transcription factors, and transcription factor binding DNA motifs. Modeling predicted four pre-mycorrhizal sensor complexes in aspen that interact with 15 transcription factors to regulate the expression of 1184 genes in response to extracellular signals synthesized by Laccaria. Predicted extracellular signaling molecules include common signaling molecules such as phenylpropanoids, salicylate, and jasmonic acid. This multi-omic computational modeling approach for predicting the complex sensory networks yielded specific, testable biological hypotheses for mycorrhizal interaction signaling compounds, sensor complexes, and mechanisms of gene regulation. PMID:26834754

  8. Biomechanical approaches to identify and quantify injury mechanisms and risk factors in women's artistic gymnastics.

    PubMed

    Bradshaw, Elizabeth J; Hume, Patria A

    2012-09-01

    Targeted injury prevention strategies, based on biomechanical analyses, have the potential to help reduce the incidence and severity of gymnastics injuries. This review outlines the potential benefits of biomechanics research to contribute to injury prevention strategies for women's artistic gymnastics by identification of mechanisms of injury and quantification of the effects of injury risk factors. One hundred and twenty-three articles were retained for review after searching electronic databases using key words, including 'gymnastic', 'biomech*', and 'inj*', and delimiting by language and relevance to the paper aim. Impact load can be measured biomechanically by the use of instrumented equipment (e.g. beatboard), instrumentation on the gymnast (accelerometers), or by landings on force plates. We need further information on injury mechanisms and risk factors in gymnastics and practical methods of monitoring training loads. We have not yet shown, beyond a theoretical approach, how biomechanical analysis of gymnastics can help reduce injury risk through injury prevention interventions. Given the high magnitude of impact load, both acute and accumulative, coaches should monitor impact loads per training session, taking into consideration training quality and quantity such as the control of rotation and the height from which the landings are executed. PMID:23072044

  9. Multi-omics approach identifies molecular mechanisms of plant-fungus mycorrhizal interaction

    DOE PAGESBeta

    Larsen, Peter E.; Sreedasyam, Avinash; Trivedi, Geetika; Desai, Shalaka D.; Dai, Yang; Cseke, Leland; Collart, Frank R.

    2016-01-19

    In mycorrhizal symbiosis, plant roots form close, mutually beneficial interactions with soil fungi. Before this mycorrhizal interaction can be established, however, plant roots must be capable of detecting potential beneficial fungal partners and initiating the gene expression patterns necessary to begin symbiosis. To predict plant root-mycorrhizal fungi sensor systems, we analyzed in vitro experiments of Populus tremuloides (aspen tree) and Laccaria bicolor (mycorrhizal fungi) interaction and leveraged over 200 previously published transcriptomic experimental data sets, 159 experimentally validated plant transcription factor binding motifs, and more than 120-thousand experimentally validated protein-protein interactions to generate models of pre-mycorrhizal sensor systems in aspen root. These sensor mechanisms link extracellular signaling molecules with gene regulation through a network comprised of membrane receptors, signal cascade proteins, transcription factors, and transcription factor binding DNA motifs. Modeling predicted four pre-mycorrhizal sensor complexes in aspen that interact with fifteen transcription factors to regulate the expression of 1184 genes in response to extracellular signals synthesized by Laccaria. Predicted extracellular signaling molecules include common signaling molecules such as phenylpropanoids, salicylate, and jasmonic acid. Lastly, this multi-omic computational modeling approach for predicting the complex sensory networks yielded specific, testable biological hypotheses for mycorrhizal interaction signaling compounds, sensor complexes, and mechanisms of gene regulation.

  10. Identifying student resources in reasoning about entropy and the approach to thermal equilibrium

    NASA Astrophysics Data System (ADS)

    Loverude, Michael

    2015-12-01

    [This paper is part of the Focused Collection on Upper Division Physics Courses.] As part of an ongoing project to examine student learning in upper-division courses in thermal and statistical physics, we have examined student reasoning about entropy and the second law of thermodynamics. We have examined reasoning in terms of heat transfer, entropy maximization, and statistical treatments of multiplicity and probability. In this paper, we describe student responses in interviews focused on the approach of macroscopic systems to thermal equilibrium. Our data suggest that students do not use a single simple model of entropy, but rather use a variety of conceptual resources. Individual students frequently shifted between resources, in some cases leading to contradictory predictions. Among the resources that students employed were some that have been previously described in the literature, including inappropriate use of conservation. However, our results suggest that student use of resources connected to disorder is neither simple nor monolithic. For example, many students used a previously unreported association between the equilibrium state of a system and an increase in order, rather than disorder.

  11. A harmonised approach for identifying core foods for total diet studies.

    PubMed

    Devlin, Niamh F C; McNulty, Breige A; Turrini, Aida; Tlustos, Christina; Hearty, Aine P; Volatier, Jean-Luc; Kelleher, Cecily C; Nugent, Anne P

    2014-01-01

    Total diet studies (TDS) are used to gather information on chemical substances in food, thereby facilitating risk assessments and health monitoring. Candidate foods for inclusion in a TDS should represent a large part of a typical diet to estimate accurately the exposure of a population and/or specific population groups. There are currently no harmonised guidelines for the selection of foods in a TDS, and so the aim of this study was to explore the possibility of generating a harmonised approach to be used across Europe. Summary statistics data from the European Food Safety Authority's (EFSA) Comprehensive Food Consumption Database were used in this research, which provided data from national food consumption surveys in Europe. The chosen methodology for the selection of foods was based on the weight of food consumed and consumer rate. Using the available data, 59 TDS food lists were created, representing over 51 000 people across 17 c