Data mining for health executive decision support: an imperative with a daunting future!
Glover, Saundra; Rivers, Patrick A; Asoh, Derek A; Piper, Crystal N; Murph, Keva
2010-01-01
Summary Data mining is highly profiled. It has the potential to enhance executive information systems. Such enhancement would mean better decision-making by management, which in turn would mean better services for customers. While the future of data mining as technology should be exciting, some are worried about privacy concerns, which make the future of data mining daunting. This paper examines why data mining is highly profiled – the imperative toward data mining, data mining models and processes. Additionally, the paper examines some of the benefits and challenges of using data mining processes within the health-care arena. We cast the future of data mining by highlighting two of the many data mining tools available – one commercial and one freely available. Subsequently, we discuss a number of social and technical factors that may thwart the extensive deployment of data mining, especially when the intent is to know more about the people that organizations have to serve and cast a view of what the future holds for data mining. This component is especially important when attempting to determine the longevity of data mining within health-care organizations. It is hoped that our discussions would be useful to organizations as they engage data mining, strategies for executive information systems and information policy issues. PMID:20150610
Quantification of Operational Risk Using A Data Mining
NASA Technical Reports Server (NTRS)
Perera, J. Sebastian
1999-01-01
What is Data Mining?
- Data Mining is the process of finding actionable information hidden in raw data.
- Data Mining helps find hidden patterns, trends, and important relationships often buried in a sea of data.
- Typically, automated software tools based on advanced statistical analysis and data modeling technology can be utilized to automate the data mining process.
The Hazards of Data Mining in Healthcare.
Househ, Mowafa; Aldosari, Bakheet
2017-01-01
From the mid-1990s, data mining methods have been used to explore and find patterns and relationships in healthcare data. During the 1990s and early 2000s, data mining was a topic of great interest to healthcare researchers, as data mining showed some promise in the use of its predictive techniques to help model the healthcare system and improve the delivery of healthcare services. However, it was soon discovered that mining healthcare data had many challenges relating to the veracity of healthcare data and limitations around predictive modelling, leading to failures of data mining projects. As the Big Data movement has gained momentum over the past few years, there has been a reemergence of interest in the use of data mining techniques and methods to analyze healthcare generated Big Data. Much has been written on the positive impacts of data mining on healthcare practice relating to issues of best practice, fraud detection, chronic disease management, and general healthcare decision making. Little has been written about the limitations and challenges of data mining use in healthcare. In this review paper, we explore some of the limitations and challenges in the use of data mining techniques in healthcare. Our results show that the limitations of data mining in healthcare include reliability of medical data, data sharing between healthcare organizations, and inappropriate modelling leading to inaccurate predictions. We conclude that there are many pitfalls in the use of data mining in healthcare and that more work is needed to show evidence of its utility in facilitating healthcare decision-making for healthcare providers, managers, and policy makers; more evidence is also needed on data mining's overall impact on healthcare services and patient care.
Implications of Emerging Data Mining
NASA Astrophysics Data System (ADS)
Kulathuramaiyer, Narayanan; Maurer, Hermann
Data Mining describes a technology that discovers non-trivial hidden patterns in a large collection of data. Although this technology has a tremendous impact on our lives, the invaluable contributions of this invisible technology often go unnoticed. This paper discusses advances in data mining while focusing on the emerging data mining capability. Such data mining applications perform multidimensional mining on a wide variety of heterogeneous data sources, providing solutions to many unresolved problems. This paper also highlights the advantages and disadvantages arising from the ever-expanding scope of data mining. Data Mining augments human intelligence by equipping us with a wealth of knowledge and by empowering us to perform our daily tasks better. As the mining scope and capacity increase, users and organizations become more willing to compromise privacy. The huge data stores of the 'master miners' allow them to gain deep insights into individual lifestyles and their social and behavioural patterns. The capability to integrate and analyze data, combining business and financial trends with the ability to deterministically track market changes, will drastically affect our lives.
Kharat, Amit T; Singh, Amarjit; Kulkarni, Vilas M; Shah, Digish
2014-01-01
Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist that need to be addressed to ensure the success of data mining. PMID:25024513
Big data mining analysis method based on cloud computing
NASA Astrophysics Data System (ADS)
Cai, Qing Qiu; Cui, Hong Gang; Tang, Hao
2017-08-01
In the era of information explosion, the super-large scale, discreteness, and non-structured or semi-structured nature of big data have gone far beyond what traditional data management can handle. With the arrival of the cloud computing era, cloud computing provides a new technical approach to mining massive data, which can effectively solve the problem that traditional data mining methods cannot adapt to massive data. This paper introduces the meaning and characteristics of cloud computing, analyzes the advantages of using cloud computing technology to realize data mining, designs a mining algorithm for association rules based on the MapReduce parallel processing architecture, and carries out experimental verification. The parallel association rule mining algorithm based on a cloud computing platform can greatly improve the execution speed of data mining.
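Illustrative sketch (not taken from the paper): the abstract does not give the algorithm itself, so the following is only a minimal, single-process Python imitation of how support counting for association rules is commonly split into a map step and a reduce step. The function names `map_phase`, `reduce_phase` and `frequent_itemsets`, and the toy transactions, are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Emit an (itemset, 1) pair for every k-item subset of one transaction."""
    items = sorted(set(transaction))
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset key."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return counts


def frequent_itemsets(transactions, k, min_support):
    """Return the k-itemsets that occur in at least min_support transactions."""
    # On a real cluster the map step would run on separate data partitions;
    # here the transactions are simply processed one after another.
    emitted = [pair for t in transactions for pair in map_phase(t, k)]
    counts = reduce_phase(emitted)
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


# Hypothetical toy data, used only to exercise the functions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
pairs = frequent_itemsets(transactions, k=2, min_support=2)
```

Because each call to `map_phase` depends on a single transaction only, the map step can in principle be distributed freely; only the summation in `reduce_phase` needs the emitted pairs to be grouped by key.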
ERIC Educational Resources Information Center
Schoech, Dick; Quinn, Andrew; Rycraft, Joan R.
2000-01-01
Examines the historical and larger context of data mining and describes data mining processes, techniques, and tools. Illustrates these using a child welfare dataset concerning the employee turnover that is mined, using logistic regression and a Bayesian neural network. Discusses the data mining process, the resulting models, their predictive…
Application and Exploration of Big Data Mining in Clinical Medicine.
Zhang, Yue; Guo, Shu-Li; Han, Li-Na; Li, Tie-Ling
2016-03-20
To review theories and technologies of big data mining and their application in clinical medicine. Literature published in English or Chinese regarding theories and technologies of big data mining and the concrete applications of data mining technology in clinical medicine was obtained from PubMed and Chinese Hospital Knowledge Database from 1975 to 2015. Original articles regarding big data mining theory/technology and big data mining's application in the medical field were selected. This review characterized the basic theories and technologies of big data mining including fuzzy theory, rough set theory, cloud theory, Dempster-Shafer theory, artificial neural network, genetic algorithm, inductive learning theory, Bayesian network, decision tree, pattern recognition, high-performance computing, and statistical analysis. The application of big data mining in clinical medicine was analyzed in the fields of disease risk assessment, clinical decision support, prediction of disease development, guidance of rational use of drugs, medical management, and evidence-based medicine. Big data mining has the potential to play an important role in clinical medicine.
Data-Mining Technologies for Diabetes: A Systematic Review
Marinov, Miroslav; Mosa, Abu Saleh Mohammad; Yoo, Illhoi; Boren, Suzanne Austin
2011-01-01
Background: The objective of this study is to conduct a systematic review of applications of data-mining techniques in the field of diabetes research. Method: We searched the MEDLINE database through PubMed. We initially identified 31 articles by the search, and selected 17 articles representing various data-mining methods used for diabetes research. Our main interest was to identify research goals, diabetes types, data sets, data-mining methods, data-mining software and technologies, and outcomes. Results: The applications of data-mining techniques in the selected articles were useful for extracting valuable knowledge, generating new hypotheses for further scientific research/experimentation, and improving health care for diabetes patients. The results could be used for both scientific research and real-life practice to improve the quality of health care for diabetes patients. Conclusions: Data mining has played an important role in diabetes research. Data mining would be a valuable asset for diabetes researchers because it can unearth hidden knowledge from a huge amount of diabetes-related data. We believe that data mining can significantly help diabetes research and ultimately improve the quality of health care for diabetes patients. PMID:22226277
Data mining in pharma sector: benefits.
Ranjan, Jayanthi
2009-01-01
The amount of data being generated in any sector at present is enormous. The information flow in the pharma industry is huge. Pharma firms are progressing toward increasingly technology-enabled products and services. Data mining, which is knowledge discovery from large sets of data, helps pharma firms to discover patterns for improving the quality of drug discovery and delivery methods. The paper aims to present how data mining is useful in the pharma industry, how its techniques can yield good results in the pharma sector, and how data mining can enhance decision making using pharmaceutical data. This conceptual paper is based on secondary study, research and observations from magazines, reports and notes. The author has listed the types of patterns that can be discovered using data mining in pharma data. The paper shows how data mining is useful in the pharma industry and how its techniques can yield good results in the pharma sector. Although much work can be produced for discovering knowledge in pharma data using data mining, the paper is limited to conceptualizing the ideas and viewpoints at this stage; future work may include applying data mining techniques to pharma data based on primary research using the available well-known data mining tools. Research papers and conceptual papers related to data mining in the pharma industry are rare; this is the motivation for the paper.
The Lure of Statistics in Data Mining
ERIC Educational Resources Information Center
Grover, Lovleen Kumar; Mehra, Rajni
2008-01-01
The field of Data Mining like Statistics concerns itself with "learning from data" or "turning data into information". For statisticians the term "Data mining" has a pejorative meaning. Instead of finding useful patterns in large volumes of data as in the case of Statistics, data mining has the connotation of searching for data to fit preconceived…
Application and Exploration of Big Data Mining in Clinical Medicine
Zhang, Yue; Guo, Shu-Li; Han, Li-Na; Li, Tie-Ling
2016-01-01
Objective: To review theories and technologies of big data mining and their application in clinical medicine. Data Sources: Literature published in English or Chinese regarding theories and technologies of big data mining and the concrete applications of data mining technology in clinical medicine was obtained from PubMed and Chinese Hospital Knowledge Database from 1975 to 2015. Study Selection: Original articles regarding big data mining theory/technology and big data mining's application in the medical field were selected. Results: This review characterized the basic theories and technologies of big data mining including fuzzy theory, rough set theory, cloud theory, Dempster–Shafer theory, artificial neural network, genetic algorithm, inductive learning theory, Bayesian network, decision tree, pattern recognition, high-performance computing, and statistical analysis. The application of big data mining in clinical medicine was analyzed in the fields of disease risk assessment, clinical decision support, prediction of disease development, guidance of rational use of drugs, medical management, and evidence-based medicine. Conclusion: Big data mining has the potential to play an important role in clinical medicine. PMID:26960378
A Survey of Educational Data-Mining Research
ERIC Educational Resources Information Center
Huebner, Richard A.
2013-01-01
Educational data mining (EDM) is an emerging discipline that focuses on applying data mining tools and techniques to educationally related data. The discipline focuses on analyzing educational data to develop models for improving learning experiences and improving institutional effectiveness. A literature review on educational data mining topics…
Data mining for multiagent rules, strategies, and fuzzy decision tree structure
NASA Astrophysics Data System (ADS)
Smith, James F., III; Rhyne, Robert D., II; Fisher, Kristin
2002-03-01
A fuzzy logic based resource manager (RM) has been developed that automatically allocates electronic attack resources in real-time over many dissimilar platforms. Two different data mining algorithms have been developed to determine rules, strategies, and fuzzy decision tree structure. The first data mining algorithm uses a genetic algorithm as a data mining function and is called from an electronic game. The game allows a human expert to play against the resource manager in a simulated battlespace with each of the defending platforms being exclusively directed by the fuzzy resource manager and the attacking platforms being controlled by the human expert or operating autonomously under their own logic. This approach automates the data mining problem. The game automatically creates a database reflecting the domain expert's knowledge. It calls a data mining function, a genetic algorithm, for data mining of the database as required and allows easy evaluation of the information mined in the second step. The criterion for re-optimization is discussed as well as experimental results. Then a second data mining algorithm that uses a genetic program as a data mining function is introduced to automatically discover fuzzy decision tree structures. Finally, a fuzzy decision tree generated through this process is discussed.
Introduction to Agent Mining Interaction and Integration
NASA Astrophysics Data System (ADS)
Cao, Longbing
In recent years, more and more researchers have been involved in research on both agent technology and data mining. A clear disciplinary effort has been made toward removing the boundary between them, that is, toward the interaction and integration of agent technology and data mining. We refer to this new area as agent mining. The marriage of agents and data mining is driven by challenges faced by both communities, and by the need to develop more advanced intelligence, information processing and systems. This chapter presents an overall picture of agent mining from the perspective of positioning it as an emerging area. We summarize the main driving forces, complementary essence, disciplinary framework, applications, case studies, and trends and directions, as well as brief observations on agent-driven data mining, data mining-driven agents, and mutual issues in agent mining. Arguably, we draw the following conclusions: (1) agent mining emerges as a new area in the scientific family, (2) both agent technology and data mining can greatly benefit from agent mining, (3) it is very likely to result in additional advancement in intelligent information processing and systems. However, as a new open area, there are many issues waiting for research and development from theoretical, technological and practical perspectives.
NASA Astrophysics Data System (ADS)
Moyle, Steve
Collaborative Data Mining is a setting where the Data Mining effort is distributed to multiple collaborating agents - human or software. The objective of the collaborative Data Mining effort is to produce solutions to the tackled Data Mining problem which are considered better by some metric, with respect to those solutions that would have been achieved by individual, non-collaborating agents. The solutions require evaluation, comparison, and approaches for combination. Collaboration requires communication, and implies some form of community. The human form of collaboration is a social task. Organizing communities in an effective manner is non-trivial and often requires well defined roles and processes. Data Mining, too, benefits from a standard process. This chapter explores the standard Data Mining process CRISP-DM utilized in a collaborative setting.
Zhao, Yufeng; Xie, Qi; He, Liyun; Liu, Baoyan; Li, Kun; Zhang, Xiang; Bai, Wenjing; Luo, Lin; Jing, Xianghong; Huo, Ruili
2014-10-01
To help researchers select appropriate data mining models to provide better evidence for the clinical practice of Traditional Chinese Medicine (TCM) diagnosis and therapy. Clinical issues based on data mining models were comprehensively summarized from four significant elements of the clinical studies: symptoms, symptom patterns, herbs, and efficacy. Existing problems were further generalized to determine the factors relevant to the performance of data mining models, e.g. data type, samples, parameters, variable labels. Combining these relevant factors, the TCM clinical data features were compared with regard to statistical characteristics and informatics properties. Data models were compared simultaneously from the view of applied conditions and suitable scopes. The main application problems were inconsistent data types and small samples for the data mining models used, which caused inappropriate and even erroneous results. These features, i.e. advantages, disadvantages, satisfied data types, tasks of data mining, and the TCM issues, were summarized and compared. By considering the special features of different data mining models, clinical doctors could select suitable data mining models to resolve TCM problems.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-22
... to make domestic ore resource analyses. Tabulations of volumetric data concerning domestic mining... mined and the resulting marketable product. These data are an indicator of the future mining outlook... the U.S. Geological Survey with domestic production, exploration, and mine development data for...
Collaborative Data Mining Tool for Education
ERIC Educational Resources Information Center
Garcia, Enrique; Romero, Cristobal; Ventura, Sebastian; Gea, Miguel; de Castro, Carlos
2009-01-01
This paper describes a collaborative educational data mining tool based on association rule mining for the continuous improvement of e-learning courses, allowing teachers with similar course profiles to share and score the discovered information. This mining tool is oriented to be used by instructors who are not experts in data mining such that, its…
NASA Astrophysics Data System (ADS)
Nevalainen, Jouni; Kozlovskaya, Elena
2016-04-01
We present results of seismic travel-time tomography applied to microseismic data from the Pyhäsalmi mine, Finland. Data on microseismic events in the mine have been recorded since 2002, when the passive microseismic monitoring network was installed in the mine. Since then, over 130,000 microseismic events have been observed. The first target of our study was to test whether the passive microseismic monitoring data can be used with travel-time tomography. In this data set the source-receiver geometry is based on an uneven distribution of natural and mine-induced events inside and in the vicinity of the mine and hence is a non-ideal one for travel-time tomography. The tomographic inversion procedure was tested with synthetic data and the real source-receiver geometry from the Pyhäsalmi mine, and with the real travel-time data of the first arrivals of P-waves from the microseismic events. The results showed that seismic tomography is capable of revealing differences in seismic velocities in the mine area corresponding to different rock types. For example, the velocity contrast between the ore body and surrounding rock is detectable. The velocity model recovered agrees well with the known geological structures in the mine area. The second target of the study was to apply travel-time tomography to microseismic monitoring data recorded during different time periods in order to track temporal changes in seismic velocities within the mining area as the excavation proceeds. The result shows that such time-lapse travel-time tomography can recover such changes. In order to obtain good ray coverage and good resolution, the time interval for a single tomography round needs to be selected taking into account the number of events and their spatial distribution.
The third target was to compare and analyze mine-induced event locations, seismic tomography results and mining technological data (for example, mine excavation plans) in order to understand the influence of mining technology on mining-induced seismicity. Acknowledgements: This study has been supported by ERDF SEISLAB project and Pyhäsalmi Mine Ltd.
Privacy Preserving Sequential Pattern Mining in Data Stream
NASA Astrophysics Data System (ADS)
Huang, Qin-Hua
Research on privacy-preserving data mining techniques has gained much attention in recent years. For data stream systems, wireless networks and mobile devices, research on the related stream data mining techniques is still at an early stage. In this paper, a data mining algorithm that addresses the privacy-preserving problem in data streams is presented.
76 FR 14637 - State Medicaid Fraud Control Units; Data Mining
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-17
...] State Medicaid Fraud Control Units; Data Mining AGENCY: Office of Inspector General (OIG), HHS. ACTION... and analyzing State Medicaid claims data, known as data mining. To support and modernize MFCU efforts... (FFP) in the costs of defined data mining activities under specified conditions. In addition, we...
Applications of Geomatics in Surface Mining
NASA Astrophysics Data System (ADS)
Blachowski, Jan; Górniak-Zimroz, Justyna; Milczarek, Wojciech; Pactwa, Katarzyna
2017-12-01
In terms of the method of extracting minerals from a deposit, mining can be classified into surface, underground, and borehole mining. Surface mining is a form of mining in which the soil and the rock covering the mineral deposits are removed. Types of surface mining include mainly strip and open-cast methods, as well as quarrying. Tasks associated with surface mining of minerals include: resource estimation and deposit documentation, mine planning and deposit access, mine plant development, extraction of minerals from deposits, mineral and waste processing, and reclamation of former mining grounds. At each stage of mining, geodata describing changes occurring in space during the entire life cycle of a surface mining project should be taken into consideration, i.e. collected, analysed, processed, examined, distributed. These data result from direct (e.g. geodetic) and indirect (i.e. remote or relative) measurements and observations including airborne and satellite methods, geotechnical, geological and hydrogeological data, and data from other types of sensors, e.g. located on mining equipment and infrastructure, mine plans and maps. Management of such vast sources and sets of geodata, as well as of the information resulting from processing, integrated analysis and examination of such data, can be facilitated with geomatic solutions. Geomatics is a discipline of gathering, processing, interpreting, storing and delivering spatially referenced information. Thus, geomatics integrates methods and technologies used for collecting, managing, processing, visualizing and distributing spatial data. In other words, its meaning covers practically every method and tool from spatial data acquisition to distribution. In this work, examples of the application of geomatic solutions in surface mining are presented through representative case studies from various stages of mine operation.
These applications include: prospecting and documenting mineral deposits, assessment of land accessibility for a potential large-scale surface mining project, modelling mineral deposit (granite) management, concept of a system for management of conveyor belt network technical condition, project of a geoinformation system of former mining terrains and objects, and monitoring and control of impact of surface mining on mine surroundings with satellite radar interferometry.
Runtime support for parallelizing data mining algorithms
NASA Astrophysics Data System (ADS)
Jin, Ruoming; Agrawal, Gagan
2002-03-01
With recent technological advances, shared memory parallel machines have become more scalable, and offer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining algorithms. We have developed a series of techniques for parallelization of data mining algorithms, including full replication, full locking, fixed locking, optimized full locking, and cache-sensitive locking. Unlike previous work on shared memory parallelization of specific data mining algorithms, all of our techniques apply to a large number of common data mining algorithms. In addition, we propose a reduction-object based interface for specifying a data mining algorithm. We show how our runtime system can apply any of the techniques we have developed starting from a common specification of the algorithm.
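Illustrative sketch (not the authors' runtime system): the abstract names full replication and a reduction-object interface without giving details, so the Python below only shows the general idea under stated assumptions. Each worker updates a private replica of the reduction object (here a `Counter` of item frequencies) and the replicas are merged afterwards, so no locking is needed during the updates. The names `local_reduction` and `full_replication_count` are hypothetical. Standard CPython threads will not give a real speed-up for this CPU-bound loop; the code is meant to show the structure only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_reduction(chunk):
    """Process one data partition into a private replica of the reduction object."""
    replica = Counter()
    for transaction in chunk:
        for item in set(transaction):
            replica[item] += 1
    return replica


def full_replication_count(transactions, num_workers=2):
    """Count item frequencies with one private replica per worker, then merge."""
    # Give every worker its own slice of the data.
    chunks = [transactions[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        replicas = list(pool.map(local_reduction, chunks))
    # Merge step: combine the private replicas into the final reduction object.
    merged = Counter()
    for replica in replicas:
        merged.update(replica)
    return merged


# Hypothetical toy data, used only to exercise the functions.
data = [["a", "b"], ["a"], ["b", "c"], ["a", "c"]]
item_counts = full_replication_count(data, num_workers=2)
```

The trade-off this sketch makes visible is memory for synchronization: replication avoids locks entirely but needs one copy of the reduction object per worker, which is presumably why locking-based variants are listed as alternatives.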
Improve Data Mining and Knowledge Discovery Through the Use of MatLab
NASA Technical Reports Server (NTRS)
Shaykhian, Gholam Ali; Martin, Dawn (Elliott); Beil, Robert
2011-01-01
Data mining is widely used to mine business, engineering, and scientific data. Data mining uses pattern based queries, searches, or other analyses of one or more electronic databases/datasets in order to discover or locate a predictive pattern or anomaly indicative of system failure, criminal or terrorist activity, etc. There are various algorithms, techniques and methods used to mine data, including neural networks, genetic algorithms, decision trees, nearest neighbor method, rule induction association analysis, slice and dice, segmentation, and clustering. These algorithms, techniques and methods used to detect patterns in a dataset have been used in the development of numerous open source and commercially available products and technology for data mining. Data mining is best realized when latent information in a large quantity of stored data is discovered. No one technique solves all data mining problems; the challenge is to select algorithms or methods appropriate to strengthen data/text mining and trending within given datasets. In recent years, throughout industry, academia and government agencies, thousands of data systems have been designed and tailored to serve specific engineering and business needs. Many of these systems use databases with relational algebra and structured query language to categorize and retrieve data. In these systems, data analyses are limited and require prior explicit knowledge of metadata and database relations; they lack exploratory data mining and discovery of latent information. This presentation introduces MatLab(R) (MATrix LABoratory), an engineering and scientific data analysis tool, to perform data mining. MatLab was originally intended to perform purely numerical calculations (a glorified calculator). Now, in addition to having hundreds of mathematical functions, it is a programming language with hundreds of built-in standard functions and numerous available toolboxes.
MatLab's ease of data processing and visualization and its wide range of built-in functionality and toolboxes make it suitable for numerical computations and simulations as well as for use as a data mining tool. Engineers and scientists can take advantage of the readily available functions/toolboxes to gain wider insight in their respective data mining experiments.
Data Mining for Financial Applications
NASA Astrophysics Data System (ADS)
Kovalerchuk, Boris; Vityaev, Evgenii
This chapter describes Data Mining in finance by discussing financial tasks, specifics of methodologies and techniques in this Data Mining area. It includes time dependence, data selection, forecast horizon, measures of success, quality of patterns, hypothesis evaluation, problem ID, method profile, attribute-based and relational methodologies. The second part of the chapter discusses Data Mining models and practice in finance. It covers use of neural networks in portfolio management, design of interpretable trading rules and discovering money laundering schemes using decision rules and relational Data Mining methodology.
A review of contrast pattern based data mining
NASA Astrophysics Data System (ADS)
Zhu, Shiwei; Ju, Meilong; Yu, Junfeng; Cai, Binlei; Wang, Aiping
2015-07-01
Contrast pattern based data mining is concerned with the mining of patterns and models that contrast two or more datasets. Contrast patterns can describe similarities or differences between the datasets. They represent strong contrast knowledge and have been shown to be very successful for constructing accurate and robust clusters and classifiers. The increasing use of contrast pattern data mining has initiated a great deal of research and development in the field of data mining. This paper gives a comprehensive review of the existing contrast pattern based data mining research. The work is categorized into background and representation, definitions and mining algorithms, contrast pattern based classification, clustering, and other applications, and future research trends. The primary aim of this paper is to serve as a glossary that gives interested researchers an overall picture of current contrast based data mining development and helps them identify potential research directions for future investigation.
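To make the idea of a pattern that contrasts two datasets concrete, the following is a minimal sketch and not a method taken from the paper. It assumes that a contrast pattern can be approximated as an itemset whose support differs between two datasets by at least a threshold. The function names, the `min_diff` threshold, and the toy data are all hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def contrast_patterns(dataset_a, dataset_b, max_size=2, min_diff=0.5):
    """Return itemsets whose support differs by at least min_diff
    between dataset_a and dataset_b (assumed definition, for illustration)."""
    items = sorted(set().union(*dataset_a, *dataset_b))
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            diff = support(candidate, dataset_a) - support(candidate, dataset_b)
            if abs(diff) >= min_diff:
                found.append((combo, round(diff, 3)))
    return found


# Hypothetical toy data: each transaction is a set of items.
group_a = [{"fever", "cough"}, {"fever", "cough", "rash"}, {"fever"}, {"cough", "fever"}]
group_b = [{"rash"}, {"cough"}, {"rash", "cough"}, {"fever", "rash"}]

patterns = contrast_patterns(group_a, group_b)
for itemset, diff in patterns:
    print(itemset, diff)
```

A positive difference marks an itemset that is more common in the first dataset, a negative one an itemset more common in the second. The exhaustive enumeration is only workable for a handful of items.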
Using Data Mining to Teach Applied Statistics and Correlation
ERIC Educational Resources Information Center
Hartnett, Jessica L.
2016-01-01
This article describes two class activities that introduce the concept of data mining and very basic data mining analyses. Assessment data suggest that students learned some of the conceptual basics of data mining, understood some of the ethical concerns related to the practice, and were able to perform correlations via the Statistical Package for…
NASA Astrophysics Data System (ADS)
Thearling, Kurt
Data Mining technology allows marketing organizations to better understand their customers and respond to their needs. This chapter describes how Data Mining can be combined with customer relationship management to help drive improved interactions with customers. An example showing how to use Data Mining to drive customer acquisition activities is presented.
Use of data mining at the Food and Drug Administration.
Duggirala, Hesha J; Tonning, Joseph M; Smith, Ella; Bright, Roselie A; Baker, John D; Ball, Robert; Bell, Carlos; Bright-Ponte, Susan J; Botsis, Taxiarchis; Bouri, Khaled; Boyer, Marc; Burkhart, Keith; Condrey, G Steven; Chen, James J; Chirtel, Stuart; Filice, Ross W; Francis, Henry; Jiang, Hongying; Levine, Jonathan; Martin, David; Oladipo, Taiye; O'Neill, Rene; Palmer, Lee Anne M; Paredes, Antonio; Rochester, George; Sholtes, Deborah; Szarfman, Ana; Wong, Hui-Lee; Xu, Zhiheng; Kass-Hout, Taha
2016-03-01
This article summarizes past and current data mining activities at the United States Food and Drug Administration (FDA). We address data miners in all sectors, anyone interested in the safety of products regulated by the FDA (predominantly medical products, food, veterinary products and nutrition, and tobacco products), and those interested in FDA activities. Topics include routine and developmental data mining activities, short descriptions of mined FDA data, advantages and challenges of data mining at the FDA, and future directions of data mining at the FDA. Published by Oxford University Press on behalf of the American Medical Informatics Association 2015. This work is written by US Government employees and is in the public domain in the US.
Mining algorithm for association rules in big data based on Hadoop
NASA Astrophysics Data System (ADS)
Fu, Chunhua; Wang, Xiaojing; Zhang, Lijun; Qiao, Liying
2018-04-01
To address the problem that traditional association rule mining algorithms cannot meet the efficiency and scalability needs of mining large amounts of data, the FP-Growth algorithm is taken as an example and parallelized on the Hadoop framework with the MapReduce model. On this basis, it is further improved using the transaction reduction method to enhance the algorithm's mining efficiency. The experiment, carried out on a Hadoop cluster, consists of verification of the parallel mining results, a comparison of efficiency between the serial and parallel versions, and an analysis of the relationships between mining time and node number and between mining time and data amount. Experiments show that the parallelized FP-Growth algorithm accurately mines frequent item sets, with better performance and scalability. It better meets the requirements of big data mining and efficiently mines frequent item sets and association rules from large datasets.
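The abstract above combines three ideas: frequent item set counting, a map/reduce split of the work, and transaction reduction. The sketch below illustrates them in a single process. It is not the authors' algorithm and it is not FP-Growth; it swaps in a simpler level-wise count, and the function names and basket data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, size):
    """Emit (itemset, 1) for every size-k subset of one transaction."""
    return [(combo, 1) for combo in combinations(sorted(transaction), size)]


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, min_count, max_size=3):
    """Level-wise frequent itemset counting with transaction reduction."""
    frequent = {}
    current = [set(t) for t in transactions]
    for size in range(1, max_size + 1):
        pairs = [pair for t in current for pair in map_phase(t, size)]
        level = {s: c for s, c in reduce_phase(pairs).items() if c >= min_count}
        if not level:
            break
        frequent.update(level)
        # Transaction reduction: keep only items that still occur in some
        # frequent itemset, then drop transactions that are too short to
        # contain any itemset of the next size.
        alive = {item for itemset in level for item in itemset}
        current = [t & alive for t in current]
        current = [t for t in current if len(t) > size]
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from stored counts."""
    both = tuple(sorted(set(antecedent) | set(consequent)))
    return frequent[both] / frequent[tuple(sorted(antecedent))]


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

frequent = frequent_itemsets(baskets, min_count=3)
print(frequent)
print(confidence(frequent, ("beer",), ("diapers",)))
```

In a real MapReduce job the map calls would run on separate nodes and the reduce step would aggregate their output; here both run in one loop purely to show the data flow.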
Health Terrain: Visualizing Large Scale Health Data
2015-12-01
Text mining; data mining ... text mining algorithms to construct a concept space. A browser-based user interface is developed to ... Public health data, notifiable condition detector, text mining, data mining
Software tool for data mining and its applications
NASA Astrophysics Data System (ADS)
Yang, Jie; Ye, Chenzhou; Chen, Nianyi
2002-03-01
A software tool for data mining is introduced, which integrates pattern recognition (PCA, Fisher, clustering, hyperenvelop, regression), artificial intelligence (knowledge representation, decision trees), statistical learning (rough set, support vector machine), and computational intelligence (neural network, genetic algorithm, fuzzy systems). It consists of nine function models: pattern recognition, decision trees, association rule, fuzzy rule, neural network, genetic algorithm, Hyper Envelop, support vector machine, and visualization. The principle and knowledge representation of some of these function models are described. The software tool is implemented in Visual C++ under Windows 2000. Nonmonotony in data mining is dealt with by concept hierarchy and layered mining. The software tool has been satisfactorily applied to the prediction of regularities in the formation of ternary intermetallic compounds in alloy systems and to the diagnosis of brain glioma.
Data Mining and Knowledge Management in Higher Education -Potential Applications.
ERIC Educational Resources Information Center
Luan, Jing
This paper introduces a new decision support tool, data mining, in the context of knowledge management. The most striking features of data mining techniques are clustering and prediction. The clustering aspect of data mining offers comprehensive characteristics analysis of students, while the predicting function estimates the likelihood for a…
Research on Customer Value Based on Extension Data Mining
NASA Astrophysics Data System (ADS)
Chun-Yan, Yang; Wei-Hua, Li
Extenics is a new discipline for dealing with contradiction problems using formalized models. Extension data mining (EDM) combines Extenics with data mining. It seeks to acquire knowledge based on extension transformations, called extension knowledge (EK), by taking advantage of extension methods and data mining technology. EK includes extensible classification knowledge, conductive knowledge and so on. Extension data mining technology (EDMT) is a new data mining technology that mines EK in databases or data warehouses. Customer value (CV) can weigh the essentiality of the customer relationship for an enterprise, treating the enterprise as a subject of tasting value and customers as objects of tasting value at the same time. CV varies continually. Mining the changing knowledge of CV in databases using EDMT, including quantitative change knowledge and qualitative change knowledge, can provide a foundation on which an enterprise decides its customer relationship management (CRM) strategy. It can also provide a new idea for studying CV.
NASA Astrophysics Data System (ADS)
Barbier, Geoffrey; Liu, Huan
The rise of online social media is providing a wealth of social network data. Data mining techniques provide researchers and practitioners the tools needed to analyze large, complex, and frequently changing social media data. This chapter introduces the basics of data mining, reviews social media, discusses how to mine social media data, and highlights some illustrative examples with an emphasis on social networking sites and blogs.
Combined mining: discovering informative knowledge in complex data.
Cao, Longbing; Zhang, Huaifeng; Zhao, Yanchang; Luo, Dan; Zhang, Chengqi
2011-06-01
Enterprise data mining applications often involve complex data such as multiple large heterogeneous data sources, user preferences, and business impact. In such situations, a single method or one-step mining is often limited in discovering informative knowledge. It would also be very time and space consuming, if not impossible, to join relevant large data sources for mining patterns consisting of multiple aspects of information. It is crucial to develop effective approaches for mining patterns combining necessary information from multiple relevant business lines, catering for real business settings and decision-making actions rather than just providing a single line of patterns. The recent years have seen increasing efforts on mining more informative patterns, e.g., integrating frequent pattern mining with classifications to generate frequent pattern-based classifiers. Rather than presenting a specific algorithm, this paper builds on our existing works and proposes combined mining as a general approach to mining for informative patterns combining components from either multiple data sets or multiple features or by multiple methods on demand. We summarize general frameworks, paradigms, and basic processes for multifeature combined mining, multisource combined mining, and multimethod combined mining. Novel types of combined patterns, such as incremental cluster patterns, can result from such frameworks, which cannot be directly produced by the existing methods. A set of real-world case studies has been conducted to test the frameworks, with some of them briefed in this paper. They identify combined patterns for informing government debt prevention and improving government service objectives, which show the flexibility and instantiation capability of combined mining in discovering informative knowledge in complex data.
A Visualization Tool for Integrating Research Results at an Underground Mine
NASA Astrophysics Data System (ADS)
Boltz, S.; Macdonald, B. D.; Orr, T.; Johnson, W.; Benton, D. J.
2016-12-01
Researchers with the National Institute for Occupational Safety and Health are conducting research at a deep, underground metal mine in Idaho to develop improvements in ground control technologies that reduce the effects of dynamic loading on mine workings, thereby decreasing the risk to miners. This research is multifaceted and includes: photogrammetry, microseismic monitoring, geotechnical instrumentation, and numerical modeling. When managing research involving such a wide range of data, understanding how the data relate to each other and to the mining activity quickly becomes a daunting task. In an effort to combine this diverse research data into a single, easy-to-use system, a three-dimensional visualization tool was developed. The tool was created using the Unity3d video gaming engine and includes the mine development entries, production stopes, important geologic structures, and user-input research data. The tool provides the user with a first-person, interactive experience where they are able to walk through the mine as well as navigate the rock mass surrounding the mine to view and interpret the imported data in the context of the mine and as a function of time. The tool was developed using data from a single mine; however, it is intended to be a generic tool that can be easily extended to other mines. For example, a similar visualization tool is being developed for an underground coal mine in Colorado. The ultimate goal is for NIOSH researchers and mine personnel to be able to use the visualization tool to identify trends that may not otherwise be apparent when viewing the data separately. This presentation highlights the features and capabilities of the mine visualization tool and explains how it may be used to more effectively interpret data and reduce the risk of ground fall hazards to underground miners.
Data Mining of Extremely Large Ad Hoc Data Sets to Produce Inverted Indices
2016-06-01
Naval Postgraduate School, Monterey, California. Master's thesis: Data Mining of Extremely Large Ad Hoc Data Sets to Produce Inverted Indices. Approved for public release; distribution is unlimited.
ERIC Educational Resources Information Center
Winne, Philip H.; Baker, Ryan S. J. D.
2013-01-01
Our article introduces the "Journal of Educational Data Mining's" Special Issue on Educational Data Mining on Motivation, Metacognition, and Self-Regulated Learning. We outline general research challenges for data mining researchers who conduct investigations in these areas, the potential of EDM to advance research in this area, and…
Integration of Text- and Data-Mining Technologies for Use in Banking Applications
NASA Astrophysics Data System (ADS)
Maslankowski, Jacek
Unstructured data, most of it in the form of text files, typically accounts for 85% of an organization's knowledge stores, but it is not always easy to find, access, analyze or use (Robb 2004). That is why solutions based on both text mining and data mining are important. This combined approach is known as duo mining. It leads to improved management based on the knowledge held in an organization. The results are interesting. Data mining deals with structured data, usually fed from data warehouses. Text mining, sometimes called web mining, looks for patterns in unstructured data such as memos, documents and web pages. Integrating text-based information with structured data enriches predictive modeling capabilities and provides new stores of insightful and valuable information for driving business and research initiatives forward.
ERIC Educational Resources Information Center
Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.
2000-01-01
These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)
NASA Astrophysics Data System (ADS)
Lim, J. H.; Yu, J.; Koh, S. M.; Lee, G.
2017-12-01
Mining is a major industry in North Korea, accounting for a significant portion of the country's exports. However, due to its veiled political system, details of mining activities in North Korea are rarely known. This study investigated mining activities at the Rakyeon Au-Ag mine, North Korea, using multi-temporal remote sensing observations. To monitor the mining activities, CORONA data acquired in the 1960s and 1970s, SPOT and Landsat data acquired in the 1980s and 1990s, and KOMPSAT-2 data acquired in the 2010s were used. The results show that mining at the Rakyeon mine continued throughout the observation period, expanding the tailings areas of the mine. However, the expansion rate varied between periods in relation to North Korea's economic and political situation.
Data and Statistics on New York's Mining Resources - NYS Dept. of
New York's Mining Resources. Data and Statistics on New York's Mining Resources and review information about the regulated site. Materials Mined in New York: This site provides information on the various materials mined in New York and the locations where they are extracted. Mined Land
77 FR 15026 - Privacy Act of 1974; Farm Records File (Automated) System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-14
... Mining Project, all program data collected and handled by either RMA or FSA will be treated with the full... data warehouse and data mining operation. RMA will use the information to search or ``mine'' existing... fraud, waste, and abuse. The data mining operation is authorized by the Agricultural Risk Protection Act...
NASA Astrophysics Data System (ADS)
Smith, James F., III; Blank, Joseph A.
2003-03-01
An approach is being explored that involves embedding a fuzzy logic based resource manager in an electronic game environment. Game agents can function under their own autonomous logic or human control. This approach automates the data mining problem. The game automatically creates a cleansed database reflecting the domain expert's knowledge, it calls a data mining function, a genetic algorithm, for data mining of the data base as required and allows easy evaluation of the information extracted. The co-evolutionary fitness functions, chromosomes and stopping criteria for ending the game are discussed. Genetic algorithm and genetic program based data mining procedures are discussed that automatically discover new fuzzy rules and strategies. The strategy tree concept and its relationship to co-evolutionary data mining are examined as well as the associated phase space representation of fuzzy concepts. The overlap of fuzzy concepts in phase space reduces the effective strategies available to adversaries. Co-evolutionary data mining alters the geometric properties of the overlap region known as the admissible region of phase space significantly enhancing the performance of the resource manager. Procedures for validation of the information data mined are discussed and significant experimental results provided.
Proceedings: Fourth Workshop on Mining Scientific Datasets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamath, C
Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the JCDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. 
This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is matched only by the opportunities that await a practitioner.
Data Mining Research with the LSST
NASA Astrophysics Data System (ADS)
Borne, Kirk D.; Strauss, M. A.; Tyson, J. A.
2007-12-01
The LSST catalog database will exceed 10 petabytes, comprising several hundred attributes for 5 billion galaxies, 10 billion stars, and over 1 billion variable sources (optical variables, transients, or moving objects), extracted from over 20,000 square degrees of deep imaging in 5 passbands with thorough time domain coverage: 1000 visits over the 10-year LSST survey lifetime. The opportunities are enormous for novel scientific discoveries within this rich time-domain ultra-deep multi-band survey database. Data Mining, Machine Learning, and Knowledge Discovery research opportunities with the LSST are now under study, with the potential for new collaborations to form and contribute to these investigations. We will describe features of the LSST science database that are amenable to scientific data mining, object classification, outlier identification, anomaly detection, image quality assurance, and survey science validation. We also give some illustrative examples of current scientific data mining research in astronomy, and point out where new research is needed. In particular, the data mining research community will need to address several issues in the coming years as we prepare for the LSST data deluge. The data mining research agenda includes: scalability (at petabyte scales) of existing machine learning and data mining algorithms; development of grid-enabled parallel data mining algorithms; designing a robust system for brokering classifications from the LSST event pipeline (which may produce 10,000 or more event alerts per night); multi-resolution methods for exploration of petascale databases; visual data mining algorithms for visual exploration of the data; indexing of multi-attribute multi-dimensional astronomical databases (beyond RA-Dec spatial indexing) for rapid querying of petabyte databases; and more. 
Finally, we will identify opportunities for synergistic collaboration between the data mining research group and the LSST Data Management and Science Collaboration teams.
Association Rule Mining from an Intelligent Tutor
ERIC Educational Resources Information Center
Dogan, Buket; Camurcu, A. Yilmaz
2008-01-01
Educational data mining is a very novel research area, offering fertile ground for many interesting data mining applications. Educational data mining can extract useful information from educational activities for better understanding and assessment of the student learning process. In this way, it is possible to explore how students learn topics in…
Factors influencing mine rescue team behaviors.
Jansky, Jacqueline H; Kowalski-Trakofler, K M; Brnich, M J; Vaught, C
2016-01-01
A focus group study of the first moments in an underground mine emergency response was conducted by the National Institute for Occupational Safety and Health (NIOSH), Office for Mine Safety and Health Research. Participants in the study included mine rescue team members, team trainers, mine officials, state mining personnel, and individual mine managers. A subset of the data consists of responses from participants with mine rescue backgrounds. These responses were noticeably different from those given by on-site emergency personnel who were at the mine and involved with decisions made during the first moments of an event. As a result, mine rescue team behavior data were separated in the analysis and are reported in this article. By considering the responses from mine rescue team members and trainers, it was possible to sort the data and identify seven key areas of importance to them. On the basis of the responses from the focus group participants with a mine rescue background, the authors concluded that accurate and complete information and a unity of purpose among all command center personnel are two of the key conditions needed for an effective mine rescue operation.
Underground coal mine instrumentation and test
NASA Technical Reports Server (NTRS)
Burchill, R. F.; Waldron, W. D.
1976-01-01
The need to evaluate the mechanical performance of mine tools and to obtain test performance data from candidate systems dictates that an engineering data recording system be built. Because of the wide range of test parameters to be evaluated, a general purpose data gathering system was designed and assembled to permit maximum versatility. A primary objective of this program was to provide a specific operating evaluation of a longwall mining machine's vibration response under normal operating conditions. A number of mines were visited and a candidate for test evaluation was selected, based upon management cooperation, machine suitability, and mine conditions. Actual mine testing took place in a West Virginia mine.
IT Data Mining Tool Uses in Aerospace
NASA Technical Reports Server (NTRS)
Monroe, Gilena A.; Freeman, Kenneth; Jones, Kevin L.
2012-01-01
Data mining has a broad spectrum of uses throughout the realms of aerospace and information technology. Each of these areas has useful methods for processing, distributing, and storing its corresponding data. This paper focuses on ways to leverage the data mining tools and resources used in NASA's information technology area to meet the similar data mining needs of aviation and aerospace domains. This paper details the searching, alerting, reporting, and application functionalities of the Splunk system, used by NASA's Security Operations Center (SOC), and their potential shared solutions to address aircraft and spacecraft flight and ground systems data mining requirements. This paper also touches on capacity and security requirements when addressing sizeable amounts of data across a large data infrastructure.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-05
... safety efforts of MSHA and the mining industry. Accident, injury, and illness data, when correlated with... requested data can be provided in the desired format, reporting burden (time and financial resources) is... provides for uniform information gathering across the mining industry. Section 50.30 requires mine...
A Collaborative Educational Association Rule Mining Tool
ERIC Educational Resources Information Center
Garcia, Enrique; Romero, Cristobal; Ventura, Sebastian; de Castro, Carlos
2011-01-01
This paper describes a collaborative educational data mining tool based on association rule mining for the ongoing improvement of e-learning courses and allowing teachers with similar course profiles to share and score the discovered information. The mining tool is oriented to be used by non-expert instructors in data mining so its internal…
Data Mining: Going beyond Traditional Statistics
ERIC Educational Resources Information Center
Zhao, Chun-Mei; Luan, Jing
2006-01-01
The authors provide an overview of data mining, giving special attention to the relationship between data mining and statistics to unravel some misunderstandings about the two techniques. (Contains 1 figure.)
Large-Scale Constraint-Based Pattern Mining
ERIC Educational Resources Information Center
Zhu, Feida
2009-01-01
We studied the problem of constraint-based pattern mining for three different data formats, item-set, sequence and graph, and focused on mining patterns of large sizes. Colossal patterns in each data format are studied to discover pruning properties that are useful for direct mining of these patterns. For item-set data, we observed robustness of…
The LSST Data Mining Research Agenda
NASA Astrophysics Data System (ADS)
Borne, K.; Becla, J.; Davidson, I.; Szalay, A.; Tyson, J. A.
2008-12-01
We describe features of the LSST science database that are amenable to scientific data mining, object classification, outlier identification, anomaly detection, image quality assurance, and survey science validation. The data mining research agenda includes: scalability (at petabyte scales) of existing machine learning and data mining algorithms; development of grid-enabled parallel data mining algorithms; designing a robust system for brokering classifications from the LSST event pipeline (which may produce 10,000 or more event alerts per night); multi-resolution methods for exploration of petascale databases; indexing of multi-attribute multi-dimensional astronomical databases (beyond spatial indexing) for rapid querying of petabyte databases; and more.
Enhancements for a Dynamic Data Warehousing and Mining System for Large-Scale HSCB Data
2016-04-21
Intelligent Automation Incorporated Enhancements for a Dynamic Data Warehousing and Mining ...Page | 2 Intelligent Automation Incorporated Progress Report No. 1 Enhancements for a Dynamic Data Warehousing and Mining System Large-Scale
Open-source tools for data mining.
Zupan, Blaz; Demsar, Janez
2008-03-01
With a growing volume of biomedical databases and repositories, the need to develop a set of tools to address their analysis and support knowledge discovery is becoming acute. The data mining community has developed a substantial set of techniques for computational treatment of these data. In this article, we discuss the evolution of open-source toolboxes that data mining researchers and enthusiasts have developed over the span of a few decades and review several currently available open-source data mining suites. The approaches we review are diverse in data mining methods and user interfaces and also demonstrate that the field and its tools are ready to be fully exploited in biomedical research.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 42 Public Health 5 2013-10-01 2013-10-01 false Circumstances in which data mining is permissible... CONTROL UNITS § 1007.20 Circumstances in which data mining is permissible and approval by HHS Office of Inspector General. (a) Notwithstanding § 1007.19(e)(2), a MFCU may engage in data mining as defined in this...
Code of Federal Regulations, 2014 CFR
2014-10-01
... 42 Public Health 5 2014-10-01 2014-10-01 false Circumstances in which data mining is permissible... CONTROL UNITS § 1007.20 Circumstances in which data mining is permissible and approval by HHS Office of Inspector General. (a) Notwithstanding § 1007.19(e)(2), a MFCU may engage in data mining as defined in this...
ERIC Educational Resources Information Center
International Educational Data Mining Society, 2012
2012-01-01
The 5th International Conference on Educational Data Mining (EDM 2012) is held in picturesque Chania on the beautiful Crete island in Greece, under the auspices of the International Educational Data Mining Society (IEDMS). The EDM 2012 conference is a leading international forum for high quality research that mines large data sets of educational…
Data Mining Techniques Applied to Hydrogen Lactose Breath Test.
Rubio-Escudero, Cristina; Valverde-Fernández, Justo; Nepomuceno-Chamorro, Isabel; Pontes-Balanza, Beatriz; Hernández-Mendoza, Yoedusvany; Rodríguez-Herrera, Alfonso
2017-01-01
The aims were to analyze a set of hydrogen breath test data using data mining tools and to identify new patterns of H2 production. K-means clustering was applied as the data mining technique to hydrogen breath test data sets from 2571 patients. Six different patterns were extracted upon analysis of the hydrogen breath test data. We have also shown the relevance of each of the samples taken throughout the test. Analysis of the hydrogen breath test data sets using data mining techniques has identified new patterns of hydrogen generation upon lactose absorption. We can see the potential of applying data mining techniques to clinical data sets. These results offer promising data for future research on hydrogen produced by gut microbiota and its link to clinical symptoms.
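As a rough illustration of how k-means clustering groups breath-test curves into patterns, here is a minimal sketch of Lloyd's algorithm in plain Python. It is not the study's pipeline: the curves, the choice of three clusters, and the initial centroids are all hypothetical, and the study itself reports six patterns from 2571 patients.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the centroids stop changing or max_iter is reached."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(mean_point(members) if members else old)
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical H2 readings (ppm) at four sampling times per patient.
curves = [
    (5, 6, 5, 6),      # flat
    (4, 5, 6, 5),      # flat
    (5, 20, 45, 60),   # late rise
    (6, 25, 50, 65),   # late rise
    (5, 40, 30, 10),   # early peak
    (4, 45, 35, 12),   # early peak
]

labels, centroids = k_means(curves, centroids=[curves[0], curves[2], curves[4]])
print(labels)
print(centroids)
```

The initial centroids are passed in explicitly so that the run is deterministic; in practice they are usually chosen at random or with a seeding heuristic, and the result can depend on that choice.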
Using data mining to segment healthcare markets from patients' preference perspectives.
Liu, Sandra S; Chen, Jie
2009-01-01
This paper aims to provide an example of how to use data mining techniques to identify patient segments regarding preferences for healthcare attributes and their demographic characteristics. Data were derived from a number of individuals who received in-patient care at a health network in 2006. Data mining and conventional hierarchical clustering with average linkage and Pearson correlation procedures are employed and compared to show how each procedure best determines segmentation variables. Data mining tools identified three differentiable segments by means of cluster analysis. These three clusters have significantly different demographic profiles. The study reveals, when compared with traditional statistical methods, that data mining provides an efficient and effective tool for market segmentation. When there are numerous cluster variables involved, researchers and practitioners need to incorporate factor analysis for reducing variables to clearly and meaningfully understand clusters. Interests and applications in data mining are increasing in many businesses. However, this technology is seldom applied to healthcare customer experience management. The paper shows that efficient and effective application of data mining methods can aid the understanding of patient healthcare preferences.
APPLYING DATA MINING APPROACHES TO FURTHER ...
This dataset will be used to illustrate various data mining techniques to biologically profile the chemical space.
Exploring the Integration of Data Mining and Data Visualization
ERIC Educational Resources Information Center
Zhang, Yi
2011-01-01
Due to the rapid advances in computing and sensing technologies, enormous amounts of data are being generated everyday in various applications. The integration of data mining and data visualization has been widely used to analyze these massive and complex data sets to discover hidden patterns. For both data mining and visualization to be…
Constructing and Classifying Email Networks from Raw Forensic Images
2016-09-01
data mining for sequence and pattern mining; in medical imaging for image segmentation; and in computer vision for object recognition” [28]. 2.3.1...machine learning and data mining suite that is written in Python. It provides a platform for experiment selection, recommendation systems, and...predictive modeling. The Orange library is a hierarchically-organized toolbox of data mining components. Data filtering and probability assessment are at the
Big data mining: In-database Oracle data mining over hadoop
NASA Astrophysics Data System (ADS)
Kovacheva, Zlatinka; Naydenova, Ina; Kaloyanova, Kalinka; Markov, Krasimir
2017-07-01
Big data challenges different aspects of storing, processing and managing data, as well as analyzing and using data for business purposes. Applying Data Mining methods over Big Data is another challenge because of huge data volumes, variety of information, and the dynamics of the sources. Different applications are made in this area, but their successful usage depends on understanding many specific parameters. In this paper we present several opportunities for using Data Mining techniques provided by the analytical engine of RDBMS Oracle over data stored in Hadoop Distributed File System (HDFS). Some experimental results are given and discussed.
NASA Astrophysics Data System (ADS)
Ayuningrum, Theresia Vika; Purnaweni, Hartuti
2018-02-01
The potential karst area in Nusakambangan plays an important role in maintaining the balance of nature. Mining activities, however, inevitably change the environmental conditions there. So that resource utilization balances the interests of mining against the sustainability of the environment, every mining activity requires a variety of environmental studies. The purpose of this study is to analyze the environmental management of limestone mining activities in Nusakambangan, in order to identify management of mining areas that is optimal, wise, based on ecological principles, and sustainable. The study uses qualitative research methods, with data analysis by descriptive percentages; the data collected comprise primary and secondary data.
Data Streams: An Overview and Scientific Applications
NASA Astrophysics Data System (ADS)
Aggarwal, Charu C.
In recent years, advances in hardware technology have facilitated the ability to collect data continuously. Simple transactions of everyday life such as using a credit card, a phone, or browsing the web lead to automated data storage. Similarly, advances in information technology have led to large flows of data across IP networks. In many cases, these large volumes of data can be mined for interesting and relevant information in a wide variety of applications. When the volume of the underlying data is very large, it leads to a number of computational and mining challenges: With increasing volume of the data, it is no longer possible to process the data efficiently by using multiple passes. Rather, one can process a data item at most once. This leads to constraints on the implementation of the underlying algorithms. Therefore, stream mining algorithms typically need to be designed so that the algorithms work with one pass of the data. In most cases, there is an inherent temporal component to the stream mining process. This is because the data may evolve over time. This behavior of data streams is referred to as temporal locality. Therefore, a straightforward adaptation of one-pass mining algorithms may not be an effective solution to the task. Stream mining algorithms need to be carefully designed with a clear focus on the evolution of the underlying data. Another important characteristic of data streams is that they are often mined in a distributed fashion. Furthermore, the individual processors may have limited processing and memory. Examples of such cases include sensor networks, in which it may be desirable to perform in-network processing of data streams with limited processing and memory [1, 2]. This chapter will provide an overview of the key challenges in stream mining algorithms which arise from the unique setup in which these problems are encountered. This chapter is organized as follows.
In the next section, we will discuss the generic challenges that stream mining poses to a variety of data management and data mining problems. The next section also deals with several issues which arise in the context of data stream management. In Sect. 3, we discuss several mining algorithms on the data stream model. Section 4 discusses various scientific applications of data streams. Section 5 discusses the research directions and conclusions.
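The one-pass constraint described in this abstract can be made concrete with a small sketch. The example below uses Welford's online algorithm to maintain a running mean and variance while touching each stream item exactly once and keeping constant memory. It is a generic illustration chosen for this listing, not an algorithm taken from the chapter; the class name `RunningStats` and the sample readings are hypothetical.

```python
class RunningStats:
    """Welford's online algorithm: mean and variance in a single pass."""

    def __init__(self):
        self.n = 0        # number of items seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one stream item into the summary; the item is then discarded."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance of everything seen so far (0.0 for fewer than 2 items)."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


def sensor_stream():
    """Stand-in for an unbounded source; yields each reading exactly once."""
    for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield reading


stats = RunningStats()
for value in sensor_stream():
    stats.update(value)

print(stats.n, stats.mean, stats.variance)
```

A summary like this treats all history equally. Handling the evolving data that the abstract emphasizes would need an additional mechanism such as a sliding window or decay factor, which this sketch deliberately leaves out.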
ERIC Educational Resources Information Center
Benoit, Gerald
2002-01-01
Discusses data mining (DM) and knowledge discovery in databases (KDD), taking the view that KDD is the larger view of the entire process, with DM emphasizing the cleaning, warehousing, mining, and visualization of knowledge discovery in databases. Highlights include algorithms; users; the Internet; text mining; and information extraction.…
Data Mining and Complex Problems: Case Study in Composite Materials
NASA Technical Reports Server (NTRS)
Rabelo, Luis; Marin, Mario
2009-01-01
Data mining is defined as the discovery of useful, possibly unexpected, patterns and relationships in data using statistical and non-statistical techniques in order to develop schemes for decision and policy making. Data mining can be used to discover the sources and causes of problems in complex systems. In addition, data mining can support simulation strategies by finding the different constants and parameters to be used in the development of simulation models. This paper introduces a framework for data mining and its application to complex problems. To further explain some of the concepts outlined in this paper, the potential application to the NASA Shuttle Reinforced Carbon-Carbon structures and genetic programming is used as an illustration.
Privacy Preserving Nearest Neighbor Search
NASA Astrophysics Data System (ADS)
Shaneck, Mark; Kim, Yongdae; Kumar, Vipin
Data mining is frequently obstructed by privacy concerns. In many cases data is distributed, and bringing the data together in one place for analysis is not possible due to privacy laws (e.g. HIPAA) or policies. Privacy preserving data mining techniques have been developed to address this issue by providing mechanisms to mine the data while giving certain privacy guarantees. In this chapter we address the issue of privacy preserving nearest neighbor search, which forms the kernel of many data mining applications. To this end, we present a novel algorithm based on secure multiparty computation primitives to compute the nearest neighbors of records in horizontally distributed data. We show how this algorithm can be used in three important data mining algorithms, namely LOF outlier detection, SNN clustering, and kNN classification. We prove the security of these algorithms under the semi-honest adversarial model, and describe methods that can be used to optimize their performance. Keywords: Privacy Preserving Data Mining, Nearest Neighbor Search, Outlier Detection, Clustering, Classification, Secure Multiparty Computation
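For orientation, the sketch below shows the nearest neighbor kernel in its ordinary, non-private form, applied to kNN classification. It assumes all records sit in one place in the clear, which is exactly the situation the chapter's secure multiparty protocol is designed to avoid, so it illustrates only what is being computed and not how the authors protect it. The function name `knn_classify` and the toy records are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Majority vote among the k training records nearest to the query.

    train is a list of (feature_tuple, label) pairs. No privacy protection
    is applied here; every distance is computed on plaintext data.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy records, invented for the example.
train = [
    ((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
    ((8.0, 8.0), "b"), ((9.0, 8.0), "b"), ((8.0, 9.0), "b"),
]

print(knn_classify(train, (2.0, 2.0)))
print(knn_classify(train, (7.0, 9.0)))
```

In the horizontally distributed setting the abstract describes, the records in `train` would be split across parties, and the distance comparisons are the step a privacy preserving protocol has to carry out without revealing the underlying records.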
42 CFR 1007.17 - Annual report.
Code of Federal Regulations, 2013 CFR
2013-10-01
... those MFCUs approved to conduct data mining under § 1007.20, all costs expended that year by the MFCU attributed to data mining activities; the amount of staff time devoted to data mining activities; the number...
42 CFR 1007.17 - Annual report.
Code of Federal Regulations, 2014 CFR
2014-10-01
... those MFCUs approved to conduct data mining under § 1007.20, all costs expended that year by the MFCU attributed to data mining activities; the amount of staff time devoted to data mining activities; the number...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zvi H. Meiksin
A temporary installation of Transtek's in-mine communications system in the Lake Lynn mine was used in the mine rescue training programs offered by NIOSH in April and May 2002. We developed and implemented a software program that permits point-to-point data transmission through our in-mine system. We also developed a wireless data transceiver for use in a PLC (programmable logic controller) to remotely control long-wall mining equipment.
Randomization Based Privacy Preserving Categorical Data Analysis
ERIC Educational Resources Information Center
Guo, Ling
2010-01-01
The success of data mining relies on the availability of high quality data. To ensure quality data mining, effective information sharing between organizations becomes a vital requirement in today's society. Since data mining often involves sensitive information of individuals, the public has expressed a deep concern about their privacy.…
76 FR 51274 - Supplemental Nutrition Assistance Program: Major System Failures
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-18
... data mining as necessary to determine if losses are occurring in the process of issuing benefits. It is... further by using data mining techniques on States' data or analyzing QC data for error patterns that may... conjunction with an additional sample of cases. Data mining techniques may be employed when QC data cannot...
Data mining applications in the context of casemix.
Koh, H C; Leong, S K
2001-07-01
In October 1999, the Singapore Government introduced casemix-based funding to public hospitals. The casemix approach to health care funding is expected to yield significant benefits, including equity and rationality in financing health care, the use of comparative casemix data for quality improvement activities, and the provision of information that enables hospitals to understand their cost behaviour and reinforces the drive for more cost-efficient services. However, there is some concern about the "quicker and sicker" syndrome (that is, the rapid discharge of patients with little regard for the quality of outcome). As it is likely that consequences of premature discharges will be reflected in the readmission data, an analysis of possible systematic patterns in readmission data can provide useful insight into the "quicker and sicker" syndrome. This paper explores potential data mining applications in the context of casemix by using readmission data as an illustration. In particular, it illustrates how data mining can be used to better understand readmission data and to detect systematic patterns, if any. From a technical perspective, data mining (which is capable of analysing complex non-linear and interaction relationships) supplements and complements traditional statistical methods in data analysis. From an applications perspective, data mining provides the technology and methodology to analyse massive volumes of data to detect hidden patterns in data. Using readmission data as an illustrative data mining application, this paper explores potential data mining applications in the general casemix context.
Data Mining: The Art of Automated Knowledge Extraction
NASA Astrophysics Data System (ADS)
Karimabadi, H.; Sipes, T.
2012-12-01
Data mining algorithms are used routinely in a wide variety of fields and they are gaining adoption in the sciences. The realities of real world data analysis are that (a) data has flaws, and (b) the models and assumptions that we bring to the data are inevitably flawed, and/or biased and misspecified in some way. Data mining can improve data analysis by detecting anomalies in the data, checking for consistency of the user model assumptions, and deciphering complex patterns and relationships that would not be possible otherwise. The common form of data collected from in situ spacecraft measurements is multi-variate time series which represents one of the most challenging problems in data mining. We have successfully developed algorithms to deal with such data and have extended the algorithms to handle streaming data. In this talk, we illustrate the utility of our algorithms through several examples including automated detection of reconnection exhausts in the solar wind and flux ropes in the magnetotail. We also show examples from successful applications of our technique to analysis of 3D kinetic simulations. With an eye to the future, we provide an overview of our upcoming plans that include collaborative data mining, expert outsourcing data mining, computer vision for image analysis, among others. Finally, we discuss the integration of data mining algorithms with web-based services such as VxOs and other Heliophysics data centers and the resulting capabilities that it would enable.
Data mining for the e-business: developments and directions
NASA Astrophysics Data System (ADS)
Grasso, Alfred; Sleeper, Harry; Thuraisingham, Bhavani M.; Guo, Yike
2000-04-01
This paper describes data mining and e-business and then shows how data mining may be applied to e-business to gather consumer/supplier intelligence so that targeted marketing and merchandising may be carried out.
Introduction to the mining of clinical data.
Harrison, James H
2008-03-01
The increasing volume of medical data online, including laboratory data, represents a substantial resource that can provide a foundation for improved understanding of disease presentation, response to therapy, and health care delivery processes. Data mining supports these goals by providing a set of techniques designed to discover similarities and relationships between data elements in large data sets. Currently, medical data have several characteristics that increase the difficulty of applying these techniques, although there have been notable medical data mining successes. Future developments in integrated medical data repositories, standardized data representation, and guidelines for the appropriate research use of medical data will decrease the barriers to mining projects.
Mining Land Subsidence Monitoring Using SENTINEL-1 SAR Data
NASA Astrophysics Data System (ADS)
Yuan, W.; Wang, Q.; Fan, J.; Li, H.
2017-09-01
In this paper, the DInSAR technique was used to monitor land subsidence in a mining area. The study area was a coal mine area located in Yuanbaoshan District, Chifeng City, and Sentinel-1 data were used to carry out the DInSAR technique. We analyzed the interferometric results from Sentinel-1 data acquired from December 2015 to May 2016. Comparison of the DInSAR results with the location of the mine on optical images shows that the DInSAR technique can effectively monitor the land subsidence caused by underground mining, and that it is an effective tool for law enforcement against over-mining.
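The core arithmetic behind a DInSAR subsidence estimate is the conversion of unwrapped differential phase into line-of-sight displacement, d = -λ·Δφ / (4π). The sketch below applies that relation using the Sentinel-1 C-band centre frequency of 5.405 GHz. The sign convention (positive phase meaning motion away from the sensor, reported as negative displacement) is an assumption, since processors differ on it, and the function name and sample phase values are hypothetical rather than taken from the paper.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0
SENTINEL1_FREQUENCY_HZ = 5.405e9  # C-band centre frequency
SENTINEL1_WAVELENGTH_M = SPEED_OF_LIGHT_M_S / SENTINEL1_FREQUENCY_HZ


def los_displacement_m(unwrapped_phase_rad, wavelength_m=SENTINEL1_WAVELENGTH_M):
    """Line-of-sight displacement for an unwrapped differential phase.

    The factor 4*pi (rather than 2*pi) accounts for the two-way radar path.
    Sign convention is an assumption: positive phase -> negative displacement.
    """
    return -wavelength_m * unwrapped_phase_rad / (4.0 * math.pi)


# One full fringe (2*pi) corresponds to half a wavelength of motion.
one_fringe_m = abs(los_displacement_m(2.0 * math.pi))
print(f"wavelength: {SENTINEL1_WAVELENGTH_M * 100:.2f} cm")
print(f"one fringe: {one_fringe_m * 100:.2f} cm")

# Illustrative phase values along a profile, invented for the example.
for phase in (0.0, 1.5, 3.0, 6.0):
    print(f"{phase:4.1f} rad -> {los_displacement_m(phase) * 1000:7.2f} mm")
```

The result is motion along the radar line of sight. Turning it into vertical subsidence requires the incidence angle and an assumption about horizontal motion, which this sketch does not attempt.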
A Data Preparation Methodology in Data Mining Applied to Mortality Population Databases.
Pérez, Joaquín; Iturbide, Emmanuel; Olivares, Víctor; Hidalgo, Miguel; Martínez, Alicia; Almanza, Nelva
2015-11-01
It is known that the data preparation phase is the most time consuming in the data mining process, using from 50% up to 70% of the total project time. Currently, data mining methodologies are of general purpose and one of their limitations is that they do not provide guidance on which particular tasks to carry out in a specific domain. This paper shows a new data preparation methodology oriented to the epidemiological domain in which we have identified two sets of tasks: General Data Preparation and Specific Data Preparation. For both sets, the Cross-Industry Standard Process for Data Mining (CRISP-DM) is adopted as a guideline. The main contribution of our methodology is fourteen specialized tasks specific to that domain. To validate the proposed methodology, we developed a data mining system and the entire process was applied to real mortality databases. The results were encouraging because it was observed that the use of the methodology reduced some of the time consuming tasks and the data mining system showed findings of unknown and potentially useful patterns for the public health services in Mexico.
Data Mining at NASA: From Theory to Applications
NASA Technical Reports Server (NTRS)
Srivastava, Ashok N.
2009-01-01
This slide presentation, prepared for Knowledge Discovery and Data Mining 2009, demonstrates the data mining/machine learning capabilities of NASA Ames and the Intelligent Data Understanding (IDU) group. It encompasses recent work by various group members. The IDU group develops novel algorithms to detect, classify, and predict events in large data streams for scientific and engineering systems.
Advances in Machine Learning and Data Mining for Astronomy
NASA Astrophysics Data System (ADS)
Way, Michael J.; Scargle, Jeffrey D.; Ali, Kamal M.; Srivastava, Ashok N.
2012-03-01
Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications. With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.
Process Mining Online Assessment Data
ERIC Educational Resources Information Center
Pechenizkiy, Mykola; Trcka, Nikola; Vasilyeva, Ekaterina; van der Aalst, Wil; De Bra, Paul
2009-01-01
Traditional data mining techniques have been extensively applied to find interesting patterns, build descriptive and predictive models from large volumes of data accumulated through the use of different information systems. The results of data mining can be used for getting a better understanding of the underlying educational processes, for…
Data Mining and Homeland Security: An Overview
2006-01-27
which government agencies should use and mix commercial data with government data, whether data sources are being used for purposes other than those...example, a hardware store may compare their customers’ tool purchases with home ownership, type of...cleaning, data integration, data selection, data transformation, (data mining), pattern evaluation, and knowledge presentation. A number of advances in
A prototype system based on visual interactive SDM called VGC
NASA Astrophysics Data System (ADS)
Jia, Zelu; Liu, Yaolin; Liu, Yanfang
2009-10-01
In many application domains, data is collected and referenced by its geo-spatial location. Spatial data mining, or the discovery of interesting patterns in such databases, is an important capability in the development of database systems. Spatial data mining has recently emerged from a number of real applications, such as real-estate marketing, urban planning, weather forecasting, medical image analysis, road traffic accident analysis, etc. It demands efficient solutions for many new, expensive, and complicated problems. For spatial data mining of large data sets to be effective, it is also important to include humans in the data exploration process and combine their flexibility, creativity, and general knowledge with the enormous storage capacity and computational power of today's computers. Visual spatial data mining applies human visual perception to the exploration of large data sets. Presenting data in an interactive, graphical form often fosters new insights, encouraging the formation and validation of new hypotheses to the end of better problem-solving and gaining deeper domain knowledge. In this paper, a visual interactive spatial data mining prototype system (visual geo-classify) based on VC++6.0 and MapObject2.0 is designed and developed. The basic spatial data mining algorithms used are decision trees and Bayesian networks, and data classification is realized through training, learning, and the integration of the two. The result indicates that it is a practical and extensible visual interactive spatial data mining tool.
Analysis of Mining Terrain Deformation Characteristics with Deformation Information System
NASA Astrophysics Data System (ADS)
Blachowski, Jan; Milczarek, Wojciech; Grzempowski, Piotr
2014-05-01
Mapping and prediction of mining related deformations of the earth surface is an important measure for minimising threat to surface infrastructure, human population, the environment and safety of the mining operation itself arising from underground extraction of useful minerals. The number of methods and techniques used for monitoring and analysis of mining terrain deformations is wide and increasing with the development of geographical information technologies. These include for example: terrestrial geodetic measurements, global positioning systems, remote sensing, spatial interpolation, finite element method modelling, GIS based modelling, geological modelling, empirical modelling using the Knothe theory, artificial neural networks, fuzzy logic calculations and others. The aim of this paper is to introduce the concept of an integrated Deformation Information System (DIS) developed in geographic information systems environment for analysis and modelling of various spatial data related to mining activity and demonstrate its applications for mapping and visualising, as well as identifying possible mining terrain deformation areas with various spatial modelling methods. The DIS concept is based on connected modules that include: the spatial database - the core of the system, the spatial data collection module formed by: terrestrial, satellite and remote sensing measurements of the ground changes, the spatial data mining module for data discovery and extraction, the geological modelling module, the spatial data modeling module with data processing algorithms for spatio-temporal analysis and mapping of mining deformations and their characteristics (e.g. deformation parameters: tilt, curvature and horizontal strain), the multivariate spatial data classification module and the visualization module allowing two-dimensional interactive and static mapping and three-dimensional visualizations of mining ground characteristics.
The system's functionality has been presented on the case study of a coal mining region in SW Poland where it has been applied to study characteristics and map mining induced ground deformations in a city in the last two decades of underground coal extraction and in the first decade after the end of mining. The mining subsidence area and its deformation parameters (tilt and curvature) have been calculated and the latter classified and mapped according to the Polish regulations. In addition possible areas of ground deformation have been indicated based on multivariate spatial data analysis of geological and mining operation characteristics with the geographically weighted regression method.
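Two of the deformation parameters named in this abstract, tilt and curvature, are the first and second spatial derivatives of the subsidence profile. The sketch below estimates both with central finite differences on evenly spaced points. It is a generic numerical illustration and not the DIS implementation; the function name, the 10 m spacing, and the profile values are hypothetical, and no regulatory class thresholds are applied.

```python
def tilt_and_curvature(subsidence_m, spacing_m):
    """Central-difference tilt (m/m) and curvature (1/m) at interior points.

    subsidence_m holds subsidence values at evenly spaced points along a
    profile line; the two end points are skipped because they lack a
    neighbour on one side.
    """
    w = subsidence_m
    tilt = [
        (w[i + 1] - w[i - 1]) / (2.0 * spacing_m)
        for i in range(1, len(w) - 1)
    ]
    curvature = [
        (w[i + 1] - 2.0 * w[i] + w[i - 1]) / spacing_m ** 2
        for i in range(1, len(w) - 1)
    ]
    return tilt, curvature


# Illustrative profile sampled every 10 m (values invented for the example).
profile_m = [0.0, 0.1, 0.4, 0.9, 1.6]
tilt, curvature = tilt_and_curvature(profile_m, spacing_m=10.0)

for t, c in zip(tilt, curvature):
    # Tilt is often reported in mm/m, hence the factor of 1000 for display.
    print(f"tilt {t * 1000:6.1f} mm/m   curvature {c:.4f} 1/m")
```

On a gridded subsidence surface the same differences would be taken along both axes. Horizontal strain, the third parameter mentioned, depends on horizontal displacement and is not derived here.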
A systematic review of data mining and machine learning for air pollution epidemiology.
Bellinger, Colin; Mohomed Jabbar, Mohomed Shazan; Zaïane, Osmar; Osornio-Vargas, Alvaro
2017-11-28
Data measuring airborne pollutants, public health and environmental factors are increasingly being stored and merged. These big datasets offer great potential, but also challenge traditional epidemiological methods. This has motivated the exploration of alternative methods to make predictions, find patterns and extract information. To this end, data mining and machine learning algorithms are increasingly being applied to air pollution epidemiology. We conducted a systematic literature review on the application of data mining and machine learning methods in air pollution epidemiology. We carried out our search process in PubMed, the MEDLINE database and Google Scholar. Research articles applying data mining and machine learning methods to air pollution epidemiology were queried and reviewed. Our search queries resulted in 400 research articles. Our fine-grained analysis employed our inclusion/exclusion criteria to reduce the results to 47 articles, which we separate into three primary areas of interest: 1) source apportionment; 2) forecasting/prediction of air pollution/quality or exposure; and 3) generating hypotheses. Early applications had a preference for artificial neural networks. In more recent work, decision trees, support vector machines, k-means clustering and the APRIORI algorithm have been widely applied. Our survey shows that the majority of the research has been conducted in Europe, China and the USA, and that data mining is becoming an increasingly common tool in environmental health. For potential new directions, we have identified that deep learning and geo-spatial pattern mining are two burgeoning areas of data mining that have good potential for future applications in air pollution epidemiology. We carried out a systematic review identifying the current trends, challenges and new directions to explore in the application of data mining methods to air pollution epidemiology.
This work shows that data mining is increasingly being applied in air pollution epidemiology. The potential to support air pollution epidemiology continues to grow with advancements in data mining related to temporal and geo-spatial mining, and deep learning. This is further supported by new sensors and storage mediums that enable larger, better quality data. This suggests that many more fruitful applications can be expected in the future.
ThaleMine: A Warehouse for Arabidopsis Data Integration and Discovery.
Krishnakumar, Vivek; Contrino, Sergio; Cheng, Chia-Yi; Belyaeva, Irina; Ferlanti, Erik S; Miller, Jason R; Vaughn, Matthew W; Micklem, Gos; Town, Christopher D; Chan, Agnes P
2017-01-01
ThaleMine (https://apps.araport.org/thalemine/) is a comprehensive data warehouse that integrates a wide array of genomic information of the model plant Arabidopsis thaliana. The data collection currently includes the latest structural and functional annotation from the Araport11 update, the Col-0 genome sequence, RNA-seq and array expression, co-expression, protein interactions, homologs, pathways, publications, alleles, germplasm and phenotypes. The data are collected from a wide variety of public resources. Users can browse gene-specific data through Gene Report pages, identify and create gene lists based on experiments or indexed keywords, and run GO enrichment analysis to investigate the biological significance of selected gene sets. Developed by the Arabidopsis Information Portal project (Araport, https://www.araport.org/), ThaleMine uses the InterMine software framework, which builds well-structured data, and provides powerful data query and analysis functionality. The warehoused data can be accessed by users via graphical interfaces, as well as programmatically via web-services. Here we describe recent developments in ThaleMine including new features and extensions, and discuss future improvements. InterMine has been broadly adopted by the model organism research community including nematode, rat, mouse, zebrafish, budding yeast, the modENCODE project, as well as being used for human data. ThaleMine is the first InterMine developed for a plant model. As additional new plant InterMines are developed by the legume and other plant research communities, the potential of cross-organism integrative data analysis will be further enabled. © The Author 2016. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Data Mining Web Services for Science Data Repositories
NASA Astrophysics Data System (ADS)
Graves, S.; Ramachandran, R.; Keiser, K.; Maskey, M.; Lynnes, C.; Pham, L.
2006-12-01
The maturation of web services standards and technologies sets the stage for a distributed "Service-Oriented Architecture" (SOA) for NASA's next generation science data processing. This architecture will allow members of the scientific community to create and combine persistent distributed data processing services and make them available to other users over the Internet. NASA has initiated a project to create a suite of specialized data mining web services designed specifically for science data. The project leverages the Algorithm Development and Mining (ADaM) toolkit as its basis. The ADaM toolkit is a robust, mature and freely available science data mining toolkit that is being used by several research organizations and educational institutions worldwide. These mining services will give the scientific community a powerful and versatile data mining capability that can be used to create higher order products such as thematic maps from current and future NASA satellite data records with methods that are not currently available. The package of mining and related services is being developed using Web Services standards so that community-based measurement processing systems can access and interoperate with them. These standards-based services allow users different options for utilizing them, from direct remote invocation by a client application to deployment of a Business Process Execution Language (BPEL) solutions package where a complex data mining workflow is exposed to others as a single service. The ability to deploy and operate these services at a data archive allows the data mining algorithms to be run where the data are stored, a more efficient scenario than moving large amounts of data over the network. This will be demonstrated in a scenario in which a user applies a remote Web-Service-enabled clustering algorithm to create cloud masks from satellite imagery at the Goddard Earth Sciences Data and Information Services Center (GES DISC).
A Note on Interfacing Object Warehouses and Mass Storage Systems for Data Mining Applications
NASA Technical Reports Server (NTRS)
Grossman, Robert L.; Northcutt, Dave
1996-01-01
Data mining is the automatic discovery of patterns, associations, and anomalies in data sets. Data mining requires numerically and statistically intensive queries. Our assumption is that data mining requires a specialized data management infrastructure to support these intensive queries, but because of the sizes of data involved, this infrastructure is layered over a hierarchical storage system. In this paper, we discuss the architecture of a system which is layered for modularity, but exploits specialized lightweight services to maintain efficiency. Rather than use a full-function database, for example, we use lightweight object services specialized for data mining. We propose using information repositories between layers so that components on either side of the layer can access information in the repositories to assist in making decisions about data layout, the caching and migration of data, the scheduling of queries, and related matters.
Mining of high utility-probability sequential patterns from uncertain databases
Zhang, Binbin; Fournier-Viger, Philippe; Li, Ting
2017-01-01
High-utility sequential pattern mining (HUSPM) has become an important issue in the field of data mining. Several HUSPM algorithms have been designed to mine high-utility sequential patterns (HUSPs). They have been applied in several real-life situations such as consumer behavior analysis and event detection in sensor networks. Nonetheless, most studies on HUSPM have focused on mining HUSPs in precise data. In real life, however, uncertainty is an important factor, as data is collected using various types of sensors that are more or less accurate. Hence, data collected in a real-life database can be annotated with existence probabilities. This paper presents a novel pattern mining framework called high utility-probability sequential pattern mining (HUPSPM) for mining high utility-probability sequential patterns (HUPSPs) in uncertain sequence databases. A baseline algorithm with three optional pruning strategies is presented to mine HUPSPs. Moreover, to speed up the mining process, a projection mechanism is designed to create a database projection for each processed sequence, which is smaller than the original database. Thus, the number of unpromising candidates can be greatly reduced, as well as the execution time for mining HUPSPs. Substantial experiments on both real-life and synthetic datasets show that the designed algorithm performs well in terms of runtime, number of candidates, memory usage, and scalability for different minimum utility and minimum probability thresholds. PMID:28742847
Kim, Sung-Min
2018-01-01
Cessation of dewatering following underground mine closure typically results in groundwater rebound, because mine voids and surrounding strata undergo flooding up to the levels of the decant points, such as shafts and drifts. SIMPL (Simplified groundwater program In Mine workings using the Pipe equation and Lumped parameter model), a simplified lumped parameter model-based program for predicting groundwater levels in abandoned mines, is presented herein. The program comprises a simulation engine module, 3D visualization module, and graphical user interface, which aids data processing, analysis, and visualization of results. The 3D viewer facilitates effective visualization of the predicted groundwater level rebound phenomenon together with a topographic map, mine drift, goaf, and geological properties from borehole data. SIMPL is applied to data from the Dongwon coal mine and Dalsung copper mine in Korea, with strong similarities in simulated and observed results. By considering mine workings and interpond connections, SIMPL can thus be used to effectively analyze and visualize groundwater rebound. In addition, the predictions by SIMPL can be utilized to prevent the surrounding environment (water and soil) from being polluted by acid mine drainage. PMID:29747480
Driscoll, Heather E; Murray, Janet M; English, Erika L; Hunter, Timothy C; Pivarski, Kara; Dolci, Elizabeth D
2017-08-01
Here we describe microarray expression data (raw and normalized), experimental metadata, and gene-level data with expression statistics from Saccharomyces cerevisiae exposed to simulated asbestos mine drainage from the Vermont Asbestos Group (VAG) Mine on Belvidere Mountain in northern Vermont, USA. For nearly 100 years (between the late 1890s and 1993), chrysotile asbestos fibers were extracted from serpentinized ultramafic rock at the VAG Mine for use in construction and manufacturing industries. Studies have shown that water courses and streambeds nearby have become contaminated with asbestos mine tailings runoff, including elevated levels of magnesium, nickel, chromium, and arsenic, elevated pH, and chrysotile asbestos-laden mine tailings, due to leaching and gradual erosion of massive piles of mine waste covering approximately 9 km2. We exposed yeast to simulated VAG Mine tailings leachate to help gain insight on how eukaryotic cells exposed to VAG Mine drainage may respond in the mine environment. Affymetrix GeneChip® Yeast Genome 2.0 Arrays were utilized to assess gene expression after 24-h exposure to simulated VAG Mine tailings runoff. The chemistry of mine-tailings leachate, mine-tailings leachate plus yeast extract peptone dextrose media, and control yeast extract peptone dextrose media is also reported. To our knowledge this is the first dataset to assess global gene expression patterns in a eukaryotic model system simulating asbestos mine tailings runoff exposure. Raw and normalized gene expression data are accessible through the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) Database Series GSE89875 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE89875).
NASA Astrophysics Data System (ADS)
Walther, Christian; Frei, Michaela
2017-04-01
Mining of so-called "conflict minerals" is often associated with small-scale mining activities. The activities discussed here are located in forested areas in the eastern DRC, which are often remote, difficult to access and insecure for traditional geological field inspection. In order to accelerate their CTC (Certified Trading Chain) certification process, remote sensing data are used for detection and monitoring of these small-scale mining operations. This requires a high image acquisition frequency because mining sites relocate and because year-round high cloud coverage must be compensated for, especially in optical data evaluation. Freely available medium resolution optical data of Sentinel-2 and Landsat-8 as well as SAR data of Sentinel-1 are used for detecting small mining targets with a minimum size of approximately 0.5 km2. The developed method enables a robust multi-temporal detection of mining sites and monitoring of their spatio-temporal relocations and of environmental changes. Since qualitatively and quantitatively comparable results are generated, the change detection approach followed is objective and transparent and may push the certification process forward.
Wang, Weiqi; Wang, Yanbo Justin; Bañares-Alcántara, René; Coenen, Frans; Cui, Zhanfeng
2009-12-01
In this paper, data mining is used to analyze the data on the differentiation of mammalian Mesenchymal Stem Cells (MSCs), aiming at discovering known and hidden rules governing MSC differentiation, following the establishment of a web-based public database containing experimental data on the MSC proliferation and differentiation. To this effect, a web-based public interactive database comprising the key parameters which influence the fate and destiny of mammalian MSCs has been constructed and analyzed using Classification Association Rule Mining (CARM) as a data-mining technique. The results show that the proposed approach is technically feasible and performs well with respect to the accuracy of (classification) prediction. Key rules mined from the constructed MSC database are consistent with experimental observations, indicating the validity of the method developed and the first step in the application of data mining to the study of MSCs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kniesner, T.J.; Leeth, J.D.
2004-09-15
Using recently assembled data from the Mine Safety and Health Administration (MSHA) we shed new light on the regulatory approach to workplace safety. Because all underground coal mines are inspected quarterly, MSHA regulations will not be ineffective because of infrequent inspections. From over 200 different specifications of dynamic mine safety regressions we select the specification producing the largest MSHA impact. Even using results most favorable to the agency, MSHA is not currently cost effective. Almost 700,000 life years could be gained for typical miners if a quarter of MSHA's enforcement budget were reallocated to other programs (more heart disease screening or defibrillators at worksites).
NASA Astrophysics Data System (ADS)
Wang, Wei; Yang, Jiong
With the rapid growth of computational biology and e-commerce applications, high-dimensional data has become very common. Thus, mining high-dimensional data is an urgent problem of great practical importance. However, there are some unique challenges for mining data of high dimensions, including (1) the curse of dimensionality and, more crucially, (2) the meaningfulness of the similarity measure in the high-dimensional space. In this chapter, we present several state-of-the-art techniques for analyzing high-dimensional data, e.g., frequent pattern mining, clustering, and classification. We will discuss how these methods deal with the challenges of high dimensionality.
The study on privacy preserving data mining for information security
NASA Astrophysics Data System (ADS)
Li, Xiaohui
2012-04-01
Privacy-preserving data mining has developed rapidly in a short time, but it still faces many challenges. First, the level of privacy is defined differently in different fields, so the measures by which privacy-preserving data mining technology protects private information are not the same. Presenting a unified privacy definition and measure is therefore an urgent issue. Second, most research in privacy-preserving data mining is presently confined to theoretical study.
Exploring patterns of epigenetic information with data mining techniques.
Aguiar-Pulido, Vanessa; Seoane, José A; Gestal, Marcos; Dorado, Julián
2013-01-01
Data mining, a part of the Knowledge Discovery in Databases process (KDD), is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information which are mitotically and/or meiotically heritable, determining gene expression and cellular differentiation, as well as cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their life and accumulate with ageing. Both defects, either together or individually, can result in loss of control over cell growth and, thus, in cancer development. Data mining techniques could then be used to extract these patterns. This work reviews some of the most important applications of data mining to epigenetics.
A Quantitative Analysis of Organizational Factors That Relate to Data Mining Success
ERIC Educational Resources Information Center
Huebner, Richard A.
2017-01-01
The ubiquity of data in various forms has fueled the need for advanced data-mining techniques within organizations. The advent of data mining methods used to uncover hidden nuggets of information buried within large data sets has also fueled the need for determining how these unique projects can be successful. There are many challenges associated…
78 FR 25266 - An Assessment of Potential Mining Impacts on Salmon Ecosystems of Bristol Bay, Alaska
Federal Register 2010, 2011, 2012, 2013, 2014
2013-04-30
... information presented in the report, the realistic mining scenario used, the data and information used to... additional data or scientific or technical information about Bristol Bay resources or large-scale mining that... Potential Mining Impacts on Salmon Ecosystems of Bristol Bay, Alaska AGENCY: Environmental Protection Agency...
VLSI implementation of flexible architecture for decision tree classification in data mining
NASA Astrophysics Data System (ADS)
Sharma, K. Venkatesh; Shewandagn, Behailu; Bhukya, Shankar Nayak
2017-07-01
Data mining algorithms have become vital to researchers in science, engineering, medicine, business, search and security domains. In recent years, there has been a tremendous rise in the size of the data being collected and analyzed. Classification is the main challenge faced in data mining. Among the solutions developed for this problem, the most widely accepted is Decision Tree Classification (DTC), which gives high precision while handling very large amounts of data. This paper presents a VLSI implementation of a flexible architecture for Decision Tree Classification in data mining using the C4.5 algorithm.
Sams, James I.; Veloski, Garret
2003-01-01
High-resolution airborne thermal infrared (TIR) imagery data were collected over 90.6 km2 (35 mi2) of remote and rugged terrain in the Kettle Creek and Cooks Run Basins, tributaries of the West Branch of the Susquehanna River in north-central Pennsylvania. The purpose of this investigation was to evaluate the effectiveness of TIR for identifying sources of acid mine drainage (AMD) associated with abandoned coal mines. Coal mining from the late 1800s resulted in many AMD sources from abandoned mines in the area. However, very little detailed mine information was available, particularly on the source locations of AMD sites. Potential AMD sources were extracted from airborne TIR data employing custom image processing algorithms and GIS data analysis. Based on field reconnaissance of 103 TIR anomalies, 53 sites (51%) were classified as AMD. The AMD sources had low pH (<4) and elevated concentrations of iron and aluminum. Of the 53 sites, approximately 26 sites could be correlated with sites previously documented as AMD. The other 27 mine discharges identified in the TIR data were previously undocumented. This paper presents a summary of the procedures used to process the TIR data and extract potential mine drainage sites, methods used for field reconnaissance and verification of TIR data, and a brief summary of water-quality data.
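The field-verification figures in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and the counts are taken directly from the abstract.

```python
# Counts reported in the abstract (variable names are illustrative).
anomalies_checked = 103      # TIR anomalies visited during field reconnaissance
confirmed_amd = 53           # anomalies classified as acid mine drainage (AMD)
previously_documented = 26   # confirmed sites matching earlier AMD records (approximate)

# Sites confirmed in the field that had no earlier documentation.
newly_identified = confirmed_amd - previously_documented

# Share of inspected anomalies that turned out to be AMD.
confirmation_rate = confirmed_amd / anomalies_checked

print(f"Newly identified AMD sites: {newly_identified}")
print(f"Confirmation rate: {confirmation_rate:.0%}")
```

The same calculation applies to any reconnaissance campaign that reports candidate and confirmed site counts.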
Mining large heterogeneous data sets in drug discovery.
Wild, David J
2009-10-01
Increasingly, effective drug discovery involves the searching and data mining of large volumes of information from many sources covering the domains of chemistry, biology and pharmacology amongst others. This has led to a proliferation of databases and data sources relevant to drug discovery. This paper provides a review of the publicly-available large-scale databases relevant to drug discovery, describes the kinds of data mining approaches that can be applied to them and discusses recent work in integrative data mining that looks for associations that span multiple sources, including the use of Semantic Web techniques. The future of mining large data sets for drug discovery requires intelligent, semantic aggregation of information from all of the data sources described in this review, along with the application of advanced methods such as intelligent agents and inference engines in client applications.
Intelligent Information Retrieval and Web Mining Architecture Using SOA
ERIC Educational Resources Information Center
El-Bathy, Naser Ibrahim
2010-01-01
The study of this dissertation provides a solution to a very specific problem instance in the area of data mining, data warehousing, and service-oriented architecture in publishing and newspaper industries. The research question focuses on the integration of data mining and data warehousing. The research problem focuses on the development of…
Schoech, D; Quinn, A; Rycraft, J R
2000-01-01
Data mining is the sifting through of voluminous data to extract knowledge for decision making. This article illustrates the context, concepts, processes, techniques, and tools of data mining, using statistical and neural network analyses on a dataset concerning employee turnover. The resulting models and their predictive capability, advantages and disadvantages, and implications for decision support are highlighted.
Sams, James I.; Veloski, Garret; Ackman, T.E.
2003-01-01
Nighttime high-resolution airborne thermal infrared imagery (TIR) data were collected in the predawn hours during Feb 5-8 and March 11-12, 1999, from a helicopter platform for 72.4 km of the Youghiogheny River, from Connellsville to McKeesport, in southwestern Pennsylvania. The TIR data were used to identify sources of mine drainage from abandoned mines that discharge directly into the Youghiogheny River. Image-processing and geographic information systems (GIS) techniques were used to identify 70 sites within the study area as possible mine drainage sources. The combination of GIS datasets and the airborne TIR data provided a fast and accurate method to target the possible sources. After field reconnaissance, it was determined that 24 of the 70 sites were mine drainage. This paper summarizes: the procedures used to process the TIR data and extract potential mine-drainage sites; methods used for verification of the TIR data; a discussion of factors affecting the TIR data; and a brief summary of water quality.
A methodological toolkit for field assessments of artisanally mined alluvial diamond deposits
Chirico, Peter G.; Malpeli, Katherine C.
2014-01-01
This toolkit provides a standardized checklist of critical issues relevant to artisanal mining-related field research. An integrated sociophysical geographic approach to collecting data at artisanal mine sites is outlined. The implementation and results of a multistakeholder approach to data collection, carried out in the assessment of Guinea’s artisanally mined diamond deposits, also are summarized. This toolkit, based on recent and successful field campaigns in West Africa, has been developed as a reference document to assist other government agencies or organizations in collecting the data necessary for artisanal diamond mining or similar natural resource assessments.
Mining Claim Activity on Federal Land for the Period 1976 through 2003
Causey, J. Douglas
2005-01-01
Previous reports on mining claim records provided information and statistics (number of claims) using data from the U.S. Bureau of Land Management's (BLM) Mining Claim Recordation System. Since that time, BLM converted their mining claim data to the Legacy Rehost 2000 system (LR2000). This report describes a process to extract similar statistical data about mining claims from LR2000 data using different software and procedures than were used in the earlier work. A major difference between this process and the previous work is that every section that has a mining claim record is assigned a value. This is done by proportioning a claim between each section in which it is recorded. Also, the mining claim data in this report includes all BLM records, not just the western states. LR2000 mining claim database tables for the United States were provided by BLM in text format and imported into a Microsoft® Access 2000 database in January 2004. Data from two tables in the BLM LR2000 database were summarized through a series of database queries to determine a number that represents active mining claims in each Public Land Survey (PLS) section for each of the years from 1976 to 2002. For most of the area, spatial databases are also provided. The spatial databases are only configured to work with the statistics provided in the non-spatial data files. They are suitable for geographic information system (GIS)-based regional assessments at a scale of 1:100,000 or smaller (for example, 1:250,000).
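The proportioning step described in this abstract can be sketched as follows. This is a minimal illustration, not the report's actual procedure: it assumes each claim is split equally among the sections in which it is recorded (the abstract does not state the weighting), and the function name, claim identifiers, and section labels are hypothetical.

```python
from collections import defaultdict


def proportion_claims(claim_sections):
    """Assign each PLS section a value by splitting every claim among its sections.

    ASSUMPTION: a claim recorded in k distinct sections contributes 1/k to each.
    The report may use a different weighting.
    """
    totals = defaultdict(float)
    for claim_id, sections in claim_sections.items():
        unique_sections = sorted(set(sections))  # ignore repeated records
        if not unique_sections:
            continue  # a claim with no section record contributes nothing
        share = 1.0 / len(unique_sections)
        for section in unique_sections:
            totals[section] += share
    return dict(totals)


# Hypothetical claims and section labels, for illustration only.
claims = {
    "CLAIM-A": ["T1N R1E S1"],
    "CLAIM-B": ["T1N R1E S1", "T1N R1E S2"],
    "CLAIM-C": ["T1N R1E S2", "T1N R1E S3", "T1N R1E S4", "T1N R1E S5"],
}

section_values = proportion_claims(claims)
for section, value in sorted(section_values.items()):
    print(section, value)
```

Under the equal-split assumption the section values sum to the number of claims, so regional totals computed from sections match a simple claim count.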
Text Mining in Organizational Research
Kobayashi, Vladimer B.; Berkers, Hannah A.; Kismihók, Gábor; Den Hartog, Deanne N.
2017-01-01
Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies. PMID:29881248
Text Mining in Organizational Research.
Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N
2018-07-01
Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.
NASA Astrophysics Data System (ADS)
Gaber, Mohamed Medhat; Zaslavsky, Arkady; Krishnaswamy, Shonali
Data mining is concerned with the process of computationally extracting hidden knowledge structures represented in models and patterns from large data repositories. It is an interdisciplinary field of study that has its roots in databases, statistics, machine learning, and data visualization. Data mining has emerged as a direct outcome of the data explosion that resulted from the success in database and data warehousing technologies over the past two decades (Fayyad, 1997; Fayyad, 1998; Kantardzic, 2003).
Mars, J.C.; Crowley, J.K.
2003-01-01
Remotely sensed hyperspectral and digital elevation data from southeastern Idaho are combined in a new method to assess mine waste contamination. Waste rock from phosphorite mining in the area contains selenium, cadmium, vanadium, and other metals. Toxic concentrations of selenium have been found in plants and soils near some mine waste dumps. Eighteen mine waste dumps and five vegetation cover types in the southeast Idaho phosphate district were mapped by using Airborne Visible-Infrared Imaging Spectrometer (AVIRIS) imagery and field data. The interaction of surface water runoff with mine waste was assessed by registering the AVIRIS results to digital elevation data, enabling determinations of (1) mine dump morphologies, (2) catchment watershed areas above each mine dump, (3) flow directions from the dumps, (4) stream gradients, and (5) the extent of downstream wetlands available for selenium absorption. Watersheds with the most severe selenium contamination, such as the South Maybe Canyon watershed, are associated with mine dumps that have large catchment watershed areas, high stream gradients, a paucity of downstream wetlands, and dump forms that tend to obstruct stream flow. Watersheds associated with low concentrations of dissolved selenium, such as Angus Creek, have mine dumps with small catchment watershed areas, low stream gradients, abundant wetlands vegetation, and less obstructing dump morphologies. © 2002 Elsevier Science Inc. All rights reserved.
Abandoned Uranium Mines (AUM) Site Screening Map Service, 2016, US EPA Region 9
As described in detail in the Five-Year Report, US EPA completed on-the-ground screening of 521 abandoned uranium mine areas. US EPA and the Navajo EPA are using the Comprehensive Database and Atlas to determine which mines should be cleaned up first. US EPA continues to research and identify Potentially Responsible Parties (PRPs) under Superfund to contribute to the costs of cleanup efforts. This US EPA Region 9 web service contains the following map layers: Abandoned Uranium Mines, Priority Mines, Tronox Mines, Navajo Environmental Response Trust Mines, Mines with Enforcement Actions, Superfund AUM Regions, Navajo Nation Administrative Boundaries and Chapter Houses. Mine points have a maximum scale of 1:220,000, while Mine polygons have a minimum scale of 1:220,000. Chapter houses have a minimum scale of 1:200,000. BLM Land Status has a minimum scale of 1:150,000. Full FGDC metadata records for each layer can be found by clicking the layer name at the web service endpoint and viewing the layer description. Data used to create this web service are available for download at https://edg.epa.gov/metadata/catalog/data/data.page. Security Classification: Public. Access Constraints: None. Use Constraints: None. Please check sources, scale, accuracy, currentness and other available information. Please confirm that you are using the most recent copy of both data and metadata. Acknowledgement of the EPA would be appreciated.
Modeling Spatial Dependencies and Semantic Concepts in Data Mining
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vatsavai, Raju
Data mining is the process of discovering new patterns and relationships in large datasets. However, several studies have shown that general data mining techniques often fail to extract meaningful patterns and relationships from spatial data owing to the violation of fundamental geospatial principles. In this tutorial, we introduce basic principles behind explicit modeling of spatial and semantic concepts in data mining. In particular, we focus on modeling these concepts in the widely used classification, clustering, and prediction algorithms. Classification is the process of learning a structure or model (from user given inputs) and applying the known model to the new data. Clustering is the process of discovering groups and structures in the data that are "similar," without applying any known structures in the data. Prediction is the process of finding a function that models (explains) the data with least error. One common assumption among all these methods is that the data is independent and identically distributed. Such assumptions do not hold well in spatial data, where spatial dependency and spatial heterogeneity are the norm. In addition, spatial semantics are often ignored by the data mining algorithms. In this tutorial we cover recent advances in explicit modeling of spatial dependencies and semantic concepts in data mining.
Data Mining and Privacy of Social Network Sites' Users: Implications of the Data Mining Problem.
Al-Saggaf, Yeslam; Islam, Md Zahidul
2015-08-01
This paper explores the potential of data mining as a technique that could be used by malicious data miners to threaten the privacy of social network sites (SNS) users. It applies a data mining algorithm to a real dataset to provide empirically-based evidence of the ease with which characteristics about the SNS users can be discovered and used in a way that could invade their privacy. One major contribution of this article is the application of the decision forest data mining algorithm (SysFor) to the context of SNS. SysFor builds not just a single decision tree but a forest, allowing the exploration of more logic rules from a dataset. One logic rule that SysFor built in this study, for example, revealed that anyone having a profile picture showing just the face or a picture showing a family is less likely to be lonely. Another contribution of this article is the discussion of the implications of the data mining problem for governments, businesses, developers and the SNS users themselves.
Physics Mining of Multi-Source Data Sets
NASA Technical Reports Server (NTRS)
Helly, John; Karimabadi, Homa; Sipes, Tamara
2012-01-01
Powerful new parallel data mining algorithms can produce diagnostic and prognostic numerical models and analyses from observational data. These techniques yield higher-resolution measures than ever before of environmental parameters by fusing synoptic imagery and time-series measurements. These techniques are general and relevant to observational data, including raster, vector, and scalar, and can be applied in all Earth- and environmental science domains. Because they can be highly automated and are parallel, they scale to large spatial domains and are well suited to change and gap detection. This makes it possible to analyze spatial and temporal gaps in information, and facilitates within-mission replanning to optimize the allocation of observational resources. The basis of the innovation is the extension of a recently developed set of algorithms packaged into MineTool to multi-variate time-series data. MineTool is unique in that it automates the various steps of the data mining process, thus making it amenable to autonomous analysis of large data sets. Unlike techniques such as Artificial Neural Nets, which yield a blackbox solution, MineTool's outcome is always an analytical model in parametric form that expresses the output in terms of the input variables. This has the advantage that the derived equation can then be used to gain insight into the physical relevance and relative importance of the parameters and coefficients in the model. This is referred to as physics-mining of data. The capabilities of MineTool are extended to include both supervised and unsupervised algorithms, to handle multi-type data sets, and to run in parallel.
Learner Typologies Development Using OIndex and Data Mining Based Clustering Techniques
ERIC Educational Resources Information Center
Luan, Jing
2004-01-01
This explorative data mining project used a distance-based clustering algorithm to study 3 indicators, called OIndex, of student behavioral data and stabilized at a 6-cluster scenario following an exhaustive explorative study of 4, 5, and 6 cluster scenarios produced by K-Means and TwoStep algorithms. Using principles in data mining, the study…
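The comparison of 4, 5, and 6 cluster scenarios mentioned in this abstract can be sketched with a small K-Means implementation. This is a hedged illustration only: the data are randomly generated stand-ins for three behavioral indicators, not the study's OIndex data, the function and variable names are ours, and the TwoStep algorithm used in the study is not reproduced here.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50, seed=0):
    """Plain K-Means (Lloyd's algorithm); returns labels, centers, inertia."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    # Inertia: total within-cluster sum of squared distances.
    inertia = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, inertia


# SYNTHETIC data: 60 "students", each with 3 hypothetical indicator values.
rng = random.Random(42)
students = [tuple(rng.random() for _ in range(3)) for _ in range(60)]

# Fit the 4-, 5-, and 6-cluster scenarios and report within-cluster spread.
results = {k: kmeans(students, k) for k in (4, 5, 6)}
for k, (labels, centers, inertia) in results.items():
    print(f"k={k}: inertia={inertia:.3f}")
```

In practice the choice among scenarios would rest on interpretability and stability of the clusters as well as on a spread measure such as inertia; the abstract does not say which criteria the study used.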
Data Mining in Course Management Systems: Moodle Case Study and Tutorial
ERIC Educational Resources Information Center
Romero, Cristobal; Ventura, Sebastian; Garcia, Enrique
2008-01-01
Educational data mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from the educational context. This work is a survey of the specific application of data mining in learning management systems and a case study tutorial with the Moodle system. Our objective is to introduce it both…
Distributed data mining on grids: services, tools, and applications.
Cannataro, Mario; Congiusta, Antonio; Pugliese, Andrea; Talia, Domenico; Trunfio, Paolo
2004-12-01
Data mining algorithms are widely used today for the analysis of large corporate and scientific datasets stored in databases and data archives. Industry, science, and commerce fields often need to analyze very large datasets maintained over geographically distributed sites by using the computational power of distributed and parallel systems. The grid can play a significant role in providing an effective computational support for distributed knowledge discovery applications. For the development of data mining applications on grids we designed a system called Knowledge Grid. This paper describes the Knowledge Grid framework and presents the toolset provided by the Knowledge Grid for implementing distributed knowledge discovery. The paper discusses how to design and implement data mining applications by using the Knowledge Grid tools starting from searching grid resources, composing software and data components, and executing the resulting data mining process on a grid. Some performance results are also discussed.
Analyzing Student Inquiry Data Using Process Discovery and Sequence Classification
ERIC Educational Resources Information Center
Emond, Bruno; Buffett, Scott
2015-01-01
This paper reports on results of applying process discovery mining and sequence classification mining techniques to a data set of semi-structured learning activities. The main research objective is to advance educational data mining to model and support self-regulated learning in heterogeneous environments of learning content, activities, and…
ERIC Educational Resources Information Center
Flaherty, Bill
2013-01-01
Data-mining systems provide a variety of opportunities for school district personnel to streamline operations and focus on student achievement. This article describes the value of data mining for school personnel, finance departments, teacher evaluations, and in the classroom. It suggests that much could be learned about district practices if one…
Monitoring strip mining and reclamation with LANDSAT data in Belmont County, Ohio
NASA Technical Reports Server (NTRS)
Witt, R. G.; Schaal, G. M.; Bly, B. G.
1983-01-01
The utility of LANDSAT digital data for mapping and monitoring surface mines in Belmont County, Ohio was investigated. Two data sets from 1976 and 1979 were processed to classify level 1 land covers and three strip mine categories in order to examine change over time and assess reclamation efforts. The two classifications were compared with aerial photographs. Results of the accuracy assessment show that both classifications are approximately 86 per cent correct, and that surface mine change detection (date-to-date comparison) is facilitated by the digital format of LANDSAT data.
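The two quantities mentioned in the abstract above, classification accuracy and date-to-date change, can be sketched as follows. This is a minimal illustration only: the confusion-matrix counts and the tiny class grids are invented, the function names `overall_accuracy` and `changed_cells` are hypothetical, and the study's own assessment procedure is not described in enough detail here to reproduce.

```python
def overall_accuracy(confusion):
    """Fraction of reference samples on the diagonal of a square confusion matrix.

    confusion[i][j] = number of samples of reference class i mapped to class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def changed_cells(map_a, map_b):
    """(row, col) positions whose class label differs between two dates."""
    return [(r, c)
            for r, (row_a, row_b) in enumerate(zip(map_a, map_b))
            for c, (a, b) in enumerate(zip(row_a, row_b))
            if a != b]


# Invented counts for three classes (not taken from the study).
confusion = [
    [50, 5, 5],
    [4, 30, 6],
    [2, 3, 45],
]
print(overall_accuracy(confusion))

# Invented 2 x 3 class grids for an earlier and a later date.
earlier = [[1, 1, 2],
           [3, 3, 2]]
later = [[1, 2, 2],
         [3, 1, 2]]
print(changed_cells(earlier, later))
```

Because both dates are stored as aligned grids of class codes, the comparison reduces to a cell-by-cell inequality test, which is the sense in which a digital format makes change detection straightforward.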
A Data Mining Approach to Identify Sexuality Patterns in a Brazilian University Population.
Waleska Simões, Priscyla; Cesconetto, Samuel; Toniazzo de Abreu, Larissa Letieli; Côrtes de Mattos Garcia, Merisandra; Cassettari Junior, José Márcio; Comunello, Eros; Bisognin Ceretta, Luciane; Aparecida Manenti, Sandra
2015-01-01
This paper presents the profile and experience of sexuality generated from a data mining classification task. We used a database about sexuality and gender violence performed on a university population in southern Brazil. The data mining task identified two relationships between the variables, which enabled the distinction of subgroups that better detail the profile and experience of sexuality. The identification of the relationships between the variables define behavioral models and factors of risk that will help define the algorithms being implemented in the data mining classification task.
Van Landeghem, Sofie; De Bodt, Stefanie; Drebert, Zuzanna J; Inzé, Dirk; Van de Peer, Yves
2013-03-01
Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies.
Using Data Mining to Detect Health Care Fraud and Abuse: A Review of Literature
Joudaki, Hossein; Rashidian, Arash; Minaei-Bidgoli, Behrouz; Mahmoodi, Mahmood; Geraili, Bijan; Nasiri, Mahdi; Arab, Mohammad
2015-01-01
Inappropriate payments by insurance organizations or third party payers occur because of errors, abuse and fraud. The scale of this problem is large enough to make it a priority issue for health systems. Traditional methods of detecting health care fraud and abuse are time-consuming and inefficient. Combining automated methods and statistical knowledge led to the emergence of a new interdisciplinary branch of science named Knowledge Discovery from Databases (KDD). Data mining is at the core of the KDD process. Data mining can help third-party payers such as health insurance organizations to extract useful information from thousands of claims and identify a smaller subset of the claims or claimants for further assessment. We reviewed studies that applied data mining techniques to detect health care fraud and abuse, using supervised and unsupervised data mining approaches. Most available studies have focused on algorithmic data mining without an emphasis on or application to fraud detection efforts in the context of health service provision or health insurance policy. More studies are needed to connect sound and evidence-based diagnosis and treatment approaches toward fraudulent or abusive behaviors. Ultimately, based on available studies, we recommend seven general steps to data mining of health care claims. PMID:25560347
Fitzpatrick, D.J.; Westerfield, P.W.
1990-01-01
An abandoned barite mine in Hot Spring County, Arkansas, has been selected as the location for a proposed gamma-ray and neutrino detector site. As part of the hydrologic evaluation of the site, the U.S. Geological Survey in cooperation with the Arkansas Geological Commission collected hydrologic data at selected locations in the vicinity of the abandoned barite mine. Data collected as part of the project included water quality, pond-elevation, and precipitation data within the abandoned barite mine and flow and water quality data at selected sites in the vicinity of the mine. Water quality samples from within the abandoned mine were collected at three locations in the pond at selected depths. These data included field measurements of specific conductance, pH, water temperature, dissolved oxygen, major ions, and trace metals. Major ion and trace-metal samples were collected at six stream sites, one lake site, and two wastewater pond sites. Pond elevation and precipitation data from within the abandoned barite mine were measured during the period between July 1, 1988 and June 30, 1989. Twelve discharge measurements during the period between June 21, 1988, and June 26, 1989, were collected at six sites in the vicinity of the abandoned barite mine. (USGS)
An application of data mining in district heating substations for improving energy performance
NASA Astrophysics Data System (ADS)
Xue, Puning; Zhou, Zhigang; Chen, Xin; Liu, Jing
2017-11-01
An automatic meter reading system is capable of collecting and storing a huge amount of district heating (DH) data. However, the data obtained are rarely fully utilized. Data mining is a promising technology for discovering potentially interesting knowledge from vast data. This paper applies data mining methods to analyse the massive data for improving the energy performance of a DH substation. The technical approach contains three steps: data selection, cluster analysis and association rule mining (ARM). Data from two heating seasons of a substation are used for a case study. Cluster analysis identifies six distinct heating patterns based on the primary heat of the substation. ARM reveals that secondary pressure difference and secondary flow rate have a strong correlation. Using the discovered rules, a fault occurring in a remote flow meter installed in the secondary network is detected accurately. The application demonstrates that data mining techniques can effectively extract potentially useful knowledge to better understand substation operation strategies and improve substation energy performance.
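Association rule mining, as named in the abstract above, scores a rule "antecedent implies consequent" by how often both sides occur together. The sketch below computes the three standard measures (support, confidence, lift) for one rule over a list of records. The item labels such as `dp_high` and `flow_high` are hypothetical discretised readings invented for illustration, the function name `rule_metrics` is an assumption, and the paper's actual variables, thresholds, and rule-generation algorithm are not given here.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: list of sets of item labels, one set per record.
    """
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)    # records with antecedent
    n_c = sum(1 for t in transactions if consequent <= t)    # records with consequent
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)  # both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical hourly records after discretising two secondary-side readings.
records = [
    {"dp_high", "flow_high"},
    {"dp_high", "flow_high"},
    {"dp_high", "flow_low"},
    {"dp_low", "flow_low"},
    {"dp_low", "flow_low"},
]

support, confidence, lift = rule_metrics(records, {"dp_high"}, {"flow_high"})
print(support, confidence, lift)
```

One plausible way a rule like this supports fault detection is that records violating a normally high-confidence rule (here, the single `dp_high` with `flow_low` record) become candidates for inspection; whether the paper proceeds this way is an assumption.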
NASA Technical Reports Server (NTRS)
Wier, C. E.; Wobber, F. J.; Amato, R. V.; Russell, O. R. (Principal Investigator)
1974-01-01
The author has identified the following significant results. All Skylab 2 imagery received to date has been analyzed manually and data related to fracture analysis and mined land inventories has been summarized on map-overlays. A comparison of the relative utility of the Skylab image products for fracture detection, soil tone/vegetation contrast mapping, and mined land mapping has been completed. Numerous fracture traces were detected on both color and black and white transparencies. Unique fracture trace data which will contribute to the investigator's mining hazards analysis were noted on the EREP imagery; these data could not be detected on ERTS-1 imagery or high altitude aircraft color infrared photography. Stream segments controlled by fractures or joint systems could be identified in more detail than with ERTS-1 imagery of comparable scale. ERTS-1 mine hazards products will be modified to demonstrate the value of this additional data. Skylab images were used successfully to update a mined land map of Indiana made in 1972. Changes in mined area as small as two acres can be identified. As the Energy Crisis increases the demand for coal, such demonstrations of the application of Skylab data to coal resources will take on new importance.
A Review of Financial Accounting Fraud Detection based on Data Mining Techniques
NASA Astrophysics Data System (ADS)
Sharma, Anuj; Kumar Panigrahi, Prabin
2012-02-01
With the upsurge in financial accounting fraud experienced in the current economic scenario, financial accounting fraud detection (FAFD) has become an emerging topic of great importance for academia, research and industry. The failure of organizations' internal auditing systems to identify accounting fraud has led to the use of specialized procedures to detect financial accounting fraud, collectively known as forensic accounting. Data mining techniques are providing great aid in financial accounting fraud detection, since dealing with the large volumes and complexities of financial data is a major challenge for forensic accounting. This paper presents a comprehensive review of the literature on the application of data mining techniques for the detection of financial accounting fraud and proposes a framework for data mining techniques based accounting fraud detection. The systematic and comprehensive literature review of the data mining techniques applicable to financial accounting fraud detection may provide a foundation for future research in this field. The findings of this review show that data mining techniques like logistic models, neural networks, Bayesian belief networks, and decision trees have been applied most extensively to provide primary solutions to the problems inherent in the detection and classification of fraudulent data.
Mining injuries in Serbian underground coal mines -- a 10-year study.
Stojadinović, Saša; Svrkota, Igor; Petrović, Dejan; Denić, Miodrag; Pantović, Radoje; Milić, Vitomir
2012-12-01
Mining, especially underground coal mining, has always been a dangerous occupation. Injuries, unfortunately, even those resulting in death, are one of the major occupational risks that all miners live with. Despite the fact that all workers are aware of the risk, efforts must be and are being made to increase the safety of mines. Injury monitoring and data analysis can provide us with valuable data on the causes of accidents and enable us to establish a correlation between the conditions in the work environment and the number of injuries, which can further lead to proper preventive measures. This article presents the data on the injuries in Serbian coal mines during a 10-year period (2000-2009). The presented results are only part of an ongoing study whose aim is to assess the safety conditions in Serbian coal mines and classify them according to that assessment. Copyright © 2011 Elsevier Ltd. All rights reserved.
HC StratoMineR: A Web-Based Tool for the Rapid Analysis of High-Content Datasets.
Omta, Wienand A; van Heesbeen, Roy G; Pagliero, Romina J; van der Velden, Lieke M; Lelieveld, Daphne; Nellen, Mehdi; Kramer, Maik; Yeong, Marley; Saeidi, Amir M; Medema, Rene H; Spruit, Marco; Brinkkemper, Sjaak; Klumperman, Judith; Egan, David A
2016-10-01
High-content screening (HCS) can generate large multidimensional datasets and when aligned with the appropriate data mining tools, it can yield valuable insights into the mechanism of action of bioactive molecules. However, easy-to-use data mining tools are not widely available, with the result that these datasets are frequently underutilized. Here, we present HC StratoMineR, a web-based tool for high-content data analysis. It is a decision-supportive platform that guides even non-expert users through a high-content data analysis workflow. HC StratoMineR is built by using My Structured Query Language for storage and querying, PHP: Hypertext Preprocessor as the main programming language, and jQuery for additional user interface functionality. R is used for statistical calculations, logic and data visualizations. Furthermore, C++ and graphical processor unit power is diffusely embedded in R by using the rcpp and rpud libraries for operations that are computationally highly intensive. We show that we can use HC StratoMineR for the analysis of multivariate data from a high-content siRNA knock-down screen and a small-molecule screen. It can be used to rapidly filter out undesirable data; to select relevant data; and to perform quality control, data reduction, data exploration, morphological hit picking, and data clustering. Our results demonstrate that HC StratoMineR can be used to functionally categorize HCS hits and, thus, provide valuable information for hit prioritization.
Argue, Denise M.; Kiah, Richard G.; Piatak, Nadine M.; Seal, Robert R.; Hammarstrom, Jane M.; Hathaway, Edward; Coles, James F.
2008-01-01
The data contained in this report are a compilation of selected water- and sediment-quality, aquatic biology, and mine-waste data collected at the Ely Copper Mine Superfund site in Vershire, VT, from August 1998 through May 2007. The Ely Copper Mine Superfund site is in eastern, central Vermont (fig. 1) within the Vermont Copper Belt (Hammarstrom and others, 2001). The Ely Copper Mine site was placed on the U.S. Environmental Protection Agency (USEPA) National Priorities List in 2001. Previous investigations conducted at the site documented that the mine is contributing metals and highly acidic waters to local streams (Hammarstrom and others, 2001; Holmes and others, 2002; Piatak and others, 2003, 2004, and 2006). The U.S. Geological Survey (USGS), in cooperation with the USEPA, compiled selected data from previous investigations into uniform datasets that will be used to help characterize the extent of contamination at the mine. The data may be used to determine the magnitude of biological impacts from the contamination and in the development of remediation activities. This report contains analytical data for samples collected from 98 stream locations, 6 pond locations, 21 surface-water seeps, and 29 mine-waste locations. The 98 stream locations are within 3 streams and their tributaries. Ely Brook flows directly through the Ely Copper Mine then into Schoolhouse Brook (fig. 2), which joins the Ompompanoosuc River (fig. 1). The six pond locations are along Ely Brook Tributary 2 (fig. 2). The surface-water seeps and mine-waste locations are near the headwaters of Ely Brook (fig. 2 and fig. 3). The datasets 'Site_Directory' and 'Coordinates' contain specific information about each of the sample locations including stream name, number of meters from the mouth of stream, geographic coordinates, types of samples collected (matrix of sample), and the figure on which the sample location is depicted. 
Data have been collected at the Ely Copper Mine Superfund site by the USEPA, the Vermont Department of Environmental Conservation (VTDEC), and the USGS. Data also have been collected on behalf of USEPA by the following agencies: Arthur D. Little Incorporated (ADL), U.S. Army Cold Region Research and Engineering Laboratory (CRREL), URS Corporation (URS), USEPA, and USGS. These data provide information about the aquatic communities and their habitats, including chemical analyses of surface water, pore water, sediments, and fish tissue; assessments of macroinvertebrate and fish assemblages; physical characteristics of sediments; and chemical analyses of soil and soil leachate collected in and around the piles of mine waste.
Enhancements for a Dynamic Data Warehousing and Mining System for Large-scale HSCB Data
2016-08-29
Intelligent Automation Incorporated. Enhancements for a Dynamic Data Warehousing and Mining System for Large-scale HSCB Data. Monthly Report No. 5. Reporting Period: July 20, 2016 – Aug 19, 2016. Contract No. N00014-16-P-3014
Monitoring genotoxic exposure in uranium mines.
Srám, R J; Dobiás, L; Rössner, P; Veselá, D; Veselý, D; Rakusová, R; Rericha, V
1993-01-01
Recent data from deep uranium mines in Czechoslovakia indicated that miners are exposed to other mutagenic factors in addition to radon daughter products. Mycotoxins were identified as a possible source of mutagens in these mines. Mycotoxins were examined in 38 samples from mines and in throat swabs taken from 116 miners and 78 controls. The following mycotoxins were identified from mine samples: aflatoxins B1 and G1, citrinin, citreoviridin, mycophenolic acid, and sterigmatocystin. Some mold strains isolated from mines and throat swabs were investigated for mutagenic activity by the SOS chromotest and Salmonella assay with strains TA100 and TA98. Mutagenicity was observed, especially with metabolic activation in vitro. These data suggest that mycotoxins produced by molds in uranium mines are a new genotoxic factor for uranium miners. PMID:8143610
Data Mining Citizen Science Results
NASA Astrophysics Data System (ADS)
Borne, K. D.
2012-12-01
Scientific discovery from big data is enabled through multiple channels, including data mining (through the application of machine learning algorithms) and human computation (commonly implemented through citizen science tasks). We will describe the results of new data mining experiments on the results from citizen science activities. Discovering patterns, trends, and anomalies in data are among the powerful contributions of citizen science. Establishing scientific algorithms that can subsequently re-discover the same types of patterns, trends, and anomalies in automatic data processing pipelines will ultimately result from the transformation of those human algorithms into computer algorithms, which can then be applied to much larger data collections. Scientific discovery from big data is thus greatly amplified through the marriage of data mining with citizen science.
Dugas, D.L.; Cravotta, C.A.; Saad, D.A.
1993-01-01
Water-quality and other hydrologic data for two surface coal mines in Clarion County, Pa., were collected during 1983-89 as part of studies conducted by the U.S. Geological Survey in cooperation with the Pennsylvania Department of Environmental Resources. Water samples were collected from streams, seeps, monitor wells, and lysimeters on a monthly basis to evaluate changes in water quality resulting from the addition of alkaline waste or urban sewage sludge to the reclaimed mine-spoil surface. The mines are about 3.5 miles apart and were mined for bituminous coal of the upper and lower Clarion seams of the Allegheny Group of Pennsylvanian age. The coal had high sulfur (greater than 2 weight percent) concentrations. Acidic mine drainage is present at both mines. At one mine, about 8 years after mining was completed, large quantities (greater than 400 tons per acre) of alkaline waste consisting of limestone and lime-kiln flue dust were applied on two 2.5-acre plots within the 65-acre mine area. Water-quality data for the alkaline-addition plots and surrounding area were collected for 1 year before and 3 years after application of the alkaline additives (May 1983-July 1987). Data collected for the alkaline-addition study include ground-water level, surface-water discharge rate, temperature, specific conductance, pH, and concentrations of alkalinity, acidity, sulfate, iron (total and ferrous), manganese, aluminum, calcium, and magnesium. At the other mine, about 3.5 years after mining was completed, urban sewage sludge was applied over 60 acres within the 150-acre mine area. Water-quality data for the sludge-addition study were collected for 3.5 years after the application of the sludge (June 1986-December 1989).
Data collected for the sludge-addition study include the above constituents plus dissolved oxygen, redox potential (Eh), and concentrations of dissolved solids, phosphorus, nitrogen species, sulfide, chloride, silica, sodium, potassium, cyanide, arsenic, barium, boron, cadmium, chromium, copper, lead, mercury, molybdenum, nickel, selenium, strontium, and zinc. Climatic data, including monthly average temperature and cumulative precipitation, from a nearby weather station for the period January 1983 through December 1989 also are reported.
toxoMine: an integrated omics data warehouse for Toxoplasma gondii systems biology research
Rhee, David B.; Croken, Matthew McKnight; Shieh, Kevin R.; Sullivan, Julie; Micklem, Gos; Kim, Kami; Golden, Aaron
2015-01-01
Toxoplasma gondii (T. gondii) is an obligate intracellular parasite that must monitor for changes in the host environment and respond accordingly; however, it is still not fully known which genetic or epigenetic factors are involved in regulating virulence traits of T. gondii. There are on-going efforts to elucidate the mechanisms regulating the stage transition process via the application of high-throughput epigenomics, genomics and proteomics techniques. Given the range of experimental conditions and the typical yield from such high-throughput techniques, a new challenge arises: how to effectively collect, organize and disseminate the generated data for subsequent data analysis. Here, we describe toxoMine, which provides a powerful interface to support sophisticated integrative exploration of high-throughput experimental data and metadata, providing researchers with a more tractable means toward understanding how genetic and/or epigenetic factors play a coordinated role in determining pathogenicity of T. gondii. As a data warehouse, toxoMine allows integration of high-throughput data sets with public T. gondii data. toxoMine is also able to execute complex queries involving multiple data sets with straightforward user interaction. Furthermore, toxoMine allows users to define their own parameters during the search process that gives users near-limitless search and query capabilities. The interoperability feature also allows users to query and examine data available in other InterMine systems, which would effectively augment the search scope beyond what is available to toxoMine. toxoMine complements the major community database ToxoDB by providing a data warehouse that enables more extensive integrative studies for T. gondii. Given all these factors, we believe it will become an indispensable resource to the greater infectious disease research community. Database URL: http://toxomine.org PMID:26130662
ERIC Educational Resources Information Center
Luan, Jing; Willett, Terrence
This paper discusses data mining--an end-to-end (ETE) data analysis tool that is used by researchers in higher education. It also relates data mining and other software programs to a brand new concept called "Knowledge Management." The paper culminates in the Tier Knowledge Management Model (TKMM), which seeks to provide a stable…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goodman, P.S.
The report examines coal-miner absenteeism and its relationship to accidents and injuries at underground mines. A total of 19 mines participated in various phases of this 3-year project. Miners at the participating mines ranged in number from 185 to 776. The data consisted of the mines' daily attendance records and detailed interviews with approximately 50 miners from each mine. The interviews contained questions about the miners' satisfaction with various on-the-job and off-the-job factors, their perceptions of the mines' absenteeism policies, the reasons or causes for their own absences, and the miners' demographic characteristics. Accident and injury data from six mines were used in parametric and multiple regression analysis of the absenteeism-accident relationship. The data represented activity during approximately 80,000 miner-days worked. Strategies for reducing absenteeism are discussed.
This fact sheet provides guidance on the Chemical Data Reporting (CDR) rule requirements related to the reporting of mined metals, intermediates, and byproducts manufactured during metal mining and related activities.
Publications - GMC 295 | Alaska Division of Geological & Geophysical
DGGS GMC 295 Publication Details Title: Geochemical assay data from U.S. Bureau of Mines hard-rock . Bureau of Mines, 2000, Geochemical assay data from U.S. Bureau of Mines hard-rock mineral cores (holes
Data Mining and Machine Learning in Astronomy
NASA Astrophysics Data System (ADS)
Ball, Nicholas M.; Brunner, Robert J.
We review the current state of data mining and machine learning in astronomy. Data Mining can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those in which data mining techniques directly contributed to improving science, and important current and future directions, including probability density functions, parallel algorithms, Peta-Scale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.
2013-01-01
website). Data mining tools are in-house code developed in Python, C++ and Java. • NGA The National Geospatial-Intelligence Agency (NGA) performs data...as PostgreSQL (with PostGIS), MySQL, Microsoft SQL Server, SQLite, etc. using the appropriate JDBC driver. 14 The documentation and ease to learn are...written in Java that is able to perform various types of regressions, classifications, and other data mining tasks. There is also a commercial version
Data Mine and Forget It?: A Cautionary Tale
NASA Technical Reports Server (NTRS)
Tada, Yuri; Kraft, Norbert Otto; Orasanu, Judith M.
2011-01-01
With the development of new technologies, data mining has become increasingly popular. However, caution should be exercised in choosing the variables to include in data mining. A series of regression trees was created to demonstrate the change in the selection by the program of significant predictors based on the nature of variables.
30 CFR 50.30-1 - General instructions for completing MSHA Form 7000-2.
Code of Federal Regulations, 2011 CFR
2011-07-01
... Operations, Preparation Plants, Breakers: Report data on all persons employed at your milling (crushing...) Employment, Employee Hours, and Coal Production—(1) Operation Sub-Unit: (i) Underground Mine: Report data for... underground mine, report data for those persons on the second line; (ii) Surface Mine (Including Shops and...
30 CFR 50.30-1 - General instructions for completing MSHA Form 7000-2.
Code of Federal Regulations, 2010 CFR
2010-07-01
... Operations, Preparation Plants, Breakers: Report data on all persons employed at your milling (crushing...) Employment, Employee Hours, and Coal Production—(1) Operation Sub-Unit: (i) Underground Mine: Report data for... underground mine, report data for those persons on the second line; (ii) Surface Mine (Including Shops and...
Student Privacy and Educational Data Mining: Perspectives from Industry
ERIC Educational Resources Information Center
Sabourin, Jennifer; Kosturko, Lucy; FitzGerald, Clare; McQuiggan, Scott
2015-01-01
While the field of educational data mining (EDM) has generated many innovations for improving educational software and student learning, the mining of student data has recently come under a great deal of scrutiny. Many stakeholder groups, including public officials, media outlets, and parents, have voiced concern over the privacy of student data…
ERIC Educational Resources Information Center
Anaya, Antonio R.; Boticario, Jesus G.
2009-01-01
Data mining methods are successful in educational environments to discover new knowledge or learner skills or features. Unfortunately, they have not been used in depth with collaboration. We have developed a scalable data mining method, whose objective is to infer information on the collaboration during the collaboration process in a…
The use of data mining by private health insurance companies and customers' privacy.
Al-Saggaf, Yeslam
2015-07-01
This article examines privacy threats arising from the use of data mining by private Australian health insurance companies. Qualitative interviews were conducted with key experts, and Australian governmental and nongovernmental websites relevant to private health insurance were searched. Using Rationale, a critical thinking tool, the themes and considerations elicited through this empirical approach were developed into an argument about the use of data mining by private health insurance companies. The argument is followed by an ethical analysis guided by classical philosophical theories-utilitarianism, Mill's harm principle, Kant's deontological theory, and Helen Nissenbaum's contextual integrity framework. Both the argument and the ethical analysis find the use of data mining by private health insurance companies in Australia to be unethical. Although private health insurance companies in Australia cannot use data mining for risk rating to cherry-pick customers and cannot use customers' personal information for unintended purposes, this article nonetheless concludes that the secondary use of customers' personal information and the absence of customers' consent still suggest that the use of data mining by private health insurance companies is wrong.
Automated Analysis of Renewable Energy Datasets ('EE/RE Data Mining')
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bush, Brian; Elmore, Ryan; Getman, Dan
This poster illustrates methods to substantially improve the understanding of renewable energy data sets and the depth and efficiency of their analysis through the application of statistical learning methods ('data mining') in the intelligent processing of these often large and messy information sources. The six examples apply methods for anomaly detection, data cleansing, and pattern mining to time-series data (measurements from metering points in buildings) and spatiotemporal data (renewable energy resource datasets).
Sanmiquel, Lluís; Bascompta, Marc; Rossell, Josep M; Anticoi, Hernán Francisco; Guash, Eduard
2018-03-07
An analysis of occupational accidents in the mining sector was conducted using data from the Spanish Ministry of Employment and Social Safety between 2005 and 2015, and data-mining techniques were applied. Data were processed with the software Weka. Two scenarios were chosen from the accidents database: surface and underground mining. The most important variables involved in occupational accidents and their association rules were determined. These rules are composed of several predictor variables that cause accidents, defining their characteristics and context. This study exposes the 20 most important association rules in the sector, for either surface or underground mining, based on the statistical confidence levels of each rule as obtained by Weka. The outcomes display the most typical immediate causes, along with the percentage of accidents with a basis in each association rule. The most important immediate cause is body movement with physical effort or overexertion, and the type of accident is physical effort or overexertion. On the other hand, the second most important immediate cause and type of accident are different between the two scenarios. Data-mining techniques were chosen as a useful tool to find out the root cause of the accidents.
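The confidence level used above to rank association rules can be illustrated with a small sketch. This is not the study's Weka workflow: it is a plain Python computation of the standard support and confidence measures, and the records, attribute values, and function names are all hypothetical.

```python
from typing import FrozenSet, List

# Hypothetical toy records (not the Spanish accident database):
# each record is the set of attribute values describing one accident.
RECORDS: List[FrozenSet[str]] = [
    frozenset({"underground", "overexertion", "manual_handling"}),
    frozenset({"underground", "overexertion", "manual_handling"}),
    frozenset({"underground", "fall", "walking"}),
    frozenset({"surface", "overexertion", "manual_handling"}),
    frozenset({"surface", "collision", "driving"}),
]


def support(itemset: FrozenSet[str], rows: List[FrozenSet[str]]) -> float:
    """Fraction of rows that contain every item in `itemset`."""
    return sum(1 for row in rows if itemset <= row) / len(rows)


def confidence(antecedent: FrozenSet[str],
               consequent: FrozenSet[str],
               rows: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(antecedent, rows)
    if base == 0.0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(antecedent | consequent, rows) / base


if __name__ == "__main__":
    rule_if = frozenset({"manual_handling"})
    rule_then = frozenset({"overexertion"})
    print("support:", support(rule_if | rule_then, RECORDS))
    print("confidence:", confidence(rule_if, rule_then, RECORDS))
```

Ranking candidate rules by `confidence` and keeping the top few is one simple way to arrive at a short list of rules; the study's own selection criteria may differ.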
Mining influence on underground water resources in arid and semiarid regions
NASA Astrophysics Data System (ADS)
Luo, A. K.; Hou, Y.; Hu, X. Y.
2018-02-01
Coordinated mining of coal and water resources in arid and semiarid regions has long been a focus issue. The research takes the Energy and Chemical Base in Northern Shaanxi as an example and conducts statistical analysis of coal yield and drainage volume from several large-scale mines in the mining area. It also determines the average water volume per ton of coal and calculates the drainage volume for four typical years under different mining intensities. For the mining drainage period, precipitation observation data from the last two decades are combined with water level data from observation wells to calculate the groundwater table, precipitation infiltration recharge, and evaporation capacity. Moreover, the research analyzes the transforming relationship between surface water, mine water, and groundwater. The results show that massive mine drainage, caused by large-scale coal mining in the research area, is the main reason for both the reduction in water resources and the changed transforming relationship between surface water, groundwater, and mine water.
Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M
2013-10-01
The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Application of data mining approaches to drug delivery.
Ekins, Sean; Shimada, Jun; Chang, Cheng
2006-11-30
Computational approaches play a key role in all areas of the pharmaceutical industry from data mining, experimental and clinical data capture to pharmacoeconomics and adverse events monitoring. They will likely continue to be indispensable assets along with a growing library of software applications. This is primarily due to the increasingly massive amount of biology, chemistry and clinical data, which is now entering the public domain mainly as a result of NIH and commercially funded projects. We are therefore in need of new methods for mining this mountain of data in order to enable new hypothesis generation. The computational approaches include, but are not limited to, database compilation, quantitative structure activity relationships (QSAR), pharmacophores, network visualization models, decision trees, machine learning algorithms and multidimensional data visualization software that could be used to improve drug delivery after mining public and/or proprietary data. We will discuss some areas of unmet needs in the area of data mining for drug delivery that can be addressed with new software tools or databases of relevance to future pharmaceutical projects.
Effect of Strip Mining on Water Quality in Small Streams in Eastern Kentucky, 1967-1975
Kenneth L. Dyer; Willie R. Curtis
1977-01-01
Eight years of streamflow data are analyzed to show the effects of strip mining on chemical quality of water in six first-order streams in Breathitt County, Kentucky. All these watersheds were unmined in August, 1967, but five have since been strip mined. The accumulated data from this case history study indicate that strip mining causes large increases in the...
Comparative analysis of data mining techniques for business data
NASA Astrophysics Data System (ADS)
Jamil, Jastini Mohd; Shaharanee, Izwan Nizal Mohd
2014-12-01
Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data contained within a database. Companies are using this tool to further understand their customers, to design targeted sales and marketing campaigns, to predict what product customers will buy and the frequency of purchase, and to spot trends in customer preferences that can lead to new product development. In this paper, we take a systematic approach to exploring several data mining techniques in business applications. The experimental result reveals that all data mining techniques accomplish their goals perfectly, but each technique has its own characteristics and specifications that demonstrate its accuracy, proficiency and preference.
A strategy for selecting data mining techniques in metabolomics.
Banimustafa, Ahmed Hmaidan; Hardy, Nigel W
2012-01-01
There is a general agreement that the development of metabolomics depends not only on advances in chemical analysis techniques but also on advances in computing and data analysis methods. Metabolomics data usually requires intensive pre-processing, analysis, and mining procedures. Selecting and applying such procedures requires attention to issues including justification, traceability, and reproducibility. We describe a strategy for selecting data mining techniques which takes into consideration the goals of data mining techniques on the one hand, and the goals of metabolomics investigations and the nature of the data on the other. The strategy aims to ensure the validity and soundness of results and promote the achievement of the investigation goals.
Injury experience in coal mining, 1992
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reich, R.B.; Hugler, E.C.
1994-05-01
This Mine Safety and Health Administration (MSHA) informational report reviews in detail the occupational injury and illness experience of coal mining in the United States for 1992. Data reported by operators of mining establishments concerning work injuries are summarized by work location, accident classification, part of body injured, nature of injury, occupation, and anthracite or bituminous coal. Related information on employment, worktime, and operating activity also is presented. Data reported by independent contractors performing certain work at mining locations are depicted separately in this report. For ease of comparison between coal mining and the metal and nonmetal mineral mining industries, summary reference tabulations are included at the end of both the operator and the contractor sections of this report.
Injury experience in coal mining, 1990
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
1991-01-01
This Mine Safety and Health Administration (MSHA) informational report reviews in detail the occupational injury and illness experience of coal mining in the United States for 1990. Data reported by operators of mining establishments concerning work injuries are summarized by work location, accident classification, part of body injured, nature of injury, occupation, and anthracite or bituminous coal. Related information on employment, worktime, and operating activity also is presented. Data reported by independent contractors performing certain work at mining locations are depicted separately in this report. For ease of comparison between coal mining and the metal and nonmetal mineral mining industries, summary reference tabulations are included at the end of both the operator and the contractor sections of this report.
Identifying Catchment-Scale Predictors of Coal Mining Impacts on New Zealand Stream Communities.
Clapcott, Joanne E; Goodwin, Eric O; Harding, Jon S
2016-03-01
Coal mining activities can have severe and long-term impacts on freshwater ecosystems. At the individual stream scale, these impacts have been well studied; however, few attempts have been made to determine the predictors of mine impacts at a regional scale. We investigated whether catchment-scale measures of mining impacts could be used to predict biological responses. We collated data from multiple studies and analyzed algae, benthic invertebrate, and fish community data from 186 stream sites, including un-mined streams, and those associated with 620 mines on the West Coast of the South Island, New Zealand. Algal, invertebrate, and fish richness responded to mine impacts and were significantly higher in un-mined compared to mine-impacted streams. Changes in community composition toward more acid- and metal-tolerant species were evident for algae and invertebrates, whereas changes in fish communities were significant and driven by a loss of nonmigratory native species. Consistent catchment-scale predictors of mining activities affecting biota included the time post mining (years), mining density (the number of mines upstream per catchment area), and mining intensity (tons of coal production per catchment area). Mining was associated with a decline in stream biodiversity irrespective of catchment size, and recovery was not evident until at least 30 years after mining activities have ceased. These catchment-scale predictors can provide managers and regulators with practical metrics to focus on management and remediation decisions.
Identifying Catchment-Scale Predictors of Coal Mining Impacts on New Zealand Stream Communities
NASA Astrophysics Data System (ADS)
Clapcott, Joanne E.; Goodwin, Eric O.; Harding, Jon S.
2016-03-01
Coal mining activities can have severe and long-term impacts on freshwater ecosystems. At the individual stream scale, these impacts have been well studied; however, few attempts have been made to determine the predictors of mine impacts at a regional scale. We investigated whether catchment-scale measures of mining impacts could be used to predict biological responses. We collated data from multiple studies and analyzed algae, benthic invertebrate, and fish community data from 186 stream sites, including un-mined streams, and those associated with 620 mines on the West Coast of the South Island, New Zealand. Algal, invertebrate, and fish richness responded to mine impacts and were significantly higher in un-mined compared to mine-impacted streams. Changes in community composition toward more acid- and metal-tolerant species were evident for algae and invertebrates, whereas changes in fish communities were significant and driven by a loss of nonmigratory native species. Consistent catchment-scale predictors of mining activities affecting biota included the time post mining (years), mining density (the number of mines upstream per catchment area), and mining intensity (tons of coal production per catchment area). Mining was associated with a decline in stream biodiversity irrespective of catchment size, and recovery was not evident until at least 30 years after mining activities have ceased. These catchment-scale predictors can provide managers and regulators with practical metrics to focus on management and remediation decisions.
NASA Technical Reports Server (NTRS)
Ferrari, J. R.; Lookingbill, T. R.; McCormick, B.; Townsend, P. A.; Eshleman, K. N.
2009-01-01
Surface mining of coal and subsequent reclamation represent the dominant land use change in the central Appalachian Plateau (CAP) region of the United States. Hydrologic impacts of surface mining have been studied at the plot scale, but effects at broader scales have not been explored adequately. Broad-scale classification of reclaimed sites is difficult because standing vegetation makes them nearly indistinguishable from alternate land uses. We used a land cover data set that accurately maps surface mines for a 187-km2 watershed within the CAP. These land cover data, as well as plot-level data from within the watershed, are used with HSPF (Hydrologic Simulation Program-Fortran) to estimate changes in flood response as a function of increased mining. Results show that the rate at which flood magnitude increases due to increased mining is linear, with greater rates observed for less frequent return intervals. These findings indicate that mine reclamation leaves the landscape in a condition more similar to urban areas than does simple deforestation, and call into question the effectiveness of reclamation in terms of returning mined areas to the hydrological state that existed before mining.
Piatak, Nadine M.; Seal, Robert R.; Hammarstrom, Jane M.; Meier, Allen L.; Briggs, Paul H.
2003-01-01
Waste-rock material produced at historic metal mines contains elevated concentrations of potentially toxic trace elements. Two types of mine waste were examined in this study: sintered waste rock and slag. The samples were collected from the Elizabeth and Ely mines in the Vermont copper belt (Besshi-type massive sulfide deposits), from the Copper Basin mining district near Ducktown, Tennessee (Besshi-type massive sulfide deposits), and from the Clayton silver mine in the Bayhorse mining district, Idaho (polymetallic vein and replacement deposits). The data in this report are presented as a compilation with minimal interpretation or discussion. A detailed discussion and interpretation of the slag data are presented in a companion paper. Data collected from sintered waste rock and slag include: (1) bulk rock chemistry, (2) mineralogy, and (3) the distribution of trace elements among phases for the slag samples. In addition, the reactivity of the waste material under surficial conditions was assessed by examining secondary minerals formed on slag and by laboratory leaching tests using deionized water and a synthetic solution approximating precipitation in the eastern United States.
Web mining in soft computing framework: relevance, state of the art and future directions.
Pal, S K; Talwar, V; Mitra, P
2002-01-01
The paper summarizes the different characteristics of Web data, the basic components of Web mining and its different types, and the current state of the art. The reason for considering Web mining a separate field from data mining is explained. The limitations of some of the existing Web mining methods and tools are enunciated, and the significance of soft computing (comprising fuzzy logic (FL), artificial neural networks (ANNs), genetic algorithms (GAs), and rough sets (RSs)) is highlighted. A survey of the existing literature on "soft Web mining" is provided along with the commercially available systems. The prospective areas of Web mining where the application of soft computing needs immediate attention are outlined with justification. Scope for future research in developing "soft Web mining" systems is explained. An extensive bibliography is also provided.
30 CFR 785.14 - Mountaintop removal mining.
Code of Federal Regulations, 2011 CFR
2011-07-01
... mountaintop removal mining. (b) Mountaintop removal mining means surface mining activities, where the mining operation removes an entire coal seam or seams running through the upper fraction of a mountain, ridge, or... adjacent land uses; (B) Obtainable according to data regarding expected need and market; (C) Assured of...
Van Landeghem, Sofie; De Bodt, Stefanie; Drebert, Zuzanna J.; Inzé, Dirk; Van de Peer, Yves
2013-01-01
Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein–protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies. PMID:23532071
A Data Miner for the Information Power Grid
NASA Technical Reports Server (NTRS)
Hinke, Thomas H.; Parks, John W. (Technical Monitor)
2002-01-01
Grid Miner (GM) is one of the early data mining applications developed by NASA to help users obtain information from the Information Power Grid (IPG). Topics covered include: benefits of data mining, potential use of grids in data mining activities, an overview of the GM application, and a brief review of GM architecture and implementation issues. The current status of the GM system is also discussed.
Learning in the context of distribution drift
2017-05-09
published in the leading data mining journal, Data Mining and Knowledge Discovery (Webb et al., 2016). We have shown that the previous qualitative...learner Low-bias learner Aggregated classifier Figure 7: Architecture for learning from streaming data in the context of variable or unknown...Learning limited dependence Bayesian classifiers, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD
Enhancements for a Dynamic Data Warehousing and Mining System for Large-scale HSCB Data
2016-07-20
Intelligent Automation Incorporated Enhancements for a Dynamic Data Warehousing and Mining ...Page | 2 Intelligent Automation Incorporated Monthly Report No. 4 Enhancements for a Dynamic Data Warehousing and Mining System Large-Scale HSCB...including Top Videos, Top Users, Top Words, and Top Languages, and also applied NER to the text associated with YouTube posts. We have also developed UI for
Characterization of a mine fire using atmospheric monitoring system sensor data.
Yuan, L; Thomas, R A; Zhou, L
2017-06-01
Atmospheric monitoring systems (AMS) have been widely used in underground coal mines in the United States for the detection of fire in the belt entry and the monitoring of other ventilation-related parameters such as airflow velocity and methane concentration in specific mine locations. In addition to an AMS being able to detect a mine fire, the AMS data have the potential to provide fire characteristic information such as fire growth - in terms of heat release rate - and exact fire location. Such information is critical in making decisions regarding fire-fighting strategies, underground personnel evacuation and optimal escape routes. In this study, a methodology was developed to calculate the fire heat release rate using AMS sensor data for carbon monoxide concentration, carbon dioxide concentration and airflow velocity based on the theory of heat and species transfer in ventilation airflow. Full-scale mine fire experiments were then conducted in the Pittsburgh Mining Research Division's Safety Research Coal Mine using an AMS with different fire sources. Sensor data collected from the experiments were used to calculate the heat release rates of the fires using this methodology. The calculated heat release rate was compared with the value determined from the mass loss rate of the combustible material using a digital load cell. The experimental results show that the heat release rate of a mine fire can be calculated using AMS sensor data with reasonable accuracy.
NASA Astrophysics Data System (ADS)
Tuomela, Anne; Davids, Corine; Knutsson, Sven; Knutsson, Roger; Rauhala, Anssi; Rossi, Pekka M.; Rouyet, Line
2017-04-01
Northern areas of Finland, Sweden and Norway have mineral-rich deposits. There are several active mines in the area but also closed ones and deposits with plans for future mining. With increasing demand for environmental protection in the sensitive Northern conditions, there is a need for more comprehensive monitoring of the mining environment. In our study, we aim to develop new opportunities to use remote sensing data from satellites and unmanned aerial vehicles (UAVs) in improving mining safety and monitoring, for example in the case of mine waste storage facilities. Remote sensing methods have evolved fast, and could in many cases enable precise, reliable, and cost-efficient data collection over large areas. The study has focused on four mining areas in Northern Fennoscandia. Freely available medium-resolution (e.g. Sentinel-1) and commercial high-resolution (e.g. TerraSAR-X) Synthetic Aperture Radar (SAR) data have been collected during 2015-2016 to study how satellite remote sensing could be used, e.g., for displacement monitoring using SAR Interferometry (InSAR). Furthermore, UAVs have been utilized in similar data collection at a local scale, and also in collection of thermal infrared data for hydrological monitoring of the areas. The development and efficient use of the methods in mining areas requires experts from several fields. In addition, the Northern conditions with four distinct seasons bring their own challenges for the efficient use of remote sensing, and further complicate their integration as standardised monitoring methods for mine environments. Based on the initial results, remote sensing could especially enhance the monitoring of large-scale structures in mine areas such as tailings impoundments.
NASA Technical Reports Server (NTRS)
Weaver, K. N. (Principal Investigator)
1973-01-01
The author has identified the following significant results. Underflight photography has been used in the Baltimore County mined land inventory to determine areas of disturbed land where surface mining of sand and ground clay, or stone has taken place. Both active and abandoned pits and quarries were located. Aircraft data has been used to update cultural features of Calvert, Caroline, St. Mary's, Somerset, Talbot, and Wicomico Counties. Islands have been located and catalogued for comparison with older film and map data for erosion data. Strip mined areas are being mapped to obtain total area disturbed to aid in future mining and reclamation problems. Coastal estuarine and Atlantic Coast features are being studied to determine nearshore bedforms, sedimentary, and erosional patterns, and manmade influence on natural systems.
Methods for Estimating Water Withdrawals for Mining in the United States, 2005
Lovelace, John K.
2009-01-01
The mining water-use category includes groundwater and surface water that is withdrawn and used for nonfuels and fuels mining. Nonfuels mining includes the extraction of ores, stone, sand, and gravel. Fuels mining includes the extraction of coal, petroleum, and natural gas. Water is used for mineral extraction, quarrying, milling, and other operations directly associated with mining activities. For petroleum and natural gas extraction, water often is injected for secondary oil or gas recovery. Estimates of water withdrawals for mining are needed for water planning and management. This report documents methods used to estimate withdrawals of fresh and saline groundwater and surface water for mining during 2005 for each county and county equivalent in the United States, Puerto Rico, and the U.S. Virgin Islands. Fresh and saline groundwater and surface-water withdrawals during 2005 for nonfuels- and coal-mining operations in each county or county equivalent in the United States, Puerto Rico, and the U.S. Virgin Islands were estimated. Fresh and saline groundwater withdrawals for oil and gas operations in counties of six states also were estimated. Water withdrawals for nonfuels and coal mining were estimated by using mine-production data and water-use coefficients. Production data for nonfuels mining included the mine location and weight (in metric tons) of crude ore, rock, or mineral produced at each mine in the United States, Puerto Rico, and the U.S. Virgin Islands during 2004. Production data for coal mining included the weight, in metric tons, of coal produced in each county or county equivalent during 2004. Water-use coefficients for mined commodities were compiled from various sources including published reports and written communications from U.S. Geological Survey National Water-use Information Program (NWUIP) personnel in several states. 
Water withdrawals for oil and gas extraction were estimated for six States including California, Colorado, Louisiana, New Mexico, Texas, and Wyoming, by using data from State agencies that regulate oil and gas extraction. Total water withdrawals for mining in a county were estimated by summing estimated water withdrawals for nonfuels mining, coal mining, and oil and gas extraction. The results of this study were distributed to NWUIP personnel in each State during 2007. NWUIP personnel were required to submit estimated withdrawals for numerous categories of use in their States to a national compilation team for inclusion in a national report describing water use in the United States during 2005. NWUIP personnel had the option of submitting the estimates determined by using the methods described in this report, a modified version of these estimates, or their own set of estimates or reported data. Estimated withdrawals resulting from the methods described in this report may not be included in the national report; therefore the estimates are not presented herein in order to avoid potential inconsistencies with the national report. Water-use coefficients for specific minerals also are not presented to avoid potential disclosure of confidential production data provided by mining operations to the U.S. Geological Survey.
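The estimation approach described above (production multiplied by a water-use coefficient, then summed with oil and gas withdrawals per county) can be sketched as follows. The report deliberately withholds its coefficients and estimates, so every number, unit, commodity key, and county name here is a made-up placeholder, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical water-use coefficients (volume of water per metric ton produced).
# The report does not publish its coefficients; these are placeholders.
COEFFICIENTS: Dict[str, float] = {
    "sand_gravel": 0.5,
    "crushed_stone": 0.25,
    "coal": 0.125,
}

# Hypothetical production records: (county, commodity, metric tons produced).
PRODUCTION: List[Tuple[str, str, float]] = [
    ("County A", "sand_gravel", 1000.0),
    ("County A", "coal", 2000.0),
    ("County B", "crushed_stone", 500.0),
]

# Hypothetical oil and gas withdrawals by county, as if supplied by a state agency.
OIL_GAS: Dict[str, float] = {"County A": 150.0}


def estimate_withdrawals(production: List[Tuple[str, str, float]],
                         coefficients: Dict[str, float],
                         oil_gas: Dict[str, float]) -> Dict[str, float]:
    """Sum nonfuels, coal, and oil and gas withdrawals for each county."""
    totals: Dict[str, float] = defaultdict(float)
    # Nonfuels and coal mining: production times the commodity's coefficient.
    for county, commodity, tons in production:
        totals[county] += tons * coefficients[commodity]
    # Oil and gas extraction: added directly from the agency-reported volumes.
    for county, volume in oil_gas.items():
        totals[county] += volume
    return dict(totals)


if __name__ == "__main__":
    for county, total in sorted(
            estimate_withdrawals(PRODUCTION, COEFFICIENTS, OIL_GAS).items()):
        print(county, total)
```

A real implementation would also need to split totals by water source and type (fresh or saline, groundwater or surface water), which this sketch leaves out.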
Solar Data Mining at Georgia State University
NASA Astrophysics Data System (ADS)
Angryk, R.; Martens, P. C.; Schuh, M.; Aydin, B.; Kempton, D.; Banda, J.; Ma, R.; Naduvil-Vadukootu, S.; Akkineni, V.; Küçük, A.; Filali Boubrahimi, S.; Hamdi, S. M.
2016-12-01
In this talk we give an overview of research projects related to solar data analysis that are conducted at Georgia State University. We will provide an update on multiple advances made by our research team in the analysis of image parameters, spatio-temporal pattern mining, and temporal data analysis, and on our experiences with big, heterogeneous solar data visualization, analysis, processing and storage. We will talk about up-to-date data mining methodologies and their importance for big data-driven solar physics research.
Data mining: sophisticated forms of managed care modeling through artificial intelligence.
Borok, L S
1997-01-01
Data mining is a recent development in computer science that combines artificial intelligence algorithms and relational databases to discover patterns automatically, without the use of traditional statistical methods. Work with data mining tools in health care is in a developmental stage that holds great promise, given the combination of demographic and diagnostic information.
Data Mining: A Hybrid Methodology for Complex and Dynamic Research
ERIC Educational Resources Information Center
Lang, Susan; Baehr, Craig
2012-01-01
This article provides an overview of the ways in which data and text mining have potential as research methodologies in composition studies. It introduces data mining in the context of the field of composition studies and discusses ways in which this methodology can complement and extend our existing research practices by blending the best of what…
Data Mining Research for Information Security
2016-01-29
AFRL-AFOSR-JP-TR-2016-0028 Data Mining Research for Information Security Kevin Barton Texas A&M University-San Antonio Final Report 01/29/2016...Final 3. DATES COVERED (From - To) 20-05-2014 to 19-05-2015 4. TITLE AND SUBTITLE Data Mining Research for Information Security 5a. CONTRACT
Educational Data Mining Applications and Tasks: A Survey of the Last 10 Years
ERIC Educational Resources Information Center
Bakhshinategh, Behdad; Zaiane, Osmar R.; ElAtia, Samira; Ipperciel, Donald
2018-01-01
Educational Data Mining (EDM) is the field of using data mining techniques in educational environments. There exist various methods and applications in EDM which can follow both applied research objectives such as improving and enhancing learning quality, as well as pure research objectives, which tend to improve our understanding of the learning…
Understanding Teacher Users of a Digital Library Service: A Clustering Approach
ERIC Educational Resources Information Center
Xu, Beijie; Recker, Mimi
2011-01-01
This article describes the Knowledge Discovery and Data Mining (KDD) process and its application in the field of educational data mining (EDM) in the context of a digital library service called the Instructional Architect (IA.usu.edu). In particular, the study reported in this article investigated a certain type of data mining problem, clustering,…
2016-09-26
Intelligent Automation Incorporated Enhancements for a Dynamic Data Warehousing and Mining ...Enhancements for a Dynamic Data Warehousing and Mining System for N00014-16-P-3014 Large-Scale Human Social Cultural Behavioral (HSBC) Data 5b. GRANT NUMBER...Representative Media Gallery View. We perform Scraawl’s NER algorithm to the text associated with YouTube post, which classifies the named entities into
15 CFR 970.601 - Logical mining unit.
Code of Federal Regulations, 2014 CFR
2014-01-01
... ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.601 Logical mining unit. (a) In the case of an exploration license, a logical mining unit is an... 15 Commerce and Foreign Trade 3 2014-01-01 2014-01-01 false Logical mining unit. 970.601 Section...
15 CFR 970.601 - Logical mining unit.
Code of Federal Regulations, 2013 CFR
2013-01-01
... ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.601 Logical mining unit. (a) In the case of an exploration license, a logical mining unit is an... 15 Commerce and Foreign Trade 3 2013-01-01 2013-01-01 false Logical mining unit. 970.601 Section...
15 CFR 970.601 - Logical mining unit.
Code of Federal Regulations, 2012 CFR
2012-01-01
... ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.601 Logical mining unit. (a) In the case of an exploration license, a logical mining unit is an... 15 Commerce and Foreign Trade 3 2012-01-01 2012-01-01 false Logical mining unit. 970.601 Section...
15 CFR 970.601 - Logical mining unit.
Code of Federal Regulations, 2010 CFR
2010-01-01
... ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.601 Logical mining unit. (a) In the case of an exploration license, a logical mining unit is an... 15 Commerce and Foreign Trade 3 2010-01-01 2010-01-01 false Logical mining unit. 970.601 Section...
15 CFR 970.601 - Logical mining unit.
Code of Federal Regulations, 2011 CFR
2011-01-01
... ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.601 Logical mining unit. (a) In the case of an exploration license, a logical mining unit is an... 15 Commerce and Foreign Trade 3 2011-01-01 2011-01-01 false Logical mining unit. 970.601 Section...
Dietary patterns analysis using data mining method. An application to data from the CYKIDS study.
Lazarou, Chrystalleni; Karaolis, Minas; Matalas, Antonia-Leda; Panagiotakos, Demosthenes B
2012-11-01
Data mining is a computational method that permits the extraction of patterns from large databases. We applied the data mining approach to data from 1140 children (9-13 years) in order to derive dietary habits related to children's obesity status. Rules that emerged from the data mining approach revealed the detrimental influence of increased consumption of soft drinks, delicatessen meat, sweets, fried and junk food. For example, frequent (3-5 times/week) consumption of all these foods increases the risk of being obese by 75%, whereas in children who have a similar dietary pattern but eat fish and seafood >2 times/week the risk of obesity is reduced by 33%. In conclusion, patterns revealed by the data mining technique refer to specific groups of children and demonstrate the effect on the risk associated with obesity status when a single dietary habit might be modified. Thus, a more individualized approach when translating public health messages could be achieved. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Su, Chao-Ton; Wang, Pa-Chun; Chen, Yan-Cheng; Chen, Li-Fei
2012-08-01
Pressure ulcer is a serious problem during patient care processes. The high risk factors in the development of pressure ulcer remain unclear during long surgery. Moreover, past preventive policies are hard to implement in a busy operation room. The objective of this study is to use data mining techniques to construct the prediction model for pressure ulcers. Four data mining techniques, namely, Mahalanobis Taguchi System (MTS), Support Vector Machines (SVMs), decision tree (DT), and logistic regression (LR), are used to select the important attributes from the data to predict the incidence of pressure ulcers. Measurements of sensitivity, specificity, F(1), and g-means were used to compare the performance of four classifiers on the pressure ulcer data set. The results show that data mining techniques obtain good results in predicting the incidence of pressure ulcer. We can conclude that data mining techniques can help identify the important factors and provide a feasible model to predict pressure ulcer development.
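The four performance measures named in the abstract above (sensitivity, specificity, F(1), and g-means) can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function name `binary_metrics`, the 0/1 label encoding, and the sample labels are assumptions made for illustration, and g-mean is taken here as the geometric mean of sensitivity and specificity.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return sensitivity, specificity, F1 and g-mean for 0/1 labels.

    Hypothetical helper for illustration; 1 marks the positive class
    (for example, a patient who developed a pressure ulcer).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(sensitivity * specificity)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
    }


# Made-up labels, purely to exercise the function.
example = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1, 0])
```

On imbalanced data, where ulcer cases are presumably rare, g-mean and F1 are often more informative than plain accuracy because a classifier that always predicts the majority class scores zero on both.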
Earth Science Mining Web Services
NASA Astrophysics Data System (ADS)
Pham, L. B.; Lynnes, C. S.; Hegde, M.; Graves, S.; Ramachandran, R.; Maskey, M.; Keiser, K.
2008-12-01
To allow scientists further capabilities in the area of data mining and web services, the Goddard Earth Sciences Data and Information Services Center (GES DISC) and researchers at the University of Alabama in Huntsville (UAH) have developed a system to mine data at the source without the need of network transfers. The system has been constructed by linking together several pre-existing technologies: the Simple Scalable Script-based Science Processor for Measurements (S4PM), a processing engine at the GES DISC; the Algorithm Development and Mining (ADaM) system, a data mining toolkit from UAH that can be configured in a variety of ways to create customized mining processes; ActiveBPEL, a workflow execution engine based on BPEL (Business Process Execution Language); XBaya, a graphical workflow composer; and the EOS Clearinghouse (ECHO). XBaya is used to construct an analysis workflow at UAH using ADaM components, which are also installed remotely at the GES DISC, wrapped as Web Services. The S4PM processing engine searches ECHO for data using space-time criteria, staging them to cache, allowing the ActiveBPEL engine to remotely orchestrate the processing workflow within S4PM. As mining is completed, the output is placed in an FTP holding area for the end user. The goals are to give users control over the data they want to process, while mining data at the data source using the server's resources rather than transferring the full volume over the internet. These diverse technologies have been infused into a functioning, distributed system with only minor changes to the underlying technologies. The key to this infusion is the loosely coupled, Web-Services based architecture: all of the participating components are accessible (one way or another) through SOAP (Simple Object Access Protocol)-based Web Services.
Earth Science Mining Web Services
NASA Technical Reports Server (NTRS)
Pham, Long; Lynnes, Christopher; Hegde, Mahabaleshwa; Graves, Sara; Ramachandran, Rahul; Maskey, Manil; Keiser, Ken
2008-01-01
To allow scientists further capabilities in the area of data mining and web services, the Goddard Earth Sciences Data and Information Services Center (GES DISC) and researchers at the University of Alabama in Huntsville (UAH) have developed a system to mine data at the source without the need of network transfers. The system has been constructed by linking together several pre-existing technologies: the Simple Scalable Script-based Science Processor for Measurements (S4PM), a processing engine at the GES DISC; the Algorithm Development and Mining (ADaM) system, a data mining toolkit from UAH that can be configured in a variety of ways to create customized mining processes; ActiveBPEL, a workflow execution engine based on BPEL (Business Process Execution Language); XBaya, a graphical workflow composer; and the EOS Clearinghouse (ECHO). XBaya is used to construct an analysis workflow at UAH using ADaM components, which are also installed remotely at the GES DISC, wrapped as Web Services. The S4PM processing engine searches ECHO for data using space-time criteria, staging them to cache, allowing the ActiveBPEL engine to remotely orchestrate the processing workflow within S4PM. As mining is completed, the output is placed in an FTP holding area for the end user. The goals are to give users control over the data they want to process, while mining data at the data source using the server's resources rather than transferring the full volume over the internet. These diverse technologies have been infused into a functioning, distributed system with only minor changes to the underlying technologies. The key to the infusion is the loosely coupled, Web-Services based architecture: all of the participating components are accessible (one way or another) through SOAP (Simple Object Access Protocol)-based Web Services.
Spectral methods to detect surface mines
NASA Astrophysics Data System (ADS)
Winter, Edwin M.; Schatten Silvious, Miranda
2008-04-01
Over the past five years, advances have been made in the spectral detection of surface mines under minefield detection programs at the U. S. Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate (NVESD). The problem of detecting surface land mines ranges from the relatively simple, the detection of large anti-vehicle mines on bare soil, to the very difficult, the detection of anti-personnel mines in thick vegetation. While spatial and spectral approaches can be applied to the detection of surface mines, spatial-only detection requires many pixels-on-target such that the mine is actually imaged and shape-based features can be exploited. This method is unreliable in vegetated areas because only part of the mine may be exposed, while spectral detection is possible without the mine being resolved. At NVESD, hyperspectral and multi-spectral sensors throughout the reflection and thermal spectral regimes have been applied to the mine detection problem. Data has been collected on mines in forest and desert regions and algorithms have been developed both to detect the mines as anomalies and to detect the mines based on their spectral signature. In addition to the detection of individual mines, algorithms have been developed to exploit the similarities of mines in a minefield to improve their detection probability. In this paper, the types of spectral data collected over the past five years will be summarized along with the advances in algorithm development.
Dhurjad, Pooja Sukhdev; Marothu, Vamsi Krishna; Rathod, Rajeshwari
2017-08-01
Metabolite identification is a crucial part of the drug discovery process. LC-MS/MS-based metabolite identification has gained widespread use, but the data acquired by the LC-MS/MS instrument is complex, and thus the interpretation of data becomes troublesome. Fortunately, advancements in data mining techniques have simplified the process of data interpretation with improved mass accuracy and provide a potentially selective, sensitive, accurate and comprehensive way for metabolite identification. In this review, we have discussed the targeted (extracted ion chromatogram, mass defect filter, product ion filter, neutral loss filter and isotope pattern filter) and untargeted (control sample comparison, background subtraction and metabolomic approaches) post-acquisition data mining techniques, which facilitate the drug metabolite identification. We have also discussed the importance of integrated data mining strategy.
A Predictive Model of Daily Seismic Activity Induced by Mining, Developed with Data Mining Methods
NASA Astrophysics Data System (ADS)
Jakubowski, Jacek
2014-12-01
The article presents the development and evaluation of a predictive classification model of daily seismic energy emissions induced by longwall mining in sector XVI of the Piast coal mine in Poland. The model uses data on tremor energy, basic characteristics of the longwall face and mined output in this sector over the period from July 1987 to March 2011. The predicted binary variable is the occurrence of a daily sum of tremor seismic energies in a longwall that is greater than or equal to the threshold value of 10^5 J. Three data mining analytical methods were applied: logistic regression, neural networks, and stochastic gradient boosted trees. The boosted trees model was chosen as the best for the purposes of the prediction. The validation sample results showed its good predictive capability, taking the complex nature of the phenomenon into account. This may indicate the applied model's suitability for a sequential, short-term prediction of mining induced seismic activity.
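The predicted variable described in the abstract above is a daily indicator: sum the tremor energies recorded on each day and flag the day when the sum reaches a threshold. A minimal sketch of that labelling step follows. The function name `daily_labels`, the `(date, energy)` record layout, and the sample values are hypothetical, and the default threshold of 1e5 J is our reading of the abstract's threshold value, so it should be treated as an assumption and passed explicitly if a different value applies.

```python
from collections import defaultdict
from datetime import date

# Assumed default threshold in joules; adjust to the value actually used.
THRESHOLD_J = 1e5


def daily_labels(tremors, threshold=THRESHOLD_J):
    """Map each day to 1 if its summed tremor energy >= threshold, else 0.

    `tremors` is an iterable of (day, energy_in_joules) pairs; this layout
    is an illustrative assumption, not the paper's data format.
    """
    totals = defaultdict(float)
    for day, energy in tremors:
        totals[day] += energy
    # Days with no recorded tremors do not appear in the result.
    return {day: int(total >= threshold) for day, total in sorted(totals.items())}


# Made-up records, purely to exercise the function.
sample = [
    (date(2000, 1, 1), 4e4),
    (date(2000, 1, 1), 6e4),
    (date(2000, 1, 2), 9.9e4),
]
labels = daily_labels(sample)
```

Days without any recorded tremor are absent from the result; a full pipeline would presumably add them with label 0 before fitting a classifier.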
Coyan, Joshua; Zientek, Michael L.; Mihalasky, Mark J.
2017-01-01
Resource managers and agencies involved with planning for future federal land needs are required to complete an assessment of and forecast for future land use every ten years. Predicting mining activities on federal lands is difficult as current regulations do not require disclosure of exploration results. In these cases, historic mining claims may serve as a useful proxy for determining where mining-related activities may occur. We assess the utility of using a space–time cube (STC) and associated analyses to evaluate and characterize mining claim activities around the McDermitt Caldera in northern Nevada and southern Oregon. The most significant advantage of arranging the mining claim data into a STC is the ability to visualize and compare the data, which allows scientists to better understand patterns and results. Additional analyses of the STC (i.e., Trend, Emerging Hot Spot, Hot Spot, and Cluster and Outlier Analyses) provide extra insights into the data and may aid in predicting future mining claim activities.
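The core idea behind the space-time cube in the abstract above is to aggregate point events into bins indexed by location and time, so that each location carries a time series of counts. The sketch below illustrates only that binning step under stated assumptions: square cells, fixed time steps, and hypothetical function names and sample coordinates. It is not the software or the analyses (Trend, Emerging Hot Spot, and so on) that the authors used.

```python
from collections import Counter


def build_space_time_cube(events, cell_size, time_step):
    """Count events per (x-cell, y-cell, time-slice) bin.

    `events` is an iterable of (x, y, t) tuples; units are whatever the
    caller uses (for example kilometres and years). Illustrative only.
    """
    cube = Counter()
    for x, y, t in events:
        key = (int(x // cell_size), int(y // cell_size), int(t // time_step))
        cube[key] += 1
    return cube


def bin_time_series(cube, ix, iy):
    """Return the sorted (time-slice, count) series for one spatial cell."""
    return sorted((it, n) for (cx, cy, it), n in cube.items() if (cx, cy) == (ix, iy))


# Made-up claim locations and years, purely to exercise the functions.
claims = [(0.5, 0.5, 1990), (0.7, 0.2, 1994), (0.1, 0.9, 2001), (3.2, 0.1, 2001)]
cube = build_space_time_cube(claims, cell_size=1.0, time_step=10)
series = bin_time_series(cube, 0, 0)
```

Each per-cell series is the input that trend or hot-spot statistics would then operate on. The choice of cell size and time step controls how much detail is smoothed away.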
Wirt, Laurie; Motyka, Jacek; Leach, David; Sass-Gustkiewicz, Maria; Szuwarzynski, Marek; Adamczyk, Zbigniew; Briggs, Paul; Meiers, Al
2003-01-01
The water chemistry of aquifers and streams in the Upper Silesia Ore District, Poland, is affected by their proximity to zinc, lead, and silver ores and by ongoing mining activities that date back to the 11th century. This report presents hydrologic and water-quality data collected as part of a collaborative research effort of the U.S. Geological Survey and the University of Mining and Metallurgy in Cracow, Poland, to study Mississippi-Valley-Type (MVT) lead-zinc deposits. MVT deposits in the Upper Silesia Ore District (Fig. 1) were selected for detailed study because the Polish mining industry allowed access to collect samples from underground mines and mine-land property. Water-quality samples were collected from streams, springs, wells, underground mine seeps and drains, and mine-tailings ponds. Data include field measurements of specific conductance, pH, water temperature, and dissolved oxygen and laboratory analyses of major and minor inorganic constituents and selected trace-element constituents.
Short Distance of Nuclei - Mining the Wealth of Existing Jefferson Lab Data - Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weinstein, Lawrence; Kuhn, Sebastian
Over the last fifteen years of operation, the Jefferson Lab CLAS Collaboration has performed many experiments using nuclear targets. Because the CLAS detector has a very large acceptance and because it used a very open (i.e., nonspecific) trigger, there is a vast amount of data on many different reaction channels yet to be analyzed. The goal of the Jefferson Lab Nuclear Data Mining grant was to (1) collect the data from nuclear target experiments using the CLAS detector, (2) collect the associated cuts and corrections used to analyze that data, (3) provide non-expert users with a software environment for easy analysis of the data, and (4) to search for interesting reaction signatures in the data. We formed the Jefferson Lab Nuclear Data Mining collaboration under the auspices of this grant. The collaboration successfully carried out all of our goals. Dr. Gavalian, the data mining scientist, created a remarkably user-friendly web-based interface to enable easy analysis of the nuclear-target data by non-experts. Data from many of the CLAS nuclear target experiments has been made available on servers at Old Dominion University. Many of the associated cuts and corrections have been incorporated into the data mining software. The data mining collaboration was extraordinarily successful in finding interesting reaction signatures in the data. Our paper Momentum sharing in imbalanced Fermi systems was published in Science. Several analyses of CLAS data are continuing and will result in papers after the end of the grant period. We have held several analysis workshops and have given many invited talks at international conferences and workshops related to the data mining initiative. Our initiative to maximize the impact of data collected with CLAS in the 6-GeV era was very successful. During the hiatus between the end of 6-GeV experiments and the beginning of 12-GeV experiments, our collaboration and the physics community at large benefited tremendously from the Jefferson Lab Nuclear Data Mining effort.
NASA Astrophysics Data System (ADS)
Demigha, Souâd.
2016-03-01
The paper presents a Case-Based Reasoning Tool for Breast Cancer Knowledge Management to improve breast cancer screening. To develop this tool, we combine concepts and techniques from Case-Based Reasoning (CBR) and Data Mining (DM). Physicians and radiologists ground their diagnoses in their expertise (past experience) with clinical cases. Case-Based Reasoning is the process of solving new problems based on the solutions of similar past problems, structured as cases. CBR is suitable for medical use. On the other hand, existing traditional hospital information systems (HIS), Radiological Information Systems (RIS) and Picture Archiving Information Systems (PACS) do not allow medical information to be managed efficiently because of its complexity and heterogeneity. Data Mining is the process of mining information from a data set and transforming it into an understandable structure for further use. Combining CBR with Data Mining techniques will facilitate diagnosis and decision-making by medical experts.
CARIBIAM: constrained Association Rules using Interactive Biological IncrementAl Mining.
Rahal, Imad; Rahhal, Riad; Wang, Baoying; Perrizo, William
2008-01-01
This paper analyses annotated genome data by applying a very central data-mining technique known as Association Rule Mining (ARM) with the aim of discovering rules and hypotheses capable of yielding deeper insights into this type of data. In the literature, ARM has been noted for producing an overwhelming number of rules. This work proposes a new technique capable of using domain knowledge in the form of queries in order to efficiently mine only the subset of the associations that are of interest to investigators in an incremental and interactive manner.
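The abstract above notes that association rule mining tends to produce an overwhelming number of rules and that a user query can restrict mining to the associations of interest. The sketch below shows that general idea with a brute-force miner that only considers rules whose consequent matches a requested item. It is not the CARIBIAM algorithm: the function names, the single-item consequent constraint, and the sample transactions are assumptions made for illustration, and exhaustive enumeration would not scale to real genome annotation data.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def mine_rules(transactions, min_support, min_confidence, consequent, max_size=3):
    """Return rules (antecedent, consequent, support, confidence).

    Only itemsets containing `consequent` are examined, which stands in
    for a user query that narrows the search. Illustrative sketch only.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            if consequent not in itemset:
                continue  # constraint: skip itemsets outside the query
            s = support(itemset, transactions)
            if s == 0 or s < min_support:
                continue
            antecedent = itemset - {consequent}
            # Antecedent support is > 0 because the full itemset occurs.
            confidence = s / support(antecedent, transactions)
            if confidence >= min_confidence:
                rules.append((tuple(sorted(antecedent)), consequent, s, confidence))
    return rules


# Made-up transactions, purely to exercise the functions.
data = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
]
rules = mine_rules(data, min_support=0.5, min_confidence=0.6, consequent="c")
```

Applying the constraint before counting support is what saves work: itemsets that cannot yield a rule of interest are never scanned against the transactions.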
NASA Technical Reports Server (NTRS)
Brennan, P. A.; Chapman, P. E.; Chipp, E. R.
1971-01-01
During August of 1970 Mission 140 was flown with the NASA P3A aircraft over the Klondike Mining District, Nevada. High quality metric photography, thermal infrared imagery, multispectral photography and multichannel microwave radiometry were obtained. Geology and ground truth data are presented and relationships of the physical attributes of geologic materials to remotely sensed data are discussed. It is concluded that remote sensing data was valuable in the geologic evaluation of the Klondike Mining District and would be of value in other mining districts.
15 CFR 971.202 - Statement of technological experience and capabilities.
Code of Federal Regulations, 2012 CFR
2012-01-01
... GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL... results to commercial mining. The more test data offered with the application the less analysis will be... step in the mining process, including nodule collection, retrieval, transfer to ship, environmental...
15 CFR 971.202 - Statement of technological experience and capabilities.
Code of Federal Regulations, 2011 CFR
2011-01-01
... GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL... results to commercial mining. The more test data offered with the application the less analysis will be... step in the mining process, including nodule collection, retrieval, transfer to ship, environmental...
15 CFR 971.202 - Statement of technological experience and capabilities.
Code of Federal Regulations, 2010 CFR
2010-01-01
... GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL... results to commercial mining. The more test data offered with the application the less analysis will be... step in the mining process, including nodule collection, retrieval, transfer to ship, environmental...
ERTS-1 data applied to strip mining
NASA Technical Reports Server (NTRS)
Anderson, A. T.; Schubert, J.
1976-01-01
Two coal basins within the western region of the Potomac River Basin contain the largest strip-mining operations in western Maryland and West Virginia. The disturbed strip-mine areas were delineated along with the surrounding geological and vegetation features by using ERTS-1 data in both analog and digital form. The two digital systems employed were (1) the ERTS analysis system, a point-by-point digital analysis of spectral signatures based on known spectral values and (2) the LARS automatic data processing system. These two systems aided in efforts to determine the extent and state of strip mining in this region. Aircraft data, ground-verification information, and geological field studies also aided in the application of ERTS-1 imagery to perform an integrated analysis that assessed the adverse effects of strip mining. The results indicated that ERTS can both monitor and map the extent of strip mining to determine immediately the acreage affected and to indicate where future reclamation and revegetation may be necessary.
Tidal analysis and Arrival Process Mining Using Automatic Identification System (AIS) Data
2017-01-01
files, organized by location. The data were processed using the Python programming language (van Rossum and Drake 2001), the Pandas data analysis...ERDC/CHL TR-17-2 Coastal Inlets Research Program Tidal Analysis and Arrival Process Mining Using Automatic Identification System...17-2 January 2017 Tidal Analysis and Arrival Process Mining Using Automatic Identification System (AIS) Data Brandan M. Scully Coastal and
The Weather Forecast Using Data Mining Research Based on Cloud Computing.
NASA Astrophysics Data System (ADS)
Wang, ZhanJie; Mazharul Mujib, A. B. M.
2017-10-01
Weather forecasting has been an important application in meteorology and one of the most scientifically and technologically challenging problems around the world. In our study, we have analyzed the use of data mining techniques in forecasting weather. This paper proposes a modern method to develop a service-oriented architecture for weather information systems that forecast weather using these data mining techniques. This can be carried out by using Artificial Neural Network and Decision Tree algorithms and meteorological data collected over a specific period. The algorithm presented the best results in generating classification rules for the mean weather variables. The results showed that these data mining techniques can be sufficient for weather forecasting.
Goode, Daniel J.; Cravotta, Charles A.; Hornberger, Roger J.; Hewitt, Michael A.; Hughes, Robert E.; Koury, Daniel J.; Eicholtz, Lee W.
2011-01-01
This report, prepared in cooperation with the Pennsylvania Department of Environmental Protection (PaDEP), the Eastern Pennsylvania Coalition for Abandoned Mine Reclamation, and the Dauphin County Conservation District, provides estimates of water budgets and groundwater volumes stored in abandoned underground mines in the Western Middle Anthracite Coalfield, which encompasses an area of 120 square miles in eastern Pennsylvania. The estimates are based on preliminary simulations using a groundwater-flow model and an associated geographic information system that integrates data on the mining features, hydrogeology, and streamflow in the study area. The Mahanoy and Shamokin Creek Basins were the focus of the study because these basins exhibit extensive hydrologic effects and water-quality degradation from the abandoned mines in their headwaters in the Western Middle Anthracite Coalfield. Proposed groundwater withdrawals from the flooded parts of the mines and stream-channel modifications in selected areas have the potential for altering the distribution of groundwater and the interaction between the groundwater and streams in the area. Preliminary three-dimensional, steady-state simulations of groundwater flow by the use of MODFLOW are presented to summarize information on the exchange of groundwater among adjacent mines and to help guide the management of ongoing data collection, reclamation activities, and water-use planning. The conceptual model includes high-permeability mine voids that are connected vertically and horizontally within multicolliery units (MCUs). MCUs were identified on the basis of mine maps, locations of mine discharges, and groundwater levels in the mines measured by PaDEP. The locations and integrity of mine barriers were determined from mine maps and groundwater levels. The permeability of intact barriers is low, reflecting the hydraulic characteristics of unmined host rock and coal. 
A steady-state model was calibrated to measured groundwater levels and stream base flow, the latter at many locations composed primarily of discharge from mines. Automatic parameter estimation used MODFLOW-2000 with manual adjustments to constrain parameter values to realistic ranges. The calibrated model supports the conceptual model of high-permeability MCUs separated by low-permeability barriers and streamflow losses and gains associated with mine infiltration and discharge. The simulated groundwater levels illustrate low groundwater gradients within an MCU and abrupt changes in water levels between MCUs. The preliminary model results indicate that the primary result of increased pumping from the mine would be reduced discharge from the mine to streams near the pumping wells. The intact barriers limit the spatial extent of mine dewatering. Considering the simulated groundwater levels, depth of mining, and assumed bulk porosity of 11 or 40 percent for the mined seams, the water volume in storage in the mines of the Western Middle Anthracite Coalfield was estimated to range from 60 to 220 billion gallons, respectively. Details of the groundwater-level distribution and the rates of some mine discharges are not simulated well using the preliminary model. Use of the model results should be limited to evaluation of the conceptual model and its simulation using porous-media flow methods, overall water budgets for the Western Middle Anthracite Coalfield, and approximate storage volumes. Model results should not be considered accurate for detailed simulation of flow within a single MCU or individual flooded mine. Although improvements in the model calibration were possible by introducing spatial variability in permeability parameters and adjusting barrier properties, more detailed parameterizations have increased uncertainty because of the limited data set. 
The preliminary identification of data needs includes continuous streamflow, mine discharge rate, and groundwater levels in the mines and adjacent areas. Data collected whe
McAdoo, Mitchell A.; Kozar, Mark D.
2017-11-14
This report describes a compilation of existing water-quality data associated with groundwater resources originating from abandoned underground coal mines in West Virginia. Data were compiled from multiple sources for the purpose of understanding the suitability of groundwater from abandoned underground coal mines for public supply, industrial, agricultural, and other uses. This compilation includes data collected for multiple individual studies conducted from July 13, 1973 through September 7, 2016. Analytical methods varied by the time period of data collection and requirements of the independent studies. This project identified 770 water-quality samples from 294 sites that could be attributed to abandoned underground coal mine aquifers originating from multiple coal seams in West Virginia.
Biomedical data mining in clinical routine: expanding the impact of hospital information systems.
Müller, Marcel; Markó, Kornel; Daumke, Philipp; Paetzold, Jan; Roesner, Arnold; Klar, Rüdiger
2007-01-01
In this paper we want to describe how the promising technology of biomedical data mining can improve the use of hospital information systems: a large set of unstructured, narrative clinical data from a dermatological university hospital like discharge letters or other dermatological reports were processed through a morpho-semantic text retrieval engine ("MorphoSaurus") and integrated with other clinical data using a web-based interface and brought into daily clinical routine. The user evaluation showed a very high user acceptance - this system seems to meet the clinicians' requirements for a vertical data mining in the electronic patient records. What emerges is the need for integration of biomedical data mining into hospital information systems for clinical, scientific, educational and economic reasons.
Managing the Big Data Avalanche in Astronomy - Data Mining the Galaxy Zoo Classification Database
NASA Astrophysics Data System (ADS)
Borne, Kirk D.
2014-01-01
We will summarize a variety of data mining experiments that have been applied to the Galaxy Zoo database of galaxy classifications, which were provided by the volunteer citizen scientists. The goal of these exercises is to learn new and improved classification rules for diverse populations of galaxies, which can then be applied to much larger sky surveys of the future, such as the LSST (Large Synoptic Survey Telescope), which is proposed to obtain detailed photometric data for approximately 20 billion galaxies. The massive Big Data that astronomy projects will generate in the future demand greater application of data mining and data science algorithms, as well as greater training of astronomy students in the skills of data mining and data science. The project described here has involved several graduate and undergraduate research assistants at George Mason University.
Model architecture of intelligent data mining oriented urban transportation information
NASA Astrophysics Data System (ADS)
Yang, Bogang; Tao, Yingchun; Sui, Jianbo; Zhang, Feizhou
2007-06-01
Aiming at solving practical problems in urban traffic, the paper presents a model architecture for intelligent data mining from a hierarchical view. With artificial intelligence technologies used in the framework, the intelligent data mining technology is improved and better suited to changes in real-time road conditions. It also provides efficient technical support for the distribution, transmission and display of urban transport information.
ERIC Educational Resources Information Center
D'Mello, S. K., Ed.; Calvo, R. A., Ed.; Olney, A., Ed.
2013-01-01
Since its inception in 2008, the Educational Data Mining (EDM) conference series has featured some of the most innovative and fascinating basic and applied research centered on data mining, education, and learning technologies. This tradition of exemplary interdisciplinary research has been kept alive in 2013 as evident through an imaginative,…
A Tools-Based Approach to Teaching Data Mining Methods
ERIC Educational Resources Information Center
Jafar, Musa J.
2010-01-01
Data mining is an emerging field of study in Information Systems programs. Although the course content has been streamlined, the underlying technology is still in a state of flux. The purpose of this paper is to describe how we utilized Microsoft Excel's data mining add-ins as a front-end to Microsoft's Cloud Computing and SQL Server 2008 Business…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klintenberg, M.; Haraldsen, Jason T.; Balatsky, Alexander V.
2014-06-19
In this paper, we report a data-mining investigation for the search of topological insulators by examining individual electronic structures for over 60,000 materials. Using a data-mining algorithm, we survey changes in band inversion with and without spin-orbit coupling by screening the calculated electronic band structure for a small gap and a change in concavity at high-symmetry points. Overall, we were able to identify a number of topological candidates with varying structures and composition. Lastly, our overall goal is to expand the realm of predictive theory into the determination of new and exotic complex materials through the data mining of electronic structure.
Mining Recent Temporal Patterns for Event Detection in Multivariate Time Series Data
Batal, Iyad; Fradkin, Dmitriy; Harrison, James; Moerchen, Fabian; Hauskrecht, Milos
2015-01-01
Improving the performance of classifiers using pattern mining techniques has been an active topic of data mining research. In this work we introduce the recent temporal pattern mining framework for finding predictive patterns for monitoring and event detection problems in complex multivariate time series data. This framework first converts time series into time-interval sequences of temporal abstractions. It then constructs more complex temporal patterns backwards in time using temporal operators. We apply our framework to health care data of 13,558 diabetic patients and show its benefits by efficiently finding useful patterns for detecting and diagnosing adverse medical conditions that are associated with diabetes. PMID:25937993
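The first step described above, converting a time series into time-interval sequences of temporal abstractions, can be sketched as follows. This is a minimal illustration and not the authors' framework: the function names, the three-state value abstraction (L/N/H) and the glucose thresholds are all assumptions chosen for the example.

```python
from typing import List, Tuple

def abstract_value(value: float, low: float, high: float) -> str:
    """Map a numeric value to a coarse state: L (low), N (normal), H (high)."""
    if value < low:
        return "L"
    if value > high:
        return "H"
    return "N"

def to_intervals(times: List[int], values: List[float],
                 low: float, high: float) -> List[Tuple[str, int, int]]:
    """Merge consecutive samples with the same state into (state, start, end)."""
    intervals: List[Tuple[str, int, int]] = []
    for t, v in zip(times, values):
        state = abstract_value(v, low, high)
        if intervals and intervals[-1][0] == state:
            # Same state as the previous sample: extend the open interval.
            intervals[-1] = (state, intervals[-1][1], t)
        else:
            # State changed: start a new interval at this time point.
            intervals.append((state, t, t))
    return intervals

# Hypothetical glucose readings (mg/dL) at hourly time points;
# the 70/180 thresholds are illustrative assumptions.
glucose_times = [0, 1, 2, 3, 4, 5]
glucose_values = [95.0, 110.0, 190.0, 210.0, 130.0, 60.0]
glucose_intervals = to_intervals(glucose_times, glucose_values, low=70.0, high=180.0)
print(glucose_intervals)
```

Temporal patterns would then be built over such intervals with temporal operators; that step is not shown here.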
Text and Structural Data Mining of Influenza Mentions in Web and Social Media
DOE Office of Scientific and Technical Information (OSTI.GOV)
Corley, Courtney D.; Cook, Diane; Mikler, Armin R.
Text and structural data mining of Web and social media (WSM) provides a novel disease surveillance resource and can identify online communities for targeted public health communications (PHC) to assure wide dissemination of pertinent information. WSM that mention influenza are harvested over a 24-week period, 5-October-2008 to 21-March-2009. Link analysis reveals communities for targeted PHC. Text mining is shown to identify trends in flu posts that correlate to real-world influenza-like-illness patient report data. We also bring to bear a graph-based data mining technique to detect anomalies among flu blogs connected by publisher type, links, and user-tags.
Warehousing Structured and Unstructured Data for Data Mining.
ERIC Educational Resources Information Center
Miller, L. L.; Honavar, Vasant; Barta, Tom
1997-01-01
Describes an extensible object-oriented view system that supports the integration of both structured and unstructured data sources in either the multidatabase or data warehouse environment. Discusses related work and data mining issues. (AEF)
78 FR 12756 - Proposed Data Collections Submitted for Public Comment and Recommendations
Federal Register 2010, 2011, 2012, 2013, 2014
2013-02-25
... strategies, including self- report pre-and post-test instruments for assessing trainee reaction and measuring... Knowledge Test. Mine Escape/Continuous Mining Pre/Post- 30 1 6/60 3 participants. Training Knowledge Test. Mine Rescue/Longwall Mining Pre/Post- 30 1 6/60 3 participants. Training Knowledge Test. Mine Rescue...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haley, W.A.; Quenon, H.A.
1954-01-01
The progress of mechanical longwall coal mining in the United States is described in which a German coal planer was employed. Operating data of planer mining in three panels at the mine are summarized.
77 FR 42760 - Proposed Information Collection; Request for Comments
Federal Register 2010, 2011, 2012, 2013, 2014
2012-07-20
.... Data OMB Control Number: 1024-0064. Title: 36 CFR Part 9, Subpart A--Mining and Mining Claims, 36 CFR....gov . Please reference ``1024-0064, 36 CFR Part 9, Subpart A--Mining and Mining Claims, 36 CFR Part 9... regulates mineral development activities inside park boundaries pursuant to rights associated with mining...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-04-08
... Collection; Comment Request; High-Voltage Continuous Mining Machines Standards for Underground Coal Mines... Act of 1995. This program helps to assure that requested data can be provided in the desired format... maintains the safe use of high-voltage continuous mining machines in underground coal mines by requiring...
Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji
2016-01-01
Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org PMID:26989145
NASA Technical Reports Server (NTRS)
1979-01-01
The potential benefits of using LANDSAT remote sensing data by state agencies as an aid in monitoring surface coal mining operations are reviewed. A mountaintop surface mine in eastern Kentucky was surveyed over a 5 year period using satellite multispectral scanner data that were classified by computer analyses. The analyses were guided by aerial photography and by ground surveys of the surface mines procured in 1976. The application of the LANDSAT data indicates that: (1) computer classification of the various landcover categories provides information for monitoring the progress of surface mining and reclamation operations; (2) successive yearly changes in barren and revegetated areas can be qualitatively assessed for surface mines of 100 acres or more of disrupted area; (3) barren areas consisting of limestone and shale mixtures may be recognized, and revegetated areas in various stages of growth may be identified against the hilly forest background.
NASA Astrophysics Data System (ADS)
Krawczyk, Artur
2018-01-01
In this article, topics regarding the technical and legal aspects of creating digital underground mining maps are described. Currently used technologies and solutions for creating, storing and making digital maps accessible are described in the context of the Polish mining industry. Also, some problems with the use of these technologies are identified and described. One of the identified problems is the need to expand the range of mining map data provided by survey departments to other mining departments, such as ventilation maintenance or geological maintenance. Three solutions are proposed and analyzed, and one is chosen for further analysis. The analysis concerns data storage and making survey data accessible not only from paper documentation, but also directly from computer systems. Based on enrichment data, new processing procedures are proposed for a new way of presenting information that allows the preparation of new cartographic representations (symbols) of data with regard to users' needs.
Reed, L.A.; Hainly, R.A.
1989-01-01
The U.S. Geological Survey, in cooperation with the Pennsylvania Department of Environmental Resources, has collected hydrologic data from areas in Tioga, Clearfield, and Fayette Counties to determine the effects of surface coal mining on sediment yields. The data were collected from June 1978 through September 1983. Rainfall, streamflow and suspended-sediment data were collected with automatic recording and sampling equipment. Data were collected in Tioga County from an agricultural area that was unaffected by mining and from a forested area prior to surface mining. Data were collected from two areas affected by active surface mining in Tioga County and from an area in Clearfield County being mined by the contour-surface method. Data also were collected from three areas, in Tioga, Clearfield, and Fayette Counties, during and after reclamation. The efficiencies of sediment-control ponds in Clearfield and Fayette Counties also were determined. The average annual sediment yield from the agricultural area in Tioga County, which was 35 percent forested, was 0.48 ton per acre per year, and the yield from the forested area prior to mining was 0.0036 ton per acre per year. The average annual sediment yields from the areas affected by active surface mining were 22 tons per acre from the improved haul road and 148 tons per acre from the unimproved haul road. The average annual sediment yield from the site in Clearfield County that had been prepared for mining was 6.3 tons per acre. The average annual sediment yield from the same site while it was being mined by the contour method was 5.5 tons per acre per year. The sediment-control pond reduced the average annual sediment yield to 0.50 ton per acre while the site was prepared for mining and to 0.14 ton per acre while the site was being mined. Because the active surface mining reduced the effective drainage area to the pond, the sediment yield decreased from 0.50 to 0.14 ton per acre.
Average annual suspended-sediment yields from the reclaimed site in Tioga County were 1.0 ton per acre during the first year, when vegetation was becoming established, and 0.037 ton per acre during the second year, when vegetation was well established. The average annual sediment yield below a 21.2-acre, reclaimed, surface mine in Clearfield County that had been mined by the contour method was 15 tons per acre during the first year when vegetation was becoming established. However, the average annual sediment yield below a sediment-control pond at this reclaimed site in Clearfield County was 0.30 ton per acre. Data collected from a 4.2-acre reclaimed area that had been surface mined by the block-cut method in Fayette County showed that annual sediment yields from the area were 77 tons per acre in 1981 (no vegetation), 32 tons per acre in 1982 (sparse vegetation), and 1.0 ton per acre in 1983 (well-established vegetation). The average annual yield below a sediment-control pond at the mine site in Fayette County was 0.19 ton per acre during the 27 months of data collection.
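As a rough worked example, the Clearfield County yields quoted above can be turned into percent reductions across the sediment-control pond. This is a simple ratio of the reported above-pond and below-pond yields, not necessarily the efficiency calculation used in the report; the function name and the case labels are assumptions for illustration.

```python
def percent_reduction(yield_above: float, yield_below: float) -> float:
    """Percent drop in sediment yield (tons per acre) across a control pond."""
    return 100.0 * (yield_above - yield_below) / yield_above

# (yield from the site, yield below the pond), in tons per acre,
# taken from the Clearfield County figures in the abstract.
cases = {
    "prepared for mining": (6.3, 0.50),
    "contour mining": (5.5, 0.14),
    "reclaimed, first year": (15.0, 0.30),
}

reductions = {
    name: round(percent_reduction(above, below), 1)
    for name, (above, below) in cases.items()
}

for name, value in reductions.items():
    print(f"{name}: {value} percent reduction")
```

Under this simple ratio the pond removes roughly 92 to 98 percent of the sediment yield in all three periods.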
Lunar resource evaluation and mine site selection
NASA Technical Reports Server (NTRS)
Bence, A. Edward
1992-01-01
Two scenarios in this evaluation of lunar mineral resources and the selection of possible mining and processing sites are considered. The first scenario assumes that no new surface or near-surface data will be available before site selection (presumably one of the Apollo sites). The second scenario assumes that additional surface geology data will have been obtained by a lunar orbiter mission, an unmanned sample return mission (or missions), and followup manned missions. Regardless of the scenario, once a potentially favorable mine site has been identified, a minimum amount of fundamental data is needed to assess the resources at that site and to evaluate its suitability for mining and downstream processing. Since much of the required data depends on the target mineral(s), information on the resource, its beneficiation, and the refining, smelting, and fabricating processes must be factored into the evaluation. The annual capacity and producing lifetime of the mine and its associated processing plant must be estimated before the resource reserves can be assessed. The available market for the product largely determines the capacity and lifetime of the mine. The Apollo 17 site is described as a possible mining site. The use of new sites is briefly addressed.
NASA Astrophysics Data System (ADS)
Roviana, D.; Tajuddin, A.; Edi, S.
2017-03-01
Mining potential in Indonesia is abundant, ranging from Sabang to Merauke. Kabupaten Gorontalo is one of many places in Indonesia that have different types of minerals and natural resources, which can be found in every district. This abundant mining potential must be balanced with good management and easy access to information for investors. The current issues are: (1) data and information about potential mining areas are still presented manually (maps captured from satellite images are printed and attached to an information board in the office), which makes the information difficult to obtain; (2) the high cost of map printing; (3) the difficulty the regency leader (bupati) has in obtaining information for strategic decision making about mining potential. The goal of this research is to build a model of a Geographical Information System that provides data management for potential mines, so that investors can easily get information according to their needs. To achieve that goal, the Research and Development method is used. The result of this research is a model of a Geographical Information System implemented in an application that presents mine data management.
NASA Astrophysics Data System (ADS)
Huang, Yin; Chen, Jianhua; Xiong, Shaojun
2009-07-01
Mobile learning (M-learning) gives many learners the advantages of both traditional learning and E-learning. Currently, Web-based Mobile-Learning Systems have created many new ways of learning and defined new relationships between educators and learners. Association rule mining is one of the most important fields in data mining and knowledge discovery in databases. Rule explosion is a serious problem that causes great concern, as conventional mining algorithms often produce too many rules for decision makers to digest. Since a Web-based Mobile-Learning System collects vast amounts of student profile data, data mining and knowledge discovery techniques can be applied to find interesting relationships between attributes of learners, assessments, the solution strategies adopted by learners and so on. Therefore, this paper focuses on a new data-mining algorithm that combines the advantages of the genetic algorithm and the simulated annealing algorithm, called ARGSA (Association Rules based on an improved Genetic Simulated Annealing Algorithm), to mine the association rules. This paper first takes advantage of the Parallel Genetic Algorithm and Simulated Annealing Algorithm designed specifically for discovering association rules. Moreover, analysis and experiments are presented to show that the proposed method is superior to the Apriori algorithm in this Mobile-Learning system.
Effect of Temporal Relationships in Associative Rule Mining for Web Log Data
Mohd Khairudin, Nazli; Mustapha, Aida
2014-01-01
The advent of web-based applications and services has created diverse and voluminous web log data stored in web servers, proxy servers, client machines, or organizational databases. This paper attempts to investigate the effect of the temporal attribute in relational rule mining for web log data. We incorporated the characteristics of time in the rule mining process and analysed the effect of various temporal parameters. The rules generated from temporal relational rule mining are then compared against the rules generated from classical rule mining approaches such as the Apriori and FP-Growth algorithms. The results showed that by incorporating the temporal attribute via time, the number of rules generated is smaller but comparable in terms of quality. PMID:24587757
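The two measures that classical association rule miners such as Apriori rely on, support and confidence, can be sketched on a toy web log as follows. The sessions, page names and function names are hypothetical, and the sketch does not model the temporal attribute studied in the paper; a temporal variant might, for instance, restrict the transactions to a time window before computing the same measures.

```python
from typing import Iterable, List, Set

# Hypothetical web sessions: each set holds the pages visited in one session.
sessions: List[Set[str]] = [
    {"home", "products", "cart"},
    {"home", "products"},
    {"home", "blog"},
    {"products", "cart"},
    {"home", "products", "cart"},
]

def support(itemset: Iterable[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item in the itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[Set[str]]) -> float:
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Rule under test: sessions that visit "products" also visit "cart".
rule_support = support({"products", "cart"}, sessions)
rule_confidence = confidence({"products"}, {"cart"}, sessions)
print(rule_support, rule_confidence)
```

In this toy log the rule holds in three of five sessions, and in three of the four sessions that visit the products page.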
Educational Data Mining Application for Estimating Students Performance in Weka Environment
NASA Astrophysics Data System (ADS)
Gowri, G. Shiyamala; Thulasiram, Ramasamy; Amit Baburao, Mahindra
2017-11-01
Educational data mining (EDM) is a multi-disciplinary research area that applies artificial intelligence, statistical modeling and data mining to the data generated by an educational institution. EDM uses computational approaches to explain educational information with the end goal of examining educational questions. To make a country stand out among the other nations of the world, the education system has to undergo a major transition by redesigning its framework. The concealed patterns and data in various information repositories can be extracted by adopting the techniques of data mining. In order to summarize the performance of students along with their credentials, we examine the use of data mining in the field of academics. The Apriori algorithm is applied extensively to the database of students for a wider classification based on various categories. The K-means procedure is applied to the same set of databases in order to group the students into specific categories. The Apriori algorithm mines rules in order to extract similar patterns and their associations across various sets of records. The records can be extracted from academic information repositories. The parameters used in this study give more importance to psychological traits than to academic features. Undesirable student conduct can be clearly observed by making use of data mining frameworks. Thus, the algorithms prove efficient at profiling students in any educational environment. The ultimate objective of the study is to predict whether a student is prone to violence or not.
DOE Office of Scientific and Technical Information (OSTI.GOV)
John McCord
2007-09-01
This report documents transport data and data analyses for Yucca Flat/Climax Mine CAU 97. The purpose of the data compilation and related analyses is to provide the primary reference to support parameterization of the Yucca Flat/Climax Mine CAU transport model. Specific task objectives were as follows: • Identify and compile currently available transport parameter data and supporting information that may be relevant to the Yucca Flat/Climax Mine CAU. • Assess the level of quality of the data and associated documentation. • Analyze the data to derive expected values and estimates of the associated uncertainty and variability. The scope of this document includes the compilation and assessment of data and information relevant to transport parameters for the Yucca Flat/Climax Mine CAU subsurface within the context of unclassified source-term contamination. Data types of interest include mineralogy, aqueous chemistry, matrix and effective porosity, dispersivity, matrix diffusion, matrix and fracture sorption, and colloid-facilitated transport parameters.
A comprehensive review on privacy preserving data mining.
Aldeen, Yousra Abdul Alsahib S; Salleh, Mazleena; Razzaque, Mohammad Abdur
2015-01-01
Preservation of privacy in data mining has emerged as an absolute prerequisite for exchanging confidential information in terms of data analysis, validation, and publishing. Ever-escalating internet phishing has posed a severe threat to the widespread propagation of sensitive information over the web. Conversely, doubts and disputes about how reliably data are protected from disclosure make various information providers unwilling to share, which often results in outright refusal to share data or in the sharing of incorrect information. This article provides a panoramic overview of new perspectives and a systematic interpretation of a list of published literature through its meticulous organization into subcategories. The fundamental notions of the existing privacy preserving data mining methods, their merits, and shortcomings are presented. The current privacy preserving data mining techniques are classified based on distortion, association rule, hide association rule, taxonomy, clustering, associative classification, outsourced data mining, distributed, and k-anonymity, and their notable advantages and disadvantages are emphasized. This careful scrutiny reveals the past development, present research challenges, future trends, and the gaps and weaknesses. Further significant enhancements for more robust privacy protection and preservation are affirmed to be mandatory.
[Research of bleeding volume and method in blood-letting acupuncture therapy based on data mining].
Liu, Xin; Jia, Chun-Sheng; Wang, Jian-Ling; Du, Yu-Zhu; Zhang, Xiao-Xu; Shi, Jing; Li, Xiao-Feng; Sun, Yan-Hui; Zhang, Shen; Zhang, Xuan-Ping; Gang, Wei-Juan
2014-03-01
Using computer-based technology and data mining methods, association rule mining was applied to cases of blood-letting acupuncture therapy collected from the literature as sample data. The data were entered, arranged and summarized on a self-built database platform, and the required data were then extracted to mine the bleeding volume and method in blood-letting acupuncture therapy, summarizing its application rules and clinical value to provide better guidance for clinical practice. There were 9 kinds of blood-letting tools in the literature, of which the three-edge needle was used most frequently, accounting for 84.4% (1239/1468). The bleeding volume was classified into six levels, of which the smallest volume (less than 0.1 mL) had the highest frequency (401 times). According to the results of the data mining, blood-letting acupuncture therapy was widely applied in clinical acupuncture practice, with the three-edge needle and a small blood volume (less than 0.1 mL) being the most common; however, there was no central tendency in general.
Characterization of a mine fire using atmospheric monitoring system sensor data
Yuan, L.; Thomas, R.A.; Zhou, L.
2017-01-01
Atmospheric monitoring systems (AMS) have been widely used in underground coal mines in the United States for the detection of fire in the belt entry and the monitoring of other ventilation-related parameters such as airflow velocity and methane concentration in specific mine locations. In addition to an AMS being able to detect a mine fire, the AMS data have the potential to provide fire characteristic information such as fire growth — in terms of heat release rate — and exact fire location. Such information is critical in making decisions regarding fire-fighting strategies, underground personnel evacuation and optimal escape routes. In this study, a methodology was developed to calculate the fire heat release rate using AMS sensor data for carbon monoxide concentration, carbon dioxide concentration and airflow velocity based on the theory of heat and species transfer in ventilation airflow. Full-scale mine fire experiments were then conducted in the Pittsburgh Mining Research Division’s Safety Research Coal Mine using an AMS with different fire sources. Sensor data collected from the experiments were used to calculate the heat release rates of the fires using this methodology. The calculated heat release rate was compared with the value determined from the mass loss rate of the combustible material using a digital load cell. The experimental results show that the heat release rate of a mine fire can be calculated using AMS sensor data with reasonable accuracy. PMID:28845058
Process mining in oncology using the MIMIC-III dataset
NASA Astrophysics Data System (ADS)
Prima Kurniati, Angelina; Hall, Geoff; Hogg, David; Johnson, Owen
2018-03-01
Process mining is a data analytics approach to discover and analyse process models based on the real activities captured in information systems. There is a growing body of literature on process mining in healthcare, including oncology, the study of cancer. In earlier work we found 37 peer-reviewed papers describing process mining research in oncology with a regular complaint being the limited availability and accessibility of datasets with suitable information for process mining. Publicly available datasets are one option and this paper describes the potential to use MIMIC-III, for process mining in oncology. MIMIC-III is a large open access dataset of de-identified patient records. There are 134 publications listed as using the MIMIC dataset, but none of them have used process mining. The MIMIC-III dataset has 16 event tables which are potentially useful for process mining and this paper demonstrates the opportunities to use MIMIC-III for process mining in oncology. Our research applied the L* lifecycle method to provide a worked example showing how process mining can be used to analyse cancer pathways. The results and data quality limitations are discussed along with opportunities for further work and reflection on the value of MIMIC-III for reproducible process mining research.
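A common starting point for discovering a process model from an event log is to count which activity directly follows which within each case. The sketch below illustrates that idea only; it is not the L* lifecycle method and does not use the MIMIC-III schema. The event log, patient identifiers and activity names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical event log rows: (case id, timestamp, activity).
event_log: List[Tuple[str, int, str]] = [
    ("p1", 1, "admission"),
    ("p1", 2, "chemotherapy"),
    ("p1", 3, "discharge"),
    ("p2", 1, "admission"),
    ("p2", 2, "surgery"),
    ("p2", 3, "chemotherapy"),
    ("p2", 4, "discharge"),
]

def directly_follows(log: List[Tuple[str, int, str]]) -> Counter:
    """Count how often activity a is immediately followed by b in the same case."""
    traces: Dict[str, List[str]] = defaultdict(list)
    # Order events by case and then by timestamp to rebuild each trace.
    for case_id, _timestamp, activity in sorted(log, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)
    counts: Counter = Counter()
    for activities in traces.values():
        # Pair each activity with its successor in the trace.
        counts.update(zip(activities, activities[1:]))
    return counts

dfg = directly_follows(event_log)
for (source, target), n in sorted(dfg.items()):
    print(f"{source} -> {target}: {n}")
```

The resulting counts form the edges of a directly-follows graph, which process discovery tools typically refine into a process model.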
Methane Content Estimation in DuongHuy Coal Mine
NASA Astrophysics Data System (ADS)
Nguyen, Van Thinh; Mijał, Waldemar; Dang, Vu Chi; Nguyen, Thi Tuyet Mai
2018-03-01
Methane hazard has always been a concern in underground coal mining, as it can lead to methane explosions. In Quang Ninh province, several coal mines, such as Mạo Khe coal mine, Khe Cham coal mine and especially Duong Huy mine, have high methane content. Experimental data on the methane content of coal seams at different depths are not consistent in Duong Huy coal mine. In order to ensure safety, this report has been undertaken to determine the pattern of change in the methane content of coal seams at different exploitation depths in Duong Huy underground coal mine.
Monitoring of the mercury mining site Almadén implementing remote sensing technologies.
Schmid, Thomas; Rico, Celia; Rodríguez-Rastrero, Manuel; José Sierra, María; Javier Díaz-Puente, Fco; Pelayo, Marta; Millán, Rocio
2013-08-01
The Almadén area in Spain has a long history of mercury mining with prolonged human-induced activities that are related to mineral extraction and metallurgical processes before the closure of the mines and a more recent post period dominated by projects that reclaim the mine dumps and tailings and recuperate the entire mining area. Furthermore, socio-economic alternatives such as crop cultivation, livestock breeding and tourism are increasing in the area. Until now, only scattered information on these activities has been available from specific studies. However, improved acquisition systems using satellite borne data in the last decades open up new possibilities to periodically study an area of interest. Therefore, comparing the influence of these activities on the environment and monitoring their impact on the ecosystem vastly improves decision making for the public policy makers to implement appropriate land management measures and control environmental degradation. The objective of this work is to monitor environmental changes affected by human-induced activities within the Almadén area occurring before, during and after the mine closure over a period of nearly three decades. To achieve this, data from numerous sources at different spatial scales and time periods are implemented into a methodology based on advanced remote sensing techniques. This includes field spectroradiometry measurements, laboratory analyses and satellite borne data of different surface covers to detect land cover and use changes throughout the mining area. Finally, monitoring results show that the distribution of areas affected by mercury mining is rapidly diminishing since activities ceased and that rehabilitated mining areas form a new landscape. This refers to mine tailings that have been sealed and revegetated as well as an open pit mine that has been converted to an "artificial" lake surface.
Implementing a methodology based on remote sensing techniques that integrate data from several sources at different scales greatly improves the regional characterization and monitoring of an area dominated by mercury mining activities. Copyright © 2013 Elsevier Inc. All rights reserved.
The Coal Data Browser gives users easy access to coal information from EIA's electricity and coal surveys as well as data from the Mine Safety and Health Administration and trade information from the U.S. Census Bureau. Users can also see the shipment data from individual mines that deliver coal to the U.S. electric power fleet, track supplies delivered to a given power plant, and see which mines serve each particular plant.
ERIC Educational Resources Information Center
Stamper, John, Ed.; Pardos, Zachary, Ed.; Mavrikis, Manolis, Ed.; McLaren, Bruce M., Ed.
2014-01-01
The 7th International Conference on Educational Data Mining, held on July 4th-7th, 2014, at the Institute of Education, London, UK, is the leading international forum for high-quality research that mines large data sets in order to answer educational research questions that shed light on the learning process. These data sets may come from the traces…
A Contextualized, Differential Sequence Mining Method to Derive Students' Learning Behavior Patterns
ERIC Educational Resources Information Center
Kinnebrew, John S.; Loretz, Kirk M.; Biswas, Gautam
2013-01-01
Computer-based learning environments can produce a wealth of data on student learning interactions. This paper presents an exploratory data mining methodology for assessing and comparing students' learning behaviors from these interaction traces. The core algorithm employs a novel combination of sequence mining techniques to identify differentially…
Application of data mining in science and technology management information system based on WebGIS
NASA Astrophysics Data System (ADS)
Wu, Xiaofang; Xu, Zhiyong; Bao, Shitai; Chen, Feixiang
2009-10-01
With the rapid development of science and technology and the rapid growth of information, a great deal of data has accumulated in science and technology management departments. Much knowledge and many rules are usually contained and concealed in these data, so how to extract and fully use this knowledge is very important in the management of science and technology. Doing so will help to examine and approve science and technology projects more scientifically and make it easier to transform achievements into real productive forces. Therefore, in this paper data mining technology is studied and applied to the science and technology management information system to find and extract this knowledge. Based on an analysis of the disadvantages of traditional science and technology management information systems, database technology, data mining and web geographic information systems (WebGIS) technology are introduced to develop and construct a science and technology management information system based on WebGIS. Key problems such as data mining and statistical analysis are researched in detail. In addition, a prototype system is developed and validated using project data of the National Natural Science Foundation Committee. Spatial data mining is carried out along the axes of time, space and other factors. A variety of knowledge and rules is then extracted using data mining technology, which helps to provide effective support for decision making.
A software tool for determination of breast cancer treatment methods using data mining approach.
Cakır, Abdülkadir; Demirel, Burçin
2011-12-01
In this work, breast cancer treatment methods are determined using data mining. For this purpose, software is developed to help oncologists by suggesting treatment methods for breast cancer patients. Data from 462 breast cancer patients, obtained from Ankara Oncology Hospital, are used to determine treatment methods for new patients. This dataset is processed with the Weka data mining tool. Classification algorithms are applied one by one to this dataset and the results are compared to find the proper treatment method. The developed software program, called "Treatment Assistant", has a Java NetBeans interface and uses different algorithms (IB1, Multilayer Perceptron and Decision Table) to find out which one gives the better result for each attribute to be predicted. Treatment methods are determined for breast cancer patients after surgery using this software tool. At the modeling step of the data mining process, different Weka algorithms are used for the output attributes. IB1 shows the best accuracy for the hormonotherapy output, Multilayer Perceptron for the tamoxifen and radiotherapy outputs, and Decision Table for the chemotherapy output. In conclusion, this work shows that a data mining approach can be a useful tool for medical applications, particularly at the treatment decision step. Data mining helps the doctor to decide in a short time.
From data mining rules to medical logical modules and medical advices.
Gomoi, Valentin; Vida, Mihaela; Robu, Raul; Stoicu-Tivadar, Vasile; Bernad, Elena; Lupşe, Oana
2013-01-01
Using data mining in collaboration with Clinical Decision Support Systems adds new knowledge as support for medical diagnosis. The current work presents a tool that translates data mining rules supporting the generation of medical advice into the Arden Syntax formalism. The developed system was tested with data related to 2326 births that took place in 2010 at the Bega Obstetrics - Gynaecology Hospital, Timişoara. Based on processing these data, 14 medical rules regarding the Apgar score were generated and then translated into Arden Syntax.
Clustering and Dimensionality Reduction to Discover Interesting Patterns in Binary Data
NASA Astrophysics Data System (ADS)
Palumbo, Francesco; D'Enza, Alfonso Iodice
Attention to binary data coding has increased steadily in the last decade for several reasons. The analysis of binary data characterizes several fields of application, such as market basket analysis, DNA microarray data, image mining, text mining and web-clickstream mining. The paper illustrates two different approaches exploiting a profitable combination of clustering and dimensionality reduction for the identification of non-trivial association structures in binary data. An application in the Association Rules framework supports the theory with empirical evidence.
Monitoring genotoxic exposure in uranium mines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sram, R.J.; Vesela, D.; Vesely, D.
1993-10-01
Recent data from deep uranium mines in Czechoslovakia indicated that miners are exposed to other mutagenic factors in addition to radon daughter products. Mycotoxins were identified as a possible source of mutagens in these mines. Mycotoxins were examined in 38 samples from mines and in throat swabs taken from 116 miners and 78 controls. The following mycotoxins were identified from mine samples: aflatoxins B1 and G1, citrinin, citreoviridin, mycophenolic acid, and sterigmatocystin. Some mold strains isolated from mines and throat swabs were investigated for mutagenic activity by the SOS chromotest and Salmonella assay with strains TA100 and TA98. Mutagenicity was observed, especially with metabolic activation in vitro. These data suggest that mycotoxins produced by molds in uranium mines are a new genotoxic factor in uranium miners. 17 refs., 4 tabs.
Association mining of dependency between time series
NASA Astrophysics Data System (ADS)
Hafez, Alaaeldin
2001-03-01
Time series analysis is considered a crucial component of strategic control over a broad variety of disciplines in business, science and engineering. Time series data is a sequence of observations collected over intervals of time. Each time series describes a phenomenon as a function of time. Analysis of time series data includes discovering trends (or patterns) in a time series sequence. In the last few years, data mining has emerged and been recognized as a new technology for data analysis. Data mining is the process of discovering potentially valuable patterns, associations, trends, sequences and dependencies in data. Data mining techniques can discover information that many traditional business analysis and statistical techniques fail to deliver. In this paper, we adapt and innovate data mining techniques to analyze time series data. By using data mining techniques, maximal frequent patterns are discovered and used in predicting future sequences or trends, where trends describe the behavior of a sequence. In order to include different types of time series (e.g. irregular and non-systematic), we consider past frequent patterns of the same time sequences (local patterns) and of other dependent time sequences (global patterns). We use the word 'dependent' instead of the word 'similar' to emphasize real-life time series where two time series sequences could be completely different (in values, shapes, etc.), but still react to the same conditions in a dependent way. In this paper, we propose the Dependence Mining Technique, which could be used in predicting time series sequences. The proposed technique consists of three phases: (a) for all time series sequences, generate their trend sequences, (b) discover maximal frequent trend patterns and generate pattern vectors (to keep information of frequent trend patterns), (c) use trend pattern vectors to predict future time series sequences.
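The three phases named in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' Dependence Mining Technique: the trend symbols (U, D, S), the function names, the use of contiguous fixed-length patterns, and the simple "most frequent matching pattern" predictor are all assumptions made for illustration, and the sketch counts frequent patterns rather than maximal frequent patterns.

```python
from collections import Counter


def trend_sequence(values, tolerance=0.0):
    """Phase (a): map a numeric series to trend symbols U (up), D (down), S (steady)."""
    symbols = []
    for previous, current in zip(values, values[1:]):
        change = current - previous
        if change > tolerance:
            symbols.append("U")
        elif change < -tolerance:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def frequent_patterns(trends, length, min_support):
    """Phase (b): count contiguous trend patterns of one length, keep the frequent ones."""
    counts = Counter(trends[i:i + length] for i in range(len(trends) - length + 1))
    return {pattern: n for pattern, n in counts.items() if n >= min_support}


def predict_next(trends, patterns, length):
    """Phase (c): predict the next symbol from the most frequent pattern whose
    prefix matches the last (length - 1) observed symbols. Requires length >= 2."""
    context = trends[-(length - 1):]
    candidates = {p: n for p, n in patterns.items() if p.startswith(context)}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)[-1]


# Illustrative series (invented values, not data from the paper).
series = [1, 2, 3, 2, 3, 4, 3, 4, 5, 4]
trends = trend_sequence(series)
patterns = frequent_patterns(trends, length=3, min_support=2)
next_symbol = predict_next(trends, patterns, length=3)
print(trends, patterns, next_symbol)
```

Here the series rises twice and falls once in a repeating cycle, so the last two symbols "UD" match the frequent pattern "UDU" and the predicted next trend is an upward move.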
Operating System Support for Shared Hardware Data Structures
2013-01-31
Carbon [73] uses hardware queues to improve fine-grained multitasking for Recognition, Mining, and Synthesis. Compared to software approaches...web transaction processing, data mining, and multimedia. Early work in database processors [114, 96, 79, 111] reduce the costs of relational database...assignment can be solved statically or dynamically. Static assignment determines offline which data structures are assigned to use HWDS resources and at
ERIC Educational Resources Information Center
Santos, Olga Cristina, Ed.; Boticario, Jesus Gonzalez, Ed.; Romero, Cristobal, Ed.; Pechenizkiy, Mykola, Ed.; Merceron, Agathe, Ed.; Mitros, Piotr, Ed.; Luna, Jose Maria, Ed.; Mihaescu, Cristian, Ed.; Moreno, Pablo, Ed.; Hershkovitz, Arnon, Ed.; Ventura, Sebastian, Ed.; Desmarais, Michel, Ed.
2015-01-01
The 8th International Conference on Educational Data Mining (EDM 2015) is held under the auspices of the International Educational Data Mining Society at UNED, the National University for Distance Education in Spain. The conference, held in Madrid, Spain, July 26-29, 2015, follows the seven previous editions (London 2014, Memphis 2013, Chania 2012,…
NASA Astrophysics Data System (ADS)
Walter, Diana; Wegmuller, Urs; Spreckels, Volker; Busch, Wolfgang
2008-11-01
The main objective of the projects "Determination of ground motions in mining areas by interferometric analyses of ALOS data" (ALOS ADEN 3576, ESA) and "Monitoring of mining induced surface deformation" (ALOS-RA-094, JAXA) is to evaluate PALSAR data for surface deformation monitoring, using interferometric techniques. We present monitoring results of surface movements for an active hard coal colliery of the German hard coal mining company RAG Deutsche Steinkohle (RAG). Underground mining activities lead to ground movements at the surface with maximum subsidence rates of about 10 cm per month for the test site. In these projects the L-band sensor clearly demonstrates the good potential for deformation monitoring in active mining areas, especially in rural areas. In comparison to C-band sensors we clearly observe advantages in resolving the high deformation gradients that are present in this area and we achieve a more complete spatial coverage than with C-band. Extensive validation data based on levelling data and GPS measurements are available within RAG's GIS based database "GeoMon" and thus enable an adequate analysis of the quality of the interferometric results. Previous analyses confirm the good accuracy of PALSAR data for deformation monitoring in mining areas. Furthermore, we present results of special investigations like precision geocoding of PALSAR data and corner reflector analysis. At present only DInSAR results are obtained due to the currently available number of PALSAR scenes. For the future we plan to also apply Persistent Scatterer Interferometry (PSI) using longer series of PALSAR data.
Effects of coal mining on the water resources of the Tradewater River Basin, Kentucky
Grubb, Hayes F.; Ryder, Paul D.
1973-01-01
The effects of coal-mine drainage on the water resources of the Tradewater River basin, in the Western Coal Field region of Kentucky, were evaluated (1) by synthesis and interpretation of 16 years of daily conductance data, 465 chemical analyses covering an 18-year period, 28 years of daily discharge data, and 14 years of daily suspended-sediment data from the Tradewater River at Olney and (2) by collection, synthesis, and interpretation of chemical and physical water-quality data and water-quantity data collected over a 2-year period from mined and nonmined sites in the basin. Maximum observed values of 13 chemical and physical water-quality parameters were three to 300 times greater in the discharge from mined subbasins than in the discharge from nonmined subbasins. Potassium, chloride, and nitrate concentrations were not significantly different between mined and nonmined areas. Mean sulfate loads carried by the Tradewater River at Olney were about 75 percent greater for the period 1955-67 than for the period 1952-54. Suspended-sediment loads at Olney for the November-April storm-runoff periods generally vary in response to strip-mine coal production in the basin above Olney. Streamflow is maintained during extended dry periods in mined subbasins after streams in nonmined subbasins have ceased flowing. Some possible methods of reducing the effects of mine drainage on the streams are considered in view of a geochemical model proposed by Ivan Barnes and F. E. Clarke. Use of low-flow-augmenting reservoirs and crushed limestone in streambeds in nonmined areas seems to be the most promising method for alleviating effects of mine drainage at the present time. Other aspects of the water resources such as variability of water quantity and water quality in the basin are discussed briefly.
ERIC Educational Resources Information Center
Yu, Pulan
2012-01-01
Classification, clustering and association mining are major tasks of data mining and have been widely used for knowledge discovery. Associative classification mining, the combination of both association rule mining and classification, has emerged as an indispensable way to support decision making and scientific research. In particular, it offers a…
Using Open Web APIs in Teaching Web Mining
ERIC Educational Resources Information Center
Chen, Hsinchun; Li, Xin; Chau, M.; Ho, Yi-Jen; Tseng, Chunju
2009-01-01
With the advent of the World Wide Web, many business applications that utilize data mining and text mining techniques to extract useful business information on the Web have evolved from Web searching to Web mining. It is important for students to acquire knowledge and hands-on experience in Web mining during their education in information systems…
NASA Astrophysics Data System (ADS)
Boulicaut, Jean-Francois; Jeudy, Baptiste
Knowledge Discovery in Databases (KDD) is a complex interactive process. The promising theoretical framework of inductive databases considers this to be essentially a querying process. It is enabled by a query language which can deal either with raw data or with patterns which hold in the data. Mining patterns turns out to be the so-called inductive query evaluation process, for which constraint-based data mining techniques have to be designed. An inductive query specifies declaratively the desired constraints, and algorithms are used to compute the patterns satisfying the constraints in the data. We survey important results of this active research domain. This chapter emphasizes a real breakthrough for hard problems concerning local pattern mining under various constraints and it points out the current directions of research as well.
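The idea of an inductive query, a declarative set of constraints whose answer is the set of patterns satisfying them, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the toy transactions, and the brute-force level-by-level enumeration are hypothetical and are not taken from the chapter, which surveys far more efficient constraint-based algorithms.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def inductive_query(transactions, min_support, constraint):
    """Return {itemset: support} for itemsets that are frequent AND satisfy
    the user-supplied constraint (any predicate on a frozenset)."""
    items = sorted(set().union(*transactions))
    results = {}
    for size in range(1, len(items) + 1):
        any_frequent = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = support(itemset, transactions)
            if count >= min_support:
                any_frequent = True
                if constraint(itemset):
                    results[itemset] = count
        if not any_frequent:
            # Minimum support is anti-monotone: if no itemset of this size
            # is frequent, no larger itemset can be frequent either.
            break
    return results


# Toy data (invented for illustration).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

# "Query": frequent itemsets (support >= 3) that contain item "a".
answer = inductive_query(transactions, min_support=3, constraint=lambda s: "a" in s)
for itemset, count in sorted(answer.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

The early exit relies only on the anti-monotonicity of the frequency constraint; the user constraint is applied as a filter afterwards, which keeps the sketch correct for arbitrary predicates at the cost of efficiency.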
A semantic model for multimodal data mining in healthcare information systems.
Iakovidis, Dimitris; Smailis, Christos
2012-01-01
Electronic health records (EHRs) are representative examples of multimodal/multisource data collections, including measurements, images and free texts. The diversity of such information sources and the increasing amounts of medical data produced by healthcare institutes annually pose significant challenges in data mining. In this paper we present a novel semantic model that describes knowledge extracted from the lowest level of a data mining process, where information is represented by multiple features, i.e. measurements or numerical descriptors extracted from measurements, images, texts or other medical data, forming multidimensional feature spaces. Knowledge collected by manual annotation or extracted by unsupervised data mining from one or more feature spaces is modeled through generalized qualitative spatial semantics. This model enables a unified representation of knowledge across multimodal data repositories. It contributes to bridging the semantic gap by enabling direct links between low-level features and higher-level concepts, e.g. describing body parts, anatomies and pathological findings. The proposed model has been developed in web ontology language based on description logics (OWL-DL) and can be applied to a variety of data mining tasks in medical informatics. Its utility is demonstrated for automatic annotation of medical data.
A study of mining-induced seismicity in Czech mines with longwall coal exploitation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holub, K.
2007-01-15
A review is performed of the data from local and regional seismographical networks installed in mines of the Ostrava-Karvina Coal Basin (Czech Republic), where underground anthracite mining is carried out and dynamic events occur in the form of rockbursts. The seismological and seismoacoustic observation data obtained in panels that are in a limiting state are analyzed. This aggregate information is a basis for determining hazardous zones and assigning rockburst prevention measures.
Nahar, Jesmin; Imam, Tasadduq; Tickle, Kevin S; Garcia-Alonso, Debora
2013-01-01
This chapter is a review of data mining techniques used in medical research. It will cover the existing applications of these techniques in the identification of diseases, and also present the authors' research experiences in medical disease diagnosis and analysis. A computational diagnosis approach can have a significant impact on accurate diagnosis and result in time and cost effective solutions. The chapter will begin with an overview of computational intelligence concepts, followed by details on different classification algorithms. Use of association learning, a well recognised data mining procedure, will also be discussed. Many of the datasets considered in existing medical data mining research are imbalanced, and the chapter focuses on this issue as well. Lastly, the chapter outlines the need for data governance in this research domain.
NASA Astrophysics Data System (ADS)
Kim, Kwang Hyeon; Lee, Suk; Shim, Jang Bo; Chang, Kyung Hwan; Yang, Dae Sik; Yoon, Won Sup; Park, Young Je; Kim, Chul Yong; Cao, Yuan Jie
2017-08-01
The aim of this preliminary study is to integrate text-based data mining with a toxicity prediction modeling system for clinical decision support based on big data in radiation oncology. Structured and unstructured data were prepared from treatment plans, and the unstructured data were extracted by image pattern recognition of dose-volume data for prostate cancer from research articles crawled from the internet. We modeled an artificial neural network to build a predictor model system for toxicity prediction of organs at risk. We used a text-based data mining approach to build the artificial neural network model for bladder and rectum complication predictions. The pattern recognition method was used to mine the unstructured toxicity data for dose-volume at a detection accuracy of 97.9%. The confusion matrix and training model of the neural network were obtained with 50 modeled plans (n = 50) for validation. The toxicity level was analyzed and the risk factors for 25% bladder, 50% bladder, 20% rectum, and 50% rectum were calculated by the artificial neural network algorithm. As a result, among the 50 modeled plans, 32 plans could cause complications while 18 plans were designated as non-complication. We integrated data mining and a toxicity modeling method for toxicity prediction using prostate cancer cases. It is shown that a preprocessing analysis using text-based data mining and prediction modeling can be expanded to personalized patient treatment decision support based on big data.
Tahmasebian, Shahram; Ghazisaeedi, Marjan; Langarizadeh, Mostafa; Mokhtaran, Mehrshad; Mahdavi-Mazdeh, Mitra; Javadian, Parisa
2017-01-01
Introduction: Chronic kidney disease (CKD) includes a wide range of pathophysiological processes which are observed along with abnormal function of the kidneys and a progressive decrease in glomerular filtration rate (GFR). By definition, the decrease in GFR must have been present for at least three months. CKD will eventually result in end-stage kidney disease. Different factors play a role in this process, and finding the relations between the effective parameters can help to prevent or slow the progression of this disease. A lot of data are always being collected from patients' medical records. This huge array of data can be considered a valuable source for analyzing, exploring and discovering information. Objectives: Using data mining techniques, the present study tries to specify the effective parameters and also aims to determine their relations with each other in Iranian patients with CKD. Material and Methods: The study population includes 31996 patients with CKD. First, all of the data were registered in the database. Then data mining tools were used to find the hidden rules and relationships between parameters in the collected data. Results: After data cleaning based on the CRISP-DM (Cross Industry Standard Process for Data Mining) methodology and running mining algorithms on the data in the database, the relationships between the effective parameters were specified. Conclusion: This study was done using the data mining method pertaining to the effective factors on patients with CKD. PMID:28497080
Surface-water quality of coal-mine lands in Raccoon Creek Basin, Ohio
Wilson, K.S.
1985-01-01
The Ohio Department of Natural Resources, Division of Reclamation, plans to reclaim abandoned surface mines in the Raccoon Creek watershed in southern Ohio. Historic water-quality data collected between 1975 and 1983 were compiled and analyzed in terms of eight selected mine-drainage characteristics to develop a data base for individual subbasin reclamation projects. Areas of mine drainage affecting Raccoon Creek basin, the study Sandy Run basin, the Hewett Fork basin, and the Little Raccoon Creek basin. Surface-water-quality samples were collected from a 41-site network from November 1 through November 3, 1983. Results of the sampling reaffirmed that the major sources of mine drainage to Raccoon Creek are in the Little Raccoon Creek basin and the Hewett Fork basin. However, water quality at the mouth of Sandy Run indicated that it is not a source of mine drainage to Raccoon Creek. Buffer Run, Goose Run, an unnamed tributary to Little Raccoon Creek, Mulga Run, and Sugar Run were the main sources of mine drainage sampled in the Little Raccoon Creek basin. All sites sampled in the East Branch Raccoon Creek basin were affected by mine drainage. This information was used to prepare a work plan for additional data collection before, during, and after reclamation. The data will be used to define the effectiveness of reclamation effects in the basin.
Underground coal mining section data
NASA Technical Reports Server (NTRS)
Gabrill, C. P.; Urie, J. T.
1981-01-01
A set of tables which display the allocation of time for ten personnel and eight pieces of underground coal mining equipment to ten function categories is provided. Data from 125 full shift time studies contained in the KETRON database were utilized as the primary source data. The KETRON activity and delay codes were mapped onto JPL equipment, personnel and function categories. Computer processing was then performed to aggregate the shift level data and generate the matrices. Additional documented time study data were analyzed and used to supplement the KETRON database. The source data, including the number of shifts, are described. Specific parameters of the mines from which these data were extracted are presented. The result of the data processing, including the required JPL matrices, is presented. A brief comparison with a time study analysis of continuous mining systems is presented. The procedures used for processing the source data are described.
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine
Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.
2016-01-01
We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564
Hibernacula selection by Townsend's big-eared bat in Southwestern Colorado
Hayes, Mark A.; Schorr, Robert A.; Navo, Kirk W.
2011-01-01
In the western United States, both mine reclamations and renewed mining at previously abandoned mines have increased substantially in the last decade. This increased activity may adversely impact bats that use these mines for roosting. Townsend's big-eared bat (Corynorhinus townsendii) is a species of conservation concern that may be impacted by ongoing mine reclamation and renewed mineral extraction. To help inform wildlife management decisions related to bat use of abandoned mine sites, we used logistic regression, Akaike's information criterion, and multi-model inference to investigate hibernacula use by Townsend's big-eared bats using 9 years of data from surveys inside abandoned mines in southwestern Colorado. Townsend's big-eared bats were found in 38 of 133 mines surveyed (29%), and occupied mines averaged 2.6 individuals per mine. The model explaining the most variability in our data included number of openings and portal temperature at abandoned mines. In southwestern Colorado, we found that abandoned mine sites with more than one opening and portal temperatures near 0°C were more likely to contain hibernating Townsend's big-eared bats. However, mines with only one opening and portal temperatures of ≥10°C were occasionally occupied by Townsend's big-eared bat. Understanding mine use by Townsend's big-eared bat can help guide decisions regarding allocation of resources and placement of bat-compatible closures at mine sites scheduled for reclamation. When feasible, we believe that surveys should be conducted inside all abandoned mines in a reclamation project at least once during winter prior to making closure and reclamation recommendations.
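The model-selection step mentioned in the abstract above (Akaike's information criterion with multi-model inference) can be sketched in a few lines of Python. The formulas are the standard ones, AIC = 2k - 2 ln L and Akaike weights w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2), but the candidate model names, log-likelihoods and parameter counts below are invented purely for illustration and are not results from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's information criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model, normalised to sum to 1."""
    best = min(aic_values)
    relative = [math.exp(-(value - best) / 2) for value in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical candidate logistic-regression models:
# name -> (maximised log-likelihood, number of estimated parameters).
candidates = {
    "openings + portal_temp": (-60.0, 3),
    "openings": (-64.0, 2),
    "portal_temp": (-66.0, 2),
}

names = list(candidates)
aic_values = [aic(ll, k) for ll, k in candidates.values()]
weights = akaike_weights(aic_values)
best_model = names[aic_values.index(min(aic_values))]

for name, value, weight in zip(names, aic_values, weights):
    print(f"{name:25s} AIC={value:7.1f} weight={weight:.3f}")
```

In multi-model inference the weights, rather than a single "winning" model, are typically used to judge how much support each set of predictors has relative to the others in the candidate set.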
Moyle, Phillip R.; Kayser, Helen Z.
2006-01-01
This report describes the spatial database, PHOSMINE01, and the processes used to delineate mining-related features (active and inactive/historical) in the core of the southeastern Idaho phosphate resource area. The spatial data have varying degrees of accuracy and attribution detail. Classification of areas by type of mining-related activity at active mines is generally detailed; however, for many of the closed or inactive mines the spatial coverage does not differentiate mining-related surface disturbance features. Nineteen phosphate mine sites are included in the study, three active phosphate mines - Enoch Valley (nearing closure), Rasmussen Ridge, and Smoky Canyon - and 16 inactive (or historical) phosphate mines - Ballard, Champ, Conda, Diamond Gulch, Dry Valley, Gay, Georgetown Canyon, Henry, Home Canyon, Lanes Creek, Maybe Canyon, Mountain Fuel, Trail Canyon, Rattlesnake, Waterloo, and Wooley Valley. Approximately 6,000 ha (15,000 ac), or 60 km² (23 mi²), of phosphate mining-related surface disturbance are documented in the spatial coverage. Spatial data for the inactive mines are current because no major changes have occurred; however, the spatial data for active mines were derived from digital maps prepared in early 2001 and therefore recent activity is not included. The inactive Gay Mine has the largest total area of disturbance, 1,900 ha (4,700 ac) or about 19 km² (7.4 mi²). It encompasses over three times the disturbance area of the next largest mine, the Conda Mine with 610 ha (1,500 ac), and it is nearly four times the area of the Smoky Canyon Mine, the largest of the active mines with about 550 ha (1,400 ac).
The wide range of phosphate mining-related surface disturbance features (141) from various industry maps were reduced to 15 types or features based on a generic classification system used for this study: mine pit; backfilled mine pit; waste rock dump; adit and waste rock dump; ore stockpile; topsoil stockpile; tailings or tailings pond; sediment catchment; facilities; road; railroad; water reservoir; disturbed land, undifferentiated; and undisturbed land. In summary, the spatial coverage includes polygons totaling about 1,100 ha (2,800 ac) of mine pits, 440 ha (1,100 ac) of backfilled mine pits, 1,600 ha (3,800 ac) of waste rock dumps, 31 ha (75 ac) of ore stockpiles, and 44 ha (110 ac) of tailings or tailings ponds. Areas of undifferentiated phosphate mining-related land disturbances, called 'disturbed land, undifferentiated,' total about 2,200 ha (5,500 ac) or nearly 22 km² (8.6 mi²). No determination has been made as to status of reclamation on any of the lands. Subsequent site-specific studies to delineate distinct mine features will allow additional revisions to this spatial database.
Multivariate Spatial Condition Mapping Using Subtractive Fuzzy Cluster Means
Sabit, Hakilo; Al-Anbuky, Adnan
2014-01-01
Wireless sensor networks are usually deployed for monitoring given physical phenomena taking place in a specific space and over a specific duration of time. The spatio-temporal distribution of these phenomena often correlates to certain physical events. To appropriately characterise these events-phenomena relationships over a given space for a given time frame, we require continuous monitoring of the conditions. WSNs are perfectly suited for these tasks, due to their inherent robustness. This paper presents a subtractive fuzzy cluster means algorithm and its application in data stream mining for wireless sensor systems over a cloud-computing-like architecture, which we call sensor cloud data stream mining. Benchmarking on standard mining algorithms, the k-means and the FCM algorithms, we have demonstrated that the subtractive fuzzy cluster means model can perform high quality distributed data stream mining tasks comparable to centralised data stream mining. PMID:25313495
Profitability and occupational injuries in U.S. underground coal mines☆
Asfaw, Abay; Mark, Christopher; Pana-Cryan, Regina
2015-01-01
Background Coal plays a crucial role in the U.S. economy yet underground coal mining continues to be one of the most dangerous occupations in the country. In addition, there are large variations in both profitability and the incidence of occupational injuries across mines. Objective The objective of this study was to examine the association between profitability and the incidence rate of occupational injuries in U.S. underground coal mines between 1992 and 2008. Data and method We used mine-specific data on annual hours worked, geographic location, and the number of occupational injuries suffered annually from the employment and accident/injury databases of the Mine Safety and Health Administration, and mine-specific data on annual revenue from coal sales, mine age, workforce union status, and mining method from the U.S. Energy Information Administration. A total of 5669 mine-year observations (number of mines × number of years) were included in our analysis. We used a negative binomial random effects model that was appropriate for analyzing panel (combined time-series and cross-sectional) injury data that were non-negative and discrete. The dependent variable, occupational injury, was measured in three different and non-mutually exclusive ways: all reported fatal and nonfatal injuries, reported nonfatal injuries with lost workdays, and the ‘most serious’ (i.e. sum of fatal and serious nonfatal) injuries reported. The total number of hours worked in each mine and year examined was used as an exposure variable. Profitability, the main explanatory variable, was approximated by revenue per hour worked. Our model included mine age, workforce union status, mining method, and geographic location as additional control variables. 
Results After controlling for other variables, a 10% increase in real total revenue per hour worked was associated with 0.9%, 1.1%, and 1.6% decrease, respectively, in the incidence rates of all reported injuries, reported injuries with lost workdays, and the most serious injuries reported. Conclusion We found an inverse relationship between profitability and each of the three indicators of occupational injuries we used. These results might be partially due to factors that affect both profitability and safety, such as management or engineering practices, and partially due to lower investments in safety by less profitable mines, which could imply that some financially stressed mines might be so focused on survival that they forgo investing in safety. PMID:22884379
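The reported percentages can be read as elasticities, assuming (the abstract does not say so) that revenue per hour worked enters the model in logarithms, so that the injury rate scales as revenue raised to a power b. A minimal sketch of that arithmetic; the function names are hypothetical and only the three percentages come from the abstract:

```python
import math

def implied_elasticity(pct_change_x, pct_change_rate):
    """Exponent b such that rate ~ x**b reproduces the stated percentage changes.

    Assumes a log-log relationship, which is an illustration-only assumption.
    """
    return math.log(1 + pct_change_rate / 100) / math.log(1 + pct_change_x / 100)

def predicted_change(b, pct_change_x):
    """Percentage change in the rate for a given percentage change in x."""
    return ((1 + pct_change_x / 100) ** b - 1) * 100

# Associations reported in the abstract for a 10% increase in revenue per hour.
reported = {
    "all injuries": -0.9,
    "lost-workday injuries": -1.1,
    "most serious injuries": -1.6,
}
elasticities = {name: implied_elasticity(10, pct) for name, pct in reported.items()}

for name, b in elasticities.items():
    print(f"{name}: implied elasticity {b:.3f}")
```

Under this reading each elasticity is small and negative (roughly -0.1 to -0.2), so even large revenue differences would correspond to modest differences in injury rates.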
Data Mining Techniques for Customer Relationship Management
NASA Astrophysics Data System (ADS)
Guo, Feng; Qin, Huilin
2017-10-01
Data mining has made customer relationship management (CRM) a new area where firms can gain a competitive advantage, and it plays a key role in firms’ management decisions. In this paper, we first analyze the value and application fields of data mining techniques for CRM, and then explore how data mining is applied to customer churn analysis. A new business culture is developing today. The conventional production-centered, sales-oriented market strategy is gradually shifting to a customer-centered, service-oriented one. Customers’ value orientation increasingly affects that of firms, and customer resources have become one of the most important strategic resources. Therefore, understanding customers’ needs and identifying the customers who contribute the most has become the driving force of most modern businesses.
Data mining for signals in spontaneous reporting databases: proceed with caution.
Stephenson, Wendy P; Hauben, Manfred
2007-04-01
To provide commentary and points of caution to consider before incorporating data mining as a routine component of any Pharmacovigilance program, and to stimulate further research aimed at better defining the predictive value of these new tools as well as their incremental value as an adjunct to traditional methods of post-marketing surveillance. Commentary includes review of current data mining methodologies employed and their limitations, caveats to consider in the use of spontaneous reporting databases and caution against over-confidence in the results of data mining. Future research should focus on more clearly delineating the limitations of the various quantitative approaches as well as the incremental value that they bring to traditional methods of pharmacovigilance.
Buried landmine detection using multivariate normal clustering
NASA Astrophysics Data System (ADS)
Duston, Brian M.
2001-10-01
A Bayesian classification algorithm is presented for discriminating buried land mines from buried and surface clutter in Ground Penetrating Radar (GPR) signals. This algorithm is based on multivariate normal (MVN) clustering, where feature vectors are used to identify populations (clusters) of mines and clutter objects. The features are extracted from two-dimensional images created from ground penetrating radar scans. MVN clustering is used to determine the number of clusters in the data and to create probability density models for target and clutter populations, producing the MVN clustering classifier (MVNCC). The Bayesian Information Criteria (BIC) is used to evaluate each model to determine the number of clusters in the data. An extension of the MVNCC allows the model to adapt to local clutter distributions by treating each of the MVN cluster components as a Poisson process and adaptively estimating the intensity parameters. The algorithm is developed using data collected by the Mine Hunter/Killer Close-In Detector (MH/K CID) at prepared mine lanes. The Mine Hunter/Killer is a prototype mine detecting and neutralizing vehicle developed for the U.S. Army to clear roads of anti-tank mines.
Developing Cyberspace Data Understanding: Using CRISP-DM for Host-based IDS Feature Mining
2010-03-01
Developing Cyberspace Data Understanding: Using CRISP-DM for Host-based IDS Feature Mining THESIS Joseph R. Erskine, Captain, USAF AFIT/GCS/ENG/10-01...Air Force, Department of Defense, or the United States Government. AFIT/GCS/ENG/10-01 Developing Cyberspace Data Understanding: Using CRISP-DM for...Developing Cyberspace Data Understanding: Using CRISP-DM for Host-based IDS Feature Mining Joseph R. Erskine, B.S.C.S. Captain, USAF Approved: /signed/ 12
Multiagent data warehousing and multiagent data mining for cerebrum/cerebellum modeling
NASA Astrophysics Data System (ADS)
Zhang, Wen-Ran
2002-03-01
An algorithm named Neighbor-Miner is outlined for multiagent data warehousing and multiagent data mining. The algorithm is defined in an evolving dynamic environment with autonomous or semiautonomous agents. Instead of mining frequent itemsets from customer transactions, the new algorithm discovers new agents and mining agent associations in first-order logic from agent attributes and actions. While the Apriori algorithm uses frequency as an a priori threshold, the new algorithm uses agent similarity as prior knowledge. The concept of agent similarity leads to the notions of agent cuboid, orthogonal multiagent data warehousing (MADWH), and multiagent data mining (MADM). Based on agent similarities and action similarities, Neighbor-Miner is proposed and illustrated in a MADWH/MADM approach to cerebrum/cerebellum modeling. It is shown that (1) semiautonomous neurofuzzy agents can be identified for uniped locomotion and gymnastic training based on attribute relevance analysis; (2) new agents can be discovered and agent cuboids can be dynamically constructed in an orthogonal MADWH, which resembles an evolving cerebrum/cerebellum system; and (3) dynamic motion laws can be discovered as association rules in first-order logic. Although examples in legged robot gymnastics are used to illustrate the basic ideas, the new approach is generally suitable for a broad category of data mining tasks where knowledge can be discovered collectively by a set of agents from a geographically or geometrically distributed but relevant environment, especially in scientific and engineering data environments.
NASA Astrophysics Data System (ADS)
Kadampur, Mohammad Ali; D. v. L. N., Somayajulu
Privacy-preserving data mining is the art of knowledge discovery without revealing the sensitive data in the data set. In this paper, a data transformation technique using wavelets is presented for privacy-preserving data mining. Wavelets use the well-known energy compaction approach during data transformation, and only the high-energy coefficients are published to the public domain instead of the actual data. It is found that the transformed data preserve the Euclidean distances and that the method can be used in privacy-preserving clustering. Wavelets also offer inherently improved time complexity.
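The distance-preservation claim follows from the orthonormality of the transform. The sketch below illustrates it with a Haar wavelet; the abstract names neither the wavelet family nor the rule for selecting coefficients, so Haar and "keep the first k coefficients" are assumptions made purely for illustration, as are the sample vectors.

```python
import math

def haar(values):
    """Full orthonormal Haar transform; len(values) must be a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        # Scaled pairwise sums (coarse part) and differences (detail part).
        avg = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        dif = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = avg + dif
        n = half
    return out

def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical records, not taken from the paper.
a = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
b = [5.0, 7.0, 9.0, 13.0, 7.0, 7.0, 4.0, 6.0]
ha, hb = haar(a), haar(b)

full = dist(ha, hb)               # equals dist(a, b) up to rounding
published = dist(ha[:4], hb[:4])  # distance using only 4 published coefficients
print(dist(a, b), full, published)
```

Publishing a subset of coefficients can only shrink the computed distance, so a clustering run on the published data sees an approximation of the original geometry without access to the raw values.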
Sanmiquel, Lluís; Bascompta, Marc; Rossell, Josep M.; Anticoi, Hernán Francisco; Guash, Eduard
2018-01-01
An analysis of occupational accidents in the mining sector was conducted using the data from the Spanish Ministry of Employment and Social Safety between 2005 and 2015, and data-mining techniques were applied. Data was processed with the software Weka. Two scenarios were chosen from the accidents database: surface and underground mining. The most important variables involved in occupational accidents and their association rules were determined. These rules are composed of several predictor variables that cause accidents, defining its characteristics and context. This study exposes the 20 most important association rules in the sector—either surface or underground mining—based on the statistical confidence levels of each rule as obtained by Weka. The outcomes display the most typical immediate causes, along with the percentage of accidents with a basis in each association rule. The most important immediate cause is body movement with physical effort or overexertion, and the type of accident is physical effort or overexertion. On the other hand, the second most important immediate cause and type of accident are different between the two scenarios. Data-mining techniques were chosen as a useful tool to find out the root cause of the accidents. PMID:29518921
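Ranking association rules by confidence, as described above, can be sketched in a few lines. The records and attribute values below are invented for illustration and do not come from the Spanish accident database; the helper names are hypothetical and no Weka API is implied.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= record for record in records) / len(records)

def confidence(records, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the records."""
    both = set(antecedent) | set(consequent)
    return support(records, both) / support(records, antecedent)

# Hypothetical accident records: each is a set of attribute values.
records = [
    {"underground", "overexertion", "sprain"},
    {"underground", "overexertion", "sprain"},
    {"underground", "fall", "fracture"},
    {"surface", "overexertion", "sprain"},
    {"surface", "vehicle", "contusion"},
]

rules = [
    ({"overexertion"}, {"sprain"}),
    ({"underground"}, {"overexertion"}),
]
# Highest-confidence rules first.
ranked = sorted(rules, key=lambda r: confidence(records, *r), reverse=True)
for antecedent, consequent in ranked:
    print(antecedent, "->", consequent, round(confidence(records, antecedent, consequent), 3))
```

Confidence alone says how often the consequent accompanies the antecedent; reporting support alongside it, as the percentage of accidents covered by each rule, shows how much of the data the rule actually describes.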
Survey of nine surface mines in North America. [Nine different mines in USA and Canada
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hayes, L.G.; Brackett, R.D.; Floyd, F.D.
This report presents the information gathered by three mining engineers in a 1980 survey of nine surface mines in the United States and Canada. The mines visited included seven coal mines, one copper mine, and one tar sands mine selected as representative of present state of the art in open pit, strip, and terrace pit mining. The purpose of the survey was to investigate mining methods, equipment requirements, operating costs, reclamation procedures and costs, and other aspects of current surface mining practices in order to acquire basic data for a study comparing conventional and terrace pit mining methods, particularly in deeper overburdens. The survey was conducted as part of a project under DOE Contract No. DE-AC01-79ET10023 titled The Development of Optimal Terrace Pit Coal Mining Systems.
The History of the Coal Mining Industry and Mining Accidents in the World and Turkey
Atalay, Figen
2015-01-01
Three per thousand of the world’s coal reserves and 2% of lignite reserves exist in Turkey. Coal mining is the highest ranking industry for accidents and deaths per capita. For this reason, continuous monitoring and more attention should be given to the mining industry. In this review, the basic statistical data related to Turkey’s mining and mining disasters are summarized. PMID:29404107
Introducing Text Analytics as a Graduate Business School Course
ERIC Educational Resources Information Center
Edgington, Theresa M.
2011-01-01
Text analytics refers to the process of analyzing unstructured data from documented sources, including open-ended surveys, blogs, and other types of web dialog. Text analytics has enveloped the concept of text mining, an analysis approach influenced heavily from data mining. While text mining has been covered extensively in various computer…
ERIC Educational Resources Information Center
Kinnebrew, John S.; Biswas, Gautam
2012-01-01
Our learning-by-teaching environment, Betty's Brain, captures a wealth of data on students' learning interactions as they teach a virtual agent. This paper extends an exploratory data mining methodology for assessing and comparing students' learning behaviors from these interaction traces. The core algorithm employs sequence mining techniques to…
Systematic Review of Data Mining Applications in Patient-Centered Mobile-Based Information Systems.
Fallah, Mina; Niakan Kalhori, Sharareh R
2017-10-01
Smartphones represent a promising technology for patient-centered healthcare. It is claimed that data mining techniques have improved mobile apps to address patients' needs at subgroup and individual levels. This study reviewed the current literature regarding data mining applications in patient-centered mobile-based information systems. We systematically searched PubMed, Scopus, and Web of Science for original studies reported from 2014 to 2016. After screening 226 records at the title/abstract level, the full texts of 92 relevant papers were retrieved and checked against inclusion criteria. Finally, 30 papers were included in this study and reviewed. Data mining techniques have been reported in development of mobile health apps for three main purposes: data analysis for follow-up and monitoring, early diagnosis and detection for screening purposes, classification/prediction of outcomes, and risk calculation (n = 27); data collection (n = 3); and provision of recommendations (n = 2). The most accurate and frequently applied data mining method was the support vector machine; however, decision trees have shown superior performance in enhancing mobile apps applied for patients' self-management. Embedded data-mining-based features in mobile apps, such as case detection, prediction/classification, risk estimation, or collection of patient data, particularly during self-management, would save, apply, and analyze patient data during and after care. More intelligent methods, such as artificial neural networks, fuzzy logic, and genetic algorithms, and even hybrid methods may result in more patient-centered recommendations, providing education, guidance, alerts, and awareness of personalized output.
Berendt, Bettina; Preibusch, Sören
2017-06-01
"Big Data" and data-mined inferences are affecting more and more of our lives, and concerns about their possible discriminatory effects are growing. Methods for discrimination-aware data mining and fairness-aware data mining aim at keeping decision processes supported by information technology free from unjust grounds. However, these formal approaches alone are not sufficient to solve the problem. In the present article, we describe reasons why discrimination with data can and typically does arise through the combined effects of human and machine-based reasoning, and argue that this requires a deeper understanding of the human side of decision-making with data mining. We describe results from a large-scale human-subjects experiment that investigated such decision-making, analyzing the reasoning that participants reported during their task to assess whether a loan request should or would be granted. We derive data protection by design strategies for making decision-making discrimination-aware in an accountable way, grounding these requirements in the accountability principle of the European Union General Data Protection Regulation, and outline how their implementations can integrate algorithmic, behavioral, and user interface factors.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grout, J.A.
As part of a study to explore the impacts of acid mine drainage from the Britannia Mine, a beach seine sampling program was initiated on Howe Sound to assess the species composition, abundance, and distribution of the near-shore fish community. Sampling was carried out in April 1997 at 23 sites on the east and west shores of the sound to attempt to differentiate between the fish communities using foreshore areas near Britannia Beach, which may be impacted by acid mine drainage, and fish communities in more distant areas thought to be less affected by mine pollution. Data are presented for the 13,882 individuals from 18 families and 39 species of fish, and stomach content data are also presented for a subset of the juvenile salmonids caught at Britannia Beach and Furry Creek. In addition, physical oceanographic data for each site are included.
NASA Astrophysics Data System (ADS)
Satyanarayanan, M.; Eswaramoorthi, S.; Subramanian, S.; Periakali, P.
2017-09-01
Geochemical analytical data of 15 representative rock samples, 34 soil samples and 55 groundwater samples collected from Salem magnesite mines and surrounding area in Salem, southern India, were subjected to R-mode factor analysis. A maximum of three factors account for 93.8 % variance in rock data, six factors for 84 % variance in soil data, five factors for 71.2 % in groundwater data during summer and six factors for 73.7 % during winter. Total dissolved solids are predominantly contributed by Mg, Na, Cl and SO4 ions in both seasons and are derived from the country rock and mining waste by dissolution of minerals like magnesite, gypsum, halite. The results also show that groundwater is enriched in considerable amount of minor and trace elements (Fe, Mn, Ni, Cr and Co). Nickel, chromium and cobalt in groundwater and soil are derived from leaching of huge mine dumps deposited by selective magnesite mining activity. The factor analysis on trivalent, hexavalent and total Cr in groundwater indicates that most of the Cr in summer is trivalent and in winter hexavalent. The gradational decrease in topographical elevation from northern mine area to the southern residential area, combined regional hydrogeological factors and distribution of ultramafic rocks in the northern part of the study area indicate that these toxic trace elements in water were derived from mine dumps.
On-Board Mining in the Sensor Web
NASA Astrophysics Data System (ADS)
Tanner, S.; Conover, H.; Graves, S.; Ramachandran, R.; Rushing, J.
2004-12-01
On-board data mining can contribute to many research and engineering applications, including natural hazard detection and prediction, intelligent sensor control, and the generation of customized data products for direct distribution to users. The ability to mine sensor data in real time can also be a critical component of autonomous operations, supporting deep space missions, unmanned aerial and ground-based vehicles (UAVs, UGVs), and a wide range of sensor meshes, webs and grids. On-board processing is expected to play a significant role in the next generation of NASA, Homeland Security, Department of Defense and civilian programs, providing for greater flexibility and versatility in measurements of physical systems. In addition, the use of UAV and UGV systems is increasing in military, emergency response and industrial applications. As research into the autonomy of these vehicles progresses, especially in fleet or web configurations, the applicability of on-board data mining is expected to increase significantly. Data mining in real time on board sensor platforms presents unique challenges. Most notably, the data to be mined is a continuous stream, rather than a fixed store such as a database. This means that the data mining algorithms must be modified to make only a single pass through the data. In addition, the on-board environment requires real time processing with limited computing resources, thus the algorithms must use fixed and relatively small amounts of processing time and memory. The University of Alabama in Huntsville is developing an innovative processing framework for the on-board data and information environment. The Environment for On-Board Processing (EVE) and the Adaptive On-board Data Processing (AODP) projects serve as proofs-of-concept of advanced information systems for remote sensing platforms. The EVE real-time processing infrastructure will upload, schedule and control the execution of processing plans on board remote sensors. 
These plans provide capabilities for autonomous data mining, classification and feature extraction using both streaming and buffered data sources. A ground-based testbed provides a heterogeneous, embedded hardware and software environment representing both space-based and ground-based sensor platforms, including wireless sensor mesh architectures. The AODP project explores the EVE concepts in the world of sensor-networks, including ad-hoc networks of small sensor platforms.
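The single-pass, fixed-memory constraint described above can be illustrated with Welford's online algorithm for mean and variance. This is a generic textbook technique chosen as an example, not something taken from the EVE or AODP projects; the class name and sample readings are hypothetical.

```python
class RunningStats:
    """Single-pass mean and variance (Welford's algorithm), constant memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new reading into the statistics; the reading is not stored."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance of the readings seen so far."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0

# Hypothetical sensor readings arriving one at a time.
stats = RunningStats()
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(reading)

print(stats.n, stats.mean, stats.variance)
```

Each update touches three numbers regardless of stream length, which is the property an on-board algorithm needs when the data cannot be buffered or revisited.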
A Proposed Data Fusion Architecture for Micro-Zone Analysis and Data Mining
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kevin McCarthy; Milos Manic
Data Fusion requires the ability to combine or “fuse” data from multiple data sources. Time Series Analysis is a data mining technique used to predict future values from a data set based upon past values. Unlike other data mining techniques, however, Time Series places special emphasis on periodicity and how seasonal and other time-based factors tend to affect trends over time. One of the difficulties encountered in developing generic time series techniques is the wide variability of the data sets available for analysis. This presents challenges all the way from the data gathering stage to results presentation. This paper presents an architecture designed and used to facilitate the collection of disparate data sets well suited to Time Series analysis as well as other predictive data mining techniques. Results show this architecture provides a flexible, dynamic framework for the capture and storage of a myriad of dissimilar data sets and can serve as a foundation from which to build a complete data fusion architecture.
Data mining in soft computing framework: a survey.
Mitra, S; Pal, S K; Mitra, P
2002-01-01
The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.
Handling Dynamic Weights in Weighted Frequent Pattern Mining
NASA Astrophysics Data System (ADS)
Ahmed, Chowdhury Farhan; Tanbeer, Syed Khairuzzaman; Jeong, Byeong-Soo; Lee, Young-Koo
Even though weighted frequent pattern (WFP) mining is more effective than traditional frequent pattern mining because it can consider different semantic significances (weights) of items, existing WFP algorithms assume that each item has a fixed weight. But in real world scenarios, the weight (price or significance) of an item can vary with time. Reflecting these changes in item weight is necessary in several mining applications, such as retail market data analysis and web click stream analysis. In this paper, we introduce the concept of a dynamic weight for each item, and propose an algorithm, DWFPM (dynamic weighted frequent pattern mining), that makes use of this concept. Our algorithm can address situations where the weight (price or significance) of an item varies dynamically. It exploits a pattern growth mining technique to avoid the level-wise candidate set generation-and-test methodology. Furthermore, it requires only one database scan, so it is eligible for use in stream data mining. An extensive performance analysis shows that our algorithm is efficient and scalable for WFP mining using dynamic weights.
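To make the idea of a dynamic weight concrete, the sketch below computes a weighted support in which item weights change from one batch of transactions to the next. The definition used here (per batch, the pattern's average item weight times its frequency, summed over batches) is an assumption for illustration; the abstract does not give DWFPM's exact formula, and the sketch does not reproduce its pattern-growth data structure. All names and data are hypothetical.

```python
def dynamic_weighted_support(batches, pattern):
    """Sum over batches of (average item weight in that batch) * (pattern frequency).

    `batches` is a list of (weights, transactions) pairs, where `weights` maps
    each item to its weight during that batch and `transactions` is a list of sets.
    """
    pattern = set(pattern)
    total = 0.0
    for weights, transactions in batches:
        freq = sum(pattern <= t for t in transactions)
        avg_weight = sum(weights[item] for item in pattern) / len(pattern)
        total += avg_weight * freq
    return total

# Hypothetical data: item weights (e.g. prices) differ between the two batches.
batches = [
    ({"a": 0.6, "b": 0.4, "c": 0.2}, [{"a", "b"}, {"a", "b", "c"}, {"b", "c"}]),
    ({"a": 0.3, "b": 0.5, "c": 0.9}, [{"a", "b"}, {"a", "c"}]),
]

print(dynamic_weighted_support(batches, {"a", "b"}))
```

With fixed weights the two batches could be merged before counting; once weights vary over time, each batch's frequency has to be kept separately so it can be paired with the weights in force at that time.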
Monitoring and inversion on land subsidence over mining area with InSAR technique
Wang, Y.; Zhang, Q.; Zhao, C.; Lu, Z.; Ding, X.
2011-01-01
Wulanmulun town, located in Inner Mongolia, is one of the main mining areas of the Shendong Company and includes coal mines such as Shangwan and Bulianta. The area has been suffering serious mine collapse associated with underground mine withdrawal. We used ALOS/PALSAR data to extract land deformation in these regions by applying the Small Baseline Subsets (SBAS) method. We then compared the InSAR results with the underground mining activities and found high correlations between them. Lastly, we applied the Distributed Dislocation (Okada) model to invert the mine collapse mechanism. © 2011 Copyright Society of Photo-Optical Instrumentation Engineers (SPIE).
Causey, J. Douglas; Moyle, Phillip R.
2001-01-01
This report provides a description of data and processes used to produce a spatial database that delineates mining-related features in areas of historic and active phosphate mining in the core of the southeastern Idaho phosphate resource area. The data have varying degrees of accuracy and attribution detail. Classification of areas by type of mining-related activity at active mines is generally detailed; however, the spatial coverage does not differentiate mining-related surface disturbance features at many of the closed or inactive mines. Nineteen phosphate mine sites are included in the study. A total of 5,728 ha (14,154 ac), or more than 57 km2 (22 mi2), of phosphate mining-related surface disturbance are documented in the spatial coverage of the core of the southeast Idaho phosphate resource area. The study includes 4 active phosphate mines (Dry Valley, Enoch Valley, Rasmussen Ridge, and Smoky Canyon) and 15 historic phosphate mines (Ballard, Champ, Conda, Diamond Gulch, Gay, Georgetown Canyon, Henry, Home Canyon, Lanes Creek, Maybe Canyon, Mountain Fuel, Trail Canyon, Rattlesnake Canyon, Waterloo, and Wooley Valley). Spatial data on the inactive historic mines are relatively up-to-date; however, spatially described areas for active mines are based on digital maps prepared in early 1999. The inactive Gay mine has the largest total area of disturbance: 1,917 ha (4,736 ac) or about 19 km2 (7.4 mi2). It encompasses over three times the disturbance area of the next largest mine, the Conda mine with 607 ha (1,504 ac), and it is nearly four times the area of the Smoky Canyon mine, the largest of the active mines with 497 ha (1,228 ac).
The wide range of phosphate mining-related surface disturbance features (approximately 80) were reduced to 13 types or features used in this study: adit and pit, backfilled mine pit, facilities, mine pit, ore stockpile, railroad, road, sediment catchment, tailings or tailings pond, topsoil stockpile, water reservoir, and disturbed land (undifferentiated). In summary, the spatial coverage includes polygons totaling 1,114 ha (2,753 ac) of mine pits, 272 ha (671 ac) of backfilled mine pits, 1,570 ha (3,880 ac) of waste dumps, 26 ha (64 ac) of ore stockpiles, and 44 ha (110 ac) of tailings or tailings ponds. Areas of undifferentiated phosphate mining-related land disturbances, called “disturbed land,” total 2,176 ha (5,377 ac) or nearly 21.8 km2 (8.4 mi2). No determination has been made as to status of reclamation on these lands. Subsequent site-specific studies to delineate distinct mine features will allow modification of this preliminary spatial database.
Church, Stan E.; Kirschner, Frederick E.; Choate, LaDonna M.; Lamothe, Paul J.; Budahn, James R.; Brown, Zoe Ann
2008-01-01
Geochemical and radionuclide studies of sediment recovered from eight core sites in the Blue Creek flood plain and Blue Creek delta downstream in Lake Roosevelt provided a stratigraphic geochemical record of the contamination from uranium mining at the Midnite Mine. Sediment recovered from cores in a wetland immediately downstream from the mine site as well as from sediment catchments in Blue Creek and from cores in the delta in Blue Creek cove provided sufficient data to determine the premining geochemical background for the Midnite Mine tributary drainage. These data provide a geochemical background that includes material eroded from the Midnite Mine site prior to mine development. Premining geochemical background for the Blue Creek basin has also been determined using stream-sediment samples from parts of the Blue Creek, Oyachen Creek, and Sand Creek drainage basins not immediately impacted by mining. Sediment geochemistry showed that premining uranium concentrations in the Midnite Mine tributary immediately downstream of the mine site were strongly elevated relative to the crustal abundance of uranium (2.3 ppm). Cesium-137 (137Cs) data and public records of production at the Midnite Mine site provided age control to document timelines in the sediment from the core immediately downstream from the mine site. Mining at the Midnite Mine site on the Spokane Indian Reservation between 1956 and 1981 resulted in production of more than 10 million pounds of U3O8. Contamination of the sediment by uranium during the mining period is documented from the Midnite Mine along a small tributary to the confluence of Blue Creek, in Blue Creek, and into the Blue Creek delta. During the period of active mining (1956–1981), enrichment of base metals in the sediment of Blue Creek delta was elevated by as much as 4 times the concentration of those same metals prior to mining.
Cadmium concentrations were elevated by a factor of 10 and uranium by factors of 16 to 55 times premining geochemical background determined upstream of the mine site. Postmining metal concentrations in sediment are lower than during the mining period, but remain elevated relative to premining geochemical background. Furthermore, the sediment composition of surface sediment in the Blue Creek delta is contaminated. Base-metal contamination by arsenic, cadmium, lead, and zinc in sediment in the delta in Blue Creek cove is dominated by suspended sediment from the Coeur d'Alene mining district. Uranium contamination in surface sediment in the delta of Blue Creek cove extends at least 500 meters downstream from the mouth of Blue Creek as defined by the 1,290-ft elevation boundary between lands administered by the National Park Service and the Spokane Indian Tribe. Comparisons of the premining geochemical background to sediment sampled during the period the mine was in operation, and to the sediment data from the postmining period, are used to delineate the extent of contaminated sediment in Blue Creek cove along the thalweg of Blue Creek into Lake Roosevelt. The extent of contamination out into Lake Roosevelt by mining remains open.
Mendez, Monica O.; Maier, Raina M.
2008-01-01
Objective: Unreclaimed mine tailings sites are a worldwide problem, with thousands of unvegetated, exposed tailings piles presenting a source of contamination for nearby communities. Tailings disposal sites in arid and semiarid environments are especially subject to eolian dispersion and water erosion. Phytostabilization, the use of plants for in situ stabilization of tailings and metal contaminants, is a feasible alternative to costly remediation practices. In this review we emphasize considerations for phytostabilization of mine tailings in arid and semiarid environments, as well as issues impeding its long-term success. Data sources: We reviewed literature addressing mine closures and revegetation of mine tailings, along with publications evaluating plant ecology, microbial ecology, and soil properties of mine tailings. Data extraction: Data were extracted from peer-reviewed articles and books identified in Web of Science and Agricola databases, and publications available through the U.S. Department of Agriculture, U.S. Environmental Protection Agency, and the United Nations Environment Programme. Data synthesis: Harsh climatic conditions in arid and semiarid environments along with the innate properties of mine tailings require specific considerations. Plants suitable for phytostabilization must be native, be drought-, salt-, and metal-tolerant, and should limit shoot metal accumulation. Factors for evaluating metal accumulation and toxicity issues are presented. Also reviewed are aspects of implementing phytostabilization, including plant growth stage, amendments, irrigation, and evaluation. Conclusions: Phytostabilization of mine tailings is a promising remedial technology but requires further research to identify factors affecting its long-term success by expanding knowledge of suitable plant species and mine tailings chemistry in ongoing field trials. PMID:18335091
Applying Data Mining Principles to Library Data Collection.
ERIC Educational Resources Information Center
Guenther, Kim
2000-01-01
Explains how libraries can use data mining techniques for more effective data collection. Highlights include three phases: data selection and acquisition; data preparation and processing, including a discussion of the use of XML (extensible markup language); and data interpretation and integration, including database management systems. (LRW)
Text Mining for Adverse Drug Events: the Promise, Challenges, and State of the Art
Harpaz, Rave; Callahan, Alison; Tamang, Suzanne; Low, Yen; Odgers, David; Finlayson, Sam; Jung, Kenneth; LePendu, Paea; Shah, Nigam H.
2014-01-01
Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. Text mining is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources—such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs—that are amenable to text-mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance. PMID:25151493
Piatak, N.M.; Seal, R.R.; Sanzolone, R.F.; Lamothe, P.J.; Brown, Z.A.
2006-01-01
We report the preliminary results of sequential partial dissolutions used to characterize the geochemical distribution of selenium in stream sediments, mine wastes, and flotation-mill tailings. In general, extraction schemes are designed to extract metals associated with operationally defined solid phases. Total Se concentrations and the mineralogy of the samples are also presented. Samples were obtained from the Elizabeth, Ely, and Pike Hill mines in Vermont, the Callahan mine in Maine, and the Martha mine in New Zealand. These data are presented here with minimal interpretation or discussion. Further analysis of the data will be presented elsewhere.
Efficient frequent pattern mining algorithm based on node sets in cloud computing environment
NASA Astrophysics Data System (ADS)
Billa, V. N. Vinay Kumar; Lakshmanna, K.; Rajesh, K.; Reddy, M. Praveen Kumar; Nagaraja, G.; Sudheer, K.
2017-11-01
The ultimate goal of data mining is to uncover hidden information that is useful for decision-making from the large databases collected by an organization. Data mining involves many tasks, and mining frequent itemsets is one of the most important of them for transactional databases. These databases hold data at very large scale, so mining them consumes physical memory and time in proportion to the size of the database. A frequent pattern mining algorithm is considered efficient only if it consumes little memory and time to mine the frequent itemsets from a given large database. With these points in mind, in this thesis we propose a system that mines frequent itemsets in a way optimized for memory and time, using cloud computing to make the process parallel and providing the application as a service. The complete framework uses a proven efficient algorithm called FIN, which works on Nodesets and a POC (pre-order coding) tree. To evaluate the performance of the system, we conducted experiments comparing the efficiency of the same algorithm applied in a standalone manner and in a cloud computing environment on a real-time data set, the traffic accidents data set. The results show that the memory consumption and execution time of the proposed system are much lower than those of the standalone system.
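To make the idea of frequent itemset mining concrete, here is a minimal sketch in Python. It is not the FIN algorithm from the abstract (no Nodesets or POC tree) and it is not parallel. It uses the simpler, well-known level-wise (Apriori-style) search as a stand-in. The function name `frequent_itemsets` and the toy transactions are illustrative assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose count >= min_support.

    Level-wise (Apriori-style) search, used here only as an illustration;
    the abstract's FIN algorithm works differently.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Build size-k candidates from pairs of frequent (k-1)-itemsets and
        # keep only those whose (k-1)-subsets are all frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count each candidate with one pass over the transactions.
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_support:
                current[cand] = n
        result.update(current)
        k += 1
    return result


# Hypothetical toy data: each transaction is a set of item labels.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
found = frequent_itemsets(toy, min_support=3)
```

Memory and time here grow with the number of candidates and transactions, which is the cost the abstract's system aims to reduce.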
Research on parallel algorithm for sequential pattern mining
NASA Astrophysics Data System (ADS)
Zhou, Lijuan; Qin, Bai; Wang, Yu; Hao, Zhongxiao
2008-03-01
Sequential pattern mining is the mining of frequent sequences related to time or other orders from a sequence database. Its initial motivation was to discover the laws of customer purchasing over a period of time by finding the frequent sequences. In recent years, sequential pattern mining has become an important direction of data mining, and its application field is no longer confined to business databases but has extended to new data sources such as the Web and advanced science fields such as DNA analysis. The data used in sequential pattern mining have two characteristics: massive volume and distributed storage. Most existing sequential pattern mining algorithms have not considered these characteristics together. Based on these traits and on parallel theory, this paper puts forward a new distributed parallel algorithm, SPP (Sequential Pattern Parallel). The algorithm abides by the principle of pattern reduction and utilizes the divide-and-conquer strategy for parallelization. The first parallel task is to construct frequent item sets by applying the frequent concept and search space partition theory, and the second task is to build frequent sequences using the depth-first search method at each processor. The algorithm only needs to access the database twice and does not generate candidate sequences, which reduces the access time and improves the mining efficiency. Based on a random data generation procedure and the different information structures designed, this paper simulated the SPP algorithm in a concrete parallel environment and implemented the AprioriAll algorithm. The experiments demonstrate that, compared with AprioriAll, the SPP algorithm had an excellent speedup factor and efficiency.
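The depth-first growth of frequent sequences can be sketched as follows. This is a single-process illustration in the style of prefix projection, not the SPP algorithm itself. The function name `frequent_sequences`, the restriction to sequences of single items, and the toy data are assumptions made for the example.

```python
def frequent_sequences(sequences, min_support):
    """Depth-first search for frequent subsequences.

    Each input sequence is a list of single items. A pattern is frequent
    if it occurs (as a subsequence, gaps allowed) in at least min_support
    input sequences. Returns {pattern tuple: support count}.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, position after the prefix match).
        counts = {}
        for seq_id, start in projected:
            seen = set()
            for item in sequences[seq_id][start:]:
                if item not in seen:  # count each sequence at most once
                    seen.add(item)
                    counts[item] = counts.get(item, 0) + 1

        for item, count in sorted(counts.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project each sequence past the first occurrence of the item.
            next_projected = []
            for seq_id, start in projected:
                seq = sequences[seq_id]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        next_projected.append((seq_id, pos + 1))
                        break
            grow(pattern, next_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical purchase sequences, one list per customer.
toy_sequences = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["a", "b"]]
patterns = frequent_sequences(toy_sequences, min_support=3)
```

Because patterns are extended only with items that are already frequent in the projected data, no separate candidate-generation step is needed. In a parallel setting, each first-level prefix could in principle be handed to a different processor.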
Near-line Archive Data Mining at the Goddard Distributed Active Archive Center
NASA Astrophysics Data System (ADS)
Pham, L.; Mack, R.; Eng, E.; Lynnes, C.
2002-12-01
NASA's Earth Observing System (EOS) is generating immense volumes of data, in some cases too much to provide to users with data-intensive needs. As an alternative to moving the data to the user and his/her research algorithms, we are providing a means to move the algorithms to the data. The Near-line Archive Data Mining (NADM) system is the Goddard Earth Sciences Distributed Active Archive Center's (GES DAAC) web data mining portal to the EOS Data and Information System (EOSDIS) data pool, a 50-TB online disk cache. The NADM web portal enables registered users to submit and execute data mining algorithm codes on the data in the EOSDIS data pool. A web interface allows the user to access the NADM system. Users first develop personalized data mining code on their home platform and then upload it to the NADM system. The C, FORTRAN and IDL languages are currently supported. The user-developed code is automatically audited for any potential security problems before it is installed within the NADM system and made available to the user. Once the code has been installed, the user is provided a test environment where he/she can test the execution of the software against data sets of the user's choosing. When the user is satisfied with the results, he/she can promote the code to the "operational" environment. From here the user can interactively run his/her code on the data available in the EOSDIS data pool. The user can also set up a processing subscription. The subscription will automatically process new data as it becomes available in the EOSDIS data pool. The generated mined data products are then made available for FTP pickup. The NADM system uses the GES DAAC-developed Simple Scalable Script-based Science Processor (S4P) to automate tasks and perform the actual data processing. Users will also have the option of selecting a DAAC-provided data mining algorithm and using it to process the data of their choice.
NASA Astrophysics Data System (ADS)
Kenton, Arthur C.; Geci, Duane M.; McDonald, James A.; Ray, Kristofer J.; Thomas, Clayton M.; Holloway, John H., Jr.; Petee, Danny A.; Witherspoon, Ned H.
2003-09-01
The objective of the Office of Naval Research (ONR) Rapid Overt Reconnaissance (ROR) program and the Airborne Littoral Reconnaissance Technologies project's Littoral Assessment of Mine Burial Signatures (LAMBS) contract is to determine if electro-optical spectral discriminants exist that are useful for the detection of land mines located in littoral regions. Statistically significant buried mine overburden and background signature data were collected over a wide spectral range (0.35 to 14 μm) to identify robust spectral features that might serve as discriminants for new airborne sensor concepts. The LAMBS program further expands the hyperspectral database previously collected and analyzed on the U.S. Army's Hyperspectral Mine Detection Phenomenology program [see "Detection of Land Mines with Hyperspectral Data," and "Hyperspectral Mine Detection Phenomenology Program," Proc. SPIE Vol. 3710, pp 917-928 and 819-829, AeroSense April 1999] to littoral areas where tidal, surf, and wind action can additionally modify spectral signatures. This work summarizes the LAMBS buried mine collections conducted at three beach sites - an inland bay beach site (Eglin AFB, FL, Site A-22), an Atlantic beach site (Duck, NC), and a Gulf beach site (Eglin AFB, FL, Site A-15). Characteristics of the spectral signatures of the various dry and damp beach sands are presented. These are then compared to buried land mine signatures observed for the tested background types, burial ages, and environmental conditions experienced.
Wright, Adam; Ricciardi, Thomas N.; Zwick, Martin
2005-01-01
The Medical Quality Improvement Consortium data warehouse contains de-identified data on more than 3.6 million patients including their problem lists, test results, procedures and medication lists. This study uses reconstructability analysis, an information-theoretic data mining technique, on the MQIC data warehouse to empirically identify risk factors for various complications of diabetes including myocardial infarction and microalbuminuria. The risk factors identified match those risk factors identified in the literature, demonstrating the utility of the MQIC data warehouse for outcomes research, and RA as a technique for mining clinical data warehouses. PMID:16779156
Hospitalization patterns associated with Appalachian coal mining.
Hendryx, Michael; Ahern, Melissa M; Nurkiewicz, Timothy R
2007-12-01
The goal of this study was to test whether the volume of coal mining was related to population hospitalization risk for diseases postulated to be sensitive or insensitive to coal mining by-products. The study was a retrospective analysis of 2001 adult hospitalization data (n = 93,952) for West Virginia, Kentucky, and Pennsylvania, merged with county-level coal production figures. Hospitalization data were obtained from the Health Care Utilization Project National Inpatient Sample. Diagnoses postulated to be sensitive to coal mining by-product exposure were contrasted with diagnoses postulated to be insensitive to exposure. Data were analyzed using hierarchical nonlinear models, controlling for patient age, gender, insurance, comorbidities, hospital teaching status, county poverty, and county social capital. Controlling for covariates, the volume of coal mining was significantly related to hospitalization risk for two conditions postulated to be sensitive to exposure: hypertension and chronic obstructive pulmonary disease (COPD). The odds for a COPD hospitalization increased 1% for each 1462 tons of coal, and the odds for a hypertension hospitalization increased 1% for each 1873 tons of coal. Other conditions were not related to mining volume. Exposure to particulates or other pollutants generated by coal mining activities may be linked to increased risk of COPD and hypertension hospitalizations. Limitations in the data likely result in an underestimate of associations.
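As a rough worked example of the reported effect sizes, the sketch below converts "odds increase 1% per N tons" into an odds multiplier for a larger tonnage. It assumes the 1% steps compound multiplicatively, which is what a model that is linear in the log-odds would imply. That is an assumption of this example, not something the abstract states, and the function name `odds_multiplier` is hypothetical.

```python
def odds_multiplier(tons, tons_per_one_percent):
    """Multiplier on the odds after `tons` of coal, assuming each block of
    `tons_per_one_percent` tons multiplies the odds by 1.01 (assumed
    compounding, i.e. linear on the log-odds scale)."""
    return 1.01 ** (tons / tons_per_one_percent)


# Tonnage steps reported in the abstract.
COPD_TONS_PER_PERCENT = 1462
HYPERTENSION_TONS_PER_PERCENT = 1873

# Ten COPD steps of 1462 tons compound to roughly a 10.5% increase in odds.
copd_ten_steps = odds_multiplier(10 * COPD_TONS_PER_PERCENT, COPD_TONS_PER_PERCENT)
```

For the same tonnage, the COPD multiplier is larger than the hypertension multiplier because its 1% step is reached with fewer tons.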
SparkText: Biomedical Text Mining on Big Data Framework.
Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M
Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.
SparkText: Biomedical Text Mining on Big Data Framework
He, Karen Y.; Wang, Kai
2016-01-01
Background: Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results: In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions: This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research. PMID:27685652
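To illustrate one of the classifiers named above, here is a minimal multinomial Naïve Bayes text classifier in plain Python. It is a single-machine sketch and does not use Apache Spark or Cassandra, so it says nothing about SparkText's actual implementation or speed. The class name `NaiveBayesText`, the whitespace tokenizer, and the four toy documents are assumptions for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace tokens with Laplace smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        tokens = doc.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical miniature training set standing in for labeled articles.
train_docs = [
    "breast tumor mammography screening",
    "breast carcinoma hormone receptor",
    "lung tumor smoking exposure",
    "lung nodule smoking history",
]
train_labels = ["breast", "breast", "lung", "lung"]
model = NaiveBayesText().fit(train_docs, train_labels)
```

Calling `model.predict("smoking and lung nodule")` scores each class by its prior and smoothed token likelihoods and returns the higher-scoring label.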
Determining a pre-mining radiological baseline from historic airborne gamma surveys: a case study.
Bollhöfer, Andreas; Beraldo, Annamarie; Pfitzner, Kirrilly; Esparon, Andrew; Doering, Che
2014-01-15
Knowing the baseline level of radioactivity in areas naturally enriched in radionuclides is important in the uranium mining context to assess radiation doses to humans and the environment both during and after mining. This information is particularly useful in rehabilitation planning and developing closure criteria for uranium mines as only radiation doses additional to the natural background are usually considered 'controllable' for radiation protection purposes. In this case study we have tested whether the method of contemporary groundtruthing of a historic airborne gamma survey could be used to determine the pre-mining radiological conditions at the Ranger mine in northern Australia. The airborne gamma survey was flown in 1976 before mining started and groundtruthed using ground gamma dose rate measurements made between 2007 and 2009 at an undisturbed area naturally enriched in uranium (Anomaly 2) located nearby the Ranger mine. Measurements of (226)Ra soil activity concentration and (222)Rn exhalation flux density at Anomaly 2 were made concurrent with the ground gamma dose rate measurements. Algorithms were developed to upscale the ground gamma data to the same spatial resolution as the historic airborne gamma survey data using a geographic information system, allowing comparison of the datasets. Linear correlation models were developed to estimate the pre-mining gamma dose rates, (226)Ra soil activity concentrations, and (222)Rn exhalation flux densities at selected areas in the greater Ranger region. The modelled levels agreed with measurements made at the Ranger Orebodies 1 and 3 before mining started, and at environmental sites in the region. The conclusion is that our approach can be used to determine baseline radiation levels, and provide a benchmark for rehabilitation of uranium mines or industrial sites where historical airborne gamma survey data are available and an undisturbed radiological analogue exists to groundtruth the data. © 2013.
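The two computational steps described above, upscaling point measurements to a coarser grid and fitting a linear correlation model, can be sketched as follows. This is a simplified illustration rather than the authors' GIS workflow. The function names `upscale` and `fit_line`, the square-cell averaging rule, and the sample numbers are assumptions.

```python
from statistics import mean


def upscale(ground_points, cell_size):
    """Average ground readings (x, y, value) into square grid cells.

    Returns {(column, row): mean value}. Cell averaging is an assumed
    stand-in for matching the airborne survey's spatial resolution.
    """
    cells = {}
    for x, y, value in ground_points:
        key = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(key, []).append(value)
    return {key: mean(values) for key, values in cells.items()}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical readings: three ground points, aggregated to 100 m cells.
cell_means = upscale([(0, 0, 1.0), (10, 10, 3.0), (120, 0, 5.0)], cell_size=100)

# Hypothetical paired values: airborne signal (x) vs. upscaled ground dose rate (y).
slope, intercept = fit_line([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
```

Once fitted on the undisturbed analogue area, a line of this kind could be applied to historic airborne values elsewhere to estimate pre-mining ground-level quantities.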
Application of LANDSAT data to monitor land reclamation progress in Belmont County, Ohio
NASA Technical Reports Server (NTRS)
Bloemer, H. H. L.; Brumfield, J. O.; Campbell, W. J.; Witt, R. G.; Bly, B. G.
1981-01-01
Strip and contour mining techniques are reviewed as well as some studies conducted to determine the applicability of LANDSAT and associated digital image processing techniques to the surficial problems associated with mining operations. A nontraditional unsupervised classification approach to multispectral data is considered which renders increased classification separability in land cover analysis of surface mined areas. The approach also reduces the dimensionality of the data and requires only minimal analytical skills in digital data processing.
NASA Technical Reports Server (NTRS)
Stolzer, Alan J.; Halford, Carl
2007-01-01
In a previous study, multiple regression techniques were applied to Flight Operations Quality Assurance (FOQA)-derived data to develop parsimonious model(s) for fuel consumption on the Boeing 757 airplane. The present study examined several data mining algorithms, including neural networks, on the fuel consumption problem and compared them to the multiple regression results obtained earlier. Using regression methods, parsimonious models were obtained that explained approximately 85% of the variation in fuel flow. In general, data mining methods were more effective in predicting fuel consumption. Classification and Regression Tree methods reported correlation coefficients of .91 to .92, and General Linear Models and Multilayer Perceptron neural networks reported correlation coefficients of about .99. These data mining models show great promise for use in further examining large FOQA databases for operational and safety improvements.
Tucker, Conrad; Han, Yixiang; Nembhard, Harriet Black; Lewis, Mechelle; Lee, Wang-Chien; Sterling, Nicholas W; Huang, Xuemei
2017-01-01
Parkinson’s disease (PD) is the second most common neurological disorder after Alzheimer’s disease. Key clinical features of PD are motor-related and are typically assessed by healthcare providers based on qualitative visual inspection of a patient’s movement/gait/posture. More advanced diagnostic techniques, such as computed tomography scans that measure brain function, can be cost prohibitive and may expose patients to radiation and other harmful effects. To mitigate these challenges, and open a pathway to remote patient-physician assessment, the authors of this work propose a data mining driven methodology that uses low-cost, non-invasive sensors to model and predict the presence (or lack thereof) of PD movement abnormalities and model clinical subtypes. The study presented here evaluates the discriminative ability of non-invasive hardware and data mining algorithms to classify PD cases and controls. A 10-fold cross validation approach is used to compare several data mining algorithms in order to determine which provides the most consistent results when varying the subject gait data. Next, the predictive accuracy of the data mining model is quantified by testing it against unseen data captured from a test pool of subjects. The proposed methodology demonstrates the feasibility of using non-invasive, low-cost hardware and data mining models to monitor the progression of gait features outside of the traditional healthcare facility, which may ultimately lead to earlier diagnosis of emerging neurological diseases. PMID:29541376
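The 10-fold cross validation step can be sketched in plain Python as below. The helper names `k_fold_indices` and `cross_validate`, the contiguous (unshuffled) folds, and the trivial majority-class "model" are assumptions for illustration. A real study would typically shuffle or stratify, and would keep all recordings from one subject in the same fold.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    The first n_samples % k folds get one extra sample so that every
    sample appears in exactly one test fold.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Return the per-fold accuracy of a model built by train_fn."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Placeholder learner (assumption): always predicts the majority training label.
def train_majority(train_samples, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, sample):
    return model


folds = list(k_fold_indices(25, 10))
scores = cross_validate(list(range(20)), ["PD"] * 20, 10,
                        train_majority, predict_majority)
```

Comparing the spread of the per-fold accuracies across algorithms is one way to judge which algorithm gives the most consistent results.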
A data mining based approach to predict spatiotemporal changes in satellite images
NASA Astrophysics Data System (ADS)
Boulila, W.; Farah, I. R.; Ettabaa, K. Saheb; Solaiman, B.; Ghézala, H. Ben
2011-06-01
The interpretation of remotely sensed images in a spatiotemporal context is becoming a valuable research topic. However, the constant growth of data volume in remote sensing imaging makes reaching conclusions based on collected data a challenging task. Recently, data mining appears to be a promising research field leading to several interesting discoveries in various areas such as marketing, surveillance, fraud detection and scientific discovery. By integrating data mining and image interpretation techniques, accurate and relevant information (i.e. functional relation between observed parcels and a set of informational contents) can be automatically elicited. This study presents a new approach to predict spatiotemporal changes in satellite image databases. The proposed method exploits fuzzy sets and data mining concepts to build predictions and decisions for several remote sensing fields. It takes into account imperfections related to the spatiotemporal mining process in order to provide more accurate and reliable information about land cover changes in satellite images. The proposed approach is validated using SPOT images representing the Saint-Denis region, capital of Reunion Island. Results show good performances of the proposed framework in predicting change for the urban zone.
Roehl, Edwin A.; Conrads, Paul
2010-01-01
This is the second of two papers that describe how data mining can aid natural-resource managers with the difficult problem of controlling the interactions between hydrologic and man-made systems. Data mining is a new science that assists scientists in converting large databases into knowledge, and is uniquely able to leverage the large amounts of real-time, multivariate data now being collected for hydrologic systems. Part 1 gives a high-level overview of data mining, and describes several applications that have addressed major water resource issues in South Carolina. This Part 2 paper describes how various data mining methods are integrated to produce predictive models for controlling surface- and groundwater hydraulics and quality. The methods include: - signal processing to remove noise and decompose complex signals into simpler components; - time series clustering that optimally groups hundreds of signals into "classes" that behave similarly for data reduction and (or) divide-and-conquer problem solving; - classification which optimally matches new data to behavioral classes; - artificial neural networks which optimally fit multivariate data to create predictive models; - model response surface visualization that greatly aids in understanding data and physical processes; and, - decision support systems that integrate data, models, and graphics into a single package that is easy to use.
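As an illustration of the first method in the list, signal processing to remove noise and split a signal into simpler components, here is a minimal moving-average sketch. The abstract does not say which filters the authors use, so the centered moving average, the function names `moving_average` and `decompose`, and the sample series are all assumptions.

```python
def moving_average(signal, window):
    """Centered moving average with an odd window length.

    Near the ends of the series the window shrinks to the samples that exist.
    """
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def decompose(signal, window):
    """Split a signal into a slow component and the fast residual around it."""
    slow = moving_average(signal, window)
    fast = [s - l for s, l in zip(signal, slow)]
    return slow, fast


# Hypothetical water-level readings at a fixed sampling interval.
readings = [2.0, 2.4, 1.9, 2.6, 2.1, 2.8, 2.3]
slow_part, fast_part = decompose(readings, window=3)
```

The slow component could then be passed to later stages such as clustering or a neural network model, while the residual can be inspected as noise or short-period behavior.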
A Big Data Platform for Storing, Accessing, Mining and Learning Geospatial Data
NASA Astrophysics Data System (ADS)
Yang, C. P.; Bambacus, M.; Duffy, D.; Little, M. M.
2017-12-01
Big Data is becoming a norm in geoscience domains. A platform that is capable of efficiently managing, accessing, analyzing, mining, and learning from big data for new information and knowledge is desired. This paper introduces our latest effort on developing such a platform based on our past years' experiences on cloud and high performance computing, analyzing big data, comparing big data containers, and mining big geospatial data for new information. The platform includes four layers: a) the bottom layer includes a computing infrastructure with proper network, computer, and storage systems; b) the 2nd layer is a cloud computing layer based on virtualization to provide on demand computing services for upper layers; c) the 3rd layer is big data containers that are customized for dealing with different types of data and functionalities; d) the 4th layer is a big data presentation layer that supports the efficient management, access, analyses, mining and learning of big geospatial data.
Emoto, Takuo; Yamashita, Tomoya; Kobayashi, Toshio; Sasaki, Naoto; Hirota, Yushi; Hayashi, Tomohiro; So, Anna; Kasahara, Kazuyuki; Yodoi, Keiko; Matsumoto, Takuya; Mizoguchi, Taiji; Ogawa, Wataru; Hirata, Ken-Ichi
2017-01-01
The association between atherosclerosis and gut microbiota has been attracting increased attention. We previously demonstrated a possible link between gut microbiota and coronary artery disease. The aim of this study was to clarify the gut microbiota profiles in coronary artery disease patients using data mining analysis of terminal restriction fragment length polymorphism (T-RFLP). This study included 39 coronary artery disease (CAD) patients and 30 age- and sex-matched no-CAD controls (Ctrls) with coronary risk factors. Bacterial DNA was extracted from their fecal samples and analyzed by T-RFLP and data mining analysis using the classification and regression algorithm. Five additional CAD patients were newly recruited to confirm the reliability of this analysis. Data mining analysis could divide the composition of gut microbiota into 2 characteristic nodes. The CAD group was classified into 4 CAD pattern nodes (35/39 = 90%), while the Ctrl group was classified into 3 Ctrl pattern nodes (28/30 = 93%). Five additional CAD samples were applied to the same dividing model, which could validate the accuracy to predict the risk of CAD by data mining analysis. We could demonstrate that operational taxonomic unit 853 (OTU853), OTU657, and OTU990 were determined important both by the data mining method and by the usual statistical comparison. We classified the gut microbiota profiles in coronary artery disease patients using data mining analysis of T-RFLP data and demonstrated the possibility that gut microbiota is a diagnostic marker of suffering from CAD.
NASA Astrophysics Data System (ADS)
Ng, Alex Hay-Man; Ge, Linlin; Du, Zheyuan; Wang, Shuren; Ma, Chao
2017-09-01
This paper describes the simulation and real data analysis results from the recently launched SAR satellites, ALOS-2, Sentinel-1 and Radarsat-2 for the purpose of monitoring subsidence induced by longwall mining activity using satellite synthetic aperture radar interferometry (InSAR). Because of the enhancement of orbit control (pairs with shorter perpendicular baseline) from the new satellite SAR systems, the mine subsidence detection is now mainly constrained by the phase discontinuities due to large deformation and temporal decorrelation noise. This paper investigates the performance of the three satellite missions with different imaging modes for mapping longwall mine subsidence. The results show that the three satellites perform better than their predecessors. The simulation results show that the Sentinel-1A/B constellation is capable of mapping rapid mine subsidence, especially the Sentinel-1A/B constellation with stripmap (SM) mode. Unfortunately, the Sentinel-1A/B SM data are not available in most cases and hence real data analysis cannot be conducted in this study. Setting aside the Sentinel-1A/B SM data, the simulation and real data analysis suggest that ALOS-2 is best suited for mapping mine subsidence amongst the three missions. Although not investigated in this study, the X-band satellites TerraSAR-X and COSMO-SkyMed with short temporal baseline and high spatial resolution can be comparable with the performance of the Radarsat-2 and Sentinel-1 C-band data over the dry surface with sparse vegetation. The potential of the recently launched satellites (e.g. ALOS-2 and Sentinel-1A/B) for mapping longwall mine subsidence is expected to be better than the results of this study, if the data acquired from the ideal acquisition modes are available.
Digital mine claim density map for Federal lands in Montana, 1996
Campbell, Harry W.; Hyndman, Paul C.
1998-01-01
This report describes a digital map and data files generated by the U.S. Geological Survey (USGS) to provide digital spatial mining claim information for Federal lands in Montana as of March, 1997. Statewide, 159,704 claims had been recorded with the Bureau of Land Management since 1975. Of those claims, 21,055 (13%) are still actively held while 138,649 (87%) are closed and are no longer held. Montana contains 147,704 sections (usually 1 section equals 1 square mile) in the Public Land Survey System, with 8,569 sections (6%) containing claim data. Of the sections with claim data, 2,192 (26%) contain actively held claims. Only 1.5% of Montana’s sections contain actively held mining claims. The four types of mining claim are lode, placer, mill, and tunnel. A mill claim may be as much as 5 acres or 1/128th (0.78125%) of a square mile. A lode claim, about 20 acres, would cover 1/32nd (3.125%) of a square mile. Mining claim data is earth science information deemed to be relevant to the assessment of historic, current, and future ecological, economic, and social systems. The digital map and data files that are available in this report are suitable for geographic information system (GIS)-based regional assessments at a scale of 1:100,000 or smaller. Campbell (1996) summarized the methodology and GIS techniques that were used to produce the mining claim density map of the Pacific Northwest. Campbell and Hyndman (1997) displayed mining claim information for the Pacific Northwest that used data acquired in 1994. Appendix A of this report lists the attribute data for the digital data files. Appendix B contains the GIS metadata.
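The percentages quoted above can be checked with a few lines of arithmetic. The counts come from the abstract. The constant names and the `percent` helper are introduced here for illustration, and the conversion of 640 acres per square mile is a standard value rather than one stated in the report.

```python
# Counts quoted in the abstract.
TOTAL_CLAIMS = 159_704
ACTIVE_CLAIMS = 21_055
CLOSED_CLAIMS = 138_649
TOTAL_SECTIONS = 147_704
SECTIONS_WITH_CLAIMS = 8_569
SECTIONS_WITH_ACTIVE_CLAIMS = 2_192

# Standard conversion (not stated in the abstract).
ACRES_PER_SQUARE_MILE = 640


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


active_share = percent(ACTIVE_CLAIMS, TOTAL_CLAIMS)                    # about 13%
closed_share = percent(CLOSED_CLAIMS, TOTAL_CLAIMS)                    # about 87%
claimed_sections = percent(SECTIONS_WITH_CLAIMS, TOTAL_SECTIONS)       # about 6%
active_of_claimed = percent(SECTIONS_WITH_ACTIVE_CLAIMS, SECTIONS_WITH_CLAIMS)  # about 26%
active_of_all = percent(SECTIONS_WITH_ACTIVE_CLAIMS, TOTAL_SECTIONS)   # about 1.5%

mill_claim_share = percent(5, ACRES_PER_SQUARE_MILE)   # 5 acres = 1/128 square mile
lode_claim_share = percent(20, ACRES_PER_SQUARE_MILE)  # 20 acres = 1/32 square mile
```

The active and closed counts sum to the statewide total, and each rounded percentage matches the figure given in the text.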
Combining complex networks and data mining: Why and how
NASA Astrophysics Data System (ADS)
Zanin, M.; Papo, D.; Sousa, P. A.; Menasalvas, E.; Nicchi, A.; Kubik, E.; Boccaletti, S.
2016-05-01
The increasing power of computer technology does not dispense with the need to extract meaningful information out of data sets of ever growing size, and indeed typically exacerbates the complexity of this task. To tackle this general problem, two methods have emerged, at chronologically different times, that are now commonly used in the scientific community: data mining and complex network theory. Not only do complex network analysis and data mining share the same general goal, that of extracting information from complex systems to ultimately create a new compact quantifiable representation, but they often address similar problems too. In the face of that, surprisingly few researchers resort to both methodologies. One may then be tempted to conclude that these two fields are either largely redundant or totally antithetic. The starting point of this review is that this state of affairs should be put down to contingent rather than conceptual differences, and that these two fields can in fact advantageously be used in a synergistic manner. An overview of both fields is first provided, some fundamental concepts of which are illustrated. A variety of contexts in which complex network theory and data mining have been used in a synergistic manner are then presented. Contexts in which the appropriate integration of complex network metrics can lead to improved classification rates with respect to classical data mining algorithms and, conversely, contexts in which data mining can be used to tackle important issues in complex network theory applications are illustrated. Finally, ways to achieve a tighter integration between complex networks and data mining, and open lines of research are discussed.
NASA Technical Reports Server (NTRS)
Anderson, J. E. (Principal Investigator)
1979-01-01
Assistance by NASA to EPA in the establishment and maintenance of a fully operational energy-related monitoring system included: (1) regional analysis applications based on LANDSAT and auxiliary data; (2) development of techniques for using aircraft MSS data to rapidly monitor site specific surface coal mine activities; and (3) registration of aircraft MSS data to a map base. The coal strip mines used in the site specific task were in Campbell County, Wyoming; Big Horn County, Montana; and the Navajo mine in San Juan County, New Mexico. The procedures and software used to accomplish these tasks are described.
Adaptive semantic tag mining from heterogeneous clinical research texts.
Hao, T; Weng, C
2015-01-01
To develop an adaptive approach to mine frequent semantic tags (FSTs) from heterogeneous clinical research texts. We develop a "plug-n-play" framework that integrates replaceable unsupervised kernel algorithms with formatting, functional, and utility wrappers for FST mining. Temporal information identification and semantic equivalence detection were two example functional wrappers. We first compared this approach's recall and efficiency for mining FSTs from ClinicalTrials.gov to that of a recently published tag-mining algorithm. Then we assessed this approach's adaptability to two other types of clinical research texts: clinical data requests and clinical trial protocols, by comparing the prevalence trends of FSTs across three texts. Our approach increased the average recall and speed by 12.8% and 47.02% respectively upon the baseline when mining FSTs from ClinicalTrials.gov, and maintained an overlap in relevant FSTs with the baseline ranging between 76.9% and 100% for varying FST frequency thresholds. The FSTs saturated when the data size reached 200 documents. Consistent trends in the prevalence of FST were observed across the three texts as the data size or frequency threshold changed. This paper contributes an adaptive tag-mining framework that is scalable and adaptable without sacrificing its recall. This component-based architectural design can be potentially generalizable to improve the adaptability of other clinical text mining methods.
Using Text Mining to Uncover Students' Technology-Related Problems in Live Video Streaming
ERIC Educational Resources Information Center
Abdous, M'hammed; He, Wu
2011-01-01
Because of their capacity to sift through large amounts of data, text mining and data mining are enabling higher education institutions to reveal valuable patterns in students' learning behaviours without having to resort to traditional survey methods. In an effort to uncover live video streaming (LVS) students' technology-related problems and to…
Recommending Learning Activities in Social Network Using Data Mining Algorithms
ERIC Educational Resources Information Center
Mahnane, Lamia
2017-01-01
In this paper, we show how data mining algorithms (e.g. Apriori Algorithm (AP) and Collaborative Filtering (CF)) are useful in a New Social Network (NSN-AP-CF). "NSN-AP-CF" processes the clusters based on different learning styles. Next, it analyzes the habits and the interests of the users through mining the frequent episodes by the…
2016-05-06
ABSTRACT Awards: Best Paper Honorable Mention Award at the SIAM (Society for Industrial and Applied Mathematics Conference on Data Mining (SDM...magnitude in computation time over the state of the art. 15. SUBJECT TERMS Data Mining 16. SECURITY CLASSIFICATION OF: 17...International Conference on Data Mining and received Best Paper Honorable mention. To ensure broad use and uptake of the outcomes of this research
Numerical linear algebra in data mining
NASA Astrophysics Data System (ADS)
Eldén, Lars
Ideas and algorithms from numerical linear algebra are important in several areas of data mining. We give an overview of linear algebra methods in text mining (information retrieval), pattern recognition (classification of handwritten digits), and PageRank computations for web search engines. The emphasis is on rank reduction as a method of extracting information from a data matrix, low-rank approximation of matrices using the singular value decomposition and clustering, and on eigenvalue methods for network analysis.
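As an illustration of the eigenvalue methods for network analysis mentioned in this abstract, the sketch below runs a plain power iteration for PageRank on a toy link graph. It is not taken from the chapter: the graph, the damping factor of 0.85, the tolerance and the function name are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on a small link graph given as {node: [outgoing nodes]}.

    Every node that appears as a link target must also be a key of `links`.
    The damping factor 0.85 is a conventional choice, assumed here.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in links[v]:
                new[w] += damping * rank[v] / len(links[v])
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical four-page web: page "d" has no incoming links.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The iteration converges to the dominant eigenvector of the damped link matrix; production systems work with sparse matrices rather than dictionaries, so this is only a readable sketch of the idea.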
15 CFR 971.501 - Resource assessment, recovery plan, and logical mining unit.
Code of Federal Regulations, 2010 CFR
2010-01-01
..., and logical mining unit. 971.501 Section 971.501 Commerce and Foreign Trade Regulations Relating to... COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR... mining unit. (a) The applicant must submit with the application a resource assessment to provide a basis...
MINE WASTE TECHNOLOGY PROGRAM PREVENTION OF ACID MINE DRAINAGE GENERATION FROM OPEN-PIT HIGHWALLS
This document summarizes the results of Mine Waste Technology Program Activity III, Project 26, Prevention of Acid Mine Drainage Generation from Open-Pit Highwalls. The intent of this project was to obtain performance data on the ability of four technologies to prevent the gener...
Data Visualization in Information Retrieval and Data Mining (SIG VIS).
ERIC Educational Resources Information Center
Efthimiadis, Efthimis
2000-01-01
Presents abstracts that discuss using data visualization for information retrieval and data mining, including immersive information space and spatial metaphors; spatial data using multi-dimensional matrices with maps; TREC (Text Retrieval Conference) experiments; users' information needs in cartographic information retrieval; and users' relevance…
Predicting ground-water movement in large mine spoil areas in the Appalachian Plateau
Wunsch, D.R.; Dinger, J.S.; Graham, C.D.R.
1999-01-01
Spoil created by surface mining can accumulate large quantities of ground-water, which can create geotechnical or regulatory problems, as well as flood active mine pits. A current study at a large (4.1 km²), thick (up to 90 m) spoil body in eastern Kentucky reveals important factors that control the storage and movement of water. Ground-water recharge occurs along the periphery of the spoil body where surface-water drainage is blocked, as well as from infiltration along the spoil-bedrock contact, recharge from adjacent bedrock, and to a minor extent, through macropores at the spoil's surface. Based on an average saturated thickness of 6.4 m for all spoil wells, and assuming an estimated porosity of 20%, approximately 5.2 × 10⁶ m³ of water is stored within the existing 4.1 km² of reclaimed spoil. A conceptual model of ground-water flow, based on data from monitoring wells, dye-tracing data, discharge from springs and ponds, hydraulic gradients, chemical data, field reconnaissance, and aerial photographs, indicates that three distinct but interconnected saturated zones have been established: one in the spoil's interior, and others in the valley fills that surround the main spoil body at lower elevations. Ground-water movement is sluggish in the spoil's interior, but moves quickly through the valley fills. The conceptual model shows that a prediction of ground-water occurrence, movement, and quality can be made for active or abandoned spoil areas if all or some of the following data are available: structural contour of the base of the lowest coal seam being mined, pre-mining topography, documentation of mining methods employed throughout the mine, overburden characteristics, and aerial photographs of mine progression.
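The storage estimate quoted in this abstract follows from multiplying the spoil area by the average saturated thickness and the assumed porosity. A minimal sketch of that arithmetic, using only the figures given in the abstract (variable names are illustrative):

```python
# Figures quoted in the abstract; variable names are illustrative only.
area_km2 = 4.1               # extent of reclaimed spoil
saturated_thickness_m = 6.4  # average over all spoil wells
porosity = 0.20              # estimated porosity assumed by the authors

area_m2 = area_km2 * 1_000_000  # 1 km^2 = 1,000,000 m^2
stored_water_m3 = area_m2 * saturated_thickness_m * porosity

print(f"Stored water: {stored_water_m3:.2e} m^3")
```

The result is about five million cubic metres, consistent with the value reported by the authors.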
The development of health care data warehouses to support data mining.
Lyman, Jason A; Scully, Kenneth; Harrison, James H
2008-03-01
Clinical data warehouses offer tremendous benefits as a foundation for data mining. By serving as a source for comprehensive clinical and demographic information on large patient populations, they streamline knowledge discovery efforts by providing standard and efficient mechanisms to replace time-consuming and expensive original data collection, organization, and processing. Building effective data warehouses requires knowledge of and attention to key issues in database design, data acquisition and processing, and data access and security. In this article, the authors provide an operational and technical definition of data warehouses, present examples of data mining projects enabled by existing data warehouses, and describe key issues and challenges related to warehouse development and implementation.
Assessment of satellite and aircraft multispectral scanner data for strip-mine monitoring
NASA Technical Reports Server (NTRS)
Spisz, E. W.; Dooley, J. T.
1980-01-01
The application of LANDSAT multispectral scanner data to describe the mining and reclamation changes of a hilltop surface coal mine in the rugged, mountainous area of eastern Kentucky is presented. Original single band satellite imagery, computer enhanced single band imagery, and computer classified imagery are presented for four different data sets in order to demonstrate the land cover changes that can be detected. Data obtained with an 11 band multispectral scanner on board a C-47 aircraft at an altitude of 3000 meters are also presented. Comparing the satellite data with color, infrared aerial photography, and ground survey data shows that significant changes in the disrupted area can be detected from LANDSAT band 5 satellite imagery for mines with more than 100 acres of disturbed area. However, band-ratio (bands 5/6) imagery provides greater contrast than single band imagery and can provide a qualitative level 1 classification of the land cover that may be useful for monitoring either the disturbed mining area or the revegetation progress. However, if a quantitative, accurate classification of the barren or revegetated classes is required, it is necessary to perform a detailed, four band computer classification of the data.
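The band-ratio (bands 5/6) imagery described in this abstract can be illustrated with a pixel-wise ratio of two small grids. The sketch below is not the authors' processing chain: the digital numbers and the threshold of 1.0 are invented for illustration. It relies on the usual MSS band assignment, in which band 5 is red and band 6 is near-infrared, so bare disturbed ground tends to give a higher 5/6 ratio than vegetation.

```python
def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two equally sized 2-D grids of digital numbers."""
    return [
        [a / (b + eps) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


# Hypothetical 3x3 scene: three bright, bare pixels among vegetated ones.
band5 = [[30, 32, 80], [31, 85, 90], [29, 30, 33]]
band6 = [[60, 64, 70], [62, 72, 75], [58, 61, 66]]

ratio = band_ratio(band5, band6)

threshold = 1.0  # hypothetical cut-off, not taken from the report
disturbed_mask = [[r > threshold for r in row] for row in ratio]
disturbed_pixels = sum(cell for row in disturbed_mask for cell in row)

print("pixels flagged as disturbed:", disturbed_pixels)
```

Multiplying the flagged pixel count by the ground area of one pixel would give a rough disturbed-area estimate; a quantitative classification, as the abstract notes, needs all four bands.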
Exploration of geo-mineral compounds in granite mining soils using XRD pattern data analysis
NASA Astrophysics Data System (ADS)
Koteswara Reddy, G.; Yarakkula, Kiran
2017-11-01
The purpose of the study was to investigate the major minerals present in granite mining waste and in agricultural soils near and away from mining areas. The minerals in representative sub-soil samples were identified by X-ray diffractometer (XRD) pattern data analysis. Morphological features and quantitative elemental analysis were obtained by Scanning Electron Microscopy-Energy Dispersive Spectroscopy (SEM-EDS). The XRD pattern data revealed that the major minerals in granite waste are Quartz, Albite, Anorthite, K-Feldspars, Muscovite, Annite, Lepidolite, Illite, Enstatite and Ferrosilite. In agricultural farm soils, however, the major minerals are Gypsum, Calcite, Magnetite, Hematite, Muscovite, K-Feldspars and Quartz. Moreover, the agricultural soils neighbouring mining areas were found to be enriched in Mica group minerals (Lepidolite and Illite) and Orthopyroxene group minerals (Ferrosilite and Enstatite). The Mica and Orthopyroxene group minerals are present in agricultural farm soils neighbouring mining areas and absent in agricultural farm soils away from mining areas. The study demonstrated that chemical migration takes place in agricultural farm lands in the vicinity of granite mining areas.
Adult tooth loss for residents of US coal mining and Appalachian counties.
Hendryx, Michael; Ducatman, Alan M; Zullig, Keith J; Ahern, Melissa M; Crout, Richard
2012-12-01
The authors compared rates of tooth loss between adult residents of Appalachian coal-mining areas and other areas of the nation before and after control for covariate risks. The authors conducted a cross-sectional secondary data analysis that merged 2006 national Behavioral Risk Factor Surveillance System (BRFSS) data (N = 242 184) with county coal-mining data and other county characteristics. The hypothesis tested was that adult tooth loss would be greater in Appalachian mining areas after control for other risks. Primary independent variables included main effects for coal mining present (yes/no), residence in Appalachia (yes/no), and their interaction. Data were weighted using the BRFSS final weights and analyzed using SUDAAN Proc Multilog to account for the multilevel complex sampling structure. The odds of two measures of tooth loss were examined controlling for age, race/ethnicity, drinking, smoking, income, education, supply of dentists, receipt of dental care, fluoridation rate, and other variables. After covariate adjustment, the interaction variable for the residents of Appalachian coal-mining counties showed significantly elevated odds for any tooth loss [odds ratio (OR) = 1.19, 95% CI = 1.02, 1.38], and greater tooth loss measured by a 4-level edentulism scale (OR = 1.20, 95% CI = 1.05, 1.36). The main effect for Appalachia was also significant for both measures, but the main effect for coal mining was not. Greater risk of tooth loss among adult residents of Appalachian coal-mining areas is present and is not explained by differences in reported receipt of dental care, fluoridation rates, supply of dentists or other behavioral or socioeconomic risks. Possible contributing factors include mining-specific disparities related to access, behavior or environmental exposures. © 2012 John Wiley & Sons A/S.
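One way to read the reported odds ratios is to recover an approximate test statistic from each 95% confidence interval. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale; the survey-weighted SUDAAN analysis used by the authors may not match this exactly, so the numbers are only a rough consistency check, and the function name is hypothetical.

```python
import math


def z_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate z statistic for an odds ratio from its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE), i.e. symmetric
    on the log scale; this is an assumption, not stated in the abstract.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(odds_ratio) / se


z_any_loss = z_from_ci(1.19, 1.02, 1.38)  # any tooth loss
z_scale = z_from_ci(1.20, 1.05, 1.36)     # 4-level edentulism scale

print(f"any tooth loss: z = {z_any_loss:.2f}")
print(f"edentulism scale: z = {z_scale:.2f}")
```

Both statistics exceed 1.96, which matches the abstract's statement that the intervals exclude 1 and the interaction effect is significant at the 5% level.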
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.
Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E
2016-01-04
We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Stress monitoring versus microseismic ruptures in an active deep mine
NASA Astrophysics Data System (ADS)
Tonnellier, Alice; Bouffier, Christian; Bigarré, Pascal; Nyström, Anders; Österberg, Anders; Fjellström, Peter
2015-04-01
The underground mining industry has developed high-technology mass mining methods to optimise productivity at deep levels. Such massive extraction induces high-level stress redistribution that generates seismic events around the mining works, threatening safety and economics. For this reason, mining irregular deep ore bodies calls for steadily enhanced scientific practices and technologies to keep the mine environment safe and stable for the miners and the infrastructure. INERIS, within the framework of the FP7 European project I2Mine and in partnership with the Swedish mining company Boliden, has developed new methodologies to monitor both quasi-static stress changes and ruptures in a seismic-prone area. For this purpose, a unique local permanent microseismic and stress monitoring network has been installed in the deep-working Garpenberg mine, situated north of Uppsala (Sweden). In this mine, ore is extracted using sublevel stoping with a paste fill production/distribution system and the long-hole drilling method. The monitoring network has been deployed between about 1100 and 1250 m depth. It consists of six 1-component and five 3-component microseismic probes (14-Hz geophones) deployed in the Lappberget area, in addition to three 3D stress monitoring cells that focus on a very local exploited area. The objective is three-fold: to quantify accurately quasi-static stress changes and freshly induced stress gradients with drift development in the orebody, to study quantitatively those stress changes versus the induced microseismic ruptures that are detected and located, and possibly to identify quasi-static stress transfer from those seismic ruptures. Geophysical and geotechnical data are acquired continuously and automatically transferred to the INERIS datacenter through the web. They are made available on a secured web cloud monitoring infrastructure called e.cenaris and complemented with mine data. 
This interface enables the visualisation of the monitoring data coming from the mine in quasi-real time and facilitates information exchange and decision making for experts and stakeholders. On the basis of this data acquisition and sharing, a preliminary analysis has been started to determine whether stress variations and seismic source behaviour are directly linked to the evolution of the mine workings and could improve knowledge of the equilibrium states inside the mine. Knowing such parameters will be a potential way to better understand the response of deep mining activities to the loading imposed by exploitation and to develop, if possible, methods to prevent major hazards such as rock bursts and other ground failure phenomena.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-27
...\\ level of State data mining participation in data mining reporting only. activities. MFCU Recertification... collection of information, to search data sources, to complete and review the collection of information, and...
The application of satellite data in monitoring strip mines
NASA Technical Reports Server (NTRS)
Sharber, L. A.; Shahrokhi, F.
1977-01-01
Strip mines in the New River Drainage Basin of Tennessee were studied through use of Landsat-1 imagery and aircraft photography. A multilevel analysis, involving conventional photo interpretation techniques, densitometric methods, multispectral analysis and statistical testing, was applied to the data. The Landsat imagery proved adequate for monitoring large-scale change resulting from active mining and land-reclamation projects. However, the spatial resolution of the satellite imagery rendered it inadequate for assessment of many smaller strip mines in the region, which may be as small as a few hectares.
Hydrology of area 4, Eastern Coal Province, Pennsylvania, Ohio, and West Virginia
Roth, Donald K.; Engelke, Morris J.
1981-01-01
Area 4 (one of the 24 hydrologic areas defining the Eastern Coal Province) is located at the northern end of the Eastern Coal Province in eastern Ohio, northern West Virginia, and western Pennsylvania. It is part of the upper Ohio River basin, which includes the Beaver, Mahoning, and Shenango Rivers. The area is underlain by rocks of the Pottsville, Allegheny, Conemaugh, and Monongahela Groups (or Formations) and the Dunkard Group. Area 4 has a temperate climate with an annual average rainfall of 38 to 42 inches, and most of its area is covered by forest. The soils have a high erosion potential where the vegetation cover is removed. In response to Public Law 95-87, 132 sites were added to the existing surface-water data-collection network in area 4. The data collected at these added sites include discharge, water quality, sediment, and biology. The data are available from computer storage through the National Water Data Exchange (NAWDEX) or the published annual Water Resources Data reports for Ohio, Pennsylvania, and West Virginia. Hydrologic problems related to mining are: (1) erosion and increased sedimentation, and (2) degradation of water quality. Erosion and sedimentation are associated chiefly with surface mining. Sediment yields increase drastically when vegetation is removed from the highly erosive soils. Degradation of water quality can be caused by acid-mine drainage from underground and surface mining. More than half the acid-mine drainage effluent in area 4 comes from underground mines. The rest seeps from abandoned surface mines. In reclaimed surface mines, the overburden is usually replaced so soon after the coal is taken out that oxidation of acid-forming minerals, commonly pyrite or marcasite, is not complete or is neutralized by the buffering action of calcareous minerals in the soils. (USGS)
NASA Astrophysics Data System (ADS)
Ochałek, Agnieszka; Lipecki, Tomasz; Jaśkowski, Wojciech; Jabłoński, Mateusz
2018-03-01
Bathymetry, the empirical part of hydrography, is a significant part of the discipline. It is the study of the underwater depth of waterways and reservoirs, and the graphic presentation of measured data in the form of bathymetric maps, cross-sections and three-dimensional bottom models. Bathymetric measurements are based on the Global Positioning System and devices for hydrographic measurements: an echo sounder and a side sonar scanner. In this research the authors focus on a case of obtaining and processing bathymetric data and building numerical bottom models of two post-mining reclaimed water reservoirs: Dwudniaki Lake in Wierzchosławice and a flooded quarry in Zabierzów. The report also includes an analysis of data from still-operating mining water reservoirs located in Poland to show how bathymetry can be used in the mining industry. A significant issue is the integration of bathymetric data with geodetic data from tachymetry and terrestrial laser scanning measurements.
Stefan, Sarah E; Ehsan, Mohammad; Pearson, Wright L; Aksenov, Alexander; Boginski, Vladimir; Bendiak, Brad; Eyler, John R
2011-11-15
Data mining algorithms have been used to analyze the infrared multiple photon dissociation (IRMPD) patterns of gas-phase lithiated disaccharide isomers irradiated with either a line-tunable CO2 laser or a free electron laser (FEL). The IR fragmentation patterns over the wavelength range of 9.2-10.6 μm have been shown in earlier work to correlate uniquely with the asymmetry at the anomeric carbon in each disaccharide. Application of data mining approaches for data analysis allowed unambiguous determination of the anomeric carbon configurations for each disaccharide isomer pair using fragmentation data at a single wavelength. In addition, the linkage positions were easily assigned. This combination of wavelength-selective IRMPD and data mining offers a powerful and convenient tool for differentiation of structurally closely related isomers, including those of gas-phase carbohydrate complexes.
A novel water quality data analysis framework based on time-series data mining.
Deng, Weihui; Wang, Guoyin
2017-07-01
The rapid development of time-series data mining provides an emerging method for water resource management research. In this paper, based on the time-series data mining methodology, we propose a novel and general analysis framework for water quality time-series data. It consists of two parts: implementation components and common tasks of time-series data mining in water quality data. In the first part, we propose to granulate the time series into several two-dimensional normal clouds and calculate the similarities at the granulated level. On the basis of the similarity matrix, the similarity search, anomaly detection, and pattern discovery tasks in the water quality time-series instance dataset can be easily implemented in the second part. We present a case study of this analysis framework on weekly Dissolved Oxygen (DO) time-series data collected from five monitoring stations on the upper reaches of the Yangtze River, China. It discovered the relationship of water quality in the mainstream and tributary as well as the main changing patterns of DO. The experimental results show that the proposed analysis framework is a feasible and efficient method to mine the hidden and valuable knowledge from water quality historical time-series data. Copyright © 2017 Elsevier Ltd. All rights reserved.
Breast Imaging in the Era of Big Data: Structured Reporting and Data Mining.
Margolies, Laurie R; Pandey, Gaurav; Horowitz, Eliot R; Mendelson, David S
2016-02-01
The purpose of this article is to describe structured reporting and the development of large databases for use in data mining in breast imaging. The results of millions of breast imaging examinations are reported with structured tools based on the BI-RADS lexicon. Much of these data are stored in accessible media. Robust computing power creates great opportunity for data scientists and breast imagers to collaborate to improve breast cancer detection and optimize screening algorithms. Data mining can create knowledge, but the questions asked and their complexity require extremely powerful and agile databases. New data technologies can facilitate outcomes research and precision medicine.
Graph Mining Meets the Semantic Web
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Sangkeun; Sukumar, Sreenivas R; Lim, Seung-Hwan
The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. We address that need through implementation of three popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, and PageRank). We implement these algorithms as SPARQL queries, wrapped within Python scripts. We evaluate the performance of our implementation on 6 real world data sets and show graph mining algorithms (that have a linear-algebra formulation) can indeed be unleashed on data represented as RDF graphs using the SPARQL query interface.
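To make the triangle-count task concrete, the sketch below counts undirected triangles in a small set of RDF-style (subject, predicate, object) triples. It is written in plain Python and is not the authors' SPARQL implementation; the triples, the "knows" predicate and the function name are hypothetical.

```python
from itertools import combinations


def count_triangles(triples):
    """Count undirected triangles among (subject, predicate, object) triples.

    Predicates are ignored and edge direction is dropped; both are
    simplifying assumptions made for this illustration.
    """
    neighbours = {}
    for s, _p, o in triples:
        if s == o:
            continue  # self-loops cannot form triangles
        neighbours.setdefault(s, set()).add(o)
        neighbours.setdefault(o, set()).add(s)

    total = 0
    for adj in neighbours.values():
        # A pair of neighbours that are themselves linked closes a triangle.
        for u, v in combinations(sorted(adj), 2):
            if v in neighbours[u]:
                total += 1
    return total // 3  # each triangle is seen once from each of its corners


toy_triples = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("c", "knows", "d"),
    ("d", "knows", "a"),
    ("d", "knows", "e"),
]
n_triangles = count_triangles(toy_triples)
print("triangles:", n_triangles)
```

In a SPARQL setting the same idea would be expressed as a three-way join over the triple patterns; the set-based version here only shows what is being counted.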
Mapping informal small-scale mining features in a data-sparse tropical environment with a small UAS
Chirico, Peter G.; Dewitt, Jessica D.
2017-01-01
This study evaluates the use of a small unmanned aerial system (UAS) to collect imagery over artisanal mining sites in West Africa. The purpose of this study is to consider how very high-resolution imagery and digital surface models (DSMs) derived from structure-from-motion (SfM) photogrammetric techniques from a small UAS can fill the gap in geospatial data collection between satellite imagery and data gathered during field work to map and monitor informal mining sites in tropical environments. The study compares both wide-angle and narrow field of view camera systems in the collection and analysis of high-resolution orthoimages and DSMs of artisanal mining pits. The results of the study indicate that UAS imagery and SfM photogrammetric techniques permit DSMs to be produced with a high degree of precision and relative accuracy, but highlight the challenges of mapping small artisanal mining pits in remote and data sparse terrain.
Data Analysis and Data Mining: Current Issues in Biomedical Informatics
Bellazzi, Riccardo; Diomidous, Marianna; Sarkar, Indra Neil; Takabayashi, Katsuhiko; Ziegler, Andreas; McCray, Alexa T.
2011-01-01
Summary. Background: Medicine and biomedical sciences have become data-intensive fields, which, at the same time, enable the application of data-driven approaches and require sophisticated data analysis and data mining methods. Biomedical informatics provides a proper interdisciplinary context to integrate data and knowledge when processing available information, with the aim of giving effective decision-making support in clinics and translational research. Objectives: To reflect on different perspectives related to the role of data analysis and data mining in biomedical informatics. Methods: On the occasion of the 50th year of Methods of Information in Medicine a symposium was organized that reflected on opportunities, challenges and priorities of organizing, representing and analysing data, information and knowledge in biomedicine and health care. The contributions of experts with a variety of backgrounds in the area of biomedical data analysis have been collected as one outcome of this symposium, in order to provide a broad, though coherent, overview of some of the most interesting aspects of the field. Results: The paper presents sections on data accumulation and data-driven approaches in medical informatics, data and knowledge integration, statistical issues for the evaluation of data mining models, translational bioinformatics and bioinformatics aspects of genetic epidemiology. Conclusions: Biomedical informatics represents a natural framework to properly and effectively apply data analysis and data mining methods in a decision-making context. In the future, it will be necessary to preserve the inclusive nature of the field and to foster an increasing sharing of data and methods between researchers. PMID:22146916
String Mining in Bioinformatics
NASA Astrophysics Data System (ADS)
Abouelhoda, Mohamed; Ghanem, Moustafa
Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced in neither the data mining nor the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
Satellite data for surface-mine inventory. [in Maryland]
NASA Technical Reports Server (NTRS)
Anderson, A. T.; Schultz, D.; Buchman, N.; Nock, M.
1976-01-01
To determine the feasibility of satellite data for surface-mine inventory, particularly as it applies to coal, a case study was conducted in Maryland. A band-ratio method was developed to measure disturbed surface areas, and it proved to be extendible both temporally and geographically. This method was used to measure area changes in the region over three time periods from September 1972 through July 1974 and to map the entire two-county area for 1973. For mines ranging between 31 and 244 acres (12 to 98 hectares) the measurement accuracy of total affected acreage was determined to be 92%. Mines of 120 acres (50 hectares) and larger were measured with greater accuracy, some within one percent of the actual area. The ability to identify, classify, and measure strip-mine surfaces in a two-county area (1,541 square kilometers - 595 square miles) of western Maryland was demonstrated through the use of computer processing. On the basis of these results, multispectral analysis of digital LANDSAT satellite data, combined with multilevel sampling of aircraft data and field verification inspections, is shown to be an effective, rapid, and accurate means of monitoring the surface mining cycle.
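The band-ratio measurement described in this abstract can be illustrated with a minimal sketch. The abstract does not state which bands were ratioed, what threshold was applied, the pixel size, or how accuracy was defined, so the band inputs, the threshold of 1.5, the pixel area, and the accuracy formula below are all hypothetical placeholders rather than values or methods from the study.

```python
def disturbed_area_hectares(band_a, band_b, threshold=1.5, pixel_area_m2=4500.0):
    """Estimate disturbed area by thresholding a per-pixel band ratio.

    band_a, band_b: equally sized 2-D lists of reflectance values
    (which bands to use is an assumption, not stated in the abstract).
    threshold, pixel_area_m2: illustrative values only.
    """
    flagged = 0
    for row_a, row_b in zip(band_a, band_b):
        for a, b in zip(row_a, row_b):
            # Skip pixels with a non-positive denominator to avoid division errors.
            if b > 0 and a / b > threshold:
                flagged += 1
    return flagged * pixel_area_m2 / 10_000.0  # square metres to hectares


def percent_accuracy(measured, reference):
    """Assumed accuracy definition: 100 * (1 - |measured - reference| / reference)."""
    return 100.0 * (1.0 - abs(measured - reference) / reference)
```

Under this assumed definition, a mine measured at 92 hectares against a reference of 100 hectares would score 92%.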
NASA Technical Reports Server (NTRS)
Anderson, A. T.; Schubert, J.
1974-01-01
The largest contour strip mining operations in western Maryland and West Virginia are located within the Georges Creek and the Upper Potomac Basins. These two coal basins lie within the Georges Creek (Wellersburg) syncline. The disturbed strip mine areas were delineated with the surrounding geological and vegetation features using ERTS-1 data in both analog (imagery) and digital form. The two digital systems used were: (1) the ERTS-Analysis system, a point-by-point digital analysis of spectral signatures based on known spectral values, and (2) the LARS Automatic Data Processing System. The digital techniques being developed will later be incorporated into a data base for land use planning. These two systems aided in efforts to determine the extent and state of strip mining in this region. Aircraft data, ground verification information, and geological field studies also aided in the application of ERTS-1 imagery to perform an integrated analysis that assessed the adverse effects of strip mining. The results indicated that ERTS can both monitor and map the extent of strip mining to determine immediately the acreage affected and indicate where future reclamation and revegetation may be necessary.
Kenny, J.F.; McCauley, J.R.
1983-01-01
Disturbances resulting from intensive coal mining in the Cherry Creek basin of southeastern Kansas were investigated using color and color-infrared aerial photography in conjunction with water-quality data from simultaneously acquired samples. Imagery was used to identify the type and extent of vegetative cover on strip-mined lands and the extent and success of reclamation practices. Drainage patterns, point sources of acid mine drainage, and recharge areas for underground mines were located for onsite inspection. Comparison of these interpretations with water-quality data illustrated differences between the eastern and western parts of the Cherry Creek basin. Contamination in the eastern part is due largely to circulation of water from unreclaimed strip mines and collapse features through the network of underground mines and subsequent discharge of acidic drainage through seeps. Contamination in the western part is primarily caused by runoff and seepage from strip-mined lands in which surfaces have frequently been graded and limed but are generally devoid of mature stands of soil-anchoring vegetation. The successful use of aerial photography in the study of Cherry Creek basin indicates the potential of using remote-sensing techniques in studies of other coal-mined regions. (USGS)
Auer, Manfred; Peng, Hanchuan; Singh, Ambuj
2007-01-01
The 2006 International Workshop on Multiscale Biological Imaging, Data Mining and Informatics was held at Santa Barbara, on Sept 7–8, 2006. Based on the presentations at the workshop, we selected and compiled this collection of research articles related to novel algorithms and enabling techniques for bio- and biomedical image analysis, mining, visualization, and biology applications. PMID:17634090
Bevans, Hugh E.; Diaz, Arthur M.
1980-01-01
Summaries of descriptive statistics are compiled for 14 data-collection sites located on streams draining areas that have been shaft mined and strip mined for coal in Cherokee and Crawford Counties in southeastern Kansas. These summaries include water-quality data collected from October 1976 through April 1979. Regression equations relating specific conductance and instantaneous streamflow to concentrations of bicarbonate, sulfate, chloride, fluoride, calcium, magnesium, sodium, potassium, silica, and dissolved solids are presented.
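As a rough illustration of how a regression equation relating specific conductance to a constituent concentration might be derived, the sketch below fits an ordinary least-squares line. The abstract does not give the functional form used (the published equations may be transformed or multivariate), so the simple linear form and the function name `fit_line` are assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    x: specific conductance readings; y: constituent concentrations.
    A plain linear form is assumed for illustration only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

With made-up readings `x = [100, 200, 300]` and `y = [52, 102, 152]`, the fit recovers a slope of 0.5 and an intercept of 2.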
Runkel, Robert L.; Verplanck, Philip; Kimball, Briant; Walton-Day, Katie
2018-01-01
Baseline, premining data for streams draining abandoned mine lands are virtually nonexistent, and indirect methods for estimating premining conditions are needed to establish realistic, cost-effective cleanup goals. One such indirect method is the proximal analog approach, in which premining conditions are estimated using data from nearby mineralized areas that are unaffected by mining. In this paper, we combine the proximal analog approach with a quantitative mass balance framework using data from a spatially detailed synoptic sampling campaign. The combined approach is applied to Cinnamon Gulch, a headwater stream with numerous draining adits. Synoptic sampling results indicate that three of the top five metal sources are affected by mining activities, and stream segments draining these sources account for a large percentage of overall metal loading within the study reach. These initial calculations overestimate the effects of mining, as the affected stream segments were likely acidic and metal rich prior to mining. Premining loads and concentrations were therefore determined through a replacement approach in which the chemistry of each mining-affected stream segment is revised based on proximal analog concentrations. The revised loading profiles indicate that 15–17% of the Al, Cd, Cu, Mn, Ni, and Zn loads are attributable to mining, whereas the mining contribution for Pb is 40%. Premining concentrations of Al, Cd, Cu, Mn, and Zn are estimated to be in excess of aquatic life standards over the length of the study reach.
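The replacement approach can be sketched as a simple mass-balance calculation: compute each segment's load as concentration times discharge, substitute a proximal analog concentration in the mining-affected segments, and compare totals. This is a simplified illustration and not the authors' procedure; the dictionary keys, the single analog concentration, and the function names are hypothetical.

```python
def load(concentration_mg_per_l, discharge_l_per_s):
    """Instantaneous load in mg/s as concentration times discharge."""
    return concentration_mg_per_l * discharge_l_per_s


def mining_fraction(segments, analog_conc):
    """Fraction of the total load attributable to mining.

    segments: list of dicts with hypothetical keys 'conc' (mg/L),
    'q' (L/s) and 'mining_affected' (bool).
    analog_conc: concentration assumed for affected segments before mining.
    """
    observed = sum(load(s["conc"], s["q"]) for s in segments)
    premining = sum(
        load(analog_conc if s["mining_affected"] else s["conc"], s["q"])
        for s in segments
    )
    return (observed - premining) / observed
```

For example, two segments with loads of 20 mg/s each, one of which drops to 10 mg/s after replacement, give a mining fraction of 0.25.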
de la Torre, M L; Grande, J A; Valente, T; Perez-Ostalé, E; Santisteban, M; Aroba, J; Ramos, I
2016-03-01
Poderosa Mine is an abandoned pyrite mine, located in the Iberian Pyrite Belt which pours its acid mine drainage (AMD) waters into the Odiel river (South-West Spain). This work focuses on establishing possible reasons for interdependence between the potential redox and pH, with the load of metals and sulfates, as well as a set of variables that define the physical chemistry of the water-conductivity, temperature, TDS, and dissolved oxygen-transported by a channel from Poderosa mine affected by acid mine drainage, through the use of techniques of artificial intelligence: fuzzy logic and data mining. The sampling campaign was carried out in May of 2012. There were a total of 16 sites, the first inside the tunnel and the last at the mouth of the river Odiel, with a distance of approximately 10 m between each pair of measuring stations. While the tools of classical statistics, which are widely used in this context, prove useful for defining proximity ratios between variables based on Pearson's correlations, in addition to making it easier to handle large volumes of data and producing easier-to-understand graphs, the use of fuzzy logic tools and data mining results in better definition of the variations produced by external stimuli on the set of variables. This tool is adaptable and can be extrapolated to any system polluted by acid mine drainage using simple, intuitive reasoning.
Knowledge Discovery from Massive Healthcare Claims Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chandola, Varun; Sukumar, Sreenivas R; Schryver, Jack C
The role of big data in addressing the needs of the present healthcare system in the US and the rest of the world has been echoed by government, private, and academic sectors. There has been a growing emphasis on exploring the promise of big data analytics in tapping the potential of the massive healthcare data emanating from private and government health insurance providers. While the domain implications of such collaboration are well known, this type of data has been explored to a limited extent in the data mining community. The objective of this paper is twofold: first, we introduce the emerging domain of "big" healthcare claims data to the KDD community, and second, we describe the successes and challenges that we encountered in analyzing this data using state-of-the-art analytics for massive data. Specifically, we translate the problem of analyzing healthcare data into some of the most well-known analysis problems in the data mining community, namely social network analysis, text mining, temporal analysis, and higher-order feature construction, and describe how advances within each of these areas can be leveraged to understand the domain of healthcare. Each case study illustrates a unique intersection of data mining and healthcare with a common objective of improving the cost-care ratio by mining for opportunities to improve healthcare operations and reducing what seems to fall under fraud, waste, and abuse.
78 FR 34093 - An Assessment of Potential Mining Impacts on Salmon Ecosystems of Bristol Bay, Alaska
Federal Register 2010, 2011, 2012, 2013, 2014
2013-06-06
... scientific and technical information presented in the report, the realistic mining scenario used, the data... Potential Mining Impacts on Salmon Ecosystems of Bristol Bay, Alaska AGENCY: Environmental Protection Agency... document titled, ``An Assessment of Potential Mining Impacts on Salmon Ecosystems of Bristol Bay, Alaska...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-19
... DEPARTMENT OF LABOR Proposed Information Collection Request (ICR) for the Mining Voice in the...)(A)]. This program helps to ensure that required data can be provided in the desired format...' voice in mining workplaces under the jurisdiction of DOL's Mine Safety and Health Administration (MSHA...
pubmed.mineR: an R package with text-mining algorithms to analyse PubMed abstracts.
Rani, Jyoti; Shah, A B Rauf; Ramachandran, Srinivasan
2015-10-01
The PubMed literature database is a valuable source of information for scientific research. It is rich in biomedical literature with more than 24 million citations. Data-mining of voluminous literature is a challenging task. Although several text-mining algorithms have been developed in recent years with a focus on data visualization, they have limitations: they are slow, rigid, and not available as open source. We have developed an R package, pubmed.mineR, wherein we have combined the advantages of existing algorithms, overcome their limitations, and offer user flexibility and links with other packages in Bioconductor and the Comprehensive R Archive Network (CRAN) in order to expand the user capabilities for executing multifaceted approaches. Three case studies are presented, namely, 'Evolving role of diabetes educators', 'Cancer risk assessment' and 'Dynamic concepts on disease and comorbidity', to illustrate the use of pubmed.mineR. The package generally runs fast, with small elapsed times on regular workstations, even on large corpus sizes and with compute-intensive functions. pubmed.mineR is available at http://cran.r-project.org/web/packages/pubmed.mineR.
Static versus dynamic sampling for data mining
DOE Office of Scientific and Technical Information (OSTI.GOV)
John, G.H.; Langley, P.
1996-12-31
As data warehouses grow to the point where one hundred gigabytes is considered small, the computational efficiency of data-mining algorithms on large databases becomes increasingly important. Using a sample from the database can speed up the data-mining process, but this is only acceptable if it does not reduce the quality of the mined knowledge. To this end, we introduce the "Probably Close Enough" criterion to describe the desired properties of a sample. Sampling usually refers to the use of static statistical tests to decide whether a sample is sufficiently similar to the large database, in the absence of any knowledge of the tools the data miner intends to use. We discuss dynamic sampling methods, which take into account the mining tool being used and can thus give better samples. We describe dynamic schemes that observe a mining tool's performance on training samples of increasing size and use these results to determine when a sample is sufficiently large. We evaluate these sampling methods on data from the UCI repository and conclude that dynamic sampling is preferable.
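The dynamic scheme described here, observing a mining tool's performance on samples of increasing size, can be sketched as follows. This is a simplified stopping rule, not the "Probably Close Enough" criterion itself, whose exact formulation the abstract does not give. The `evaluate` callback, the schedule, and the `min_gain` tolerance are hypothetical.

```python
def dynamic_sample_size(evaluate, schedule, min_gain=0.01):
    """Return the first sample size whose accuracy gain is below min_gain.

    evaluate: callable mapping a sample size n to the accuracy of the
              mining tool trained on n rows (supplied by the caller).
    schedule: increasing sample sizes to try.
    min_gain: assumed tolerance for treating the learning curve as flat.
    """
    previous = None
    for n in schedule:
        accuracy = evaluate(n)
        # Stop once the gain over the previous, smaller sample is negligible.
        if previous is not None and accuracy - previous < min_gain:
            return n
        previous = accuracy
    # The curve never flattened, so fall back to the largest size tried.
    return schedule[-1]
```

A static test, by contrast, would fix the sample size from the data alone without ever calling `evaluate`.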
A watershed-scale approach to tracing metal contamination in the environment
Church, Stanley E
1996-01-01
Introduction: Public policy during the 1800's encouraged mining in the western United States. Mining on Federal lands played an important role in the growing economy, creating national wealth from our abundant and diverse mineral resource base. The common industrial practice from the early days of mining through about 1970 in the U.S. was for mine operators to dispose of the mine wastes and mill tailings in the nearest stream reach or lake. As a result of this contamination, many stream reaches below old mines, mills, and mining districts and some major rivers and lakes no longer support aquatic life. Riparian habitats within these affected watersheds have also been impacted. Often, the water from these affected stream reaches is not suitable for drinking, creating a public health hazard. The recent Department of Interior Abandoned Mine Lands (AML) Initiative is an effort on the part of the Federal Government to address the adverse environmental impact of these past mining practices on Federal lands. The AML Initiative has adopted a watershed approach to determine those sites that contribute the majority of the contaminants in the watershed. Remediating the largest sources of contamination within the watershed, rather than focusing largely on those sites for which principal responsible parties can be found, reduces the impact of metal contamination in the environment within the watershed as a whole. The scope of the problem of metal contamination in the environment from past mining practices in the coterminous U.S. is addressed in a recent report by Ferderer (1996). Using the USGS 1:2,000,000-scale hydrologic drainage basin boundaries and the USGS Minerals Availability System (MAS) data base, he plotted the distribution of 48,000 past-producing metal mines on maps showing the boundaries of lands administered by the various Federal Land Management Agencies (FLMA). 
Census analysis of these data provided an initial screening tool for prioritization of watersheds in the western U.S. A different approach to the scope of the abandoned mine problem (Church et al., 1996a) is shown by the water quality data collected by the States under the Clean Water Act, section 305(b). These data document the stream reaches affected by metals from naturally occurring sources as well as from mining, or mineral resource extraction. Permitted discharges from active industrial and mine sites are not covered in the 305(b) data base. Local citizens and state and federal agencies are all part of the collaborative decision process used to select the drainage basins chosen for the AML Initiative pilot studies. Data gathered by these three entities were brought to bear on the watershed selection process. The USGS prepared data available from Federal data bases in the form of interpretative GIS products. Maps of the state of Colorado (Plumlee et al., 1995) and a similar study of the state of Montana (USGS, unpublished data) were used to select the Animas watershed in southwestern Colorado and the Boulder watershed southwest of Helena, Montana as the pilot study areas for the AML Initiative. Thus, the watersheds selected for study were public decisions made on the basis of available scientific data. The role of the U.S. Geological Survey in the Abandoned Mine Land Initiative is outlined in Buxton et al. (1997). The watershed approach to metals contamination in the environment has been studied in several drainage basins (Church et al., 1993, 1994, 1995, 1996b; Kimball et al., 1995). The underlying principles used to successfully discriminate between sources and to quantify the impact of these sources on the environment are the subject of this report.
Walton-Day, K.; Poeter, E.
2009-01-01
Turquoise Lake is a water-supply reservoir located north of the historic Sugarloaf Mining district near Leadville, Colorado, USA. Elevated water levels in the reservoir may increase flow of low-quality water from abandoned mine tunnels in the Sugarloaf District and degrade water quality downstream. The objective of this study was to understand the sources of water to Dinero mine drainage tunnel and evaluate whether or not there was a direct hydrologic connection between Dinero mine tunnel and Turquoise Lake from late 2002 to early 2008. This study utilized hydrograph data from nearby draining mine tunnels and the lake, and stable isotope (δ18O and δ2H) data from the lake, nearby draining mine tunnels, imported water, and springs to characterize water sources in the study area. Hydrograph results indicate that flow from the Dinero mine tunnel decreased 26% (2006) and 10% (2007) when lake elevation (above mean sea level) decreased below approximately 3004 m (approximately 9855 feet). Results of isotope analysis delineated two meteoric water lines in the study area. One line characterizes surface water and water imported to the study area from the western side of the Continental Divide. The other line characterizes groundwater including draining mine tunnels, springs, and seeps. Isotope mixing calculations indicate that water from Turquoise Lake or seasonal groundwater recharge from snowmelt represents approximately 10% or less of the water in Dinero mine tunnel. However, most of the water in Dinero mine tunnel is from deep groundwater having minimal isotopic variation. The asymmetric shape of the Dinero mine tunnel hydrograph may indicate that a limited mine pool exists behind a collapse in the tunnel and attenuates seasonal recharge. 
Alternatively, a conceptual model is presented (and supported with MODFLOW simulations) that is consistent with current and previous data collected in the study area, and illustrates how fluctuating lake levels change the local water-table elevation which can affect discharge from the Dinero mine tunnel without physical transfer of water between the two locations.
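The isotope mixing calculation mentioned in this abstract is commonly done with a two-end-member linear mixing equation. The sketch below assumes that standard form; the abstract does not specify the authors' exact calculation, and the function name and the example delta values are hypothetical.

```python
def mixing_fraction(delta_mix, delta_a, delta_b):
    """Fraction of end member A in a two-end-member mixture.

    Solves delta_mix = f * delta_a + (1 - f) * delta_b for f, where the
    deltas are isotope values (for example δ18O in per mil) of the mixture
    and of the two assumed sources.
    """
    if delta_a == delta_b:
        raise ValueError("end members must differ isotopically")
    return (delta_mix - delta_b) / (delta_a - delta_b)
```

With made-up values of -16.0 per mil for lake water, -17.0 per mil for deep groundwater, and -16.9 per mil for tunnel discharge, the lake fraction would be about 0.1, which is the order of magnitude the abstract reports.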
Christenson, Scott C.
1995-01-01
The Roubidoux aquifer in Ottawa County, Oklahoma is used extensively as a source of water for public supplies, commerce, industry, and rural water districts. Water in the Roubidoux aquifer in eastern Ottawa County has relatively low dissolved-solids concentrations (less than 200 mg/L) with calcium, magnesium, and bicarbonate as the major ions. The Boone Formation is stratigraphically above the Roubidoux aquifer and is the host rock for zinc and lead sulfide ores, with the richest deposits located in the vicinity of the City of Picher. Mining in what became known as the Picher mining district began in the early 1900's and continued until about 1970. The water in the abandoned zinc and lead mines contains high concentrations of calcium, magnesium, bicarbonate, sulfate, fluoride, cadmium, copper, iron, lead, manganese, nickel, and zinc. Water from the abandoned mines is a potential source of contamination to the Roubidoux aquifer and to wells completed in the Roubidoux aquifer. Water samples were collected from wells completed in the Roubidoux aquifer in the Picher mining district and from wells outside the mining district to determine if 10 public supply wells in the mining district are contaminated. The chemical analyses indicate that at least 7 of the 10 public supply wells in the Picher mining district are contaminated by mine water. Application of the Mann-Whitney test indicated that the concentrations of some chemical constituents that are indicators of mine-water contamination are different in water samples from wells in the mining area as compared to wells outside the mining area. Application of the Wilcoxon signed-rank test showed that the concentrations of some chemical constituents that are indicators of mine-water contamination were higher in current (1992-93) data than in historic (1981-83) data, except for pH, which was lower in current than in historic data. 
pH and sulfate, alkalinity, bicarbonate, magnesium, iron, and tritium concentrations consistently indicate that the Cardin, Commerce 1, Commerce 3, Picher 2, Picher 3, Picher 4, and Quapaw 2 wells are contaminated.
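To make the Mann-Whitney comparison of wells inside and outside the mining district concrete, the sketch below computes the U statistic directly from its pairwise definition. It covers only the statistic, not the significance test the study applied, and the sample values in the usage note are invented for illustration.

```python
def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U statistic for sample_a.

    Counts the (a, b) pairs in which a exceeds b; tied pairs count 0.5.
    A value near len(sample_a) * len(sample_b) means sample_a tends to
    be larger, and a value near 0 means it tends to be smaller.
    """
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

For hypothetical sulfate concentrations of `[5, 6, 7]` inside the district and `[1, 2, 3]` outside, U equals 9, the maximum for two samples of three, indicating complete separation.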
NASA Astrophysics Data System (ADS)
Mayangsari, S.
2018-01-01
This study investigates the influence of environmental performance on financial report integrity. The study used primary data from interviews with senior members of the mining sector regarding environmental issues, as well as secondary data from 2016 financial reports. The sample consisted of listed mining companies with semester data. Questionnaires were used to measure their perceptions of the challenges concerning climate change faced by the mining sector. The results of this research show that regulatory interventions will be critical to environmental issues. This study employed KLD as a proxy for environmental performance, correlated with other variables regarding the integrity of disclosure. The outcome indicates that environmental issues will increase the integrity of financial reports.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kargupta, H.; Stafford, B.; Hamzaoglu, I.
This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based interface for interactive data visualization. It also presents the results of applying PADMA for detecting patterns in unstructured texts of postmortem reports and laboratory test data for Hepatitis C patients.
Data warehousing as a basis for web-based documentation of data mining and analysis.
Karlsson, J; Eklund, P; Hallgren, C G; Sjödin, J G
1999-01-01
In this paper we present a case study for data warehousing intended to support data mining and analysis. We also describe a prototype for data retrieval. Further we discuss some technical issues related to a particular choice of a patient record environment.
ERIC Educational Resources Information Center
Cullen, Kevin
2005-01-01
Corporations employ data mining to analyze operations, find trends in recorded information, and look for new opportunities. Libraries are no different. Librarians manage large stores of data--about collections and usage, for example--and they also want to analyze this data to serve their users better. Analysts use data mining to query a data…
Remote sensing of coal mine pollution in the upper Potomac River basin
NASA Technical Reports Server (NTRS)
1974-01-01
A survey of remote sensing data pertinent to locating and monitoring sources of pollution resulting from surface and shaft mining operations was conducted in order to determine the various methods by which ERTS and aircraft remote sensing data can be used as a replacement for, or a supplement to traditional methods of monitoring coal mine pollution of the upper Potomac Basin. The gathering and analysis of representative samples of the raw and processed data obtained during the survey are described, along with plans to demonstrate and optimize the data collection processes.
Profitability and occupational injuries in U.S. underground coal mines.
Asfaw, Abay; Mark, Christopher; Pana-Cryan, Regina
2013-01-01
Coal plays a crucial role in the U.S. economy yet underground coal mining continues to be one of the most dangerous occupations in the country. In addition, there are large variations in both profitability and the incidence of occupational injuries across mines. The objective of this study was to examine the association between profitability and the incidence rate of occupational injuries in U.S. underground coal mines between 1992 and 2008. We used mine-specific data on annual hours worked, geographic location, and the number of occupational injuries suffered annually from the employment and accident/injury databases of the Mine Safety and Health Administration, and mine-specific data on annual revenue from coal sales, mine age, workforce union status, and mining method from the U.S. Energy Information Administration. A total of 5669 mine-year observations (number of mines×number of years) were included in our analysis. We used a negative binomial random effects model that was appropriate for analyzing panel (combined time-series and cross-sectional) injury data that were non-negative and discrete. The dependent variable, occupational injury, was measured in three different and non-mutually exclusive ways: all reported fatal and nonfatal injuries, reported nonfatal injuries with lost workdays, and the 'most serious' (i.e. sum of fatal and serious nonfatal) injuries reported. The total number of hours worked in each mine and year examined was used as an exposure variable. Profitability, the main explanatory variable, was approximated by revenue per hour worked. Our model included mine age, workforce union status, mining method, and geographic location as additional control variables. After controlling for other variables, a 10% increase in real total revenue per hour worked was associated with 0.9%, 1.1%, and 1.6% decrease, respectively, in the incidence rates of all reported injuries, reported injuries with lost workdays, and the most serious injuries reported. 
We found an inverse relationship between profitability and each of the three indicators of occupational injuries we used. These results might be partially due to factors that affect both profitability and safety, such as management or engineering practices, and partially due to lower investments in safety by less profitable mines, which could imply that some financially stressed mines might be so focused on survival that they forgo investing in safety. Published by Elsevier Ltd.
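The reported effect sizes can be converted into an approximate elasticity with a short calculation. The sketch assumes that revenue per hour worked enters the negative binomial model in logarithmic form, which the abstract does not state, so the resulting figure is indicative only; the function name is hypothetical.

```python
import math


def implied_elasticity(pct_change_outcome, pct_change_driver):
    """Elasticity implied by paired percentage changes.

    Assumes a log-log relationship, so that
    (1 + outcome/100) = (1 + driver/100) ** elasticity.
    """
    return math.log(1 + pct_change_outcome / 100.0) / math.log(
        1 + pct_change_driver / 100.0
    )
```

Under that assumption, a 0.9% decrease in all reported injuries for a 10% increase in revenue per hour corresponds to an elasticity of roughly -0.095.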
Tang, Qi-Yi; Zhang, Chuan-Xi
2013-04-01
A comprehensive but simple-to-use software package called DPS (Data Processing System) has been developed to execute a range of standard numerical analyses and operations used in experimental design, statistics and data mining. This program runs on standard Windows computers. Many of the functions are specific to entomological and other biological research and are not found in standard statistical software. This paper presents applications of DPS to experimental design, statistical analysis and data mining in entomology. © 2012 The Authors Insect Science © 2012 Institute of Zoology, Chinese Academy of Sciences.
Iddamalgoda, Lahiru; Das, Partha S; Aponso, Achala; Sundararajan, Vijayaraghava S; Suravajhala, Prashanth; Valadi, Jayaraman K
2016-01-01
Data mining and pattern recognition methods reveal interesting findings in genetic studies, especially on how the genetic makeup is associated with inherited diseases. Although researchers have proposed various data mining models for biomedical approaches, there remains a challenge in accurately prioritizing the single nucleotide polymorphisms (SNP) associated with the disease. In this commentary, we review the state-of-the-art data mining and pattern recognition models for identifying inherited diseases and deliberate on the need for binary classification- and scoring-based prioritization methods in determining causal variants. While we discuss the known pros and cons associated with these methods, we argue that the gene prioritization methods and the protein interaction (PPI) methods, in conjunction with K-nearest neighbors, could be used to accurately categorize the genetic factors in disease causation.
Study on online community user motif using web usage mining
NASA Astrophysics Data System (ADS)
Alphy, Meera; Sharma, Ajay
2016-04-01
Web usage mining is the application of data mining to extract useful information from the online community. On Thursday, 6 August 2015, the World Wide Web contained at least 4.73 billion pages according to the Indexed Web and at least 228.52 million pages according to the Dutch Indexed Web. It is difficult to get the needed data from these billions of web pages, which is why web usage mining is important. Personalizing the search engine helps web users identify the most used data easily. It reduces time consumption and supports automatic site search and automatic restoration of useful sites. This study reviews the techniques, from the oldest to the latest, used in pattern discovery and analysis in web usage mining from 1996 to 2015. Analyzing user motifs helps improve business, e-commerce, personalisation and websites.
Hageman, Philip L.; Briggs, Paul H.; Desborough, George A.; Lamothe, Paul J.; Theodorakos, Peter M.
2000-01-01
This report details chemistry data derived from leaching of mine-waste composite samples using a modification of E.P.A. Method 1312, Synthetic Precipitation Leaching Procedure (SPLP). In 1998, members of the U.S. Geological Survey Mine Waste Characterization Project collected four mine-waste composite samples from mining districts in southwestern New Mexico (CAR and PET) and near Leadville, Colorado (TUC and MII). Resulting leachate pH values for the four composites ranged from 5.45 to 8.84 and ranked in the following order: CAR < TUC < MII < PET. Specific conductivity values ranged from 85 uS/cm to 847 uS/cm in the following order: PET < MII < CAR < TUC. Geochemical data generated from this investigation reveal that leachate from the CAR composite contains the highest concentrations of Pb, Zn, Ni, Mn, Cu, Cd, and Al
NASA Technical Reports Server (NTRS)
Wier, C. E.; Wobber, F. J. (Principal Investigator); Russell, O. R.; Amato, R. V.
1972-01-01
The author has identified the following significant results. Various data compilation and analysis activities in support of ERTS-1 imagery interpretation are in progress or are completed. These include the compilation of mine accident data, areas of mine roof instability and the analysis of high altitude color infrared photography and low altitude color and color infrared photography which was acquired by NASA in support of the project. The photography reveals that many fracture lineaments are detectable through a varied thickness of glacial till. These data will be compiled on a series of 1:250,000 scale base maps and evaluated for a correlation between fracture zones and mine accidents and roof falls. Due to the high occurrence of cloud cover in the project area and to the delay in imagery shipments, little progress has been made in the analysis of ERTS-1 imagery.
Diamond Eye: a distributed architecture for image data mining
NASA Astrophysics Data System (ADS)
Burl, Michael C.; Fowlkes, Charless; Roden, Joe; Stechert, Andre; Mukhtar, Saleem
1999-02-01
Diamond Eye is a distributed software architecture, which enables users (scientists) to analyze large image collections by interacting with one or more custom data mining servers via a Java applet interface. Each server is coupled with an object-oriented database and a computational engine, such as a network of high-performance workstations. The database provides persistent storage and supports querying of the 'mined' information. The computational engine provides parallel execution of expensive image processing, object recognition, and query-by-content operations. Key benefits of the Diamond Eye architecture are: (1) the design promotes trial evaluation of advanced data mining and machine learning techniques by potential new users (all that is required is to point a web browser to the appropriate URL), (2) software infrastructure that is common across a range of science mining applications is factored out and reused, and (3) the system facilitates closer collaborations between algorithm developers and domain experts.
Data Processing and Text Mining Technologies on Electronic Medical Records: A Review
Sun, Wencheng; Li, Yangyang; Liu, Fang; Fang, Shengqun; Wang, Guoyan
2018-01-01
Currently, medical institutes generally use electronic medical records (EMR) to record patients' conditions, including diagnostic information, procedures performed, and treatment results. EMR has been recognized as a valuable resource for large-scale analysis. However, EMR data are diverse, incomplete, redundant, and privacy-sensitive, which makes it difficult to carry out data mining and analysis directly. Therefore, it is necessary to preprocess the source data in order to improve data quality and the data mining results. Different types of data require different processing technologies. Most structured data needs classic preprocessing technologies, including data cleansing, data integration, data transformation, and data reduction. Semistructured or unstructured data, such as medical text, contains more health information and requires more complex and challenging processing methods. The task of information extraction for medical texts mainly includes NER (named-entity recognition) and RE (relation extraction). This paper focuses on the process of EMR processing and emphatically analyzes the key techniques. In addition, we make an in-depth study on the applications developed based on text mining together with the open challenges and research issues for future work. PMID:29849998
NASA Astrophysics Data System (ADS)
Sokoła-Szewioła, Violetta; Żogała, Monika
2016-12-01
Nowadays mining companies use spatial information systems to facilitate the management of data gathered during mining activity. Various kinds of applications and software are used for this purpose, allowing faster and easier data processing. This paper presents the possibilities of using the ArcGIS system to support tasks performed in the mining industry, specifically the analysis of the influence of mining tremors induced by longwall exploitation on building structures sited on the surface. These possibilities are presented using the example of the database developed for the coal mine KWK "Rydułtowy-Anna." The database was created using ArcGIS for Desktop 10.1. It contains the values of the parameters relevant to analyses of the influence of mining tremors on surface structures.
Text mining for traditional Chinese medical knowledge discovery: a survey.
Zhou, Xuezhong; Peng, Yonghong; Liu, Baoyan
2010-08-01
Extracting meaningful information and knowledge from free text is the subject of considerable research interest in the machine learning and data mining fields. Text data mining (or text mining) has become one of the most active research sub-fields in data mining. Significant developments in the area of biomedical text mining during the past years have demonstrated its great promise for supporting scientists in developing novel hypotheses and new knowledge from the biomedical literature. Traditional Chinese medicine (TCM) provides a distinct methodology with which to view human life. It is one of the most complete and distinguished traditional medicines with a history of several thousand years of studying and practicing the diagnosis and treatment of human disease. It has been shown that the TCM knowledge obtained from clinical practice has become a significant complementary source of information for modern biomedical sciences. TCM literature obtained from the historical period and from modern clinical studies has recently been transformed into digital data in the form of relational databases or text documents, which provide an effective platform for information sharing and retrieval. This motivates and facilitates research and development into knowledge discovery approaches and to modernize TCM. In order to contribute to this still growing field, this paper presents (1) a comparative introduction to TCM and modern biomedicine, (2) a survey of the related information sources of TCM, (3) a review and discussion of the state of the art and the development of text mining techniques with applications to TCM, (4) a discussion of the research issues around TCM text mining and its future directions. Copyright 2010 Elsevier Inc. All rights reserved.
Closedure - Mine Closure Technologies Resource
NASA Astrophysics Data System (ADS)
Kauppila, Päivi; Kauppila, Tommi; Pasanen, Antti; Backnäs, Soile; Liisa Räisänen, Marja; Turunen, Kaisa; Karlsson, Teemu; Solismaa, Lauri; Hentinen, Kimmo
2015-04-01
Closure of mining operations is an essential part of the development of eco-efficient mining and the Green Mining concept in Finland to reduce the environmental footprint of mining. Closedure is a 2-year joint research project between Geological Survey of Finland and Technical Research Centre of Finland that aims at developing accessible tools and resources for planning, executing and monitoring mine closure. The main outcome of the Closedure project is an updatable wiki technology-based internet platform (http://mineclosure.gtk.fi) in which comprehensive guidance on the mine closure is provided and main methods and technologies related to mine closure are evaluated. Closedure also provides new data on the key issues of mine closure, such as performance of passive water treatment in Finland, applicability of test methods for evaluating cover structures for mining wastes, prediction of water effluents from mine wastes, and isotopic and geophysical methods to recognize contaminant transport paths in crystalline bedrock.
Santos, R S; Malheiros, S M F; Cavalheiro, S; de Oliveira, J M Parente
2013-03-01
Cancer is the leading cause of death in economically developed countries and the second leading cause of death in developing countries. Malignant brain neoplasms are among the most devastating and incurable forms of cancer, and their treatment may be excessively complex and costly. Public health decision makers require significant amounts of analytical information to manage public treatment programs for these patients. Data mining, a technology that is used to produce analytically useful information, has been employed successfully with medical data. However, the large-scale adoption of this technique has been limited thus far because it is difficult to use, especially for non-expert users. One way to facilitate data mining by non-expert users is to automate the process. Our aim is to present an automated data mining system that allows public health decision makers to access analytical information regarding brain tumors. The emphasis in this study is the use of ontology in an automated data mining process. The non-experts who tried the system obtained useful information about the treatment of brain tumors. These results suggest that future work should be conducted in this area. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Stewart, Anne M.; Thomas, Nicole
2015-01-01
In 2010, in cooperation with the Mining and Minerals Division (MMD) of the State of New Mexico Energy, Minerals and Natural Resources Department, the U.S. Geological Survey (USGS) initiated a 4-year assessment of hydrologic conditions at the San Juan coal mine (SJCM), located about 14 miles west-northwest of the city of Farmington, San Juan County, New Mexico. The mine produces coal for power generation at the adjacent San Juan Generating Station (SJGS) and stores coal-combustion byproducts from the SJGS in mined-out surface-mining pits. The purpose of the hydrologic assessment is to identify groundwater flow paths away from SJCM coal-combustion-byproduct storage sites that might allow metals that may be leached from coal-combustion byproducts to eventually reach wells or streams after regional dewatering ceases and groundwater recovers to predevelopment levels. The hydrologic assessment, undertaken between 2010 and 2013, included compilation of existing data. The purpose of this report is to present data that were acquired and compiled by the USGS for the SJCM hydrologic assessment.
Lamm, Steven H; Li, Ji; Robbins, Shayhan A; Dissen, Elisabeth; Chen, Rusan; Feinleib, Manning
2015-02-01
Pooled 1996 to 2003 birth certificate data for four central states in Appalachia indicated higher rates of infants with birth defects born to residents of counties with mountain-top mining (MTM) than born to residents of non-mining-counties (Ahern 2011). However, those analyses did not consider sources of uncertainty such as unbalanced distributions or quality of data. Quality issues have been a continuing problem with birth certificate analyses. We used 1990 to 2009 live birth certificate data for West Virginia to reassess this hypothesis. Forty-four hospitals contributed 98% of the MTM-county births and 95% of the non-mining-county births, of which six had more than 1000 births from both MTM and nonmining counties. Adjusted and stratified prevalence rate ratios (PRRs) were computed both by using Poisson regression and Mantel-Haenszel analysis. Unbalanced distribution of hospital births was observed by mining groups. The prevalence rate of infants with reported birth defects, higher in MTM-counties (0.021) than in non-mining-counties (0.015), yielded a significant crude PRR (cPRR = 1.43; 95% confidence interval [CI] = 1.36-1.52) but a nonsignificant hospital-adjusted PRR (adjPRR = 1.08; 95% CI = 0.97-1.20; p = 0.16) for the 44 hospitals. So did the six hospital data analysis ([cPRR = 2.39; 95% CI = 2.15-2.65] and [adjPRR = 1.01; 95% CI, 0.89-1.14; p = 0.87]). No increased risk of birth defects was observed for births from MTM-counties after adjustment for, or stratification by, hospital of birth. These results have consistently demonstrated that the reported association between birth defect rates and MTM coal mining was a consequence of data heterogeneity. The data do not demonstrate evidence of a "Mountain-top Mining" effect on the prevalence of infants with reported birth defects in WV. © 2014 Wiley Periodicals, Inc.
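The gap between the crude and hospital-adjusted ratios in this abstract can be illustrated with a small calculation. The Python sketch below computes a crude prevalence rate ratio and a Mantel-Haenszel ratio stratified by hospital. The two-hospital counts and the function names are invented for illustration and are not the study's data or code. Note that the rounded rates quoted in the abstract give 0.021 / 0.015 = 1.40, so the reported cPRR of 1.43 presumably comes from unrounded rates.

```python
def crude_prr(strata):
    """Prevalence rate ratio that ignores strata: exposed rate / unexposed rate."""
    exp_cases = sum(s["exp_cases"] for s in strata)
    exp_births = sum(s["exp_births"] for s in strata)
    unexp_cases = sum(s["unexp_cases"] for s in strata)
    unexp_births = sum(s["unexp_births"] for s in strata)
    return (exp_cases / exp_births) / (unexp_cases / unexp_births)


def mantel_haenszel_prr(strata):
    """Mantel-Haenszel rate ratio pooled over strata (here: hospitals)."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        total = s["exp_births"] + s["unexp_births"]
        numerator += s["exp_cases"] * s["unexp_births"] / total
        denominator += s["unexp_cases"] * s["exp_births"] / total
    return numerator / denominator


# Hypothetical data: hospital A reports defects at 3%, hospital B at 1%,
# regardless of exposure, but exposed births cluster in hospital A.
hospitals = [
    {"exp_cases": 27, "exp_births": 900, "unexp_cases": 3, "unexp_births": 100},
    {"exp_cases": 1, "exp_births": 100, "unexp_cases": 9, "unexp_births": 900},
]

print("crude PRR:", round(crude_prr(hospitals), 2))
print("hospital-adjusted PRR:", round(mantel_haenszel_prr(hospitals), 2))
```

In this constructed example the crude ratio is well above 1 while the stratified ratio is 1, which is the pattern the abstract attributes to the unbalanced distribution of births across hospitals.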
Coal and Open-pit surface mining impacts on American Lands (COAL)
NASA Astrophysics Data System (ADS)
Brown, T. A.; McGibbney, L. J.
2017-12-01
Mining is known to cause environmental degradation, but software tools to identify its impacts are lacking. However, remote sensing, spectral reflectance, and geographic data are readily available, and high-performance cloud computing resources exist for scientific research. Coal and Open-pit surface mining impacts on American Lands (COAL) provides a suite of algorithms and documentation to leverage these data and resources to identify evidence of mining and correlate it with environmental impacts over time. COAL was originally developed as a 2016-2017 senior capstone collaboration between scientists at the NASA Jet Propulsion Laboratory (JPL) and computer science students at Oregon State University (OSU). The COAL team implemented a free and open-source software library called "pycoal" in the Python programming language which facilitated a case study of the effects of coal mining on water resources. Evidence of acid mine drainage associated with an open-pit coal mine in New Mexico was derived by correlating imaging spectrometer data from the JPL Airborne Visible/InfraRed Imaging Spectrometer - Next Generation (AVIRIS-NG), spectral reflectance data published by the USGS Spectroscopy Laboratory in the USGS Digital Spectral Library 06, and GIS hydrography data published by the USGS National Geospatial Program in The National Map. This case study indicated that the spectral and geospatial algorithms developed by COAL can be used successfully to analyze the environmental impacts of mining activities. Continued development of COAL has been promoted by a Startup allocation award of high-performance computing resources from the Extreme Science and Engineering Discovery Environment (XSEDE). These resources allow the team to undertake further benchmarking, evaluation, and experimentation using multiple XSEDE resources.
The opportunity to use computational infrastructure of this caliber will further enable the development of a science gateway to continue foundational COAL research. This work documents the original design and development of COAL and provides insight into continuing research efforts which have potential applications beyond the project to environmental data science and other fields.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roger Mayes; Sera White; Randy Lee
2005-04-01
Selenium is present in waste rock/overburden that is removed during phosphate mining in southeastern Idaho. Waste rock piles or rock used during reclamation can be a source of selenium (and other metals) to streams and vegetation. Some instances (in 1996) of selenium toxicity in grazing sheep and horses caused public health and environmental concerns, leading to Idaho Department of Environmental Quality (DEQ) involvement. The Selenium Information System Project is a collaboration among the DEQ, the United States Forest Service (USFS), the Bureau of Land Management (BLM), the Idaho Mining Association (IMA), Idaho State University (ISU), and the Idaho National Laboratory (INL). The Selenium Information System is a centralized data repository for southeastern Idaho selenium data. The data repository combines information that was previously in numerous agency, mining company, and consultants’ databases and web sites. These data include selenium concentrations in soil, water, sediment, vegetation and other environmental media, as well as comprehensive mine information. The Idaho DEQ spearheaded a selenium area-wide investigation through voluntary agreements with the mining companies and interagency participants. The Selenium Information System contains the results of that area-wide investigation, and many other background documents. As studies are conducted and remedial action decisions are made the resulting data and documentation will be stored within the information system. Potential users of the information system are agency officials, students, lawmakers, mining company personnel, teachers, researchers, and the general public. The system, available from a central website, consists of a database that contains the area-wide sampling information and an ESRI ArcIMS map server. The user can easily acquire information pertaining to the area-wide study as well as the final area-wide report.
Future work on this project includes creating custom tools to increase the simplicity of the website and increasing the amount of information available from site-specific studies at 15 mines.
Knowledge Discovery and Data Mining: An Overview
NASA Technical Reports Server (NTRS)
Fayyad, U.
1995-01-01
The process of knowledge discovery and data mining is the process of information extraction from very large databases. Its importance is described along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.
Emerging technology becomes an opportunity for EOS
NASA Astrophysics Data System (ADS)
Fargion, Giulietta S.; Harberts, Robert; Masek, Jeffrey G.
1996-11-01
During the last decade, we have seen an explosive growth in our ability to collect and generate data. When implemented, NASA's Earth observing system data information system (EOSDIS) will receive about 50 gigabytes of remotely sensed image data per hour. This will generate an urgent need for new techniques and tools that can automatically and intelligently assist in transforming this abundance of data into useful knowledge. Some emerging technologies that address these challenges include data mining and knowledge discovery in databases (KDD). The most basic data mining application is a content-based search (examples include finding images of particular meteorological phenomena or identifying data that have been previously mined or interpreted). In order that these technologies be effectively exploited for EOSDIS development, a better understanding of data mining and the requirements for using this technology is necessary. The authors are currently undertaking a project exploring the requirements and options of content-based search and data mining for use on EOSDIS. The scope of the project is to develop a prototype with which to investigate user interface concepts, requirements, and designs relevant for EOSDIS core system (ECS) subsystem utilizing these techniques. The goal is to identify a generic handling of these functions. This prototype will help identify opportunities which the earth science community and EOSDIS can use to meet the challenges of collecting, searching, retrieving, and interacting with abundant data resources in highly productive ways.
The Sulphur Bank Mercury Mine (SBMM) in Lake County, California operated from the 1860s through the 1950's. Mining for sulfur started with surface operations and progressed to shaft, then open pit techniques to obtain mercury. Mining has resulted in deposition of approximately ...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zvi H. Meiksin
Two industrial prototype units for through-the-earth wireless communication were constructed and tested. Preparations for a temporary installation of the through-the-earth and in-mine systems in NIOSH's Lake Lynn mine were completed. Progress was made in programming the in-mine system to provide data communication. Work has begun to implement a wireless interface between equipment controllers and our in-mine system.
Apriori Versions Based on MapReduce for Mining Frequent Patterns on Big Data.
Luna, Jose Maria; Padillo, Francisco; Pechenizkiy, Mykola; Ventura, Sebastian
2017-09-27
Pattern mining is one of the most important tasks for extracting meaningful and useful information from raw data. This task aims to extract item-sets that represent any type of homogeneity and regularity in data. Although many efficient algorithms have been developed in this regard, the growing interest in data has caused the performance of existing pattern mining techniques to drop. The goal of this paper is to propose new efficient pattern mining algorithms for big data. To this aim, a series of algorithms based on the MapReduce framework and the Hadoop open-source implementation have been proposed. The proposed algorithms can be divided into three main groups. First, two algorithms [Apriori MapReduce (AprioriMR) and iterative AprioriMR] with no pruning strategy are proposed, which extract any existing item-set in data. Second, two algorithms (space pruning AprioriMR and top AprioriMR) that prune the search space by means of the well-known anti-monotone property are proposed. Finally, a last algorithm (maximal AprioriMR) is also proposed for mining condensed representations of frequent patterns. To test the performance of the proposed algorithms, a varied collection of big data datasets has been considered, comprising up to 3 · 10¹⁸ transactions and more than 5 million distinct single items. The experimental stage includes comparisons against highly efficient and well-known pattern mining algorithms. Results reveal the interest of applying MapReduce versions when complex problems are considered, and also the unsuitability of this paradigm when dealing with small data.
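The anti-monotone property mentioned in this abstract says that an item-set can only be frequent if all of its subsets are frequent, so candidates with an infrequent subset can be discarded before counting. The Python sketch below shows that pruning step in a plain, single-machine Apriori. It is not the authors' MapReduce implementation, and the function name and toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(item-set): support count} for every frequent item-set."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        previous = set(frequent)
        candidates = set()
        # Join frequent (k-1)-item-sets into k-item-set candidates.
        for a in previous:
            for b in previous:
                union = a | b
                if len(union) != k:
                    continue
                # Anti-monotone pruning: keep the candidate only if every
                # (k-1)-subset is itself frequent.
                if all(frozenset(sub) in previous
                       for sub in combinations(union, k - 1)):
                    candidates.add(union)
        # Count support only for the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy data set.
baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent_sets = apriori(baskets, min_support=3)
for itemset, support in sorted(frequent_sets.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

With a minimum support of 3, every single item and every pair is frequent in the toy data, while the triple appears in only two baskets and is dropped.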
Conformational dynamics of ATP/Mg:ATP in motor proteins via data mining and molecular simulation.
Bojovschi, A; Liu, Ming S; Sadus, Richard J
2012-08-21
The conformational diversity of ATP/Mg:ATP in motor proteins was investigated using molecular dynamics and data mining. Adenosine triphosphate (ATP) conformations were found to be constrained mostly by inter cavity motifs in the motor proteins. It is demonstrated that ATP favors extended conformations in the tight pockets of motor proteins such as F(1)-ATPase and actin whereas compact structures are favored in motor proteins such as RNA polymerase and DNA helicase. The incorporation of Mg(2+) leads to increased flexibility of ATP molecules. The differences in the conformational dynamics of ATP/Mg:ATP in various motor proteins were quantified by the radius of gyration. The relationship between the simulation results and those obtained by data mining of motor proteins available in the protein data bank is analyzed. The data mining analysis of motor proteins supports the conformational diversity of the phosphate group of ATP obtained computationally.
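The radius of gyration used here to compare extended and compact conformations is the root-mean-square distance of the atoms from their center of mass. The Python sketch below shows the standard mass-weighted definition. The abstract does not say whether the authors weighted by mass, so that choice, the function name, and the example coordinates are assumptions.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Root-mean-square distance of points from their (mass-weighted) center."""
    if masses is None:
        masses = [1.0] * len(coords)  # unweighted: all atoms count equally
    total_mass = sum(masses)
    center = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[axis] - center[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Hypothetical coordinates: the same four points, stretched out versus folded.
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
print(radius_of_gyration(extended) > radius_of_gyration(compact))
```

A larger value indicates a more extended conformation, which is how a single number can separate the two binding-pocket behaviors described in the abstract.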
Raja, Kalpana; Patrick, Matthew; Gao, Yilin; Madu, Desmond; Yang, Yuyang
2017-01-01
In the past decade, the volume of “omics” data generated by the different high-throughput technologies has expanded exponentially. Managing, storing, and analyzing this big data has been a great challenge for researchers, especially when moving towards the goal of generating testable data-driven hypotheses, which has been the promise of the high-throughput experimental techniques. Different bioinformatics approaches have been developed to streamline the downstream analyses by providing independent information to interpret and provide biological inference. Text mining (also known as literature mining) is one of the commonly used approaches for automated generation of biological knowledge from the huge number of published articles. In this review paper, we discuss recent advances in approaches that integrate results from omics data and information generated from text mining approaches to uncover novel biomedical information. PMID:28331849
Text mining for adverse drug events: the promise, challenges, and state of the art.
Harpaz, Rave; Callahan, Alison; Tamang, Suzanne; Low, Yen; Odgers, David; Finlayson, Sam; Jung, Kenneth; LePendu, Paea; Shah, Nigam H
2014-10-01
Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. It is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event (ADE) detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources-such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs-that are amenable to text mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance.
Bagur, M G; Morales, S; López-Chicano, M
2009-11-15
Unsupervised and supervised pattern recognition techniques such as hierarchical cluster analysis, principal component analysis, factor analysis and linear discriminant analysis have been applied to water samples collected in the Rodalquilar mining district (Southern Spain) in order to identify different sources of environmental pollution caused by the abandoned mining industry. The effect of the mining activity on waters was monitored by determining the concentration of eleven elements (Mn, Ba, Co, Cu, Zn, As, Cd, Sb, Hg, Au and Pb) by inductively coupled plasma mass spectrometry (ICP-MS). The Box-Cox transformation was used to bring the geochemical data set closer to a normal distribution. The environmental impact is driven mainly by the mining activity developed in the zone, by acid drainage, and finally by the chemical treatment used for gold beneficiation.
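The Box-Cox transformation named in this abstract maps a positive value x to (x^λ − 1)/λ for λ ≠ 0 and to ln x for λ = 0, with λ chosen so that the transformed data look closer to normal. The Python sketch below implements only the transform itself. The function name, the sample concentrations, and the fixed λ are assumptions for illustration; the abstract does not report which λ values were used or how they were selected.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a single positive value."""
    if x <= 0:
        raise ValueError("Box-Cox is defined only for positive values")
    if lam == 0:
        return math.log(x)  # limiting case as lambda approaches 0
    return (x ** lam - 1.0) / lam


# Hypothetical right-skewed concentrations, as is typical for trace elements.
concentrations = [0.5, 1.0, 2.0, 4.0, 64.0]
transformed = [box_cox(c, 0.0) for c in concentrations]
print(transformed)
```

With λ = 0 the transform is a plain logarithm, which compresses the long upper tail that geochemical concentration data often show.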
Energy budgets of mining-induced earthquakes and their interactions with nearby stopes
McGarr, A.
2000-01-01
In the early 1960's, N.G.W. Cook, using an underground network of geophones, demonstrated that most Witwatersrand tremors are closely associated with deep level gold mining operations. He also showed that the energy released by the closure of the tabular stopes at depths of the order of 2 km was more than sufficient to account for the mining-induced earthquakes. I report here updated versions of these two results based on more modern underground data acquired in the Witwatersrand gold fields. Firstly, an extensive suite of in situ stress data indicate that the ambient state of crustal stress here is close to the failure state in the absence of mining even though the tectonic setting is thoroughly stable. Mining initially stabilizes the rock mass by reducing the pore fluid pressure from its initial hydrostatic state to nearly zero. The extensive mine excavations, as Cook showed, concentrate the deviatoric stresses, in localized regions of the abutments, back into a failure state resulting in seismicity. Secondly, there appears to be two distinct types of mining-induced earthquakes: the first is strongly coupled to the mining and involves shear failure plus a coseismic volume reduction; the second type is not evidently coupled to any particular mine face, shows purely deviatoric failure, and is presumably caused by more regional changes in the state of stress due to mining. Thirdly, energy budgets for mining induced earthquakes of both types indicate that, of the available released energy, only a few per cent is radiated by the seismic waves with the majority being consumed in overcoming fault friction. Published by Elsevier Science Ltd.
A Data Warehouse Architecture for DoD Healthcare Performance Measurements.
1999-09-01
design, develop, implement, and apply statistical analysis and data mining tools to a Data Warehouse of healthcare metrics. With the DoD healthcare...framework, this thesis defines a methodology to design, develop, implement, and apply statistical analysis and data mining tools to a Data Warehouse...21 F. INABILITY TO CONDUCT HEALTHCARE ANALYSIS
Multimedia Exploratory Data Analysis for Geospatial Data Mining: The Case for Augmented Seriation.
ERIC Educational Resources Information Center
Gluck, Myke
2001-01-01
Reviews the role of exploratory data analysis (EDA) for spatial data mining and presents a case study addressing environmental risk assessments in New York State to illustrate the feasibility and usability of augmenting seriation for spatial data analysis. Describes augmentation with multimedia tools to understand relationships among spatial,…
Data Mining on Distributed Medical Databases: Recent Trends and Future Directions
NASA Astrophysics Data System (ADS)
Atilgan, Yasemin; Dogan, Firat
As computerization in healthcare services increases, the amount of available digital data is growing at an unprecedented rate; as a result, healthcare organizations are much more able to store data than to extract knowledge from it. Today the major challenge is to transform these data into useful information and knowledge. It is important for healthcare organizations to use stored data to improve quality while reducing cost. This paper first investigates data mining applications on centralized medical databases and how they are used for diagnosis and population health, then introduces distributed databases. The integration needs and issues of distributed medical databases are described. Finally, the paper focuses on data mining studies on distributed medical databases.
NASA Technical Reports Server (NTRS)
Solomon, J. L.; Miller, W. F.; Quattrochi, D. A.
1979-01-01
In a cooperative project with the Geological Survey of Alabama, the Mississippi State Remote Sensing Applications Program has developed a single purpose, decision-tree classifier using band-ratioing techniques to discriminate various stages of surface mining activity. The tree classifier has four levels and employs only two channels in classification at each level. An accurate computation of the amount of disturbed land resulting from the mining activity can be made as a product of the classification output. The utilization of Landsat data provides a cost-efficient, rapid, and accurate means of monitoring surface mining activities.
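The classifier described above can be pictured as a small tree in which every level compares one band ratio with a threshold. The sketch below is only an illustration of that idea: the band names (`b4` to `b7`), the thresholds, the class labels and the pixel area are all hypothetical and are not taken from the report.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Union


@dataclass
class RatioNode:
    """One tree level: compare the ratio of two channels with a threshold."""
    numerator: str
    denominator: str
    threshold: float
    low: Union["RatioNode", str]    # branch taken when ratio <= threshold
    high: Union["RatioNode", str]   # branch taken when ratio > threshold


def classify(pixel: Dict[str, float], node: Union[RatioNode, str]) -> str:
    """Walk the tree until a leaf (a class label string) is reached."""
    while isinstance(node, RatioNode):
        den = pixel[node.denominator]
        ratio = pixel[node.numerator] / den if den else float("inf")
        node = node.high if ratio > node.threshold else node.low
    return node


# Hypothetical four-level tree; bands, thresholds and labels are invented.
TREE = RatioNode(
    "b7", "b5", 1.2,
    low=RatioNode(
        "b4", "b7", 2.0,
        low=RatioNode(
            "b5", "b4", 1.1,
            low="active_mine",
            high=RatioNode("b6", "b5", 1.0,
                           low="graded_spoil",
                           high="partially_revegetated"),
        ),
        high="water",
    ),
    high="vegetation",
)

DISTURBED = {"active_mine", "graded_spoil", "partially_revegetated"}


def disturbed_area(pixels: Iterable[Dict[str, float]], pixel_area: float) -> float:
    """Disturbed area = number of disturbed-class pixels times area per pixel."""
    return pixel_area * sum(classify(p, TREE) in DISTURBED for p in pixels)
```

Each level uses exactly two channels, which mirrors the design stated in the abstract; the area estimate is then a simple product of the classified pixel count and the per-pixel ground area supplied by the caller.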
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-11
... could not be done effectively using historical data. The information collected under part 50 is the most comprehensive and reliable occupational data available concerning the mining industry. This submission has been... miners. Accident, injury, and illness data, when correlated with employment and production data, provide...
NASA Astrophysics Data System (ADS)
Ma, Ju; Dineva, Savka; Cesca, Simone; Heimann, Sebastian
2018-06-01
Mining induced seismicity is an undesired consequence of mining operations, which poses significant hazard to miners and infrastructures and requires an accurate analysis of the rupture process. Seismic moment tensors of mining-induced events help to understand the nature of mining-induced seismicity by providing information about the relationship between the mining, stress redistribution and instabilities in the rock mass. In this work, we adapt and test a waveform-based inversion method on high frequency data recorded by a dense underground seismic system in one of the largest underground mines in the world (Kiruna mine, Sweden). A stable algorithm for moment tensor inversion for comparatively small mining induced earthquakes, resolving both the double-couple and full moment tensor with high frequency data, is very challenging. Moreover, the application to underground mining system requires accounting for the 3-D geometry of the monitoring system. We construct a Green's function database using a homogeneous velocity model, but assuming a 3-D distribution of potential sources and receivers. We first perform a set of moment tensor inversions using synthetic data to test the effects of different factors on moment tensor inversion stability and source parameters accuracy, including the network spatial coverage, the number of sensors and the signal-to-noise ratio. The influence of the accuracy of the input source parameters on the inversion results is also tested. Those tests show that an accurate selection of the inversion parameters allows resolving the moment tensor also in the presence of realistic seismic noise conditions. Finally, the moment tensor inversion methodology is applied to eight events chosen from mining block #33/34 at Kiruna mine. Source parameters including scalar moment, magnitude, double-couple, compensated linear vector dipole and isotropic contributions as well as the strike, dip and rake configurations of the double-couple term were obtained. 
The orientations of the nodal planes of the double-couple component in most cases vary from NNW to NNE with a dip along the ore body or in the opposite direction.
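The source parameters listed above (scalar moment, magnitude, and the double-couple, CLVD and isotropic contributions) can be illustrated with a minimal sketch. It assumes the moment tensor eigenvalues are already known and uses one common eigenvalue-based convention (the epsilon parameter) together with the standard moment magnitude relation for a scalar moment in N·m. It is not the inversion method of the paper, and the function names are hypothetical.

```python
import math


def moment_magnitude(m0_newton_metres: float) -> float:
    """Moment magnitude Mw from scalar moment M0 (N*m): Mw = 2/3 (log10 M0 - 9.1)."""
    return (2.0 / 3.0) * (math.log10(m0_newton_metres) - 9.1)


def decompose_eigenvalues(eigenvalues):
    """Split moment tensor eigenvalues into isotropic, DC and CLVD parts.

    Returns the isotropic eigenvalue (trace / 3), the epsilon parameter, and
    the DC and CLVD fractions of the deviatoric part (they sum to 1).
    """
    iso = sum(eigenvalues) / 3.0
    # Deviatoric eigenvalues, ordered by increasing absolute value.
    dev = sorted((e - iso for e in eigenvalues), key=abs)
    if dev[2] == 0.0:
        # Purely isotropic source: no deviatoric part to split.
        return {"iso": iso, "epsilon": 0.0, "dc": 0.0, "clvd": 0.0}
    epsilon = -dev[0] / abs(dev[2])   # 0 for pure DC, +/-0.5 for pure CLVD
    clvd = 2.0 * abs(epsilon)
    return {"iso": iso, "epsilon": epsilon, "dc": 1.0 - clvd, "clvd": clvd}
```

For example, eigenvalues `(1, -1, 0)` describe a pure double couple, while `(2, -1, -1)` describe a pure CLVD. Other decomposition conventions exist and give different percentages for the same tensor.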
NASA Astrophysics Data System (ADS)
Ma, Ju; Dineva, Savka; Cesca, Simone; Heimann, Sebastian
2018-03-01
Mining induced seismicity is an undesired consequence of mining operations, which poses significant hazard to miners and infrastructures and requires an accurate analysis of the rupture process. Seismic moment tensors of mining-induced events help to understand the nature of mining-induced seismicity by providing information about the relationship between the mining, stress redistribution and instabilities in the rock mass. In this work, we adapt and test a waveform-based inversion method on high frequency data recorded by a dense underground seismic system in one of the largest underground mines in the world (Kiruna mine, Sweden). Developing a stable algorithm for moment tensor inversion of comparatively small mining induced earthquakes, resolving both the double couple and the full moment tensor with high frequency data, is very challenging. Moreover, the application to an underground mining system requires accounting for the 3D geometry of the monitoring system. We construct a Green's function database using a homogeneous velocity model, but assuming a 3D distribution of potential sources and receivers. We first perform a set of moment tensor inversions using synthetic data to test the effects of different factors on moment tensor inversion stability and source parameter accuracy, including the network spatial coverage, the number of sensors and the signal-to-noise ratio. The influence of the accuracy of the input source parameters on the inversion results is also tested. Those tests show that an accurate selection of the inversion parameters allows the moment tensor to be resolved even in the presence of realistic seismic noise conditions. Finally, the moment tensor inversion methodology is applied to 8 events chosen from mining block #33/34 at Kiruna mine. Source parameters including scalar moment, magnitude, double couple, compensated linear vector dipole and isotropic contributions as well as the strike, dip and rake configurations of the double couple term were obtained. 
The orientations of the nodal planes of the double-couple component in most cases vary from NNW to NNE with a dip along the ore body or in the opposite direction.
NASA Astrophysics Data System (ADS)
Steiakakis, Chrysanthos; Agioutantis, Zacharias; Apostolou, Evangelia; Papavgeri, Georgia; Tripolitsiotis, Achilles
2016-01-01
The geotechnical challenges for safe slope design in large scale surface mining operations are enormous. Sometimes one degree of slope inclination can significantly reduce the overburden to ore ratio and therefore dramatically improve the economics of the operation, while large scale slope failures may have a significant impact on human lives. Furthermore, adverse weather conditions, such as high precipitation rates, may unfavorably affect the already delicate balance between operations and safety. Geotechnical, weather and production parameters should be systematically monitored and evaluated in order to safely operate such pits. Appropriate data management, processing and storage are critical to ensure timely and informed decisions. This paper presents an integrated data management system which was developed over a number of years and demonstrates its advantages through a specific application. The presented case study illustrates how the high production slopes of a mine that exceed depths of 100-120 m were successfully mined with an average displacement rate of 10-20 mm/day, approaching a slow to moderate landslide velocity. Monitoring data of the past four years are included in the database and can be analyzed to produce valuable results. Time-series data correlations of movements, precipitation records, etc. are evaluated and presented in this case study. The results can be used to successfully manage mine operations and ensure the safety of the mine and the workforce.
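One simple way to examine the time-series correlations mentioned above is a lagged Pearson correlation between daily precipitation and daily slope displacement. The sketch below assumes two equally long daily series and a small set of candidate lags; the function names and any data passed to them are hypothetical, and nothing here reproduces the system described in the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(rain, displacement, lag):
    """Correlate rain on day t with displacement on day t + lag (lag >= 0)."""
    if lag == 0:
        return pearson(rain, displacement)
    return pearson(rain[:-lag], displacement[lag:])


def best_lag(rain, displacement, max_lag):
    """Return the lag (in days) with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(rain, displacement, lag))
```

A peak at a lag of one or more days would suggest that displacement responds to rainfall with a delay. Correlation alone does not establish the mechanism, so a result like this is a prompt for geotechnical interpretation rather than a conclusion.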
Mining of Business-Oriented Conversations at a Call Center
NASA Astrophysics Data System (ADS)
Takeuchi, Hironori; Nasukawa, Tetsuya; Watanabe, Hideo
Recently it has become feasible to transcribe textual records from telephone conversations at call centers by using automatic speech recognition. In this research, we extended a text mining system for call summary records and constructed a conversation mining system for the business-oriented conversations at the call center. To acquire useful business insights from the conversational data through the text mining system, it is critical to identify appropriate textual segments and expressions as the viewpoints to focus on. In the analysis of call summary data using a text mining system, some experts defined the viewpoints for the analysis by looking at some sample records and by preparing the dictionaries based on frequent keywords in the sample dataset. However with conversations it is difficult to identify such viewpoints manually and in advance because the target data consists of complete transcripts that are often lengthy and redundant. In this research, we defined a model of the business-oriented conversations and proposed a mining method to identify segments that have impacts on the outcomes of the conversations and can then extract useful expressions in each of these identified segments. In the experiment, we processed the real datasets from a car rental service center and constructed a mining system. With this system, we show the effectiveness of the method based on the defined conversation model.
NASA Technical Reports Server (NTRS)
Estep, Leland
2007-01-01
Presently, the BLM (Bureau of Land Management) has identified a multitude of abandoned mine sites in primarily Western states for cleanup. These sites are prioritized and appropriate cleanup has been called in to reclaim the sites. The task is great in needing considerable amounts of agency resources. For instance, in Colorado alone there exists an estimated 23,000 abandoned mines. The problem is not limited to Colorado or to the United States. Cooperation for reclamation is sought at local, state, and federal agency level to aid in identification, inventory, and cleanup efforts. Dangers posed by abandoned mines are recognized widely and will tend to increase with time because some of these areas are increasingly used for recreation and, in some cases, have been or are in the process of development. In some cases, mines are often vandalized once they are closed. The perpetrators leave them open, so others can then access the mines without realizing the danger posed. Abandoned mine workings often fill with water or oxygen-deficient air and dangerous gases following mining. If the workings are accidentally entered into, water or bad air can prove fatal to those underground. Moreover, mine residue drainage negatively impacts the local watershed ecology. Some of the major hazards that might be monitored by higher-resolution satellites include acid mine drainage, clogged streams, impoundments, slides, piles, embankments, hazardous equipment or facilities, surface burning, smoke from underground fires, and mine openings.
Hydrologic and geochemical data for the Big Brown lignite mine area, Freestone County, Texas
Dorsey, Michael E.
1985-01-01
Lignite mining in east and east-central Texas is increasing in response to increased energy needs throughout the State. Associated with the increase in mining activities is a greater need to know the effects of mining activities on the water quantity and quality of near-surface aquifers. The near-surface lignite beds mined at the Big Brown Lignite Mine are from the Calvert Bluff Formation of the Wilcox Group of Eocene age, which is a minor aquifer generally having water suitable for all uses, in eastern Freestone County, Texas. One of the potential hydrologic effects of surface-coal mining is a change in the quality of ground water associated with replacement of aquifer materials by mine spoils. The purpose of this report is to compile and categorize geologic, mineralogic, geochemical, and hydrologic data for the Big Brown Lignite Mine and surrounding area in east-central Texas. Included are results of paste-extract analyses, constituent concentrations in water from batch-mixing experiments, sulfur analyses, and minerals or mineral groups detected by X-ray diffraction in 12 spoil material samples collected from 3 locations at the mine site. Also, common-constituent and trace-constituent concentrations in water from eight selected wells, located updip and downdip from the mine, are presented. Dissolved-solids concentrations in water from batch-mixing experiments vary from 12 to 908 milligrams per liter. Water from selected wells contains dissolved-solids concentrations ranging from 75 to 510 milligrams per liter.
Combined mine tremors source location and error evaluation in the Lubin Copper Mine (Poland)
NASA Astrophysics Data System (ADS)
Leśniak, Andrzej; Pszczoła, Grzegorz
2008-08-01
A modified method of mine tremor location used in the Lubin Copper Mine is presented in the paper. In mines where an intensive exploration is carried out, a high-accuracy source location technique is usually required. The flatness of the geophone array, the complex geological structure of the rock mass and intense exploitation make the location results ambiguous in such mines. In the present paper an effective method of source location and location error evaluation is presented, combining data from two different arrays of geophones. The first consists of uniaxial geophones spaced over the whole mine area. The second is installed in one of the mining panels and consists of triaxial geophones. Using the data obtained from the triaxial geophones makes it possible to increase the precision of the hypocenter vertical coordinate. The presented two-step location procedure combines standard location methods: P-wave directions and P-wave arrival times. The efficiency of the created algorithm was tested using computer simulations. The designed algorithm is fully non-linear and was tested on the multilayered rock mass model of the Lubin Copper Mine, showing better computational efficiency than the traditional P-wave arrival time location algorithm. In this paper we present the complete procedure that effectively solves the non-linear location problems, i.e. the mine tremor location and measurement of the error propagation.
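For orientation, the traditional P-wave arrival time approach that the abstract uses as its baseline can be sketched as a grid search. This is not the authors' two-step procedure: it assumes a homogeneous velocity (the paper uses a multilayered model), uses arrival times only, and treats the origin time as the mean residual at each trial source. All names and parameters are hypothetical.

```python
import itertools
import math


def misfit(source, receivers, arrivals, velocity):
    """Sum of squared arrival-time residuals for one trial source position.

    The unknown origin time is eliminated by using its least-squares
    estimate, i.e. the mean of (observed arrival - predicted travel time).
    """
    travel = [math.dist(source, r) / velocity for r in receivers]
    origin = sum(a - t for a, t in zip(arrivals, travel)) / len(travel)
    error = sum((a - t - origin) ** 2 for a, t in zip(arrivals, travel))
    return error, origin


def grid_search(receivers, arrivals, velocity, x_axis, y_axis, z_axis):
    """Return (best source, origin time, misfit) over a regular 3-D grid."""
    best = None
    for source in itertools.product(x_axis, y_axis, z_axis):
        error, origin = misfit(source, receivers, arrivals, velocity)
        if best is None or error < best[2]:
            best = (source, origin, error)
    return best
```

With receivers that all lie in nearly the same plane, trial sources above and below that plane fit the arrival times almost equally well, which is the depth ambiguity the abstract attributes to array flatness. Adding an off-plane sensor, or direction information as in the paper, helps constrain the vertical coordinate.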
Read-across predictions require high quality measured data for source analogues. These data are typically retrieved from structured databases, but biomedical literature data are often untapped because current literature mining approaches are resource intensive. Our high-throughpu...
A Bayesian Scoring Technique for Mining Predictive and Non-Spurious Rules
Batal, Iyad; Cooper, Gregory; Hauskrecht, Milos
2015-01-01
Rule mining is an important class of data mining methods for discovering interesting patterns in data. The success of a rule mining method heavily depends on the evaluation function that is used to assess the quality of the rules. In this work, we propose a new rule evaluation score - the Predictive and Non-Spurious Rules (PNSR) score. This score relies on Bayesian inference to evaluate the quality of the rules and considers the structure of the rules to filter out spurious rules. We present an efficient algorithm for finding rules with high PNSR scores. The experiments demonstrate that our method is able to cover and explain the data with a much smaller rule set than existing methods. PMID:25938136
Gorokhovich, Yuri; Reid, Matthew; Mignone, Erica; Voros, Andrew
2003-10-01
Coal mine reclamation projects are very expensive and require coordination of local and federal agencies to identify resources for the most economic way of reclaiming mined land. Location of resources for mine reclamation is a spatial problem. This article presents a methodology that combines spatial data on resources for coal mine reclamation and uses GIS analysis to develop a priority list of potential mine reclamation sites within the contiguous United States using the method of extrapolation. The extrapolation method in this study was based on the Bark Camp reclamation project. The mine reclamation project at Bark Camp, Pennsylvania, USA, provided an example of the beneficial use of fly ash and dredged material to reclaim 402,600 sq mi of a mine abandoned in the 1980s. Railroads provided transportation of dredged material and fly ash to the site. Therefore, four spatial elements contributed to the reclamation project at Bark Camp: dredged material, abandoned mines, fly ash sources, and railroads. Using the spatial distribution of these data in the contiguous United States, it was possible to utilize GIS analysis to prioritize areas where reclamation projects similar to Bark Camp are feasible. GIS analysis identified unique occurrences of all four spatial elements used in the Bark Camp case for each 1 km of the United States territory within 20, 40, 60, 80, and 100 km radii from abandoned mines. The results showed the number of abandoned mines for each state and identified their locations. The federal or state governments can use these results in mine reclamation planning.
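The radius test described above can be sketched without GIS software by using great-circle distances between points. The sketch assumes every spatial element is reduced to a list of (latitude, longitude) points, which is a simplification: railroads, for instance, are lines in a real GIS layer. The function names and the data layout are hypothetical and do not reproduce the article's workflow.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def smallest_feasible_radius(mine, resources, radii=(20, 40, 60, 80, 100)):
    """Smallest radius (km) within which every resource type occurs near a mine.

    `resources` maps a resource name (e.g. fly ash, dredged material,
    railroad) to a list of (lat, lon) points. Returns None if some resource
    type is missing even at the largest radius.
    """
    for radius in sorted(radii):
        if all(any(haversine_km(mine, p) <= radius for p in points)
               for points in resources.values()):
            return radius
    return None
```

Mines can then be ranked by this radius, with smaller values indicating sites where a project similar to Bark Camp would need shorter haul distances.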
Mining Claim Activity on Federal Land in the United States
Causey, J. Douglas
2007-01-01
Several statistical compilations of mining claim activity on Federal land derived from the Bureau of Land Management's LR2000 database have previously been published by the U.S. Geological Survey (USGS). The work in the 1990s did not include Arkansas or Florida. None of the previous reports included Alaska because its data are stored in a separate database (Alaska Land Information System) and are in a different format. This report includes data for all states for which there are Federal mining claim records, beginning in 1976 and continuing to the present. The intent is to update the spatial and statistical data associated with this report on an annual basis, beginning with 2005 data. The statistics compiled from the databases are counts of the number of active mining claims in a section of land each year from 1976 to the present for all states within the United States. Claim statistics are subset by lode and placer types, as well as a dataset summarizing all claims including mill site and tunnel site claims. One table presents data by case type, case status, and number of claims in a section. This report includes a spatial database for each state in which mining claims were recorded, except North Dakota, which has had only two claims. A field is present that allows the statistical data to be joined to the spatial databases so that spatial displays and analysis can be done by using appropriate geographic information system (GIS) software. The data show how mining claim activity has changed in intensity, space, and time. Variations can be examined on a state, as well as a national level. The data are tied to a section of land, approximately 640 acres, which allows them to be used at regional, as well as local scale. The data only pertain to Federal land and mineral estate that was open to mining claim location at the time the claims were staked.
Towards Cooperative Predictive Data Mining in Competitive Environments
NASA Astrophysics Data System (ADS)
Lisý, Viliam; Jakob, Michal; Benda, Petr; Urban, Štěpán; Pěchouček, Michal
We study the problem of predictive data mining in a competitive multi-agent setting, in which each agent is assumed to have some partial knowledge required for correctly classifying a set of unlabelled examples. The agents are self-interested and therefore need to reason about the trade-offs between increasing their classification accuracy by collaborating with other agents and disclosing their private classification knowledge to other agents through such collaboration. We analyze the problem and propose a set of components which can enable cooperation in this otherwise competitive task. These components include measures for quantifying private knowledge disclosure, data-mining models suitable for multi-agent predictive data mining, and a set of strategies by which agents can improve their classification accuracy through collaboration. The overall framework and its individual components are validated on a synthetic experimental domain.
Alkahest NuclearBLAST : a user-friendly BLAST management and analysis system
Diener, Stephen E; Houfek, Thomas D; Kalat, Sam E; Windham, DE; Burke, Mark; Opperman, Charles; Dean, Ralph A
2005-01-01
Background - Sequencing of EST and BAC end datasets is no longer limited to large research groups. Drops in per-base pricing have made high throughput sequencing accessible to individual investigators. However, there are few options available which provide a free and user-friendly solution to the BLAST result storage and data mining needs of biologists. Results - Here we describe NuclearBLAST, a batch BLAST analysis, storage and management system designed for the biologist. It is a wrapper for NCBI BLAST which provides a user-friendly web interface which includes a request wizard and the ability to view and mine the results. All BLAST results are stored in a MySQL database which allows for more advanced data-mining through supplied command-line utilities or direct database access. NuclearBLAST can be installed on a single machine or clustered amongst a number of machines to improve analysis throughput. NuclearBLAST provides a platform which eases data-mining of multiple BLAST results. With the supplied scripts, the program can export data into a spreadsheet-friendly format, automatically assign Gene Ontology terms to sequences and provide bi-directional best hits between two datasets. Users with SQL experience can use the database to ask even more complex questions and extract any subset of data they require. Conclusion - This tool provides a user-friendly interface for requesting, viewing and mining of BLAST results which makes the management and data-mining of large sets of BLAST analyses tractable to biologists. PMID:15958161
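The bi-directional best hit idea mentioned above can be sketched in a few lines. The example assumes each BLAST result has already been reduced to `(query, subject, bit_score)` tuples; that layout and the function names are hypothetical and do not reflect NuclearBLAST's database schema or its supplied scripts.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples. Ties keep
    the first hit encountered.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    a_to_b = best_hits(a_vs_b)
    b_to_a = best_hits(b_vs_a)
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)
```

A pair is reported only when each sequence is the other's top-scoring match, which is a common heuristic for proposing putative orthologs between two datasets.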
Code of Federal Regulations, 2012 CFR
2012-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2014 CFR
2014-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2013 CFR
2013-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2013 CFR
2013-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2010 CFR
2010-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2012 CFR
2012-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2011 CFR
2011-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2014 CFR
2014-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2010 CFR
2010-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Code of Federal Regulations, 2011 CFR
2011-01-01
... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...
Economic baselines for current underground coal mining technology
NASA Technical Reports Server (NTRS)
Mabe, W. B.
1979-01-01
The cost of mining coal using a room-and-pillar mining method with a continuous miner and using a longwall mining system was calculated. Costs were calculated for the 1975 and 2000 time periods and are to be used as economic standards against which advanced mining concepts and systems will be compared. Some assumptions were changed and some internally stored model data were altered from the original calculation procedure to obtain a result that more closely represented what was considered to be a standard mine. Coal seam thicknesses were varied from one and one-half feet to eight feet to obtain the cost of mining coal over a wide range. Geologic conditions were selected that had a minimum impact on the mining productivity.
Supporting Solar Physics Research via Data Mining
NASA Astrophysics Data System (ADS)
Angryk, Rafal; Banda, J.; Schuh, M.; Ganesan Pillai, K.; Tosun, H.; Martens, P.
2012-05-01
In this talk we will briefly introduce three pillars of data mining (i.e. frequent patterns discovery, classification, and clustering), and discuss some possible applications of known data mining techniques which can directly benefit solar physics research. In particular, we plan to demonstrate applicability of frequent patterns discovery methods for the verification of hypotheses about co-occurrence (in space and time) of filaments and sigmoids. We will also show how classification/machine learning algorithms can be utilized to verify human-created software modules to discover individual types of solar phenomena. Finally, we will discuss applicability of clustering techniques to image data processing.
A planetary nervous system for social mining and collective awareness
NASA Astrophysics Data System (ADS)
Giannotti, F.; Pedreschi, D.; Pentland, A.; Lukowicz, P.; Kossmann, D.; Crowley, J.; Helbing, D.
2012-11-01
We present a research roadmap of a Planetary Nervous System (PNS), capable of sensing and mining the digital breadcrumbs of human activities and unveiling the knowledge hidden in the big data for addressing the big questions about social complexity. We envision the PNS as a globally distributed, self-organizing, techno-social system for answering analytical questions about the status of world-wide society, based on three pillars: social sensing, social mining and the idea of trust networks and privacy-aware social mining. We discuss the ingredients of a science and a technology necessary to build the PNS upon the three mentioned pillars, beyond the limitations of their respective state-of-art. Social sensing is aimed at developing better methods for harvesting the big data from the techno-social ecosystem and make them available for mining, learning and analysis at a properly high abstraction level. Social mining is the problem of discovering patterns and models of human behaviour from the sensed data across the various social dimensions by data mining, machine learning and social network analysis. Trusted networks and privacy-aware social mining is aimed at creating a new deal around the questions of privacy and data ownership empowering individual persons with full awareness and control on own personal data, so that users may allow access and use of their data for their own good and the common good. The PNS will provide a goal-oriented knowledge discovery framework, made of technology and people, able to configure itself to the aim of answering questions about the pulse of global society. Given an analytical request, the PNS activates a process composed by a variety of interconnected tasks exploiting the social sensing and mining methods within the transparent ecosystem provided by the trusted network. The PNS we foresee is the key tool for individual and collective awareness for the knowledge society. 
We need such a tool so that everyone can become fully aware of how powerful the knowledge of our society can be when we leverage our wisdom as a crowd, and of how important it is that everybody participates both as a consumer and as a producer of social knowledge, so that it becomes a trustworthy, accessible, safe and useful public good.
NASA Astrophysics Data System (ADS)
Syarif, Andi Erwin; Hatori, Tsuyoshi
2017-06-01
Creating a soft-landing path for mine closure is key to the sustainability of a mining region. In this research, we present a case of mine closure in Soroako, a small mining town in the north-east of South Sulawesi province, in the center of Sulawesi Island in Indonesia. In particular, we investigate the corporate social responsibility (CSR) programs of a mining company, PT Vale Indonesia Tbk (PTVI), aimed at a soft landing for mine closure in this region. The data on the CSR programs are gathered from in-depth interviews, annual reports and managerial reports. Furthermore, we present an integrated view of CSR for closing mines in a sustainable manner. We then evaluate the CSR strategies of the company and its performance from this viewpoint. Based on these steps, ways to improve the CSR mine closure scenario for enhancing regional sustainability are discussed and recommended.
Sampling and monitoring for closure
McLemore, V.T.; Russell, C.C.; Smith, K.S.
2004-01-01
The Metals Mining Sector of the Acid Drainage Technology Initiative (ADTI-MMS) addresses technical drainage-quality issues related to metal mining and related metallurgical operations, for future and active mines as well as for historical mines and mining districts. One of the first projects of ADTI-MMS is to develop a handbook describing the best sampling, monitoring, predicting, mitigating, and modeling of drainage from metal mines, pit lakes and related metallurgical facilities based upon current scientific and engineering practices. One of the important aspects of planning a new mine in today's regulatory environment is the philosophy of designing a new or existing mine or expansion of operations for ultimate closure. The holistic philosophy taken in the ADTI-MMS handbook maintains that sampling and monitoring programs should be designed to take into account all aspects of the mine-life cycle. Data required for the closure of the operation are obtained throughout the mine-life cycle, from exploration through post-closure.
Mineral Mapping with Imaging Spectroscopy: The Ray Mine, AZ
NASA Technical Reports Server (NTRS)
Clark, Roger N.; Vance, J. Sam; Livo, K. Eric; Green, Robert O.
1998-01-01
Mineral maps generated for the Ray Mine, Arizona were analyzed to determine if imaging spectroscopy can provide accurate information for environmental management of active and abandoned mine regions. The Ray Mine, owned by the ASARCO Corporation, covers an area of 5700 acres and is situated in Pinal County, Arizona about 70 miles north of Tucson near Hayden, Arizona. This open-pit mine has been a major source of copper since 1911, producing an estimated 4.5 million tons of copper since its inception. Until 1955 mining was accomplished by underground block caving and shrinkage stope methods (excavation by working in stepped series, usually employed in a vertical or steeply inclined orebody). In 1955, the mine was completely converted to open-pit mining, with the bulk of the production from sulfide ore using recovery by concentrating and smelting. Beginning in 1969, a significant production contribution has come from leaching and solvent extraction-electrowinning of silicate and oxide ores. Published reserves in the deposit as of 1992 are 1.1 billion tons at 0.6 percent copper. The Environmental Protection Agency, in conjunction with ASARCO, and NASA/JPL obtained AVIRIS data over the mine in 1997 as part of the EPA Advanced Measurement Initiative (AMI) (Tom Mace, Principal Investigator). This AVIRIS data set is being used to compare and contrast the accuracy and environmental monitoring capabilities of remote sensing technologies: visible-near-IR imaging spectroscopy, multispectral visible and near-IR sensors, thermal instruments, and radar platforms. The goal of this effort is to determine if these various technologies provide useful information for environmental management of active and abandoned mine sites in the arid western United States. This paper focuses on the analysis of AVIRIS data for assessing the impact of the Ray Mine on Mineral Creek. Mineral Creek flows to the Gila River. 
This paper discusses our preliminary AVIRIS mineral mapping and environmental findings.
In-situ study of beneficial utilization of coal fly ash in reactive mine tailings.
Lee, Joon Kyu; Shang, Julie Q; Wang, Hongliu; Zhao, Cheng
2014-03-15
Oxidation of reactive mine tailings and subsequent generation of acid mine drainage (AMD) have long been recognized as the largest environmental concern for the mining industry. Laboratory studies on the use of coal fly ash in the management of reactive mine tailings have shown that it reduces water and oxygen infiltration into the tailings matrix, thus preventing oxidation of sulphide minerals and acid generation. However, few data from field studies evaluating the performance of co-placement of mine tailings and fly ash (CMF hereafter) are reported in the open literature. This paper documents the construction and instrumentation of three CMF systems at the Musselwhite mine located in Ontario, Canada and presents results of 3 years of real-time monitoring. The field data indicate that the CMFs reduced the ingress of water due to cementation generated by hydration of fly ash. It was also found that the electrical conductivity of leachate from CMFs decreased in the early stage of co-placement, compared to the control. With further study, the principle and approach demonstrated in this paper can be adopted as a sustainable technology in mine tailings management. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Krawczyk, Artur; Grzybek, Radosław
2018-01-01
Satellite radar interferometry is one of the common methods for measuring land subsidence caused by underground black coal excavation. Interferometric images processed from repeat-pass Synthetic Aperture Radar (SAR) systems give a spatial image of the terrain subjected to surface subsidence over mining areas. Until now, InSAR methods using data from SAR systems such as ERS-1/ERS-2 and Envisat-1 were limited to a 35-day repeat-pass cycle. Recently, ESA launched Sentinel-1A and 1B, which together can provide InSAR coverage in a 6-day repeat cycle. The studied area was the Upper Silesian Coal Basin in Poland, where underground coal mining causes continuous subsidence of the terrain surface and mining tremors (mine-induced seismicity). The main problem was overlaying the subsidence caused by mining exploitation with the tremor epicentres. Based on the Sentinel SAR images, the correlation between the boundary of short-term ground subsidence and the locations of mine-induced seismicity epicentres was investigated.
Medical data mining: knowledge discovery in a clinical data warehouse.
Prather, J. C.; Lobach, D. F.; Goodwin, L. K.; Hales, J. W.; Hage, M. L.; Hammond, W. E.
1997-01-01
Clinical databases have accumulated large quantities of information about patients and their medical conditions. Relationships and patterns within this data could provide new medical knowledge. Unfortunately, few methodologies have been developed and applied to discover this hidden knowledge. In this study, the techniques of data mining (also known as Knowledge Discovery in Databases) were used to search for relationships in a large clinical database. Specifically, data accumulated on 3,902 obstetrical patients were evaluated for factors potentially contributing to preterm birth using exploratory factor analysis. Three factors were identified by the investigators for further exploration. This paper describes the processes involved in mining a clinical database including data warehousing, data query and cleaning, and data analysis. PMID:9357597
Lu, Songjian; Jin, Bo; Cowart, L Ashley; Lu, Xinghua
2013-01-01
Genetic and pharmacological perturbation experiments, such as deleting a gene and monitoring gene expression responses, are powerful tools for studying cellular signal transduction pathways. However, it remains a challenge to automatically derive knowledge of a cellular signaling system at a conceptual level from systematic perturbation-response data. In this study, we explored a framework that unifies knowledge mining and data mining towards the goal. The framework consists of the following automated processes: 1) applying an ontology-driven knowledge mining approach to identify functional modules among the genes responding to a perturbation in order to reveal potential signals affected by the perturbation; 2) applying a graph-based data mining approach to search for perturbations that affect a common signal; and 3) revealing the architecture of a signaling system by organizing signaling units into a hierarchy based on their relationships. Applying this framework to a compendium of yeast perturbation-response data, we have successfully recovered many well-known signal transduction pathways; in addition, our analysis has led to many new hypotheses regarding the yeast signal transduction system; finally, our analysis automatically organized perturbed genes as a graph reflecting the architecture of the yeast signaling system. Importantly, this framework transformed molecular findings from a gene level to a conceptual level, which can be readily translated into computable knowledge in the form of rules regarding the yeast signaling system, such as "if genes involved in the MAPK signaling are perturbed, genes involved in pheromone responses will be differentially expressed."
Cooperative organic mine avoidance path planning
NASA Astrophysics Data System (ADS)
McCubbin, Christopher B.; Piatko, Christine D.; Peterson, Adam V.; Donnald, Creighton R.; Cohen, David
2005-06-01
The JHU/APL Path Planning team has developed path planning techniques to look for paths that balance the utility and risk associated with different routes through a minefield. Extending previous years' efforts, we investigated real-world Naval mine avoidance requirements and developed a tactical decision aid (TDA) that satisfies those requirements. APL has developed new mine path planning techniques using graph-based and genetic algorithms which quickly produce near-minimum-risk paths for complicated fitness functions incorporating risk, path length, ship kinematics, and naval doctrine. The TDA user interface, a Java Swing application that obtains data via CORBA interfaces to path planning databases, allows the operator to explore a fusion of historic and in situ minefield data, control the path planner, and display the planning results. To provide a context for the minefield data, the user interface also renders data from the Digital Nautical Chart database, a database created by the National Geospatial-Intelligence Agency containing charts of the world's ports and coastal regions. This TDA has been developed in conjunction with the COMID (Cooperative Organic Mine Defense) system. This paper presents a description of the algorithms, architecture, and application produced.
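The idea of a graph-based planner that trades risk against path length can be illustrated with a minimal sketch. This is not the APL algorithm, which the abstract does not specify; it is an ordinary Dijkstra search over a grid, and the names `plan_path`, `risk` and `risk_weight`, as well as the cost function (one unit of length per step plus a weighted risk value for the entered cell), are assumptions made purely for illustration.

```python
import heapq

def plan_path(risk, start, goal, risk_weight=10.0):
    """Dijkstra search on a 4-connected grid.

    risk        -- 2D list of hypothetical per-cell risk values in [0, 1]
    risk_weight -- assumed trade-off factor between risk and path length
    Returns the list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(risk), len(risk[0])
    best = {start: 0.0}   # cheapest known cost to reach each cell
    prev = {}             # back-pointers for path reconstruction
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Step cost: unit length plus weighted risk of the entered cell.
                new_cost = cost + 1.0 + risk_weight * risk[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical 3x3 risk map with a risky middle column in the top two rows.
grid = [[0.0, 0.9, 0.0],
        [0.0, 0.9, 0.0],
        [0.0, 0.0, 0.0]]
safe_route = plan_path(grid, (0, 0), (0, 2), risk_weight=10.0)
short_route = plan_path(grid, (0, 0), (0, 2), risk_weight=0.0)
```

With a high `risk_weight` the planner detours around the risky cells; with a weight of zero it degenerates to a shortest-path search. Ship kinematics and doctrine constraints, which the abstract mentions, are outside the scope of this sketch.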
Clinical diabetes research using data mining: a Canadian perspective.
Shah, Baiju R; Lipscombe, Lorraine L
2015-06-01
With the advent of the digitization of large amounts of information and the computer power capable of analyzing this volume of information, data mining is increasingly being applied to medical research. Datasets created for administration of the healthcare system provide a wealth of information from different healthcare sectors, and Canadian provinces' single-payer universal healthcare systems mean that data are more comprehensive and complete in this country than in many other jurisdictions. The increasing ability to also link clinical information, such as electronic medical records, laboratory test results and disease registries, has broadened the types of data available for analysis. Data-mining methods have been used in many different areas of diabetes clinical research, including classic epidemiology, effectiveness research, population health and health services research. Although methodologic challenges and privacy concerns remain important barriers to using these techniques, data mining remains a powerful tool for clinical research. Copyright © 2015 Canadian Diabetes Association. Published by Elsevier Inc. All rights reserved.
Kernel Methods for Mining Instance Data in Ontologies
NASA Astrophysics Data System (ADS)
Bloehdorn, Stephan; Sure, York
The number of ontologies and the amount of metadata available on the Web are constantly growing. The successful application of machine learning techniques for learning ontologies from textual data, i.e. mining for the Semantic Web, contributes to this trend. However, no principled approaches exist so far for mining from the Semantic Web. We investigate how machine learning algorithms can be made amenable to directly taking advantage of the rich knowledge expressed in ontologies and associated instance data. Kernel methods have been successfully employed in various learning tasks and provide a clean framework for interfacing between non-vectorial data and machine learning algorithms. In this spirit, we express the problem of mining instances in ontologies as the problem of defining valid corresponding kernels. We present a principled framework for designing such kernels by decomposing the kernel computation into specialized kernels for selected characteristics of an ontology, which can be flexibly assembled and tuned. Initial experiments on real-world Semantic Web data yield promising results and show the usefulness of our approach.
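The decomposition idea can be sketched as follows, under clearly stated assumptions: the instance representation (a dictionary with a set of class names and a set of property names), the two component kernels and the weights are all hypothetical and are not taken from the paper. The sketch only relies on two standard facts: a set-intersection count is a valid kernel, and a non-negative weighted sum of valid kernels is again a valid kernel.

```python
def class_kernel(a, b):
    """Number of shared classes (set-intersection kernel)."""
    return float(len(a["classes"] & b["classes"]))

def property_kernel(a, b):
    """Number of shared property names (set-intersection kernel)."""
    return float(len(a["properties"] & b["properties"]))

def combined_kernel(a, b, w_class=1.0, w_prop=1.0):
    """Non-negative weighted sum of the component kernels."""
    return w_class * class_kernel(a, b) + w_prop * property_kernel(a, b)

def gram_matrix(instances, kernel):
    """Pairwise kernel values, as consumed by kernel-based learners."""
    return [[kernel(x, y) for y in instances] for x in instances]

# Hypothetical ontology instances, invented for illustration only.
alice = {"classes": {"Person", "Researcher"}, "properties": {"worksAt", "authorOf"}}
bob = {"classes": {"Person"}, "properties": {"authorOf"}}
acme = {"classes": {"Organization"}, "properties": {"locatedIn"}}

gram = gram_matrix([alice, bob, acme], combined_kernel)
```

Tuning then amounts to adjusting the weights or swapping in a different component kernel, while the learning algorithm that consumes the Gram matrix stays unchanged.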
LLNL electro-optical mine detection program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, C.; Aimonetti, W.; Barth, M.
1994-09-30
Under funding from the Advanced Research Projects Agency (ARPA) and the US Marine Corps (USMC), Lawrence Livermore National Laboratory (LLNL) has directed a program aimed at improving detection capabilities against buried mines and munitions. The program has provided a national test facility for buried mines in arid environments, compiled and distributed an extensive data base of infrared (IR), ground penetrating radar (GPR), and other measurements made at that site, served as a host for other organizations wishing to make measurements, made considerable progress in the use of ground penetrating radar for mine detection, and worked on the difficult problem of sensor fusion as applied to buried mine detection. While the majority of our effort has been concentrated on the buried mine problem, LLNL has worked with the U.S.M.C. on surface mine problems as well, providing data and analysis to support the COBRA (Coastal Battlefield Reconnaissance and Analysis) program. The original aim of the experimental aspect of the program was the utilization of multiband infrared approaches for the detection of buried mines. Later the work was extended to a multisensor investigation, including sensors other than infrared imagers. After an early series of measurements, it was determined that further progress would require a larger test facility in a natural environment, so the Buried Object Test Facility (BOTF) was constructed at the Nevada Test Site. After extensive testing, with sensors spanning the electromagnetic spectrum from the near ultraviolet to radio frequencies, possible paths for improvement were: improved spatial resolution providing better ground texture discrimination; analysis which involves more complicated spatial queueing and filtering; additional IR bands using imaging spectroscopy; the use of additional sensors other than IR and the use of data fusion techniques with multi-sensor data; and utilizing time dependent observables like temperature.
NASA Astrophysics Data System (ADS)
Currell, Matthew J.; Werner, Adrian D.; McGrath, Chris; Webb, John A.; Berkman, Michael
2017-05-01
Understanding and managing impacts from mining on groundwater-dependent ecosystems (GDEs) and other groundwater users requires development of defensible science supported by adequate field data. This usually leads to the creation of predictive models and analysis of the likely impacts of mining and their accompanying uncertainties. The identification, monitoring and management of impacts on GDEs are often a key component of mine approvals, which need to consider and attempt to minimise the risks that negative impacts may arise. Here we examine a case study where approval for a large mining project in Australia (Carmichael Coal Mine) was challenged in court on the basis that it may result in more extensive impacts on a GDE (Doongmabulla Springs) of high ecological and cultural significance than predicted by the proponent. We show that throughout the environmental assessment and approval process, significant data gaps and scientific uncertainties remained unresolved. Evidence shows that the assumed conceptual hydrogeological model for the springs could be incorrect, and that at least one alternative conceptualisation (that the springs are dependent on a deep fault) is consistent with the available field data. Assumptions made about changes to spring flow as a consequence of mine-induced drawdown also appear problematic, with significant implications for the spring-fed wetlands. Despite the large scale of the project, it appears that critical scientific data required to resolve uncertainties and construct robust models of the springs' relationship to the groundwater system were lacking at the time of approval, contributing to uncertainty and conflict. For this reason, we recommend changes to the approval process that would require a higher standard of scientific information to be collected and reviewed, particularly in relation to key environmental assets during the environmental impact assessment process in future projects.
Trippi, Michael H.; Belkin, Harvey E.; Dai, Shifeng; Tewalt, Susan J.; Chou, Chiu-Jung
2015-01-01
Geographic information system (GIS) data may facilitate energy studies, which in turn provide input for energy policy decisions. The U.S. Geological Survey (USGS) has compiled GIS data representing the known coal mine locations and coal-mining areas of China as of 2001. These data are now available for download, and may be used in a GIS for a variety of energy resource and environmental studies of China. Province-scale maps were also created to display the point locations of coal mines and the coal-mining areas. In addition, coal-field outlines from a previously published map by Dai and others (2012) were digitized; they are available for download as a separate GIS data file and are shown in a nation-scale map of China. Chemical data for 332 coal samples from a previous USGS study of China and Taiwan (Tewalt and others, 2010) are included in a downloadable GIS point shapefile, and shown on a nation-scale map of China. A brief report summarizes the methodology used for creation of the shapefiles and the chemical analyses run on the samples.
NASA Astrophysics Data System (ADS)
Bhakta, K. D.; Yeboah-Forson, A.
2015-12-01
The Tri-State lead and zinc mining district in SW Missouri, SE Kansas, and NE Oklahoma encompasses nearly 2,500 sq. miles of land and at its peak accounted for half of US zinc production (23,000,000 tons), surpassing one billion dollars in economic value. Once these lead- and zinc-rich ores were extracted, mining and milling sites were abandoned, leaving behind a new landscape with numerous environmental challenges. Since 1970, most of the sites have been targeted for remediation and reclamation by federal and state agencies including the EPA. In order to capture the full extent of the impact of lead and zinc mining in the Tri-State area, numerous geoscientific approaches, including data from small unmanned aerial vehicles (UAVs), were employed to investigate the influence of mining in the study area. The study presented here focuses on observational assessment of the existing landscape using multiple commercial high-definition data sets from UAVs to study different sites across areas of concern in the three states. Primary results (images gathered and DEM and GIS data analyzed from abandoned mines) showed the potential to provide a quick snapshot of successfully or unsuccessfully remediated areas. Although research and remediation of the Tri-State mining district are a continuous process, evidence from this geomorphic study suggests that UAVs can provide a quick overview of the remediated landscape or serve as a primary background tool for a more detailed site-specific environmental study.
Mercury methylation at mercury mines in the Humboldt River Basin, Nevada, USA
Gray, J.E.; Crock, J.G.; Lasorsa, B.K.
2002-01-01
Total Hg and methylmercury concentrations were measured in mine-waste calcines (retorted ore), sediment, and water samples collected in and around abandoned mercury mines in western Nevada to evaluate Hg methylation at the mines and in the Humboldt River Basin. Mine-waste calcines contain total Hg concentrations as high as 14 000 µg g⁻¹. Stream-sediment samples collected within 1 km of the mercury mines contain total Hg concentrations as high as 170 µg g⁻¹, whereas stream sediments collected at a distance >5 km from the mines, and those collected from the Humboldt River and regional baseline sites, contain total Hg concentrations 8 km from the nearest mercury mines. Our data indicate little transference of Hg and methylmercury from the sediment to the water column due to the lack of mine runoff in this desert climate.
Microcomputer keeps watch at Emerald Mine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1987-04-01
This paper reviews the computerized mine monitoring system set up at the Emerald Mine, SW Pennsylvania, USA. This coal mine has pioneered the automation of many production and safety features, and this article covers its work in fire detection and conveyor belt monitoring. A central computer control room can safely watch over the whole underground mining operation using one 25 inch colour monitor. These new data-acquisition systems will lead the way, in the future, to safer, more efficient coal mining. Topics include multi-point monitoring of carbon monoxide, heat anomalies and toxic gases, and the procedures in conveyor belt operation from start-up to closedown.
Suspended sediment load below open-cast mines for ungauged river basin
NASA Astrophysics Data System (ADS)
Kuksina, L.
2011-12-01
Placer mines are located in river valleys along river benches or ancient river channels. Existing mining sites are frequently characterized by limited use of environmental technologies. Open-pit mining therefore alters stream hydrology and sediment processes and enhances sediment transport. The most serious environmental consequences of increased sediment yield occur in rivers populated by salmon communities, because salmon species prefer clean water with low turbidity. For instance, placer mining on the Kamchatka peninsula (Far East of Russia), which is regarded as the last global gene pool of wild salmon Oncorhynchus, significantly threatens river ecosystems. Impact assessment is limited by the scarcity of hydrological observations. The gauging network is sparse, and in many cases whole basins up to 200 km in length lack any hydrological data. The main purpose of the work is to elaborate methods for estimating sediment yield in rivers under mining impact and to carry out the corresponding calculations. The subjects of the study are rivers of the Vivenka river basin, where an open-cast platinum mine is situated. It is one of the largest platinum mines in the Russian Federation and in the world. This mine is the best studied in Kamchatka (research covers the period from 2003 to 2011). An empirical-analytical model for estimating suspended sediment yield was elaborated for rivers draining the mine's territory. Sediment delivery at the open-cast mine results from the following sediment processes: erosion in the channel diversions; soil erosion on the exposed hillsides; effluent from settling ponds; mine waste water inflow; and accidental escape of mine waste water into rivers. Sediment washout caused by erosion was estimated by repeated measurements of the channel profiles in 2003, 2006 and 2008. Horizontal deformation rates were estimated on the basis of the dependence of erosion on water discharge rates, slopes and sediment composition. 
Soil erosion on the exposed hillsides was estimated taking into account precipitation of various intensities and the solid material washed out during this period. Effluent from settling ponds was calculated on the basis of minimum anthropogenic turbidity, defined as the difference between background turbidity and the minimal turbidity caused by effluent and waste water overflow. Mine waste water inflow was estimated from actual data on the water balance of the purification system. Accidental escape of mine waste water into rivers was estimated from the duration of accidents and the material washout measured during the observation period. The total suspended sediment yield of rivers draining the mine's territory is the sum of these components. Total sediment supply from the mining site is 24.7 % of the Vivenka sediment yield. Polluted placer-mined rivers contribute about 35.4 % of the whole sediment yield of the Vivenka river. At the same time, the catchment area of these rivers is less than 0.2 % of the whole Vivenka catchment area.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
This volume contains five appendixes: Chattanooga Shale preliminary mining study, soils data, meteorologic data, water resources data, and biological resource data. The area around DeKalb County in Tennessee is the most likely site for commercial development for recovery of uranium. (DLC)
ERIC Educational Resources Information Center
O'Halloran, Kay L.; Tan, Sabine; Pham, Duc-Son; Bateman, John; Vande Moere, Andrew
2018-01-01
This article demonstrates how a digital environment offers new opportunities for transforming qualitative data into quantitative data in order to use data mining and information visualization for mixed methods research. The digital approach to mixed methods research is illustrated by a framework which combines qualitative methods of multimodal…
Open data mining for Taiwan's dengue epidemic.
Wu, ChienHsing; Kao, Shu-Chen; Shih, Chia-Hung; Kan, Meng-Hsuan
2018-07-01
Using a quantitative approach, this study examines the applicability of data mining techniques for discovering knowledge from open data related to Taiwan's dengue epidemic. We compare results when Google Trends data are included or excluded. Data sources are government open data, climate data, and Google Trends data. Research findings are obtained from analysis of 70,914 cases. Location and time (month) in open data show the highest classification power, followed by climate variables (temperature and humidity), whereas gender and age show the lowest values. Both prediction accuracy and simplicity decrease when Google Trends data are considered (respectively 0.94 and 0.37, compared to 0.96 and 0.46). The article demonstrates the value of open data mining in the context of public health care. Copyright © 2018 Elsevier B.V. All rights reserved.
Brady, Laura M.; Gray, Floyd; Wissler, Craig A.; Guertin, D. Phillip
2001-01-01
In this study, a geographic information system (GIS) is used to integrate and accurately map field studies, information from remotely sensed data, watershed models, and the dispersion of potentially toxic mine waste and tailings. The purpose of this study is to identify erosion rates and net sediment delivery of soil and mine waste/tailings to the drainage channel within several watershed regions to determine source areas of sediment delivery as a method of quantifying geo-environmental analysis of transport mechanisms in abandoned mine lands in arid climate conditions. The intended users of this study are researchers interested in exploring approaches to depicting historical activity in an area that has no baseline data records for environmental analysis of heavily mined terrain.
NASA Astrophysics Data System (ADS)
Kadioglu, Selma; Kagan Kadioglu, Yusuf
2014-05-01
An anti-tank mine (AT mine) is a type of land mine designed to damage or destroy vehicles, including tanks and armored fighting vehicles. Anti-tank mines typically have a much larger explosive charge and a fuze designed to be triggered only by vehicles or, in some cases, by tampering with the mine. There are many types of AT mine. In our test study, MK4 and MK5 AT mines were used. The Mk 5 was a cylindrical metal-cased U.K. anti-tank blast mine that entered service in 1943, during the Second World War. Their general specifications are 203 mm diameter, 127 mm height, 4.4-5.7 kg weight, 2.05-3.75 kg of TNT explosive content and 350 lbs operating pressure. The aims of the test study were to image anti-tank landmines with the GPR method and to analyse the soil characteristics before and after the mines were detonated in order to determine any change in those characteristics. We acquired data over 6 real unexploded anti-tank landmines buried at a depth of approximately 15 cm. The mines were buried in two lines, spaced 3 m apart; the lines were 1.5 m apart. We gathered data along profiles approximately 7 m long with a Ramac CUII system and an 800 MHz shielded antenna. We collected soil samples on the mines, near and around the mines, and in the area of the village, both before and after the mines were detonated. We successfully imaged the anti-tank landmines on depth slices of the GPR data and in their interactive transparent 3D subsets. We used a polarized microscope and confocal Raman spectroscopy (CRS) to identify soil characteristics before and after the explosions. The results showed that the GPR method and its 3D imaging were successful in detecting AT mines, and that there was no significant change in the mineralogical and petrographical characteristics of the soil before and after the explosions. This project has been supported by Ankara University under grant no 11B6055002. 
The study is a contribution to the EU funded COST action TU1208, "Civil Engineering Applications of Ground penetrating Radar".
ERIC Educational Resources Information Center
Dhar, Vasant
1998-01-01
Shows how counterfactuals and machine learning methods can be used to guide exploration of large databases in a way that addresses some of the fundamental problems that organizations face in learning from data. Discusses data mining, particularly in the financial arena; generating useful knowledge from data; and the evaluation of counterfactuals. (LRW)
2017-06-27
Advances in Knowledge Discovery and... (final report; report date 05-27-2017; dates covered 17-03-2017 to 15-03-2018; contract number FA2386-17-1-0102) ...Springer; Switzerland. The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference...in the areas of knowledge discovery and data mining (KDD). We had three keynote speeches, delivered by Sang Cha from Seoul National University
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chironis, N.P.
This book contains a wealth of valuable information carefully selected and compiled from recent issues of Coal Age magazine. Much of the source material has been gathered by Coal Age Editors during their visits to coal mines, research establishments, universities and technical symposiums. Equally important are the articles and data contributed by over 50 top experts, many of whom are well known to the mining industry. Specifically, this easy-to-use handbook is divided into eleven key areas of underground mining. Here you will find the latest information on continuous mining techniques, longwall and shortwall methods and equipment, specialized mining and boring systems, continuous haulage techniques, improved roof control and ventilation methods, mine communications and instrumentation, power systems, fire control methods, and new mining regulations. There is also a section on engineering and management considerations, including the modern use of computer terminals, practical techniques for picking leaders and for encouraging more safety consciousness in employees, factors affecting absenteeism, and some highly important financial considerations. All of this valuable information has been thoroughly indexed to provide immediate access to the specific data needed by the reader.
A primer to frequent itemset mining for bioinformatics
Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart
2015-01-01
Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize them. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and its derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences. PMID:24162173
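The shopping-basket example can be made concrete with a minimal level-wise (Apriori-style) sketch. It is one of the many algorithm families the primer alludes to, not necessarily the one the authors favour; the function names, the toy baskets and the support threshold are invented for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset that occurs in at
    least `min_support` transactions (level-wise candidate search)."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    size = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the association rule antecedent -> consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    return frequent[antecedent | consequent] / frequent[antecedent]

# Toy baskets, invented for illustration.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
itemsets = apriori(baskets, min_support=2)
rule_conf = confidence(itemsets, {"butter"}, {"bread"})
```

In a bioinformatics setting the "items" might be, for example, annotations or genes and the "transactions" samples or experiments; that mapping is application-specific and is not fixed by this sketch.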
NASA Astrophysics Data System (ADS)
Rochyani, Neny
2017-11-01
Acid mine drainage is a major problem for the mining environment. The main factor in the formation of acid mine drainage is the volume of rainfall. It is therefore important to understand clearly the main rainfall and seasonal patterns when managing acid mine drainage. This study focuses on the effects of rainfall on acid mine water management. Daily, monthly and seasonal rainfall patterns for the East Pit 3 West Banko area were derived from daily rainfall data using the Gumbel approach. The highest maximum daily rainfall was 165 mm/day and the lowest was 76.4 mm/day. During the period 2007 - 2016, the high-rainfall season ran from November to April, when the use of lime is slight, while the low-rainfall season ran from May to October, when the use of lime increases. The lime requirement calculated for each return period gives the total lime and financial requirement for treatment for that return period.
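A Gumbel analysis of this kind is often carried out with the frequency-factor form x_T = mean + K_T * s. The sketch below assumes that form with the large-sample frequency factor; the abstract does not say which variant the study used. The ten annual maxima are hypothetical: only the extremes (165 and 76.4 mm/day) are taken from the abstract, and the other eight values are invented.

```python
import math
import statistics


def gumbel_frequency_factor(return_period):
    """Large-sample Gumbel (EV1) frequency factor K_T for a return period in years."""
    euler_gamma = 0.5772156649
    reduced = math.log(math.log(return_period / (return_period - 1.0)))
    return -(math.sqrt(6.0) / math.pi) * (euler_gamma + reduced)


def gumbel_return_level(annual_maxima, return_period):
    """Estimate the T-year daily rainfall as mean + K_T * sample standard deviation."""
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    return mean + gumbel_frequency_factor(return_period) * std


# Hypothetical annual maximum daily rainfall (mm/day), one value per year.
annual_max_mm = [76.4, 88.0, 95.5, 102.0, 110.3, 118.7, 126.0, 134.2, 150.1, 165.0]

levels = {T: gumbel_return_level(annual_max_mm, T) for T in (2, 5, 10, 25)}
for T, level in levels.items():
    print(f"{T:>2}-year daily rainfall: {level:.1f} mm/day")
```

Each return-period rainfall could then be fed into a site-specific lime-dosing calculation, which is not reproduced here because the abstract does not give it.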
Data mining and medical world: breast cancers’ diagnosis, treatment, prognosis and challenges
Oskouei, Rozita Jamili; Kor, Nasroallah Moradi; Maleki, Saeid Abbasi
2017-01-01
The amount of data in the electronic and real worlds is constantly rising. Extracting useful knowledge from the available data is therefore an important and time-consuming task. Data mining offers various techniques for extracting valuable information or knowledge from data, and these techniques are applicable to data collected in all fields of science. Several studies have been published on applications of data mining in fields such as defense, banking, insurance, education, telecommunications and medicine. This investigation attempts to provide a comprehensive survey of applications of data mining techniques in breast cancer diagnosis, treatment and prognosis to date. It also presents the main challenges in this area. Because several research studies on these issues are currently under way, a complete survey of the research completed so far is needed, along with the results of those studies and the important challenges that remain, to help young researchers and to present to them the main problems that still exist in this area. PMID:28401016
Mining moving object trajectories in location-based services for spatio-temporal database update
NASA Astrophysics Data System (ADS)
Guo, Danhuai; Cui, Weihong
2008-10-01
Advances in wireless transmission and mobile technology applied to LBS (Location-based Services) flood us with large amounts of moving-object data. The vast amounts of data gathered from the position sensors of mobile phones, PDAs, or vehicles hide interesting and valuable knowledge and describe the behavior of moving objects. The correlation between the temporal movement patterns of moving objects and the spatio-temporal attributes of geo-features has been ignored, and the value of spatio-temporal trajectory data has not been fully exploited. Urban expansion and frequent town-plan changes produce large amounts of outdated or imprecise data in the spatial databases of LBS, which cannot be updated promptly and efficiently by manual processing. In this paper we introduce a data mining approach to extracting the movement patterns of moving objects, build a model to describe the relationship between the movement patterns of LBS mobile objects and their environment, and put forward a spatio-temporal database update strategy for LBS databases based on spatio-temporal trajectory mining. Experimental evaluation reveals excellent performance of the proposed model and strategy. Our original contributions include the formulation of a model of the interaction between a trajectory and its environment, the design of a spatio-temporal database update strategy based on moving-object data mining, and the experimental application of spatio-temporal database update by mining moving-object trajectories.
Analyzing Teaching Performance of Instructors Using Data Mining Techniques
ERIC Educational Resources Information Center
Mardikyan, Sona; Badur, Bertain
2011-01-01
Student evaluations to measure the teaching effectiveness of instructors have been applied very frequently in higher education for many years. This study investigates the factors associated with the assessment of instructors' teaching performance using two different data mining techniques: stepwise regression and decision trees. The data collected…
Teaching the Scientific Method: It's All in the Perspective
ERIC Educational Resources Information Center
Ayers, James M.; Ayers, Kathleen M.
2007-01-01
A three unit module of inquiry, including morphological comparison, cladogram construction, and data mining has been developed to teach students the nature of experimental science. Students generate angiosperm morphological data, form cladistic hypotheses, then mine taxonomic, bioinformatic and historical data from many sources to replicate and…
Comparative performance between compressed and uncompressed airborne imagery
NASA Astrophysics Data System (ADS)
Phan, Chung; Rupp, Ronald; Agarwal, Sanjeev; Trang, Anh; Nair, Sumesh
2008-04-01
The US Army's RDECOM CERDEC Night Vision and Electronic Sensors Directorate (NVESD), Countermine Division is evaluating the compressibility of airborne multi-spectral imagery for mine and minefield detection applications. Of particular interest is the highest image data compression rate that can be afforded without loss of image quality for warfighters in the loop or of performance of the near-real-time mine detection algorithm. The JPEG-2000 compression standard is used to perform data compression. Both lossless and lossy compression are considered. A multi-spectral anomaly detector such as RX (Reed & Xiaoli), which is widely used as a core algorithm baseline in airborne mine and minefield detection on different mine types, minefields, and terrains to identify potential individual targets, is used to compare the mine detection performance. This paper presents the compression scheme and compares detection performance results between compressed and uncompressed imagery for various levels of compression. The compression efficiency is evaluated, and its dependence upon different backgrounds and other factors is documented and presented using multi-spectral data.
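An RX-style anomaly detector is usually described as scoring each pixel by its squared Mahalanobis distance from the background statistics. The sketch below assumes the simplest global variant on two spectral bands, with the 2x2 covariance inverted by hand. The pixel values are hypothetical and this is not the NVESD implementation.

```python
def rx_scores(pixels):
    """Global RX score (squared Mahalanobis distance) for two-band pixels.

    The background mean and covariance are estimated from all pixels,
    which assumes anomalies are rare relative to the background.
    """
    n = len(pixels)
    m0 = sum(p[0] for p in pixels) / n
    m1 = sum(p[1] for p in pixels) / n
    # Sample covariance entries of the two bands.
    c00 = sum((p[0] - m0) ** 2 for p in pixels) / (n - 1)
    c11 = sum((p[1] - m1) ** 2 for p in pixels) / (n - 1)
    c01 = sum((p[0] - m0) * (p[1] - m1) for p in pixels) / (n - 1)
    det = c00 * c11 - c01 * c01
    # Entries of the inverse of the symmetric 2x2 covariance matrix.
    i00, i01, i11 = c11 / det, -c01 / det, c00 / det
    scores = []
    for p in pixels:
        d0, d1 = p[0] - m0, p[1] - m1
        scores.append(d0 * d0 * i00 + 2.0 * d0 * d1 * i01 + d1 * d1 * i11)
    return scores


# Hypothetical two-band pixel values; the last pixel plays the anomaly.
pixels = [(10, 20), (11, 19), (9, 21), (10, 22), (12, 20), (11, 21), (30, 5)]
scores = rx_scores(pixels)
most_anomalous = scores.index(max(scores))
print(most_anomalous, [round(s, 2) for s in scores])
```

Comparing such scores computed on compressed and uncompressed versions of the same scene is one way to quantify how much compression a detector tolerates.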
A Node Linkage Approach for Sequential Pattern Mining
Navarro, Osvaldo; Cumplido, René; Villaseñor-Pineda, Luis; Feregrino-Uribe, Claudia; Carrasco-Ochoa, Jesús Ariel
2014-01-01
Sequential Pattern Mining is a widely addressed problem in data mining, with applications such as analyzing Web usage, examining purchase behavior, and text mining, among others. Nevertheless, with the dramatic increase in data volume, the current approaches prove inefficient when dealing with large input datasets, a large number of different symbols and low minimum supports. In this paper, we propose a new sequential pattern mining algorithm, which follows a pattern-growth scheme to discover sequential patterns. Unlike most pattern growth algorithms, our approach does not build a data structure to represent the input dataset, but instead accesses the required sequences through pseudo-projection databases, achieving better runtime and reducing memory requirements. Our algorithm traverses the search space in a depth-first fashion and only preserves in memory a pattern node linkage and the pseudo-projections required for the branch being explored at the time. Experimental results show that our new approach, the Node Linkage Depth-First Traversal algorithm (NLDFT), has better performance and scalability in comparison with state of the art algorithms. PMID:24933123
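The pattern-growth idea with pseudo-projections described above can be sketched as follows. This is a generic PrefixSpan-style depth-first miner over sequences of single items, not the authors' NLDFT algorithm; the function name and the toy sequences are hypothetical. A pseudo-projection here is a list of (sequence index, offset) pointers into the original data, so no projected copy of the database is built.

```python
def mine_sequences(sequences, min_support):
    """Return {pattern: support count} for frequent sequential patterns."""
    results = {}

    def grow(prefix, projection):
        # projection holds (sequence index, start offset) pointers.
        extensions = {}
        for sid, start in projection:
            seen = set()
            for pos in range(start, len(sequences[sid])):
                item = sequences[sid][pos]
                if item not in seen:
                    # Keep only the first occurrence after the prefix match.
                    seen.add(item)
                    extensions.setdefault(item, []).append((sid, pos + 1))
        for item, pointers in sorted(extensions.items()):
            if len(pointers) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(pointers)
                # Depth-first: explore this branch before its siblings.
                grow(pattern, pointers)

    grow((), [(sid, 0) for sid in range(len(sequences))])
    return results


# Hypothetical input sequences, invented for illustration.
sequences = [["a", "b", "c"], ["a", "c", "b"], ["a", "b"]]
patterns = mine_sequences(sequences, min_support=2)
for pattern, count in sorted(patterns.items()):
    print(pattern, count)
```

Only the pointers for the branch currently being explored are alive on the call stack, which is the memory-saving property the abstract attributes to depth-first traversal.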
Detecting Malicious Tweets in Twitter Using Runtime Monitoring With Hidden Information
2016-06-01
text mining using Twitter streaming API and python [Online]. Available: http://adilmoujahid.com/posts/2014/07/twitter-analytics/ [22] M. Singh, B...sites with 645,750,000 registered users [3] and has open source public tweets for data mining. 2. Malicious Users and Tweets In the modern world...want to data mine in Twitter, and presents the natural language assertions and corresponding rule patterns. It then describes the steps performed using
PERFORMING QUALITY FLOW MEASUREMENTS AT MINE SITES
Accurate flow measurement data is vital to research, monitoring, and remediation efforts at mining sites. This guidebook has been prepared to provide a summary of information relating to the performance of flow measurements, and how this information can be applied at mining sites....
Wang, Gang; Zhao, Zhikai; Ning, Yongjie
2018-05-28
With the application of the coal mine Internet of Things (IoT), mobile measurement devices such as intelligent mine lamps are generating increasing amounts of moving measurement data. How to transmit these large amounts of mobile measurement data effectively has become an urgent problem. This paper presents a compressed sensing algorithm for the large amount of coal mine IoT moving measurement data based on a multi-hop network and total variation. Taking gas data in mobile measurement data as an example, two network models for the transmission of gas data flow, namely single-hop and multi-hop transmission modes, are investigated in depth, and a gas data compressed sensing collection model is built based on a multi-hop network. To utilize the sparse characteristics of gas data, the concept of total variation is introduced and a high-efficiency gas data compression and reconstruction method based on Total Variation Sparsity based on Multi-Hop (TVS-MH) is proposed. According to the simulation results, the proposed method allows the moving measurement data flow from an underground distributed mobile network to be acquired and transmitted efficiently.
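The role of total variation can be illustrated with a small sketch. It assumes the usual one-dimensional discrete definition (the sum of absolute differences between neighbouring samples) and hypothetical gas readings; it does not reproduce the TVS-MH method itself. The point is that a slowly changing signal has few non-zero differences, which is the sparsity a compressed sensing scheme can exploit.

```python
def total_variation(signal):
    """Sum of absolute differences between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))


def nonzero_differences(signal, tol=1e-9):
    """Count consecutive-sample differences larger than tol."""
    return sum(1 for a, b in zip(signal, signal[1:]) if abs(b - a) > tol)


# Hypothetical gas concentration readings (percent), piecewise constant.
gas_readings = [0.5, 0.5, 0.5, 0.8, 0.8, 0.8]

tv = total_variation(gas_readings)
jumps = nonzero_differences(gas_readings)
print(f"total variation = {tv:.2f}, non-zero differences = {jumps}")
```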
NASA Technical Reports Server (NTRS)
1983-01-01
While planning for the space shuttle, Bendix Corporation with the help of Johnson Space Center expanded the anthropometric data base for aerospace and nonaerospace use in clothing, workplace, etc. The result was the Anthropometric Source Book which was later utilized by the U.S. Bureau of Mines in designing advanced mining systems. The book was particularly valuable in the design of a remote cab used in mining.
ERIC Educational Resources Information Center
Adkins, John; And Others
A project was designed to produce a broad description of current mining training programs and to evaluate their effectiveness with respect to reducing mine injuries. Aggregate training and injury data were used to evaluate the overall training effort at 300 mines as well as specific efforts in 12 categories of training course objectives. From such…
Application of text mining for customer evaluations in commercial banking
NASA Astrophysics Data System (ADS)
Tan, Jing; Du, Xiaojiang; Hao, Pengpeng; Wang, Yanbo J.
2015-07-01
Customer attrition is an increasingly serious problem in commercial banks. To combat this problem comprehensively, mining customer evaluation texts is as important as mining structured customer data. To extract hidden information from customer evaluations, Textual Feature Selection, Classification and Association Rule Mining are necessary techniques. This paper presents all three techniques using Chinese Word Segmentation, C5.0 and Apriori, and a set of experiments was run on a collection of real textual data comprising 823 customer evaluations taken from a Chinese commercial bank. Results, consequent solutions and some advice for the commercial bank are given in this paper.
Deformation Monitoring of Waste-Rock-Backfilled Mining Gob for Ground Control
Zhao, Tongbin; Zhang, Yubao; Zhang, Zhenyu; Li, Zhanhai; Ma, Shuqi
2017-01-01
Backfill mining is an effective option to mitigate ground subsidence, especially for mining under surface infrastructure, such as buildings, dams, rivers and railways. To evaluate its performance, continual long-term field monitoring of the deformation of backfilled gob is important to satisfy strict public scrutiny. Based on industrial Ethernet, a real-time monitoring system was established to monitor the deformation of waste-rock-backfilled gob at −700 m depth in the Tangshan coal mine, Hebei Province, China. The designed deformation sensors, based on a resistance transducer mechanism, were placed vertically between the roof and floor. Stress sensors were installed above square steel plates that were anchored to the floor strata. Meanwhile, data cables were protected by steel tubes in case of damage. The developed system continually harvested field data for three months. The results show that industrial Ethernet technology can be reliably used for long-term data transmission in complicated underground mining conditions. The monitoring reveals that the roof subsidence of the backfilled gob area can be categorized into four phases. The bearing load of the backfill developed gradually and simultaneously with the deformation of the roof strata, and started to be almost invariable when the mining face passed 97 m. PMID:28475168
Deformation Monitoring of Waste-Rock-Backfilled Mining Gob for Ground Control.
Zhao, Tongbin; Zhang, Yubao; Zhang, Zhenyu; Li, Zhanhai; Ma, Shuqi
2017-05-05
Backfill mining is an effective option to mitigate ground subsidence, especially for mining under surface infrastructure, such as buildings, dams, rivers and railways. To evaluate its performance, continual long-term field monitoring of the deformation of backfilled gob is important to satisfy strict public scrutiny. Based on industrial Ethernet, a real-time monitoring system was established to monitor the deformation of waste-rock-backfilled gob at -700 m depth in the Tangshan coal mine, Hebei Province, China. The designed deformation sensors, based on a resistance transducer mechanism, were placed vertically between the roof and floor. Stress sensors were installed above square steel plates that were anchored to the floor strata. Meanwhile, data cables were protected by steel tubes in case of damage. The developed system continually harvested field data for three months. The results show that industrial Ethernet technology can be reliably used for long-term data transmission in complicated underground mining conditions. The monitoring reveals that the roof subsidence of the backfilled gob area can be categorized into four phases. The bearing load of the backfill developed gradually and simultaneously with the deformation of the roof strata, and started to be almost invariable when the mining face passed 97 m.
NASA Astrophysics Data System (ADS)
Vathsala, H.; Koolagudi, Shashidhar G.
2017-01-01
In this paper we discuss a data mining application for predicting peninsular Indian summer monsoon rainfall, and propose an algorithm that combines data mining and statistical techniques. We select likely predictors based on association rules that have the highest confidence levels. We then cluster the selected predictors to reduce their dimensions and use cluster membership values for classification. We derive the predictors from local conditions in southern India, including mean sea level pressure, wind speed, and maximum and minimum temperatures. The global condition variables include southern oscillation and Indian Ocean dipole conditions. The algorithm predicts rainfall in five categories: Flood, Excess, Normal, Deficit and Drought. We use closed itemset mining, cluster membership calculations and a multilayer perceptron function in the algorithm to predict monsoon rainfall in peninsular India. Using Indian Institute of Tropical Meteorology data, we found the prediction accuracy of our proposed approach to be exceptionally good.
Digital Family History Data Mining with Neural Networks: A Pilot Study.
Hoyt, Robert; Linnville, Steven; Thaler, Stephen; Moore, Jeffrey
2016-01-01
Following the passage of the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009, electronic health records were widely adopted by eligible physicians and hospitals in the United States. Stage 2 meaningful use menu objectives include a digital family history but no stipulation as to how that information should be used. A variety of data mining techniques now exist for these data, which include artificial neural networks (ANNs) for supervised or unsupervised machine learning. In this pilot study, we applied an ANN-based simulation to a previously reported digital family history to mine the database for trends. A graphical user interface was created to display the input of multiple conditions in the parents and output as the likelihood of diabetes, hypertension, and coronary artery disease in male and female offspring. The results of this pilot study show promise in using ANNs to data mine digital family histories for clinical and research purposes.
Luo, Gang
2017-12-01
For user-friendliness, many software systems offer progress indicators for long-duration tasks. A typical progress indicator continuously estimates the remaining task execution time as well as the portion of the task that has been finished. Building a machine learning model often takes a long time, but no existing machine learning software supplies a non-trivial progress indicator. Similarly, running a data mining algorithm often takes a long time, but no existing data mining software provides a nontrivial progress indicator. In this article, we consider the problem of offering progress indicators for machine learning model building and data mining algorithm execution. We discuss the goals and challenges intrinsic to this problem. Then we describe an initial framework for implementing such progress indicators and two advanced, potential uses of them, with the goal of inspiring future research on this topic.
Data collection and simulation of high range resolution laser radar for surface mine detection
NASA Astrophysics Data System (ADS)
Steinvall, Ove; Chevalier, Tomas; Larsson, Håkan
2006-05-01
Rapid and efficient detection of surface mines, IEDs (Improvised Explosive Devices) and UXO (Unexploded Ordnance) is of high priority in military conflicts. High range resolution laser radars combined with passive hyper/multispectral sensors offer an interesting concept to help solve this problem. This paper reports on laser radar data collection of various surface mines in different types of terrain. In order to evaluate the capability of 3D imaging for detecting and classifying the objects of interest, a scanning laser radar was used to scan mines and surrounding terrain with high angular and range resolution. These data were then fed into a laser radar model capable of generating range waveforms for a variety of system parameters and combinations of different targets and backgrounds. We can thus simulate a potential system by down sampling to relevant pixel sizes and laser/receiver characteristics. Data, simulations and examples will be presented.
A Comparative Study to Predict Student’s Performance Using Educational Data Mining Techniques
NASA Astrophysics Data System (ADS)
Uswatun Khasanah, Annisa; Harwati
2017-06-01
Predicting student performance is essential for a university to prevent student failure. The number of student dropouts is one parameter that can be used to measure student performance and an important point evaluated in Indonesian university accreditation. Data mining has been widely used to predict student performance; data mining applied in this field is usually called Educational Data Mining. This study used Feature Selection to select the attributes with high influence on student performance in the Department of Industrial Engineering, Universitas Islam Indonesia. Then two popular classification algorithms, Bayesian Network and Decision Tree, were implemented and compared to find the best prediction result. The results showed that student attendance and first-semester GPA ranked highest across all Feature Selection methods, and that Bayesian Network outperformed Decision Tree, with a higher accuracy rate.
Luo, Gang
2017-01-01
For user-friendliness, many software systems offer progress indicators for long-duration tasks. A typical progress indicator continuously estimates the remaining task execution time as well as the portion of the task that has been finished. Building a machine learning model often takes a long time, but no existing machine learning software supplies a non-trivial progress indicator. Similarly, running a data mining algorithm often takes a long time, but no existing data mining software provides a nontrivial progress indicator. In this article, we consider the problem of offering progress indicators for machine learning model building and data mining algorithm execution. We discuss the goals and challenges intrinsic to this problem. Then we describe an initial framework for implementing such progress indicators and two advanced, potential uses of them, with the goal of inspiring future research on this topic. PMID:29177022
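The two quantities a progress indicator reports, the fraction finished and the remaining time, can be related by a very simple baseline. The sketch below assumes work proceeds at a constant rate, which is exactly the kind of trivial estimate the article contrasts with non-trivial indicators. The function name and the numbers are hypothetical and not taken from the article's framework.

```python
def estimate_remaining_seconds(elapsed_seconds, fraction_done):
    """Constant-rate estimate of the remaining execution time.

    Assumes the rest of the task proceeds at the average speed observed
    so far, which rarely holds exactly for model building.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds * (1.0 - fraction_done) / fraction_done


# Hypothetical reading: 30 s elapsed with a quarter of the work finished.
remaining = estimate_remaining_seconds(30.0, 0.25)
print(f"estimated remaining time: {remaining:.0f} s")
```

The hard part for machine learning workloads is estimating the fraction finished in the first place, since the number of iterations needed for convergence is usually unknown in advance.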
Oxygen transport and pyrite oxidation in unsaturated coal-mine spoil
Guo, Weixing; Cravotta, Charles A.
1996-01-01
An understanding of the mechanisms of oxygen (O2) transport in unsaturated mine spoil is necessary to design and implement effective measures to exclude O2 from pyritic materials and to control the formation of acidic mine drainage. Partial pressure of oxygen (Po2) in pore gas, chemistry of pore water, and temperature were measured at different depths in unsaturated spoil at two reclaimed surface coal mines in Pennsylvania. At mine 1, where spoil was loose, blocky sandstone, Po2 changed little with depth, decreasing from 21 volume percent (vol%) at the ground surface to a minimum of about 18 vol% at 10 m depth. At mine 2, where spoil was compacted, friable shale, Po2 decreased to less than 2 vol% at depth of about 10 m. Although pore-water chemistry and temperature data indicate that acid-forming reactions were active at both mines, the pore-gas data indicate that mechanisms for O2 transport were different at each mine. A numerical model was developed to simulate O2 transport and pyrite oxidation in unsaturated mine spoil. The results of the numerical simulations indicate that differences in O2 transport at the two mines can be explained by differences in the air permeability of spoil. Po2 changes little with depth if advective transport of O2 dominates as at mine 1, but decreases greatly with depth if diffusive transport of O2 dominates, as in mine 2. Model results also indicate that advective transport becomes significant if the air permeability of spoil is greater than 10^-9 m^2, which is expected for blocky sandstone spoil. In the advective-dominant system, thermally-induced convective air flow, as a consequence of the exothermic oxidation of pyrite, supplies the O2 to maintain high Po2 within the deep unsaturated zone.
Huber, Douglas W.; Pierce, Brenda S.
2000-01-01
The U.S. Geological Survey (USGS) conducted a coal resource assessment of several areas in Armenia from 1997 to 1999. This report, which presents a prefeasibility study of the economic and mining potential of one coal deposit found and studied by the USGS team, was prepared using all data available at the time of the study and the results of the USGS exploratory work, including core drilling, trenching, coal quality analyses, and other ongoing field work. On the basis of information currently available, it is the authors' opinion that a small surface coal mine having about a 20-year life span could be developed in the Antaramut-Kurtan-Dzoragukh coal field, specifically at the Dzoragukh site. The mining organization selected or created to establish the mine will need to conduct necessary development drilling and other work to establish the final feasibility study for the mine. The company will need to be entrepreneurial, profit oriented, and sensitive to the coal consumer; have an analytical management staff; and focus on employee training, safety, and protection of the environment. It is anticipated that any interested parties will be required to submit detailed mining plans to the appropriate Armenian Government agencies. Further development work will be required to reach a final decision regarding the economic feasibility of the mine. However, available information indicates that a small, economic surface mine can be developed at this locality. The small mine suggested is a typical surface outcrop-stripping, contour mining operation. In addition, auger mining is strongly suggested, because the recovery of these low-cost mining reserves will help to ensure that the operation will be a viable, economic enterprise. (Auger mining is a system in which large-diameter boreholes are placed horizontally into the coal seam at the final highwall set as the economic limit for the surface mining operation).
A special horizontal boring machine, which can be imported from Russia, is required for auger mining. Although auger-mining coal reserves do exist, the necessary development work will further verify the extent of these reserves and all of the other indicated reserves. The following items are based on the detailed study reported in this publication. Initial investment: Following an investment of US $85,000 over a 12-month period in mine development drilling and other activities, a decision must be taken regarding further investment in an ongoing mining operation. If the new data support the opening of the surface mine, the $85,000 development cost is amortized over the first 10 years of mine production. If the new data do not support the opening of the mine, the $85,000 is considered a business development expense that may be written off against profits from other operations for income or other tax purposes or simply as a business loss. [Footnotes: 1 Consultant, 6024 Morning Dew Drive, Austin, TX 78749. 2 U.S. Geological Survey, 956 National Center, Reston, VA 20192. Running head: MINABILITY AND ECONOMIC VIABILITY, ANTARAMUT-KURTAN-DZORAGUKH COAL FIELD.] Total capital required: The equipment costs will reach a total of $900,500, which will be amortized over a 7-year period to establish estimated coal mining costs. Estimated working capital costs are $300,000, which will be borrowed. Surface mining reserves: Approximately 840,200 metric tonnes of surface minable coal reserves at 9.3 m^3 of overburden per metric tonne of minable coal are indicated. Recovery of the minable coal at 85 percent will yield 714,000 recoverable metric tonnes of marketable as-mined coal. Auger mining reserves: Auger-mining reserves of 576,000 metric tonnes are indicated. Recoverable auger-mining reserves of 202,000 metric tonnes (at 35-percent recovery) can be expected.
Auger-mining production will vary according to the hole size being used, but, in either case, augering is a very profitable addition to the mining oper
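The reserve and cost figures quoted in this record can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the equal annual instalments (straight-line amortization) are an assumption, since the abstract does not say how the amortization is scheduled.

```python
# Figures quoted in the abstract.
surface_reserves_t = 840_200      # surface minable reserves, metric tonnes
surface_recovery = 0.85           # 85 percent recovery
auger_reserves_t = 576_000        # auger-mining reserves, metric tonnes
auger_recovery = 0.35             # 35 percent recovery
equipment_cost_usd = 900_500      # amortized over 7 years
development_cost_usd = 85_000     # amortized over the first 10 years

# Recoverable tonnage = indicated reserves x recovery fraction.
surface_recoverable_t = surface_reserves_t * surface_recovery
auger_recoverable_t = auger_reserves_t * auger_recovery

# Assumption: straight-line amortization (equal yearly charges).
equipment_per_year_usd = equipment_cost_usd / 7
development_per_year_usd = development_cost_usd / 10

print(f"surface recoverable: {surface_recoverable_t:,.0f} t")
print(f"auger recoverable:   {auger_recoverable_t:,.0f} t")
print(f"equipment charge:    ${equipment_per_year_usd:,.0f} per year")
print(f"development charge:  ${development_per_year_usd:,.0f} per year")
```

The computed tonnages, about 714,170 t and 201,600 t, agree with the rounded 714,000 t and 202,000 t given in the abstract.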
Gray, John E.; Hines, Mark E.; Higueras, Pablo L.; Adatto, Isaac; Lasorsa, Brenda K.
2004-01-01
Speciation of Hg and conversion to methyl-Hg were evaluated in mine wastes, sediments, and water collected from the Almadén District, Spain, the world's largest Hg producing region. Our data for methyl-Hg, a neurotoxin hazardous to humans, are the first reported for sediment and water from the Almadén area. Concentrations of Hg and methyl-Hg in mine waste, sediment, and water from Almadén are among the highest found at Hg mines worldwide. Mine wastes from Almadén contain highly elevated Hg concentrations, ranging from 160 to 34 000 µg/g, and methyl-Hg varies from <0.20 to 3100 ng/g. Isotopic tracer methods indicate that mine wastes at one site (Almadenejos) exhibit unusually high rates of Hg-methylation, which correspond with mine wastes containing the highest methyl-Hg concentrations. Streamwater collected near the Almadén mine is also contaminated, containing Hg as high as 13 000 ng/L and methyl-Hg as high as 30 ng/L; corresponding stream sediments contain Hg concentrations as high as 2300 µg/g and methyl-Hg concentrations as high as 82 ng/g. Several streamwaters contain Hg concentrations in excess of the 1000 ng/L World Health Organization (WHO) drinking water standard. Methyl-Hg formation and degradation were rapid in mine wastes and stream sediments, demonstrating the dynamic nature of Hg cycling. These data indicate substantial downstream transport of Hg from the Almadén mine and significant conversion to methyl-Hg in the surface environment.
Hammarstrom, Jane M.; Johnson, Adam N.; Seal, Robert R.; Meier, Allen L.; Briggs, Paul L.; Piatak, Nadine M.
2006-01-01
The Virginia gold-pyrite belt, part of the central Virginia volcanic-plutonic belt, hosts numerous abandoned metal mines. The belt extends from about 50 km south of Washington, D.C., for approximately 175 km to the southwest into central Virginia. The rocks that comprise the belt include metamorphosed volcanic and clastic (noncarbonate) sedimentary rocks that were originally deposited during the Ordovician. Deposits that were mined can be classified into three broad categories: 1. volcanic-associated massive sulfide deposits, 2. low-sulfide quartz-gold vein deposits, 3. gold placer deposits, which result from weathering of the vein deposits. The massive sulfide deposits were historically mined for iron and pyrite (sulfur), zinc, lead, and copper but also yielded byproduct gold and silver. The most intensely mineralized and mined section of the belt is southwest of Fredericksburg, in the Mineral district of Louisa and Spotsylvania counties. The Valzinco lead-zinc mine and the Mitchell gold prospect are abandoned sites in Spotsylvania County. As a result of environmental impacts associated with historic mining, both sites were prioritized for reclamation under the Virginia Orphaned Land Program administered by the Virginia Department of Mines, Minerals, and Energy (VDMME). This report summarizes geochemical data for all solid sample media, along with mineralogical data, and results of weathering experiments on Valzinco tailings and field experiments on sediment accumulation in Knights Branch. These data provide a framework for evaluating water-rock interactions and geoenvironmental signatures of long-abandoned mines developed in massive sulfide deposits and low-sulfide gold-quartz vein deposits in the humid temperate ecosystem domain in the eastern United States.
Abar, Orhan; Charnigo, Richard J.; Rayapati, Abner
2017-01-01
Association rule mining has received significant attention from both the data mining and machine learning communities. While data mining researchers focus more on designing efficient algorithms to mine rules from large datasets, the learning community has explored applications of rule mining to classification. A major problem with rule mining algorithms is the explosion of rules even for moderate-sized datasets, which makes it very difficult for end users to identify both statistically significant and potentially novel rules that could lead to interesting new insights and hypotheses. Researchers have proposed many domain-independent interestingness measures with which one can rank the rules and potentially glean useful rules from the top-ranked ones. However, these measures have not been fully explored for rule mining in clinical datasets owing to the relatively large sizes of the datasets often encountered in healthcare and also due to limited access to domain experts for review/analysis. In this paper, using an electronic medical record (EMR) dataset of diagnoses and medications from over three million patient visits to the University of Kentucky medical center and affiliated clinics, we conduct a thorough evaluation of dozens of interestingness measures proposed in data mining literature, including some new composite measures. Using cumulative relevance metrics from information retrieval, we compare these interestingness measures against human judgments obtained from a practicing psychiatrist for association rules involving the depressive disorders class as the consequent. Our results not only surface new interesting associations for depressive disorders but also indicate classes of interestingness measures that weight rule novelty and statistical strength in contrasting ways, offering new insights for end users in identifying interesting rules. PMID:28736771
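Three of the most common interestingness measures (support, confidence and lift) can be sketched as follows. The visit records and the `rule_measures` helper are hypothetical illustrations, not data or code from the study, and the sketch assumes both the antecedent and the consequent occur at least once.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions is a list of sets; antecedent and consequent are sets.
    """
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_b = sum(1 for t in transactions if consequent <= t)
    count_ab = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = count_ab / n
    confidence = count_ab / count_a
    # Lift above 1 means the consequent is more likely given the antecedent.
    lift = confidence / (count_b / n)
    return support, confidence, lift


# Hypothetical visit records, invented for illustration only.
visits = [
    {"insomnia", "depressive disorder"},
    {"insomnia", "depressive disorder"},
    {"insomnia"},
    {"depressive disorder"},
    {"hypertension"},
]
support, confidence, lift = rule_measures(
    visits, {"insomnia"}, {"depressive disorder"}
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Ranking rules by different measures generally gives different orderings, which is why comparing the measures against expert judgments is informative.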
Solutions for Mining Distributed Scientific Data
NASA Astrophysics Data System (ADS)
Lynnes, C.; Pham, L.; Graves, S.; Ramachandran, R.; Maskey, M.; Keiser, K.
2007-12-01
Researchers at the University of Alabama in Huntsville (UAH) and the Goddard Earth Sciences Data and Information Services Center (GES DISC) are working on approaches and methodologies facilitating the analysis of large amounts of distributed scientific data. Despite the existence of full-featured analysis tools, such as the Algorithm Development and Mining (ADaM) toolkit from UAH, and data repositories, such as the GES DISC, that provide online access to large amounts of data, there remain obstacles to getting the analysis tools and the data together in a workable environment. Does one bring the data to the tools or deploy the tools close to the data? The large size of many current Earth science datasets incurs significant overhead in network transfer for analysis workflows, even with the advanced networking capabilities that are available between many educational and government facilities. The UAH and GES DISC team are developing a capability to define analysis workflows using distributed services and online data resources. We are developing two solutions for this problem that address different analysis scenarios. The first is a Data Center Deployment of the analysis services for large data selections, orchestrated by a remotely defined analysis workflow. The second is a Data Mining Center approach of providing a cohesive analysis solution for smaller subsets of data. The two approaches can be complementary and thus provide flexibility for researchers to exploit the best solution for their data requirements. The Data Center Deployment of the analysis services has been implemented by deploying ADaM web services at the GES DISC so they can access the data directly, without the need of network transfers. Using the Mining Workflow Composer, a user can define an analysis workflow that is then submitted through a Web Services interface to the GES DISC for execution by a processing engine. 
The workflow definition is composed, maintained and executed at a distributed location, but most of the actual services comprising the workflow are available local to the GES DISC data repository. Additional refinements will ultimately provide a package that is easily implemented and configured at additional data centers for analysis of additional science data sets. Enhancements to the ADaM toolkit allow the staging of distributed data wherever the services are deployed, to support a Data Mining Center that can provide additional computational resources, large storage of output, easier addition and updates to available services, and access to data from multiple repositories. The Data Mining Center case provides researchers more flexibility to quickly try different workflow configurations and refine the process, using smaller amounts of data that may likely be transferred from distributed online repositories. This environment is sufficient for some analyses, but can also be used as an initial sandbox to test and refine a solution before staging the execution at a Data Center Deployment. Detection of airborne dust both over water and land in MODIS imagery using mining services for both solutions will be presented. The dust detection is just one possible example of the mining and analysis capabilities the proposed mining services solutions will provide to the science community. More information about the available services and the current status of this project is available at http://www.itsc.uah.edu/mws/
Fast Spatio-Temporal Data Mining from Large Geophysical Datasets
NASA Technical Reports Server (NTRS)
Stolorz, P.; Mesrobian, E.; Muntz, R.; Santos, J. R.; Shek, E.; Yi, J.; Mechoso, C.; Farrara, J.
1995-01-01
Use of UCLA CONQUEST (CONtent-based Querying in Space and Time) is reviewed for automatic cyclone extraction and detection of spatio-temporal blocking conditions on MPP. CONQUEST is a data analysis environment for knowledge and data mining to aid in high-resolution climate modeling.
Traffic Flow Management: Data Mining Update
NASA Technical Reports Server (NTRS)
Grabbe, Shon R.
2012-01-01
This presentation provides an update on recent data mining efforts that have been designed to (1) identify like/similar days in the national airspace system, (2) cluster/aggregate national-level rerouting data, and (3) apply machine learning techniques to predict when Ground Delay Programs are required at a weather-impacted airport.
Educational Data Mining Acceptance among Undergraduate Students
ERIC Educational Resources Information Center
Wook, Muslihah; Yusof, Zawiyah M.; Nazri, Mohd Zakree Ahmad
2017-01-01
The acceptance of Educational Data Mining (EDM) technology is on the rise due to its ability to extract new knowledge from large amounts of students' data. This knowledge is important for educational stakeholders, such as policy makers, educators, and students themselves, to enhance efficiency and achievements. However, previous studies on EDM…
Engaging Business Students with Data Mining
ERIC Educational Resources Information Center
Brandon, Dan
2016-01-01
The Economist calls it "a golden vein", and many business experts now say it is the new science of winning. Business and technologists have many names for this new science; "business intelligence" (BI), "data analytics," and "data mining" are among the most common. The job market for people skilled in this…
A Comparative Study of Data Mining Techniques on Football Match Prediction
NASA Astrophysics Data System (ADS)
Rosli, Che Mohamad Firdaus Che Mohd; Zainuri Saringat, Mohd; Razali, Nazim; Mustapha, Aida
2018-05-01
Data prediction has become a trend in today's businesses and organizations. This paper sets out to predict match outcomes for association football from the perspective of football club managers and coaches. It explores different data mining techniques for predicting match outcomes, where the target class is win, draw, or lose. The main objective of this research is to find the most accurate data mining technique that fits the nature of football data. The techniques tested are Decision Trees, Neural Networks, Bayesian Network, and k-Nearest Neighbors. The results of the comparative experiments showed that Decision Trees produced the highest average prediction accuracy in the domain of football match prediction, at 99.56%.
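To make the comparison setup concrete, here is a minimal sketch of one of the listed techniques, k-Nearest Neighbors, together with the holdout accuracy calculation used to compare classifiers. It does not reproduce the paper's experiments or its reported accuracy. The two features (a recent goal-difference average and a home-game flag), the toy matches, and the names `knn_predict` and `accuracy` are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], str]  # (feature vector, outcome label)


def knn_predict(train: List[Example], query: Sequence[float], k: int = 3) -> str:
    """Predict the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train: List[Example], test: List[Example], k: int = 3) -> float:
    """Fraction of held-out examples whose predicted label matches the true one."""
    correct = sum(1 for x, y in test if knn_predict(train, x, k) == y)
    return correct / len(test)


# Hypothetical toy matches: (recent goal-difference average, home-game flag).
train_set: List[Example] = [
    ((2.0, 1.0), "win"), ((1.5, 1.0), "win"), ((1.8, 0.0), "win"),
    ((0.0, 1.0), "draw"), ((0.1, 0.0), "draw"), ((-0.2, 1.0), "draw"),
    ((-2.0, 0.0), "lose"), ((-1.5, 0.0), "lose"), ((-1.7, 1.0), "lose"),
]
test_set: List[Example] = [
    ((1.9, 1.0), "win"), ((0.05, 0.0), "draw"), ((-1.8, 0.0), "lose"),
]

knn_accuracy = accuracy(train_set, test_set, k=3)
print(f"k-NN holdout accuracy on toy data: {knn_accuracy:.2f}")
```

A comparative study would compute the same accuracy figure for each technique on the same held-out matches and report the best one.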
Automatic mine detection based on multiple features
NASA Astrophysics Data System (ADS)
Yu, Ssu-Hsin; Gandhe, Avinash; Witten, Thomas R.; Mehra, Raman K.
2000-08-01
Recent research sponsored by the Army, Navy and DARPA has significantly advanced the sensor technologies for mine detection. Several innovative sensor systems have been developed and prototypes were built to investigate their performance in practice. Most of the research has been focused on hardware design. However, in order for the systems to be in wide use instead of in limited use by a small group of well-trained experts, an automatic process for mine detection is needed to make the final decision process on mine vs. no mine easier and more straightforward. In this paper, we describe an automatic mine detection process consisting of three stages: (1) signal enhancement, (2) pixel-level mine detection, and (3) object-level mine detection. The final output of the system is a confidence measure that quantifies the presence of a mine. The resulting system was applied to real data collected using radar and acoustic technologies.
ERIC Educational Resources Information Center
Liu, Ran; Koedinger, Kenneth R.
2017-01-01
As the use of educational technology becomes more ubiquitous, an enormous amount of learning process data is being produced. Educational data mining seeks to analyze and model these data, with the ultimate goal of improving learning outcomes. The most firmly grounded and rigorous evaluation of an educational data mining discovery is whether it…
Large Scale Data Mining to Improve Usability of Data: An Intelligent Archive Testbed
NASA Technical Reports Server (NTRS)
Ramapriyan, Hampapuram; Isaac, David; Yang, Wenli; Morse, Steve
2005-01-01
Research in certain scientific disciplines - including Earth science, particle physics, and astrophysics - continually faces the challenge that the volume of data needed to perform valid scientific research can at times overwhelm even a sizable research community. The desire to improve utilization of this data gave rise to the Intelligent Archives project, which seeks to make data archives active participants in a knowledge building system capable of discovering events or patterns that represent new information or knowledge. Data mining can automatically discover patterns and events, but it is generally viewed as unsuited for large-scale use in disciplines like Earth science that routinely involve very high data volumes. Dozens of research projects have shown promising uses of data mining in Earth science, but all of these are based on experiments with data subsets of a few gigabytes or less, rather than the terabytes or petabytes typically encountered in operational systems. To bridge this gap, the Intelligent Archives project is establishing a testbed with the goal of demonstrating the use of data mining techniques in an operationally-relevant environment. This paper discusses the goals of the testbed and the design choices surrounding critical issues that arose during testbed implementation.
Hydrologic data for Leviathan Mine and vicinity, Alpine County, California, 1981-83
Hammermeister, D.P.; Walmsley, S.J.
1985-01-01
The U.S. Geological Survey collected basic hydrologic and water-quality data during 1981-83 to facilitate the geohydrologic evaluation of the Leviathan Mine area and the design of a pollution-abatement project. Surface-water field data included one or more measurements of pH, water temperature, and specific conductance at 45 sites in and adjacent to the mine area. At nine of these sites, daily data on discharge, specific conductance, and water temperature were collected during parts of 1981-82 by using electronic monitor-recorder systems. Ground-water field data included one or more of the water-quality measurements listed above at 71 piezometers in the mine area. Borehole geophysical data included neutron-moisture, neutron-porosity, gamma-gamma density, natural gamma, and temperature logs at three sites. Mineralogic and hydrologic data were obtained for cores taken from nine test holes. One or more surface-water samples from 26 sites were analyzed for major cations, major anions, and a wide range of minor inorganic constituents. Single ground-water samples from 36 piezometers were analyzed for the same array of major and minor constituents. (USGS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hiroshi Saito; Tomihiro Taki
2013-07-01
Ningyo-toge Uranium Mine is subject to environmental remediation. The main purposes are to take measures to ensure radiation protection from exposure pathways to humans in the future, and to prevent the occurrence of mining pollution. The Yotsugi Mill Tailings Pond in the Ningyo-toge Uranium Mine has deposited mining waste and impounded water as a buffer reservoir before it is transferred to the Water Treatment Facility. It is located upstream of the water-source river, and because the impact on its environment in case of an earthquake is estimated to be significant, it has been given the highest priority among mine-related facilities in the Mine. So far, the basic concept has been examined and a large amount of data has been acquired, and using the data, some remediation activities have already been carried out, including capping construction for the upstream part of the Mill Tailings Pond. The capping is to reduce rainwater penetration to lower the burden of water treatment, and to reduce radon exhalation and dose rates. Only natural materials are used to alleviate future maintenance. Data, including settlement amount and underground temperature, are now being acquired and accumulated to verify the effectiveness of the capping, and will be used for the future remediation of the downstream part, with revision of its specifications if necessary. (authors)
Tube bundle system: for monitoring of coal mine atmosphere.
Zipf, R Karl; Marchewka, W; Mohamed, K; Addis, J; Karnack, F
2013-05-01
A tube bundle system (TBS) is a mechanical system for continuously drawing gas samples through tubes from multiple monitoring points located in an underground coal mine. The gas samples are drawn via vacuum pump to the surface and are typically analyzed for oxygen, methane, carbon dioxide and carbon monoxide. Results of the gas analyses are displayed and recorded for further analysis. Trends in the composition of the mine atmosphere, such as increasing methane or carbon monoxide concentration, can be detected early, permitting rapid intervention that prevents problems, such as a potentially explosive atmosphere behind seals, fire or spontaneous combustion. TBS is a well-developed technology and has been used in coal mines around the world for more than 50 years. Most longwall coal mines in Australia deploy a TBS, usually with 30 to 40 monitoring points as part of their atmospheric monitoring. The primary uses of a TBS are detecting spontaneous combustion and maintaining sealed areas inert. The TBS might also provide mine atmosphere gas composition data after a catastrophe occurs in an underground mine, if the sampling tubes are not damaged. TBSs are not an alternative to statutory gas and ventilation airflow monitoring by electronic sensors or people; rather, they are an option to consider in an overall mine atmosphere monitoring strategy. This paper describes the hardware, software and operation of a TBS and presents one example of typical data from a longwall coal mine.
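The early trend detection mentioned above can be sketched as a least-squares slope fitted to recent readings at each monitoring point. This is only an illustrative assumption about how such a trend might be flagged, not the software described in the paper. The point names, the readings, and the threshold are hypothetical and are not safety limits.

```python
from typing import Dict, List, Sequence, Tuple


def slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def rising_points(
    readings: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    threshold: float,
) -> List[str]:
    """Names of monitoring points whose fitted slope exceeds the threshold."""
    return sorted(
        name
        for name, (times, values) in readings.items()
        if slope(times, values) > threshold
    )


# Hypothetical carbon monoxide readings (ppm) sampled hourly at two points.
co_readings = {
    "seal_A": ([0, 1, 2, 3], [5, 6, 8, 11]),
    "seal_B": ([0, 1, 2, 3], [5, 5, 4, 5]),
}

# Hypothetical threshold in ppm per hour, chosen only for the example.
flagged = rising_points(co_readings, threshold=0.5)
print(flagged)
```

In practice the window length and threshold would depend on the gas, the sampling delay through the tubes, and the mine's own monitoring strategy.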
NASA Astrophysics Data System (ADS)
Ma, Kevin C.; Forsyth, Sydney; Amezcua, Lilyana; Liu, Brent J.
2017-03-01
We have designed and developed a multiple sclerosis eFolder system for patient data storage, image viewing, and automatic lesion quantification results to allow patient tracking. The web-based system aims to be integrated in DICOM-compliant clinical and research environments to aid clinicians in patient treatments and data analysis. The system quantifies lesion volumes and identifies and registers lesion locations to track shifts in the volume and quantity of lesions in a longitudinal study. We aim to evaluate the two most important features of the system, data mining and longitudinal lesion tracking, to demonstrate the MS eFolder's capability in improving clinical workflow efficiency and outcome analysis for research. In order to evaluate data mining capabilities, we have collected radiological and neurological data from 72 patients, 36 Caucasian and 36 Hispanic, matched by gender, disease duration, and age. Data analysis on those patients based on ethnicity is performed, and analysis results are displayed by the system's web-based user interface. The data mining module is able to successfully separate Hispanic and Caucasian patients and compare their disease profiles. For longitudinal lesion tracking, we have collected 4 longitudinal cases and simulated different lesion growths over the next year. As a result, the eFolder is able to detect changes in lesion volume and identify the lesions with the most changes. Data mining and lesion tracking evaluation results show the high potential of the eFolder's usefulness in patient care and informatics research for multiple sclerosis.
30 CFR 282.21 - Plans, general.
Code of Federal Regulations, 2011 CFR
2011-07-01
... Resources BUREAU OF OCEAN ENERGY MANAGEMENT, REGULATION, AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR... provide comments on proposed Delineation, Testing, and Mining Plans and any proposal for a significant... Mining Plan if the lessee has sufficient data and information on which to base a Testing or Mining Plan...
15 CFR 970.701 - Significant adverse environmental effects.
Code of Federal Regulations, 2014 CFR
2014-01-01
... REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES... effects of deep seabed mining which cumulatively during commercial recovery have the potential for significant effect. These three effects also occur during mining system tests that may be conducted under a...
15 CFR 970.701 - Significant adverse environmental effects.
Code of Federal Regulations, 2013 CFR
2013-01-01
... REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES... effects of deep seabed mining which cumulatively during commercial recovery have the potential for significant effect. These three effects also occur during mining system tests that may be conducted under a...
15 CFR 970.701 - Significant adverse environmental effects.
Code of Federal Regulations, 2012 CFR
2012-01-01
... REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES... effects of deep seabed mining which cumulatively during commercial recovery have the potential for significant effect. These three effects also occur during mining system tests that may be conducted under a...
Remote sensing for mined area reclamation: Application inventory
NASA Technical Reports Server (NTRS)
1971-01-01
Applications of aerial remote sensing to coal mined area reclamation are documented, and information concerning available data banks for coal producing areas in the east and midwest is given. A summary of mined area information requirements to which remote sensing methods might contribute is included.
15 CFR 970.701 - Significant adverse environmental effects.
Code of Federal Regulations, 2010 CFR
2010-01-01
... REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES... effects of deep seabed mining which cumulatively during commercial recovery have the potential for significant effect. These three effects also occur during mining system tests that may be conducted under a...
15 CFR 970.701 - Significant adverse environmental effects.
Code of Federal Regulations, 2011 CFR
2011-01-01
... REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES... effects of deep seabed mining which cumulatively during commercial recovery have the potential for significant effect. These three effects also occur during mining system tests that may be conducted under a...
Data Mining for Web-Based Support Systems: A Case Study in e-Custom Systems
NASA Astrophysics Data System (ADS)
Razmerita, Liana; Kirchner, Kathrin
This chapter provides an example of a Web-based support system (WSS) used to streamline trade procedures, prevent potential security threats, and reduce tax-related fraud in cross-border trade. The architecture is based on a service-oriented architecture that includes smart seals and Web services. We discuss the implications and suggest further enhancements to demonstrate how such systems can move toward a Web-based decision support system with the support of data mining methods. We provide a concrete example of how data mining can help to analyze the vast amount of data collected while monitoring the container movements along its supply chain.
A Framework for Web Usage Mining in Electronic Government
NASA Astrophysics Data System (ADS)
Zhou, Ping; Le, Zhongjian
Web usage mining has been a major component of management strategy to enhance organizational analysis and decision-making. The literature on Web usage mining that deals with strategies and technologies for effectively employing Web usage mining is quite vast. In recent years, E-government has received much attention from researchers and practitioners. Huge amounts of user access data are produced on E-government Web sites every day. The role of these data in the success of government management cannot be overstated because they affect government analysis, prediction, and strategic, tactical, and operational planning and control. Web usage mining in E-government has an important role to play in setting government objectives, discovering citizen behavior, and determining future courses of action. Web usage mining in E-government has not received adequate attention from researchers or practitioners. We developed a framework to promote a better understanding of the importance of Web usage mining in E-government. Using the current literature, we developed the framework presented herein, in hopes that it would stimulate more interest in this important area.