Methods of Sparse Modeling and Dimensionality Reduction to Deal with Big Data
2015-04-01
supervised learning (c). Our framework consists of two separate phases: (a) first find an initial space in an unsupervised manner; then (b) utilize label... model that can learn thousands of topics from a large set of documents and infer the topic mixture of each document, 2) a supervised dimension reduction... (i) a method of supervised
Raptor: An Enterprise Knowledge Discovery Engine Version 2.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
2011-08-31
The Raptor Version 2.0 computer code uses a set of documents as seed documents to recommend documents of interest from a large target set of documents. The computer code provides results that show the recommended documents with the highest similarity to the seed documents. Version 2.0 was specifically developed to work with SharePoint 2007 and MS SQL Server.
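The abstract describes Raptor's behavior (seed documents ranked against a target set by similarity) but not its implementation. As a rough, hedged sketch of that kind of seed-based recommendation — not Raptor's actual code, which targets SharePoint 2007 and MS SQL Server — the ranking step might look like the following; the function name `recommend` and the TF-IDF/cosine choices are assumptions.

```python
# Illustrative sketch only (not Raptor's implementation): rank a target corpus
# by cosine similarity to a set of seed documents using TF-IDF vectors.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def recommend(seed_docs, target_docs, top_k=10):
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fit on the combined collection so seeds and targets share one vocabulary.
    matrix = vectorizer.fit_transform(list(seed_docs) + list(target_docs))
    seed_vecs = matrix[:len(seed_docs)]
    target_vecs = matrix[len(seed_docs):]
    # Score each target document by its best similarity to any seed document.
    scores = cosine_similarity(target_vecs, seed_vecs).max(axis=1)
    ranked = np.argsort(scores)[::-1][:top_k]
    return [(int(i), float(scores[i])) for i in ranked]
```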
DocCube: Multi-Dimensional Visualization and Exploration of Large Document Sets.
ERIC Educational Resources Information Center
Mothe, Josiane; Chrisment, Claude; Dousset, Bernard; Alaux, Joel
2003-01-01
Describes a user interface that provides global visualizations of large document sets to help users formulate the query that corresponds to their information needs. Highlights include concept hierarchies that users can browse to specify and refine information needs; knowledge discovery in databases and texts; and multidimensional modeling.…
A Metadata Element Set for Project Documentation
NASA Technical Reports Server (NTRS)
Hodge, Gail; Templeton, Clay; Allen, Robert B.
2003-01-01
NASA Goddard Space Flight Center is a large engineering enterprise with many projects. We describe our efforts to develop standard metadata sets across project documentation which we term the "Goddard Core". We also address broader issues for project management metadata.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choo, Jaegul; Kim, Hannah; Clarkson, Edward
2018-01-31
In this paper, we present an interactive visual information retrieval and recommendation system, called VisIRR, for large-scale document discovery. VisIRR effectively combines the paradigms of (1) a passive pull through query processes for retrieval and (2) an active push that recommends items of potential interest to users based on their preferences. Equipped with an efficient dynamic query interface against a large-scale corpus, VisIRR organizes the retrieved documents into high-level topics and visualizes them in a 2D space, representing the relationships among the topics along with their keyword summary. In addition, based on interactive personalized preference feedback with regard to documents, VisIRR provides document recommendations from the entire corpus, which are beyond the retrieved sets. Such recommended documents are visualized in the same space as the retrieved documents, so that users can seamlessly analyze both existing and newly recommended ones. This article presents novel computational methods, which make these integrated representations and fast interactions possible for a large-scale document corpus. We illustrate how the system works by providing detailed usage scenarios. Finally, we present preliminary user study results for evaluating the effectiveness of the system.
Taming Big Data: An Information Extraction Strategy for Large Clinical Text Corpora.
Gundlapalli, Adi V; Divita, Guy; Carter, Marjorie E; Redd, Andrew; Samore, Matthew H; Gupta, Kalpana; Trautner, Barbara
2015-01-01
Concepts of interest for clinical and research purposes are not uniformly distributed in clinical text available in electronic medical records. The purpose of our study was to identify filtering techniques to select 'high yield' documents for increased efficacy and throughput. Using two large corpora of clinical text, we demonstrate the identification of 'high yield' document sets in two unrelated domains: homelessness and indwelling urinary catheters. For homelessness, the high yield set includes homeless program and social work notes. For urinary catheters, concepts were more prevalent in notes from hospitalized patients; nursing notes accounted for a majority of the high yield set. This filtering will enable customization and refining of information extraction pipelines to facilitate extraction of relevant concepts for clinical decision support and other uses.
Document Set Differentiability Analyzer v. 0.1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osborn, Thor D.
The software is a JMP Scripting Language (JSL) script designed to evaluate the differentiability of a set of documents that exhibit some conceptual commonalities but are expected to describe substantially different – thus differentiable – categories. The script imports the document set, a subset of which may be partitioned into an additions pool. The bulk of the documents form a basis pool. Text analysis is applied to the basis pool to extract a mathematical representation of its conceptual content, referred to as the document concept space. A bootstrapping approach is applied to that mathematical representation in order to generate a representation of a large population of randomly designed documents that could be written within the concept space, notably without actually writing the text of those documents. The Kolmogorov-Smirnov test is applied to determine whether the basis pool document set exhibits superior differentiation relative to the randomly designed virtual documents produced by bootstrapping. If an additions pool exists, the documents are incrementally added to the basis pool, choosing the best differentiated remaining document at each step. In this manner the impact of additional categories on overall document set differentiability may be assessed. The software was developed to assess the differentiability of job description document sets. Differentiability is key to meaningful categorization. Poor job differentiation may have economic, ethical, and/or legal implications for an organization. Job categories are used in the assignment of market-based salaries; consequently, poor differentiation of job duties may set the stage for legal challenges if very similar jobs pay differently depending on title, a circumstance that also invites economic waste. The software can be applied to ensure job description set differentiability, reducing legal, economic, and ethical risks to an organization and its people. The extraction of the conceptual space to a mathematical representation enables identification of exceedingly similar documents. In the event of redundancy, two jobs may be collapsed into one. If in the judgment of the subject matter experts the jobs are truly different, the conceptual similarities are highlighted, inviting inclusion of appropriate descriptive content to explicitly characterize those differences. When additional job categories may be needed as the organization changes, the software enables evaluation of proposed additions to ensure that the resulting document set remains adequately differentiated.
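The differentiability test described above can be illustrated with a short, hedged sketch. The actual tool is a JMP/JSL script whose internals are not given here; the Python below only mimics the idea of comparing real-document dissimilarities against bootstrapped "randomly designed" documents with a Kolmogorov-Smirnov test, and the function name, resampling scheme, and cosine metric are assumptions.

```python
# Conceptual sketch only (the real tool is a JSL script; this is not its code).
import numpy as np
from scipy.stats import ks_2samp
from scipy.spatial.distance import pdist

def differentiability(doc_vectors, n_random=1000, seed=0):
    """doc_vectors: (n_docs, n_terms) term-weight matrix for the basis pool."""
    rng = np.random.default_rng(seed)
    real_dist = pdist(doc_vectors, metric="cosine")   # real pairwise dissimilarities
    # Bootstrap "randomly designed" documents: resample each term weight
    # independently from the values observed across the basis pool.
    random_docs = np.column_stack([
        rng.choice(doc_vectors[:, j], size=n_random)
        for j in range(doc_vectors.shape[1])
    ])
    random_dist = pdist(random_docs, metric="cosine")
    # KS test: do the two dissimilarity distributions differ, i.e. are real
    # documents better differentiated than documents written "at random"?
    return ks_2samp(real_dist, random_dist)
```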
Software Manages Documentation in a Large Test Facility
NASA Technical Reports Server (NTRS)
Gurneck, Joseph M.
2001-01-01
The 3MCS computer program assists an instrumentation engineer in performing the three essential functions of design, documentation, and configuration management of measurement and control systems in a large test facility. Services provided by 3MCS are acceptance of input from multiple engineers and technicians working at multiple locations; standardization of drawings; automated cross-referencing; identification of errors; listing of components and resources; downloading of test settings; and provision of information to customers.
An interactive environment for the analysis of large Earth observation and model data sets
NASA Technical Reports Server (NTRS)
Bowman, Kenneth P.; Walsh, John E.; Wilhelmson, Robert B.
1993-01-01
We propose to develop an interactive environment for the analysis of large Earth science observation and model data sets. We will use a standard scientific data storage format and a large capacity (greater than 20 GB) optical disk system for data management; develop libraries for coordinate transformation and regridding of data sets; modify the NCSA X Image and X DataSlice software for typical Earth observation data sets by including map transformations and missing data handling; develop analysis tools for common mathematical and statistical operations; integrate the components described above into a system for the analysis and comparison of observations and model results; and distribute software and documentation to the scientific community.
An interactive environment for the analysis of large Earth observation and model data sets
NASA Technical Reports Server (NTRS)
Bowman, Kenneth P.; Walsh, John E.; Wilhelmson, Robert B.
1992-01-01
We propose to develop an interactive environment for the analysis of large Earth science observation and model data sets. We will use a standard scientific data storage format and a large capacity (greater than 20 GB) optical disk system for data management; develop libraries for coordinate transformation and regridding of data sets; modify the NCSA X Image and X Data Slice software for typical Earth observation data sets by including map transformations and missing data handling; develop analysis tools for common mathematical and statistical operations; integrate the components described above into a system for the analysis and comparison of observations and model results; and distribute software and documentation to the scientific community.
Three-Dimensional Display of Document Set
Lantrip, David B.; Pennock, Kelly A.; Pottier, Marc C.; Schur, Anne; Thomas, James J.; Wise, James A.
2003-06-24
A method for spatializing text content for enhanced visual browsing and analysis. The invention is applied to large text document corpora such as digital libraries, regulations and procedures, archived reports, and the like. The text content from these sources may be transformed to a spatial representation that preserves informational characteristics from the documents. The three-dimensional representation may then be visually browsed and analyzed in ways that avoid language processing and that reduce the analysts' effort.
Three-dimensional display of document set
Lantrip, David B [Oxnard, CA; Pennock, Kelly A [Richland, WA; Pottier, Marc C [Richland, WA; Schur, Anne [Richland, WA; Thomas, James J [Richland, WA; Wise, James A [Richland, WA
2006-09-26
A method for spatializing text content for enhanced visual browsing and analysis. The invention is applied to large text document corpora such as digital libraries, regulations and procedures, archived reports, and the like. The text content from these sources may be transformed to a spatial representation that preserves informational characteristics from the documents. The three-dimensional representation may then be visually browsed and analyzed in ways that avoid language processing and that reduce the analysts' effort.
Three-dimensional display of document set
Lantrip, David B [Oxnard, CA; Pennock, Kelly A [Richland, WA; Pottier, Marc C [Richland, WA; Schur, Anne [Richland, WA; Thomas, James J [Richland, WA; Wise, James A [Richland, WA
2001-10-02
A method for spatializing text content for enhanced visual browsing and analysis. The invention is applied to large text document corpora such as digital libraries, regulations and procedures, archived reports, and the like. The text content from these sources may be transformed to a spatial representation that preserves informational characteristics from the documents. The three-dimensional representation may then be visually browsed and analyzed in ways that avoid language processing and that reduce the analysts' effort.
Three-dimensional display of document set
Lantrip, David B [Oxnard, CA; Pennock, Kelly A [Richland, WA; Pottier, Marc C [Richland, WA; Schur, Anne [Richland, WA; Thomas, James J [Richland, WA; Wise, James A [Richland, WA; York, Jeremy [Bothell, WA
2009-06-30
A method for spatializing text content for enhanced visual browsing and analysis. The invention is applied to large text document corpora such as digital libraries, regulations and procedures, archived reports, and the like. The text content from these sources may be transformed to a spatial representation that preserves informational characteristics from the documents. The three-dimensional representation may then be visually browsed and analyzed in ways that avoid language processing and that reduce the analysts' effort.
SureChEMBL: a large-scale, chemically annotated patent document database.
Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P
2016-01-04
SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
SureChEMBL: a large-scale, chemically annotated patent document database
Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A.; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P.
2016-01-01
SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922
High-Reproducibility and High-Accuracy Method for Automated Topic Classification
NASA Astrophysics Data System (ADS)
Lancichinetti, Andrea; Sirer, M. Irmak; Wang, Jane X.; Acuna, Daniel; Körding, Konrad; Amaral, Luís A. Nunes
2015-01-01
Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requires algorithms that extract and record metadata on unstructured text documents. Assigning topics to documents will enable intelligent searching, statistical characterization, and meaningful classification. Latent Dirichlet allocation (LDA) is the state of the art in topic modeling. Here, we perform a systematic theoretical and numerical analysis that demonstrates that current optimization techniques for LDA often yield results that are not accurate in inferring the most suitable model parameters. Adapting approaches from community detection in networks, we propose a new algorithm that displays high reproducibility and high accuracy and also has high computational efficiency. We apply it to a large set of documents in the English Wikipedia and reveal its hierarchical structure.
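For readers unfamiliar with the LDA baseline the paper analyzes, the minimal sketch below shows standard topic inference with scikit-learn on a made-up four-document corpus; it does not reproduce the authors' network/community-detection algorithm.

```python
# Standard LDA baseline sketch (scikit-learn); the toy corpus is invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["the cat sat on the mat", "stock markets fell sharply today",
        "the dog chased the cat", "investors sold shares as markets dropped"]

X = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
print(lda.transform(X).round(2))   # per-document topic mixtures
```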
Teacher Effects and Teacher-Related Policies
ERIC Educational Resources Information Center
Jackson, C. Kirabo; Rockoff, Jonah E.; Staiger, Douglas O.
2014-01-01
The emergence of large longitudinal data sets linking students to teachers has led to rapid growth in the study of teacher effects on student outcomes by economists over the past decade. One large literature has documented wide variation in teacher effectiveness that is not well explained by observable student or teacher characteristics. A second…
Summers, Alexander; Ruderman, Carly; Leung, Fok-Han; Slater, Morgan
2017-09-22
Studies in the United States have shown that physicians commonly use brand names when documenting medications in an outpatient setting. However, the prevalence of prescribing and documenting brand-name medications has not been assessed in a clinical teaching environment. The purpose of this study was to describe the use of generic versus brand names for a select number of pharmaceutical products in clinical documentation in a large, urban academic family practice centre. A retrospective chart review of the electronic medical records of the St. Michael's Hospital Academic Family Health Team (SMHAFHT) was conducted. Data for twenty commonly prescribed medications were collected from the Cumulative Patient Profile as of August 1, 2014. Each medication name was classified as generic or trade. Associations between documentation patterns and physician characteristics were assessed. Among 9763 patients prescribed any of the twenty medications of interest, 45% of patient charts contained trade nomenclature exclusively, 32% contained only generic nomenclature, and 23% contained a mix of generic and trade nomenclature. There was large variation in use of generic nomenclature amongst physicians, ranging from 19% to 93%. Trade names in clinical documentation, which likely reflect prescribing habits, continue to be used abundantly in the academic setting. This may become part of the informal curriculum, potentially facilitating undue bias in trainees. Further study is needed to determine characteristics which influence use of generic or trade nomenclature and the impact of this trend on trainees' clinical knowledge and decision-making.
Ada Structure Design Language (ASDL)
NASA Technical Reports Server (NTRS)
Chedrawi, Lutfi
1986-01-01
An artist acquires all the necessary tools before painting a scene. In the same analogy, a software engineer needs the necessary tools to provide their design with the proper means for implementation. Ada provides these tools. Yet, as an artist's painting needs a brochure to accompany it for further explanation of the scene, an Ada design also needs a document along with it to show the design in its detailed structure and hierarchical order. Ada could be self-explanatory in small programs not exceeding fifty lines of code in length. But, in a large environment, ranging from thousands of lines and above, Ada programs need to be well documented to be preserved and maintained. The language used to specify an Ada document is called Ada Structure Design Language (ASDL). This language sets some rules to help derive a well-formatted Ada detailed design document. The rules are defined to meet the needs of a project manager, a maintenance team, a programmer and a systems designer. The design document templates, the document extractor, and the rules set forth by the ASDL are explained in detail.
[Development of an ophthalmological clinical information system for inpatient eye clinics].
Kortüm, K U; Müller, M; Babenko, A; Kampik, A; Kreutzer, T C
2015-12-01
In times of increased digitalization in healthcare, departments of ophthalmology are faced with the challenge of introducing electronic clinical health records (EHR); however, specialized software for ophthalmology is not available with most major EHR systems. The aim of this project was to create specific ophthalmological user interfaces for large inpatient eye care providers within a hospital-wide EHR. Additionally, the integration of ophthalmic imaging systems, scheduling and surgical documentation should be achieved. The existing EHR i.s.h.med (Siemens, Germany) was modified using advanced business application programming (ABAP) language to create specific ophthalmological user interfaces to reproduce and further optimize the clinical workflow. A user interface for documentation of ambulatory patients with eight tabs was designed. From June 2013 to October 2014 a total of 61,551 patient contact details were documented. For surgical documentation a separate user interface was set up. User interfaces for digital clinical orders and for the documentation of registration and scheduling of operations were also set up. A direct integration of ophthalmic imaging modalities could be established. An ophthalmologist-orientated EHR for outpatient and surgical documentation for inpatient clinics was created and successfully implemented. By incorporation of imaging procedures the foundation of future smart/big data analyses was created.
Stein, Gary L; Cagle, John G; Christ, Grace H
2017-03-01
Few data are available describing the involvement and activities of social workers in advance care planning (ACP). We sought to provide data about (1) social worker involvement and leadership in ACP conversations with patients and families; and (2) the extent of functions and activities when these discussions occur. We conducted a large web-based survey of social workers employed in hospice, palliative care, and related settings to explore their role, participation, and self-rated competency in facilitating ACP discussions. Respondents were recruited through the Social Work Hospice and Palliative Care Network and the National Hospice and Palliative Care Organization. Descriptive analyses were conducted on the full sample of respondents (N = 641) and a subsample of clinical social workers (N = 456). Responses were analyzed to explore differences in ACP involvement by practice setting. Most clinical social workers (96%) reported that social workers in their department are conducting ACP discussions with patients/families. Majorities also participate in, and lead, ACP discussions (69% and 60%, respectively). Most respondents report that social workers are responsible for educating patients/families about ACP options (80%) and are the team members responsible for documenting ACP (68%). Compared with other settings, oncology and inpatient palliative care social workers were less likely to be responsible for ensuring that patients/families are informed of ACP options and documenting ACP preferences. Social workers are prominently involved in facilitating, leading, and documenting ACP discussions. Policy-makers, administrators, and providers should incorporate the vital contributions of social work professionals in policies and programs supporting ACP.
Document cards: a top trumps visualization for documents.
Strobelt, Hendrik; Oelke, Daniela; Rohrdantz, Christian; Stoffel, Andreas; Keim, Daniel A; Deussen, Oliver
2009-01-01
Finding suitable, less space-consuming views for a document's main content is crucial to provide convenient access to large document collections on display devices of different size. We present a novel compact visualization which represents the document's key semantics as a mixture of images and important key terms, similar to cards in a top trumps game. The key terms are extracted using an advanced text mining approach based on a fully automatic document structure extraction. The images and their captions are extracted using a graphical heuristic and the captions are used for a semi-semantic image weighting. Furthermore, we use the image color histogram for classification and show at least one representative from each non-empty image class. The approach is demonstrated for the IEEE InfoVis publications of a complete year. The method can easily be applied to other publication collections and sets of documents which contain images.
Design and development of an ancient Chinese document recognition system
NASA Astrophysics Data System (ADS)
Peng, Liangrui; Xiu, Pingping; Ding, Xiaoqing
2003-12-01
The digitization of ancient Chinese documents presents new challenges to the OCR (Optical Character Recognition) research field due to the large character set of ancient Chinese characters, variant font types, and versatile document layout styles, as these documents are historical reflections of the thousands of years of Chinese civilization. After analyzing the general characteristics of ancient Chinese documents, we present a solution for recognition of ancient Chinese documents with regular font types and layout styles. Based on previous work on multilingual OCR in the TH-OCR system, we focus on the design and development of two key technologies: character recognition and page segmentation. Experimental results show that the developed character recognition kernel of 19,635 Chinese characters outperforms our original traditional Chinese recognition kernel; a benchmark test on printed ancient Chinese books shows that the proposed system is effective for regular ancient Chinese documents.
Radial sets: interactive visual analysis of large overlapping sets.
Alsallakh, Bilal; Aigner, Wolfgang; Miksch, Silvia; Hauser, Helwig
2013-12-01
In many applications, data tables contain multi-valued attributes that often store the memberships of the table entities to multiple sets such as which languages a person masters, which skills an applicant documents, or which features a product comes with. With a growing number of entities, the resulting element-set membership matrix becomes very rich in information about how these sets overlap. Many analysis tasks targeted at set-typed data are concerned with these overlaps as salient features of such data. This paper presents Radial Sets, a novel visual technique to analyze set memberships for a large number of elements. Our technique uses frequency-based representations to enable quickly finding and analyzing different kinds of overlaps between the sets, and relating these overlaps to other attributes of the table entities. Furthermore, it enables various interactions to select elements of interest, find out if they are over-represented in specific sets or overlaps, and if they exhibit a different distribution for a specific attribute compared to the rest of the elements. These interactions allow formulating highly-expressive visual queries on the elements in terms of their set memberships and attribute values. As we demonstrate via two usage scenarios, Radial Sets enable revealing and analyzing a multitude of overlapping patterns between large sets, beyond the limits of state-of-the-art techniques.
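The frequency-based representations that Radial Sets builds on can be illustrated with a small sketch; the membership matrix below is invented, and the system's interactive visualization is of course not reproduced here.

```python
# Sketch of the basic frequencies behind a Radial Sets style analysis:
# set sizes, element degrees, and pairwise overlaps from a binary
# element-set membership matrix (rows = elements, columns = sets).
import numpy as np

membership = np.array([[1, 0, 1],    # element 0 belongs to sets A and C
                       [1, 1, 0],
                       [0, 1, 1],
                       [1, 1, 1]])
set_names = ["A", "B", "C"]

set_sizes = membership.sum(axis=0)      # elements per set
degrees = membership.sum(axis=1)        # sets per element
overlaps = membership.T @ membership    # pairwise overlap counts

print(dict(zip(set_names, set_sizes)))
print("overlap matrix:\n", overlaps)
print("elements in exactly one set:", int((degrees == 1).sum()))
```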
Decadal climate prediction in the large ensemble limit
NASA Astrophysics Data System (ADS)
Yeager, S. G.; Rosenbloom, N. A.; Strand, G.; Lindsay, K. T.; Danabasoglu, G.; Karspeck, A. R.; Bates, S. C.; Meehl, G. A.
2017-12-01
In order to quantify the benefits of initialization for climate prediction on decadal timescales, two parallel sets of historical simulations are required: one "initialized" ensemble that incorporates observations of past climate states and one "uninitialized" ensemble whose internal climate variations evolve freely and without synchronicity. In the large ensemble limit, ensemble averaging isolates potentially predictable forced and internal variance components in the "initialized" set, but only the forced variance remains after averaging the "uninitialized" set. The ensemble size needed to achieve this variance decomposition, and to robustly distinguish initialized from uninitialized decadal predictions, remains poorly constrained. We examine a large ensemble (LE) of initialized decadal prediction (DP) experiments carried out using the Community Earth System Model (CESM). This 40-member CESM-DP-LE set of experiments represents the "initialized" complement to the CESM large ensemble of 20th century runs (CESM-LE) documented in Kay et al. (2015). Both simulation sets share the same model configuration, historical radiative forcings, and large ensemble sizes. The twin experiments afford an unprecedented opportunity to explore the sensitivity of DP skill assessment, and in particular the skill enhancement associated with initialization, to ensemble size. This talk will highlight the benefits of a large ensemble size for initialized predictions of seasonal climate over land in the Atlantic sector as well as predictions of shifts in the likelihood of climate extremes that have large societal impact.
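The variance decomposition described above — ensemble averaging retains forced plus predictable internal variance in the initialized set, but only forced variance in the uninitialized set — can be illustrated schematically. The synthetic series, ensemble size, and noise levels below are assumptions for illustration only, not CESM output.

```python
# Schematic numpy sketch of initialized vs. uninitialized ensemble averaging.
import numpy as np

rng = np.random.default_rng(1)
n_members, n_years = 40, 60
forced = np.linspace(0.0, 1.0, n_years)                       # shared forced trend
internal = 0.5 * np.sin(np.linspace(0, 6 * np.pi, n_years))   # internal variation

# Initialized members share the internal phase; uninitialized members do not.
initialized = forced + internal + rng.normal(0, 0.3, (n_members, n_years))
uninitialized = forced + 0.5 * np.sin(
    np.linspace(0, 6 * np.pi, n_years) + rng.uniform(0, 2 * np.pi, (n_members, 1))
) + rng.normal(0, 0.3, (n_members, n_years))

init_mean = initialized.mean(axis=0)        # forced + internal signal survives
uninit_mean = uninitialized.mean(axis=0)    # internal signal averages out
print("std of initialized ensemble mean:  ", init_mean.std().round(3))
print("std of uninitialized ensemble mean:", uninit_mean.std().round(3))
```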
Validating a strategy for psychosocial phenotyping using a large corpus of clinical text.
Gundlapalli, Adi V; Redd, Andrew; Carter, Marjorie; Divita, Guy; Shen, Shuying; Palmer, Miland; Samore, Matthew H
2013-12-01
To develop algorithms to improve efficiency of patient phenotyping using natural language processing (NLP) on text data. Of a large number of note titles available in our database, we sought to determine those with highest yield and precision for psychosocial concepts. From a database of over 1 billion documents from US Department of Veterans Affairs medical facilities, a random sample of 1500 documents from each of 218 enterprise note titles was chosen. Psychosocial concepts were extracted using a UIMA-AS-based NLP pipeline (v3NLP), using a lexicon of relevant concepts with negation and template format annotators. Human reviewers evaluated a subset of documents for false positives and sensitivity. High-yield documents were identified by hit rate and precision. Reasons for false positivity were characterized. A total of 58 707 psychosocial concepts were identified from 316 355 documents for an overall hit rate of 0.2 concepts per document (median 0.1, range 1.6-0). Of 6031 concepts reviewed from a high-yield set of note titles, the overall precision for all concept categories was 80%, with variability among note titles and concept categories. Reasons for false positivity included templating, negation, context, and alternate meaning of words. The sensitivity of the NLP system was noted to be 49% (95% CI 43% to 55%). Phenotyping using NLP need not involve the entire document corpus. Our methods offer a generalizable strategy for scaling NLP pipelines to large free text corpora with complex linguistic annotations in attempts to identify patients of a certain phenotype.
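A hedged sketch of the document-filtering arithmetic described above (hit rate per note title from NLP output, precision from a human-review sample) follows; the note titles, counts, and thresholds are hypothetical, and this is not the v3NLP pipeline itself.

```python
# Hypothetical example: rank note titles by concept hit rate and reviewer
# precision, then keep a "high yield" subset for downstream NLP processing.

# (note_title, concepts_found, true_positives_in_review, concepts_reviewed)
records = [
    ("social work note", 120, 45, 50),
    ("homeless program note", 200, 47, 50),
    ("progress note", 15, 20, 50),
]
docs_per_title = {"social work note": 300, "homeless program note": 350,
                  "progress note": 1500}

stats = {}
for title, n_concepts, tp, reviewed in records:
    hit_rate = n_concepts / docs_per_title[title]   # concepts per document
    precision = tp / reviewed                       # from the review sample
    stats[title] = (hit_rate, precision)

high_yield = [t for t, (h, p) in stats.items() if h >= 0.3 and p >= 0.8]
print(stats)
print("high-yield titles:", high_yield)
```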
Validating a strategy for psychosocial phenotyping using a large corpus of clinical text
Gundlapalli, Adi V; Redd, Andrew; Carter, Marjorie; Divita, Guy; Shen, Shuying; Palmer, Miland; Samore, Matthew H
2013-01-01
Objective To develop algorithms to improve efficiency of patient phenotyping using natural language processing (NLP) on text data. Of a large number of note titles available in our database, we sought to determine those with highest yield and precision for psychosocial concepts. Materials and methods From a database of over 1 billion documents from US Department of Veterans Affairs medical facilities, a random sample of 1500 documents from each of 218 enterprise note titles was chosen. Psychosocial concepts were extracted using a UIMA-AS-based NLP pipeline (v3NLP), using a lexicon of relevant concepts with negation and template format annotators. Human reviewers evaluated a subset of documents for false positives and sensitivity. High-yield documents were identified by hit rate and precision. Reasons for false positivity were characterized. Results A total of 58 707 psychosocial concepts were identified from 316 355 documents for an overall hit rate of 0.2 concepts per document (median 0.1, range 1.6–0). Of 6031 concepts reviewed from a high-yield set of note titles, the overall precision for all concept categories was 80%, with variability among note titles and concept categories. Reasons for false positivity included templating, negation, context, and alternate meaning of words. The sensitivity of the NLP system was noted to be 49% (95% CI 43% to 55%). Conclusions Phenotyping using NLP need not involve the entire document corpus. Our methods offer a generalizable strategy for scaling NLP pipelines to large free text corpora with complex linguistic annotations in attempts to identify patients of a certain phenotype. PMID:24169276
An interactive environment for the analysis of large Earth observation and model data sets
NASA Technical Reports Server (NTRS)
Bowman, Kenneth P.; Walsh, John E.; Wilhelmson, Robert B.
1994-01-01
Envision is an interactive environment that provides researchers in the earth sciences convenient ways to manage, browse, and visualize large observed or model data sets. Its main features are support for the netCDF and HDF file formats, an easy to use X/Motif user interface, a client-server configuration, and portability to many UNIX workstations. The Envision package also provides new ways to view and change metadata in a set of data files. It permits a scientist to conveniently and efficiently manage large data sets consisting of many data files. It also provides links to popular visualization tools so that data can be quickly browsed. Envision is a public domain package, freely available to the scientific community. Envision software (binaries and source code) and documentation can be obtained from either of these servers: ftp://vista.atmos.uiuc.edu/pub/envision/ and ftp://csrp.tamu.edu/pub/envision/. Detailed descriptions of Envision capabilities and operations can be found in the User's Guide and Reference Manuals distributed with Envision software.
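Envision itself is an X/Motif application; as a small, hedged illustration of the kind of netCDF metadata browsing it supports, the snippet below uses the netCDF4 Python package rather than Envision's own code, and "example.nc" is a placeholder filename.

```python
# Minimal sketch of inspecting netCDF metadata (one of the formats Envision manages).
from netCDF4 import Dataset

with Dataset("example.nc", "r") as nc:
    print("global attributes:", {k: getattr(nc, k) for k in nc.ncattrs()})
    for name, var in nc.variables.items():
        print(name, var.dimensions, var.shape, getattr(var, "units", "no units"))
```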
Parallel digital forensics infrastructure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liebrock, Lorie M.; Duggan, David Patrick
2009-10-01
This report documents the architecture and implementation of a Parallel Digital Forensics infrastructure. This infrastructure is necessary for supporting the design, implementation, and testing of new classes of parallel digital forensics tools. Digital Forensics has become extremely difficult with data sets of one terabyte and larger. The only way to overcome the processing time of these large sets is to identify and develop new parallel algorithms for performing the analysis. To support algorithm research, a flexible base infrastructure is required. A candidate architecture for this base infrastructure was designed, instantiated, and tested by this project, in collaboration with New Mexico Tech. Previous infrastructures were not designed and built specifically for the development and testing of parallel algorithms. With the size of forensics data sets only expected to increase significantly, this type of infrastructure support is necessary for continued research in parallel digital forensics. This report documents the architecture and implementation of the parallel digital forensics (PDF) infrastructure.
FacetGist: Collective Extraction of Document Facets in Large Technical Corpora.
Siddiqui, Tarique; Ren, Xiang; Parameswaran, Aditya; Han, Jiawei
2016-10-01
Given the large volume of technical documents available, it is crucial to automatically organize and categorize these documents to be able to understand and extract value from them. Towards this end, we introduce a new research problem called Facet Extraction. Given a collection of technical documents, the goal of Facet Extraction is to automatically label each document with a set of concepts for the key facets (e.g., application, technique, evaluation metrics, and dataset) that people may be interested in. Facet Extraction has numerous applications, including document summarization, literature search, patent search and business intelligence. The major challenge in performing Facet Extraction arises from multiple sources: concept extraction, concept to facet matching, and facet disambiguation. To tackle these challenges, we develop FacetGist, a framework for facet extraction. Facet Extraction involves constructing a graph-based heterogeneous network to capture information available across multiple local sentence-level features, as well as global context features. We then formulate a joint optimization problem, and propose an efficient algorithm for graph-based label propagation to estimate the facet of each concept mention. Experimental results on technical corpora from two domains demonstrate that Facet Extraction can lead to an improvement of over 25% in both precision and recall over competing schemes.
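A toy sketch of graph-based label propagation, the core estimation step named above, is given below; the adjacency matrix, facet labels, and clamping scheme are invented for illustration and do not reproduce FacetGist's joint optimization over a heterogeneous network.

```python
# Toy label propagation over a small concept-mention graph.
import numpy as np

# Adjacency over 5 concept-mention nodes; nodes 0 and 4 have known facet labels.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
labels = np.zeros((5, 2))          # columns: facet "technique", facet "dataset"
labels[0] = [1, 0]                 # seed: node 0 is a technique mention
labels[4] = [0, 1]                 # seed: node 4 is a dataset mention
seeds = [0, 4]

P = A / A.sum(axis=1, keepdims=True)     # row-normalized transition matrix
F = labels.copy()
for _ in range(50):
    F = P @ F                            # propagate labels to neighbors
    F[seeds] = labels[seeds]             # clamp the seed nodes
print(F.argmax(axis=1))                  # predicted facet per node
```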
Brehmer, Matthew; Ingram, Stephen; Stray, Jonathan; Munzner, Tamara
2014-12-01
For an investigative journalist, a large collection of documents obtained from a Freedom of Information Act request or a leak is both a blessing and a curse: such material may contain multiple newsworthy stories, but it can be difficult and time consuming to find relevant documents. Standard text search is useful, but even if the search target is known it may not be possible to formulate an effective query. In addition, summarization is an important non-search task. We present Overview, an application for the systematic analysis of large document collections based on document clustering, visualization, and tagging. This work contributes to the small set of design studies which evaluate a visualization system "in the wild", and we report on six case studies where Overview was voluntarily used by self-initiated journalists to produce published stories. We find that the frequently-used language of "exploring" a document collection is both too vague and too narrow to capture how journalists actually used our application. Our iterative process, including multiple rounds of deployment and observations of real world usage, led to a much more specific characterization of tasks. We analyze and justify the visual encoding and interaction techniques used in Overview's design with respect to our final task abstractions, and propose generalizable lessons for visualization design methodology.
FacetGist: Collective Extraction of Document Facets in Large Technical Corpora
Siddiqui, Tarique; Ren, Xiang; Parameswaran, Aditya; Han, Jiawei
2017-01-01
Given the large volume of technical documents available, it is crucial to automatically organize and categorize these documents to be able to understand and extract value from them. Towards this end, we introduce a new research problem called Facet Extraction. Given a collection of technical documents, the goal of Facet Extraction is to automatically label each document with a set of concepts for the key facets (e.g., application, technique, evaluation metrics, and dataset) that people may be interested in. Facet Extraction has numerous applications, including document summarization, literature search, patent search and business intelligence. The major challenge in performing Facet Extraction arises from multiple sources: concept extraction, concept to facet matching, and facet disambiguation. To tackle these challenges, we develop FacetGist, a framework for facet extraction. Facet Extraction involves constructing a graph-based heterogeneous network to capture information available across multiple local sentence-level features, as well as global context features. We then formulate a joint optimization problem, and propose an efficient algorithm for graph-based label propagation to estimate the facet of each concept mention. Experimental results on technical corpora from two domains demonstrate that Facet Extraction can lead to an improvement of over 25% in both precision and recall over competing schemes. PMID:28210517
ERIC Educational Resources Information Center
Hultsman, John T.; Cottrell, Richard L.
This document provides a set of generalized guidelines for the design of units in large family campgrounds. Managers of recreational lands have two responsibilities and goals: to protect the natural resources, and to provide an enjoyable experience for users. With these goals in mind, unique variables to each unit such as shade, site aesthetics,…
USDA-ARS?s Scientific Manuscript database
Soil temperature (Ts) exerts critical controls on hydrologic and biogeochemical processes but magnitude and nature of Ts variability in a landscape setting are rarely documented. Fiber optic distributed temperature sensing systems (FO-DTS) potentially measure Ts at high density over a large extent. ...
ERIC Educational Resources Information Center
Huang, Yifen
2010-01-01
Mixed-initiative clustering is a task where a user and a machine work collaboratively to analyze a large set of documents. We hypothesize that a user and a machine can both learn better clustering models through enriched communication and interactive learning from each other. The first contribution of this thesis is providing a framework of…
Social Networking on the Semantic Web
ERIC Educational Resources Information Center
Finin, Tim; Ding, Li; Zhou, Lina; Joshi, Anupam
2005-01-01
Purpose: Aims to investigate the way that the semantic web is being used to represent and process social network information. Design/methodology/approach: The Swoogle semantic web search engine was used to construct several large data sets of Resource Description Framework (RDF) documents with social network information that were encoded using the…
Visualizing the semantic content of large text databases using text maps
NASA Technical Reports Server (NTRS)
Combs, Nathan
1993-01-01
A methodology for generating text map representations of the semantic content of text databases is presented. Text maps provide a graphical metaphor for conceptualizing and visualizing the contents and data interrelationships of large text databases. Described are a set of experiments conducted against the TIPSTER corpora of Wall Street Journal articles. These experiments provide an introduction to current work in the representation and visualization of documents by way of their semantic content.
CINTEX: International Interoperability Extensions to EOSDIS
NASA Technical Reports Server (NTRS)
Graves, Sara J.
1997-01-01
A large part of the research under this cooperative agreement involved working with representatives of the DLR, NASDA, EDC, and NOAA-SAA data centers to propose a set of enhancements and additions to the EOSDIS Version 0 Information Management System (V0 IMS) Client/Server Message Protocol. Helen Conover of ITSL led this effort to provide for an additional geographic search specification (WRS Path/Row), data set- and data center-specific search criteria, search by granule ID, specification of data granule subsetting requests, data set-based ordering, and the addition of URLs to result messages. The V0 IMS Server Cookbook is an evolving document, providing resources and information to data centers setting up a V0 IMS Server. Under this Cooperative Agreement, Helen Conover revised, reorganized, and expanded this document, and converted it to HTML. Ms. Conover has also worked extensively with the IRE RAS data center, CPSSI, in Russia. She served as the primary IMS contact for IRE-CPSSI and as IRE-CPSSI's liaison to other members of IMS and Web Gateway (WG) development teams. Her documentation of IMS problems in the IRE environment (Sun servers and low network bandwidth) led to a general restructuring of the V0 IMS Client message polling system, to the benefit of all IMS participants. In addition to the IMS server software and documentation, which are generally available to CINTEX sites, Ms. Conover also provided database design documentation and consulting, order tracking software, and hands-on testing and debug assistance to IRE. In the final pre-operational phase of IRE-CPSSI development, she also supplied information on configuration management, including ideas and processes in place at the Global Hydrology Resource Center (GHRC), an EOSDIS data center operated by ITSL.
Signature detection and matching for document image retrieval.
Zhu, Guangyu; Zheng, Yefeng; Doermann, David; Jaeger, Stefan
2009-11-01
As one of the most pervasive methods of individual identification and document authentication, signatures present convincing evidence and provide an important form of indexing for effective document image processing and retrieval in a broad range of applications. However, detection and segmentation of free-form objects such as signatures from clustered background is currently an open document analysis problem. In this paper, we focus on two fundamental problems in signature-based document image retrieval. First, we propose a novel multiscale approach to jointly detecting and segmenting signatures from document images. Rather than focusing on local features that typically have large variations, our approach captures the structural saliency using a signature production model and computes the dynamic curvature of 2D contour fragments over multiple scales. This detection framework is general and computationally tractable. Second, we treat the problem of signature retrieval in the unconstrained setting of translation, scale, and rotation invariant nonrigid shape matching. We propose two novel measures of shape dissimilarity based on anisotropic scaling and registration residual error and present a supervised learning framework for combining complementary shape information from different dissimilarity metrics using LDA. We quantitatively study state-of-the-art shape representations, shape matching algorithms, measures of dissimilarity, and the use of multiple instances as query in document image retrieval. We further demonstrate our matching techniques in offline signature verification. Extensive experiments using large real-world collections of English and Arabic machine-printed and handwritten documents demonstrate the excellent performance of our approaches.
Solving Large Problems Quickly: Progress in 2001-2003
NASA Technical Reports Server (NTRS)
Mowry, Todd C.; Colohan, Christopher B.; Brown, Angela Demke; Steffan, J. Gregory; Zhai, Antonia
2004-01-01
This document describes the progress we have made and the lessons we have learned in 2001 through 2003 under the NASA grant entitled "Solving Important Problems Faster". The long-term goal of this research is to accelerate large, irregular scientific applications which have enormous data sets and which are difficult to parallelize. To accomplish this goal, we are exploring two complementary techniques: (i) using compiler-inserted prefetching to automatically hide the I/O latency of accessing these large data sets from disk; and (ii) using thread-level data speculation to enable the optimistic parallelization of applications despite uncertainty as to whether data dependences exist between the resulting threads which would normally make them unsafe to execute in parallel. Overall, we made significant progress in 2001 through 2003, and the project has gone well.
NASA Astrophysics Data System (ADS)
Srinivasa, K. G.; Shree Devi, B. N.
2017-10-01
String searching in documents has become a tedious task with the evolution of Big Data. The generation of large data sets demands a high-performance search algorithm in areas such as text mining, information retrieval and many others. The popularity of GPUs for general-purpose computing has been increasing for various applications. Therefore it is of great interest to exploit the thread feature of a GPU to provide a high-performance search algorithm. This paper proposes an optimized new approach to the N-gram model for string search in a number of lengthy documents and its GPU implementation. The algorithm exploits GPGPUs for searching strings in many documents, employing character-level N-gram matching with a parallel Score Table approach and search using the CUDA API. The new approach of a Score Table used for frequency storage of N-grams in a document makes the search independent of the document's length and allows faster access to the frequency values, thus decreasing the search complexity. The extensive thread feature in a GPU has been exploited to enable parallel pre-processing of trigrams in a document for Score Table creation and parallel search in a huge number of documents, thus speeding up the whole search process even for a large pattern size. Experiments were carried out for many documents of varied length and search strings from the standard Lorem Ipsum text on NVIDIA's GeForce GT 540M GPU with 96 cores. Results show that the parallel approach for Score Table creation and searching gives a good speedup over the same approach executed serially.
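A CPU-only sketch of the character-level trigram Score Table idea can help make the description concrete; the paper's contribution is the parallel CUDA implementation, which is not reproduced here, and the function names and sample text below are assumptions.

```python
# Sequential sketch of a trigram "score table": per-document trigram
# frequencies make query scoring independent of document length.
from collections import Counter

def trigrams(text):
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]

def build_score_table(document):
    # Frequency of every character trigram in the document.
    return Counter(trigrams(document))

def score(query, table):
    # Higher score = more of the query's trigrams occur often in the document.
    return sum(table[t] for t in trigrams(query))

docs = ["lorem ipsum dolor sit amet", "consectetur adipiscing elit sed do"]
tables = [build_score_table(d) for d in docs]
print([score("dolor sit", t) for t in tables])
```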
Total Quality Management (TQM) in Higher Education.
ERIC Educational Resources Information Center
Sullivan, Michael F.
This document consists largely of paper versions of the transparencies used by the author to give his conference paper on Total Quality Management (TQM) in the college and university setting. An introduction lists a series of definitional phrases, a list of what TQM is not, and 11 fundamental principles describing what TQM is. The three major…
Metabolomic technologies for improving the quality of food: Practice and promise
USDA-ARS?s Scientific Manuscript database
It is now well documented that the diet has a significant impact on human health and well-being. However, the complete set of small molecule metabolites present in foods that make up the human diet and the role of food production systems in altering this food metabolome are still largely unknown. Me...
Neural networks for data mining electronic text collections
NASA Astrophysics Data System (ADS)
Walker, Nicholas; Truman, Gregory
1997-04-01
The use of neural networks in information retrieval and text analysis has primarily suffered from the issues of adequate document representation, the ability to scale to very large collections, dynamism in the face of new information and the practical difficulties of basing the design on the use of supervised training sets. Perhaps the most important approach to begin solving these problems is the use of `intermediate entities' which reduce the dimensionality of document representations and the size of documents collections to manageable levels coupled with the use of unsupervised neural network paradigms. This paper describes the issues, a fully configured neural network-based text analysis system--dataHARVEST--aimed at data mining text collections which begins this process, along with the remaining difficulties and potential ways forward.
THE MARK I BUSINESS SYSTEM SIMULATION MODEL
of a large-scale business simulation model as a vehicle for doing research in management controls. The major results of the program were the...development of the Mark I business simulation model and the Simulation Package (SIMPAC). SIMPAC is a method and set of programs facilitating the construction...of large simulation models. The object of this document is to describe the Mark I Corporation model, state why parts of the business were modeled as they were, and indicate the research applications of the model. (Author)
SP2Bench: A SPARQL Performance Benchmark
NASA Astrophysics Data System (ADS)
Schmidt, Michael; Hornung, Thomas; Meier, Michael; Pinkel, Christoph; Lausen, Georg
A meaningful analysis and comparison of both existing storage schemes for RDF data and evaluation approaches for SPARQL queries necessitates a comprehensive and universal benchmark platform. We present SP2Bench, a publicly available, language-specific performance benchmark for the SPARQL query language. SP2Bench is settled in the DBLP scenario and comprises a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror vital key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. In this chapter, we discuss requirements and desiderata for SPARQL benchmarks and present the SP2Bench framework, including its data generator, benchmark queries and performance metrics.
Boyack, Kevin W.; Newman, David; Duhon, Russell J.; Klavans, Richard; Patek, Michael; Biberstine, Joseph R.; Schijvenaars, Bob; Skupin, André; Ma, Nianli; Börner, Katy
2011-01-01
Background We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. Methodology We used a corpus of 2.15 million recent (2004-2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches were comprised of five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency-inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models – BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared using (1) within-cluster textual coherence based on the Jensen-Shannon divergence, and (2) two concentration measures based on grant-to-article linkages indexed in MEDLINE. Conclusions PubMed's own related article approach (PMRA) generated the most coherent and most concentrated cluster solution of the nine text-based similarity approaches tested, followed closely by the BM25 approach using titles and abstracts. Approaches using only MeSH subject headings were not competitive with those based on titles and abstracts. PMID:21437291
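As a small-scale, hedged illustration of one of the nine approaches compared above — tf-idf cosine similarity with top-n filtering followed by average-link clustering — the sketch below runs on a four-document toy corpus; it is not the study's pipeline, which also involves graph layout and MEDLINE-scale data.

```python
# Toy sketch: tf-idf cosine similarities, top-n filtering, average-link clustering.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

docs = ["gene expression in cancer cells", "tumor suppressor gene mutations",
        "influenza vaccine trial results", "seasonal flu vaccination coverage"]

tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
sim = cosine_similarity(tfidf)
np.fill_diagonal(sim, 0.0)

top_n = 1
filtered = np.zeros_like(sim)
for i, row in enumerate(sim):
    keep = np.argsort(row)[::-1][:top_n]       # keep only the strongest links
    filtered[i, keep] = row[keep]
filtered = np.maximum(filtered, filtered.T)    # symmetrize

dist = squareform(1.0 - filtered, checks=False)  # similarity -> condensed distance
clusters = fcluster(linkage(dist, method="average"), t=2, criterion="maxclust")
print(clusters)   # cluster label per document
```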
ERIC Educational Resources Information Center
Herrington, Deborah; Daubenmire, Patrick L.
2016-01-01
Despite decades of research regarding best practices for the teaching and learning of chemistry, as well as two sets of national reform documents for science education, classroom instruction in high school chemistry classrooms remains largely unchanged. One key reason for this continued gap between research and practice is a reliance on…
Analysis Of The IJCNN 2011 UTL Challenge
2012-01-13
large datasets from various application domains: handwriting recognition, image recognition, video processing, text processing, and ecology. The goal... validation and final evaluation sets consist of 4096 examples each. [A dataset summary table (Dataset, Domain, Features, Sparsity) is omitted here; one example is AVICENNA, a handwriting dataset with 120 features and 0% sparsity.] ...documents [3]. Transfer learning methods could accelerate the application of handwriting recognizers to historical manuscripts by reducing the need for
Within-population spatial synchrony in mast seeding of North American oaks.
A.V. Liebhold; M. Sork; O.N. Peltonen; Westfall R. Bjørnstad; J. Elkinton; M. H. J. Knops
2004-01-01
Mast seeding, the synchronous production of large crops of seeds, has been frequently documented in oak species. In this study we used several North American oak data-sets to quantify within-stand (10 km) synchrony in mast dynamics. Results indicated that intraspecific synchrony in seed production always exceeded interspecific synchrony and was essentially constant...
NASA Technical Reports Server (NTRS)
1973-01-01
Techniques are considered which would be used to characterize areospace computers with the space shuttle application as end usage. The system level digital problems which have been encountered and documented are surveyed. From the large cross section of tests, an optimum set is recommended that has a high probability of discovering documented system level digital problems within laboratory environments. Defined is a baseline hardware, software system which is required as a laboratory tool to test aerospace computers. Hardware and software baselines and additions necessary to interface the UTE to aerospace computers for test purposes are outlined.
Flowgen: Flowchart-based documentation for C++ codes
NASA Astrophysics Data System (ADS)
Kosower, David A.; Lopez-Villarejo, J. J.
2015-11-01
We present the Flowgen tool, which generates flowcharts from annotated C++ source code. The tool generates a set of interconnected high-level UML activity diagrams, one for each function or method in the C++ sources. It provides a simple and visual overview of complex implementations of numerical algorithms. Flowgen is complementary to the widely-used Doxygen documentation tool. The ultimate aim is to render complex C++ computer codes accessible, and to enhance collaboration between programmers and algorithm or science specialists. We describe the tool and a proof-of-concept application to the VINCIA plug-in for simulating collisions at CERN's Large Hadron Collider.
Data discretization for novel resource discovery in large medical data sets.
Benoît, G.; Andrews, J. E.
2000-01-01
This paper is motivated by the problems of dealing with large data sets in information retrieval. The authors suggest an information retrieval framework based on mathematical principles to organize and permit end-user manipulation of a retrieval set. By adjusting through the interface the weights and types of relationships between query and set members, it is possible to expose unanticipated, novel relationships between the query/document pair. The retrieval set as a whole is parsed into discrete concept-oriented subsets (based on within-set similarity measures) and displayed on screen as interactive "graphic nodes" in an information space, distributed at first based on the vector model (similarity measure of set to query). The result is a visualized map wherein it is possible to identify main concept regions and multiple sub-regions as dimensions of the same data. Users may examine the membership within sub-regions. Based on this framework, a data visualization user interface was designed to encourage users to work with the data on multiple levels to find novel relationships between the query and retrieval set members. Space constraints prohibit addressing all aspects of this project. PMID:11079845
Schmidt, Rodney A; Simmons, Kim; Grimm, Erin E; Middlebrooks, Michael; Changchien, Rosy
2006-11-01
Electronic document management systems (EDMSs) have the potential to improve the efficiency of anatomic pathology laboratories. We implemented a novel but simple EDMS for scanned documents as part of our laboratory information system (AP-LIS) and collected cost-benefit data with the intention of discerning the value of such a system in general and whether integration with the AP-LIS is advantageous. We found that the direct financial benefits are modest but the indirect and intangible benefits are large. Benefits of time savings and access to data particularly accrued to pathologists and residents (3.8 h/d saved for 26 pathologists and residents). Integrating the scanned document management system (SDMS) into the AP-LIS has major advantages in terms of workflow and overall simplicity. This simple, integrated SDMS is an excellent value in a practice like ours, and many of the benefits likely apply in other practice settings.
ERIC Educational Resources Information Center
Egghe, L.; Michel, C.
2003-01-01
Ordered sets (OS) of documents are encountered more and more in information distribution systems, such as information retrieval systems. Classical similarity measures for ordinary sets of documents need to be extended to these ordered sets. This is done in this article using fuzzy set techniques. The practical usability of the OS-measures is…
Methods and apparatuses for information analysis on shared and distributed computing systems
Bohn, Shawn J [Richland, WA; Krishnan, Manoj Kumar [Richland, WA; Cowley, Wendy E [Richland, WA; Nieplocha, Jarek [Richland, WA
2011-02-22
Apparatuses and computer-implemented methods for analyzing, on shared and distributed computing systems, information comprising one or more documents are disclosed according to some aspects. In one embodiment, information analysis can comprise distributing one or more distinct sets of documents among each of a plurality of processes, wherein each process performs operations on a distinct set of documents substantially in parallel with other processes. Operations by each process can further comprise computing term statistics for terms contained in each distinct set of documents, thereby generating a local set of term statistics for each distinct set of documents. Still further, operations by each process can comprise contributing the local sets of term statistics to a global set of term statistics, and participating in generating a major term set from an assigned portion of a global vocabulary.
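A minimal, single-machine sketch of this local-to-global term-statistics idea is shown below, with Python's multiprocessing standing in for the shared and distributed computing systems of the patent; the threshold used to pick "major" terms is an invented placeholder.

```python
# Sketch: each worker computes term counts for its own subset of documents
# (the "local" statistics); the parent merges them into a global set.
from collections import Counter
from multiprocessing import Pool

def local_term_stats(doc_subset):
    counts = Counter()
    for doc in doc_subset:
        counts.update(doc.lower().split())
    return counts

if __name__ == "__main__":
    subsets = [
        ["large document sets", "document clustering methods"],
        ["distributed computing systems", "parallel document analysis"],
    ]
    with Pool(processes=2) as pool:
        local_sets = pool.map(local_term_stats, subsets)

    global_stats = Counter()
    for local in local_sets:            # contribute local statistics to the global set
        global_stats.update(local)

    # invented threshold: call any term seen at least twice a "major" term
    major_terms = [t for t, c in global_stats.items() if c >= 2]
    print(global_stats, major_terms)
```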
HiQuant: Rapid Postquantification Analysis of Large-Scale MS-Generated Proteomics Data.
Bryan, Kenneth; Jarboui, Mohamed-Ali; Raso, Cinzia; Bernal-Llinares, Manuel; McCann, Brendan; Rauch, Jens; Boldt, Karsten; Lynn, David J
2016-06-03
Recent advances in mass-spectrometry-based proteomics are now facilitating ambitious large-scale investigations of the spatial and temporal dynamics of the proteome; however, the increasing size and complexity of these data sets is overwhelming current downstream computational methods, specifically those that support the postquantification analysis pipeline. Here we present HiQuant, a novel application that enables the design and execution of a postquantification workflow, including common data-processing steps, such as assay normalization and grouping, and experimental replicate quality control and statistical analysis. HiQuant also enables the interpretation of results generated from large-scale data sets by supporting interactive heatmap analysis and also the direct export to Cytoscape and Gephi, two leading network analysis platforms. HiQuant may be run via a user-friendly graphical interface and also supports complete one-touch automation via a command-line mode. We evaluate HiQuant's performance by analyzing a large-scale, complex interactome mapping data set and demonstrate a 200-fold improvement in the execution time over current methods. We also demonstrate HiQuant's general utility by analyzing proteome-wide quantification data generated from both a large-scale public tyrosine kinase siRNA knock-down study and an in-house investigation into the temporal dynamics of the KSR1 and KSR2 interactomes. Download HiQuant, sample data sets, and supporting documentation at http://hiquant.primesdb.eu .
Multicompetence in L2 Language Play: A Longitudinal Case Study
ERIC Educational Resources Information Center
Bell, Nancy; Skalicky, Stephen; Salsbury, Tom
2014-01-01
Humor and language play have been recognized as important aspects of second language (L2) development. Qualitative studies that have documented the forms and functions of language play for adult and child L2 users have taken place largely in classroom settings. In order to gain a fuller understanding of such creative manipulations by L2 users, it…
Searching for Sterile Neutrinos with MINOS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Timmons, Ashley
2016-01-01
This document presents the latest results for a 3+1 sterile neutrino search using the $10.56 \times 10^{20}$ protons-on-target data set taken from 2005 to 2012. By searching for oscillations driven by a large mass splitting, MINOS is sensitive to the existence of sterile neutrinos through any energy-dependent deviations in a charged-current sample, as well as through any relative deficit in neutral-current events between the far and near detectors. This document discusses the novel analysis that enabled a search for sterile neutrinos, setting a limit in the previously unexplored regions of the parameter space $\{\Delta m^{2}_{41}, \sin^2\theta_{24}\}$. The results presented can be compared to the parameter space suggested by LSND and MiniBooNE and complement other previous experimental searches for sterile neutrinos in the electron neutrino appearance channel.
Brand, C; Lam, S K L; Roberts, C; Gorelik, A; Amatya, B; Smallwood, D; Russell, D
2009-06-01
There are delays in implementing evidence about effective therapy into clinical practice. Clinical indicators may support implementation of guideline recommendations. The aim was to develop and evaluate the short-term impact of a clinical indicator set for general medicine. A set of clinical process indicators was developed using a structured process. The indicator set was implemented between January 2006 and December 2006, using strategies based on evidence about effectiveness and local contextual factors. Evaluation included a structured survey of general medical staff to assess awareness and attitudes towards the programme and qualitative assessment of barriers to implementation. Impact on documentation of adherence to clinical indicators was assessed by auditing a random sample of medical records before (2003-2005) and after (2006) implementation. Clinical indicators were developed for the following areas: venous thromboembolism, cognition, chronic heart failure, chronic obstructive pulmonary disease, diabetes, low trauma fracture, patient written care plans. The programme was well supported and imposed little burden on staff. Implementation occurred largely as planned; however, documentation of adherence to clinical indicators was variable. There was a generally positive trend over time, but for most indicators this was independent of the implementation process and may have been influenced by other system improvement activities. Failure to demonstrate a significant impact during the pilot phase is likely to have been influenced by administrative factors, especially lack of an integrative data documentation and collection process. Successful implementation in phase two is likely to depend upon an effective data collection system integrated into usual care.
Current Climate Data Set Documentation Standards: Somewhere between Anagrams and Full Disclosure
NASA Astrophysics Data System (ADS)
Fleig, A. J.
2008-12-01
In the 17th century, scientists, concerned with establishing primacy for their discoveries while maintaining control of their intellectual property, often published their results as anagrams. Robert Hooke's initial publication in 1676 of his law of elasticity in the form ceiiinosssttuv, which he revealed two years later as "ut tensio, sic vis" ("of the extension, so the force"), is one of the better-known examples, although Galileo, Newton, and many others used the same approach. Fortunately, the idea of open publication in scientific journals subject to peer review as a cornerstone of the scientific method gradually became established and is now the norm. Unfortunately, though, even peer-reviewed publication does not necessarily lead to full disclosure. One example of this occurs in the production, review, and distribution of large-scale data sets of climate variables. Validation papers describe in concept how the data was made but do not provide adequate documentation of the process. Complete provenance of the resulting data sets, including descriptions of the exact input files, processing environment, and actual processing code, is not required as part of the production and archival effort. A user of the data may be assured by the publication and peer review that the data is considered to be good and usable for scientific investigation but will not know exactly how the data set was made. The problem with this lack of knowledge may be most apparent when considering questions of climate change. Future measurements of the same geophysical parameter will surely be derived from a different observational system than the one used in creating today's data sets. An obvious task in assessing change between the present and the future data set will be to determine how much of the change is because the parameter changed and how much is because the measurement system changed. This will be hard to do without complete knowledge of how the predecessor data set was made. Automated techniques are being developed that will simplify the creation of much of the provenance information, but there are both cultural and infrastructure problems that discourage provision of complete documentation. It is time to reconsider what the standards for production and documentation of data sets should be. There is only a short window before the loss of knowledge about current data sets, associated with human mortality, becomes irreversible.
Script identification from images using cluster-based templates
Hochberg, J.G.; Kelly, P.M.; Thomas, T.R.
1998-12-01
A computer-implemented method identifies a script used to create a document. A set of training documents for each script to be identified is scanned into the computer to store a series of exemplary images representing each script. Pixels forming the exemplary images are electronically processed to define a set of textual symbols corresponding to the exemplary images. Each textual symbol is assigned to a cluster of textual symbols that most closely represents the textual symbol. The cluster of textual symbols is processed to form a representative electronic template for each cluster. A document having a script to be identified is scanned into the computer to form one or more document images representing the script to be identified. Pixels forming the document images are electronically processed to define a set of document textual symbols corresponding to the document images. The set of document textual symbols is compared to the electronic templates to identify the script. 17 figs.
Script identification from images using cluster-based templates
Hochberg, Judith G.; Kelly, Patrick M.; Thomas, Timothy R.
1998-01-01
A computer-implemented method identifies a script used to create a document. A set of training documents for each script to be identified is scanned into the computer to store a series of exemplary images representing each script. Pixels forming the exemplary images are electronically processed to define a set of textual symbols corresponding to the exemplary images. Each textual symbol is assigned to a cluster of textual symbols that most closely represents the textual symbol. The cluster of textual symbols is processed to form a representative electronic template for each cluster. A document having a script to be identified is scanned into the computer to form one or more document images representing the script to be identified. Pixels forming the document images are electronically processed to define a set of document textual symbols corresponding to the document images. The set of document textual symbols is compared to the electronic templates to identify the script.
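The following is a schematic sketch of the cluster-based template idea described in the two records above: symbol images are clustered, each cluster is averaged into a template, and an unknown document is assigned to the script whose templates it matches best. K-means, Euclidean distances, and the random 8x8 "images" are stand-ins for the patented symbol extraction, clustering, and matching procedures.

```python
# Schematic sketch of cluster-based templates for script identification.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def build_templates(symbol_images, n_clusters=3):
    flat = symbol_images.reshape(len(symbol_images), -1)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(flat)
    # average the symbols of each cluster into a representative template
    return np.stack([flat[labels == k].mean(axis=0) for k in range(n_clusters)])

def score(document_symbols, templates):
    flat = document_symbols.reshape(len(document_symbols), -1)
    # distance from each symbol to its closest template, averaged over the document
    dists = np.linalg.norm(flat[:, None, :] - templates[None, :, :], axis=2)
    return dists.min(axis=1).mean()

script_a = rng.random((30, 8, 8))            # training symbols for script A (toy data)
script_b = rng.random((30, 8, 8)) + 0.5      # training symbols for script B (toy data)
templates = {"A": build_templates(script_a), "B": build_templates(script_b)}

unknown = rng.random((10, 8, 8)) + 0.5       # document to identify
print(min(templates, key=lambda s: score(unknown, templates[s])))  # expected: "B"
```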
Fulton, James L.
1992-01-01
Spatial data analysis has become an integral component in many surface and sub-surface hydrologic investigations within the U.S. Geological Survey (USGS). Currently, one of the largest costs in applying spatial data analysis is the cost of developing the needed spatial data. Therefore, guidelines and standards are required for the development of spatial data in order to allow for data sharing and reuse; this eliminates costly redevelopment. In order to attain this goal, the USGS is expanding efforts to identify guidelines and standards for the development of spatial data for hydrologic analysis. Because of the variety of project and database needs, the USGS has concentrated on developing standards for documenting spatial data sets to aid in the assessment of data set quality and compatibility of different data sets. An interim data set documentation standard (1990) has been developed that provides a mechanism for associating a wide variety of information with a data set, including data about source material, data automation and editing procedures used, projection parameters, data statistics, descriptions of features and feature attributes, information on organizational contacts, lists of operations performed on the data, and free-form comments and notes about the data, made at various times in the evolution of the data set. The interim data set documentation standard has been automated using a commercial geographic information system (GIS) and data set documentation software developed by the USGS. Where possible, USGS-developed software is used to enter data into the data set documentation file automatically. The GIS software closely associates a data set with its data set documentation file; the documentation file is retained with the data set whenever it is modified, copied, or transferred to another computer system. The Water Resources Division of the USGS is continuing to develop spatial data and data processing standards, with emphasis on standards needed to support hydrologic analysis, hydrologic data processing, and publication of hydrologic thematic maps. There is a need for the GIS vendor community to develop data set documentation tools similar to those developed by the USGS, or to incorporate USGS-developed tools in their software.
Use of General-purpose Negation Detection to Augment Concept Indexing of Medical Documents
Mutalik, Pradeep G.; Deshpande, Aniruddha; Nadkarni, Prakash M.
2001-01-01
Objectives: To test the hypothesis that most instances of negated concepts in dictated medical documents can be detected by a strategy that relies on tools developed for the parsing of formal (computer) languages—specifically, a lexical scanner (“lexer”) that uses regular expressions to generate a finite state machine, and a parser that relies on a restricted subset of context-free grammars, known as LALR(1) grammars. Methods: A diverse training set of 40 medical documents from a variety of specialties was manually inspected and used to develop a program (Negfinder) that contained rules to recognize a large set of negated patterns occurring in the text. Negfinder's lexer and parser were developed using tools normally used to generate programming language compilers. The input to Negfinder consisted of medical narrative that was preprocessed to recognize UMLS concepts: the text of a recognized concept had been replaced with a coded representation that included its UMLS concept ID. The program generated an index with one entry per instance of a concept in the document, where the presence or absence of negation of that concept was recorded. This information was used to mark up the text of each document by color-coding it to make it easier to inspect. The parser was then evaluated in two ways: 1) a test set of 60 documents (30 discharge summaries, 30 surgical notes) marked-up by Negfinder was inspected visually to quantify false-positive and false-negative results; and 2) a different test set of 10 documents was independently examined for negatives by a human observer and by Negfinder, and the results were compared. Results: In the first evaluation using marked-up documents, 8,358 instances of UMLS concepts were detected in the 60 documents, of which 544 were negations detected by the program and verified by human observation (true-positive results, or TPs). Thirteen instances were wrongly flagged as negated (false-positive results, or FPs), and the program missed 27 instances of negation (false-negative results, or FNs), yielding a sensitivity of 95.3 percent and a specificity of 97.7 percent. In the second evaluation using independent negation detection, 1,869 concepts were detected in 10 documents, with 135 TPs, 12 FPs, and 6 FNs, yielding a sensitivity of 95.7 percent and a specificity of 91.8 percent. One of the words “no,” “denies/denied,” “not,” or “without” was present in 92.5 percent of all negations. Conclusions: Negation of most concepts in medical narrative can be reliably detected by a simple strategy. The reliability of detection depends on several factors, the most important being the accuracy of concept matching. PMID:11687566
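A drastically simplified stand-in for this strategy is sketched below: a handful of trigger words (motivated by the finding that "no", "denies/denied", "not", and "without" account for most negations) are matched by regular expression within a fixed window before each concept marker. The inline concept markup and CUI values are invented for illustration; Negfinder's actual lexer and LALR(1) grammar are far more sophisticated.

```python
# Minimal stand-in for the lexer-style approach: mark a concept as negated when
# a trigger word precedes it within a short character window.
import re

NEG_TRIGGERS = r"\b(no|denies|denied|not|without)\b"
CONCEPT = r"\[C\d+:[^\]]+\]"          # hypothetical inline UMLS concept markup

def find_negated_concepts(text, window=40):
    negated = []
    for m in re.finditer(CONCEPT, text):
        preceding = text[max(0, m.start() - window):m.start()]
        if re.search(NEG_TRIGGERS, preceding, flags=re.IGNORECASE):
            negated.append(m.group())
    return negated

note = "Patient denies [C0008031:chest pain]. Mild [C0011849:diabetes] noted."
print(find_negated_concepts(note))    # -> ['[C0008031:chest pain]']
```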
Sanfilippo, Antonio [Richland, WA; Calapristi, Augustin J [West Richland, WA; Crow, Vernon L [Richland, WA; Hetzler, Elizabeth G [Kennewick, WA; Turner, Alan E [Kennewick, WA
2009-12-22
Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture are described. In one aspect, a document clustering method includes providing a document set comprising a plurality of documents, providing a cluster comprising a subset of the documents of the document set, using a plurality of terms of the documents, providing a cluster label indicative of subject matter content of the documents of the cluster, wherein the cluster label comprises a plurality of word senses, and selecting one of the word senses of the cluster label.
The evolving story of information assurance at the DoD.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campbell, Philip LaRoche
2007-01-01
This document is a review of five documents on information assurance from the Department of Defense (DoD), namely 5200.40, 8510.1-M, 8500.1, 8500.2, and an ''interim'' document on DIACAP [9]. The five documents divide into three sets: (1) 5200.40 & 8510.1-M, (2) 8500.1 & 8500.2, and (3) the interim DIACAP document. The first two sets describe the certification and accreditation process known as ''DITSCAP''; the last two sets describe the certification and accreditation process known as ''DIACAP'' (the second set applies to both processes). Each set of documents describes (1) a process, (2) a systems classification, and (3) a measurement standard.more » Appendices in this report (a) list the Phases, Activities, and Tasks of DITSCAP, (b) note the discrepancies between 5200.40 and 8510.1-M concerning DITSCAP Tasks and the System Security Authorization Agreement (SSAA), (c) analyze the DIACAP constraints on role fusion and on reporting, (d) map terms shared across the documents, and (e) review three additional documents on information assurance, namely DCID 6/3, NIST 800-37, and COBIT{reg_sign}.« less
Search and Graph Database Technologies for Biomedical Semantic Indexing: Experimental Analysis.
Segura Bedmar, Isabel; Martínez, Paloma; Carruana Martín, Adrián
2017-12-01
Biomedical semantic indexing is a very useful support tool for human curators in their efforts for indexing and cataloging the biomedical literature. The aim of this study was to describe a system to automatically assign Medical Subject Headings (MeSH) to biomedical articles from MEDLINE. Our approach relies on the assumption that similar documents should be classified by similar MeSH terms. Although previous work has already exploited the document similarity by using a k-nearest neighbors algorithm, we represent documents as document vectors by search engine indexing and then compute the similarity between documents using cosine similarity. Once the most similar documents for a given input document are retrieved, we rank their MeSH terms to choose the most suitable set for the input document. To do this, we define a scoring function that takes into account the frequency of the term into the set of retrieved documents and the similarity between the input document and each retrieved document. In addition, we implement guidelines proposed by human curators to annotate MEDLINE articles; in particular, the heuristic that says if 3 MeSH terms are proposed to classify an article and they share the same ancestor, they should be replaced by this ancestor. The representation of the MeSH thesaurus as a graph database allows us to employ graph search algorithms to quickly and easily capture hierarchical relationships such as the lowest common ancestor between terms. Our experiments show promising results with an F1 of 69% on the test dataset. To the best of our knowledge, this is the first work that combines search and graph database technologies for the task of biomedical semantic indexing. Due to its horizontal scalability, ElasticSearch becomes a real solution to index large collections of documents (such as the bibliographic database MEDLINE). Moreover, the use of graph search algorithms for accessing MeSH information could provide a support tool for cataloging MEDLINE abstracts in real time. ©Isabel Segura Bedmar, Paloma Martínez, Adrián Carruana Martín. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 01.12.2017.
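A sketch of the scoring idea described above: candidate MeSH terms gathered from the most similar retrieved documents are ranked by similarity-weighted frequency. The exact scoring function and the graph-based lowest-common-ancestor heuristic used in the paper are not reproduced here.

```python
# Sketch: rank candidate MeSH terms from the k most similar documents by
# similarity-weighted frequency. Neighbour similarities and terms are invented.
from collections import defaultdict

def rank_mesh_terms(retrieved):
    """retrieved: list of (similarity, mesh_terms) for the top-k neighbours."""
    scores = defaultdict(float)
    for similarity, terms in retrieved:
        for term in terms:
            scores[term] += similarity          # frequency weighted by similarity
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

neighbours = [
    (0.91, ["Neoplasms", "Gene Expression Profiling"]),
    (0.84, ["Neoplasms", "Breast Neoplasms"]),
    (0.63, ["Gene Expression Profiling"]),
]
print(rank_mesh_terms(neighbours)[:3])
```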
Auer, Lucas; Mariadassou, Mahendra; O'Donohue, Michael; Klopp, Christophe; Hernandez-Raquet, Guillermina
2017-11-01
Next-generation sequencing technologies give access to large sets of data, which are extremely useful in the study of microbial diversity based on the 16S rRNA gene. However, the production of such large data sets is not only marred by technical biases and sequencing noise but also increases computation time and disc space use. To improve the accuracy of OTU predictions and overcome computation, storage, and noise issues, recent studies and tools suggested removing all single reads and low-abundance OTUs, considering them as noise. Although the effect of applying an OTU abundance threshold on α- and β-diversity has been well documented, the consequences of removing single reads have been poorly studied. Here, we test the effect of singleton read filtering (SRF) on microbial community composition using in silico simulated data sets as well as sequencing data from synthetic and real communities displaying different levels of diversity and abundance profiles. Scalability to large data sets is also assessed using a complete MiSeq run. We show that SRF drastically reduces the chimera content and computational time, enabling the analysis of a complete MiSeq run in just a few minutes. Moreover, SRF accurately determines the actual community diversity: the differences in α- and β-community diversity obtained with SRF and standard procedures are much smaller than the intrinsic variability of technical and biological replicates. © 2017 John Wiley & Sons Ltd.
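Conceptually, singleton read filtering amounts to dereplicating the reads and discarding any sequence observed exactly once before OTU clustering, as in this toy sketch (the sequences are invented):

```python
# Sketch of singleton read filtering (SRF): dereplicate reads and drop any
# sequence observed exactly once before downstream OTU clustering.
from collections import Counter

reads = [
    "ACGTACGTAA", "ACGTACGTAA", "ACGTACGTAA",   # abundant sequence
    "TTGCAAGGTT", "TTGCAAGGTT",
    "GGGCCCAATT",                               # singleton, treated as likely noise
]
counts = Counter(reads)
filtered = {seq: n for seq, n in counts.items() if n > 1}
print(filtered)   # the singleton GGGCCCAATT is removed
```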
The statistical power to detect cross-scale interactions at macroscales
Wagner, Tyler; Fergus, C. Emi; Stow, Craig A.; Cheruvelil, Kendra S.; Soranno, Patricia A.
2016-01-01
Macroscale studies of ecological phenomena are increasingly common because stressors such as climate and land-use change operate at large spatial and temporal scales. Cross-scale interactions (CSIs), where ecological processes operating at one spatial or temporal scale interact with processes operating at another scale, have been documented in a variety of ecosystems and contribute to complex system dynamics. However, studies investigating CSIs are often dependent on compiling multiple data sets from different sources to create multithematic, multiscaled data sets, which results in structurally complex, and sometimes incomplete data sets. The statistical power to detect CSIs needs to be evaluated because of their importance and the challenge of quantifying CSIs using data sets with complex structures and missing observations. We studied this problem using a spatially hierarchical model that measures CSIs between regional agriculture and its effects on the relationship between lake nutrients and lake productivity. We used an existing large multithematic, multiscaled database, LAke multiscaled GeOSpatial, and temporal database (LAGOS), to parameterize the power analysis simulations. We found that the power to detect CSIs was more strongly related to the number of regions in the study rather than the number of lakes nested within each region. CSI power analyses will not only help ecologists design large-scale studies aimed at detecting CSIs, but will also focus attention on CSI effect sizes and the degree to which they are ecologically relevant and detectable with large data sets.
The impact of common APSE interface set specifications on space station information systems
NASA Technical Reports Server (NTRS)
Diaz-Herrera, Jorge L.; Sibley, Edgar H.
1986-01-01
Certain types of software facilities are needed in a Space Station Information Systems Environment; the Common APSE (Ada Program Support Environment) Interface Set (CAIS) was proposed as a means of satisfying them. The reasonableness of this is discussed by examining the current CAIS, considering the changes due to the latest Requirements and Criteria (RAC) document, and postulating the effects on the CAIS 2.0. Finally, a few additional comments are made on the problems inherent in the Ada language itself, especially on its deficiencies when used for implementing large distributed processing and data base applications.
A model for enhancing Internet medical document retrieval with "medical core metadata".
Malet, G; Munoz, F; Appleyard, R; Hersh, W
1999-01-01
Finding documents on the World Wide Web relevant to a specific medical information need can be difficult. The goal of this work is to define a set of document content description tags, or metadata encodings, that can be used to promote disciplined search access to Internet medical documents. The authors based their approach on a proposed metadata standard, the Dublin Core Metadata Element Set, which has recently been submitted to the Internet Engineering Task Force. Their model also incorporates the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary and MEDLINE-type content descriptions. The model defines a medical core metadata set that can be used to describe the metadata for a wide variety of Internet documents. The authors propose that their medical core metadata set be used to assign metadata to medical documents to facilitate document retrieval by Internet search engines.
A Model for Enhancing Internet Medical Document Retrieval with “Medical Core Metadata”
Malet, Gary; Munoz, Felix; Appleyard, Richard; Hersh, William
1999-01-01
Objective: Finding documents on the World Wide Web relevant to a specific medical information need can be difficult. The goal of this work is to define a set of document content description tags, or metadata encodings, that can be used to promote disciplined search access to Internet medical documents. Design: The authors based their approach on a proposed metadata standard, the Dublin Core Metadata Element Set, which has recently been submitted to the Internet Engineering Task Force. Their model also incorporates the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary and Medline-type content descriptions. Results: The model defines a medical core metadata set that can be used to describe the metadata for a wide variety of Internet documents. Conclusions: The authors propose that their medical core metadata set be used to assign metadata to medical documents to facilitate document retrieval by Internet search engines. PMID:10094069
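For illustration only, a record in the spirit of the proposed medical core metadata set might combine Dublin Core elements with MeSH descriptors roughly as follows; the field names loosely follow Dublin Core conventions and all values are invented.

```python
# Illustrative metadata record combining Dublin Core elements with MeSH
# descriptors, in the spirit of the proposed medical core metadata set.
medical_core_record = {
    "dc.title": "Management of Type 2 Diabetes in Primary Care",
    "dc.creator": "Example Clinic Guidelines Committee",
    "dc.date": "1999-01-01",
    "dc.type": "Practice Guideline",
    "dc.subject.mesh": ["Diabetes Mellitus, Type 2", "Primary Health Care"],
    "dc.description": "Summary of evidence-based recommendations ...",
    "dc.identifier": "http://example.org/guidelines/dm2",   # hypothetical URL
}
print(medical_core_record["dc.subject.mesh"])
```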
Heimonen, Juho; Danielsson-Ojala, Riitta; Salakoski, Tapio; Lundgrén-Laine, Heljä; Salanterä, Sanna
2018-04-12
Written patient education materials are essential to motivate and help patients to participate in their own care, but the production and management of a large collection of high-quality and easily accessible patient education documents can be challenging. Ontologies can aid in these tasks, but the existing resources are not directly applicable to patient education. An ontology that models patient education documents and their readers was constructed. The Delphi method was used to identify a compact but sufficient set of entities with which the topics of documents may be described. The preferred terms of the entities were also considered to ensure their understandability. In the ontology, readers may be characterized by gender, age group, language, and role (patient or professional), whereas documents may be characterized by audience, topic(s), and content, as well as the time and place of use. The Delphi method yielded 265 unique document topics that are organized into seven hierarchies. Advantages and disadvantages of the ontology design, as well as possibilities for improvements, were identified. The patient education material ontology can enhance many applications, but further development is needed to reach its full potential.
Named Entity Recognition in Chinese Clinical Text Using Deep Neural Network.
Wu, Yonghui; Jiang, Min; Lei, Jianbo; Xu, Hua
2015-01-01
Rapid growth in electronic health record (EHR) use has led to an unprecedented expansion of available clinical data in electronic formats. However, much of the important healthcare information is locked in narrative documents. Therefore, Natural Language Processing (NLP) technologies, e.g., Named Entity Recognition, which identifies the boundaries and types of entities, have been extensively studied to unlock important clinical information in free text. In this study, we investigated a novel deep learning method to recognize clinical entities in Chinese clinical documents using a minimal feature engineering approach. We developed a deep neural network (DNN) to generate word embeddings from a large unlabeled corpus through unsupervised learning and another DNN for the NER task. The experiment results showed that the DNN with word embeddings trained from the large unlabeled corpus outperformed the state-of-the-art CRF model in the minimal feature engineering setting, achieving the highest F1-score of 0.9280. Further analysis showed that word embeddings derived through unsupervised learning from a large unlabeled corpus remarkably improved the DNN with randomized embeddings, demonstrating the usefulness of unsupervised feature learning.
NASA Technical Reports Server (NTRS)
Waggoner, J. T.; Phinney, D. E. (Principal Investigator)
1981-01-01
Foreign Commodity Production Forecasting testing activities through June 1981 are documented. A log of test reports is presented. Standard documentation sets are included for each test. The documentation elements presented in each set are summarized.
Island of the Sun: Elite and Non-Elite Observations of the June Solstice
NASA Astrophysics Data System (ADS)
Dearborn, David S. P.; Bauer, Brian S.
In Inca times (AD 1400-1532), two small islands in Lake Titicaca had temples dedicated to the sun and the moon. Colonial documents indicate that the islands were the focus of large-scale pilgrimages. Recent archaeoastronomical work suggests that rituals, attended by both elites and commoners, were held on the Island of the Sun to observe the setting sun on the June solstice.
TopicLens: Efficient Multi-Level Visual Topic Exploration of Large-Scale Document Collections.
Kim, Minjeong; Kang, Kyeongpil; Park, Deokgun; Choo, Jaegul; Elmqvist, Niklas
2017-01-01
Topic modeling, which reveals underlying topics of a document corpus, has been actively adopted in visual analytics for large-scale document collections. However, due to its significant processing time and non-interactive nature, topic modeling has so far not been tightly integrated into a visual analytics workflow. Instead, most such systems are limited to utilizing a fixed, initial set of topics. Motivated by this gap in the literature, we propose a novel interaction technique called TopicLens that allows a user to dynamically explore data through a lens interface where topic modeling and the corresponding 2D embedding are efficiently computed on the fly. To support this interaction in real time while maintaining view consistency, we propose a novel efficient topic modeling method and a semi-supervised 2D embedding algorithm. Our work is based on improving state-of-the-art methods such as nonnegative matrix factorization and t-distributed stochastic neighbor embedding. Furthermore, we have built a web-based visual analytics system integrated with TopicLens. We use this system to measure the performance and the visualization quality of our proposed methods. We provide several scenarios showcasing the capability of TopicLens using real-world datasets.
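A minimal sketch of the two computations TopicLens couples interactively, topic modeling by NMF on tf-idf vectors and a 2D embedding by t-SNE, is shown below using stock scikit-learn; the paper's accelerated, semi-supervised variants are not reproduced.

```python
# Sketch of the TopicLens-style pipeline: NMF topic modeling on tf-idf vectors
# followed by a 2D t-SNE embedding of the document-topic matrix. Toy corpus only.
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

docs = [
    "neural networks for image recognition",
    "deep learning improves image classification",
    "document clustering with topic models",
    "topic modeling of large text collections",
    "graph algorithms for network analysis",
    "community detection in social networks",
]
X = TfidfVectorizer(stop_words="english").fit_transform(docs)

W = NMF(n_components=3, init="nndsvd", random_state=0).fit_transform(X)  # doc-topic weights
coords = TSNE(n_components=2, perplexity=2.0, random_state=0).fit_transform(W)
print(W.argmax(axis=1))   # dominant topic per document
print(coords.shape)       # 2D positions for plotting
```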
Clustering of Farsi sub-word images for whole-book recognition
NASA Astrophysics Data System (ADS)
Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier
2015-01-01
Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however, these techniques do not use important pictorial information that exists in the text image. In the case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a sub-word image clustering method for applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm for measuring the distance between two sub-word images. The measured distance, together with a set of simple shape features, was used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with a totally different font face. Also, we show that the number of newly created clusters on a page can be used as a criterion for assessing print quality and evaluating preprocessing phases.
An automated procedure to identify biomedical articles that contain cancer-associated gene variants.
McDonald, Ryan; Scott Winters, R; Ankuda, Claire K; Murphy, Joan A; Rogers, Amy E; Pereira, Fernando; Greenblatt, Marc S; White, Peter S
2006-09-01
The proliferation of biomedical literature makes it increasingly difficult for researchers to find and manage relevant information. However, identifying research articles containing mutation data, a requisite first step in integrating large and complex mutation data sets, is currently tedious, time-consuming and imprecise. More effective mechanisms for identifying articles containing mutation information would be beneficial both for the curation of mutation databases and for individual researchers. We developed an automated method that uses information extraction, classifier, and relevance ranking techniques to determine the likelihood of MEDLINE abstracts containing information regarding genomic variation data suitable for inclusion in mutation databases. We targeted the CDKN2A (p16) gene and the procedure for document identification currently used by CDKN2A Database curators as a measure of feasibility. A set of abstracts was manually identified from a MEDLINE search as potentially containing specific CDKN2A mutation events. A subset of these abstracts was used as a training set for a maximum entropy classifier to identify text features distinguishing "relevant" from "not relevant" abstracts. Each document was represented as a set of indicative word, word pair, and entity tagger-derived genomic variation features. When applied to a test set of 200 candidate abstracts, the classifier predicted 88 articles as being relevant; of these, 29 of 32 manuscripts in which manual curation found CDKN2A sequence variants were positively predicted. Thus, the set of potentially useful articles that a manual curator would have to review was reduced by 56%, maintaining 91% recall (sensitivity) and more than doubling precision (positive predictive value). Subsequent expansion of the training set to 494 articles yielded similar precision and recall rates, and comparison of the original and expanded trials demonstrated that the average precision improved with the larger data set. Our results show that automated systems can effectively identify article subsets relevant to a given task and may prove to be powerful tools for the broader research community. This procedure can be readily adapted to any or all genes, organisms, or sets of documents. Published 2006 Wiley-Liss, Inc.
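A schematic sketch of the classification step is given below: word and word-pair (bigram) features feed a maximum-entropy-style classifier, approximated here by logistic regression; the toy abstracts and labels stand in for the curated CDKN2A training set, and the entity-tagger-derived genomic variation features are omitted.

```python
# Schematic sketch: word and word-pair (bigram) features with a
# maximum-entropy-style classifier (logistic regression here).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

abstracts = [
    "We identified a germline CDKN2A mutation, p16 Leu32Pro, in melanoma kindreds.",
    "A novel deletion in CDKN2A exon 2 was detected in familial melanoma.",
    "CDKN2A expression levels were measured by quantitative PCR.",
    "We review the epidemiology of melanoma in outdoor workers.",
]
labels = [1, 1, 0, 0]   # 1 = contains sequence-variant information (toy labels)

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),       # words and word pairs
    LogisticRegression(max_iter=1000),
)
model.fit(abstracts, labels)
print(model.predict_proba(
    ["A CDKN2A missense variant Gly101Trp segregated with disease."])[:, 1])
```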
DiCanio, Christian; Nam, Hosung; Whalen, Douglas H.; Bunnell, H. Timothy; Amith, Jonathan D.; García, Rey Castillo
2013-01-01
While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. The transcription of large documentation corpora is, by itself, a tremendous feat. Yet, the process of segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The authors also examined whether a phone set with contextualization outperforms a more general one. The accuracy of two forced aligners trained on English (hmalign and p2fa) was assessed using corpus data from Yoloxóchitl Mixtec. Overall, agreement performance was relatively good, with accuracy at 70.9% within 30 ms for hmalign and 65.7% within 30 ms for p2fa. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in hmalign's phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. Overall, using existing alignment systems was found to have potential for making phonetic analysis of small corpora more efficient, with more allophonic phone sets providing better agreement than general ones. PMID:23967953
DiCanio, Christian; Nam, Hosung; Whalen, Douglas H; Bunnell, H Timothy; Amith, Jonathan D; García, Rey Castillo
2013-09-01
While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. The transcription of large documentation corpora is, by itself, a tremendous feat. Yet, the process of segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The authors also examined whether a phone set with contextualization outperforms a more general one. The accuracy of two forced aligners trained on English (hmalign and p2fa) was assessed using corpus data from Yoloxóchitl Mixtec. Overall, agreement performance was relatively good, with accuracy at 70.9% within 30 ms for hmalign and 65.7% within 30 ms for p2fa. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in hmalign's phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. Overall, using existing alignment systems was found to have potential for making phonetic analysis of small corpora more efficient, with more allophonic phone sets providing better agreement than general ones.
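The agreement metric reported above can be sketched as the share of automatically placed segment boundaries falling within a tolerance (here 30 ms) of the manually placed ones; the boundary times below are invented.

```python
# Sketch of boundary agreement within a tolerance, as used to compare
# forced-alignment output against manual segmentation.
def boundary_agreement(auto, manual, tolerance=0.030):
    assert len(auto) == len(manual)
    hits = sum(abs(a - m) <= tolerance for a, m in zip(auto, manual))
    return hits / len(manual)

manual_boundaries = [0.120, 0.310, 0.555, 0.790]   # seconds (invented)
auto_boundaries   = [0.128, 0.305, 0.601, 0.795]
print(f"{boundary_agreement(auto_boundaries, manual_boundaries):.1%}")  # -> 75.0%
```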
On the Reconstruction of Text Phylogeny Trees: Evaluation and Analysis of Textual Relationships
Marmerola, Guilherme D.; Dias, Zanoni; Goldenstein, Siome; Rocha, Anderson
2016-01-01
Over the history of mankind, textual records change. Sometimes due to mistakes during transcription, sometimes on purpose, as a way to rewrite facts and reinterpret history. There are several classical cases, such as the logarithmic tables, and the transmission of antique and medieval scholarship. Today, text documents are largely edited and redistributed on the Web. Articles on news portals and collaborative platforms (such as Wikipedia), source code, posts on social networks, and even scientific publications or literary works are some examples in which textual content can be subject to changes in an evolutionary process. In this scenario, given a set of near-duplicate documents, it is worthwhile to find which one is the original and the history of changes that created the whole set. Such functionality would have immediate applications in news tracking services, detection of plagiarism, textual criticism, and copyright enforcement, for instance. However, this is not an easy task, as textual features pointing to the documents’ evolutionary direction may not be evident and are often dataset dependent. Moreover, side information, such as time stamps, is neither always available nor reliable. In this paper, we propose a framework for reliably reconstructing text phylogeny trees, and seamlessly exploring new approaches on a wide range of scenarios of text reuse. We employ and evaluate distinct combinations of dissimilarity measures and reconstruction strategies within the proposed framework, and evaluate each approach with extensive experiments, including a set of artificial near-duplicate documents with known phylogeny and documents collected from Wikipedia, whose modifications were made by Internet users. We also present results from qualitative experiments in two different applications: text plagiarism and reconstruction of evolutionary trees for manuscripts (stemmatology). PMID:27992446
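One simple reconstruction strategy in this spirit is sketched below: compute pairwise dissimilarities between the near-duplicate texts and join them with a minimum spanning tree. Plain character-level edit distance and an undirected tree are stand-ins for the dissimilarity measures and directed reconstruction strategies evaluated in the paper.

```python
# Sketch: pairwise edit distances between near-duplicate texts joined by a
# minimum spanning tree as a crude phylogeny reconstruction.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def edit_distance(a, b):
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1,
                          d[i - 1, j - 1] + (a[i - 1] != b[j - 1]))
    return d[len(a), len(b)]

versions = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox leaps over the lazy dog",
    "a quick brown fox leaps over a lazy dog",
]
n = len(versions)
dist = np.array([[edit_distance(versions[i], versions[j]) for j in range(n)]
                 for i in range(n)])
tree = minimum_spanning_tree(dist).toarray()
print(tree)   # nonzero entries are the edges of the reconstructed tree
```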
Tuarob, Suppawong; Tucker, Conrad S; Salathe, Marcel; Ram, Nilam
2014-06-01
The role of social media as a source of timely and massive information has become more apparent since the era of Web 2.0. Multiple studies illustrated the use of information in social media to discover biomedical and health-related knowledge. Most methods proposed in the literature employ traditional document classification techniques that represent a document as a bag of words. These techniques work well when documents are rich in text and conform to standard English; however, they are not optimal for social media data where sparsity and noise are norms. This paper aims to address the limitations posed by the traditional bag-of-words-based methods and proposes to use heterogeneous features in combination with ensemble machine learning techniques to discover health-related information, which could prove to be useful to multiple biomedical applications, especially those needing to discover health-related knowledge in large-scale social media data. Furthermore, the proposed methodology could be generalized to discover different types of information in various kinds of textual data. Social media data is characterized by an abundance of short social-oriented messages that do not conform to standard languages, both grammatically and syntactically. The problem of discovering health-related knowledge in social media data streams is then transformed into a text classification problem, where a text is identified as positive if it is health-related and negative otherwise. We first identify the limitations of the traditional methods which train machines with N-gram word features, then propose to overcome such limitations by utilizing the collaboration of machine learning based classifiers, each of which is trained to learn a semantically different aspect of the data. The parameter analysis for tuning each classifier is also reported. Three data sets are used in this research. The first data set comprises approximately 5000 hand-labeled tweets, and is used for cross validation of the classification models in the small-scale experiment, and for training the classifiers in the real-world large-scale experiment. The second data set is a random sample of real-world Twitter data in the US. The third data set is a random sample of real-world Facebook Timeline posts. Two sets of evaluations are conducted to investigate the proposed model's ability to discover health-related information in the social media domain: small-scale and large-scale evaluations. The small-scale evaluation employs 10-fold cross validation on the labeled data, and aims to tune parameters of the proposed models, and to compare with the state-of-the-art method. The large-scale evaluation tests the trained classification models on the native, real-world data sets, and is needed to verify the ability of the proposed model to handle the massive heterogeneity in real-world social media. The small-scale experiment reveals that the proposed method is able to mitigate the limitations in the well-established techniques existing in the literature, resulting in a performance improvement of 18.61% (F-measure). The large-scale experiment further reveals that the baseline fails to perform well on larger data with higher degrees of heterogeneity, while the proposed method is able to yield reasonably good performance and outperform the baseline by 46.62% (F-measure) on average. Copyright © 2014 Elsevier Inc. All rights reserved.
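The ensemble idea can be sketched as training separate classifiers on different feature views of the same short texts and averaging their predicted probabilities; the word n-gram and character n-gram views below are simplified stand-ins for the heterogeneous features described above, and the example posts and labels are invented.

```python
# Sketch of an ensemble over two feature views (word n-grams vs. character
# n-grams) with soft-vote averaging of predicted probabilities.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

posts = [
    "down with the flu again, fever and chills all night",
    "cant stop coughing, think i caught something at work",
    "new phone arrived today, battery life is amazing",
    "traffic on the bridge is unreal this morning",
]
labels = [1, 1, 0, 0]   # 1 = health-related (toy labels)

views = [
    make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                  LogisticRegression(max_iter=1000)),
    make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5)),
                  LogisticRegression(max_iter=1000)),
]
for clf in views:
    clf.fit(posts, labels)

test = ["my throat hurts so bad, probably strep"]
avg_prob = np.mean([clf.predict_proba(test)[:, 1] for clf in views], axis=0)
print(avg_prob)   # ensemble probability that the post is health-related
```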
Semiotic indexing of digital resources
Parker, Charles T; Garrity, George M
2014-12-02
A method of classifying a plurality of documents. The method includes steps of providing a first set of classification terms and a second set of classification terms, the second set of classification terms being different from the first set of classification terms; generating a first frequency array of a number of occurrences of each term from the first set of classification terms in each document; generating a second frequency array of a number of occurrences of each term from the second set of classification terms in each document; generating a first similarity matrix from the first frequency array; generating a second similarity matrix from the second frequency array; determining an entrywise combination of the first similarity matrix and the second similarity matrix; and clustering the plurality of documents based on the result of the entrywise combination.
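The arithmetic skeleton of the method can be sketched as follows: a frequency array for each of two classification-term sets, a similarity matrix computed from each array, an entrywise combination of the two matrices, and clustering on the result. The cosine similarity, averaging combination, and average-link clustering below are illustrative choices, not necessarily those of the patent.

```python
# Sketch: two frequency arrays -> two similarity matrices -> entrywise
# combination -> clustering of the documents.
import numpy as np
from scipy.cluster.hierarchy import average, fcluster
from scipy.spatial.distance import squareform
from sklearn.metrics.pairwise import cosine_similarity

docs = ["oak seed production and masting", "acorn crops in oak stands",
        "metadata standards for document description", "document metadata element sets"]
terms_a = ["oak", "seed", "acorn", "masting"]             # first classification-term set
terms_b = ["metadata", "document", "standards", "sets"]   # second classification-term set

def frequency_array(documents, terms):
    return np.array([[doc.split().count(t) for t in terms] for doc in documents])

sim_a = cosine_similarity(frequency_array(docs, terms_a))
sim_b = cosine_similarity(frequency_array(docs, terms_b))
combined = (sim_a + sim_b) / 2.0          # entrywise combination of the two matrices

dist = 1.0 - combined                     # turn combined similarity into a distance
np.fill_diagonal(dist, 0.0)
labels = fcluster(average(squareform(dist, checks=False)), t=2, criterion="maxclust")
print(labels)                             # two clusters: oak docs vs. metadata docs
```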
KAT: A Flexible XML-based Knowledge Authoring Environment
Hulse, Nathan C.; Rocha, Roberto A.; Del Fiol, Guilherme; Bradshaw, Richard L.; Hanna, Timothy P.; Roemer, Lorrie K.
2005-01-01
As part of an enterprise effort to develop new clinical information systems at Intermountain Health Care, the authors have built a knowledge authoring tool that facilitates the development and refinement of medical knowledge content. At present, users of the application can compose order sets and an assortment of other structured clinical knowledge documents based on XML schemas. The flexible nature of the application allows the immediate authoring of new types of documents once an appropriate XML schema and accompanying Web form have been developed and stored in a shared repository. The need for a knowledge acquisition tool stems largely from the desire for medical practitioners to be able to write their own content for use within clinical applications. We hypothesize that medical knowledge content for clinical use can be successfully created and maintained through XML-based document frameworks containing structured and coded knowledge. PMID:15802477
Allred, Sharon K; Smith, Kevin F; Flowers, Laura
2004-01-01
With the increased interest in evidence-based medicine, Internet access and the growing emphasis on national standards, there is an increased challenge for teaching institutions and nursing services to teach and implement standards. At the same time, electronic clinical documentation tools have started to become a common format for recording nursing notes. The major aim of this paper is to ascertain and assess the availability of clinical nursing tools based on the NANDA, NOC and NIC standards. Faculty at 20 large nursing schools and directors of nursing at 20 hospitals were interviewed regarding the use of nursing standards in clinical documentation packages, not only for teaching purposes but also for use in hospital-based systems to ensure patient safety. A survey tool was utilized that covered questions regarding what nursing standards are being taught in the nursing schools, what standards are encouraged by the hospitals, and teaching initiatives that include clinical documentation tools. Information was collected on how utilizing these standards in a clinical or hospital setting can improve the overall quality of care. Analysis included univariate and bivariate analysis. The consensus between both groups was that the NANDA, NOC and NIC national standards are the most widely taught and utilized. In addition, a training initiative was identified within a large university where a clinical documentation system based on these standards was developed utilizing handheld devices.
Roberge, Jean-Michel; Lämås, Tomas; Lundmark, Tomas; Ranius, Thomas; Felton, Adam; Nordin, Annika
2015-05-01
Over previous decades new environmental measures have been implemented in forestry. In Fennoscandia, forest management practices were modified to set aside conservation areas and to retain trees at final felling. In this study we simulated the long-term effects of set-aside establishment and tree retention practices on the future availability of large trees and dead wood, two forest structures of documented importance to biodiversity conservation. Using a forest decision support system (Heureka), we projected the amounts of these structures over 200 years in two managed north Swedish landscapes, under management scenarios with and without set-asides and tree retention. In line with common best practice, we simulated set-asides covering 5% of the productive area with priority to older stands, as well as ∼5% green-tree retention (solitary trees and forest patches) including high-stump creation at final felling. We found that only tree retention contributed to substantial increases in the future density of large (DBH ≥35 cm) deciduous trees, while both measures made significant contributions to the availability of large conifers. It took more than half a century to observe stronger increases in the densities of large deciduous trees as an effect of tree retention. The mean landscape-scale volumes of hard dead wood fluctuated widely, but the conservation measures yielded values which were, on average over the entire simulation period, about 2.5 times as high as for scenarios without these measures. While the density of large conifers increased with time in the landscape initially dominated by younger forest, best practice conservation measures did not avert a long-term decrease in large conifer density in the landscape initially comprised of more old forest. Our results highlight the needs to adopt a long temporal perspective and to consider initial landscape conditions when evaluating the large-scale effects of conservation measures on forest biodiversity. Copyright © 2015 Elsevier Ltd. All rights reserved.
SkData: data sets and algorithm evaluation protocols in Python
NASA Astrophysics Data System (ADS)
Bergstra, James; Pinto, Nicolas; Cox, David D.
2015-01-01
Machine learning benchmark data sets come in all shapes and sizes, whereas classification algorithms assume sanitized input, such as (x, y) pairs with vector-valued input x and integer class label y. Researchers and practitioners know all too well how tedious it can be to get from the URL of a new data set to a NumPy ndarray suitable for e.g. pandas or sklearn. The SkData library handles that work for a growing number of benchmark data sets (small and large) so that one-off in-house scripts for downloading and parsing data sets can be replaced with library code that is reliable, community-tested, and documented. The SkData library also introduces an open-ended formalization of training and testing protocols that facilitates direct comparison with published research. This paper describes the usage and architecture of the SkData library.
Handling a Large Collection of PDF Documents
You have several options for making a large collection of PDF documents more accessible to your audience: avoid uploading altogether, use multiple document pages, and use document IDs as anchors for direct links within a document page.
Ontology-Driven Search and Triage: Design of a Web-Based Visual Interface for MEDLINE.
Demelo, Jonathan; Parsons, Paul; Sedig, Kamran
2017-02-02
Diverse users need to search health and medical literature to satisfy open-ended goals such as making evidence-based decisions and updating their knowledge. However, doing so is challenging due to at least two major difficulties: (1) articulating information needs using accurate vocabulary and (2) dealing with large document sets returned from searches. Common search interfaces such as PubMed do not provide adequate support for exploratory search tasks. Our objective was to improve support for exploratory search tasks by combining two strategies in the design of an interactive visual interface by (1) using a formal ontology to help users build domain-specific knowledge and vocabulary and (2) providing multi-stage triaging support to help mitigate the information overload problem. We developed a Web-based tool, Ontology-Driven Visual Search and Triage Interface for MEDLINE (OVERT-MED), to test our design ideas. We implemented a custom searchable index of MEDLINE, which comprises approximately 25 million document citations. We chose a popular biomedical ontology, the Human Phenotype Ontology (HPO), to test our solution to the vocabulary problem. We implemented multistage triaging support in OVERT-MED, with the aid of interactive visualization techniques, to help users deal with large document sets returned from searches. Formative evaluation suggests that the design features in OVERT-MED are helpful in addressing the two major difficulties described above. Using a formal ontology seems to help users articulate their information needs with more accurate vocabulary. In addition, multistage triaging combined with interactive visualizations shows promise in mitigating the information overload problem. Our strategies appear to be valuable in addressing the two major problems in exploratory search. Although we tested OVERT-MED with a particular ontology and document collection, we anticipate that our strategies can be transferred successfully to other contexts. ©Jonathan Demelo, Paul Parsons, Kamran Sedig. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 02.02.2017.
Robust Requirements Tracing via Internet Search Technology: Improving an IV and V Technique. Phase 2
NASA Technical Reports Server (NTRS)
Hayes, Jane; Dekhtyar, Alex
2004-01-01
There are three major objectives to this phase of the work. (1) Improvement of Information Retrieval (IR) methods for Independent Verification and Validation (IV&V) requirements tracing. Information Retrieval methods are typically developed for very large document collections (on the order of millions to tens of millions of documents or more) and, therefore, the most successful methods somewhat sacrifice precision and recall in order to achieve efficiency. At the same time, typical IR systems treat all user queries as independent of each other and assume that relevance of documents to queries is subjective for each user. The IV&V requirements tracing problem has a much smaller data set to operate on, even for large software development projects; the set of queries is predetermined by the high-level specification document, and individual requirements considered as query input to IR methods are not necessarily independent of each other. Namely, knowledge about the links for one requirement may be helpful in determining the links of another requirement. Finally, while the final decision on the exact form of the traceability matrix still belongs to the IV&V analyst, his/her decisions are much less arbitrary than those of an Internet search engine user. All this suggests that the information available to us in the framework of the IV&V tracing problem can be successfully leveraged to enhance standard IR techniques, which in turn would lead to increased recall and precision. We developed several new methods during Phase II. (2) IV&V requirements tracing IR toolkit. Based on the methods developed in Phase I and their improvements developed in Phase II, we built a toolkit of IR methods for IV&V requirements tracing. The toolkit has been integrated, at the data level, with SAIC's SuperTracePlus (STP) tool. (3) Toolkit testing. We tested the methods included in the IV&V requirements tracing IR toolkit on a number of projects.
Ontology-Driven Search and Triage: Design of a Web-Based Visual Interface for MEDLINE
2017-01-01
Background Diverse users need to search health and medical literature to satisfy open-ended goals such as making evidence-based decisions and updating their knowledge. However, doing so is challenging due to at least two major difficulties: (1) articulating information needs using accurate vocabulary and (2) dealing with large document sets returned from searches. Common search interfaces such as PubMed do not provide adequate support for exploratory search tasks. Objective Our objective was to improve support for exploratory search tasks by combining two strategies in the design of an interactive visual interface by (1) using a formal ontology to help users build domain-specific knowledge and vocabulary and (2) providing multi-stage triaging support to help mitigate the information overload problem. Methods We developed a Web-based tool, Ontology-Driven Visual Search and Triage Interface for MEDLINE (OVERT-MED), to test our design ideas. We implemented a custom searchable index of MEDLINE, which comprises approximately 25 million document citations. We chose a popular biomedical ontology, the Human Phenotype Ontology (HPO), to test our solution to the vocabulary problem. We implemented multistage triaging support in OVERT-MED, with the aid of interactive visualization techniques, to help users deal with large document sets returned from searches. Results Formative evaluation suggests that the design features in OVERT-MED are helpful in addressing the two major difficulties described above. Using a formal ontology seems to help users articulate their information needs with more accurate vocabulary. In addition, multistage triaging combined with interactive visualizations shows promise in mitigating the information overload problem. Conclusions Our strategies appear to be valuable in addressing the two major problems in exploratory search. Although we tested OVERT-MED with a particular ontology and document collection, we anticipate that our strategies can be transferred successfully to other contexts. PMID:28153818
Organ donation in the ICU: A document analysis of institutional policies, protocols, and order sets.
Oczkowski, Simon J W; Centofanti, John E; Durepos, Pamela; Arseneau, Erika; Kelecevic, Julija; Cook, Deborah J; Meade, Maureen O
2018-04-01
To better understand how local policies influence organ donation rates, we conducted a document analysis of our ICU organ donation policies, protocols, and order sets. We used a systematic search of our institution's policy library to identify documents related to organ donation. We used Mindnode software to create a publication timeline, basic statistics to describe document characteristics, and qualitative content analysis to extract document themes. Documents were retrieved from Hamilton Health Sciences, an academic hospital system with a high volume of organ donation, from database inception to October 2015. We retrieved 12 active organ donation documents, including six protocols, two policies, two order sets, and two unclassified documents, a majority (75%) after the introduction of donation after circulatory death in 2006. Four major themes emerged: organ donation process, quality of care, patient and family-centred care, and the role of the institution. These themes indicate areas where documented institutional standards may be beneficial. Further research is necessary to determine the relationship of local policies, protocols, and order sets to actual organ donation practices, and to identify barriers and facilitators to improving donation rates. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kolchinsky, A; Lourenço, A; Li, L; Rocha, L M
2013-01-01
Drug-drug interaction (DDI) is a major cause of morbidity and mortality. DDI research includes the study of different aspects of drug interactions, from in vitro pharmacology, which deals with drug interaction mechanisms, to pharmaco-epidemiology, which investigates the effects of DDI on drug efficacy and adverse drug reactions. Biomedical literature mining can aid both kinds of approaches by extracting relevant DDI signals from either the published literature or large clinical databases. However, though drug interaction is an ideal area for translational research, the inclusion of literature mining methodologies in DDI workflows is still very preliminary. One area that can benefit from literature mining is the automatic identification of a large number of potential DDIs, whose pharmacological mechanisms and clinical significance can then be studied via in vitro pharmacology and in populo pharmaco-epidemiology. We implemented a set of classifiers for identifying published articles relevant to experimental pharmacokinetic DDI evidence. These documents are important for identifying causal mechanisms behind putative drug-drug interactions, an important step in the extraction of large numbers of potential DDIs. We evaluate performance of several linear classifiers on PubMed abstracts, under different feature transformation and dimensionality reduction methods. In addition, we investigate the performance benefits of including various publicly-available named entity recognition features, as well as a set of internally-developed pharmacokinetic dictionaries. We found that several classifiers performed well in distinguishing relevant and irrelevant abstracts. We found that the combination of unigram and bigram textual features gave better performance than unigram features alone, and also that normalization transforms that adjusted for feature frequency and document length improved classification. For some classifiers, such as linear discriminant analysis (LDA), proper dimensionality reduction had a large impact on performance. Finally, the inclusion of NER features and dictionaries was found not to help classification.
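As a rough illustration of the classification setup described above (unigram plus bigram features with frequency and length normalization, fed to a linear classifier), a minimal sklearn pipeline might look like the following. The corpus variables, the choice of classifier, and all parameter values are assumptions made for the sketch, not the authors' exact configuration, which also included NER features and pharmacokinetic dictionaries.

```python
# Illustrative sketch: classify abstracts as relevant/irrelevant to
# pharmacokinetic DDI evidence. Assumes `texts` (list of abstract strings)
# and `labels` (0/1) are supplied by the caller; not the authors' exact setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def build_ddi_classifier():
    # Unigram + bigram features, with TF-IDF style normalization that
    # adjusts for feature frequency and document length (l2 norm).
    vectorizer = TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True,
                                 min_df=2, norm="l2")
    return make_pipeline(vectorizer, LogisticRegression(max_iter=1000))

def evaluate(texts, labels):
    clf = build_ddi_classifier()
    # 5-fold cross-validated F1 as a rough performance estimate.
    return cross_val_score(clf, texts, labels, cv=5, scoring="f1").mean()
```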
The BioPrompt-box: an ontology-based clustering tool for searching in biological databases.
Corsi, Claudio; Ferragina, Paolo; Marangoni, Roberto
2007-03-08
High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This scenario generates severe problems not only at indexing time, where suitable algorithmic techniques for data indexing and retrieval are required, but also at query time, since a user query may produce such a large set of results that their browsing and "understanding" becomes humanly impractical. This problem is well known to the Web community, where a new generation of Web search engines is being developed, like Vivisimo. These tools organize on-the-fly the results of a user query in a hierarchy of labeled folders that ease their browsing and knowledge extraction. We investigate this approach on biological data, and propose the so-called BioPrompt-box software system, which deploys ontology-driven clustering strategies for making the searching process of biologists more efficient and effective. The BioPrompt-box (Bpb) defines a document as a biological sequence plus its associated meta-data taken from the underlying databank, like references to ontologies or to external databanks, and plain texts as comments of researchers and (title, abstracts or even body of) papers. Bpb offers several tools to customize the search and the clustering process over its indexed documents. The user can search a set of keywords within a specific field of the document schema, or can execute Blast to find documents relative to homologue sequences. In both cases the search task returns a set of documents (hits) which constitute the answer to the user query. Since the number of hits may be large, Bpb clusters them into groups of homogeneous content, organized as a hierarchy of labeled clusters. The user can actually choose among several ontology-based hierarchical clustering strategies, each offering a different "view" of the returned hits. Bpb computes these views by exploiting the meta-data present within the retrieved documents such as the references to Gene Ontology, the taxonomy lineage, the organism and the keywords. Of course, the approach is flexible enough to leave room for future additions of other meta-information. The ultimate goal of the clustering process is to provide the user with several different readings of the (maybe numerous) query results and show possible hidden correlations among them, thus improving their browsing and understanding. Bpb is a powerful search engine that makes it very easy to perform complex queries over the indexed databanks (currently only UNIPROT is considered). The ontology-based clustering approach is efficient and effective, and could thus be applied successfully to larger databanks, like GenBank or EMBL.
The BioPrompt-box: an ontology-based clustering tool for searching in biological databases
Corsi, Claudio; Ferragina, Paolo; Marangoni, Roberto
2007-01-01
Background High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This scenario generates severe problems not only at indexing time, where suitable algorithmic techniques for data indexing and retrieval are required, but also at query time, since a user query may produce such a large set of results that their browsing and "understanding" becomes humanly impractical. This problem is well known to the Web community, where a new generation of Web search engines is being developed, like Vivisimo. These tools organize on-the-fly the results of a user query in a hierarchy of labeled folders that ease their browsing and knowledge extraction. We investigate this approach on biological data, and propose the so-called BioPrompt-box software system, which deploys ontology-driven clustering strategies for making the searching process of biologists more efficient and effective. Results The BioPrompt-box (Bpb) defines a document as a biological sequence plus its associated meta-data taken from the underlying databank, like references to ontologies or to external databanks, and plain texts as comments of researchers and (title, abstracts or even body of) papers. Bpb offers several tools to customize the search and the clustering process over its indexed documents. The user can search a set of keywords within a specific field of the document schema, or can execute Blast to find documents relative to homologue sequences. In both cases the search task returns a set of documents (hits) which constitute the answer to the user query. Since the number of hits may be large, Bpb clusters them into groups of homogeneous content, organized as a hierarchy of labeled clusters. The user can actually choose among several ontology-based hierarchical clustering strategies, each offering a different "view" of the returned hits. Bpb computes these views by exploiting the meta-data present within the retrieved documents such as the references to Gene Ontology, the taxonomy lineage, the organism and the keywords. Of course, the approach is flexible enough to leave room for future additions of other meta-information. The ultimate goal of the clustering process is to provide the user with several different readings of the (maybe numerous) query results and show possible hidden correlations among them, thus improving their browsing and understanding. Conclusion Bpb is a powerful search engine that makes it very easy to perform complex queries over the indexed databanks (currently only UNIPROT is considered). The ontology-based clustering approach is efficient and effective, and could thus be applied successfully to larger databanks, like GenBank or EMBL. PMID:17430575
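One way to picture the ontology-based clustering of hits described in the two records above is to group retrieved documents by a shared annotation (a Gene Ontology term, taxonomy level, organism, or keyword) and label each folder with that annotation. The sketch below is flat and minimal, under the assumption that each hit already carries an 'annotations' list; the actual system builds labeled hierarchies from several kinds of meta-data.

```python
# Minimal sketch: group search hits into labeled clusters by shared
# ontology annotation. Each hit is assumed to be a dict carrying an
# 'annotations' list (e.g. GO terms, taxonomy lineage, keywords).
from collections import defaultdict

def cluster_hits(hits, key="annotations"):
    clusters = defaultdict(list)
    for hit in hits:
        for label in (hit.get(key) or ["unannotated"]):
            clusters[label].append(hit)
    # Largest clusters first; each label doubles as the folder name.
    return sorted(clusters.items(), key=lambda kv: len(kv[1]), reverse=True)
```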
Tower-Perturbation Measurements in Above-Water Radiometry
NASA Technical Reports Server (NTRS)
Hooker, Stanford B. (Editor); Firestone, Elaine R. (Editor); Zibordi, Giuseppe; Berthon, Jean-Francois; D'Alimonte, Davide; vanderLinde, Dirk; Brown, James W.
2003-01-01
This report documents the scientific activities which took place during June 2001 and June 2002 on the Acqua Alta Oceanographic Tower (AAOT) in the northern Adriatic Sea. The primary objective of these field campaigns was to quantify the effect of platform perturbations (principally reflections of sunlight onto the sea surface) on above-water measurements of water-leaving radiances. The deployment goals documented in this report were to: a) collect an extensive and simultaneous set of above- and in-water optical measurements under predominantly clear-sky conditions; b) establish the vertical properties of the water column using a variety of ancillary measurements, many of which were taken coincidently with the optical measurements; and c) determine the bulk properties of the environment using a diversity of atmospheric, biogeochemical, and meteorological techniques. A preliminary assessment of the data collected during the two field campaigns shows the perturbation in above-water radiometry caused by a large offshore structure is very similar to that caused by a large research vessel.
Tower-Perturbation Measurements in Above-Water Radiometry. Volume 23
NASA Technical Reports Server (NTRS)
Hooker, Stanford B. (Editor); Firestone, Elaine R. (Editor); Zibordi, Giuseppe; Berthon, Jean-Francois; D'Alimonte, Davide; vanderLinde, Dirk; Brown, James W.
2003-01-01
This report documents the scientific activities which took place during June 2001 and June 2002 on the Acqua Alta Oceanographic Tower (AAOT) in the northern Adriatic Sea. The primary objective of these field campaigns was to quantify the effect of platform perturbations (principally reflections of sunlight onto the sea surface) on above-water measurements of water-leaving radiances. The deployment goals documented in this report were to: a) collect an extensive and simultaneous set of above- and in-water optical measurements under predominantly clear-sky conditions; b) establish the vertical properties of the water column using a variety of ancillary measurements, many of which were taken coincidently with the optical measurements; and c) determine the bulk properties of the environment using a diversity of atmospheric, biogeochemical, and meteorological techniques. A preliminary assessment of the data collected during the two field campaigns shows the perturbation in above-water radiometry caused by a large offshore structure is very similar to that caused by a large research vessel.
Yu, Zhiguo; Nguyen, Thang; Dhombres, Ferdinand; Johnson, Todd; Bodenreider, Olivier
2018-01-01
Extracting and understanding information, themes and relationships from large collections of documents is an important task for biomedical researchers. Latent Dirichlet Allocation is an unsupervised topic modeling technique using the bag-of-words assumption that has been applied extensively to unveil hidden thematic information within large sets of documents. In this paper, we added MeSH descriptors to the bag-of-words assumption to generate ‘hybrid topics’, which are mixed vectors of words and descriptors. We evaluated this approach on the quality and interpretability of topics in both a general corpus and a specialized corpus. Our results demonstrated that the coherence of ‘hybrid topics’ is higher than that of regular bag-of-words topics in the specialized corpus. We also found that the proportion of topics that are not associated with MeSH descriptors is higher in the specialized corpus than in the general corpus. PMID:29295179
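The 'hybrid topics' idea above can be sketched with an off-the-shelf LDA implementation: each document's token list is simply extended with its MeSH descriptors, prefixed so they remain distinguishable from free-text words. The gensim calls below are standard; the corpus, descriptor lists, prefix convention, and topic count are placeholders rather than the authors' actual settings.

```python
# Sketch of 'hybrid topics': LDA over bags of words augmented with MeSH
# descriptors. `docs` is a list of token lists, `mesh` a parallel list of
# descriptor lists; both are assumed to be provided by the caller.
from gensim import corpora, models

def hybrid_lda(docs, mesh, num_topics=50):
    hybrid_docs = [tokens + ["MESH_" + d.replace(" ", "_") for d in descriptors]
                   for tokens, descriptors in zip(docs, mesh)]
    dictionary = corpora.Dictionary(hybrid_docs)
    corpus = [dictionary.doc2bow(doc) for doc in hybrid_docs]
    lda = models.LdaModel(corpus=corpus, id2word=dictionary,
                          num_topics=num_topics, passes=5, random_state=0)
    # Each topic is a mixed vector of words and MESH_* descriptors.
    return lda, lda.show_topics(num_topics=num_topics, num_words=10)
```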
Biomedical information retrieval across languages.
Daumke, Philipp; Markó, Kornél; Poprat, Michael; Schulz, Stefan; Klar, Rüdiger
2007-06-01
This work presents a new dictionary-based approach to biomedical cross-language information retrieval (CLIR) that addresses many of the general and domain-specific challenges in current CLIR research. Our method is based on a multilingual lexicon that was generated partly manually and partly automatically, and currently covers six European languages. It contains morphologically meaningful word fragments, termed subwords. Using subwords instead of entire words significantly reduces the number of lexical entries necessary to sufficiently cover a specific language and domain. Mediation between queries and documents is based on these subwords as well as on lists of word-n-grams that are generated from large monolingual corpora and constitute possible translation units. The translations are then sent to a standard Internet search engine. This process makes our approach an effective tool for searching the biomedical content of the World Wide Web in different languages. We evaluate this approach using the OHSUMED corpus, a large medical document collection, within a cross-language retrieval setting.
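The subword idea can be illustrated with a greedy longest-match segmenter run against a small, entirely hypothetical lexicon; the real multilingual lexicon described above is far larger and maps subwords to language-independent senses, and the n-gram mediation step is not shown here.

```python
# Illustrative greedy longest-match segmentation of words into subwords
# using a toy lexicon; the actual multilingual subword lexicon described
# above is much larger and links subwords across languages.
TOY_LEXICON = {"gastr", "enter", "itis", "append", "ectomy", "cardi", "o"}

def segment(word, lexicon=TOY_LEXICON):
    word = word.lower()
    subwords, i = [], 0
    while i < len(word):
        match = None
        for j in range(len(word), i, -1):       # longest match first
            if word[i:j] in lexicon:
                match = word[i:j]
                break
        if match is None:                       # unknown character: skip it
            i += 1
            continue
        subwords.append(match)
        i += len(match)
    return subwords

# segment("gastroenteritis") -> ['gastr', 'o', 'enter', 'itis']
```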
NASA Astrophysics Data System (ADS)
Suzuki, Izumi; Mikami, Yoshiki; Ohsato, Ario
A technique that acquires documents in the same category as a given short text is introduced. Regarding the given text as a training document, the system marks the most similar document, or sufficiently similar documents, from the document domain (or the entire Web). The system then adds the marked documents to the training set, learns from the expanded set, and repeats this process until no more documents are marked. Imposing a monotone increasing property on the similarity as the system learns enables it to 1) detect the correct point at which no more documents remain to be marked and 2) decide the threshold value that the classifier uses. In addition, under the condition that normalization is limited to dividing term weights by a p-norm of the weights, the linear classifier in which training documents are indexed in a binary manner is the only instance that satisfies the monotone increasing property. The feasibility of the proposed technique was confirmed through an examination of binary similarity, using English and German documents randomly selected from the Web.
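A highly simplified sketch of the iterative marking loop described above, assuming binary term vectors normalized by their p-norm (here p=2, giving cosine-style similarity) and a fixed threshold. The paper's stopping rule and threshold selection are subtler; this only illustrates the grow-the-training-set idea.

```python
# Simplified sketch of the iterative acquisition loop: start from one seed
# text, repeatedly mark the most similar unmarked document, and stop when
# no remaining document exceeds the threshold. Binary indexing plus p-norm
# normalization follow the abstract; the real stopping criterion differs.
import numpy as np

def acquire(seed_vec, doc_vecs, threshold=0.3, p=2):
    def normalize(v):
        n = np.linalg.norm(v, ord=p)
        return v / n if n > 0 else v

    training = [normalize(np.asarray(seed_vec, dtype=float))]
    unmarked = {i: normalize(np.asarray(v, dtype=float))
                for i, v in enumerate(doc_vecs)}
    marked = []
    while unmarked:
        centroid = normalize(np.mean(training, axis=0))
        scores = {i: float(centroid @ v) for i, v in unmarked.items()}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break                               # no more documents to mark
        marked.append(best)
        training.append(unmarked.pop(best))     # learn the marked document
    return marked
```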
Kopanz, Julia; Lichtenegger, Katharina M; Sendlhofer, Gerald; Semlitsch, Barbara; Cuder, Gerald; Pak, Andreas; Pieber, Thomas R; Tax, Christa; Brunner, Gernot; Plank, Johannes
2018-02-09
Insulin charts represent a key component in the inpatient glycemic management process. The aim was to evaluate the quality of structure, documentation, and treatment of diabetic inpatient care to design a new standardized insulin chart for a large university hospital setting. Blank insulin charts that had evolved over time and were in use on 39 general wards were collected and evaluated for quality structure features. Documentation and treatment quality were evaluated in a consecutive snapshot audit of filled-in charts. The primary end point was the percentage of charts with any medication error. Overall, 20 different blank insulin charts with variable designs and significant structural deficits were identified. A medication error occurred in 55% of the 102 audited filled-in insulin charts, consisting of prescription and management errors in 48% and 16%, respectively. Charts of insulin-treated patients had more medication errors relative to patients treated with oral medication (P < 0.01). Chart design supported neither clinical authorization of individual insulin prescriptions (10%), nor confirmation of insulin administration by nurses' signature (25%), nor treatment of hypoglycemia (0%), which resulted in reduced documentation and treatment quality in clinical practice (7%, 30%, and 25%, respectively). A multitude of charts with variable design characteristics and structural deficits were in use across the inpatient wards. More than half of the inpatients had a chart displaying a medication error. Lack of structure quality features of the charts had an impact on documentation and treatment quality. Based on identified deficits and international standards, a new insulin chart was developed to overcome these quality hurdles.
Safran, C
2014-08-15
To provide an overview of the benefits of clinical data collected as a by-product of the care process, the potential problems with large aggregations of these data, the policy frameworks that have been formulated, and the major challenges in the coming years. This report summarizes some of the major observations from AMIA and IMIA conferences held on this admittedly broad topic from 2006 through 2013. This report also includes many unsupported opinions of the author. The benefits of aggregating larger and larger sets of routinely collected clinical data are well documented and of great societal value. These large data sets will probably never answer all possible clinical questions for methodological reasons. Non-traditional, patient-sourced health data will pose new data science challenges. If we ever hope to have tools that can rapidly provide evidence for the daily practice of medicine, we need a science of health data, perhaps modeled after the science of astronomy.
A Study of Large Droplet Ice Accretions in the NASA-Lewis IRT at Near-Freezing Conditions
NASA Technical Reports Server (NTRS)
Miller, Dean R.; Addy, Harold E., Jr.; Ide, Robert F.
1996-01-01
This report documents the results of an experimental study on large droplet ice accretions which was conducted in the NASA-Lewis Icing Research Tunnel (IRT) with a full-scale, 77.25-inch-chord Twin-Otter wing section. This study was intended to: (1) document the existing capability of the IRT to produce a large droplet icing cloud, and (2) study the effect of various parameters on large droplet ice accretions. Results are presented from a study of the IRT's capability to produce large droplets with MVD of 99 and 160 microns. The effect of the initial water droplet temperature on the resultant ice accretion was studied for different initial spray bar air and water temperatures. The initial spray bar water temperature was found to have no discernible effect upon the large droplet ice accretions. Also, analytical and experimental results suggest that the water droplet temperature is very nearly the same as the tunnel ambient temperature, thus providing a realistic simulation of the large droplet natural icing condition. The effect of temperature, droplet size, airspeed, angle-of-attack, flap setting and de-icer boot cycling time on ice accretion was studied, and will be discussed in this report. It was found that, in almost all of the cases studied, an ice ridge formed immediately aft of the active portion of the de-icer boot. This ridge was irregular in shape, varied in location, and was in some cases discontinuous due to aerodynamic shedding.
Analyzing Document Retrievability in Patent Retrieval Settings
NASA Astrophysics Data System (ADS)
Bashir, Shariq; Rauber, Andreas
Most information retrieval settings, such as web search, are typically precision-oriented, i.e. they focus on retrieving a small number of highly relevant documents. However, in specific domains, such as patent retrieval or law, recall becomes more relevant than precision: in these cases the goal is to find all relevant documents, requiring algorithms to be tuned more towards recall at the cost of precision. This raises important questions with respect to retrievability and search engine bias: depending on how the similarity between a query and documents is measured, certain documents may be more or less retrievable in certain systems, up to some documents not being retrievable at all within common threshold settings. Biases may be oriented towards popular documents (increasing the weight of references) or longer documents, may favour the use of rare or common words, or may rely on structural information such as metadata or headings. Existing accessibility measurement techniques are limited as they measure retrievability with respect to all possible queries. In this paper, we improve accessibility measurement by considering sets of relevant and irrelevant queries for each document. This simulates how recall-oriented users create their queries when searching for relevant information. We evaluate retrievability scores using a corpus of patents from the US Patent and Trademark Office.
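Retrievability is commonly defined as a cumulative score over queries: a document earns credit each time it appears within a rank cutoff c for some query. A minimal version of that measurement, restricted to per-document query sets as proposed above, might look like the sketch below; the ranking function, query simulation, and cutoff value are all assumed to be supplied elsewhere.

```python
# Sketch of retrievability scoring with per-document query sets.
# `rank(query, doc_id)` is an assumed ranking function returning the rank
# of doc_id in the result list for `query` (or None if not returned).
def retrievability(doc_ids, relevant_queries, rank, cutoff=100):
    """relevant_queries: dict doc_id -> list of queries simulated for that
    document. Returns dict doc_id -> retrievability score r(d)."""
    scores = {}
    for d in doc_ids:
        r = 0
        for q in relevant_queries.get(d, []):
            pos = rank(q, d)
            if pos is not None and pos <= cutoff:
                r += 1          # document is retrievable via this query
        scores[d] = r
    return scores

# Documents with r(d) == 0 across all of their relevant queries point to a
# retrievability bias of the underlying ranking model.
```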
Ensemble methods with simple features for document zone classification
NASA Astrophysics Data System (ADS)
Obafemi-Ajayi, Tayo; Agam, Gady; Xie, Bingqing
2012-01-01
Document layout analysis is of fundamental importance for document image understanding and information retrieval. It requires the identification of blocks extracted from a document image via features extraction and block classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classifications of these blocks. We present a comparative evaluation of three ensemble based classification algorithms (boosting, bagging, and combined model trees) in addition to other known learning algorithms. Experimental results are demonstrated for a set of 36503 zones extracted from 416 document images which were randomly selected from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in conjunction with the Ocropus feature set, we further improve the performance of the block classification system to obtain a classification accuracy of 99.21%.
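A hedged sketch of how such an ensemble comparison might be run once zone feature vectors are available; the features themselves (the proposed simple features or the Ocropus set), the five class labels, and the estimator settings are assumed to be prepared elsewhere and are not the authors' exact configuration.

```python
# Sketch: compare ensemble classifiers on precomputed zone feature vectors.
# X is an (n_zones, n_features) array, y holds the five zone labels
# (text, handwriting, graphics, image, noise); both assumed given.
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def compare_ensembles(X, y, cv=5):
    models = {
        "boosting": AdaBoostClassifier(n_estimators=200),
        "bagging": BaggingClassifier(DecisionTreeClassifier(),
                                     n_estimators=200),
        "single_tree": DecisionTreeClassifier(),   # non-ensemble baseline
    }
    return {name: cross_val_score(m, X, y, cv=cv).mean()
            for name, m in models.items()}
```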
NASA Technical Reports Server (NTRS)
Staub, B.; Rosenzweig, C.; Rind, D.
1987-01-01
The file structure and coding of four soils data sets derived from the Zobler (1986) world soil file are described. The data were digitized on a one-degree square grid. They are suitable for large-area studies such as climate research with general circulation models, as well as for studies in forestry, agriculture, soils, and hydrology. The first file is a data set of codes for soil unit, land-ice, or water, for all the one-degree square cells on Earth. The second file is a data set of codes for texture, land-ice, or water, for the same soil units. The third file is a data set of codes for slope, land-ice, or water for the same units. The fourth file is the SOILWRLD data set, containing information on soil properties of land cells from both Matthews' and Food and Agriculture Organization (FAO) sources. The fourth file reconciles land-classification differences between the two and has missing data filled in.
Elbogen, Eric B; Tomkins, Alan J; Pothuloori, Antara P; Scalora, Mario J
2003-01-01
Studies have identified risk factors that show a strong association with violent behavior in psychiatric populations. Yet, little research has been conducted on the documentation of violence risk information in actual clinical practice, despite the relevance of such documentation to risk assessment liability and to conducting effective risk management. In this study, the documentation of cues of risk for violence were examined in psychiatric settings. Patient charts (n = 283) in four psychiatric settings were reviewed for documentation of violence risk information summarized in the MacArthur Violence Risk Assessment Study. The results revealed that particular patient and institutional variables influenced documentation practices. The presence of personality disorder, for example, predicted greater documentation of cues of violence risk, regardless of clinical setting. These findings have medicolegal implications for risk assessment liability and clinical implications for optimizing risk management in psychiatric practice.
What Are Red Sprites? An Art and Science Collaboration
NASA Astrophysics Data System (ADS)
McLeish, P.
2013-04-01
Sprites are fleeting luminous shapes that shoot into the upper atmosphere during large thunderstorms as lightning simultaneously reaches down to Earth. For at least a century scientists have attempted to confirm and explain the existence of sprites with visual images and data. Peter McLeish's images, Lightning's Angels, supplement the documentation of sprites by exploring the properties of this natural phenomenon through digitally enhanced oil encaustic paintings set to music in a six-minute film.
Online mass storage system detailed requirements document
NASA Technical Reports Server (NTRS)
1976-01-01
The requirements for an online high-density magnetic tape data storage system that can be implemented in a multipurpose, multihost environment are set forth. The objective of the mass storage system is to provide a facility for the compact storage of large quantities of data and to make this data accessible to computer systems with minimum operator handling. The results of a market survey and analysis of candidate vendors who presently market high-density tape data storage systems are included.
Baillie, Lesley; Thomas, Nicola
2018-01-01
Person-centred care is internationally recognised as best practice for the care of people with dementia. Personal information documents for people with dementia are proposed as a way to support person-centred care in healthcare settings. However, there is little research about how they are used in practice. The aim of this study was to analyse healthcare staff's perceptions and experiences of using personal information documents, mainly Alzheimer's Society's 'This is me', for people with dementia in healthcare settings. The method comprised a secondary thematic analysis of data from a qualitative study of how a dementia awareness initiative affected care for people with dementia in one healthcare organisation. The data were collected through 12 focus groups (n = 58 participants) and one individual interview, conducted with a range of healthcare staff, both clinical and non-clinical. Four themes are presented: understanding the rationale for personal information documents; completing personal information documents; location for personal information documents and transfer between settings; impact of personal information documents in practice. The findings illuminated how healthcare staff use personal information documents in practice in ways that support person-centred care. Practical issues about the use of personal information documents were revealed and these may affect the optimal use of the documents in practice. The study indicated the need to complete personal information documents at an early stage following diagnosis of dementia, and the importance of embedding their use across care settings, to support communication and integrated care.
Documenting Collective Development in Online Settings
ERIC Educational Resources Information Center
Dean, Chrystal; Silverman, Jason
2015-01-01
In this paper the authors explored the question of collective understanding in online mathematics education settings and presented a brief overview of traditional methods for documenting norms and collective mathematical practices. A method for documenting collective development was proposed that builds on existing methods and frameworks yet is…
Collected Data of The Boreal Ecosystem and Atmosphere Study (BOREAS)
NASA Technical Reports Server (NTRS)
Newcomer, J. (Editor); Landis, D. (Editor); Conrad, S. (Editor); Curd, S. (Editor); Huemmrich, K. (Editor); Knapp, D. (Editor); Morrell, A. (Editor); Nickerson, J. (Editor); Papagno, A. (Editor); Rinker, D. (Editor)
2000-01-01
The Boreal Ecosystem-Atmosphere Study (BOREAS) was a large-scale international interdisciplinary climate-ecosystem interaction experiment in the northern boreal forests of Canada. Its goal was to improve our understanding of the boreal forests -- how they interact with the atmosphere, how much CO2 they can store, and how climate change will affect them. BOREAS wanted to learn to use satellite data to monitor the forests, and to improve computer simulation and weather models so scientists can anticipate the effects of global change. This BOREAS CD-ROM set is a set of 12 CD-ROMs containing the finalized point data sets and compressed image data from the BOREAS Project. All point data are stored in ASCII text files, and all image and GIS products are stored as binary images, compressed using GZip. Additional descriptions of the various data sets on this CD-ROM are available in other documents in the BOREAS series.
NASA Astrophysics Data System (ADS)
Yeh, Meng-Wan
2007-05-01
The NE-SW trending gneiss domes around Baltimore, Maryland, USA, have been cited as classic examples of mantled gneiss domes formed by diapiric rise of migmatitic gneisses [Eskola, P., 1949. The problem of mantled gneiss domes. Quarterly Journal of Geological Society of London 104/416, 461-476]. However, 3-D analysis of porphyroblast-matrix foliation relations and porphyroblast inclusion trail geometries suggests that they are the result of interference between multiple refoldings of an early-formed nappe. A succession of six FIA (Foliation Intersection Axes) sets, based upon relative timing of inclusion texture in garnet and staurolite porphyroblasts, revealed six superposed deformation phases. The successions of inclusion trail asymmetries, formed around these FIAs, document the geometry of deformation associated with folding and fabric development during discrete episodes of bulk shortening. Exclusively top-to-NW shear asymmetries of curvature were recorded by inclusion trails associated with the vertical collapsing event within the oldest FIA set (NE-SW trend). This strongly indicates a large NE-SW-striking, NW-verging nappe had formed early during this deformation sequence. This nappe was later folded into NE-SW-trending upright folds by coaxial shortening, indicated by an almost equal proportion of both inclusion trail asymmetries documented by the second N-S-trending FIA set. These folds were then amplified by later deformation, as the following FIA sets showed an almost equal proportion of both inclusion trail asymmetries.
One-click scanning of large-size documents using mobile phone camera
NASA Astrophysics Data System (ADS)
Liu, Sijiang; Jiang, Bo; Yang, Yuanjie
2016-07-01
Currently, mobile apps for document scanning do not provide convenient operations to tackle large-size documents. In this paper, we present a one-click scanning approach for large-size documents using a mobile phone camera. After capturing a continuous video of documents, our approach automatically extracts several key frames by optical flow analysis. Then, based on the key frames, a mobile GPU-based image stitching method is adopted to generate a complete document image with high detail. There is no extra manual intervention in the process, and experimental results show that our app performs well, showing convenience and practicability for daily life.
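A rough sketch of the key-frame selection step using dense optical flow in OpenCV follows; the motion threshold, flow parameters, and any resizing are placeholders rather than the paper's values, and the GPU-based stitching stage is not reproduced.

```python
# Sketch: pick key frames from a document-scanning video by accumulating
# dense optical-flow magnitude between consecutive frames (OpenCV).
# Threshold and flow parameters are placeholders, not the paper's values.
import cv2
import numpy as np

def key_frames(video_path, motion_threshold=40.0):
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        return []
    prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    keys, accumulated = [frame], 0.0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        accumulated += float(np.mean(np.linalg.norm(flow, axis=2)))
        if accumulated >= motion_threshold:    # enough new content: keep it
            keys.append(frame)
            accumulated = 0.0
        prev = gray
    cap.release()
    return keys   # key frames would then be passed to an image stitcher
```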
NASA Astrophysics Data System (ADS)
Zhang, Hui; Wang, Deqing; Wu, Wenjun; Hu, Hongping
2012-11-01
In today's business environment, enterprises are increasingly under pressure to process the vast amount of data produced every day within enterprises. One approach is to focus on business intelligence (BI) applications and to increase commercial added value through such business analytics activities. Term weighting, which converts documents into vectors in the term space, is a vital task in enterprise Information Retrieval (IR), text categorisation, text analytics, etc. When determining term weight in a document, the traditional TF-IDF scheme sets the weight of a term considering only its occurrence frequency within the document and in the entire set of documents, which prevents some meaningful terms from receiving appropriate weights. In this article, we propose a new term weighting scheme called Term Frequency - Function of Document Frequency (TF-FDF) to address this issue. Instead of using a monotonically decreasing function such as Inverse Document Frequency, FDF presents a convex function that dynamically adjusts weights according to the significance of the words in a document set. This function can be manually tuned based on the distribution of the most meaningful words which semantically represent the document set. Our experiments show that TF-FDF achieves higher Normalised Discounted Cumulative Gain in IR than TF-IDF and its variants, improving the accuracy of relevance ranking of the IR results.
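The abstract does not give the exact form of FDF, so the sketch below is purely hypothetical: a convex function of normalized document frequency with manually tunable coefficients stands in for the function described above. Only the overall TF-times-FDF structure, the convexity, and the manual tunability follow the text.

```python
# Hypothetical sketch of a TF-FDF style weighting. The convex function
# fdf() is an assumption made for illustration only; the paper's actual FDF
# is tuned to the distribution of the most meaningful words in the corpus.
def fdf(df, n_docs, a=4.0, b=-4.0, c=1.5):
    """Hypothetical convex function of normalized document frequency
    (a > 0 guarantees convexity); a, b, c are the manually tunable knobs."""
    x = df / float(n_docs)
    return a * x * x + b * x + c

def tf_fdf(tf, df, n_docs):
    # Term weight = term frequency times the (assumed) FDF of the term.
    return tf * fdf(df, n_docs)
```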
NASA Technical Reports Server (NTRS)
Suarez, Max J. (Editor); Schubert, Siegfried; Rood, Richard; Park, Chung-Kyu; Wu, Chung-Yu; Kondratyeva, Yelena; Molod, Andrea; Takacs, Lawrence; Seablom, Michael; Higgins, Wayne
1995-01-01
The Data Assimilation Office (DAO) at Goddard Space Flight Center has produced a multiyear global assimilated data set with version 1 of the Goddard Earth Observing System Data Assimilation System (GEOS-1 DAS). One of the main goals of this project, in addition to benchmarking the GEOS-1 system, was to produce a research quality data set suitable for the study of short-term climate variability. The output, which is global and gridded, includes all prognostic fields and a large number of diagnostic quantities such as precipitation, latent heating, and surface fluxes. Output is provided four times daily with selected quantities available eight times per day. Information about the observations input to the GEOS-1 DAS is provided in terms of maps of spatial coverage, bar graphs of data counts, and tables of all time periods with significant data gaps. The purpose of this document is to serve as a users' guide to NASA's first multiyear assimilated data set and to provide an early look at the quality of the output. Documentation is provided on all the data archives, including sample read programs and methods of data access. Extensive comparisons are made with the corresponding operational European Center for Medium-Range Weather Forecasts analyses, as well as various in situ and satellite observations. This document is also intended to alert users of the data about potential limitations of assimilated data, in general, and the GEOS-1 data, in particular. Results are presented for the period March 1985-February 1990.
Hickman, Susan E; Nelson, Christine A; Smith-Howell, Esther; Hammes, Bernard J
2014-01-01
The Physician Orders for Life-Sustaining Treatment (POLST) documents patient preferences as medical orders that transfer across settings with patients. The objectives were to pilot test methods and gather preliminary data about POLST including (1) use at time of hospital discharge, (2) transfers across settings, and (3) consistency with prior decisions. Descriptive with chart abstraction and interviews. Participants were hospitalized patients discharged to a nursing facility and/or their surrogates in La Crosse County, Wisconsin. POLST forms were abstracted from hospital records for 151 patients. Hospital and nursing facility chart data were abstracted and interviews were conducted with an additional 39 patients/surrogates. Overall, 176 patients had valid POLST forms at the time of discharge from the hospital, and many (38.6%; 68/176) only documented code status. When the whole POLST was completed, orders were more often marked as based on a discussion with the patient and/or surrogate than when the form was used just for code status (95.1% versus 13.8%, p<.001). In the follow-up and interview sample, a majority (90.6%; 29/32) of POLST forms written in the hospital were unchanged up to three weeks after nursing facility admission. Most (71.9%; 23/32) appeared consistent with patient or surrogate recall of prior treatment decisions. POLST forms generated in the hospital do transfer with patients across settings, but are often used only to document code status. POLST orders appeared largely consistent with prior treatment decisions. Further research is needed to assess the quality of POLST decisions.
Nano Mapper: an Internet knowledge mapping system for nanotechnology development
NASA Astrophysics Data System (ADS)
Li, Xin; Hu, Daning; Dang, Yan; Chen, Hsinchun; Roco, Mihail C.; Larson, Catherine A.; Chan, Joyce
2009-04-01
Nanotechnology research has experienced rapid growth in recent years. Advances in information technology enable efficient investigation of publications, their contents, and relationships for large sets of nanotechnology-related documents in order to assess the status of the field. This paper presents the development of a new knowledge mapping system, called Nano Mapper (http://nanomapper.eller.arizona.edu), which integrates the analysis of nanotechnology patents and research grants into a Web-based platform. The Nano Mapper system currently contains nanotechnology-related patents for 1976-2006 from the United States Patent and Trademark Office (USPTO), European Patent Office (EPO), and Japan Patent Office (JPO), as well as grant documents from the U.S. National Science Foundation (NSF) for the same time period. The system provides complex search functionalities, and makes available a set of analysis and visualization tools (statistics, trend graphs, citation networks, and content maps) that can be applied to different levels of analytical units (countries, institutions, technical fields) and for different time intervals. The paper shows important nanotechnology patenting activities at USPTO for 2005-2006 identified through the Nano Mapper system.
Nano Mapper: an Internet knowledge mapping system for nanotechnology development
Hu, Daning; Dang, Yan; Chen, Hsinchun; Roco, Mihail C.; Larson, Catherine A.; Chan, Joyce
2008-01-01
Nanotechnology research has experienced rapid growth in recent years. Advances in information technology enable efficient investigation of publications, their contents, and relationships for large sets of nanotechnology-related documents in order to assess the status of the field. This paper presents the development of a new knowledge mapping system, called Nano Mapper (http://nanomapper.eller.arizona.edu), which integrates the analysis of nanotechnology patents and research grants into a Web-based platform. The Nano Mapper system currently contains nanotechnology-related patents for 1976–2006 from the United States Patent and Trademark Office (USPTO), European Patent Office (EPO), and Japan Patent Office (JPO), as well as grant documents from the U.S. National Science Foundation (NSF) for the same time period. The system provides complex search functionalities, and makes available a set of analysis and visualization tools (statistics, trend graphs, citation networks, and content maps) that can be applied to different levels of analytical units (countries, institutions, technical fields) and for different time intervals. The paper shows important nanotechnology patenting activities at USPTO for 2005–2006 identified through the Nano Mapper system. PMID:21170121
Enhancing biomedical text summarization using semantic relation extraction.
Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao
2011-01-01
Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from a large amount of biomedical literature efficiently. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate the text summary using an information retrieval-based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance, and our results are better than those of the MEAD system, a well-known tool for text summarization.
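The relation-level retrieval step can be pictured as filtering extracted (subject, predicate, object) triples by the query concept and then recovering the sentences that support them. SemRep itself is an external tool, so the sketch below simply assumes its triples have already been extracted per sentence; the sentence-ranking heuristic is a crude stand-in for the IR-based selection in the paper.

```python
# Sketch of relation-level retrieval for concept-focused summarization.
# Assumes SemRep-style triples have already been extracted per sentence:
# each item is (subject, predicate, object, sentence_text).
def summarize(concept, triples, max_sentences=10):
    concept = concept.lower()
    # Keep relations in which the query concept participates.
    relevant = [t for t in triples
                if concept in t[0].lower() or concept in t[2].lower()]
    # Rank the interpreting sentences by how many relevant relations they
    # support, a crude stand-in for the IR-based sentence selection step.
    counts = {}
    for _, _, _, sent in relevant:
        counts[sent] = counts.get(sent, 0) + 1
    ranked = sorted(counts, key=counts.get, reverse=True)
    return ranked[:max_sentences]
```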
A Qualitative Analysis Evaluating The Purposes And Practices Of Clinical Documentation
Ho, Y.-X.; Gadd, C. S.; Kohorst, K.L.; Rosenbloom, S.T.
2014-01-01
Summary Objectives An important challenge for biomedical informatics researchers is determining the best approach for healthcare providers to use when generating clinical notes in settings where electronic health record (EHR) systems are used. The goal of this qualitative study was to explore healthcare providers’ and administrators’ perceptions about the purpose of clinical documentation and their own documentation practices. Methods We conducted seven focus groups with a total of 46 subjects composed of healthcare providers and administrators to collect knowledge, perceptions and beliefs about documentation from those who generate and review notes, respectively. Data were analyzed using inductive analysis to probe and classify impressions collected from focus group subjects. Results We observed that both healthcare providers and administrators believe that documentation serves five primary domains: clinical, administrative, legal, research, education. These purposes are tied closely to the nature of the clinical note as a document shared by multiple stakeholders, which can be a source of tension for all parties who must use the note. Most providers reported using a combination of methods to complete their notes in a timely fashion without compromising patient care. While all administrators reported relying on computer-based documentation tools to review notes, they expressed a desire for a more efficient method of extracting relevant data. Conclusions Although clinical documentation has utility, and is valued highly by its users, the development and successful adoption of a clinical documentation tool largely depends on its ability to be smoothly integrated into the provider’s busy workflow, while allowing the provider to generate a note that communicates effectively and efficiently with multiple stakeholders. PMID:24734130
The Precise and Efficient Identification of Medical Order Forms Using Shape Trees
NASA Astrophysics Data System (ADS)
Henker, Uwe; Petersohn, Uwe; Ultsch, Alfred
A powerful and flexible technique to identify, classify and process documents using images from a scanning process is presented. The types of documents can be described to the system as a set of differentiating features in a case base using shape trees. The features are filtered and abstracted from an extremely reduced scanner image of the document. Classification rules are stored with the cases to enable precise recognition and the subsequent mark-reading and Optical Character Recognition (OCR) processes. The method is implemented in a system which actually processes the majority of requests for medical lab procedures in Germany. A large practical experiment with data from practitioners was performed. An average of 97% of the forms were correctly identified; none were identified incorrectly. This meets the quality requirements for most medical applications. The modular description of the recognition process allows for flexible adaptation to future changes in the form and content of the document's structures.
27 CFR 19.677 - Large plant applications-organizational documents.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 27 Alcohol, Tobacco Products and Firearms 1 2011-04-01 2011-04-01 false Large plant applications-organizational documents. 19.677 Section 19.677 Alcohol, Tobacco Products and Firearms ALCOHOL AND TOBACCO TAX... Fuel Use Obtaining A Permit § 19.677 Large plant applications—organizational documents. In addition to...
Action of earthworms on flint burial - a return to Darwin's estate
Kevin R. Butt; Mac Callaham; E. Louise Loudermilk; Rowan Blaik
2016-01-01
For thirty years, from the early 1840s, Charles Darwin documented the disappearance of flints in the grounds of Down House in Kent, at a location originally known as the "Stony Field". This site (Great Pucklands Meadow, GPM) was visited in 2007 and an experiment set up in this ungrazed grassland. Locally-sourced flints (either large, 12 cm, or small, 5 cm dia.) were...
The Development of Clinical Document Standards for Semantic Interoperability in China
Yang, Peng; Pan, Feng; Wan, Yi; Tu, Haibo; Tang, Xuejun; Hu, Jianping
2011-01-01
Objectives This study is aimed at developing a set of data groups (DGs) to be employed as reusable building blocks for the construction of the eight most common clinical documents used in China's general hospitals in order to achieve their structural and semantic standardization. Methods The Diagnostics knowledge framework, the related approaches taken from the Health Level Seven (HL7), the Integrating the Healthcare Enterprise (IHE), and the Healthcare Information Technology Standards Panel (HITSP) and 1,487 original clinical records were considered together to form the DG architecture and data sets. The internal structure, content, and semantics of each DG were then defined by mapping each DG data set to a corresponding Clinical Document Architecture data element and matching each DG data set to the metadata in the Chinese National Health Data Dictionary. By using the DGs as reusable building blocks, standardized structures and semantics regarding the clinical documents for semantic interoperability were able to be constructed. Results Altogether, 5 header DGs, 48 section DGs, and 17 entry DGs were developed. Several issues regarding the DGs, including their internal structure, identifiers, data set names, definitions, length and format, data types, and value sets, were further defined. Standardized structures and semantics regarding the eight clinical documents were structured by the DGs. Conclusions This approach of constructing clinical document standards using DGs is a feasible standard-driven solution useful in preparing documents possessing semantic interoperability among the disparate information systems in China. These standards need to be validated and refined through further study. PMID:22259722
Lamas, Daniela; Panariello, Natalie; Henrich, Natalie; Hammes, Bernard; Hanson, Laura C; Meier, Diane E; Guinn, Nancy; Corrigan, Janet; Hubber, Sean; Luetke-Stahlman, Hannah; Block, Susan
2018-04-01
To develop a set of clinically relevant recommendations to improve the state of advance care planning (ACP) documentation in the electronic health record (EHR). Advance care planning (ACP) is a key process that supports goal-concordant care. For preferences to be honored, clinicians must be able to reliably record, find, and use ACP documentation. However, there are no standards to guide ACP documentation in the electronic health record (EHR). We interviewed 21 key informants to understand the strengths and weaknesses of EHR documentation systems for ACP and identify best practices. We analyzed these interviews using a qualitative content analysis approach and subsequently developed a preliminary set of recommendations. These recommendations were vetted and refined in a second round of input from a national panel of content experts. Informants identified six themes regarding current inadequacies in documentation and accessibility of ACP information and opportunities for improvement. We offer a set of concise, clinically relevant recommendations, informed by expert opinion, to improve the state of ACP documentation in the EHR.
NASA Astrophysics Data System (ADS)
Tirupattur, Naveen; Lapish, Christopher C.; Mukhopadhyay, Snehasis
2011-06-01
Text mining, sometimes alternately referred to as text analytics, refers to the process of extracting high-quality knowledge from the analysis of textual data. Text mining has a wide variety of applications in areas such as biomedical science, news analysis, and homeland security. In this paper, we describe an approach and some relatively small-scale experiments which apply text mining to neuroscience research literature to find novel associations among a diverse set of entities. Neuroscience is a discipline which encompasses an exceptionally wide range of experimental approaches and rapidly growing interest. This combination results in an overwhelmingly large and often diffuse literature which makes a comprehensive synthesis difficult. Understanding the relations or associations among the entities appearing in the literature not only improves researchers' current understanding of recent advances in their field, but also provides an important computational tool to formulate novel hypotheses and thereby assist in scientific discoveries. We describe a methodology to automatically mine the literature and form novel associations through direct analysis of published texts. The method first retrieves a set of documents from databases such as PubMed using a set of relevant domain terms. In the current study, these terms yielded document sets ranging from 160,909 to 367,214 documents. Each document is then represented in numerical vector form, from which an Association Graph is computed that represents co-occurrence relationships between all pairs of domain terms. Association graphs can then be subjected to various graph-theoretic algorithms such as transitive closure and cycle (circuit) detection to derive additional information, and can also be visually presented to a human researcher for understanding. In this paper, we present three relatively small-scale problem-specific case studies to demonstrate that such an approach is very successful in replicating a neuroscience expert's mental model of object-object associations entirely by means of text mining. These preliminary results provide confidence that this type of text-mining-based research approach provides an extremely powerful tool to better understand the literature and drive novel discovery for the neuroscience community.
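The core data structure above, an association graph over domain terms built from document co-occurrence, is straightforward to sketch. The term list, the co-occurrence threshold, and the downstream graph analyses (transitive closure, cycle detection) are only indicated here and are not the study's actual parameters.

```python
# Sketch of building a co-occurrence association graph over domain terms.
# `documents` is a list of lowercased document strings, `terms` the domain
# vocabulary; an edge is kept when two terms co-occur in enough documents.
from itertools import combinations

def association_graph(documents, terms, min_cooccurrence=5):
    graph = {t: {} for t in terms}
    for doc in documents:
        present = [t for t in terms if t in doc]
        for a, b in combinations(sorted(set(present)), 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    # Prune weak associations; the remaining weighted edges can be handed
    # to graph algorithms (e.g. transitive closure) or visualized.
    return {a: {b: w for b, w in nbrs.items() if w >= min_cooccurrence}
            for a, nbrs in graph.items()}
```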
Bonn, Bernadine A.
2008-01-01
A long-term method detection level (LT-MDL) and laboratory reporting level (LRL) are used by the U.S. Geological Survey's National Water Quality Laboratory (NWQL) when reporting results from most chemical analyses of water samples. Changing to this method provided data users with additional information about their data and often resulted in more reported values in the low concentration range. Before this method was implemented, many of these values would have been censored. The use of the LT-MDL and LRL presents some challenges for the data user. Interpreting data in the low concentration range increases the need for adequate quality assurance because even small contamination or recovery problems can be relatively large compared to concentrations near the LT-MDL and LRL. In addition, the definition of the LT-MDL, as well as the inclusion of low values, can result in complex data sets with multiple censoring levels and reported values that are less than a censoring level. Improper interpretation or statistical manipulation of low-range results in these data sets can result in bias and incorrect conclusions. This document is designed to help data users use and interpret data reported with the LT-MDL/LRL method. The calculation and application of the LT-MDL and LRL are described. This document shows how to extract statistical information from the LT-MDL and LRL and how to use that information in USGS investigations, such as assessing the quality of field data, interpreting field data, and planning data collection for new projects. A set of 19 detailed examples is included in this document to help data users think about their data and properly interpret low-range data without introducing bias. Although this document is not meant to be a comprehensive resource of statistical methods, several useful methods of analyzing censored data are demonstrated, including Regression on Order Statistics and Kaplan-Meier Estimation. These two statistical methods handle complex censored data sets without resorting to substitution, thereby avoiding a common source of bias and inaccuracy.
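As one illustration of the censored-data methods named above, the sketch below applies Kaplan-Meier estimation to left-censored concentrations using the standard "flipping" trick. It is not the USGS procedure itself; the concentration values are hypothetical, and the use of the third-party lifelines package is an assumption.

```python
import numpy as np
from lifelines import KaplanMeierFitter  # common survival-analysis package (assumed available)

# Hypothetical low-range results (mg/L) and whether each was censored ("<" value).
values   = np.array([0.02, 0.05, 0.01, 0.08, 0.03, 0.04, 0.02, 0.06])
censored = np.array([True, False, True, False, False, True, False, False])

# Kaplan-Meier handles right-censoring, so flip left-censored concentrations
# about a constant larger than the maximum observed value.
flip = values.max() + 0.01
flipped = flip - values

kmf = KaplanMeierFitter()
# event_observed=True marks an uncensored (detected) value.
kmf.fit(flipped, event_observed=~censored)

# Median of the flipped distribution, flipped back to concentration units.
median_conc = flip - kmf.median_survival_time_
print("Estimated median concentration:", round(median_conc, 3))
```

The point of the approach is that no substituted value (e.g., half the reporting level) is ever invented for the censored results.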
Discovering semantic features in the literature: a foundation for building functional associations
Chagoyen, Monica; Carmona-Saez, Pedro; Shatkay, Hagit; Carazo, Jose M; Pascual-Montano, Alberto
2006-01-01
Background Experimental techniques such as DNA microarray, serial analysis of gene expression (SAGE) and mass spectrometry proteomics, among others, are generating large amounts of data related to genes and proteins at different levels. As in any other experimental approach, it is necessary to analyze these data in the context of previously known information about the biological entities under study. The literature is a particularly valuable source of information for experiment validation and interpretation. Therefore, the development of automated text mining tools to assist in such interpretation is one of the main challenges in current bioinformatics research. Results We present a method to create literature profiles for large sets of genes or proteins based on common semantic features extracted from a corpus of relevant documents. These profiles can be used to establish pair-wise similarities among genes, utilized in gene/protein classification or can be even combined with experimental measurements. Semantic features can be used by researchers to facilitate the understanding of the commonalities indicated by experimental results. Our approach is based on non-negative matrix factorization (NMF), a machine-learning algorithm for data analysis, capable of identifying local patterns that characterize a subset of the data. The literature is thus used to establish putative relationships among subsets of genes or proteins and to provide coherent justification for this clustering into subsets. We demonstrate the utility of the method by applying it to two independent and vastly different sets of genes. Conclusion The presented method can create literature profiles from documents relevant to sets of genes. The representation of genes as additive linear combinations of semantic features allows for the exploration of functional associations as well as for clustering, suggesting a valuable methodology for the validation and interpretation of high-throughput experimental data. PMID:16438716
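A minimal sketch of NMF-based literature profiling in the spirit of the abstract above, using scikit-learn; the gene-by-term matrix, its dimensions, and the number of semantic features are hypothetical, and the authors' own implementation may differ.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical term-frequency matrix: rows are genes (via their associated documents),
# columns are vocabulary terms from the relevant corpus.
rng = np.random.default_rng(0)
gene_term_matrix = rng.poisson(1.0, size=(50, 200)).astype(float)

# Factorize into k semantic features: W maps genes to features, H maps features to terms.
k = 8
nmf = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(gene_term_matrix)   # gene x feature loadings
H = nmf.components_                       # feature x term weights

# Pairwise gene similarity in the reduced semantic-feature space (cosine).
norms = np.linalg.norm(W, axis=1, keepdims=True) + 1e-12
similarity = (W / norms) @ (W / norms).T
print(similarity.shape)
```

The rows of H (top-weighted terms per feature) are what a researcher would inspect to interpret each semantic feature.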
System for information discovery
Pennock, Kelly A [Richland, WA; Miller, Nancy E [Kennewick, WA
2002-11-19
A sequence of word filters is used to eliminate terms in the database which do not discriminate document content, resulting in a filtered word set and a topic word set whose members are highly predictive of content. These two word sets are then formed into a two-dimensional matrix with matrix entries calculated as the conditional probability that a document will contain a word in a row given that it contains the word in a column. The matrix representation allows the resultant vectors to be utilized to interpret document contents.
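The conditional-probability matrix described in this record can be sketched as follows; the word sets and documents are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical word sets after filtering.
topic_words    = ["reactor", "corrosion", "coolant"]          # columns
filtered_words = ["steel", "pressure", "leak", "weld"]        # rows

documents = [
    {"reactor", "coolant", "pressure", "leak"},
    {"corrosion", "steel", "weld"},
    {"reactor", "steel", "pressure"},
]

# entry[i, j] = P(document contains filtered_words[i] | it contains topic_words[j])
matrix = np.zeros((len(filtered_words), len(topic_words)))
for j, tw in enumerate(topic_words):
    with_topic = [d for d in documents if tw in d]
    if not with_topic:
        continue
    for i, fw in enumerate(filtered_words):
        matrix[i, j] = sum(fw in d for d in with_topic) / len(with_topic)

print(matrix)
```

Each column of the matrix is then a conditional-probability profile that can be compared against a document's word vector to interpret its content.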
3 CFR - Federal Employee Pay Schedules and Rates That Are Set by Administrative Discretion
Code of Federal Regulations, 2013 CFR
2013-01-01
Presidential Documents, Other Presidential Documents: Memorandum of December 21, 2012, Federal Employee Pay Schedules and Rates That Are Set by Administrative Discretion.
Cohen-Mansfield, Jiska; Libin, Alexander; Lipson, Steven
2003-06-01
Decisions concerning end-of-life care depend on information contained in advance directives that are documented in residents' charts in the nursing home. The availability of that information depends on the quality of the chart and on the location of the information in the chart. No research was found that compared directives by the manner in which they are collected and summarized in the chart. The goal of the study was to clarify how advance directives are summarized in the patient's record and to clarify how physicians perceive the same advance directives and formal orders. The study involved 122 elderly persons who reside in one large (587 beds) nursing home. The authors collected data regarding the advance directives from three sources: the Minimum Data Set (MDS), the front cover of the resident's chart, and the inside of the chart. The rates of documented advance directives found in this study are higher than those reported in the literature. Agreement rates between sources varied as a function of which sources were compared, as well as on the basis of which directive was examined. More specifically, the authors found higher rates of agreement between the information inside the chart and on the cover of the chart than between the MDS and the other two sources. The reasons for discrepancies may lie in the different functions and procedures pertaining to these source documents.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fishkind, H.H.
1982-04-01
The feasibility of large-scale plantation establishment by various methods was examined, and the following conclusions were reached: seedling plantations are limited in potential yield due to genetic variation among the planting stock and often inadequate supplies of appropriate seed; vegetative propagation by rooted cuttings can provide good genetic uniformity of select hybrid planting stock; however, large-scale production requires establishment and maintenance of extensive cutting orchards. The collection of shoots and preparation of cuttings, although successfully implemented in the Congo and Brazil, would not be economically feasible in Florida for large-scale plantations; tissue culture propagation of select hybrid eucalypts offers the only opportunity to produce the very large number of trees required to establish the energy plantation. The cost of tissue culture propagation, although higher than seedling production, is more than off-set by the increased productivity of vegetative plantations established from select hybrid Eucalyptus.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phillips, Laurence R.; Jordan, Danyelle N.; Bauer, Travis L.
The large number of government and industry activities supporting the Unit of Action (UA), with attendant documents, reports and briefings, can overwhelm decision-makers with an overabundance of information that hampers the ability to make quick decisions, often resulting in a form of gridlock. In particular, the large and rapidly increasing amounts of data and data formats stored on UA Advanced Collaborative Environment (ACE) servers have led to the realization that it has become impractical and even impossible to perform manual analysis leading to timely decisions. UA Program Management (PM UA) has recognized the need to implement a Decision Support System (DSS) on UA ACE. The objective of this document is to research the commercial Knowledge Discovery and Data Mining (KDDM) market and publish the results in a survey. Furthermore, a ranking mechanism based on UA ACE-specific criteria has been developed and applied to a representative set of commercially available KDDM solutions. In addition, an overview of four R&D areas identified as critical to the implementation of DSS on ACE is provided. Finally, a comprehensive database containing detailed information on surveyed KDDM tools has been developed and is available upon customer request.
Monitoring and reporting attacks on education in the Democratic Republic of the Congo and Somalia.
Bennouna, Cyril; van Boetzelaer, Elburg; Rojas, Lina; Richard, Kinyera; Karume, Gang; Nshombo, Marius; Roberts, Leslie; Boothby, Neil
2018-04-01
The United Nations' Monitoring and Reporting Mechanism is charged with documenting six grave violations against children in a time of conflict, including attacks on schools. Many of these incidents, however, remain unreported across the globe. This study explores whether or not a local knowledge base of education and child protection actors in North and South Kivu Provinces, Democratic Republic of the Congo, and in Mogadishu, Somalia, could contribute to a more complete record of attacks on education in those areas. Hundreds of semi-structured interviews were conducted with key informants across the three settings, and in total 432 attacks on education were documented. Purposive samples of these reports were verified and a large majority was confirmed. Local non-governmental organisations and education institutions were most knowledgeable about these incidents, but most never reported them to a monitoring authority. The study concludes that attack surveillance and response were largely insufficient, and recommends investing in mechanisms that utilise local knowledge to address these shortcomings. © 2018 The Author(s). Disasters © Overseas Development Institute, 2018.
Helios: Understanding Solar Evolution Through Text Analytics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Randazzese, Lucien
This proof-of-concept project focused on developing, testing, and validating a range of bibliometric, text analytic, and machine-learning based methods to explore the evolution of three photovoltaic (PV) technologies: Cadmium Telluride (CdTe), Dye-Sensitized solar cells (DSSC), and Multi-junction solar cells. The analytical approach to the work was inspired by previous work by the same team to measure and predict the scientific prominence of terms and entities within specific research domains. The goal was to create tools that could assist domain-knowledgeable analysts in investigating the history and path of technological developments in general, with a focus on analyzing step-function changes in performance, or “breakthroughs,” in particular. The text-analytics platform developed during this project was dubbed Helios. The project relied on computational methods for analyzing large corpora of technical documents. For this project we ingested technical documents from the following sources into Helios: Thomson Scientific Web of Science (papers), the U.S. Patent & Trademark Office (patents), the U.S. Department of Energy (technical documents), the U.S. National Science Foundation (project funding summaries), and a hand curated set of full-text documents from Thomson Scientific and other sources.
3 CFR - Freezing Federal Employee Pay Schedules and Rates That Are Set By Administrative Discretion
Code of Federal Regulations, 2011 CFR
2011-01-01
Presidential Documents, Other Presidential Documents: Memorandum of December 22, 2010, Freezing Federal Employee Pay Schedules and Rates That Are Set By Administrative Discretion.
Strong Similarity Measures for Ordered Sets of Documents in Information Retrieval.
ERIC Educational Resources Information Center
Egghe, L.; Michel, Christine
2002-01-01
Presents a general method to construct ordered similarity measures in information retrieval based on classical similarity measures for ordinary sets. Describes a test of some of these measures in an information retrieval system that extracted ranked document sets and discusses the practical usability of the ordered similarity measures. (Author/LRW)
Multimodal biometrics for identity documents (MBioID).
Dessimoz, Damien; Richiardi, Jonas; Champod, Christophe; Drygajlo, Andrzej
2007-04-11
The MBioID initiative has been set up to address the following question: what biometric technologies could be deployed in identity documents in the foreseeable future, and how? This research effort proposes to look at current and future practices and systems of establishing and using biometric identity documents (IDs) and evaluate their effectiveness in large-scale developments. The first objective of the MBioID project is to present a review document establishing the current state-of-the-art related to the use of multimodal biometrics in an IDs application. This research report gives the main definitions, properties and the framework of use related to biometrics, an overview of the main standards developed in the biometric industry and standardisation organisations to ensure interoperability, as well as some of the legal framework and the issues associated with biometrics, such as privacy and personal data protection. The state-of-the-art in terms of technological development is also summarised for a range of single biometric modalities (2D and 3D face, fingerprint, iris, on-line signature and speech), chosen according to ICAO recommendations and availabilities, and for various multimodal approaches. This paper gives a summary of the main elements of that report. The second objective of the MBioID project is to propose relevant acquisition and evaluation protocols for a large-scale deployment of biometric IDs. Combined with the protocols, a multimodal database will be acquired in a realistic way, in order to be as close as possible to a real biometric IDs deployment. In this paper, the issues and solutions related to the acquisition setup are briefly presented.
MINC 2.0: A Flexible Format for Multi-Modal Images.
Vincent, Robert D; Neelin, Peter; Khalili-Mahani, Najmeh; Janke, Andrew L; Fonov, Vladimir S; Robbins, Steven M; Baghdadi, Leila; Lerch, Jason; Sled, John G; Adalat, Reza; MacDonald, David; Zijdenbos, Alex P; Collins, D Louis; Evans, Alan C
2016-01-01
It is often useful that an imaging data format can afford rich metadata, be flexible, scale to very large file sizes, support multi-modal data, and have strong inbuilt mechanisms for data provenance. Beginning in 1992, MINC was developed as a system for flexible, self-documenting representation of neuroscientific imaging data with arbitrary orientation and dimensionality. The MINC system incorporates three broad components: a file format specification, a programming library, and a growing set of tools. In the early 2000's the MINC developers created MINC 2.0, which added support for 64-bit file sizes, internal compression, and a number of other modern features. Because of its extensible design, it has been easy to incorporate details of provenance in the header metadata, including an explicit processing history, unique identifiers, and vendor-specific scanner settings. This makes MINC ideal for use in large scale imaging studies and databases. It also makes it easy to adapt to new scanning sequences and modalities.
International Migration and Gender in Latin America: A Comparative Analysis.
Massey, Douglas S; Fischer, Mary J; Capoferro, Chiara
2006-12-01
We review census data to assess the standing of five Latin American nations on a gender continuum ranging from patriarchal to matrifocal. We show that Mexico and Costa Rica lie close to one another with a highly patriarchal system of gender relations whereas Nicaragua and the Dominican Republic are similar in having a matrifocal system. Puerto Rico occupies a middle position, blending characteristics of both systems. These differences yield different patterns of female relative to male migration. Female householders in the two patriarchal settings displayed low rates of out-migration compared with males, whereas in the two matrifocal countries the ratio of female to male migration was much higher, in some case exceeding their male counterparts. Multivariate analyses showed that in patriarchal societies, a formal or informal union with a male dramatically lowers the odds of female out-migration, whereas in matrifocal societies marriage and cohabitation have no real effect. The most important determinants of female migration from patriarchal settings are the migrant status of the husband or partner, having relatives in the United States, and the possession of legal documents. In matrifocal settings, however, female migration is less related to the possession of documents, partner's migrant status, or having relatives in the United States and more strongly related to the woman's own migratory experience. Whereas the process of cumulative causation appears to be driven largely by men in patriarchal societies, it is women who dominate the process in matrifocal settings.
International Migration and Gender in Latin America: A Comparative Analysis
Massey, Douglas S.; Fischer, Mary J.; Capoferro, Chiara
2010-01-01
We review census data to assess the standing of five Latin American nations on a gender continuum ranging from patriarchal to matrifocal. We show that Mexico and Costa Rica lie close to one another with a highly patriarchal system of gender relations whereas Nicaragua and the Dominican Republic are similar in having a matrifocal system. Puerto Rico occupies a middle position, blending characteristics of both systems. These differences yield different patterns of female relative to male migration. Female householders in the two patriarchal settings displayed low rates of out-migration compared with males, whereas in the two matrifocal countries the ratio of female to male migration was much higher, in some case exceeding their male counterparts. Multivariate analyses showed that in patriarchal societies, a formal or informal union with a male dramatically lowers the odds of female out-migration, whereas in matrifocal societies marriage and cohabitation have no real effect. The most important determinants of female migration from patriarchal settings are the migrant status of the husband or partner, having relatives in the United States, and the possession of legal documents. In matrifocal settings, however, female migration is less related to the possession of documents, partner’s migrant status, or having relatives in the United States and more strongly related to the woman’s own migratory experience. Whereas the process of cumulative causation appears to be driven largely by men in patriarchal societies, it is women who dominate the process in matrifocal settings. PMID:21399742
Koenig, Todd A.; Holmes, Robert R.
2013-01-01
The U.S. Geological Survey initiated a substantial effort in the summer of 2011 to measure and document the record-setting floods of the Mississippi and Ohio Rivers, including the reach in and near the New Madrid Floodway. The activation of the floodway, which had not occurred since 1937, provided a rare opportunity to collect a unique dataset describing a flood wave downstream from a levee breach as well as the flow through a large floodway. A total of 42 submersible pressure transducers collected time series of water levels while crews collected hundreds of depth, velocity, and streamflow measurements at selected locations in and near the floodway throughout the period from late April to late June. These data are presented in this chapter.
Intensive agriculture erodes β-diversity at large scales.
Karp, Daniel S; Rominger, Andrew J; Zook, Jim; Ranganathan, Jai; Ehrlich, Paul R; Daily, Gretchen C
2012-09-01
Biodiversity is declining from unprecedented land conversions that replace diverse, low-intensity agriculture with vast expanses under homogeneous, intensive production. Despite documented losses of species richness, consequences for β-diversity, changes in community composition between sites, are largely unknown, especially in the tropics. Using a 10-year data set on Costa Rican birds, we find that low-intensity agriculture sustained β-diversity across large scales on a par with forest. In high-intensity agriculture, low local (α) diversity inflated β-diversity as a statistical artefact. Therefore, at small spatial scales, intensive agriculture appeared to retain β-diversity. Unlike in forest or low-intensity systems, however, high-intensity agriculture also homogenised vegetation structure over large distances, thereby decoupling the fundamental ecological pattern of bird communities changing with geographical distance. This ~40% decline in species turnover indicates a significant decline in β-diversity at large spatial scales. These findings point the way towards multi-functional agricultural systems that maintain agricultural productivity while simultaneously conserving biodiversity. © 2012 Blackwell Publishing Ltd/CNRS.
3 CFR - Federal Employee Pay Schedules and Rates That Are Set by Administrative Discretion
Code of Federal Regulations, 2014 CFR
2014-01-01
Presidential Documents, Other Presidential Documents: Memorandum of April 5, 2013, Federal Employee Pay Schedules and Rates That Are Set by Administrative Discretion. Memorandum for the Heads of Executive Departments and...
33 CFR 96.350 - Interim Document of Compliance certificate: what is it and when can it be used?
Code of Federal Regulations, 2010 CFR
2010-07-01
... Document of Compliance certificate may be issued to help set up a company's safety management system when— (1) A company is newly set up or in transition from an existing company into a new company; or (2) A new type of vessel is added to an existing safety management system and Document of Compliance...
Kleczka, Bernadette; Musiega, Anita; Rabut, Grace; Wekesa, Phoebe; Mwaniki, Paul; Marx, Michael; Kumar, Pratap
2018-06-01
The United Nations' Sustainable Development Goal #3.8 targets 'access to quality essential healthcare services'. Clinical practice guidelines are an important tool for ensuring quality of clinical care, but many challenges prevent their use in low-resource settings. Monitoring the use of guidelines relies on cumbersome clinical audits of paper records, and electronic systems face financial and other limitations. Here we describe a unique approach to generating digital data from paper using guideline-based templates, rubber stamps and mobile phones. The Guidelines Adherence in Slums Project targeted ten private sector primary healthcare clinics serving informal settlements in Nairobi, Kenya. Each clinic was provided with rubber stamp templates to support documentation and management of commonly encountered outpatient conditions. Participatory design methods were used to customize templates to the workflows and infrastructure of each clinic. Rubber stamps were used to print templates into paper charts, providing clinicians with checklists for use during consultations. Templates used bubble format data entry, which could be digitized from images taken on mobile phones. Besides rubber stamp templates, the intervention included booklets of guideline compilations, one Android phone for digitizing images of templates, and one data feedback/continuing medical education session per clinic each month. In this paper we focus on the effect of the intervention on documentation of three non-communicable diseases in one clinic. Seventy charts of patients enrolled in the chronic disease program (hypertension/diabetes, n=867; chronic respiratory diseases, n=223) at one of the ten intervention clinics were sampled. Documentation of each individual patient encounter in the pre-intervention (January-March 2016) and post-intervention period (May-July) was scored for information in four dimensions - general data, patient assessment, testing, and management. Control criteria included information with no counterparts in templates (e.g. notes on presenting complaints, vital signs). Documentation scores for each patient were compared between both pre- and post-intervention periods and between encounters documented with and without templates (post-intervention only). The total number of patient encounters in the pre-intervention (282) and post-intervention periods (264) did not differ. Mean documentation scores increased significantly in the post-intervention period on average by 21%, 24% and 17% for hypertension, diabetes and chronic respiratory diseases, respectively. Differences were greater (47%, 43% and 27%, respectively) when documentation with and without templates was compared. Changes between pre- vs.post-intervention, and with vs.without template, varied between individual dimensions of documentation. Overall, documentation improved more for general data and patient assessment than in testing or management. The use of templates improves paper-based documentation of patient care, a first step towards improving the quality of care. Rubber stamps provide a simple and low-cost method to print templates on demand. In combination with ubiquitously available mobile phones, information entered on paper can be easily and rapidly digitized. This 'frugal innovation' in m-Health can empower small, private sector facilities, where large numbers of urban patients seek healthcare, to generate digital data on routine outpatient care. 
These data can form the basis for evidence-based quality improvement efforts at large scale, and help deliver on the SDG promise of quality essential healthcare services for all. Copyright © 2017 Elsevier B.V. All rights reserved.
Security Policy for a Generic Space Exploration Communication Network Architecture
NASA Technical Reports Server (NTRS)
Ivancic, William D.; Sheehe, Charles J.; Vaden, Karl R.
2016-01-01
This document is one of three. It describes various security mechanisms and a security policy profile for a generic space-based communication architecture. Two other documents accompany this document- an Operations Concept (OpsCon) and a communication architecture document. The OpsCon should be read first followed by the security policy profile described by this document and then the architecture document. The overall goal is to design a generic space exploration communication network architecture that is affordable, deployable, maintainable, securable, evolvable, reliable, and adaptable. The architecture should also require limited reconfiguration throughout system development and deployment. System deployment includes subsystem development in a factory setting, system integration in a laboratory setting, launch preparation, launch, and deployment and operation in space.
Probabilistic topic modeling for the analysis and classification of genomic sequences
2015-01-01
Background Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies are focusing on the so-called barcode genes, representing a well defined region of the whole genome. Recently, alignment-free techniques are gaining more importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequences clustering and classification is proposed. The method is based on k-mers representation and text mining techniques. Methods The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied on DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions We performed classification of over 7000 16S DNA barcode sequences taken from Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra short sequences and it exhibits a smooth decrease of performance, at every taxonomic level, when the sequence length is decreased. PMID:25916734
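A rough sketch of the k-mer-plus-LDA idea using scikit-learn (the paper does not specify this tooling); the sequences, k, and the number of topics are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical 16S fragments; a real run would use thousands of barcode sequences.
train_seqs = ["ACGTACGTGGCA", "TTGACGTACGTA", "GGCATTGACGTT", "ACGTTTGACGGA"]
test_seqs  = ["ACGTACGTGGCT"]

# Represent each sequence by its fixed-length k-mer counts (k=4), mirroring the
# "documents as bags of k-mers" idea in the paper.
vectorizer = CountVectorizer(analyzer="char", ngram_range=(4, 4), lowercase=False)
X_train = vectorizer.fit_transform(train_seqs)

# Fit a topic model over the k-mer vocabulary.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X_train)

# Infer the topic mixture of an unseen sequence; a downstream classifier
# (e.g. nearest centroid or an SVM) can then be trained on these mixtures.
print(lda.transform(vectorizer.transform(test_seqs)))
```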
Inglis, Joshua M; Caughey, Gillian E; Smith, William; Shakib, Sepehr
2017-11-01
The majority of patients with penicillin allergy labels tolerate penicillins. Inappropriate avoidance of penicillin is associated with increased hospitalisation, infections and healthcare costs. To examine the documentation of penicillin adverse drug reactions (ADR) in a large-scale hospital-based electronic health record. Penicillin ADR were extracted from 96 708 patient records in the Enterprise Patient Administration System in South Australia. Expert criteria were used to determine consistency of ADR entry and suitability for further evaluation. Of 43 011 unique ADR reports, there were 5023 ADR to penicillins with most being entered as allergy (n = 4773, 95.0%) rather than intolerance (n = 250, 5.0%). A significant proportion did not include a reaction description (n = 1052, 20.9%). Using pre-set criteria, 10.1% of reports entered as allergy had a reaction description that was consistent with intolerance and 31.0% of the entered intolerances had descriptions consistent with allergy. Virtually all ADR (n = 4979, 99.1%) were appropriate for further evaluation by history taking or immunological testing and half (50.7%, n = 2549) had documented reactions suggesting low-risk of penicillin allergy. The frequency of penicillin allergy label in this data set is consistent with the known overdiagnosis of penicillin allergy in the hospital population. ADR documentation was poor with incomplete entries and inconsistent categorisation. The concepts of allergy and intolerance for ADR classification, whilst mechanistically valid, may not be useful at the point of ADR entry by generalist clinicians. Systematic evaluation of reported ADR is needed to improve the quality of information for future prescribers. © 2017 Royal Australasian College of Physicians.
Selecting a Clinical Intervention Documentation System for an Academic Setting
Andrus, Miranda; Hester, E. Kelly; Byrd, Debbie C.
2011-01-01
Pharmacists' clinical interventions have been the subject of a substantial body of literature that focuses on the process and outcomes of establishing an intervention documentation program within the acute care setting. Few reports describe intervention documentation as a component of doctor of pharmacy (PharmD) programs; none describe the process of selecting an intervention documentation application to support the complete array of pharmacy practice and experiential sites. The process that a school of pharmacy followed to select and implement a school-wide intervention system to document the clinical and financial impact of an experiential program is described. Goals included finding a tool that allowed documentation from all experiential sites and the ability to assign dollar savings (hard and soft) to all documented interventions. The paper provides guidance for other colleges and schools of pharmacy in selecting a clinical intervention documentation system for program-wide use. PMID:21519426
Selecting a clinical intervention documentation system for an academic setting.
Fox, Brent I; Andrus, Miranda; Hester, E Kelly; Byrd, Debbie C
2011-03-10
Pharmacists' clinical interventions have been the subject of a substantial body of literature that focuses on the process and outcomes of establishing an intervention documentation program within the acute care setting. Few reports describe intervention documentation as a component of doctor of pharmacy (PharmD) programs; none describe the process of selecting an intervention documentation application to support the complete array of pharmacy practice and experiential sites. The process that a school of pharmacy followed to select and implement a school-wide intervention system to document the clinical and financial impact of an experiential program is described. Goals included finding a tool that allowed documentation from all experiential sites and the ability to assign dollar savings (hard and soft) to all documented interventions. The paper provides guidance for other colleges and schools of pharmacy in selecting a clinical intervention documentation system for program-wide use.
Experimental Applications of Automatic Test Markup Language (ATML)
NASA Technical Reports Server (NTRS)
Lansdowne, Chatwin A.; McCartney, Patrick; Gorringe, Chris
2012-01-01
The authors describe challenging use-cases for Automatic Test Markup Language (ATML), and evaluate solutions. The first case uses ATML Test Results to deliver active features to support test procedure development and test flow, and bridging mixed software development environments. The second case examines adding attributes to Systems Modelling Language (SysML) to create a linkage for deriving information from a model to fill in an ATML document set. Both cases are outside the original concept of operations for ATML but are typical when integrating large heterogeneous systems with modular contributions from multiple disciplines.
A NASTRAN primer for the analysis of rotating flexible blades
NASA Technical Reports Server (NTRS)
Lawrence, Charles; Aiello, Robert A.; Ernst, Michael A.; Mcgee, Oliver G.
1987-01-01
This primer provides documentation for using MSC NASTRAN in analyzing rotating flexible blades. The analysis of these blades includes geometrically nonlinear (large displacement) analysis under centrifugal loading, and frequency and mode shape (normal modes) determination. The geometrically nonlinear analysis using NASTRAN Solution Sequence 64 is discussed along with the determination of frequencies and mode shapes using Solution Sequence 63. A sample problem with the complete NASTRAN input data is included. Items unique to rotating blade analyses, such as setting angle and centrifugal softening effects, are emphasized.
Metis: A Pure Metropolis Markov Chain Monte Carlo Bayesian Inference Library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bates, Cameron Russell; Mckigney, Edward Allen
The use of Bayesian inference in data analysis has become the standard for large scientific experiments [1, 2]. The Monte Carlo Codes Group (XCP-3) at Los Alamos has developed a simple set of algorithms currently implemented in C++ and Python to easily perform flat-prior Markov Chain Monte Carlo Bayesian inference with pure Metropolis sampling. These implementations are designed to be user friendly and extensible for customization based on specific application requirements. This document describes the algorithmic choices made and presents two use cases.
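The Metis code itself is not reproduced here; the following is a generic sketch of the flat-prior pure-Metropolis algorithm the abstract describes, with a toy likelihood.

```python
import numpy as np

def metropolis(log_likelihood, x0, n_steps=10_000, step_size=0.5, seed=0):
    """Flat-prior Metropolis sampler with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    logp = log_likelihood(x)
    chain = np.empty((n_steps, x.size))
    for i in range(n_steps):
        proposal = x + step_size * rng.standard_normal(x.size)
        logp_prop = log_likelihood(proposal)
        # With a flat prior, the acceptance ratio reduces to the likelihood ratio.
        if np.log(rng.random()) < logp_prop - logp:
            x, logp = proposal, logp_prop
        chain[i] = x
    return chain

# Toy use: infer the mean of Gaussian data.
data = np.random.default_rng(1).normal(3.0, 1.0, size=100)
loglike = lambda theta: -0.5 * np.sum((data - theta[0]) ** 2)
samples = metropolis(loglike, x0=[0.0])
print("posterior mean estimate:", samples[2000:].mean())
```

The first few thousand steps are discarded as burn-in before summarizing the chain.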
Ontology modularization to improve semantic medical image annotation.
Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul
2011-02-01
Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail, thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time consuming and labor intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail and we discuss our results. Copyright © 2010 Elsevier Inc. All rights reserved.
IHE cross-enterprise document sharing for imaging: design challenges
NASA Astrophysics Data System (ADS)
Noumeir, Rita
2006-03-01
Integrating the Healthcare Enterprise (IHE) has recently published a new integration profile for sharing documents between multiple enterprises. The Cross-Enterprise Document Sharing Integration Profile (XDS) lays the basic framework for deploying regional and national Electronic Health Record (EHR). This profile proposes an architecture based on a central Registry that holds metadata information describing published Documents residing in one or multiple Documents Repositories. As medical images constitute important information of the patient health record, it is logical to extend the XDS Integration Profile to include images. However, including images in the EHR presents many challenges. The complete image set is very large; it is useful for radiologists and other specialists such as surgeons and orthopedists. The imaging report, on the other hand, is widely needed and its broad accessibility is vital for achieving optimal patient care. Moreover, a subset of relevant images may also be of wide interest along with the report. Therefore, IHE recently published a new integration profile for sharing images and imaging reports between multiple enterprises. This new profile, the Cross-Enterprise Document Sharing for Imaging (XDS-I), is based on the XDS architecture. The XDS-I integration solution that is published as part of the IHE Technical Framework is the result of an extensive investigation effort of several design solutions. This paper presents and discusses the design challenges and the rationales behind the design decisions of the IHE XDS-I Integration Profile, for a better understanding and appreciation of the final published solution.
Spotting words in handwritten Arabic documents
NASA Astrophysics Data System (ADS)
Srihari, Sargur; Srinivasan, Harish; Babu, Pavithra; Bhole, Chetan
2006-01-01
The design and performance of a system for spotting handwritten Arabic words in scanned document images is presented. Three main components of the system are a word segmenter, a shape based matcher for words and a search interface. The user types in a query in English within a search window, the system finds the equivalent Arabic word, e.g., by dictionary look-up, locates word images in an indexed (segmented) set of documents. A two-step approach is employed in performing the search: (1) prototype selection: the query is used to obtain a set of handwritten samples of that word from a known set of writers (these are the prototypes), and (2) word matching: the prototypes are used to spot each occurrence of those words in the indexed document database. A ranking is performed on the entire set of test word images-- where the ranking criterion is a similarity score between each prototype word and the candidate words based on global word shape features. A database of 20,000 word images contained in 100 scanned handwritten Arabic documents written by 10 different writers was used to study retrieval performance. Using five writers for providing prototypes and the other five for testing, using manually segmented documents, 55% precision is obtained at 50% recall. Performance increases as more writers are used for training.
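The word-matching stage can be sketched as nearest-prototype ranking over global shape feature vectors; the feature values below are hypothetical and stand in for the paper's actual word-shape features.

```python
import numpy as np

# Hypothetical global word-shape feature vectors (e.g. profile/moment features).
prototypes = np.array([[0.20, 0.80, 0.10],      # known handwritten samples of the query word
                       [0.25, 0.75, 0.15]])
candidates = np.array([[0.22, 0.79, 0.12],      # segmented word images to score
                       [0.90, 0.10, 0.55],
                       [0.21, 0.77, 0.14]])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Score each candidate by its best similarity to any prototype, then rank best-first.
scores = [max(cosine(c, p) for p in prototypes) for c in candidates]
ranking = np.argsort(scores)[::-1]
print("candidates ranked best-first:", ranking, "scores:", np.round(scores, 3))
```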
On the Creation of Hypertext Links in Full-Text Documents: Measurement of Inter-Linker Consistency.
ERIC Educational Resources Information Center
Ellis, David; And Others
1994-01-01
Describes a study in which several different sets of hypertext links are inserted by different people in full-text documents. The degree of similarity between the sets is measured using coefficients and topological indices. As in comparable studies of inter-indexer consistency, the sets of links used by different people showed little similarity.…
Mapping of a standard documentation template to the ICF core sets for arthritis and low back pain.
Escorpizo, Reuben; Davis, Kandace; Stumbo, Teri
2010-12-01
To identify the contents of a documentation template in The Guide to Physical Therapist Practice using the International Classification of Functioning, Disability, and Health (ICF) Core Sets for rheumatoid arthritis, osteoarthritis, and low back pain (LBP) as reference. Concepts were identified from items of an outpatient documentation template and mapped to the ICF using established linking rules. The ICF categories that were linked were compared with existing arthritis and LBP Core Sets. Based on the ICF, the template had the highest number (29%) of linked categories under Activities and participation while Body structures had the least (17%). ICF categories in the arthritis and LBP Core Sets had a 37-55% match with the ICF categories found in the template. We found 164 concepts that were not classified or not defined and 37 as personal factors. The arthritis and LBP Core Sets were reflected in the contents of the template. ICF categories in the Core Sets were reflected in the template (demonstrating up to 55% match). Potential integration of ICF in documentation templates could be explored and examined in the future to enhance clinical encounters and multidisciplinary communication. Copyright © 2010 John Wiley & Sons, Ltd.
Document Ranking Based upon Markov Chains.
ERIC Educational Resources Information Center
Danilowicz, Czeslaw; Balinski, Jaroslaw
2001-01-01
Considers how the order of documents in information retrieval responses is determined and introduces a method that uses a probabilistic model of a document set where documents are regarded as states of a Markov chain and where transition probabilities are directly proportional to similarities between documents. (Author/LRW)
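One natural reading of this model, sketched below under that assumption, is to row-normalize a pairwise similarity matrix into transition probabilities and rank documents by the chain's stationary distribution; the similarity values are hypothetical.

```python
import numpy as np

# Hypothetical pairwise document similarities (symmetric, self-similarity on the diagonal).
S = np.array([[1.0, 0.6, 0.1],
              [0.6, 1.0, 0.3],
              [0.1, 0.3, 1.0]])

# Transition probabilities proportional to similarities: row-normalize.
P = S / S.sum(axis=1, keepdims=True)

# Stationary distribution of the chain (left eigenvector for eigenvalue 1).
eigvals, eigvecs = np.linalg.eig(P.T)
stationary = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
stationary = stationary / stationary.sum()

print("document ranking (most central first):", np.argsort(stationary)[::-1])
```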
Gandara, Esteban; Ungar, Jonathan; Lee, Jason; Chan-Macrae, Myrna; O'Malley, Terrence; Schnipper, Jeffrey L
2010-06-01
Effective communication among physicians during hospital discharge is critical to patient care. Partners Healthcare (Boston) has been engaged in a multi-year process to measure and improve the quality of documentation of all patients discharged from its five acute care hospitals to subacute facilities. Partners first engaged stakeholders to develop a consensus set of 12 required data elements for all discharges to subacute facilities. A measurement process was established and later refined. Quality improvement interventions were then initiated to address measured deficiencies and included education of physicians and nurses, improvements in information technology, creation of or improvements in discharge documentation templates, training of hospitalists to serve as role models, feedback to physicians and their service chiefs regarding reviewed cases, and case manager review of documentation before discharge. To measure improvement in quality as a result of these efforts, rates of simultaneous inclusion of all 12 applicable data elements ("defect-free rate") were analyzed over time. Some 3,101 discharge documentation packets of patients discharged to subacute facilities from January 1, 2006, through September 2008 were retrospectively studied. During the 11 monitored quarters, the defect-free rate increased from 65% to 96% (p < .001 for trend). The largest improvements were seen in documentation of preadmission medication lists, allergies, follow-up, and warfarin information. Institution of rigorous measurement, feedback, and multidisciplinary, multimodal quality improvement processes improved the inclusion of data elements in discharge documentation required for safe hospital discharge across a large integrated health care system.
Enhancing Biomedical Text Summarization Using Semantic Relation Extraction
Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao
2011-01-01
Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from a large amount of biomedical literature efficiently. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization. PMID:21887336
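A minimal sketch of the relation-level retrieval and sentence-extraction idea; the triples below are hypothetical stand-ins for SemRep output, whose API is not reproduced here.

```python
# Hypothetical relations already extracted per sentence (SemRep-style triples);
# the real pipeline would obtain these from the SemRep tool.
extracted = [
    ("H1N1", "CAUSES", "pneumonia", "H1N1 infection can cause viral pneumonia."),
    ("oseltamivir", "TREATS", "H1N1", "Oseltamivir is widely used to treat H1N1."),
    ("H5N1", "CAUSES", "outbreak", "H5N1 has caused several regional outbreaks."),
]

query_concept = "H1N1"

# Relation-level retrieval: keep relations that involve the query concept.
relevant = [(s, p, o, sent) for s, p, o, sent in extracted
            if query_concept in (s, o)]

# Sentence extraction: one informative sentence per relevant relation forms the summary.
summary = " ".join(sorted({sent for *_, sent in relevant}))
print(summary)
```

A full system would rank relations by relevance scores and pick supporting sentences with an information-retrieval measure rather than taking them verbatim.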
Using Induction to Refine Information Retrieval Strategies
NASA Technical Reports Server (NTRS)
Baudin, Catherine; Pell, Barney; Kedar, Smadar
1994-01-01
Conceptual information retrieval systems use structured document indices, domain knowledge and a set of heuristic retrieval strategies to match user queries with a set of indices describing the document's content. Such retrieval strategies increase the set of relevant documents retrieved (increase recall), but at the expense of returning additional irrelevant documents (decrease precision). Usually in conceptual information retrieval systems this tradeoff is managed by hand and with difficulty. This paper discusses ways of managing this tradeoff by the application of standard induction algorithms to refine the retrieval strategies in an engineering design domain. We gathered examples of query/retrieval pairs during the system's operation using feedback from a user on the retrieved information. We then fed these examples to the induction algorithm and generated decision trees that refine the existing set of retrieval strategies. We found that (1) induction improved the precision on a set of queries generated by another user, without a significant loss in recall, and (2) in an interactive mode, the decision trees pointed out flaws in the retrieval and indexing knowledge and suggested ways to refine the retrieval strategies.
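A small sketch of the induction step using a scikit-learn decision tree over hypothetical query/retrieval features and relevance feedback; the features and labels are illustrative, not the paper's actual data.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical query/retrieval examples: each row records which heuristic retrieval
# strategy fired plus simple query features; the label is the user's relevance feedback.
#     [strategy_A, strategy_B, query_length, index_depth]
X = [[1, 0, 3, 2],
     [1, 0, 7, 1],
     [0, 1, 4, 3],
     [0, 1, 2, 1],
     [1, 1, 5, 2],
     [0, 0, 6, 3]]
y = [1, 0, 1, 0, 1, 0]   # 1 = retrieved document judged relevant

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The induced tree can be read as a refined retrieval rule set.
print(export_text(tree, feature_names=["strategy_A", "strategy_B",
                                       "query_length", "index_depth"]))
```

The printed rules are the kind of refinement the paper describes: conditions under which a heuristic strategy should or should not be applied.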
Limits to reproductive success of Sarracenia purpurea (Sarraceniaceae).
Ne'eman, Gidi; Ne'eman, Rina; Ellison, Aaron M
2006-11-01
Plant biologists have an enduring interest in assessing components of plant fitness and determining limits to seed set. Consequently, the relative contributions of resource and pollinator availability have been documented for a large number of plant species. We experimentally examined the roles of resource and pollen availability on seed set by the northern pitcher plant Sarracenia purpurea. We were able to distinguish the relative contributions of carbon (photosynthate) and mineral nutrients (nitrogen) to reproductive success. We also determined potential pollinators of this species. The bees Bombus affinis and Augochlorella aurata and the fly Fletcherimyia fletcheri were the only floral visitors to S. purpurea that collected pollen. Supplemental pollination increased seed set by <10%, a much lower percentage than would be expected, given data from noncarnivorous, animal-pollinated taxa. Seed set was reduced by 14% in plants that could not capture prey and by another 23% in plants whose pitcher-shaped leaves were also prevented from photosynthesizing. We conclude that resources are more important than pollen availability in determining seed set by this pitcher plant and that reproductive output may be another "cost" of the carnivorous habit.
Late Quaternary Stratigraphic Architecture of the Santee River Delta, South Carolina, U.S.A.
NASA Astrophysics Data System (ADS)
Long, J. H.; Hanebuth, T. J. J.
2017-12-01
The Santee River of South Carolina is the second largest river in terms of drainage area and discharge in the eastern United States and forms the only river-fed delta on the country's Atlantic coast. Significant anthropogenic modifications to this system date back to the early 18th century with the extensive clearing of coastal wetland forest for rice cultivation. In the 1940's the construction of large upstream dams permanently altered the discharge of the Santee River. These modifications are likely documented within the sedimentary record of the Santee Delta as episodes of major environmental changes. The Piedmont-sourced Santee River system incised its valley to an estimated depth of 20 m during lower glacial sea level. Sedimentation during the subsequent Holocene transgression and highstand has filled much of this accommodation. The Santee system remains largely under-investigated with only a handful of studies completed in the 1970's and 1980's based on sediment cores and cuttings. Through the use of high frequency seismic profiles (0.5 - 24 kHz), sediment cores, and other field data, we differentiate depositional units, architectural elements, and bounding surfaces with temporal and spatial distributions reflecting the changing morphodynamics of this complex system at multiple scales. These lithosomes are preserved within both modern inshore and offshore settings and were deposited within a range of paralic environments by processes active on fluvial/estuarine bars, floodplains, marshes, tidal flats, spits, beach ridges, and in backbarrier settings. They are bound by surfaces ranging from diastems to regional, polygenetic, low-angle and channel-form erosional surfaces. Detailed descriptions of cores taken from within the upper 6 m of the modern lower delta plain document heterolithic, mixed-energy, organic-rich, largely aggradational sedimentation dating back to at least 5 ka cal BP. Offshore, stacked, sand-rich, progradational packages sit atop heterolithic paleochannel-fill successions contained within a framework of regionally extensive, erosional bounding surfaces. Ongoing work focuses on continued data acquisition and integration of inshore and offshore data sets into a coherent model for the Holocene evolution of the Santee Delta.
NASA Astrophysics Data System (ADS)
Zhao, Qunhua; Santos, Eugene; Nguyen, Hien; Mohamed, Ahmed
One of the biggest challenges for intelligence analysts who participate in prevention or response to a terrorism act is to quickly find relevant information from massive amounts of data. Along with research on information retrieval and filtering, text summarization is an effective technique to help intelligence analysts shorten their time to find critical information and make timely decisions. Multi-document summarization is particularly useful as it serves to quickly describe a collection of information. The obvious shortcoming lies in what it cannot capture especially in more diverse collections. Thus, the question lies in the adequacy and/or usefulness of such summarizations to the target analyst. In this chapter, we report our experimental study on the sensitivity of users to the quality and content of multi-document summarization. We used the DUC 2002 collection for multi-document summarization as our testbed. Two groups of document sets were considered: (I) the sets consisting of closely correlated documents with highly overlapped content; and (II) the sets consisting of diverse documents covering a wide scope of topics. Intuitively, this suggests that creating a quality summary would be more difficult for the latter case. However, human evaluators were discovered to be fairly insensitive to this difference. This occurred when they were asked to rank the performance of various automated summarizers. In this chapter, we examine and analyze our experiments in order to better understand this phenomenon and how we might address it to improve summarization quality. In particular, we present a new metric based on document graphs that can distinguish between the two types of document sets.
Document similarity measures and document browsing
NASA Astrophysics Data System (ADS)
Ahmadullin, Ildus; Fan, Jian; Damera-Venkata, Niranjan; Lim, Suk Hwan; Lin, Qian; Liu, Jerry; Liu, Sam; O'Brien-Strain, Eamonn; Allebach, Jan
2011-03-01
Managing large document databases is an important task today. Being able to automatically compare document layouts and classify and search documents with respect to their visual appearance proves to be desirable in many applications. We measure single page documents' similarity with respect to distance functions between three document components: background, text, and saliency. Each document component is represented as a Gaussian mixture distribution; and distances between different documents' components are calculated as probabilistic similarities between corresponding distributions. The similarity measure between documents is represented as a weighted sum of the components' distances. Using this document similarity measure, we propose a browsing mechanism operating on a document dataset. For these purposes, we use a hierarchical browsing environment which we call the document similarity pyramid. It allows the user to browse a large document dataset and to search for documents in the dataset that are similar to the query. The user can browse the dataset on different levels of the pyramid, and zoom into the documents that are of interest.
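A simplified sketch of the weighted component-distance idea: each component is reduced to a single Gaussian and compared with a symmetric KL divergence, whereas the paper uses full Gaussian mixtures; the component statistics and weights are hypothetical.

```python
import numpy as np

def sym_kl_gauss(mu1, cov1, mu2, cov2):
    """Symmetric KL divergence between two multivariate Gaussians."""
    def kl(m1, c1, m2, c2):
        d = len(m1)
        inv2 = np.linalg.inv(c2)
        diff = m2 - m1
        return 0.5 * (np.trace(inv2 @ c1) + diff @ inv2 @ diff
                      - d + np.log(np.linalg.det(c2) / np.linalg.det(c1)))
    return kl(mu1, cov1, mu2, cov2) + kl(mu2, cov2, mu1, cov1)

def document_distance(doc_a, doc_b,
                      weights={"background": 0.2, "text": 0.5, "saliency": 0.3}):
    """Weighted sum of per-component distances between two documents."""
    return sum(w * sym_kl_gauss(*doc_a[name], *doc_b[name])
               for name, w in weights.items())

# Hypothetical per-component layout statistics (mean position and covariance).
doc1 = {k: (np.array([0.4, 0.5]), np.eye(2) * 0.05)
        for k in ("background", "text", "saliency")}
doc2 = {k: (np.array([0.6, 0.4]), np.eye(2) * 0.08)
        for k in ("background", "text", "saliency")}
print("layout distance:", round(document_distance(doc1, doc2), 3))
```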
Research Notes - Openness and Evolvability - Documentation Quality Assessment
2016-08-01
Michael Haddy and Adam Sbrana. This set of Research Notes focusses on Documentation Quality Assessment. This work was undertaken from the late 1990s to 2007.
Anguzu, Ronald; Akun, Pamela R; Ogwang, Rodney; Shour, Abdul Rahman; Sekibira, Rogers; Ningwa, Albert; Nakamya, Phellister; Abbo, Catherine; Mwaka, Amos D; Opar, Bernard; Idro, Richard
2018-01-01
A large amount of preparation goes into setting up trials. Different challenges and lessons are experienced. Our trial, testing a treatment for nodding syndrome, an acquired neurological disorder of unknown cause affecting thousands of children in Eastern Africa, provides a unique case study. As part of a study to determine the aetiology, understand pathogenesis and develop specific treatment, we set up a clinical trial in a remote district hospital in Uganda. This paper describes our experiences and documents supportive structures (enablers), challenges faced and lessons learned during set-up of the trial. Protocol development started in September 2015 with phased recruitment of a critical study team. The team spent 12 months preparing trial documents, procurement and training on procedures. Potential recruitment sites were pre-visited, and district and local leaders met as key stakeholders. Key enablers were supportive local leadership and investment by the district and Ministry of Health. The main challenges were community fears about nodding syndrome, adverse experiences of the community during previous research and political involvement. Other challenges included the number and delays in protocol approvals and lengthy procurement processes. This hard-to-reach area has frequent power and Internet fluctuations, which may affect cold chains for study samples, communication and data management. These concerns decreased with a pilot community engagement programme. Experiences and lessons learnt can reduce the duration of processes involved in trial-site set-up. A programme of community engagement and local leader involvement may be key to the success of a trial and in reducing community opposition towards participation in research.
Dunes on planet Tatooine: Observation of barchan migration at the Star Wars film set in Tunisia
NASA Astrophysics Data System (ADS)
Lorenz, Ralph D.; Gasmi, Nabil; Radebaugh, Jani; Barnes, Jason W.; Ori, Gian G.
2013-11-01
Sand dune migration is documented with a readily-available tool (Google Earth) near Chott El Gharsa, just north-west of Tozeur, Tunisia. As fiducial markers we employ a set of buildings used to portray the fictional city Mos Espa. This set of ~ 20 buildings over roughly a hectare was constructed in 1997 for the movie Star Wars Episode 1 - The Phantom Menace. The site now lies between the arms of a large “pudgy” barchan dune, which has been documented via satellite imaging in 2002, 2004, 2008 and 2009 to have moved from ~ 140 m away to only ~ 10 m away. Visits by the authors to the site in 2009 and 2011 confirm the barchan to be in a threatening position: a smaller set nearby was substantially damaged by being overrun by dunes circa 2004. The migration rate of ~ 15 m/yr decreases over the observation period, possibly as a result of increased local rainfall, and is consistent with barchan migration rates observed at other locations worldwide. The migration rate of this and two other barchans suggests sand transport of ~ 50 m3/m/yr, somewhat higher than would be suggested by traditional wind rose calculations: we explore possible reasons for this discrepancy. Because of the link to popular science fiction, the site may be of pedagogical interest in teaching remote sensing and geomorphic change. We also note that nearby playa surfaces and agricultural areas have a time-variable appearance. The site's popularity as a destination for Star Wars enthusiasts results in many photographs being posted on the internet, providing a rich set of in-situ imagery for continued monitoring in the absence of dedicated field visits.
Tracking Patient Education Documentation across Time and Care Settings
Janousek, Lisa; Heermann, Judith; Eilers, June
2005-01-01
Results of a formative evaluation of a patient education documentation system will be presented. Both quantitative and qualitative approaches to data collection are being used. The goal of integrating patient education documentation into the electronic patient record is to facilitate seamless, multidisciplinary patient/family education across time and settings. The system is being piloted by oncology services at The Nebraska Medical Center. The evaluation addresses the usability and comprehensiveness of the system. PMID:16779280
Operational Concepts for a Generic Space Exploration Communication Network Architecture
NASA Technical Reports Server (NTRS)
Ivancic, William D.; Vaden, Karl R.; Jones, Robert E.; Roberts, Anthony M.
2015-01-01
This document is one of three. It describes the Operational Concept (OpsCon) for a generic space exploration communication architecture. The purpose of this particular document is to identify communication flows and data types. Two other documents accompany this document, a security policy profile and a communication architecture document. The operational concepts should be read first followed by the security policy profile and then the architecture document. The overall goal is to design a generic space exploration communication network architecture that is affordable, deployable, maintainable, securable, evolvable, reliable, and adaptable. The architecture should also require limited reconfiguration throughout system development and deployment. System deployment includes: subsystem development in a factory setting, system integration in a laboratory setting, launch preparation, launch, and deployment and operation in space.
NASA Astrophysics Data System (ADS)
Jones, Chris D.; Arora, Vivek; Friedlingstein, Pierre; Bopp, Laurent; Brovkin, Victor; Dunne, John; Graven, Heather; Hoffman, Forrest; Ilyina, Tatiana; John, Jasmin G.; Jung, Martin; Kawamiya, Michio; Koven, Charlie; Pongratz, Julia; Raddatz, Thomas; Randerson, James T.; Zaehle, Sönke
2016-08-01
Coordinated experimental design and implementation has become a cornerstone of global climate modelling. Model Intercomparison Projects (MIPs) enable systematic and robust analysis of results across many models, by reducing the influence of ad hoc differences in model set-up or experimental boundary conditions. As it enters its 6th phase, the Coupled Model Intercomparison Project (CMIP6) has grown significantly in scope with the design and documentation of individual simulations delegated to individual climate science communities. The Coupled Climate-Carbon Cycle Model Intercomparison Project (C4MIP) takes responsibility for design, documentation, and analysis of carbon cycle feedbacks and interactions in climate simulations. These feedbacks are potentially large and play a leading-order contribution in determining the atmospheric composition in response to human emissions of CO2 and in the setting of emissions targets to stabilize climate or avoid dangerous climate change. For over a decade, C4MIP has coordinated coupled climate-carbon cycle simulations, and in this paper we describe the C4MIP simulations that will be formally part of CMIP6. While the climate-carbon cycle community has created this experimental design, the simulations also fit within the wider CMIP activity, conform to some common standards including documentation and diagnostic requests, and are designed to complement the CMIP core experiments known as the Diagnostic, Evaluation and Characterization of Klima (DECK). C4MIP has three key strands of scientific motivation and the requested simulations are designed to satisfy their needs: (1) pre-industrial and historical simulations (formally part of the common set of CMIP6 experiments) to enable model evaluation, (2) idealized coupled and partially coupled simulations with 1 % per year increases in CO2 to enable diagnosis of feedback strength and its components, (3) future scenario simulations to project how the Earth system will respond to anthropogenic activity over the 21st century and beyond. This paper documents in detail these simulations, explains their rationale and planned analysis, and describes how to set up and run the simulations. Particular attention is paid to boundary conditions, input data, and requested output diagnostics. It is important that modelling groups participating in C4MIP adhere as closely as possible to this experimental design.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jones, Chris D.; Arora, Vivek; Friedlingstein, Pierre
Coordinated experimental design and implementation has become a cornerstone of global climate modelling. Model Intercomparison Projects (MIPs) enable systematic and robust analysis of results across many models, by reducing the influence of ad hoc differences in model set-up or experimental boundary conditions. As it enters its 6th phase, the Coupled Model Intercomparison Project (CMIP6) has grown significantly in scope with the design and documentation of individual simulations delegated to individual climate science communities. The Coupled Climate–Carbon Cycle Model Intercomparison Project (C4MIP) takes responsibility for design, documentation, and analysis of carbon cycle feedbacks and interactions in climate simulations. These feedbacks are potentially large and play a leading-order contribution in determining the atmospheric composition in response to human emissions of CO2 and in the setting of emissions targets to stabilize climate or avoid dangerous climate change. For over a decade, C4MIP has coordinated coupled climate–carbon cycle simulations, and in this paper we describe the C4MIP simulations that will be formally part of CMIP6. While the climate–carbon cycle community has created this experimental design, the simulations also fit within the wider CMIP activity, conform to some common standards including documentation and diagnostic requests, and are designed to complement the CMIP core experiments known as the Diagnostic, Evaluation and Characterization of Klima (DECK). C4MIP has three key strands of scientific motivation and the requested simulations are designed to satisfy their needs: (1) pre-industrial and historical simulations (formally part of the common set of CMIP6 experiments) to enable model evaluation, (2) idealized coupled and partially coupled simulations with 1 % per year increases in CO2 to enable diagnosis of feedback strength and its components, (3) future scenario simulations to project how the Earth system will respond to anthropogenic activity over the 21st century and beyond. This study documents in detail these simulations, explains their rationale and planned analysis, and describes how to set up and run the simulations. Particular attention is paid to boundary conditions, input data, and requested output diagnostics. It is important that modelling groups participating in C4MIP adhere as closely as possible to this experimental design.
Jones, Chris D.; Arora, Vivek; Friedlingstein, Pierre; ...
2016-08-25
Coordinated experimental design and implementation has become a cornerstone of global climate modelling. Model Intercomparison Projects (MIPs) enable systematic and robust analysis of results across many models, by reducing the influence of ad hoc differences in model set-up or experimental boundary conditions. As it enters its 6th phase, the Coupled Model Intercomparison Project (CMIP6) has grown significantly in scope with the design and documentation of individual simulations delegated to individual climate science communities. The Coupled Climate–Carbon Cycle Model Intercomparison Project (C4MIP) takes responsibility for design, documentation, and analysis of carbon cycle feedbacks and interactions in climate simulations. These feedbacks are potentially large and play a leading-order contribution in determining the atmospheric composition in response to human emissions of CO2 and in the setting of emissions targets to stabilize climate or avoid dangerous climate change. For over a decade, C4MIP has coordinated coupled climate–carbon cycle simulations, and in this paper we describe the C4MIP simulations that will be formally part of CMIP6. While the climate–carbon cycle community has created this experimental design, the simulations also fit within the wider CMIP activity, conform to some common standards including documentation and diagnostic requests, and are designed to complement the CMIP core experiments known as the Diagnostic, Evaluation and Characterization of Klima (DECK). C4MIP has three key strands of scientific motivation and the requested simulations are designed to satisfy their needs: (1) pre-industrial and historical simulations (formally part of the common set of CMIP6 experiments) to enable model evaluation, (2) idealized coupled and partially coupled simulations with 1 % per year increases in CO2 to enable diagnosis of feedback strength and its components, (3) future scenario simulations to project how the Earth system will respond to anthropogenic activity over the 21st century and beyond. This study documents in detail these simulations, explains their rationale and planned analysis, and describes how to set up and run the simulations. Particular attention is paid to boundary conditions, input data, and requested output diagnostics. It is important that modelling groups participating in C4MIP adhere as closely as possible to this experimental design.
Toolpack mathematical software development environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osterweil, L.
1982-07-21
The purpose of this research project was to produce a well integrated set of tools for the support of numerical computation. The project entailed the specification, design and implementation of both a diversity of tools and an innovative tool integration mechanism. This large configuration of tightly integrated tools comprises an environment for numerical software development, and has been named Toolpack/IST (Integrated System of Tools). Following the creation of this environment in prototype form, the environment software was readied for widespread distribution by transitioning it to a development organization for systematization, documentation and distribution. It is expected that public release of Toolpack/IST will begin imminently and will provide a basis for evaluation of the innovative software approaches taken as well as a uniform set of development tools for the numerical software community.
Time-lapse video system used to study nesting gyrfalcons
Booms, Travis; Fuller, Mark R.
2003-01-01
We used solar-powered time-lapse video photography to document nesting Gyrfalcon (Falco rusticolus) food habits in central West Greenland from May to July in 2000 and 2001. We collected 2677.25 h of videotape from three nests, representing 94, 87, and 49% of the nestling period at each nest. The video recorded 921 deliveries of 832 prey items. We placed 95% of the items into prey categories. The image quality was good but did not reveal enough detail to identify most passerines to species. We found no evidence that Gyrfalcons were negatively affected by the video system after the initial camera set-up. The video system experienced some mechanical problems but proved reliable. The system likely can be used to effectively document the food habits and nesting behavior of other birds, especially those delivering large prey to a nest or other frequently used site.
Twentieth century turnover of Mexican endemic avifaunas: Landscape change versus climate drivers.
Peterson, A Townsend; Navarro-Sigüenza, Adolfo G; Martínez-Meyer, Enrique; Cuervo-Robayo, Angela P; Berlanga, Humberto; Soberón, Jorge
2015-05-01
Numerous climate change effects on biodiversity have been anticipated and documented, including extinctions, range shifts, phenological shifts, and breakdown of interactions in ecological communities, yet the relative balance of different climate drivers and their relationships to other agents of global change (for example, land use and land-use change) remains relatively poorly understood. This study integrated historical and current biodiversity data on distributions of 115 Mexican endemic bird species to document areas of concentrated gains and losses of species in local communities, and then related those changes to climate and land-use drivers. Of all drivers examined, at this relatively coarse spatial resolution, only temperature change had significant impacts on avifaunal turnover; neither precipitation change nor human impact on landscapes had detectable effects. This study, conducted across species' geographic distributions, and covering all of Mexico, thanks to two large-scale biodiversity data sets, could discern relative importance of specific climatic drivers of biodiversity change.
Névéol, Aurélie; Zeng, Kelly; Bodenreider, Olivier
2006-01-01
Objective This paper explores alternative approaches for the evaluation of an automatic indexing tool for MEDLINE, complementing the traditional precision and recall method. Materials and methods The performance of MTI, the Medical Text Indexer used at NLM to produce MeSH recommendations for biomedical journal articles is evaluated on a random set of MEDLINE citations. The evaluation examines semantic similarity at the term level (indexing terms). In addition, the documents retrieved by queries resulting from MTI index terms for a given document are compared to the PubMed related citations for this document. Results Semantic similarity scores between sets of index terms are higher than the corresponding Dice similarity scores. Overall, 75% of the original documents and 58% of the top ten related citations are retrieved by queries based on the automatic indexing. Conclusions The alternative measures studied in this paper confirm previous findings and may be used to select particular documents from the test set for a more thorough analysis. PMID:17238409
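As a reference point, the Dice similarity mentioned above is the simple set-overlap score sketched below; the term lists are invented.

```python
# Dice coefficient between two sets of indexing terms (toy example).
def dice(a, b):
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

manual = {"Humans", "Neoplasms", "Risk Factors"}
mti    = {"Humans", "Neoplasms", "Prognosis"}
print(dice(manual, mti))  # 0.666...
```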
Neveol, Aurélie; Zeng, Kelly; Bodenreider, Olivier
2006-01-01
This paper explores alternative approaches for the evaluation of an automatic indexing tool for MEDLINE, complementing the traditional precision and recall method. The performance of MTI, the Medical Text Indexer used at NLM to produce MeSH recommendations for biomedical journal articles is evaluated on a random set of MEDLINE citations. The evaluation examines semantic similarity at the term level (indexing terms). In addition, the documents retrieved by queries resulting from MTI index terms for a given document are compared to the PubMed related citations for this document. Semantic similarity scores between sets of index terms are higher than the corresponding Dice similarity scores. Overall, 75% of the original documents and 58% of the top ten related citations are retrieved by queries based on the automatic indexing. The alternative measures studied in this paper confirm previous findings and may be used to select particular documents from the test set for a more thorough analysis.
Piette, John D; Lun, K C; Moura, Lincoln A; Fraser, Hamish S F; Mechael, Patricia N; Powell, John; Khoja, Shariq R
2012-05-01
E-health encompasses a diverse set of informatics tools that have been designed to improve public health and health care. Little information is available on the impacts of e-health programmes, particularly in low- and middle-income countries. We therefore conducted a scoping review of the published and non-published literature to identify data on the effects of e-health on health outcomes and costs. The emphasis was on the identification of unanswered questions for future research, particularly on topics relevant to low- and middle-income countries. Although e-health tools supporting clinical practice have growing penetration globally, there is more evidence of benefits for tools that support clinical decisions and laboratory information systems than for those that support picture archiving and communication systems. Community information systems for disease surveillance have been implemented successfully in several low- and middle-income countries. Although information on outcomes is generally lacking, a large project in Brazil has documented notable impacts on health-system efficiency. Meta-analyses and rigorous trials have documented the benefits of text messaging for improving outcomes such as patients' self-care. Automated telephone monitoring and self-care support calls have been shown to improve some outcomes of chronic disease management, such as glycaemia and blood pressure control, in low- and middle-income countries. Although large programmes for e-health implementation and research are being conducted in many low- and middle-income countries, more information on the impacts of e-health on outcomes and costs in these settings is still needed.
Lun, KC; Moura, Lincoln A; Fraser, Hamish SF; Mechael, Patricia N; Powell, John; Khoja, Shariq R
2012-01-01
Abstract E-health encompasses a diverse set of informatics tools that have been designed to improve public health and health care. Little information is available on the impacts of e-health programmes, particularly in low- and middle-income countries. We therefore conducted a scoping review of the published and non-published literature to identify data on the effects of e-health on health outcomes and costs. The emphasis was on the identification of unanswered questions for future research, particularly on topics relevant to low- and middle-income countries. Although e-health tools supporting clinical practice have growing penetration globally, there is more evidence of benefits for tools that support clinical decisions and laboratory information systems than for those that support picture archiving and communication systems. Community information systems for disease surveillance have been implemented successfully in several low- and middle-income countries. Although information on outcomes is generally lacking, a large project in Brazil has documented notable impacts on health-system efficiency. Meta-analyses and rigorous trials have documented the benefits of text messaging for improving outcomes such as patients’ self-care. Automated telephone monitoring and self-care support calls have been shown to improve some outcomes of chronic disease management, such as glycaemia and blood pressure control, in low- and middle-income countries. Although large programmes for e-health implementation and research are being conducted in many low- and middle-income countries, more information on the impacts of e-health on outcomes and costs in these settings is still needed. PMID:22589570
Capturing exposures: using automated cameras to document environmental determinants of obesity.
Barr, Michelle; Signal, Louise; Jenkin, Gabrielle; Smith, Moira
2015-03-01
Children's exposure to food marketing across multiple everyday settings, a key environmental influence on health, has not yet been objectively documented. Wearable automated cameras (ACs) may have the potential to provide an objective account of this exposure. The purpose of this study is to assess the feasibility of using ACs to document children's exposure to food marketing in multiple settings. A convenience sample of six participants (aged 12) wore a SenseCam device for two full days. Following which, participants attended a focus group to ascertain their experiences of using the device. The collected data were analysed to determine participants' daily and setting specific exposure to 'healthy' and 'unhealthy' food marketing (in minutes). The focus group transcript was analysed using thematic analysis to identify the common themes. Participants collected usable data that could be analysed to determine participant's daily exposure (in minutes) to 'unhealthy' food marketing across a number of everyday settings. Results from the focus group discussion indicated that participants were comfortable wearing the device, after an initial adjustment period. ACs may be an effective tool for documenting children's exposure to food marketing in multiple settings. ACs provide a new method for documenting environmental determinants of obesity and likely other environmental impacts on health. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Extracting Hot spots of Topics from Time Stamped Documents
Chen, Wei; Chundi, Parvathi
2011-01-01
Identifying time periods with a burst of activities related to a topic has been an important problem in analyzing time-stamped documents. In this paper, we propose an approach to extract a hot spot of a given topic in a time-stamped document set. Topics can be basic, containing a simple list of keywords, or complex. Logical relationships such as and, or, and not are used to build complex topics from basic topics. A concept of presence measure of a topic based on fuzzy set theory is introduced to compute the amount of information related to the topic in the document set. Each interval in the time period of the document set is associated with a numeric value which we call the discrepancy score. A high discrepancy score indicates that the documents in the time interval are more focused on the topic than those outside of the time interval. A hot spot of a given topic is defined as a time interval with the highest discrepancy score. We first describe a naive implementation for extracting hot spots. We then construct an algorithm called EHE (Efficient Hot Spot Extraction) using several efficient strategies to improve performance. We also introduce the notion of a topic DAG to facilitate an efficient computation of presence measures of complex topics. The proposed approach is illustrated by several experiments on a subset of the TDT-Pilot Corpus and DBLP conference data set. The experiments show that the proposed EHE algorithm significantly outperforms the naive one, and the extracted hot spots of given topics are meaningful. PMID:21765568
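The following sketch illustrates the naive hot-spot search the paper contrasts with its EHE algorithm: score every contiguous interval and keep the one with the largest inside/outside contrast. The specific discrepancy formula used here (inside mean minus outside mean of an assumed per-time-unit presence value) is an illustrative stand-in for the paper's fuzzy-set-based measure.

```python
# Naive O(n^2) hot-spot search over per-time-unit presence values (illustrative).
def naive_hot_spot(presence):
    n = len(presence)
    total = sum(presence)
    best, best_score = None, float("-inf")
    for i in range(n):
        inside = 0.0
        for j in range(i, n):
            inside += presence[j]
            outside_len = n - (j - i + 1)
            outside = (total - inside) / outside_len if outside_len else 0.0
            score = inside / (j - i + 1) - outside   # assumed discrepancy score
            if score > best_score:
                best, best_score = (i, j), score
    return best, best_score

presence = [0.1, 0.0, 0.2, 0.9, 0.8, 0.7, 0.1, 0.0]
print(naive_hot_spot(presence))  # expect the burst around indices 3..5
```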
Cornelius, Victoria R; Liu, Kun; Peacock, Janet; Sauzet, Odile
2016-01-01
Objective To compare consistency of adverse drug reaction (ADR) data in publicly available product information documents for brand drugs, between the USA and Europe. To assess the usefulness of information for prescribers and patients. Design A comparison review of product information documents for antidepressants and anticonvulsants concurrently marketed by the same pharmaceutical company in the USA and Europe. Setting For each drug, data were extracted from the US Product Inserts and the European Summary of Product Characteristics documents between 09/2013 and 01/2015. Participants Individuals contributing ADR information to product information documents. Main outcomes measures All ADRs reported in product information sections 5 and 6 (USA), and 4·4 and 4·8 (Europe). Results Twelve brand drugs—24 paired documents—were included. On average, there were 77 more ADRs reported in the USA compared with in the European product information document, with a median number of 201 ADRs (range: 65–425) and 114 (range: 56–265), respectively. More product information documents in the USA reported information on the source of evidence (10 vs 5) and risk (9 vs 5) for greater than 80% of ADRs included in the document. There was negligible information included regarding duration, severity, reversibility or recurrence of ADRs. On average, only 29% of ADR terms were reported in both paired documents. Conclusions Product information documents contained a large number of ADRs, but lacked contextual data and information important to patients and prescribers, such as duration, severity and reversibility. The ADR profile was found to be inconsistently reported between the USA and Europe, for the same drug. Identifying, selecting, summarising and presenting multidimensional harm data should be underpinned by practical evidence-based guidelines. In order for prescribers to provide considered risk-benefit advice across competing drug therapies to patients, they need access to comprehensible and reliable ADR information. PMID:26996819
Transition to Postsecondary: New Documentation Guidance for Access to Accommodations
ERIC Educational Resources Information Center
Klotz, Mary Beth
2012-01-01
The Association on Higher Education and Disability (AHEAD) recently developed a conceptual framework that substantially revises its guidance for disability documentation for accommodations in higher education settings. This new document, "Supporting Accommodation Requests: Guidance on Documentation Practices," was written in response to the…
Handling a Collection of PDF Documents
You have several options for making a large collection of PDF documents more accessible to your audience: avoid uploading altogether, use multiple document pages, and use document IDs as anchors for direct links within a document page.
Collaborative filtering to improve navigation of large radiology knowledge resources.
Kahn, Charles E
2005-06-01
Collaborative filtering is a knowledge-discovery technique that can help guide readers to items of potential interest based on the experience of prior users. This study sought to determine the impact of collaborative filtering on navigation of a large, Web-based radiology knowledge resource. Collaborative filtering was applied to a collection of 1,168 radiology hypertext documents available via the Internet. An item-based collaborative filtering algorithm identified each document's six most closely related documents based on 248,304 page views in an 18-day period. Documents were amended to include links to their related documents, and use was analyzed over the next 5 days. The mean number of documents viewed per visit increased from 1.57 to 1.74 (P < 0.0001). Collaborative filtering can increase a radiology information resource's utilization and can improve its usefulness and ease of navigation. The technique holds promise for improving navigation of large Internet-based radiology knowledge resources.
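A minimal sketch of item-based collaborative filtering over page views is given below, assuming (purely for illustration) that each browsing session is a set of documents viewed together and that cosine similarity of co-view counts is the ranking criterion; the production system's exact weighting may differ.

```python
# Item-based collaborative filtering from co-viewed documents (toy data).
from collections import defaultdict
from itertools import combinations
import math

sessions = [
    {"chest_ct", "pulm_embolism", "d_dimer"},
    {"chest_ct", "pulm_embolism"},
    {"knee_mri", "acl_tear"},
    {"chest_ct", "d_dimer"},
]

views = defaultdict(int)      # how often each document is viewed
co_views = defaultdict(int)   # how often a pair is viewed in the same session
for s in sessions:
    for doc in s:
        views[doc] += 1
    for a, b in combinations(sorted(s), 2):
        co_views[(a, b)] += 1

def related(doc, k=6):
    """Rank other documents by cosine similarity of co-view counts."""
    scores = {}
    for (a, b), c in co_views.items():
        if doc in (a, b):
            other = b if a == doc else a
            scores[other] = c / math.sqrt(views[doc] * views[other])
    return sorted(scores.items(), key=lambda kv: -kv[1])[:k]

print(related("chest_ct"))
```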
Informatics in radiology: use of CouchDB for document-based storage of DICOM objects.
Rascovsky, Simón J; Delgado, Jorge A; Sanz, Alexander; Calvo, Víctor D; Castrillón, Gabriel
2012-01-01
Picture archiving and communication systems traditionally have depended on schema-based Structured Query Language (SQL) databases for imaging data management. To optimize database size and performance, many such systems store a reduced set of Digital Imaging and Communications in Medicine (DICOM) metadata, discarding informational content that might be needed in the future. As an alternative to traditional database systems, document-based key-value stores recently have gained popularity. These systems store documents containing key-value pairs that facilitate data searches without predefined schemas. Document-based key-value stores are especially suited to archive DICOM objects because DICOM metadata are highly heterogeneous collections of tag-value pairs conveying specific information about imaging modalities, acquisition protocols, and vendor-supported postprocessing options. The authors used an open-source document-based database management system (Apache CouchDB) to create and test two such databases; CouchDB was selected for its overall ease of use, capability for managing attachments, and reliance on HTTP and Representational State Transfer standards for accessing and retrieving data. A large database was created first in which the DICOM metadata from 5880 anonymized magnetic resonance imaging studies (1,949,753 images) were loaded by using a Ruby script. To provide the usual DICOM query functionality, several predefined "views" (standard queries) were created by using JavaScript. For performance comparison, the same queries were executed in both the CouchDB database and a SQL-based DICOM archive. The capabilities of CouchDB for attachment management and database replication were separately assessed in tests of a similar, smaller database. Results showed that CouchDB allowed efficient storage and interrogation of all DICOM objects; with the use of information retrieval algorithms such as map-reduce, all the DICOM metadata stored in the large database were searchable with only a minimal increase in retrieval time over that with the traditional database management system. Results also indicated possible uses for document-based databases in data mining applications such as dose monitoring, quality assurance, and protocol optimization. RSNA, 2012
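Because CouchDB exposes its views over plain HTTP, a client can be as simple as the hedged Python sketch below; the database name, design document, and view are hypothetical, not the schema used in the article.

```python
# Querying a CouchDB view over HTTP; "dicom", "studies", and "by_patient"
# are illustrative names, not the article's actual design documents.
import json
import requests

COUCH = "http://localhost:5984"

def studies_for_patient(patient_id):
    url = f"{COUCH}/dicom/_design/studies/_view/by_patient"
    resp = requests.get(url, params={"key": json.dumps(patient_id)})
    resp.raise_for_status()
    return [row["value"] for row in resp.json()["rows"]]

print(studies_for_patient("PAT-0001"))
```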
A Multiscale Survival Process for Modeling Human Activity Patterns.
Zhang, Tianyang; Cui, Peng; Song, Chaoming; Zhu, Wenwu; Yang, Shiqiang
2016-01-01
Human activity plays a central role in understanding large-scale social dynamics. It is well documented that individual activity pattern follows bursty dynamics characterized by heavy-tailed interevent time distributions. Here we study a large-scale online chatting dataset consisting of 5,549,570 users, finding that individual activity pattern varies with timescales whereas existing models only approximate empirical observations within a limited timescale. We propose a novel approach that models the intensity rate of an individual triggering an activity. We demonstrate that the model precisely captures corresponding human dynamics across multiple timescales over five orders of magnitudes. Our model also allows extracting the population heterogeneity of activity patterns, characterized by a set of individual-specific ingredients. Integrating our approach with social interactions leads to a wide range of implications.
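The bursty, heavy-tailed interevent statistics referred to above can be illustrated with the toy comparison below, which contrasts power-law and exponential waiting times; this is not the paper's multiscale survival model.

```python
# Heavy-tailed (power-law) vs memoryless (exponential) interevent times.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

heavy = rng.pareto(a=1.5, size=n) + 1.0                 # classical Pareto, minimum 1
expon = rng.exponential(scale=heavy.mean(), size=n)     # same mean, light tail

for name, t in (("power-law", heavy), ("exponential", expon)):
    print(f"{name}: mean={t.mean():.2f}, 99.9th percentile={np.quantile(t, 0.999):.1f}")
```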
Rapid high-silica magma generation in basalt-dominated rift settings
NASA Astrophysics Data System (ADS)
Berg, Sylvia E.; Troll, Valentin R.; Burchardt, Steffi; Deegan, Frances M.; Riishuus, Morten S.; Whitehouse, Martin J.; Harris, Chris; Freda, Carmela; Ellis, Ben S.; Krumbholz, Michael; Gústafsson, Ludvik E.
2015-04-01
The processes that drive large-scale silicic magmatism in basalt-dominated provinces have been widely debated for decades, with Iceland being at the centre of this discussion [1-5]. Iceland hosts large accumulations of silicic rocks in a largely basaltic oceanic setting that is considered by some workers to resemble the situation documented for the Hadean [6-7]. We have investigated the time scales and processes of silicic volcanism in the largest complete pulse of Neogene rift-related silicic magmatism preserved in Iceland (>450 km3), which is a potential analogue of initial continent nucleation in early Earth. Borgarfjörður Eystri in NE-Iceland hosts silicic rocks in excess of 20 vol.%, which exceeds the ≤12 vol% usual for Iceland [3,8]. New SIMS zircon ages document that the dominantly explosive silicic pulse was generated within a ≤2 Myr window (13.5 ± 0.2 to 12.2 ± 03 Ma), and sub-mantle zircon δ18O values (1.2 to 4.5 ± 0.2‰, n=337) indicate ≤33% assimilation of low-δ18O hydrothermally-altered crust (δ18O=0‰), with intense crustal melting at 12.5 Ma, followed by rapid termination of silicic magma production once crustal fertility declined [9]. This silicic outburst was likely caused by extensive rift flank volcanism due to a rift relocation and a flare of the Iceland plume [4,10] that triggered large-scale crustal melting and generated mixed-origin silicic melts. High-silica melt production from a basaltic parent was replicated in a set of new partial melting experiments of regional hydrated basalts, conducted at 800-900°C and 150 MPa, that produced silicic melt pockets up to 77 wt.% SiO2. Moreover, Ti-in-zircon thermometry from Borgarfjörður Eystri give a zircon crystallisation temperature ~713°C (Ti range from 2.4 to 22.1 ppm, average=7.7 ppm, n=142), which is lower than recorded elsewhere in Iceland [11], but closely overlaps with the zircon crystallisation temperatures documented for Hadean zircon populations [11-13], hinting at crustal recycling as a key process. Our results therefore provide a mechanism and a time-scale for rapid, voluminous silicic magma generation in modern and ancient basalt-dominated rift setting, such as Afar, Taupo, and potentially early Earth. The Neogene plume-related rift flank setting of NE-Iceland may thus constitute a plausible geodynamic and compositional analogue for generating silicic (continental) crust in the subduction-free setting of a young Earth (e.g. ≥3 Ga [14]). [1] Bunsen, R. 1851. Ann. Phys. Chem. 159, 197-272. [2] MacDonald R., et al., 1987. Mineral. Mag. 51, 183-202. [3] Jonasson, K., 2007. J. Geodyn. 43, 101-117. [4] Martin, E., et al., 2011. Earth Planet. Sci. Lett. 311, 28-38. [5] Charreteur, G., et al., 2013.Contrib. Mineral. Petr. 166, 471- 490. [6] Willbold, E., et al., 2009. Earth Planet. Sci. Lett. 279, 44-52. [7] Reimink, J.R., et al., 2014. Nat. Geosci. 7, 529-533. [8] Gústafsson, L.E., et al., 1989. Jökull 39, 75-89. [9] Meade, F.C., et al., 2014. Nat. comm. 5. [10] Óskarsson, B.V., Riishuus, M.S., 2013. J. Volcanol. Geoth. Res. 267, 92-118. [11] Carley, T.L., et al., 2014. Earth Planet. Sci. Lett. 405, 85-97. [12] Trail, D., et al., 2007. Geochem. Geophys. Geosyst.8, Q06014. [13] Harrison, T.M. et al., 2008. Earth Planet. Sci. Lett.268, 476-486. [14] Kamber, B. S., et al., 2005. Earth Planet. Sci. Lett. 240, 276-290.
Postinfusion Phlebitis: Incidence and Risk Factors
Webster, Joan; McGrail, Matthew; Marsh, Nicole; Wallis, Marianne C.; Ray-Barruel, Gillian; Rickard, Claire M.
2015-01-01
Objective. To document the incidence of postinfusion phlebitis and to investigate associated risk factors. Design. Analysis of existing data set from a large randomized controlled trial, the primary purpose of which was to compare routine peripheral intravascular catheter changes with changing catheters only on clinical indication. Participants and Setting. Patients admitted to a large, acute general hospital in Queensland, Australia, and who required a peripheral intravenous catheter. Results. 5,907 PIVCs from 3,283 patients were studied. Postinfusion phlebitis at 48 hours was diagnosed in 59 (1.8%) patients. Fifteen (25.4%) of these patients had phlebitis at removal and also at 48 hours after removal. When data were analyzed per catheter, the rate was lower, 62/5907 (1.1%). The only variable associated with postinfusion phlebitis was placement of the catheter in the emergency room (P = 0.03). Conclusion. Although not a common occurrence, postinfusion phlebitis may be problematic so it is important for health care staff to provide patients with information about what to look for after an intravascular device has been removed. This trial is registered with ACTRN12608000445370. PMID:26075092
Semantic Document Model to Enhance Data and Knowledge Interoperability
NASA Astrophysics Data System (ADS)
Nešić, Saša
To enable document data and knowledge to be efficiently shared and reused across application, enterprise, and community boundaries, desktop documents should be completely open and queryable resources, whose data and knowledge are represented in a form understandable to both humans and machines. At the same time, these are the requirements that desktop documents need to satisfy in order to contribute to the visions of the Semantic Web. With the aim of achieving this goal, we have developed the Semantic Document Model (SDM), which turns desktop documents into Semantic Documents as uniquely identified and semantically annotated composite resources, that can be instantiated into human-readable (HR) and machine-processable (MP) forms. In this paper, we present the SDM along with an RDF and ontology-based solution for the MP document instance. Moreover, on top of the proposed model, we have built the Semantic Document Management System (SDMS), which provides a set of services that exploit the model. As an application example that takes advantage of SDMS services, we have extended MS Office with a set of tools that enables users to transform MS Office documents (e.g., MS Word and MS PowerPoint) into Semantic Documents, and to search local and distant semantic document repositories for document content units (CUs) over Semantic Web protocols.
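As a rough illustration of the machine-processable side of such a model, the sketch below uses rdflib to describe a document and one of its content units as uniquely identified, annotated RDF resources; the namespace and properties are invented and are not the SDM ontology itself.

```python
# A document and one content unit as annotated RDF resources (illustrative schema).
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import RDF, DCTERMS

EX = Namespace("http://example.org/sdm/")
g = Graph()
doc = URIRef("http://example.org/docs/report-42")
cu = URIRef("http://example.org/docs/report-42#section-1")

g.add((doc, RDF.type, EX.SemanticDocument))
g.add((doc, DCTERMS.title, Literal("Quarterly report")))
g.add((doc, EX.hasContentUnit, cu))
g.add((cu, RDF.type, EX.ContentUnit))
g.add((cu, DCTERMS.subject, Literal("revenue forecast")))

print(g.serialize(format="turtle"))
```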
Machine learning-based coreference resolution of concepts in clinical documents
Ware, Henry; Mullett, Charles J; El-Rawas, Oussama
2012-01-01
Objective Coreference resolution of concepts, although a very active area in the natural language processing community, has not yet been widely applied to clinical documents. Accordingly, the 2011 i2b2 competition focusing on this area is a timely and useful challenge. The objective of this research was to collate coreferent chains of concepts from a corpus of clinical documents. These concepts are in the categories of person, problems, treatments, and tests. Design A machine learning approach based on graphical models was employed to cluster coreferent concepts. Features selected were divided into domain independent and domain specific sets. Training was done with the i2b2 provided training set of 489 documents with 6949 chains. Testing was done on 322 documents. Results The learning engine, using the un-weighted average of three different measurement schemes, resulted in an F measure of 0.8423 where no domain specific features were included and 0.8483 where the feature set included both domain independent and domain specific features. Conclusion Our machine learning approach is a promising solution for recognizing coreferent concepts, which in turn is useful for practical applications such as the assembly of problem and medication lists from clinical documents. PMID:22582205
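A common simplification of this task, sketched below, scores mention pairs with a classifier and then clusters by transitive closure; this pairwise baseline is for illustration only and is not the authors' graphical model, and the features and training pairs are invented.

```python
# Pairwise mention classifier + transitive-closure clustering (toy baseline).
from sklearn.linear_model import LogisticRegression
import numpy as np

def features(m1, m2):
    # Domain-independent toy features: exact match, head-word match, token distance.
    return [float(m1["text"] == m2["text"]),
            float(m1["text"].split()[-1] == m2["text"].split()[-1]),
            abs(m1["pos"] - m2["pos"]) / 100.0]

train = [({"text": "the patient", "pos": 3}, {"text": "the patient", "pos": 40}, 1),
         ({"text": "chest pain", "pos": 10}, {"text": "the pain", "pos": 55}, 1),
         ({"text": "chest pain", "pos": 10}, {"text": "aspirin", "pos": 70}, 0),
         ({"text": "the patient", "pos": 3}, {"text": "aspirin", "pos": 70}, 0)]
X = np.array([features(a, b) for a, b, _ in train])
y = np.array([label for *_, label in train])
clf = LogisticRegression().fit(X, y)

def chains(mentions, threshold=0.5):
    """Greedy transitive-closure clustering over positively scored pairs."""
    parent = list(range(len(mentions)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(mentions)):
        for j in range(i + 1, len(mentions)):
            p = clf.predict_proba([features(mentions[i], mentions[j])])[0, 1]
            if p >= threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i, m in enumerate(mentions):
        groups.setdefault(find(i), []).append(m["text"])
    return list(groups.values())

print(chains([{"text": "the patient", "pos": 2}, {"text": "chest pain", "pos": 9},
              {"text": "the patient", "pos": 60}, {"text": "aspirin", "pos": 80}]))
```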
Serup, J
2001-08-01
Regulations for cosmetic products primarily address safety of the products that may be used by large populations of healthy consumers. Requirements for documentation of efficacy claims are only fragmentary. This synopsis aims to review and conclude a set of standards that may be acceptable to the European Community, and the cosmetic industry, as a legal standard for efficacy documentation in Europe in the future. Ethical, formal, experimental, statistical and other aspects of efficacy testing are described, including validation, quality control and assurance. The importance of user relevant clinical end points, a controlled randomized trial design and evidence-based cosmetic product documentation, validation of methods, statistical power estimation and proper data handling, reporting and archiving is emphasized. The main principles of the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) good clinical practice (GCP) should be followed by the cosmetics industry in a spirit of good documentation standard and scientific soundness, but full GCP is not considered mandatory in the field of cosmetics. Documentation by validated bio-instrumental methods may be acceptable, but efficacy documentation based on information about raw materials, reference to literature and laboratory experiments are only acceptable in exceptional cases. Principles for efficacy substantiation of cosmetic products in Europe, as described in this synopsis, are officially proposed by the Danish Ministry of Environment and Energy to the European Community as a basis for an amendment to the Cosmetics Directive or otherwise implemented as a European Community regulation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cadwell, J.J.; Ruger, C.J.
1995-12-01
This document is one of a three-report set; BNL 52201 contains detailed information for use by executives. BNL 52202 is titled U.S. Statutes of General Interest to Safeguards and Security Officers, and contains less detail than BNL 52201. It is intended for use by officers. BNL 52203 is titled U.S. Statutes for Enforcement by Security Inspectors, and only contains statutes to be applied by uniformed security inspectors. These are a newly updated version of a set of documents of similar titles published in September 1988, which were an updated version of an original set of documents published in November 1983.
Governance of the International Linear Collider Project
DOE Office of Scientific and Technical Information (OSTI.GOV)
Foster, B.; /Oxford U.; Barish, B.
Governance models for the International Linear Collider Project are examined in the light of experience from similar international projects around the world. Recommendations for one path which could be followed to realize the ILC successfully are outlined. The International Linear Collider (ILC) is a unique endeavour in particle physics; fully international from the outset, it has no 'host laboratory' to provide infrastructure and support. The realization of this project therefore presents unique challenges, in scientific, technical and political arenas. This document outlines the main questions that need to be answered if the ILC is to become a reality. It describes the methodology used to harness the wisdom displayed and lessons learned from current and previous large international projects. From this basis, it suggests both general principles and outlines a specific model to realize the ILC. It recognizes that there is no unique model for such a laboratory and that there are often several solutions to a particular problem. Nevertheless it proposes concrete solutions that the authors believe are currently the best choices in order to stimulate discussion and catalyze proposals as to how to bring the ILC project to fruition. The ILC Laboratory would be set up by international treaty and be governed by a strong Council to whom a Director General and an associated Directorate would report. Council would empower the Director General to give strong management to the project. It would take its decisions in a timely manner, giving appropriate weight to the financial contributions of the member states. The ILC Laboratory would be set up for a fixed term, capable of extension by agreement of all the partners. The construction of the machine would be based on a Work Breakdown Structure and value engineering and would have a common cash fund sufficiently large to allow the management flexibility to optimize the project's construction. Appropriate contingency, clearly apportioned at both a national and global level, is essential if the project is to be realised. Finally, models for running costs and decommissioning at the conclusion of the ILC project are proposed. This document represents an interim report of the bodies and individuals studying these questions inside the structure set up and supervised by the International Committee for Future Accelerators (ICFA). It represents a request for comment to the international community in all relevant disciplines, scientific, technical and most importantly, political. Many areas require further study and some, in particular the site selection process, have not yet progressed sufficiently to be addressed in detail in this document. Discussion raised by this document will be vital in framing the final proposals due to be published in 2012 in the Technical Design Report being prepared by the Global Design Effort of the ILC.
IP-Based Video Modem Extender Requirements
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pierson, L G; Boorman, T M; Howe, R E
2003-12-16
Visualization is one of the keys to understanding large complex data sets such as those generated by the large computing resources purchased and developed by the Advanced Simulation and Computing program (aka ASCI). In order to be convenient to researchers, visualization data must be distributed to offices and large complex visualization theaters. Currently, local distribution of the visual data is accomplished by distance limited modems and RGB switches that simply do not scale to hundreds of users across the local, metropolitan, and WAN distances without incurring large costs in fiber plant installation and maintenance. Wide Area application over the DOE Complex is infeasible using these limited distance RGB extenders. On the other hand, Internet Protocols (IP) over Ethernet is a scalable well-proven technology that can distribute large volumes of data over these distances. Visual data has been distributed at lower resolutions over IP in industrial applications. This document describes requirements of the ASCI program in visual signal distribution for the purpose of identifying industrial partners willing to develop products to meet ASCI's needs.
14 CFR 302.3 - Filing of documents.
Code of Federal Regulations, 2013 CFR
2013-01-01
... in Washington, DC. Documents may be filed either on paper or by electronic means using the process set at the DOT Dockets Management System (DMS) internet website. (2) Such documents will be deemed to... below the space provided for signature. Electronic filers need only submit one copy of the document...
14 CFR 302.3 - Filing of documents.
Code of Federal Regulations, 2012 CFR
2012-01-01
... in Washington, DC. Documents may be filed either on paper or by electronic means using the process set at the DOT Dockets Management System (DMS) internet website. (2) Such documents will be deemed to... below the space provided for signature. Electronic filers need only submit one copy of the document...
14 CFR 302.3 - Filing of documents.
Code of Federal Regulations, 2014 CFR
2014-01-01
... in Washington, DC. Documents may be filed either on paper or by electronic means using the process set at the DOT Dockets Management System (DMS) internet website. (2) Such documents will be deemed to... below the space provided for signature. Electronic filers need only submit one copy of the document...
14 CFR 302.3 - Filing of documents.
Code of Federal Regulations, 2011 CFR
2011-01-01
... in Washington, DC. Documents may be filed either on paper or by electronic means using the process set at the DOT Dockets Management System (DMS) internet website. (2) Such documents will be deemed to... below the space provided for signature. Electronic filers need only submit one copy of the document...
A laser-based eye-tracking system.
Irie, Kenji; Wilson, Bruce A; Jones, Richard D; Bones, Philip J; Anderson, Tim J
2002-11-01
This paper reports on the development of a new eye-tracking system for noninvasive recording of eye movements. The eye tracker uses a flying-spot laser to selectively image landmarks on the eye and, subsequently, measure horizontal, vertical, and torsional eye movements. Considerable work was required to overcome the adverse effects of specular reflection of the flying-spot from the surface of the eye onto the sensing elements of the eye tracker. These effects have been largely overcome, and the eye-tracker has been used to document eye movement abnormalities, such as abnormal torsional pulsion of saccades, in the clinical setting.
SAMSAN- MODERN NUMERICAL METHODS FOR CLASSICAL SAMPLED SYSTEM ANALYSIS
NASA Technical Reports Server (NTRS)
Frisch, H. P.
1994-01-01
SAMSAN was developed to aid the control system analyst by providing a self consistent set of computer algorithms that support large order control system design and evaluation studies, with an emphasis placed on sampled system analysis. Control system analysts have access to a vast array of published algorithms to solve an equally large spectrum of controls related computational problems. The analyst usually spends considerable time and effort bringing these published algorithms to an integrated operational status and often finds them less general than desired. SAMSAN reduces the burden on the analyst by providing a set of algorithms that have been well tested and documented, and that can be readily integrated for solving control system problems. Algorithm selection for SAMSAN has been biased toward numerical accuracy for large order systems with computational speed and portability being considered important but not paramount. In addition to containing relevant subroutines from EISPAK for eigen-analysis and from LINPAK for the solution of linear systems and related problems, SAMSAN contains the following not so generally available capabilities: 1) Reduction of a real non-symmetric matrix to block diagonal form via a real similarity transformation matrix which is well conditioned with respect to inversion, 2) Solution of the generalized eigenvalue problem with balancing and grading, 3) Computation of all zeros of the determinant of a matrix of polynomials, 4) Matrix exponentiation and the evaluation of integrals involving the matrix exponential, with option to first block diagonalize, 5) Root locus and frequency response for single variable transfer functions in the S, Z, and W domains, 6) Several methods of computing zeros for linear systems, and 7) The ability to generate documentation "on demand". All matrix operations in the SAMSAN algorithms assume non-symmetric matrices with real double precision elements. There is no fixed size limit on any matrix in any SAMSAN algorithm; however, it is generally agreed by experienced users, and in the numerical error analysis literature, that computation with non-symmetric matrices of order greater than about 200 should be avoided or treated with extreme care. SAMSAN attempts to support the needs of application oriented analysis by providing: 1) a methodology with unlimited growth potential, 2) a methodology to insure that associated documentation is current and available "on demand", 3) a foundation of basic computational algorithms that most controls analysis procedures are based upon, 4) a set of check out and evaluation programs which demonstrate usage of the algorithms on a series of problems which are structured to expose the limits of each algorithm's applicability, and 5) capabilities which support both a priori and a posteriori error analysis for the computational algorithms provided. The SAMSAN algorithms are coded in FORTRAN 77 for batch or interactive execution and have been implemented on a DEC VAX computer under VMS 4.7. An effort was made to assure that the FORTRAN source code was portable and thus SAMSAN may be adaptable to other machine environments. The documentation is included on the distribution tape or can be purchased separately at the price below. SAMSAN version 2.0 was developed in 1982 and updated to version 3.0 in 1988.
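Although SAMSAN itself is FORTRAN 77, two of the operations listed above (the matrix exponential and the integral involving it) can be illustrated with the short NumPy/SciPy sketch below, which discretizes an invented continuous-time system via the standard augmented-matrix construction.

```python
# Matrix exponential and its integral via the augmented-matrix (Van Loan) trick.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # continuous-time state matrix (invented)
B = np.array([[0.0], [1.0]])               # input matrix (invented)
T = 0.1                                    # sample period

# Phi = e^{A T}, Gamma = (integral_0^T e^{A s} ds) B, both read off one
# exponential of the augmented matrix [[A, B], [0, 0]].
n, m = A.shape[0], B.shape[1]
M = np.zeros((n + m, n + m))
M[:n, :n], M[:n, n:] = A, B
E = expm(M * T)
Phi, Gamma = E[:n, :n], E[:n, n:]
print("Phi =\n", Phi)
print("Gamma =\n", Gamma)
```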
2008-11-01
T or more words, where T is a threshold that is empirically set to 300 in the experiment. The second rule aims to remove pornographic documents...Some blog documents are embedded with pornographic words to attract search traffic. We identify a list of pornographic words. Given a blog document, all...document, this document is considered pornographic spam, and is discarded. The third rule removes documents written in foreign languages. We count the
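Read at face value, the snippet above describes three heuristic filters; a hedged sketch follows. The word-count rule uses the stated threshold T = 300, while the exact triggers for the pornographic-word rule and the foreign-language rule are elided in the source, so the versions shown (any listed word present; a majority of non-ASCII characters) are assumptions for illustration only.

```python
# Three blog-filtering rules as assumed from the truncated snippet above.
PORNOGRAPHIC_WORDS = {"example_blocked_term"}   # placeholder word list
T = 300                                         # threshold stated in the snippet

def keep_document(text):
    words = text.split()
    if len(words) < T:                                        # rule 1: too short
        return False
    if any(w.lower() in PORNOGRAPHIC_WORDS for w in words):   # rule 2: spam terms (assumed trigger)
        return False
    non_ascii = sum(1 for ch in text if ord(ch) > 127)        # rule 3: foreign language (assumed trigger)
    if non_ascii > len(text) / 2:
        return False
    return True

print(keep_document("short post"))  # False: fewer than 300 words
```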
Improving the Quality of Nursing Documentation in Home Health Care Setting
ERIC Educational Resources Information Center
Obioma, Chidiadi
2017-01-01
Poor nursing documentation of patient care was identified in daily nurse visit notes in a health care setting. This problem affects effective communication of patient status with other clinicians, thereby jeopardizing clinical decision-making. The purpose of this evidence-based project was to determine the impact of a retraining program on the…
NASA Astrophysics Data System (ADS)
David, Peter; Hansen, Nichole; Nolan, James J.; Alcocer, Pedro
2015-05-01
The growth in text data available online is accompanied by a growth in the diversity of available documents. Corpora with extreme heterogeneity in terms of file formats, document organization, page layout, text style, and content are common. The absence of meaningful metadata describing the structure of online and open-source data leads to text extraction results that contain no information about document structure and are cluttered with page headers and footers, web navigation controls, advertisements, and other items that are typically considered noise. We describe an approach to document structure and metadata recovery that uses visual analysis of documents to infer the communicative intent of the author. Our algorithm identifies the components of documents such as titles, headings, and body content, based on their appearance. Because it operates on an image of a document, our technique can be applied to any type of document, including scanned images. Our approach to document structure recovery considers a finer-grained set of component types than prior approaches. In this initial work, we show that a machine learning approach to document structure recovery using a feature set based on the geometry and appearance of images of documents achieves a 60% greater F1-score than a baseline random classifier.
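A toy version of such an appearance-based component classifier is sketched below: geometric features of text blocks feed a supervised model whose macro F1 is compared against a random baseline. The features, synthetic labels, and data are invented.

```python
# Appearance-based block classification vs a random baseline (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.dummy import DummyClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
# Features per block: [y position on page (0=top), block height, relative font size]
X = rng.uniform(size=(n, 3))
# Invented labelling rule: large type near the top -> title; large type elsewhere -> heading.
y = np.where((X[:, 0] < 0.15) & (X[:, 2] > 0.7), "title",
             np.where(X[:, 2] > 0.6, "heading", "body"))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
baseline = DummyClassifier(strategy="uniform", random_state=0).fit(X_tr, y_tr)

for name, clf in (("model", model), ("random baseline", baseline)):
    print(name, round(f1_score(y_te, clf.predict(X_te), average="macro"), 2))
```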
Luscombe, Ciara; Montgomery, Julia
2016-07-19
Lectures continue to be an efficient and standardised way to deliver information to large groups of students. It has been well documented that students prefer interactive lectures, based on active learning principles, to didactic teaching in the large group setting. Despite this, it is often the case that many students do not engage with active learning tasks and attempts at interaction. By exploring student experiences, expectations and how they use lectures in their learning we will provide recommendations for faculty to support student learning both in the lecture theatre and during personal study time. This research employed a hermeneutic phenomenological approach. Three focus groups, consisting of 19 students in total, were used to explore the experiences of second year medical students in large group teaching sessions. Using generic thematic data analysis, these accounts have been developed into a meaningful account of experience. This study found there to be a well-established learning culture amongst students and with it, expectations as to the format of teaching sessions. Furthermore, there were set perceptions about the student role within the learning environment which had many implications, including the way that innovative teaching methods were received. Student learning was perceived to take place outside the lecture theatre, with a large emphasis placed on creating resources that can be taken away to use in personal study time. Presented here is a constructive review of reasons for student participation, interaction and engagement in large group teaching sessions. Based on this are recommendations constructed with the view to aid educators in engaging students within this setting. Short term, educators can implement strategies that monopolise on the established learning culture of students to encourage engagement with active learning strategies. Long term, it would be beneficial for educators to consider ways to shift the current student learning culture to one that embraces an active learning curriculum.
Sawe, Hendry R; Mfinanga, Juma A; Ringo, Faith H; Mwafongo, Victor; Reynolds, Teri A; Runyon, Michael S
2016-01-01
Objectives To describe the HIV counselling and testing practices for children presenting to an emergency department (ED) in a low-income country. Setting The ED of a large east African national referral hospital. Participants This retrospective review of all paediatric (<18 years old) ED visits in 2012 enrolled patients who had an HIV test ordered and excluded those without testing. Files were available for 5540/5774 (96%) eligible patients and 1632 (30%) were tested for HIV, median age 1.3 years (IQR 9 months to 4 years), 58% <18 months old and 61% male. Primary and secondary outcome measures The primary outcome measure was documentation of pretest and post-test counselling, or deferral of counselling, for children tested for HIV in the ED. Secondary measures included the overall rate of HIV testing, rate of counselling documented in the inpatient record when deferred in the ED, rate of counselling documented when testing was initiated by the inpatient service, rate of counselling documented by test result (positive vs negative) and the rate of referral to follow-up HIV care among patients testing positive. Results Of 418 patients tested in the ED, counselling, or deferral of counselling, was documented for 70 (17%). When deferred to the ward, subsequent counselling was documented for 15/42 (36%). Counselling was documented in 33% of patients testing positive versus 1.1% of patients testing negative (OR 43, 95% CI 23 to 83). Of 199 patients who tested positive and survived to hospital discharge, 76 (38%) were referred for follow-up at the HIV clinic on discharge. Conclusions Physicians documented the provision, or deferral, of counselling for <20% of children tested for HIV in the ED. Counselling was much more likely to be documented when the test result was positive. Less than 40% of those testing positive were referred for follow-up care. PMID:26880672
Use of evidence-based assessments for childhood anxiety disorders within a regional medical system.
Sattler, Adam F; Ale, Chelsea M; Nguyen, Kristin; Gregg, Melissa S; Geske, Jennifer R; Whiteside, Stephen P H
2016-11-01
Anxiety disorders represent a common and serious threat to mental health in children and adolescents. To effectively treat anxiety in children, clinicians must conduct accurate assessment of patients' symptoms. However, despite the importance of assessment in the treatment of childhood anxiety disorders, the literature lacks a thorough analysis of the practices used by clinicians when evaluating such disorders in community settings. Thus, the current study examines the quality of assessment for childhood anxiety disorders in a large regional health system. The results suggest that clinicians often provide non-specific diagnoses, infrequently document symptoms according to diagnostic criteria, and rarely administer rating scales and structured diagnostic interviews. Relatedly, diagnostic agreement across practice settings was low. Finally, the quality of assessment differed according to the setting in which the assessment was conducted and the complexity of the patient's symptomatology. These results highlight the need to develop and disseminate clinically feasible evidence-based assessment practices that can be implemented within resource-constrained service settings. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Lunettes: A Global Inventory of Their Occurrence and Characteristics
NASA Astrophysics Data System (ADS)
Rhodes, D. D.
2012-12-01
Lunettes (including "clay dunes") form along the downwind margins of saline pans. As landforms they were little studied and poorly understood until the mid-20th century when they attracted the attention of Australian geomorphologists, notably James W. Bowler. During the last 40 years, lunettes have been studied extensively as indicators of climate change. Their occurrence has now been documented on every continent except Antarctica. Inspection of more than 100 research sites using Google Earth has led to the recognition that lunettes occur in three definable, though somewhat overlapping, settings. Playas and lunettes are common features of closed basins having sub-humid climates. Large basins with internal drainage due to structural deformation produce the most extensive and complex lake and dune systems (e.g., the Etosha Pan, northern Namibia; Soda Lake, Carrizo Plain, California). In these settings, the large central basin is associated with lunettes that may be more than 10 km long and as much as 50 m high. Tens of lunette ridges may mark former lake levels and channels of the desiccated drainage system. Some basins lack external drainage for hydrological reasons (low precipitation, drainage diversion, etc.). These hydrologically closed basins may also host saline lakes and lunettes (e.g., Lake Malheur, Oregon) though they are generally smaller and less complex. Shallow depressions may occur by the thousands on the surface of arid and semi-arid plains such as Brazil's Pantanal, and the Kalahari and Transvaal of southern Africa, the High Plains of Texas and New Mexico, and several parts of Australia. Although they have not been described in the literature, pans also cover large areas in China, Tibet, and Mongolia. Multiple theories have been advanced to explain the occurrence of plains pans including deflation, piping, subsidence, and animal activity. In the plains setting, pans can occur in large numbers, up to 100 per 100 km2, and may cover 20% or more of the area. Individual pans vary in size by at least three orders of magnitude, with maximum areas of <10 km2. Larger pans may have lunettes, though the strength of the association with pan size varies with climate and substrate. The least commonly documented occurrences of lunettes are in coastal settings, all of which occur at low latitudes (11-28 degrees from the equator). The dunes most commonly occur along shallow channels and lagoons that are flooded only during spring tides or in association with storms and strong on-shore winds. Coastal lunettes have been described in deltaic environments (e.g., Senegal River, Senegal; Mitare River, Venezuela) and lagoonal settings (e.g., Laguna Madre and Copano Bay, Texas). Research into the origin and history of lunettes has been dramatically changed by new dating techniques and detailed geochemical studies. What once seemed a simple relationship between climate and landform has been demonstrated to be complex and non-uniform. The variety of settings in which lunettes form should be added to the list of the factors requiring consideration in making paleoenvironmental interpretations based on their occurrence.
García-Remesal, Miguel; Maojo, Victor; Crespo, José
2010-01-01
In this paper we present a knowledge engineering approach to automatically recognize and extract genetic sequences from scientific articles. To carry out this task, we use a preliminary recognizer based on a finite state machine to extract all candidate DNA/RNA sequences. The latter are then fed into a knowledge-based system that automatically discards false positives and refines noisy and incorrectly merged sequences. We created the knowledge base by manually analyzing different manuscripts containing genetic sequences. Our approach was evaluated using a test set of 211 full-text articles in PDF format containing 3134 genetic sequences. For this set, we achieved 87.76% precision and 97.70% recall, respectively. This method can facilitate different research tasks. These include text mining, information extraction, and information retrieval research dealing with large collections of documents containing genetic sequences.
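A toy sketch of the two-stage pipeline described above follows: a simple pattern-based recognizer stands in for the finite state machine that proposes candidate DNA/RNA strings, and a rule-based filter stands in for the knowledge-based refinement. The length threshold and the rules themselves are invented for illustration.

```python
import re

# Stage 1: crude candidate recognizer (stands in for the finite state machine).
CANDIDATE = re.compile(r"[ACGTUacgtu][ACGTUacgtu\s-]{11,}")

def candidates(text):
    return [m.group() for m in CANDIDATE.finditer(text)]

# Stage 2: rule-based filter (stands in for the knowledge-based refinement).
def is_sequence(candidate, min_len=20):
    s = re.sub(r"[\s-]", "", candidate).upper()   # merge line-wrapped fragments
    return len(s) >= min_len and all(c in "ACGTU" for c in s)

def extract_sequences(text):
    return [re.sub(r"[\s-]", "", c).upper()
            for c in candidates(text) if is_sequence(c)]
```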
B-machine polarimeter: A telescope to measure the polarization of the cosmic microwave background
NASA Astrophysics Data System (ADS)
Williams, Brian Dean
The B-Machine Telescope is the culmination of several years of development, construction, characterization and observation. The telescope is a departure from standard polarization chopping of correlation receivers to a half wave plate technique. Typical polarimeters use a correlation receiver to chop the polarization signal to overcome the 1/f noise inherent in HEMT amplifiers. B-Machine uses a room temperature half wave plate technology to chop between polarization states and measure the polarization signature of the CMB. The telescope has a demodulated 1/f knee of 5 mHz and an average sensitivity of 1.6 mK s. This document examines the construction, characterization, observation of astronomical sources, and data set analysis of B-Machine. Preliminary power spectra and sky maps with large sky coverage for the first year data set are included.
Attractive celebrity and peer images on Instagram: Effect on women's mood and body image.
Brown, Zoe; Tiggemann, Marika
2016-12-01
A large body of research has documented that exposure to images of thin fashion models contributes to women's body dissatisfaction. The present study aimed to experimentally investigate the impact of attractive celebrity and peer images on women's body image. Participants were 138 female undergraduate students who were randomly assigned to view either a set of celebrity images, a set of equally attractive unknown peer images, or a control set of travel images. All images were sourced from public Instagram profiles. Results showed that exposure to celebrity and peer images increased negative mood and body dissatisfaction relative to travel images, with no significant difference between celebrity and peer images. This effect was mediated by state appearance comparison. In addition, celebrity worship moderated an increased effect of celebrity images on body dissatisfaction. It was concluded that exposure to attractive celebrity and peer images can be detrimental to women's body image. Copyright © 2016 Elsevier Ltd. All rights reserved.
Ensemble LUT classification for degraded document enhancement
NASA Astrophysics Data System (ADS)
Obafemi-Ajayi, Tayo; Agam, Gady; Frieder, Ophir
2008-01-01
The fast evolution of scanning and computing technologies has led to the creation of large collections of scanned paper documents. Examples of such collections include historical collections, legal depositories, medical archives, and business archives. Moreover, in many situations such as legal litigation and security investigations, scanned collections are being used to facilitate systematic exploration of the data. It is almost always the case that scanned documents suffer from some form of degradation. Large degradations make documents hard to read and substantially deteriorate the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models. When the degradation is large, global degradation models do not perform well. In contrast, we propose to estimate local degradation models and use them in enhancing degraded document images. Using a semi-automated enhancement system we have labeled a subset of the Frieder diaries collection. This labeled subset was then used to train an ensemble classifier. The component classifiers are based on lookup tables (LUT) in conjunction with the approximated nearest neighbor algorithm. The resulting algorithm is highly efficient. Experimental evaluation results are provided using the Frieder diaries collection.
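The lookup-table idea can be sketched as follows: each pixel's binary neighborhood pattern indexes a table whose entry is the majority clean label observed for that pattern in the labeled training subset. The window size, the exhaustive (rather than approximate) table lookup, and the majority vote are simplifying assumptions; the paper's ensemble and approximated-nearest-neighbor details are not reproduced here.

```python
import numpy as np

def neighborhood_patterns(img, k=3):
    # Encode each pixel's k x k binary neighborhood as an integer in [0, 2**(k*k)).
    pad = k // 2
    padded = np.pad(img, pad, mode="constant")
    h, w = img.shape
    idx = np.zeros((h, w), dtype=np.int64)
    for dy in range(k):
        for dx in range(k):
            idx = (idx << 1) | padded[dy:dy + h, dx:dx + w].astype(np.int64)
    return idx

def train_lut(noisy, clean, k=3):
    # noisy, clean: aligned binary images (0/1 ints). For every neighborhood
    # pattern seen in the noisy image, record the majority clean pixel value.
    idx = neighborhood_patterns(noisy, k)
    votes = np.zeros((2 ** (k * k), 2), dtype=np.int64)
    np.add.at(votes, (idx.ravel(), clean.ravel()), 1)
    return np.argmax(votes, axis=1)

def enhance(noisy, lut, k=3):
    # Replace each pixel by the table entry for its neighborhood pattern.
    return lut[neighborhood_patterns(noisy, k)]
```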
Challenges and Opportunities in Geocuration for Disaster Response
NASA Astrophysics Data System (ADS)
Molthan, A.; Burks, J. E.; McGrath, K.; Ramachandran, R.; Goodman, H. M.
2015-12-01
Following a significant disaster event, a wide range of resources and science teams are leveraged to aid in the response effort. Often, these efforts include the acquisition and use of non-traditional data sets, or the generation of prototyped products using new image analysis techniques. These efforts may also include acquisition and hosting of remote sensing data sets from domestic and international partners - from the public or private sector - which differ from standard remote sensing holdings, or may be accompanied by specific licensing agreements that limit their use and dissemination. In addition, at time periods well beyond the initial disaster event, other science teams may incorporate airborne or field campaign measurements that support the assessment of damage but also acquire information necessary to address key science questions about the specific disaster or a broader category of similar events. The immediate need to gather data and provide information to the response effort can result in large data holdings that require detailed curation to improve the efficiency of response efforts, but also ensure that collected data can be used on a longer time scale to address underlying science questions. Data collected in response to a disaster event may be thought of as a "field campaign" - consisting of traditional data sets managed through physical or virtual holdings, but also a larger number of ad hoc data collections, derived products, and metadata, including the potential for airborne or ground-based data collections. Appropriate metadata and documentation are needed to ensure that derived products have traceability to their source data, along with documentation of algorithm authors, versions, and outcomes so that others can reproduce their results, and to ensure that data sets remain available and well-documented for longer-term analysis that may in turn create new products relevant to understanding a type of disaster, or support future recovery efforts. The authors will review some activities related to recent data acquisitions for severe weather events and the 2015 Nepal Earthquake, discuss challenges and opportunities for geocuration, and suggest means of improving the management of data and scientific collaboration if applied to future events.
Introducing the slime mold graph repository
NASA Astrophysics Data System (ADS)
Dirnberger, M.; Mehlhorn, K.; Mehlhorn, T.
2017-07-01
We introduce the slime mold graph repository or SMGR, a novel data collection promoting the visibility, accessibility and reuse of experimental data revolving around network-forming slime molds. By making data readily available to researchers across multiple disciplines, the SMGR promotes novel research as well as the reproduction of original results. While SMGR data may take various forms, we stress the importance of graph representations of slime mold networks due to their ease of handling and their large potential for reuse. Data added to the SMGR stands to gain impact beyond initial publications or even beyond its domain of origin. We initiate the SMGR with the comprehensive Kist Europe data set focusing on the slime mold Physarum polycephalum, which we obtained in the course of our original research. It contains sequences of images documenting growth and network formation of the organism under constant conditions. Suitable image sequences depicting the typical P. polycephalum network structures are used to compute sequences of graphs faithfully capturing them. Given such sequences, node identities are computed, tracking the development of nodes over time. Based on this information we demonstrate two out of many possible ways to begin exploring the data. The entire data set is well-documented, self-contained and ready for inspection at http://smgr.mpi-inf.mpg.de.
17 CFR 4.31 - Required delivery of Disclosure Document to prospective clients.
Code of Federal Regulations, 2010 CFR
2010-04-01
... Disclosure Document to prospective clients. 4.31 Section 4.31 Commodity and Securities Exchanges COMMODITY... Advisors § 4.31 Required delivery of Disclosure Document to prospective clients. (a) Each commodity trading... prospective client a Disclosure Document containing the information set forth in §§ 4.34 and 4.35 for the...
Review of access, licenses and understandability of open datasets used in hydrology research
NASA Astrophysics Data System (ADS)
Falkenroth, Esa; Arheimer, Berit; Lagerbäck Adolphi, Emma
2015-04-01
The amount of open data available for hydrology research is continually growing. In the EU-funded project SWITCH-ON (Sharing Water-related Information to Tackle Changes in the Hydrosphere - for Operational Needs), we are addressing water concerns by exploring and exploiting the untapped potential of these new open data. This work is enabled by many ongoing efforts to facilitate the use of open data. For instance, a number of portals (such as the GEOSS Portal and the INSPIRE community geoportal) provide the means to search for such open data sets and open spatial data services. However, in general, the systematic use of available open data is still fairly uncommon in hydrology research. Factors that limit (re)usability of a data set include: (1) accessibility, (2) understandability and (3) licences. If you cannot access the data set, you cannot use it for research. If you cannot understand the data set, you cannot use it for research. Finally, if you are not permitted to use the data, you cannot use it for research. Early on in the project, we sent out a questionnaire to our research partners (SMHI, Universita di Bologna, University of Bristol, Technische Universiteit Delft and Technische Universitaet Wien) to find out what data sets they were planning to use in their experiments. The result was a comprehensive list of useful open data sets. Later, this list of data sets was extended with additional information on data sets for planned commercial water-information products and services. With the list of 50 common data sets as a starting point, we reviewed issues related to access, understandability and licence conditions. Regarding access to data sets, a majority of data sets were available through direct internet download via some well-known transfer protocol such as ftp or http. However, several data sets were found to be inaccessible due to server downtime, incorrect links or problems with the host database management system. One possible explanation for this could be that many data sets have been assembled by research projects that are no longer funded. Hence, their server infrastructure would be less maintained compared to large-scale operational services. Regarding understandability of the data sets, the issues encountered were mainly due to incomplete documentation or metadata and problems with decoding binary formats. Ideally, open data sets should be represented in well-known formats and they should be accompanied by sufficient documentation so that the data set can be understood. Furthermore, a machine-readable format would be preferable. Here, the development efforts on WaterML and NetCDF and other standards should improve understandability of data sets over time, but in this review, only a few data sets were provided in these well-known formats. Instead, the majority of datasets were stored in various text-based or binary formats or even document-oriented formats such as PDF. For some binary formats, we could not find information on what software was necessary to decipher the files. Other domains such as meteorology have long-standing traditions of operational data exchange formats, whereas hydrology research is still quite fragmented and the data exchange is usually done on a case-by-case basis. With the increased sharing of open data there is a good chance the situation will improve for data sets used in hydrology research. Finally, regarding licence issues, a high number of data sets did not have a clear statement on terms of use and limitations for access.
In most cases the provider could be contacted regarding licensing issues.
Multi-font printed Mongolian document recognition system
NASA Astrophysics Data System (ADS)
Peng, Liangrui; Liu, Changsong; Ding, Xiaoqing; Wang, Hua; Jin, Jianming
2009-01-01
Mongolian is one of the major ethnic languages in China. Large amounts of printed Mongolian documents need to be digitized for digital libraries and various applications. Traditional Mongolian script has a unique writing style and multi-font-type variations, which bring challenges to Mongolian OCR research. Because traditional Mongolian script has some special characteristics, for example, one character may be part of another character, we define the character set for recognition according to the segmented components, and the components are combined into characters by a rule-based post-processing module. For character recognition, a method based on visual directional features and multi-level classifiers is presented. For character segmentation, a scheme is used to find the segmentation point by analyzing the properties of projection and connected components. As Mongolian has different font-types which are categorized into two major groups, the parameter of segmentation is adjusted for each group. A font-type classification method for the two font-type groups is introduced. For recognition of Mongolian text mixed with Chinese and English, language identification and relevant character recognition kernels are integrated. Experiments show that the presented methods are effective. The text recognition rate is 96.9% on the test samples from practical documents with multi-font-types and mixed scripts.
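As an illustration of projection-based segmentation, the sketch below proposes cut points wherever the ink projection of a binary word image drops to zero for a few consecutive pixels. The projection axis, gap threshold, and zero criterion are assumptions for illustration and do not reflect the paper's tuned, per-font-group parameters.

```python
import numpy as np

def segmentation_points(binary_word, min_gap=2):
    # Ink projection of a binary (0/1) word image along one axis; positions where
    # the projection stays at zero for at least `min_gap` consecutive pixels are
    # proposed as candidate cut points between components.
    projection = binary_word.sum(axis=0)
    cuts, run = [], 0
    for pos, ink in enumerate(projection):
        run = run + 1 if ink == 0 else 0
        if run == min_gap:
            cuts.append(pos - min_gap // 2)
    return cuts
```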
Colivicchi, Furio; Gulizia, Michele Massimo; Pugliese, Francesco Rocco; Ruggieri, Maria Pia; Musumeci, Giuseppe; Cibinel, Gian Alfonso; Romeo, Francesco
2017-01-01
Abstract Antiplatelet therapy is the cornerstone of the pharmacologic management of patients with acute coronary syndrome (ACS). Over the last years, several studies have evaluated old and new oral or intravenous antiplatelet agents in ACS patients. In particular, research was focused on assessing superiority of two novel platelet ADP P2Y12 receptor antagonists (i.e., prasugrel and ticagrelor) over clopidogrel. Several large randomized controlled trials have been undertaken in this setting and a wide variety of prespecified and post-hoc analyses are available that evaluated the potential benefits of novel antiplatelet therapies in different subsets of patients with ACS. The aim of this document is to review recent data on the use of current antiplatelet agents for in-hospital treatment of ACS patients. In addition, in order to overcome increasing clinical challenges and implement effective therapeutic interventions, this document identifies all potential specific care pathway for ACS patients and accordingly proposes individualized therapeutic options. PMID:28751840
Structured Forms Reference Set of Binary Images (SFRS)
National Institute of Standards and Technology Data Gateway
NIST Structured Forms Reference Set of Binary Images (SFRS) (Web, free access) The NIST Structured Forms Database (Special Database 2) consists of 5,590 pages of binary, black-and-white images of synthesized documents. The documents in this database are 12 different tax forms from the IRS 1040 Package X for the year 1988.
Fast and Scalable Gaussian Process Modeling with Applications to Astronomical Time Series
NASA Astrophysics Data System (ADS)
Foreman-Mackey, Daniel; Agol, Eric; Ambikasaran, Sivaram; Angus, Ruth
2017-12-01
The growing field of large-scale time domain astronomy requires methods for probabilistic data analysis that are computationally tractable, even with large data sets. Gaussian processes (GPs) are a popular class of models used for this purpose, but since the computational cost scales, in general, as the cube of the number of data points, their application has been limited to small data sets. In this paper, we present a novel method for GP modeling in one dimension where the computational requirements scale linearly with the size of the data set. We demonstrate the method by applying it to simulated and real astronomical time series data sets. These demonstrations are examples of probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters. The method exploits structure in the problem when the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise. This form of covariance arises naturally when the process is a mixture of stochastically driven damped harmonic oscillators—providing a physical motivation for and interpretation of this choice—but we also demonstrate that it can be a useful effective model in some other cases. We present a mathematical description of the method and compare it to existing scalable GP methods. The method is fast and interpretable, with a range of potential applications within astronomical data analysis and beyond. We provide well-tested and documented open-source implementations of this method in C++, Python, and Julia.
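For orientation, the quantity being accelerated is the standard GP marginal log-likelihood. The sketch below evaluates it naively in O(N^3) with a single exponential covariance term (one component of the "mixture of exponentials" family); the paper's contribution is computing the same quantity in linear time by exploiting the special structure of this covariance matrix. Parameter values here are placeholders, not the authors' implementation.

```python
import numpy as np

def exp_kernel(t, amp=1.0, tau=1.0):
    # One real term of a mixture-of-exponentials covariance: k(dt) = amp * exp(-|dt| / tau).
    dt = np.abs(t[:, None] - t[None, :])
    return amp * np.exp(-dt / tau)

def gp_loglike(t, y, yerr, amp=1.0, tau=1.0):
    # Naive O(N^3) evaluation via Cholesky factorization; the paper's method
    # obtains the same value in O(N) for this class of kernels.
    K = exp_kernel(t, amp, tau) + np.diag(yerr ** 2)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * (y @ alpha)
            - np.log(np.diag(L)).sum()
            - 0.5 * len(y) * np.log(2.0 * np.pi))
```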
Serebruany, Victor L; Malinin, Alex I; Atar, Dan; Hanley, Dan F
2007-03-01
Numerous reports have dichotomized responses after clopidogrel therapy using varying definitions and platelet tests in patients immediately after acute vascular events; however, no large study has assessed platelet characteristics in outpatients receiving long-term treatment for more than 30 days with the maintenance dose (75 mg/d) of clopidogrel. The aim of this study was to describe the responses of ex vivo measures of platelet aggregation and activation to long-term clopidogrel therapy in a large population of outpatients after coronary stenting or ischemic stroke. We conducted a secondary post hoc analysis of a data set represented by presumably compliant patients after coronary stenting (n = 237) or a documented ischemic stroke (n = 122) treated with clopidogrel-and-aspirin combination antiplatelet therapy. The mean duration of treatment was 5.8 months (range 1-21 months). Every patient exhibited a significant inhibition of adenosine diphosphate-induced platelet aggregation (mean 52.9%, range 36%-70%) as compared with the preclopidogrel measures. Inhibition of aggregation strongly correlated with a diminished expression of PECAM-1 (platelet/endothelial cell adhesion molecule 1, r = 0.75), glycoprotein IIb/IIIa (r = 0.62), and PAR-1 (protease-activated receptor 1, r = 0.71). None of the patients developed hyporesponsiveness (reduction from the baseline <15%) or profound inhibition (residual platelet activity <10%). In contrast to the wide variability of responses that exists in the acute setting, long-term therapy with clopidogrel leads to consistent and much less variable platelet inhibition. Lack of nonresponse and profound inhibition with clopidogrel allow for the maintenance of a delicate balance between proven efficacy and acceptable bleeding risks for long-term secondary prevention in outpatients after acute vascular events.
Johnson, K E; McMorris, B J; Raynor, L A; Monsen, K A
2013-01-01
The Omaha System is a standardized interface terminology that is used extensively by public health nurses in community settings to document interventions and client outcomes. Researchers using Omaha System data to analyze the effectiveness of interventions have typically calculated p-values to determine whether significant client changes occurred between admission and discharge. However, p-values are highly dependent on sample size, making it difficult to distinguish statistically significant changes from clinically meaningful changes. Effect sizes can help identify practical differences but have not yet been applied to Omaha System data. We compared p-values and effect sizes (Cohen's d) for mean differences between admission and discharge for 13 client problems documented in the electronic health records of 1,016 young low-income parents. Client problems were documented anywhere from 6 (Health Care Supervision) to 906 (Caretaking/parenting) times. On a scale from 1 to 5, the mean change needed to yield a large effect size (Cohen's d ≥ 0.80) was approximately 0.60 (range = 0.50 - 1.03) regardless of p-value or sample size (i.e., the number of times a client problem was documented in the electronic health record). Researchers using the Omaha System should report effect sizes to help readers determine which differences are practical and meaningful. Such disclosures will allow for increased recognition of effective interventions.
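A hedged sketch of the effect-size computation the authors advocate, using the pooled-standard-deviation form of Cohen's d on admission versus discharge scores; the variable names and the numbers in the closing comment are illustrative.

```python
import numpy as np

def cohens_d(admission, discharge):
    # Pooled-standard-deviation form of Cohen's d for admission vs. discharge scores.
    a = np.asarray(admission, dtype=float)
    d = np.asarray(discharge, dtype=float)
    pooled_sd = np.sqrt((a.var(ddof=1) + d.var(ddof=1)) / 2.0)
    return (d.mean() - a.mean()) / pooled_sd

# Illustration only: a mean change of about 0.6 on the 1-5 Omaha System scale
# reaches d >= 0.80 whenever the pooled standard deviation is roughly 0.75 or less.
```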
Kim, Minchul; Franciskovich, Chris M.; Weinberg, Jason E.; Svendsen, Jessica D.; Fehr, Linda S.; Funk, Amy; Sawicki, Robert; Asche, Carl V.
2018-01-01
Abstract Background: Advance care planning (ACP) documents patient wishes and increases awareness of palliative care options. Objective: To study the association of outpatient ACP with advanced directive documentation, utilization, and costs of care. Design: This was a case–control study of cases with ACP who died matched 1:1 with controls. We used 12 months of data pre-ACP/prematch and predeath. We compared rates of documentation with logit model regression and conducted a difference-in-difference analysis using generalized linear models for utilization and costs. Setting/subjects: Medicare beneficiaries attributed to a large rural-suburban-small metro multisite accountable care organization from January 2013 to April 2016, with cross reference to ACP facilitator logs to find cases. Measurements: The presence of advance directive forms was verified by chart review. Cost analysis included all utilization and costs billed to Medicare. Results: We matched 325 cases and 325 controls (51.1% female and 48.9% male, mean age 81). 320/325 (98.5%) ACP versus 243/325 (74.8%) of controls had a Healthcare Power of Attorney (odds ratio [OR] 21.6, 95% CI 8.6–54.1) and 172/325(52.9%) ACP versus 145/325 (44.6%) controls had Practitioner Orders for Life Sustaining Treatment (OR 1.40, 95% CI 1.02–1.90) post-ACP/postmatch. Adjusted results showed ACP cases had fewer inpatient admissions (−0.37 admissions, 95% CI −0.66 to −0.08), and inpatient days (−3.66 days, 95% CI −6.23 to −1.09), with no differences in hospice, hospice days, skilled nursing facility use, home health use, 30-day readmissions, or emergency department visits. Adjusted costs were $9,500 lower in the ACP group (95% CI −$16,207 to −$2,793). Conclusions: ACP increases documentation and was associated with a reduction in overall costs driven primarily by a reduction in inpatient utilization. Our data set was limited by small numbers of minorities and cancer patients. PMID:29206564
Phase II Report: Design Study for Automated Document Location and Control System.
ERIC Educational Resources Information Center
Booz, Allen Applied Research, Inc., Bethesda, MD.
The scope of Phase II is the design of a system for document control within the National Agricultural Library (NAL) that will facilitate the processing of the documents selected, ordered, or received; that will avoid backlogs; and that will provide rapid document location reports. The results are set forth as follows: Chapter I, Introduction,…
Information Retrieval: A Sequential Learning Process.
ERIC Educational Resources Information Center
Bookstein, Abraham
1983-01-01
Presents decision-theoretic models which intrinsically include retrieval of multiple documents whereby system responds to request by presenting documents to patron in sequence, gathering feedback, and using information to modify future retrievals. Document independence model, set retrieval model, sequential retrieval model, learning model,…
Huang, Yang; Lowe, Henry J.; Hersh, William R.
2003-01-01
Objective: Despite the advantages of structured data entry, much of the patient record is still stored as unstructured or semistructured narrative text. The issue of representing clinical document content remains problematic. The authors' prior work using an automated UMLS document indexing system has been encouraging but has been affected by the generally low indexing precision of such systems. In an effort to improve precision, the authors have developed a context-sensitive document indexing model to calculate the optimal subset of UMLS source vocabularies used to index each document section. This pilot study was performed to evaluate the utility of this indexing approach on a set of clinical radiology reports. Design: A set of clinical radiology reports that had been indexed manually using UMLS concept descriptors was indexed automatically by the SAPHIRE indexing engine. Using the data generated by this process the authors developed a system that simulated indexing, at the document section level, of the same document set using many permutations of a subset of the UMLS constituent vocabularies. Measurements: The precision and recall scores generated by simulated indexing for each permutation of two or three UMLS constituent vocabularies were determined. Results: While there was considerable variation in precision and recall values across the different subtypes of radiology reports, the overall effect of this indexing strategy using the best combination of two or three UMLS constituent vocabularies was an improvement in precision without significant impact of recall. Conclusion: In this pilot study a contextual indexing strategy improved overall precision in a set of clinical radiology reports. PMID:12925544
Huang, Yang; Lowe, Henry J; Hersh, William R
2003-01-01
Despite the advantages of structured data entry, much of the patient record is still stored as unstructured or semistructured narrative text. The issue of representing clinical document content remains problematic. The authors' prior work using an automated UMLS document indexing system has been encouraging but has been affected by the generally low indexing precision of such systems. In an effort to improve precision, the authors have developed a context-sensitive document indexing model to calculate the optimal subset of UMLS source vocabularies used to index each document section. This pilot study was performed to evaluate the utility of this indexing approach on a set of clinical radiology reports. A set of clinical radiology reports that had been indexed manually using UMLS concept descriptors was indexed automatically by the SAPHIRE indexing engine. Using the data generated by this process the authors developed a system that simulated indexing, at the document section level, of the same document set using many permutations of a subset of the UMLS constituent vocabularies. The precision and recall scores generated by simulated indexing for each permutation of two or three UMLS constituent vocabularies were determined. While there was considerable variation in precision and recall values across the different subtypes of radiology reports, the overall effect of this indexing strategy using the best combination of two or three UMLS constituent vocabularies was an improvement in precision without significant impact of recall. In this pilot study a contextual indexing strategy improved overall precision in a set of clinical radiology reports.
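The simulation described above can be sketched as a loop over two- and three-vocabulary combinations, scoring each against the manual reference indexing; the data structures and the choice to rank by precision alone are assumptions for illustration, not the authors' exact procedure.

```python
from itertools import combinations

def precision_recall(predicted, reference):
    # Score an automatically indexed concept set against the manual reference set.
    tp = len(predicted & reference)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(reference) if reference else 0.0
    return precision, recall

def best_vocabulary_subset(concepts_by_vocab, reference, sizes=(2, 3)):
    # concepts_by_vocab: {vocabulary_name: set of concepts it contributes to a section}
    best = None
    for k in sizes:
        for combo in combinations(concepts_by_vocab, k):
            predicted = set().union(*(concepts_by_vocab[v] for v in combo))
            p, r = precision_recall(predicted, reference)
            if best is None or p > best[1]:
                best = (combo, p, r)
    return best   # (vocabulary combination, precision, recall)
```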
The World of Barilla Taylor: A Primary Source-Based Kit for Students in Grades 8-12.
ERIC Educational Resources Information Center
Fellner, Kelly; Stearns, Liza
1995-01-01
Examines a primary source-based kit that describes the life of a young woman factory worker in early 19th-century New England. The kit includes five document sets, utilizing maps, newspaper articles, deeds, letters, poems, and other artifacts. The document sets illustrate various topics including mill life and personal life. (MJP)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spittler, T.E.; Sydnor, R.H.; Manson, M.W.
1990-01-01
The Loma Prieta earthquake of October 17, 1989 triggered landslides throughout the Santa Cruz Mountains in central California. The California Department of Conservation, Division of Mines and Geology (DMG) responded to a request for assistance from the County of Santa Cruz, Office of Emergency Services to evaluate the geologic hazard from major reactivated large landslides. DMG prepared a set of geologic maps showing the landslide features that resulted from the October 17 earthquake. The principal purpose of large-scale mapping of these landslides is: (1) to provide county officials with regional landslide information that can be used for timely recovery of damaged areas; (2) to identify disturbed ground which is potentially vulnerable to landslide movement during winter rains; (3) to provide county planning officials with timely geologic information that will be used for effective land-use decisions; (4) to document regional landslide features that may not otherwise be available for individual site reconstruction permits and for future development.
Ahene, Ago; Calonder, Claudio; Davis, Scott; Kowalchick, Joseph; Nakamura, Takahiro; Nouri, Parya; Vostiar, Igor; Wang, Yang; Wang, Jin
2014-01-01
In recent years, the use of automated sample handling instrumentation has come to the forefront of bioanalytical analysis in order to ensure greater assay consistency and throughput. Since robotic systems are becoming part of everyday analytical procedures, the need for consistent guidance across the pharmaceutical industry has become increasingly important. Pre-existing regulations do not go into sufficient detail in regard to how to handle the use of robotic systems for use with analytical methods, especially large molecule bioanalysis. As a result, Global Bioanalytical Consortium (GBC) Group L5 has put forth specific recommendations for the validation, qualification, and use of robotic systems as part of large molecule bioanalytical analyses in the present white paper. The guidelines presented can be followed to ensure that there is a consistent, transparent methodology that will ensure that robotic systems can be effectively used and documented in a regulated bioanalytical laboratory setting. This will allow for consistent use of robotic sample handling instrumentation as part of large molecule bioanalysis across the globe.
Script-independent text line segmentation in freestyle handwritten documents.
Li, Yi; Zheng, Yefeng; Doermann, David; Jaeger, Stefan; Li, Yi
2008-08-01
Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map, where each element represents the probability that the underlying pixel belongs to a text line. The level set method is then exploited to determine the boundary of neighboring text lines by evolving an initial estimate. Unlike connected component based methods ( [1], [2] for example), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents with diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods [1]-[3]. Further experiments show the proposed algorithm is robust to scale change, rotation, and noise.
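As a rough illustration of the first stage, the sketch below builds a smoothed ink-density map from a binary document image and normalizes it to [0, 1] as a per-pixel "text line" probability estimate. The anisotropic smoothing scales are assumptions, and the level set evolution that the paper applies on top of such a map is not reproduced here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def line_probability_map(binary_img, sigma_along=15.0, sigma_across=3.0):
    # Smooth the ink density more strongly along the writing direction than
    # across it, then normalize; high values suggest "inside a text line".
    density = gaussian_filter(binary_img.astype(float),
                              sigma=(sigma_across, sigma_along))
    peak = density.max()
    return density / peak if peak > 0 else density
```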
The value of trauma registries.
Moore, Lynne; Clark, David E
2008-06-01
Trauma registries are databases that document acute care delivered to patients hospitalised with injuries. They are designed to provide information that can be used to improve the efficiency and quality of trauma care. Indeed, the combination of trauma registry data at regional or national levels can produce very large databases that allow unprecedented opportunities for the evaluation of patient outcomes and inter-hospital comparisons. However, the creation and upkeep of trauma registries requires a substantial investment of money, time and effort, data quality is an important challenge and aggregated trauma data sets rarely represent a population-based sample of trauma. In addition, trauma hospitalisations are already routinely documented in administrative hospital discharge databases. The present review aims to provide evidence that trauma registry data can be used to improve the care dispensed to victims of injury in ways that could not be achieved with information from administrative databases alone. In addition, we will define the structure and purpose of contemporary trauma registries, acknowledge their limitations, and discuss possible ways to make them more useful.
A Nursing Intelligence System to Support Secondary Use of Nursing Routine Data
Rauchegger, F.; Ammenwerth, E.
2015-01-01
Summary Background Nursing care is facing exponential growth of information from nursing documentation. This amount of electronically available data collected routinely opens up new opportunities for secondary use. Objectives To present a case study of a nursing intelligence system for reusing routinely collected nursing documentation data for multiple purposes, including quality management of nursing care. Methods The SPIRIT framework for systematically planning the reuse of clinical routine data was leveraged to design a nursing intelligence system which then was implemented using open source tools in a large university hospital group following the spiral model of software engineering. Results The nursing intelligence system is in routine use now and updated regularly, and includes over 40 million data sets. It allows the outcome and quality analysis of data related to the nursing process. Conclusions Following a systematic approach for planning and designing a solution for reusing routine care data appeared to be successful. The resulting nursing intelligence system is useful in practice now, but remains malleable for future changes. PMID:26171085
Fulford, Janice M.
2003-01-01
A numerical computer model, Transient Inundation Model for Rivers -- 2 Dimensional (TrimR2D), that solves the two-dimensional depth-averaged flow equations is documented and discussed. The model uses a semi-implicit, semi-Lagrangian finite-difference method. It is a variant of the Trim model and has been used successfully in estuarine environments such as San Francisco Bay. The abilities of the model are documented for three scenarios: uniform depth flows, laboratory dam-break flows, and large-scale riverine flows. The model can start computations from a "dry" bed and converge to accurate solutions. Inflows are expressed as source terms, which limits the use of the model to sufficiently long reaches where the flow reaches equilibrium with the channel. The data sets used by the investigation demonstrate that the model accurately propagates flood waves through long river reaches and simulates dam breaks with abrupt water-surface changes.
Handwritten mathematical symbols dataset.
Chajri, Yassine; Bouikhalene, Belaid
2016-06-01
Due to the technological advances in recent years, paper scientific documents are used less and less. Thus, the trend in the scientific community to use digital documents has increased considerably. Among these documents, there are scientific documents and more specifically mathematics documents. In this context, we present our own dataset of handwritten mathematical symbols composed of 10,379 images. This dataset gathers Arabic characters, Latin characters, Arabic numerals, Latin numerals, arithmetic operators, set-symbols, comparison symbols, delimiters, etc.
7 CFR 1.670 - How must documents be filed and served under §§ 1.670 through 1.673?
Code of Federal Regulations, 2014 CFR
2014-01-01
... complete copy of the document must be served on each license party and FERC, using: (i) One of the methods..., documents must be filed using one of the methods set forth in § 1.612(b). (2) A document is considered filed on the date it is received. However, any document received after 5 p.m. at the place where the filing...
NASA Technical Reports Server (NTRS)
Waggoner, J. T.; Phinney, D. E. (Principal Investigator)
1981-01-01
The crop estimation analysis procedures documentation of the AgRISTARS - Foreign Commodity Production Forecasting Project (FCPF) is presented. Specifically it includes the technical/management documentation of the remote sensing data analysis procedures prepared in accordance with the guidelines provided in the FCPF communication/documentation standards manual. Standard documentation sets are given arranged by procedural type and level then by crop types or other technically differentiating categories.
Compositional Gene Landscapes in Vertebrates
Cruveiller, Stéphane; Jabbari, Kamel; Clay, Oliver; Bernardi, Giorgio
2004-01-01
The existence of a well conserved linear relationship between GC levels of genes' second and third codon positions (GC2, GC3) prompted us to focus on the landscape, or joint distribution, spanned by these two variables. In human, well curated coding sequences now cover at least 15%–30% of the estimated total gene set. Our analysis of the landscape defined by this gene set revealed not only the well documented linear crest, but also the presence of several peaks and valleys along that crest, a property that was also indicated in two other warm-blooded vertebrates represented by large gene databases, that is, mouse and chicken. GC2 is the sum of eight amino acid frequencies, whereas GC3 is linearly related to the GC level of the chromosomal region containing the gene. The landscapes therefore portray relations between proteins and the DNA environments of the genes that encode them. PMID:15123586
Compositional gene landscapes in vertebrates.
Cruveiller, Stéphane; Jabbari, Kamel; Clay, Oliver; Bernardi, Giorgio
2004-05-01
The existence of a well conserved linear relationship between GC levels of genes' second and third codon positions (GC2, GC3) prompted us to focus on the landscape, or joint distribution, spanned by these two variables. In human, well curated coding sequences now cover at least 15%-30% of the estimated total gene set. Our analysis of the landscape defined by this gene set revealed not only the well documented linear crest, but also the presence of several peaks and valleys along that crest, a property that was also indicated in two other warm-blooded vertebrates represented by large gene databases, that is, mouse and chicken. GC2 is the sum of eight amino acid frequencies, whereas GC3 is linearly related to the GC level of the chromosomal region containing the gene. The landscapes therefore portray relations between proteins and the DNA environments of the genes that encode them.
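For readers unfamiliar with the notation, GC2 and GC3 can be computed for a single in-frame coding sequence as in the sketch below; handling of ambiguous bases and incomplete codons is ignored here for simplicity.

```python
def gc_by_codon_position(cds):
    # GC2/GC3: fraction of G or C at the second/third position of each codon
    # of an in-frame coding sequence.
    cds = cds.upper()
    n_codons = len(cds) // 3
    gc2 = gc3 = 0
    for i in range(n_codons):
        codon = cds[3 * i: 3 * i + 3]
        gc2 += codon[1] in "GC"
        gc3 += codon[2] in "GC"
    return gc2 / n_codons, gc3 / n_codons

# Example usage: gc2, gc3 = gc_by_codon_position("ATGGCCGTA")
```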
Anthropometric changes and fluid shifts
NASA Technical Reports Server (NTRS)
Thornton, W. E.; Hoffler, G. W.; Rummel, J. A.
1977-01-01
In an effort to obtain the most comprehensive and coherent picture of changes under weightlessness, a set of measurements on Skylab 2 was initiated and, at every opportunity, additional studies were added. All pertinent information from ancillary sources was gleaned and collated. On Skylab 2, the initial anthropometric studies were scheduled in conjunction with the muscle study. A single set of facial photographs was made in-flight. Additional measurements were made on Skylab 3, with photographs and truncal and limb girth measurements in-flight. Prior to Skylab 4, it was felt there was considerable evidence for large and rapid fluid shifts, so a series of in-flight volume and center of mass measurements and infrared photographs was scheduled for the Skylab 4 mission. A number of changes were properly documented for the first time, the most important of which were the fluid shifts. The following description of Skylab anthropometrics addresses work done primarily on Skylab 4.
Behavioral and computational aspects of language and its acquisition
NASA Astrophysics Data System (ADS)
Edelman, Shimon; Waterfall, Heidi
2007-12-01
One of the greatest challenges facing the cognitive sciences is to explain what it means to know a language, and how the knowledge of language is acquired. The dominant approach to this challenge within linguistics has been to seek an efficient characterization of the wealth of documented structural properties of language in terms of a compact generative grammar: ideally, the minimal necessary set of innate, universal, exception-less, highly abstract rules that jointly generate all and only the observed phenomena and are common to all human languages. We review developmental, behavioral, and computational evidence that seems to favor an alternative view of language, according to which linguistic structures are generated by a large, open set of constructions of varying degrees of abstraction and complexity, which embody both form and meaning and are acquired through socially situated experience in a given language community, by probabilistic learning algorithms that resemble those at work in other cognitive modalities.
Attitudes toward women and tolerance for sexual harassment among reservists.
Vogt, Dawne; Bruce, Tamara A; Street, Amy E; Stafford, Jane
2007-09-01
Women are more likely to experience sexual harassment in some work settings than others; specifically, work settings that have a large proportion of male workers, include a predominance of male supervisors, and represent traditional male occupations may be places in which there is greater tolerance for sexual harassment. The focus of the study was to document attitudes toward women among military personnel, to identify demographic and military characteristics associated with more positive attitudes toward women, and to examine associations between attitudes toward women and tolerance for sexual harassment. The study was based on data from 2,037 male and female former Reservists who reported minimal or no experiences of sexual harassment and no sexual assault in the military. Results suggest that attitudes toward women vary across content domains, are associated with several key demographic and military characteristics, and predict tolerance for sexual harassment. Implications of the findings and future directions are discussed.
Duplicate document detection in DocBrowse
NASA Astrophysics Data System (ADS)
Chalana, Vikram; Bruce, Andrew G.; Nguyen, Thien
1998-04-01
Duplicate documents are frequently found in large databases of digital documents, such as those found in digital libraries or in the government declassification effort. Efficient duplicate document detection is important not only to allow querying for similar documents, but also to filter out redundant information in large document databases. We have designed three different algorithms to identify duplicate documents. The first algorithm is based on features extracted from the textual content of a document, the second algorithm is based on wavelet features extracted from the document image itself, and the third algorithm is a combination of the first two. These algorithms are integrated within the DocBrowse system for information retrieval from document images which is currently under development at MathSoft. DocBrowse supports duplicate document detection by allowing (1) automatic filtering to hide duplicate documents, and (2) ad hoc querying for similar or duplicate documents. We have tested the duplicate document detection algorithms on 171 documents and found that the text-based method has an average 11-point precision of 97.7 percent while the image-based method has an average 11-point precision of 98.9 percent. However, in general, the text-based method performs better when the document contains enough high-quality machine printed text while the image-based method performs better when the document contains little or no quality machine readable text.
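A minimal sketch of one common text-based route to duplicate detection (character shingling plus Jaccard similarity) is shown below; the shingle size and decision threshold are assumptions, and DocBrowse's actual textual and wavelet features are not reproduced.

```python
def shingles(text, k=5):
    # Character k-grams of the extracted text, lowercased with whitespace collapsed.
    t = " ".join(text.lower().split())
    return {t[i:i + k] for i in range(max(len(t) - k + 1, 1))}

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 0.0

def is_duplicate(doc_a_text, doc_b_text, k=5, threshold=0.8):
    # Flag two documents as duplicates when their shingle sets largely overlap.
    return jaccard(shingles(doc_a_text, k), shingles(doc_b_text, k)) >= threshold
```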
BOREAS AFM-04 Twin Otter Aircraft Flux Data
NASA Technical Reports Server (NTRS)
MacPherson, J. Ian; Hall, Forrest G. (Editor); Knapp, David E. (Editor); Desjardins, Raymond L.; Smith, David E. (Technical Monitor)
2000-01-01
The BOREAS AFM-5 team collected and processed data from the numerous radiosonde flights during the project. The goals of the AFM-05 team were to provide large-scale definition of the atmosphere by supplementing the existing AES aerological network, both temporally and spatially. This data set includes basic upper-air parameters collected from the network of upper-air stations during the 1993, 1994, and 1996 field campaigns over the entire study region. The data are contained in tabular ASCII files. The data files are available on a CD-ROM (see document number 20010000884) or from the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC).
Family size, the physical environment, and socioeconomic effects across the stature distribution.
Carson, Scott Alan
2012-04-01
A neglected area in historical stature studies is the relationship between stature and family size. Using robust statistics and a large 19th century data set, this study documents a positive relationship between stature and family size across the stature distribution. The relationship between material inequality and health is the subject of considerable debate, and there was a positive relationship between stature and wealth and an inverse relationship between stature and material inequality. After controlling for family size and wealth variables, the paper reports a positive relationship between the physical environment and stature. Copyright © 2012 Elsevier GmbH. All rights reserved.
Bouhenguel, Jason T; Preiss, David A; Urman, Richard D
2017-12-01
Non-operating room anesthesia (NORA) encounters comprise a significant fraction of contemporary anesthesia practice. With the implementation of an anesthesia information management system (AIMS), anesthesia practitioners can better streamline preoperative assessment, intraoperative automated documentation, real-time decision support, and remote surveillance. Despite the large personal and financial commitments involved in adoption and implementation of AIMS and other electronic health records in these settings, the benefits to safety, efficacy, and efficiency are far too great to be ignored. Continued future innovation of AIMS technology only promises to further improve on our NORA experience and improve care quality and safety. Copyright © 2017 Elsevier Inc. All rights reserved.
Advancement of Bi-Level Integrated System Synthesis (BLISS)
NASA Technical Reports Server (NTRS)
Sobieszczanski-Sobieski, Jaroslaw; Emiley, Mark S.; Agte, Jeremy S.; Sandusky, Robert R., Jr.
2000-01-01
Bi-Level Integrated System Synthesis (BLISS) is a method for optimization of an engineering system, e.g., an aerospace vehicle. BLISS consists of optimizations at the subsystem (module) and system levels to divide the overall large optimization task into sets of smaller ones that can be executed concurrently. In the initial version of BLISS that was introduced and documented in previous publications, analysis in the modules was kept at the early conceptual design level. This paper reports on the next step in the BLISS development in which the fidelity of the aerodynamic drag and structural stress and displacement analyses were upgraded while the method's satisfactory convergence rate was retained.
NASA Astrophysics Data System (ADS)
Logunova, O. S.; Sibileva, N. S.
2017-12-01
The purpose of the study is to increase the efficiency of the steelmaking process in a large-capacity arc furnace by implementing a new decision-making system for the composition of charge materials. The authors proposed an interactive builder for the formation of the optimization problem, taking into account the requirements of the customer, normative documents and stocks of charge materials in the warehouse. To implement the interactive builder, the sets of deterministic and stochastic model components are developed, as well as a list of preferences of criteria and constraints.
Agyeman-Duah, Josephine Nana Afrakoma; Theurer, Antje; Munthali, Charles; Alide, Noor; Neuhann, Florian
2014-01-02
Knowledge regarding the best approaches to improving the quality of healthcare and their implementation is lacking in many resource-limited settings. The Medical Department of Kamuzu Central Hospital in Malawi set out to improve the quality of care provided to its patients and establish itself as a recognized centre in teaching, operations research and supervision of district hospitals. Efforts in the past to achieve these objectives were short-lived, and largely unsuccessful. Against this background, a situational analysis was performed to aid the Medical Department to define and prioritize its quality improvement activities. A mix of quantitative and qualitative methods was applied using checklists for observed practice, review of registers, key informant interviews and structured patient interviews. The mixed methods comprised triangulation by including the perspectives of the clients, healthcare providers from within and outside the department, and the field researcher's perspectives by means of document review and participatory observation. Human resource shortages, staff attitudes and shortage of equipment were identified as major constraints to patient care, and the running of the Medical Department. Processes, including documentation in registers and files and communication within and across cadres of staff were also found to be insufficient and thus undermining the effort of staff and management in establishing a sustained high quality culture. Depending on their past experience and knowledge, the stakeholder interviewees revealed different perspectives and expectations of quality healthcare and the intended quality improvement process. Establishing a quality improvement process in resource-limited settings is an enormous task, considering the host of challenges that these facilities face. The steps towards changing the status quo for improved quality care require critical self-assessment, the willingness to change as well as determined commitment and contributions from clients, staff and management.
Microcomputer based controller for the Langley 0.3-meter Transonic Cryogenic Tunnel
NASA Technical Reports Server (NTRS)
Balakrishna, S.; Kilgore, W. Allen
1989-01-01
Flow control of the Langley 0.3-meter Transonic Cryogenic Tunnel (TCT) is a multivariable nonlinear control problem. Globally stable control laws were generated to hold tunnel conditions in the presence of geometrical disturbances in the test section and to control the tunnel states precisely for small and large set point changes. The control laws are mechanized as four inner control loops for tunnel pressure, temperature, fan speed, and liquid nitrogen supply pressure, and two outer loops for Mach number and Reynolds number. These integrated control laws have been mechanized on a 16-bit microcomputer running under DOS. This document details the model of the 0.3-m TCT, the control laws, the microcomputer realization, and its performance. The tunnel closed-loop responses to small and large set point changes are presented. The controller incorporates safe thermal management of the tunnel cooldown based on thermal restrictions. The controller was shown to provide control of temperature to within +/- 0.2 K, pressure to within +/- 0.07 psia, and Mach number to within +/- 0.002 of a given set point during aerodynamic data acquisition in the presence of intrusive geometrical changes such as flexwall movement, angle-of-attack changes, and drag rake traverse. The controller also provides a new feature of Reynolds number control. The controller provides safe, reliable, and economical control of the 0.3-m TCT.
Identifying well-formed biomedical phrases in MEDLINE® text.
Kim, Won; Yeganova, Lana; Comeau, Donald C; Wilbur, W John
2012-12-01
In the modern world, people frequently interact with retrieval systems to satisfy their information needs. Humanly understandable, well-formed phrases represent a crucial interface between humans and the web, and the ability to index and search with such phrases is beneficial for human-web interactions. In this paper we consider the problem of identifying humanly understandable, well-formed, and high-quality biomedical phrases in MEDLINE documents. The main approaches used previously for detecting such phrases are syntactic, statistical, and a hybrid approach combining the two. In this paper we propose a supervised learning approach for identifying high-quality phrases. First we obtain a set of known well-formed, useful phrases from an existing source and label these phrases as positive. We then extract from MEDLINE a large set of multiword strings that do not contain stop words or punctuation. We believe this unlabeled set contains many well-formed phrases, and our goal is to identify these additional high-quality phrases. We examine various feature combinations and several machine learning strategies designed to solve this problem. A proper choice of machine learning methods and features identifies strings in the large collection that are likely to be high-quality phrases. We evaluate our approach by making human judgments on multiword strings extracted from MEDLINE using our methods. We find that over 85% of such extracted phrase candidates are judged by humans to be of high quality. Published by Elsevier Inc.
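A hedged sketch of the underlying idea: train a classifier on known well-formed phrases versus arbitrary multiword strings, then score new candidates. The paper treats the MEDLINE strings as unlabeled and explores richer features and learning strategies; the toy data and features below are stand-ins.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: known good phrases vs. arbitrary multiword strings.
positives = ["breast cancer", "gene expression", "randomized controlled trial"]
negatives = ["of the patient", "results were obtained", "in this study"]

X = positives + negatives
y = [1] * len(positives) + [0] * len(negatives)

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), analyzer="word"),  # simple word/bigram features
    LogisticRegression(max_iter=1000),
)
model.fit(X, y)

# Score unseen candidate strings: higher probability = more phrase-like.
new_strings = ["protein folding", "was carried out"]
print(dict(zip(new_strings, model.predict_proba(new_strings)[:, 1].round(3))))
```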
Document image retrieval through word shape coding.
Lu, Shijian; Li, Linlin; Tan, Chew Lim
2008-11-01
This paper presents a document retrieval technique that is capable of searching document images without OCR (optical character recognition). The proposed technique retrieves document images using a new word shape coding scheme, which captures the document content by annotating each word image with a word shape code. In particular, we annotate word images using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. With the annotated word shape codes, document images can be retrieved by either query keywords or a query document image. Experimental results show that the proposed document image retrieval technique is fast, efficient, and tolerant to various types of document degradation.
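To make the coding idea concrete, here is an illustrative sketch applied to plain text rather than word images: each character is mapped to a coarse shape class (ascender, descender, x-height, with a hole marker), and documents can then be matched on the resulting codes. The character-to-class mapping is a hypothetical analogue of the image-derived features used in the paper.

```python
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")
HOLES = set("abdegopq")

def shape_code(word):
    """Map a word to a coarse shape code, e.g. 'document' -> 'xo-xo-a-x-...'."""
    code = []
    for ch in word.lower():
        if not ch.isalpha():
            continue
        cls = "a" if ch in ASCENDERS else "d" if ch in DESCENDERS else "x"
        if ch in HOLES:
            cls += "o"       # character contains a closed loop ("hole")
        code.append(cls)
    return "-".join(code)

def code_document(text):
    return [shape_code(w) for w in text.split()]

print(code_document("document retrieval"))
print(code_document("Document image retrieval without OCR"))
```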
NASA Astrophysics Data System (ADS)
Erickson, T. A.; Granger, B.; Grout, J.; Corlay, S.
2017-12-01
The volume of Earth science data gathered from satellites, aircraft, drones, and field instruments continues to increase. For many scientific questions in the Earth sciences, managing this large volume of data is a barrier to progress, as it is difficult to explore and analyze large volumes of data using the traditional paradigm of downloading datasets to a local computer for analysis. Furthermore, methods are needed for communicating Earth science algorithms that operate on large datasets in an easily understandable and reproducible way. Here we describe a system for developing, interacting with, and sharing well-documented Earth science algorithms that combines existing software components. Jupyter Notebook: an open-source, web-based environment that supports documents combining code and computational results with text narrative, mathematics, images, and other media; these notebooks provide an environment for interactive exploration of data and development of well-documented algorithms. Jupyter Widgets / ipyleaflet: an architecture for creating interactive user interface controls (such as sliders, text boxes, etc.) in Jupyter Notebooks that communicate with Python code; this architecture includes a default set of UI controls (sliders, dropdowns, etc.) as well as APIs for building custom controls, and the ipyleaflet project is one example that offers a custom interactive map control for displaying and manipulating geographic data within the Jupyter Notebook. Google Earth Engine: a cloud-based geospatial analysis platform that provides access to petabytes of Earth science data via a Python API. The combination of Jupyter Notebooks, Jupyter Widgets, ipyleaflet, and Google Earth Engine makes it possible to explore and analyze massive Earth science datasets via a web browser, in an environment suitable for interactive exploration, teaching, and sharing. Using these environments can make Earth science analyses easier to understand and to reproduce, which may increase the rate of scientific discoveries and the transition of discoveries into real-world impacts.
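A minimal notebook-style sketch (not the authors' notebooks) of how Jupyter Widgets and ipyleaflet combine for interactive map exploration. The Google Earth Engine layer is omitted because it requires authentication; the map centre and zoom values are arbitrary.

```python
from ipyleaflet import Map, basemaps
from ipywidgets import IntSlider, jslink
from IPython.display import display

# Base map centred on an arbitrary point over the western United States.
m = Map(center=(40.0, -105.0), zoom=5, basemap=basemaps.OpenStreetMap.Mapnik)

# Link a slider widget to the map's zoom level so the control and map stay in sync.
zoom_slider = IntSlider(description="Zoom:", min=1, max=15, value=5)
jslink((zoom_slider, "value"), (m, "zoom"))

display(zoom_slider, m)  # in a notebook, both widgets render and interact live
```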
Semi-automated Data Set Submission Work Flow for Archival with the ORNL DAAC
NASA Astrophysics Data System (ADS)
Wright, D.; Beaty, T.; Cook, R. B.; Devarakonda, R.; Eby, P.; Heinz, S. L.; Hook, L. A.; McMurry, B. F.; Shanafield, H. A.; Sill, D.; Santhana Vannan, S.; Wei, Y.
2013-12-01
The ORNL DAAC archives and publishes, free of charge, data and information relevant to biogeochemical, ecological, and environmental processes. The ORNL DAAC primarily archives data produced by NASA's Terrestrial Ecology Program; however, any data pertinent to the biogeochemical and ecological community are of interest. The data set submission process at the ORNL DAAC has recently been updated and semi-automated to provide a consistent data provider experience and to create a uniform data product. Data archived at the ORNL DAAC must be well formatted, self-descriptive, and documented, as well as referenced in a peer-reviewed publication. If the ORNL DAAC is the appropriate archive for a data set, the data provider is sent an email with several URL links to guide them through the submission process. The data provider is asked to fill out a short online form to help the ORNL DAAC staff better understand the data set. These questions cover a description of the data set, its temporal and spatial characteristics, and how the data were prepared and delivered. The questionnaire is generic and has been designed to gather input on the diverse data sets the ORNL DAAC archives. A data upload module and metadata editor further guide the data provider through the submission process. For submission purposes, a complete data set includes data files, document(s) describing the data, supplemental files, metadata record(s), and the online form. The ORNL DAAC performs five major functions while archiving data: 1) ingestion, the ORNL DAAC side of submission, in which data are checked, metadata records are compiled, and files are converted to archival formats; 2) metadata records and data set documentation are made searchable, and the data set is given a permanent URL; 3) the data set is published, assigned a DOI, and advertised; 4) the data set receives long-term, post-project support; and 5) stewardship ensures the data are stored on state-of-the-art computer systems with reliable backups.
Metrics for Electronic-Nursing-Record-Based Narratives: cross-sectional analysis.
Kim, Kidong; Jeong, Suyeon; Lee, Kyogu; Park, Hyeoun-Ae; Min, Yul Ha; Lee, Joo Yun; Kim, Yekyung; Yoo, Sooyoung; Doh, Gippeum; Ahn, Soyeon
2016-11-30
We aimed to determine the characteristics of quantitative metrics for nursing narratives documented in electronic nursing records and their association with hospital admission traits and diagnoses in a large data set not limited to specific patient events or hypotheses. We collected 135,406,873 electronic, structured coded nursing narratives from 231,494 hospital admissions of patients discharged between 2008 and 2012 at a tertiary teaching institution that routinely uses an electronic health records system. The standardized number of nursing narratives (i.e., the total number of nursing narratives divided by the length of the hospital stay) was suggested to integrate the frequency and quantity of nursing documentation. The standardized number of nursing narratives was higher for patients aged ≥ 70 years (median = 30.2 narratives/day, interquartile range [IQR] = 24.0-39.4 narratives/day), long (≥ 8 days) hospital stays (median = 34.6 narratives/day, IQR = 27.2-43.5 narratives/day), and hospital deaths (median = 59.1 narratives/day, IQR = 47.0-74.8 narratives/day). The standardized number of narratives was higher in "pregnancy, childbirth, and puerperium" (median = 46.5, IQR = 39.0-54.7) and "diseases of the circulatory system" admissions (median = 35.7, IQR = 29.0-43.4). Diverse hospital admissions can be consistently described with nursing-document-derived metrics for similar hospital admissions and diagnoses. Some areas of hospital admissions may have consistently increasing volumes of nursing documentation across years. Usability of electronic nursing document metrics for evaluating healthcare requires multiple aspects of hospital admissions to be considered.
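The central metric is simple enough to state directly: the standardized number of narratives is the narrative count for an admission divided by its length of stay in days. A trivial sketch with illustrative numbers:

```python
def standardized_narratives(total_narratives, length_of_stay_days):
    # Standardized number of nursing narratives, in narratives per day.
    return total_narratives / length_of_stay_days

# Hypothetical admission: 242 narratives over an 8-day stay -> ~30.2 narratives/day.
print(round(standardized_narratives(total_narratives=242, length_of_stay_days=8), 1))
```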
Metrics for Electronic-Nursing-Record-Based Narratives: Cross-sectional Analysis
Kim, Kidong; Jeong, Suyeon; Lee, Kyogu; Park, Hyeoun-Ae; Min, Yul Ha; Lee, Joo Yun; Kim, Yekyung; Yoo, Sooyoung; Doh, Gippeum
2016-01-01
Summary Objectives We aimed to determine the characteristics of quantitative metrics for nursing narratives documented in electronic nursing records and their association with hospital admission traits and diagnoses in a large data set not limited to specific patient events or hypotheses. Methods We collected 135,406,873 electronic, structured coded nursing narratives from 231,494 hospital admissions of patients discharged between 2008 and 2012 at a tertiary teaching institution that routinely uses an electronic health records system. The standardized number of nursing narratives (i.e., the total number of nursing narratives divided by the length of the hospital stay) was suggested to integrate the frequency and quantity of nursing documentation. Results The standardized number of nursing narratives was higher for patients aged ≥ 70 years (median = 30.2 narratives/day, interquartile range [IQR] = 24.0–39.4 narratives/day), long (≥ 8 days) hospital stays (median = 34.6 narratives/day, IQR = 27.2–43.5 narratives/day), and hospital deaths (median = 59.1 narratives/day, IQR = 47.0–74.8 narratives/day). The standardized number of narratives was higher in “pregnancy, childbirth, and puerperium” (median = 46.5, IQR = 39.0–54.7) and “diseases of the circulatory system” admissions (median = 35.7, IQR = 29.0–43.4). Conclusions Diverse hospital admissions can be consistently described with nursing-document-derived metrics for similar hospital admissions and diagnoses. Some areas of hospital admissions may have consistently increasing volumes of nursing documentation across years. Usability of electronic nursing document metrics for evaluating healthcare requires multiple aspects of hospital admissions to be considered. PMID:27901174
42 CFR 415.184 - Psychiatric services.
Code of Federal Regulations, 2010 CFR
2010-10-01
... TEACHING SETTINGS, AND RESIDENTS IN CERTAIN SETTINGS Physician Services in Teaching Settings § 415.184..., including documentation, except that the requirement for the presence of the teaching physician during the...
8 CFR 299.3 - Forms available from Superintendent of Documents.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Documents. 299.3 Section 299.3 Aliens and Nationality DEPARTMENT OF HOMELAND SECURITY IMMIGRATION REGULATIONS IMMIGRATION FORMS § 299.3 Forms available from Superintendent of Documents. The Immigration and... of these forms shall be set aside by immigration officers for free distribution and official use...
NASA Astrophysics Data System (ADS)
Stock, Michala K.; Stull, Kyra E.; Garvin, Heather M.; Klales, Alexandra R.
2016-10-01
Forensic anthropologists are routinely asked to estimate a biological profile (i.e., age, sex, ancestry, and stature) from a set of unidentified remains. In contrast to the abundance of collections and techniques associated with adult skeletons, there is a paucity of modern, documented subadult skeletal material, which limits the creation and validation of appropriate forensic standards. Many practitioners are forced to use antiquated methods derived from small sample sizes, which, given documented secular changes in the growth and development of children, are not appropriate for application in the medico-legal setting. Therefore, the aim of this project is to use multi-slice computed tomography (MSCT) data from a large, diverse sample of modern subadults to develop new methods to estimate subadult age and sex for practical forensic applications. The research sample will consist of over 1,500 full-body MSCT scans of modern subadult individuals (aged birth to 20 years) obtained from two U.S. medical examiners' offices. Statistical analysis of epiphyseal union scores, long bone osteometrics, and os coxae landmark data will be used to develop modern subadult age and sex estimation standards. This project will result in a database of information gathered from the MSCT scans, as well as the creation of modern, statistically rigorous standards for skeletal age and sex estimation in subadults. Furthermore, the research and methods developed in this project will be applicable to dry bone specimens, MSCT scans, and radiographic images, thus providing both tools and continued access to data for forensic practitioners in a variety of settings.
Structured Forms Reference Set of Binary Images II (SFRS2)
National Institute of Standards and Technology Data Gateway
NIST Structured Forms Reference Set of Binary Images II (SFRS2) (Web, free access) The second NIST database of structured forms (Special Database 6) consists of 5,595 pages of binary, black-and-white images of synthesized documents containing hand-print. The documents in this database are 12 different tax forms from the IRS 1040 Package X for the year 1988.
Munkhdalai, Tsendsuren; Li, Meijing; Batsuren, Khuyagbaatar; Park, Hyeon Ah; Choi, Nak Hyeon; Ryu, Keun Ho
2015-01-01
Chemical and biomedical Named Entity Recognition (NER) is an essential prerequisite task before effective text mining can begin for biochemical-text data. Exploiting unlabeled text data to improve system performance has been an active and challenging research topic in text mining due to the recent growth in the amount of biomedical literature. We present a semi-supervised learning method that efficiently exploits unlabeled data in order to incorporate domain knowledge into a named entity recognition model and to improve system performance. The proposed method includes Natural Language Processing (NLP) tasks for text preprocessing, learning word representation features from a large amount of text data for feature extraction, and conditional random fields for token classification. Other than free text in the domain, the proposed method does not rely on any lexicon or dictionary, which keeps the system applicable to other NER tasks in bio-text data. We extended BANNER, a biomedical NER system, with the proposed method. This yields an integrated system that can be applied to chemical and drug NER or biomedical NER. We call our branch of the BANNER system BANNER-CHEMDNER; it is scalable over millions of documents, processing about 530 documents per minute, is configurable via XML, and can be plugged into other systems by using the BANNER Unstructured Information Management Architecture (UIMA) interface. BANNER-CHEMDNER achieved an 85.68% and an 86.47% F-measure on the testing sets of the CHEMDNER Chemical Entity Mention (CEM) and Chemical Document Indexing (CDI) subtasks, respectively, and achieved an 87.04% F-measure on the official testing set of the BioCreative II gene mention task, showing remarkable performance in both chemical and biomedical NER. The BANNER-CHEMDNER system is available at: https://bitbucket.org/tsendeemts/banner-chemdner.
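A hedged sketch of the token-classification stage, using conditional random fields via sklearn-crfsuite. The feature function and BIO-tagged training data below are hypothetical stand-ins; BANNER-CHEMDNER's real features include learned word representations, which are not reproduced here.

```python
import sklearn_crfsuite

def token_features(sent, i):
    w = sent[i]
    return {
        "lower": w.lower(),
        "is_title": w.istitle(),
        "has_digit": any(c.isdigit() for c in w),
        "suffix3": w[-3:],
        "prev": sent[i - 1].lower() if i > 0 else "<BOS>",
        "next": sent[i + 1].lower() if i < len(sent) - 1 else "<EOS>",
    }

# Toy training data in BIO format (hypothetical labels).
sents = [["Aspirin", "inhibits", "COX-1", "."]]
labels = [["B-CHEM", "O", "B-GENE", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, labels)
print(crf.predict(X))   # predicted label sequence for each sentence
```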
78 FR 15056 - Submission for OMB Review; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-08
... establish, document, and maintain a system of internal risk management controls. The Rule sets forth the..., documenting, and reviewing its internal risk management control system, which are designed to, among other... documenting its risk management control system is 2,000 hours and that, on average, a registered OTC...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-11-21
... updating and revising a set of production, underwriting, asset management, closing, and other documents... clarify the requirements for a management agent and the management agreement. Production--Firm Commitments... closing documents to the Office of Management and Budget (OMB) for review and approval, and assignment of...
Unesco Integrated Documentation Network; Computerized Documentation System (CDS).
ERIC Educational Resources Information Center
United Nations Educational, Scientific, and Cultural Organization, Paris (France). Dept. of Documentation, Libraries, and Archives.
Intended for use by the Computerized Documentation System (CDS), the Unesco version of ISIS (Integrated Set of Information Systems)--originally developed by the International Labour Organization--was developed in 1975 and named CDS/ISIS. This system has a comprehensive collection of programs for input, management, and output, running in batch or…
LExTeS: Link Extraction and Testing Suite
NASA Astrophysics Data System (ADS)
Ryan, P. W.
2017-11-01
LExTeS (Link Extraction and Testing Suite) extracts hyperlinks from PDF documents, tests the extracted links to see which are broken, and tabulates the results. Though written to support a particular set of PDF documents, the dataset and scripts can be edited for use on other documents.
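A hedged sketch of that workflow (extract links, test them, tabulate the results), not the LExTeS implementation itself: URLs are pulled from the extracted page text with a simple regular expression using pypdf, then probed with an HTTP HEAD request.

```python
import re
import sys
import requests
from pypdf import PdfReader

URL_RE = re.compile(r"https?://[^\s)>\]]+")

def check_pdf_links(path):
    reader = PdfReader(path)
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    results = {}
    for url in sorted(set(URL_RE.findall(text))):
        try:
            status = requests.head(url, allow_redirects=True, timeout=10).status_code
        except requests.RequestException:
            status = None        # unreachable host, timeout, etc.
        results[url] = status
    return results

if __name__ == "__main__":
    # Usage: python check_links.py some_document.pdf
    for url, status in check_pdf_links(sys.argv[1]).items():
        print("BROKEN" if status is None or status >= 400 else "OK", status, url)
```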
32 CFR 813.5 - Shipping or transmitting visual information documentation images.
Code of Federal Regulations, 2010 CFR
2010-07-01
... documentation images. 813.5 Section 813.5 National Defense Department of Defense (Continued) DEPARTMENT OF THE... visual information documentation images. (a) COMCAM images. Send COMCAM images to the DoD Joint Combat... the approval procedures that on-scene and theater commanders set. (b) Other non-COMCAM images. After...
32 CFR 813.5 - Shipping or transmitting visual information documentation images.
Code of Federal Regulations, 2013 CFR
2013-07-01
... documentation images. 813.5 Section 813.5 National Defense Department of Defense (Continued) DEPARTMENT OF THE... visual information documentation images. (a) COMCAM images. Send COMCAM images to the DoD Joint Combat... the approval procedures that on-scene and theater commanders set. (b) Other non-COMCAM images. After...
32 CFR 813.5 - Shipping or transmitting visual information documentation images.
Code of Federal Regulations, 2011 CFR
2011-07-01
... documentation images. 813.5 Section 813.5 National Defense Department of Defense (Continued) DEPARTMENT OF THE... visual information documentation images. (a) COMCAM images. Send COMCAM images to the DoD Joint Combat... the approval procedures that on-scene and theater commanders set. (b) Other non-COMCAM images. After...
32 CFR 813.5 - Shipping or transmitting visual information documentation images.
Code of Federal Regulations, 2012 CFR
2012-07-01
... documentation images. 813.5 Section 813.5 National Defense Department of Defense (Continued) DEPARTMENT OF THE... visual information documentation images. (a) COMCAM images. Send COMCAM images to the DoD Joint Combat... the approval procedures that on-scene and theater commanders set. (b) Other non-COMCAM images. After...
32 CFR 813.5 - Shipping or transmitting visual information documentation images.
Code of Federal Regulations, 2014 CFR
2014-07-01
... documentation images. 813.5 Section 813.5 National Defense Department of Defense (Continued) DEPARTMENT OF THE... visual information documentation images. (a) COMCAM images. Send COMCAM images to the DoD Joint Combat... the approval procedures that on-scene and theater commanders set. (b) Other non-COMCAM images. After...
Where do I find documentation/more information concerning a data set?
Atmospheric Science Data Center
2015-11-30
To access documentation, locate and select the link from the Projects Supported page for the project that you would like ... page where you can access it if it is available, note that a missing tab on the product page indicates that there is no documentation ...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-06-08
...: Document--Tools for Implementing Inmate Behavior Management; Setting Measurable Goals AGENCY: National... elements of inmate behavior management (IBM), as defined by NIC. This document will be written in the context of inmate behavior management, which is described under SUPPLEMENTARY INFORMATION below. This...
Handwritten mathematical symbols dataset
Chajri, Yassine; Bouikhalene, Belaid
2016-01-01
Due to technological advances in recent years, paper scientific documents are used less and less, and the trend in the scientific community toward using digital documents has increased considerably. Among these are scientific documents and, more specifically, mathematics documents. In this context, we present our own dataset of handwritten mathematical symbols composed of 10,379 images. This dataset gathers Arabic characters, Latin characters, Arabic numerals, Latin numerals, arithmetic operators, set symbols, comparison symbols, delimiters, etc. PMID:27006975
NASA Technical Reports Server (NTRS)
1994-01-01
This Handbook, effective 13 September 1994, documents the NASA organization, defines terms, and sets forth the policy and requirements for establishing, modifying, and documenting the NASA organizational structure and for assigning organizational responsibilities.
24 CFR 905.510 - Submission requirements.
Code of Federal Regulations, 2012 CFR
2012-04-01
... submitted by this part: Capital fund financing budget, management assessment, fairness opinion, and physical needs assessment. (5) Financing documents. The PHA must submit a complete set of the legal documents...
24 CFR 905.510 - Submission requirements.
Code of Federal Regulations, 2011 CFR
2011-04-01
... submitted by this part: Capital fund financing budget, management assessment, fairness opinion, and physical needs assessment. (5) Financing documents. The PHA must submit a complete set of the legal documents...
24 CFR 905.510 - Submission requirements.
Code of Federal Regulations, 2014 CFR
2014-04-01
... submitted by this part: Capital fund financing budget, management assessment, fairness opinion, and physical needs assessment. (5) Financing documents. The PHA must submit a complete set of the legal documents...
Facilitating access to information in large documents with an intelligent hypertext system
NASA Technical Reports Server (NTRS)
Mathe, Nathalie
1993-01-01
Retrieving specific information from large amounts of documentation is not an easy task. It could be facilitated if information relevant to the current problem-solving context were automatically supplied to the user. As a first step towards this goal, we have developed an intelligent hypertext system called CID (Computer Integrated Documentation) and tested it on the Space Station Freedom requirement documents. The CID system enables integration of various technical documents in a hypertext framework and includes an intelligent, context-sensitive indexing and retrieval mechanism. This mechanism uses on-line user information requirements and relevance feedback either to reinforce the current indexing in case of success or to generate new knowledge in case of failure. This allows the CID system to provide helpful responses, based on previous usage of the documentation, and to improve its performance over time.
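As an illustration of relevance feedback of the general kind described (not CID's actual mechanism), a Rocchio-style update moves a stored query/context vector toward documents the user marked relevant and away from those marked not relevant. The weights alpha, beta, and gamma are conventional illustrative values.

```python
import numpy as np

def rocchio_update(query_vec, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated term-weight vector given user feedback."""
    q = alpha * query_vec
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q = q - gamma * np.mean(nonrelevant, axis=0)
    return np.clip(q, 0.0, None)   # keep term weights non-negative

# Toy 4-term vocabulary (hypothetical vectors).
query = np.array([1.0, 0.0, 0.5, 0.0])
rel = np.array([[0.9, 0.1, 0.7, 0.0], [1.0, 0.0, 0.6, 0.1]])
nonrel = np.array([[0.0, 1.0, 0.0, 0.8]])
print(rocchio_update(query, rel, nonrel))
```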
Booth, Richard G; Scerbo, Christina Ko; Sinclair, Barbara; Hancock, Michele; Reid, David; Denomy, Eileen
2017-04-01
Little research has been completed exploring knowledge development and transfer from and between simulated and clinical practice settings in nurse education. This study sought to explore the content learned, and the knowledge transferred, in a hybrid mental health clinical course consisting of simulated and clinical setting experiences. A qualitative, interpretive descriptive study design. Clinical practice consisted of six 10-hour shifts in a clinical setting combined with six two-hour simulations. 12 baccalaureate nursing students enrolled in a compressed time frame program at a large, urban, Canadian university participated. Document analysis and a focus group were used to draw thematic representations of content and knowledge transfer between clinical environments (i.e., simulated and clinical settings) using the constant comparative data analysis technique. Four major themes arose: (a) professional nursing behaviors; (b) understanding of the mental health nursing role; (c) confidence gained in interview skills; and, (d) unexpected learning. Nurse educators should further explore the intermingling of simulation and clinical practice in terms of knowledge development and transfer with the goal of preparing students to function within the mental health nursing specialty. Copyright © 2017 Elsevier Ltd. All rights reserved.
Travelling waves and spatial hierarchies in measles epidemics
NASA Astrophysics Data System (ADS)
Grenfell, B. T.; Bjørnstad, O. N.; Kappey, J.
2001-12-01
Spatio-temporal travelling waves are striking manifestations of predator-prey and host-parasite dynamics. However, few systems are well enough documented both to detect repeated waves and to explain their interaction with spatio-temporal variations in population structure and demography. Here, we demonstrate recurrent epidemic travelling waves in an exhaustive spatio-temporal data set for measles in England and Wales. We use wavelet phase analysis, which allows for dynamical non-stationarity, a complication in interpreting spatio-temporal patterns in these and many other ecological time series. In the pre-vaccination era, conspicuous hierarchical waves of infection moved regionally from large cities to small towns; the introduction of measles vaccination restricted but did not eliminate this hierarchical contagion. A mechanistic stochastic model suggests a dynamical explanation for the waves: spread via infective 'sparks' from large 'core' cities to smaller 'satellite' towns. Thus, the spatial hierarchy of host population structure is a prerequisite for these infection waves.
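For readers unfamiliar with wavelet phase analysis, a standard formulation (the common Morlet-wavelet convention; the exact wavelet, normalisation, and scales used by the authors may differ) is:

```latex
\psi_0(\eta) = \pi^{-1/4}\, e^{i\omega_0 \eta}\, e^{-\eta^2/2}, \qquad \omega_0 \approx 6
\qquad
W_x(s, t) = \frac{1}{\sqrt{s}} \sum_{t'} x(t')\, \psi_0^{*}\!\left(\frac{t' - t}{s}\right)
\qquad
\phi_x(s, t) = \arg W_x(s, t), \qquad
\Delta\phi_{xy}(s, t) = \arg\!\left[ W_x(s, t)\, W_y^{*}(s, t) \right]
```

The phase difference between two incidence series at a given scale then gives the local lag used to detect waves moving through the urban hierarchy.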
Association of head circumference and shoulder dystocia in macrosomic neonates.
Larson, Austin; Mandelbaum, David E
2013-04-01
To determine whether asymmetric macrosomia (disproportionately large body size in comparison to head circumference) could be demonstrated in a population of infants suffering shoulder dystocia during delivery relative to those who did not suffer shoulder dystocia, a case-control study was conducted as a retrospective chart review over 3 years at a large maternity hospital in an urban setting. Among infants over 4,000 g, those who suffered shoulder dystocia during delivery had a smaller mean head circumference than infants of a similar size who did not. A statistically significant difference was also present when cases of documented gestational diabetes were excluded. Asymmetric macrosomia is more likely to be present in a population of infants who suffered shoulder dystocia during delivery. This knowledge could be used in designing tools to predict which pregnancies are at highest risk for shoulder dystocia during delivery.
The Art and Science of Photography in Hand Surgery
Wang, Keming; Kowalski, Evan J.; Chung, Kevin C.
2013-01-01
High-quality medical photography plays an important role in teaching and in demonstrating the functional capacity of the hands, as well as in medicolegal documentation. Obtaining standardized, high-quality photographs is now an essential component of many surgical practices. The importance of standardized photography in facial and cosmetic surgery has been well documented in previous studies, but no studies have thoroughly addressed the details of photography for hand surgery. In this paper, we provide a set of guidelines and basic camera concepts for different scenarios to help hand surgeons obtain appropriate and informative high-quality photographs. A camera used for medical photography should have a large sensor and an optical zoom lens with a focal length in the range of 14 to 75 mm. In a clinic or office setting, we recommend six standardized views of the hand and four views of the wrist, and additional views should be taken for tendon ruptures, nerve injuries, or other deformities of the hand. For intra-operative pictures, the camera operator should understand the procedure and the pertinent anatomy in order to obtain high-quality photographs. When digital radiographs are not available and radiographic film must be photographed, it is recommended to reduce the exposure and change the color mode to black and white to obtain the best possible pictures. The goal of medical photography is to present the subject in an accurate and precise fashion. PMID:23755927
Parashar, Umesh D; Cortese, Margaret M; Payne, Daniel C; Lopman, Benjamin; Yen, Catherine; Tate, Jacqueline E
2015-11-27
In 1999, the first rhesus-human reassortant rotavirus vaccine licensed in the United States was withdrawn within a year of its introduction after it was linked with intussusception at a rate of ∼1 excess case per 10,000 vaccinated infants. While clinical trials enrolling 60,000-70,000 infants for each of the two current live oral rotavirus vaccines, RotaTeq (RV5) and Rotarix (RV1), did not find an association with intussusception, post-licensure studies have documented a risk in several high- and middle-income countries, at a rate of ∼1-6 excess cases per 100,000 vaccinated infants. However, weighing this low risk against the large health benefits of vaccination observed in many countries, including countries with a documented vaccine-associated intussusception risk, policy makers and health organizations around the world continue to support the routine use of RV1 and RV5 in national infant immunization programs. Because risk and benefit data from affluent settings may not be directly applicable to developing countries, further characterization of any intussusception risk following rotavirus vaccination, as well as of the health benefits of vaccination, is desirable for low-income settings. Copyright © 2015 American Journal of Preventive Medicine. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Bryant, Gerald
2015-04-01
Large-scale soft-sediment deformation features in the Navajo Sandstone have been a topic of interest for nearly 40 years, ever since they were first explored as a criterion for discriminating between marine and continental processes in the depositional environment. For much of this time, evidence for large-scale sediment displacements was commonly attributed to processes of mass wasting, that is, gravity-driven movements of surficial sand. These slope failures were attributed to the inherent susceptibility of dune sand responding to environmental triggers such as earthquakes, floods, impacts, and the differential loading associated with dune topography. During the last decade, a new wave of research has focused on the event significance of deformation features in more detail, revealing a broad diversity of large-scale deformation morphologies. This research has led to a better appreciation of subsurface dynamics in the early Jurassic deformation events recorded in the Navajo Sandstone, including the important role of intrastratal sediment flow. This report documents two illustrative examples of large-scale sediment displacements represented in extensive outcrops of the Navajo Sandstone along the Utah/Arizona border. Architectural relationships in these outcrops provide definitive constraints that enable the recognition of a large-scale sediment outflow at one location and an equally large-scale subsurface flow at the other. At both sites, evidence for associated processes of liquefaction appears at depths of at least 40 m below the original depositional surface, which is nearly an order of magnitude greater than has commonly been reported from modern settings. The surficial mass-flow feature displays attributes that are consistent with much smaller-scale sediment eruptions (sand volcanoes) often documented from modern earthquake zones, including the development of hydraulic pressure from localized subsurface liquefaction and the subsequent escape of fluidized sand toward the unconfined conditions of the surface. The origin of the forces that produced the lateral, subsurface movement of a large body of sand at the other site is not readily apparent. The various constraints on modeling the generation of the lateral force required to produce the observed displacement are considered here, along with photodocumentation of key outcrop relationships.
Multispectral image restoration of historical documents based on LAAMs and mathematical morphology
NASA Astrophysics Data System (ADS)
Lechuga-S., Edwin; Valdiviezo-N., Juan C.; Urcid, Gonzalo
2014-09-01
This research introduces an automatic technique designed for the digital restoration of damaged parts of historical documents. For this purpose, an imaging spectrometer is used to acquire a set of images in the wavelength interval from 400 to 1000 nm. Assuming the presence of linearly mixed spectral pixels in the multispectral image, our technique uses two lattice autoassociative memories to extract the set of pure pigments composing a given document. Through a spectral unmixing analysis, our method produces fractional abundance maps indicating the distribution of each pigment in the scene. These maps are then used to locate cracks and holes in the document under study. The restoration process is performed by applying a region-filling algorithm, based on morphological dilation, followed by a color interpolation to restore the original appearance of the filled areas. This procedure has been successfully applied to the analysis and restoration of three multispectral data sets: two corresponding to artificially superimposed scripts and one acquired from a Mexican pre-Hispanic codex, whose restoration results are presented.
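A hedged sketch of the region-filling step only (the unmixing and abundance-map stages are not reproduced): grow the detected crack/hole mask slightly with morphological dilation, then fill the masked pixels from their nearest undamaged neighbours as a simple stand-in for the colour interpolation used in the paper.

```python
import numpy as np
from scipy.ndimage import binary_dilation, distance_transform_edt

def fill_damaged(image, damage_mask, grow=1):
    """image: HxWx3 float array; damage_mask: HxW boolean array of cracks/holes."""
    mask = binary_dilation(damage_mask, iterations=grow)
    # For every pixel, indices of the nearest pixel outside the damage mask.
    idx = distance_transform_edt(mask, return_distances=False, return_indices=True)
    filled = image.copy()
    for c in range(image.shape[2]):
        filled[..., c][mask] = image[idx[0], idx[1], c][mask]
    return filled

# Toy example: a 5x5 "document" with one damaged centre pixel.
img = np.full((5, 5, 3), 0.8)
img[2, 2] = 0.0
dmg = np.zeros((5, 5), dtype=bool)
dmg[2, 2] = True
print(fill_damaged(img, dmg)[2, 2])   # restored from surrounding pixels
```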
Nurses using futuristic technology in today's healthcare setting.
Wolf, Debra M; Kapadia, Amar; Kintzel, Jessie; Anton, Bonnie B
2009-01-01
Human-computer interaction (HCI) here refers to nurses using voice-assisted technology in a clinical setting to document patient care in real time, retrieve patient information from care plans, and complete routine tasks. This is a reality currently in use by clinicians in acute and long-term care settings. Voice-assisted documentation provides hands-free and eyes-free, accurate documentation while enabling effective communication and task management. The speech technology increases the accuracy of documentation while interfacing directly with the electronic health record (EHR). Using technology consisting of a lightweight headset and a small, fist-sized wireless computer, staff give verbal responses to easy-to-follow cues that are converted into a database system, allowing them to obtain individualized care status reports on demand. To further assist staff in their daily work, this innovative technology also allows them to send and receive pages as needed. This paper discusses how this leading-edge and award-winning technology is being integrated within the United States. Collaborative efforts between clinicians and analysts are discussed, reflecting the interactive design and build of the functionality. Features such as the system's voice responses and directed cues are shared, along with how easily data can be documented, viewed, and retrieved. Outcome data are presented on how the technology affected the organization's quality outcomes, financial reimbursement, and employee satisfaction.
BOREAS RSS-14 Level-2 GOES-7 Shortwave and Longwave Radiation Images
NASA Technical Reports Server (NTRS)
Hall, Forrest G. (Editor); Nickeson, Jaime (Editor); Gu, Jiujing; Smith, Eric A.
2000-01-01
The BOREAS RSS-14 team collected and processed several GOES-7 and GOES-8 image data sets that covered the BOREAS study region. This data set contains images of shortwave and longwave radiation at the surface and top of the atmosphere derived from collected GOES-7 data. The data cover the time period of 05-Feb-1994 to 20-Sep-1994. The images missing from the temporal series were zero-filled to create a consistent sequence of files. The data are stored in binary image format files. Due to the large size of the images, the level-1a GOES-7 data are not contained on the BOREAS CD-ROM set. An inventory listing file is supplied on the CD-ROM to inform users of what data were collected. The level-1a GOES-7 image data are available from the Earth Observing System Data and Information System (EOSDIS) Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC). See sections 15 and 16 for more information. The data files are available on a CD-ROM (see document number 20010000884).
Documentation of a deep percolation model for estimating ground-water recharge
Bauer, H.H.; Vaccaro, J.J.
1987-01-01
A deep percolation model, which operates on a daily basis, was developed to estimate long-term average groundwater recharge from precipitation. It has been designed primarily to simulate recharge in large areas with variable weather, soils, and land uses, but it can also be used at any scale. The physical and mathematical concepts of the deep percolation model, its subroutines and data requirements, and the input data sequence and formats are documented. The physical processes simulated are soil moisture accumulation, evaporation from bare soil, plant transpiration, surface water runoff, snow accumulation and melt, and accumulation and evaporation of intercepted precipitation. The minimum data set for operation of the model consists of daily values of precipitation and maximum and minimum air temperature, soil thickness and available water capacity, soil texture, and land use. Long-term average annual precipitation, actual daily stream discharge, monthly estimates of base flow, Soil Conservation Service surface runoff curve numbers, land surface altitude-slope-aspect, and temperature lapse rates are optional. The program is written in the FORTRAN 77 language with no enhancements and should run on most computer systems without modification. The documentation has been prepared so that program modifications may be made to include additional physical processes or to delete ones not considered important. (Author's abstract)
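A heavily simplified daily soil-moisture bucket in the spirit of the processes listed above (not the USGS model itself): precipitation fills the soil store, evapotranspiration demand drains it, and any water above the available-water capacity becomes deep percolation (recharge). All parameters are hypothetical.

```python
def daily_recharge(precip, pet, awc, soil0=0.0):
    """precip, pet: daily series (mm); awc: available water capacity (mm)."""
    soil = soil0
    recharge = []
    for p, e in zip(precip, pet):
        soil += p
        et = min(e, soil)              # actual ET limited by stored water
        soil -= et
        deep = max(0.0, soil - awc)    # excess over capacity percolates downward
        soil -= deep
        recharge.append(deep)
    return recharge

# Four hypothetical days of precipitation and potential ET (mm).
print(sum(daily_recharge([10, 0, 25, 5], [3, 3, 3, 3], awc=20.0)))
```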
NASA Astrophysics Data System (ADS)
Cantoro, G.
2017-02-01
Archaeology is by its nature strictly connected with the physical landscape, and as such it explores the inter-relations of individuals with the places in which they live and the nature that surrounds them. Since its earliest stages, archaeology has demonstrated its permeability to scientific methods and innovative techniques and technologies. Archaeologists were indeed among the first to adopt GIS platforms on a large scale (for almost three decades now) and are now among the most demanding customers for emerging technologies such as digital photogrammetry and drone-aided aerial photography. This paper aims at presenting case studies in which the "3D approach" can be critically analysed and compared with more traditional means of documentation. The spotlight is directed towards the benefits of a specifically designed platform that allows users to access the 3D point clouds and explore their characteristics. Besides simple measuring and editing tools, models are presented in their actual context and location, with historical and archaeological information provided alongside. As the final step of a parallel project on geo-referencing and making available a large archive of aerial photographs, 3D models derived from photogrammetric processing of the images have been uploaded and linked to photo-footprint polygons. Of great importance in this context is the ability to interchange the point-cloud colours with satellite imagery from OpenLayers. This approach makes it possible to explore different landscape configurations resulting from changes over time with simple clicks. In these cases, photogrammetry or 3D laser scanning replaced, complemented, or integrated legacy documentation, creating at once a new set of information for forthcoming research and, ideally, new discoveries.
Making automated computer program documentation a feature of total system design
NASA Technical Reports Server (NTRS)
Wolf, A. W.
1970-01-01
It is pointed out that in large-scale computer software systems, program documents are too often fraught with errors, out of date, poorly written, and sometimes nonexistent in whole or in part. The means are described by which many of these typical system documentation problems were overcome in a large and dynamic software project. A systems approach was employed which encompassed such items as: (1) configuration management; (2) standards and conventions; (3) collection of program information into central data banks; (4) interaction among executive, compiler, central data banks, and configuration management; and (5) automatic documentation. A complete description of the overall system is given.
Computer-Assisted Search Of Large Textual Data Bases
NASA Technical Reports Server (NTRS)
Driscoll, James R.
1995-01-01
"QA" denotes high-speed computer system for searching diverse collections of documents including (but not limited to) technical reference manuals, legal documents, medical documents, news releases, and patents. Incorporates previously available and emerging information-retrieval technology to help user intelligently and rapidly locate information found in large textual data bases. Technology includes provision for inquiries in natural language; statistical ranking of retrieved information; artificial-intelligence implementation of semantics, in which "surface level" knowledge found in text used to improve ranking of retrieved information; and relevance feedback, in which user's judgements of relevance of some retrieved documents used automatically to modify search for further information.
Contandriopoulos, Damien; Lemire, Marc; Denis, Jean-Louis; Tremblay, Emile
2010-12-01
This article presents the main results from a large-scale analytical systematic review on knowledge exchange interventions at the organizational and policymaking levels. The review integrated two broad traditions, one roughly focused on the use of social science research results and the other focused on policymaking and lobbying processes. Data collection was done using systematic snowball sampling. First, we used prospective snowballing to identify all documents citing any of a set of thirty-three seminal papers. This process identified 4,102 documents, 102 of which were retained for in-depth analysis. The bibliographies of these 102 documents were merged and used to identify retrospectively all articles cited five times or more and all books cited seven times or more. All together, 205 documents were analyzed. To develop an integrated model, the data were synthesized using an analytical approach. This article developed integrated conceptualizations of the forms of collective knowledge exchange systems, the nature of the knowledge exchanged, and the definition of collective-level use. This literature synthesis is organized around three dimensions of context: level of polarization (politics), cost-sharing equilibrium (economics), and institutionalized structures of communication (social structuring). The model developed here suggests that research is unlikely to provide context-independent evidence for the intrinsic efficacy of knowledge exchange strategies. To design a knowledge exchange intervention to maximize knowledge use, a detailed analysis of the context could use the kind of framework developed here. © 2010 Milbank Memorial Fund. Published by Wiley Periodicals Inc.
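A small sketch of the retrospective snowballing rule described above: merge the bibliographies of the retained documents and keep articles cited five or more times and books cited seven or more times. The reference records are hypothetical (type, identifier) pairs, and with this toy input nothing reaches the thresholds.

```python
from collections import Counter

bibliographies = [
    [("article", "Weiss 1979"), ("book", "Lindblom 1980"), ("article", "Caplan 1979")],
    [("article", "Weiss 1979"), ("book", "Lindblom 1980")],
    # ... one list per retained document
]

counts = Counter(ref for bib in bibliographies for ref in bib)
retained = [
    ref for ref, n in counts.items()
    if (ref[0] == "article" and n >= 5) or (ref[0] == "book" and n >= 7)
]
print(retained)
```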
Contandriopoulos, Damien; Lemire, Marc; Denis, Jean-Louis; Tremblay, Émile
2010-01-01
Context: This article presents the main results from a large-scale analytical systematic review on knowledge exchange interventions at the organizational and policymaking levels. The review integrated two broad traditions, one roughly focused on the use of social science research results and the other focused on policymaking and lobbying processes. Methods: Data collection was done using systematic snowball sampling. First, we used prospective snowballing to identify all documents citing any of a set of thirty-three seminal papers. This process identified 4,102 documents, 102 of which were retained for in-depth analysis. The bibliographies of these 102 documents were merged and used to identify retrospectively all articles cited five times or more and all books cited seven times or more. All together, 205 documents were analyzed. To develop an integrated model, the data were synthesized using an analytical approach. Findings: This article developed integrated conceptualizations of the forms of collective knowledge exchange systems, the nature of the knowledge exchanged, and the definition of collective-level use. This literature synthesis is organized around three dimensions of context: level of polarization (politics), cost-sharing equilibrium (economics), and institutionalized structures of communication (social structuring). Conclusions: The model developed here suggests that research is unlikely to provide context-independent evidence for the intrinsic efficacy of knowledge exchange strategies. To design a knowledge exchange intervention to maximize knowledge use, a detailed analysis of the context could use the kind of framework developed here. PMID:21166865
Predicate Argument Structure Analysis for Use Case Description Modeling
NASA Astrophysics Data System (ADS)
Takeuchi, Hironori; Nakamura, Taiga; Yamaguchi, Takahira
In a large software system development project, many documents are prepared and updated frequently. In such a situation, support is needed for looking through these documents easily to identify inconsistencies and to maintain traceability. In this research, we focus on requirements documents such as use cases and consider how to create models from use case descriptions in unformatted text. In the model construction, we propose a few semantic constraints based on the features of the use cases and use them in a predicate argument structure analysis to assign semantic labels to actors and actions. With this approach, we show that we can assign semantic labels without enhancing any existing general lexical resources, such as case frame dictionaries, and can design a less language-dependent model construction architecture. Using the constructed model, we consider a system for quality analysis of the use cases and automated test case generation to maintain traceability between document sets. We evaluated the reuse of existing use cases and automatically generated test case steps with the proposed prototype system from real-world use cases in the development of a system based on a packaged application. Based on the evaluation, we show how to construct models with high precision from English and Japanese use case data. We could also generate good test cases for about 90% of the real use cases through manual improvement of the descriptions based on feedback from the quality analysis system.
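As a rough illustration of pulling actor/action structure out of a use case sentence, here is a sketch built on an off-the-shelf dependency parse (spaCy) rather than the authors' predicate argument structure analyser; the labels and constraints are illustrative, and the small English model is assumed to be installed.

```python
import spacy

nlp = spacy.load("en_core_web_sm")   # assumes the small English model is installed

def actor_action_pairs(text):
    pairs = []
    for token in nlp(text):
        if token.pos_ == "VERB":
            actors = [c.text for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
            objects = [c.text for c in token.children if c.dep_ in ("dobj", "obj")]
            for a in actors:
                pairs.append({"actor": a, "action": token.lemma_, "objects": objects})
    return pairs

print(actor_action_pairs("The operator submits the order and the system validates it."))
```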
Güthlin, Corina; Lange, Oliver; Walach, Harald
2004-01-01
Background Despite the increasing demand for acupuncture and homoeopathy in Germany, little is known about the effects of these treatments in routine care. We set up a pragmatic documentation study in general practice, funded within the scope of a project launched by a German health insurer. Patients were followed up for up to four years. Methods The aim of the project was to study the effects and benefits of acupuncture and/or homoeopathy, and to assess patient satisfaction, within a prospective documentation of over 5000 acupuncture and over 900 homoeopathy patients. As data sources, we used the documentation provided by therapists for every individual visit and a standardised quality-of-life questionnaire (MOS SF-36); these were complemented by questions concerning the patient's medical history and questions on patient satisfaction. The health insurer provided us with data on work absenteeism. Results Descriptive analyses of the main outcomes showed a benefit of treatment, with medium to large effect sizes for the quality-of-life questionnaire SF-36 and about a 1-point improvement on a rating scale of effects given by doctors. Data on the treatment and on the patients' and physicians' backgrounds suggest chronically ill patients treated according to fairly regular schemes. Conclusion Since the results showed evidence of a subjective benefit for patients from acupuncture and homoeopathy, this may account for the increase in demand for these treatments, especially when patients are chronically ill and unsatisfied with the conventional treatment given previously. PMID:15113434
Endemism in the moss flora of North America.
Carter, Benjamin E; Shaw, Blanka; Shaw, A Jonathan
2016-04-01
Identifying regions of high endemism is a critical step toward understanding the mechanisms underlying diversification and establishing conservation priorities. Here, we identified regions of high moss endemism across North America. We also identified lineages that contribute disproportionately to endemism and documented the progress of efforts to inventory the endemic flora. To understand the documentation of endemic moss diversity in North America, we tabulated species publication dates to document the progress of species discovery across the continent. We analyzed herbarium specimen data and distribution data from the Flora of North America project to delineate major regions of moss endemism. Finally, we surveyed the literature to assess the importance of intercontinental vs. within-continent diversification in generating endemic species. Three primary regions of endemism were identified, and two of these were further divided into a total of nine subregions. Overall endemic richness has two peaks, one in northern California and the Pacific Northwest, and the other in the southern Appalachians. Description of new endemic species has risen steeply over the last few decades, especially in western North America. Among the few studies documenting sister-species relationships of endemics, recent diversification appears to have played a larger role in western North America than in the east. Our understanding of bryophyte endemism continues to grow rapidly. Large continent-wide data sets confirm early views on hotspots of endemic bryophyte richness and indicate a high rate of ongoing species discovery in North America. © 2016 Botanical Society of America.
50 CFR 600.315 - National Standard 2-Scientific Information.
Code of Federal Regulations, 2012 CFR
2012-10-01
...) SAFE Report. (1) The SAFE report is a document or set of documents that provides Councils with a... responsibility to assure that a SAFE report or similar document is prepared, reviewed annually, and changed as..., Federal, university, or other sources to acquire and analyze data and produce the SAFE report. (ii) The...
12 CFR 509.24 - Scope of document discovery.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 12 Banks and Banking 5 2010-01-01 2010-01-01 false Scope of document discovery. 509.24 Section 509... discovery. (a) Limits on discovery. (1) Subject to the limitations set out in paragraphs (b), (c), and (d) of this section, a party to a proceeding under this subpart may obtain document discovery by serving...
12 CFR 308.24 - Scope of document discovery.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 12 Banks and Banking 4 2010-01-01 2010-01-01 false Scope of document discovery. 308.24 Section 308... PRACTICE AND PROCEDURE Uniform Rules of Practice and Procedure § 308.24 Scope of document discovery. (a) Limits on discovery. (1) Subject to the limitations set out in paragraphs (b), (c), and (d) of this...
12 CFR 308.107 - Document discovery.
Code of Federal Regulations, 2014 CFR
2014-01-01
... 12 Banks and Banking 5 2014-01-01 2014-01-01 false Document discovery. 308.107 Section 308.107... PRACTICE AND PROCEDURE General Rules of Procedure § 308.107 Document discovery. (a) Parties to proceedings set forth at § 308.01 of the Uniform Rules and as provided in the Local Rules may obtain discovery...
12 CFR 308.107 - Document discovery.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 12 Banks and Banking 4 2011-01-01 2011-01-01 false Document discovery. 308.107 Section 308.107... PRACTICE AND PROCEDURE General Rules of Procedure § 308.107 Document discovery. (a) Parties to proceedings set forth at § 308.01 of the Uniform Rules and as provided in the Local Rules may obtain discovery...
12 CFR 308.107 - Document discovery.
Code of Federal Regulations, 2012 CFR
2012-01-01
... 12 Banks and Banking 5 2012-01-01 2012-01-01 false Document discovery. 308.107 Section 308.107... PRACTICE AND PROCEDURE General Rules of Procedure § 308.107 Document discovery. (a) Parties to proceedings set forth at § 308.01 of the Uniform Rules and as provided in the Local Rules may obtain discovery...
12 CFR 308.107 - Document discovery.
Code of Federal Regulations, 2013 CFR
2013-01-01
... 12 Banks and Banking 5 2013-01-01 2013-01-01 false Document discovery. 308.107 Section 308.107... PRACTICE AND PROCEDURE General Rules of Procedure § 308.107 Document discovery. (a) Parties to proceedings set forth at § 308.01 of the Uniform Rules and as provided in the Local Rules may obtain discovery...
12 CFR 308.107 - Document discovery.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 12 Banks and Banking 4 2010-01-01 2010-01-01 false Document discovery. 308.107 Section 308.107... PRACTICE AND PROCEDURE General Rules of Procedure § 308.107 Document discovery. (a) Parties to proceedings set forth at § 308.01 of the Uniform Rules and as provided in the Local Rules may obtain discovery...
Screening and Evaluation Tool (SET) Users Guide
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pincock, Layne
This document is the user's guide for the Screening and Evaluation Tool (SET). SET is a tool for comparing multiple fuel cycle options against a common set of criteria and metrics. It does this using standard multi-attribute utility decision analysis methods.
Method and system of filtering and recommending documents
Patton, Robert M.; Potok, Thomas E.
2016-02-09
Disclosed is a method and system for discovering documents using a computer and providing a small set of the most relevant documents to the attention of a human observer. Using the method, the computer obtains a seed document from the user and generates a seed document vector using term frequency-inverse corpus frequency weighting. A keyword index for a plurality of source documents can be compared with the weighted terms of the seed document vector. The comparison is then filtered to reduce the number of documents, which define an initial subset of the source documents. Initial subset vectors are generated and compared to the seed document vector to obtain a similarity value for each comparison. Based on the similarity value, the method then recommends one or more of the source documents.
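A minimal sketch of the seed-document recommendation idea (weight the terms of a seed document, then compare against the corpus and keep the most similar items). Ordinary tf-idf is used here as an approximation of the patent's term frequency-inverse corpus frequency scheme, and the corpus and seed text are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "wind turbine blade inspection using drones",
    "recipes for sourdough bread at home",
    "automated inspection of turbine blades with unmanned aircraft",
    "quarterly financial results for a retail chain",
]
seed = "drone-based inspection of wind turbine blades"

vec = TfidfVectorizer(stop_words="english")
doc_vecs = vec.fit_transform(corpus)
seed_vec = vec.transform([seed])

scores = cosine_similarity(seed_vec, doc_vecs).ravel()
top = scores.argsort()[::-1][:2]    # recommend a small set of the most similar documents
for i in top:
    print(round(float(scores[i]), 3), corpus[i])
```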
14 CFR 302.3 - Filing of documents.
Code of Federal Regulations, 2010 CFR
2010-01-01
... set at the DOT Dockets Management System (DMS) internet website. (2) Such documents will be deemed to... as to the contents and style of briefs. (2) Papers may be reproduced by any duplicating process...
ENVIRONMENTAL INFORMATION MANAGEMENT SYSTEM (EIMS)
The Environmental Information Management System (EIMS) organizes descriptive information (metadata) for data sets, databases, documents, models, projects, and spatial data. The EIMS design provides a repository for scientific documentation that can be easily accessed with standar...
Westbury, Chris F.; Shaoul, Cyrus; Hollis, Geoff; Smithson, Lisa; Briesemeister, Benny B.; Hofmann, Markus J.; Jacobs, Arthur M.
2013-01-01
Many studies have shown that behavioral measures are affected by manipulating the imageability of words. Though imageability is usually measured by human judgment, little is known about what factors underlie those judgments. We demonstrate that imageability judgments can be largely or entirely accounted for by two computable measures that have previously been associated with imageability, the size and density of a word's context and the emotional associations of the word. We outline an algorithmic method for predicting imageability judgments using co-occurrence distances in a large corpus. Our computed judgments account for 58% of the variance in a set of nearly two thousand imageability judgments, for words that span the entire range of imageability. The two factors account for 43% of the variance in lexical decision reaction times (LDRTs) that is attributable to imageability in a large database of 3697 LDRTs spanning the range of imageability. We document variances in the distribution of our measures across the range of imageability that suggest that they will account for more variance at the extremes, from which most imageability-manipulating stimulus sets are drawn. The two predictors account for 100% of the variance that is attributable to imageability in newly-collected LDRTs using a previously-published stimulus set of 100 items. We argue that our model of imageability is neurobiologically plausible by showing it is consistent with brain imaging data. The evidence we present suggests that behavioral effects in the lexical decision task that are usually attributed to the abstract/concrete distinction between words can be wholly explained by objective characteristics of the word that are not directly related to the semantic distinction. We provide computed imageability estimates for over 29,000 words. PMID:24421777
SARS and hospital priority setting: a qualitative case study and evaluation.
Bell, Jennifer A H; Hyland, Sylvia; DePellegrin, Tania; Upshur, Ross E G; Bernstein, Mark; Martin, Douglas K
2004-12-19
Priority setting is one of the most difficult issues facing hospitals because of funding restrictions and changing patient needs. A deadly communicable disease outbreak, such as the Severe Acute Respiratory Syndrome (SARS) in Toronto in 2003, amplifies the difficulties of hospital priority setting. The purpose of this study is to describe and evaluate priority setting in a hospital in response to SARS using the ethical framework 'accountability for reasonableness'. This study was conducted at a large tertiary hospital in Toronto, Canada. There were two data sources: 1) over 200 key documents (e.g. emails, bulletins), and 2) 35 interviews with key informants. Analysis used a modified thematic technique in three phases: open coding, axial coding, and evaluation. Participants described the types of priority setting decisions, the decision making process and the reasoning used. Although the hospital leadership made an effort to meet the conditions of 'accountability for reasonableness', they acknowledged that the decision making was not ideal. We described good practices and opportunities for improvement. 'Accountability for reasonableness' is a framework that can be used to guide fair priority setting in health care organizations, such as hospitals. In the midst of a crisis such as SARS where guidance is incomplete, consequences uncertain, and information constantly changing, where hour-by-hour decisions involve life and death, fairness is more important, not less.
Alfonso-Goldfarb, Ana Maria; Ferraz, Márcia Helena Mendes; Rattansi, Piyo M.
2015-01-01
In this paper we present three newly rediscovered documents from the Royal Society archives that, together with the four described in our previous publications, constitute a set on a cognate theme. The documents were written by, or addressed to, members of the early Royal Society, and their subject is several magisterial formulae, including J. B. van Helmont's alkahest and Ludus. In addition to the interest in those formulae as medicines for various grave illnesses, our analysis showed that some seventeenth-century scholars sought to explain operations of the animal body by invoking similar but natural substances, while attempting to assimilate the latest anatomical discoveries into a novel framework. The complete set of documents sheds some new light on the interests of seventeenth-century networks of scholars concerned with experimenta. PMID:26665488
Tangible interactive system for document browsing and visualisation of multimedia data
NASA Astrophysics Data System (ADS)
Rytsar, Yuriy; Voloshynovskiy, Sviatoslav; Koval, Oleksiy; Deguillaume, Frederic; Topak, Emre; Startchik, Sergei; Pun, Thierry
2006-01-01
In this paper we introduce and develop a framework for document interactive navigation in multimodal databases. First, we analyze the main open issues of existing multimodal interfaces and then discuss two applications that include interaction with documents in several human environments, i.e., the so-called smart rooms. Second, we propose a system set-up dedicated to the efficient navigation in the printed documents. This set-up is based on the fusion of data from several modalities that include images and text. Both modalities can be used as cover data for hidden indexes using data-hiding technologies as well as source data for robust visual hashing. The particularities of the proposed robust visual hashing are described in the paper. Finally, we address two practical applications of smart rooms for tourism and education and demonstrate the advantages of the proposed solution.
Full value documentation in the Czech Food Composition Database.
Machackova, M; Holasova, M; Maskova, E
2010-11-01
The aim of this project was to launch a new Food Composition Database (FCDB) Programme in the Czech Republic; to implement a methodology for food description and value documentation according to the standards designed by the European Food Information Resource (EuroFIR) Network of Excellence; and to start the compilation of a pilot FCDB. Foods for the initial data set were selected from the list of foods included in the Czech Food Consumption Basket. Selection of 24 priority components was based on the range of components used in former Czech tables. The priority list was extended with components for which original Czech analytical data or calculated data were available. Values that were input into the compiled database were documented according to the EuroFIR standards within the entities FOOD, COMPONENT, VALUE and REFERENCE using Excel sheets. Foods were described using the LanguaL Thesaurus. A template for documentation of data according to the EuroFIR standards was designed. The initial data set comprised documented data for 162 foods. Values were based on original Czech analytical data (available for traditional and fast foods, milk and milk products, wheat flour types), data derived from literature (for example, fruits, vegetables, nuts, legumes, eggs) and calculated data. The Czech FCDB programme has been successfully relaunched. Inclusion of the Czech data set into the EuroFIR eSearch facility confirmed compliance of the database format with the EuroFIR standards. Excel spreadsheets are applicable for full value documentation in the FCDB.
Biblio-MetReS: A bibliometric network reconstruction application and server
2011-01-01
Background Reconstruction of genes and/or protein networks from automated analysis of the literature is one of the current targets of text mining in biomedical research. Some user-friendly tools already perform this analysis on precompiled databases of abstracts of scientific papers. Other tools allow expert users to elaborate and analyze the full content of a corpus of scientific documents. However, to our knowledge, no user-friendly tool that simultaneously analyzes the latest set of scientific documents available online and reconstructs the set of genes referenced in those documents is available. Results This article presents such a tool, Biblio-MetReS, and compares its functioning and results to those of other user-friendly applications (iHOP, STRING) that are widely used. Under similar conditions, Biblio-MetReS creates networks that are comparable to those of other user-friendly tools. Furthermore, analysis of full-text documents provides more complete reconstructions than those that result from using only the abstract of the document. Conclusions Literature-based automated network reconstruction is still far from providing complete reconstructions of molecular networks. However, its value as an auxiliary tool is high and it will increase as standards for reporting biological entities and relationships become more widely accepted and enforced. Biblio-MetReS is an application that can be downloaded from http://metres.udl.cat/. It provides an easy-to-use environment for researchers to reconstruct their networks of interest from an always up-to-date set of scientific documents. PMID:21975133
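At its core, this kind of literature-based reconstruction amounts to counting which entities are mentioned together in the same documents. The sketch below shows only that co-occurrence step with invented gene names and document texts; it is not Biblio-MetReS itself, which additionally handles live document retrieval and statistical scoring.

```python
from collections import Counter
from itertools import combinations

# Illustrative co-occurrence network reconstruction (hypothetical documents and genes).
documents = [
    "TP53 interacts with MDM2 in the regulation of apoptosis",
    "MDM2 overexpression suppresses TP53 activity",
    "BRCA1 participates in DNA repair together with RAD51",
]
gene_names = {"TP53", "MDM2", "BRCA1", "RAD51"}

edge_counts = Counter()
for doc in documents:
    mentioned = sorted(g for g in gene_names if g in doc.split())
    # Every pair of genes co-mentioned in the same document adds one count to that edge.
    for a, b in combinations(mentioned, 2):
        edge_counts[(a, b)] += 1

# Keep edges supported by at least one document; real tools would apply
# statistical significance tests rather than a raw count threshold.
network = [(a, b, n) for (a, b), n in edge_counts.items() if n >= 1]
print(network)
```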
Indirect effects of domestic and wild herbivores on butterflies in an African savanna
Wilkerson, Marit L; Roche, Leslie M; Young, Truman P
2013-01-01
Indirect interactions driven by livestock and wild herbivores are increasingly recognized as important aspects of community dynamics in savannas and rangelands. Large ungulate herbivores can both directly and indirectly impact the reproductive structures of plants, which in turn can affect the pollinators of those plants. We examined how wild herbivores and cattle each indirectly affect the abundance of a common pollinator butterfly taxon, Colotis spp., at a set of long-term, large herbivore exclosure plots in a semiarid savanna in central Kenya. We also examined effects of herbivore exclusion on the main food plant of Colotis spp., which was also the most common flowering species in our plots: the shrub Cadaba farinosa. The study was conducted in four types of experimental plots: cattle-only, wildlife-only, cattle and wildlife (all large herbivores), and no large herbivores. Across all plots, Colotis spp. abundances were positively correlated with both Cadaba flower numbers (adult food resources) and total Cadaba canopy area (larval food resources). Structural equation modeling (SEM) revealed that floral resources drove the abundance of Colotis butterflies. Excluding browsing wildlife increased the abundances of both Cadaba flowers and Colotis butterflies. However, flower numbers and Colotis spp. abundances were greater in plots with cattle herbivory than in plots that excluded all large herbivores. Our results suggest that wild browsing herbivores can suppress pollinator species whereas well-managed cattle use may benefit important pollinators and the plants that depend on them. This study documents a novel set of ecological interactions that demonstrate how both conservation and livelihood goals can be met in a working landscape with abundant wildlife and livestock. PMID:24198932
ERIC Educational Resources Information Center
Stahl, Steven A.; And Others
To examine the effects of students reading multiple documents on their perceptions of a historical event, in this case the "discovery" of America by Christopher Columbus, 85 high school freshmen read 3 of 4 different texts (or sets of texts) dealing with Columbus. One text was an encyclopedia article, one a set of articles from…
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
This large document provides a catalog of the location of large numbers of reports pertaining to the charge of the Presidential Advisory Committee on Human Radiation Research and is arranged as a series of appendices. Titles of the appendices are Appendix A- Records at the Washington National Records Center Reviewed in Whole or Part by DoD Personnel or Advisory Committee Staff; Appendix B- Brief Descriptions of Records Accessions in the Advisory Committee on Human Radiation Experiments (ACHRE) Research Document Collection; Appendix C- Bibliography of Secondary Sources Used by ACHRE; Appendix D- Brief Descriptions of Human Radiation Experiments Identified by ACHRE, and Indexes; Appendix E- Documents Cited in the ACHRE Final Report and other Separately Described Materials from the ACHRE Document Collection; Appendix F- Schedule of Advisory Committee Meetings and Meeting Documentation; and Appendix G- Technology Note.
Goenka, Anu; Annamalai, Medeshni; Dhada, Barnesh; Stephen, Cindy R; McKerrow, Neil H; Patrick, Mark E
2014-04-01
We report on the impact of revisions made to an existing pro forma facilitating routine assessment and the management of paediatric HIV and tuberculosis (TB) in KwaZulu-Natal, South Africa. An initial documentation audit in 2010 assessed 25 sets of case notes for the documentation of 16 select indicators based on national HIV and TB guidelines. Using the findings of this initial audit, the existing case note pro forma was revised. The introduction of the revised pro forma was accompanied by training and a similar repeat audit was undertaken in 2012. This demonstrated an overall improvement in documentation. The three indicators that improved most were documentation of maternal HIV status, child's HIV status and child's TB risk assessment (all P < 0.001). This study suggests that tailor-made documentation pro formas may have an important role to play in improving record keeping in low-resource settings.
Automatic script identification from images using cluster-based templates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hochberg, J.; Kerns, L.; Kelly, P.
We have developed a technique for automatically identifying the script used to generate a document that is stored electronically in bit image form. Our approach differs from previous work in that the distinctions among scripts are discovered by an automatic learning procedure, without any hands-on analysis. We first develop a set of representative symbols (templates) for each script in our database (Cyrillic, Roman, etc.). We do this by identifying all textual symbols in a set of training documents, scaling each symbol to a fixed size, clustering similar symbols, pruning minor clusters, and finding each cluster's centroid. To identify a new document's script, we identify and scale a subset of symbols from the document and compare them to the templates for each script. We choose the script whose templates provide the best match. Our current system distinguishes among the Armenian, Burmese, Chinese, Cyrillic, Ethiopic, Greek, Hebrew, Japanese, Korean, Roman, and Thai scripts with over 90% accuracy.
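A compressed sketch of this template-matching idea follows. Random vectors stand in for scaled, binarized glyph bitmaps, and k-means centroids play the role of the per-script templates; the feature dimensions, cluster counts, and data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative template-matching sketch; real symbol images are replaced here by
# random feature vectors standing in for scaled, binarized glyph bitmaps.
rng = np.random.default_rng(0)

def build_templates(symbol_vectors, n_clusters=5):
    """Cluster a script's training symbols and return cluster centroids as templates."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(symbol_vectors)
    return km.cluster_centers_

# Hypothetical training data: one matrix of symbol vectors per script.
training = {
    "roman":    rng.normal(0.0, 1.0, size=(200, 64)),
    "cyrillic": rng.normal(0.5, 1.0, size=(200, 64)),
}
templates = {script: build_templates(v) for script, v in training.items()}

def identify_script(document_symbols, templates):
    """Pick the script whose templates best match the document's symbols on average."""
    scores = {}
    for script, cents in templates.items():
        dists = np.linalg.norm(document_symbols[:, None, :] - cents[None, :, :], axis=2)
        scores[script] = dists.min(axis=1).mean()  # mean distance to nearest template
    return min(scores, key=scores.get)

new_doc = rng.normal(0.5, 1.0, size=(50, 64))  # symbols from an unknown document
print(identify_script(new_doc, templates))
```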
ERIC Educational Resources Information Center
Farri, Oladimeji Feyisetan
2012-01-01
Large quantities of redundant clinical data are usually transferred from one clinical document to another, making the review of such documents cognitively burdensome and potentially error-prone. Inadequate designs of electronic health record (EHR) clinical document user interfaces probably contribute to the difficulties clinicians experience while…
Generating Hierarchical Document Indices from Common Denominators in Large Document Collections.
ERIC Educational Resources Information Center
O'Kane, Kevin C.
1996-01-01
Describes an algorithm for computer generation of hierarchical indexes for document collections. The resulting index, when presented with a graphical interface, provides users with a view of a document collection that permits general browsing and informal search activities via an access method that requires no keyboard entry or prior knowledge of…
NASA Technical Reports Server (NTRS)
1989-01-01
An overview of the five volume set of Information System Life-Cycle and Documentation Standards is provided with information on its use. The overview covers description, objectives, key definitions, structure and application of the standards, and document structure decisions. These standards were created to provide consistent NASA-wide structures for coordinating, controlling, and documenting the engineering of an information system (hardware, software, and operational procedures components) phase by phase.
Evidence for late Pliocene deglacial megafloods in the Gulf of Mexico
NASA Astrophysics Data System (ADS)
Wang, Z.; Gani, M. R.
2017-12-01
The paleoclimatic significance of giant sedimentary structures developed under unconfined Froude-supercritical sediment gravity flows in subaqueous settings is considerably under-examined. This research, for the first time, extensively documents >20-km-wide and 200-m-thick Plio-Pleistocene giant sediment waves in the northern Gulf of Mexico continental slope using 3D seismic data, showing waveform morphology in unprecedented detail. Published biostratigraphic data help constrain the geologic age of these deposits. The results of numerical and morphological analyses suggest that such large-scale bedforms were formed under sheet-like unconfined Froude-supercritical turbidity currents as cyclic steps. Paleohydraulic reconstruction (e.g., flow velocity, discharge, and unit flux), in association with other evidence like geologic age, published stable isotope records, and temporal rarity, points out that the responsible Froude-supercritical turbidity currents were most likely triggered by deglacial catastrophic outburst floods during the late Pliocene to early Pleistocene. Laurentide Ice Sheet outburst floods to the Gulf of Mexico have previously been documented based mainly on deep-sea cores during the last several interglacial episodes in the late Pleistocene. Our megaflood events constitute, by far, the oldest record of the glacial outburst floods during the Quaternary Ice Age anywhere in the world. This study suggests that such pervasive occurrence of large-scale sediment waves likely serves as a proxy for extreme events like catastrophic megafloods.
Forward Modeling of Large-scale Structure: An Open-source Approach with Halotools
NASA Astrophysics Data System (ADS)
Hearin, Andrew P.; Campbell, Duncan; Tollerud, Erik; Behroozi, Peter; Diemer, Benedikt; Goldbaum, Nathan J.; Jennings, Elise; Leauthaud, Alexie; Mao, Yao-Yuan; More, Surhud; Parejko, John; Sinha, Manodeep; Sipöcz, Brigitta; Zentner, Andrew
2017-11-01
We present the first stable release of Halotools (v0.2), a community-driven Python package designed to build and test models of the galaxy-halo connection. Halotools provides a modular platform for creating mock universes of galaxies starting from a catalog of dark matter halos obtained from a cosmological simulation. The package supports many of the common forms used to describe galaxy-halo models: the halo occupation distribution, the conditional luminosity function, abundance matching, and alternatives to these models that include effects such as environmental quenching or variable galaxy assembly bias. Satellite galaxies can be modeled to live in subhalos or to follow custom number density profiles within their halos, including spatial and/or velocity bias with respect to the dark matter profile. The package has an optimized toolkit to make mock observations on a synthetic galaxy population—including galaxy clustering, galaxy-galaxy lensing, galaxy group identification, RSD multipoles, void statistics, pairwise velocities and others—allowing direct comparison to observations. Halotools is object-oriented, enabling complex models to be built from a set of simple, interchangeable components, including those of your own creation. Halotools has an automated testing suite and is exhaustively documented on http://halotools.readthedocs.io, which includes quickstart guides, source code notes and a large collection of tutorials. The documentation is effectively an online textbook on how to build and study empirical models of galaxy formation with Python.
Archaeological predictive model set.
DOT National Transportation Integrated Search
2015-03-01
This report is the documentation for Task 7 of the Statewide Archaeological Predictive Model Set. The goal of this project is to develop a set of statewide predictive models to assist the planning of transportation projects. PennDOT is developing t...
Deep Question Answering for protein annotation
Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick
2015-01-01
Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/ PMID:26384372
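The GOCat step described here, proposing GO concepts from curated annotations of similar abstracts, can be sketched as a simple nearest-neighbour vote. The abstracts, GO identifiers, parameter values, and the TF-IDF similarity model below are illustrative assumptions, not the actual GOCat classifier.

```python
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical curated corpus: abstracts with GO annotations assigned by curators.
curated_abstracts = [
    "kinase activity regulating the cell cycle",
    "transporter mediating ion transport across the membrane",
    "dna binding transcription factor activity in the nucleus",
]
curated_go_terms = [
    ["GO:0016301", "GO:0007049"],
    ["GO:0006811"],
    ["GO:0003677", "GO:0003700"],
]

vectorizer = TfidfVectorizer()
curated_matrix = vectorizer.fit_transform(curated_abstracts)

def propose_go_terms(retrieved_abstract, k=2):
    """Propose GO concepts annotated on the k most similar curated abstracts."""
    query = vectorizer.transform([retrieved_abstract])
    sims = cosine_similarity(query, curated_matrix).ravel()
    votes = Counter()
    for idx in sims.argsort()[::-1][:k]:
        for term in curated_go_terms[idx]:
            votes[term] += sims[idx]  # weight each vote by abstract similarity
    return [term for term, _ in votes.most_common(3)]

print(propose_go_terms("a nuclear protein with dna binding activity"))
```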
Goldacre, Ben; Gray, Jonathan
2016-04-08
OpenTrials is a collaborative and open database for all available structured data and documents on all clinical trials, threaded together by individual trial. With a versatile and expandable data schema, it is initially designed to host and match the following documents and data for each trial: registry entries; links, abstracts, or texts of academic journal papers; portions of regulatory documents describing individual trials; structured data on methods and results extracted by systematic reviewers or other researchers; clinical study reports; and additional documents such as blank consent forms, blank case report forms, and protocols. The intention is to create an open, freely re-usable index of all such information and to increase discoverability, facilitate research, identify inconsistent data, enable audits on the availability and completeness of this information, support advocacy for better data and drive up standards around open data in evidence-based medicine. The project has phase I funding. This will allow us to create a practical data schema and populate the database initially through web-scraping, basic record linkage techniques, crowd-sourced curation around selected drug areas, and import of existing sources of structured data and documents. It will also allow us to create user-friendly web interfaces onto the data and conduct user engagement workshops to optimise the database and interface designs. Where other projects have set out to manually and perfectly curate a narrow range of information on a smaller number of trials, we aim to use a broader range of techniques and attempt to match a very large quantity of information on all trials. We are currently seeking feedback and additional sources of structured data.
SIRSALE: integrated video database management tools
NASA Astrophysics Data System (ADS)
Brunie, Lionel; Favory, Loic; Gelas, J. P.; Lefevre, Laurent; Mostefaoui, Ahmed; Nait-Abdesselam, F.
2002-07-01
Video databases became an active field of research during the last decade. The main objective in such systems is to provide users with capabilities to friendly search, access and playback distributed stored video data in the same way as they do for traditional distributed databases. Hence, such systems need to deal with hard issues: (a) video documents generate huge volumes of data and are time sensitive (streams must be delivered at a specific bitrate), (b) contents of video data are very hard to extract automatically and need to be annotated by humans. To cope with these issues, many approaches have been proposed in the literature including data models, query languages, video indexing etc. In this paper, we present SIRSALE: a set of video database management tools that allow users to manipulate video documents and streams stored in large distributed repositories. All the proposed tools are based on generic models that can be customized for specific applications using ad-hoc adaptation modules. More precisely, SIRSALE allows users to: (a) browse video documents by structures (sequences, scenes, shots) and (b) query the video database content by using a graphical tool, adapted to the nature of the target video documents. This paper also presents an annotating interface which allows archivists to describe the content of video documents. All these tools are coupled to a video player integrating remote VCR functionalities and are based on active network technology. So, we present how dedicated active services allow an optimized video transport for video streams (with Tamanoir active nodes). We then describe experiments of using SIRSALE on an archive of news video and soccer matches. The system has been demonstrated to professionals with a positive feedback. Finally, we discuss open issues and present some perspectives.
Deep Question Answering for protein annotation.
Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick
2015-01-01
Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/. © The Author(s) 2015. Published by Oxford University Press.
Reeves, Anthony P; Xie, Yiting; Liu, Shuang
2017-04-01
With the advent of fully automated image analysis and modern machine learning methods, there is a need for very large image datasets having documented segmentations for both computer algorithm training and evaluation. This paper presents a method and implementation for facilitating such datasets that addresses the critical issue of size scaling for algorithm validation and evaluation; current evaluation methods that are usually used in academic studies do not scale to large datasets. This method includes protocols for the documentation of many regions in very large image datasets; the documentation may be incrementally updated by new image data and by improved algorithm outcomes. This method has been used for 5 years in the context of chest health biomarkers from low-dose chest CT images that are now being used with increasing frequency in lung cancer screening practice. The lung scans are segmented into over 100 different anatomical regions, and the method has been applied to a dataset of over 20,000 chest CT images. Using this framework, the computer algorithms have been developed to achieve over 90% acceptable image segmentation on the complete dataset.
12 CFR 19.24 - Scope of document discovery.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 12 Banks and Banking 1 2010-01-01 2010-01-01 false Scope of document discovery. 19.24 Section 19... PROCEDURE Uniform Rules of Practice and Procedure § 19.24 Scope of document discovery. (a) Limits on discovery. (1) Subject to the limitations set out in paragraphs (b), (c), and (d) of this section, a party...
The Role of CLEAR Thinking in Learning Science from Multiple-Document Inquiry Tasks
ERIC Educational Resources Information Center
Griffin, Thomas D.; Wiley, Jennifer; Britt, M. Anne; Salas, Carlos R.
2012-01-01
The main goal for the current study was to investigate whether individual differences in domain-general thinking dispositions might affect learning from multiple-document inquiry tasks in science. Middle school students were given a set of documents and were tasked with understanding how and why recent patterns in global temperature might be…
GAISEing into the Common Core of Statistics
ERIC Educational Resources Information Center
Groth, Randall E.; Bargagliotti, Anna E.
2012-01-01
In education, it is common to set aside older curriculum documents when newer ones are released. In fact, some instructional leaders have encouraged the "out with the old, in with the new" process by asking teachers to turn in all copies of the older document. Doing so makes sense when the old curriculum document is incompatible with the new.…
Business Documents Don't Have to Be Boring
ERIC Educational Resources Information Center
Schultz, Benjamin
2006-01-01
With business documents, visuals can serve to enhance the written word in conveying the message. Images can be especially effective when used subtly, on part of the page, on successive pages to provide continuity, or even set as watermarks over the entire page. A main reason given for traditional text-only business documents is that they are…
37 CFR 1.451 - The priority claim and priority document in an international application.
Code of Federal Regulations, 2010 CFR
2010-07-01
... set forth in § 1.19(b)(1). (c) If a certified copy of the priority document is not submitted together... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false The priority claim and priority document in an international application. 1.451 Section 1.451 Patents, Trademarks, and Copyrights...
Sentiment analysis of feature ranking methods for classification accuracy
NASA Astrophysics Data System (ADS)
Joseph, Shashank; Mugauri, Calvin; Sumathy, S.
2017-11-01
Text pre-processing and feature selection are important and critical steps in text mining. Text pre-processing of large volumes of datasets is a difficult task as unstructured raw data is converted into a structured format. Traditional methods of processing and weighting took much time and were less accurate. To overcome this challenge, feature ranking techniques have been devised. A feature set from text preprocessing is fed as input for feature selection. Feature selection helps improve text classification accuracy. Of the three feature selection categories available, the filter category will be the focus. Five feature ranking methods, namely document frequency, standard deviation, information gain, chi-square, and weighted log-likelihood ratio, are analyzed.
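Two of the filter-style rankings named above, document frequency and chi-square, are easy to show concretely. The sketch below uses scikit-learn on an invented four-document corpus; the labels, texts, and the choice of top-3 terms are illustrative assumptions rather than the paper's experimental setup.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

# Toy labelled corpus (invented) illustrating filter-style feature ranking.
docs = [
    "the battery life of this phone is excellent",
    "terrible battery and a slow screen",
    "excellent camera and great screen",
    "slow shipping and terrible support",
]
labels = np.array([1, 0, 1, 0])  # 1 = positive sentiment, 0 = negative

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)
terms = np.array(vectorizer.get_feature_names_out())

# Document frequency: number of documents containing each term.
doc_freq = np.asarray((X > 0).sum(axis=0)).ravel()

# Chi-square score of each term against the class labels.
chi2_scores, _ = chi2(X, labels)

for name, scores in [("document frequency", doc_freq), ("chi-square", chi2_scores)]:
    top = terms[np.argsort(scores)[::-1][:3]]
    print(name, "top terms:", list(top))
```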
NASA Technical Reports Server (NTRS)
Schroeder, Lyle C.; Sweet, Jon L.
1987-01-01
A large data base of Seasat-A Satellite Scatterometer (SASS) measurements merged with high-quality surface-truth wind, wave, and temperature data has been documented. The data base was developed for all times when selected in situ measurement sites were within the SASS footprint. Data were obtained from 42 sites located in the coastal waters of North America, Australia, Western Europe, and Japan and were assembled by correlating the SASS and surface-truth measurements in both time and distance. These data have been archived on a set of nine-track 6250 bpi ASCII coded magnetic tapes, which are available from the National Technical Information Service.
Lithologic discrimination and alteration mapping from AVIRIS Data, Socorro, New Mexico
NASA Technical Reports Server (NTRS)
Beratan, K. K.; Delillo, N.; Jacobson, A.; Blom, R.; Chapin, C. E.
1993-01-01
Geologic maps are, by their very nature, interpretive documents. In contrast, images prepared from AVIRIS data can be used as uninterpreted, and thus unbiased, geologic maps. We have had significant success applying AVIRIS data in this non-quantitative manner to geologic problems. Much of our success has come from the power of the Linked Windows Interactive Data System. LinkWinds is a visual data analysis and exploration system under development at JPL which is designed to rapidly and interactively investigate large multivariate data sets. In this paper, we present information on the analysis technique, and preliminary results from research on potassium metasomatism, a distinctive and structurally significant type of alteration associated with crustal extension.
Objectives and metrics for wildlife monitoring
Sauer, J.R.; Knutson, M.G.
2008-01-01
Monitoring surveys allow managers to document system status and provide the quantitative basis for management decision-making, and large amounts of effort and funding are devoted to monitoring. Still, monitoring surveys often fall short of providing required information; inadequacies exist in survey designs, analysis procedures, or in the ability to integrate the information into an appropriate evaluation of management actions. We describe current uses of monitoring data, provide our perspective on the value and limitations of current approaches to monitoring, and set the stage for three papers that discuss current goals and implementation of monitoring programs. These papers were derived from presentations at a symposium at The Wildlife Society's 13th Annual Conference in Anchorage, Alaska, USA (2006).
Project ACE Activity Sets. Book II: Grades 6 and 7.
ERIC Educational Resources Information Center
Eden City Schools, NC.
The document contains eight activity sets suitable for grades 6 and 7. Topics focus on governmental, social, and educational systems in foreign countries. Each activity set contains background reading materials, resources, concepts, general objectives, and instructional objectives. Grade 6 sets are "Soviet Youth Organizations," "How…
Henderson, Joanna; Milligan, Karen; Niccols, Alison; Thabane, Lehana; Sword, Wendy; Smith, Ainsley; Rosenkranz, Susan
2012-12-07
Implementation of evidence-based practices in real-world settings is a complex process impacted by many factors, including intervention, dissemination, service provider, and organizational characteristics. Efforts to improve knowledge translation have resulted in greater attention to these factors. Researcher attention to the applicability of findings to applied settings also has increased. Much less attention, however, has been paid to intervention feasibility, an issue important to applied settings. In a systematic review of 121 documents regarding integrated treatment programs for women with substance abuse issues and their children, we examined the presence of feasibility-related information. Specifically, we analysed study descriptions for information regarding feasibility factors in six domains (intervention, practitioner, client, service delivery, organizational, and service system). On average, fewer than half of the 25 feasibility details assessed were included in the documents. Most documents included some information describing the participating clients, the services offered as part of the intervention, the location of services, and the expected length of stay or number of sessions. Only approximately half of the documents included specific information about the treatment model. Few documents indicated whether the intervention was manualized or whether the intervention was preceded by a standardized screening or assessment process. Very few provided information about the core intervention features versus the features open to local adaptation, or the staff experience or training required to deliver the intervention. As has been found in reviews of intervention studies in other fields, our findings revealed that most documents provide some client and intervention information, but few documents provided sufficient information to fully evaluate feasibility. We consider possible explanations for the paucity of feasibility information and provide suggestions for better reporting to promote diffusion of evidence-based practices.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tang, Q.; Xie, S.
This report describes the Atmospheric Radiation Measurement (ARM) Best Estimate (ARMBE) station-based surface data (ARMBESTNS) value-added product. It is a twin data product of the ARMBE 2-Dimensional gridded (ARMBE2DGRID) data set. Unlike the ARMBE2DGRID data set, ARMBESTNS data are reported at the original site locations and show the original information (except for the interpolation over time). Therefore, the users have the flexibility to process the data with the approach most suitable for their applications. This document provides information about the input data, quality control (QC) method, and output format of this data set. As much of the information is identical to that of the ARMBE2DGRID data, this document will emphasize the differences between the two data sets.
Potential for water salvage by removal of non-native woody vegetation from dryland river systems
Doody, T.M.; Nagler, P.L.; Glenn, E.P.; Moore, G.W.; Morino, K.; Hultine, K.R.; Benyon, R.G.
2011-01-01
Globally, expansion of non-native woody vegetation across floodplains has raised concern of increased evapotranspiration (ET) water loss with consequent reduced river flows and groundwater supplies. Water salvage programs, established to meet water supply demands by removing introduced species, show little documented evidence of program effectiveness. We use two case studies in the USA and Australia to illustrate factors that contribute to water salvage feasibility for a given ecological setting. In the USA, saltcedar (Tamarix spp.) has become widespread on western rivers, with water salvage programs attempted over a 50-year period. Some studies document riparian transpiration or ET reduction after saltcedar removal, but detectable increases in river base flow are not conclusively shown. Furthermore, measurements of riparian vegetation ET in natural settings show saltcedar ET overlaps the range measured for native riparian species, thereby constraining the possibility of water salvage by replacing saltcedar with native vegetation. In Australia, introduced willows (Salix spp.) have become widespread in riparian systems in the Murray-Darling Basin. Although large-scale removal projects have been undertaken, no attempts have been made to quantify increases in base flows. Recent studies of ET indicate that willows growing in permanently inundated stream beds have high transpiration rates, indicating water savings could be achieved from removal. In contrast, native Eucalyptus trees and willows growing on stream banks show similar ET rates with no net water salvage from replacing willows with native trees. We conclude that water salvage feasibility is highly dependent on the ecohydrological setting in which the non-native trees occur. We provide an overview of conditions favorable to water salvage. Copyright © 2011 John Wiley & Sons, Ltd.
A multi-ontology approach to annotate scientific documents based on a modularization technique.
Gomes, Priscilla Corrêa E Castro; Moura, Ana Maria de Carvalho; Cavalcanti, Maria Cláudia
2015-12-01
Scientific text annotation has become an important task for biomedical scientists. Nowadays, there is an increasing need for the development of intelligent systems to support new scientific findings. Public databases available on the Web provide useful data, but much more useful information is only accessible in scientific texts. Text annotation may help as it relies on the use of ontologies to maintain annotations based on a uniform vocabulary. However, it is difficult to use an ontology, especially those that cover a large domain. In addition, since scientific texts explore multiple domains, which are covered by distinct ontologies, it becomes even more difficult to deal with such a task. Moreover, there are dozens of ontologies in the biomedical area, and they are usually big in terms of the number of concepts. It is in this context that ontology modularization can be useful. This work presents an approach to annotate scientific documents using modules of different ontologies, which are built according to a module extraction technique. The main idea is to analyze a set of single-ontology annotations on a text to find out the user interests. Based on these annotations a set of modules are extracted from a set of distinct ontologies, and are made available for the user, for complementary annotation. The reduced size and focus of the extracted modules tend to facilitate the annotation task. An experiment was conducted to evaluate this approach, with the participation of a bioinformatician specialist of the Laboratory of Peptides and Proteins of the IOC/Fiocruz, who was interested in discovering new drug targets aimed at combating tropical diseases. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Schnyder, Jara S. D.; Jo, Andrew; Eberli, Gregor P.; Betzler, Christian; Lindhorst, Sebastian; Schiebel, Linda; Hebbeln, Dierk; Wintersteller, Paul; Mulder, Thierry; Principaud, Melanie
2014-05-01
An approximately 5000 km2 hydroacoustic and seismic data set provides a high-resolution bathymetry map along the western slope of Great Bahama Bank, the world's largest isolated carbonate platform. This large data set, in combination with core and sediment samples, provides an unprecedented insight into the variability of carbonate slope morphology and the processes affecting the platform margin and the slope. This complete dataset documents how the interplay of platform-derived sedimentation, distribution by ocean currents, and local slope and margin failure produces a slope-parallel facies distribution that is not governed by downslope gradients. Platform-derived sediments produce a basinward thinning sediment wedge that is modified by currents that change directions and strength depending on water depth and location. As a result, winnowing and deposition change with water depth and distance from the margin. Morphological features like the plunge pool and migrating antidunes are the result of currents flowing from the banktop, while the ocean currents produce contourites and drifts. These continuous processes are punctuated by submarine slope failures of various sizes. The largest of these slope failures produce several hundred km2 of mass transport complexes and could generate tsunamis. Closer to the Cuban fold and thrust belt, large margin collapses pose an equal threat for tsunami generation. However, the debris from margin and slope failure is the foundation for a teeming community of cold-water corals.
Fault Tolerance in Critical Information Systems
2001-05-01
that provides an integrated editing and analysis environment through the use of the Adobe FrameMaker document processor [1] and the Z/Eves theorem... FrameMaker document processor providing the special character set for Z just as it would any other character set (such as mathematical symbols). Zeus...happens to use the LaTeX Z language definition, so Zeus processes the FrameMaker specification and outputs the LaTeX translation to Z/Eves for
Shareholder value and the performance of a large nursing home chain.
Kitchener, Martin; O'Meara, Janis; Brody, Ab; Lee, Hyang Yuol; Harrington, Charlene
2008-06-01
To analyze corporate governance arrangements and quality and financial performance outcomes among large multi-facility nursing home corporations (chains) that pursue shareholder value (profit maximization) strategies. To establish a foundation of knowledge about the focal phenomenon and processes, we conducted an historical (1993-2005) case study of one of the largest chains (Sun Healthcare Inc.) that triangulated qualitative and quantitative data sources. Two main sets of information were compared: (1) corporate sources including Sun's Securities and Exchange Commission (SEC) Form 10-K annual reports, industry financial reports, and the business press; and (2) external sources including legal documents, press reports, and publicly available California facility cost reports and quality data. Shareholder value was pursued at Sun through three inter-linked strategies: (1) rapid growth through debt-financed mergers; (2) labor cost constraint through low nurse staffing levels; and (3) a model of corporate governance that views sanctions for fraud and poor quality as a cost of business. Study findings and evidence from other large nursing home chains underscore calls from the Institute of Medicine and other bodies for extended oversight of the corporate governance and performance of large nursing home chains.
Shareholder Value and the Performance of a Large Nursing Home Chain
Kitchener, Martin; O'Meara, Janis; Brody, Ab; Lee, Hyang Yuol; Harrington, Charlene
2008-01-01
Objective To analyze corporate governance arrangements and quality and financial performance outcomes among large multi-facility nursing home corporations (chains) that pursue shareholder value (profit maximization) strategies. Study Design To establish a foundation of knowledge about the focal phenomenon and processes, we conducted an historical (1993–2005) case study of one of the largest chains (Sun Healthcare Inc.) that triangulated qualitative and quantitative data sources. Data Sources Two main sets of information were compared: (1) corporate sources including Sun's Securities and Exchange Commission (SEC) Form 10-K annual reports, industry financial reports, and the business press; and (2) external sources including legal documents, press reports, and publicly available California facility cost reports and quality data. Principal Findings Shareholder value was pursued at Sun through three inter-linked strategies: (1) rapid growth through debt-financed mergers; (2) labor cost constraint through low nurse staffing levels; and (3) a model of corporate governance that views sanctions for fraud and poor quality as a cost of business. Conclusions Study findings and evidence from other large nursing home chains underscore calls from the Institute of Medicine and other bodies for extended oversight of the corporate governance and performance of large nursing home chains. PMID:18454781
Collaborative visual analytics of radio surveys in the Big Data era
NASA Astrophysics Data System (ADS)
Vohl, Dany; Fluke, Christopher J.; Hassan, Amr H.; Barnes, David G.; Kilborn, Virginia A.
2017-06-01
Radio survey datasets comprise an increasing number of individual observations stored as sets of multidimensional data. In large survey projects, astronomers commonly face limitations regarding: 1) interactive visual analytics of sufficiently large subsets of data; 2) synchronous and asynchronous collaboration; and 3) documentation of the discovery workflow. To support collaborative data inquiry, we present encube, a large-scale comparative visual analytics framework. encube can utilise advanced visualization environments such as the CAVE2 (a hybrid 2D and 3D virtual reality environment powered with a 100 Tflop/s GPU-based supercomputer and 84 million pixels) for collaborative analysis of large subsets of data from radio surveys. It can also run on standard desktops, providing a capable visual analytics experience across the display ecology. encube is composed of four primary units enabling compute-intensive processing, advanced visualisation, dynamic interaction, parallel data query, along with data management. Its modularity will make it simple to incorporate astronomical analysis packages and Virtual Observatory capabilities developed within our community. We discuss how encube builds a bridge between high-end display systems (such as CAVE2) and the classical desktop, preserving all traces of the work completed on either platform - allowing the research process to continue wherever you are.
Simpson, James J.; Hufford, Gary L.; Daly, Christopher; Berg, Jared S.; Fleming, Michael D.
2005-01-01
Maps of mean monthly surface temperature and precipitation for Alaska and adjacent areas of Canada, produced by Oregon State University's Spatial Climate Analysis Service (SCAS) and the Alaska Geospatial Data Clearinghouse (AGDC), were analyzed. Because both sets of maps are generally available and in use by the community, there is a need to document differences between the processes and input data sets used by the two groups to produce their respective set of maps and to identify similarities and differences between the two sets of maps and possible reasons for the differences. These differences do not affect the observed large-scale patterns of seasonal and annual variability. Alaska is divided into interior and coastal zones, with consistent but different variability, separated by a transition region. The transition region has high interannual variability but low long-term mean variability. Both data sets support the four major ecosystems and ecosystem transition zone identified in our earlier work. Differences between the two sets of maps do occur, however, on the regional scale; they reflect differences in physiographic domains and in the treatment of these domains by the two groups (AGDC, SCAS). These differences also provide guidance for an improved observational network for Alaska. On the basis of validation with independent in situ data, we conclude that the data set produced by SCAS provides the best spatial coverage of Alaskan long-term mean monthly surface temperature and precipitation currently available. © The Arctic Institute of North America.
NASA Astrophysics Data System (ADS)
Kim, Younghoon; Cai, Ling; Usher, Timothy; Jiang, Qing
2009-09-01
This paper documents an experimental and theoretical investigation into characterizing the mechanical configurations and performances of THUNDER actuators, a type of piezoelectric actuator known for their large actuation displacements, through fabrication, measurements and finite element analysis. Five groups of such actuators with different dimensions were fabricated using identical fabrication parameters. The as-fabricated arched configurations, resulting from the thermo-mechanical mismatch among the constituent layers, and their actuation performances were characterized using an experimental set-up based on a laser displacement sensor and through numerical simulations with ANSYS, a widely used commercial software program for finite element analysis. This investigation shows that the presence of large residual stresses within the piezoelectric ceramic layer, built up during the fabrication process, leads to significant nonlinear electromechanical coupling in the actuator response to the driving electric voltage, and it is this nonlinear coupling that is responsible for the large actuation displacements. Furthermore, the severity of the residual stresses, and thus the nonlinearity, increases with increasing substrate/piezoelectric thickness ratio and, to a lesser extent, with decreasing in-plane dimensions of the piezoelectric layer.
Liu, Ying; Lita, Lucian Vlad; Niculescu, Radu Stefan; Mitra, Prasenjit; Giles, C Lee
2008-11-06
Owing to new advances in computer hardware, large text databases have become more prevalent than ever. Automatically mining information from these databases proves to be a challenge due to slow pattern/string matching techniques. In this paper we present a new, fast multi-string pattern matching method based on the well-known Aho-Corasick algorithm. Advantages of our algorithm include: the ability to exploit the natural structure of text, the ability to perform significant character shifting, avoiding backtracking jumps that are not useful, efficiency in terms of matching time and avoiding the typical "sub-string" false positive errors. Our algorithm is applicable to many fields with free text, such as the health care domain and the scientific document field. In this paper, we apply the BSS algorithm to health care data and mine hundreds of thousands of medical concepts from a large Electronic Medical Record (EMR) corpus simultaneously and efficiently. Experimental results show the superiority of our algorithm when compared with top-of-the-line multi-string matching algorithms.
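Since the method builds on Aho-Corasick, a minimal textbook version of that automaton is sketched below to show what simultaneous multi-pattern matching looks like. This is a generic illustration, not the BSS variant described in the abstract, and the concept list and clinical note are invented.

```python
from collections import deque

# Minimal Aho-Corasick automaton for simultaneous multi-pattern matching.
# Generic textbook implementation for illustration only; patterns and text are invented.
class AhoCorasick:
    def __init__(self, patterns):
        self.goto = [{}]     # trie transitions per state
        self.fail = [0]      # failure links per state
        self.output = [[]]   # patterns ending at each state
        for p in patterns:
            self._add(p)
        self._build_failure_links()

    def _add(self, pattern):
        state = 0
        for ch in pattern:
            if ch not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.output.append([])
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.output[state].append(pattern)

    def _build_failure_links(self):
        # Breadth-first traversal so a state's failure target is finished before its children.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                self.output[nxt] += self.output[self.fail[nxt]]

    def search(self, text):
        """Yield (start_index, pattern) for every pattern occurrence in text."""
        state = 0
        for i, ch in enumerate(text):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for pattern in self.output[state]:
                yield i - len(pattern) + 1, pattern

concepts = ["diabetes", "hypertension", "beta blocker"]
note = "patient with hypertension started on a beta blocker; no diabetes"
print(list(AhoCorasick(concepts).search(note)))
```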
Standardized development of computer software. Part 2: Standards
NASA Technical Reports Server (NTRS)
Tausworthe, R. C.
1978-01-01
This monograph contains standards for software development and engineering. The book sets forth rules for design, specification, coding, testing, documentation, and quality assurance audits of software; it also contains detailed outlines for the documentation to be produced.
Supporting Air and Space Expeditionary Forces: Analysis of Combat Support Basing Options
2004-01-01
... Brooke et al., 2003. For more information on Set Covering models, see Daskin, 1995. ... Transportation Model. A detailed...
Information system life-cycle and documentation standards, volume 1
NASA Technical Reports Server (NTRS)
Callender, E. David; Steinbacher, Jody
1989-01-01
The Software Management and Assurance Program (SMAP) Information System Life-Cycle and Documentation Standards Document describes the Version 4 standard information system life-cycle in terms of processes, products, and reviews. The description of the products includes detailed documentation standards. The standards in this document set can be applied to the life-cycle, i.e., to each phase in the system's development, and to the documentation of all NASA information systems. This provides consistency across the agency as well as visibility into the completeness of the information recorded. An information system is software-intensive, but consists of any combination of software, hardware, and operational procedures required to process, store, or transmit data. This document defines a standard life-cycle model and content for associated documentation.
McCarty, L Kelsey; Saddawi-Konefka, Daniel; Gargan, Lauren M; Driscoll, William D; Walsh, John L; Peterfreund, Robert A
2014-12-01
Process improvement in healthcare delivery settings can be difficult, even when there is consensus among clinicians about a clinical practice or desired outcome. Airway management is a medical intervention fundamental to the delivery of anesthesia care. Like other medical interventions, a detailed description of the management methods should be documented. Despite this expectation, airway documentation is often insufficient. The authors hypothesized that formal adoption of process improvement methods could be used to increase the rate of "complete" airway management documentation. The authors defined a set of criteria as a local practice standard of "complete" airway management documentation. The authors then employed selected process improvement methodologies over 13 months in three iterative and escalating phases to increase the percentage of records with complete documentation. The criteria were applied retrospectively to determine the baseline frequency of complete records, and prospectively to measure the impact of process improvements efforts over the three phases of implementation. Immediately before the initial intervention, a retrospective review of 23,011 general anesthesia cases over 6 months showed that 13.2% of patient records included complete documentation. At the conclusion of the 13-month improvement effort, documentation improved to a completion rate of 91.6% (P<0.0001). During the subsequent 21 months, the completion rate was sustained at an average of 90.7% (SD, 0.9%) across 82,571 general anesthetic records. Systematic application of process improvement methodologies can improve airway documentation and may be similarly effective in improving other areas of anesthesia clinical practice.
NASA Astrophysics Data System (ADS)
Zakaria, Chahnez; Curé, Olivier; Salzano, Gabriella; Smaïli, Kamel
In Computer Supported Cooperative Work (CSCW), it is crucial for project leaders to detect conflicting situations as early as possible. Generally, this task is performed manually by studying a set of documents exchanged between team members. In this paper, we propose a full-fledged automatic solution that identifies documents, subjects and actors involved in relational conflicts. Our approach detects conflicts in emails, probably the most popular type of documents in CSCW, but the methods used can handle other text-based documents. These methods rely on the combination of statistical and ontological operations. The proposed solution is decomposed into several steps: (i) we enrich a simple negative emotion ontology with terms occurring in the corpus of emails, (ii) we categorize each conflicting email according to the concepts of this ontology, and (iii) we identify emails, subjects and team members involved in conflicting emails using possibilistic description logic and a set of proposed measures. Each of these steps is evaluated and validated on concrete examples. Moreover, this approach's framework is generic and can be easily adapted to domains other than conflicts, e.g. security issues, and extended with operations making use of our proposed set of measures.
ERIC Educational Resources Information Center
Pankau, Brian L.
2009-01-01
This empirical study evaluates the document category prediction effectiveness of Naive Bayes (NB) and K-Nearest Neighbor (KNN) classifier treatments built from different feature selection and machine learning settings and trained and tested against textual corpora of 2300 Gang-Of-Four (GOF) design pattern documents. Analysis of the experiment's…
Cognitive remediation in large systems of psychiatric care.
Medalia, Alice; Saperstein, Alice M; Erlich, Matthew D; Sederer, Lloyd I
2018-05-02
Introduction: With the increasing enthusiasm to provide cognitive remediation (CR) as an evidence-based practice, questions arise as to what is involved in implementing CR in a large system of care. This article describes the first statewide implementation of CR in the USA, with the goal of documenting the implementation issues that care providers are likely to face when bringing CR services to their patients. In 2014, the New York State Office of Mental Health set up a Cognitive Health Service that could be implemented throughout the state-operated system of care. This service was intended to broadly address cognitive health, to assure that the cognitive deficits commonly associated with psychiatric illnesses are recognized and addressed, and that cognitive health is embedded in the vocabulary of wellness. It involved creating a mechanism to train staff to recognize how cognitive health could be prioritized in treatment planning as well as implementing CR in state-operated adult outpatient psychiatry clinics. By 2017, CR was available at clinics serving people with serious mental illness in 13 of 16 adult Psychiatric Centers, located in rural and urban settings throughout New York state. The embedded quality assurance program evaluation tools indicated that CR was acceptable, sustainable, and effective. Cognitive remediation can be feasibly implemented in large systems of care that provide a multilevel system of supports, a training program that educates broadly about cognitive health and specifically about the delivery of CR, and embedded, ongoing program evaluation that is linked to staff supervision.
Indexing and retrieving DICOM data in disperse and unstructured archives.
Costa, Carlos; Freitas, Filipe; Pereira, Marco; Silva, Augusto; Oliveira, José L
2009-01-01
This paper proposes an indexing and retrieval solution to gather information from distributed DICOM documents by allowing searches and access to the virtual data repository using a Google-like process. The medical imaging modalities are becoming more powerful and less expensive. The result is the proliferation of equipment acquisition by imaging centers, including the small ones. With this dispersion of data, it is not easy to take advantage of all the information that can be retrieved from these studies. Furthermore, many of these small centers do not have large enough requirements to justify the acquisition of a traditional PACS. We propose a peer-to-peer PACS platform that indexes and queries DICOM files over a set of distributed repositories, logically viewed as a single federated unit. The solution is based on a public domain document-indexing engine and extends traditional PACS query and retrieval mechanisms. This proposal deals well with complex searching requirements, from a single desktop environment to distributed scenarios. The solution's performance and robustness were demonstrated in trials. The characteristics of the presented PACS platform make it particularly important for small institutions, including educational and research groups.
Impacts of EHR Certification and Meaningful Use Implementation on an Integrated Delivery Network.
Bowes, Watson A
2014-01-01
Three years ago Intermountain Healthcare made the decision to participate in the Medicare and Medicaid Electronic Health Record (EHR) Incentive Program, which required that hospitals and providers use a certified EHR in a meaningful way. At that time, the barriers to enhancing our homegrown system and changing clinician workflows were numerous and large. This paper describes the time and effort required to enhance our legacy systems in order to pass certification, including filling 47 gaps in EHR functionality. We also describe the processes and resources that resulted in successful changes to many clinical workflows required by clinicians to meet meaningful use requirements. In 2011 we set meaningful use targets of 75% of employed physicians and 75% of our hospitals to meet Stage 1 of meaningful use by 2013. By the end of 2013, 87% of 696 employed eligible professionals and 100% of 22 Intermountain hospitals had successfully attested for Stage 1. This paper describes documented and perceived costs to Intermountain, including time, effort, resources, and postponement of other projects, as well as documented and perceived benefits of attainment of meaningful use.
Algorithms and programming tools for image processing on the MPP, part 2
NASA Technical Reports Server (NTRS)
Reeves, Anthony P.
1986-01-01
A number of algorithms were developed for image warping and pyramid image filtering. Techniques were investigated for the parallel processing of a large number of independent irregular shaped regions on the MPP. In addition some utilities for dealing with very long vectors and for sorting were developed. Documentation pages for the algorithms which are available for distribution are given. The performance of the MPP for a number of basic data manipulations was determined. From these results it is possible to predict the efficiency of the MPP for a number of algorithms and applications. The Parallel Pascal development system, which is a portable programming environment for the MPP, was improved and better documentation including a tutorial was written. This environment allows programs for the MPP to be developed on any conventional computer system; it consists of a set of system programs and a library of general purpose Parallel Pascal functions. The algorithms were tested on the MPP and a presentation on the development system was made to the MPP users group. The UNIX version of the Parallel Pascal System was distributed to a number of new sites.
CCD imaging technology and the war on crime
NASA Astrophysics Data System (ADS)
McNeill, Glenn E.
1992-08-01
Linear array based CCD technology has been successfully used in the development of an Automatic Currency Reader/Comparator (ACR/C) system. The ACR/C system is designed to provide a method for tracking US currency in the organized crime and drug trafficking environments where large amounts of cash are involved in illegal transactions and money laundering activities. United States currency notes can be uniquely identified by the combination of the denomination, serial number, and series year. The ACR/C system processes notes at five notes per second using a custom transport, a stationary linear array, and optical character recognition (OCR) techniques to make such identifications. In this way large sums of money can be "marked" (using the system to read and store their identifiers) and then circulated within various crime networks. The system can later be used to read and compare confiscated notes to the known sets of identifiers from the "marked" set to document a trail of criminal activities. With the ACR/C, law enforcement agencies can efficiently identify currency without actually marking it. This provides an undetectable means for making each note individually traceable and facilitates record keeping for providing evidence in a court of law. In addition, when multiple systems are used in conjunction with a central database, the system can be used to track currency geographically.
Twin Cities Metro Freight Initiative : Report on Peer Best Practices
DOT National Transportation Integrated Search
2011-06-30
This document reports on key findings compiled from two sets of conversations with eight freight peers at state Departments of Transportation (DOTs) and metropolitan planning organizations (MPOs) around the country. This document responds to stated i...
Restoration of Apollo Data by the Lunar Data Project/PDS Lunar Data Node: An Update
NASA Technical Reports Server (NTRS)
Williams, David R.; Hills, H. Kent; Taylor, Patrick T.; Grayzeck, Edwin J.; Guinness, Edward A.
2016-01-01
The Apollo 11, 12, and 14 through 17 missions orbited and landed on the Moon, carrying scientific instruments that returned data from all phases of the missions, including long-lived Apollo Lunar Surface Experiments Packages (ALSEPs) deployed by the astronauts on the lunar surface. Much of these data were never archived, and some of the archived data were on media and in formats that are outmoded, or were deposited with little or no useful documentation to aid outside users. This is particularly true of the ALSEP data returned autonomously for many years after the Apollo missions ended. The purpose of the Lunar Data Project and the Planetary Data System (PDS) Lunar Data Node is to take data collections already archived at the NASA Space Science Data Coordinated Archive (NSSDCA) and prepare them for archiving through PDS, and to locate lunar data that were never archived, bring them into NSSDCA, and then archive them through PDS. Preparing these data for archiving involves reading the data from the original media, be it magnetic tape, microfilm, microfiche, or hard-copy document, converting the outmoded, often binary, formats when necessary, putting them into a standard digital form accepted by PDS, collecting the necessary ancillary data and documentation (metadata) to ensure that the data are usable and well-described, summarizing the metadata in documentation to be included in the data set, adding other information such as references, mission and instrument descriptions, contact information, and related documentation, and packaging the results in a PDS-compliant data set. The data set is then validated and reviewed by a group of external scientists as part of the PDS final archive process. We present a status report on some of the data sets that we are processing.
ERIC Educational Resources Information Center
Moffat, Alistair; And Others
1994-01-01
Describes an approximate document ranking process that uses a compact array of in-memory, low-precision approximations for document length. Combined with another rule for reducing the memory required by partial similarity accumulators, the approximation heuristic allows the ranking of large document collections using less than one byte of memory…
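A toy sketch of the general idea described above, under the assumption that document lengths are quantized to a single byte on a geometric scale and then used to normalize similarity accumulators; this illustrates the memory-saving principle only, not the authors' exact scheme (the postings, weights, and quantization base are made up).

import math

def quantize_lengths(doc_lengths, base=1.05):
    # Map each document length to one byte (0-255) on a geometric scale.
    return bytearray(min(255, int(math.log(max(length, 1), base))) for length in doc_lengths)

def approx_length(code, base=1.05):
    # Recover an approximate length from its one-byte code.
    return base ** code

def rank(query_terms, postings, doc_lengths, top_k=10, base=1.05):
    # Accumulate term contributions, then normalize by the approximate lengths.
    codes = quantize_lengths(doc_lengths, base)
    acc = {}
    for term in query_terms:
        for doc_id, weight in postings.get(term, []):
            acc[doc_id] = acc.get(doc_id, 0.0) + weight
    scored = [(score / approx_length(codes[doc_id], base), doc_id)
              for doc_id, score in acc.items()]
    return sorted(scored, reverse=True)[:top_k]

# Tiny illustrative index: postings map a term to (doc_id, weight) pairs.
postings = {"ranking": [(0, 2.0), (1, 1.0)], "memory": [(1, 1.5), (2, 0.5)]}
print(rank(["ranking", "memory"], postings, doc_lengths=[120, 800, 64]))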
Unified System Of Data On Materials And Processes
NASA Technical Reports Server (NTRS)
Key, Carlo F.
1989-01-01
Wide-ranging sets of data for aerospace industry described. Document describes Materials and Processes Technical Information System (MAPTIS), computerized set of integrated data bases for use by NASA and aerospace industry. Stores information in standard format for fast retrieval in searches and surveys of data. Helps engineers select materials and verify their properties. Promotes standardized nomenclature as well as standardized tests and presentation of data. Document consists of photographic projection slides used in lectures. Presents examples of reports from various data bases.
A Modular Set of Mixed Reality Simulators for blind and Guided Procedures
2015-08-01
Year 1 Report, University of Florida. AWARD NUMBER: W81XWH-14-1-0113. TITLE: A Modular Set of Mixed Reality Simulators for "blind" and Guided Procedures. PRINCIPAL INVESTIGATOR: Samsun Lampotang. CONTRACTING ORGANIZATION: University of Florida, Gainesville, FL.
Nosql for Storage and Retrieval of Large LIDAR Data Collections
NASA Astrophysics Data System (ADS)
Boehm, J.; Liu, K.
2015-08-01
Developments in LiDAR technology over the past decades have made LiDAR a mature and widely accepted source of geospatial information. This in turn has led to an enormous growth in data volume. The central idea for a file-centric storage of LiDAR point clouds is the observation that large collections of LiDAR data are typically delivered as large collections of files, rather than single files of terabyte size. This split of the dataset, commonly referred to as tiling, was usually done to accommodate a specific processing pipeline. It therefore makes sense to preserve this split. A document-oriented NoSQL database can easily emulate this data partitioning, by representing each tile (file) in a separate document. The document stores the metadata of the tile. The actual files are stored in a distributed file system emulated by the NoSQL database. We demonstrate the use of MongoDB, a highly scalable document-oriented NoSQL database, for storing large LiDAR files. MongoDB, like any NoSQL database, allows for queries on the attributes of the document. Notably, MongoDB also supports spatial queries. Hence we can perform spatial queries on the bounding boxes of the LiDAR tiles. Insertion and retrieval speeds for the cloud-based database are compared with native file system and cloud storage transfer speeds.
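The abstract gives no code, so the following pymongo sketch illustrates the file-centric pattern it describes: one metadata document per LiDAR tile with a spatially indexed bounding box, the raw file kept in GridFS (standing in for the emulated distributed file store), and a spatial query against the tile bounding boxes. The collection names, field names, coordinates, and the local MongoDB instance are illustrative assumptions.

from pymongo import MongoClient, GEOSPHERE
import gridfs

client = MongoClient("mongodb://localhost:27017")   # assumed local MongoDB instance
db = client["lidar"]                                 # database and collection names are illustrative
fs = gridfs.GridFS(db)                               # GridFS plays the role of the distributed file store

def insert_tile(path, min_x, min_y, max_x, max_y, point_count):
    # Store one tile: raw file in GridFS, metadata (incl. bounding box) as a document.
    with open(path, "rb") as f:
        file_id = fs.put(f, filename=path)
    db.tiles.insert_one({
        "file_id": file_id,
        "filename": path,
        "point_count": point_count,
        # Bounding box as a GeoJSON polygon so MongoDB can index it spatially.
        "bbox": {"type": "Polygon",
                 "coordinates": [[[min_x, min_y], [max_x, min_y],
                                  [max_x, max_y], [min_x, max_y], [min_x, min_y]]]},
    })

db.tiles.create_index([("bbox", GEOSPHERE)])

# Spatial query: which tiles intersect a query rectangle?
query_box = {"type": "Polygon",
             "coordinates": [[[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]]}
for tile in db.tiles.find({"bbox": {"$geoIntersects": {"$geometry": query_box}}}):
    data = fs.get(tile["file_id"]).read()   # pull the raw file back out of GridFS
    print(tile["filename"], tile["point_count"], len(data))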
Atlas of natural hazards in the Hawaiian coastal zone
Fletcher, Charles H.; Grossman, Eric E.; Richmond, Bruce M.; Gibbs, Ann E.
2002-01-01
The purpose of this report is to communicate to citizens and regulatory authorities the history and relative intensity of coastal hazards in Hawaii. This information is the key to the wise use and management of coastal resources. The information contained in this document, we hope, will improve the ability of Hawaiian citizens and visitors to safely enjoy the coast and provide a strong data set for planners and managers to guide the future of coastal resources. This work is largely based on previous investigations by scientific and engineering researchers and county, state, and federal offices and agencies. The unique aspect of this report is that, to the extent possible, it assimilates prior efforts in documenting Hawaiian coastal hazards and combines existing knowledge into a single comprehensive coastal hazard data set. This is by no means the final word on coastal hazards in Hawaii. Every hazardous phenomenon described here, and others such as slope failure and rocky shoreline collapse, need to be more carefully quantified, forecast, and mitigated. Our ultimate goal, of course, is to make the Hawaiian coast a safer place by educating the people of the state, and their leaders, about the hazardous nature of the environment. In so doing, we will also be taking steps toward improved preservation of coastal environments, because the best way to avoid coastal hazards is to avoid inappropriate development in the coastal zone. We have chosen maps as the medium for both recording and communicating the hazard history and its intensity along the Hawaiian coast. Two types of maps are used: 1) small-scale maps showing a general history of hazards on each island and summarizing coastal hazards in a readily understandable format for general use, and 2) a large-scale series of technical maps (1:50,000) depicting coastal sections approximately 5 to 7 miles in length with color bands along the coast ranking the relative intensity of each hazard at the adjacent shoreline.
Quality and correlates of medical record documentation in the ambulatory care setting
Soto, Carlos M; Kleinman, Kenneth P; Simon, Steven R
2002-01-01
Background Documentation in the medical record facilitates the diagnosis and treatment of patients. Few studies have assessed the quality of outpatient medical record documentation, and to the authors' knowledge, none has conclusively determined the correlates of chart documentation. We therefore undertook the present study to measure the rates of documentation of quality of care measures in an outpatient primary care practice setting that utilizes an electronic medical record. Methods We reviewed electronic medical records from 834 patients receiving care from 167 physicians (117 internists and 50 pediatricians) at 14 sites of a multi-specialty medical group in Massachusetts. We abstracted information for five measures of medical record documentation quality: smoking history, medications, drug allergies, compliance with screening guidelines, and immunizations. From other sources we determined physicians' specialty, gender, year of medical school graduation, and self-reported time spent teaching and in patient care. Results Among internists, unadjusted rates of documentation were 96.2% for immunizations, 91.6% for medications, 88% for compliance with screening guidelines, 61.6% for drug allergies, 37.8% for smoking history. Among pediatricians, rates were 100% for immunizations, 84.8% for medications, 90.8% for compliance with screening guidelines, 50.4% for drug allergies, and 20.4% for smoking history. While certain physician and patient characteristics correlated with some measures of documentation quality, documentation varied depending on the measure. For example, female internists were more likely than male internists to document smoking history (odds ratio [OR], 1.90; 95% confidence interval [CI], 1.27 – 2.83) but were less likely to document drug allergies (OR, 0.51; 95% CI, 0.35 – 0.75). Conclusions Medical record documentation varied depending on the measure, with room for improvement in most domains. A variety of characteristics correlated with medical record documentation, but no pattern emerged. Further study could lead to targeted interventions to improve documentation. PMID:12473161
NASA Technical Reports Server (NTRS)
Reller, J. O., Jr.
1976-01-01
Data handling, communications, and documentation aspects of the ASSESS mission are described. Most experiments provided their own data handling equipment, although some used the airborne computer for backup, and one experiment required real-time computations. Communications facilities were set up to simulate those to be provided between Spacelab and the ground, including a downlink TV system. Mission documentation was kept to a minimum and proved sufficient. Examples are given of the basic documents of the mission.
Driscoll, Daniel G.; Bunkers, Matthew J.; Carter, Janet M.; Stamm, John F.; Williamson, Joyce E.
2010-01-01
The Black Hills area of western South Dakota has a history of damaging flash floods that have resulted primarily from exceptionally strong rain-producing thunderstorms. The best known example is the catastrophic storm system of June 9-10, 1972, which caused severe flooding in several major drainages near Rapid City and resulted in 238 deaths. More recently, severe thunderstorms caused flash flooding near Piedmont and Hermosa on August 17, 2007. Obtaining a thorough understanding of peak-flow characteristics for low-probability floods will require a comprehensive long-term approach involving (1) documentation of scientific information for extreme events such as these; (2) long-term collection of systematic peak-flow records; and (3) regional assessments of a wide variety of peak-flow information. To that end, the U.S. Geological Survey cooperated with the South Dakota Department of Transportation and National Weather Service to produce this report, which provides documentation regarding the August 17, 2007, storm and associated flooding and provides a context through examination of other large storm and flood events in the Black Hills area. The area affected by the August 17, 2007, storms and associated flooding generally was within the area affected by the larger storm of June 9-10, 1972. The maximum observed 2007 precipitation totals of between 10.00 and 10.50 inches occurred within about 2-3 hours in a small area about 5 miles west of Hermosa. The maximum documented precipitation amount in 1972 was 15.0 inches, and precipitation totals of 10.0 inches or more were documented for 34 locations within an area of about 76 square miles. A peak flow of less than 1 cubic foot per second occurred upstream from the 2007 storm extent for streamflow-gaging station 06404000 (Battle Creek near Keystone); whereas, the 1972 peak flow of 26,200 cubic feet per second was large, relative to the drainage area of only 58.6 square miles. Farther downstream along Battle Creek, a 2007 flow of 26,000 cubic feet per second was generated entirely within an intervening drainage area of only 44.4 square miles. An especially large flow of 44,100 cubic feet per second was documented for this location in 1972. The 2007 peak flow of 18,600 cubic feet per second for Battle Creek at Hermosa (station 06406000) was only slightly smaller than the 1972 peak flow of 21,400 cubic feet per second. Peak-flow values from 2007 for three sites with small drainage areas (less than 1.0 square mile) plot close to a regional envelope curve, indicating exceptionally large flow values, relative to drainage area. Physiographic factors that affect flooding in the area were examined. The limestone headwater hydrogeologic setting (within and near the Limestone Plateau area on the western flank of the Black Hills) has distinctively suppressed peak-flow characteristics for small recurrence intervals. Uncertainty is large, however, regarding characteristics for large recurrence intervals (low-probability floods) because of a dearth of information regarding the potential for generation of exceptionally strong rain-producing thunderstorms. In contrast, the greatest potential for exceptionally damaging floods is around the flanks of the rest of the Black Hills area because of steep topography and limited potential for attenuation of flood peaks in narrow canyons. Climatological factors that affect area flooding also were examined. 
Area thunderstorms are largely terrain-driven, especially with respect to their requisite upward motion, which can be initiated by orographic lifting effects, thermally enhanced circulations, and obstacle effects. Several other meteorological processes are influential in the development of especially heavy precipitation for the area, including storm cell training, storm anchoring or regeneration, storm mergers, supercell development, and weak upper-level air flow. A composite of storm total precipitation amounts for 13 recent individual storm events indicates
Benchmarking for Bayesian Reinforcement Learning
Ernst, Damien; Couëtoux, Adrien
2016-01-01
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment, using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed. PMID:27304891
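The released library is not reproduced here; the following numpy sketch only illustrates the comparison criterion described above, scoring an agent by its mean discounted return over many small MDPs drawn from an assumed prior (Dirichlet transitions, uniform rewards). The prior, horizon, and baseline agent are illustrative, not the paper's test problems.

import numpy as np

rng = np.random.default_rng(0)

def sample_mdp(n_states=5, n_actions=2):
    # Draw one MDP from a simple prior: Dirichlet transition rows, uniform rewards.
    transitions = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    rewards = rng.uniform(0, 1, size=(n_states, n_actions))
    return transitions, rewards

def run_episode(agent, transitions, rewards, horizon=50, gamma=0.95):
    # Roll out one episode and return the discounted return.
    state, total, discount = 0, 0.0, 1.0
    for _ in range(horizon):
        action = agent(state)
        total += discount * rewards[state, action]
        discount *= gamma
        state = rng.choice(len(transitions), p=transitions[state, action])
    return total

def benchmark(agent, n_mdps=200):
    # Score an agent as its mean (and spread of) discounted return over MDPs from the prior.
    scores = [run_episode(agent, *sample_mdp()) for _ in range(n_mdps)]
    return float(np.mean(scores)), float(np.std(scores))

# A trivial baseline agent that always picks action 0.
print(benchmark(lambda state: 0))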
Relevance of Google-customized search engine vs. CISMeF quality-controlled health gateway.
Gehanno, Jean-François; Kerdelhue, Gaétan; Sakji, Saoussen; Massari, Philippe; Joubert, Michel; Darmoni, Stéfan J
2009-01-01
CISMeF (acronym for Catalog and Index of French Language Health Resources on the Internet) is a quality-controlled health gateway conceived to catalog and index the most important and quality-controlled sources of institutional health information in French. The goal of this study is to compare the relevance of results provided by this gateway from a small set of documents selected and described by human experts to those provided by a search engine from a large set of automatically indexed and ranked resources. The Google-Customized search engine (CSE) was used. The evaluation was made using the first 10 results of 15 queries and two blinded physician evaluators. There was no significant difference between the relevance of information retrieval in CISMeF and Google CSE. In conclusion, automatic indexing does not lead to lower relevance than manual MeSH indexing and may help to cope with the increasing number of references to be indexed in a controlled health quality gateway.
Benchmarking for Bayesian Reinforcement Learning.
Castronovo, Michael; Ernst, Damien; Couëtoux, Adrien; Fonteneau, Raphael
2016-01-01
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment, using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.
Arneson, Justin J; Sackett, Paul R; Beatty, Adam S
2011-10-01
The nature of the relationship between ability and performance is of critical importance for admission decisions in the context of higher education and for personnel selection. Although previous research has supported the more-is-better hypothesis by documenting linearity of ability-performance relationships, such research has not been sensitive enough to detect deviations at the top ends of the score distributions. An alternative position receiving considerable attention is the good-enough hypothesis, which suggests that although higher levels of ability may result in better performance up to a threshold, above this threshold greater ability does not translate to better performance. In this study, the nature of the relationship between cognitive ability and performance was examined throughout the score range in four large-scale data sets. Monotonicity was maintained in all instances. Contrary to the good-enough hypothesis, the ability-performance relationship was commonly stronger at the top end of the score distribution than at the bottom end.
DeepBlue epigenomic data server: programmatic data retrieval and analysis of epigenome region sets
Albrecht, Felipe; List, Markus; Bock, Christoph; Lengauer, Thomas
2016-01-01
Large amounts of epigenomic data are generated under the umbrella of the International Human Epigenome Consortium, which aims to establish 1000 reference epigenomes within the next few years. These data have the potential to unravel the complexity of epigenomic regulation. However, their effective use is hindered by the lack of flexible and easy-to-use methods for data retrieval. Extracting region sets of interest is a cumbersome task that involves several manual steps: identifying the relevant experiments, downloading the corresponding data files and filtering the region sets of interest. Here we present the DeepBlue Epigenomic Data Server, which streamlines epigenomic data analysis as well as software development. DeepBlue provides a comprehensive programmatic interface for finding, selecting, filtering, summarizing and downloading region sets. It contains data from four major epigenome projects, namely ENCODE, ROADMAP, BLUEPRINT and DEEP. DeepBlue comes with a user manual, examples and a well-documented application programming interface (API). The latter is accessed via the XML-RPC protocol supported by many programming languages. To demonstrate usage of the API and to enable convenient data retrieval for non-programmers, we offer an optional web interface. DeepBlue can be openly accessed at http://deepblue.mpi-inf.mpg.de. PMID:27084938
Yucca Mountain licensing support network archive assistant.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunlavy, Daniel M.; Bauer, Travis L.; Verzi, Stephen J.
2008-03-01
This report describes the Licensing Support Network (LSN) Assistant--a set of tools for categorizing e-mail messages and documents, and investigating and correcting existing archives of categorized e-mail messages and documents. The two main tools in the LSN Assistant are the LSN Archive Assistant (LSNAA) tool for recategorizing manually labeled e-mail messages and documents and the LSN Realtime Assistant (LSNRA) tool for categorizing new e-mail messages and documents. This report focuses on the LSNAA tool. There are two main components of the LSNAA tool. The first is the Sandia Categorization Framework, which is responsible for providing categorizations for documents in an archive and storing them in an appropriate Categorization Database. The second is the actual user interface, which primarily interacts with the Categorization Database, providing a way for finding and correcting categorization errors in the database. A procedure for applying the LSNAA tool and an example use case of the LSNAA tool applied to a set of e-mail messages are provided. Performance results of the categorization model designed for this example use case are presented.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-03
...Consistent with the Paperwork Reduction Act of 1995 (PRA), HUD is publishing for public comment a comprehensive set of closing and other documents used in connection with transactions involving healthcare facilities (excluding hospitals) that are insured pursuant to section 232 of the National Housing Act (Section 232). In addition to meeting PRA requirements, this notice seeks public comment for the purpose of enlisting input from the lending industry and other interested parties in the development, updating, and adoption of a set of instruments (collectively, healthcare facility documents) that offer the requisite protection to all parties in these FHA-insured mortgage transactions, consistent with modern real estate and mortgage lending laws and practices. The healthcare facility documents, which are the subject of this notice, can be viewed on HUD's Web site: www.hud.gov/232forms. HUD is also publishing today a proposed rule that will submit for public comment certain revisions to FHA's Section 232 regulations for the purpose of ensuring consistency between the program regulations and the revised healthcare facility documents.
NASA Astrophysics Data System (ADS)
Ekberg, Joakim; Timpka, Toomas; Morin, Magnus; Jenvald, Johan; Nyce, James M.; Gursky, Elin A.; Eriksson, Henrik
Computer simulations have emerged as important tools in the preparation for outbreaks of infectious disease. To support collaborative planning for and response to outbreaks, reports from simulations need to be transparent (accessible) with regard to the underlying parametric settings. This paper presents a design for the generation of simulation reports in which the background settings used in the simulation models are automatically visualized. We extended the ontology-management system Protégé to tag different settings into categories, and included these in report generation in parallel with the simulation outcomes. The report generator takes advantage of an XSLT specification and collects the documentation of the particular simulation settings into abridged XML documents that also include summarized results. We conclude that even though the inclusion of critical background settings in reports may not increase the accuracy of infectious disease simulations, it can prevent misunderstandings and less than optimal public health decisions.
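As a rough illustration of the reporting step described above, the sketch below uses lxml to apply a small XSLT stylesheet to an abridged XML of tagged simulation settings and summarized results. The element names, categories, and stylesheet are assumptions for illustration, not the authors' actual Protégé export or XSLT specification.

from lxml import etree

# Illustrative abridged XML: simulation settings tagged by category, plus summary results.
report_xml = etree.XML("""
<simulation id="outbreak-42">
  <setting category="population" name="household_size" value="2.6"/>
  <setting category="disease" name="R0" value="1.8"/>
  <result name="peak_day" value="63"/>
  <result name="attack_rate" value="0.31"/>
</simulation>""")

# Minimal XSLT that renders settings and results side by side in the report.
stylesheet = etree.XSLT(etree.XML("""
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/simulation">
    <report>
      <xsl:for-each select="setting">
        <line>Setting [<xsl:value-of select="@category"/>] <xsl:value-of select="@name"/> = <xsl:value-of select="@value"/></line>
      </xsl:for-each>
      <xsl:for-each select="result">
        <line>Result <xsl:value-of select="@name"/> = <xsl:value-of select="@value"/></line>
      </xsl:for-each>
    </report>
  </xsl:template>
</xsl:stylesheet>"""))

print(str(stylesheet(report_xml)))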
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stoecker, Nora Kathleen
2014-03-01
A Systems Analysis Group has existed at Sandia National Laboratories since at least the mid-1950s. Much of the group's work output (reports, briefing documents, and other materials) has been retained, along with large numbers of related documents. Over time the collection has grown to hundreds of thousands of unstructured documents in many formats contained in one or more of several different shared drives or SharePoint sites, with perhaps five percent of the collection still existing in print format. This presents a challenge. How can the group effectively find, manage, and build on information contained somewhere within such a large set of unstructured documents? In response, a project was initiated to identify tools that would be able to meet this challenge. This report documents the results found and recommendations made as of August 2013.
Typograph: Multiscale Spatial Exploration of Text Documents
DOE Office of Scientific and Technical Information (OSTI.GOV)
Endert, Alexander; Burtner, Edwin R.; Cramer, Nicholas O.
2013-12-01
Visualizing large document collections using a spatial layout of terms can enable quick overviews of information. However, these metaphors (e.g., word clouds, tag clouds, etc.) often lack interactivity to explore the information, and the location and rendering of the terms are often not based on mathematical models that maintain relative distances from other information based on similarity metrics. Further, transitioning between levels of detail (i.e., from terms to full documents) can be challenging. In this paper, we present Typograph, a multi-scale spatial exploration visualization for large document collections. Building on term-based visualization methods, Typograph enables multiple levels of detail (terms, phrases, snippets, and full documents) within a single spatialization. Further, the information is placed based on its relative similarity to other information to create the "near = similar" geography metaphor. This paper discusses the design principles and functionality of Typograph and presents a use case analyzing Wikipedia to demonstrate usage.
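Typograph's layout algorithm is not given in the abstract; one common way to realize the "near = similar" metaphor, sketched below under that assumption, is to embed term vectors in 2D with multidimensional scaling so that spatial distance tracks dissimilarity. The toy corpus and the use of scikit-learn are illustrative only, not the tool's implementation.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import MDS
from sklearn.metrics.pairwise import cosine_distances

# Toy corpus standing in for a large document collection.
docs = ["wildfire smoke spreads across the valley",
        "the valley floods after heavy rain",
        "smoke detectors and fire safety at home",
        "river flooding damages homes near the valley"]

vectorizer = TfidfVectorizer()
doc_term = vectorizer.fit_transform(docs)              # documents x terms
terms = vectorizer.get_feature_names_out()

# Represent each term by the documents it occurs in, then compute pairwise
# dissimilarities so that co-occurring terms end up close together.
term_vectors = doc_term.T.toarray()
dissimilarity = cosine_distances(term_vectors)

# 2D embedding: spatial distance approximates dissimilarity ("near = similar").
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissimilarity)

for term, (x, y) in zip(terms, coords):
    print(f"{term:>10s}  ({x:+.2f}, {y:+.2f})")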
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-12
... consisting of twelve consecutive complete month data sets of the documents and related indexing information....\\11\\ The MSRB proposes to charge $10,000 for any twelve consecutive complete month data set for the... data set for the Continuing Disclosure Historical Data Product.\\12\\ In general, no smaller data sets...
Reeves, Anthony P.; Xie, Yiting; Liu, Shuang
2017-01-01
Abstract. With the advent of fully automated image analysis and modern machine learning methods, there is a need for very large image datasets having documented segmentations for both computer algorithm training and evaluation. This paper presents a method and implementation for facilitating such datasets that addresses the critical issue of size scaling for algorithm validation and evaluation; current evaluation methods that are usually used in academic studies do not scale to large datasets. This method includes protocols for the documentation of many regions in very large image datasets; the documentation may be incrementally updated by new image data and by improved algorithm outcomes. This method has been used for 5 years in the context of chest health biomarkers from low-dose chest CT images that are now being used with increasing frequency in lung cancer screening practice. The lung scans are segmented into over 100 different anatomical regions, and the method has been applied to a dataset of over 20,000 chest CT images. Using this framework, the computer algorithms have been developed to achieve over 90% acceptable image segmentation on the complete dataset. PMID:28612037
The art and science of photography in hand surgery.
Wang, Keming; Kowalski, Evan J; Chung, Kevin C
2014-03-01
High-quality medical photography plays an important role in teaching and demonstrating the functional capacity of the hands as well as in medicolegal documentation. Obtaining standardized, high-quality photographs is now an essential component of many surgery practices. The importance of standardized photography in facial and cosmetic surgery has been well documented in previous studies, but no studies have thoroughly addressed the details of photography for hand surgery. In this paper, we provide a set of guidelines and basic camera concepts for different scenarios to help hand surgeons obtain appropriate and informative high-quality photographs. A camera used for medical photography should come equipped with a large sensor size and an optical zoom lens with a focal length ranging anywhere from 14 to 75 mm. In a clinic or office setting, we recommend 6 standardized views of the hand and 4 views for the wrist; additional views should be taken for tendon ruptures, nerve injuries, or other deformities of the hand. For intraoperative pictures, the camera operator should understand the procedure and pertinent anatomy in order to properly obtain high-quality photographs. When digital radiographs are not available and radiographic film must be photographed, it is recommended to reduce the exposure and change the color mode to black and white to obtain the best possible pictures. The goal of medical photography is to present the subject in an accurate and precise fashion. Copyright © 2014 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
22 CFR 181.3 - Determinations.
Code of Federal Regulations, 2010 CFR
2010-04-01
22 CFR 181.3 (Foreign Relations): DEPARTMENT OF STATE, INTERNATIONAL AGREEMENTS; COORDINATION, REPORTING AND PUBLICATION OF INTERNATIONAL AGREEMENTS. § 181.3 Determinations. (a) Whether any undertaking, document, or set of documents...
Developing a system for computing and reporting MAP-21 and other freight performance measures.
DOT National Transportation Integrated Search
2015-07-01
This report documents the use of the National Performance Monitoring Research Data Set : (NPMRDS) for the computation of freight performance measures on Interstate highways in Washington : state. The report documents the data availability and specifi...
Integer Linear Programming for Constrained Multi-Aspect Committee Review Assignment
Karimzadehgan, Maryam; Zhai, ChengXiang
2011-01-01
Automatic review assignment can significantly improve the productivity of many people such as conference organizers, journal editors and grant administrators. A general setup of the review assignment problem involves assigning a set of reviewers on a committee to a set of documents to be reviewed under the constraint of review quota so that the reviewers assigned to a document can collectively cover multiple topic aspects of the document. No previous work has addressed such a setup of committee review assignments while also considering matching multiple aspects of topics and expertise. In this paper, we tackle the problem of committee review assignment with multi-aspect expertise matching by casting it as an integer linear programming problem. The proposed algorithm can naturally accommodate any probabilistic or deterministic method for modeling multiple aspects to automate committee review assignments. Evaluation using a multi-aspect review assignment test set constructed using ACM SIGIR publications shows that the proposed algorithm is effective and efficient for committee review assignments based on multi-aspect expertise matching. PMID:22711970
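The paper's exact ILP formulation, with its multi-aspect coverage terms, is not reproduced in the abstract; the sketch below is a much-simplified version, assuming the PuLP library, made-up coverage scores, and only the two constraint families mentioned: a review quota per reviewer and a fixed committee size per document.

import pulp

# Illustrative data: coverage[r, d, a] = how well reviewer r covers aspect a of document d.
reviewers, documents, aspects = range(3), range(2), range(2)
coverage = {(r, d, a): ((r + d + a) % 3) / 2.0        # made-up scores in [0, 1]
            for r in reviewers for d in documents for a in aspects}
quota = 2            # maximum documents per reviewer
per_doc = 2          # reviewers assigned to each document

prob = pulp.LpProblem("committee_review_assignment", pulp.LpMaximize)
x = pulp.LpVariable.dicts("assign", (reviewers, documents), cat="Binary")

# Objective: total aspect coverage achieved by the assigned reviewers.
prob += pulp.lpSum(coverage[r, d, a] * x[r][d]
                   for r in reviewers for d in documents for a in aspects)

for r in reviewers:                                   # review quota per reviewer
    prob += pulp.lpSum(x[r][d] for d in documents) <= quota
for d in documents:                                   # committee size per document
    prob += pulp.lpSum(x[r][d] for r in reviewers) == per_doc

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for d in documents:
    assigned = [r for r in reviewers if pulp.value(x[r][d]) > 0.5]
    print(f"document {d}: reviewers {assigned}")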
NASA Technical Reports Server (NTRS)
Switzer, George F.
2008-01-01
This document contains a general description for data sets of a wake vortex system in a turbulent environment. The turbulence and thermal stratification of the environment are representative of the conditions on November 12, 2001 near John F. Kennedy International Airport. The simulation assumes no ambient winds. The full three dimensional simulation of the wake vortex system from a Boeing 747 predicts vortex circulation levels at 80% of their initial value at the time of the proposed vortex encounter. The linked vortex oval orientation showed no twisting, and the oval elevations at the widest point were about 20 meters higher than where the vortex pair joined. Fred Proctor of NASA's Langley Research Center presented the results from this work at the NTSB public hearing that started 29 October 2002. This document contains a description of each data set including: variables, coordinate system, data format, and sample plots. Also included are instructions on how to read the data.
Narrative assessment: making mathematics learning visible in early childhood settings
NASA Astrophysics Data System (ADS)
Anthony, Glenda; McLachlan, Claire; Lim Fock Poh, Rachel
2015-09-01
Narratives that capture children's learning as they go about their day-to-day activities are promoted as a powerful assessment tool within early childhood settings. However, in the New Zealand context, there is increasing concern that learning stories—the preferred form of narrative assessment—currently downplay domain knowledge. In this paper, we draw on data from 13 teacher interviews and samples of 18 children's learning stories to examine how mathematics is made visible within learning stories. Despite appreciating that mathematics is embedded in a range of everyday activities within the centres, we found that the nature of a particular activity appeared to influence 'how' and 'what' the teachers chose to document as mathematics learning. Many of the teachers expressed a preference to document and analyse mathematics learning that occurred within explicit mathematics activities rather than within play that involves mathematics. Our concern is that this restricted documentation of mathematical activity could potentially limit opportunities for mathematics learning both in the centre and home settings.
Recent Experiments with INQUERY
1995-11-01
Experiments were conducted with a version of the INQUERY information retrieval system. INQUERY is based on the Bayesian inference network retrieval model. It is...corpus-based query expansion. For TREC, a subset of the ad hoc document set was used to build the InFinder database. None of the...experiments that showed significant improvements in retrieval effectiveness when document rankings based on the entire document text are combined with
Code of Federal Regulations, 2013 CFR
2013-07-01
... period of time it or the Secretary specifies, the documentation set forth in § 668.57 that is requested... purposes of the Federal Pell Grant Program— (1) An applicant may submit a valid SAR to the institution or the institution may receive a valid ISIR after the applicable deadline specified in 34 CFR 690.61 but...
Code of Federal Regulations, 2012 CFR
2012-07-01
... period of time it or the Secretary specifies, the documentation set forth in § 668.57 that is requested... purposes of the Federal Pell Grant Program— (1) An applicant may submit a valid SAR to the institution or the institution may receive a valid ISIR after the applicable deadline specified in 34 CFR 690.61 but...
Younger Children in ECEC: Focus on the National Steering Documents in the Nordic Countries
ERIC Educational Resources Information Center
Hännikäinen, Maritta
2016-01-01
The aim of this study was to review the national steering documents on early childhood education and care (ECEC) in Denmark, Finland, Iceland, Norway and Sweden, with the focus on children up to the age of three, posing the question: What do these documents tell us about ECEC for younger children in the Nordic early childhood settings?…
Test Protocols for Advanced Inverter Interoperability Functions – Main Document
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Jay Dean; Gonzalez, Sigifredo; Ralph, Mark E.
2013-11-01
Distributed energy resources (DER) such as photovoltaic (PV) systems, when deployed in a large scale, are capable of influencing significantly the operation of power systems. Looking to the future, stakeholders are working on standards to make it possible to manage the potentially complex interactions between DER and the power system. In 2009, the Electric Power Research Institute (EPRI), Sandia National Laboratories (SNL) with the U.S. Department of Energy (DOE), and the Solar Electric Power Association (SEPA) initiated a large industry collaborative to identify and standardize definitions for a set of DER grid support functions. While the initial effort concentrated on grid-tied PV inverters and energy storage systems, the concepts have applicability to all DER. A partial product of this on-going effort is a reference definitions document (IEC TR 61850-90-7, Object models for power converters in distributed energy resources (DER) systems) that has become a basis for expansion of related International Electrotechnical Commission (IEC) standards, and is supported by US National Institute of Standards and Technology (NIST) Smart Grid Interoperability Panel (SGIP). Some industry-led organizations advancing communications protocols have also embraced this work. As standards continue to evolve, it is necessary to develop test protocols to independently verify that the inverters are properly executing the advanced functions. Interoperability is assured by establishing common definitions for the functions and a method to test compliance with operational requirements. This document describes test protocols developed by SNL to evaluate the electrical performance and operational capabilities of PV inverters and energy storage, as described in IEC TR 61850-90-7. While many of these functions are not currently required by existing grid codes or may not be widely available commercially, the industry is rapidly moving in that direction. Interoperability issues are already apparent as some of these inverter capabilities are being incorporated in large demonstration and commercial projects. The test protocols are intended to be used to verify acceptable performance of inverters within the standard framework described in IEC TR 61850-90-7. These test protocols, as they are refined and validated over time, can become precursors for future certification test procedures for DER advanced grid support functions.
Path Searching Based Crease Detection for Large Scale Scanned Document Images
NASA Astrophysics Data System (ADS)
Zhang, Jifu; Li, Yi; Li, Shutao; Sun, Bin; Sun, Jun
2017-12-01
Since large-size documents are usually folded for preservation, creases will occur in the scanned images. In this paper, a crease detection method is proposed to locate the crease pixels for further processing. According to the imaging process of contactless scanners, the shading on the two sides of a crease usually differs considerably. Based on this observation, a convex hull based algorithm is adopted to extract the shading information of the scanned image. Then, a candidate crease path can be obtained by applying a vertical filter and morphological operations to the shading image. Finally, the accurate crease is detected via Dijkstra path searching. Experimental results on a dataset of real scanned newspapers demonstrate that the proposed method can obtain accurate locations of the creases in large-size document images.
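The following sketch loosely mirrors the three steps described above, with stated substitutions: a grayscale morphological closing stands in for the convex-hull shading extraction, a vertical morphological opening plays the role of the vertical filter, and scikit-image's route_through_array provides the Dijkstra-style path search. The structuring-element sizes and the synthetic page are assumptions for illustration, not the paper's parameters.

import numpy as np
from skimage import filters, morphology, graph

def detect_crease(gray):
    # Locate a roughly vertical crease in a grayscale scan with values in [0, 1].
    # 1. Background estimate: a large closing stands in for convex-hull shading extraction.
    background = morphology.closing(gray, np.ones((31, 31)))
    valleys = background - gray                      # creases appear as dark valleys

    # 2. Keep vertically elongated valleys (vertical filter + morphology), then smooth.
    valleys = morphology.opening(valleys, np.ones((15, 3)))
    response = filters.gaussian(valleys, sigma=2)

    # 3. Dijkstra-style path search from the top row to the bottom row:
    #    cells with strong crease response are cheap to traverse.
    cost = 1.0 - response / (response.max() + 1e-9)
    start = (0, int(np.argmin(cost[0])))
    end = (cost.shape[0] - 1, int(np.argmin(cost[-1])))
    path, _ = graph.route_through_array(cost, start, end, fully_connected=True)
    return np.array(path)                            # (row, col) pixels along the crease

# Synthetic demo: a flat page with a darker vertical band acting as the crease.
page = np.ones((200, 300))
page[:, 148:152] = 0.6
print(detect_crease(page)[:5])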
Reboiro-Jato, Miguel; Arrais, Joel P; Oliveira, José Luis; Fdez-Riverola, Florentino
2014-01-30
The diagnosis and prognosis of several diseases can be shortened through the use of different large-scale genome experiments. In this context, microarrays can generate expression data for a huge set of genes. However, to obtain solid statistical evidence from the resulting data, it is necessary to train and to validate many classification techniques in order to find the best discriminative method. This is a time-consuming process that normally depends on intricate statistical tools. geneCommittee is a web-based interactive tool for routinely evaluating the discriminative classification power of custom hypotheses in the form of biologically relevant gene sets. While the user can work with different gene set collections and several microarray data files to configure specific classification experiments, the tool is able to run several tests in parallel. Provided with a straightforward and intuitive interface, geneCommittee is able to render valuable information for diagnostic analyses and clinical management decisions based on systematically evaluating custom hypotheses over different data sets using complementary classifiers, a key aspect in clinical research. geneCommittee allows the enrichment of microarray raw data with gene functional annotations, producing integrated datasets that simplify the construction of better discriminative hypotheses, and allows the creation of a set of complementary classifiers. The trained committees can then be used for clinical research and diagnosis. Full documentation including common use cases and guided analysis workflows is freely available at http://sing.ei.uvigo.es/GC/.
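geneCommittee itself is a web tool; the sketch below only illustrates the underlying idea of evaluating gene-set hypotheses with a committee of complementary classifiers, using scikit-learn cross-validation and a hard-voting ensemble on synthetic expression data. The gene sets, classifiers, and data are assumptions, not the tool's implementation.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))                 # 60 samples x 200 genes (synthetic expression)
y = rng.integers(0, 2, size=60)                # binary phenotype labels

# Hypothetical "gene sets": each hypothesis is a subset of gene (column) indices.
gene_sets = {"pathway_A": list(range(0, 20)), "pathway_B": list(range(50, 80))}

base_classifiers = [("lr", LogisticRegression(max_iter=1000)),
                    ("nb", GaussianNB()),
                    ("knn", KNeighborsClassifier())]

for name, genes in gene_sets.items():
    X_subset = X[:, genes]
    # Evaluate each classifier on this gene set, then the voting committee of all of them.
    for clf_name, clf in base_classifiers:
        scores = cross_val_score(clf, X_subset, y, cv=5)
        print(f"{name} / {clf_name}: {scores.mean():.2f}")
    committee = VotingClassifier(estimators=base_classifiers, voting="hard")
    print(f"{name} / committee: {cross_val_score(committee, X_subset, y, cv=5).mean():.2f}")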
Information Model for Reusability in Clinical Trial Documentation
ERIC Educational Resources Information Center
Bahl, Bhanu
2013-01-01
In clinical research, New Drug Application (NDA) to health agencies requires generation of a large number of documents throughout the clinical development life cycle, many of which are also submitted to public databases and external partners. Current processes to assemble the information, author, review and approve the clinical research documents,…
Completeness of breast cancer operative reports in a community care setting.
Eng, Jordan Lang; Baliski, Christopher Ronald; McGahan, Colleen; Cai, Eric
2017-10-01
The narrative operative report represents the traditional means by which breast cancer surgery has been documented. Previous work has established that omissions occur in narrative operative reports produced in an academic setting. The goal of this study was to determine the completeness of breast cancer narrative operative reports produced in a community care setting and to explore the effect of a surgeon's case volume and years in practice on the completeness of these reports. A standardized retrospective review of operative reports produced over a consecutive 2 year period was performed using a set of procedure-specific elements identified through a review of the relevant literature and work done locally. 772 operative reports were reviewed. 45% of all elements were completely documented. A small positive trend was observed between case volume and completeness while a small negative trend was observed between years in practice and completeness. The dictated narrative report inadequately documents breast cancer surgery irrespective of the recording surgeon's volume or experience. An intervention, such as the implementation of synoptic reporting, should be considered in an effort to maximize the utility of the breast cancer operative report. Copyright © 2017. Published by Elsevier Ltd.
Evaluation of Evidence-Based Nursing Pain Management Practice
Song, Wenjia; Eaton, Linda H.; Gordon, Debra B.; Hoyle, Christine; Doorenbos, Ardith Z.
2014-01-01
Background: It is important to ensure that cancer pain management is based on the best evidence. Nursing evidence-based pain management can be examined through an evaluation of pain documentation. Aims: This study aimed to (a) modify and test an evaluation tool for nursing cancer pain documentation, and (b) describe the frequency and quality of nursing pain documentation in one oncology unit via electronic medical system. Design and Setting: A descriptive cross-sectional design was used for this study at an oncology unit of an academic medical center in the Pacific Northwest. Methods: Medical records were examined for 37 adults hospitalized during April and May of 2013. Nursing pain documentations (N = 230) were reviewed using an evaluation tool modified from the Cancer Pain Practice Index to consist of 13 evidence-based pain management indicators, including pain assessment, care plan, pharmacologic and nonpharmacologic interventions, monitoring and treatment of analgesic side effects, communication with physicians, and patient education. Individual nursing documentation was assigned a score from 0 (worst possible) to 13 (best possible), to reflect the delivery of evidence-based pain management. Results: The participating nurses documented 90% of the recommended evidence-based pain management indicators. Documentation was suboptimal for pain reassessment, pharmacologic interventions, and bowel regimen. Conclusions: The study results provide implications for enhancing electronic medical record design and highlight a need for future research to understand the reasons for suboptimal nursing documentation of cancer pain management. For the future use of the data evaluation tool, we recommend additional modifications according to study settings. PMID:26256215
Developing topic-specific search filters for PubMed with click-through data.
Li, J; Lu, Z
2013-01-01
Search filters have been developed and demonstrated for better information access to the immense and ever-growing body of publications in the biomedical domain. However, to date the number of filters remains quite limited because the current filter development methods require significant human efforts in manual document review and filter term selection. In this regard, we aim to investigate automatic methods for generating search filters. We present an automated method to develop topic-specific filters on the basis of users' search logs in PubMed. Specifically, for a given topic, we first detect its relevant user queries and then include their corresponding clicked articles to serve as the topic-relevant document set accordingly. Next, we statistically identify informative terms that best represent the topic-relevant document set using a background set composed of topic irrelevant articles. Lastly, the selected representative terms are combined with Boolean operators and evaluated on benchmark datasets to derive the final filter with the best performance. We applied our method to develop filters for four clinical topics: nephrology, diabetes, pregnancy, and depression. For the nephrology filter, our method obtained performance comparable to the state of the art (sensitivity of 91.3%, specificity of 98.7%, precision of 94.6%, and accuracy of 97.2%). Similarly, high-performing results (over 90% in all measures) were obtained for the other three search filters. Based on PubMed click-through data, we successfully developed a high-performance method for generating topic-specific search filters that is significantly more efficient than existing manual methods. All data sets (topic-relevant and irrelevant document sets) used in this study and a demonstration system are publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/downloads/CQ_filter/
Developing Topic-Specific Search Filters for PubMed with Click-Through Data
Li, Jiao; Lu, Zhiyong
2013-01-01
Summary Objectives Search filters have been developed and demonstrated for better information access to the immense and ever-growing body of publications in the biomedical domain. However, to date the number of filters remains quite limited because the current filter development methods require significant human efforts in manual document review and filter term selection. In this regard, we aim to investigate automatic methods for generating search filters. Methods We present an automated method to develop topic-specific filters on the basis of users’ search logs in PubMed. Specifically, for a given topic, we first detect its relevant user queries and then include their corresponding clicked articles to serve as the topic-relevant document set accordingly. Next, we statistically identify informative terms that best represent the topic-relevant document set using a background set composed of topic irrelevant articles. Lastly, the selected representative terms are combined with Boolean operators and evaluated on benchmark datasets to derive the final filter with the best performance. Results We applied our method to develop filters for four clinical topics: nephrology, diabetes, pregnancy, and depression. For the nephrology filter, our method obtained performance comparable to the state of the art (sensitivity of 91.3%, specificity of 98.7%, precision of 94.6%, and accuracy of 97.2%). Similarly, high-performing results (over 90% in all measures) were obtained for the other three search filters. Conclusion Based on PubMed click-through data, we successfully developed a high-performance method for generating topic-specific search filters that is significantly more efficient than existing manual methods. All data sets (topic-relevant and irrelevant document sets) used in this study and a demonstration system are publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/downloads/CQ_filter/ PMID:23666447
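The two records above describe the same term-selection and filter-construction pipeline. As a concrete illustration, the following minimal Python sketch scores candidate terms by a smoothed log-odds of document frequency between the topic-relevant (clicked) set and the background set, then joins the top terms with Boolean OR. The scoring statistic and the toy data are assumptions for illustration only; the paper's own statistic and its benchmark evaluation are not reproduced here.

import math
from collections import Counter

def select_filter_terms(relevant_docs, background_docs, top_k=10):
    # relevant_docs / background_docs: lists of token lists.
    # Score terms by smoothed log-odds of document frequency in the
    # topic-relevant (clicked) set versus the background set.
    rel_df = Counter(t for doc in relevant_docs for t in set(doc))
    bg_df = Counter(t for doc in background_docs for t in set(doc))
    n_rel, n_bg = len(relevant_docs), len(background_docs)
    scores = {}
    for term, df in rel_df.items():
        p_rel = (df + 1) / (n_rel + 2)        # add-one smoothing
        p_bg = (bg_df[term] + 1) / (n_bg + 2)
        scores[term] = math.log(p_rel / p_bg)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def build_boolean_filter(terms):
    # Final step: combine the representative terms with Boolean OR.
    return " OR ".join('"{}"'.format(t) for t in terms)

relevant = [["kidney", "dialysis", "renal"], ["renal", "failure", "dialysis"]]
background = [["asthma", "lung"], ["cardiac", "failure"]]
print(build_boolean_filter(select_filter_terms(relevant, background, top_k=3)))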
ERIC Educational Resources Information Center
Northwest Regional Educational Lab., Portland, OR.
This document consists of 80 microcomputer software package evaluations prepared by the MicroSIFT (Microcomputer Software and Information for Teachers) Clearinghouse at the Northwest Regional Education Laboratory. Set 15 consists of 27 packages; set 16 consists of 53 packages. Each software review lists producer, time and place of evaluation,…
Knowledge based word-concept model estimation and refinement for biomedical text mining.
Jimeno Yepes, Antonio; Berlanga, Rafael
2015-02-01
Text mining of scientific literature has been essential for setting up large public biomedical databases, which are being widely used by the research community. In the biomedical domain, the existence of a large number of terminological resources and knowledge bases (KB) has enabled a myriad of machine learning methods for different text mining related tasks. Unfortunately, KBs have been devised not for text mining tasks but for human interpretation; thus, the performance of KB-based methods is usually lower than that of supervised machine learning methods. The disadvantage of supervised methods, though, is that they require labeled training data and are therefore of limited use for large-scale biomedical text mining systems; KB-based methods do not have this limitation. In this paper, we describe a novel method to generate word-concept probabilities from a KB, which can serve as a basis for several text mining tasks. This method not only takes into account the underlying patterns within the descriptions contained in the KB but also those in texts available from large unlabeled corpora such as MEDLINE. The parameters of the model have been estimated without training data. Patterns from MEDLINE have been built using MetaMap for entity recognition and related using co-occurrences. The word-concept probabilities were evaluated on the task of word sense disambiguation (WSD). The results showed that our method obtained a higher degree of accuracy than other state-of-the-art approaches when evaluated on the MSH WSD data set. We also evaluated our method on the task of document ranking using MEDLINE citations. These results also showed an increase in performance over existing baseline retrieval approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
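A rough sketch of how word-concept probabilities might be assembled from a knowledge base plus unlabeled corpus co-occurrences, in the spirit of the record above. The mixing weight, the data structures, and the naive-Bayes-style disambiguation step are illustrative assumptions, not the authors' estimation procedure.

import math
from collections import Counter, defaultdict

def word_concept_probabilities(kb_descriptions, corpus_cooccurrence, alpha=0.5):
    # Estimate P(word | concept) without labeled data by mixing two sources:
    # word counts from the concept's KB description and word counts from
    # corpus contexts in which the concept was recognized by an entity
    # recognizer. alpha weights the KB component.
    # kb_descriptions: {concept: list of words}
    # corpus_cooccurrence: {concept: Counter of co-occurring words}
    probs = defaultdict(dict)
    for concept, words in kb_descriptions.items():
        kb_counts = Counter(words)
        corpus_counts = corpus_cooccurrence.get(concept, Counter())
        vocab = set(kb_counts) | set(corpus_counts)
        kb_total = sum(kb_counts.values()) or 1
        corpus_total = sum(corpus_counts.values()) or 1
        for w in vocab:
            p_kb = kb_counts[w] / kb_total
            p_corpus = corpus_counts[w] / corpus_total
            probs[concept][w] = alpha * p_kb + (1 - alpha) * p_corpus
    return probs

def disambiguate(context_words, candidate_concepts, probs, floor=1e-6):
    # Pick the candidate concept whose word profile best matches the
    # ambiguous term's context (a naive-Bayes-style score).
    def score(c):
        return sum(math.log(probs[c].get(w, floor)) for w in context_words)
    return max(candidate_concepts, key=score)

kb = {"C-melanoma-lesion": "malignant skin tumor melanocytes".split(),
      "C-MelanA-protein": "protein antigen marker melanocytes".split()}
cooc = {"C-melanoma-lesion": Counter({"skin": 4, "biopsy": 2}),
        "C-MelanA-protein": Counter({"antibody": 3, "staining": 2})}
p = word_concept_probabilities(kb, cooc)
print(disambiguate(["skin", "biopsy"], ["C-melanoma-lesion", "C-MelanA-protein"], p))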
An automatic indexing method for medical documents.
Wagner, M. M.
1991-01-01
This paper describes MetaIndex, an automatic indexing program that creates symbolic representations of documents for the purpose of document retrieval. MetaIndex uses a simple transition network parser to recognize a language that is derived from the set of main concepts in the Unified Medical Language System Metathesaurus (Meta-1). MetaIndex uses a hierarchy of medical concepts, also derived from Meta-1, to represent the content of documents. The goal of this approach is to improve document retrieval performance by better representation of documents. An evaluation method is described, and the performance of MetaIndex on the task of indexing the Slice of Life medical image collection is reported. PMID:1807564
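The following toy sketch illustrates hierarchy-based indexing of the kind MetaIndex performs: concepts recognized in the text are expanded with their ancestors so that broader concepts also index the document. Naive substring matching stands in for the transition-network parser, and the concept identifiers are hypothetical.

def index_document(text, concept_phrases, parent_of):
    # concept_phrases: {phrase: concept id}; parent_of: {concept: parent}.
    text = text.lower()
    hits = {cid for phrase, cid in concept_phrases.items() if phrase in text}
    index_terms = set()
    for cid in hits:
        while cid is not None:          # walk up the concept hierarchy
            index_terms.add(cid)
            cid = parent_of.get(cid)
    return index_terms

phrases = {"myocardial infarction": "C-MI", "aspirin": "C-ASA"}
parents = {"C-MI": "C-HeartDisease", "C-HeartDisease": "C-Disease",
           "C-ASA": "C-Drug"}
print(index_document("Aspirin after myocardial infarction.", phrases, parents))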
(Docket A-93-02) Category V-B: Final Support Documents
This Index lists supporting documents related to the decision to certify that the Department of Energy had met the compliance criteria established by EPA in 40 CFR Part 194 and the disposal regulations set by EPA in 40 CFR Part 191.
7 CFR 1.612 - Where and how must documents be filed?
Code of Federal Regulations, 2012 CFR
2012-01-01
... office, as follows: (1) Before NFS refers a case for docketing under § 1.625, any documents must be filed with NFS. NFS's address, telephone number, and facsimile number are set forth in § 1.602. (2) NFS will...
7 CFR 1.612 - Where and how must documents be filed?
Code of Federal Regulations, 2010 CFR
2010-01-01
... office, as follows: (1) Before NFS refers a case for docketing under § 1.625, any documents must be filed with NFS. NFS's address, telephone number, and facsimile number are set forth in § 1.602. (2) NFS will...
7 CFR 1.612 - Where and how must documents be filed?
Code of Federal Regulations, 2014 CFR
2014-01-01
... office, as follows: (1) Before NFS refers a case for docketing under § 1.625, any documents must be filed with NFS. NFS's address, telephone number, and facsimile number are set forth in § 1.602. (2) NFS will...
7 CFR 1.612 - Where and how must documents be filed?
Code of Federal Regulations, 2013 CFR
2013-01-01
... office, as follows: (1) Before NFS refers a case for docketing under § 1.625, any documents must be filed with NFS. NFS's address, telephone number, and facsimile number are set forth in § 1.602. (2) NFS will...
7 CFR 1.612 - Where and how must documents be filed?
Code of Federal Regulations, 2011 CFR
2011-01-01
... office, as follows: (1) Before NFS refers a case for docketing under § 1.625, any documents must be filed with NFS. NFS's address, telephone number, and facsimile number are set forth in § 1.602. (2) NFS will...
21 CFR 880.5440 - Intravascular administration set.
Code of Federal Regulations, 2012 CFR
2012-04-01
...) Classification. Class II (special controls). The special control for pharmacy compounding systems within this classification is the FDA guidance document entitled “Class II Special Controls Guidance Document: Pharmacy Compounding Systems; Final Guidance for Industry and FDA Reviewers.” Pharmacy compounding systems classified...
21 CFR 880.5440 - Intravascular administration set.
Code of Federal Regulations, 2011 CFR
2011-04-01
...) Classification. Class II (special controls). The special control for pharmacy compounding systems within this classification is the FDA guidance document entitled “Class II Special Controls Guidance Document: Pharmacy Compounding Systems; Final Guidance for Industry and FDA Reviewers.” Pharmacy compounding systems classified...
21 CFR 880.5440 - Intravascular administration set.
Code of Federal Regulations, 2013 CFR
2013-04-01
...) Classification. Class II (special controls). The special control for pharmacy compounding systems within this classification is the FDA guidance document entitled “Class II Special Controls Guidance Document: Pharmacy Compounding Systems; Final Guidance for Industry and FDA Reviewers.” Pharmacy compounding systems classified...
21 CFR 880.5440 - Intravascular administration set.
Code of Federal Regulations, 2014 CFR
2014-04-01
...) Classification. Class II (special controls). The special control for pharmacy compounding systems within this classification is the FDA guidance document entitled “Class II Special Controls Guidance Document: Pharmacy Compounding Systems; Final Guidance for Industry and FDA Reviewers.” Pharmacy compounding systems classified...
Prototype Facility Educational Specifications.
ERIC Educational Resources Information Center
Idaho State Div. of Professional-Technical Education, Boise.
This document presents prototypical educational specifications to guide the building and renovation of Idaho vocational schools so they can help communities meet the advanced, professional-technical programs of the future. The specifications start with points to consider when determining school site suitability. The document then sets forth…
COAL UTILITY ENVIRONMENTAL COST (CUECOST) WORKBOOK USER'S MANUAL
The document is a user's manual for the Coal Utility Environmental Cost (CUECost) workbook (an interrelated set of spreadsheets) and documents its development and the validity of the methods used to estimate installed capital and annualized costs. The CUECost workbook produces rough-or...
Epidemiology of posttraumatic stress disorder: prevalence, correlates and consequences
Atwoli, Lukoye; Stein, Dan J.; Koenen, Karestan C.; McLaughlin, Katie A.
2015-01-01
Purpose of review This review discusses recent findings from epidemiological surveys of traumatic events and posttraumatic stress disorder (PTSD) globally, including their prevalence, risk factors, and consequences in the community. Recent findings A number of studies on the epidemiology of PTSD have recently been published from diverse countries, with new methodological innovations introduced. Such work has not only documented the prevalence of PTSD in different settings, but has also shed new light on the PTSD conditional risk associated with specific traumatic events, and on the morbidity and comorbidities associated with these events. Summary Recent community studies show that trauma exposure is higher in lower-income countries compared with high-income countries. PTSD prevalence rates are largely similar across countries, however, with the highest rates being found in postconflict settings. Trauma and PTSD-risk factors are distributed differently in lower-income countries compared with high-income countries, with sociodemographic factors contributing more to this risk in high-income than low-income countries. Apart from PTSD, trauma exposure is also associated with several chronic physical conditions. These findings indicate a high burden of trauma exposure in low-income countries and postconflict settings, where access to trained mental health professionals is typically low. PMID:26001922
Vehicle Integrated Prognostic Reasoner (VIPR) Metric Report
NASA Technical Reports Server (NTRS)
Cornhill, Dennis; Bharadwaj, Raj; Mylaraswamy, Dinkar
2013-01-01
This document outlines a set of metrics for evaluating the diagnostic and prognostic schemes developed for the Vehicle Integrated Prognostic Reasoner (VIPR), a system-level reasoner that encompasses the multiple levels of large, complex systems such as those for aircraft and spacecraft. VIPR health managers are organized hierarchically and operate together to derive diagnostic and prognostic inferences from symptoms and conditions reported by a set of diagnostic and prognostic monitors. For layered reasoners such as VIPR, the overall performance cannot be evaluated by metrics solely directed toward timely detection and accuracy of estimation of the faults in individual components. Among other factors, overall vehicle reasoner performance is governed by the effectiveness of the communication schemes between monitors and reasoners in the architecture, and the ability to propagate and fuse relevant information to make accurate, consistent, and timely predictions at different levels of the reasoner hierarchy. We outline an extended set of diagnostic and prognostics metrics that can be broadly categorized as evaluation measures for diagnostic coverage, prognostic coverage, accuracy of inferences, latency in making inferences, computational cost, and sensitivity to different fault and degradation conditions. We report metrics from Monte Carlo experiments using two variations of an aircraft reference model that supported both flat and hierarchical reasoning.
NASIS data base management system - IBM 360/370 OS MVT implementation. 3: Data set specifications
NASA Technical Reports Server (NTRS)
1973-01-01
The data set specifications for the NASA Aerospace Safety Information System (NASIS) are presented. The data set specifications describe the content, format, and medium of communication of every data set required by the system. All relevant information pertinent to a particular set is prepared in a standard form and centralized in a single document. The format for the data set is provided.
NASIS data base management system: IBM 360 TSS implementation. Volume 3: Data set specifications
NASA Technical Reports Server (NTRS)
1973-01-01
The data set specifications for the NASA Aerospace Safety Information System (NASIS) are presented. The data set specifications describe the content, format, and medium of communication of every data set required by the system. All relevant information pertinent to a particular data set is prepared in a standard form and centralized in a single document. The format for the data set is provided.
Data Reorganization for Optimal Time Series Data Access, Analysis, and Visualization
NASA Astrophysics Data System (ADS)
Rui, H.; Teng, W. L.; Strub, R.; Vollmer, B.
2012-12-01
The way data are archived is often not optimal for their access by many user communities (e.g., hydrological), particularly if the data volumes and/or number of data files are large. The number of data records of a non-static data set generally increases with time. Therefore, most data sets are commonly archived by time steps, one step per file, often containing multiple variables. However, many research and application efforts need time series data for a given geographical location or area, i.e., a data organization that is orthogonal to the way the data are archived. The retrieval of a time series of the entire temporal coverage of a data set for a single variable at a single data point, in an optimal way, is an important and longstanding challenge, especially for large science data sets (i.e., with volumes greater than 100 GB). Two examples of such large data sets are the North American Land Data Assimilation System (NLDAS) and Global Land Data Assimilation System (GLDAS), archived at the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC; Hydrology Data Holdings Portal, http://disc.sci.gsfc.nasa.gov/hydrology/data-holdings). To date, the NLDAS data set, hourly 0.125x0.125° from Jan. 1, 1979 to present, has a total volume greater than 3 TB (compressed). The GLDAS data set, 3-hourly and monthly 0.25x0.25° and 1.0x1.0° from Jan. 1948 to present, has a total volume greater than 1 TB (compressed). Both data sets are accessible, in the archived time step format, via several convenient methods, including Mirador search and download (http://mirador.gsfc.nasa.gov/), GrADS Data Server (GDS; http://hydro1.sci.gsfc.nasa.gov/dods/), direct FTP (ftp://hydro1.sci.gsfc.nasa.gov/data/s4pa/), and Giovanni Online Visualization and Analysis (http://disc.sci.gsfc.nasa.gov/giovanni). However, users who need long time series currently have no efficient way to retrieve them. Continuing a longstanding tradition of facilitating data access, analysis, and visualization that contribute to knowledge discovery from large science data sets, the GES DISC recently began a NASA ACCESS-funded project to, in part, optimally reorganize selected large data sets for access and use by the hydrological user community. This presentation discusses the following aspects of the project: (1) explorations of approaches, such as database and file system; (2) findings for each approach, such as limitations and concerns, and pros and cons; (3) implementation of reorganizing data via the file system approach, including data processing (parameter and spatial subsetting), metadata and file structure of reorganized time series data (true "Data Rod," single variable, single grid point, and entire data range per file), and production and quality control. The reorganized time series data will be integrated into several broadly used data tools, such as NASA Giovanni and those provided by CUAHSI HIS (http://his.cuahsi.org/) and EPA BASINS (http://water.epa.gov/scitech/datait/models/basins/), as well as accessible via direct FTP, along with documentation and sample reading software. The data reorganization is initially, as part of the project, applied to selected popular hydrology-related parameters, with other parameters to be added, as resources permit.
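A minimal sketch of the "data rod" extraction this abstract describes: pull the full time series of one variable at one grid point out of an archive of per-time-step files and write it to a single small file. It assumes the xarray package; the file pattern, variable name, and lat/lon coordinate names are hypothetical, and the real NLDAS/GLDAS file layouts differ.

import glob
import xarray as xr

def extract_data_rod(file_pattern, variable, lat, lon, out_path):
    files = sorted(glob.glob(file_pattern))
    # Lazily open all time-step files and concatenate along the time axis.
    ds = xr.open_mfdataset(files, combine="by_coords")
    # Spatial subsetting: nearest grid point to the requested location.
    rod = ds[variable].sel(lat=lat, lon=lon, method="nearest")
    rod.to_netcdf(out_path)   # one variable, one point, entire data range
    return rod

# Example (hypothetical file names and variable):
# extract_data_rod("nldas_hourly_*.nc", "SoilMoist", 38.9, -96.6, "rod.nc")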
Bibliometric analysis of worldwide scientific literature in mobile - health: 2006-2016.
Sweileh, Waleed M; Al-Jabi, Samah W; AbuTaha, Adham S; Zyoud, Sa'ed H; Anayah, Fathi M A; Sawalha, Ansam F
2017-05-30
The advancement of mobile technology has positively influenced healthcare services. An emerging subfield of mobile technology is mobile health (m-Health), in which mobile applications are used for health purposes. The aim of this study was to analyze and assess literature published in the field of m-Health. SciVerse Scopus was used to retrieve literature in m-Health. The study period was set from 2006 to 2016. ArcGIS 10.1 was used to present the geographical distribution of publications, while VOSviewer was used for data visualization. Growth of publications, citation analysis, and research productivity were presented using standard bibliometric indicators. During the study period, a total of 5465 documents were published, giving an average of 496.8 documents per year. The h-index of retrieved documents was 81. Core keywords used in literature pertaining to m-Health included diabetes mellitus, adherence, and obesity, among others. The relative growth rate and doubling time of retrieved literature were stable from 2009 to 2015, indicating exponential growth of literature in this field. A total of 4638 (84.9%) documents were multi-authored, with a mean collaboration index of 4.1 authors per article. The United States of America ranked first in productivity with 1926 (35.2%) published documents. India ranked sixth with 183 (3.3%) documents, while China ranked seventh with 155 (2.8%) documents. VA Medical Center was the most prolific organization/institution, while Journal of Medical Internet Research was the preferred journal for publications in the field of m-Health. Top-cited articles in the field of m-Health addressed the use of mobile technology for improving adherence in HIV patients, promoting weight loss, and improving glycemic control in diabetic patients. The size of the literature in m-Health showed a noticeable increase in the past decade. Given the large volume of citations received in this field, applications of m-Health are expected to extend into many aspects of health and health services. Research in m-Health needs to be encouraged, particularly in the fight against AIDS, poor medication adherence, and poor glycemic control in Africa and other low-income world regions, where technology can improve health services and decrease disease burden.
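The abstract reports relative growth rate and doubling time without giving formulas; the conventional bibliometric definitions, which the authors presumably follow (an assumption, since the abstract does not state them), are, with W_1 and W_2 the cumulative numbers of publications at times T_1 and T_2:

R = \frac{\ln W_2 - \ln W_1}{T_2 - T_1}, \qquad D_t = \frac{\ln 2}{R}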
BOREAS AFM-5 Level-2 Upper Air Network Standard Pressure Level Data
NASA Technical Reports Server (NTRS)
Barr, Alan; Hrynkiw, Charmaine; Hall, Forrest G. (Editor); Newcomer, Jeffrey A. (Editor); Smith, David E. (Technical Monitor)
2000-01-01
The BOREAS AFM-5 team collected and processed data from the numerous radiosonde flights during the project. The goals of the AFM-05 team were to provide large-scale definition of the atmosphere by supplementing the existing AES aerological network, both temporally and spatially. This data set includes basic upper-air parameters interpolated at 0.5 kiloPascal increments of atmospheric pressure from data collected from the network of upper-air stations during the 1993, 1994, and 1996 field campaigns over the entire study region. The data are contained in tabular ASCII files. The data files are available on a CD-ROM (see document number 20010000884) or from the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC).
A large-scale cryoelectronic system for biological sample banking
NASA Astrophysics Data System (ADS)
Shirley, Stephen G.; Durst, Christopher H. P.; Fuchs, Christian C.; Zimmermann, Heiko; Ihmig, Frank R.
2009-11-01
We describe a polymorphic electronic infrastructure for managing biological samples stored over liquid nitrogen. As part of this system we have developed new cryocontainers and carrier plates attached to Flash memory chips to have a redundant and portable set of data at each sample. Our experimental investigations show that basic Flash operation and endurance is adequate for the application down to liquid nitrogen temperatures. This identification technology can provide the best sample identification, documentation and tracking that brings added value to each sample. The first application of the system is in a worldwide collaborative research towards the production of an AIDS vaccine. The functionality and versatility of the system can lead to an essential optimization of sample and data exchange for global clinical studies.
Renn, O
1997-01-01
Risk perceptions are only slightly correlated with the expected values of a probability distribution for negative health impacts. Psychometric studies have documented that context variables such as dread or personal control are important predictors for the perceived seriousness of risk. Studies about cultural patterns of risk perceptions emphasize different response sets to risk information, depending on cultural priorities such as social justice versus personal freedom. This chapter reports the major psychological research results pertaining to the factors that govern individual risk perception and discusses the psychometric effects due to people's risk perception and the experience of severe stress. The relative importance of the psychometric context variables, the signals pertaining to each health risk, and symbolic beliefs are explained.
Planes, trains, automobiles--and tea sets: extremely intense interests in very young children.
DeLoache, Judy S; Simcock, Gabrielle; Macari, Suzanne
2007-11-01
Some normally developing young children show an intense, passionate interest in a particular category of objects or activities. The present article documents the existence of extremely intense interests that emerge very early in life and establishes some of the basic parameters of the phenomenon. Surveys and interviews with 177 parents revealed that nearly one third of young children have extremely intense interests. The nature of these intense interests is described, with particular focus on their emergence, commonalities in the content of the interests, and the reactions of other people to them. One of the most striking findings is a large gender difference: Extremely intense interests are much more common for young boys than for girls. (c) 2007 APA.
Open data used in water sciences - Review of access, licenses and understandability
NASA Astrophysics Data System (ADS)
Falkenroth, Esa; Lagerbäck Adolphi, Emma; Arheimer, Berit
2016-04-01
The amount of open data available for hydrology research is continually growing. In the EU-funded project SWITCH-ON (Sharing Water-related Information to Tackle Changes in the Hydrosphere - for Operational Needs: www.water-switch-on.eu), we are addressing water concerns by exploring and exploiting the untapped potential of these new open data. This work is enabled by many ongoing efforts to facilitate the use of open data. For instance, a number of portals provide the means to search for open data sets and open spatial data services (such as the GEOSS Portal, INSPIRE community geoportal or various Climate Services and public portals). However, in general, many research groups in water sciences still hesitate in using this open data. We therefore examined some limiting factors. Factors that limit usability of a dataset include: (1) accessibility, (2) understandability and (3) licences. In the SWITCH-ON project we have developed a search tool for finding and accessing data with relevance to water science in Europe, as the existing ones are not addressing data needs in water sciences specifically. The tool is filled with some 9000 sets of metadata and each one is linked to water related key-words. The keywords are based on the ones developed within the CUAHSI community in USA, but extended with non-hydrosphere topics, additional subclasses and only showing keywords actually having data. Access to data sets: 78% of the data is directly accessible, while the rest is either available after registration and request, or through a web client for visualisation but without direct download. However, several data sets were found to be inaccessible due to server downtime, incorrect links or problems with the host database management system. One possible explanation for this could be that many datasets have been assembled by research project that no longer are funded. Hence, their server infrastructure would be less maintained compared to large-scale operational services. Understandability of the data sets: 13 major formats were found, but the major issues encountered were due to incomplete documentation or metadata and problems with decoding binary formats. Ideally, open data sets should be represented in well-known formats and they should be accompanied with sufficient documentation so the data set can be understood. The development efforts on Water ML and NETCDF and other standards could improve understandability of data sets over time but in this review, only a few data sets were provided in these formats. Instead, the majority of datasets were stored in various text-based or binary formats or even document-oriented formats such as PDF. Other disciplines such as meteorology have long-standing traditions of operational data exchange format whereas hydrology research is still quite fragmented and the data exchange is usually done on a case-by-case basis. With the increased sharing of open data there is a good chance the situation will improve for data sets used also in water sciences. License issue: Only 3% of the data is completely free to use, while 57% can be used for non-commercial purposes or research. A high number of datasets did not have a clear statement on terms of use and limitation for access. In most cases the provider could be contacted regarding licensing issues.
Document image binarization using "multi-scale" predefined filters
NASA Astrophysics Data System (ADS)
Saabni, Raid M.
2018-04-01
Reading text or searching for key words within a historical document is a very challenging task. One of the first steps of the complete task is binarization, where we separate foreground such as text, figures, and drawings from the background. The results of this step can in many cases determine whether subsequent steps succeed or fail; it is therefore vital to the complete task of reading and analyzing the content of a document image. Generally, historical document images are of poor quality due to their storage conditions and degradation over time, which mostly lead to varying contrast, stains, dirt, and ink seeping through from the reverse side. In this paper, we use banks of anisotropic predefined filters at different scales and orientations to develop a binarization method for degraded documents and manuscripts. Using the fact that handwritten strokes may follow different scales and orientations, we use predefined filter banks with various scales, weights, and orientations to seek a compact set of filters and weights that generate different layers of foreground and background. The results of convolving these filters locally on the gray-level image are weighted and accumulated to enhance the original image. Based on the different layers, seeds of components in the gray-level image, and a learning process, we present an improved binarization algorithm to separate the background from layers of foreground. Layers of foreground caused by seeping ink, degradation, or other factors are separated from the real foreground in a second phase. Promising experimental results were obtained on the DIBCO2011, DIBCO2013, and H-DIBCO2016 data sets and a collection of images taken from real historical documents.
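A rough sketch of multi-scale, multi-orientation filter-bank enhancement followed by global thresholding, assuming OpenCV and a grayscale uint8 input. It illustrates the accumulation of oriented filter responses described above; the paper's seed-based, multi-layer separation and learning steps are not reproduced, and the kernel parameters are arbitrary choices.

import cv2
import numpy as np

def filterbank_binarize(gray, scales=(7, 11, 15), n_orientations=8):
    acc = np.zeros(gray.shape, dtype=np.float32)
    img = gray.astype(np.float32) / 255.0
    for ksize in scales:                         # multiple scales
        for k in range(n_orientations):          # multiple orientations
            theta = k * np.pi / n_orientations
            kern = cv2.getGaborKernel((ksize, ksize), sigma=ksize / 3.0,
                                      theta=theta, lambd=ksize / 2.0,
                                      gamma=0.5, psi=0)
            kern -= kern.mean()                  # zero-mean, stroke-sensitive
            acc += np.abs(cv2.filter2D(img, cv2.CV_32F, kern))
    enhanced = cv2.normalize(acc, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # Ink strokes give strong accumulated responses; Otsu separates them
    # from the (low-response) background.
    _, binary = cv2.threshold(enhanced, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary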
Automated selected reaction monitoring software for accurate label-free protein quantification.
Teleman, Johan; Karlsson, Christofer; Waldemarson, Sofia; Hansson, Karin; James, Peter; Malmström, Johan; Levander, Fredrik
2012-07-06
Selected reaction monitoring (SRM) is a mass spectrometry method with documented ability to quantify proteins accurately and reproducibly using labeled reference peptides. However, the use of labeled reference peptides becomes impractical if large numbers of peptides are targeted and when high flexibility is desired when selecting peptides. We have developed a label-free quantitative SRM workflow that relies on a new automated algorithm, Anubis, for accurate peak detection. Anubis efficiently removes interfering signals from contaminating peptides to estimate the true signal of the targeted peptides. We evaluated the algorithm on a published multisite data set and achieved results in line with manual data analysis. In complex peptide mixtures from whole proteome digests of Streptococcus pyogenes we achieved a technical variability across the entire proteome abundance range of 6.5-19.2%, which was considerably below the total variation across biological samples. Our results show that the label-free SRM workflow with automated data analysis is feasible for large-scale biological studies, opening up new possibilities for quantitative proteomics and systems biology.
Stanley, W.D.; Blakely, R.J.
1995-01-01
The Geysers-Clear Lake geothermal area encompasses a large dry-steam production area in The Geysers field and a documented high-temperature, high-pressure, water-dominated system in the area largely south of Clear Lake, which has not been developed. An updated view is presented of the geological/geophysical complexities of the crust in this region in order to address key unanswered questions about the heat source and tectonics. Forward modeling, multidimensional inversions, and ideal body analysis of the gravity data, new electromagnetic sounding models, and arguments made from other geophysical data sets suggest that many of the geophysical anomalies have significant contributions from rock property and physical state variations in the upper 7 km and not from "magma' at greater depths. Regional tectonic and magmatic processes are analyzed to develop an updated scenario for pluton emplacement that differs substantially from earlier interpretations. In addition, a rationale is outlined for future exploration for geothermal resources in The Geysers-Clear Lake area. -from Authors
Patel, Chirag J
2017-01-01
Mixtures, or combinations and interactions between multiple environmental exposures, are hypothesized to be causally linked with disease and health-related phenotypes. Established and emerging molecular measurement technologies to assay the exposome, the comprehensive battery of exposures encountered from birth to death, promise a new way of identifying mixtures in disease in the epidemiological setting. In this opinion, we describe the analytic complexity and challenges in identifying mixtures associated with phenotype and disease. Existing and emerging machine-learning methods and data analytic approaches (e.g., "environment-wide association studies" [EWASs]), as well as large cohorts, may enhance possibilities to identify mixtures of correlated exposures associated with phenotypes; however, the analytic complexity of identifying mixtures is immense. If the exposome concept is realized, new analytical methods and large sample sizes will be required to ascertain how mixtures are associated with disease. The author recommends documenting prevalent correlated exposures and replicated main effects prior to identifying mixtures.
The High Cost of Failing to Reform Public Education in Texas. School Choice Issues in the State
ERIC Educational Resources Information Center
Gottlob, Brian J.
2008-01-01
Research has documented a crisis in Texas high school graduation rates. Only 67 percent of Texas students graduate from high school, and some large urban districts have graduation rates of 50 percent or lower. This study documents the public costs of high school dropouts in Texas and examines how school choice could provide large public benefits…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-18
... Log Sets b. Vented Hearth Products C. National Energy Savings D. Other Comments 1. Test Procedures 2... address vented gas log sets. DOE clarified its position on vented gas log sets in a document published on... vented gas log sets are included in the definition of ``vented hearth heater''; DOE has reached this...
ERIC Educational Resources Information Center
Saaristo, Vesa; Kulmala, Jenni; Raisamo, Susanna; Rimpelä, Arja; Ståhl, Timo
2014-01-01
Finnish national data sets on schools (N = 496) and pupils (N = 74,143; 14-16 years) were used to study whether a systematic documenting policy for the violations of school smoking bans was associated with pupils' smoking and their perceptions on the enforcement of smoking bans. Attending a school with a systematic documenting policy was…
43 CFR 45.70 - How must documents be filed and served under this subpart?
Code of Federal Regulations, 2013 CFR
2013-10-01
... must be served on each license party and FERC, using: (i) One of the methods of service in § 45.13(c... one of the methods set forth in § 45.12(b). (2) A document is considered filed on the date it is received. However, any document received after 5 p.m. at the place where the filing is due is considered...
50 CFR 221.70 - How must documents be filed and served under this subpart?
Code of Federal Regulations, 2013 CFR
2013-10-01
... party and FERC, using: (i) One of the methods of service in § 221.13(c); or (ii) Regular mail. (2) The... this subpart? (a) Filing. (1) A document under this subpart must be filed using one of the methods set... document received after 5 p.m. at the place where the filing is due is considered filed on the next regular...
43 CFR 45.70 - How must documents be filed and served under this subpart?
Code of Federal Regulations, 2011 CFR
2011-10-01
... must be served on each license party and FERC, using: (i) One of the methods of service in § 45.13(c... one of the methods set forth in § 45.12(b). (2) A document is considered filed on the date it is received. However, any document received after 5 p.m. at the place where the filing is due is considered...
43 CFR 45.70 - How must documents be filed and served under this subpart?
Code of Federal Regulations, 2012 CFR
2012-10-01
... must be served on each license party and FERC, using: (i) One of the methods of service in § 45.13(c... one of the methods set forth in § 45.12(b). (2) A document is considered filed on the date it is received. However, any document received after 5 p.m. at the place where the filing is due is considered...
Seeking Feng Shui in US-China Rhetoric - Words Matter
2017-03-31
...leaders' rhetoric conflates contingency planning threat analysis as U.S.-China policy and is inconsistent with the threats China poses. Not only is...national strategy documents can be viewed as political documents that may not represent true U.S. intent, both sets of documents still require adherence to
Essential medicines availability is still suboptimal in many countries: a scoping review.
Mahmić-Kaknjo, Mersiha; Jeličić-Kadić, Antonia; Utrobičić, Ana; Chan, Kit; Bero, Lisa; Marušić, Ana
2018-06-01
To identify uses of the WHO Model List of essential medicines (EMs) and to summarize studies examining EM and national EM lists (NEMLs). In this scoping review, we searched PubMed, Scopus, the WHO website, and WHO Regional Databases for studies on NEMLs, reimbursement medicines lists, and the WHO EML, with no date or language restrictions. Three thousand one hundred forty-four retrieved documents were independently screened by two reviewers; 100 full-text documents were analyzed; 37 contained data suitable for quantitative and qualitative analysis on EM availability (11 documents), medicines for specific diseases (13 documents), and comparison of the WHO EML and NEMLs (13 documents). From the latter, two documents analyzed the relevance of evidence from Cochrane systematic reviews for medicines that were on NEMLs but not on the WHO EML. EM availability is still suboptimal in low-income countries. Availability of children's formulations and of EMs for specific conditions such as chronic diseases, cancer, pain, and reproductive health is suboptimal even in middle-income countries. The WHO EML can be used as a basic set of medicines for different settings. More evidence is needed on how NEMLs can contribute to better availability of children's formulations and of pain and cancer medicines in developing countries. Copyright © 2018 Elsevier Inc. All rights reserved.
Integrated Baseline System (IBS), Version 1.03. [Chemical Stockpile Emergency Preparedness Program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bailey, B.M.; Burford, M.J.; Downing, T.R.
The Integrated Baseline System (IBS), operated by the Federal Emergency Management Agency (FEMA), is a system of computerized tools for emergency planning and analysis. This document is the user guide for the IBS and explains how to operate the IBS system. The fundamental function of the IBS is to provide tools that civilian emergency management personnel can use in developing emergency plans and in supporting emergency management activities to cope with a chemical-releasing event at a military chemical stockpile. Emergency management planners can evaluate concepts and ideas using the IBS system. The results of that experience can then be factored into refining requirements and plans. This document provides information for the general system user, and is the primary reference for the system features of the IBS. It is designed for persons who are familiar with general emergency management concepts, operations, and vocabulary. Although the IBS manual set covers basic and advanced operations, it is not a complete reference document set. Emergency situation modeling software in the IBS is supported by additional technical documents. Some of the other IBS software is commercial software for which more complete documentation is available. The IBS manuals reference such documentation where necessary. IBS is a dynamic system. Its capabilities are in a state of continuing expansion and enhancement.
ERIC Educational Resources Information Center
Northwest Regional Educational Lab., Portland, OR.
This document consists of 68 microcomputer software package evaluations prepared by MicroSIFT (Microcomputer Software and Information for Teachers) Clearinghouse at the Northwest Regional Education Laboratory. There are 26 packages in set 13 and 42 in set 14. Each software review lists producer, time and place of evaluation, cost, ability level,…
Forward Modeling of Large-scale Structure: An Open-source Approach with Halotools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hearin, Andrew P.; Campbell, Duncan; Tollerud, Erik
We present the first stable release of Halotools (v0.2), a community-driven Python package designed to build and test models of the galaxy-halo connection. Halotools provides a modular platform for creating mock universes of galaxies starting from a catalog of dark matter halos obtained from a cosmological simulation. The package supports many of the common forms used to describe galaxy-halo models: the halo occupation distribution, the conditional luminosity function, abundance matching, and alternatives to these models that include effects such as environmental quenching or variable galaxy assembly bias. Satellite galaxies can be modeled to live in subhalos or to follow custom number density profiles within their halos, including spatial and/or velocity bias with respect to the dark matter profile. The package has an optimized toolkit to make mock observations on a synthetic galaxy population—including galaxy clustering, galaxy–galaxy lensing, galaxy group identification, RSD multipoles, void statistics, pairwise velocities and others—allowing direct comparison to observations. Halotools is object-oriented, enabling complex models to be built from a set of simple, interchangeable components, including those of your own creation. Halotools has an automated testing suite and is exhaustively documented on http://halotools.readthedocs.io, which includes quickstart guides, source code notes and a large collection of tutorials. The documentation is effectively an online textbook on how to build and study empirical models of galaxy formation with Python.
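To make the halo occupation idea concrete, the sketch below populates halos with a simple erf-plus-power-law occupation model of the kind Halotools provides prebuilt. The parameter values and function names are illustrative assumptions, and this is a generic stand-in, not the Halotools API.

import numpy as np
from scipy.special import erf

def mean_occupation(log_mhalo, log_mmin=12.0, sigma=0.26,
                    log_m0=11.5, log_m1=13.3, alpha=1.0):
    # Mean number of central and satellite galaxies as a function of halo mass.
    n_cen = 0.5 * (1.0 + erf((log_mhalo - log_mmin) / sigma))
    mhalo = 10.0 ** log_mhalo
    base = np.clip(mhalo - 10.0 ** log_m0, 0.0, None) / 10.0 ** log_m1
    n_sat = n_cen * base ** alpha
    return n_cen, n_sat

def populate(log_mhalo, seed=0):
    # Draw central (Bernoulli) and satellite (Poisson) counts per halo.
    rng = np.random.default_rng(seed)
    n_cen, n_sat = mean_occupation(np.asarray(log_mhalo, dtype=float))
    centrals = (rng.random(n_cen.shape) < n_cen).astype(int)
    satellites = rng.poisson(n_sat)
    return centrals + satellites

print(populate([11.8, 12.5, 13.7, 14.5]))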
Integration of a knowledge-based system and a clinical documentation system via a data dictionary.
Eich, H P; Ohmann, C; Keim, E; Lang, K
1997-01-01
This paper describes the design and realisation of a knowledge-based system and a clinical documentation system linked via a data dictionary. The software was developed as a shell with object-oriented methods and C++ for IBM-compatible PCs and WINDOWS 3.1/95. The data dictionary covers terminology and document objects with relations to external classifications. It controls the terminology in the documentation program, with form-based entry of clinical documents, and in the knowledge-based system, with scores and rules. The software was applied to the clinical field of acute abdominal pain by implementing a data dictionary with 580 terminology objects, 501 document objects, and 2136 links; a documentation module with 8 clinical documents; and a knowledge-based system with 10 scores and 7 sets of rules.
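A toy sketch of the linking idea: documentation forms and knowledge-based rules both refer to shared terminology objects held in a single data dictionary, so the controlled vocabulary stays consistent across the two subsystems. The class and field names are illustrative assumptions, not the paper's C++ design.

from dataclasses import dataclass, field

@dataclass
class TerminologyObject:
    code: str                    # internal code, may link to a classification
    label: str

@dataclass
class DocumentObject:
    name: str
    items: list                  # terminology codes collected on this form

@dataclass
class Rule:
    name: str
    condition: callable          # evaluated against recorded findings

@dataclass
class DataDictionary:
    terms: dict = field(default_factory=dict)
    documents: dict = field(default_factory=dict)
    rules: list = field(default_factory=list)

    def add_term(self, code, label):
        self.terms[code] = TerminologyObject(code, label)

    def evaluate(self, findings):
        # findings: {terminology code: recorded value}
        return [r.name for r in self.rules if r.condition(findings)]

dd = DataDictionary()
dd.add_term("RLQ_PAIN", "right lower quadrant pain")
dd.documents["admission"] = DocumentObject("admission", ["RLQ_PAIN"])
dd.rules.append(Rule("suspect appendicitis",
                     lambda f: f.get("RLQ_PAIN") is True))
print(dd.evaluate({"RLQ_PAIN": True}))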
ETC 408/508: Technical Editing
ERIC Educational Resources Information Center
Charlton, Michael
2013-01-01
The course will focus on the role of the editor in organizational settings, including creating successful writer/editor collaboration. Students will gain practice in editing documents for grammar, syntax, organization, style, emphasis, document design, graphics, and user-centered design. The course will provide an introduction to technology for…
77 FR 76606 - Community Development Financial Institutions Fund
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-28
... form, with pre-set text limits and font size restrictions. Applicants must submit their narrative responses by using the FY 2013 CDFI Program Application narrative template document. This Word document...) A-133 Narrative Report; (iv) Institution Level Report; (v) Transaction Level Report (for Awardees...
Real-effectiveness medicine--pursuing the best effectiveness in the ordinary care of patients.
Malmivaara, Antti
2013-03-01
Clinical know-how and skills, as well as up-to-date scientific evidence, are cornerstones for providing effective treatment for patients. However, in order to improve the effectiveness of treatment in ordinary practice, appropriate documentation of care at health care units and benchmarking based on this documentation are also needed. This article presents the new concept of real-effectiveness medicine (REM), which pursues the best effectiveness of patient care in the real-world setting. In order to reach this goal, four layers of information are utilized: 1) good medical know-how and skills combined with the patient view, 2) up-to-date scientific evidence, 3) continuous documentation of performance in ordinary settings, and 4) benchmarking between providers. The new framework is suggested for clinicians, organizations, policy-makers, and researchers.
Leveraging Existing Heritage Documentation for Animations: Senate Virtual Tour
NASA Astrophysics Data System (ADS)
Dhanda, A.; Fai, S.; Graham, K.; Walczak, G.
2017-08-01
The use of digital documentation techniques has led to an increase in opportunities for using documentation data for valorization purposes, in addition to technical purposes. Likewise, building information models (BIMs) made from these data sets hold valuable information that can be as effective for public education as it is for rehabilitation. A BIM can reveal the elements of a building, as well as the different stages of a building over time. Valorizing this information increases the possibility for public engagement and interest in a heritage place. Digital data sets were leveraged by the Carleton Immersive Media Studio (CIMS) for parts of a virtual tour of the Senate of Canada. For the tour, workflows involving four different programs were explored to determine an efficient and effective way to leverage the existing documentation data to create informative and visually enticing animations for public dissemination: Autodesk Revit, Enscape, Autodesk 3ds Max, and Bentley Pointools. The explored workflows involve animations of point clouds, BIMs, and a combination of the two.
Integration of Medical Scribes in the Primary Care Setting: Improving Satisfaction.
Imdieke, Brian H; Martel, Marc L
There is little published data on the use of medical scribes in the primary care setting. We assessed the feasibility of incorporating medical scribes in our ambulatory clinic to support provider documentation in the electronic medical record. In our convenience sampling of patient, provider, and staff perceptions of scribes, we found that patients were comfortable having scribes in the clinic. Overall indicators of patient satisfaction were slightly decreased. Providers found scribe support to be valuable, and overall clinician documentation time was reduced by more than 50% using scribes.
2012-01-01
Background Implementation of evidence-based practices in real-world settings is a complex process impacted by many factors, including intervention, dissemination, service provider, and organizational characteristics. Efforts to improve knowledge translation have resulted in greater attention to these factors. Researcher attention to the applicability of findings to applied settings has also increased. Much less attention, however, has been paid to intervention feasibility, an issue important to applied settings. Methods In a systematic review of 121 documents regarding integrated treatment programs for women with substance abuse issues and their children, we examined the presence of feasibility-related information. Specifically, we analysed study descriptions for information regarding feasibility factors in six domains (intervention, practitioner, client, service delivery, organizational, and service system). Results On average, fewer than half of the 25 feasibility details assessed were included in the documents. Most documents included some information describing the participating clients, the services offered as part of the intervention, the location of services, and the expected length of stay or number of sessions. Only approximately half of the documents included specific information about the treatment model. Few documents indicated whether the intervention was manualized or whether the intervention was preceded by a standardized screening or assessment process. Very few provided information about the core intervention features versus the features open to local adaptation, or the staff experience or training required to deliver the intervention. Conclusions As has been found in reviews of intervention studies in other fields, our findings revealed that most documents provide some client and intervention information, but few documents provided sufficient information to fully evaluate feasibility. We consider possible explanations for the paucity of feasibility information and provide suggestions for better reporting to promote diffusion of evidence-based practices. PMID:23217025
Dynamic reduction of dimensions of a document vector in a document search and retrieval system
Jiao, Yu; Potok, Thomas E.
2011-05-03
The method and system of the invention involve processing each new document (20) coming into the system into a document vector (16), and creating a document vector with reduced dimensionality (17) for comparison with the data model (15) without recomputing the data model (15). These operations are carried out by a first computer (11), while a second computer (12) updates the data model (18), which can be composed of an initial large group of documents (19) and is premised on computing an initial data model (13, 14, 15) to provide a reference point for determining document vectors from documents processed from the data stream (20).
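A minimal sketch of the general idea in this record, using scikit-learn LSA as a stand-in: the reduced-dimension model is computed once from an initial document group, and each newly arriving document is projected into that space for comparison without recomputing the model. This illustrates the concept only; it is not the patented system.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

initial_corpus = [
    "network intrusion detection using anomaly models",
    "stream processing of text documents at scale",
    "vector space retrieval and latent semantic indexing",
    "distributed computing for large document collections",
]

# "Second computer": build the data model from the initial document group.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(initial_corpus)
svd = TruncatedSVD(n_components=2, random_state=0)
model_space = svd.fit_transform(X)            # reference points in reduced space

# "First computer": reduce each incoming document with the fixed model.
new_doc = ["latent semantic retrieval for streaming documents"]
new_vec = svd.transform(vectorizer.transform(new_doc))   # no refit of the model

print(cosine_similarity(new_vec, model_space))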
Statistical Techniques for Efficient Indexing and Retrieval of Document Images
ERIC Educational Resources Information Center
Bhardwaj, Anurag
2010-01-01
We have developed statistical techniques to improve the performance of document image search systems where the intermediate step of OCR based transcription is not used. Previous research in this area has largely focused on challenges pertaining to generation of small lexicons for processing handwritten documents and enhancement of poor quality…
Comparing Latent Dirichlet Allocation and Latent Semantic Analysis as Classifiers
ERIC Educational Resources Information Center
Anaya, Leticia H.
2011-01-01
In the Information Age, a proliferation of unstructured text electronic documents exists. Processing these documents by humans is a daunting task as humans have limited cognitive abilities for processing large volumes of documents that can often be extremely lengthy. To address this problem, text data computer algorithms are being developed.…
A hypertext system that learns from user feedback
NASA Technical Reports Server (NTRS)
Mathe, Nathalie
1994-01-01
Retrieving specific information from large amounts of documentation is not an easy task. It could be facilitated if information relevant to the current problem-solving context could be automatically supplied to the user. As a first step towards this goal, we have developed an intelligent hypertext system called CID (Computer Integrated Documentation). Besides providing a hypertext interface for browsing large documents, the CID system automatically acquires and reuses the context in which previous searches were appropriate. This mechanism utilizes on-line user information requirements and relevance feedback either to reinforce current indexing in case of success or to generate new knowledge in case of failure. Thus, the user continually augments and refines the intelligence of the retrieval system. This allows the CID system to provide helpful responses, based on previous usage of the documentation, and to improve its performance over time. We successfully tested the CID system with users of the Space Station Freedom requirements documents. We are currently extending CID to other application domains (Space Shuttle operations documents, airplane maintenance manuals, and on-line training). We are also exploring the potential commercialization of this technique.
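A small sketch of context-sensitive relevance feedback of the kind CID performs: successful retrievals strengthen the association between a usage context and a document, and failures weaken it. The additive update rule and the class names are simple stand-ins, not the CID mechanism itself.

from collections import defaultdict

class FeedbackIndex:
    def __init__(self, reinforce=1.0, penalize=0.5):
        self.weights = defaultdict(float)   # (context, doc) -> strength
        self.reinforce = reinforce
        self.penalize = penalize

    def feedback(self, context, doc, relevant):
        key = (context, doc)
        if relevant:
            self.weights[key] += self.reinforce     # success: strengthen link
        else:
            self.weights[key] -= self.penalize      # failure: weaken link

    def rank(self, context, candidates):
        # Order candidate documents by their learned strength in this context.
        return sorted(candidates,
                      key=lambda d: self.weights[(context, d)], reverse=True)

idx = FeedbackIndex()
idx.feedback("power subsystem troubleshooting", "doc-42", relevant=True)
idx.feedback("power subsystem troubleshooting", "doc-17", relevant=False)
print(idx.rank("power subsystem troubleshooting", ["doc-17", "doc-42"]))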
Garcia, V.; Conway, C.J.
2009-01-01
Because reliable estimates of nesting success are very important to avian studies, the definition of a “successful nest” and the use of different analytical methods to estimate success have received much attention. By contrast, variation in the criteria used to determine whether an occupied site that did not produce offspring contained a nesting attempt is a source of bias that has been largely ignored. This problem is especially severe in studies that deal with species whose nest contents are relatively inaccessible, because observers cannot determine whether or not an egg was laid for a large proportion of occupied sites. Burrowing Owls (Athene cunicularia) often lay their eggs ≥3 m below ground, so past Burrowing Owl studies have used a variety of criteria to determine whether a nesting attempt was initiated. We searched the literature to document the extent of that variation and examined how that variation influenced estimates of daily nest survival. We found 13 different sets of criteria used by previous authors and applied each set of criteria to our data set of 1,300 occupied burrows. We found significant variation in estimates of daily nest survival depending on the criteria used. Moreover, differences in daily nest survival among populations were apparent using some sets of criteria but not others. These inconsistencies may lead to incorrect conclusions and invalidate comparisons of productivity and relative site quality among populations. We encourage future authors working on cavity-, canopy-, or burrow-nesting birds to provide specific details on the criteria they used to identify a nesting attempt.
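The abstract does not state which estimator of daily nest survival was used; a common Mayfield-type formulation, given here only to show why the inclusion criteria matter, is

\hat{S} = 1 - \frac{\text{number of failed nesting attempts}}{\text{total exposure days}},
\qquad \text{nest success} \approx \hat{S}^{\,L}

where L is the length of the nesting period in days. Because the criteria for what counts as a nesting attempt change both the number of failures and the exposure days, the resulting estimates shift with the criteria chosen.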
ERIC Educational Resources Information Center
Northwest Regional Educational Lab., Portland, OR.
This document consists of 170 microcomputer software package evaluations prepared by the MicroSIFT (Microcomputer Software and Information for Teachers) Clearinghouse at the Northwest Regional Education Laboratory. Set 11 consists of 37 packages. Set 12 consists of 34 packages. A special unnumbered set, entitled LIBRA Reviews, treats 99 packages…
NASA Astrophysics Data System (ADS)
Zhang, Jinyu; Steel, Ronald; Ambrose, William
2017-12-01
Shelf margins prograde and aggrade by the incremental addition of deltaic sediments supplied from river channel belts and by stored shoreline sediment. This paper documents the shelf-edge trajectory and coeval channel belts for a segment of Paleocene Lower Wilcox Group in the northern Gulf of Mexico based on 400 wireline logs and 300 m of whole cores. By quantitatively analyzing these data and comparing them with global databases, we demonstrate how varying sediment supply impacted the Wilcox shelf-margin growth and deep-water sediment dispersal under greenhouse eustatic conditions. The coastal plain to marine topset and uppermost continental slope succession of the Lower Wilcox shelf-margin sediment prism is divided into eighteen high-frequency ( 300 ky duration) stratigraphic sequences, and further grouped into 5 sequence sets (labeled as A-E from bottom to top). Sequence Set A is dominantly muddy slope deposits. The shelf edge of Sequence Sets B and C prograded rapidly (> 10 km/Ma) and aggraded modestly (< 80 m/Ma). The coeval channel belts are relatively large (individually averaging 11-13 m thick) and amalgamated. The water discharge of Sequence Sets B and C rivers, estimated by channel-belt thickness, bedform type, and grain size, is 7000-29,000 m3/s, considered as large rivers when compared with modern river databases. In contrast, slow progradation (< 10 km/Ma) and rapid aggradation (> 80 m/Ma) characterizes Sequence Sets D and E, which is associated with smaller (9-10 m thick on average) and isolated channel belts. This stratigraphic trend is likely due to an upward decreasing sediment supply indicated by the shelf-edge progradation rate and channel size, as well as an upward increasing shelf accommodation indicated by the shelf-edge aggradation rate. The rapid shelf-edge progradation and large rivers in Sequence Sets B and C confirm earlier suggestions that it was the early phase of Lower Wilcox dispersal that brought the largest deep-water sediment volumes into the Gulf of Mexico. Key factors in this Lower Wilcox stratigraphic trend are likely to have been a very high initial sediment flux to the Gulf because of the high initial release of sediment from Laramide catchments to the north and northwest, possibly aided by modest eustatic sea-level fall on the Texas shelf, which is suggested by the early, flat shelf-edge trajectory, high amalgamation of channel belts, and the low overall aggradation rate of the Sequence Sets B and C.
Intelligent search and retrieval of a large multimedia knowledgebase for the Hubble Space Telescope
NASA Technical Reports Server (NTRS)
Clapis, Paul J.; Byers, William S.
1990-01-01
A document-retrieval assistant (DRA) in a microcomputer format is described which incorporates hypertext and natural language capabilities. Hypertext is used to introduce an intelligent search capability, and the natural-language interface permits access to specific data without the use of keywords. The DRA can be used to access and 'browse' the large multimedia database that is composed of project documentation from the HST.
44 CFR 10.12 - Pre-implementation actions.
Code of Federal Regulations, 2011 CFR
2011-10-01
... integrated into the decision-making process. Because of the diversity of FEMA, it is not feasible to describe..., for integration of environmental considerations into the decision-making process. The Regional... document for the purpose of justifying the decision. Rather it is a concise document that sets forth the...
44 CFR 10.12 - Pre-implementation actions.
Code of Federal Regulations, 2014 CFR
2014-10-01
... integrated into the decision-making process. Because of the diversity of FEMA, it is not feasible to describe..., for integration of environmental considerations into the decision-making process. The Regional... document for the purpose of justifying the decision. Rather it is a concise document that sets forth the...
44 CFR 10.12 - Pre-implementation actions.
Code of Federal Regulations, 2013 CFR
2013-10-01
... integrated into the decision-making process. Because of the diversity of FEMA, it is not feasible to describe..., for integration of environmental considerations into the decision-making process. The Regional... document for the purpose of justifying the decision. Rather it is a concise document that sets forth the...
15 CFR 904.244 - Production of documents and inspection.
Code of Federal Regulations, 2011 CFR
2011-01-01
... (Continued) NATIONAL OCEANIC AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS CIVIL PROCEDURES Hearing and Appeal Procedures Discovery § 904.244 Production of documents and inspection. (a... the request is served. (b) Procedure. The request must set forth: (1) The items to be produced or...
15 CFR 904.244 - Production of documents and inspection.
Code of Federal Regulations, 2010 CFR
2010-01-01
... (Continued) NATIONAL OCEANIC AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS CIVIL PROCEDURES Hearing and Appeal Procedures Discovery § 904.244 Production of documents and inspection. (a... the request is served. (b) Procedure. The request must set forth: (1) The items to be produced or...
77 FR 72829 - Marine Mammals; File No. 16305
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-06
... Toxicology, Maine Center for Toxicology and Environmental Health, University of Southern Maine, 478 Science... turtle biological samples for scientific research purposes. ADDRESSES: The permit and related documents... consistent with the purposes and policies set forth in section 2 of the ESA. Documents may be reviewed in the...
Visualizing the Topical Structure of the Medical Sciences: A Self-Organizing Map Approach
Skupin, André; Biberstine, Joseph R.; Börner, Katy
2013-01-01
Background We implement a high-resolution visualization of the medical knowledge domain using the self-organizing map (SOM) method, based on a corpus of over two million publications. While self-organizing maps have been used for document visualization for some time, (1) little is known about how to deal with truly large document collections in conjunction with a large number of SOM neurons, (2) post-training geometric and semiotic transformations of the SOM tend to be limited, and (3) no user studies have been conducted with domain experts to validate the utility and readability of the resulting visualizations. Our study makes key contributions to all of these issues. Methodology Documents extracted from Medline and Scopus are analyzed on the basis of indexer-assigned MeSH terms. Initial dimensionality is reduced to include only the top 10% most frequent terms and the resulting document vectors are then used to train a large SOM consisting of over 75,000 neurons. The resulting two-dimensional model of the high-dimensional input space is then transformed into a large-format map by using geographic information system (GIS) techniques and cartographic design principles. This map is then annotated and evaluated by ten experts stemming from the biomedical and other domains. Conclusions Study results demonstrate that it is possible to transform a very large document corpus into a map that is visually engaging and conceptually stimulating to subject experts from both inside and outside of the particular knowledge domain. The challenges of dealing with a truly large corpus come to the fore and require embracing parallelization and use of supercomputing resources to solve otherwise intractable computational tasks. Among the envisaged future efforts are the creation of a highly interactive interface and the elaboration of the notion of this map of medicine acting as a base map, onto which other knowledge artifacts could be overlaid. PMID:23554924
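For readers unfamiliar with the SOM training step, the sketch below fits a tiny self-organizing map to random stand-in document vectors and reads off each document's position on the 2D grid. The corpus size, grid size, and parameters are toy values, nowhere near the 75,000-neuron model described above, and the vectors are random placeholders rather than MeSH-based term weights.

# Minimal self-organizing map sketch for document vectors (toy scale).
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_terms = 200, 50                      # toy corpus: 200 documents, 50 retained terms
docs = rng.random((n_docs, n_terms))           # stand-ins for term-weight vectors

grid_w, grid_h = 10, 10                        # 10x10 map instead of >75,000 neurons
weights = rng.random((grid_w, grid_h, n_terms))
coords = np.stack(np.meshgrid(np.arange(grid_w), np.arange(grid_h), indexing="ij"), axis=-1)

def bmu(vec):
    """Best-matching unit: the grid cell whose weight vector is closest to vec."""
    d = np.linalg.norm(weights - vec, axis=2)
    return np.unravel_index(np.argmin(d), d.shape)

n_steps = 2000
for step in range(n_steps):
    lr = 0.5 * (1 - step / n_steps)                          # decaying learning rate
    radius = max(1.0, (grid_w / 2) * (1 - step / n_steps))   # shrinking neighbourhood
    vec = docs[rng.integers(n_docs)]
    winner = np.array(bmu(vec))
    dist2 = ((coords - winner) ** 2).sum(axis=-1)
    influence = np.exp(-dist2 / (2 * radius ** 2))[..., None]
    weights += lr * influence * (vec - weights)              # pull neighbourhood toward the document

# After training, each document's best-matching unit gives its 2D map position.
positions = [bmu(v) for v in docs]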
Functions and requirements document for interim store solidified high-level and transuranic waste
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith-Fewell, M.A., Westinghouse Hanford
1996-05-17
The functions, requirements, interfaces, and architectures contained within the Functions and Requirements (F&R) Document are based on the information currently contained within the TWRS Functions and Requirements database. The database also documents the set of technically defensible functions and requirements associated with the solidified waste interim storage mission. The F&R Document provides a snapshot in time of the technical baseline for the project. The F&R Document is the product of functional analysis, requirements allocation, and architectural structure definition. The technical baseline described in this document is traceable to the TWRS function 4.2.4.1, Interim Store Solidified Waste, and its related requirements, architecture, and interfaces.
Large Scale eHealth Deployment in Europe: Insights from Concurrent Use of Standards.
Eichelberg, Marco; Chronaki, Catherine
2016-01-01
Large-scale eHealth deployment projects face a major challenge when they must select the right set of standards and tools to achieve sustainable interoperability in an ecosystem that includes both legacy systems and new systems reflecting technological trends and progress. No single standard covers all the needs of an eHealth project, and a multitude of overlapping and sometimes competing standards can be employed to define document formats, terminology, and communication protocols, mirroring alternative technical approaches and schools of thought. eHealth projects therefore need to answer the important question of how alternative or inconsistently implemented standards and specifications can be used to ensure practical interoperability and long-term sustainability in large-scale eHealth deployment. In the eStandards project, 19 European case studies reporting from R&D and large-scale eHealth deployment and policy projects were analyzed. Although the study is not exhaustive, by reflecting on the concepts, standards, and tools in concurrent use and on the successes, failures, and lessons learned, this paper offers practical insights into how eHealth deployment projects can make the most of the available eHealth standards and tools, and how standards and profile developing organizations can serve users while embracing sustainability and technical innovation.
Volume I: fluidized-bed code documentation, for the period February 28, 1983-March 18, 1983
DOE Office of Scientific and Technical Information (OSTI.GOV)
Piperopoulou, H.; Finson, M.; Bloomfield, D.
1983-03-01
This documentation supersedes the previous documentation of the Fluidized-Bed Gasifier code. Volume I documents a simulation program of a Fluidized-Bed Gasifier (FBG), and Volume II documents a systems model of the FBG. The FBG simulation program is an updated version of the PSI/FLUBED code, which is capable of modeling slugging beds and variable bed diameter. In its present form the code is set up to model a Westinghouse commercial-scale gasifier. The fluidized-bed gasifier model combines the classical bubbling-bed description for the transport and mixing processes with PSI-generated models for coal chemistry. At the distributor plate, the bubble composition is that of the inlet gas, and the initial bubble size is set by the details of the distributor plate. Bubbles grow by coalescence as they rise. The bubble composition and temperature change with height due to transport to and from the cloud as well as homogeneous reactions within the bubble. The cloud composition also varies with height due to cloud/bubble exchange, cloud/emulsion exchange, and heterogeneous coal char reactions. The emulsion phase is considered to be well mixed.
ERIC Educational Resources Information Center
Walsh, Patricia Noonan
2008-01-01
This report gives an account of applying a health survey tool by the "Pomona" Group that earlier documented the process of developing a set of health indicators for people with intellectual disabilities in Europe. The "Pomona" health indicator set mirrors the much larger set of health indicators prepared by the European…
Multiple Representation Document Development
1987-08-06
Interleaf Publishing System [6], or FrameMaker [5], formatting is an integral part of document editing. Here, the document is reformatted as it is edited... consider the task of laying out a page of windowed text. In a layout-driven WYSIWYG system like PageMaker [11], Ready-Set-Go! [12], or FrameMaker [5]... in MS Word, FrameMaker, and Tioga are all handled by a noninteractive off-line program. Direct manipulation, from the processing point of view
Madrigal, Emilio; Prajapati, Shyam; Avadhani, Vaidehi; Annen, Kyle; Friedman, Mark T
2017-02-01
A previous study in our hospitals correlated suboptimal documentation and failure to justify transfusions. In light of implemented blood-conservation strategies, including patient blood management (PBM) and prospective audits (PAs), we performed a follow-up study. We reviewed prospectively audited red blood cell (RBC) transfusions received by adult patients from January to July 2014. Survey forms were used to assess the level of documentation and to classify documentation as adequate, intermediate, or inadequate. Transfusions were deemed justified or not by comparisons with hospital transfusion guidelines. We also analyzed the effect of implemented blood-conservation strategies on our hospital transfusion rates and costs from 2009 to 2015. During the study period, there were 259 prospectively audited transfusion events (TEs) (one or more RBC units transfused to a patient), of which we reviewed 94 TEs (36.3%) in 87 patients. TEs with suboptimal (intermediate and inadequate) documentation accounted for 46.8% of the reviewed TEs, of which 81.8% could not be justified compared with 18.0% of nonjustified, adequately documented TEs. The correlation between suboptimal documentation and failure to justify transfusion was significant (p < 0.001). This correlation remained even in a comparison between the site with a PBM program and the sites without such a program. Overall transfusion rates declined after the introduction of PA, although the decline was only statistically significant at the sites with a PBM program. Suboptimal transfusion documentation remains problematic and is highly correlated with nonjustifiable transfusions. Newly adopted approaches to minimize blood transfusions have not improved transfusion documentation and corresponding out-of-guideline transfusions, although overall transfusions have been reduced by PA, particularly in the setting of a PBM program. © 2016 AABB.
Grabenhenrich, L B; Reich, A; Bellach, J; Trendelenburg, V; Sprikkelman, A B; Roberts, G; Grimshaw, K E C; Sigurdardottir, S; Kowalski, M L; Papadopoulos, N G; Quirce, S; Dubakiene, R; Niggemann, B; Fernández-Rivas, M; Ballmer-Weber, B; van Ree, R; Schnadt, S; Mills, E N C; Keil, T; Beyer, K
2017-03-01
The conduct of oral food challenges as the preferred diagnostic standard for food allergy (FA) has been harmonized in recent years. However, documentation and interpretation of challenge results, particularly in research settings, are not sufficiently standardized to allow valid comparisons between studies. Our aim was to develop a diagnostic toolbox to capture and report clinical observations in double-blind placebo-controlled food challenges (DBPCFC). A group of experienced allergists, paediatricians, dieticians, epidemiologists and data managers developed generic case report forms and standard operating procedures for DBPCFCs and piloted them in three clinical centres. The follow-up of the EuroPrevall/iFAAM birth cohort and other iFAAM work packages applied these methods. A set of newly developed questionnaire or interview items captures the history of FA. Together with sensitization status, this forms the basis for the decision to perform a DBPCFC, following a standardized decision algorithm. A generic form including details about severity and timing captures signs and symptoms observed during or after the procedures. In contrast to the commonly used dichotomous outcome FA vs no FA, the allergy status is interpreted in multiple categories to reflect the complexity of clinical decision-making. The proposed toolbox sets a standard for improved documentation and harmonized interpretation of DBPCFCs. Through detailed documentation and a common terminology for communicating outcomes, these tools aim to reduce the influence of the supervising physicians' subjective judgment. All forms are publicly available for further evolution and free use in clinical and research settings. © 2016 The Authors. Allergy Published by John Wiley & Sons Ltd.
Holden, Chris; Lee, Kelley
2011-05-19
Transnational tobacco companies (TTCs) may respond to processes of regional trade integration both by acting politically to influence policy and by reorganising their own operations. The Central American Common Market (CACM) was reinvigorated in the 1990s, reflecting processes of regional trade liberalisation in Latin America and globally. This study aimed to ascertain how British American Tobacco (BAT), which dominated the markets of the CACM, sought to influence policy towards it by member country governments and how the CACM process impacted upon BAT's operations. The study analysed internal tobacco industry documents released as a result of litigation in the US and available from the online Legacy Tobacco Documents Library at http://legacy.library.ucsf.edu/. Documents were retrieved by searching the BAT collection using key terms in an iterative process. Analysis was based on an interpretive approach involving a process of attempting to understand the meanings of individual documents and relating these to other documents in the set, identifying the central themes of documents and clusters of documents, contextualising the documentary data, and choosing representative material in order to present findings. Utilising its multinational character, BAT was able to act in a coordinated way across the member countries of the CACM to influence tariffs and taxes to its advantage. Documents demonstrate a high degree of access to governments and officials. The company conducted a coordinated, and largely successful, attempt to keep external tariff rates for cigarettes high and to reduce external tariffs for key inputs, whilst also influencing the harmonisation of excise taxes between countries. Protected by these high external tariffs, it reorganised its own operations to take advantage of regional economies of scale. In direct contradiction to arguments presented to CACM governments that affording the tobacco industry protection via high cigarette tariffs would safeguard employment, the company's regional reorganisation involved the loss of hundreds of jobs. Regional integration organisations and their member states should be aware of the capacity of TTCs to act in a coordinated transnational manner to influence policy in their own interests, and coordinate their own public health and tax policies in a similarly effective way.
Single embryo transfer and IVF/ICSI outcome: a balanced appraisal.
Gerris, Jan M R
2005-01-01
This review considers the value of single embryo transfer (SET) to prevent multiple pregnancies (MP) after IVF/ICSI. The incidence of MP (twins and higher order pregnancies) after IVF/ICSI is much higher (approximately 30%) than after natural conception (approximately 1%). Approximately half of all the neonates are multiples. The obstetric, neonatal and long-term consequences for the health of these children are enormous, and the costs incurred are extremely high. Judicious SET is the only method to decrease this epidemic of iatrogenic multiple gestations. Clinical trials have shown that programmes with >50% of SET maintain high overall ongoing pregnancy rates (approximately 30% per started cycle) while reducing the MP rate to <10%. Experience with SET remains largely European, although the need to reduce MP is accepted worldwide. An important issue is how to select patients suitable for SET and embryos with a high putative implantation potential. The typical patient suitable for SET is young (aged <36 years) and in her first or second IVF/ICSI trial. Embryo selection is performed using one or a combination of embryo characteristics. Available evidence suggests that, for the overall population, day 3 and day 5 selection yield similar results, both better than zygote selection. Prospective studies correlating embryo characteristics with documented implantation potential, utilizing databases of individual embryos, are needed. The application of SET should be supported by other measures: reimbursement of IVF/ICSI (earned back by reducing costs), optimized cryopreservation to augment cumulative pregnancy rates per oocyte harvest, and a standardized format for reporting results. To make SET the standard of care in the appropriate target group, there is a need for more clinical studies, for intensive counselling of patients, and for an increased sense of responsibility in patients, health care providers and health insurers.
Methods and apparatus of analyzing electrical power grid data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hafen, Ryan P.; Critchlow, Terence J.; Gibson, Tara D.
Apparatus and methods of processing large-scale data regarding an electrical power grid are described. According to one aspect, a method of processing large-scale data regarding an electrical power grid includes accessing a large-scale data set comprising information regarding an electrical power grid; processing data of the large-scale data set to identify a filter which is configured to remove erroneous data from the large-scale data set; using the filter, removing erroneous data from the large-scale data set; and after the removing, processing data of the large-scale data set to identify an event detector which is configured to identify events of interest in the large-scale data set.
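The abstract does not specify how the filter or event detector are constructed, so the sketch below is only a hypothetical illustration of the two-stage pipeline it describes: drop physically implausible samples, then flag values that deviate sharply from the rest. The thresholds, field meanings, and data are invented.

# Hedged sketch: clean a stream of grid frequency readings, then flag candidate events.
import statistics

readings = [60.01, 59.99, 60.02, 0.0, 59.40, 60.00, 60.03, 120.5, 59.98]  # Hz, toy data

# 1) Filter: remove samples outside a physically plausible band (erroneous data).
clean = [f for f in readings if 55.0 <= f <= 65.0]

# 2) Event detector: flag samples far from the running median (events of interest).
median = statistics.median(clean)
events = [f for f in clean if abs(f - median) > 0.5]

print("cleaned:", clean)
print("possible events:", events)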
Campana, Lorenzo; Breitbeck, Robert; Bauer-Kreuz, Regula; Buck, Ursula
2016-05-01
This study evaluated the feasibility of documenting patterned injuries in three dimensions and true colour using photography, without complex 3D surface documentation methods. The method is based on a 3D surface model generated from radiologic slice images (CT), while the colour information is derived from photographs taken with commercially available cameras. The external patterned injuries were documented in 16 cases using digital photography as well as highly precise photogrammetry-supported 3D structured light scanning. The internal findings of the deceased were recorded using CT and MRI. For registration of the internal with the external data, two different types of radiographic markers were used and compared. The 3D surface model generated from CT slice images was linked with the photographs, and thereby digital true-colour 3D models of the patterned injuries could be created (Image projection onto CT/IprojeCT). In addition, these external models were merged with the models of the somatic interior. We demonstrated that 3D documentation and visualization of external injury findings by integration of digital photography in CT/MRI data sets is suitable for the 3D documentation of individual patterned injuries to a body. Nevertheless, this documentation method is not a substitute for photogrammetry and surface scanning, especially when the entire bodily surface is to be recorded in three dimensions including all external findings, and when precise data are required for comparing highly detailed injury features with the injury-inflicting tool.
75 FR 54889 - Development of Set 24 Toxicological Profiles
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-09
... these documents will be available at the ATSDR Web site: http://www.atsdr.cdc.gov/toxpro2.html . Set 24... toxicological profiles for each substance included on the Priority List of Hazardous Substances ( http://www...
Final June Revisions Rule State Budgets and New Unit Set-Asides TSD
This technical support document (TSD) for the final revisions to the Transport Rule shows the underlying data and calculations used to quantify the state budget revisions and new unit set-aside revisions.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-18
... the Regional Entities set priorities of what to audit, and are they doing a good job setting priorities? Do audits focus too much on documentation? Would alternative auditing methods also demonstrate...
Vulnerability Analyst’s Guide to Geometric Target Description
1992-09-01
…does not constitute indorsement of any commercial product. [Report documentation page and table-of-contents fragments omitted: 5.3 Surrogacy, 5.4 Specialized Targets, 5.5 …] …commercially available documents for other large-scale software. The documentation is not a BRL technical report, but can be obtained by contacting…
A Study and Model of Machine-Like Indexing Behavior by Human Indexers.
ERIC Educational Resources Information Center
McAllister, Caryl
Although a large part of a document retrieval system's resources is devoted to indexing, the question of how people do subject indexing has been the subject of much conjecture and only a little experimentation. This dissertation examines the relationships between a document being indexed and the index terms assigned to that document in an attempt…
Feng, Rung-Chuang; Tseng, Kuan-Jui; Yan, Hsiu-Fang; Huang, Hsiu-Ya; Chang, Polun
2012-01-01
This study examines the capability of the Clinical Care Classification (CCC) system to represent nursing record data in a medical center in Taiwan. Nursing care records were analyzed using the process of knowledge discovery in data sets. The study data set included all the nursing care plan records from December 1998 to October 2008, totaling 2,060,214 care plan documentation entries. Results show that 75.42% of the documented diagnosis terms could be mapped using the CCC system. A total of 21 established nursing diagnoses were recommended to be added to the CCC system. Results also show that one-third of the assessment and care tasks were provided by nursing professionals. This study shows that the CCC system is useful for identifying patterns in nursing practices and can be used to construct a nursing database in the acute setting. PMID:24199066
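The mapping exercise behind the 75.42% figure can be pictured as matching documented diagnosis terms against a reference term set and measuring coverage. The sketch below illustrates that idea with invented placeholder terms; it does not reproduce CCC content or the study's mapping rules.

# Illustrative sketch: map free-text diagnosis entries to a reference term set
# and report coverage. All terms below are invented placeholders.

reference_terms = {"acute pain", "activity intolerance", "anxiety"}

documented = ["Acute Pain", "anxiety ", "sleep deprivation", "Activity intolerance", "acute pain"]

def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so minor formatting differences still match."""
    return " ".join(term.lower().split())

mapped = [t for t in documented if normalize(t) in reference_terms]
unmapped = sorted({normalize(t) for t in documented} - reference_terms)

coverage = len(mapped) / len(documented)
print(f"mapped {len(mapped)}/{len(documented)} entries ({coverage:.1%})")
print("candidate additions to the terminology:", unmapped)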
Natural language processing and the representation of clinical data.
Sager, N; Lyman, M; Bucknall, C; Nhan, N; Tick, L J
1994-01-01
OBJECTIVE: Develop a representation of clinical observations and actions and a method of processing free-text patient documents to facilitate applications such as quality assurance. DESIGN: The Linguistic String Project (LSP) system of New York University utilizes syntactic analysis, augmented by a sublanguage grammar and an information structure that are specific to the clinical narrative, to map free-text documents into a database for querying. MEASUREMENTS: Information precision (I-P) and information recall (I-R) were measured for queries for the presence of 13 asthma-health-care quality assurance criteria in a database generated from 59 discharge letters. RESULTS: I-P, using counts of major errors only, was 95.7% for the 28-letter training set and 98.6% for the 31-letter test set. I-R, using counts of major omissions only, was 93.9% for the training set and 92.5% for the test set. PMID:7719796
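Information precision and information recall as reported here follow the usual precision/recall arithmetic over extracted facts: I-P is the share of extracted items that are correct, and I-R is the share of relevant items that were extracted. A minimal sketch with placeholder counts (not the study's data):

# Minimal sketch of the evaluation arithmetic for I-P and I-R.

def precision_recall(retrieved_correct, retrieved_total, relevant_total):
    i_p = retrieved_correct / retrieved_total   # share of retrieved facts that are right
    i_r = retrieved_correct / relevant_total    # share of relevant facts that were found
    return i_p, i_r

i_p, i_r = precision_recall(retrieved_correct=137, retrieved_total=139, relevant_total=146)
print(f"I-P = {i_p:.1%}, I-R = {i_r:.1%}")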
CCBD's Position Summary on Physical Restraint & Seclusion Procedures in School Settings
ERIC Educational Resources Information Center
Peterson, Reece; Albrecht, Susan; Johns, Bev
2009-01-01
This document is a summary of policy recommendations from two longer and more detailed documents available from the Council for Children with Behavioral Disorders (CCBD) regarding the use of physical restraint and seclusion procedures in schools. These recommendations include: (1) CCBD believes that physical restraint or seclusion procedures…
Space law information system design, phase 2
NASA Technical Reports Server (NTRS)
Morenoff, J.; Roth, D. L.; Singleton, J. W.
1973-01-01
Design alternatives were defined for the implementation of a Space Law Information System for the Office of the General Counsel, NASA. A thesaurus of space law terms was developed and a selected document sample indexed on the basis of that thesaurus. Abstracts were also prepared for the sample document set.
Readability of Informed Consent Documents at University Counseling Centers
ERIC Educational Resources Information Center
Lustgarten, Samuel D.; Elchert, Daniel M.; Cederberg, Charles; Garrison, Yunkyoung L.; Ho, Y. C. S.
2017-01-01
The extent to which clients understand the nature and anticipated course of therapy is referred to as informed consent. Counseling psychologists often provide informed consent documents to enhance clients' education about services and for liability purposes. Professionals in numerous health care settings have evaluated the readability of their informed…
Playing the Assessment Game: An English Early Childhood Education Perspective
ERIC Educational Resources Information Center
Basford, Jo; Bath, Caroline
2014-01-01
Assessment and the documentation of learning is an international issue in early childhood education (ECE) and has increasingly become a way for governments to exercise direct control over the practitioners working with young children. This paper details recent statutory guidance about assessment and documentation for English ECE settings and…
ERIC Educational Resources Information Center
Rockley, Ann
1993-01-01
Describes how an analysis of Ontario Hydro's conversion of 20,000 pages of paper manuals to online documentation established the scope of the project, provided a set of design criteria, and recommended the use of Standard Generalized Markup Language to create the new documentation and the purchase of the "Dinatext" program to produce it.…