automatic text summarization: Topics by Science.gov

Sample records for automatic text summarization

Automatic Text Structuring and Summarization.

ERIC Educational Resources Information Center

Salton, Gerard; And Others

1997-01-01

Discussion of the use of information retrieval techniques for automatic generation of semantic hypertext links focuses on automatic text summarization. Topics include World Wide Web links, text segmentation, and evaluation of text summarization by comparing automatically generated abstracts with manually prepared abstracts. (Author/LRW)
Comparison of Document Index Graph Using TextRank and HITS Weighting Method in Automatic Text Summarization

NASA Astrophysics Data System (ADS)

Hadyan, Fadhlil; Shaufiah; Arif Bijaksana, Moch.

2017-01-01

Automatic summarization is a system that can help someone to take the core information of a long text instantly. The system can help by summarizing text automatically. there’s Already many summarization systems that have been developed at this time but there are still many problems in those system. In this final task proposed summarization method using document index graph. This method utilizes the PageRank and HITS formula used to assess the web page, adapted to make an assessment of words in the sentences in a text document. The expected outcome of this final task is a system that can do summarization of a single document, by utilizing document index graph with TextRank and HITS to improve the quality of the summary results automatically.
Automatic Text Summarization for Indonesian Language Using TextTeaser

NASA Astrophysics Data System (ADS)

Gunawan, D.; Pasaribu, A.; Rahmat, R. F.; Budiarto, R.

2017-04-01

Text summarization is one of the solution for information overload. Reducing text without losing the meaning not only can save time to read, but also maintain the reader’s understanding. One of many algorithms to summarize text is TextTeaser. Originally, this algorithm is intended to be used for text in English. However, due to TextTeaser algorithm does not consider the meaning of the text, we implement this algorithm for text in Indonesian language. This algorithm calculates four elements, such as title feature, sentence length, sentence position and keyword frequency. We utilize TextRank, an unsupervised and language independent text summarization algorithm, to evaluate the summarized text yielded by TextTeaser. The result shows that the TextTeaser algorithm needs more improvement to obtain better accuracy.
Generalized minimum dominating set and application in automatic text summarization

NASA Astrophysics Data System (ADS)

Xu, Yi-Zhi; Zhou, Hai-Jun

2016-03-01

For a graph formed by vertices and weighted edges, a generalized minimum dominating set (MDS) is a vertex set of smallest cardinality such that the summed weight of edges from each outside vertex to vertices in this set is equal to or larger than certain threshold value. This generalized MDS problem reduces to the conventional MDS problem in the limiting case of all the edge weights being equal to the threshold value. We treat the generalized MDS problem in the present paper by a replica-symmetric spin glass theory and derive a set of belief-propagation equations. As a practical application we consider the problem of extracting a set of sentences that best summarize a given input text document. We carry out a preliminary test of the statistical physics-inspired method to this automatic text summarization problem.
Keyphrase based Evaluation of Automatic Text Summarization

NASA Astrophysics Data System (ADS)

Elghannam, Fatma; El-Shishtawy, Tarek

2015-05-01

The development of methods to deal with the informative contents of the text units in the matching process is a major challenge in automatic summary evaluation systems that use fixed n-gram matching. The limitation causes inaccurate matching between units in a peer and reference summaries. The present study introduces a new Keyphrase based Summary Evaluator KpEval for evaluating automatic summaries. The KpEval relies on the keyphrases since they convey the most important concepts of a text. In the evaluation process, the keyphrases are used in their lemma form as the matching text unit. The system was applied to evaluate different summaries of Arabic multi-document data set presented at TAC2011. The results showed that the new evaluation technique correlates well with the known evaluation systems: Rouge1, Rouge2, RougeSU4, and AutoSummENG MeMoG. KpEval has the strongest correlation with AutoSummENG MeMoG, Pearson and spearman correlation coefficient measures are 0.8840, 0.9667 respectively.
Evaluation of automatic video summarization systems

NASA Astrophysics Data System (ADS)

Taskiran, Cuneyt M.

2006-01-01

Compact representations of video, or video summaries, data greatly enhances efficient video browsing. However, rigorous evaluation of video summaries generated by automatic summarization systems is a complicated process. In this paper we examine the summary evaluation problem. Text summarization is the oldest and most successful summarization domain. We show some parallels between these to domains and introduce methods and terminology. Finally, we present results for a comprehensive evaluation summary that we have performed.
Summarization as the base for text assessment

NASA Astrophysics Data System (ADS)

Karanikolas, Nikitas N.

2015-02-01

We present a model that apply shallow text summarization as a cheap (in resources needed) process for Automatic (machine based) free text answer Assessment (AA). The evaluation of the proposed method induces the inference that the Conventional Assessment (CA, man made assessment of free text answers) does not have an obvious mechanical replacement. However, this is a research challenge.
An Automatic Multidocument Text Summarization Approach Based on Naïve Bayesian Classifier Using Timestamp Strategy

PubMed Central

Ramanujam, Nedunchelian; Kaliappan, Manivannan

2016-01-01

Nowadays, automatic multidocument text summarization systems can successfully retrieve the summary sentences from the input documents. But, it has many limitations such as inaccurate extraction to essential sentences, low coverage, poor coherence among the sentences, and redundancy. This paper introduces a new concept of timestamp approach with Naïve Bayesian Classification approach for multidocument text summarization. The timestamp provides the summary an ordered look, which achieves the coherent looking summary. It extracts the more relevant information from the multiple documents. Here, scoring strategy is also used to calculate the score for the words to obtain the word frequency. The higher linguistic quality is estimated in terms of readability and comprehensibility. In order to show the efficiency of the proposed method, this paper presents the comparison between the proposed methods with the existing MEAD algorithm. The timestamp procedure is also applied on the MEAD algorithm and the results are examined with the proposed method. The results show that the proposed method results in lesser time than the existing MEAD algorithm to execute the summarization process. Moreover, the proposed method results in better precision, recall, and F-score than the existing clustering with lexical chaining approach. PMID:27034971
Figure-associated text summarization and evaluation.

PubMed

Polepalli Ramesh, Balaji; Sethi, Ricky J; Yu, Hong

2015-01-01

Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information content. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+. We evaluate the performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method, achieving F1 score of 0.66 and ROUGE-1 score of 0.97. The annotated data is available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903).
Figure-Associated Text Summarization and Evaluation

PubMed Central

Polepalli Ramesh, Balaji; Sethi, Ricky J.; Yu, Hong

2015-01-01

Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information content. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+. We evaluate the performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method, achieving F1 score of 0.66 and ROUGE-1 score of 0.97. The annotated data is available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903). PMID:25643357
Enhancing biomedical text summarization using semantic relation extraction.

PubMed

Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

2011-01-01

Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization.
Enhancing Biomedical Text Summarization Using Semantic Relation Extraction

PubMed Central

Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

2011-01-01

Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization. PMID:21887336
Summarizing Expository Texts

ERIC Educational Resources Information Center

Westby, Carol; Culatta, Barbara; Lawrence, Barbara; Hall-Kenyon, Kendra

2010-01-01

Purpose: This article reviews the literature on students' developing skills in summarizing expository texts and describes strategies for evaluating students' expository summaries. Evaluation outcomes are presented for a professional development project aimed at helping teachers develop new techniques for teaching summarization. Methods: Strategies…
Different approaches for identifying important concepts in probabilistic biomedical text summarization.

PubMed

Moradi, Milad; Ghadiri, Nasser

2018-01-01

Automatic text summarization tools help users in the biomedical domain to acquire their intended information from various textual resources more efficiently. Some of biomedical text summarization systems put the basis of their sentence selection approach on the frequency of concepts extracted from the input text. However, it seems that exploring other measures rather than the raw frequency for identifying valuable contents within an input document, or considering correlations existing between concepts, may be more useful for this type of summarization. In this paper, we describe a Bayesian summarization method for biomedical text documents. The Bayesian summarizer initially maps the input text to the Unified Medical Language System (UMLS) concepts; then it selects the important ones to be used as classification features. We introduce six different feature selection approaches to identify the most important concepts of the text and select the most informative contents according to the distribution of these concepts. We show that with the use of an appropriate feature selection approach, the Bayesian summarizer can improve the performance of biomedical summarization. Using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit, we perform extensive evaluations on a corpus of scientific papers in the biomedical domain. The results show that when the Bayesian summarizer utilizes the feature selection methods that do not use the raw frequency, it can outperform the biomedical summarizers that rely on the frequency of concepts, domain-independent and baseline methods. Copyright © 2017 Elsevier B.V. All rights reserved.
Graph-based biomedical text summarization: An itemset mining and sentence clustering approach.

PubMed

Nasr Azadani, Mozhgan; Ghadiri, Nasser; Davoodijam, Ensieh

2018-06-12

Automatic text summarization offers an efficient solution to access the ever-growing amounts of both scientific and clinical literature in the biomedical domain by summarizing the source documents while maintaining their most informative contents. In this paper, we propose a novel graph-based summarization method that takes advantage of the domain-specific knowledge and a well-established data mining technique called frequent itemset mining. Our summarizer exploits the Unified Medical Language System (UMLS) to construct a concept-based model of the source document and mapping the document to the concepts. Then, it discovers frequent itemsets to take the correlations among multiple concepts into account. The method uses these correlations to propose a similarity function based on which a represented graph is constructed. The summarizer then employs a minimum spanning tree based clustering algorithm to discover various subthemes of the document. Eventually, it generates the final summary by selecting the most informative and relative sentences from all subthemes within the text. We perform an automatic evaluation over a large number of summaries using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The results demonstrate that the proposed summarization system outperforms various baselines and benchmark approaches. The carried out research suggests that the incorporation of domain-specific knowledge and frequent itemset mining equips the summarization system in a better way to address the informativeness measurement of the sentences. Moreover, clustering the graph nodes (sentences) can enable the summarizer to target different main subthemes of a source document efficiently. The evaluation results show that the proposed approach can significantly improve the performance of the summarization systems in the biomedical domain. Copyright © 2018. Published by Elsevier Inc.
Using clustering and a modified classification algorithm for automatic text summarization

NASA Astrophysics Data System (ADS)

Aries, Abdelkrime; Oufaida, Houda; Nouali, Omar

2013-01-01

In this paper we describe a modified classification method destined for extractive summarization purpose. The classification in this method doesn't need a learning corpus; it uses the input text to do that. First, we cluster the document sentences to exploit the diversity of topics, then we use a learning algorithm (here we used Naive Bayes) on each cluster considering it as a class. After obtaining the classification model, we calculate the score of a sentence in each class, using a scoring model derived from classification algorithm. These scores are used, then, to reorder the sentences and extract the first ones as the output summary. We conducted some experiments using a corpus of scientific papers, and we have compared our results to another summarization system called UNIS.1 Also, we experiment the impact of clustering threshold tuning, on the resulted summary, as well as the impact of adding more features to the classifier. We found that this method is interesting, and gives good performance, and the addition of new features (which is simple using this method) can improve summary's accuracy.
Automatic analysis of medical dialogue in the home hemodialysis domain: structure induction and summarization.

PubMed

Lacson, Ronilda C; Barzilay, Regina; Long, William J

2006-10-01

Spoken medical dialogue is a valuable source of information for patients and caregivers. This work presents a first step towards automatic analysis and summarization of spoken medical dialogue. We first abstract a dialogue into a sequence of semantic categories using linguistic and contextual features integrated in a supervised machine-learning framework. Our model has a classification accuracy of 73%, compared to 33% achieved by a majority baseline (p<0.01). We then describe and implement a summarizer that utilizes this automatically induced structure. Our evaluation results indicate that automatically generated summaries exhibit high resemblance to summaries written by humans. In addition, task-based evaluation shows that physicians can reasonably answer questions related to patient care by looking at the automatically generated summaries alone, in contrast to the physicians' performance when they were given summaries from a naïve summarizer (p<0.05). This work demonstrates the feasibility of automatically structuring and summarizing spoken medical dialogue.
Automatic summarization of soccer highlights using audio-visual descriptors.

PubMed

Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc

2015-01-01

Automatic summarization generation of sports video content has been object of great interest for many years. Although semantic descriptions techniques have been proposed, many of the approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and to the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization generation of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that will be further analyzed to determine its relevance and interest. Of special interest in the approach is the use of the audio information that provides additional robustness to the overall performance of the summarization system. For every video shot a set of low and mid level audio-visual descriptors are computed and lately adequately combined in order to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with highest interest according to the specifications of the user and the results of relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.
Task-Driven Dynamic Text Summarization

ERIC Educational Resources Information Center

Workman, Terri Elizabeth

2011-01-01

The objective of this work is to examine the efficacy of natural language processing (NLP) in summarizing bibliographic text for multiple purposes. Researchers have noted the accelerating growth of bibliographic databases. Information seekers using traditional information retrieval techniques when searching large bibliographic databases are often…
Using Text Messaging to Summarize Text

ERIC Educational Resources Information Center

Williams, Angela Ruffin

2012-01-01

Summarizing is an academic task that students are expected to have mastered by the time they enter college. However, experience has revealed quite the contrary. Summarization is often difficult to master as well as teach, but instructors in higher education can benefit greatly from the rapid advancement in mobile wireless technology devices, by…

Automatic Summarization of MEDLINE Citations for Evidence–Based Medical Treatment: A Topic-Oriented Evaluation

PubMed Central

Fiszman, Marcelo; Demner-Fushman, Dina; Kilicoglu, Halil; Rindflesch, Thomas C.

2009-01-01

As the number of electronic biomedical textual resources increases, it becomes harder for physicians to find useful answers at the point of care. Information retrieval applications provide access to databases; however, little research has been done on using automatic summarization to help navigate the documents returned by these systems. After presenting a semantic abstraction automatic summarization system for MEDLINE citations, we concentrate on evaluating its ability to identify useful drug interventions for fifty-three diseases. The evaluation methodology uses existing sources of evidence-based medicine as surrogates for a physician-annotated reference standard. Mean average precision (MAP) and a clinical usefulness score developed for this study were computed as performance metrics. The automatic summarization system significantly outperformed the baseline in both metrics. The MAP gain was 0.17 (p < 0.01) and the increase in the overall score of clinical usefulness was 0.39 (p < 0.05). PMID:19022398
Rules for Summarizing Texts: Is Classroom Instruction Being Provided?

ERIC Educational Resources Information Center

Garner, Ruth

1984-01-01

An "ideal lesson" method was used to observe how teachers were providing instruction in text summarization. Teachers prepared a lesson and audiotaped the presentation. Although teachers were prompted to assist students in improving text summaries, only two teachers actually discussed more than one summarization rule. Staff development…
Unsupervised Extraction of Diagnosis Codes from EMRs Using Knowledge-Based and Extractive Text Summarization Techniques

PubMed Central

Kavuluru, Ramakanth; Han, Sifei; Harris, Daniel

2017-01-01

Diagnosis codes are extracted from medical records for billing and reimbursement and for secondary uses such as quality control and cohort identification. In the US, these codes come from the standard terminology ICD-9-CM derived from the international classification of diseases (ICD). ICD-9 codes are generally extracted by trained human coders by reading all artifacts available in a patient’s medical record following specific coding guidelines. To assist coders in this manual process, this paper proposes an unsupervised ensemble approach to automatically extract ICD-9 diagnosis codes from textual narratives included in electronic medical records (EMRs). Earlier attempts on automatic extraction focused on individual documents such as radiology reports and discharge summaries. Here we use a more realistic dataset and extract ICD-9 codes from EMRs of 1000 inpatient visits at the University of Kentucky Medical Center. Using named entity recognition (NER), graph-based concept-mapping of medical concepts, and extractive text summarization techniques, we achieve an example based average recall of 0.42 with average precision 0.47; compared with a baseline of using only NER, we notice a 12% improvement in recall with the graph-based approach and a 7% improvement in precision using the extractive text summarization approach. Although diagnosis codes are complex concepts often expressed in text with significant long range non-local dependencies, our present work shows the potential of unsupervised methods in extracting a portion of codes. As such, our findings are especially relevant for code extraction tasks where obtaining large amounts of training data is difficult. PMID:28748227
Automatic video summarization driven by a spatio-temporal attention model

NASA Astrophysics Data System (ADS)

Barland, R.; Saadane, A.

2008-02-01

According to the literature, automatic video summarization techniques can be classified in two parts, following the output nature: "video skims", which are generated using portions of the original video and "key-frame sets", which correspond to the images, selected from the original video, having a significant semantic content. The difference between these two categories is reduced when we consider automatic procedures. Most of the published approaches are based on the image signal and use either pixel characterization or histogram techniques or image decomposition by blocks. However, few of them integrate properties of the Human Visual System (HVS). In this paper, we propose to extract keyframes for video summarization by studying the variations of salient information between two consecutive frames. For each frame, a saliency map is produced simulating the human visual attention by a bottom-up (signal-dependent) approach. This approach includes three parallel channels for processing three early visual features: intensity, color and temporal contrasts. For each channel, the variations of the salient information between two consecutive frames are computed. These outputs are then combined to produce the global saliency variation which determines the key-frames. Psychophysical experiments have been defined and conducted to analyze the relevance of the proposed key-frame extraction algorithm.
Chinese Text Summarization Algorithm Based on Word2vec

NASA Astrophysics Data System (ADS)

Chengzhang, Xu; Dan, Liu

2018-02-01

In order to extract some sentences that can cover the topic of a Chinese article, a Chinese text summarization algorithm based on Word2vec is used in this paper. Words in an article are represented as vectors trained by Word2vec, the weight of each word, the sentence vector and the weight of each sentence are calculated by combining word-sentence relationship with graph-based ranking model. Finally the summary is generated on the basis of the final sentence vector and the final weight of the sentence. The experimental results on real datasets show that the proposed algorithm has a better summarization quality compared with TF-IDF and TextRank.
Enhancing Summarization Skills Using Twin Texts: Instruction in Narrative and Expository Text Structures

ERIC Educational Resources Information Center

Furtado, Leena; Johnson, Lisa

2010-01-01

This action-research case study endeavors to enhance the summarization skills of first grade students who are reading at or above the third grade level during the first trimester of the academic school year. Students read "twin text" sources, meaning, fiction and nonfiction literary selections focusing on a common theme to help identify…
Text Summarization Model based on Maximum Coverage Problem and its Variant

NASA Astrophysics Data System (ADS)

Takamura, Hiroya; Okumura, Manabu

We discuss text summarization in terms of maximum coverage problem and its variant. To solve the optimization problem, we applied some decoding algorithms including the ones never used in this summarization formulation, such as a greedy algorithm with performance guarantee, a randomized algorithm, and a branch-and-bound method. We conduct comparative experiments. On the basis of the experimental results, we also augment the summarization model so that it takes into account the relevance to the document cluster. Through experiments, we showed that the augmented model is at least comparable to the best-performing method of DUC'04.
Presentation video retrieval using automatically recovered slide and spoken text

NASA Astrophysics Data System (ADS)

Cooper, Matthew

2013-03-01

Video is becoming a prevalent medium for e-learning. Lecture videos contain text information in both the presentation slides and lecturer's speech. This paper examines the relative utility of automatically recovered text from these sources for lecture video retrieval. To extract the visual information, we automatically detect slides within the videos and apply optical character recognition to obtain their text. Automatic speech recognition is used similarly to extract spoken text from the recorded audio. We perform controlled experiments with manually created ground truth for both the slide and spoken text from more than 60 hours of lecture video. We compare the automatically extracted slide and spoken text in terms of accuracy relative to ground truth, overlap with one another, and utility for video retrieval. Results reveal that automatically recovered slide text and spoken text contain different content with varying error profiles. Experiments demonstrate that automatically extracted slide text enables higher precision video retrieval than automatically recovered spoken text.
DiffNet: automatic differential functional summarization of dE-MAP networks.

PubMed

Seah, Boon-Siew; Bhowmick, Sourav S; Dewey, C Forbes

2014-10-01

The study of genetic interaction networks that respond to changing conditions is an emerging research problem. Recently, Bandyopadhyay et al. (2010) proposed a technique to construct a differential network (dE-MAPnetwork) from two static gene interaction networks in order to map the interaction differences between them under environment or condition change (e.g., DNA-damaging agent). This differential network is then manually analyzed to conclude that DNA repair is differentially effected by the condition change. Unfortunately, manual construction of differential functional summary from a dE-MAP network that summarizes all pertinent functional responses is time-consuming, laborious and error-prone, impeding large-scale analysis on it. To this end, we propose DiffNet, a novel data-driven algorithm that leverages Gene Ontology (go) annotations to automatically summarize a dE-MAP network to obtain a high-level map of functional responses due to condition change. We tested DiffNet on the dynamic interaction networks following MMS treatment and demonstrated the superiority of our approach in generating differential functional summaries compared to state-of-the-art graph clustering methods. We studied the effects of parameters in DiffNet in controlling the quality of the summary. We also performed a case study that illustrates its utility. Copyright © 2014 Elsevier Inc. All rights reserved.
Automatic summarization of changes in biological image sequences using algorithmic information theory.

PubMed

Cohen, Andrew R; Bjornsson, Christopher S; Temple, Sally; Banker, Gary; Roysam, Badrinath

2009-08-01

An algorithmic information-theoretic method is presented for object-level summarization of meaningful changes in image sequences. Object extraction and tracking data are represented as an attributed tracking graph (ATG). Time courses of object states are compared using an adaptive information distance measure, aided by a closed-form multidimensional quantization. The notion of meaningful summarization is captured by using the gap statistic to estimate the randomness deficiency from algorithmic statistics. The summary is the clustering result and feature subset that maximize the gap statistic. This approach was validated on four bioimaging applications: 1) It was applied to a synthetic data set containing two populations of cells differing in the rate of growth, for which it correctly identified the two populations and the single feature out of 23 that separated them; 2) it was applied to 59 movies of three types of neuroprosthetic devices being inserted in the brain tissue at three speeds each, for which it correctly identified insertion speed as the primary factor affecting tissue strain; 3) when applied to movies of cultured neural progenitor cells, it correctly distinguished neurons from progenitors without requiring the use of a fixative stain; and 4) when analyzing intracellular molecular transport in cultured neurons undergoing axon specification, it automatically confirmed the role of kinesins in axon specification.
A Comparison of Two Strategies for Teaching Third Graders to Summarize Information Text

ERIC Educational Resources Information Center

Dromsky, Ann Marie

2011-01-01

Summarizing text is one of the most effective comprehension strategies (National Institute of Child Health and Human Development, 2000) and an effective way to learn from information text (Dole, Duffy, Roehler, & Pearson, 1991; Pressley & Woloshyn, 1995). In addition, much research supports the explicit instruction of such strategies as…
Automatic Summarization as a Combinatorial Optimization Problem

NASA Astrophysics Data System (ADS)

Hirao, Tsutomu; Suzuki, Jun; Isozaki, Hideki

We derived the oracle summary with the highest ROUGE score that can be achieved by integrating sentence extraction with sentence compression from the reference abstract. The analysis results of the oracle revealed that summarization systems have to assign an appropriate compression rate for each sentence in the document. In accordance with this observation, this paper proposes a summarization method as a combinatorial optimization: selecting the set of sentences that maximize the sum of the sentence scores from the pool which consists of the sentences with various compression rates, subject to length constrains. The score of the sentence is defined by its compression rate, content words and positional information. The parameters for the compression rates and positional information are optimized by minimizing the loss between score of oracles and that of candidates. The results obtained from TSC-2 corpus showed that our method outperformed the previous systems with statistical significance.
Term-Weighting Approaches in Automatic Text Retrieval.

ERIC Educational Resources Information Center

Salton, Gerard; Buckley, Christopher

1988-01-01

Summarizes the experimental evidence that indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results superior to those obtained with more elaborate text representations, and provides baseline single term indexing models with which more elaborate content analysis procedures can be…
Text Summarization Model based on Facility Location Problem

NASA Astrophysics Data System (ADS)

Takamura, Hiroya; Okumura, Manabu

e propose a novel multi-document generic summarization model based on the budgeted median problem, which is a facility location problem. The summarization method based on our model is an extractive method, which selects sentences from the given document cluster and generates a summary. Each sentence in the document cluster will be assigned to one of the selected sentences, where the former sentece is supposed to be represented by the latter. Our method selects sentences to generate a summary that yields a good sentence assignment and hence covers the whole content of the document cluster. An advantage of this method is that it can incorporate asymmetric relations between sentences such as textual entailment. Through experiments, we showed that the proposed method yields good summaries on the dataset of DUC'04.
Automatic and user-centric approaches to video summary evaluation

NASA Astrophysics Data System (ADS)

Taskiran, Cuneyt M.; Bentley, Frank

2007-01-01

Automatic video summarization has become an active research topic in content-based video processing. However, not much emphasis has been placed on developing rigorous summary evaluation methods and developing summarization systems based on a clear understanding of user needs, obtained through user centered design. In this paper we address these two topics and propose an automatic video summary evaluation algorithm adapted from teh text summarization domain.
Automatic Coding of Short Text Responses via Clustering in Educational Assessment

ERIC Educational Resources Information Center

Zehner, Fabian; Sälzer, Christine; Goldhammer, Frank

2016-01-01

Automatic coding of short text responses opens new doors in assessment. We implemented and integrated baseline methods of natural language processing and statistical modelling by means of software components that are available under open licenses. The accuracy of automatic text coding is demonstrated by using data collected in the "Programme…
Text Structuration Leading to an Automatic Summary System: RAFI.

ERIC Educational Resources Information Center

Lehman, Abderrafih

1999-01-01

Describes the design and construction of Resume Automatique a Fragments Indicateurs (RAFI), a system of automatic text summary which sums up scientific and technical texts. The RAFI system transforms a long source text into several versions of more condensed texts, using discourse analysis, to make searching easier; it could be adapted to the…
Document Exploration and Automatic Knowledge Extraction for Unstructured Biomedical Text

NASA Astrophysics Data System (ADS)

Chu, S.; Totaro, G.; Doshi, N.; Thapar, S.; Mattmann, C. A.; Ramirez, P.

2015-12-01

We describe our work on building a web-browser based document reader with built-in exploration tool and automatic concept extraction of medical entities for biomedical text. Vast amounts of biomedical information are offered in unstructured text form through scientific publications and R&D reports. Utilizing text mining can help us to mine information and extract relevant knowledge from a plethora of biomedical text. The ability to employ such technologies to aid researchers in coping with information overload is greatly desirable. In recent years, there has been an increased interest in automatic biomedical concept extraction [1, 2] and intelligent PDF reader tools with the ability to search on content and find related articles [3]. Such reader tools are typically desktop applications and are limited to specific platforms. Our goal is to provide researchers with a simple tool to aid them in finding, reading, and exploring documents. Thus, we propose a web-based document explorer, which we called Shangri-Docs, which combines a document reader with automatic concept extraction and highlighting of relevant terms. Shangri-Docsalso provides the ability to evaluate a wide variety of document formats (e.g. PDF, Words, PPT, text, etc.) and to exploit the linked nature of the Web and personal content by performing searches on content from public sites (e.g. Wikipedia, PubMed) and private cataloged databases simultaneously. Shangri-Docsutilizes Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) [4] and Unified Medical Language System (UMLS) to automatically identify and highlight terms and concepts, such as specific symptoms, diseases, drugs, and anatomical sites, mentioned in the text. cTAKES was originally designed specially to extract information from clinical medical records. Our investigation leads us to extend the automatic knowledge extraction process of cTAKES for biomedical research domain by improving the ontology guided information extraction
An Automated Summarization Assessment Algorithm for Identifying Summarizing Strategies

PubMed Central

Abdi, Asad; Idris, Norisma; Alguliyev, Rasim M.; Aliguliyev, Ramiz M.

2016-01-01

Background Summarization is a process to select important information from a source text. Summarizing strategies are the core cognitive processes in summarization activity. Since summarization can be important as a tool to improve comprehension, it has attracted interest of teachers for teaching summary writing through direct instruction. To do this, they need to review and assess the students' summaries and these tasks are very time-consuming. Thus, a computer-assisted assessment can be used to help teachers to conduct this task more effectively. Design/Results This paper aims to propose an algorithm based on the combination of semantic relations between words and their syntactic composition to identify summarizing strategies employed by students in summary writing. An innovative aspect of our algorithm lies in its ability to identify summarizing strategies at the syntactic and semantic levels. The efficiency of the algorithm is measured in terms of Precision, Recall and F-measure. We then implemented the algorithm for the automated summarization assessment system that can be used to identify the summarizing strategies used by students in summary writing. PMID:26735139
Image/text automatic indexing and retrieval system using context vector approach

NASA Astrophysics Data System (ADS)

Qing, Kent P.; Caid, William R.; Ren, Clara Z.; McCabe, Patrick

1995-11-01

Thousands of documents and images are generated daily both on and off line on the information superhighway and other media. Storage technology has improved rapidly to handle these data but indexing this information is becoming very costly. HNC Software Inc. has developed a technology for automatic indexing and retrieval of free text and images. This technique is demonstrated and is based on the concept of `context vectors' which encode a succinct representation of the associated text and features of sub-image. In this paper, we will describe the Automated Librarian System which was designed for free text indexing and the Image Content Addressable Retrieval System (ICARS) which extends the technique from the text domain into the image domain. Both systems have the ability to automatically assign indices for a new document and/or image based on the content similarities in the database. ICARS also has the capability to retrieve images based on similarity of content using index terms, text description, and user-generated images as a query without performing segmentation or object recognition.

Processing Time and Cognitive Effort of Longhand Note Taking When Reading and Summarizing a Structured or Linear Text

ERIC Educational Resources Information Center

Olive, Thierry; Barbier, Marie-Laure

2017-01-01

We examined longhand note taking strategies when reading and summarizing a source text that was formatted with bullets or that was presented in a single paragraph. We analyzed cognitive effort when reading the source text, when jotting notes, when reading the notes, and when composing the summary, as well as time spent in these activities and the…
MeSH indexing based on automatically generated summaries.

PubMed

Jimeno-Yepes, Antonio J; Plaza, Laura; Mork, James G; Aronson, Alan R; Díaz, Alberto

2013-06-26

MEDLINE citations are manually indexed at the U.S. National Library of Medicine (NLM) using as reference the Medical Subject Headings (MeSH) controlled vocabulary. For this task, the human indexers read the full text of the article. Due to the growth of MEDLINE, the NLM Indexing Initiative explores indexing methodologies that can support the task of the indexers. Medical Text Indexer (MTI) is a tool developed by the NLM Indexing Initiative to provide MeSH indexing recommendations to indexers. Currently, the input to MTI is MEDLINE citations, title and abstract only. Previous work has shown that using full text as input to MTI increases recall, but decreases precision sharply. We propose using summaries generated automatically from the full text for the input to MTI to use in the task of suggesting MeSH headings to indexers. Summaries distill the most salient information from the full text, which might increase the coverage of automatic indexing approaches based on MEDLINE. We hypothesize that if the results were good enough, manual indexers could possibly use automatic summaries instead of the full texts, along with the recommendations of MTI, to speed up the process while maintaining high quality of indexing results. We have generated summaries of different lengths using two different summarizers, and evaluated the MTI indexing on the summaries using different algorithms: MTI, individual MTI components, and machine learning. The results are compared to those of full text articles and MEDLINE citations. Our results show that automatically generated summaries achieve similar recall but higher precision compared to full text articles. Compared to MEDLINE citations, summaries achieve higher recall but lower precision. Our results show that automatic summaries produce better indexing than full text articles. Summaries produce similar recall to full text but much better precision, which seems to indicate that automatic summaries can efficiently capture the most
MeSH indexing based on automatically generated summaries

PubMed Central

2013-01-01

Background MEDLINE citations are manually indexed at the U.S. National Library of Medicine (NLM) using as reference the Medical Subject Headings (MeSH) controlled vocabulary. For this task, the human indexers read the full text of the article. Due to the growth of MEDLINE, the NLM Indexing Initiative explores indexing methodologies that can support the task of the indexers. Medical Text Indexer (MTI) is a tool developed by the NLM Indexing Initiative to provide MeSH indexing recommendations to indexers. Currently, the input to MTI is MEDLINE citations, title and abstract only. Previous work has shown that using full text as input to MTI increases recall, but decreases precision sharply. We propose using summaries generated automatically from the full text for the input to MTI to use in the task of suggesting MeSH headings to indexers. Summaries distill the most salient information from the full text, which might increase the coverage of automatic indexing approaches based on MEDLINE. We hypothesize that if the results were good enough, manual indexers could possibly use automatic summaries instead of the full texts, along with the recommendations of MTI, to speed up the process while maintaining high quality of indexing results. Results We have generated summaries of different lengths using two different summarizers, and evaluated the MTI indexing on the summaries using different algorithms: MTI, individual MTI components, and machine learning. The results are compared to those of full text articles and MEDLINE citations. Our results show that automatically generated summaries achieve similar recall but higher precision compared to full text articles. Compared to MEDLINE citations, summaries achieve higher recall but lower precision. Conclusions Our results show that automatic summaries produce better indexing than full text articles. Summaries produce similar recall to full text but much better precision, which seems to indicate that automatic summaries can
Image-based mobile service: automatic text extraction and translation

NASA Astrophysics Data System (ADS)

Berclaz, Jérôme; Bhatti, Nina; Simske, Steven J.; Schettino, John C.

2010-01-01

We present a new mobile service for the translation of text from images taken by consumer-grade cell-phone cameras. Such capability represents a new paradigm for users where a simple image provides the basis for a service. The ubiquity and ease of use of cell-phone cameras enables acquisition and transmission of images anywhere and at any time a user wishes, delivering rapid and accurate translation over the phone's MMS and SMS facilities. Target text is extracted completely automatically, requiring no bounding box delineation or related user intervention. The service uses localization, binarization, text deskewing, and optical character recognition (OCR) in its analysis. Once the text is translated, an SMS message is sent to the user with the result. Further novelties include that no software installation is required on the handset, any service provider or camera phone can be used, and the entire service is implemented on the server side.
Summarizing an Ontology: A "Big Knowledge" Coverage Approach.

PubMed

Zheng, Ling; Perl, Yehoshua; Elhanan, Gai; Ochs, Christopher; Geller, James; Halper, Michael

2017-01-01

Maintenance and use of a large ontology, consisting of thousands of knowledge assertions, are hampered by its scope and complexity. It is important to provide tools for summarization of ontology content in order to facilitate user "big picture" comprehension. We present a parameterized methodology for the semi-automatic summarization of major topics in an ontology, based on a compact summary of the ontology, called an "aggregate partial-area taxonomy", followed by manual enhancement. An experiment is presented to test the effectiveness of such summarization measured by coverage of a given list of major topics of the corresponding application domain. SNOMED CT's Specimen hierarchy is the test-bed. A domain-expert provided a list of topics that serves as a gold standard. The enhanced results show that the aggregate taxonomy covers most of the domain's main topics.
Combining MEDLINE and publisher data to create parallel corpora for the automatic translation of biomedical text

PubMed Central

2013-01-01

Background Most of the institutional and research information in the biomedical domain is available in the form of English text. Even in countries where English is an official language, such as the United States, language can be a barrier for accessing biomedical information for non-native speakers. Recent progress in machine translation suggests that this technique could help make English texts accessible to speakers of other languages. However, the lack of adequate specialized corpora needed to train statistical models currently limits the quality of automatic translations in the biomedical domain. Results We show how a large-sized parallel corpus can automatically be obtained for the biomedical domain, using the MEDLINE database. The corpus generated in this work comprises article titles obtained from MEDLINE and abstract text automatically retrieved from journal websites, which substantially extends the corpora used in previous work. After assessing the quality of the corpus for two language pairs (English/French and English/Spanish) we use the Moses package to train a statistical machine translation model that outperforms previous models for automatic translation of biomedical text. Conclusions We have built translation data sets in the biomedical domain that can easily be extended to other languages available in MEDLINE. These sets can successfully be applied to train statistical machine translation models. While further progress should be made by incorporating out-of-domain corpora and domain-specific lexicons, we believe that this work improves the automatic translation of biomedical texts. PMID:23631733
Automatically Detecting Failures in Natural Language Processing Tools for Online Community Text.

PubMed

Park, Albert; Hartzler, Andrea L; Huh, Jina; McDonald, David W; Pratt, Wanda

2015-08-31

The prevalence and value of patient-generated health text are increasing, but processing such text remains problematic. Although existing biomedical natural language processing (NLP) tools are appealing, most were developed to process clinician- or researcher-generated text, such as clinical notes or journal articles. In addition to being constructed for different types of text, other challenges of using existing NLP include constantly changing technologies, source vocabularies, and characteristics of text. These continuously evolving challenges warrant the need for applying low-cost systematic assessment. However, the primarily accepted evaluation method in NLP, manual annotation, requires tremendous effort and time. The primary objective of this study is to explore an alternative approach-using low-cost, automated methods to detect failures (eg, incorrect boundaries, missed terms, mismapped concepts) when processing patient-generated text with existing biomedical NLP tools. We first characterize common failures that NLP tools can make in processing online community text. We then demonstrate the feasibility of our automated approach in detecting these common failures using one of the most popular biomedical NLP tools, MetaMap. Using 9657 posts from an online cancer community, we explored our automated failure detection approach in two steps: (1) to characterize the failure types, we first manually reviewed MetaMap's commonly occurring failures, grouped the inaccurate mappings into failure types, and then identified causes of the failures through iterative rounds of manual review using open coding, and (2) to automatically detect these failure types, we then explored combinations of existing NLP techniques and dictionary-based matching for each failure cause. Finally, we manually evaluated the automatically detected failures. From our manual review, we characterized three types of failure: (1) boundary failures, (2) missed term failures, and (3) word ambiguity
QCS : a system for querying, clustering, and summarizing documents.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dunlavy, Daniel M.

2006-08-01

Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test setsmore » from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence ''trimming'', and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design
QCS: a system for querying, clustering and summarizing documents.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dunlavy, Daniel M.; Schlesinger, Judith D.; O'Leary, Dianne P.

2006-10-01

Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test setsmore » from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming', and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and
Toward Routine Automatic Pathway Discovery from On-line Scientific Text Abstracts.

PubMed

Ng; Wong

1999-01-01

We are entering a new era of research where the latest scientific discoveries are often first reported online and are readily accessible by scientists worldwide. This rapid electronic dissemination of research breakthroughs has greatly accelerated the current pace in genomics and proteomics research. The race to the discovery of a gene or a drug has now become increasingly dependent on how quickly a scientist can scan through voluminous amount of information available online to construct the relevant picture (such as protein-protein interaction pathways) as it takes shape amongst the rapidly expanding pool of globally accessible biological data (e.g. GENBANK) and scientific literature (e.g. MEDLINE). We describe a prototype system for automatic pathway discovery from on-line text abstracts, combining technologies that (1) retrieve research abstracts from online sources, (2) extract relevant information from the free texts, and (3) present the extracted information graphically and intuitively. Our work demonstrates that this framework allows us to routinely scan online scientific literature for automatic discovery of knowledge, giving modern scientists the necessary competitive edge in managing the information explosion in this electronic age.
Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion.

PubMed

Agarwal, Shashank; Yu, Hong

2009-12-01

Biomedical texts can be typically represented by four rhetorical categories: Introduction, Methods, Results and Discussion (IMRAD). Classifying sentences into these categories can benefit many other text-mining tasks. Although many studies have applied different approaches for automatically classifying sentences in MEDLINE abstracts into the IMRAD categories, few have explored the classification of sentences that appear in full-text biomedical articles. We first evaluated whether sentences in full-text biomedical articles could be reliably annotated into the IMRAD format and then explored different approaches for automatically classifying these sentences into the IMRAD categories. Our results show an overall annotation agreement of 82.14% with a Kappa score of 0.756. The best classification system is a multinomial naïve Bayes classifier trained on manually annotated data that achieved 91.95% accuracy and an average F-score of 91.55%, which is significantly higher than baseline systems. A web version of this system is available online at-http://wood.ims.uwm.edu/full_text_classifier/.
Automatic detection of adverse events to predict drug label changes using text and data mining techniques.

PubMed

Gurulingappa, Harsha; Toldo, Luca; Rajput, Abdul Mateen; Kors, Jan A; Taweel, Adel; Tayrouz, Yorki

2013-11-01

The aim of this study was to assess the impact of automatically detected adverse event signals from text and open-source data on the prediction of drug label changes. Open-source adverse effect data were collected from FAERS, Yellow Cards and SIDER databases. A shallow linguistic relation extraction system (JSRE) was applied for extraction of adverse effects from MEDLINE case reports. Statistical approach was applied on the extracted datasets for signal detection and subsequent prediction of label changes issued for 29 drugs by the UK Regulatory Authority in 2009. 76% of drug label changes were automatically predicted. Out of these, 6% of drug label changes were detected only by text mining. JSRE enabled precise identification of four adverse drug events from MEDLINE that were undetectable otherwise. Changes in drug labels can be predicted automatically using data and text mining techniques. Text mining technology is mature and well-placed to support the pharmacovigilance tasks. Copyright © 2013 John Wiley & Sons, Ltd.
Camera network video summarization

NASA Astrophysics Data System (ADS)

Panda, Rameswar; Roy-Chowdhury, Amit K.

2017-05-01

Networks of vision sensors are deployed in many settings, ranging from security needs to disaster response to environmental monitoring. Many of these setups have hundreds of cameras and tens of thousands of hours of video. The difficulty of analyzing such a massive volume of video data is apparent whenever there is an incident that requires foraging through vast video archives to identify events of interest. As a result, video summarization, that automatically extract a brief yet informative summary of these videos, has attracted intense attention in the recent years. Much progress has been made in developing a variety of ways to summarize a single video in form of a key sequence or video skim. However, generating a summary from a set of videos captured in a multi-camera network still remains as a novel and largely under-addressed problem. In this paper, with the aim of summarizing videos in a camera network, we introduce a novel representative selection approach via joint embedding and capped l21-norm minimization. The objective function is two-fold. The first is to capture the structural relationships of data points in a camera network via an embedding, which helps in characterizing the outliers and also in extracting a diverse set of representatives. The second is to use a capped l21-norm to model the sparsity and to suppress the influence of data outliers in representative selection. We propose to jointly optimize both of the objectives, such that embedding can not only characterize the structure, but also indicate the requirements of sparse representative selection. Extensive experiments on standard multi-camera datasets well demonstrate the efficacy of our method over state-of-the-art methods.
Automatic Cataloguing and Searching for Retrospective Data by Use of OCR Text.

ERIC Educational Resources Information Center

Tseng, Yuen-Hsien

2001-01-01

Describes efforts in supporting information retrieval from OCR (optical character recognition) degraded text. Reports on approaches used in an automatic cataloging and searching contest for books in multiple languages, including a vector space retrieval model, an n-gram indexing method, and a weighting scheme; and discusses problems of Asian…
The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction.

PubMed

Najafi, Elham; Darooneh, Amir H

2015-01-01

A text can be considered as a one dimensional array of words. The locations of each word type in this array form a fractal pattern with certain fractal dimension. We observe that important words responsible for conveying the meaning of a text have dimensions considerably different from one, while the fractal dimensions of unimportant words are close to one. We introduce an index quantifying the importance of the words in a given text using their fractal dimensions and then ranking them according to their importance. This index measures the difference between the fractal pattern of a word in the original text relative to a shuffled version. Because the shuffled text is meaningless (i.e., words have no importance), the difference between the original and shuffled text can be used to ascertain degree of fractality. The degree of fractality may be used for automatic keyword detection. Words with the degree of fractality higher than a threshold value are assumed to be the retrieved keywords of the text. We measure the efficiency of our method for keywords extraction, making a comparison between our proposed method and two other well-known methods of automatic keyword extraction.
The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction

PubMed Central

Najafi, Elham; Darooneh, Amir H.

2015-01-01

A text can be considered as a one dimensional array of words. The locations of each word type in this array form a fractal pattern with certain fractal dimension. We observe that important words responsible for conveying the meaning of a text have dimensions considerably different from one, while the fractal dimensions of unimportant words are close to one. We introduce an index quantifying the importance of the words in a given text using their fractal dimensions and then ranking them according to their importance. This index measures the difference between the fractal pattern of a word in the original text relative to a shuffled version. Because the shuffled text is meaningless (i.e., words have no importance), the difference between the original and shuffled text can be used to ascertain degree of fractality. The degree of fractality may be used for automatic keyword detection. Words with the degree of fractality higher than a threshold value are assumed to be the retrieved keywords of the text. We measure the efficiency of our method for keywords extraction, making a comparison between our proposed method and two other well-known methods of automatic keyword extraction. PMID:26091207
Highlight summarization in golf videos using audio signals

NASA Astrophysics Data System (ADS)

Kim, Hyoung-Gook; Kim, Jin Young

2008-01-01

In this paper, we present an automatic summarization of highlights in golf videos based on audio information alone without video information. The proposed highlight summarization system is carried out based on semantic audio segmentation and detection on action units from audio signals. Studio speech, field speech, music, and applause are segmented by means of sound classification. Swing is detected by the methods of impulse onset detection. Sounds like swing and applause form a complete action unit, while studio speech and music parts are used to anchor the program structure. With the advantage of highly precise detection of applause, highlights are extracted effectively. Our experimental results obtain high classification precision on 18 golf games. It proves that the proposed system is very effective and computationally efficient to apply the technology to embedded consumer electronic devices.
A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method.

PubMed

Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol

2007-11-27

A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. In order to easily utilize biomedical information in the free text, document clustering and text summarization together are used as a solution for text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature. Our extensive experimental results show the approach shows 45% cluster quality improvement and 72% clustering reliability improvement, in terms of misclassification index, over Bisecting K-means as a leading document clustering approach. In addition, our approach provides concise but rich text summary in key concepts and sentences. Our coherent biomedical literature clustering and summarization approach that takes advantage of ontology-enriched graphical representations significantly improves the quality of document clusters and understandability of documents through summaries.
A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method

PubMed Central

Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol

2007-01-01

Background A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. In order to easily utilize biomedical information in the free text, document clustering and text summarization together are used as a solution for text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature. Results Our extensive experimental results show the approach shows 45% cluster quality improvement and 72% clustering reliability improvement, in terms of misclassification index, over Bisecting K-means as a leading document clustering approach. In addition, our approach provides concise but rich text summary in key concepts and sentences. Conclusion Our coherent biomedical literature clustering and summarization approach that takes advantage of ontology-enriched graphical representations significantly improves the quality of document clusters and understandability of documents through summaries. PMID:18047705
Automatic Extraction of Drug Adverse Effects from Product Characteristics (SPCs): A Text Versus Table Comparison.

PubMed

Lamy, Jean-Baptiste; Ugon, Adrien; Berthelot, Hélène

2016-01-01

Potential adverse effects (AEs) of drugs are described in their summary of product characteristics (SPCs), a textual document. Automatic extraction of AEs from SPCs is useful for detecting AEs and for building drug databases. However, this task is difficult because each AE is associated with a frequency that must be extracted and the presentation of AEs in SPCs is heterogeneous, consisting of plain text and tables in many different formats. We propose a taxonomy for the presentation of AEs in SPCs. We set up natural language processing (NLP) and table parsing methods for extracting AEs from texts and tables of any format, and evaluate them on 10 SPCs. Automatic extraction performed better on tables than on texts. Tables should be recommended for the presentation of the AEs section of the SPCs.

Portable automatic text classification for adverse drug reaction detection via multi-corpus training.

PubMed

Sarker, Abeed; Gonzalez, Graciela

2015-02-01

Automatic detection of adverse drug reaction (ADR) mentions from text has recently received significant interest in pharmacovigilance research. Current research focuses on various sources of text-based information, including social media-where enormous amounts of user posted data is available, which have the potential for use in pharmacovigilance if collected and filtered accurately. The aims of this study are: (i) to explore natural language processing (NLP) approaches for generating useful features from text, and utilizing them in optimized machine learning algorithms for automatic classification of ADR assertive text segments; (ii) to present two data sets that we prepared for the task of ADR detection from user posted internet data; and (iii) to investigate if combining training data from distinct corpora can improve automatic classification accuracies. One of our three data sets contains annotated sentences from clinical reports, and the two other data sets, built in-house, consist of annotated posts from social media. Our text classification approach relies on generating a large set of features, representing semantic properties (e.g., sentiment, polarity, and topic), from short text nuggets. Importantly, using our expanded feature sets, we combine training data from different corpora in attempts to boost classification accuracies. Our feature-rich classification approach performs significantly better than previously published approaches with ADR class F-scores of 0.812 (previously reported best: 0.770), 0.538 and 0.678 for the three data sets. Combining training data from multiple compatible corpora further improves the ADR F-scores for the in-house data sets to 0.597 (improvement of 5.9 units) and 0.704 (improvement of 2.6 units) respectively. Our research results indicate that using advanced NLP techniques for generating information rich features from text can significantly improve classification accuracies over existing benchmarks. Our experiments
Portable Automatic Text Classification for Adverse Drug Reaction Detection via Multi-corpus Training

PubMed Central

Gonzalez, Graciela

2014-01-01

Objective Automatic detection of Adverse Drug Reaction (ADR) mentions from text has recently received significant interest in pharmacovigilance research. Current research focuses on various sources of text-based information, including social media — where enormous amounts of user posted data is available, which have the potential for use in pharmacovigilance if collected and filtered accurately. The aims of this study are: (i) to explore natural language processing approaches for generating useful features from text, and utilizing them in optimized machine learning algorithms for automatic classification of ADR assertive text segments; (ii) to present two data sets that we prepared for the task of ADR detection from user posted internet data; and (iii) to investigate if combining training data from distinct corpora can improve automatic classification accuracies. Methods One of our three data sets contains annotated sentences from clinical reports, and the two other data sets, built in-house, consist of annotated posts from social media. Our text classification approach relies on generating a large set of features, representing semantic properties (e.g., sentiment, polarity, and topic), from short text nuggets. Importantly, using our expanded feature sets, we combine training data from different corpora in attempts to boost classification accuracies. Results Our feature-rich classification approach performs significantly better than previously published approaches with ADR class F-scores of 0.812 (previously reported best: 0.770), 0.538 and 0.678 for the three data sets. Combining training data from multiple compatible corpora further improves the ADR F-scores for the in-house data sets to 0.597 (improvement of 5.9 units) and 0.704 (improvement of 2.6 units) respectively. Conclusions Our research results indicate that using advanced NLP techniques for generating information rich features from text can significantly improve classification accuracies over existing
MPEG content summarization based on compressed domain feature analysis

NASA Astrophysics Data System (ADS)

Sugano, Masaru; Nakajima, Yasuyuki; Yanagihara, Hiromasa

2003-11-01

This paper addresses automatic summarization of MPEG audiovisual content on compressed domain. By analyzing semantically important low-level and mid-level audiovisual features, our method universally summarizes the MPEG-1/-2 contents in the form of digest or highlight. The former is a shortened version of an original, while the latter is an aggregation of important or interesting events. In our proposal, first, the incoming MPEG stream is segmented into shots and the above features are derived from each shot. Then the features are adaptively evaluated in an integrated manner, and finally the qualified shots are aggregated into a summary. Since all the processes are performed completely on compressed domain, summarization is achieved at very low computational cost. The experimental results show that news highlights and sports highlights in TV baseball games can be successfully extracted according to simple shot transition models. As for digest extraction, subjective evaluation proves that meaningful shots are extracted from content without a priori knowledge, even if it contains multiple genres of programs. Our method also has the advantage of generating an MPEG-7 based description such as summary and audiovisual segments in the course of summarization.
Data summarization method for chronic disease tracking.

PubMed

Aleksić, Dejan; Rajković, Petar; Vučković, Dušan; Janković, Dragan; Milenković, Aleksandar

2017-05-01

Bearing in mind the rising prevalence of chronic medical conditions, the chronic disease management is one of the key features required by medical information systems used in primary healthcare. Our research group paid a particular attention to this specific area by offering a set of custom data collection forms and reports in order to improve medical professionals' daily routine. The main idea was to provide an overview of history for chronic diseases, which, as it seems, had not been properly supported in previous administrative workflows. After five years of active use of medical information systems in more than 25 primary healthcare institutions, we were able to identify several scenarios that were often end-user-action dependent and could result in the data related to chronic diagnoses being loosely connected. An additional benefit would be a more effective identification of potentially new patients suffering from chronic diseases. For this particular reason, we introduced an extension of the existing data structures and a summarizing method along with a specific tool that should help in connecting all the data related to a patient and a diagnosis. The summarization method was based on the principle of connecting all of the records pertaining to a specific diagnosis for the selected patient, and it was envisaged to work in both automatic and on-demand mode. The expected results were a more effective identification of new potential patients and a completion of the existing histories of diseases associated with chronic diagnoses. The current system usage analysis shows that a small number of doctors used functionalities specially designed for chronic diseases affecting less than 6% of the total population (around 11,500 out of more than 200,000 patients). In initial tests, the on-demand data summarization mode was applied in general practice and 89 out of 155 users identified more than 3000 new patients with a chronic disease over a three-month test period
An automatic system to detect and extract texts in medical images for de-identification

NASA Astrophysics Data System (ADS)

Zhu, Yingxuan; Singh, P. D.; Siddiqui, Khan; Gillam, Michael

2010-03-01

Recently, there is an increasing need to share medical images for research purpose. In order to respect and preserve patient privacy, most of the medical images are de-identified with protected health information (PHI) before research sharing. Since manual de-identification is time-consuming and tedious, so an automatic de-identification system is necessary and helpful for the doctors to remove text from medical images. A lot of papers have been written about algorithms of text detection and extraction, however, little has been applied to de-identification of medical images. Since the de-identification system is designed for end-users, it should be effective, accurate and fast. This paper proposes an automatic system to detect and extract text from medical images for de-identification purposes, while keeping the anatomic structures intact. First, considering the text have a remarkable contrast with the background, a region variance based algorithm is used to detect the text regions. In post processing, geometric constraints are applied to the detected text regions to eliminate over-segmentation, e.g., lines and anatomic structures. After that, a region based level set method is used to extract text from the detected text regions. A GUI for the prototype application of the text detection and extraction system is implemented, which shows that our method can detect most of the text in the images. Experimental results validate that our method can detect and extract text in medical images with a 99% recall rate. Future research of this system includes algorithm improvement, performance evaluation, and computation optimization.
Assessing the impact of graphical quality on automatic text recognition in digital maps

NASA Astrophysics Data System (ADS)

Chiang, Yao-Yi; Leyk, Stefan; Honarvar Nazari, Narges; Moghaddam, Sima; Tan, Tian Xiang

2016-08-01

Converting geographic features (e.g., place names) in map images into a vector format is the first step for incorporating cartographic information into a geographic information system (GIS). With the advancement in computational power and algorithm design, map processing systems have been considerably improved over the last decade. However, the fundamental map processing techniques such as color image segmentation, (map) layer separation, and object recognition are sensitive to minor variations in graphical properties of the input image (e.g., scanning resolution). As a result, most map processing results would not meet user expectations if the user does not "properly" scan the map of interest, pre-process the map image (e.g., using compression or not), and train the processing system, accordingly. These issues could slow down the further advancement of map processing techniques as such unsuccessful attempts create a discouraged user community, and less sophisticated tools would be perceived as more viable solutions. Thus, it is important to understand what kinds of maps are suitable for automatic map processing and what types of results and process-related errors can be expected. In this paper, we shed light on these questions by using a typical map processing task, text recognition, to discuss a number of map instances that vary in suitability for automatic processing. We also present an extensive experiment on a diverse set of scanned historical maps to provide measures of baseline performance of a standard text recognition tool under varying map conditions (graphical quality) and text representations (that can vary even within the same map sheet). Our experimental results help the user understand what to expect when a fully or semi-automatic map processing system is used to process a scanned map with certain (varying) graphical properties and complexities in map content.
DeTEXT: A Database for Evaluating Text Extraction from Biomedical Literature Figures

PubMed Central

Yin, Xu-Cheng; Yang, Chun; Pei, Wei-Yi; Man, Haixia; Zhang, Jun; Learned-Miller, Erik; Yu, Hong

2015-01-01

Hundreds of millions of figures are available in biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information. A high-quality ground truth standard can greatly facilitate the development of an automated system. This article describes DeTEXT: A database for evaluating text extraction from biomedical literature figures. It is the first publicly available, human-annotated, high quality, and large-scale figure-text dataset with 288 full-text articles, 500 biomedical figures, and 9308 text regions. This article describes how figures were selected from open-access full-text biomedical articles and how annotation guidelines and annotation tools were developed. We also discuss the inter-annotator agreement and the reliability of the annotations. We summarize the statistics of the DeTEXT data and make available evaluation protocols for DeTEXT. Finally we lay out challenges we observed in the automated detection and recognition of figure text and discuss research directions in this area. DeTEXT is publicly available for downloading at http://prir.ustb.edu.cn/DeTEXT/. PMID:25951377
Semi-automatic image personalization tool for variable text insertion and replacement

NASA Astrophysics Data System (ADS)

Ding, Hengzhou; Bala, Raja; Fan, Zhigang; Eschbach, Reiner; Bouman, Charles A.; Allebach, Jan P.

2010-02-01

Image personalization is a widely used technique in personalized marketing,1 in which a vendor attempts to promote new products or retain customers by sending marketing collateral that is tailored to the customers' demographics, needs, and interests. With current solutions of which we are aware such as XMPie,2 DirectSmile,3 and AlphaPicture,4 in order to produce this tailored marketing collateral, image templates need to be created manually by graphic designers, involving complex grid manipulation and detailed geometric adjustments. As a matter of fact, the image template design is highly manual, skill-demanding and costly, and essentially the bottleneck for image personalization. We present a semi-automatic image personalization tool for designing image templates. Two scenarios are considered: text insertion and text replacement, with the text replacement option not offered in current solutions. The graphical user interface (GUI) of the tool is described in detail. Unlike current solutions, the tool renders the text in 3-D, which allows easy adjustment of the text. In particular, the tool has been implemented in Java, which introduces flexible deployment and eliminates the need for any special software or know-how on the part of the end user.
Semi automatic indexing of PostScript files using Medical Text Indexer in medical education.

PubMed

Mollah, Shamim Ara; Cimino, Christopher

2007-10-11

At Albert Einstein College of Medicine a large part of online lecture materials contain PostScript files. As the collection grows it becomes essential to create a digital library to have easy access to relevant sections of the lecture material that is full-text indexed; to create this index it is necessary to extract all the text from the document files that constitute the originals of the lectures. In this study we present a semi automatic indexing method using robust technique for extracting text from PostScript files and National Library of Medicine's Medical Text Indexer (MTI) program for indexing the text. This model can be applied to other medical schools for indexing purposes.
Using a MaxEnt Classifier for the Automatic Content Scoring of Free-Text Responses

NASA Astrophysics Data System (ADS)

Sukkarieh, Jana Z.

2011-03-01

Criticisms against multiple-choice item assessments in the USA have prompted researchers and organizations to move towards constructed-response (free-text) items. Constructed-response (CR) items pose many challenges to the education community—one of which is that they are expensive to score by humans. At the same time, there has been widespread movement towards computer-based assessment and hence, assessment organizations are competing to develop automatic content scoring engines for such items types—which we view as a textual entailment task. This paper describes how MaxEnt Modeling is used to help solve the task. MaxEnt has been used in many natural language tasks but this is the first application of the MaxEnt approach to textual entailment and automatic content scoring.
Usability evaluation of an experimental text summarization system and three search engines: implications for the reengineering of health care interfaces.

PubMed

Kushniruk, Andre W; Kan, Min-Yem; McKeown, Kathleen; Klavans, Judith; Jordan, Desmond; LaFlamme, Mark; Patel, Vimia L

2002-01-01

This paper describes the comparative evaluation of an experimental automated text summarization system, Centrifuser and three conventional search engines - Google, Yahoo and About.com. Centrifuser provides information to patients and families relevant to their questions about specific health conditions. It then produces a multidocument summary of articles retrieved by a standard search engine, tailored to the user's question. Subjects, consisting of friends or family of hospitalized patients, were asked to "think aloud" as they interacted with the four systems. The evaluation involved audio- and video recording of subject interactions with the interfaces in situ at a hospital. Results of the evaluation show that subjects found Centrifuser's summarization capability useful and easy to understand. In comparing Centrifuser to the three search engines, subjects' ratings varied; however, specific interface features were deemed useful across interfaces. We conclude with a discussion of the implications for engineering Web-based retrieval systems.
Challenges for automatically extracting molecular interactions from full-text articles.

PubMed

McIntosh, Tara; Curran, James R

2009-09-24

The increasing availability of full-text biomedical articles will allow more biomedical knowledge to be extracted automatically with greater reliability. However, most Information Retrieval (IR) and Extraction (IE) tools currently process only abstracts. The lack of corpora has limited the development of tools that are capable of exploiting the knowledge in full-text articles. As a result, there has been little investigation into the advantages of full-text document structure, and the challenges developers will face in processing full-text articles. We manually annotated passages from full-text articles that describe interactions summarised in a Molecular Interaction Map (MIM). Our corpus tracks the process of identifying facts to form the MIM summaries and captures any factual dependencies that must be resolved to extract the fact completely. For example, a fact in the results section may require a synonym defined in the introduction. The passages are also annotated with negated and coreference expressions that must be resolved.We describe the guidelines for identifying relevant passages and possible dependencies. The corpus includes 2162 sentences from 78 full-text articles. Our corpus analysis demonstrates the necessity of full-text processing; identifies the article sections where interactions are most commonly stated; and quantifies the proportion of interaction statements requiring coherent dependencies. Further, it allows us to report on the relative importance of identifying synonyms and resolving negated expressions. We also experiment with an oracle sentence retrieval system using the corpus as a gold-standard evaluation set. We introduce the MIM corpus, a unique resource that maps interaction facts in a MIM to annotated passages within full-text articles. It is an invaluable case study providing guidance to developers of biomedical IR and IE systems, and can be used as a gold-standard evaluation set for full-text IR tasks.
An unsupervised method for summarizing egocentric sport videos

NASA Astrophysics Data System (ADS)

Habibi Aghdam, Hamed; Jahani Heravi, Elnaz; Puig, Domenec

2015-12-01

People are getting more interested to record their sport activities using head-worn or hand-held cameras. This type of videos which is called egocentric sport videos has different motion and appearance patterns compared with life-logging videos. While a life-logging video can be defined in terms of well-defined human-object interactions, notwithstanding, it is not trivial to describe egocentric sport videos using well-defined activities. For this reason, summarizing egocentric sport videos based on human-object interaction might fail to produce meaningful results. In this paper, we propose an unsupervised method for summarizing egocentric videos by identifying the key-frames of the video. Our method utilizes both appearance and motion information and it automatically finds the number of the key-frames. Our blind user study on the new dataset collected from YouTube shows that in 93:5% cases, the users choose the proposed method as their first video summary choice. In addition, our method is within the top 2 choices of the users in 99% of studies.
Hierarchical video summarization based on context clustering

NASA Astrophysics Data System (ADS)

Tseng, Belle L.; Smith, John R.

2003-11-01

A personalized video summary is dynamically generated in our video personalization and summarization system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order to maintain, select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. In this paper, the metadata includes visual semantic annotations and automatic speech transcriptions. Our personalization and summarization engine in the middleware selects the optimal set of desired video segments by matching shot annotations and sentence transcripts with user preferences. Besides finding the desired contents, the objective is to present a coherent summary. There are diverse methods for creating summaries, and we focus on the challenges of generating a hierarchical video summary based on context information. In our summarization algorithm, three inputs are used to generate the hierarchical video summary output. These inputs are (1) MPEG-7 metadata descriptions of the contents in the server, (2) user preference and usage environment declarations from the user client, and (3) context information including MPEG-7 controlled term list and classification scheme. In a video sequence, descriptions and relevance scores are assigned to each shot. Based on these shot descriptions, context clustering is performed to collect consecutively similar shots to correspond to hierarchical scene representations. The context clustering is based on the available context information, and may be derived from domain knowledge or rules engines. Finally, the selection of structured video segments to generate the hierarchical summary efficiently balances between scene representation and shot selection.
Automatic Classification of Medical Text: The Influence of Publication Form1

PubMed Central

Cole, William G.; Michael, Patricia A.; Stewart, James G.; Blois, Marsden S.

1988-01-01

Previous research has shown that within the domain of medical journal abstracts the statistical distribution of words is neither random nor uniform, but is highly characteristic. Many words are used mainly or solely by one medical specialty or when writing about one particular level of description. Due to this regularity of usage, automatic classification within journal abstracts has proved quite successful. The present research asks two further questions. It investigates whether this statistical regularity and automatic classification success can also be achieved in medical textbook chapters. It then goes on to see whether the statistical distribution found in textbooks is sufficiently similar to that found in abstracts to permit accurate classification of abstracts based solely on previous knowledge of textbooks. 14 textbook chapters and 45 MEDLINE abstracts were submitted to an automatic classification program that had been trained only on chapters drawn from a standard textbook series. Statistical analysis of the properties of abstracts vs. chapters revealed important differences in word use. Automatic classification performance was good for chapters, but poor for abstracts.
Automatic vs. manual curation of a multi-source chemical dictionary: the impact on text mining.

PubMed

Hettne, Kristina M; Williams, Antony J; van Mulligen, Erik M; Kleinjans, Jos; Tkachenko, Valery; Kors, Jan A

2010-03-23

Previously, we developed a combined dictionary dubbed Chemlist for the identification of small molecules and drugs in text based on a number of publicly available databases and tested it on an annotated corpus. To achieve an acceptable recall and precision we used a number of automatic and semi-automatic processing steps together with disambiguation rules. However, it remained to be investigated which impact an extensive manual curation of a multi-source chemical dictionary would have on chemical term identification in text. ChemSpider is a chemical database that has undergone extensive manual curation aimed at establishing valid chemical name-to-structure relationships. We acquired the component of ChemSpider containing only manually curated names and synonyms. Rule-based term filtering, semi-automatic manual curation, and disambiguation rules were applied. We tested the dictionary from ChemSpider on an annotated corpus and compared the results with those for the Chemlist dictionary. The ChemSpider dictionary of ca. 80 k names was only a 1/3 to a 1/4 the size of Chemlist at around 300 k. The ChemSpider dictionary had a precision of 0.43 and a recall of 0.19 before the application of filtering and disambiguation and a precision of 0.87 and a recall of 0.19 after filtering and disambiguation. The Chemlist dictionary had a precision of 0.20 and a recall of 0.47 before the application of filtering and disambiguation and a precision of 0.67 and a recall of 0.40 after filtering and disambiguation. We conclude the following: (1) The ChemSpider dictionary achieved the best precision but the Chemlist dictionary had a higher recall and the best F-score; (2) Rule-based filtering and disambiguation is necessary to achieve a high precision for both the automatically generated and the manually curated dictionary. ChemSpider is available as a web service at http://www.chemspider.com/ and the Chemlist dictionary is freely available as an XML file in Simple Knowledge Organization
Automatic extraction of relations between medical concepts in clinical texts

PubMed Central

Harabagiu, Sanda; Roberts, Kirk

2011-01-01

Objective A supervised machine learning approach to discover relations between medical problems, treatments, and tests mentioned in electronic medical records. Materials and methods A single support vector machine classifier was used to identify relations between concepts and to assign their semantic type. Several resources such as Wikipedia, WordNet, General Inquirer, and a relation similarity metric inform the classifier. Results The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task. When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. F1 is defined as 2*Precision*Recall/(Precision+Recall). Alternatively, when concepts and assertions were discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7. Discussion Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation discovery was treated independently. Joint classification of relations may further improve the quality of results. Also, joint learning of the discovery of concepts, assertions, and relations may also improve the results of automatic relation extraction. Conclusion Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. In addition, features based on similarity contribute to a decrease of 1.1% when they are not available. PMID:21846787
An automated knowledge-based textual summarization system for longitudinal, multivariate clinical data.

PubMed

Goldstein, Ayelet; Shahar, Yuval

2016-06-01

Design and implement an intelligent free-text summarization system: The system's input includes large numbers of longitudinal, multivariate, numeric and symbolic clinical raw data, collected over varying periods of time, and in different complex contexts, and a suitable medical knowledge base. The system then automatically generates a textual summary of the data. We aim to prove the feasibility of implementing such a system, and to demonstrate its potential benefits for clinicians and for enhancement of quality of care. We have designed a new, domain-independent, knowledge-based system, the CliniText system, for automated summarization in free text of longitudinal medical records of any duration, in any context. The system is composed of six components: (1) A temporal abstraction module generates all possible abstractions from the patient's raw data using a temporal-abstraction knowledge base; (2) The abductive reasoning module infers abstractions or events from the data, which were not explicitly included in the database; (3) The pruning module filters out raw or abstract data based on predefined heuristics; (4) The document structuring module organizes the remaining raw or abstract data, according to the desired format; (5) The microplanning module, groups the raw or abstract data and creates referring expressions; (6) The surface realization module, generates the text, and applies the grammar rules of the chosen language. We have performed an initial technical evaluation of the system in the cardiac intensive-care and diabetes domains. We also summarize the results of a more detailed evaluation study that we have performed in the intensive-care domain that assessed the completeness, correctness, and overall quality of the system's generated text, and its potential benefits to clinical decision making. We assessed these measures for 31 letters originally composed by clinicians, and for the same letters when generated by the CliniText system. We have successfully
FigSum: automatically generating structured text summaries for figures in biomedical literature.

PubMed

Agarwal, Shashank; Yu, Hong

2009-11-14

Figures are frequently used in biomedical articles to support research findings; however, they are often difficult to comprehend based on their legends alone and information from the full-text articles is required to fully understand them. Previously, we found that the information associated with a single figure is distributed throughout the full-text article the figure appears in. Here, we develop and evaluate a figure summarization system - FigSum, which aggregates this scattered information to improve figure comprehension. For each figure in an article, FigSum generates a structured text summary comprising one sentence from each of the four rhetorical categories - Introduction, Methods, Results and Discussion (IMRaD). The IMRaD category of sentences is predicted by an automated machine learning classifier. Our evaluation shows that FigSum captures 53% of the sentences in the gold standard summaries annotated by biomedical scientists and achieves an average ROUGE-1 score of 0.70, which is higher than a baseline system.
FigSum: Automatically Generating Structured Text Summaries for Figures in Biomedical Literature

PubMed Central

Agarwal, Shashank; Yu, Hong

2009-01-01

Figures are frequently used in biomedical articles to support research findings; however, they are often difficult to comprehend based on their legends alone and information from the full-text articles is required to fully understand them. Previously, we found that the information associated with a single figure is distributed throughout the full-text article the figure appears in. Here, we develop and evaluate a figure summarization system – FigSum, which aggregates this scattered information to improve figure comprehension. For each figure in an article, FigSum generates a structured text summary comprising one sentence from each of the four rhetorical categories – Introduction, Methods, Results and Discussion (IMRaD). The IMRaD category of sentences is predicted by an automated machine learning classifier. Our evaluation shows that FigSum captures 53% of the sentences in the gold standard summaries annotated by biomedical scientists and achieves an average ROUGE-1 score of 0.70, which is higher than a baseline system. PMID:20351812

Automatic Text Decomposition and Structuring.

ERIC Educational Resources Information Center

Salton, Gerard; And Others

1996-01-01

Text similarity measurements are used to determine relationships between natural-language texts and text excerpts. The resulting linked hypertext maps can be broken down into text segments and themes used to identify different text types and structures, leading to improved information access and utilization. Examples are provided for text…
Automatic topic identification of health-related messages in online health community using text classification.

PubMed

Lu, Yingjie

2013-01-01

To facilitate patient involvement in online health community and obtain informative support and emotional support they need, a topic identification approach was proposed in this paper for identifying automatically topics of the health-related messages in online health community, thus assisting patients in reaching the most relevant messages for their queries efficiently. Feature-based classification framework was presented for automatic topic identification in our study. We first collected the messages related to some predefined topics in a online health community. Then we combined three different types of features, n-gram-based features, domain-specific features and sentiment features to build four feature sets for health-related text representation. Finally, three different text classification techniques, C4.5, Naïve Bayes and SVM were adopted to evaluate our topic classification model. By comparing different feature sets and different classification techniques, we found that n-gram-based features, domain-specific features and sentiment features were all considered to be effective in distinguishing different types of health-related topics. In addition, feature reduction technique based on information gain was also effective to improve the topic classification performance. In terms of classification techniques, SVM outperformed C4.5 and Naïve Bayes significantly. The experimental results demonstrated that the proposed approach could identify the topics of online health-related messages efficiently.
Automatic indexing of compound words based on mutual information for Korean text retrieval

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pan Koo Kim; Yoo Kun Cho

In this paper, we present an automatic indexing technique for compound words suitable to an aggulutinative language, specifically Korean. Firstly, we present the construction conditions to compose compound words as indexing terms. Also we present the decomposition rules applicable to consecutive nouns to extract all contents of text. Finally we propose a measure to estimate the usefulness of a term, mutual information, to calculate the degree of word association of compound words, based on the information theoretic notion. By applying this method, our system has raised the precision rate of compound words from 72% to 87%.
A Theory of Term Importance in Automatic Text Analysis.

ERIC Educational Resources Information Center

Salton, G.; And Others

Most existing automatic content analysis and indexing techniques are based on work frequency characteristics applied largely in an ad hoc manner. Contradictory requirements arise in this connection, in that terms exhibiting high occurrence frequencies in individual documents are often useful for high recall performance (to retrieve many relevant…
Automatic vs. manual curation of a multi-source chemical dictionary: the impact on text mining

PubMed Central

2010-01-01

Background Previously, we developed a combined dictionary dubbed Chemlist for the identification of small molecules and drugs in text based on a number of publicly available databases and tested it on an annotated corpus. To achieve an acceptable recall and precision we used a number of automatic and semi-automatic processing steps together with disambiguation rules. However, it remained to be investigated which impact an extensive manual curation of a multi-source chemical dictionary would have on chemical term identification in text. ChemSpider is a chemical database that has undergone extensive manual curation aimed at establishing valid chemical name-to-structure relationships. Results We acquired the component of ChemSpider containing only manually curated names and synonyms. Rule-based term filtering, semi-automatic manual curation, and disambiguation rules were applied. We tested the dictionary from ChemSpider on an annotated corpus and compared the results with those for the Chemlist dictionary. The ChemSpider dictionary of ca. 80 k names was only a 1/3 to a 1/4 the size of Chemlist at around 300 k. The ChemSpider dictionary had a precision of 0.43 and a recall of 0.19 before the application of filtering and disambiguation and a precision of 0.87 and a recall of 0.19 after filtering and disambiguation. The Chemlist dictionary had a precision of 0.20 and a recall of 0.47 before the application of filtering and disambiguation and a precision of 0.67 and a recall of 0.40 after filtering and disambiguation. Conclusions We conclude the following: (1) The ChemSpider dictionary achieved the best precision but the Chemlist dictionary had a higher recall and the best F-score; (2) Rule-based filtering and disambiguation is necessary to achieve a high precision for both the automatically generated and the manually curated dictionary. ChemSpider is available as a web service at http://www.chemspider.com/ and the Chemlist dictionary is freely available as an XML file in
A State-Of-The-Art Survey on Automatic Indexing.

ERIC Educational Resources Information Center

Liebesny, Felix

This survey covers the literature relating to automatic indexing techniques, services, and applications published during 1969-1973. Works are summarized and described in the areas of: (1) general papers on automatic indexing; (2) KWIC indexes; (3) KWIC variants listed alphabetically by acronym with descriptions; (4) other KWIC variants arranged by…
Cat swarm optimization based evolutionary framework for multi document summarization

NASA Astrophysics Data System (ADS)

Rautray, Rasmita; Balabantaray, Rakesh Chandra

2017-07-01

Today, World Wide Web has brought us enormous quantity of on-line information. As a result, extracting relevant information from massive data has become a challenging issue. In recent past text summarization is recognized as one of the solution to extract useful information from vast amount documents. Based on number of documents considered for summarization, it is categorized as single document or multi document summarization. Rather than single document, multi document summarization is more challenging for the researchers to find accurate summary from multiple documents. Hence in this study, a novel Cat Swarm Optimization (CSO) based multi document summarizer is proposed to address the problem of multi document summarization. The proposed CSO based model is also compared with two other nature inspired based summarizer such as Harmony Search (HS) based summarizer and Particle Swarm Optimization (PSO) based summarizer. With respect to the benchmark Document Understanding Conference (DUC) datasets, the performance of all algorithms are compared in terms of different evaluation metrics such as ROUGE score, F score, sensitivity, positive predicate value, summary accuracy, inter sentence similarity and readability metric to validate non-redundancy, cohesiveness and readability of the summary respectively. The experimental analysis clearly reveals that the proposed approach outperforms the other summarizers included in the study.
Is automatic speech-to-text transcription ready for use in psychological experiments?

PubMed

Ziman, Kirsten; Heusser, Andrew C; Fitzpatrick, Paxton C; Field, Campbell E; Manning, Jeremy R

2018-04-23

Verbal responses are a convenient and naturalistic way for participants to provide data in psychological experiments (Salzinger, The Journal of General Psychology, 61(1),65-94:1959). However, audio recordings of verbal responses typically require additional processing, such as transcribing the recordings into text, as compared with other behavioral response modalities (e.g., typed responses, button presses, etc.). Further, the transcription process is often tedious and time-intensive, requiring human listeners to manually examine each moment of recorded speech. Here we evaluate the performance of a state-of-the-art speech recognition algorithm (Halpern et al., 2016) in transcribing audio data into text during a list-learning experiment. We compare transcripts made by human annotators to the computer-generated transcripts. Both sets of transcripts matched to a high degree and exhibited similar statistical properties, in terms of the participants' recall performance and recall dynamics that the transcripts captured. This proof-of-concept study suggests that speech-to-text engines could provide a cheap, reliable, and rapid means of automatically transcribing speech data in psychological experiments. Further, our findings open the door for verbal response experiments that scale to thousands of participants (e.g., administered online), as well as a new generation of experiments that decode speech on the fly and adapt experimental parameters based on participants' prior responses.
Automatic extraction of property norm-like data from large text corpora.

PubMed

Kelly, Colin; Devereux, Barry; Korhonen, Anna

2014-01-01

Traditional methods for deriving property-based representations of concepts from text have focused on either extracting only a subset of possible relation types, such as hyponymy/hypernymy (e.g., car is-a vehicle) or meronymy/metonymy (e.g., car has wheels), or unspecified relations (e.g., car--petrol). We propose a system for the challenging task of automatic, large-scale acquisition of unconstrained, human-like property norms from large text corpora, and discuss the theoretical implications of such a system. We employ syntactic, semantic, and encyclopedic information to guide our extraction, yielding concept-relation-feature triples (e.g., car be fast, car require petrol, car cause pollution), which approximate property-based conceptual representations. Our novel method extracts candidate triples from parsed corpora (Wikipedia and the British National Corpus) using syntactically and grammatically motivated rules, then reweights triples with a linear combination of their frequency and four statistical metrics. We assess our system output in three ways: lexical comparison with norms derived from human-generated property norm data, direct evaluation by four human judges, and a semantic distance comparison with both WordNet similarity data and human-judged concept similarity ratings. Our system offers a viable and performant method of plausible triple extraction: Our lexical comparison shows comparable performance to the current state-of-the-art, while subsequent evaluations exhibit the human-like character of our generated properties.
Automatic Text Analysis Based on Transition Phenomena of Word Occurrences

ERIC Educational Resources Information Center

Pao, Miranda Lee

1978-01-01

Describes a method of selecting index terms directly from a word frequency list, an idea originally suggested by Goffman. Results of the analysis of word frequencies of two articles seem to indicate that the automated selection of index terms from a frequency list holds some promise for automatic indexing. (Author/MBR)
Algorithm for Video Summarization of Bronchoscopy Procedures

PubMed Central

2011-01-01

Background The duration of bronchoscopy examinations varies considerably depending on the diagnostic and therapeutic procedures used. It can last more than 20 minutes if a complex diagnostic work-up is included. With wide access to videobronchoscopy, the whole procedure can be recorded as a video sequence. Common practice relies on an active attitude of the bronchoscopist who initiates the recording process and usually chooses to archive only selected views and sequences. However, it may be important to record the full bronchoscopy procedure as documentation when liability issues are at stake. Furthermore, an automatic recording of the whole procedure enables the bronchoscopist to focus solely on the performed procedures. Video recordings registered during bronchoscopies include a considerable number of frames of poor quality due to blurry or unfocused images. It seems that such frames are unavoidable due to the relatively tight endobronchial space, rapid movements of the respiratory tract due to breathing or coughing, and secretions which occur commonly in the bronchi, especially in patients suffering from pulmonary disorders. Methods The use of recorded bronchoscopy video sequences for diagnostic, reference and educational purposes could be considerably extended with efficient, flexible summarization algorithms. Thus, the authors developed a prototype system to create shortcuts (called summaries or abstracts) of bronchoscopy video recordings. Such a system, based on models described in previously published papers, employs image analysis methods to exclude frames or sequences of limited diagnostic or education value. Results The algorithm for the selection or exclusion of specific frames or shots from video sequences recorded during bronchoscopy procedures is based on several criteria, including automatic detection of "non-informative", frames showing the branching of the airways and frames including pathological lesions. Conclusions The paper focuses on the
Echocardiogram video summarization

NASA Astrophysics Data System (ADS)

Ebadollahi, Shahram; Chang, Shih-Fu; Wu, Henry D.; Takoma, Shin

2001-05-01

This work aims at developing innovative algorithms and tools for summarizing echocardiogram videos. Specifically, we summarize the digital echocardiogram videos by temporally segmenting them into the constituent views and representing each view by the most informative frame. For the segmentation we take advantage of the well-defined spatio- temporal structure of the echocardiogram videos. Two different criteria are used: presence/absence of color and the shape of the region of interest (ROI) in each frame of the video. The change in the ROI is due to different modes of echocardiograms present in one study. The representative frame is defined to be the frame corresponding to the end- diastole of the heart cycle. To locate the end-diastole we track the ECG of each frame to find the exact time the time- marker on the ECG crosses the peak of the end-diastole we track the ECG of each frame to find the exact time the time- marker on the ECG crosses the peak of the R-wave. The corresponding frame is chosen to be the key-frame. The entire echocardiogram video can be summarized into either a static summary, which is a storyboard type of summary and a dynamic summary, which is a concatenation of the selected segments of the echocardiogram video. To the best of our knowledge, this if the first automated system for summarizing the echocardiogram videos base don visual content.
Scalable gastroscopic video summarization via similar-inhibition dictionary selection.

PubMed

Wang, Shuai; Cong, Yang; Cao, Jun; Yang, Yunsheng; Tang, Yandong; Zhao, Huaici; Yu, Haibin

2016-01-01

This paper aims at developing an automated gastroscopic video summarization algorithm to assist clinicians to more effectively go through the abnormal contents of the video. To select the most representative frames from the original video sequence, we formulate the problem of gastroscopic video summarization as a dictionary selection issue. Different from the traditional dictionary selection methods, which take into account only the number and reconstruction ability of selected key frames, our model introduces the similar-inhibition constraint to reinforce the diversity of selected key frames. We calculate the attention cost by merging both gaze and content change into a prior cue to help select the frames with more high-level semantic information. Moreover, we adopt an image quality evaluation process to eliminate the interference of the poor quality images and a segmentation process to reduce the computational complexity. For experiments, we build a new gastroscopic video dataset captured from 30 volunteers with more than 400k images and compare our method with the state-of-the-arts using the content consistency, index consistency and content-index consistency with the ground truth. Compared with all competitors, our method obtains the best results in 23 of 30 videos evaluated based on content consistency, 24 of 30 videos evaluated based on index consistency and all videos evaluated based on content-index consistency. For gastroscopic video summarization, we propose an automated annotation method via similar-inhibition dictionary selection. Our model can achieve better performance compared with other state-of-the-art models and supplies more suitable key frames for diagnosis. The developed algorithm can be automatically adapted to various real applications, such as the training of young clinicians, computer-aided diagnosis or medical report generation. Copyright © 2015 Elsevier B.V. All rights reserved.
Application of nonlinear transformations to automatic flight control

NASA Technical Reports Server (NTRS)

Meyer, G.; Su, R.; Hunt, L. R.

1984-01-01

The theory of transformations of nonlinear systems to linear ones is applied to the design of an automatic flight controller for the UH-1H helicopter. The helicopter mathematical model is described and it is shown to satisfy the necessary and sufficient conditions for transformability. The mapping is constructed, taking the nonlinear model to canonical form. The performance of the automatic control system in a detailed simulation on the flight computer is summarized.
Hierarchical video summarization

NASA Astrophysics Data System (ADS)

Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.

1998-12-01

We address the problem of key-frame summarization of vide in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.
Effects of Training in Constructing Graphic Organizers on Disabled Readers' Summarization and Recognition of Expository Text Structure.

ERIC Educational Resources Information Center

Weisberg, Renee; Balajthy, Ernest

A study investigated the effects of training in the use of graphic organizers on the summarization strategies of disabled readers. Subjects, 21 disabled readers (with a mean age of 13 years, 7 months) from a reading clinic, received 5 hours of training in the use of graphic organizers to map expository passages. Instruction included training in…
On the Application of Syntactic Methodologies in Automatic Text Analysis.

ERIC Educational Resources Information Center

Salton, Gerard; And Others

1990-01-01

Summarizes various linguistic approaches proposed for document analysis in information retrieval environments. Topics discussed include syntactic analysis; use of machine-readable dictionary information; knowledge base construction; the PLNLP English Grammar (PEG) system; phrase normalization; and statistical and syntactic phrase evaluation used…
Stemming Malay Text and Its Application in Automatic Text Categorization

NASA Astrophysics Data System (ADS)

Yasukawa, Michiko; Lim, Hui Tian; Yokoo, Hidetoshi

In Malay language, there are no conjugations and declensions and affixes have important grammatical functions. In Malay, the same word may function as a noun, an adjective, an adverb, or, a verb, depending on its position in the sentence. Although extensively simple root words are used in informal conversations, it is essential to use the precise words in formal speech or written texts. In Malay, to make sentences clear, derivative words are used. Derivation is achieved mainly by the use of affixes. There are approximately a hundred possible derivative forms of a root word in written language of the educated Malay. Therefore, the composition of Malay words may be complicated. Although there are several types of stemming algorithms available for text processing in English and some other languages, they cannot be used to overcome the difficulties in Malay word stemming. Stemming is the process of reducing various words to their root forms in order to improve the effectiveness of text processing in information systems. It is essential to avoid both over-stemming and under-stemming errors. We have developed a new Malay stemmer (stemming algorithm) for removing inflectional and derivational affixes. Our stemmer uses a set of affix rules and two types of dictionaries: a root-word dictionary and a derivative-word dictionary. The use of set of rules is aimed at reducing the occurrence of under-stemming errors, while that of the dictionaries is believed to reduce the occurrence of over-stemming errors. We performed an experiment to evaluate the application of our stemmer in text mining software. For the experiment, text data used were actual web pages collected from the World Wide Web to demonstrate the effectiveness of our Malay stemming algorithm. The experimental results showed that our stemmer can effectively increase the precision of the extracted Boolean expressions for text categorization.
The Effects of Two Summarization Strategies Using Expository Text on the Reading Comprehension and Summary Writing of Fourth-and Fifth-Grade Students in an Urban, Title 1 School

ERIC Educational Resources Information Center

Braxton, Diane M.

2009-01-01

Using a quasi-experimental pretest/post test design, this study examined the effects of two summarization strategies on the reading comprehension and summary writing of fourth- and fifth- grade students in an urban, Title 1 school. The Strategies, "G"enerating "I"nteractions between "S"chemata and "T"ext (GIST) and Rule-based, were taught using…
Text mining and natural language processing approaches for automatic categorization of lay requests to web-based expert forums.

PubMed

Himmel, Wolfgang; Reincke, Ulrich; Michelmann, Hans Wilhelm

2009-07-22

Both healthy and sick people increasingly use electronic media to obtain medical information and advice. For example, Internet users may send requests to Web-based expert forums, or so-called "ask the doctor" services. To automatically classify lay requests to an Internet medical expert forum using a combination of different text-mining strategies. We first manually classified a sample of 988 requests directed to a involuntary childlessness forum on the German website "Rund ums Baby" ("Everything about Babies") into one or more of 38 categories belonging to two dimensions ("subject matter" and "expectations"). After creating start and synonym lists, we calculated the average Cramer's V statistic for the association of each word with each category. We also used principle component analysis and singular value decomposition as further text-mining strategies. With these measures we trained regression models and determined, on the basis of best regression models, for any request the probability of belonging to each of the 38 different categories, with a cutoff of 50%. Recall and precision of a test sample were calculated as a measure of quality for the automatic classification. According to the manual classification of 988 documents, 102 (10%) documents fell into the category "in vitro fertilization (IVF)," 81 (8%) into the category "ovulation," 79 (8%) into "cycle," and 57 (6%) into "semen analysis." These were the four most frequent categories in the subject matter dimension (consisting of 32 categories). The expectation dimension comprised six categories; we classified 533 documents (54%) as "general information" and 351 (36%) as a wish for "treatment recommendations." The generation of indicator variables based on the chi-square analysis and Cramer's V proved to be the best approach for automatic classification in about half of the categories. In combination with the two other approaches, 100% precision and 100% recall were realized in 18 (47%) out of the 38

Recent progress in automatically extracting information from the pharmacogenomic literature

PubMed Central

Garten, Yael; Coulet, Adrien; Altman, Russ B

2011-01-01

The biomedical literature holds our understanding of pharmacogenomics, but it is dispersed across many journals. In order to integrate our knowledge, connect important facts across publications and generate new hypotheses we must organize and encode the contents of the literature. By creating databases of structured pharmocogenomic knowledge, we can make the value of the literature much greater than the sum of the individual reports. We can, for example, generate candidate gene lists or interpret surprising hits in genome-wide association studies. Text mining automatically adds structure to the unstructured knowledge embedded in millions of publications, and recent years have seen a surge in work on biomedical text mining, some specific to pharmacogenomics literature. These methods enable extraction of specific types of information and can also provide answers to general, systemic queries. In this article, we describe the main tasks of text mining in the context of pharmacogenomics, summarize recent applications and anticipate the next phase of text mining applications. PMID:21047206
A conceptual study of automatic and semi-automatic quality assurance techniques for round image processing

NASA Technical Reports Server (NTRS)

1983-01-01

This report summarizes the results of a study conducted by Engineering and Economics Research (EER), Inc. under NASA Contract Number NAS5-27513. The study involved the development of preliminary concepts for automatic and semiautomatic quality assurance (QA) techniques for ground image processing. A distinction is made between quality assessment and the more comprehensive quality assurance which includes decision making and system feedback control in response to quality assessment.
Contextual Text Mining

ERIC Educational Resources Information Center

Mei, Qiaozhu

2009-01-01

With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…
Effects of Presentation Mode and Computer Familiarity on Summarization of Extended Texts

ERIC Educational Resources Information Center

Yu, Guoxing

2010-01-01

Comparability studies on computer- and paper-based reading tests have focused on short texts and selected-response items via almost exclusively statistical modeling of test performance. The psychological effects of presentation mode and computer familiarity on individual students are under-researched. In this study, 157 students read extended…
Concept Recognition in an Automatic Text-Processing System for the Life Sciences.

ERIC Educational Resources Information Center

Vleduts-Stokolov, Natasha

1987-01-01

Describes a system developed for the automatic recognition of biological concepts in titles of scientific articles; reports results of several pilot experiments which tested the system's performance; analyzes typical ambiguity problems encountered by the system; describes a disambiguation technique that was developed; and discusses future plans…
Automatic classification of diseases from free-text death certificates for real-time surveillance.

PubMed

Koopman, Bevan; Karimi, Sarvnaz; Nguyen, Anthony; McGuire, Rhydwyn; Muscatello, David; Kemp, Madonna; Truran, Donna; Zhang, Ming; Thackway, Sarah

2015-07-15

effective means for automatic and real-time surveillance of diabetes, influenza, pneumonia and HIV deaths. In addition, the methods are generally applicable to other diseases of interest and to other sources of medical free-text besides death certificates.
Text Mining and Natural Language Processing Approaches for Automatic Categorization of Lay Requests to Web-Based Expert Forums

PubMed Central

Reincke, Ulrich; Michelmann, Hans Wilhelm

2009-01-01

Background Both healthy and sick people increasingly use electronic media to obtain medical information and advice. For example, Internet users may send requests to Web-based expert forums, or so-called “ask the doctor” services. Objective To automatically classify lay requests to an Internet medical expert forum using a combination of different text-mining strategies. Methods We first manually classified a sample of 988 requests directed to a involuntary childlessness forum on the German website “Rund ums Baby” (“Everything about Babies”) into one or more of 38 categories belonging to two dimensions (“subject matter” and “expectations”). After creating start and synonym lists, we calculated the average Cramer’s V statistic for the association of each word with each category. We also used principle component analysis and singular value decomposition as further text-mining strategies. With these measures we trained regression models and determined, on the basis of best regression models, for any request the probability of belonging to each of the 38 different categories, with a cutoff of 50%. Recall and precision of a test sample were calculated as a measure of quality for the automatic classification. Results According to the manual classification of 988 documents, 102 (10%) documents fell into the category “in vitro fertilization (IVF),” 81 (8%) into the category “ovulation,” 79 (8%) into “cycle,” and 57 (6%) into “semen analysis.” These were the four most frequent categories in the subject matter dimension (consisting of 32 categories). The expectation dimension comprised six categories; we classified 533 documents (54%) as “general information” and 351 (36%) as a wish for “treatment recommendations.” The generation of indicator variables based on the chi-square analysis and Cramer’s V proved to be the best approach for automatic classification in about half of the categories. In combination with the two other
WOLF; automatic typing program

USGS Publications Warehouse

Evenden, G.I.

1982-01-01

A FORTRAN IV program for the Hewlett-Packard 1000 series computer provides for automatic typing operations and can, when employed with manufacturer's text editor, provide a system to greatly facilitate preparation of reports, letters and other text. The input text and imbedded control data can perform nearly all of the functions of a typist. A few of the features available are centering, titles, footnotes, indentation, page numbering (including Roman numerals), automatic paragraphing, and two forms of tab operations. This documentation contains both user and technical description of the program.
Formative evaluation of a patient-specific clinical knowledge summarization tool

PubMed Central

Del Fiol, Guilherme; Mostafa, Javed; Pu, Dongqiuye; Medlin, Richard; Slager, Stacey; Jonnalagadda, Siddhartha R.; Weir, Charlene R.

2015-01-01

Objective To iteratively design a prototype of a computerized clinical knowledge summarization (CKS) tool aimed at helping clinicians finding answers to their clinical questions; and to conduct a formative assessment of the usability, usefulness, efficiency, and impact of the CKS prototype on physicians’ perceived decision quality compared with standard search of UpToDate and PubMed. Materials and methods Mixed-methods observations of the interactions of 10 physicians with the CKS prototype vs. standard search in an effort to solve clinical problems posed as case vignettes. Results The CKS tool automatically summarizes patient-specific and actionable clinical recommendations from PubMed (high quality randomized controlled trials and systematic reviews) and UpToDate. Two thirds of the study participants completed 15 out of 17 usability tasks. The median time to task completion was less than 10 s for 12 of the 17 tasks. The difference in search time between the CKS and standard search was not significant (median = 4.9 vs. 4.5 min). Physician’s perceived decision quality was significantly higher with the CKS than with manual search (mean = 16.6 vs. 14.4; p = 0.036). Conclusions The CKS prototype was well-accepted by physicians both in terms of usability and usefulness. Physicians perceived better decision quality with the CKS prototype compared to standard search of PubMed and UpToDate within a similar search time. Due to the formative nature of this study and a small sample size, conclusions regarding efficiency and efficacy are exploratory. PMID:26612774
Formative evaluation of a patient-specific clinical knowledge summarization tool.

PubMed

Del Fiol, Guilherme; Mostafa, Javed; Pu, Dongqiuye; Medlin, Richard; Slager, Stacey; Jonnalagadda, Siddhartha R; Weir, Charlene R

2016-02-01

To iteratively design a prototype of a computerized clinical knowledge summarization (CKS) tool aimed at helping clinicians finding answers to their clinical questions; and to conduct a formative assessment of the usability, usefulness, efficiency, and impact of the CKS prototype on physicians' perceived decision quality compared with standard search of UpToDate and PubMed. Mixed-methods observations of the interactions of 10 physicians with the CKS prototype vs. standard search in an effort to solve clinical problems posed as case vignettes. The CKS tool automatically summarizes patient-specific and actionable clinical recommendations from PubMed (high quality randomized controlled trials and systematic reviews) and UpToDate. Two thirds of the study participants completed 15 out of 17 usability tasks. The median time to task completion was less than 10s for 12 of the 17 tasks. The difference in search time between the CKS and standard search was not significant (median=4.9 vs. 4.5m in). Physician's perceived decision quality was significantly higher with the CKS than with manual search (mean=16.6 vs. 14.4; p=0.036). The CKS prototype was well-accepted by physicians both in terms of usability and usefulness. Physicians perceived better decision quality with the CKS prototype compared to standard search of PubMed and UpToDate within a similar search time. Due to the formative nature of this study and a small sample size, conclusions regarding efficiency and efficacy are exploratory. Published by Elsevier Ireland Ltd.
Heterogeneity image patch index and its application to consumer video summarization.

PubMed

Dang, Chinh T; Radha, Hayder

2014-06-01

Automatic video summarization is indispensable for fast browsing and efficient management of large video libraries. In this paper, we introduce an image feature that we refer to as heterogeneity image patch (HIP) index. The proposed HIP index provides a new entropy-based measure of the heterogeneity of patches within any picture. By evaluating this index for every frame in a video sequence, we generate a HIP curve for that sequence. We exploit the HIP curve in solving two categories of video summarization applications: key frame extraction and dynamic video skimming. Under the key frame extraction frame-work, a set of candidate key frames is selected from abundant video frames based on the HIP curve. Then, a proposed patch-based image dissimilarity measure is used to create affinity matrix of these candidates. Finally, a set of key frames is extracted from the affinity matrix using a min–max based algorithm. Under video skimming, we propose a method to measure the distance between a video and its skimmed representation. The video skimming problem is then mapped into an optimization framework and solved by minimizing a HIP-based distance for a set of extracted excerpts. The HIP framework is pixel-based and does not require semantic information or complex camera motion estimation. Our simulation results are based on experiments performed on consumer videos and are compared with state-of-the-art methods. It is shown that the HIP approach outperforms other leading methods, while maintaining low complexity.
Summarization of an online medical encyclopedia.

PubMed

Fiszman, Marcelo; Rindflesch, Thomas C; Kilicoglu, Halil

2004-01-01

We explore a knowledge-rich (abstraction) approach to summarization and apply it to multiple documents from an online medical encyclopedia. A semantic processor functions as the source interpreter and produces a list of predications. A transformation stage then generalizes and condenses this list, ultimately generating a conceptual condensate for a given disorder topic. We provide a preliminary evaluation of the quality of the condensates produced for a sample of four disorders. The overall precision of the disorder conceptual condensates was 87%, and the compression ratio from the base list of predications to the final condensate was 98%. The conceptual condensate could be used as input to a text generator to produce a natural language summary for a given disorder topic.
Testing & Evaluation of Close-Range SAR for Monitoring & Automatically Detecting Pavement Conditions

DOT National Transportation Integrated Search

2012-01-01

This report summarizes activities in support of the DOT contract on Testing & Evaluating Close-Range SAR for Monitoring & Automatically Detecting Pavement Conditions & Improve Visual Inspection Procedures. The work of this project was performed by Dr...
Automatic Keyframe Summarization of User-Generated Video

DTIC Science & Technology

2014-06-01

using the framework presented in this paper. 12 Scenery Technology has been developed that classifies the genre of a video. Here, video genres are...types of videos that shares similarities in content and structure. Many genres of video footage exist. Some examples include news, sports, movies...cartoons, and commercials. Rasheed et al. [42] classify video genres (comedy, action, drama, and horror) with low-level video statistics, such as average
Relatedness-based Multi-Entity Summarization

PubMed Central

Gunaratna, Kalpa; Yazdavar, Amir Hossein; Thirunarayan, Krishnaprasad; Sheth, Amit; Cheng, Gong

2017-01-01

Representing world knowledge in a machine processable format is important as entities and their descriptions have fueled tremendous growth in knowledge-rich information processing platforms, services, and systems. Prominent applications of knowledge graphs include search engines (e.g., Google Search and Microsoft Bing), email clients (e.g., Gmail), and intelligent personal assistants (e.g., Google Now, Amazon Echo, and Apple’s Siri). In this paper, we present an approach that can summarize facts about a collection of entities by analyzing their relatedness in preference to summarizing each entity in isolation. Specifically, we generate informative entity summaries by selecting: (i) inter-entity facts that are similar and (ii) intra-entity facts that are important and diverse. We employ a constrained knapsack problem solving approach to efficiently compute entity summaries. We perform both qualitative and quantitative experiments and demonstrate that our approach yields promising results compared to two other stand-alone state-of-the-art entity summarization approaches. PMID:29051696
Review of automatic detection of pig behaviours by using image analysis

NASA Astrophysics Data System (ADS)

Han, Shuqing; Zhang, Jianhua; Zhu, Mengshuai; Wu, Jianzhai; Kong, Fantao

2017-06-01

Automatic detection of lying, moving, feeding, drinking, and aggressive behaviours of pigs by means of image analysis can save observation input by staff. It would help staff make early detection of diseases or injuries of pigs during breeding and improve management efficiency of swine industry. This study describes the progress of pig behaviour detection based on image analysis and advancement in image segmentation of pig body, segmentation of pig adhesion and extraction of pig behaviour characteristic parameters. Challenges for achieving automatic detection of pig behaviours were summarized.
Automatic extraction of angiogenesis bioprocess from text

PubMed Central

Wang, Xinglong; McKendrick, Iain; Barrett, Ian; Dix, Ian; French, Tim; Tsujii, Jun'ichi; Ananiadou, Sophia

2011-01-01

Motivation: Understanding key biological processes (bioprocesses) and their relationships with constituent biological entities and pharmaceutical agents is crucial for drug design and discovery. One way to harvest such information is searching the literature. However, bioprocesses are difficult to capture because they may occur in text in a variety of textual expressions. Moreover, a bioprocess is often composed of a series of bioevents, where a bioevent denotes changes to one or a group of cells involved in the bioprocess. Such bioevents are often used to refer to bioprocesses in text, which current techniques, relying solely on specialized lexicons, struggle to find. Results: This article presents a range of methods for finding bioprocess terms and events. To facilitate the study, we built a gold standard corpus in which terms and events related to angiogenesis, a key biological process of the growth of new blood vessels, were annotated. Statistics of the annotated corpus revealed that over 36% of the text expressions that referred to angiogenesis appeared as events. The proposed methods respectively employed domain-specific vocabularies, a manually annotated corpus and unstructured domain-specific documents. Evaluation results showed that, while a supervised machine-learning model yielded the best precision, recall and F1 scores, the other methods achieved reasonable performance and less cost to develop. Availability: The angiogenesis vocabularies, gold standard corpus, annotation guidelines and software described in this article are available at http://text0.mib.man.ac.uk/~mbassxw2/angiogenesis/ Contact: xinglong.wang@gmail.com PMID:21821664
FUSE: a profit maximization approach for functional summarization of biological networks.

PubMed

Seah, Boon-Siew; Bhowmick, Sourav S; Dewey, C Forbes; Yu, Hanry

2012-03-21

The availability of large-scale curated protein interaction datasets has given rise to the opportunity to investigate higher level organization and modularity within the protein interaction network (PPI) using graph theoretic analysis. Despite the recent progress, systems level analysis of PPIS remains a daunting task as it is challenging to make sense out of the deluge of high-dimensional interaction data. Specifically, techniques that automatically abstract and summarize PPIS at multiple resolutions to provide high level views of its functional landscape are still lacking. We present a novel data-driven and generic algorithm called FUSE (Functional Summary Generator) that generates functional maps of a PPI at different levels of organization, from broad process-process level interactions to in-depth complex-complex level interactions, through a pro t maximization approach that exploits Minimum Description Length (MDL) principle to maximize information gain of the summary graph while satisfying the level of detail constraint. We evaluate the performance of FUSE on several real-world PPIS. We also compare FUSE to state-of-the-art graph clustering methods with GO term enrichment by constructing the biological process landscape of the PPIS. Using AD network as our case study, we further demonstrate the ability of FUSE to quickly summarize the network and identify many different processes and complexes that regulate it. Finally, we study the higher-order connectivity of the human PPI. By simultaneously evaluating interaction and annotation data, FUSE abstracts higher-order interaction maps by reducing the details of the underlying PPI to form a functional summary graph of interconnected functional clusters. Our results demonstrate its effectiveness and superiority over state-of-the-art graph clustering methods with GO term enrichment.
A hierarchical structure for automatic meshing and adaptive FEM analysis

NASA Technical Reports Server (NTRS)

Kela, Ajay; Saxena, Mukul; Perucchio, Renato

1987-01-01

A new algorithm for generating automatically, from solid models of mechanical parts, finite element meshes that are organized as spatially addressable quaternary trees (for 2-D work) or octal trees (for 3-D work) is discussed. Because such meshes are inherently hierarchical as well as spatially addressable, they permit efficient substructuring techniques to be used for both global analysis and incremental remeshing and reanalysis. The global and incremental techniques are summarized and some results from an experimental closed loop 2-D system in which meshing, analysis, error evaluation, and remeshing and reanalysis are done automatically and adaptively are presented. The implementation of 3-D work is briefly discussed.
AN AUTOMATIC DEVICE FOR READING TYPOGRAPHICAL TEXTS,

DTIC Science & Technology

permissible. The system represents an attempt to apply the methods of machines designed for typescript reading to machines reading printed texts...Some characteristics by which typescript and typographical material differ are presented. The basic aspects of the recognition algorithm are given. A

Automatic Neural Processing of Disorder-Related Stimuli in Social Anxiety Disorder: Faces and More

PubMed Central

Schulz, Claudia; Mothes-Lasch, Martin; Straube, Thomas

2013-01-01

It has been proposed that social anxiety disorder (SAD) is associated with automatic information processing biases resulting in hypersensitivity to signals of social threat such as negative facial expressions. However, the nature and extent of automatic processes in SAD on the behavioral and neural level is not entirely clear yet. The present review summarizes neuroscientific findings on automatic processing of facial threat but also other disorder-related stimuli such as emotional prosody or negative words in SAD. We review initial evidence for automatic activation of the amygdala, insula, and sensory cortices as well as for automatic early electrophysiological components. However, findings vary depending on tasks, stimuli, and neuroscientific methods. Only few studies set out to examine automatic neural processes directly and systematic attempts are as yet lacking. We suggest that future studies should: (1) use different stimulus modalities, (2) examine different emotional expressions, (3) compare findings in SAD with other anxiety disorders, (4) use more sophisticated experimental designs to investigate features of automaticity systematically, and (5) combine different neuroscientific methods (such as functional neuroimaging and electrophysiology). Finally, the understanding of neural automatic processes could also provide hints for therapeutic approaches. PMID:23745116
Text mining patents for biomedical knowledge.

PubMed

Rodriguez-Esteban, Raul; Bundschus, Markus

2016-06-01

Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.
A Qualitative Study on the Use of Summarizing Strategies in Elementary Education

ERIC Educational Resources Information Center

Susar Kirmizi, Fatma; Akkaya, Nevin

2011-01-01

The objective of this study is to reveal how well summarizing strategies are used by Grade 4 and Grade 5 students as a reading comprehension strategy. This study was conducted in Buca, Izmir and the document analysis method, a qualitative research strategy, was employed. The study used a text titled "Environmental Pollution" and an…
Evaluation Methods of The Text Entities

ERIC Educational Resources Information Center

Popa, Marius

2006-01-01

The paper highlights some evaluation methods to assess the quality characteristics of the text entities. The main concepts used in building and evaluation processes of the text entities are presented. Also, some aggregated metrics for orthogonality measurements are presented. The evaluation process for automatic evaluation of the text entities is…
More than a "Basic Skill": Breaking down the Complexities of Summarizing for ABE/ESL Learners

ERIC Educational Resources Information Center

Ouellette-Schramm, Jennifer

2015-01-01

This article describes the complex cognitive and linguistic challenges of summarizing expository text at vocabulary, syntactic, and rhetorical levels. It then outlines activities to help ABE/ESL learners develop corresponding skills.
Evaluation of an automated knowledge-based textual summarization system for longitudinal clinical data, in the intensive care domain.

PubMed

Goldstein, Ayelet; Shahar, Yuval; Orenbuch, Efrat; Cohen, Matan J

2017-10-01

To examine the feasibility of the automated creation of meaningful free-text summaries of longitudinal clinical records, using a new general methodology that we had recently developed; and to assess the potential benefits to the clinical decision-making process of using such a method to generate draft letters that can be further manually enhanced by clinicians. We had previously developed a system, CliniText (CTXT), for automated summarization in free text of longitudinal medical records, using a clinical knowledge base. In the current study, we created an Intensive Care Unit (ICU) clinical knowledge base, assisted by two ICU clinical experts in an academic tertiary hospital. The CTXT system generated free-text summary letters from the data of 31 different patients, which were compared to the respective original physician-composed discharge letters. The main evaluation measures were (1) relative completeness, quantifying the data items missed by one of the letters but included by the other, and their importance; (2) quality parameters, such as readability; (3) functional performance, assessed by the time needed, by three clinicians reading each of the summaries, to answer five key questions, based on the discharge letter (e.g., "What are the patient's current respiratory requirements?"), and by the correctness of the clinicians' answers. Completeness: In 13/31 (42%) of the letters the number of important items missed in the CTXT-generated letter was actually less than or equal to the number of important items missed by the MD-composed letter. In each of the MD-composed letters, at least two important items that were mentioned by the CTXT system were missed (a mean of 7.2±5.74). In addition, the standard deviation in the number of missed items in the MD letters (STD=15.4) was much higher than the standard deviation in the CTXT-generated letters (STD=5.3). Quality: The MD-composed letters obtained a significantly better grade in three out of four measured parameters
Machine Translation from Text

NASA Astrophysics Data System (ADS)

Habash, Nizar; Olive, Joseph; Christianson, Caitlin; McCary, John

Machine translation (MT) from text, the topic of this chapter, is perhaps the heart of the GALE project. Beyond being a well defined application that stands on its own, MT from text is the link between the automatic speech recognition component and the distillation component. The focus of MT in GALE is on translating from Arabic or Chinese to English. The three languages represent a wide range of linguistic diversity and make the GALE MT task rather challenging and exciting.
Automatically Detecting Likely Edits in Clinical Notes Created Using Automatic Speech Recognition

PubMed Central

Lybarger, Kevin; Ostendorf, Mari; Yetisgen, Meliha

2017-01-01

The use of automatic speech recognition (ASR) to create clinical notes has the potential to reduce costs associated with note creation for electronic medical records, but at current system accuracy levels, post-editing by practitioners is needed to ensure note quality. Aiming to reduce the time required to edit ASR transcripts, this paper investigates novel methods for automatic detection of edit regions within the transcripts, including both putative ASR errors but also regions that are targets for cleanup or rephrasing. We create detection models using logistic regression and conditional random field models, exploring a variety of text-based features that consider the structure of clinical notes and exploit the medical context. Different medical text resources are used to improve feature extraction. Experimental results on a large corpus of practitioner-edited clinical notes show that 67% of sentence-level edits and 45% of word-level edits can be detected with a false detection rate of 15%. PMID:29854187
Autoclass: An automatic classification system

NASA Technical Reports Server (NTRS)

Stutz, John; Cheeseman, Peter; Hanson, Robin

1991-01-01

The task of inferring a set of classes and class descriptions most likely to explain a given data set can be placed on a firm theoretical foundation using Bayesian statistics. Within this framework, and using various mathematical and algorithmic approximations, the AutoClass System searches for the most probable classifications, automatically choosing the number of classes and complexity of class descriptions. A simpler version of AutoClass has been applied to many large real data sets, has discovered new independently-verified phenomena, and has been released as a robust software package. Recent extensions allow attributes to be selectively correlated within particular classes, and allow classes to inherit, or share, model parameters through a class hierarchy. The mathematical foundations of AutoClass are summarized.
Multi-dimensional classification of biomedical text: Toward automated, practical provision of high-utility text to diverse users

PubMed Central

Shatkay, Hagit; Pan, Fengxia; Rzhetsky, Andrey; Wilbur, W. John

2008-01-01

Motivation: Much current research in biomedical text mining is concerned with serving biologists by extracting certain information from scientific text. We note that there is no ‘average biologist’ client; different users have distinct needs. For instance, as noted in past evaluation efforts (BioCreative, TREC, KDD) database curators are often interested in sentences showing experimental evidence and methods. Conversely, lab scientists searching for known information about a protein may seek facts, typically stated with high confidence. Text-mining systems can target specific end-users and become more effective, if the system can first identify text regions rich in the type of scientific content that is of interest to the user, retrieve documents that have many such regions, and focus on fact extraction from these regions. Here, we study the ability to characterize and classify such text automatically. We have recently introduced a multi-dimensional categorization and annotation scheme, developed to be applicable to a wide variety of biomedical documents and scientific statements, while intended to support specific biomedical retrieval and extraction tasks. Results: The annotation scheme was applied to a large corpus in a controlled effort by eight independent annotators, where three individual annotators independently tagged each sentence. We then trained and tested machine learning classifiers to automatically categorize sentence fragments based on the annotation. We discuss here the issues involved in this task, and present an overview of the results. The latter strongly suggest that automatic annotation along most of the dimensions is highly feasible, and that this new framework for scientific sentence categorization is applicable in practice. Contact: shatkay@cs.queensu.ca PMID:18718948
An Automatic Measure of Cross-Language Text Structures

ERIC Educational Resources Information Center

Kim, Kyung

2018-01-01

In order to further validate and extend the application of "GIKS" (Graphical Interface of Knowledge Structure) beyond English, this investigation applies the "GIKS" to capture, visually represent, and compare text structures inherent in two "contrasting" languages. The English and parallel Korean versions of 50…
Exploring supervised and unsupervised methods to detect topics in biomedical text

PubMed Central

Lee, Minsuk; Wang, Weiqing; Yu, Hong

2006-01-01

Background Topic detection is a task that automatically identifies topics (e.g., "biochemistry" and "protein structure") in scientific articles based on information content. Topic detection will benefit many other natural language processing tasks including information retrieval, text summarization and question answering; and is a necessary step towards the building of an information system that provides an efficient way for biologists to seek information from an ocean of literature. Results We have explored the methods of Topic Spotting, a task of text categorization that applies the supervised machine-learning technique naïve Bayes to assign automatically a document into one or more predefined topics; and Topic Clustering, which apply unsupervised hierarchical clustering algorithms to aggregate documents into clusters such that each cluster represents a topic. We have applied our methods to detect topics of more than fifteen thousand of articles that represent over sixteen thousand entries in the Online Mendelian Inheritance in Man (OMIM) database. We have explored bag of words as the features. Additionally, we have explored semantic features; namely, the Medical Subject Headings (MeSH) that are assigned to the MEDLINE records, and the Unified Medical Language System (UMLS) semantic types that correspond to the MeSH terms, in addition to bag of words, to facilitate the tasks of topic detection. Our results indicate that incorporating the MeSH terms and the UMLS semantic types as additional features enhances the performance of topic detection and the naïve Bayes has the highest accuracy, 66.4%, for predicting the topic of an OMIM article as one of the total twenty-five topics. Conclusion Our results indicate that the supervised topic spotting methods outperformed the unsupervised topic clustering; on the other hand, the unsupervised topic clustering methods have the advantages of being robust and applicable in real world settings. PMID:16539745
Semi-Automatic Indexing of Full Text Biomedical Articles

PubMed Central

Gay, Clifford W.; Kayaalp, Mehmet; Aronson, Alan R.

2005-01-01

The main application of U.S. National Library of Medicine’s Medical Text Indexer (MTI) is to provide indexing recommendations to the Library’s indexing staff. The current input to MTI consists of the titles and abstracts of articles to be indexed. This study reports on an extension of MTI to the full text of articles appearing in online medical journals that are indexed for Medline®. Using a collection of 17 journal issues containing 500 articles, we report on the effectiveness of the contribution of terms by the whole article and also by each section. We obtain the best results using a model consisting of the sections Results, Results and Discussion, and Conclusions together with the article’s title and abstract, the captions of tables and figures, and sections that have no titles. The resulting model provides indexing significantly better (7.4%) than what is currently achieved using only titles and abstracts. PMID:16779044
Bayesian Modeling of Temporal Coherence in Videos for Entity Discovery and Summarization.

PubMed

Mitra, Adway; Biswas, Soma; Bhattacharyya, Chiranjib

2017-03-01

A video is understood by users in terms of entities present in it. Entity Discovery is the task of building appearance model for each entity (e.g., a person), and finding all its occurrences in the video. We represent a video as a sequence of tracklets, each spanning 10-20 frames, and associated with one entity. We pose Entity Discovery as tracklet clustering, and approach it by leveraging Temporal Coherence (TC): the property that temporally neighboring tracklets are likely to be associated with the same entity. Our major contributions are the first Bayesian nonparametric models for TC at tracklet-level. We extend Chinese Restaurant Process (CRP) to TC-CRP, and further to Temporally Coherent Chinese Restaurant Franchise (TC-CRF) to jointly model entities and temporal segments using mixture components and sparse distributions. For discovering persons in TV serial videos without meta-data like scripts, these methods show considerable improvement over state-of-the-art approaches to tracklet clustering in terms of clustering accuracy, cluster purity and entity coverage. The proposed methods can perform online tracklet clustering on streaming videos unlike existing approaches, and can automatically reject false tracklets. Finally we discuss entity-driven video summarization- where temporal segments of the video are selected based on the discovered entities, to create a semantically meaningful summary.
Impact of translation on named-entity recognition in radiology texts

PubMed Central

Pedro, Vasco

2017-01-01

Abstract Radiology reports describe the results of radiography procedures and have the potential of being a useful source of information which can bring benefits to health care systems around the world. One way to automatically extract information from the reports is by using Text Mining tools. The problem is that these tools are mostly developed for English and reports are usually written in the native language of the radiologist, which is not necessarily English. This creates an obstacle to the sharing of Radiology information between different communities. This work explores the solution of translating the reports to English before applying the Text Mining tools, probing the question of what translation approach should be used. We created MRRAD (Multilingual Radiology Research Articles Dataset), a parallel corpus of Portuguese research articles related to Radiology and a number of alternative translations (human, automatic and semi-automatic) to English. This is a novel corpus which can be used to move forward the research on this topic. Using MRRAD we studied which kind of automatic or semi-automatic translation approach is more effective on the Named-entity recognition task of finding RadLex terms in the English version of the articles. Considering the terms extracted from human translations as our gold standard, we calculated how similar to this standard were the terms extracted using other translations. We found that a completely automatic translation approach using Google leads to F-scores (between 0.861 and 0.868, depending on the extraction approach) similar to the ones obtained through a more expensive semi-automatic translation approach using Unbabel (between 0.862 and 0.870). To better understand the results we also performed a qualitative analysis of the type of errors found in the automatic and semi-automatic translations. Database URL: https://github.com/lasigeBioTM/MRRAD PMID:29220455
29 CFR 779.313 - Requirements summarized.

Code of Federal Regulations, 2010 CFR

2010-07-01

... RETAILERS OF GOODS OR SERVICES Exemptions for Certain Retail or Service Establishments Statutory Meaning of Retail Or Service Establishment § 779.313 Requirements summarized. The statutory definition of the term “retail or service establishment” found in section 13(a)(2), clearly provides that an establishment to be...
Abbreviation definition identification based on automatic precision estimates.

PubMed

Sohn, Sunghwan; Comeau, Donald C; Kim, Won; Wilbur, W John

2008-09-25

The rapid growth of biomedical literature presents challenges for automatic text processing, and one of the challenges is abbreviation identification. The presence of unrecognized abbreviations in text hinders indexing algorithms and adversely affects information retrieval and extraction. Automatic abbreviation definition identification can help resolve these issues. However, abbreviations and their definitions identified by an automatic process are of uncertain validity. Due to the size of databases such as MEDLINE only a small fraction of abbreviation-definition pairs can be examined manually. An automatic way to estimate the accuracy of abbreviation-definition pairs extracted from text is needed. In this paper we propose an abbreviation definition identification algorithm that employs a variety of strategies to identify the most probable abbreviation definition. In addition our algorithm produces an accuracy estimate, pseudo-precision, for each strategy without using a human-judged gold standard. The pseudo-precisions determine the order in which the algorithm applies the strategies in seeking to identify the definition of an abbreviation. On the Medstract corpus our algorithm produced 97% precision and 85% recall which is higher than previously reported results. We also annotated 1250 randomly selected MEDLINE records as a gold standard. On this set we achieved 96.5% precision and 83.2% recall. This compares favourably with the well known Schwartz and Hearst algorithm. We developed an algorithm for abbreviation identification that uses a variety of strategies to identify the most probable definition for an abbreviation and also produces an estimated accuracy of the result. This process is purely automatic.
Automatic Evidence Retrieval for Systematic Reviews

PubMed Central

Choong, Miew Keen; Galgani, Filippo; Dunn, Adam G

2014-01-01

Background Snowballing involves recursively pursuing relevant references cited in the retrieved literature and adding them to the search results. Snowballing is an alternative approach to discover additional evidence that was not retrieved through conventional search. Snowballing’s effectiveness makes it best practice in systematic reviews despite being time-consuming and tedious. Objective Our goal was to evaluate an automatic method for citation snowballing’s capacity to identify and retrieve the full text and/or abstracts of cited articles. Methods Using 20 review articles that contained 949 citations to journal or conference articles, we manually searched Microsoft Academic Search (MAS) and identified 78.0% (740/949) of the cited articles that were present in the database. We compared the performance of the automatic citation snowballing method against the results of this manual search, measuring precision, recall, and F1 score. Results The automatic method was able to correctly identify 633 (as proportion of included citations: recall=66.7%, F1 score=79.3%; as proportion of citations in MAS: recall=85.5%, F1 score=91.2%) of citations with high precision (97.7%), and retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%) of the 633 correctly retrieved citations. Conclusions The proposed method for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles. By automating the process of citation snowballing, it may be possible to reduce the time and effort of common evidence surveillance tasks such as keeping trial registries up to date and conducting systematic reviews. PMID:25274020
Automatic evidence retrieval for systematic reviews.

PubMed

Choong, Miew Keen; Galgani, Filippo; Dunn, Adam G; Tsafnat, Guy

2014-10-01

Snowballing involves recursively pursuing relevant references cited in the retrieved literature and adding them to the search results. Snowballing is an alternative approach to discover additional evidence that was not retrieved through conventional search. Snowballing's effectiveness makes it best practice in systematic reviews despite being time-consuming and tedious. Our goal was to evaluate an automatic method for citation snowballing's capacity to identify and retrieve the full text and/or abstracts of cited articles. Using 20 review articles that contained 949 citations to journal or conference articles, we manually searched Microsoft Academic Search (MAS) and identified 78.0% (740/949) of the cited articles that were present in the database. We compared the performance of the automatic citation snowballing method against the results of this manual search, measuring precision, recall, and F1 score. The automatic method was able to correctly identify 633 (as proportion of included citations: recall=66.7%, F1 score=79.3%; as proportion of citations in MAS: recall=85.5%, F1 score=91.2%) of citations with high precision (97.7%), and retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%) of the 633 correctly retrieved citations. The proposed method for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles. By automating the process of citation snowballing, it may be possible to reduce the time and effort of common evidence surveillance tasks such as keeping trial registries up to date and conducting systematic reviews.
Structure Strategies for Comprehending Expository Text.

ERIC Educational Resources Information Center

Muth, K. Denise

1987-01-01

Examines three strategies designed to help middle school students use text structures to comprehend expository text: (1) hierarchical summaries, (2) conceptual maps, and (3) thematic organizers. Summarizes advantages and disadvantages of each strategy and recommends that teachers consider the outcomes they want and select the most appropriate…

Automatic Evaluations and Exercising: Systematic Review and Implications for Future Research.

PubMed

Schinkoeth, Michaela; Antoniewicz, Franziska

2017-01-01

The general purpose of this systematic review was to summarize, structure and evaluate the findings on automatic evaluations of exercising. Studies were eligible for inclusion if they reported measuring automatic evaluations of exercising with an implicit measure and assessed some kind of exercise variable. Fourteen nonexperimental and six experimental studies (out of a total N = 1,928) were identified and rated by two independent reviewers. The main study characteristics were extracted and the grade of evidence for each study evaluated. First, results revealed a large heterogeneity in the applied measures to assess automatic evaluations of exercising and the exercise variables. Generally, small to large-sized significant relations between automatic evaluations of exercising and exercise variables were identified in the vast majority of studies. The review offers a systematization of the various examined exercise variables and prompts to differentiate more carefully between actually observed exercise behavior (proximal exercise indicator) and associated physiological or psychological variables (distal exercise indicator). Second, a lack of transparent reported reflections on the differing theoretical basis leading to the use of specific implicit measures was observed. Implicit measures should be applied purposefully, taking into consideration the individual advantages or disadvantages of the measures. Third, 12 studies were rated as providing first-grade evidence (lowest grade of evidence), five represent second-grade and three were rated as third-grade evidence. There is a dramatic lack of experimental studies, which are essential for illustrating the cause-effect relation between automatic evaluations of exercising and exercise and investigating under which conditions automatic evaluations of exercising influence behavior. Conclusions about the necessity of exercise interventions targeted at the alteration of automatic evaluations of exercising should therefore
Automatic Evaluations and Exercising: Systematic Review and Implications for Future Research

PubMed Central

Schinkoeth, Michaela; Antoniewicz, Franziska

2017-01-01

The general purpose of this systematic review was to summarize, structure and evaluate the findings on automatic evaluations of exercising. Studies were eligible for inclusion if they reported measuring automatic evaluations of exercising with an implicit measure and assessed some kind of exercise variable. Fourteen nonexperimental and six experimental studies (out of a total N = 1,928) were identified and rated by two independent reviewers. The main study characteristics were extracted and the grade of evidence for each study evaluated. First, results revealed a large heterogeneity in the applied measures to assess automatic evaluations of exercising and the exercise variables. Generally, small to large-sized significant relations between automatic evaluations of exercising and exercise variables were identified in the vast majority of studies. The review offers a systematization of the various examined exercise variables and prompts to differentiate more carefully between actually observed exercise behavior (proximal exercise indicator) and associated physiological or psychological variables (distal exercise indicator). Second, a lack of transparent reported reflections on the differing theoretical basis leading to the use of specific implicit measures was observed. Implicit measures should be applied purposefully, taking into consideration the individual advantages or disadvantages of the measures. Third, 12 studies were rated as providing first-grade evidence (lowest grade of evidence), five represent second-grade and three were rated as third-grade evidence. There is a dramatic lack of experimental studies, which are essential for illustrating the cause-effect relation between automatic evaluations of exercising and exercise and investigating under which conditions automatic evaluations of exercising influence behavior. Conclusions about the necessity of exercise interventions targeted at the alteration of automatic evaluations of exercising should therefore
Summarizing Simulation Results using Causally-relevant States

PubMed Central

Parikh, Nidhi; Marathe, Madhav; Swarup, Samarth

2016-01-01

As increasingly large-scale multiagent simulations are being implemented, new methods are becoming necessary to make sense of the results of these simulations. Even concisely summarizing the results of a given simulation run is a challenge. Here we pose this as the problem of simulation summarization: how to extract the causally-relevant descriptions of the trajectories of the agents in the simulation. We present a simple algorithm to compress agent trajectories through state space by identifying the state transitions which are relevant to determining the distribution of outcomes at the end of the simulation. We present a toy-example to illustrate the working of the algorithm, and then apply it to a complex simulation of a major disaster in an urban area. PMID:28042620
The Interplay between Automatic and Control Processes in Reading.

ERIC Educational Resources Information Center

Walczyk, Jeffrey J.

2000-01-01

Reviews prominent reading theories in light of their accounts of how automatic and control processes combine to produce successful text comprehension, and the trade-offs between the two. Presents the Compensatory-Encoding Model of reading, which explicates how, when, and why automatic and control processes interact. Notes important educational…
The transition to increased automaticity during finger sequence learning in adult males who stutter.

PubMed

Smits-Bandstra, Sarah; De Nil, Luc; Rochon, Elizabeth

2006-01-01

The present study compared the automaticity levels of persons who stutter (PWS) and persons who do not stutter (PNS) on a practiced finger sequencing task under dual task conditions. Automaticity was defined as the amount of attention required for task performance. Twelve PWS and 12 control subjects practiced finger tapping sequences under single and then dual task conditions. Control subjects performed the sequencing task significantly faster and less variably under single versus dual task conditions while PWS' performance was consistently slow and variable (comparable to the dual task performance of control subjects) under both conditions. Control subjects were significantly more accurate on a colour recognition distracter task than PWS under dual task conditions. These results suggested that control subjects transitioned to quick, accurate and increasingly automatic performance on the sequencing task after practice, while PWS did not. Because most stuttering treatment programs for adults include practice and automatization of new motor speech skills, findings of this finger sequencing study and future studies of speech sequence learning may have important implications for how to maximize stuttering treatment effectiveness. As a result of this activity, the participant will be able to: (1) Define automaticity and explain the importance of dual task paradigms to investigate automaticity; (2) Relate the proposed relationship between motor learning and automaticity as stated by the authors; (3) Summarize the reviewed literature concerning the performance of PWS on dual tasks; and (4) Explain why the ability to transition to automaticity during motor learning may have important clinical implications for stuttering treatment effectiveness.
Evaluating a Bilingual Text-Mining System with a Taxonomy of Key Words and Hierarchical Visualization for Understanding Learner-Generated Text

ERIC Educational Resources Information Center

Kong, Siu Cheung; Li, Ping; Song, Yanjie

2018-01-01

This study evaluated a bilingual text-mining system, which incorporated a bilingual taxonomy of key words and provided hierarchical visualization, for understanding learner-generated text in the learning management systems through automatic identification and counting of matching key words. A class of 27 in-service teachers studied a course…
Text-image alignment for historical handwritten documents

NASA Astrophysics Data System (ADS)

Zinger, S.; Nerbonne, J.; Schomaker, L.

2009-01-01

We describe our work on text-image alignment in context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text transcriptions. The images of handwritten lines are automatically segmented from the scanned pages of historical documents and then manually transcribed. To train automatic routines to detect words in an image of handwritten text, we need a training set - images of words with their transcriptions. We present our results on aligning words from the images of handwritten lines and their corresponding text transcriptions. Alignment based on the longest spaces between portions of handwriting is a baseline. We then show that relative lengths, i.e. proportions of words in their lines, can be used to improve the alignment results considerably. To take into account the relative word length, we define the expressions for the cost function that has to be minimized for aligning text words with their images. We apply right to left alignment as well as alignment based on exhaustive search. The quality assessment of these alignments shows correct results for 69% of words from 100 lines, or 90% of partially correct and correct alignments combined.
[Advances in automatic detection technology for images of thin blood film of malaria parasite].

PubMed

Juan-Sheng, Zhang; Di-Qiang, Zhang; Wei, Wang; Xiao-Guang, Wei; Zeng-Guo, Wang

2017-05-05

This paper reviews the computer vision and image analysis studies aiming at automated diagnosis or screening of malaria in microscope images of thin blood film smears. On the basis of introducing the background and significance of automatic detection technology, the existing detection technologies are summarized and divided into several steps, including image acquisition, pre-processing, morphological analysis, segmentation, count, and pattern classification components. Then, the principles and implementation methods of each step are given in detail. In addition, the promotion and application in automatic detection technology of thick blood film smears are put forwarded as questions worthy of study, and a perspective of the future work for realization of automated microscopy diagnosis of malaria is provided.
Text Mining to Support Gene Ontology Curation and Vice Versa.

PubMed

Ruch, Patrick

2017-01-01

In this chapter, we explain how text mining can support the curation of molecular biology databases dealing with protein functions. We also show how curated data can play a disruptive role in the developments of text mining methods. We review a decade of efforts to improve the automatic assignment of Gene Ontology (GO) descriptors, the reference ontology for the characterization of genes and gene products. To illustrate the high potential of this approach, we compare the performances of an automatic text categorizer and show a large improvement of +225 % in both precision and recall on benchmarked data. We argue that automatic text categorization functions can ultimately be embedded into a Question-Answering (QA) system to answer questions related to protein functions. Because GO descriptors can be relatively long and specific, traditional QA systems cannot answer such questions. A new type of QA system, so-called Deep QA which uses machine learning methods trained with curated contents, is thus emerging. Finally, future advances of text mining instruments are directly dependent on the availability of high-quality annotated contents at every curation step. Databases workflows must start recording explicitly all the data they curate and ideally also some of the data they do not curate.
Automatic control of a primary electric thrust subsystem

NASA Technical Reports Server (NTRS)

Macie, T. W.; Macmedan, M. L.

1975-01-01

A concept for automatic control of the thrust subsystem has been developed by JPL and participating NASA Centers. This paper reports on progress in implementing the concept at JPL. Control of the Thrust Subsystem (TSS) is performed by the spacecraft computer command subsystem, and telemetry data is extracted by the spacecraft flight data subsystem. The Data and Control Interface Unit, an element of the TSS, provides the interface with the individual elements of the TSS. The control philosophy and implementation guidelines are presented. Control requirements are listed, and the control mechanism, including the serial digital data intercommunication system, is outlined. The paper summarizes progress to Fall 1974.
Text feature extraction based on deep learning: a review.

PubMed

Liang, Hong; Sun, Xiao; Sun, Yunlei; Gao, Yuan

2017-01-01

Selection of text feature item is a basic and important matter for text mining and information retrieval. Traditional methods of feature extraction require handcrafted features. To hand-design, an effective feature is a lengthy process, but aiming at new applications, deep learning enables to acquire new effective feature representation from training data. As a new feature extraction method, deep learning has made achievements in text mining. The major difference between deep learning and conventional methods is that deep learning automatically learns features from big data, instead of adopting handcrafted features, which mainly depends on priori knowledge of designers and is highly impossible to take the advantage of big data. Deep learning can automatically learn feature representation from big data, including millions of parameters. This thesis outlines the common methods used in text feature extraction first, and then expands frequently used deep learning methods in text feature extraction and its applications, and forecasts the application of deep learning in feature extraction.
Terminologies for text-mining; an experiment in the lipoprotein metabolism domain

PubMed Central

Alexopoulou, Dimitra; Wächter, Thomas; Pickersgill, Laura; Eyre, Cecilia; Schroeder, Michael

2008-01-01

Background The engineering of ontologies, especially with a view to a text-mining use, is still a new research field. There does not yet exist a well-defined theory and technology for ontology construction. Many of the ontology design steps remain manual and are based on personal experience and intuition. However, there exist a few efforts on automatic construction of ontologies in the form of extracted lists of terms and relations between them. Results We share experience acquired during the manual development of a lipoprotein metabolism ontology (LMO) to be used for text-mining. We compare the manually created ontology terms with the automatically derived terminology from four different automatic term recognition (ATR) methods. The top 50 predicted terms contain up to 89% relevant terms. For the top 1000 terms the best method still generates 51% relevant terms. In a corpus of 3066 documents 53% of LMO terms are contained and 38% can be generated with one of the methods. Conclusions Given high precision, automatic methods can help decrease development time and provide significant support for the identification of domain-specific vocabulary. The coverage of the domain vocabulary depends strongly on the underlying documents. Ontology development for text mining should be performed in a semi-automatic way; taking ATR results as input and following the guidelines we described. Availability The TFIDF term recognition is available as Web Service, described at PMID:18460175
Spatial Analysis of Handwritten Texts as a Marker of Cognitive Control.

PubMed

Crespo, Y; Soriano, M F; Iglesias-Parro, S; Aznarte, J I; Ibáñez-Molina, A J

2017-12-01

We explore the idea that cognitive demands of the handwriting would influence the degree of automaticity of the handwriting process, which in turn would affect the geometric parameters of texts. We compared the heterogeneity of handwritten texts in tasks with different cognitive demands; the heterogeneity of texts was analyzed with lacunarity, a measure of geometrical invariance. In Experiment 1, we asked participants to perform two tasks that varied in cognitive demands: transcription and exposition about an autobiographical episode. Lacunarity was significantly lower in transcription. In Experiment 2, we compared a veridical and a fictitious version of a personal event. Lacunarity was lower in veridical texts. We contend that differences in lacunarity of handwritten texts reveal the degree of automaticity in handwriting.
Automatic recognition of disorders, findings, pharmaceuticals and body structures from clinical text: an annotation and machine learning study.

PubMed

Skeppstedt, Maria; Kvist, Maria; Nilsson, Gunnar H; Dalianis, Hercules

2014-06-01

Automatic recognition of clinical entities in the narrative text of health records is useful for constructing applications for documentation of patient care, as well as for secondary usage in the form of medical knowledge extraction. There are a number of named entity recognition studies on English clinical text, but less work has been carried out on clinical text in other languages. This study was performed on Swedish health records, and focused on four entities that are highly relevant for constructing a patient overview and for medical hypothesis generation, namely the entities: Disorder, Finding, Pharmaceutical Drug and Body Structure. The study had two aims: to explore how well named entity recognition methods previously applied to English clinical text perform on similar texts written in Swedish; and to evaluate whether it is meaningful to divide the more general category Medical Problem, which has been used in a number of previous studies, into the two more granular entities, Disorder and Finding. Clinical notes from a Swedish internal medicine emergency unit were annotated for the four selected entity categories, and the inter-annotator agreement between two pairs of annotators was measured, resulting in an average F-score of 0.79 for Disorder, 0.66 for Finding, 0.90 for Pharmaceutical Drug and 0.80 for Body Structure. A subset of the developed corpus was thereafter used for finding suitable features for training a conditional random fields model. Finally, a new model was trained on this subset, using the best features and settings, and its ability to generalise to held-out data was evaluated. This final model obtained an F-score of 0.81 for Disorder, 0.69 for Finding, 0.88 for Pharmaceutical Drug, 0.85 for Body Structure and 0.78 for the combined category Disorder+Finding. The obtained results, which are in line with or slightly lower than those for similar studies on English clinical text, many of them conducted using a larger training data set, show that
The Use of Summarization Tasks: Some Lexical and Conceptual Analyses

ERIC Educational Resources Information Center

Yu, Guoxing

2013-01-01

This article reports the lexical diversity of summaries written by experts and test takers in an empirical study and then interrogates the (in)congruity between the conceptualisations of "summary" and "summarize" in the literature of educational research and the operationalization of summarization tasks in three international…
Automatic Imitation

ERIC Educational Resources Information Center

Heyes, Cecilia

2011-01-01

"Automatic imitation" is a type of stimulus-response compatibility effect in which the topographical features of task-irrelevant action stimuli facilitate similar, and interfere with dissimilar, responses. This article reviews behavioral, neurophysiological, and neuroimaging research on automatic imitation, asking in what sense it is "automatic"…
Figure summarizer browser extensions for PubMed Central

PubMed Central

Agarwal, Shashank; Yu, Hong

2011-01-01

Summary: Figures in biomedical articles present visual evidence for research facts and help readers understand the article better. However, when figures are taken out of context, it is difficult to understand their content. We developed a summarization algorithm to summarize the content of figures and used it in our figure search engine (http://figuresearch.askhermes.org/). In this article, we report on the development of web browser extensions for Mozilla Firefox, Google Chrome and Apple Safari to display summaries for figures in PubMed Central and NCBI Images. Availability: The extensions can be downloaded from http://figuresearch.askhermes.org/articlesearch/extensions.php. Contact: agarwal@uwm.edu PMID:21493658
Review assessment support in Open Journal System using TextRank

NASA Astrophysics Data System (ADS)

Manalu, S. R.; Willy; Sundjaja, A. M.; Noerlina

2017-01-01

In this paper, a review assessment support in Open Journal System (OJS) using TextRank is proposed. OJS is an open-source journal management platform that provides a streamlined journal publishing workflow. TextRank is an unsupervised, graph-based ranking model commonly used as extractive auto summarization of text documents. This study applies the TextRank algorithm to summarize 50 article reviews from an OJS-based international journal. The resulting summaries are formed using the most representative sentences extracted from the reviews. The summaries are then used to help OJS editors in assessing a review’s quality.
Automatic Fringe Detection for Oil Film Interferometry Measurement of Skin Friction

NASA Technical Reports Server (NTRS)

Naughton, Jonathan W.; Decker, Robert K.; Jafari, Farhad

2001-01-01

This report summarizes two years of work on investigating algorithms for automatically detecting fringe patterns in images acquired using oil-drop interferometry for the determination of skin friction. Several different analysis methods were tested, and a combination of a windowed Fourier transform followed by a correlation was found to be most effective. The implementation of this method is discussed and details of the process are described. The results indicate that this method shows promise for automating the fringe detection process, but further testing is required.
Terminology extraction from medical texts in Polish

PubMed Central

2014-01-01

Background Hospital documents contain free text describing the most important facts relating to patients and their illnesses. These documents are written in specific language containing medical terminology related to hospital treatment. Their automatic processing can help in verifying the consistency of hospital documentation and obtaining statistical data. To perform this task we need information on the phrases we are looking for. At the moment, clinical Polish resources are sparse. The existing terminologies, such as Polish Medical Subject Headings (MeSH), do not provide sufficient coverage for clinical tasks. It would be helpful therefore if it were possible to automatically prepare, on the basis of a data sample, an initial set of terms which, after manual verification, could be used for the purpose of information extraction. Results Using a combination of linguistic and statistical methods for processing over 1200 children hospital discharge records, we obtained a list of single and multiword terms used in hospital discharge documents written in Polish. The phrases are ordered according to their presumed importance in domain texts measured by the frequency of use of a phrase and the variety of its contexts. The evaluation showed that the automatically identified phrases cover about 84% of terms in domain texts. At the top of the ranked list, only 4% out of 400 terms were incorrect while out of the final 200, 20% of expressions were either not domain related or syntactically incorrect. We also observed that 70% of the obtained terms are not included in the Polish MeSH. Conclusions Automatic terminology extraction can give results which are of a quality high enough to be taken as a starting point for building domain related terminological dictionaries or ontologies. This approach can be useful for preparing terminological resources for very specific subdomains for which no relevant terminologies already exist. The evaluation performed showed that none of the

Terminology extraction from medical texts in Polish.

PubMed

Marciniak, Małgorzata; Mykowiecka, Agnieszka

2014-01-01

Hospital documents contain free text describing the most important facts relating to patients and their illnesses. These documents are written in specific language containing medical terminology related to hospital treatment. Their automatic processing can help in verifying the consistency of hospital documentation and obtaining statistical data. To perform this task we need information on the phrases we are looking for. At the moment, clinical Polish resources are sparse. The existing terminologies, such as Polish Medical Subject Headings (MeSH), do not provide sufficient coverage for clinical tasks. It would be helpful therefore if it were possible to automatically prepare, on the basis of a data sample, an initial set of terms which, after manual verification, could be used for the purpose of information extraction. Using a combination of linguistic and statistical methods for processing over 1200 children hospital discharge records, we obtained a list of single and multiword terms used in hospital discharge documents written in Polish. The phrases are ordered according to their presumed importance in domain texts measured by the frequency of use of a phrase and the variety of its contexts. The evaluation showed that the automatically identified phrases cover about 84% of terms in domain texts. At the top of the ranked list, only 4% out of 400 terms were incorrect while out of the final 200, 20% of expressions were either not domain related or syntactically incorrect. We also observed that 70% of the obtained terms are not included in the Polish MeSH. Automatic terminology extraction can give results which are of a quality high enough to be taken as a starting point for building domain related terminological dictionaries or ontologies. This approach can be useful for preparing terminological resources for very specific subdomains for which no relevant terminologies already exist. The evaluation performed showed that none of the tested ranking procedures were
Personalized summarization using user preference for m-learning

NASA Astrophysics Data System (ADS)

Lee, Sihyoung; Yang, Seungji; Ro, Yong Man; Kim, Hyoung Joong

2008-02-01

As the Internet and multimedia technology is becoming advanced, the number of digital multimedia contents is also becoming abundant in learning area. In order to facilitate the access of digital knowledge and to meet the need of a lifelong learning, e-learning could be the helpful alternative way to the conventional learning paradigms. E-learning is known as a unifying term to express online, web-based and technology-delivered learning. Mobile-learning (m-learning) is defined as e-learning through mobile devices using wireless transmission. In a survey, more than half of the people remarked that the re-consumption was one of the convenient features in e-learning. However, it is not easy to find user's preferred segmentation from a full version of lengthy e-learning content. Especially in m-learning, a content-summarization method is strongly required because mobile devices are limited to low processing power and battery capacity. In this paper, we propose a new user preference model for re-consumption to construct personalized summarization for re-consumption. The user preference for re-consumption is modeled based on user actions with statistical model. Based on the user preference model for re-consumption with personalized user actions, our method discriminates preferred parts over the entire content. Experimental results demonstrated successful personalized summarization.
Toward a Model of Text Comprehension and Production.

ERIC Educational Resources Information Center

Kintsch, Walter; Van Dijk, Teun A.

1978-01-01

Described is the system of mental operations occurring in text comprehension and in recall and summarization. A processing model is outlined: 1) the meaning elements of a text become organized into a coherent whole, 2) the full meaning of the text is condensed into its gist, and 3) new texts are generated from the comprehension processes.…
Analysis in Motion Initiative – Summarization Capability

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arendt, Dustin; Pirrung, Meg; Jasper, Rob

2017-06-22

Analysts are tasked with integrating information from multiple data sources for important and timely decision making. What if sense making and overall situation awareness could be improved through visualization techniques? The Analysis in Motion initiative is advancing the ability to summarize and abstract multiple streams and static data sources over time.
Collaborative human-machine analysis to disambiguate entities in unstructured text and structured datasets

NASA Astrophysics Data System (ADS)

Davenport, Jack H.

2016-05-01

Intelligence analysts demand rapid information fusion capabilities to develop and maintain accurate situational awareness and understanding of dynamic enemy threats in asymmetric military operations. The ability to extract relationships between people, groups, and locations from a variety of text datasets is critical to proactive decision making. The derived network of entities must be automatically created and presented to analysts to assist in decision making. DECISIVE ANALYTICS Corporation (DAC) provides capabilities to automatically extract entities, relationships between entities, semantic concepts about entities, and network models of entities from text and multi-source datasets. DAC's Natural Language Processing (NLP) Entity Analytics model entities as complex systems of attributes and interrelationships which are extracted from unstructured text via NLP algorithms. The extracted entities are automatically disambiguated via machine learning algorithms, and resolution recommendations are presented to the analyst for validation; the analyst's expertise is leveraged in this hybrid human/computer collaborative model. Military capability is enhanced by these NLP Entity Analytics because analysts can now create/update an entity profile with intelligence automatically extracted from unstructured text, thereby fusing entity knowledge from structured and unstructured data sources. Operational and sustainment costs are reduced since analysts do not have to manually tag and resolve entities.
iBIOMES Lite: Summarizing Biomolecular Simulation Data in Limited Settings

PubMed Central

2015-01-01

As the amount of data generated by biomolecular simulations dramatically increases, new tools need to be developed to help manage this data at the individual investigator or small research group level. In this paper, we introduce iBIOMES Lite, a lightweight tool for biomolecular simulation data indexing and summarization. The main goal of iBIOMES Lite is to provide a simple interface to summarize computational experiments in a setting where the user might have limited privileges and limited access to IT resources. A command-line interface allows the user to summarize, publish, and search local simulation data sets. Published data sets are accessible via static hypertext markup language (HTML) pages that summarize the simulation protocols and also display data analysis graphically. The publication process is customized via extensible markup language (XML) descriptors while the HTML summary template is customized through extensible stylesheet language (XSL). iBIOMES Lite was tested on different platforms and at several national computing centers using various data sets generated through classical and quantum molecular dynamics, quantum chemistry, and QM/MM. The associated parsers currently support AMBER, GROMACS, Gaussian, and NWChem data set publication. The code is available at https://github.com/jcvthibault/ibiomes. PMID:24830957
Summarizing Audiovisual Contents of a Video Program

NASA Astrophysics Data System (ADS)

Gong, Yihong

2003-12-01

In this paper, we focus on video programs that are intended to disseminate information and knowledge such as news, documentaries, seminars, etc, and present an audiovisual summarization system that summarizes the audio and visual contents of the given video separately, and then integrating the two summaries with a partial alignment. The audio summary is created by selecting spoken sentences that best present the main content of the audio speech while the visual summary is created by eliminating duplicates/redundancies and preserving visually rich contents in the image stream. The alignment operation aims to synchronize each spoken sentence in the audio summary with its corresponding speaker's face and to preserve the rich content in the visual summary. A Bipartite Graph-based audiovisual alignment algorithm is developed to efficiently find the best alignment solution that satisfies these alignment requirements. With the proposed system, we strive to produce a video summary that: (1) provides a natural visual and audio content overview, and (2) maximizes the coverage for both audio and visual contents of the original video without having to sacrifice either of them.
Automatic Processing of Current Affairs Queries

ERIC Educational Resources Information Center

Salton, G.

1973-01-01

The SMART system is used for the analysis, search and retrieval of news stories appearing in Time'' magazine. A comparison is made between the automatic text processing methods incorporated into the SMART system and a manual search using the classified index to Time.'' (14 references) (Author)
Text-mining and information-retrieval services for molecular biology

PubMed Central

Krallinger, Martin; Valencia, Alfonso

2005-01-01

Text-mining in molecular biology - defined as the automatic extraction of information about genes, proteins and their functional relationships from text documents - has emerged as a hybrid discipline on the edges of the fields of information science, bioinformatics and computational linguistics. A range of text-mining applications have been developed recently that will improve access to knowledge for biologists and database annotators. PMID:15998455
TEXT CLASSIFICATION FOR AUTOMATIC DETECTION OF E-CIGARETTE USE AND USE FOR SMOKING CESSATION FROM TWITTER: A FEASIBILITY PILOT.

PubMed

Aphinyanaphongs, Yin; Lulejian, Armine; Brown, Duncan Penfold; Bonneau, Richard; Krebs, Paul

2016-01-01

Rapid increases in e-cigarette use and potential exposure to harmful byproducts have shifted public health focus to e-cigarettes as a possible drug of abuse. Effective surveillance of use and prevalence would allow appropriate regulatory responses. An ideal surveillance system would collect usage data in real time, focus on populations of interest, include populations unable to take the survey, allow a breadth of questions to answer, and enable geo-location analysis. Social media streams may provide this ideal system. To realize this use case, a foundational question is whether we can detect e-cigarette use at all. This work reports two pilot tasks using text classification to identify automatically Tweets that indicate e-cigarette use and/or e-cigarette use for smoking cessation. We build and define both datasets and compare performance of 4 state of the art classifiers and a keyword search for each task. Our results demonstrate excellent classifier performance of up to 0.90 and 0.94 area under the curve in each category. These promising initial results form the foundation for further studies to realize the ideal surveillance solution.
Tashkeela: Novel corpus of Arabic vocalized texts, data for auto-diacritization systems.

PubMed

Zerrouki, Taha; Balla, Amar

2017-04-01

Arabic diacritics are often missed in Arabic scripts. This feature is a handicap for new learner to read َArabic, text to speech conversion systems, reading and semantic analysis of Arabic texts. The automatic diacritization systems are the best solution to handle this issue. But such automation needs resources as diactritized texts to train and evaluate such systems. In this paper, we describe our corpus of Arabic diacritized texts. This corpus is called Tashkeela. It can be used as a linguistic resource tool for natural language processing such as automatic diacritics systems, dis-ambiguity mechanism, features and data extraction. The corpus is freely available, it contains 75 million of fully vocalized words mainly 97 books from classical and modern Arabic language. The corpus is collected from manually vocalized texts using web crawling process.
Unsupervised method for automatic construction of a disease dictionary from a large free text collection.

PubMed

Xu, Rong; Supekar, Kaustubh; Morgan, Alex; Das, Amar; Garber, Alan

2008-11-06

Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting con-textual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35-88%) over available, manually created disease terminologies.
Impact of Machine-Translated Text on Entity and Relationship Extraction

DTIC Science & Technology

2014-12-01

20 1 1. Introduction Using social network analysis tools is an important asset in...semantic modeling software to automatically build detailed network models from unstructured text. Contour imports unstructured text and then maps the text...onto an existing ontology of frames at the sentence level, using FrameNet, a structured language model, and through Semantic Role Labeling ( SRL
Theory and implementation of summarization: Improving sensor interpretation for spacecraft operations

NASA Astrophysics Data System (ADS)

Swartwout, Michael Alden

New paradigms in space missions require radical changes in spacecraft operations. In the past, operations were insulated from competitive pressures of cost, quality and time by system infrastructures, technological limitations and historical precedent. However, modern demands now require that operations meet competitive performance goals. One target for improvement is the telemetry downlink, where significant resources are invested to acquire thousands of measurements for human interpretation. This cost-intensive method is used because conventional operations are not based on formal methodologies but on experiential reasoning and incrementally adapted procedures. Therefore, to improve the telemetry downlink it is first necessary to invent a rational framework for discussing operations. This research explores operations as a feedback control problem, develops the conceptual basis for the use of spacecraft telemetry, and presents a method to improve performance. The method is called summarization, a process to make vehicle data more useful to operators. Summarization enables rational trades for telemetry downlink by defining and quantitatively ranking these elements: all operational decisions, the knowledge needed to inform each decision, and all possible sensor mappings to acquire that knowledge. Summarization methods were implemented for the Sapphire microsatellite; conceptual health management and system models were developed and a degree-of-observability metric was defined. An automated tool was created to generate summarization methods from these models. Methods generated using a Sapphire model were compared against the conventional operations plan. Summarization was shown to identify the key decisions and isolate the most appropriate sensors. Secondly, a form of summarization called beacon monitoring was experimentally verified. Beacon monitoring automates the anomaly detection and notification tasks and migrates these responsibilities to the space segment. A
New challenges for text mining: mapping between text and manually curated pathways

PubMed Central

Oda, Kanae; Kim, Jin-Dong; Ohta, Tomoko; Okanohara, Daisuke; Matsuzaki, Takuya; Tateisi, Yuka; Tsujii, Jun'ichi

2008-01-01

Background Associating literature with pathways poses new challenges to the Text Mining (TM) community. There are three main challenges to this task: (1) the identification of the mapping position of a specific entity or reaction in a given pathway, (2) the recognition of the causal relationships among multiple reactions, and (3) the formulation and implementation of required inferences based on biological domain knowledge. Results To address these challenges, we constructed new resources to link the text with a model pathway; they are: the GENIA pathway corpus with event annotation and NF-kB pathway. Through their detailed analysis, we address the untapped resource, ‘bio-inference,’ as well as the differences between text and pathway representation. Here, we show the precise comparisons of their representations and the nine classes of ‘bio-inference’ schemes observed in the pathway corpus. Conclusions We believe that the creation of such rich resources and their detailed analysis is the significant first step for accelerating the research of the automatic construction of pathway from text. PMID:18426550
Model Considerations for Memory-based Automatic Music Transcription

NASA Astrophysics Data System (ADS)

Albrecht, Štěpán; Šmídl, Václav

2009-12-01

The problem of automatic music description is considered. The recorded music is modeled as a superposition of known sounds from a library weighted by unknown weights. Similar observation models are commonly used in statistics and machine learning. Many methods for estimation of the weights are available. These methods differ in the assumptions imposed on the weights. In Bayesian paradigm, these assumptions are typically expressed in the form of prior probability density function (pdf) on the weights. In this paper, commonly used assumptions about music signal are summarized and complemented by a new assumption. These assumptions are translated into pdfs and combined into a single prior density using combination of pdfs. Validity of the model is tested in simulation using synthetic data.
Text Classification for Organizational Researchers

PubMed Central

Kobayashi, Vladimer B.; Mol, Stefan T.; Berkers, Hannah A.; Kismihók, Gábor; Den Hartog, Deanne N.

2017-01-01

Organizations are increasingly interested in classifying texts or parts thereof into categories, as this enables more effective use of their information. Manual procedures for text classification work well for up to a few hundred documents. However, when the number of documents is larger, manual procedures become laborious, time-consuming, and potentially unreliable. Techniques from text mining facilitate the automatic assignment of text strings to categories, making classification expedient, fast, and reliable, which creates potential for its application in organizational research. The purpose of this article is to familiarize organizational researchers with text mining techniques from machine learning and statistics. We describe the text classification process in several roughly sequential steps, namely training data preparation, preprocessing, transformation, application of classification techniques, and validation, and provide concrete recommendations at each step. To help researchers develop their own text classifiers, the R code associated with each step is presented in a tutorial. The tutorial draws from our own work on job vacancy mining. We end the article by discussing how researchers can validate a text classification model and the associated output. PMID:29881249
Automatic inference of indexing rules for MEDLINE.

PubMed

Névéol, Aurélie; Shooshan, Sonya E; Claveau, Vincent

2008-11-19

Indexing is a crucial step in any information retrieval system. In MEDLINE, a widely used database of the biomedical literature, the indexing process involves the selection of Medical Subject Headings in order to describe the subject matter of articles. The need for automatic tools to assist MEDLINE indexers in this task is growing with the increasing number of publications being added to MEDLINE. In this paper, we describe the use and the customization of Inductive Logic Programming (ILP) to infer indexing rules that may be used to produce automatic indexing recommendations for MEDLINE indexers. Our results show that this original ILP-based approach outperforms manual rules when they exist. In addition, the use of ILP rules also improves the overall performance of the Medical Text Indexer (MTI), a system producing automatic indexing recommendations for MEDLINE. We expect the sets of ILP rules obtained in this experiment to be integrated into MTI.
Unsupervised Method for Automatic Construction of a Disease Dictionary from a Large Free Text Collection

PubMed Central

Xu, Rong; Supekar, Kaustubh; Morgan, Alex; Das, Amar; Garber, Alan

2008-01-01

Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting contextual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35–88%) over available, manually created disease terminologies. PMID:18999169
A Review of Four Text-Formatting Programs.

ERIC Educational Resources Information Center

Press, Larry

1980-01-01

The author compares four formatting programs which run under CP/M: Script-80, Text Processing System (TPS), TEX, and Textwriter III. He summarizes his experience with these programs and his detailed report on 154 program characteristics. (Author/SJL)

Condensed Representation of Sentences in Graphic Displays of Text Structures.

ERIC Educational Resources Information Center

Craven, Timothy C.

1990-01-01

Discusses ways in which sentences may be represented in a condensed form in graphic displays of a sentence dependency structure. A prototype of a text structure management system, TEXNET, is described; a quantitative evaluation of automatic abbreviation schemes is presented; full-text compression is discussed; and additional research is suggested.…
The research of full automatic oil filtering control technology of high voltage insulating oil

NASA Astrophysics Data System (ADS)

Gong, Gangjun; Zhang, Tong; Yan, Guozeng; Zhang, Han; Chen, Zhimin; Su, Chang

2017-09-01

In this paper, the design scheme of automatic oil filter control system for transformer oil in UHV substation is summarized. The scheme specifically includes the typical double tank filter connection control method of the transformer oil of the UHV substation, which distinguishes the single port and the double port connection structure of the oil tank. Finally, the design scheme of the temperature sensor and respirator is given in detail, and the detailed evaluation and application scenarios are given for reference.
The software for automatic creation of the formal grammars used by speech recognition, computer vision, editable text conversion systems, and some new functions

NASA Astrophysics Data System (ADS)

Kardava, Irakli; Tadyszak, Krzysztof; Gulua, Nana; Jurga, Stefan

2017-02-01

For more flexibility of environmental perception by artificial intelligence it is needed to exist the supporting software modules, which will be able to automate the creation of specific language syntax and to make a further analysis for relevant decisions based on semantic functions. According of our proposed approach, of which implementation it is possible to create the couples of formal rules of given sentences (in case of natural languages) or statements (in case of special languages) by helping of computer vision, speech recognition or editable text conversion system for further automatic improvement. In other words, we have developed an approach, by which it can be achieved to significantly improve the training process automation of artificial intelligence, which as a result will give us a higher level of self-developing skills independently from us (from users). At the base of our approach we have developed a software demo version, which includes the algorithm and software code for the entire above mentioned component's implementation (computer vision, speech recognition and editable text conversion system). The program has the ability to work in a multi - stream mode and simultaneously create a syntax based on receiving information from several sources.
Automatic inference of indexing rules for MEDLINE

PubMed Central

Névéol, Aurélie; Shooshan, Sonya E; Claveau, Vincent

2008-01-01

Background: Indexing is a crucial step in any information retrieval system. In MEDLINE, a widely used database of the biomedical literature, the indexing process involves the selection of Medical Subject Headings in order to describe the subject matter of articles. The need for automatic tools to assist MEDLINE indexers in this task is growing with the increasing number of publications being added to MEDLINE. Methods: In this paper, we describe the use and the customization of Inductive Logic Programming (ILP) to infer indexing rules that may be used to produce automatic indexing recommendations for MEDLINE indexers. Results: Our results show that this original ILP-based approach outperforms manual rules when they exist. In addition, the use of ILP rules also improves the overall performance of the Medical Text Indexer (MTI), a system producing automatic indexing recommendations for MEDLINE. Conclusion: We expect the sets of ILP rules obtained in this experiment to be integrated into MTI. PMID:19025687
Effective Replays and Summarization of Virtual Experiences

PubMed Central

Ponto, Kevin; Kohlmann, Joe; Gleicher, Michael

2012-01-01

Direct replays of the experience of a user in a virtual environment are difficult for others to watch due to unnatural camera motions. We present methods for replaying and summarizing these egocentric experiences that effectively communicate the users observations while reducing unwanted camera movements. Our approach summarizes the viewpoint path as a concise sequence of viewpoints that cover the same parts of the scene. The core of our approach is a novel content dependent metric that can be used to identify similarities between viewpoints. This enables viewpoints to be grouped by similar contextual view information and provides a means to generate novel viewpoints that can encapsulate a series of views. These resulting encapsulated viewpoints are used to synthesize new camera paths that convey the content of the original viewers experience. Projecting the initial movement of the user back on the scene can be used to convey the details of their observations, and the extracted viewpoints can serve as bookmarks for control or analysis. Finally we present performance analysis along with two forms of validation to test whether the extracted viewpoints are representative of the viewers original observations and to test for the overall effectiveness of the presented replay methods. PMID:22402688
Automatic fluid dispenser

NASA Technical Reports Server (NTRS)

Sakellaris, P. C. (Inventor)

1977-01-01

Fluid automatically flows to individual dispensing units at predetermined times from a fluid supply and is available only for a predetermined interval of time after which an automatic control causes the fluid to drain from the individual dispensing units. Fluid deprivation continues until the beginning of a new cycle when the fluid is once again automatically made available at the individual dispensing units.
Semi-Automatic Grading of Students' Answers Written in Free Text

ERIC Educational Resources Information Center

Escudeiro, Nuno; Escudeiro, Paula; Cruz, Augusto

2011-01-01

The correct grading of free text answers to exam questions during an assessment process is time consuming and subject to fluctuations in the application of evaluation criteria, particularly when the number of answers is high (in the hundreds). In consequence of these fluctuations, inherent to human nature, and largely determined by emotional…
Text Mining the History of Medicine.

PubMed

Thompson, Paul; Batista-Navarro, Riza Theresa; Kontonatsios, Georgios; Carter, Jacob; Toon, Elizabeth; McNaught, John; Timmermann, Carsten; Worboys, Michael; Ananiadou, Sophia

2016-01-01

Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, according to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid 19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while
Text Mining the History of Medicine

PubMed Central

Thompson, Paul; Batista-Navarro, Riza Theresa; Kontonatsios, Georgios; Carter, Jacob; Toon, Elizabeth; McNaught, John; Timmermann, Carsten; Worboys, Michael; Ananiadou, Sophia

2016-01-01

Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, according to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid 19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while
Automatic categorization of land-water cover types of the Green Swamp, Florida, using Skylab multispectral scanner (S-192) data

NASA Technical Reports Server (NTRS)

Coker, A. E.; Higer, A. L.; Rogers, R. H.; Shah, N. J.; Reed, L. E.; Walker, S.

1975-01-01

The techniques used and the results achieved in the successful application of Skylab Multispectral Scanner (EREP S-192) high-density digital tape data for the automatic categorizing and mapping of land-water cover types in the Green Swamp of Florida were summarized. Data was provided from Skylab pass number 10 on 13 June 1973. Significant results achieved included the automatic mapping of a nine-category and a three-category land-water cover map of the Green Swamp. The land-water cover map was used to make interpretations of a hydrologic condition in the Green Swamp. This type of use marks a significant breakthrough in the processing and utilization of EREP S-192 data.
Automatic generation of pictorial transcripts of video programs

NASA Astrophysics Data System (ADS)

Shahraray, Behzad; Gibbon, David C.

1995-03-01

An automatic authoring system for the generation of pictorial transcripts of video programs which are accompanied by closed caption information is presented. A number of key frames, each of which represents the visual information in a segment of the video (i.e., a scene), are selected automatically by performing a content-based sampling of the video program. The textual information is recovered from the closed caption signal and is initially segmented based on its implied temporal relationship with the video segments. The text segmentation boundaries are then adjusted, based on lexical analysis and/or caption control information, to account for synchronization errors due to possible delays in the detection of scene boundaries or the transmission of the caption information. The closed caption text is further refined through linguistic processing for conversion to lower- case with correct capitalization. The key frames and the related text generate a compact multimedia presentation of the contents of the video program which lends itself to efficient storage and transmission. This compact representation can be viewed on a computer screen, or used to generate the input to a commercial text processing package to generate a printed version of the program.
Web-based UMLS concept retrieval by automatic text scanning: a comparison of two methods.

PubMed

Brandt, C; Nadkarni, P

2001-01-01

The Web is increasingly the medium of choice for multi-user application program delivery. Yet selection of an appropriate programming environment for rapid prototyping, code portability, and maintainability remain issues. We summarize our experience on the conversion of a LISP Web application, Search/SR to a new, functionally identical application, Search/SR-ASP using a relational database and active server pages (ASP) technology. Our results indicate that provision of easy access to database engines and external objects is almost essential for a development environment to be considered viable for rapid and robust application delivery. While LISP itself is a robust language, its use in Web applications may be hard to justify given that current vendor implementations do not provide such functionality. Alternative, currently available scripting environments for Web development appear to have most of LISP's advantages and few of its disadvantages.
Automatic detection and recognition of signs from natural scenes.

PubMed

Chen, Xilin; Yang, Jie; Zhang, Jing; Waibel, Alex

2004-01-01

In this paper, we present an approach to automatic detection and recognition of signs from natural scenes, and its application to a sign translation task. The proposed approach embeds multiresolution and multiscale edge detection, adaptive searching, color analysis, and affine rectification in a hierarchical framework for sign detection, with different emphases at each phase to handle the text in different sizes, orientations, color distributions and backgrounds. We use affine rectification to recover deformation of the text regions caused by an inappropriate camera view angle. The procedure can significantly improve text detection rate and optical character recognition (OCR) accuracy. Instead of using binary information for OCR, we extract features from an intensity image directly. We propose a local intensity normalization method to effectively handle lighting variations, followed by a Gabor transform to obtain local features, and finally a linear discriminant analysis (LDA) method for feature selection. We have applied the approach in developing a Chinese sign translation system, which can automatically detect and recognize Chinese signs as input from a camera, and translate the recognized text into English.
Attention to Automatic Movements in Parkinson's Disease: Modified Automatic Mode in the Striatum

PubMed Central

Wu, Tao; Liu, Jun; Zhang, Hejia; Hallett, Mark; Zheng, Zheng; Chan, Piu

2015-01-01

We investigated neural correlates when attending to a movement that could be made automatically in healthy subjects and Parkinson's disease (PD) patients. Subjects practiced a visuomotor association task until they could perform it automatically, and then directed their attention back to the automated task. Functional MRI was obtained during the early-learning, automatic stage, and when re-attending. In controls, attention to automatic movement induced more activation in the dorsolateral prefrontal cortex (DLPFC), anterior cingulate cortex, and rostral supplementary motor area. The motor cortex received more influence from the cortical motor association regions. In contrast, the pattern of the activity and connectivity of the striatum remained at the level of the automatic stage. In PD patients, attention enhanced activity in the DLPFC, premotor cortex, and cerebellum, but the connectivity from the putamen to the motor cortex decreased. Our findings demonstrate that, in controls, when a movement achieves the automatic stage, attention can influence the attentional networks and cortical motor association areas, but has no apparent effect on the striatum. In PD patients, attention induces a shift from the automatic mode back to the controlled pattern within the striatum. The shifting between controlled and automatic behaviors relies in part on striatal function. PMID:24925772
Semi Automatic Ontology Instantiation in the domain of Risk Management

NASA Astrophysics Data System (ADS)

Makki, Jawad; Alquier, Anne-Marie; Prince, Violaine

One of the challenging tasks in the context of Ontological Engineering is to automatically or semi-automatically support the process of Ontology Learning and Ontology Population from semi-structured documents (texts). In this paper we describe a Semi-Automatic Ontology Instantiation method from natural language text, in the domain of Risk Management. This method is composed from three steps 1 ) Annotation with part-of-speech tags, 2) Semantic Relation Instances Extraction, 3) Ontology instantiation process. It's based on combined NLP techniques using human intervention between steps 2 and 3 for control and validation. Since it heavily relies on linguistic knowledge it is not domain dependent which is a good feature for portability between the different fields of risk management application. The proposed methodology uses the ontology of the PRIMA1 project (supported by the European community) as a Generic Domain Ontology and populates it via an available corpus. A first validation of the approach is done through an experiment with Chemical Fact Sheets from Environmental Protection Agency2.
[Wearable Automatic External Defibrillators].

PubMed

Luo, Huajie; Luo, Zhangyuan; Jin, Xun; Zhang, Leilei; Wang, Changjin; Zhang, Wenzan; Tu, Quan

2015-11-01

Defibrillation is the most effective method of treating ventricular fibrillation(VF), this paper introduces wearable automatic external defibrillators based on embedded system which includes EGG measurements, bioelectrical impedance measurement, discharge defibrillation module, which can automatic identify VF signal, biphasic exponential waveform defibrillation discharge. After verified by animal tests, the device can realize EGG acquisition and automatic identification. After identifying the ventricular fibrillation signal, it can automatic defibrillate to abort ventricular fibrillation and to realize the cardiac electrical cardioversion.
Linguistically informed digital fingerprints for text

NASA Astrophysics Data System (ADS)

Uzuner, Özlem

2006-02-01

Digital fingerprinting, watermarking, and tracking technologies have gained importance in the recent years in response to growing problems such as digital copyright infringement. While fingerprints and watermarks can be generated in many different ways, use of natural language processing for these purposes has so far been limited. Measuring similarity of literary works for automatic copyright infringement detection requires identifying and comparing creative expression of content in documents. In this paper, we present a linguistic approach to automatically fingerprinting novels based on their expression of content. We use natural language processing techniques to generate "expression fingerprints". These fingerprints consist of both syntactic and semantic elements of language, i.e., syntactic and semantic elements of expression. Our experiments indicate that syntactic and semantic elements of expression enable accurate identification of novels and their paraphrases, providing a significant improvement over techniques used in text classification literature for automatic copy recognition. We show that these elements of expression can be used to fingerprint, label, or watermark works; they represent features that are essential to the character of works and that remain fairly consistent in the works even when works are paraphrased. These features can be directly extracted from the contents of the works on demand and can be used to recognize works that would not be correctly identified either in the absence of pre-existing labels or by verbatim-copy detectors.
MedSynDiKATe--design considerations for an ontology-based medical text understanding system.

PubMed Central

Hahn, U.; Romacker, M.; Schulz, S.

2000-01-01

MedSynDiKATe is a natural language processor for automatically acquiring knowledge from medical finding reports. The content of these documents is transferred to formal representation structures which constitute a corresponding text knowledge base. The general system architecture we present integrates requirements from the analysis of single sentences, as well as those of referentially linked sentences forming cohesive texts. The strong demands MedSynDiKATe poses to the availability of expressive knowledge sources are accounted for by two alternative approaches to (semi)automatic ontology engineering. PMID:11079899
POI Summarization by Aesthetics Evaluation From Crowd Source Social Media.

PubMed

Qian, Xueming; Li, Cheng; Lan, Ke; Hou, Xingsong; Li, Zhetao; Han, Junwei

2018-03-01

Place-of-Interest (POI) summarization by aesthetics evaluation can recommend a set of POI images to the user and it is significant in image retrieval. In this paper, we propose a system that summarizes a collection of POI images regarding both aesthetics and diversity of the distribution of cameras. First, we generate visual albums by a coarse-to-fine POI clustering approach and then generate 3D models for each album by the collected images from social media. Second, based on the 3D to 2D projection relationship, we select candidate photos in terms of the proposed crowd source saliency model. Third, in order to improve the performance of aesthetic measurement model, we propose a crowd-sourced saliency detection approach by exploring the distribution of salient regions in the 3D model. Then, we measure the composition aesthetics of each image and we explore crowd source salient feature to yield saliency map, based on which, we propose an adaptive image adoption approach. Finally, we combine the diversity and the aesthetics to recommend aesthetic pictures. Experimental results show that the proposed POI summarization approach can return images with diverse camera distributions and aesthetics.
Seqenv: linking sequences to environments through text mining.

PubMed

Sinclair, Lucas; Ijaz, Umer Z; Jensen, Lars Juhl; Coolen, Marco J L; Gubry-Rangin, Cecile; Chroňáková, Alica; Oulas, Anastasis; Pavloudi, Christina; Schnetzer, Julia; Weimann, Aaron; Ijaz, Ali; Eiler, Alexander; Quince, Christopher; Pafilis, Evangelos

2016-01-01

Understanding the distribution of taxa and associated traits across different environments is one of the central questions in microbial ecology. High-throughput sequencing (HTS) studies are presently generating huge volumes of data to address this biogeographical topic. However, these studies are often focused on specific environment types or processes leading to the production of individual, unconnected datasets. The large amounts of legacy sequence data with associated metadata that exist can be harnessed to better place the genetic information found in these surveys into a wider environmental context. Here we introduce a software program, seqenv, to carry out precisely such a task. It automatically performs similarity searches of short sequences against the "nt" nucleotide database provided by NCBI and, out of every hit, extracts-if it is available-the textual metadata field. After collecting all the isolation sources from all the search results, we run a text mining algorithm to identify and parse words that are associated with the Environmental Ontology (EnvO) controlled vocabulary. This, in turn, enables us to determine both in which environments individual sequences or taxa have previously been observed and, by weighted summation of those results, to summarize complete samples. We present two demonstrative applications of seqenv to a survey of ammonia oxidizing archaea as well as to a plankton paleome dataset from the Black Sea. These demonstrate the ability of the tool to reveal novel patterns in HTS and its utility in the fields of environmental source tracking, paleontology, and studies of microbial biogeography. To install seqenv, go to: https://github.com/xapple/seqenv.

Clinical Summarization Capabilities of Commercially-available and Internally-developed Electronic Health Records

PubMed Central

Laxmisan, A.; McCoy, A.B.; Wright, A.; Sittig, D.F.

2012-01-01

Objective Clinical summarization, the process by which relevant patient information is electronically summarized and presented at the point of care, is of increasing importance given the increasing volume of clinical data in electronic health record systems (EHRs). There is a paucity of research on electronic clinical summarization, including the capabilities of currently available EHR systems. Methods We compared different aspects of general clinical summary screens used in twelve different EHR systems using a previously described conceptual model: AORTIS (Aggregation, Organization, Reduction, Interpretation and Synthesis). Results We found a wide variation in the EHRs’ summarization capabilities: all systems were capable of simple aggregation and organization of limited clinical content, but only one demonstrated an ability to synthesize information from the data. Conclusion Improvement of the clinical summary screen functionality for currently available EHRs is necessary. Further research should identify strategies and methods for creating easy to use, well-designed clinical summary screens that aggregate, organize and reduce all pertinent patient information as well as provide clinical interpretations and synthesis as required. PMID:22468161
RevManHAL: towards automatic text generation in systematic reviews.

PubMed

Torres Torres, Mercedes; Adams, Clive E

2017-02-09

Systematic reviews are a key part of healthcare evaluation. They involve important painstaking but repetitive work. A major producer of systematic reviews, the Cochrane Collaboration, employs Review Manager (RevMan) programme-a software which assists reviewers and produces XML-structured files. This paper describes an add-on programme (RevManHAL) which helps auto-generate the abstract, results and discussion sections of RevMan-generated reviews in multiple languages. The paper also describes future developments for RevManHAL. RevManHAL was created in Java using NetBeans by a programmer working full time for 2 months. The resulting open-source programme uses editable phrase banks to envelop text/numbers from within the prepared RevMan file in formatted readable text of a chosen language. In this way, considerable parts of the review's 'abstract', 'results' and 'discussion' sections are created and a phrase added to 'acknowledgements'. RevManHAL's output needs to be checked by reviewers, but already, from our experience within the Cochrane Schizophrenia Group (200 maintained reviews, 900 reviewers), RevManHAL has saved much time which is better employed thinking about the meaning of the data rather than restating them. Many more functions will become possible as review writing becomes increasingly automated.
Automatic Condensation of Electronic Publications by Sentence Selection.

ERIC Educational Resources Information Center

Brandow, Ronald; And Others

1995-01-01

Describes a system that performs automatic summaries of news from a large commercial news service encompassing 41 different publications. This system was compared to a system that used only the lead sentences of the texts. Lead-based summaries significantly outperformed the sentence-selection summaries. (AEF)
Semi-automatic object geometry estimation for image personalization

NASA Astrophysics Data System (ADS)

Ding, Hengzhou; Bala, Raja; Fan, Zhigang; Eschbach, Reiner; Bouman, Charles A.; Allebach, Jan P.

2010-01-01

Digital printing brings about a host of benefits, one of which is the ability to create short runs of variable, customized content. One form of customization that is receiving much attention lately is in photofinishing applications, whereby personalized calendars, greeting cards, and photo books are created by inserting text strings into images. It is particularly interesting to estimate the underlying geometry of the surface and incorporate the text into the image content in an intelligent and natural way. Current solutions either allow fixed text insertion schemes into preprocessed images, or provide manual text insertion tools that are time consuming and aimed only at the high-end graphic designer. It would thus be desirable to provide some level of automation in the image personalization process. We propose a semi-automatic image personalization workflow which includes two scenarios: text insertion and text replacement. In both scenarios, the underlying surfaces are assumed to be planar. A 3-D pinhole camera model is used for rendering text, whose parameters are estimated by analyzing existing structures in the image. Techniques in image processing and computer vison such as the Hough transform, the bilateral filter, and connected component analysis are combined, along with necessary user inputs. In particular, the semi-automatic workflow is implemented as an image personalization tool, which is presented in our companion paper.1 Experimental results including personalized images for both scenarios are shown, which demonstrate the effectiveness of our algorithms.
Supporting the annotation of chronic obstructive pulmonary disease (COPD) phenotypes with text mining workflows.

PubMed

Fu, Xiao; Batista-Navarro, Riza; Rak, Rafal; Ananiadou, Sophia

2015-01-01

Chronic obstructive pulmonary disease (COPD) is a life-threatening lung disorder whose recent prevalence has led to an increasing burden on public healthcare. Phenotypic information in electronic clinical records is essential in providing suitable personalised treatment to patients with COPD. However, as phenotypes are often "hidden" within free text in clinical records, clinicians could benefit from text mining systems that facilitate their prompt recognition. This paper reports on a semi-automatic methodology for producing a corpus that can ultimately support the development of text mining tools that, in turn, will expedite the process of identifying groups of COPD patients. A corpus of 30 full-text papers was formed based on selection criteria informed by the expertise of COPD specialists. We developed an annotation scheme that is aimed at producing fine-grained, expressive and computable COPD annotations without burdening our curators with a highly complicated task. This was implemented in the Argo platform by means of a semi-automatic annotation workflow that integrates several text mining tools, including a graphical user interface for marking up documents. When evaluated using gold standard (i.e., manually validated) annotations, the semi-automatic workflow was shown to obtain a micro-averaged F-score of 45.70% (with relaxed matching). Utilising the gold standard data to train new concept recognisers, we demonstrated that our corpus, although still a work in progress, can foster the development of significantly better performing COPD phenotype extractors. We describe in this work the means by which we aim to eventually support the process of COPD phenotype curation, i.e., by the application of various text mining tools integrated into an annotation workflow. Although the corpus being described is still under development, our results thus far are encouraging and show great potential in stimulating the development of further automatic COPD phenotype extractors.
Automatic textual annotation of video news based on semantic visual object extraction

NASA Astrophysics Data System (ADS)

Boujemaa, Nozha; Fleuret, Francois; Gouet, Valerie; Sahbi, Hichem

2003-12-01

In this paper, we present our work for automatic generation of textual metadata based on visual content analysis of video news. We present two methods for semantic object detection and recognition from a cross modal image-text thesaurus. These thesaurus represent a supervised association between models and semantic labels. This paper is concerned with two semantic objects: faces and Tv logos. In the first part, we present our work for efficient face detection and recogniton with automatic name generation. This method allows us also to suggest the textual annotation of shots close-up estimation. On the other hand, we were interested to automatically detect and recognize different Tv logos present on incoming different news from different Tv Channels. This work was done jointly with the French Tv Channel TF1 within the "MediaWorks" project that consists on an hybrid text-image indexing and retrieval plateform for video news.
A New Method for Measuring Text Similarity in Learning Management Systems Using WordNet

ERIC Educational Resources Information Center

Alkhatib, Bassel; Alnahhas, Ammar; Albadawi, Firas

2014-01-01

As text sources are getting broader, measuring text similarity is becoming more compelling. Automatic text classification, search engines and auto answering systems are samples of applications that rely on text similarity. Learning management systems (LMS) are becoming more important since electronic media is getting more publicly available. As…
DeepText2GO: Improving large-scale protein function prediction with deep semantic text representation.

PubMed

You, Ronghui; Huang, Xiaodi; Zhu, Shanfeng

2018-06-06

As of April 2018, UniProtKB has collected more than 115 million protein sequences. Less than 0.15% of these proteins, however, have been associated with experimental GO annotations. As such, the use of automatic protein function prediction (AFP) to reduce this huge gap becomes increasingly important. The previous studies conclude that sequence homology based methods are highly effective in AFP. In addition, mining motif, domain, and functional information from protein sequences has been found very helpful for AFP. Other than sequences, alternative information sources such as text, however, may be useful for AFP as well. Instead of using BOW (bag of words) representation in traditional text-based AFP, we propose a new method called DeepText2GO that relies on deep semantic text representation, together with different kinds of available protein information such as sequence homology, families, domains, and motifs, to improve large-scale AFP. Furthermore, DeepText2GO integrates text-based methods with sequence-based ones by means of a consensus approach. Extensive experiments on the benchmark dataset extracted from UniProt/SwissProt have demonstrated that DeepText2GO significantly outperformed both text-based and sequence-based methods, validating its superiority. Copyright © 2018 Elsevier Inc. All rights reserved.
Video Analytics for Indexing, Summarization and Searching of Video Archives

DOE Office of Scientific and Technical Information (OSTI.GOV)

Trease, Harold E.; Trease, Lynn L.

This paper will be submitted to the proceedings The Eleventh IASTED International Conference on. Signal and Image Processing. Given a video or video archive how does one effectively and quickly summarize, classify, and search the information contained within the data? This paper addresses these issues by describing a process for the automated generation of a table-of-contents and keyword, topic-based index tables that can be used to catalogue, summarize, and search large amounts of video data. Having the ability to index and search the information contained within the videos, beyond just metadata tags, provides a mechanism to extract and identify "useful"more » content from image and video data.« less
Text-Based Writing of Low-Skilled Postsecondary Students: Relation to Comprehension, Self-Efficacy and Teacher Judgments

ERIC Educational Resources Information Center

Perin, Dolores; Lauterbach, Mark; Raufman, Julia; Kalamkarian, Hoori Santikian

2017-01-01

Summarization and persuasive writing are important in postsecondary education and often require the use of source text. However, students entering college with low literacy skills often find this type of writing difficult. The present study compared predictors of performance on text-based summarization and persuasive writing in a sample of…
The automatic component of habit in health behavior: habit as cue-contingent automaticity.

PubMed

Orbell, Sheina; Verplanken, Bas

2010-07-01

Habit might be usefully characterized as a form of automaticity that involves the association of a cue and a response. Three studies examined habitual automaticity in regard to different aspects of the cue-response relationship characteristic of unhealthy and healthy habits. In each study, habitual automaticity was assessed by the Self-Report Habit Index (SRHI). In Study 1 SRHI scores correlated with attentional bias to smoking cues in a Stroop task. Study 2 examined the ability of a habit cue to elicit an unwanted habit response. In a prospective field study, habitual automaticity in relation to smoking when drinking alcohol in a licensed public house (pub) predicted the likelihood of cigarette-related action slips 2 months later after smoking in pubs had become illegal. In Study 3 experimental group participants formed an implementation intention to floss in response to a specified situational cue. Habitual automaticity of dental flossing was rapidly enhanced compared to controls. The studies provided three different demonstrations of the importance of cues in the automatic operation of habits. Habitual automaticity assessed by the SRHI captured aspects of a habit that go beyond mere frequency or consistency of the behavior. PsycINFO Database Record (c) 2010 APA, all rights reserved.
Investigation of Learners' Perceptions for Video Summarization and Recommendation

ERIC Educational Resources Information Center

Yang, Jie Chi; Chen, Sherry Y.

2012-01-01

Recently, multimedia-based learning is widespread in educational settings. A number of studies investigate how to develop effective techniques to manage a huge volume of video sources, such as summarization and recommendation. However, few studies examine how these techniques affect learners' perceptions in multimedia learning systems. This…
[Modeling and implementation method for the automatic biochemistry analyzer control system].

PubMed

Wang, Dong; Ge, Wan-cheng; Song, Chun-lin; Wang, Yun-guang

2009-03-01

In this paper the system structure The automatic biochemistry analyzer is a necessary instrument for clinical diagnostics. First of is analyzed. The system problems description and the fundamental principles for dispatch are brought forward. Then this text puts emphasis on the modeling for the automatic biochemistry analyzer control system. The objects model and the communications model are put forward. Finally, the implementation method is designed. It indicates that the system based on the model has good performance.
Attentive Reading With Constrained Summarization Adapted to Address Written Discourse in People With Mild Aphasia.

PubMed

Obermeyer, Jessica A; Edmonds, Lisa A

2018-03-01

The purpose of this study was to examine the preliminary efficacy of Attentive Reading and Constrained Summarization-Written (ARCS-W) in people with mild aphasia. ARCS-W adapts an existing treatment, ARCS (Rogalski & Edmonds, 2008), to address discourse level writing in mild aphasia. ARCS-W focuses on the cognitive and linguistic skills required for discourse production. This study was a within-subject pre-postdesign. Three people with mild aphasia participated. ARCS-W integrates attentive reading or listening with constrained summarization of discourse level material in spoken and written modalities. Outcomes included macro- (main concepts) and microlinguistic (correct information units, complete utterances) discourse measures, confrontation naming, aphasia severity, and functional communication. All 3 participants demonstrated some generalization to untrained spoken and written discourse at the word, sentence, and text levels. Reduced aphasia severity and/or increased functional communication and confrontation naming were also observed in some participants. The findings of this study provide preliminary evidence of the efficacy of ARCS-W to improve spoken and written discourse in mild aphasia. Different generalization patterns suggest different mechanisms of improvement. Further research and replication are required to better understand how ARCS-W can impact discourse abilities.
Use of SI Metric Units Misrepresented in College Physics Texts.

ERIC Educational Resources Information Center

Hooper, William

1980-01-01

Summarizes results of a survey that examined 13 textbooks claiming to use SI units. Tables present data concerning the SI and non-SI units actually used in each text in discussion of fluid pressure and thermal energy, and data concerning which texts do and do not use SI as claimed. (CS)
Adaptive Greedy Dictionary Selection for Web Media Summarization.

PubMed

Cong, Yang; Liu, Ji; Sun, Gan; You, Quanzeng; Li, Yuncheng; Luo, Jiebo

2017-01-01

Initializing an effective dictionary is an indispensable step for sparse representation. In this paper, we focus on the dictionary selection problem with the objective to select a compact subset of basis from original training data instead of learning a new dictionary matrix as dictionary learning models do. We first design a new dictionary selection model via l 2,0 norm. For model optimization, we propose two methods: one is the standard forward-backward greedy algorithm, which is not suitable for large-scale problems; the other is based on the gradient cues at each forward iteration and speeds up the process dramatically. In comparison with the state-of-the-art dictionary selection models, our model is not only more effective and efficient, but also can control the sparsity. To evaluate the performance of our new model, we select two practical web media summarization problems: 1) we build a new data set consisting of around 500 users, 3000 albums, and 1 million images, and achieve effective assisted albuming based on our model and 2) by formulating the video summarization problem as a dictionary selection issue, we employ our model to extract keyframes from a video sequence in a more flexible way. Generally, our model outperforms the state-of-the-art methods in both these two tasks.
Semantic Annotation of Complex Text Structures in Problem Reports

NASA Technical Reports Server (NTRS)

Malin, Jane T.; Throop, David R.; Fleming, Land D.

2011-01-01

Text analysis is important for effective information retrieval from databases where the critical information is embedded in text fields. Aerospace safety depends on effective retrieval of relevant and related problem reports for the purpose of trend analysis. The complex text syntax in problem descriptions has limited statistical text mining of problem reports. The presentation describes an intelligent tagging approach that applies syntactic and then semantic analysis to overcome this problem. The tags identify types of problems and equipment that are embedded in the text descriptions. The power of these tags is illustrated in a faceted searching and browsing interface for problem report trending that combines automatically generated tags with database code fields and temporal information.
Automatic reconstruction of a bacterial regulatory network using Natural Language Processing

PubMed Central

Rodríguez-Penagos, Carlos; Salgado, Heladia; Martínez-Flores, Irma; Collado-Vides, Julio

2007-01-01

Background Manual curation of biological databases, an expensive and labor-intensive process, is essential for high quality integrated data. In this paper we report the implementation of a state-of-the-art Natural Language Processing system that creates computer-readable networks of regulatory interactions directly from different collections of abstracts and full-text papers. Our major aim is to understand how automatic annotation using Text-Mining techniques can complement manual curation of biological databases. We implemented a rule-based system to generate networks from different sets of documents dealing with regulation in Escherichia coli K-12. Results Performance evaluation is based on the most comprehensive transcriptional regulation database for any organism, the manually-curated RegulonDB, 45% of which we were able to recreate automatically. From our automated analysis we were also able to find some new interactions from papers not already curated, or that were missed in the manual filtering and review of the literature. We also put forward a novel Regulatory Interaction Markup Language better suited than SBML for simultaneously representing data of interest for biologists and text miners. Conclusion Manual curation of the output of automatic processing of text is a good way to complement a more detailed review of the literature, either for validating the results of what has been already annotated, or for discovering facts and information that might have been overlooked at the triage or curation stages. PMID:17683642
Motor automaticity in Parkinson’s disease

PubMed Central

Wu, Tao; Hallett, Mark; Chan, Piu

2017-01-01

Bradykinesia is the most important feature contributing to motor difficulties in Parkinson’s disease (PD). However, the pathophysiology underlying bradykinesia is not fully understood. One important aspect is that PD patients have difficulty in performing learned motor skills automatically, but this problem has been generally overlooked. Here we review motor automaticity associated motor deficits in PD, such as reduced arm swing, decreased stride length, freezing of gait, micrographia and reduced facial expression. Recent neuroimaging studies have revealed some neural mechanisms underlying impaired motor automaticity in PD, including less efficient neural coding of movement, failure to shift automated motor skills to the sensorimotor striatum, instability of the automatic mode within the striatum, and use of attentional control and/or compensatory efforts to execute movements usually performed automatically in healthy people. PD patients lose previously acquired automatic skills due to their impaired sensorimotor striatum, and have difficulty in acquiring new automatic skills or restoring lost motor skills. More investigations on the pathophysiology of motor automaticity, the effect of L-dopa or surgical treatments on automaticity, and the potential role of using measures of automaticity in early diagnosis of PD would be valuable. PMID:26102020
Automatic Figure Ranking and User Interfacing for Intelligent Figure Search

PubMed Central

Yu, Hong; Liu, Feifan; Ramesh, Balaji Polepalli

2010-01-01

Background Figures are important experimental results that are typically reported in full-text bioscience articles. Bioscience researchers need to access figures to validate research facts and to formulate or to test novel research hypotheses. On the other hand, the sheer volume of bioscience literature has made it difficult to access figures. Therefore, we are developing an intelligent figure search engine (http://figuresearch.askhermes.org). Existing research in figure search treats each figure equally, but we introduce a novel concept of “figure ranking”: figures appearing in a full-text biomedical article can be ranked by their contribution to the knowledge discovery. Methodology/Findings We empirically validated the hypothesis of figure ranking with over 100 bioscience researchers, and then developed unsupervised natural language processing (NLP) approaches to automatically rank figures. Evaluating on a collection of 202 full-text articles in which authors have ranked the figures based on importance, our best system achieved a weighted error rate of 0.2, which is significantly better than several other baseline systems we explored. We further explored a user interfacing application in which we built novel user interfaces (UIs) incorporating figure ranking, allowing bioscience researchers to efficiently access important figures. Our evaluation results show that 92% of the bioscience researchers prefer as the top two choices the user interfaces in which the most important figures are enlarged. With our automatic figure ranking NLP system, bioscience researchers preferred the UIs in which the most important figures were predicted by our NLP system than the UIs in which the most important figures were randomly assigned. In addition, our results show that there was no statistical difference in bioscience researchers' preference in the UIs generated by automatic figure ranking and UIs by human ranking annotation. Conclusion/Significance The evaluation results

Automatic theory generation from analyst text files using coherence networks

NASA Astrophysics Data System (ADS)

Shaffer, Steven C.

2014-05-01

This paper describes a three-phase process of extracting knowledge from analyst textual reports. Phase 1 involves performing natural language processing on the source text to extract subject-predicate-object triples. In phase 2, these triples are then fed into a coherence network analysis process, using a genetic algorithm optimization. Finally, the highest-value sub networks are processed into a semantic network graph for display. Initial work on a well- known data set (a Wikipedia article on Abraham Lincoln) has shown excellent results without any specific tuning. Next, we ran the process on the SYNthetic Counter-INsurgency (SYNCOIN) data set, developed at Penn State, yielding interesting and potentially useful results.
Adapting a large database of point of care summarized guidelines: a process description.

PubMed

Delvaux, Nicolas; Van de Velde, Stijn; Aertgeerts, Bert; Goossens, Martine; Fauquert, Benjamin; Kunnamo, Ilka; Van Royen, Paul

2017-02-01

Questions posed at the point of care (POC) can be answered using POC summarized guidelines. To implement a national POC information resource, we subscribed to a large database of POC summarized guidelines to complement locally available guidelines. Our challenge was in developing a sustainable strategy for adapting almost 1000 summarized guidelines. The aim of this paper was to describe our process for adapting a database of POC summarized guidelines. An adaptation process based on the ADAPTE framework was tailored to be used by a heterogeneous group of participants. Guidelines were assessed on content and on applicability to the Belgian context. To improve efficiency, we chose to first aim our efforts towards those guidelines most important to primary care doctors. Over a period of 3 years, we screened about 80% of 1000 international summarized guidelines. For those guidelines identified as most important for primary care doctors, we noted that in about half of the cases, remarks were made concerning content. On the other hand, at least two-thirds of all screened guidelines required no changes when evaluating their local usability. Adapting a large body of POC summarized guidelines using a formal adaptation process is possible, even when faced with limited resources. This can be done by creating an efficient and collaborative effort and ensuring user-friendly procedures. Our experiences show that even though in most cases guidelines can be adopted without adaptations, careful review of guidelines developed in a different context remains necessary. Streamlining international efforts in adapting international POC information resources and adopting similar adaptation processes may lessen duplication efforts and prove more cost-effective. © 2015 The Authors. Journal of Evaluation in Clinical Practice published by John Wiley & Sons, Ltd.
Severity Summarization and Just in Time Alert Computation in mHealth Monitoring.

PubMed

Pathinarupothi, Rahul Krishnan; Alangot, Bithin; Rangan, Ekanath

2017-01-01

Mobile health is fast evolving into a practical solution to remotely monitor high-risk patients and deliver timely intervention in case of emergencies. Building upon our previous work on a fast and power efficient summarization framework for remote health monitoring applications, called RASPRO (Rapid Alerts Summarization for Effective Prognosis), we have developed a real-time criticality detection technique, which ensures meeting physician defined interventional time. We also present the results from initial testing of this technique.
Mining free-text medical records for companion animal enteric syndrome surveillance.

PubMed

Anholt, R M; Berezowski, J; Jamal, I; Ribble, C; Stephen, C

2014-03-01

Large amounts of animal health care data are present in veterinary electronic medical records (EMR) and they present an opportunity for companion animal disease surveillance. Veterinary patient records are largely in free-text without clinical coding or fixed vocabulary. Text-mining, a computer and information technology application, is needed to identify cases of interest and to add structure to the otherwise unstructured data. In this study EMR's were extracted from veterinary management programs of 12 participating veterinary practices and stored in a data warehouse. Using commercially available text-mining software (WordStat™), we developed a categorization dictionary that could be used to automatically classify and extract enteric syndrome cases from the warehoused electronic medical records. The diagnostic accuracy of the text-miner for retrieving cases of enteric syndrome was measured against human reviewers who independently categorized a random sample of 2500 cases as enteric syndrome positive or negative. Compared to the reviewers, the text-miner retrieved cases with enteric signs with a sensitivity of 87.6% (95%CI, 80.4-92.9%) and a specificity of 99.3% (95%CI, 98.9-99.6%). Automatic and accurate detection of enteric syndrome cases provides an opportunity for community surveillance of enteric pathogens in companion animals. Copyright © 2014 Elsevier B.V. All rights reserved.
Ontology-based structured cosine similarity in document summarization: with applications to mobile audio-based knowledge management.

PubMed

Yuan, Soe-Tsyr; Sun, Jerry

2005-10-01

Development of algorithms for automated text categorization in massive text document sets is an important research area of data mining and knowledge discovery. Most of the text-clustering methods were grounded in the term-based measurement of distance or similarity, ignoring the structure of the documents. In this paper, we present a novel method named structured cosine similarity (SCS) that furnishes document clustering with a new way of modeling on document summarization, considering the structure of the documents so as to improve the performance of document clustering in terms of quality, stability, and efficiency. This study was motivated by the problem of clustering speech documents (of no rich document features) attained from the wireless experience oral sharing conducted by mobile workforce of enterprises, fulfilling audio-based knowledge management. In other words, this problem aims to facilitate knowledge acquisition and sharing by speech. The evaluations also show fairly promising results on our method of structured cosine similarity.
Method for gathering and summarizing internet information

DOEpatents

Potok, Thomas E.; Elmore, Mark Thomas; Reed, Joel Wesley; Treadwell, Jim N.; Samatova, Nagiza Faridovna

2010-04-06

A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.
System for gathering and summarizing internet information

DOEpatents

Potok, Thomas E.; Elmore, Mark Thomas; Reed, Joel Wesley; Treadwell, Jim N.; Samatova, Nagiza Faridovna

2006-07-04

A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.
Method for gathering and summarizing internet information

DOEpatents

Potok, Thomas E [Oak Ridge, TN; Elmore, Mark Thomas [Oak Ridge, TN; Reed, Joel Wesley [Knoxville, TN; Treadwell, Jim N [Louisville, TN; Samatova, Nagiza Faridovna [Oak Ridge, TN

2008-01-01

A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.
Automatic imitation: A meta-analysis.

PubMed

Cracco, Emiel; Bardi, Lara; Desmet, Charlotte; Genschow, Oliver; Rigoni, Davide; De Coster, Lize; Radkova, Ina; Deschrijver, Eliane; Brass, Marcel

2018-05-01

Automatic imitation is the finding that movement execution is facilitated by compatible and impeded by incompatible observed movements. In the past 15 years, automatic imitation has been studied to understand the relation between perception and action in social interaction. Although research on this topic started in cognitive science, interest quickly spread to related disciplines such as social psychology, clinical psychology, and neuroscience. However, important theoretical questions have remained unanswered. Therefore, in the present meta-analysis, we evaluated seven key questions on automatic imitation. The results, based on 161 studies containing 226 experiments, revealed an overall effect size of g z = 0.95, 95% CI [0.88, 1.02]. Moderator analyses identified automatic imitation as a flexible, largely automatic process that is driven by movement and effector compatibility, but is also influenced by spatial compatibility. Automatic imitation was found to be stronger for forced choice tasks than for simple response tasks, for human agents than for nonhuman agents, and for goalless actions than for goal-directed actions. However, it was not modulated by more subtle factors such as animacy beliefs, motion profiles, or visual perspective. Finally, there was no evidence for a relation between automatic imitation and either empathy or autism. Among other things, these findings point toward actor-imitator similarity as a crucial modulator of automatic imitation and challenge the view that imitative tendencies are an indicator of social functioning. The current meta-analysis has important theoretical implications and sheds light on longstanding controversies in the literature on automatic imitation and related domains. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application

PubMed Central

French, Leon; Liu, Po; Marais, Olivia; Koreman, Tianna; Tseng, Lucia; Lai, Artemis; Pavlidis, Paul

2015-01-01

We describe the WhiteText project, and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 67% of connectivity statements at 51% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the over 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/. PMID:26052282
UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text.

PubMed

Demner-Fushman, Dina; Mork, James G; Shooshan, Sonya E; Aronson, Alan R

2010-08-01

Identification of medical terms in free text is a first step in such Natural Language Processing (NLP) tasks as automatic indexing of biomedical literature and extraction of patients' problem lists from the text of clinical notes. Many tools developed to perform these tasks use biomedical knowledge encoded in the Unified Medical Language System (UMLS) Metathesaurus. We continue our exploration of automatic approaches to creation of subsets (UMLS content views) which can support NLP processing of either the biomedical literature or clinical text. We found that suppression of highly ambiguous terms in the conservative AutoFilter content view can partially replace manual filtering for literature applications, and suppression of two character mappings in the same content view achieves 89.5% precision at 78.6% recall for clinical applications. Published by Elsevier Inc.
Wilderness Management... A Computerized System for Summarizing Permit Information

Treesearch

Gary H. Elsner

1972-01-01

Permits were first needed for visits to wilderness areas in California during summer 1971. A computerized system for analyzing these permits and summarizing information from them has been developed. It produces four types of summary tables: point-of-origin of visitors; daily variation in total number of persons present; variations in group size; and variations in...
Theorization and an Empirical Investigation of the Component-Based and Developmental Text Writing Fluency Construct

ERIC Educational Resources Information Center

Kim, Young-Suk Grace; Gatlin, Brandy; Al Otaiba, Stephanie; Wanzek, Jeanne

2018-01-01

We discuss a component-based, developmental view of text writing fluency, which we tested using data from children in Grades 2 and 3. "Text writing fluency" was defined as efficiency and automaticity in writing connected texts, which acts as a mediator between text generation (oral language), transcription skills, and writing quality. We…
Supporting the education evidence portal via text mining

PubMed Central

Ananiadou, Sophia; Thompson, Paul; Thomas, James; Mu, Tingting; Oliver, Sandy; Rickinson, Mark; Sasaki, Yutaka; Weissenbacher, Davy; McNaught, John

2010-01-01

The UK Education Evidence Portal (eep) provides a single, searchable, point of access to the contents of the websites of 33 organizations relating to education, with the aim of revolutionizing work practices for the education community. Use of the portal alleviates the need to spend time searching multiple resources to find relevant information. However, the combined content of the websites of interest is still very large (over 500 000 documents and growing). This means that searches using the portal can produce very large numbers of hits. As users often have limited time, they would benefit from enhanced methods of performing searches and viewing results, allowing them to drill down to information of interest more efficiently, without having to sift through potentially long lists of irrelevant documents. The Joint Information Systems Committee (JISC)-funded ASSIST project has produced a prototype web interface to demonstrate the applicability of integrating a number of text-mining tools and methods into the eep, to facilitate an enhanced searching, browsing and document-viewing experience. New features include automatic classification of documents according to a taxonomy, automatic clustering of search results according to similar document content, and automatic identification and highlighting of key terms within documents. PMID:20643679
Nonverbatim Captioning in Dutch Television Programs: A Text Linguistic Approach

ERIC Educational Resources Information Center

Schilperoord, Joost; de Groot, Vanja; van Son, Nic

2005-01-01

In the Netherlands, as in most other European countries, closed captions for the deaf summarize texts rather than render them verbatim. Caption editors argue that in this way television viewers have enough time to both read the text and watch the program. They also claim that the meaning of the original message is properly conveyed. However, many…
Quantitative analysis of the patellofemoral motion pattern using semi-automatic processing of 4D CT data.

PubMed

Forsberg, Daniel; Lindblom, Maria; Quick, Petter; Gauffin, Håkan

2016-09-01

To present a semi-automatic method with minimal user interaction for quantitative analysis of the patellofemoral motion pattern. 4D CT data capturing the patellofemoral motion pattern of a continuous flexion and extension were collected for five patients prone to patellar luxation both pre- and post-surgically. For the proposed method, an observer would place landmarks in a single 3D volume, which then are automatically propagated to the other volumes in a time sequence. From the landmarks in each volume, the measures patellar displacement, patellar tilt and angle between femur and tibia were computed. Evaluation of the observer variability showed the proposed semi-automatic method to be favorable over a fully manual counterpart, with an observer variability of approximately 1.5[Formula: see text] for the angle between femur and tibia, 1.5 mm for the patellar displacement, and 4.0[Formula: see text]-5.0[Formula: see text] for the patellar tilt. The proposed method showed that surgery reduced the patellar displacement and tilt at maximum extension with approximately 10-15 mm and 15[Formula: see text]-20[Formula: see text] for three patients but with less evident differences for two of the patients. A semi-automatic method suitable for quantification of the patellofemoral motion pattern as captured by 4D CT data has been presented. Its observer variability is on par with that of other methods but with the distinct advantage to support continuous motions during the image acquisition.
Comparison of automatic control systems

NASA Technical Reports Server (NTRS)

Oppelt, W

1941-01-01

This report deals with a reciprocal comparison of an automatic pressure control, an automatic rpm control, an automatic temperature control, and an automatic directional control. It shows the difference between the "faultproof" regulator and the actual regulator which is subject to faults, and develops this difference as far as possible in a parallel manner with regard to the control systems under consideration. Such as analysis affords, particularly in its extension to the faults of the actual regulator, a deep insight into the mechanism of the regulator process.
Ultrasound image-based thyroid nodule automatic segmentation using convolutional neural networks.

PubMed

Ma, Jinlian; Wu, Fa; Jiang, Tian'an; Zhao, Qiyu; Kong, Dexing

2017-11-01

Delineation of thyroid nodule boundaries from ultrasound images plays an important role in calculation of clinical indices and diagnosis of thyroid diseases. However, it is challenging for accurate and automatic segmentation of thyroid nodules because of their heterogeneous appearance and components similar to the background. In this study, we employ a deep convolutional neural network (CNN) to automatically segment thyroid nodules from ultrasound images. Our CNN-based method formulates a thyroid nodule segmentation problem as a patch classification task, where the relationship among patches is ignored. Specifically, the CNN used image patches from images of normal thyroids and thyroid nodules as inputs and then generated the segmentation probability maps as outputs. A multi-view strategy is used to improve the performance of the CNN-based model. Additionally, we compared the performance of our approach with that of the commonly used segmentation methods on the same dataset. The experimental results suggest that our proposed method outperforms prior methods on thyroid nodule segmentation. Moreover, the results show that the CNN-based model is able to delineate multiple nodules in thyroid ultrasound images accurately and effectively. In detail, our CNN-based model can achieve an average of the overlap metric, dice ratio, true positive rate, false positive rate, and modified Hausdorff distance as [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text] on overall folds, respectively. Our proposed method is fully automatic without any user interaction. Quantitative results also indicate that our method is so efficient and accurate that it can be good enough to replace the time-consuming and tedious manual segmentation approach, demonstrating the potential clinical applications.
Automatic NEPHIS Coding of Descriptive Titles for Permuted Index Generation.

ERIC Educational Resources Information Center

Craven, Timothy C.

1982-01-01

Describes a system for the automatic coding of most descriptive titles which generates Nested Phrase Indexing System (NEPHIS) input strings of sufficient quality for permuted index production. A series of examples and an 11-item reference list accompany the text. (JL)
Intelligent Text Retrieval and Knowledge Acquisition from Texts for NASA Applications: Preprocessing Issues

NASA Technical Reports Server (NTRS)

2001-01-01

In this contract, which is a component of a larger contract that we plan to submit in the coming months, we plan to study the preprocessing issues which arise in applying natural language processing techniques to NASA-KSC problem reports. The goals of this work will be to deal with the issues of: a) automatically obtaining the problem reports from NASA-KSC data bases, b) the format of these reports and c) the conversion of these reports to a format that will be adequate for our natural language software. At the end of this contract, we expect that these problems will be solved and that we will be ready to apply our natural language software to a text database of over 1000 KSC problem reports.

Reorganized text.

PubMed

2015-05-01

Reorganized Text: In the Original Investigation titled “Patterns of Hospital Utilization for Head and Neck Cancer Care: Changing Demographics” posted online in the January 29, 2015, issue of JAMA Otolaryngology–Head & Neck Surgery (doi:10.1001 /jamaoto.2014.3603), information was copied within sections and text rearranged to accommodate Continuing Medical Education quiz formatting. The information from the topic statements of each paragraph in the Hypothesis Testing subsection of the Methods section was collected in a new first paragraph for that subsection, which reads as follows: “Several hypotheses regarding the causes of regionalization of HNCA care were tested using the NIS data: (1) increasing patient comorbidities over time, causing a shift in care to teaching institutions that would theoretically be better equipped to handle such increased comorbidities; (2) shifting of payer status; (3) increased proportion of prior radiation therapy; and (4) a higher fraction of more complex procedures being referred and performed at teaching institutions.” In addition, the phrase "As summarized in Table3," was added to the beginning of paragraph 6 of the Discussion section, and the call-out to Table 3 in the middle of that paragraph was deleted. Finally, paragraphs 6 and 7 of the Discussion section were combined.
Automatic Analysis of Critical Incident Reports: Requirements and Use Cases.

PubMed

Denecke, Kerstin

2016-01-01

Increasingly, critical incident reports are used as a means to increase patient safety and quality of care. The entire potential of these sources of experiential knowledge remains often unconsidered since retrieval and analysis is difficult and time-consuming, and the reporting systems often do not provide support for these tasks. The objective of this paper is to identify potential use cases for automatic methods that analyse critical incident reports. In more detail, we will describe how faceted search could offer an intuitive retrieval of critical incident reports and how text mining could support in analysing relations among events. To realise an automated analysis, natural language processing needs to be applied. Therefore, we analyse the language of critical incident reports and derive requirements towards automatic processing methods. We learned that there is a huge potential for an automatic analysis of incident reports, but there are still challenges to be solved.
Text de-identification for privacy protection: a study of its impact on clinical text information content.

PubMed

Meystre, Stéphane M; Ferrández, Óscar; Friedlin, F Jeffrey; South, Brett R; Shen, Shuying; Samore, Matthew H

2014-08-01

As more and more electronic clinical information is becoming easier to access for secondary uses such as clinical research, approaches that enable faster and more collaborative research while protecting patient privacy and confidentiality are becoming more important. Clinical text de-identification offers such advantages but is typically a tedious manual process. Automated Natural Language Processing (NLP) methods can alleviate this process, but their impact on subsequent uses of the automatically de-identified clinical narratives has only barely been investigated. In the context of a larger project to develop and investigate automated text de-identification for Veterans Health Administration (VHA) clinical notes, we studied the impact of automated text de-identification on clinical information in a stepwise manner. Our approach started with a high-level assessment of clinical notes informativeness and formatting, and ended with a detailed study of the overlap of select clinical information types and Protected Health Information (PHI). To investigate the informativeness (i.e., document type information, select clinical data types, and interpretation or conclusion) of VHA clinical notes, we used five different existing text de-identification systems. The informativeness was only minimally altered by these systems while formatting was only modified by one system. To examine the impact of de-identification on clinical information extraction, we compared counts of SNOMED-CT concepts found by an open source information extraction application in the original (i.e., not de-identified) version of a corpus of VHA clinical notes, and in the same corpus after de-identification. Only about 1.2-3% less SNOMED-CT concepts were found in de-identified versions of our corpus, and many of these concepts were PHI that was erroneously identified as clinical information. To study this impact in more details and assess how generalizable our findings were, we examined the overlap between
Practical vision based degraded text recognition system

NASA Astrophysics Data System (ADS)

Mohammad, Khader; Agaian, Sos; Saleh, Hani

2011-02-01

Rapid growth and progress in the medical, industrial, security and technology fields means more and more consideration for the use of camera based optical character recognition (OCR) Applying OCR to scanned documents is quite mature, and there are many commercial and research products available on this topic. These products achieve acceptable recognition accuracy and reasonable processing times especially with trained software, and constrained text characteristics. Even though the application space for OCR is huge, it is quite challenging to design a single system that is capable of performing automatic OCR for text embedded in an image irrespective of the application. Challenges for OCR systems include; images are taken under natural real world conditions, Surface curvature, text orientation, font, size, lighting conditions, and noise. These and many other conditions make it extremely difficult to achieve reasonable character recognition. Performance for conventional OCR systems drops dramatically as the degradation level of the text image quality increases. In this paper, a new recognition method is proposed to recognize solid or dotted line degraded characters. The degraded text string is localized and segmented using a new algorithm. The new method was implemented and tested using a development framework system that is capable of performing OCR on camera captured images. The framework allows parameter tuning of the image-processing algorithm based on a training set of camera-captured text images. Novel methods were used for enhancement, text localization and the segmentation algorithm which enables building a custom system that is capable of performing automatic OCR which can be used for different applications. The developed framework system includes: new image enhancement, filtering, and segmentation techniques which enabled higher recognition accuracies, faster processing time, and lower energy consumption, compared with the best state of the art published
Automatic Collision Avoidance Technology (ACAT)

NASA Technical Reports Server (NTRS)

Swihart, Donald E.; Skoog, Mark A.

2007-01-01

This document represents two views of the Automatic Collision Avoidance Technology (ACAT). One viewgraph presentation reviews the development and system design of Automatic Collision Avoidance Technology (ACAT). Two types of ACAT exist: Automatic Ground Collision Avoidance (AGCAS) and Automatic Air Collision Avoidance (AACAS). The AGCAS Uses Digital Terrain Elevation Data (DTED) for mapping functions, and uses Navigation data to place aircraft on map. It then scans DTED in front of and around aircraft and uses future aircraft trajectory (5g) to provide automatic flyup maneuver when required. The AACAS uses data link to determine position and closing rate. It contains several canned maneuvers to avoid collision. Automatic maneuvers can occur at last instant and both aircraft maneuver when using data link. The system can use sensor in place of data link. The second viewgraph presentation reviews the development of a flight test and an evaluation of the test. A review of the operation and comparison of the AGCAS and a pilot's performance are given. The same review is given for the AACAS is given.
Pattern-Based Extraction of Argumentation from the Scientific Literature

ERIC Educational Resources Information Center

White, Elizabeth K.

2010-01-01

As the number of publications in the biomedical field continues its exponential increase, techniques for automatically summarizing information from this body of literature have become more diverse. In addition, the targets of summarization have become more subtle; initial work focused on extracting the factual assertions from full-text papers,…
Information Retrieval and Text Mining Technologies for Chemistry.

PubMed

Krallinger, Martin; Rabal, Obdulia; Lourenço, Anália; Oyarzabal, Julen; Valencia, Alfonso

2017-06-28

Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.
Automatically Assessing Lexical Sophistication: Indices, Tools, Findings, and Application

ERIC Educational Resources Information Center

Kyle, Kristopher; Crossley, Scott A.

2015-01-01

This study explores the construct of lexical sophistication and its applications for measuring second language lexical and speaking proficiency. In doing so, the study introduces the Tool for the Automatic Analysis of LExical Sophistication (TAALES), which calculates text scores for 135 classic and newly developed lexical indices related to word…
Capturing User Reading Behaviors for Personalized Document Summarization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Songhua; Jiang, Hao; Lau, Francis

2011-01-01

We propose a new personalized document summarization method that observes a user's personal reading preferences. These preferences are inferred from the user's reading behaviors, including facial expressions, gaze positions, and reading durations that were captured during the user's past reading activities. We compare the performance of our algorithm with that of a few peer algorithms and software packages. The results of our comparative study show that our algorithm can produce more superior personalized document summaries than all the other methods in that the summaries generated by our algorithm can better satisfy a user's personal preferences.
Automatically identifying health outcome information in MEDLINE records.

PubMed

Demner-Fushman, Dina; Few, Barbara; Hauser, Susan E; Thoma, George

2006-01-01

Understanding the effect of a given intervention on the patient's health outcome is one of the key elements in providing optimal patient care. This study presents a methodology for automatic identification of outcomes-related information in medical text and evaluates its potential in satisfying clinical information needs related to health care outcomes. An annotation scheme based on an evidence-based medicine model for critical appraisal of evidence was developed and used to annotate 633 MEDLINE citations. Textual, structural, and meta-information features essential to outcome identification were learned from the created collection and used to develop an automatic system. Accuracy of automatic outcome identification was assessed in an intrinsic evaluation and in an extrinsic evaluation, in which ranking of MEDLINE search results obtained using PubMed Clinical Queries relied on identified outcome statements. The accuracy and positive predictive value of outcome identification were calculated. Effectiveness of the outcome-based ranking was measured using mean average precision and precision at rank 10. Automatic outcome identification achieved 88% to 93% accuracy. The positive predictive value of individual sentences identified as outcomes ranged from 30% to 37%. Outcome-based ranking improved retrieval accuracy, tripling mean average precision and achieving 389% improvement in precision at rank 10. Preliminary results in outcome-based document ranking show potential validity of the evidence-based medicine-model approach in timely delivery of information critical to clinical decision support at the point of service.
Automatic Coal-Mining System

NASA Technical Reports Server (NTRS)

Collins, E. R., Jr.

1985-01-01

Coal cutting and removal done with minimal hazard to people. Automatic coal mine cutting, transport and roof-support movement all done by automatic machinery. Exposure of people to hazardous conditions reduced to inspection tours, maintenance, repair, and possibly entry mining.
Difficulty identifying feelings and automatic activation in the fusiform gyrus in response to facial emotion.

PubMed

Eichmann, Mischa; Kugel, Harald; Suslow, Thomas

2008-12-01

Difficulties in identifying and differentiating one's emotions are a central characteristic of alexithymia. In the present study, automatic activation of the fusiform gyrus to facial emotion was investigated as a function of alexithymia as assessed by the 20-item Toronto Alexithymia Scale. During 3 Tesla fMRI scanning, pictures of faces bearing sad, happy, and neutral expressions masked by neutral faces were presented to 22 healthy adults who also responded to the Toronto Alexithymia Scale. The fusiform gyrus was selected as the region of interest, and voxel values of this region were extracted, summarized as means, and tested among the different conditions (sad, happy, and neutral faces). Masked sad facial emotions were associated with greater bilateral activation of the fusiform gyrus than masked neutral faces. The subscale, Difficulty Identifying Feelings, was negatively correlated with the neural response of the fusiform gyrus to masked sad faces. The correlation results suggest that automatic hyporesponsiveness of the fusiform gyrus to negative emotion stimuli may reflect problems in recognizing one's emotions in everyday life.
Automatic identification of comparative effectiveness research from Medline citations to support clinicians’ treatment information needs

PubMed Central

Zhang, Mingyuan; Fiol, Guilherme Del; Grout, Randall W.; Jonnalagadda, Siddhartha; Medlin, Richard; Mishra, Rashmi; Weir, Charlene; Liu, Hongfang; Mostafa, Javed; Fiszman, Marcelo

2014-01-01

Online knowledge resources such as Medline can address most clinicians’ patient care information needs. Yet, significant barriers, notably lack of time, limit the use of these sources at the point of care. The most common information needs raised by clinicians are treatment-related. Comparative effectiveness studies allow clinicians to consider multiple treatment alternatives for a particular problem. Still, solutions are needed to enable efficient and effective consumption of comparative effectiveness research at the point of care. Objective Design and assess an algorithm for automatically identifying comparative effectiveness studies and extracting the interventions investigated in these studies. Methods The algorithm combines semantic natural language processing, Medline citation metadata, and machine learning techniques. We assessed the algorithm in a case study of treatment alternatives for depression. Results Both precision and recall for identifying comparative studies was 0.83. A total of 86% of the interventions extracted perfectly or partially matched the gold standard. Conclusion Overall, the algorithm achieved reasonable performance. The method provides building blocks for the automatic summarization of comparative effectiveness research to inform point of care decision-making. PMID:23920677
Automatic switching matrix

DOEpatents

Schlecht, Martin F.; Kassakian, John G.; Caloggero, Anthony J.; Rhodes, Bruce; Otten, David; Rasmussen, Neil

1982-01-01

An automatic switching matrix that includes an apertured matrix board containing a matrix of wires that can be interconnected at each aperture. Each aperture has associated therewith a conductive pin which, when fully inserted into the associated aperture, effects electrical connection between the wires within that particular aperture. Means is provided for automatically inserting the pins in a determined pattern and for removing all the pins to permit other interconnecting patterns.
Linguistic Summarization of Video for Fall Detection Using Voxel Person and Fuzzy Logic

PubMed Central

Anderson, Derek; Luke, Robert H.; Keller, James M.; Skubic, Marjorie; Rantz, Marilyn; Aud, Myra

2009-01-01

In this paper, we present a method for recognizing human activity from linguistic summarizations of temporal fuzzy inference curves representing the states of a three-dimensional object called voxel person. A hierarchy of fuzzy logic is used, where the output from each level is summarized and fed into the next level. We present a two level model for fall detection. The first level infers the states of the person at each image. The second level operates on linguistic summarizations of voxel person’s states and inference regarding activity is performed. The rules used for fall detection were designed under the supervision of nurses to ensure that they reflect the manner in which elders perform these activities. The proposed framework is extremely flexible. Rules can be modified, added, or removed, allowing for per-resident customization based on knowledge about their cognitive and physical ability. PMID:20046216
The NLM Indexing Initiative's Medical Text Indexer.

PubMed

Aronson, Alan R; Mork, James G; Gay, Clifford W; Humphrey, Susanne M; Rogers, Willie J

2004-01-01

The Medical Text Indexer (MTI) is a program for producing MeSH indexing recommendations. It is the major product of NLM's Indexing Initiative and has been used in both semi-automated and fully automated indexing environments at the Library since mid 2002. We report here on an experiment conducted with MEDLINE indexers to evaluate MTI's performance and to generate ideas for its improvement as a tool for user-assisted indexing. We also discuss some filtering techniques developed to improve MTI's accuracy for use primarily in automatically producing the indexing for several abstracts collections.
User Evaluation of a Communication System That Automatically Generates Captions to Improve Telephone Communication

PubMed Central

Zekveld, Adriana A.; Kramer, Sophia E.; Kessens, Judith M.; Vlaming, Marcel S. M. G.; Houtgast, Tammo

2009-01-01

This study examined the subjective benefit obtained from automatically generated captions during telephone-speech comprehension in the presence of babble noise. Short stories were presented by telephone either with or without captions that were generated offline by an automatic speech recognition (ASR) system. To simulate online ASR, the word accuracy (WA) level of the captions was 60% or 70% and the text was presented delayed to the speech. After each test, the hearing impaired participants (n = 20) completed the NASA-Task Load Index and several rating scales evaluating the support from the captions. Participants indicated that using the erroneous text in speech comprehension was difficult and the reported task load did not differ between the audio + text and audio-only conditions. In a follow-up experiment (n = 10), the perceived benefit of presenting captions increased with an increase of WA levels to 80% and 90%, and elimination of the text delay. However, in general, the task load did not decrease when captions were presented. These results suggest that the extra effort required to process the text could have been compensated for by less effort required to comprehend the speech. Future research should aim at reducing the complexity of the task to increase the willingness of hearing impaired persons to use an assistive communication system automatically providing captions. The current results underline the need for obtaining both objective and subjective measures of benefit when evaluating assistive communication systems. PMID:19126551
Automatic assembly of micro-optical components

NASA Astrophysics Data System (ADS)

Gengenbach, Ulrich K.

1996-12-01

Automatic assembly becomes an important issue as hybrid micro systems enter industrial fabrication. Moving from a laboratory scale production with manual assembly and bonding processes to automatic assembly requires a thorough re- evaluation of the design, the characteristics of the individual components and of the processes involved. Parts supply for automatic operation, sensitive and intelligent grippers adapted to size, surface and material properties of the microcomponents gain importance when the superior sensory and handling skills of a human are to be replaced by a machine. This holds in particular for the automatic assembly of micro-optical components. The paper outlines these issues exemplified at the automatic assembly of a micro-optical duplexer consisting of a micro-optical bench fabricated by the LIGA technique, two spherical lenses, a wavelength filter and an optical fiber. Spherical lenses, wavelength filter and optical fiber are supplied by third party vendors, which raises the question of parts supply for automatic assembly. The bonding processes for these components include press fit and adhesive bonding. The prototype assembly system with all relevant components e.g. handling system, parts supply, grippers and control is described. Results of first automatic assembly tests are presented.
30 CFR 27.23 - Automatic warning device.

Code of Federal Regulations, 2014 CFR

2014-07-01

... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Automatic warning device. 27.23 Section 27.23... Automatic warning device. (a) An automatic warning device shall be suitably constructed for incorporation in... automatic warning device shall include an alarm signal (audible or colored light), which shall be made to...
30 CFR 27.23 - Automatic warning device.

Code of Federal Regulations, 2012 CFR

2012-07-01

... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Automatic warning device. 27.23 Section 27.23... Automatic warning device. (a) An automatic warning device shall be suitably constructed for incorporation in... automatic warning device shall include an alarm signal (audible or colored light), which shall be made to...

30 CFR 27.23 - Automatic warning device.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Automatic warning device. 27.23 Section 27.23... Automatic warning device. (a) An automatic warning device shall be suitably constructed for incorporation in... automatic warning device shall include an alarm signal (audible or colored light), which shall be made to...
30 CFR 27.23 - Automatic warning device.

Code of Federal Regulations, 2013 CFR

2013-07-01

... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Automatic warning device. 27.23 Section 27.23... Automatic warning device. (a) An automatic warning device shall be suitably constructed for incorporation in... automatic warning device shall include an alarm signal (audible or colored light), which shall be made to...
Neural Bases of Automaticity

ERIC Educational Resources Information Center

Servant, Mathieu; Cassey, Peter; Woodman, Geoffrey F.; Logan, Gordon D.

2018-01-01

Automaticity allows us to perform tasks in a fast, efficient, and effortless manner after sufficient practice. Theories of automaticity propose that across practice processing transitions from being controlled by working memory to being controlled by long-term memory retrieval. Recent event-related potential (ERP) studies have sought to test this…
Text-to-audiovisual speech synthesizer for children with learning disabilities.

PubMed

Mendi, Engin; Bayrak, Coskun

2013-01-01

Learning disabilities affect the ability of children to learn, despite their having normal intelligence. Assistive tools can highly increase functional capabilities of children with learning disorders such as writing, reading, or listening. In this article, we describe a text-to-audiovisual synthesizer that can serve as an assistive tool for such children. The system automatically converts an input text to audiovisual speech, providing synchronization of the head, eye, and lip movements of the three-dimensional face model with appropriate facial expressions and word flow of the text. The proposed system can enhance speech perception and help children having learning deficits to improve their chances of success.
Text-mining-assisted biocuration workflows in Argo

PubMed Central

Rak, Rafal; Batista-Navarro, Riza Theresa; Rowley, Andrew; Carter, Jacob; Ananiadou, Sophia

2014-01-01

Biocuration activities have been broadly categorized into the selection of relevant documents, the annotation of biological concepts of interest and identification of interactions between the concepts. Text mining has been shown to have a potential to significantly reduce the effort of biocurators in all the three activities, and various semi-automatic methodologies have been integrated into curation pipelines to support them. We investigate the suitability of Argo, a workbench for building text-mining solutions with the use of a rich graphical user interface, for the process of biocuration. Central to Argo are customizable workflows that users compose by arranging available elementary analytics to form task-specific processing units. A built-in manual annotation editor is the single most used biocuration tool of the workbench, as it allows users to create annotations directly in text, as well as modify or delete annotations created by automatic processing components. Apart from syntactic and semantic analytics, the ever-growing library of components includes several data readers and consumers that support well-established as well as emerging data interchange formats such as XMI, RDF and BioC, which facilitate the interoperability of Argo with other platforms or resources. To validate the suitability of Argo for curation activities, we participated in the BioCreative IV challenge whose purpose was to evaluate Web-based systems addressing user-defined biocuration tasks. Argo proved to have the edge over other systems in terms of flexibility of defining biocuration tasks. As expected, the versatility of the workbench inevitably lengthened the time the curators spent on learning the system before taking on the task, which may have affected the usability of Argo. The participation in the challenge gave us an opportunity to gather valuable feedback and identify areas of improvement, some of which have already been introduced. Database URL: http://argo.nactem.ac.uk PMID
46 CFR 52.01-10 - Automatic controls.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 46 Shipping 2 2013-10-01 2013-10-01 false Automatic controls. 52.01-10 Section 52.01-10 Shipping... Requirements § 52.01-10 Automatic controls. (a) Each main boiler must meet the special requirements for automatic safety controls in § 62.35-20(a)(1) of this chapter. (b) Each automatically controlled auxiliary...
46 CFR 52.01-10 - Automatic controls.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 46 Shipping 2 2010-10-01 2010-10-01 false Automatic controls. 52.01-10 Section 52.01-10 Shipping... Requirements § 52.01-10 Automatic controls. (a) Each main boiler must meet the special requirements for automatic safety controls in § 62.35-20(a)(1) of this chapter. (b) Each automatically controlled auxiliary...
46 CFR 52.01-10 - Automatic controls.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 46 Shipping 2 2014-10-01 2014-10-01 false Automatic controls. 52.01-10 Section 52.01-10 Shipping... Requirements § 52.01-10 Automatic controls. (a) Each main boiler must meet the special requirements for automatic safety controls in § 62.35-20(a)(1) of this chapter. (b) Each automatically controlled auxiliary...
46 CFR 52.01-10 - Automatic controls.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 46 Shipping 2 2012-10-01 2012-10-01 false Automatic controls. 52.01-10 Section 52.01-10 Shipping... Requirements § 52.01-10 Automatic controls. (a) Each main boiler must meet the special requirements for automatic safety controls in § 62.35-20(a)(1) of this chapter. (b) Each automatically controlled auxiliary...
46 CFR 52.01-10 - Automatic controls.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 46 Shipping 2 2011-10-01 2011-10-01 false Automatic controls. 52.01-10 Section 52.01-10 Shipping... Requirements § 52.01-10 Automatic controls. (a) Each main boiler must meet the special requirements for automatic safety controls in § 62.35-20(a)(1) of this chapter. (b) Each automatically controlled auxiliary...
Classification of hepatocellular carcinoma stages from free-text clinical and radiology reports

PubMed Central

Yim, Wen-wai; Kwan, Sharon W; Johnson, Guy; Yetisgen, Meliha

2017-01-01

Cancer stage information is important for clinical research. However, they are not always explicitly noted in electronic medical records. In this paper, we present our work on automatic classification of hepatocellular carcinoma (HCC) stages from free-text clinical and radiology notes. To accomplish this, we defined 11 stage parameters used in the three HCC staging systems, American Joint Committee on Cancer (AJCC), Barcelona Clinic Liver Cancer (BCLC), and Cancer of the Liver Italian Program (CLIP). After aggregating stage parameters to the patient-level, the final stage classifications were achieved using an expert-created decision logic. Each stage parameter relevant for staging was extracted using several classification methods, e.g. sentence classification and automatic information structuring, to identify and normalize text as cancer stage parameter values. Stage parameter extraction for the test set performed at 0.81 F1. Cancer stage prediction for AJCC, BCLC, and CLIP stage classifications were 0.55, 0.50, and 0.43 F1.
BROWSER: An Automatic Indexing On-Line Text Retrieval System. Annual Progress Report.

ERIC Educational Resources Information Center

Williams, J. H., Jr.

The development and testing of the Browsing On-line With Selective Retrieval (BROWSER) text retrieval system allowing a natural language query statement and providing on-line browsing capabilities through an IBM 2260 display terminal is described. The prototype system contains data bases of 25,000 German language patent abstracts, 9,000 English…
An NLP Framework for Non-Topical Text Analysis in Urdu--A Resource Poor Language

ERIC Educational Resources Information Center

Mukund, Smruthi

2012-01-01

Language plays a very important role in understanding the culture and mindset of people. Given the abundance of electronic multilingual data, it is interesting to see what insight can be gained by automatic analysis of text. This in turn calls for text analysis which is focused on non-topical information such as emotions being expressed that is in…
Rewriting and Paraphrasing Source Texts in Second Language Writing

ERIC Educational Resources Information Center

Shi, Ling

2012-01-01

The present study is based on interviews with 48 students and 27 instructors in a North American university and explores whether students and professors across faculties share the same views on the use of paraphrased, summarized, and translated texts in four examples of L2 student writing. Participants' comments centered on whether the paraphrases…
The Automatic Assessment of Free Text Answers Using a Modified BLEU Algorithm

ERIC Educational Resources Information Center

Noorbehbahani, F.; Kardan, A. A.

2011-01-01

e-Learning plays an undoubtedly important role in today's education and assessment is one of the most essential parts of any instruction-based learning process. Assessment is a common way to evaluate a student's knowledge regarding the concepts related to learning objectives. In this paper, a new method for assessing the free text answers of…
ADMAP (automatic data manipulation program)

NASA Technical Reports Server (NTRS)

Mann, F. I.

1971-01-01

Instructions are presented on the use of ADMAP, (automatic data manipulation program) an aerospace data manipulation computer program. The program was developed to aid in processing, reducing, plotting, and publishing electric propulsion trajectory data generated by the low thrust optimization program, HILTOP. The program has the option of generating SC4020 electric plots, and therefore requires the SC4020 routines to be available at excution time (even if not used). Several general routines are present, including a cubic spline interpolation routine, electric plotter dash line drawing routine, and single parameter and double parameter sorting routines. Many routines are tailored for the manipulation and plotting of electric propulsion data, including an automatic scale selection routine, an automatic curve labelling routine, and an automatic graph titling routine. Data are accepted from either punched cards or magnetic tape.
Automatic Identification and Organization of Index Terms for Interactive Browsing.

ERIC Educational Resources Information Center

Wacholder, Nina; Evans, David K.; Klavans, Judith L.

The potential of automatically generated indexes for information access has been recognized for several decades, but the quantity of text and the ambiguity of natural language processing have made progress at this task more difficult than was originally foreseen. Recently, a body of work on development of interactive systems to support phrase…
Effects of a Summarizing Strategy on Written Summaries of Children with Emotional and Behavioral Disorders

ERIC Educational Resources Information Center

Saddler, Bruce; Asaro-Saddler, Kristie; Moeyaert, Mariola; Ellis-Robinson, Tammy

2017-01-01

In this single-subject study, we examined the effects of a summarizing strategy on the written summaries of children with emotional and behavioral disorders (EBDs). Six students with EBDs in fifth and sixth grades learned a mnemonic-based strategy for summarizing taught through the self-regulated strategy development (SRSD) approach. Visual…
Calibrating Item Families and Summarizing the Results Using Family Expected Response Functions

ERIC Educational Resources Information Center

Sinharay, Sandip; Johnson, Matthew S.; Williamson, David M.

2003-01-01

Item families, which are groups of related items, are becoming increasingly popular in complex educational assessments. For example, in automatic item generation (AIG) systems, a test may consist of multiple items generated from each of a number of item models. Item calibration or scoring for such an assessment requires fitting models that can…
Analysis of Protein Phosphorylation and Its Functional Impact on Protein-Protein Interactions via Text Mining of the Scientific Literature.

PubMed

Wang, Qinghua; Ross, Karen E; Huang, Hongzhan; Ren, Jia; Li, Gang; Vijay-Shanker, K; Wu, Cathy H; Arighi, Cecilia N

2017-01-01

Post-translational modifications (PTMs) are one of the main contributors to the diversity of proteoforms in the proteomic landscape. In particular, protein phosphorylation represents an essential regulatory mechanism that plays a role in many biological processes. Protein kinases, the enzymes catalyzing this reaction, are key participants in metabolic and signaling pathways. Their activation or inactivation dictate downstream events: what substrates are modified and their subsequent impact (e.g., activation state, localization, protein-protein interactions (PPIs)). The biomedical literature continues to be the main source of evidence for experimental information about protein phosphorylation. Automatic methods to bring together phosphorylation events and phosphorylation-dependent PPIs can help to summarize the current knowledge and to expose hidden connections. In this chapter, we demonstrate two text mining tools, RLIMS-P and eFIP, for the retrieval and extraction of kinase-substrate-site data and phosphorylation-dependent PPIs from the literature. These tools offer several advantages over a literature search in PubMed as their results are specific for phosphorylation. RLIMS-P and eFIP results can be sorted, organized, and viewed in multiple ways to answer relevant biological questions, and the protein mentions are linked to UniProt identifiers.

New directions in biomedical text annotation: definitions, guidelines and corpus construction

PubMed Central

Wilbur, W John; Rzhetsky, Andrey; Shatkay, Hagit

2006-01-01

Background While biomedical text mining is emerging as an important research area, practical results have proven difficult to achieve. We believe that an important first step towards more accurate text-mining lies in the ability to identify and characterize text that satisfies various types of information needs. We report here the results of our inquiry into properties of scientific text that have sufficient generality to transcend the confines of a narrow subject area, while supporting practical mining of text for factual information. Our ultimate goal is to annotate a significant corpus of biomedical text and train machine learning methods to automatically categorize such text along certain dimensions that we have defined. Results We have identified five qualitative dimensions that we believe characterize a broad range of scientific sentences, and are therefore useful for supporting a general approach to text-mining: focus, polarity, certainty, evidence, and directionality. We define these dimensions and describe the guidelines we have developed for annotating text with regard to them. To examine the effectiveness of the guidelines, twelve annotators independently annotated the same set of 101 sentences that were randomly selected from current biomedical periodicals. Analysis of these annotations shows 70–80% inter-annotator agreement, suggesting that our guidelines indeed present a well-defined, executable and reproducible task. Conclusion We present our guidelines defining a text annotation task, along with annotation results from multiple independently produced annotations, demonstrating the feasibility of the task. The annotation of a very large corpus of documents along these guidelines is currently ongoing. These annotations form the basis for the categorization of text along multiple dimensions, to support viable text mining for experimental results, methodology statements, and other forms of information. We are currently developing machine learning
Argo: an integrative, interactive, text mining-based workbench supporting curation

PubMed Central

Rak, Rafal; Rowley, Andrew; Black, William; Ananiadou, Sophia

2012-01-01

Curation of biomedical literature is often supported by the automatic analysis of textual content that generally involves a sequence of individual processing components. Text mining (TM) has been used to enhance the process of manual biocuration, but has been focused on specific databases and tasks rather than an environment integrating TM tools into the curation pipeline, catering for a variety of tasks, types of information and applications. Processing components usually come from different sources and often lack interoperability. The well established Unstructured Information Management Architecture is a framework that addresses interoperability by defining common data structures and interfaces. However, most of the efforts are targeted towards software developers and are not suitable for curators, or are otherwise inconvenient to use on a higher level of abstraction. To overcome these issues we introduce Argo, an interoperable, integrative, interactive and collaborative system for text analysis with a convenient graphic user interface to ease the development of processing workflows and boost productivity in labour-intensive manual curation. Robust, scalable text analytics follow a modular approach, adopting component modules for distinct levels of text analysis. The user interface is available entirely through a web browser that saves the user from going through often complicated and platform-dependent installation procedures. Argo comes with a predefined set of processing components commonly used in text analysis, while giving the users the ability to deposit their own components. The system accommodates various areas and levels of user expertise, from TM and computational linguistics to ontology-based curation. One of the key functionalities of Argo is its ability to seamlessly incorporate user-interactive components, such as manual annotation editors, into otherwise completely automatic pipelines. As a use case, we demonstrate the functionality of an in
Summarizing metocean operating conditions as a climatology of marine hazards

NASA Astrophysics Data System (ADS)

Reid, Heather; Finnis, Joel

2018-03-01

Marine occupations are plagued by some of the highest accident and mortality rates of any occupation, due in part to the variety and severity of environmental hazards presented by the ocean environment. In order to better study and communicate the potential impacts of these hazards on occupational health and safety, a semi-objective, hazard-focused climatology of a particularly dangerous marine environment (Northwestern Atlantic) has been developed. Specifically, climate has been summarized as the frequency with which responsible government agencies are expected to issue relevant warnings or watches, couching results in language relevant to marine stakeholders. Applying cluster analysis to warning/watch frequencies identified seven distinct `hazard climatologies', ranging from near-Arctic conditions to areas dominated by calm seas and warm waters. Spatial and temporal variability in these clusters reflects relevant annual cycles, such as the advance/retreat of sea ice and shifts in the Atlantic storm track; the clusters also highlight regions and seasons with comparable operational risks. Our approach is proposed as an effective means to summarize and communicate marine risk with stakeholders, and a potential framework for describing climate change impacts.
A Recent Advance in the Automatic Indexing of the Biomedical Literature

PubMed Central

Névéol, Aurélie; Shooshan, Sonya E.; Humphrey, Susanne M.; Mork, James G.; Aronson, Alan R.

2009-01-01

The volume of biomedical literature has experienced explosive growth in recent years. This is reflected in the corresponding increase in the size of MEDLINE®, the largest bibliographic database of biomedical citations. Indexers at the U.S. National Library of Medicine (NLM) need efficient tools to help them accommodate the ensuing workload. After reviewing issues in the automatic assignment of Medical Subject Headings (MeSH® terms) to biomedical text, we focus more specifically on the new subheading attachment feature for NLM’s Medical Text Indexer (MTI). Natural Language Processing, statistical, and machine learning methods of producing automatic MeSH main heading/subheading pair recommendations were assessed independently and combined. The best combination achieves 48% precision and 30% recall. After validation by NLM indexers, a suitable combination of the methods presented in this paper was integrated into MTI as a subheading attachment feature producing MeSH indexing recommendations compliant with current state-of-the-art indexing practice. PMID:19166973
A recent advance in the automatic indexing of the biomedical literature.

PubMed

Névéol, Aurélie; Shooshan, Sonya E; Humphrey, Susanne M; Mork, James G; Aronson, Alan R

2009-10-01

The volume of biomedical literature has experienced explosive growth in recent years. This is reflected in the corresponding increase in the size of MEDLINE, the largest bibliographic database of biomedical citations. Indexers at the US National Library of Medicine (NLM) need efficient tools to help them accommodate the ensuing workload. After reviewing issues in the automatic assignment of Medical Subject Headings (MeSH terms) to biomedical text, we focus more specifically on the new subheading attachment feature for NLM's Medical Text Indexer (MTI). Natural Language Processing, statistical, and machine learning methods of producing automatic MeSH main heading/subheading pair recommendations were assessed independently and combined. The best combination achieves 48% precision and 30% recall. After validation by NLM indexers, a suitable combination of the methods presented in this paper was integrated into MTI as a subheading attachment feature producing MeSH indexing recommendations compliant with current state-of-the-art indexing practice.
Multi-documents summarization based on clustering of learning object using hierarchical clustering

NASA Astrophysics Data System (ADS)

Mustamiin, M.; Budi, I.; Santoso, H. B.

2018-03-01

The Open Educational Resources (OER) is a portal of teaching, learning and research resources that is available in public domain and freely accessible. Learning contents or Learning Objects (LO) are granular and can be reused for constructing new learning materials. LO ontology-based searching techniques can be used to search for LO in the Indonesia OER. In this research, LO from search results are used as an ingredient to create new learning materials according to the topic searched by users. Summarizing-based grouping of LO use Hierarchical Agglomerative Clustering (HAC) with the dependency context to the user’s query which has an average value F-Measure of 0.487, while summarizing by K-Means F-Measure only has an average value of 0.336.
Automatic scanning and measuring using POLLY

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fields, T.

1993-07-01

The HPD and PEPR automatic measuring systems, which have been described by B. Powell and I. Pless at this conference, were developed in the 1960`s to be used for what would now be called {open_quotes}batch processing.{close_quotes} That is, an entire reel of bubble chamber film containing interesting events whose tracks had been rough-digitized would be processed in an extended run by a dedicated computer/precision digitizer hardware system, with no human intervention. Then, at a later time, events for which the precision measurement did not appear to be successful would be handled with some type of {open_quotes}fixup{close_quotes} station or process. Bymore » contrast, the POLLY system included from the start, not only a computer and a precision CRT measuring device, but also a human operator who could have convenient two-way interactions with the computer and could also view the picture directly. Inclusion of a human as a key part of the system had some important beneficial effects, as has been described in the original papers. In this note the author summarizes those effects, and also points out connections between the POLLY system philosophy and subsequent developments in both high energy physics data analysis and computing systems.« less
Automatic enforcement and highway safety.

DOT National Transportation Integrated Search

2011-05-01

The objectives of this research are to: 1. Identify aspects of the automatic detection of red light running that the public finds offensive or problematical, and quantify the level of opposition on each aspect. 2. Identify aspects of the automatic de...
Assumption-versus data-based approaches to summarizing species' ranges.

PubMed

Peterson, A Townsend; Navarro-Sigüenza, Adolfo G; Gordillo, Alejandro

2018-06-01

For conservation decision making, species' geographic distributions are mapped using various approaches. Some such efforts have downscaled versions of coarse-resolution extent-of-occurrence maps to fine resolutions for conservation planning. We examined the quality of the extent-of-occurrence maps as range summaries and the utility of refining those maps into fine-resolution distributional hypotheses. Extent-of-occurrence maps tend to be overly simple, omit many known and well-documented populations, and likely frequently include many areas not holding populations. Refinement steps involve typological assumptions about habitat preferences and elevational ranges of species, which can introduce substantial error in estimates of species' true areas of distribution. However, no model-evaluation steps are taken to assess the predictive ability of these models, so model inaccuracies are not noticed. Whereas range summaries derived by these methods may be useful in coarse-grained, global-extent studies, their continued use in on-the-ground conservation applications at fine spatial resolutions is not advisable in light of reliance on assumptions, lack of real spatial resolution, and lack of testing. In contrast, data-driven techniques that integrate primary data on biodiversity occurrence with remotely sensed data that summarize environmental dimensions (i.e., ecological niche modeling or species distribution modeling) offer data-driven solutions based on a minimum of assumptions that can be evaluated and validated quantitatively to offer a well-founded, widely accepted method for summarizing species' distributional patterns for conservation applications. © 2016 Society for Conservation Biology.
Assessing semantic similarity of texts - Methods and algorithms

NASA Astrophysics Data System (ADS)

Rozeva, Anna; Zerkova, Silvia

2017-12-01

Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which provide for obtaining structured representative model of the documents in a corpus by means of extracting and selecting the features, characterizing their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at syntactical and semantic level. An important text-mining method and similarity measure is latent semantic analysis (LSA). It provides for reducing the dimensionality of the document vector space and better capturing the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space as well as similarity calculation are presented.
BaffleText: a Human Interactive Proof

NASA Astrophysics Data System (ADS)

Chew, Monica; Baird, Henry S.

2003-01-01

Internet services designed for human use are being abused by programs. We present a defense against such attacks in the form of a CAPTCHA (Completely Automatic Public Turing test to tell Computers and Humans Apart) that exploits the difference in ability between humans and machines in reading images of text. CAPTCHAs are a special case of 'human interactive proofs,' a broad class of security protocols that allow people to identify themselves over networks as members of given groups. We point out vulnerabilities of reading-based CAPTCHAs to dictionary and computer-vision attacks. We also draw on the literature on the psychophysics of human reading, which suggests fresh defenses available to CAPTCHAs. Motivated by these considerations, we propose BaffleText, a CAPTCHA which uses non-English pronounceable words to defend against dictionary attacks, and Gestalt-motivated image-masking degradations to defend against image restoration attacks. Experiments on human subjects confirm the human legibility and user acceptance of BaffleText images. We have found an image-complexity measure that correlates well with user acceptance and assists in engineering the generation of challenges to fit the ability gap. Recent computer-vision attacks, run independently by Mori and Jitendra, suggest that BaffleText is stronger than two existing CAPTCHAs.
30 CFR 77.314 - Automatic temperature control instruments.

Code of Federal Regulations, 2011 CFR

2011-07-01

... UNDERGROUND COAL MINES Thermal Dryers § 77.314 Automatic temperature control instruments. (a) Automatic temperature control instruments for thermal dryer system shall be of the recording type. (b) Automatic... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Automatic temperature control instruments. 77...
30 CFR 77.314 - Automatic temperature control instruments.

Code of Federal Regulations, 2010 CFR

2010-07-01

... UNDERGROUND COAL MINES Thermal Dryers § 77.314 Automatic temperature control instruments. (a) Automatic temperature control instruments for thermal dryer system shall be of the recording type. (b) Automatic... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Automatic temperature control instruments. 77...
30 CFR 77.314 - Automatic temperature control instruments.

Code of Federal Regulations, 2013 CFR

2013-07-01

... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Automatic temperature control instruments. 77... UNDERGROUND COAL MINES Thermal Dryers § 77.314 Automatic temperature control instruments. (a) Automatic temperature control instruments for thermal dryer system shall be of the recording type. (b) Automatic...
30 CFR 77.314 - Automatic temperature control instruments.

Code of Federal Regulations, 2012 CFR

2012-07-01

... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Automatic temperature control instruments. 77... UNDERGROUND COAL MINES Thermal Dryers § 77.314 Automatic temperature control instruments. (a) Automatic temperature control instruments for thermal dryer system shall be of the recording type. (b) Automatic...
30 CFR 77.314 - Automatic temperature control instruments.

Code of Federal Regulations, 2014 CFR

2014-07-01

... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Automatic temperature control instruments. 77... UNDERGROUND COAL MINES Thermal Dryers § 77.314 Automatic temperature control instruments. (a) Automatic temperature control instruments for thermal dryer system shall be of the recording type. (b) Automatic...
Temporal reasoning over clinical text: the state of the art

PubMed Central

Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem

2013-01-01

Objectives To provide an overview of the problem of temporal reasoning over clinical text and to summarize the state of the art in clinical natural language processing for this task. Target audience This overview targets medical informatics researchers who are unfamiliar with the problems and applications of temporal reasoning over clinical text. Scope We review the major applications of text-based temporal reasoning, describe the challenges for software systems handling temporal information in clinical text, and give an overview of the state of the art. Finally, we present some perspectives on future research directions that emerged during the recent community-wide challenge on text-based temporal reasoning in the clinical domain. PMID:23676245
Summarizing health inequalities in a Balanced Scorecard. Methodological considerations.

PubMed

Auger, Nathalie; Raynault, Marie-France

2006-01-01

The association between social determinants and health inequalities is well recognized. What are now needed are tools to assist in disseminating such information. This article describes how the Balanced Scorecard may be used for summarizing data on health inequalities. The process begins by selecting appropriate social groups and indicators, and is followed by the measurement of differences across person, place, or time. The next step is to decide whether to focus on absolute versus relative inequality. The last step is to determine the scoring method, including whether to address issues of depth of inequality.
Thai Automatic Speech Recognition

DTIC Science & Technology

2005-01-01

used in an external DARPA evaluation involving medical scenarios between an American Doctor and a naïve monolingual Thai patient. 2. Thai Language... dictionary generation more challenging, and (3) the lack of word segmentation, which calls for automatic segmentation approaches to make n-gram language...requires a dictionary and provides various segmentation algorithms to automatically select suitable segmentations. Here we used a maximal matching
Benchmarking infrastructure for mutation text mining

PubMed Central

2014-01-01

Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600

Benchmarking infrastructure for mutation text mining.

PubMed

Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

2014-02-25

Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.
Automatic programming of simulation models

NASA Technical Reports Server (NTRS)

Schroer, Bernard J.; Tseng, Fan T.; Zhang, Shou X.; Dwan, Wen S.

1988-01-01

The objective of automatic programming is to improve the overall environment for describing the program. This improved environment is realized by a reduction in the amount of detail that the programmer needs to know and is exposed to. Furthermore, this improved environment is achieved by a specification language that is more natural to the user's problem domain and to the user's way of thinking and looking at the problem. The goal of this research is to apply the concepts of automatic programming (AP) to modeling discrete event simulation system. Specific emphasis is on the design and development of simulation tools to assist the modeler define or construct a model of the system and to then automatically write the corresponding simulation code in the target simulation language, GPSS/PC. A related goal is to evaluate the feasibility of various languages for constructing automatic programming simulation tools.
Clothes Dryer Automatic Termination Evaluation

DOE Office of Scientific and Technical Information (OSTI.GOV)

TeGrotenhuis, Ward E.

Volume 2: Improved Sensor and Control Designs Many residential clothes dryers on the market today provide automatic cycles that are intended to stop when the clothes are dry, as determined by the final remaining moisture content (RMC). However, testing of automatic termination cycles has shown that many dryers are susceptible to over-drying of loads, leading to excess energy consumption. In particular, tests performed using the DOE Test Procedure in Appendix D2 of 10 CFR 430 subpart B have shown that as much as 62% of the energy used in a cycle may be from over-drying. Volume 1 of this reportmore » shows an average of 20% excess energy from over-drying when running automatic cycles with various load compositions and dryer settings. Consequently, improving automatic termination sensors and algorithms has the potential for substantial energy savings in the U.S.« less
Sentence Similarity Analysis with Applications in Automatic Short Answer Grading

ERIC Educational Resources Information Center

Mohler, Michael A. G.

2012-01-01

In this dissertation, I explore unsupervised techniques for the task of automatic short answer grading. I compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating…
Automatic system for computer program documentation

NASA Technical Reports Server (NTRS)

Simmons, D. B.; Elliott, R. W.; Arseven, S.; Colunga, D.

1972-01-01

Work done on a project to design an automatic system for computer program documentation aids was made to determine what existing programs could be used effectively to document computer programs. Results of the study are included in the form of an extensive bibliography and working papers on appropriate operating systems, text editors, program editors, data structures, standards, decision tables, flowchart systems, and proprietary documentation aids. The preliminary design for an automated documentation system is also included. An actual program has been documented in detail to demonstrate the types of output that can be produced by the proposed system.
12 Texts That Facilitate Authentic Reading Strategies for Novice, Experimenting, and Proficient Readers

ERIC Educational Resources Information Center

Hill, K. Dara

2017-01-01

The current climate of reading instruction calls for fluency strategies that stress automaticity, accuracy, and prosody, within the scope of prescribed reading programs that compromise teacher autonomy, with texts that are often irrelevant to the students' experiences. Consequently, accuracy and speed are developed, but deep comprehension is…
Standard Chinese: A Modular Approach. Student Text. Module 3: Money; Module 4: Directions.

ERIC Educational Resources Information Center

Defense Language Inst., Monterey, CA.

Texts in spoken Standard Chinese were developed to improve and update Chinese materials to reflect current usage in Beijing and Taipei. The focus is on communicating in practical situations, and the texts summarize and supplement tapes. The overall course is organized into 10 situational modules, student workbooks, and resource modules. This text…
The Shifting Sands in the Effects of Source Text Summarizability on Summary Writing

ERIC Educational Resources Information Center

Yu, Guoxing

2009-01-01

This paper reports the effects of the properties of source texts on summarization. One hundred and fifty-seven undergraduates were asked to write summaries of one of three extended English texts of similar length and readability, but differing in other discoursal features such as lexical diversity and macro-organization. The effects of…
10 CFR 429.45 - Automatic commercial ice makers.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 10 Energy 3 2012-01-01 2012-01-01 false Automatic commercial ice makers. 429.45 Section 429.45... PRODUCTS AND COMMERCIAL AND INDUSTRIAL EQUIPMENT Certification § 429.45 Automatic commercial ice makers. (a... automatic commercial ice makers; and (2) For each basic model of automatic commercial ice maker selected for...
10 CFR 429.45 - Automatic commercial ice makers.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 10 Energy 3 2014-01-01 2014-01-01 false Automatic commercial ice makers. 429.45 Section 429.45... PRODUCTS AND COMMERCIAL AND INDUSTRIAL EQUIPMENT Certification § 429.45 Automatic commercial ice makers. (a... automatic commercial ice makers; and (2) For each basic model of automatic commercial ice maker selected for...
10 CFR 429.45 - Automatic commercial ice makers.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 10 Energy 3 2013-01-01 2013-01-01 false Automatic commercial ice makers. 429.45 Section 429.45... PRODUCTS AND COMMERCIAL AND INDUSTRIAL EQUIPMENT Certification § 429.45 Automatic commercial ice makers. (a... automatic commercial ice makers; and (2) For each basic model of automatic commercial ice maker selected for...
Computer systems for automatic earthquake detection

USGS Publications Warehouse

Stewart, S.W.

1974-01-01

U.S Geological Survey seismologists in Menlo park, California, are utilizing the speed, reliability, and efficiency of minicomputers to monitor seismograph stations and to automatically detect earthquakes. An earthquake detection computer system, believed to be the only one of its kind in operation, automatically reports about 90 percent of all local earthquakes recorded by a network of over 100 central California seismograph stations. The system also monitors the stations for signs of malfunction or abnormal operation. Before the automatic system was put in operation, all of the earthquakes recorded had to be detected by manually searching the records, a time-consuming process. With the automatic detection system, the stations are efficiently monitored continuously.
Automatic Presentation of Sense-Specific Lexical Information in an Intelligent Learning System

ERIC Educational Resources Information Center

Eom, Soojeong

2012-01-01

Learning vocabulary and understanding texts present difficulty for language learners due to, among other things, the high degree of lexical ambiguity. By developing an intelligent tutoring system, this dissertation examines whether automatically providing enriched sense-specific information is effective for vocabulary learning and reading…
Automatic ultrasound image enhancement for 2D semi-automatic breast-lesion segmentation

NASA Astrophysics Data System (ADS)

Lu, Kongkuo; Hall, Christopher S.

2014-03-01

Breast cancer is the fastest growing cancer, accounting for 29%, of new cases in 2012, and second leading cause of cancer death among women in the United States and worldwide. Ultrasound (US) has been used as an indispensable tool for breast cancer detection/diagnosis and treatment. In computer-aided assistance, lesion segmentation is a preliminary but vital step, but the task is quite challenging in US images, due to imaging artifacts that complicate detection and measurement of the suspect lesions. The lesions usually present with poor boundary features and vary significantly in size, shape, and intensity distribution between cases. Automatic methods are highly application dependent while manual tracing methods are extremely time consuming and have a great deal of intra- and inter- observer variability. Semi-automatic approaches are designed to counterbalance the advantage and drawbacks of the automatic and manual methods. However, considerable user interaction might be necessary to ensure reasonable segmentation for a wide range of lesions. This work proposes an automatic enhancement approach to improve the boundary searching ability of the live wire method to reduce necessary user interaction while keeping the segmentation performance. Based on the results of segmentation of 50 2D breast lesions in US images, less user interaction is required to achieve desired accuracy, i.e. < 80%, when auto-enhancement is applied for live-wire segmentation.
AUTOMATIC COUNTING APPARATUS

DOEpatents

Howell, W.D.

1957-08-20

An apparatus for automatically recording the results of counting operations on trains of electrical pulses is described. The disadvantages of prior devices utilizing the two common methods of obtaining the count rate are overcome by this apparatus; in the case of time controlled operation, the disclosed system automatically records amy information stored by the scaler but not transferred to the printer at the end of the predetermined time controlled operations and, in the case of count controlled operation, provision is made to prevent a weak sample from occupying the apparatus for an excessively long period of time.
Improving text recall with multiple summaries.

PubMed

van der Meij, Hans; van der Meij, Jan

2012-06-01

QuikScan (QS) is an innovative design that aims to improve accessibility, comprehensibility, and subsequent recall of expository text by means of frequent within-document summaries that are formatted as numbered list items. The numbers in the QS summaries correspond to numbers placed in the body of the document where the summarized ideas are discussed in full. To examine the influence of QS summaries on participants' perceptions of text quality (i.e., comprehensibility, structure, and interest) and recall, an experimental - control group design compared the effects of a QS text with a structured abstract (SA) text. Forty psychology students participated voluntarily or received course credits. Students first read a control (SA) or experimental (QS) text on flashbulb memory (FBM). Next, their perceptions of text quality were measured through a questionnaire. Recall was assessed with an open answer test with items for facts, comprehension and higher order information. Perceptions of text quality did not vary across conditions. But QS did lead to significantly and substantially (d= 1.57) higher overall recall scores. Participants with the QS text performed significantly better on all item types than participants with the SA text. Studying a QS text led to a substantial improvement in recall compared to an SA text. Further research is needed to examine how readers study QS texts and whether a text model hypothesis or a repetition effect hypothesis accounts for the effectiveness. The first hypothesis posits that the QS summaries support the reader in constructing a text schema. The second attributes the effects of these summaries to their repetition of text topics. ©2011 The British Psychological Society.
Semi-automatic knee cartilage segmentation

NASA Astrophysics Data System (ADS)

Dam, Erik B.; Folkesson, Jenny; Pettersen, Paola C.; Christiansen, Claus

2006-03-01

Osteo-Arthritis (OA) is a very common age-related cause of pain and reduced range of motion. A central effect of OA is wear-down of the articular cartilage that otherwise ensures smooth joint motion. Quantification of the cartilage breakdown is central in monitoring disease progression and therefore cartilage segmentation is required. Recent advances allow automatic cartilage segmentation with high accuracy in most cases. However, the automatic methods still fail in some problematic cases. For clinical studies, even if a few failing cases will be averaged out in the overall results, this reduces the mean accuracy and precision and thereby necessitates larger/longer studies. Since the severe OA cases are often most problematic for the automatic methods, there is even a risk that the quantification will introduce a bias in the results. Therefore, interactive inspection and correction of these problematic cases is desirable. For diagnosis on individuals, this is even more crucial since the diagnosis will otherwise simply fail. We introduce and evaluate a semi-automatic cartilage segmentation method combining an automatic pre-segmentation with an interactive step that allows inspection and correction. The automatic step consists of voxel classification based on supervised learning. The interactive step combines a watershed transformation of the original scan with the posterior probability map from the classification step at sub-voxel precision. We evaluate the method for the task of segmenting the tibial cartilage sheet from low-field magnetic resonance imaging (MRI) of knees. The evaluation shows that the combined method allows accurate and highly reproducible correction of the segmentation of even the worst cases in approximately ten minutes of interaction.
On the unsupervised analysis of domain-specific Chinese texts

PubMed Central

Deng, Ke; Bol, Peter K.; Li, Kate J.; Liu, Jun S.

2016-01-01

With the growing availability of digitized text data both publicly and privately, there is a great need for effective computational tools to automatically extract information from texts. Because the Chinese language differs most significantly from alphabet-based languages in not specifying word boundaries, most existing Chinese text-mining methods require a prespecified vocabulary and/or a large relevant training corpus, which may not be available in some applications. We introduce an unsupervised method, top-down word discovery and segmentation (TopWORDS), for simultaneously discovering and segmenting words and phrases from large volumes of unstructured Chinese texts, and propose ways to order discovered words and conduct higher-level context analyses. TopWORDS is particularly useful for mining online and domain-specific texts where the underlying vocabulary is unknown or the texts of interest differ significantly from available training corpora. When outputs from TopWORDS are fed into context analysis tools such as topic modeling, word embedding, and association pattern finding, the results are as good as or better than that from using outputs of a supervised segmentation method. PMID:27185919
Automatic Frequency Controller for Power Amplifiers Used in Bio-Implanted Applications: Issues and Challenges

PubMed Central

Hannan, Mahammad A.; Hussein, Hussein A.; Mutashar, Saad; Samad, Salina A.; Hussain, Aini

2014-01-01

With the development of communication technologies, the use of wireless systems in biomedical implanted devices has become very useful. Bio-implantable devices are electronic devices which are used for treatment and monitoring brain implants, pacemakers, cochlear implants, retinal implants and so on. The inductive coupling link is used to transmit power and data between the primary and secondary sides of the biomedical implanted system, in which efficient power amplifier is very much needed to ensure the best data transmission rates and low power losses. However, the efficiency of the implanted devices depends on the circuit design, controller, load variation, changes of radio frequency coil's mutual displacement and coupling coefficients. This paper provides a comprehensive survey on various power amplifier classes and their characteristics, efficiency and controller techniques that have been used in bio-implants. The automatic frequency controller used in biomedical implants such as gate drive switching control, closed loop power control, voltage controlled oscillator, capacitor control and microcontroller frequency control have been explained. Most of these techniques keep the resonance frequency stable in transcutaneous power transfer between the external coil and the coil implanted inside the body. Detailed information including carrier frequency, power efficiency, coils displacement, power consumption, supplied voltage and CMOS chip for the controllers techniques are investigated and summarized in the provided tables. From the rigorous review, it is observed that the existing automatic frequency controller technologies are more or less can capable of performing well in the implant devices; however, the systems are still not up to the mark. Accordingly, current challenges and problems of the typical automatic frequency controller techniques for power amplifiers are illustrated, with a brief suggestions and discussion section concerning the progress of
Text Mining of UU-ITE Implementation in Indonesia

NASA Astrophysics Data System (ADS)

Hakim, Lukmanul; Kusumasari, Tien F.; Lubis, Muharman

2018-04-01

At present, social media and networks act as one of the main platforms for sharing information, idea, thought and opinions. Many people share their knowledge and express their views on the specific topics or current hot issues that interest them. The social media texts have rich information about the complaints, comments, recommendation and suggestion as the automatic reaction or respond to government initiative or policy in order to overcome certain issues.This study examines the sentiment from netizensas part of citizen who has vocal sound about the implementation of UU ITE as the first cyberlaw in Indonesia as a means to identify the current tendency of citizen perception. To perform text mining techniques, this study used Twitter Rest API while R programming was utilized for the purpose of classification analysis based on hierarchical cluster.

The Wolfgang and Amadeus Automatic Photoelectric Telescopes. A ``Kleine-Nacht-Musik'' during the first five years of routine operation

NASA Astrophysics Data System (ADS)

Granzer, T.; Reegen, P.; Strassmeier, K. G.

2001-12-01

We present a summary of five years of continuous operation of the University of Vienna twin Automatic Photoelectric Telescopes (APTs) -- Wolfgang and Amadeus. These two telescopes are part of the Fairborn Observatory facility located in the Sonoran desert close to Washington Camp in southern Arizona. The detection and distinction procedure between weather-induced data-quality loss and systematic data-quality loss turned out to be a crucial task. Therefore, special emphasis is laid on the data quality monitoring tools developed throughout the years. Furthermore, we summarize the scientific highlights from the first five years of operation
Automatic query formulations in information retrieval.

PubMed

Salton, G; Buckley, C; Fox, E A

1983-07-01

Modern information retrieval systems are designed to supply relevant information in response to requests received from the user population. In most retrieval environments the search requests consist of keywords, or index terms, interrelated by appropriate Boolean operators. Since it is difficult for untrained users to generate effective Boolean search requests, trained search intermediaries are normally used to translate original statements of user need into useful Boolean search formulations. Methods are introduced in this study which reduce the role of the search intermediaries by making it possible to generate Boolean search formulations completely automatically from natural language statements provided by the system patrons. Frequency considerations are used automatically to generate appropriate term combinations as well as Boolean connectives relating the terms. Methods are covered to produce automatic query formulations both in a standard Boolean logic system, as well as in an extended Boolean system in which the strict interpretation of the connectives is relaxed. Experimental results are supplied to evaluate the effectiveness of the automatic query formulation process, and methods are described for applying the automatic query formulation process in practice.
Adding Automatic Evaluation to Interactive Virtual Labs

ERIC Educational Resources Information Center

Farias, Gonzalo; Muñoz de la Peña, David; Gómez-Estern, Fabio; De la Torre, Luis; Sánchez, Carlos; Dormido, Sebastián

2016-01-01

Automatic evaluation is a challenging field that has been addressed by the academic community in order to reduce the assessment workload. In this work we present a new element for the authoring tool Easy Java Simulations (EJS). This element, which is named automatic evaluation element (AEE), provides automatic evaluation to virtual and remote…
Prevalence Estimation of Protected Health Information in Swedish Clinical Text.

PubMed

Henriksson, Aron; Kvist, Maria; Dalianis, Hercules

2017-01-01

Obscuring protected health information (PHI) in the clinical text of health records facilitates the secondary use of healthcare data in a privacy-preserving manner. Although automatic de-identification of clinical text using machine learning holds much promise, little is known about the relative prevalence of PHI in different types of clinical text and whether there is a need for domain adaptation when learning predictive models from one particular domain and applying it to another. In this study, we address these questions by training a predictive model and using it to estimate the prevalence of PHI in clinical text written (1) in different clinical specialties, (2) in different types of notes (i.e., under different headings), and (3) by persons in different professional roles. It is demonstrated that the overall PHI density is 1.57%; however, substantial differences exist across domains.
What Automaticity Deficit? Activation of Lexical Information by Readers with Dyslexia in a Rapid Automatized Naming Stroop-Switch Task

ERIC Educational Resources Information Center

Jones, Manon W.; Snowling, Margaret J.; Moll, Kristina

2016-01-01

Reading fluency is often predicted by rapid automatized naming (RAN) speed, which as the name implies, measures the automaticity with which familiar stimuli (e.g., letters) can be retrieved and named. Readers with dyslexia are considered to have less "automatized" access to lexical information, reflected in longer RAN times compared with…
Level statistics of words: Finding keywords in literary texts and symbolic sequences

NASA Astrophysics Data System (ADS)

Carpena, P.; Bernaola-Galván, P.; Hackenberg, M.; Coronado, A. V.; Oliver, J. L.

2009-03-01

Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to extract automatically keywords in literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered (i.e., they self-attract each other), while irrelevant words are distributed randomly in the text. Since a reference corpus is not needed, our approach is especially suitable for single documents for which no a priori information is available. In addition, we show that our method works also in generic symbolic sequences (continuous texts without spaces), thus suggesting its general applicability.
Automatic Layout Design for Power Module

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ning, Puqi; Wang, Fei; Ngo, Khai

The layout of power modules is one of the most important elements in power module design, especially for high power densities, where couplings are increased. In this paper, an automatic design process using a genetic algorithm is presented. Some practical considerations are introduced in the optimization of the layout design of the module. This paper presents a process for automatic layout design for high power density modules. Detailed GA implementations are introduced both for outer loop and inner loop. As verified by a design example, the results of the automatic design process presented here are better than those from manualmore » design and also better than the results from a popular design software. This automatic design procedure could be a major step toward improving the overall performance of future layout design.« less
Reading Guided by Automated Graphical Representations: How Model-Based Text Visualizations Facilitate Learning in Reading Comprehension Tasks

ERIC Educational Resources Information Center

Pirnay-Dummer, Pablo; Ifenthaler, Dirk

2011-01-01

Our study integrates automated natural language-oriented assessment and analysis methodologies into feasible reading comprehension tasks. With the newly developed T-MITOCAR toolset, prose text can be automatically converted into an association net which has similarities to a concept map. The "text to graph" feature of the software is based on…
NASA automatic subject analysis technique for extracting retrievable multi-terms (NASA TERM) system

NASA Technical Reports Server (NTRS)

Kirschbaum, J.; Williamson, R. E.

1978-01-01

Current methods for information processing and retrieval used at the NASA Scientific and Technical Information Facility are reviewed. A more cost effective computer aided indexing system is proposed which automatically generates print terms (phrases) from the natural text. Satisfactory print terms can be generated in a primarily automatic manner to produce a thesaurus (NASA TERMS) which extends all the mappings presently applied by indexers, specifies the worth of each posting term in the thesaurus, and indicates the areas of use of the thesaurus entry phrase. These print terms enable the computer to determine which of several terms in a hierarchy is desirable and to differentiate ambiguous terms. Steps in the NASA TERMS algorithm are discussed and the processing of surrogate entry phrases is demonstrated using four previously manually indexed STAR abstracts for comparison. The simulation shows phrase isolation, text phrase reduction, NASA terms selection, and RECON display.
Wilderness ecology: a method of sampling and summarizing data for plant community classification.

Treesearch

Lewis F. Ohmann; Robert R. Ream

1971-01-01

Presents a flexible sampling scheme that researchers and land managers may use in surveying and classifying plant communities of forest lands. Includes methods, data sheets, and computer summarization printouts.
Standard Chinese: A Modular Approach. Student Text. Module 1: Orientation; Module 2: Biographic Information.

ERIC Educational Resources Information Center

Defense Language Inst., Monterey, CA.

Texts in spoken Standard Chinese were developed to improve and update Chinese materials to reflect current usage in Beijing and Taipei. The focus is on communicating in Chinese in practical situations, and the texts summarize and supplement tapes. The overall course is organized into 10 situational modules, student workbooks, and resource modules.…
A general graphical user interface for automatic reliability modeling

NASA Technical Reports Server (NTRS)

Liceaga, Carlos A.; Siewiorek, Daniel P.

1991-01-01

Reported here is a general Graphical User Interface (GUI) for automatic reliability modeling of Processor Memory Switch (PMS) structures using a Markov model. This GUI is based on a hierarchy of windows. One window has graphical editing capabilities for specifying the system's communication structure, hierarchy, reconfiguration capabilities, and requirements. Other windows have field texts, popup menus, and buttons for specifying parameters and selecting actions. An example application of the GUI is given.
Text Detection, Tracking and Recognition in Video: A Comprehensive Survey.

PubMed

Yin, Xu-Cheng; Zuo, Ze-Yu; Tian, Shu; Liu, Cheng-Lin

2016-04-14

Intelligent analysis of video data is currently in wide demand because video is a major source of sensory data in our lives. Text is a prominent and direct source of information in video, while recent surveys of text detection and recognition in imagery [1], [2] focus mainly on text extraction from scene images. Here, this paper presents a comprehensive survey of text detection, tracking and recognition in video with three major contributions. First, a generic framework is proposed for video text extraction that uniformly describes detection, tracking, recognition, and their relations and interactions. Second, within this framework, a variety of methods, systems and evaluation protocols of video text extraction are summarized, compared, and analyzed. Existing text tracking techniques, tracking based detection and recognition techniques are specifically highlighted. Third, related applications, prominent challenges, and future directions for video text extraction (especially from scene videos and web videos) are also thoroughly discussed.
Global image analysis to determine suitability for text-based image personalization

NASA Astrophysics Data System (ADS)

Ding, Hengzhou; Bala, Raja; Fan, Zhigang; Bouman, Charles A.; Allebach, Jan P.

2012-03-01

Lately, image personalization is becoming an interesting topic. Images with variable elements such as text usually appear much more appealing to the recipients. In this paper, we describe a method to pre-analyze the image and automatically suggest to the user the most suitable regions within an image for text-based personalization. The method is based on input gathered from experiments conducted with professional designers. It has been observed that regions that are spatially smooth and regions with existing text (e.g. signage, banners, etc.) are the best candidates for personalization. This gives rise to two sets of corresponding algorithms: one for identifying smooth areas, and one for locating text regions. Furthermore, based on the smooth and text regions found in the image, we derive an overall metric to rate the image in terms of its suitability for personalization (SFP).
Text Mining for Neuroscience

NASA Astrophysics Data System (ADS)

Tirupattur, Naveen; Lapish, Christopher C.; Mukhopadhyay, Snehasis

2011-06-01

Text mining, sometimes alternately referred to as text analytics, refers to the process of extracting high-quality knowledge from the analysis of textual data. Text mining has wide variety of applications in areas such as biomedical science, news analysis, and homeland security. In this paper, we describe an approach and some relatively small-scale experiments which apply text mining to neuroscience research literature to find novel associations among a diverse set of entities. Neuroscience is a discipline which encompasses an exceptionally wide range of experimental approaches and rapidly growing interest. This combination results in an overwhelmingly large and often diffuse literature which makes a comprehensive synthesis difficult. Understanding the relations or associations among the entities appearing in the literature not only improves the researchers current understanding of recent advances in their field, but also provides an important computational tool to formulate novel hypotheses and thereby assist in scientific discoveries. We describe a methodology to automatically mine the literature and form novel associations through direct analysis of published texts. The method first retrieves a set of documents from databases such as PubMed using a set of relevant domain terms. In the current study these terms yielded a set of documents ranging from 160,909 to 367,214 documents. Each document is then represented in a numerical vector form from which an Association Graph is computed which represents relationships between all pairs of domain terms, based on co-occurrence. Association graphs can then be subjected to various graph theoretic algorithms such as transitive closure and cycle (circuit) detection to derive additional information, and can also be visually presented to a human researcher for understanding. In this paper, we present three relatively small-scale problem-specific case studies to demonstrate that such an approach is very successful in
Strategic planning of developing automatic optical inspection (AOI) technologies in Taiwan

NASA Astrophysics Data System (ADS)

Fan, K. C.; Hsu, C.

2005-01-01

In most domestic hi-tech industries in Taiwan, the automatic optical inspection (AOI) equipment is mostly imported. In view of the required specifications, AOI consists of the integration of mechanical-electrical-optical-information technologies. In the past two decades, traditional industries have lost their competitiveness due to the low profit rate. It is possible to promote a new AOI industry in Taiwan through the integration of its strong background in mechatronic technology in positioning stages with the optical image processing techniques. The market requirements are huge not only in domestic need but also in global need. This is the main reason to promote the AOI research for the coming years in Taiwan. Focused industrial applications will be in IC, PCB, LCD, communication, and MEMS parts. This paper will analyze the domestic and global AOI equipment market, summarize the necessary fish bone technology diagrams, survey the actual industrial needs, and propose the strategic plan to be promoted in Taiwan.
Automatic programming of simulation models

NASA Technical Reports Server (NTRS)

Schroer, Bernard J.; Tseng, Fan T.; Zhang, Shou X.; Dwan, Wen S.

1990-01-01

The concepts of software engineering were used to improve the simulation modeling environment. Emphasis was placed on the application of an element of rapid prototyping, or automatic programming, to assist the modeler define the problem specification. Then, once the problem specification has been defined, an automatic code generator is used to write the simulation code. The following two domains were selected for evaluating the concepts of software engineering for discrete event simulation: manufacturing domain and a spacecraft countdown network sequence. The specific tasks were to: (1) define the software requirements for a graphical user interface to the Automatic Manufacturing Programming System (AMPS) system; (2) develop a graphical user interface for AMPS; and (3) compare the AMPS graphical interface with the AMPS interactive user interface.
Fast, axis-agnostic, dynamically summarized storage and retrieval for mass spectrometry data.

PubMed

Handy, Kyle; Rosen, Jebediah; Gillan, André; Smith, Rob

2017-01-01

Mass spectrometry, a popular technique for elucidating the molecular contents of experimental samples, creates data sets comprised of millions of three-dimensional (m/z, retention time, intensity) data points that correspond to the types and quantities of analyzed molecules. Open and commercial MS data formats are arranged by retention time, creating latency when accessing data across multiple m/z. Existing MS storage and retrieval methods have been developed to overcome the limitations of retention time-based data formats, but do not provide certain features such as dynamic summarization and storage and retrieval of point meta-data (such as signal cluster membership), precluding efficient viewing applications and certain data-processing approaches. This manuscript describes MzTree, a spatial database designed to provide real-time storage and retrieval of dynamically summarized standard and augmented MS data with fast performance in both m/z and RT directions. Performance is reported on real data with comparisons against related published retrieval systems.
Figure Text Extraction in Biomedical Literature

PubMed Central

Kim, Daehyun; Yu, Hong

2011-01-01

Background Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. Methodology We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. Results/Conclusions The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19
Figure text extraction in biomedical literature.

PubMed

Kim, Daehyun; Yu, Hong

2011-01-13

Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for text

Automatic mathematical modeling for space application

NASA Technical Reports Server (NTRS)

Wang, Caroline K.

1987-01-01

A methodology for automatic mathematical modeling is described. The major objective is to create a very friendly environment for engineers to design, maintain and verify their model and also automatically convert the mathematical model into FORTRAN code for conventional computation. A demonstration program was designed for modeling the Space Shuttle Main Engine simulation mathematical model called Propulsion System Automatic Modeling (PSAM). PSAM provides a very friendly and well organized environment for engineers to build a knowledge base for base equations and general information. PSAM contains an initial set of component process elements for the Space Shuttle Main Engine simulation and a questionnaire that allows the engineer to answer a set of questions to specify a particular model. PSAM is then able to automatically generate the model and the FORTRAN code. A future goal is to download the FORTRAN code to the VAX/VMS system for conventional computation.
49 CFR 236.824 - System, automatic block signal.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 49 Transportation 4 2010-10-01 2010-10-01 false System, automatic block signal. 236.824 Section... § 236.824 System, automatic block signal. A block signal system wherein the use of each block is governed by an automatic block signal, cab signal, or both. ...
Real-time automatic registration in optical surgical navigation

NASA Astrophysics Data System (ADS)

Lin, Qinyong; Yang, Rongqian; Cai, Ken; Si, Xuan; Chen, Xiuwen; Wu, Xiaoming

2016-05-01

An image-guided surgical navigation system requires the improvement of the patient-to-image registration time to enhance the convenience of the registration procedure. A critical step in achieving this aim is performing a fully automatic patient-to-image registration. This study reports on a design of custom fiducial markers and the performance of a real-time automatic patient-to-image registration method using these markers on the basis of an optical tracking system for rigid anatomy. The custom fiducial markers are designed to be automatically localized in both patient and image spaces. An automatic localization method is performed by registering a point cloud sampled from the three dimensional (3D) pedestal model surface of a fiducial marker to each pedestal of fiducial markers searched in image space. A head phantom is constructed to estimate the performance of the real-time automatic registration method under four fiducial configurations. The head phantom experimental results demonstrate that the real-time automatic registration method is more convenient, rapid, and accurate than the manual method. The time required for each registration is approximately 0.1 s. The automatic localization method precisely localizes the fiducial markers in image space. The averaged target registration error for the four configurations is approximately 0.7 mm. The automatic registration performance is independent of the positions relative to the tracking system and the movement of the patient during the operation.
46 CFR 63.25-1 - Small automatic auxiliary boilers.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 46 Shipping 2 2010-10-01 2010-10-01 false Small automatic auxiliary boilers. 63.25-1 Section 63.25... AUXILIARY BOILERS Requirements for Specific Types of Automatic Auxiliary Boilers § 63.25-1 Small automatic auxiliary boilers. Small automatic auxiliary boilers defined as having heat-input ratings of 400,000 Btu/hr...
Experimenting with Automatic Text-to-Diagram Conversion: A Novel Teaching Aid for the Blind People

ERIC Educational Resources Information Center

Mukherjee, Anirban; Garain, Utpal; Biswas, Arindam

2014-01-01

Diagram describing texts are integral part of science and engineering subjects including geometry, physics, engineering drawing, etc. In order to understand such text, one, at first, tries to draw or perceive the underlying diagram. For perception of the blind students such diagrams need to be drawn in some non-visual accessible form like tactile…
Automatic Fastening Large Structures: a New Approach

NASA Technical Reports Server (NTRS)

Lumley, D. F.

1985-01-01

The external tank (ET) intertank structure for the space shuttle, a 27.5 ft diameter 22.5 ft long externally stiffened mechanically fastened skin-stringer-frame structure, was a labor intensitive manual structure built on a modified Saturn tooling position. A new approach was developed based on half-section subassemblies. The heart of this manufacturing approach will be 33 ft high vertical automatic riveting system with a 28 ft rotary positioner coming on-line in mid 1985. The Automatic Riveting System incorporates many of the latest automatic riveting technologies. Key features include: vertical columns with two sets of independently operating CNC drill-riveting heads; capability of drill, insert and upset any one piece fastener up to 3/8 inch diameter including slugs without displacing the workpiece offset bucking ram with programmable rotation and deep retraction; vision system for automatic parts program re-synchronization and part edge margin control; and an automatic rivet selection/handling system.
AUTOMATIC COUNTER

DOEpatents

Robinson, H.P.

1960-06-01

An automatic counter of alpha particle tracks recorded by a sensitive emulsion of a photographic plate is described. The counter includes a source of mcdulated dark-field illumination for developing light flashes from the recorded particle tracks as the photographic plate is automatically scanned in narrow strips. Photoelectric means convert the light flashes to proportional current pulses for application to an electronic counting circuit. Photoelectric means are further provided for developing a phase reference signal from the photographic plate in such a manner that signals arising from particle tracks not parallel to the edge of the plate are out of phase with the reference signal. The counting circuit includes provision for rejecting the out-of-phase signals resulting from unoriented tracks as well as signals resulting from spurious marks on the plate such as scratches, dust or grain clumpings, etc. The output of the circuit is hence indicative only of the tracks that would be counted by a human operator.
Use of seatbelts in cars with automatic belts.

PubMed Central

Williams, A F; Wells, J K; Lund, A K; Teed, N J

1992-01-01

Use of seatbelts in late model cars with automatic or manual belt systems was observed in suburban Washington, DC, Chicago, Los Angeles, and Philadelphia. In cars with automatic two-point belt systems, the use of shoulder belts by drivers was substantially higher than in the same model cars with manual three-point belts. This finding was true in varying degrees whatever the type of automatic belt, including cars with detachable nonmotorized belts, cars with detachable motorized belts, and especially cars with nondetachable motorized belts. Most of these automatic shoulder belts systems include manual lap belts. Use of lap belts was lower in cars with automatic two-point belt systems than in the same model cars with manual three-point belts; precisely how much lower could not be reliably estimated in this survey. Use of shoulder and lap belts was slightly higher in General Motors cars with detachable automatic three-point belts compared with the same model cars with manual three-point belts; in Hondas there was no difference in the rates of use of manual three-point belts and the rates of use of automatic three-point belts. PMID:1561301
Automated Classification of Selected Data Elements from Free-text Diagnostic Reports for Clinical Research.

PubMed

Löpprich, Martin; Krauss, Felix; Ganzinger, Matthias; Senghas, Karsten; Riezler, Stefan; Knaup, Petra

2016-08-05

In the Multiple Myeloma clinical registry at Heidelberg University Hospital, most data are extracted from discharge letters. Our aim was to analyze if it is possible to make the manual documentation process more efficient by using methods of natural language processing for multiclass classification of free-text diagnostic reports to automatically document the diagnosis and state of disease of myeloma patients. The first objective was to create a corpus consisting of free-text diagnosis paragraphs of patients with multiple myeloma from German diagnostic reports, and its manual annotation of relevant data elements by documentation specialists. The second objective was to construct and evaluate a framework using different NLP methods to enable automatic multiclass classification of relevant data elements from free-text diagnostic reports. The main diagnoses paragraph was extracted from the clinical report of one third randomly selected patients of the multiple myeloma research database from Heidelberg University Hospital (in total 737 selected patients). An EDC system was setup and two data entry specialists performed independently a manual documentation of at least nine specific data elements for multiple myeloma characterization. Both data entries were compared and assessed by a third specialist and an annotated text corpus was created. A framework was constructed, consisting of a self-developed package to split multiple diagnosis sequences into several subsequences, four different preprocessing steps to normalize the input data and two classifiers: a maximum entropy classifier (MEC) and a support vector machine (SVM). In total 15 different pipelines were examined and assessed by a ten-fold cross-validation, reiterated 100 times. For quality indication the average error rate and the average F1-score were conducted. For significance testing the approximate randomization test was used. The created annotated corpus consists of 737 different diagnoses paragraphs with a
GPM V05 Gridded Text Products

NASA Technical Reports Server (NTRS)

Stocker, Erich Franz; Kelley, Owen

2017-01-01

This presentation will summarize the changes in the products for the GPM V05 reprocessing cycle. It will concentrate on discussing the gridded text product from the core satellite retrievals. However, all aspects of the GPROF GMI changes in this product are equally appropriate to the other two gridded text products. The GPM mission reprocessed its products in May of 2017 as part of a continuing improvement of precipitation retrievals. This lead to important improvement in the retrievals and therefore also necessitated reprocessing the gridded test products. The V05 GPROF changes not only improved the retrievals but substantially alerted the format and this compelled changes to the gridded text products. Especially important in this regard is the GPROF2017 (used in V05) change from reporting the fraction of the total precipitation rate that occurring as convection or in liquid phase. Instead, GPROF2017, and therefore V05 gridded text products, report the rate of convective precipitation in mm/hr. The GPROF2017 algorithm now reports the frozen precipitation rate in mm/hr rather than the fraction of total precipitation that is liquid. Because of the aim of the gridded text product is to remain simple the radar and combined results will also change in V05 to reflect this change in the GMI retrieval. The presentation provides an analysis of these changes as well as presenting a comparison with the swath products from which the hourly text grids were derived.
49 CFR 236.825 - System, automatic train control.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 49 Transportation 4 2010-10-01 2010-10-01 false System, automatic train control. 236.825 Section..., INSPECTION, MAINTENANCE, AND REPAIR OF SIGNAL AND TRAIN CONTROL SYSTEMS, DEVICES, AND APPLIANCES Definitions § 236.825 System, automatic train control. A system so arranged that its operation will automatically...
49 CFR 236.825 - System, automatic train control.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 49 Transportation 4 2011-10-01 2011-10-01 false System, automatic train control. 236.825 Section..., INSPECTION, MAINTENANCE, AND REPAIR OF SIGNAL AND TRAIN CONTROL SYSTEMS, DEVICES, AND APPLIANCES Definitions § 236.825 System, automatic train control. A system so arranged that its operation will automatically...
7 CFR 58.418 - Automatic cheese making equipment.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 7 Agriculture 3 2011-01-01 2011-01-01 false Automatic cheese making equipment. 58.418 Section 58.418 Agriculture Regulations of the Department of Agriculture (Continued) AGRICULTURAL MARKETING... Service 1 Equipment and Utensils § 58.418 Automatic cheese making equipment. (a) Automatic Curd Maker. The...
7 CFR 58.418 - Automatic cheese making equipment.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 7 Agriculture 3 2010-01-01 2010-01-01 false Automatic cheese making equipment. 58.418 Section 58.418 Agriculture Regulations of the Department of Agriculture (Continued) AGRICULTURAL MARKETING... Service 1 Equipment and Utensils § 58.418 Automatic cheese making equipment. (a) Automatic Curd Maker. The...
21 CFR 892.1900 - Automatic radiographic film processor.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
21 CFR 892.1900 - Automatic radiographic film processor.

Code of Federal Regulations, 2013 CFR

2013-04-01

... 21 Food and Drugs 8 2013-04-01 2013-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
21 CFR 892.1900 - Automatic radiographic film processor.

Code of Federal Regulations, 2014 CFR

2014-04-01

... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
21 CFR 892.1900 - Automatic radiographic film processor.

Code of Federal Regulations, 2012 CFR

2012-04-01

... 21 Food and Drugs 8 2012-04-01 2012-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
21 CFR 892.1900 - Automatic radiographic film processor.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Automatic radiographic film processor. 892.1900... (CONTINUED) MEDICAL DEVICES RADIOLOGY DEVICES Diagnostic Devices § 892.1900 Automatic radiographic film processor. (a) Identification. An automatic radiographic film processor is a device intended to be used to...
Automatic Query Formulations in Information Retrieval.

ERIC Educational Resources Information Center

Salton, G.; And Others

1983-01-01

Introduces methods designed to reduce role of search intermediaries by generating Boolean search formulations automatically using term frequency considerations from natural language statements provided by system patrons. Experimental results are supplied and methods are described for applying automatic query formulation process in practice.…

Automatic Dictionary Expansion Using Non-parallel Corpora

NASA Astrophysics Data System (ADS)

Rapp, Reinhard; Zock, Michael

Automatically generating bilingual dictionaries from parallel, manually translated texts is a well established technique that works well in practice. However, parallel texts are a scarce resource. Therefore, it is desirable also to be able to generate dictionaries from pairs of comparable monolingual corpora. For most languages, such corpora are much easier to acquire, and often in considerably larger quantities. In this paper we present the implementation of an algorithm which exploits such corpora with good success. Based on the assumption that the co-occurrence patterns between different languages are related, it expands a small base lexicon. For improved performance, it also realizes a novel interlingua approach. That is, if corpora of more than two languages are available, the translations from one language to another can be determined not only directly, but also indirectly via a pivot language.
Empirical Analysis of Exploiting Review Helpfulness for Extractive Summarization of Online Reviews

ERIC Educational Resources Information Center

Xiong, Wenting; Litman, Diane

2014-01-01

We propose a novel unsupervised extractive approach for summarizing online reviews by exploiting review helpfulness ratings. In addition to using the helpfulness ratings for review-level filtering, we suggest using them as the supervision of a topic model for sentence-level content scoring. The proposed method is metadata-driven, requiring no…
Summarizing Social Disparities in Health

PubMed Central

Asada, Yukiko; Yoshida, Yoko; Whipp, Alyce M

2013-01-01

Context Reporting on health disparities is fundamental for meeting the goal of reducing health disparities. One often overlooked challenge is determining the best way to report those disparities associated with multiple attributes such as income, education, sex, and race/ethnicity. This article proposes an analytical approach to summarizing social disparities in health, and we demonstrate its empirical application by comparing the degrees and patterns of health disparities in all fifty states and the District of Columbia (DC). Methods We used the 2009 American Community Survey, and our measure of health was functional limitation. For each state and DC, we calculated the overall disparity and attribute-specific disparities for income, education, sex, and race/ethnicity in functional limitation. Along with the state rankings of these health disparities, we developed health disparity profiles according to the attribute making the largest contribution to overall disparity in each state. Findings Our results show a general lack of consistency in the rankings of overall and attribute-specific disparities in functional limitation across the states. Wyoming has the smallest overall disparity and West Virginia the largest. In each of the four attribute-specific health disparity rankings, however, most of the best- and worst-performing states in regard to overall health disparity are not consistently good or bad. Our analysis suggests the following three disparity profiles across states: (1) the largest contribution from race/ethnicity (thirty-four states), (2) roughly equal contributions of race/ethnicity and socioeconomic factor(s) (ten states), and (3) the largest contribution from socioeconomic factor(s) (seven states). Conclusions Our proposed approach offers policy-relevant health disparity information in a comparable and interpretable manner, and currently publicly available data support its application. We hope this approach will spark discussion regarding how best
Effects of Transfer to Real-World Subject Area Materials from Training in Graphic Organizers and Summarizing on Developmental College Readers' Comprehension of the Compare/Contrast Text Structure in Science Expository Text.

ERIC Educational Resources Information Center

Balajthy, Ernest; Weisberg, Renee

To determine whether less able readers could use the strategies they had been taught, a study investigated the transfer effects of training in the use of graphic organizers and summary writing on readers' recognition of the compare/contrast text structure. Subjects, 70 freshmen at a western New York state college of liberal arts and sciences in a…
49 CFR 236.826 - System, automatic train stop.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 49 Transportation 4 2011-10-01 2011-10-01 false System, automatic train stop. 236.826 Section 236..., INSPECTION, MAINTENANCE, AND REPAIR OF SIGNAL AND TRAIN CONTROL SYSTEMS, DEVICES, AND APPLIANCES Definitions § 236.826 System, automatic train stop. A system so arranged that its operation will automatically...
The Use of Automatic Indexing for Authority Control.

ERIC Educational Resources Information Center

Dillon, Martin; And Others

1981-01-01

Uses an experimental system for authority control on a collection of bibliographic records to demonstrate the resemblance between thesaurus-based automatic indexing and automatic authority control. Details of the automatic indexing system are given, results discussed, and the benefits of the resemblance examined. Included are a rules appendix and…
49 CFR 236.826 - System, automatic train stop.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 49 Transportation 4 2010-10-01 2010-10-01 false System, automatic train stop. 236.826 Section 236..., INSPECTION, MAINTENANCE, AND REPAIR OF SIGNAL AND TRAIN CONTROL SYSTEMS, DEVICES, AND APPLIANCES Definitions § 236.826 System, automatic train stop. A system so arranged that its operation will automatically...
30 CFR 77.1401 - Automatic controls and brakes.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Automatic controls and brakes. 77.1401 Section... MINES Personnel Hoisting § 77.1401 Automatic controls and brakes. Hoists and elevators shall be equipped with overspeed, overwind, and automatic stop controls and with brakes capable of stopping the elevator...
30 CFR 56.19006 - Automatic hoist braking devices.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Automatic hoist braking devices. 56.19006 Section 56.19006 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND... Hoisting Hoists § 56.19006 Automatic hoist braking devices. Automatic hoists shall be provided with devices...
30 CFR 57.19006 - Automatic hoist braking devices.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Automatic hoist braking devices. 57.19006 Section 57.19006 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND... Hoisting Hoists § 57.19006 Automatic hoist braking devices. Automatic hoists shall be provided with devices...
Improved Automatically Locking/Unlocking Orthotic Knee Joint

NASA Technical Reports Server (NTRS)

Weddendorf, Bruce

1995-01-01

Proposed orthotic knee joint improved version of one described in "Automatically Locking/Unlocking Orthotic Knee Joint" (MFS-28633). Locks automatically upon initial application of radial force (wearer's weight) and unlocks automatically, but only when all loads (radial force and bending) relieved. Joints lock whenever wearer applies weight to knee at any joint angle between full extension and 45 degree bend. Both devices offer increased safety and convenience relative to conventional orthotic knee joints.
Automatic Data Processing, 4-1. Military Curriculum Materials for Vocational and Technical Education.

ERIC Educational Resources Information Center

Army Ordnance Center and School, Aberdeen Proving Ground, MD.

These two texts and student workbook for a secondary/postsecondary-level correspondence course in automatic data processing comprise one of a number of military-developed curriculum packages selected for adaptation to vocational instruction and curriculum development in a civilian setting. The purpose stated for the individualized, self-paced…
Expanding Literacy for Learners with Intellectual Disabilities: The Role of Supported eText

ERIC Educational Resources Information Center

Douglas, Karen H.; Ayres, Kevin M.; Langone, John; Bell, Virginia; Meade, Cara

2009-01-01

A series of single-subject experiments were conducted to evaluate the effects of presentational, translational, illustrative, instructional, and summarizing supports on the reading and listening comprehension of students with moderate intellectual disabilities. The specific eText supports under investigation included digitized voice and…
How automatic is the hand's automatic pilot? Evidence from dual-task studies.

PubMed

McIntosh, Robert D; Mulroue, Amy; Brockmole, James R

2010-10-01

The ability to correct reaching movements for changes in target position has been described as the hand's 'automatic pilot'. These corrections are preconscious and occur by default in double-step reaching tasks, even if the goal is to react to the target jump in some other way, for instance by stopping the movement (STOP instruction). Nonetheless, corrections are strongly modulated by conscious intention: participants make more corrections when asked to follow the target (GO instruction) and can suppress them when explicitly asked not to follow the target (NOGO instruction). We studied the influence of a cognitively demanding (auditory 1-back) task upon correction behaviour under GO, STOP and NOGO instructions. Correction rates under the STOP instruction were unaffected by cognitive load, consistent with the assumption that they reflect the default behaviour of the automatic pilot. Correction rates under the GO instruction were also unaffected, suggesting that minimal cognitive resources are required to enhance online correction. By contrast, cognitive load impeded the ability to suppress online corrections under the NOGO instruction. These data reveal a constitutional bias in the automatic pilot system: intentional suppression of the default correction behaviour is cognitively demanding, but enhancement towards greater responsiveness is seemingly effortless.
MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data.

PubMed

Yavorska, Olena O; Burgess, Stephen

2017-12-01

MendelianRandomization is a software package for the R open-source software environment that performs Mendelian randomization analyses using summarized data. The core functionality is to implement the inverse-variance weighted, MR-Egger and weighted median methods for multiple genetic variants. Several options are available to the user, such as the use of robust regression, fixed- or random-effects models and the penalization of weights for genetic variants with heterogeneous causal estimates. Extensions to these methods, such as allowing for variants to be correlated, can be chosen if appropriate. Graphical commands allow summarized data to be displayed in an interactive graph, or the plotting of causal estimates from multiple methods, for comparison. Although the main method of data entry is directly by the user, there is also an option for allowing summarized data to be incorporated from the PhenoScanner database of genotype-phenotype associations. We hope to develop this feature in future versions of the package. The R software environment is available for download from [https://www.r-project.org/]. The MendelianRandomization package can be downloaded from the Comprehensive R Archive Network (CRAN) within R, or directly from [https://cran.r-project.org/web/packages/MendelianRandomization/]. Both R and the MendelianRandomization package are released under GNU General Public Licenses (GPL-2|GPL-3). © The Author 2017. Published by Oxford University Press on behalf of the International Epidemiological Association.
[Research on automatic external defibrillator based on DSP].

PubMed

Jing, Jun; Ding, Jingyan; Zhang, Wei; Hong, Wenxue

2012-10-01

Electrical defibrillation is the most effective way to treat the ventricular tachycardia (VT) and ventricular fibrillation (VF). An automatic external defibrillator based on DSP is introduced in this paper. The whole design consists of the signal collection module, the microprocessor controlingl module, the display module, the defibrillation module and the automatic recognition algorithm for VF and non VF, etc. This automatic external defibrillator has achieved goals such as ECG signal real-time acquisition, ECG wave synchronous display, data delivering to U disk and automatic defibrillate when shockable rhythm appears, etc.
Precision about the automatic emotional brain.

PubMed

Vuilleumier, Patrik

2015-01-01

The question of automaticity in emotion processing has been debated under different perspectives in recent years. Satisfying answers to this issue will require a better definition of automaticity in terms of relevant behavioral phenomena, ecological conditions of occurrence, and a more precise mechanistic account of the underlying neural circuits.
Automatic segmentation of stereoelectroencephalography (SEEG) electrodes post-implantation considering bending.

PubMed

Granados, Alejandro; Vakharia, Vejay; Rodionov, Roman; Schweiger, Martin; Vos, Sjoerd B; O'Keeffe, Aidan G; Li, Kuo; Wu, Chengyuan; Miserocchi, Anna; McEvoy, Andrew W; Clarkson, Matthew J; Duncan, John S; Sparks, Rachel; Ourselin, Sébastien

2018-06-01

The accurate and automatic localisation of SEEG electrodes is crucial for determining the location of epileptic seizure onset. We propose an algorithm for the automatic segmentation of electrode bolts and contacts that accounts for electrode bending in relation to regional brain anatomy. Co-registered post-implantation CT, pre-implantation MRI, and brain parcellation images are used to create regions of interest to automatically segment bolts and contacts. Contact search strategy is based on the direction of the bolt with distance and angle constraints, in addition to post-processing steps that assign remaining contacts and predict contact position. We measured the accuracy of contact position, bolt angle, and anatomical region at the tip of the electrode in 23 post-SEEG cases comprising two different surgical approaches when placing a guiding stylet close to and far from target point. Local and global bending are computed when modelling electrodes as elastic rods. Our approach executed on average in 36.17 s with a sensitivity of 98.81% and a positive predictive value (PPV) of 95.01%. Compared to manual segmentation, the position of contacts had a mean absolute error of 0.38 mm and the mean bolt angle difference of [Formula: see text] resulted in a mean displacement error of 0.68 mm at the tip of the electrode. Anatomical regions at the tip of the electrode were in strong concordance with those selected manually by neurosurgeons, [Formula: see text], with average distance between regions of 0.82 mm when in disagreement. Our approach performed equally in two surgical approaches regardless of the amount of electrode bending. We present a method robust to electrode bending that can accurately segment contact positions and bolt orientation. The techniques presented in this paper will allow further characterisation of bending within different brain regions.
Automatic comparison of striation marks and automatic classification of shoe prints

NASA Astrophysics Data System (ADS)

Geradts, Zeno J.; Keijzer, Jan; Keereweer, Isaac

1995-09-01

A database for toolmarks (named TRAX) and a database for footwear outsole designs (named REBEZO) have been developed on a PC. The databases are filled with video-images and administrative data about the toolmarks and the footwear designs. An algorithm for the automatic comparison of the digitized striation patterns has been developed for TRAX. The algorithm appears to work well for deep and complete striation marks and will be implemented in TRAX. For REBEZO some efforts have been made to the automatic classification of outsole patterns. The algorithm first segments the shoeprofile. Fourier-features are selected for the separate elements and are classified with a neural network. In future developments information on invariant moments of the shape and rotation angle will be included in the neural network.
Automatic sample Dewar for MX beam-line

DOE Office of Scientific and Technical Information (OSTI.GOV)

Charignon, T.; Tanchon, J.; Trollier, T.

2014-01-29

It is very common for crystals of large biological macromolecules to show considerable variation in quality of their diffraction. In order to increase the number of samples that are tested for diffraction quality before any full data collections at the ESRF*, an automatic sample Dewar has been implemented. Conception and performances of the Dewar are reported in this paper. The automatic sample Dewar has 240 samples capability with automatic loading/unloading ports. The storing Dewar is capable to work with robots and it can be integrated in a full automatic MX** beam-line. The samples are positioned in the front of themore » loading/unloading ports with and automatic rotating plate. A view port has been implemented for data matrix camera reading on each sample loaded in the Dewar. At last, the Dewar is insulated with polyurethane foam that keeps the liquid nitrogen consumption below 1.6 L/h. At last, the static insulation also makes vacuum equipment and maintenance unnecessary. This Dewar will be useful for increasing the number of samples tested in synchrotrons.« less

A Graph Summarization Algorithm Based on RFID Logistics

NASA Astrophysics Data System (ADS)

Sun, Yan; Hu, Kongfa; Lu, Zhipeng; Zhao, Li; Chen, Ling

Radio Frequency Identification (RFID) applications are set to play an essential role in object tracking and supply chain management systems. The volume of data generated by a typical RFID application will be enormous as each item will generate a complete history of all the individual locations that it occupied at every point in time. The movement trails of such RFID data form gigantic commodity flowgraph representing the locations and durations of the path stages traversed by each item. In this paper, we use graph to construct a warehouse of RFID commodity flows, and introduce a database-style operation to summarize graphs, which produces a summary graph by grouping nodes based on user-selected node attributes, further allows users to control the hierarchy of summaries. It can cut down the size of graphs, and provide convenience for users to study just on the shrunk graph which they interested. Through extensive experiments, we demonstrate the effectiveness and efficiency of the proposed method.
ANNUAL REPORT-AUTOMATIC INDEXING AND ABSTRACTING.

ERIC Educational Resources Information Center

Lockheed Missiles and Space Co., Palo Alto, CA. Electronic Sciences Lab.

THE INVESTIGATION IS CONCERNED WITH THE DEVELOPMENT OF AUTOMATIC INDEXING, ABSTRACTING, AND EXTRACTING SYSTEMS. BASIC INVESTIGATIONS IN ENGLISH MORPHOLOGY, PHONETICS, AND SYNTAX ARE PURSUED AS NECESSARY MEANS TO THIS END. IN THE FIRST SECTION THE THEORY AND DESIGN OF THE "SENTENCE DICTIONARY" EXPERIMENT IN AUTOMATIC EXTRACTION IS OUTLINED. SOME OF…
14 CFR 29.1329 - Automatic pilot system.

Code of Federal Regulations, 2011 CFR

2011-01-01

... is automatic synchronization, each system must have a means to readily indicate to the pilot the... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
14 CFR 29.1329 - Automatic pilot system.

Code of Federal Regulations, 2010 CFR

2010-01-01

... is automatic synchronization, each system must have a means to readily indicate to the pilot the... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
14 CFR 29.1329 - Automatic pilot system.

Code of Federal Regulations, 2014 CFR

2014-01-01

... is automatic synchronization, each system must have a means to readily indicate to the pilot the... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
14 CFR 29.1329 - Automatic pilot system.

Code of Federal Regulations, 2013 CFR

2013-01-01

... is automatic synchronization, each system must have a means to readily indicate to the pilot the... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
14 CFR 29.1329 - Automatic pilot system.

Code of Federal Regulations, 2012 CFR

2012-01-01

... is automatic synchronization, each system must have a means to readily indicate to the pilot the... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
Automatic Scaffolding and Measurement of Concept Mapping for EFL Students to Write Summaries

ERIC Educational Resources Information Center

Yang, Yu-Fen

2015-01-01

An incorrect concept map may obstruct a student's comprehension when writing summaries if they are unable to grasp key concepts when reading texts. The purpose of this study was to investigate the effects of automatic scaffolding and measurement of three-layer concept maps on improving university students' writing summaries. The automatic…
14 CFR 27.1329 - Automatic pilot system.

Code of Federal Regulations, 2014 CFR

2014-01-01

... automatic synchronization, each system must have a means to readily indicate to the pilot the alignment of... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
14 CFR 27.1329 - Automatic pilot system.

Code of Federal Regulations, 2010 CFR

2010-01-01

... automatic synchronization, each system must have a means to readily indicate to the pilot the alignment of... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
14 CFR 27.1329 - Automatic pilot system.

Code of Federal Regulations, 2013 CFR

2013-01-01

... automatic synchronization, each system must have a means to readily indicate to the pilot the alignment of... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
14 CFR 27.1329 - Automatic pilot system.

Code of Federal Regulations, 2011 CFR

2011-01-01

... automatic synchronization, each system must have a means to readily indicate to the pilot the alignment of... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
14 CFR 27.1329 - Automatic pilot system.

Code of Federal Regulations, 2012 CFR

2012-01-01

... automatic synchronization, each system must have a means to readily indicate to the pilot the alignment of... that corrective action begins within a reasonable period of time. (e) If the automatic pilot integrates...
Automatic Surveying For Hazard Prevention On Glacier De GiÉtro, Switzerland

NASA Astrophysics Data System (ADS)

Bauder, A.; Funk, M.; Bösch, H.

Breaking off of large ice masses from the steep tongue of Glacier de Giétro may endanger a nearby reservoir. Such a falling ice mass could cause an oversplash over the dam at timeof a nearly filled lake. For this reason the glacier has been monitored intensively since the 1960's. An automatic theodolite was installed three years ago. It allows continuous displacement measurements of several targets on the glacier in order to detect short-term acceleration events. The installation includes a telemetric data transmission, which provides for immediate recognition of hazardous situations and early alarming. The obtained data were analysed in terms of precision and performance of the applied method. A high temporal resolution was gained. The comparison with traditional ob- servations shows clearly the potential of modern instruments to improve monitoring schems. We summarize the main results of this study and discuss the applicability of a modern motorized theodolite with target tracking and recognition ability for moni- toring purposes.
Automatic Speech Recognition from Neural Signals: A Focused Review.

PubMed

Herff, Christian; Schultz, Tanja

2016-01-01

Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to either loud environments, bothering bystanders or incapabilities to produce speech (i.e., patients suffering from locked-in syndrome). For these reasons it would be highly desirable to not speak but to simply envision oneself to say words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefor better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques used from neural signals, we discuss the Brain-to-text system.
Channel Measurements for Automatic Vehicle Monitoring Systems

DOT National Transportation Integrated Search

1974-03-01

Co-channel and adjacent channel electromagnetic interference measurements were conducted on the Sierra Research Corp. and the Chicago Transit Authority automatic vehicle monitoring systems. These measurements were made to determine if the automatic v...
10 CFR 431.132 - Definitions concerning automatic commercial ice makers.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 10 Energy 3 2014-01-01 2014-01-01 false Definitions concerning automatic commercial ice makers... CERTAIN COMMERCIAL AND INDUSTRIAL EQUIPMENT Automatic Commercial Ice Makers § 431.132 Definitions concerning automatic commercial ice makers. Automatic commercial ice maker means a factory-made assembly (not...
10 CFR 431.132 - Definitions concerning automatic commercial ice makers.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 10 Energy 3 2012-01-01 2012-01-01 false Definitions concerning automatic commercial ice makers... CERTAIN COMMERCIAL AND INDUSTRIAL EQUIPMENT Automatic Commercial Ice Makers § 431.132 Definitions concerning automatic commercial ice makers. Automatic commercial ice maker means a factory-made assembly (not...
10 CFR 431.132 - Definitions concerning automatic commercial ice makers.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 10 Energy 3 2010-01-01 2010-01-01 false Definitions concerning automatic commercial ice makers... CERTAIN COMMERCIAL AND INDUSTRIAL EQUIPMENT Automatic Commercial Ice Makers § 431.132 Definitions concerning automatic commercial ice makers. Automatic commercial ice maker means a factory-made assembly (not...
10 CFR 431.132 - Definitions concerning automatic commercial ice makers.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 10 Energy 3 2011-01-01 2011-01-01 false Definitions concerning automatic commercial ice makers... CERTAIN COMMERCIAL AND INDUSTRIAL EQUIPMENT Automatic Commercial Ice Makers § 431.132 Definitions concerning automatic commercial ice makers. Automatic commercial ice maker means a factory-made assembly (not...

10 CFR 431.132 - Definitions concerning automatic commercial ice makers.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 10 Energy 3 2013-01-01 2013-01-01 false Definitions concerning automatic commercial ice makers... CERTAIN COMMERCIAL AND INDUSTRIAL EQUIPMENT Automatic Commercial Ice Makers § 431.132 Definitions concerning automatic commercial ice makers. Automatic commercial ice maker means a factory-made assembly (not...
Automatic stereotyping against people with schizophrenia, schizoaffective and affective disorders

PubMed Central

Rüsch, Nicolas; Corrigan, Patrick W.; Todd, Andrew R.; Bodenhausen, Galen V.

2010-01-01

Similar to members of the public, people with mental illness may exhibit general negative automatic prejudice against their own group. However, it is unclear whether more specific negative stereotypes are automatically activated among diagnosed individuals and how such automatic stereotyping may be related to self-reported attitudes and emotional reactions. We therefore studied automatically activated reactions toward mental illness among 85 people with schizophrenia, schizoaffective or affective disorders as well as among 50 members of the general public, using a Lexical Decision Task to measure automatic stereotyping. Deliberately endorsed attitudes and emotional reactions were assessed by self-report. Independent of diagnosis, people with mental illness showed less negative automatic stereotyping than did members of the public. Among members of the public, stronger automatic stereotyping was associated with more self-reported shame about a potential mental illness and more anger toward stigmatized individuals. Reduced automatic stereotyping in the diagnosed group suggests that people with mental illness might not entirely internalize societal stigma. Among members of the public, automatic stereotyping predicted negative emotional reactions to people with mental illness. Initiatives to reduce the impact of public stigma and internalized stigma should take automatic stereotyping and related emotional aspects of stigma into account. PMID:20843560
Improving text comprehension: scaffolding adolescents into strategic reading.

PubMed

Ukrainetz, Teresa A

2015-02-01

Understanding and learning from academic texts involves purposeful, strategic reading. Adolescent readers, particularly poor readers, benefit from explicit instruction in text comprehension strategies, such as text preview, summarization, and comprehension monitoring, as part of a comprehensive reading program. However, strategies are difficult to teach within subject area lessons where content instruction must take primacy. Speech-language pathologists (SLPs) have the expertise and service delivery options to support middle and high school students in learning to use comprehension strategies in their academic reading and learning. This article presents the research evidence on what strategies to teach and how best to teach them, including the use of explicit instruction, spoken interactions around text, cognitive modeling, peer learning, classroom connections, and disciplinary literacy. The article focuses on how to move comprehension strategies from being teaching tools of the SLP to becoming learning tools of the student. SLPs can provide the instruction and support needed for students to learn and apply of this important component of academic reading. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Validation of automatic segmentation of ribs for NTCP modeling.

PubMed

Stam, Barbara; Peulen, Heike; Rossi, Maddalena M G; Belderbos, José S A; Sonke, Jan-Jakob

2016-03-01

Determination of a dose-effect relation for rib fractures in a large patient group has been limited by the time consuming manual delineation of ribs. Automatic segmentation could facilitate such an analysis. We determine the accuracy of automatic rib segmentation in the context of normal tissue complication probability modeling (NTCP). Forty-one patients with stage I/II non-small cell lung cancer treated with SBRT to 54 Gy in 3 fractions were selected. Using the 4DCT derived mid-ventilation planning CT, all ribs were manually contoured and automatically segmented. Accuracy of segmentation was assessed using volumetric, shape and dosimetric measures. Manual and automatic dosimetric parameters Dx and EUD were tested for equivalence using the Two One-Sided T-test (TOST), and assessed for agreement using Bland-Altman analysis. NTCP models based on manual and automatic segmentation were compared. Automatic segmentation was comparable with the manual delineation in radial direction, but larger near the costal cartilage and vertebrae. Manual and automatic Dx and EUD were significantly equivalent. The Bland-Altman analysis showed good agreement. The two NTCP models were very similar. Automatic rib segmentation was significantly equivalent to manual delineation and can be used for NTCP modeling in a large patient group. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Automatic Contour Tracking in Ultrasound Images

ERIC Educational Resources Information Center

Li, Min; Kambhamettu, Chandra; Stone, Maureen

2005-01-01

In this paper, a new automatic contour tracking system, EdgeTrak, for the ultrasound image sequences of human tongue is presented. The images are produced by a head and transducer support system (HATS). The noise and unrelated high-contrast edges in ultrasound images make it very difficult to automatically detect the correct tongue surfaces. In…
An automatic markerless registration method for neurosurgical robotics based on an optical camera.

PubMed

Meng, Fanle; Zhai, Fangwen; Zeng, Bowei; Ding, Hui; Wang, Guangzhi

2018-02-01

Current markerless registration methods for neurosurgical robotics use the facial surface to match the robot space with the image space, and acquisition of the facial surface usually requires manual interaction and constrains the patient to a supine position. To overcome these drawbacks, we propose a registration method that is automatic and does not constrain patient position. An optical camera attached to the robot end effector captures images around the patient's head from multiple views. Then, high coverage of the head surface is reconstructed from the images through multi-view stereo vision. Since the acquired head surface point cloud contains color information, a specific mark that is manually drawn on the patient's head prior to the capture procedure can be extracted to automatically accomplish coarse registration rather than using facial anatomic landmarks. Then, fine registration is achieved by registering the high coverage of the head surface without relying solely on the facial region, thus eliminating patient position constraints. The head surface was acquired by the camera with a good repeatability accuracy. The average target registration error of 8 different patient positions measured with targets inside a head phantom was [Formula: see text], while the mean surface registration error was [Formula: see text]. The method proposed in this paper achieves automatic markerless registration in multiple patient positions and guarantees registration accuracy inside the head. This method provides a new approach for establishing the spatial relationship between the image space and the robot space.
Automatic detection of Parkinson's disease in running speech spoken in three different languages.

PubMed

Orozco-Arroyave, J R; Hönig, F; Arias-Londoño, J D; Vargas-Bonilla, J F; Daqrouq, K; Skodda, S; Rusz, J; Nöth, E

2016-01-01

The aim of this study is the analysis of continuous speech signals of people with Parkinson's disease (PD) considering recordings in different languages (Spanish, German, and Czech). A method for the characterization of the speech signals, based on the automatic segmentation of utterances into voiced and unvoiced frames, is addressed here. The energy content of the unvoiced sounds is modeled using 12 Mel-frequency cepstral coefficients and 25 bands scaled according to the Bark scale. Four speech tasks comprising isolated words, rapid repetition of the syllables /pa/-/ta/-/ka/, sentences, and read texts are evaluated. The method proves to be more accurate than classical approaches in the automatic classification of speech of people with PD and healthy controls. The accuracies range from 85% to 99% depending on the language and the speech task. Cross-language experiments are also performed confirming the robustness and generalization capability of the method, with accuracies ranging from 60% to 99%. This work comprises a step forward for the development of computer aided tools for the automatic assessment of dysarthric speech signals in multiple languages.
Design of Automatic Extraction Algorithm of Knowledge Points for MOOCs

PubMed Central

Chen, Haijian; Han, Dongmei; Zhao, Lina

2015-01-01

In recent years, Massive Open Online Courses (MOOCs) are very popular among college students and have a powerful impact on academic institutions. In the MOOCs environment, knowledge discovery and knowledge sharing are very important, which currently are often achieved by ontology techniques. In building ontology, automatic extraction technology is crucial. Because the general methods of text mining algorithm do not have obvious effect on online course, we designed automatic extracting course knowledge points (AECKP) algorithm for online course. It includes document classification, Chinese word segmentation, and POS tagging for each document. Vector Space Model (VSM) is used to calculate similarity and design the weight to optimize the TF-IDF algorithm output values, and the higher scores will be selected as knowledge points. Course documents of “C programming language” are selected for the experiment in this study. The results show that the proposed approach can achieve satisfactory accuracy rate and recall rate. PMID:26448738
Automatically processed alpha-track radon monitor

DOEpatents

Langner, Jr., G. Harold

1993-01-01

An automatically processed alpha-track radon monitor is provided which includes a housing having an aperture allowing radon entry, and a filter that excludes the entry of radon daughters into the housing. A flexible track registration material is located within the housing that records alpha-particle emissions from the decay of radon and radon daughters inside the housing. The flexible track registration material is capable of being spliced such that the registration material from a plurality of monitors can be spliced into a single strip to facilitate automatic processing of the registration material from the plurality of monitors. A process for the automatic counting of radon registered by a radon monitor is also provided.
Automatically processed alpha-track radon monitor

DOEpatents

Langner, G.H. Jr.

1993-01-12

An automatically processed alpha-track radon monitor is provided which includes a housing having an aperture allowing radon entry, and a filter that excludes the entry of radon daughters into the housing. A flexible track registration material is located within the housing that records alpha-particle emissions from the decay of radon and radon daughters inside the housing. The flexible track registration material is capable of being spliced such that the registration material from a plurality of monitors can be spliced into a single strip to facilitate automatic processing of the registration material from the plurality of monitors. A process for the automatic counting of radon registered by a radon monitor is also provided.
Text mining in livestock animal science: introducing the potential of text mining to animal sciences.

PubMed

Sahadevan, S; Hofmann-Apitius, M; Schellander, K; Tesfaye, D; Fluck, J; Friedrich, C M

2012-10-01

In biological research, establishing the prior art by searching and collecting information already present in the domain has equal importance as the experiments done. To obtain a complete overview about the relevant knowledge, researchers mainly rely on 2 major information sources: i) various biological databases and ii) scientific publications in the field. The major difference between the 2 information sources is that information from databases is available, typically well structured and condensed. The information content in scientific literature is vastly unstructured; that is, dispersed among the many different sections of scientific text. The traditional method of information extraction from scientific literature occurs by generating a list of relevant publications in the field of interest and manually scanning these texts for relevant information, which is very time consuming. It is more than likely that in using this "classical" approach the researcher misses some relevant information mentioned in the literature or has to go through biological databases to extract further information. Text mining and named entity recognition methods have already been used in human genomics and related fields as a solution to this problem. These methods can process and extract information from large volumes of scientific text. Text mining is defined as the automatic extraction of previously unknown and potentially useful information from text. Named entity recognition (NER) is defined as the method of identifying named entities (names of real world objects; for example, gene/protein names, drugs, enzymes) in text. In animal sciences, text mining and related methods have been briefly used in murine genomics and associated fields, leaving behind other fields of animal sciences, such as livestock genomics. The aim of this work was to develop an information retrieval platform in the livestock domain focusing on livestock publications and the recognition of relevant data from
Automatic Lamp and Fan Control Based on Microcontroller

NASA Astrophysics Data System (ADS)

Widyaningrum, V. T.; Pramudita, Y. D.

2018-01-01

In general, automation can be described as a process following pre-determined sequential steps with a little or without any human exertion. Automation is provided with the use of various sensors suitable to observe the production processes, actuators and different techniques and devices. In this research, the automation system developed is an automatic lamp and an automatic fan on the smart home. Both of these systems will be processed using an Arduino Mega 2560 microcontroller. A microcontroller is used to obtain values of physical conditions through sensors connected to it. In the automatic lamp system required sensors to detect the light of the LDR (Light Dependent Resistor) sensor. While the automatic fan system required sensors to detect the temperature of the DHT11 sensor. In tests that have been done lamps and fans can work properly. The lamp can turn on automatically when the light begins to darken, and the lamp can also turn off automatically when the light begins to bright again. In addition, it can concluded also that the readings of LDR sensors are placed outside the room is different from the readings of LDR sensors placed in the room. This is because the light intensity received by the existing LDR sensor in the room is blocked by the wall of the house or by other objects. Then for the fan, it can also turn on automatically when the temperature is greater than 25°C, and the fan speed can also be adjusted. The fan may also turn off automatically when the temperature is less than equal to 25°C.
Practical automatic Arabic license plate recognition system

NASA Astrophysics Data System (ADS)

Mohammad, Khader; Agaian, Sos; Saleh, Hani

2011-02-01

Since 1970's, the need of an automatic license plate recognition system, sometimes referred as Automatic License Plate Recognition system, has been increasing. A license plate recognition system is an automatic system that is able to recognize a license plate number, extracted from image sensors. In specific, Automatic License Plate Recognition systems are being used in conjunction with various transportation systems in application areas such as law enforcement (e.g. speed limit enforcement) and commercial usages such as parking enforcement and automatic toll payment private and public entrances, border control, theft and vandalism control. Vehicle license plate recognition has been intensively studied in many countries. Due to the different types of license plates being used, the requirement of an automatic license plate recognition system is different for each country. [License plate detection using cluster run length smoothing algorithm ].Generally, an automatic license plate localization and recognition system is made up of three modules; license plate localization, character segmentation and optical character recognition modules. This paper presents an Arabic license plate recognition system that is insensitive to character size, font, shape and orientation with extremely high accuracy rate. The proposed system is based on a combination of enhancement, license plate localization, morphological processing, and feature vector extraction using the Haar transform. The performance of the system is fast due to classification of alphabet and numerals based on the license plate organization. Experimental results for license plates of two different Arab countries show an average of 99 % successful license plate localization and recognition in a total of more than 20 different images captured from a complex outdoor environment. The results run times takes less time compared to conventional and many states of art methods.
[Study on Intelligent Automatic Tracking Radiation Protection Curtain].

PubMed

Zhao, Longyang; Han, Jindong; Ou, Minjian; Chen, Jinlong

2015-09-01

In order to overcome the shortcomings of traditional X-ray inspection taking passive protection mode, this paper combines the automatic control technology, puts forward a kind of active protection X-ray equipment. The device of automatic detection of patients receiving X-ray irradiation part, intelligent adjustment in patients and shooting device between automatic tracking radiation protection device height. The device has the advantages of automatic adjustment, anti-radiation device, reduce the height of non-irradiated area X-ray radiation and improve the work efficiency. Testing by the professional organization, the device can decrease more than 90% of X-ray dose for patients with non-irradiated area.
Automatic Clustering Using FSDE-Forced Strategy Differential Evolution

NASA Astrophysics Data System (ADS)

Yasid, A.

2018-01-01

Clustering analysis is important in datamining for unsupervised data, cause no adequate prior knowledge. One of the important tasks is defining the number of clusters without user involvement that is known as automatic clustering. This study intends on acquiring cluster number automatically utilizing forced strategy differential evolution (AC-FSDE). Two mutation parameters, namely: constant parameter and variable parameter are employed to boost differential evolution performance. Four well-known benchmark datasets were used to evaluate the algorithm. Moreover, the result is compared with other state of the art automatic clustering methods. The experiment results evidence that AC-FSDE is better or competitive with other existing automatic clustering algorithm.
Finding Meaning: Sense Inventories for Improved Word Sense Disambiguation

ERIC Educational Resources Information Center

Brown, Susan Windisch

2010-01-01

The deep semantic understanding necessary for complex natural language processing tasks, such as automatic question-answering or text summarization, would benefit from highly accurate word sense disambiguation (WSD). This dissertation investigates what makes an appropriate and effective sense inventory for WSD. Drawing on theories and…
Automatic Brake Slack Adjusters on Transit Buses

DOT National Transportation Integrated Search

1985-04-01

The purpose of this report is to provide transit bus agencies with current information on automatic brake slack adjusters. Rather than a comprehensive statistical survey, this report is intended to provide an overview of the states of automatic slack...
Automatic safety rod for reactors

DOEpatents

Germer, John H.

1988-01-01

An automatic safety rod for a nuclear reactor containing neutron absorbing material and designed to be inserted into a reactor core after a loss-of-core flow. Actuation is based upon either a sudden decrease in core pressure drop or the pressure drop decreases below a predetermined minimum value. The automatic control rod includes a pressure regulating device whereby a controlled decrease in operating pressure due to reduced coolant flow does not cause the rod to drop into the core.
Automatic image enhancement based on multi-scale image decomposition

NASA Astrophysics Data System (ADS)

Feng, Lu; Wu, Zhuangzhi; Pei, Luo; Long, Xiong

2014-01-01

In image processing and computational photography, automatic image enhancement is one of the long-range objectives. Recently the automatic image enhancement methods not only take account of the globe semantics, like correct color hue and brightness imbalances, but also the local content of the image, such as human face and sky of landscape. In this paper we describe a new scheme for automatic image enhancement that considers both global semantics and local content of image. Our automatic image enhancement method employs the multi-scale edge-aware image decomposition approach to detect the underexposure regions and enhance the detail of the salient content. The experiment results demonstrate the effectiveness of our approach compared to existing automatic enhancement methods.
Automatic Processing of Metallurgical Abstracts for the Purpose of Information Retrieval. Final Report.

ERIC Educational Resources Information Center

Melton, Jessica S.

Objectives of this project were to develop and test a method for automatically processing the text of abstracts for a document retrieval system. The test corpus consisted of 768 abstracts from the metallurgical section of Chemical Abstracts (CA). The system, based on a subject indexing rational, had two components: (1) a stored dictionary of words…

Observed use of automatic seat belts in 1987 cars.

PubMed

Williams, A F; Wells, J K; Lund, A K; Teed, N

1989-10-01

Usage of the automatic belt systems supplied by six large-volume automobile manufacturers to meet the federal requirements for automatic restraints were observed in suburban Washington, D.C., Chicago, Los Angeles, and Philadelphia. The different belt systems studied were: Ford and Toyota (motorized, nondetachable automatic shoulder belt), Nissan (motorized, detachable shoulder belt), VW and Chrysler (nonmotorized, detachable shoulder belt), and GM (nonmotorized detachable lap and shoulder belt). Use of automatic belts was significantly greater than manual belt use in otherwise comparable late-model cars for all manufacturers except Chrysler; in Chrysler cars, automatic belt use was significantly lower than manual belt use. The automatic shoulder belts provided by Ford, Nissan, Toyota, and VW increased use rates to about 90%. Because use rates were lower in Ford cars with manual belts, their increase was greater. GM cars had the smallest increase in use rates; however, lap belt use was highest in GM cars. The other manufacturers supply knee bolsters to supplement shoulder belt protection; all--except VW--also provide manual lap belts, which were used by about half of those who used the automatic shoulder belt. The results indicate that some manufacturers have been more successful than others in providing automatic belt systems that result in high use that, in turn, will mean fewer deaths and injuries in those cars.
RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials.

PubMed

Marshall, Iain J; Kuiper, Joël; Wallace, Byron C

2016-01-01

To develop and evaluate RobotReviewer, a machine learning (ML) system that automatically assesses bias in clinical trials. From a (PDF-formatted) trial report, the system should determine risks of bias for the domains defined by the Cochrane Risk of Bias (RoB) tool, and extract supporting text for these judgments. We algorithmically annotated 12,808 trial PDFs using data from the Cochrane Database of Systematic Reviews (CDSR). Trials were labeled as being at low or high/unclear risk of bias for each domain, and sentences were labeled as being informative or not. This dataset was used to train a multi-task ML model. We estimated the accuracy of ML judgments versus humans by comparing trials with two or more independent RoB assessments in the CDSR. Twenty blinded experienced reviewers rated the relevance of supporting text, comparing ML output with equivalent (human-extracted) text from the CDSR. By retrieving the top 3 candidate sentences per document (top3 recall), the best ML text was rated more relevant than text from the CDSR, but not significantly (60.4% ML text rated 'highly relevant' v 56.5% of text from reviews; difference +3.9%, [-3.2% to +10.9%]). Model RoB judgments were less accurate than those from published reviews, though the difference was <10% (overall accuracy 71.0% with ML v 78.3% with CDSR). Risk of bias assessment may be automated with reasonable accuracy. Automatically identified text supporting bias assessment is of equal quality to the manually identified text in the CDSR. This technology could substantially reduce reviewer workload and expedite evidence syntheses. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Automatic rapid attachable warhead section

DOEpatents

Trennel, A.J.

1994-05-10

Disclosed are a method and apparatus for automatically selecting warheads or reentry vehicles from a storage area containing a plurality of types of warheads or reentry vehicles, automatically selecting weapon carriers from a storage area containing at least one type of weapon carrier, manipulating and aligning the selected warheads or reentry vehicles and weapon carriers, and automatically coupling the warheads or reentry vehicles with the weapon carriers such that coupling of improperly selected warheads or reentry vehicles with weapon carriers is inhibited. Such inhibition enhances safety of operations and is achieved by a number of means including computer control of the process of selection and coupling and use of connectorless interfaces capable of assuring that improperly selected items will be rejected or rendered inoperable prior to coupling. Also disclosed are a method and apparatus wherein the stated principles pertaining to selection, coupling and inhibition are extended to apply to any item-to-be-carried and any carrying assembly. 10 figures.
Automatic rapid attachable warhead section

DOEpatents

Trennel, Anthony J.

1994-05-10

Disclosed are a method and apparatus for (1) automatically selecting warheads or reentry vehicles from a storage area containing a plurality of types of warheads or reentry vehicles, (2) automatically selecting weapon carriers from a storage area containing at least one type of weapon carrier, (3) manipulating and aligning the selected warheads or reentry vehicles and weapon carriers, and (4) automatically coupling the warheads or reentry vehicles with the weapon carriers such that coupling of improperly selected warheads or reentry vehicles with weapon carriers is inhibited. Such inhibition enhances safety of operations and is achieved by a number of means including computer control of the process of selection and coupling and use of connectorless interfaces capable of assuring that improperly selected items will be rejected or rendered inoperable prior to coupling. Also disclosed are a method and apparatus wherein the stated principles pertaining to selection, coupling and inhibition are extended to apply to any item-to-be-carried and any carrying assembly.
Automatic, semi-automatic and manual validation of urban drainage data.

PubMed

Branisavljević, N; Prodanović, D; Pavlović, D

2010-01-01

Advances in sensor technology and the possibility of automated long distance data transmission have made continuous measurements the preferable way of monitoring urban drainage processes. Usually, the collected data have to be processed by an expert in order to detect and mark the wrong data, remove them and replace them with interpolated data. In general, the first step in detecting the wrong, anomaly data is called the data quality assessment or data validation. Data validation consists of three parts: data preparation, validation scores generation and scores interpretation. This paper will present the overall framework for the data quality improvement system, suitable for automatic, semi-automatic or manual operation. The first two steps of the validation process are explained in more detail, using several validation methods on the same set of real-case data from the Belgrade sewer system. The final part of the validation process, which is the scores interpretation, needs to be further investigated on the developed system.
AUTOMATIC MASS SPECTROMETER

DOEpatents

Hanson, M.L.; Tabor, C.D. Jr.

1961-12-01

A mass spectrometer for analyzing the components of a gas is designed which is capable of continuous automatic operation such as analysis of samples of process gas from a continuous production system where the gas content may be changing. (AEC)
Automatic Gain Control in Compact Spectrometers.

PubMed

Protopopov, Vladimir

2016-03-01

An image intensifier installed in the optical path of a compact spectrometer may act not only as a fast gating unit, which is widely used for time-resolved measurements, but also as a variable attenuator-amplifier in a continuous wave mode. This opens the possibility of an automatic gain control, a new feature in spectroscopy. With it, the user is relieved from the necessity to manually adjust signal level at a certain value that it is done automatically by means of an electronic feedback loop. It is even more important that automatic gain control is done without changing exposure time, which is an additional benefit in time-resolved experiments. The concept, algorithm, design considerations, and experimental results are presented. © The Author(s) 2016.
30 CFR 75.523-3 - Automatic emergency-parking brakes.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Automatic emergency-parking brakes. 75.523-3...-3 Automatic emergency-parking brakes. (a) Except for personnel carriers, rubber-tired, self... with automatic emergency-parking brakes in accordance with the following schedule. (1) On and after May...
30 CFR 75.523-3 - Automatic emergency-parking brakes.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Automatic emergency-parking brakes. 75.523-3...-3 Automatic emergency-parking brakes. (a) Except for personnel carriers, rubber-tired, self... with automatic emergency-parking brakes in accordance with the following schedule. (1) On and after May...
Utilizing Marzano's Summarizing and Note Taking Strategies on Seventh Grade Students' Mathematics Performance

ERIC Educational Resources Information Center

Jeanmarie-Gardner, Charmaine

2013-01-01

A quasi-experimental research study was conducted that investigated the academic impact of utilizing Marzano's summarizing and note taking strategies on mathematic achievement. A sample of seventh graders from a middle school located on Long Island's North Shore was tested to determine whether significant differences existed in mathematic test…
Grinding Parts For Automatic Welding

NASA Technical Reports Server (NTRS)

Burley, Richard K.; Hoult, William S.

1989-01-01

Rollers guide grinding tool along prospective welding path. Skatelike fixture holds rotary grinder or file for machining large-diameter rings or ring segments in preparation for welding. Operator grasps handles to push rolling fixture along part. Rollers maintain precise dimensional relationship so grinding wheel cuts precise depth. Fixture-mounted grinder machines surface to quality sufficient for automatic welding; manual welding with attendant variations and distortion not necessary. Developed to enable automatic welding of parts, manual welding of which resulted in weld bead permeated with microscopic fissures.
21 CFR 211.68 - Automatic, mechanical, and electronic equipment.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 21 Food and Drugs 4 2011-04-01 2011-04-01 false Automatic, mechanical, and electronic equipment. 211.68 Section 211.68 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN... Equipment § 211.68 Automatic, mechanical, and electronic equipment. (a) Automatic, mechanical, or electronic...
21 CFR 211.68 - Automatic, mechanical, and electronic equipment.

Code of Federal Regulations, 2012 CFR

2012-04-01

... 21 Food and Drugs 4 2012-04-01 2012-04-01 false Automatic, mechanical, and electronic equipment. 211.68 Section 211.68 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN... Equipment § 211.68 Automatic, mechanical, and electronic equipment. (a) Automatic, mechanical, or electronic...
30 CFR 75.1404 - Automatic brakes; speed reduction gear.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Automatic brakes; speed reduction gear. 75.1404... Automatic brakes; speed reduction gear. [Statutory Provisions] Each locomotive and haulage car used in an... permit automatic brakes, locomotives and haulage cars shall be subject to speed reduction gear, or other...
30 CFR 75.1404 - Automatic brakes; speed reduction gear.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Automatic brakes; speed reduction gear. 75.1404... Automatic brakes; speed reduction gear. [Statutory Provisions] Each locomotive and haulage car used in an... permit automatic brakes, locomotives and haulage cars shall be subject to speed reduction gear, or other...
Text as a Supplement to Speech in Young and Older Adults a)

PubMed Central

Krull, Vidya; Humes, Larry E.

2015-01-01

Objective The purpose of this experiment was to quantify the contribution of visual text to auditory speech recognition in background noise. Specifically, we tested the hypothesis that partially accurate visual text from an automatic speech recognizer could be used successfully to supplement speech understanding in difficult listening conditions in older adults, with normal or impaired hearing. Our working hypotheses were based on what is known regarding audiovisual speech perception in the elderly from speechreading literature. We hypothesized that: 1) combining auditory and visual text information will result in improved recognition accuracy compared to auditory or visual text information alone; 2) benefit from supplementing speech with visual text (auditory and visual enhancement) in young adults will be greater than that in older adults; and 3) individual differences in performance on perceptual measures would be associated with cognitive abilities. Design Fifteen young adults with normal hearing, fifteen older adults with normal hearing, and fifteen older adults with hearing loss participated in this study. All participants completed sentence recognition tasks in auditory-only, text-only, and combined auditory-text conditions. The auditory sentence stimuli were spectrally shaped to restore audibility for the older participants with impaired hearing. All participants also completed various cognitive measures, including measures of working memory, processing speed, verbal comprehension, perceptual and cognitive speed, processing efficiency, inhibition, and the ability to form wholes from parts. Group effects were examined for each of the perceptual and cognitive measures. Audiovisual benefit was calculated relative to performance on auditory-only and visual-text only conditions. Finally, the relationship between perceptual measures and other independent measures were examined using principal-component factor analyses, followed by regression analyses. Results
MARZ: Manual and automatic redshifting software

NASA Astrophysics Data System (ADS)

Hinton, S. R.; Davis, Tamara M.; Lidman, C.; Glazebrook, K.; Lewis, G. F.

2016-04-01

The Australian Dark Energy Survey (OzDES) is a 100-night spectroscopic survey underway on the Anglo-Australian Telescope using the fibre-fed 2-degree-field (2dF) spectrograph. We have developed a new redshifting application MARZ with greater usability, flexibility, and the capacity to analyse a wider range of object types than the RUNZ software package previously used for redshifting spectra from 2dF. MARZ is an open-source, client-based, Javascript web-application which provides an intuitive interface and powerful automatic matching capabilities on spectra generated from the AAOmega spectrograph to produce high quality spectroscopic redshift measurements. The software can be run interactively or via the command line, and is easily adaptable to other instruments and pipelines if conforming to the current FITS file standard is not possible. Behind the scenes, a modified version of the AUTOZ cross-correlation algorithm is used to match input spectra against a variety of stellar and galaxy templates, and automatic matching performance for OzDES spectra has increased from 54% (RUNZ) to 91% (MARZ). Spectra not matched correctly by the automatic algorithm can be easily redshifted manually by cycling automatic results, manual template comparison, or marking spectral features.
Automatic retinal interest evaluation system (ARIES).

PubMed

Yin, Fengshou; Wong, Damon Wing Kee; Yow, Ai Ping; Lee, Beng Hai; Quan, Ying; Zhang, Zhuo; Gopalakrishnan, Kavitha; Li, Ruoying; Liu, Jiang

2014-01-01

In recent years, there has been increasing interest in the use of automatic computer-based systems for the detection of eye diseases such as glaucoma, age-related macular degeneration and diabetic retinopathy. However, in practice, retinal image quality is a big concern as automatic systems without consideration of degraded image quality will likely generate unreliable results. In this paper, an automatic retinal image quality assessment system (ARIES) is introduced to assess both image quality of the whole image and focal regions of interest. ARIES achieves 99.54% accuracy in distinguishing fundus images from other types of images through a retinal image identification step in a dataset of 35342 images. The system employs high level image quality measures (HIQM) to perform image quality assessment, and achieves areas under curve (AUCs) of 0.958 and 0.987 for whole image and optic disk region respectively in a testing dataset of 370 images. ARIES acts as a form of automatic quality control which ensures good quality images are used for processing, and can also be used to alert operators of poor quality images at the time of acquisition.
Shaping electromagnetic waves using software-automatically-designed metasurfaces.

PubMed

Zhang, Qian; Wan, Xiang; Liu, Shuo; Yuan Yin, Jia; Zhang, Lei; Jun Cui, Tie

2017-06-15

We present a fully digital procedure of designing reflective coding metasurfaces to shape reflected electromagnetic waves. The design procedure is completely automatic, controlled by a personal computer. In details, the macro coding units of metasurface are automatically divided into several types (e.g. two types for 1-bit coding, four types for 2-bit coding, etc.), and each type of the macro coding units is formed by discretely random arrangement of micro coding units. By combining an optimization algorithm and commercial electromagnetic software, the digital patterns of the macro coding units are optimized to possess constant phase difference for the reflected waves. The apertures of the designed reflective metasurfaces are formed by arranging the macro coding units with certain coding sequence. To experimentally verify the performance, a coding metasurface is fabricated by automatically designing two digital 1-bit unit cells, which are arranged in array to constitute a periodic coding metasurface to generate the required four-beam radiations with specific directions. Two complicated functional metasurfaces with circularly- and elliptically-shaped radiation beams are realized by automatically designing 4-bit macro coding units, showing excellent performance of the automatic designs by software. The proposed method provides a smart tool to realize various functional devices and systems automatically.
Automatic indexing of scanned documents: a layout-based approach

NASA Astrophysics Data System (ADS)

Esser, Daniel; Schuster, Daniel; Muthmann, Klemens; Berger, Michael; Schill, Alexander

2012-01-01

Archiving official written documents such as invoices, reminders and account statements in business and private area gets more and more important. Creating appropriate index entries for document archives like sender's name, creation date or document number is a tedious manual work. We present a novel approach to handle automatic indexing of documents based on generic positional extraction of index terms. For this purpose we apply the knowledge of document templates stored in a common full text search index to find index positions that were successfully extracted in the past.

Automatic soldering machine

NASA Technical Reports Server (NTRS)

Stein, J. A.

1974-01-01

Fully-automatic tube-joint soldering machine can be used to make leakproof joints in aluminum tubes of 3/16 to 2 in. in diameter. Machine consists of temperature-control unit, heater transformer and heater head, vibrator, and associated circuitry controls, and indicators.
Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery

PubMed Central

Gonzalez, Graciela H.; Tahsin, Tasnia; Goodale, Britton C.; Greene, Anna C.

2016-01-01

Precision medicine will revolutionize the way we treat and prevent disease. A major barrier to the implementation of precision medicine that clinicians and translational scientists face is understanding the underlying mechanisms of disease. We are starting to address this challenge through automatic approaches for information extraction, representation and analysis. Recent advances in text and data mining have been applied to a broad spectrum of key biomedical questions in genomics, pharmacogenomics and other fields. We present an overview of the fundamental methods for text and data mining, as well as recent advances and emerging applications toward precision medicine. PMID:26420781
46 CFR 171.118 - Automatic ventilators and side ports.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 46 Shipping 7 2011-10-01 2011-10-01 false Automatic ventilators and side ports. 171.118 Section... SPECIAL RULES PERTAINING TO VESSELS CARRYING PASSENGERS Openings in the Side of a Vessel Below the Bulkhead or Weather Deck § 171.118 Automatic ventilators and side ports. (a) An automatic ventilator must...
46 CFR 171.118 - Automatic ventilators and side ports.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 46 Shipping 7 2010-10-01 2010-10-01 false Automatic ventilators and side ports. 171.118 Section... SPECIAL RULES PERTAINING TO VESSELS CARRYING PASSENGERS Openings in the Side of a Vessel Below the Bulkhead or Weather Deck § 171.118 Automatic ventilators and side ports. (a) An automatic ventilator must...
46 CFR 171.118 - Automatic ventilators and side ports.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 46 Shipping 7 2014-10-01 2014-10-01 false Automatic ventilators and side ports. 171.118 Section... SPECIAL RULES PERTAINING TO VESSELS CARRYING PASSENGERS Openings in the Side of a Vessel Below the Bulkhead or Weather Deck § 171.118 Automatic ventilators and side ports. (a) An automatic ventilator must...
46 CFR 171.118 - Automatic ventilators and side ports.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 46 Shipping 7 2012-10-01 2012-10-01 false Automatic ventilators and side ports. 171.118 Section... SPECIAL RULES PERTAINING TO VESSELS CARRYING PASSENGERS Openings in the Side of a Vessel Below the Bulkhead or Weather Deck § 171.118 Automatic ventilators and side ports. (a) An automatic ventilator must...
46 CFR 171.118 - Automatic ventilators and side ports.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 46 Shipping 7 2013-10-01 2013-10-01 false Automatic ventilators and side ports. 171.118 Section... SPECIAL RULES PERTAINING TO VESSELS CARRYING PASSENGERS Openings in the Side of a Vessel Below the Bulkhead or Weather Deck § 171.118 Automatic ventilators and side ports. (a) An automatic ventilator must...
Ion Channel ElectroPhysiology Ontology (ICEPO) - a case study of text mining assisted ontology development.

PubMed

Elayavilli, Ravikumar Komandur; Liu, Hongfang

2016-01-01

Computational modeling of biological cascades is of great interest to quantitative biologists. Biomedical text has been a rich source for quantitative information. Gathering quantitative parameters and values from biomedical text is one significant challenge in the early steps of computational modeling as it involves huge manual effort. While automatically extracting such quantitative information from bio-medical text may offer some relief, lack of ontological representation for a subdomain serves as impedance in normalizing textual extractions to a standard representation. This may render textual extractions less meaningful to the domain experts. In this work, we propose a rule-based approach to automatically extract relations involving quantitative data from biomedical text describing ion channel electrophysiology. We further translated the quantitative assertions extracted through text mining to a formal representation that may help in constructing ontology for ion channel events using a rule based approach. We have developed Ion Channel ElectroPhysiology Ontology (ICEPO) by integrating the information represented in closely related ontologies such as, Cell Physiology Ontology (CPO), and Cardiac Electro Physiology Ontology (CPEO) and the knowledge provided by domain experts. The rule-based system achieved an overall F-measure of 68.93% in extracting the quantitative data assertions system on an independently annotated blind data set. We further made an initial attempt in formalizing the quantitative data assertions extracted from the biomedical text into a formal representation that offers potential to facilitate the integration of text mining into ontological workflow, a novel aspect of this study. This work is a case study where we created a platform that provides formal interaction between ontology development and text mining. We have achieved partial success in extracting quantitative assertions from the biomedical text and formalizing them in ontological
Assimilating Text-Mining & Bio-Informatics Tools to Analyze Cellulase structures

NASA Astrophysics Data System (ADS)

Satyasree, K. P. N. V., Dr; Lalitha Kumari, B., Dr; Jyotsna Devi, K. S. N. V.; Choudri, S. M. Roy; Pratap Joshi, K.

2017-08-01

Text-mining is one of the best potential way of automatically extracting information from the huge biological literature. To exploit its prospective, the knowledge encrypted in the text should be converted to some semantic representation such as entities and relations, which could be analyzed by machines. But large-scale practical systems for this purpose are rare. But text mining could be helpful for generating or validating predictions. Cellulases have abundant applications in various industries. Cellulose degrading enzymes are cellulases and the same producing bacteria - Bacillus subtilis & fungus Pseudomonas putida were isolated from top soil of Guntur Dt. A.P. India. Absolute cultures were conserved on potato dextrose agar medium for molecular studies. In this paper, we presented how well the text mining concepts can be used to analyze cellulase producing bacteria and fungi, their comparative structures are also studied with the aid of well-establised, high quality standard bioinformatic tools such as Bioedit, Swissport, Protparam, EMBOSSwin with which a complete data on Cellulases like structure, constituents of the enzyme has been obtained.
Mobile text messaging for health: a systematic review of reviews.

PubMed

Hall, Amanda K; Cole-Lewis, Heather; Bernhardt, Jay M

2015-03-18

The aim of this systematic review of reviews is to identify mobile text-messaging interventions designed for health improvement and behavior change and to derive recommendations for practice. We have compiled and reviewed existing systematic research reviews and meta-analyses to organize and summarize the text-messaging intervention evidence base, identify best-practice recommendations based on findings from multiple reviews, and explore implications for future research. Our review found that the majority of published text-messaging interventions were effective when addressing diabetes self-management, weight loss, physical activity, smoking cessation, and medication adherence for antiretroviral therapy. However, we found limited evidence across the population of studies and reviews to inform recommended intervention characteristics. Although strong evidence supports the value of integrating text-messaging interventions into public health practice, additional research is needed to establish longer-term intervention effects, identify recommended intervention characteristics, and explore issues of cost-effectiveness.
Mobile Text Messaging for Health: A Systematic Review of Reviews

PubMed Central

Hall, Amanda K.; Cole-Lewis, Heather; Bernhardt, Jay M.

2015-01-01

The aim of this systematic review of reviews is to identify mobile text-messaging interventions designed for health improvement and behavior change and to derive recommendations for practice. We have compiled and reviewed existing systematic research reviews and meta-analyses to organize and summarize the text-messaging intervention evidence base, identify best-practice recommendations based on findings from multiple reviews, and explore implications for future research. Our review found that the majority of published text-messaging interventions were effective when addressing diabetes self-management, weight loss, physical activity, smoking cessation, and medication adherence for antiretroviral therapy. However, we found limited evidence across the population of studies and reviews to inform recommended intervention characteristics. Although strong evidence supports the value of integrating text-messaging interventions into public health practice, additional research is needed to establish longer-term intervention effects, identify recommended intervention characteristics, and explore issues of cost-effectiveness. PMID:25785892
33 CFR 401.20 - Automatic Identification System.

Code of Federal Regulations, 2010 CFR

2010-07-01

...' maritime Differential Global Positioning System radiobeacon services; or (7) The use of a temporary unit... Identification System. (a) Each of the following vessels must use an Automatic Identification System (AIS... 33 Navigation and Navigable Waters 3 2010-07-01 2010-07-01 false Automatic Identification System...
Populating the Semantic Web by Macro-reading Internet Text

NASA Astrophysics Data System (ADS)

Mitchell, Tom M.; Betteridge, Justin; Carlson, Andrew; Hruschka, Estevam; Wang, Richard

A key question regarding the future of the semantic web is "how will we acquire structured information to populate the semantic web on a vast scale?" One approach is to enter this information manually. A second approach is to take advantage of pre-existing databases, and to develop common ontologies, publishing standards, and reward systems to make this data widely accessible. We consider here a third approach: developing software that automatically extracts structured information from unstructured text present on the web. We also describe preliminary results demonstrating that machine learning algorithms can learn to extract tens of thousands of facts to populate a diverse ontology, with imperfect but reasonably good accuracy.
Inferring Group Processes from Computer-Mediated Affective Text Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schryver, Jack C; Begoli, Edmon; Jose, Ajith

2011-02-01

Political communications in the form of unstructured text convey rich connotative meaning that can reveal underlying group social processes. Previous research has focused on sentiment analysis at the document level, but we extend this analysis to sub-document levels through a detailed analysis of affective relationships between entities extracted from a document. Instead of pure sentiment analysis, which is just positive or negative, we explore nuances of affective meaning in 22 affect categories. Our affect propagation algorithm automatically calculates and displays extracted affective relationships among entities in graphical form in our prototype (TEAMSTER), starting with seed lists of affect terms. Severalmore » useful metrics are defined to infer underlying group processes by aggregating affective relationships discovered in a text. Our approach has been validated with annotated documents from the MPQA corpus, achieving a performance gain of 74% over comparable random guessers.« less
THE APPLICATION OF ENGLISH-WORD MORPHOLOGY TO AUTOMATIC INDEXING AND EXTRACTING. ANNUAL SUMMARY REPORT.

ERIC Educational Resources Information Center

DOLBY, J.L.; AND OTHERS

THE STUDY IS CONCERNED WITH THE LINGUISTIC PROBLEM INVOLVED IN TEXT COMPRESSION--EXTRACTING, INDEXING, AND THE AUTOMATIC CREATION OF SPECIAL-PURPOSE CITATION DICTIONARIES. IN SPITE OF EARLY SUCCESS IN USING LARGE-SCALE COMPUTERS TO AUTOMATE CERTAIN HUMAN TASKS, THESE PROBLEMS REMAIN AMONG THE MOST DIFFICULT TO SOLVE. ESSENTIALLY, THE PROBLEM IS TO…
10 CFR 1045.38 - Automatic declassification prohibition.

Code of Federal Regulations, 2012 CFR

2012-01-01

... Automatic declassification prohibition. (a) Documents containing RD and FRD remain classified until a... Energy Act, no date or event for automatic declassification ever applies to RD and FRD documents, even if such documents also contain NSI. (c) E.O. 12958 acknowledges that RD and FRD are exempt from all...
10 CFR 1045.38 - Automatic declassification prohibition.

Code of Federal Regulations, 2013 CFR

2013-01-01

... Automatic declassification prohibition. (a) Documents containing RD and FRD remain classified until a... Energy Act, no date or event for automatic declassification ever applies to RD and FRD documents, even if such documents also contain NSI. (c) E.O. 12958 acknowledges that RD and FRD are exempt from all...
10 CFR 1045.38 - Automatic declassification prohibition.

Code of Federal Regulations, 2014 CFR

2014-01-01

... Automatic declassification prohibition. (a) Documents containing RD and FRD remain classified until a... Energy Act, no date or event for automatic declassification ever applies to RD and FRD documents, even if such documents also contain NSI. (c) E.O. 12958 acknowledges that RD and FRD are exempt from all...
10 CFR 1045.38 - Automatic declassification prohibition.

Code of Federal Regulations, 2011 CFR

2011-01-01

... Automatic declassification prohibition. (a) Documents containing RD and FRD remain classified until a... Energy Act, no date or event for automatic declassification ever applies to RD and FRD documents, even if such documents also contain NSI. (c) E.O. 12958 acknowledges that RD and FRD are exempt from all...
10 CFR 1045.38 - Automatic declassification prohibition.

Code of Federal Regulations, 2010 CFR

2010-01-01

... Automatic declassification prohibition. (a) Documents containing RD and FRD remain classified until a... Energy Act, no date or event for automatic declassification ever applies to RD and FRD documents, even if such documents also contain NSI. (c) E.O. 12958 acknowledges that RD and FRD are exempt from all...

46 CFR 58.25-30 - Automatic restart.

Code of Federal Regulations, 2014 CFR

2014-10-01

... auxiliary steering gear and each power actuating system must restart automatically when electrical power is... 46 Shipping 2 2014-10-01 2014-10-01 false Automatic restart. 58.25-30 Section 58.25-30 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) MARINE ENGINEERING MAIN AND AUXILIARY MACHINERY...
46 CFR 58.25-30 - Automatic restart.

Code of Federal Regulations, 2013 CFR

2013-10-01

... auxiliary steering gear and each power actuating system must restart automatically when electrical power is... 46 Shipping 2 2013-10-01 2013-10-01 false Automatic restart. 58.25-30 Section 58.25-30 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) MARINE ENGINEERING MAIN AND AUXILIARY MACHINERY...
46 CFR 58.25-30 - Automatic restart.

Code of Federal Regulations, 2012 CFR

2012-10-01

... auxiliary steering gear and each power actuating system must restart automatically when electrical power is... 46 Shipping 2 2012-10-01 2012-10-01 false Automatic restart. 58.25-30 Section 58.25-30 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) MARINE ENGINEERING MAIN AND AUXILIARY MACHINERY...
46 CFR 58.25-30 - Automatic restart.

Code of Federal Regulations, 2011 CFR

2011-10-01

... auxiliary steering gear and each power actuating system must restart automatically when electrical power is... 46 Shipping 2 2011-10-01 2011-10-01 false Automatic restart. 58.25-30 Section 58.25-30 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) MARINE ENGINEERING MAIN AND AUXILIARY MACHINERY...
46 CFR 58.25-30 - Automatic restart.

Code of Federal Regulations, 2010 CFR

2010-10-01

... auxiliary steering gear and each power actuating system must restart automatically when electrical power is... 46 Shipping 2 2010-10-01 2010-10-01 false Automatic restart. 58.25-30 Section 58.25-30 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) MARINE ENGINEERING MAIN AND AUXILIARY MACHINERY...
30 CFR 27.23 - Automatic warning device.

Code of Federal Regulations, 2010 CFR

2010-07-01

... APPROVAL OF MINING PRODUCTS METHANE-MONITORING SYSTEMS Construction and Design Requirements § 27.23... function automatically at a methane content of the mine atmosphere between 1.0 to 1.5 volume percent and at all higher concentrations of methane. (c) It is recommended that the automatic warning device be...
Automatic Item Generation of Probability Word Problems

ERIC Educational Resources Information Center

Holling, Heinz; Bertling, Jonas P.; Zeuch, Nina

2009-01-01

Mathematical word problems represent a common item format for assessing student competencies. Automatic item generation (AIG) is an effective way of constructing many items with predictable difficulties, based on a set of predefined task parameters. The current study presents a framework for the automatic generation of probability word problems…
Enhancing Automaticity through Task-Based Language Learning

ERIC Educational Resources Information Center

De Ridder, Isabelle; Vangehuchten, Lieve; Gomez, Marta Sesena

2007-01-01

In general terms automaticity could be defined as the subconscious condition wherein "we perform a complex series of tasks very quickly and efficiently, without having to think about the various components and subcomponents of action involved" (DeKeyser 2001: 125). For language learning, Segalowitz (2003) characterised automaticity as a…
49 CFR 236.504 - Operation interconnected with automatic block-signal system.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 49 Transportation 4 2010-10-01 2010-10-01 false Operation interconnected with automatic block... Operation interconnected with automatic block-signal system. (a) A continuous inductive automatic train stop or train control system shall operate in connection with an automatic block signal system and shall...
An automatic brain tumor segmentation tool.

PubMed

Diaz, Idanis; Boulanger, Pierre; Greiner, Russell; Hoehn, Bret; Rowe, Lindsay; Murtha, Albert

2013-01-01

This paper introduces an automatic brain tumor segmentation method (ABTS) for segmenting multiple components of brain tumor using four magnetic resonance image modalities. ABTS's four stages involve automatic histogram multi-thresholding and morphological operations including geodesic dilation. Our empirical results, on 16 real tumors, show that ABTS works very effectively, achieving a Dice accuracy compared to expert segmentation of 81% in segmenting edema and 85% in segmenting gross tumor volume (GTV).
Chemical etching for automatic processing of integrated circuits

NASA Technical Reports Server (NTRS)

Kennedy, B. W.

1981-01-01

Chemical etching for automatic processing of integrated circuits is discussed. The wafer carrier and loading from a receiving air track into automatic furnaces and unloading onto a sending air track are included.
30 CFR 75.1403-4 - Criteria-Automatic elevators.

Code of Federal Regulations, 2010 CFR

2010-07-01

... appropriate on automatic elevators which will automatically shut-off the power and apply the brakes in the... telephone or other effective communication system by which aid or assistance can be obtained promptly. ...
The effect of word prediction settings (frequency of use) on text input speed in persons with cervical spinal cord injury: a prospective study.

PubMed

Pouplin, Samuel; Roche, Nicolas; Antoine, Jean-Yves; Vaugier, Isabelle; Pottier, Sandra; Figere, Marjorie; Bensmail, Djamel

2017-06-01

To determine whether activation of the frequency of use and automatic learning parameters of word prediction software has an impact on text input speed. Forty-five participants with cervical spinal cord injury between C4 and C8 Asia A or B accepted to participate to this study. Participants were separated in two groups: a high lesion group for participants with lesion level is at or above C5 Asia AIS A or B and a low lesion group for participants with lesion is between C6 and C8 Asia AIS A or B. A single evaluation session was carried out for each participant. Text input speed was evaluated during three copying tasks: • without word prediction software (WITHOUT condition) • with automatic learning of words and frequency of use deactivated (NOT_ACTIV condition) • with automatic learning of words and frequency of use activated (ACTIV condition) Results: Text input speed was significantly higher in the WITHOUT than the NOT_ACTIV (p< 0.001) or ACTIV conditions (p = 0.02) for participants with low lesions. Text input speed was significantly higher in the ACTIV than in the NOT_ACTIV (p = 0.002) or WITHOUT (p < 0.001) conditions for participants with high lesions. Use of word prediction software with the activation of frequency of use and automatic learning increased text input speed in participants with high-level tetraplegia. For participants with low-level tetraplegia, the use of word prediction software with frequency of use and automatic learning activated only decreased the number of errors. Implications in rehabilitation Access to technology can be difficult for persons with disabilities such as cervical spinal cord injury (SCI). Several methods have been developed to increase text input speed such as word prediction software.This study show that parameter of word prediction software (frequency of use) affected text input speed in persons with cervical SCI and differed according to the level of the lesion. • For persons with high
Automatic Operation For A Robot Lawn Mower

NASA Astrophysics Data System (ADS)

Huang, Y. Y.; Cao, Z. L.; Oh, S. J.; Kattan, E. U.; Hall, E. L.

1987-02-01

A domestic mobile robot, lawn mower, which performs the automatic operation mode, has been built up in the Center of Robotics Research, University of Cincinnati. The robot lawn mower automatically completes its work with the region filling operation, a new kind of path planning for mobile robots. Some strategies for region filling of path planning have been developed for a partly-known or a unknown environment. Also, an advanced omnidirectional navigation system and a multisensor-based control system are used in the automatic operation. Research on the robot lawn mower, especially on the region filling of path planning, is significant in industrial and agricultural applications.
Exposure to violent video games increases automatic aggressiveness.

PubMed

Uhlmann, Eric; Swanson, Jane

2004-02-01

The effects of exposure to violent video games on automatic associations with the self were investigated in a sample of 121 students. Playing the violent video game Doom led participants to associate themselves with aggressive traits and actions on the Implicit Association Test. In addition, self-reported prior exposure to violent video games predicted automatic aggressive self-concept, above and beyond self-reported aggression. Results suggest that playing violent video games can lead to the automatic learning of aggressive self-views.
What is automatized during perceptual categorization?

PubMed Central

Roeder, Jessica L.; Ashby, F. Gregory

2016-01-01

An experiment is described that tested whether stimulus-response associations or an abstract rule are automatized during extensive practice at perceptual categorization. Twenty-seven participants each completed 12,300 trials of perceptual categorization, either on rule-based (RB) categories that could be learned explicitly or information-integration (II) categories that required procedural learning. Each participant practiced predominantly on a primary category structure, but every third session they switched to a secondary structure that used the same stimuli and responses. Half the stimuli retained their same response on the primary and secondary categories (the congruent stimuli) and half switched responses (the incongruent stimuli). Several results stood out. First, performance on the primary categories met the standard criteria of automaticity by the end of training. Second, for the primary categories in the RB condition, accuracy and response time (RT) were identical on congruent and incongruent stimuli. In contrast, for the primary II categories, accuracy was higher and RT was lower for congruent than for incongruent stimuli. These results are consistent with the hypothesis that rules are automatized in RB tasks, whereas stimulus-response associations are automatized in II tasks. A cognitive neuroscience theory is proposed that accounts for these results. PMID:27232521
Automatic attention to emotional stimuli: neural correlates.

PubMed

Carretié, Luis; Hinojosa, José A; Martín-Loeches, Manuel; Mercado, Francisco; Tapia, Manuel

2004-08-01

We investigated the capability of emotional and nonemotional visual stimulation to capture automatic attention, an aspect of the interaction between cognitive and emotional processes that has received scant attention from researchers. Event-related potentials were recorded from 37 subjects using a 60-electrode array, and were submitted to temporal and spatial principal component analyses to detect and quantify the main components, and to source localization software (LORETA) to determine their spatial origin. Stimuli capturing automatic attention were of three types: emotionally positive, emotionally negative, and nonemotional pictures. Results suggest that initially (P1: 105 msec after stimulus), automatic attention is captured by negative pictures, and not by positive or nonemotional ones. Later (P2: 180 msec), automatic attention remains captured by negative pictures, but also by positive ones. Finally (N2: 240 msec), attention is captured only by positive and nonemotional stimuli. Anatomically, this sequence is characterized by decreasing activation of the visual association cortex (VAC) and by the growing involvement, from dorsal to ventral areas, of the anterior cingulate cortex (ACC). Analyses suggest that the ACC and not the VAC is responsible for experimental effects described above. Intensity, latency, and location of neural activity related to automatic attention thus depend clearly on the stimulus emotional content and on its associated biological importance. Copyright 2004 Wiley-Liss, Inc.
Automated Extraction and Classification of Cancer Stage Mentions fromUnstructured Text Fields in a Central Cancer Registry

PubMed Central

AAlAbdulsalam, Abdulrahman K.; Garvin, Jennifer H.; Redd, Andrew; Carter, Marjorie E.; Sweeny, Carol; Meystre, Stephane M.

2018-01-01

Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Com-mittee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through volu-minous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%–98.4% and classification sensitivity: 83.5%–87%). PMID:29888032
Automated Extraction and Classification of Cancer Stage Mentions fromUnstructured Text Fields in a Central Cancer Registry.

PubMed

AAlAbdulsalam, Abdulrahman K; Garvin, Jennifer H; Redd, Andrew; Carter, Marjorie E; Sweeny, Carol; Meystre, Stephane M

2018-01-01

Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Com-mittee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through volu-minous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%-98.4% and classification sensitivity: 83.5%-87%).
Effectiveness of sequential automatic-manual home respiratory polygraphy scoring.

PubMed

Masa, Juan F; Corral, Jaime; Pereira, Ricardo; Duran-Cantolla, Joaquin; Cabello, Marta; Hernández-Blasco, Luis; Monasterio, Carmen; Alonso-Fernandez, Alberto; Chiner, Eusebi; Vázquez-Polo, Francisco-José; Montserrat, Jose M

2013-04-01

Automatic home respiratory polygraphy (HRP) scoring functions can potentially confirm the diagnosis of sleep apnoea-hypopnoea syndrome (SAHS) (obviating technician scoring) in a substantial number of patients. The result would have important management and cost implications. The aim of this study was to determine the diagnostic cost-effectiveness of a sequential HRP scoring protocol (automatic and then manual for residual cases) compared with manual HRP scoring, and with in-hospital polysomnography. We included suspected SAHS patients in a multicentre study and assigned them to home and hospital protocols at random. We constructed receiver operating characteristic (ROC) curves for manual and automatic scoring. Diagnostic agreement for several cut-off points was explored and costs for two equally effective alternatives were calculated. Of 366 randomised patients, 348 completed the protocol. Manual scoring produced better ROC curves than automatic scoring. There was no sensitive automatic or subsequent manual HRP apnoea-hypopnoea index (AHI) cut-off point. The specific cut-off points for automatic and subsequent manual HRP scorings (AHI >25 and >20, respectively) had a specificity of 93% for automatic and 94% for manual scorings. The costs of manual protocol were 9% higher than sequential HRP protocol; these were 69% and 64%, respectively, of the cost of the polysomnography. A sequential HRP scoring protocol is a cost-effective alternative to polysomnography, although with limited cost savings compared to HRP manual scoring.

Automatic welding of stainless steel tubing

NASA Technical Reports Server (NTRS)

Clautice, W. E.

1978-01-01

The use of automatic welding for making girth welds in stainless steel tubing was investigated as well as the reduction in fabrication costs resulting from the elimination of radiographic inspection. Test methodology, materials, and techniques are discussed, and data sheets for individual tests are included. Process variables studied include welding amperes, revolutions per minute, and shielding gas flow. Strip chart recordings, as a definitive method of insuring weld quality, are studied. Test results, determined by both radiographic and visual inspection, are presented and indicate that once optimum welding procedures for specific sizes of tubing are established, and the welding machine operations are certified, then the automatic tube welding process produces good quality welds repeatedly, with a high degree of reliability. Revised specifications for welding tubing using the automatic process and weld visual inspection requirements at the Kennedy Space Center are enumerated.
Automatic morphological classification of galaxy images

PubMed Central

Shamir, Lior

2009-01-01

We describe an image analysis supervised learning algorithm that can automatically classify galaxy images. The algorithm is first trained using a manually classified images of elliptical, spiral, and edge-on galaxies. A large set of image features is extracted from each image, and the most informative features are selected using Fisher scores. Test images can then be classified using a simple Weighted Nearest Neighbor rule such that the Fisher scores are used as the feature weights. Experimental results show that galaxy images from Galaxy Zoo can be classified automatically to spiral, elliptical and edge-on galaxies with accuracy of ~90% compared to classifications carried out by the author. Full compilable source code of the algorithm is available for free download, and its general-purpose nature makes it suitable for other uses that involve automatic image analysis of celestial objects. PMID:20161594
Automatic Grading of Spreadsheet and Database Skills

ERIC Educational Resources Information Center

Kovacic, Zlatko J.; Green, John Steven

2012-01-01

Growing enrollment in distance education has increased student-to-lecturer ratios and, therefore, increased the workload of the lecturer. This growing enrollment has resulted in mounting efforts to develop automatic grading systems in an effort to reduce this workload. While research in the design and development of automatic grading systems has a…
Automatic Control of Silicon Melt Level

NASA Technical Reports Server (NTRS)

Duncan, C. S.; Stickel, W. B.

1982-01-01

A new circuit, when combined with melt-replenishment system and melt level sensor, offers continuous closed-loop automatic control of melt-level during web growth. Installed on silicon-web furnace, circuit controls melt-level to within 0.1 mm for as long as 8 hours. Circuit affords greater area growth rate and higher web quality, automatic melt-level control also allows semiautomatic growth of web over long periods which can greatly reduce costs.
Support for Debugging Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)

2001-01-01

This viewgraph presentation provides information on the technical aspects of debugging computer code that has been automatically converted for use in a parallel computing system. Shared memory parallelization and distributed memory parallelization entail separate and distinct challenges for a debugging program. A prototype system has been developed which integrates various tools for the debugging of automatically parallelized programs including the CAPTools Database which provides variable definition information across subroutines as well as array distribution information.
Biomedical text mining and its applications in cancer research.

PubMed

Zhu, Fei; Patumcharoenpol, Preecha; Zhang, Cheng; Yang, Yang; Chan, Jonathan; Meechai, Asawin; Vongsangnak, Wanwipa; Shen, Bairong

2013-04-01

Cancer is a malignant disease that has caused millions of human deaths. Its study has a long history of well over 100years. There have been an enormous number of publications on cancer research. This integrated but unstructured biomedical text is of great value for cancer diagnostics, treatment, and prevention. The immense body and rapid growth of biomedical text on cancer has led to the appearance of a large number of text mining techniques aimed at extracting novel knowledge from scientific text. Biomedical text mining on cancer research is computationally automatic and high-throughput in nature. However, it is error-prone due to the complexity of natural language processing. In this review, we introduce the basic concepts underlying text mining and examine some frequently used algorithms, tools, and data sets, as well as assessing how much these algorithms have been utilized. We then discuss the current state-of-the-art text mining applications in cancer research and we also provide some resources for cancer text mining. With the development of systems biology, researchers tend to understand complex biomedical systems from a systems biology viewpoint. Thus, the full utilization of text mining to facilitate cancer systems biology research is fast becoming a major concern. To address this issue, we describe the general workflow of text mining in cancer systems biology and each phase of the workflow. We hope that this review can (i) provide a useful overview of the current work of this field; (ii) help researchers to choose text mining tools and datasets; and (iii) highlight how to apply text mining to assist cancer systems biology research. Copyright © 2012 Elsevier Inc. All rights reserved.
Automatic Payroll Deposit System.

ERIC Educational Resources Information Center

Davidson, D. B.

1979-01-01

The Automatic Payroll Deposit System in Yakima, Washington's Public School District No. 7, directly transmits each employee's salary amount for each pay period to a bank or other financial institution. (Author/MLF)
FAMA: Fast Automatic MOOG Analysis

NASA Astrophysics Data System (ADS)

Magrini, Laura; Randich, Sofia; Friel, Eileen; Spina, Lorenzo; Jacobson, Heather; Cantat-Gaudin, Tristan; Donati, Paolo; Baglioni, Roberto; Maiorca, Enrico; Bragaglia, Angela; Sordo, Rosanna; Vallenari, Antonella

2014-02-01

FAMA (Fast Automatic MOOG Analysis), written in Perl, computes the atmospheric parameters and abundances of a large number of stars using measurements of equivalent widths (EWs) automatically and independently of any subjective approach. Based on the widely-used MOOG code, it simultaneously searches for three equilibria, excitation equilibrium, ionization balance, and the relationship between logn(FeI) and the reduced EWs. FAMA also evaluates the statistical errors on individual element abundances and errors due to the uncertainties in the stellar parameters. Convergence criteria are not fixed "a priori" but instead are based on the quality of the spectra.
[Study on the automatic parameters identification of water pipe network model].

PubMed

Jia, Hai-Feng; Zhao, Qi-Feng

2010-01-01

Based on the problems analysis on development and application of water pipe network model, the model parameters automatic identification is regarded as a kernel bottleneck of model's application in water supply enterprise. The methodology of water pipe network model parameters automatic identification based on GIS and SCADA database is proposed. Then the kernel algorithm of model parameters automatic identification is studied, RSA (Regionalized Sensitivity Analysis) is used for automatic recognition of sensitive parameters, and MCS (Monte-Carlo Sampling) is used for automatic identification of parameters, the detail technical route based on RSA and MCS is presented. The module of water pipe network model parameters automatic identification is developed. At last, selected a typical water pipe network as a case, the case study on water pipe network model parameters automatic identification is conducted and the satisfied results are achieved.
The algorithm for automatic detection of the calibration object

NASA Astrophysics Data System (ADS)

Artem, Kruglov; Irina, Ugfeld

2017-06-01

The problem of the automatic image calibration is considered in this paper. The most challenging task of the automatic calibration is a proper detection of the calibration object. The solving of this problem required the appliance of the methods and algorithms of the digital image processing, such as morphology, filtering, edge detection, shape approximation. The step-by-step process of the development of the algorithm and its adopting to the specific conditions of the log cuts in the image's background is presented. Testing of the automatic calibration module was carrying out under the conditions of the production process of the logging enterprise. Through the tests the average possibility of the automatic isolating of the calibration object is 86.1% in the absence of the type 1 errors. The algorithm was implemented in the automatic calibration module within the mobile software for the log deck volume measurement.
TEXTINFO: a tool for automatic determination of patient clinical profiles using text analysis.

PubMed Central

Borst, F.; Lyman, M.; Nhàn, N. T.; Tick, L. J.; Sager, N.; Scherrer, J. R.

1991-01-01

The clinical data contained in narrative patient documents is made available via grammatical and semantic processing. Retrievals from the resulting relational database tables are matched against a set of clinical descriptors to obtain clinical profiles of the patients in terms of the descriptors present in the documents. Discharge summaries of 57 Dept. of Digestive Surgery patients were processed in this manner. Factor analysis and discriminant analysis procedures were then applied, showing the profiles to be useful for diagnosis definitions (by establishing relations between diagnoses and clinical findings), for diagnosis assessment (by viewing the match between a definition and observed events recorded in a patient text), and potentially for outcome evaluation based on the classification abilities of clinical signs. PMID:1807679
Automatic spatiotemporal matching of detected pleural thickenings

NASA Astrophysics Data System (ADS)

Chaisaowong, Kraisorn; Keller, Simon Kai; Kraus, Thomas

2014-01-01

Pleural thickenings can be found in asbestos exposed patient's lung. Non-invasive diagnosis including CT imaging can detect aggressive malignant pleural mesothelioma in its early stage. In order to create a quantitative documentation of automatic detected pleural thickenings over time, the differences in volume and thickness of the detected thickenings have to be calculated. Physicians usually estimate the change of each thickening via visual comparison which provides neither quantitative nor qualitative measures. In this work, automatic spatiotemporal matching techniques of the detected pleural thickenings at two points of time based on the semi-automatic registration have been developed, implemented, and tested so that the same thickening can be compared fully automatically. As result, the application of the mapping technique using the principal components analysis turns out to be advantageous than the feature-based mapping using centroid and mean Hounsfield Units of each thickening, since the resulting sensitivity was improved to 98.46% from 42.19%, while the accuracy of feature-based mapping is only slightly higher (84.38% to 76.19%).
Automatic Content Analysis; Part I of Scientific Report No. ISR-18, Information Storage and Retrieval...

ERIC Educational Resources Information Center

Cornell Univ., Ithaca, NY. Dept. of Computer Science.

Four papers are included in Part One of the eighteenth report on Salton's Magical Automatic Retriever of Texts (SMART) project. The first paper: "Content Analysis in Information Retrieval" by S. F. Weiss presents the results of experiments aimed at determining the conditions under which content analysis improves retrieval results as well…
Methods for automatically analyzing humpback song units.

PubMed

Rickwood, Peter; Taylor, Andrew

2008-03-01

This paper presents mathematical techniques for automatically extracting and analyzing bioacoustic signals. Automatic techniques are described for isolation of target signals from background noise, extraction of features from target signals and unsupervised classification (clustering) of the target signals based on these features. The only user-provided inputs, other than raw sound, is an initial set of signal processing and control parameters. Of particular note is that the number of signal categories is determined automatically. The techniques, applied to hydrophone recordings of humpback whales (Megaptera novaeangliae), produce promising initial results, suggesting that they may be of use in automated analysis of not only humpbacks, but possibly also in other bioacoustic settings where automated analysis is desirable.
Adaptive pseudolinear compensators of dynamic characteristics of automatic control systems

NASA Astrophysics Data System (ADS)

Skorospeshkin, M. V.; Sukhodoev, M. S.; Timoshenko, E. A.; Lenskiy, F. V.

2016-04-01

Adaptive pseudolinear gain and phase compensators of dynamic characteristics of automatic control systems are suggested. The automatic control system performance with adaptive compensators has been explored. The efficiency of pseudolinear adaptive compensators in the automatic control systems with time-varying parameters has been demonstrated.
Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery.

PubMed

Gonzalez, Graciela H; Tahsin, Tasnia; Goodale, Britton C; Greene, Anna C; Greene, Casey S

2016-01-01

Precision medicine will revolutionize the way we treat and prevent disease. A major barrier to the implementation of precision medicine that clinicians and translational scientists face is understanding the underlying mechanisms of disease. We are starting to address this challenge through automatic approaches for information extraction, representation and analysis. Recent advances in text and data mining have been applied to a broad spectrum of key biomedical questions in genomics, pharmacogenomics and other fields. We present an overview of the fundamental methods for text and data mining, as well as recent advances and emerging applications toward precision medicine. © The Author 2015. Published by Oxford University Press.
The Summarized Factorial Assessment Outline: A Structural Paradigm for EMR and L-D Assessment.

ERIC Educational Resources Information Center

Gerig, Gaylord E.

Described is the Summarized Factorial Assessment Outline (SFAO), a total assessment procedure for educable mentally retarded and learning disabled students which is centered on the three diagnostic processes of identification, interpretation, and prescription for remediation. Following an introduction covering the problem, purpose, need, and…
Automatization of hardware configuration for plasma diagnostic system

NASA Astrophysics Data System (ADS)

Wojenski, A.; Pozniak, K. T.; Kasprowicz, G.; Kolasinski, P.; Krawczyk, R. D.; Zabolotny, W.; Linczuk, P.; Chernyshova, M.; Czarski, T.; Malinowski, K.

2016-09-01

Soft X-ray plasma measurement systems are mostly multi-channel, high performance systems. In case of the modular construction it is necessary to perform sophisticated system discovery in parallel with automatic system configuration. In the paper the structure of the modular system designed for tokamak plasma soft X-ray measurements is described. The concept of the system discovery and further automatic configuration is also presented. FCS application (FMC/ FPGA Configuration Software) is used for running sophisticated system setup with automatic verification of proper configuration. In order to provide flexibility of further system configurations (e.g. user setup), common communication interface is also described. The approach presented here is related to the automatic system firmware building presented in previous papers. Modular construction and multichannel measurements are key requirement in term of SXR diagnostics with use of GEM detectors.
A Genre-Specific Reading Comprehension Strategy to Enhance Struggling Fifth-Grade Readers' Ability to Critically Analyze Argumentative Text

ERIC Educational Resources Information Center

Haria, Priti D.; Midgette, Ekaterina

2014-01-01

This study examined the effectiveness of instruction in a genre-specific reading comprehension strategy, "critical analysis of argumentative text," which was designed to help students to identify, summarize, and critically analyze parts of an argumentative text. The investigators hypothesized that reading instruction would improve the…
Fever detection from free-text clinical records for biosurveillance.

PubMed

Chapman, Wendy W; Dowling, John N; Wagner, Michael M

2004-04-01

Automatic detection of cases of febrile illness may have potential for early detection of outbreaks of infectious disease either by identification of anomalous numbers of febrile illness or in concert with other information in diagnosing specific syndromes, such as febrile respiratory syndrome. At most institutions, febrile information is contained only in free-text clinical records. We compared the sensitivity and specificity of three fever detection algorithms for detecting fever from free-text. Keyword CC and CoCo classified patients based on triage chief complaints; Keyword HP classified patients based on dictated emergency department reports. Keyword HP was the most sensitive (sensitivity 0.98, specificity 0.89), and Keyword CC was the most specific (sensitivity 0.61, specificity 1.0). Because chief complaints are available sooner than emergency department reports, we suggest a combined application that classifies patients based on their chief complaint followed by classification based on their emergency department report, once the report becomes available.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.