Note: This page contains sample records for the topic automatic text summarization from Science.gov.
While these samples are representative of the content of Science.gov,
they are not comprehensive nor are they the most current set.
We encourage you to perform a real-time search of Science.gov
to obtain the most current and comprehensive results.
Last update: November 12, 2013.
1

Automatic music video summarization based on audio-visual-text analysis and alignment  

Microsoft Academic Search

In this paper, we propose a novel approach for automatic music video summarization based on audio-visual-text analysis and alignment. The music video is separated into the music and video tracks. For the music track, the chorus is detected based on music structure analysis. For the video track, we first segment the shots and classify the shots into close-up face shots

Changsheng Xu; Xi Shao; Namunu Chinthaka Maddage; Mohan S. Kankanhalli

2005-01-01

2

Parsumist: A Persian text summarizer  

Microsoft Academic Search

The rapid growth of online information services causes the problem of information explosion. Automatic text summarization techniques are essential for dealing with this problem. The process of compacting a source document to reduce complexity and length, retaining the most important information is called text summarization. This paper introduces PARSUMIST; a text summarization system for Persian documents. It can generate generic

Mehrnoush SHAMSFARD; Tara AKHAVAN; Mona ERFANI JOURABCHI

2009-01-01

3

Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts  

Microsoft Academic Search

Vast amounts of text material are now available in machine-readable form for automatic processing. Here, approaches are outlined for manipulating and accessing texts in arbitrary subject areas in accordance with user needs. In particular, methods are given for determining text themes, traversing texts selectively, and extracting summary statements that reflect text content.

Gerard Salton; James Allan; Chris Buckley; Amit Singhal

1994-01-01

4

Using clustering and a modified classification algorithm for automatic text summarization  

NASA Astrophysics Data System (ADS)

In this paper we describe a modified classification method intended for extractive summarization. The classification in this method doesn't need a learning corpus; it uses the input text instead. First, we cluster the document sentences to exploit the diversity of topics, then we use a learning algorithm (here we used Naive Bayes) on each cluster, considering it as a class. After obtaining the classification model, we calculate the score of a sentence in each class, using a scoring model derived from the classification algorithm. These scores are then used to reorder the sentences and extract the first ones as the output summary. We conducted some experiments using a corpus of scientific papers, and we have compared our results to another summarization system called UNIS. We also examine the impact of tuning the clustering threshold on the resulting summary, as well as the impact of adding more features to the classifier. We found that this method is interesting and gives good performance, and that the addition of new features (which is simple using this method) can improve the summary's accuracy.

Aries, Abdelkrime; Oufaida, Houda; Nouali, Omar

2013-01-01

5

From text to speech summarization  

Microsoft Academic Search

In this paper, we present approaches used in text summarization, showing how they can be adapted for speech summarization and where they fall short. Informal style and apparent lack of structure in speech mean that the typical approaches used for text summarization must be extended for use with speech. We illustrate how features derived from speech can help determine summary

Kathleen McKeown; Julia Hirschberg; Michel Galley; Sameer Maskey

2005-01-01

6

Text Summarization Using Lexical Chains  

Microsoft Academic Search

Text summarization addresses both the problem of selecting the most important portions of text and the problem of generating coherent summaries. We present in this paper the summarizer of the University of Lethbridge at DUC 2001, which is based on an efficient use of lexical chains.

Meru Brunn; Yllias Chali; Christopher J. Pinchak

2001-01-01

7

The Tipster Summac Text Summarization Evaluation  

Microsoft Academic Search

The TIPSTER Text Summarization Evaluation (SUMMAC) has established definitively that automatic text summarization is very effective in relevance assessment tasks. Summaries as short as 17% of full text length sped up decision-making by almost a factor of 2 with no statistically significant degradation in F-score accuracy. SUMMAC has also introduced a new intrinsic method for automated evaluation of informative summaries.

Inderjeet Mani; David House; Gary Klein; Lynette Hirschman; Therese Firmin; Beth Sundheim

1999-01-01

8

Using Text Messaging to Summarize Text  

ERIC Educational Resources Information Center

Summarizing is an academic task that students are expected to have mastered by the time they enter college. However, experience has revealed quite the contrary. Summarization is often difficult to master as well as teach, but instructors in higher education can benefit greatly from the rapid advancement in mobile wireless technology devices, by…

Williams, Angela Ruffin

2012-01-01

9

NewsBytes: Tagalog Text Summarization Using Abstraction  

Microsoft Academic Search

In this paper, we present an automatic Tagalog text summarizer that uses abstraction instead of the traditional extraction method of summarization. It employs Natural Language Processing and Generation to produce the summary. Summarization works by determining the subject of the sentence and then building phrases for that subject. A prototype was tested and evaluated based on the following matrices: sentence

Ervin G. Batang; Regina L. Cruz; Don Erick J. Bonus; Ria A. Sagum; Mark Angelo T. Miano; Rubeleen Ann C. Yu

10

Task-Driven Dynamic Text Summarization  

ERIC Educational Resources Information Center

The objective of this work is to examine the efficacy of natural language processing (NLP) in summarizing bibliographic text for multiple purposes. Researchers have noted the accelerating growth of bibliographic databases. Information seekers using traditional information retrieval techniques when searching large bibliographic databases are often…

Workman, Terri Elizabeth

2011-01-01

11

Automatic summarization of audio-visual soccer feeds  

Microsoft Academic Search

This paper presents a fully automatic system for soccer game summarization. The system takes audio-visual content as an input, and builds on the integration of two independent but complementary contributions (i) to identify crucial periods of the soccer game in a fully automatic way, and (ii) to summarize the soccer game as a function of individual narrative preferences of the

Fan Chen; Christophe De Vleeschouwer; Helenca Duxans Barrobes; J. Gregorio Escalada; David Conejero

2010-01-01

12

A Logic Framework for Sports Video Summarization Using Text-Based Semantic Annotation  

Microsoft Academic Search

Detection of semantic events in sports videos is an essential step towards video summarization. A large volume of research has been conducted for automatic semantic event detection and summarization of sports videos. In this paper we present a novel sports video summarization framework using a combination of text, video and logic analysis. Parse trees are used to analyze structured and

Mohammed A. Refaey; Wael Abd-Almageed; Larry S. Davis

2008-01-01

13

The Automated Acquisition of Topic Signatures for Text Summarization  

Microsoft Academic Search

In order to produce a good summary, one has to identify the most relevant portions of a given text. We describe in this paper a method for automatically training topic signatures (sets of related words, with associated weights, organized around head topics) and illustrate with signatures we created with 6,194 TREC collection texts over 4 selected topics. We describe the possible

Chin-Yew Lin; Eduard H. Hovy

2000-01-01

14

Visual evaluation of text features for document summarization and analysis  

Microsoft Academic Search

Thanks to the web-related and other advanced technologies, textual information is increasingly being stored in digital form and posted online. Automatic methods to analyze such textual information are becoming inevitable. Many of those methods are based on quantitative text features. Analysts face the challenge to choose the most appropriate features for their tasks. This requires effective approaches for

Daniela Oelke; Peter Bak; Daniel A. Keim; Mark Last; Guy Danon

2008-01-01

15

Automatic Structuring of Written Texts  

Microsoft Academic Search

Abstract. This paper deals with automatic structuring and sentence boundary labelling in natural language texts. We describe the implemented structure tagging algorithm and heuristic rules that are used for automatic or semiautomatic labelling. Inside the detected sentence the algorithm performs a decomposition to clauses and then marks the parts of text which do not form a sentence, i.e.

Marek Veber; Ales Horák; Rostislav Julinek; Pavel Smrz

1999-01-01

16

Evaluation of Query-Based Arabic Text Summarization System  

Microsoft Academic Search

In this paper, we present and analyze the results of applying an Arabic query-based text summarization system (AQBTSS) in an attempt to produce a query-oriented summary for a single Arabic document. For this task, we adapted the traditional vector space model (VSM) and the cosine similarity measure to find the most relevant passages extracted from Arabic document

Mahmoud O. EL-HAJ; Bassam H. HAMMO

2008-01-01
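
The vector space model and cosine similarity measure named in the abstract above can be illustrated with a minimal sketch. This is not the AQBTSS implementation: the whitespace tokenizer, the raw term-frequency weighting, and the names tf_vector, cosine, and rank_passages are all assumptions made for illustration, and a real Arabic system would need language-specific tokenization and normalization.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace-tokenized).

    Assumption: whitespace tokenization stands in for a real tokenizer.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query, passages):
    """Return (score, passage) pairs sorted by decreasing similarity to the query."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(p)), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    passages = ["the cat sat", "dogs bark loudly", "cat and cat"]
    for score, passage in rank_passages("cat", passages):
        print(f"{score:.3f}  {passage}")
```

A query-oriented summary could then be assembled from the top-ranked passages; how many to keep is a separate design choice that the abstract does not specify.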

17

Summarizing text documents: sentence selection and evaluation metrics  

Microsoft Academic Search

Human-quality text summarization systems are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style and lexical usage. Nevertheless, certain cues can often help suggest the selection of sentences for inclusion in a summary. This paper presents our analysis of news-article summaries generated by sentence selection. Sentences

Jade Goldstein; Mark Kantrowitz; Vibhu O. Mittal; Jaime G. Carbonell

1999-01-01

18

Generic and Query-Based Text Summarization Using Lexical Cohesion  

Microsoft Academic Search

Text summarization addresses the problem of selecting the most important portions of the text and the problem of producing coherent summaries. The goal of this paper is to show how these objectives can be achieved through an efficient use of lexical cohesion. The method addresses both generic and query-based summaries. We present an approach for identifying the most important portions

Yllias Chali

2002-01-01

19

A Multi-Document Multi-Lingual Automatic Summarization System  

Microsoft Academic Search

Abstract. In this paper, a new multi-document multi-lingual text summarization technique, based on singular value decomposition and hierarchical clustering, is proposed. The proposed approach relies on only two resources for any language: a word segmentation system and a dictionary of words along with their document frequencies. The summarizer initially takes a collection of related documents, and

Mohamad Ali Honarpisheh; Gholamreza Ghassem-Sani; Ghassem Mirroshandel

20

MEAD - a platform for multidocument multilingual text summarization  

Microsoft Academic Search

This paper describes the functionality of MEAD, a comprehensive, public domain, open source, multidocument multilingual summarization environment that has been thus far downloaded by more than 500 organizations. MEAD has been used in a variety of summarization applications ranging from summarization for mobile devices to Web page summarization within a search engine and to novelty detection.

Dragomir Radev; Timothy Allison; Sasha Blair-Goldensohn; John Blitzer; Arda Celebi; Stanko Dimitrov; Elliott Drabek; Ali Hakim; Wai Lam; Danyu Liu; Jahna Otterbacher; Hong Qi; Horacio Saggion; Simone Teufel; Michael Topper; Adam Winkel; Zhu Zhang

21

The use of unlabeled data to improve supervised learning for text summarization  

Microsoft Academic Search

With the huge amount of information available electronically, there is an increasing demand for automatic text summarization systems. The use of machine learning techniques for this task allows one to adapt summaries to the user needs and to the corpus characteristics. These desirable properties have motivated an increasing amount of work in this field over the last few years. Most

Massih-Reza Amini; Patrick Gallinari

2002-01-01

22

Toward Using Text Summarization for Essay-Based Feedback  

Microsoft Academic Search

We empirically study the impact of using automatically generated summaries in the context of electronic essay rating. Our results indicate that 40% and 60% discourse-based essay summaries improve the performance of the topical analysis module of e-rater. E-rater is a system that electronically scores GMAT essays. We envision using automatically generated essay summaries for instructional feedback, as a supplement to

Jill Burstein; Daniel Marcu

2000-01-01

23

An information arrangement technique for a text classification and summarization based on a summarization frame  

Microsoft Academic Search

In this paper, the purpose is to arrange information to understand at one view. The proposed summarization frame technology is a system to hierarchically arrange and classify information by targeting content and level of importance in sentences. Moreover, the technique in which the Concept Base, the Degree of Association Algorithm, the Time Judgment system and the Place judgment system are

Seiji Tsuchiya; Eriko Yoshimura; Hirokazu Watabe

2009-01-01

24

Text summarization model based on the budgeted median problem  

Microsoft Academic Search

We propose a multi-document generic summarization model based on the budgeted median problem. Our model selects sentences to generate a summary so that every sentence in the document cluster can be assigned to and be represented by a sentence in the summary as much as possible. The advantage of this model is that it covers the entire relevant part of

Hiroya Takamura; Manabu Okumura

2009-01-01

25

Automatic segmentation of clinical texts.  

PubMed

Clinical narratives, such as radiology and pathology reports, are commonly available in electronic form. However, they are also commonly entered and stored as free text. Knowledge of the structure of clinical narratives is necessary for enhancing the productivity of healthcare departments and facilitating research. This study attempts to automatically segment medical reports into semantic sections. Our goal is to develop a robust and scalable medical report segmentation system requiring minimum user input for efficient retrieval and extraction of information from free-text clinical narratives. Hand-crafted rules were used to automatically identify a high-confidence training set. This automatically created training dataset was later used to develop metrics and an algorithm that determines the semantic structure of the medical reports. A word-vector cosine similarity metric combined with several heuristics was used to classify each report sentence into one of several pre-defined semantic sections. This baseline algorithm achieved 79% accuracy. A Support Vector Machine (SVM) classifier trained on additional formatting and contextual features was able to achieve 90% accuracy. Plans for future work include developing a configurable system that could accommodate various medical report formatting and content standards. PMID:19965054

Apostolova, Emilia; Channin, David S; Demner-Fushman, Dina; Furst, Jacob; Lytinen, Steven; Raicu, Daniela

2009-01-01

26

Automatic Text Summarization Using Unsupervised and Semi-supervised Learning  

Microsoft Academic Search

This paper investigates a new approach for unsupervised and semi-supervised learning. We show that this method is an instance of the Classification EM algorithm in the case of Gaussian densities. Its originality is that it relies on a discriminant approach whereas classical methods for unsupervised and semi-supervised learning rely on density estimation. This idea is used to improve a generic

Massih-reza Amini; Patrick Gallinari

2001-01-01

27

An Automatic Multimedia Content Summarization System for Video Recommendation  

Microsoft Academic Search

In recent years, using video as a learning resource has received a lot of attention and has been successfully applied to many learning activities. In comparison with text-based learning, video learning integrates more multimedia resources, which usually motivate learners more than texts. However, one of the major limitations of video learning is that both instructors and learners must select suitable

Jie-chi Yang; Yi-ting Huang; Chi-cheng Tsai; Ching-i Chung; Yu-chieh Wu

2009-01-01

28

Automatic text decomposition using text segments and text themes  

Microsoft Academic Search

With the widespread use of full-text information retrieval, passage-retrieval techniques are becoming increasingly popular. Larger texts can then be replaced by important text excerpts, thereby simplifying the retrieval task and improving retrieval effectiveness. Passage-level evidence about the use of words in local contexts is also useful for resolving language ambiguities and improving retrieval output. Two main text decomposition strategies are introduced in this study, including a...

Gerard Salton; Amit Singhal; Chris Buckley; Mandar Mitra

1996-01-01

29

Automatic Detection of Text Genre  

Microsoft Academic Search

As the text databases available to users become larger and more heterogeneous, genre becomes increasingly important for computational linguistics as a complement to topical and structural principles of classification. We propose a theory of genres as bundles of , which correlate with various surface cues, and argue that genre detection based on surface cues is as successful as detection based

Brett Kessler; Geoffrey Nunberg; Hinrich Schuetze

1997-01-01

30

Enhancing E-Business-Intelligence-Service: A Topic-Guided Text Summarization Framework  

Microsoft Academic Search

Text summarization is a very important function in next generation e-business-intelligence-service. While human beings have proven to be extremely capable summarizers, computer based automated abstracting and summarizing has proven to be extremely challenging tasks. The dominant approach to text summarization is selection-based, by which the most content-bearing sentences or passages are identified and selected to compose a summary. However, the

Shuhua Liu

2005-01-01

31

Automatic Summarization of Open-Domain Multiparty Dialogues in Diverse Genres  

Microsoft Academic Search

Automatic summarization of open-domain spoken dialogues is a relatively new research area. This article introduces the task and the challenges involved and motivates and presents an approach for obtaining automatic-extract summaries for human transcripts of multiparty dialogues of four different genres, without any restriction on domain. We address the following issues, which are intrinsic to spoken-dialogue summarization and typically can be

Klaus Zechner

2002-01-01

32

Automatic text detection in complex color image  

Microsoft Academic Search

Detection of text in color images of complex colored background is a very challenging problem. In this paper, an efficient automatic multi-feature fusing text detection method is proposed. First, we generate candidate text regions by merging bounding blocks, which are extracted using the color feature in the spatial color quantized map and the edge feature in the edge map obtained

Jiang Wu; Shao-Lin Qu; Qing Zhuo; Wen-Yuan Wang

2002-01-01

33

Automatic Multilevel Summarizations Generation Based on Basic Semantic Unit for Sports Video  

Microsoft Academic Search

Sports video has been widely studied due to its tremendous commercial potentials. Despite encouraging results from various specific sports games, it is almost impossible to extend a summarization system for a new sports game due to the lack of sports video modeling. In this paper, we automatically generate multi-level summarizations for sports video based on Basic Semantic Unit (BSU), which

Chen Jianyun; Zhao Xinyu; Duan Miyi; Wu Tingting; Lao Songyang

34

Research on Personalized Recommendation System of Scientific and Technological Periodical Based on Automatic Summarization  

Microsoft Academic Search

Utilizing the theoretical methods and technology of automatic summarization systems and personalized recommendation systems, we study how to access thesis document indexes, theme words, summaries, readers' evaluations and other important recommended information from the vast amount of scientific and technological periodical documents quickly and effectively. The aim is to improve the scientific workers' research efficiency remarkably. On the basis

Qifeng Yang; Sihang Zhang; Bin Feng

2007-01-01

35

Automatic induction of rules for text simplification  

Microsoft Academic Search

Long and complicated sentences pose various problems to many state-of-the-art natural language technologies. We have been exploring methods to automatically transform such sentences as to make them simpler. These methods involve the use of a rule-based system, driven by the syntax of the text in the domain of interest. Hand-crafting rules for every domain is time-consuming and impractical. This paper describes an algorithm and an

Raman Chandrasekar; Bangalore Srinivas

1997-01-01

36

Science Text Comprehension: Drawing, Main Idea Selection, and Summarizing as Learning Strategies  

ERIC Educational Resources Information Center

The purpose of two experiments was to contrast instructions to generate drawings with two text-focused strategies--main idea selection (Exp. 1) and summarization (Exp. 2)--and to examine whether these strategies could help students learn from a chemistry science text. Both experiments followed a 2 x 2 design, with drawing strategy instructions…

Leopold, Claudia; Leutner, Detlev

2012-01-01

37

Automatic discourse connective detection in biomedical text  

PubMed Central

Objective: Relation extraction in biomedical text mining systems has largely focused on identifying clause-level relations, but increasing sophistication demands the recognition of relations at discourse level. A first step in identifying discourse relations involves the detection of discourse connectives: words or phrases used in text to express discourse relations. In this study supervised machine-learning approaches were developed and evaluated for automatically identifying discourse connectives in biomedical text. Materials and Methods: Two supervised machine-learning models (support vector machines and conditional random fields) were explored for identifying discourse connectives in biomedical literature. In-domain supervised machine-learning classifiers were trained on the Biomedical Discourse Relation Bank, an annotated corpus of discourse relations over 24 full-text biomedical articles (about 112,000 word tokens), a subset of the GENIA corpus. Novel domain adaptation techniques were also explored to leverage the larger open-domain Penn Discourse Treebank (about 1 million word tokens). The models were evaluated using the standard evaluation metrics of precision, recall and F1 scores. Results and Conclusion: Supervised machine-learning approaches can automatically identify discourse connectives in biomedical text, and the novel domain adaptation techniques yielded the best performance: 0.761 F1 score. A demonstration version of the fully implemented classifier BioConn is available at: http://bioconn.askhermes.org.

Polepalli Ramesh, Balaji; Prasad, Rashmi; Miller, Tim; Harrington, Brian

2012-01-01
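
The precision, recall and F1 metrics mentioned in the abstract above can be computed as in the following minimal sketch. It assumes predictions and gold annotations are given as sets of items (here, hypothetical connective strings); the function name precision_recall_f1 and the sample data are illustrative and do not come from the study, which may have scored matches at the token or span level.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall and F1 for a set of predictions.

    predicted: items the system labelled as positive
    gold:      items annotated as positive in the reference
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical example data, not taken from any corpus.
    system = {"because", "however", "and"}
    reference = {"because", "however", "although", "but"}
    p, r, f = precision_recall_f1(system, reference)
    print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```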

38

Another look at automatic text-retrieval systems  

Microsoft Academic Search

Evidence from available studies comparing manual and automatic text-retrieval systems does not support the conclusion that intellectual content analysis produces better results than comparable automatic systems.

Gerard Salton

1986-01-01

39

Automatic video summarization driven by a spatio-temporal attention model  

NASA Astrophysics Data System (ADS)

According to the literature, automatic video summarization techniques can be classified in two parts, following the output nature: "video skims", which are generated using portions of the original video and "key-frame sets", which correspond to the images, selected from the original video, having a significant semantic content. The difference between these two categories is reduced when we consider automatic procedures. Most of the published approaches are based on the image signal and use either pixel characterization or histogram techniques or image decomposition by blocks. However, few of them integrate properties of the Human Visual System (HVS). In this paper, we propose to extract keyframes for video summarization by studying the variations of salient information between two consecutive frames. For each frame, a saliency map is produced simulating the human visual attention by a bottom-up (signal-dependent) approach. This approach includes three parallel channels for processing three early visual features: intensity, color and temporal contrasts. For each channel, the variations of the salient information between two consecutive frames are computed. These outputs are then combined to produce the global saliency variation which determines the key-frames. Psychophysical experiments have been defined and conducted to analyze the relevance of the proposed key-frame extraction algorithm.

Barland, R.; Saadane, A.

2008-03-01

40

MeSH: a window into full text for document summarization  

PubMed Central

Motivation: Previous research in the biomedical text-mining domain has historically been limited to titles, abstracts and metadata available in MEDLINE records. Recent research initiatives such as TREC Genomics and BioCreAtIvE strongly point to the merits of moving beyond abstracts and into the realm of full texts. Full texts are, however, more expensive to process not only in terms of resources needed but also in terms of accuracy. Since full texts contain embellishments that elaborate, contextualize, contrast, supplement, etc., there is greater risk for false positives. Motivated by this, we explore an approach that offers a compromise between the extremes of abstracts and full texts. Specifically, we create reduced versions of full text documents that contain only important portions. In the long-term, our goal is to explore the use of such summaries for functions such as document retrieval and information extraction. Here, we focus on designing summarization strategies. In particular, we explore the use of MeSH terms, manually assigned to documents by trained annotators, as clues to select important text segments from the full text documents. Results: Our experiments confirm the ability of our approach to pick the important text portions. Using the ROUGE measures for evaluation, we were able to achieve maximum ROUGE-1, ROUGE-2 and ROUGE-SU4 F-scores of 0.4150, 0.1435 and 0.1782, respectively, for our MeSH term-based method versus the maximum baseline scores of 0.3815, 0.1353 and 0.1428, respectively. Using a MeSH profile-based strategy, we were able to achieve maximum ROUGE F-scores of 0.4320, 0.1497 and 0.1887, respectively. Human evaluation of the baselines and our proposed strategies further corroborates the ability of our method to select important sentences from the full texts. Contact: sanmitra-bhattacharya@uiowa.edu; padmini-srinivasan@uiowa.edu

Bhattacharya, Sanmitra; Ha-Thuc, Viet; Srinivasan, Padmini

2011-01-01
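
As a rough illustration of the ROUGE-1 F-score reported in the abstract above, the sketch below computes clipped unigram overlap between a candidate summary and a single reference. It is a simplification under stated assumptions: lowercased whitespace tokens, one reference, and no stemming or stopword handling, so its numbers will not match the official ROUGE toolkit. The function name rouge_1_f is hypothetical.

```python
from collections import Counter


def rouge_1_f(candidate, reference):
    """Simplified ROUGE-1 F-score: clipped unigram overlap, single reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token,
    # so a word repeated in the candidate is not credited more than it
    # appears in the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge_1_f("the cat sat on the mat", "the cat lay on the mat")
    print(f"ROUGE-1 F (simplified): {score:.4f}")
```

ROUGE-2 follows the same pattern with bigrams in place of unigrams; ROUGE-SU4 additionally counts skip-bigrams and is not covered by this sketch.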

41

Automatic Classification of Verbs in Biomedical Texts  

Microsoft Academic Search

Lexical classes, when tailored to the application and domain in question, can provide an effective means to deal with a number of natural language processing (NLP) tasks. While manual construction of such classes is difficult, recent research shows that it is possible to automatically induce verb classes from cross-domain corpora with promising accuracy. We report a novel experiment

Anna Korhonen; Yuval Krymolowski; Nigel Collier

2006-01-01

42

Efficient Web Search on Mobile Devices with Multi-Modal Input and Intelligent Text Summarization  

Microsoft Academic Search

Ease of browsing and searching for information on mobile devices has been an area of increasing interest in the World Wide Web research community (1, 2, 3, 6, 7). While some work has been done to enhance the usability of handwriting recognition to input queries through techniques such as automatic word suggestion (2), the use of speech as an input

Eric Chang; Helen Meng; Yuk-chi Li; Tien-ying Fung

2002-01-01

43

Automatic text detection for mobile augmented reality translation  

Microsoft Academic Search

We present a fast automatic text detection algorithm devised for a mobile augmented reality (AR) translation system on a mobile phone. In this application, scene text must be detected, recognized, and translated into a desired language, and then the translation is displayed overlaid properly on the real-world scene. In order to offer a fast automatic text detector, we focused our

Marc Petter; Victor Fragoso; Matthew Turk; Charles Baur

2011-01-01

44

Chinese Text Classification without Automatic Word Segmentation  

Microsoft Academic Search

Due to the lack of word boundaries in Asian systems of writing, machine processing of these languages often involves segmenting text into word units. This paper tests the assumption that this segmentation is a necessary step for authorship attribution and topic classification tasks in Chinese, and demonstrates that it is not. We show extensive results for both tasks, considering both

Wei Liu; Ben Allison; David Guthrie; Louise Guthrie

2007-01-01

45

Automatic construction of biomedical abbreviations dictionary from text  

Microsoft Academic Search

The number of biomedical abbreviations is growing very fast. Automatic construction of a biomedical abbreviations dictionary from text helps readers understand biomedical literature and helps update existing databases, ontologies, and dictionaries. This paper proposes a new method for automatic construction of a biomedical abbreviations dictionary from text by combining a string matching algorithm and a searching algorithm. The string matching algorithm

Changqin Quan; Fuji Ren; Tingting He; Po Hu

2008-01-01

46

Techniques for automatically correcting words in text  

Microsoft Academic Search

Research aimed at correcting words in text has focused on three progressively more difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3) context-dependent word correction. In response to the first problem, efficient pattern-matching and n-gram analysis techniques have been developed for detecting strings that do not appear in a given word list. In response to the second problem,

Karen Kukich

1992-01-01

47

Automatically classifying case texts and predicting outcomes  

Microsoft Academic Search

Work on a computer program called SMILE + IBP (SMart Index Learner Plus Issue-Based Prediction) bridges case-based reasoning and extracting information from texts. The program addresses a technologically challenging task that is also very relevant from a legal viewpoint: to extract information from textual descriptions of the facts of decided cases and apply that information to predict the outcomes of new cases.

Kevin D. Ashley; Stefanie Brüninghaus

2009-01-01

48

Narrative text classification for automatic key phrase extraction in web document corpora  

Microsoft Academic Search

Automatic key phrase extraction is a useful tool in many text related applications such as clustering and summarization. State-of-the-art methods are aimed towards extracting key phrases from traditional text such as technical papers. Application of these methods on Web documents, which often contain diverse and heterogeneous contents, is of particular interest and challenge in the information age. In this work,

Yongzheng Zhang; Nur Zincir-Heywood; Evangelos E. Milios

2005-01-01

49

Towards automatic recognition of product names: an exploratory study of brand names in economic texts  

Microsoft Academic Search

This paper describes the first stage of research towards automatic recognition of brand names (trademarks, product names and service names) in Swedish economic texts. The findings of an exploratory study of brand names in economic texts by Malmgren (2004) are summarized, and the work of compiling a corpus annotated with named entities based on these findings is described.

Kristina Nilsson; Aisha Malmgren

50

Structure in Soccer Videos: Detecting and Classifying Highlights for Automatic Summarization  

Microsoft Academic Search

We propose an automatic framework to detect and classify highlights directly from soccer videos. Sports videos are amongst the most important events for TV transmissions and journalism, however for the purpose of archiving, reuse for sports analysts and coaches, and of main interest to the audience, the considered highlights of the match should be annotated and saved separately. This procedure

Ederson Sgarbi; Díbio Leandro Borges

2005-01-01

51

Summarization of text-based documents with a determination of latent topical sections and information-rich sentences  

Microsoft Academic Search

A method is proposed for use in summarization of text-based documents. By means of the method it is possible to discover latent topical sections and information-rich sentences. The underlying basis of the method - clustering of sentences - is formulated mathematically in the form of a problem of quadratic-type integer programming. An algorithm that makes it possible to determine with

R. M. Alguliev; R. M. Alyguliev

2007-01-01

52

Automatic Text Summarization Based on Word-Clusters and Ranking Algorithms  

Microsoft Academic Search

This paper investigates a new approach for Single Document Summarization based on a Machine Learning ranking algorithm. The use of machine learning techniques for this task allows one to adapt summaries to the user needs and to the corpus characteristics. These desirable properties have motivated an increasing amount of work in this field over the last few years. Most

Massih-reza Amini; Nicolas Usunier; Patrick Gallinari

2005-01-01

53

Text segmentation using gabor filters for automatic document processing  

Microsoft Academic Search

There is a considerable interest in designing automatic systems that will scan a given paper document and store it on electronic media for easier storage, manipulation, and access. Most documents contain graphics and images in addition to text. Thus, the document image has to be segmented to identify the text regions, so that OCR techniques may be applied only to

Anil K. Jain; Sushil K. Bhattacharjee

1992-01-01

54

Mood avatar: automatic text-driven head motion synthesis  

Microsoft Academic Search

Natural head motion is an indispensable part of realistic facial animation. This paper presents a novel approach to synthesize natural head motion automatically based on grammatical and prosodic features, which are extracted by the text analysis part of a Chinese Text-to-Speech (TTS) system. A two-layer clustering method is proposed to determine elementary head motion patterns from a multimodal database which

Kaihui Mu; Jianhua Tao; Jianfeng Che; Minghao Yang

2010-01-01

55

Automatic inpainting scheme for video text detection and removal.  

PubMed

We present a two stage framework for automatic video text removal to detect and remove embedded video texts and fill-in their remaining regions by appropriate data. In the video text detection stage, text locations in each frame are found via an unsupervised clustering performed on the connected components produced by the stroke width transform (SWT). Since SWT needs an accurate edge map, we develop a novel edge detector which benefits from the geometric features revealed by the bandlet transform. Next, the motion patterns of the text objects of each frame are analyzed to localize video texts. The detected video text regions are removed, then the video is restored by an inpainting scheme. The proposed video inpainting approach applies spatio-temporal geometric flows extracted by bandlets to reconstruct the missing data. A 3D volume regularization algorithm, which takes advantage of bandlet bases in exploiting the anisotropic regularities, is introduced to carry out the inpainting task. The method does not need extra processes to satisfy visual consistency. The experimental results demonstrate the effectiveness of both our proposed video text detection approach and the video completion technique, and consequently the entire automatic video text removal and restoration process. PMID:24057006

Mosleh, Ali; Bouguila, Nizar; Hamza, Abdessamad Ben

2013-11-01

56

Automatic annotation of multilingual text collections with a conceptual thesaurus  

Microsoft Academic Search

Automatic annotation of documents with controlled vocabulary terms (descriptors) from a conceptual thesaurus is not only useful for document indexing and retrieval. The mapping of texts onto the same thesaurus furthermore makes it possible to establish links between similar documents. This is also a substantial requirement of the Semantic Web. This paper presents an almost language-independent system that maps documents written in different languages onto

Bruno Pouliquen; Ralf Steinberger; Camelia Ignat

2006-01-01

57

Toward a multi-sensor neural net approach to automatic text classification  

SciTech Connect

Many automatic text indexing and retrieval methods use a term-document matrix that is automatically derived from the text in question. Latent Semantic Indexing, a recent method for approximating large term-document matrices, appears to be quite useful in the problem of text information retrieval, rather than text classification. Here we outline a method that attempts to combine the strength of the LSI method with that of neural networks, in addressing the problem of text classification. In doing so, we also indicate ways to improve performance by adding additional "logical sensors" to the neural network, something that is hard to do with the LSI method when employed by itself. Preliminary results are summarized, but much work remains to be done.

Dasigi, V. [Sacred Heart Univ., Fairfield, CT (United States). Department of Computer Science and Information Technology]; Mann, R. [Oak Ridge National Laboratory, TN (United States)]

1996-01-26

58

Image-based mobile service: automatic text extraction and translation  

NASA Astrophysics Data System (ADS)

We present a new mobile service for the translation of text from images taken by consumer-grade cell-phone cameras. Such capability represents a new paradigm for users where a simple image provides the basis for a service. The ubiquity and ease of use of cell-phone cameras enables acquisition and transmission of images anywhere and at any time a user wishes, delivering rapid and accurate translation over the phone's MMS and SMS facilities. Target text is extracted completely automatically, requiring no bounding box delineation or related user intervention. The service uses localization, binarization, text deskewing, and optical character recognition (OCR) in its analysis. Once the text is translated, an SMS message is sent to the user with the result. Further novelties include that no software installation is required on the handset, any service provider or camera phone can be used, and the entire service is implemented on the server side.

Berclaz, Jérôme; Bhatti, Nina; Simske, Steven J.; Schettino, John C.

2010-02-01

59

Automatic extraction of relations between medical concepts in clinical texts  

PubMed Central

Objective A supervised machine learning approach to discover relations between medical problems, treatments, and tests mentioned in electronic medical records. Materials and methods A single support vector machine classifier was used to identify relations between concepts and to assign their semantic type. Several resources such as Wikipedia, WordNet, General Inquirer, and a relation similarity metric inform the classifier. Results The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task. When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. F1 is defined as 2*Precision*Recall/(Precision+Recall). Alternatively, when concepts and assertions were discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7. Discussion Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation discovery was treated independently. Joint classification of relations may further improve the quality of results. Also, joint learning of the discovery of concepts, assertions, and relations may also improve the results of automatic relation extraction. Conclusion Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. In addition, features based on similarity contribute to a decrease of 1.1% when they are not available.

Harabagiu, Sanda; Roberts, Kirk

2011-01-01
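The F1 definition quoted in the abstract above, 2*Precision*Recall/(Precision+Recall), can be checked against the reported figures with a short sketch. The function name `f1_score` and the zero-division guard are assumptions added for illustration. The precision and recall values are the ones given in the abstract, in percent.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall (percent) as reported in the abstract.
gold = f1_score(72.0, 75.3)   # gold-standard concepts and assertions
auto = f1_score(57.6, 41.7)   # automatically discovered concepts and assertions
print(round(gold, 1), round(auto, 1))
```

Because the published precision and recall are themselves rounded, a value recomputed this way can differ from the published F1 in the last digit.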

60

Toward a multi-sensor-based approach to automatic text classification  

SciTech Connect

Many automatic text indexing and retrieval methods use a term-document matrix that is automatically derived from the text in question. Latent Semantic Indexing is a method, recently proposed in the Information Retrieval (IR) literature, for approximating a large and sparse term-document matrix with a relatively small number of factors, and is based on a solid mathematical foundation. LSI appears to be quite useful in the problem of text information retrieval, rather than text classification. In this report, we outline a method that attempts to combine the strength of the LSI method with that of neural networks, in addressing the problem of text classification. In doing so, we also indicate ways to improve performance by adding additional "logical sensors" to the neural network, something that is hard to do with the LSI method when employed by itself. The various programs that can be used in testing the system with the TIPSTER data set are described. Preliminary results are summarized, but much work remains to be done.

Dasigi, V.R. [Sacred Heart Univ., Fairfield, CT (United States)]; Mann, R.C. [Oak Ridge National Lab., TN (United States)]

1995-10-01

61

Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text  

Microsoft Academic Search

We describe the design, prototyping and evaluation of ARC, a system for automatically compiling a list of authoritative web resources on any (sufficiently broad) topic. The goal of ARC is to compile resource lists similar to those provided by Yahoo! or Infoseek. The fundamental difference is that these services construct lists either manually or through a combination of human and

Soumen Chakrabarti; Byron Dom; Prabhakar Raghavan; Sridhar Rajagopalan; David Gibson; Jon M. Kleinberg

1998-01-01

62

Automatic resource compilation by analyzing hyperlink structure and associated text  

Microsoft Academic Search

We describe the design, prototyping and evaluation of ARC, a system for automatically compiling a list of authoritative Web resources on any (sufficiently broad) topic. The goal of ARC is to compile resource lists similar to those provided by Yahoo! or Infoseek. The fundamental difference is that these services construct lists either manually or through a combination of human and

Soumen Chakrabarti; Byron Dom; David Gibson; Jon M. Kleinberg; Prabhakar Raghavan; Sridhar Rajagopalan

1997-01-01

63

The Effects of Teaching a Genre-Specific Reading Comprehension Strategy on Struggling Fifth Grade Students' Ability to Summarize and Analyze Argumentative Texts  

ERIC Educational Resources Information Center

This study examined the effectiveness of instruction in a genre-specific reading comprehension strategy, "Critical Analysis of Argumentative Text" (CAAT), which was designed to help students to identify, summarize and critically analyze parts of argumentative text. Based on the premise that reading and writing require similar knowledge of text

Haria, Priti Damji

2010-01-01

64

OCELOT: a system for summarizing Web pages  

Microsoft Academic Search

We introduce OCELOT, a prototype system for automatically generating the “gist” of a web page by summarizing it. Although most text summarization research to date has focused on the task of news articles, web pages are quite different in both structure and content. Instead of coherent text with a well-defined discourse structure, they are more often likely to be a

Adam L. Berger; Vibhu O. Mittal

2000-01-01

65

Automatic Processing of Japanese Text Data Based on the Occurrence Frequency Distributions of Kanji  

NASA Astrophysics Data System (ADS)

The use of occurrence frequency distributions of Kanji for automatic indexing and automatic classification of Japanese texts is investigated. The idea rests on two observations: it is usually difficult to separate a Japanese text into words automatically, and most keywords in a Japanese text involve Kanji. The concept of 'subject discriminative power' of Kanji is introduced, and the Kanji with high discriminative power, both in each subfield and across the electrical engineering field as a whole, are given based on the frequency distributions of Kanji in the JICST File. Experiments in automatic classification and automatic indexing of a sample document collection were carried out using the Kanji of high discriminative power in the documents. It was concluded that frequency distribution data for Kanji are significantly effective for automatic processing of Japanese texts, but that Kanji data alone are insufficient.

Hosono, Kimio; Harada, Takashi; Umeda, Shigeki; Morohashi, Masayuki; Goto, Tomonori; Moriya, Satoru

66

Automatic Extraction of Useful Facet Hierarchies from Text Databases  

Microsoft Academic Search

Databases of text and text-annotated data constitute a significant fraction of the information available in electronic form. Searching and browsing are the typical ways that users locate items of interest in such databases. Faceted interfaces represent a new powerful paradigm that proved to be a successful complement to keyword searching. Thus far, the identification of the facets was either a

Wisam Dakka; Panagiotis G. Ipeirotis

2008-01-01

67

Automatic Acquisition of Subcategorization Frames from Untagged Text  

Microsoft Academic Search

This paper describes an implemented program that takes a raw, untagged text corpus as its only input (no open-class dictionary) and generates a partial list of verbs occurring in the text and the subcategorization frames (SFs) in which they occur. Verbs are detected by a novel technique based on the Case Filter of Rouvret and Vergnaud (1980). The completeness of

Michael R. Brent; Robert C. Berwick

1991-01-01

68

Using a generalized instance set for automatic text categorization  

Microsoft Academic Search

We investigate several recent approaches for text categorization under the framework of similarity-based learning. They include two families of text categorization techniques, namely the k-nearest neighbor (k-NN) algorithm and linear classifiers. After identifying the weakness and strength of each technique, we propose a new technique known as the generalized instance set (GIS) algorithm by unifying the strengths of k-NN and

Wai Lam; Chao Yang Ho

1998-01-01
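As a point of reference for the k-NN family mentioned in the abstract above, the following is a minimal sketch of k-nearest-neighbor text categorization using bag-of-words vectors and cosine similarity. It is not the GIS algorithm, which the truncated abstract does not specify. The training examples, the labels, the whitespace tokenizer, and all function names are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (assumption: lowercase whitespace tokens)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_classify(query, training, k=3):
    """Label `query` by majority vote among its k most similar training texts."""
    q = vectorize(query)
    ranked = sorted(training, key=lambda ex: cosine(q, vectorize(ex[0])), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled examples, for illustration only.
TRAIN = [
    ("stock market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("market rally lifts bank shares", "finance"),
    ("team wins the final match", "sport"),
    ("striker scores in cup match", "sport"),
    ("coach praises team after win", "sport"),
]
print(knn_classify("shares rise as market opens", TRAIN))
```

A linear classifier would instead learn one weight vector per category, which is the other family the abstract compares against.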

69

Automatic text categorization in terms of genre and author  

Microsoft Academic Search

The two main factors that characterize a text are its content and its style, and both can be used as a means of categorization. In this paper we present an approach to text categorization in terms of genre and author for Modern Greek. In contrast to previous stylometric approaches, we attempt to take full advantage of existing natural language processing

Efstathios Stamatatos; Nikos Fakotakis; George K. Kokkinakis

2000-01-01

70

Automatic Model Structuring from Text using BioMedical Ontology  

Microsoft Academic Search

Bayesian Networks and Influence Diagrams are effective methods for structuring clinical problems. Constructing a relevant structure without the numerical probabilities in itself is a challenging task. In addition, due to the rapid rate of innovations and new findings in the biomedical domain, constructing a relevant graphical model becomes even more challenging. Building a model structure from text with minimum intervention

Rohit Joshi; Xiaoli Li; Sreeram Ramachandaran; Tze Yun Leong

71

Summarizing Reflections.  

ERIC Educational Resources Information Center

Summarizes the conference proceedings, asserting that two fundamental questions were addressed: "How can international cohesion be achieved without also reducing diversity?" and "How can collateral damage to local higher education systems be avoided?" (EV)

Edwards, Kenneth

2003-01-01

72

Automatic Generation of Students' Conceptual Models Underpinned by Free-Text Adaptive Computer Assisted Assessment  

Microsoft Academic Search

In this paper, we present an automatic procedure to generate students' knowledge conceptual models from their answers to an automatic free-text scoring system. The conceptual model is defined as a simplified representation of the concepts and relationships among them that each student keeps in his or her mind about an area of knowledge. It is considered that each area of

Diana Pérez-Marín; Enrique Alfonseca; Manuel Freire; Pilar Rodríguez; José María Guirao; Antonio Moreno-Sandoval

2006-01-01

73

Automatic Generation of Students' Conceptual Models from Answers in Plain Text  

Microsoft Academic Search

Recently, we have introduced a new procedure to automatically generate students’ conceptual models to assist teachers in finding out their students’ main misconceptions and lack of concepts, from their interaction with an automatic and adaptive free-text scoring system. In this paper, we present an improvement of this procedure: the models can be built from the students’ answers in plain text

Diana Rosario Pérez Marín; Enrique Alfonseca; Pilar Rodríguez; Ismael Pascual-nieto

2007-01-01

74

Combining MEDLINE and publisher data to create parallel corpora for the automatic translation of biomedical text  

PubMed Central

Background Most of the institutional and research information in the biomedical domain is available in the form of English text. Even in countries where English is an official language, such as the United States, language can be a barrier for accessing biomedical information for non-native speakers. Recent progress in machine translation suggests that this technique could help make English texts accessible to speakers of other languages. However, the lack of adequate specialized corpora needed to train statistical models currently limits the quality of automatic translations in the biomedical domain. Results We show how a large-sized parallel corpus can automatically be obtained for the biomedical domain, using the MEDLINE database. The corpus generated in this work comprises article titles obtained from MEDLINE and abstract text automatically retrieved from journal websites, which substantially extends the corpora used in previous work. After assessing the quality of the corpus for two language pairs (English/French and English/Spanish) we use the Moses package to train a statistical machine translation model that outperforms previous models for automatic translation of biomedical text. Conclusions We have built translation data sets in the biomedical domain that can easily be extended to other languages available in MEDLINE. These sets can successfully be applied to train statistical machine translation models. While further progress should be made by incorporating out-of-domain corpora and domain-specific lexicons, we believe that this work improves the automatic translation of biomedical texts.

2013-01-01

75

Automatic text extraction from video for content-based annotation and retrieval  

Microsoft Academic Search

Efficient content-based retrieval of image and video databases is an important application due to rapid proliferation of digital video data on the Internet and corporate intranets. Text either embedded or superimposed within video frames is very useful for describing the contents of the frames, as it enables both keyword and free-text based search, automatic video logging, and video cataloging. We

Jae-Chang Shim; Chitra Dorai; Ruud Bolle

1998-01-01

76

Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text  

Microsoft Academic Search

It is well known that utterances convey a great deal of information about the speaker in addition to their semantic content. One such type of information consists of cues to the speaker's personality traits, the most fundamental dimension of variation between humans. Recent work explores the automatic detection of other types of pragmatic variation in text and conversation, such as

François Mairesse; Marilyn A. Walker; Matthias R. Mehl; Roger K. Moore

2007-01-01

77

Automatic Cataloguing and Searching for Retrospective Data by Use of OCR Text.  

ERIC Educational Resources Information Center

Describes efforts in supporting information retrieval from OCR (optical character recognition) degraded text. Reports on approaches used in an automatic cataloging and searching contest for books in multiple languages, including a vector space retrieval model, an n-gram indexing method, and a weighting scheme; and discusses problems of Asian…

Tseng, Yuen-Hsien

2001-01-01
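The n-gram indexing idea mentioned in the abstract above can be sketched as follows: documents are indexed by overlapping character n-grams, so a query can still match a word that OCR has partly corrupted. This is a minimal illustration only. The weighting scheme and vector space model from the report are not reproduced, and the sample documents, the choice of trigrams, and the function names are assumptions.

```python
from collections import defaultdict

def char_ngrams(text, n=3):
    """Set of overlapping character n-grams of the lowercased text."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def build_index(docs, n=3):
    """Inverted index: n-gram -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index[gram].add(doc_id)
    return index

def search(query, index, n=3):
    """Rank documents by the number of query n-grams they share."""
    scores = defaultdict(int)
    for gram in char_ngrams(query, n):
        for doc_id in index.get(gram, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical documents; "retrieva1" and "b00ks" simulate OCR errors.
DOCS = {
    "a": "information retrieva1 from scanned b00ks",
    "b": "cooking recipes for beginners",
}
index = build_index(DOCS)
print(search("retrieval", index))
```

An exact word match for "retrieval" would miss document "a", whereas most of the query's trigrams survive the single misrecognized character.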

78

Evaluation of Extractive Voicemail Summarization  

NSDL National Science Digital Library

This interesting paper outlines a framework for automatic summarization of voicemail messages and delivery as compact text messages. The proposed system, developed at the University of Sheffield, incorporates speech recognition technology and summary word extraction. An overview of the feature selection process is especially interesting, as it briefly describes how pitch, word duration, and pauses in the voicemail message are used to obtain a compressed subset of the most important features. A number of experiments were performed to determine the system's accuracy and usability, and the results are presented in the paper.

Koumpis, Konstantinos; Renals, Steve

79

Validating Automatically Generated Students' Conceptual Models from Free-text Answers at the Level of Concepts  

Microsoft Academic Search

Students' conceptual models can be defined as networks of interconnected concepts, in which a confidence-value (CV) is estimated per each concept. This CV indicates how confident the system is that each student knows the concept according to how the student has used it in the free-text answers provided to an automatic free-text scoring system. In a previous work, a preliminary

Diana Pérez-Marín; Ismael Pascual-Nieto; Pilar Rodríguez; Eloy Anguiano; Enrique Alfonseca

2008-01-01

80

The Effects of Teaching a Text-Structure Based Reading Comprehension Strategy on Struggling Fifth Grade Students' Ability to Summarize and Analyze Written Arguments  

ERIC Educational Resources Information Center

The purpose of this research was to examine the effectiveness of teaching fifth grade students with reading difficulties a genre-specific strategy for summarizing and critically analyzing written arguments. In addition, this research explored whether learning this particular reading strategy informed the students' ability to write effective and…

Haria, Priti; MacArthur, Charles; Santoro, Lana Edwards

2010-01-01

81

Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text  

Microsoft Academic Search

Previous work has shown that Formal Concept Analysis (FCA) can be used to automatically acquire taxonomies from Indo-European text. The taxonomies are built via FCA using syntactic dependencies as attributes such as verb/head-object, verb/head-subject and verb/prepositional phrase-complement. This paper discusses the overall process of learning taxonomy using FCA with the same syntactic dependencies as the English language which is then

Mohd Zakree Ahmad Nazri; Siti Mariyam Shamsudin; Azuraliza Abu Bakar; Tarmizi Abd Ghani

2008-01-01

82

Semi automatic indexing of PostScript files using Medical Text Indexer in medical education.  

PubMed

At Albert Einstein College of Medicine a large part of the online lecture materials consists of PostScript files. As the collection grows, it becomes essential to create a digital library that gives easy access to relevant sections of the lecture material through a full-text index; to create this index it is necessary to extract all the text from the document files that constitute the originals of the lectures. In this study we present a semi automatic indexing method that uses a robust technique for extracting text from PostScript files and the National Library of Medicine's Medical Text Indexer (MTI) program for indexing the text. This model can be applied to other medical schools for indexing purposes. PMID:18694151

Mollah, Shamim Ara; Cimino, Christopher

2007-10-11

83

Automatic Story Segmentation of Closed-Caption Text for Semantic Content Analysis of Broadcasted Sports Video  

Microsoft Academic Search

Sports videos can be characterized as a sequence of recurrent semantic story units. Storing sports videos in this story-unit-based form will support the development of an intelligent content-based retrieval, browsing, and summarization system. The storage requires segmentation of videos and semantic understanding of each segment. Since transcribed broadcasted video speech, the closed-caption text, can be a useful information source for semantic

Naoko Nitta

2002-01-01

84

Text localization and character segmentation algorithms for automatic recognition of slab identification numbers  

NASA Astrophysics Data System (ADS)

This paper describes application-oriented text localization and character segmentation algorithms in images. The target text in our application includes many unclear characters due to a poor environment, and their positions are variable in the images. Consequently, it is difficult to expect a high success rate when using existing text localization algorithms that have been developed for generic texts. Therefore, it is necessary to develop a new text localization algorithm. We propose (1) a coarse algorithm for detecting top and bottom boundaries, (2) a fitness function that is used to decide the true text among the text candidates, (3) two kinds of presegmentation algorithms for calculating the fitness function, and (4) a blank-detecting algorithm that determines whether the text is upside down or not. With the proposed algorithms, input upside-down text is rotated automatically without using any supervised or unsupervised learning methods; further, character segmentation can be done in the process of selecting the true text. To evaluate the algorithms, image data captured by the installed recognition system at Pohang Steel Company (POSCO) are used, and experimental results show that the proposed algorithms are fast and reliable.

Choi, Sunghoo; Yun, Jong Pil; Kim, Sang Woo

2009-03-01

85

Automatic topic identification of health-related messages in online health community using text classification.  

PubMed

To facilitate patient involvement in online health communities and help patients obtain the informative support and emotional support they need, this paper proposes a topic identification approach that automatically identifies the topics of health-related messages in an online health community, thus helping patients reach the messages most relevant to their queries efficiently. A feature-based classification framework was presented for automatic topic identification in our study. We first collected the messages related to some predefined topics in an online health community. Then we combined three different types of features, n-gram-based features, domain-specific features and sentiment features, to build four feature sets for health-related text representation. Finally, three different text classification techniques, C4.5, Naïve Bayes and SVM, were adopted to evaluate our topic classification model. By comparing different feature sets and different classification techniques, we found that n-gram-based features, domain-specific features and sentiment features were all effective in distinguishing different types of health-related topics. In addition, a feature reduction technique based on information gain was also effective in improving the topic classification performance. In terms of classification techniques, SVM outperformed C4.5 and Naïve Bayes significantly. The experimental results demonstrated that the proposed approach could identify the topics of online health-related messages efficiently. PMID:23961389

Lu, Yingjie

2013-07-10
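The feature reduction step based on information gain, mentioned in the abstract above, can be sketched as follows: each candidate term is scored by how much knowing its presence or absence reduces the entropy of the topic labels, and the highest-scoring terms are kept. The toy messages, the topic labels, and the function names are hypothetical. The paper's actual feature sets and classifiers are not reproduced.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """Reduction in label entropy from splitting on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part)
                      for part in (with_term, without) if part)
    return entropy(labels) - conditional

# Hypothetical messages (as token sets) and topic labels, for illustration only.
DOCS = [{"dose", "insulin"}, {"insulin", "pump"}, {"feel", "sad"}, {"sad", "alone"}]
LABELS = ["treatment", "treatment", "emotion", "emotion"]

vocab = sorted(set().union(*DOCS))
ranked = sorted(vocab, key=lambda t: information_gain(DOCS, LABELS, t), reverse=True)
print(ranked[:2])  # the two terms kept after feature reduction
```

Terms that occur in only one topic score highest, and terms spread evenly across topics score near zero and are the first to be dropped.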

86

Automatic identification of ROI in figure images toward improving hybrid (text and image) biomedical document retrieval  

NASA Astrophysics Data System (ADS)

Biomedical images are often referenced for clinical decision support (CDS), educational purposes, and research. They appear in specialized databases or in biomedical publications and are not meaningfully retrievable using primarily text-based retrieval systems. The task of automatically finding the images in an article that are most useful for the purpose of determining relevance to a clinical situation is quite challenging. An approach is to automatically annotate images extracted from scientific publications with respect to their usefulness for CDS. As an important step toward achieving the goal, we proposed figure image analysis for localizing pointers (arrows, symbols) to extract regions of interest (ROI) that can then be used to obtain meaningful local image content. Content-based image retrieval (CBIR) techniques can then associate local image ROIs with identified biomedical concepts in figure captions for improved hybrid (text and image) retrieval of biomedical articles. In this work we present methods that make robust our previous Markov random field (MRF)-based approach for pointer recognition and ROI extraction. These include use of Active Shape Models (ASM) to overcome problems in recognizing distorted pointer shapes and a region segmentation method for ROI extraction. We measure the performance of our methods on two criteria: (i) effectiveness in recognizing pointers in images, and (ii) improved document retrieval through use of extracted ROIs. Evaluation on three test sets shows 87% accuracy in the first criterion. Further, the quality of document retrieval using local visual features and text is shown to be better than using visual features alone.

You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Rahman, Md Mahmudur; Govindaraju, Venu; Thoma, George R.

2011-01-01

87

Automatic vs. manual curation of a multi-source chemical dictionary: the impact on text mining  

PubMed Central

Background Previously, we developed a combined dictionary dubbed Chemlist for the identification of small molecules and drugs in text based on a number of publicly available databases and tested it on an annotated corpus. To achieve an acceptable recall and precision we used a number of automatic and semi-automatic processing steps together with disambiguation rules. However, it remained to be investigated what impact an extensive manual curation of a multi-source chemical dictionary would have on chemical term identification in text. ChemSpider is a chemical database that has undergone extensive manual curation aimed at establishing valid chemical name-to-structure relationships. Results We acquired the component of ChemSpider containing only manually curated names and synonyms. Rule-based term filtering, semi-automatic manual curation, and disambiguation rules were applied. We tested the dictionary from ChemSpider on an annotated corpus and compared the results with those for the Chemlist dictionary. The ChemSpider dictionary of ca. 80 k names was only 1/3 to 1/4 the size of Chemlist at around 300 k. The ChemSpider dictionary had a precision of 0.43 and a recall of 0.19 before the application of filtering and disambiguation and a precision of 0.87 and a recall of 0.19 after filtering and disambiguation. The Chemlist dictionary had a precision of 0.20 and a recall of 0.47 before the application of filtering and disambiguation and a precision of 0.67 and a recall of 0.40 after filtering and disambiguation. Conclusions We conclude the following: (1) The ChemSpider dictionary achieved the best precision but the Chemlist dictionary had a higher recall and the best F-score; (2) Rule-based filtering and disambiguation is necessary to achieve a high precision for both the automatically generated and the manually curated dictionary. 
ChemSpider is available as a web service at http://www.chemspider.com/ and the Chemlist dictionary is freely available as an XML file in Simple Knowledge Organization System format on the web at http://www.biosemantics.org/chemlist.

2010-01-01
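
The chemical dictionary record above reports precision and recall but not the F-scores behind its conclusion. As a rough check, and assuming the balanced F1 measure (the abstract does not say which F-score variant was used), the post-filtering figures can be combined as follows. The function name `f1_score` is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Post-filtering precision and recall as quoted in the abstract.
chemspider_f1 = f1_score(0.87, 0.19)
chemlist_f1 = f1_score(0.67, 0.40)

print(f"ChemSpider F1: {chemspider_f1:.2f}")
print(f"Chemlist F1:   {chemlist_f1:.2f}")
```

Under this assumption Chemlist comes out ahead (about 0.50 versus about 0.31), which is consistent with the abstract's statement that Chemlist had the best F-score despite its lower precision.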

88

From episodes of care to diagnosis codes: automatic text categorization for medico-economic encoding.  

PubMed

We report on the design and evaluation of an original system to help assign ICD (International Classification of Disease) codes to clinical narratives. The task is defined as a multi-class multi-document classification task. We combine a set of machine learning and data-poor methods to generate a single automatic text categorizer, which returns a ranked list of ICD codes. The combined ranking system currently obtains a precision of 75% at high ranks and a recall of about 63% for the top twenty returned codes for a theoretical upper bound of about 79% (inter-coder agreement). The performance of the data-poor classifier is weak, whereas the use of temporally-typed contents such as anamnesis and prescription free text sections results in a statistically significant improvement. PMID:18999206

Ruch, Patrick; Gobeill, Julien; Tbahriti, Imad; Geissbühler, Antoine

2008-11-06

89

Challenges for automatically extracting molecular interactions from full-text articles  

PubMed Central

Background The increasing availability of full-text biomedical articles will allow more biomedical knowledge to be extracted automatically with greater reliability. However, most Information Retrieval (IR) and Extraction (IE) tools currently process only abstracts. The lack of corpora has limited the development of tools that are capable of exploiting the knowledge in full-text articles. As a result, there has been little investigation into the advantages of full-text document structure, and the challenges developers will face in processing full-text articles. Results We manually annotated passages from full-text articles that describe interactions summarised in a Molecular Interaction Map (MIM). Our corpus tracks the process of identifying facts to form the MIM summaries and captures any factual dependencies that must be resolved to extract the fact completely. For example, a fact in the results section may require a synonym defined in the introduction. The passages are also annotated with negated and coreference expressions that must be resolved. We describe the guidelines for identifying relevant passages and possible dependencies. The corpus includes 2162 sentences from 78 full-text articles. Our corpus analysis demonstrates the necessity of full-text processing; identifies the article sections where interactions are most commonly stated; and quantifies the proportion of interaction statements requiring coherent dependencies. Further, it allows us to report on the relative importance of identifying synonyms and resolving negated expressions. We also experiment with an oracle sentence retrieval system using the corpus as a gold-standard evaluation set. Conclusion We introduce the MIM corpus, a unique resource that maps interaction facts in a MIM to annotated passages within full-text articles. It is an invaluable case study providing guidance to developers of biomedical IR and IE systems, and can be used as a gold-standard evaluation set for full-text IR tasks.

McIntosh, Tara; Curran, James R

2009-01-01

90

Automatic coding of reasons for hospital referral from general medicine free-text reports.  

PubMed Central

Although the coding of medical data is expected to benefit both patients and the health care system, its implementation as a manual process often represents an unattractive workload for the physician. For epidemiological purposes, we developed a simple automatic coding system based on string matching, which was designed to process free-text sentences stating reasons for hospital referral, as collected from general practitioners (GPs). This system relied on a look-up table, built up from 2590 reports giving a single reason for referral, which were coded manually according to the International Classification of Primary Care (ICPC). We tested the system by entering 797 new reasons for referral. The match rate was estimated at 77%, and the accuracy rate at 80% at code level and 92% at chapter level. This simple system is now routinely used by a national epidemiological network of sentinel physicians.

Letrilliart, L.; Viboud, C.; Boelle, P. Y.; Flahault, A.

2000-01-01
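
The referral-coding record above describes a look-up table built from manually coded reports and queried by string matching. A minimal sketch of that general idea follows. The normalization steps, the function names, and the example codes are assumptions for illustration (the codes are placeholders, not verified ICPC assignments); the abstract does not describe the actual matching rules.

```python
import re

def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())

def build_lookup(coded_reports):
    """Map each normalized reason string to its manually assigned code."""
    return {normalize(reason): code for reason, code in coded_reports}

def code_reason(reason: str, lookup: dict):
    """Return the code for an exact normalized match, or None if unmatched."""
    return lookup.get(normalize(reason))

# Hypothetical training pairs; the codes are illustrative placeholders.
training = [
    ("Chest pain", "A11"),
    ("Abdominal pain.", "D01"),
]
lookup = build_lookup(training)

new_reasons = ["chest  PAIN!", "dizziness"]
coded = [code_reason(r, lookup) for r in new_reasons]
match_rate = sum(c is not None for c in coded) / len(new_reasons)

print(coded)
print(match_rate)
```

In this framing the "match rate" is the share of new reasons that hit the table at all, while accuracy would be measured separately by comparing the returned codes with manual coding.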

91

Texting  

ERIC Educational Resources Information Center

With the increasing ranks of cell phone ownership is an increase in text messaging, or texting. During 2008, more than 2.5 trillion text messages were sent worldwide--that's an average of more than 400 messages for every person on the planet. Although many of the messages teenagers text each day are perhaps nothing more than "how r u?" or "c u…

Tilley, Carol L.

2009-01-01

92

Character-based movie summarization  

Microsoft Academic Search

A decent movie summary helps movie producers promote the movie and helps the audience capture the theme of the movie before watching the whole movie. Most existing automatic movie summarization approaches rely heavily on video content only, which may not deliver ideal results due to the semantic gap between computer calculated low-level features and human used

Jitao Sang; Changsheng Xu

2010-01-01

93

The Effects of Two Summarization Strategies Using Expository Text on the Reading Comprehension and Summary Writing of Fourth- and Fifth-Grade Students in an Urban, Title 1 School  

ERIC Educational Resources Information Center

Using a quasi-experimental pretest/posttest design, this study examined the effects of two summarization strategies on the reading comprehension and summary writing of fourth- and fifth-grade students in an urban, Title 1 school. The strategies, "G"enerating "I"nteractions between "S"chemata and "T"ext (GIST) and Rule-based, were taught using…

Braxton, Diane M.

2009-01-01

94

Summarizing can improve metacomprehension accuracy  

Microsoft Academic Search

In two experiments, it was examined whether the accuracy of comprehension monitoring (metacomprehension accuracy) was improved by summarizing texts. College students read texts and then some wrote a summary of each text (either immediately after reading or after a delay—the delay between reading and summarizing was filled by the reading of the remaining texts), whereas others did not (the control

Keith W Thiede; Mary C. M Anderson

2003-01-01

95

Automatic Seed Word Selection for Unsupervised Sentiment Classification of Chinese Text  

Microsoft Academic Search

We describe and evaluate a new method of automatic seed word selection for unsupervised sentiment classification of product reviews in Chinese. The whole method is unsupervised and does not require any annotated training data; it only requires information about commonly occurring negations and adverbials. Unsupervised techniques are promising for this task since they avoid problems of

Taras Zagibalov; John Carroll

2008-01-01

96

Parse: A System for Automatic Syntactic Analysis of English Text, Part II.  

National Technical Information Service (NTIS)

A list is presented of the three major components of PARSE, a system for the automatic syntactic analysis of English sentences. It contains: (1) a glossary of the words used, grouped alphabetically by families; (2) a presentation of the words organized by...

J. Robinson; S. Marks

1965-01-01

97

An Automatic Classification of Book Texts to User-Defined Tags  

Microsoft Academic Search

We describe work on automatically assigning labels to books using user-defined tags as the label set. Using supervised learning and exploring both binary and multiclass classification, we train and test classifiers on several sets of features, focusing on the size of the sets, part-of-speech classes and named entities. Results indicate that a binary classifier, trained and tested

Sharon Givon; Theresa Wilson

2008-01-01

98

Automatic Summarization of Personal Photo Collections  

Microsoft Academic Search

Photo taking and sharing devices (e.g., smart phones, digital cameras, etc.) have become extremely popular in recent times. Photo enthusiasts today capture moments of their personal lives using these devices. This has resulted in huge collections of photos stored in various personal archives. The exponential growth of online social networks and web based photo sharing platforms has added fuel to

Pinaki Sinha

2011-01-01

99

Is automatic classification a reasonable application of statistical analysis of text?  

Microsoft Academic Search

The statistical approach to the analysis of document collections and retrieval therefrom has proceeded along two main lines, associative machine searching and automatic classification. The former approach has been favored because of the tendency of people in the computer field to strive for new methods of dealing with the literature -- methods which do not resemble those of traditional libraries.

Lauren B. Doyle

1965-01-01

100

Unsupervised method for automatic construction of a disease dictionary from a large free text collection.  

PubMed

Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting contextual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35-88%) over available, manually created disease terminologies. PMID:18999169

Xu, Rong; Supekar, Kaustubh; Morgan, Alex; Das, Amar; Garber, Alan

2008-11-06

101

Extraction-Based Text Categorization: Generating Domain-Specific Role Relationships Automatically  

Microsoft Academic Search

In previous work, we developed several algorithms that use information extraction techniques to achieve high-precision text categorization. The relevancy signatures algorithm classifies texts using extraction patterns, and the augmented relevancy signatures algorithm classifies texts using extraction patterns and semantic features associated with role fillers (Riloff and Lehnert, 1994). These algorithms relied on hand-coded training data, including annotated texts and a

Ellen Riloff; Jeffrey Lorenzen

1998-01-01

102

Web-based UMLS concept retrieval by automatic text scanning: a comparison of two methods.  

PubMed

The Web is increasingly the medium of choice for multi-user application program delivery. Yet selection of an appropriate programming environment for rapid prototyping, code portability, and maintainability remain issues. We summarize our experience on the conversion of a LISP Web application, Search/SR to a new, functionally identical application, Search/SR-ASP using a relational database and active server pages (ASP) technology. Our results indicate that provision of easy access to database engines and external objects is almost essential for a development environment to be considered viable for rapid and robust application delivery. While LISP itself is a robust language, its use in Web applications may be hard to justify given that current vendor implementations do not provide such functionality. Alternative, currently available scripting environments for Web development appear to have most of LISP's advantages and few of its disadvantages. PMID:11084231

Brandt, C; Nadkarni, P

2001-01-01

103

Summarizing Email Threads  

Microsoft Academic Search

Summarizing threads of email is different from summarizing other types of written communication as it has an inherent dialog structure. We present initial research which shows that sentence extraction techniques can work for email threads as well, but profit from email-specific features. In addition, the presentation of the summary should take into account the dialogic structure of email

Owen Rambow; Lokesh Shrestha; John Chen; Christy Lauridsen

104

Deep versus broad methods for automatic extraction of intelligence information from text  

Microsoft Academic Search

Extraction of intelligence from text data is increasingly becoming automated as software and network technology increases in speed and scope. However, enormous amounts of text data are often available and one must carefully design a data mining strategy to obtain the relevant nuggets of gold from the mountains of useless dross. Two strategies can be tried. A \\

Neil C. Rowe; Jonathan Wintrode; Jason Sparks; Jonathan Vorrath; Matthew Lear

105

Automatic Text Formatting for Social Media Based on Linefeed and Comma Insertion  

Microsoft Academic Search

With the appearance of social media, people have become able to transmit information easily on a personal level. However, because users of social media generally spend little time describing information, low-quality texts are transmitted, which hinders the spread of information. In texts transmitted on social media, commas and linefeeds are inserted incorrectly, and this becomes a factor

Masaki Murata; Tomohiro Ohno; Shigeki Matsubara

106

AUTOMATIC IDENTIFICATION OF CAUSAL RELATIONS IN TEXT AND THEIR USE FOR IMPROVING PRECISION IN INFORMATION RETRIEVAL  

Microsoft Academic Search

This study represents one attempt to make use of relations expressed in text to improve information retrieval effectiveness. In particular, the study investigated whether the information obtained by matching causal relations expressed in documents with the causal relations expressed in users' queries could be used to improve document retrieval results in comparison to using just term matching without considering relations.

Christopher Soo-Guan Khoo

107

Test-Driving TANKA: Evaluating a Semi-automatic System of Text Analysis for Knowledge Acquisition  

Microsoft Academic Search

The evaluation of a large implemented natural language processing system involves more than its application to a common performance task. Such tasks have been used in the message understanding conferences (MUCs), text retrieval conferences (TRECs) as well as in speech technology and machine translation workshops. It is useful to compare the performance of different systems in a predefined application, but

Ken Barker; Sylvain Delisle; Stan Szpakowicz

1998-01-01

108

Webpage summarization using clickthrough data  

Microsoft Academic Search

Most previous Web-page summarization methods treat a Web page as plain text. However, such methods fail to uncover the full knowledge associated with a Web page needed in building a high-quality summary, because many of these methods do not consider the hidden relationships in the Web. Uncovering the hidden knowledge is important in building good Web-page summarizers. In this paper,

Jian-Tao Sun; Dou Shen; Hua-Jun Zeng; Qiang Yang; Yuchang Lu; Zheng Chen

2005-01-01

109

Summarizing Short Stories  

Microsoft Academic Search

We present an approach to the automatic creation of extractive summaries of literary short stories. The summaries are produced with a specific objective in mind: to help a reader decide whether she would be interested in reading the complete story. To this end, the summaries give the user relevant information about the setting of the story without revealing its plot.

Anna Kazantseva; Stan Szpakowicz

2010-01-01

110

Generalizability and Comparison of Automatic Clinical Text De-Identification Methods and Resources  

PubMed Central

In this paper, we present an evaluation of the hybrid best-of-breed automated VHA (Veteran’s Health Administration) clinical text de-identification system, nicknamed BoB, developed within the VHA Consortium for Healthcare Informatics Research. We also evaluate two available machine learning-based text de-identification systems: MIST and HIDE. Two different clinical corpora were used for this evaluation: a manually annotated VHA corpus, and the 2006 i2b2 de-identification challenge corpus. These experiments focus on the generalizability and portability of the classification models across different document sources. BoB demonstrated good recall (92.6%), satisfactorily prioritizing patient privacy, and also achieved competitive precision (83.6%) for preserving subsequent document interpretability. MIST and HIDE reached very competitive results, in most cases with high precision (92.6% and 93.6%), although recall was sometimes lower than desired for the most sensitive PHI categories.

Ferrandez, Oscar; South, Brett R.; Shen, Shuying; Friedlin, F. Jeff; Samore, Matthew H.; Meystre, Stephane M.

2012-01-01

111

Generalizability and comparison of automatic clinical text de-identification methods and resources.  

PubMed

In this paper, we present an evaluation of the hybrid best-of-breed automated VHA (Veteran's Health Administration) clinical text de-identification system, nicknamed BoB, developed within the VHA Consortium for Healthcare Informatics Research. We also evaluate two available machine learning-based text de-identification systems: MIST and HIDE. Two different clinical corpora were used for this evaluation: a manually annotated VHA corpus, and the 2006 i2b2 de-identification challenge corpus. These experiments focus on the generalizability and portability of the classification models across different document sources. BoB demonstrated good recall (92.6%), satisfactorily prioritizing patient privacy, and also achieved competitive precision (83.6%) for preserving subsequent document interpretability. MIST and HIDE reached very competitive results, in most cases with high precision (92.6% and 93.6%), although recall was sometimes lower than desired for the most sensitive PHI categories. PMID:23304289

Ferrández, Óscar; South, Brett R; Shen, Shuying; Friedlin, F Jeff; Samore, Matthew H; Meystre, Stéphane M

2012-11-03

112

Text Mining and Natural Language Processing Approaches for Automatic Categorization of Lay Requests to Web-Based Expert Forums  

PubMed Central

Background Both healthy and sick people increasingly use electronic media to obtain medical information and advice. For example, Internet users may send requests to Web-based expert forums, or so-called “ask the doctor” services. Objective To automatically classify lay requests to an Internet medical expert forum using a combination of different text-mining strategies. Methods We first manually classified a sample of 988 requests directed to an involuntary childlessness forum on the German website “Rund ums Baby” (“Everything about Babies”) into one or more of 38 categories belonging to two dimensions (“subject matter” and “expectations”). After creating start and synonym lists, we calculated the average Cramer’s V statistic for the association of each word with each category. We also used principal component analysis and singular value decomposition as further text-mining strategies. With these measures we trained regression models and determined, on the basis of best regression models, for any request the probability of belonging to each of the 38 different categories, with a cutoff of 50%. Recall and precision of a test sample were calculated as a measure of quality for the automatic classification. Results According to the manual classification of 988 documents, 102 (10%) documents fell into the category “in vitro fertilization (IVF),” 81 (8%) into the category “ovulation,” 79 (8%) into “cycle,” and 57 (6%) into “semen analysis.” These were the four most frequent categories in the subject matter dimension (consisting of 32 categories). The expectation dimension comprised six categories; we classified 533 documents (54%) as “general information” and 351 (36%) as a wish for “treatment recommendations.” The generation of indicator variables based on the chi-square analysis and Cramer’s V proved to be the best approach for automatic classification in about half of the categories. 
In combination with the two other approaches, 100% precision and 100% recall were realized in 18 (47%) out of the 38 categories in the test sample. For 35 (92%) categories, precision and recall were better than 80%. For some categories, the input variables (ie, “words”) also included variables from other categories, most often with a negative sign. For example, absence of words predictive for “menstruation” was a strong indicator for the category “pregnancy test.” Conclusions Our approach suggests a way of automatically classifying and analyzing unstructured information in Internet expert forums. The technique can perform a preliminary categorization of new requests and help Internet medical experts to better handle the mass of information and to give professional feedback.

Reincke, Ulrich; Michelmann, Hans Wilhelm

2009-01-01
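
The expert-forum record above relies on Cramér's V to measure how strongly each word is associated with each category. The sketch below shows the textbook formula for a single 2x2 word-by-category table, without continuity correction. How the study averaged these values is not described in the abstract, and the counts used here are made up for illustration.

```python
import math

def cramers_v_2x2(a: int, b: int, c: int, d: int) -> float:
    """Cramér's V for the 2x2 table [[a, b], [c, d]].

    a: requests in the category that contain the word
    b: requests in the category that lack the word
    c: requests outside the category that contain the word
    d: requests outside the category that lack the word
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so no association can be measured
    chi2 = n * (a * d - b * c) ** 2 / denom
    # V = sqrt(chi2 / (n * min(rows - 1, cols - 1))); the minimum is 1 for a 2x2 table.
    return math.sqrt(chi2 / n)

# Made-up counts for one word and one category across 100 requests.
v = cramers_v_2x2(a=30, b=10, c=10, d=50)
print(round(v, 3))
```

Values near 0 indicate that the word says little about category membership, and values near 1 indicate a near-deterministic association. V itself is unsigned, so the "negative sign" mentioned in the abstract would come from the regression coefficients rather than from this statistic.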

113

Degree centrality for semantic abstraction summarization of therapeutic studies  

PubMed Central

Automatic summarization has been proposed to help manage the results of biomedical information retrieval systems. Semantic MEDLINE, for example, summarizes semantic predications representing assertions in MEDLINE citations. Results are presented as a graph which maintains links to the original citations. Graphs summarizing more than 500 citations are hard to read and navigate, however. We exploit graph theory for focusing these large graphs. The method is based on degree centrality, which measures connectedness in a graph. Four categories of clinical concepts related to treatment of disease were identified and presented as a summary of input text. A baseline was created using term frequency of occurrence. The system was evaluated on summaries for treatment of five diseases compared to a reference standard produced manually by two physicians. The results showed that recall for system results was 72%, precision was 73%, and F-score was 0.72. The system F-score was considerably higher than that for the baseline (0.47).

Zhang, Han; Fiszman, Marcelo; Shin, Dongwook; Miller, Christopher M.; Rosemblat, Graciela; Rindflesch, Thomas C.

2011-01-01
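
The degree centrality record above uses connectedness to decide which concepts to keep in a summary graph. A minimal sketch of normalized degree centrality on an undirected graph follows. The edge list, node names, and function name are invented for illustration, and the way the study applies thresholds and concept categories is not reproduced here.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Normalized degree centrality: distinct neighbors / (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Toy graph: each edge stands for a predication linking two concepts (hypothetical).
edges = [
    ("drug_A", "disease_X"),
    ("drug_B", "disease_X"),
    ("drug_C", "disease_X"),
    ("drug_A", "gene_G"),
]
scores = degree_centrality(edges)
top = max(scores, key=scores.get)
print(top, scores[top])
```

A summarizer along these lines would keep only the highest-scoring concepts and the edges between them, which is one way to shrink a graph built from hundreds of citations.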

114

Dynamic summarization of bibliographic-based data  

PubMed Central

Background Traditional information retrieval techniques typically return excessive output when directed at large bibliographic databases. Natural Language Processing applications strive to extract salient content from the excessive data. Semantic MEDLINE, a National Library of Medicine (NLM) natural language processing application, highlights relevant information in PubMed data. However, Semantic MEDLINE implements manually coded schemas, accommodating few information needs. Currently, there are only five such schemas, while many more would be needed to realistically accommodate all potential users. The aim of this project was to develop and evaluate a statistical algorithm that automatically identifies relevant bibliographic data; the new algorithm could be incorporated into a dynamic schema to accommodate various information needs in Semantic MEDLINE, and eliminate the need for multiple schemas. Methods We developed a flexible algorithm named Combo that combines three statistical metrics, the Kullback-Leibler Divergence (KLD), Riloff's RlogF metric (RlogF), and a new metric called PredScal, to automatically identify salient data in bibliographic text. We downloaded citations from a PubMed search query addressing the genetic etiology of bladder cancer. The citations were processed with SemRep, an NLM rule-based application that produces semantic predications. SemRep output was processed by Combo, in addition to the standard Semantic MEDLINE genetics schema and independently by the two individual KLD and RlogF metrics. We evaluated each summarization method using an existing reference standard within the task-based context of genetic database curation. Results Combo asserted 74 genetic entities implicated in bladder cancer development, whereas the traditional schema asserted 10 genetic entities; the KLD and RlogF metrics individually asserted 77 and 69 genetic entities, respectively. Combo achieved 61% recall and 81% precision, with an F-score of 0.69. 
The traditional schema achieved 23% recall and 100% precision, with an F-score of 0.37. The KLD metric achieved 61% recall, 70% precision, with an F-score of 0.65. The RlogF metric achieved 61% recall, 72% precision, with an F-score of 0.66. Conclusions Semantic MEDLINE summarization using the new Combo algorithm outperformed a conventional summarization schema in a genetic database curation task. It potentially could streamline information acquisition for other needs without having to hand-build multiple saliency schemas.

2011-01-01

115

Evidence of a Highly Specific Relationship between Rapid Automatic Naming of Digits and Text-Reading Speed  

ERIC Educational Resources Information Center

This paper explores the specificity of the relationship between rapid automatic naming and reading fluency. Reading accuracy, rate, and fluency were measured among a sample of 67 children, the majority of whom were very poor readers. Regression analyses revealed that phonological processing tasks predicted reading accuracy and comprehension…

Savage, R.; Frederickson, N.

2005-01-01

116

Evidence of a Highly Specific Relationship between Rapid Automatic Naming of Digits and Text-Reading Speed  

ERIC Educational Resources Information Center

This paper explores the specificity of the relationship between rapid automatic naming and reading fluency. Reading accuracy, rate, and fluency were measured among a sample of 67 children, the majority of whom were very poor readers. Regression analyses revealed that phonological processing tasks predicted reading accuracy and comprehension whereas…

Savage, R.; Frederickson, N.

2005-01-01

117

Taming the Tiger Topic: An XCES Compliant Corpus Portal to Generate Subcorpora Based on Automatic Text-Topic Identification  

Microsoft Academic Search

Large-corpus projects generally use a rich header to describe their texts allowing several types of text searching to create study subcorpora. They normally use TEI (Text Encoding Initiative) or XCES (Corpus Encoding Standard for XML) as encoding standards. TEI was a very early initiative on standardizing text encoding. XCES is currently being largely used in corpus-based work in natural language

Marcelo Muniz; Fernando V. Paulovich; Rosane Minghim; Kleber Infante; Fernando Muniz; Renata Vieira; Sandra Aluísio

118

User and Device Adaptation in Summarizing Sports Videos  

Microsoft Academic Search

Video summarization is defined as creating a video summary which includes only important scenes in the original video streams. In order to realize automatic video summarization, the significance of each scene needs to be determined. When targeted especially on broadcast sports videos, a play scene, which corresponds to a play, can be considered as a scene unit. The significance of

Naoko Nitta; Noboru Babaguchi

2009-01-01

119

Summarization Techniques at DUC 2004  

Microsoft Academic Search

This paper presents the summarization techniques implemented by the University of Lethbridge summarizer in order to generate very short summaries (75 bytes) and short summaries (665 bytes) from single and multiple documents. We present these techniques in the context of DUC 2004.

Yllias Chali; Maheedhar Kolla

2004-01-01

120

Video summarization: methods and landscape  

NASA Astrophysics Data System (ADS)

The ability to summarize and abstract information will be an essential part of intelligent behavior in consumer devices. Various summarization methods have been the topic of intensive research in the content-based video analysis community. Summarization in traditional information retrieval is a well understood problem. While there has been a lot of research in the multimedia community there is no agreed upon terminology and classification of the problems in this domain. Although the problem has been researched from different aspects there is usually no distinction between the various dimensions of summarization. The goal of the paper is to provide the basic definitions of widely used terms such as skimming, summarization, and highlighting. The different levels of summarization: local, global, and meta-level are made explicit. We distinguish among the dimensions of task, content, and method and provide an extensive classification model for the same. We map the existing summary extraction approaches in the literature into this model and we classify the aspects of proposed systems in the literature. In addition, we outline the evaluation methods and provide a brief survey. Finally we propose future research directions based on the white spots that we identified by analysis of existing systems in the literature.

Barbieri, Mauro; Agnihotri, Lalitha; Dimitrova, Nevenka

2003-11-01

121

QCS: a system for querying, clustering and summarizing documents.  

SciTech Connect

Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming', and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. 
Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
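
The modular query, cluster, summarize flow described in this abstract can be sketched as below. This is only an illustration under stated assumptions: the record names Latent Semantic Indexing, generalized spherical k-means, and an HMM with pivoted QR decomposition as the actual components, whereas this sketch swaps in simple bag-of-words stand-ins for all three. Every function name and threshold here is hypothetical.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector; periods are treated as whitespace."""
    return Counter(text.lower().replace(".", " ").split())

def cosine(a, b):
    """Cosine similarity between two Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, min_sim=0.1):
    """Stand-in for retrieval: keep documents similar enough to the query."""
    q = bow(query)
    return [d for d in docs if cosine(q, bow(d)) >= min_sim]

def cluster(docs, threshold=0.3):
    """Stand-in for clustering: join the first cluster whose seed is similar."""
    clusters = []
    for d in docs:
        for c in clusters:
            if cosine(bow(d), bow(c[0])) >= threshold:
                c.append(d)
                break
        else:
            clusters.append([d])
    return clusters

def summarize(cluster_docs, query):
    """Stand-in for summarization: return the sentence closest to the query."""
    q = bow(query)
    sentences = [s.strip() for d in cluster_docs for s in d.split(".") if s.strip()]
    return max(sentences, key=lambda s: cosine(q, bow(s)))

def qcs(query, docs, retriever=retrieve, clusterer=cluster, summarizer=summarize):
    """Run the three stages; each stage can be replaced independently."""
    hits = retriever(query, docs)
    return [summarizer(c, query) for c in clusterer(hits)]
```

Passing a different `retriever`, `clusterer`, or `summarizer` to `qcs` mirrors the experimentation with alternative component instantiations that the abstract mentions.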

Dunlavy, Daniel M.; Schlesinger, Judith D. (Center for Computing Sciences, Bowie, MD); O'Leary, Dianne P. (University of Maryland, College Park, MD); Conroy, John M. (Center for Computing Sciences, Bowie, MD)

2006-10-01

122

QCS : a system for querying, clustering, and summarizing documents.  

SciTech Connect

Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming', and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. 
Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.

Dunlavy, Daniel M.

2006-08-01

123

Towards High-Quality Next-Generation Text-to-Speech Synthesis: A Multidomain Approach by Automatic Domain Classification  

Microsoft Academic Search

This paper is a contribution to the recent advancements in the development of high-quality next generation text-to-speech (TTS) synthesis systems. Two of the hottest research topics in this area are oriented towards the improvement of speech expressiveness and flexibility of synthesis. In this context, this paper presents a new TTS strategy called multidomain TTS (MD-TTS) for synthesizing among different domains.

Francesc Alías; Xavier Sevillano; Joan Claudi Socoró; Xavier Gonzalvo

2008-01-01

124

Adaptive Maximum Marginal Relevance Based Multi-email Summarization  

NASA Astrophysics Data System (ADS)

By analyzing the inherent relationship between the maximum marginal relevance (MMR) model and the content cohesion of emails with the same subject, this paper presents an adaptive maximum marginal relevance based multi-email summarization method. Because it computes an approximation of email content cohesion, the adaptive MMR is able to adjust its parameters automatically as the email sets change. The experimental results show that an email summarizing system based on this technique can increase the precision while reducing the redundancy of the automatic summary results, and consequently improves the average quality of email summaries.
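
For readers unfamiliar with MMR, the following is a minimal sketch of the standard greedy MMR selection step that the method above builds on. It is not the adaptive variant from the paper: the trade-off weight `lam` is fixed here rather than adjusted from email content cohesion, and the bag-of-words cosine similarity, function names, and defaults are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def mmr_select(sentences, query, k=2, lam=0.5):
    """Greedily pick k sentences, trading relevance against redundancy."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    qvec = Counter(query.lower().split())
    selected = []                      # indices already in the summary
    candidates = list(range(len(sentences)))

    def score(i):
        # Redundancy is the highest similarity to anything already chosen.
        redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
        return lam * cosine(vecs[i], qvec) - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

A larger `lam` favors relevance to the query (here, a stand-in for the email subject); a smaller one penalizes repeated content more strongly.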

Wang, Baoxun; Liu, Bingquan; Sun, Chengjie; Wang, Xiaolong; Li, Bo

125

MPEG content summarization based on compressed domain feature analysis  

Microsoft Academic Search

This paper addresses automatic summarization of MPEG audiovisual content in the compressed domain. By analyzing semantically important low-level and mid-level audiovisual features, our method universally summarizes MPEG-1/-2 content in the form of a digest or highlight. The former is a shortened version of an original, while the latter is an aggregation of important or interesting events. In our proposal, first, the

Masaru Sugano; Yasuyuki Nakajima; Hiromasa Yanagihara

2003-01-01

126

Summarizing and presenting numerical data.  

PubMed

Scientific hypothesis and type of the study define variables that have to be measured. Measurements are determined by four distinct scales of measurement; nominal, ordinal, interval and ratio, producing two distinct types of data: categorical and numerical. Numerical data are usually summarized and presented by distribution, measures of central tendency and dispersion. For normally distributed data, arithmetic mean and standard deviation are used. For data not normally distributed, median with data range (minimum to maximum, interquartile range or percentile range) and mode are used. Commonly used graph types in descriptive statistics for numerical data presentation are error bar and box-and-whisker plots. Outliers are values that are numerically distant from the rest of the data and must be recognized. PMID:22135849
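
The choice of summary measures described above can be sketched with Python's standard `statistics` module. This is an illustrative helper, not something taken from the article: the function name `summarize` and the `normal` flag are assumptions, and deciding whether data are normally distributed is left to the caller.

```python
import statistics

def summarize(values, normal=True):
    """Return the summary measures suggested for numerical data."""
    if normal:
        # Normally distributed data: arithmetic mean and (sample) standard deviation.
        return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    # Otherwise: median with the range and the interquartile range.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "iqr": (q1, q3),
    }
```

Note that `statistics.quantiles` supports more than one interpolation method, so interquartile limits can differ slightly between software packages.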

Pupovac, Vanja; Petrovecki, Mladen

2011-01-01

127

Content-based summarization for personal image library  

Microsoft Academic Search

With the accumulation of consumer's personal image library, the problem of managing, browsing, querying and presenting photos effectively and efficiently would become critical. We propose a framework for automatic organization of personal image libraries based on analysis of image creation time stamps and image contents to facilitate browsing and summarization of images.

Joo-Hwee Lim; Jun Li; Philippe Mulhem; Qi Tian

2003-01-01

128

Automatic Condensation of Electronic Publications by Sentence Selection  

Microsoft Academic Search

As electronic information access becomes the norm, and the variety of retrievable material increases, automatic methods of summarizing or condensing text will become critical. This paper describes a system that performs domain-independent automatic condensation of news from a large commercial news service encompassing 41 different publications. This system was evaluated against a system that condensed the same articles using only

Ronald Brandow; Karl Mitze; Lisa F. Rau

1995-01-01

129

Automatic Evaluation of Information Ordering: Kendall's Tau  

Microsoft Academic Search

This article considers the automatic evaluation of information ordering, a task underlying many text-based applications such as concept-to-text generation and multidocument summarization. We propose an evaluation method based on Kendall's τ, a metric of rank correlation. The method is inexpensive, robust, and representation independent. We show that Kendall's τ correlates reliably with human ratings and reading times.
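
As a rough illustration of how Kendall's τ scores an ordering, the sketch below uses the common no-ties form, τ = 1 − 2·(inversions) / (N(N−1)/2), where an inversion is a pair of items whose relative order differs between the two orderings. The function name and the assumption that both orderings contain the same distinct items are illustrative choices, not details taken from the article.

```python
from itertools import combinations

def kendall_tau(reference, candidate):
    """Kendall's tau between two orderings of the same distinct items."""
    if len(reference) < 2:
        raise ValueError("need at least two items to compare orderings")
    position = {item: i for i, item in enumerate(candidate)}
    # Ranks of the reference items as they appear in the candidate ordering.
    ranks = [position[item] for item in reference]
    n = len(ranks)
    # Count pairs that appear in the opposite relative order.
    inversions = sum(1 for a, b in combinations(ranks, 2) if a > b)
    return 1 - 2 * inversions / (n * (n - 1) / 2)
```

Identical orderings give 1, fully reversed orderings give −1, and a single swap of adjacent items lowers the score by a small step.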

Mirella Lapata

2006-01-01

130

Affect Units and Narrative Summarization.  

National Technical Information Service (NTIS)

The analysis of narrative text involves various levels of description. On the lowest level are word meanings and syntactic structure within single sentences. On a higher level there are problems of generating inferences and integrating information into me...

W. G. Lehnert

1980-01-01

131

Summarize to Get the Gist  

ERIC Educational Resources Information Center

As schools prepare for the common core state standards in literacy, they'll be confronted with two challenges: first, helping students comprehend complex texts, and, second, training students to write arguments supported by factual evidence. A teacher's response to these challenges might be to lead class discussions about complex reading or assign…

Collins, John

2012-01-01

132

Summarizing qualitative behavior from measurements of nonlinear circuits, revision  

NASA Astrophysics Data System (ADS)

The process of exploring the behavior of nonlinear, dynamical systems can be a time-consuming and tedious process. A program was written which automates much of the work of an experimental dynamicist. In particular, the program automatically characterizes the behavior of any driven, nonlinear electrical circuit exhibiting interesting behavior below the 10 MHz range. In order to accomplish this task, the program can autonomously select interesting input parameters, drive the circuit, measure its response, perform a set of numeric computations on the measured data, interpret the results and decompose the circuit's parameter space into regions of qualitatively distinct behavior. The output is a two-dimensional portrait summarizing the high-level, qualitative behavior of the nonlinear circuit for every point in the graph as well as an accompanying textual explanation describing any interesting patterns observed in the diagram. In addition to the graph and the text, the program generates a symbolic description of the circuit's behavior. This intermediate data structure can then be passed onto other programs for further analysis.

Lee, Michelle K.

1989-05-01

133

Summarization from medical documents: a survey  

Microsoft Academic Search

Objective: The aim of this paper is to survey the recent work in medical document summarization. Background: During the last decade, document summarization has received increasing attention from the AI research community. More recently it has also attracted the interest of the medical research community, due to the enormous growth of information that is available to the physicians and

Stergos D. Afantenos; Vangelis Karkaletsis; Panagiotis Stamatopoulos

2005-01-01

134

Advances in Video Summarization and Skimming  

Microsoft Academic Search

This chapter summarizes recent advances in video abstraction for fast content browsing, skimming, transmission, and retrieval of massive video databases, which are demanded in many system applications, such as web multimedia, mobile multimedia, interactive TV, and emerging 3D TV. Video summarization and skimming aims to provide an abstract of a long video for shortening the navigation and browsing the original

Richard M. Jiang; Abdul H. Sadka; Danny Crookes

135

Task-focused Summarization of Email  

Microsoft Academic Search

We describe SmartMail, a prototype system for automatically identifying action items (tasks) in email messages. SmartMail presents the user with a task-focused summary of a message. The summary consists of a list of action items extracted from the message. The user can add these action items to their \\

Simon Corston-Oliver; Eric Ringger; Michael Gamon; Richard Campbell

2004-01-01

136

A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method  

PubMed Central

Background A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. To make biomedical information in free text easier to use, document clustering and text summarization are used together as a solution to the text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature. Results Our extensive experimental results show that the approach achieves a 45% improvement in cluster quality and a 72% improvement in clustering reliability, in terms of misclassification index, over Bisecting K-means as a leading document clustering approach. In addition, our approach provides a concise but rich text summary in key concepts and sentences. Conclusion Our coherent biomedical literature clustering and summarization approach, which takes advantage of ontology-enriched graphical representations, significantly improves the quality of document clusters and the understandability of documents through summaries.

Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol

2007-01-01

137

Summarizing ontology-based schemas in PDMS  

Microsoft Academic Search

Quickly understanding the content of a data source is very useful in several contexts. In a Peer Data Management System (PDMS), peers can be semantically clustered, each cluster being represented by a schema obtained by merging the local schemas of the peers in this cluster. In this paper, we present a process for summarizing schemas of peers participating in a

Carlos Eduardo S. Pires; Paulo Sousa; Zoubida Kedad; Ana Carolina Salgado

2010-01-01

138

Summarizing popular music via structural similarity analysis  

Microsoft Academic Search

We present a framework for summarizing digital media based on structural analysis. Though these methods are applicable to general media, we concentrate here on characterizing the repetitive structure in popular music. In the first step, a similarity matrix is calculated from interframe spectral similarity. Segment boundaries, such as verse-chorus transitions, are found by correlating a kernel along the diagonal of

Matthew Cooper; Jonathan Foote

2003-01-01

139

Regression-Based Summarization of Email Conversations  

Microsoft Academic Search

In this paper we present a regression-based machine learning approach to email thread summarization. The regression model is able to take advantage of multiple gold-standard annotations for training purposes, in contrast to most work with binary classifiers. We also investigate the usefulness of novel features such as speech acts. This paper also introduces a newly created and publicly available

Jan Ulrich; Giuseppe Carenini; Gabriel Murray; Raymond T. Ng

2009-01-01

140

REGIONAL AIR POLLUTION STUDY, EMISSION INVENTORY SUMMARIZATION  

EPA Science Inventory

As part of the Regional Air Pollution Study (RAPS), data for an air pollution emission inventory are summarized for point and area sources in the St. Louis Air Quality Control Region. Data for point sources were collected for criteria and noncriteria pollutants, hydrocarbons, sul...

141

Adaptive detection of missed text areas in OCR outputs: application to the automatic assessment of OCR quality in mass digitization projects  

NASA Astrophysics Data System (ADS)

The French National Library (BnF*) has launched many mass digitization projects in order to give access to its collection. Digital documents on Gallica (the digital library of the BnF) are indexed through their textual content, which is obtained from service providers that use Optical Character Recognition (OCR) software. OCR software has become increasingly complex, composed of several subsystems dedicated to the analysis and the recognition of the elements in a page. However, the reliability of these systems remains an issue. Indeed, in some cases, errors in OCR outputs arise from an accumulation of several errors at different levels in the OCR process. One frequent error in OCR outputs is missed text components. The presence of such errors may lead to severe defects in digital libraries. In this paper, we investigate the detection of missed text components to control the OCR results from the collections of the French National Library. Our verification approach uses local information inside the pages based on Radon transform descriptors and Local Binary Patterns descriptors (LBP) coupled with OCR results to control their consistency. The experimental results show that our method detects 84.15% of the missed textual components, by comparing the OCR ALTO file outputs (produced by the service providers) to the images of the document.

Ben Salah, Ahmed; Ragot, Nicolas; Paquet, Thierry

2013-01-01

142

Topic-based web site summarization  

Microsoft Academic Search

Purpose – Summarization of an entire web site with diverse content may lead to a summary heavily biased towards the site's dominant topics. The purpose of this paper is to present a novel topic-based framework to address this problem. Design/methodology/approach – A two-stage framework is proposed. The first stage identifies the main topics covered in a web site via clustering

Yongzheng Zhang; Evangelos E. Milios; A. Nur Zincir-Heywood

2010-01-01

143

Exploiting duality in summarization with deterministic guarantees  

Microsoft Academic Search

Summarization is an important task in data mining. A major challenge over the past years has been the efficient construction of fixed-space synopses that provide a deterministic quality guarantee, often expressed in terms of a maximum-error metric. Histograms and several hierarchical techniques have been proposed for this problem. However, their time and/or space complexities remain impractically high

Panagiotis Karras; Dimitris Sacharidis; Nikos Mamoulis

2007-01-01

144

Efficient summarization of stereoscopic video sequences  

Microsoft Academic Search

An efficient technique for summarization of stereoscopic video sequences is presented, which extracts a small but meaningful set of video frames using a content-based sampling algorithm. The proposed video-content representation provides the capability of browsing digital stereoscopic video sequences and performing more efficient content-based queries and indexing. Each stereoscopic video sequence is first partitioned into shots by applying a shot-cut

Nikolaos D. Doulamis; Anastasios D. Doulamis; Yannis S. Avrithis; Klimis S. Ntalianis; Stefanos D. Kollias

2000-01-01

145

Summarization and learning-based approaches to information distillation  

Microsoft Academic Search

Information distillation is the task that aims to extract relevant passages of text from massive volumes of textual and audio sources, given a query. In this paper, we investigate two perspectives that use shallow language processing for answering open-ended distillation queries, such as “List me facts about [event]”. The first approach is a summarization-based approach that uses the unsupervised maximum

Boriska Toth; Dilek Hakkani-Tür; Sibel Yaman

2010-01-01

146

Comparing Twitter Summarization Algorithms for Multiple Post Summaries  

Microsoft Academic Search

Due to the sheer volume of text generated by a microblog site like Twitter, it is often difficult to fully understand what is being said about various topics. In an attempt to understand microblogs better, this paper compares algorithms for extractive summarization of microblog posts. We present two algorithms that produce summaries by selecting several posts from

David Inouye; Jugal K. Kalita

2011-01-01

147

Experiments in Single and Multi-Document Summarization Using MEAD  

Microsoft Academic Search

In this paper, we describe four experiments in text summarization. The first experiment involves the automatic creation of 120 multi-document summaries and 308 single-document summaries from a set of 30 clusters of related documents. We present official results from a multi-site manual evaluation of the quality of the summaries. The second experiment is about the identification by human

Dragomir R. Radev; Sasha Blair-Goldensohn; Zhu Zhang

2001-01-01

148

Finding text in images  

Microsoft Academic Search

There are many applications in which the automatic detection and recognition of text embedded in images is useful. These applications include multimedia systems, digital libraries, and Geographical Information Systems. When machine generated text is printed against clean backgrounds, it can be converted to a computer readable form (ASCII) using current Optical Character Recognition (OCR) technology. However, text is often printed against shaded or textured

Victor Wu; R. Manmatha; Edward M. Riseman

1997-01-01

149

Video summarization for energy efficient wireless streaming  

NASA Astrophysics Data System (ADS)

With the proliferation of camera-equipped cell phones and the deployment of the higher data rate 2.5G and 3G infrastructure systems, providing consumers with video-equipped cellular communication infrastructure is highly desirable, and can drive the development of a large number of valuable applications. However, for an uplink wireless channel, both the bandwidth and battery energy in a mobile phone are limited for video communications. In this paper, we pursue an energy efficient video communication solution through joint video summarization and transmission adaptation over a slow fading wireless channel. Coding and modulation schemes and packet transmission strategy are optimized and adapted to the unique packet arrival and delay characteristics of the video summaries. In addition to the optimal solution, we also propose a heuristic solution that is greedy but has close to optimal performance. Operational energy efficiency-summary distortion performance is characterized under an optimal summarization setting. Simulation results show the advantage of the proposed scheme with respect to energy efficiency and video transmission quality.

Li, Zhu; Zhai, Fan; Katsaggelos, Aggelos K.

2005-07-01

150

Automatic Text Searching For Personal Photos  

Microsoft Academic Search

This demonstration presents the MediAssist prototype system for organisation of personal digital photo collections based on contextual information, such as time and location of image capture, and content-based analysis, such as face detection and recognition. This metadata is used directly for identification of photos which match specified attributes, and also to create text surrogates for photos, allowing for

Neil O'hare; Hyowon Lee; Saman Cooray; Cathal Gurrin; Gareth J. F. Jones; Jovanka Malobabic; Noel E. O'connor; Alan F. Smeaton; Bartlomiej Uscilowski

2006-01-01

151

Automatic Ontology Extraction from Unstructured Texts  

Microsoft Academic Search

Construction of the ontology of a specific domain currently relies on the intuition of a knowledge engineer, and the typical output is a thesaurus of terms, each of which is expected to denote a concept. Ontological ‘engineers’ tend to hand-craft these thesauri on an ad-hoc basis and on a relatively small scale. Workers in the specific domain create their own special

Khurshid Ahmad; Lee Gillam

2005-01-01

152

Video summarization and semantics editing tools  

NASA Astrophysics Data System (ADS)

This paper describes a video summarization and semantics editing tool that is suited for content-based video indexing and retrieval with appropriate human operator assistance. The whole system has been designed with a clear focus on the extraction and exploitation of motion information inherent in the dynamic video scene. The dominant motion information has been used explicitly for shot boundary detection, camera motion characterization, visual content variation description, and key frame extraction. Various contributions have been made to ensure that the system works robustly with complex scenes and across different media types. A window-based graphical user interface has been designed to ease interactive analysis and editing of semantic events and episodes where appropriate.

Xu, Li-Qun; Zhu, Jian; Stentiford, Fred

2000-12-01

153

Summarizing cellular responses as biological process networks  

PubMed Central

Background Microarray experiments can simultaneously identify thousands of genes that show significant perturbation in expression between two experimental conditions. Response networks, computed through the integration of gene interaction networks with expression perturbation data, may themselves contain tens of thousands of interactions. Gene set enrichment has become standard for summarizing the results of these analyses in terms of functionally coherent collections of genes such as biological processes. However, even these methods can yield hundreds of enriched functions that may overlap considerably. Results We describe a new technique called Markov chain Monte Carlo Biological Process Networks (MCMC-BPN) capable of reporting a highly non-redundant set of links between processes that describe the molecular interactions that are perturbed under a specific biological context. Each link in the BPN represents the perturbed interactions that serve as the interfaces between the two processes connected by the link. We apply MCMC-BPN to publicly available liver-related datasets to demonstrate that the networks formed by the most probable inter-process links reported by MCMC-BPN show high relevance to each biological condition. We show MCMC-BPN’s ability to discern the few key links in a very large solution space by comparing its results with those from two other methods for detecting inter-process links. Conclusions MCMC-BPN is successful in using few inter-process links to explain as many of the perturbed gene-gene interactions as possible. Thereby, BPNs summarize the important biological trends within a response network by reporting a digestible number of inter-process links that can be explored in greater detail.

2013-01-01

154

Using Synchronic and Diachronic Relations for Summarizing Multiple Documents Describing Evolving Events  

Microsoft Academic Search

In this paper we present a fresh look at the problem of summarizing evolving events from multiple sources. After a discussion concerning the nature of evolving events we introduce a distinction between linearly and non-linearly evolving events. We present then a general methodology for the automatic creation of summaries from evolving events. At its heart lie the notions of Synchronic

Stergos D. Afantenos; Vangelis Karkaletsis; Panagiotis Stamatopoulos; Constantin Halatsis

2007-01-01

155

Blind Summarization: Content-Adaptive Video Summarization using Time-Series Analysis  

Microsoft Academic Search

Severe complexity constraints on consumer electronic devices motivate us to investigate general-purpose video summarization techniques that are able to apply a common hardware setup to multiple content genres. On the other hand, we know that high quality summaries can only be produced with domain-specific processing. In this paper, we present a time-series analysis based video summarization technique that provides a

Ajay Divakaran; Regunathan Radhakrishnan; A. Peker

156

Text Categorisation of Racist Texts Using a Support Vector Machine  

Microsoft Academic Search

The automatic processing of text is a major challenge because of the increasing availability of textual information and the need to organise and manage such information effectively and efficiently. Automatic Text Categorisation is one of a number of functions we would like to have available to us and involves the assignment of one or more predefined categories to

Edel P. Greevy; Alan F. Smeaton

2004-01-01

157

The Relations among Summarizing Instruction, Support for Student Choice, Reading Engagement and Expository Text Comprehension  

ERIC Educational Resources Information Center

Research on early adolescence reveals significant declines in intrinsic motivation for reading and points out the need for metacognitive strategy use among middle school students. Research indicates that explicit instruction involving motivation and metacognitive support for reading strategy use in the context of a discipline is an efficient and…

Littlefield, Amy Root

2011-01-01

158

Acquiring Disambiguation Rules from Text  

Microsoft Academic Search

An effective procedure for automatically acquiring a new set of disambiguation rules for an existing deterministic parser on the basis of tagged text is presented. Performance of the automatically acquired rules is much better than the existing hand-written disambiguation rules. The success of the acquired rules depends on using the linguistic information encoded in the parser; enhancements to various components

Donald Hindle

1989-01-01

159

SimSum: An Empirically Founded Simulation of Summarizing  

Microsoft Academic Search

SimSum (Simulation of Summarizing) simulates 20 real-world working steps of expert summarizers. It presents an empirically founded cognitive model of summarizing and demonstrates that human summarization strategies can be simulated. The cognitive model operationalizes the discourse processing model developed by Kintsch and van Dijk (1983). Knowledge engineering followed the KADS approach, empirical modeling used methods of grounded theory development. The

Brigitte Endres

160

Automatic Imitation  

ERIC Educational Resources Information Center

"Automatic imitation" is a type of stimulus-response compatibility effect in which the topographical features of task-irrelevant action stimuli facilitate similar, and interfere with dissimilar, responses. This article reviews behavioral, neurophysiological, and neuroimaging research on automatic imitation, asking in what sense it is "automatic"…

Heyes, Cecilia

2011-01-01

161

Using synchronic and diachronic relations for summarizing multiple documents describing evolving events  

Microsoft Academic Search

In this paper we present a fresh look at the problem of summarizing evolving events from multiple sources. After a discussion concerning the nature of evolving events we introduce a distinction between linearly and non-linearly evolving events. We present then a general methodology for the automatic creation of summaries from evolving events. At its heart lie the notions of Synchronic

Stergos D. Afantenos; Vangelis Karkaletsis; Panagiotis Stamatopoulos; Constantin Halatsis

2008-01-01

162

Text Structure  

NSDL National Science Digital Library

This web page defines and describes text structure, or how the information within a written text is organized. It explains the benefits of teaching students to identify and analyze text structures within text and describes an instructional sequence in which students read examples of different text structures and then write paragraphs that follow a specific text structure. The site includes definitions and examples of five common text structures, and graphic organizers that can be used with each type of text. Links to additional resources and research citations are included.

2012-01-01

163

Customization in a unified framework for summarizing medical literature  

Microsoft Academic Search

Objectives: We present the summarization system in the PERSIVAL medical digital library. Although we discuss the context of our summarization research within the PERSIVAL platform, the primary focus of this article is on strategies to define and generate customized summaries. Methods and Material: Our summarizer employs a unified user model to create a tailored summary of relevant documents for either

Noemie Elhadad; Min-yen Kan; Judith L. Klavans; Kathleen Mckeown

2005-01-01

164

Text Mining  

Microsoft Academic Search

Abstract: This article focuses on how text mining works and on its potential applications. Text mining proceeds as a multi-stage process, whose individual steps are briefly presented. The focus here is on data preparation, in which natural language processing techniques are used to extract terms from the underlying texts.

Hajo Hippner; René Rentzmann

2006-01-01

165

Text Sets.  

ERIC Educational Resources Information Center

Presents annotations of approximately 30 titles grouped in text sets. Defines a text set as five to ten books on a particular topic or theme. Discusses books on the following topics: living creatures; pirates; physical appearance; natural disasters; and the Irish potato famine. (SG)

Giorgis, Cyndi; Johnson, Nancy J.

2002-01-01

166

Automatic Multimodal Cognitive Load Measurement (AMCLM).  

National Technical Information Service (NTIS)

This report summarizes the research activities, results of the user studies, and research accomplishments out of the AMCLM project in the past year. We investigated the validity of using speech formants and their fusion to measure cognitive load automatic...

F. Chen

2011-01-01

167

MiTAP, Text and Audio Processing for BioSecurity: A Case Study  

Microsoft Academic Search

MiTAP (MITRE Text and Audio Processing) is a prototype system available for monitoring infectious disease outbreaks and other global events. MiTAP focuses on providing timely, multi-lingual, global information access to medical experts and individuals involved in humanitarian assistance and relief work. Multiple information sources in multiple languages are automatically captured, filtered, translated, summarized, and categorized by disease, region, information source,

Laurie E. Damianos; Jay M. Ponte; Steve Wohlever; Florence Reeder; David Day; D. George Wilson; Lynette Hirschman

2002-01-01

168

Automatic Estimation Techniques are Useful?  

Microsoft Academic Search

Best practices for software effort estimation can include the use of automatic techniques to summarize past data. There exists a large and growing number of techniques. Which are useful? In this study, 158 techniques were applied to some COCOMO data. 154/158 = 97% of the variants explored below add little or nothing to a standard linear model (with simple

Tim Menzies; Omid Jalali; Jairus Hihn; Dan Baker; Karen Lum

169

A bottom-up approach to sentence ordering for multi-document summarization  

Microsoft Academic Search

Ordering information is a difficult but important task for applications generating natural language texts such as multi-document summarization, question answering, and concept-to-text generation. In multi-document summarization, information is selected from a set of source documents. However, improper ordering of information in a summary can confuse the reader and deteriorate the readability of the summary. Therefore, it is vital to properly

Danushka Bollegala; Naoaki Okazaki; Mitsuru Ishizuka

2010-01-01

170

A Risk-Aware Modeling Framework for Speech Summarization  

Microsoft Academic Search

Extractive speech summarization attempts to select a representative set of sentences from a spoken document so as to succinctly describe the main theme of the original document. In this paper, we adapt the notion of risk minimization for extractive speech summarization by formulating the selection of summary sentences as a decision-making problem. To this end, we develop

Berlin Chen; Shih-Hsiang Lin

2012-01-01

171

Multiscale Histograms: Summarizing Topological Relations in Large Spatial Datasets  

Microsoft Academic Search

Summarizing topological relations is fundamental to many spatial applications including spatial query optimization. In this paper, we present several novel techniques to effectively construct cell density based spatial histograms for range (window) summarizations restricted to the four most important topological relations: contains, contained, overlap, and disjoint. We first present a novel framework to construct a

Xuemin Lin; Qing Liu; Yidong Yuan; Xiaofang Zhou

2003-01-01

172

Text Segmentation Using Exponential Models  

Microsoft Academic Search

This paper introduces a new statistical approach to partitioning text automatically into coherent segments. Our approach enlists both short-range and long-range language models to help it sniff out likely sites of topic changes in text. To aid its search, the system consults a set of simple lexical hints it has learned to associate with the presence of

Doug Beeferman; Adam L. Berger; John D. Lafferty

1997-01-01

173

Automatic imitation.  

PubMed

"Automatic imitation" is a type of stimulus-response compatibility effect in which the topographical features of task-irrelevant action stimuli facilitate similar, and interfere with dissimilar, responses. This article reviews behavioral, neurophysiological, and neuroimaging research on automatic imitation, asking in what sense it is "automatic" and whether it is "imitation." This body of research reveals that automatic imitation is a covert form of imitation, distinct from spatial compatibility. It also indicates that, although automatic imitation is subject to input modulation by attentional processes, and output modulation by inhibitory processes, it is mediated by learned, long-term sensorimotor associations that cannot be altered directly by intentional processes. Automatic imitation provides an important tool for the investigation of the mirror neuron system, motor mimicry, and complex forms of imitation. It is a new behavioral phenomenon, comparable with the Stroop and Simon effects, providing strong evidence that even healthy adult humans are prone, in an unwilled and unreasoned way, to copy the actions of others. PMID:21280938

Heyes, Cecilia

2011-05-01

174

A Stretch Computer Program to Summarize Hero Test Data.  

National Technical Information Service (NTIS)

A detailed description is given for an IBM 7030 computer program designed to summarize HERO test results. A flow chart of the program, FORTRAN IV listing and samples of input and output configurations are given. (Author)

R. W. Royal

1967-01-01

175

Movie Rating and Review Summarization in Mobile Environment  

Microsoft Academic Search

In this paper, we design and develop a movie-rating and review-summarization system in a mobile environment. The movie-rating information is based on the sentiment-classification result. The condensed descriptions of movie reviews are generated from the feature-based summarization. We propose a novel approach based on latent semantic analysis (LSA) to identify product features. Furthermore, we find a way to reduce the

Chien-Liang Liu; Wen-Hoar Hsaio; Chia-Hoang Lee; Gen-Chi Lu; Emery Jou

2012-01-01

176

Summarizing local context to personalize global web search  

Microsoft Academic Search

The PC Desktop is a very rich repository of personal information, efficiently capturing user's interests. In this paper we propose a new approach towards an automatic personalization of web search in which the user specific information is extracted from such local desktops, thus allowing for an increased quality of user profiling, while sharing less private information with the search engine.

Paul-alexandru Chirita; Claudiu S. Firan; Wolfgang Nejdl

2006-01-01

177

Automatic Dilatometer.  

National Technical Information Service (NTIS)

A new and improved form of dilatometer for automatically measuring and recording small changes in dimensions over a wide temperature range is described. The normal working range is 77-1400 degrees K, and special measures are incorporated to prevent damage...

L. N. Larikov; M. E. Gurevich

1971-01-01

178

Automatic transmission  

Microsoft Academic Search

1. An automatic transmission with four forward speeds and one reverse position, is described which consists of: an input shaft; an output member; first and second planetary gear sets each having a sun gear, a ring gear and a carrier supporting a pinion in mesh with the sun gear and ring gear; the carrier of the first gear set, the

M. Miura; T. Inuzuka

1986-01-01

179

Multilingual Summarization in Practice: The Case of Patent Claims  

Microsoft Academic Search

Hardly any other type of textual material is as difficult to read and comprehend as patents. Especially the claims in a patent reveal very complex syntactic constructions which are difficult to process even for native speakers, let alone for foreigners who do not master well the language in which the patent is written. Therefore, multilingual summarization is very attractive to

Simon Mille; Leo Wanner

2008-01-01

180

A fuzzy ontology and its application to news summarization.  

PubMed

In this paper, a fuzzy ontology and its application to news summarization are presented. The fuzzy ontology with fuzzy concepts is an extension of the domain ontology with crisp concepts. It is more suitable to describe the domain knowledge than domain ontology for solving the uncertainty reasoning problems. First, the domain ontology with various events of news is predefined by domain experts. The document preprocessing mechanism will generate the meaningful terms based on the news corpus and the Chinese news dictionary defined by the domain expert. Then, the meaningful terms will be classified according to the events of the news by the term classifier. The fuzzy inference mechanism will generate the membership degrees for each fuzzy concept of the fuzzy ontology. Every fuzzy concept has a set of membership degrees associated with various events of the domain ontology. In addition, a news agent based on the fuzzy ontology is also developed for news summarization. The news agent contains five modules, including a retrieval agent, a document preprocessing mechanism, a sentence path extractor, a sentence generator, and a sentence filter to perform news summarization. Furthermore, we construct an experimental website to test the proposed approach. The experimental results show that the news agent based on the fuzzy ontology can effectively operate for news summarization. PMID:16240764

Lee, Chang-Shing; Jian, Zhi-Wei; Huang, Lin-Kai

2005-10-01

181

Human subject-based video browsing and summarization  

Microsoft Academic Search

Acquiring digital video is much easier than before, since videos can be captured with DV camcorders. Larger video archives make searching for the targeted content more difficult. In the past decade, efficient video indexing, browsing and summarization techniques have thus become an important research issue in the field of content-based video retrieval. In this work, a novel mechanism of

Duan-Yu Chen; Kuei-Cheng Chu; Yu-Chien Liu; Yung-Sheng Chen

2010-01-01

182

A framework for summarizing and analyzing twitter feeds  

Microsoft Academic Search

The firehose of data generated by users on social networking and microblogging sites such as Facebook and Twitter is enormous. Real-time analytics on such data is challenging with most current efforts largely focusing on the efficient querying and retrieval of data produced recently. In this paper, we present a dynamic pattern driven approach to summarize data produced by Twitter feeds.

Xintian Yang; Amol Ghoting; Yiye Ruan; Srinivasan Parthasarathy

2012-01-01

183

Upper-Intermediate-Level ESL Students' Summarizing in English  

ERIC Educational Resources Information Center

|This qualitative instrumental case study explores various factors that might influence upper-intermediate-level English as a second language (ESL) students' summarizing from a sociocultural perspective. The study was conducted in a formal classroom setting, during a reading and writing class in the English Language Institute at a university in…

Vorobel, Oksana; Kim, Deoksoon

2011-01-01

184

Investigation of Learners' Perceptions for Video Summarization and Recommendation  

ERIC Educational Resources Information Center

|Recently, multimedia-based learning is widespread in educational settings. A number of studies investigate how to develop effective techniques to manage a huge volume of video sources, such as summarization and recommendation. However, few studies examine how these techniques affect learners' perceptions in multimedia learning systems. This…

Yang, Jie Chi; Chen, Sherry Y.

2012-01-01

185

Mendelian randomization analysis with multiple genetic variants using summarized data.  

PubMed

Genome-wide association studies, which typically report regression coefficients summarizing the associations of many genetic variants with various traits, are potentially a powerful source of data for Mendelian randomization investigations. We demonstrate how such coefficients from multiple variants can be combined in a Mendelian randomization analysis to estimate the causal effect of a risk factor on an outcome. The bias and efficiency of estimates based on summarized data are compared to those based on individual-level data in simulation studies. We investigate the impact of gene-gene interactions, linkage disequilibrium, and 'weak instruments' on these estimates. Both an inverse-variance weighted average of variant-specific associations and a likelihood-based approach for summarized data give similar estimates and precision to the two-stage least squares method for individual-level data, even when there are gene-gene interactions. However, these summarized data methods overstate precision when variants are in linkage disequilibrium. If the P-value in a linear regression of the risk factor for each variant is less than 1×10⁻⁵, then weak instrument bias will be small. We use these methods to estimate the causal association of low-density lipoprotein cholesterol (LDL-C) on coronary artery disease using published data on five genetic variants. A 30% reduction in LDL-C is estimated to reduce coronary artery disease risk by 67% (95% CI: 54% to 76%). We conclude that Mendelian randomization investigations using summarized data from uncorrelated variants are similarly efficient to those using individual-level data, although the necessary assumptions cannot be so fully assessed. PMID:24114802

Burgess, Stephen; Butterworth, Adam; Thompson, Simon G

2013-09-20
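
The inverse-variance weighted (IVW) average named in the record above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `ivw_estimate` and the sample numbers are hypothetical, and the sketch assumes uncorrelated variants and treats the variant-risk-factor associations as known without error, which is where the abstract warns that precision is overstated under linkage disequilibrium.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted causal estimate from summarized data.

    beta_x: per-variant associations with the risk factor
    beta_y: per-variant associations with the outcome
    se_y:   standard errors of the outcome associations
    Assumes uncorrelated variants (no linkage disequilibrium).
    """
    if not beta_x or not (len(beta_x) == len(beta_y) == len(se_y)):
        raise ValueError("inputs must be non-empty and of equal length")
    # Weighted regression of beta_y on beta_x through the origin,
    # with weights 1 / se_y^2.
    numerator = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    denominator = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


if __name__ == "__main__":
    # Hypothetical summary statistics for three variants (illustration only).
    est, se = ivw_estimate([0.10, 0.20, 0.15], [0.05, 0.11, 0.07], [0.02, 0.03, 0.02])
    print(f"IVW estimate: {est:.3f} (SE {se:.3f})")
```

Each variant contributes a ratio estimate `beta_y / beta_x`; the IVW form weights those ratios by their approximate precision, so variants with strong risk-factor associations and small outcome standard errors dominate.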

186

Building a Tree-Bank of Modern Hebrew Text  

Microsoft Academic Search

This paper describes the process of building the first tree-bank for Modern Hebrew texts. A major concern in this process is the need for reducing the cost of manual annotation by the use of automatic means. To this end, the joint utility of an automatic morphological analyzer, a probabilistic parser and a small manually annotated tree-bank was explored. An

Khalil Sima; Alon Itai; Yoad Winter

187

Automatic Punctuation Generation For Speech  

Microsoft Academic Search

Automatic generation of punctuation is an essential feature for many speech-to-text transcription tasks. This paper describes a maximum a-posteriori (MAP) approach for inserting punctuation marks into raw word sequences obtained from automatic speech recognition (ASR). The system consists of an "acoustic model" (AM) for prosodic features (actually pause duration) and a "language model" (LM) for text-only features. The LM combines

Wenzhu Shen; Peng Yu; Frank Seide

2009-01-01

188

Multi-video summarization based on AV-MMR  

Microsoft Academic Search

This paper presents an algorithm for video summarization, Audio Video Maximal Marginal Relevance (AV-MMR), exploiting both audio and video information. It is an extension of the Video Maximal Marginal Relevance (Video-MMR) algorithm which was only based on visual information. AV-MMR iteratively selects segments which best represent unselected information and are non redundant with previously selected information. As for Video-MMR, AV-MMR

Yingbo Li; Bernard Merialdo

2010-01-01
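
The selection rule summarized in the AV-MMR record above (pick segments that represent what is still unselected while avoiding redundancy with what is already chosen) follows the general Maximal Marginal Relevance pattern. The sketch below is a generic greedy MMR over a precomputed similarity matrix, not the AV-MMR formula itself: the function name `mmr_select`, the trade-off weight `lam`, the use of mean similarity as "relevance", and the toy matrix are all assumptions made for illustration.

```python
def mmr_select(sim, k, lam=0.5):
    """Greedy MMR-style selection of k items from a similarity matrix.

    sim: square list-of-lists, sim[i][j] = similarity of items i and j
    k:   number of items to select
    lam: weight between representativeness and non-redundancy (assumed)
    """
    selected = []
    remaining = list(range(len(sim)))

    def score(i):
        # Representativeness: mean similarity to the other unselected items.
        others = [j for j in remaining if j != i]
        relevance = sum(sim[i][j] for j in others) / len(others) if others else 0.0
        # Redundancy: highest similarity to anything already selected.
        redundancy = max((sim[i][j] for j in selected), default=0.0)
        return lam * relevance - (1.0 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)  # ties go to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy matrix: segments 0 and 1 are near-duplicates, segment 2 differs.
    toy = [
        [1.0, 0.9, 0.1],
        [0.9, 1.0, 0.1],
        [0.1, 0.1, 1.0],
    ]
    print(mmr_select(toy, k=2))
```

In an audio-visual setting the similarity matrix would presumably combine audio and visual feature similarities; how the two are fused is not stated in the truncated abstract, so the sketch leaves `sim` as an input.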

189

Summarizing itemset patterns: a profile-based approach  

Microsoft Academic Search

Frequent-pattern mining has been studied extensively on scalable methods for mining various kinds of patterns including itemsets, sequences, and graphs. However, the bottleneck of frequent-pattern mining is not at the efficiency but at the interpretability, due to the huge number of patterns generated by the mining process. In this paper, we examine how to summarize a collection of itemset patterns using

Xifeng Yan; Hong Cheng; Jiawei Han; Dong Xin

2005-01-01

190

Improving Web Search and Navigation Using Summarization Process  

NASA Astrophysics Data System (ADS)

The paper presents a summarization process that enables a personalized search framework, facilitating user access to and navigation through desired content. The system will express key concepts and relationships describing resources in a formal machine-processable representation. A WordNet-based knowledge representation could be used for content analysis and concept recognition, for reasoning processes and for enabling user-friendly and intelligent content exploration.

Carbonaro, Antonella

191

Video summarization and personalization for pervasive mobile devices  

NASA Astrophysics Data System (ADS)

We have designed and implemented a video semantic summarization system, which includes an MPEG-7 compliant annotation interface, a semantic summarization middleware, a real-time MPEG-1/2 video transcoder on PCs, and an application interface on color/black-and-white Palm-OS PDAs. We designed a video annotation tool, VideoAnn, to annotate semantic labels associated with video shots. Videos are first segmented into shots based on their visual-audio characteristics. They are played back using an interactive interface, which facilitates and speeds up the annotation process. Users can annotate the video content with the units of temporal shots or spatial regions. The annotated results are stored in the MPEG-7 XML format. We also designed and implemented a video transmission system, Universal Tuner, for wireless video streaming. This system transcodes MPEG-1/2 videos or live TV broadcasting videos to the BW or indexed color Palm OS devices. In our system, the complexity of multimedia compression and decompression algorithms is adaptively partitioned between the encoder and decoder. At the client end, users can access the summarized video based on their preferences, time, keywords, as well as the transmission bandwidth and the remaining battery power on the pervasive devices.

Tseng, Belle L.; Lin, Ching-Yung; Smith, John R.

2001-12-01

192

A Qualitative Study on the Use of Summarizing Strategies in Elementary Education  

ERIC Educational Resources Information Center

|The objective of this study is to reveal how well summarizing strategies are used by Grade 4 and Grade 5 students as a reading comprehension strategy. This study was conducted in Buca, Izmir and the document analysis method, a qualitative research strategy, was employed. The study used a text titled "Environmental Pollution" and an "Evaluation…

Susar Kirmizi, Fatma; Akkaya, Nevin

2011-01-01

193

Maximizing text-mining performance  

Microsoft Academic Search

With the advent of centralized data warehouses, where data might be stored as electronic documents or as text fields in databases, text mining has increased in importance and economic value. One important goal in text mining is automatic classification of electronic documents. Computer programs scan text in a document and apply a model that assigns the document to one or

Sholom M. Weiss; Chidanand Apte; Fred J. Damerau; David E. Johnson; Frank J. Oles; Thilo Goetz; Thomas Hampp

1999-01-01

194

Ergito: Virtual Text  

NSDL National Science Digital Library

Ergito's Virtual Text, started in 2000, was created to provide a more timely and interactive alternative to printed scientific textbooks at the undergraduate and graduate level. This still-developing Web site covers life science writ large, including molecular biology, cell biology, genetics, biochemistry, immunology, and so on. However, only a small number of features are available free of charge. The first chapter of the molecular biology module -- Genes are DNA -- is available for free, as is Great Experiments, a collection of essays written by authors who conducted original research that has contributed greatly to our understanding of molecular and cellular biology. Great Experiments has a recently added essay by 2001 Nobel Prize winner Paul Nurse, titled "The Discovery of cdc2 as the Key Regulator of the Cell Cycle." These essays are formatted just as the Virtual Text pages are, with downloadable figures, a glossary, an online note-taking feature (notes are automatically compiled with a summary of the essay), and more. Ergito will soon make available Techniques, another free feature offering descriptions of widely used experimental protocols. Even without free access to the larger body of material in this Web site, Ergito is a fantastic resource for learning about molecular and cellular biology. Users must complete a free registration process to access this Web site.

2000-01-01

195

Automatic Exposure Control in Multidetector-Row Computed Tomography  

Microsoft Academic Search

William Shakespeare might have accidentally explained the premise for development of automatic exposure control (AEC) techniques, although Henry Miller might have summarized the issues related to the heterogeneous nomenclature of these techniques!

Mannudeep K. Kalra

196

Research and exploration of text mining technology  

Microsoft Academic Search

This article introduces the state of research on text mining, analyzes its basic concepts and techniques, and summarizes the text mining process and the commonly used algorithms, including text classification, text clustering, association analysis, and trend prediction. It points out the shortcomings of these algorithms and forecasts future research questions and directions for text mining.

Cao Lijun; Yu Hongkui; Li Yuxiang; Liu Xiyin

2010-01-01

197

Discovering evolutionary theme patterns from text: an exploration of temporal text mining  

Microsoft Academic Search

Temporal Text Mining (TTM) is concerned with discovering temporal patterns in text information collected over time. Since most text information bears some time stamps, TTM has many applications in multiple domains, such as summarizing events in news articles and revealing research trends in scientific literature. In this paper, we study a particular TTM task -- discovering and summarizing the evolutionary

Qiaozhu Mei; ChengXiang Zhai

2005-01-01

198

REVIGO Summarizes and Visualizes Long Lists of Gene Ontology Terms  

PubMed Central

Outcomes of high-throughput biological experiments are typically interpreted by statistical testing for enriched gene functional categories defined by the Gene Ontology (GO). The resulting lists of GO terms may be large and highly redundant, and thus difficult to interpret. REVIGO is a Web server that summarizes long, unintelligible lists of GO terms by finding a representative subset of the terms using a simple clustering algorithm that relies on semantic similarity measures. Furthermore, REVIGO visualizes this non-redundant GO term set in multiple ways to assist in interpretation: multidimensional scaling and graph-based visualizations accurately render the subdivisions and the semantic relationships in the data, while treemaps and tag clouds are also offered as alternative views. REVIGO is freely available at http://revigo.irb.hr/.

Supek, Fran; Bosnjak, Matko; Skunca, Nives; Smuc, Tomislav

2011-01-01

199

Summarizing multiple aspects of model performance in a single diagram  

NASA Astrophysics Data System (ADS)

A diagram has been devised that can provide a concise statistical summary of how well patterns match each other in terms of their correlation, their root-mean-square difference, and the ratio of their variances. Although the form of this diagram is general, it is especially useful in evaluating complex models, such as those used to study geophysical phenomena. Examples are given showing that the diagram can be used to summarize the relative merits of a collection of different models or to track changes in performance of a model as it is modified. Methods are suggested for indicating on these diagrams the statistical significance of apparent differences and the degree to which observational uncertainty and unforced internal variability limit the expected agreement between model-simulated and observed behaviors. The geometric relationship between the statistics plotted on the diagram also provides some guidance for devising skill scores that appropriately weight among the various measures of pattern correspondence.

Taylor, Karl E.

2001-04-01
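
The "geometric relationship between the statistics" mentioned in the record above can be checked numerically. For the centered pattern RMS difference E', the two standard deviations and the correlation R are tied together by a law-of-cosines identity, E'² = σ_m² + σ_r² − 2·σ_m·σ_r·R, which is what lets a single point on the diagram carry all three quantities. The sketch below is an illustration under stated assumptions: the function name `taylor_statistics` is hypothetical, means are removed before computing the RMS difference, and population (divide-by-n) normalization is used throughout, neither of which the abstract specifies.

```python
import math


def taylor_statistics(model, reference):
    """Return (std_model, std_reference, correlation, centered_rms_difference)."""
    n = len(model)
    if n == 0 or n != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_m = sum(model) / n
    mean_r = sum(reference) / n
    dm = [x - mean_m for x in model]      # anomalies of the model field
    dr = [x - mean_r for x in reference]  # anomalies of the reference field
    std_m = math.sqrt(sum(d * d for d in dm) / n)
    std_r = math.sqrt(sum(d * d for d in dr) / n)
    if std_m == 0.0 or std_r == 0.0:
        raise ValueError("correlation is undefined for a constant field")
    corr = sum(a * b for a, b in zip(dm, dr)) / (n * std_m * std_r)
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(dm, dr)) / n)
    return std_m, std_r, corr, crmsd


if __name__ == "__main__":
    # Hypothetical model and observed series (illustration only).
    sm, sr, r, e = taylor_statistics([1.0, 3.0, 2.0, 5.0], [2.0, 2.5, 4.0, 4.5])
    # Both sides of the law-of-cosines identity should agree.
    print(e ** 2, sm ** 2 + sr ** 2 - 2.0 * sm * sr * r)
```

On the diagram, the radial distance from the origin would be the standard deviation, the angle would be arccos(R), and the distance to the reference point would then equal E'.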

200

A State-Of-The-Art Survey on Automatic Indexing.  

ERIC Educational Resources Information Center

This survey covers the literature relating to automatic indexing techniques, services, and applications published during 1969-1973. Works are summarized and described in the areas of: (1) general papers on automatic indexing; (2) KWIC indexes; (3) KWIC variants listed alphabetically by acronym with descriptions; (4) other KWIC variants arranged by…

Liebesny, Felix
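
For readers unfamiliar with the KWIC (keyword-in-context) indexes named in the survey record above, the idea is that every significant word of a title becomes an index entry shown with its surrounding words. The sketch below is a generic illustration, not taken from the survey: the function name `kwic_index` and the small stopword list are assumptions.

```python
DEFAULT_STOPWORDS = frozenset(
    {"a", "an", "and", "for", "in", "of", "on", "the", "to"}
)


def kwic_index(titles, stopwords=DEFAULT_STOPWORDS):
    """Build a keyword-in-context index.

    Returns tuples (sort_key, left_context, keyword, right_context),
    sorted alphabetically by keyword.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:()").lower()
            if not key or key in stopwords:
                continue  # skip punctuation-only tokens and stopwords
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            entries.append((key, left, word, right))
    entries.sort(key=lambda entry: (entry[0], entry[1]))
    return entries


if __name__ == "__main__":
    for key, left, word, right in kwic_index(
        ["A State-Of-The-Art Survey on Automatic Indexing"]
    ):
        # Align keywords in a central column, as printed KWIC indexes do.
        print(f"{left:>40} | {word} {right}")
```

The KWIC variants the survey catalogues presumably differ in how they choose keywords and lay out the context; this sketch shows only the basic rotation.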

201

GeneLibrarian: an effective gene-information summarization and visualization system  

PubMed Central

Background Abundant information about gene products is stored in online searchable databases such as annotation or literature. To efficiently obtain and digest such information, there is a pressing need for automated information-summarization and functional-similarity clustering of genes. Results We have developed a novel method for semantic measurement of annotation and integrated it with a biomedical literature summarization system to establish a platform, GeneLibrarian, to provide users with well-organized information about any specific group of genes (e.g. one cluster of genes from a microarray chip) they might be interested in. The GeneLibrarian generates a summarized viewgraph of candidate genes for a user based on his/her preference and delivers the desired background information effectively to the user. The summarization technique involves optimizing the text mining algorithm and Gene Ontology-based clustering method to enable the discovery of gene relations. Conclusion GeneLibrarian is a Java-based web application that automates the process of retrieving critical information from the literature and expanding the number of potential genes for further analysis. This study concentrates on providing well-organized information to users, and we believe it will be useful in their research. GeneLibrarian is available on

Chiang, Jung-Hsien; Shin, Jyh-Wei; Liu, Heng-Hui; Chin, Chong-Liang

2006-01-01

202

Investigating and Annotating the Role of Citation in Biomedical Full-Text Articles  

PubMed Central

Citations are ubiquitous in scientific articles and play important roles for representing the semantic content of a full-text biomedical article. In this work, we manually examined full-text biomedical articles to analyze the semantic content of citations in full-text biomedical articles. After developing a citation relation schema and annotation guideline, our pilot annotation results show an overall agreement of 0.71, and here we report on the research challenges and the lessons we've learned while trying to overcome them. Our work is a first step toward automatic citation classification in full-text biomedical articles, which may contribute to many text mining tasks, including information retrieval, extraction, summarization, and question answering.

Yu, Hong; Agarwal, Shashank; Frid, Nadya

2010-01-01

203

BioSumm: A novel summarizer oriented to biological information  

Microsoft Academic Search

The availability of increasingly wider repositories of biomedical and biological texts requires effective techniques to manage the huge mass of unstructured information there contained. The availability of ad-hoc document summaries, targeted to specific topics, may assist researchers in inferring previously undisclosed knowledge and in performing the biological validation of the results of data mining analysis. This paper

Elena Baralis; Alessandro Fiori; Lorenzo Montrucchio

2008-01-01

204

The following tables and graphs summarize the findings from ...  

Center for Food Safety and Applied Nutrition (CFSAN)

Text Version... [garbled table residue: PAH test results by type of PAH (FLA, PYR, BaA, ...), trace amounts detected] ... More results from www.fda.gov/downloads/food/foodsafety

205

Effects of Teacher-Directed and Student-Interactive Summarization Instruction on Reading Comprehension and Written Summarization of Korean Fourth Graders  

ERIC Educational Resources Information Center

The purpose of this study was to investigate how Korean fourth graders' performance on reading comprehension and written summarization changes as a function of instruction in summarization across test times. Seventy-five Korean fourth graders from three classes were randomly assigned to the collaborative summarization, direct instruction, and…

Jeong, Jongseong

2009-01-01

206

Clustering cliques for graph-based summarization of the biomedical research literature  

PubMed Central

Background Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts). Results SemRep is used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm based on common arguments shared among cliques. The validity of the clusters in the summaries produced was compared to the Silhouette-generated baseline for cohesion, separation and overall validity. The theme labels were also compared to a reference standard produced with major MeSH headings. Conclusions For 11 topics in the testing data set, the overall validity of clusters from the system summary was 10% better than the baseline (43% versus 33%). Compared to the reference standard from MeSH headings, the results for recall, precision and F-score were 0.64, 0.65, and 0.65 respectively.

2013-01-01

207

Music Genres Classification using Text Categorization Method  

Microsoft Academic Search

Automatic music genre classification is one of the most challenging problems in music information retrieval and management of digital music databases. In this paper, we propose a new framework using text categorization methods to classify music genres. This framework is different from current methods for music genre classification. In our framework, we consider music as a text-like semantic music document, which

Kai Chen; Sheng Gao; Yongwei Zhu; Qibin Sun

2006-01-01

208

A parallel learning algorithm for text classification  

Microsoft Academic Search

Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify text need sufficient labeled documents to learn accurately. Applying the Expectation-Maximization (EM) algorithm to this problem is an alternative approach that utilizes a large pool of unlabeled documents to augment the available labeled documents. Unfortunately, the time needed

Canasai Kruengkrai; Chuleerat Jaruskulchai

2002-01-01

209

Content-based video retrieval and summarization using MPEG-7  

NASA Astrophysics Data System (ADS)

Retrieval in current multimedia databases is usually limited to browsing and searching based on low-level visual features and explicit textual descriptors. Semantic aspects of visual information are mainly described in full text attributes or mapped onto specialized, application specific description schemes. Result lists of queries are commonly represented by textual descriptions and single key frames. This approach is valid for text documents and images, but is often insufficient to represent video content in a meaningful way. In this paper we present a multimedia retrieval framework focusing on video objects, which fully relies on the MPEG-7 standard as information base. It provides a content-based retrieval interface which uses hierarchical content-based video summaries to allow for quick viewing and browsing through search results even on bandwidth limited Web applications. Additionally semantic meaning about video content can be annotated based on domain specific ontologies, enabling a more targeted search for content. Our experiences and results with these techniques will be discussed in this paper.

Bailer, Werner; Mayer, Harald; Neuschmied, Helmut; Haas, Werner; Lux, Mathias; Klieber, Werner

2003-12-01

210

Automatic generation of personalized music sports video  

Microsoft Academic Search

In this paper, we propose a novel automatic approach for personalized music sports video generation. Two research challenges, semantic sports video content selection and automatic video composition, are addressed. For the first challenge, we propose to use multi-modal (audio, video and text) feature analysis and alignment to detect the semantics of events in sports video. For the second challenge, we

Jinjun Wang; Changsheng Xu; Engsiong Chng; Ling-Yu Duan; Kongwah Wan; Qi Tian

2005-01-01

211

Automatic translation method  

US Patent & Trademark Office Database

The present invention relates to an automatic translation method. When a sentence in a source language is translated into a sentence in a target language, the method comprises: a step (1) of extracting the set of sentence portions of the target language from a textual database that correspond to a total or partial translation of the source sentence to be translated; a step (2) of determining all the assemblies of these target sentence portions that overlap the source sentence; a step (3) of choosing the best assemblies according to a criterion of maximum overlap between the target sentence portions assembled in the preceding step and according to a criterion of minimizing the number of assembled elements; a step (4) of determining the target sentence by choosing the best assembly according to coherence criteria. The invention is notably applicable to the translation of texts in a rare language. More generally, it applies to translation with no previously established bilingual texts.

2013-07-16

212

ICTCAS's ICTGrasper at TAC 2008: Summarizing Dynamic Information with Signature Terms Based Content Filtering  

Microsoft Academic Search

This paper presents our new, topic-oriented multi-document summarization system used in TAC 2008. To deal with the problem of summarizing how dynamic information changes over time, we propose a novel summarization method with signature terms based content filtering. We first present the definition of dynamic summarization according to temporal analysis and then propose the fundamental

Jin Zhang; Xueqi Cheng; Hongbo Xu; Xiaolei Wang; Yiling Zeng

2008-01-01

213

Automatic Feature Extraction System.  

National Technical Information Service (NTIS)

The AFES (Automatic Feature Extraction System) is designed to be a testbed for evaluation of semi-automatic and computer-assisted techniques for automated production flow processes. Its intended input sources included National Sensors and LANDSAT imagery,...

J. L. Cambier

1982-01-01

214

Automatic sources of aggression  

Microsoft Academic Search

In this paper, we review research on automaticity with particular relevance to aggression. Once triggered by environmental features, preconscious automatic processes run to completion without any conscious monitoring. The basic experimental technique for studying automatic processes is priming. We review studies showing that priming, including subliminal priming, of mental constructs related to aggression leads to reliable effects on perceptions, judgments,

Alexander Todorov; John A. Bargh

2002-01-01

215

Text documents as social networks  

NASA Astrophysics Data System (ADS)

The extraction of keywords and features is a fundamental problem in text data mining. Document processing applications directly depend on the quality and speed of the identification of salient terms and phrases. Applications as disparate as automatic document classification, information visualization, filtering and security policy enforcement all rely on the quality of automatically extracted keywords. Recently, a novel approach to rapid change detection in data streams and documents has been developed. It is based on ideas from image processing and in particular on the Helmholtz Principle from the Gestalt Theory of human perception. By modeling a document as a one-parameter family of graphs with its sentences or paragraphs defining the vertex set and with edges defined by Helmholtz's principle, we demonstrated that for some range of the parameters, the resulting graph becomes a small-world network. In this article we investigate the natural orientation of edges in such small-world networks. For two connected sentences, we can say which one is the first and which one is the second, according to their position in a document. This makes such a graph look like a small WWW-type network, and PageRank-type algorithms will produce interesting rankings of nodes in such a document.

Balinsky, Helen; Balinsky, Alexander; Simske, Steven J.

2012-02-01
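
The orientation idea in the abstract above (edges directed by sentence position, then ranked with a PageRank-type algorithm) can be illustrated with a small sketch. This is not the authors' method: the Helmholtz-principle edge criterion is not given in the abstract, so the hypothetical `sentence_graph` function below links two sentences whenever they share at least `min_shared` words, and the earlier-to-later edge direction is likewise an assumption. The `pagerank` function is a plain power iteration with uniform handling of dangling nodes.

```python
import re

def sentence_graph(sentences, min_shared=1):
    """Build a directed graph over sentence indices.

    Assumption: shared-word count stands in for the Helmholtz-principle
    criterion, and every edge points from the earlier sentence to the later one.
    """
    words = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    edges = {i: [] for i in range(len(sentences))}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if len(words[i] & words[j]) >= min_shared:
                edges[i].append(j)
    return edges

def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; nodes without out-links spread rank uniformly."""
    n = len(edges)
    rank = {v: 1.0 / n for v in edges}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in edges}
        for v, outs in edges.items():
            targets = outs if outs else list(edges)  # dangling node: all nodes
            share = damping * rank[v] / len(targets)
            for w in targets:
                new[w] += share
        rank = new
    return rank

sentences = [
    "Keywords drive text mining.",
    "Text mining needs fast keywords.",
    "Graphs model documents.",
]
graph = sentence_graph(sentences)
ranks = pagerank(graph)
```

With these three sample sentences only the first two share words, so the single edge runs from sentence 0 to sentence 1 and sentence 1 receives the highest rank.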

216

TEXT MINING FOR PATENT MAP ANALYSIS  

Microsoft Academic Search

Patent documents contain important research results. However, they are lengthy and rich in technical and legal terminology, so analyzing them takes a lot of human effort. Automatic tools for assisting patent analysis are in great demand. This paper describes some methods for patent map analysis based on text mining techniques. We experiment on a real-world patent

Yuen-Hsien Tseng; Yeong-Ming Wang; Dai-Wei Juang

2005-01-01

217

Text mining techniques for patent analysis  

Microsoft Academic Search

Patent documents contain important research results. However, they are lengthy and rich in technical terminology, so analyzing them takes a lot of human effort. Automatic tools for assisting patent engineers or decision makers in patent analysis are in great demand. This paper describes a series of text mining techniques that conforms to the analytical process used by patent

Yuen-hsien Tseng; Chi-jen Lin; Yu-i Lin

2007-01-01

218

Probabilistic Text Categorization using Sparse Topical Encoding  

Microsoft Academic Search

In this paper, we propose a topic-based probabilistic text categorization model, which can be decomposed into two steps. Firstly, we present a sparse non-negative matrix factorization (SNMF) algorithm, which can extract the sparse topical encoding for documents automatically and has an intuitive interpretation. Secondly, we calculate the similarity between documents by integrating the probabilistic similarities of their topics, we can

Xipeng Qiu; Xuanjing Huang; Lide Wu

219

Conceptual graph formalism for financial text representation  

Microsoft Academic Search

We present an approach to automatically transform a financial text into conceptual graph formalism. The approach exploits the constituent structure of sentences and general English grammar rules to perform the transformation. We suggest face validation and traces as the evaluation method to be performed on the resulting formalism to validate its accuracy. We also discuss the potential manipulation and application

Siti Sakira Kamaruddin; Azuraliza Abu Bakar; Abdul Razak Hamdan; Fauzias Mat Nor

2008-01-01

220

Text Categorization Approach For Chat Room Monitoring  

Microsoft Academic Search

The Internet has been utilized in several aspects of real life, such as online searching and chatting. On the other hand, the Internet has been misused in communication of crime-related matters. Monitoring of such communication would aid in crime detection or even crime prevention. This paper presents a text categorization approach for automatic monitoring of chat conversations

Eiman Elnahrawy

221

An Intelligent Information System for Organizing Online Text Documents  

Microsoft Academic Search

This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to

Han-joon Kim; Sang-goo Lee

2004-01-01

222

Classification of summarized videos using hidden markov models on compressed chromaticity signatures  

Microsoft Academic Search

Tools for efficiently summarizing and classifying video sequences are indispensable to assist in the synthesis and analysis of digital video. In this paper, we present a method for effective classification of different types of videos that uses the output of a concise video summarization technique that forms a list of keyframes. The summarization is produced by a method recently presented,

Cheng Lu; Mark S. Drew; James Au

2001-01-01

223

The Text Encoding Initiative  

Microsoft Academic Search

Introduction: why bother with electronic texts? Document Analysis: defining the essentials of your texts. Markup and the Basics of SGML. Introduction to the TEI. TEI Lite. Tagging a Text. Introduction. Why do electronic texts at all? Why bother with markup? The steps of a project: clarify goals; select sample and analyse documents; specify markup policy; encode the texts; enrich, reuse, retarget,

C. M. Sperberg-McQueen

1995-01-01

224

Automatic differentiation bibliography  

SciTech Connect

This is a bibliography of work related to automatic differentiation. Automatic differentiation is a technique for the fast, accurate propagation of derivative values using the chain rule. It is neither symbolic nor numeric. Automatic differentiation is a fundamental tool for scientific computation, with applications in optimization, nonlinear equations, nonlinear least squares approximation, stiff ordinary differential equations, partial differential equations, continuation methods, and sensitivity analysis. This report is an updated version of the bibliography which originally appeared in Automatic Differentiation of Algorithms: Theory, Implementation, and Application.

Corliss, G.F. [comp.]

1992-07-01
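
The phrase "propagation of derivative values using the chain rule" in the abstract above can be made concrete with a minimal forward-mode sketch based on dual numbers. The `Dual` class and the `sin` and `derivative` helpers are illustrative names chosen here, not taken from the bibliography or from any particular automatic differentiation package; only addition, multiplication and sine are covered.

```python
import math

class Dual:
    """A value paired with its derivative with respect to the input variable."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        # Plain numbers are constants, so their derivative is zero.
        return x if isinstance(x, Dual) else Dual(x, 0.0)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def sin(x):
    """Sine with the chain rule applied: (sin u)' = cos(u) * u'."""
    x = Dual.lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)

def derivative(f, x):
    """Evaluate f at x with a seed derivative of 1 and read off df/dx."""
    return f(Dual(x, 1.0)).deriv

poly_slope = derivative(lambda x: 3 * x * x + 2 * x + 1, 2.0)   # 6x + 2 at x = 2
chain_slope = derivative(lambda x: sin(x * x), 1.0)             # 2x cos(x^2) at x = 1
```

Each arithmetic operation carries the exact derivative alongside the value, which is why the result involves neither symbolic expression manipulation nor finite-difference approximation.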

225

Automatic battery charger  

SciTech Connect

An automatic battery charging circuit for use with battery powered vehicles such as golf carts includes an automatically timed charging switch which is connected in parallel with the conventional manually timed charging switch of the battery charger. The automatically timed charging switch includes an electrical clock connected across the power line of the charger. When the charger is plugged into the power line, the clock closes the terminals of the automatically timed charging switch for a brief period of time on a periodic basis. This prevents the batteries of the vehicle from becoming substantially discharged during extended periods of non-use, thereby increasing the life of the batteries.

Schub, L.

1984-06-26

226

Mechatronics hands-on training through the development of an Internet-based automatic control laboratory  

Microsoft Academic Search

Summarizes the authors' experience from mechatronics and automatic control teaching at the undergraduate level. A hands-on remote Internet-based control laboratory project is discussed as a way for training

J. C. Martinez-Garcia; R. Garrido

2001-01-01

227

Automatic Computer Program Generation for Automatic Testing Systems (ATS).  

National Technical Information Service (NTIS)

This is a report of an investigation of automatic computer program generation for Automatic Test Equipment. The feasibility of replacing human programmers by an automatic process is demonstrated through description and design of a language and a processor...

N. S. Prywes

1975-01-01

228

Multi-Lingual Text Generation and the Meaning-Text Theory  

Microsoft Academic Search

We describe multi-lingual text generation as an alternative to automatic translation in specified technical sublanguages, illustrating the notion with the implemented RAREAS-2 system for synthesizing marine weather forecasts in English and French. We then review the Meaning-Text Theory (MTT) of Mel'cuk et al. as we have applied it to text generation in the GOSSIP system for producing English reports about

Richard Kittredge; Lidija Iordanskaja; Alain Polguère

1988-01-01

229

SENT: semantic features in text  

PubMed Central

We present SENT (semantic features in text), a functional interpretation tool based on literature analysis. SENT uses Non-negative Matrix Factorization to identify topics in the scientific articles related to a collection of genes or their products, and uses them to group and summarize these genes. In addition, the application allows users to rank and explore the articles that best relate to the topics found, helping put the analysis results into context. This approach is useful as an exploratory step in the workflow of interpreting and understanding experimental data, shedding some light on the complex underlying biological mechanisms. This tool provides a user-friendly interface via a web site, and programmatic access via a SOAP web server. SENT is freely accessible at http://sent.dacya.ucm.es.

Vazquez, Miguel; Carmona-Saez, Pedro; Nogales-Cadenas, Ruben; Chagoyen, Monica; Tirado, Francisco; Carazo, Jose Maria; Pascual-Montano, Alberto

2009-01-01

230

The Second Text Retrieval Conference (TREC-2) [and] Overview of the Second Text Retrieval Conference (TREC-2) [and] Reflections on TREC [and] Automatic Routing and Retrieval Using Smart: TREC-2 [and] TREC and TIPSTER Experiments with INQUIRY [and] Large Test Collection Experiments on an Operational Interactive System: Okapi at TREC [and] Efficient Retrieval of Partial Documents [and] TREC Routing Experiments with the TRW/Paracel Fast Data Finder [and] CLARIT-TREC Experiments.  

ERIC Educational Resources Information Center

Presents an overview of the second Text Retrieval Conference (TREC-2), an opinion paper about the program, and nine papers by participants that show a range of techniques used in TREC. Topics include traditional text retrieval and information technology, efficiency, the use of language processing techniques, unusual approaches to text retrieval,…

Harman, Donna; And Others

1995-01-01

231

METER: MEasuring TExt Reuse  

Microsoft Academic Search

In this paper we present results from the METER (MEasuring TExt Reuse) project whose aim is to explore issues pertaining to text reuse and derivation, especially in the context of newspapers using newswire sources. Although the reuse of text by journalists has been studied in linguistics, we are not aware of any investigation using existing computational methods for this particular

Paul Clough; Robert Gaizauskas; Scott S. L. Piao; Yorick Wilks

2001-01-01

232

Text-Translation Alignment  

Microsoft Academic Search

We present an algorithm for aligning texts with their translations that is based only on internal evidence. The relaxation process rests on a notion of which word in one text corresponds to which word in the other text that is essentially based on the similarity of their distributions. It exploits a partial alignment of the word level to induce a

Martin Kay; Martin Röscheisen

1993-01-01

233

Texting on the Move  

MedlinePLUS

What's the Big Deal? ... on something other than the road. In fact, driving while texting (DWT) can be more dangerous than driving under ...

234

Automatic storytelling in comics: a case study on World of Warcraft  

Microsoft Academic Search

This paper presents a development of our comic generation system that automatically summarizes players' actions and interactions in the virtual world. The feature of the system is that it analyzes the log and screenshots of a game, decides which events are important and memorable, and then generates comics in a fully automatic manner. Also, the interface of our system allows

Chia-jung Chan; Ruck Thawonmas; Kuan-ta Chen

2009-01-01

235

Automatic detection of replay segments in broadcast sports programs by detection of logos in scene transitions  

Microsoft Academic Search

In broadcast sports, replays provide viewers another look at interesting events. We propose an automatic algorithm for replay segment detection by detecting frames containing logos in the special scene transitions that sandwich replays. Detected replays are utilized in efficient navigation, indexing, and summarization of sports programs. The proposed algorithm first automatically determines the logo template from frames surrounding slow motion

Hao Pan; Baoxin Li; M. Ibrahim Sezan

2002-01-01

236

Extracting Events and Temporal Expressions from Text  

Microsoft Academic Search

Extracting temporal information from raw text is fundamental for deep language understanding, and key to many applications like question answering, information extraction, and document summarization. Our long-term goal is to build complete temporal structure of documents and apply the temporal structure in other applications like textual entailment, question answering, dialog systems or others. In this paper, we present a first

Naushad UzZaman; James F. Allen

2010-01-01

237

Learning Knowledge Bases for Information Extraction from Multiple Text Based Web Sites  

Microsoft Academic Search

This paper describes a learning/adaptive approach to automatically building a knowledge base for information extraction from text-based web pages. A frame-based representation is introduced to represent domain knowledge as knowledge unit frames. A frame learning algorithm is developed to automatically learn knowledge unit frames from training examples. Some training examples can be obtained by automatically parsing a number

Xiaoying Gao; Mengjie Zhang

2003-01-01

238

Automatic segmentation of moving objects for video object plane generation  

Microsoft Academic Search

The new video coding standard MPEG-4 is enabling content-based functionalities. It takes advantage of a prior decomposition of sequences into video object planes (VOPs) so that each VOP represents one moving object. A comprehensive review summarizes some of the most important motion segmentation and VOP generation techniques that have been proposed. Then, a new automatic video sequence segmentation algorithm that

Thomas Meier; King N. Ngan

1998-01-01

239

Lightweight Structured Text Processing  

Microsoft Academic Search

Text is a popular storage and distribution format for information, partly due to generic text-processing tools like Unix grep and sort. Unfortunately, existing generic tools make assumptions about text format (e.g., each line is a record) that limit their applicability. Custom-built tools are one alternative, but they require substantial time investment and programming expertise. We describe a new

Robert C. Miller; Brad A. Myers

1999-01-01

240

TextImages  

NSDL National Science Digital Library

Those persons who do their own website design will find TextImages most useful. Developed by Stefan Trost, this helpful tool allows users to integrate text written on images into their websites. Visitors can create single text images with this application, along with a wide range of pictures. Visitors also have the ability to precisely adjust the writing, design, format, style, colors, fonts, margins, and spacing as they see fit. The tool is particularly useful for those who want headings or other recurring text to look the same regardless of browser or available fonts. This version is compatible with Windows 7, XP, and Vista.

Trost, Stefan

2012-03-30

241

Processing DNA molecules as text  

PubMed Central

Polymerase Chain Reaction (PCR) is the DNA-equivalent of Gutenberg’s movable type printing, both allowing large-scale replication of a piece of text. De novo DNA synthesis is the DNA-equivalent of mechanical typesetting; both ease the setting of text for replication. What is the DNA-equivalent of the word processor? Biology labs engage daily in DNA processing (the creation of variations and combinations of existing DNA) using a plethora of manual labor-intensive methods such as site-directed mutagenesis, error-prone PCR, assembly PCR, overlap extension PCR, cleavage and ligation, homologous recombination, and others. So far no universal method for DNA processing has been proposed and, consequently, no engineering discipline that could eliminate this manual labor has emerged. Here we present a novel operation on DNA molecules, called Y, which joins two DNA fragments into one, and show that it provides a foundation for DNA processing as it can implement all basic text processing operations on DNA molecules including insert, delete, replace, cut and paste, and copy and paste. In addition, complicated DNA processing tasks such as the creation of libraries of DNA variants, chimeras and extensions can be accomplished with DNA processing plans consisting of multiple Y operations, which can be executed automatically under computer control. The resulting DNA processing system, which incorporates our earlier work on recursive DNA composition and error correction, is the first demonstration of a unified approach to DNA synthesis, editing, and library construction. Electronic supplementary material: The online version of this article (doi:10.1007/s11693-010-9059-y) contains supplementary material, which is available to authorized users.

Shabi, Uri; Kaplan, Shai; Linshiz, Gregory; BenYehezkel, Tuval; Buaron, Hen; Mazor, Yair

2010-01-01

242

Interacting with computers by voice: automatic speech recognition and synthesis  

Microsoft Academic Search

This paper examines how people communicate with computers using speech. Automatic speech recognition (ASR) transforms speech into text, while automatic speech synthesis [or text-to-speech (TTS)] performs the reverse task. ASR has been largely developed based on speech coding theory, while simulating certain spectral analyses performed by the ear. Typically, a Fourier transform is employed, but following the auditory Bark scale

D. O'Shaughnessy

2003-01-01

243

Composing Texts, Composing Lives.  

ERIC Educational Resources Information Center

Using composition, reader response, critical, and feminist theories, a teacher demonstrates how adult students respond critically to literary texts and how teachers must critically analyze the texts of their teaching practice. Both students and teachers can use writing to bring their experiences to interpretation. (SK)

Perl, Sondra

1994-01-01

244

The Perfect Text.  

ERIC Educational Resources Information Center

A chemistry teacher describes the elements of the ideal chemistry textbook. The perfect text is focused and helps students draw a coherent whole out of the myriad fragments of information and interpretation. The text would show chemistry as the central science necessary for understanding other sciences and would also root chemistry firmly in the…

Russo, Ruth

1998-01-01

245

Solar Energy Project: Text.  

ERIC Educational Resources Information Center

The text is a compilation of background information which should be useful to teachers wishing to obtain some technical information on solar technology. Twenty sections are included which deal with topics ranging from discussion of the sun's composition to the legal implications of using solar energy. The text is intended to provide useful…

Tullock, Bruce, Ed.; And Others

246

Solar Energy Project: Text.  

ERIC Educational Resources Information Center

The text is a compilation of background information which should be useful to teachers wishing to obtain some technical information on solar technology. Twenty sections are included which deal with topics ranging from discussion of the sun's composition to the legal implications of using solar energy. The text is intended to provide useful…

Tullock, Bruce, Ed.; And Others

247

Symbol ranking text compressors  

Microsoft Academic Search

Summary form only given. In 1951 Shannon estimated the entropy of English text by giving human subjects a sample of text and asking them to guess the next letters. He found, in one example, that 79% of the attempts were correct at the first try, 8% needed two attempts and 3% needed 3 attempts. By regarding the number of attempts

P. Fenwick

1997-01-01
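
Shannon's guessing experiment, as described in the abstract above, replaces each letter by the number of attempts needed to guess it. A rough mechanical analogue is sketched below. It is an illustration only, not Fenwick's compressor: the human guesser is replaced by an assumed order-1 model that keeps, for each preceding symbol, a candidate list ordered by most recent use. The function names `symbol_ranks` and `symbols_from_ranks` are hypothetical.

```python
def symbol_ranks(text, alphabet):
    """Recode text as 1-based ranks: how many guesses the model needs per symbol.

    Assumption: every character of text occurs in alphabet; otherwise
    list.index raises ValueError.
    """
    guesses = {}   # previous symbol -> candidate list, most recently seen first
    prev = ""
    ranks = []
    for ch in text:
        order = guesses.setdefault(prev, list(alphabet))
        position = order.index(ch)
        ranks.append(position + 1)
        order.insert(0, order.pop(position))   # the correct symbol becomes the first guess
        prev = ch
    return ranks

def symbols_from_ranks(ranks, alphabet):
    """Invert symbol_ranks by replaying the same candidate lists."""
    guesses = {}
    prev = ""
    out = []
    for rank in ranks:
        order = guesses.setdefault(prev, list(alphabet))
        ch = order.pop(rank - 1)
        order.insert(0, ch)
        out.append(ch)
        prev = ch
    return "".join(out)

sample = "abababab"
sample_ranks = symbol_ranks(sample, "ab")
restored = symbols_from_ranks(sample_ranks, "ab")
```

On predictable input most ranks are 1, which mirrors the skew toward first-try guesses reported in the abstract; a skewed rank sequence of this kind is what a subsequent entropy coder would exploit.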

248

Symbol Ranking Text Compression  

Microsoft Academic Search

In his work on the information content of English text in 1951, Shannon described a method of recoding the input text, a technique which has apparently lain dormant for the ensuing 45 years. Whereas traditional compressors exploit symbol frequencies and symbol contexts, Shannon's method adds the concept of

Peter Fenwick

1996-01-01

249

Comparison of Self-Questioning, Summarizing, and Notetaking-Review as Strategies for Learning From Lectures  

Microsoft Academic Search

Underprepared college students in three conditions viewed a lecture, took notes, and then engaged in their respective study strategies. Those trained in questioning generated (and answered) their own questions based on the lecture, those trained in summarizing wrote original summaries of the lecture, and those in an untrained control group simply reviewed their lecture notes. At immediate testing, summarizers recalled

Alison King

1992-01-01

250

Automatic Microbial Transfer Device.  

National Technical Information Service (NTIS)

An apparatus is disclosed for automatically transferring a predetermined amount of inoculated culture from a first container into a second container which has a sterile culture. The containers rest on the top of a pivoted support surface, where a horizont...

J. R. Wilkins; S. M. Mills

1973-01-01

251

Automatic Payroll Deposit System.  

ERIC Educational Resources Information Center

The Automatic Payroll Deposit System in Yakima, Washington's Public School District No. 7, directly transmits each employee's salary amount for each pay period to a bank or other financial institution. (Author/MLF)

Davidson, D. B.

1979-01-01

252

Automatic Pedestrian Counter.  

National Technical Information Service (NTIS)

Emerging sensor technologies accelerated the shift toward automatic pedestrian counting methods to acquire reliable long-term data for transportation design, planning, and safety studies. Although a number of commercial pedestrian sensors are available, t...

B. Bartin; H. Yang; K. Ozbay; R. Walla; R. Williams

2010-01-01

253

Draft Label Text - Provenge  

Center for Biologics Evaluation and Research (CBER)

... nausea, and feeling cold. Your doctor may recommend or administer calcium to prevent or treat these side effects. If you have ...

254

Planning Coherent Multisentential Text.  

National Technical Information Service (NTIS)

Though most text generators are capable of simply stringing together more than one sentence, they cannot determine which order will ensure a coherent paragraph. A paragraph is coherent when the information in successive sentences follows some pattern of i...

E. H. Hovy

1988-01-01

255

Text Segmentation by Topic  

Microsoft Academic Search

We investigate the problem of text segmentation by topic. Applications for this task include topic tracking of broadcast speech data and topic identification in full-text databases. Researchers have tackled similar problems before but with different goals. This study focuses on data with relatively small segment sizes and for which within-segment sentences have relatively few words in common, making the problem challenging. We present a method

Jay M. Ponte; W. Bruce Croft

1997-01-01

256

Automatic behaviour: Efficient not mindless  

Microsoft Academic Search

Automaticity is a core construct underpinning theoretical accounts of human performance and cognition. In spite of this, its current conceptualisation is plagued by circularity – automaticity is typically defined in terms of the very behaviour it seeks to explain – and a lack of internal consistency—defining features of automaticity do not reliably co-occur. Furthermore, invoking automaticity tends to be post

L. L. Saling; J. G. Phillips

2007-01-01

257

Text information extraction in images and video: a survey  

Microsoft Academic Search

Text data present in images and video contain useful information for automatic annotation, indexing, and structuring of images. Extraction of this information involves detection, localization, tracking, extraction, enhancement, and recognition of the text from a given image. However, variations of text due to differences in size, style, orientation, and alignment, as well as low image contrast and complex background make

Keechul Jung; Kwang In Kim; Anil K. Jain

2004-01-01

258

Inductive Learning Algorithms and Representations for Text Categorization  

Microsoft Academic Search

Text categorization - the assignment of natural language texts to one or more predefined categories based on their content - is an important component in many information organization and management tasks. We compare the effectiveness of five different automatic learning algorithms for text categorization in terms of learning speed, real-time classification speed, and classification accuracy. We also examine training set

Susan Dumais; John Platt; David Heckerman; Mehran Sahami

1998-01-01

259

Identifying annotations for adventure game generation from fiction text  

Microsoft Academic Search

Recent advancements in Text-to-Scene research have led to the development of systems which automatically extract key concepts from the text of a fiction book and generate computer animated movies depicting the story. Extracting such annotations from raw fiction text is a laborious process and so in this work we evaluate appropriate candidates to serve as the basis for the required

Ross Berkland; Shaun Bangay

2010-01-01

260

Text Retrieval Conference (TREC)  

NSDL National Science Digital Library

The Text REtrieval Conference (TREC) is an annual event that supports "research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies." Proceedings of the conference covering all twelve years of its history are available on the TREC homepage. As TREC has evolved, it has added several focus areas that span new and different topics in information retrieval. These tracks mainly examine methods of searching and filtering different types of data, including genomic records, digital video, and data that is given in multiple languages.

2001-01-01

261

Text Encoding Initiative  

NSDL National Science Digital Library

The Text Encoding Initiative (TEI) is an "international and interdisciplinary standard that helps libraries, museums, publishers, and individual scholars represent all kinds of literary and linguistic texts for online research and teaching." The site offers information about the TEI consortium; recommendations for the encoding of textual material in various languages; TEI Tutorials that provide introductory and advanced teaching materials, presentations, and user case studies; a history archive of TEI publications and working papers; and much more. This site is primarily for individuals who already possess some knowledge of material encoding.

2001-01-01

262

The Effect of Instructional Explanations on Learning from Scientific Texts.  

ERIC Educational Resources Information Center

Explores the influence of offering different instructions to undergraduate students prior to their learning an expository text on evolutionary biology. Participants were asked to either explain, summarize, or listen to another's explanation. Overall, explainers outperformed summarizers. Moreover, the teach-through-explanation condition had the…

Coleman, Elaine B.; Rivkin, Inna D.; Brown, Ann L.

1997-01-01

263

An intelligent information system for organizing online text documents  

Microsoft Academic Search

This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to provide a

Han-joon Kim; Sang-goo Lee

2004-01-01

264

Senses and Texts  

Microsoft Academic Search

This paper addresses the question of whether it is possible to sense-tag systematically, and on a large scale, and how we should assess progress so far. That is to say, how to attach each occurrence of a word in a text to one and only one sense in a dictionary – a particular dictionary of course, and that is part of the problem. The paper

Yorick Wilks

1997-01-01

265

Representation for Narrative Text.  

National Technical Information Service (NTIS)

This report lists and describes the new linguistic information which has been implemented in the TERSE (Text Reduction System) system. The software for this deliverable has been delivered and is present on the Symbolics workstation at NRL. The TERSE system...

1988-01-01

266

Taming the Wild Text  

ERIC Educational Resources Information Center

As a well-known advocate for promoting wider reading and reading engagement among all children--and founder of a reading program for foster children--Pam Allyn knows that struggling readers often face any printed text with fear and confusion, like Max in the book Where the Wild Things Are. She argues that teachers need to actively create a…

Allyn, Pam

2012-01-01

267

Reflections of Older Texts.  

ERIC Educational Resources Information Center

An overseas teaching assignment in 1961 led one educator to visit St. Patrick's Cathedral in Dublin where he came upon an effigy of Richard Whately and realized that Whately had written a text used in many American universities. The educator especially recalled that Whately had said "Encourage your students." He also wrote that the audience…

Reid, Loren

268

Lyell's Geological Texts  

Microsoft Academic Search

RECENTLY, while referring to Charles Lyell's "Elements of Geology", it was found that the Yale Library copy, of date September 12, 1839, had been sent by the publishers to Benjamin Silliman. This was the first American edition from the first London edition as published by Kay Bros., Philadelphia, with 316 pages and 295 figures in the text. After one hundred

G. R. Wieland

1940-01-01

269

STEM Careers Cursive Text  

NSDL National Science Digital Library

This brief video from WPSU compares technologies from yesterday with today. Yesterday a middle school girl writes in cursive while today a girl the same age texts on her cell. The video suggests that science will bring us technologies of tomorrow.

WPSU

2009-11-10

270

Automatic pesticide application in greenhouses  

Microsoft Academic Search

Three automatic pesticide application systems are presented: an automatic thermal vaporimeter; a cold fogger (low volume mist applicator or mechanical aerosol generator); and an automatic air-assisted sprayer for controlled droplet application (CDA). The automatic thermal vaporimeter is thermally regulated to prevent spontaneous ignition of the evaporated pesticide, and is equipped with an automatic quantity-control system. One vaporimeter is capable of

Miriam Austerweil; A. Grinstein

1997-01-01

271

Short text language detection using geographic information  

US Patent & Trademark Office Database

A content-providing entity receives a relatively short text from a user and attempts to determine, automatically, based on that short text (and on other available clues), a language that the user can read and understand. The content-providing entity may then provide, to the user, documents that are written in the determined language. The content-providing entity may determine a language of the input text based on several factors in combination: (a) the service provider's "market," which is determined based on at least a portion of the URL of the Internet site to which the user directed his browser; (b) the user's "region," which is determined based on the source Internet Protocol (IP) address of the IP packets that the user sends to the Internet site; (c) the "script" in which the short user-entered text is written; and (d) a statistical analysis of the frequency of the characters present in the short user-entered text.

Kim; Yookyung (Los Altos, CA); Guo; Shuang (San Jose, CA); Hu; Xian Xiang (San Jose, CA); Li; Xin (Sunnyvale, CA)

2013-10-01

272

Clandestine E-Texts  

NSDL National Science Digital Library

Edited and maintained by Gianluca Mori of the University of Turin-Vercelli, this site currently hosts the full texts (in French) of seventeen French clandestine manuscripts from the early enlightenment. As Mori notes, the treatises share an anti-Christian attitude, but beyond that their philosophical inspiration varies, "leading sometimes either to a deist (Examen de la religion) or to an atheist position (Meslier's Memoire, Freret's Lettre de Thrasybule a Leucippe)." The treatises are offered in HTML format, some with related links. Links are also provided to several texts on other servers and to related resources. Users may register for email notification of updates to the site, which is also available in French and Italian.

273

Metamemory for narrative text  

Microsoft Academic Search

In this experiment, we investigated metamemory for narrative text passages. Subjects read two stories and made memory predictions for the idea units in one and rated the importance of ideas in the other. Half of the subjects were asked to recall the story immediately after reading the passages and half were asked to recall 1 week later; half received passages

Ruth H. Maki; Sharon Swett

1987-01-01

274

Technology of Text Mining  

Microsoft Academic Search

A large amount of information is stored in databases, in intranets or on the Internet. This information is organised in documents or in text documents. The difference depends on whether pictures, tables, figures, and formulas are included or not. The common problem is to find the desired piece of information, a trend, or an undiscovered pattern from these sources.

Ari Visa

2001-01-01

275

A distribution free summarization method for Affymetrix GeneChip® arrays  

Microsoft Academic Search

Motivation: Affymetrix GeneChip arrays require summarization in order to combine the probe-level intensities into one value representing the expression level of a gene. However, probe intensity measurements are expected to be affected by different levels of non-specific- and cross-hybridization to non-specific transcripts. Here, we present a new summarization technique, the Distribution Free Weighted method (DFW), which uses information about the variability in probe behavior to estimate the extent of

Zhongxue Chen; Monnie Mcgee; Qingzhong Liu; Richard H. Scheuermann

2007-01-01

276

Internet Sacred Text Archive  

NSDL National Science Digital Library

The world's philosophical and religious traditions have found a fine home at the Internet Sacred Text Archive, which, as the homepage notes, is "a quiet place in cyberspace devoted to religious tolerance and scholarship." Working together with a number of colleagues and volunteers, JB Hare has compiled this vast archive of sacred and philosophical texts from a number of public-domain sources and placed them on the site. What makes the site so intriguing is that Hare has placed detailed information about the sources and standards that have been deployed for each separate project, which will be of great interest to scholars. While the entire site can be searched, there is much to be learned by looking through the topics listed on the main page, which range from Atlantis to Zoroastrianism. Each separate topic contains a number of accurately transcribed (and sometimes translated) primary and secondary documents, such as first-hand collections of oral traditions. For persons looking for their own copy of the material contained on the site, a CD-ROM is available for purchase as well.

1997-01-01

277

Calibrating Item Families and Summarizing the Results Using Family Expected Response Functions  

ERIC Educational Resources Information Center

Item families, which are groups of related items, are becoming increasingly popular in complex educational assessments. For example, in automatic item generation (AIG) systems, a test may consist of multiple items generated from each of a number of item models. Item calibration or scoring for such an assessment requires fitting models that can…

Sinharay, Sandip; Johnson, Matthew S.; Williamson, David M.

2003-01-01

278

Automatic English Sentence Analysis.  

National Technical Information Service (NTIS)

The report concerns research in natural-language analysis based on correlational grammar. It consists of five sections. Section I summarizes work in several subareas of the project and deals specifically with the system's vocabulary, a CORPUS of English t...

E. von Glasersfeld; P. P. Pisani; B. Notarmarco; B. Dutton

1969-01-01

279

AUTOMATIC COUNTING APPARATUS  

DOEpatents

An apparatus for automatically recording the results of counting operations on trains of electrical pulses is described. The disadvantages of prior devices utilizing the two common methods of obtaining the count rate are overcome by this apparatus; in the case of time controlled operation, the disclosed system automatically records any information stored by the scaler but not transferred to the printer at the end of the predetermined time controlled operations and, in the case of count controlled operation, provision is made to prevent a weak sample from occupying the apparatus for an excessively long period of time.

Howell, W.D.

1957-08-20

280

Physician. A metapaedogogical text.  

PubMed

It has generally been thought that the short treatise Physician was written for the beginning medical student and as such it has been criticized for being so superficial as to be worthless for producing anything but an empty charade of a physician. There are also numerous cruces in the text on which scholars have failed to come to any consensus. This paper argues that by taking the audience of the treatise to be the beginning instructor rather than the beginning student the tone of and information included in the treatise can be seen to be appropriate and the textual cruces can all be explained with little or no amendment by the same hypothesis. PMID:21560569

Dean-Jones, Lesley

2010-01-01

281

Theory and implementation of summarization: Improving sensor interpretation for spacecraft operations  

NASA Astrophysics Data System (ADS)

New paradigms in space missions require radical changes in spacecraft operations. In the past, operations were insulated from competitive pressures of cost, quality and time by system infrastructures, technological limitations and historical precedent. However, modern demands now require that operations meet competitive performance goals. One target for improvement is the telemetry downlink, where significant resources are invested to acquire thousands of measurements for human interpretation. This cost-intensive method is used because conventional operations are not based on formal methodologies but on experiential reasoning and incrementally adapted procedures. Therefore, to improve the telemetry downlink it is first necessary to invent a rational framework for discussing operations. This research explores operations as a feedback control problem, develops the conceptual basis for the use of spacecraft telemetry, and presents a method to improve performance. The method is called summarization, a process to make vehicle data more useful to operators. Summarization enables rational trades for telemetry downlink by defining and quantitatively ranking these elements: all operational decisions, the knowledge needed to inform each decision, and all possible sensor mappings to acquire that knowledge. Summarization methods were implemented for the Sapphire microsatellite; conceptual health management and system models were developed and a degree-of-observability metric was defined. An automated tool was created to generate summarization methods from these models. Methods generated using a Sapphire model were compared against the conventional operations plan. Summarization was shown to identify the key decisions and isolate the most appropriate sensors. Secondly, a form of summarization called beacon monitoring was experimentally verified. Beacon monitoring automates the anomaly detection and notification tasks and migrates these responsibilities to the space segment. 
A set of experiments using Sapphire demonstrated significant cost and time savings compared to conventional operations. Summarization is based on rational concepts for defining and understanding operations. Therefore, it enables additional trade studies that were formerly not possible and also can form the basis for future detailed research into spacecraft operations.

Swartwout, Michael Alden

282

Automatic Construction of User Interfaces from Constraint Multiset Grammars  

Microsoft Academic Search

Describes tools which automatically generate a sophisticated user interface from a constraint multiset grammar specification of a visual language. The user interface allows the user to construct diagrams in the visual language from primitive tokens such as text, lines, rectangles or circles. These tokens are incrementally parsed into sub-diagrams. During parsing, automatic error correction removes geometric errors, providing feedback about

Sitt Sen Chok; Kim Marriott

1995-01-01

283

Teaching Text Comprehension Strategies to Adult Poor Readers.  

ERIC Educational Resources Information Center

Examines the effectiveness of self-questioning and summarization instruction on adult poor readers enrolled in adult education programs. Demonstrates the benefit of teaching text comprehension strategies to adults who are poor readers. (RS)

Rich, Rebecca; Shepherd, Margaret Jo

1993-01-01

284

A Review of Four Text-Formatting Programs.  

ERIC Educational Resources Information Center

The author compares four formatting programs which run under CP/M: Script-80, Text Processing System (TPS), TEX, and Textwriter III. He summarizes his experience with these programs and his detailed report on 154 program characteristics. (Author/SJL)

Press, Larry

1980-01-01

285

A novel feature selection algorithm for text categorization  

Microsoft Academic Search

With the development of the web, large numbers of documents are available on the Internet. Digital libraries, news sources and inner data of companies surge more and more. Automatic text categorization becomes more and more important for dealing with massive data. However the major problem of text categorization is the high dimensionality of the feature space. At present there are

Wenqian Shang; Houkuan Huang; Haibin Zhu; Yongmin Lin; Youli Qu; Zhihai Wang

2007-01-01

286

Word Sense Disambiguation with Specification Marks in Unrestricted Texts  

Microsoft Academic Search

The authors present a method for the automatic disambiguating of nouns in English texts, using the notion of specification marks and employing the noun taxonomy of the WordNet lexical knowledge base (G.A. Miller et al., 1990). The method resolves the lexical ambiguity of nouns in any sort of text, and although it relies on the semantic relations (Hypernymy and Hyponymy)

Andrés Montoyo; Manuel Palomar

2000-01-01

287

Text-mining approaches in molecular biology and biomedicine  

Microsoft Academic Search

Biomedical articles provide functional descriptions of bioentities such as chemical compounds and proteins. To extract relevant information using automatic techniques, text-mining and information-extraction approaches have been developed. These technologies have a key role in integrating biomedical information through analysis of scientific literature. In this article, important applications such as the identification of biologically relevant entities in free text and the

Martin Krallinger; Ramon Alonso-Allende Erhardt; Alfonso Valencia

2005-01-01

288

Automatic multimedia cross-modal correlation discovery  

Microsoft Academic Search

Given an image (or video clip, or audio song), how do we automatically assign keywords to it? The general problem is to find correlations across the media in a collection of multimedia objects like video clips, with colors, and/or motion, and/or audio, and/or text scripts. We propose a novel, graph-based approach,

Jia-Yu Pan; Hyung-Jeong Yang; Christos Faloutsos; Pinar Duygulu

2004-01-01

289

The Automatic Creation of Literature Abstracts  

Microsoft Academic Search

Excerpts of technical papers and magazine articles that serve the purposes of conventional abstracts have been created entirely by automatic means. In the exploratory research described, the complete text of an article in machine-readable form is scanned by an IBM 704 data-processing machine and analyzed in accordance with a standard program. Statistical information derived from word frequency and distribution is

H. P. Luhn

1958-01-01
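
As a rough illustration of the frequency-based idea in the abstract above, the following Python sketch scores each sentence by how many of its words are among the most frequent non-stop words of the text, then returns the top-scoring sentences in their original order. It is a simplified stand-in and not Luhn's actual procedure; the stop-word list, the `top_k` cutoff, and all function names are assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "it", "on"}


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]


def frequency_summary(text, num_sentences=1, top_k=5):
    """Return the sentences that contain the most high-frequency words."""
    sentences = split_sentences(text)
    counts = Counter(w for s in sentences for w in tokenize(s))
    # "Significant" words are the top_k most frequent non-stop words.
    significant = {w for w, _ in counts.most_common(top_k)}
    scored = [
        (sum(w in significant for w in tokenize(s)), i)
        for i, s in enumerate(sentences)
    ]
    # Highest score first; ties are broken by earlier position in the text.
    best = sorted(scored, key=lambda p: (-p[0], p[1]))[:num_sentences]
    # Emit the chosen sentences in document order.
    return [sentences[i] for _, i in sorted(best, key=lambda p: p[1])]
```

Calling `frequency_summary(text, num_sentences=2)` on an article string would return its two highest-scoring sentences; a real system would need a proper stop-word list and sentence splitter.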

290

Automatic Headline Generation for Newspaper Stories  

Microsoft Academic Search

In this paper we propose a novel application of Hidden Markov Models to automatic generation of informative headlines for English texts. We propose four decoding parameters to make the headlines appear more like Headlinese, the language of informative newspaper headlines. We also allow for morphological variation in words between headline and story English. Informal and formal evaluations indicate that our

D. Zajic; B. Dorr; Richard Schwartz

2002-01-01

291

Automatic Discrimination of Emotion from Spoken Finnish  

ERIC Educational Resources Information Center

In this paper, experiments on the automatic discrimination of basic emotions from spoken Finnish are described. For the purpose of the study, a large emotional speech corpus of Finnish was collected; 14 professional actors acted as speakers, and simulated four primary emotions when reading out a semantically neutral text. More than 40 prosodic…

Toivanen, Juhani; Vayrynen, Eero; Seppanen, Tapio

2004-01-01

292

Evaluation of Automatic Generation of Basic Stories  

Microsoft Academic Search

This paper presents an application that automatically generates basic stories: short texts that only narrate the main events of the plot. The system operates with a representation in Description Logics, combining stored fabulas with the narrative knowledge implemented in a domain-specific ontology. The domain of application is the traditional folk tale, using the well-known morphology of Vladimir Propp as

Federico Peinado; Pablo Gervás

2006-01-01

293

Automatic construction of FCMs  

Microsoft Academic Search

In this paper we describe a method for automatically constructing fuzzy cognitive maps based on the user provided data. This method consists of finding the degree of similarity between any two variables (represented by numerical vectors), finding whether the relation between variables is direct or inverse, and with the use of the fuzzy expert system tool (FEST) it determines the

M Schneider; E Shnaider; A Kandel; G Chew

1998-01-01

294

Automatic Dance Lesson Generation  

ERIC Educational Resources Information Center

In this paper, an automatic lesson generation system is presented which is suitable in a learning-by-mimicking scenario where the learning objects can be represented as multiattribute time series data. The dance is used as an example in this paper to illustrate the idea. Given a dance motion sequence as the input, the proposed lesson generation…

Yang, Yang; Leung, H.; Yue, Lihua; Deng, LiQun

2012-01-01

295

Automatic Threshold Circuit.  

National Technical Information Service (NTIS)

An automatic threshold circuit to establish a threshold that is a specified number of db above the input's rms frequency weighted noise value is described. The input is compared with the feedback threshold value, the result of which is coupled to a limite...

J. H. Bumgardner

1976-01-01

296

Automatic sweep circuit  

DOEpatents

An automatically sweeping circuit for searching for an evoked response in an output signal in time with respect to a trigger input. Digital counters are used to activate a detector at precise intervals, and monitoring is repeated for statistical accuracy. If the response is not found then a different time window is examined until the signal is found.

Keefe, Donald J. (Lemont, IL)

1980-01-01

297

An Automatic Overlay Generator  

Microsoft Academic Search

We present an algorithm for automatically generating an overlay structure for a program, with the goal of reducing the primary storage requirements of that program. Subject to the constraints of intermodule dependences, the algorithm can either find a maximal overlay structure or find an overlay structure that, where possible, restricts the program to a specified amount of primary storage. Results

Ron Cytron; Paul G. Loewner

1986-01-01

298

Automatic domotic device interoperation  

Microsoft Academic Search

Current domotic systems manufacturers develop their systems nearly in isolation, responding to different marketing policies and to different technological choices. While there are many available approaches to enable interoperation with domotic systems as a whole, few solutions tackle interoperation between single domotic devices belonging to different technology networks. This paper introduces an automatic device-to-device interoperation solution exploiting ontology-based semantic

Dario Bonino; Emiliano Castellina; Fulvio Corno

2009-01-01

299

Reactor component automatic grapple  

DOEpatents

A grapple for handling nuclear reactor components in a medium such as liquid sodium which, upon proper seating and alignment of the grapple with the component as sensed by a mechanical logic integral to the grapple, automatically seizes the component. The mechanical logic system also precludes seizure in the absence of proper seating and alignment.

Greenaway, Paul R. (Bethel Park, PA)

1982-01-01

300

Automatic Smear Counter.  

National Technical Information Service (NTIS)

An automatic system to detect alpha and beta radiation emitted from either "smeared" IBM cards or special IBM cards with a filter paper window, as used in air sampling systems, has been designed and fabricated. A modified card reader is used to input da...

E. R. Rogers L. E. White

1986-01-01

301

Is Semantic Priming Automatic.  

National Technical Information Service (NTIS)

The time to decide that a letter string (e.g., 'doctor') is a word is reduced when it is preceded by a related word ('nurse'). At least some component of this semantic priming effect is thought to be automatic and therefore free of attentional limitation...

J. E. Hoffman F. W. MacMillan

1984-01-01

302

Automatic TCP Buffer Tuning  

Microsoft Academic Search

With the growth of high performance networking, a single host may have simultaneous connections that vary in bandwidth by as many as six orders of magnitude. We identify requirements for an automatically-tuning TCP to achieve maximum throughput across all connections simultaneously within the resource limits of the sender. Our auto-tuning TCP implementation makes use of several existing technologies and adds

Jeffrey Semke; Jamshid Mahdavi; Matthew Mathis

1998-01-01

303

Automatic Electronic Oxygen Supply  

PubMed Central

An automatic electronic oxygen system has been devised to supply an intensive care unit with a “fail-safe” supply of continuous oxygen. All parts of the system are fitted with alarms, as the oxygen powers gas-driven ventilators. Since the system is cheap it can be installed in hospitals where finance is limited.

Ford, Patricia; Hoodless, D. J.

1971-01-01

304

Automatic Thickener Control.  

National Technical Information Service (NTIS)

Direct automatic control of underflow density is found in practice to lead to a higher and more consistent degree of dewatering of a milled gold ore pulp in continuous thickening than achieved by the present method of manual operation. An ultrasonic senso...

K. J. Scott

1972-01-01

305

Automatic Whistler Detector and Analyzer system: Automatic Whistler Detector  

Microsoft Academic Search

A new, unique system has been developed for the automatic detection and analysis of whistlers. The Automatic Whistler Detector and Analyzer (AWDA) system has two purposes: (1) to automatically provide plasmaspheric electron densities extracted from whistlers and (2) to collect statistical data for the investigation of whistler generation and propagation. This paper presents the details of and the first results

J. Lichtenberger; C. Ferencz; L. Bodnár; D. Hamar; P. Steinbach

2008-01-01

306

What's yours and what's mine: Determining Intellectual Attribution in Scientific Text  

Microsoft Academic Search

We believe that identifying the structure of scientific argumentation in articles can help in tasks such as automatic summarization or the automated construction of citation indexes. One particularly important aspect of this structure is the question of who a given scientific statement is attributed to: other researchers, the field in general, or the authors themselves. We

Simone Teufel; Marc Moens

2000-01-01

307

Co-clustering Sentences and Terms for Multi-document Summarization  

Microsoft Academic Search

Two issues are crucial to multi-document summarization: diversity and redundancy. Content within some topically-related articles are usually redundant while the topic is delivered from diverse perspectives. This paper presents a co-clustering based multi-document summarization method that makes full use of the diverse and redundant content. A multi-document summary is generated in three steps. First, the sentence-term co-occurrence matrix is designed

Yunqing Xia; Yonggang Zhang; Jianmin Yao

2011-01-01

308

Relevant Information Extraction Driven with Rhetorical Schemas to Summarize Scientific Papers  

Microsoft Academic Search

Automatic summaries are often subject to several criticisms (e.g., lack of cohesion and coherence). In this paper, we propose an approach that uses coherent Summary-Schemas (templates) conceived from the rhetorical structure of scientific papers including their abstracts. The Summary-Schemas embed rhetorical roles specified by signatures (sets of positional, structural, linguistic and thematic features) that guide the search for appropriate sentences

Mariem Ellouze; Abdelmajid Ben Hamadou

2002-01-01

309

Text Genre Classification with Genre-Revealing and Subject-Revealing Features  

Microsoft Academic Search

Subject or propositional content has been the focus of most classification research. Genre or style, on the other hand, is a different and important property of text, and automatic text genre classification is becoming important for classification and retrieval purposes as well as for some natural language processing research. In this paper, we present a method for automatic genre classification

Yong-Bae Lee; Sung-Hyon Myaeng

2002-01-01

310

Accordion summarization for end-game browsing on PDAs and cellular phones  

Microsoft Academic Search

We demonstrate a new browsing technique for devices with small displays such as PDAs or cellular phones. We concentrate on end-game browsing, where the user is close to or on the target page. We make browsing more efficient and easier by Accordion Summarization. In this technique the Web page is first represented as a short summary. The user can then

Orkut Buyukkokten; Hector Garcia-Molina; Andreas Paepcke

2001-01-01

311

Summarized proceedings of a conference on stress analysis - University College of North Staffordshire, April 1960  

Microsoft Academic Search

The Annual Conference of the Stress Analysis Group of The Institute of Physics was held at the University College of North Staffordshire, Keele, Staffordshire, from 11th to 13th April 1960. The papers, which were concerned primarily with polymers and fibres, are summarized in this article.

C D Pomeroy

1961-01-01

312

Summarized proceedings of a conference on solid state physics - Melbourne, August 1959  

Microsoft Academic Search

The Australian Branch of The Institute of Physics held a conference on solid state physics in Melbourne from 17-21 August, 1959. This conference was the first of its kind to be held in Australia and attracted an attendance of about one hundred and thirty. In all, 46 papers, ranging over a wide field, were presented and these are summarized; they

J F Nicholas

1960-01-01

313

Long story short - Global unsupervised models for keyphrase based meeting summarization  

Microsoft Academic Search

We analyze and compare two different methods for unsupervised extractive spontaneous speech summarization in the meeting domain. Based on utterance comparison, we introduce an optimal formulation for the widely used greedy maximum marginal relevance (MMR) algorithm. Following the idea that information is spread over the utterances in form of concepts, we describe a system which finds an optimal selection of

Korbinian Riedhammer; Benoît Favre; Dilek Hakkani-Tür

2010-01-01
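
For readers unfamiliar with the greedy maximum marginal relevance (MMR) baseline named in the abstract above, a minimal Python sketch follows. It repeatedly picks the utterance that is most similar to the whole meeting while being least similar to what has already been selected. This is the generic greedy form, not the optimal formulation the authors introduce; the bag-of-words cosine similarity, the centroid used as the relevance target, the `lam` trade-off weight, and the function names are all assumptions for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(utterances, k=2, lam=0.5):
    """Greedily pick k utterances, trading relevance against redundancy."""
    vectors = [Counter(u.lower().split()) for u in utterances]
    # Relevance target: word counts pooled over the whole meeting.
    centroid = Counter()
    for v in vectors:
        centroid.update(v)

    selected = []
    while len(selected) < min(k, len(utterances)):
        best_i, best_score = None, -math.inf
        for i, v in enumerate(vectors):
            if i in selected:
                continue
            relevance = cosine(v, centroid)
            # Redundancy is the closest match among already chosen utterances.
            redundancy = max((cosine(v, vectors[j]) for j in selected), default=0.0)
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best_i, best_score = i, score
        selected.append(best_i)
    return [utterances[i] for i in selected]
```

With `lam` close to 1 the selection favours relevance only; lower values penalise near-duplicate utterances more strongly.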

314

An Extensive Empirical Study of Automated Evaluation of Multi-Document Summarization  

Microsoft Academic Search

This paper discusses an approach to automated evaluation of multi-document summarization by computing the similarities of automated summaries and human summaries and scoring the automated summaries by their similarities to the human ones. Several schemes are used in our experiment, as well as the effects of stop words and stemming. Our method's experimental result is compared to ROUGE, which is

Ying-Qiang Wu; Gang Zhou; Li-Qing Qiu

2008-01-01
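
The general idea in the abstract above, scoring an automated summary by its similarity to human summaries, can be sketched in a few lines of Python. The abstract does not say which similarity schemes the authors used, so the Jaccard word-overlap measure, the optional stop-word filter, and the function names below are assumptions; stemming is omitted.

```python
def word_set(text, stop_words=frozenset()):
    """Lowercased set of whitespace-separated words, minus stop words."""
    return {w for w in text.lower().split() if w not in stop_words}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_summary(candidate, references, stop_words=frozenset()):
    """Average similarity of a candidate summary to each human reference."""
    cand = word_set(candidate, stop_words)
    sims = [jaccard(cand, word_set(r, stop_words)) for r in references]
    return sum(sims) / len(sims) if sims else 0.0
```

Passing a stop-word set shows how filtering changes the score: function words that differ between two otherwise matching summaries no longer lower the similarity.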

315

Motion-Based Selection of Relevant Video Segments for Video Summarization  

Microsoft Academic Search

We present a method for motion-based video segmentation and segment classification as a step towards video summarization. The sequential segmentation of the video is performed by detecting changes in the dominant image motion, assumed to be related to camera motion and represented by a 2D affine model. The detection is achieved by analysing the temporal variations of some coefficients of

Nathalie Peyrard; Patrick Bouthemy

2005-01-01

316

Automatism and driving offences.  

PubMed

Automatism is a rarely used defence, but it is particularly used for driving offences because many are strict liability offences. Medical evidence is almost always crucial to argue the defence, and it is important to understand the bars that limit the use of automatism so that the important medical issues can be identified. The issue of prior fault is an important public safeguard to ensure that reasonable precautions are taken to prevent accidents. The total loss of control definition is more problematic, especially with disorders of more gradual onset like hypoglycaemic episodes. In these cases the alternative of 'effective loss of control' would be fairer. This article explores several cases, how the criteria were applied to each, and the types of medical assessment required. PMID:24112330

Rumbold, John

2013-08-07

317

Implementing automatic differentiation efficiently  

SciTech Connect

The automatic differentiation of computer arithmetic has been investigated since before 1960. Most of this effort has been centered on the forward mode of derivative evaluation. Speelpenning, Iri and Kubota, and Horwedel et al. have all implemented the more efficient reverse mode of evaluating derivatives in their respective Fortran precompilers. The reverse mode requires information about a computation to be stored in order that derivatives may be calculated after the function evaluation has been completed. This additional storage cost may be prohibitive. The goals of our research have been to implement an automatic differentiation package that gracefully handles the storage issue and to generate a parallel implementation of derivative evaluation in the reverse mode. This paper discusses results of both research efforts, specifically those involving the ADOL-C package. 10 refs., 4 figs.

Juedes, D.; Griewank, A.

1990-10-01
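
The storage cost of the reverse mode mentioned in this abstract can be illustrated with a minimal sketch. It is not the ADOL-C interface; the `Var` class and `backward` function are hypothetical names chosen for illustration. Each arithmetic operation records its inputs and local partial derivatives, and that stored record is what lets derivatives be computed after the function evaluation has finished.

```python
class Var:
    """A scalar that remembers how it was computed (hypothetical helper)."""

    def __init__(self, value, parents=()):
        self.value = value
        # Stored record: pairs of (input Var, local partial derivative).
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Sweep the stored record from the output back to the inputs."""
    order, seen = [], set()

    def visit(node):
        # Depth-first walk that lists every node after its inputs.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


x = Var(3.0)
y = Var(4.0)
f = x * y + x * x  # f(x, y) = x*y + x^2
backward(f)
print(f.value, x.grad, y.grad)
```

One reverse sweep yields the derivatives with respect to all inputs at once, at the price of keeping every intermediate `Var` alive until the sweep is done.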

318

Criterra automatic location planning  

Microsoft Academic Search

Criterra is a software suite that automatically determines optimum locations and heights in seconds/minutes for security system sensors, and locations for infrastructure and response forces based on dominant mosaic, line-of-sight, time-and-space, Doppler, propagation and other algorithms, executed on a terabyte size 3D geospatial and object database. Inputs include specifications of sensor systems, barriers, and response forces. Criterra is based on

Lawrence Cassenti; P. E. Peter Leed

2011-01-01

319

Automatic Skin Color Beautification  

NASA Astrophysics Data System (ADS)

In this paper, we propose an automatic skin beautification framework based on color-temperature-insensitive skin-color detection. To polish the selected skin region, we apply a bilateral filter to smooth facial flaws. Last, we use Poisson image cloning to integrate the beautified parts into the original input. Experimental results show that the proposed method can be applied in varied light-source environments. In addition, this method can naturally beautify the portrait skin.

Chen, Chih-Wei; Huang, Da-Yuan; Fuh, Chiou-Shann

320

Tele-Graffiti: A Camera-Projector Based Remote Sketching System with Hand-Based User Interface and Automatic Session Summarization  

Microsoft Academic Search

One way to build a remote sketching system is to use a video camera to image what each user draws at their site, transmit the video to the other sites, and display it there using an LCD projector. Such camera-projector based remote sketching systems date back to Paul Wellner's (largely unimplemented) Xerox Double DigitalDesk. To make such a system usable,

Naoya Takao; Jianbo Shi; Simon Baker

2003-01-01

321

Practical vision based degraded text recognition system  

NASA Astrophysics Data System (ADS)

Rapid growth and progress in the medical, industrial, security and technology fields means more and more consideration for the use of camera-based optical character recognition (OCR). Applying OCR to scanned documents is quite mature, and there are many commercial and research products available on this topic. These products achieve acceptable recognition accuracy and reasonable processing times, especially with trained software and constrained text characteristics. Even though the application space for OCR is huge, it is quite challenging to design a single system that is capable of performing automatic OCR for text embedded in an image irrespective of the application. Challenges for OCR systems include images taken under natural real-world conditions, surface curvature, text orientation, font, size, lighting conditions, and noise. These and many other conditions make it extremely difficult to achieve reasonable character recognition. Performance for conventional OCR systems drops dramatically as the degradation level of the text image quality increases. In this paper, a new recognition method is proposed to recognize solid or dotted line degraded characters. The degraded text string is localized and segmented using a new algorithm. The new method was implemented and tested using a development framework system that is capable of performing OCR on camera-captured images. The framework allows parameter tuning of the image-processing algorithm based on a training set of camera-captured text images. Novel methods were used for enhancement, text localization and segmentation, which enables building a custom system that is capable of performing automatic OCR and can be used for different applications. 
The developed framework system includes new image enhancement, filtering, and segmentation techniques which enabled higher recognition accuracies, faster processing time, and lower energy consumption, compared with the best state-of-the-art published techniques. The system successfully produced impressive OCR accuracies (90% to 93%) using customized systems generated by our development framework in two industrial OCR applications: water bottle label text recognition and concrete slab plate text recognition. The system was also trained for the Arabic language alphabet, and demonstrated extremely high recognition accuracy (99%) for Arabic license name plate text recognition with processing times of 10 seconds. The accuracy and run times of the system were compared to conventional and many state-of-the-art methods; the proposed system shows excellent results.

Mohammad, Khader; Agaian, Sos; Saleh, Hani

2011-02-01

322

Automatic categorization design for broadcast news  

NASA Astrophysics Data System (ADS)

This paper discusses our work on automatic categorization of broadcast news based on closed caption texts. The multimedia news data under study are first segmented into story units based on video and audio signals with our previously developed algorithms. Based on the time stamp information, closed caption texts are segmented into text units corresponding to each story unit. A Bayes network is then trained to automatically classify the story units into fourteen categories. The major contribution of this paper is the idea of category, which represents a higher level of semantic generalization as compared with traditional topics. We discuss in detail the administrated bottom-up clustering algorithm to generate a semantically meaningful category framework as well as the training procedures to build the belief network that covers the large broadcast news data set. Using LDC (Linguistic Data Consortium)'s CSR LM 1996 data set, we designed a number of experiments to discuss the relationship between categorization design and the classification performance.

Luo, Huitao; Huang, Qian

2001-12-01

323

Identifying Word Translation in Non-Parallel Texts  

Microsoft Academic Search

Common algorithms for sentence and word-alignment allow the automatic identification of word translations from parallel texts. This study suggests that the identification of word translations should also be possible with non-parallel and even unrelated texts. The method proposed is based on the assumption that there is a correlation between the patterns of word co-occurrences in texts of different languages.

Reinhard Rapp

1995-01-01
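
The co-occurrence assumption in this abstract can be made concrete with a small sketch. It is a simplification and not necessarily the procedure used in the study: it assumes a tiny seed lexicon of known translations and compares co-occurrence profiles with a city-block distance. The function names, the toy corpora and the seed lexicon are all hypothetical.

```python
from collections import Counter


def cooccurrence(tokens, window=1):
    """Count how often each ordered word pair occurs within `window` tokens."""
    counts = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(word, tokens[j])] += 1
    return counts


def profile(word, counts, anchors):
    """Co-occurrence counts of `word` with each anchor word, in order."""
    return [counts[(word, anchor)] for anchor in anchors]


def city_block(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy, non-parallel corpora (illustrative data only).
english = "teacher school teacher school baker bread baker bread".split()
german = "baecker brot baecker brot lehrer schule lehrer schule".split()

# Assumed seed lexicon of already known translations.
seed = {"school": "schule", "bread": "brot"}
en_anchors = list(seed)
de_anchors = [seed[w] for w in en_anchors]

en_counts = cooccurrence(english)
de_counts = cooccurrence(german)


def translate(word, candidates):
    """Pick the candidate whose co-occurrence profile is closest."""
    source = profile(word, en_counts, en_anchors)
    return min(candidates,
               key=lambda c: city_block(source, profile(c, de_counts, de_anchors)))


print(translate("teacher", ["lehrer", "baecker"]))
print(translate("baker", ["lehrer", "baecker"]))
```

The two corpora share no sentence alignment; only the pattern of which words appear near the seed words carries the signal.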

324

Integrating Rhetorical-Semantic Relation Models for Query-Focused Summarization  

Microsoft Academic Search

We present our recent work on query-focused summarization, focusing on our efforts in building and applying models of rhetorical-semantic relations (RSRs) such as contrast and causality. We overview ongoing work in extracting and evaluating RSR models. We describe our system for query-focused summarization, focusing on an enhanced, feature-based framework. We present results of experiments to measure the impact

Sasha Blair-Goldensohn; Kathleen McKeown

325

Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis  

Microsoft Academic Search

We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector

Philipp Cimiano; Andreas Hotho; Steffen Staab

2005-01-01
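
Formal Concept Analysis, as named in this abstract, groups objects by the attributes they share. The sketch below is a brute-force illustration under stated assumptions, not the authors' system: the toy context (terms as objects, verb-derived attributes) and all function names are hypothetical, and the enumeration over object subsets is exponential, so it only suits very small contexts.

```python
from itertools import combinations

# Toy formal context: each term (object) maps to the attributes observed
# for it, e.g. derived from verbs it appears with (illustrative data only).
context = {
    "hotel": {"bookable"},
    "apartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "bike": {"bookable", "rentable", "driveable", "rideable"},
}


def common_attributes(objects, context, all_attributes):
    """Attributes shared by every object in `objects`."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attributes, context):
    """Objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Return all (extent, intent) pairs by closing every object subset."""
    all_attributes = set().union(*context.values())
    names = sorted(context)
    concepts = set()
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts(context)
# Larger extents are more general concepts; ordering by extent size
# gives a simple view of the hierarchy from general to specific.
for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering the resulting concepts by set inclusion of their extents gives the lattice from which a taxonomy can be read off.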

326

Extraction of Text Objects in Image and Video Documents  

Microsoft Academic Search

The popularity of digital image and video is increasing rapidly. To help users navigate libraries of image and video, Content Based Information Retrieval (CBIR) systems that can automatically index image and video documents are needed. However, due to the semantic gap between low-level machine descriptors and high-level semantic descriptors, the existing CBIR systems are still far from perfect. Text embedded

Jing Zhang

2012-01-01

327

USING WORDNET TO COMPLEMENT TRAINING INFORMATION IN TEXT CATEGORIZATION  

Microsoft Academic Search

Automatic Text Categorization (TC) is a complex and useful task for many natural language applications, and is usually performed through the use of a set of manually classified documents, a training collection. We suggest the utilization of additional resources like lexical databases to increase the amount of information that TC systems make use of, and thus, to improve their

Manuel de Buenaga Rodríguez; José María Gómez-Hidalgo

328

Enhanced Arabic Information Retrieval System based on Arabic Text Classification  

Microsoft Academic Search

The paper presents an enhanced, effective and simple approach to text classification. The approach uses an algorithm to automatically classify documents. The main idea of the algorithm is to select feature words from each document; those words cover all the ideas in the document. The result of this algorithm is a list of the main subjects found in the document. Also, in

S. Ghwanmeh; G. Kanaan; R. Al-Shalabi; A. Ababneh

2007-01-01

329

Anaphor Resolution in Unrestricted Texts with Partial Parsing  

Microsoft Academic Search

In this paper we deal with several kinds of anaphora in unrestricted texts. These kinds of anaphora are pronominal references, surfacecount anaphora and one-anaphora. In order to solve these anaphors we work on the output of a part-of-speech tagger, on which we automatically apply a partial parsing from the formalism: , which has been implemented in Prolog. We only use

Antonio Ferrández; Manuel Palomar; Lidia Moreno

1998-01-01

330

Error Diagnosis in the FreeText Project  

Microsoft Academic Search

This paper presents an overview of the research conducted within the FreeText project to build an automatic error diagnosis system for learners of French as a foreign language. After a brief review of the main features of the project and of the learner corpus collected and used within the project, the paper focuses on the error diagnosis system itself and,

SÉBASTIEN L'HAIRE; ANNE VANDEVENTER FALTIN

331

Emotional Reading of Medical Texts Using Conversational Agents (Short Paper)  

Microsoft Academic Search

In this paper, we present a prototype that helps visualizing the relative importance of sentences extracted from medical texts using Embodied Conversational Agents (ECA). We propose to map rhetorical structures automatically recognized in the documents onto a set of communicative acts controlling the expression of an ECA. As a consequence, the ECA will dramatize a sentence to reflect its perceived

Gersende Georg; Catherine Pelachaud; Marc Cavazza

2008-01-01

332

Mobile phone text messaging in the management of diabetes  

Microsoft Academic Search

We conducted a trial of mobile phone text messaging (short message service; SMS) for diabetes management. In an eight-month period, 23 diabetic patients used the service. Patients used SMS to transmit data such as blood glucose levels and body weight to a server. The server automatically answered with an SMS acknowledgement message. A monthly calculated glycosylated haemoglobin result was also

O. Ferrer-Roca; A. Cardenas; A. Diaz-Cardama; P. Pulido

2004-01-01

333

Component Skills of Text Comprehension in Less Competent Chinese Comprehenders  

ERIC Educational Resources Information Center

The present study examined the role of verbal working memory (memory span and tongue-twister), two-character Chinese pseudoword reading (two tasks), rapid automatized naming (RAN) (letters and numbers), and phonological segmentation (deletion of rimes and onsets) in inferential text comprehension in Chinese in 31 less competent comprehenders…

Leong, Che Kan; Hau, Kit Tai; Tse, Shek Kam; Loh, Ka Yee

2007-01-01

334

Topic Detection Of Unrestricted Texts: Approaches And Evaluations  

Microsoft Academic Search

Topic detection and tracking refers to automatic techniques for locating topically related cohesive paragraphs in a stream of text. Most documents are about more than one subject, but many Natural Language Processing (NLP) and Information Retrieval (IR) techniques implicitly assume documents have just one topic. Even in the presence of a single topic within a document, the document may address

Yllias Chali

2005-01-01

335

Overlay Text Retrieval From Video Scene  

NASA Astrophysics Data System (ADS)

The rapid growth of video data leads to an urgent demand for efficient and true content-based browsing and retrieving systems. In response to such needs, various video content analysis schemes using one or a combination of image, audio, and text information in videos have been proposed to parse, index, or abstract massive amounts of data. Text in video is a very compact and accurate clue for video indexing and summarization. Most video text detection and extraction methods hold assumptions on text color, background contrast, and font style. Moreover, few methods can handle multilingual text well since different languages may have quite different appearances. In this paper, an efficient overlay text detection and extraction method is implemented which deals with complex backgrounds. It is based on our observation that there exist transient colors between inserted text and its adjacent background. The method is robust with respect to font size, style, color, orientation and noise, and can be used in a large variety of application fields such as mobile robot navigation, vehicle license detection and recognition, object identification, document retrieving, etc.

Manohar, K.; Irfan, S.; Sravani, K.

2013-03-01

336

Use of SI Metric Units Misrepresented in College Physics Texts.  

ERIC Educational Resources Information Center

Summarizes results of a survey that examined 13 textbooks claiming to use SI units. Tables present data concerning the SI and non-SI units actually used in each text in discussion of fluid pressure and thermal energy, and data concerning which texts do and do not use SI as claimed. (CS)

Hooper, William

1980-01-01

337

Automatic synthesis from ordinary English text  

Microsoft Academic Search

We summarize work between 1969 and 1972 in a continuing project with two objectives: to produce acceptable synthetic speech directly from English text; and to demonstrate with speech synthesis a detailed model of human articulatory movements. Work in the four-year period has yielded moderately accurate rules for predicting the occurrence of pauses and lesser breaks in the sentence; rules for

C. Coker; N. Umeda; C. Browman

1973-01-01

338

Summarizing ChaMPlane: Global Distributions And Nature Of Faint Chandra XRBs in the Galaxy  

NASA Astrophysics Data System (ADS)

We summarize the global properties of sources detected in our nearly 10y ChaMPlane survey of faint (<0.01 cts/s) Chandra sources in the Galactic plane. In this presentation, we focus on sources detected above 2 keV which are dominated by CVs and XRBs. Using the GCR hard sources with optical counterparts (Zhao et al), we derive constraints on luminosity and spatial distributions for sources out to 3 kpc (for NH < 3 E21) for comparison with all ChaMPlane fields with similar limiting NH to constrain the global distributions of CVs and qLMXBs in the Galaxy.

Grindlay, Jonathan E.; Hong, J.; van den Berg, M.; Servillat, M.; Zhao, P.; Allen, B.

2011-09-01

339

Automatic readout micrometer  

SciTech Connect

A measuring system is disclosed for surveying and very accurately positioning objects with respect to a reference line. A principal use of this surveying system is for accurately aligning the electromagnets which direct a particle beam emitted from a particle accelerator. Prior art surveying systems require highly skilled surveyors. Prior art systems include, for example, optical surveying systems which are susceptible to operator reading errors, and celestial navigation-type surveying systems, with their inherent complexities. The present invention provides an automatic readout micrometer which can very accurately measure distances. The invention has a simplicity of operation which practically eliminates the possibilities of operator optical reading error, owing to the elimination of traditional optical alignments for making measurements. The invention has an extendable arm which carries a laser surveying target. The extendable arm can be continuously positioned over its entire length of travel by either a coarse or fine adjustment without having the fine adjustment outrun the coarse adjustment until a reference laser beam is centered on the target as indicated by a digital readout. The length of the micrometer can then be accurately and automatically read by a computer and compared with a standardized set of alignment measurements. Due to its construction, the micrometer eliminates any errors due to temperature changes when the system is operated within a standard operating temperature range.

Lauritzen, T.

1982-03-23

340

Automatic readout micrometer  

DOEpatents

A measuring system is disclosed for surveying and very accurately positioning objects with respect to a reference line. A principal use of this surveying system is for accurately aligning the electromagnets which direct a particle beam emitted from a particle accelerator. Prior art surveying systems require highly skilled surveyors. Prior art systems include, for example, optical surveying systems which are susceptible to operator reading errors, and celestial navigation-type surveying systems, with their inherent complexities. The present invention provides an automatic readout micrometer which can very accurately measure distances. The invention has a simplicity of operation which practically eliminates the possibilities of operator optical reading error, owing to the elimination of traditional optical alignments for making measurements. The invention has an extendable arm which carries a laser surveying target. The extendable arm can be continuously positioned over its entire length of travel by either a coarse or fine adjustment without having the fine adjustment outrun the coarse adjustment until a reference laser beam is centered on the target as indicated by a digital readout. The length of the micrometer can then be accurately and automatically read by a computer and compared with a standardized set of alignment measurements. Due to its construction, the micrometer eliminates any errors due to temperature changes when the system is operated within a standard operating temperature range.

Lauritzen, Ted (Lafayette, CA)

1982-01-01

341

Comparing Conceptual, Divisive and Agglomerative Clustering for Learning Taxonomies from Text  

Microsoft Academic Search

The application of clustering methods for automatic taxonomy construction from text requires knowledge about the trade-off between, (i), their effectiveness (quality of result), (ii), efficiency (run-time behaviour), and, (iii), traceability of the taxonomy construction by the ontology engineer. In this line, we present an original conceptual clustering method based on Formal Concept Analysis for automatic taxonomy construction

Philipp Cimiano; Andreas Hotho; Steffen Staab

2004-01-01

342

Automated categorisation of clinical incident reports using statistical text classification  

Microsoft Academic Search

Objectives: To explore the feasibility of using statistical text classification techniques to automatically categorise clinical incident reports. Methods: Statistical text classifiers based on Naïve Bayes and Support Vector Machine algorithms were trained and tested on incident reports submitted by public hospitals to identify two classes of clinical incidents: inadequate clinical handover and incorrect patient identification. Each classifier was trained on 600 reports (300

Mei-Sing Ong; Farah Magrabi; Enrico Coiera

2010-01-01
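
A Naïve Bayes text classifier of the kind named in this abstract can be sketched in a few lines. This is a minimal multinomial variant with add-one smoothing, written under assumptions: the study's actual features, preprocessing and software are not given here, the function names are hypothetical, and the four training "reports" are invented toy strings, not study data.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """Fit counts from a list of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in documents:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(tokens, model):
    """Return the label with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token in vocabulary:  # words never seen in training are skipped
                # Add-one (Laplace) smoothed word likelihood.
                score += math.log((word_counts[label][token] + 1)
                                  / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy reports for the two incident classes (illustrative only).
training = [
    ("patient handed over without notes".split(), "handover"),
    ("handover notes missing at shift change".split(), "handover"),
    ("wrong patient wristband label".split(), "identification"),
    ("label mismatch wrong patient name".split(), "identification"),
]
model = train(training)
print(classify("notes missing at handover".split(), model))
print(classify("wrong wristband".split(), model))
```

Working in log space avoids numerical underflow when reports contain many words.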

343

Automatic clutch control system  

SciTech Connect

An automatic clutch control system is described for a vehicle of the type having an engine for driving a drive shaft. The engine has a throttle, and the vehicle further has a clutch for coupling the drive shaft to a driven shaft. The system comprises: first speed detecting means for detecting a rotational speed of the drive shaft; second speed detecting means for detecting the rotational speed of the driven shaft; throttle detecting means for detecting the throttle opening of the engine; clutch engagement control means for controlling the degree of engagement of the clutch in accordance with a clutch engagement control signal; and electronic control means coupled to the first and second speed detecting means. The throttle detecting means and clutch control means are for specifying, with a predetermined time period, a particular group of controlling signal data, which is assigned to a time segment corresponding to the period for clutch engagement control.

Mitsui, T.; Kobayashi, H.; Hirosawa, K.

1986-11-04

344

Automatic clutch control system  

SciTech Connect

This patent describes an automatic clutch control system for a vehicle of the type having an engine for driving a drive shaft. The engine has a throttle, and the vehicle further comprises a clutch for coupling the drive shaft to a driven shaft. The system comprises: first speed detecting means for detecting a rotational speed of the drive shaft; second speed detecting means for detecting the rotational speed of the driven shaft; throttle detecting means for detecting the throttle opening of the engine; clutch engagement control means for controlling the degree of engagement of the clutch in accordance with a clutch engagement control signal; and electronic control means coupled to the first and second speed detecting means, the throttle detecting means and the clutch control means, and including rate change detection means for determining the rate of change of the rotational speed of the drive shaft.

Kobayashi, H.; Mitsui, T.

1986-10-07

345

Automatic beamline calibration procedures  

SciTech Connect

Recent experience with the SLC and SPEAR accelerators has led to a well-defined set of procedures for calibration of the beamline model using the orbit fitting program, RESOLVE. Difference orbit analysis is used to calibrate quadrupole strengths, BPM sensitivities, corrector strengths, focusing effects from insertion devices, and to determine the source of dispersion and coupling errors. Absolute orbit analysis is used to locate quadrupole misalignments, BPM offsets, or beam loss. For light source applications, the photon beam source coordinates can be found. The result is an accurate model of the accelerator which can be used for machine control. In this paper, automatable beamline calibration procedures are outlined and illustrated with recent examples. 5 refs.

Corbett, W.J. (Stanford Univ., CA (United States). Stanford Synchrotron Radiation Lab.); Lee, M.J. (Stanford Linear Accelerator Center, Menlo Park, CA (United States)); Zambre, Y. (SRI International, Menlo Park, CA (United States))

1992-03-01

346

Automatic thesaurus extraction for Icelandic  

Microsoft Academic Search

Thesauri are becoming a common resource used in various Natural Language Processing and Information Retrieval related tasks. Methods for automatic extraction of thesauri have just recently begun performing well enough for practical use. A method to automatically extract a thesaurus for Icelandic from a tagged and parsed corpus was implemented and evaluated. The method is based on extracting relational trigrams

Frank Arthur; Blöndahl Cassata

347

Automatic image stitching using SIFT  

Microsoft Academic Search

This paper concerns the problem of automatic image stitching, which mainly applies to image sequences, even those including noise images. It uses a method based on invariant features to realize fully automatic image stitching, which includes two main parts: image matching and image blending. As the noise images differ greatly from the other images, when

Yanfang Li; Yaming Wang; Wenqing Huang; Zuoli Zhang

2008-01-01

348

Texts in Homes and Communities.  

ERIC Educational Resources Information Center

This paper considers how children's text making is shaped by the environment in which the texts are made. By considering texts made in classrooms and texts made in homes, the paper explores how classrooms and homes interact with children's (6-7 year old boys) reflective processes as they create artifacts--drawings, models, and writings. The paper…

Pahl, Kate

349

Litterature: Retour au texte (Literature: Return to the Text).  

ERIC Educational Resources Information Center

Choice of texts for use in French language instruction is discussed. It is argued that the text's format (e.g., advertising, figurative poetry, journal article, play, prose, etc.) is instrumental in bringing attention to the language in it, and this has implications for the best uses of different text types. (MSE)

Noe, Alfred

1993-01-01

350

Further experience with controller-based automatic motion synthesis for articulated figures  

Microsoft Academic Search

We extend an earlier automatic motion-synthesis algorithm for physically realistic articulated figures in several ways. First, we summarize several incremental improvements to the original algorithm that improve its efficiency significantly and provide the user with some ability to influence what motions are generated. These techniques can be used by an animator to achieve a desired movement style, or they can

Joel Auslander; Alex S. Fukunaga; Hadi Partovi; Jon Christensen; Lloyd Hsu; Peter Reiss; Andrew Shuman; Joe Marks; J. Thomas Ngo

1995-01-01

351

An automatic leading indicator of economic activity: forecasting GDP growth for European countries  

Microsoft Academic Search

In the construction of a leading indicator model of economic activity, economists must select among a pool of variables which lead output growth. Usually the pool of variables is large and a selection of a subset must be carried out. This paper proposes an automatic leading indicator model which, rather than preselection, uses a dynamic factor model to summarize the

GONZALO CAMBA-MENDEZ; GEORGE KAPETANIOS; RICHARD J. SMITH; MARTIN R. WEALE

2001-01-01

352

Automatic neural processing of disorder-related stimuli in social anxiety disorder: faces and more.  

PubMed

It has been proposed that social anxiety disorder (SAD) is associated with automatic information processing biases resulting in hypersensitivity to signals of social threat such as negative facial expressions. However, the nature and extent of automatic processes in SAD on the behavioral and neural level is not entirely clear yet. The present review summarizes neuroscientific findings on automatic processing of facial threat but also other disorder-related stimuli such as emotional prosody or negative words in SAD. We review initial evidence for automatic activation of the amygdala, insula, and sensory cortices as well as for automatic early electrophysiological components. However, findings vary depending on tasks, stimuli, and neuroscientific methods. Only few studies set out to examine automatic neural processes directly and systematic attempts are as yet lacking. We suggest that future studies should: (1) use different stimulus modalities, (2) examine different emotional expressions, (3) compare findings in SAD with other anxiety disorders, (4) use more sophisticated experimental designs to investigate features of automaticity systematically, and (5) combine different neuroscientific methods (such as functional neuroimaging and electrophysiology). Finally, the understanding of neural automatic processes could also provide hints for therapeutic approaches. PMID:23745116

Schulz, Claudia; Mothes-Lasch, Martin; Straube, Thomas

2013-05-24

353

Automatic Neural Processing of Disorder-Related Stimuli in Social Anxiety Disorder: Faces and More  

PubMed Central

It has been proposed that social anxiety disorder (SAD) is associated with automatic information processing biases resulting in hypersensitivity to signals of social threat such as negative facial expressions. However, the nature and extent of automatic processes in SAD on the behavioral and neural level is not entirely clear yet. The present review summarizes neuroscientific findings on automatic processing of facial threat but also other disorder-related stimuli such as emotional prosody or negative words in SAD. We review initial evidence for automatic activation of the amygdala, insula, and sensory cortices as well as for automatic early electrophysiological components. However, findings vary depending on tasks, stimuli, and neuroscientific methods. Only few studies set out to examine automatic neural processes directly and systematic attempts are as yet lacking. We suggest that future studies should: (1) use different stimulus modalities, (2) examine different emotional expressions, (3) compare findings in SAD with other anxiety disorders, (4) use more sophisticated experimental designs to investigate features of automaticity systematically, and (5) combine different neuroscientific methods (such as functional neuroimaging and electrophysiology). Finally, the understanding of neural automatic processes could also provide hints for therapeutic approaches.

Schulz, Claudia; Mothes-Lasch, Martin; Straube, Thomas

2013-01-01

354

Automatically resetting safety brake  

US Patent & Trademark Office Database

Antifrictional screws are used to permit the resetting of safety brakes of the type used in hoists. As in prior brakes, centrifugally operated dogs connect braking members, but in the present brakes the dogs also engage actuating members of the antifrictional screws, the screws being rotatable independently of their respective braking members. The dogs can be disengaged from the actuating members by merely operating the usual controls to raise cages of the hoists. The antifrictional screws in one embodiment are then free to be turned automatically independent of the rate of rotation of a braking member by force of usual spring washers disposed between the screws and respective braking members until the brakes are normally released. In modified types, dogs or pawls for controlling the amount of braking continue to engage respective actuating members for controlling rate of rotation of the screws until the cages have been raised sufficiently to release the brakes. For testing purposes, the dogs have easily operated rods associated with them for pushing outwardly the dogs at speeds below the usual speeds required for operation. In heavy-duty hoists, second dogs positioned to clear the actuating members are included to provide a moderate amount of braking in an ascending direction.

1981-03-31

355

Automatic drilling control system  

SciTech Connect

An automatic drilling control system is described for a drilling apparatus having a rig with a crown block and a traveling block. A draw works include an engine, a drum powered by the engine, clutches, and controls, a drilling line wound on the drum and rolled up or fed out during drilling by the engine. The drilling line extends through the crown block and the traveling block and connects to a fixed point. The line portion from the crown block to the fixed point is the dead line. The crown block and traveling block form a pulley system for supporting a drill pipe to raise or lower the same during drilling. A hydraulic pressure sensor connects to the dead line to measure the tension. A weight indicator gauge adjacent to the controls connects to the pressure sensor by a hydraulic line. A brake, having a brake handle, controls the rate of feed out of the drilling line to determine the tension on the dead line.

Ball, J.W.

1987-05-05

356

Automatic brain tumor segmentation  

NASA Astrophysics Data System (ADS)

A system that automatically segments and labels complete glioblastoma-multiform tumor volumes in magnetic resonance images of the human brain is presented. The magnetic resonance images consist of three feature images (T1- weighted, proton density, T2-weighted) and are processed by a system which integrates knowledge-based techniques with multispectral analysis and is independent of a particular magnetic resonance scanning protocol. Initial segmentation is performed by an unsupervised clustering algorithm. The segmented image, along with cluster centers for each class are provided to a rule-based expert system which extracts the intra-cranial region. Multispectral histogram analysis separates suspected tumor from the rest of the intra-cranial region, with region analysis used in performing the final tumor labeling. This system has been trained on eleven volume data sets and tested on twenty-two unseen volume data sets acquired from a single magnetic resonance imaging system. The knowledge-based tumor segmentation was compared with radiologist-verified `ground truth' tumor volumes and results generated by a supervised fuzzy clustering algorithm. The results of this system generally correspond well to ground truth, both on a per slice basis and more importantly in tracking total tumor volume during treatment over time.

Clark, Matthew C.; Hall, Lawrence O.; Goldgof, Dmitry B.; Velthuizen, Robert P.; Murtaugh, F. R.; Silbiger, Martin L.

1998-06-01

357

Automatic imitation in dogs  

PubMed Central

After preliminary training to open a sliding door using their head and their paw, dogs were given a discrimination task in which they were rewarded with food for opening the door using the same method (head or paw) as demonstrated by their owner (compatible group), or for opening the door using the alternative method (incompatible group). The incompatible group, which had to counterimitate to receive food reward, required more trials to reach a fixed criterion of discrimination performance (85% correct) than the compatible group. This suggests that, like humans, dogs are subject to ‘automatic imitation’; they cannot inhibit online the tendency to imitate head use and/or paw use. In a subsequent transfer test, where all dogs were required to imitate their owners' head and paw use for food reward, the incompatible group made a greater proportion of incorrect, counterimitative responses than the compatible group. These results are consistent with the associative sequence learning model, which suggests that the development of imitation depends on sensorimotor experience and phylogenetically general mechanisms of associative learning. More specifically, they suggest that the imitative behaviour of dogs is shaped more by their developmental interactions with humans than by their evolutionary history of domestication.

Range, Friederike; Huber, Ludwig; Heyes, Cecilia

2011-01-01

358

Automatic imitation in dogs.  

PubMed

After preliminary training to open a sliding door using their head and their paw, dogs were given a discrimination task in which they were rewarded with food for opening the door using the same method (head or paw) as demonstrated by their owner (compatible group), or for opening the door using the alternative method (incompatible group). The incompatible group, which had to counterimitate to receive food reward, required more trials to reach a fixed criterion of discrimination performance (85% correct) than the compatible group. This suggests that, like humans, dogs are subject to 'automatic imitation'; they cannot inhibit online the tendency to imitate head use and/or paw use. In a subsequent transfer test, where all dogs were required to imitate their owners' head and paw use for food reward, the incompatible group made a greater proportion of incorrect, counterimitative responses than the compatible group. These results are consistent with the associative sequence learning model, which suggests that the development of imitation depends on sensorimotor experience and phylogenetically general mechanisms of associative learning. More specifically, they suggest that the imitative behaviour of dogs is shaped more by their developmental interactions with humans than by their evolutionary history of domestication. PMID:20667875

Range, Friederike; Huber, Ludwig; Heyes, Cecilia

2010-07-28

359

Text comprehension, memory, and learning.  

PubMed

People are often able to reproduce a text quite well but are unable to use the information in the text for other purposes. Factors that help people to reproduce a text have been studied for some time. This article explores ways that enable people to learn from texts. Content overlap between a text and the reader's prior knowledge is identified as one factor, and methods are proposed to identify whether a text is suitable for readers with given background knowledge. For readers with low background knowledge, a text should be as coherent and explicit as possible to facilitate learning. However, data are presented to show that for readers with adequate background knowledge, texts with coherence gaps that stimulate constructive activities are in fact better for learning. PMID:8203801

Kintsch, W

1994-04-01

360

Dangers of Texting While Driving  

MedlinePLUS

... While Driving Guide Print Email The Dangers of Texting While Driving "Putting the brakes on the distracted driving epidemic ... 12th Street, SW Washington, DC 20554 Print Out Texting While Driving Guide (pdf) Related Information Filter NTT DOCOMO USA ...

361

ProTEXT Print Job  

Center for Biologics Evaluation and Research (CBER)

Text Version... For motavizumab, MedImmune has included specific text in the package insert which describes recommendations for safety monitoring and product ... More results from www.fda.gov/downloads/advisorycommittees/committeesmeetingmaterials

362

Meaning Representation and Text Planning  

Microsoft Academic Search

starts with a 'world' state, represented by structures of an application program (e.g., an expert system) that has text generation needs and an impetus to produce a natural language text. The output of generation is a natural language text. The generation process involves the tasks of a) delimiting the content of the eventual text, b) planning its structure, c) selecting

Christine Defrise; Sergei Nirenburg

1990-01-01

363

ParaText : scalable text analysis and visualization.  

SciTech Connect

Automated analysis of unstructured text documents (e.g., web pages, newswire articles, research publications, business reports) is a key capability for solving important problems in areas including decision making, risk assessment, social network analysis, intelligence analysis, scholarly research and others. However, as data sizes continue to grow in these areas, scalable processing, modeling, and semantic analysis of text collections becomes essential. In this paper, we present the ParaText text analysis engine, a distributed memory software framework for processing, modeling, and analyzing collections of unstructured text documents. Results on several document collections using hundreds of processors are presented to illustrate the flexibility, extensibility, and scalability of the entire process of text modeling from raw data ingestion to application analysis.

Dunlavy, Daniel M.; Stanton, Eric T.; Shead, Timothy M.

2010-07-01

364

Automatic imitation is automatic, but less so for narcissists.  

PubMed

Imitation is a fundamentally important human capability and has been the topic of considerable research in the behavioural sciences. One paradigm for investigating the basic nature of imitation is the "automatic imitation" paradigm. In this paradigm, participants are symbolically cued to make a particular response, whilst being incidentally exposed to a congruent or incongruent motor action performed by another person. The robust finding is that when the incidental action is incongruent with the cued action, participants are slower to respond than when it is congruent. Despite the name given to this paradigm, the extent to which the imitative tendency involved is actually automatic remains unclear. Here, we manipulated the probability of congruent and incongruent trials within blocks to assess the effects of expectation on the imitative process. In addition, we determined whether an individual difference variable related to how people process others' behaviour (narcissism) affected the automaticity of imitation. Our results confirm that imitation as observed in this paradigm is robust in the face of expectation. However, the degree to which expectation modulates automatic imitation was enhanced for individuals who scored higher on a narcissism inventory. Together, these results suggest that imitation in the automatic imitation paradigm is indeed largely automatic, but that individual differences in narcissism can change the extent to which imitative behaviour manifests. PMID:23187883

Hogeveen, Jeremy; Obhi, Sukhvinder S

2012-11-28

365

What is this text about?  

Microsoft Academic Search

Most work in text retrieval aims at presenting the information held by several texts in order to give entry clues towards these texts and to allow a navigation between them. Besides, a lesser interest is dedicated to the definition of principles for accessing content of single documents. As most information retrieval systems return documents from an initial request made of

Nicolas Hernandez; Brigitte Grau

2003-01-01

366

Slippery Texts and Evolving Literacies  

ERIC Educational Resources Information Center

The idea of "slippery texts" provides a useful descriptor for materials that mutate and evolve across different media. Eight adult gamers, encountering the slippery text "American McGee's Alice," demonstrate a variety of ways in which players attempt to manage their attention as they encounter a new text with many resonances. The range of their…

Mackey, Margaret

2007-01-01

367

Text detection for video analysis  

Microsoft Academic Search

Textual information brings important semantic clues in video content analysis. We describe a method for detection and representation of text in video segments. The method consists of seven steps: channel separation, image enhancement, edge detection, edge filtering, character detection, text box detection, and text line detection. Our results show that this method can be applied to English as well as

Lalitha Agnihotri; Nevenka Dimitrova

1999-01-01

368

Choosing Software for Text Processing.  

ERIC Educational Resources Information Center

Review of text processing software for microcomputers covers data entry, text editing, document formatting, and spelling and proofreading programs including "Wordstar," "PeachText," "PerfectWriter," "Select," and "The Word Plus." "The Whole Earth Software Catalog" and a new terminal to be manufactured for OCLC by IBM are mentioned. (EJS)

Mason, Robert M.

1983-01-01

369

Text prediction systems: a survey  

Microsoft Academic Search

Text prediction is one of the most widely used techniques to enhance the communication rate in augmentative and alternative communication. Prediction systems are traditionally used by people with disabilities (e.g. people with motor and speech impairments). However, new applications, such as writing short text messages via mobile phones, have recently appeared. A vast amount of heterogeneous text prediction methods and

Nestor Garay-vitoria; Julio Abascal

2006-01-01

370

Text Editing in Chemistry Instruction.  

ERIC Educational Resources Information Center

Describes experiments with Australian high school students that investigated differences in performance on chemistry word problems between two learning strategies: text editing, and conventional problem solving. Concluded that text editing had no advantage over problem solving in stoichiometry problems, and that the suitability of a text editing…

Ngu, Bing Hiong; Low, Renae; Sweller, John

2002-01-01

371

Multilingual Text Analysis for Text-to-Speech Synthesis  

Microsoft Academic Search

We present a model of text analysis for text-to-speech (TTS) synthesis based on weighted finite-state transducers, which serves as the text-analysis module of the multilingual Bell Labs TTS system. The transducers are constructed using a lexical toolkit that allows declarative descriptions of lexicons, morphological rules, numeral-expansion rules, and phonological rules, inter alia. To date, the model has been

Richard Sproat

1996-01-01

372

Rewriting and Paraphrasing Source Texts in Second Language Writing  

ERIC Educational Resources Information Center

The present study is based on interviews with 48 students and 27 instructors in a North American university and explores whether students and professors across faculties share the same views on the use of paraphrased, summarized, and translated texts in four examples of L2 student writing. Participants' comments centered on whether the…

Shi, Ling

2012-01-01

373

Rewriting and Paraphrasing Source Texts in Second Language Writing  

ERIC Educational Resources Information Center

The present study is based on interviews with 48 students and 27 instructors in a North American university and explores whether students and professors across faculties share the same views on the use of paraphrased, summarized, and translated texts in four examples of L2 student writing. Participants' comments centered on whether the paraphrases…

Shi, Ling

2012-01-01

374

Succinct Text Indexing with Wildcards  

NASA Astrophysics Data System (ADS)

A succinct text index uses space proportional to the text itself, say, two times n log σ for a text of n characters over an alphabet of size σ. In the past few years, there were several exciting results leading to succinct indexes that support efficient pattern matching. In this paper we present the first succinct index for a text that contains wildcards. The space complexity of our index is (3 + o(1)) n log σ + O(ℓ log n) bits, where ℓ is the number of wildcard groups in the text. Such an index finds applications in indexing genomic sequences that contain single-nucleotide polymorphisms (SNP), which could be modeled as wildcards.

Tam, Alan; Wu, Edward; Lam, Tak-Wah; Yiu, Siu-Ming

375

An improved kNN learning based Korean text classifier with heuristic information  

Microsoft Academic Search

Automatic text categorization is the problem of assigning predefined categories to free text documents based on the likelihood suggested by a training set of labelled texts. The kNN learning based text classifier is a well-known statistical approach and its algorithm is quite simple. While the method has been applied to many systems and shown relatively good performance, a thorough evaluation

Heui-Seok Lim

2002-01-01
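
For orientation, the generic kNN categorization idea mentioned in this abstract can be sketched as follows. This is a minimal illustration only, not the classifier from the paper: the whitespace tokenizer, the cosine similarity measure, the similarity-weighted vote, and the names `vectorize`, `cosine`, and `knn_classify` are all assumptions, and the paper's Korean-specific processing and heuristic information are not modeled.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, training, k=3):
    """Label `query` by a similarity-weighted vote of its k nearest
    training documents. `training` is a list of (text, label) pairs."""
    if not training:
        raise ValueError("training set must not be empty")
    q = vectorize(query)
    scored = sorted(
        ((cosine(q, vectorize(doc)), label) for doc, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for similarity, label in scored[:k]:
        votes[label] += similarity
    return votes.most_common(1)[0][0]
```

With a labelled set of (text, label) pairs, `knn_classify("football match goal", training)` returns the label whose nearest neighbours share the most vocabulary with the query. Weighting votes by similarity is one common design choice; a plain majority vote would also fit the description above.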

376

Research on Automatic Program Generation.  

National Technical Information Service (NTIS)

Automatic Program Generation Research has been conducted under the contract. The objective of the research has been to provide software generation directly from user specifications. This technical report contains a collection of three papers intended to s...

J. A. Ramirez; N. A. Rin; M. Brown; N. S. Prywes

1974-01-01

377

An Automatic Vehicle Classification System.  

National Technical Information Service (NTIS)

This manuscript documents the development of an automatic surface transportation identification system to break down vehicle noise sources, by class, into categories of truck, bus, car, and motorcycle. Such a classification would enable the Army, U.S. Env...

V. I. Pawlowska

1981-01-01

378

Progress of DORIS Automatic Scaling.  

National Technical Information Service (NTIS)

A major component of the Digital Oblique Remote Ionospheric Sensing program (DORIS) is the development of an automatic oblique ionogram scaling algorithm. The nature of the variations that have been observed in oblique ionograms collected to date has requ...

B. W. Reinisch; K. Chandra; W. S. Kuklinski

1989-01-01

379

Automatic Recognition of Cooperative Speakers.  

National Technical Information Service (NTIS)

Various statistical features were extracted in real time from speech samples from speakers and then processed for automatic identification and verification. On the basis of the most reliable feature, the long term averaged spectrum, two prototypes were se...

R. Frehse; R. Geppert; R. Gierloff; M. H. Kuhn; H. Ney

1980-01-01

380

Automatic Recognition of Solar Features.  

National Technical Information Service (NTIS)

Initial algorithms were developed to recognize two very different types of features in solar images: H alpha filaments and large scale magnetic field patterns. These algorithms provide an effective way to evaluate possibilities for future routine, automat...

P. L. Bornmann

1992-01-01

381

Instrumentation for automatic coke cutting  

SciTech Connect

This paper describes a system developed by Conoco and currently in use at a refinery for automatically decoking delayed coker drums. The paper describes experiences with the system and discusses future plans for automated hydraulic decoking systems.

Alworth, C.W.

1985-01-01

382

Automatic Light Gas Gun Development.  

National Technical Information Service (NTIS)

Work was continued toward the development of an automatic hypervelocity weapon based on the principles and techniques of the laboratory light gas gun. With the basic feasibility of the firing scheme established by the preceding work phase, effort was aime...

1964-01-01

383

Prospective Automatic Flight Control Systems.  

National Technical Information Service (NTIS)

The report contains a description of nonlinear self adjusting and variable structure automatic control systems for piloted and pilotless flight vehicles. Control problems considered include load stabilization, limitation of critical regimes, and control of...

A. D. Aleksandrov

1972-01-01

384

European Standards for Automatic Sprinklers.  

National Technical Information Service (NTIS)

European Automatic Sprinkler Standards were developed by the Comite Europeen des Assurances, an international association of insurance organizations of Western Europe, based on a detailed analysis of reliable fire statistics on sprinkler-protected proper...

P. Kirchhoff

1975-01-01

385

Encoding standards for large text resources: The Text Encoding Initiative  

Microsoft Academic Search

The Text Encoding Initiative (TEI) is an international project established in 1988 to develop guidelines for the preparation and interchange of electronic texts for research, and to satisfy a broad range of uses by the language industries more generally. The need for standardized encoding practices has become increasingly critical as the need to use and, most importantly, reuse vast amounts

Nancy Ide

1994-01-01

386

Automatic safety rod for reactors  

DOEpatents

An automatic safety rod for a nuclear reactor containing neutron absorbing material and designed to be inserted into a reactor core after a loss-of-core flow. Actuation is triggered either by a sudden decrease in core pressure drop or by the pressure drop falling below a predetermined minimum value. The automatic control rod includes a pressure regulating device whereby a controlled decrease in operating pressure due to reduced coolant flow does not cause the rod to drop into the core.

Germer, John H. (San Jose, CA)

1988-01-01

387

Automaticity and the anxiety disorders  

Microsoft Academic Search

Experimental psychopathologists have increasingly relied upon the concepts and methods of cognitive psychology in their attempts to elucidate information-processing biases associated with anxiety disorders. Many of these biases presumably constitute instances of automatic, not strategic, processing. But research has shown that attributes of automaticity (i.e. capacity-free, unconscious, involuntary) do not all apply to selective processing of threat associated with anxiety.

Richard J. McNally

1995-01-01

388

Operating safety of automatic objects  

NASA Astrophysics Data System (ADS)

Operating-safety assurance for automatic objects (aircraft, spacecraft, and underwater vehicles) is considered in the framework of safety-automata theory and automatic-control considerations. The interaction between the operator and the safety-assurance facilities is considered. Methodological recommendations are presented on the specification of reliability requirements for the vehicles considered, as well as on automata synthesis and analysis considerations, test planning, and the analysis of test results.

Maiorov, Anatolii Vladimirovich; Moskatov, Genrikh Karlovich; Shibanov, Georgii Petrovich

389

Networking automatic test equipment environments  

Microsoft Academic Search

Automatic test equipment (ATE) is a term that, in its broadest meaning, indicates a generic system capable of performing measurements in an automatic or semiautomated (human-assisted) way. Years ago, this term was used specifically to refer to an automated measurement system employed to test the functionality of some electronic device-under-test (DUT). Typical applications were in the manufacturing area, where ATE

L. Benetazzo; M. Bertocco; C. Narduzzi

2005-01-01

390

Inductive learning algorithms and representations for text categorization  

Microsoft Academic Search

Text categorization - the assignment of natural language texts to one or more predefined categories based on their content - is an important component in many information organization and management tasks. We compare the effectiveness of five different automatic learning algorithms for text categorization in terms of learning speed, real-time classification speed, and classification accuracy. We also examine training set

Susan T. Dumais; John C. Platt; David Heckerman; Mehran Sahami

1998-01-01

391

Clinical text classification under the Open and Closed Topic Assumptions.  

PubMed

This paper investigates multi-topic aspects in automatic classification of clinical free text in comparison with general text. In this paper, we facilitate two different views on multi-topics: the Closed Topic Assumption (CTA) and the Open Topic Assumption (OTA). Experimental results show that the characteristics of multi-topic assignments in the Computational Medicine Centre (CMC) Medical NLP Challenge Data is strongly OTA-oriented but general text Reuters-21578 is characterised in the middle of the OTA and CTA spectrum. PMID:19623772

Sasaki, Yutaka; Rea, Brian; Ananiadou, Sophia

2009-01-01

392

Browsing Semi-structured Web Texts Using Formal Concept Analysis  

Microsoft Academic Search

Query-directed browsing of unstructured Web-texts using Formal Concept Analysis (FCA) confronts two problems. Firstly, on-line Web-data is sometimes unstructured and any FCA-system must include additional mechanisms to structure input sources. Secondly, many on-line collections are large and dynamic so a Web-robot must be used to automatically extract data. These issues are addressed in this paper. We report on the

Richard J. Cole II; Peter W. Eklund

2001-01-01

393

A systematic review of named entity recognition in biomedical texts  

Microsoft Academic Search

Biomedical Named Entities (NEs) are phrases or combinations of phrases that denote specific objects or groups of objects in the biomedical literature. Research on Named Entity Recognition (NER) is one of the most disseminated activities in the automatic processing of biomedical scientific articles. We analyzed articles relevant to NER in biomedical texts, in the period from 2007 to 2009, through

Rodrigo Rafael Villarreal Goulart; Clarissa Castellã Xavier

2011-01-01

394

MindNet: Acquiring and Structuring Semantic Information from Text  

Microsoft Academic Search

As a lexical knowledge base constructed automatically from the definitions and example sentences in two machine-readable dictionaries (MRDs), MindNet embodies several features that distinguish it from prior work with MRDs. It is, however, more than this static resource alone. MindNet represents a general methodology for acquiring, structuring, accessing, and exploiting semantic information from natural language text. This paper provides an

Stephen D. Richardson; William B. Dolan; Lucy Vanderwende

1998-01-01

395

Computer Assisted Transcription of Text Images and Multimodal Interaction  

Microsoft Academic Search

Current automatic handwriting text image recognition systems are far from being perfect and, in general, human intervention is required to check and correct the results of such systems. This is both inefficient and uncomfortable to the user. As an alternative to this post-editing process, a multimodal interactive approach is proposed, where user feedback is provided by means of touch-screen pen

Alejandro Hector Toselli; Verónica Romero; Enrique Vidal

2008-01-01

396

Hierarchical Approach to Emotion Recognition and Classification in Texts  

Microsoft Academic Search

We explore the task of automatic classification of texts by the emotions expressed. We consider how the presence of neutral instances affects the performance of distinguishing between emotions. Another facet of the evaluation concerns the relation between polarity and emotions. We apply a novel approach which arranges neutrality, polarity and emotions hierarchically. This method significantly outperforms the corresponding “flat” approach

Diman Ghazi; Diana Inkpen; Stan Szpakowicz

2010-01-01

397

Automatic Thesaurus Generation from Raw Text using Knowledge-Poor Techniques  

Microsoft Academic Search

In addition to showing how lexical units are related within a field, domain-specific thesauri give an idea of what subjects are important to that field and are thus useful at many points in an information system. The major impediment to creation of thesauri has been the cost of their manual creation. We present here a number of automatic techniques that jointly produce a first draft of a thesaurus from any domain-defining

Gregory Grefenstette

1993-01-01

398

A Method for Semi-automatic Creation of Ontologies Based on Texts  

Microsoft Academic Search

The recent developments related to knowledge management, the semantic web and the exchange of electronic information through the use of agents have increased the need for ontologies to describe in a formal way shared understanding of a given domain. For computers and people to work in cooperation it is necessary that information have well defined and shared definitions. Ontologies are

Luiz C. C. Carvalheira; Edson Satoshi Gomi

2007-01-01

399

Automatic Extraction of Biological Information from Scientific Text: Protein-Protein Interactions  

Microsoft Academic Search

We describe the basic design of a system for automatic detection of protein-protein interactions extracted from scientific abstracts. By restricting the problem domain and imposing a number of strong assumptions which include pre-specified protein names and a limited set of verbs that represent actions, we show that it is possible to perform accurate information extraction. The

Christian Blaschke; Miguel A. Andrade; Christos A. Ouzounis; Alfonso Valencia

1999-01-01

400

TextArc: An Alternate Way to View a Text  

NSDL National Science Digital Library

Textarc is an unconventional tool that gives readers the opportunity to discover patterns and concepts in texts. Still in a developmental stage, the site offers readers the opportunity to utilize human visual processing by allowing intuition to help extract meaning from a text. By exposing every word at once, the eye is able to make connections and decipher meaning otherwise overlooked by normal reading, thereby exposing the essence of a text. The site currently has Hamlet available as a full textarc text, and is in the process of exposing more literary works. Not only of value for avid readers and literary critics, the site offers librarians and archivists new approaches to cataloguing. On the whole, this new and innovative creation is at a minimum intriguing, and the site is definitely worth a visit.

2002-01-01

401

[Progress in automatic reconstruction and analysis tools of genome-scale metabolic network].  

PubMed

High-throughput data supply a basis for the reconstruction of genome-scale metabolic networks, and meanwhile bring challenges to the reconstruction and analysis methods. With the increasing of data quantity, the time-consuming manual reconstruction and analysis are far behind the improvement of models. Therefore, various automatic methods emerge. The automatic reconstruction and analysis have irreplaceable effect in the standardization and programming of reconstruction and analysis methods, as well as largely improving the speed of reconstruction and understanding of the metabolic network. In this review, we introduced the progress of automatic reconstruction and the main analysis tools of genome-scale metabolic network. We further summarized the workflow of automatic reconstruction. The difficulties and perspectives on this research field are also discussed. PMID:23016303

Hao, Tong; Ma, Hongwu; Zhao, Xueming

2012-06-01

402

An Automatic Tremor Activity Monitoring System (TAMS)  

NASA Astrophysics Data System (ADS)

We have developed an algorithm that quantitatively characterizes the level of seismic tremors from recorded seismic waveforms. For each hour of waveform at a given station, the process begins with the calculation of scintillation index and moving average with various time lengths. The scintillation index (essentially the 'normalized variance of intensity of the signal') is adapted from the studies of pulses in radio waves and is an efficient tool to identify the energy bursts of tremor signals. Both scintillation index and moving average values are fed into a series of logic gates to determine if tremor activity exists. This algorithm is implemented in the Tremor Activity Monitoring System (TAMS) to provide automatic early alerts for episodic tremor and slip (ETS) events in the northern Cascadia margin. Currently, TAMS retrieves the digital waveforms recorded during the previous day from the Canadian National Seismographic Network (CNSN) archive server at 1 AM every morning. The detecting process is repeated for all stations and hours to determine the level of tremor activity of the previous day. If a sufficient number of stations within a radius of 100 km are determined to have tremor patterns and coherent tremor arrivals can be found at more than 3 stations, TAMS automatically sends out alert emails to a list of subscribers with a figure summarizing the hours and locations of coherent tremors. TAMS outputs are very consistent with the work done by visual inspection, especially for major ETS events. It is straightforward to configure TAMS into a near-real-time system that can send out hourly (or shorter) reports if necessary.

Kao, H.; Thompson, P. J.; Rogers, G.; Dragert, H.; Spence, G.

2006-12-01
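
The two per-station quantities named in this abstract, a scintillation index (normalized variance of signal intensity) and a moving average, can be illustrated with a short sketch. The details here are assumptions rather than the TAMS implementation: intensity is taken as squared amplitude, the moving average is a trailing mean of absolute amplitude, and the function names are hypothetical. The logic gates and thresholds that TAMS applies to these values are not described in the abstract and are not modeled.

```python
from statistics import fmean


def scintillation_index(samples):
    """Normalized variance of intensity: (<I^2> - <I>^2) / <I>^2,
    with intensity I assumed to be the squared amplitude."""
    intensity = [x * x for x in samples]
    mean_i = fmean(intensity)
    if mean_i == 0:
        return 0.0  # silent trace: no intensity, no variance
    mean_i_sq = fmean([i * i for i in intensity])
    return (mean_i_sq - mean_i ** 2) / mean_i ** 2


def moving_average(samples, window):
    """Trailing moving average of absolute amplitude over `window`
    samples (shorter at the start, where fewer samples exist)."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for k, value in enumerate(samples):
        running += abs(value)
        if k >= window:
            running -= abs(samples[k - window])
        averages.append(running / min(k + 1, window))
    return averages
```

A steady signal yields an index of zero, while a trace dominated by a short burst yields a large index, which is the contrast the abstract relies on to flag energy bursts.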

403

Text mining for technology monitoring  

Microsoft Academic Search

A considerable part of scientific and technological knowledge is coded in writing. In this context, automated text categorization can be regarded as a promising tool particularly for patent data analysis. In a real-life example, we show that automated text categorization can closely resemble the time-consuming categorisation job of an expert. By comparing different algorithms we reveal systematic differences in their

Thorsten Teichert; Marc-Andre Mittermayer

2002-01-01

404

Algorithm for Training Text Classifiers  

Microsoft Academic Search

The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was developed and tested on a newswire text categorization task. This method, which we call uncertainty sampling, reduced

David D. Lewis; William A. Gale
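
The selection step behind uncertainty sampling, as named in this abstract, can be sketched for a binary classifier: from a pool of unlabeled documents, pick those whose predicted class probability is closest to 0.5 and send them for labeling. This is a minimal sketch under assumptions; `select_most_uncertain` and the `predict_proba` callable are hypothetical names, and the classifier, batch size, and retraining loop used in the paper are not reproduced.

```python
def select_most_uncertain(pool, predict_proba, batch_size=1):
    """Return the `batch_size` items from `pool` whose predicted
    positive-class probability is closest to 0.5.

    `predict_proba` is an assumed callable mapping an item to a
    probability in [0, 1] under the current classifier."""
    if batch_size < 0:
        raise ValueError("batch_size must be non-negative")
    ranked = sorted(pool, key=lambda item: abs(predict_proba(item) - 0.5))
    return ranked[:batch_size]
```

In a full loop, the selected items would be labeled by a person, added to the training set, and the classifier retrained before the next selection round.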

405

DISFLUENCIES IN COMPREHENDING ARGUMENTATIVE TEXTS  

Microsoft Academic Search

In two experiments, we examine university students' ability to comprehend authentic argumentative texts and factors that influence their application of this skill. Participants read several relatively lengthy arguments and identified the main claim and reasons. Experiment 1 shows that participants are not skilled at identifying key elements from an argumentative text (only 30% accuracy). The performance of participants of all

MEREDITH LARSON; M. ANNE BRITT; AARON A. LARSON

2004-01-01

406

The encoding of spoken texts  

Microsoft Academic Search

There is a great deal of variation in the encoding of spoken texts in electronic form, both with respect to the types of features represented and the way particular features are rendered. This paper surveys problems in the electronic representation of speech and presents the solutions proposed by the Text Encoding Initiative. The special tags needed for the encoding of

Stig Johansson

1995-01-01

407

Text Messaging During Simulated Driving  

Microsoft Academic Search

Objective: This research aims to identify the impact of text messaging on simulated driving performance. Background: In the past decade, a number of on-road, epidemiological, and simulator-based studies reported the negative impact of talking on a cell phone on driving behavior. However, the impact of text messaging on simulated driving performance is still not fully understood. Method: Forty participants engaged

Frank A. Drews; Hina Yazdani; Celeste N. Godfrey; Joel M. Cooper; David L. Strayer

2009-01-01

408

Symbolic representation of text documents  

Microsoft Academic Search

This paper presents a novel method of representing a text document by the use of interval valued symbolic features. A method of classification of text documents based on the proposed representation is also presented. The newly proposed model significantly reduces the dimension of feature vectors and also the time taken to classify a given document. Further, extensive experimentations are conducted

D. S. Guru; B. S. Harish; S. Manjunath

2010-01-01

409

Text mining with conceptual graphs  

Microsoft Academic Search

A method for conceptual clustering of a collection of texts represented with conceptual graphs is presented. It uses an incremental strategy to construct the cluster hierarchy and incorporates some characteristics attractive for text mining purposes. For instance, it considers the structural information of the graphs, uses domain knowledge to detect the clusters with generalized descriptions, and uses a user-defined similarity

M. Montes-Y-Gomez; A. Gelbukh; A. Lopez-Lopez; R. Baeza-Yates

2001-01-01

410

TEXT MINING WITH CONCEPTUAL GRAPHS  

Microsoft Academic Search

A method for conceptual clustering of a collection of texts represented with conceptual graphs is presented. It uses the incremental strategy to construct the cluster hierarchy and incorporates some characteristics attractive for text mining purposes. For instance, it considers the structural information of the graphs, uses domain knowledge to detect the clusters with generalized descriptions, and uses a user-defined

M. MONTES-Y-GÓMEZ; A. GELBUKH; A. LÓPEZ-LÓPEZ; R. BAEZA-YATES

411

MULTICENTER AUTOMATIC DEFIBRILLATOR ...  

Center for Biologics Evaluation and Research (CBER)

Text Version... David Oakes, PhD (Chair and Biostatistician) Professor Department of Biostatistics and Computational Biology University of Rochester Medical ... More results from www.fda.gov/downloads/advisorycommittees/committeesmeetingmaterials

412

MULTICENTER AUTOMATIC DEFIBRILLATOR ...  

Center for Biologics Evaluation and Research (CBER)

Text Version... Gill analysis that accounts for the changing risk from first to ... The changes from baseline to 12 months in ... by linear regression models of change in the ... More results from www.fda.gov/downloads/advisorycommittees/committeesmeetingmaterials

413

Text Mining in Social Networks  

NASA Astrophysics Data System (ADS)

Social networks are rich in various kinds of contents such as text and multimedia. The ability to apply text mining algorithms effectively in the context of text data is critical for a wide variety of applications. Social networks require text mining algorithms for a wide variety of applications such as keyword search, classification, and clustering. While search and classification are well known applications for a wide variety of scenarios, social networks have a much richer structure both in terms of text and links. Much of the work in the area uses either purely the text content or purely the linkage structure. However, many recent algorithms use a combination of linkage and content information for mining purposes. In many cases, it turns out that the use of a combination of linkage and content information provides much more effective results than a system which is based purely on either of the two. This paper provides a survey of such algorithms, and the advantages observed by using such algorithms in different scenarios. We also present avenues for future research in this area.

Aggarwal, Charu C.; Wang, Haixun

414

A Bottom-Up Approach to Sentence Ordering for Multi-Document Summarization  

Microsoft Academic Search

Ordering information is a difficult but important task for applications generating natural-language text. We present a bottom-up approach to arranging sentences extracted for multi-document summarization. To capture the association and order of two textual segments (e.g., sentences), we define four criteria: chronology, topical-closeness, precedence, and succession. These criteria are integrated into a criterion by a

Danushka Bollegala; Naoaki Okazaki; Mitsuru Ishizuka

2006-01-01

415

Counting OCR errors in typeset text  

NASA Astrophysics Data System (ADS)

Frequently object recognition accuracy is a key component in the performance analysis of pattern matching systems. In the past three years, the results of numerous excellent and rigorous studies of OCR system typeset-character accuracy (henceforth OCR accuracy) have been published, encouraging performance comparisons between a variety of OCR products and technologies. These published figures are important; OCR vendor advertisements in the popular trade magazines lead readers to believe that published OCR accuracy figures affect market share in the lucrative OCR market. Curiously, a detailed review of many of these OCR error occurrence counting results reveals that they are not reproducible as published and they are not strictly comparable due to larger variances in the counts than would be expected by the sampling variance. Naturally, since OCR accuracy is based on a ratio of the number of OCR errors over the size of the text searched for errors, imprecise OCR error accounting leads to similar imprecision in OCR accuracy. Some published papers use informal, non-automatic, or intuitively correct OCR error accounting. Still other published results present OCR error accounting methods based on string matching algorithms such as dynamic programming using Levenshtein (edit) distance but omit critical implementation details (such as the existence of suspect markers in the OCR generated output or the weights used in the dynamic programming minimization procedure). The problem with not specifically revealing the accounting method is that the numbers of errors found by different methods are significantly different. This paper identifies the basic accounting methods used to measure OCR errors in typeset text and offers an evaluation and comparison of the various accounting methods.

Sandberg, Jonathan S.

1995-03-01
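
To make the abstract's point about accounting choices concrete, the following Python sketch counts OCR errors with Levenshtein (edit) distance and derives an accuracy figure from it. It is an illustration only: unit costs for insertion, deletion, and substitution are an assumption (the abstract notes that published studies often leave these weights unstated), and the function names are hypothetical.

```python
def edit_distance(truth: str, ocr: str) -> int:
    """Levenshtein distance with unit insert/delete/substitute costs (assumed)."""
    previous = list(range(len(ocr) + 1))
    for i, t_char in enumerate(truth, start=1):
        current = [i]
        for j, o_char in enumerate(ocr, start=1):
            substitution = previous[j - 1] + (0 if t_char == o_char else 1)
            deletion = previous[j] + 1      # character missing from OCR output
            insertion = current[j - 1] + 1  # spurious character in OCR output
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """Accuracy as 1 - (error count / ground-truth length), one common convention."""
    if not truth:
        raise ValueError("ground-truth text must be non-empty")
    return 1.0 - edit_distance(truth, ocr) / len(truth)


errors = edit_distance("typeset", "typcset")
accuracy = character_accuracy("typeset", "typcset")
```

Changing the cost weights, or treating suspect markers in the OCR output as errors, would change both numbers, which is the comparability problem the abstract describes.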

416

Video summarization using descriptors of motion activity: a motion activity based approach to key-frame extraction from video shots  

NASA Astrophysics Data System (ADS)

We describe a video summarization technique that uses motion descriptors computed in the compressed domain. It can either speed up conventional color-based video summarization techniques, or rapidly generate a key-frame based summary by itself. The basic hypothesis of the work is that the intensity of motion activity of a video segment is a direct indication of its `summarizability,' which we experimentally verify using the MPEG-7 motion activity descriptor and the fidelity measure proposed in H. S. Chang, S. Sull, and S. U. Lee, `Efficient video indexing scheme for content-based retrieval,' IEEE Trans. Circuits Syst. Video Technol. 9(8), (1999). Note that the compressed domain extraction of motion activity intensity is much simpler than the color-based calculations. We are thus able to quickly identify easy to summarize segments of a video sequence since they have a low intensity of motion activity. We are able to easily summarize these segments by simply choosing their first frames. We can then apply conventional color-based summarization techniques to the remaining segments. We thus speed up color-based summarization by reducing the number of segments processed. Our results also motivate a simple and novel key-frame extraction technique that relies on a motion activity based nonuniform sampling of the frames. Our results indicate that it can either be used by itself or to speed up color-based techniques as explained earlier.

Divakaran, Ajay; Radhakrishnan, Regunathan; Peker, Kadir A.

2001-10-01
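
One way to read the "motion activity based nonuniform sampling" mentioned in the abstract is to place key frames at equal steps of cumulative motion activity, so that high-activity stretches receive more key frames. The Python sketch below follows that reading; it is an assumption rather than the authors' exact procedure, and the per-frame activity values are taken as given instead of being extracted from a compressed stream.

```python
from bisect import bisect_left
from itertools import accumulate
from typing import List, Sequence


def sample_key_frames(activity: Sequence[float], num_key_frames: int) -> List[int]:
    """Pick frame indices at equal increments of cumulative motion activity.

    Assumption: activity[i] is a non-negative motion-activity intensity for
    frame i. A segment with no activity is summarized by its first frame.
    """
    if not activity or num_key_frames < 1:
        raise ValueError("need at least one frame and one key frame")
    cumulative = list(accumulate(activity))
    total = cumulative[-1]
    if total == 0:
        return [0]
    chosen: List[int] = []
    for k in range(num_key_frames):
        # Midpoint of the k-th equal share of the total activity.
        target = total * (k + 0.5) / num_key_frames
        frame = bisect_left(cumulative, target)
        if not chosen or frame != chosen[-1]:
            chosen.append(frame)
    return chosen


quiet_then_busy = [0, 0, 0, 0, 10, 10, 10, 10]
key_frames = sample_key_frames(quiet_then_busy, 2)
```

In this toy input both key frames fall in the active second half, while an all-zero segment collapses to its first frame.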

417

A comprehensive method for multilingual video text detection, localization, and extraction  

Microsoft Academic Search

Text in video is a very compact and accurate clue for video indexing and summarization. Most video text detection and extraction methods hold assumptions on text color, background contrast, and font style. Moreover, few methods can handle multilingual text well since different languages may have quite different appearances. This paper performs a detailed analysis of multilingual text characteristics, including English

Michael R. Lyu; Jiqiang Song; Min Cai

2005-01-01

418

Determining an author's native language by mining a text for errors  

Microsoft Academic Search

In this paper, we show that stylistic text features can be exploited to determine an anonymous author's native language with high accuracy. Specifically, we first use automatic tools to ascertain frequencies of various stylistic idiosyncrasies in a text. These frequencies then serve as features for support vector machines that learn to classify texts according to author native language.

Moshe Koppel; Jonathan Schler; Kfir Zigdon

2005-01-01

419

A feature selection based on deviation from feature centroid for text categorization  

Microsoft Academic Search

Text categorization is very vital in assisting people to process automatically the information which increases exponentially. But the high dimensionality of the vector space is a big hurdle in applying many sophisticated learning algorithms in text categorization. So feature selection has become a research focus in text categorization. In this paper, we proposed a new feature selection, named FCFS, which

Jieming Yang; Zhiying Liu

2011-01-01

420

Text Detection in Color Scene Images based on Unsupervised Clustering of Multichannel Wavelet Features  

Microsoft Academic Search

Texts in natural scenes provide us with much useful information. In order to use such information automatically, it is necessary to make computers detect text regions in the images. Gllavata et al. proposed a method based on unsupervised classification of high frequency wavelet coefficients for text detection in video frames [Gllavata et al. (2004)]. Although the method is very accurate,

Tomoyuki Saoi; Hideaki Goto; Hiroaki Kobayashi

2005-01-01

421

Text-mining and information-retrieval services for molecular biology  

PubMed Central

Text-mining in molecular biology - defined as the automatic extraction of information about genes, proteins and their functional relationships from text documents - has emerged as a hybrid discipline on the edges of the fields of information science, bioinformatics and computational linguistics. A range of text-mining applications have been developed recently that will improve access to knowledge for biologists and database annotators.

Krallinger, Martin; Valencia, Alfonso

2005-01-01

422

Level statistics of words: finding keywords in literary texts and symbolic sequences  

Microsoft Academic Search

Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to extract automatically keywords in literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered

P. Carpena; P. Bernaola-Galvan; M. Hackenberg; A. V. Coronado; J. L Oliver

2009-01-01
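
The idea that relevant words cluster along a text can be illustrated with a simple spacing statistic. The Python sketch below scores each word by the ratio of the standard deviation to the mean of the gaps between its consecutive occurrences. This ratio is an assumption chosen for illustration and is not necessarily the statistic used in the paper; `clustering_scores` and `min_count` are hypothetical names.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def clustering_scores(tokens: Sequence[str], min_count: int = 3) -> Dict[str, float]:
    """Score words by how unevenly their occurrences are spaced in the text.

    A score near 0 means evenly spaced occurrences; larger scores mean the
    occurrences are bunched together, which is taken here as a keyword cue.
    """
    positions: Dict[str, List[int]] = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    scores: Dict[str, float] = {}
    for word, where in positions.items():
        if len(where) < min_count:
            continue  # too few occurrences to estimate spacing
        gaps = [later - earlier for earlier, later in zip(where, where[1:])]
        scores[word] = pstdev(gaps) / mean(gaps)
    return scores


# Toy text: "the" is evenly spaced, "whale" is bunched near the start.
toy_tokens = [f"filler{i}" for i in range(20)]
for i in (0, 5, 10, 15):
    toy_tokens[i] = "the"
for i in (1, 2, 3, 19):
    toy_tokens[i] = "whale"
toy_scores = clustering_scores(toy_tokens)
```

Ranking words by this score, rather than by raw frequency alone, is what separates a bunched content word from an evenly spread function word in the toy example.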

423

Alex Catalog of Electronic Texts  

NSDL National Science Digital Library

This catalog, maintained by Eric Lease Morgan, a systems librarian at North Carolina State University, specializes in American literature, English literature, and philosophy. Alex is particularly helpful because the search interface allows researchers to both look for documents and search the content of those documents. Users first search standard fields such as author, title, or publication date; then they can search the content of documents they select from their returns list. Though returns in content searches would be more convenient were they hyperlinked to the complete record for the text, such a search nonetheless has obvious utility for someone writing on, for example, flower imagery in Shakespearian sonnets or Emerson's vision of democracy. Another nice feature of the catalog is the ability to convert documents to .pdf files on-the-fly (with the font and spacing customizable). Alternately, users can download the whole collection of American or English literature or philosophy texts and the tools to search the texts.

424

Automatic kinetic typography composer  

Microsoft Academic Search

Animated text, commonly called kinetic typography, is any attractive visual expression used in films, TV programs, video games, etc. Previous studies have developed tools that support the authoring and rendering of kinetic typography. However, authoring kinetic typography is not easy because its methodology is still at an early stage. Hence, we systematize expression elements in kinetic typography and propose an

Mitsuru Minakuchi; Katsumi Tanaka

2005-01-01

425

Automatic rapid attachable warhead section  

DOEpatents

Disclosed are a method and apparatus for (1) automatically selecting warheads or reentry vehicles from a storage area containing a plurality of types of warheads or reentry vehicles, (2) automatically selecting weapon carriers from a storage area containing at least one type of weapon carrier, (3) manipulating and aligning the selected warheads or reentry vehicles and weapon carriers, and (4) automatically coupling the warheads or reentry vehicles with the weapon carriers such that coupling of improperly selected warheads or reentry vehicles with weapon carriers is inhibited. Such inhibition enhances safety of operations and is achieved by a number of means including computer control of the process of selection and coupling and use of connectorless interfaces capable of assuring that improperly selected items will be rejected or rendered inoperable prior to coupling. Also disclosed are a method and apparatus wherein the stated principles pertaining to selection, coupling and inhibition are extended to apply to any item-to-be-carried and any carrying assembly.

Trennel, Anthony J. (Albuquerque, NM)

1994-05-10

426

Finding text in color images  

NASA Astrophysics Data System (ADS)

In this paper, we consider the problem of locating and extracting text from WWW images. A previous algorithm based on color clustering and connected components analysis works well as long as the color of each character is relatively uniform and the typography is fairly simple. It breaks down quickly, however, when these assumptions are violated. In this paper, we describe more robust techniques for dealing with this challenging problem. We present an improved color clustering algorithm that measures similarity based on both RGB and spatial proximity. Layout analysis is also incorporated to handle more complex typography. These changes significantly enhance the performance of our text detection procedure.

Zhou, Jiangying; Lopresti, Daniel P.; Tasdizen, Tolga

1998-04-01

427

Text Classification using String Kernels  

Microsoft Academic Search

We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text though not necessarily contiguously. The subsequences are weighted by an exponentially decaying factor

Huma Lodhi; Craig Saunders; John Shawe-Taylor; Nello Cristianini; Christopher J. C. H. Watkins

2002-01-01
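
The kernel described in the abstract can be written out by brute force for short strings. The Python sketch below enumerates every length-k subsequence and weights each occurrence by a decay factor raised to the span it covers in the text. Because the abstract is truncated, the span-based weighting is an assumption based on the usual formulation of subsequence kernels, and the exhaustive enumeration is only practical for small inputs (efficient versions rely on dynamic programming). All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict


def subsequence_features(text: str, k: int, decay: float) -> Dict[str, float]:
    """Map each length-k subsequence to the sum of decay**span over its occurrences."""
    features: Dict[str, float] = defaultdict(float)
    for indices in combinations(range(len(text)), k):
        span = indices[-1] - indices[0] + 1  # stretch of text the occurrence covers
        subsequence = "".join(text[i] for i in indices)
        features[subsequence] += decay ** span
    return features


def string_kernel(s: str, t: str, k: int, decay: float) -> float:
    """Inner product of the two subsequence feature maps."""
    features_s = subsequence_features(s, k, decay)
    features_t = subsequence_features(t, k, decay)
    return sum(
        weight * features_t[sub]
        for sub, weight in features_s.items()
        if sub in features_t
    )


similarity = string_kernel("cat", "car", k=2, decay=0.5)
```

For "cat" and "car" with k = 2, the only shared subsequence is "ca", which spans two characters in each string, so the kernel value is decay**4.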

428

Biomarker Identification Using Text Mining  

PubMed Central

Identifying molecular biomarkers has become one of the important tasks for scientists to assess the different phenotypic states of cells or organisms correlated to the genotypes of diseases from large-scale biological data. In this paper, we proposed a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary, and then we used a finite state machine to identify the biomarkers. Our method of text mining provides a highly reliable approach to discover the biomarkers in the PubMed database.

Li, Hui; Liu, Chunmei

2012-01-01

429

Biomarker identification using text mining.  

PubMed

Identifying molecular biomarkers has become one of the important tasks for scientists to assess the different phenotypic states of cells or organisms correlated to the genotypes of diseases from large-scale biological data. In this paper, we proposed a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary, and then we used a finite state machine to identify the biomarkers. Our method of text mining provides a highly reliable approach to discover the biomarkers in the PubMed database. PMID:23197989

Li, Hui; Liu, Chunmei

2012-11-11

430

Interpreting Texts in Classroom Contexts.  

ERIC Educational Resources Information Center

Describes a series of instructional episodes in an 11th-grade classroom discussing J.D. Salinger's short story "The Laughing Man." Presents and discusses the "Text and Context" model for the negotiation of interpretations in classroom contexts. Offers suggestions for developing interpretive classroom communities. (SR)

Unrau, Norman J.; Ruddell, Robert B.

1995-01-01

431

Postmodern Texts and Emotional Audiences  

Microsoft Academic Search

Chabot Davis analyzes contemporary texts that bond together two seemingly antithetical sensibilities: the sentimental and the postmodern. Ranging across multiple media and offering a methodological union of textual analysis and reception study, Chabot Davis presents case studies of audience responses. Chabot Davis argues that sentimental postmodernism deepened leftist political engagement by moving audiences to identify emotionally with people across the

Kimberly Chabot Davis

2007-01-01

432

Emotions in nondirected text learning  

Microsoft Academic Search

Two studies examined the influence of emotions on nondirected learning. Nondirected learning is conceptualized as learning which occurs in the absence of external prompts, reinforcements, or specific instruction. In Study 1, one of two expository texts was given to ninety-two undergraduate subjects for the ostensible purpose of obtaining attitudinal and emotional ratings. Two separate measures of motivational and emotional factors

RICHARD M. RYAN; JAMES P. CONNELL; ROBERT W. PLANT

1990-01-01

433

Reviving "Walden": Mining the Text.  

ERIC Educational Resources Information Center

Describes how the author and her high school English students begin their study of Thoreau's "Walden" by mining the text for quotations to inspire their own writing and discussion on the topic, "How does Thoreau speak to you or how could he speak to someone you know?" (SR)

Hewitt, Julia

2000-01-01

434

Solar Concepts: A Background Text.  

ERIC Educational Resources Information Center

This text is designed to provide teachers, students, and the general public with an overview of key solar energy concepts. Various energy terms are defined and explained. Basic thermodynamic laws are discussed. Alternative energy production is described in the context of the present energy situation. Described are the principal contemporary solar…

Gorham, Jonathan W.

435

Transformation and Text: Journal Pedagogy.  

ERIC Educational Resources Information Center

One intention that an instructor had for her new course called "Writing and Healing: Women's Journal Writing" was to make apparent the power of self-written text to transform the writer. She asked her students--women studying women writing their lives and women writing their own lives--to write three pages a day and to focus on change. The…

Ellis, Carol

436

Clustering Concept Hierarchies from Text  

Microsoft Academic Search

We present a novel approach to learning taxonomies or concept hierarchies from text. The approach is based on Formal Concept Analysis, a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. Our approach is based on the distributional hypothesis, i.e. that nouns or terms are similar to the extent to which they share

Philipp Cimiano; Andreas Hotho

437

Reading Instruction and Text Difficulty  

ERIC Educational Resources Information Center

An observational study investigated the influence of text difficulty (independent, instructional, or frustration level) on the reading experiences of students in grades 1-3 in two schools for the deaf. Participants included 12 students who are deaf or hard of hearing and 5 educators. The most significant findings were twofold. First, students…

Donne, Vicki

2011-01-01

438

Critical Edition of Sanskrit Texts  

Microsoft Academic Search

A critical edition takes into account all the different known versions of the same text in order to show the differences between any two distinct versions. The construction of a critical edition is a long and, sometimes, tedious work. Some software that helps the philologist in such a task has been available for a long time for the European

Marc Csernel; François Patte

2008-01-01

439

Text-independent speaker identification  

Microsoft Academic Search

We describe current approaches to text-independent speaker identification based on probabilistic modeling techniques. The probabilistic approaches have largely supplanted methods based on comparisons of long-term feature averages. The probabilistic approaches have an important and basic dichotomy into nonparametric and parametric probability models. Nonparametric models have the advantage of being potentially more accurate models (though possibly more fragile) while parametric models

H. Gish; M. Schmidt

1994-01-01

440

Predictive Encoding in Text Compression.  

ERIC Educational Resources Information Center

Presents three text compression methods of increasing power and evaluates each based on the trade-off between compression gain and processing time. The advantages of using hash coding for speed and optimal arithmetic coding to successor information for compression gain are discussed. (26 references) (Author/CLB)

Raita, Timo; Teuhola, Jukka

1989-01-01

441

Automatic mass spectrometer inlet system  

SciTech Connect

There is provided a mass spectrometer having a gas inlet system for introducing a sample into the ion source of the spectrometer which inlet system includes a cold trap for condensing a sample. The inlet system is provided with means for detecting the pressure therein and means for automatically controlling the operation of the cold trap in dependence on the detected pressure whereby the sample is automatically condensed in the cold trap when it is present in a small quantity. Around the cold trap is conveniently a coolant passage through which coolant from a coolant reservoir is drawn.

Barrie, A.; Freedman, P.A.

1985-01-22

442

A media agent for automatically building a personalized semantic index of Web media objects  

Microsoft Academic Search

A novel idea of media agent is briefly presented, which can automatically build a personalized semantic index of Web media objects for each particular user. Because the Web is a rich source of multimedia data and the text content on the Web pages is usually semantically related to those media objects on the same pages, the media agent can automatically

Liu Wenyin; Zheng Chen; Mingjing Li; Hongjiang Zhang

2001-01-01

443

Boosting based text and non-text region classification  

NASA Astrophysics Data System (ADS)

Layout analysis is a crucial process for document image understanding and information retrieval. Document layout analysis depends on page segmentation and block classification. This paper describes an algorithm for extracting blocks from document images and a boosting based method to classify those blocks as machine printed text or not. The feature vector which is fed into the boosting classifier consists of a four direction run-length histogram, and connected components features in both background and foreground. Using a combination of features through a boosting classifier, we obtain an accuracy of 99.5% on our test collection.

Xie, Binqing; Agam, Gady

2011-01-01

444

Syntactic tools for text watermarking  

NASA Astrophysics Data System (ADS)

This paper explores the morphosyntactic tools for text watermarking and develops a syntax-based natural language watermarking scheme. Turkish, an agglutinative language, provides a good ground for the syntax-based natural language watermarking with its relatively free word order possibilities and rich repertoire of morphosyntactic structures. The unmarked text is first transformed into a syntactic tree diagram in which the syntactic hierarchies and the functional dependencies are coded. The watermarking software then operates on the sentences in syntax tree format and executes binary changes under control of Wordnet to avoid semantic drops. The key-controlled randomization of morphosyntactic tool order and the insertion of void watermark provide a certain level of security. The embedding capacity is calculated statistically, and the imperceptibility is measured using edit hit counts.

Meral, Hasan M.; Sevinç, Emre; Ünkar, Ersin; Sankur, Bülent; Özsoy, A. S.; Güngör, Tunga

2007-02-01

445

Price Theory: An Intermediate Text  

NSDL National Science Digital Library

David D. Friedman, Professor of Law at Santa Clara University, has made his textbook, "Price Theory: An Intermediate Text" available on the web. The book focuses on trying to teach students the "economic way of thinking" and the "analytical core of economics -- price theory." Topics covered include consumer choice, market structure and economic efficiency. Price Theory also contains chapters on less conventional topics such as the political marketplace, the economics of law and law breaking; and the economics of love and marriage.

Friedman, David D.

1990-01-01

446

Multilingual Authoring Using Feedback Texts  

Microsoft Academic Search

There are obvious reasons for trying to automate the production of multilingual documentation, especially for routine subject-matter in restricted domains (e.g. technical instructions). Two approaches have been adopted: Machine Translation (MT) of a source text, and Multilingual Natural Language Generation (M-NLG) from a knowledge base. For MT, information extraction is a major difficulty, since the meaning must be derived by

Richard Power; Donia Scott

1998-01-01

447

Decision Support via Text Mining  

Microsoft Academic Search

The growing volume of textual data presents genuine, modern day challenges that traditional decision support systems, focused on quantitative data processing, are unable to address. The costs of competitive intelligence, customer experience metrics, and manufacturing controls are escalating as organizations are buried in piles of open-ended responses, news articles and documents. The emerging field of text mining is capable of

Josh Froelich; Sergei Ananyan

448

[On two antique medical texts].  

PubMed

The two texts presented here--Regimento proueytoso contra ha pestenença [literally, "useful regime against pestilence"] and Modus curandi cum balsamo ["curing method using balm"]--represent the extent of Portugal's known medical library until circa 1530, produced in gothic letters by foreign printers: Germany's Valentim Fernandes, perhaps the era's most important printer, who worked in Lisbon between 1495 and 1518, and Germão Galharde, a Frenchman who practiced his trade in Lisbon and Coimbra between 1519 and 1560. Modus curandi, which came to light in 1974 thanks to bibliophile José de Pina Martins, is anonymous. Johannes Jacobi is believed to be the author of Regimento proueytoso, which was translated into Latin (Regimen contra pestilentiam), French, and English. Both texts are presented here in facsimile and in modern Portuguese, while the first has also been reproduced in archaic Portuguese using modern typographical characters. This philological venture into sixteenth-century medicine is supplemented by a scholarly glossary which serves as a valuable tool in interpreting not only Regimento proueytoso but also other texts from the era. Two articles place these documents in historical perspective. PMID:17500134

Rosa, Maria Carlota

449

Probabilistic Approaches for Modeling Text Structure and Their Application to Text-to-Text Generation  

Microsoft Academic Search

Since the early days of generation research, it has been acknowledged that modeling the global structure of a document is crucial for producing coherent, readable output. However, traditional knowledge-intensive approaches have been of limited utility in addressing this problem since they cannot be effectively scaled to operate in domain-independent, large-scale applications. Due to this difficulty, existing text-to-text generation systems rarely

Regina Barzilay

2010-01-01

450

Exploring the use of image text for biomedical literature retrieval.  

PubMed

In biomedical publications, figures and images concisely summarize a paper's experimental findings and results. Recent studies have therefore explored the use of images to assist in information retrieval (IR) in biomedicine, mostly based on mining the image caption content. We extend this approach by mining the image text, which refers to the text inside biomedical figures and images. In this work, we discuss the distinct advantages of using image text for biomedical IR and present a prototype search engine implementing the idea. PMID:18999241

Xu, Songhua; McCusker, James; Krauthammer, Michael

2008-11-06

451

Automatic Report Generation from Ontologies: The MIAKT Approach  

Microsoft Academic Search

This paper presented an approach for automatic generation of reports from domain ontologies encoded in Semantic Web standards like OWL. The paper identifies the challenges that need to be addressed when generating text from RDF and OWL and demonstrates how the ontology is used during the different stages of the generation process. The main contribution is in showing how NLG

Kalina Bontcheva; Yorick Wilks

2004-01-01

452

Automatic Identification and Organization of Index Terms for Interactive Browsing.  

ERIC Educational Resources Information Center

The potential of automatically generated indexes for information access has been recognized for several decades, but the quantity of text and the ambiguity of natural language processing have made progress at this task more difficult than was originally foreseen. Recently, a body of work on development of interactive systems to support phrase…

Wacholder, Nina; Evans, David K.; Klavans, Judith L.

453

Automatic Indexing and Content-Based Retrieval of Captioned Images  

Microsoft Academic Search

The interaction of textual and photographic information in an integrated text/image database environment is being explored. Specifically, our research group has developed an automatic indexing system for captioned pictures of people; the indexing information and other textual information is subsequently used in a content-based image retrieval system. Our approach presents an alternative to traditional face identification systems; it goes beyond

Rohini K. Srihari

1995-01-01

454

Automatic Knowledge Acquire System Oriented to Web Pages  

Microsoft Academic Search

The disordered organization of Web information has seriously hindered knowledge sharing and interoperability. This paper presents a knowledge-oriented Web page automatic acquisition system (AKAS2WP). The system includes four core modules: accessing of web pages, text extraction, management and organization of concepts, and attribute extraction for concepts. Accessing of Internet web

Zhu Junwu; Jiang Yi; Xu Yingying

2009-01-01

455

Enriching text with images and colored light  

NASA Astrophysics Data System (ADS)

We present an unsupervised method to enrich textual applications with relevant images and colors. The images are collected by querying large image repositories and subsequently the colors are computed using image processing. A prototype system based on this method is presented where the method is applied to song lyrics. In combination with a lyrics synchronization algorithm the system produces a rich multimedia experience. In order to identify terms within the text that may be associated with images and colors, we select noun phrases using a part of speech tagger. Large image repositories are queried with these terms. Per term representative colors are extracted using the collected images. Hereto, we either use a histogram-based or a mean shift-based algorithm. The representative color extraction uses the non-uniform distribution of the colors found in the large repositories. The images that are ranked best by the search engine are displayed on a screen, while the extracted representative colors are rendered on controllable lighting devices in the living room. We evaluate our method by comparing the computed colors to standard color representations of a set of English color terms. A second evaluation focuses on the distance in color between a queried term in English and its translation in a foreign language. Based on results from three sets of terms, a measure of suitability of a term for color extraction based on KL Divergence is proposed. Finally, we compare the performance of the algorithm using either the automatically indexed repository of Google Images and the manually annotated Flickr.com. Based on the results of these experiments, we conclude that using the presented method we can compute the relevant color for a term using a large image repository and image processing.

Sekulovski, Dragan; Geleijnse, Gijs; Kater, Bram; Korst, Jan; Pauws, Steffen; Clout, Ramon

2008-01-01

456

Automatically classifying emails into activities  

Microsoft Academic Search

Email-based activity management systems promise to give users better tools for managing increasing volumes of email, by organizing email according to a user's activities. Current activity management systems do not automatically classify incoming messages by the activity to which they belong, instead relying on simple heuristics (such as message threads), or asking the user to manually classify incoming messages

Mark Dredze; Tessa A. Lau; Nicholas Kushmerick

2006-01-01

457

Advanced automatic train protection system  

Microsoft Academic Search

Presents a new automatic train protection system. Such messages as the distance to the preceding train or the speed restriction on switches are transmitted to each train instead of the limit speed set for the train. This is done through the rails using digital codes. An on-board processor generates a braking pattern using these messages and the braking performance. And

I. Watanabe; T. Takashige

1994-01-01

458

Thesaurus based automatic keyphrase indexing  

Microsoft Academic Search

We propose a new method that enhances automatic keyphrase extraction by using semantic information on terms and phrases gleaned from a domain-specific thesaurus. We evaluate the results against keyphrase sets assigned by a state-of-the-art keyphrase extraction system and those assigned by six professional indexers.

Olena Medelyan; Ian H. Witten

2006-01-01

459

Automatic feature selection in neuroevolution  

Microsoft Academic Search

Feature selection is the process of finding the set of inputs to a machine learning algorithm that will yield the best performance. Developing a way to solve this problem automatically would make current machine learning methods much more useful. Previous efforts to automate feature selection rely on expensive meta-learning or are applicable only when labeled training data is available. This

Shimon Whiteson; Peter Stone; Kenneth O. Stanley; Risto Miikkulainen; Nate Kohl

2005-01-01

460

The Automaticity of Social Life  

Microsoft Academic Search

Much of social life is experienced through mental processes that are not intended and about which one is fairly oblivious. These processes are automatically triggered by features of the immediate social environment, such as the group memberships of other people, the qualities of their behavior, and features of social situations (e.g., norms, one's relative power). Recent research has shown these

John A. Bargh; Erin L. Williams

2006-01-01

461

Automatic Error Analysis Using Intervals  

ERIC Educational Resources Information Center

A technique for automatic error analysis using interval mathematics is introduced. A comparison to standard error propagation methods shows that in cases involving complicated formulas, the interval approach gives comparable error estimates with much less effort. Several examples are considered, and numerical errors are computed using the INTLAB…

Rothwell, E. J.; Cloud, M. J.

2012-01-01
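
The interval idea in the abstract above can be illustrated with a minimal sketch. This is not INTLAB (a MATLAB toolbox) and not the authors' code: the `Interval` class, its method names, and the sample voltage and current values are hypothetical, and the sketch ignores the outward rounding that a rigorous interval library applies to floating-point bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] bounding a measured quantity (hypothetical helper)."""
    lo: float
    hi: float

    @classmethod
    def from_measurement(cls, value, error):
        # A reading of value +/- error becomes the interval [value - error, value + error].
        return cls(value - error, value + error)

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # The extreme values of a product lie at the endpoint combinations.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        if other.lo <= 0.0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains zero")
        return self * Interval(1.0 / other.hi, 1.0 / other.lo)

    @property
    def midpoint(self):
        return (self.lo + self.hi) / 2.0

    @property
    def radius(self):
        # Half-width of the interval, read here as the propagated error bound.
        return (self.hi - self.lo) / 2.0


# Assumed example data: R = V / I with V = 10.0 +/- 0.1 and I = 2.0 +/- 0.05.
voltage = Interval.from_measurement(10.0, 0.1)
current = Interval.from_measurement(2.0, 0.05)
resistance = voltage / current
print(resistance.midpoint, resistance.radius)
```

Evaluating the formula once with interval operands yields the error bound directly, with no partial derivatives to work out by hand, which is the convenience the abstract refers to.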

462

Automatic recognition of film genres  

Microsoft Academic Search

Film genres in digital video can be detected automatically. In a three-step approach we analyze first the syntactic properties of digital films: color statistics, cut detection, camera motion, object motion and audio. In a second step we use these statistics to derive at a more abstract level film style attributes such as camera panning and zooming, speech and music. These

Stephan Fischer; Rainer Lienhart; Wolfgang Effelsberg

1995-01-01

463

Automatic validation of pipeline specifications  

Microsoft Academic Search

Recent approaches on language-driven Design Space Exploration (DSE) use Architectural Description Languages (ADL) to capture the processor architecture, generate automatically a software toolkit (including compiler, simulator, and assembler) for that processor, and provide feedback to the designer on the quality of the architecture. It is important to verify the ADL description of the processor to ensure the correctness of the

Prabhat Mishra; Nikil Dutt; Alex Nicolau

2001-01-01

464

Isolating Intrusions by Automatic Experiments  

Microsoft Academic Search

When dealing with malware infections, one of the first tasks is to find the processes that were involved in the attack. We introduce Malfor, a system that isolates those processes automatically. In contrast to other methods that help analyze attacks, Malfor works by experiments: first, we record the interaction of the system under attack; after the intrusion has been detected, we replay the recorded events

Stephan Neuhaus; Andreas Zeller

2006-01-01

465

An automatic adiabatic bomb calorimeter  

Microsoft Academic Search

The paper details the conversion of an existing isothermal bomb calorimeter to an adiabatic calorimeter with automatic control. Thermistors in the inner and outer vessels are included in two arms of an a.c. Wheatstone bridge: any rise in temperature of the inner vessel above that of the outer vessel results in phase reversal of the output voltage from the bridge,

W F Raymond; R J Canaway; C E Harris

1957-01-01

466

Automatic 'Descente Infinie' Induction Reasoning  

Microsoft Academic Search

We present a framework and a methodology to build and analyse automatic provers using the 'Descente Infinie' induction principle. A stronger connection between different proof techniques like those based on implicit induction and saturation is established by uniformly and explicitly representing them as applications of this principle. The framework offers a clear separation between logic and computation, by the

Sorin Stratulat

2005-01-01

467

Eating as an Automatic Behavior  

Microsoft Academic Search

The continued growth of the obesity epidemic at a time when obesity is highly stigmatizing should make us question the assumption that, given the right information and motivation, people can successfully reduce their food intake over the long term. An alternative view is that eating is an automatic behavior over which the environment has more control than do

Deborah A. Cohen; Thomas A. Farley

468

Automatic assembly planning with fasteners  

Microsoft Academic Search

The automatic assembly planning with fasteners (AAPF) prototype system is described. Given a model of a product to be manufactured, the AAPF system produces a high-level assembly sequence for producing that product. The model is defined in terms of constructive solid-geometry primitives and nut, bolt, and screw fastener primitives. The system works backwards by disassembling the finished product model. Under

Joseph M. Miller; Richard L. Hoffman

1989-01-01

469

Supporting the education evidence portal via text mining  

PubMed Central

The UK Education Evidence Portal (eep) provides a single, searchable, point of access to the contents of the websites of 33 organizations relating to education, with the aim of revolutionizing work practices for the education community. Use of the portal alleviates the need to spend time searching multiple resources to find relevant information. However, the combined content of the websites of interest is still very large (over 500,000 documents and growing). This means that searches using the portal can produce very large numbers of hits. As users often have limited time, they would benefit from enhanced methods of performing searches and viewing results, allowing them to drill down to information of interest more efficiently, without having to sift through potentially long lists of irrelevant documents. The Joint Information Systems Committee (JISC)-funded ASSIST project has produced a prototype web interface to demonstrate the applicability of integrating a number of text-mining tools and methods into the eep, to facilitate an enhanced searching, browsing and document-viewing experience. New features include automatic classification of documents according to a taxonomy, automatic clustering of search results according to similar document content, and automatic identification and highlighting of key terms within documents.

Ananiadou, Sophia; Thompson, Paul; Thomas, James; Mu, Tingting; Oliver, Sandy; Rickinson, Mark; Sasaki, Yutaka; Weissenbacher, Davy; McNaught, John

2010-01-01

470

Combining text clustering and retrieval for corpus adaptation  

NASA Astrophysics Data System (ADS)

Application-relevant text data are very useful in various natural language applications. Using them can significantly improve vocabulary selection and language modeling, which are widely employed in automatic speech recognition, intelligent input methods, etc. In some situations, however, the relevant data are hard to collect, and the scarcity of application-relevant training text makes these natural language processing tasks difficult. In this paper, using only a small set of application-specific text and combining unsupervised text clustering with text retrieval techniques, the proposed approach finds relevant text in a large, unorganized corpus and thereby adapts the training corpus towards the application area of interest. We use the performance of an n-gram statistical language model, trained on the retrieved text and tested on the application-specific text, to evaluate the relevance of the acquired text and thus to validate the effectiveness of our corpus adaptation approach. The language models trained from the ranked text bundles yield well-discriminated perplexities on the application-specific text. Preliminary experiments on short message text and a large unorganized corpus demonstrate the performance of the proposed methods.

He, Feng; Ding, Xiaoqing

2007-01-01
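
The evaluation idea in the abstract above (train a language model on each retrieved text bundle, then compare perplexities on the application-specific text) can be sketched as follows. This is a simplified illustration, not the authors' system: it substitutes a unigram model with add-one smoothing for their n-gram model, and the function names, the toy bundles, and the sample sentences are all assumptions made for the example.

```python
import math
from collections import Counter


def train_unigram(tokens):
    """Count token frequencies; stands in for training an n-gram model."""
    return Counter(tokens)


def perplexity(counts, vocab_size, test_tokens):
    """Perplexity of test_tokens under an add-one smoothed unigram model."""
    total = sum(counts.values())
    log_prob = 0.0
    for tok in test_tokens:
        # Counter returns 0 for unseen tokens, so smoothing covers them.
        p = (counts[tok] + 1) / (total + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(test_tokens))


# Assumed toy data: a small application-specific sample and two candidate bundles.
app_text = "see you at lunch ok".split()
bundles = {
    "chat": "ok see you soon at lunch see you".split(),
    "news": "the market closed lower after the report".split(),
}

# Shared vocabulary across the sample and all bundles.
vocab = set(app_text)
for toks in bundles.values():
    vocab.update(toks)

# Lower perplexity on the application text suggests a more relevant bundle.
ranked = sorted(
    bundles,
    key=lambda name: perplexity(train_unigram(bundles[name]), len(vocab), app_text),
)
print(ranked)
```

In this toy setup the bundle that shares vocabulary with the application sample gets the lower perplexity and is ranked first, which mirrors the "well-discriminated perplexities" the abstract reports for its ranked bundles.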

471

Identifying issue frames in text.  

PubMed

Framing, the effect of context on cognitive processes, is a prominent topic of research in psychology and public opinion research. Research on framing has traditionally relied on controlled experiments and manually annotated document collections. In this paper we present a method that allows for quantifying the relative strengths of competing linguistic frames based on corpus analysis. This method requires little human intervention and can therefore be efficiently applied to large bodies of text. We demonstrate its effectiveness by tracking changes in the framing of terror over time and comparing the framing of abortion by Democrats and Republicans in the U.S. PMID:23874909

Sagi, Eyal; Diermeier, Daniel; Kaufmann, Stefan

2013-07-16

472

Primary Students and Informational Texts  

NSDL National Science Digital Library

Anyone who has spent time looking into science books with young children has no doubt experienced the endless questions that the information and visuals in the books can stimulate. Can snakes climb trees? How do frogs hide from predators? Why do volcanoes erupt? Books prompt questions, which can lead to further reading about and investigations of science topics. Whether from a textbook or a nonfiction trade book, informational text can be the fuel that sparks curiosity about and interest in science, thus contributing to the development of science attitudes.

Yopp, Hallie K.; Yopp, Ruth H.

2006-11-01

473

CELT: Corpus of Electronic Texts  

NSDL National Science Digital Library

CELT, an "online database of contemporary and historical topics from many areas, including literature and the other arts," is aimed at the greatest possible range of readers, from academic scholars to the general public. Texts at the site can be searched, read on-screen, or downloaded. Other works available at CELT include essays by Michael Collins, the Dail debates on the 1921 Anglo-Irish Treaty, works by James Connolly and Padraic Pearse, and almost the whole corpus of Hiberno-Norman French poetry.

1997-01-01

474

Multi Sensor Information Integration and Automatic Understanding.  

National Technical Information Service (NTIS)

This program addresses Automatic Image Understanding and Automatic Integration of Disparate Sources of Information. The techniques are particularly focused on asymmetric warfare, urban warfare, guerrilla warfare, and port/base security, for which automati...

2006-01-01

475

Automatisms: bridging clinical neurology with criminal law.  

PubMed

The law, like neurology, grapples with the relationship between disease states and behavior. Sometimes, the two disciplines share the same terminology, such as automatism. In law, the "automatism defense" is a claim that action was involuntary or performed while unconscious. Someone charged with a serious crime can acknowledge committing the act and yet may go free if, relying on the expert testimony of clinicians, the court determines that the act of crime was committed in a state of automatism. In this review, we explore the relationship between the use of automatism in the legal and clinical literature. We close by addressing several issues raised by the automatism defense: semantic ambiguity surrounding the term automatism, the presence or absence of consciousness during automatisms, and the methodological obstacles that have hindered the study of cognition during automatisms. PMID:21145287

Rolnick, Joshua; Parvizi, Josef

2010-12-08

476

Manual and Automatic Lineament Mapping: Comparing Results  

NASA Astrophysics Data System (ADS)

A method for automatic lineament extraction using topographic data is applied on the Thaumasia plateau. A comparison is made between the results that are obtained from the automatic mapping approach and from a traditional tectonic lineament mapping.

Vaz, D. A.; di Achille, G.; Barata, M. T.; Alves, E. I.

2008-03-01

477

How automatic are crossmodal correspondences?  

PubMed

The last couple of years have seen a rapid growth of interest (especially amongst cognitive psychologists, cognitive neuroscientists, and developmental researchers) in the study of crossmodal correspondences - the tendency for our brains (not to mention the brains of other species) to preferentially associate certain features or dimensions of stimuli across the senses. By now, robust empirical evidence supports the existence of numerous crossmodal correspondences, affecting people's performance across a wide range of psychological tasks - in everything from the redundant target effect paradigm through to studies of the Implicit Association Test, and from speeded discrimination/classification tasks through to unspeeded spatial localisation and temporal order judgment tasks. However, one question that has yet to receive a satisfactory answer is whether crossmodal correspondences automatically affect people's performance (in all, or at least in a subset of tasks), as opposed to reflecting more of a strategic, or top-down, phenomenon. Here, we review the latest research on the topic of crossmodal correspondences to have addressed this issue. We argue that answering the question will require researchers to be more precise in terms of defining what exactly automaticity entails. Furthermore, one's answer to the automaticity question may also hinge on the answer to a second question: Namely, whether crossmodal correspondences are all 'of a kind', or whether instead there may be several different kinds of crossmodal mapping (e.g., statistical, structural, and semantic). Different answers to the automaticity question may then be revealed depending on the type of correspondence under consideration. We make a number of suggestions for future research that might help to determine just how automatic crossmodal correspondences really are. PMID:23370382

Spence, Charles; Deroy, Ophelia

2013-01-29

478

Text Mining for Discovery of Host–Pathogen Interactions  

Microsoft Academic Search

Text processing systems now supplement the information needs of professionals across a variety of industries. Applications such as relationship extraction, information retrieval, document summarization, question answering, and multilingual machine translation demonstrate practical utility in terms of accuracy and speed. Significant drivers behind these advances stem from performance improvements in underlying technologies such as syntactic parsing, named entity recognition, and semantic

Stephen Anthony; Vitali Sintchenko; Enrico Coiera

479

Efficient Index for Handwritten Text  

NASA Astrophysics Data System (ADS)

This paper deals with one of the new emerging multimedia data types, namely, handwritten cursive text. The paper presents two indexing methods for searching a collection of cursive handwriting. The first index, the word-level index, treats each word as a pictogram and uses global features for retrieval. The word-level index is suitable for large collections of cursive text. The second one, called the stroke-level index, treats the word as a set of strokes. The stroke-level index is more accurate, but more costly than the word-level index. Each word (or stroke) can be described with a set of features and, thus, can be stored as points in the feature space. The Karhunen-Loeve transform is then used to minimize the number of features used (data dimensionality) and thus the index size. Feature vectors are stored in an R-tree. We implemented both indexes and carried out many simulation experiments to measure the effectiveness and the cost of the search algorithm. The proposed indexes achieve substantial savings in search time over sequential search. Moreover, the proposed indexes improve the matching rate by up to 46% over sequential search.

Kamel, Ibrahim

480

Text Text Revolution: A Game That Improves Text Entry on Mobile Touchscreen Keyboards  

Microsoft Academic Search

Mobile devices often utilize touchscreen keyboards for text input. However, due to the lack of tactile feedback and generally small key sizes, users often produce typing errors. Key-target resizing, which dynamically adjusts the underlying target areas of the keys based on their probabilities, can significantly reduce errors, but requires training data in the form of touch points for intended keys.

Dmitry Rudchenko; Tim Paek; Eric Badger

481

Constructing biological knowledge bases by extracting information from text sources.  

PubMed

Recently, there has been much effort in making databases for molecular biology more accessible and interoperable. However, information in text form, such as MEDLINE records, remains a greatly underutilized source of biological information. We have begun a research effort aimed at automatically mapping information from text sources into structured representations, such as knowledge bases. Our approach to this task is to use machine-learning methods to induce routines for extracting facts from text. We describe two learning methods that we have applied to this task--a statistical text classification method, and a relational learning method--and our initial experiments in learning such information-extraction routines. We also present an approach to decreasing the cost of learning information-extraction routines by learning from "weakly" labeled training data. PMID:10786289

Craven, M; Kumlien, J

1999-01-01

482

Self-Compassion and Automatic Thoughts  

ERIC Educational Resources Information Center

The aim of this research is to examine the relationships between self-compassion and automatic thoughts. Participants were 299 university students. In this study, the Self-compassion Scale and the Automatic Thoughts Questionnaire were used. The relationships between self-compassion and automatic thoughts were examined using correlation analysis…

Akin, Ahmet

2012-01-01

484

From Automatic Structures to Borel Structures  

Microsoft Academic Search

We study the classes of Büchi and Rabin automatic structures. For Büchi (Rabin) automatic structures their domains consist of infinite strings (trees), and the basic relations, including the equality relation, and graphs of operations are recognized by Büchi (Rabin) automata. A Büchi (Rabin) automatic structure is injective if different infinite strings (trees) represent different elements

Greg Hjorth; Bakhadyr Khoussainov; Antonio Montalbán; André Nies

2008-01-01

485

Text mining in livestock animal science: introducing the potential of text mining to animal sciences.  

PubMed

In biological research, establishing the prior art by searching and collecting information already present in the domain has equal importance as the experiments done. To obtain a complete overview about the relevant knowledge, researchers mainly rely on 2 major information sources: i) various biological databases and ii) scientific publications in the field. The major difference between the 2 information sources is that information from databases is available, typically well structured and condensed. The information content in scientific literature is vastly unstructured; that is, dispersed among the many different sections of scientific text. The traditional method of information extraction from scientific literature occurs by generating a list of relevant publications in the field of interest and manually scanning these texts for relevant information, which is very time consuming. It is more than likely that in using this "classical" approach the researcher misses some relevant information mentioned in the literature or has to go through biological databases to extract further information. Text mining and named entity recognition methods have already been used in human genomics and related fields as a solution to this problem. These methods can process and extract information from large volumes of scientific text. Text mining is defined as the automatic extraction of previously unknown and potentially useful information from text. Named entity recognition (NER) is defined as the method of identifying named entities (names of real world objects; for example, gene/protein names, drugs, enzymes) in text. In animal sciences, text mining and related methods have been briefly used in murine genomics and associated fields, leaving behind other fields of animal sciences, such as livestock genomics. 
The aim of this work was to develop an information retrieval platform in the livestock domain focusing on livestock publications and the recognition of relevant data from cattle and pigs. For this purpose, the rather noncomprehensive resources of pig and cattle gene and protein terminologies were enriched with orthologue synonyms, integrated in the NER platform, ProMiner, which is successfully used in human genomics domain. Based on the performance tests done, the present system achieved a fair performance with precision 0.64, recall 0.74, and F(1) measure of 0.69 in a test scenario based on cattle literature. PMID:22665627

Sahadevan, S; Hofmann-Apitius, M; Schellander, K; Tesfaye, D; Fluck, J; Friedrich, C M

2012-06-04
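
As a quick consistency check on the figures reported in the abstract above, the F1 measure is the harmonic mean of precision and recall. The sketch below uses a hypothetical helper name (`f1_score`) and only the two values quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract: precision 0.64, recall 0.74.
reported_f1 = f1_score(0.64, 0.74)
print(round(reported_f1, 2))  # 2 * 0.64 * 0.74 / 1.38 is about 0.686, which rounds to 0.69
```

The result agrees with the F1 of 0.69 stated for the cattle literature test scenario.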

486

Automatic processing, analysis, and recognition of images  

NASA Astrophysics Data System (ADS)

New approaches and computer codes (A&CC) for automatic processing, analysis and recognition of images are offered. The A&CC are based on representing an object image as a collection of pixels of various colours and on consecutive automatic painting of the distinct parts of the image. The technical objectives of the A&CC centre on 1) image processing, 2) image feature extraction, 3) image analysis, and some others, in any order and combination. The A&CC make it possible to obtain various geometrical and statistical parameters of an object image and its parts. Additional uses of the A&CC involve artificial neural network technologies. We believe that the A&CC can be used to create testing and control systems in various fields of industry and military applications (airborne imaging systems, tracking of moving objects), in medical diagnostics, in creating new software for CCDs, in industrial vision, in creating decision-making systems, etc. The capabilities of the A&CC have been tested on image analysis of model fires and plumes of sprayed fluid and of ensembles of particles, on decoding interferometric images, on digitizing paper diagrams of electrical signals, on text recognition, on image noise removal and filtering, on analysis of astronomical images and aerial photography, and on object detection.

Abrukov, Victor S.; Smirnov, Evgeniy V.; Ivanov, Dmitriy G.

2004-11-01

487

Behavioral Summarized Evaluation: An Assessment Tool to Enhance Multidisciplinary and Parent-Professional Collaborations in Assessing Symptoms of Autism  

Microsoft Academic Search

The Behavioral Summarized Evaluation (BSE) was developed by Barthelemy and colleagues (Barthelemy et al., 1990; Barthelemy et al., 1997) for professionals and paraprofessionals to assess autism symptom severity over the course of treatment. The purpose of this article is to review empirical literature examining the utility of this psychometric instrument in research and clinical practice, with an emphasis on the

Roger N. Reeb; Susan F. Folger; Brent J. Oneal

2009-01-01

488

The Effect of Summarization on Intermediate EFL Learners' Reading Comprehension and Their Performance on Display, Referential and Inferential Questions  

ERIC Educational Resources Information Center

This study examined the effect of summarization as a generative learning strategy of the readers' performance on reading comprehension, in general, and reading comprehension display, referential and inferential questions in particular. The subjects in this study were 61 high school students. They were assigned to two groups--control and…

Ghabanchi, Zargham; Mirza, Fateme Haji

2010-01-01

489

A Novel Video Summarization Based on Mining the Story-Structure and Semantic Relations Among Concept Entities  

Microsoft Academic Search

Video summarization techniques have been proposed for years to offer people comprehensive understanding of the whole story in the video. Roughly speaking, existing approaches can be classified into the two types: one is static storyboard, and the other is dynamic skimming. However, despite that these traditional methods give brief summaries for users, they still do not provide with a concept-organized

Bo-Wei Chen; Jia-Ching Wang; Jhing-Fa Wang

2009-01-01

490

Relationship between summarizing chemical parameters like AOX, TOC, TN b , and toxicity tests for effluents from the chemical production  

Microsoft Academic Search

Therefore this study was undertaken to investigate, whether there are correlations between summarizing parameters, which potentially indicate hazardous water components, like AOX (adsorbable organic halogen), TOC (total organic carbon) and TN b (total bound nitrogen) and biological effects, observed from different bioassays conducted in the laboratory. The toxic effects were investigated on luminescent bacteria, microcrustaceans and algae in accordance with

G. Gellert

2000-01-01

491

Automated learning of decision rules for text categorization  

Microsoft Academic Search

We describe the results of extensive experiments using optimized rule-based induction methods on large document collections. The goal of these methods is to discover automatically classification patterns that can be used for general document categorization or personalized filtering of free text. Previous reports indicate that human-engineered rule-based systems, requiring many man-years of developmental efforts, have been successfully built to "read"

Chidanand Apte; Fred J. Damerau; Sholom M. Weiss

1994-01-01

492

Comparing Natural Language Processing Tools to Extract Medical Problems from Narrative Text  

Microsoft Academic Search

To help maintain a complete, accurate and timely Problem List, we are developing a system to automatically retrieve medical problems from free-text documents. This system uses Natural Language Processing to analyze all electronic narrative text documents in a patient's record. Here we evaluate and compare 3 different applications of NLP technology in our system: the first using MMTx (MetaMap Transfer)

Stéphane M. Meystre; Peter J. Haug

2005-01-01

493

Differences in Text Structure and Its Implications for Assessment of Struggling Readers  

ERIC Educational Resources Information Center

One source of potential difficulty for struggling readers is the variability of texts across grade levels. This article explores the use of automatic natural language processing techniques to identify dimensions of variation within a corpus of school-appropriate texts. Specifically, we asked: Are there identifiable dimensions of lexical and…

Deane, Paul; Sheehan, Kathleen M.; Sabatini, John; Futagi, Yoko; Kostin, Irene

2006-01-01

494

Information extraction for Chinese free text based on pattern match combine with heuristic information  

Microsoft Academic Search

Describes an approach to information extraction for Chinese free text. An automatic learning algorithm of pattern rules for Chinese free text and employment of heuristic information are described. The method that combines pattern rule matching with heuristic information is utilized to perform the information extraction task. Experimental results have proved this method to be effective in improving the extraction result

Ying Yu; Xiao-Long Wang; Yi Guan

2002-01-01

495

Text Detection from Natural Scene Images: Towards a System for Visually Impaired Persons  

Microsoft Academic Search

We propose a system that reads the text encountered in natural scenes with the aim of providing assistance to visually impaired persons. This paper describes the system design and evaluates several character extraction methods. Automatic text recognition from natural images is receiving growing attention because of potential applications in image retrieval, robotics and intelligent transport systems.

Nobuo Ezaki; Marius Bulacu; Lambert Schomaker

2004-01-01

496

A text detection, localization and segmentation system for OCR in images  

Microsoft Academic Search

One way to incorporate semantic knowledge into the process of indexing databases of digital images is to use caption text, since it provides important information about the image content and is well suited to keyword-based queries. In this paper, we propose an approach to automatically localize, segment and binarize text appearing in complex images. First, an

Julinda Gllavata; Ralph Ewerth; Bernd Freisleben

2004-01-01

497

A decision-tree-based symbolic rule induction system for text categorization  

Microsoft Academic Search

We present a decision-tree-based symbolic rule induction system whose purpose is to categorize text documents automatically. Our method for rule induction involves the novel combination of (1) a fast decision tree induction algorithm especially suited to text data and (2) a new method for converting a decision tree to a rule set that is simplified, but still logically equivalent to, the original tree. We report

David E. Johnson; Frank J. Oles; Tong Zhang; Thilo Götz

2002-01-01

498

AVATAR: Using text analytics to bridge the structured-unstructured divide  

Microsoft Academic Search

There is a growing need in enterprise applications to query and analyze seamlessly across structured and unstructured data. We propose an information system in which text analytics bridges the structured-unstructured divide. Annotations extracted by text analytic engines, with associated uncertainty, are automatically ingested into a structured data store. We propose an interface that is capable of supporting

Huaiyu Zhu; Sriram Raghavan; Shivakumar Vaithyanathan; Jayram S. Thathachar; Rajasekar Krishnamurthy; Rahul Gupta; Krishna P. Chitrapura

499

Automatic design of magazine covers  

NASA Astrophysics Data System (ADS)

In this paper, we propose a system for automatic design of magazine covers that quantifies a number of concepts from art and aesthetics. Our solution to automatic design of this type of media has been shaped by input from professional designers, magazine art directors and editorial boards, and journalists. Consequently, a number of principles in design and rules in designing magazine covers are delineated. Several techniques are derived and employed in order to quantify and implement these principles and rules in the format of a software framework. At this stage, our framework divides the task of design into three main modules: layout of magazine cover elements, choice of color for masthead and cover lines, and typography of cover lines. Feedback from professional designers on our designs suggests that our results are congruent with their intuition.

Jahanian, Ali; Liu, Jerry; Tretter, Daniel R.; Lin, Qian; Damera-Venkata, Niranjan; O'Brien-Strain, Eamonn; Lee, Seungyon; Fan, Jian; Allebach, Jan P.

2012-02-01

500

AUTO: Automatic script generation system  

NASA Astrophysics Data System (ADS)

This technical manual describes an automatic script generation system (Auto) for guiding the physical design of a printed circuit board. Auto accepts a printed circuit board design as specified in a netlist and partslist and returns a script to automatically provide all the necessary commands and file specifications required by Harris EDA's Finesse CAD system for placing and routing the printed circuit board. Auto insulates the designer from learning the details of commercial CAD systems, allows designers to modify the script for customized design entry, and performs format and completeness checking of the design files. This technical manual contains a complete tutorial/design example describing how to use the Auto system and also contains appendices describing the format of files required by the Finesse CAD system.

Granacki, John; Hom, Ivan; Kazi, Tauseef

1993-11-01