Font adaptive word indexing of modern printed documents.
Marinai, Simone; Marino, Emanuele; Soda, Giovanni
2006-08-01
We propose an approach for the word-level indexing of modern printed documents which are difficult to recognize using current OCR engines. By means of word-level indexing, it is possible to retrieve the position of words in a document, enabling queries involving proximity of terms. Web search engines implement this kind of indexing, allowing users to retrieve Web pages on the basis of their textual content. Nowadays, digital libraries hold collections of digitized documents that can be retrieved either by browsing the document images or relying on appropriate metadata assembled by domain experts. Word indexing tools would therefore increase the access to these collections. The proposed system is designed to index homogeneous document collections by automatically adapting to different languages and font styles without relying on OCR engines for character recognition. The approach is based on three main ideas: the use of Self Organizing Maps (SOM) to perform unsupervised character clustering, the definition of one suitable vector-based word representation whose size depends on the word aspect-ratio, and the run-time alignment of the query word with indexed words to deal with broken and touching characters. The most appropriate applications are for processing modern printed documents (17th to 19th centuries) where current OCR engines are less accurate. Our experimental analysis addresses six data sets containing documents ranging from books of the 17th century to contemporary journals.
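The link between word positions and proximity queries mentioned in the abstract can be illustrated with a minimal positional index. This is only a sketch over plain text: the system described above indexes word images rather than character strings, and the names `build_positional_index` and `within_distance` are hypothetical.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each word to {doc_id: [positions]} (hypothetical structure)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index


def within_distance(index, first, second, max_gap):
    """Return sorted ids of documents where both words occur at most max_gap positions apart."""
    postings_a = index.get(first, {})
    postings_b = index.get(second, {})
    hits = set()
    for doc_id in postings_a.keys() & postings_b.keys():
        if any(abs(a - b) <= max_gap
               for a in postings_a[doc_id]
               for b in postings_b[doc_id]):
            hits.add(doc_id)
    return sorted(hits)


# Toy collection (invented for illustration).
docs = {
    "d1": "self organizing maps cluster character images",
    "d2": "maps of the old city and hand drawn character sketches",
}
index = build_positional_index(docs)
print(within_distance(index, "maps", "character", 2))
```

Storing positions rather than only document ids is what makes the proximity test possible; a plain word-to-document index could not distinguish the two toy documents.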
ERIC Educational Resources Information Center
Morocco, Catherine Cobb; And Others
The 2-year study investigated the use of word processing technology with 36 learning disabled (LD) intermediate grade children and 9 remedial teachers in five Massachusetts school districts. During the first year study staff documented how word processing was being used. In the second year, word processing activities hypothesized to be the most…
Zagoris, Konstantinos; Pratikakis, Ioannis; Gatos, Basilis
2017-05-03
Word spotting strategies employed in historical handwritten documents face many challenges due to variation in the writing style and intense degradation. In this paper, a new method that permits effective word spotting in handwritten documents is presented. It relies upon document-oriented local features, which take into account information around representative keypoints, as well as a matching process that incorporates spatial context in a local proximity search without using any training data. Experimental results on four historical handwritten datasets for two different scenarios (segmentation-based and segmentation-free) using standard evaluation measures show the improved performance achieved by the proposed methodology.
Automated software system for checking the structure and format of ACM SIG documents
NASA Astrophysics Data System (ADS)
Mirza, Arsalan Rahman; Sah, Melike
2017-04-01
Microsoft (MS) Office Word is one of the most commonly used software tools for creating documents. MS Word 2007 and above uses XML to represent the structure of MS Word documents. Metadata about the documents are automatically created using Office Open XML (OOXML) syntax. We develop a new framework, called ADFCS (Automated Document Format Checking System), that takes advantage of the OOXML metadata in order to extract semantic information from MS Office Word documents. In particular, we develop a new ontology for Association for Computing Machinery (ACM) Special Interest Group (SIG) documents for representing the structure and format of these documents by using OWL (Web Ontology Language). Then, the metadata is extracted automatically in RDF (Resource Description Framework) according to this ontology using the developed software. Finally, we generate extensive rules in order to infer whether the documents are formatted according to ACM SIG standards. This paper introduces the ACM SIG ontology, the metadata extraction process, the inference engine, the ADFCS online user interface, the system evaluation and the user study evaluations.
ERIC Educational Resources Information Center
Herrera-Viedma, Enrique; Peis, Eduardo
2003-01-01
Presents a fuzzy evaluation method of SGML documents based on computing with words. Topics include filtering the amount of information available on the Web to assist users in their search processes; document type definitions; linguistic modeling; user-system interaction; and use with XML and other markup languages. (Author/LRW)
"What is relevant in a text document?": An interpretable machine learning approach
Arras, Leila; Horn, Franziska; Montavon, Grégoire; Müller, Klaus-Robert
2017-01-01
Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, making it possible to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text’s category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. Resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores for generating novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability which makes it more comprehensible for humans and potentially more useful for other applications. PMID:28800619
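For a linear bag-of-words classifier, the idea of tracing a decision back to individual words can be sketched very simply: the score is a sum of per-word terms, so each word's term can serve as its relevance. The sketch below assumes this plain decomposition (weight times count); it is not claimed to be the exact LRP rule used in the paper, and the weights and counts are invented for illustration.

```python
def word_relevances(weights, counts, bias=0.0):
    """Split a linear score w.x + b into per-word contributions w_i * x_i.

    weights: hypothetical learned weight per vocabulary word
    counts:  bag-of-words counts for one document
    """
    contributions = {
        word: weights.get(word, 0.0) * count
        for word, count in counts.items()
    }
    score = sum(contributions.values()) + bias
    return score, contributions


# Invented weights for a hypothetical "space" topic classifier.
weights = {"orbit": 1.5, "rocket": 2.0, "goal": -1.0}
counts = {"rocket": 2, "goal": 1, "the": 5}

score, relevance = word_relevances(weights, counts)
for word, value in sorted(relevance.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {value:+.1f}")
print("score:", score)
```

Words absent from the weight table (here "the") receive zero relevance, which matches the intuition that they do not move the decision either way.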
Aye, Aye, Aye, Aye: Orthography Enhances Rapid Word Reading in an Exploratory Study.
ERIC Educational Resources Information Center
Neuhaus, Graham F.; Post, Yolanda
2003-01-01
Uses a novel word-reading efficiency measure to determine if articulations or processing times associated with reading the word "aye" were enhanced through the phonological or orthographic qualities contained in the preceding word. Documents the importance of separating phonological and orthographic information in English homophones. (SG)
Fast words boundaries localization in text fields for low quality document images
NASA Astrophysics Data System (ADS)
Ilin, Dmitry; Novikov, Dmitriy; Polevoy, Dmitry; Nikolaev, Dmitry
2018-04-01
The paper examines the problem of precise localization of word boundaries in document text zones. Document processing on a mobile device consists of document localization, perspective correction, localization of individual fields, finding words in separate zones, segmentation and recognition. When an image is captured with a mobile digital camera under uncontrolled conditions, digital noise, perspective distortions or glare may occur. Further processing is complicated by the specifics of the documents: layout elements, complex background, static text, document security elements, and a variety of text fonts. However, the problem of word boundaries localization has to be solved at runtime on a mobile CPU with limited computing capabilities under specified restrictions. At the moment, there are several groups of methods optimized for different conditions. Methods for scanned printed text are fast but limited to high-quality images. Methods for text in the wild have an excessively high computational complexity and thus are hardly suitable for running on mobile devices as part of a mobile document recognition system. The method presented in this paper solves a more specialized problem than the task of finding text in natural images. It uses local features, a sliding window and a lightweight neural network in order to achieve an optimal algorithm speed-precision ratio. The algorithm takes 12 ms per field on an ARM processor of a mobile device. The error rate for boundaries localization on a test sample of 8000 fields is 0.3
Wang OIS glossary package for reformatting documents telecommunicated to the OIS system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Markow, S.R.
1983-12-09
Documents that are composed on a computer and then transmitted by telecommunications into a Wang Office Information System (OIS) word processing system need to be reformatted and cleaned up before they can be used properly as word processing documents suitable for further revisions or additions. This report describes a group of glossary entries created for the Wang OIS which simplifies the job of cleaning up telecommunicated documents. This glossary is a semi-automated process designed to eliminate most of the tedious work needed to be performed in removing extra spaces and returns, adjusting formats, moving material, repagination, using tabs or indents, and similar problems. The report briefly discusses the problems, describes the glossary approach to solving them, and gives instructions for actually using the glossary entries.
10 CFR 2.1011 - Management of electronic information.
Code of Federal Regulations, 2013 CFR
2013-01-01
... participants shall make textual (or, where non-text, image) versions of their documents available on a web... of the following acceptable formats: ASCII, native word processing (Word, WordPerfect), PDF Normal, or HTML. (iv) Image files must be formatted as TIFF CCITT G4 for bi-tonal images or PNG (Portable...
10 CFR 2.1011 - Management of electronic information.
Code of Federal Regulations, 2014 CFR
2014-01-01
... participants shall make textual (or, where non-text, image) versions of their documents available on a web... of the following acceptable formats: ASCII, native word processing (Word, WordPerfect), PDF Normal, or HTML. (iv) Image files must be formatted as TIFF CCITT G4 for bi-tonal images or PNG (Portable...
An Evaluation of the UMLS in Representing Corpus Derived Clinical Concepts
Friedlin, Jeff; Overhage, Marc
2011-01-01
We performed an evaluation of the Unified Medical Language System (UMLS) in representing concepts derived from medical narrative documents from three domains: chest x-ray reports, discharge summaries and admission notes. We detected concepts in these documents by identifying noun phrases (NPs) and N-grams, including unigrams (single words), bigrams (word pairs) and trigrams (word triples). After removing NPs and N-grams that did not represent discrete clinical concepts, we processed the remaining with the UMLS MetaMap program. We manually reviewed the results of MetaMap processing to determine whether MetaMap found full, partial or no representation of the concept. For full representations, we determined whether post-coordination was required. Our results showed that a large portion of concepts found in clinical narrative documents are either unrepresented or poorly represented in the current version of the UMLS Metathesaurus and that post-coordination was often required in order to fully represent a concept. PMID:22195097
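The unigram, bigram and trigram candidates described above can be generated with a short sliding-window helper. This is a minimal sketch: the function name `ngrams` and the sample phrase are illustrative assumptions, and noun-phrase detection and MetaMap processing are not covered.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented report fragment, tokenized on whitespace.
tokens = "no acute cardiopulmonary disease".split()

unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(bigrams)
print(trigrams)
```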
GeoSegmenter: A statistically learned Chinese word segmenter for the geoscience domain
NASA Astrophysics Data System (ADS)
Huang, Lan; Du, Youfu; Chen, Gongyang
2015-03-01
Unlike English, the Chinese language has no space between words. Segmenting texts into words, known as the Chinese word segmentation (CWS) problem, thus becomes a fundamental issue for processing Chinese documents and the first step in many text mining applications, including information retrieval, machine translation and knowledge acquisition. However, for the geoscience subject domain, the CWS problem remains unsolved. Although generic segmenters can be applied to process geoscience documents, they lack domain-specific knowledge and consequently their segmentation accuracy drops dramatically. This motivated us to develop a segmenter specifically for the geoscience subject domain: the GeoSegmenter. We first proposed a generic two-step framework for domain-specific CWS. Following this framework, we built GeoSegmenter using conditional random fields, a principled statistical framework for sequence learning. Specifically, GeoSegmenter first identifies general terms by using a generic baseline segmenter. Then it recognises geoscience terms by learning and applying a model that can transform the initial segmentation into the goal segmentation. Empirical experimental results on geoscience documents and benchmark datasets showed that GeoSegmenter could effectively recognise both geoscience terms and general terms.
When "Veps" Cry: Two-Year-Olds Efficiently Learn Novel Words from Linguistic Contexts Alone
ERIC Educational Resources Information Center
Ferguson, Brock; Graf, Eileen; Waxman, Sandra R.
2018-01-01
We assessed 24-month-old infants' lexical processing efficiency for both novel and familiar words. Prior work documented that 19-month-olds successfully identify referents of familiar words (e.g., The dog is so little) as well as novel words whose meanings were informed only by the surrounding sentence (e.g., The vep is crying), but that the speed…
Clustering of Farsi sub-word images for whole-book recognition
NASA Astrophysics Data System (ADS)
Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier
2015-01-01
Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however, these techniques do not use important pictorial information that exists in the text image. In the case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a sub-word image clustering method for applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm for measuring the distance between two sub-word images. The measured distance, together with a set of simple shape features, was used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with a totally different font face. We also show that the number of newly created clusters in a page can be used as a criterion for assessing the quality of print and evaluating preprocessing phases.
Information extraction and knowledge graph construction from geoscience literature
NASA Astrophysics Data System (ADS)
Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo; Chen, Jingwen
2018-03-01
Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there are limited works on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies for that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining the generic and geology terms from geology dictionaries to train Chinese word segmentation rules of the Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to get a corpus constituted of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected the chord and bigram graphs to visualize the content-words and their links as nodes and edges in a knowledge graph, respectively. The resulting graph presents a clear overview of key information in an unstructured document. This study proves the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
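The second and third steps of the workflow (dropping stop-words, then linking content-words) can be sketched as follows. This is an illustration under stated assumptions: adjacent-pair counting stands in for the unspecified statistical method, English tokens stand in for the output of a Chinese word segmenter, and `content_bigrams` is a hypothetical helper name.

```python
from collections import Counter


def content_bigrams(tokens, stop_words):
    """Drop stop-words, then count adjacent content-word pairs as graph edges."""
    content = [t for t in tokens if t not in stop_words]
    return Counter(zip(content, content[1:]))


# Stand-in for segmenter output; a real pipeline would supply segmented Chinese words.
tokens = ["the", "granite", "intrusion", "of", "the",
          "granite", "intrusion", "hosts", "gold"]
stop_words = {"the", "of"}

edges = content_bigrams(tokens, stop_words)
for (left, right), weight in edges.most_common():
    print(left, "->", right, weight)
```

Each distinct pair becomes an edge between two content-word nodes, and the count gives an edge weight that a chord or bigram graph could display. Note that removing stop-words first makes some originally non-adjacent words adjacent, which is a design choice of this sketch.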
Document image retrieval through word shape coding.
Lu, Shijian; Li, Linlin; Tan, Chew Lim
2008-11-01
This paper presents a document retrieval technique that is capable of searching document images without OCR (optical character recognition). The proposed technique retrieves document images by a new word shape coding scheme, which captures the document content through annotating each word image by a word shape code. In particular, we annotate word images by using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. With the annotated word shape codes, document images can be retrieved by either query keywords or a query document image. Experimental results show that the proposed document image retrieval technique is fast, efficient, and tolerant to various types of document degradation.
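A rough text-level analogue of word shape coding is sketched below: each letter is reduced to a coarse shape class (ascender, descender, or neither) plus a flag for an enclosed hole. The letter sets and the symbols `A`, `D`, `x`, `o` are assumptions chosen for illustration; the paper derives its codes from word images and also uses water reservoirs, which this sketch omits.

```python
# Hypothetical letter classes for a typical lowercase Latin typeface.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")
HOLES = set("abdegopq")


def shape_code(word):
    """Encode a word by per-letter shape class, ignoring letter identity."""
    code = []
    for ch in word.lower():
        if ch in ASCENDERS:
            symbol = "A"
        elif ch in DESCENDERS:
            symbol = "D"
        else:
            symbol = "x"
        if ch in HOLES:
            symbol += "o"  # mark an enclosed hole
        code.append(symbol)
    return "".join(code)


for word in ("dog", "bag", "cat"):
    print(word, shape_code(word))
```

In this toy coding "dog" and "bag" receive the same code, which shows why such codes trade some ambiguity for robustness: retrieval matches shapes, not recognized characters.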
The nature of compounds: a psychocentric perspective.
Libben, Gary
2014-01-01
Although compound words often seem to be words that themselves contain words, this paper argues that this is not the case for the vast majority of lexicalized compounds. Rather, it is claimed that as a result of acts of lexical processing, the constituents of compound words develop into new lexical representations. These representations are bound to specific morphological roles and positions (e.g., head, modifier) within a compound word. The development of these positionally bound compound constituents creates a rich network of lexical knowledge that facilitates compound processing and also creates some of the well-documented patterns in the psycholinguistic and neurolinguistic study of compounding.
An overview of selected information storage and retrieval issues in computerized document processing
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Ihebuzor, Valentine U.
1984-01-01
The rapid development of computerized information storage and retrieval techniques has introduced the possibility of extending the word processing concept to document processing. A major advantage of computerized document processing is the relief of the tedious task of manual editing and composition usually encountered by traditional publishers through the immense speed and storage capacity of computers. Furthermore, computerized document processing provides an author with centralized control, the lack of which is a handicap of the traditional publishing operation. A survey of some computerized document processing techniques is presented with emphasis on related information storage and retrieval issues. String matching algorithms are considered central to document information storage and retrieval and are also discussed.
Software Process Automation: Experiences from the Trenches.
1996-07-01
Integration of problem database Weaver tions) J Process WordPerfect, All-in-One, Oracle, CM Integration of tools Weaver System K Process Framemaker, CM...handle change requests and problem reports. * Autoplan, a project management tool * Framemaker, a document processing system * Worldview, a document...Cadre, Team Work, FrameMaker, something for requirements traceability, their own homegrown scheduling tool, and their own homegrown tool integrator
Enhancement of Text Representations Using Related Document Titles.
ERIC Educational Resources Information Center
Salton, G.; Zhang, Y.
1986-01-01
Briefly reviews various methodologies for constructing enhanced document representations, discusses their general lack of usefulness, and describes a method of document indexing which uses title words taken from bibliographically related items. Evaluation of this process indicates that it is not sufficiently reliable to warrant incorporation into…
ERIC Educational Resources Information Center
Lee, Jesse
2013-01-01
The goal of this study was to find and trace word order patterns in Possessive Noun Phrases ("PNP's") in formulaic language within notarial documents dating from the tenth through the thirteenth centuries, originating from the Monastery of Sahagun, Leon, Spain. The overall results show clear trends, which reveal a diachronic process that…
Fracture Testing of Large-Scale Thin-Sheet Aluminum Alloy (MS Word file)
DOT National Transportation Integrated Search
1996-02-01
Word Document; A series of fracture tests on large-scale, precracked, aluminum alloy panels were carried out to examine and characterize the process by which cracks propagate and link up in this material. Extended grips and test fixtures were special...
Determining Fuzzy Membership for Sentiment Classification: A Three-Layer Sentiment Propagation Model
Zhao, Chuanjun; Wang, Suge; Li, Deyu
2016-01-01
Enormous quantities of review documents exist in forums, blogs, twitter accounts, and shopping web sites. Analysis of the sentiment information hidden in these review documents is very useful for consumers and manufacturers. The sentiment orientation and sentiment intensity of a review can be described in more detail by using a sentiment score than by using bipolar sentiment polarity. Existing methods for calculating review sentiment scores frequently use a sentiment lexicon or the locations of features in a sentence, a paragraph, and a document. In order to achieve more accurate sentiment scores of review documents, a three-layer sentiment propagation model (TLSPM) is proposed that uses three kinds of interrelations, those among documents, topics, and words. First, we use nine relationship pairwise matrices between documents, topics, and words. In TLSPM, we suppose that sentiment neighbors tend to have the same sentiment polarity and similar sentiment intensity in the sentiment propagation network. Then, we implement the sentiment propagation processes among the documents, topics, and words in turn. Finally, we can obtain the steady sentiment scores of documents by a continuous iteration process. Intuition might suggest that documents with strong sentiment intensity make larger contributions to classification than those with weak sentiment intensity. Therefore, we use the fuzzy membership of documents obtained by TLSPM as the weight of the text to train a fuzzy support vector machine model (FSVM). As compared with a support vector machine (SVM) and four other fuzzy membership determination methods, the results show that FSVM trained with TLSPM can enhance the effectiveness of sentiment classification. In addition, FSVM trained with TLSPM can reduce the mean square error (MSE) on seven sentiment rating prediction data sets. PMID:27846225
Van Wicklin, Sharon A
2016-05-01
Variations in documenting surgical wound classification Key words: surgical wound classification, clean, clean-contaminated, contaminated, dirty. Wearing long-sleeved jackets while preparing and packaging items for sterilization Key words: long-sleeved jackets, organic material, sterile processing. Endoscopic transmission of prions Key words: prions, high-risk tissue, low-risk tissue, Creutzfeldt-Jakob disease (CJD), variant Creutzfeldt-Jakob disease (vCJD). Wearing gloves when handling flexible endoscopes Key words: gloves, low-protein, powder-free, natural rubber latex gloves, latex-free gloves. Copyright © 2016 AORN, Inc. Published by Elsevier Inc. All rights reserved.
Orena, E F; Caldiroli, D; Acerbi, F; Barazzetta, I; Papagno, C
2018-06-05
Neuropsychological, neuroimaging and electrophysiological studies demonstrate that abstract and concrete word processing relies not only on the activity of a common bilateral network but also on dedicated networks. The neuropsychological literature has shown that a selective sparing of abstract relative to concrete words can be documented in lesions of the left anterior temporal regions. We investigated concrete and abstract word processing in 10 patients undergoing direct electrical stimulation (DES) for brain mapping during awake surgery in the left hemisphere. A lexical decision and a concreteness judgment task were added to the neuropsychological assessment during intra-operative monitoring. On the concreteness judgment, DES delivered over the inferior frontal gyrus significantly decreased abstract word accuracy while accuracy for concrete words decreased when the anterior temporal cortex was stimulated. These results are consistent with a lexical-semantic model that distinguishes between concrete and abstract words related to different neural substrates in the left hemisphere.
Spotting words in handwritten Arabic documents
NASA Astrophysics Data System (ADS)
Srihari, Sargur; Srinivasan, Harish; Babu, Pavithra; Bhole, Chetan
2006-01-01
The design and performance of a system for spotting handwritten Arabic words in scanned document images is presented. The three main components of the system are a word segmenter, a shape-based matcher for words and a search interface. The user types in a query in English within a search window; the system finds the equivalent Arabic word, e.g., by dictionary look-up, and locates word images in an indexed (segmented) set of documents. A two-step approach is employed in performing the search: (1) prototype selection: the query is used to obtain a set of handwritten samples of that word from a known set of writers (these are the prototypes), and (2) word matching: the prototypes are used to spot each occurrence of those words in the indexed document database. A ranking is performed on the entire set of test word images, where the ranking criterion is a similarity score between each prototype word and the candidate words based on global word shape features. A database of 20,000 word images contained in 100 scanned handwritten Arabic documents written by 10 different writers was used to study retrieval performance. Using five writers for providing prototypes and the other five for testing, using manually segmented documents, 55% precision is obtained at 50% recall. Performance increases as more writers are used for training.
A Multiple-Representation Paradigm for Document Development
1988-07-05
Write [10], MicroSoft Word [99], PageMaker [4], Ventura Publisher [135], Interleaf Publishing System [78], FrameMaker [52] and more have already...processing in FrameMaker, MicroSoft Word, and Ventura Publisher are all handled by a noninteractive off-line program. Direct manipulation, from the
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-01
... must be submitted electronically in machine-readable format. PDF images created by scanning a paper document may not be submitted, except in cases in which a word- processing version of the document is not...
Transcript mapping for handwritten English documents
NASA Astrophysics Data System (ADS)
Jose, Damien; Bharadwaj, Anurag; Govindaraju, Venu
2008-01-01
Transcript mapping or text alignment with handwritten documents is the automatic alignment of words in a text file with word images in a handwritten document. Such a mapping has several applications in fields ranging from machine learning, where large quantities of truth data are required for evaluating handwriting recognition algorithms, to data mining, where word image indexes are used in ranked retrieval of scanned documents in a digital library. The alignment also aids "writer identity" verification algorithms. Interfaces which display scanned handwritten documents may use this alignment to highlight manuscript tokens when a person examines the corresponding transcript word. We propose an adaptation of the True DTW dynamic programming algorithm for English handwritten documents. Our primary contribution is the integration of the dissimilarity scores from a word-model word recognizer and the Levenshtein distance between the recognized word and lexicon word as a cost metric in the DTW algorithm, leading to a fast and accurate alignment. The results provided confirm the effectiveness of our approach.
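The combination of Levenshtein distance and dynamic time warping can be sketched as below. This is a textbook DTW over two word sequences with edit distance as the only local cost; it is not the True DTW adaptation of the paper and leaves out the recognizer's dissimilarity scores. The function names and the sample words are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def dtw_align(recognized, transcript):
    """Align two word sequences with DTW, using Levenshtein distance as local cost."""
    n, m = len(recognized), len(transcript)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = levenshtein(recognized[i - 1], transcript[j - 1])
            cost[i][j] = local + min(cost[i - 1][j - 1],
                                     cost[i - 1][j],
                                     cost[i][j - 1])
    # Backtrack from the end to recover the warping path as index pairs.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Hypothetical recognizer output with one misread word, and the true transcript.
recognized = ["the", "quick", "brwn", "fox"]
transcript = ["the", "quick", "brown", "fox"]
total, path = dtw_align(recognized, transcript)
print(total, path)
```

Because the local cost tolerates small recognition errors, the misread word still aligns with its transcript counterpart instead of derailing the rest of the path.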
ERIC Educational Resources Information Center
White, Charles E., Jr.
The purpose of this study was to develop and implement a hypertext documentation system in an industrial laboratory and to evaluate its usefulness by participative observation and a questionnaire. Existing word-processing test method documentation was converted directly into a hypertext format or "hyperdocument." The hyperdocument was designed and…
Responding to Nonwords in the Lexical Decision Task: Insights from the English Lexicon Project
ERIC Educational Resources Information Center
Yap, Melvin J.; Sibley, Daragh E.; Balota, David A.; Ratcliff, Roger; Rueckl, Jay
2015-01-01
Researchers have extensively documented how various statistical properties of words (e.g., word frequency) influence lexical processing. However, the impact of lexical variables on nonword decision-making performance is less clear. This gap is surprising, because a better specification of the mechanisms driving nonword responses may provide…
Morphological Effects in Auditory Word Recognition: Evidence from Danish
ERIC Educational Resources Information Center
Balling, Laura Winther; Baayen, R. Harald
2008-01-01
In this study, we investigate the processing of morphologically complex words in Danish using auditory lexical decision. We document a second critical point in auditory comprehension in addition to the Uniqueness Point (UP), namely the point at which competing morphological continuation forms of the base cease to be compatible with the input,…
A Validation of Parafoveal Semantic Information Extraction in Reading Chinese
ERIC Educational Resources Information Center
Zhou, Wei; Kliegl, Reinhold; Yan, Ming
2013-01-01
Parafoveal semantic processing has recently been well documented in reading Chinese sentences, presumably because of language-specific features. However, because of a large variation of fixation landing positions on pretarget words, some preview words actually were located in foveal vision when readers' eyes landed close to the end of the…
Reaction Time Variability Associated with Reading Skills in Poor Readers with ADHD
Tamm, Leanne; Epstein, Jeffery N.; Denton, Carolyn A.; Vaughn, Aaron J.; Peugh, James; Willcutt, Erik G.
2014-01-01
Objective: Linkages between neuropsychological functioning (i.e., response inhibition, processing speed, reaction time variability) and word reading have been documented among children with Attention-Deficit/Hyperactivity Disorder (ADHD) and children with Reading Disorders. However, associations between neuropsychological functioning and other aspects of reading (i.e., fluency, comprehension) have not been well-documented among children with comorbid ADHD and Reading Disorder. Method: Children with ADHD and poor word reading (i.e., ≤25th percentile) completed a stop signal task (SST) and tests of word reading, reading fluency, and reading comprehension. Multivariate multiple regression was conducted predicting the reading skills from SST variables [i.e., mean reaction time (MRT), reaction time standard deviation (SDRT), and stop signal reaction time (SSRT)]. Results: SDRT predicted word reading, reading fluency, and reading comprehension. MRT and SSRT were not associated with any reading skill. After including word reading in models predicting reading fluency and reading comprehension, the effects of SDRT were minimized. Discussion: Reaction time variability (i.e., SDRT) reflects impairments in information processing and failure to maintain executive control. The pattern of results from this study suggests SDRT exerts its effects on reading fluency and reading comprehension through its effect on word reading (i.e., decoding) and that this relation may be related to observed deficits in higher-level elements of reading. PMID:24528537
A Comparison of Product Realization Frameworks
1993-10-01
software (integrated FrameMaker). Also included are BOLD for on-line documentation delivery, printer/plotter support, and 18 network licensing support. AMPLE...are built with DSS. Documentation tools include an on-line information system (BOLD), text editing (Notepad), word processing (integrated FrameMaker)...within an application. FrameMaker is fully integrated with the Falcon Framework to provide consistent documentation capabilities within engineering
System for information discovery
Pennock, Kelly A [Richland, WA; Miller, Nancy E [Kennewick, WA
2002-11-19
A sequence of word filters is used to eliminate terms in the database which do not discriminate document content, resulting in a filtered word set and a topic word set whose members are highly predictive of content. These two word sets are then formed into a two-dimensional matrix with matrix entries calculated as the conditional probability that a document will contain a word in a row given that it contains the word in a column. The matrix representation allows the resultant vectors to be utilized to interpret document contents.
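The matrix described above can be illustrated with a minimal sketch. It assumes, hypothetically, that documents are whitespace-tokenized strings and that one word set supplies the rows and the other the columns; the function name, the word lists and the toy documents are illustrative and not taken from the patent.

```python
def conditional_probability_matrix(docs, row_words, col_words):
    """Return m where m[i][j] estimates
    P(document contains row_words[i] | document contains col_words[j])."""
    # Represent each document as the set of its lower-cased tokens.
    doc_sets = [set(d.lower().split()) for d in docs]
    matrix = []
    for r in row_words:
        row = []
        for c in col_words:
            # Documents that contain the column word.
            with_c = [s for s in doc_sets if c in s]
            if not with_c:
                row.append(0.0)  # column word never occurs: define entry as 0
            else:
                row.append(sum(1 for s in with_c if r in s) / len(with_c))
        matrix.append(row)
    return matrix


# Toy collection (hypothetical data).
docs = [
    "reactor fuel storage",
    "reactor fuel safety",
    "storage tank safety",
    "reactor safety",
]
m = conditional_probability_matrix(docs, ["fuel", "storage"], ["reactor", "safety"])
```

Each row of `m` is then a vector describing one word in terms of the words it co-occurs with, which is the kind of vector the abstract says is used to interpret document contents.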
Finding Relevant Data in a Sea of Languages
2016-04-26
full machine-translated text, unbiased word clouds, query-biased word clouds, and query-biased sentence...and information retrieval to automate language processing tasks so that the limited number of linguists available for analyzing text and spoken...the crime (stock market). The Cross-LAnguage Search Engine (CLASE) has already preprocessed the documents, extracting text to identify the language
2008-11-01
T or more words, where T is a threshold that is empirically set to 300 in the experiment. The second rule aims to remove pornographic documents...Some blog documents are embedded with pornographic words to attract search traffic. We identify a list of pornographic words. Given a blog document, all...document, this document is considered pornographic spam, and is discarded. The third rule removes documents written in foreign languages. We count the
Proceedings-1979 third annual practical conference on communication
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1981-04-01
Topics covered at the meeting include: nonacademic writing, writer and editor training in technical publications, readability of technical documents, guide for beginning technical editors, a visual aids data base, newsletter publishing, style guide for a project management organization, word processing, computer graphics, text management for technical documentation, and typographical terminology.
Arabic handwritten: pre-processing and segmentation
NASA Astrophysics Data System (ADS)
Maliki, Makki; Jassim, Sabah; Al-Jawad, Naseer; Sellahewa, Harin
2012-06-01
This paper is concerned with pre-processing and segmentation tasks that influence the performance of Optical Character Recognition (OCR) systems and handwritten/printed text recognition. In Arabic, these tasks are adversely affected by the fact that many words are made up of sub-words, that many sub-words have one or more associated diacritics that are not connected to the sub-word's body, and that multiple sub-words may overlap. To overcome these problems we investigate and develop segmentation techniques that first segment a document into sub-words, link the diacritics with their sub-words, and remove possible overlapping between words and sub-words. We shall also investigate two approaches for pre-processing tasks to estimate the sub-word baseline and to determine parameters that yield appropriate slope correction and slant removal. We shall investigate the use of linear regression on sub-word pixels to determine their central x and y coordinates, as well as their high density part. We also develop a new incremental rotation procedure to be performed on sub-words that determines the best rotation angle needed to realign baselines. We shall demonstrate the benefits of these proposals by conducting extensive experiments on publicly available databases and in-house created databases. These algorithms help improve character segmentation accuracy by transforming handwritten Arabic text into a form that could benefit from analysis of printed text.
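As a rough illustration of the regression idea mentioned above, the sketch below fits an ordinary least-squares line through the foreground pixel coordinates of one sub-word and reports the centroid and the slope angle. It is only an assumption about how such a step could look, not the authors' procedure; the function name and the synthetic pixel list are hypothetical, and in image coordinates the y axis usually points downward.

```python
import math


def baseline_angle(pixels):
    """Fit y = a*x + b through (x, y) pixel coordinates by least squares.

    Returns the centroid (mean x, mean y) and the slope angle in degrees.
    """
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    if sxx == 0:
        raise ValueError("all pixels share one x coordinate; slope undefined")
    slope = sxy / sxx
    return (mean_x, mean_y), math.degrees(math.atan(slope))


# Synthetic sub-word whose pixels lie exactly on a line of slope 0.5.
pixels = [(x, 0.5 * x + 3) for x in range(20)]
center, angle = baseline_angle(pixels)
```

Rotating the sub-word by `-angle` about `center` would level this estimated baseline; an incremental procedure could repeat the fit after each small rotation.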
Fuzzy Document Clustering Approach using WordNet Lexical Categories
NASA Astrophysics Data System (ADS)
Gharib, Tarek F.; Fouad, Mohammed M.; Aref, Mostafa M.
Text mining refers generally to the process of extracting interesting information and knowledge from unstructured text. This area is growing rapidly, mainly because of the strong need to analyse the large amount of textual data that resides on internal file systems and the Web. Text document clustering provides an effective navigation mechanism to organize this large amount of data by grouping documents into a small number of meaningful classes. In this paper we propose a fuzzy text document clustering approach using WordNet lexical categories and the Fuzzy c-Means algorithm. Experiments are performed to compare the efficiency of the proposed approach with recently reported approaches. Experimental results show that fuzzy clustering performs well: the Fuzzy c-Means algorithm outperforms other classical clustering algorithms such as k-means and bisecting k-means in both clustering quality and running time.
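For readers unfamiliar with Fuzzy c-Means, a minimal pure-Python sketch of the standard algorithm follows. It clusters toy 2-D points rather than WordNet lexical-category vectors, so the feature representation, parameter defaults and data are assumptions made for illustration only.

```python
import random


def fuzzy_c_means(points, c=2, m=2.0, iters=100, seed=0):
    """Standard Fuzzy c-Means: returns (centers, memberships).

    memberships[i][k] is the degree to which point i belongs to cluster k;
    each row sums to 1. m > 1 is the fuzzifier.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centres are means of the points weighted by membership**m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )
        # Memberships are recomputed from relative distances to the centres.
        for i in range(n):
            dists = [
                max(1e-12, sum((points[i][d] - centers[k][d]) ** 2 for d in range(dim)) ** 0.5)
                for k in range(c)
            ]
            for k in range(c):
                u[i][k] = 1.0 / sum(
                    (dists[k] / dists[j]) ** (2.0 / (m - 1.0)) for j in range(c)
                )
    return centers, u


# Two well-separated toy groups (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, memberships = fuzzy_c_means(points)
labels = [row.index(max(row)) for row in memberships]
```

Unlike k-means, every document keeps a graded membership in every cluster; `labels` simply takes the strongest one.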
Kuperman, Victor; Drieghe, Denis; Keuleers, Emmanuel; Brysbaert, Marc
2013-01-01
We assess the amount of shared variance between three measures of visual word recognition latencies: eye movement latencies, lexical decision times, and naming times. After partialling out the effects of word frequency and word length, two well-documented predictors of word recognition latencies, we see that 7-44% of the variance is uniquely shared between lexical decision times and naming times, depending on the frequency range of the words used. A similar analysis of eye movement latencies shows that the percentage of variance they uniquely share either with lexical decision times or with naming times is much lower. It is 5-17% for gaze durations and lexical decision times in studies with target words presented in neutral sentences, but drops to 0.2% for corpus studies in which eye movements to all words are analysed. Correlations between gaze durations and naming latencies are lower still. These findings suggest that processing times in isolated word processing and continuous text reading are affected by specific task demands and presentation format, and that lexical decision times and naming times are not very informative in predicting eye movement latencies in text reading once the effects of word frequency and word length are taken into account. The difference between controlled experiments and natural reading suggests that reading strategies and stimulus materials may determine the degree to which the immediacy-of-processing assumption and the eye-mind assumption apply. Fixation times are more likely to exclusively reflect the lexical processing of the currently fixated word in controlled studies with unpredictable target words rather than in natural reading of sentences or texts.
ERIC Educational Resources Information Center
Chenail, Ronald J.
2012-01-01
In the first of a series of "how-to" essays on conducting qualitative data analysis, Ron Chenail points out the challenges of determining units to analyze qualitatively when dealing with text. He acknowledges that although we may read a document word-by-word or line-by-line, we need to adjust our focus when processing the text for purposes of…
Word spotting for handwritten documents using Chamfer Distance and Dynamic Time Warping
NASA Astrophysics Data System (ADS)
Saabni, Raid M.; El-Sana, Jihad A.
2011-01-01
A large number of handwritten historical documents are located in libraries around the world. The desire to access, search, and explore these documents paves the way for a new age of knowledge sharing and promotes collaboration and understanding between human societies. Currently, the indexes for these documents are generated manually, which is very tedious and time consuming. Results produced by state-of-the-art techniques for converting complete images of handwritten documents into textual representations are not yet sufficient. Therefore, word-spotting methods have been developed to archive and index images of handwritten documents in order to enable efficient searching within documents. In this paper, we present a new matching algorithm to be used in word-spotting tasks for historical Arabic documents. We present a novel algorithm based on the Chamfer Distance to compute the similarity between shapes of word-parts. Matching results are used to cluster images of Arabic word-parts into different classes using the Nearest Neighbor rule. To compute the distance between two word-part images, the algorithm subdivides each image into equal-sized slices (windows). A modified version of the Chamfer Distance, incorporating geometric gradient features and distance transform data, is used as a similarity distance between the different slices. Finally, the Dynamic Time Warping (DTW) algorithm is used to measure the distance between two images of word-parts. By using DTW, we enabled our system to cluster similar word-parts even though they are transformed non-linearly due to the nature of handwriting. We tested our implementation of the presented methods using various documents in different writing styles, taken from Juma'a Al Majid Center - Dubai, and obtained encouraging results.
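The DTW step can be sketched as follows. This is the textbook dynamic-programming formulation, shown on sequences of scalar slice features with absolute difference as a stand-in for the paper's modified Chamfer Distance between slices; the `dist` parameter and the sample sequences are assumptions for illustration.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic Dynamic Time Warping distance between sequences a and b.

    dist(x, y) is the local cost between one slice of a and one slice of b.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays on the same slice
                cost[i][j - 1],      # b advances, a stays on the same slice
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A stretched copy aligns at zero cost; a shifted sequence does not.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

Because one slice may be matched to several slices of the other image, two word-parts written with different horizontal stretching can still receive a small distance.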
ERIC Educational Resources Information Center
Smith, Irene; Yoder, Sharon
1996-01-01
Discusses word processing and desktop publishing and offers suggestions for creating documents that look more professional, including proportional type size, spacing, the use of punctuation marks, italics, tabs and margins, and paragraph styles. (LRW)
ERIC Educational Resources Information Center
Haapaniemi, Peter
1990-01-01
Describes imaging technology, which allows huge numbers of words and illustrations to be reduced to tiny fraction of space required by originals and discusses current applications. Highlights include image processing system at National Archives; use by banks for high-speed check processing; engineering document management systems (EDMS); folder…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Okman, Oya; Baginska, Marta; Jones, Elizabeth MC
Representing the Center for Electrical Energy Storage (CEES), this document is one of the entries in the Ten Hundred and One Word Challenge and was awarded "Best Science Lesson." As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CEES is to acquire a fundamental understanding of interfacial phenomena controlling electrochemical processes that will enable dramatic improvements in the properties and performance of energy storage devices, notably Li ion batteries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bryant, Steven L; Camacho-Lopez, Tara R; Tenney, Craig M
Representing the Center for Frontiers of Subsurface Energy Security (CFSES), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CFSES is to pursue the scientific understanding of multiscale, multiphysics processes and to ensure safe and economically feasible storage of carbon dioxide and other byproducts of energy production without harming the environment.
Herbert, Cornelia; Kissler, Johanna
2010-05-01
Valence-driven modulation of the startle reflex, that is, larger eyeblinks during viewing of unpleasant pictures and inhibited blinks while viewing pleasant pictures, is well documented. The current study investigated whether this motivational priming pattern also occurs during processing of unpleasant and pleasant words, and to what extent it is influenced by shallow vs. deep encoding of verbal stimuli. Emotional and neutral adjectives were presented for 5s, and the acoustically elicited startle eyeblink response was measured while subjects memorized the words by means of shallow or deep processing strategies. Results showed blink potentiation to unpleasant and blink inhibition to pleasant adjectives in subjects using shallow encoding strategies. In subjects using deep-encoding strategies, blinks were larger for pleasant than unpleasant or neutral adjectives. In line with this, free recall of pleasant words was also better in subjects who engaged in deep processing. The results suggest that motivational priming holds as long as processing is perceptual. However, during deep processing the startle reflex appears to represent a measure of "processing interrupt", facilitating blinks to those stimuli that are more deeply encoded. Copyright 2010 Elsevier B.V. All rights reserved.
Desktop publishing: a useful tool for scientists.
Lindroth, J R; Cooper, G; Kent, R L
1994-01-01
Desktop publishing offers features that are not available in word processing programs. The process yields an impressive and professional-looking document that is legible and attractive. It is a simple but effective tool to enhance the quality and appearance of your work and perhaps also increase your productivity.
Intelligent Document Gateway: A Service System Case Study and Analysis
NASA Astrophysics Data System (ADS)
Krishna, Vikas; Lelescu, Ana
In today's fast paced world, it is necessary to process business documents expediently, accurately, and diligently. In other words, processing has to be fast, errors must be prevented (or caught and corrected quickly), and documents cannot be lost or misplaced. The failure to meet these criteria, depending on the type and purpose of the documents, can have serious business, legal, or safety consequences. In this paper, we evaluated a B2B order placement service system that allows clients to place orders for products and services over a network. We describe the order placement service before and after deploying the Intelligent Document Gateway (IDG), a document-centric business process automation technology from IBM Research. Using service science perspective and service systems frameworks, we provide an analysis of how IDG improved the value proposition for both the service providers and service clients.
Hauk, Olaf; Davis, Matthew H; Pulvermüller, Friedemann
2008-09-01
Psycholinguistic research has documented a range of variables that influence visual word recognition performance. Many of these variables are highly intercorrelated. Most previous studies have used factorial designs, which do not exploit the full range of values available for continuous variables, and are prone to skewed stimulus selection as well as to effects of the baseline (e.g. when contrasting words with pseudowords). In our study, we used a parametric approach to study the effects of several psycholinguistic variables on brain activation. We focussed on the variable word frequency, which has been used in numerous previous behavioural, electrophysiological and neuroimaging studies, in order to investigate the neuronal network underlying visual word processing. Furthermore, we investigated the variable orthographic typicality as well as a combined variable for word length and orthographic neighbourhood size (N), for which neuroimaging results are still either scarce or inconsistent. Data were analysed using multiple linear regression analysis of event-related fMRI data acquired from 21 subjects in a silent reading paradigm. The frequency variable correlated negatively with activation in left fusiform gyrus, bilateral inferior frontal gyri and bilateral insulae, indicating that word frequency can affect multiple aspects of word processing. N correlated positively with brain activity in left and right middle temporal gyri as well as right inferior frontal gyrus. Thus, our analysis revealed multiple distinct brain areas involved in visual word processing within one data set.
Use of Co-occurrences for Temporal Expressions Annotation
NASA Astrophysics Data System (ADS)
Craveiro, Olga; Macedo, Joaquim; Madeira, Henrique
The annotation or extraction of temporal information from text documents is becoming increasingly important in many natural language processing applications such as text summarization, information retrieval, question answering, etc. This paper presents an original method for easy recognition of temporal expressions in text documents. The method creates semantically classified temporal patterns, using word co-occurrences obtained from training corpora and a pre-defined set of seed keywords derived from the temporal references of the language used. Participation in a Portuguese named entity evaluation contest showed promising effectiveness and efficiency results. This approach can be adapted to recognize other types of expressions or other languages, within other contexts, by defining suitable word sets and training corpora.
Semantic Similarity between Web Documents Using Ontology
NASA Astrophysics Data System (ADS)
Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh
2018-06-01
The World Wide Web is a source of information organized as interlinked web pages. However, extracting significant information with the assistance of a search engine remains difficult, because web information is written mainly in natural language and intended for individual human readers. Several efforts have been made at computing semantic similarity between documents using words, concepts and concept relationships, but the outcomes still do not meet user requirements. This paper proposes a novel technique for computing semantic similarity between documents that takes into account not only the concepts available in the documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records. Each such record is made up of the probable words which represent a given concept. Finally, document ontologies are compared to find their semantic similarity by taking into account the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.
Gerfo, Emanuele Lo; Oliveri, Massimiliano; Torriero, Sara; Salerno, Silvia; Koch, Giacomo; Caltagirone, Carlo
2008-01-31
We investigated the differential role of two frontal regions in the processing of grammatical and semantic knowledge. Given the documented specificity of the prefrontal cortex for the grammatical class of verbs, and of the primary motor cortex for the semantic class of action words, we sought to investigate whether the prefrontal cortex is also sensitive to semantic effects, and whether the motor cortex is also sensitive to grammatical class effects. We used repetitive transcranial magnetic stimulation (rTMS) to suppress the excitability of a portion of left prefrontal cortex (first experiment) and of the motor area (second experiment). In the first experiment we found that rTMS applied to the left prefrontal cortex delays the retrieval of action verbs, but is not critical for retrieval of state verbs and state nouns. In the second experiment we found that rTMS applied to the left motor cortex delays the processing of action words, both nouns and verbs, while it is not critical for the processing of state words. These results support the notion that left prefrontal and motor cortex are involved in the process of action word retrieval. Left prefrontal cortex subserves processing of both grammatical and semantic information, whereas motor cortex contributes to the processing of semantic representation of action words without any involvement in the representation of grammatical categories.
76 FR 27048 - Information Collection Being Reviewed by the Federal Communications Commission
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-10
... Commission; (8) Ex parte notices must be submitted electronically in machine-readable format. PDF images created by scanning a paper document may not be submitted, except in cases in which a word-processing...
Tracing And Control Of Engineering Requirements
NASA Technical Reports Server (NTRS)
Turner, Philip R.; Stoller, Richard L.; Neville, Ted; Boyle, Karen A.
1991-01-01
TRACER (Tracing and Control of Engineering Requirements) is a data-base/word-processing software system created to document and maintain the order of both requirements and descriptions associated with an engineering project. Implemented on IBM PC under PC-DOS. Written with CLIPPER.
Exploiting salient semantic analysis for information retrieval
NASA Astrophysics Data System (ADS)
Luo, Jing; Meng, Bo; Quan, Changqin; Tu, Xinhui
2016-11-01
Recently, many Wikipedia-based methods have been proposed to improve the performance of different natural language processing (NLP) tasks, such as semantic relatedness computation, text classification and information retrieval. Among these methods, salient semantic analysis (SSA) has been proven to be an effective way to generate conceptual representations for words or documents. However, its feasibility and effectiveness in information retrieval are mostly unknown. In this paper, we study how to efficiently use SSA to improve information retrieval performance, and propose an SSA-based retrieval method under the language model framework. First, the SSA model is adopted to build conceptual representations for documents and queries. Then, these conceptual representations and the bag-of-words (BOW) representations are used in combination to estimate the language models of queries and documents. The proposed method is evaluated on several standard Text REtrieval Conference (TREC) collections. Experimental results show that the proposed models consistently outperform the existing Wikipedia-based retrieval methods.
Schuster, Sarah; Hawelka, Stefan; Hutzler, Florian; Kronbichler, Martin; Richlan, Fabio
2016-01-01
Word length, frequency, and predictability count among the most influential variables during reading. Their effects are well-documented in eye movement studies, but pertinent evidence from neuroimaging primarily stems from single-word presentations. We investigated the effects of these variables during reading of whole sentences with simultaneous eye-tracking and functional magnetic resonance imaging (fixation-related fMRI). Increasing word length was associated with increasing activation in occipital areas linked to visual analysis. Additionally, length elicited a U-shaped modulation (i.e., least activation for medium-length words) within a brain stem region presumably linked to eye movement control. These effects, however, were diminished when accounting for multiple fixation cases. Increasing frequency was associated with decreasing activation within left inferior frontal, superior parietal, and occipito-temporal regions. The function of the latter region, which hosts the putative visual word form area, was originally considered as limited to sublexical processing. An exploratory analysis revealed that increasing predictability was associated with decreasing activation within middle temporal and inferior frontal regions previously implicated in memory access and unification. The findings are discussed with regard to their correspondence with findings from single-word presentations and with regard to neurocognitive models of visual word recognition, semantic processing, and eye movement control during reading. PMID:27365297
Federal Register 2010, 2011, 2012, 2013, 2014
2013-02-11
... massive emails, word processing documents, PDF files, spreadsheets, presentations, database entries, and....pdf . PURPOSES: OGC-EDMS provides OGC with a method to initiate, track, and manage the collection...
The Microcomputer and School Transportation.
ERIC Educational Resources Information Center
Dembowski, Frederick L.
1984-01-01
Microcomputers have many cost- and time-saving uses in school transportation management. Applications include routing and scheduling, demographic analysis, fleet maintenance, and personnel and contract management. Word processing is especially promising for storing and updating documents like specifications. Enrollment forecasting and inventory…
Diverging receptive and expressive word processing mechanisms in a deep dyslexic reader.
Ablinger, Irene; Radach, Ralph
2016-01-29
We report on KJ, a patient with acquired dyslexia due to cerebral artery infarction. He represents an unusually clear case of an "output" deep dyslexic reader, with a distinct pattern of pure semantic reading. According to current neuropsychological models of reading, the severity of this condition is directly related to the degree of impairment in semantic and phonological representations and the resulting imbalance in the interaction between the two word processing pathways. The present work sought to examine whether an innovative eye movement supported intervention combining lexical and segmental therapy would strengthen phonological processing and lead to an attenuation of the extreme semantic over-involvement in KJ's word identification process. Reading performance was assessed before (T1), between (T2), and after (T3) therapy using both analyses of linguistic errors and word viewing patterns. Therapy resulted in improved reading aloud accuracy along with a change in error distribution that suggested a return to more sequential reading. Interestingly, this was in contrast to the dynamics of moment-to-moment word processing, as eye movement analyses still suggested a predominantly holistic strategy, even at T3. So, in addition to documenting the success of the therapeutic intervention, our results call for a theoretically important conclusion: Real-time letter and word recognition routines should be considered separately from properties of the verbal output. Combining both perspectives may provide a promising strategy for future assessment and therapy evaluation. Copyright © 2015. Published by Elsevier Ltd.
Teaching Basic Reading Skills in Secondary Schools.
ERIC Educational Resources Information Center
Carnine, Linda
1980-01-01
This document presents diagnostic and prescriptive techniques that will enable teachers to enhance secondary school students' learning through reading in content areas. Three terms used in the document are defined in Section I: "vocabulary skills" include word attack skills, sight word skills, and word meanings; "comprehension skills" are literal,…
Sanfilippo, Antonio [Richland, WA; Calapristi, Augustin J [West Richland, WA; Crow, Vernon L [Richland, WA; Hetzler, Elizabeth G [Kennewick, WA; Turner, Alan E [Kennewick, WA
2009-12-22
Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture are described. In one aspect, a document clustering method includes providing a document set comprising a plurality of documents, providing a cluster comprising a subset of the documents of the document set, using a plurality of terms of the documents, providing a cluster label indicative of subject matter content of the documents of the cluster, wherein the cluster label comprises a plurality of word senses, and selecting one of the word senses of the cluster label.
Building a Road from Light to Energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Anton; Bilby, David; Barito, Adam
Representing the Center for Solar and Thermal Energy Conversion (CSTEC), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the Center for Solar and Thermal Energy Conversion (CSTEC) is to design and to synthesize new materials for high efficiency photovoltaic (PV) and thermoelectric (TE) devices, predicated on new fundamental insights into equilibrium and non-equilibrium processes, including quantum phenomena, that occur in materials over various spatial and temporal scales.
Putting more power in your pocket
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chapman, Karena
Representing the Northeastern Center for Chemical Energy Storage (NECCES), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of NECCES is to identify the key atomic-scale processes which govern electrode function in rechargeable batteries, over a wide range of time and length scales, via the development and use of novel characterization and theoretical tools, and to use this information to identify and design new battery systems.
Car manufacturers and global road safety: a word frequency analysis of road safety documents.
Roberts, I; Wentz, R; Edwards, P
2006-10-01
The World Bank believes that the car manufacturers can make a valuable contribution to road safety in poor countries and has established the Global Road Safety Partnership (GRSP) for this purpose. However, some commentators are sceptical. The authors examined road safety policy documents to assess the extent of any bias. Word frequency analyses of road safety policy documents from the World Health Organization (WHO) and the GRSP. The relative occurrence of key road safety terms was quantified by calculating a word prevalence ratio with 95% confidence intervals. Terms for which there was a fourfold difference in prevalence between the documents were tabulated. Compared to WHO's World report on road traffic injury prevention, the GRSP road safety documents were substantially less likely to use the words speed, speed limits, child restraint, pedestrian, public transport, walking, and cycling, but substantially more likely to use the words school, campaign, driver training, and billboard. There are important differences in emphasis in road safety policy documents prepared by WHO and the GRSP. Vigilance is needed to ensure that the road safety interventions that the car industry supports are based on sound evidence of effectiveness.
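A word prevalence ratio with a 95% confidence interval can be computed as in the sketch below. The abstract does not state which interval method was used, so the log-ratio (risk-ratio style) standard error here is an assumption, and the counts in the example are invented purely to exercise the function.

```python
import math


def prevalence_ratio(count_a, total_a, count_b, total_b, z=1.96):
    """Ratio of a term's relative frequency in corpus A to that in corpus B,
    with an approximate confidence interval computed on the log scale."""
    p_a = count_a / total_a
    p_b = count_b / total_b
    ratio = p_a / p_b
    # Standard error of log(ratio), as used for a ratio of two proportions.
    se = math.sqrt(1 / count_a - 1 / total_a + 1 / count_b - 1 / total_b)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper


# Hypothetical counts: a term occurs 10 times in 20,000 words of corpus A
# and 80 times in 40,000 words of corpus B.
ratio, lower, upper = prevalence_ratio(10, 20000, 80, 40000)
```

A ratio of 0.25 or below, or 4 or above, would correspond to the fourfold difference used in the study as the tabulation threshold.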
Genes2WordCloud: a quick way to identify biological themes from gene lists and free text.
Baroukh, Caroline; Jenkins, Sherry L; Dannenfelser, Ruth; Ma'ayan, Avi
2011-10-13
Word-clouds recently emerged on the web as a solution for quickly summarizing text by maximizing the display of most relevant terms about a specific topic in the minimum amount of space. As biologists are faced with the daunting amount of new research data commonly presented in textual formats, word-clouds can be used to summarize and represent biological and/or biomedical content for various applications. Genes2WordCloud is a web application that enables users to quickly identify biological themes from gene lists and research relevant text by constructing and displaying word-clouds. It provides users with several different options and ideas for the sources that can be used to generate a word-cloud. Different options for rendering and coloring the word-clouds give users the flexibility to quickly generate customized word-clouds of their choice. Genes2WordCloud is a word-cloud generator and a word-cloud viewer that is based on WordCram implemented using Java, Processing, AJAX, mySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms with their computed weights based on word frequencies. Genes2WordCloud is freely available for use online; it is open source software and is available for installation on any web-site along with supporting documentation at http://www.maayanlab.net/G2W. Genes2WordCloud provides a useful way to summarize and visualize large amounts of textual biological data or to find biological themes from several different sources. The open source availability of the software enables users to implement customized word-clouds on their own web-sites and desktop applications.
Genes2WordCloud: a quick way to identify biological themes from gene lists and free text
2011-01-01
Background: Word-clouds recently emerged on the web as a solution for quickly summarizing text by maximizing the display of most relevant terms about a specific topic in the minimum amount of space. As biologists are faced with the daunting amount of new research data commonly presented in textual formats, word-clouds can be used to summarize and represent biological and/or biomedical content for various applications. Results: Genes2WordCloud is a web application that enables users to quickly identify biological themes from gene lists and research relevant text by constructing and displaying word-clouds. It provides users with several different options and ideas for the sources that can be used to generate a word-cloud. Different options for rendering and coloring the word-clouds give users the flexibility to quickly generate customized word-clouds of their choice. Methods: Genes2WordCloud is a word-cloud generator and a word-cloud viewer that is based on WordCram implemented using Java, Processing, AJAX, mySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms with their computed weights based on word frequencies. Genes2WordCloud is freely available for use online; it is open source software and is available for installation on any web-site along with supporting documentation at http://www.maayanlab.net/G2W. Conclusions: Genes2WordCloud provides a useful way to summarize and visualize large amounts of textual biological data or to find biological themes from several different sources. The open source availability of the software enables users to implement customized word-clouds on their own web-sites and desktop applications. PMID:21995939
Interface for the documentation and compilation of a library of computer models in physiology.
Summers, R. L.; Montani, J. P.
1994-01-01
A software interface for the documentation and compilation of a library of computer models in physiology was developed. The interface is an interactive program built within a word processing template in order to provide ease and flexibility of documentation. A model editor within the interface directs the model builder as to standardized requirements for incorporating models into the library and provides the user with an index to the levels of documentation. The interface and accompanying library are intended to facilitate model development, preservation and distribution and will be available for public use. PMID:7950046
Link-topic model for biomedical abbreviation disambiguation.
Kim, Seonho; Yoon, Juntae
2015-02-01
The ambiguity of biomedical abbreviations is one of the challenges in biomedical text mining systems. In particular, the handling of term variants and abbreviations without nearby definitions is a critical issue. In this study, we adopt the concepts of topic of document and word link to disambiguate biomedical abbreviations. We newly suggest the link topic model inspired by the latent Dirichlet allocation model, in which each document is perceived as a random mixture of topics, where each topic is characterized by a distribution over words. Thus, the most probable expansions with respect to abbreviations of a given abstract are determined by word-topic, document-topic, and word-link distributions estimated from a document collection through the link topic model. The model allows two distinct modes of word generation to incorporate semantic dependencies among words, particularly long form words of abbreviations and their sentential co-occurring words; a word can be generated either dependently on the long form of the abbreviation or independently. The semantic dependency between two words is defined as a link and a new random parameter for the link is assigned to each word as well as a topic parameter. Because the link status indicates whether the word constitutes a link with a given specific long form, it has the effect of determining whether a word forms a unigram or a skipping/consecutive bigram with respect to the long form. Furthermore, we place a constraint on the model so that a word has the same topic as a specific long form if it is generated in reference to the long form. Consequently, documents are generated from the two hidden parameters, i.e. topic and link, and the most probable expansion of a specific abbreviation is estimated from the parameters. 
Our model relaxes the bag-of-words assumption of the standard topic model in which the word order is neglected, and it captures a richer structure of text than does the standard topic model by considering unigrams and semantically associated bigrams simultaneously. The addition of semantic links improves the disambiguation accuracy without removing irrelevant contextual words and reduces the parameter space of massive skipping or consecutive bigrams. The link topic model achieves 98.42% disambiguation accuracy on 73,505 MEDLINE abstracts with respect to 21 three letter abbreviations and their 139 distinct long forms. Copyright © 2014 Elsevier Inc. All rights reserved.
Biomedical information retrieval across languages.
Daumke, Philipp; Markó, Kornél; Poprat, Michael; Schulz, Stefan; Klar, Rüdiger
2007-06-01
This work presents a new dictionary-based approach to biomedical cross-language information retrieval (CLIR) that addresses many of the general and domain-specific challenges in current CLIR research. Our method is based on a multilingual lexicon that was generated partly manually and partly automatically, and currently covers six European languages. It contains morphologically meaningful word fragments, termed subwords. Using subwords instead of entire words significantly reduces the number of lexical entries necessary to sufficiently cover a specific language and domain. Mediation between queries and documents is based on these subwords as well as on lists of word-n-grams that are generated from large monolingual corpora and constitute possible translation units. The translations are then sent to a standard Internet search engine. This process makes our approach an effective tool for searching the biomedical content of the World Wide Web in different languages. We evaluate this approach using the OHSUMED corpus, a large medical document collection, within a cross-language retrieval setting.
NASA Astrophysics Data System (ADS)
Zhang, Hui; Wang, Deqing; Wu, Wenjun; Hu, Hongping
2012-11-01
In today's business environment, enterprises are increasingly under pressure to process the vast amount of data produced every day within enterprises. One method is to focus on business intelligence (BI) applications and increase the commercial added value through such business analytics activities. Term weighting, which is used to convert documents into vectors in the term space, is a vital task in enterprise Information Retrieval (IR), text categorisation, text analytics, etc. When determining the weight of a term in a document, the traditional TF-IDF scheme considers only the term's occurrence frequency within the document and in the entire set of documents, so some meaningful terms do not get an appropriate weight. In this article, we propose a new term weighting scheme called Term Frequency - Function of Document Frequency (TF-FDF) to address this issue. Instead of using a monotonically decreasing function such as Inverse Document Frequency, FDF presents a convex function that dynamically adjusts weights according to the significance of the words in a document set. This function can be manually tuned based on the distribution of the most meaningful words which semantically represent the document set. Our experiments show that TF-FDF achieves a higher Normalised Discounted Cumulative Gain in IR than TF-IDF and its variants, improving the accuracy of relevance ranking of the IR results.
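To make the baseline concrete, here is a minimal sketch of the traditional TF-IDF weighting that the abstract above argues against. It is not TF-FDF: the abstract does not give the convex FDF function, so no attempt is made to reproduce it. The function name and the toy documents in the usage note are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights.

    docs is a list of token lists; the result is one {term: weight} dict
    per document, using relative term frequency times log(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in tf.items()
        })
    return weights
```

With `tf_idf([["bank", "loan", "loan"], ["bank", "river"]])`, the term "bank" occurs in every document and so receives a weight of zero, however meaningful it may be for the collection. This is the kind of behaviour a monotonically decreasing IDF produces.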
BioWord: A sequence manipulation suite for Microsoft Word
2012-01-01
Background: The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. Results: BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. Conclusions: BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms. PMID:22676326
BioWord: a sequence manipulation suite for Microsoft Word.
Anzaldi, Laura J; Muñoz-Fernández, Daniel; Erill, Ivan
2012-06-07
The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms.
ERIC Educational Resources Information Center
Gazan, Rich
2000-01-01
Surveys the current state of Extensible Markup Language (XML), a metalanguage for creating structured documents that describe their own content, and its implications for information professionals. Predicts that XML will become the common language underlying Web, word processing, and database formats. Also discusses Extensible Stylesheet Language…
Content Abstract Classification Using Naive Bayes
NASA Astrophysics Data System (ADS)
Latif, Syukriyanto; Suwardoyo, Untung; Aldrin Wihelmus Sanadi, Edwin
2018-03-01
This study aims to classify abstract content based on the most frequently used words in the abstracts of English-language journals. The research uses text mining, which extracts text data to search for information in a set of documents. The data are 120 abstracts downloaded from www.computer.org, grouped into three categories: DM (Data Mining), ITS (Intelligent Transport System) and MM (Multimedia). The system uses the naive Bayes algorithm to classify journal abstracts, with a feature selection process that uses term weighting to give a weight to each word. Dimension reduction removes words that rarely appear in each document, with the reduction parameter tested at 10%-90% of 5,344 words. The performance of the classification system is tested using a confusion matrix on the test data. The results showed that the best classification results were obtained with 75% of the total data used for training and 25% for testing. Accuracy rates for the categories DM, ITS and MM were 100%, 100% and 86%, respectively, with a dimension reduction parameter of 30% and a learning rate between 0.1 and 0.5.
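The classification step in the abstract above can be sketched as a small multinomial naive Bayes classifier. This is an illustration under assumptions: it uses raw word counts with add-one smoothing and leaves out the term weighting and dimension reduction the study describes. The class name and the toy training data in the usage note are hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Multinomial naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_docs = Counter(labels)          # documents per class
        self.term_counts = defaultdict(Counter)    # term counts per class
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        self.n_docs = len(labels)
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        for label, n in self.class_docs.items():
            counts = self.term_counts[label]
            total = sum(counts.values()) + len(self.vocab)
            score = math.log(n / self.n_docs)      # log prior
            for t in tokens:
                if t in self.vocab:                # skip unseen words
                    score += math.log((counts[t] + 1) / total)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Trained on one toy document per category, for instance `["cluster", "mining", "pattern"]` as DM, `["traffic", "vehicle", "road"]` as ITS and `["video", "audio", "image"]` as MM, the classifier assigns `["traffic", "road"]` to ITS.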
Lapinskaya, Natalia; Uzomah, Uchechukwu; Bedny, Marina; Lau, Ellen
2016-12-01
Numerous theories have been proposed regarding the brain's organization and retrieval of lexical information. Neurophysiological dissociations in processing different word classes, particularly nouns and verbs, have been extensively documented, supporting the contribution of grammatical class to lexical organization. However, the contribution of semantic properties to these processing differences is still unresolved. We aim to isolate this contribution by comparing ERPs to verbs (e.g. wade), object nouns (e.g. cookie), and event nouns (e.g. concert) in a paired similarity judgment task, as event nouns share grammatical category with object nouns but some semantic properties with verbs. We find that event nouns pattern with verbs in eliciting a more positive response than object nouns across left anterior electrodes 300-500ms after word presentation. This time-window has been strongly linked to lexical-semantic access by prior electrophysiological work. Thus, the similarity of the response to words referring to concepts with more complex participant structure and temporal continuity extends across grammatical class (event nouns and verbs), and contrasts with the words that refer to objects (object nouns). This contrast supports a semantic, as well as syntactic, contribution to the differential neural organization and processing of lexical items. We also observed a late (500-800ms post-stimulus) posterior positivity for object nouns relative to event nouns and verbs at the second word of each pair, which may reflect the impact of semantic properties on the similarity judgment task. Copyright © 2016 Elsevier Ltd. All rights reserved.
The distinct emotional flavor of Gnostic writings from the early Christian era.
Whissell, Cynthia
2008-02-01
More than 500,000 scored words in 83 documents were used to conclude that it is possible to identify the source of documents (proto-orthodox Christian versus early Gnostic) on the basis of the emotions underlying the words. Twenty-seven New Testament works and seven Gnostic documents (including the gospels of Thomas, Judas, and Mary [Magdalene]) were scored with the Dictionary of Affect in Language. Patterns of emotional word use focusing on eight types of extreme emotional words were employed in a discriminant function analysis to predict source. Prediction was highly successful (canonical r = .81, 97% correct identification of source). When the discriminant function was tested with more than 30 additional Gnostic and Christian works including a variety of translations and some wisdom books, it correctly classified all of them. The majority of the predictive power of the function (97% of all correct categorizations, 70% of the canonical r²) was associated with the preferential presence of passive and passive/pleasant words in Gnostic documents.
Command Center Library Model Document. Comprehensive Approach to Reusable Defense Software (CARDS)
1992-05-31
system, and functionality for specifying the layout of the document. 3.7.16.1 FrameMaker FrameMaker is a Commercial Off The Shelf (COTS) component...facilitating WYSIWYG creation of formatted reports with embedded graphics. FrameMaker is an advanced publishing tool that integrates word processing...available for the component FrameMaker : * Product evaluation reports in ASCII and postscript formats • Product assessment on line in model 0 Product
A Language-Independent Approach to Automatic Text Difficulty Assessment for Second-Language Learners
2013-08-01
best-suited for regression. Our baseline uses z-normalized shallow length features and TF-LOG weighted vectors on bag-of-words for Arabic, Dari...length features and TF-LOG weighted vectors on bag-of-words for Arabic, Dari, English and Pashto. We compare Support Vector Machines and the Margin...football, whereas they are much less common in documents about opera). We used TF-LOG weighted word frequencies on bag-of-words for each document
Landi, Nicole; Avery, Trey; Crowley, Michael J; Wu, Jia; Mayes, Linda
2017-01-01
Extant research documents impaired language among children with prenatal cocaine exposure (PCE) relative to nondrug exposed (NDE) children, suggesting that cocaine alters development of neurobiological systems that support language. The current study examines behavioral and neural (electrophysiological) indices of language function in older adolescents. Specifically, we compare performance of PCE (N = 59) and NDE (N = 51) adolescents on a battery of cognitive and linguistic assessments that tap word reading, reading comprehension, semantic and grammatical processing, and IQ. In addition, we examine event related potential (ERP) responses in a subset of these children across three experimental tasks that examine word level phonological processing (rhyme priming), word level semantic processing (semantic priming), and sentence level semantic processing (semantic anomaly). Findings reveal deficits across a number of reading and language assessments, after controlling for socioeconomic status and exposure to other substances. Additionally, ERP data reveal atypical orthography to phonology mapping (reduced N1/P2 response) and atypical rhyme and semantic processing (N400 response). These findings suggest that PCE continues to impact language and reading skills into the late teenage years.
Desktop Publishing in Education.
ERIC Educational Resources Information Center
Hall, Wendy; Layman, J.
1989-01-01
Discusses the state of desktop publishing (DTP) in education today and describes the weaknesses of the systems available for use in the classroom. Highlights include document design and layout; text composition; graphics; word processing capabilities; a comparison of commercial and educational DTP packages; and skills required for DTP. (four…
Generative Processes: Thick Drawing
ERIC Educational Resources Information Center
Wallick, Karl
2012-01-01
This article presents techniques and theories of generative drawing as a means for developing complex content in architecture design studios. Appending the word "generative" to drawing adds specificity to the most common representation tool and clarifies that such drawings are not singularly about communication or documentation but are…
Tsuji, Shintarou; Nishimoto, Naoki; Ogasawara, Katsuhiko
2008-07-20
Although large medical texts are stored in electronic format, they are seldom reused because of the difficulty of processing narrative texts by computer. Morphological analysis is a key technology for extracting medical terms correctly and automatically. This process parses a sentence into its smallest unit, the morpheme. Phrases consisting of two or more technical terms, however, cause morphological analysis software to fail in parsing the sentence and output unprocessed terms as "unknown words." The purpose of this study was to reduce the number of unknown words in medical narrative text processing. The results of parsing the text with additional dictionaries were compared with the analysis of the number of unknown words in the national examination for radiologists. The ratio of unknown words was reduced from 1.0% to 0.36% by adding terminologies of radiological technology, MeSH, and ICD-10 labels. The terminology of radiological technology was the most effective resource, reducing the ratio by 0.62%. This result clearly showed the necessity of additional dictionary selection and trends in unknown words. The potential for this investigation is to make available a large body of clinical information that would otherwise be inaccessible for applications other than manual health care review by personnel.
Technological Developments in Journalism: The Impact of the Computer in the Newsroom.
ERIC Educational Resources Information Center
Garrison, Bruce
A review of the literature for the past 7 years reveals that the computer serves several key functions in the newsroom. Its more dominant role is in word processing, or internal copy processing regardless of the source of the copy. Computers are also useful in reviewing documents for content analysis, for survey research in public opinion polls…
Video to Text (V2T) in Wide Area Motion Imagery
2015-09-01
microtext) or a document (e.g., using Sphinx or Apache NLP) as an automated approach [102]. Previous work in natural language full-text searching...language processing (NLP) based module. The heart of the structured text processing module includes the following seven key word banks...Features Tracker MHT Multiple Hypothesis Tracking MIL Multiple Instance Learning NLP Natural Language Processing OAB Online AdaBoost OF Optic Flow
NASA Astrophysics Data System (ADS)
Beretta, Giordano
2007-01-01
The words in a document are often supported, illustrated, and enriched by visuals. When color is used, some of it is used to define the document's identity and is therefore strictly controlled in the design process. The result of this design process is a "color specification sheet," which must be created for every background color. While in traditional publishing there are only a few backgrounds, in variable data publishing a larger number of backgrounds can be used. We present an algorithm that nudges the colors in a visual to be distinct from a background while preserving the visual's general color character.
Different Neural Correlates of Emotion-Label Words and Emotion-Laden Words: An ERP Study.
Zhang, Juan; Wu, Chenggang; Meng, Yaxuan; Yuan, Zhen
2017-01-01
It is well-documented that both emotion-label words (e.g., sadness, happiness) and emotion-laden words (e.g., death, wedding) can induce emotion activation. However, the neural correlates of recognizing emotion-label words and emotion-laden words have not been examined. The present study aimed to compare the underlying neural responses when processing the two kinds of words by employing event-related potential (ERP) measurements. Fifteen Chinese native speakers were asked to perform a lexical decision task in which they judged whether a two-character compound stimulus was a real word or not. Results showed that (1) emotion-label words and emotion-laden words elicited similar P100 at the posterior sites, (2) a larger N170 was found for emotion-label words than for emotion-laden words at the occipital sites on the right hemisphere, and (3) negative emotion-label words elicited a larger Late Positivity Complex (LPC) on the right hemisphere than on the left hemisphere, while no such effect was found for emotion-laden words and positive emotion-label words. The results indicate that emotion-label words and emotion-laden words elicit different cortical responses at both early (N170) and late (LPC) stages. In addition, a right hemisphere advantage for emotion-label words over emotion-laden words can be observed in certain time windows (i.e., N170 and LPC) but not in another (i.e., P100). The implications of the current findings for future emotion research are discussed.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-24
... Microsoft Excel. By Hard Copy: U.S. mail or hand-delivery: Public Comments Processing, Attn: FWS-HQ-ES-2013... procedures. If you attach your comments as a separate document, our preferred file format is Microsoft Word...
Topics in Semantic Representation
ERIC Educational Resources Information Center
Griffiths, Thomas L.; Steyvers, Mark; Tenenbaum, Joshua B.
2007-01-01
Processing language requires the retrieval of concepts from memory in response to an ongoing stream of information. This retrieval is facilitated if one can infer the gist of a sentence, conversation, or document and use that gist to predict related concepts and disambiguate words. This article analyzes the abstract computational problem…
Narrative medicine and death in the ICU: word clouds as a visual legacy.
Vanstone, Meredith; Toledo, Feli; Clarke, France; Boyle, Anne; Giacomini, Mita; Swinton, Marilyn; Saunders, Lois; Shears, Melissa; Zytaruk, Nicole; Woods, Anne; Rose, Trudy; Hand-Breckenridge, Tracey; Heels-Ansdell, Diane; Anderson-White, Shelley; Sheppard, Robert; Cook, Deborah
2016-11-24
The Word Cloud is a frequent wish in the 3 Wishes Project developed to nurture peace and ease the grieving process for dying critically ill patients. The objective was to examine whether Word Clouds can act as a heuristic approach to encourage a narrative orientation to medicine. Narrative medicine is an approach which can strengthen relationships, compassion and resilience. Word Clouds were created for 42 dying patients, and we interviewed 37 family members and 73 clinicians about their impact. We conducted a directed qualitative content analysis, using the 3 stages of narrative medicine (attention, representation, affiliation) to examine the narrative medicine potential of Word Clouds. The elicitation of stories for the Word Cloud promotes narrative attention to the patient as a whole person. The distillation of these stories into a list of words and the prioritisation of those words for arrangement in the collage encourages a representation that did not enforce a beginning, middle or end to the story of the patient's life. Strong affiliative connections were achieved through the honouring of patients, caring for families and sharing of memories encouraged through the creation, sharing and discussion of Word Clouds. In the 3 Wishes Project, Word Clouds are 1 way that families and clinicians honour a dying patient. Engaging in the process of making a Word Cloud can promote a narrative orientation to medicine, forging connections, making meaning through reminiscence and leaving a legacy of a loved one. Documenting and displaying words to remember someone in death reaffirms their life. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
Text-image alignment for historical handwritten documents
NASA Astrophysics Data System (ADS)
Zinger, S.; Nerbonne, J.; Schomaker, L.
2009-01-01
We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text transcriptions. The images of handwritten lines are automatically segmented from the scanned pages of historical documents and then manually transcribed. To train automatic routines to detect words in an image of handwritten text, we need a training set - images of words with their transcriptions. We present our results on aligning words from the images of handwritten lines and their corresponding text transcriptions. Alignment based on the longest spaces between portions of handwriting is a baseline. We then show that relative lengths, i.e. proportions of words in their lines, can be used to improve the alignment results considerably. To take into account the relative word length, we define the expressions for the cost function that has to be minimized for aligning text words with their images. We apply right to left alignment as well as alignment based on exhaustive search. The quality assessment of these alignments shows correct results for 69% of words from 100 lines, or 90% of partially correct and correct alignments combined.
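The idea of aligning by relative word length with an exhaustive search can be sketched as follows. The abstract does not give the authors' cost expressions, so the sum of absolute differences used here is an assumption, as are the function name, the representation of gaps as x-coordinates, and the example values.

```python
from itertools import combinations

def align_words(words, gap_positions, line_width):
    """Choose len(words) - 1 cut points among candidate gaps in a line image.

    The cost (assumed) is the total mismatch between each segment's share of
    the line width and the matching word's share of the transcription's
    characters. Returns (best_cuts, best_cost); best_cuts is None when
    there are too few candidate gaps.
    """
    total_chars = sum(len(w) for w in words)
    expected = [len(w) / total_chars for w in words]
    best_cuts, best_cost = None, float("inf")
    # Exhaustive search over every way of picking the cut points.
    for cuts in combinations(sorted(gap_positions), len(words) - 1):
        edges = [0, *cuts, line_width]
        observed = [(b - a) / line_width for a, b in zip(edges, edges[1:])]
        cost = sum(abs(o - e) for o, e in zip(observed, expected))
        if cost < best_cost:
            best_cuts, best_cost = cuts, cost
    return best_cuts, best_cost
```

For a line 130 pixels wide transcribed as "to the governor", with candidate gaps at x = 20, 35, 50 and 90, the search selects the cuts at 20 and 50, since the resulting segments match the 2 : 3 : 8 character proportions. The number of combinations grows quickly with line length, which is one reason a cheaper directional pass may be preferred in practice.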
Centroid-Based Document Classification Algorithms: Analysis & Experimental Results
2000-03-06
stories such as baseball, football, basketball, and Olympics. In the first category, most of the documents contain words Clinton and Lewinsky and hence...document. On the other hand, any of sports related words like baseball, football, and basketball appearing in a document will put the document in the...0.15 diseas 0.14 women 0.13 heart 0.12 drug 4 0.41 newspap 0.22 editor 0.19 advertis 0.14 media 0.13 peruvian 0.13 coverag 0.12 percent 0.12 journalist
Attention and eye-movement control in reading: The selective reading paradigm.
Reingold, Eyal M; Sheridan, Heather; Meadmore, Katie L; Drieghe, Denis; Liversedge, Simon P
2016-12-01
We introduced a novel paradigm for investigating covert attention and eye-movement control in reading. In 2 experiments, participants read sentence words (shown in blue color) while ignoring interleaved distractor strings (shown in orange color). Each single-line text display contained a target word and a critical distractor. Critical distractors were located just prior to the target in the text and were either words or symbol strings (e.g., @#%&). Target word availability for parafoveal processing (i.e., preview validity) was also manipulated. The results indicated much shallower processing of distractors than targets, and this pattern was more pronounced for symbol than word distractors. The influences of word frequency and fixation location on first-pass fixation durations on distractors were dramatically different than the well-documented pattern obtained in normal reading. Robust preview benefits were demonstrated both when the critical distractors were fixated and when the critical distractors were skipped. Finally, with the exception of larger preview benefits that were obtained in the condition in which the target and critical distractor were identical, the magnitude of the preview effect was largely unaffected by the nature of the critical distractor. Implications of the present paradigm and findings to the study of eye-movement control in reading are discussed. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Efficient automatic OCR word validation using word partial format derivation and language model
NASA Astrophysics Data System (ADS)
Chen, Siyuan; Misra, Dharitri; Thoma, George R.
2010-01-01
In this paper we present an OCR validation module, implemented for the System for Preservation of Electronic Resources (SPER) developed at the U.S. National Library of Medicine. The module detects and corrects suspicious words in the OCR output of scanned textual documents through a procedure of deriving partial formats for each suspicious word, retrieving candidate words by partial-match search from lexicons, and comparing the joint probabilities of N-gram and OCR edit transformation corresponding to the candidates. The partial format derivation, based on OCR error analysis, efficiently and accurately generates candidate words from lexicons represented by ternary search trees. In our test case comprising a historic medico-legal document collection, this OCR validation module yielded the correct words with 87% accuracy and reduced the overall OCR word errors by around 60%.
The Role of Grammatical Class on Word Recognition
ERIC Educational Resources Information Center
Vigliocco, Gabriella; Vinson, David P.; Arciuli, Joanne; Barber, Horacio
2008-01-01
The double dissociation between noun and verb processing, well documented in the neuropsychological literature, has not been supported in imaging studies. Recent imaging studies, in fact, suggest that once confounding with semantics is eliminated, grammatical class effects only emerge as a consequence of building frames. Here we assess this…
ERIC Educational Resources Information Center
Ellis, Barbara G.; Dick, Steven J.
1996-01-01
Employs the statistics-documentation portion of a word-processing program's grammar-check feature together with qualitative analyses to determine that Henry Watterson, long-time editor of the "Louisville Courier-Journal," was probably the South's famed Civil War correspondent "Shadow." (TB)
Radical Thoughts on Simplifying Square Roots
ERIC Educational Resources Information Center
Schultz, Kyle T.; Bismarck, Stephen F.
2013-01-01
A picture is worth a thousand words. This statement is especially true in mathematics teaching and learning. Visual representations such as pictures, diagrams, charts, and tables can illuminate ideas that can be elusive when displayed in symbolic form only. The prevalence of representation as a mathematical process in such documents as…
75 FR 4375 - Transmission Loading Relief Reliability Standard and Curtailment Priorities
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-27
... Site: http://www.ferc.gov . Documents created electronically using word processing software should be... ensure operation within acceptable reliability criteria. NERC Glossary of Terms Used in Reliability Standards at 19, available at http://www.nerc.com/files/Glossary_12Feb08.pdf (NERC Glossary). An...
Named Entity Recognition in Chinese Clinical Text Using Deep Neural Network.
Wu, Yonghui; Jiang, Min; Lei, Jianbo; Xu, Hua
2015-01-01
Rapid growth in electronic health records (EHRs) use has led to an unprecedented expansion of available clinical data in electronic formats. However, much of the important healthcare information is locked in the narrative documents. Therefore Natural Language Processing (NLP) technologies, e.g., Named Entity Recognition that identifies boundaries and types of entities, have been extensively studied to unlock important clinical information in free text. In this study, we investigated a novel deep learning method to recognize clinical entities in Chinese clinical documents using the minimal feature engineering approach. We developed a deep neural network (DNN) to generate word embeddings from a large unlabeled corpus through unsupervised learning and another DNN for the NER task. The experimental results showed that the DNN with word embeddings trained from the large unlabeled corpus outperformed the state-of-the-art CRF model in the minimal feature engineering setting, achieving the highest F1-score of 0.9280. Further analysis showed that word embeddings derived through unsupervised learning from a large unlabeled corpus remarkably improved on the DNN with randomized embeddings, demonstrating the usefulness of unsupervised feature learning.
39 CFR 3001.10 - Form and number of copies of documents.
Code of Federal Regulations, 2010 CFR
2010-07-01
... service must be printed from a text-based pdf version of the document, where possible. Otherwise, they may... generated in either Acrobat (pdf), Word, or WordPerfect, or Rich Text Format (rtf). [67 FR 67559, Nov. 6...
Different Neural Correlates of Emotion-Label Words and Emotion-Laden Words: An ERP Study
Zhang, Juan; Wu, Chenggang; Meng, Yaxuan; Yuan, Zhen
2017-01-01
It is well-documented that both emotion-label words (e.g., sadness, happiness) and emotion-laden words (e.g., death, wedding) can induce emotion activation. However, the neural correlates of emotion-label word and emotion-laden word recognition have not been examined. The present study aimed to compare the underlying neural responses when processing the two kinds of words by employing event-related potential (ERP) measurements. Fifteen Chinese native speakers were asked to perform a lexical decision task in which they judged whether a two-character compound stimulus was a real word or not. Results showed that (1) emotion-label words and emotion-laden words elicited similar P100 at the posterior sites, (2) larger N170 was found for emotion-label words than for emotion-laden words at the occipital sites on the right hemisphere, and (3) negative emotion-label words elicited larger Late Positivity Complex (LPC) on the right hemisphere than on the left hemisphere while such effect was not found for emotion-laden words and positive emotion-label words. The results indicate that emotion-label words and emotion-laden words elicit different cortical responses at both early (N170) and late (LPC) stages. In addition, a right hemisphere advantage for emotion-label words over emotion-laden words can be observed in certain time windows (i.e., N170 and LPC) but is not detected in another time window (i.e., P100). The implications of the current findings for future emotion research are discussed. PMID:28983242
Sub-word image clustering in Farsi printed books
NASA Astrophysics Data System (ADS)
Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier
2015-02-01
Most OCR systems are designed for the recognition of a single page. In case of unfamiliar font faces, low quality papers and degraded prints, the performance of these products drops sharply. However, an OCR system can use redundancy of word occurrences in large documents to improve recognition results. In this paper, we propose a sub-word image clustering method for the applications dealing with large printed documents. We assume that the whole document is printed by a unique unknown font with low quality print. Our proposed method finds clusters of equivalent sub-word images with an incremental algorithm. Due to the low print quality, we propose an image matching algorithm for measuring the distance between two sub-word images, based on Hamming distance and the ratio of the area to the perimeter of the connected components. We built a ground-truth dataset of more than 111000 sub-word images to evaluate our method. All of these images were extracted from an old Farsi book. We cluster all of these sub-words, including isolated letters and even punctuation marks. Then all centers of created clusters are labeled manually. We show that all sub-words of the book can be recognized with more than 99.7% accuracy by assigning the label of each cluster center to all of its members.
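The matching and clustering steps described in this abstract can be illustrated with a minimal Python sketch. It covers only a normalized Hamming distance between equal-size binary images and a generic "leader" style incremental clustering; the area-to-perimeter term, the `threshold` value, and the rule of comparing against the first member of each cluster are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence

Image = Sequence[Sequence[int]]  # rows of 0/1 pixels (hypothetical representation)


def hamming_distance(a: Image, b: Image) -> float:
    """Fraction of pixel positions at which two equal-size binary images differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same shape")
    total = sum(len(row) for row in a)
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / total if total else 0.0


def incremental_cluster(images: List[Image], threshold: float) -> List[List[int]]:
    """Assign each image to the first cluster whose center is close enough.

    The first member of a cluster acts as its center (an assumption); an image
    that matches no existing center starts a new cluster.
    """
    clusters: List[List[int]] = []
    for index, image in enumerate(images):
        for members in clusters:
            if hamming_distance(images[members[0]], image) <= threshold:
                members.append(index)
                break
        else:
            clusters.append([index])
    return clusters
```

In a workflow like the one the abstract describes, only the cluster centers would then need manual labels, which are propagated to the remaining members.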
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stack, Andrew
Representing the Nanoscale Control of Geologic CO2 (NCGC), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE energy. The mission of NCGC is to build a fundamental understanding of molecular-to-pore-scale processes in fluid-rock systems, and to demonstrate the ability to control critical aspects of flow, transport, and mineralization in porous rock media as applied to the injection and storage of carbon dioxide (CO2) in subsurface reservoirs.
Using ontology network structure in text mining.
Berndt, Donald J; McCart, James A; Luther, Stephen L
2010-11-13
Statistical text mining treats documents as bags of words, with a focus on term frequencies within documents and across document collections. Unlike natural language processing (NLP) techniques that rely on an engineered vocabulary or a full-featured ontology, statistical approaches do not make use of domain-specific knowledge. The freedom from biases can be an advantage, but at the cost of ignoring potentially valuable knowledge. The approach proposed here investigates a hybrid strategy based on computing graph measures of term importance over an entire ontology and injecting the measures into the statistical text mining process. As a starting point, we adapt existing search engine algorithms such as PageRank and HITS to determine term importance within an ontology graph. The graph-theoretic approach is evaluated using a smoking data set from the i2b2 National Center for Biomedical Computing, cast as a simple binary classification task for categorizing smoking-related documents, demonstrating consistent improvements in accuracy.
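As a rough illustration of computing a graph measure of term importance, the following Python sketch runs a plain power-iteration PageRank over a toy ontology graph. The graph, its term names, the edge direction (child to parent), and the parameter values are hypothetical; the abstract does not specify how the authors adapted PageRank or how the scores were injected into the text mining process.

```python
from typing import Dict, List


def pagerank(graph: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank; every edge target must also be a key of graph."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not graph[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for node in nodes:
            for target in graph[node]:
                new_rank[target] += damping * rank[node] / len(graph[node])
        rank = new_rank
    return rank


# Hypothetical ontology fragment: each term points to its parent concept.
toy_ontology = {
    "cigarette": ["tobacco"],
    "cigar": ["tobacco"],
    "tobacco": ["substance"],
    "substance": [],
}
scores = pagerank(toy_ontology)
```

One possible use, again an assumption, is to multiply each term's frequency-based weight by its score before classification so that terms central to the ontology count for more.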
Artificial neural networks for document analysis and recognition.
Marinai, Simone; Gori, Marco; Soda, Giovanni; Society, Computer
2005-01-01
Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters with widely recognized successful results. However, many other document processing tasks, like preprocessing, layout analysis, character segmentation, word recognition, and signature verification, have been effectively faced with very promising results. This paper surveys the most significant problems in the area of offline document image processing, where connectionist-based approaches have been applied. Similarities and differences between approaches belonging to different categories are discussed. A particular emphasis is given on the crucial role of prior knowledge for the conception of both appropriate architectures and learning algorithms. Finally, the paper provides a critical analysis on the reviewed approaches and depicts the most promising research guidelines in the field. In particular, a second generation of connectionist-based models are foreseen which are based on appropriate graphical representations of the learning environment.
Grundmeier, Robert W; Masino, Aaron J; Casper, T Charles; Dean, Jonathan M; Bell, Jamie; Enriquez, Rene; Deakyne, Sara; Chamberlain, James M; Alpern, Elizabeth R
2016-11-09
Important information to support healthcare quality improvement is often recorded in free text documents such as radiology reports. Natural language processing (NLP) methods may help extract this information, but these methods have rarely been applied outside the research laboratories where they were developed. To implement and validate NLP tools to identify long bone fractures for pediatric emergency medicine quality improvement. Using freely available statistical software packages, we implemented NLP methods to identify long bone fractures from radiology reports. A sample of 1,000 radiology reports was used to construct three candidate classification models. A test set of 500 reports was used to validate the model performance. Blinded manual review of radiology reports by two independent physicians provided the reference standard. Each radiology report was segmented and word stem and bigram features were constructed. Common English "stop words" and rare features were excluded. We used 10-fold cross-validation to select optimal configuration parameters for each model. Accuracy, recall, precision and the F1 score were calculated. The final model was compared to the use of diagnosis codes for the identification of patients with long bone fractures. There were 329 unique word stems and 344 bigrams in the training documents. A support vector machine classifier with Gaussian kernel performed best on the test set with accuracy=0.958, recall=0.969, precision=0.940, and F1 score=0.954. Optimal parameters for this model were cost=4 and gamma=0.005. The three classification models that we tested all performed better than diagnosis codes in terms of accuracy, precision, and F1 score (diagnosis code accuracy=0.932, recall=0.960, precision=0.896, and F1 score=0.927). NLP methods using a corpus of 1,000 training documents accurately identified acute long bone fractures from radiology reports. 
Strategic use of straightforward NLP methods, implemented with freely available software, offers quality improvement teams new opportunities to extract information from narrative documents.
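The reported F1 scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The short Python sketch below does that arithmetic; the function name is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract above.
svm_f1 = f1_score(precision=0.940, recall=0.969)
codes_f1 = f1_score(precision=0.896, recall=0.960)
print(round(svm_f1, 3), round(codes_f1, 3))  # prints "0.954 0.927"
```

Both results agree with the F1 scores stated for the support vector machine classifier and for diagnosis codes.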
Words, concepts, or both: optimal indexing units for automated information retrieval.
Hersh, W. R.; Hickam, D. H.; Leone, T. J.
1992-01-01
What is the best way to represent the content of documents in an information retrieval system? This study compares the retrieval effectiveness of five different methods for automated (machine-assigned) indexing using three test collections. The consistently best methods are those that use indexing based on the words that occur in the available text of each document. Methods used to map text into concepts from a controlled vocabulary showed no advantage over the word-based methods. This study also looked at an approach to relevance feedback which showed benefit for both word-based and concept-based methods. PMID:1482951
Design and realization of the compound text-based test questions library management system
NASA Astrophysics Data System (ADS)
Shi, Lei; Feng, Lin; Zhao, Xin
2011-12-01
The test questions library management system is an essential part of an on-line examination system. Its basic requirement is to handle compound text that includes images and formulae and to create the corresponding Word documents. After comparing the two current solutions for creating documents, this paper presents a design for a Word Automation mechanism based on OLE/COM technology, discusses the application of Word Automation in detail, and finally reports operating results of the system, which have high reference value for improving the efficiency of generating project documents and report forms.
Using color management in color document processing
NASA Astrophysics Data System (ADS)
Nehab, Smadar
1995-04-01
Color Management Systems have been used for several years in Desktop Publishing (DTP) environments. While this development hasn't matured yet, we are already experiencing the next generation of the color imaging revolution: Device Independent Color for the small office/home office (SOHO) environment. Though there are still open technical issues with device independent color matching, they are not the focal point of this paper. This paper discusses two new and crucial aspects in using color management in color document processing: the management of color objects and their associated color rendering methods; and a proposal for a precedence order and handshaking protocol among the various software components involved in color document processing. As color peripherals become affordable to the SOHO market, color management also becomes a prerequisite for common document authoring applications such as word processors. The first color management solutions were oriented towards DTP environments whose requirements were largely different. For example, DTP documents are image-centric, as opposed to SOHO documents that are text- and chart-centric. To achieve optimal reproduction on low-cost SOHO peripherals, it is critical that different color rendering methods are used for the different document object types. The first challenge in using color management in color document processing is the association of rendering methods with object types. As a result of an evolutionary process, color matching solutions are now available as application software, as driver embedded software and as operating system extensions. Consequently, document processing faces a new challenge, the correct selection of the color matching solution while avoiding duplicate color corrections.
29 CFR 516.1 - Form of records; scope of regulations.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 29 Labor 3 2010-07-01 2010-07-01 false Form of records; scope of regulations. 516.1 Section 516.1 Labor Regulations Relating to Labor (Continued) WAGE AND HOUR DIVISION, DEPARTMENT OF LABOR REGULATIONS... other basic source document of an automatic word or data processing memory provided that adequate...
Different Words for the Same Concept: Learning Collaboratively from Multiple Documents
ERIC Educational Resources Information Center
Jucks, Regina; Paus, Elisabeth
2013-01-01
This study investigated how varying the lexical encodings of technical terms in multiple texts influences learners' dyadic processing of scientific-related information. Fifty-seven pairs of college students read journalistic texts on depression. Each partner in a dyad received one text; for half of the dyads the partner's text contained different…
To Teach or Not to Teach: The Ethics of Metadata
ERIC Educational Resources Information Center
Barnes, Cynthia; Cavaliere, Frank
2009-01-01
Metadata is information about computer-generated documents that is often inadvertently transmitted to others. The problems associated with metadata have become more acute over time as word processing and other popular programs have become more receptive to the concept of collaboration. As more people become involved in the preparation of…
18 CFR 385.2003 - Specifications (Rule 2003).
Code of Federal Regulations, 2011 CFR
2011-04-01
... governing timeliness, a document filed via the Internet will be deemed to have been received by the... other word or data processing equipment; (2) Have double-spaced lines with left margins not less than 1 1/2 inch wide, except that any tariff or rate filing may be single-spaced; (3) Have indented and...
Establishing the Content Validity of a Basic Computer Literacy Course.
ERIC Educational Resources Information Center
Clements, James; Carifio, James
1995-01-01
Content analysis of 13 textbooks and 2 Department of Education documents was conducted to ascertain common word processing, database, and spreadsheet software skills in order to determine which specific skills should be taught in a high school computer literacy course. Aspects of a basic computer course, created from this analysis, are described.…
ERIC Educational Resources Information Center
Ng, Kwong Bor; Rieh, Soo Young; Kantor, Paul
2000-01-01
Discussion of natural language processing focuses on experiments using linear discriminant analysis to distinguish "Wall Street Journal" texts from "Federal Register" texts using information about the frequency of occurrence of word boundaries, sentence boundaries, and punctuation marks. Displays and interprets results in terms…
Hemispheric differences in the recruitment of semantic processing mechanisms
Kandhadai, Padmapriya; Federmeier, Kara D.
2010-01-01
This study examined how the two cerebral hemispheres recruit semantic processing mechanisms by combining event-related potential measures and visual half-field methods in a word priming paradigm in which semantic strength and predictability were manipulated using lexically associated word pairs. Activation patterns on the Late Positive Complex (LPC), linked to controlled aspects of processing, showed that previously documented left hemisphere (LH) processing benefits for word pairs with a weak forward but strong backward association stem from the ability to appreciate meaning relations in an order-independent fashion and/or strategically reorder them. Whereas there is a LH benefit for such strategic processing during comprehension in passive tasks, the present study further showed that the RH is also able to make use of these mechanisms when explicit semantic judgments are required. In both hemispheres, N400 responses, linked to initial semantic activation, were largely graded by association strength, with more amplitude reduction for forward associates and strong, symmetrically associated pairs compared to backward associates and matched weak, symmetrically associated pairs. However, responses to moderately associated pairs were more facilitated after initial presentation to the LH than to the RH. This pattern converges with sentence processing findings that point to LH advantages for using context information to predict features of likely upcoming words. Together, the results suggest that an important basis for hemispheric asymmetries in language comprehension arises from when and how each uses top-down semantic mechanisms to shape initial semantic activation over time. PMID:20638397
Responding to nonwords in the lexical decision task: Insights from the English Lexicon Project.
Yap, Melvin J; Sibley, Daragh E; Balota, David A; Ratcliff, Roger; Rueckl, Jay
2015-05-01
Researchers have extensively documented how various statistical properties of words (e.g., word frequency) influence lexical processing. However, the impact of lexical variables on nonword decision-making performance is less clear. This gap is surprising, because a better specification of the mechanisms driving nonword responses may provide valuable insights into early lexical processes. In the present study, item-level and participant-level analyses were conducted on the trial-level lexical decision data for almost 37,000 nonwords in the English Lexicon Project in order to identify the influence of different psycholinguistic variables on nonword lexical decision performance and to explore individual differences in how participants respond to nonwords. Item-level regression analyses reveal that nonword response time was positively correlated with number of letters, number of orthographic neighbors, number of affixes, and base-word number of syllables, and negatively correlated with Levenshtein orthographic distance and base-word frequency. Participant-level analyses also point to within- and between-session stability in nonword responses across distinct sets of items, and intriguingly reveal that higher vocabulary knowledge is associated with less sensitivity to some dimensions (e.g., number of letters) but more sensitivity to others (e.g., base-word frequency). The present findings provide well-specified and interesting new constraints for informing models of word recognition and lexical decision. (c) 2015 APA, all rights reserved.
Replacement Attack: A New Zero Text Watermarking Attack
NASA Astrophysics Data System (ADS)
Bashardoost, Morteza; Mohd Rahim, Mohd Shafry; Saba, Tanzila; Rehman, Amjad
2017-03-01
The main objective of zero watermarking methods that are suggested for the authentication of textual properties is to increase the fragility of produced watermarks against tampering attacks. On the other hand, zero watermarking attacks intend to alter the contents of a document without changing the watermark. In this paper, the Replacement attack is proposed, which focuses on maintaining the location of the words in the document. The proposed text watermarking attack is specifically effective on watermarking approaches that exploit word transitions in the document. The evaluation outcomes show that the tested word-based methods are unable to detect the existence of a Replacement attack in the document. Moreover, the comparison results show that the size of a Replacement attack is estimated less accurately than that of other common types of zero text watermarking attacks.
Discrete Emotion Effects on Lexical Decision Response Times
Briesemeister, Benny B.; Kuchinke, Lars; Jacobs, Arthur M.
2011-01-01
Our knowledge about affective processes, especially concerning effects on cognitive demands like word processing, is increasing steadily. Several studies consistently document valence and arousal effects, and although there is some debate on possible interactions and different notions of valence, broad agreement on a two dimensional model of affective space has been achieved. Alternative models like the discrete emotion theory have received little interest in word recognition research so far. Using backward elimination and multiple regression analyses, we show that five discrete emotions (i.e., happiness, disgust, fear, anger and sadness) explain as much variance as two published dimensional models assuming continuous or categorical valence, with the variables happiness, disgust and fear significantly contributing to this account. Moreover, these effects even persist in an experiment with discrete emotion conditions when the stimuli are controlled for emotional valence and arousal levels. We interpret this result as evidence for discrete emotion effects in visual word recognition that cannot be explained by the two dimensional affective space account. PMID:21887307
Contribution to Terminology Internationalization by Word Alignment in Parallel Corpora
Deléger, Louise; Merkel, Magnus; Zweigenbaum, Pierre
2006-01-01
Background and objectives: Creating a complete translation of a large vocabulary is a time-consuming task, which requires skilled and knowledgeable medical translators. Our goal is to examine to which extent such a task can be alleviated by a specific natural language processing technique, word alignment in parallel corpora. We experiment with translation from English to French. Methods: Build a large corpus of parallel, English-French documents, and automatically align it at the document, sentence and word levels using state-of-the-art alignment methods and tools. Then project English terms from existing controlled vocabularies to the aligned word pairs, and examine the number and quality of the putative French translations obtained thereby. We considered three American vocabularies present in the UMLS with three different translation statuses: the MeSH, SNOMED CT, and the MedlinePlus Health Topics. Results: We obtained several thousand new translations of our input terms, this number being closely linked to the number of terms in the input vocabularies. Conclusion: Our study shows that alignment methods can extract a number of new term translations from large bodies of text with a moderate human reviewing effort, and thus contribute to help a human translator obtain better translation coverage of an input vocabulary. Short-term perspectives include their application to a corpus 20 times larger than that used here, together with more focused methods for term extraction. PMID:17238328
Automatic generation of stop word lists for information retrieval and analysis
Rose, Stuart J
2013-01-08
Methods and systems for automatically generating lists of stop words for information retrieval and analysis. Generation of the stop words can include providing a corpus of documents and a plurality of keywords. From the corpus of documents, a term list of all terms is constructed and both a keyword adjacency frequency and a keyword frequency are determined. If a ratio of the keyword adjacency frequency to the keyword frequency for a particular term on the term list is less than a predetermined value, then that term is excluded from the term list. The resulting term list is truncated based on predetermined criteria to form a stop word list.
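The procedure summarized in this abstract can be sketched in Python as follows. Several details are assumptions made for illustration, since the abstract does not define them: whitespace tokenization, "keyword adjacency frequency" counted as occurrences immediately before or after a keyword match, "keyword frequency" counted as occurrences inside a keyword match, the handling of terms that never occur inside a keyword, and truncation by overall term frequency. The names `min_ratio` and `max_size` are hypothetical.

```python
from collections import Counter
from typing import List


def generate_stop_words(documents: List[str], keywords: List[str],
                        min_ratio: float = 1.0, max_size: int = 10) -> List[str]:
    """Keep terms seen next to keywords more often than inside them."""
    term_freq: Counter = Counter()
    adjacency_freq: Counter = Counter()  # occurrences directly beside a keyword
    keyword_freq: Counter = Counter()    # occurrences inside a keyword
    keyword_tokens = [kw.lower().split() for kw in keywords if kw.strip()]

    for document in documents:
        tokens = document.lower().split()
        term_freq.update(tokens)
        for kw in keyword_tokens:
            k = len(kw)
            for start in range(len(tokens) - k + 1):
                if tokens[start:start + k] != kw:
                    continue
                keyword_freq.update(kw)
                if start > 0:
                    adjacency_freq[tokens[start - 1]] += 1
                if start + k < len(tokens):
                    adjacency_freq[tokens[start + k]] += 1

    kept = []
    for term, freq in term_freq.items():
        adjacent, inside = adjacency_freq[term], keyword_freq[term]
        if inside:
            ratio = adjacent / inside
        else:
            # Assumption: a term never inside a keyword is kept only if it
            # was adjacent to a keyword at least once.
            ratio = float("inf") if adjacent else 0.0
        if ratio >= min_ratio:
            kept.append((freq, term))

    # Assumed truncation criterion: the most frequent surviving terms.
    kept.sort(key=lambda item: (-item[0], item[1]))
    return [term for _, term in kept[:max_size]]
```

For example, with the documents "the grid of the network" and "analysis of the grid" and the keywords "grid" and "network", this sketch returns `["the", "of"]`.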
75 FR 12001 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-12
... Source Categories. In routine Use 27, we inadvertently omitted the words ``in writing''. This document..., paragraph 27, in the third line after the words ``verbally or'', add the words ``in writing''. Approved...
The Fluid Reading Primer: Animated Decoding Support for Emergent Readers.
ERIC Educational Resources Information Center
Zellweger, Polle T.; Mackinlay, Jock D.
A prototype application called the Fluid Reading Primer was developed to help emergent readers with the process of decoding written words into their spoken forms. The Fluid Reading Primer is part of a larger research project called Fluid Documents, which is exploring the use of interactive animation of typography to show additional information in…
Apple IIe Computers and Appleworks Training Mini Course Materials.
ERIC Educational Resources Information Center
Schlenker, Richard M.
The instructional materials included in this document are designed to introduce students to the Apple IIe computer and to the word processing and database portions of the AppleWorks program. The materials are intended for small groups of students, each of whom has use of a computer during class and for short periods between classes. The course…
Federal Register 2010, 2011, 2012, 2013, 2014
2012-07-05
...'' as used in the NERC Glossary. \\25\\ Id. at 15. \\26\\ Id. at 16. 16. NERC also explains that, while the...: Through http://www.ferc.gov . Documents created electronically using word processing software should be...'s Glossary of Terms Used in Reliability Standards (NERC Glossary) developed by the North American...
Maguire, Mandy J; Schneider, Julie M; Middleton, Anna E; Ralph, Yvonne; Lopez, Michael; Ackerman, Robert A; Abel, Alyson D
2018-02-01
The relationship between children's slow vocabulary growth and the family's low socioeconomic status (SES) has been well documented. However, previous studies have often focused on infants or preschoolers and primarily used static measures of vocabulary at multiple time points. To date, there is no research investigating whether SES predicts a child's word learning abilities in grade school and, if so, what mediates this relationship. In this study, 68 children aged 8-15 years performed a written word learning from context task that required using the surrounding text to identify the meaning of an unknown word. Results revealed that vocabulary knowledge significantly mediated the relationship between SES (as measured by maternal education) and word learning. This was true despite the fact that the words in the linguistic context surrounding the target word are typically acquired well before 8 years of age. When controlling for vocabulary, word learning from written context was not predicted by differences in reading comprehension, decoding, or working memory. These findings reveal that differences in vocabulary growth between grade school children from low and higher SES homes are likely related to differences in the process of word learning more than knowledge of surrounding words or reading skills. Specifically, children from lower SES homes are not as effective at using known vocabulary to build a robust semantic representation of incoming text to identify the meaning of an unknown word. Copyright © 2017 Elsevier Inc. All rights reserved.
Identification of misspelled words without a comprehensive dictionary using prevalence analysis.
Turchin, Alexander; Chu, Julia T; Shubina, Maria; Einbinder, Jonathan S
2007-10-11
Misspellings are common in medical documents and can be an obstacle to information retrieval. We evaluated an algorithm to identify misspelled words through analysis of their prevalence in a representative body of text. We evaluated the algorithm's accuracy of identifying misspellings of 200 anti-hypertensive medication names on 2,000 potentially misspelled words randomly selected from narrative medical documents. Prevalence ratios (the frequency of the potentially misspelled word divided by the frequency of the non-misspelled word) in physician notes were computed by the software for each of the words. The software results were compared to the manual assessment by an independent reviewer. Area under the ROC curve for identification of misspelled words was 0.96. Sensitivity, specificity, and positive predictive value were 99.25%, 89.72% and 82.9% for the prevalence ratio threshold (0.32768) with the highest F-measure (0.903). Prevalence analysis can be used to identify and correct misspellings with high accuracy.
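The prevalence ratio defined above can be illustrated with a short sketch. It assumes that word counts are already available, that each candidate has been paired with a correctly spelled reference word by some other means, and that a ratio below the threshold marks a probable misspelling. The function names are hypothetical, and the default threshold is simply the value quoted in the abstract.

```python
from collections import Counter

def prevalence_ratio(counts, candidate, reference):
    """Frequency of the candidate divided by frequency of the reference word."""
    reference_count = counts[reference]
    if reference_count == 0:
        return None  # ratio undefined when the reference never occurs
    return counts[candidate] / reference_count

def is_probable_misspelling(counts, candidate, reference, threshold=0.32768):
    # Assumption: a candidate much rarer than its reference is a misspelling.
    ratio = prevalence_ratio(counts, candidate, reference)
    return ratio is not None and ratio < threshold
```

With counts of 4 for a candidate and 200 for its reference, the ratio is 0.02 and the candidate would be flagged under this assumption.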
Computer program documentation: Raw-to-processed SINDA program (RTOPHS) user's guide
NASA Technical Reports Server (NTRS)
Damico, S. J.
1980-01-01
Use of the Raw to Processed SINDA (System Improved Numerical Differencing Analyzer) Program, RTOPHS, which provides a means of making the temperature prediction data on binary HSTFLO and HISTRY units generated by SINDA available to engineers in an easy to use format, is discussed. The program accomplishes this by reading the HISTRY unit and, according to user input instructions, extracting the desired times and temperature prediction data and writing them to a word addressable drum file.
Small Business Innovations (Automated Information)
NASA Technical Reports Server (NTRS)
1992-01-01
Bruce G. Jackson & Associates Document Director is an automated tool that combines word processing and database management technologies to offer the flexibility and convenience of text processing with the linking capability of database management. Originally developed for NASA, it provides a means to collect and manage information associated with requirements development. The software system was used by NASA in the design of the Assured Crew Return Vehicle, as well as by other government and commercial organizations including the Southwest Research Institute.
Relational Learning via Collective Matrix Factorization
2008-06-01
well-known example of such a schema is pLSI-pHITS [13], which models document-word counts and document-document citations: E1 = words and E2 = E3...relational co-clustering include pLSI, pLSI-pHITS, the symmetric block models of Long et al. [23, 24, 25], and Bregman tensor clustering [5] (which can...to pLSI-pHITS In this section we provide an example where the additional flexibility of collective matrix factorization leads to better results; and
The Implementation of Cosine Similarity to Calculate Text Relevance between Two Documents
NASA Astrophysics Data System (ADS)
Gunawan, D.; Sembiring, C. A.; Budiman, M. A.
2018-03-01
The rapidly increasing number of web pages and documents calls for topic-specific filtering in order to find web pages or documents efficiently. This preliminary research uses cosine similarity to compute text relevance in order to find topic-specific documents. The research is divided into three parts. The first part is text preprocessing: the punctuation in a document is removed, the document is converted to lower case, stop words are removed, and the root word is extracted using the Porter stemming algorithm. The second part is keyword weighting, whose output is used by the third part, the text relevance calculation. The text relevance calculation yields a value between 0 and 1. The closer the value is to 1, the more related the two documents are, and vice versa.
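A minimal sketch of the pipeline this abstract outlines is given below. It is not the authors' implementation: keyword weighting is reduced to raw term counts, the stop word list is a tiny illustrative set, and Porter stemming is left out to keep the example dependency-free.

```python
import math
import string
from collections import Counter

# Illustrative stop word list (an assumption, not the one used in the paper).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}

def preprocess(text):
    """Strip punctuation, lower-case, split on whitespace, drop stop words."""
    table = str.maketrans("", "", string.punctuation)
    tokens = text.translate(table).lower().split()
    return [t for t in tokens if t not in STOP_WORDS]

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-count vectors of two texts."""
    a = Counter(preprocess(text_a))
    b = Counter(preprocess(text_b))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as unrelated
    return dot / (norm_a * norm_b)
```

Because term counts are non-negative, the result always falls between 0 and 1, matching the range stated in the abstract.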
NASA Technical Reports Server (NTRS)
1994-01-01
This booklet provides a partial list of acronyms, abbreviations, and other short word forms, including their definitions, used in documents at the Goddard Space Flight Center (GSFC). This list does not preclude the use of other short forms of less general usage, as long as these short forms are identified the first time they appear in a document and are defined in a glossary in the document in which they are used. This document supplements information in the GSFC Scientific and Technical Information Handbook (GHB 2200.2/April 1989). It is not intended to contain all short word forms used in GSFC documents; however, it was compiled from actual short forms used in recent GSFC documents. The entries are listed first alphabetically by the short form, and then again alphabetically by definition.
Dispaldro, Marco; Deevy, Patricia; Altoé, Gianmarco; Benelli, Beatrice; Leonard, Laurence B.
2013-01-01
Background: Although relationships among non-word repetition, real-word repetition and grammatical ability have been documented, it is important to study whether the specific nature of these relationships is tied to the characteristics of a given language. Aims: The aim of this study is to explore the potential cross-linguistic differences (Italian and English) in the relationship among non-word repetition, real-word repetition, and grammatical ability in three- and four-year-old children with typical language development. Methods & Procedures: To reach this goal, two repetition tasks (one real-word list and one non-word list for each language) were used. In Italian the grammatical categories were the third person plural inflection and the direct-object clitic pronouns, while in English they were the third person singular present tense inflection and the past tense in regular and irregular forms. Outcomes & Results: A cross-linguistic comparison showed that in both Italian and English, non-word repetition was a significant predictor of grammatical ability. However, performance on real-word repetition explained children’s grammatical ability in Italian but not in English. Conclusions & Implications: Abilities underlying non-word repetition performance (e.g., the processing and/or storage of phonological material) play an important role in the development of children’s grammatical abilities in both languages. Lexical ability (indexed by real-word repetition) showed a close relationship to grammatical ability in Italian but not in English. Implications of the findings are discussed in terms of cross-linguistic differences, genetic research, clinical intervention and methodological issues. PMID:21899673
Dispaldro, Marco; Deevy, Patricia; Altoé, Gianmarco; Benelli, Beatrice; Leonard, Laurence B
2011-01-01
Although relationships among non-word repetition, real-word repetition and grammatical ability have been documented, it is important to study whether the specific nature of these relationships is tied to the characteristics of a given language. The aim of this study is to explore the potential cross-linguistic differences (Italian and English) in the relationship among non-word repetition, real-word repetition, and grammatical ability in three-and four-year-old children with typical language development. To reach this goal, two repetition tasks (one real-word list and one non-word list for each language) were used. In Italian the grammatical categories were the third person plural inflection and the direct-object clitic pronouns, while in English they were the third person singular present tense inflection and the past tense in regular and irregular forms. A cross-linguistic comparison showed that in both Italian and English, non-word repetition was a significant predictor of grammatical ability. However, performance on real-word repetition explained children's grammatical ability in Italian but not in English. Abilities underlying non-word repetition performance (e.g., the processing and/or storage of phonological material) play an important role in the development of children's grammatical abilities in both languages. Lexical ability (indexed by real-word repetition) showed a close relationship to grammatical ability in Italian but not in English. Implications of the findings are discussed in terms of cross-linguistic differences, genetic research, clinical intervention and methodological issues. © 2011 Royal College of Speech & Language Therapists.
Visualization and Analysis of Geology Word Vectors for Efficient Information Extraction
NASA Astrophysics Data System (ADS)
Floyd, J. S.
2016-12-01
When a scientist begins studying a new geographic region of the Earth, they frequently begin by gathering relevant scientific literature in order to understand what is known, for example, about the region's geologic setting, structure, stratigraphy, and tectonic and environmental history. Experienced scientists typically know what keywords to seek and understand that if a document contains one important keyword, then other words in the document may be important as well. Word relationships in a document give rise to what is known in linguistics as the context-dependent nature of meaning. For example, the meaning of the word 'strike' in geology, as in the strike of a fault, is quite different from its popular meaning in baseball. In addition, word order, such as in the phrase 'Cretaceous-Tertiary boundary,' often corresponds to the order of sequences in time or space. The context of words and the relevance of words to each other can be derived quantitatively by machine learning vector representations of words. Here we show the results of training a neural network to create word vectors from scientific research papers from selected rift basins and mid-ocean ridges: the Woodlark Basin of Papua New Guinea, the Hess Deep rift, and the Gulf of Mexico basin. The word vectors are statistically defined by surrounding words within a given window, limited by the length of each sentence. The word vectors are analyzed by their cosine distance to related words (e.g., 'axial' and 'magma'), classified by high dimensional clustering, and visualized by reducing the vector dimensions and plotting the vectors on a two- or three-dimensional graph. Similarity analysis of 'Triassic' and 'Cretaceous' returns 'Jurassic' as the nearest word vector, suggesting that the model is capable of learning the geologic time scale. Similarity analysis of 'basalt' and 'minerals' automatically returns mineral names such as 'chlorite', 'plagioclase,' and 'olivine.'
Word vector analysis and visualization allow one to extract information from hundreds of papers or more and find relationships in less time than it would take to read all of the papers. As machine learning tools become more commonly available, more and more scientists will be able to use and refine these tools for their individual needs.
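The similarity analysis mentioned in this abstract can be illustrated with a small sketch. It assumes that a multi-word query is handled by averaging the query vectors and ranking the remaining words by cosine similarity; the abstract does not say how the queries were combined. The three-dimensional vectors are made-up toy values, not output from the trained model.

```python
import math

# Toy vectors invented for illustration only.
TOY_VECTORS = {
    "triassic":   [0.9, 0.1, 0.0],
    "jurassic":   [0.8, 0.2, 0.0],
    "cretaceous": [0.7, 0.3, 0.0],
    "basalt":     [0.0, 0.2, 0.9],
    "olivine":    [0.1, 0.1, 0.8],
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_word(vectors, query_words):
    """Return the non-query word closest to the mean of the query vectors."""
    dims = len(next(iter(vectors.values())))
    mean = [sum(vectors[w][i] for w in query_words) / len(query_words)
            for i in range(dims)]
    candidates = (w for w in vectors if w not in query_words)
    return max(candidates, key=lambda w: cosine(vectors[w], mean))
```

With real embeddings the same ranking step would run over the full vocabulary rather than five hand-picked words.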
Information Security Program Regulation
1986-06-01
above. When an alarmed area is used for the storage of Top Secret material, the physical barrier must be adequate to prevent (a) surreptitious removal ...IV-9 4-304 Removable ADP and Word Processing Storage Media ---------- IV-10 4-305 Documents Produced by ADP Equipment...with a removal or cancellation of the classification designation. 1-315 Declassification Event An event that eliminates the need for continued
"Say Just One Word at First": The Emergence of Reliable Speech in a Student Labeled with Autism.
ERIC Educational Resources Information Center
Broderick, Alicia A.; Kasa-Hendrickson, Christi
2001-01-01
A qualitative study documents the emergence, in the context of typed expression, of increasingly useful and reliable speech for an adolescent labeled with autism. The process of speech development is described, and the three categories of support that he and his family experienced as aiding his emergent speech are discussed. (Contains…
ERIC Educational Resources Information Center
Devauchelle, Anne-Dominique; Oppenheim, Catherine; Rizzi, Luigi; Dehaene, Stanislas; Pallier, Christophe
2009-01-01
Priming effects have been well documented in behavioral psycholinguistics experiments: The processing of a word or a sentence is typically facilitated when it shares lexico-semantic or syntactic features with a previously encountered stimulus. Here, we used fMRI priming to investigate which brain areas show adaptation to the repetition of a…
The New York City Subways: The First Ten Years. A Library Research Exercise Using a Computer.
ERIC Educational Resources Information Center
Machalow, Robert
This document presents a library research exercise developed at York College which uses the Apple IIe microcomputer and word processing software--the Applewriter--to teach library research skills. Unlike some other library research exercises on disk, this program allows the student to decide on alternative approaches to solving the given problem:…
Sohn, Sunghwan; Wang, Yanshan; Wi, Chung-Il; Krusemark, Elizabeth A; Ryu, Euijung; Ali, Mir H; Juhn, Young J; Liu, Hongfang
2017-11-30
To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. Birth cohorts from Mayo Clinic and Sanford Children's Hospital (SCH) were used in this study (n = 298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in various aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement. There exist notable lexical variations (word-level similarity = 0.669) and process variations (differences in major note types containing asthma-related concepts). However, semantic-level corpora were relatively homogeneous (topic similarity = 0.944, asthma-related concept similarity = 0.971). The NLP system for asthma ascertainment had an F-score of 0.937 at Mayo, and produced 0.813 (prototype) and 0.908 (refinement) when applied at SCH. The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important to estimate NLP system portability. As the Mayo Clinic and SCH corpora were relatively homogeneous at a semantic level, the NLP system, developed at Mayo Clinic, was imported to SCH successfully with proper adjustments to deal with the intrinsic corpus heterogeneity. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Whittier Tunnel, Transportation & Public Facilities, State of Alaska
ONLINE (or choose to download in Adobe PDF or Excel format) Summer May 1 - Sept 30 PDF document | Excel document Winter Oct 1 - Apr 30 PDF document | Excel document Current Regulations: PDF document | Word
ERIC Educational Resources Information Center
Pavelko, Stacey L.; Owens, Robert E., Jr.
2017-01-01
Purpose: The purpose of this study was to document whether mean length of utterance (MLU[subscript S]), total number of words (TNW), clauses per sentence (CPS), and/or words per sentence (WPS) demonstrated age-related changes in children with typical language and to document the average time to collect, transcribe, and analyze conversational…
ERIC Educational Resources Information Center
Hendrix, Peter; Bolger, Patrick; Baayen, Harald
2017-01-01
Recent studies have documented frequency effects for word n-grams, independently of word unigram frequency. Further studies have revealed constructional prototype effects, both at the word level as well as for phrases. The present speech production study investigates the time course of these effects for the production of prepositional phrases in…
Development of First-Graders' Word Reading Skills: For Whom Can Dynamic Assessment Tell Us More?
ERIC Educational Resources Information Center
Cho, Eunsoo; Compton, Donald L.; Gilbert, Jennifer K.; Steacy, Laura M.; Collins, Alyson A.; Lindström, Esther R.
2017-01-01
Dynamic assessment (DA) of word reading measures learning potential for early reading development by documenting the amount of assistance needed to learn how to read words with unfamiliar orthography. We examined the additive value of DA for predicting first-grade decoding and word recognition development while controlling for autoregressive…
SciReader enables reading of medical content with instantaneous definitions.
Gradie, Patrick R; Litster, Megan; Thomas, Rinu; Vyas, Jay; Schiller, Martin R
2011-01-25
A major problem patients encounter when reading about health related issues is document interpretation, which limits reading comprehension and therefore negatively impacts health care. Currently, searching for medical definitions from an external source is time consuming, distracting, and negatively impacts reading comprehension and memory of the material. SciReader was built as a Java application with a Flex-based front-end client. The dictionary used by SciReader was built by consolidating data from several sources and generating new definitions with a standardized syntax. The application was evaluated by measuring the percentage of words defined in different documents. A survey was used to test the perceived effect of SciReader on reading time and comprehension. We present SciReader, a web-application that simplifies document interpretation by allowing users to instantaneously view medical, English, and scientific definitions as they read any document. This tool reveals the definitions of any selected word in a small frame at the top of the application. SciReader relies on a dictionary of ~750,000 unique Biomedical and English word definitions. Evaluation of the application shows that it maps ~98% of words in several different types of documents and that most users tested in a survey indicate that the application decreases reading time and increases comprehension. SciReader is a web application useful for reading medical and scientific documents. The program makes jargon-laden content more accessible to patients, educators, health care professionals, and the general public.
About the unidirectionality of interference: insight from the musical Stroop effect.
Grégoire, Laurent; Perruchet, Pierre; Poulin-Charronnat, Bénédicte
2014-01-01
The asymmetry of interference in a Stroop task usually refers to the well-documented result that incongruent colour words slow colour naming (Stroop effect) but incongruent colours do not slow colour word reading (no reverse Stroop effect). A few other studies have suggested that, more generally, a reverse Stroop effect can be occasionally observed but at the expense of the Stroop effect itself, as if interference was inherently unidirectional, from the stronger to the weaker of the two competing processes. We describe here a situation conducive to a pervasive mutual interference effect. Musicians were exposed to congruent and incongruent note name/note position patterns, and they were asked either to read the word while ignoring the location of the note within the staff, or to name the note while ignoring the note name written inside the note picture. Most of the participants exhibited interference in the two tasks. Overall, this result pattern runs against the still prevalent model of the Stroop phenomenon [Cohen, J. D., Dunbar, K., & McClelland, J. L. (1990). On the control of automatic processes: A parallel distributed processing account of the Stroop effect. Psychological Review, 97(3), 332-361]. However, further analyses lend support to one of the key tenets of the model, namely that the pattern of interference depends on the relative strength of the two competing pathways. The reasons for the impressive differences between the results collected in the present study and in the standard colour-word (or picture-word) paradigms are also examined. We suggest that these differences reveal the importance of stimulus-response contingency in the formation of automatisms.
Assessing semantic similarity of texts - Methods and algorithms
NASA Astrophysics Data System (ADS)
Rozeva, Anna; Zerkova, Silvia
2017-12-01
Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which produce a structured, representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text-mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and better captures the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space as well as similarity calculation are presented.
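The dimensionality reduction behind LSA can be sketched with a truncated singular value decomposition. This is a generic illustration using NumPy, not the algorithm presented in the paper; the term-document matrix is a toy example and the choice of two latent dimensions is arbitrary.

```python
import numpy as np

def lsa_document_vectors(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    # term_doc has one row per term and one column per document.
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Each row of the result is one document in the reduced space.
    return (np.diag(s[:k]) @ vt[:k, :]).T

def cosine(u, v):
    """Cosine similarity of two NumPy vectors."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Toy matrix. Rows: cat, dog, pet, stock, market. Columns: three documents.
TERM_DOC = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
], dtype=float)
```

In this toy case the first two documents share only the term "pet", so their raw cosine similarity is 0.5, but in the two-dimensional latent space they coincide, while the third document stays orthogonal to both.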
Numerical Algorithms for the Analysis of Expert Opinions Elicited in Text Format
2013-04-01
generative process just described actually means. No “real” intelligible document will ever be generated from such a process, rather, a bag of words...Linguistics, 1992, pp. 977–981. Bra89. N. Bratchell, Cluster analysis, Chemometrics and Intelligent Laboratory Systems 6 (1989), 105–125. Bre99. P...with graphical models, Journal of Artificial Intelligence Research 2 96 (1994), 159–225. UNCLASSIFIED 39 DSTO–TR–2797 UNCLASSIFIED Bun02. Wray L
Exploiting domain information for Word Sense Disambiguation of medical documents.
Stevenson, Mark; Agirre, Eneko; Soroa, Aitor
2012-01-01
Current techniques for knowledge-based Word Sense Disambiguation (WSD) of ambiguous biomedical terms rely on relations in the Unified Medical Language System Metathesaurus but do not take into account the domain of the target documents. The authors' goal is to improve these methods by using information about the topic of the document in which the ambiguous term appears. The authors proposed and implemented several methods to extract lists of key terms associated with Medical Subject Heading terms. These key terms are used to represent the document topic in a knowledge-based WSD system. They are applied both alone and in combination with local context. A standard measure of accuracy was calculated over the set of target words in the widely used National Library of Medicine WSD dataset. The authors report a significant improvement when combining those key terms with local context, showing that domain information improves the results of a WSD system based on the Unified Medical Language System Metathesaurus alone. The best results were obtained using key terms obtained by relevance feedback and weighted by inverse document frequency.
Exploiting domain information for Word Sense Disambiguation of medical documents
Agirre, Eneko; Soroa, Aitor
2011-01-01
Objective: Current techniques for knowledge-based Word Sense Disambiguation (WSD) of ambiguous biomedical terms rely on relations in the Unified Medical Language System Metathesaurus but do not take into account the domain of the target documents. The authors' goal is to improve these methods by using information about the topic of the document in which the ambiguous term appears. Design: The authors proposed and implemented several methods to extract lists of key terms associated with Medical Subject Heading terms. These key terms are used to represent the document topic in a knowledge-based WSD system. They are applied both alone and in combination with local context. Measurements: A standard measure of accuracy was calculated over the set of target words in the widely used National Library of Medicine WSD dataset. Results and discussion: The authors report a significant improvement when combining those key terms with local context, showing that domain information improves the results of a WSD system based on the Unified Medical Language System Metathesaurus alone. The best results were obtained using key terms obtained by relevance feedback and weighted by inverse document frequency. PMID:21900701
ERIC Educational Resources Information Center
Velasco, Kelly; Zizak, Amanda
This report describes a program for improving word analysis skills in order to increase sight reading, reading accuracy, and fluency. The targeted population consisted of second and third graders in a suburban area close to a large metropolitan city in a Midwestern state. The problems of low word analysis skills were documented through Qualitative…
RDBMS Based Lexical Resource for Indian Heritage: The Case of Mahābhārata
NASA Astrophysics Data System (ADS)
Mani, Diwakar
The paper describes a lexical resource in the form of a relational-database-based indexing system for Sanskrit documents, with the Mahābhārata (MBh) as an example. The system is available online at http://sanskrit.jnu.ac.in/mb with input and output in Devanāgarī Unicode, using technologies such as RDBMS and Java Servlet. The system works as an interactive and multi-dimensional indexing system with a search facility for the MBh and has potential for use as a generic system for all Sanskrit texts of similar structure. Currently, the system offers three types of search: 'Direct Search', 'Alphabetical Search' and 'Search by Classes'. The input triggers an indexing process by which a temporary index is created for the search string; clicking on any indexed word then displays the details for that word and also offers a facility to search for that word in some other online lexical resources.
ERIC Educational Resources Information Center
LoGerfo, Emanuele; Oliveri, Massimiliano; Torriero, Sara; Salerno, Silvia; Koch, Giacomo; Caltagirone, Carlo
2008-01-01
We investigated the differential role of two frontal regions in the processing of grammatical and semantic knowledge. Given the documented specificity of the prefrontal cortex for the grammatical class of verbs, and of the primary motor cortex for the semantic class of action words, we sought to investigate whether the prefrontal cortex is also…
Medical document anonymization with a semantic lexicon.
Ruch, P.; Baud, R. H.; Rassinoux, A. M.; Bouillon, P.; Robert, G.
2000-01-01
We present an original system for locating and removing personally-identifying information in patient records. In this experiment, anonymization is seen as a particular case of knowledge extraction. We use natural language processing tools provided by the MEDTAG framework: a semantic lexicon specialized in medicine, and a toolkit for word-sense and morpho-syntactic tagging. The system finds 98-99% of all personally-identifying information. PMID:11079980
NASA Astrophysics Data System (ADS)
Rahman, Fuad; Tarnikova, Yuliya; Hartono, Rachmat; Alam, Hassan
2006-01-01
This paper presents a novel automatic web publishing solution, PageView (R). PageView (R) is a complete working solution for document processing and management. The principal aim of this tool is to allow workgroups to share, access and publish documents on-line on a regular basis. For example, assume that a person is working on some documents. The user will, in some fashion, organize this work either in a local directory or on a shared network drive. Now extend that concept to a workgroup. Within a workgroup, some users are working together on some documents, and they are saving them in a directory structure somewhere on a document repository. The next stage of this reasoning is that a workgroup is working on some documents and wants to publish them routinely on-line. The members may be using different editing tools, different software, and different graphics tools, so the resulting documents may be in PDF, Microsoft Office (R), HTML, or WordPerfect format, just to name a few. In general, the documents first need to be converted to HTML, and then a web designer needs to work on that collection to make them available on-line. PageView (R) takes care of this whole process automatically, making the document workflow clean and easy to follow. PageView (R) Server publishes documents, complete with the directory structure, for online use. The documents are automatically converted to HTML and PDF so that users can view the content without downloading the original files or browser plug-ins. Once published, other users can access the documents as if they were accessing them from their local folders. The paper describes the complete working system and discusses possible applications within document management research.
ERIC Educational Resources Information Center
Kercood, Suneeta; Zentall, Sydney S.; Vinh, Megan; Tom-Wright, Kinsey
2012-01-01
The purpose of this theoretically-based study was to examine the effects of yellow-highlighting "relevant" words and units within math word problems. Initial differences were documented between 10 girls at-risk for ADHD and 10 comparisons on the performance of group and individual assessments of math computations and word problems, as had…
Borkowski, A; Lee, D H; Sydnor, D L; Johnson, R J; Rabinovitch, A; Moore, G W
2001-01-01
The Pathology and Laboratory Medicine Service of the Veterans Affairs Maryland Health Care System is inspected biannually by the College of American Pathologists (CAP). As of the year 2000, all documentation in the Anatomic Pathology Section is available to all staff through the VA Intranet. Signed, supporting paper documents are on file in the office of the department chair. For the year 2000 CAP inspection, inspectors conducted their document review by use of these Web-based documents, in which each CAP question had a hyperlink to the corresponding section of the procedure manual. Thus inspectors were able to locate the documents relevant to each question quickly and efficiently. The procedure manuals consist of 87 procedures for surgical pathology, 52 procedures for cytopathology, and 25 procedures for autopsy pathology. Each CAP question requiring documentation had from one to three hyperlinks to the corresponding section of the procedure manual. Intranet documentation allows for easier sharing among decentralized institutions and for centralized updates of the laboratory documentation. These documents can be upgraded to allow for multimedia presentations, including text search for key words, hyperlinks to other documents, and images, audio, and video. Use of Web-based documents can improve the efficiency of the inspection process.
TurboTech Technical Evaluation Automated System
NASA Technical Reports Server (NTRS)
Tiffany, Dorothy J.
2009-01-01
TurboTech software is a Web-based process that simplifies and semiautomates technical evaluation of NASA proposals for Contracting Officer's Technical Representatives (COTRs). At the time of this reporting, there have been no set standards or systems for training new COTRs in technical evaluations. This new process provides boilerplate text in response to interview-style questions. This text is collected into a Microsoft Word document that can then be further edited to conform to specific cases. By providing technical language and a structured format, TurboTech allows the COTRs to concentrate more on the actual evaluation, and less on deciding what language would be most appropriate. Since the actual word choice is one of the more time-consuming parts of a COTR's job, this process should allow for an increase in the quantity of proposals evaluated. TurboTech is applicable to composing technical evaluations of contractor proposals, task and delivery orders, change order modifications, requests for proposals, new work modifications, task assignments, as well as any changes to existing contracts.
Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet Allocation
2013-09-01
an image can be used to improve automated image annotation performance over existing generalized annotators. Second, image annotations can be used...the other variables. The first ratio in the sampling Equation 2.18 uses word frequency by total words, φ̂_j^(w). The second ratio divides word...topics by total words in that document, θ̂_j^(d). Both leave out the current assignment of z_i and the results are used to randomly choose a new topic
California State Spelling Championship Word Lists [and Spelling Bee Planning Information].
ERIC Educational Resources Information Center
Sonoma County Superintendent of Schools, Santa Rosa, CA.
This two-part document contains a spelling word list compiled by the Sonoma County Superintendent of Schools (California) for use in the California State Elementary Spelling Championship competition, along with information for planning and conducting spelling bees. The spelling word list (also intended for use in the regional competitions) is a…
Evidence on Tips for Supporting Reading Skills at Home
ERIC Educational Resources Information Center
What Works Clearinghouse, 2018
2018-01-01
This document begins by providing four tips parents and caretakers can use to support children's reading skills at home: (1) Have conversations before, during, and after reading together; (2) Help children learn how to break sentences into words and words into syllables; (3) Help children sound out words smoothly; and (4) Model reading…
77 FR 15053 - Manual for Courts-Martial; Proposed Amendments
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-14
... the M.R.E. and a Word document using color-coded text and comments to explain amendments. Updated... evidence. f. Commenter recommended using the words ``pursuant to statutory authority'' in M.R.E. 807. JSC... the rule to findings. i. Commenter recommended removing the word ``allegedly'' from proposed M.R.E...
ERIC Educational Resources Information Center
Ambrose, Regina Maria; Palpanathan, Shanthini
2017-01-01
Computer-assisted language learning (CALL) has evolved through various stages in both technology as well as the pedagogical use of technology (Warschauer & Healey, 1998). Studies show that the CALL trend has facilitated students in their English language writing with useful tools such as computer based activities and word processing. Students…
Computer Center Reference Manual. Volume 1
1990-09-30
Unlimited UNCLASSIFIED SECURITY CLASSIFICATION OF THIS PAGE REPORT DOCUMENTATION PAGE 1a. REPORT SECURITY CLASSIFICATION 1b. RESTRICTIVE...with connection to INTERNET) (host tables allow transfer to some other networks) OASYS - the DTRC Office Automation System The following can be reached...and buffers, two windows, and some word processing commands. Advanced editing commands are entered through the use of a command line. EVE has its own
Keywords image retrieval in historical handwritten Arabic documents
NASA Astrophysics Data System (ADS)
Saabni, Raid; El-Sana, Jihad
2013-01-01
A system is presented for spotting and searching keywords in handwritten Arabic documents. A slightly modified dynamic time warping algorithm is used to measure similarities between words. Two sets of features are generated from the outer contour of the words/word-parts. The first set is based on the angles between nodes on the contour and the second set is based on the shape context features taken from the outer contour. To recognize a given word, the segmentation-free approach is partially adopted, i.e., continuous word parts are used as the basic alphabet, instead of individual characters or complete words. Additional strokes, such as dots and detached short segments, are classified and used in a postprocessing step to determine the final comparison decision. The search for a keyword is performed by the search for its word parts given in the correct order. The performance of the presented system was very encouraging in terms of efficiency and match rates. To evaluate the presented system its performance is compared to three different systems. Unfortunately, there are no publicly available standard datasets with ground truth for testing Arabic key word searching systems. Therefore, a private set of images partially taken from Juma'a Al-Majid Center in Dubai for evaluation is used, while using a slightly modified version of the IFN/ENIT database for training.
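For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the standard, unmodified algorithm. It is not the authors' modified variant, and the one-dimensional numeric sequences are an assumption standing in for the contour-based feature vectors described in the abstract.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance (assumed: absolute difference)
            # extend the cheapest of: insertion, deletion, or match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because the warping path may repeat elements, two sequences that differ only in local stretching (for example, the same word part written wider) can still align at zero cost.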
Putting Home Data Management into Perspective
2009-12-01
approaches. However, users of home and personal storage live it. Popular interfaces (e.g., iTunes, iPhoto, and even drop-down lists of recently-opened Word documents) allow users to navigate file
75 FR 27199 - Promoting Diversification of Ownership in the Broadcasting Services
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-14
... business. This document corrects the Report and Order by substituting the word ``ethnicity'' for ``gender... the first column, paragraph 11, the Commission inadvertently used the word ``gender'' instead of...
Research on aviation unsafe incidents classification with improved TF-IDF algorithm
NASA Astrophysics Data System (ADS)
Wang, Yanhua; Zhang, Zhiyuan; Huo, Weigang
2016-05-01
The text of Aviation Safety Confidential Reports contains a large amount of valuable information. The term frequency-inverse document frequency (TF-IDF) algorithm is commonly used in text analysis, but it does not take into account the sequential relationship of the words in the text and its role in semantic expression. To address these shortcomings of TF-IDF, this paper improves the algorithm using a co-occurrence network and, according to the seven category labels of civil aviation unsafe incidents, establishes feature word extraction and word sequential relations for classified incidents. An aviation domain lexicon was used to improve classification accuracy. A feature word network model was designed for multi-document unsafe incident classification and was used in the experiment. Finally, the classification accuracy of the improved algorithm was verified experimentally.
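As background, plain TF-IDF (the baseline the abstract sets out to improve, not the co-occurrence-based variant) can be sketched as below. The tokenized example reports are invented for illustration, and the unsmoothed logarithmic IDF is one common convention among several.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # document frequency: in how many documents each term appears
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights

# Hypothetical incident snippets, already tokenized
reports = [
    "engine failure during climb".split(),
    "runway incursion during taxi".split(),
    "engine fire warning".split(),
]
weights = tf_idf(reports)
```

Note that the weights depend only on counts: shuffling the words of a report leaves them unchanged, which is the word-order limitation the abstract points to.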
Pedagogical principles underpinning undergraduate Nurse Education in the UK: A review.
Mackintosh-Franklin, Carolyn
2016-05-01
This review provides a contextual report of the current use of pedagogy in undergraduate nursing programmes run by Higher Education Institutes (HEIs) in the United Kingdom (UK). Pedagogy provides the framework for educators to add shape and structure to the educational process, and to support student learning and programme development. Traditionally nurse education has used a behaviourist approach focusing on learning outcomes and competency based education, although there is also increasing support for the cognitive/student learning focused pedagogic approach. The keywords andragogy, pedagogy and student centred learning were used in a systematic stepwise descriptive content analysis of the programme specifications and programme handbooks of 40 current undergraduate programme documents, leading to an undergraduate award and professional registration as a nurse. 42% (17) of documents contained reference to the words, pedagogy and student centred learning, whilst no documents used the word andragogy. Where identified, pedagogy was used in a superficial manner, with only three documents identifying a specific pedagogical philosophy: one HEI citing a value based curriculum and two HEIs referencing social constructionism. Nine HEIs made reference to student centred learning but with no additional pedagogic information. A review of teaching, learning and assessment strategies indicated no difference between the documented strategies used by HEIs when comparing those with an espoused pedagogy and those without. Although educational literature supports the use of pedagogic principles in curriculum design, this is not explicit in undergraduate nursing programme documentation, and suggests that nurse educators do not view pedagogy as important to their programmes. 
Instead programmes appear to be developed based on operational and functional requirements with a focus on acquisition of knowledge and skills, and the fitness to practice of graduates entering the nursing workforce. Copyright © 2016 Elsevier Ltd. All rights reserved.
Recurrent-neural-network-based Boolean factor analysis and its application to word clustering.
Frolov, Alexander A; Husek, Dusan; Polyakov, Pavel Yu
2009-07-01
The objective of this paper is to introduce a neural-network-based algorithm for word clustering as an extension of the neural-network-based Boolean factor analysis algorithm (Frolov , 2007). It is shown that this extended algorithm supports even the more complex model of signals that are supposed to be related to textual documents. It is hypothesized that every topic in textual data is characterized by a set of words which coherently appear in documents dedicated to a given topic. The appearance of each word in a document is coded by the activity of a particular neuron. In accordance with the Hebbian learning rule implemented in the network, sets of coherently appearing words (treated as factors) create tightly connected groups of neurons, hence, revealing them as attractors of the network dynamics. The found factors are eliminated from the network memory by the Hebbian unlearning rule facilitating the search of other factors. Topics related to the found sets of words can be identified based on the words' semantics. To make the method complete, a special technique based on a Bayesian procedure has been developed for the following purposes: first, to provide a complete description of factors in terms of component probability, and second, to enhance the accuracy of classification of signals to determine whether it contains the factor. Since it is assumed that every word may possibly contribute to several topics, the proposed method might be related to the method of fuzzy clustering. In this paper, we show that the results of Boolean factor analysis and fuzzy clustering are not contradictory, but complementary. To demonstrate the capabilities of this attempt, the method is applied to two types of textual data on neural networks in two different languages. The obtained topics and corresponding words are at a good level of agreement despite the fact that identical topics in Russian and English conferences contain different sets of keywords.
Yu, Zhiguo; Nguyen, Thang; Dhombres, Ferdinand; Johnson, Todd; Bodenreider, Olivier
2018-01-01
Extracting and understanding information, themes and relationships from large collections of documents is an important task for biomedical researchers. Latent Dirichlet Allocation is an unsupervised topic modeling technique using the bag-of-words assumption that has been applied extensively to unveil hidden thematic information within large sets of documents. In this paper, we added MeSH descriptors to the bag-of-words assumption to generate ‘hybrid topics’, which are mixed vectors of words and descriptors. We evaluated this approach on the quality and interpretability of topics in both a general corpus and a specialized corpus. Our results demonstrated that the coherence of ‘hybrid topics’ is higher than that of regular bag-of-words topics in the specialized corpus. We also found that the proportion of topics that are not associated with MeSH descriptors is higher in the specialized corpus than in the general corpus. PMID:29295179
Walker Ranch 3D seismic images
Robert J. Mellors
2016-03-01
Amplitude images (both vertical and depth slices) extracted from 3D seismic reflection survey over area of Walker Ranch area (adjacent to Raft River). Crossline spacing of 660 feet and inline of 165 feet using a Vibroseis source. Processing included depth migration. Micro-earthquake hypocenters on images. Stratigraphic information and nearby well tracks added to images. Images are embedded in a Microsoft Word document with additional information. Exact location and depth restricted for proprietary reasons. Data collection and processing funded by Agua Caliente. Original data remains property of Agua Caliente.
Novel grid-based optical Braille conversion: from scanning to wording
NASA Astrophysics Data System (ADS)
Yoosefi Babadi, Majid; Jafari, Shahram
2011-12-01
Grid-based optical Braille conversion (GOBCO) is explained in this article. The grid-fitting technique involves processing scanned images taken from old hard-copy Braille manuscripts, then recognising and converting them into English ASCII text documents on a computer. The resulting words are verified against the relevant dictionary to provide the final output. The algorithms employed in this article can be easily modified for use in other visual pattern recognition systems and text extraction applications. This technique has several advantages, including: simplicity of the algorithm, high speed of execution, the ability to help visually impaired and blind people work with fax machines and the like, and the ability to help sighted people with no prior knowledge of Braille understand hard-copy Braille manuscripts.
LCS Content Document Application
NASA Technical Reports Server (NTRS)
Hochstadt, Jake
2011-01-01
My project at KSC during my spring 2011 internship was to develop a Ruby on Rails application to manage Content Documents. A Content Document is a collection of documents and information that describes what software is installed on a Launch Control System computer. It's important for us to make sure the tools we use every day are secure, up-to-date, and properly licensed. Previously, this information was tracked in Excel and Word files passed between different personnel. The goal of the new application is to manage and access the Content Documents through a single database-backed web application. Our LCS team will benefit greatly from this app. Admins will be able to log in securely to track and update the software installed on each computer in a timely manner. We also included export features, such as attaching additional documents that can be downloaded from the web application. The finished application will ease the process of managing Content Documents while streamlining the procedure. Ruby on Rails is a very powerful web application framework and I am grateful to have had the opportunity to build this application.
An Efficiency Comparison of Document Preparation Systems Used in Academic Research and Development
Knauff, Markus; Nejasmic, Jelica
2014-01-01
The choice of an efficient document preparation system is an important decision for any academic researcher. To assist the research community, we report a software usability study in which 40 researchers across different disciplines prepared scholarly texts with either Microsoft Word or LaTeX. The probe texts included simple continuous text, text with tables and subheadings, and complex text with several mathematical equations. We show that LaTeX users were slower than Word users, wrote less text in the same amount of time, and produced more typesetting, orthographical, grammatical, and formatting errors. On most measures, expert LaTeX users performed even worse than novice Word users. LaTeX users, however, more often report enjoying using their respective software. We conclude that even experienced LaTeX users may suffer a loss in productivity when LaTeX is used, relative to other document preparation systems. Individuals, institutions, and journals should carefully consider the ramifications of this finding when choosing document preparation strategies, or requiring them of authors. PMID:25526083
A metric to search for relevant words
NASA Astrophysics Data System (ADS)
Zhou, Hongding; Slater, Gary W.
2003-11-01
We propose a new metric to evaluate and rank the relevance of words in a text. The method uses the density fluctuations of a word to compute an index that measures its degree of clustering. Highly significant words tend to form clusters, while common words are essentially uniformly spread in a text. If a word is not rare, the metric is stable when we move any individual occurrence of this word in the text. Furthermore, we prove that the metric always increases when words are moved to form larger clusters, or when several independent documents are merged. Using the Holy Bible as an example, we show that our approach reduces the significance of common words when compared to a recently proposed statistical metric.
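The abstract does not give the formula, so the sketch below is only a simple stand-in for the general idea of scoring a word by how unevenly it is spread: the coefficient of variation of the gaps between successive occurrences. It is not the metric proposed in the paper, and the function name and the minimum-occurrence cutoff are assumptions.

```python
import statistics

def gap_clustering_index(tokens, word):
    """Std/mean of the gaps between successive occurrences of `word`.

    0.0 means perfectly even spacing; larger values mean burstier use.
    """
    positions = [i for i, t in enumerate(tokens) if t == word]
    if len(positions) < 3:
        return None  # too few occurrences for a meaningful estimate (assumed cutoff)
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return statistics.pstdev(gaps) / statistics.mean(gaps)
```

Ranking the vocabulary of a text by an index of this kind pushes uniformly spread function words toward the bottom and topic words, which appear in bursts, toward the top.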
ERIC Educational Resources Information Center
Hamada, Megumi; Koda, Keiko
2011-01-01
Although the role of the phonological loop in word-retention is well documented, research in Chinese character retention suggests the involvement of non-phonological encoding. This study investigated whether the extent to which the phonological loop contributes to learning and remembering visually introduced words varies between college-level…
The Effect of Sonority on Word Segmentation: Evidence for the Use of a Phonological Universal
ERIC Educational Resources Information Center
Ettlinger, Marc; Finn, Amy S.; Hudson Kam, Carla L.
2012-01-01
It has been well documented how language-specific cues may be used for word segmentation. Here, we investigate what role a language-independent phonological universal, the sonority sequencing principle (SSP), may also play. Participants were presented with an unsegmented speech stream with non-English word onsets that juxtaposed adherence to the…
Can multilinguality improve Biomedical Word Sense Disambiguation?
Duque, Andres; Martinez-Romo, Juan; Araujo, Lourdes
2016-12-01
Ambiguity in the biomedical domain represents a major issue when performing Natural Language Processing tasks over the huge amount of available information in the field. For this reason, Word Sense Disambiguation is critical for achieving accurate systems able to tackle complex tasks such as information extraction, summarization or document classification. In this work we explore whether multilinguality can help to solve the problem of ambiguity, and the conditions required for a system to improve the results obtained by monolingual approaches. Also, we analyze the best ways to generate those useful multilingual resources, and study different languages and sources of knowledge. The proposed system, based on co-occurrence graphs containing biomedical concepts and textual information, is evaluated on a test dataset frequently used in biomedicine. We can conclude that multilingual resources are able to provide a clear improvement of more than 7% compared to monolingual approaches, for graphs built from a small number of documents. Also, empirical results show that automatically translated resources are a useful source of information for this particular task. Copyright © 2016 Elsevier Inc. All rights reserved.
Semantic Document Model to Enhance Data and Knowledge Interoperability
NASA Astrophysics Data System (ADS)
Nešić, Saša
To enable document data and knowledge to be efficiently shared and reused across application, enterprise, and community boundaries, desktop documents should be completely open and queryable resources, whose data and knowledge are represented in a form understandable to both humans and machines. At the same time, these are the requirements that desktop documents need to satisfy in order to contribute to the visions of the Semantic Web. With the aim of achieving this goal, we have developed the Semantic Document Model (SDM), which turns desktop documents into Semantic Documents as uniquely identified and semantically annotated composite resources, that can be instantiated into human-readable (HR) and machine-processable (MP) forms. In this paper, we present the SDM along with an RDF and ontology-based solution for the MP document instance. Moreover, on top of the proposed model, we have built the Semantic Document Management System (SDMS), which provides a set of services that exploit the model. As an application example that takes advantage of SDMS services, we have extended MS Office with a set of tools that enables users to transform MS Office documents (e.g., MS Word and MS PowerPoint) into Semantic Documents, and to search local and distant semantic document repositories for document content units (CUs) over Semantic Web protocols.
The use of the picture–word interference paradigm to examine naming abilities in aphasic individuals
Hashimoto, Naomi; Thompson, Cynthia K.
2015-01-01
Background Although naming deficits are well documented in aphasia, on-line measures of naming processes have been little investigated. The use of on-line measures may offer further insight into the nature of aphasic naming deficits that would otherwise be difficult to interpret when using off-line measures. Aims The temporal activation of semantic and phonological processes was tracked in older normal control and aphasic individuals using a picture–word interference paradigm. The purpose of the study was to examine how word interference results can augment and/or corroborate standard language testing in the aphasic group, as well as to examine temporal patterns of activation in the aphasic group when compared to a normal control group. Methods & Procedures A total of 20 older normal individuals and 11 aphasic individuals participated. Detailed measures of each aphasic individual's language and naming skills were obtained. A visual picture–word interference paradigm was used in which the words bore either a semantic, phonological, or no relationship to 25 pictures. These competitor words were presented at stimulus onset asynchronies of −300 ms, +300 ms, and 0 ms. Outcomes & Results Analyses of naming RTs in both groups revealed significant early semantic interference effects, mid-semantic interference effects, and mid-phonological facilitation effects. A matched control-aphasic group comparison revealed no differences in the temporal activation of effects during the course of naming. Partial support for this RT pattern was found in the aphasic naming error pattern. The aphasic group also demonstrated greater SIEs and PFEs compared to the matched control group, which indicated disruptions of the phonological processing stage. Analyses of behavioural performances of the aphasic group corroborated this finding. 
Conclusions The aphasic naming RTs results were unexpected given the results from the priming literature, which has supported the idea of slowed or reduced patterns of activation in aphasic individuals. However, analyses of naming RTs also confirmed the behavioural finding of a disruption surrounding phonological processes; thus, the analyses of naming latencies offers another potential means of pinpointing breakdowns of lexical access in individuals with aphasia. PMID:26166927
Investigative change detection: identifying new topics using lexicon-based search
NASA Astrophysics Data System (ADS)
Hintz, Kenneth J.
2002-08-01
In law enforcement there is much textual data that must be searched in order to detect new threats. A new methodology that can be applied to this need is the automatic searching of the contents of documents from known sources to construct a lexicon of words used by each source. When analyzing future documents, the occurrence of words that are not yet in the lexicon indicates the introduction of a new topic into the source's vocabulary, which should be examined in context by an analyst. A system analogous to this has been built and used to detect Fads and Categories on web sites. Fad refers to the first appearance of a word not in the lexicon; Category refers to the repeated appearance of a Fad word that exceeds some frequency or spatial occurrence metric, indicating a permanence to the Category.
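The Fad and Category definitions can be illustrated with a minimal sketch. The class name, the tokenized-document input, and the simple count threshold used for Category promotion are all assumptions; the abstract mentions a frequency or spatial occurrence metric without specifying it.

```python
from collections import Counter

class LexiconMonitor:
    """Flags out-of-lexicon words in new documents from a known source."""

    def __init__(self, baseline_docs, category_threshold=3):
        # Lexicon: every word seen in the source's known documents
        self.lexicon = {w for doc in baseline_docs for w in doc}
        self.fad_counts = Counter()
        self.category_threshold = category_threshold  # hypothetical frequency cutoff

    def observe(self, doc):
        """Return (fads, categories) triggered by one new tokenized document."""
        fads, categories = [], []
        for w in doc:
            if w in self.lexicon:
                continue
            self.fad_counts[w] += 1
            if self.fad_counts[w] == 1:
                fads.append(w)  # first appearance of an unknown word
            elif self.fad_counts[w] == self.category_threshold:
                categories.append(w)  # repeated often enough to look permanent
        return fads, categories
```

For example, with `category_threshold=2`, a word absent from the baseline is reported as a Fad the first time it appears and as a Category on its second appearance.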
Towards Automatic Classification of Wikipedia Content
NASA Astrophysics Data System (ADS)
Szymański, Julian
Wikipedia, the Free Encyclopedia, faces the problem of properly classifying new articles every day. Articles are assigned to categories manually, which is a time-consuming task. It requires knowledge of the Wikipedia structure that is beyond typical editor competence, which leads to human-caused mistakes: omitted or wrong assignments of articles to categories. The article presents the application of an SVM classifier to the automatic classification of documents from the Free Encyclopedia. The classifier has been tested using two text representations: inter-document connections (hyperlinks) and word content. The results of the experiments, evaluated on hand-crafted data, show that the Wikipedia classification process can be partially automated. The proposed approach can be used for building a decision support system that suggests to editors the categories that best fit new content entered into Wikipedia.
Risse, M; Weiler, G
2001-01-01
Johann Joachim Winckelmann, German historian of ancient art and archaeologist, was born on 9 December 1717 in Stendal, a town in Saxony-Anhalt. At the age of 50 he was murdered on 8 June 1768 in a Trieste hotel. The voluminous original record of the criminal proceedings against his murderer, Francesco Arcangeli, was presumed lost for about 150 years. A new edition in the wording of the original text appeared in 1964. This long sought historical document gives cause for forensic-historical reflections under consideration of the autopsy protocol about Winckelmann, which is likewise a historical document. A considerable change of paradigm in comparison to current autopsy protocols is observed with regard to the evaluation of injuries and the circumstances of death.
Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Xu, Richard Yi Da; Luo, Xiangfeng
2018-05-01
Sparse nonnegative matrix factorization (SNMF) aims to factorize a data matrix into two optimized nonnegative sparse factor matrices, which could benefit many tasks, such as document-word co-clustering. However, the traditional SNMF typically assumes the number of latent factors (i.e., dimensionality of the factor matrices) to be fixed. This assumption makes it inflexible in practice. In this paper, we propose a doubly sparse nonparametric NMF framework to mitigate this issue by using dependent Indian buffet processes (dIBP). We apply a correlation function for the generation of two stick weights associated with each column pair of factor matrices while still maintaining their respective marginal distribution specified by IBP. As a consequence, the generation of two factor matrices will be columnwise correlated. Under this framework, two classes of correlation function are proposed: 1) using bivariate Beta distribution and 2) using Copula function. Compared with the single IBP-based NMF, this paper jointly makes two factor matrices nonparametric and sparse, which could be applied to broader scenarios, such as co-clustering. This paper is seen to be much more flexible than Gaussian process-based and hierarchical Beta process-based dIBPs in terms of allowing the two corresponding binary matrix columns to have greater variations in their nonzero entries. Our experiments on synthetic data show the merits of this paper compared with the state-of-the-art models in respect of factorization efficiency, sparsity, and flexibility. Experiments on real-world data sets demonstrate the efficiency of this paper in document-word co-clustering tasks.
Ackermann; Mathiak
1999-11-01
Pure word deafness (auditory verbal agnosia) is characterized by an impairment of auditory comprehension, repetition of verbal material and writing to dictation whereas spontaneous speech production and reading largely remain unaffected. Sometimes, this syndrome is preceded by complete deafness (cortical deafness) of varying duration. Perception of vowels and suprasegmental features of verbal utterances (e.g., intonation contours) seems to be less disrupted than the processing of consonants and, therefore, might mediate residual auditory functions. Often, lip reading and/or slowing of speaking rate allow within some limits to compensate for speech comprehension deficits. Apart from a few exceptions, the available reports of pure word deafness documented a bilateral temporal lesion. In these instances, as a rule, identification of nonverbal (environmental) sounds, perception of music, temporal resolution of sequential auditory cues and/or spatial localization of acoustic events were compromised as well. The observed variable constellation of auditory signs and symptoms in central hearing disorders following bilateral temporal disorders, most probably, reflects the multitude of functional maps at the level of the auditory cortices subserving, as documented in a variety of non-human species, the encoding of specific stimulus parameters each. Thus, verbal/nonverbal auditory agnosia may be considered a paradigm of distorted "auditory scene analysis" (Bregman 1990) affecting both primitive and schema-based perceptual processes. It cannot be excluded, however, that disconnection of the Wernicke-area from auditory input (Geschwind 1965) and/or an impairment of suggested "phonetic module" (Liberman 1996) contribute to the observed deficits as well. Conceivably, these latter mechanisms underly the rare cases of pure word deafness following a lesion restricted to the dominant hemisphere. 
Only few instances of a rather isolated disruption of the discrimination/identification of nonverbal sound sources, in the presence of uncompromised speech comprehension, have been reported so far (nonverbal auditory agnosia). As a rule, unilateral right-sided damage has been found to be the relevant lesion.
ERIC Educational Resources Information Center
Long, Sandra; And Others
Part of a curriculum series for academically gifted elementary students in the area of reading, the document presents objectives and activities for language arts instruction. There are three major objectives: (1) recognizing persuasive use of words, vague and imprecise words, multiple meanings conveyed by a single word, and propaganda techniques;…
ERIC Educational Resources Information Center
Ivy, Sarah E.; Guerra, Jennifer A.; Hatton, Deborah D.
2017-01-01
Introduction: Constant time delay is an evidence-based practice to teach sight word recognition to students with a variety of disabilities. To date, two studies have documented its effectiveness for teaching braille. Methods: Using a multiple-baseline design, we evaluated the effectiveness of constant time delay to teach highly motivating words to…
Min-cut segmentation of cursive handwriting in tabular documents
NASA Astrophysics Data System (ADS)
Davis, Brian L.; Barrett, William A.; Swingle, Scott D.
2015-01-01
Handwritten tabular documents, such as census, birth, death and marriage records, contain a wealth of information vital to genealogical and related research. Much work has been done on segmenting freeform handwriting; however, segmentation of cursive handwriting in tabular documents is still an unsolved problem. Tabular documents present unique segmentation challenges caused by handwriting overlapping cell boundaries and other words, both horizontally and vertically, as "ascenders" and "descenders" overlap into adjacent cells. This paper presents a method for segmenting handwriting in tabular documents using a min-cut/max-flow algorithm on a graph formed from a distance map and connected components of handwriting. Specifically, we focus on line, word and first letter segmentation. Additionally, we include the angles of strokes of the handwriting as a third dimension to our graph to enable the resulting segments to share pixels of overlapping letters. Word segmentation accuracy is 89.5% when evaluated on lines of the data set used in the ICDAR2013 Handwriting Segmentation Contest. Accuracy is 92.6% for a specific application of segmenting first and last names from noisy census records. Accuracy for segmenting lines of names from noisy census records is 80.7%. The 3D graph cutting shows promise in segmenting overlapping letters, although highly convoluted or overlapping handwriting remains an ongoing challenge.
Natural language information retrieval in digital libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strzalkowski, T.; Perez-Carballo, J.; Marinescu, M.
In this paper we report on some recent developments in the joint NYU and GE natural language information retrieval system. The main characteristic of this system is the use of advanced natural language processing to enhance the effectiveness of term-based document retrieval. The system is designed around a traditional statistical backbone consisting of the indexer module, which builds inverted index files from pre-processed documents, and a retrieval engine which searches and ranks the documents in response to user queries. Natural language processing is used to (1) preprocess the documents in order to extract content-carrying terms, (2) discover inter-term dependencies and build a conceptual hierarchy specific to the database domain, and (3) process the user's natural language requests into effective search queries. This system has been used in NIST-sponsored Text Retrieval Conferences (TREC), where we worked with approximately 3.3 GBytes of text articles including material from the Wall Street Journal, the Associated Press newswire, the Federal Register, Ziff Communications's Computer Library, Department of Energy abstracts, U.S. Patents and the San Jose Mercury News, totaling more than 500 million words of English. The system has been designed to facilitate its scalability to deal with ever increasing amounts of data. In particular, a randomized index-splitting mechanism has been installed which allows the system to create a number of smaller indexes that can be independently and efficiently searched.
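The indexer and index-splitting ideas described above can be illustrated with a minimal sketch. This is not the NYU/GE implementation: the function names, the whitespace tokenizer, and the uniform random assignment of documents to partitions are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_index(docs):
    """Build an inverted index mapping each term to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive whitespace tokenizer (assumption)
            index[term].add(doc_id)
    return index


def split_and_index(docs, n_parts, seed=0):
    """Assign each document to a random partition, then index every partition separately."""
    rng = random.Random(seed)
    parts = [{} for _ in range(n_parts)]
    for doc_id, text in docs.items():
        parts[rng.randrange(n_parts)][doc_id] = text
    return [build_index(part) for part in parts]


def search(indexes, term):
    """Query every small index independently and merge the hits."""
    hits = set()
    for index in indexes:
        hits |= index.get(term.lower(), set())
    return hits


docs = {
    "d1": "natural language information retrieval",
    "d2": "statistical retrieval engine",
    "d3": "inverted index files",
}
indexes = split_and_index(docs, n_parts=3)
print(sorted(search(indexes, "retrieval")))
```

Because each partition is searched independently and the results are merged, the answer does not depend on how the documents happened to be split, which is what makes the smaller indexes interchangeable with one large index for simple term lookups.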
Rule-based Approach on Extraction of Malay Compound Nouns in Standard Malay Document
NASA Astrophysics Data System (ADS)
Abu Bakar, Zamri; Kamal Ismail, Normaly; Rawi, Mohd Izani Mohamed
2017-08-01
A Malay compound noun is defined as a form that arises when two or more words are combined into a single syntactic unit that carries a specific meaning. A compound noun acts as one unit and is spelled as separate words, unless it is an established compound that is written as a single word. A basic characteristic of compound nouns in Malay sentences is the frequency of the word in the text itself. The extraction of compound nouns is therefore significant for subsequent research such as text summarization, grammar checking, sentiment analysis, machine translation and word categorization. Many research efforts have proposed extracting Malay compound nouns using linguistic approaches. Most existing methods address the extraction of bi-gram noun+noun compounds. However, the results still show problems and leave room for improvement. This paper explores a linguistic method for extracting compound nouns from a standard Malay corpus. A standard dataset is used to provide a common platform for evaluating research on the recognition of compound nouns in Malay sentences. Because the results can be compromised, the effectiveness of compound noun extraction needs to be improved. Thus, this study proposes a modification of the linguistic approach in order to enhance compound noun extraction. Several pre-processing steps are involved, including normalization, tokenization and tagging. The first step that uses the linguistic approach in this study is Part-of-Speech (POS) tagging. Finally, we describe several rules and modify them to capture the most relevant relation between the first word and the second word. The effectiveness of the relations used in our study is measured using recall, precision and F1-score. Comparison with the baseline values is essential because it shows whether the result has improved.
Mexican American family processes: nurturing, support, and socialization.
Niska, K J
1999-04-01
The purpose of this ethnographic study with Mexican American families was to document characteristics of Mexican American family processes of nurturing, support, and socialization. Audiotaped conversations with participants were transcribed verbatim in Spanish or English. Content analysis was used to derive characteristics of family processes. Family nurturing was characterized by being kin-based and intimate in nature. Family support was kin-based, with material support oriented toward household needs; with emotional support grounded in shared stories, problem solving, and prayer; and with informational support offered in consejos (wisdom sayings and words of advice), stories, and guidance. Family socialization was kin-based, hierarchical, and ritualistic.
Rapid automatic keyword extraction for information retrieval and analysis
Rose, Stuart J [Richland, WA]; Cowley, Wendy E [Richland, WA]; Crow, Vernon L [Richland, WA]; Cramer, Nicholas O [Richland, WA]
2012-03-06
Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores.
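The steps in this abstract (split at delimiters and stop words, score words, score candidates, keep the top portion) can be sketched as follows. The abstract leaves the scoring functions open, so this sketch makes assumptions: the word score is degree divided by frequency, the keyword score is the sum of its word scores, the stop list is a tiny hypothetical one, and the top third of candidates is kept.

```python
import re
from collections import defaultdict

# Tiny illustrative stop list (assumption); a real run would use a fuller list.
STOP_WORDS = {"a", "an", "and", "are", "by", "for", "in", "is", "of", "on", "or", "the", "to"}


def candidate_keywords(text, stop_words=STOP_WORDS):
    """Split text at punctuation delimiters and stop words into candidate phrases."""
    candidates = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        phrase = []
        for word in fragment.split():
            if word in stop_words:
                if phrase:
                    candidates.append(tuple(phrase))
                phrase = []
            else:
                phrase.append(word)
        if phrase:
            candidates.append(tuple(phrase))
    return candidates


def word_scores(candidates):
    """Score each word as co-occurrence degree / frequency (one possible choice)."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in candidates:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # counts the word itself plus its neighbours
    return {word: degree[word] / freq[word] for word in freq}


def extract_keywords(text, top_fraction=1 / 3):
    """Rank candidates by summed word scores and keep the top fraction."""
    candidates = candidate_keywords(text)
    scores = word_scores(candidates)
    keyword_scores = {" ".join(p): sum(scores[w] for w in p) for p in set(candidates)}
    ranked = sorted(keyword_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    keep = max(1, round(len(ranked) * top_fraction))
    return ranked[:keep]


title = "Rapid automatic keyword extraction for information retrieval and analysis"
print(candidate_keywords(title))
print(extract_keywords(title))
```

With this scoring, longer candidates built from words that mostly appear inside long phrases rank highest; a frequency-only score would instead favour words that recur across many candidates.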
PC-based web authoring: How to learn as little unix as possible while getting on the Web
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gennari, L.T.; Breaux, M.; Minton, S.
1996-09-01
This document is a general guide for creating Web pages, using commonly available word processing and file transfer applications. It is not a full guide to HTML, nor does it provide an introduction to the many WYSIWYG HTML editors available. The viability of the authoring method it describes will not be affected by changes in the HTML specification or the rapid release-and-obsolescence cycles of commercial WYSIWYG HTML editors. This document provides a gentle introduction to HTML for the beginner, and as the user gains confidence and experience, encourages greater familiarity with HTML through continued exposure to and hands-on usage of HTML code.
Mac-based Web authoring: How to learn as little Unix as possible while getting on the Web.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gennari, L.T.
1996-06-01
This document is a general guide for creating Web pages, using commonly available word processing and file transfer applications. It is not a full guide to HTML, nor does it provide an introduction to the many WYSIWYG HTML editors available. The viability of the authoring method it describes will not be affected by changes in the HTML specification or the rapid release-and-obsolescence cycles of commercial WYSIWYG HTML editors. This document provides a gentle introduction to HTML for the beginner, and as the user gains confidence and experience, encourages greater familiarity with HTML through continued exposure to and hands-on usage of HTML code.
Apple Macintosh programs for nucleic and protein sequence analyses.
Bellon, B
1988-01-01
This paper describes a package of programs for handling and analyzing nucleic acid and protein sequences using the Apple Macintosh microcomputer. There are three important features of these programs. First, because of the now classical Macintosh interface, the programs can be easily used by persons with little or no computer experience. Second, it is possible to save all the data, written in an editable scrolling text window or drawn in a graphic window, as files that can be directly used either as word processing documents or as picture documents. Third, sequences can be easily exchanged with any other computer. The package is composed of thirteen programs, written in the Pascal programming language. PMID:2832832
Design of a modular digital computer system, CDRL no. D001, final design plan
NASA Technical Reports Server (NTRS)
Easton, R. A.
1975-01-01
The engineering breadboard implementation for the CDRL no. D001 modular digital computer system developed during design of the logic system was documented. This effort followed the architecture study completed and documented previously, and was intended to verify the concepts of a fault tolerant, automatically reconfigurable, modular version of the computer system conceived during the architecture study. The system has a microprogrammed 32 bit word length, general register architecture and an instruction set consisting of a subset of the IBM System 360 instruction set plus additional fault tolerance firmware. The following areas were covered: breadboard packaging, central control element, central processing element, memory, input/output processor, and maintenance/status panel and electronics.
Word add-in for ontology recognition: semantic enrichment of scientific literature.
Fink, J Lynn; Fernicola, Pablo; Chandran, Rahul; Parastatidis, Savas; Wade, Alex; Naim, Oscar; Quinn, Gregory B; Bourne, Philip E
2010-02-24
In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark-up their own work will help increase the amount and quality of machine-readable literature metadata.
Implementation of the common phrase index method on the phrase query for information retrieval
NASA Astrophysics Data System (ADS)
Fatmawati, Triyah; Zaman, Badrus; Werdiningsih, Indah
2017-08-01
With the development of technology, finding information in news text has become easy, because news text is distributed not only in print media, such as newspapers, but also in electronic media that can be accessed using a search engine. When searching for relevant documents with a search engine, a phrase is often used as the query. The number of words that make up the phrase query and their positions affect the relevance of the documents returned. As a result, the accuracy of the information obtained is affected. Based on this problem, the purpose of this research was to analyze the implementation of the common phrase index method in information retrieval. The research was conducted on English news text and implemented in a prototype to determine the relevance level of the documents produced. The system is built in the stages of pre-processing, indexing, term weighting calculation, and cosine similarity calculation. The system then displays the document search results ranked by cosine similarity. System testing was conducted using 100 documents and 20 queries, and the results were used in the evaluation stage. First, the relevant documents were determined using the kappa statistic. Second, the system success rate was determined using precision, recall, and F-measure. The kappa statistic was 0.71, so the relevant documents are eligible for the system evaluation. The evaluation yields a precision of 0.37, a recall of 0.50, and an F-measure of 0.43. From this result it can be said that the success rate of the system in producing relevant documents is low.
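As a quick check on the evaluation figures above, the following sketch computes precision, recall, and the balanced F-measure. It assumes the standard harmonic-mean definition of the F-measure; the abstract does not say whether its figures were averaged per query, so the agreement shown here holds only under that assumption. The document ids in the usage lines are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query given retrieved and relevant doc ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (beta=1 gives the balanced F1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures reported in the abstract: precision 0.37, recall 0.50.
print(round(f_measure(0.37, 0.50), 2))

# Hypothetical single query: 4 documents retrieved, 4 relevant, 2 in common.
print(precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d9", "d10"]))
```

Under the harmonic-mean definition, 2 x 0.37 x 0.50 / (0.37 + 0.50) is about 0.425, which rounds to the reported 0.43.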
Analyzing the history of Cognition using Topic Models.
Cohen Priva, Uriel; Austerweil, Joseph L
2015-02-01
Very few articles have analyzed how cognitive science as a field has changed over the last six decades. We explore how Cognition changed over the last four decades using Topic Models. Topic Models assume that every word in every document is generated by one of a limited number of topics. Words that are likely to co-occur are likely to be generated by a single topic. We find a number of significant historical trends: the rise of moral cognition, eyetracking methods, and action, the fall of sentence processing, and the stability of development. We introduce the notion of framing topics, which frame content, rather than present the content itself. These framing topics suggest that over time Cognition turned from abstract theorizing to more experimental approaches. Copyright © 2014 Elsevier B.V. All rights reserved.
Onboard shuttle on-line software requirements system: Prototype
NASA Technical Reports Server (NTRS)
Kolkhorst, Barbara; Ogletree, Barry
1989-01-01
The prototype discussed here was developed as a proof of concept for a system which could support high volumes of requirements documents with integrated text and graphics; the solution proposed here could be extended to other projects whose goal is to place paper documents in an electronic system for viewing and printing purposes. The technical problems (such as conversion of documentation between word processors, management of a variety of graphics file formats, and difficulties involved in scanning integrated text and graphics) would be very similar for other systems of this type. Indeed, technological advances in areas such as scanning hardware and software and display terminals ensure that some of the problems encountered here will be solved in the near term (less than five years). Examples of these solvable problems include automated input of integrated text and graphics, errors in the recognition process, and the loss of image information which results from the digitization process. The solution developed for the Online Software Requirements System is modular and allows hardware and software components to be upgraded or replaced as industry solutions mature. The extensive commercial software content allows the NASA customer to apply resources to solving the problem and maintaining documents.
A Better Way to Store Energy for Less Cost
DOE Office of Scientific and Technical Information (OSTI.GOV)
Darmon, Jonathan M.; Weiss, Charles J.; Hulley, Elliott B.
Representing the Center for Molecular Electrocatalysis (CME), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE energy. The mission of CME is to understand, design and develop molecular electrocatalysts for solar fuel production and use.
Rapid Prototyping of Application Specific Signal Processors (RASSP)
1993-12-23
Compilers 2-9 - Cadre Teamwork 2-13 - CodeCenter (Centerline) 2-15 - dbx/dbxtool (UNIXm) 2-17 - Falcon (Mentor) 2-19 - FrameMaker (Frame Tech) 2-21 - gprof...UNIXm C debuggers Falcon Mentor ECAD Framework FrameMaker Frame Tech Word Processing gcc GNU C/C++ compiler gprof GNU Software profiling tool...organization can put their own documentation on-line using the BOLD Composer for FrameMaker." The AMPLE programming language is a C-like language used for
Galdino, Greg M; Gotway, Michael
2005-02-01
The curriculum vitae (CV) has been the traditional method for radiologists to illustrate their accomplishments in the field of medicine. Despite its presence in medicine as a standard, widely accepted means to describe one's professional career and its use for decades as an accompaniment to most applications and interviews, there is relatively little written in the medical literature regarding the CV. Misrepresentation on medical students', residents', and fellows' applications has been reported. Using digital technology, CVs have the potential to be much more than printed words on paper, and they offer a solution to misrepresentation. Digital CVs may incorporate full-length articles, graphics, presentations, clinical images, and video. Common formats for digital CVs include CD-ROMs or DVD-ROMs containing articles (in Adobe Portable Document Format) and presentations (in Microsoft PowerPoint format) accompanying printed CVs, word processing documents with hyperlinks to articles and presentations either locally (on CD-ROMs or DVD-ROMs) or remotely (via the Internet), or hypertext markup language documents. Digital CVs afford the ability to provide more information that is readily accessible to those receiving and reviewing them. Articles, presentations, videos, images, and Internet links can be illustrated using standard file formats commonly available to all radiologists. They can be easily updated and distributed on inexpensive media, such as a CD-ROM or DVD-ROM. With the availability of electronic articles, presentations, and information via the Internet, traditional paper CVs may soon be superseded by their electronic successors.
The processing of blend words in naming and sentence reading.
Johnson, Rebecca L; Slate, Sarah Rose; Teevan, Allison R; Juhasz, Barbara J
2018-04-01
Research exploring the processing of morphologically complex words, such as compound words, has found that they are decomposed into their constituent parts during processing. Although much is known about the processing of compound words, very little is known about the processing of lexicalised blend words, which are created from parts of two words, often with phoneme overlap (e.g., brunch). In the current study, blends were matched with non-blend words on a variety of lexical characteristics, and blend processing was examined using two tasks: a naming task and an eye-tracking task that recorded eye movements during reading. Results showed that blend words were processed more slowly than non-blend control words in both tasks. Blend words led to longer reaction times in naming and longer processing times on several eye movement measures compared to non-blend words. This was especially true for blends that were long, rated low in word familiarity, but were easily recognisable as blends.
Relationships between music training, speech processing, and word learning: a network perspective.
Elmer, Stefan; Jäncke, Lutz
2018-03-15
Numerous studies have documented the behavioral advantages conferred on professional musicians and children undergoing music training in processing speech sounds varying in the spectral and temporal dimensions. These beneficial effects have previously often been associated with local functional and structural changes in the auditory cortex (AC). However, this perspective is oversimplified, in that it does not take into account the intrinsic organization of the human brain, namely, neural networks and oscillatory dynamics. Therefore, we propose a new framework for extending these previous findings to a network perspective by integrating multimodal imaging, electrophysiology, and neural oscillations. In particular, we provide concrete examples of how functional and structural connectivity can be used to model simple neural circuits exerting a modulatory influence on AC activity. In addition, we describe how such a network approach can be used for better comprehending the beneficial effects of music training on more complex speech functions, such as word learning. © 2018 New York Academy of Sciences.
ERIC Educational Resources Information Center
Hicks, Emily D.
2004-01-01
Cultural activities, including the performance of music and spoken word, are documented. The cultural activities described in the San Diego-Tijuana region emerged from rhizomatic, transnational points of contact.
Development and validation of a brief, descriptive Danish pain questionnaire (BDDPQ).
Perkins, F M; Werner, M U; Persson, F; Holte, K; Jensen, T S; Kehlet, H
2004-04-01
A new pain questionnaire should be simple, be documented to have discriminative function, and be related to previously used questionnaires. Word meaning was validated by using bilingual Danish medical students and asking them to translate words taken from the Danish version of the McGill pain questionnaire into English. Evaluative word value was estimated using a visual analog scale (VAS). Discriminative function was assessed by having patients with one of six painful conditions (postherpetic neuralgia, phantom limb pain, rheumatoid arthritis, ankle fracture, appendicitis, or labor pain) complete the questionnaire. We were not able to find Danish words that were reliably back-translated to the English words 'splitting' or 'gnawing'. A simple three-word set of evaluative terms had good separation when rated on a VAS scale ('let' 17.5+/-6.5 mm; 'moderat' 42.7+/-8.6 mm; and 'staerk' 74.9+/-9.7 mm). The questionnaire was able to discriminate among the six painful conditions with 77% accuracy by just using the descriptive words. The accuracy of the questionnaire increased to 96% with the addition of evaluative terms (for pain at rest and with activity), chronicity (acute vs. chronic), and location of the pain. A Danish pain questionnaire that subjects and patients can self-administer has been developed and validated relative to the words used in the English McGill Pain questionnaire. The discriminative ability of the questionnaire among some common painful conditions has been tested and documented. The questionnaire may be of use in patient care and research.
Hanrahan, Donna; Sexton, Patrina; Hui, Katrina; Teitcher, Jennifer; Sugarman, Jeremy; London, Alex John; Barnes, Mark; Purpura, James; Klitzman, Robert
2015-01-01
Linguistic and cultural differences can impede comprehension among potential research participants during the informed consent process, but how researchers and IRBs respond to these challenges in practice is unclear. We conducted in-depth interviews with 15 researchers, research ethics committee (REC) chairs and members from 8 different countries with emerging economies, involved in HIV-related research sponsored by the HIV Prevention Trials Network (HPTN), regarding the ethical and regulatory challenges they face in this regard. In the interviews, problems with translating study materials often arose as major concerns. Four sets of challenges were identified concerning linguistic and cultural translations of informed consent documents and other study materials, related to the: (1) context, (2) process, (3) content and (4) translation of these documents. Host country contextual issues included low literacy rates, education (e.g., documents may need to be written below 5th grade reading level), experiences with research, and different views of written documentation. Certain terms and concepts may not exist in other languages, or have additional connotations that back translations do not always reveal. Challenges arise not only from the content of word-for-word, literal translation, but also from the linguistic form of the language, such as tone (e.g., appropriate forms of politeness vs. legalese, seen as harsh), syntax, manner of questions posed, and the concept of consent; the contexts of use also affect meaning. Problems also emerged in bilateral communications: US IRBs may misunderstand local practices, or communicate insufficiently the reasons for their decisions to foreign RECs.
In sum, these data highlight several challenges that have received little, if any, attention in past literature on translation of informed consent and study materials, and have crucial implications for improving practice, education, research and policy, suggesting several strategies, including needs for broader open-source multilingual lexicons, and more awareness of the complexities involved. PMID:26225759
On application of image analysis and natural language processing for music search
NASA Astrophysics Data System (ADS)
Gwardys, Grzegorz
2013-10-01
In this paper, I investigate the problem of finding the most similar music tracks using techniques popular in Natural Language Processing, such as TF-IDF and LDA. I defined a document as a music track. Each music track is transformed into a spectrogram, so that well-known techniques can be used to get words from images. I used the SURF operation to detect characteristic points and a novel approach for their description. Standard k-means was used for clustering. Clustering is here identical to dictionary making, so after that I can transform spectrograms into text documents and perform TF-IDF and LDA. Finally, I can make a query in the obtained vector space. The research was done on 16 music tracks for training and 336 for testing, which are split into four categories: Hiphop, Jazz, Metal and Pop. Although the technique used is completely unsupervised, the results are satisfactory and encourage further research.
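The pipeline in this abstract (descriptors, dictionary by clustering, documents of visual words, TF-IDF, query in vector space) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the author's method: the descriptors are hand-made 2-D points instead of SURF output, the centroids are given rather than learned by k-means, the TF-IDF weighting is the plain tf x log(N/df) form, and LDA is omitted.

```python
import math
from collections import Counter


def nearest_centroid(descriptor, centroids):
    """Index of the centroid closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, centroids[i])),
    )


def to_visual_words(descriptors, centroids):
    """Turn a track's descriptors into a 'text document' of visual-word tokens."""
    return [f"w{nearest_centroid(d, centroids)}" for d in descriptors]


def tfidf_vectors(documents):
    """Sparse TF-IDF vectors: term frequency times log(N / document frequency)."""
    n = len(documents)
    df = Counter(word for doc in documents for word in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({w: (c / len(doc)) * math.log(n / df[w]) for w, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


centroids = [(0, 0), (10, 10)]  # stand-in for a learned dictionary (assumption)
tracks = [
    [(0, 1), (1, 0), (0, 0)],  # track A: only descriptors near centroid 0
    [(9, 9), (10, 11)],        # track B: only descriptors near centroid 1
    [(0, 1), (9, 10)],         # track C: a mix of both
]
documents = [to_visual_words(t, centroids) for t in tracks]
vec_a, vec_b, vec_c = tfidf_vectors(documents)
print(documents)
print(cosine(vec_a, vec_b), cosine(vec_a, vec_c))
```

Track A shares no visual words with track B, so their similarity is zero, while track C overlaps with A on one word and scores higher; a query track would be ranked against the collection in the same way.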
Munkhdalai, Tsendsuren; Li, Meijing; Batsuren, Khuyagbaatar; Park, Hyeon Ah; Choi, Nak Hyeon; Ryu, Keun Ho
2015-01-01
Chemical and biomedical Named Entity Recognition (NER) is an essential prerequisite task before effective text mining can begin for biochemical-text data. Exploiting unlabeled text data to leverage system performance has been an active and challenging research topic in text mining due to the recent growth in the amount of biomedical literature. We present a semi-supervised learning method that efficiently exploits unlabeled data in order to incorporate domain knowledge into a named entity recognition model and to leverage system performance. The proposed method includes Natural Language Processing (NLP) tasks for text preprocessing, learning word representation features from a large amount of text data for feature extraction, and conditional random fields for token classification. Other than the free text in the domain, the proposed method does not rely on any lexicon nor any dictionary in order to keep the system applicable to other NER tasks in bio-text data. We extended BANNER, a biomedical NER system, with the proposed method. This yields an integrated system that can be applied to chemical and drug NER or biomedical NER. We call our branch of the BANNER system BANNER-CHEMDNER, which is scalable over millions of documents, processing about 530 documents per minute, is configurable via XML, and can be plugged into other systems by using the BANNER Unstructured Information Management Architecture (UIMA) interface. BANNER-CHEMDNER achieved an 85.68% and an 86.47% F-measure on the testing sets of CHEMDNER Chemical Entity Mention (CEM) and Chemical Document Indexing (CDI) subtasks, respectively, and achieved an 87.04% F-measure on the official testing set of the BioCreative II gene mention task, showing remarkable performance in both chemical and biomedical NER. BANNER-CHEMDNER system is available at: https://bitbucket.org/tsendeemts/banner-chemdner.
Recalling taboo and nontaboo words.
Jay, Timothy; Caldwell-Harris, Catherine; King, Krista
2008-01-01
People remember emotional and taboo words better than neutral words. It is well known that words that are processed at a deep (i.e., semantic) level are recalled better than words processed at a shallow (i.e., purely visual) level. To determine how depth of processing influences recall of emotional and taboo words, a levels of processing paradigm was used. Whether this effect holds for emotional and taboo words has not been previously investigated. Two experiments demonstrated that taboo and emotional words benefit less from deep processing than do neutral words. This is consistent with the proposal that memories for taboo and emotional words are a function of the arousal level they evoke, even under shallow encoding conditions. Recall was higher for taboo words, even when taboo words were cued to be recalled after neutral and emotional words. The superiority of taboo word recall is consistent with cognitive neuroscience and brain imaging research.
The impact of developmental dyslexia and dysgraphia on movement production during word writing.
Kandel, Sonia; Lassus-Sangosse, Delphine; Grosjacques, Géraldine; Perret, Cyril
This study investigated how deficits in orthographic processing affect movement production during word writing. Children with dyslexia and dysgraphia wrote words and pseudo-words on a digitizer. The words were orthographically regular and irregular of varying frequency. The group analysis revealed that writing irregular words and pseudo-words increased movement duration and dysfluency. This indicates that the spelling processes were active while the children were writing the words. The impact of these spelling processes was stronger for the children with dyslexia and dysgraphia. The analysis of individual performance revealed that most dyslexic/dysgraphic children presented similar writing patterns. However, selective lexical processing deficits affected irregular word writing but not pseudo-word writing. Selective poor sublexical processing affected pseudo-word writing more than irregular word writing. This study suggests that the interaction between orthographic and motor processing constitutes an important cognitive load that may disrupt the graphic outcome of the children with dyslexia/dysgraphia.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fink, J.K.
1972-07-01
The HELP documents provide SPEAKEASY users with concise definitions of most of the words available in the current processors. In this report, the documents are given in a variety of formats to enable one to find specific information quickly. The bulk of this report consists of computer read-out of the HELP library via SPEAKEASY.
Analysis of Informed Consent Document Utilization in a Minimal-Risk Genetic Study
Desch, Karl; Li, Jun; Kim, Scott; Laventhal, Naomi; Metzger, Kristen; Siemieniak, David; Ginsburg, David
2012-01-01
Background The signed informed consent document certifies that the process of informed consent has taken place and provides research participants with comprehensive information about their role in the study. Despite efforts to optimize the informed consent document, only limited data are available about the actual use of consent documents by participants in biomedical research. Objective To examine the use of online consent documents in a minimal-risk genetic study. Design Prospective sibling cohort enrolled as part of a genetic study of hematologic and common human traits. Setting University of Michigan Campus, Ann Arbor, Michigan. Participants Volunteer sample of healthy persons with 1 or more eligible siblings aged 14 to 35 years. Enrollment was through targeted e-mail to student lists. A total of 1209 persons completed the study. Measurements Time taken by participants to review a 2833-word online consent document before indicating consent and identification of a masked hyperlink near the end of the document. Results The minimum predicted reading time was 566 seconds. The median time to consent was 53 seconds. A total of 23% of participants consented within 10 seconds, and 93% of participants consented in less than the minimum predicted reading time. A total of 2.5% of participants identified the masked hyperlink. Limitation The online consent process was not observed directly by study investigators, and some participants may have viewed the consent document more than once. Conclusion Few research participants thoroughly read the consent document before agreeing to participate in this genetic study. These data suggest that current informed consent documents, particularly for low-risk studies, may no longer serve the intended purpose of protecting human participants, and the role of these documents should be reassessed. Primary Funding Source National Institutes of Health. PMID:21893624
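The reported timings can be put in perspective with a little arithmetic. The sketch below assumes (the abstract does not say so) that the minimum predicted reading time was obtained by dividing the document length by a fixed maximum reading rate; the function and variable names are illustrative only.

```python
# Figures taken from the abstract above.
DOCUMENT_WORDS = 2833          # length of the online consent document
MIN_READING_SECONDS = 566      # minimum predicted reading time
MEDIAN_CONSENT_SECONDS = 53    # median observed time to consent


def words_per_minute(words: int, seconds: float) -> float:
    """Reading rate implied by covering `words` in `seconds`."""
    return words / seconds * 60


# Rate implied by the minimum predicted reading time (assumed fixed-rate model).
implied_max_rate = words_per_minute(DOCUMENT_WORDS, MIN_READING_SECONDS)

# Rate a participant would need to read every word in the median consent time.
median_required_rate = words_per_minute(DOCUMENT_WORDS, MEDIAN_CONSENT_SECONDS)

print(f"implied maximum reading rate: {implied_max_rate:.0f} words/min")
print(f"rate needed at the median:    {median_required_rate:.0f} words/min")
```

Under that assumption, the prediction corresponds to roughly 300 words per minute, and reading the full text within the median 53 seconds would require more than ten times that rate.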
Scribe: A Document Specification Language and Its Compiler
1980-10-01
34" prints today’s date as "Samedi, le 13 Decembre, 1980". The template "el 8 de Marzo de 1952" prints today’s date as "el 13 de Diciembre de 1980". The...Letter spacing and kerning 20 3.12 Ligatures 24 3.1.3 Diacritical Marks 24 3.2 Lineation and Word Placement 27 3.2.1 Word Spacing and Justification 27...letterhead. 67 Figure 24 : Document format definition for CMU thesis. 68 Figure 25: Twenty basic rules for indexers, from Collison [11]. 74 Figure 26
2003-04-01
8 Deconstructing the model’s output................................................................................ 9 Implications of the ideas...identified characters of a word are used as a probe to retrieve a word’s identity (its spelling and phonology ) from memory. In addition to the...document matrix has been reduced by the SVD. Deconstructing the model’s output Why do semantic relationships between words emerge from the model? Is the
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-02
... number of the interim rule published on July 13, 2012, in the words of issuance. This document corrects... July 13, 2012, as 41427 instead of 41247 in the words of issuance. The page number is correctly listed...
Responding to Nonwords in the Lexical Decision Task: Insights from the English Lexicon Project
Yap, Melvin J.; Sibley, Daragh E.; Balota, David A.; Ratcliff, Roger; Rueckl, Jay
2014-01-01
Researchers have extensively documented how various statistical properties of words (e.g., word-frequency) influence lexical processing. However, the impact of lexical variables on nonword decision-making performance is less clear. This gap is surprising, since a better specification of the mechanisms driving nonword responses may provide valuable insights into early lexical processes. In the present study, item-level and participant-level analyses were conducted on the trial-level lexical decision data for almost 37,000 nonwords in the English Lexicon Project in order to identify the influence of different psycholinguistic variables on nonword lexical decision performance, and to explore individual differences in how participants respond to nonwords. Item-level regression analyses reveal that nonword response time was positively correlated with number of letters, number of orthographic neighbors, number of affixes, and baseword number of syllables, and negatively correlated with Levenshtein orthographic distance and baseword frequency. Participant-level analyses also point to within- and between-session stability in nonword responses across distinct sets of items, and intriguingly reveal that higher vocabulary knowledge is associated with less sensitivity to some dimensions (e.g., number of letters) but more sensitivity to others (e.g., baseword frequency). The present findings provide well-specified and interesting new constraints for informing models of word recognition and lexical decision. PMID:25329078
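Levenshtein orthographic distance builds on the classic edit distance between two letter strings. A minimal sketch follows; the abstract does not specify the exact measure used in the study, so the neighbourhood size `k`, the helper name `mean_distance_to_nearest`, and the toy lexicon in the usage note are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string `a` into string `b`."""
    # previous[j] is the distance between the prefix of `a` processed so far
    # and the first j characters of `b`; it starts with the empty prefix.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def mean_distance_to_nearest(item: str, lexicon: list[str], k: int) -> float:
    """Mean Levenshtein distance from `item` to its k closest lexicon entries.

    Hypothetical helper: assumes a non-empty lexicon and k >= 1.
    """
    distances = sorted(levenshtein(item, word) for word in lexicon)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)
```

For example, `levenshtein("kitten", "sitting")` is 3, and a nonword such as "blint" sits at a mean distance of 1.0 from its two closest entries in the toy lexicon `["blind", "flint", "print", "table"]`. A smaller mean distance indicates a denser orthographic neighbourhood.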
The Processing of Novel and Lexicalised Prefixed Words in Reading
ERIC Educational Resources Information Center
Pollatsek, Alexander; Slattery, Timothy J.; Juhasz, Barbara
2008-01-01
Two experiments compared how relatively long novel prefixed words (e.g., "overfarm") and existing prefixed words were processed in reading. The use of novel prefixed words allows one to examine the roles of whole-word access and decompositional processing in the processing of non-novel prefixed words. The two experiments found that,…
Ramanujam, Nedunchelian; Kaliappan, Manivannan
2016-01-01
Automatic multidocument text summarization systems can now successfully retrieve summary sentences from the input documents. However, they have many limitations, such as inaccurate extraction of essential sentences, low coverage, poor coherence among the sentences, and redundancy. This paper introduces a timestamp approach combined with a Naïve Bayesian classification approach for multidocument text summarization. The timestamp gives the summary an ordered look, which makes the summary read coherently. It extracts the more relevant information from the multiple documents. A scoring strategy is also used to calculate a score for each word to obtain the word frequency. Linguistic quality is estimated in terms of readability and comprehensibility. To show the efficiency of the proposed method, this paper compares it with the existing MEAD algorithm. The timestamp procedure is also applied to the MEAD algorithm, and the results are compared with those of the proposed method. The results show that the proposed method takes less time than the existing MEAD algorithm to execute the summarization process. Moreover, the proposed method achieves better precision, recall, and F-score than the existing clustering with lexical chaining approach. PMID:27034971
Using Serial and Discrete Digit Naming to Unravel Word Reading Processes
Altani, Angeliki; Protopapas, Athanassios; Georgiou, George K.
2018-01-01
During reading acquisition, word recognition is assumed to undergo a developmental shift from slow serial/sublexical processing of letter strings to fast parallel processing of whole word forms. This shift has been proposed to be detected by examining the size of the relationship between serial- and discrete-trial versions of word reading and rapid naming tasks. Specifically, a strong association between serial naming of symbols and single word reading suggests that words are processed serially, whereas a strong association between discrete naming of symbols and single word reading suggests that words are processed in parallel as wholes. In this study, 429 Grade 1, 3, and 5 English-speaking Canadian children were tested on serial and discrete digit naming and word reading. Across grades, single word reading was more strongly associated with discrete naming than with serial naming of digits, indicating that short high-frequency words are processed as whole units early in the development of reading ability in English. In contrast, serial naming was not a unique predictor of single word reading across grades, suggesting that within-word sequential processing was not required for the successful recognition for this set of words. Factor mixture analysis revealed that our participants could be clustered into two classes, namely beginning and more advanced readers. Serial naming uniquely predicted single word reading only among the first class of readers, indicating that novice readers rely on a serial strategy to decode words. Yet, a considerable proportion of Grade 1 students were assigned to the second class, evidently being able to process short high-frequency words as unitized symbols. We consider these findings together with those from previous studies to challenge the hypothesis of a binary distinction between serial/sublexical and parallel/lexical processing in word reading. 
We argue instead that sequential processing in word reading operates on a continuum, depending on the level of reading proficiency, the degree of orthographic transparency, and word-specific characteristics. PMID:29706918
Computerized training management system
Rice, H.B.; McNair, R.C.; White, K.; Maugeri, T.
1998-08-04
A Computerized Training Management System (CTMS) is disclosed for providing a procedurally defined process that is employed to develop accreditable performance based training programs for job classifications that are sensitive to documented regulations and technical information. CTMS is a database that links information needed to maintain a five-phase approach to training (analysis, design, development, implementation, and evaluation) independent of training program design. CTMS is designed using R-Base™, an SQL-compliant software platform. Information is logically entered and linked in CTMS. Each task is linked directly to a performance objective, which, in turn, is linked directly to a learning objective; then, each enabling objective is linked to its respective test items. In addition, tasks, performance objectives, enabling objectives, and test items are linked to their associated reference documents. CTMS keeps all information up to date since it automatically sorts, files and links all data; CTMS includes key word and reference document searches. 18 figs.
Computerized training management system
Rice, Harold B.; McNair, Robert C.; White, Kenneth; Maugeri, Terry
1998-08-04
A Computerized Training Management System (CTMS) provides a procedurally defined process that is employed to develop accreditable performance based training programs for job classifications that are sensitive to documented regulations and technical information. CTMS is a database that links information needed to maintain a five-phase approach to training (analysis, design, development, implementation, and evaluation) independent of training program design. CTMS is designed using R-Base®, an SQL-compliant software platform. Information is logically entered and linked in CTMS. Each task is linked directly to a performance objective, which, in turn, is linked directly to a learning objective; then, each enabling objective is linked to its respective test items. In addition, tasks, performance objectives, enabling objectives, and test items are linked to their associated reference documents. CTMS keeps all information up to date since it automatically sorts, files and links all data; CTMS includes key word and reference document searches.
A boy asked his Mom about energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mutolo, Paul F.; Muller, David; O'Dea, James
Representing the Energy Materials Center (EMC), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of EMC is advancing the science of energy conversion and storage by understanding and exploiting fundamental properties of active materials and their interfaces.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rameau, Jon; Crabtree, George; Greene, Laura
Representing the Center for Emergent Superconductivity (CES), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CES is to discover new high-temperature superconductors and improve the performance of known superconductors by understanding the fundamental physics of superconductivity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, Chuanqi; Liang, Yan; Sahl, Lars
Representing the Center for Solar Fuels (CSF), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CSF is to provide the basic research to enable a revolution in the collection and conversion of sunlight into storable solar fuels.
Some observations on the interdigitation of advances in medical science and mathematics.
Glamore, Michael James; West, James L; O'leary, James Patrick
2013-12-01
The immense advancement of our understanding of disease processes has not been a uniform progression related to the passage of time. Advances have been made in "lurches" and "catches" since the advent of the written word. There has been a remarkable interdependency between such advances in medicine and advances in mathematics that has proved beneficial to both. This work explores some of these critical relationships and documents how the individuals involved contributed to advances in each.
CLOCS (Computer with Low Context-Switching Time) Architecture Reference Documents
1988-05-06
Peculiarities The only state inside the central processing unit (CPU) is a program status word. All data operations are memory to memory. One result of this... to the challenge "if I were to design RISC, this is how I would do it." The architecture was designed by Mark Davis and Bill Gallmeister. 1.2...are memory to memory. Any special devices added should be memory mapped. The program counter is even memory mapped. 1.3.1 Working storage There is no
Collaborative writing: Tools and tips.
Eapen, Bell Raj
2007-01-01
The majority of technical writing is done by groups of experts, and various web-based applications have made this collaboration easy. Email exchange of word processor documents with tracked changes used to be the standard technique for collaborative writing. However, web-based tools such as Google Docs and Spreadsheets have made the process fast and efficient. Various versioning tools and synchronous editors are available for those who need additional functionality. Having a group leader who decides the scheduling, communication, and conflict-resolution protocols is important for successful collaboration.
Adamson, Lauren B.; Bakeman, Roger; Brandon, Benjamin
2015-01-01
This study documents how parents weave new words into on-going interactions with children who are just beginning to speak. Dyads with typically developing toddlers and with young children with autism spectrum disorder and Down syndrome (n = 56, 23, and 29) were observed using a Communication Play Protocol during which parents could use novel words to refer to novel objects. Parents readily introduced both labels and sound words even when their child did not respond expressively or produce the words. Results highlight both how parents act in ways that may facilitate their child's appreciation of the relation between a new word and its referent and how they subtly adjust their actions to suit their child's level of word learning and specific learning challenges. PMID:25863927
New Tools to Convert PDF Math Contents into Accessible e-Books Efficiently.
Suzuki, Masakazu; Terada, Yugo; Kanahori, Toshihiro; Yamaguchi, Katsuhito
2015-01-01
We present new features in our math-OCR software for converting PDF math content into accessible e-books. The method for recognizing PDF has been thoroughly improved. In addition, content in any selected area of a PDF file, including math formulas, can be cut and pasted into a document in various accessible formats; it is automatically recognized and converted into text and accessible math formulas in the process. By combining this with our authoring tool for technical documents, one can easily produce accessible e-books in various formats such as DAISY, accessible EPUB3, DAISY-like HTML5, Microsoft Word with math objects, and so on. This content is useful for various print-disabled students, ranging from the blind to the dyslexic.
An introduction to information retrieval: applications in genomics
Nadkarni, P M
2011-01-01
Information retrieval (IR) is the field of computer science that deals with the processing of documents containing free text, so that they can be rapidly retrieved based on keywords specified in a user’s query. IR technology is the basis of Web-based search engines, and plays a vital role in biomedical research, because it is the foundation of software that supports literature search. Documents can be indexed by both the words they contain, as well as the concepts that can be matched to domain-specific thesauri; concept matching, however, poses several practical difficulties that make it unsuitable for use by itself. This article provides an introduction to IR and summarizes various applications of IR and related technologies to genomics. PMID:12049181
Critical Linguistics: A Starting Point for Oppositional Reading.
ERIC Educational Resources Information Center
Janks, Hilary
This document focuses on specific linguistic features that serve ideological functions in texts written in South Africa from 1985 to 1988. The features examined include: naming; metaphors; old words with new meanings; words becoming tainted; renaming or lexicalization; overlexicalization; strategies for resisting classification; tense and aspect;…
1988-04-01
e.g., definitions, references, pictures) on the selected item in a separate window. For example, in a hypertext document on astronomy, the reader...might arrive at the highlighted word "Copernicus", select the word with the keyboard or mouse, and then be offered a number of related topics from
ERIC Educational Resources Information Center
Bolger, Charlene
A compilation of over 50 elementary school activities focuses on developing students' familiarity with the 50 states. Exercises such as word searches, scrambled word puzzles, shape puzzles, spelling bees, match games, and atlas games introduce students to the capitals, major cities, main characteristics, and location of each state. The document is…
Word Processing. A Handbook for Business Teachers.
ERIC Educational Resources Information Center
Stewart, Jeffrey R., Jr., Ed.
This handbook is designed to provide information to help teachers keep abreast of changes in word processing and to develop necessary teaching skills. The handbook is divided into two main parts: understanding word processing and teaching word processing skills. In the introduction the part word processing plays in the business scheme of a company…
Rashotte, Judy; Coburn, Geraldine; Harrison, Denise; Stevens, Bonnie J; Yamada, Janet; Abbott, Laura K
2013-01-01
Although documentation of children's pain by health care professionals is frequently undertaken, few studies have explored the nature of the language used to describe pain in the medical records of hospitalized children. To describe health care professionals' use of written language related to the quality and quantity of pain experienced by hospitalized children. Free-text pain narratives documented during a 24 h period were collected from the medical records of 3822 children (0 to 18 years of age) hospitalized on 32 inpatient units in eight Canadian pediatric hospitals. A qualitative descriptive exploration using a content analysis approach was used. Pain narratives were documented a total of 5390 times in 1518 of the 3822 children's medical records (40%). Overall, word choices represented objective and subjective descriptors. Two major categories were identified, with their respective subcategories of word indicators and associated cues: indicators of pain, including behavioural (e.g., vocal, motor, facial and activities cues), affective and physiological cues, and children's descriptors; and word qualifiers, including intensity, comparator and temporal qualifiers. The richness and complexity of vocabulary used by clinicians to document children's pain lend support to the concept that the word 'pain' is a label that represents a myriad of different experiences. There is potential to refine pediatric pain assessment measures to be inclusive of other cues used to identify children's pain. The results enhance the discussion concerning the development of standardized nomenclature. Further research is warranted to determine whether there is congruence in interpretation across time, place and individuals.
Individual differences in emotion word processing: A diffusion model analysis.
Mueller, Christina J; Kuchinke, Lars
2016-06-01
The exploratory study investigated individual differences in implicit processing of emotional words in a lexical decision task. A processing advantage for positive words was observed, and differences between happy and fear-related words in response times were predicted by individual differences in specific variables of emotion processing: Whereas more pronounced goal-directed behavior was related to a specific slowdown in processing of fear-related words, the rate of spontaneous eye blinks (indexing brain dopamine levels) was associated with a processing advantage of happy words. Estimating diffusion model parameters revealed that the drift rate (rate of information accumulation) captures unique variance of processing differences between happy and fear-related words, with highest drift rates observed for happy words. Overall emotion recognition ability predicted individual differences in drift rates between happy and fear-related words. The findings emphasize that a significant amount of variance in emotion processing is explained by individual differences in behavioral data.
Lin, Ching-Heng; Wu, Nai-Yuan; Lai, Wei-Shao; Liou, Der-Ming
2015-01-01
Electronic medical records with encoded entries should enhance the semantic interoperability of document exchange. However, it remains a challenge to encode the narrative concept and to transform the coded concepts into a standard entry-level document. This study aimed to use a novel approach for the generation of entry-level interoperable clinical documents. Using HL7 clinical document architecture (CDA) as the example, we developed three pipelines to generate entry-level CDA documents. The first approach was a semi-automatic annotation pipeline (SAAP), the second was a natural language processing (NLP) pipeline, and the third merged the above two pipelines. We randomly selected 50 test documents from the i2b2 corpora to evaluate the performance of the three pipelines. The 50 randomly selected test documents contained 9365 words, including 588 Observation terms and 123 Procedure terms. For the Observation terms, the merged pipeline had a significantly higher F-measure than the NLP pipeline (0.89 vs 0.80, p<0.0001), but a similar F-measure to that of the SAAP (0.89 vs 0.87). For the Procedure terms, the F-measure was not significantly different among the three pipelines. The combination of a semi-automatic annotation approach and the NLP application seems to be a solution for generating entry-level interoperable clinical documents.
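The pipelines above are compared by F-measure. The abstract does not state which variant was used; the sketch below assumes the common balanced form (F1), the harmonic mean of precision and recall, and the counts in the usage note are made up for illustration.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    predicted = true_positives + false_positives   # terms the pipeline annotated
    actual = true_positives + false_negatives      # terms in the gold standard
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0  # no correct annotations: define the score as zero
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)
```

With hypothetical counts of 8 correct annotations, 2 spurious ones, and 2 missed terms, precision and recall are both 0.8, so `f_measure(8, 2, 2)` is also approximately 0.8.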
Word add-in for ontology recognition: semantic enrichment of scientific literature
2010-01-01
Background In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. Results The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. Conclusions The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark-up their own work will help increase the amount and quality of machine-readable literature metadata. PMID:20181245
Authorship Discovery in Blogs Using Bayesian Classification with Corrective Scaling
2008-06-01
4 2.3 W. Fucks ’ Diagram of n-Syllable Word Frequencies . . . . . . . . . . . . . . 5 3.1 Confusion Matrix for All Test Documents of 500...of the books which scholars believed he had. • Wilhelm Fucks discriminated between authors using the average number of syllables per word and average...distance between equal-syllabled words [8]. Fucks , too, concluded that a study such as his reveals a “possibility of a quantitative classification
Comparing Medline citations using modified N-grams
Nawab, Rao Muhammad Adeel; Stevenson, Mark; Clough, Paul
2014-01-01
Objective We aim to identify duplicate pairs of Medline citations, particularly when the documents are not identical but contain similar information. Materials and methods Duplicate pairs of citations are identified by comparing word n-grams in pairs of documents. N-grams are modified using two approaches which take account of the fact that the document may have been altered. These are: (1) deletion, an item in the n-gram is removed; and (2) substitution, an item in the n-gram is substituted with a similar term obtained from the Unified Medical Language System Metathesaurus. N-grams are also weighted using a score derived from a language model. Evaluation is carried out using a set of 520 Medline citation pairs, including a set of 260 manually verified duplicate pairs obtained from the Deja Vu database. Results The approach accurately detects duplicate Medline document pairs with an F1 measure score of 0.99. Allowing for word deletions and substitution improves performance. The best results are obtained by combining scores for n-grams of length 1–5 words. Discussion Results show that the detection of duplicate Medline citations can be improved by modifying n-grams and that high performance can also be obtained using only unigrams (F1=0.959), particularly when allowing for substitutions of alternative phrases. PMID:23715801
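The deletion-based modification described above can be sketched as follows. This is only an illustration under stated assumptions: the containment score and the function names are chosen here for simplicity, and the substitution step (which needs the Unified Medical Language System Metathesaurus) and the language-model weighting are left out.

```python
def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """All contiguous word n-grams of `text`, lower-cased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def with_deletions(ngrams: set[tuple[str, ...]]) -> set[tuple[str, ...]]:
    """Add every variant obtained by removing one item from an n-gram."""
    variants = set(ngrams)
    for gram in ngrams:
        if len(gram) > 1:  # deleting from a unigram would leave nothing
            for i in range(len(gram)):
                variants.add(gram[:i] + gram[i + 1:])
    return variants


def containment(a: str, b: str, n: int, allow_deletion: bool = False) -> float:
    """Share of a's (optionally modified) n-grams that also occur in b's.

    Assumed scoring function; the abstract does not name the one it uses.
    """
    grams_a = word_ngrams(a, n)
    grams_b = word_ngrams(b, n)
    if not grams_a:
        return 0.0
    if allow_deletion:
        grams_a, grams_b = with_deletions(grams_a), with_deletions(grams_b)
    return len(grams_a & grams_b) / len(grams_a)
```

For the made-up pair "statins reduce the risk of stroke" and "statins reduce risk of stroke in adults", strict trigram containment is 0.25, because the dropped word "the" breaks three of the four trigrams. Allowing deletions recovers matches such as ("statins", "reduce") and raises the score, which is the kind of robustness to edited text that the modified n-grams aim for.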
Noel, Jean-Paul; Blanke, Olaf; Serino, Andrea; Salomon, Roy
2017-01-01
The construct of the “self” is conceived as being fundamental in promoting survival. As such, extensive studies have documented preferential processing of self-relevant stimuli. For example, attributes that relate to the self are better encoded and retrieved, and are more readily consciously perceived. The preferential processing of self-relevant information, however, appears to be especially true for physical (e.g., faces), as opposed to psychological (e.g., traits), conceptions of the self. Here, we test whether semantic attributes that participants judge as self-relevant are processed unconsciously to a greater extent than attributes that were not judged as self-relevant. In Experiment 1, a continuous flash suppression paradigm was employed with “self” and “non-self” attribute words being presented subliminally, and we asked participants to categorize unseen words as either self-related or not. In a second experiment, we attempted to boost putative preferential self-processing by relation to its physical conception, that is, one’s own body. To this aim, we repeated Experiment 1 while administering acoustic stimuli either close or far from the body, i.e., within or outside peripersonal space. Results of both Experiments 1 and 2 demonstrate no difference in breaking suppression for self and non-self words. Additionally, we found that while participants were able to process the physical location of the unseen words (above or below fixation) they were not able to categorize these as self-relevant or not. Finally, results showed that sounds presented in the extra-personal space elicited a more stringent response criterion for “self” in the process of categorizing unseen visual stimuli. This shift in criterion as a consequence of sound location was restricted to the self, as no such effect was observed in the categorization of attributes occurring above or below fixation.
Overall, our findings seem to indicate that subliminally presented stimuli are not semantically processed, at least inasmuch as to be categorized as self-relevant or not. However, we do demonstrate that the distance at which acoustic stimuli are presented may alter the balance between self- and non-self biases. PMID:28197110
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-31
... of multiple mandatory documents including: (1) a PDF fillable Applicant intake form; (2) a Microsoft Excel Workbook; (3) a Microsoft Word Narrative template; and (4) other mandatory attachments. (Applicants must use the Microsoft Word Narrative template the CDFI Fund provides; alternative templates...
Graph-based word sense disambiguation of biomedical documents.
Agirre, Eneko; Soroa, Aitor; Stevenson, Mark
2010-11-15
Word Sense Disambiguation (WSD), automatically identifying the meaning of ambiguous words in context, is an important stage of text processing. This article presents a graph-based approach to WSD in the biomedical domain. The method is unsupervised and does not require any labeled training data. It makes use of knowledge from the Unified Medical Language System (UMLS) Metathesaurus which is represented as a graph. A state-of-the-art algorithm, Personalized PageRank, is used to perform WSD. When evaluated on the NLM-WSD dataset, the algorithm outperforms other methods that rely on the UMLS Metathesaurus alone. The WSD system is open source licensed and available from http://ixa2.si.ehu.es/ukb/. The UMLS, MetaMap program and NLM-WSD corpus are available from the National Library of Medicine https://www.nlm.nih.gov/research/umls/, http://mmtx.nlm.nih.gov and http://wsd.nlm.nih.gov. Software to convert the NLM-WSD corpus into a format that can be used by our WSD system is available from http://www.dcs.shef.ac.uk/~marks/biomedical_wsd under open source license.
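As a rough illustration of how Personalized PageRank can rank candidate senses, the Python sketch below runs a plain power iteration on a tiny hand-made concept graph. It is not the authors' system: the graph, the node names, the choice of seed nodes, and the parameter values are all hypothetical, and a real setup would build the graph from the UMLS Metathesaurus.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank on an adjacency-list graph.

    Random restarts jump only to the seed nodes, so rank mass concentrates
    on nodes that are well connected to the seeds.
    """
    nodes = list(graph)
    teleport = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) * teleport[v] for v in nodes}
        for v in nodes:
            neighbours = graph[v]
            if not neighbours:
                continue  # a node without edges passes on no rank
            share = damping * rank[v] / len(neighbours)
            for u in neighbours:
                new_rank[u] += share
        rank = new_rank
    return rank


# Hypothetical undirected concept graph (each edge listed in both directions).
toy_graph = {
    "common_cold": ["virus", "fever"],
    "cold_temperature": ["weather"],
    "virus": ["common_cold", "fever"],
    "fever": ["common_cold", "virus"],
    "weather": ["cold_temperature"],
}

# Assume the context of the ambiguous word "cold" mentions a virus and a fever.
context_concepts = {"virus", "fever"}
rank = personalized_pagerank(toy_graph, context_concepts)

# Choose the candidate sense that accumulates the most rank.
candidate_senses = ["common_cold", "cold_temperature"]
best_sense = max(candidate_senses, key=rank.get)
```

Seeding the restart distribution with context concepts is what makes the ranking "personalized": with a uniform restart distribution the same graph would give both senses some rank regardless of context.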
Working memory and flexibility in awareness and attention.
Bunting, Michael F; Cowan, Nelson
2005-06-01
We argue that attention and awareness form the basis of one type of working-memory storage. In contrast to models of working memory in which storage and retrieval occur effortlessly, we document that an attention-demanding goal conflict within a retrieval cue impairs recall from working memory. In a conceptual span task, semantic and color-name cues prompted recall of four consecutive words from a twelve-word list. The first-four, middle-four, and final-four words belonged to different semantic categories (e.g., body parts, animals, and tools) and were shown in different colors (e.g., red, blue, and green). In Experiment 1, the color of the cue matched that of cued items 75% of the time, and the rare mismatch impaired recall. In Experiment 2, though, the color of the cue matched that of the cued items only 25% of the time, and the now-more-frequent mismatches no longer mattered. These results are difficult to explain with passive storage alone and indicate that a processing difficulty impedes recall from working memory, presumably by distracting attention away from its storage function.
Abbassi, Ensie; Blanchette, Isabelle; Ansaldo, Ana I; Ghassemzadeh, Habib; Joanette, Yves
2015-01-01
Emotional words are processed rapidly and automatically in the left hemisphere (LH) and slowly, with the involvement of attention, in the right hemisphere (RH). This review aims to find the reason for this difference and suggests that emotional words can be processed superficially or deeply due to the involvement of the linguistic and imagery systems, respectively. During superficial processing, emotional words likely make connections only with semantically associated words in the LH. This part of the process is automatic and may be sufficient for the purpose of language processing. Deep processing, in contrast, seems to involve conceptual information and imagery of a word's perceptual and emotional properties using autobiographical memory contents. Imagery and the involvement of autobiographical memory likely differentiate between emotional and neutral word processing and explain the salient role of the RH in emotional word processing. It is concluded that the level of emotional word processing in the RH should be deeper than in the LH and, thus, it is conceivable that the slow mode of processing adds certain qualities to the output.
Tardif, Twila; Fletcher, Paul; Liang, Weilan; Zhang, Zhixiang; Kaciroti, Niko; Marchman, Virginia A
2008-07-01
Although there has been much debate over the content of children's first words, few large sample studies address this question for children at the very earliest stages of word learning. The authors report data from comparable samples of 265 English-, 336 Putonghua- (Mandarin), and 369 Cantonese-speaking 8- to 16-month-old infants whose caregivers completed MacArthur-Bates Communicative Development Inventories and reported them to produce between 1 and 10 words. Analyses of individual words indicated striking commonalities in the first words that children learn. However, substantive cross-linguistic differences appeared in the relative prevalence of common nouns, people terms, and verbs as well as in the probability that children produced even one of these word types when they had a total of 1-3, 4-6, or 7-10 words in their vocabularies. These data document cross-linguistic differences in the types of words produced even at the earliest stages of vocabulary learning and underscore the importance of parental input and cross-linguistic/cross-cultural variations in children's early word-learning.
Information extraction for enhanced access to disease outbreak reports.
Grishman, Ralph; Huttunen, Silja; Yangarber, Roman
2002-08-01
Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.
Emotional words facilitate lexical but not early visual processing.
Trauer, Sophie M; Kotz, Sonja A; Müller, Matthias M
2015-12-12
Emotional scenes and faces have been shown to capture and bind visual resources at early sensory processing stages, i.e. in early visual cortex. However, emotional words have led to mixed results. In the current study ERPs were assessed simultaneously with steady-state visual evoked potentials (SSVEPs) to measure attention effects on early visual activity in emotional word processing. Neutral and negative words were flickered at 12.14 Hz whilst participants performed a Lexical Decision Task. Emotional word content did not modulate the 12.14 Hz SSVEP amplitude, nor did word lexicality. However, emotional words affected the ERP. Negative compared to neutral words, as well as words compared to pseudowords, led to enhanced deflections in the P2 time range indicative of lexico-semantic access. The N400 was reduced for negative compared to neutral words and enhanced for pseudowords compared to words, indicating facilitated semantic processing of emotional words. LPC amplitudes reflected word lexicality and thus the task-relevant response. In line with previous ERP and imaging evidence, the present results indicate that written emotional words are facilitated in processing only subsequent to visual analysis.
Suicidal traits in Marilyn Monroe's Fragments: an LIWC analysis.
Fernández-Cabana, M; García-Caballero, A; Alves-Pérez, M T; García-García, M J; Mateos, R
2013-01-01
Linguistic inquiry and word count (LIWC), a computerized method for text analysis, is often used to examine suicide writings in order to characterize the quantitative linguistic features of suicidal texts. The aim was to analyze texts compiled in Marilyn Monroe's Fragments using LIWC, in order to explore the use of different linguistic categories in her narrative over the years. Selected texts were grouped into four periods of similar word count and processed with LIWC. Spearman's rank correlation was used to assess changes in language use across the documents over time. The Kruskal-Wallis test was applied to compare means between periods and for each of the 80 LIWC output scores. Significant differences (p < .05) were found in 11 categories, the most relevant being a progressive decrease in the use of negative emotion words, a reduction in the use of long words in the third period, and an increase in the proportion of personal pronouns used as Monroe approached the time of her death. The consistently elevated usage of first-person personal singular pronouns and the consistently diminished usage of first-person personal plural pronouns are in line with previous studies linking this pattern with a low level of social integration, which has been related to suicide according to different theories.
Automatic Keyword Extraction from Individual Documents
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rose, Stuart J.; Engel, David W.; Cramer, Nicholas O.
2010-05-03
This paper introduces a novel and domain-independent method for automatically extracting keywords, as sequences of one or more words, from individual documents. We describe the method’s configuration parameters and algorithm, and present an evaluation on a benchmark corpus of technical abstracts. We also present a method for generating lists of stop words for specific corpora and domains, and evaluate its ability to improve keyword extraction on the benchmark corpus. Finally, we apply our method of automatic keyword extraction to a corpus of news articles and define metrics for characterizing the exclusivity, essentiality, and generality of extracted keywords within a corpus.
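One way to picture keyword extraction from a single document, with keywords taken as sequences of one or more words, is the Python sketch below. It splits text into candidate phrases at stop words and punctuation, then scores each phrase from simple word co-occurrence counts. The abstract does not spell out the algorithm, so the splitting rule, the degree-to-frequency scoring, the function names, and the tiny stop-word list are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Toy stop-word list; a real list would be far larger or generated per corpus.
STOP_WORDS = {"a", "an", "and", "for", "from", "in", "of", "on", "the", "to", "we"}


def candidate_phrases(text, stop_words):
    """Split text into runs of consecutive non-stop words."""
    phrases = []
    # Punctuation always ends a candidate phrase.
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in stop_words:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def score_phrases(phrases):
    """Score each phrase as the sum of its words' degree/frequency ratios
    (an assumed scoring rule, chosen for illustration)."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences within the phrase, itself included
    word_score = {w: degree[w] / freq[w] for w in freq}
    return {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}


sample = "We present a method for automatic keyword extraction from individual documents."
scores = score_phrases(candidate_phrases(sample, STOP_WORDS))
top_keyword = max(scores, key=scores.get)
```

Because the stop-word list decides where phrases are cut, changing it changes the candidates themselves, which is why a method for generating domain-specific stop lists, as mentioned in the abstract, can affect extraction quality.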
SDDL- SOFTWARE DESIGN AND DOCUMENTATION LANGUAGE
NASA Technical Reports Server (NTRS)
Kleine, H.
1994-01-01
Effective, efficient communication is an essential element of the software development process. The Software Design and Documentation Language (SDDL) provides an effective communication medium to support the design and documentation of complex software applications. SDDL supports communication between all the members of a software design team and provides for the production of informative documentation on the design effort. Even when an entire development task is performed by a single individual, it is important to explicitly express and document communication between the various aspects of the design effort including concept development, program specification, program development, and program maintenance. SDDL ensures that accurate documentation will be available throughout the entire software life cycle. SDDL offers an extremely valuable capability for the design and documentation of complex programming efforts ranging from scientific and engineering applications to data management and business systems. Throughout the development of a software design, the SDDL generated Software Design Document always represents the definitive word on the current status of the ongoing, dynamic design development process. The document is easily updated and readily accessible in a familiar, informative form to all members of the development team. This makes the Software Design Document an effective instrument for reconciling misunderstandings and disagreements in the development of design specifications, engineering support concepts, and the software design itself. Using the SDDL generated document to analyze the design makes it possible to eliminate many errors that might not be detected until coding and testing are attempted. As a project management aid, the Software Design Document is useful for monitoring progress and for recording task responsibilities. SDDL is a combination of language, processor, and methodology.
The SDDL syntax consists of keywords to invoke design structures and a collection of directives which control processor actions. The designer has complete control over the choice of keywords, commanding the capabilities of the processor in a way which is best suited to communicating the intent of the design. The SDDL processor translates the designer's creative thinking into an effective document for communication. The processor performs as many automatic functions as possible, thereby freeing the designer's energy for the creative effort. Document formatting includes graphical highlighting of structure logic, accentuation of structure escapes and module invocations, logic error detection, and special handling of title pages and text segments. The SDDL generated document contains software design summary information including module invocation hierarchy, module cross reference, and cross reference tables of user selected words or phrases appearing in the document. The basic forms of the methodology are module and block structures and the module invocation statement. A design is stated in terms of modules that represent problem abstractions which are complete and independent enough to be treated as separate problem entities. Blocks are lower-level structures used to build the modules. Both kinds of structures may have an initiator part, a terminator part, an escape segment, or a substructure. The SDDL processor is written in PASCAL for batch execution on a DEC VAX series computer under VMS. SDDL was developed in 1981 and last updated in 1984.
Automatic Processing of Emotional Words in the Absence of Awareness: The Critical Role of P2
Lei, Yi; Dou, Haoran; Liu, Qingming; Zhang, Wenhai; Zhang, Zhonglu; Li, Hong
2017-01-01
It has been long debated to what extent emotional words can be processed in the absence of awareness. Behavioral studies have shown that the meaning of emotional words can be accessed even without any awareness. However, functional magnetic resonance imaging studies have revealed that emotional words that are unconsciously presented do not activate the brain regions involved in semantic or emotional processing. To clarify this point, we used continuous flash suppression (CFS) and event-related potential (ERP) techniques to distinguish between semantic and emotional processing. In CFS, we successively flashed some Mondrian-style images into one participant's eye steadily, which suppressed the images projected to the other eye. Negative, neutral, and scrambled words were presented to 16 healthy participants for 500 ms. Whenever the participants saw the stimuli—in both visible and invisible conditions—they pressed specific keyboard buttons. Behavioral data revealed that there was no difference in reaction time to negative words and to neutral words in the invisible condition, although negative words were processed faster than neutral words in the visible condition. The ERP results showed that negative words elicited a larger P2 amplitude in the invisible condition than in the visible condition. The P2 component was enhanced for the neutral words compared with the scrambled words in the visible condition; however, the scrambled words elicited larger P2 amplitudes than the neutral words in the invisible condition. These results suggest that the emotional processing of words is more sensitive than semantic processing in the conscious condition. Semantic processing was found to be attenuated in the absence of awareness. Our findings indicate that P2 plays an important role in the unconscious processing of emotional words, which highlights the fact that emotional processing may be automatic and prioritized compared with semantic processing in the absence of awareness. 
PMID:28473785
Adamson, Lauren B; Bakeman, Roger; Brandon, Benjamin
2015-05-01
This study documents how parents weave new words into on-going interactions with children who are just beginning to speak. Dyads with typically developing toddlers and with young children with autism spectrum disorder and Down syndrome (n=56, 23, and 29) were observed using a Communication Play Protocol during which parents could use novel words to refer to novel objects. Parents readily introduced both labels and sound words even when their child did not respond expressively or produce the words. Results highlight both how parents act in ways that may facilitate their child's appreciation of the relation between a new word and its referent and how they subtly adjust their actions to suit their child's level of word learning and specific learning challenges. Copyright © 2015 Elsevier Inc. All rights reserved.
PROCESS DOCUMENTATION: A MODEL FOR KNOWLEDGE MANAGEMENT IN ORGANIZATIONS.
Haddadpoor, Asefeh; Taheri, Behjat; Nasri, Mehran; Heydari, Kamal; Bahrami, Gholamreza
2015-10-01
Continuous and interconnected processes are a chain of activities that turn the inputs of an organization into its outputs and help achieve partial and overall goals of the organization. These activities are carried out by two types of knowledge in the organization called explicit and implicit knowledge. Among these, implicit knowledge is the knowledge that controls a major part of the activities of an organization, controls these activities internally and will not be transferred to the process owners unless they are present during the organization's work. Therefore the goal of this study is identification of implicit knowledge and its integration with explicit knowledge in order to improve human resources management, physical resource management, information resource management, training of new employees and other activities of Isfahan University of Medical Science. The project for documentation of activities in the department of health of Isfahan University of Medical Science was carried out in several stages. First the main processes and related sub processes were identified and categorized with the help of a planning expert. The categorization was carried out from smaller processes to larger ones. In this stage the experts of each process wrote down all their daily activities and organized them into general categories based on logical and physical relations between different activities. Then each activity was assigned a specific code. The computer software was designed after understanding the different parts of the processes, including main and sub processes, and categorization, which will be explained in the following sections.
The findings of this study showed that documentation of activities can help expose implicit knowledge because all of the inputs and outputs of a process along with the length, location, tools and different stages of the process, exchanged information, storage location of the information and information flow can be identified using proper documentation. A documentation program can create a complete identifier for every process of an organization and also acts as the main tool for establishment of information technology as the basis of the organization and helps achieve the goal of having electronic and information technology based organizations. In other words, documentation is the starting step in creating an organizational architecture. Afterwards, in order to reach the desired goal of documentation, computer software containing all tools, methods, instructions and guidelines and implicit knowledge of the organization was designed. This software links all relevant knowledge to the main text of the documentation and identification of a process and provides the users with electronic versions of all documentations and helps use the explicit and implicit knowledge of the organization to facilitate the reengineering of the processes in the organization.
Skipped words and fixated words are processed differently during reading.
Eskenazi, Michael A; Folk, Jocelyn R
2015-04-01
The purpose of this study was to investigate whether words are processed differently when they are fixated during silent reading than when they are skipped. According to a serial processing model of eye movement control (e.g., EZ Reader) skipped words are fully processed (Reichle, Rayner, Pollatsek, Behavioral and Brain Sciences, 26(04):445-476, 2003), whereas in a parallel processing model (e.g., SWIFT) skipped words do not need to be fully processed (Engbert, Nuthmann, Richter, Kliegl, Psychological Review, 112(4):777-813, 2005). Participants read 34 sentences with target words embedded in them while their eye movements were recorded. All target words were three-letter, low-frequency, and unpredictable nouns. After the reading session, participants completed a repetition priming lexical decision task with the target words from the reading session included as the repetition prime targets, with presentation of those same words during the reading task acting as the prime. When participants skipped a word during the reading session, their reaction times on the lexical decision task were significantly longer (M = 656.42 ms) than when they fixated the word (M = 614.43 ms). This result provides evidence that skipped words are sometimes not processed to the same degree as fixated words during reading.
Do Chinese Readers Follow the National Standard Rules for Word Segmentation during Reading?
Liu, Ping-Ping; Li, Wei-Jun; Lin, Nan; Li, Xing-Shan
2013-01-01
We conducted a preliminary study to examine whether Chinese readers’ spontaneous word segmentation processing is consistent with the national standard rules of word segmentation based on the Contemporary Chinese language word segmentation specification for information processing (CCLWSSIP). Participants were asked to segment Chinese sentences into individual words according to their prior knowledge of words. The results showed that Chinese readers did not follow the segmentation rules of the CCLWSSIP, and their word segmentation processing was influenced by the syntactic categories of consecutive words. In many cases, the participants did not consider the auxiliary words, adverbs, adjectives, nouns, verbs, numerals and quantifiers as single word units. Generally, Chinese readers tended to combine function words with content words to form single word units, indicating they were inclined to chunk single words into large information units during word segmentation. Additionally, the “overextension of monosyllable words” hypothesis was tested and it might need to be corrected to some degree, implying that word length has an implicit influence on Chinese readers’ segmentation processing. Implications of these results for models of word recognition and eye movement control are discussed. PMID:23408981
DOE Office of Scientific and Technical Information (OSTI.GOV)
Augustyn, Veronica; Ko, Jesse; Rauda, Iris
Representing the Molecularly Engineered Energy Materials (MEEM), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of MEEM, using inexpensive custom-designed molecular building blocks, aims to create revolutionary new materials with self-assembled multi-scale architectures that will enable high performing energy generation and storage applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stocks, G. Malcolm; Morris, James; Sproles, Andrew
Representing the Center for Defect Physics (CDP), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CDP is to enhance our fundamental understanding of defects, defect interactions, and defect dynamics that determine the performance of structural materials in extreme environments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shastry, Tejas
Representing the Argonne-Northwestern Solar Energy Research (ANSER) Center, this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of ANSER is to revolutionize our understanding of molecules, materials and methods necessary to create dramatically more efficient technologies for solar fuels and electricity production.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Montoya, Joseph
Representing the Center on Nanostructuring for Efficient Energy Conversion (CNEEC), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of CNEEC is to understand how nanostructuring can enhance efficiency for energy conversion and solve fundamental cross-cutting problems in advanced energy conversion and storage systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crain, Steven P.; Yang, Shuang-Hong; Zha, Hongyuan
Access to health information by consumers is hampered by a fundamental language gap. Current attempts to close the gap leverage consumer oriented health information, which does not, however, have good coverage of slang medical terminology. In this paper, we present a Bayesian model to automatically align documents with different dialects (slang, common and technical) while extracting their semantic topics. The proposed diaTM model enables effective information retrieval, even when the query contains slang words, by explicitly modeling the mixtures of dialects in documents and the joint influence of dialects and topics on word selection. Simulations using consumer questions to retrieve medical information from a corpus of medical documents show that diaTM achieves a 25% improvement in information retrieval relevance by nDCG@5 over an LDA baseline.
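For readers unfamiliar with the nDCG@5 metric cited in this abstract, the following is a minimal sketch of how it is commonly computed. It assumes the linear-gain form of DCG; the abstract does not say which variant the authors used, and the function names and the example relevance grades are illustrative only.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevances (linear gain)."""
    # Rank positions are 1-based, so the discount for position r is log2(r + 1).
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=5):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades for the top results returned for one query.
score = ndcg_at_k([2, 3, 0, 1, 0], k=5)
```

A ranking that already lists documents in descending relevance scores 1.0; placing relevant documents lower in the list reduces the score.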
Business Documents Don't Have to Be Boring
ERIC Educational Resources Information Center
Schultz, Benjamin
2006-01-01
With business documents, visuals can serve to enhance the written word in conveying the message. Images can be especially effective when used subtly, on part of the page, on successive pages to provide continuity, or even set as watermarks over the entire page. A main reason given for traditional text-only business documents is that they are…
47 CFR 0.409 - Commission policy on private printing of FCC forms.
Code of Federal Regulations, 2014 CFR
2014-10-01
... in quality to the original document, without change to the page size, image size, configuration of... document.” (4) Do not add to the form any other symbol, word or phrase that might be construed as...
47 CFR 0.409 - Commission policy on private printing of FCC forms.
Code of Federal Regulations, 2012 CFR
2012-10-01
... in quality to the original document, without change to the page size, image size, configuration of... document.” (4) Do not add to the form any other symbol, word or phrase that might be construed as...
47 CFR 0.409 - Commission policy on private printing of FCC forms.
Code of Federal Regulations, 2013 CFR
2013-10-01
... in quality to the original document, without change to the page size, image size, configuration of... document.” (4) Do not add to the form any other symbol, word or phrase that might be construed as...
47 CFR 0.409 - Commission policy on private printing of FCC forms.
Code of Federal Regulations, 2011 CFR
2011-10-01
... in quality to the original document, without change to the page size, image size, configuration of... document.” (4) Do not add to the form any other symbol, word or phrase that might be construed as...
Assessing the Readability of Medical Documents: A Ranking Approach.
Zheng, Jiaping; Yu, Hong
2018-03-23
The use of electronic health record (EHR) systems with patient engagement capabilities, including viewing, downloading, and transmitting health information, has recently grown tremendously. However, using these resources to engage patients in managing their own health remains challenging due to the complex and technical nature of the EHR narratives. Our objective was to develop a machine learning-based system to assess readability levels of complex documents such as EHR notes. We collected difficulty ratings of EHR notes and Wikipedia articles using crowdsourcing from 90 readers. We built a supervised model to assess readability based on relative orders of text difficulty using both surface text features and word embeddings. We evaluated system performance using the Kendall coefficient of concordance against human ratings. Our system achieved significantly higher concordance (.734) with human annotators than did a baseline using the Flesch-Kincaid Grade Level, a widely adopted readability formula (.531). The improvement was also consistent across different disease topics. This method's concordance with an individual human user's ratings was also higher than the concordance between different human annotators (.658). We explored methods to automatically assess the readability levels of clinical narratives. Our ranking-based system using simple textual features and easy-to-learn word embeddings outperformed a widely used readability formula. Our ranking-based method can predict relative difficulties of medical documents. It is not constrained to a predefined set of readability levels, a common design in many machine learning-based systems. Furthermore, the feature set does not rely on complex processing of the documents. One potential application of our readability ranking is personalization, allowing patients to better accommodate their own background knowledge. ©Jiaping Zheng, Hong Yu. 
Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 23.03.2018.
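The abstract above reports agreement using the Kendall coefficient of concordance. As a rough illustration of that statistic (not the authors' evaluation code), the sketch below computes Kendall's W for several raters who each rank the same items, assuming complete rankings without ties; the function name and the sample rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's W for m raters ranking the same n items (ranks 1..n, no ties).

    rankings[i][j] is the rank that rater i assigns to item j.
    """
    m = len(rankings)
    n = len(rankings[0])
    # Total rank received by each item across all raters.
    rank_sums = [sum(r[j] for r in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    # Sum of squared deviations of the rank sums from their mean.
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Three hypothetical raters ordering four documents from easiest (1) to hardest (4).
w = kendalls_w([[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4]])
```

W ranges from 0 (no agreement among raters) to 1 (identical rankings), which is the scale on which the reported values of .734, .531, and .658 should be read.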
Chen, Peiyao; Lin, Jie; Chen, Bingle; Lu, Chunming; Guo, Taomei
2015-10-01
Emotional words in a bilingual's second language (L2) seem to have less emotional impact compared to emotional words in the first language (L1). The present study examined the neural mechanisms of emotional word processing in Chinese-English bilinguals' two languages by using both event-related potentials (ERPs) and functional magnetic resonance imaging (fMRI). Behavioral results show a robust positive word processing advantage in L1 such that responses to positive words were faster and more accurate compared to responses to neutral words and negative words. In L2, emotional words only received higher accuracies than neutral words. In ERPs, positive words elicited a larger early posterior negativity and a smaller late positive component than neutral words in L1, while a trend of reduced N400 component was found for positive words compared to neutral words in L2. In fMRI, reduced activation was found for L1 emotional words in both the left middle occipital gyrus and the left cerebellum whereas increased activation in the left cerebellum was found for L2 emotional words. Altogether, these results suggest that emotional word processing advantage in L1 relies on rapid and automatic attention capture while facilitated semantic retrieval might help processing emotional words in L2. Copyright © 2015 Elsevier Ltd. All rights reserved.
The effect of sign language structure on complex word reading in Chinese deaf adolescents.
Lu, Aitao; Yu, Yanping; Niu, Jiaxin; Zhang, John X
2015-01-01
The present study was carried out to investigate whether sign language structure plays a role in the processing of complex words (i.e., derivational and compound words), in particular, the delay of complex word reading in deaf adolescents. Chinese deaf adolescents were found to respond faster to derivational words than to compound words for one-sign-structure words, but showed comparable performance for two-sign-structure words. For both derivational and compound words, response latencies to one-sign-structure words were shorter than to two-sign-structure words. These results provide strong evidence that the structure of sign language affects written word processing in Chinese. Additionally, differences between derivational and compound words in the one-sign-structure condition indicate that Chinese deaf adolescents acquire print morphological awareness. The results also showed that delayed word reading was found in derivational words with two signs (DW-2), compound words with one sign (CW-1), and compound words with two signs (CW-2), but not in derivational words with one sign (DW-1), with the delay being maximum in DW-2, medium in CW-2, and minimum in CW-1, suggesting that the structure of sign language has an impact on the delayed processing of Chinese written words in deaf adolescents. These results provide insight into the mechanisms about how sign language structure affects written word processing and its delayed processing relative to their hearing peers of the same age.
Words for Work Evaluation Report 2011
ERIC Educational Resources Information Center
National Literacy Trust, 2011
2011-01-01
This document analyses and evaluates the findings of the second pilot year of the National Literacy Trust's speaking and listening project, Words for Work. This year's project worked with 219 year 9 pupils across England, and engaged 91 volunteers from the business community to facilitate group work that encouraged pupils to investigate their own…
ERIC Educational Resources Information Center
Baayen, R. Harald; Hendrix, Peter; Ramscar, Michael
2013-01-01
Arnon and Snider ((2010). More than words: Frequency effects for multi-word phrases. "Journal of Memory and Language," 62, 67-82) documented frequency effects for compositional four-grams independently of the frequencies of lower-order "n"-grams. They argue that comprehenders apparently store frequency information about…
Machine-Aided Indexing of Technical Literature
ERIC Educational Resources Information Center
Klingbiel, Paul H.
1973-01-01
To index at the Defense Documentation Center (DDC), an automated system must choose single words or phrases rapidly and economically. Automation of DDC's indexing has been machine-aided from its inception. A machine-aided indexing system is described that indexes one million words of text per hour of CPU time. (22 references) (Author/SJ)
A Basic Vocabulary of Federal Social Program Applications and Forms.
ERIC Educational Resources Information Center
Afflerbach, Peter P.; And Others
A study of the application forms for Social Security, Supplemental Security Income, public assistance, food stamps, Medicaid, and Medicare was conducted to examine the frequently occurring unfamiliar, specialized vocabulary words. It was found that 76 such words occurred at least ten times in the documents studied. A large number of other…
Improving Elementary Students' Spelling Achievement Using High-Frequency Words.
ERIC Educational Resources Information Center
Durnil, Christina; And Others
An action research study detailed a program for improving spelling achievement across the curriculum. The targeted population is composed of second and third grade students from a growing, middle class community located in a suburb of Chicago, Illinois. The problem of misspelled words in the students' writing was documented through students'…
Linguistic, Cognitive, and Social Constraints on Lexical Entrenchment
ERIC Educational Resources Information Center
Chesley, Paula
2011-01-01
How do new words become established in a speech community? This dissertation documents linguistic, cognitive, and social factors that are hypothesized to affect "lexical entrenchment," the extent to which a new word becomes part of the lexicon of a speech community. First, in a longitudinal corpus study, I find that linguistic properties such as…
Creating Printed Materials for Mathematics with a Macintosh Computer.
ERIC Educational Resources Information Center
Mahler, Philip
This document gives instructions on how to use a Macintosh computer to create printed materials for mathematics. A Macintosh computer, Microsoft Word, an object-oriented (Draw-type) art program, and a function-graphing program are capable of producing high quality printed instructional materials for mathematics. Word 5.1 has an equation editor…
The Microgravity Research Experiments (MICREX) Data Base
NASA Technical Reports Server (NTRS)
Winter, C. A.; Jones, J. C.
1996-01-01
An electronic data base identifying over 800 fluids and materials processing experiments performed in a low-gravity environment has been created at NASA Marshall Space Flight Center. The compilation, called MICREX (MICrogravity Research Experiments), was designed to document all such experimental efforts performed (1) on U.S. manned space vehicles, (2) on payloads deployed from U.S. manned space vehicles, and (3) on all domestic and international sounding rockets (excluding those of China and the former U.S.S.R.). Data available on most experiments include (1) principal and co-investigators, (2) low-gravity mission, (3) processing facility, (4) experimental objectives and results, (5) identifying key words, (6) sample materials, (7) applications of the processed materials/research area, (8) experiment descriptive publications, and (9) contacts for more information concerning the experiment. This technical memorandum (1) summarizes the historical interest in reduced-gravity fluid dynamics, (2) describes the importance of a low-gravity fluids and materials processing data base, (4) describes the MICREX data base format and computational World Wide Web access procedures, and (5) documents (in hard-copy form) the descriptions of the first 600 fluids and materials processing experiments entered into MICREX.
The Microgravity Research Experiments (MICREX) Data Base. Volume 2
NASA Technical Reports Server (NTRS)
Winter, C. A.; Jones, J. C.
1996-01-01
An electronic data base identifying over 800 fluids and materials processing experiments performed in a low-gravity environment has been created at NASA Marshall Space Flight Center. The compilation, called MICREX (MICrogravity Research Experiments), was designed to document all such experimental efforts performed (1) on U.S. manned space vehicles, (2) on payloads deployed from U.S. manned space vehicles, and (3) on all domestic and international sounding rockets (excluding those of China and the former U.S.S.R.). Data available on most experiments include (1) principal and co-investigators, (2) low-gravity mission, (3) processing facility, (4) experimental objectives and results, (5) identifying key words, (6) sample materials, (7) applications of the processed materials/research area, (8) experiment descriptive publications, and (9) contacts for more information concerning the experiment. This technical memorandum (1) summarizes the historical interest in reduced-gravity fluid dynamics, (2) describes the experimental facilities employed to examine reduced gravity fluid flow, (3) discusses the importance of a low-gravity fluids and materials processing data base, (4) describes the MICREX data base format and computational World Wide Web access procedures, and (5) documents (in hard-copy form) the descriptions of the first 600 fluids and materials processing experiments entered into MICREX.
The Microgravity Research Experiments (MICREX) Data Base. Volume 1
NASA Technical Reports Server (NTRS)
Winter, C. A.; Jones, J. C.
1996-01-01
An electronic data base identifying over 800 fluids and materials processing experiments performed in a low-gravity environment has been created at NASA Marshall Space Flight Center. The compilation, called MICREX (MICrogravity Research Experiments), was designed to document all such experimental efforts performed (1) on U.S. manned space vehicles, (2) on payloads deployed from U.S. manned space vehicles, and (3) on all domestic and international sounding rockets (excluding those of China and the former U.S.S.R.). Data available on most experiments include (1) principal and co-investigators, (2) low-gravity mission, (3) processing facility, (4) experimental objectives and results, (5) identifying key words, (6) sample materials, (7) applications of the processed materials/research area, (8) experiment descriptive publications, and (9) contacts for more information concerning the experiment. This technical memorandum (1) summarizes the historical interest in reduced-gravity fluid dynamics, (2) describes the experimental facilities employed to examine reduced gravity fluid flow, (3) discusses the importance of a low-gravity fluids and materials processing data base, (4) describes the MICREX data base format and computational World Wide Web access procedures, and (5) documents (in hard-copy form) the descriptions of the first 600 fluids and materials processing experiments entered into MICREX.
The Microgravity Research Experiments (MICREX) Data Base, Volume 4
NASA Technical Reports Server (NTRS)
Winter, C. A.; Jones, J. C.
1996-01-01
An electronic data base identifying over 800 fluids and materials processing experiments performed in a low-gravity environment has been created at NASA Marshall Space Flight Center. The compilation, called MICREX (MICrogravity Research Experiments), was designed to document all such experimental efforts performed (1) on U.S. manned space vehicles, (2) on payloads deployed from U.S. manned space vehicles, and (3) on all domestic and international sounding rockets (excluding those of China and the former U.S.S.R.). Data available on most experiments include (1) principal and co-investigators, (2) low-gravity mission, (3) processing facility, (4) experimental objectives and results, (5) identifying key words, (6) sample materials, (7) applications of the processed materials/research area, (8) experiment descriptive publications, and (9) contacts for more information concerning the experiment. This technical memorandum (1) summarizes the historical interest in reduced-gravity fluid dynamics, (2) describes the importance of a low-gravity fluids and materials processing data base, (4) describes the MICREX data base format and computational World Wide Web access procedures, and (5) documents (in hard-copy form) the descriptions of the first 600 fluids and materials processing experiments entered into MICREX.
Word Reading Aloud Skills: Their Positive Redefinition through Ageing
ERIC Educational Resources Information Center
Chapleau, Marianne; Wilson, Maximiliano A.; Potvin, Karel; Harvey-Langton, Alexandra; Montembeault, Maxime; Brambati, Simona M.
2017-01-01
Background: Successful reading can be achieved by means of two different procedures: sub-word processes for the pronunciation of words without semantics or pseudowords (PW) and whole-word processes that recruit word-specific information regarding the pronunciation of words with atypical orthography-to-phonology mappings (exception words, EW).…
The time course of morphological processing during spoken word recognition in Chinese.
Shen, Wei; Qu, Qingqing; Ni, Aiping; Zhou, Junyi; Li, Xingshan
2017-12-01
We investigated the time course of morphological processing during spoken word recognition using the printed-word paradigm. Chinese participants were asked to listen to a spoken disyllabic compound word while simultaneously viewing a printed-word display. Each visual display consisted of three printed words: a semantic associate of the first constituent of the compound word (morphemic competitor), a semantic associate of the whole compound word (whole-word competitor), and an unrelated word (distractor). Participants were directed to detect whether the spoken target word was on the visual display. Results indicated that both the morphemic and whole-word competitors attracted more fixations than the distractor. More importantly, the morphemic competitor began to diverge from the distractor immediately at the acoustic offset of the first constituent, which was earlier than the whole-word competitor. These results suggest that lexical access to the auditory word is incremental and that morphological processing (i.e., semantic access to the first constituent) occurs at an early processing stage, before access to the representation of the whole word in Chinese.
Emotional words can be embodied or disembodied: the role of superficial vs. deep types of processing
Abbassi, Ensie; Blanchette, Isabelle; Ansaldo, Ana I.; Ghassemzadeh, Habib; Joanette, Yves
2015-01-01
Emotional words are processed rapidly and automatically in the left hemisphere (LH) and slowly, with the involvement of attention, in the right hemisphere (RH). This review aims to find the reason for this difference and suggests that emotional words can be processed superficially or deeply due to the involvement of the linguistic and imagery systems, respectively. During superficial processing, emotional words likely make connections only with semantically associated words in the LH. This part of the process is automatic and may be sufficient for the purpose of language processing. Deep processing, in contrast, seems to involve conceptual information and imagery of a word’s perceptual and emotional properties using autobiographical memory contents. Imagery and the involvement of autobiographical memory likely differentiate between emotional and neutral word processing and explain the salient role of the RH in emotional word processing. It is concluded that the level of emotional word processing in the RH should be deeper than in the LH and, thus, it is conceivable that the slow mode of processing adds certain qualities to the output. PMID:26217288
Hierarchic Agglomerative Clustering Methods for Automatic Document Classification.
ERIC Educational Resources Information Center
Griffiths, Alan; And Others
1984-01-01
Considers classifications produced by application of single linkage, complete linkage, group average, and word clustering methods to Keen and Cranfield document test collections, and studies structure of hierarchies produced, extent to which methods distort input similarity matrices during classification generation, and retrieval effectiveness…
Cao, Hong-Wen; Yang, Ke-Yu; Yan, Hong-Mei
2017-01-01
Character order information is encoded at the initial stage of Chinese word processing; however, its time course remains underspecified. In this study, we assess the exact time course of the character decomposition and transposition processes of two-character Chinese compound words (canonical, transposed, or reversible words) compared with pseudowords using dual-target rapid serial visual presentation (RSVP) of stimuli appearing at 30 ms per character with no inter-stimulus interval. The results indicate that Chinese readers can identify words with character transpositions in rapid succession; however, a transposition cost is involved in identifying transposed words compared to canonical words. In RSVP reading, character order of words is more likely to be reversed during the period from 30 to 180 ms for canonical and reversible words, but during the period from 30 to 240 ms for transposed words. Taken together, the findings demonstrate that the holistic representation of the base word is activated; however, the order of the two constituent characters is not strictly processed during the very early stage of visual word processing.
Word Criticality Analysis MOS: 17B. Skill Levels 1 & 2.
1981-09-01
DISCLAIMER NOTICE: THIS DOCUMENT IS BEST QUALITY...Manual (SM). These critical words were selected by subject matter/job experts knowledgeable in their MOS. The vocabulary set used as the basis for critical...following 5 point rating scale was used by a team of up to 3 subject matter experts from Army MOS proponent schools to rate each word selected as having
Reading handprinted addresses on IRS tax forms
NASA Astrophysics Data System (ADS)
Ramanaprasad, Vemulapati; Shin, Yong-Chul; Srihari, Sargur N.
1996-03-01
The hand-printed address recognition system described in this paper is a part of the Name and Address Block Reader (NABR) system developed by the Center of Excellence for Document Analysis and Recognition (CEDAR). NABR is currently being used by the IRS to read address blocks (hand-print as well as machine-print) on fifteen different tax forms. Although machine-print address reading was relatively straightforward, hand-print address recognition has posed some special challenges due to demands on processing speed (with an expected throughput of 8450 forms/hour) and recognition accuracy. We discuss various subsystems involved in hand-printed address recognition, including word segmentation, word recognition, digit segmentation, and digit recognition. We also describe control strategies used to make effective use of these subsystems to maximize recognition accuracy. We present system performance on 931 address blocks in recognizing various fields, such as city, state, ZIP Code, street number and name, and personal names.
Grammatical distinctions in the left frontal cortex.
Shapiro, K A; Pascual-Leone, A; Mottaghy, F M; Gangitano, M; Caramazza, A
2001-08-15
Selective deficits in producing verbs relative to nouns in speech are well documented in neuropsychology and have been associated with left hemisphere frontal cortical lesions resulting from stroke and other neurological disorders. The basis for these impairments is unresolved: Do they arise because of differences in the way grammatical categories of words are organized in the brain, or because of differences in the neural representation of actions and objects? We used repetitive transcranial magnetic stimulation (rTMS) to suppress the excitability of a portion of left prefrontal cortex and to assess its role in producing nouns and verbs. In one experiment subjects generated real words; in a second, they produced pseudowords as nouns or verbs. In both experiments, response latencies increased for verbs but were unaffected for nouns following rTMS. These results demonstrate that grammatical categories have a neuroanatomical basis and that the left prefrontal cortex is selectively engaged in processing verbs as grammatical objects.
Eye Movements Reveal Fast, Voice-Specific Priming
Papesh, Megan H.; Goldinger, Stephen D.; Hout, Michael C.
2015-01-01
In spoken word perception, voice specificity effects are well-documented: When people hear repeated words in some task, performance is generally better when repeated items are presented in their originally heard voices, relative to changed voices. A key theoretical question about voice specificity effects concerns their time-course: Some studies suggest that episodic traces exert their influence late in lexical processing (the time-course hypothesis; McLennan & Luce, 2005), whereas others suggest that episodic traces influence immediate, online processing. We report two eye-tracking studies investigating the time-course of voice-specific priming within and across cognitive tasks. In Experiment 1, participants performed modified lexical decision or semantic classification to words spoken by four speakers. The tasks required participants to click a red “×” or a blue “+” located randomly within separate visual half-fields, necessitating trial-by-trial visual search with consistent half-field response mapping. After a break, participants completed a second block with new and repeated items, half spoken in changed voices. Voice effects were robust very early, appearing in saccade initiation times. Experiment 2 replicated this pattern while changing tasks across blocks, ruling out a response priming account. In the General Discussion, we address the time-course hypothesis, focusing on the challenge it presents for empirical disconfirmation, and highlighting the broad importance of indexical effects, beyond studies of priming. PMID:26726911
A Neurophysiological Study of Semantic Processing in Parkinson's Disease.
Angwin, Anthony J; Dissanayaka, Nadeeka N W; Moorcroft, Alison; McMahon, Katie L; Silburn, Peter A; Copland, David A
2017-01-01
Cognitive-linguistic impairments in Parkinson's disease (PD) have been well documented; however, few studies have explored the neurophysiological underpinnings of semantic deficits in PD. This study investigated semantic function in PD using event-related potentials. Eighteen people with PD and 18 healthy controls performed a semantic judgement task on written word pairs that were either congruent or incongruent. The mean amplitude of the N400 for new incongruent word pairs was similar for both groups; however, the onset latency was delayed in the PD group. Further analysis of the data revealed that both groups demonstrated attenuation of the N400 for repeated incongruent trials, as well as attenuation of the P600 component for repeated congruent trials. The presence of N400 congruity and N400 repetition effects in the PD group suggests that semantic processing is generally intact, but with a slower time course as evidenced by the delayed N400. Additional research will be required to determine whether N400 and P600 repetition effects are sensitive to further cognitive decline in PD. (JINS, 2017, 23, 78-89).
Spoken word recognition by Latino children learning Spanish as their first language
HURTADO, NEREYDA; MARCHMAN, VIRGINIA A.; FERNALD, ANNE
2010-01-01
Research on the development of efficiency in spoken language understanding has focused largely on middle-class children learning English. Here we extend this research to Spanish-learning children (n=49; M=2;0; range=1;3–3;1) living in the USA in Latino families from primarily low socioeconomic backgrounds. Children looked at pictures of familiar objects while listening to speech naming one of the objects. Analyses of eye movements revealed developmental increases in the efficiency of speech processing. Older children and children with larger vocabularies were more efficient at processing spoken language as it unfolds in real time, as previously documented with English learners. Children whose mothers had less education tended to be slower and less accurate than children of comparable age and vocabulary size whose mothers had more schooling, consistent with previous findings of slower rates of language learning in children from disadvantaged backgrounds. These results add to the cross-linguistic literature on the development of spoken word recognition and to the study of the impact of socioeconomic status (SES) factors on early language development. PMID:17542157
Mouriño García, Marcos Antonio; Pérez Rodríguez, Roberto; Anido Rifón, Luis E
2015-01-01
Automatic classification of text documents into a set of categories has many applications, and the classification of biomedical literature stands out among them. Biomedical staff and researchers have to deal with a large volume of literature in their daily activities, so a system that gives simple and effective access to documents of interest would be useful; for that, the documents must be sorted according to some criteria, that is, classified. Documents to classify are usually represented following the bag-of-words (BoW) paradigm: features are the words in the text, which therefore suffer from synonymy and polysemy, and their weights are based only on frequency of occurrence. This paper presents an empirical study of the efficiency of a classifier that leverages encyclopedic background knowledge, specifically Wikipedia, to create bag-of-concepts (BoC) representations of documents, understanding a concept as a "unit of meaning" and thus tackling synonymy and polysemy. In addition, the weighting of concepts is based on their semantic relevance in the text. For the evaluation, experiments were conducted with OHSUMED, one of the corpora commonly used for evaluating classification and retrieval of biomedical information, and with UVigoMED, a purpose-built corpus of MEDLINE biomedical abstracts. The results show that the Wikipedia-based bag-of-concepts representation outperforms the classical bag-of-words representation by up to 157% in the single-label classification problem and up to 100% in the multi-label problem for the OHSUMED corpus, and by up to 122% in the single-label problem and up to 155% in the multi-label problem for the UVigoMED corpus. PMID:26468436
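The contrast between the two representations can be illustrated with a minimal sketch. This is not the Wikipedia-based annotator evaluated in the abstract: the `CONCEPT_MAP` dictionary, the function names, and the plain frequency weighting are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical toy mapping from surface words to concept labels.
# The study derives concepts from Wikipedia; this dict only stands in for that step.
CONCEPT_MAP = {"tumor": "neoplasm", "tumour": "neoplasm", "neoplasm": "neoplasm"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(text):
    """Bag-of-words: each distinct word is a feature weighted by its frequency."""
    return Counter(tokenize(text))


def bag_of_concepts(text, concept_map=CONCEPT_MAP):
    """Bag-of-concepts (sketch): synonyms collapse onto one concept feature."""
    return Counter(concept_map.get(tok, tok) for tok in tokenize(text))
```

With this toy mapping, "tumor" and "tumour" are two separate features in the bag-of-words view but a single "neoplasm" feature in the bag-of-concepts view, which is the synonymy problem the abstract refers to.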
Involuntary awareness and implicit priming: role of retrieval context.
Zhou, Renlai; Hu, Senqi; Sun, Xuefei; Huang, Junhong
2006-10-01
This study examined the role of retrieval context in implicit priming by manipulating percentage of word-stem index as shallow and deep processing while performing a word-stem completion task. 80 subjects were randomly divided into four groups each of 20 subjects: shallow processing or deep processing with few retrieval indices, and shallow processing or deep processing with many retrieval indices. Analysis indicated that proportion of word-stem completion was significantly higher for studied words than for nonstudied words in all four groups and that the subjects in the groups with many retrieval indices had a significantly increased proportion of word-stem completion between studied and nonstudied words than those in the groups with few retrieval indices. Postquestionnaire analysis indicated that more previously studied items were retrieved if many studied items were available during implicit word-stem completion and that only a small proportion of word-stem completion was finished with studied words by the subjects who were aware of the prior studied and test word relations in all four groups. It was concluded that having more studied words retrievable contributed to more being retrieved and that involuntary awareness had very limited influence on the priming in the implicit word-stem completion.
Robinson, Amanda K; Plaut, David C; Behrmann, Marlene
2017-07-01
Words and faces have vastly different visual properties, but increasing evidence suggests that word and face processing engage overlapping distributed networks. For instance, fMRI studies have shown overlapping activity for face and word processing in the fusiform gyrus despite well-characterized lateralization of these objects to the left and right hemispheres, respectively. To investigate whether face and word perception influences perception of the other stimulus class and elucidate the mechanisms underlying such interactions, we presented images using rapid serial visual presentations. Across 3 experiments, participants discriminated 2 face, word, and glasses targets (T1 and T2) embedded in a stream of images. As expected, T2 discrimination was impaired when it followed T1 by 200 to 300 ms relative to longer intertarget lags, the so-called attentional blink. Interestingly, T2 discrimination accuracy was significantly reduced at short intertarget lags when a face was followed by a word (face-word) compared with glasses-word and word-word combinations, indicating that face processing interfered with word perception. The reverse effect was not observed; that is, word-face performance was no different than the other object combinations. EEG results indicated the left N170 to T1 was correlated with the word decrement for face-word trials, but not for other object combinations. Taken together, the results suggest face processing interferes with word processing, providing evidence for overlapping neural mechanisms of these 2 object types. Furthermore, asymmetrical face-word interference points to greater overlap of face and word representations in the left than the right hemisphere. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Li, Sara Tze Kwan; Hsiao, Janet Hui-Wen
2018-07-01
Music notation and English word reading both involve mapping horizontally arranged visual components to components in sound, in contrast to reading in logographic languages such as Chinese. Accordingly, music-reading expertise may influence English word processing more than Chinese character processing. Here we showed that musicians named English words significantly faster than non-musicians when words were presented in the left visual field/right hemisphere (RH) or the center position, suggesting an advantage of RH processing due to music reading experience. This effect was not observed in Chinese character naming. A follow-up ERP study showed that in a sequential matching task, musicians had reduced RH N170 responses to English non-words under the processing of musical segments as compared with non-musicians, suggesting a shared visual processing mechanism in the RH between music notation and English non-word reading. This shared mechanism may be related to the letter-by-letter, serial visual processing that characterizes RH English word recognition (e.g., Lavidor & Ellis, 2001), which may consequently facilitate English word processing in the RH in musicians. Thus, music reading experience may have differential influences on the processing of different languages, depending on their similarities in the cognitive processes involved. Copyright © 2018 Elsevier B.V. All rights reserved.
The effects of gender and self-insight on early semantic processing.
Xu, Xu; Kang, Chunyan; Guo, Taomei
2014-01-01
This event-related potential (ERP) study explored individual differences associated with gender and level of self-insight in early semantic processing. Forty-eight Chinese native speakers completed a semantic judgment task with three different categories of words: abstract neutral words (e.g., logic, effect), concrete neutral words (e.g., teapot, table), and emotion words (e.g., despair, guilt). They then assessed their levels of self-insight. Results showed that women engaged in greater processing than did men. Gender differences also manifested in the relationship between level of self-insight and word processing. For women, level of self-insight was associated with level of semantic activation for emotion words and abstract neutral words, but not for concrete neutral words. For men, level of self-insight was related to processing speed, particularly in response to abstract and concrete neutral words. These findings provide electrophysiological evidence for the effects of gender and self-insight on semantic processing and highlight the need to take into consideration subject variables in related research.
Han, Xu; Kim, Jung-jae; Kwoh, Chee Keong
2016-01-01
Biomedical text mining may target various kinds of valuable information embedded in the literature, but a critical obstacle to the extension of the mining targets is the cost of manual construction of labeled data, which are required for state-of-the-art supervised learning systems. Active learning is to choose the most informative documents for the supervised learning in order to reduce the amount of required manual annotations. Previous works of active learning, however, focused on the tasks of entity recognition and protein-protein interactions, but not on event extraction tasks for multiple event types. They also did not consider the evidence of event participants, which might be a clue for the presence of events in unlabeled documents. Moreover, the confidence scores of events produced by event extraction systems are not reliable for ranking documents in terms of informativity for supervised learning. We here propose a novel committee-based active learning method that supports multi-event extraction tasks and employs a new statistical method for informativity estimation instead of using the confidence scores from event extraction systems. Our method is based on a committee of two systems as follows: We first employ an event extraction system to filter potential false negatives among unlabeled documents, from which the system does not extract any event. We then develop a statistical method to rank the potential false negatives of unlabeled documents 1) by using a language model that measures the probabilities of the expression of multiple events in documents and 2) by using a named entity recognition system that locates the named entities that can be event arguments (e.g. proteins). The proposed method further deals with unknown words in test data by using word similarity measures. We also apply our active learning method for the task of named entity recognition. 
We evaluate the proposed method against the BioNLP Shared Tasks datasets, and show that our method can achieve better performance than such previous methods as entropy and Gibbs error based methods and a conventional committee-based method. We also show that the incorporation of named entity recognition into the active learning for event extraction and the unknown word handling further improve the active learning method. In addition, the adaptation of the active learning method into named entity recognition tasks also improves the document selection for manual annotation of named entities.
Juhasz, Barbara J; Johnson, Rebecca L; Brewer, Jennifer
2017-04-01
New words enter the language through several word formation processes [see Simonini (Engl J 55:752-757, 1966)]. One such process, blending, occurs when two source words are combined to represent a new concept (e.g., SMOG, BRUNCH, BLOG, and INFOMERCIAL). While there have been examinations of the structure of blends [see Gries (Linguistics 42:639-667, 2004) and Lehrer (Am Speech 73:3-28, 1998)], relatively little attention has been given to how lexicalized blends are recognized and if this process differs from other types of words. In the present study, blend words were matched to non-blend control words on length, familiarity, and frequency. Two tasks were used to examine blend processing: lexical decision and sentence reading. The results demonstrated that blend words were processed differently than non-blend control words. However, the nature of the effect varied as a function of task demands. Blends were recognized slower than control words in the lexical decision task but received shorter fixation durations when embedded in sentences.
Making More Light with Less Energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuritzky, Leah; Jewell, Jason
Representing the Center for Energy Efficient Materials (CEEM), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CEEM is to discover and develop materials that control the interactions among light, electricity, and heat at the nanoscale for improved solar energy conversion, solid-state lighting, and conversion of heat into electricity.
Rocks Filled with Tiny Spaces Can Turn Green Growing Things Into Stuff We Use Every Day
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nikbin, Nima; Josephson, Tyler; Courtney, Timothy
Representing the Catalysis Center for Energy Innovation (CCEI), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of CCEI is to design and characterize novel catalysts for the efficient conversion of the complex molecules comprising biomass into chemicals and fuels.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Epstein, Marianne; Luckyanova, Maria; Manke, Kara
Representing the Solid-State Solar-Thermal Energy Conversion Center (S3TEC), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of S3TEC is advancing fundamental science and developing materials to harness heat from the sun and convert this heat into electricity via solid-state thermoelectric and thermophotovoltaic technologies.
Lighting the World in a Different Way
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilber, Nicole; Houmpheng, Krista; Coltrin, Mike
Representing the Solid State Lighting Science (SSLS), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the SSLS is to help build the scientific foundation that enables solid-state lighting to produce the most light for the least energy, both in the U.S. and, as a side-benefit, throughout the world.
Power to the People...Energy for Now and Later
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sung, Chih-Jen; Law, Chung K; Brady, Kyle
Representing the Combustion Energy Frontier Research Center (CEFRC), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of CEFRC is to develop a validated, predictive, multi-scale combustion modeling capacity which can be used to optimize the design and operation of evolving fuels in advanced engines for transportation applications.
Using Left Overs to Make Energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steuterman, Sally; Czarnecki, Alicia; Hurley, Paul
Representing the Materials Science of Actinides (MSA), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of MSA is to conduct transformative research in the actinide sciences with full integration of experimental and computational approaches, and an emphasis on research questions that are important to the energy future of the nation.
Using all of the Energy from the Sun to Make Power
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dapkus, P. Daniel; Povinelli, Michelle
Representing the Center for Energy Nanoscience (CEN), this document is one of the entries in the Ten Hundred and One Word Challenge and was awarded "Overall Winner Runner-up." As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CEN is to explore the light absorption and emission in organic and nanostructure materials and their hybrids for solar energy conversion and solid state lighting.
Sunlight + Water = Tomorrow's Energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jones, Anne Katherine
Representing the Center for Bio-Inspired Solar Fuel Production (BISfuel), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of BISfuel is to construct a complete system for solar-powered production of hydrogen fuel via water splitting; design principles are drawn from the fundamental concepts that underlie photosynthetic energy conversion.
Exploring Contextual Models in Chemical Patent Search
NASA Astrophysics Data System (ADS)
Urbain, Jay; Frieder, Ophir
We explore the development of probabilistic retrieval models for integrating term statistics with entity search using multiple levels of document context to improve the performance of chemical patent search. A distributed indexing model was developed to enable efficient named entity search and aggregation of term statistics at multiple levels of patent structure including individual words, sentences, claims, descriptions, abstracts, and titles. The system can be scaled to an arbitrary number of compute instances in a cloud computing environment to support concurrent indexing and query processing operations on large patent collections.
Computerized Project Management: How to Use a Macintosh to Improve Manager Productivity.
1986-03-01
similar in concept to the "Flat Rate Manual" automobile shops use when repairing a car. That book lists all parts and the standard time necessary to do a...word processing document. Such a system would be similar to the 'flat rate manual' used in automobile repair shops that list resource and time...
Text mining by Tsallis entropy
NASA Astrophysics Data System (ADS)
Jamaati, Maryam; Mehri, Ali
2018-01-01
Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics to text mining. Tsallis entropy appropriately ranks the terms' relevance to the document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new, powerful word ranking metric to extract keywords from a single document. An experimental evaluation shows the capability of the presented method in keyword extraction. We find that Tsallis entropy has reliable word ranking performance, on par with the best previous ranking methods.
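For readers unfamiliar with the quantity, the Tsallis entropy of a probability distribution p is S_q = (1 - sum_i p_i^q) / (q - 1). The sketch below computes it for the distribution of one word's occurrences over equal-sized segments of a text. The segmentation scheme, the function names, and the default q = 2 are assumptions for illustration; the abstract does not specify how the authors build their distribution or rank terms.

```python
def tsallis_entropy(probs, q=2.0):
    """Tsallis entropy S_q = (1 - sum(p_i ** q)) / (q - 1), defined here for q != 1."""
    if q == 1.0:
        raise ValueError("q = 1 is the Shannon limit and is not handled in this sketch")
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def segment_distribution(tokens, word, n_segments=4):
    """Fraction of the word's occurrences falling in each of n_segments text segments.

    Hypothetical helper: splitting the text into equal segments is an assumption.
    """
    seg_len = max(1, len(tokens) // n_segments)
    counts = [0] * n_segments
    for i, tok in enumerate(tokens):
        if tok == word:
            # Any leftover tokens at the end are assigned to the last segment.
            counts[min(i // seg_len, n_segments - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else []
```

A word spread evenly over the text yields a high entropy, while a word clustered in one segment yields an entropy of zero; how such values translate into keyword ranks depends on the ranking scheme, which is not reproduced here.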
A Match Method Based on Latent Semantic Analysis for Earthquake Hazard Emergency Plan
NASA Astrophysics Data System (ADS)
Sun, D.; Zhao, S.; Zhang, Z.; Shi, X.
2017-09-01
The structure of an earthquake emergency plan is complex, and it is difficult for a decision maker to reach a decision in a short time. To solve this problem, this paper presents a match method based on Latent Semantic Analysis (LSA). After word segmentation preprocessing of the emergency plan, we extract keywords according to the part of speech and the frequency of words. Then, through LSA, we map the documents and query information to the semantic space and calculate the correlation between documents and queries from the relation between their vectors. The experimental results indicate that LSA can efficiently improve the accuracy of emergency plan retrieval.
ASM Based Synthesis of Handwritten Arabic Text Pages
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed
2015-01-01
Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever sufficient natural ground-truthed data is not available. PMID:26295059
77 FR 76606 - Community Development Financial Institutions Fund
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-28
... form, with pre-set text limits and font size restrictions. Applicants must submit their narrative responses by using the FY 2013 CDFI Program Application narrative template document. This Word document...) A-133 Narrative Report; (iv) Institution Level Report; (v) Transaction Level Report (for Awardees...
Lexical frequency and voice assimilation in complex words in Dutch
NASA Astrophysics Data System (ADS)
Ernestus, Mirjam; Lahey, Mybeth; Verhees, Femke; Baayen, Harald
2004-05-01
Words with higher token frequencies tend to have more reduced acoustic realizations than lower frequency words (e.g., Hay, 2000; Bybee, 2001; Jurafsky et al., 2001). This study documents frequency effects for regressive voice assimilation (obstruents are voiced before voiced plosives) in Dutch morphologically complex words in the subcorpus of read-aloud novels in the corpus of spoken Dutch (Oostdijk et al., 2002). As expected, the initial obstruent of the cluster tends to be absent more often as lexical frequency increases. More importantly, as frequency increases, the duration of vocal-fold vibration in the cluster decreases, and the duration of the bursts in the cluster increases, after partialing out cluster duration. This suggests that there is less voicing for higher-frequency words. In fact, phonetic transcriptions show regressive voice assimilation for only half of the words and progressive voice assimilation for one third. Interestingly, the progressive voice assimilation observed for higher-frequency complex words renders these complex words more similar to monomorphemic words: Dutch monomorphemic words typically contain voiceless obstruent clusters (Zonneveld, 1983). Such high-frequency complex words may therefore be less easily parsed into their constituent morphemes (cf. Hay, 2000), favoring whole word lexical access (Bertram et al., 2000).
VisualUrText: A Text Analytics Tool for Unstructured Textual Data
NASA Astrophysics Data System (ADS)
Zainol, Zuraini; Jaymes, Mohd T. H.; Nohuddin, Puteri N. E.
2018-05-01
The amount of unstructured text on the Internet is growing tremendously. Text repositories come from Web 2.0, business intelligence and social networking applications. It is also believed that 80-90% of future data growth will be in the form of unstructured text databases that may potentially contain interesting patterns and trends. Text mining is a well-known technique for discovering interesting patterns and trends, which are non-trivial knowledge, from massive unstructured text data. Text mining covers multidisciplinary fields involving information retrieval (IR), text analysis, natural language processing (NLP), data mining, machine learning, statistics and computational linguistics. This paper discusses the development of a text analytics tool that is proficient in extracting, processing and analyzing unstructured text data and in visualizing the cleaned text data in multiple forms such as Document Term Matrix (DTM), Frequency Graph, Network Analysis Graph, Word Cloud and Dendrogram. This tool, VisualUrText, is developed to assist students and researchers in extracting interesting patterns and trends in document analyses.
Kim, Kyungmo; Choi, Jinwook
2017-01-01
Laboratory test names are used as basic information to diagnose diseases. However, this kind of medical information is usually written in natural language. To find this information, lexicon-based methods have been good solutions, but they cannot find abbreviated expressions that are absent from the lexicon, such as "neuts" for "neutrophils". To address this issue, similar word matching can be used; however, it can be disadvantageous because of significant false positives. Moreover, processing time grows with the number of terms. Therefore, we suggest a novel q-gram based algorithm, named modified triangular area filtering, to find abbreviated laboratory test terms in clinical documents while minimizing the risk of impairing the lexicons' precision. In addition, the method finds the terms with reasonable processing time. The results show that this method can achieve 92.54 precision, 87.72 recall, and 90.06 F1-score on test sets when the edit distance threshold (τ) = 3.
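The general idea of q-gram filtering before edit-distance verification can be sketched as follows. This is the classic q-gram count filter, not the modified triangular area filtering proposed in the abstract, whose details are not given here; all function names and the default q = 2 are assumptions.

```python
from collections import Counter


def qgrams(s, q=2):
    """Multiset of overlapping substrings of length q (no padding)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def passes_count_filter(a, b, tau, q=2):
    """Count filter: strings within edit distance tau must share at least
    max(len(a), len(b)) - q + 1 - q * tau q-grams, since one edit destroys at most q."""
    needed = max(len(a), len(b)) - q + 1 - q * tau
    if needed <= 0:
        return True  # the bound gives no information, so keep the candidate
    shared = sum((qgrams(a, q) & qgrams(b, q)).values())
    return shared >= needed


def edit_distance(a, b):
    """Levenshtein distance by row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def fuzzy_lookup(term, lexicon, tau=1, q=2):
    """Return lexicon entries within edit distance tau, using the cheap filter first."""
    return [w for w in lexicon
            if passes_count_filter(term, w, tau, q) and edit_distance(term, w) <= tau]
```

The filter only discards candidates that cannot be within the threshold, so the expensive edit-distance computation runs on fewer pairs without losing matches. Heavily shortened forms can lie far from the full term in plain edit distance, which is one reason a simple threshold alone may miss them.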
Schultheiss, Oliver C.
2013-01-01
Traditionally, implicit motives (i.e., non-conscious preferences for specific classes of incentives) are assessed through semantic coding of imaginative stories. The present research tested the marker-word hypothesis, which states that implicit motives are reflected in the frequencies of specific words. Using Linguistic Inquiry and Word Count (LIWC; Pennebaker et al., 2001), Study 1 identified word categories that converged with a content-coding measure of the implicit motives for power, achievement, and affiliation in picture stories collected in German and US student samples, showed discriminant validity with self-reported motives, and predicted well-validated criteria of implicit motives (gender difference for the affiliation motive; in interaction with personal-goal progress: emotional well-being). Study 2 demonstrated LIWC-based motive scores' causal validity by documenting their sensitivity to motive arousal. PMID:24137149
Enzymatic Decontamination of Environmental Organophosphorus Compounds
2006-12-04
Report documentation page, dated 4-Dec-2006. Subject terms: organophosphorus compounds ...
76 FR 10405 - Federal Copyright Protection of Sound Recordings Fixed Before February 15, 1972
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-24
... file in either the Adobe Portable Document File (PDF) format that contains searchable, accessible text (not an image); Microsoft Word; WordPerfect; Rich Text Format (RTF); or ASCII text file format (not a..., comments may be delivered in hard copy. If hand delivered by a private party, an original [[Page 10406...
Photograph + Printed Word: A New Language for the Student Journalist.
ERIC Educational Resources Information Center
Magmer, James
This document examines the use of photography and the printed word to make visual statements in student publications. It is written for journalists who are writers and editors as well as for photojournalists and for student journalists interested in increasing the quality of the school newspaper, magazine, or yearbook. The role of the photographer…
JPKWIC - General key word in context and subject index report generator
NASA Technical Reports Server (NTRS)
Jirka, R.; Kabashima, N.; Kelly, D.; Plesset, M.
1968-01-01
The JPKWIC computer program is a general key word in context (KWIC) and subject index report generator specifically developed to help nonprogrammers and nontechnical personnel use the computer to access files, libraries and mass documentation. This program is designed to produce a KWIC index, a subject index, an edit report, a summary report, and an exclusion list.
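To make the idea of a KWIC index with an exclusion list concrete, here is a minimal sketch. It is not the JPKWIC program itself: the function name, the default exclusion list, and the context width are assumptions chosen for illustration.

```python
def kwic_index(titles,
               exclusion=frozenset({"a", "an", "and", "the", "of", "in", "for", "to"}),
               width=30):
    """Build a key-word-in-context index.

    Each entry is (keyword, left context, word as written, right context, doc id),
    sorted alphabetically by keyword. Words in the exclusion list get no entry.
    """
    entries = []
    for doc_id, title in enumerate(titles):
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:()").lower()
            if not key or key in exclusion:
                continue
            left = " ".join(words[:i])[-width:]       # keep text nearest the keyword
            right = " ".join(words[i + 1:])[:width]
            entries.append((key, left, word, right, doc_id))
    entries.sort(key=lambda e: (e[0], e[4]))
    return entries


titles = ["Document image cleanup and binarization",
          "Word indexing of printed documents"]
for key, left, word, right, doc_id in kwic_index(titles):
    print(f"{left:>30} | {word} | {right:<30} [{doc_id}]")
```

Each non-excluded word appears once as a keyword, centered between its left and right context, which is the layout a reader scans in a printed KWIC listing.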
According to Davis: Connecting Principles and Practices
ERIC Educational Resources Information Center
Schulman, Steven M.
2013-01-01
In this article, the author allows Robert B. Davis to state for himself his own Principles concerning how children learn, and how teachers can best teach them. These principles are put forward in Davis' own words along with detailed documentation. The author goes on to compare Davis' words with his practices. A single Davis video (Towers of Hanoi) is…
A Comparison of Key Concepts in Data Analytics and Data Science
ERIC Educational Resources Information Center
McMaster, Kirby; Rague, Brian; Wolthuis, Stuart L.; Sambasivam, Samuel
2018-01-01
This research study provides an examination of the relatively new fields of Data Analytics and Data Science. We compare word rates in Data Analytics and Data Science documents to determine which concepts are mentioned most often. The most frequent concept in both fields is "data." The word rate for "data" is more than twice the…
ERIC Educational Resources Information Center
Cornell Univ., Ithaca, NY. Dept. of Computer Science.
Part Two of the eighteenth report on Salton's Magical Automatic Retriever of Texts (SMART) project is composed of three papers: The first: "The Effect of Common Words and Synonyms on Retrieval Performance" by D. Bergmark discloses that removal of common words from the query and document vectors significantly increases precision and that…
ERIC Educational Resources Information Center
Hopp, Holger
2005-01-01
This study documents knowledge of UG-mediated aspects of optionality in word order in the second language (L2) German of advanced English and Japanese speakers (n = 39). A bimodal grammaticality judgement task, which controlled for context and intonation, was administered to probe judgements on a set of scrambling, topicalization and remnant…
Word Spotting and Recognition with Embedded Attributes.
Almazán, Jon; Gordo, Albert; Fornés, Alicia; Valveny, Ernest
2014-12-01
This paper addresses the problems of word spotting and word recognition on images. In word spotting, the goal is to find all instances of a query word in a dataset of images. In recognition, the goal is to recognize the content of the word image, usually aided by a dictionary or lexicon. We describe an approach in which both word images and text strings are embedded in a common vectorial subspace. This is achieved by a combination of label embedding and attributes learning, and a common subspace regression. In this subspace, images and strings that represent the same word are close together, allowing one to cast recognition and retrieval tasks as a nearest neighbor problem. Contrary to most other existing methods, our representation has a fixed length, is low dimensional, and is very fast to compute and, especially, to compare. We test our approach on four public datasets of both handwritten documents and natural images, showing results comparable to or better than the state-of-the-art on spotting and recognition tasks.
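Once images and strings live in the same vector space, recognition and spotting both reduce to nearest-neighbor search. The sketch below shows only that last step with cosine similarity; the embedding itself (label embedding, attribute learning, subspace regression) is not reproduced, and the toy vectors and function names are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def recognize(image_vec, lexicon_vecs):
    """Recognition: return the lexicon word whose embedding is closest."""
    return max(lexicon_vecs, key=lambda w: cosine(image_vec, lexicon_vecs[w]))


def spot(query_vec, image_vecs, top_k=2):
    """Spotting: rank dataset images by similarity to the query embedding."""
    ranked = sorted(range(len(image_vecs)),
                    key=lambda i: cosine(query_vec, image_vecs[i]),
                    reverse=True)
    return ranked[:top_k]


# Toy 3-dimensional embeddings (hypothetical values).
lexicon_vecs = {"cat": [1.0, 0.0, 0.1], "dog": [0.0, 1.0, 0.1]}
print(recognize([0.9, 0.1, 0.0], lexicon_vecs))
print(spot([1.0, 0.0, 0.1], [[0.0, 1.0, 0.0], [0.9, 0.1, 0.1], [1.0, 0.0, 0.1]]))
```

Because the representation has a fixed length, every comparison costs the same, which is what makes this brute-force ranking practical for low-dimensional embeddings.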
Document image cleanup and binarization
NASA Astrophysics Data System (ADS)
Wu, Victor; Manmatha, Raghaven
1998-04-01
Image binarization is a difficult task for documents with text over textured or shaded backgrounds, poor contrast, and/or considerable noise. Current optical character recognition (OCR) and document analysis technologies do not handle such documents well. We have developed a simple yet effective algorithm for document image clean-up and binarization. The algorithm consists of two basic steps. In the first step, the input image is smoothed using a low-pass filter. The smoothing operation enhances the text relative to any background texture. This is because background texture normally has higher frequency than text does. The smoothing operation also removes speckle noise. In the second step, the intensity histogram of the smoothed image is computed and a threshold automatically selected as follows. For black text, the first peak of the histogram corresponds to text. Thresholding the image at the value of the valley between the first and second peaks of the histogram binarizes the image well. In order to reliably identify the valley, the histogram is smoothed by a low-pass filter before the threshold is computed. The algorithm has been applied to some 50 images from a wide variety of sources: digitized video frames, photos, newspapers, advertisements in magazines or sales flyers, personal checks, etc. There are 21820 characters and 4406 words in these images. 91 percent of the characters and 86 percent of the words are successfully cleaned up and binarized. A commercial OCR was applied to the binarized text when it consisted of fonts which were OCR recognizable. The recognition rate was 84 percent for the characters and 77 percent for the words.
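The two steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a box filter stands in for the unspecified low-pass filter, the image is a list of rows of 8-bit gray values, and all function names and default radii are hypothetical.

```python
def box_blur(img, r=1):
    """Step 1: low-pass filter the image with a (2r+1) x (2r+1) box average."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) // len(vals)
    return out


def histogram(img):
    """Count pixels at each of the 256 gray levels."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    return hist


def smooth_1d(hist, r=2):
    """Low-pass filter the histogram with a moving average."""
    n = len(hist)
    return [sum(hist[max(0, i - r):min(n, i + r + 1)])
            / (min(n, i + r + 1) - max(0, i - r)) for i in range(n)]


def find_peaks(hist):
    """Local maxima, treating each flat run of equal values as one peak."""
    peaks, i, n = [], 0, len(hist)
    while i < n:
        j = i
        while j + 1 < n and hist[j + 1] == hist[i]:
            j += 1
        left_lower = i == 0 or hist[i - 1] < hist[i]
        right_lower = j == n - 1 or hist[j + 1] < hist[i]
        if hist[i] > 0 and left_lower and right_lower:
            peaks.append((i + j) // 2)
        i = j + 1
    return peaks


def valley_threshold(hist):
    """Step 2: gray level of the valley between the first two peaks."""
    peaks = find_peaks(hist)
    if len(peaks) < 2:
        raise ValueError("histogram has fewer than two peaks")
    lo, hi = peaks[0], peaks[1]
    m = min(hist[lo:hi + 1])
    lowest = [i for i in range(lo, hi + 1) if hist[i] == m]
    return lowest[len(lowest) // 2]  # middle of the lowest stretch


def binarize(img, blur_radius=1, hist_radius=2):
    """Return (binary image with 1 = dark text, threshold used)."""
    smoothed = box_blur(img, blur_radius)
    t = valley_threshold(smooth_1d(histogram(smoothed), hist_radius))
    return [[1 if v <= t else 0 for v in row] for row in smoothed], t


img = [[200, 200, 200, 200],
       [200, 20, 20, 200],
       [200, 200, 200, 200]]
binary, t = binarize(img, blur_radius=0)  # blur disabled for this tiny example
print(t, binary)
```

On real scans the blur radius and histogram smoothing radius would need tuning; the tiny example disables the image blur only so the result is easy to check by hand.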
[The effect of taboo word on language processing].
Huszár, Tamás; Makra, Emese; Hallgató, Emese; Janacsek, Karolina; Németh, Dezsö
2010-01-01
Knowledge about how we process taboo words brings us closer to understanding emotional processes, and broadens the interpretative framework in psychiatry and psychotherapy. In this study the lexical decision paradigm was used. Subjects were presented with neutral words, taboo words and pseudowords in a random order, and they had to indicate whether the presented word was meaningful (neutral and taboo words) or meaningless (pseudowords). Each target word was preceded by a prime word (either taboo or neutral). The stimulus onset asynchrony (SOA) differed in the two experimental conditions (it was 250 msec in the experimental group, and 500 msec in the control group). In the experimental group, response latencies increased for target words that were preceded by taboo prime words, as compared to those that were preceded by neutral prime words. In the control group the prime had no such differential effects on response latencies. Results indicate that emotional processing of taboo words occurs very early and that the negative effect of taboo words on the following lexical decision fades away within 500 msec. Our experiment and other empirical data are presented in this paper.
Masino, Aaron J.; Casper, T. Charles; Dean, Jonathan M.; Bell, Jamie; Enriquez, Rene; Deakyne, Sara; Chamberlain, James M.; Alpern, Elizabeth R.
2016-01-01
Summary. Background: Important information to support healthcare quality improvement is often recorded in free text documents such as radiology reports. Natural language processing (NLP) methods may help extract this information, but these methods have rarely been applied outside the research laboratories where they were developed. Objective: To implement and validate NLP tools to identify long bone fractures for pediatric emergency medicine quality improvement. Methods: Using freely available statistical software packages, we implemented NLP methods to identify long bone fractures from radiology reports. A sample of 1,000 radiology reports was used to construct three candidate classification models. A test set of 500 reports was used to validate the model performance. Blinded manual review of radiology reports by two independent physicians provided the reference standard. Each radiology report was segmented and word stem and bigram features were constructed. Common English “stop words” and rare features were excluded. We used 10-fold cross-validation to select optimal configuration parameters for each model. Accuracy, recall, precision and the F1 score were calculated. The final model was compared to the use of diagnosis codes for the identification of patients with long bone fractures. Results: There were 329 unique word stems and 344 bigrams in the training documents. A support vector machine classifier with Gaussian kernel performed best on the test set with accuracy=0.958, recall=0.969, precision=0.940, and F1 score=0.954. Optimal parameters for this model were cost=4 and gamma=0.005. The three classification models that we tested all performed better than diagnosis codes in terms of accuracy, precision, and F1 score (diagnosis code accuracy=0.932, recall=0.960, precision=0.896, and F1 score=0.927). Conclusions: NLP methods using a corpus of 1,000 training documents accurately identified acute long bone fractures from radiology reports. Strategic use of straightforward NLP methods, implemented with freely available software, offers quality improvement teams new opportunities to extract information from narrative documents. PMID:27826610
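The reported F1 scores can be checked directly from the reported precision and recall, since F1 is their harmonic mean. The short sketch below does that arithmetic; the function name is an assumption, and the input values are the ones quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Support vector machine on the test set (values from the abstract).
print(round(f1_score(0.940, 0.969), 3))
# Diagnosis codes (values from the abstract).
print(round(f1_score(0.896, 0.960), 3))
```

Both results round to the F1 values stated in the abstract (0.954 and 0.927), so the reported figures are internally consistent.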
Parafoveal load of word N+1 modulates preprocessing effectiveness of word N+2 in Chinese reading.
Yan, Ming; Kliegl, Reinhold; Shu, Hua; Pan, Jinger; Zhou, Xiaolin
2010-12-01
Preview benefits (PBs) from two words to the right of the fixated one (i.e., word N + 2) and associated parafoveal-on-foveal effects are critical for proposals of distributed lexical processing during reading. This experiment examined parafoveal processing during reading of Chinese sentences, using a boundary manipulation of N + 2-word preview with low- and high-frequency words N + 1. The main findings were (a) an identity PB for word N + 2 that was (b) primarily observed when word N + 1 was of high frequency (i.e., an interaction between frequency of word N + 1 and PB for word N + 2), and (c) a parafoveal-on-foveal frequency effect of word N + 1 for fixation durations on word N. We discuss implications for theories of serial attention shifts and parallel distributed processing of words during reading.
ERIC Educational Resources Information Center
García-Orza, Javier; Comesaña, Montserrat; Piñeiro, Ana; Soares, Ana Paula; Perea, Manuel
2016-01-01
Recent research has shown that leet words (i.e., words in which some of the letters are replaced by visually similar digits; e.g., VIRTU4L) can be processed as their base words without much cost. However, it remains unclear whether the digits inserted in leet words are simply processed as letters or whether they are simultaneously processed as…
Vinciarelli, Alessandro
2005-12-01
This work presents categorization experiments performed over noisy texts. By noisy, we mean any text obtained through an extraction process (affected by errors) from media other than digital texts (e.g., transcriptions of speech recordings extracted with a recognition system). The performance of a categorization system over the clean and noisy (Word Error Rate between approximately 10 and approximately 50 percent) versions of the same documents is compared. The noisy texts are obtained through handwriting recognition and simulation of optical character recognition. The results show that the performance loss is acceptable for Recall values up to 60-70 percent depending on the noise sources. New measures of the extraction process performance, allowing a better explanation of the categorization results, are proposed.
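For readers unfamiliar with the noise measure, Word Error Rate is conventionally defined as the word-level edit distance between the extracted text and the reference, divided by the number of reference words. The sketch below implements that standard definition; the abstract does not state how the authors computed it, and the function name and sample sentences are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # reference word deleted
                           cur[j - 1] + 1,            # extra word inserted
                           prev[j - 1] + (r != h)))   # word substituted
        prev = cur
    return prev[-1] / len(ref)


print(word_error_rate("the quick brown fox jumps", "the quick brown box jumps"))
```

A rate of 0.2 here means one word in five is wrong, which sits inside the 10 to 50 percent range the study considers.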
Nonconscious semantic processing of emotional words modulates conscious access
Gaillard, Raphaël; Del Cul, Antoine; Naccache, Lionel; Vinckier, Fabien; Cohen, Laurent; Dehaene, Stanislas
2006-01-01
Whether masked words can be processed at a semantic level remains a controversial issue in cognitive psychology. Although recent behavioral studies have demonstrated masked semantic priming for number words, attempts to generalize this finding to other categories of words have failed. Here, as an alternative to subliminal priming, we introduce a sensitive behavioral method to detect nonconscious semantic processing of words. The logic of this method consists of presenting words close to the threshold for conscious perception and examining whether their semantic content modulates performance in objective and subjective tasks. Our results disclose two independent sources of modulation of the threshold for access to consciousness. First, prior conscious perception of words increases the detection rate of the same words when they are subsequently presented with stronger masking. Second, the threshold for conscious access is lower for emotional words than for neutral ones, even for words that have not been previously consciously perceived, thus implying that written words can receive nonconscious semantic processing. PMID:16648261
Intrusive effects of implicitly processed information on explicit memory.
Sentz, Dustin F; Kirkhart, Matthew W; LoPresto, Charles; Sobelman, Steven
2002-02-01
This study described the interference of implicitly processed information on the memory for explicitly processed information. Participants studied a list of words either auditorily or visually under instructions to remember the words (explicit study). They were then visually presented another word list under instructions which facilitate implicit but not explicit processing. Following a distractor task, memory for the explicit study list was tested with either a visual or auditory recognition task that included new words, words from the explicit study list, and words implicitly processed. Analysis indicated participants both failed to recognize words from the explicit study list and falsely recognized words that were implicitly processed as originating from the explicit study list. However, this effect only occurred when the testing modality was visual, thereby matching the modality for the implicitly processed information, regardless of the modality of the explicit study list. This "modality effect" for explicit memory was interpreted as poor source memory for implicitly processed information and in light of the procedures used, as well as illustrating an example of "remembering causing forgetting."
The relation between resource limitations and optional conceptual processing by children and adults.
Ackerman, B P; Spiker, K; Bailey, K
1989-10-01
In some situations children fail to perform optional conceptual processing that they are able to perform. The purpose of the 4 experiments was to determine if the difficulty of word identification affects optional conceptual processing by second/third graders, fifth graders, and college students in a cued recall task. Conceptual processing was manipulated by presenting Hard (e.g., hawk eagle canary) or Easy (river lake canary) word triplets that varied in the contrastive processing necessary to identify the "odd" target word (canary). The orienting activity also varied: for the Oddity Choice activity, contrastive processing was obligatory because the subject had to identify the target; for the Read activity, contrastive processing was optional because the experimenter identified the target. A recall advantage for the Hard over the Easy triplets was the measure of contrastive processing. Finally, the difficulty of word identification varied in that the subjects read the stimuli or the experimenter read the stimuli, and all the words were degraded, only the nontarget words were degraded, or all the words were intact. The results established that contrastive processing facilitates recall, and that word identification difficulty may limit the extent of optional contrastive processing.
Kuehl, Linn K; Wolf, Oliver T; Driessen, Martin; Schlosser, Nicole; Fernando, Silvia Carvalho; Wingenfeld, Katja
2017-09-01
Mood congruent alterations in information processing such as an impaired memory bias for emotional information and impaired inhibitory functions are prominent features of a major depressive disorder (MDD). Furthermore, in MDD patients hypothalamic-pituitary-adrenal axis dysfunctions are frequently found. Impairing effects of stress or cortisol administration on memory retrieval as well as impairing stress effects on cognitive inhibition are well documented in healthy participants. In MDD patients, no effect of acute cortisol administration on memory retrieval was found. The current study investigated the effect of acute cortisol administration on memory bias in MDD patients (N = 55) and healthy controls (N = 63) using the Directed Forgetting (DF) task with positive, negative and neutral words in a placebo controlled, double blind design. After oral administration of 10 mg hydrocortisone/placebo, the item method of the DF task was conducted. Memory performance was tested with a free recall test. Cortisol was not found to have an effect on the results of the DF task. Interestingly, there was a significant impact of valence: both groups showed the highest DF score for positive words and remembered significantly more positive words that were supposed to be remembered and significantly more negative words that were supposed to be forgotten. In general, healthy participants remembered more words than the depressed patients. Still, the depressed patients were able to inhibit intentionally irrelevant information at a level comparable to that of the healthy controls. These results demonstrate the importance of distinguishing in experimental designs between different cognitive domains, such as inhibition and memory in our study. Copyright © 2017 Elsevier Ltd. All rights reserved.
Effects of Speaking Rate on Word Recognition in Parkinson’s Disease and Normal Aging
Forrest, Karen; Nygaard, Lynne; Pisoni, David B.; Siemers, Eric
2011-01-01
Current theories of basal ganglia function emphasize their role in the integration of sensory information into motor activities, particularly in the control of movement timing. People with basal ganglia disorders such as Parkinson’s disease exhibit poor temporal control of movements in general, and articulation in particular, as demonstrated by irregular speaking rate, reduced stress contrasts, and reduced movement durations and velocities. Previous research has implicated sensory deficits as contributory factors in limb movement control in patients with Parkinson’s disease; however, the relation between sensory deficits and speech-movement abnormalities has not been documented. In the present study, the existence of perceptual processing difficulties of speaking rate was investigated in subjects with Parkinsonian dysarthria (PD). Comparisons in perception were made between subjects with PD, neurologically normal geriatrics (GN) and neurologically normal young adults (YN) for accuracy in identification of words presented at different speaking rates. We hypothesized that word-identification scores would be lower for PD and GN subjects compared to the YN subjects, an effect that was supported by the data. We also expected that there would be differences between the GN and PD subjects in their accuracy of word identification at a faster speaking rate, a hypothesis that was not supported by the data. Rather, GN and PD subjects differed in identification scores for words spoken at a slow rate. PD subjects who had faster habitual speaking rates (HSR) had significantly lower word-identification scores in the slow compared to conversational rate conditions, a relation that was significant (r = +0.64). These data suggest the need to consider perceptual deficits as an additional factor that contributes to rate variations in PD speech. PMID:21637728
Alcoholism and dampened temporal limbic activation to emotional faces.
Marinkovic, Ksenija; Oscar-Berman, Marlene; Urban, Trinity; O'Reilly, Cara E; Howard, Julie A; Sawyer, Kayle; Harris, Gordon J
2009-11-01
Excessive chronic drinking is accompanied by a broad spectrum of emotional changes ranging from apathy and emotional flatness to deficits in comprehending emotional information, but their neural bases are poorly understood. Emotional abnormalities associated with alcoholism were examined with functional magnetic resonance imaging in abstinent long-term alcoholic men in comparison to healthy demographically matched controls. Participants were presented with emotionally valenced words and photographs of faces during deep (semantic) and shallow (perceptual) encoding tasks followed by recognition. Overall, faces evoked stronger activation than words, with the expected material-specific laterality (left hemisphere for words, and right for faces) and depth of processing effects. However, whereas control participants showed stronger activation in the amygdala and hippocampus when viewing faces with emotional (relative to neutral) expressions, the alcoholics responded in an undifferentiated manner to all facial expressions. In the alcoholic participants, amygdala activity was inversely correlated with an increase in lateral prefrontal activity as a function of their behavioral deficits. Prefrontal modulation of emotional function as a compensation for the blunted amygdala activity during a socially relevant face appraisal task is in agreement with a distributed network engagement during emotional face processing. Deficient activation of amygdala and hippocampus may underlie impaired processing of emotional faces associated with long-term alcoholism and may be a part of the wide array of behavioral problems including disinhibition, concurring with previously documented interpersonal difficulties in this population. Furthermore, the results suggest that alcoholics may rely on prefrontal rather than temporal limbic areas in order to compensate for reduced limbic responsivity and to maintain behavioral adequacy when faced with emotionally or socially challenging situations.
Yao, Zhao; Yu, Deshui; Wang, Lili; Zhu, Xiangru; Guo, Jingjing; Wang, Zhenhong
2016-12-01
We investigated whether the effects of valence and arousal on emotional word processing are modulated by concreteness using event-related potentials (ERPs). The stimuli included concrete words (Experiment 1) and abstract words (Experiment 2) that were organized in an orthogonal design, with valence (positive and negative) and arousal (low and high) as factors in a lexical decision task. In Experiment 1, the impact of emotion on the effects of concrete words mainly resulted from the contribution of valence. Positive concrete words were processed more quickly than negative words and elicited a reduction of N400 (300-410 ms) and enhancement of late positive complex (LPC; 450-750 ms), whereas no differences in response times or ERPs were found between high and low levels of arousal. In Experiment 2, the interaction between valence and arousal influenced the impact of emotion on the effects of abstract words. Low-arousal positive words were associated with shorter response times and a reduction of LPC amplitudes compared with high-arousal positive words. Low-arousal negative words were processed more slowly and elicited a reduction of N170 (140-200 ms) compared with high-arousal negative words. The present study indicates that word concreteness modulates the contributions of valence and arousal to the effects of emotion, and this modulation occurs during the early perceptual processing stage (N170) and late elaborate processing stage (LPC) for emotional words and at the end of all cognitive processes (i.e., reflected by response times). These findings support an embodied theory of semantic representation and help clarify prior inconsistent findings regarding the ways in which valence and arousal influence different stages of word processing, at least in a lexical decision task. Copyright © 2016 Elsevier B.V. All rights reserved.
Structural and functional correlates for language efficiency in auditory word processing.
Jung, JeYoung; Kim, Sunmi; Cho, Hyesuk; Nam, Kichun
2017-01-01
This study aims to provide a convergent understanding of the neural basis of auditory word processing efficiency using multimodal imaging. We investigated the structural and functional correlates of word processing efficiency in healthy individuals. We acquired two types of structural imaging (T1-weighted imaging and diffusion tensor imaging) and functional magnetic resonance imaging (fMRI) during auditory word processing (phonological and semantic tasks). Our results showed that better phonological performance was predicted by greater thalamus activity. In contrast, better semantic performance was associated with less activation in the left posterior middle temporal gyrus (pMTG), supporting the neural efficiency hypothesis that better task performance requires less brain activation. Furthermore, our network analysis revealed that the semantic network including the left anterior temporal lobe (ATL), dorsolateral prefrontal cortex (DLPFC) and pMTG was correlated with semantic efficiency. In particular, this network acted in a neurally efficient manner during auditory word processing. Structurally, the DLPFC and cingulum contributed to word processing efficiency. Also, the parietal cortex showed a significant association with word processing efficiency. Our results demonstrated that two features of word processing efficiency, phonology and semantics, can be supported in different brain regions and, importantly, that the way each region serves it differs according to the feature of word processing. Our findings suggest that word processing efficiency can be achieved by the collaboration of multiple brain regions involved in language and general cognitive function, structurally and functionally. PMID:28892503
Cost Estimating Cases: Educational Tools for Cost Analysts
1993-09-01
only appropriate documentation should be provided. In other words, students should not submit all of the documentation possible using ACEIT, only that...case was their lack of understanding of the ACEIT software used to conduct the estimate. Specifically, many students misinterpreted the cost...estimating relationships (CERs) embedded in the software. Additionally, few of the students were able to properly organize the ACEIT documentation output
Word Processing Curriculum: Attitudes/Skills Business Educators Should Update.
ERIC Educational Resources Information Center
Robertson, Jane R.; West, Judy F.
1984-01-01
Discusses a study to gain data enabling curricula planners and business educators to plan an effective word processing curriculum, to determine basic skills and attitudes needed by word processing operators, and to make recommendations to help word processor operators increase productivity. (JOW)
Wabnitz, Pascal; Martens, Ulla; Neuner, Frank
2016-01-01
Social anxiety disorder (SAD) is associated with heightened sensitivity to threat cues, typically represented by emotional facial expressions. To examine if this bias can be transferred to a general hypersensitivity or whether it is specific to disorder relevant cues, we investigated electrophysiological correlates of emotional word processing (alpha activity and event-related potentials) in 20 healthy participants and 20 participants with SAD. The experimental task was a silent reading of neutral, positive, physically threatening and socially threatening words (the latter were abusive swear words) while responding to a randomly presented dot. Subsequently, all participants were asked to recall as many words as possible during an unexpected recall test. Participants with SAD showed blunted sensory processing followed by a rapid processing of emotional words during early stages (early posterior negativity - EPN). At later stages, all participants showed enhanced processing of negative (physically and socially threatening) compared to neutral and positive words (N400). Moreover, at later processing stages alpha activity was increased specifically for negative words in participants with SAD but not in healthy controls. Recall of emotional words for all subjects was best for socially threatening words, followed by negative and positive words irrespective of social anxiety. The present findings indicate that SAD is associated with abnormalities in emotional word processing characterised by early hypervigilance to emotional cues followed by cognitive avoidance at later processing stages. Most importantly, the specificity of these attentional biases seems to change as a function of time with a general emotional bias at early and a more specific bias at later processing stages.
Semantic word category processing in semantic dementia and posterior cortical atrophy.
Shebani, Zubaida; Patterson, Karalyn; Nestor, Peter J; Diaz-de-Grenu, Lara Z; Dawson, Kate; Pulvermüller, Friedemann
2017-08-01
There is general agreement that perisylvian language cortex plays a major role in lexical and semantic processing; but the contribution of additional, more widespread, brain areas in the processing of different semantic word categories remains controversial. We investigated word processing in two groups of patients whose neurodegenerative diseases preferentially affect specific parts of the brain, to determine whether their performance would vary as a function of semantic categories proposed to recruit those brain regions. Cohorts with (i) Semantic Dementia (SD), who have anterior temporal-lobe atrophy, and (ii) Posterior Cortical Atrophy (PCA), who have predominantly parieto-occipital atrophy, performed a lexical decision test on words from five different lexico-semantic categories: colour (e.g., yellow), form (oval), number (seven), spatial prepositions (under) and function words (also). Sets of pseudo-word foils matched the target words in length and bi-/tri-gram frequency. Word-frequency was matched between the two visual word categories (colour and form) and across the three other categories (number, prepositions, and function words). Age-matched healthy individuals served as controls. Although broad word processing deficits were apparent in both patient groups, the deficit was strongest for colour words in SD and for spatial prepositions in PCA. The patterns of performance on the lexical decision task demonstrate (a) general lexicosemantic processing deficits in both groups, though more prominent in SD than in PCA, and (b) differential involvement of anterior-temporal and posterior-parietal cortex in the processing of specific semantic categories of words. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Effects of context and individual differences on the processing of taboo words.
Christianson, Kiel; Zhou, Peiyun; Palmer, Cassie; Raizen, Adina
2017-07-01
Previous studies suggest that taboo words are special in regards to language processing. Findings from the studies have led to the formation of two theories, global resource theory and binding theory, of taboo word processing. The current study investigates how readers process taboo words embedded in sentences during silent reading. In two experiments, measures collected include eye movement data, accuracy and reaction time measures for recalling probe words within the sentences, and individual differences in likelihood of being offended by taboo words. Although certain aspects of the results support both theories, as the likelihood of a person being offended by a taboo word influenced some measures, neither theory sufficiently predicts or describes the effects observed. The results are interpreted as evidence that processing effects ascribed to taboo words are largely, but not completely, attributable to the context in which they are used and the individual attitudes of the people who hear/read them. The results also demonstrate the importance of investigating taboo words in naturalistic language processing paradigms. A revised theory of taboo word processing is proposed that incorporates both global resource theory and binding theory along with the sociolinguistic factors and individual differences that largely drive the effects observed here. Copyright © 2017 Elsevier B.V. All rights reserved.
Neural dichotomy of word concreteness: a view from functional neuroimaging.
Kumar, Uttam
2016-02-01
Our understanding of the representation and processing of concrete and abstract concepts rests on the observation that concrete words are more readily imagined and remembered faster than abstract words. Various theories have been proposed to explain the processing differences between abstract and concrete concepts, yet there is no consensus about their neural implications. The present study investigated the processing of concrete and abstract words during an orthography judgment task (implicit semantic processing) using functional magnetic resonance imaging to validate the involvement of the neural regions. Relative to non-words, both abstract and concrete words showed activation in regions of both hemispheres previously associated with semantic processing. The common areas (conjunction analyses) observed for abstract and concrete words were bilateral inferior frontal gyrus (BA 44/45), left superior parietal (BA 7), left fusiform gyrus and bilateral middle occipital. Additional areas for abstract words were found in bilateral superior temporal and bilateral middle temporal regions, whereas no distinct region was found for concrete words. This suggests that words with abstract concepts recruit additional language regions in the brain.
Words and Melody Are Intertwined in Perception of Sung Words: EEG and Behavioral Evidence
Gordon, Reyna L.; Schön, Daniele; Magne, Cyrille; Astésano, Corine; Besson, Mireille
2010-01-01
Language and music, two of the most unique human cognitive abilities, are combined in song, rendering it an ecological model for comparing speech and music cognition. The present study was designed to determine whether words and melodies in song are processed interactively or independently, and to examine the influence of attention on the processing of words and melodies in song. Event-Related brain Potentials (ERPs) and behavioral data were recorded while non-musicians listened to pairs of sung words (prime and target) presented in four experimental conditions: same word, same melody; same word, different melody; different word, same melody; different word, different melody. Participants were asked to attend to either the words or the melody, and to perform a same/different task. In both attentional tasks, different word targets elicited an N400 component, as predicted based on previous results. Most interestingly, different melodies (sung with the same word) elicited an N400 component followed by a late positive component. Finally, ERP and behavioral data converged in showing interactions between the linguistic and melodic dimensions of sung words. The finding that the N400 effect, a well-established marker of semantic processing, was modulated by musical melody in song suggests that variations in musical features affect word processing in sung language. Implications of the interactions between words and melody are discussed in light of evidence for shared neural processing resources between the phonological/semantic aspects of language and the melodic/harmonic aspects of music. PMID:20360991
The Impact of Word Processing on Office Administration in the Medical and Allied Health Professions.
ERIC Educational Resources Information Center
Platt, Naomi Dornfeld
The effect of word processing equipment on the future medical secretarial science curriculum was studied. A literature search focused on word processing and the medical and allied health professions, word processing and business education, and futuring of and changes in the secretarial science curriculum. Questionnaires to identify various aspects…
Brysbaert, Marc; Keuleers, Emmanuel; New, Boris
2011-01-01
In this Perspective Article we assess the usefulness of Google's new word frequencies for word recognition research (lexical decision and word naming). We find that, despite the massive corpus on which the Google estimates are based (131 billion words from books published in the United States alone), the Google American English frequencies explain 11% less of the variance in the lexical decision times from the English Lexicon Project (Balota et al., 2007) than the SUBTLEX-US word frequencies, based on a corpus of 51 million words from film and television subtitles. Further analyses indicate that word frequencies derived from recent books (published after 2000) are better predictors of word processing times than frequencies based on the full corpus, and that word frequencies based on fiction books predict word processing times better than word frequencies based on the full corpus. The most predictive word frequencies from Google still do not explain more of the variance in word recognition times of undergraduate students and old adults than the subtitle-based word frequencies. PMID:21713191
Semi-automated ontology generation and evolution
NASA Astrophysics Data System (ADS)
Stirtzinger, Anthony P.; Anken, Craig S.
2009-05-01
Extending the notion of data models or object models, ontology can provide rich semantic definition not only to the meta-data but also to the instance data of domain knowledge, making these semantic definitions available in machine readable form. However, the generation of an effective ontology is a difficult task involving considerable labor and skill. This paper discusses an Ontology Generation and Evolution Processor (OGEP) aimed at automating this process, only requesting user input when un-resolvable ambiguous situations occur. OGEP directly attacks the main barrier which prevents automated (or self learning) ontology generation: the ability to understand the meaning of artifacts and the relationships the artifacts have to the domain space. OGEP leverages existing lexical to ontological mappings in the form of WordNet, and Suggested Upper Merged Ontology (SUMO) integrated with a semantic pattern-based structure referred to as the Semantic Grounding Mechanism (SGM) and implemented as a Corpus Reasoner. The OGEP processing is initiated by a Corpus Parser performing a lexical analysis of the corpus, reading in a document (or corpus) and preparing it for processing by annotating words and phrases. After the Corpus Parser is done, the Corpus Reasoner uses the parts of speech output to determine the semantic meaning of a word or phrase. The Corpus Reasoner is the crux of the OGEP system, analyzing, extrapolating, and evolving data from free text into cohesive semantic relationships. The Semantic Grounding Mechanism provides a basis for identifying and mapping semantic relationships. By blending together the WordNet lexicon and SUMO ontological layout, the SGM is given breadth and depth in its ability to extrapolate semantic relationships between domain entities. The combination of all these components results in an innovative approach to user assisted semantic-based ontology generation. 
This paper will describe the OGEP technology in the context of the architectural components referenced above and identify a potential technology transition path to Scott AFB's Tanker Airlift Control Center (TACC) which serves as the Air Operations Center (AOC) for the Air Mobility Command (AMC).
ERIC Educational Resources Information Center
Australian Dept. of Labour and National Service, Melbourne. Women's Bureau.
This document is an English-language abstract (approximately 1,500 words) in which Australian child care facilities are surveyed to include those providing full-day care and therefore excludes kindergartens, play centers, nursery schools, and child minding centers that provide care for only part of the day. The document presents a breakdown of…
Sun-to-power cells layer by layer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moseke, Dawn; Richards, Robin; Moseke, Daniel
Representing the Center for Interface Science: Solar Electric Materials (CISSEM), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CISSEM is to advance the understanding of interface science underlying solar energy conversion technologies based on organic and organic-inorganic hybrid materials; and to inspire, recruit and train future scientists and leaders in basic science of solar electric conversion.
Powering your car with sun light
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cosgrove, Daniel; Brown, Nicole; Kiemle, Sarah
Representing the Center for Lignocellulose Structure and Formation (CLSF), this document is one of the entries in the Ten Hundred and One Word Challenge and was awarded "Overall Winner." As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CLSF is to dramatically increase our fundamental knowledge of the formation and physical interactions of bio-polymer networks in plant cell walls to provide a basis for improved methods for converting biomass into fuels.
Our On-Its-Head-and-In-Your-Dreams Approach Leads to Clean Energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kazmerski, Lawrence; Gwinner, Don; Hicks, Al
Representing the Center for Inverse Design (CID), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CID is to revolutionize the discovery of new materials by design with tailored properties through the development and application of a novel inverse design approach powered by theory guiding experiment with an initial focus on solar energy conversion.
Controlling Light to Make the Most Energy From the Sun
DOE Office of Scientific and Technical Information (OSTI.GOV)
Callahan, Dennis; Corcoran, Chris; Eisler, Carissa
Representing the Light-Material Interactions in Energy Conversion (LMI), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of LMI is to tailor the morphology, complex dielectric structure, and electronic properties of matter so as to sculpt the flow of sunlight and heat, enabling light conversion to electrical and chemical energy with unprecedented efficiency.
Stuff Moving Through Other Stuff - For Energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
All EFRC effort,
Representing the Understanding Charge Separation and Transfer at Interfaces in Energy Materials (EFRC:CST), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. Understanding Charge Separation and Transfer at Interfaces in Energy Materials (EFRC:CST) is focused on advancing the understanding and design of nanostructured molecular materials for organic photovoltaic (OPV) and electrical energy storage (EES) applications.
The Walk Forward of Sun-Grown Green-Thing Energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huetteman, Carl; Burroff-Murr, Pam; Anderson, Sarah
Representing the Center for Direct Catalytic Conversion of Biomass to Biofuels (C3Bio), this document is one of the entries in the Ten Hundred and One Word Challenge and was awarded "Best Tagline." As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of C3Bio at Purdue University is to integrate fundamental knowledge and enable technologies for catalytic conversion of engineered biomass to advanced biofuels and value-added products.
Is The Same bit of Light Exciting Two (or more) Parts of a Thing at the Same Time?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goodknight, Joey; Aspuru-Guzik, Alan
Representing the Center for Excitonics (CE), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of the CE is to understand the transport of charge carriers in synthetic disordered systems, which hold promise as new materials for conversion of solar energy to electricity and electrical energy storage.
Dreyer, Felix R.; Frey, Dietmar; Arana, Sophie; von Saldern, Sarah; Picht, Thomas; Vajkoczy, Peter; Pulvermüller, Friedemann
2015-01-01
Neuroimaging and neuropsychological experiments suggest that modality-preferential cortices, including motor- and somatosensory areas, contribute to the semantic processing of action related concrete words. Still, a possible role of sensorimotor areas in processing abstract meaning remains under debate. Recent fMRI studies indicate an involvement of the left sensorimotor cortex in the processing of abstract-emotional words (e.g., “love”) which resembles activation patterns seen for action words. But are the activated areas indeed necessary for processing action-related and abstract words? The current study now investigates word processing in two patients suffering from focal brain lesion in the left frontocentral motor system. A speeded Lexical Decision Task on meticulously matched word groups showed that the recognition of nouns from different semantic categories – related to food, animals, tools, and abstract-emotional concepts – was differentially affected. Whereas patient HS with a lesion in dorsolateral central sensorimotor systems next to the hand area showed a category-specific deficit in recognizing tool words, patient CA suffering from lesion centered in the left supplementary motor area was primarily impaired in abstract-emotional word processing. These results point to a causal role of the motor cortex in the semantic processing of both action-related object concepts and abstract-emotional concepts and therefore suggest that the motor areas previously found active in action-related and abstract word processing can serve a meaning-specific necessary role in word recognition. The category-specific nature of the observed dissociations is difficult to reconcile with the idea that sensorimotor systems are somehow peripheral or ‘epiphenomenal’ to meaning and concept processing. 
Rather, our results are consistent with the claim that cognition is grounded in action and perception and based on distributed action perception circuits reaching into modality-preferential cortex. PMID:26617535
Phonological Priming with Nonwords in Children with and without Specific Language Impairment
ERIC Educational Resources Information Center
Brooks, Patricia J.; Seiger-Gardner, Liat; Obeid, Rita; MacWhinney, Brian
2015-01-01
Purpose: The cross-modal picture-word interference task is used to examine contextual effects on spoken-word production. Previous work has documented lexical-phonological interference in children with specific language impairment (SLI) when a related distractor (e.g., bell) occurs prior to a picture to be named (e.g., a bed). In the current study,…
The Magic of Words: Teaching Vocabulary in the Early Childhood Classroom
ERIC Educational Resources Information Center
Neuman, Susan B.; Wright, Tanya S.
2014-01-01
Developing a large and rich vocabulary is central to learning to read. Children must know the words that make up written texts in order to understand them, especially as the vocabulary demands of content-related materials increase in the upper grades. Studies have documented that the size of a person's vocabulary is strongly related to how…
New Words Digest, Fall 1989-Summer 1990.
ERIC Educational Resources Information Center
New Words Digest, 1990
1990-01-01
This document consists of the four issues of the first annual volume of a quarterly magazine for new adult readers. It is aimed at adults reading at the fourth- to eighth-grade level. The magazine is designed to be self-motivating to the new reader or the learning disabled. Phonetic helps are provided for those words that do not conform to typical…
McBride, Dawn M; Anne Dosher, Barbara
2002-09-01
Four experiments were conducted to evaluate explanations of picture superiority effects previously found for several tasks. In a process dissociation procedure (Jacoby, 1991) with word stem completion, picture fragment completion, and category production tasks, conscious and automatic memory processes were compared for studied pictures and words with an independent retrieval model and a generate-source model. The predictions of a transfer appropriate processing account of picture superiority were tested and validated in "process pure" latent measures of conscious and unconscious, or automatic and source, memory processes. Results from both model fits verified that pictures had a conceptual (conscious/source) processing advantage over words for all tasks. The effects of perceptual (automatic/word generation) compatibility depended on task type, with pictorial tasks favoring pictures and linguistic tasks favoring words. Results show support for an explanation of the picture superiority effect that involves an interaction of encoding and retrieval processes.
Holistic processing of words modulated by reading experience.
Wong, Alan C-N; Bukach, Cindy M; Yuen, Crystal; Yang, Lizhuang; Leung, Shirley; Greenspon, Emma
2011-01-01
Perceptual expertise has been studied intensively with faces and object categories involving detailed individuation. A common finding is that experience in fulfilling the task demand of fine, subordinate-level discrimination between highly similar instances is associated with the development of holistic processing. This study examines whether holistic processing is also engaged by expert word recognition, which is thought to involve coarser, basic-level processing that is more part-based. We adopted a paradigm widely used for faces--the composite task, and found clear evidence of holistic processing for English words. A second experiment further showed that holistic processing for words was sensitive to the amount of experience with the language concerned (native vs. second-language readers) and with the specific stimuli (words vs. pseudowords). The adoption of a paradigm from the face perception literature to the study of expert word perception is important for further comparison between perceptual expertise with words and face-like expertise.
An IR-Based Approach Utilizing Query Expansion for Plagiarism Detection in MEDLINE.
Nawab, Rao Muhammad Adeel; Stevenson, Mark; Clough, Paul
2017-01-01
The identification of duplicated and plagiarized passages of text has become an increasingly active area of research. In this paper, we investigate methods for plagiarism detection that aim to identify potential sources of plagiarism from MEDLINE, particularly when the original text has been modified through the replacement of words or phrases. A scalable approach based on Information Retrieval is used to perform candidate document selection (the identification of a subset of potential source documents given a suspicious text) from MEDLINE. Query expansion is performed using the UMLS Metathesaurus to deal with situations in which original documents are obfuscated. Various approaches to Word Sense Disambiguation are investigated to deal with cases where there are multiple Concept Unique Identifiers (CUIs) for a given term. Results using the proposed IR-based approach outperform a state-of-the-art baseline based on Kullback-Leibler Distance.
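As a rough illustration of the kind of Kullback-Leibler baseline this abstract mentions, the sketch below ranks candidate source documents by the divergence between smoothed unigram distributions. It is only a sketch under stated assumptions: the function names, the add-one smoothing, and the whitespace tokenization are hypothetical choices, not the method or settings reported by the authors.

```python
import math
from collections import Counter


def unigram_dist(tokens, vocab, alpha=1.0):
    """Smoothed unigram distribution over a shared vocabulary.

    alpha is an assumed additive-smoothing constant (not from the paper).
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined on the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def rank_by_kl(suspicious, candidates):
    """Return candidate indices ordered from lowest to highest divergence."""
    vocab = set(suspicious)
    for cand in candidates:
        vocab |= set(cand)
    p = unigram_dist(suspicious, vocab)
    scores = [(kl_divergence(p, unigram_dist(c, vocab)), i)
              for i, c in enumerate(candidates)]
    return [i for _, i in sorted(scores)]
```

A lower divergence means the candidate's word distribution is closer to that of the suspicious text, so such candidates would be examined first.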
What do foreign neighbors say about the mental lexicon?*
VITEVITCH, MICHAEL S.
2012-01-01
A corpus analysis of phonological word-forms shows that English words have few phonological neighbors that are Spanish words. Concomitantly, Spanish words have few phonological neighbors that are English words. These observations appear to undermine certain accounts of bilingual language processing, and have significant implications for the processing and representation of word-forms in bilinguals. PMID:23930081
The low-frequency encoding disadvantage: Word frequency affects processing demands.
Diana, Rachel A; Reder, Lynne M
2006-07-01
Low-frequency words produce more hits and fewer false alarms than high-frequency words in a recognition task. The low-frequency hit rate advantage has sometimes been attributed to processes that operate during the recognition test (e.g., L. M. Reder et al., 2000). When tasks other than recognition, such as recall, cued recall, or associative recognition, are used, the effects seem to contradict a low-frequency advantage in memory. Four experiments are presented to support the claim that in addition to the advantage of low-frequency words at retrieval, there is a low-frequency disadvantage during encoding. That is, low-frequency words require more processing resources to be encoded episodically than high-frequency words. Under encoding conditions in which processing resources are limited, low-frequency words show a larger decrement in recognition than high-frequency words. Also, studying items (pictures and words of varying frequencies) along with low-frequency words reduces performance for those stimuli. Copyright 2006 APA, all rights reserved.
Limitations of the dual-process-theory regarding the writing of words and non-words to dictation.
Tucha, Oliver; Trumpp, Christian; Lange, Klaus W
2004-12-01
It is generally assumed that the lexical and phonological systems are involved in writing to dictation. In an experiment concerned with the writing of words and non-words to dictation, the handwriting of female students was registered using a digitising tablet. The data contradict the assumption that the phonological system represents an alexical process. Both words and non-words which were acoustically presented to the subjects were lexically parsed. The analysis of kinematic data revealed significant differences between the subjects' writing of words and non-words. The findings reveal gross disturbances of handwriting fluency during the writing of non-words. The findings of the experiment cannot be explained by the dual-process-theory.
How does the interaction between spelling and motor processes build up during writing acquisition?
Kandel, Sonia; Perret, Cyril
2015-03-01
How do we recall a word's spelling? How do we produce the movements to form the letters of a word? Writing involves several processing levels. Surprisingly, researchers have focused either on spelling or motor production. However, these processes interact and cannot be studied separately. Spelling processes cascade into movement production. For example, in French, producing letters PAR in the orthographically irregular word PARFUM (perfume) delays motor production with respect to the same letters in the regular word PARDON (pardon). Orthographic regularity refers to the possibility of spelling a word correctly by applying the most frequent sound-letter conversion rules. The present study examined how the interaction between spelling and motor processing builds up during writing acquisition. French 8-10 year old children participated in the experiment. This is the age at which handwriting skills start to become automatic. The children wrote regular and irregular words that could be frequent or infrequent. They wrote on a digitizer so we could collect data on latency, movement duration and fluency. The results revealed that the interaction between spelling and motor processing was present already at age 8. It became more adult-like at ages 9 and 10. Before starting to write, processing irregular words took longer than regular words. This processing load spread into movement production. It increased writing duration and rendered the movements more dysfluent. Word frequency affected latencies and cascaded into production. It modulated writing duration but not movement fluency. Writing infrequent words took longer than frequent words. The data suggest that orthographic regularity has a stronger impact on writing than word frequency. The two factors do not cascade to the same extent. Copyright © 2014 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Scriven, Jolene D.; And Others
A study was conducted (1) to determine current practices in word processing installations in selected organizations throughout the United States, and (2) to ascertain anticipated future developments in word processing as well as to provide recommendations for educational institutions that prepare workers for business offices. Seven interview…
Hamada, Megumi; Koda, Keiko
2011-04-01
Although the role of the phonological loop in word-retention is well documented, research in Chinese character retention suggests the involvement of non-phonological encoding. This study investigated whether the extent to which the phonological loop contributes to learning and remembering visually introduced words varies between college-level Chinese ESL learners (N = 20) and native speakers of English (N = 20). The groups performed a paired associative learning task under two conditions (control versus articulatory suppression) with two word types (regularly spelled versus irregularly spelled words) differing in degree of phonological accessibility. The results demonstrated that both groups' recall declined when the phonological loop was made less available (with irregularly spelled words and in the articulatory suppression condition), but the decline was greater for the native group. These results suggest that word learning entails phonological encoding uniformly across learners, but the contribution of phonology varies among learners with diverse linguistic backgrounds.
[Representation of letter position in visual word recognition process].
Makioka, S
1994-08-01
Two experiments investigated the representation of letter position in the visual word recognition process. In Experiment 1, subjects (12 undergraduates and graduates) were asked to detect a target word in a briefly presented probe. Probes consisted of two kanji words. The letters which formed targets (critical letters) were always contained in probes (e.g. target: [symbol: see text] probe: [symbol: see text]). A high false alarm rate was observed when critical letters occupied the same within-word relative position (left or right within the word) in the probe words as in the target word. In Experiment 2 (subjects were ten undergraduates and graduates), spaces adjacent to probe words were replaced by randomly chosen hiragana letters (e.g. [symbol: see text]), because spaces are not used to separate words in regular Japanese sentences. In addition to the effect of within-word relative position as in Experiment 1, the effect of between-word relative position (left or right across the probe words) was observed. These results suggest that information about the within-word relative position of a letter is used in the word recognition process. The effect of within-word relative position was explained by a connectionist model of word recognition.
Seeking Feng Shui in US-China Rhetoric - Words Matter
2017-03-31
2017 DISTRIBUTION A. Approved for public release: distribution unlimited. DISCLAIMER The views expressed in this academic research paper are those...leaders’ rhetoric conflates contingency planning threat analysis as U.S.-China policy and is inconsistent with the threats China poses. Not only is...national strategy documents can be viewed as political documents that may not represent true U.S. intent, both sets of documents still require adherence to
Methods and means used in programming intelligent searches of technical documents
NASA Technical Reports Server (NTRS)
Gross, David L.
1993-01-01
In order to meet the data research requirements of the Safety, Reliability & Quality Assurance activities at Kennedy Space Center (KSC), a new computer search method for technical data documents was developed. By their very nature, technical documents are partially encrypted because of the author's use of acronyms, abbreviations, and shortcut notations. This problem of computerized searching is compounded at KSC by the volume of documentation that is produced during normal Space Shuttle operations. The Centralized Document Database (CDD) is designed to solve this problem. It provides a common interface to an unlimited number of files of various sizes, with the capability to perform diverse types and levels of data searches. The heart of the CDD is the nature and capability of its search algorithms. The most complex form of search that the program uses draws on a domain-specific database of acronyms, abbreviations, synonyms, and word frequency tables. This database, along with basic sentence parsing, is used to convert a request for information into a relational network. This network is used as a filter on the original document file to determine the most likely locations for the data requested. This type of search will locate information that traditional techniques (i.e., Boolean structured key-word searching) would not find.
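The general idea of searching through a table of acronyms and synonyms can be sketched as follows. This is a deliberately simplified illustration and not the CDD's algorithm: the `EQUIVALENTS` table, its entries, and both function names are hypothetical, and the sentence parsing, word frequency tables, and relational network described in the abstract are not modelled.

```python
import re

# Hypothetical domain table: each term maps to the set of forms treated as
# equivalent (acronym and spelled-out phrase). Entries are illustrative only.
EQUIVALENTS = {
    "lox": {"lox", "liquid oxygen"},
    "liquid oxygen": {"lox", "liquid oxygen"},
    "srb": {"srb", "solid rocket booster"},
    "solid rocket booster": {"srb", "solid rocket booster"},
}


def expand(query_terms):
    """Expand each query term to all of its known equivalent forms."""
    expanded = set()
    for term in query_terms:
        key = term.lower()
        expanded |= EQUIVALENTS.get(key, {key})
    return expanded


def rank_passages(passages, query_terms):
    """Return indices of passages that match, most matching forms first."""
    terms = expand(query_terms)
    scored = []
    for idx, text in enumerate(passages):
        low = text.lower()
        hits = sum(
            1 for t in terms
            if re.search(r"\b" + re.escape(t) + r"\b", low)
        )
        if hits:
            scored.append((hits, idx))
    # Sort by descending hit count, then by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]
```

With this table, a query for "liquid oxygen" also retrieves a passage that only says "LOX", which a plain keyword match on the query string would miss.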
Is the masked priming same-different task a pure measure of prelexical processing?
Kelly, Andrew N; van Heuven, Walter J B; Pitchford, Nicola J; Ledgeway, Timothy
2013-01-01
To study prelexical processes involved in visual word recognition a task is needed that only operates at the level of abstract letter identities. The masked priming same-different task has been purported to do this, as the same pattern of priming is shown for words and nonwords. However, studies using this task have consistently found a processing advantage for words over nonwords, indicating a lexicality effect. We investigated the locus of this word advantage. Experiment 1 used conventional visually-presented reference stimuli to test previous accounts of the lexicality effect. Results rule out the use of different strategies, or strength of representations, for words and nonwords. No interaction was shown between prime type and word type, but a consistent word advantage was found. Experiment 2 used novel auditorally-presented reference stimuli to restrict nonword matching to the sublexical level. This abolished scrambled priming for nonwords, but not words. Overall this suggests the processing advantage for words over nonwords results from activation of whole-word, lexical representations. Furthermore, the number of shared open-bigrams between primes and targets could account for scrambled priming effects. These results have important implications for models of orthographic processing and studies that have used this task to investigate prelexical processes.
Contemporary issues in HIM. The application layer--III.
Wear, L L; Pinkert, J R
1993-07-01
We have seen document preparation systems evolve from basic line editors through powerful, sophisticated desktop publishing programs. This component of the application layer is probably one of the most used, and most readily identifiable. Ask grade school children nowadays, and many will tell you that they have written a paper on a computer. Next month will be a "fun" tour through a number of other application programs we find useful. They will range from a simple notebook reminder to a sophisticated photograph processor. Application layer: Software targeted for the end user, focusing on a specific application area, and typically residing in the computer system as distinct components on top of the OS. Desktop publishing: A document preparation program that begins with the text features of a word processor, then adds the ability for a user to incorporate outputs from a variety of graphic programs, spreadsheets, and other applications. Line editor: A document preparation program that manipulates text in a file on the basis of numbered lines. Word processor: A document preparation program that can, among other things, reformat sections of documents, move and replace blocks of text, use multiple character fonts, automatically create a table of contents and index, create complex tables, and combine text and graphics.
ERP Indicators of L2 Proficiency in Word-to-text Integration Processes.
Yang, Chin Lung; Perfetti, Charles A; Tan, Li-Hai; Jiang, Ying
2018-06-04
Studies of bilingual proficiency have largely focused on word and sentence processing, whereas the text level has received relatively little attention. We examined on-line second language (L2) text comprehension in relation to L2 proficiency with ERPs recorded on critical words separated across a sentence boundary from their co-referential antecedents. The integration processes on the critical words were designed to reflect different levels of text representation: word-form, word-meaning, and situational levels (Kintsch, 1998). Across proficiency level, bilinguals showed biphasic N400/late positive component (LPC) effects related to word meaning integration (N400) and mental model updating (LPC) processes. More proficient bilinguals, compared with less proficient bilinguals, showed reduced amplitudes in both N400 and LPC when the integration depended on semantic and conceptual meanings. When the integration was based on word repetitions and inferences, both groups showed reduced N400 negativity while elevated LPC positivity. These effects reflect how memory mechanisms (processes and resources) support the tight coupling among word meaning, readers' memory of the text meaning and the referentially-specified meaning of the text. They further demonstrate the importance of L2 semantic and conceptual processing in modulating the L2 proficiency effect on L2 text integration processes. These results align with the assumption that word meaning processes are causal components in variations of comprehension ability for both monolinguals and bilinguals. Copyright © 2018. Published by Elsevier Ltd.
The influence of autonomic arousal and semantic relatedness on memory for emotional words.
Buchanan, Tony W; Etzel, Joset A; Adolphs, Ralph; Tranel, Daniel
2006-07-01
Increased memory for emotional stimuli is a well-documented phenomenon. Emotional arousal during the encoding of a stimulus is one mediator of this memory enhancement. Other variables such as semantic relatedness also play a role in the enhanced memory for emotional stimuli, especially for verbal stimuli. Research has not addressed the contributions of emotional arousal, indexed by self-report and autonomic measures, and semantic relatedness on memory performance. Twenty young adults (10 women) were presented neutral-unrelated words, school-related words, moderately arousing emotional words, and highly arousing taboo words while heart rate and skin conductance were measured. Memory was tested with free recall and recognition tests. Results showed that taboo words, which were both semantically related and high arousal were remembered best. School-related words, which were high on semantic relatedness but low on arousal, were remembered better than the moderately arousing emotional words and semantically unrelated neutral words. Psychophysiological responses showed that within the moderately arousing emotional and neutral word groups, those words eliciting greater autonomic activity were better remembered than words that did not elicit such activity. These results demonstrate additive effects of semantic relatedness and emotional arousal on memory. Relatedness confers an advantage to memory (as in the school-words), but the combination of relatedness and arousal (as in the taboo words) results in the best memory performance.
The Time Course of Incremental Word Processing during Chinese Reading
ERIC Educational Resources Information Center
Zhou, Junyi; Ma, Guojie; Li, Xingshan; Taft, Marcus
2018-01-01
In the current study, we report two eye movement experiments investigating how Chinese readers process incremental words during reading. These are words where some of the component characters constitute another word (an embedded word). In two experiments, eye movements were monitored while the participants read sentences with incremental words…
Processing of Color Words Activates Color Representations
ERIC Educational Resources Information Center
Richter, Tobias; Zwaan, Rolf A.
2009-01-01
Two experiments were conducted to investigate whether color representations are routinely activated when color words are processed. Congruency effects of colors and color words were observed in both directions. Lexical decisions on color words were faster when preceding colors matched the color named by the word. Color-discrimination responses…
Sambai, Ami; Coltheart, Max; Uno, Akira
2018-04-01
In English, the size of the regularity effect on word reading-aloud latency decreases across position of irregularity. This has been explained by a sublexical serially operating reading mechanism. It is unclear whether sublexical serial processing occurs in reading two-character kanji words aloud. To investigate this issue, we studied how the position of atypical character-to-sound correspondences influenced reading performance. When participants read inconsistent-atypical words aloud mixed randomly with nonwords, reading latencies of words with an inconsistent-atypical correspondence in the initial position were significantly longer than words with an inconsistent-atypical correspondence in the second position. The significant difference of reading latencies for inconsistent-atypical words disappeared when inconsistent-atypical words were presented without nonwords. Moreover, reading latencies for words with an inconsistent-atypical correspondence in the first position were shorter than for words with a typical correspondence in the first position. This typicality effect was absent when the atypicality was in the second position. These position-of-atypicality effects suggest that sublexical processing of kanji occurs serially and that the phonology of two-character kanji words is generated from both a lexical parallel process and a sublexical serial process.
Word Processors and Invention in Technical Writing.
ERIC Educational Resources Information Center
Barker, Thomas T.
1989-01-01
Explores how word processing affects thinking and writing. Examines two myths surrounding word processors and invention in technical writing. Describes how word processing can enhance invention through collaborative writing, templates, and on-screen outlining. (MM)
When and how do GPs record vital signs in children with acute infections? A cross-sectional study
Blacklock, Claire; Haj-Hassan, Tanya Ali; Thompson, Matthew J
2012-01-01
Background: NICE recommendations and evidence from ambulatory settings promote the use of vital signs in identifying serious infections in children. This appears to differ from usual clinical practice, where GPs report measuring vital signs infrequently. Aim: To identify the frequency of vital sign documentation by GPs in the assessment of children with acute infections in primary care. Design and setting: Observational study in 15 general practice surgeries in Oxfordshire and Somerset, UK. Method: A standardised proforma was used to extract consultation details, including documentation of numerical vital signs and words or phrases used by the GP in assessing vital signs, for 850 children aged 1 month to 16 years presenting with acute infection. Results: Of the children presenting with acute infections, 269 (31.6%) had one or more numerical vital signs recorded; however, the GP recording rate improved if free text proxies were also considered: at least one vital sign was then recorded in over half (54.1%) of children. In those with recorded numerical values for vital signs, the most frequent was temperature (210, 24.7%), followed by heart rate (62, 7.3%), respiratory rate (58, 6.8%), and capillary refill time (36, 4.2%). Words or phrases for vital signs were documented infrequently (temperature 17.6%, respiratory rate 14.6%, capillary refill time 12.5%, and heart rate 0.5%). Text relating to global assessment was documented in 313/850 (36.8%) of consultations. Conclusion: GPs record vital signs using words and phrases as well as numerical methods, although overall documentation of vital signs is infrequent in children presenting with acute infections. PMID:23265227
Neuromagnetic correlates of audiovisual word processing in the developing brain.
Dinga, Samantha; Wu, Di; Huang, Shuyang; Wu, Caiyun; Wang, Xiaoshan; Shi, Jingping; Hu, Yue; Liang, Chun; Zhang, Fawen; Lu, Meng; Leiken, Kimberly; Xiang, Jing
2018-06-01
The brain undergoes enormous changes during childhood. Little is known about how the brain develops to serve word processing. The objective of the present study was to investigate the maturational changes of word processing in children and adolescents using magnetoencephalography (MEG). Responses to a word processing task were investigated in sixty healthy participants. Each participant was presented with simultaneous visual and auditory word pairs in "match" and "mismatch" conditions. The patterns of neuromagnetic activation from MEG recordings were analyzed at both sensor and source levels. Topography and source imaging revealed that word processing transitioned from bilateral connections to unilateral connections as age increased from 6 to 17 years old. Correlation analyses of language networks revealed that the path length of word processing networks negatively correlated with age (r = -0.833, p < 0.0001), while the connection strength (r = 0.541, p < 0.01) and the clustering coefficient (r = 0.705, p < 0.001) of word processing networks were positively correlated with age. In addition, males had more visual connections, whereas females had more auditory connections. The correlations between gender and path length, gender and connection strength, and gender and clustering coefficient demonstrated a developmental trend without reaching statistical significance. The results indicate that the developmental trajectory of word processing is gender specific. Since the neuromagnetic signatures of these gender-specific paths to adult word processing were determined using non-invasive, objective, and quantitative methods, the results may play a key role in understanding language impairments in pediatric patients in the future. Copyright © 2018 Elsevier B.V. All rights reserved.
The present status and problems in document retrieval system : document input type retrieval system
NASA Astrophysics Data System (ADS)
Inagaki, Hirohito
Office automation (OA) has brought many changes, and many documents are now maintained in electronic filing systems. An efficient document retrieval system is therefore needed to extract useful information. Current document retrieval systems use simple word matching, syntactic matching, and semantic matching to obtain high retrieval efficiency. In addition, document retrieval systems using special hardware devices, such as ISSP, have been developed for high-speed retrieval. Because these systems accept only a single sentence or keywords as input, it is difficult to express the searcher's request. We demonstrate a document-input-type retrieval system, which directly accepts a document as input and searches a document database for similar documents.
Cognate and Word Class Ambiguity Effects in Noun and Verb Processing
ERIC Educational Resources Information Center
Bultena, Sybrine; Dijkstra, Ton; van Hell, Janet G.
2013-01-01
This study examined how noun and verb processing in bilingual visual word recognition are affected by within and between-language overlap. We investigated how word class ambiguous noun and verb cognates are processed by bilinguals, to see if co-activation of overlapping word forms between languages benefits from additional overlap within a…
Midbrain-Driven Emotion and Reward Processing in Alcoholism
Müller-Oehring, E M; Jung, Y-C; Sullivan, E V; Hawkes, W C; Pfefferbaum, A; Schulte, T
2013-01-01
Alcohol dependence is associated with impaired control over emotionally motivated actions, possibly associated with abnormalities in the frontoparietal executive control network and midbrain nodes of the reward network associated with automatic attention. To identify differences in the neural response to alcohol-related word stimuli, 26 chronic alcoholics (ALC) and 26 healthy controls (CTL) performed an alcohol-emotion Stroop Match-to-Sample task during functional MR imaging. Stroop contrasts were modeled for color-word incongruency (eg, word RED printed in green) and for alcohol (eg, BEER), positive (eg, HAPPY) and negative (eg, MAD) emotional word content relative to congruent word conditions (eg, word RED printed in red). During color-Stroop processing, ALC and CTL showed similar left dorsolateral prefrontal activation, and CTL, but not ALC, deactivated posterior cingulate cortex/cuneus. An interaction revealed a dissociation between alcohol-word and color-word Stroop processing: ALC activated midbrain and parahippocampal regions more than CTL when processing alcohol-word relative to color-word conditions. In ALC, the midbrain region was also invoked by negative emotional Stroop words thereby showing significant overlap of this midbrain activation for alcohol-related and negative emotional processing. Enhanced midbrain activation to alcohol-related words suggests neuroadaptation of dopaminergic midbrain systems. We speculate that such tuning is normally associated with behavioral conditioning to optimize responses but here contributed to automatic bias to alcohol-related stimuli. PMID:23615665
ERIC Educational Resources Information Center
Mogey, Nora; Hartley, James
2013-01-01
There is much debate about whether or not these days students should be able to word-process essay-type examinations as opposed to handwriting them, particularly when they are asked to word-process everything else. This study used word-processing software to examine the stylistic features of 13 examination essays written by hand and 24 by…
ERIC Educational Resources Information Center
Scriven, Jolene D.; And Others
A study sought to determine current practices in word processing installations located in selected organizations throughout the United States. A related problem was to ascertain anticipated future developments in word processing to provide information for educational institutions preparing workers for the business office. Six interview instruments…
Nakagawa, A; Sukigara, M
2000-09-01
The purpose of this study was to examine the relationship between familiarity and laterality in reading Japanese Kana words. In two divided-visual-field experiments, three- or four-character Hiragana or Katakana words were presented in both familiar and unfamiliar scripts, to which subjects performed lexical decisions. Experiment 1, using three stimulus durations (40, 100, 160 ms), suggested that only in the unfamiliar script condition was increased stimulus presentation time differently affected in each visual field. To examine this lateral difference during the processing of unfamiliar scripts as related to attentional laterality, a concurrent auditory shadowing task was added in Experiment 2. The results suggested that processing words in an unfamiliar script requires attention, which could be left-hemisphere lateralized, while orthographically familiar kana words can be processed automatically on the basis of their word-level orthographic representations or visual word form. Copyright 2000 Academic Press.
Wegrzyn, Martin; Herbert, Cornelia; Ethofer, Thomas; Flaisch, Tobias; Kissler, Johanna
2017-11-01
Visually presented emotional words are processed preferentially and effects of emotional content are similar to those of explicit attention deployment in that both amplify visual processing. However, auditory processing of emotional words is less well characterized and interactions between emotional content and task-induced attention have not been fully understood. Here, we investigate auditory processing of emotional words, focussing on how auditory attention to positive and negative words impacts their cerebral processing. A functional magnetic resonance imaging (fMRI) study manipulating word valence and attention allocation was performed. Participants heard negative, positive and neutral words to which they either listened passively or attended by counting negative or positive words, respectively. Regardless of valence, active processing compared to passive listening increased activity in primary auditory cortex, left intraparietal sulcus, and right superior frontal gyrus (SFG). The attended valence elicited stronger activity in left inferior frontal gyrus (IFG) and left SFG, in line with these regions' role in semantic retrieval and evaluative processing. No evidence for valence-specific attentional modulation in auditory regions or distinct valence-specific regional activations (i.e., negative > positive or positive > negative) was obtained. Thus, allocation of auditory attention to positive and negative words can substantially increase their processing in higher-order language and evaluative brain areas without modulating early stages of auditory processing. Inferior and superior frontal brain structures mediate interactions between emotional content, attention, and working memory when prosodically neutral speech is processed. Copyright © 2017 Elsevier Ltd. All rights reserved.
Cooperative Educational Abstracting Service (CEAS). (Abstract Series No. 103-122, March 1972).
ERIC Educational Resources Information Center
International Bureau of Education, Geneva (Switzerland).
This document is a compilation of 20 English-language abstracts concerning various aspects of education in Switzerland, New Zealand, Chile, Poland, Argentina, Pakistan, Malaysia, Thailand, and France. The abstracts are informative in nature, each being approximately 1,500 words in length. They are based on documents submitted by each of the…
ERIC Educational Resources Information Center
Consejo Nacional Tecnico de la Educacion (Mexico).
This document is an English-language abstract (approximately 1,500 words) of two booklets on Mexican educational reform. The first booklet cites the parts of the Mexican Constitution dealing with education, the legal foundation of Mexican education, stipulating that it shall be universal, democratic, national, compulsory, free and immune from…
Action Learning. Symposium 21. [Concurrent Symposium Session at AHRD Annual Conference, 2000.]
ERIC Educational Resources Information Center
2000
This document contains three papers from a symposium on action learning that was conducted as part of a conference on human resource development (HRD). "Searching for Meaning in Complex Action Learning Data: What Environments, Acts, and Words Reveal" (Verna J. Willis) analyzes complex action learning documents produced as course…
Processing concrete words: fMRI evidence against a specific right-hemisphere involvement.
Fiebach, Christian J; Friederici, Angela D
2004-01-01
Behavioral, patient, and electrophysiological studies have been taken as support for the assumption that processing of abstract words is confined to the left hemisphere, whereas concrete words are processed also by right-hemispheric brain areas. These are thought to provide additional information from an imaginal representational system, as postulated in the dual-coding theory of memory and cognition. Here we report new event-related fMRI data on the processing of concrete and abstract words in a lexical decision task. While abstract words activated a subregion of the left inferior frontal gyrus (BA 45) more strongly than concrete words, specific activity for concrete words was observed in the left basal temporal cortex. These data as well as data from other neuroimaging studies reviewed here are not compatible with the assumption of a specific right-hemispheric involvement for concrete words. The combined findings rather suggest a revised view of the neuroanatomical bases of the imaginal representational system assumed in the dual-coding theory, at least with respect to word recognition.
Oscillatory brain dynamics associated with the automatic processing of emotion in words.
Wang, Lin; Bastiaansen, Marcel
2014-10-01
This study examines the automaticity of processing the emotional aspects of words, and characterizes the oscillatory brain dynamics that accompany this automatic processing. Participants read emotionally negative, neutral and positive nouns while performing a color detection task in which only perceptual-level analysis was required. Event-related potentials and time frequency representations were computed from the concurrently measured EEG. Negative words elicited a larger P2 and a larger late positivity than positive and neutral words, indicating deeper semantic/evaluative processing of negative words. In addition, sustained alpha power suppressions were found for the emotional compared to neutral words, in the time range from 500 to 1000ms post-stimulus. These results suggest that sustained attention was allocated to the emotional words, whereas the attention allocated to the neutral words was released after an initial analysis. This seems to hold even when the emotional content of the words is task-irrelevant. Copyright © 2014 Elsevier Inc. All rights reserved.
A Method for Search Engine Selection using Thesaurus for Selective Meta-Search Engine
NASA Astrophysics Data System (ADS)
Goto, Shoji; Ozono, Tadachika; Shintani, Toramatsu
In this paper, we propose a new method for selecting search engines on the WWW for a selective meta-search engine. A selective meta-search engine needs a method for selecting appropriate search engines for users' queries. Most existing methods use statistical data such as document frequency. These methods may select inappropriate search engines if a query contains polysemous words. We describe a search engine selection method based on a thesaurus. In our method, a thesaurus is constructed from the documents in a search engine and is used as a source description of that search engine. The form of a particular thesaurus depends on the documents used for its construction. Our method selects search engines by considering relationships between terms and overcomes the problems caused by polysemous words. Further, our method does not need a centralized broker maintaining data, such as document frequency, for all search engines. As a result, it is easy to add a new search engine, and meta-search engines become more scalable with our method than with other existing methods.
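A rough sketch of how a per-engine thesaurus could steer selection for a polysemous query is given below. The abstract does not specify how the thesaurus is built or scored, so the co-occurrence construction, the scoring weights, and all names here are assumptions for illustration only, not the authors' method.

```python
from collections import defaultdict
from itertools import combinations

def build_thesaurus(documents):
    """Assumed construction: two terms are related if they share a document."""
    related = defaultdict(set)
    for doc in documents:
        words = set(doc.lower().split())
        for a, b in combinations(sorted(words), 2):
            related[a].add(b)
            related[b].add(a)
    return related

def engine_score(thesaurus, query):
    """Score an engine by known query terms plus related query-term pairs."""
    terms = query.lower().split()
    known = sum(1 for t in terms if t in thesaurus)
    linked = sum(
        1 for a, b in combinations(terms, 2) if b in thesaurus.get(a, ())
    )
    # The weight of 2 for related pairs is an arbitrary illustrative choice.
    return known + 2 * linked

def select_engine(thesauri, query):
    """Pick the engine whose thesaurus best matches the query."""
    return max(thesauri, key=lambda name: engine_score(thesauri[name], query))

# Two toy engines that both index documents containing the word "java".
thesauri = {
    "programming": build_thesaurus(["java compiler bytecode", "java class library"]),
    "beverages": build_thesaurus(["java coffee roast", "espresso coffee beans"]),
}
choice_coffee = select_engine(thesauri, "java coffee")
choice_code = select_engine(thesauri, "java compiler")
```

Document frequency for "java" alone cannot separate the two toy engines, since each has two documents containing it; the relationship between the query terms is what decides the choice. Each thesaurus is built only from its own engine's documents, so adding an engine does not require updating shared statistics.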
Planning and production of grammatical and lexical verbs in multi-word messages.
Michel Lange, Violaine; Messerschmidt, Maria; Harder, Peter; Siebner, Hartwig Roman; Boye, Kasper
2017-01-01
Grammatical words represent the part of grammar that can be most directly contrasted with the lexicon. Aphasiological studies, linguistic theories and psycholinguistic studies suggest that their processing is operated at different stages in speech production. Models of sentence production propose that at the formulation stage, lexical words are processed at the functional level while grammatical words are processed at a later positional level. In this study we consider proposals made by linguistic theories and psycholinguistic models to derive two predictions for the processing of grammatical words compared to lexical words. First, based on the assumption that grammatical words are less crucial for communication and therefore paid less attention to, it is predicted that they show shorter articulation times and/or higher error rates than lexical words. Second, based on the assumption that grammatical words differ from lexical words in being dependent on a lexical host, it is hypothesized that the retrieval of a grammatical word has to be put on hold until its lexical host is available, and it is predicted that this is reflected in longer reaction times (RTs) for grammatical compared to lexical words. We investigated these predictions by comparing fully homonymous sentences with only a difference in verb status (grammatical vs. lexical) elicited by a specific context. We measured RTs, duration and accuracy rate. No difference in duration was observed. Longer RTs and a lower accuracy rate for grammatical words were reported, successfully reflecting grammatical word properties as defined by linguistic theories and psycholinguistic models. Importantly, this study provides insight into the span of encoding and grammatical encoding processes in speech production. PMID:29091940
Li, Su; Lee, Kang; Zhao, Jing; Yang, Zhi; He, Sheng; Weng, Xuchu
2013-01-01
Little is known about the impact of learning to read on early neural development for word processing and its collateral effects on neural development in non-word domains. Here, we examined the effect of early exposure to reading on neural responses to both word and face processing in preschool children with the use of the Event Related Potential (ERP) methodology. We specifically linked children’s reading experience (indexed by their sight vocabulary) to two major neural markers: the amplitude differences between the left and right N170 on the bilateral posterior scalp sites and the hemispheric spectrum power differences in the γ band on the same scalp sites. The results showed that the left-lateralization of both the word N170 and the spectrum power in the γ band were significantly positively related to vocabulary. In contrast, vocabulary and the word left-lateralization both had a strong negative direct effect on the face right-lateralization. Also, vocabulary negatively correlated with the right-lateralized face spectrum power in the γ band even after the effects of age and the word spectrum power were partialled out. The present study provides direct evidence regarding the role of reading experience in the neural specialization of word and face processing above and beyond the effect of maturation. The present findings taken together suggest that the neural development of visual word processing competes with that of face processing before the process of neural specialization has been consolidated. PMID:23462239
DOE Office of Scientific and Technical Information (OSTI.GOV)
McDaniel, Hunter; Beard, Matthew C; Wheeler, Lance M
Representing the Center for Advanced Solar Photophysics (CASP), this document is one of the entries in the Ten Hundred and One Word Challenge and was awarded “Overall Winner Runner-up and People’s Choice Winner.” As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of CASP is to explore and exploit the unique physics of nanostructured materials to boost the efficiency of solar energy conversion through novel light-matter interactions, controlled excited-state dynamics, and engineered carrier-carrier coupling.
How are the energy waves blocked on the way from hot to cold?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bai, Xianming; He, Lingfeng; Khafizov, Marat
Representing the Center for Materials Science of Nuclear Fuel (CMSNF), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of CMSNF is to develop an experimentally validated multi-scale computational capability for the predictive understanding of the impact of microstructure on thermal transport in nuclear fuel under irradiation, with ultimate application to UO2 as a model system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cropley, Cecelia
Representing the Center for Catalytic Hydrocarbon Functionalization (CCHF), this document is one of the entries in the Ten Hundred and One Word Challenge. As part of the challenge, the 46 Energy Frontier Research Centers were invited to represent their science in images, cartoons, photos, words and original paintings, but any descriptions or words could only use the 1000 most commonly used words in the English language, with the addition of one word important to each of the EFRCs and the mission of DOE: energy. The mission of CCHF is to develop, validate, and optimize new methods to rearrange the bonds of hydrocarbons, implement enzymatic strategies into synthetic systems, and design optimal environments for catalysts that can be used to reversibly functionalize hydrocarbons, especially for more efficient use of natural gas including low temperature conversion to liquid fuels.
Taken out of Context: Differential Processing in Contextual and Isolated Word Reading
ERIC Educational Resources Information Center
Martin-Chang, Sandra; Levesque, Kyle
2013-01-01
Three experiments are reported that investigate the cognitive processes underlying contextual and isolated word reading. In Phase 1, undergraduate participants were exposed to 75 target words under three conditions. The participants generated 25 words from definitions, read 25 words in context and read 25 in isolation. In Phase 2, volunteers…
SPAR data set contents. [finite element structural analysis system
NASA Technical Reports Server (NTRS)
Cunningham, S. W.
1981-01-01
The contents of the stored data sets of the SPAR (space processing applications rocket) finite element structural analysis system are documented. The data generated by each of the system's processors are stored in a data file organized as a library. Each data set, containing a two-dimensional table or matrix, is identified by a four-word name listed in a table of contents. The creating SPAR processor, number of rows and columns, and definitions of each of the data items are listed for each data set. An example SPAR problem using these data sets is also presented.
HIPAA's effects on US healthcare.
Kumar, Sameer; Henseler, Anne; Haukaas, David
2009-01-01
Health Insurance Portability and Accountability Act implementation in the USA caused waves in the medical world about documentation storage, flow and access. Protecting patients from information falling into the wrong hands is admirable, but the Act has influenced more than just documentation; it has slowed the research process and complicated basic US medical care. This article aims to discuss Health Insurance Portability and Accountability Act's effects on documentation and patient care and future US healthcare options. A chronological approach is used to lay out the Act's effects. Using process flow maps, the pre- and post-Act environment is analyzed to discover differences in the two processes. Then a critique of the new environment leads to future movement recommendations by the US government and the healthcare industry. True to the US government's track record, by the time the Act was passed, it was already outdated in terms of IT management capabilities. In addition to trying to comply with these outdated practices, the Act's wording is so vague that hospital staff are not sure with what they are even complying. The Act could be improved with some simple changes to wording and updating. This article attempts to take a massive problem with far reaching implications, drill down to the key issues and make managerial recommendations based on findings. This provides a more detailed problem view that can only be understood at a high level owing to its complexity. Importantly, the key issues developed in the article support US government reform for legislation, which is not an easy task. There were studies available on the Act's cost to patients, hospitals, clinics and general costs in the USA. However, all the research was site specific and easily contradicted by other sources. Additionally, source reliability was questionable at best, as publications came from specific hospitals and clinics. 
Throughout the study two themes were clear--the Act's outdated nature and vague wording. The more research that was done, the more confusing the information became; it seems even experts have a hard time understanding and complying with the Act. One thing is clear. The Act is confusing and outdated. Because the problem is so large and fragmented, people are not sure where to start fixing the predicament. Arming US hospitals, clinics and doctors with basic knowledge can give them a common springboard to start changing the current environment. It is clear that the problem is large and confusing. Consolidating research results seems a valuable tool to help understand what is wrong with US healthcare. This article makes a case that updating and improving the directive's ambiguous nature helps create a less frustrating US healthcare system.
The influence of contextual diversity on eye movements in reading.
Plummer, Patrick; Perea, Manuel; Rayner, Keith
2014-01-01
Recent research has shown contextual diversity (i.e., the number of passages in which a given word appears) to be a reliable predictor of word processing difficulty. It has also been demonstrated that word-frequency has little or no effect on word recognition speed when accounting for contextual diversity in isolated word processing tasks. An eye-movement experiment was conducted wherein the effects of word-frequency and contextual diversity were directly contrasted in a normal sentence reading scenario. Subjects read sentences with embedded target words that varied in word-frequency and contextual diversity. All 1st-pass and later reading times were significantly longer for words with lower contextual diversity compared to words with higher contextual diversity when controlling for word-frequency and other important lexical properties. Furthermore, there was no difference in reading times for higher frequency and lower frequency words when controlling for contextual diversity. The results confirm prior findings regarding contextual diversity and word-frequency effects and demonstrate that contextual diversity is a more accurate predictor of word processing speed than word-frequency within a normal reading task. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
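The distinction between word-frequency and contextual diversity drawn in this abstract can be made concrete with a small sketch. It assumes whitespace tokenization and a toy list of passages; the function name `frequency_and_diversity` is hypothetical and is not taken from the study, which relied on published corpus norms.

```python
from collections import Counter

def frequency_and_diversity(passages):
    """Return (word_frequency, contextual_diversity) for a list of passages.

    word_frequency counts every occurrence of a word;
    contextual_diversity counts the passages in which the word appears.
    """
    frequency = Counter()
    diversity = Counter()
    for passage in passages:
        tokens = passage.lower().split()
        frequency.update(tokens)       # every occurrence counts
        diversity.update(set(tokens))  # each passage counts at most once
    return frequency, diversity

# Toy corpus (an assumption for illustration only).
passages = ["the cat saw the cat", "a dog", "the dog"]
frequency, diversity = frequency_and_diversity(passages)
# "cat" and "dog" have the same frequency (2) but differ in diversity (1 vs 2).
```

Two words matched on frequency can therefore differ in contextual diversity, which is the contrast the eye-movement experiment manipulates.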
Moffat, Michael; Siakaluk, Paul D; Sidhu, David M; Pexman, Penny M
2015-04-01
It has been proposed that much of conceptual knowledge is acquired through situated conceptualization, such that both external (e.g., agents, objects, events) and internal (e.g., emotions, introspections) environments are considered important (Barsalou, 2003). To evaluate this proposal, we characterized two dimensions by which situated conceptualization may be measured and which should have different relevance for abstract and concrete concepts; namely, emotional experience (i.e., the ease with which words evoke emotional experience; Newcombe, Campbell, Siakaluk, & Pexman, 2012) and context availability (i.e., the ease with which words evoke contexts in which their referents may appear; Schwanenflugel & Shoben, 1983). We examined the effects of these two dimensions on abstract and concrete word processing in verbal semantic categorization (VSCT) and naming tasks. In the VSCT, emotional experience facilitated processing of abstract words but inhibited processing of concrete words, whereas context availability facilitated processing of both types of words. In the naming task in which abstract words and concrete words were not blocked by emotional experience, context availability facilitated responding to only the abstract words. In the naming task in which abstract words and concrete words were blocked by emotional experience, emotional experience facilitated responding to only the abstract words, whereas context availability facilitated responding to only the concrete words. These results were observed even with several lexical (e.g., frequency, age of acquisition) and semantic (e.g., concreteness, arousal, valence) variables included in the analyses. As such, the present research suggests that emotional experience and context availability tap into different aspects of situated conceptualization and make unique contributions to the representation and processing of abstract and concrete concepts.
Flaisch, Tobias; Imhof, Martin; Schmälzle, Ralf; Wentz, Klaus-Ulrich; Ibach, Bernd; Schupp, Harald T.
2015-01-01
The present study utilized functional magnetic resonance imaging (fMRI) to examine the neural processing of concurrently presented emotional stimuli under varying explicit and implicit attention demands. Specifically, in separate trials, participants indicated the category of either pictures or words. The words were placed over the center of the pictures and the picture-word compound-stimuli were presented for 1500 ms in a rapid event-related design. The results reveal pronounced main effects of task and emotion: the picture categorization task prompted strong activations in visual, parietal, temporal, frontal, and subcortical regions; the word categorization task evoked increased activation only in left extrastriate cortex. Furthermore, beyond replicating key findings regarding emotional picture and word processing, the results point to a dissociation of semantic-affective and sensory-perceptual processes for words: while emotional words engaged semantic-affective networks of the left hemisphere regardless of task, the increased activity in left extrastriate cortex associated with explicitly attending to words was diminished when the word was overlaid over an erotic image. Finally, we observed a significant interaction between Picture Category and Task within dorsal visual-associative regions, inferior parietal, and dorsolateral, and medial prefrontal cortices: during the word categorization task, activation was increased in these regions when the words were overlaid over erotic as compared to romantic pictures. During the picture categorization task, activity in these areas was relatively decreased when categorizing erotic as compared to romantic pictures. Thus, the emotional intensity of the pictures strongly affected brain regions devoted to the control of task-related word or picture processing. These findings are discussed with respect to the interplay of obligatory stimulus processing with task-related attentional control mechanisms. PMID:26733895
Cross-language parafoveal semantic processing: Evidence from Korean-Chinese bilinguals.
Wang, Aiping; Yeon, Junmo; Zhou, Wei; Shu, Hua; Yan, Ming
2016-02-01
In the present study, we aimed at testing cross-language cognate and semantic preview effects. We tested how native Korean readers who learned Chinese as a second language make use of the parafoveal information during the reading of Chinese sentences. There were 3 types of Korean preview words: cognate translations of the Chinese target words, semantically related noncognate words, and unrelated words. Together with a highly significant cognate preview effect, more critically, we also observed reliable facilitation in processing of the target word from the semantically related previews in all fixation measures. Results from the present study provide first evidence for semantic processing from parafoveally presented Korean words and for cross-language parafoveal semantic processing.
ERIC Educational Resources Information Center
Aboud, Katherine S.; Bailey, Stephen K.; Petrill, Stephen A.; Cutting, Laurie E.
2016-01-01
Skilled reading depends on recognizing words efficiently in isolation ("word-level processing"; "WL") and extracting meaning from text ("discourse-level processing"; "DL"); deficiencies in either result in poor reading. FMRI has revealed consistent overlapping networks in word and passage reading, as well as…
Morphological Processing during Visual Word Recognition in Hebrew as a First and a Second Language
ERIC Educational Resources Information Center
Norman, Tal; Degani, Tamar; Peleg, Orna
2017-01-01
The present study examined whether sublexical morphological processing takes place during visual word-recognition in Hebrew, and whether morphological decomposition of written words depends on lexical activation of the complete word. Furthermore, it examined whether morphological processing is similar when reading Hebrew as a first language (L1)…
Information content versus word length in random typing
NASA Astrophysics Data System (ADS)
Ferrer-i-Cancho, Ramon; Moscoso del Prado Martín, Fermín
2011-12-01
Recently, it has been claimed that a linear relationship between a measure of information content and word length is expected from word length optimization and it has been shown that this linearity is supported by a strong correlation between information content and word length in many languages (Piantadosi et al 2011 Proc. Nat. Acad. Sci. 108 3825). Here, we study in detail some connections between this measure and standard information theory. The relationship between the measure and word length is studied for the popular random typing process where a text is constructed by pressing keys at random from a keyboard containing letters and a space behaving as a word delimiter. Although this random process does not optimize word lengths according to information content, it exhibits a linear relationship between information content and word length. The exact slope and intercept are presented for three major variants of the random typing process. A strong correlation between information content and word length can simply arise from the units making a word (e.g., letters) and not necessarily from the interplay between a word and its context as proposed by Piantadosi and co-workers. In itself, the linear relation does not entail the results of any optimization process.
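The linear relation described in this abstract can be illustrated for the simplest case. The sketch below assumes equiprobable letters, a fixed probability of pressing the space key, exclusion of empty words, and plain surprisal (negative log probability of a word) as the information measure; the parameter values and the function name `word_surprisal` are illustrative assumptions, and the slopes and intercepts derived in the paper for its three variants are not reproduced here.

```python
import math

def word_surprisal(length, n_letters=26, p_space=0.2):
    """Surprisal (bits) of one specific word of `length` letters under
    random typing with equiprobable letters, empty words excluded."""
    if length < 1:
        raise ValueError("length must be at least 1")
    p_letter = (1.0 - p_space) / n_letters
    # `length` letters followed by a space, renormalized over non-empty words.
    p_word = (p_letter ** length) * p_space / (1.0 - p_space)
    return -math.log2(p_word)

surprisals = [word_surprisal(n) for n in range(1, 6)]
# Each extra letter adds the same number of bits, so the relation is linear.
increments = [b - a for a, b in zip(surprisals, surprisals[1:])]
```

Under these assumptions every additional letter adds log2(n_letters / (1 - p_space)) bits, so linearity arises without any optimization of word lengths.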
Velan, Hadas; Frost, Ram
2010-01-01
Recent studies suggest that basic effects which are markers of visual word recognition in Indo-European languages cannot be obtained in Hebrew or in Arabic. Although Hebrew has an alphabetic writing system, just like English, French, or Spanish, a series of studies consistently suggested that simple form-orthographic priming, or letter-transposition priming are not found in Hebrew. In four experiments, we tested the hypothesis that this is due to the fact that Semitic words have an underlying structure that constrains the possible alignment of phonemes and their respective letters. The experiments contrasted typical Semitic words which are root-derived, with Hebrew words of non-Semitic origin, which are morphologically simple and resemble base words in European languages. Using RSVP, TL priming, and form-priming manipulations, we show that Hebrew readers process Hebrew words which are morphologically simple similar to the way they process English words. These words indeed reveal the typical form-priming and TL priming effects reported in European languages. In contrast, words with internal structure are processed differently, and require a different code for lexical access. We discuss the implications of these findings for current models of visual word recognition. PMID:21163472
Intrinsically organized network for word processing during the resting state.
Zhao, Jizheng; Liu, Jiangang; Li, Jun; Liang, Jimin; Feng, Lu; Ai, Lin; Lee, Kang; Tian, Jie
2011-01-03
Neural mechanisms underlying word processing have been extensively studied. It has been revealed that when individuals are engaged in active word processing, a complex network of cortical regions is activated. However, it is entirely unknown whether the word-processing regions are intrinsically organized without any explicit processing tasks during the resting state. The present study investigated the intrinsic functional connectivity between word-processing regions during the resting state with the use of fMRI methodology. The low-frequency fluctuations were observed between the left middle fusiform gyrus and a number of cortical regions. They included the left angular gyrus, left supramarginal gyrus, bilateral pars opercularis, and left pars triangularis of the inferior frontal gyrus, which have been implicated in phonological and semantic processing. Additionally, the activations were also observed in the bilateral superior parietal lobule and dorsal lateral prefrontal cortex, which have been suggested to provide top-down monitoring on the visual-spatial processing of words. The findings of our study indicate an intrinsically organized network during the resting state that likely prepares the visual system to anticipate the highly probable word input for ready and effective processing. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
"Language Is the Skin of My Thought": Integrating Wikipedia and AI to Support a Guillotine Player
NASA Astrophysics Data System (ADS)
Lops, Pasquale; Basile, Pierpaolo; de Gemmis, Marco; Semeraro, Giovanni
This paper describes OTTHO (On the Tip of my THOught), a system designed for solving a language game, called Guillotine, which demands knowledge covering a broad range of topics, such as movies, politics, literature, history, proverbs, and popular culture. The rule of the game is simple: the player observes five words, generally unrelated to each other, and in one minute she has to provide a sixth word, semantically connected to the others. The system exploits several knowledge sources, such as a dictionary, a set of proverbs, and Wikipedia to realize a knowledge infusion process. The paper describes the process of modeling these sources and the reasoning mechanism to find the solution of the game. The main motivation for designing an artificial player for Guillotine is the challenge of providing the machine with the cultural and linguistic background knowledge which makes it similar to a human being, with the ability of interpreting natural language documents and reasoning on their content. Experiments carried out showed promising results. Our feeling is that the presented approach has a great potential for other more practical applications besides solving a language game.
Co-occurrence graphs for word sense disambiguation in the biomedical domain.
Duque, Andres; Stevenson, Mark; Martinez-Romo, Juan; Araujo, Lourdes
2018-05-01
Word sense disambiguation is a key step for many natural language processing tasks (e.g. summarization, text classification, relation extraction) and presents a challenge to any system that aims to process documents from the biomedical domain. In this paper, we present a new graph-based unsupervised technique to address this problem. The knowledge base used in this work is a graph built with co-occurrence information from medical concepts found in scientific abstracts, and hence adapted to the specific domain. Unlike other unsupervised approaches based on static graphs such as UMLS, in this work the knowledge base takes the context of the ambiguous terms into account. Abstracts downloaded from PubMed are used for building the graph and disambiguation is performed using the personalized PageRank algorithm. Evaluation is carried out over two test datasets widely explored in the literature. Different parameters of the system are also evaluated to test robustness and scalability. Results show that the system is able to outperform state-of-the-art knowledge-based systems, obtaining more than 10% of accuracy improvement in some cases, while only requiring minimal external resources. Copyright © 2018 Elsevier B.V. All rights reserved.
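As a rough illustration of disambiguation with personalized PageRank, the sketch below runs a plain power iteration on a tiny hand-made undirected co-occurrence graph. The graph, the concept names, and the damping value are assumptions for illustration; the paper builds its graph from PubMed abstracts, and its construction details are not reproduced here.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=100):
    """Power iteration for PageRank with teleportation restricted to `seeds`.

    `graph` maps each node to a list of neighbours (every node has at least one).
    """
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in graph}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in graph}
        for node, neighbours in graph.items():
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                new_rank[neighbour] += share
        rank = new_rank
    return rank

# Hypothetical toy graph: two candidate senses of the ambiguous term "cold".
graph = {
    "common_cold": ["virus", "fever", "cough"],
    "cold_temperature": ["ice", "winter"],
    "virus": ["common_cold", "fever"],
    "fever": ["common_cold", "virus", "winter"],
    "cough": ["common_cold"],
    "ice": ["cold_temperature", "winter"],
    "winter": ["cold_temperature", "ice", "fever"],
}
context = {"virus", "cough"}  # concepts found around the ambiguous term
ranks = personalized_pagerank(graph, context)
best_sense = max(["common_cold", "cold_temperature"], key=ranks.get)
```

Restricting teleportation to the context concepts is what makes the ranking depend on the context of the ambiguous term, so the sense closest to that context in the graph scores highest.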
Engineering Documentation and Data Control
NASA Technical Reports Server (NTRS)
Matteson, Michael J.; Bramley, Craig; Ciaruffoli, Veronica
2001-01-01
Mississippi Space Services (MSS), the facility services contractor for NASA's John C. Stennis Space Center (SSC), is utilizing technology to improve engineering documentation and data control. Two identified improvement areas, labor intensive documentation research and outdated drafting standards, were targeted as top priority. MSS selected AutoManager(R) WorkFlow from Cyco software to manage engineering documentation. The software is currently installed on over 150 desktops. The outdated SSC drafting standard was written for pre-CADD drafting methods, in other words, board drafting. Implementation of COTS software solutions to manage engineering documentation and update the drafting standard resulted in significant increases in productivity by reducing the time spent searching for documents.
ERIC Educational Resources Information Center
Reinke, Karen; Fernandes, Myra; Schwindt, Graeme; O'Craven, Kathleen; Grady, Cheryl L.
2008-01-01
The functional specificity of the brain region known as the Visual Word Form Area (VWFA) was examined using fMRI. We explored whether this area serves a general role in processing symbolic stimuli, rather than being selective for the processing of words. Brain activity was measured during a visual 1-back task to English words, meaningful symbols…
Resting state neural networks for visual Chinese word processing in Chinese adults and children.
Li, Ling; Liu, Jiangang; Chen, Feiyan; Feng, Lu; Li, Hong; Tian, Jie; Lee, Kang
2013-07-01
This study examined the resting state neural networks for visual Chinese word processing in Chinese children and adults. Both the functional connectivity (FC) and amplitude of low frequency fluctuation (ALFF) approaches were used to analyze the fMRI data collected when Chinese participants were not engaged in any specific explicit tasks. We correlated time series extracted from the visual word form area (VWFA) with those in other regions in the brain. We also performed ALFF analysis in the resting state FC networks. The FC results revealed that, regarding the functionally connected brain regions, there exist similar intrinsically organized resting state networks for visual Chinese word processing in adults and children, suggesting that such networks may already be functional after 3-4 years of informal exposure to reading plus 3-4 years formal schooling. The ALFF results revealed that children appear to recruit more neural resources than adults in generally reading-irrelevant brain regions. Differences between child and adult ALFF results suggest that children's intrinsic word processing network during the resting state, though similar in functional connectivity, is still undergoing development. Further exposure to visual words and experience with reading are needed for children to develop a mature intrinsic network for word processing. The developmental course of the intrinsically organized word processing network may parallel that of the explicit word processing network. Copyright © 2013 Elsevier Ltd. All rights reserved.
Level statistics of words: Finding keywords in literary texts and symbolic sequences
NASA Astrophysics Data System (ADS)
Carpena, P.; Bernaola-Galván, P.; Hackenberg, M.; Coronado, A. V.; Oliver, J. L.
2009-03-01
Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to extract automatically keywords in literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered (i.e., they self-attract each other), while irrelevant words are distributed randomly in the text. Since a reference corpus is not needed, our approach is especially suitable for single documents for which no a priori information is available. In addition, we show that our method works also in generic symbolic sequences (continuous texts without spaces), thus suggesting its general applicability.
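The clustering idea in this abstract can be sketched with a simplified statistic: the standard deviation of the gaps between successive occurrences of a word, divided by their mean. This is an assumption-laden simplification; the paper's normalization and significance measure are not reproduced, and the function names below are hypothetical.

```python
from statistics import mean, pstdev

def spacing_sigma(positions):
    """Normalized spread of gaps between successive occurrences of a word."""
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if len(gaps) < 2:
        return 0.0
    return pstdev(gaps) / mean(gaps)

def clustering_scores(tokens, min_count=3):
    """Map each sufficiently frequent token to its spacing statistic."""
    positions = {}
    for index, token in enumerate(tokens):
        positions.setdefault(token, []).append(index)
    return {word: spacing_sigma(pos)
            for word, pos in positions.items() if len(pos) >= min_count}

# Toy sequence: "x" is evenly spread, "k" appears in two tight bursts.
tokens = ["f"] * 100
for i in range(0, 100, 10):
    tokens[i] = "x"
for i in (3, 4, 5, 83, 84, 85):
    tokens[i] = "k"
scores = clustering_scores(tokens)
```

In this toy sequence the bursty token gets a score well above 1 while the evenly spaced one gets 0; no reference corpus is involved, only positions within the single sequence.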
Chen, Changyou; Buntine, Wray; Ding, Nan; Xie, Lexing; Du, Lan
2015-02-01
In applications we may want to compare different document collections: they could have shared content but also different and unique aspects in particular collections. This task has been called comparative text mining or cross-collection modeling. We present a differential topic model for this application that models both topic differences and similarities. For this we use hierarchical Bayesian nonparametric models. Moreover, we found it was important to properly model power-law phenomena in topic-word distributions and thus we used the full Pitman-Yor process rather than just a Dirichlet process. Furthermore, we propose the transformed Pitman-Yor process (TPYP) to incorporate prior knowledge such as vocabulary variations in different collections into the model. To deal with the non-conjugate issue between model prior and likelihood in the TPYP, we thus propose an efficient sampling algorithm using a data augmentation technique based on the multinomial theorem. Experimental results show the model discovers interesting aspects of different collections. We also show the proposed MCMC based algorithm achieves a dramatically reduced test perplexity compared to some existing topic models. Finally, we show our model outperforms the state-of-the-art for document classification/ideology prediction on a number of text collections.
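To show why a Pitman-Yor process suits power-law behaviour, the sketch below simulates the standard Chinese-restaurant construction of a single Pitman-Yor process. It is not the paper's transformed Pitman-Yor process (TPYP) nor its sampling algorithm; the parameter names `discount` and `concentration`, their default values, and the function name are assumptions for illustration.

```python
import random

def pitman_yor_crp(n_customers, discount=0.5, concentration=1.0, seed=0):
    """Simulate table sizes under the Pitman-Yor Chinese restaurant process.

    Requires 0 <= discount < 1 and concentration > -discount.
    With discount = 0 this reduces to the Dirichlet-process case.
    """
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers at table k
    for _ in range(n_customers):
        # Existing table k is chosen with weight (size_k - discount);
        # a new table is opened with weight (concentration + discount * K).
        weights = [size - discount for size in tables]
        weights.append(concentration + discount * len(tables))
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(tables):
            tables.append(1)
        else:
            tables[choice] += 1
    return tables

table_sizes = pitman_yor_crp(1000)
```

A positive discount keeps opening new tables as the count grows, which yields the heavy-tailed size distributions that a Dirichlet process alone tends to underestimate.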
Processing Academic Language through Four Corners Vocabulary Chart Applications
ERIC Educational Resources Information Center
Smith, Sarah; Sanchez, Claudia; Betty, Sharon; Davis, Shiloh
2016-01-01
4 Corners Vocabulary Charts (FCVCs) are explored as a multipurpose vehicle for processing academic language in a 5th-grade classroom. FCVCs typically display a vocabulary word, an illustration of the word, synonyms associated with the word, a sentence using a given vocabulary word, and a definition of the term in students' words. The use of…
Using the Word Processor in Writing Groups.
ERIC Educational Resources Information Center
Melia, Josie
Writing groups can use word processors or microcomputers in many different types of writing activities. Four hour-long sessions at a word processor with the help of a skilled word processing tutor have been found to be sufficient to provide a working knowledge of word processing. When two or three students enrolled in a writing class are assigned…
The Influence of Contextual Diversity on Eye Movements in Reading
ERIC Educational Resources Information Center
Plummer, Patrick; Perea, Manuel; Rayner, Keith
2014-01-01
Recent research has shown contextual diversity (i.e., the number of passages in which a given word appears) to be a reliable predictor of word processing difficulty. It has also been demonstrated that word-frequency has little or no effect on word recognition speed when accounting for contextual diversity in isolated word processing tasks. An…
Rotation Reveals the Importance of Configural Cues in Handwritten Word Perception
Barnhart, Anthony S.; Goldinger, Stephen D.
2013-01-01
A dramatic perceptual asymmetry occurs when handwritten words are rotated 90° in either direction. Those rotated in a direction consistent with their natural tilt (typically clockwise) become much more difficult to recognize, relative to those rotated in the opposite direction. In Experiment 1, we compared computer-printed and handwritten words, all equated for degrees of leftward and rightward tilt, and verified the phenomenon: The effect of rotation was far larger for cursive words, especially when rotated in a tilt-consistent direction. In Experiment 2, we replicated this pattern with all items presented in visual noise. In both experiments, word frequency effects were larger for computer-printed words and did not interact with rotation. The results suggest that handwritten word perception requires greater configural processing, relative to computer print, because handwritten letters are variable and ambiguous. When words are rotated, configural processing suffers, particularly when rotation exaggerates natural tilt. Our account is similar to theories of the “Thatcher Illusion,” wherein face inversion disrupts holistic processing. Together, the findings suggest that configural, word-level processing automatically increases when people read handwriting, as letter-level processing becomes less reliable. PMID:23589201
Interpreting Chicken-Scratch: Lexical Access for Handwritten Words
Barnhart, Anthony S.; Goldinger, Stephen D.
2014-01-01
Handwritten word recognition is a field of study that has largely been neglected in the psychological literature, despite its prevalence in society. Whereas studies of spoken word recognition almost exclusively employ natural, human voices as stimuli, studies of visual word recognition use synthetic typefaces, thus simplifying the process of word recognition. The current study examined the effects of handwriting on a series of lexical variables thought to influence bottom-up and top-down processing, including word frequency, regularity, bidirectional consistency, and imageability. The results suggest that the natural physical ambiguity of handwritten stimuli forces a greater reliance on top-down processes, because almost all effects were magnified, relative to conditions with computer print. These findings suggest that processes of word perception naturally adapt to handwriting, compensating for physical ambiguity by increasing top-down feedback. PMID:20695708
Alcoholism and Dampened Temporal Limbic Activation to Emotional Faces
Marinkovic, Ksenija; Oscar-Berman, Marlene; Urban, Trinity; O’Reilly, Cara E.; Howard, Julie A.; Sawyer, Kayle; Harris, Gordon J.
2013-01-01
Background: Excessive chronic drinking is accompanied by a broad spectrum of emotional changes ranging from apathy and emotional flatness to deficits in comprehending emotional information, but their neural bases are poorly understood. Methods: Emotional abnormalities associated with alcoholism were examined with functional magnetic resonance imaging in abstinent long-term alcoholic men in comparison to healthy demographically matched controls. Participants were presented with emotionally valenced words and photographs of faces during deep (semantic) and shallow (perceptual) encoding tasks followed by recognition. Results: Overall, faces evoked stronger activation than words, with the expected material-specific laterality (left hemisphere for words, and right for faces) and depth of processing effects. However, whereas control participants showed stronger activation in the amygdala and hippocampus when viewing faces with emotional (relative to neutral) expressions, the alcoholics responded in an undifferentiated manner to all facial expressions. In the alcoholic participants, amygdala activity was inversely correlated with an increase in lateral prefrontal activity as a function of their behavioral deficits. Prefrontal modulation of emotional function as a compensation for the blunted amygdala activity during a socially relevant face appraisal task is in agreement with a distributed network engagement during emotional face processing. Conclusions: Deficient activation of amygdala and hippocampus may underlie impaired processing of emotional faces associated with long-term alcoholism and may be a part of the wide array of behavioral problems including disinhibition, concurring with previously documented interpersonal difficulties in this population. Furthermore, the results suggest that alcoholics may rely on prefrontal rather than temporal limbic areas in order to compensate for reduced limbic responsivity and to maintain behavioral adequacy when faced with emotionally or socially challenging situations. PMID:19673745
A New Perspective on Visual Word Processing Efficiency
Houpt, Joseph W.; Townsend, James T.; Donkin, Christopher
2013-01-01
As a fundamental part of our daily lives, visual word processing has received much attention in the psychological literature. Despite the well established advantage of perceiving letters in a word or in a pseudoword over letters alone or in random sequences using accuracy, a comparable effect using response times has been elusive. Some researchers continue to question whether the advantage due to word context is perceptual. We use the capacity coefficient, a well established, response time based measure of efficiency to provide evidence of word processing as a particularly efficient perceptual process to complement those results from the accuracy domain. PMID:24334151
Combining approaches to on-line handwriting information retrieval
NASA Astrophysics Data System (ADS)
Peña Saldarriaga, Sebastián; Viard-Gaudin, Christian; Morin, Emmanuel
2010-01-01
In this work, we propose to combine two quite different approaches for retrieving handwritten documents. Our hypothesis is that different retrieval algorithms should retrieve different sets of documents for the same query. Therefore, significant improvements in retrieval performance can be expected. The first approach is based on information retrieval techniques carried out on the noisy texts obtained through handwriting recognition, while the second approach is recognition-free and uses a word spotting algorithm. Results show that for texts having a word error rate (WER) lower than 23%, the performance obtained with the combined system is close to the performance obtained on clean digital texts. In addition, for poorly recognized texts (WER > 52%), an improvement of nearly 17% can be observed with respect to the best available baseline method.
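The word error rate thresholds quoted above are easier to interpret with the usual definition in hand: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The following is a minimal sketch of that standard definition, assuming whitespace tokenization; the function name `word_error_rate` is hypothetical, and the abstract does not say how the authors computed WER.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# One substitution and one deletion against four reference words.
example_wer = word_error_rate("the quick brown fox", "the quack brown")
```

Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.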
Is Word Shape Still in Poor Shape for the Race to the Lexicon?
ERIC Educational Resources Information Center
Hill, Jessica C.
2010-01-01
Current models of normal reading behavior emphasize not only the recognition and processing of the word being fixated (n) but also processing of the upcoming parafoveal word (n + 1). Gaze contingent displays employing the boundary paradigm often mask words in order to understand how much and what type of processing is completed on the parafoveal…
Automated Non-Alphanumeric Symbol Resolution in Clinical Texts
Moon, SungRim; Pakhomov, Serguei; Ryan, James; Melton, Genevieve B.
2011-01-01
Although clinical texts contain many symbols, relatively little attention has been given to symbol resolution by medical natural language processing (NLP) researchers. Interpreting the meaning of symbols may be viewed as a special case of Word Sense Disambiguation (WSD). One thousand instances of four common non-alphanumeric symbols (‘+’, ‘–’, ‘/’, and ‘#’) were randomly extracted from a clinical document repository and annotated by experts. The symbols and their surrounding context, in addition to bag-of-Words (BoW), and heuristic rules were evaluated as features for the following classifiers: Naïve Bayes, Support Vector Machine, and Decision Tree, using 10-fold cross-validation. Accuracies for ‘+’, ‘–’, ‘/’, and ‘#’ were 80.11%, 80.22%, 90.44%, and 95.00% respectively, with Naïve Bayes. While symbol context contributed the most, BoW was also helpful for disambiguation of some symbols. Symbol disambiguation with supervised techniques can be implemented with reasonable accuracy as a module for medical NLP systems. PMID:22195157
Gwilliams, L; Marantz, A
2015-08-01
Although the significance of morphological structure is established in visual word processing, its role in auditory processing remains unclear. Using magnetoencephalography we probe the significance of the root morpheme for spoken Arabic words with two experimental manipulations. First we compare a model of auditory processing that calculates probable lexical outcomes based on whole-word competitors, versus a model that only considers the root as relevant to lexical identification. Second, we assess violations to the root-specific Obligatory Contour Principle (OCP), which disallows root-initial consonant gemination. Our results show root prediction to significantly correlate with neural activity in superior temporal regions, independent of predictions based on whole-word competitors. Furthermore, words that violated the OCP constraint were significantly easier to dismiss as valid words than probability-matched counterparts. The findings suggest that lexical auditory processing is dependent upon morphological structure, and that the root forms a principal unit through which spoken words are recognised. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Bridgers, Franca Ferrari; Kacinik, Natalie
2017-02-01
The majority of words in most languages consist of derived poly-morphemic words but a cross-linguistic review of the literature (Amenta and Crepaldi in Front Psychol 3:232-243, 2012) shows a contradictory picture with respect to how such words are represented and processed. The current study examined the effects of linearity and structural complexity on the processing of Italian derived words. Participants performed a lexical decision task on three types of prefixed and suffixed words and nonwords differing in the complexity of their internal structure. The processing of these words was indeed found to vary according to the nature of the affixes, the order in which they appear, and the type of information the affix encodes. The results thus indicate that derived words are not a uniform class and the best account of these findings appears to be a constraint-based or probabilistic multi-route processing model (e.g., Kuperman et al. in Lang Cogn Process 23:1089-1132, 2008; J Exp Psychol Hum Percept Perform 35:876-895, 2009; J Mem Lang 62:83-97, 2010).
How Sound Symbolism Is Processed in the Brain: A Study on Japanese Mimetic Words
Okuda, Jiro; Okada, Hiroyuki; Matsuda, Tetsuya
2014-01-01
Sound symbolism is the systematic and non-arbitrary link between word and meaning. Although a number of behavioral studies demonstrate that both children and adults are universally sensitive to sound symbolism in mimetic words, the neural mechanisms underlying this phenomenon have not yet been extensively investigated. The present study used functional magnetic resonance imaging to investigate how Japanese mimetic words are processed in the brain. In Experiment 1, we compared processing for motion mimetic words with that for non-sound symbolic motion verbs and adverbs. Mimetic words uniquely activated the right posterior superior temporal sulcus (STS). In Experiment 2, we further examined the generalizability of the findings from Experiment 1 by testing another domain: shape mimetics. Our results show that the right posterior STS was active when subjects processed both motion and shape mimetic words, thus suggesting that this area may be the primary structure for processing sound symbolism. Increased activity in the right posterior STS may also reflect how sound symbolic words function as both linguistic and non-linguistic iconic symbols. PMID:24840874
Fussell, Nicola J; Rowe, Angela C; Mohr, Christine
2012-01-01
The reliance in experimental psychology on testing undergraduate populations with relatively little life experience, and/or ambiguously valenced stimuli with varying degrees of self-relevance, may have contributed to inconsistent findings in the literature on the valence hypothesis. To control for these potential limitations, the current study assessed lateralised lexical decisions for positive and negative attachment words in 40 middle-aged male and female participants. Self-relevance was manipulated in two ways: by testing currently married compared with previously married individuals and by assessing self-relevance ratings individually for each word. Results replicated a left hemisphere advantage for lexical decisions and a processing advantage of emotional over neutral words but did not support the valence hypothesis. Positive attachment words yielded a processing advantage over neutral words in the right hemisphere, while emotional words (irrespective of valence) yielded a processing advantage over neutral words in the left hemisphere. Both self-relevance manipulations were unrelated to lateralised performance. The roles of participant sex and age in emotion processing are discussed as potential modulators of the present findings.
The effects of sad prosody on hemispheric specialization for words processing.
Leshem, Rotem; Arzouan, Yossi; Armony-Sivan, Rinat
2015-06-01
This study examined the effect of sad prosody on hemispheric specialization for word processing using behavioral and electrophysiological measures. A dichotic listening task combining focused attention and signal-detection methods was conducted to evaluate the detection of a word spoken in neutral or sad prosody. An overall right ear advantage together with leftward lateralization in early (150-170 ms) and late (240-260 ms) processing stages was found for word detection, regardless of prosody. Furthermore, the early stage was most pronounced for words spoken in neutral prosody, showing greater negative activation over the left than the right hemisphere. In contrast, the later stage was most pronounced for words spoken with sad prosody, showing greater positive activation over the left than the right hemisphere. The findings suggest that sad prosody alone was not sufficient to modulate hemispheric asymmetry in word-level processing. We posit that lateralized effects of sad prosody on word processing are largely dependent on the psychoacoustic features of the stimuli as well as on task demands. Copyright © 2015 Elsevier Inc. All rights reserved.
Morphable Word Clouds for Time-Varying Text Data Visualization.
Chi, Ming-Te; Lin, Shih-Syun; Chen, Shiang-Yi; Lin, Chao-Hung; Lee, Tong-Yee
2015-12-01
A word cloud is a visual representation of a collection of text documents that uses various font sizes, colors, and spaces to arrange and depict significant words. The majority of previous studies on time-varying word clouds focus on layout optimization and temporal trend visualization. However, they do not fully consider the spatial shapes and temporal motions of word clouds, which are important factors for attracting people's attention and are also important cues for human visual systems in capturing information from time-varying text data. This paper presents a novel method that uses rigid body dynamics to arrange multi-temporal word-tags in a specific shape sequence under various constraints. Each word-tag is regarded as a rigid body in dynamics. With the aid of geometric, aesthetic, and temporal coherence constraints, the proposed method can generate a temporally morphable word cloud that not only arranges word-tags in their corresponding shapes but also smoothly transforms the shapes of word clouds over time, thus yielding a pleasing time-varying visualization. Using the proposed frame-by-frame and morphable word clouds, people can observe the overall story of time-varying text data from the shape transition, and can also observe the details from the word clouds in individual frames. Experimental results on various data demonstrate the feasibility and flexibility of the proposed method in morphable word cloud generation. In addition, an application that uses the proposed word clouds in a simulated exhibition demonstrates the usefulness of the proposed method.
Li, Su; Lee, Kang; Zhao, Jing; Yang, Zhi; He, Sheng; Weng, Xuchu
2013-04-01
Little is known about the impact of learning to read on early neural development for word processing and its collateral effects on neural development in non-word domains. Here, we examined the effect of early exposure to reading on neural responses to both word and face processing in preschool children with the use of the Event Related Potential (ERP) methodology. We specifically linked children's reading experience (indexed by their sight vocabulary) to two major neural markers: the amplitude differences between the left and right N170 on the bilateral posterior scalp sites and the hemispheric spectrum power differences in the γ band on the same scalp sites. The results showed that the left-lateralization of both the word N170 and the spectrum power in the γ band were significantly positively related to vocabulary. In contrast, vocabulary and the word left-lateralization both had a strong negative direct effect on the face right-lateralization. Also, vocabulary negatively correlated with the right-lateralized face spectrum power in the γ band even after the effects of age and the word spectrum power were partialled out. The present study provides direct evidence regarding the role of reading experience in the neural specialization of word and face processing above and beyond the effect of maturation. The present findings taken together suggest that the neural development of visual word processing competes with that of face processing before the process of neural specialization has been consolidated. Copyright © 2013 Elsevier Ltd. All rights reserved.
Hemispheric asymmetry in holistic processing of words.
Ventura, Paulo; Delgado, João; Ferreira, Miguel; Farinha-Fernandes, António; Guerreiro, José C; Faustino, Bruno; Leite, Isabel; Wong, Alan C-N
2018-05-13
Holistic processing has been regarded as a hallmark of face perception, indicating the automatic and obligatory tendency of the visual system to process all face parts as a perceptual unit rather than in isolation. Studies involving lateralized stimulus presentation suggest that the right hemisphere dominates holistic face processing. Holistic processing can also be shown with other categories such as words and thus it is not specific to faces or face-like expertise. Here, we used divided visual field presentation to investigate the possibly different contributions of the two hemispheres to holistic word processing. Observers performed same/different judgments on the cued parts of two sequentially presented words in the complete composite paradigm. Our data indicate a right hemisphere specialization for holistic word processing. Thus, these markers of expert object recognition are domain general.
Spotting handwritten words and REGEX using a two stage BLSTM-HMM architecture
NASA Astrophysics Data System (ADS)
Bideault, Gautier; Mioulet, Luc; Chatelain, Clément; Paquet, Thierry
2015-01-01
In this article, we propose a hybrid model for spotting words and regular expressions (REGEX) in handwritten documents. The model is made of a state-of-the-art BLSTM (Bidirectional Long Short-Term Memory) neural network for recognizing and segmenting characters, coupled with an HMM to build line models able to spot the desired sequences. Experiments on the Rimes database show very promising results.
Climbing the Tower of Babel: Perfecting Machine Translation
2011-02-16
Center) used MT tools to translate extraordinary numbers of Russian technical documents. 10 For the Air Force, the manpower and time savings were...recognition.htm. Granted, this number is tempered by the rules of a specific language that would disallow specific word orderings, or mandate particular word...sequences, (e.g., in English, prepositions can only be followed by articles, etc) but the overall numbers convey the complexity of the machine
ERIC Educational Resources Information Center
Abu Nasr, Julinda; And Others
This document is divided into two parts: (1) "A Study of Sex Role Stereotype in Arabic Readers" and (2) "A Guide for the Identification and Elimination of Sexism in Arabic Textbooks." In part 1, a sample of 79 Arabic readers were read word for word and the images pertaining to females were recorded. The results of the survey…
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali
2018-06-01
Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation, a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique is ineffective in classifying forensic autopsy reports because it merely extracts frequent but discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome the aforementioned limitations of the BoW technique, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier. The first level classifier was responsible for predicting MoD. In addition, the second level classifier was responsible for predicting CoD using the proposed conceptual graph-based document representation technique. To demonstrate the significance of the proposed technique, its results were compared with those of six (6) state-of-the-art document representation techniques. Lastly, this study compared the effects of one-level classification and two-level classification on the experimental results. 
The experimental results indicated that the CGDR technique achieved 12% to 15% improvement in accuracy compared with fully automated document representation baseline techniques. Moreover, two-level classification obtained better results compared with one-level classification. The promising results of the proposed conceptual graph-based document representation technique suggest that pathologists can adopt the proposed system as their basis for second opinion, thereby supporting them in effectively determining CoD. Copyright © 2018 Elsevier Inc. All rights reserved.
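The limitation noted above, that a bag-of-words representation does not capture word inversion, can be seen in a few lines. This is only an illustrative sketch: the two report fragments are invented, and `bag_of_words` is a hypothetical helper, not part of the CGDR system described in the abstract.

```python
from collections import Counter


def bag_of_words(text):
    """Map a text to its unordered multiset of lowercase tokens."""
    return Counter(text.lower().split())


# Invented fragments with the same words in inverted order.
report_a = "fracture caused hemorrhage"
report_b = "hemorrhage caused fracture"

# The two fragments state opposite causal directions, yet their
# bag-of-words representations are identical.
same_bag = bag_of_words(report_a) == bag_of_words(report_b)
```

Any classifier that sees only these counts receives the same input for both fragments, which is one motivation for representations that keep relational structure.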
The Effects of Test Trial and Processing Level on Immediate and Delayed Retention
Chang, Sau Hou
2017-01-01
The purpose of the present study was to investigate the effects of test trial and processing level on immediate and delayed retention. A 2 × 2 × 2 mixed ANOVA was used with two between-subject factors of test trial (single test, repeated test) and processing level (shallow, deep), and one within-subject factor of final recall (immediate, delayed). Seventy-six college students were randomly assigned first to the single test condition (studied the stimulus words three times and took one free-recall test) or the repeated test condition (studied the stimulus words once and took three consecutive free-recall tests), and then to the shallow processing level (asked whether each stimulus word was presented in capital or small letters) or the deep processing level (asked whether each stimulus word belonged to a particular category) to study forty stimulus words. The immediate test was administered five minutes after the trials, whereas the delayed test was administered one week later. Results showed that participants in the single test condition recalled more words than those in the repeated test condition on the immediate final free-recall test, and that participants in deep processing performed better than those in shallow processing in both immediate and delayed retention. However, the dominance of single test trial and deep processing did not happen in delayed retention. Additional study trials did not further enhance the delayed retention of words encoded in deep processing, but did enhance the delayed retention of words encoded in shallow processing. PMID:28344679
Constellation Stretch Goals: Review of Industry Inputs
NASA Technical Reports Server (NTRS)
Lang, John
2006-01-01
Many good ideas received based on industry experience: a) Shuttle operations; b) Commercial aircraft production; c) NASA's historical way of doing business; d) Military and commercial programs. Aerospace performed preliminary analysis: a) Potential savings; b) Cost of implementation; c) Performance or other impact/penalties; d) Roadblocks; e) Unintended consequences; f) Bottom line. Significant work ahead for a "Stretch Goal" to become a good, documented requirement: 1) As a group, the relative "value" of goals is uneven; 2) Focused analysis on each goal is required: a) Need to ensure that a new requirement produces the desired consequence; b) It is not certain that some goals will not create problems elsewhere. 3) Individual implementation path needs to be studied: a) Best place to insert requirement (what level, which document); b) Appropriate wording for the requirement. Many goals reflect "best practices" based on lessons learned and may have value beyond near-term CxP requirements process.
A novel word spotting method based on recurrent neural networks.
Frinken, Volkmar; Fischer, Andreas; Manmatha, R; Bunke, Horst
2012-02-01
Keyword spotting refers to the process of retrieving all instances of a given keyword from a document. In the present paper, a novel keyword spotting method for handwritten documents is described. It is derived from a neural network-based system for unconstrained handwriting recognition. As such it performs template-free spotting, i.e., it is not necessary for a keyword to appear in the training set. The keyword spotting is done using a modification of the CTC Token Passing algorithm in conjunction with a recurrent neural network. We demonstrate that the proposed systems outperform not only a classical dynamic time warping-based approach but also a modern keyword spotting system, based on hidden Markov models. Furthermore, we analyze the performance of the underlying neural networks when using them in a recognition task followed by keyword spotting on the produced transcription. We point out the advantages of keyword spotting when compared to classic text line recognition.
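For context on the classical baseline mentioned above, the following is a minimal sketch of dynamic time warping (DTW) between two one-dimensional feature sequences, used here to rank candidate word images against a query. It is not the paper's recurrent-network method or its CTC Token Passing modification; the feature values, the candidate names, and the function `dtw_distance` are all invented for illustration, and real systems typically compare multi-dimensional feature vectors per image column.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of numbers."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also aligned with earlier b
                cost[i][j - 1],      # b[j-1] also aligned with earlier a
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


# Hypothetical per-column features for a query and two candidate words.
query = [1.0, 3.0, 2.0]
candidates = {
    "word_a": [1.0, 1.0, 3.0, 2.0],  # same profile, stretched in time
    "word_b": [3.0, 1.0, 1.0],
}
best = min(candidates, key=lambda name: dtw_distance(query, candidates[name]))
```

Because DTW allows one element to align with several, the stretched candidate matches the query at zero cost, which is why the technique tolerates variation in writing width.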
On the interaction of deaffrication and consonant harmony
Dinnsen, Daniel A.; Gierut, Judith A.; Morrisette, Michele L.; Green, Christopher R.; Farris-Trimble, Ashley W.
2010-01-01
Error patterns in children’s phonological development are often described as simplifying processes that can interact with one another with different consequences. Some interactions limit the applicability of an error pattern, and others extend it to more words. Theories predict that error patterns interact to their full potential. While specific interactions have been documented for certain pairs of processes, no developmental study has shown that the range of typologically predicted interactions occurs for those processes. To determine whether this anomaly is an accidental gap or a systematic peculiarity of particular error patterns, two commonly occurring processes were considered, namely Deaffrication and Consonant Harmony. Results are reported from a cross-sectional and longitudinal study of 12 children (age 3;0 – 5;0) with functional phonological delays. Three interaction types were attested to varying degrees. The longitudinal results further instantiated the typology and revealed a characteristic trajectory of change. Implications of these findings are explored. PMID:20513256
A dual-task investigation of automaticity in visual word processing
NASA Technical Reports Server (NTRS)
McCann, R. S.; Remington, R. W.; Van Selst, M.
2000-01-01
An analysis of activation models of visual word processing suggests that frequency-sensitive forms of lexical processing should proceed normally while unattended. This hypothesis was tested by having participants perform a speeded pitch discrimination task followed by lexical decisions or word naming. As the stimulus onset asynchrony between the tasks was reduced, lexical-decision and naming latencies increased dramatically. Word-frequency effects were additive with the increase, indicating that frequency-sensitive processing was subject to postponement while attention was devoted to the other task. Either (a) the same neural hardware shares responsibility for lexical processing and central stages of choice reaction time task processing and cannot perform both computations simultaneously, or (b) lexical processing is blocked in order to optimize performance on the pitch discrimination task. Either way, word processing is not as automatic as activation models suggest.
Encoding the world around us: motor-related processing influences verbal memory.
Madan, Christopher R; Singhal, Anthony
2012-09-01
It is known that properties of words such as their imageability can influence our ability to remember those words. However, it is not known if other object-related properties can also influence our memory. In this study we asked whether a word representing a concrete object that can be functionally interacted with (i.e., high-manipulability word) would enhance the memory representations for that item compared to a word representing a less manipulable object (i.e., low-manipulability word). Here participants incidentally encoded high-manipulability (e.g., CAMERA) and low-manipulability words (e.g., TABLE) while making word judgments. Using a between-subjects design, we varied the depth-of-processing involved in the word judgment task: participants judged the words based on personal experience (deep/elaborative processing), word length (shallow), or functionality (intermediate). Participants were able to remember high-manipulability words better than low-manipulability words in both the personal experience and word length groups; thus presenting the first evidence that manipulability can influence memory. However, we observed better memory for low- than high-manipulability words in the functionality group. We explain this surprising interaction between manipulability and memory as being mediated by automatic vs. controlled motor-related cognition. Copyright © 2012 Elsevier Inc. All rights reserved.
Memory for pictures and words as a function of level of processing: Depth or dual coding?
D'Agostino, P R; O'Neill, B J; Paivio, A
1977-03-01
The experiment was designed to test differential predictions derived from dual-coding and depth-of-processing hypotheses. Subjects under incidental memory instructions free recalled a list of 36 test events, each presented twice. Within the list, an equal number of events were assigned to structural, phonemic, and semantic processing conditions. Separate groups of subjects were tested with a list of pictures, concrete words, or abstract words. Results indicated that retention of concrete words increased as a direct function of the processing-task variable (structural < phonemic
Eye-fixation behavior, lexical storage, and visual word recognition in a split processing model.
Shillcock, R; Ellison, T M; Monaghan, P
2000-10-01
Some of the implications of a model of visual word recognition in which processing is conditioned by the anatomical splitting of the visual field between the two hemispheres of the brain are explored. The authors investigate the optimal processing of visually presented words within such an architecture, and, for a realistically sized lexicon of English, characterize a computationally optimal fixation point in reading. They demonstrate that this approach motivates a range of behavior observed in reading isolated words and text, including the optimal viewing position and its relationship with the preferred viewing location, the failure to fixate smaller words, asymmetries in hemisphere-specific processing, and the priority given to the exterior letters of words. The authors also show that split architectures facilitate the uptake of all the letter-position information necessary for efficient word recognition and that this information may be less specific than is normally assumed. A split model of word recognition captures a range of behavior in reading that is greater than that covered by existing models of visual word recognition.
Dudschig, Carolin; de la Vega, Irmgard; Kaup, Barbara
2014-05-01
Converging evidence suggests that understanding our first-language (L1) results in reactivation of experiential sensorimotor traces in the brain. Surprisingly, little is known regarding the involvement of these processes during second-language (L2) processing. Participants saw L1 or L2 words referring to entities with a typical location (e.g., star, mole) (Experiment 1 & 2) or to an emotion (e.g., happy, sad) (Experiment 3). Participants responded to the words' ink color with an upward or downward arm movement. Despite word meaning being fully task-irrelevant, L2 automatically activated motor responses similar to L1 even when L2 was acquired rather late in life (age >11). Specifically, words such as star facilitated upward, and words such as root facilitated downward responses. Additionally, words referring to positive emotions facilitated upward, and words referring to negative emotions facilitated downward responses. In summary our study suggests that reactivation of experiential traces is not limited to L1 processing. Copyright © 2014 Elsevier Inc. All rights reserved.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-13
... minutes, automatically generate the SPL document (a few formatting edits may have to be made). Based on... render it as intended in SPL. The comment said that most users need to apply applicable formatting to..., including MS Word (both editable and hard-formatted), faxes, texts, in emails, or other scanned documents...
ERIC Educational Resources Information Center
Branzburg, Jeffrey
2008-01-01
There are many ways to begin a PDF document using Adobe Acrobat. The easiest and most popular way is to create the document in another application (such as Microsoft Word) and then use the Adobe Acrobat software to convert it to a PDF. In this article, the author describes how he used Acrobat's many tools in his project--an interactive…
Working Words: A User's Guide to Written Communication at Work.
ERIC Educational Resources Information Center
Hagston, Jan
Writing a document that is clear and easy to understand is difficult. This resource book is a guide to making written material easier to read, understand, and use. The guide is targeted at those who write work-place documents--industry or TAFE (Technical and Further Education) trainers, managers, supervisors, union representatives or writers of…
ERIC Educational Resources Information Center
Newlin, George
Charles Dickens' novel, "A Tale of Two Cities," does not waste a word in telling a touching, suspenseful tale set against the background of one of the bloodiest events in history, the French Revolution. This casebook's collection of historical documents, collateral readings, and commentary will promote interdisciplinary study of the…
Putting Practice into Words: The State of Data and Methods Transparency in Grammatical Descriptions
ERIC Educational Resources Information Center
Gawne, Lauren; Kelly, Barbara F.; Berez-Kroeker, Andrea L.; Heston, Tyler
2017-01-01
Language documentation and description are closely related practices, often performed as part of the same fieldwork project on an un(der)-studied language. Research trends in recent decades have seen a great volume of publishing in regards to the methods of language documentation; however, it is not clear that linguists' awareness of the…
ERIC Educational Resources Information Center
Uzunboylu, Huseyin; Genc, Zeynep
2017-01-01
The purpose of this study is to determine the recent trends in foreign language learning through mobile learning. The study was conducted employing document analysis and related content analysis among the qualitative research methodology. Through the search conducted on Scopus database with the key words "mobile learning and foreign language…
2011-02-17
document objects, on one or more electronic document pages. These commands have their roots in typography, so, to understand the PDF Language, one...must have at least a rudimentary understanding of typography. Only a few of the typographic commands, called text showing operators, can hold strings
NASA Technical Reports Server (NTRS)
1989-01-01
This document establishes electrical, electronic, and electromechanical (EEE) parts management and control requirements for contractors providing and maintaining space flight and mission-essential or critical ground support equipment for NASA space flight programs. Although the text is worded 'the contractor shall,' the requirements are also to be used by NASA Headquarters and field installations for developing program/project parts management and control requirements for in-house and contracted efforts. This document places increased emphasis on parts programs to ensure that reliability and quality are considered through adequate consideration of the selection, control, and application of parts. It is the intent of this document to identify disciplines that can be implemented to obtain reliable parts which meet mission needs. The parts management and control requirements described in this document are to be selectively applied, based on equipment class and mission needs. Individual equipment needs should be evaluated to determine the extent to which each requirement should be implemented on a procurement. Utilization of this document does not preclude the usage of other documents. The entire process of developing and implementing requirements is referred to as 'tailoring' the program for a specific project. Some factors that should be considered in this tailoring process include program phase, equipment category and criticality, equipment complexity, and mission requirements. Parts management and control requirements advocated by this document directly support the concept of 'reliability by design' and are an integral part of system reliability and maintainability. Achieving the required availability and mission success objectives during operation depends on the attention given reliability and maintainability in the design phase. 
Consequently, it is intended that the requirements described in this document are consistent with those of NASA publications, 'Reliability Program Requirements for Aeronautical and Space System Contractors,' NHB 5300.4(1A-1); 'Maintainability Program Requirements for Space Systems,' NHB 5300.4(1E); and 'Quality Program Provisions for Aeronautical and Space System Contractors,' NHB 5300.4(1B).
Parafoveal Load of Word N+1 Modulates Preprocessing Effectiveness of Word N+2 in Chinese Reading
ERIC Educational Resources Information Center
Yan, Ming; Kliegl, Reinhold; Shu, Hua; Pan, Jinger; Zhou, Xiaolin
2010-01-01
Preview benefits (PBs) from two words to the right of the fixated one (i.e., word N + 2) and associated parafoveal-on-foveal effects are critical for proposals of distributed lexical processing during reading. This experiment examined parafoveal processing during reading of Chinese sentences, using a boundary manipulation of N + 2-word preview…
ERIC Educational Resources Information Center
Eckerth, Johannes; Tavakoli, Parveneh
2012-01-01
Research on incidental second language (L2) vocabulary acquisition through reading has claimed that repeated encounters with unfamiliar words and the relative elaboration of processing these words facilitate word learning. However, so far both variables have been investigated in isolation. To help close this research gap, the current study…
Rau, Anne K; Moll, Kristina; Snowling, Margaret J; Landerl, Karin
2015-02-01
The current study investigated the time course of cross-linguistic differences in word recognition. We recorded eye movements of German and English children and adults while reading closely matched sentences, each including a target word manipulated for length and frequency. Results showed differential word recognition processes for both developing and skilled readers. Children of the two orthographies did not differ in terms of total word processing time, but this equal outcome was achieved quite differently. Whereas German children relied on small-unit processing early in word recognition, English children applied small-unit decoding only upon rereading, possibly when experiencing difficulties in integrating an unfamiliar word into the sentence context. Rather unexpectedly, cross-linguistic differences were also found in adults in that English adults showed longer processing times than German adults for nonwords. Thus, although orthographic consistency does play a major role in reading development, cross-linguistic differences are detectable even in skilled adult readers. Copyright © 2014 Elsevier Inc. All rights reserved.
Extracting Related Words from Anchor Text Clusters by Focusing on the Page Designer's Intention
NASA Astrophysics Data System (ADS)
Liu, Jianquan; Chen, Hanxiong; Furuse, Kazutaka; Ohbo, Nobuo
Approaches that extract related words (terms) by co-occurrence sometimes work poorly. Two words that frequently co-occur in the same documents are considered related, yet they may not be related at all, since they may share neither a common meaning nor similar semantics. We address this problem by considering the page designer’s intention and propose a new model to extract related words. Our approach is based on the idea that web page designers usually place correlated hyperlinks close together in the browser window. We developed a browser-based crawler to collect “geographically” near hyperlinks; then, by clustering these hyperlinks based on their pixel coordinates, we extract related words that reflect the designer’s intention well. Experimental results show that our method can represent the intention of the web page designer with extremely high precision. Moreover, the experiments indicate that our extraction method can obtain related words with a high average precision.
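The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cluster_links`, the `max_dist` threshold, the single-linkage grouping rule, and the sample coordinates are all assumptions made for illustration only.

```python
from math import hypot


def cluster_links(links, max_dist=50.0):
    """Group (text, x, y) hyperlinks by pixel proximity.

    Hypothetical helper: two links fall in the same cluster when a chain
    of links connects them in which each hop is at most max_dist pixels
    (single-linkage clustering via union-find).
    """
    parent = list(range(len(links)))

    def find(i):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (_, xi, yi) in enumerate(links):
        for j in range(i + 1, len(links)):
            _, xj, yj = links[j]
            if hypot(xi - xj, yi - yj) <= max_dist:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i, (text, _, _) in enumerate(links):
        groups.setdefault(find(i), []).append(text)
    # Sort for a deterministic result.
    return sorted(sorted(g) for g in groups.values())


# Made-up anchor texts with made-up on-screen pixel coordinates.
links = [
    ("Hotels", 10, 100),
    ("Flights", 10, 125),
    ("Car rental", 10, 150),
    ("Privacy policy", 400, 900),
    ("Terms of use", 430, 900),
]
clusters = cluster_links(links, max_dist=40.0)
print(clusters)
```

In this toy input the three navigation links form one cluster and the two footer links another, so the anchor texts inside each cluster would be treated as candidate related words. A real system would need a principled threshold or a different clustering method.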
Word Processing: The Air Force Administrators’ Handbook
1979-05-01
finest magazine on the market, Word Processing World. If you can’t get the bucks for the Report, order Word Processing World by itself for $14/year...following publication. "The Seybold Report on Word Processing" is published monthly by Seybold Publications, Inc., Box 644, Media, Pennsylvania 19063...Avenue, New York, NY 10022. It’s a lot like the Cecil book--aimed at the community college and vocational-technical school market. Well, that wraps up
When does word frequency influence written production?
Baus, Cristina; Strijkers, Kristof; Costa, Albert
2013-01-01
The aim of the present study was to explore the central (e.g., lexical processing) and peripheral processes (motor preparation and execution) underlying word production during typewriting. To do so, we tested non-professional typists in a picture typing task while continuously recording EEG. Participants were instructed to write (by means of a standard keyboard) the corresponding name for a given picture. The lexical frequency of the words was manipulated: half of the picture names were of high frequency while the remaining were of low frequency. Different measures were obtained: (1) first keystroke latency and (2) keystroke latency of the subsequent letters and duration of the word. Moreover, ERPs locked to the onset of the picture presentation were analyzed to explore the temporal course of word frequency in typewriting. The results showed an effect of word frequency for the first keystroke latency but not for the duration of the word or the speed at which letters were typed (interstroke intervals). The electrophysiological results showed the expected ERP frequency effect at posterior sites: amplitudes for low-frequency words were more positive than those for high-frequency words. However, relative to previous evidence in the spoken modality, the frequency effect appeared in a later time-window. These results demonstrate two marked differences in the processing dynamics underpinning typing compared to speaking: First, central processing dynamics between speaking and typing differ already in the manner that words are accessed; second, central processing differences in typing, unlike speaking, do not cascade to peripheral processes involved in response execution. PMID:24399980
Don't words come easy? A psychophysical exploration of word superiority
Starrfelt, Randi; Petersen, Anders; Vangkilde, Signe
2013-01-01
Words are made of letters, and yet sometimes it is easier to identify a word than a single letter. This word superiority effect (WSE) has been observed when written stimuli are presented very briefly or degraded by visual noise. We compare performance with letters and words in three experiments, to explore the extents and limits of the WSE. Using a carefully controlled list of three letter words, we show that a WSE can be revealed in vocal reaction times even to undegraded stimuli. With a novel combination of psychophysics and mathematical modeling, we further show that the typical WSE is specifically reflected in perceptual processing speed: single words are simply processed faster than single letters. Intriguingly, when multiple stimuli are presented simultaneously, letters are perceived more easily than words, and this is reflected both in perceptual processing speed and visual short term memory (VSTM) capacity. So, even if single words come easy, there is a limit to the WSE. PMID:24027510
Wang, Jie; Wong, Andus Wing-Kuen; Chen, Hsuan-Chih
2017-06-05
The time course of phonological encoding in Mandarin monosyllabic word production was investigated by using the picture-word interference paradigm. Participants were asked to name pictures in Mandarin while visual distractor words were presented before, at, or after picture onset (i.e., stimulus-onset asynchrony/SOA = -100, 0, or +100 ms, respectively). Compared with the unrelated control, the distractors sharing atonal syllables with the picture names significantly facilitated the naming responses at -100- and 0-ms SOAs. In addition, the facilitation effect of sharing word-initial segments only appeared at 0-ms SOA, and null effects were found for sharing word-final segments. These results indicate that both syllables and subsyllabic units play important roles in Mandarin spoken word production and more critically that syllabic processing precedes subsyllabic processing. The current results lend strong support to the proximate units principle (O'Seaghdha, Chen, & Chen, 2010), which holds that the phonological structure of spoken word production is language-specific and that atonal syllables are the proximate phonological units in Mandarin Chinese. On the other hand, the significance of word-initial segments over word-final segments suggests that serial processing of segmental information seems to be universal across Germanic languages and Chinese, which remains to be verified in future studies.
Juhasz, Barbara J
2016-11-14
Recording eye movements provides information on the time-course of word recognition during reading. Juhasz and Rayner [Juhasz, B. J., & Rayner, K. (2003). Investigating the effects of a set of intercorrelated variables on eye fixation durations in reading. Journal of Experimental Psychology: Learning, Memory and Cognition, 29, 1312-1318] examined the impact of five word recognition variables, including familiarity and age-of-acquisition (AoA), on fixation durations. All variables impacted fixation durations, but the time-course differed. However, the study focused on relatively short, morphologically simple words. Eye movements are also informative for examining the processing of morphologically complex words such as compound words. The present study further examined the time-course of lexical and semantic variables during morphological processing. A total of 120 English compound words that varied in familiarity, AoA, semantic transparency, lexeme meaning dominance, sensory experience rating (SER), and imageability were selected. The impact of these variables on fixation durations was examined when length, word frequency, and lexeme frequencies were controlled in a regression model. The most robust effects were found for familiarity and AoA, indicating that a reader's experience with compound words significantly impacts compound recognition. These results provide insight into semantic processing of morphologically complex words during reading.
Computational Models of the Representation of Bangla Compound Words in the Mental Lexicon.
Dasgupta, Tirthankar; Sinha, Manjira; Basu, Anupam
2016-08-01
In this paper we aim to model the organization and processing of Bangla compound words in the mental lexicon. Our objective is to determine whether the mental lexicon accesses a Bangla compound word as a whole or decomposes the whole word into its constituent morphemes and then recognizes them accordingly. To address this issue, we adopted two different strategies. First, we conducted a cross-modal priming experiment with a number of native speakers. Analysis of reaction time (RT) and error rates indicates that, in general, Bangla compound words are accessed via a partial decomposition process. That is, some words follow the full-listing mode of representation and some words follow the decomposition route. Next, based on the collected RT data, we developed a computational model that can explain the processing phenomena of the access and representation of Bangla compound words. In order to achieve this, we first explored the individual roles of head word position, morphological complexity, orthographic transparency and semantic compositionality between the constituents and the whole compound word. Accordingly, we developed a complexity-based model by combining these features. To a large extent we have successfully explained the possible processing phenomena of most of the Bangla compound words. Our proposed model shows an accuracy of around 83%.
ERIC Educational Resources Information Center
Juhasz, Barbara J.; Johnson, Rebecca L.; Brewer, Jennifer
2017-01-01
New words enter the language through several word formation processes [see Simonini ("Engl J" 55:752-757, 1966)]. One such process, blending, occurs when two source words are combined to represent a new concept (e.g., SMOG, BRUNCH, BLOG, and INFOMERCIAL). While there have been examinations of the structure of blends [see Gries…
DTD Creation for the Software Technology for Adaptable, Reliable Systems (STARS) Program
1990-06-23
developed to store documents in a format peculiar to the program’s design. Editing the document became easy since word processors adjust all spacing and...descriptive markup may be output to a variety of devices ranging from high quality typography printers through laser printers...provision for non-SGML material, such as graphics, to be inserted in a document. For these reasons the Computer-Aided Acquisition and Logistics Support
[Electrophysiological bases of semantic processing of objects].
Kahlaoui, Karima; Baccino, Thierry; Joanette, Yves; Magnié, Marie-Noële
2007-02-01
How pictures and words are stored and processed in the human brain constitutes a long-standing question in cognitive psychology. Behavioral studies have yielded a large amount of data addressing this issue. Generally speaking, these data show that there are some interactions between the semantic processing of pictures and words. However, behavioral methods can provide only limited insight into certain findings. Fortunately, Event-Related Potentials (ERPs) provide on-line cues about the temporal nature of cognitive processes and contribute to the exploration of their neural substrates. ERPs have been used in order to better understand semantic processing of words and pictures. The main objective of this article is to offer an overview of the electrophysiological bases of semantic processing of words and pictures. Studies presented in this article showed that the processing of words is associated with an N400 component, whereas pictures elicited both N300 and N400 components. Topographical analysis of the N400 distribution over the scalp is compatible with the idea that both image-mediated concrete words and pictures access an amodal semantic system. However, given the distinctive N300 patterns, observed only during picture processing, it appears that picture and word processing rely upon distinct neuronal networks, even if they end up activating more or less similar semantic representations.
Distance-dependent processing of pictures and words.
Amit, Elinor; Algom, Daniel; Trope, Yaacov
2009-08-01
A series of 8 experiments investigated the association between pictorial and verbal representations and the psychological distance of the referent objects from the observer. The results showed that people better process pictures that represent proximal objects and words that represent distal objects than pictures that represent distal objects and words that represent proximal objects. These results were obtained with various psychological distance dimensions (spatial, temporal, and social), different tasks (classification and categorization), and different measures (speed of processing and selective attention). The authors argue that differences in the processing of pictures and words emanate from the physical similarity of pictures, but not words, to the referents. Consequently, perceptual analysis is commonly applied to pictures but not to words. Pictures thus impart a sense of closeness to the referent objects and are preferably used to represent such objects, whereas words do not convey proximity and are preferably used to represent distal objects in space, time, and social perspective.
The Integration of Word Processing with Data Processing in an Educational Environment. Final Report.
ERIC Educational Resources Information Center
Patterson, Lorna; Schlender, Jim
A project examined the Office of the Future and determined trends regarding an integration of word processing and data processing. It then sought to translate those trends into an educational package to develop the potential information specialist. A survey instrument completed by 33 office managers and word processing and data processing…
Word Processors: A Look at Four Popular Programs.
ERIC Educational Resources Information Center
Press, Larry
1980-01-01
Described are types of programs used for processing text (editors, print formatters, and word processors), followed by the comparison of four word-processing packages: Auto Scribe, Electric Pencil, Magic Wand and WordStar. With the exception of Auto Scribe, all programs reviewed are CP/M versions. (KC)
Bakos, Sarolta; Landerl, Karin; Bartling, Jürgen; Schulte-Körne, Gerd; Moll, Kristina
2018-03-01
In consistent orthographies, isolated reading disorders (iRD) and isolated spelling disorders (iSD) are nearly as common as combined reading-spelling disorders (cRSD). However, the exact nature of the underlying word processing deficits in isolated versus combined literacy deficits are not well understood yet. We applied a phonological lexical decision task (including words, pseudohomophones, legal and illegal pseudowords) during ERP recording to investigate the neurophysiological correlates of lexical and sublexical word-processing in children with iRD, iSD and cRSD compared to typically developing (TD) 9-year-olds. TD children showed enhanced early sensitivity (N170) for word material and for the violation of orthographic rules compared to the other groups. Lexical orthographic effects (higher LPC amplitude for words than for pseudohomophones) were the same in the TD and iRD groups, although processing took longer in children with iRD. In the iSD and cRSD groups, lexical orthographic effects were evident and stable over time only for correctly spelled words. Orthographic representations were intact in iRD children, but word processing took longer compared to TD. Children with spelling disorders had partly missing orthographic representations. Our study is the first to specify the underlying neurophysiology of word processing deficits associated with isolated literacy deficits. Copyright © 2017 International Federation of Clinical Neurophysiology. Published by Elsevier B.V. All rights reserved.
Quam, Carolyn; Creel, Sarah C
2017-01-01
Previous research has mainly considered the impact of tone-language experience on ability to discriminate linguistic pitch, but proficient bilingual listening requires differential processing of sound variation in each language context. Here, we ask whether Mandarin-English bilinguals, for whom pitch indicates word distinctions in one language but not the other, can process pitch differently in a Mandarin context vs. an English context. Across three eye-tracked word-learning experiments, results indicated that tone-intonation bilinguals process tone in accordance with the language context. In Experiment 1, 51 Mandarin-English bilinguals and 26 English speakers without tone experience were taught Mandarin-compatible novel words with tones. Mandarin-English bilinguals out-performed English speakers, and, for bilinguals, overall accuracy was correlated with Mandarin dominance. Experiment 2 taught 24 Mandarin-English bilinguals and 25 English speakers novel words with Mandarin-like tones, but English-like phonemes and phonotactics. The Mandarin-dominance advantages observed in Experiment 1 disappeared when words were English-like. Experiment 3 contrasted Mandarin-like vs. English-like words in a within-subjects design, providing even stronger evidence that bilinguals can process tone language-specifically. Bilinguals (N = 58), regardless of language dominance, attended more to tone than English speakers without Mandarin experience (N = 28), but only when words were Mandarin-like, not when they were English-like. Mandarin-English bilinguals thus tailor tone processing to the within-word language context.