Sample records for handwritten text recognition

  1. Handwritten recognition of Tamil vowels using deep learning

    NASA Astrophysics Data System (ADS)

    Ram Prashanth, N.; Siddarth, B.; Ganesh, Anirudh; Naveen Kumar, Vaegae

    2017-11-01

    We come across a large volume of handwritten texts in our daily lives, and handwritten character recognition has long been an important area of research in pattern recognition. The complexity of the task varies among languages, largely because of the similarity between characters, their distinct shapes and the number of characters, which are all language-specific properties. There have been numerous works on character recognition of English alphabets, with laudable success, but regional languages have been dealt with less frequently and with lower accuracies. In this paper, we explored the performance of Deep Belief Networks in the classification of handwritten Tamil vowels, and conclusively compared the results obtained. The proposed method has shown satisfactory recognition accuracy in light of the difficulties faced with regional languages, such as the similarity between characters and the minute nuances that differentiate them. We can further extend this to all the Tamil characters.

  2. Kannada character recognition system using neural network

    NASA Astrophysics Data System (ADS)

    Kumar, Suresh D. S.; Kamalapuram, Srinivasa K.; Kumar, Ajay B. R.

    2013-03-01

    Handwriting recognition has been one of the active and challenging research areas in the field of pattern recognition. It has numerous applications, including reading aids for the blind, bank cheque processing and conversion of handwritten documents into structured text form. There are not enough works on Indian language character recognition, especially on the Kannada script, which is one of the 15 major scripts in India. In this paper an attempt is made to recognize handwritten Kannada characters using feed-forward neural networks. A handwritten Kannada character is resized to 20x30 pixels. The resized character is used for training the neural network. Once the training process is completed, the same character is given as input to the neural network with different numbers of neurons in the hidden layer, and the recognition accuracy rates for different Kannada characters are calculated and compared. The results show that the proposed system yields good recognition accuracy rates comparable to those of other handwritten character recognition systems.

  3. Handwritten digits recognition based on immune network

    NASA Astrophysics Data System (ADS)

    Li, Yangyang; Wu, Yunhui; Jiao, Lc; Wu, Jianshe

    2011-11-01

    With the development of society, handwritten digit recognition techniques have been widely applied in production and daily life. Handwritten digit recognition remains a very difficult task in the field of pattern recognition. In this paper, a new method is presented for handwritten digit recognition. The digit samples are first preprocessed and their features are extracted. Based on these features, a novel immune network classification algorithm is designed and applied to handwritten digit recognition. The proposed algorithm is built on Jerne's immune network model for feature selection and the KNN method for classification. It is characterized by a novel network with parallel computing and learning. The proposed method is evaluated on the MNIST handwritten digit dataset and compared with other recognition algorithms: KNN, ANN and SVM. The results show that the novel classification algorithm based on the immune network gives promising performance and stable behavior for handwritten digit recognition.

  4. Comparative implementation of Handwritten and Machine written Gurmukhi text utilizing appropriate parameters

    NASA Astrophysics Data System (ADS)

    Kaur, Jaswinder; Jagdev, Gagandeep, Dr.

    2018-01-01

    Optical character recognition is concerned with the recognition of optically processed characters. The recognition is done offline after the writing or printing has been completed, unlike online recognition, where the computer has to recognize the characters instantly as they are drawn. The performance of character recognition depends upon the quality of the scanned documents. Preprocessing steps are used for removing low-frequency background noise and normalizing the intensity of individual scanned documents. Several filters are used for reducing certain image details and enabling an easier or faster evaluation. The primary aim of the research work is to recognize handwritten and machine-written characters and to differentiate them. The language chosen for the research work is Punjabi Gurmukhi and the tool used is MATLAB.

  5. A Parallel Neuromorphic Text Recognition System and Its Implementation on a Heterogeneous High-Performance Computing Cluster

    DTIC Science & Technology

    2013-01-01

    M. Ahmadi, and M. Shridhar, “ Handwritten Numeral Recognition with Multiple Features and Multistage Classifiers,” Proc. IEEE Int’l Symp. Circuits...ARTICLE (Post Print) 3. DATES COVERED (From - To) SEP 2011 – SEP 2013 4. TITLE AND SUBTITLE A PARALLEL NEUROMORPHIC TEXT RECOGNITION SYSTEM AND ITS...research in computational intelligence has entered a new era. In this paper, we present an HPC-based context-aware intelligent text recognition

  6. Transcript mapping for handwritten English documents

    NASA Astrophysics Data System (ADS)

    Jose, Damien; Bharadwaj, Anurag; Govindaraju, Venu

    2008-01-01

    Transcript mapping or text alignment with handwritten documents is the automatic alignment of words in a text file with word images in a handwritten document. Such a mapping has several applications in fields ranging from machine learning, where large quantities of truth data are required for evaluating handwriting recognition algorithms, to data mining, where word image indexes are used in ranked retrieval of scanned documents in a digital library. The alignment also aids "writer identity" verification algorithms. Interfaces which display scanned handwritten documents may use this alignment to highlight manuscript tokens when a person examines the corresponding transcript word. We propose an adaptation of the True DTW dynamic programming algorithm for English handwritten documents. Our primary contribution is the integration of the dissimilarity scores from a word-model word recognizer and the Levenshtein distance between the recognized word and the lexicon word as a cost metric in the DTW algorithm, leading to a fast and accurate alignment. The results provided confirm the effectiveness of our approach.
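
    The alignment idea described in this abstract can be illustrated with a minimal sketch. It assumes a plain textbook DTW recurrence rather than the authors' "True DTW" adaptation, and it uses only the Levenshtein distance as the local cost; the recognizer dissimilarity score mentioned in the abstract is left out. The function names levenshtein and dtw_align are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (unit cost per edit)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def dtw_align(transcript, recognized):
    """Align transcript words with recognizer outputs (one per word image).

    Returns (total_cost, path); path holds (transcript_idx, image_idx) pairs.
    Assumption: the local cost is the Levenshtein distance only.
    """
    n, m = len(transcript), len(recognized)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = levenshtein(transcript[i - 1], recognized[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end, always following the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path
```

    For example, dtw_align(["the", "quick", "fox"], ["the", "quiok", "fox"]) pairs each transcript word with the word image at the same position, at a total cost of one edit.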

  7. Do handwritten words magnify lexical effects in visual word recognition?

    PubMed

    Perea, Manuel; Gil-López, Cristina; Beléndez, Victoria; Carreiras, Manuel

    2016-01-01

    An examination of how the word recognition system is able to process handwritten words is fundamental to formulate a comprehensive model of visual word recognition. Previous research has revealed that the magnitude of lexical effects (e.g., the word-frequency effect) is greater with handwritten words than with printed words. In the present lexical decision experiments, we examined whether the quality of handwritten words moderates the recruitment of top-down feedback, as reflected in word-frequency effects. Results showed a reading cost for difficult-to-read and easy-to-read handwritten words relative to printed words. But the critical finding was that difficult-to-read handwritten words, but not easy-to-read handwritten words, showed a greater word-frequency effect than printed words. Therefore, the inherent physical variability of handwritten words does not necessarily boost the magnitude of lexical effects.

  8. Performance evaluation of MLP and RBF feed forward neural network for the recognition of off-line handwritten characters

    NASA Astrophysics Data System (ADS)

    Rishi, Rahul; Choudhary, Amit; Singh, Ravinder; Dhaka, Vijaypal Singh; Ahlawat, Savita; Rao, Mukta

    2010-02-01

    In this paper we propose a system for the classification of handwritten text. At a very broad level, the system is composed of a preprocessing module, a supervised learning module and a recognition module. The preprocessing module digitizes the documents and extracts features (tangent values) for each character. The radial basis function network is used in the learning and recognition modules. The objective is to analyze and improve the performance of the Multi Layer Perceptron (MLP) using RBF transfer functions over the Logarithmic Sigmoid Function. The results of 35 experiments indicate that the Feed Forward MLP performs accurately and exhaustively with RBF. With the change in the weight update mechanism and the feature-drawn preprocessing module, the proposed system achieves good recognition performance.

  9. Online handwritten mathematical expression recognition

    NASA Astrophysics Data System (ADS)

    Büyükbayrak, Hakan; Yanikoglu, Berrin; Erçil, Aytül

    2007-01-01

    We describe a system for recognizing online, handwritten mathematical expressions. The system is designed with a user interface for writing scientific articles, supporting the recognition of basic mathematical expressions as well as integrals, summations, matrices, etc. A feed-forward neural network recognizes symbols, which are assumed to be single-stroke, and a recursive algorithm parses the expression by combining the neural network output and the structure of the expression. Preliminary results show that writer-dependent recognition rates are very high (99.8%) while writer-independent symbol recognition rates are lower (75%). The interface associated with the proposed system integrates the built-in recognition capabilities of Microsoft's Tablet PC API for recognizing textual input and supports conversion of hand-drawn figures into PNG format. This enables the user to enter text and mathematics and to draw figures in a single interface. After recognition, all output is combined into one LaTeX source and compiled into a PDF file.

  10. An adaptive deep Q-learning strategy for handwritten digit recognition.

    PubMed

    Qiao, Junfei; Wang, Gongming; Li, Wenjing; Chen, Min

    2018-02-22

    Handwritten digit recognition has been a challenging problem in recent years. Although many deep learning-based classification algorithms have been studied for handwritten digit recognition, the recognition accuracy and running time still need to be further improved. In this paper, an adaptive deep Q-learning strategy is proposed to improve accuracy and shorten running time for handwritten digit recognition. The adaptive deep Q-learning strategy combines the feature-extracting capability of deep learning and the decision-making of reinforcement learning to form an adaptive Q-learning deep belief network (Q-ADBN). First, Q-ADBN extracts the features of the original images using an adaptive deep auto-encoder (ADAE), and the extracted features are considered as the current states of the Q-learning algorithm. Second, Q-ADBN receives the Q-function (reward signal) during recognition of the current states, and the final handwritten digit recognition is implemented by maximizing the Q-function using the Q-learning algorithm. Finally, experimental results from the well-known MNIST dataset show that the proposed Q-ADBN is superior to other similar methods in terms of accuracy and running time. Copyright © 2018 Elsevier Ltd. All rights reserved.

  11. Sunspot drawings handwritten character recognition method based on deep learning

    NASA Astrophysics Data System (ADS)

    Zheng, Sheng; Zeng, Xiangyun; Lin, Ganghua; Zhao, Cui; Feng, Yongli; Tao, Jinping; Zhu, Daoyuan; Xiong, Li

    2016-05-01

    High-accuracy recognition of handwritten characters in scanned sunspot drawings is of critical importance for analyzing sunspot movement and storing the records in a database. This paper presents a robust deep learning method for recognizing handwritten characters in scanned sunspot drawings. The convolutional neural network (CNN) is a deep learning algorithm that has been truly successful in training multi-layer network structures. A CNN is used to train a recognition model on handwritten character images extracted from the original sunspot drawings. We demonstrate the advantages of the proposed method on sunspot drawings provided by Chinese Academy Yunnan Observatory and obtain the daily full-disc sunspot numbers and sunspot areas from the sunspot drawings. The experimental results show that the proposed method achieves a high recognition accuracy rate.

  12. U.S. Army Research Laboratory (ARL) Corporate Dari Document Transcription and Translation Guidelines

    DTIC Science & Technology

    2012-10-01

    text file format. 15. SUBJECT TERMS Transcription, Translation, guidelines, ground truth, Optical character recognition , OCR, Machine Translation, MT...foreign language into a target language in order to train, test, and evaluate optical character recognition (OCR) and machine translation (MT) embedded...graphic element and should not be transcribed. Elements that are not part of the primary text such as handwritten annotations or stamps should not be

  13. ASM Based Synthesis of Handwritten Arabic Text Pages

    PubMed Central

    Al-Hamadi, Ayoub; Elzobi, Moftah; El-etriby, Sherif; Ghoneim, Ahmed

    2015-01-01

    Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods that have individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-Spline interpolation and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever sufficient natural ground-truthed data is not available. PMID:26295059

  14. ASM Based Synthesis of Handwritten Arabic Text Pages.

    PubMed

    Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed

    2015-01-01

    Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods that have individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-Spline interpolation and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever sufficient natural ground-truthed data is not available.

  15. Maximum mutual information estimation of a simplified hidden MRF for offline handwritten Chinese character recognition

    NASA Astrophysics Data System (ADS)

    Xiong, Yan; Reichenbach, Stephen E.

    1999-01-01

    Understanding of hand-written Chinese characters is at such a primitive stage that models include some assumptions about hand-written Chinese characters that are simply false. Maximum Likelihood Estimation (MLE) may therefore not be an optimal method for hand-written Chinese character recognition. This concern motivates the research effort to consider alternative criteria. Maximum Mutual Information Estimation (MMIE) is an alternative method for parameter estimation that does not derive its rationale from presumed model correctness, but instead examines the pattern-modeling problem in automatic recognition systems from an information-theoretic point of view. The objective of MMIE is to find a set of parameters such that the resultant model allows the system to derive from the observed data as much information as possible about the class. We consider MMIE for recognition of hand-written Chinese characters using a simplified hidden Markov Random Field. MMIE provides improved performance over MLE in this application.

  16. Recognition of Similar Shaped Handwritten Marathi Characters Using Artificial Neural Network

    NASA Astrophysics Data System (ADS)

    Jane, Archana P.; Pund, Mukesh A.

    2012-03-01

    The growing need for handwritten Marathi character recognition in Indian offices such as passport, railways, etc. has made it a vital area of research. Similarly shaped characters are more prone to misclassification. In this paper a novel method is provided to recognize handwritten Marathi characters based on feature extraction and an adaptive smoothing technique. Feature selection methods avoid unnecessary patterns in an image, whereas the adaptive smoothing technique forms smooth character shapes. Combining both approaches leads to better results. Previous studies show that no single technique achieves 100% accuracy in handwritten character recognition. This approach of combining adaptive smoothing and feature extraction gives better results (approximately 75-100) and the expected outcomes.

  17. What differs in visual recognition of handwritten vs. printed letters? An fMRI study.

    PubMed

    Longcamp, Marieke; Hlushchuk, Yevhen; Hari, Riitta

    2011-08-01

    In models of letter recognition, handwritten letters are considered as a particular font exemplar, not qualitatively different in their processing from printed letters. Yet, some data suggest that recognizing handwritten letters might rely on distinct processes, possibly related to motor knowledge. We applied functional magnetic resonance imaging to compare the neural correlates of perceiving handwritten letters vs. standard printed letters. Statistical analysis circumscribed to frontal brain regions involved in hand-movement triggering and execution showed that processing of handwritten letters is supported by a stronger activation of the left primary motor cortex and the supplementary motor area. At the whole-brain level, additional differences between handwritten and printed letters were observed in the right superior frontal, middle occipital, and parahippocampal gyri, and in the left inferior precentral and the fusiform gyri. The results are suggested to indicate embodiment of the visual perception of handwritten letters. Copyright © 2010 Wiley-Liss, Inc.

  18. Interpreting Chicken-Scratch: Lexical Access for Handwritten Words

    ERIC Educational Resources Information Center

    Barnhart, Anthony S.; Goldinger, Stephen D.

    2010-01-01

    Handwritten word recognition is a field of study that has largely been neglected in the psychological literature, despite its prevalence in society. Whereas studies of spoken word recognition almost exclusively employ natural, human voices as stimuli, studies of visual word recognition use synthetic typefaces, thus simplifying the process of word…

  19. Comparison of crisp and fuzzy character networks in handwritten word recognition

    NASA Technical Reports Server (NTRS)

    Gader, Paul; Mohamed, Magdi; Chiang, Jung-Hsien

    1992-01-01

    Experiments involving handwritten word recognition on words taken from images of handwritten address blocks from the United States Postal Service mailstream are described. The word recognition algorithm relies on the use of neural networks at the character level. The neural networks are trained using crisp and fuzzy desired outputs. The fuzzy outputs were defined using a fuzzy k-nearest neighbor algorithm. The crisp networks slightly outperformed the fuzzy networks at the character level but the fuzzy networks outperformed the crisp networks at the word level.

  20. Arabic handwritten: pre-processing and segmentation

    NASA Astrophysics Data System (ADS)

    Maliki, Makki; Jassim, Sabah; Al-Jawad, Naseer; Sellahewa, Harin

    2012-06-01

    This paper is concerned with pre-processing and segmentation tasks that influence the performance of Optical Character Recognition (OCR) systems and handwritten/printed text recognition. In Arabic, these tasks are adversely affected by the fact that many words are made up of sub-words, that many sub-words have one or more associated diacritics that are not connected to the sub-word's body, and that there can be multiple instances of sub-word overlap. To overcome these problems we investigate and develop segmentation techniques that first segment a document into sub-words, link the diacritics with their sub-words, and remove possible overlaps between words and sub-words. We shall also investigate two approaches for pre-processing tasks to estimate sub-word baselines and to determine parameters that yield appropriate slope correction and slant removal. We shall investigate the use of linear regression on sub-word pixels to determine their central x and y coordinates, as well as their high-density part. We also develop a new incremental rotation procedure to be performed on sub-words that determines the best rotation angle needed to realign baselines. We shall demonstrate the benefits of these proposals by conducting extensive experiments on publicly available databases and in-house created databases. These algorithms help improve character segmentation accuracy by transforming handwritten Arabic text into a form that could benefit from analysis of printed text.
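
    The baseline estimation step mentioned in this abstract can be sketched as an ordinary least-squares line fit over foreground pixel coordinates, followed by a rotation that levels the fitted line. This is only an illustration under simplifying assumptions: it uses a single direct rotation instead of the incremental procedure the authors describe, and the names fit_baseline, skew_angle_degrees and rotate_points are hypothetical.

```python
import math


def fit_baseline(pixels):
    """Least-squares line y = slope * x + intercept through (x, y) pixels."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    if sxx == 0:
        raise ValueError("all pixels share one x coordinate; slope undefined")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def skew_angle_degrees(pixels):
    """Angle of the fitted baseline relative to the x axis, in degrees."""
    slope, _ = fit_baseline(pixels)
    return math.degrees(math.atan(slope))


def rotate_points(pixels, angle_degrees):
    """Rotate the pixels about their centroid by the given angle."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    c = math.cos(math.radians(angle_degrees))
    s = math.sin(math.radians(angle_degrees))
    return [(cx + (x - cx) * c - (y - cy) * s,
             cy + (x - cx) * s + (y - cy) * c) for x, y in pixels]
```

    Rotating a sub-word by the negative of its estimated skew angle, as in rotate_points(pixels, -skew_angle_degrees(pixels)), yields points whose fitted baseline is approximately horizontal. Image coordinates usually have the y axis pointing down, so the sign convention may need to be flipped in practice.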

  1. Handwritten-word spotting using biologically inspired features.

    PubMed

    van der Zant, Tijn; Schomaker, Lambert; Haak, Koen

    2008-11-01

    For quick access to new handwritten collections, current handwriting recognition methods are too cumbersome. They cannot deal with the lack of labeled data and would require extensive laboratory training for each individual script, style, language and collection. We propose a biologically inspired whole-word recognition method which is used to incrementally elicit word labels in a live, web-based annotation system, named Monk. Since human labor should be minimized given the massive amount of image data, it becomes important to rely on robust perceptual mechanisms in the machine. Recent computational models of the neurophysiology of vision are applied to isolated word classification. A primate cortex-like mechanism makes it possible to classify text-images that have a low frequency of occurrence. Typically these images are the most difficult to retrieve; they often contain named entities and are regarded as the most important to people. Standard pattern-recognition technology usually cannot deal with these text-images if there are not enough labeled instances. The results of this retrieval system are compared to normalized word-image matching and appear to be very promising.

  2. Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research.

    PubMed

    Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif

    2016-03-11

    Document analysis tasks such as pattern recognition, word spotting or segmentation, require comprehensive databases for training and validation. Not only variations in writing style but also the used list of words is of importance in the case that training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in the sense of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. However, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods. Each requires particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation based system that recognizes handwritten Arabic words. We found that a modification of an Active Shape Model based character classifier, which we proposed earlier, improves the word recognition accuracy. Further improvements are achieved by using a vocabulary of the 50,000 most common Arabic words for error correction.

  3. Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research

    PubMed Central

    Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-etriby, Sherif

    2016-01-01

    Document analysis tasks such as pattern recognition, word spotting or segmentation, require comprehensive databases for training and validation. Not only variations in writing style but also the used list of words is of importance in the case that training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in the sense of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. However, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods. Each requires particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation based system that recognizes handwritten Arabic words. We found that a modification of an Active Shape Model based character classifiers—that we proposed earlier—improves the word recognition accuracy. Further improvements are achieved, by using a vocabulary of the 50,000 most common Arabic words for error correction. PMID:26978368

  4. Handwritten numeral databases of Indian scripts and multistage recognition of mixed numerals.

    PubMed

    Bhattacharya, Ujjwal; Chaudhuri, B B

    2009-03-01

    This article primarily concerns the problem of isolated handwritten numeral recognition of major Indian scripts. The principal contributions presented here are (a) pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, (b) a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers and (c) application of (b) for the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English. The present databases include respectively 22,556 and 23,392 handwritten isolated numeral samples of Devanagari and Bangla collected from real-life situations, and these can be made available free of cost to researchers of other academic institutions. In the proposed scheme, a numeral is subjected to three multilayer perceptron classifiers corresponding to three coarse-to-fine resolution levels in a cascaded manner. If rejection occurs even at the highest resolution, another multilayer perceptron is used as the final attempt to recognize the input numeral by combining the outputs of the three classifiers of the previous stages. This scheme has been extended to the situation when the script of a document is not known a priori or the numerals written on a document belong to different scripts. Handwritten numerals in mixed scripts are frequently found in Indian postal mail and table-form documents.
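
    The control flow of such a cascade with rejection can be sketched as follows. The sketch makes several assumptions that are not in the abstract: each stage is any callable returning one score per class, a stage rejects when its top score falls below a fixed threshold, and the final combining stage is a simple average rather than a trained multilayer perceptron. The names cascade_classify and average_combiner are hypothetical.

```python
def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def average_combiner(outputs):
    """Stand-in for the final combining classifier: average the stage scores."""
    return [sum(column) / len(outputs) for column in zip(*outputs)]


def cascade_classify(features_by_level, classifiers, combiner, threshold=0.9):
    """Run coarse-to-fine classifiers and stop at the first confident stage.

    features_by_level: one feature vector per resolution level, coarse first.
    classifiers: one callable per level, each returning a list of class scores.
    Returns (label, stage), where stage counts from 1 and the combiner is the
    last stage. The threshold-based rejection rule is an assumption.
    """
    outputs = []
    for features, classify in zip(features_by_level, classifiers):
        scores = classify(features)
        outputs.append(scores)
        label = argmax(scores)
        if scores[label] >= threshold:
            return label, len(outputs)
    # Every resolution level rejected: combine all stage outputs.
    return argmax(combiner(outputs)), len(outputs) + 1
```

    A confident coarse stage therefore ends the cascade early, while an input rejected at all three levels falls through to the combiner.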

  5. Boosting bonsai trees for handwritten/printed text discrimination

    NASA Astrophysics Data System (ADS)

    Ricquebourg, Yann; Raymond, Christian; Poirriez, Baptiste; Lemaitre, Aurélie; Coüasnon, Bertrand

    2013-12-01

    Boosting over decision stumps has proved its efficiency in Natural Language Processing, essentially with symbolic features, and its good properties (fast, few and non-critical parameters, not sensitive to over-fitting) could be of great interest in the numeric world of pixel images. In this article we investigated the use of boosting over small decision trees in image classification for the discrimination of handwritten/printed text. We then conducted experiments to compare it to the usual SVM-based classification, revealing convincing results with very close performance, but with faster predictions and far less black-box behavior. These promising results encourage the use of this classifier in more complex recognition tasks such as multiclass problems.

  6. Handwritten digits recognition using HMM and PSO based on storks

    NASA Astrophysics Data System (ADS)

    Yan, Liao; Jia, Zhenhong; Yang, Jie; Pang, Shaoning

    2010-07-01

    A new method for handwritten digit recognition based on the hidden Markov model (HMM) and particle swarm optimization (PSO) is proposed. The method defines 24 directional strokes, which compensates for the sensitivity of traditional methods to the choice of starting point and also reduces the ambiguity caused by shakes. It makes use of the excellent global convergence of PSO, improving the probability of finding the optimum and avoiding local optima. Experimental results demonstrate that, compared with traditional methods, the proposed method improves the recognition rate for most handwritten digits.

  7. Handwritten Word Recognition Using Multi-view Analysis

    NASA Astrophysics Data System (ADS)

    de Oliveira, J. J.; de A. Freitas, C. O.; de Carvalho, J. M.; Sabourin, R.

    This paper contributes to the problem of efficiently recognizing handwritten words from a limited-size lexicon. To that end, a multiple classifier system has been developed that analyzes the words at three different approximation levels, in order to obtain a computational approach inspired by the human reading process. For each approximation level, a three-module architecture composed of a zoning mechanism (pseudo-segmenter), a feature extractor and a classifier is defined. The proposed application is the recognition of the Portuguese handwritten names of the months, for which a best recognition rate of 97.7% was obtained using classifier combination.

  8. Construction of language models for an handwritten mail reading system

    NASA Astrophysics Data System (ADS)

    Morillot, Olivier; Likforman-Sulem, Laurence; Grosicki, Emmanuèle

    2012-01-01

    This paper presents a system for the recognition of unconstrained handwritten mails. The main part of this system is an HMM recognizer which uses trigraphs to model contextual information. This recognition system does not require any segmentation into words or characters and directly works at line level. To take into account linguistic information and enhance performance, a language model is introduced. This language model is based on bigrams and built from training document transcriptions only. Different experiments with various vocabulary sizes and language models have been conducted. Word Error Rate and Perplexity values are compared to show the interest of specific language models, fit to handwritten mail recognition task.
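
    The bigram language model and the perplexity measure mentioned above can be illustrated with a minimal sketch. It assumes add-one (Laplace) smoothing and the boundary tokens `BOS`/`EOS`, neither of which is stated in the abstract; the function names are hypothetical, and words never seen in training are not handled specially.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["BOS"] + words + ["EOS"]
        contexts.update(tokens[:-1])               # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))
    vocab = {w for words in sentences for w in words} | {"EOS"}
    return contexts, bigrams, vocab

def bigram_prob(prev, word, contexts, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev); the smoothing is an assumption."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))

def perplexity(sentences, contexts, bigrams, vocab):
    """Exponential of the average negative log-probability per predicted token."""
    log_sum, count = 0.0, 0
    for words in sentences:
        tokens = ["BOS"] + words + ["EOS"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log(bigram_prob(prev, word, contexts, bigrams, vocab))
            count += 1
    return math.exp(-log_sum / count)

# Toy training transcriptions (hypothetical, not from the paper).
model = train_bigram([["dear", "sir"], ["dear", "madam"]])
in_domain = perplexity([["dear", "sir"]], *model)
out_of_domain = perplexity([["sir", "dear"]], *model)
```

    A lower perplexity on held-out text indicates that the model fits that text better, which is why the word order seen in training scores lower here than the reversed order.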

  9. Robust recognition of handwritten numerals based on dual cooperative network

    NASA Technical Reports Server (NTRS)

    Lee, Sukhan; Choi, Yeongwoo

    1992-01-01

    An approach to robust recognition of handwritten numerals using two operating parallel networks is presented. The first network uses inputs in Cartesian coordinates, and the second network uses the same inputs transformed into polar coordinates. How the proposed approach realizes the robustness to local and global variations of input numerals by handling inputs both in Cartesian coordinates and in its transformed Polar coordinates is described. The required network structures and its learning scheme are discussed. Experimental results show that by tracking only a small number of distinctive features for each teaching numeral in each coordinate, the proposed system can provide robust recognition of handwritten numerals.
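
    The Cartesian-to-polar transformation that feeds the second network can be sketched as follows. The choice of origin (the centroid by default) and the function name `to_polar` are assumptions for illustration; the abstract does not say how the transform is anchored.

```python
import math

def to_polar(points, center=None):
    """Convert (x, y) points to (r, theta) pairs about a center (default: centroid)."""
    if center is None:
        cx = sum(x for x, _ in points) / len(points)
        cy = sum(y for _, y in points) / len(points)
    else:
        cx, cy = center
    return [(math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx))
            for x, y in points]

# Rotating the input by 90 degrees about the origin leaves r unchanged
# and shifts theta by pi/2.
original = to_polar([(1.0, 0.0), (2.0, 2.0)], center=(0.0, 0.0))
rotated = to_polar([(0.0, 1.0), (-2.0, 2.0)], center=(0.0, 0.0))
```

    This illustrates one plausible reason for pairing the two representations: a global rotation, which changes every Cartesian coordinate, becomes a uniform shift of the angle in polar form.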

  10. Post processing for offline Chinese handwritten character string recognition

    NASA Astrophysics Data System (ADS)

    Wang, YanWei; Ding, XiaoQing; Liu, ChangSong

    2012-01-01

    Offline Chinese handwritten character string recognition is one of the most important research fields in pattern recognition. Due to the free writing style, large variability in character shapes and different geometric characteristics, Chinese handwritten character string recognition is a challenging problem. Among current methods, the over-segmentation and merging method, which integrates geometric information, character recognition information and contextual information, shows promising results. It is found experimentally that a large share of errors are segmentation errors, which mainly occur around non-Chinese characters. In a Chinese character string, there are not only wide characters, namely Chinese characters, but also narrow characters such as digits and letters of the alphabet. The segmentation errors are mainly caused by the uniform geometric model imposed on all segmented candidate characters. To solve this problem, post processing is employed to improve the recognition accuracy of narrow characters. On one hand, separate geometric models are established for wide characters and narrow characters. Under these multi-geometric models, narrow characters are less prone to being merged. On the other hand, top-rank recognition results of candidate paths are integrated to boost the final recognition of narrow characters. The post processing method is evaluated on two datasets, totaling 1405 handwritten address strings. The wide character recognition accuracy improved slightly, and the narrow character recognition accuracy increased by 10.41% and 10.03% respectively. This indicates that the post processing method is effective in improving the recognition accuracy of narrow characters.

  11. An online handwriting recognition system for Turkish

    NASA Astrophysics Data System (ADS)

    Vural, Esra; Erdogan, Hakan; Oflazer, Kemal; Yanikoglu, Berrin A.

    2004-12-01

    Despite recent developments in Tablet PC technology, there have not been any applications for recognizing handwriting in Turkish. In this paper, we present an online handwritten text recognition system for Turkish, developed using the Tablet PC interface. Although the system is developed for Turkish, the issues addressed are common to online handwriting recognition systems in general. Several dynamic features are extracted from the handwriting data for each recorded point, and Hidden Markov Models (HMMs) are used to train letter and word models. We experimented with various features and HMM topologies, and report on the effects of these experiments. We started with the first and second derivatives of the x and y coordinates and the relative change in pen pressure as initial features. We found that two additional features, namely the number of neighboring points and the relative height of each point with respect to the baseline, improve the recognition rate. In addition, extracting features within strokes and using a skipping-state topology improve system performance as well. The improved system performance is 94% in recognizing handwritten words from a 1000-word lexicon.
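
    The first and second derivatives of the pen coordinates can be approximated from sampled points. The sketch below assumes uniformly sampled points and plain finite differences; the abstract does not specify the estimator, and the function names are hypothetical.

```python
def differences(values):
    """First-order finite differences between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]

def point_features(xs, ys):
    """Per-point tuples (dx, dy, ddx, ddy) computed within a single stroke."""
    dx, dy = differences(xs), differences(ys)
    ddx, ddy = differences(dx), differences(dy)
    # zip stops at the shortest list, so each tuple pairs a first difference
    # with the second difference that starts at the same sample.
    return list(zip(dx, dy, ddx, ddy))

# A stroke accelerating to the right along a horizontal line.
features = point_features([0, 1, 3, 6], [0, 0, 0, 0])
# features == [(1, 0, 1, 0), (2, 0, 1, 0)]
```

    Computing the differences stroke by stroke, rather than across pen lifts, matches the abstract's remark that extracting features within strokes helps.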

  12. An online handwriting recognition system for Turkish

    NASA Astrophysics Data System (ADS)

    Vural, Esra; Erdogan, Hakan; Oflazer, Kemal; Yanikoglu, Berrin A.

    2005-01-01

    Despite recent developments in Tablet PC technology, there have not been any applications for recognizing handwriting in Turkish. In this paper, we present an online handwritten text recognition system for Turkish, developed using the Tablet PC interface. Although the system is developed for Turkish, the issues addressed are common to online handwriting recognition systems in general. Several dynamic features are extracted from the handwriting data for each recorded point, and Hidden Markov Models (HMMs) are used to train letter and word models. We experimented with various features and HMM topologies, and report on the effects of these experiments. We started with the first and second derivatives of the x and y coordinates and the relative change in pen pressure as initial features. We found that two additional features, namely the number of neighboring points and the relative height of each point with respect to the baseline, improve the recognition rate. In addition, extracting features within strokes and using a skipping-state topology improve system performance as well. The improved system performance is 94% in recognizing handwritten words from a 1000-word lexicon.

  13. Development of an optical character recognition pipeline for handwritten form fields from an electronic health record.

    PubMed

    Rasmussen, Luke V; Peissig, Peggy L; McCarty, Catherine A; Starren, Justin

    2012-06-01

    Although the penetration of electronic health records is increasing rapidly, much of the historical medical record is only available in handwritten notes and forms, which require labor-intensive, human chart abstraction for some clinical research. The few previous studies on automated extraction of data from these handwritten notes have focused on monolithic, custom-developed recognition systems or third-party systems that require proprietary forms. We present an optical character recognition processing pipeline, which leverages the capabilities of existing third-party optical character recognition engines, and provides the flexibility offered by a modular custom-developed system. The system was configured and run on a selected set of form fields extracted from a corpus of handwritten ophthalmology forms. The processing pipeline allowed multiple configurations to be run, with the optimal configuration consisting of the Nuance and LEADTOOLS engines running in parallel with a positive predictive value of 94.6% and a sensitivity of 13.5%. While limitations exist, preliminary experience from this project yielded insights on the generalizability and applicability of integrating multiple, inexpensive general-purpose third-party optical character recognition engines in a modular pipeline.
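
    The two reported metrics are simple ratios of confusion counts. The sketch below uses hypothetical counts, not the study's data, to show how a configuration can have a high positive predictive value and a low sensitivity at the same time.

```python
def positive_predictive_value(true_pos, false_pos):
    """Share of values returned by the pipeline that are correct."""
    return true_pos / (true_pos + false_pos)

def sensitivity(true_pos, false_neg):
    """Share of all form-field values that the pipeline returned correctly."""
    return true_pos / (true_pos + false_neg)

# Hypothetical counts: the pipeline answers rarely, but is usually right when it does.
ppv = positive_predictive_value(true_pos=8, false_pos=2)   # 0.8
sens = sensitivity(true_pos=8, false_neg=32)               # 0.2
```

    One reading of the reported figures, stated here as an interpretation only, is that requiring agreement between engines filters out most wrong answers at the cost of returning no answer for most fields.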

  14. Development of an optical character recognition pipeline for handwritten form fields from an electronic health record

    PubMed Central

    Peissig, Peggy L; McCarty, Catherine A; Starren, Justin

    2011-01-01

    Background: Although the penetration of electronic health records is increasing rapidly, much of the historical medical record is only available in handwritten notes and forms, which require labor-intensive, human chart abstraction for some clinical research. The few previous studies on automated extraction of data from these handwritten notes have focused on monolithic, custom-developed recognition systems or third-party systems that require proprietary forms. Methods: We present an optical character recognition processing pipeline, which leverages the capabilities of existing third-party optical character recognition engines, and provides the flexibility offered by a modular custom-developed system. The system was configured and run on a selected set of form fields extracted from a corpus of handwritten ophthalmology forms. Observations: The processing pipeline allowed multiple configurations to be run, with the optimal configuration consisting of the Nuance and LEADTOOLS engines running in parallel with a positive predictive value of 94.6% and a sensitivity of 13.5%. Discussion: While limitations exist, preliminary experience from this project yielded insights on the generalizability and applicability of integrating multiple, inexpensive general-purpose third-party optical character recognition engines in a modular pipeline. PMID:21890871

  15. Fuzzy Logic Module of Convolutional Neural Network for Handwritten Digits Recognition

    NASA Astrophysics Data System (ADS)

    Popko, E. A.; Weinstein, I. A.

    2016-08-01

    Optical character recognition is one of the important issues in the field of pattern recognition. This paper presents a method for recognizing handwritten digits based on a convolutional neural network model. An integrated fuzzy logic module based on a structural approach was developed. The system architecture uses this module to adjust the output of the neural network and improve the quality of symbol identification. The proposed algorithm was shown to be flexible, and a high recognition rate of 99.23% was achieved.

  16. Unconstrained handwritten numeral recognition based on radial basis competitive and cooperative networks with spatio-temporal feature representation.

    PubMed

    Lee, S; Pan, J J

    1996-01-01

    This paper presents a new approach to representation and recognition of handwritten numerals. The approach first transforms a two-dimensional (2-D) spatial representation of a numeral into a three-dimensional (3-D) spatio-temporal representation by identifying the tracing sequence based on a set of heuristic rules acting as transformation operators. A multiresolution critical-point segmentation method is then proposed to extract local feature points, at varying degrees of scale and coarseness. A new neural network architecture, referred to as radial-basis competitive and cooperative network (RCCN), is presented especially for handwritten numeral recognition. RCCN is a globally competitive and locally cooperative network with the capability of self-organizing hidden units to progressively achieve desired network performance, and functions as a universal approximator of arbitrary input-output mappings. Three types of RCCNs are explored: input-space RCCN (IRCCN), output-space RCCN (ORCCN), and bidirectional RCCN (BRCCN). Experiments against handwritten zip code numerals acquired by the U.S. Postal Service indicated that the proposed method is robust in terms of variations, deformations, transformations, and corruption, achieving about 97% recognition rate.

  17. Interpreting Chicken-Scratch: Lexical Access for Handwritten Words

    PubMed Central

    Barnhart, Anthony S.; Goldinger, Stephen D.

    2014-01-01

    Handwritten word recognition is a field of study that has largely been neglected in the psychological literature, despite its prevalence in society. Whereas studies of spoken word recognition almost exclusively employ natural, human voices as stimuli, studies of visual word recognition use synthetic typefaces, thus simplifying the process of word recognition. The current study examined the effects of handwriting on a series of lexical variables thought to influence bottom-up and top-down processing, including word frequency, regularity, bidirectional consistency, and imageability. The results suggest that the natural physical ambiguity of handwritten stimuli forces a greater reliance on top-down processes, because almost all effects were magnified, relative to conditions with computer print. These findings suggest that processes of word perception naturally adapt to handwriting, compensating for physical ambiguity by increasing top-down feedback. PMID:20695708

  18. Basic test framework for the evaluation of text line segmentation and text parameter extraction.

    PubMed

    Brodić, Darko; Milivojević, Dragan R; Milivojević, Zoran

    2010-01-01

    Text line segmentation is an essential stage in off-line optical character recognition (OCR) systems. It is a key because inaccurately segmented text lines will lead to OCR failure. Text line segmentation of handwritten documents is a complex and diverse problem, complicated by the nature of handwriting. Hence, text line segmentation is a leading challenge in handwritten document image processing. Due to inconsistencies in measurement and evaluation of text segmentation algorithm quality, some basic set of measurement methods is required. Currently, there is no commonly accepted one and all algorithm evaluation is custom oriented. In this paper, a basic test framework for the evaluation of text feature extraction algorithms is proposed. This test framework consists of a few experiments primarily linked to text line segmentation, skew rate and reference text line evaluation. Although they are mutually independent, the results obtained are strongly cross linked. In the end, its suitability for different types of letters and languages as well as its adaptability are its main advantages. Thus, the paper presents an efficient evaluation method for text analysis algorithms.

  19. Basic Test Framework for the Evaluation of Text Line Segmentation and Text Parameter Extraction

    PubMed Central

    Brodić, Darko; Milivojević, Dragan R.; Milivojević, Zoran

    2010-01-01

    Text line segmentation is an essential stage in off-line optical character recognition (OCR) systems. It is a key because inaccurately segmented text lines will lead to OCR failure. Text line segmentation of handwritten documents is a complex and diverse problem, complicated by the nature of handwriting. Hence, text line segmentation is a leading challenge in handwritten document image processing. Due to inconsistencies in measurement and evaluation of text segmentation algorithm quality, some basic set of measurement methods is required. Currently, there is no commonly accepted one and all algorithm evaluation is custom oriented. In this paper, a basic test framework for the evaluation of text feature extraction algorithms is proposed. This test framework consists of a few experiments primarily linked to text line segmentation, skew rate and reference text line evaluation. Although they are mutually independent, the results obtained are strongly cross linked. In the end, its suitability for different types of letters and languages as well as its adaptability are its main advantages. Thus, the paper presents an efficient evaluation method for text analysis algorithms. PMID:22399932

  20. Experiments on Urdu Text Recognition

    NASA Astrophysics Data System (ADS)

    Mukhtar, Omar; Setlur, Srirangaraj; Govindaraju, Venu

    Urdu is a language spoken in the Indian subcontinent by an estimated 130-270 million speakers. At the spoken level, Urdu and Hindi are considered dialects of a single language because of shared vocabulary and the similarity in grammar. At the written level, however, Urdu is much closer to Arabic because it is written in Nastaliq, the calligraphic style of the Persian-Arabic script. Therefore, a speaker of Hindi can understand spoken Urdu but may not be able to read written Urdu because Hindi is written in Devanagari script, whereas an Arabic writer can read the written words but may not understand the spoken Urdu. In this chapter we present an overview of written Urdu. Prior research in handwritten Urdu OCR is very limited. We present (perhaps) the first system for recognizing handwritten Urdu words. On a data set of about 1300 handwritten words, we achieved an accuracy of 70% for the top choice, and 82% for the top three choices.
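
    The top-choice and top-three figures quoted above are top-k accuracies. A minimal sketch, with hypothetical word labels and a hypothetical function name, shows how such a figure is computed from ranked recognizer output.

```python
def top_k_accuracy(ranked_predictions, truths, k):
    """Fraction of samples whose true label is among the first k ranked candidates."""
    hits = sum(truth in ranked[:k]
               for ranked, truth in zip(ranked_predictions, truths))
    return hits / len(truths)

# Hypothetical ranked candidate lists for three word images.
ranked = [["w1", "w2", "w3"], ["w3", "w1", "w2"], ["w2", "w3", "w4"]]
truths = ["w1", "w1", "w5"]
top1 = top_k_accuracy(ranked, truths, k=1)   # 1 of 3 correct
top3 = top_k_accuracy(ranked, truths, k=3)   # 2 of 3 correct
```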

  1. Optical character recognition of handwritten Arabic using hidden Markov models

    NASA Astrophysics Data System (ADS)

    Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.; Olama, Mohammed M.

    2011-04-01

    The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
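
    The decoding step described above, finding the most probable hidden sequence given a transition matrix and an emission matrix, is the standard Viterbi algorithm. The sketch below is a generic textbook version with toy states and probabilities that are purely hypothetical; it is not the custom HMM implementation from the paper.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence and its probability."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            prob, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (prob * emit_p[s][obs], arg)
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    probability = best[-1][state][0]
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1], probability

# Toy model: hidden units "A"/"B" emit coarse shape features "x", "y", "z".
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.5, "y": 0.4, "z": 0.1},
          "B": {"x": 0.1, "y": 0.3, "z": 0.6}}
path, probability = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
# path == ["A", "A", "B"], probability is about 0.01512
```

    In the setting of the abstract, the transition probabilities would come from corpus statistics and the emission probabilities from the character feature results; for long sequences a log-probability variant avoids numerical underflow.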

  2. Optical character recognition of handwritten Arabic using hidden Markov models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.

    2011-01-01

    The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.

  3. Recognition of degraded handwritten digits using dynamic Bayesian networks

    NASA Astrophysics Data System (ADS)

    Likforman-Sulem, Laurence; Sigelle, Marc

    2007-01-01

    We investigate in this paper the application of dynamic Bayesian networks (DBNs) to the recognition of handwritten digits. The main idea is to couple two separate HMMs into various architectures. First, a vertical HMM and a horizontal HMM are built observing the evolving streams of image columns and image rows respectively. Then, two coupled architectures are proposed to model interactions between these two streams and to capture the 2D nature of character images. Experiments performed on the MNIST handwritten digit database show that coupled architectures yield better recognition performances than non-coupled ones. Additional experiments conducted on artificially degraded (broken) characters demonstrate that coupled architectures better cope with such degradation than non coupled ones and than discriminative methods such as SVMs.

  4. Signature Verification Based on Handwritten Text Recognition

    NASA Astrophysics Data System (ADS)

    Viriri, Serestina; Tapamo, Jules-R.

    Signatures continue to be an important biometric trait because they remain widely used, primarily for authenticating the identity of human beings. This paper presents an efficient text-based directional signature recognition algorithm which verifies signatures, even when they are composed of special unconstrained cursive characters which are superimposed and embellished. This algorithm extends the character-based signature verification technique. The experiments carried out on the GPDS signature database and an additional database created from signatures captured using the ePadInk tablet show that the approach is effective and efficient, with a positive verification rate of 94.95%.

  5. A Dynamic Bayesian Network Based Structural Learning towards Automated Handwritten Digit Recognition

    NASA Astrophysics Data System (ADS)

    Pauplin, Olivier; Jiang, Jianmin

    Pattern recognition using Dynamic Bayesian Networks (DBNs) is currently a growing area of study. In this paper, we present DBN models trained for classification of handwritten digit characters. The structure of these models is partly inferred from the training data of each class of digit before performing parameter learning. Classification results are presented for the four described models.

  6. Handwritten text line segmentation by spectral clustering

    NASA Astrophysics Data System (ADS)

    Han, Xuecheng; Yao, Hui; Zhong, Guoqiang

    2017-02-01

    Since handwritten text lines are generally skewed and not obviously separated, text line segmentation of handwritten document images is still a challenging problem. In this paper, we propose a novel text line segmentation algorithm based on the spectral clustering. Given a handwritten document image, we convert it to a binary image first, and then compute the adjacent matrix of the pixel points. We apply spectral clustering on this similarity metric and use the orthogonal kmeans clustering algorithm to group the text lines. Experiments on Chinese handwritten documents database (HIT-MW) demonstrate the effectiveness of the proposed method.

  7. Combining approaches to on-line handwriting information retrieval

    NASA Astrophysics Data System (ADS)

    Peña Saldarriaga, Sebastián; Viard-Gaudin, Christian; Morin, Emmanuel

    2010-01-01

    In this work, we propose to combine two quite different approaches for retrieving handwritten documents. Our hypothesis is that different retrieval algorithms should retrieve different sets of documents for the same query. Therefore, significant improvements in retrieval performance can be expected. The first approach is based on information retrieval techniques carried out on the noisy texts obtained through handwriting recognition, while the second approach is recognition-free, using a word spotting algorithm. Results show that for texts having a word error rate (WER) lower than 23%, the performance obtained with the combined system is close to the performance obtained on clean digital texts. In addition, for poorly recognized texts (WER > 52%), an improvement of nearly 17% can be observed with respect to the best available baseline method.
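
    The word error rate used above to grade recognition quality is conventionally the word-level edit distance divided by the reference length. A minimal sketch, assuming whitespace tokenization and a non-empty reference (the function name is hypothetical):

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,                         # deletion
                            curr[j - 1] + 1,                     # insertion
                            prev[j - 1] + (ref_word != hyp_word)))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

# One substitution ("sat" -> "sit") and one deletion ("the") over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

    Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.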

  8. Automatic extraction of numeric strings in unconstrained handwritten document images

    NASA Astrophysics Data System (ADS)

    Haji, M. Mehdi; Bui, Tien D.; Suen, Ching Y.

    2011-01-01

    Numeric strings such as identification numbers carry vital pieces of information in documents. In this paper, we present a novel algorithm for automatic extraction of numeric strings in unconstrained handwritten document images. The algorithm has two main phases: pruning and verification. In the pruning phase, the algorithm first performs a new segment-merge procedure on each text line, and then using a new regularity measure, it prunes all sequences of characters that are unlikely to be numeric strings. The segment-merge procedure is composed of two modules: a new explicit character segmentation algorithm which is based on analysis of skeletal graphs and a merging algorithm which is based on graph partitioning. All the candidate sequences that pass the pruning phase are sent to a recognition-based verification phase for the final decision. The recognition is based on a coarse-to-fine approach using probabilistic RBF networks. We developed our algorithm for the processing of real-world documents where letters and digits may be connected or broken in a document. The effectiveness of the proposed approach is shown by extensive experiments done on a real-world database of 607 documents which contains handwritten, machine-printed and mixed documents with different types of layouts and levels of noise.

  9. Evaluating structural pattern recognition for handwritten math via primitive label graphs

    NASA Astrophysics Data System (ADS)

    Zanibbi, Richard; Mouchère, Harold; Viard-Gaudin, Christian

    2013-01-01

    Currently, structural pattern recognizer evaluations compare graphs of detected structure to target structures (i.e. ground truth) using recognition rates, recall and precision for object segmentation, classification and relationships. In document recognition, these target objects (e.g. symbols) are frequently comprised of multiple primitives (e.g. connected components, or strokes for online handwritten data), but current metrics do not characterize errors at the primitive level, from which object-level structure is obtained. Primitive label graphs are directed graphs defined over primitives and primitive pairs. We define new metrics obtained by Hamming distances over label graphs, which allow classification, segmentation and parsing errors to be characterized separately, or using a single measure. Recall and precision for detected objects may also be computed directly from label graphs. We illustrate the new metrics by comparing a new primitive-level evaluation to the symbol-level evaluation performed for the CROHME 2012 handwritten math recognition competition. A Python-based set of utilities for evaluating, visualizing and translating label graphs is publicly available.
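
    The idea of a Hamming distance over label graphs can be sketched with plain dictionaries: one label per primitive and one per ordered primitive pair, with disagreements counted separately. This is a simplified illustration under assumed conventions (the label names, the `"_"` default for unlabeled entries and the function name are hypothetical); it is not the authors' published utilities.

```python
def label_graph_hamming(nodes_a, edges_a, nodes_b, edges_b, default="_"):
    """Count disagreeing node labels and edge labels between two label graphs."""
    node_errors = sum(nodes_a.get(p, default) != nodes_b.get(p, default)
                      for p in set(nodes_a) | set(nodes_b))
    edge_errors = sum(edges_a.get(e, default) != edges_b.get(e, default)
                      for e in set(edges_a) | set(edges_b))
    return node_errors, edge_errors

# Three strokes: s1 and s2 form the symbol "x", s3 is a superscript "2".
target_nodes = {"s1": "x", "s2": "x", "s3": "2"}
target_edges = {("s1", "s2"): "*", ("s2", "s1"): "*",   # "*" marks strokes of one symbol
                ("s1", "s3"): "Sup", ("s2", "s3"): "Sup"}
# Recognizer output: s3 misclassified as "z" and placed to the right instead.
output_nodes = {"s1": "x", "s2": "x", "s3": "z"}
output_edges = {("s1", "s2"): "*", ("s2", "s1"): "*",
                ("s1", "s3"): "R", ("s2", "s3"): "R"}
errors = label_graph_hamming(target_nodes, target_edges, output_nodes, output_edges)
# errors == (1, 2): one classification error, two relationship errors
```

    Keeping the two counts separate mirrors the abstract's point that classification and structural errors can be characterized separately or summed into a single measure.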

  10. Optical character recognition with feature extraction and associative memory matrix

    NASA Astrophysics Data System (ADS)

    Sasaki, Osami; Shibahara, Akihito; Suzuki, Takamasa

    1998-06-01

    A method is proposed in which handwritten characters are recognized using feature extraction and an associative memory matrix. In feature extraction, simple processes such as shifting and superimposing patterns are executed. A memory matrix is generated with singular value decomposition and by modifying small singular values. The method is optically implemented with two liquid crystal displays. Experimental results for the recognition of 25 handwritten alphabet characters clearly show the effectiveness of the method.

  11. Background feature descriptor for offline handwritten numeral recognition

    NASA Astrophysics Data System (ADS)

    Ming, Delie; Wang, Hao; Tian, Tian; Jie, Feiran; Lei, Bo

    2011-11-01

    This paper puts forward an offline handwritten numeral recognition method based on background structural descriptor (sixteen-value numerical background expression). Through encoding the background pixels in the image according to a certain rule, 16 different eigenvalues were generated, which reflected the background condition of every digit, then reflected the structural features of the digits. Through pattern language description of images by these features, automatic segmentation of overlapping digits and numeral recognition can be realized. This method is characterized by great deformation resistant ability, high recognition speed and easy realization. Finally, the experimental results and conclusions are presented. The experimental results of recognizing datasets from various practical application fields reflect that with this method, a good recognition effect can be achieved.

  12. Eye movements when reading sentences with handwritten words.

    PubMed

    Perea, Manuel; Marcet, Ana; Uixera, Beatriz; Vergara-Martínez, Marta

    2016-10-17

    The examination of how we read handwritten words (i.e., the original form of writing) has typically been disregarded in the literature on reading. Previous research using word recognition tasks has shown that lexical effects (e.g., the word-frequency effect) are magnified when reading difficult handwritten words. To examine this issue in a more ecological scenario, we registered the participants' eye movements when reading handwritten sentences that varied in the degree of legibility (i.e., sentences composed of words in easy vs. difficult handwritten style). For comparison purposes, we included a condition with printed sentences. Results showed a larger reading cost for sentences with difficult handwritten words than for sentences with easy handwritten words, which in turn showed a reading cost relative to the sentences with printed words. Critically, the effect of word frequency was greater for difficult handwritten words than for easy handwritten words or printed words in the total times on a target word, but not on first-fixation durations or gaze durations. We examine the implications of these findings for models of eye movement control in reading.

  13. A distinguishing method of printed and handwritten legal amount on Chinese bank check

    NASA Astrophysics Data System (ADS)

    Zhu, Ningbo; Lou, Zhen; Yang, Jingyu

    2003-09-01

    In optical Chinese character recognition, distinguishing printed from handwritten characters at an early phase is necessary, because the methods for recognizing these two types of characters differ greatly. In this paper, we propose a method for removing seals, together with criteria for judging whether they should be removed. An approach for clearing up scattered noise fragments after image segmentation is also presented. Four sets of classification features that discriminate between printed and handwritten characters are adopted. The proposed approach was applied to an automatic check processing system and tested on about 9031 checks. The recognition rate is more than 99.5%.

  14. HMM-based lexicon-driven and lexicon-free word recognition for online handwritten Indic scripts.

    PubMed

    Bharath, A; Madhvanath, Sriganesh

    2012-04-01

    Research for recognizing online handwritten words in Indic scripts is at its early stages when compared to Latin and Oriental scripts. In this paper, we address this problem specifically for two major Indic scripts--Devanagari and Tamil. In contrast to previous approaches, the techniques we propose are largely data driven and script independent. We propose two different techniques for word recognition based on Hidden Markov Models (HMM): lexicon driven and lexicon free. The lexicon-driven technique models each word in the lexicon as a sequence of symbol HMMs according to a standard symbol writing order derived from the phonetic representation. The lexicon-free technique uses a novel Bag-of-Symbols representation of the handwritten word that is independent of symbol order and allows rapid pruning of the lexicon. On handwritten Devanagari word samples featuring both standard and nonstandard symbol writing orders, a combination of lexicon-driven and lexicon-free recognizers significantly outperforms either of them used in isolation. In contrast, most Tamil word samples feature the standard symbol order, and the lexicon-driven recognizer outperforms the lexicon free one as well as their combination. The best recognition accuracies obtained for 20,000 word lexicons are 87.13 percent for Devanagari when the two recognizers are combined, and 91.8 percent for Tamil using the lexicon-driven technique.
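
    The order-independent Bag-of-Symbols idea can be illustrated with multisets: a lexicon entry survives pruning if its symbol counts are close to the recognized ones, whatever the writing order. The mismatch score, the threshold and all names below are assumptions for illustration; the abstract does not describe the actual matching criterion.

```python
from collections import Counter

def prune_lexicon(recognized_symbols, lexicon, max_mismatch=1):
    """Keep lexicon words whose symbol multiset is close to the recognized one."""
    bag = Counter(recognized_symbols)
    kept = []
    for word, symbols in lexicon.items():
        target = Counter(symbols)
        # Size of the multiset symmetric difference; symbol order is ignored.
        mismatch = sum(((bag - target) + (target - bag)).values())
        if mismatch <= max_mismatch:
            kept.append(word)
    return kept

# Hypothetical lexicon with Latin letters standing in for script symbols.
lexicon = {"tan": ["t", "a", "n"], "ant": ["a", "n", "t"],
           "tin": ["t", "i", "n"], "dog": ["d", "o", "g"]}
exact = prune_lexicon(["n", "a", "t"], lexicon, max_mismatch=0)   # ["tan", "ant"]
loose = prune_lexicon(["n", "a", "t"], lexicon, max_mismatch=2)   # adds "tin"
```

    Because the comparison ignores order, samples written in a nonstandard symbol order still match their lexicon entry, which is consistent with the Devanagari result reported in the abstract.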

  15. Neural Networks for Handwritten English Alphabet Recognition

    NASA Astrophysics Data System (ADS)

    Perwej, Yusuf; Chaturvedi, Ashish

    2011-04-01

    This paper demonstrates the use of neural networks for developing a system that can recognize hand-written English alphabets. In this system, each English alphabet is represented by binary values that are used as input to a simple feature extraction system, whose output is fed to our neural network system.

  16. Character context: a shape descriptor for Arabic handwriting recognition

    NASA Astrophysics Data System (ADS)

    Mudhsh, Mohammed; Almodfer, Rolla; Duan, Pengfei; Xiong, Shengwu

    2017-11-01

    In the handwriting recognition field, designing good descriptors is essential for obtaining rich information from the data. However, the design of a good descriptor is still an open issue due to the unlimited variation in human handwriting. We introduce a "character context descriptor" that efficiently deals with the structural characteristics of Arabic handwritten characters. First, the character image is smoothed and normalized, then the character context descriptor of 32 feature bins is built based on the proposed "distance function." Finally, a multilayer perceptron with regularization is used as a classifier. On experimentation with a handwritten Arabic characters database, the proposed method achieved a state-of-the-art performance with recognition rate equal to 98.93% and 99.06% for the 66 and 24 classes, respectively.

  17. Numerical linear algebra in data mining

    NASA Astrophysics Data System (ADS)

    Eldén, Lars

    Ideas and algorithms from numerical linear algebra are important in several areas of data mining. We give an overview of linear algebra methods in text mining (information retrieval), pattern recognition (classification of handwritten digits), and PageRank computations for web search engines. The emphasis is on rank reduction as a method of extracting information from a data matrix, low-rank approximation of matrices using the singular value decomposition and clustering, and on eigenvalue methods for network analysis.

  18. A perceptive method for handwritten text segmentation

    NASA Astrophysics Data System (ADS)

    Lemaitre, Aurélie; Camillerapp, Jean; Coüasnon, Bertrand

    2011-01-01

    This paper presents a new method to address the problem of handwritten text segmentation into text lines and words. Thus, we propose a method based on the cooperation among points of view that enables the localization of the text lines in a low resolution image, and then to associate the pixels at a higher level of resolution. Thanks to the combination of levels of vision, we can detect overlapping characters and re-segment the connected components during the analysis. Then, we propose a segmentation of lines into words based on the cooperation among digital data and symbolic knowledge. The digital data are obtained from distances inside a Delaunay graph, which gives a precise distance between connected components, at the pixel level. We introduce structural rules in order to take into account some generic knowledge about the organization of a text page. This cooperation among information gives a bigger power of expression and ensures the global coherence of the recognition. We validate this work using the metrics and the database proposed for the segmentation contest of ICDAR 2009. Thus, we show that our method obtains very interesting results, compared to the other methods of the literature. More precisely, we are able to deal with slope and curvature, overlapping text lines and varied kinds of writings, which are the main difficulties met by the other methods.

  19. New efficient algorithm for recognizing handwritten Hindi digits

    NASA Astrophysics Data System (ADS)

    El-Sonbaty, Yasser; Ismail, Mohammed A.; Karoui, Kamal

    2001-12-01

    In this paper a new algorithm for recognizing handwritten Hindi digits is proposed. The proposed algorithm is based on using the topological characteristics combined with statistical properties of the given digits in order to extract a set of features that can be used in the process of digit classification. 10,000 handwritten digits are used in the experimental results. 1100 digits are used for training and another 5500 unseen digits are used for testing. The recognition rate reached 97.56%, with a substitution rate of 1.822% and a rejection rate of 0.618%.

  20. Recognition of Handwritten Arabic words using a neuro-fuzzy network

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boukharouba, Abdelhak; Bennia, Abdelhak

    We present a new method for the recognition of handwritten Arabic words based on neuro-fuzzy hybrid network. As a first step, connected components (CCs) of black pixels are detected. Then the system determines which CCs are sub-words and which are stress marks. The stress marks are then isolated and identified separately and the sub-words are segmented into graphemes. Each grapheme is described by topological and statistical features. Fuzzy rules are extracted from training examples by a hybrid learning scheme comprised of two phases: rule generation phase from data using a fuzzy c-means, and rule parameter tuning phase using gradient descent learning. After learning, the network encodes in its topology the essential design parameters of a fuzzy inference system. The contribution of this technique is shown through the significant tests performed on a handwritten Arabic words database.

  1. Iterative cross section sequence graph for handwritten character segmentation.

    PubMed

    Dawoud, Amer

    2007-08-01

    The iterative cross section sequence graph (ICSSG) is an algorithm for handwritten character segmentation. It expands the cross section sequence graph concept by applying it iteratively at equally spaced thresholds. The iterative thresholding reduces the effect of information loss associated with image binarization. ICSSG preserves the characters' skeletal structure by preventing the interference of pixels that causes flooding of adjacent characters' segments. Improving the structural quality of the characters' skeleton facilitates better feature extraction and classification, which improves the overall performance of optical character recognition (OCR). Experimental results showed significant improvements in OCR recognition rates compared to other well-established segmentation algorithms.

  2. Quantify spatial relations to discover handwritten graphical symbols

    NASA Astrophysics Data System (ADS)

    Li, Jinpeng; Mouchère, Harold; Viard-Gaudin, Christian

    2012-01-01

    To model a handwritten graphical language, spatial relations describe how the strokes are positioned in the 2-dimensional space. Most of existing handwriting recognition systems make use of some predefined spatial relations. However, considering a complex graphical language, it is hard to express manually all the spatial relations. Another possibility would be to use a clustering technique to discover the spatial relations. In this paper, we discuss how to create a relational graph between strokes (nodes) labeled with graphemes in a graphical language. Then we vectorize spatial relations (edges) for clustering and quantization. As the targeted application, we extract the repetitive sub-graphs (graphical symbols) composed of graphemes and learned spatial relations. On two handwriting databases, a simple mathematical expression database and a complex flowchart database, the unsupervised spatial relations outperform the predefined spatial relations. In addition, we visualize the frequent patterns on two text-lines containing Chinese characters.

  3. Interactive-predictive detection of handwritten text blocks

    NASA Astrophysics Data System (ADS)

    Ramos Terrades, O.; Serrano, N.; Gordó, A.; Valveny, E.; Juan, A.

    2010-01-01

    A method for text block detection is introduced for old handwritten documents. The proposed method takes advantage of sequential book structure, taking into account layout information from pages previously transcribed. This glance at the past is used to predict the position of text blocks in the current page with the help of conventional layout analysis methods. The method is integrated into the GIDOC prototype: a first attempt to provide integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. Results are given in a transcription task on a 764-page Spanish manuscript from 1891.

  4. Text-image alignment for historical handwritten documents

    NASA Astrophysics Data System (ADS)

    Zinger, S.; Nerbonne, J.; Schomaker, L.

    2009-01-01

    We describe our work on text-image alignment in context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text transcriptions. The images of handwritten lines are automatically segmented from the scanned pages of historical documents and then manually transcribed. To train automatic routines to detect words in an image of handwritten text, we need a training set - images of words with their transcriptions. We present our results on aligning words from the images of handwritten lines and their corresponding text transcriptions. Alignment based on the longest spaces between portions of handwriting is a baseline. We then show that relative lengths, i.e. proportions of words in their lines, can be used to improve the alignment results considerably. To take into account the relative word length, we define the expressions for the cost function that has to be minimized for aligning text words with their images. We apply right to left alignment as well as alignment based on exhaustive search. The quality assessment of these alignments shows correct results for 69% of words from 100 lines, or 90% of partially correct and correct alignments combined.

  5. Dynamic and Contextual Information in HMM Modeling for Handwritten Word Recognition.

    PubMed

    Bianne-Bernard, Anne-Laure; Menasri, Farès; Al-Hajj Mohamad, Rami; Mokbel, Chafic; Kermorvant, Christopher; Likforman-Sulem, Laurence

    2011-10-01

    This study aims at building an efficient word recognition system resulting from the combination of three handwriting recognizers. The main component of this combined system is an HMM-based recognizer which considers dynamic and contextual information for a better modeling of writing units. For modeling the contextual units, a state-tying process based on decision tree clustering is introduced. Decision trees are built according to a set of expert-based questions on how characters are written. Questions are divided into global questions, yielding larger clusters, and precise questions, yielding smaller ones. Such clustering enables us to reduce the total number of models and Gaussian densities by 10. We then apply this modeling to the recognition of handwritten words. Experiments are conducted on three publicly available databases based on Latin or Arabic languages: Rimes, IAM, and OpenHart. The results obtained show that contextual information embedded with dynamic modeling significantly improves recognition.

  6. Line Segmentation in Handwritten Assamese and Meetei Mayek Script Using Seam Carving Based Algorithm

    NASA Astrophysics Data System (ADS)

    Kumar, Chandan Jyoti; Kalita, Sanjib Kr.

    Line segmentation is a key stage in an Optical Character Recognition system. This paper primarily concerns the problem of text line extraction on color and grayscale manuscript pages of two major North-east Indian regional scripts, Assamese and Meetei Mayek. Line segmentation of handwritten text in Assamese and Meetei Mayek scripts is an uphill task, primarily because of the structural features of both scripts and varied writing styles. In this paper, line segmentation of a document image is achieved using the seam carving technique. Seam carving has been used for content-aware resizing of images, and many researchers now apply it to the line segmentation phase of OCR. Although it is a language-independent technique, most experiments have been done on Arabic, Greek, German and Chinese scripts. Two types of seams are generated: medial seams approximate the orientation of each text line, and separating seams separate one line of text from another. Experiments are performed extensively over various types of documents, and detailed analysis of the evaluations shows that the algorithm performs well even for documents with multiple scripts. In this paper, we present a comparative study of the accuracy of this method over different types of data.
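
    The separating-seam idea can be sketched as a dynamic program that finds the cheapest left-to-right path through an energy map. This is a generic seam-carving sketch, not the paper's algorithm: using ink (1 = ink, 0 = background) as the energy and the function name `min_energy_horizontal_seam` are assumptions for illustration.

```python
def min_energy_horizontal_seam(energy):
    """Return (total_cost, rows) of the cheapest left-to-right seam.

    energy is a list of rows of non-negative costs. Moving one column to
    the right, the seam may shift up one row, stay, or shift down one row.
    """
    n_rows, n_cols = len(energy), len(energy[0])
    cost = [[0] * n_cols for _ in range(n_rows)]
    back = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        cost[r][0] = energy[r][0]
    for c in range(1, n_cols):
        for r in range(n_rows):
            candidates = range(max(0, r - 1), min(n_rows, r + 2))
            best_prev = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best_prev][c - 1]
            back[r][c] = best_prev
    # Pick the cheapest end point, then follow back-pointers to the left edge.
    end = min(range(n_rows), key=lambda r: cost[r][n_cols - 1])
    rows = [end]
    for c in range(n_cols - 1, 0, -1):
        rows.append(back[rows[-1]][c])
    rows.reverse()
    return cost[end][n_cols - 1], rows


# Toy page: two slanted "text lines" of ink separated by a sloped gap.
page = [
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
total, seam_rows = min_energy_horizontal_seam(page)
print(total, seam_rows)
```

    The seam follows the sloped gap rather than a straight horizontal cut, which is why seam-based methods can cope with skewed or curved handwritten lines.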

  7. Adaptive Learning and Pruning Using Periodic Packet for Fast Invariance Extraction and Recognition

    NASA Astrophysics Data System (ADS)

    Chang, Sheng-Jiang; Zhang, Bian-Li; Lin, Lie; Xiong, Tao; Shen, Jin-Yuan

    2005-02-01

    A new learning scheme using a periodic packet as the neuronal activation function is proposed for invariance extraction and recognition of handwritten digits. Simulation results show that the proposed network can extract the invariant feature effectively and improve both the convergence and the recognition rate.

  8. Recognition of handprinted characters for automated cartography A progress report

    NASA Technical Reports Server (NTRS)

    Lybanon, M.; Brown, R. M.; Gronmeyer, L. K.

    1980-01-01

    A research program for developing handwritten character recognition techniques is reported. The generation of cartographic/hydrographic manuscripts is overviewed. The performance of hardware/software systems is discussed, along with future research problem areas and planned approaches.

  9. Word Spotting and Recognition with Embedded Attributes.

    PubMed

    Almazán, Jon; Gordo, Albert; Fornés, Alicia; Valveny, Ernest

    2014-12-01

    This paper addresses the problems of word spotting and word recognition on images. In word spotting, the goal is to find all instances of a query word in a dataset of images. In recognition, the goal is to recognize the content of the word image, usually aided by a dictionary or lexicon. We describe an approach in which both word images and text strings are embedded in a common vectorial subspace. This is achieved by a combination of label embedding and attributes learning, and a common subspace regression. In this subspace, images and strings that represent the same word are close together, allowing one to cast recognition and retrieval tasks as a nearest neighbor problem. Contrary to most other existing methods, our representation has a fixed length, is low dimensional, and is very fast to compute and, especially, to compare. We test our approach on four public datasets of both handwritten documents and natural images showing results comparable or better than the state-of-the-art on spotting and recognition tasks.
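
    Once word images and text strings live in the same vector space, both tasks reduce to a nearest-neighbor search, as the following minimal sketch shows. The embedding vectors here are hand-made toy values, and the function names `recognize` and `spot` are hypothetical; how the paper actually learns the embeddings is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recognize(image_embedding, lexicon_embeddings):
    """Recognition: return the lexicon word whose embedding is closest."""
    return max(
        lexicon_embeddings,
        key=lambda word: cosine_similarity(image_embedding, lexicon_embeddings[word]),
    )


def spot(query_embedding, image_embeddings, top_k=2):
    """Spotting: rank word images by similarity to the query embedding."""
    ranked = sorted(
        image_embeddings,
        key=lambda name: cosine_similarity(query_embedding, image_embeddings[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy embeddings (assumed values, not produced by any trained model).
lexicon = {"cat": [1.0, 0.0, 0.2], "car": [0.9, 0.4, 0.0], "dog": [0.0, 1.0, 0.3]}
images = {"img1": [0.95, 0.05, 0.25], "img2": [0.1, 0.9, 0.3], "img3": [0.8, 0.5, 0.0]}

print(recognize(images["img1"], lexicon))
print(spot(lexicon["cat"], images))
```

    Fixed-length, low-dimensional vectors keep each comparison cheap, which is what makes this formulation practical for large collections.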

  10. Spatial Analysis of Handwritten Texts as a Marker of Cognitive Control.

    PubMed

    Crespo, Y; Soriano, M F; Iglesias-Parro, S; Aznarte, J I; Ibáñez-Molina, A J

    2017-12-01

    We explore the idea that cognitive demands of the handwriting would influence the degree of automaticity of the handwriting process, which in turn would affect the geometric parameters of texts. We compared the heterogeneity of handwritten texts in tasks with different cognitive demands; the heterogeneity of texts was analyzed with lacunarity, a measure of geometrical invariance. In Experiment 1, we asked participants to perform two tasks that varied in cognitive demands: transcription and exposition about an autobiographical episode. Lacunarity was significantly lower in transcription. In Experiment 2, we compared a veridical and a fictitious version of a personal event. Lacunarity was lower in veridical texts. We contend that differences in lacunarity of handwritten texts reveal the degree of automaticity in handwriting.
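
    For readers unfamiliar with lacunarity, one common estimator is the gliding-box method: slide a square box over a binary image, record the ink "mass" M in each position, and take the ratio of the mean of M squared to the square of the mean of M. The sketch below assumes this estimator and a 0/1 image; the abstract does not state which estimator the authors used, and the function name is hypothetical.

```python
def gliding_box_lacunarity(image, box_size):
    """Gliding-box lacunarity of a binary image (1 = ink, 0 = background).

    Returns mean(M^2) / mean(M)^2, where M is the ink count in each
    box_size x box_size window. A value of 1.0 means perfectly even ink.
    """
    n_rows, n_cols = len(image), len(image[0])
    masses = []
    for top in range(n_rows - box_size + 1):
        for left in range(n_cols - box_size + 1):
            mass = sum(
                image[r][c]
                for r in range(top, top + box_size)
                for c in range(left, left + box_size)
            )
            masses.append(mass)
    mean = sum(masses) / len(masses)
    if mean == 0:
        raise ValueError("lacunarity is undefined for an image with no ink")
    mean_sq = sum(m * m for m in masses) / len(masses)
    return mean_sq / (mean * mean)


# Same amount of ink (8 pixels), spread evenly versus clumped on the left.
even = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
clumped = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
print(gliding_box_lacunarity(even, 2))
print(gliding_box_lacunarity(clumped, 2))
```

    The evenly spread pattern yields the lower value, consistent with reading lower lacunarity as a more homogeneous text.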

  11. Text-line extraction in handwritten Chinese documents based on an energy minimization framework.

    PubMed

    Koo, Hyung Il; Cho, Nam Ik

    2012-03-01

    Text-line extraction in unconstrained handwritten documents remains a challenging problem due to nonuniform character scale, spatially varying text orientation, and the interference between text lines. In order to address these problems, we propose a new cost function that considers the interactions between text lines and the curvilinearity of each text line. Precisely, we achieve this goal by introducing normalized measures for them, which are based on an estimated line spacing. We also present an optimization method that exploits the properties of our cost function. Experimental results on a database consisting of 853 handwritten Chinese document images have shown that our method achieves a detection rate of 99.52% and an error rate of 0.32%, which outperforms conventional methods.

  12. Application of the ANNA neural network chip to high-speed character recognition.

    PubMed

    Sackinger, E; Boser, B E; Bromley, J; Lecun, Y; Jackel, L D

    1992-01-01

    A neural network with 136000 connections for recognition of handwritten digits has been implemented using a mixed analog/digital neural network chip. The neural network chip is capable of processing 1000 characters/s. The recognition system has essentially the same error rate (5%) as a simulation of the network with 32-b floating-point precision.

  13. Parameter calibration for synthesizing realistic-looking variability in offline handwriting

    NASA Astrophysics Data System (ADS)

    Cheng, Wen; Lopresti, Dan

    2011-01-01

    Motivated by the widely accepted principle that the more training data, the better a recognition system performs, we conducted experiments asking human subjects to evaluate a mixture of real English handwritten text lines and text lines altered from existing handwriting with various degrees of distortion. The idea of generating synthetic handwriting is based on a perturbation method by T. Varga and H. Bunke that distorts an entire text line. Our experiments have two purposes. First, we want to calibrate distortion parameter settings for Varga and Bunke's perturbation model. Second, we intend to compare the effects of parameter settings on different writing styles: block, cursive and mixed. From the preliminary experimental results, we determined appropriate ranges for parameter amplitude, and found that parameter settings should be altered for different handwriting styles. With the proper parameter settings, it should be possible to generate large amounts of training and testing data for building better off-line handwriting recognition systems.

  14. A comparison of 1D and 2D LSTM architectures for the recognition of handwritten Arabic

    NASA Astrophysics Data System (ADS)

    Yousefi, Mohammad Reza; Soheili, Mohammad Reza; Breuel, Thomas M.; Stricker, Didier

    2015-01-01

    In this paper, we present an Arabic handwriting recognition method based on a recurrent neural network. We use the Long Short Term Memory (LSTM) architecture, which has proven successful in different printed and handwritten OCR tasks. Applications of LSTM for handwriting recognition employ the two-dimensional architecture to deal with the variations along both the vertical and horizontal axes. However, we show that using a simple pre-processing step that normalizes the position and baseline of letters, we can make use of 1D LSTM, which is faster in learning and convergence, and yet achieve superior performance. In a series of experiments on the IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained with manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can even outperform such systems.

  15. Enhancement Of Reading Accuracy By Multiple Data Integration

    NASA Astrophysics Data System (ADS)

    Lee, Kangsuk

    1989-07-01

    In this paper, a multiple sensor integration technique with neural network learning algorithms is presented which can enhance the reading accuracy of the hand-written numerals. Many document reading applications involve hand-written numerals in a predetermined location on a form, and in many cases, critical data is redundantly described. The amount of a personal check is one such case which is written redundantly in numerals and in alphabetical form. Information from two optical character recognition modules, one specialized for digits and one for words, is combined to yield an enhanced recognition of the amount. The combination can be accomplished by a decision tree with "if-then" rules, but by simply fusing two or more sets of sensor data in a single expanded neural net, the same functionality can be expected with a much reduced system cost. Experimental results of fusing two neural nets to enhance overall recognition performance using a controlled data set are presented.

  16. Recognition of Telugu characters using neural networks.

    PubMed

    Sukhaswami, M B; Seetharamulu, P; Pujari, A K

    1995-09-01

    The aim of the present work is to recognize printed and handwritten Telugu characters using artificial neural networks (ANNs). Earlier work on recognition of Telugu characters has been done using conventional pattern recognition techniques. We make an initial attempt here of using neural networks for recognition with the aim of improving upon earlier methods which do not perform effectively in the presence of noise and distortion in the characters. The Hopfield model of neural network working as an associative memory is chosen for recognition purposes initially. Due to limitation in the capacity of the Hopfield neural network, we propose a new scheme named here as the Multiple Neural Network Associative Memory (MNNAM). The limitation in storage capacity has been overcome by combining multiple neural networks which work in parallel. It is also demonstrated that the Hopfield network is suitable for recognizing noisy printed characters as well as handwritten characters written by different "hands" in a variety of styles. Detailed experiments have been carried out using several learning strategies and results are reported. It is shown here that satisfactory recognition is possible using the proposed strategy. A detailed preprocessing scheme of the Telugu characters from digitized documents is also described.

  17. Automated recognition and extraction of tabular fields for the indexing of census records

    NASA Astrophysics Data System (ADS)

    Clawson, Robert; Bauer, Kevin; Chidester, Glen; Pohontsch, Milan; Kennard, Douglas; Ryu, Jongha; Barrett, William

    2013-01-01

    We describe a system for indexing of census records in tabular documents with the goal of recognizing the content of each cell, including both headers and handwritten entries. Each document is automatically rectified, registered and scaled to a known template following which lines and fields are detected and delimited as cells in a tabular form. Whole-word or whole-phrase recognition of noisy machine-printed text is performed using a glyph library, providing greatly increased efficiency and accuracy (approaching 100%), while avoiding the problems inherent with traditional OCR approaches. Constrained handwriting recognition results for a single author reach as high as 98% and 94.5% for the Gender field and Birthplace respectively. Multi-author accuracy (currently 82%) can be improved through an increased training set. Active integration of user feedback in the system will accelerate the indexing of records while providing a tightly coupled learning mechanism for system improvement.

  18. Shape analysis modeling for character recognition

    NASA Astrophysics Data System (ADS)

    Khan, Nadeem A. M.; Hegt, Hans A.

    1998-10-01

    Optimal shape modeling of character-classes is crucial for achieving high performance on recognition of mixed-font, hand-written or (and) poor quality text. A novel scheme is presented in this regard focusing on constructing such structural models that can be hierarchically examined. These models utilize a certain 'well-thought' set of shape primitives. They are simplified enough to ignore the inter-class variations in font-type or writing style yet retaining enough details for discrimination between the samples of the similar classes. Thus the number of models per class required can be kept minimal without sacrificing the recognition accuracy. In this connection a flexible multi-stage matching scheme exploiting the proposed modeling is also described. This leads to a system which is robust against various distortions and degradation including those related to cases of touching and broken characters. Finally, we present some examples and test results as a proof-of-concept demonstrating the validity and the robustness of the approach.

  19. A novel word spotting method based on recurrent neural networks.

    PubMed

    Frinken, Volkmar; Fischer, Andreas; Manmatha, R; Bunke, Horst

    2012-02-01

    Keyword spotting refers to the process of retrieving all instances of a given keyword from a document. In the present paper, a novel keyword spotting method for handwritten documents is described. It is derived from a neural network-based system for unconstrained handwriting recognition. As such it performs template-free spotting, i.e., it is not necessary for a keyword to appear in the training set. The keyword spotting is done using a modification of the CTC Token Passing algorithm in conjunction with a recurrent neural network. We demonstrate that the proposed systems outperform not only a classical dynamic time warping-based approach but also a modern keyword spotting system, based on hidden Markov models. Furthermore, we analyze the performance of the underlying neural networks when using them in a recognition task followed by keyword spotting on the produced transcription. We point out the advantages of keyword spotting when compared to classic text line recognition.

  20. [About da tai - abortion in old Chinese folk medicine handwritten manuscripts].

    PubMed

    Zheng, Jinsheng

    2013-01-01

    Of 881 Chinese handwritten volumes with medical texts of the 17th through mid-20th century held by Staatsbibliothek zu Berlin and Ethnologisches Museum Berlin-Dahlem, 48 volumes include prescriptions for induced abortion. A comparison shows that these records are significantly different from references to abortion in Chinese printed medical texts of pre-modern times. For example, the percentage of recipes recommended for artificial abortions in handwritten texts is significantly higher than that in printed medical books. Authors of handwritten texts used 25 terms to designate artificial abortion, with the term da tai [see text], lit.: "to strike the fetus", occurring most frequently. Its meaning is well defined, in contrast to other terms used, such as duo tai [see text], lit.: "to make a fetus fall", xia tai [see text], lit.: "to bring a fetus down", and duan chan [see text], lit.: "to interrupt birthing", which is mostly used to indicate a temporary or permanent sterilization. Pre-modern Chinese medicine has not generally abstained from inducing abortions; physicians showed a differentiating attitude. While abortions were described as "things a [physician with an attitude of] humaneness will not do", in case a pregnancy was seen as too risky for a woman she was offered medication to terminate this pregnancy. The commercial application of abortifacients has been recorded in China since ancient times. A request for such services has continued over time for various reasons, including so-called illegitimate pregnancies, and those by nuns, widows and prostitutes. In general, recipes to induce abortions documented in printed medical literature have mild effects and are to be ingested orally. In comparison, those recommended in handwritten texts are rather toxic. Possibly to minimize the negative side-effects of such medication, practitioners of folk medicine developed mechanical devices to perform "external", i.e., vaginal approaches.

  1. A guide for digitising manuscript climate data

    NASA Astrophysics Data System (ADS)

    Brönnimann, S.; Annis, J.; Dann, W.; Ewen, T.; Grant, A. N.; Griesser, T.; Krähenmann, S.; Mohr, C.; Scherer, M.; Vogler, C.

    2006-10-01

    Hand-written or printed manuscript data are an important source for paleo-climatological studies, but bringing them into a suitable format can be a time consuming adventure with uncertain success. Before digitising such data (e.g., in the context of a specific research project), it is worthwhile spending a few thoughts on the characteristics of the data, the scientific requirements with respect to quality and coverage, the metadata, and technical aspects such as reproduction techniques, digitising techniques, and quality control strategies. Here we briefly discuss the most important considerations according to our own experience and describe different methods for digitising numeric or text data (optical character recognition, speech recognition, and key entry). We present a tentative guide that is intended to help others compile the necessary information and make the right decisions.

  2. A guide for digitising manuscript climate data

    NASA Astrophysics Data System (ADS)

    Brönnimann, S.; Annis, J.; Dann, W.; Ewen, T.; Grant, A. N.; Griesser, T.; Krähenmann, S.; Mohr, C.; Scherer, M.; Vogler, C.

    2006-05-01

    Hand-written or printed manuscript data are an important source for paleo-climatological studies, but bringing them into a suitable format can be a time consuming adventure with uncertain success. Before starting the digitising work, it is worthwhile spending a few thoughts on the characteristics of the data, the scientific requirements with respect to quality and coverage, and on the different digitising techniques. Here we briefly discuss the most important considerations and report our own experience. We describe different methods for digitising numeric or text data, i.e., optical character recognition (OCR), speech recognition, and key entry. Each technique has its advantages and disadvantages that may become important for certain applications. It is therefore crucial to thoroughly investigate beforehand the characteristics of the manuscript data, define the quality targets and develop validation strategies.

  3. Machine printed text and handwriting identification in noisy document images.

    PubMed

    Zheng, Yefeng; Li, Huiping; Doermann, David

    2004-03-01

    In this paper, we address the problem of the identification of text in noisy document images. We are especially focused on segmenting and distinguishing between handwriting and machine printed text because: 1) Handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main content and 2) the segmentation and recognition techniques required for machine printed and handwritten text are significantly different. A novel aspect of our approach is that we treat noise as a separate class and model noise based on selected features. Trained Fisher classifiers are used to identify machine printed text and handwriting from noise and we further exploit context to refine the classification. A Markov Random Field-based (MRF) approach is used to model the geometrical structure of the printed text, handwriting, and noise to rectify misclassifications. Experimental results show that our approach is robust and can significantly improve page segmentation in noisy document collections.

  4. Structural model constructing for optical handwritten character recognition

    NASA Astrophysics Data System (ADS)

    Khaustov, P. A.; Spitsyn, V. G.; Maksimova, E. I.

    2017-02-01

    The article is devoted to the development of algorithms for optical handwritten character recognition based on the construction of structural models. The main advantage of these algorithms is their low requirement for the number of reference images. A one-pass approach to thinning the binary character representation is proposed, based on the joint use of the Zhang-Suen and Wu-Tsai algorithms. The effectiveness of the proposed approach is confirmed by experimental results. The article includes a detailed description of the steps of the structural model construction algorithm. The proposed algorithm has been implemented in a character processing application and validated on the MNIST handwritten character database. Algorithms that can be used when the number of reference images is limited were used for comparison.
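
The thinning step can be illustrated with a minimal Python sketch of the classical two-subiteration Zhang-Suen algorithm. This is not the authors' one-pass variant, which also draws on the Wu-Tsai algorithm and is not specified in the abstract. The function name `zhang_suen_thinning`, the list-of-lists image format and the assumption of a one-pixel background border are illustrative choices only.

```python
def zhang_suen_thinning(image):
    """Thin a binary image (list of lists, 1 = foreground) towards a skeleton.

    Assumes the outermost rows and columns are background (0).
    """
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9, clockwise, starting from the pixel directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    def transitions(n):
        # Number of 0 -> 1 transitions in the circular sequence P2..P9, P2.
        return sum(1 for a, b in zip(n, n[1:] + n[:1]) if a == 0 and b == 1)

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    if not (2 <= sum(n) <= 6 and transitions(n) == 1):
                        continue
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if ok:
                        to_clear.append((r, c))
            # Pixels are removed in parallel, after the whole sub-iteration.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


# A 3-pixel-thick horizontal bar surrounded by background.
bar = [[0] * 9] + [[0] + [1] * 7 + [0] for _ in range(3)] + [[0] * 9]
skeleton = zhang_suen_thinning(bar)
```

On this bar the sketch should leave a one-pixel-wide horizontal stroke in the middle row, which is the kind of skeleton a structural model can be built from.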

  5. New approach for segmentation and recognition of handwritten numeral strings

    NASA Astrophysics Data System (ADS)

    Sadri, Javad; Suen, Ching Y.; Bui, Tien D.

    2004-12-01

    In this paper, we propose a new system for segmentation and recognition of unconstrained handwritten numeral strings. The system uses a combination of foreground and background features for segmentation of touching digits. The method introduces new algorithms for traversing the top/bottom-foreground-skeletons of the touched digits, and for finding feature points on these skeletons, and matching them to build all the segmentation paths. For the first time a genetic representation is used to show all the segmentation hypotheses. Our genetic algorithm tries to search and evolve the population of candidate segmentations and finds the one with the highest confidence for its segmentation and recognition. We have also used a new method for feature extraction which lowers the variations in the shapes of the digits, and then a MLP neural network is utilized to produce the labels and confidence values for those digits. The NIST SD19 and CENPARMI databases are used for evaluating the system. Our system can get a correct segmentation-recognition rate of 96.07% with rejection rate of 2.61% which compares favorably with those that exist in the literature.

  6. New approach for segmentation and recognition of handwritten numeral strings

    NASA Astrophysics Data System (ADS)

    Sadri, Javad; Suen, Ching Y.; Bui, Tien D.

    2005-01-01

    In this paper, we propose a new system for segmentation and recognition of unconstrained handwritten numeral strings. The system uses a combination of foreground and background features for segmentation of touching digits. The method introduces new algorithms for traversing the top/bottom-foreground-skeletons of the touched digits, and for finding feature points on these skeletons, and matching them to build all the segmentation paths. For the first time a genetic representation is used to show all the segmentation hypotheses. Our genetic algorithm tries to search and evolve the population of candidate segmentations and finds the one with the highest confidence for its segmentation and recognition. We have also used a new method for feature extraction which lowers the variations in the shapes of the digits, and then a MLP neural network is utilized to produce the labels and confidence values for those digits. The NIST SD19 and CENPARMI databases are used for evaluating the system. Our system can get a correct segmentation-recognition rate of 96.07% with rejection rate of 2.61% which compares favorably with those that exist in the literature.

  7. Offline handwritten word recognition using MQDF-HMMs

    NASA Astrophysics Data System (ADS)

    Ramachandrula, Sitaram; Hambarde, Mangesh; Patial, Ajay; Sahoo, Dushyant; Kochar, Shaivi

    2015-01-01

    We propose an improved HMM formulation for offline handwriting recognition (HWR). The main contribution of this work is the use of the modified quadratic discriminant function (MQDF) [1] within the HMM framework. In an MQDF-HMM, the state observation likelihood is calculated as a weighted combination of the MQDF likelihoods of the individual Gaussians of a GMM (Gaussian mixture model). The quadratic discriminant function (QDF) of a multivariate Gaussian can be rewritten to avoid inverting the covariance matrix by using its eigenvalues and eigenvectors. The MQDF is derived from the QDF by substituting an appropriate constant for a few of the badly estimated smallest eigenvalues. This approach controls the estimation errors of the non-dominant eigenvectors and eigenvalues of the covariance matrix, for which the training data is insufficient. MQDF has been shown to improve character recognition performance [1]. Using MQDF in an HMM improves the computation, storage and modeling power of the HMM when training data is limited. We obtained encouraging results on offline handwritten character (NIST database) and English word recognition using MQDF-HMMs.
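
The derivation described above can be written out as follows. This is the commonly cited form of the QDF and MQDF, given here as a sketch; the abstract itself does not state the formulas. The symbols are assumptions introduced for illustration: $x$ is a $d$-dimensional feature vector, $\mu$ the class mean, $\lambda_1 \ge \dots \ge \lambda_d$ and $\varphi_1, \dots, \varphi_d$ the eigenvalues and eigenvectors of the covariance matrix $\Sigma$, $k$ the number of dominant eigenvalues retained, and $\delta$ the substituted constant.

```latex
% QDF rewritten with the eigen-decomposition of \Sigma (no matrix inverse needed)
g_{\mathrm{QDF}}(x) = (x-\mu)^{\top}\Sigma^{-1}(x-\mu) + \log\lvert\Sigma\rvert
  = \sum_{i=1}^{d} \frac{1}{\lambda_i}\bigl[\varphi_i^{\top}(x-\mu)\bigr]^{2}
  + \sum_{i=1}^{d} \log\lambda_i

% MQDF: the d-k smallest eigenvalues are replaced by the constant \delta
g_{\mathrm{MQDF}}(x) = \sum_{i=1}^{k} \frac{1}{\lambda_i}\bigl[\varphi_i^{\top}(x-\mu)\bigr]^{2}
  + \frac{1}{\delta}\Bigl(\lVert x-\mu\rVert^{2}
  - \sum_{i=1}^{k}\bigl[\varphi_i^{\top}(x-\mu)\bigr]^{2}\Bigr)
  + \sum_{i=1}^{k} \log\lambda_i + (d-k)\log\delta
```

Only the $k$ dominant eigenvectors need to be stored and projected onto, which is consistent with the computation and storage savings the abstract mentions.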

  8. BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters.

    PubMed

    Biswas, Mithun; Islam, Rafiqul; Shom, Gautam Kumar; Shopon, Md; Mohammed, Nabeel; Momen, Sifat; Abedin, Anowarul

    2017-06-01

    BanglaLekha-Isolated, a Bangla handwritten isolated character dataset, is presented in this article. This dataset contains 84 different characters comprising 50 Bangla basic characters, 10 Bangla numerals and 24 selected compound characters. 2000 handwriting samples for each of the 84 characters were collected, digitized and pre-processed. After discarding mistakes and scribbles, 166,105 handwritten character images were included in the final dataset. The dataset also includes labels indicating the age and the gender of the subjects from whom the samples were collected. This dataset could be used not only for optical handwriting recognition research but also to explore the influence of gender and age on handwriting. The dataset is publicly available at https://data.mendeley.com/datasets/hf6sf8zrkc/2.

  9. Structural analysis of online handwritten mathematical symbols based on support vector machines

    NASA Astrophysics Data System (ADS)

    Simistira, Foteini; Papavassiliou, Vassilis; Katsouros, Vassilis; Carayannis, George

    2013-01-01

    Mathematical expression recognition is still a very challenging task for the research community, mainly because of the two-dimensional (2D) structure of mathematical expressions (MEs). In this paper, we present a novel approach for the structural analysis between two on-line handwritten mathematical symbols of an ME, based on spatial features of the symbols. We introduce six features to represent the spatial affinity of the symbols and compare two multi-class classification methods that employ support vector machines (SVMs), one based on the "one-against-one" technique and one based on the "one-against-all" technique, in identifying the relation between a pair of symbols (e.g., subscript, numerator). A dataset containing 1906 spatial relations derived from the Competition on Recognition of Online Handwritten Mathematical Expressions (CROHME) 2012 training dataset is constructed to evaluate the classifiers and compare them with the rule-based classifier of the ILSP-1 system that participated in the contest. The experimental results give an overall mean error rate of 2.61% for the "one-against-one" SVM approach, 6.57% for the "one-against-all" SVM technique and 12.31% for the ILSP-1 classifier.
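
The difference between the two multi-class schemes can be sketched in a few lines of Python. The decision values below are made-up numbers standing in for the outputs of trained binary SVMs, and the three relation labels are hypothetical; the abstract does not list the full label set or the six features.

```python
from itertools import combinations


def predict_one_against_all(scores):
    """scores: {label: decision value of the 'label vs. rest' classifier}.

    One binary classifier per class; the class with the largest value wins.
    """
    return max(scores, key=scores.get)


def predict_one_against_one(labels, pairwise):
    """pairwise: {(a, b): decision value}; a positive value favours a, else b.

    One binary classifier per pair of classes; each casts a single vote.
    """
    votes = {label: 0 for label in labels}
    for a, b in combinations(labels, 2):
        winner = a if pairwise[(a, b)] > 0 else b
        votes[winner] += 1
    return max(votes, key=votes.get)


# Hypothetical labels and decision values, for illustration only.
labels = ["subscript", "superscript", "right"]
ova_scores = {"subscript": 0.2, "superscript": -0.9, "right": 0.7}
ovo_scores = {
    ("subscript", "superscript"): 0.8,
    ("subscript", "right"): -0.3,
    ("superscript", "right"): -1.1,
}

ova_label = predict_one_against_all(ova_scores)
ovo_label = predict_one_against_one(labels, ovo_scores)
```

For K classes, one-against-all trains K binary classifiers and one-against-one trains K(K-1)/2, each on a smaller and more balanced two-class problem.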

  10. Slant correction for handwritten English documents

    NASA Astrophysics Data System (ADS)

    Shridhar, Malayappan; Kimura, Fumitaka; Ding, Yimei; Miller, John W. V.

    2004-12-01

    Optical character recognition of machine-printed documents is an effective means for extracting textual material. While the level of effectiveness for handwritten documents is much poorer, progress is being made in more constrained applications such as personal checks and postal addresses. In these applications a series of steps is performed for recognition, beginning with removal of skew and slant. Slant, the amount by which characters are tilted from the vertical, is characteristic of the writer and varies from writer to writer. The second attribute is the skew that arises from the inability of the writer to write on a horizontal line. Several methods have been proposed and discussed for average slant estimation and correction in earlier papers. However, analysis of many handwritten documents reveals that slant is a local property and varies even within a word. The use of an average slant for the entire word often results in overestimation or underestimation of the local slant. This paper describes three methods for local slant estimation, namely the simple iterative method, the high-speed iterative method, and the 8-directional chain code method. The experimental results show that the proposed methods can estimate and correct local slant more effectively than average slant correction.
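
Once a slant angle is known, correction is a horizontal shear. The Python sketch below shows that shear for a single angle applied to a whole image, which is the average-slant baseline rather than any of the three local estimation methods named in the abstract (their details are not given here). The function name `deslant`, the binary list-of-lists image format and the sign convention (positive angle means strokes lean to the right) are assumptions.

```python
import math


def deslant(image, slant_degrees):
    """Shear a binary image so strokes tilted by slant_degrees become vertical.

    The bottom row is treated as the baseline and is left unshifted; a row
    h pixels above it is shifted left by h * tan(slant). Pixels shifted
    outside the image are dropped.
    """
    height, width = len(image), len(image[0])
    shear = math.tan(math.radians(slant_degrees))
    out = [[0] * width for _ in range(height)]
    for y, row in enumerate(image):
        shift = round((height - 1 - y) * shear)
        for x, pixel in enumerate(row):
            nx = x - shift
            if pixel and 0 <= nx < width:
                out[y][nx] = 1
    return out


# A stroke leaning 45 degrees to the right.
stroke = [[0, 0, 0, 1],
          [0, 0, 1, 0],
          [0, 1, 0, 0],
          [1, 0, 0, 0]]
upright = deslant(stroke, 45)
```

A local variant would apply the same shear with a different angle to each part of a word, which is the situation the abstract argues for.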

  11. Analog design of a new neural network for optical character recognition.

    PubMed

    Morns, I P; Dlay, S S

    1999-01-01

    An electronic circuit is presented for a new type of neural network, which gives a recognition rate of over 100 kHz. The network is used to classify handwritten numerals, presented as Fourier and wavelet descriptors, and has been shown to train far quicker than the popular backpropagation network while maintaining classification accuracy.

  12. Performance evaluation methodology for historical document image binarization.

    PubMed

    Ntirogiannis, Konstantinos; Gatos, Basilis; Pratikakis, Ioannis

    2013-02-01

    Document image binarization is of great importance in the document image analysis and recognition pipeline since it affects further stages of the recognition process. The evaluation of a binarization method aids in studying its algorithmic behavior, as well as verifying its effectiveness, by providing qualitative and quantitative indication of its performance. This paper addresses a pixel-based binarization evaluation methodology for historical handwritten/machine-printed document images. In the proposed evaluation scheme, the recall and precision evaluation measures are properly modified using a weighting scheme that diminishes any potential evaluation bias. Additional performance metrics of the proposed evaluation scheme consist of the percentage rates of broken and missed text, false alarms, background noise, character enlargement, and merging. Several experiments conducted in comparison with other pixel-based evaluation measures demonstrate the validity of the proposed evaluation scheme.
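
For reference, the plain (unweighted) pixel-based recall and precision that such schemes start from can be computed as in the Python sketch below. The weighting scheme proposed in the paper is not described in the abstract and is not reproduced here; the function name and the 1 = text pixel convention are assumptions.

```python
def binarization_scores(result, ground_truth):
    """Return (precision, recall, f_measure) for two binary images.

    Both images are lists of lists with 1 = text pixel and 0 = background.
    """
    tp = fp = fn = 0
    for res_row, gt_row in zip(result, ground_truth):
        for r, g in zip(res_row, gt_row):
            if r and g:
                tp += 1          # text pixel correctly detected
            elif r and not g:
                fp += 1          # false alarm
            elif g and not r:
                fn += 1          # missed text pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


ground_truth = [[1, 1, 0],
                [0, 1, 0]]
result = [[1, 0, 0],
          [0, 1, 1]]
precision, recall, f_measure = binarization_scores(result, ground_truth)
```

In this unweighted form every pixel counts equally regardless of where it lies relative to the text.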

  13. Local Subspace Classifier with Transform-Invariance for Image Classification

    NASA Astrophysics Data System (ADS)

    Hotta, Seiji

    A family of linear subspace classifiers called the local subspace classifier (LSC) outperforms the k-nearest neighbor rule (kNN) and conventional subspace classifiers in handwritten digit classification. However, LSC is highly sensitive to image transformations because it uses projection and Euclidean distances for classification. In this paper, I present a combination of a local subspace classifier (LSC) and a tangent distance (TD) for improving the accuracy of handwritten digit recognition. In this classification rule, we can deal with transform-invariance easily because we are able to use tangent vectors to approximate transformations. However, we cannot use tangent vectors for other types of images such as color images. Hence, kernel LSC (KLSC) is proposed for incorporating transform-invariance into LSC via kernel mapping. The performance of the proposed methods is verified with experiments on handwritten digit and color image classification.

  14. Script-independent text line segmentation in freestyle handwritten documents.

    PubMed

    Li, Yi; Zheng, Yefeng; Doermann, David; Jaeger, Stefan; Li, Yi

    2008-08-01

    Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map, where each element represents the probability that the underlying pixel belongs to a text line. The level set method is then exploited to determine the boundary of neighboring text lines by evolving an initial estimate. Unlike connected component based methods ( [1], [2] for example), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents with diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods [1]-[3]. Further experiments show the proposed algorithm is robust to scale change, rotation, and noise.

  15. Ultrafast learning in a hard-limited neural network pattern recognizer

    NASA Astrophysics Data System (ADS)

    Hu, Chia-Lun J.

    1996-03-01

    As we have published over the last five years, supervised learning in a hard-limited perceptron system can be accomplished in a noniterative manner if the input-output mapping to be learned satisfies a certain positive-linear-independency (PLI) condition. When this condition is satisfied (as it should be for most practical pattern recognition applications), the connection matrix required to meet this mapping can be obtained noniteratively in one step. Generally, there exist infinitely many solutions for the connection matrix when the PLI condition is satisfied. We can then select an optimum solution such that the recognition of any untrained patterns becomes optimally robust in the recognition mode. The learning speed is very fast and close to real-time because the learning process is noniterative and one-step. This paper reports the theoretical analysis and the design of a practical character recognition system for recognizing hand-written alphabets. The experimental result is recorded in real-time on an unedited video tape for demonstration purposes. This real-time movie shows that the recognition of untrained hand-written alphabets is invariant to size, location, orientation, and writing sequence, even though the training is done with standard size, standard orientation, central location and standard writing sequence.

  16. Reduction of the dimension of neural network models in problems of pattern recognition and forecasting

    NASA Astrophysics Data System (ADS)

    Nasertdinova, A. D.; Bochkarev, V. V.

    2017-11-01

    Deep neural networks with a large number of parameters are a powerful tool for solving problems of pattern recognition, prediction and classification. Nevertheless, overfitting remains a serious problem in the use of such networks. A method of solving the problem of overfitting is proposed in this article. This method is based on reducing the number of independent parameters of a neural network model using principal component analysis, and can be implemented using existing libraries of neural computing. The algorithm was tested on the problem of recognizing handwritten symbols from the MNIST database, as well as on the task of predicting time series (the series of average monthly sunspot numbers and series of the Lorenz system were used). It is shown that applying principal component analysis reduces the number of parameters of the neural network model while the results remain good. The average error rate for the recognition of handwritten digits from the MNIST database was 1.12% (which is comparable to the results obtained using deep learning methods), while the number of parameters of the neural network can be reduced by up to 130 times.

  17. Marker Registration Technique for Handwritten Text Marker in Augmented Reality Applications

    NASA Astrophysics Data System (ADS)

    Thanaborvornwiwat, N.; Patanukhom, K.

    2018-04-01

    Marker registration is a fundamental process for estimating camera poses in marker-based Augmented Reality (AR) systems. We developed an AR system that creates corresponding virtual objects on handwritten text markers. This paper presents a new registration method that is robust to low-content text markers, variation in camera poses, and variation in handwriting styles. The proposed method uses Maximally Stable Extremal Regions (MSER) and polygon simplification for feature point extraction. The experiment shows that only five feature points per image need to be extracted to obtain the best registration results. An exhaustive search is used to find the best matching pattern of the feature points in two images. We also compared the performance of the proposed method with some existing registration methods and found that the proposed method provides better accuracy and time efficiency.

  18. Limited receptive area neural classifier for recognition of swallowing sounds using continuous wavelet transform.

    PubMed

    Makeyev, Oleksandr; Sazonov, Edward; Schuckers, Stephanie; Lopez-Meyer, Paulo; Melanson, Ed; Neuman, Michael

    2007-01-01

    In this paper we propose a sound recognition technique based on the limited receptive area (LIRA) neural classifier and continuous wavelet transform (CWT). LIRA neural classifier was developed as a multipurpose image recognition system. Previous tests of LIRA demonstrated good results in different image recognition tasks including: handwritten digit recognition, face recognition, metal surface texture recognition, and micro work piece shape recognition. We propose a sound recognition technique where scalograms of sound instances serve as inputs of the LIRA neural classifier. The methodology was tested in recognition of swallowing sounds. Swallowing sound recognition may be employed in systems for automated swallowing assessment and diagnosis of swallowing disorders. The experimental results suggest high efficiency and reliability of the proposed approach.

  19. A comparison study between MLP and convolutional neural network models for character recognition

    NASA Astrophysics Data System (ADS)

    Ben Driss, S.; Soua, M.; Kachouri, R.; Akil, M.

    2017-05-01

    Optical Character Recognition (OCR) systems have been designed to operate on text contained in scanned documents and images. They include text detection and character recognition, in which characters are described and then classified. In the classification step, characters are identified according to their features or template descriptions, and a given classifier is then employed to identify them. In this context, we have proposed the unified character descriptor (UCD) to represent characters based on their features, with matching employed for classification. This recognition scheme achieves good OCR accuracy on homogeneous scanned documents; however, it cannot discriminate characters with high font variation and distortion. To improve recognition, classifiers based on neural networks can be used. The multilayer perceptron (MLP) ensures high recognition accuracy when robustly trained. Moreover, the convolutional neural network (CNN) is currently gaining a lot of popularity for its high performance. However, both CNN and MLP may suffer from the large amount of computation in the training phase. In this paper, we establish a comparison between MLP and CNN. We provide MLP with the UCD descriptor and the appropriate network configuration. For CNN, we employ the convolutional network designed for handwritten and machine-printed character recognition (LeNet-5) and adapt it to support 62 classes, including both digits and characters. In addition, GPU parallelization is studied to speed up both the MLP and CNN classifiers. Based on our experiments, we demonstrate that the real-time CNN used is 2x more relevant than MLP when classifying characters.
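
The abstract does not spell out the 62 classes. One natural reading, assumed here, is 10 digits plus 26 uppercase and 26 lowercase Latin letters. Under that assumption, a label set for the adapted output layer could be built as in this short Python sketch (the names `CLASSES` and `class_index` are illustrative):

```python
import string

# Assumed class inventory: 10 digits + 26 uppercase + 26 lowercase letters.
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase

# Map each character to the index of its output unit.
class_index = {ch: i for i, ch in enumerate(CLASSES)}

num_classes = len(CLASSES)
```

The original LeNet-5 has a 10-way output for digits, so supporting this label set means widening the final layer to `num_classes` units.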

  20. Recognition of handwritten similar Chinese characters by self-growing probabilistic decision-based neural network.

    PubMed

    Fu, H C; Xu, Y Y; Chang, H Y

    1999-12-01

    Recognition of similar (confusion) characters is a difficult problem in optical character recognition (OCR). In this paper, we introduce a neural network solution that is capable of modeling minor differences among similar characters, and is robust to various personal handwriting styles. The Self-growing Probabilistic Decision-based Neural Network (SPDNN) is a probabilistic type neural network, which adopts a hierarchical network structure with nonlinear basis functions and a competitive credit-assignment scheme. Based on the SPDNN model, we have constructed a three-stage recognition system. First, a coarse classifier assigns an input character to one of the pre-defined subclasses partitioned from a large character set, such as Chinese mixed with alphanumerics. Then a character recognizer determines which reference character in the subclass best matches the input image. Lastly, the third module is a similar character recognizer, which can further enhance the recognition accuracy among similar or confusing characters. The prototype system has demonstrated a successful application of SPDNN to the recognition of similar handwritten Chinese characters for the public database CCL/HCCR1 (5401 characters × 200 samples). Regarding performance, experiments on the CCL/HCCR1 database produced 90.12% recognition accuracy with no rejection, and 94.11% accuracy with 6.7% rejection. This recognition accuracy represents about 4% improvement on the previously announced performance. As to processing speed, processing before recognition (including image preprocessing, segmentation, and feature extraction) requires about one second for an A4 size character image, and recognition consumes approximately 0.27 second per character on a Pentium-100 based personal computer, without use of any hardware accelerator or co-processor.

  1. Pen-chant: Acoustic emissions of handwriting and drawing

    NASA Astrophysics Data System (ADS)

    Seniuk, Andrew G.

    The sounds generated by a writing instrument ('pen-chant') provide a rich and underutilized source of information for pattern recognition. We examine the feasibility of recognition of handwritten cursive text, exclusively through an analysis of acoustic emissions. We design and implement a family of recognizers using a template matching approach, with templates and similarity measures derived variously from: smoothed amplitude signal with fixed resolution, discrete sequence of magnitudes obtained from peaks in the smoothed amplitude signal, and ordered tree obtained from a scale space signal representation. Test results are presented for recognition of isolated lowercase cursive characters and for whole words. We also present qualitative results for recognizing gestures such as circling, scratch-out, check-marks, and hatching. Our first set of results, using samples provided by the author, yields recognition rates of over 70% (alphabet) and 90% (26 words), with a confidence of +/-8%, based solely on acoustic emissions. Our second set of results uses data gathered from nine writers. These results demonstrate that acoustic emissions are a rich source of information, usable, on their own or in conjunction with image-based features, to solve pattern recognition problems. In future work, this approach can be applied to writer identification, handwriting and gesture-based computer input technology, emotion recognition, and temporal analysis of sketches.

  2. Learning and Inductive Inference

    DTIC Science & Technology

    1982-07-01

    a set of graph grammars to describe visual scenes. Other researchers have applied graph grammars to the pattern recognition of handwritten characters...345 1. Issues / 345 2. Mostow's operationalizer / 350 D. Learning from examples / 360 1. Issues / 360 2. Learning in control and pattern recognition ...articles on rote learning and advice-taking. Kenneth Clarkson contributed the article on grammatical inference, and Geoff’ lroiney wrote

  3. Online Farsi digit recognition using their upper half structure

    NASA Astrophysics Data System (ADS)

    Ghods, Vahid; Sohrabi, Mohammad Karim

    2015-03-01

    In this paper, we investigated the effectiveness of using the upper half of the structure of Farsi numerical digits. In other words, half of the data (the upper half of the digit shapes) was exploited for the recognition of Farsi numerical digits. This method can be used for both offline and online recognition. Using half of the data is beneficial for processing speed, data transfer and, in this application, accuracy. A hidden Markov model (HMM) was used to classify online Farsi digits. Evaluation was performed on the TMU dataset, which contains more than 1200 samples of online handwritten Farsi digits. The proposed method yielded a higher recognition rate.

  4. The Characteristics of Binary Spike-Time-Dependent Plasticity in HfO2-Based RRAM and Applications for Pattern Recognition

    NASA Astrophysics Data System (ADS)

    Zhou, Zheng; Liu, Chen; Shen, Wensheng; Dong, Zhen; Chen, Zhe; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng

    2017-04-01

    A binary spike-time-dependent plasticity (STDP) protocol based on one resistive-switching random access memory (RRAM) device was proposed and experimentally demonstrated in the fabricated RRAM array. Based on the STDP protocol, a novel unsupervised online pattern recognition system including RRAM synapses and CMOS neurons is developed. Our simulations show that the system can efficiently complete the handwritten digit recognition task, which indicates the feasibility of using the RRAM-based binary STDP protocol in neuromorphic computing systems to obtain good performance.

  5. Textual blocks rectification method based on fast Hough transform analysis in identity documents recognition

    NASA Astrophysics Data System (ADS)

    Bezmaternykh, P. V.; Nikolaev, D. P.; Arlazarov, V. L.

    2018-04-01

    Textual blocks rectification or slant correction is an important stage of document image processing in OCR systems. This paper considers existing methods and introduces an approach for the construction of such algorithms based on Fast Hough Transform analysis. A quality measurement technique is proposed and obtained results are shown for both printed and handwritten textual blocks processing as a part of an industrial system of identity documents recognition on mobile devices.

  6. Image Segmentation of Historical Handwriting from Palm Leaf Manuscripts

    NASA Astrophysics Data System (ADS)

    Surinta, Olarik; Chamchong, Rapeeporn

    Palm leaf manuscripts were one of the earliest forms of written media and were used in Southeast Asia to store early written knowledge about subjects such as medicine, Buddhist doctrine and astrology. Historical handwritten palm leaf manuscripts are therefore important for people who wish to study historical documents, because much can be learned from them. This paper presents an image segmentation of historical handwriting from palm leaf manuscripts. The process is composed of three steps: 1) background elimination, separating text from background with Otsu's algorithm; 2) line segmentation; and 3) character segmentation by histogram of the image. The end result is the image of each character. The results from this research may be applied to optical character recognition (OCR) in the future.
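
The three steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: ink is assumed to be darker than the leaf, "histogram of the image" is read as a projection profile (count of ink pixels per row or column), and the function names and the toy image are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Class 0 is every pixel with value <= t, class 1 is the rest.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def segment_runs(profile):
    """Return (start, end) index pairs of consecutive non-zero profile entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


# Toy grey-level image: dark ink (20-45) on a bright background (200).
gray = [[200, 200, 200, 200],
        [30, 200, 40, 200],
        [35, 200, 45, 200],
        [200, 200, 200, 200],
        [20, 25, 200, 200]]

# Step 1: background elimination with Otsu's threshold.
t = otsu_threshold([p for row in gray for p in row])
binary = [[1 if p <= t else 0 for p in row] for row in gray]

# Step 2: text lines are runs of rows that contain ink.
lines = segment_runs([sum(row) for row in binary])

# Step 3: characters are runs of columns with ink inside one line.
top, bottom = lines[0]
column_profile = [sum(binary[y][x] for y in range(top, bottom + 1))
                  for x in range(len(binary[0]))]
chars = segment_runs(column_profile)
```

Projection profiles of this kind work when lines and characters are separated by empty rows or columns; touching or overlapping writing needs more than this sketch provides.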

  7. Diffuse Interface Methods for Multiclass Segmentation of High-Dimensional Data

    DTIC Science & Technology

    2014-03-04

    handwritten digits, 1998. http://yann.lecun.com/exdb/mnist/. [19] S. Nene, S. Nayar, H. Murase, Columbia Object Image Library (COIL-100), Technical Report... recognition on smartphones using a multiclass hardware-friendly support vector machine, in: Ambient Assisted Living and Home Care, Springer, 2012, pp. 216–223.

  8. Recognizing characters of ancient manuscripts

    NASA Astrophysics Data System (ADS)

    Diem, Markus; Sablatnig, Robert

    2010-02-01

    For printed Latin text, the main issues of Optical Character Recognition (OCR) systems are solved. However, for degraded handwritten document images, basic preprocessing steps such as binarization yield poor results with state-of-the-art methods. In this paper, ancient Slavonic manuscripts from the 11th century are investigated. In order to minimize the consequences of false character segmentation, a binarization-free approach based on local descriptors is proposed. Additionally, local information allows the recognition of partially visible or washed out characters. The proposed algorithm consists of two steps: character classification and character localization. Initially, Scale Invariant Feature Transform (SIFT) features are extracted, which are subsequently classified using Support Vector Machines (SVM). Afterwards, the interest points are clustered according to their spatial information. Thereby, characters are localized and finally recognized based on a weighted voting scheme of pre-classified local descriptors. Preliminary results show that the proposed system can handle highly degraded manuscript images with background clutter (e.g. stains, tears) and faded out characters.

  9. Analysis of line structure in handwritten documents using the Hough transform

    NASA Astrophysics Data System (ADS)

    Ball, Gregory R.; Kasiviswanathan, Harish; Srihari, Sargur N.; Narayanan, Aswin

    2010-01-01

    In the analysis of handwriting in documents a central task is that of determining line structure of the text, e.g., number of text lines, location of their starting and end-points, line-width, etc. While simple methods can handle ideal images, real world documents have complexities such as overlapping line structure, variable line spacing, line skew, document skew, noisy or degraded images etc. This paper explores the application of the Hough transform method to handwritten documents with the goal of automatically determining global document line structure in a top-down manner which can then be used in conjunction with a bottom-up method such as connected component analysis. The performance is significantly better than other top-down methods, such as the projection profile method. In addition, we evaluate the performance of skew analysis by the Hough transform on handwritten documents.
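
The basic voting scheme behind the Hough transform can be sketched in pure Python. This is a bare-bones accumulator for the normal form rho = x cos(theta) + y sin(theta), not the authors' method, whose details are not given in the abstract. The function name `hough_peak`, the one-degree angle grid and the synthetic points are assumptions.

```python
import math


def hough_peak(points, angles_deg=range(0, 180), rho_step=1.0):
    """Vote in (theta, rho) space and return the strongest line.

    points: iterable of (x, y) foreground coordinates.
    Returns (theta in degrees, rho, number of votes) of the fullest bin.
    """
    accumulator = {}
    for x, y in points:
        for angle in angles_deg:
            theta = math.radians(angle)
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (angle, round(rho / rho_step))
            accumulator[key] = accumulator.get(key, 0) + 1
    (angle, rho_bin), votes = max(accumulator.items(), key=lambda kv: kv[1])
    return angle, rho_bin * rho_step, votes


# Synthetic "text line": 100 points on the horizontal line y = 5.
points = [(x, 5) for x in range(100)]
theta, rho, votes = hough_peak(points)

# theta is the direction of the line's normal, so the line itself is
# rotated by theta - 90 degrees from the horizontal.
skew = theta - 90
```

For a real page the points would typically be representative pixels of each connected component rather than every ink pixel, and several peaks (one per text line) would be read from the accumulator instead of only the largest.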

  10. Giro form reading machine

    NASA Astrophysics Data System (ADS)

    Minh Ha, Thien; Niggeler, Dieter; Bunke, Horst; Clarinval, Jose

    1995-08-01

    Although giro forms are used by many people in daily life for money remittance in Switzerland, the processing of these forms at banks and post offices is only partly automated. We describe an ongoing project for building an automatic system that is able to recognize various items printed or written on a giro form. The system comprises three main components, namely, an automatic form feeder, a camera system, and a computer. These components are connected in such a way that the system is able to process a batch of forms without any human interaction. We present two real applications of our system in the field of payment services, which require the reading of both machine printed and handwritten information that may appear on a giro form. One particular feature of giro forms is their flexible layout, i.e., information items are located differently from one form to another, thus requiring an additional analysis step to localize them before recognition. A commercial optical character recognition software package is used for recognition of machine-printed information, whereas handwritten information is read by our own algorithms, the details of which are presented. The system is implemented using a client/server architecture, providing a high degree of flexibility to change. Preliminary results are reported supporting our claim that the system is usable in practice.

  11. A Novel Handwritten Letter Recognizer Using Enhanced Evolutionary Neural Network

    NASA Astrophysics Data System (ADS)

    Mahmoudi, Fariborz; Mirzashaeri, Mohsen; Shahamatnia, Ehsan; Faridnia, Saed

    This paper introduces a novel design for handwritten letter recognition by employing a hybrid back-propagation neural network with an enhanced evolutionary algorithm. The input to the neural network is produced by a new approach that is invariant to translation, rotation, and scaling of input letters. The evolutionary algorithm is used for the global search of the search space and the back-propagation algorithm is used for the local search. The results have been computed by implementing this approach for recognizing 26 English capital letters in the handwriting of different people. The computational results show that the neural network reaches very satisfying results with relatively scarce input data, and that the hybrid evolutionary back-propagation algorithm exhibits a promising improvement in convergence.

  12. An Approach to a Comprehensive Test Framework for Analysis and Evaluation of Text Line Segmentation Algorithms

    PubMed Central

    Brodic, Darko; Milivojevic, Dragan R.; Milivojevic, Zoran N.

    2011-01-01

    The paper introduces a testing framework for the evaluation and validation of text line segmentation algorithms. Text line segmentation represents the key action for correct optical character recognition. Many of the tests for the evaluation of text line segmentation algorithms deal with text databases as reference templates. Because of the mismatch, a reliable testing framework is required. Hence, a new approach to a comprehensive experimental framework for the evaluation of text line segmentation algorithms is proposed. It consists of synthetic multi-like text samples and real handwritten text as well. Although the tests are mutually independent, the results are cross-linked. The proposed method can be used for different types of scripts and languages. Furthermore, two different procedures for the evaluation of algorithm efficiency based on the obtained error type classification are proposed. The first is based on the segmentation line error description, while the second one incorporates well-known signal detection theory. Each of them has different capabilities and convenience, but they can be used as supplements to make the evaluation process efficient. Overall, the proposed procedure based on the segmentation line error description has some advantages, characterized by five measures that describe measurement procedures. PMID:22164106

  13. An approach to a comprehensive test framework for analysis and evaluation of text line segmentation algorithms.

    PubMed

    Brodic, Darko; Milivojevic, Dragan R; Milivojevic, Zoran N

    2011-01-01

    The paper introduces a testing framework for the evaluation and validation of text line segmentation algorithms. Text line segmentation represents the key action for correct optical character recognition. Many of the tests for the evaluation of text line segmentation algorithms deal with text databases as reference templates. Because of the mismatch, a reliable testing framework is required. Hence, a new approach to a comprehensive experimental framework for the evaluation of text line segmentation algorithms is proposed. It consists of synthetic multi-like text samples and real handwritten text as well. Although the tests are mutually independent, the results are cross-linked. The proposed method can be used for different types of scripts and languages. Furthermore, two different procedures for the evaluation of algorithm efficiency based on the obtained error type classification are proposed. The first is based on the segmentation line error description, while the second one incorporates well-known signal detection theory. Each of them has different capabilities and convenience, but they can be used as supplements to make the evaluation process efficient. Overall, the proposed procedure based on the segmentation line error description has some advantages, characterized by five measures that describe measurement procedures.

  14. Eye movement analysis for activity recognition using electrooculography.

    PubMed

    Bulling, Andreas; Ward, Jamie A; Gellersen, Hans; Tröster, Gerhard

    2011-04-01

    In this work, we investigate eye movement analysis as a new sensing modality for activity recognition. Eye movement data were recorded using an electrooculography (EOG) system. We first describe and evaluate algorithms for detecting three eye movement characteristics from EOG signals (saccades, fixations, and blinks) and propose a method for assessing repetitive patterns of eye movements. We then devise 90 different features based on these characteristics and select a subset of them using minimum redundancy maximum relevance (mRMR) feature selection. We validate the method using an eight-participant study in an office environment using an example set of five activity classes: copying a text, reading a printed paper, taking handwritten notes, watching a video, and browsing the Web. We also include periods with no specific activity (the NULL class). Using a support vector machine (SVM) classifier and person-independent (leave-one-person-out) training, we obtain an average precision of 76.1 percent and recall of 70.5 percent over all classes and participants. The work demonstrates the promise of eye-based activity recognition (EAR) and opens up discussion on the wider applicability of EAR to other activities that are difficult, or even impossible, to detect using common sensing modalities.
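    The person-independent (leave-one-person-out) protocol mentioned in this abstract can be sketched as a simple data split. The sample layout `(person_id, features, label)`, the feature values, and the function name `leave_one_person_out` are assumptions made for illustration; the abstract does not describe the authors' data format or tooling.

```python
def leave_one_person_out(samples):
    """Yield (held_out_person, train, test) for each participant in turn.

    samples: list of (person_id, features, label) tuples (hypothetical layout).
    """
    people = sorted({person for person, _, _ in samples})
    for held_out in people:
        # Train on everyone except the held-out person, test on that person only.
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test

# Made-up toy data: three participants, two activity classes.
samples = [
    ("p1", [0.2, 1.1], "reading"),
    ("p1", [0.9, 0.3], "writing"),
    ("p2", [0.1, 1.0], "reading"),
    ("p3", [0.8, 0.2], "writing"),
]
folds = list(leave_one_person_out(samples))
```

    Because no data from the test participant is seen during training, the scores obtained this way indicate how a classifier might generalize to a new person.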

  15. Target and (Astro-)WISE technologies Data federations and its applications

    NASA Astrophysics Data System (ADS)

    Valentijn, E. A.; Begeman, K.; Belikov, A.; Boxhoorn, D. R.; Brinchmann, J.; McFarland, J.; Holties, H.; Kuijken, K. H.; Verdoes Kleijn, G.; Vriend, W.-J.; Williams, O. R.; Roerdink, J. B. T. M.; Schomaker, L. R. B.; Swertz, M. A.; Tsyganov, A.; van Dijk, G. J. W.

    2017-06-01

    After its first implementation in 2003 the Astro-WISE technology has been rolled out in several European countries and is used for the production of the KiDS survey data. In the multi-disciplinary Target initiative this technology, nicknamed WISE technology, has been further applied to a large number of projects. Here, we highlight the data handling of other astronomical applications, such as VLT-MUSE and LOFAR, together with some non-astronomical applications such as the medical projects Lifelines and GLIMPS; the MONK handwritten text recognition system; and business applications, by amongst others, the Target Holding. We describe some of the most important lessons learned and describe the application of the data-centric WISE type of approach to the Science Ground Segment of the Euclid satellite.

  16. [New discovery of the handwritten draft of Eucharius Rösslin's midwifery textbook Pregnant Women and Midwives Rosengarten and Ps.-Ortlof's Small Book for Women].

    PubMed

    Kruse, B J

    1994-01-01

    The author of the famous midwifery textbook Der schwangeren Frauen und Hebammen Rosengarten has until now been thought to be Eucharius Rösslin the Elder, in whose name the first printed edition of the work appeared in 1513. According to him, he compiled the text from various sources in the years 1508-1512 at the suggestion of the Duchess Catherine of Brunswick-Luneburg. In the SB und UB Hamburg there is a handwritten preliminary draft of Rosengarten (Cod. med. 801, p. 9-130), dated by the scribe in the year 1494 (this is borne out by watermark analysis). It reproduces the text of Rosengarten almost identically, but without the privilegium, the dedication, the rhyming 'admonition' of the pregnant women and the midwives, the glossary, and the illustrative woodcuts. The printed version of Rosengarten was also expanded by Eucharius Rösslin the Elder with passages from, among others, Ps.-Ortolfs Frauenbüchlein. The author of this paper was also able to trace a handwritten preliminary draft of Frauenbüchlein, until now unknown, in manuscript 2967 of the Austrian National Library in Vienna. The remark Hic liber pertinet ad Constantinum Roeslin written in the manuscript by a previous owner, and a treatise on syphilis in the hand of Eucharius Rösslin the Younger, would indicate that Cod. med. 801 was once in the possession of the Rösslin family. Since Eucharius Rösslin the Elder was born around 1470, and since errors and omissions in Cod. med. 801 indicate that it is a copy of an older text, we are confronted with the question of whether or not the handwritten edition of Rosengarten originates from him or from some other author.

  17. Signature Verification Using N-tuple Learning Machine.

    PubMed

    Maneechot, Thanin; Kitjaidure, Yuttana

    2005-01-01

    This research presents a new algorithm for signature verification using an N-tuple learning machine. The features are taken from handwritten signatures captured on a digital tablet (on-line). The recognition algorithm uses four extracted features, namely horizontal and vertical pen tip position (x-y position), pen tip pressure, and pen altitude angles. Verification uses the N-tuple technique with Gaussian thresholding.

  18. Arabic Optical Character Recognition (OCR) Evaluation in Order to Develop a Post-OCR Module

    DTIC Science & Technology

    2011-09-01

    handwritten, and many more have some handwriting in the margins. Some images are blurred or faded to the point of illegibility. Others are mostly or...it is to English, because Arabic has more features such as agreement. We say that Arabic is more “morphologically rich” than English. We intend to

  19. Preliminary study towards the development of copying skill assessment on dyslexic children in Jawi handwriting

    NASA Astrophysics Data System (ADS)

    Rahim, Kartini Abdul; Kahar, Rosmila Abdul; Khalid, Halimi Mohd.; Salleh, Rohayu Mohd; Hashim, Rathiah

    2015-05-01

    Recognition of Arabic handwriting and its variants such as Farsi (Persian) and Urdu has received considerable attention in recent years. In contrast to Arabic handwriting, Jawi, a second script for writing Malay, has hardly been studied, and only a few references on it exist. In the recent transformation of Malaysian education, Special Education is one of the priorities in the Malaysia Blueprint. One of the special needs recognized in Malaysian education is dyslexia. A dyslexic student is considered a student with a learning disability. Concluding that a student is truly dyslexic might be incorrect if the student was assessed only through the Roman alphabet, without considering assessment via Jawi handwriting. A study was conducted on dyslexic students attending a special class for dyslexia in Malay Language to determine whether they are also dyslexic in Jawi handwriting. The focus of the study is to test copying skills in relation to word reading and writing in Malay Language, with and without dyslexia, in both scripts. A total of 10 dyslexic children and 10 normal children were recruited. The statistical analysis indicates, as a conclusion for future study, that dyslexic students have less difficulty in performing Jawi handwriting in Malay Language.

  20. Arabic writer identification based on diacritic's features

    NASA Astrophysics Data System (ADS)

    Maliki, Makki; Al-Jawad, Naseer; Jassim, Sabah A.

    2012-06-01

    Natural languages such as Arabic, Kurdish, Farsi (Persian), Urdu, and other similar languages have many features that distinguish them from languages written in Latin script. One of these important features is diacritics. These diacritics are classified as compulsory, like the dots used to identify and differentiate letters, or optional, like the short vowels used to emphasize consonants. Most native and well-trained writers often omit some or all of this second class of diacritics, and expert readers can infer their presence from the context of the writer's text. In this paper, we investigate the use of diacritic shapes and other characteristics as parameters of feature vectors for Arabic writer identification/verification. Segmentation techniques are used to extract the diacritics-based feature vectors from examples of Arabic handwritten text. The results of an evaluation carried out on an in-house database of 50 writers will be presented. The viability of using diacritics for writer recognition will also be demonstrated.

  1. Font generation of personal handwritten Chinese characters

    NASA Astrophysics Data System (ADS)

    Lin, Jeng-Wei; Wang, Chih-Yin; Ting, Chao-Lung; Chang, Ray-I.

    2014-01-01

    Today, digital multimedia messages have drawn more and more attention due to the great achievements of computer and network techniques. Nevertheless, text is still the most popular medium for people to communicate with others. Many fonts have been developed so that product designers can choose unique fonts to present their ideas gracefully. It is commonly believed that handwriting can reflect one's personality, emotion, feeling, education level, and so on. This is especially true in Chinese calligraphy. However, it is not easy for ordinary users to customize a font from their personal handwriting. In this study, we performed a process reengineering of font generation. We present a new method to create a font in a batch mode. Rather than creating glyphs of characters one by one according to their codepoints, people create glyphs incrementally in an on-demand manner. A Java implementation was developed to read a document image of a user's handwritten Chinese characters and make a vector font of these characters. Preliminary experimental results show that the proposed method can help ordinary users create their personal handwritten fonts easily and quickly.

  2. Classification of remotely sensed data using OCR-inspired neural network techniques. [Optical Character Recognition]

    NASA Technical Reports Server (NTRS)

    Kiang, Richard K.

    1992-01-01

    Neural networks have been applied to classifications of remotely sensed data with some success. To improve the performance of this approach, an examination was made of how neural networks are applied to the optical character recognition (OCR) of handwritten digits and letters. A three-layer, feedforward network, along with techniques adopted from OCR, was used to classify Landsat-4 Thematic Mapper data. Good results were obtained. To overcome the difficulties that are characteristic of remote sensing applications and to attain significant improvements in classification accuracy, a special network architecture may be required.

  3. Permutation coding technique for image recognition systems.

    PubMed

    Kussul, Ernst M; Baidyk, Tatiana N; Wunsch, Donald C; Makeyev, Oleksandr; Martín, Anabel

    2006-11-01

    A feature extractor and neural classifier for image recognition systems are proposed. The proposed feature extractor is based on the concept of random local descriptors (RLDs). It is followed by an encoder based on the permutation coding technique, which takes into account not only the detected features but also the position of each feature in the image, and which makes the recognition process invariant to small displacements. The combination of RLDs and permutation coding permits us to obtain a sufficiently general description of the image to be recognized. The code generated by the encoder is used as input data for the neural classifier. Different types of images were used to test the proposed image recognition system. It was tested on the handwritten digit recognition problem, the face recognition problem, and the microobject shape recognition problem. The results of testing are very promising. The error rate for the Modified National Institute of Standards and Technology (MNIST) database is 0.44% and for the Olivetti Research Laboratory (ORL) database it is 0.1%.

  4. Fusion of Dependent and Independent Biometric Information Sources

    DTIC Science & Technology

    2005-03-01

    palmprint, DNA, ECG, signature, etc. The comparison of various biometric techniques is given in [13] and is presented in Table 1. Since, each...theory. Experimental studies on the M2VTS database [32] showed that a reduction in error rates is up to about 40%. Four combination strategies are...taken from the CEDAR benchmark database. The word recognition results were the highest (91%) among published results for handwritten words (before 2001

  5. Dealing with contaminated datasets: An approach to classifier training

    NASA Astrophysics Data System (ADS)

    Homenda, Wladyslaw; Jastrzebska, Agnieszka; Rybnik, Mariusz

    2016-06-01

    The paper presents a novel approach to classification reinforced with a rejection mechanism. The method is based on a two-tier set of classifiers. The first layer classifies elements; the second layer separates native elements from foreign ones in each distinguished class. The key novelty presented here is a training scheme for the rejection mechanism that follows the "one-against-all-other-classes" philosophy. The proposed method was tested in an empirical study of handwritten digit recognition.

  6. Spiking neural networks for handwritten digit recognition-Supervised learning and network optimization.

    PubMed

    Kulkarni, Shruti R; Rajendran, Bipin

    2018-07-01

    We demonstrate supervised learning in Spiking Neural Networks (SNNs) for the problem of handwritten digit recognition using the spike triggered Normalized Approximate Descent (NormAD) algorithm. Our network that employs neurons operating at sparse biological spike rates below 300Hz achieves a classification accuracy of 98.17% on the MNIST test database with four times fewer parameters compared to the state-of-the-art. We present several insights from extensive numerical experiments regarding optimization of learning parameters and network configuration to improve its accuracy. We also describe a number of strategies to optimize the SNN for implementation in memory and energy constrained hardware, including approximations in computing the neuronal dynamics and reduced precision in storing the synaptic weights. Experiments reveal that even with 3-bit synaptic weights, the classification accuracy of the designed SNN does not degrade beyond 1% as compared to the floating-point baseline. Further, the proposed SNN, which is trained based on the precise spike timing information outperforms an equivalent non-spiking artificial neural network (ANN) trained using back propagation, especially at low bit precision. Thus, our study shows the potential for realizing efficient neuromorphic systems that use spike based information encoding and learning for real-world applications.
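    To make the notion of 3-bit synaptic weights concrete, the sketch below rounds each weight to one of 2^3 = 8 evenly spaced levels. The abstract does not say which quantization scheme the authors used, so the uniform min-max quantizer, the function name `quantize_uniform`, and the example weights are all assumptions for illustration only.

```python
def quantize_uniform(weights, bits=3):
    """Round each weight to the nearest of 2**bits evenly spaced levels.

    Assumes max(weights) > min(weights). Returns (quantized weights, step size).
    """
    levels = 2 ** bits
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (levels - 1)
    quantized = [lo + round((w - lo) / step) * step for w in weights]
    return quantized, step

# Made-up floating-point weights.
weights = [-1.0, -0.3, 0.05, 0.4, 1.0]
quantized, step = quantize_uniform(weights, bits=3)
# Largest rounding error; bounded by half a step for a uniform quantizer.
max_error = max(abs(w - q) for w, q in zip(weights, quantized))
```

    With a uniform quantizer the per-weight rounding error is at most half a step, which gives some intuition for why accuracy can remain close to the floating-point baseline.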

  7. Concurrent evolution of feature extractors and modular artificial neural networks

    NASA Astrophysics Data System (ADS)

    Hannak, Victor; Savakis, Andreas; Yang, Shanchieh Jay; Anderson, Peter

    2009-05-01

    This paper presents a new approach for the design of feature-extracting recognition networks that do not require expert knowledge in the application domain. Feature-Extracting Recognition Networks (FERNs) are composed of interconnected functional nodes (feurons), which serve as feature extractors, and are followed by a subnetwork of traditional neural nodes (neurons) that act as classifiers. A concurrent evolutionary process (CEP) is used to search the space of feature extractors and neural networks in order to obtain an optimal recognition network that simultaneously performs feature extraction and recognition. By constraining the hill-climbing search functionality of the CEP to specific parts of the solution space, i.e., individually limiting the evolution of feature extractors and neural networks, it was demonstrated that concurrent evolution is a necessary component of the system. Application of this approach to a handwritten digit recognition task illustrates that the proposed methodology is capable of producing recognition networks that perform in line with other methods without the need for expert knowledge in image processing.

  8. Combination of dynamic Bayesian network classifiers for the recognition of degraded characters

    NASA Astrophysics Data System (ADS)

    Likforman-Sulem, Laurence; Sigelle, Marc

    2009-01-01

    We investigate in this paper the combination of DBN (Dynamic Bayesian Network) classifiers, either independent or coupled, for the recognition of degraded characters. The independent classifiers are a vertical HMM and a horizontal HMM whose observable outputs are the image columns and the image rows respectively. The coupled classifiers, presented in a previous study, associate the vertical and horizontal observation streams into single DBNs. The scores of the independent and coupled classifiers are then combined linearly at the decision level. We compare the different classifiers (independent, coupled, or linearly combined) on two tasks: the recognition of artificially degraded handwritten digits and the recognition of real degraded old printed characters. Our results show that coupled DBNs perform better on degraded characters than the linear combination of independent HMM scores. Our results also show that the best classifier is obtained by linearly combining the scores of the best coupled DBN and the best independent HMM.
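    Linear combination of classifier scores at the decision level can be sketched as a weighted sum of per-class scores followed by an argmax. The score values, the weight, and the function names `combine_scores` and `decide` are hypothetical; the abstract does not give the authors' weights or score scale.

```python
def combine_scores(scores_a, scores_b, weight=0.5):
    """Per-class weighted sum: weight * a + (1 - weight) * b."""
    return {
        label: weight * scores_a[label] + (1.0 - weight) * scores_b[label]
        for label in scores_a
    }

def decide(scores):
    """Pick the class with the highest combined score."""
    return max(scores, key=scores.get)

# Made-up log-likelihood-style scores for two candidate digits.
coupled_dbn = {"3": -10.0, "8": -12.0}
independent_hmm = {"3": -15.0, "8": -11.0}

equal_mix = combine_scores(coupled_dbn, independent_hmm, weight=0.5)
dbn_heavy = combine_scores(coupled_dbn, independent_hmm, weight=0.9)
```

    With equal weights the combined decision here is "8", while weighting the first classifier at 0.9 gives "3", which shows how the mixing weight can change the outcome. In practice such a weight would typically be tuned on held-out data.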

  9. Artificial neural networks for document analysis and recognition.

    PubMed

    Marinai, Simone; Gori, Marco; Soda, Giovanni; Society, Computer

    2005-01-01

    Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters, with widely recognized successful results. However, many other document processing tasks, like preprocessing, layout analysis, character segmentation, word recognition, and signature verification, have also been addressed effectively, with very promising results. This paper surveys the most significant problems in the area of offline document image processing where connectionist-based approaches have been applied. Similarities and differences between approaches belonging to different categories are discussed. Particular emphasis is placed on the crucial role of prior knowledge in the conception of both appropriate architectures and learning algorithms. Finally, the paper provides a critical analysis of the reviewed approaches and outlines the most promising research directions in the field. In particular, a second generation of connectionist-based models is foreseen, based on appropriate graphical representations of the learning environment.

  10. Orthographic and phonological neighborhood effects in handwritten word perception

    PubMed Central

    Goldinger, Stephen D.

    2017-01-01

    In printed-word perception, the orthographic neighborhood effect (i.e., faster recognition of words with more neighbors) has considerable theoretical importance, because it implicates great interactivity in lexical access. Mulatti, Reynolds, and Besner (Journal of Experimental Psychology: Human Perception and Performance, 32, 799–810, 2006) questioned the validity of orthographic neighborhood effects, suggesting that they reflect a confound with phonological neighborhood density. They reported that, when phonological density is controlled, orthographic neighborhood effects vanish. Conversely, phonological neighborhood effects were still evident even when controlling for orthographic neighborhood density. The present study was a replication and extension of Mulatti et al. (2006), with words presented in four different formats (computer-generated print and cursive, and handwritten print and cursive). The results from Mulatti et al. (2006) were replicated with computer-generated stimuli, but were reversed with natural stimuli. These results suggest that, when ambiguity is introduced at the level of individual letters, top-down influences from lexical neighbors are increased. PMID:26306881

  11. Fuzzy Clustering of Multiple Instance Data

    DTIC Science & Technology

    2015-11-30

    depth is not. To illustrate this data, in figure 1 we display the GPR signatures of the same mine buried at 3 in deep in two geographically different...target signature depends on the soil properties of the site. The same mine type is buried at 3 in deep in both sites. Since its formal introduction...drug design [15], and the problem of handwritten digit recognition [16]. To the best of our knowledge, Dietterich et al. [1] were the first to

  12. A paper form processing system with an error correcting function for reading handwritten Kanji strings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Katsumi Marukawa; Kazuki Nakashima; Masashi Koga

    1994-12-31

    This paper presents a paper form processing system with an error correcting function for reading handwritten kanji strings. In the paper form processing system, names and addresses are important key data, and this paper focuses in particular on an error correcting method for name and address recognition. The method automatically corrects errors of the kanji OCR (Optical Character Reader) with the help of word dictionaries and other knowledge. Moreover, it allows names and addresses to be written in any style. The method consists of word matching and "furigana" verification for name strings, and address approval for address strings. For word matching, kanji name candidates are extracted by automaton-type word matching. In "furigana" verification, kana candidate characters recognized by the kana OCR are compared with kana retrieved from the name dictionary based on the kanji name candidates given by the word matching. The correct name is selected from the results of word matching and furigana verification. Also, the address approval efficiently searches for the right address based on a bottom-up procedure which follows hierarchical relations from a lower placename to an upper one by using the positional condition among the placenames. We ascertained that the error correcting method substantially improves the recognition rate and processing speed in experiments on 5,032 forms.

  13. Handwritten character recognition using background analysis

    NASA Astrophysics Data System (ADS)

    Tascini, Guido; Puliti, Paolo; Zingaretti, Primo

    1993-04-01

    The paper describes a low-cost handwritten character recognizer. It consists of three modules: the 'acquisition' module, the 'binarization' module, and the 'core' module. The core module can be logically partitioned into six steps: character dilation, character circumscription, region and 'profile' analysis, 'cut' analysis, decision tree descent, and result validation. First, it reduces the resolution of the binarized regions and detects the minimum rectangle (MR) which encloses the character. The MR partitions the background into regions that surround the character or are enclosed by it, and allows features to be defined as 'profiles' and 'cuts': a 'profile' is the set of vertical or horizontal minimum distances between a side of the MR and the character itself; a 'cut' is a vertical or horizontal image segment delimited by the MR. Then, the core module classifies the character by descending along the decision tree on the basis of the analysis of regions around the character, in particular of the 'profiles' and 'cuts', and without using context information. Finally, it recognizes the character or reactivates the core module by analyzing validation test results. The recognizer is largely insensitive to character discontinuity and is able to detect Arabic numerals and English alphabet capital letters. The recognition rate for a 32 x 32 pixel character is about 97% after the first iteration and over 98% after the second iteration.
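    The minimum rectangle and 'profile' features described in this abstract can be sketched on a small binary image. The code is an illustrative reading of the abstract's definitions, not the authors' implementation: the image representation (lists of 0/1 rows), the function names `minimum_rectangle` and `profile`, and the toy character are assumptions.

```python
def minimum_rectangle(img):
    """Return (top, bottom, left, right) of the tightest box around ink pixels."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1], cols[0], cols[-1]

def profile(img, side="left"):
    """Horizontal distances from one vertical side of the MR to the character.

    For each row inside the MR, count background pixels between the chosen
    side ("left" or "right") and the first ink pixel met from that side.
    """
    top, bottom, left, right = minimum_rectangle(img)
    result = []
    for r in range(top, bottom + 1):
        row = img[r][left:right + 1]
        if side == "right":
            row = row[::-1]
        # A row with no ink gets the full MR width as its distance.
        result.append(next((i for i, v in enumerate(row) if v), len(row)))
    return result

# Toy binary image of a crude "L" (1 = ink, 0 = background).
img = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
mr = minimum_rectangle(img)
left_profile = profile(img, "left")
right_profile = profile(img, "right")
```

    For this "L" the left profile is flat while the right profile is large for the stem rows and zero for the foot, which is the kind of shape cue a decision tree could branch on.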

  14. The recognition of graphical patterns invariant to geometrical transformation of the models

    NASA Astrophysics Data System (ADS)

    Ileană, Ioan; Rotar, Corina; Muntean, Maria; Ceuca, Emilian

    2010-11-01

    When a pattern recognition system is used for image recognition (in robot vision, handwriting recognition, etc.), the system must be able to identify an object regardless of its size or position in the image. The problem of invariant recognition can be approached in several fundamental ways. One may apply the similarity criterion used in associative recall. Alternatively, the original pattern is replaced by a mathematical transform that assures some invariance (e.g., the magnitude of the two-dimensional Fourier transform is translation invariant, and the magnitude of the Mellin transform is scale invariant). In a different approach, the original pattern is represented through a set of features, each of them coded independently of the position, orientation, or size of the pattern. Generally speaking, it is easy to obtain invariance with respect to one transformation group, but it is difficult to obtain simultaneous invariance to rotation, translation, and scale. In this paper we analyze some methods to achieve invariant recognition of images, particularly for digit images. A large number of experiments were carried out, and the conclusions are underlined in the paper.
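    The translation invariance of the Fourier magnitude mentioned in this abstract can be checked numerically. The sketch below uses a naive two-dimensional discrete Fourier transform in plain Python and assumes circular (wrap-around) shifts, for which the invariance is exact; the function names and the toy image are hypothetical.

```python
import cmath

def dft2_magnitude(img):
    """Magnitude of the naive 2-D discrete Fourier transform of img."""
    h, w = len(img), len(img[0])
    mag = []
    for u in range(h):
        row = []
        for v in range(w):
            s = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2 * cmath.pi * (u * y / h + v * x / w)
                    s += img[y][x] * cmath.exp(1j * angle)
            row.append(abs(s))
        mag.append(row)
    return mag

def circular_shift(img, dy, dx):
    """Shift img by (dy, dx) pixels with wrap-around at the borders."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]

# Toy 4x4 pattern and a copy shifted one pixel down and one pixel right.
img = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
shifted = circular_shift(img, 1, 1)

mag_a = dft2_magnitude(img)
mag_b = dft2_magnitude(shifted)
# Largest element-wise difference between the two magnitude spectra.
max_diff = max(abs(a - b) for ra, rb in zip(mag_a, mag_b) for a, b in zip(ra, rb))
```

    A shift only multiplies each Fourier coefficient by a unit-modulus phase factor, so the magnitudes agree up to floating-point error even though the two images differ pixel by pixel. Rotation and scale are not handled by this transform alone, which matches the abstract's point that simultaneous invariance is harder.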

  15. Neuromorphic Hardware Architecture Using the Neural Engineering Framework for Pattern Recognition.

    PubMed

    Wang, Runchun; Thakur, Chetan Singh; Cohen, Gregory; Hamilton, Tara Julia; Tapson, Jonathan; van Schaik, Andre

    2017-06-01

    We present a hardware architecture that uses the neural engineering framework (NEF) to implement large-scale neural networks on field programmable gate arrays (FPGAs) for performing massively parallel real-time pattern recognition. NEF is a framework that is capable of synthesising large-scale cognitive systems from subnetworks and we have previously presented an FPGA implementation of the NEF that successfully performs nonlinear mathematical computations. That work was developed based on a compact digital neural core, which consists of 64 neurons that are instantiated by a single physical neuron using a time-multiplexing approach. We have now scaled this approach up to build a pattern recognition system by combining identical neural cores together. As a proof of concept, we have developed a handwritten digit recognition system using the MNIST database and achieved a recognition rate of 96.55%. The system is implemented on a state-of-the-art FPGA and can process 5.12 million digits per second. The architecture and hardware optimisations presented offer high-speed and resource-efficient means for performing high-speed, neuromorphic, and massively parallel pattern recognition and classification tasks.

  16. Technology and the Oops! Effect: Finding a Bias against Word Processing.

    ERIC Educational Resources Information Center

    Roblyer, M. D.

    1997-01-01

    Introduced to aid writing, word processing can cause unexpected problems for those who use it. Describes four studies in which raters gave word-processed essays consistently lower scores than handwritten essays. Reasons for the discrepancies were higher expectations for typed essays, ease of spotting text errors in typed text, and more difficulty…

  17. Assessment of legibility and completeness of handwritten and electronic prescriptions.

    PubMed

    Albarrak, Ahmed I; Al Rashidi, Eman Abdulrahman; Fatani, Rwaa Kamil; Al Ageel, Shoog Ibrahim; Mohammed, Rafiuddin

    2014-12-01

    To assess the legibility and completeness of handwritten prescriptions and compare with electronic prescription system for medication errors. Prospective study. King Khalid University Hospital (KKUH), Riyadh, Saudi Arabia. Handwritten prescriptions were received from clinical units of Medicine Outpatient Department (MOPD), Primary Care Clinic (PCC) and Surgery Outpatient Department (SOPD) whereas electronic prescriptions were collected from the pediatric ward. The handwritten prescription was assessed for completeness by the checklist designed according to the hospital prescription and evaluated for legibility by two pharmacists. The comparison between handwritten and electronic prescription errors was evaluated based on the validated checklist adopted from previous studies. Legibility and completeness of prescriptions. 398 prescriptions (199 handwritten and 199 e-prescriptions) were assessed. About 71 (35.7%) of handwritten and 5 (2.5%) of electronic prescription errors were identified. A significant statistical difference (P < 0.001) was observed between handwritten and e-prescriptions in omitted dose and omitted route of administration category of error distribution. The rate of completeness in patient identification in handwritten prescriptions was 80.97% in MOPD, 76.36% in PCC and 85.93% in SOPD clinic units. Assessment of medication prescription completeness was 91.48% in MOPD, 88.48% in PCC, and 89.28% in SOPD. This study revealed a high incidence of prescribing errors in handwritten prescriptions. The use of e-prescription system showed a significant decline in the incidence of errors. The legibility of handwritten prescriptions was relatively good whereas the level of completeness was very low.

  18. A randomized comparison between records made with an anesthesia information management system and by hand, and evaluation of the Hawthorne effect.

    PubMed

    Edwards, Kylie-Ellen; Hagen, Sander M; Hannam, Jacqueline; Kruger, Cornelis; Yu, Richard; Merry, Alan F

    2013-10-01

    Anesthesia information management system (AIMS) technology is designed to facilitate high-quality anesthetic recordkeeping. We examined the hypothesis that no difference exists between AIMS and handwritten anesthetic records in regard to the completeness of important information contained as text data. We also investigated the effect of observational research on the completeness of anesthesiologists' recordkeeping. As part of a larger randomized controlled trial, participants were randomized to produce 400 anesthetic records, either handwritten (n = 200) or using an AIMS (n = 200). Records were assessed against a 32-item checklist modified from a clinical guideline. Intravenous agent and bolus recordings were quantified, and data were compared between handwritten and AIMS records. Records produced with intensive research observation during the initial phase of the study (n = 200) were compared with records produced with reduced intensity observation during the final phase of the study (n = 200). The AIMS records were more complete than the handwritten records (mean difference 7.1%; 95% confidence interval [CI] 5.6 to 8.6%; P < 0.0001), with higher completion rates for six individual items on the checklist (P < 0.0001). Drug annotation data were equal between arms. The records completed early in the study, during a period of more intense observation, were more thorough than subsequent records (87.3% vs 81.6%, respectively; mean difference 5.7%; 95% CI 4.2 to 7.3%; P < 0.0001). The AIMS records were more complete than the handwritten records for 32 predefined items. The potential of observational research to influence professional behaviour in an anesthetic context was confirmed. This trial was registered at the Australian New Zealand Clinical Trials Registry No 12608000068369.

  19. Sea Level Data Archaeology for the Global Sea Level Observing System (GLOSS)

    NASA Astrophysics Data System (ADS)

    Bradshaw, Elizabeth; Matthews, Andy; Rickards, Lesley; Jevrejeva, Svetlana

    2015-04-01

    The Global Sea Level Observing System (GLOSS) was set up in 1985 to collect long term tide gauge observations and has carried out a number of data archaeology activities over the past decade, including sending member organisations questionnaires to report on their repositories. The GLOSS Group of Experts (GLOSS GE) is looking to future developments in sea level data archaeology and will provide its user community with guidance on finding, digitising, quality controlling and distributing historic records. Many records may not be held in organisational archives and may instead be in national libraries, archives and other collections. GLOSS will promote a Citizen Science approach to discovering long term records by providing tools for volunteers to report data. Tide gauge data come in two different formats, charts and hand-written ledgers. Charts are paper analogue records generated by the mechanical instrument driving a pen trace. Several GLOSS members have developed software to automatically digitise these charts and the various methods were reported in a paper on automated techniques for the digitization of archived mareograms, delivered to the GLOSS GE 13th meeting. GLOSS is creating a repository of software for scanning analogue charts. NUNIEAU is the only publicly available software for digitising tide gauge charts but other organisations have developed their own tide gauge digitising software that is available internally. There are several other freely available software packages that convert image data to numerical values. 
GLOSS could coordinate a comparison study of the various digitising software programs by: (1) sending the same charts to each organisation and asking everyone to digitise them using their own procedures; (2) comparing the digitised data; and (3) providing recommendations to the GLOSS community. The other major form of analogue sea level data is handwritten ledgers, which are usually observations of high and low waters, but sometimes contain higher frequency data. The standard current method for digitising these data is to enter the values manually, which has been performed by GLOSS countries, including France and Spain. The GLOSS GE is exploring other methods for use in the future as this process is time consuming. Current projects to improve Handwritten Text Recognition (HTR) tend to be working with the written word and so require knowledge of sentence structures and word occurrence probabilities to reconstruct sentences, e.g. tranScriptorium (European Union's Seventh Framework Programme funded project). This approach would not be applicable to sea level data; however, tidal data by its very nature contains periodicity and predictability. HTR technology could be adapted to take this into account and improve the automatic digitisation of handwritten tide gauge ledgers. There are many challenges facing the sea level data archaeology community, but it is hoped that improvements in technology can overcome some of the obstacles: faster automated digitisation of tide gauge charts; minimal user input; and automatic transcribing of handwritten ledgers. The GLOSS GE will provide a central location to share software and guidelines for quality controlling data, and the GLOSS data archive centres will be the repository of the newly created datasets.

  20. Enhancement and character recognition of the erased colophon of a 15th-century Hebrew prayer book

    NASA Astrophysics Data System (ADS)

    Walvoord, Derek J.; Easton, Roger L., Jr.; Knox, Keith T.; Heimbueger, Matthew

    2005-01-01

    A handwritten codex often included an inscription that listed facts about its publication, such as the names of the scribe and patron, date of publication, the city where the book was copied, etc. These facts obviously provide essential information to a historian studying the provenance of the codex. Unfortunately, this page was sometimes erased after the sale of the book to a new owner, often by scraping off the original ink. The importance of recovering this information would be difficult to overstate. This paper reports on the methods of imaging, image enhancement, and character recognition that were applied to this page in a Hebrew prayer book copied in Florence in the 15th century.

  1. Enhancement and character recognition of the erased colophon of a 15th-century Hebrew prayer book

    NASA Astrophysics Data System (ADS)

    Walvoord, Derek J.; Easton, Roger L., Jr.; Knox, Keith T.; Heimbueger, Matthew

    2004-12-01

    A handwritten codex often included an inscription that listed facts about its publication, such as the names of the scribe and patron, date of publication, the city where the book was copied, etc. These facts obviously provide essential information to a historian studying the provenance of the codex. Unfortunately, this page was sometimes erased after the sale of the book to a new owner, often by scraping off the original ink. The importance of recovering this information would be difficult to overstate. This paper reports on the methods of imaging, image enhancement, and character recognition that were applied to this page in a Hebrew prayer book copied in Florence in the 15th century.

  2. Diverse spike-timing-dependent plasticity based on multilevel HfOx memristor for neuromorphic computing

    NASA Astrophysics Data System (ADS)

    Lu, Ke; Li, Yi; He, Wei-Fan; Chen, Jia; Zhou, Ya-Xiong; Duan, Nian; Jin, Miao-Miao; Gu, Wei; Xue, Kan-Hao; Sun, Hua-Jun; Miao, Xiang-Shui

    2018-06-01

    Memristors have emerged as promising candidates for artificial synaptic devices, serving as the building block of brain-inspired neuromorphic computing. In this letter, we developed a Pt/HfOx/Ti memristor with nonvolatile multilevel resistive switching behaviors due to the evolution of the conductive filaments and the variation in the Schottky barrier. Diverse state-dependent spike-timing-dependent-plasticity (STDP) functions were implemented with different initial resistance states. The measured STDP forms were adopted as the learning rule for a three-layer spiking neural network which achieves a 75.74% recognition accuracy for the MNIST handwritten digit dataset. This work has shown the capability of memristive synapses in spiking neural networks for pattern recognition applications.

  3. Introduction of statistical information in a syntactic analyzer for document image recognition

    NASA Astrophysics Data System (ADS)

    Maroneze, André O.; Coüasnon, Bertrand; Lemaitre, Aurélie

    2011-01-01

    This paper presents an improvement to document layout analysis systems, offering a possible solution to Sayre's paradox (which states that an element "must be recognized before it can be segmented; and it must be segmented before it can be recognized"). This improvement, based on stochastic parsing, allows integration of statistical information, obtained from recognizers, during syntactic layout analysis. We present how this fusion of numeric and symbolic information in a feedback loop can be applied to syntactic methods to improve document description expressiveness. To limit combinatorial explosion during exploration of solutions, we devised an operator that allows optional activation of the stochastic parsing mechanism. Our evaluation on 1250 handwritten business letters shows this method allows the improvement of global recognition scores.

  4. Assessment of legibility and completeness of handwritten and electronic prescriptions

    PubMed Central

    Albarrak, Ahmed I; Al Rashidi, Eman Abdulrahman; Fatani, Rwaa Kamil; Al Ageel, Shoog Ibrahim; Mohammed, Rafiuddin

    2014-01-01

    Objectives To assess the legibility and completeness of handwritten prescriptions and compare with electronic prescription system for medication errors. Design Prospective study. Setting King Khalid University Hospital (KKUH), Riyadh, Saudi Arabia. Subjects and methods Handwritten prescriptions were received from clinical units of Medicine Outpatient Department (MOPD), Primary Care Clinic (PCC) and Surgery Outpatient Department (SOPD) whereas electronic prescriptions were collected from the pediatric ward. The handwritten prescription was assessed for completeness by the checklist designed according to the hospital prescription and evaluated for legibility by two pharmacists. The comparison between handwritten and electronic prescription errors was evaluated based on the validated checklist adopted from previous studies. Main outcome measures Legibility and completeness of prescriptions. Results 398 prescriptions (199 handwritten and 199 e-prescriptions) were assessed. About 71 (35.7%) of handwritten and 5 (2.5%) of electronic prescription errors were identified. A significant statistical difference (P < 0.001) was observed between handwritten and e-prescriptions in omitted dose and omitted route of administration category of error distribution. The rate of completeness in patient identification in handwritten prescriptions was 80.97% in MOPD, 76.36% in PCC and 85.93% in SOPD clinic units. Assessment of medication prescription completeness was 91.48% in MOPD, 88.48% in PCC, and 89.28% in SOPD. Conclusions This study revealed a high incidence of prescribing errors in handwritten prescriptions. The use of e-prescription system showed a significant decline in the incidence of errors. The legibility of handwritten prescriptions was relatively good whereas the level of completeness was very low. PMID:25561864

  5. Quicksilver IV: The Real Operation Fortitude

    DTIC Science & Technology

    2010-06-01

    Fortitude, they have also focused on the personalities that made those operations so fascinating; they have devoted entire books to Juan Garcia...was unclear, I have included explanatory notes, based on my own insights, in an effort to provide clarity. The original text is in normal font . Text...that was handwritten in is in italics. Text that was manually crossed out is in a strikethrough font . Notes on Coordinates and Conversion The

  6. Identifying images of handwritten digits using deep learning in H2O

    NASA Astrophysics Data System (ADS)

    Sadhasivam, Jayakumar; Charanya, R.; Kumar, S. Harish; Srinivasan, A.

    2017-11-01

    Automatic digit recognition is of popular interest today. Deep learning techniques make it possible for object recognition in image data. Perceiving the digit has turned into a fundamental part as far as certifiable applications. Since, digits are composed in various styles in this way to distinguish the digit it is important to perceive and arrange it with the assistance of machine learning methods. This exploration depends on supervised learning vector quantization neural system arranged under counterfeit artificial neural network. The pictures of digits are perceived, prepared and tried. After the system is made digits are prepared utilizing preparing dataset vectors and testing is connected to the pictures of digits which are separated to each other by fragmenting the picture and resizing the digit picture as needs be for better precision.

  7. Korean letter handwritten recognition using deep convolutional neural network on android platform

    NASA Astrophysics Data System (ADS)

    Purnamawati, S.; Rachmawati, D.; Lumanauw, G.; Rahmat, R. F.; Taqyuddin, R.

    2018-03-01

    Currently, the popularity of Korean culture attracts many people to learn everything about Korea, particularly its language. To acquire the Korean language, every learner needs to be able to understand the Korean non-Latin characters. A digital approach needs to be carried out in order to make the Korean learning process easier. This study is done by using a Deep Convolutional Neural Network (DCNN). The DCNN performs the recognition process on the image based on a model that has been trained, such as the Inception-v3 model. Subsequently, a re-training process using the transfer learning technique with the trained and re-trained values of the model is carried out in order to develop a new model with a better performance without any specific systemic errors. The testing accuracy achieved in this research is 86.9%.

  8. Postprocessing for character recognition using pattern features and linguistic information

    NASA Astrophysics Data System (ADS)

    Yoshikawa, Takatoshi; Okamoto, Masayosi; Horii, Hiroshi

    1993-04-01

    We propose a new method of post-processing for character recognition using pattern features and linguistic information. This method corrects errors in the recognition of handwritten Japanese sentences containing Kanji characters. This post-process method is characterized by having two types of character recognition. Improving the accuracy of the character recognition rate of Japanese characters is made difficult by the large number of characters, and the existence of characters with similar patterns. Therefore, it is not practical for a character recognition system to recognize all characters in detail. First, this post-processing method generates a candidate character table by recognizing the simplest features of characters. Then, it selects words corresponding to the character from the candidate character table by referring to a word and grammar dictionary before selecting suitable words. If the correct character is included in the candidate character table, this process can correct an error; however, if the character is not included, it cannot correct an error. Therefore, this method uses linguistic information (word and grammar dictionary) to presume when a character does not exist in the candidate character table, and it can then verify the presumed character by character recognition using complex features. When this method is applied to an online character recognition system, the accuracy of character recognition improves from 93.5% to 94.7%. This proved to be the case when it was used for the editorials of a Japanese newspaper (Asahi Shinbun).
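
    The dictionary-lookup step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the ranked-list data layout and the Latin-letter toy data are all assumptions made for illustration, and the grammar dictionary and the second recognition stage are left out.

```python
from itertools import product

def correct_with_dictionary(candidates, dictionary):
    """Pick the first dictionary word that can be spelled from the candidates.

    candidates: one ranked list of candidate characters per position
                (hypothetical layout of the "candidate character table").
    dictionary: set of valid words (stand-in for the word dictionary).
    Returns the chosen word, or None when no combination is a valid word,
    which is the case where the correct character is missing from the table.
    """
    # product() starts with the top-ranked character at every position,
    # so an already-correct recognition result is returned unchanged.
    for combo in product(*candidates):
        word = "".join(combo)
        if word in dictionary:
            return word
    return None
```

    In this sketch a None result marks the situation the abstract singles out: the table does not contain the right character, so a more detailed recognition pass would be needed.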

  9. Computerized Orders with Standardized Concentrations Decrease Dispensing Errors of Continuous Infusion Medications for Pediatrics

    PubMed Central

    Sowan, Azizeh K.; Vaidya, Vinay U.; Soeken, Karen L.; Hilmas, Elora

    2010-01-01

    OBJECTIVES The use of continuous infusion medications with individualized concentrations may increase the risk for errors in pediatric patients. The objective of this study was to evaluate the effect of computerized prescriber order entry (CPOE) for continuous infusions with standardized concentrations on frequency of pharmacy processing errors. In addition, time to process handwritten versus computerized infusion orders was evaluated and user satisfaction with CPOE as compared to handwritten orders was measured. METHODS Using a crossover design, 10 pharmacists in the pediatric satellite within a university teaching hospital were given test scenarios of handwritten and CPOE order sheets and asked to process infusion orders using the pharmacy system in order to generate infusion labels. Participants were given three groups of orders: five correct handwritten orders, four handwritten orders written with deliberate errors, and five correct CPOE orders. Label errors were analyzed and time to complete the task was recorded. RESULTS Using CPOE orders, participants required less processing time per infusion order (2 min, 5 sec ± 58 sec) compared with time per infusion order in the first handwritten order sheet group (3 min, 7 sec ± 1 min, 20 sec) and the second handwritten order sheet group (3 min, 26 sec ± 1 min, 8 sec), (p<0.01). CPOE eliminated all error types except wrong concentration. With CPOE, 4% of infusions processed contained errors, compared with 26% of the first group of handwritten orders and 45% of the second group of handwritten orders (p<0.03). Pharmacists were more satisfied with CPOE orders when compared with the handwritten method (p=0.0001). CONCLUSIONS CPOE orders saved pharmacists' time and greatly improved the safety of processing continuous infusions, although not all errors were eliminated. Pharmacists were overwhelmingly satisfied with the CPOE orders. PMID:22477811

  10. Learning representation hierarchies by sharing visual features: a computational investigation of Persian character recognition with unsupervised deep learning.

    PubMed

    Sadeghi, Zahra; Testolin, Alberto

    2017-08-01

    In humans, efficient recognition of written symbols is thought to rely on a hierarchical processing system, where simple features are progressively combined into more abstract, high-level representations. Here, we present a computational model of Persian character recognition based on deep belief networks, where increasingly more complex visual features emerge in a completely unsupervised manner by fitting a hierarchical generative model to the sensory data. Crucially, high-level internal representations emerging from unsupervised deep learning can be easily read out by a linear classifier, achieving state-of-the-art recognition accuracy. Furthermore, we tested the hypothesis that handwritten digits and letters share many common visual features: A generative model that captures the statistical structure of the letters distribution should therefore also support the recognition of written digits. To this aim, deep networks trained on Persian letters were used to build high-level representations of Persian digits, which were indeed read out with high accuracy. Our simulations show that complex visual features, such as those mediating the identification of Persian symbols, can emerge from unsupervised learning in multilayered neural networks and can support knowledge transfer across related domains.

  11. A Set of Handwriting Features for Use in Automated Writer Identification.

    PubMed

    Miller, John J; Patterson, Robert Bradley; Gantz, Donald T; Saunders, Christopher P; Walch, Mark A; Buscaglia, JoAnn

    2017-05-01

    A writer's biometric identity can be characterized through the distribution of physical feature measurements ("writer's profile"); a graph-based system that facilitates the quantification of these features is described. To accomplish this quantification, handwriting is segmented into basic graphical forms ("graphemes"), which are "skeletonized" to yield the graphical topology of the handwritten segment. The graph-based matching algorithm compares the graphemes first by their graphical topology and then by their geometric features. Graphs derived from known writers can be compared against graphs extracted from unknown writings. The process is computationally intensive and relies heavily upon statistical pattern recognition algorithms. This article focuses on the quantification of these physical features and the construction of the associated pattern recognition methods for using the features to discriminate among writers. The graph-based system described in this article has been implemented in a highly accurate and approximately language-independent biometric recognition system of writers of cursive documents. © 2017 American Academy of Forensic Sciences.

  12. Recognition of Handwriting from Electromyography

    PubMed Central

    Linderman, Michael; Lebedev, Mikhail A.; Erlichman, Joseph S.

    2009-01-01

    Handwriting – one of the most important developments in human culture – is also a methodological tool in several scientific disciplines, most importantly handwriting recognition methods, graphology and medical diagnostics. Previous studies have relied largely on the analyses of handwritten traces or kinematic analysis of handwriting, whereas electromyographic (EMG) signals associated with handwriting have received little attention. Here we show, for the first time, a method in which EMG signals generated by hand and forearm muscles during handwriting activity are reliably translated into both algorithm-generated handwriting traces and font characters using decoding algorithms. Our results demonstrate the feasibility of recreating handwriting solely from EMG signals – a finding that can be utilized in computer peripherals and myoelectric prosthetic devices. Moreover, this approach may provide a rapid and sensitive method for diagnosing a variety of neurodegenerative diseases before other symptoms become clear. PMID:19707562

  13. Maximum entropy PDF projection: A review

    NASA Astrophysics Data System (ADS)

    Baggenstoss, Paul M.

    2017-06-01

    We review maximum entropy (MaxEnt) PDF projection, a method with wide potential applications in statistical inference. The method constructs a sampling distribution for a high-dimensional vector x based on knowing the sampling distribution p(z) of a lower-dimensional feature z = T (x). Under mild conditions, the distribution p(x) having highest possible entropy among all distributions consistent with p(z) may be readily found. Furthermore, the MaxEnt p(x) may be sampled, making the approach useful in Monte Carlo methods. We review the theorem and present a case study in model order selection and classification for handwritten character recognition.
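
    For orientation, the projection that this abstract describes is usually written in the following form. The reference distribution p_0 and the subscript notation are assumptions introduced here, not symbols taken from the abstract:

```latex
% z = T(x) is the feature, p(z) its known sampling distribution.
% p_0(x) is a reference distribution on x (assumed notation) and
% p_{0,z}(z) is the distribution that p_0 induces on z through T.
\[
  p(x) \;=\; \frac{p_{0}(x)}{p_{0,z}\bigl(T(x)\bigr)}\; p\bigl(T(x)\bigr)
\]
```

    Under the conditions the abstract calls mild, this p(x) is a valid density whose induced feature distribution is p(z). The maximum entropy variant then corresponds to choosing the reference distribution so that p(x) has the highest entropy among all such densities.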

  14. Two-stage approach to keyword spotting in handwritten documents

    NASA Astrophysics Data System (ADS)

    Haji, Mehdi; Ameri, Mohammad R.; Bui, Tien D.; Suen, Ching Y.; Ponson, Dominique

    2013-12-01

    Separation of keywords from non-keywords is the main problem in keyword spotting systems which has traditionally been approached by simplistic methods, such as thresholding of recognition scores. In this paper, we analyze this problem from a machine learning perspective, and we study several standard machine learning algorithms specifically in the context of non-keyword rejection. We propose a two-stage approach to keyword spotting and provide a theoretical analysis of the performance of the system which gives insights on how to design the classifier in order to maximize the overall performance in terms of F-measure.
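
    The baseline this abstract calls simplistic, thresholding of recognition scores, and the F-measure used to judge it can be sketched as follows. This is an illustration only: the function names and the toy scores are assumptions, and the paper's two-stage classifier is not reproduced.

```python
def f_measure(scores, labels, threshold):
    """F-measure when every score >= threshold is accepted as a keyword.

    scores: recognition scores, one per candidate word image.
    labels: True where the candidate really is a keyword.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(scores, labels):
    """Try each observed score as the threshold and keep the best one."""
    return max(sorted(set(scores)), key=lambda t: f_measure(scores, labels, t))
```

    A learned rejection stage would replace the single threshold with a classifier, but the same F-measure can still be used to compare the two.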

  15. Invariant approach to the character classification

    NASA Astrophysics Data System (ADS)

    Šariri, Kristina; Demoli, Nazif

    2008-04-01

    Image moments analysis is a very useful tool which allows image description invariant to translation and rotation, scale change and some types of image distortions. The aim of this work was the development of a simple method for fast and reliable classification of characters by using Hu's and affine moment invariants. A measure of Euclidean distance was used as a discrimination feature with statistical parameters estimated. The method was tested in classification of Times New Roman font letters as well as sets of handwritten characters. It is shown that using all Hu's and three affine invariants as the discrimination set improves recognition rate by 30%.
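
    As a rough illustration of the idea in this abstract, the sketch below computes the first two of Hu's seven invariants from normalized central moments and classifies by Euclidean distance to stored prototypes. It is a simplified sketch under stated assumptions: only two invariants are used, the affine invariants are omitted, and the function names and list-of-rows image layout are hypothetical.

```python
import math

def hu_first_two(img):
    """First two Hu moment invariants of an image given as a list of rows."""
    m00 = sum(sum(row) for row in img)
    xbar = sum(x * v for row in img for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(img) for v in row) / m00

    def eta(p, q):
        # Central moment mu_pq, normalized so that it is scale invariant.
        mu = sum((x - xbar) ** p * (y - ybar) ** q * v
                 for y, row in enumerate(img) for x, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return (h1, h2)

def classify(img, prototypes):
    """Return the label whose prototype features are nearest (Euclidean)."""
    feats = hu_first_two(img)
    return min(prototypes, key=lambda label: math.dist(feats, prototypes[label]))
```

    Because the features are built from central moments, shifting a character inside the image leaves them unchanged, which is what lets a single prototype per class stand in for every position of that character.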

  16. Hemispheric Differences in Processing Handwritten Cursive

    ERIC Educational Resources Information Center

    Hellige, Joseph B.; Adamson, Maheen M.

    2007-01-01

    Hemispheric asymmetry was examined for native English speakers identifying consonant-vowel-consonant (CVC) non-words presented in standard printed form, in standard handwritten cursive form or in handwritten cursive with the letters separated by small gaps. For all three conditions, fewer errors occurred when stimuli were presented to the right…

  17. Kernel-aligned multi-view canonical correlation analysis for image recognition

    NASA Astrophysics Data System (ADS)

    Su, Shuzhi; Ge, Hongwei; Yuan, Yun-Hao

    2016-09-01

    Existing kernel-based correlation analysis methods mainly adopt a single kernel in each view. However, a single kernel is usually insufficient to characterize the nonlinear distribution information of a view. To solve the problem, we transform each original feature vector into a 2-dimensional feature matrix by means of kernel alignment, and then propose a novel kernel-aligned multi-view canonical correlation analysis (KAMCCA) method on the basis of the feature matrices. Our proposed method can simultaneously employ multiple kernels to better capture the nonlinear distribution information of each view, so that correlation features learned by KAMCCA can have good discriminating power in real-world image recognition. Extensive experiments are designed on five real-world image datasets, including NIR face images, thermal face images, visible face images, handwritten digit images, and object images. Promising experimental results on the datasets demonstrate the effectiveness of our proposed method.

  18. Unsupervised categorization method of graphemes on handwritten manuscripts: application to style recognition

    NASA Astrophysics Data System (ADS)

    Daher, H.; Gaceb, D.; Eglin, V.; Bres, S.; Vincent, N.

    2012-01-01

    We present in this paper a feature selection and weighting method for medieval handwriting images that relies on codebooks of shapes of small strokes of characters (graphemes that are issued from the decomposition of manuscripts). These codebooks are important to simplify the automation of the analysis, the transcription of manuscripts and the recognition of styles or writers. Our approach provides a precise feature weighting by genetic algorithms and a high-performance methodology for the categorization of the shapes of graphemes by using graph coloring into codebooks, which are applied in turn to CBIR (Content Based Image Retrieval) in a mixed handwriting database containing different pages from different writers, periods of history and quality. We show how the coupling of these two mechanisms, 'features weighting - graphemes classification', can offer a better separation of the forms to be categorized by exploiting the particularities of their grapho-morphology, their density and their significant orientations.

  19. Usage of the back-propagation method for alphabet recognition

    NASA Astrophysics Data System (ADS)

    Shaila Sree, R. N.; Eswaran, Kumar; Sundararajan, N.

    1999-03-01

    Artificial Neural Networks play a pivotal role in the branch of Artificial Intelligence. They can be trained efficiently for a variety of tasks using different methods, of which the Back Propagation method is one. The paper studies the choice of various design parameters of a neural network for the Back Propagation method. The study shows that when these parameters are properly assigned, the training task of the net is greatly simplified. The character recognition problem has been chosen as a test case for this study. A sample space of different handwritten characters of the English alphabet was gathered. A neural net is finally designed taking many of the design aspects into consideration and trained for different styles of writing. Experimental results are reported and discussed. It has been found that an appropriate choice of the design parameters of the neural net for the Back Propagation method reduces the training time and improves the performance of the net.

  20. Two medieval doctors: Gilbertus Anglicus (c1180-c1250) and John of Gaddesden (1280-1361).

    PubMed

    Pearn, John

    2013-02-01

    Biographies of medieval English doctors are uncommon and fragmentary. The two best-known English medieval physicians were Gilbertus Anglicus and John of Gaddesden. This paper brings together the known details of their lives, compiled from extant biographies and from internal references in their texts. The primary records of their writings exist in handwritten texts and thereafter in incunabula from the time of the invention of printing in 1476. The record of the lives of these two medieval physicians can be expanded, as here, by the general perspective of the life and times in which they lived. Gilbertus Anglicus, an often-quoted physician-teacher at Montpellier, wrote a seven-folio Compendium medicinae in 1271. He described pioneering procedures used later in the emergent disciplines of anaesthetics, cosmetic medicine and travel medicine. Gilbertus' texts, used extensively in European medical schools, passed in handwritten copies from student to student and eventually were printed in 1510. John of Gaddesden, an Oxford graduate in Arts, Medicine and Theology, wrote Rosa Anglica, published circa 1314. Its detailed text is an exemplar of the mixture of received Hippocratic and Galenic lore compounded by medieval astronomy and religious injunction, which mixture was the essence of medieval medicine. The writings of both these medieval English physicians formed part of the core curriculum that underpinned the practice of medicine for the next 400 years.

  1. Handwritten word preprocessing for database adaptation

    NASA Astrophysics Data System (ADS)

    Oprean, Cristina; Likforman-Sulem, Laurence; Mokbel, Chafic

    2013-01-01

    Handwriting recognition systems are typically trained using publicly available databases, where data have been collected in controlled conditions (image resolution, paper background, noise level,...). Since this is not often the case in real-world scenarios, classification performance can be affected when novel data is presented to the word recognition system. To overcome this problem, we present in this paper a new approach called database adaptation. It consists of processing one set (training or test) in order to adapt it to the other set (test or training, respectively). Specifically, two kinds of preprocessing, namely stroke thickness normalization and pixel intensity normalization are considered. The advantage of such approach is that we can re-use the existing recognition system trained on controlled data. We conduct several experiments with the Rimes 2011 word database and with a real-world database. We adapt either the test set or the training set. Results show that training set adaptation achieves better results than test set adaptation, at the cost of a second training stage on the adapted data. Accuracy of data set adaptation is increased by 2% to 3% in absolute value over no adaptation.

  2. Historical Analyses of Disordered Handwriting: Perspectives on Early 20th-Century Material From a German Psychiatric Hospital

    ERIC Educational Resources Information Center

    Schiegg, Markus; Thorpe, Deborah

    2017-01-01

    Handwritten texts carry significant information, extending beyond the meaning of their words. Modern neurology, for example, benefits from the interpretation of the graphic features of writing and drawing for the diagnosis and monitoring of diseases and disorders. This article examines how handwriting analysis can be used, and has been used…

  3. Unsupervised Word Spotting in Historical Handwritten Document Images using Document-oriented Local Features.

    PubMed

    Zagoris, Konstantinos; Pratikakis, Ioannis; Gatos, Basilis

    2017-05-03

    Word spotting strategies employed in historical handwritten documents face many challenges due to variation in the writing style and intense degradation. In this paper, a new method that permits effective word spotting in handwritten documents is presented. It relies upon document-oriented local features, which take into account information around representative keypoints, as well as a matching process that incorporates spatial context in a local proximity search without using any training data. Experimental results on four historical handwritten datasets for two different scenarios (segmentation-based and segmentation-free) using standard evaluation measures show the improved performance achieved by the proposed methodology.

  4. Deformation-Aware Log-Linear Models

    NASA Astrophysics Data System (ADS)

    Gass, Tobias; Deselaers, Thomas; Ney, Hermann

    In this paper, we present a novel deformation-aware discriminative model for handwritten digit recognition. Unlike previous approaches, our model directly considers image deformations and allows discriminative training of all parameters, including those accounting for non-linear transformations of the image. This is achieved by extending a log-linear framework to incorporate a latent deformation variable. The resulting model has an order of magnitude fewer parameters than competing approaches to handling image deformations. We tune and evaluate our approach on the USPS task and show its generalization capabilities by applying the tuned model to the MNIST task. We gain interesting insights and achieve highly competitive results on both tasks.

  5. Learning in Stochastic Bit Stream Neural Networks.

    PubMed

    van Daalen, Max; Shawe-Taylor, John; Zhao, Jieyu

    1996-08-01

    This paper presents learning techniques for a novel feedforward stochastic neural network. The model uses stochastic weights and the "bit stream" data representation. It has a clean, analysable functionality and is very attractive because of its great potential to be implemented in hardware using standard digital VLSI technology. The design allows simulation at three different levels and learning techniques are described for each level. The lowest level corresponds to on-chip learning. Simulation results on three benchmark MONK's problems and handwritten digit recognition with a clean set of 500 16 x 16 pixel digits demonstrate that the new model is powerful enough for real-world applications. Copyright 1996 Elsevier Science Ltd

  6. Research on Signature Verification Method Based on Discrete Fréchet Distance

    NASA Astrophysics Data System (ADS)

    Fang, J. L.; Wu, W.

    2018-05-01

    This paper proposes a multi-feature signature template based on discrete Fréchet distance, which breaks through the limitation of traditional signature authentication using a single signature feature. It addresses two problems in online handwritten signature authentication: the computational workload of extracting global feature templates and unreasonable signature feature selection. In the experiment, the false recognition rate (FAR) and false rejection rate (FRR) of the signatures are calculated, and from them the average equal error rate (AEER). The feasibility of the combined template scheme is verified by comparing the average equal error rate of the combined template with that of the original template.
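
    The discrete Fréchet distance named in the abstract above can be illustrated with the standard dynamic-programming recurrence. This is a minimal sketch, not the authors' template scheme: the function name, the use of 2-D pen coordinates as the signature feature, and the Euclidean point metric are all assumptions made for illustration.

```python
from math import hypot


def discrete_frechet(p, q):
    """Discrete Frechet distance between two point sequences.

    p and q are non-empty lists of (x, y) tuples, e.g. sampled pen positions
    (an assumed representation; the abstract does not specify the features).
    """
    if not p or not q:
        raise ValueError("both curves need at least one point")
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance of the prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = hypot(p[i][0] - q[j][0], p[i][1] - q[j][1])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance along p, along q, or along both, whichever is cheapest.
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


if __name__ == "__main__":
    reference = [(0, 0), (1, 0), (2, 0)]
    candidate = [(0, 1), (1, 1), (2, 1)]
    print(discrete_frechet(reference, candidate))
```

    A verification system would compare such a distance against a threshold; how the threshold and the multi-feature template are chosen is not described in the abstract and is left open here.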

  7. Identification of handwriting by using the genetic algorithm (GA) and support vector machine (SVM)

    NASA Astrophysics Data System (ADS)

    Zhang, Qigui; Deng, Kai

    2016-12-01

    As portable digital cameras and camera phones become more and more popular, it is equally pressing to meet people's need to photograph, identify and store handwritten characters at any time. In this paper, a genetic algorithm (GA) and a support vector machine (SVM) are used for identification of handwriting. Compared with the parameter-optimized method, this technique overcomes two defects: first, that method is easily trapped in a local optimum; second, searching for the best parameters over a larger range affects the efficiency of classification and prediction. As the experimental results suggest, GA-SVM has a higher recognition rate.

  8. Dense modifiable interconnections utilizing photorefractive volume holograms

    NASA Astrophysics Data System (ADS)

    Psaltis, Demetri; Qiao, Yong

    1990-11-01

    This report describes an experimental two-layer optical neural network built at Caltech. The system uses photorefractive volume holograms to implement dense, modifiable synaptic interconnections and liquid crystal light valves (LCLVs) to perform nonlinear thresholding operations. Kanerva's Sparse, Distributed Memory was implemented using this network and its ability to recognize handwritten alphabet characters (A-Z) has been demonstrated experimentally. According to Kanerva's model, the first layer has fixed, random interconnection weights and the second layer is trained by the sum-of-outer-products rule. After training, the recognition rates of the network on the training set (104 patterns) and test set (520 patterns) are 100 and 50 percent, respectively.
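
    The sum-of-outer-products rule mentioned above for the second layer can be sketched in software as follows. This is only an illustration under stated assumptions: the function names, the 0/1 hidden activations, the +1/-1 targets and the zero threshold are hypothetical choices, and the sketch says nothing about the optical implementation.

```python
def train_outer_product(hidden_patterns, targets):
    """Accumulate W = sum over patterns of outer(target, hidden).

    hidden_patterns: list of hidden-layer activation vectors (assumed 0/1).
    targets: list of desired output vectors (assumed +1/-1).
    """
    n_out, n_hid = len(targets[0]), len(hidden_patterns[0])
    w = [[0.0] * n_hid for _ in range(n_out)]
    for h, t in zip(hidden_patterns, targets):
        for i in range(n_out):
            for j in range(n_hid):
                w[i][j] += t[i] * h[j]
    return w


def recall(w, hidden):
    """Threshold the weighted sums at zero (an assumed nonlinearity)."""
    return [1 if sum(wij * hj for wij, hj in zip(row, hidden)) >= 0 else -1
            for row in w]


if __name__ == "__main__":
    hidden = [[1, 0, 0], [0, 1, 0]]
    targets = [[1, -1], [-1, 1]]
    weights = train_outer_product(hidden, targets)
    print(recall(weights, [1, 0, 0]))
```

    In Kanerva's model the hidden activations would come from the fixed random first layer; here they are supplied directly to keep the example short.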

  9. Historical Analyses of Disordered Handwriting

    PubMed Central

    Schiegg, Markus; Thorpe, Deborah

    2016-01-01

    Handwritten texts carry significant information, extending beyond the meaning of their words. Modern neurology, for example, benefits from the interpretation of the graphic features of writing and drawing for the diagnosis and monitoring of diseases and disorders. This article examines how handwriting analysis can be used, and has been used historically, as a methodological tool for the assessment of medical conditions and how this enhances our understanding of historical contexts of writing. We analyze handwritten material, writing tests and letters, from patients in an early 20th-century psychiatric hospital in southern Germany (Irsee/Kaufbeuren). In this institution, early psychiatrists assessed handwriting features, providing us novel insights into the earliest practices of psychiatric handwriting analysis, which can be connected to Berkenkotter’s research on medical admission records. We finally consider the degree to which historical handwriting bears semiotic potential to explain the psychological state and personality of a writer, and how future research in written communication should approach these sources. PMID:28408774

  10. Semi-automatic ground truth generation using unsupervised clustering and limited manual labeling: Application to handwritten character recognition

    PubMed Central

    Vajda, Szilárd; Rangoni, Yves; Cecotti, Hubert

    2015-01-01

    For training supervised classifiers to recognize different patterns, large data collections with accurate labels are necessary. In this paper, we propose a generic, semi-automatic labeling technique for large handwritten character collections. In order to speed up the creation of a large-scale ground truth, the method combines unsupervised clustering and minimal expert knowledge. To exploit the potential discriminant complementarities across features, each character is projected into five different feature spaces. After clustering the images in each feature space, the human expert labels the cluster centers. Each data point inherits the label of its cluster's center. A majority (or unanimity) vote decides the label of each character image. The amount of human involvement (labeling) is strictly controlled by the number of clusters produced by the chosen clustering approach. To test the efficiency of the proposed approach, we have compared and evaluated three state-of-the-art clustering methods (k-means, self-organizing maps, and growing neural gas) on the MNIST digit data set and a Lampung Indonesian character data set. Considering a k-nn classifier, we show that manually labeling only 1.3% (MNIST) and 3.2% (Lampung) of the training data provides the same range of performance as a completely labeled data set would. PMID:25870463
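
    The label-inheritance and voting step described above can be sketched as follows. This is a minimal illustration, assuming the clustering has already been run in each feature space; the function name, the data layout, and the choice to return None when no vote is reached are assumptions, not details taken from the paper.

```python
from collections import Counter


def propagate_labels(assignments, center_labels, unanimous=False):
    """Give each sample a label by voting across feature spaces.

    assignments[s][i]   : cluster index of sample i in feature space s.
    center_labels[s][c] : expert label of cluster c's center in space s.
    Returns one label per sample, or None where the vote fails (assumed policy).
    """
    n_spaces = len(assignments)
    n_samples = len(assignments[0])
    result = []
    for i in range(n_samples):
        # Each sample inherits, per feature space, the label of its cluster center.
        votes = [center_labels[s][assignments[s][i]] for s in range(n_spaces)]
        label, count = Counter(votes).most_common(1)[0]
        if unanimous:
            accepted = count == n_spaces
        else:
            accepted = count > n_spaces / 2
        result.append(label if accepted else None)
    return result


if __name__ == "__main__":
    assignments = [[0, 0, 1, 1], [0, 1, 1, 1], [1, 1, 0, 0]]
    center_labels = [{0: "a", 1: "b"}, {0: "a", 1: "b"}, {0: "b", 1: "a"}]
    print(propagate_labels(assignments, center_labels))
    print(propagate_labels(assignments, center_labels, unanimous=True))
```

    With this layout the expert labels only the cluster centers, so the manual effort scales with the number of clusters rather than with the number of character images.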

  11. Correlation of patient entry rates and physician documentation errors in dictated and handwritten emergency treatment records.

    PubMed

    Dawdy, M R; Munter, D W; Gilmore, R A

    1997-03-01

    This study was designed to examine the relationship between patient entry rates (a measure of physician work load) and documentation errors/omissions in both handwritten and dictated emergency treatment records. The study was carried out in two phases. Phase I examined handwritten records and Phase II examined dictated and transcribed records. A total of 838 charts for three common chief complaints (chest pain, abdominal pain, asthma/chronic obstructive pulmonary disease) were retrospectively reviewed and scored for the presence or absence of 11 predetermined criteria. Patient entry rates were determined by reviewing the emergency department patient registration logs. The data were analyzed using simple correlation and linear regression analysis. A positive correlation was found between patient entry rates and documentation errors in handwritten charts. No such correlation was found in the dictated charts. We conclude that work load may negatively affect documentation accuracy when charts are handwritten. However, the use of dictation services may minimize or eliminate this effect.

  12. Completion of hand-written surgical consent forms is frequently suboptimal and could be improved by using electronically generated, procedure-specific forms.

    PubMed

    St John, E R; Scott, A J; Irvine, T E; Pakzad, F; Leff, D R; Layer, G T

    2017-08-01

    Completion of hand-written consent forms for surgical procedures may suffer from missing or inaccurate information, poor legibility and high variability. We audited the completion of hand-written consent forms and trialled a web-based application to generate modifiable, procedure-specific consent forms. The investigation comprised two phases at separate UK hospitals. In phase one, the completion of individual responses in hand-written consent forms for a variety of procedures was prospectively audited. Responses were categorised into three domains (patient details, procedure details and patient sign-off) that were considered "failed" if a contained element was not correct and legible. Phase two was confined to a breast surgical unit where hand-written consent forms were assessed as for phase one and interrogated for missing complications by two independent experts. An electronic consent platform was introduced and electronically-produced consent forms assessed. In phase one, 99 hand-written consent forms were assessed and the domain failure rates were: patient details 10%; procedure details 30%; and patient sign-off 27%. Laparoscopic cholecystectomy was the most common procedure (7/99) but there was significant variability in the documentation of complications: 12 in total, a median of 6 and a range of 2-9. In phase two, 44% (27/61) of hand-written forms were missing essential complications. There were no domain failures amongst 29 electronically-produced consent forms and no variability in the documentation of potential complications. Completion of hand-written consent forms suffers from wide variation and is frequently suboptimal. Electronically-produced, procedure-specific consent forms can improve the quality and consistency of consent documentation. Copyright © 2015 Royal College of Surgeons of Edinburgh (Scottish charity number SC005317) and Royal College of Surgeons in Ireland. Published by Elsevier Ltd. All rights reserved.

  13. Are written and spoken recall of text equivalent?

    PubMed

    Kellogg, Ronald T

    2007-01-01

    Writing is less practiced than speaking, graphemic codes are activated only in writing, and the retrieved representations of the text must be maintained in working memory longer because handwritten output is slower than speech. These extra demands on working memory could result in less effort being given to retrieval during written compared with spoken text recall. To test this hypothesis, college students read or heard Bartlett's "War of the Ghosts" and then recalled the text in writing or speech. Spoken recall produced more accurately recalled propositions and more major distortions (e.g., inferences) than written recall. The results suggest that writing reduces the retrieval effort given to reconstructing the propositions of a text.

  14. Reduction in chemotherapy order errors with computerized physician order entry.

    PubMed

    Meisenberg, Barry R; Wright, Robert R; Brady-Copertino, Catherine J

    2014-01-01

    To measure the number and type of errors associated with chemotherapy order composition associated with three sequential methods of ordering: handwritten orders, preprinted orders, and computerized physician order entry (CPOE) embedded in the electronic health record. From 2008 to 2012, a sample of completed chemotherapy orders were reviewed by a pharmacist for the number and type of errors as part of routine performance improvement monitoring. Error frequencies for each of the three distinct methods of composing chemotherapy orders were compared using statistical methods. The rate of problematic order sets-those requiring significant rework for clarification-was reduced from 30.6% with handwritten orders to 12.6% with preprinted orders (preprinted v handwritten, P < .001) to 2.2% with CPOE (preprinted v CPOE, P < .001). The incidence of errors capable of causing harm was reduced from 4.2% with handwritten orders to 1.5% with preprinted orders (preprinted v handwritten, P < .001) to 0.1% with CPOE (CPOE v preprinted, P < .001). The number of problem- and error-containing chemotherapy orders was reduced sequentially by preprinted order sets and then by CPOE. CPOE is associated with low error rates, but it did not eliminate all errors, and the technology can introduce novel types of errors not seen with traditional handwritten or preprinted orders. Vigilance even with CPOE is still required to avoid patient harm.

  15. Image distortion analysis using polynomial series expansion.

    PubMed

    Baggenstoss, Paul M

    2004-11-01

    In this paper, we derive a technique for analysis of local distortions which affect data in real-world applications. In the paper, we focus on image data, specifically handwritten characters. Given a reference image and a distorted copy of it, the method is able to efficiently determine the rotations, translations, scaling, and any other distortions that have been applied. Because the method is robust, it is also able to estimate distortions for two unrelated images, thus determining the distortions that would be required to cause the two images to resemble each other. The approach is based on a polynomial series expansion using matrix powers of linear transformation matrices. The technique has applications in pattern recognition in the presence of distortions.

  16. Historical Analyses of Disordered Handwriting: Perspectives on Early 20th-Century Material From a German Psychiatric Hospital.

    PubMed

    Schiegg, Markus; Thorpe, Deborah

    2017-01-01

    Handwritten texts carry significant information, extending beyond the meaning of their words. Modern neurology, for example, benefits from the interpretation of the graphic features of writing and drawing for the diagnosis and monitoring of diseases and disorders. This article examines how handwriting analysis can be used, and has been used historically, as a methodological tool for the assessment of medical conditions and how this enhances our understanding of historical contexts of writing. We analyze handwritten material, writing tests and letters, from patients in an early 20th-century psychiatric hospital in southern Germany (Irsee/Kaufbeuren). In this institution, early psychiatrists assessed handwriting features, providing us novel insights into the earliest practices of psychiatric handwriting analysis, which can be connected to Berkenkotter's research on medical admission records. We finally consider the degree to which historical handwriting bears semiotic potential to explain the psychological state and personality of a writer, and how future research in written communication should approach these sources.

  17. A comparison of the surface contaminants of handwritten recycled and printed electronic parenteral nutrition prescriptions and their transfer to bag surfaces during delivery to hospital wards.

    PubMed

    Austin, Peter David; Hand, Kieran Sean; Elia, Marinos

    2014-02-01

    Handwritten recycled paper prescription for parenteral nutrition (PN) may become a concentrated source of viable contaminants, including pathogens. This study examined the effect of using fresh printouts of electronic prescriptions on these contaminants. Cellulose sponge stick swabs with neutralizing buffer were used to sample the surfaces of PN prescriptions (n = 32 handwritten recycled; n = 32 printed electronic) on arrival to the pharmacy or following printing and PN prescriptions and bags packaged together during delivery (n = 38 handwritten recycled; n = 34 printed electronic) on arrival to hospital wards. Different media plates and standard microbiological procedures identified the type and number of contaminants. Staphylococcus aureus, fungi, and mold were infrequent contaminants. Nonspecific aerobes more frequently contaminated handwritten recycled than printed electronic prescriptions (into pharmacy, 94% vs 44%, Fisher exact test, P < .001; onto wards, 76% vs 50%, P = .028), with greater numbers of colony-forming units (CFU) (into pharmacy, median 130 [interquartile range (IQR), 65–260] vs 0 [0–75], Mann-Whitney U test, P < .001; onto wards, median 120 [15–320] vs 10 [0–40], P = .001). Packaging with handwritten recycled prescriptions led to more frequent nonspecific aerobic bag surface contamination (63% vs 41%, Fisher exact test, P = .097), with greater numbers of CFU (median 40 [IQR, 0–80] vs 0 [0–40], Mann-Whitney U test, P = .036). The use of printed electronic PN prescriptions can reduce microbial loads for contamination of surfaces that compromises aseptic techniques.

  18. Rotation Reveals the Importance of Configural Cues in Handwritten Word Perception

    PubMed Central

    Barnhart, Anthony S.; Goldinger, Stephen D.

    2013-01-01

    A dramatic perceptual asymmetry occurs when handwritten words are rotated 90° in either direction. Those rotated in a direction consistent with their natural tilt (typically clockwise) become much more difficult to recognize, relative to those rotated in the opposite direction. In Experiment 1, we compared computer-printed and handwritten words, all equated for degrees of leftward and rightward tilt, and verified the phenomenon: The effect of rotation was far larger for cursive words, especially when rotated in a tilt-consistent direction. In Experiment 2, we replicated this pattern with all items presented in visual noise. In both experiments, word frequency effects were larger for computer-printed words and did not interact with rotation. The results suggest that handwritten word perception requires greater configural processing, relative to computer print, because handwritten letters are variable and ambiguous. When words are rotated, configural processing suffers, particularly when rotation exaggerates natural tilt. Our account is similar to theories of the “Thatcher Illusion,” wherein face inversion disrupts holistic processing. Together, the findings suggest that configural, word-level processing automatically increases when people read handwriting, as letter-level processing becomes less reliable. PMID:23589201

  19. Understanding the Use of Graphic Novels to Support the Writing Skills of a Struggling Writer

    ERIC Educational Resources Information Center

    Voss, Christina L.

    2013-01-01

    This mixed methods study combining a single-subject experimental design with an embedded case study focuses on the impact of a visual treatment on the handwritten and typed output of a struggling male writer during his 5th through 7th grades who has undergone a longitudinal remedial phase of two and a half years creating text-only material as well…

  20. Discriminative Features Mining for Offline Handwritten Signature Verification

    NASA Astrophysics Data System (ADS)

    Neamah, Karrar; Mohamad, Dzulkifli; Saba, Tanzila; Rehman, Amjad

    2014-03-01

    Signature verification is an active research area in the field of pattern recognition. It is employed to identify a particular person with the help of the characteristics of his/her signature, such as pen pressure, shape of loops, writing speed and up-down motion of the pen. However, in the entire process, the feature extraction and selection stage is of prime importance, since several signatures have similar strokes, characteristics and sizes. Accordingly, this paper presents a combination of the orientation of the skeleton and the gravity centre point to extract accurate pattern features of signature data in an offline signature verification system. Promising results have proved the success of the integration of the two methods.

  1. Experimental Realization of a Quantum Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Li, Zhaokai; Liu, Xiaomei; Xu, Nanyang; Du, Jiangfeng

    2015-04-01

    The fundamental principle of artificial intelligence is the ability of machines to learn from previous experience and do future work accordingly. In the age of big data, classical learning machines often require huge computational resources in many practical cases. Quantum machine learning algorithms, on the other hand, could be exponentially faster than their classical counterparts by utilizing quantum parallelism. Here, we demonstrate a quantum machine learning algorithm to implement handwriting recognition on a four-qubit NMR test bench. The quantum machine learns standard character fonts and then recognizes handwritten characters from a set with two candidates. Because of the widespread importance of artificial intelligence and its tremendous consumption of computational resources, quantum speedup would be extremely attractive against the challenges of big data.

  2. Silicon synaptic transistor for hardware-based spiking neural network and neuromorphic system

    NASA Astrophysics Data System (ADS)

    Kim, Hyungjin; Hwang, Sungmin; Park, Jungjin; Park, Byung-Gook

    2017-10-01

    Brain-inspired neuromorphic systems have attracted much attention as new computing paradigms for power-efficient computation. Here, we report a silicon synaptic transistor with two electrically independent gates to realize a hardware-based neural network system without any switching components. The spike-timing dependent plasticity characteristics of the synaptic devices are measured and analyzed. With the help of the device model based on the measured data, the pattern recognition capability of the hardware-based spiking neural network systems is demonstrated using the Modified National Institute of Standards and Technology (MNIST) handwritten dataset. By comparing systems with and without an inhibitory synapse part, it is confirmed that the inhibitory synapse part is an essential element in obtaining effective and high pattern classification capability.

  3. Silicon synaptic transistor for hardware-based spiking neural network and neuromorphic system.

    PubMed

    Kim, Hyungjin; Hwang, Sungmin; Park, Jungjin; Park, Byung-Gook

    2017-10-06

    Brain-inspired neuromorphic systems have attracted much attention as new computing paradigms for power-efficient computation. Here, we report a silicon synaptic transistor with two electrically independent gates to realize a hardware-based neural network system without any switching components. The spike-timing dependent plasticity characteristics of the synaptic devices are measured and analyzed. With the help of the device model based on the measured data, the pattern recognition capability of the hardware-based spiking neural network systems is demonstrated using the Modified National Institute of Standards and Technology (MNIST) handwritten dataset. By comparing systems with and without an inhibitory synapse part, it is confirmed that the inhibitory synapse part is an essential element in obtaining effective and high pattern classification capability.

  4. A bimodal biometric identification system

    NASA Astrophysics Data System (ADS)

    Laghari, Mohammad S.; Khuwaja, Gulzar A.

    2013-03-01

    Biometrics consists of methods for uniquely recognizing humans based upon one or more intrinsic physical or behavioral traits. Physical traits are related to the shape of the body; behavioral traits are related to the behavior of a person. However, biometric authentication systems suffer from imprecision and difficulty in person recognition for a number of reasons, and no single biometric is expected to effectively satisfy the requirements of all verification and/or identification applications. Bimodal biometric systems are expected to be more reliable due to the presence of two pieces of evidence and also to be able to meet the severe performance requirements imposed by various applications. This paper presents a neural network based bimodal biometric identification system using human face and handwritten signature features.

  5. Handwritten document age classification based on handwriting styles

    NASA Astrophysics Data System (ADS)

    Ramaiah, Chetan; Kumar, Gaurav; Govindaraju, Venu

    2012-01-01

    Handwriting styles are constantly changing over time. We approach the novel problem of estimating the approximate age of Historical Handwritten Documents using Handwriting styles. This system will have many applications in handwritten document processing engines where specialized processing techniques can be applied based on the estimated age of the document. We propose to learn a distribution over styles across centuries using Topic Models and to apply a classifier over weights learned in order to estimate the approximate age of the documents. We present a comparison of different distance metrics such as Euclidean Distance and Hellinger Distance within this application.
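
    The two distance metrics compared in the abstract above can be written out for a pair of discrete distributions, such as per-document topic weights. This is a small sketch under assumptions: the function names and the example vectors are made up for illustration, and the inputs are assumed to be non-negative and to sum to 1.

```python
from math import sqrt


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length weight vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions.

    The result lies in [0, 1]; 0 means identical, 1 means disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    return sqrt(sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(p, q))) / sqrt(2)


if __name__ == "__main__":
    # Hypothetical topic weights for two documents (illustrative values only).
    doc_a = [0.7, 0.2, 0.1]
    doc_b = [0.1, 0.2, 0.7]
    print(euclidean_distance(doc_a, doc_b))
    print(hellinger_distance(doc_a, doc_b))
```

    Because the Hellinger distance works on square roots of the weights, it is more sensitive than the Euclidean distance to differences in low-probability topics, which is one reason the two can rank document pairs differently.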

  6. House officer procedure documentation using a personal digital assistant: a longitudinal study

    PubMed Central

    Bird, Steven B; Lane, David R

    2006-01-01

    Background: Personal Digital Assistants (PDAs) have been integrated into daily practice for many emergency physicians and house officers. Few objective data exist that quantify the effect of PDAs on documentation. The objective of this study was to determine whether use of a PDA would improve emergency medicine house officer documentation of procedures and patient resuscitations. Methods: Twelve first-year Emergency Medicine (EM) residents were provided a Palm V (Palm, Inc., Santa Clara, California, USA) PDA. A customizable patient procedure and encounter program was constructed and loaded into each PDA. Residents were instructed to enter information on patients who had any of 20 procedures performed, were deemed clinically unstable, or on whom follow-up was obtained. These data were downloaded to the residency coordinator's desktop computer on a weekly basis for 36 months. The mean number of procedures and encounters performed per resident over a three-year period were then compared with those of 12 historical controls from a previous residency class that had recorded the same information using a handwritten card system for 36 months. Means of both groups were compared using a two-tailed Student's t test with a Bonferroni correction for multiple comparisons. One hundred randomly selected entries from both the PDA and handwritten groups were reviewed for completeness. Another group of 11 residents who had used both handwritten and PDA procedure logs for one year each were asked to complete a questionnaire regarding their satisfaction with the PDA system. Results: Mean documentation of three procedures significantly increased in the PDA vs handwritten groups: conscious sedation 24.0 vs 0.03 (p = 0.001); thoracentesis 3.0 vs 0.0 (p = 0.001); and ED ultrasound 24.5 vs 0.0 (p = 0.001). In the handwritten cohort, only the number of cardioversions/defibrillations (26.5 vs 11.5) was statistically increased (p = 0.001). Of the PDA entries, 100% were entered completely, compared to only 91% of the handwritten group, including 4% that were illegible. Ten of 11 questioned residents preferred the PDA procedure log to a handwritten log (mean ± SD Likert-scale score of 1.6 ± 0.9). Conclusion: Overall use of a PDA did not significantly change EM resident procedure or patient resuscitation documentation when used over a three-year period. Statistically significant differences between the handwritten and PDA groups likely represent alterations in the standard of ED care over time. Residents overwhelmingly preferred the PDA procedure log to a handwritten log and more entries are complete using the PDA. These favorable comparisons and the numerous other uses of PDAs may make them an attractive alternative for resident documentation. PMID:16438709

  7. 21 CFR 11.1 - Scope.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... SIGNATURES General Provisions § 11.1 Scope. (a) The regulations in this part set forth the criteria under which the agency considers electronic records, electronic signatures, and handwritten signatures... handwritten signatures executed on paper. (b) This part applies to records in electronic form that are created...

  8. Training the max-margin sequence model with the relaxed slack variables.

    PubMed

    Niu, Lingfeng; Wu, Jianmin; Shi, Yong

    2012-09-01

    Sequence models are widely used in many applications such as natural language processing, information extraction and optical character recognition. In this paper, we propose a new approach to train the max-margin based sequence model by relaxing the slack variables. With the canonical feature mapping definition, the relaxed problem is solved by training a multiclass Support Vector Machine (SVM). Compared with the state-of-the-art solutions for sequence learning, the new method has the following advantages: firstly, the sequence training problem is transformed into a multiclassification problem, which is more widely studied and already has quite a few off-the-shelf training packages; secondly, this new approach reduces the complexity of training significantly and achieves prediction performance comparable to that of the existing sequence models; thirdly, when the size of training data is limited, by assigning different slack variables to different microlabel pairs, the new method can use the discriminative information more frugally and produces a more reliable model; last but not least, by employing kernels in the intermediate multiclass SVM, nonlinear feature space can be easily explored. Experimental results on the tasks of named entity recognition, information extraction and handwritten letter recognition with public datasets illustrate the efficiency and effectiveness of our method. Copyright © 2012 Elsevier Ltd. All rights reserved.

  9. Universal brain systems for recognizing word shapes and handwriting gestures during reading

    PubMed Central

    Nakamura, Kimihiro; Kuo, Wen-Jui; Pegado, Felipe; Cohen, Laurent; Tzeng, Ovid J. L.; Dehaene, Stanislas

    2012-01-01

    Do the neural circuits for reading vary across culture? Reading of visually complex writing systems such as Chinese has been proposed to rely on areas outside the classical left-hemisphere network for alphabetic reading. Here, however, we show that, once potential confounds in cross-cultural comparisons are controlled for by presenting handwritten stimuli to both Chinese and French readers, the underlying network for visual word recognition may be more universal than previously suspected. Using functional magnetic resonance imaging in a semantic task with words written in cursive font, we demonstrate that two universal circuits, a shape recognition system (reading by eye) and a gesture recognition system (reading by hand), are similarly activated and show identical patterns of activation and repetition priming in the two language groups. These activations cover most of the brain regions previously associated with culture-specific tuning. Our results point to an extended reading network that invariably comprises the occipitotemporal visual word-form system, which is sensitive to well-formed static letter strings, and a distinct left premotor region, Exner’s area, which is sensitive to the forward or backward direction with which cursive letters are dynamically presented. These findings suggest that cultural effects in reading merely modulate a fixed set of invariant macroscopic brain circuits, depending on surface features of orthographies. PMID:23184998

  10. 38 CFR 1.559 - Appeals.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... contain an image of the requester's handwritten signature, such as an attachment that shows the requester... confidentiality statute, the email transmission must contain an image of the requester's handwritten signature... processing, e-mail FOIA appeals must be sent to official VA FOIA mailboxes established for the purpose of...

  11. 38 CFR 1.559 - Appeals.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... contain an image of the requester's handwritten signature, such as an attachment that shows the requester... confidentiality statute, the email transmission must contain an image of the requester's handwritten signature... processing, e-mail FOIA appeals must be sent to official VA FOIA mailboxes established for the purpose of...

  12. 38 CFR 1.559 - Appeals.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... contain an image of the requester's handwritten signature, such as an attachment that shows the requester... confidentiality statute, the email transmission must contain an image of the requester's handwritten signature... processing, e-mail FOIA appeals must be sent to official VA FOIA mailboxes established for the purpose of...

  13. Simulation Detection in Handwritten Documents by Forensic Document Examiners.

    PubMed

    Kam, Moshe; Abichandani, Pramod; Hewett, Tom

    2015-07-01

    This study documents the results of a controlled experiment designed to quantify the abilities of forensic document examiners (FDEs) and laypersons to detect simulations in handwritten documents. Nineteen professional FDEs and 26 laypersons (typical of a jury pool) were asked to inspect test packages that contained six (6) known handwritten documents written by the same person and two (2) questioned handwritten documents. Each questioned document was either written by the person who wrote the known documents, or written by a different person who tried to simulate the writing of the person who wrote the known documents. The error rates of the FDEs were smaller than those of the laypersons when detecting simulations in the questioned documents. Among other findings, the FDEs never labeled a questioned document that was written by the same person who wrote the known documents as "simulation." There was a statistically significant difference between the responses of the FDEs and the laypersons for documents without simulations. © 2015 American Academy of Forensic Sciences.

  14. Feature extraction with deep neural networks by a generalized discriminant analysis.

    PubMed

    Stuhlsatz, André; Lippel, Jens; Zielke, Thomas

    2012-04-01

    We present an approach to feature extraction that is a generalization of the classical linear discriminant analysis (LDA) on the basis of deep neural networks (DNNs). As for LDA, discriminative features generated from independent Gaussian class conditionals are assumed. This modeling has the advantages that the intrinsic dimensionality of the feature space is bounded by the number of classes and that the optimal discriminant function is linear. Unfortunately, linear transformations are insufficient to extract optimal discriminative features from arbitrarily distributed raw measurements. The generalized discriminant analysis (GerDA) proposed in this paper uses nonlinear transformations that are learnt by DNNs in a semisupervised fashion. We show that the feature extraction based on our approach displays excellent performance on real-world recognition and detection tasks, such as handwritten digit recognition and face detection. In a series of experiments, we evaluate GerDA features with respect to dimensionality reduction, visualization, classification, and detection. Moreover, we show that GerDA DNNs can preprocess truly high-dimensional input data to low-dimensional representations that facilitate accurate predictions even if simple linear predictors or measures of similarity are used.
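
    The dimensionality bound mentioned in this abstract can be checked numerically for classical LDA: the between-class scatter matrix is built from C class-mean deviations that sum (weighted by class size) to zero, so its rank is at most C - 1. The following is a minimal NumPy sketch on synthetic data. The data, sizes, and variable names are illustrative assumptions, and the code shows classical LDA only, not GerDA.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 4, 50, 10

# Synthetic Gaussian classes with different means (illustrative data only).
class_means = rng.normal(scale=3.0, size=(n_classes, n_features))
X = np.vstack([rng.normal(loc=m, size=(n_per_class, n_features)) for m in class_means])
y = np.repeat(np.arange(n_classes), n_per_class)

overall_mean = X.mean(axis=0)
S_b = np.zeros((n_features, n_features))  # between-class scatter
S_w = np.zeros((n_features, n_features))  # within-class scatter
for c in range(n_classes):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    d = (mc - overall_mean)[:, None]
    S_b += len(Xc) * (d @ d.T)
    S_w += (Xc - mc).T @ (Xc - mc)

# The rank of S_b cannot exceed the number of classes minus one.
rank_S_b = np.linalg.matrix_rank(S_b)

# LDA directions: leading eigenvectors of S_w^{-1} S_b.
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
order = np.argsort(-eigvals.real)
W = eigvecs[:, order[: n_classes - 1]].real
Z = X @ W  # projected features with C - 1 columns
```

    With four classes the projection `Z` has three columns regardless of the ten input features, which is the sense in which the discriminative feature space is bounded by the number of classes.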

  15. Grading Multiple Choice Exams with Low-Cost and Portable Computer-Vision Techniques

    NASA Astrophysics Data System (ADS)

    Fisteus, Jesus Arias; Pardo, Abelardo; García, Norberto Fernández

    2013-08-01

    Although technology for automatic grading of multiple choice exams has existed for several decades, it is not yet as widely available or affordable as it should be. The main reasons preventing this adoption are the cost and the complexity of the setup procedures. In this paper, Eyegrade, a system for automatic grading of multiple choice exams, is presented. While most current solutions are based on expensive scanners, Eyegrade offers a truly low-cost solution requiring only a regular off-the-shelf webcam. Additionally, Eyegrade performs both mark recognition and optical character recognition of handwritten student identification numbers, which avoids the use of bubbles in the answer sheet. When compared with similar webcam-based systems, the user interface in Eyegrade has been designed to provide a more efficient and error-free data collection procedure. The tool has been validated with a set of experiments that show the ease of use (both setup and operation), the reduction in grading time, and an increase in the reliability of the results when compared with conventional, more expensive systems.

  16. 37 CFR 1.4 - Nature of correspondence and signature requirements.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ..., that is, have an original handwritten signature personally signed, in permanent dark ink or its... EFS-Web customization. (e) The following correspondence must be submitted with an original handwritten signature personally signed in permanent dark ink or its equivalent: (1) Correspondence requiring a person's...

  17. A unified approach for development of Urdu Corpus for OCR and demographic purpose

    NASA Astrophysics Data System (ADS)

    Choudhary, Prakash; Nain, Neeta; Ahmed, Mushtaq

    2015-02-01

    This paper presents a methodology for developing an Urdu handwritten text image corpus and for applying corpus linguistics to OCR and information retrieval from handwritten documents. Compared with other scripts, Urdu script is somewhat complicated for data entry: entering a single character requires a combination of multiple key presses. Here, a mixed approach is proposed and demonstrated for building an Urdu corpus for OCR and demographic data collection. The demographic part of the database could be used to train a system to fetch the data automatically, which would simplify the existing manual data-processing tasks involved in data collection from input forms such as Passport, Ration Card, Voting Card, AADHAR, Driving licence, Indian Railway Reservation, Census data, etc. This would increase the participation of the Urdu language community in understanding and benefiting from Government schemes. To make the database available and applicable across a wide area of corpus linguistics, we propose a methodology for data collection, mark-up, digital transcription, and XML metadata information for benchmarking.

  18. Determining the Value of Handwritten Comments within Work Orders

    ERIC Educational Resources Information Center

    Thombs, Daniel

    2010-01-01

    In the workplace many work orders are handwritten on paper rather than recorded in a digital format. Despite being archived, these documents are neither referenced nor analyzed after their creation. Tacit knowledge gathered through employee documentation is generally considered beneficial, but only if it can be easily gathered and processed. …

  19. Comparing Postsecondary Marketing Student Performance on Computer-Based and Handwritten Essay Tests

    ERIC Educational Resources Information Center

    Truell, Allen D.; Alexander, Melody W.; Davis, Rodney E.

    2004-01-01

    The purpose of this study was to determine if there were differences in postsecondary marketing student performance on essay tests based on test format (i.e., computer-based or handwritten). Specifically, the variables of performance, test completion time, and gender were explored for differences based on essay test format. Results of the study…

  20. Handwritten Newspapers on the Iowa Frontier, 1844-54.

    ERIC Educational Resources Information Center

    Atwood, Roy Alden

    Journalism on the agricultural frontier of the Old Northwest territory of the United States was shaped by a variety of cultural forces and environmental factors and took on diverse forms. Bridging the gap between the two cultural forms of written correspondence and printed news was a third form: the handwritten newspaper. Between 1844 and 1854…

  1. Judging the Emergent Reading Abilities of Kindergarten Children.

    ERIC Educational Resources Information Center

    Otto, Beverly; Sulzby, Elizabeth

    In 1981, a scale, the Emergent Reading Ability Judgments for Dictated and Handwritten Stories, was developed for use in assessing how close a child was to reading independently based upon the nature of the child's attempts to read from dictated and handwritten stories. A study was conducted to apply the scale to stories from a new sample of…

  2. Dimension Reduction With Extreme Learning Machine.

    PubMed

    Kasun, Liyanaarachchi Lekamalage Chamara; Yang, Yan; Huang, Guang-Bin; Zhang, Zhengyou

    2016-08-01

    Data may often contain noise or irrelevant information, which negatively affects the generalization capability of machine learning algorithms. The objective of dimension reduction algorithms, such as principal component analysis (PCA), non-negative matrix factorization (NMF), random projection (RP), and auto-encoder (AE), is to reduce the noise or irrelevant information of the data. The features of PCA (eigenvectors) and linear AE are not able to represent data as parts (e.g., nose in a face image). On the other hand, NMF and non-linear AE are hampered by slow learning speed, and RP only represents a subspace of the original data. This paper introduces a dimension reduction framework which to some extent represents data as parts, has fast learning speed, and learns the between-class scatter subspace. To this end, this paper investigates a linear and non-linear dimension reduction framework referred to as extreme learning machine AE (ELM-AE) and sparse ELM-AE (SELM-AE). In contrast to tied weight AE, the hidden neurons in ELM-AE and SELM-AE need not be tuned, and their parameters (e.g., input weights in additive neurons) are initialized using orthogonal and sparse random weights, respectively. Experimental results on the USPS handwritten digit recognition data set, the CIFAR-10 object recognition data set, and the NORB object recognition data set show the efficacy of linear and non-linear ELM-AE and SELM-AE in terms of discriminative capability, sparsity, training time, and normalized mean square error.
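
    The ELM-AE idea described in this abstract (untuned hidden neurons with orthogonal random input weights, followed by learned output weights) can be sketched as below. This is a hedged reading rather than the authors' code: the ridge-regression solution for the output weights, the use of the transposed output weights as the projection, the regularisation constant `C`, and the sigmoid activation are all assumptions, and the function name `elm_ae_reduce` is hypothetical.

```python
import numpy as np

def elm_ae_reduce(X, n_hidden, C=1e3, nonlinear=False, seed=0):
    """Reduce X (n_samples x n_features) to n_hidden dimensions, ELM-AE style."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Random input weights with orthonormal columns (needs n_hidden <= n_features).
    A, _ = np.linalg.qr(rng.normal(size=(n_features, n_hidden)))
    # Random bias scaled to unit norm.
    b = rng.normal(size=n_hidden)
    b /= np.linalg.norm(b)
    # Hidden-layer output; these parameters are never tuned.
    H = X @ A + b
    if nonlinear:
        H = 1.0 / (1.0 + np.exp(-H))  # sigmoid activation (assumption)
    # Output weights that reconstruct X from H, obtained by ridge regression.
    beta = np.linalg.solve(np.eye(n_hidden) / C + H.T @ H, H.T @ X)
    # Project the data with the transposed output weights.
    return X @ beta.T, beta

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 20))
Z, beta = elm_ae_reduce(X, n_hidden=5)
```

    Only `beta` is learned, and it is found in closed form, which is where the fast training time of this family of methods comes from.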

  3. The electronic, 'paperless' medical office; has it arrived?

    PubMed

    Gates, P; Urquhart, J

    2007-02-01

    Modern information technology offers efficiencies in medical practice, with a reduction in secretarial time in maintaining, filing and retrieving the paper medical record. Electronic requesting of investigations allows tracking of outstanding results. Less storage space is required and telephone calls from pharmacies, pathology and medical imaging service providers to clarify the hand-written request are abolished. Voice recognition software reduces secretarial typing time per letter. These combined benefits can lead to significantly reduced costs and improved patient care. The paperless office is possible, but requires commitment and training of all staff; it is preferable but not absolutely essential that at least one member of the practice has an interest and some expertise in computers. More importantly, back-up from information technology providers and back-up of the electronic data are absolutely crucial and a paperless environment should not be considered without them.

  4. A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification

    PubMed Central

    Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong

    2016-01-01

    Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs). PMID:26985826
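
    The two quantities that the LDM is said to optimise, the margin mean and the margin variance, are easy to compute for a given linear classifier. The sketch below evaluates them on a toy data set and also counts the binary classifiers in a DAG-style multi-class scheme. The toy data, the function names, and the assumption that the DAG follows the usual one-vs-one layout (one node per class pair, one class eliminated per decision) are illustrative and not taken from the paper.

```python
import numpy as np

def margin_statistics(w, X, y):
    """Mean and variance of the margins y_i * <w, x_i> of a linear classifier."""
    margins = y * (X @ w)
    return margins.mean(), margins.var()

def dag_classifier_counts(n_classes):
    """(binary classifiers trained, classifiers evaluated per test sample)
    for a one-vs-one decision DAG (assumed layout)."""
    return n_classes * (n_classes - 1) // 2, n_classes - 1

# Toy two-class data with labels in {+1, -1}.
X = np.array([[2.0, 0.0], [1.0, 1.0], [-1.0, -1.0], [-2.0, 0.0]])
y = np.array([1, 1, -1, -1])
w = np.array([1.0, 0.0])

margin_mean, margin_var = margin_statistics(w, X, y)
```

    For this toy classifier the margins are 2, 1, 1 and 2, so the mean is 1.5 and the variance 0.25. A method in the LDM family would prefer a `w` that raises the first number while lowering the second.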

  5. A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification.

    PubMed

    Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong

    2016-01-01

    Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs).

  6. Supporting Learning with Weblogs in Science Education: A Comparison of Blogging and Hand-Written Reflective Writing with and without Prompts

    ERIC Educational Resources Information Center

    Petko, Dominik; Egger, Nives; Graber, Marc

    2014-01-01

    The goal of this study was to compare weblogs and traditional handwritten reflective learning protocols regarding the use of cognitive and metacognitive strategies for knowledge acquisition as well as learning gains in secondary school students. The study used a quasi-experimental control group design with repeated measurements…

  7. A survey of user acceptance of electronic patient anesthesia records

    PubMed Central

    Jin, Hyun Seung; Lee, Suk Young; Jeong, Hui Yeon; Choi, Soo Joo; Lee, Hye Won

    2012-01-01

    Background: An anesthesia information management system (AIMS), although not widely used in Korea, will eventually replace handwritten records. This hospital began using AIMS in April 2010. The purpose of this study was to evaluate users' attitudes concerning AIMS and to compare them with manual documentation in the operating room (OR). Methods: A structured questionnaire focused on satisfaction with electronic anesthetic records and comparison with handwritten anesthesia records was administered to anesthesiologists, trainees, and nurses during February 2011 and the responses were collected anonymously during March 2011. Results: A total of 28 anesthesiologists, 27 trainees, and 47 nurses responded to this survey. Most participants involved in this survey were satisfied with AIMS (96.3%, 82.2%, and 89.3% of trainees, anesthesiologists, and nurses, respectively) and preferred AIMS over handwritten anesthesia records in 96.3%, 71.4%, and 97.9% of trainees, anesthesiologists, and nurses, respectively. However, there were also criticisms of AIMS related to user-discomfort during short, simple or emergency surgeries, doubtful legal status, and inconvenient placement of the system. Conclusions: Overall, most of the anesthetic practitioners in this hospital quickly accepted and prefer AIMS over the handwritten anesthetic records in the OR. PMID:22558502

  8. PCANet: A Simple Deep Learning Baseline for Image Classification?

    PubMed

    Chan, Tsung-Han; Jia, Kui; Gao, Shenghua; Lu, Jiwen; Zeng, Zinan; Ma, Yi

    2015-12-01

    In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be extremely easily and efficiently designed and learned. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets for different tasks, including Labeled Faces in the Wild (LFW) for face verification; the MultiPIE, Extended Yale B, AR, Facial Recognition Technology (FERET) data sets for face recognition; and MNIST for hand-written digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)]. Even more surprisingly, the model sets new records for many classification tasks on the Extended Yale B, AR, and FERET data sets and on MNIST variations. Additional experiments on other public data sets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.
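
    The three components listed in this abstract (PCA-learned filters, binary hashing, blockwise histograms) can be illustrated with a single-stage sketch. It is a simplification under stated assumptions: the paper cascades two PCA stages and hashes the second-stage outputs, whereas this version uses one stage, no zero padding, non-overlapping blocks, and random images. All function names are hypothetical.

```python
import numpy as np

def extract_patches(img, k, remove_mean):
    """Return every k x k patch of a 2-D image (stride 1) as a flattened row."""
    h, w = img.shape
    rows = [img[i:i + k, j:j + k].ravel()
            for i in range(h - k + 1) for j in range(w - k + 1)]
    patches = np.array(rows, dtype=float)
    if remove_mean:
        patches = patches - patches.mean(axis=1, keepdims=True)
    return patches

def learn_pca_filters(images, k, n_filters):
    """Filters are the leading eigenvectors of the mean-removed patch covariance."""
    patches = np.vstack([extract_patches(img, k, True) for img in images])
    cov = patches.T @ patches / len(patches)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    leading = eigvecs[:, np.argsort(-eigvals)[:n_filters]]
    return leading.T.reshape(n_filters, k, k)

def filter_responses(img, filters):
    """Correlate the image with each filter (valid region only)."""
    n_filters, k, _ = filters.shape
    h, w = img.shape
    out = extract_patches(img, k, False) @ filters.reshape(n_filters, k * k).T
    return out.T.reshape(n_filters, h - k + 1, w - k + 1)

def binary_hash(responses):
    """Binarise each response map and pack the bits into one integer per pixel."""
    bits = (responses > 0).astype(np.int64)
    weights = 2 ** np.arange(len(responses))[:, None, None]
    return (bits * weights).sum(axis=0)

def block_histograms(code_map, block, n_bins):
    """Concatenate histograms of hash codes over non-overlapping blocks."""
    h, w = code_map.shape
    hists = [np.bincount(code_map[i:i + block, j:j + block].ravel(), minlength=n_bins)
             for i in range(0, h - block + 1, block)
             for j in range(0, w - block + 1, block)]
    return np.concatenate(hists)

rng = np.random.default_rng(0)
images = rng.random((5, 12, 12))          # stand-in images (assumption)
filters = learn_pca_filters(images, k=3, n_filters=4)
codes = binary_hash(filter_responses(images[0], filters))
feature = block_histograms(codes, block=5, n_bins=2 ** 4)
```

    With four filters every pixel receives a 4-bit code, so each block contributes a 16-bin histogram, and the concatenated histograms form the feature vector passed to a classifier.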

  9. Dilated contour extraction and component labeling algorithm for object vector representation

    NASA Astrophysics Data System (ADS)

    Skourikhine, Alexei N.

    2005-08-01

    Object boundary extraction from binary images is important for many applications, e.g., image vectorization, automatic interpretation of images containing segmentation results, printed and handwritten documents and drawings, maps, and AutoCAD drawings. Efficient and reliable contour extraction is also important for pattern recognition due to its impact on shape-based object characterization and recognition. The presented contour tracing and component labeling algorithm produces dilated (sub-pixel) contours associated with corresponding regions. The algorithm has the following features: (1) it always produces non-intersecting, non-degenerate contours, including the case of one-pixel wide objects; (2) it associates the outer and inner (i.e., around hole) contours with the corresponding regions during the process of contour tracing in a single pass over the image; (3) it maintains desired connectivity of object regions as specified by 8-neighbor or 4-neighbor connectivity of adjacent pixels; (4) it avoids degenerate regions in both background and foreground; (5) it allows an easy augmentation that will provide information about the containment relations among regions; (6) it has a time complexity that is dominantly linear in the number of contour points. This early component labeling (contour-region association) enables subsequent efficient object-based processing of the image information.
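
    Feature (3) in this abstract, the choice between 8-neighbor and 4-neighbor connectivity, changes which pixels belong to the same region. The sketch below labels connected components with a plain breadth-first flood fill to make that difference concrete. It is a generic textbook technique, not the single-pass contour-tracing algorithm of the paper, and the function name is hypothetical.

```python
from collections import deque

def label_components(image, connectivity=8):
    """Label foreground (non-zero) pixels; return (label grid, component count)."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not labels[r][c]:
                # Start a new component and flood-fill it.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    cr, cc = queue.popleft()
                    for dr, dc in offsets:
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and image[nr][nc] and not labels[nr][nc]):
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels, count

# A one-pixel-wide diagonal stroke.
diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
_, n_components_8 = label_components(diagonal, connectivity=8)
_, n_components_4 = label_components(diagonal, connectivity=4)
```

    The diagonal stroke is a single region under 8-connectivity but splits into three regions under 4-connectivity, which is why the connectivity convention has to be fixed before contours are associated with regions.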

  10. Evaluation of Hand Written and Computerized Out-Patient Prescriptions in Urban Part of Central Gujarat.

    PubMed

    Joshi, Anuradha; Buch, Jatin; Kothari, Nitin; Shah, Nishal

    2016-06-01

    Prescription order is an important therapeutic transaction between physician and patient. A good quality prescription is an extremely important factor for minimizing errors in dispensing medication and it should be adherent to guidelines for prescription writing for benefit of the patient. The aim was to evaluate the frequency and type of prescription errors in outpatient prescriptions and to find whether prescription writing abides by WHO standards of prescription writing. A cross-sectional observational study was conducted at Anand city. Allopathic private practitioners practising at Anand city of different specialities were included in study. Collection of prescriptions was started a month after the consent to minimize bias in prescription writing. The prescriptions were collected from local pharmacy stores of Anand city over a period of six months. Prescriptions were analysed for errors in standard information, according to WHO guide to good prescribing. Descriptive analysis was performed to estimate frequency of errors, data were expressed as numbers and percentage. Total 749 (549 handwritten and 200 computerised) prescriptions were collected. Abundant omission errors were identified in handwritten prescriptions e.g., OPD number was mentioned in 6.19%, patient's age was mentioned in 25.50%, gender in 17.30%, address in 9.29% and weight of patient mentioned in 11.29%, while in drug items only 2.97% drugs were prescribed by generic name. Route and Dosage form was mentioned in 77.35%-78.15%, dose mentioned in 47.25%, unit in 13.91%, regimens were mentioned in 72.93% while signa (direction for drug use) in 62.35%. Total 4384 errors out of 549 handwritten prescriptions and 501 errors out of 200 computerized prescriptions were found in clinicians and patient details. While in drug item details, total number of errors identified were 5015 and 621 in handwritten and computerized prescriptions respectively. As compared to handwritten prescriptions, computerized prescriptions appeared to be associated with relatively lower rates of error. Since out-patient prescription errors are abundant and often occur in handwritten prescriptions, prescribers need to adapt themselves to computerized prescription order entry in their daily practice.
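
    Because the two groups differ in size (549 versus 200 prescriptions), the raw error totals in this abstract are easier to compare after dividing by the number of prescriptions. The short calculation below does only that. The "errors per prescription" normalisation is an illustrative choice and is not a statistic reported by the study; the counts themselves are taken from the abstract.

```python
# Counts copied from the abstract above.
counts = {
    "handwritten": {"prescriptions": 549, "detail_errors": 4384, "drug_item_errors": 5015},
    "computerized": {"prescriptions": 200, "detail_errors": 501, "drug_item_errors": 621},
}

def errors_per_prescription(group):
    """Average number of errors per prescription, by error category."""
    n = group["prescriptions"]
    return {
        "detail": group["detail_errors"] / n,
        "drug_item": group["drug_item_errors"] / n,
    }

rates = {name: errors_per_prescription(group) for name, group in counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate['detail']:.2f} detail errors and "
          f"{rate['drug_item']:.2f} drug-item errors per prescription")
```

    On this normalisation the handwritten prescriptions average roughly 8.0 clinician/patient-detail errors and 9.1 drug-item errors each, against roughly 2.5 and 3.1 for the computerized ones.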

  11. Evaluation of Hand Written and Computerized Out-Patient Prescriptions in Urban Part of Central Gujarat

    PubMed Central

    Buch, Jatin; Kothari, Nitin; Shah, Nishal

    2016-01-01

    Introduction: Prescription order is an important therapeutic transaction between physician and patient. A good quality prescription is an extremely important factor for minimizing errors in dispensing medication and it should be adherent to guidelines for prescription writing for benefit of the patient. Aim: To evaluate frequency and type of prescription errors in outpatient prescriptions and find whether prescription writing abides with WHO standards of prescription writing. Materials and Methods: A cross-sectional observational study was conducted at Anand city. Allopathic private practitioners practising at Anand city of different specialities were included in study. Collection of prescriptions was started a month after the consent to minimize bias in prescription writing. The prescriptions were collected from local pharmacy stores of Anand city over a period of six months. Prescriptions were analysed for errors in standard information, according to WHO guide to good prescribing. Statistical Analysis: Descriptive analysis was performed to estimate frequency of errors, data were expressed as numbers and percentage. Results: Total 749 (549 handwritten and 200 computerised) prescriptions were collected. Abundant omission errors were identified in handwritten prescriptions e.g., OPD number was mentioned in 6.19%, patient’s age was mentioned in 25.50%, gender in 17.30%, address in 9.29% and weight of patient mentioned in 11.29%, while in drug items only 2.97% drugs were prescribed by generic name. Route and Dosage form was mentioned in 77.35%-78.15%, dose mentioned in 47.25%, unit in 13.91%, regimens were mentioned in 72.93% while signa (direction for drug use) in 62.35%. Total 4384 errors out of 549 handwritten prescriptions and 501 errors out of 200 computerized prescriptions were found in clinicians and patient details. While in drug item details, total number of errors identified were 5015 and 621 in handwritten and computerized prescriptions respectively. Conclusion: As compared to handwritten prescriptions, computerized prescriptions appeared to be associated with relatively lower rates of error. Since out-patient prescription errors are abundant and often occur in handwritten prescriptions, prescribers need to adapt themselves to computerized prescription order entry in their daily practice. PMID:27504305

  12. Spotting handwritten words and REGEX using a two stage BLSTM-HMM architecture

    NASA Astrophysics Data System (ADS)

    Bideault, Gautier; Mioulet, Luc; Chatelain, Clément; Paquet, Thierry

    2015-01-01

    In this article, we propose a hybrid model for spotting words and regular expressions (REGEX) in handwritten documents. The model is made of the state-of-the-art BLSTM (Bidirectional Long Short-Term Memory) neural network for recognizing and segmenting characters, coupled with an HMM to build line models able to spot the desired sequences. Experiments on the Rimes database show very promising results.

  13. Spotting words in handwritten Arabic documents

    NASA Astrophysics Data System (ADS)

    Srihari, Sargur; Srinivasan, Harish; Babu, Pavithra; Bhole, Chetan

    2006-01-01

    The design and performance of a system for spotting handwritten Arabic words in scanned document images is presented. The three main components of the system are a word segmenter, a shape-based matcher for words, and a search interface. The user types a query in English in a search window; the system finds the equivalent Arabic word, e.g., by dictionary look-up, and locates word images in an indexed (segmented) set of documents. A two-step approach is employed in performing the search: (1) prototype selection: the query is used to obtain a set of handwritten samples of that word from a known set of writers (these are the prototypes), and (2) word matching: the prototypes are used to spot each occurrence of those words in the indexed document database. A ranking is performed on the entire set of test word images, where the ranking criterion is a similarity score between each prototype word and the candidate words based on global word shape features. A database of 20,000 word images contained in 100 scanned handwritten Arabic documents written by 10 different writers was used to study retrieval performance. Using five writers for providing prototypes and the other five for testing, using manually segmented documents, 55% precision is obtained at 50% recall. Performance increases as more writers are used for training.
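
    The word-matching step and the "precision at a given recall" figure quoted in this abstract can be sketched as follows. The abstract does not specify the global word shape features or the similarity measure, so the short feature vectors and the cosine similarity used here are stand-in assumptions, and all function names and data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_candidates(prototypes, candidates):
    """Rank candidate word images by their best similarity to any prototype."""
    scored = [(max(cosine(p, feats) for p in prototypes), cid)
              for cid, feats in candidates]
    scored.sort(key=lambda item: -item[0])
    return [cid for _, cid in scored]

def precision_at_recall(ranked_ids, relevant_ids, recall_level):
    """Precision at the first rank where the requested recall is reached."""
    relevant = set(relevant_ids)
    needed = max(1, math.ceil(recall_level * len(relevant)))
    hits = 0
    for rank, cid in enumerate(ranked_ids, start=1):
        if cid in relevant:
            hits += 1
            if hits >= needed:
                return hits / rank
    return 0.0  # requested recall never reached

# Two prototype vectors for the query word and four candidate word images.
prototypes = [[1.0, 0.0], [0.9, 0.1]]
candidates = [("a", [1.0, 0.05]), ("b", [0.0, 1.0]),
              ("c", [0.8, 0.3]), ("d", [0.1, 1.0])]
ranked = rank_candidates(prototypes, candidates)
p_half = precision_at_recall(ranked, {"a", "d"}, 0.5)
p_full = precision_at_recall(ranked, {"a", "d"}, 1.0)
```

    In this toy case the ranking is a, c, d, b. If a and d are the true occurrences, precision is 1.0 at 50% recall and drops to 2/3 at 100% recall, which is the kind of trade-off summarised by the single operating point reported in the abstract.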

  14. Vital sign documentation in electronic records: The development of workarounds.

    PubMed

    Stevenson, Jean E; Israelsson, Johan; Nilsson, Gunilla; Petersson, Goran; Bath, Peter A

    2018-06-01

    Workarounds are commonplace in healthcare settings. An increase in the use of electronic health records has led to an escalation of workarounds as healthcare professionals cope with systems which are inadequate for their needs. Closely related to this, the documentation of vital signs in electronic health records has been problematic. The accuracy and completeness of vital sign documentation has a direct impact on the recognition of deterioration in a patient's condition. We examined workflow processes to identify workarounds related to vital signs in a 372-bed hospital in Sweden. In three clinical areas, a qualitative study was performed with data collected during observations and interviews and analysed through thematic content analysis. We identified paper workarounds in the form of handwritten notes and a total of eight pre-printed paper observation charts. Our results suggested that nurses created workarounds to allow a smooth workflow and ensure patient safety.

  15. Questioned document workflow for handwriting with automated tools

    NASA Astrophysics Data System (ADS)

    Das, Krishnanand; Srihari, Sargur N.; Srinivasan, Harish

    2012-01-01

    During the last few years many document recognition methods have been developed to determine whether a handwriting specimen can be attributed to a known writer. However, in practice, the workflow of the document examiner continues to be manual-intensive. Before a systematic, or computational, approach can be developed, an articulation of the steps involved in handwriting comparison is needed. We describe the workflow of handwritten questioned document examination, as described in a standards manual, and the steps where existing automation tools can be used. A well-known ransom note case is considered as an example, where one encounters testing for multiple writers of the same document, determining whether the writing is disguised, known writing that is formal while questioned writing is informal, etc. The findings for the particular ransom note case using the tools are given. Observations are also made for developing a more fully automated approach to handwriting examination.

  16. Ultrahigh-Dimensional Multiclass Linear Discriminant Analysis by Pairwise Sure Independence Screening

    PubMed Central

    Pan, Rui; Wang, Hansheng; Li, Runze

    2016-01-01

    This paper is concerned with the problem of feature screening for multi-class linear discriminant analysis in an ultrahigh-dimensional setting. We allow the number of classes to be relatively large. As a result, the total number of relevant features is larger than usual. This makes the related classification problem much more challenging than the conventional one, where the number of classes is small (very often two). To solve the problem, we propose a novel pairwise sure independence screening method for linear discriminant analysis with an ultrahigh-dimensional predictor. The proposed procedure is directly applicable to the situation with many classes. We further prove that the proposed method is screening consistent. Simulation studies are conducted to assess the finite sample performance of the new procedure. We also demonstrate the proposed methodology via an empirical analysis of a real-life example on handwritten Chinese character recognition. PMID:28127109
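
    One plausible reading of pairwise screening is sketched below: a feature is kept if, for at least one pair of classes, its standardised mean difference is large. The abstract does not give the screening statistic or the threshold rule, so the pooled-standard-deviation scaling, the fixed threshold, the synthetic data, and the function name `pairwise_screen` are all assumptions made for illustration.

```python
import numpy as np
from itertools import combinations

def pairwise_screen(X, y, threshold):
    """Keep features whose largest pairwise standardised mean difference
    exceeds the threshold. Returns (selected feature indices, scores)."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Pooled within-class standard deviation for every feature.
    residuals = X - means[np.searchsorted(classes, y)]
    pooled_std = np.sqrt((residuals ** 2).sum(axis=0) / (len(X) - len(classes)))
    scores = np.zeros(X.shape[1])
    for a, b in combinations(range(len(classes)), 2):
        scores = np.maximum(scores, np.abs(means[a] - means[b]) / pooled_std)
    return np.flatnonzero(scores > threshold), scores

# Synthetic data: 3 classes, 50 features, only features 0 and 1 are informative.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 30)
X = rng.normal(size=(90, 50))
X[y == 0, 0] += 5.0   # feature 0 separates class 0 from the others
X[y == 2, 1] += 5.0   # feature 1 separates class 2 from the others

selected, scores = pairwise_screen(X, y, threshold=2.0)
```

    Taking the maximum over class pairs means a feature that distinguishes only one pair of classes is still retained, which matters when the number of classes is large and most features are relevant to only a few of them.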

  17. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the ‘Extreme Learning Machine’ Algorithm

    PubMed Central

    McDonnell, Mark D.; Tissera, Migel D.; Vladusich, Tony; van Schaik, André; Tapson, Jonathan

    2015-01-01

    Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the ‘Extreme Learning Machine’ (ELM) approach, which also enables a very rapid training time (∼ 10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random ‘receptive field’ sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems. PMID:26262687
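
The ELM idea summarized above (fixed random hidden weights, each hidden unit restricted to a random image patch, output weights solved in closed form) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the tanh activation, the ridge penalty, the uniform patch-size sampling and the default sizes are all assumptions made for the example.

```python
import numpy as np

def random_receptive_field_masks(n_hidden, side, rng, min_size=3):
    """Return an (n_hidden, side*side) 0/1 mask; each row marks one random rectangular patch."""
    masks = np.zeros((n_hidden, side, side))
    for j in range(n_hidden):
        h = rng.integers(min_size, side + 1)   # patch height in [min_size, side]
        w = rng.integers(min_size, side + 1)   # patch width in [min_size, side]
        top = rng.integers(0, side - h + 1)
        left = rng.integers(0, side - w + 1)
        masks[j, top:top + h, left:left + w] = 1.0
    return masks.reshape(n_hidden, side * side)

def train_elm(X, y, n_classes, n_hidden=200, ridge=1e-2, seed=0):
    """Train a single-hidden-layer ELM on flattened square images X (n_samples, side*side)."""
    rng = np.random.default_rng(seed)
    side = int(np.sqrt(X.shape[1]))
    masks = random_receptive_field_masks(n_hidden, side, rng)
    # Input weights are random, masked to a patch, and never trained.
    W_in = rng.standard_normal((n_hidden, X.shape[1])) * masks
    H = np.tanh(X @ W_in.T)                    # hidden activations (tanh is an assumption)
    T = np.eye(n_classes)[y]                   # one-hot targets
    # Output weights from ridge-regularised least squares (closed form, no backpropagation).
    W_out = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ T)
    return W_in, W_out

def predict_elm(X, W_in, W_out):
    """Return the predicted class index for each row of X."""
    return np.argmax(np.tanh(X @ W_in.T) @ W_out, axis=1)
```

Only `W_out` is learned, which is why training reduces to a single linear solve; the masking step is what makes the input weight matrix sparse.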

  18. On the Optimum Architecture of the Biologically Inspired Hierarchical Temporal Memory Model Applied to the Hand-Written Digit Recognition

    NASA Astrophysics Data System (ADS)

    Štolc, Svorad; Bajla, Ivan

    2010-01-01

    In the paper we describe basic functions of the Hierarchical Temporal Memory (HTM) network based on a novel biologically inspired model of the large-scale structure of the mammalian neocortex. The focus of this paper is a systematic exploration of how to optimize important controlling parameters of the HTM model applied to the classification of hand-written digits from the USPS database. The statistical properties of this database are analyzed using the permutation test, which employs a randomization distribution of the training and testing data. Based on a notion of the homogeneous usage of input image pixels, a methodology of the HTM parameter optimization is proposed. In order to study the effects of two substantial parameters of the architecture, the patch size and the overlap, in more detail, we have restricted ourselves to single-level HTM networks. A novel method for construction of the training sequences by ordering series of the static images is developed. A novel method for estimation of the parameter maxDist based on the box counting method is proposed. The parameter sigma of the inference Gaussian is optimized on the basis of the maximization of the belief distribution entropy. Both optimization algorithms can equally be applied to multi-level HTM networks. The influences of the parameters transitionMemory and requestedGroupCount on the HTM network performance have been explored. Altogether, we have investigated 2736 different HTM network configurations. The obtained classification accuracy results have been benchmarked against the published results of several conventional classifiers.

  19. Handwritten mathematical symbols dataset.

    PubMed

    Chajri, Yassine; Bouikhalene, Belaid

    2016-06-01

    Due to the technological advances in recent years, paper scientific documents are used less and less. Thus, the trend in the scientific community to use digital documents has increased considerably. Among these documents, there are scientific documents and more specifically mathematics documents. In this context, we present our own dataset of handwritten mathematical symbols composed of 10,379 images. This dataset gathers Arabic characters, Latin characters, Arabic numerals, Latin numerals, arithmetic operators, set-symbols, comparison symbols, delimiters, etc.

  20. Protocol Handbook,

    DTIC Science & Technology

    1985-04-01

    all invitations should be handwritten in black ink and addressed in the full name of the husband and wife unless the guest is single. Requesting an...34 is handwritten in black ink . If the reply is by telephone, the number is written directly beneath the R.S.V.P. (or a separate response card may be...styles. The card should be engraved with black ink on excellent quality card stock (usually white or cream in color). Script lettering is the most

  1. Handwritten dynamics assessment through convolutional neural networks: An application to Parkinson's disease identification.

    PubMed

    Pereira, Clayton R; Pereira, Danilo R; Rosa, Gustavo H; Albuquerque, Victor H C; Weber, Silke A T; Hook, Christian; Papa, João P

    2018-05-01

    Parkinson's disease (PD) is considered a degenerative disorder that affects the motor system, which may cause tremors, micrographia, and the freezing of gait. Although PD is related to the lack of dopamine, the triggering process of its development is not fully understood yet. In this work, we introduce convolutional neural networks to learn features from images produced by handwritten dynamics, which capture different information during the individual's assessment. Additionally, we make available a dataset composed of images and signal-based data to foster the research related to computer-aided PD diagnosis. The proposed approach was compared against raw data and texture-based descriptors, showing suitable results, mainly in the context of early stage detection, with results close to 95%. The analysis of handwritten dynamics using deep learning techniques proved useful for automatic Parkinson's disease identification, and it can outperform handcrafted features. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Handwritten mathematical symbols dataset

    PubMed Central

    Chajri, Yassine; Bouikhalene, Belaid

    2016-01-01

    Due to the technological advances in recent years, paper scientific documents are used less and less. Thus, the trend in the scientific community to use digital documents has increased considerably. Among these documents, there are scientific documents and more specifically mathematics documents. In this context, we present our own dataset of handwritten mathematical symbols composed of 10,379 images. This dataset gathers Arabic characters, Latin characters, Arabic numerals, Latin numerals, arithmetic operators, set-symbols, comparison symbols, delimiters, etc. PMID:27006975

  3. Fuzzy logic and neural networks in artificial intelligence and pattern recognition

    NASA Astrophysics Data System (ADS)

    Sanchez, Elie

    1991-10-01

    With the use of fuzzy logic techniques, neural computing can be integrated in symbolic reasoning to solve complex real world problems. In fact, artificial neural networks, expert systems, and fuzzy logic systems, in the context of approximate reasoning, share common features and techniques. A model of Fuzzy Connectionist Expert System is introduced, in which an artificial neural network is designed to construct the knowledge base of an expert system from training examples (this model can also be used for specifications of rules in fuzzy logic control). Two types of weights are associated with the synaptic connections in an AND-OR structure: primary linguistic weights, interpreted as labels of fuzzy sets, and secondary numerical weights. Cell activation is computed through min-max fuzzy equations of the weights. Learning consists in finding the (numerical) weights and the network topology. This feedforward network is described and first illustrated in a biomedical application (medical diagnosis assistance from inflammatory-syndromes/proteins profiles). Then, it is shown how this methodology can be utilized for handwritten pattern recognition (characters play the role of diagnoses): in a fuzzy neuron describing a number, for example, the linguistic weights represent fuzzy sets on cross-detecting lines and the numerical weights reflect the importance (or weakness) of connections between cross-detecting lines and characters.

  4. Fast Multiclass Segmentation using Diffuse Interface Methods on Graphs

    DTIC Science & Technology

    2013-02-01

    000 28 × 28 images of handwritten digits 0 through 9. Examples of entries can be found in Figure 6. The task is to classify each of the images into the...database of handwritten digits .” [Online]. Available: http://yann.lecun.com/exdb/mnist/ [36] J. Lellmann, J. H. Kappes, J. Yuan, F. Becker, and C...corresponding digit . The images include digits from 0 to 9; thus, this is a 10 class segmentation problem. To construct the weight matrix, we used N

  5. Southeast Asian palm leaf manuscript images: a review of handwritten text line segmentation methods and new challenges

    NASA Astrophysics Data System (ADS)

    Kesiman, Made Windu Antara; Valy, Dona; Burie, Jean-Christophe; Paulus, Erick; Sunarya, I. Made Gede; Hadi, Setiawan; Sok, Kim Heng; Ogier, Jean-Marc

    2017-01-01

    Due to their specific characteristics, palm leaf manuscripts provide new challenges for text line segmentation tasks in document analysis. We investigated the performance of six text line segmentation methods by conducting comparative experimental studies for the collection of palm leaf manuscript images. The image corpus used in this study comes from the sample images of palm leaf manuscripts of three different Southeast Asian scripts: Balinese script from Bali and Sundanese script from West Java, both from Indonesia, and Khmer script from Cambodia. For the experiments, four text line segmentation methods that work on binary images are tested: the adaptive partial projection line segmentation approach, the A* path planning approach, the shredding method, and our proposed energy function for shredding method. Two other methods that can be directly applied on grayscale images are also investigated: the adaptive local connectivity map method and the seam carving-based method. The evaluation criteria and tool provided by ICDAR2013 Handwriting Segmentation Contest were used in this experiment.

  6. Modeling the Lexical Morphology of Western Handwritten Signatures

    PubMed Central

    Diaz-Cabrera, Moises; Ferrer, Miguel A.; Morales, Aythami

    2015-01-01

    A handwritten signature is the final response to a complex cognitive and neuromuscular process which is the result of the learning process. Because of the many factors involved in signing, it is possible to study the signature from many points of view: graphologists, forensic experts, neurologists and computer vision experts have all examined them. Researchers study written signatures for psychiatric, penal, health and automatic verification purposes. As a potentially useful, multi-purpose study, this paper is focused on the lexical morphology of handwritten signatures. This we understand to mean the identification, analysis, and description of the signature structures of a given signer. In this work we analyze different public datasets involving 1533 signers from different Western geographical areas. Some relevant characteristics of signature lexical morphology have been selected, examined in terms of their probability distribution functions and modeled through a General Extreme Value distribution. This study suggests some useful models for multi-disciplinary sciences which depend on handwriting signatures. PMID:25860942

  7. A New Approach to Diagnose Parkinson's Disease Using a Structural Cooccurrence Matrix for a Similarity Analysis.

    PubMed

    de Souza, João W M; Alves, Shara S A; Rebouças, Elizângela de S; Almeida, Jefferson S; Rebouças Filho, Pedro P

    2018-01-01

    Parkinson's disease affects millions of people around the world and consequently various approaches have emerged to help diagnose this disease, among which we can highlight handwriting exams. Extracting features from handwriting exams is an important contribution of the computational field for the diagnosis of this disease. In this paper, we propose an approach that measures the similarity between the exam template and the handwritten trace of the patient following the exam template. This similarity was measured using the Structural Cooccurrence Matrix to calculate how close the handwritten trace of the patient is to the exam template. The proposed approach was evaluated using various exam templates and the handwritten traces of the patient. Each of these variations was used together with the Naïve Bayes, OPF, and SVM classifiers. In conclusion the proposed approach was proven to be better than the existing methods found in the literature and is therefore a promising tool for the diagnosis of Parkinson's disease.

  8. Progressive sparse representation-based classification using local discrete cosine transform evaluation for image recognition

    NASA Astrophysics Data System (ADS)

    Song, Xiaoning; Feng, Zhen-Hua; Hu, Guosheng; Yang, Xibei; Yang, Jingyu; Qi, Yunsong

    2015-09-01

    This paper proposes a progressive sparse representation-based classification algorithm using local discrete cosine transform (DCT) evaluation to perform face recognition. Specifically, the sum of the contributions of all training samples of each subject is first taken as the contribution of this subject, then the redundant subject with the smallest contribution to the test sample is iteratively eliminated. Second, the progressive method aims at representing the test sample as a linear combination of all the remaining training samples, by which the representation capability of each training sample is exploited to determine the optimal "nearest neighbors" for the test sample. Third, the transformed DCT evaluation is constructed to measure the similarity between the test sample and each local training sample using cosine distance metrics in the DCT domain. The final goal of the proposed method is to determine an optimal weighted sum of nearest neighbors that are obtained under the local correlative degree evaluation, which is approximately equal to the test sample, and we can use this weighted linear combination to perform robust classification. Experimental results conducted on the ORL database of faces (created by the Olivetti Research Laboratory in Cambridge), the FERET face database (managed by the Defense Advanced Research Projects Agency and the National Institute of Standards and Technology), AR face database (created by Aleix Martinez and Robert Benavente in the Computer Vision Center at U.A.B), and USPS handwritten digit database (gathered at the Center of Excellence in Document Analysis and Recognition at SUNY Buffalo) demonstrate the effectiveness of the proposed method.

  9. Neural networks and applications tutorial

    NASA Astrophysics Data System (ADS)

    Guyon, I.

    1991-09-01

    The importance of neural networks has grown dramatically during this decade. While only a few years ago they were primarily of academic interest, now dozens of companies and many universities are investigating the potential use of these systems and products are beginning to appear. The idea of building a machine whose architecture is inspired by that of the brain has roots which go far back in history. Nowadays, technological advances of computers and the availability of custom integrated circuits permit simulations of hundreds or even thousands of neurons. In conjunction, the growing interest in learning machines, non-linear dynamics and parallel computation spurred renewed attention in artificial neural networks. Many tentative applications have been proposed, including decision systems (associative memories, classifiers, data compressors and optimizers), or parametric models for signal processing purposes (system identification, automatic control, noise canceling, etc.). While they do not always outperform standard methods, neural network approaches are already used in some real world applications for pattern recognition and signal processing tasks. The tutorial is divided into six lectures that were presented at the Third Graduate Summer Course on Computational Physics (September 3-7, 1990) on Parallel Architectures and Applications, organized by the European Physical Society: (1) Introduction: machine learning and biological computation. (2) Adaptive artificial neurons (perceptron, ADALINE, sigmoid units, etc.): learning rules and implementations. (3) Neural network systems: architectures, learning algorithms. (4) Applications: pattern recognition, signal processing, etc. (5) Elements of learning theory: how to build networks which generalize. (6) A case study: a neural network for on-line recognition of handwritten alphanumeric characters.

  10. Good initialization model with constrained body structure for scene text recognition

    NASA Astrophysics Data System (ADS)

    Zhu, Anna; Wang, Guoyou; Dong, Yangbo

    2016-09-01

    Scene text recognition has gained significant attention in the computer vision community. Character detection and recognition are the promise of text recognition and affect the overall performance to a large extent. We proposed a good initialization model for scene character recognition from cropped text regions. We use constrained character's body structures with deformable part-based models to detect and recognize characters in various backgrounds. The character's body structures are achieved by an unsupervised discriminative clustering approach followed by a statistical model and a self-build minimum spanning tree model. Our method utilizes part appearance and location information, and combines character detection and recognition in cropped text region together. The evaluation results on the benchmark datasets demonstrate that our proposed scheme outperforms the state-of-the-art methods both on scene character recognition and word recognition aspects.

  11. A Record Book of Open Heart Surgical Cases between 1959 and 1982, Hand-Written by a Cardiac Surgeon.

    PubMed

    Kim, Won-Gon

    2016-08-01

    A book of brief records of open heart surgery performed between 1959 and 1982 at Seoul National University Hospital was recently found. The book was hand-written by the late professor and cardiac surgeon Yung Kyoon Lee (1921-1994). This book contains valuable information about cardiac patients and surgery at the early stages of the establishment of open heart surgery in Korea, and at Seoul National University Hospital. This report is intended to analyze the content of the book.

  12. Urdu Nasta'liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks.

    PubMed

    Naz, Saeeda; Umar, Arif Iqbal; Ahmed, Riaz; Razzak, Muhammad Imran; Rashid, Sheikh Faisal; Shafait, Faisal

    2016-01-01

    The recognition of Arabic script and its derivatives such as Urdu, Persian, Pashto etc. is a difficult task due to complexity of this script. Particularly, Urdu text recognition is more difficult due to its Nasta'liq writing style. Nasta'liq writing style inherits complex calligraphic nature, which presents major issues to recognition of Urdu text owing to diagonality in writing, high cursiveness, context sensitivity and overlapping of characters. Therefore, the work done for recognition of Arabic script cannot be directly applied to Urdu recognition. We present Multi-dimensional Long Short Term Memory (MDLSTM) Recurrent Neural Networks with an output layer designed for sequence labeling for recognition of printed Urdu text-lines written in the Nasta'liq writing style. Experiments show that MDLSTM attained a recognition accuracy of 98% for the unconstrained Urdu Nasta'liq printed text, which significantly outperforms the state-of-the-art techniques.

  13. Segmental Rescoring in Text Recognition

    DTIC Science & Technology

    2014-02-04

    description relates to rescoring text hypotheses in text recognition based on segmental features. Offline printed text and handwriting recognition (OHR) can... Handwriting , College Park, Md., 2006, which is incorporated by reference here. For the set of training images 202, a character modeler 208 receives

  14. A smart sensor architecture based on emergent computation in an array of outer-totalistic cells

    NASA Astrophysics Data System (ADS)

    Dogaru, Radu; Dogaru, Ioana; Glesner, Manfred

    2005-06-01

    A novel smart-sensor architecture is proposed, capable of segmenting and recognizing characters in a monochrome image. It can provide a list of ASCII codes representing the recognized characters from the monochrome visual field. It can operate as an aid for the blind or in industrial applications. A bio-inspired cellular model with simple linear neurons was found to be best suited to the nontrivial task of cropping isolated compact objects such as handwritten digits or characters. By attaching a simple outer-totalistic cell to each pixel sensor, emergent computation in the resulting cellular automata lattice provides a straightforward and compact solution to the otherwise computationally intensive problem of character segmentation. A simple and robust recognition algorithm is built in a compact sequential controller accessing the array of cells so that the integrated device can directly provide a list of codes of the recognized characters. Preliminary simulation tests indicate good performance and robustness to various distortions of the visual field.

  15. Information based universal feature extraction

    NASA Astrophysics Data System (ADS)

    Amiri, Mohammad; Brause, Rüdiger

    2015-02-01

    In many real world image based pattern recognition tasks, the extraction and usage of task-relevant features are the most crucial part of the diagnosis. In the standard approach, they mostly remain task-specific, although humans who perform such a task always use the same image features, trained in early childhood. It seems that universal feature sets exist, but they are not yet systematically found. In our contribution, we tried to find those universal image feature sets that are valuable for most image related tasks. In our approach, we trained a neural network by natural and non-natural images of objects and background, using a Shannon information-based algorithm and learning constraints. The goal was to extract those features that give the most valuable information for classification of visual objects hand-written digits. This will give a good start and performance increase for all other image learning tasks, implementing a transfer learning approach. As result, in our case we found that we could indeed extract features which are valid in all three kinds of tasks.

  16. Geometrical structure of Neural Networks: Geodesics, Jeffrey's Prior and Hyper-ribbons

    NASA Astrophysics Data System (ADS)

    Hayden, Lorien; Alemi, Alex; Sethna, James

    2014-03-01

    Neural networks are learning algorithms which are employed in a host of Machine Learning problems including speech recognition, object classification and data mining. In practice, neural networks learn a low dimensional representation of high dimensional data and define a model manifold which is an embedding of this low dimensional structure in the higher dimensional space. In this work, we explore the geometrical structure of a neural network model manifold. A Stacked Denoising Autoencoder and a Deep Belief Network are trained on handwritten digits from the MNIST database. Construction of geodesics along the surface and of slices taken from the high dimensional manifolds reveal a hierarchy of widths corresponding to a hyper-ribbon structure. This property indicates that neural networks fall into the class of sloppy models, in which certain parameter combinations dominate the behavior. Employing this information could prove valuable in designing both neural network architectures and training algorithms. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1144153.

  17. SiGe epitaxial memory for neuromorphic computing with reproducible high performance based on engineered dislocations

    NASA Astrophysics Data System (ADS)

    Choi, Shinhyun; Tan, Scott H.; Li, Zefan; Kim, Yunjo; Choi, Chanyeol; Chen, Pai-Yu; Yeon, Hanwool; Yu, Shimeng; Kim, Jeehwan

    2018-01-01

    Although several types of architecture combining memory cells and transistors have been used to demonstrate artificial synaptic arrays, they usually present limited scalability and high power consumption. Transistor-free analog switching devices may overcome these limitations, yet the typical switching process they rely on—formation of filaments in an amorphous medium—is not easily controlled and hence hampers the spatial and temporal reproducibility of the performance. Here, we demonstrate analog resistive switching devices that possess desired characteristics for neuromorphic computing networks with minimal performance variations using a single-crystalline SiGe layer epitaxially grown on Si as a switching medium. Such epitaxial random access memories utilize threading dislocations in SiGe to confine metal filaments in a defined, one-dimensional channel. This confinement results in drastically enhanced switching uniformity and long retention/high endurance with a high analog on/off ratio. Simulations using the MNIST handwritten recognition data set prove that epitaxial random access memories can operate with an online learning accuracy of 95.1%.

  18. Text Detection, Tracking and Recognition in Video: A Comprehensive Survey.

    PubMed

    Yin, Xu-Cheng; Zuo, Ze-Yu; Tian, Shu; Liu, Cheng-Lin

    2016-04-14

    Intelligent analysis of video data is currently in wide demand because video is a major source of sensory data in our lives. Text is a prominent and direct source of information in video, while recent surveys of text detection and recognition in imagery [1], [2] focus mainly on text extraction from scene images. Here, this paper presents a comprehensive survey of text detection, tracking and recognition in video with three major contributions. First, a generic framework is proposed for video text extraction that uniformly describes detection, tracking, recognition, and their relations and interactions. Second, within this framework, a variety of methods, systems and evaluation protocols of video text extraction are summarized, compared, and analyzed. Existing text tracking techniques, tracking based detection and recognition techniques are specifically highlighted. Third, related applications, prominent challenges, and future directions for video text extraction (especially from scene videos and web videos) are also thoroughly discussed.

  19. Group discriminatory power of handwritten characters

    NASA Astrophysics Data System (ADS)

    Tomai, Catalin I.; Kshirsagar, Devika M.; Srihari, Sargur N.

    2003-12-01

    Using handwritten characters we address two questions: (i) what is the group identification performance of different alphabets (upper and lower case) and (ii) what are the best characters for the verification task (same writer/different writer discrimination) knowing demographic information about the writer such as ethnicity, age or sex. The Bhattacharyya distance is used to rank different characters by their group discriminatory power and the k-NN classifier to measure the individual performance of characters for group identification. Given the tasks of identifying the correct gender/age/ethnicity or handedness, the accumulated performance of characters varies between 65% and 85%.
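
The distance used above for ranking characters (usually spelled Bhattacharyya) has a simple closed form for discrete distributions: the negative log of the sum of the square roots of the elementwise products. The sketch below assumes each group is summarized by a histogram over the same bins; how the authors actually estimated their distributions is not stated in the abstract, and the function name is hypothetical.

```python
import numpy as np

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two histograms defined over the same bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()                      # normalise counts to probabilities
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))          # Bhattacharyya coefficient, in [0, 1]
    if bc == 0.0:
        return float("inf")              # disjoint supports: maximally separated
    return float(-np.log(bc))
```

A larger value means the two groups overlap less on that feature, so a character whose feature histograms give a larger distance between groups would rank as more discriminative.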

  20. Randomized Prediction Games for Adversarial Machine Learning.

    PubMed

    Rota Bulo, Samuel; Biggio, Battista; Pillai, Ignazio; Pelillo, Marcello; Roli, Fabio

    In spam and malware detection, attackers exploit randomization to obfuscate malicious data and increase their chances of evading detection at test time, e.g., malware code is typically obfuscated using random strings or byte sequences to hide known exploits. Interestingly, randomization has also been proposed to improve security of learning algorithms against evasion attacks, as it results in hiding information about the classifier to the attacker. Recent work has proposed game-theoretical formulations to learn secure classifiers, by simulating different evasion attacks and modifying the classification function accordingly. However, both the classification function and the simulated data manipulations have been modeled in a deterministic manner, without accounting for any form of randomization. In this paper, we overcome this limitation by proposing a randomized prediction game, namely, a noncooperative game-theoretic formulation in which the classifier and the attacker make randomized strategy selections according to some probability distribution defined over the respective strategy set. We show that our approach allows one to improve the tradeoff between attack detection and false alarms with respect to the state-of-the-art secure classifiers, even against attacks that are different from those hypothesized during design, on application examples including handwritten digit recognition, spam, and malware detection.

  1. Ventral-stream-like shape representation: from pixel intensity values to trainable object-selective COSFIRE models

    PubMed Central

    Azzopardi, George; Petkov, Nicolai

    2014-01-01

    The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective at recognizing and localizing (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms. PMID:25126068
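
The response rule named above, a weighted geometric mean of detector responses, can be illustrated at a single image location. This is only a sketch of that one formula: the blurring and shifting steps are omitted, the weights are assumed to be positive, and the function name is hypothetical.

```python
import numpy as np

def weighted_geometric_mean(responses, weights):
    """Weighted geometric mean: (prod r_i ** w_i) ** (1 / sum w_i), for r_i >= 0 and w_i > 0."""
    r = np.asarray(responses, dtype=float)
    w = np.asarray(weights, dtype=float)
    if np.any(r <= 0.0):
        return 0.0                       # one silent detector silences the whole filter
    # Computed in log space to avoid overflow or underflow in the product.
    return float(np.exp(np.sum(w * np.log(r)) / np.sum(w)))
```

Because the combination is multiplicative, the output is non-zero only when every selected detector responds, which gives the filter an AND-like selectivity for the full arrangement of features.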

  2. Word spotting for handwritten documents using Chamfer Distance and Dynamic Time Warping

    NASA Astrophysics Data System (ADS)

    Saabni, Raid M.; El-Sana, Jihad A.

    2011-01-01

    A large number of handwritten historical documents are located in libraries around the world. The desire to access, search, and explore these documents paves the way for a new age of knowledge sharing and promotes collaboration and understanding between human societies. Currently, the indexes for these documents are generated manually, which is very tedious and time consuming. Results produced by state-of-the-art techniques for converting complete images of handwritten documents into textual representations are not yet sufficient. Therefore, word-spotting methods have been developed to archive and index images of handwritten documents in order to enable efficient searching within documents. In this paper, we present a new matching algorithm to be used in word-spotting tasks for historical Arabic documents. We present a novel algorithm based on the Chamfer Distance to compute the similarity between shapes of word-parts. Matching results are used to cluster images of Arabic word-parts into different classes using the Nearest Neighbor rule. To compute the distance between two word-part images, the algorithm subdivides each image into equal-sized slices (windows). A modified version of the Chamfer Distance, incorporating geometric gradient features and distance transform data, is used as a similarity distance between the different slices. Finally, the Dynamic Time Warping (DTW) algorithm is used to measure the distance between two images of word-parts. By using DTW, we enabled our system to cluster similar word-parts even though they are transformed non-linearly due to the nature of handwriting. We tested our implementation of the presented methods using various documents in different writing styles, taken from Juma'a Al Majid Center - Dubai, and obtained encouraging results.
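
    The DTW step described in this abstract can be sketched as below. It is a textbook dynamic-programming formulation, not the authors' code: each word-part image is assumed to be already reduced to a sequence of per-slice features, and `slice_dist` is a placeholder where the paper uses its modified Chamfer Distance.

```python
def dtw_distance(seq_a, seq_b, slice_dist):
    """Classic DTW cost between two sequences of slice features.

    `slice_dist(a, b)` is any non-negative distance between two slices;
    here it is a stand-in for the paper's modified Chamfer Distance.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i and first j slices.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = slice_dist(seq_a[i - 1], seq_b[j - 1])
            # A slice may match one-to-one, or be stretched on either side.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical scalar slice features and an absolute-difference distance.
scalar_dist = lambda a, b: abs(a - b)
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3], scalar_dist)
```

    In this toy case the second sequence is a stretched copy of the first, and the alignment absorbs the repeated slice at no cost, which is the non-linear tolerance the abstract refers to.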

  3. Word-level recognition of multifont Arabic text using a feature vector matching approach

    NASA Astrophysics Data System (ADS)

    Erlandson, Erik J.; Trenkle, John M.; Vogt, Robert C., III

    1996-03-01

    Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is a script language, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match score are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
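
    The lexicon-matching step described in this abstract can be sketched as a nearest-match lookup. The abstract does not specify the match score, so cosine similarity is an assumption here, as are the function names, the toy three-dimensional vectors, and the placeholder lexicon entries.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)

def top_hypotheses(query, database, k=3):
    """Return the k best (word, score) pairs for a query feature vector.

    `database` is a list of (word, vector) pairs; a word may appear several
    times (for example once per font), and its best score is kept.
    """
    best = {}
    for word, vec in database:
        score = cosine(query, vec)
        if word not in best or score > best[word]:
            best[word] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Hypothetical database: two stored vectors for word_a, one for word_b.
database = [
    ("word_a", [1.0, 0.0, 0.0]),
    ("word_a", [0.9, 0.1, 0.0]),
    ("word_b", [0.0, 1.0, 0.0]),
]
hypotheses = top_hypotheses([1.0, 0.0, 0.0], database, k=2)
```

    Storing several vectors per word and keeping the best score is one simple way to reflect the multiple-font entries the abstract mentions.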

  4. [Penicher: an manuscript addendum to his pharmacopoeia of 1695 on the copy of Pharmacy College Library].

    PubMed

    Bonnemain, Bruno

    2016-03-01

    Penicher's pharmacopeia (1695) was part of the Library of the "College de Pharmacie". The inventory of this Library was made in 1780 and is kept by the Library of the BIU Santé, Paris-Descartes University in Paris, which digitized it recently. This copy contains handwritten texts that supplement the original edition. The first main addition, at the beginning of the document, consists of three drug recipes in Latin, one of which was well known in the early 18th century: the vulnerary balm of Leonardo Fioraventi (1517-1588), also known as Fioraventi's alcoholate. This product remained in the French Codex until 1949. Penicher's book also includes, at the end, three handwritten pages in French that depict the equipment of apothecaries. These drawings are very close to those in Charas' Pharmacopeia. One can think that these additions are from the second part of the 18th century, but before the gift of the pharmacopeia to the College de Pharmacie by Fourcy in 1765. The author is unknown, but he was probably one of Fourcy's predecessors at the Pharmacie de l'Ours (Bear's pharmacy). This gift, made by Fourcy when he joined the Community of Parisian pharmacists, did not prevent Fourcy from being sentenced by his fellow pharmacists a few years later for selling "Chinese specialties" that a certain Jean-Daniel Smith, a physician established in Paris, had asked him to prepare.

  5. Ancient administrative handwritten documents: X-ray analysis and imaging

    PubMed Central

    Albertin, F.; Astolfo, A.; Stampanoni, M.; Peccenini, Eva; Hwu, Y.; Kaplan, F.; Margaritondo, G.

    2015-01-01

    Handwritten characters in administrative antique documents from three centuries have been detected using different synchrotron X-ray imaging techniques. Heavy elements in ancient inks, present even for everyday administrative manuscripts as shown by X-ray fluorescence spectra, produce attenuation contrast. In most cases the image quality is good enough for tomography reconstruction in view of future applications to virtual page-by-page ‘reading’. When attenuation is too low, differential phase contrast imaging can reveal the characters from refractive index effects. The results are potentially important for new information harvesting strategies, for example from the huge Archivio di Stato collection, objective of the Venice Time Machine project. PMID:25723946

  6. Ancient administrative handwritten documents: X-ray analysis and imaging.

    PubMed

    Albertin, F; Astolfo, A; Stampanoni, M; Peccenini, Eva; Hwu, Y; Kaplan, F; Margaritondo, G

    2015-03-01

    Handwritten characters in administrative antique documents from three centuries have been detected using different synchrotron X-ray imaging techniques. Heavy elements in ancient inks, present even for everyday administrative manuscripts as shown by X-ray fluorescence spectra, produce attenuation contrast. In most cases the image quality is good enough for tomography reconstruction in view of future applications to virtual page-by-page ‘reading’. When attenuation is too low, differential phase contrast imaging can reveal the characters from refractive index effects. The results are potentially important for new information harvesting strategies, for example from the huge Archivio di Stato collection, objective of the Venice Time Machine project.

  7. Medical Named Entity Recognition for Indonesian Language Using Word Representations

    NASA Astrophysics Data System (ADS)

    Rahman, Arief

    2018-03-01

    Nowadays, Named Entity Recognition (NER) systems are used on medical texts to obtain important medical information, such as diseases, symptoms, and drugs. While most NER systems are applied to formal medical texts, informal ones like those from social media (also called semi-formal texts) are starting to get recognition as a gold mine for medical information. We propose a theoretical NER model for semi-formal medical texts in our medical knowledge management system by comparing two kinds of word representations: cluster-based word representation and distributed representation.

  8. A segmentation-free approach to Arabic and Urdu OCR

    NASA Astrophysics Data System (ADS)

    Sabbour, Nazly; Shafait, Faisal

    2013-01-01

    In this paper, we present a generic Optical Character Recognition system for Arabic script languages called Nabocr. Nabocr uses OCR approaches specific for Arabic script recognition. Performing recognition on Arabic script text is relatively more difficult than Latin text due to the nature of Arabic script, which is cursive and context sensitive. Moreover, Arabic script has different writing styles that vary in complexity. Nabocr is initially trained to recognize both Urdu Nastaleeq and Arabic Naskh fonts. However, it can be trained by users to be used for other Arabic script languages. We have evaluated our system's performance for both Urdu and Arabic. In order to evaluate Urdu recognition, we have generated a dataset of Urdu text called UPTI (Urdu Printed Text Image Database), which measures different aspects of a recognition system. The performance of our system for Urdu clean text is 91%. For Arabic clean text, the performance is 86%. Moreover, we have compared the performance of our system against Tesseract's newly released Arabic recognition, and the performance of both systems on clean images is almost the same.

  9. Real-time classification and sensor fusion with a spiking deep belief network.

    PubMed

    O'Connor, Peter; Neil, Daniel; Liu, Shih-Chii; Delbruck, Tobi; Pfeiffer, Michael

    2013-01-01

    Deep Belief Networks (DBNs) have recently shown impressive performance on a broad range of classification problems. Their generative properties allow better understanding of the performance, and provide a simpler solution for sensor fusion tasks. However, because of their inherent need for feedback and parallel update of large numbers of units, DBNs are expensive to implement on serial computers. This paper proposes a method based on the Siegert approximation for Integrate-and-Fire neurons to map an offline-trained DBN onto an efficient event-driven spiking neural network suitable for hardware implementation. The method is demonstrated in simulation and by a real-time implementation of a 3-layer network with 2694 neurons used for visual classification of MNIST handwritten digits with input from a 128 × 128 Dynamic Vision Sensor (DVS) silicon retina, and sensory-fusion using additional input from a 64-channel AER-EAR silicon cochlea. The system is implemented through the open-source software in the jAER project and runs in real-time on a laptop computer. It is demonstrated that the system can recognize digits in the presence of distractions, noise, scaling, translation and rotation, and that the degradation of recognition performance by using an event-based approach is less than 1%. Recognition is achieved in an average of 5.8 ms after the onset of the presentation of a digit. By cue integration from both silicon retina and cochlea outputs we show that the system can be biased to select the correct digit from otherwise ambiguous input.

  10. New baseline correction algorithm for text-line recognition with bidirectional recurrent neural networks

    NASA Astrophysics Data System (ADS)

    Morillot, Olivier; Likforman-Sulem, Laurence; Grosicki, Emmanuèle

    2013-04-01

    Many preprocessing techniques have been proposed for isolated word recognition. However, recently, recognition systems have dealt with text blocks and their compound text lines. In this paper, we propose a new preprocessing approach to efficiently correct baseline skew and fluctuations. Our approach is based on a sliding window within which the vertical position of the baseline is estimated. Segmentation of text lines into subparts is, thus, avoided. Experiments conducted on a large publicly available database (Rimes), with a BLSTM (bidirectional long short-term memory) recurrent neural network recognition system, show that our baseline correction approach highly improves performance.

  11. Practical vision based degraded text recognition system

    NASA Astrophysics Data System (ADS)

    Mohammad, Khader; Agaian, Sos; Saleh, Hani

    2011-02-01

    Rapid growth and progress in the medical, industrial, security and technology fields mean that more and more consideration is being given to camera-based optical character recognition (OCR). Applying OCR to scanned documents is quite mature, and there are many commercial and research products available on this topic. These products achieve acceptable recognition accuracy and reasonable processing times, especially with trained software and constrained text characteristics. Even though the application space for OCR is huge, it is quite challenging to design a single system that is capable of performing automatic OCR for text embedded in an image irrespective of the application. Challenges for OCR systems include images taken under natural real-world conditions, surface curvature, text orientation, font, size, lighting conditions, and noise. These and many other conditions make it extremely difficult to achieve reasonable character recognition. Performance of conventional OCR systems drops dramatically as the degradation level of the text image quality increases. In this paper, a new recognition method is proposed to recognize solid or dotted line degraded characters. The degraded text string is localized and segmented using a new algorithm. The new method was implemented and tested using a development framework system that is capable of performing OCR on camera-captured images. The framework allows parameter tuning of the image-processing algorithm based on a training set of camera-captured text images. Novel methods were used for enhancement, text localization and segmentation, which enables building a custom system capable of performing automatic OCR for different applications. The developed framework system includes new image enhancement, filtering, and segmentation techniques, which enabled higher recognition accuracies, faster processing time, and lower energy consumption compared with the best state-of-the-art published techniques. The system successfully produced impressive OCR accuracies (90% to 93%) using customized systems generated by our development framework in two industrial OCR applications: water bottle label text recognition and concrete slab plate text recognition. The system was also trained for the Arabic language alphabet, and demonstrated extremely high recognition accuracy (99%) for Arabic license name plate text recognition with processing times of 10 seconds. The accuracy and run times of the system were compared with conventional and many state-of-the-art methods; the proposed system shows excellent results.

  12. Multi-exemplar affinity propagation.

    PubMed

    Wang, Chang-Dong; Lai, Jian-Huang; Suen, Ching Y; Zhu, Jun-Yong

    2013-09-01

    The affinity propagation (AP) clustering algorithm has received much attention in the past few years. AP is appealing because it is efficient, insensitive to initialization, and it produces clusters at a lower error rate than other exemplar-based methods. However, its single-exemplar model becomes inadequate when applied to model multisubclasses in some situations such as scene analysis and character recognition. To remedy this deficiency, we have extended the single-exemplar model to a multi-exemplar one to create a new multi-exemplar affinity propagation (MEAP) algorithm. This new model automatically determines the number of exemplars in each cluster associated with a super exemplar to approximate the subclasses in the category. Solving the model is NP-hard and we tackle it with the max-sum belief propagation to produce neighborhood maximum clusters, with no need to specify beforehand the number of clusters, multi-exemplars, and superexemplars. Also, utilizing the sparsity in the data, we are able to reduce substantially the computational time and storage. Experimental studies have shown MEAP's significant improvements over other algorithms on unsupervised image categorization and the clustering of handwritten digits.

  13. CW-SSIM kernel based random forest for image classification

    NASA Astrophysics Data System (ADS)

    Fan, Guangzhe; Wang, Zhou; Wang, Jiheng

    2010-07-01

    The complex wavelet structural similarity (CW-SSIM) index has been proposed as a powerful image similarity metric that is robust to translation, scaling and rotation of images, but how to employ it in image classification applications has not been deeply investigated. In this paper, we incorporate CW-SSIM as a kernel function into a random forest learning algorithm. This leads to a novel image classification approach that does not require a feature extraction or dimension reduction stage at the front end. We use handwritten digit recognition as an example to demonstrate our algorithm. We compare the performance of the proposed approach with random forest learning based on other kernels, including the widely adopted Gaussian and inner product kernels. Empirical evidence shows that the proposed method is superior in its classification power. We also compared our proposed approach with the direct random forest method without a kernel and with the popular kernel-learning method, the support vector machine. Our test results based on both simulated and real-world data suggest that the proposed approach performs better than traditional methods without the feature selection procedure.

  14. Memristor-Based Analog Computation and Neural Network Classification with a Dot Product Engine.

    PubMed

    Hu, Miao; Graves, Catherine E; Li, Can; Li, Yunning; Ge, Ning; Montgomery, Eric; Davila, Noraica; Jiang, Hao; Williams, R Stanley; Yang, J Joshua; Xia, Qiangfei; Strachan, John Paul

    2018-03-01

    Using memristor crossbar arrays to accelerate computations is a promising approach to efficiently implement algorithms in deep neural networks. Early demonstrations, however, are limited to simulations or small-scale problems primarily due to materials and device challenges that limit the size of the memristor crossbar arrays that can be reliably programmed to stable and analog values, which is the focus of the current work. High-precision analog tuning and control of memristor cells across a 128 × 64 array is demonstrated, and the resulting vector matrix multiplication (VMM) computing precision is evaluated. Single-layer neural network inference is performed in these arrays, and the performance compared to a digital approach is assessed. The memristor computing system used here reaches a VMM accuracy equivalent of 6 bits, and an 89.9% recognition accuracy is achieved for the 10k MNIST handwritten digit test set. Forecasts show that with integrated (on chip) and scaled memristors, a computational efficiency greater than 100 trillion operations per second per Watt is possible. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
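
    As a rough intuition aid for what about 6 bits of precision means in a vector matrix multiplication, the sketch below rounds each weight to one of 64 uniformly spaced levels and compares the result with the exact product. This is not the authors' measurement procedure: the uniform quantizer, the weight range of -1 to 1, and the toy matrix are all assumptions made here.

```python
def quantize(value, bits, lo=-1.0, hi=1.0):
    """Round `value` to the nearest of 2**bits uniformly spaced levels in [lo, hi]."""
    levels = (1 << bits) - 1          # number of steps between lo and hi
    step = (hi - lo) / levels
    clipped = min(max(value, lo), hi)
    return lo + round((clipped - lo) / step) * step

def vmm(vector, matrix):
    """Vector matrix product: one output per matrix column."""
    columns = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vector, matrix)) for j in range(columns)]

# Hypothetical 3 x 2 weight matrix and input vector.
weights = [[0.30, -0.72], [0.05, 0.41], [-0.66, 0.18]]
inputs = [1.0, 0.5, -0.25]

exact = vmm(inputs, weights)
coarse = vmm(inputs, [[quantize(w, 6) for w in row] for row in weights])
worst_error = max(abs(e - c) for e, c in zip(exact, coarse))
```

    Under these assumptions each weight moves by at most half a quantization step, so the output error is bounded by half a step times the sum of the absolute input values.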

  15. The effects of two health information texts on patient recognition memory: a randomized controlled trial.

    PubMed

    Freed, Erin; Long, Debra; Rodriguez, Tonantzin; Franks, Peter; Kravitz, Richard L; Jerant, Anthony

    2013-08-01

    To compare the effects of two health information texts on patient recognition memory, a key aspect of comprehension. Randomized controlled trial (N=60), comparing the effects of experimental and control colorectal cancer (CRC) screening texts on recognition memory, measured using a statement recognition test, accounting for response bias (score range -0.91 to 5.34). The experimental text had a lower Flesch-Kincaid reading grade level (7.4 versus 9.6), was more focused on addressing screening barriers, and employed more comparative tables than the control text. Recognition memory was higher in the experimental group (2.54 versus 1.09, t=-3.63, P=0.001), including after adjustment for age, education, and health literacy (β=0.42, 95% CI: 0.17, 0.68, P=0.001), and in analyses limited to persons with college degrees (β=0.52, 95% CI: 0.18, 0.86, P=0.004) or no self-reported health literacy problems (β=0.39, 95% CI: 0.07, 0.71, P=0.02). An experimental CRC screening text improved recognition memory, including among patients with high education and self-assessed health literacy. CRC screening texts comparable to our experimental text may be warranted for all screening-eligible patients, if such texts improve screening uptake. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
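
    The Flesch-Kincaid reading grade level cited in this abstract is computed from word, sentence, and syllable counts. The sketch below applies the standard published formula; the function name is chosen here, and the counts in the example are hypothetical rather than taken from the study's texts.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts.

    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
```

    Shorter sentences and fewer syllables per word both lower the score, which is how a text can be rewritten to reach a lower grade level, as in the 7.4 versus 9.6 comparison reported above.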

  16. The Effects of Two Health Information Texts on Patient Recognition Memory: A Randomized Controlled Trial

    PubMed Central

    Freed, Erin; Long, Debra; Rodriguez, Tonantzin; Franks, Peter; Kravitz, Richard L.; Jerant, Anthony

    2013-01-01

    Objective To compare the effects of two health information texts on patient recognition memory, a key aspect of comprehension. Methods Randomized controlled trial (N = 60), comparing the effects of experimental and control colorectal cancer (CRC) screening texts on recognition memory, measured using a statement recognition test, accounting for response bias (score range −0.91 to 5.34). The experimental text had a lower Flesch-Kincaid reading grade level (7.4 versus 9.6), was more focused on addressing screening barriers, and employed more comparative tables than the control text. Results Recognition memory was higher in the experimental group (2.54 versus 1.09, t= −3.63, P = 0.001), including after adjustment for age, education, and health literacy (β = 0.42, 95% CI 0.17, 0.68, P = 0.001), and in analyses limited to persons with college degrees (β = 0.52, 95% CI 0.18, 0.86, P = 0.004) or no self-reported health literacy problems (β = 0.39, 95% CI 0.07, 0.71, P = 0.02). Conclusion An experimental CRC screening text improved recognition memory, including among patients with high education and self-assessed health literacy. Practice Implications CRC screening texts comparable to our experimental text may be warranted for all screening-eligible patients, if such texts improve screening uptake. PMID:23541216

  17. Rapid Implementation of Inpatient Electronic Physician Documentation at an Academic Hospital

    PubMed Central

    Hahn, J.S.; Bernstein, J.A.; McKenzie, R.B.; King, B.J.; Longhurst, C.A.

    2012-01-01

    Electronic physician documentation is an essential element of a complete electronic medical record (EMR). At Lucile Packard Children’s Hospital, a teaching hospital affiliated with Stanford University, we implemented an inpatient electronic documentation system for physicians over a 12-month period. Using an EMR-based free-text editor coupled with automated import of system data elements, we were able to achieve voluntary, widespread adoption of the electronic documentation process. When given the choice between electronic versus dictated report creation, the vast majority of users preferred the electronic method. In addition to increasing the legibility and accessibility of clinical notes, we also decreased the volume of dictated notes and scanning of handwritten notes, which provides the opportunity for cost savings to the institution. PMID:23620718

  18. The use of discrete-event simulation modeling to compare handwritten and electronic prescribing systems.

    PubMed

    Ghany, Ahmad; Vassanji, Karim; Kuziemsky, Craig; Keshavjee, Karim

    2013-01-01

    Electronic prescribing (e-prescribing) is expected to bring many benefits to Canadian healthcare, such as a reduction in errors and adverse drug reactions. As there currently is no functioning e-prescribing system in Canada that is completely electronic, we are unable to evaluate the performance of a live system. An alternative approach is to use simulation modeling for evaluation. We developed two discrete-event simulation models, one of the current handwritten prescribing system and one of a proposed e-prescribing system, to compare the performance of these two systems. We were able to compare the number of processes in each model, workflow efficiency, and the distribution of patients or prescriptions. Although we were able to compare these models to each other, using discrete-event simulation software was challenging. We were limited in the number of variables we could measure. We discovered non-linear processes and feedback loops in both models that could not be adequately represented using discrete-event simulation software. Finally, interactions between entities in both models could not be modeled using this type of software. We have come to the conclusion that a more appropriate approach to modeling both the handwritten and electronic prescribing systems would be to use a complex adaptive systems approach using agent-based modeling or systems-based modeling.
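
    To illustrate what a discrete-event simulation of a prescribing workflow involves, the sketch below processes a time-ordered event queue for a single-server model. It is a deliberately minimal stand-in, not the authors' models: the single server, the fixed service times, and the arrival times are hypothetical.

```python
import heapq

def simulate(arrival_times, service_time):
    """Single server handling prescriptions first come, first served.

    Returns the completion time of each prescription, in arrival order.
    """
    # Each event is (time, prescription index, kind); the heap pops by time.
    events = [(t, i, "arrive") for i, t in enumerate(arrival_times)]
    heapq.heapify(events)
    server_free_at = 0.0
    done = {}
    while events:
        time, idx, kind = heapq.heappop(events)
        if kind == "arrive":
            # Work starts when both the prescription and the server are ready.
            start = max(time, server_free_at)
            server_free_at = start + service_time
            heapq.heappush(events, (server_free_at, idx, "finish"))
        else:
            done[idx] = time
    return [done[i] for i in range(len(arrival_times))]

# Hypothetical comparison: slower handwritten vs faster electronic handling.
arrivals = [0, 1, 2]
handwritten = simulate(arrivals, service_time=2)
electronic = simulate(arrivals, service_time=1)
```

    A model of this kind advances only at event times and treats entities as passing through fixed steps, which is consistent with the abstract's observation that feedback loops and interactions between entities are hard to represent in this style.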

  19. Improving semi-text-independent method of writer verification using difference vector

    NASA Astrophysics Data System (ADS)

    Li, Xin; Ding, Xiaoqing

    2009-01-01

    The semi-text-independent method of writer verification based on the linear framework can use all characters of two handwritings to discriminate the writers, provided that the text contents are known. The handwritings are allowed to contain only a small number of characters, which may even be totally different. This fills the gap between the classical text-dependent and text-independent methods of writer verification. Moreover, the identity of every character is used by the semi-text-independent method in this paper. Two types of standard templates, generated from many writer-unknown handwritten samples and printed samples of each character, are introduced to represent the content information of each character. The difference vectors of the character samples are obtained by subtracting the standard templates from the original feature vectors, and are used in place of the original vectors in the process of writer verification. By removing a large amount of content information and retaining the style information, the verification accuracy of the semi-text-independent method is improved. On a handwriting database involving 30 writers, when the query handwriting and the reference handwriting are each composed of 30 distinct characters, the average equal error rate (EER) of writer verification reaches 9.96%. When the handwritings contain 50 characters, the average EER falls to 6.34%, which is 23.9% lower than the EER obtained without the difference vectors.
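
    The difference-vector idea in this abstract can be illustrated with a toy example. The two-dimensional features, the templates, and the additive "style offset" are hypothetical and chosen only to make the effect visible; they are not the paper's features.

```python
import math

def difference_vector(feature, template):
    """Subtract a character's standard template from a sample's feature vector."""
    if len(feature) != len(template):
        raise ValueError("feature and template must have the same length")
    return [f - t for f, t in zip(feature, template)]

def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Hypothetical templates for two different characters.
template_a = [10.0, 0.0]
template_b = [0.0, 10.0]

# The same writer adds the same style offset (1, 1) to both characters.
sample_a = [11.0, 1.0]
sample_b = [1.0, 11.0]

raw_gap = euclidean(sample_a, sample_b)
style_gap = euclidean(
    difference_vector(sample_a, template_a),
    difference_vector(sample_b, template_b),
)
```

    In this toy setting the raw vectors are far apart because the characters differ, while the difference vectors coincide because the writer is the same, which is the sense in which content information is removed and style information kept.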

  20. The use of the liquid crystal display (LCD) panel as a teaching aid in medical lectures.

    PubMed

    Wong, K T

    1992-01-01

    The liquid crystal display (LCD) panel is designed to project on-screen information of a microcomputer onto a larger screen with the aid of a standard overhead projector, so that large audiences may view on-screen information without having to crowd around the TV monitor. As little has been written about its use as a visual aid in medical teaching, the present report documents its use in a series of pathology lectures delivered, over a 2-year period, to two classes of about 150 medical students each. Some advantages of the LCD panel over the 35mm slide include the flexibility of last-minute text changes and less lead time needed for text preparation. It eliminates the problems of messy last-minute changes in, and improves legibility of, handwritten overhead projector transparencies. The disadvantages of using an LCD panel include the relatively bulky equipment which may pose transport problems, image clarity that is inferior to the 35mm slide, and equipment costs.

  1. Age and gender-invariant features of handwritten signatures for verification systems

    NASA Astrophysics Data System (ADS)

    AbdAli, Sura; Putz-Leszczynska, Joanna

    2014-11-01

    The handwritten signature is one of the most natural traits used in biometrics, the study of human physiological and behavioral patterns. Behavioral biometrics includes signatures, which may differ with the owner's gender or age because of intrinsic or extrinsic factors. This paper presents the results of the authors' research on the influence of age and gender on verification factors. The experiments in this research were conducted using a database that contains signatures and their associated metadata. The algorithm used is based on the universal forgery feature idea, where the global classifier is able to classify a signature as genuine or as a forgery without actual knowledge of the signature template and its owner. Additionally, the reduction of the dimensionality with the MRMR method is discussed.

  2. Classification and Verification of Handwritten Signatures with Time Causal Information Theory Quantifiers.

    PubMed

    Rosso, Osvaldo A; Ospina, Raydonal; Frery, Alejandro C

    2016-01-01

    We present a new approach for handwritten signature classification and verification based on descriptors stemming from time causal information theory. The proposal uses the Shannon entropy, the statistical complexity, and the Fisher information evaluated over the Bandt and Pompe symbolization of the horizontal and vertical coordinates of signatures. These six features are easy and fast to compute, and they are the input to a One-Class Support Vector Machine classifier. The results are better than state-of-the-art online techniques that employ higher-dimensional feature spaces which often require specialized software and hardware. We assess the consistency of our proposal with respect to the size of the training sample, and we also use it to classify the signatures into meaningful groups.
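
    One of the quantifiers named in this abstract, the Shannon entropy over the Bandt and Pompe symbolization, can be sketched as a normalized permutation entropy. The embedding order of 3, the tie-breaking rule, and the toy coordinate series are assumptions made here; the statistical complexity and Fisher information are not shown.

```python
import math
from collections import Counter

def ordinal_patterns(series, order=3):
    """Map each window of length `order` to the permutation that sorts it."""
    patterns = []
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        # Ties are broken by position because Python's sort is stable.
        patterns.append(tuple(sorted(range(order), key=lambda k: window[k])))
    return patterns

def permutation_entropy(series, order=3):
    """Shannon entropy of the ordinal-pattern distribution, scaled to [0, 1]."""
    counts = Counter(ordinal_patterns(series, order))
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(math.factorial(order))

# Hypothetical x-coordinate traces of a pen trajectory.
steady = permutation_entropy([1, 2, 3, 4, 5])
varied = permutation_entropy([1, 2, 3, 2, 1])
```

    Applying a function like this separately to the horizontal and vertical coordinate series would give two of the six features the abstract describes.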

  3. Annotation an effective device for student feedback: a critical review of the literature.

    PubMed

    Ball, Elaine C

    2010-05-01

    The paper examines hand-written annotation, its many features, difficulties and strengths as a feedback tool. It extends and clarifies what modest evidence is in the public domain and offers an evaluation of how to use annotation effectively in the support of student feedback [Marshall, C.M., 1998a. The Future of Annotation in a Digital (paper) World. Presented at the 35th Annual GLSLIS Clinic: Successes and Failures of Digital Libraries, June 20-24, University of Illinois at Urbana-Champaign, March 24, pp. 1-20; Marshall, C.M., 1998b. Toward an ecology of hypertext annotation. Hypertext. In: Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia, June 20-24, Pittsburgh Pennsylvania, US, pp. 40-49; Wolfe, J.L., Nuewirth, C.M., 2001. From the margins to the centre: the future of annotation. Journal of Business and Technical Communication, 15(3), 333-371; Diyanni, R., 2002. One Hundred Great Essays. Addison-Wesley, New York; Wolfe, J.L., 2002. Marginal pedagogy: how annotated texts affect writing-from-source texts. Written Communication, 19(2), 297-333; Liu, K., 2006. Annotation as an index to critical writing. Urban Education, 41, 192-207; Feito, A., Donahue, P., 2008. Minding the gap annotation as preparation for discussion. Arts and Humanities in Higher Education, 7(3), 295-307; Ball, E., 2009. A participatory action research study on handwritten annotation feedback and its impact on staff and students. Systemic Practice and Action Research, 22(2), 111-124; Ball, E., Franks, H., McGrath, M., Leigh, J., 2009. Annotation is a valuable tool to enhance learning and assessment in student essays. Nurse Education Today, 29(3), 284-291]. Although a significant number of studies examine annotation, this is largely related to on-line tools and computer mediated communication and not hand-written annotation as comment, phrase or sign written on the student essay to provide critique. 
    Little systematic research has been conducted to consider how this latter form of annotation influences student learning and assessment or, indeed, helps tutors to employ better annotative practices [Juwah, C., Macfarlane-Dick, D., Matthew, B., Nicol, D., Ross, D., Smith, B., 2004. Enhancing student learning through effective formative feedback. The Higher Education Academy, 1-40; Jewitt, C., Kress, G., 2005. English in classrooms: only write down what you need to know: annotation for what? English in Education, 39(1), 5-18]. There is little evidence on ways to heighten students' self-awareness when their essays are returned with annotated feedback [Storch, N., Tapper, J., 1997. Student annotations: what NNS and NS university students say about their own writing. Journal of Second Language Writing, 6(3), 245-265]. The literature review clarifies forms of annotation as feedback practice and offers a summary of the challenges and usefulness of annotation. Copyright 2009. Published by Elsevier Ltd.

  4. Real-time classification and sensor fusion with a spiking deep belief network

    PubMed Central

    O'Connor, Peter; Neil, Daniel; Liu, Shih-Chii; Delbruck, Tobi; Pfeiffer, Michael

    2013-01-01

    Deep Belief Networks (DBNs) have recently shown impressive performance on a broad range of classification problems. Their generative properties allow better understanding of the performance, and provide a simpler solution for sensor fusion tasks. However, because of their inherent need for feedback and parallel update of large numbers of units, DBNs are expensive to implement on serial computers. This paper proposes a method based on the Siegert approximation for Integrate-and-Fire neurons to map an offline-trained DBN onto an efficient event-driven spiking neural network suitable for hardware implementation. The method is demonstrated in simulation and by a real-time implementation of a 3-layer network with 2694 neurons used for visual classification of MNIST handwritten digits with input from a 128 × 128 Dynamic Vision Sensor (DVS) silicon retina, and sensory-fusion using additional input from a 64-channel AER-EAR silicon cochlea. The system is implemented through the open-source software in the jAER project and runs in real-time on a laptop computer. It is demonstrated that the system can recognize digits in the presence of distractions, noise, scaling, translation and rotation, and that the degradation of recognition performance by using an event-based approach is less than 1%. Recognition is achieved in an average of 5.8 ms after the onset of the presentation of a digit. By cue integration from both silicon retina and cochlea outputs we show that the system can be biased to select the correct digit from otherwise ambiguous input. PMID:24115919

  5. Classification and Verification of Handwritten Signatures with Time Causal Information Theory Quantifiers

    PubMed Central

    Ospina, Raydonal; Frery, Alejandro C.

    2016-01-01

    We present a new approach for handwritten signature classification and verification based on descriptors stemming from time causal information theory. The proposal uses the Shannon entropy, the statistical complexity, and the Fisher information evaluated over the Bandt and Pompe symbolization of the horizontal and vertical coordinates of signatures. These six features are easy and fast to compute, and they are the input to a One-Class Support Vector Machine classifier. The results are better than state-of-the-art online techniques that employ higher-dimensional feature spaces which often require specialized software and hardware. We assess the consistency of our proposal with respect to the size of the training sample, and we also use it to classify the signatures into meaningful groups. PMID:27907014
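
    As a rough illustration of one of the descriptors named above, the sketch below computes the normalized Shannon entropy of the Bandt and Pompe ordinal patterns of a coordinate sequence. The function names, the embedding order of 3, and the sample coordinates are assumptions made for this example; the statistical complexity and Fisher information features, and the authors' actual parameter choices, are not reproduced here.

```python
import math
from collections import Counter


def ordinal_patterns(series, order=3, delay=1):
    """Yield the Bandt-Pompe ordinal pattern of each embedded window."""
    span = (order - 1) * delay
    for start in range(len(series) - span):
        window = [series[start + k * delay] for k in range(order)]
        # The pattern is the permutation of positions that sorts the window.
        yield tuple(sorted(range(order), key=lambda k: window[k]))


def permutation_entropy(series, order=3, delay=1):
    """Normalized Shannon entropy of the pattern distribution, in [0, 1]."""
    counts = Counter(ordinal_patterns(series, order, delay))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("series is too short for the chosen order and delay")
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Divide by the maximum entropy, reached when all order! patterns are equally likely.
    return entropy / math.log(math.factorial(order))


# Hypothetical pen coordinates, not data from the paper.
x_coords = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # steadily increasing: a single pattern
y_coords = [4, 7, 9, 10, 6, 11, 3]          # irregular: several patterns

entropy_x = permutation_entropy(x_coords)
entropy_y = permutation_entropy(y_coords)
```

    A monotone sequence yields a single ordinal pattern and therefore zero entropy, while an irregular sequence spreads its mass over several patterns and moves toward 1.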

  6. Retrieving handwriting by combining word spotting and manifold ranking

    NASA Astrophysics Data System (ADS)

    Peña Saldarriaga, Sebastián; Morin, Emmanuel; Viard-Gaudin, Christian

    2012-01-01

    Online handwritten data, produced with Tablet PCs or digital pens, consists of a sequence of points (x, y). As the amount of data available in this form increases, algorithms for retrieval of online data are needed. Word spotting is a common approach used for the retrieval of handwriting. However, from an information retrieval (IR) perspective, word spotting is a primitive keyword based matching and retrieval strategy. We propose a framework for handwriting retrieval where an arbitrary word spotting method is used, and then a manifold ranking algorithm is applied on the initial retrieval scores. Experimental results on a database of more than 2,000 handwritten newswires show that our method can improve the performance of a state-of-the-art word spotting system by more than 10%.
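
    The re-ranking step can be sketched with one common formulation of manifold ranking, in which scores are propagated over a document similarity graph by iterating f = alpha * S * f + (1 - alpha) * y. The abstract does not say which variant the authors use, so the update rule, the value of alpha, the affinity matrix, and the initial scores below are all assumptions made for illustration.

```python
import math


def manifold_ranking(affinity, initial_scores, alpha=0.9, iterations=200):
    """Propagate initial retrieval scores over a document similarity graph."""
    n = len(initial_scores)
    degree = [sum(row) for row in affinity]
    # Symmetric normalization S = D^(-1/2) W D^(-1/2); isolated nodes get zero rows.
    norm = [[affinity[i][j] / math.sqrt(degree[i] * degree[j])
             if degree[i] > 0 and degree[j] > 0 else 0.0
             for j in range(n)]
            for i in range(n)]
    scores = list(initial_scores)
    for _ in range(iterations):
        scores = [alpha * sum(norm[i][j] * scores[j] for j in range(n))
                  + (1 - alpha) * initial_scores[i]
                  for i in range(n)]
    return scores


# Hypothetical graph: document 0 links to 1, 1 links to 2, document 3 is isolated.
affinity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
# Hypothetical word-spotting output: only document 0 matched the query.
word_spotting_scores = [1.0, 0.0, 0.0, 0.0]

ranked = manifold_ranking(affinity, word_spotting_scores)
```

    In this toy graph, documents 1 and 2 receive a positive score even though word spotting missed them, because they are connected to a document that matched; the isolated document stays at zero.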

  7. Gout in Duke Federico of Montefeltro (1422-1482): a new pearl of the Italian Renaissance.

    PubMed

    Fornaciari, Antonio; Giuffra, Valentina; Armocida, Emanuele; Caramella, Davide; Rühli, Frank J; Galassi, Francesco Maria

    2018-01-01

    The article examines the truthfulness of historical accounts claiming that Renaissance Duke Federico of Montefeltro (1422-1482) suffered from gout. By direct paleopathological assessment of the skeletal remains and by the philological investigation of historical and documental sources, primarily a 1461 handwritten letter by the Duke himself to his personal physician, a description of the symptoms and Renaissance therapy is offered and a final diagnosis of gout is formulated. The Duke's handwritten letter offers a rare testimony of ancient clinical self-diagnostics and Renaissance living-experience of gout. Moreover, the article also shows how an alliance between historical, documental and paleopathological methods can greatly increase the precision of retrospective diagnoses, thus helping to shed clearer light onto the antiquity and evolution of diseases.

  8. The impact of inverted text on visual word processing: An fMRI study.

    PubMed

    Sussman, Bethany L; Reddigari, Samir; Newman, Sharlene D

    2018-06-01

    Visual word recognition has been studied for decades. One question that has received limited attention is how different text presentation orientations disrupt word recognition. By examining how word recognition processes may be disrupted by different text orientations it is hoped that new insights can be gained concerning the process. Here, we examined the impact of rotating and inverting text on the neural network responsible for visual word recognition focusing primarily on a region of the occipito-temporal cortex referred to as the visual word form area (VWFA). A lexical decision task was employed in which words and pseudowords were presented in one of three orientations (upright, rotated or inverted). The results demonstrate that inversion caused the greatest disruption of visual word recognition processes. Both rotated and inverted text elicited increased activation in spatial attention regions within the right parietal cortex. However, inverted text recruited phonological and articulatory processing regions within the left inferior frontal and left inferior parietal cortices. Finally, the VWFA was found to not behave similarly to the fusiform face area in that unusual text orientations resulted in increased activation and not decreased activation. It is hypothesized here that the VWFA activation is modulated by feedback from linguistic processes. Copyright © 2018 Elsevier Inc. All rights reserved.

  9. Recovery of handwritten text from the diaries and papers of David Livingstone

    NASA Astrophysics Data System (ADS)

    Knox, Keith T.; Easton, Roger L., Jr.; Christens-Barry, William A.; Boydston, Kenneth

    2011-03-01

    During his explorations of Africa, David Livingstone kept a diary and wrote letters about his experiences. Near the end of his travels, he ran out of paper and ink and began recording his thoughts on leftover newspaper with ink made from local seeds. These writings suffer from fading, from interference with the printed text and from bleed through of the handwriting on the other side of the paper, making them hard to read. New image processing techniques have been developed to deal with these papers to make Livingstone's handwriting available for scholars to read. A scan of David Livingstone's papers was made using a twelve-wavelength, multispectral imaging system. The wavelengths ranged from the ultraviolet to the near infrared. In these wavelengths, the three different types of writing behave differently, making them distinguishable from each other. So far, three methods have been used to recover Livingstone's handwriting. These include pseudocolor (to make the different writings distinguishable), spectral band ratios (to remove text that does not change), and principal components analysis (to separate the different writings). In initial trials, these techniques have been able to lift handwriting off printed text and have suppressed handwriting that has bled through from the other side of the paper.
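
    The band-ratio idea can be shown on a toy example: dividing one registered spectral band by another maps every pixel whose reflectance is the same in both bands to roughly 1, so only writing that changes between bands stands out. The reflectance values and the function name below are invented for illustration and do not come from the Livingstone scans.

```python
def band_ratio(band_a, band_b, eps=1e-6):
    """Pixelwise ratio of two registered spectral bands given as lists of rows."""
    return [[a / (b + eps) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(band_a, band_b)]


# Hypothetical reflectances for one row of three pixels:
# [blank paper, printed text, handwritten ink]
band_visible = [[0.9, 0.2, 0.3]]   # print and handwriting are both dark
band_infrared = [[0.9, 0.2, 0.8]]  # print is unchanged, handwriting has faded

ratio = band_ratio(band_visible, band_infrared)
```

    Here the paper and the printed text both come out near 1 and merge into a flat background, while the handwriting pixel gives a ratio well below 1 and remains visible.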

  10. Scene Text Recognition using Similarity and a Lexicon with Sparse Belief Propagation

    PubMed Central

    Weinman, Jerod J.; Learned-Miller, Erik; Hanson, Allen R.

    2010-01-01

    Scene text recognition (STR) is the recognition of text anywhere in the environment, such as signs and store fronts. Relative to document recognition, it is challenging because of font variability, minimal language context, and uncontrolled conditions. Much information available to solve this problem is frequently ignored or used sequentially. Similarity between character images is often overlooked as useful information. Because of language priors, a recognizer may assign different labels to identical characters. Directly comparing characters to each other, rather than only a model, helps ensure that similar instances receive the same label. Lexicons improve recognition accuracy but are used post hoc. We introduce a probabilistic model for STR that integrates similarity, language properties, and lexical decision. Inference is accelerated with sparse belief propagation, a bottom-up method for shortening messages by reducing the dependency between weakly supported hypotheses. By fusing information sources in one model, we eliminate unrecoverable errors that result from sequential processing, improving accuracy. In experimental results recognizing text from images of signs in outdoor scenes, incorporating similarity reduces character recognition error by 19%, the lexicon reduces word recognition error by 35%, and sparse belief propagation reduces the lexicon words considered by 99.9% with a 12X speedup and no loss in accuracy. PMID:19696446

  11. Noisy text categorization.

    PubMed

    Vinciarelli, Alessandro

    2005-12-01

    This work presents categorization experiments performed over noisy texts. By noisy, we mean any text obtained through an extraction process (affected by errors) from media other than digital texts (e.g., transcriptions of speech recordings extracted with a recognition system). The performance of a categorization system over the clean and noisy (Word Error Rate between approximately 10 and approximately 50 percent) versions of the same documents is compared. The noisy texts are obtained through handwriting recognition and simulation of optical character recognition. The results show that the performance loss is acceptable for Recall values up to 60-70 percent depending on the noise sources. New measures of the extraction process performance, allowing a better explanation of the categorization results, are proposed.
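
    The noise level in these experiments is expressed as a Word Error Rate. A minimal sketch of the usual definition follows: the word-level edit distance between a reference and a recognized transcript, divided by the number of reference words. The function name and the sample sentences are assumptions for illustration, and the paper's own tooling is not known.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical clean text and recognizer output.
clean = "the cat sat on the mat"
noisy = "the cat sit on mat"
wer = word_error_rate(clean, noisy)
```

    The noisy version differs by one substitution and one deletion, so the rate is 2 errors over 6 reference words.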

  12. Deep neural networks for texture classification-A theoretical analysis.

    PubMed

    Basu, Saikat; Mukhopadhyay, Supratik; Karki, Manohar; DiBiano, Robert; Ganguly, Sangram; Nemani, Ramakrishna; Gayaka, Shreekant

    2018-01-01

    We investigate the use of Deep Neural Networks for the classification of image datasets where texture features are important for generating class-conditional discriminative representations. To this end, we first derive the size of the feature space for some standard textural features extracted from the input dataset and then use the theory of Vapnik-Chervonenkis dimension to show that hand-crafted feature extraction creates low-dimensional representations which help in reducing the overall excess error rate. As a corollary to this analysis, we derive for the first time upper bounds on the VC dimension of Convolutional Neural Network as well as Dropout and Dropconnect networks and the relation between excess error rate of Dropout and Dropconnect networks. The concept of intrinsic dimension is used to validate the intuition that texture-based datasets are inherently higher dimensional as compared to handwritten digits or other object recognition datasets and hence more difficult to be shattered by neural networks. We then derive the mean distance from the centroid to the nearest and farthest sampling points in an n-dimensional manifold and show that the Relative Contrast of the sample data vanishes as dimensionality of the underlying vector space tends to infinity. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Magnetic Tunnel Junction Based Long-Term Short-Term Stochastic Synapse for a Spiking Neural Network with On-Chip STDP Learning

    NASA Astrophysics Data System (ADS)

    Srinivasan, Gopalakrishnan; Sengupta, Abhronil; Roy, Kaushik

    2016-07-01

    Spiking Neural Networks (SNNs) have emerged as a powerful neuromorphic computing paradigm to carry out classification and recognition tasks. Nevertheless, the general purpose computing platforms and the custom hardware architectures implemented using standard CMOS technology, have been unable to rival the power efficiency of the human brain. Hence, there is a need for novel nanoelectronic devices that can efficiently model the neurons and synapses constituting an SNN. In this work, we propose a heterostructure composed of a Magnetic Tunnel Junction (MTJ) and a heavy metal as a stochastic binary synapse. Synaptic plasticity is achieved by the stochastic switching of the MTJ conductance states, based on the temporal correlation between the spiking activities of the interconnecting neurons. Additionally, we present a significance driven long-term short-term stochastic synapse comprising two unique binary synaptic elements, in order to improve the synaptic learning efficiency. We demonstrate the efficacy of the proposed synaptic configurations and the stochastic learning algorithm on an SNN trained to classify handwritten digits from the MNIST dataset, using a device to system-level simulation framework. The power efficiency of the proposed neuromorphic system stems from the ultra-low programming energy of the spintronic synapses.

  14. Magnetic Tunnel Junction Based Long-Term Short-Term Stochastic Synapse for a Spiking Neural Network with On-Chip STDP Learning.

    PubMed

    Srinivasan, Gopalakrishnan; Sengupta, Abhronil; Roy, Kaushik

    2016-07-13

    Spiking Neural Networks (SNNs) have emerged as a powerful neuromorphic computing paradigm to carry out classification and recognition tasks. Nevertheless, the general purpose computing platforms and the custom hardware architectures implemented using standard CMOS technology, have been unable to rival the power efficiency of the human brain. Hence, there is a need for novel nanoelectronic devices that can efficiently model the neurons and synapses constituting an SNN. In this work, we propose a heterostructure composed of a Magnetic Tunnel Junction (MTJ) and a heavy metal as a stochastic binary synapse. Synaptic plasticity is achieved by the stochastic switching of the MTJ conductance states, based on the temporal correlation between the spiking activities of the interconnecting neurons. Additionally, we present a significance driven long-term short-term stochastic synapse comprising two unique binary synaptic elements, in order to improve the synaptic learning efficiency. We demonstrate the efficacy of the proposed synaptic configurations and the stochastic learning algorithm on an SNN trained to classify handwritten digits from the MNIST dataset, using a device to system-level simulation framework. The power efficiency of the proposed neuromorphic system stems from the ultra-low programming energy of the spintronic synapses.

  15. Textual emotion recognition for enhancing enterprise computing

    NASA Astrophysics Data System (ADS)

    Quan, Changqin; Ren, Fuji

    2016-05-01

    The growing interest in affective computing (AC) brings a lot of valuable research topics that can meet different application demands in enterprise systems. The present study explores a sub area of AC techniques - textual emotion recognition for enhancing enterprise computing. Multi-label emotion recognition in text is able to provide a more comprehensive understanding of emotions than single label emotion recognition. A representation of 'emotion state in text' is proposed to encompass the multidimensional emotions in text. It ensures the description in a formal way of the configurations of basic emotions as well as of the relations between them. Our method allows recognition of emotions for words that bear indirect emotions, emotion ambiguity and multiple emotions. We further investigate the effect of word order for emotional expression by comparing the performances of a bag-of-words model and a sequence model for multi-label sentence emotion recognition. The experiments show that the classification results under the sequence model are better than under the bag-of-words model, and a homogeneous Markov model showed promising results for multi-label sentence emotion recognition. This emotion recognition system is able to provide a convenient way to acquire valuable emotion information and to improve enterprise competitive ability in many aspects.

  16. Concept recognition for extracting protein interaction relations from biomedical text

    PubMed Central

    Baumgartner, William A; Lu, Zhiyong; Johnson, Helen L; Caporaso, J Gregory; Paquette, Jesse; Lindemann, Anna; White, Elizabeth K; Medvedeva, Olga; Cohen, K Bretonnel; Hunter, Lawrence

    2008-01-01

    Background: Reliable information extraction applications have been a long sought goal of the biomedical text mining community, a goal that if reached would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well defined knowledge resources that explicitly cement the concept's semantics. The BioCreative II tasks discussed in this special issue have provided a unique opportunity to demonstrate the effectiveness of concept recognition in the field of biomedical language processing. Results: Through the modular construction of a protein interaction relation extraction system, we present several use cases of concept recognition in biomedical text, and relate these use cases to potential uses by the benchside biologist. Conclusion: Current information extraction technologies are approaching performance standards at which concept recognition can begin to deliver high quality data to the benchside biologist. Our system is available as part of the BioCreative Meta-Server project and on the internet . PMID:18834500

  17. Recognition and defect detection of dot-matrix text via variation-model based learning

    NASA Astrophysics Data System (ADS)

    Ohyama, Wataru; Suzuki, Koushi; Wakabayashi, Tetsushi

    2017-03-01

    An algorithm for recognition and defect detection of dot-matrix text printed on products is proposed. Extraction and recognition of dot-matrix text involve several difficulties that do not arise in standard camera-based OCR: the appearance of dot-matrix characters is corrupted and broken by illumination, by complex texture in the background, and by other standard characters printed on product packages. We propose a dot-matrix text extraction and recognition method which does not require any user interaction. The method employs detected location of corner points and classification score. An evaluation experiment using 250 images shows that recall and precision of extraction are 78.60% and 76.03%, respectively. Recognition accuracy of correctly extracted characters is 94.43%. Detecting printing defects in dot-matrix text is also important in the production scene to avoid illegal productions. We also propose a detection method for printing defects in dot-matrix characters. The method constructs a feature vector whose elements are the classification scores of each character class and employs a support vector machine to classify four types of printing defect. The detection accuracy of the proposed method is 96.68%.

  18. Automated extraction of radiation dose information from CT dose report images.

    PubMed

    Li, Xinhua; Zhang, Da; Liu, Bob

    2011-06-01

    The purpose of this article is to describe the development of an automated tool for retrieving texts from CT dose report images. Optical character recognition was adopted to perform text recognitions of CT dose report images. The developed tool is able to automate the process of analyzing multiple CT examinations, including text recognition, parsing, error correction, and exporting data to spreadsheets. The results were precise for total dose-length product (DLP) and were about 95% accurate for CT dose index and DLP of scanned series.

  19. Hooked on You.

    ERIC Educational Resources Information Center

    Sevier, Robert

    1988-01-01

    Most successful yield strategies use a series of messages specifically designed to meet the informational and emotional needs of students in the final decision-making stages. Techniques to try include: brochures, videotapes, handwritten postscripts, posters, and phone campaigns. (MLW)

  20. Scene text recognition in mobile applications by character descriptor and structure configuration.

    PubMed

    Yi, Chucai; Tian, Yingli

    2014-07-01

    Text characters and strings in natural scene can provide valuable information for many applications. Extracting text directly from natural scene images or videos is a challenging task because of diverse text patterns and variant background interferences. This paper proposes a method of scene text recognition from detected text regions. In text detection, our previously proposed algorithms are applied to obtain text regions from scene image. First, we design a discriminative character descriptor by combining several state-of-the-art feature detectors and descriptors. Second, we model character structure at each character class by designing stroke configuration maps. Our algorithm design is compatible with the application of scene text extraction in smart mobile devices. An Android-based demo system is developed to show the effectiveness of our proposed method on scene text information extraction from nearby objects. The demo system also provides us some insight into algorithm design and performance improvement of scene text extraction. The evaluation results on benchmark data sets demonstrate that our proposed scheme of text recognition is comparable with the best existing methods.

  1. THE RELIABILITY OF HAND-WRITTEN AND COMPUTERISED RECORDS OF BIRTH DATA COLLECTED AT BARAGWANATH HOSPITAL IN SOWETO

    PubMed Central

    Ellison, GTH; Richter, LM; de Wet, T; Harris, HE; Griesel, RD; McIntyre, JA

    2007-01-01

    This study examined the reliability of hand-written and computerised records of birth data collected during the Birth to Ten study at Baragwanath Hospital in Soweto. The reliability of record-keeping in hand-written obstetric and neonatal files was assessed by comparing duplicate records of six different variables abstracted from six different sections in these files. The reliability of computerised record-keeping was assessed by comparing the original hand-written record of each variable with records contained in the hospital’s computerised database. These data sets displayed similar levels of reliability which suggests that similar errors occurred when data were transcribed from one section of the files to the next, and from these files to the computerised database. In both sets of records reliability was highest for the categorical variable infant sex, and for those continuous variables (such as maternal age and gravidity) recorded with unambiguous units. Reliability was lower for continuous variables that could be recorded with different levels of precision (such as birth weight), those that were occasionally measured more than once, and those that could be measured using more than one measurement technique (such as gestational age). Reducing the number of times records are transcribed, categorising continuous variables, and standardising the techniques used for measuring and recording variables would improve the reliability of both hand-written and computerised data sets. PMID:9287552

  2. Reducing weight precision of convolutional neural networks towards large-scale on-chip image recognition

    NASA Astrophysics Data System (ADS)

    Ji, Zhengping; Ovsiannikov, Ilia; Wang, Yibing; Shi, Lilong; Zhang, Qiang

    2015-05-01

    In this paper, we develop a server-client quantization scheme to reduce bit resolution of deep learning architecture, i.e., Convolutional Neural Networks, for image recognition tasks. Low bit resolution is an important factor in bringing the deep learning neural network into hardware implementation, which directly determines the cost and power consumption. We aim to reduce the bit resolution of the network without sacrificing its performance. To this end, we design a new quantization algorithm called supervised iterative quantization to reduce the bit resolution of learned network weights. In the training stage, the supervised iterative quantization is conducted in two steps on the server: applying k-means based adaptive quantization to the learned network weights, and retraining the network based on the quantized weights. These two steps are alternated until the convergence criterion is met. In the testing stage, the network configuration and low-bit weights are loaded to the client hardware device to recognize incoming input in real time, where optimized but expensive quantization becomes infeasible. Considering this, we adopt a uniform quantization for the inputs and internal network responses (called feature maps) to maintain low on-chip expenses. The Convolutional Neural Network with reduced weight and input/response precision is demonstrated in recognizing two types of images: one is hand-written digit images and the other is real-life images in office scenarios. Both results show that the new network is able to achieve the performance of the neural network with full bit resolution, even though in the new network the bit resolution of both weight and input is significantly reduced, e.g., from 64 bits to 4-5 bits.
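
    The two quantizers mentioned in this abstract can be sketched in a few lines. The code below is an assumption-laden illustration, not the authors' supervised iterative quantization: it runs a plain one-dimensional k-means over a handful of invented weights and applies a uniform quantizer to invented activations, and it omits the retraining step that the paper alternates with quantization. All function names and values are hypothetical.

```python
def kmeans_quantize(weights, bits=2, iterations=20):
    """Cluster scalar weights into 2**bits shared values (bits >= 1)."""
    levels = 2 ** bits
    lo, hi = min(weights), max(weights)
    # Deterministic start: centroids spread evenly over the weight range.
    centroids = [lo + (hi - lo) * k / (levels - 1) for k in range(levels)]

    def nearest(w):
        return min(range(levels), key=lambda k: abs(w - centroids[k]))

    for _ in range(iterations):
        clusters = [[] for _ in range(levels)]
        for w in weights:
            clusters[nearest(w)].append(w)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[k]
                     for k, c in enumerate(clusters)]
    codes = [nearest(w) for w in weights]
    return codes, centroids


def uniform_quantize(values, bits=4, lo=0.0, hi=1.0):
    """Clip values to [lo, hi] and snap them to 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1
    quantized = []
    for v in values:
        clipped = min(max(v, lo), hi)
        index = round((clipped - lo) / (hi - lo) * steps)
        quantized.append(lo + index * (hi - lo) / steps)
    return quantized


# Hypothetical learned weights and feature-map values.
weights = [-0.9, -0.8, -0.1, 0.0, 0.1, 0.8, 0.9]
codes, codebook = kmeans_quantize(weights, bits=2)
restored = [codebook[c] for c in codes]

activations = uniform_quantize([0.0, 0.26, 1.0, 1.7], bits=2)
```

    The adaptive quantizer stores only a small codebook plus a short code per weight, which is why it suits the server side, while the uniform quantizer needs no codebook search and is therefore cheap enough to run on the client for every input.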

  3. The Future of GLOSS Sea Level Data Archaeology

    NASA Astrophysics Data System (ADS)

    Jevrejeva, S.; Bradshaw, E.; Tamisiea, M. E.; Aarup, T.

    2014-12-01

    Long term climate records are rare, consisting of unique and unrepeatable measurements. However, data do exist in analogue form in archives, libraries and other repositories around the world. The Global Sea Level Observing System (GLOSS) Group of Experts aims to provide advice on locating hidden tide gauge data, scanning and digitising records and quality controlling the resulting data. Long sea level data time series are used in Intergovernmental Panel on Climate Change (IPCC) assessment reports and climate studies, in oceanography to study changes in ocean currents, tides and storm surges, in geodesy to establish national datum and in geography and geology to monitor coastal land movement. GLOSS has carried out a number of data archaeology activities over the past decade, which have mainly involved sending member organisations questionnaires on their repositories. The Group of Experts is now looking at future developments in sea level data archaeology and how new technologies coming on line could be used by member organisations to make data digitisation and transcription more efficient. Analogue tide data comes in two forms: charts, which record the continuous measurements made by an instrument, usually via a pen trace on paper; and ledgers containing written values of observations. The GLOSS data archaeology web pages will provide a list of software that member organisations have reported to be suitable for the automatic digitisation of tide gauge charts. Transcribing of ledgers has so far proved more labour intensive and is usually conducted by people entering numbers by hand. GLOSS is exploring using Citizen Science techniques, such as those employed by the Old Weather project, to improve the efficiency of transcribing ledgers. The Group of Experts is also looking at recent advances in Handwritten Text Recognition (HTR) technology, which mainly relies on patterns in the written word, but could be adapted to work with the patterns inherent in sea level data.

  4. From regular text to artistic writing and artworks: Fourier statistics of images with low and high aesthetic appeal

    PubMed Central

    Melmer, Tamara; Amirshahi, Seyed A.; Koch, Michael; Denzler, Joachim; Redies, Christoph

    2013-01-01

    The spatial characteristics of letters and their influence on readability and letter identification have been intensely studied during the last decades. There have been few studies, however, on statistical image properties that reflect more global aspects of text, for example properties that may relate to its aesthetic appeal. It has been shown that natural scenes and a large variety of visual artworks possess a scale-invariant Fourier power spectrum that falls off linearly with increasing frequency in log-log plots. We asked whether images of text share this property. As expected, the Fourier spectrum of images of regular typed or handwritten text is highly anisotropic, i.e., the spectral image properties in vertical, horizontal, and oblique orientations differ. Moreover, the spatial frequency spectra of text images are not scale-invariant in any direction. The decline is shallower in the low-frequency part of the spectrum for text than for aesthetic artworks, whereas, in the high-frequency part, it is steeper. These results indicate that, in general, images of regular text contain less global structure (low spatial frequencies) relative to fine detail (high spatial frequencies) than images of aesthetic artworks. Moreover, we studied images of text with artistic claim (ornate print and calligraphy) and ornamental art. For some measures, these images assume average values intermediate between regular text and aesthetic artworks. Finally, to answer the question of whether the statistical properties measured by us are universal amongst humans or are subject to intercultural differences, we compared images from three different cultural backgrounds (Western, East Asian, and Arabic). Results for different categories (regular text, aesthetic writing, ornamental art, and fine art) were similar across cultures. PMID:23554592

  5. 77 FR 57089 - Meeting of the Chronic Fatigue Syndrome Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-17

    ..., 20201. Mailed testimony must be received no later than Monday, September 24, 2012. Note: PDF files, hand-written notes and photographs will not be accepted. Requests for public comment and written testimony will...

  6. 77 FR 31856 - Meeting of the Chronic Fatigue Syndrome Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-30

    ..., 12 point font. Note: PDF files, hand-written notes and photographs will not be accepted. Requests for public comment and written testimony will not be accepted through the CFSAC mailbox. Also, the CFSAC...

  7. [Patient safety: a comparison between handwritten and computerized voluntary incident reporting].

    PubMed

    Capucho, Helaine Carneiro; Arnas, Emilly Rasquini; Cassiani, Silvia Helena De Bortoli

    2013-03-01

    This study's objective was to compare two types of voluntary incident reporting methods that affect patient safety, handwritten (HR) and computerized (CR), in relation to the number of reports, type of incident reported, the individual submitting the report, and quality of reports. This was a descriptive, retrospective and cross-sectional study. CR were more frequent than HR (61.2% vs. 38.6%) among the 1,089 reports analyzed and were submitted every day of the month, while HR were submitted only on weekdays. The highest number of reports referred to medication, followed by problems related to medical-hospital material, and the professionals who most frequently submitted reports were nurses in both cases. Overall, CR presented higher quality than HR (86.1% vs. 61.7%); 36.8% of HR were illegible, a problem that was eliminated in CR. Therefore, the use of computerized incident reporting in hospitals favors qualified voluntary reports, increasing patient safety.

  8. Do What I Say! Voice Recognition Makes Major Advances.

    ERIC Educational Resources Information Center

    Ruley, C. Dorsey

    1994-01-01

    Explains voice recognition technology applications in the workplace, schools, and libraries. Highlights include a voice-controlled work station using the DragonDictate system that can be used with dyslexic students, converting text to speech, and converting speech to text. (LRW)

  9. Word Recognition in Auditory Cortex

    ERIC Educational Resources Information Center

    DeWitt, Iain D. J.

    2013-01-01

    Although spoken word recognition is more fundamental to human communication than text recognition, knowledge of word-processing in auditory cortex is comparatively impoverished. This dissertation synthesizes current models of auditory cortex, models of cortical pattern recognition, models of single-word reading, results in phonetics and results in…

  10. Investigating an Application of Speech-to-Text Recognition: A Study on Visual Attention and Learning Behaviour

    ERIC Educational Resources Information Center

    Huang, Y-M.; Liu, C-J.; Shadiev, Rustam; Shen, M-H.; Hwang, W-Y.

    2015-01-01

    One major drawback of previous research on speech-to-text recognition (STR) is that most findings showing the effectiveness of STR for learning were based upon subjective evidence. Very few studies have used eye-tracking techniques to investigate visual attention of students on STR-generated text. Furthermore, not much attention was paid to…

  11. Paediatric electronic infusion calculator: An intervention to eliminate infusion errors in paediatric critical care.

    PubMed

    Venkataraman, Aishwarya; Siu, Emily; Sadasivam, Kalaimaran

    2016-11-01

    Medication errors, including infusion prescription errors, are a major public health concern, especially in paediatric patients. There is some evidence that electronic or web-based calculators could minimise these errors. To evaluate the impact of an electronic infusion calculator on the frequency of infusion errors in the Paediatric Critical Care Unit of The Royal London Hospital, London, United Kingdom. We devised an electronic infusion calculator that calculates the appropriate concentration, rate and dose for the selected medication based on the recorded weight and age of the child and then prints into a valid prescription chart. The electronic infusion calculator was implemented in the Paediatric Critical Care Unit from April 2015. A prospective study, five months before and five months after implementation of the electronic infusion calculator, was conducted. Data on the following variables were collected onto a proforma: medication dose, infusion rate, volume, concentration, diluent, legibility, and missing or incorrect patient details. A total of 132 handwritten prescriptions were reviewed prior to electronic infusion calculator implementation and 119 electronic infusion calculator prescriptions were reviewed after electronic infusion calculator implementation. Handwritten prescriptions had a higher error rate (32.6%) as compared to electronic infusion calculator prescriptions (<1%) with a p < 0.001. Electronic infusion calculator prescriptions had no errors on dose, volume and rate calculation as compared to handwritten prescriptions, hence warranting very few pharmacy interventions. Use of the electronic infusion calculator for infusion prescription significantly reduced the total number of infusion prescribing errors in the Paediatric Critical Care Unit and has enabled more efficient use of medical and pharmacy time resources.

  12. Manual versus automated coding of free-text self-reported medication data in the 45 and Up Study: a validation study.

    PubMed

    Gnjidic, Danijela; Pearson, Sallie-Anne; Hilmer, Sarah N; Basilakis, Jim; Schaffer, Andrea L; Blyth, Fiona M; Banks, Emily

    2015-03-30

    Increasingly, automated methods are being used to code free-text medication data, but evidence on the validity of these methods is limited. To examine the accuracy of automated coding of previously keyed in free-text medication data compared with manual coding of original handwritten free-text responses (the 'gold standard'). A random sample of 500 participants (475 with and 25 without medication data in the free-text box) enrolled in the 45 and Up Study was selected. Manual coding involved medication experts keying in free-text responses and coding using Anatomical Therapeutic Chemical (ATC) codes (i.e. chemical substance 7-digit level; chemical subgroup 5-digit; pharmacological subgroup 4-digit; therapeutic subgroup 3-digit). Using keyed-in free-text responses entered by non-experts, the automated approach coded entries using the Australian Medicines Terminology database and assigned corresponding ATC codes. Based on manual coding, 1377 free-text entries were recorded and, of these, 1282 medications were coded to ATCs manually. The sensitivity of automated coding compared with manual coding was 79% (n = 1014) for entries coded at the exact ATC level, and 81.6% (n = 1046), 83.0% (n = 1064) and 83.8% (n = 1074) at the 5, 4 and 3-digit ATC levels, respectively. The sensitivity of automated coding for blank responses was 100% compared with manual coding. Sensitivity of automated coding was highest for prescription medications and lowest for vitamins and supplements, compared with the manual approach. Positive predictive values for automated coding were above 95% for 34 of the 38 individual prescription medications examined. Automated coding for free-text prescription medication data shows very high to excellent sensitivity and positive predictive values, indicating that automated methods can potentially be useful for large-scale, medication-related research.
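
    The sensitivity figures above can be checked with a few lines of arithmetic. The sketch below is not from the study; it assumes that the denominator is the 1282 medications coded manually (the reference standard), an assumption that reproduces the reported percentages. The names `MANUALLY_CODED`, `MATCHED` and `sensitivity` are illustrative.

```python
# Counts taken from the abstract. Using 1282 as the denominator is an
# assumption; it is the value that reproduces the reported percentages.
MANUALLY_CODED = 1282
MATCHED = {"exact ATC": 1014, "5-digit": 1046, "4-digit": 1064, "3-digit": 1074}


def sensitivity(true_positives: int, reference_positives: int) -> float:
    """Share of reference-standard positives that the automated method also found."""
    return true_positives / reference_positives


for level, n in MATCHED.items():
    print(f"{level}: {100 * sensitivity(n, MANUALLY_CODED):.1f}%")
```

    Under that assumption the exact-level value comes out at 79.1%, which the abstract reports rounded to 79%.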

  13. Robust keyword retrieval method for OCRed text

    NASA Astrophysics Data System (ADS)

    Fujii, Yusaku; Takebe, Hiroaki; Tanaka, Hiroshi; Hotta, Yoshinobu

    2011-01-01

    Document management systems have become important because of the growing popularity of electronic filing of documents and scanning of books, magazines, manuals, etc., through a scanner or a digital camera, for storage or reading on a PC or an electronic book. Text information acquired by optical character recognition (OCR) is usually added to the electronic documents for document retrieval. Since texts generated by OCR generally include character recognition errors, robust retrieval methods have been introduced to overcome this problem. In this paper, we propose a retrieval method that is robust against both character segmentation and recognition errors. In the proposed method, the insertion of noise characters and dropping of characters in the keyword retrieval enables robustness against character segmentation errors, and character substitution in the keyword of the recognition candidate for each character in OCR or any other character enables robustness against character recognition errors. The recall rate of the proposed method was 15% higher than that of the conventional method. However, the precision rate was 64% lower.
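
    The three error types named above (inserted noise characters, dropped characters, substituted characters) correspond to the three operations of edit distance. The sketch below is not the authors' method; it is a generic approximate substring search, shown only to illustrate how a keyword can still be found in noisy OCR text. The function names and the `max_errors` threshold are hypothetical.

```python
def best_match_distance(keyword: str, text: str) -> int:
    """Smallest edit distance between keyword and any substring of text."""
    # prev[j]: cost of matching the keyword prefix handled so far against
    # some substring of text that ends at position j. The empty prefix
    # matches anywhere at cost 0, which lets a match start at any position.
    prev = [0] * (len(text) + 1)
    for i, k in enumerate(keyword, start=1):
        curr = [i]  # keyword prefix against empty text: i characters missing
        for j, t in enumerate(text, start=1):
            curr.append(min(
                prev[j] + 1,              # keyword character dropped from the text
                curr[j - 1] + 1,          # noise character inserted in the text
                prev[j - 1] + (k != t),   # exact match or substitution
            ))
        prev = curr
    return min(prev)


def contains_keyword(keyword: str, text: str, max_errors: int = 1) -> bool:
    """True if keyword occurs in text with at most max_errors edits."""
    return best_match_distance(keyword, text) <= max_errors


print(contains_keyword("recognition", "character recogn1tion errors"))
print(contains_keyword("keyword", "key word retrieval"))
print(contains_keyword("scanner", "digital camera"))
```

    Raising `max_errors` finds more true occurrences but also admits more false hits, which is the same recall and precision trade-off the abstract reports.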

  14. SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemination

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanderson, Robert D.; Albritton, Benjamin; Schwemmer, Rafael

    2011-01-01

    In this paper we present a model based on the principles of Linked Data that can be used to describe the interrelationships of images, texts and other resources to facilitate the interoperability of repositories of medieval manuscripts or other culturally important handwritten documents. The model is designed from a set of requirements derived from the real world use cases of some of the largest digitized medieval content holders, and instantiations of the model are intended as the input to collection-independent page turning and scholarly presentation interfaces. A canvas painting paradigm, such as in PDF and SVG, was selected based on the lack of a one to one correlation between image and page, and to fulfill complex requirements such as when the full text of a page is known, but only fragments of the physical object remain. The model is implemented using technologies such as OAI-ORE Aggregations and OAC Annotations, as the fundamental building blocks of emerging Linked Digital Libraries. The model and implementation are evaluated through prototypes of both content providing and consuming applications. Although the system was designed from requirements drawn from the medieval manuscript domain, it is applicable to any layout-oriented presentation of images of text.

  15. Sub-word based Arabic handwriting analysis for writer identification

    NASA Astrophysics Data System (ADS)

    Maliki, Makki; Al-Jawad, Naseer; Jassim, Sabah

    2013-05-01

    Analysing a text or part of it is key to handwriting identification. Generally, handwriting is learnt over time and people develop habits in the style of writing. These habits are embedded in special parts of handwritten text. In Arabic each word consists of one or more sub-word(s). The end of each sub-word is considered to be a connect stroke. The main hypothesis in this paper is that sub-words are an essential reflection of an Arabic writer's habits that could be exploited for writer identification. Testing this hypothesis will be based on experiments that evaluate writer identification, mainly using K nearest neighbor on a group of sub-words extracted from longer text. The experimental results show that a group of sub-words can be used to identify the writer with a success rate between 52.94% and 82.35% when top1 is used, and it can go up to 100% when top5 is used based on K nearest neighbor. The results show that the majority of writers are identified using 7 sub-words with a reliability confidence of about 90% (i.e. 90% of the rejected templates have significantly larger distances to the tested example than the distance from the correctly identified template). However, previous work using a complete word shows a success rate of at most 90% in top 10.
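
    The difference between top1 and top5 identification can be illustrated with a small nearest-neighbour ranking. This is a minimal sketch, not the authors' pipeline: the two-dimensional feature vectors, the Euclidean distance, and the names `rank_writers` and `top_k_hit` are all assumptions made for the example.

```python
import math


def rank_writers(query, templates):
    """Rank writer IDs by the distance of their closest template to the query."""
    best = {}
    for writer, features in templates:
        d = math.dist(query, features)
        best[writer] = min(d, best.get(writer, math.inf))
    return sorted(best, key=best.get)


def top_k_hit(query, templates, true_writer, k):
    """True if the true writer is among the k nearest candidates."""
    return true_writer in rank_writers(query, templates)[:k]


# Hypothetical sub-word feature vectors for three writers.
templates = [("A", (0.0, 0.0)), ("A", (0.2, 0.1)), ("B", (1.0, 1.0)), ("C", (0.4, 0.1))]
query = (0.45, 0.1)  # a sample actually written by "A"

print(rank_writers(query, templates))
print(top_k_hit(query, templates, "A", k=1), top_k_hit(query, templates, "A", k=2))
```

    Here the true writer is missed at rank 1 but found within the top 2, which is why a top5 rate is never lower than the top1 rate.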

  16. From Hahnemann's hand to your computer screen: building a digital homeopathy collection

    PubMed Central

    Mix, Lisa A; Cameron, Kathleen

    2011-01-01

    The University of California, San Francisco (UCSF), Library holds the unique manuscript of the sixth edition of Samuel Hahnemann's Organon der Heilkunst, the primary text of homeopathy. The manuscript volume is Hahnemann's own copy of the fifth edition of the Organon with his notes for the sixth edition, handwritten throughout the volume. There is a high level of interest in the Organon manuscript, particularly among homeopaths. This led to the decision to present a digital surrogate on the web to make it accessible to a wider audience. Digitizing Hahnemann's manuscript and determining the best method of presentation on the web posed several challenges. Lessons learned in the course of this project will inform future digital projects. This article discusses the historical significance of the sixth edition of Hahnemann's Organon, its context in UCSF's homeopathy collections, and the specifics of developing the online homeopathy collection. PMID:21243055

  17. GRAMPS: An Automated Ambulatory Geriatric Record

    PubMed Central

    Hammond, Kenric W.; King, Carol A.; Date, Vishvanath V.; Prather, Robert J.; Loo, Lawrence; Siddiqui, Khwaja

    1988-01-01

    GRAMPS (Geriatric Record and Multidisciplinary Planning System) is an interactive MUMPS system developed for VA outpatient use. It allows physicians to effectively document care in problem-oriented format with structured narrative and free text, eliminating handwritten input. We evaluated the system in a one-year controlled cohort study. When the computer was used, appointment times averaged 8.2 minutes longer (32.6 vs. 24.4 minutes) compared to control visits with the same physicians. Computer use was associated with better quality of care as measured in the management of a common problem, hypertension, as well as decreased overall costs of care. When a faster computer was installed, data entry times improved, suggesting that slower processing had accounted for a substantial portion of the observed difference in appointment lengths. The GRAMPS system was well-accepted by providers. The modular design used in GRAMPS has been extended to medical-care applications in Nursing and Mental Health.

  18. Teach Your Computer to Read: Scanners and Optical Character Recognition.

    ERIC Educational Resources Information Center

    Marsden, Jim

    1993-01-01

    Desktop scanners can be used with a software technology called optical character recognition (OCR) to convert the text on virtually any paper document into an electronic form. OCR offers educators new flexibility in incorporating text into tests, lesson plans, and other materials. (MLF)

  19. A framework of text detection and recognition from natural images for mobile device

    NASA Astrophysics Data System (ADS)

    Selmi, Zied; Ben Halima, Mohamed; Wali, Ali; Alimi, Adel M.

    2017-03-01

    In light of the remarkable audio-visual effect on modern life, and the massive use of new technologies (smartphones, tablets ...), the image has been given great importance in the field of communication. It has become the most effective, attractive and suitable means of communication for transmitting information between different people. Of all the various parts of information that can be extracted from the image, our focus will be particularly on the text. Since its detection and recognition in a natural image is a major problem in many applications, the text has drawn the attention of a great number of researchers in recent years. In this paper, we present a framework for text detection and recognition from natural images for mobile devices.

  20. All Spin Artificial Neural Networks Based on Compound Spintronic Synapse and Neuron.

    PubMed

    Zhang, Deming; Zeng, Lang; Cao, Kaihua; Wang, Mengxing; Peng, Shouzhong; Zhang, Yue; Zhang, Youguang; Klein, Jacques-Olivier; Wang, Yu; Zhao, Weisheng

    2016-08-01

    Artificial synaptic devices implemented by emerging post-CMOS non-volatile memory technologies such as Resistive RAM (RRAM) have made great progress recently. However, it is still a big challenge to fabricate stable and controllable multilevel RRAM. Benefitting from the control of electron spin instead of electron charge, spintronic devices, e.g., magnetic tunnel junction (MTJ) as a binary device, have been explored for neuromorphic computing with low power dissipation. In this paper, a compound spintronic device consisting of multiple vertically stacked MTJs is proposed to jointly behave as a synaptic device, termed a compound spintronic synapse (CSS). Based on our theoretical and experimental work, it has been demonstrated that the proposed compound spintronic device can achieve designable and stable multiple resistance states by interfacial and materials engineering of its components. Additionally, a compound spintronic neuron (CSN) circuit based on the proposed compound spintronic device is presented, enabling a multi-step transfer function. Then, an All Spin Artificial Neural Network (ASANN) is constructed with the CSS and CSN circuit. By conducting system-level simulations on the MNIST database for handwritten digit recognition, the performance of such an ASANN has been investigated. Moreover, the impact of the resolution of both the CSS and CSN and of device variation on the system performance is discussed in this work.

  1. Fully parallel write/read in resistive synaptic array for accelerating on-chip learning

    NASA Astrophysics Data System (ADS)

    Gao, Ligang; Wang, I.-Ting; Chen, Pai-Yu; Vrudhula, Sarma; Seo, Jae-sun; Cao, Yu; Hou, Tuo-Hung; Yu, Shimeng

    2015-11-01

    A neuro-inspired computing paradigm beyond the von Neumann architecture is emerging and it generally takes advantage of massive parallelism and is aimed at complex tasks that involve intelligence and learning. The cross-point array architecture with synaptic devices has been proposed for on-chip implementation of the weighted sum and weight update in the learning algorithms. In this work, forming-free, silicon-process-compatible Ta/TaOx/TiO2/Ti synaptic devices are fabricated, in which >200 levels of conductance states could be continuously tuned by identical programming pulses. In order to demonstrate the advantages of parallelism of the cross-point array architecture, a novel fully parallel write scheme is designed and experimentally demonstrated in a small-scale crossbar array to accelerate the weight update in the training process, at a speed that is independent of the array size. Compared to the conventional row-by-row write scheme, it achieves >30× speed-up and >30× improvement in energy efficiency as projected in a large-scale array. If realistic synaptic device characteristics such as device variations are taken into an array-level simulation, the proposed array architecture is able to achieve ∼95% recognition accuracy of MNIST handwritten digits, which is close to the accuracy achieved by software using the ideal sparse coding algorithm.
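
    The weighted sum and the weight update mentioned above can be sketched in software. The toy model below is not the authors' write scheme: it assumes an outer-product update rule and simply counts write cycles (one per row versus one for the whole array) to show why a parallel write does not slow down as the array grows. All names and values are illustrative.

```python
def weighted_sum(G, v):
    """Column outputs of a crossbar: out[j] = sum_i v[i] * G[i][j]."""
    return [sum(v[i] * G[i][j] for i in range(len(G))) for j in range(len(G[0]))]


def row_by_row_write(G, x, delta, lr):
    """Apply G[i][j] += lr * x[i] * delta[j], counting one write cycle per row."""
    cycles = 0
    for i, row in enumerate(G):
        for j in range(len(row)):
            row[j] += lr * x[i] * delta[j]
        cycles += 1
    return cycles


def fully_parallel_write(G, x, delta, lr):
    """Same update, modeled as a single write cycle for the whole array."""
    for i, row in enumerate(G):
        for j in range(len(row)):
            row[j] += lr * x[i] * delta[j]
    return 1


G_a = [[0.0, 0.0], [0.0, 0.0]]
G_b = [[0.0, 0.0], [0.0, 0.0]]
x, delta = [1.0, 2.0], [2.0, 4.0]  # hypothetical input and error vectors
cycles_a = row_by_row_write(G_a, x, delta, lr=0.5)
cycles_b = fully_parallel_write(G_b, x, delta, lr=0.5)
print(cycles_a, cycles_b, G_a == G_b)
print(weighted_sum(G_b, [1.0, 1.0]))
```

    Both routines leave the array in the same state; only the modeled cycle count differs, growing with the number of rows in the first case and staying at one in the second.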

  2. Academic Recognition: Status and Challenges

    ERIC Educational Resources Information Center

    Bergan, Sjur

    2009-01-01

    The Council of Europe/UNESCO Recognition Convention (also known as the Lisbon Recognition Convention) provides the legal framework for academic recognition in Europe, and it serves a double purpose: as a legal text and as a guide to good practice. The ENIC and NARIC Networks promote the implementation of the Convention and seek to develop a better…

  3. Multi-frame knowledge based text enhancement for mobile phone captured videos

    NASA Astrophysics Data System (ADS)

    Ozarslan, Suleyman; Eren, P. Erhan

    2014-02-01

    In this study, we explore automated text recognition and enhancement using mobile phone captured videos of store receipts. We propose a method which includes Optical Character Recognition (OCR) enhanced by our proposed Row Based Multiple Frame Integration (RB-MFI), and Knowledge Based Correction (KBC) algorithms. In this method, first, the trained OCR engine is used for recognition; then, the RB-MFI algorithm is applied to the output of the OCR. The RB-MFI algorithm determines and combines the most accurate rows of the text outputs extracted by using OCR from multiple frames of the video. After RB-MFI, the KBC algorithm is applied to these rows to correct erroneous characters. Results of the experiments show that the proposed video-based approach, which includes the RB-MFI and the KBC algorithms, increases the word recognition rate to 95%, and the character recognition rate to 98%.
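
    The idea of combining rows across frames can be sketched as follows. The abstract does not say how RB-MFI decides which row is most accurate, so this example substitutes a simple majority vote per row and assumes the rows are already aligned across frames. The function name `integrate_rows` and the sample receipt lines are hypothetical.

```python
from collections import Counter


def integrate_rows(frames):
    """For each row position, keep the OCR reading that most frames agree on."""
    n_rows = max(len(frame) for frame in frames)
    merged = []
    for r in range(n_rows):
        # Collect the non-empty readings of row r from every frame that has it.
        candidates = [f[r] for f in frames if r < len(f) and f[r].strip()]
        if candidates:
            merged.append(Counter(candidates).most_common(1)[0][0])
    return merged


# Hypothetical OCR output for the same receipt in three video frames.
frames = [
    ["MILK 2.49", "BREAD 1.99", "T0TAL 4.48"],
    ["MILK 2.49", "BREAO 1.99", "TOTAL 4.48"],
    ["M1LK 2.49", "BREAD 1.99", "TOTAL 4.48"],
]
print(integrate_rows(frames))
```

    Each frame contains one misread row, yet the vote recovers a clean version of all three, which is the benefit of using several frames instead of a single image.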

  4. A new method for text detection and recognition in indoor scene for assisting blind people

    NASA Astrophysics Data System (ADS)

    Jabnoun, Hanen; Benzarti, Faouzi; Amiri, Hamid

    2017-03-01

    Developing assistive systems for handicapped persons has become a challenging task in research projects. Recently, a variety of tools have been designed to help visually impaired or blind people as visual substitution systems. The majority of these tools are based on the conversion of input information into auditory or tactile sensory information. Furthermore, object recognition and text retrieval are exploited in visual substitution systems. Text detection and recognition provide a description of the surrounding environment, so that the blind person can readily recognize the scene. In this work, we aim to introduce a method for detecting and recognizing text in indoor scenes. The process consists of detecting the regions of interest that should contain the text using connected components. Then, text detection is performed by employing image correlation. This component of an assistive system for blind persons should be simple, so that users are able to obtain the most informative feedback within the shortest time.

  5. 5 CFR 850.103 - Definitions.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... graphical image of a handwritten signature, usually created using a special computer input device, such as a... comparison with the characteristics and biometric data of a known or exemplar signature image. Director means... folder across the Government. Electronic retirement and insurance processing system means the new...

  6. 5 CFR 850.103 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... graphical image of a handwritten signature, usually created using a special computer input device, such as a... comparison with the characteristics and biometric data of a known or exemplar signature image. Director means... folder across the Government. Electronic retirement and insurance processing system means the new...

  7. 5 CFR 850.103 - Definitions.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... graphical image of a handwritten signature, usually created using a special computer input device, such as a... comparison with the characteristics and biometric data of a known or exemplar signature image. Director means... folder across the Government. Electronic retirement and insurance processing system means the new...

  8. 32 CFR 637.13 - Retention of property.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 32 National Defense 4 2011-07-01 2011-07-01 false Retention of property. 637.13 Section 637.13 National Defense Department of Defense (Continued) DEPARTMENT OF THE ARMY (CONTINUED) LAW ENFORCEMENT AND.... Reports of investigation, photographs, exhibits, handwritten notes, sketches, and other materials...

  9. 32 CFR 637.13 - Retention of property.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 32 National Defense 4 2010-07-01 2010-07-01 true Retention of property. 637.13 Section 637.13 National Defense Department of Defense (Continued) DEPARTMENT OF THE ARMY (CONTINUED) LAW ENFORCEMENT AND.... Reports of investigation, photographs, exhibits, handwritten notes, sketches, and other materials...

  10. Analysis of the IJCNN 2011 UTL Challenge

    DTIC Science & Technology

    2012-01-13

    large datasets from various application domains: handwriting recognition, image recognition, video processing, text processing, and ecology. The goal...http://clopinet.com/ul). We made available large datasets from various application domains: handwriting recognition, image recognition, video...evaluation sets consist of 4096 examples each. Dataset Domain Features Sparsity Devel. Transf. AVICENNA Handwriting 120 0% 150205 50000 HARRY Video 5000 98.1

  11. Document reconstruction by layout analysis of snippets

    NASA Astrophysics Data System (ADS)

    Kleber, Florian; Diem, Markus; Sablatnig, Robert

    2010-02-01

    Document analysis is done to analyze entire forms (e.g. intelligent form analysis, table detection) or to describe the layout/structure of a document. Skew detection of scanned documents is also performed to support OCR algorithms that are sensitive to skew. In this paper document analysis is applied to snippets of torn documents to calculate features for the reconstruction. Documents can either be destroyed with the intention of making the printed content unavailable (e.g. tax fraud investigation, business crime) or due to time-induced degeneration of ancient documents (e.g. bad storage conditions). Current reconstruction methods for manually torn documents deal with the shape, inpainting and texture synthesis techniques. This paper shows that document analysis of snippets can support the matching algorithm by providing additional features. This involves a rotational analysis, a color analysis and a line detection. As future work it is planned to extend the feature set with the paper type (blank, checked, lined), the type of the writing (handwritten vs. machine printed) and the text layout of a snippet (text size, line spacing). Preliminary results show that these pre-processing steps can be performed reliably on a real dataset consisting of 690 snippets.

  12. A text input system developed by using lips image recognition based LabVIEW for the seriously disabled.

    PubMed

    Chen, S C; Shao, C L; Liang, C K; Lin, S W; Huang, T H; Hsieh, M C; Yang, C H; Luo, C H; Wuo, C M

    2004-01-01

    In this paper, we present a text input system for the seriously disabled by using lips image recognition based on LabVIEW. This system can be divided into the software subsystem and the hardware subsystem. In the software subsystem, we adopted the technique of image processing to recognize the status of mouth-opened or mouth-closed depending on the relative distance between the upper lip and the lower lip. In the hardware subsystem, the parallel port built into the PC is used to transmit the recognized result of mouth status to the Morse-code text input system. Integrating the software subsystem with the hardware subsystem, we implement a text input system by using lips image recognition programmed in LabVIEW language. We hope the system can help the seriously disabled to communicate with normal people more easily.
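
    The chain from mouth status to text can be illustrated in a few lines. The original system is written in LabVIEW; the Python sketch below only mirrors the idea and makes several assumptions that are not in the abstract: a fixed lip-distance threshold, a short opening meaning a dot and a long one a dash, and a long closed interval ending a letter. All names and timing values are hypothetical.

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}


def mouth_is_open(upper_lip_y, lower_lip_y, threshold=12):
    """Classify mouth status from the vertical lip distance in pixels."""
    return (lower_lip_y - upper_lip_y) > threshold


def decode(events, dash_min=0.6, letter_gap_min=0.6):
    """Decode (mouth_open, seconds) runs into text via Morse code."""
    text, symbol = [], ""
    for is_open, seconds in events:
        if is_open:
            # A long opening is a dash, a short one is a dot.
            symbol += "-" if seconds >= dash_min else "."
        elif seconds >= letter_gap_min and symbol:
            # A long closed interval ends the current letter.
            text.append(MORSE.get(symbol, "?"))
            symbol = ""
    if symbol:
        text.append(MORSE.get(symbol, "?"))
    return "".join(text)


# Four short openings, a long pause, then two short openings.
events = [(True, 0.2), (False, 0.2), (True, 0.2), (False, 0.2), (True, 0.2),
          (False, 0.2), (True, 0.2), (False, 1.0), (True, 0.2), (False, 0.2),
          (True, 0.2)]
print(decode(events))
```

    In practice the thresholds would need to be tuned per user, since opening speed and camera distance vary.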

  13. Interruptions disrupt reading comprehension.

    PubMed

    Foroughi, Cyrus K; Werner, Nicole E; Barragán, Daniela; Boehm-Davis, Deborah A

    2015-06-01

    Previous research suggests that being interrupted while reading a text does not disrupt the later recognition or recall of information from that text. This research is used as support for Ericsson and Kintsch's (1995) long-term working memory (LT-WM) theory, which posits that disruptions while reading (e.g., interruptions) do not impair subsequent text comprehension. However, to fully comprehend a text, individuals may need to do more than recognize or recall information that has been presented in the text at a later time. Reading comprehension often requires individuals to connect and synthesize information across a text (e.g., successfully identifying complex topics such as themes and tones) and not just make a familiarity-based decision (i.e., recognition). The goal for this study was to determine whether interruptions while reading disrupt reading comprehension when the questions assessing comprehension require participants to connect and synthesize information across the passage. In Experiment 1, interruptions disrupted reading comprehension. In Experiment 2, interruptions disrupted reading comprehension but not recognition of information from the text. In Experiment 3, the addition of a 15-s time-out prior to the interruption successfully removed these negative effects. These data suggest that the time it takes to process the information needed to successfully comprehend text when reading is greater than that required for recognition. Any interference (e.g., an interruption) that occurs during the comprehension process may disrupt reading comprehension. This evidence supports the need for transient activation of information in working memory for successful text comprehension and does not support LT-WM theory. (c) 2015 APA, all rights reserved.

  14. 5 CFR 850.103 - Definitions.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ...) ELECTRONIC RETIREMENT PROCESSING General Provisions § 850.103 Definitions. In this part— Agency means an... graphical image of a handwritten signature usually created using a special computer input device (such as a... comparison with the characteristics and biometric data of a known or exemplar signature image. Director means...

  15. Atmospheric pressure MALDI for the noninvasive characterization of carbonaceous ink from Renaissance documents.

    PubMed

    Grasso, Giuseppe; Calcagno, Marzia; Rapisarda, Alessandro; D'Agata, Roberta; Spoto, Giuseppe

    2017-06-01

    The analytical methods applied to determine the compositions of inks from ancient manuscripts usually focus on inorganic components, as in the case of iron gall ink. In this work, we describe, for the first time, the use of atmospheric pressure/matrix-assisted laser desorption ionization-mass spectrometry (AP/MALDI-MS) as a spatially resolved analytical technique for the study of the organic carbonaceous components of inks used in handwritten parts of ancient books. Large polycyclic aromatic hydrocarbons (L-PAH) were identified in situ in the ink of XVII century handwritten documents. We prove that it is possible to apply MALDI-MS as a suitable microdestructive diagnostic tool for analyzing samples in air at atmospheric pressure, thus simplifying investigations of the organic components of artistic and archaeological objects. The interpretation of the experimental MS results was supported by independent Raman spectroscopic investigations. Graphical abstract: Atmospheric pressure/MALDI mass spectrometry detects in situ polycyclic aromatic hydrocarbons in the carbonaceous ink of XVII century manuscripts.

  16. The Fundamentals of Thermal Imaging Systems.

    DTIC Science & Technology

    1979-05-10

    detection, recognition, or identification, of real scene objects are discussed. It is hoped that the text will be useful to FLIR designers, evaluators...AND ANDERSON EXPERIMENT ... 205 Appendix F - BASIC SNR AND DETECTIVITY RELATIONS ... 209 Appendix... detection, recognition, or identification, of real scene objects are discussed. It is hoped that the material in the text will be useful to

  17. 21 CFR 11.70 - Signature/record linking.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 1 2010-04-01 2010-04-01 false Signature/record linking. 11.70 Section 11.70 Food... RECORDS; ELECTRONIC SIGNATURES Electronic Records § 11.70 Signature/record linking. Electronic signatures and handwritten signatures executed to electronic records shall be linked to their respective...

  18. 40 CFR 761.217 - Exception reporting.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... PROHIBITIONS PCB Waste Disposal Records and Reports § 761.217 Exception reporting. (a)(1) A generator of PCB waste, who does not receive a copy of the manifest with the handwritten signature of the owner or... 761.217 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) TOXIC SUBSTANCES CONTROL...

  19. 40 CFR 761.217 - Exception reporting.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... PROHIBITIONS PCB Waste Disposal Records and Reports § 761.217 Exception reporting. (a)(1) A generator of PCB waste, who does not receive a copy of the manifest with the handwritten signature of the owner or... 761.217 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) TOXIC SUBSTANCES CONTROL...

  20. 76 FR 54799 - Flowserve Corporation, Albuquerque, NM; Notice of Negative Determination on Reconsideration

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-09-02

    ..." and provided four support documents ("Separation Agreement and Release" related to Louis Reynolds... Reynolds. The "Separation Agreement and Release" document established that Louis Reynolds was separated... handwritten note that Louis Reynolds is one of the individuals. The "Signatures" document shows that Louis...

  1. Moreland Recognition Program.

    ERIC Educational Resources Information Center

    Moreland Elementary School District, San Jose, CA.

    THE FOLLOWING IS THE FULL TEXT OF THIS DOCUMENT: Recognition for special effort and achievement has been noted as a component of effective schools. Schools in the Moreland School District have effectively improved standards of discipline and achievement by providing forty-six different ways for children to receive positive recognition. Good…

  2. Boost OCR accuracy using iVector based system combination approach

    NASA Astrophysics Data System (ADS)

    Peng, Xujun; Cao, Huaigu; Natarajan, Prem

    2015-01-01

    Optical character recognition (OCR) is a challenging task because most existing preprocessing approaches are sensitive to writing style, writing material, noise and image resolution. Thus, a single recognition system cannot address all factors of real document images. In this paper, we describe an approach to combine diverse recognition systems by using iVector based features, a newly developed method in the field of speaker verification. Prior to system combination, document images are preprocessed and text line images are extracted with different approaches for each system, where an iVector is transformed from a high-dimensional supervector of each text line and is used to predict the accuracy of OCR. We merge hypotheses from multiple recognition systems according to the overlap ratio and the predicted OCR score of text line images. We present evaluation results on an Arabic document database where the proposed method is compared against the single best OCR system using the word error rate (WER) metric.
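
    As an illustration of the WER metric named in this abstract, the following minimal sketch computes word error rate as the word-level edit distance divided by the number of reference words. It is not the authors' evaluation code; the function name and the whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace; real OCR scoring
    tools may normalize text differently before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

    A lower value is better; a hypothesis identical to the reference scores 0.0, and the value can exceed 1.0 when the hypothesis contains many inserted words.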

  3. Text recognition and correction for automated data collection by mobile devices

    NASA Astrophysics Data System (ADS)

    Ozarslan, Suleyman; Eren, P. Erhan

    2014-03-01

    Participatory sensing is an approach which allows mobile devices such as mobile phones to be used for data collection, analysis and sharing processes by individuals. Data collection is the first and most important part of a participatory sensing system, but it is time consuming for the participants. In this paper, we discuss automatic data collection approaches for reducing the time required for collection, and increasing the amount of collected data. In this context, we explore automated text recognition on images of store receipts which are captured by mobile phone cameras, and the correction of the recognized text. Accordingly, our first goal is to evaluate the performance of the Optical Character Recognition (OCR) method with respect to data collection from store receipt images. Images captured by mobile phones exhibit some typical problems, and common image processing methods cannot handle some of them. Consequently, the second goal is to address these types of problems through our proposed Knowledge Based Correction (KBC) method used in support of the OCR, and also to evaluate the KBC method with respect to the improvement on the accurate recognition rate. Results of the experiments show that the KBC method improves the accurate data recognition rate noticeably.
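
    The abstract does not describe how the Knowledge Based Correction (KBC) method works internally. The sketch below is therefore only a stand-in that shows the general idea of correcting OCR output against known terms, using fuzzy string matching from the Python standard library. The vocabulary, the cutoff value, and the function names are hypothetical.

```python
import difflib

# Hypothetical vocabulary of terms expected on a store receipt.
KNOWN_TERMS = ["TOTAL", "SUBTOTAL", "CASH", "CHANGE", "MILK", "BREAD"]


def correct_token(token, vocabulary=KNOWN_TERMS, cutoff=0.6):
    """Return the closest known term, or the token unchanged if none is close.

    Assumption: comparison is case-insensitive and the vocabulary is
    stored in upper case.
    """
    matches = difflib.get_close_matches(token.upper(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_line(line, vocabulary=KNOWN_TERMS, cutoff=0.6):
    """Apply token-level correction to a whitespace-separated OCR line."""
    return " ".join(correct_token(tok, vocabulary, cutoff) for tok in line.split())
```

    With this vocabulary, a misread token such as "T0TAL" maps to "TOTAL", while tokens with no close match, such as prices, pass through unchanged.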

  4. Advanced Restricted Area Entry Control System (Araecs)

    DTIC Science & Technology

    2014-06-01

    113 f. Vascular Recognition ... 115 g. Handwriting Recognition...independent (unconstrained mode). In a system using “text dependent” speech the individual will speak either a fixed password or prompted to say a...specific phrase (e.g. “Please say the following numbers 33, 45, 88”) (National Science and Technology Council 2006). A text independent system is more

  5. Review of Speech-to-Text Recognition Technology for Enhancing Learning

    ERIC Educational Resources Information Center

    Shadiev, Rustam; Hwang, Wu-Yuin; Chen, Nian-Shing; Huang, Yueh-Min

    2014-01-01

    This paper reviewed literature from 1999 to 2014 inclusively on how Speech-to-Text Recognition (STR) technology has been applied to enhance learning. The first aim of this review is to understand how STR technology has been used to support learning over the past fifteen years, and the second is to analyze all research evidence to understand how…

  6. Applications of Speech-to-Text Recognition and Computer-Aided Translation for Facilitating Cross-Cultural Learning through a Learning Activity: Issues and Their Solutions

    ERIC Educational Resources Information Center

    Shadiev, Rustam; Wu, Ting-Ting; Sun, Ai; Huang, Yueh-Min

    2018-01-01

    In this study, 21 university students, who represented thirteen nationalities, participated in an online cross-cultural learning activity. The participants were engaged in interactions and exchanges carried out on Facebook® and Skype® platforms, and their multilingual communications were supported by speech-to-text recognition (STR) and…

  7. Investigating the Effectiveness of Speech-To-Text Recognition Applications on Learning Performance, Attention, and Meditation

    ERIC Educational Resources Information Center

    Shadiev, Rustam; Huang, Yueh-Min; Hwang, Jan-Pan

    2017-01-01

    In this study, the effectiveness of the application of speech-to-text recognition (STR) technology on enhancing learning and concentration in a calm state of mind, hereafter referred to as meditation (an intentional and self-regulated focusing of attention in order to relax and calm the mind), was investigated. This effectiveness was further…

  8. Does Mechanism Matter? Student Recall of Electronic versus Handwritten Feedback

    ERIC Educational Resources Information Center

    Osterbur, Megan E.; Hammer, Elizabeth Yost; Hammer, Elliott

    2015-01-01

    Student consumption and recall of feedback are necessary preconditions of successful formative assessment. Drawing on Sadler's (1998) definition of formative assessment as that which is intended to accelerate learning and improve performance through the providing of feedback, we examine how the mechanism of transmission may impact student…

  9. 38 CFR 1.554 - Requirements for making requests.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... must contain an image of the requester's handwritten signature. To make a request for VA records, write... by another confidentiality statute, the e-mail transmission must contain an image of the requester's... assure prompt processing, e-mail FOIA requests must be sent to official VA FOIA mailboxes established for...

  10. 38 CFR 1.554 - Requirements for making requests.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... must contain an image of the requester's handwritten signature. To make a request for VA records, write... by another confidentiality statute, the e-mail transmission must contain an image of the requester's... assure prompt processing, e-mail FOIA requests must be sent to official VA FOIA mailboxes established for...

  11. 38 CFR 1.554 - Requirements for making requests.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... must contain an image of the requester's handwritten signature. To make a request for VA records, write... by another confidentiality statute, the e-mail transmission must contain an image of the requester's... assure prompt processing, e-mail FOIA requests must be sent to official VA FOIA mailboxes established for...

  12. Abstract Graphemic Representations Support Preparation of Handwritten Responses

    ERIC Educational Resources Information Center

    Shen, Xingjia Rachel; Damian, Marcus F.; Stadthagen-Gonzalez, Hans

    2013-01-01

    Some evidence suggests that the written production of single words involves not only the ordered retrieval of individual letters, but that abstract, higher-level linguistic properties of the words also influence responses. We report five experiments using the "implicit priming" task adopted from the spoken domain to investigate response…

  13. Researching Australian Children's Literature

    ERIC Educational Resources Information Center

    Saxby, Maurice

    2004-01-01

    When in 1962 the author began to research the history of Australian children's literature, access to the primary sources was limited and difficult. From a catalogue drawer in the Mitchell Library of hand-written cards marked "Children's books" he could call up from the stacks, in alphabetical order, piles of early publications. His notes…

  14. The effects of blogs versus dialogue journals on open-response writing scores and attitudes of grade eight science students

    NASA Astrophysics Data System (ADS)

    Erickson, Diane K.

    Today's students have grown up surrounded by technology. They use cell phones, word processors, and the Internet with ease, talking with peers in their community and around the world through e-mails, chatrooms, instant messaging, online discussions, and weblogs ("blogs"). In the midst of this technological explosion, adolescents face a growing need for strong literacy skills in all subject areas for achievement in school and on mandated state and national high stakes tests. The purpose of this study was to examine the use of blogs as a tool for improving open-response writing in the secondary science classroom in comparison to the use of handwritten dialogue journals. The study used a mixed-method approach, gathering both quantitative and qualitative data from 94 students in four eighth-grade science classes. Two classes participated in online class blogs where they posted ideas about science and responded to the ideas of other classmates. Two classes participated in handwritten dialogue journals, writing ideas about science and exchanging journals to respond to the ideas of classmates. The study explored these research questions: Does the use of blogs, as compared to the use of handwritten dialogue journals, improve the open-response writing scores of eighth grade science students? How do students describe their experience using blogs to study science as compared to students using handwritten dialogue journals? and How do motivation, self-efficacy, and community manifest themselves in students who use blogs as compared to students who use handwritten dialogue journals? The quantitative aspect of the study used data from pre- and post-tests and from a Likert-scale post-survey. The pre- and post-writing on open-response science questions were scored using the Massachusetts Comprehensive Assessment System (MCAS) open-response scoring rubric. 
    The study found no statistically significant difference in the writing scores between the blog group and the dialogue journal group. The study found a significant difference between the post-survey scores of the two groups, with the blogging group registering a more positive attitude about the experience than the dialogue journal group. The qualitative aspect of the study used group and individual interviews with 26 randomly-chosen students to explore the nature of the students' experiences using blogs and dialogue journals. Overall, the blog group communicated more positive responses to the experience than did students from the dialogue journal group, often indicating that blogging was "fun" and "helpful" and made them look forward to science class. This study addressed research needs in the fields of writing, technology, and content literacy. It is significant because there is little research on the use of blogs in the middle school content classroom, particularly on the use of blogs as a tool for improving open-response writing. It adds information about the experience of students who use blogs in the science classroom and examines blogging as a way to explore ideas, build understanding, and connect with others. This is significant to know as school districts look to include more technology instruction and practices in the curriculum. Blogs could give students a critical tool for writing and thinking in the content classroom, helping to prepare students for an increasingly technological and global society.

  15. Building Searchable Collections of Enterprise Speech Data.

    ERIC Educational Resources Information Center

    Cooper, James W.; Viswanathan, Mahesh; Byron, Donna; Chan, Margaret

    The study has applied speech recognition and text-mining technologies to a set of recorded outbound marketing calls and analyzed the results. Since speaker-independent speech recognition technology results in a significantly lower recognition rate than that found when the recognizer is trained for a particular speaker, a number of post-processing…

  16. Voice Recognition Software Accuracy with Second Language Speakers of English.

    ERIC Educational Resources Information Center

    Coniam, D.

    1999-01-01

    Explores the potential of the use of voice-recognition technology with second-language speakers of English. Involves the analysis of the output produced by a small group of very competent second-language subjects reading a text into the voice recognition software Dragon Systems "Dragon NaturallySpeaking." (Author/VWL)

  17. Rapid extraction of gist from visual text and its influence on word recognition.

    PubMed

    Asano, Michiko; Yokosawa, Kazuhiko

    2011-01-01

    Two experiments explored rapid extraction of gist from a visual text and its influence on word recognition. In both, a short text (sentence) containing a target word was presented for 200 ms and was followed by a target recognition task. Results showed that participants recognized contextually anomalous word targets less frequently than contextually consistent counterparts (Experiment 1). This context effect was obtained when sentences contained the same semantic content but with disrupted syntactic structure (Experiment 2). Results demonstrate that words in a briefly presented visual sentence are processed in parallel and that rapid extraction of sentence gist relies on a primitive representation of sentence context (termed protocontext) that is semantically activated by the simultaneous presentation of multiple words (i.e., a sentence) before syntactic processing.

  18. The 2016 NIST Speaker Recognition Evaluation

    DTIC Science & Technology

    2017-08-20

    The 2016 NIST Speaker Recognition Evaluation Seyed Omid Sadjadi, Timothée Kheyrkhah, Audrey Tong, Craig Greenberg, Douglas Reynolds, Elliot...recent in an ongoing series of speaker recognition evaluations (SRE) to foster research in robust text-independent speaker recognition, as well as...online evaluation platform, a fixed training data condition, more variability in test segment duration (uniformly distributed between 10s and 60s

  19. The impact of presentation style on the retention of online health information: a randomized-controlled experiment.

    PubMed

    Frisch, Anne-Linda; Camerini, Luca; Schulz, Peter J

    2013-01-01

    The Internet plays an increasingly important role in health education, providing laypeople with information about health-related topics that range from disease-specific contexts to general health promotion. Compared to traditional health education, the Internet allows the use of multimedia applications that offer promise to enhance individuals' health knowledge and literacy. This study aims at testing the effect of multimedia presentation of health information on learning. Relying on an experimental design, it investigates how retention of information differs for text-only presentation, image-only presentation, and multimedia (text and image) presentation of online health information. Two hundred and forty students were randomly assigned to four groups each exposed to a different website version. Three groups were exposed to the same information using text only, image only, or text and image presentation. A fourth group received unrelated information (control group). Retention was assessed by the means of a recognition test. To examine a possible interaction between website version and recognition test, half of the students received a recognition test in text form and half of them received a recognition test in imagery form. In line with assumptions from Dual Coding Theory, students exposed to the multimedia (text and image) presentation recognized significantly more information than students exposed to the text-only presentation. This did not hold for students exposed to the image-only presentation. The impact of presentation style on retention scores was moderated by the way retention was assessed for image-only presentation, but not for text-only or multimedia presentation. Possible explanations and implications for the design of online health education interventions are discussed.

  20. Cell line name recognition in support of the identification of synthetic lethality in cancer from text

    PubMed Central

    Kaewphan, Suwisa; Van Landeghem, Sofie; Ohta, Tomoko; Van de Peer, Yves; Ginter, Filip; Pyysalo, Sampo

    2016-01-01

    Motivation: The recognition and normalization of cell line names in text is an important task in biomedical text mining research, facilitating for instance the identification of synthetically lethal genes from the literature. While several tools have previously been developed to address cell line recognition, it is unclear whether available systems can perform sufficiently well in realistic and broad-coverage applications such as extracting synthetically lethal genes from the cancer literature. In this study, we revisit the cell line name recognition task, evaluating both available systems and newly introduced methods on various resources to obtain a reliable tagger not tied to any specific subdomain. In support of this task, we introduce two text collections manually annotated for cell line names: the broad-coverage corpus Gellus and CLL, a focused target domain corpus. Results: We find that the best performance is achieved using NERsuite, a machine learning system based on Conditional Random Fields, trained on the Gellus corpus and supported with a dictionary of cell line names. The system achieves an F-score of 88.46% on the test set of Gellus and 85.98% on the independently annotated CLL corpus. It was further applied at large scale to 24 302 102 unannotated articles, resulting in the identification of 5 181 342 cell line mentions, normalized to 11 755 unique cell line database identifiers. Availability and implementation: The manually annotated datasets, the cell line dictionary, derived corpora, NERsuite models and the results of the large-scale run on unannotated texts are available under open licenses at http://turkunlp.github.io/Cell-line-recognition/. Contact: sukaew@utu.fi PMID:26428294

  1. Recognition of chemical entities: combining dictionary-based and grammar-based approaches.

    PubMed

    Akhondi, Saber A; Hettne, Kristina M; van der Horst, Eelke; van Mulligen, Erik M; Kors, Jan A

    2015-01-01

    The past decade has seen an upsurge in the number of publications in chemistry. The ever-swelling volume of available documents makes it increasingly hard to extract relevant new information from such unstructured texts. The BioCreative CHEMDNER challenge invites the development of systems for the automatic recognition of chemicals in text (CEM task) and for ranking the recognized compounds at the document level (CDI task). We investigated an ensemble approach where dictionary-based named entity recognition is used along with grammar-based recognizers to extract compounds from text. We assessed the performance of ten different commercial and publicly available lexical resources using an open source indexing system (Peregrine), in combination with three different chemical compound recognizers and a set of regular expressions to recognize chemical database identifiers. The effect of different stop-word lists, case-sensitivity matching, and use of chunking information was also investigated. We focused on lexical resources that provide chemical structure information. To rank the different compounds found in a text, we used a term confidence score based on the normalized ratio of the term frequencies in chemical and non-chemical journals. The use of stop-word lists greatly improved the performance of the dictionary-based recognition, but there was no additional benefit from using chunking information. A combination of ChEBI and HMDB as lexical resources, the LeadMine tool for grammar-based recognition, and the regular expressions, outperformed any of the individual systems. On the test set, the F-scores were 77.8% (recall 71.2%, precision 85.8%) for the CEM task and 77.6% (recall 71.7%, precision 84.6%) for the CDI task. Missed terms were mainly due to tokenization issues, poor recognition of formulas, and term conjunctions. 
    We developed an ensemble system that combines dictionary-based and grammar-based approaches for chemical named entity recognition, outperforming any of the individual systems that we considered. The system is able to provide structure information for most of the compounds that are found. Improved tokenization and better recognition of specific entity types is likely to further improve system performance.
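
    The F-scores reported in this abstract can be checked from the stated precision and recall values, since the balanced F-score is the harmonic mean of the two. The sketch below is a small worked check, not code from the paper; the function name and the optional beta parameter are assumptions.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Values taken from the abstract (as fractions rather than percentages).
cem = f_score(precision=0.858, recall=0.712)  # CEM task
cdi = f_score(precision=0.846, recall=0.717)  # CDI task
print(f"CEM F-score: {cem * 100:.1f}%")
print(f"CDI F-score: {cdi * 100:.1f}%")
```

    Both values round to the figures given in the abstract, 77.8% for the CEM task and 77.6% for the CDI task.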

  2. Recognition of chemical entities: combining dictionary-based and grammar-based approaches

    PubMed Central

    2015-01-01

    Background The past decade has seen an upsurge in the number of publications in chemistry. The ever-swelling volume of available documents makes it increasingly hard to extract relevant new information from such unstructured texts. The BioCreative CHEMDNER challenge invites the development of systems for the automatic recognition of chemicals in text (CEM task) and for ranking the recognized compounds at the document level (CDI task). We investigated an ensemble approach where dictionary-based named entity recognition is used along with grammar-based recognizers to extract compounds from text. We assessed the performance of ten different commercial and publicly available lexical resources using an open source indexing system (Peregrine), in combination with three different chemical compound recognizers and a set of regular expressions to recognize chemical database identifiers. The effect of different stop-word lists, case-sensitivity matching, and use of chunking information was also investigated. We focused on lexical resources that provide chemical structure information. To rank the different compounds found in a text, we used a term confidence score based on the normalized ratio of the term frequencies in chemical and non-chemical journals. Results The use of stop-word lists greatly improved the performance of the dictionary-based recognition, but there was no additional benefit from using chunking information. A combination of ChEBI and HMDB as lexical resources, the LeadMine tool for grammar-based recognition, and the regular expressions, outperformed any of the individual systems. On the test set, the F-scores were 77.8% (recall 71.2%, precision 85.8%) for the CEM task and 77.6% (recall 71.7%, precision 84.6%) for the CDI task. Missed terms were mainly due to tokenization issues, poor recognition of formulas, and term conjunctions. 
    Conclusions We developed an ensemble system that combines dictionary-based and grammar-based approaches for chemical named entity recognition, outperforming any of the individual systems that we considered. The system is able to provide structure information for most of the compounds that are found. Improved tokenization and better recognition of specific entity types is likely to further improve system performance. PMID:25810767

  3. The Fisher-Markov selector: fast selecting maximally separable feature subset for multiclass classification with applications to high-dimensional data.

    PubMed

    Cheng, Qiang; Zhou, Hongbo; Cheng, Jie

    2011-06-01

    Selecting features for multiclass classification is a critically important task for pattern recognition and machine learning applications. Especially challenging is selecting an optimal subset of features from high-dimensional data, which typically have many more variables than observations and contain significant noise, missing components, or outliers. Existing methods either cannot handle high-dimensional data efficiently or scalably, or can only obtain a local optimum instead of the global optimum. Toward the efficient selection of the globally optimal subset of features, we introduce a new selector, which we call the Fisher-Markov selector, to identify those features that are the most useful in describing essential differences among the possible groups. In particular, in this paper we present a way to represent essential discriminating characteristics together with the sparsity as an optimization objective. With properly identified measures for the sparseness and discriminativeness in possibly high-dimensional settings, we take a systematic approach for optimizing the measures to choose the best feature subset. We use Markov random field optimization techniques to solve the formulated objective functions for simultaneous feature selection. Our results are noncombinatorial, and they can achieve the exact global optimum of the objective function for some special kernels. The method is fast; in particular, it can be linear in the number of features and quadratic in the number of observations. We apply our procedure to a variety of real-world data, including a mid-dimensional optical handwritten digit data set and high-dimensional microarray gene expression data sets. The effectiveness of our method is confirmed by experimental results.
    In pattern recognition and from a model selection viewpoint, our procedure says that it is possible to select the most discriminating subset of variables by solving a very simple unconstrained objective function which in fact can be obtained with an explicit expression.

  4. Recognition of a person named entity from the text written in a natural language

    NASA Astrophysics Data System (ADS)

    Dolbin, A. V.; Rozaliev, V. L.; Orlova, Y. A.

    2017-01-01

    This work is devoted to the semantic analysis of texts written in a natural language. The main goal of the research was to compare latent Dirichlet allocation and latent semantic analysis to identify elements of the human appearance in the text. The completeness of information retrieval was chosen as the efficiency criterion for comparing the methods. However, a single method was insufficient to achieve high recognition rates. Thus, additional methods were used for finding references to the personality in the text. All these methods are based on the created information model, which represents a person’s appearance.

  5. A deep belief network with PLSR for nonlinear system modeling.

    PubMed

    Qiao, Junfei; Wang, Gongming; Li, Wenjing; Li, Xiaoli

    2018-08-01

    Nonlinear system modeling plays an important role in practical engineering, and the deep belief network (DBN), a deep learning model, is now popular in nonlinear system modeling and identification because of its strong learning ability. However, the existing weight optimization for DBNs is gradient-based, which always leads to a local optimum and a poor training result. In this paper, a DBN with partial least squares regression (PLSR-DBN) is proposed for nonlinear system modeling, which focuses on the problem of weight optimization for DBNs using PLSR. First, the unsupervised contrastive divergence (CD) algorithm is used for weight initialization. Second, the initial weights derived from the CD algorithm are optimized through layer-by-layer PLSR modeling from the top layer to the bottom layer. Instead of a gradient method, PLSR-DBN determines the optimal weights using several PLSR models, so that better performance is achieved. Then, a theoretical convergence analysis is given to guarantee the effectiveness of the proposed PLSR-DBN model. Finally, the proposed PLSR-DBN is tested on two benchmark nonlinear systems and an actual wastewater treatment system, as well as on handwritten digit recognition (nonlinear mapping and modeling) with high-dimensional input data. The experimental results show that the proposed PLSR-DBN performs better in time and accuracy on nonlinear system modeling than other methods. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. Development of a Digitalized Child's Checkups Information System.

    PubMed

    Ito, Yoshiya; Takimoto, Hidemi

    2017-01-01

    In Japan, health checkups for children take place from infancy through high school and play an important role in the maintenance and control of childhood/adolescent health. The anthropometric data obtained during these checkups are kept in health centers and schools and are also recorded in a mother's maternal and child health handbook, as well as on school health cards. These data are meaningful if they are utilized well and in an appropriate manner. They are particularly useful for the prevention of obesity-related conditions in adulthood, such as metabolic syndrome and diabetes mellitus. For this purpose, we have tried to establish a scanning system with an optical character recognition (OCR) function, which links data obtained during health checkups in infancy with that obtained in schools. In this system, handwritten characters on the records are scanned and processed using OCR. However, because many of the scanned characters are not read properly, we must wait for the improvement in the performance of the OCR function. In addition, we have developed Microsoft Excel spreadsheets, on which obesity-related indices, such as body mass index and relative body weight, are calculated. These sheets also provide functions that tabulate the frequencies of obesity in specific groups. Actively using these data and digitalized systems will not only contribute towards resolving physical health problems in children, but also decrease the risk of developing lifestyle-related diseases in adulthood.
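
    The authors computed these indices in Microsoft Excel spreadsheets. As a rough illustration of the same calculations, the sketch below computes body mass index and tabulates the share of children above a cutoff in each group. The definition of relative body weight used here (measured weight as a percentage of a reference weight), the cutoff of 120, and all names are assumptions, not details given in the abstract.

```python
from collections import defaultdict


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def relative_body_weight(weight_kg: float, reference_weight_kg: float) -> float:
    """Assumed definition: measured weight as a percentage of a reference weight."""
    return 100.0 * weight_kg / reference_weight_kg


def obesity_frequency(records, threshold=120.0):
    """Fraction of children per group at or above a hypothetical cutoff.

    records: iterable of (group, weight_kg, reference_weight_kg) tuples.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for group, weight_kg, reference_kg in records:
        totals[group] += 1
        if relative_body_weight(weight_kg, reference_kg) >= threshold:
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}
```

    The reference weight for a given age, sex, and height would have to come from a published growth standard, which is not reproduced here.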

  7. Learning optimal features for visual pattern recognition

    NASA Astrophysics Data System (ADS)

    Labusch, Kai; Siewert, Udo; Martinetz, Thomas; Barth, Erhardt

    2007-02-01

    The optimal coding hypothesis proposes that the human visual system has adapted to the statistical properties of the environment by the use of relatively simple optimality criteria. We here (i) discuss how the properties of different models of image coding, i.e. sparseness, decorrelation, and statistical independence are related to each other (ii) propose to evaluate the different models by verifiable performance measures (iii) analyse the classification performance on images of handwritten digits (MNIST database). We first employ the SPARSENET algorithm (Olshausen, 1998) to derive a local filter basis (on 13 × 13 pixel windows). We then filter the images in the database (28 × 28 pixel images of digits) and reduce the dimensionality of the resulting feature space by selecting the locally maximal filter responses. We then train a support vector machine on a training set to classify the digits and report results obtained on a separate test set. Currently, the best state-of-the-art result on the MNIST database has an error rate of 0.4%. This result, however, has been obtained by using explicit knowledge that is specific to the data (elastic distortion model for digits). We here obtain an error rate of 0.55% which is second best but does not use explicit data-specific knowledge. In particular it outperforms by far all methods that do not use data-specific knowledge.

  8. Dual function seal: visualized digital signature for electronic medical record systems.

    PubMed

    Yu, Yao-Chang; Hou, Ting-Wei; Chiang, Tzu-Chiang

    2012-10-01

    Digital signature is an important cryptographic technology used to provide integrity and non-repudiation in electronic medical record systems (EMRS), and it is required by law. However, digital signatures normally appear in forms unrecognizable to medical staff, which may reduce the trust of medical staff who are used to handwritten signatures or seals. Therefore, in this paper we propose a dual function seal to extend user trust from a traditional seal to a digital signature. The proposed dual function seal is a prototype that combines the traditional seal and digital seal. With this prototype, medical personnel can not only put a seal on paper but also generate a visualized digital signature for electronic medical records. Medical personnel can then look at the visualized digital signature and directly know which medical personnel generated it, just like with a traditional seal. Discrete wavelet transform (DWT) is used as an image processing method to generate a visualized digital signature, and the peak signal to noise ratio (PSNR) is calculated to verify that distortions of all converted images are beyond human recognition; the PSNR values of our converted images range from 70 dB to 80 dB. The signature recoverability is also tested in this paper to ensure that the visualized digital signature is verifiable. A simulated EMRS is implemented to show how the visualized digital signature can be integrated into EMRS.
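
    As an illustration of the PSNR check mentioned in this abstract, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It is not the authors' implementation: the function name `psnr`, the flat pixel-list representation, and the 8-bit peak value of 255 are assumptions made for the example.

```python
import math


def psnr(original, converted, max_value=255.0):
    """Return the peak signal-to-noise ratio in dB for two equal-length pixel sequences."""
    if len(original) != len(converted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, converted)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit image of 4096 pixels in which a single pixel changes by one grey level.
seal = [128] * 4096
signed = [129] + [128] * 4095
print(f"{psnr(seal, signed):.1f} dB")
```

    Higher values mean less distortion, so an embedding step that alters only a few pixels slightly yields a large PSNR.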

  9. A comparison of algorithms for inference and learning in probabilistic graphical models.

    PubMed

    Frey, Brendan J; Jojic, Nebojsa

    2005-09-01

    Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, etc. Two of the main challenges are finding effective representations and models in specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.

  10. Impact of Linearity and Write Noise of Analog Resistive Memory Devices in a Neural Algorithm Accelerator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jacobs-Gedrim, Robin B.; Agarwal, Sapan; Knisely, Kathrine E.

    Resistive memory (ReRAM) shows promise for use as an analog synapse element in energy-efficient neural network algorithm accelerators. A particularly important application is the training of neural networks, as this is the most computationally-intensive procedure in using a neural algorithm. However, training a network with analog ReRAM synapses can significantly reduce the accuracy at the algorithm level. In order to assess this degradation, analog properties of ReRAM devices were measured and hand-written digit recognition accuracy was modeled for the training using backpropagation. Bipolar filamentary devices utilizing three material systems were measured and compared: one oxygen vacancy system, Ta-TaOx, and two conducting metallization systems, Cu-SiO2 and Ag/chalcogenide. Analog properties and conductance ranges of the devices are optimized by measuring the response to varying voltage pulse characteristics. Key analog device properties which degrade the accuracy are update linearity and write noise. Write noise may improve as a function of device manufacturing maturity, but write nonlinearity appears relatively consistent among the different device material systems and is found to be the most significant factor affecting accuracy. As a result, this suggests that new materials and/or fundamentally different resistive switching mechanisms may be required to improve device linearity and achieve higher algorithm training accuracy.

  11. Impact of Linearity and Write Noise of Analog Resistive Memory Devices in a Neural Algorithm Accelerator

    DOE PAGES

    Jacobs-Gedrim, Robin B.; Agarwal, Sapan; Knisely, Kathrine E.; ...

    2017-12-01

    Resistive memory (ReRAM) shows promise for use as an analog synapse element in energy-efficient neural network algorithm accelerators. A particularly important application is the training of neural networks, as this is the most computationally-intensive procedure in using a neural algorithm. However, training a network with analog ReRAM synapses can significantly reduce the accuracy at the algorithm level. In order to assess this degradation, analog properties of ReRAM devices were measured and hand-written digit recognition accuracy was modeled for the training using backpropagation. Bipolar filamentary devices utilizing three material systems were measured and compared: one oxygen vacancy system, Ta-TaOx, and two conducting metallization systems, Cu-SiO2 and Ag/chalcogenide. Analog properties and conductance ranges of the devices are optimized by measuring the response to varying voltage pulse characteristics. Key analog device properties which degrade the accuracy are update linearity and write noise. Write noise may improve as a function of device manufacturing maturity, but write nonlinearity appears relatively consistent among the different device material systems and is found to be the most significant factor affecting accuracy. As a result, this suggests that new materials and/or fundamentally different resistive switching mechanisms may be required to improve device linearity and achieve higher algorithm training accuracy.

  12. Information Transfer Problems of the Partially Sighted: Recent Results and Project Summary.

    ERIC Educational Resources Information Center

    Genensky, S. M.; And Others

    The fourth in a series of Rand reports on information transfer problems of the partially sighted reviews earlier reports and describes an experimental secretarial closed circuit TV (CCTV) system which enables the partially sighted to type from a printed or handwritten manuscript. Discussed are experiments using a pseudocolor system to determine…

  13. 17 CFR 270.0-2 - General requirements of papers and applications.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... necessary to authorize the undersigned to execute and file such instrument has been taken. The undersigned... may be present) by handwritten, typed, printed, or other legible form of notation from the facing page of the document through the last page of that document and any exhibits or attachments thereto...

  14. 17 CFR 270.0-2 - General requirements of papers and applications.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... necessary to authorize the undersigned to execute and file such instrument has been taken. The undersigned... may be present) by handwritten, typed, printed, or other legible form of notation from the facing page of the document through the last page of that document and any exhibits or attachments thereto...

  15. Older Japanese Adults and Mobile Phones: An Applied Ethnographic Study

    ERIC Educational Resources Information Center

    Hachiya, Kumiko

    2010-01-01

    This qualitative research investigates the meaning of "keitai" (mobile phones) for older Japanese adults between the ages of 59 and 79. Participants' emails from keitai, handwritten daily logs, and audio and video recordings from meetings and interviews were collected during my stay of nearly seven months in one of the largest cities in…

  16. Statistical Techniques for Efficient Indexing and Retrieval of Document Images

    ERIC Educational Resources Information Center

    Bhardwaj, Anurag

    2010-01-01

    We have developed statistical techniques to improve the performance of document image search systems where the intermediate step of OCR based transcription is not used. Previous research in this area has largely focused on challenges pertaining to generation of small lexicons for processing handwritten documents and enhancement of poor quality…

  17. Digital Management of a Hysteroscopy Surgery Using Parts of the SNOMED Medical Model

    PubMed Central

    Kollias, Anastasios; Paschopoulos, Minas; Evangelou, Angelos; Poulos, Marios

    2012-01-01

    This work describes a hysteroscopy surgery management application that was designed based on the medical information standard SNOMED. We describe how the application fulfils the needs of this procedure and the way in which existing handwritten medical information is effectively transmitted to the application’s database. PMID:22848338

  18. The Gin Builder: Examining the Skills Needed for the New Industrial Age.

    ERIC Educational Resources Information Center

    Kosty, Carlita; Lubar, Steven; Rhar, Bill

    2000-01-01

    Presents a lesson plan in which students explore the impact of industrialization on agriculture, the experience of William Ellison, a free black cotton gin mechanic, and the skills that Ellison needed. Students discuss handwritten documents, diagrams, and census information related to the cotton gin. Includes a bibliography and four handouts. (CMK)

  19. 49 CFR 381.310 - How do I apply for an exemption?

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... exemption? (a) You must send a written request (for example, a typed or handwritten (printed) letter), which... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... application must include a copy of all research reports, technical papers, and other publications and...

  20. 49 CFR 381.310 - How do I apply for an exemption?

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... exemption? (a) You must send a written request (for example, a typed or handwritten (printed) letter), which... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... application must include a copy of all research reports, technical papers, and other publications and...

  1. 49 CFR 381.310 - How do I apply for an exemption?

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... exemption? (a) You must send a written request (for example, a typed or handwritten (printed) letter), which... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... application must include a copy of all research reports, technical papers, and other publications and...

  2. 49 CFR 381.310 - How do I apply for an exemption?

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... exemption? (a) You must send a written request (for example, a typed or handwritten (printed) letter), which... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... application must include a copy of all research reports, technical papers, and other publications and...

  3. Positive Health and Financial Practices: Does Budgeting Make a Difference?

    ERIC Educational Resources Information Center

    O'Neill, Barbara; Xiao, Jing Jian; Ensle, Karen

    2017-01-01

    This study explored relationships between the practice of following a hand-written or computer-generated budget and the frequency of performance of positive personal health and financial practices. Data were collected from an online quiz completed by 942 adults, providing a simultaneous assessment of individuals' health and financial practices.…

  4. Discussion of papers

    NASA Astrophysics Data System (ADS)

    Discussion times were lively and highly fruitful. The Editors have organised questions and answers for each paper alphabetically by the speaker's surname. Although the discussion was recorded, only those questions and answers for which written versions were submitted have been included here. We are deeply indebted to Bev Lynds for transcribing the hand-written questions and answers.

  5. Towards a Better Understanding of the Legibility Bias in Performance Assessments: The Case of Gender-Based Inferences

    ERIC Educational Resources Information Center

    Greifeneder, Rainer; Zelt, Sarah; Seele, Tim; Bottenberg, Konstantin; Alt, Alexander

    2012-01-01

    Background: Handwriting legibility systematically biases evaluations in that highly legible handwriting results in more positive evaluations than less legible handwriting. Because performance assessments in educational contexts are not only based on computerized or multiple choice tests but often include the evaluation of handwritten work samples,…

  6. Error Reporting Logic

    DTIC Science & Technology

    2008-06-01

    14] Mark Weiser. Program slicing. Trans. Software Engineering, July 1984. 17 ...entitled “Perpetually Available and Secure Information Systems”, the Software Industry Center at CMU and its sponsors, especially the Alfred P. Sloan...ERL In Acme, a software architect can choose to associate a handwritten error message to each specification. If the specification fails, for any

  7. Mud, Blood, and Bullet Holes: Teaching History with War Letters

    ERIC Educational Resources Information Center

    Carroll, Andrew

    2013-01-01

    From handwritten letters of the American Revolution to typed emails from Iraq and Afghanistan, correspondence from U.S. troops offers students deep insight into the specific conflicts and experiences of soldiers. Over 100,000 correspondences have been donated to the Legacy Project, a national initiative launched in 1998 to preserve war letters by…

  8. Betty Kirby: Travels and Translations in the Kindergarten

    ERIC Educational Resources Information Center

    Sherwood, Elizabeth A.; Freshwater, Amy

    2009-01-01

    This article examines the pervasive influence of progressive education and travel on a public school kindergarten teacher's professional life. In a statement included in her handwritten list of goals for the children in her classroom, she echoed John Dewey, noting that a kindergarten child should "....live life fully and well because this is a…

  9. The "Intelligent Classroom": Changing Teaching and Learning with an Evolving Technological Environment.

    ERIC Educational Resources Information Center

    Winer, Laura R.; Cooperstock, Jeremy

    2002-01-01

    Describes the development and use of the Intelligent Classroom collaborative project at McGill University that explored technology use to improve teaching and learning. Explains the hardware and software installation that allows for the automated capture of audio, video, slides, and handwritten annotations during a live lecture, with subsequent…

  10. Investigating the Implemented Mathematics Curriculum of New England Navigation Cyphering Books

    ERIC Educational Resources Information Center

    Hertel, Joshua

    2016-01-01

    In this article I discuss an investigation of handwritten mathematics manuscripts known as navigation cyphering books. These manuscripts, which were prepared during the seventeenth and eighteenth centuries, are evidence of an educational tradition that was the primary means by which students in North America learned mathematics between 1607 and…

  11. 41 CFR 102-192.140 - What are your general responsibilities as a Federal mail center manager?

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... transmission of data in lieu of mail, reducing the number of handwritten addresses on outgoing mail, and other... processing activities at the facility, including all regularly scheduled, small package, and expedited... security office, the Postal Inspection Service, or other appropriate authority; (k) Track incoming packages...

  12. 49 CFR 381.210 - How do I request a waiver?

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... a written request (for example, a typed or handwritten (printed) letter), which includes all of the...) Principal place of business for the motor carrier or other entity (street address, city, State, and zip code... written statement that: (1) Describes the unique, non-emergency event for which the waiver would be used...

  13. Deep Convolutional Extreme Learning Machine and Its Application in Handwritten Digit Classification

    PubMed Central

    Pang, Shan; Yang, Xinyi

    2016-01-01

    In recent years, some deep learning methods have been developed and applied to image classification applications, such as convolutional neural network (CNN) and deep belief network (DBN). However, they suffer from problems such as local minima, slow convergence rate, and intensive human intervention. In this paper, we propose a rapid learning method, namely, deep convolutional extreme learning machine (DC-ELM), which combines the power of CNN and fast training of ELM. It uses multiple alternating convolution layers and pooling layers to effectively abstract high-level features from input images. Then the abstracted features are fed to an ELM classifier, which leads to better generalization performance with faster learning speed. DC-ELM also introduces stochastic pooling in the last hidden layer to reduce dimensionality of features greatly, thus saving much training time and computation resources. We systematically evaluated the performance of DC-ELM on two handwritten digit data sets: MNIST and USPS. Experimental results show that our method achieved better testing accuracy with significantly shorter training time in comparison with deep learning methods and other ELM methods. PMID:27610128
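
    To make the "fast training of ELM" concrete, here is a minimal sketch of a generic extreme learning machine classifier: the hidden-layer weights are drawn at random and left fixed, and only the output weights are fitted, in closed form, by regularised least squares. This covers only the final classifier stage; plain feature vectors stand in for the convolutional features of DC-ELM. The names `train_elm` and `predict_elm`, the tanh activation, the hidden-layer size, and the ridge term are assumptions for the example, not details taken from the paper.

```python
import numpy as np


def train_elm(features, labels, n_hidden=64, n_classes=2, ridge=1e-3, seed=0):
    """Fit an ELM: random fixed hidden layer, closed-form output weights."""
    rng = np.random.default_rng(seed)
    # Random input weights and biases are never updated after this point.
    w = rng.normal(size=(features.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    hidden = np.tanh(features @ w + b)
    targets = np.eye(n_classes)[labels]  # one-hot encoding of the class labels
    # Ridge-regularised least squares for the output weights (no iterative training).
    gram = hidden.T @ hidden + ridge * np.eye(n_hidden)
    beta = np.linalg.solve(gram, hidden.T @ targets)
    return w, b, beta


def predict_elm(model, features):
    """Return the predicted class index for each row of features."""
    w, b, beta = model
    return np.argmax(np.tanh(features @ w + b) @ beta, axis=1)
```

    Because the only fitted parameters come from a single linear solve, training cost is dominated by one matrix factorisation rather than many gradient steps.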

  14. Variational dynamic background model for keyword spotting in handwritten documents

    NASA Astrophysics Data System (ADS)

    Kumar, Gaurav; Wshah, Safwan; Govindaraju, Venu

    2013-12-01

    We propose a Bayesian framework for keyword spotting in handwritten documents. This work extends our previous work, in which we proposed the dynamic background model (DBM) for keyword spotting, which takes into account local character-level scores and global word-level scores to learn a logistic regression classifier that separates keywords from non-keywords. In this work, we add a Bayesian layer on top of the DBM, called the variational dynamic background model (VDBM). The logistic regression classifier uses the sigmoid function to separate keywords from non-keywords. Because the sigmoid function is neither convex nor concave, exact inference in the VDBM becomes intractable. An expectation maximization step is proposed to perform approximate inference. The advantages of VDBM over the DBM are manifold. Firstly, being Bayesian, it prevents over-fitting of data. Secondly, it provides better modeling of data and an improved prediction of unseen data. VDBM is evaluated on the IAM dataset, and the results show that it outperforms our prior work and other state-of-the-art line-based word spotting systems.

  15. Deep Convolutional Extreme Learning Machine and Its Application in Handwritten Digit Classification.

    PubMed

    Pang, Shan; Yang, Xinyi

    2016-01-01

    In recent years, some deep learning methods have been developed and applied to image classification applications, such as convolutional neural network (CNN) and deep belief network (DBN). However, they suffer from problems such as local minima, slow convergence rate, and intensive human intervention. In this paper, we propose a rapid learning method, namely, deep convolutional extreme learning machine (DC-ELM), which combines the power of CNN and fast training of ELM. It uses multiple alternating convolution layers and pooling layers to effectively abstract high-level features from input images. Then the abstracted features are fed to an ELM classifier, which leads to better generalization performance with faster learning speed. DC-ELM also introduces stochastic pooling in the last hidden layer to reduce dimensionality of features greatly, thus saving much training time and computation resources. We systematically evaluated the performance of DC-ELM on two handwritten digit data sets: MNIST and USPS. Experimental results show that our method achieved better testing accuracy with significantly shorter training time in comparison with deep learning methods and other ELM methods.

  16. The proximate unit in Chinese handwritten character production

    PubMed Central

    Chen, Jenn-Yeu; Cherng, Rong-Ju

    2013-01-01

    In spoken word production, a proximate unit is the first phonological unit at the sublexical level that is selectable for production (O'Seaghdha et al., 2010). The present study investigated whether the proximate unit in Chinese handwritten character production is the stroke, the radical, or something in between. A written version of the form preparation task was adopted. Chinese participants learned sets of two-character words, later were cued with the first character of each word, and had to write down the second character (the target). Response times were measured from the onset of a cue character to the onset of a written response. In Experiment 1, the target characters within a block shared (homogeneous) or did not share (heterogeneous) the first stroke. In Experiment 2, the first two strokes were shared in the homogeneous blocks. Response times in the homogeneous blocks and in the heterogeneous blocks were comparable in both experiments (Experiment 1: 687 vs. 684 ms, Experiment 2: 717 vs. 716). In Experiments 3 and 4, the target characters within a block shared or did not share the first radical. Response times in the homogeneous blocks were significantly faster than those in the heterogeneous blocks (Experiment 3: 685 vs. 704, Experiment 4: 594 vs. 650). In Experiments 5 and 6, the shared component was a Gestalt-like form that is more than a stroke, constitutes a portion of the target character, can be a stand-alone character itself, can be a radical of another character but is not a radical of the target character (e.g., ± in , , , ; called a logographeme). Response times in the homogeneous blocks were significantly faster than those in the heterogeneous blocks (Experiment 5: 576 vs. 625, Experiment 6: 586 vs. 620). These results suggest a model of Chinese handwritten character production in which the stroke is not a functional unit, the radical plays the role of a morpheme, and the logographeme is the proximate unit. PMID:23950752

  17. Impact of Internally Developed Electronic Prescription on Prescribing Errors at Discharge from the Emergency Department

    PubMed Central

    Hitti, Eveline; Tamim, Hani; Bakhti, Rinad; Zebian, Dina; Mufarrij, Afif

    2017-01-01

    Introduction: Medication errors are common, with studies reporting at least one error per patient encounter. At hospital discharge, medication errors vary from 15%–38%. However, studies assessing the effect of an internally developed electronic (E)-prescription system at discharge from an emergency department (ED) are comparatively minimal. Additionally, commercially available electronic solutions are cost-prohibitive in many resource-limited settings. We assessed the impact of introducing an internally developed, low-cost E-prescription system, with a list of commonly prescribed medications, on prescription error rates at discharge from the ED, compared to handwritten prescriptions. Methods: We conducted a pre- and post-intervention study comparing error rates in a randomly selected sample of discharge prescriptions (handwritten versus electronic) five months pre and four months post the introduction of the E-prescription. The internally developed, E-prescription system included a list of 166 commonly prescribed medications with the generic name, strength, dose, frequency and duration. We included a total of 2,883 prescriptions in this study: 1,475 in the pre-intervention phase were handwritten (HW) and 1,408 in the post-intervention phase were electronic. We calculated rates of 14 different errors and compared them between the pre- and post-intervention period. Results: Overall, E-prescriptions included fewer prescription errors as compared to HW-prescriptions. Specifically, E-prescriptions reduced missing dose (11.3% to 4.3%, p <0.0001), missing frequency (3.5% to 2.2%, p=0.04), missing strength errors (32.4% to 10.2%, p <0.0001) and legibility (0.7% to 0.2%, p=0.005). E-prescriptions, however, were associated with a significant increase in duplication errors, specifically with home medication (1.7% to 3%, p=0.02). Conclusion: A basic, internally developed E-prescription system, featuring commonly used medications, effectively reduced medication errors in a low-resource setting where the costs of sophisticated commercial electronic solutions are prohibitive. PMID:28874948

  18. Impact of Internally Developed Electronic Prescription on Prescribing Errors at Discharge from the Emergency Department.

    PubMed

    Hitti, Eveline; Tamim, Hani; Bakhti, Rinad; Zebian, Dina; Mufarrij, Afif

    2017-08-01

    Medication errors are common, with studies reporting at least one error per patient encounter. At hospital discharge, medication errors vary from 15%-38%. However, studies assessing the effect of an internally developed electronic (E)-prescription system at discharge from an emergency department (ED) are comparatively minimal. Additionally, commercially available electronic solutions are cost-prohibitive in many resource-limited settings. We assessed the impact of introducing an internally developed, low-cost E-prescription system, with a list of commonly prescribed medications, on prescription error rates at discharge from the ED, compared to handwritten prescriptions. We conducted a pre- and post-intervention study comparing error rates in a randomly selected sample of discharge prescriptions (handwritten versus electronic) five months pre and four months post the introduction of the E-prescription. The internally developed, E-prescription system included a list of 166 commonly prescribed medications with the generic name, strength, dose, frequency and duration. We included a total of 2,883 prescriptions in this study: 1,475 in the pre-intervention phase were handwritten (HW) and 1,408 in the post-intervention phase were electronic. We calculated rates of 14 different errors and compared them between the pre- and post-intervention period. Overall, E-prescriptions included fewer prescription errors as compared to HW-prescriptions. Specifically, E-prescriptions reduced missing dose (11.3% to 4.3%, p <0.0001), missing frequency (3.5% to 2.2%, p=0.04), missing strength errors (32.4% to 10.2%, p <0.0001) and legibility (0.7% to 0.2%, p=0.005). E-prescriptions, however, were associated with a significant increase in duplication errors, specifically with home medication (1.7% to 3%, p=0.02). A basic, internally developed E-prescription system, featuring commonly used medications, effectively reduced medication errors in a low-resource setting where the costs of sophisticated commercial electronic solutions are prohibitive.

  19. Physicians Failed to Write Flawless Prescriptions When Computerized Physician Order Entry System Crashed.

    PubMed

    Hsu, Chia-Chen; Chou, Chia-Lin; Chen, Tzeng-Ji; Ho, Chin-Chin; Lee, Chung-Yuan; Chou, Yueh-Ching

    2015-05-01

    Clinical care has become increasingly dependent on computerized physician order entry (CPOE) systems. No study has reported the adverse effect of CPOE on physicians' ability to handwrite prescriptions. This study took advantage of an extensive crash of the CPOE system at a large hospital to assess the completeness, legibility, and accuracy of physicians' handwritten prescriptions. The CPOE system had operated at the outpatient department of an academic medical center in Taiwan since 1993. During an unintentional shutdown that lasted 3.5 hours in 2010, physicians were forced to write prescriptions manually. These handwritten prescriptions, together with clinical medical records, were later audited by clinical pharmacists with respect to 16 fields of the patient's, prescriber's, and drug data. A total of 1418 prescriptions with 3805 drug items were handwritten by 114 physicians for 1369 patients. Not a single prescription had all necessary fields filled in. Although the field of age was most frequently omitted (1282 [90.4%] of 1418 prescriptions) among the patient's data, the field of dosage form was most frequently omitted (3480 [91.5%] of 3805 items) among the drug data. In contrast, the scale of illegibility was rather small. The highest percentage reached only 1.5% (n = 57) in the field of drug frequency. Inaccuracies of strength, dose, and drug name were observed in 745 (19.6%), 517 (13.6%), and 435 (11.4%) prescribed drug items, respectively. The unintentional shutdown of a long-running CPOE system revealed that physicians fail to handwrite flawless prescriptions in the digital era. The contingency plans for computer disasters at health care facilities might include preparation of stand-alone e-prescribing software so that the service delay could be kept to the minimum. However, guidance on prescribing should remain an essential part of medical education.

  20. Presentation video retrieval using automatically recovered slide and spoken text

    NASA Astrophysics Data System (ADS)

    Cooper, Matthew

    2013-03-01

    Video is becoming a prevalent medium for e-learning. Lecture videos contain text information in both the presentation slides and lecturer's speech. This paper examines the relative utility of automatically recovered text from these sources for lecture video retrieval. To extract the visual information, we automatically detect slides within the videos and apply optical character recognition to obtain their text. Automatic speech recognition is used similarly to extract spoken text from the recorded audio. We perform controlled experiments with manually created ground truth for both the slide and spoken text from more than 60 hours of lecture video. We compare the automatically extracted slide and spoken text in terms of accuracy relative to ground truth, overlap with one another, and utility for video retrieval. Results reveal that automatically recovered slide text and spoken text contain different content with varying error profiles. Experiments demonstrate that automatically extracted slide text enables higher precision video retrieval than automatically recovered spoken text.

  1. Con-Text: Text Detection for Fine-grained Object Classification.

    PubMed

    Karaoglu, Sezer; Tao, Ran; van Gemert, Jan C; Gevers, Theo

    2017-05-24

    This work focuses on fine-grained object classification using recognized scene text in natural images. While the state-of-the-art relies on visual cues only, this paper is the first work which proposes to combine textual and visual cues. Another novelty is the textual cue extraction. Unlike the state-of-the-art text detection methods, we focus more on the background instead of text regions. Once text regions are detected, they are further processed by two methods to perform text recognition i.e. ABBYY commercial OCR engine and a state-of-the-art character recognition algorithm. Then, to perform textual cue encoding, bi- and trigrams are formed between the recognized characters by considering the proposed spatial pairwise constraints. Finally, extracted visual and textual cues are combined for fine-grained classification. The proposed method is validated on four publicly available datasets: ICDAR03, ICDAR13, Con-Text and Flickr-logo. We improve the state-of-the-art end-to-end character recognition by a large margin of 15% on ICDAR03. We show that textual cues are useful in addition to visual cues for fine-grained classification. We show that textual cues are also useful for logo retrieval. Adding textual cues outperforms visual- and textual-only in fine-grained classification (70.7% to 60.3%) and logo retrieval (57.4% to 54.8%).

  2. A Novel Approach towards Medical Entity Recognition in Chinese Clinical Text

    PubMed Central

    Yu, Jian

    2017-01-01

    Medical entity recognition, a basic task in the language processing of clinical data, has been extensively studied in analyzing admission notes in alphabetic languages such as English. However, much less work has been done on nonstructural texts that are written in Chinese, or in the setting of differentiation of Chinese drug names between traditional Chinese medicine and Western medicine. Here, we propose a novel cascade-type Chinese medication entity recognition approach that aims at integrating the sentence category classifier from a support vector machine and the conditional random field-based medication entity recognition. We hypothesized that this approach could avoid the side effects of abundant negative samples and improve the performance of the named entity recognition from admission notes written in Chinese. Therefore, we applied this approach to a test set of 324 Chinese-written admission notes with manual annotation by medical experts. Our data demonstrated that this approach had a score of 94.2% in precision, 92.8% in recall, and 93.5% in F-measure for the recognition of traditional Chinese medicine drug names and 91.2% in precision, 92.6% in recall, and 91.7% F-measure for the recognition of Western medicine drug names. The differences in F-measure were significant compared with those in the baseline systems. PMID:29065612
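
    As a worked check of the reported scores, the sketch below computes the F-measure from precision and recall. It assumes the balanced F-measure (harmonic mean of precision and recall); the abstract does not say which variant was used, and `f_measure` is an illustrative name.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumes equal weighting of precision and recall; the abstract does not
    state the weighting it used.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported precision and recall for traditional Chinese medicine drug names.
tcm_f = f_measure(94.2, 92.8)
print(round(tcm_f, 1))
```

    Under this assumption the traditional Chinese medicine figures reproduce the reported 93.5%. Applied to the rounded Western medicine figures, the same formula gives about 91.9% rather than the reported 91.7%; the abstract does not state how its figures were rounded or averaged.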

  3. Multi-font printed Mongolian document recognition system

    NASA Astrophysics Data System (ADS)

    Peng, Liangrui; Liu, Changsong; Ding, Xiaoqing; Wang, Hua; Jin, Jianming

    2009-01-01

    Mongolian is one of the major ethnic languages in China. A large number of printed Mongolian documents need to be digitized for digital libraries and various applications. Traditional Mongolian script has a unique writing style and multi-font-type variations, which bring challenges to Mongolian OCR research. Because traditional Mongolian script has some special characteristics (for example, one character may be part of another character), we define the character set for recognition according to the segmented components, and the components are combined into characters by a rule-based post-processing module. For character recognition, a method based on visual directional features and multi-level classifiers is presented. For character segmentation, a scheme is used to find the segmentation point by analyzing the properties of projection and connected components. As Mongolian has different font types, which are categorized into two major groups, the segmentation parameter is adjusted for each group. A font-type classification method for the two font-type groups is introduced. For recognition of Mongolian text mixed with Chinese and English, language identification and relevant character recognition kernels are integrated. Experiments show that the presented methods are effective. The text recognition rate is 96.9% on the test samples from practical documents with multi-font-types and mixed scripts.
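
    Projection-based segmentation of the kind mentioned above can be sketched as follows. This is a generic projection-profile example on a tiny binary grid, not the authors' scheme: the connected-component analysis is omitted, and `min_ink` is a hypothetical threshold standing in for the per-font-group segmentation parameter.

```python
def projection_profile(image, axis):
    """Count ink pixels per row (axis=0) or per column (axis=1) of a 0/1 grid."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def segment_runs(profile, min_ink=1):
    """Return (start, end) pairs, end exclusive, where the profile >= min_ink.

    min_ink is a hypothetical tunable threshold; gaps below it are treated
    as candidate segmentation points.
    """
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = i
        elif value < min_ink and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


# Made-up binary image: 1 marks ink, 0 marks background.
page = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
]
columns = segment_runs(projection_profile(page, axis=1))
rows = segment_runs(projection_profile(page, axis=0))
print(columns, rows)
```

    Raising `min_ink` splits components joined by thin strokes, which is one way a segmentation parameter might be tuned differently for each font group.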

  4. Development and Evaluation of a Feedback Support System with Audio and Playback Strokes

    ERIC Educational Resources Information Center

    Li, Kai; Akahori, Kanji

    2008-01-01

    This paper describes the development and evaluation of a handwritten correction support system with audio and playback strokes used to teach Japanese writing. The study examined whether audio and playback strokes have a positive effect on students using honorific expressions in Japanese writing. The results showed that error feedback with audio…

  5. The Use of the Overhead Projector in Teaching Composition.

    ERIC Educational Resources Information Center

    Bissex, Henry

    The overhead projector, used as a controllable blackboard or bulletin board in the teaching of writing, extends the range of teaching techniques so that an instructor may (1) prepare, in advance, handwritten sheets of film--test questions, pupils' sentences, quotations, short poems--to be shown in any order or form; (2) use pictures, graphics, or…

  6. 49 CFR 592.6 - Duties of a registered importer.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... pursuant to § 592.5(a)(5)(iv), with an original hand-written signature and not with a signature that is... paragraph (d) of this section (the 30-day period will be extended if the Administrator has made written... on which Code J is checked, and the EPA has granted the ICI written permission to operate the vehicle...

  7. 41 CFR 102-192.155 - What should our agency-wide mail management policy statement cover?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... correct street addresses, and minimizing use of hand-written addresses; (j) Ensuring that a USPS mail... should our agency-wide mail management policy statement cover? You should have a written, agency-wide...), or to return it to the sender if the addressee cannot be identified. On the other hand, agencies may...

  8. STS-42 IPMP experiment stowed in locker MF71O on OV-103's middeck

    NASA Technical Reports Server (NTRS)

    1992-01-01

    STS-42 Investigations into Polymer Membrane Processing (IPMP) experiment stainless steel cylinders are stowed in locker MF71O on the middeck of Discovery, Orbiter Vehicle (OV) 103. A checklist with numerous handwritten notations floats above the open forward locker and a roll of duct tape is secured on nearby locker.

  9. Write to read: the brain's universal reading and writing network.

    PubMed

    Perfetti, Charles A; Tan, Li-Hai

    2013-02-01

    Do differences in writing systems translate into differences in the brain's reading network? Or is this network universal, relatively impervious to variation in writing systems? A new study adds intriguing evidence to these questions by showing that reading handwritten words activates a pre-motor area across writing systems. Copyright © 2012 Elsevier Ltd. All rights reserved.

  10. Providing Formative Assessment to Students Solving Multipath Engineering Problems with Complex Arrangements of Interacting Parts: An Intelligent Tutor Approach

    ERIC Educational Resources Information Center

    Steif, Paul S.; Fu, Luoting; Kara, Levent Burak

    2016-01-01

    Problems faced by engineering students involve multiple pathways to solution. Students rarely receive effective formative feedback on handwritten homework. This paper examines the potential for computer-based formative assessment of student solutions to multipath engineering problems. In particular, an intelligent tutor approach is adopted and…

  11. Use of Screen Capture to Produce Media for Organic Chemistry

    ERIC Educational Resources Information Center

    D'Angelo, John G.

    2014-01-01

    Although many students learn best in different ways, the widest range of students can be reached when multiple modes of input are employed, especially if the student is simultaneously completing a set of handwritten notes. Computers, meanwhile, have led to countless changes in society, and education has not been exempt from these changes. Students…

  12. 28 CFR Appendix A to Part 35 - Guidance to Revisions to ADA Regulation on Nondiscrimination on the Basis of Disability in State...

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... communicated effectively using handwritten notes. One major advocacy organization, for example, noted that the... example, blood work for routine lab tests or regular allergy shots. Video Interpreting Services... or combustion engines. One commenter suggested using exhaust level as the determinant. Although there...

  13. 28 CFR Appendix A to Part 35 - Guidance to Revisions to ADA Regulation on Nondiscrimination on the Basis of Disability in State...

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... communicated effectively using handwritten notes. One major advocacy organization, for example, noted that the... example, blood work for routine lab tests or regular allergy shots. Video Interpreting Services... or combustion engines. One commenter suggested using exhaust level as the determinant. Although there...

  14. 28 CFR Appendix A to Part 35 - Guidance to Revisions to ADA Regulation on Nondiscrimination on the Basis of Disability in State...

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... communicated effectively using handwritten notes. One major advocacy organization, for example, noted that the... example, blood work for routine lab tests or regular allergy shots. Video Interpreting Services... or combustion engines. One commenter suggested using exhaust level as the determinant. Although there...

  15. 28 CFR Appendix A to Part 35 - Guidance to Revisions to ADA Regulation on Nondiscrimination on the Basis of Disability in State...

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... communicated effectively using handwritten notes. One major advocacy organization, for example, noted that the... example, blood work for routine lab tests or regular allergy shots. Video Interpreting Services... or combustion engines. One commenter suggested using exhaust level as the determinant. Although there...

  16. 49 CFR 381.410 - What may I do if I have an idea or suggestion for a pilot program?

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... (for example, a typed or handwritten (printed) letter) to the Administrator, Federal Motor Carrier... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... recommendation should include a copy of all research reports, technical papers, publications and other documents...

  17. 49 CFR 381.410 - What may I do if I have an idea or suggestion for a pilot program?

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... (for example, a typed or handwritten (printed) letter) to the Administrator, Federal Motor Carrier... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... recommendation should include a copy of all research reports, technical papers, publications and other documents...

  18. 49 CFR 381.410 - What may I do if I have an idea or suggestion for a pilot program?

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... (for example, a typed or handwritten (printed) letter) to the Administrator, Federal Motor Carrier... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... recommendation should include a copy of all research reports, technical papers, publications and other documents...

  19. 49 CFR 381.410 - What may I do if I have an idea or suggestion for a pilot program?

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... (for example, a typed or handwritten (printed) letter) to the Administrator, Federal Motor Carrier... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... recommendation should include a copy of all research reports, technical papers, publications and other documents...

  20. Understanding Student Satisfaction and Dissatisfaction: An Interpretive Study in the UK Higher Education Context

    ERIC Educational Resources Information Center

    Douglas, Jacqueline Ann; Douglas, Alexander; McClelland, Robert James; Davies, John

    2015-01-01

    This article represents a cross-sectional study of undergraduate students across two north-west university business schools in the UK. A purposefully designed questionnaire was collected from 350 students. The student experience was described in the form of hand-written narratives by first and final year students and had been identified by the…

  1. The Child Writer: Graphic Literacy and the Scottish Educational System, 1700-1820

    ERIC Educational Resources Information Center

    Eddy, Matthew Daniel

    2016-01-01

    The story of Enlightenment literacy is often reconstructed from textbooks and manuals, with the implicit focus being what children were reading. But far less attention has been devoted to how they mastered the scribal techniques that allowed them to manage knowledge on paper. Focusing on Scotland, handwritten manuscripts are used to reveal that…

  2. [Representation of letter position in visual word recognition process].

    PubMed

    Makioka, S

    1994-08-01

    Two experiments investigated the representation of letter position in the visual word recognition process. In Experiment 1, subjects (12 undergraduates and graduates) were asked to detect a target word in a briefly presented probe. Probes consisted of two kanji words. The letters that formed targets (critical letters) were always contained in probes (e.g. target: [symbol: see text] probe: [symbol: see text]). A high false alarm rate was observed when critical letters occupied the same within-word relative position (left or right within the word) in the probe words as in the target word. In Experiment 2 (subjects were ten undergraduates and graduates), spaces adjacent to probe words were replaced by randomly chosen hiragana letters (e.g. [symbol: see text]), because spaces are not used to separate words in regular Japanese sentences. In addition to the effect of within-word relative position as in Experiment 1, the effect of between-word relative position (left or right across the probe words) was observed. These results suggest that information about within-word relative position of a letter is used in the word recognition process. The effect of within-word relative position was explained by a connectionist model of word recognition.

  3. Automatic detection and recognition of signs from natural scenes.

    PubMed

    Chen, Xilin; Yang, Jie; Zhang, Jing; Waibel, Alex

    2004-01-01

    In this paper, we present an approach to automatic detection and recognition of signs from natural scenes, and its application to a sign translation task. The proposed approach embeds multiresolution and multiscale edge detection, adaptive searching, color analysis, and affine rectification in a hierarchical framework for sign detection, with different emphases at each phase to handle the text in different sizes, orientations, color distributions and backgrounds. We use affine rectification to recover deformation of the text regions caused by an inappropriate camera view angle. The procedure can significantly improve text detection rate and optical character recognition (OCR) accuracy. Instead of using binary information for OCR, we extract features from an intensity image directly. We propose a local intensity normalization method to effectively handle lighting variations, followed by a Gabor transform to obtain local features, and finally a linear discriminant analysis (LDA) method for feature selection. We have applied the approach in developing a Chinese sign translation system, which can automatically detect and recognize Chinese signs as input from a camera, and translate the recognized text into English.
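
    The idea behind normalizing intensity locally to handle lighting variation can be illustrated with a small sketch. The abstract does not define its normalization method, so the zero-mean, unit-variance scheme below is an assumed stand-in, and `normalize_patch` and the sample patches are illustrative only.

```python
from math import sqrt


def normalize_patch(patch, eps=1e-8):
    """Scale a grayscale patch to zero mean and unit variance.

    This is an assumed normalization, not the paper's method. eps guards
    against division by zero on a perfectly flat patch.
    """
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [[(v - mean) / (std + eps) for v in row] for row in patch]


# The same made-up patch under two lighting conditions: the second is the
# first with doubled contrast and an added brightness offset.
dim = [[10, 20], [30, 40]]
bright = [[2 * v + 10 for v in row] for row in dim]

norm_dim = normalize_patch(dim)
norm_bright = normalize_patch(bright)
print(norm_dim)
print(norm_bright)
```

    Both patches normalize to nearly identical values, so features computed afterwards, such as Gabor responses, would be largely unaffected by a linear change in illumination within the patch.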

  4. Unsupervised Medical Entity Recognition and Linking in Chinese Online Medical Text

    PubMed Central

    Gan, Liang; Cheng, Mian; Wu, Quanyuan

    2018-01-01

    Online medical text is full of references to medical entities (MEs), which are valuable in many applications, including medical knowledge-based (KB) construction, decision support systems, and the treatment of diseases. However, the diverse and ambiguous nature of the surface forms gives rise to a great difficulty for ME identification. Many existing solutions have focused on supervised approaches, which are often task-dependent. In other words, applying them to different kinds of corpora or identifying new entity categories requires major effort in data annotation and feature definition. In this paper, we propose unMERL, an unsupervised framework for recognizing and linking medical entities mentioned in Chinese online medical text. For ME recognition, unMERL first exploits a knowledge-driven approach to extract candidate entities from free text. Then, the categories of the candidate entities are determined using a distributed semantic-based approach. For ME linking, we propose a collaborative inference approach which takes full advantage of heterogenous entity knowledge and unstructured information in KB. Experimental results on real corpora demonstrate significant benefits compared to recent approaches with respect to both ME recognition and linking. PMID:29849994

  5. Reading in developmental prosopagnosia: Evidence for a dissociation between word and face recognition.

    PubMed

    Starrfelt, Randi; Klargaard, Solja K; Petersen, Anders; Gerlach, Christian

    2018-02-01

    Recent models suggest that face and word recognition may rely on overlapping cognitive processes and neural regions. In support of this notion, face recognition deficits have been demonstrated in developmental dyslexia. Here we test whether the opposite association can also be found, that is, impaired reading in developmental prosopagnosia. We tested 10 adults with developmental prosopagnosia and 20 matched controls. All participants completed the Cambridge Face Memory Test, the Cambridge Face Perception Test and a face recognition questionnaire used to quantify everyday face recognition experience. Reading was measured in four experimental tasks, testing different levels of letter, word, and text reading: (a) single word reading with words of varying length, (b) vocal response times in single letter and short word naming, (c) recognition of single letters and short words at brief exposure durations (targeting the word superiority effect), and (d) text reading. Participants with developmental prosopagnosia performed strikingly similarly to controls across the four reading tasks. Formal analysis revealed a significant dissociation between word and face recognition, as the difference in performance with faces and words was significantly greater for participants with developmental prosopagnosia than for controls. Adult developmental prosopagnosics read as quickly and fluently as controls, while they are seemingly unable to learn efficient strategies for recognizing faces. We suggest that this is due to the differing demands that face and word recognition put on the perceptual system. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  6. NEW BEDFORD, HANDWRITTEN NOTES REGARDING MEETING ON SOUTH TERMINAL - STATE ENHANCED REMEDY (SER), 09-28-2012, SDMS# 529022

    EPA Pesticide Factsheets

    2013-06-11

  7. 76 FR 62496 - Motor Carrier Safety Advisory Committee Series of Public Subcommittee Meetings

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-07

    ... EOBRs used in lieu of handwritten records of duty status (RODS). Time and Dates: The meetings will be held Monday-Thursday, October 24-27, 2011, from 8:30 am to 5 pm, E.T. at the Sheraton Crystal City, 1800 Jefferson Davis Highway, Arlington, VA, 22202, in meeting rooms Crystal V and VI. Matters To Be...

  8. Strategies to Help Legal Studies Students Avoid Plagiarism

    ERIC Educational Resources Information Center

    Samuels, Linda B.; Bast, Carol M.

    2006-01-01

    Plagiarism is certainly not new to academics, but it may be on the rise with easy access to the vast quantities of information available on the Internet. Students researching on the Internet do not have to take handwritten or typewritten notes. They can simply print out or copy and save whatever they find. They are even spared the tedium of having…

  9. 49 CFR 381.410 - What may I do if I have an idea or suggestion for a pilot program?

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... (for example, a typed or handwritten (printed) letter) to the Administrator, Federal Motor Carrier... include: (1) Your name, job title, mailing address, and daytime telephone number; (2) The name of the... measures in the pilot project would be designed to achieve a level of safety that is equivalent to, or...

  10. Kinetic modelling of the oxidation of large aliphatic hydrocarbons using an automatic mechanism generation.

    PubMed

    Muharam, Yuswan; Warnatz, Jürgen

    2007-08-21

    A mechanism generator code to automatically generate mechanisms for the oxidation of large hydrocarbons has been successfully modified and considerably expanded in this work. The modification was through (1) improvement of the existing rules such as cyclic-ether reactions and aldehyde reactions, (2) inclusion of some additional rules in the code, such as ketone reactions, hydroperoxy cyclic-ether formations and additional reactions of alkenes, and (3) inclusion of small oxygenates, produced by the code but not yet included in the handwritten C(1)-C(4) sub-mechanism, in the handwritten C(1)-C(4) sub-mechanism. In order to evaluate mechanisms generated by the code, simulations of observed results in different experimental environments have been carried out. Experimentally derived and numerically predicted ignition delays of n-heptane-air and n-decane-air mixtures in high-pressure shock tubes in a wide range of temperatures, pressures and equivalence ratios agree very well. Concentration profiles of the main products and intermediates of n-heptane and n-decane oxidation in jet-stirred reactors at a wide range of temperatures and equivalence ratios are generally well reproduced. In addition, the ignition delay times of different normal alkanes were numerically studied.

  11. Phonological Codes Constrain Output of Orthographic Codes via Sublexical and Lexical Routes in Chinese Written Production

    PubMed Central

    Wang, Cheng; Zhang, Qingfang

    2015-01-01

    To what extent do phonological codes constrain orthographic output in handwritten production? We investigated how phonological codes constrain the selection of orthographic codes via sublexical and lexical routes in Chinese written production. Participants wrote down picture names in a picture-naming task in Experiment 1 or response words in a symbol-word associative writing task in Experiment 2. A sublexical phonological property of picture names (phonetic regularity: regular vs. irregular) in Experiment 1 and a lexical phonological property of response words (homophone density: dense vs. sparse) in Experiment 2, as well as word frequency of the targets in both experiments, were manipulated. A facilitatory effect of word frequency was found in both experiments, in which words with high frequency were produced faster than those with low frequency. More importantly, we observed an inhibitory phonetic regularity effect, in which low-frequency picture names with regular first characters were slower to write than those with irregular ones, and an inhibitory homophone density effect, in which characters with dense homophone density were produced more slowly than those with sparse homophone density. Results suggested that phonological codes constrained handwritten production via lexical and sublexical routes. PMID:25879662

  12. A Multiple-Label Guided Clustering Algorithm for Historical Document Dating and Localization.

    PubMed

    He, Sheng; Samara, Petros; Burgers, Jan; Schomaker, Lambert

    2016-11-01

    It is of essential importance for historians to know the date and place of origin of the documents they study. It would be a huge advancement for historical scholars if it would be possible to automatically estimate the geographical and temporal provenance of a handwritten document by inferring them from the handwriting style of such a document. We propose a multiple-label guided clustering algorithm to discover the correlations between the concrete low-level visual elements in historical documents and abstract labels, such as date and location. First, a novel descriptor, called histogram of orientations of handwritten strokes, is proposed to extract and describe the visual elements, which is built on a scale-invariant polar-feature space. In addition, the multi-label self-organizing map (MLSOM) is proposed to discover the correlations between the low-level visual elements and their labels in a single framework. Our proposed MLSOM can be used to predict the labels directly. Moreover, the MLSOM can also be considered as a pre-structured clustering method to build a codebook, which contains more discriminative information on date and geography. The experimental results on the medieval paleographic scale data set demonstrate that our method achieves state-of-the-art results.

  13. Speech Recognition as a Support Service for Deaf and Hard of Hearing Students: Adaptation and Evaluation. Final Report to Spencer Foundation.

    ERIC Educational Resources Information Center

    Stinson, Michael; Elliot, Lisa; McKee, Barbara; Coyne, Gina

    This report discusses a project that adapted new automatic speech recognition (ASR) technology to provide real-time speech-to-text transcription as a support service for students who are deaf and hard of hearing (D/HH). In this system, as the teacher speaks, a hearing intermediary, or captionist, dictates into the speech recognition system in a…

  14. Reading Machines for Blind People.

    ERIC Educational Resources Information Center

    Fender, Derek H.

    1983-01-01

    Ten stages of developing reading machines for blind people are analyzed: handling of text material; optics; electro-optics; pattern recognition; character recognition; storage; speech synthesizers; browsing and place finding; computer indexing; and other sources of input. Cost considerations of the final product are emphasized. (CL)

  15. Chemical named entities recognition: a review on approaches and applications

    PubMed Central

    2014-01-01

    The rapid increase in the flow rate of published digital information in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is very rich with information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature to “text mine” these extracted data and determine contextual relationships helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, the authors briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based and machine learning, as well as hybrid chemical named entity recognition approaches with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted. PMID:24834132

  16. Chemical named entities recognition: a review on approaches and applications.

    PubMed

    Eltyeb, Safaa; Salim, Naomie

    2014-01-01

    The rapid increase in the flow rate of published digital information in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is very rich with information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature to "text mine" these extracted data and determine contextual relationships helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, the authors briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based and machine learning, as well as hybrid chemical named entity recognition approaches with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted.

  17. Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments.

    PubMed

    Tian, Yingli; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

    2013-04-01

    Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech.

  18. Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments

    PubMed Central

    Tian, YingLi; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

    2012-01-01

    Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech. PMID:23630409

  19. Effects of text cohesion on comprehension and retention of colorectal cancer screening information: a preliminary study.

    PubMed

    Liu, Chiung-Ju; Rawl, Susan M

    2012-01-01

    Increasing readability of written cancer prevention information is a fundamental step to increasing awareness and knowledge of cancer screening. Instead of readability formulas, the present study focused on text cohesion, which is the degree to which the text content ties together. The purpose of this study was to examine the effect of text cohesion on reading times, comprehension, and retention of colorectal cancer prevention information. English-speaking adults (50 years of age or older) were recruited from local communities. Participants were randomly assigned to read colorectal cancer prevention subtopics presented at 2 levels of text cohesion: from higher cohesion to lower cohesion, or vice versa. Reading times, word recognition, text comprehension, and recall were assessed after reading. Two weeks later, text comprehension and recall were reassessed. Forty-two adults completed the study, but five were lost to follow up. Higher text cohesion showed a significant effect on reading times and text comprehension but not on word recognition and recall. The effect of text cohesion was not found on text comprehension and recall after 2 weeks. Increasing text cohesion facilitates reading speed and comprehension of colorectal cancer prevention information. Further research on the effect of text cohesion is warranted.

  20. The use of illustration to improve older adults' comprehension of health-related information: is it helpful?

    PubMed

    Liu, Chiung-ju; Kemper, Susan; McDowd, Joan

    2009-08-01

    To examine whether explanatory illustrations can improve older adults' comprehension of written health information. Six short health-related texts were selected from websites and pamphlets. Young and older adults were randomly assigned to read health-related texts alone or texts accompanied by explanatory illustrations. Eye movements were recorded while reading. Word recognition, text comprehension, and comprehension of the illustrations were assessed after reading. Older adults performed as well as or better than young adults on the word recognition and text comprehension measures. However, older adults performed less well than young adults on the illustration comprehension measures. Analysis of readers' eye movements showed that older adults spent more time reading illustration-related phrases and fixating on the illustrations than did young adults, yet had poorer comprehension of the illustrations. Older adults might not benefit from text illustrations because illustrations can be difficult to integrate with the text. Health practitioners should not assume that illustrations will increase older adults' comprehension of health information.

  1. The Hegemony of Heterosexuality: A Study of Introductory Texts.

    ERIC Educational Resources Information Center

    Phillips, Sarah Rengel

    1991-01-01

    Reviews introductory sociology texts from 1950-89. Reports that heterosexual biases are embedded in sociology as taught. Argues that goals of sociology texts should include the recognition and exploration of difference rather than the homogenization of sexuality. Concludes that, although introductory sociology texts have made advances in…

  2. Mind wandering in text comprehension under dual-task conditions.

    PubMed

    Dixon, Peter; Li, Henry

    2013-01-01

    In two experiments, subjects responded to on-task probes while reading under dual-task conditions. The secondary task was to monitor the text for occurrences of the letter e. In Experiment 1, reading comprehension was assessed with a multiple-choice recognition test; in Experiment 2, subjects recalled the text. In both experiments, the secondary task replicated the well-known "missing-letter effect" in which detection of e's was less effective for function words and the word "the." Letter detection was also more effective when subjects were on task, but this effect did not interact with the missing-letter effect. Comprehension was assessed in both the dual-task conditions and in control single-task conditions. In the single-task conditions, both recognition (Experiment 1) and recall (Experiment 2) were better when subjects were on task, replicating previous research on mind wandering. Surprisingly, though, comprehension under dual-task conditions only showed an effect of being on task when measured with recall; there was no effect on recognition performance. Our interpretation of this pattern of results is that subjects generate responses to on-task probes on the basis of a retrospective assessment of the contents of working memory. Further, we argue that under dual-task conditions, the contents of working memory are not closely related to the reading processes required for accurate recognition performance. These conclusions have implications for models of text comprehension and for the interpretation of on-task probe responses.

  3. Mind wandering in text comprehension under dual-task conditions

    PubMed Central

    Dixon, Peter; Li, Henry

    2013-01-01

    In two experiments, subjects responded to on-task probes while reading under dual-task conditions. The secondary task was to monitor the text for occurrences of the letter e. In Experiment 1, reading comprehension was assessed with a multiple-choice recognition test; in Experiment 2, subjects recalled the text. In both experiments, the secondary task replicated the well-known “missing-letter effect” in which detection of e's was less effective for function words and the word “the.” Letter detection was also more effective when subjects were on task, but this effect did not interact with the missing-letter effect. Comprehension was assessed in both the dual-task conditions and in control single-task conditions. In the single-task conditions, both recognition (Experiment 1) and recall (Experiment 2) were better when subjects were on task, replicating previous research on mind wandering. Surprisingly, though, comprehension under dual-task conditions only showed an effect of being on task when measured with recall; there was no effect on recognition performance. Our interpretation of this pattern of results is that subjects generate responses to on-task probes on the basis of a retrospective assessment of the contents of working memory. Further, we argue that under dual-task conditions, the contents of working memory are not closely related to the reading processes required for accurate recognition performance. These conclusions have implications for models of text comprehension and for the interpretation of on-task probe responses. PMID:24101909

  4. Machine-printed Arabic OCR

    NASA Astrophysics Data System (ADS)

    Hassibi, Khosrow M.

    1994-02-01

    This paper presents a brief overview of our research in the development of an OCR system for recognition of machine-printed texts in languages that use the Arabic alphabet. The cursive nature of machine-printed Arabic makes the segmentation of words into letters a challenging problem. In our approach, through a novel preliminary segmentation technique, a word is broken into pieces, where each piece may not in general represent a valid letter. Neural networks trained on a sample set of about 500 Arabic text images are used for recognition of these pieces. The rules governing the alphabet and character-level contextual information are used for recombining these pieces into valid letters. Higher-level contextual analysis schemes, including the use of an Arabic lexicon and n-grams, are also under development and are expected to improve the word recognition accuracy. The segmentation, recognition, and contextual analysis processes are closely integrated using a feedback scheme. The details of preparation of the training set and some recent results on training of the networks will be presented.

  5. Event-driven contrastive divergence for spiking neuromorphic systems.

    PubMed

    Neftci, Emre; Das, Srinjoy; Pedroni, Bruno; Kreutz-Delgado, Kenneth; Cauwenberghs, Gert

    2013-01-01

    Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been demonstrated to perform efficiently in a variety of applications, such as dimensionality reduction, feature learning, and classification. Their implementation on neuromorphic hardware platforms emulating large-scale networks of spiking neurons can have significant advantages from the perspectives of scalability, power dissipation and real-time interfacing with the environment. However, the traditional RBM architecture and the commonly used training algorithm known as Contrastive Divergence (CD) are based on discrete updates and exact arithmetics which do not directly map onto a dynamical neural substrate. Here, we present an event-driven variation of CD to train a RBM constructed with Integrate & Fire (I&F) neurons, that is constrained by the limitations of existing and near future neuromorphic hardware platforms. Our strategy is based on neural sampling, which allows us to synthesize a spiking neural network that samples from a target Boltzmann distribution. The recurrent activity of the network replaces the discrete steps of the CD algorithm, while Spike Time Dependent Plasticity (STDP) carries out the weight updates in an online, asynchronous fashion. We demonstrate our approach by training an RBM composed of leaky I&F neurons with STDP synapses to learn a generative model of the MNIST hand-written digit dataset, and by testing it in recognition, generation and cue integration tasks. Our results contribute to a machine learning-driven approach for synthesizing networks of spiking neurons capable of carrying out practical, high-level functionality.
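
    For orientation, the conventional discrete-step Contrastive Divergence that the abstract contrasts with its event-driven variant can be sketched as below. This is a minimal CD-1 illustration in NumPy, not the authors' spiking implementation; the function names (`cd1_step`, `train_rbm`), the learning rate, the epoch count, and the toy 4-pixel data are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, b_vis, b_hid, v0, lr, rng):
    """One CD-1 update on a batch v0 (batch x n_visible) of binary vectors.

    Updates W, b_vis and b_hid in place and returns the mean squared
    reconstruction error measured before the update.
    """
    # Positive phase: hidden probabilities and a binary sample given the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: a single Gibbs step back to the visible layer and up again.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_hid)
    # Discrete parameter update: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)
    return float(np.mean((v0 - p_v1) ** 2))

def train_rbm(data, n_hidden, epochs=300, lr=0.1, seed=0):
    """Full-batch CD-1 training; returns parameters and per-epoch errors."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    errors = [cd1_step(W, b_vis, b_hid, data, lr, rng) for _ in range(epochs)]
    return W, b_vis, b_hid, errors

# Hypothetical toy data: two 4-pixel binary patterns with unequal frequency.
pattern_a = [1.0, 1.0, 0.0, 0.0]
pattern_b = [0.0, 0.0, 1.0, 1.0]
data = np.array([pattern_a] * 30 + [pattern_b] * 10)

W, b_vis, b_hid, errors = train_rbm(data, n_hidden=2)
print(f"reconstruction error: first epoch {errors[0]:.3f}, last epoch {errors[-1]:.3f}")
```

    The explicit sampling steps and the synchronous weight update in `cd1_step` are the "discrete updates" that, according to the abstract, do not map directly onto a dynamical neural substrate.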

  6. Event-driven contrastive divergence for spiking neuromorphic systems

    PubMed Central

    Neftci, Emre; Das, Srinjoy; Pedroni, Bruno; Kreutz-Delgado, Kenneth; Cauwenberghs, Gert

    2014-01-01

    Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been demonstrated to perform efficiently in a variety of applications, such as dimensionality reduction, feature learning, and classification. Their implementation on neuromorphic hardware platforms emulating large-scale networks of spiking neurons can have significant advantages from the perspectives of scalability, power dissipation and real-time interfacing with the environment. However, the traditional RBM architecture and the commonly used training algorithm known as Contrastive Divergence (CD) are based on discrete updates and exact arithmetics which do not directly map onto a dynamical neural substrate. Here, we present an event-driven variation of CD to train a RBM constructed with Integrate & Fire (I&F) neurons, that is constrained by the limitations of existing and near future neuromorphic hardware platforms. Our strategy is based on neural sampling, which allows us to synthesize a spiking neural network that samples from a target Boltzmann distribution. The recurrent activity of the network replaces the discrete steps of the CD algorithm, while Spike Time Dependent Plasticity (STDP) carries out the weight updates in an online, asynchronous fashion. We demonstrate our approach by training an RBM composed of leaky I&F neurons with STDP synapses to learn a generative model of the MNIST hand-written digit dataset, and by testing it in recognition, generation and cue integration tasks. Our results contribute to a machine learning-driven approach for synthesizing networks of spiking neurons capable of carrying out practical, high-level functionality. PMID:24574952

  7. Circle Hough transform implementation for dots recognition in braille cells

    NASA Astrophysics Data System (ADS)

    Jacinto Gómez, Edwar; Montiel Ariza, Holman; Martínez Sarmiento, Fredy Hernán.

    2017-02-01

    This paper presents a technique based on the Circle Hough Transform (CHT) for optical Braille recognition (OBR). Unlike other work on the same topic, it uses the Hough transform to recognize and transcribe Braille cells, and shows CHT to be an appropriate technique for coping with the non-systematic factors that can affect the process, such as the type of paper on which the text to be transcribed is placed, lighting conditions, input image resolution, and flaws introduced by the capture process, which is carried out with a scanner. Tests are performed on a local database of text produced by sighted people and transcripts produced by blind people, with the support of the National Institute for Blind People (INCI, by its Spanish acronym) in Colombia.
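
    The voting idea behind a Circle Hough Transform can be sketched as follows. This is a minimal NumPy illustration under strong assumptions (a single known dot radius, and edge points already extracted), not the pipeline used in the paper; `circle_hough`, `ring_points`, and the synthetic dot are hypothetical names and data.

```python
import numpy as np

def circle_hough(edge_points, shape, radius, n_angles=72):
    """Vote for circle centres of a known radius.

    edge_points: iterable of (row, col) edge pixels.
    shape: (rows, cols) of the accumulator (same as the image).
    """
    acc = np.zeros(shape, dtype=int)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    for y, x in edge_points:
        # Every centre that could have produced this edge pixel lies on a
        # circle of the same radius around the pixel.
        cy = np.rint(y - radius * np.sin(angles)).astype(int)
        cx = np.rint(x - radius * np.cos(angles)).astype(int)
        ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        # Each edge pixel votes at most once per accumulator cell.
        cells = np.unique(np.stack([cy[ok], cx[ok]], axis=1), axis=0)
        acc[cells[:, 0], cells[:, 1]] += 1
    return acc

def ring_points(center, radius, n=360):
    """Pixel coordinates on the outline of a synthetic dot."""
    cy, cx = center
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    ys = np.rint(cy + radius * np.sin(t)).astype(int)
    xs = np.rint(cx + radius * np.cos(t)).astype(int)
    return sorted(set(zip(ys.tolist(), xs.tolist())))

# Synthetic example: one dot of radius 5 centred at row 20, column 30.
edges = ring_points((20, 30), 5)
acc = circle_hough(edges, shape=(40, 60), radius=5)
peak = np.unravel_index(np.argmax(acc), acc.shape)
print("strongest centre candidate (row, col):", peak)
```

    Because of pixel rounding, the strongest accumulator cell should fall at or immediately next to the true centre; a practical system would also need edge detection, a range of radii, and grouping of detected dots into six-dot cells.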

  8. Figure Text Extraction in Biomedical Literature

    PubMed Central

    Kim, Daehyun; Yu, Hong

    2011-01-01

    Background Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. Methodology We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. Results/Conclusions The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. 
    FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for text extraction. In addition, our results show that FigTExT can extract texts that do not appear in figure captions or other associated text, further suggesting the potential utility of FigTExT for improving figure search. PMID:21249186
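
    The F1-scores quoted in this abstract follow from the quoted precision and recall values through the standard harmonic-mean definition, F1 = 2PR / (P + R). The short check below recomputes them; the function name `f1_score` and the dictionary labels are chosen for this example only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %, reported F1 %) as quoted in the abstract.
reported = {
    "FigTExT, text localization": (84.0, 98.0, 90.0),
    "FigTExT, figure text extraction": (62.5, 51.0, 56.2),
    "FigTExT, important content only": (87.3, 68.8, 77.0),
    "off-the-shelf OCR, text extraction": (36.6, 19.3, 25.3),
}

for label, (p, r, f1_reported) in reported.items():
    print(f"{label}: computed F1 = {f1_score(p, r):.1f}%, reported = {f1_reported}%")
```

    Each recomputed value agrees with the reported figure to within rounding.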

  9. Figure text extraction in biomedical literature.

    PubMed

    Kim, Daehyun; Yu, Hong

    2011-01-13

    Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. 
    FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for text extraction. In addition, our results show that FigTExT can extract texts that do not appear in figure captions or other associated text, further suggesting the potential utility of FigTExT for improving figure search.

  10. Image based book cover recognition and retrieval

    NASA Astrophysics Data System (ADS)

    Sukhadan, Kalyani; Vijayarajan, V.; Krishnamoorthi, A.; Bessie Amali, D. Geraldine

    2017-11-01

    We are developing a MATLAB graphical user interface that lets users look up information about books in real time. A photo of the book cover is taken through the GUI; the MSER algorithm then automatically detects features in the input image, and non-text features are filtered out based on morphological differences between text and non-text regions. We implemented a text character alignment algorithm that improves the accuracy of the original text detection. We also examine MATLAB's built-in OCR algorithm and a commonly used open-source OCR engine to obtain better detection results; a post-detection algorithm and natural language processing are applied for word correction and suppression of false detections. Finally, the detection result is matched against online sources. The algorithm achieves more than 86% accuracy.

  11. Computerized literature reference system: use of an optical scanner and optical character recognition software.

    PubMed

    Lossef, S V; Schwartz, L H

    1990-09-01

    A computerized reference system for radiology journal articles was developed by using an IBM-compatible personal computer with a hand-held optical scanner and optical character recognition software. This allows direct entry of scanned text from printed material into word processing or data-base files. Additionally, line diagrams and photographs of radiographs can be incorporated into these files. A text search and retrieval software program enables rapid searching for keywords in scanned documents. The hand scanner and software programs are commercially available, relatively inexpensive, and easily used. This permits construction of a personalized radiology literature file of readily accessible text and images requiring minimal typing or keystroke entry.

  12. Proceedings of the Annual Meeting of the Association for Education in Journalism and Mass Communication (80th, Chicago, Illinois, July 30-August 2, 1997). Addenda I.

    ERIC Educational Resources Information Center

    Association for Education in Journalism and Mass Communication.

    The 16 papers in the first section of the Addenda to this proceedings are: (1) "Shipboard News: Nineteenth Century Handwritten Periodicals at Sea" (Roy Alden Atwood); (2) "The International Institutional Press Association, 1966-1968" (Constance Ledoux Book); (3) "44 Liquormart--A Prescription for Commercial Speech: Return…

  13. Typing Compared with Handwriting for Essay Examinations at University: Letting the Students Choose

    ERIC Educational Resources Information Center

    Mogey, Nora; Paterson, Jessie; Burk, John; Purcell, Michael

    2010-01-01

    Students at the University of Edinburgh do almost all their work on computers, but at the end of the semester they are examined by handwritten essays. Intuitively it would be appealing to allow students the choice of handwriting or typing, but this raises a concern that perhaps this might not be "fair"--that the choice a student makes,…

  14. What, How and Why? A Multi-Dimensional Case Analysis of the Challenges Facing Native and Non- Native EFL Teachers

    ERIC Educational Resources Information Center

    Demir, Yusuf

    2017-01-01

    On a multifaceted basis, this paper explores the challenges experienced by native and non-native English language teachers (NESTs and NNESTs) in a tertiary-level EFL setting in Turkey. Adopting a qualitative case study design, the data were gathered from five NESTs through interviews and from five NNESTs through hand-written accounts based on the…

  15. NEW BEDFORD, FLOOD HYDROGRAPH PACKAGE (COMPUTER DISKETTE, HANDWRITTEN SAMPLING AND ANALYSIS INFORMATION, AND RELATED CORRESPONDENCE FROM OLKO ENGINEERING ARE ATTACHED), 10-01-1984, SDMS# 64536

    EPA Pesticide Factsheets

    2012-06-28


  16. Learning Sparse Feature Representations using Probabilistic Quadtrees and Deep Belief Nets

    DTIC Science & Technology

    2015-04-24

    Feature Representations using Probabilistic Quadtrees and Deep Belief Nets Learning sparse feature representations is a useful instrument for solving an...novel framework for the classification of handwritten digits that learns sparse representations using probabilistic quadtrees and Deep Belief Nets... Learning Sparse Feature Representations using Probabilistic Quadtrees and Deep Belief Nets Report Title Learning sparse feature representations is a useful

  17. The Packing Property

    DTIC Science & Technology

    2000-11-01

    Discrete Math. 115, 141-152. [7] Edmonds J., Giles R. (1977) A Min-Max relation for submodular functions on graphs, Annals of Discrete Math. 1, 185...projective planes, handwritten manuscript, published: (1990) Polyhedral Combinatorics (W. Cook, P.D. Seymour eds.), DIMACS Series in Discrete Math. and...Theoretical Computer Science 1, 101-105. [11] Lovasz L. (1972) Normal hypergraphs and the perfect graph conjecture, Discrete Math. 2, 253-267. [12

  18. Non-Roman Font Generation Via Interactive Computer Graphics,

    DTIC Science & Technology

    1986-07-01

    sets of kana representing the same set of sounds: hiragana, a cursive script for transcribing native Japanese words (including those borrowed low from...used for transcribing spoken Japanese into written language. Hiragana have a cursive (handwritten) appearance. homophone A syllable or word which is...language into written form. These symbol sets are syllabaries. (see also hiragana, katakana) kanji "Chinese characters" (Japanese). (see also hanzi

  19. A Mis-recognized Medical Vocabulary Correction System for Speech-based Electronic Medical Record

    PubMed Central

    Seo, Hwa Jeong; Kim, Ju Han; Sakabe, Nagamasa

    2002-01-01

    Speech recognition as an input tool for electronic medical records (EMR) enables efficient data entry at the point of care. However, the recognition accuracy for medical vocabulary is much poorer than that for doctor-patient dialogue. We developed a mis-recognized medical vocabulary correction system based on syllable-by-syllable comparison of speech text against a medical vocabulary database. Using specialty medical vocabulary, the algorithm detects and corrects mis-recognized medical terms in narrative text. Our preliminary evaluation showed 94% accuracy in mis-recognized medical vocabulary correction.
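
    One way to picture a syllable-by-syllable comparison against a vocabulary is sketched below. The abstract does not describe the actual matching rule, so everything here is an assumption for illustration: terms are given as pre-split syllable tuples, similarity is the ratio computed by Python's standard `difflib.SequenceMatcher`, and the 0.6 threshold, the function names, and the three-term vocabulary are hypothetical.

```python
from difflib import SequenceMatcher

def syllable_similarity(a, b):
    """Similarity in [0, 1] between two syllable sequences."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()

def correct_term(heard, vocabulary, threshold=0.6):
    """Return the closest vocabulary term, or `heard` if nothing is close enough."""
    best = max(vocabulary, key=lambda term: syllable_similarity(heard, term))
    if syllable_similarity(heard, best) >= threshold:
        return best
    return heard

# Hypothetical vocabulary, already split into syllables.
vocabulary = [
    ("hy", "per", "ten", "sion"),
    ("hy", "po", "ten", "sion"),
    ("pneu", "mo", "nia"),
]

heard = ("hy", "per", "ten", "shun")  # a mis-recognized rendering
print("-".join(correct_term(heard, vocabulary)))
print("-".join(correct_term(("ta", "ble"), vocabulary)))
```

    In this toy case the mis-recognized sequence shares three of four syllables with one entry and is replaced by it, while a word with no syllable overlap is left unchanged. A real system would also need a syllabification step and a much larger vocabulary database.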

  20. The Effects of Noisy Data on Text Retrieval.

    ERIC Educational Resources Information Center

    Taghva, Kazem; And Others

    1994-01-01

    Discusses the use of optical character recognition (OCR) for inputting documents in an information retrieval system and describes a study that used an OCR-generated database and its corresponding corrected version to examine query evaluation in the presence of noisy data. Scanning technology, recognition technology, and retrieval technology are…

  1. Computer-Aided Authoring of Programmed Instruction for Teaching Symbol Recognition. Final Report.

    ERIC Educational Resources Information Center

    Braby, Richard; And Others

    This description of AUTHOR, a computer program for the automated authoring of programmed texts designed to teach symbol recognition, includes discussions of the learning strategies incorporated in the design of the instructional materials, hardware description and the algorithm for the software, and current and future developments. Appendices…

  2. Concept Recognition in an Automatic Text-Processing System for the Life Sciences.

    ERIC Educational Resources Information Center

    Vleduts-Stokolov, Natasha

    1987-01-01

    Describes a system developed for the automatic recognition of biological concepts in titles of scientific articles; reports results of several pilot experiments which tested the system's performance; analyzes typical ambiguity problems encountered by the system; describes a disambiguation technique that was developed; and discusses future plans…

  3. Gimli: open source and high-performance biomedical name recognition

    PubMed Central

    2013-01-01

    Background Automatic recognition of biomedical names is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. In recent years, various solutions have been implemented to tackle this problem. However, limitations regarding system characteristics, customization and usability still hinder their wider application outside text mining research. Results We present Gimli, an open-source, state-of-the-art tool for automatic recognition of biomedical names. Gimli includes an extended set of implemented and user-selectable features, such as orthographic, morphological, linguistic-based, conjunctions and dictionary-based. A simple and fast method to combine different trained models is also provided. Gimli achieves an F-measure of 87.17% on GENETAG and 72.23% on JNLPBA corpus, significantly outperforming existing open-source solutions. Conclusions Gimli is an off-the-shelf, ready to use tool for named-entity recognition, providing trained and optimized models for recognition of biomedical entities from scientific text. It can be used as a command line tool, offering full functionality, including training of new models and customization of the feature set and model parameters through a configuration file. Advanced users can integrate Gimli in their text mining workflows through the provided library, and extend or adapt its functionalities. Based on the underlying system characteristics and functionality, both for final users and developers, and on the reported performance results, we believe that Gimli is a state-of-the-art solution for biomedical NER, contributing to faster and better research in the field. Gimli is freely available at http://bioinformatics.ua.pt/gimli. PMID:23413997

  4. Automatic recognition of disorders, findings, pharmaceuticals and body structures from clinical text: an annotation and machine learning study.

    PubMed

    Skeppstedt, Maria; Kvist, Maria; Nilsson, Gunnar H; Dalianis, Hercules

    2014-06-01

    Automatic recognition of clinical entities in the narrative text of health records is useful for constructing applications for documentation of patient care, as well as for secondary usage in the form of medical knowledge extraction. There are a number of named entity recognition studies on English clinical text, but less work has been carried out on clinical text in other languages. This study was performed on Swedish health records, and focused on four entities that are highly relevant for constructing a patient overview and for medical hypothesis generation, namely the entities: Disorder, Finding, Pharmaceutical Drug and Body Structure. The study had two aims: to explore how well named entity recognition methods previously applied to English clinical text perform on similar texts written in Swedish; and to evaluate whether it is meaningful to divide the more general category Medical Problem, which has been used in a number of previous studies, into the two more granular entities, Disorder and Finding. Clinical notes from a Swedish internal medicine emergency unit were annotated for the four selected entity categories, and the inter-annotator agreement between two pairs of annotators was measured, resulting in an average F-score of 0.79 for Disorder, 0.66 for Finding, 0.90 for Pharmaceutical Drug and 0.80 for Body Structure. A subset of the developed corpus was thereafter used for finding suitable features for training a conditional random fields model. Finally, a new model was trained on this subset, using the best features and settings, and its ability to generalise to held-out data was evaluated. This final model obtained an F-score of 0.81 for Disorder, 0.69 for Finding, 0.88 for Pharmaceutical Drug, 0.85 for Body Structure and 0.78 for the combined category Disorder+Finding. 
    The obtained results, which are in line with or slightly lower than those for similar studies on English clinical text, many of them conducted using a larger training data set, show that the approaches used for English are also suitable for Swedish clinical text. However, a small proportion of the errors made by the model are less likely to occur in English text, showing that results might be improved by further tailoring the system to clinical Swedish. The entity recognition results for the individual entities Disorder and Finding show that it is meaningful to separate the general category Medical Problem into these two more granular entity types, e.g. for knowledge mining of co-morbidity relations and disorder-finding relations. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  5. Lexical Information in Memory for Text.

    ERIC Educational Resources Information Center

    Hayes-Roth, Barbara

    Cued-recall and two-alternative, forced-choice recognition measures were used to evaluate subjects' retention of the specific wordings of studied texts. Results obtained after 10-minute and 24 hour retention intervals suggest that the studied wordings of texts are functional components of their memory representations. Theories that assume…

  6. Offline Arabic handwriting recognition: a survey.

    PubMed

    Lorigo, Liana M; Govindaraju, Venu

    2006-05-01

    The automatic recognition of text on scanned images has enabled many applications such as searching for words in large volumes of documents, automatic sorting of postal mail, and convenient editing of previously printed documents. The domain of handwriting in the Arabic script presents unique technical challenges and has been addressed more recently than other domains. Many different methods have been proposed and applied to various types of images. This paper provides a comprehensive review of these methods. It is the first survey to focus on Arabic handwriting recognition and the first Arabic character recognition survey to provide recognition rates and descriptions of test data for the approaches discussed. It includes background on the field, discussion of the methods, and future research directions.

  7. Incorporating domain knowledge in chemical and biomedical named entity recognition with word representations.

    PubMed

    Munkhdalai, Tsendsuren; Li, Meijing; Batsuren, Khuyagbaatar; Park, Hyeon Ah; Choi, Nak Hyeon; Ryu, Keun Ho

    2015-01-01

    Chemical and biomedical Named Entity Recognition (NER) is an essential prerequisite task before effective text mining can begin for biochemical-text data. Exploiting unlabeled text data to leverage system performance has been an active and challenging research topic in text mining due to the recent growth in the amount of biomedical literature. We present a semi-supervised learning method that efficiently exploits unlabeled data in order to incorporate domain knowledge into a named entity recognition model and to leverage system performance. The proposed method includes Natural Language Processing (NLP) tasks for text preprocessing, learning word representation features from a large amount of text data for feature extraction, and conditional random fields for token classification. Other than the free text in the domain, the proposed method does not rely on any lexicon nor any dictionary in order to keep the system applicable to other NER tasks in bio-text data. We extended BANNER, a biomedical NER system, with the proposed method. This yields an integrated system that can be applied to chemical and drug NER or biomedical NER. We call our branch of the BANNER system BANNER-CHEMDNER, which is scalable over millions of documents, processing about 530 documents per minute, is configurable via XML, and can be plugged into other systems by using the BANNER Unstructured Information Management Architecture (UIMA) interface. BANNER-CHEMDNER achieved an 85.68% and an 86.47% F-measure on the testing sets of CHEMDNER Chemical Entity Mention (CEM) and Chemical Document Indexing (CDI) subtasks, respectively, and achieved an 87.04% F-measure on the official testing set of the BioCreative II gene mention task, showing remarkable performance in both chemical and biomedical NER. BANNER-CHEMDNER system is available at: https://bitbucket.org/tsendeemts/banner-chemdner.

  8. Mining Adverse Drug Reactions in Social Media with Named Entity Recognition and Semantic Methods.

    PubMed

    Chen, Xiaoyi; Deldossi, Myrtille; Aboukhamis, Rim; Faviez, Carole; Dahamna, Badisse; Karapetiantz, Pierre; Guenegou-Arnoux, Armelle; Girardeau, Yannick; Guillemin-Lanne, Sylvie; Lillo-Le-Louët, Agnès; Texier, Nathalie; Burgun, Anita; Katsahian, Sandrine

    2017-01-01

    Suspected adverse drug reactions (ADR) reported by patients through social media can be a complementary source to current pharmacovigilance systems. However, the performance of text mining tools applied to social media text data to discover ADRs needs to be evaluated. In this paper, we introduce the approach developed to mine ADR from French social media. A protocol of evaluation is highlighted, which includes a detailed sample size determination and evaluation corpus constitution. Our text mining approach provided very encouraging preliminary results with F-measures of 0.94 and 0.81 for recognition of drugs and symptoms respectively, and with F-measure of 0.70 for ADR detection. Therefore, this approach is promising for downstream pharmacovigilance analysis.

  9. Real-Time Control of an Exoskeleton Hand Robot with Myoelectric Pattern Recognition.

    PubMed

    Lu, Zhiyuan; Chen, Xiang; Zhang, Xu; Tong, Kay-Yu; Zhou, Ping

    2017-08-01

    Robot-assisted training provides an effective approach to neurological injury rehabilitation. To meet the challenge of hand rehabilitation after neurological injuries, this study presents an advanced myoelectric pattern recognition scheme for real-time intention-driven control of a hand exoskeleton. The developed scheme detects and recognizes the user's intention among six different hand motions using four channels of surface electromyography (EMG) signals acquired from the forearm and hand muscles, and then drives the exoskeleton to assist the user in accomplishing the intended motion. The system was tested with eight neurologically intact subjects and two individuals with spinal cord injury (SCI). The overall control accuracy was [Formula: see text] for the neurologically intact subjects and [Formula: see text] for the SCI subjects. The total lag of the system was approximately 250 ms, including data acquisition, transmission and processing. One SCI subject also participated in training sessions in his second and third visits. Both the control accuracy and efficiency tended to improve. These results show great potential for applying the advanced myoelectric pattern recognition control of the wearable robotic hand system toward improving hand function after neurological injuries.

  10. Draft of the U.S. Constitution (August 1787) and Schedule of the Compensation of the Senate of the United States (March 1791)

    ERIC Educational Resources Information Center

    Hussey, Michael; Greenhut, Stephanie

    2011-01-01

    This article features two documents which can serve as a starting point for a lesson on public service while students debate the amount of pay that public servants should receive. These are: (1) the printed draft of the Constitution showing George Washington's handwritten corrections that eliminated state payments and included the phrase "to be…

  11. Do Students Get Higher Scores on Their Word-Processed Papers? A Study of Bias in Scoring Hand-Written vs. Word-Processed Papers.

    ERIC Educational Resources Information Center

    Arnold, Voiza; And Others

    In 1990, a study was conducted at Rio Hondo College (Whittier, California) to determine if readers exhibited any bias in scoring test papers that were composed on a word processor as opposed to being written by hand. The study began with the formulation of tentative pilot study questions and the development of procedures to address them. Three…

  12. 'Do not attempt resuscitation'--do standardised order forms make a clinical difference above hand-written note entries?

    PubMed

    Lewis, Keir Edward; Edwards, Victoria Middleton; Hall, Sian; Temblett, Paul; Hutchings, Hayley

    2009-01-01

    To quantify any effect of Standardised Order Forms (SOFs), versus hand-written note entries for 'Do Not Attempt Resuscitation'--on the selection and survival of remaining cardiopulmonary resuscitation (CPR) attempts. A prospective, observational study in two UK Hospitals, comparing numbers, demographics and survival rates from CPR attempts for 2 years prior to and 2 years after the introduction of SOFs (the only change in DNAR policy). There were 133 CPR attempts, representing 0.30% of the 44,792 admissions, pre SOFs and 147 CPR attempts representing 0.32% of the 45,340 admissions following the SOFs (p=0.46). The median duration of a CPR attempt was 11min prior to and 15min following the SOFs (p=0.02). Of the CPR attempts, there was no change in mean age (p=0.34), proportions occurring outside working hours (p=0.70) or proportions presenting with an initial shockable rhythm (p=0.30). Survival to discharge following CPR was unchanged (p=0.23). The introduction of SOFs for DNAR orders was associated with a significantly longer duration of CPR (on average by 3-4min) but no difference in overall number, demographics or type of arrest or survival in the remaining CPR attempts.
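
    The two proportions quoted in this abstract can be re-derived directly from the counts it reports. The short sketch below does only that arithmetic. The helper name is hypothetical, and the sketch does not attempt to reproduce the reported p-values, since the abstract does not say which statistical test was used.

```python
def percentage(events: int, total: int) -> float:
    """Share of `events` in `total`, expressed as a percentage."""
    return 100.0 * events / total


# Counts as reported in the abstract.
pre_sof = percentage(133, 44792)   # CPR attempts per admission before the forms
post_sof = percentage(147, 45340)  # CPR attempts per admission after the forms

print(f"pre-SOF: {pre_sof:.2f}%  post-SOF: {post_sof:.2f}%")
```

    Rounded to two decimal places, both values agree with the 0.30% and 0.32% given in the abstract.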

  13. Keywords image retrieval in historical handwritten Arabic documents

    NASA Astrophysics Data System (ADS)

    Saabni, Raid; El-Sana, Jihad

    2013-01-01

    A system is presented for spotting and searching keywords in handwritten Arabic documents. A slightly modified dynamic time warping algorithm is used to measure similarities between words. Two sets of features are generated from the outer contour of the words/word-parts. The first set is based on the angles between nodes on the contour and the second set is based on the shape context features taken from the outer contour. To recognize a given word, the segmentation-free approach is partially adopted, i.e., continuous word parts are used as the basic alphabet, instead of individual characters or complete words. Additional strokes, such as dots and detached short segments, are classified and used in a postprocessing step to determine the final comparison decision. The search for a keyword is performed by the search for its word parts given in the correct order. The performance of the presented system was very encouraging in terms of efficiency and match rates. To evaluate the presented system its performance is compared to three different systems. Unfortunately, there are no publicly available standard datasets with ground truth for testing Arabic key word searching systems. Therefore, a private set of images partially taken from Juma'a Al-Majid Center in Dubai for evaluation is used, while using a slightly modified version of the IFN/ENIT database for training.
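
    The word-matching step described above rests on dynamic time warping (DTW). The following is a minimal sketch of the classic, unmodified DTW distance between two one-dimensional feature sequences. It is not the authors' slightly modified variant, whose details the abstract does not give. Treating each word part as a sequence of contour angles, and the sample values used, are assumptions made for illustration.

```python
from math import inf


def dtw_distance(a: list[float], b: list[float]) -> float:
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical contour-angle sequences (degrees) for a query and two candidates.
query = [10.0, 45.0, 90.0, 45.0]
same_shape_stretched = [10.0, 45.0, 45.0, 90.0, 90.0, 45.0]
different_shape = [90.0, 10.0, 10.0, 90.0]

print(dtw_distance(query, same_shape_stretched))
print(dtw_distance(query, different_shape))
```

    DTW tolerates local stretching and compression, which is why the stretched copy of the query aligns at no cost while the differently shaped sequence does not. That tolerance suits handwriting, where the same word part is rarely written at the same width twice.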

  14. Examining Authenticity: An Initial Exploration of the Suitability of Handwritten Electronic Signatures.

    PubMed

    Heckeroth, J; Boywitt, C D

    2017-06-01

    Considering the increasing relevance of handwritten electronically captured signatures, we evaluated the ability of forensic handwriting examiners (FHEs) to distinguish between authentic and simulated electronic signatures. Sixty-six professional FHEs examined the authenticity of electronic signatures captured with software by signotec on a smartphone Galaxy Note 4 by Samsung and signatures made with a ballpoint pen on paper (conventional signatures). In addition, we experimentally varied the name ("J. König" vs. "A. Zaiser") and the status (authentic vs. simulated) of the signatures in question. FHEs' conclusions about the authenticity did not show a statistically significant general difference between electronic and conventional signatures. Furthermore, no significant discrepancies between electronic and conventional signatures were found with regard to other important aspects of the authenticity examination such as questioned signatures' graphic information content, the suitability of the provided sample signatures, the necessity of further examinations and the levels of difficulty of the cases under examination. Thus, this study did not reveal any indications that electronic signatures captured with software by signotec on a Galaxy Note 4 are less well suited than conventional signatures for the examination of authenticity, precluding potential technical problems concerning the integrity of electronic signatures. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Using unstructured diaries for primary data collection.

    PubMed

    Thomas, Juliet Anne

    2015-05-01

    To give a reflective account of using unstructured handwritten diaries as a method of collecting qualitative data. Diaries are primarily used in research as a method of collecting qualitative data. There are some challenges associated with their use, including compliance rates. However, they can provide a rich source of meaningful data and can avoid the difficulties of participants trying to precisely recall events after some time has elapsed. The author used unstructured handwritten diaries as her primary method of collecting data during her grounded theory doctoral study, when she examined the professional socialisation of nursing students. Over two years, 26 participants selected from four consecutive recruited groups of nursing students volunteered to take part in the study and were asked to keep a daily diary throughout their first five weeks of clinical experience. When using open-ended research questions, grounded theory's pragmatic approach permits the examination of processes thereby creating conceptual interpretive understanding of data. A wealth of rich, detailed data was obtained from the diaries that permitted the development of new theories regarding the effects early clinical experiences have on nursing students' professional socialisation. Diaries were found to provide insightful in-depth qualitative data in a resource-friendly manner. Nurse researchers should consider using diaries as an alternative to more commonly used approaches to collecting qualitative data.

  16. The Interaction between Central and Peripheral Processing in Chinese Handwritten Production: Evidence from the Effect of Lexicality and Radical Complexity

    PubMed Central

    Zhang, Qingfang; Feng, Chen

    2017-01-01

    The interaction between central and peripheral processing in written word production remains controversial. This study aims to investigate whether the effects of radical complexity and lexicality in central processing cascade into peripheral processing in Chinese written word production. The participants were asked to write characters and non-characters (lexicality) with different radical complexity (few- and many-strokes). The findings indicated that regardless of the lexicality, the writing latencies were longer for characters with higher complexity (the many-strokes condition) than for characters with lower complexity (the few-strokes condition). The participants slowed down their writing execution at the radicals' boundary strokes, which indicated a radical boundary effect in peripheral processing. Interestingly, the lexicality and the radical complexity affected the pattern of shift velocity and writing velocity during the execution of writing. Lexical processing cascades into peripheral processing but only at the beginning of Chinese characters. In contrast, the radical complexity influenced the execution of handwriting movement throughout the entire character, and the pattern of the effect interacted with the character frequency. These results suggest that the processes of the lexicality and the radical complexity function during the execution of handwritten word production, which suggests that central processing cascades over peripheral processing during Chinese characters handwriting. PMID:28348536

  17. Syntactic error modeling and scoring normalization in speech recognition: Error modeling and scoring normalization in the speech recognition task for adult literacy training

    NASA Technical Reports Server (NTRS)

    Olorenshaw, Lex; Trawick, David

    1991-01-01

    The purpose was to develop a speech recognition system to be able to detect speech which is pronounced incorrectly, given that the text of the spoken speech is known to the recognizer. Better mechanisms are provided for using speech recognition in a literacy tutor application. Using a combination of scoring normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was provided. In continuous speech, the system was tested to be able to provide above 80 pct. correct acceptance of words, while correctly rejecting over 80 pct. of incorrectly pronounced words.

  18. Limited Role of Contextual Information in Adult Word Recognition. Technical Report No. 411.

    ERIC Educational Resources Information Center

    Durgunoglu, Aydin Y.

    Recognizing a word in a meaningful text involves processes that combine information from many different sources, and both bottom-up processes (such as feature extraction and letter recognition) and top-down processes (contextual information) are thought to interact when skilled readers recognize words. Two similar experiments investigated word…

  19. Cross domains Arabic named entity recognition system

    NASA Astrophysics Data System (ADS)

    Al-Ahmari, S. Saad; Abdullatif Al-Johar, B.

    2016-07-01

    Named Entity Recognition (NER) plays an important role in many Natural Language Processing (NLP) applications such as Information Extraction (IE), Question Answering (QA), Text Clustering, Text Summarization and Word Sense Disambiguation. This paper presents the development and implementation of a domain-independent system to recognize three types of Arabic named entities. The system is based on a set of domain-independent grammar rules along with an Arabic part-of-speech tagger, in addition to gazetteers and lists of trigger words. The experimental results show that the system performed as well as other systems, with better results in some cases on cross-domain corpora.

  20. Review of chart recognition in document images

    NASA Astrophysics Data System (ADS)

    Liu, Yan; Lu, Xiaoqing; Qin, Yeyang; Tang, Zhi; Xu, Jianbo

    2013-01-01

    As an effective means of conveying information, charts are widely used to represent scientific and statistical data in books, research papers, newspapers, etc. Though textual information is still the major source of data, there has been an increasing trend of introducing graphs, pictures, and figures into the information pool. Text recognition for documents has been accomplished using optical character recognition (OCR) software. Chart recognition, a necessary supplement to OCR for document images, remains an unsolved problem owing to the great subjectivity and variety of chart styles. This paper reviews the development of chart recognition techniques over the past decades and presents the focus of current research. The whole process of chart recognition is presented systematically, and mainly includes three parts: chart segmentation, chart classification, and chart interpretation. For each part, the latest research work is introduced. Finally, the paper concludes with a summary and promising future research directions.

  1. Using speech recognition to enhance the Tongue Drive System functionality in computer access.

    PubMed

    Huo, Xueliang; Ghovanloo, Maysam

    2011-01-01

    Tongue Drive System (TDS) is a wireless tongue operated assistive technology (AT), which can enable people with severe physical disabilities to access computers and drive powered wheelchairs using their volitional tongue movements. TDS offers six discrete commands, simultaneously available to the users, for pointing and typing as a substitute for mouse and keyboard in computer access, respectively. To enhance the TDS performance in typing, we have added a microphone, an audio codec, and a wireless audio link to its readily available 3-axial magnetic sensor array, and combined it with a commercially available speech recognition software, the Dragon Naturally Speaking, which is regarded as one of the most efficient ways for text entry. Our preliminary evaluations indicate that the combined TDS and speech recognition technologies can provide end users with significantly higher performance than using each technology alone, particularly in completing tasks that require both pointing and text entry, such as web surfing.

  2. Facilitating Comprehension of Non-Native English Speakers during Lectures in English with STR-Texts

    ERIC Educational Resources Information Center

    Shadiev, Rustam; Wu, Ting-Ting; Huang, Yueh-Min

    2018-01-01

    We provided texts generated by speech-to-text recognition (STR) technology for non-native English speaking students during lectures in English in order to test whether STR-texts were useful for enhancing students' comprehension of lectures. To this end, we carried out an experiment in which 60 participants were randomly assigned to a control group…

  3. Analysis Of The IJCNN 2011 UTL Challenge

    DTIC Science & Technology

    2012-01-13

    large datasets from various application domains: handwriting recognition, image recognition, video processing, text processing, and ecology. The goal...validation and final evaluation sets consist of 4096 examples each. Dataset Domain Features Sparsity Devel. Transf. AVICENNA Handwriting 120 0% 150205...documents [3]. Transfer learning methods could accelerate the application of handwriting recognizers to historical manuscript by reducing the need for

  4. Digitization of Full-Text Documents Before Publishing on the Internet: A Case Study Reviewing the Latest Optical Character Recognition Technologies.

    ERIC Educational Resources Information Center

    McClean, Clare M.

    1998-01-01

    Reviews strengths and weaknesses of five optical character recognition (OCR) software packages used to digitize paper documents before publishing on the Internet. Outlines options available and stages of the conversion process. Describes the learning experience of Eurotext, a United Kingdom-based electronic libraries project (eLib). (PEN)

  5. (Almost) Word for Word: As Voice Recognition Programs Improve, Students Reap the Benefits

    ERIC Educational Resources Information Center

    Smith, Mark

    2006-01-01

    Voice recognition software is hardly new--attempts at capturing spoken words and turning them into written text have been available to consumers for about two decades. But what was once an expensive and highly unreliable tool has made great strides in recent years, perhaps most recognized in programs such as Nuance's Dragon NaturallySpeaking…

  6. Speech recognition technology: an outlook for human-to-machine interaction.

    PubMed

    Erdel, T; Crooks, S

    2000-01-01

    Speech recognition, as an enabling technology in healthcare-systems computing, is a topic that has been discussed for quite some time, but is just now coming to fruition. Traditionally, speech-recognition software has been constrained by hardware, but improved processors and increased memory capacities are starting to remove some of these limitations. With these barriers removed, companies that create software for the healthcare setting have the opportunity to write more successful applications. Among the criticisms of speech-recognition applications are the high rates of error and steep training curves. However, even in the face of such negative perceptions, there remain significant opportunities for speech recognition to allow healthcare providers and, more specifically, physicians, to work more efficiently and ultimately spend more time with their patients and less time completing necessary documentation. This article will identify opportunities for inclusion of speech-recognition technology in the healthcare setting and examine major categories of speech-recognition software--continuous speech recognition, command and control, and text-to-speech. We will discuss the advantages and disadvantages of each area, the limitations of the software today, and how future trends might affect them.

  7. Assessing the impact of graphical quality on automatic text recognition in digital maps

    NASA Astrophysics Data System (ADS)

    Chiang, Yao-Yi; Leyk, Stefan; Honarvar Nazari, Narges; Moghaddam, Sima; Tan, Tian Xiang

    2016-08-01

    Converting geographic features (e.g., place names) in map images into a vector format is the first step for incorporating cartographic information into a geographic information system (GIS). With the advancement in computational power and algorithm design, map processing systems have been considerably improved over the last decade. However, the fundamental map processing techniques such as color image segmentation, (map) layer separation, and object recognition are sensitive to minor variations in graphical properties of the input image (e.g., scanning resolution). As a result, most map processing results would not meet user expectations if the user does not "properly" scan the map of interest, pre-process the map image (e.g., using compression or not), and train the processing system, accordingly. These issues could slow down the further advancement of map processing techniques as such unsuccessful attempts create a discouraged user community, and less sophisticated tools would be perceived as more viable solutions. Thus, it is important to understand what kinds of maps are suitable for automatic map processing and what types of results and process-related errors can be expected. In this paper, we shed light on these questions by using a typical map processing task, text recognition, to discuss a number of map instances that vary in suitability for automatic processing. We also present an extensive experiment on a diverse set of scanned historical maps to provide measures of baseline performance of a standard text recognition tool under varying map conditions (graphical quality) and text representations (that can vary even within the same map sheet). Our experimental results help the user understand what to expect when a fully or semi-automatic map processing system is used to process a scanned map with certain (varying) graphical properties and complexities in map content.

  8. Keyless Entry: Building a Text Database Using OCR Technology.

    ERIC Educational Resources Information Center

    Grotophorst, Clyde W.

    1989-01-01

    Discusses the use of optical character recognition (OCR) technology to produce an ASCII text database. A tutorial on digital scanning and OCR is provided, and a systems integration project which used the Calera CDP-3000XF scanner and text retrieval software to construct a database of dissertations at George Mason University is described. (four…

  9. 14. 'ANNISQUAM POINT JAN. 4, 1898.' Photocopy of photograph ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    14. 'ANNISQUAM POINT -- JAN. 4, 1898.' Photocopy of photograph (original glass plate negative #T86 in the collection of the Annisquam Historical Society, Annisquam, Massachusetts). Photographer: Martha Harvey (1862-1949). (The handwritten legend along the top edge of the photograph is scratched in the emulsion of the original glass plate negative. Consequently it reads in reverse when printed.) - Annisquam Bridge, Spanning Lobster Cove between Washington & River Streets, Gloucester, Essex County, MA

  10. Examining the Use of Computers in Writing by Learners of Japanese as a Foreign Language: Analysis of Kanji in the Handwritten and Typed Domains

    ERIC Educational Resources Information Center

    Dixon, Michael

    2012-01-01

    This study compares second-year Japanese university students' strategies to write kanji by hand with their strategies to produce the kanji characters on a computer, taking into account factors such as accuracy in writing, the amount of kanji used, the complexity of the kanji used, as well as how the characters used compare with the sequence…

  11. Network-based high level data classification.

    PubMed

    Silva, Thiago Christiano; Zhao, Liang

    2012-06-01

    Traditional supervised data classification considers only physical features (e.g., distance or similarity) of the input data. Here, this type of learning is called low level classification. On the other hand, the human (animal) brain performs both low and high orders of learning and it has facility in identifying patterns according to the semantic meaning of the input data. Data classification that considers not only physical attributes but also the pattern formation is, here, referred to as high level classification. In this paper, we propose a hybrid classification technique that combines both types of learning. The low level term can be implemented by any classification technique, while the high level term is realized by the extraction of features of the underlying network constructed from the input data. Thus, the former classifies the test instances by their physical features or class topologies, while the latter measures the compliance of the test instances to the pattern formation of the data. Our study shows that the proposed technique not only can realize classification according to the pattern formation, but also is able to improve the performance of traditional classification techniques. Furthermore, as the class configuration's complexity increases, such as the mixture among different classes, a larger portion of the high level term is required to get correct classification. This feature confirms that the high level classification has a special importance in complex situations of classification. Finally, we show how the proposed technique can be employed in a real-world application, where it is capable of identifying variations and distortions of handwritten digit images. As a result, it supplies an improvement in the overall pattern recognition rate.

  12. eBiometrics: an enhanced multi-biometrics authentication technique for real-time remote applications on mobile devices

    NASA Astrophysics Data System (ADS)

    Kuseler, Torben; Lami, Ihsan; Jassim, Sabah; Sellahewa, Harin

    2010-04-01

    The use of mobile communication devices with advanced sensors is growing rapidly. These sensors enable functions such as image capture, location applications, and biometric authentication such as fingerprint verification and face and handwritten signature recognition. Such ubiquitous devices are essential tools in today's global economic activities, enabling anywhere-anytime financial and business transactions. Cryptographic functions and biometric-based authentication can enhance the security and confidentiality of mobile transactions. Biometric template security techniques in real-time biometric-based authentication are key factors for successful identity verification solutions, but are vulnerable to determined attacks by both fraudulent software and hardware. The EU-funded SecurePhone project has designed and implemented a multimodal biometric user authentication system on a prototype mobile communication device. However, various implementations of this project have resulted in long verification times or reduced accuracy and/or security. This paper proposes to use built-in-self-test techniques to ensure no tampering has taken place on the verification process prior to performing the actual biometric authentication. These techniques utilise the user's personal identification number as a seed to generate a unique signature. This signature is then used to test the integrity of the verification process. Also, this study proposes the use of a combination of biometric modalities to provide application-specific authentication in a secure environment, thus achieving an optimum security level with effective processing time, i.e., ensuring that the necessary authentication steps and algorithms running on the mobile device application processor cannot be undermined or modified by an impostor to gain unauthorized access to the secure system.
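
    The self-test idea in this abstract, a signature seeded by the user's PIN that is later used to check the verification process for tampering, can be illustrated with a keyed hash. The sketch below is only an analogy under stated assumptions: the abstract does not say how the signature is generated, so the use of HMAC-SHA256, the function names, and the byte strings standing in for the verification code image are all hypothetical.

```python
import hashlib
import hmac


def integrity_signature(pin: str, code_image: bytes) -> str:
    """Keyed digest of the verification code image, seeded by the user's PIN (assumed scheme)."""
    return hmac.new(pin.encode("utf-8"), code_image, hashlib.sha256).hexdigest()


def self_test(pin: str, code_image: bytes, expected_signature: str) -> bool:
    """Return True only if the code image still matches the enrolled signature."""
    actual = integrity_signature(pin, code_image)
    # compare_digest runs in constant time to avoid leaking where the mismatch occurs.
    return hmac.compare_digest(actual, expected_signature)


# Hypothetical enrolment and later checks.
enrolled = integrity_signature("4821", b"verification-routine-v1")
print(self_test("4821", b"verification-routine-v1", enrolled))   # untouched code
print(self_test("4821", b"verification-routine-TAMPERED", enrolled))  # modified code
```

    A short numeric PIN has little entropy, so a practical design would presumably combine it with device-held secret material; the sketch omits this for brevity.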

  13. AN AUTOMATIC DEVICE FOR READING TYPOGRAPHICAL TEXTS,

    DTIC Science & Technology

    permissible. The system represents an attempt to apply the methods of machines designed for typescript reading to machines reading printed texts...Some characteristics by which typescript and typographical material differ are presented. The basic aspects of the recognition algorithm are given. A

  14. Using Workflows to Explore and Optimise Named Entity Recognition for Chemistry

    PubMed Central

    Kolluru, BalaKrishna; Hawizy, Lezan; Murray-Rust, Peter; Tsujii, Junichi; Ananiadou, Sophia

    2011-01-01

    Chemistry text mining tools should be interoperable and adaptable regardless of system-level implementation, installation or even programming issues. We aim to abstract the functionality of these tools from the underlying implementation via reconfigurable workflows for automatically identifying chemical names. To achieve this, we refactored an established named entity recogniser (in the chemistry domain), OSCAR, and studied the impact of each component on the net performance. We developed two reconfigurable workflows from OSCAR using an interoperable text mining framework, U-Compare. These workflows can be altered using the drag-&-drop mechanism of the graphical user interface of U-Compare. These workflows also provide a platform to study the relationship between text mining components such as tokenisation and named entity recognition (using maximum entropy Markov model (MEMM) and pattern recognition based classifiers). Results indicate that, for chemistry in particular, eliminating noise generated by tokenisation techniques leads to a slightly better performance than others, in terms of named entity recognition (NER) accuracy. Poor tokenisation translates into poorer input to the classifier components, which in turn leads to an increase in Type I or Type II errors, thus lowering the overall performance. On the Sciborg corpus, the workflow-based system, which uses a new tokeniser whilst retaining the same MEMM component, increases the F-score from 82.35% to 84.44%. On the PubMed corpus, it recorded an F-score of 84.84% as against 84.23% by OSCAR. PMID:21633495

  15. Using workflows to explore and optimise named entity recognition for chemistry.

    PubMed

    Kolluru, Balakrishna; Hawizy, Lezan; Murray-Rust, Peter; Tsujii, Junichi; Ananiadou, Sophia

    2011-01-01

    Chemistry text mining tools should be interoperable and adaptable regardless of system-level implementation, installation or even programming issues. We aim to abstract the functionality of these tools from the underlying implementation via reconfigurable workflows for automatically identifying chemical names. To achieve this, we refactored an established named entity recogniser (in the chemistry domain), OSCAR, and studied the impact of each component on the net performance. We developed two reconfigurable workflows from OSCAR using an interoperable text mining framework, U-Compare. These workflows can be altered using the drag-&-drop mechanism of the graphical user interface of U-Compare. These workflows also provide a platform to study the relationship between text mining components such as tokenisation and named entity recognition (using maximum entropy Markov model (MEMM) and pattern recognition based classifiers). Results indicate that, for chemistry in particular, eliminating noise generated by tokenisation techniques leads to a slightly better performance than others, in terms of named entity recognition (NER) accuracy. Poor tokenisation translates into poorer input to the classifier components, which in turn leads to an increase in Type I or Type II errors, thus lowering the overall performance. On the Sciborg corpus, the workflow-based system, which uses a new tokeniser whilst retaining the same MEMM component, increases the F-score from 82.35% to 84.44%. On the PubMed corpus, it recorded an F-score of 84.84% as against 84.23% by OSCAR.

  16. Chemical Entity Recognition and Resolution to ChEBI

    PubMed Central

    Grego, Tiago; Pesquita, Catia; Bastos, Hugo P.; Couto, Francisco M.

    2012-01-01

    Chemical entities are ubiquitous throughout the biomedical literature, and the development of text-mining systems that can efficiently identify those entities is required. Due to the lack of available corpora and data resources, the community has focused its efforts on the development of gene and protein named entity recognition systems, but with the release of ChEBI and the availability of an annotated corpus, this task can be addressed. We developed a machine-learning-based method for chemical entity recognition and a lexical-similarity-based method for chemical entity resolution and compared them with Whatizit, a popular dictionary-based method. Our methods outperformed the dictionary-based method in all tasks, yielding an improvement in F-measure of 20% for the entity recognition task, 2–5% for the entity-resolution task, and 15% for combined entity recognition and resolution tasks. PMID:25937941

  17. [Scabies and the significance of "suriones" in the handwritten manuscripts of Hildegard von Bingen].

    PubMed

    Riethe, Peter

    2006-01-01

    In her studies on nature and medicine, the "Liber simplicis medicinae" (LSM or "Physica") and the "Liber compositae medicinae" (LCM or "Causae et Curae"), Hildegard von Bingen mentions scabies (mange) in several passages. She characterizes "suren aut (= or) sneuelzen" as the cause of the disease, which she also calls "gracillimi vermiculi", that is, tiny worms that burrow into the human skin ("ubi suren aut sneuelzen hominem comedendo ledunt"). In this context the meanings of the German-ancestor terms "suren aut sneuelzen", which are found in the Latin text concerning the "Alia Mynza", are still disputed. The author discusses the question of whether Hildegard knew the cause of scabies on the basis of ancient and medieval sources as well as modern medical historical and philological/linguistic research approaches. He concludes that Hildegard was able not only to describe the symptoms exactly, but also to define the cause of the disease as a special parasite. Consequently, she differentiates other diseases of the skin, such as "grint", from scabies. The proposed interpretation of "sneuelzen" as the tick is untenable. The assumption that both terms are synonyms for Sarcoptes scabiei can be confirmed by philological and medical historical research.

  18. X-ray computed tomography applied to investigate ancient manuscripts

    NASA Astrophysics Data System (ADS)

    Bettuzzi, Matteo; Albertin, Fauzia; Brancaccio, Rosa; Casali, Franco; Pia Morigi, Maria; Peccenini, Eva

    2017-03-01

    I will describe in this paper the first results of a series of X-ray tomography applications, with different system setups, running on some ancient manuscripts containing iron-gall ink. The purpose is to verify the optimum measurement conditions with laboratory instrumentation (that is in fact also portable) in order to recognize the text from the inside of the documents, without opening them. This becomes possible by exploiting the X-ray absorption contrast of iron-based ink and the three-dimensional reconstruction potential provided by computed tomography, which overcomes problems that appear in simple radiographic practice. This work is part of a larger project of EPFL (Ecole Polytechnique Fédérale de Lausanne, Switzerland), the "Venice Time Machine" project (EPEL, Digital Heritage Venice, http://dhvenice.eu/, 2015) aimed at digitizing, transcribing and sharing in an open database all the information of the State Archives of Venice, exploiting traditional digitization technologies and innovative methods of acquisition. In this first measurement campaign I investigated a manuscript of the seventeenth century made of a folded sheet; a couple of unopened ancient wills kept in the State Archives in Venice; and a handwritten book of several hundred pages of notes of Physics of the nineteenth century.

  19. What do physicians tell laboratories when requesting tests? A multi-method examination of information supplied to the microbiology laboratory before and after the introduction of electronic ordering.

    PubMed

    Georgiou, Andrew; Prgomet, Mirela; Toouli, George; Callen, Joanne; Westbrook, Johanna

    2011-09-01

    The provision of relevant clinical information on pathology requests is an important part of facilitating appropriate laboratory utilization and accurate results interpretation and reporting. (1) To determine the quantity and importance of handwritten clinical information provided by physicians to the Microbiology Department of a hospital pathology service; and (2) to examine the impact of a Computerized Provider Order Entry (CPOE) system on the nature of clinical information communication to the laboratory. A multi-method and multi-stage investigation which included: (a) a retrospective audit of all handwritten Microbiology requests received over a 1-month period in the Microbiology Department of a large metropolitan teaching hospital; (b) the administration of a survey to laboratory professionals to investigate the impact of different clinical information on the processing and/or interpretation of tests; (c) an expert panel consisting of medical staff and senior scientists to assess the survey findings and their impact on pathology practice and patient care; and (d) a comparison of the provision and value of clinical information before CPOE, and across 3 years after its implementation. The audit of handwritten requests found that 43% (n=4215) contained patient-related clinical information. The laboratory survey showed that 97% (84/86) of the different types of clinical information provided for wound specimens and 86% (43/50) for stool specimens were shown to have an effect on the processing or interpretation of the specimens by one or more laboratory professionals. The evaluation of the impact of CPOE revealed a significant improvement in the provision of useful clinical information from 2005 to 2008, rising from 90.1% (n=749) to 99.8% (n=915) (p<.0001) for wound specimens and 34% (n=129) to 86% (n=422) (p<.0001) for stool specimens. 
    This study showed that the CPOE system provided an integrated platform to access and exchange valuable patient-related information between physicians and the laboratory. These findings have important implications for helping to inform decisions about the design and structure of CPOE screens and what data entry fields should be designated or made voluntary. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  20. The Compensatory Effectiveness of Optical Character Recognition/Speech Synthesis on Reading Comprehension of Postsecondary Students with Learning Disabilities.

    ERIC Educational Resources Information Center

    Higgins, Eleanor L.; Raskind, Marshall H.

    1997-01-01

    Thirty-seven college students with learning disabilities were given a reading comprehension task under the following conditions: (1) using an optical character recognition/speech synthesis system; (2) having the text read aloud by a human reader; or (3) reading silently without assistance. Findings indicated that the greater the disability, the…

  1. Digital Paper Technologies for Topographical Applications

    DTIC Science & Technology

    2011-09-19

    measures examined were training time for each method, time for entry of features, procedural errors, handwriting recognition errors, and user preference...time for entry of features, procedural errors, handwriting recognition errors, and user preference. For these metrics, temporal association was...checkbox, text restricted to a specific list of values, etc.) that provides constraints to the handwriting recognizer. When the user fills out the form

  2. Syntax-directed content analysis of videotext: application to a map detection recognition system

    NASA Astrophysics Data System (ADS)

    Aradhye, Hrishikesh; Herson, James A.; Myers, Gregory

    2003-01-01

    Video is an increasingly important and ever-growing source of information to the intelligence and homeland defense analyst. A capability to automatically identify the contents of video imagery would enable the analyst to index relevant foreign and domestic news videos in a convenient and meaningful way. To this end, the proposed system aims to help determine the geographic focus of a news story directly from video imagery by detecting and geographically localizing political maps from news broadcasts, using the results of videotext recognition in lieu of a computationally expensive, scale-independent shape recognizer. Our novel method for the geographic localization of a map is based on the premise that the relative placement of text superimposed on a map roughly corresponds to the geographic coordinates of the locations the text represents. Our scheme extracts and recognizes videotext, and iteratively identifies the geographic area, while allowing for OCR errors and artistic freedom. The fast and reliable recognition of such maps by our system may provide valuable context and supporting evidence for other sources, such as speech recognition transcripts. The concepts of syntax-directed content analysis of videotext presented here can be extended to other content analysis systems.

  3. Contingency-Focused Financial Management and Logistics for the U.S. Coast Guard

    DTIC Science & Technology

    2008-12-01

    being processed by the local contracting office. Hard copy PRs with hand-written signatures are not to be accepted unless a waiver has been granted...forms used for authorizing procurement are nearly the same but would merely require drafting the documents in a different hard-copy format to provide...be created, promulgated and distributed in hard copy for managers in the field and at support commands to enact when it is evident that service

  4. Handwriting generates variable visual output to facilitate symbol learning.

    PubMed

    Li, Julia X; James, Karin H

    2016-03-01

    Recent research has demonstrated that handwriting practice facilitates letter categorization in young children. The present experiments investigated why handwriting practice facilitates visual categorization by comparing 2 hypotheses: that handwriting exerts its facilitative effect because of the visual-motor production of forms, resulting in a direct link between motor and perceptual systems, or because handwriting produces variable visual instances of a named category in the environment that then changes neural systems. We addressed these issues by measuring performance of 5-year-old children on a categorization task involving novel, Greek symbols across 6 different types of learning conditions: 3 involving visual-motor practice (copying typed symbols independently, tracing typed symbols, tracing handwritten symbols) and 3 involving visual-auditory practice (seeing and saying typed symbols of a single typed font, of variable typed fonts, and of handwritten examples). We could therefore compare visual-motor production with visual perception both of variable and similar forms. Comparisons across the 6 conditions (N = 72) demonstrated that all conditions that involved studying highly variable instances of a symbol facilitated symbol categorization relative to conditions where similar instances of a symbol were learned, regardless of visual-motor production. Therefore, learning perceptually variable instances of a category enhanced performance, suggesting that handwriting facilitates symbol understanding by virtue of its environmental output: supporting the notion of developmental change through brain-body-environment interactions. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  5. Handwriting generates variable visual input to facilitate symbol learning

    PubMed Central

    Li, Julia X.; James, Karin H.

    2015-01-01

    Recent research has demonstrated that handwriting practice facilitates letter categorization in young children. The present experiments investigated why handwriting practice facilitates visual categorization by comparing two hypotheses: that handwriting exerts its facilitative effect because of the visual-motor production of forms, resulting in a direct link between motor and perceptual systems, or because handwriting produces variable visual instances of a named category in the environment that then changes neural systems. We addressed these issues by measuring performance of 5-year-old children on a categorization task involving novel, Greek symbols across 6 different types of learning conditions: three involving visual-motor practice (copying typed symbols independently, tracing typed symbols, tracing handwritten symbols) and three involving visual-auditory practice (seeing and saying typed symbols of a single typed font, of variable typed fonts, and of handwritten examples). We could therefore compare visual-motor production with visual perception both of variable and similar forms. Comparisons across the six conditions (N=72) demonstrated that all conditions that involved studying highly variable instances of a symbol facilitated symbol categorization relative to conditions where similar instances of a symbol were learned, regardless of visual-motor production. Therefore, learning perceptually variable instances of a category enhanced performance, suggesting that handwriting facilitates symbol understanding by virtue of its environmental output: supporting the notion of developmental change through brain-body-environment interactions. PMID:26726913

  6. Improving language models for radiology speech recognition.

    PubMed

    Paulett, John M; Langlotz, Curtis P

    2009-02-01

    Speech recognition systems have become increasingly popular as a means to produce radiology reports, for reasons both of efficiency and of cost. However, the suboptimal recognition accuracy of these systems can affect the productivity of the radiologists creating the text reports. We analyzed a database of over two million de-identified radiology reports to determine the strongest determinants of word frequency. Our results showed that body site and imaging modality had a similar influence on the frequency of words and of three-word phrases as did the identity of the speaker. These findings suggest that the accuracy of speech recognition systems could be significantly enhanced by further tailoring their language models to body site and imaging modality, which are readily available at the time of report creation.

  7. Recognition of pornographic web pages by classifying texts and images.

    PubMed

    Hu, Weiming; Wu, Ou; Chen, Zhouyao; Fu, Zhouyu; Maybank, Steve

    2007-06-01

    With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can be easily accessed. It is important to recognize such unsuitable, offensive, or pornographic Web pages. In this paper, a novel framework for recognizing pornographic Web pages is described. A C4.5 decision tree is used to divide Web pages, according to content representations, into continuous text pages, discrete text pages, and image pages. These three categories of Web pages are handled, respectively, by a continuous text classifier, a discrete text classifier, and an algorithm that fuses the results from the image classifier and the discrete text classifier. In the continuous text classifier, statistical and semantic features are used to recognize pornographic texts. In the discrete text classifier, the naive Bayes rule is used to calculate the probability that a discrete text is pornographic. In the image classifier, the object's contour-based features are extracted to recognize pornographic images. In the text and image fusion algorithm, the Bayes theory is used to combine the recognition results from images and texts. Experimental results demonstrate that the continuous text classifier outperforms the traditional keyword-statistics-based classifier, the contour-based image classifier outperforms the traditional skin-region-based image classifier, the results obtained by our fusion algorithm outperform those by either of the individual classifiers, and our framework can be adapted to different categories of Web pages.
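
    The abstract above says that Bayes theory is used to combine the image and text recognition results but does not give the formula. The sketch below is one common way to fuse two classifier posteriors, assuming the two classifiers are conditionally independent given the class. The function name `fuse_posteriors`, the `prior` parameter and the clipping constant are illustrative assumptions, not the authors' implementation.

```python
def fuse_posteriors(p_text, p_image, prior=0.5, eps=1e-9):
    """Combine two posteriors P(C|text) and P(C|image) into P(C|text, image).

    Assumes (hypothetically) that the two classifiers are conditionally
    independent given the class, so that
        odds(C | text, image) = odds(C | text) * odds(C | image) / odds(C).
    """
    def odds(p):
        # Clip to avoid division by zero when a classifier outputs 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        return p / (1.0 - p)

    combined = odds(p_text) * odds(p_image) / odds(prior)
    return combined / (1.0 + combined)


if __name__ == "__main__":
    # Two moderately confident classifiers reinforce each other.
    print(fuse_posteriors(0.8, 0.8))
```

    With a prior of 0.5, two agreeing classifiers produce a fused probability higher than either input, while two outputs of 0.5 leave the prior unchanged.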

  8. Intelligent OCR Processing.

    ERIC Educational Resources Information Center

    Sun, Wei; And Others

    1992-01-01

    Identifies types and distributions of errors in text produced by optical character recognition (OCR) and proposes a process using machine learning techniques to recognize and correct errors in OCR texts. Results of experiments indicating that this strategy can reduce human interaction required for error correction are reported. (25 references)…

  9. Effects of cell-phone and text-message distractions on true and false recognition.

    PubMed

    Smith, Theodore S; Isaak, Matthew I; Senette, Christian G; Abadie, Brenton G

    2011-06-01

    This study examined the effects of electronic communication distractions, including cell-phone and texting demands, on true and false recognition, specifically semantically related words presented and not presented on a computer screen. Participants were presented with 24 Deese-Roediger-McDermott (DRM) lists while manipulating the concurrent presence or absence of cell-phone and text-message distractions during study. In the DRM paradigm, participants study lists of semantically related words (e.g., mother, crib, and diaper) linked to a non-presented critical lure (e.g., baby). After studying the lists of words, participants are then requested to recall or recognize previously presented words. Participants often not only demonstrate high remembrance for presented words (true memory: crib), but also recollection for non-presented words (false memory: baby). In the present study, true memory was highest when participants were not presented with any distraction tasks during study of DRM words, but poorer when they were required to complete a cell-phone conversation or text-message task during study. False recognition measures did not statistically vary across distraction conditions. Signal detection analyses showed that participants better discriminated true targets (list items presented during study) from true target controls (items presented during study only) when cell-phone or text-message distractions were absent than when they were present. Response bias did not vary significantly across distraction conditions, as there were no differences in the likelihood that a participant would claim an item as "old" (previously presented) rather than "new" (not previously presented). Results of this study are examined with respect to both activation monitoring and fuzzy trace theories.
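
    The abstract above reports signal detection analyses of discrimination and response bias without naming the measures. As a minimal sketch, the code below computes the standard equal-variance Gaussian measures, sensitivity d' and criterion c, from a hit rate and a false-alarm rate. Whether the study used exactly these two measures is an assumption, and the function name `sdt_measures` is illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    d' = z(H) - z(F) measures how well old items are told apart from new ones.
    c  = -(z(H) + z(F)) / 2 measures bias; c < 0 means a tendency to say "old".
    Both rates must lie strictly between 0 and 1, otherwise inv_cdf raises
    statistics.StatisticsError.
    """
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


if __name__ == "__main__":
    d_prime, criterion = sdt_measures(0.85, 0.20)
    print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
```

    Under this model, a distraction that lowers d' while leaving c unchanged matches the pattern the abstract describes: poorer discrimination without a shift in the tendency to respond "old".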

  10. Icones Plantarum Malabaricarum: Early 18th century botanical drawings of medicinal plants from colonial Ceylon.

    PubMed

    Van Andel, Tinde; Scholman, Ariane; Beumer, Mieke

    2018-04-27

    From 1640-1796, the Dutch East India Company (VOC) occupied the island of Ceylon (now Sri Lanka). Several VOC officers had a keen interest in the medicinal application of the local flora. The Leiden University Library holds a two-piece codex entitled: Icones Plantarum Malabaricarum, adscriptis nominibus et viribus, Vol. I. & II. (Illustrations of Plants from the Malabar, assigned names and strength). This manuscript contains 262 watercolour drawings of medicinal plants from Sri Lanka, with handwritten descriptions of local names, habitus, medicinal properties and therapeutic applications. This anonymous document had never been studied previously. To identify all depicted plant specimens, decipher the text, trace the author, and analyse the scientific relevance of this manuscript as well as its importance for Sri Lankan ethnobotany. We digitised the entire manuscript, transcribed and translated the handwritten Dutch texts and identified the depicted species using historic and modern literature, herbarium vouchers, online databases on Sri Lankan herbal medicine and 41 botanical drawings by the same artist in the Artis library, Amsterdam. We traced the origin of the manuscript by means of watermark analysis and historical literature. We compared the historic Sinhalese and Tamil names in the manuscript to recent plant names in ethnobotanical references from Sri Lanka and southern India. We published the entire manuscript online with translations and identifications. The watermarks indicate that the paper was made between 1694 and 1718. The handwriting is of a VOC scribe. In total, ca. 252 taxa are depicted, of which we could identify 221 to species level. The drawings represent mainly native species, including Sri Lankan endemics, but also introduced medicinal and ornamental plants. Lamiaceae, Zingiberaceae and Leguminosae were the best-represented families. 
    Frequently mentioned applications were to purify the blood and to treat gastro-intestinal problems, fever and snakebites. Many plants are characterised by their humoral properties, of which 'warming' is the most prevalent. Plant species were mostly used for their roots (28%), bark (16%) or leaves (11%). More Tamil names (260) were documented than Sinhalese (208). More than half of the Tamil names and 36% of the Sinhalese names are still used today. The author was probably a VOC surgeon based in northern Sri Lanka, who travelled around the island to document medicinal plant use. Less than half of the species were previously documented from Ceylon by the famous VOC doctor and botanist Paul Hermann in the 1670s. Further archival research is needed to identify the maker of this manuscript. Although the maker of this early 18th century manuscript remains unknown, the detailed, 300-year-old information on medicinal plant use in the Icones Plantarum Malabaricarum represents an important ethnobotanical treasure for Sri Lanka, which offers ample opportunities to study changes and continuation of medicinal plant names and practices over time. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. A Complete OCR System for Tamil Magazine Documents

    NASA Astrophysics Data System (ADS)

    Kokku, Aparna; Chakravarthy, Srinivasa

    We present a complete optical character recognition (OCR) system for Tamil magazines/documents. All the standard elements of OCR process like de-skewing, preprocessing, segmentation, character recognition, and reconstruction are implemented. Experience with OCR problems teaches that for most subtasks of OCR, there is no single technique that gives perfect results for every type of document image. We exploit the ability of neural networks to learn from experience in solving the problems of segmentation and character recognition. Text segmentation of Tamil newsprint poses a new challenge owing to its italic-like font type; problems that arise in recognition of touching and close characters are discussed. Character recognition efficiency varied from 94 to 97% for this type of font. The grouping of blocks into logical units and the determination of reading order within each logical unit helped us in reconstructing automatically the document image in an editable format.

  12. Using Speech Recognition to Enhance the Tongue Drive System Functionality in Computer Access

    PubMed Central

    Huo, Xueliang; Ghovanloo, Maysam

    2013-01-01

    Tongue Drive System (TDS) is a wireless tongue operated assistive technology (AT), which can enable people with severe physical disabilities to access computers and drive powered wheelchairs using their volitional tongue movements. TDS offers six discrete commands, simultaneously available to the users, for pointing and typing as a substitute for mouse and keyboard in computer access, respectively. To enhance the TDS performance in typing, we have added a microphone, an audio codec, and a wireless audio link to its readily available 3-axial magnetic sensor array, and combined it with a commercially available speech recognition software, the Dragon Naturally Speaking, which is regarded as one of the most efficient ways for text entry. Our preliminary evaluations indicate that the combined TDS and speech recognition technologies can provide end users with significantly higher performance than using each technology alone, particularly in completing tasks that require both pointing and text entry, such as web surfing. PMID:22255801

  13. A robust pointer segmentation in biomedical images toward building a visual ontology for biomedical article retrieval

    NASA Astrophysics Data System (ADS)

    You, Daekeun; Simpson, Matthew; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.

    2013-01-01

    Pointers (arrows and symbols) are frequently used in biomedical images to highlight specific image regions of interest (ROIs) that are mentioned in figure captions and/or text discussion. Detection of pointers is the first step toward extracting relevant visual features from ROIs and combining them with textual descriptions for a multimodal (text and image) biomedical article retrieval system. Recently we developed a pointer recognition algorithm based on an edge-based pointer segmentation method, and subsequently reported improvements made on our initial approach involving the use of Active Shape Models (ASM) for pointer recognition and a region growing-based method for pointer segmentation. These methods contributed to improving the recall of pointer recognition but not much to the precision. The method discussed in this article is our recent effort to improve the precision rate. Evaluation performed on two datasets and compared with other pointer segmentation methods shows significantly improved precision and the highest F1 score.

  14. BANNER: an executable survey of advances in biomedical named entity recognition.

    PubMed

    Leaman, Robert; Gonzalez, Graciela

    2008-01-01

    There has been an increasing amount of research on biomedical named entity recognition, the most basic text extraction problem, resulting in significant progress by different research teams around the world. This has created a need for a freely-available, open source system implementing the advances described in the literature. In this paper we present BANNER, an open-source, executable survey of advances in biomedical named entity recognition, intended to serve as a benchmark for the field. BANNER is implemented in Java as a machine-learning system based on conditional random fields and includes a wide survey of the best techniques recently described in the literature. It is designed to maximize domain independence by not employing brittle semantic features or rule-based processing steps, and achieves significantly better performance than existing baseline systems. It is therefore useful to developers as an extensible NER implementation, to researchers as a standard for comparing innovative techniques, and to biologists requiring the ability to find novel entities in large amounts of text.

  15. Loose, Falling Characters and Sentences: The Persistence of the OCR Problem in Digital Repository E-Books

    ERIC Educational Resources Information Center

    Kichuk, Diana

    2015-01-01

    The electronic conversion of scanned image files to readable text using optical character recognition (OCR) software and the subsequent migration of raw OCR text to e-book text file formats are key remediation or media conversion technologies used in digital repository e-book production. Despite real progress, the OCR problem of reliability and…

  16. Automating generation of textual class definitions from OWL to English.

    PubMed

    Stevens, Robert; Malone, James; Williams, Sandra; Power, Richard; Third, Allan

    2011-05-17

    Text definitions for entities within bio-ontologies are a cornerstone of the effort to gain a consensus in understanding and usage of those ontologies. Writing these definitions is, however, a considerable effort and there is often a lag between specification of the main part of an ontology (logical descriptions and definitions of entities) and the development of the text-based definitions. The goal of natural language generation (NLG) from ontologies is to take the logical description of entities and generate fluent natural language. The application described here uses NLG to automatically provide text-based definitions from an ontology that has logical descriptions of its entities, so avoiding the bottleneck of authoring these definitions by hand. To produce the descriptions, the program collects all the axioms relating to a given entity, groups them according to common structure, realises each group through an English sentence, and assembles the resulting sentences into a paragraph, to form as 'coherent' a text as possible without human intervention. Sentence generation is accomplished using a generic grammar based on logical patterns in OWL, together with a lexicon for realising atomic entities. We have tested our output for the Experimental Factor Ontology (EFO) using a simple survey strategy to explore the fluency of the generated text and how well it conveys the underlying axiomatisation. Two rounds of survey and improvement show that overall the generated English definitions are found to convey the intended meaning of the axiomatisation in a satisfactory manner. The surveys also suggested that one form of generated English will not be universally liked; that intrusion of too much 'formal ontology' was not liked; and that too much explicit exposure of OWL semantics was also not liked. Our prototype tools can generate reasonable paragraphs of English text that can act as definitions. 
    The definitions were found acceptable by our survey and, as a result, the developers of EFO are sufficiently satisfied with the output that the generated definitions have been incorporated into EFO. Whilst not a substitute for hand-written textual definitions, our generated definitions are a useful starting point. An on-line version of the NLG text definition tool can be found at http://swat.open.ac.uk/tools/. The questionnaire and sample generated text definitions may be found at http://mcs.open.ac.uk/nlg/SWAT/bio-ontologies.html.

  17. Automating generation of textual class definitions from OWL to English

    PubMed Central

    2011-01-01

    Background: Text definitions for entities within bio-ontologies are a cornerstone of the effort to gain a consensus in understanding and usage of those ontologies. Writing these definitions is, however, a considerable effort and there is often a lag between specification of the main part of an ontology (logical descriptions and definitions of entities) and the development of the text-based definitions. The goal of natural language generation (NLG) from ontologies is to take the logical description of entities and generate fluent natural language. The application described here uses NLG to automatically provide text-based definitions from an ontology that has logical descriptions of its entities, so avoiding the bottleneck of authoring these definitions by hand. Results: To produce the descriptions, the program collects all the axioms relating to a given entity, groups them according to common structure, realises each group through an English sentence, and assembles the resulting sentences into a paragraph, to form as ‘coherent’ a text as possible without human intervention. Sentence generation is accomplished using a generic grammar based on logical patterns in OWL, together with a lexicon for realising atomic entities. We have tested our output for the Experimental Factor Ontology (EFO) using a simple survey strategy to explore the fluency of the generated text and how well it conveys the underlying axiomatisation. Two rounds of survey and improvement show that overall the generated English definitions are found to convey the intended meaning of the axiomatisation in a satisfactory manner. The surveys also suggested that one form of generated English will not be universally liked; that intrusion of too much ‘formal ontology’ was not liked; and that too much explicit exposure of OWL semantics was also not liked. Conclusions: Our prototype tools can generate reasonable paragraphs of English text that can act as definitions. The definitions were found acceptable by our survey and, as a result, the developers of EFO are sufficiently satisfied with the output that the generated definitions have been incorporated into EFO. Whilst not a substitute for hand-written textual definitions, our generated definitions are a useful starting point. Availability: An on-line version of the NLG text definition tool can be found at http://swat.open.ac.uk/tools/. The questionnaire and sample generated text definitions may be found at http://mcs.open.ac.uk/nlg/SWAT/bio-ontologies.html. PMID:21624160
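
    The collect, group, realise and assemble steps described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' grammar or lexicon: the toy axioms, the `TEMPLATES` table and the `define` function are hypothetical names invented for illustration, and real OWL axioms are considerably richer than the triples assumed here.

```python
from collections import defaultdict

# Hypothetical toy axioms as (subject, pattern, object) triples.
# These are illustrative only and are not taken from EFO.
AXIOMS = [
    ("heart", "subClassOf", "organ"),
    ("heart", "part_of", "the circulatory system"),
    ("heart", "part_of", "the thorax"),
    ("lung", "subClassOf", "organ"),
]

# Hypothetical sentence templates, one per logical pattern.
TEMPLATES = {
    "subClassOf": "A {s} is a kind of {o}.",
    "part_of": "A {s} is part of {o}.",
}


def join_phrases(items):
    """Join phrases as 'a', 'a and b', or 'a, b and c'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def define(entity, axioms):
    """Build a one-paragraph definition for `entity` from its axioms."""
    # 1. Collect the axioms about the entity and group them by pattern.
    groups = defaultdict(list)
    for subject, pattern, obj in axioms:
        if subject == entity:
            groups[pattern].append(obj)
    # 2. Realise each group as a single English sentence.
    sentences = [
        TEMPLATES[pattern].format(s=entity, o=join_phrases(objs))
        for pattern, objs in groups.items()
    ]
    # 3. Assemble the sentences into a paragraph.
    return " ".join(sentences)


print(define("heart", AXIOMS))
```

    Grouping before realisation is what lets the two `part_of` axioms share one sentence instead of producing two near-identical ones.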

  18. Capturing patient information at nursing shift changes: methodological evaluation of speech recognition and information extraction

    PubMed Central

    Suominen, Hanna; Johnson, Maree; Zhou, Liyuan; Sanchez, Paula; Sirel, Raul; Basilakis, Jim; Hanlen, Leif; Estival, Dominique; Dawson, Linda; Kelly, Barbara

    2015-01-01

    Objective: We study the use of speech recognition and information extraction to generate drafts of Australian nursing-handover documents. Methods: Speech recognition correctness and clinicians’ preferences were evaluated using 15 recorder–microphone combinations, six documents, three speakers, Dragon Medical 11, and five survey/interview participants. Information extraction correctness evaluation used 260 documents, six-class classification for each word, two annotators, and the CRF++ conditional random field toolkit. Results: A noise-cancelling lapel-microphone with a digital voice recorder gave the best correctness (79%). This microphone was also the most preferred option by all but one participant. Although the participants liked the small size of this recorder, their preference was for tablets that can also be used for document proofing and sign-off, among other tasks. Accented speech was harder to recognize than native language and a male speaker was detected better than a female speaker. Information extraction was excellent in filtering out irrelevant text (85% F1) and identifying text relevant to two classes (87% and 70% F1). Similarly to the annotators’ disagreements, there was confusion between the remaining three classes, which explains the modest 62% macro-averaged F1. Discussion: We present evidence for the feasibility of speech recognition and information extraction to support clinicians in entering text and unlock its content for computerized decision-making and surveillance in healthcare. Conclusions: The benefits of this automation include storing all information; making the drafts available and accessible almost instantly to everyone with authorized access; and avoiding information loss, delays, and misinterpretations inherent to using a ward clerk or transcription services. PMID:25336589
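
    The link between the three reported per-class scores and the 62% macro-averaged F1 can be checked with a short calculation. The sketch below assumes the macro average is the unweighted mean over exactly the six classes mentioned in the abstract and treats the rounded percentages as exact; the function name `implied_mean_of_rest` is hypothetical.

```python
def implied_mean_of_rest(macro_f1, known_f1, n_classes):
    """Mean F1 that the unreported classes must have, given the macro average.

    Macro-averaged F1 is the unweighted mean of per-class F1 scores, so the
    scores of all classes sum to macro_f1 * n_classes.
    """
    remaining = n_classes - len(known_f1)
    if remaining <= 0:
        raise ValueError("no unreported classes left")
    return (macro_f1 * n_classes - sum(known_f1)) / remaining


# Figures from the abstract: 62% macro F1 over six classes, with
# 85%, 87% and 70% reported for three of them.
rest = implied_mean_of_rest(62.0, [85.0, 87.0, 70.0], 6)
print(round(rest, 1))
```

    Under these assumptions the three confused classes would average roughly 43% F1, which is consistent with the abstract's statement that they account for the modest overall score.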

  19. The Form is the Substance: Classification of Genres in Text

    DTIC Science & Technology

    2001-01-01

    recipients and time of posting. This yielded improved results, but no results on use of domain specific features alone are presented. Pannu and Sycara (1996...Recognition. Pannu, Anandeep, Sycara. (1996) "Learning Text Filtering Preferences", Symposium on Machine Learning and Information Processing, AAAI

  20. Grammaire et communication (Grammar and Communication).

    ERIC Educational Resources Information Center

    Stirman-Langlois, Martine

    1994-01-01

    A technique for teaching French grammar that involves reading, rereading, and analyzing the language in authentic materials is discussed. The student is led to recognition and generalization of structures in the text. Text examples used here include a comic strip and a publicity blurb for a French city. (MSE)
