Sample records for resource dictionary system

  1. Concept dictionary creation and maintenance under resource constraints: lessons from the AMPATH Medical Record System.

    PubMed

    Were, Martin C; Mamlin, Burke W; Tierney, William M; Wolfe, Ben; Biondich, Paul G

    2007-10-11

    The challenges of creating and maintaining concept dictionaries are compounded in resource-limited settings. Approaches to alleviate this burden need to be based on information derived in these settings. We created a concept dictionary and evaluated new concept proposals for an open source EMR in a resource-limited setting. Overall, 87% of the concepts in the initial dictionary were used. There were 5137 new concepts proposed, with 77% of these proposed only once. Further characterization of new concept proposals revealed that 41% were due to deficiency in the existing dictionary, and 19% were synonyms to existing concepts. 25% of the requests contained misspellings, 41% were complex terms, and 17% were ambiguous. Given the resource-intensive nature of dictionary creation and maintenance, there should be considerations for centralizing the concept dictionary service, using standards, prioritizing concept proposals, and redesigning the user-interface to reduce this burden in settings with limited resources.

  2. Concept Dictionary Creation and Maintenance Under Resource Constraints: Lessons from the AMPATH Medical Record System

    PubMed Central

    Were, Martin C.; Mamlin, Burke W.; Tierney, William M.; Wolfe, Ben; Biondich, Paul G.

    2007-01-01

    The challenges of creating and maintaining concept dictionaries are compounded in resource-limited settings. Approaches to alleviate this burden need to be based on information derived in these settings. We created a concept dictionary and evaluated new concept proposals for an open source EMR in a resource-limited setting. Overall, 87% of the concepts in the initial dictionary were used. There were 5137 new concepts proposed, with 77% of these proposed only once. Further characterization of new concept proposals revealed that 41% were due to deficiency in the existing dictionary, and 19% were synonyms to existing concepts. 25% of the requests contained misspellings, 41% were complex terms, and 17% were ambiguous. Given the resource-intensive nature of dictionary creation and maintenance, there should be considerations for centralizing the concept dictionary service, using standards, prioritizing concept proposals, and redesigning the user-interface to reduce this burden in settings with limited resources. PMID:18693945

  3. Creating a medical dictionary using word alignment: the influence of sources and resources.

    PubMed

    Nyström, Mikael; Merkel, Magnus; Petersson, Håkan; Åhlfeldt, Hans

    2007-11-23

    Automatic word alignment of parallel texts with the same content in different languages is used, among other things, to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH, and on how the terminology systems and type of resources influence the quality. We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which parts of the terminology systems we used to generate the resources, which parts we word aligned, and which types of resources we used in the alignment process, to explore the influence the different terminology systems and resources have on recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary. The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. The verified English-Swedish dictionary contains 24,000 term pairs in base forms. More resources give better results in the automatic word alignment, but some resources only give small improvements. The most important type of resource is training and the most general resources were generated from ICD-10.

  4. Creating a medical dictionary using word alignment: The influence of sources and resources

    PubMed Central

    Nyström, Mikael; Merkel, Magnus; Petersson, Håkan; Åhlfeldt, Hans

    2007-01-01

    Background: Automatic word alignment of parallel texts with the same content in different languages is used, among other things, to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH, and on how the terminology systems and type of resources influence the quality. Methods: We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which parts of the terminology systems we used to generate the resources, which parts we word aligned, and which types of resources we used in the alignment process, to explore the influence the different terminology systems and resources have on recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary. Results: The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. The verified English-Swedish dictionary contains 24,000 term pairs in base forms. Conclusion: More resources give better results in the automatic word alignment, but some resources only give small improvements. The most important type of resource is training and the most general resources were generated from ICD-10. PMID:18036221

  5. Parsing and Tagging of Bilingual Dictionary

    DTIC Science & Technology

    2003-09-01

    LAMP-TR-106 CAR-TR-991 CS-TR-4529 UMIACS-TR-2003-97 PARSING AND TAGGING OF BILINGUAL DICTIONARY Huanfeng Ma, Burcu Karagol-Ayan, David... dictionaries hold great potential as a source of lexical resources for training and testing automated systems for optical character recognition, machine... translation, and cross-language information retrieval. In this paper, we describe a system for extracting term lexicons from printed bilingual dictionaries

  6. Translation lexicon acquisition from bilingual dictionaries

    NASA Astrophysics Data System (ADS)

    Doermann, David S.; Ma, Huanfeng; Karagol-Ayan, Burcu; Oard, Douglas W.

    2001-12-01

    Bilingual dictionaries hold great potential as a source of lexical resources for training automated systems for optical character recognition, machine translation and cross-language information retrieval. In this work we describe a system for extracting term lexicons from printed copies of bilingual dictionaries. We describe our approach to page and definition segmentation and entry parsing. We have used the approach to parse a number of dictionaries and demonstrate the results for retrieval using a French-English dictionary to generate a translation lexicon and a corpus of English queries applied to French documents to evaluate cross-language IR.

  7. LeadMine: a grammar and dictionary driven approach to entity recognition.

    PubMed

    Lowe, Daniel M; Sayle, Roger A

    2015-01-01

    Chemical entity recognition has traditionally been performed by machine learning approaches. Here we describe an approach using grammars and dictionaries. This approach has the advantage that the entities found can be directly related to a given grammar or dictionary, which allows the type of an entity to be known and, if an entity is misannotated, indicates which resource should be corrected. As recognition is driven by what is expected, if spelling errors occur, they can be corrected. Correcting such errors is highly useful when attempting to look up an entity in a database or, in the case of chemical names, converting them to structures. Our system uses a mixture of expertly curated grammars and dictionaries, as well as dictionaries automatically derived from public resources. We show that the heuristics developed to filter our dictionary of trivial chemical names (from PubChem) yield a better-performing dictionary than the previously published Jochem dictionary. Our final system performs post-processing steps to modify the boundaries of entities and to detect abbreviations. These steps are shown to significantly improve performance (2.6% and 4.0% F1-score respectively). Our complete system, with incremental post-BioCreative workshop improvements, achieves 89.9% precision and 85.4% recall (87.6% F1-score) on the CHEMDNER test set. Grammar and dictionary approaches can produce results at least as good as the current state of the art in machine learning approaches. While machine learning approaches are commonly thought of as "black box" systems, our approach directly links the output entities to the input dictionaries and grammars. Our approach also allows correction of errors in detected entities, which can assist with entity resolution.
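
    As a quick arithmetic check, the F1-score quoted in this abstract follows from the quoted precision and recall as their harmonic mean. The sketch below is in Python; the helper name `f1_score` is ours for illustration and is not part of LeadMine.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
leadmine_f1 = f1_score(89.9, 85.4)
print(round(leadmine_f1, 1))  # prints 87.6
```

    The rounded result agrees with the 87.6% F1-score stated in the abstract.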

  8. LeadMine: a grammar and dictionary driven approach to entity recognition

    PubMed Central

    2015-01-01

    Background: Chemical entity recognition has traditionally been performed by machine learning approaches. Here we describe an approach using grammars and dictionaries. This approach has the advantage that the entities found can be directly related to a given grammar or dictionary, which allows the type of an entity to be known and, if an entity is misannotated, indicates which resource should be corrected. As recognition is driven by what is expected, if spelling errors occur, they can be corrected. Correcting such errors is highly useful when attempting to look up an entity in a database or, in the case of chemical names, converting them to structures. Results: Our system uses a mixture of expertly curated grammars and dictionaries, as well as dictionaries automatically derived from public resources. We show that the heuristics developed to filter our dictionary of trivial chemical names (from PubChem) yield a better-performing dictionary than the previously published Jochem dictionary. Our final system performs post-processing steps to modify the boundaries of entities and to detect abbreviations. These steps are shown to significantly improve performance (2.6% and 4.0% F1-score respectively). Our complete system, with incremental post-BioCreative workshop improvements, achieves 89.9% precision and 85.4% recall (87.6% F1-score) on the CHEMDNER test set. Conclusions: Grammar and dictionary approaches can produce results at least as good as the current state of the art in machine learning approaches. While machine learning approaches are commonly thought of as "black box" systems, our approach directly links the output entities to the input dictionaries and grammars. Our approach also allows correction of errors in detected entities, which can assist with entity resolution. PMID:25810776

  9. Dictionary Based Machine Translation from Kannada to Telugu

    NASA Astrophysics Data System (ADS)

    Sindhu, D. V.; Sagar, B. M.

    2017-08-01

    Machine translation is the task of translating from one language to another. For languages with fewer linguistic resources, like Kannada and Telugu, a dictionary-based approach is the best approach. This paper mainly focuses on dictionary-based machine translation from Kannada to Telugu. The proposed methodology uses a dictionary to translate word by word without much correlation of semantics between the words. The dictionary-based machine translation process has the following sub-processes: morph analyzer, dictionary, transliteration, transfer grammar, and morph generator. As part of this work, a bilingual dictionary with 8000 entries was developed and a suffix mapping table at the tag level was built. The system was tested on children's stories. In the near future the system can be further improved by defining transfer grammar rules.

  10. A Relational Data Dictionary Compatible with the National Bureau of Standards Information Resource Dictionary System.

    DTIC Science & Technology

    1985-12-01

    Naval Postgraduate School, Monterey, California... Concern over corporate information resources has resulted from the explosive growth in the size, complexity and number of data bases available to... validity, relevance, and usability of the data that is available. As a result, there has been a growing interest in two tools which... provide

  11. Evaluating Online Bilingual Dictionaries: The Case of Popular Free English-Polish Dictionaries

    ERIC Educational Resources Information Center

    Lew, Robert; Szarowska, Agnieszka

    2017-01-01

    Language learners today exhibit a strong preference for free online resources. One problem with such resources is that their quality can vary dramatically. Building on related work on monolingual resources for English, we propose an evaluation framework for online bilingual dictionaries, designed to assess lexicographic quality in four major…

  12. The Talking Dictionary. The Prospectus Series, Paper No. 2.

    ERIC Educational Resources Information Center

    Ward, Ted

    Three talking dictionaries designed to increase independence and resource-use skills of handicapped children have specific advantages and limitations. System I involves a random access tape recorder, a printed or braille dictionary which contains the inquiry numbers for words, a console (similar to an adding machine) on which the number is…

  13. An Investigation into the Effect of English Learners' Dictionaries on International Students' Acquisition of the English Article System

    ERIC Educational Resources Information Center

    Miller, Julia

    2006-01-01

    Learners' dictionaries are a resource which is often overlooked by both students and teachers of English as a Second Language. The wealth of grammatical information contained within them, however, can help students to improve their English language skills and, ipso facto, their academic writing. In this study, four groups of university ESL…

  14. The Power of Math Dictionaries in the Classroom

    ERIC Educational Resources Information Center

    Patterson, Lynn Gannon; Young, Ashlee Futrell

    2013-01-01

    This article investigates the value of a math dictionary in the elementary classroom and whether elementary students prefer using a traditional math dictionary or a dictionary on an iPad. In each child's journey to reading with understanding, the dictionary can be a comforting and valuable resource. Would students find a math dictionary to be a…

  15. A dictionary to identify small molecules and drugs in free text.

    PubMed

    Hettne, Kristina M; Stierum, Rob H; Schuemie, Martijn J; Hendriksen, Peter J M; Schijvenaars, Bob J A; Mulligen, Erik M van; Kleinjans, Jos; Kors, Jan A

    2009-11-15

    From the scientific community, a lot of effort has been spent on the correct identification of gene and protein names in text, while less effort has been spent on the correct identification of chemical names. Dictionary-based term identification has the power to recognize the diverse representation of chemical information in the literature and map the chemicals to their database identifiers. We developed a dictionary for the identification of small molecules and drugs in text, combining information from UMLS, MeSH, ChEBI, DrugBank, KEGG, HMDB and ChemIDplus. Rule-based term filtering, manual check of highly frequent terms and disambiguation rules were applied. We tested the combined dictionary and the dictionaries derived from the individual resources on an annotated corpus, and conclude the following: (i) each of the different processing steps increases precision with a minor loss of recall; (ii) the overall performance of the combined dictionary is acceptable (precision 0.67, recall 0.40 (0.80 for trivial names)); (iii) the combined dictionary performed better than the dictionary in the chemical recognizer OSCAR3; (iv) the performance of a dictionary based on ChemIDplus alone is comparable to the performance of the combined dictionary. The combined dictionary is freely available as an XML file in Simple Knowledge Organization System format on the web site http://www.biosemantics.org/chemlist.

  16. Gene/protein name recognition based on support vector machine using dictionary as features.

    PubMed

    Mitsumori, Tomohiro; Fation, Sevrani; Murata, Masaki; Doi, Kouichi; Doi, Hirohumi

    2005-01-01

    Automated information extraction from biomedical literature is important because a vast amount of biomedical literature has been published. Recognition of the biomedical named entities is the first step in information extraction. We developed an automated recognition system based on the SVM algorithm and evaluated it in Task 1.A of BioCreAtIvE, a competition for automated gene/protein name recognition. In the work presented here, our recognition system uses the feature set of the word, the part-of-speech (POS), the orthography, the prefix, the suffix, and the preceding class. We call these features "internal resource features", i.e., features that can be found in the training data. Additionally, we consider the features of matching against dictionaries to be external resource features. We investigated and evaluated the effect of these features as well as the effect of tuning the parameters of the SVM algorithm. We found that the dictionary matching features contributed slightly to the improvement in the performance of the f-score. We attribute this to the possibility that the dictionary matching features might overlap with other features in the current multiple feature setting. During SVM learning, each feature alone had a marginally positive effect on system performance. This supports the fact that the SVM algorithm is robust on the high dimensionality of the feature vector space and means that feature selection is not required.

  17. Rdesign: A data dictionary with relational database design capabilities in Ada

    NASA Technical Reports Server (NTRS)

    Lekkos, Anthony A.; Kwok, Teresa Ting-Yin

    1986-01-01

    A data dictionary is defined to be the set of all data attributes, which describe data objects in terms of their intrinsic attributes, such as name, type, size, format and definition. It is recognized as the data base for Information Resource Management, to facilitate understanding and communication about the relationship between systems applications and systems data usage and to help achieve data independence by permitting systems applications to access data without knowledge of the location or storage characteristics of the data in the system. A research and development effort to use Ada has produced a data dictionary with data base design capabilities. This project supports data specification and analysis and offers a choice of the relational, network, and hierarchical model for logical data base design. It provides a highly integrated set of analysis and design transformation tools which range from templates for data element definition, spreadsheet for defining functional dependencies, normalization, to logical design generator.

  18. The Influence of Electronic Dictionaries on Vocabulary Knowledge Extension

    ERIC Educational Resources Information Center

    Rezaei, Mojtaba; Davoudi, Mohammad

    2016-01-01

    Vocabulary learning needs special strategies in the language learning process. The use of dictionaries is a great help in vocabulary learning and nowadays the emergence of electronic dictionaries has added a new and valuable resource for vocabulary learning. The present study aims to explore the influence of Electronic Dictionaries (ED) vs. Paper…

  19. Developing a hybrid dictionary-based bio-entity recognition technique.

    PubMed

    Song, Min; Yu, Hwanjo; Han, Wook-Shin

    2015-01-01

    Bio-entity extraction is a pivotal component for information extraction from biomedical literature. The dictionary-based bio-entity extraction is the first generation of Named Entity Recognition (NER) techniques. This paper presents a hybrid dictionary-based bio-entity extraction technique. The approach expands the bio-entity dictionary by combining different data sources and improves the recall rate through the shortest path edit distance algorithm. In addition, the proposed technique adopts text mining techniques in the merging stage of similar entities such as Part of Speech (POS) expansion, stemming, and the exploitation of the contextual cues to further improve the performance. The experimental results show that the proposed technique achieves the best or at least equivalent performance among compared techniques, GENIA, MESH, UMLS, and combinations of these three resources in F-measure. The results imply that the performance of dictionary-based extraction techniques is largely influenced by information resources used to build the dictionary. In addition, the edit distance algorithm shows steady performance with three different dictionaries in precision whereas the context-only technique achieves a high-end performance with three different dictionaries in recall.
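
    The idea of improving dictionary recall with an edit distance can be sketched as follows. This is a minimal illustration in Python, assuming plain Levenshtein distance as a stand-in for the paper's shortest path edit distance algorithm, whose details the abstract does not give. The function names, the toy dictionary entries, and the `max_distance` threshold are all hypothetical.

```python
from typing import Iterable, Optional, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b using a rolling one-row table."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def lookup(mention: str, dictionary: Iterable[str],
           max_distance: int = 1) -> Optional[Tuple[str, int]]:
    """Return the closest dictionary term within max_distance, or None."""
    best: Optional[Tuple[str, int]] = None
    for term in dictionary:
        d = edit_distance(mention.lower(), term.lower())
        if d <= max_distance and (best is None or d < best[1]):
            best = (term, d)
    return best


# Toy dictionary; the entries are illustrative only.
terms = ["interleukin 2", "insulin"]
print(lookup("interleukin-2", terms))  # ('interleukin 2', 1)
print(lookup("aspirin", terms))        # None
```

    A linear scan over the dictionary is used here only for clarity; for a large dictionary an index would be needed, which this sketch does not attempt.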

  20. Developing a hybrid dictionary-based bio-entity recognition technique

    PubMed Central

    2015-01-01

    Background: Bio-entity extraction is a pivotal component for information extraction from biomedical literature. The dictionary-based bio-entity extraction is the first generation of Named Entity Recognition (NER) techniques. Methods: This paper presents a hybrid dictionary-based bio-entity extraction technique. The approach expands the bio-entity dictionary by combining different data sources and improves the recall rate through the shortest path edit distance algorithm. In addition, the proposed technique adopts text mining techniques in the merging stage of similar entities such as Part of Speech (POS) expansion, stemming, and the exploitation of the contextual cues to further improve the performance. Results: The experimental results show that the proposed technique achieves the best or at least equivalent performance among compared techniques, GENIA, MESH, UMLS, and combinations of these three resources in F-measure. Conclusions: The results imply that the performance of dictionary-based extraction techniques is largely influenced by information resources used to build the dictionary. In addition, the edit distance algorithm shows steady performance with three different dictionaries in precision whereas the context-only technique achieves a high-end performance with three different dictionaries in recall. PMID:26043907

  1. Technical Standards for Command and Control Information Systems (CCISs)

    DTIC Science & Technology

    1992-01-01

    initiation, management, scheduling, resource allocation, logical and physical device access, interrupt handling; Conformance Testing; IEEE P1003... 5.2.3 Remote Data Access (RDA); 5.2.4 Information Resource Dictionary... 7.2.1.2 POSIX Conformance Testing; 7.2.2 Consortia Recommendations

  2. Recognition of chemical entities: combining dictionary-based and grammar-based approaches.

    PubMed

    Akhondi, Saber A; Hettne, Kristina M; van der Horst, Eelke; van Mulligen, Erik M; Kors, Jan A

    2015-01-01

    The past decade has seen an upsurge in the number of publications in chemistry. The ever-swelling volume of available documents makes it increasingly hard to extract relevant new information from such unstructured texts. The BioCreative CHEMDNER challenge invites the development of systems for the automatic recognition of chemicals in text (CEM task) and for ranking the recognized compounds at the document level (CDI task). We investigated an ensemble approach where dictionary-based named entity recognition is used along with grammar-based recognizers to extract compounds from text. We assessed the performance of ten different commercial and publicly available lexical resources using an open source indexing system (Peregrine), in combination with three different chemical compound recognizers and a set of regular expressions to recognize chemical database identifiers. The effect of different stop-word lists, case-sensitivity matching, and use of chunking information was also investigated. We focused on lexical resources that provide chemical structure information. To rank the different compounds found in a text, we used a term confidence score based on the normalized ratio of the term frequencies in chemical and non-chemical journals. The use of stop-word lists greatly improved the performance of the dictionary-based recognition, but there was no additional benefit from using chunking information. A combination of ChEBI and HMDB as lexical resources, the LeadMine tool for grammar-based recognition, and the regular expressions, outperformed any of the individual systems. On the test set, the F-scores were 77.8% (recall 71.2%, precision 85.8%) for the CEM task and 77.6% (recall 71.7%, precision 84.6%) for the CDI task. Missed terms were mainly due to tokenization issues, poor recognition of formulas, and term conjunctions. We developed an ensemble system that combines dictionary-based and grammar-based approaches for chemical named entity recognition, outperforming any of the individual systems that we considered. The system is able to provide structure information for most of the compounds that are found. Improved tokenization and better recognition of specific entity types are likely to further improve system performance.

  3. Recognition of chemical entities: combining dictionary-based and grammar-based approaches

    PubMed Central

    2015-01-01

    Background: The past decade has seen an upsurge in the number of publications in chemistry. The ever-swelling volume of available documents makes it increasingly hard to extract relevant new information from such unstructured texts. The BioCreative CHEMDNER challenge invites the development of systems for the automatic recognition of chemicals in text (CEM task) and for ranking the recognized compounds at the document level (CDI task). We investigated an ensemble approach where dictionary-based named entity recognition is used along with grammar-based recognizers to extract compounds from text. We assessed the performance of ten different commercial and publicly available lexical resources using an open source indexing system (Peregrine), in combination with three different chemical compound recognizers and a set of regular expressions to recognize chemical database identifiers. The effect of different stop-word lists, case-sensitivity matching, and use of chunking information was also investigated. We focused on lexical resources that provide chemical structure information. To rank the different compounds found in a text, we used a term confidence score based on the normalized ratio of the term frequencies in chemical and non-chemical journals. Results: The use of stop-word lists greatly improved the performance of the dictionary-based recognition, but there was no additional benefit from using chunking information. A combination of ChEBI and HMDB as lexical resources, the LeadMine tool for grammar-based recognition, and the regular expressions, outperformed any of the individual systems. On the test set, the F-scores were 77.8% (recall 71.2%, precision 85.8%) for the CEM task and 77.6% (recall 71.7%, precision 84.6%) for the CDI task. Missed terms were mainly due to tokenization issues, poor recognition of formulas, and term conjunctions. Conclusions: We developed an ensemble system that combines dictionary-based and grammar-based approaches for chemical named entity recognition, outperforming any of the individual systems that we considered. The system is able to provide structure information for most of the compounds that are found. Improved tokenization and better recognition of specific entity types are likely to further improve system performance. PMID:25810767

  4. Information Resource Management in the DCSPLANS (Deputy Chief of Staff for Plans) Branch of the U.S. Army Military Personnel Center

    DTIC Science & Technology

    1986-11-15

    Richard E. Broome [Broome 1985], Major Robert M. DiBona [DiBona 1985], Major Robert A. Kirsch II [Kirsch 1985], and Major Alan F. Noel, Jr. [Noel 1985... Major Kirsch expanded this prototype to be compatible with the emerging Federal standards for dictionary systems. Major DiBona analyzed data validation... described in [Noel 1985, Kirsch 1985]. The implementation of edit validation rules in a dictionary environment is covered in [DiBona 1985] and implementation

  5. Speech and Language and Language Translation (SALT)

    DTIC Science & Technology

    2012-12-01

    Resources are classified as: Parallel Text; Dictionaries; Monolingual Text; Other. Dictionaries are further classified as: Text: can download entire... not clear how many are translated http://www.redsea-online.com/modules.php?name= dictionary Monolingual Text Monolingual Text; An Crubadan web... attached to a following word. A program could be written to detach the character د from unknown words, when the remaining word matches a dictionary
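
    The closing suggestion in this excerpt, detaching the character د from an unknown word when the remainder matches a dictionary, might be sketched as follows. This is a hedged illustration in Python and not the report's program: the function name `detach_prefix`, the toy lexicon, and the choice to leave already-known words untouched are assumptions made for the example.

```python
from typing import List, Set

PREFIX = "د"  # the character named in the excerpt


def detach_prefix(token: str, lexicon: Set[str], prefix: str = PREFIX) -> List[str]:
    """Split off a leading prefix from an unknown token if the rest is a known word."""
    if token in lexicon:
        return [token]  # known word: leave it alone, even if it starts with the prefix
    remainder = token[len(prefix):]
    if token.startswith(prefix) and remainder in lexicon:
        return [prefix, remainder]  # unknown word whose remainder is in the lexicon
    return [token]  # still unknown: return unchanged


# Toy lexicon with a single illustrative entry (an assumption, not data from the report).
lexicon = {"کور"}
print(detach_prefix("دکور", lexicon))
print(detach_prefix("کور", lexicon))
```

    Checking the lexicon before splitting keeps genuine words that happen to begin with the same character from being broken apart.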

  6. Paper, Electronic or Online? Different Dictionaries for Different Activities

    ERIC Educational Resources Information Center

    Pasfield-Neofitou, Sarah

    2009-01-01

    Despite research suggesting that teachers highly influence their students' knowledge and use of language learning resources such as dictionaries (Loucky, 2005; Yamane, 2006), it appears that dictionary selection and use is considered something to be dealt with outside the classroom. As a result, many students receive too little advice to be able…

  7. NCI Dictionary of Genetics Terms

    Cancer.gov

    A dictionary of more than 150 genetics-related terms written for healthcare professionals. This resource was developed to support the comprehensive, evidence-based, peer-reviewed PDQ cancer genetics information summaries.

  8. The SMAP Dictionary Management System

    NASA Technical Reports Server (NTRS)

    Smith, Kevin A.; Swan, Christoper A.

    2014-01-01

    The Soil Moisture Active Passive (SMAP) Dictionary Management System is a web-based tool to develop and store a mission dictionary. A mission dictionary defines the interface between a ground system and a spacecraft. In recent years, mission dictionaries have grown in size and scope, making it difficult for engineers across multiple disciplines to coordinate the dictionary development effort. The Dictionary Management System addresses these issues by placing all dictionary information in one place, taking advantage of the efficiencies inherent in co-locating what were once disparate dictionary development efforts.

  9. A study of actions in operative notes.

    PubMed

    Wang, Yan; Pakhomov, Serguei; Burkart, Nora E; Ryan, James O; Melton, Genevieve B

    2012-01-01

    Operative notes contain rich information about techniques, instruments, and materials used in procedures. To assist development of effective information extraction (IE) techniques for operative notes, we investigated the sublanguage used to describe actions within the operative report 'procedure description' section. Deep parsing results of 362,310 operative notes with an expanded Stanford parser using the SPECIALIST Lexicon resulted in 200 verbs (92% coverage) including 147 action verbs. Nominal action predicates for each action verb were gathered from WordNet, SPECIALIST Lexicon, New Oxford American Dictionary and Stedman's Medical Dictionary. Coverage gaps were seen in existing lexical, domain, and semantic resources (Unified Medical Language System (UMLS) Metathesaurus, SPECIALIST Lexicon, WordNet and FrameNet). Our findings demonstrate the need to construct surgical domain-specific semantic resources for IE from operative notes.

  10. The Effect of Bilingual Term List Size on Dictionary-Based Cross-Language Information Retrieval

    DTIC Science & Technology

    2003-02-01

    ...are extensively used as a resource for dictionary-based Cross-Language Information Retrieval (CLIR), in which the goal is to find documents written

  11. Improving Feature Representation Based on a Neural Network for Author Profiling in Social Media Texts

    PubMed Central

    2016-01-01

    We introduce a lexical resource for preprocessing social media data. We show that a neural network-based feature representation is enhanced by using this resource. We conducted experiments on the PAN 2015 and PAN 2016 author profiling corpora and obtained better results when performing the data preprocessing using the developed lexical resource. The resource includes dictionaries of slang words, contractions, abbreviations, and emoticons commonly used in social media. Each of the dictionaries was built for the English, Spanish, Dutch, and Italian languages. The resource is freely available. PMID:27795703

  12. A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources.

    PubMed

    Moon, Sungrim; Pakhomov, Serguei; Liu, Nathan; Ryan, James O; Melton, Genevieve B

    2014-01-01

    To create a sense inventory of abbreviations and acronyms from clinical texts. The most frequently occurring abbreviations and acronyms from 352,267 dictated clinical notes were used to create a clinical sense inventory. Senses of each abbreviation and acronym were manually annotated from 500 random instances and lexically matched with long forms within the Unified Medical Language System (UMLS V.2011AB), Another Database of Abbreviations in Medline (ADAM), and Stedman's Dictionary, Medical Abbreviations, Acronyms & Symbols, 4th edition (Stedman's). Redundant long forms were merged after they were lexically normalized using Lexical Variant Generation (LVG). The clinical sense inventory was found to have skewed sense distributions, practice-specific senses, and incorrect uses. Of 440 abbreviations and acronyms analyzed in this study, 949 long forms were identified in clinical notes. This set was mapped to 17,359, 5233, and 4879 long forms in UMLS, ADAM, and Stedman's, respectively. After merging long forms, only 2.3% matched across all medical resources. The UMLS, ADAM, and Stedman's covered 5.7%, 8.4%, and 11% of the merged clinical long forms, respectively. The sense inventory of clinical abbreviations and acronyms and anonymized datasets generated from this study are available for public use at http://www.bmhi.umn.edu/ihi/research/nlpie/resources/index.htm ('Sense Inventories', website). Clinical sense inventories of abbreviations and acronyms created using clinical notes and medical dictionary resources demonstrate challenges with term coverage and resource integration. Further work is needed to help with standardizing abbreviations and acronyms in clinical care and biomedicine to facilitate automated processes such as text-mining and information extraction.

  13. Natural-Annotation-based Unsupervised Construction of Korean-Chinese Domain Dictionary

    NASA Astrophysics Data System (ADS)

    Liu, Wuying; Wang, Lin

    2018-03-01

    The large-scale bilingual parallel resource is significant to statistical learning and deep learning in natural language processing. This paper addresses the automatic construction issue of the Korean-Chinese domain dictionary, and presents a novel unsupervised construction method based on the natural annotation in the raw corpus. We firstly extract all Korean-Chinese word pairs from Korean texts according to natural annotations, secondly transform the traditional Chinese characters into the simplified ones, and finally distill out a bilingual domain dictionary after retrieving the simplified Chinese words in an extra Chinese domain dictionary. The experimental results show that our method can automatically build multiple Korean-Chinese domain dictionaries efficiently.

  14. Dictionaries and distributions: Combining expert knowledge and large scale textual data content analysis : Distributed dictionary representation.

    PubMed

    Garten, Justin; Hoover, Joe; Johnson, Kate M; Boghrati, Reihane; Iskiwitch, Carol; Dehghani, Morteza

    2018-02-01

    Theory-driven text analysis has made extensive use of psychological concept dictionaries, leading to a wide range of important results. These dictionaries have generally been applied through word count methods which have proven to be both simple and effective. In this paper, we introduce Distributed Dictionary Representations (DDR), a method that applies psychological dictionaries using semantic similarity rather than word counts. This allows for the measurement of the similarity between dictionaries and spans of text ranging from complete documents to individual words. We show how DDR enables dictionary authors to place greater emphasis on construct validity without sacrificing linguistic coverage. We further demonstrate the benefits of DDR on two real-world tasks and finally conduct an extensive study of the interaction between dictionary size and task performance. These studies allow us to examine how DDR and word count methods complement one another as tools for applying concept dictionaries and where each is best applied. Finally, we provide references to tools and resources to make this method both available and accessible to a broad psychological audience.
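
    The contrast drawn above between word counts and semantic similarity can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a dictionary and a text are each summarized by the mean of their word vectors and compared by cosine similarity, and the function names and the two-dimensional toy embeddings are invented for the example.

```python
import math

# Toy 2-d word vectors, invented for illustration only (not real embeddings).
TOY_EMBEDDINGS = {
    "kind": [1.0, 0.0],
    "caring": [0.9, 0.1],
    "gentle": [0.8, 0.2],
    "tax": [0.0, 1.0],
    "audit": [0.1, 0.9],
}


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def ddr_similarity(dictionary_words, text_tokens, embeddings):
    """Similarity between a concept dictionary and a span of text.

    Assumption: both sides are represented by their mean word vector.
    """
    dict_vec = mean_vector(dictionary_words, embeddings)
    text_vec = mean_vector(text_tokens, embeddings)
    if dict_vec is None or text_vec is None:
        return 0.0
    return cosine(dict_vec, text_vec)


def word_count_score(dictionary_words, text_tokens):
    """Baseline: fraction of tokens that appear verbatim in the dictionary."""
    if not text_tokens:
        return 0.0
    vocabulary = set(dictionary_words)
    return sum(1 for t in text_tokens if t in vocabulary) / len(text_tokens)
```

    With the toy vectors, a text containing only "gentle" scores zero under word counting against the dictionary ["kind", "caring"], yet it is far closer to that dictionary in vector space than a text about "tax" and "audit". That is the kind of coverage gap a similarity-based method is meant to close.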

  15. Domain Adaptation of Translation Models for Multilingual Applications

    DTIC Science & Technology

    2009-04-01

    expansion effect that corpus (or dictionary) based translation introduces - however, this effect is maintained even with monolingual query expansion [12...every day; bilingual web pages are harvested as parallel corpora as the quantity of non-English data on the web increases; online dictionaries of...approach is to customize translation models to a domain, by automatically selecting the resources (dictionaries, parallel corpora) that are best for

  16. Label consistent K-SVD: learning a discriminative dictionary for recognition.

    PubMed

    Jiang, Zhuolin; Lin, Zhe; Davis, Larry S

    2013-11-01

    A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistency constraint called "discriminative sparse-code error" and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single overcomplete dictionary and an optimal linear classifier jointly. The incremental dictionary learning algorithm is presented for the situation of limited memory resources. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse-coding techniques for face, action, scene, and object category recognition under the same learning conditions.
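
    The unified objective described above can be written out as a sketch. The notation is an assumption, since the abstract names the three error terms but gives no symbols: Y holds the training signals, D is the dictionary, X the sparse codes with columns x_i, Q the target "discriminative" sparse codes derived from class labels, A a linear transform, H the class-label matrix, W the linear classifier, alpha and beta weighting scalars, and T the sparsity limit.

```latex
\[
\langle D, W, A, X \rangle
  = \arg\min_{D,\,W,\,A,\,X}
    \underbrace{\lVert Y - DX \rVert_F^2}_{\text{reconstruction error}}
  + \alpha \underbrace{\lVert Q - AX \rVert_F^2}_{\text{discriminative sparse-code error}}
  + \beta \underbrace{\lVert H - WX \rVert_F^2}_{\text{classification error}}
\quad \text{s.t.} \quad \lVert x_i \rVert_0 \le T \;\; \forall i
\]
```

    Under this reading, the three terms can be stacked into one larger reconstruction problem, which is consistent with the abstract's statement that the solution is obtained with the K-SVD algorithm.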

  17. Recent research in data description of the measurement property resource on common data dictionary

    NASA Astrophysics Data System (ADS)

    Lu, Tielin; Fan, Zitian; Wang, Chunxi; Liu, Xiaojing; Wang, Shuo; Zhao, Hua

    2018-03-01

    A method for describing measurement equipment data is proposed, based on property resource analysis. The common data dictionary (CDD) is applied to devices and equipment mainly in the digital factory, to improve management both within an enterprise and across different enterprises sharing the same data environment. In this paper, we give a brief account of the data flow across the whole manufacturing enterprise and of the automatic triggering of the data exchange process. Furthermore, the data dictionary can be applied to measurement and control equipment, and can also be used in other industries in smart manufacturing.

  18. Chemical entity recognition in patents by combining dictionary-based and statistical approaches

    PubMed Central

    Akhondi, Saber A.; Pons, Ewoud; Afzal, Zubair; van Haagen, Herman; Becker, Benedikt F.H.; Hettne, Kristina M.; van Mulligen, Erik M.; Kors, Jan A.

    2016-01-01

    We describe the development of a chemical entity recognition system and its application in the CHEMDNER-patent track of BioCreative 2015. This community challenge includes a Chemical Entity Mention in Patents (CEMP) recognition task and a Chemical Passage Detection (CPD) classification task. We addressed both tasks by an ensemble system that combines a dictionary-based approach with a statistical one. For this purpose the performance of several lexical resources was assessed using Peregrine, our open-source indexing engine. We combined our dictionary-based results on the patent corpus with the results of tmChem, a chemical recognizer using a conditional random field classifier. To improve the performance of tmChem, we utilized three additional features, viz. part-of-speech tags, lemmas and word-vector clusters. When evaluated on the training data, our final system obtained an F-score of 85.21% for the CEMP task, and an accuracy of 91.53% for the CPD task. On the test set, the best system ranked sixth among 21 teams for CEMP with an F-score of 86.82%, and second among nine teams for CPD with an accuracy of 94.23%. The differences in performance between the best ensemble system and the statistical system separately were small. Database URL: http://biosemantics.org/chemdner-patents PMID:27141091

  19. A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources

    PubMed Central

    Moon, Sungrim; Pakhomov, Serguei; Liu, Nathan; Ryan, James O; Melton, Genevieve B

    2014-01-01

    Objective: To create a sense inventory of abbreviations and acronyms from clinical texts. Methods: The most frequently occurring abbreviations and acronyms from 352 267 dictated clinical notes were used to create a clinical sense inventory. Senses of each abbreviation and acronym were manually annotated from 500 random instances and lexically matched with long forms within the Unified Medical Language System (UMLS V.2011AB), Another Database of Abbreviations in Medline (ADAM), and Stedman's Dictionary, Medical Abbreviations, Acronyms & Symbols, 4th edition (Stedman's). Redundant long forms were merged after they were lexically normalized using Lexical Variant Generation (LVG). Results: The clinical sense inventory was found to have skewed sense distributions, practice-specific senses, and incorrect uses. Of 440 abbreviations and acronyms analyzed in this study, 949 long forms were identified in clinical notes. This set was mapped to 17 359, 5233, and 4879 long forms in UMLS, ADAM, and Stedman's, respectively. After merging long forms, only 2.3% matched across all medical resources. The UMLS, ADAM, and Stedman's covered 5.7%, 8.4%, and 11% of the merged clinical long forms, respectively. The sense inventory of clinical abbreviations and acronyms and anonymized datasets generated from this study are available for public use at http://www.bmhi.umn.edu/ihi/research/nlpie/resources/index.htm (‘Sense Inventories’, website). Conclusions: Clinical sense inventories of abbreviations and acronyms created using clinical notes and medical dictionary resources demonstrate challenges with term coverage and resource integration. Further work is needed to help with standardizing abbreviations and acronyms in clinical care and biomedicine to facilitate automated processes such as text-mining and information extraction. PMID:23813539

  20. CLIR Experiments at Maryland for TREC-2002: Evidence Combination for Arabic-English Retrieval

    DTIC Science & Technology

    2002-01-01

    translation resources of three types (machine translation lexicons, a printed bilingual dictionary that had been manually rekeyed, and translation...on both the term list and the collection). • The Salmone Arabic-to-English dictionary, which was made available for use in the TREC-CLIR track by...Tufts University. No translation preference information is provided in this dictionary, but it does include rich markup describing morphology and part

  1. Data Management Standards in Computer-aided Acquisition and Logistic Support (CALS)

    NASA Technical Reports Server (NTRS)

    Jefferson, David K.

    1990-01-01

    Viewgraphs and discussion on data management standards in computer-aided acquisition and logistic support (CALS) are presented. CALS is intended to reduce cost, increase quality, and improve timeliness of weapon system acquisition and support by greatly improving the flow of technical information. The phase 2 standards, industrial environment, are discussed. The information resource dictionary system (IRDS) is described.

  2. Study of Large Data Resources for Multilingual Training and System Porting (Pub Version, Open Access)

    DTIC Science & Technology

    2016-05-03

    extraction trained on a large database corpus – English Fisher. Although the performance of ported monolingual system would be worse in comparison...

    Language        TE      LI      HA      LA      ZU
    LLP hours       8.6     9.6     7.9     8.1     8.4
    LM sentences    11935   10743   9861    11577   10644
    LM words        68175   83157   93131   93328   60832
    dictionary      14505

  3. Data Base Directions: Information Resource Management - Strategies and Tools. Proceedings of the Workshop of the National Bureau of Standards and the Association for Computing Machinery (Ft. Lauderdale, Florida, October 20-22, 1980).

    ERIC Educational Resources Information Center

    Goldfine, Alan H., Ed.

    This workshop investigated how managers can evaluate, select, and effectively use information resource management (IRM) tools, especially data dictionary systems (DDS). An executive summary, which provides a definition of IRM as developed by workshop participants, precedes the keynote address, "Data: The Raw Material of a Paper Factory,"…

  4. D1-3: Marshfield Dictionary of Clinical and Translational Science (MD-CTS): An Online Reference for Clinical and Translational Science Terminology

    PubMed Central

    Finamore, Joe; Ray, William; Kadolph, Chris; Rastegar-Mojarad, Majid; Ye, Zhan; Jacqueline, Bohne; Tachinardi, Umberto; Mendonça, Eneida; Finnegan, Brian; Bartkowiak, Barbara; Weichelt, Bryan; Lin, Simon

    2014-01-01

    Background/Aims: New terms are rapidly appearing in the literature and practice of clinical medicine and translational research. To catalog real-world usage of medical terms, we report the first construction of an online dictionary of clinical and translational medicinal terms, which are computationally generated in near real-time using a big data approach. This project is NIH CTSA-funded and developed by the Marshfield Clinic Research Foundation in conjunction with University of Wisconsin - Madison. Currently titled Marshfield Dictionary of Clinical and Translational Science (MD-CTS), this application is a Google-like word search tool. By entering a term into the search bar, MD-CTS will display that term’s definition, usage examples, contextual terms, related images, and ontological information. A prototype is available for public viewing at http://spellchecker.mfldclin.edu/. Methods: We programmatically derived the lexicon for MD-CTS from scholarly communications by parsing through 15,156,745 MEDLINE abstracts and extracting all of the unique words found therein. We then ran this list through several filters in order to remove words that were not relevant for searching, such as common English words and numeric expressions. We then loaded the resulting 1,795,769 terms into SQL tables. Each term is cross-referenced with every occurrence in all abstracts in which it was found. Additional information is aggregated from Wiktionary, Bioportal, and Wikipedia in real-time and displayed on-screen. From this lexicon we created a supplemental dictionary resource (updated quarterly) to be used in Microsoft Office® products. Results: We evaluated the utility of MD-CTS by creating a list of 100 words derived from recent clinical and translational medicine publications in the week of July 22, 2013. We then performed comparative searches for each term with Taber’s Cyclopedic Medical Dictionary, Stedman’s Medical Dictionary, Dorland’s Illustrated Medical Dictionary, Medical Subject Headings (MeSH), and MD-CTS. We compared our supplemental dictionary resource to OpenMedSpell for effectiveness in accuracy of term recognition. Conclusions: In summary, we developed an online mobile and desktop reference, which comprehensively integrates Wiktionary (term information), Bioportal (ontological information), Wikipedia (related images), and Medline abstract information (term usage) for scientists and clinicians to browse in real-time. We also created a supplemental dictionary resource to be used in Microsoft Office® products.

  5. Supporting infobuttons with terminological knowledge.

    PubMed Central

    Cimino, J. J.; Elhanan, G.; Zeng, Q.

    1997-01-01

    We have developed several prototype applications which integrate clinical systems with on-line information resources by using patient data to drive queries in response to user information needs. We refer to these collectively as infobuttons because they are evoked with a minimum of keyboard entry. We make use of knowledge in our terminology, the Medical Entities Dictionary (MED) to assist with the selection of appropriate queries and resources, as well as the translation of patient data to forms recognized by the resources. This paper describes the kinds of knowledge in the MED, including literal attributes, hierarchical links and other semantic links, and how this knowledge is used in system integration. PMID:9357682

  6. Supporting infobuttons with terminological knowledge.

    PubMed

    Cimino, J J; Elhanan, G; Zeng, Q

    1997-01-01

    We have developed several prototype applications which integrate clinical systems with on-line information resources by using patient data to drive queries in response to user information needs. We refer to these collectively as infobuttons because they are evoked with a minimum of keyboard entry. We make use of knowledge in our terminology, the Medical Entities Dictionary (MED) to assist with the selection of appropriate queries and resources, as well as the translation of patient data to forms recognized by the resources. This paper describes the kinds of knowledge in the MED, including literal attributes, hierarchical links and other semantic links, and how this knowledge is used in system integration.

  7. Data dictionaries in information systems - Standards, usage, and application

    NASA Technical Reports Server (NTRS)

    Johnson, Margaret

    1990-01-01

    An overview of data dictionary systems and the role of standardization in the interchange of data dictionaries is presented. The development of the data dictionary for the Planetary Data System is cited as an example. The data element dictionary (DED), which is the repository of the definitions of the vocabulary utilized in an information system, is an important part of this service. A DED provides the definitions of the fields of the data set as well as the data elements of the catalog system. Finally, international efforts such as the Consultative Committee on Space Data Systems and other committees set up to provide standard recommendations on the usage and structure of data dictionaries in the international space science community are discussed.

  8. Resilience - A Concept

    DTIC Science & Technology

    2016-04-05

    dictionary]. Retrieved from http://www.investopedia.com/terms/b/blackbox.asp Bodeau, D., Brtis, J., Graubart, R., & Salwen, J. (2013). Resiliency...techniques for systems-of-systems (Report No. 13-3513). Bedford, MA: The MITRE Corporation. Confidence, (n.d.). In Oxford dictionaries [Online dictionary...Acquisition, Technology and Logistics. Holistic Strategy Approach. (n.d.). In BusinessDictionary.com [Online business dictionary]. Retrieved from http

  9. Classic Classroom Activities: The Oxford Picture Dictionary Program.

    ERIC Educational Resources Information Center

    Weiss, Renee; Adelson-Goldstein, Jayme; Shapiro, Norma

    This teacher resource book offers over 100 reproducible communicative practice activities and 768 picture cards based on the vocabulary of the Oxford Picture Dictionary. Teacher's notes and instructions, including adaptations for multilevel classes, are provided. The activities book has up-to-date art and graphics, explaining over 3700 words. The…

  10. Specifications for a Federal Information Processing Standard Data Dictionary System

    NASA Technical Reports Server (NTRS)

    Goldfine, A.

    1984-01-01

    The development of a software specification that Federal agencies may use in evaluating and selecting data dictionary systems (DDS) is discussed. To supply the flexibility needed by widely different applications and environments in the Federal Government, the Federal Information Processing Standard (FIPS) specifies a core DDS together with an optional set of modules. The focus and status of the development project are described. Functional specifications for the FIPS DDS are examined for the dictionary, the dictionary schema, and the dictionary processing system. The DDS user interfaces and DDS software interfaces are discussed as well as dictionary administration.

  11. Chemical entity recognition in patents by combining dictionary-based and statistical approaches.

    PubMed

    Akhondi, Saber A; Pons, Ewoud; Afzal, Zubair; van Haagen, Herman; Becker, Benedikt F H; Hettne, Kristina M; van Mulligen, Erik M; Kors, Jan A

    2016-01-01

    We describe the development of a chemical entity recognition system and its application in the CHEMDNER-patent track of BioCreative 2015. This community challenge includes a Chemical Entity Mention in Patents (CEMP) recognition task and a Chemical Passage Detection (CPD) classification task. We addressed both tasks by an ensemble system that combines a dictionary-based approach with a statistical one. For this purpose the performance of several lexical resources was assessed using Peregrine, our open-source indexing engine. We combined our dictionary-based results on the patent corpus with the results of tmChem, a chemical recognizer using a conditional random field classifier. To improve the performance of tmChem, we utilized three additional features, viz. part-of-speech tags, lemmas and word-vector clusters. When evaluated on the training data, our final system obtained an F-score of 85.21% for the CEMP task, and an accuracy of 91.53% for the CPD task. On the test set, the best system ranked sixth among 21 teams for CEMP with an F-score of 86.82%, and second among nine teams for CPD with an accuracy of 94.23%. The differences in performance between the best ensemble system and the statistical system separately were small. Database URL: http://biosemantics.org/chemdner-patents. © The Author(s) 2016. Published by Oxford University Press.

  12. Chemical Entity Recognition and Resolution to ChEBI

    PubMed Central

    Grego, Tiago; Pesquita, Catia; Bastos, Hugo P.; Couto, Francisco M.

    2012-01-01

    Chemical entities are ubiquitous throughout the biomedical literature, and text-mining systems that can efficiently identify those entities are required. Due to the lack of available corpora and data resources, the community has focused its efforts on the development of gene and protein named entity recognition systems, but with the release of ChEBI and the availability of an annotated corpus, this task can be addressed. We developed a machine-learning-based method for chemical entity recognition and a lexical-similarity-based method for chemical entity resolution and compared them with Whatizit, a popular dictionary-based method. Our methods outperformed the dictionary-based method in all tasks, yielding an improvement in F-measure of 20% for the entity recognition task, 2–5% for the entity-resolution task, and 15% for combined entity recognition and resolution tasks. PMID:25937941

  13. Concordancers and Dictionaries as Problem-Solving Tools for ESL Academic Writing

    ERIC Educational Resources Information Center

    Yoon, Choongil

    2016-01-01

    The present study investigated how 6 Korean ESL graduate students in Canada used a suite of freely available reference resources, consisting of Web-based corpus tools, Google search engines, and dictionaries, for solving linguistic problems while completing an authentic academic writing assignment in English. Using a mixed methods design, the…

  14. The Primary Computer Dictionary.

    ERIC Educational Resources Information Center

    Girard, Suzanne; Willing, Kathlene

    Suitable for children from kindergarten to grade three, this dictionary is designed to introduce young children to computer terminology at a level that they will understand and find useful. It is also suitable for parents as a home resource, for library use, and as a handbook for teachers. The first sentence of each definition contains the kernel…

  15. Syntactic and Semantic Specifications in Online English Learners' Dictionaries

    ERIC Educational Resources Information Center

    Rizo-Rodriguez, Alfonso

    2009-01-01

    Among the multifarious linguistic resources currently available on the Internet, learners of English as a foreign language, as well as teachers and translators, can effortlessly access a vast variety of electronic dictionaries well suited to a multiplicity of lookup operations. A particular kind of lexicographical work on the Web is the…

  16. The Database Query Support Processor (QSP)

    NASA Technical Reports Server (NTRS)

    1993-01-01

    The number and diversity of databases available to users continues to increase dramatically. Currently, the trend is towards decentralized, client server architectures that (on the surface) are less expensive to acquire, operate, and maintain than information architectures based on centralized, monolithic mainframes. The database query support processor (QSP) effort evaluates the performance of a network level, heterogeneous database access capability. Air Force Materiel Command's Rome Laboratory has developed an approach, based on ANSI standard X3.138 - 1988, 'The Information Resource Dictionary System (IRDS)', for seamless access to heterogeneous databases based on extensions to data dictionary technology. To successfully query a decentralized information system, users must know what data are available from which source, or have the knowledge and system privileges necessary to find out this information. Privacy and security considerations prohibit free and open access to every information system in every network. Even in completely open systems, time required to locate relevant data (in systems of any appreciable size) would be better spent analyzing the data, assuming the original question was not forgotten. Extensions to data dictionary technology have the potential to more fully automate the search and retrieval for relevant data in a decentralized environment. Substantial amounts of time and money could be saved by not having to teach users what data resides in which systems and how to access each of those systems. Information describing data and how to get it could be removed from the application and placed in a dedicated repository where it belongs. The result is simplified applications that are less brittle and less expensive to build and maintain. Software technology providing the required functionality is off the shelf. The key difficulty is in defining the metadata required to support the process. The database query support processor effort will provide quantitative data on the amount of effort required to implement an extended data dictionary at the network level, add new systems, adapt to changing user needs, and provide sound estimates on operations and maintenance costs and savings.

  17. Dictionary learning-based CT detection of pulmonary nodules

    NASA Astrophysics Data System (ADS)

    Wu, Panpan; Xia, Kewen; Zhang, Yanbo; Qian, Xiaohua; Wang, Ge; Yu, Hengyong

    2016-10-01

    Segmentation of lung features is one of the most important steps for computer-aided detection (CAD) of pulmonary nodules with computed tomography (CT). However, irregular shapes, complicated anatomical background and poor pulmonary nodule contrast make CAD a very challenging problem. Here, we propose a novel scheme for feature extraction and classification of pulmonary nodules through dictionary learning from training CT images, which does not require accurately segmented pulmonary nodules. Specifically, two classification-oriented dictionaries and one background dictionary are learnt to solve a two-category problem. In terms of the classification-oriented dictionaries, we calculate sparse coefficient matrices to extract intrinsic features for pulmonary nodule classification. The support vector machine (SVM) classifier is then designed to optimize the performance. Our proposed methodology is evaluated with the lung image database consortium and image database resource initiative (LIDC-IDRI) database, and the results demonstrate that the proposed strategy is promising.

  18. Sentiment analysis of political communication: combining a dictionary approach with crowdcoding.

    PubMed

    Haselmayer, Martin; Jenny, Marcelo

    2017-01-01

    Sentiment is important in studies of news values, public opinion, negative campaigning or political polarization and an explosive expansion of digital textual data and fast progress in automated text analysis provide vast opportunities for innovative social science research. Unfortunately, tools currently available for automated sentiment analysis are mostly restricted to English texts and require considerable contextual adaption to produce valid results. We present a procedure for collecting fine-grained sentiment scores through crowdcoding to build a negative sentiment dictionary in a language and for a domain of choice. The dictionary enables the analysis of large text corpora that resource-intensive hand-coding struggles to cope with. We calculate the tonality of sentences from dictionary words and we validate these estimates with results from manual coding. The results show that the crowdbased dictionary provides efficient and valid measurement of sentiment. Empirical examples illustrate its use by analyzing the tonality of party statements and media reports.
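
    Computing sentence tonality from dictionary words might look like the sketch below. The word scores, the score scale, and the choice of averaging the matched words are assumptions made for illustration; the abstract does not state the aggregation rule the authors used.

```python
import re

# Hypothetical crowd-coded negativity scores (higher = more negative).
NEGATIVE_SCORES = {"scandal": 3.5, "failure": 3.0, "weak": 2.0}


def sentence_tonality(sentence, scores=NEGATIVE_SCORES):
    """Mean score of the dictionary words found in the sentence.

    Returns 0.0 when no dictionary word occurs (assumed to mean neutral).
    """
    tokens = re.findall(r"\w+", sentence.lower())
    hits = [scores[t] for t in tokens if t in scores]
    return sum(hits) / len(hits) if hits else 0.0
```

    Validation against manual coding, as described in the abstract, would then compare these per-sentence values with hand-assigned scores for the same sentences.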

  19. Creating a Digital Jamaican Sign Language Dictionary: A R2D2 Approach

    ERIC Educational Resources Information Center

    MacKinnon, Gregory; Soutar, Iris

    2015-01-01

    The Jamaican Association for the Deaf, in their responsibilities to oversee education for individuals who are deaf in Jamaica, has demonstrated an urgent need for a dictionary that assists students, educators, and parents with the practical use of "Jamaican Sign Language." While paper versions of a preliminary resource have been explored…

  20. The "Dictionary of Smoky Mountain English" as a Resource for Southern Appalachia.

    ERIC Educational Resources Information Center

    Montgomery, Michael

    This paper argues that one important reflection of a culture's status is the existence of general reference books on it. To this end, it discusses the forthcoming "Dictionary of Smoky Mountain English," a book designed to address the lack of a comprehensive reference work on Appalachian speech and language patterns in this region. The…

  1. Dave Sperling's Guide to the Internet's Best Writing Resources.

    ERIC Educational Resources Information Center

    Sperling, Dave

    2003-01-01

    Provides a guide to writing resources on the Internet, including resources for business writing, dictionaries and thesauruses, e-mail, encyclopedias, free Web space, grammar, fun, online help, online writing labs, punctuation, and spelling. Lists useful Internet tips. (Author/VWL)

  2. A Knowledge Dictionary System for Scheduling Support

    DTIC Science & Technology

    1988-10-01

    quantity of the resource is allocated; (g) two (or more) activities that conflict temporally, can only proceed if one or more of the activities are re...does not allow any parameter substitution but merely processes the contents of the file as a series of keystrokes. To create a macro, simply type...bytes in the system. When the hardware actually uses such a structure (e.g., the Motorola 68000 series CPU) the OS will almost always present it that

  3. Essential Nursing References.

    ERIC Educational Resources Information Center

    Nursing and Health Care Perspectives, 2000

    2000-01-01

    This partially annotated bibliography contains these categories: abstract sources, archives, audiovisuals, bibliographies, databases, dictionaries, directories, drugs/toxicology/environmental health, grant resources, histories, indexes, Internet resources, reviews, statistical sources, and writers' manuals and guides. A supplement lists Canadian…

  4. A Standard-Driven Data Dictionary for Data Harmonization of Heterogeneous Datasets in Urban Geological Information Systems

    NASA Astrophysics Data System (ADS)

    Liu, G.; Wu, C.; Li, X.; Song, P.

    2013-12-01

    The 3D urban geological information system has been a major part of the national urban geological survey project of China Geological Survey in recent years. Large amounts of multi-source and multi-subject data are to be stored in the urban geological databases. There are various models and vocabularies drafted and applied by industrial companies in urban geological data. Issues such as duplicate and ambiguous definitions of terms and differing coding structures increase the difficulty of information sharing and data integration. To solve this problem, we proposed a national standard-driven information classification and coding method to effectively store and integrate urban geological data, and we applied the data dictionary technology to achieve structural and standard data storage. The overall purpose of this work is to set up a common data platform to provide information sharing service. Research progress is as follows: (1) A unified classification and coding method for multi-source data based on national standards. Underlying national standards include GB 9649-88 for geology and GB/T 13923-2006 for geography. Current industrial models are compared with national standards to build a mapping table. The attributes of various urban geological data entity models are reduced to several categories according to their application phases and domains. Then a logical data model is set up as a standard format to design data file structures for a relational database. (2) A multi-level data dictionary for data standardization constraint. 
Three levels of data dictionary are designed: model data dictionary is used to manage system database files and enhance maintenance of the whole database system; attribute dictionary organizes fields used in database tables; term and code dictionary is applied to provide a standard for urban information system by adopting appropriate classification and coding methods; comprehensive data dictionary manages system operation and security. (3) An extension to system data management function based on data dictionary. The data item constraint input function uses the standard term and code dictionary to produce standardized input. Attribute dictionary organizes all the fields of an urban geological information database to ensure the consistency of term use for fields. Model dictionary is used to generate a database operation interface automatically with standard semantic content via term and code dictionary. The above method and technology have been applied to the construction of Fuzhou Urban Geological Information System, South-East China with satisfactory results.
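
    The constraint-input idea in item (3) can be sketched as follows: an attribute dictionary says which standard vocabulary governs a field, and a term and code dictionary turns a typed value into its standard code or rejects it. The table, field, terms, and codes below are placeholders invented for illustration; they are not taken from GB 9649-88 or GB/T 13923-2006.

```python
# Hypothetical term-and-code dictionary: attribute -> {standard term: code}.
TERM_CODE_DICT = {
    "lithology": {"sandstone": "L01", "mudstone": "L02", "limestone": "L03"},
}

# Hypothetical attribute dictionary: table -> {field: governing attribute}.
ATTRIBUTE_DICT = {
    "borehole_layer": {"rock_type": "lithology"},
}


def encode_input(table, field, value):
    """Map a user-entered value to its standard code, or raise ValueError."""
    attribute = ATTRIBUTE_DICT[table][field]
    terms = TERM_CODE_DICT[attribute]
    key = value.strip().lower()
    if key not in terms:
        allowed = ", ".join(sorted(terms))
        raise ValueError(f"{value!r} is not a standard {attribute} term ({allowed})")
    return terms[key]
```

    Because the lookup tables are data rather than code, adding a new standard term in this sketch changes only the dictionary, which is the maintenance benefit the abstract attributes to the approach.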

  5. Toward better public health reporting using existing off the shelf approaches: A comparison of alternative cancer detection approaches using plaintext medical data and non-dictionary based feature selection.

    PubMed

    Kasthurirathne, Suranga N; Dixon, Brian E; Gichoya, Judy; Xu, Huiping; Xia, Yuni; Mamlin, Burke; Grannis, Shaun J

    2016-04-01

    Increased adoption of electronic health records has resulted in increased availability of free text clinical data for secondary use. A variety of approaches to obtain actionable information from unstructured free text data exist. These approaches are resource intensive, inherently complex and rely on structured clinical data and dictionary-based approaches. We sought to evaluate the potential to obtain actionable information from free text pathology reports using routinely available tools and approaches that do not depend on dictionary-based approaches. We obtained pathology reports from a large health information exchange and evaluated the capacity to detect cancer cases from these reports using 3 non-dictionary feature selection approaches, 4 feature subset sizes, and 5 clinical decision models: simple logistic regression, naïve Bayes, k-nearest neighbor, random forest, and J48 decision tree. The performance of each decision model was evaluated using sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristic (ROC) curve. Decision models parameterized using automated, informed, and manual feature selection approaches yielded similar results. Furthermore, non-dictionary classification approaches identified cancer cases present in free text reports with evaluation measures approaching and exceeding 80-90% for most metrics. Our methods are feasible and practical approaches for extracting substantial information value from free text medical data, and the results suggest that these methods can perform on par with, if not better than, existing dictionary-based approaches. Given that public health agencies are often under-resourced and lack the technical capacity for more complex methodologies, these results represent potentially significant value to the public health field. Copyright © 2016 Elsevier Inc. All rights reserved.
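
    Four of the evaluation measures named above follow directly from a confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are illustrative only and are not results from the study. Area under the ROC curve is omitted because it needs per-report scores rather than counts.

```python
def evaluate(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                  # true positive rate
        "specificity": tn / (tn + fp),                  # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                          # positive predictive value
    }
```

    For example, `evaluate(tp=80, fp=10, tn=90, fn=20)` describes a hypothetical classifier that finds 80 of 100 cancer reports and wrongly flags 10 of 100 non-cancer reports.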

  6. FRS Download Summary and Data Element Dictionary ...

    EPA Pesticide Factsheets

    ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.

  7. RCRAInfo Download Summary and Data Element Dictionary ...

    EPA Pesticide Factsheets

    ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.

  8. Terminological reference of a knowledge-based system: the data dictionary.

    PubMed

    Stausberg, J; Wormek, A; Kraut, U

    1995-01-01

    The development of open and integrated knowledge bases places new demands on the definition of the terminology used. The definition should be realized in a data dictionary separated from the knowledge base. As part of work on a reference model of medical knowledge, a data dictionary has been developed and used in different applications: a term definition shell, a documentation tool and a knowledge base. The data dictionary includes that part of the terminology which is largely independent of any particular knowledge model. For that reason, the data dictionary can be used as a basis for integrating knowledge bases into information systems, for knowledge sharing and reuse and for modular development of knowledge-based systems.

  9. The Junior Computer Dictionary. 101 Useful Words and Definitions to Introduce Students to Computer Terminology.

    ERIC Educational Resources Information Center

    Willing, Kathlene R.; Girard, Suzanne

    Suitable for children from grades four to seven, this dictionary is designed to introduce children to computer terminology at a level that they will understand and find useful. It is also suitable as a home resource for parents, for library use, and as a handbook for teachers. For each word, the first sentence of the definition contains the kernel…

  10. The ABCs of Data Dictionaries

    ERIC Educational Resources Information Center

    Gould, Tate; Nicholas, Amy; Blandford, William; Ruggiero, Tony; Peters, Mary; Thayer, Sara

    2014-01-01

    This overview of the basic components of a data dictionary is designed to educate and inform IDEA Part C and Part B 619 state staff about the purpose and benefits of having up-to-date data dictionaries for their data systems. This report discusses the following topics: (1) What Is a Data Dictionary?; (2) Why Is a Data Dictionary Needed and How Can…

  11. [Systems, boundaries and resources: the lexicographer Gerhard Wahrig (1923-1978) and the genesis of his project "dictionary as database"].

    PubMed

    Wahrig-Burfeind, Renate; Wahrig, Bettina

    2014-09-01

    Gerhard Wahrig's private archive has recently been retrieved by the authors and their siblings. We undertake a first survey of the unpublished material and concentrate on those aspects of Wahrig's bio-ergography which stand in relation to his life project "dictionary as database", realised shortly before his death. We argue that this project was conceived in the 1950s, while Wahrig was writing and editing dictionaries and encyclopedias for the Bibliographisches Institut in Leipzig. Wahrig, who had been a wireless operator in WWII, was well informed about the development of computers in West Germany. He was influenced both by Ferdinand de Saussure and by the discussion on language and structure in the Soviet Union. When he crossed the German/German border in 1959, he experienced mechanisms of exclusion before he could establish himself in the West as a lexicographer. We argue that the transfer of symbolic and human capital was problematic due to the cultural differences between the two Germanies. In the 1970s, he became a professor of General and Applied Linguistics. The project of a "dictionary as database" was intended both as a basis for extensive empirical research on the semantic structure of natural languages and as a working tool for the average user of the German language. Due to his untimely death, he could not pursue his idea of exploring semantic networks.

  12. Data dictionary services in XNAT and the Human Connectome Project.

    PubMed

    Herrick, Rick; McKay, Michael; Olsen, Timothy; Horton, William; Florida, Mark; Moore, Charles J; Marcus, Daniel S

    2014-01-01

    The XNAT informatics platform is an open source data management tool used by biomedical imaging researchers around the world. An important feature of XNAT is its highly extensible architecture: users of XNAT can add new data types to the system to capture the imaging and phenotypic data generated in their studies. Until recently, XNAT has had limited capacity to broadcast the meaning of these data extensions to users, other XNAT installations, and other software. We have implemented a data dictionary service for XNAT, which is currently being used on ConnectomeDB, the Human Connectome Project (HCP) public data sharing website. The data dictionary service provides a framework to define key relationships between data elements and structures across the XNAT installation. This includes not just core data representing medical imaging data or subject or patient evaluations, but also taxonomical structures, security relationships, subject groups, and research protocols. The data dictionary allows users to define metadata for data structures and their properties, such as value types (e.g., textual, integers, floats) and valid value templates, ranges, or field lists. The service provides compatibility and integration with other research data management services by enabling easy migration of XNAT data to standards-based formats such as the Resource Description Framework (RDF), JavaScript Object Notation (JSON), and Extensible Markup Language (XML). It also facilitates the conversion of XNAT's native data schema into standard neuroimaging vocabularies and structures.
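
    The kind of metadata the abstract describes (value types, ranges, and field lists, exportable as JSON) can be illustrated with a small sketch. The field names, the dictionary layout, and the helper functions are assumptions for illustration; this is not XNAT's actual schema or API.

```python
import json

# Hypothetical dictionary entries: field -> value type plus optional constraints.
DATA_DICTIONARY = {
    "age": {"type": "integer", "range": [0, 120]},
    "handedness": {"type": "text", "values": ["left", "right", "ambidextrous"]},
}

# Mapping from the dictionary's type names to Python types (an assumption).
PYTHON_TYPES = {"integer": int, "float": float, "text": str}


def is_valid(field, value):
    """Check a value against the type, range, and value list of its entry."""
    spec = DATA_DICTIONARY[field]
    if not isinstance(value, PYTHON_TYPES[spec["type"]]):
        return False
    if "range" in spec and not (spec["range"][0] <= value <= spec["range"][1]):
        return False
    if "values" in spec and value not in spec["values"]:
        return False
    return True


def export_json():
    """Serialize the dictionary so other tools can read the same definitions."""
    return json.dumps(DATA_DICTIONARY, sort_keys=True)
```

    Keeping the definitions as plain data is what makes the export step trivial here; a conversion to RDF or XML would walk the same structure.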

  13. Data dictionary services in XNAT and the Human Connectome Project

    PubMed Central

    Herrick, Rick; McKay, Michael; Olsen, Timothy; Horton, William; Florida, Mark; Moore, Charles J.; Marcus, Daniel S.

    2014-01-01

    The XNAT informatics platform is an open source data management tool used by biomedical imaging researchers around the world. An important feature of XNAT is its highly extensible architecture: users of XNAT can add new data types to the system to capture the imaging and phenotypic data generated in their studies. Until recently, XNAT has had limited capacity to broadcast the meaning of these data extensions to users, other XNAT installations, and other software. We have implemented a data dictionary service for XNAT, which is currently being used on ConnectomeDB, the Human Connectome Project (HCP) public data sharing website. The data dictionary service provides a framework to define key relationships between data elements and structures across the XNAT installation. This includes not just core data representing medical imaging data or subject or patient evaluations, but also taxonomical structures, security relationships, subject groups, and research protocols. The data dictionary allows users to define metadata for data structures and their properties, such as value types (e.g., textual, integers, floats) and valid value templates, ranges, or field lists. The service provides compatibility and integration with other research data management services by enabling easy migration of XNAT data to standards-based formats such as the Resource Description Framework (RDF), JavaScript Object Notation (JSON), and Extensible Markup Language (XML). It also facilitates the conversion of XNAT's native data schema into standard neuroimaging vocabularies and structures. PMID:25071542

  14. Chinese-English Nuclear and Physics Dictionary.

    ERIC Educational Resources Information Center

    Air Force Systems Command, Wright-Patterson AFB, OH. Foreign Technology Div.

    The Nuclear and Physics Dictionary is one of a series of Chinese-English technical dictionaries prepared by the Foreign Technology Division, United States Air Force Systems Command. The purpose of this dictionary is to provide rapid reference tools for translators, abstractors, and research analysts concerned with scientific and technical…

  15. The T.M.R. Data Dictionary: A Management Tool for Data Base Design

    PubMed Central

    Ostrowski, Maureen; Bernes, Marshall R.

    1984-01-01

    In January 1981, a dictionary-driven ambulatory care information system known as TMR (The Medical Record) was installed at a large private medical group practice in Los Angeles. TMR's data dictionary has enabled the medical group to adapt the software to meet changing user needs largely without programming support. For top management, the dictionary is also a tool for navigating through the system's complexity and assuring the integrity of management goals.

  16. IRDS prototyping with applications to the representation of EA/RA models

    NASA Technical Reports Server (NTRS)

    Lekkos, Anthony A.; Greenwood, Bruce

    1988-01-01

    The requirements and system overview for the Information Resources Dictionary System (IRDS) are described. A formal design specification for a scaled-down IRDS implementation compatible with the proposed FIPS IRDS standard is included. The major design objectives for this IRDS include a menu-driven user interface, implementation of basic IRDS operations, and PC compatibility. The IRDS was implemented using the Smalltalk/V object-oriented programming system and an AT&T 6300 personal computer running under MS-DOS 3.1. The difficulties encountered in using Smalltalk are discussed.

  17. ICIS-Air Download Summary and Data Element Dictionary ...

    EPA Pesticide Factsheets

    ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.

  18. ICIS-FE&C Download Summary and Data Element Dictionary ...

    EPA Pesticide Factsheets

    ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.

  19. ICIS-NPDES Limit Summary and Data Element Dictionary ...

    EPA Pesticide Factsheets

    ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.

  20. Civil Enforcement Case Report Data Dictionary | ECHO | US ...

    EPA Pesticide Factsheets

    ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.

  1. ICIS-NPDES DMR Summary and Data Element Dictionary ...

    EPA Pesticide Factsheets

    ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.

  2. French as a Second Language. Annotated Bibliography of Learning Resources: Beginning Level. Early Childhood Services - Grade 12.

    ERIC Educational Resources Information Center

    Alberta Dept. of Education, Edmonton. Language Services Branch.

    This annotated bibliography of instructional resources for Alberta (Canada) introductory French second language teaching in early childhood, elementary, and secondary education consists of citations in 10 categories: audio/video recordings; communicative activity resources (primarily texts and workbooks); dictionaries and vocabulary handbooks;…

  3. Marks, Spaces and Boundaries: Punctuation (and Other Effects) in the Typography of Dictionaries

    ERIC Educational Resources Information Center

    Luna, Paul

    2011-01-01

    Dictionary compilers and designers use punctuation to structure and clarify entries and to encode information. Dictionaries with a relatively simple structure can have simple typography and simple punctuation; as dictionaries grew more complex, and encountered the space constraints of the printed page, complex encoding systems were developed,…

  4. An Electronic Dictionary and Translation System for Murrinh-Patha

    ERIC Educational Resources Information Center

    Seiss, Melanie; Nordlinger, Rachel

    2012-01-01

    This paper presents an electronic dictionary and translation system for the Australian language Murrinh-Patha. Its complex verbal structure makes learning Murrinh-Patha very difficult. Designing learning materials or a dictionary that is easy to understand and to use also presents a challenge. This paper discusses some of the difficulties posed by…

  5. A dictionary server for supplying context sensitive medical knowledge.

    PubMed

    Ruan, W; Bürkle, T; Dudeck, J

    2000-01-01

    The Giessen Data Dictionary Server (GDDS), developed at Giessen University Hospital, integrates clinical systems with on-line, context sensitive medical knowledge to help with making medical decisions. By "context" we mean the clinical information that is being presented at the moment the information need is occurring. The dictionary server makes use of a semantic network supported by a medical data dictionary to link terms from clinical applications to their proper information sources. It has been designed to analyze the network structure itself instead of knowing the layout of the semantic net in advance. This enables us to map appropriate information sources to various clinical applications, such as nursing documentation, drug prescription and cancer follow up systems. This paper describes the function of the dictionary server and shows how the knowledge stored in the semantic network is used in the dictionary service.
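
    Analyzing the network structure at query time, rather than relying on a known layout, can be illustrated with a breadth-first traversal. The node names, relation labels, and the `source:` prefix convention below are invented for this sketch and do not describe the GDDS data model.

```python
from collections import deque

# Hypothetical semantic network: node -> list of (relation, neighbor).
NETWORK = {
    "app:drug_prescription/amoxicillin": [
        ("maps_to", "concept:amoxicillin"),
    ],
    "concept:amoxicillin": [
        ("is_a", "concept:penicillins"),
        ("described_by", "source:drug_monograph_amoxicillin"),
    ],
    "concept:penicillins": [
        ("described_by", "source:allergy_guideline_penicillins"),
    ],
}


def find_sources(start, network=NETWORK):
    """Breadth-first search from an application term to reachable sources.

    Nearer sources come first; no assumption is made about the net's depth.
    """
    seen = {start}
    queue = deque([start])
    sources = []
    while queue:
        node = queue.popleft()
        for _relation, neighbor in network.get(node, []):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if neighbor.startswith("source:"):
                sources.append(neighbor)   # an information source: collect it
            else:
                queue.append(neighbor)     # a concept: keep exploring
    return sources
```

    Because the traversal discovers paths as it goes, new concepts or sources added to the network become reachable without changing the calling application, which matches the motivation given in the abstract.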

  6. Evaluation of Controlled Vocabulary Resources for Development of a Consumer Entry Vocabulary for Diabetes

    PubMed Central

    Monga, Harpreet K; Sievert, MaryEllen C; Hall, Joan Houston; Longo, Daniel R

    2001-01-01

    Background Digital information technology can facilitate informed decision making by individuals regarding their personal health care. The digital divide separates those who do and those who do not have access to or otherwise make use of digital information. To close the digital divide, health care communications research must address a fundamental issue, the consumer vocabulary problem: consumers of health care, at least those who are laypersons, are not always familiar with the professional vocabulary and concepts used by providers of health care and by providers of health care information, and, conversely, health care and health care information providers are not always familiar with the vocabulary and concepts used by consumers. One way to address this problem is to develop a consumer entry vocabulary for health care communications. Objectives To evaluate the potential of controlled vocabulary resources for supporting the development of consumer entry vocabulary for diabetes. Methods We used folk medical terms from the Dictionary of American Regional English project to create extended versions of 3 controlled vocabulary resources: the Unified Medical Language System Metathesaurus, the Eurodicautom of the European Commission's Translation Service, and the European Commission Glossary of popular and technical medical terms. We extracted consumer terms from consumer-authored materials, and physician terms from physician-authored materials. We used our extended versions of the vocabulary resources to link diabetes-related terms used by health care consumers to synonymous, nearly-synonymous, or closely-related terms used by family physicians. We also examined whether retrieval of diabetes-related World Wide Web information sites maintained by nonprofit health care professional organizations, academic organizations, or governmental organizations can be improved by substituting a physician term for its related consumer term in the query. 
Results The Dictionary of American Regional English extension of the Metathesaurus provided coverage, either direct or indirect, of approximately 23% of the natural language consumer-term-physician-term pairs. The Dictionary of American Regional English extension of the Eurodicautom provided coverage for 16% of the term pairs. Both the Metathesaurus and the Eurodicautom indirectly related more terms than they directly related. A high percentage of covered term pairs, with more indirectly covered pairs than directly covered pairs, might be one way to make the most out of expensive controlled vocabulary resources. We compared retrieval of diabetes-related Web information sites using the physician terms to retrieval using related consumer terms. We based the comparison on retrieval of sites maintained by non-profit healthcare professional organizations, academic organizations, or governmental organizations. The number of such sites in the first 20 results from a search was increased by substituting a physician term for its related consumer term in the query. This suggests that the Dictionary of American Regional English extensions of the Metathesaurus and Eurodicautom may be used to provide useful links from natural language consumer terms to natural language physician terms. Conclusions The Dictionary of American Regional English extensions of the Metathesaurus and Eurodicautom should be investigated further for support of consumer entry vocabulary for diabetes. PMID:11720966

  7. Grammar Coding in the "Oxford Advanced Learner's Dictionary of Current English."

    ERIC Educational Resources Information Center

    Wekker, Herman

    1992-01-01

    Focuses on the revised system of grammar coding for verbs in the fourth edition of the "Oxford Advanced Learner's Dictionary of Current English" (OALD4), comparing it with two other similar dictionaries. It is shown that the OALD4 is found to be more favorable on many criteria than the other comparable dictionaries. (16 references) (VWL)

  8. Graded Lexicons: New Resources for Educational Purposes and Much More

    ERIC Educational Resources Information Center

    Gala, Núria; Billami, Mokhtar B.; François, Thomas; Bernhard, Delphine

    2015-01-01

    Computational tools and resources play an important role for vocabulary acquisition. Although a large variety of dictionaries and learning games are available, few resources provide information about the complexity of a word, either for learning or for comprehension. The idea here is to use frequency counts combined with intralexical variables to…

  9. UMass at TREC 2002: Cross Language and Novelty Tracks

    DTIC Science & Technology

    2002-01-01

    resources – stemmers, dictionaries, machine translation, and an acronym database. We found that proper names were extremely important in this year’s queries...data by manually annotating 48 additional topics. 1. Cross Language Track We submitted one monolingual run and four cross-language runs. For the... monolingual run, the technology was essentially the same as the system we used for TREC 2001. For the cross-language run, we integrated some new

  10. ESP Students' Views on Online Language Resources for L2 Text Production Purposes

    ERIC Educational Resources Information Center

    Kozlova, Inna; Presas, Marisa

    2013-01-01

    The use of online language resources for L2 text production purposes is a recent phenomenon and has not yet been studied in depth. Increasing availability of new online resources seems to be changing the very nature of L2 text production. The traditional dictionary, hitherto a default resource to help with language doubts, is being left behind…

  11. A dictionary server for supplying context sensitive medical knowledge.

    PubMed Central

    Ruan, W.; Bürkle, T.; Dudeck, J.

    2000-01-01

    The Giessen Data Dictionary Server (GDDS), developed at Giessen University Hospital, integrates clinical systems with on-line, context sensitive medical knowledge to help with making medical decisions. By "context" we mean the clinical information that is being presented at the moment the information need is occurring. The dictionary server makes use of a semantic network supported by a medical data dictionary to link terms from clinical applications to their proper information sources. It has been designed to analyze the network structure itself instead of knowing the layout of the semantic net in advance. This enables us to map appropriate information sources to various clinical applications, such as nursing documentation, drug prescription and cancer follow up systems. This paper describes the function of the dictionary server and shows how the knowledge stored in the semantic network is used in the dictionary service. PMID:11079978

  12. A Participatory Research Approach to develop an Arabic Symbol Dictionary.

    PubMed

    Draffan, E A; Kadous, Amatullah; Idris, Amal; Banes, David; Zeinoun, Nadine; Wald, Mike; Halabi, Nawar

    2015-01-01

    The purpose of the Arabic Symbol Dictionary research discussed in this paper is to provide a resource of culturally, environmentally and linguistically suitable symbols to aid communication and literacy skills. A participatory approach with the use of online social media and a bespoke symbol management system has been established to enhance the process of matching a user-based Arabic and English core vocabulary with appropriate imagery. Participants, including AAC users, their families, carers, teachers and therapists, have been involved in the research from the outset, collating the vocabularies, debating cultural nuances for symbols and critiquing the design of technologies for selection procedures. The positive reaction of those who have voted on the symbols, with requests for early use, has justified the iterative nature of the methodologies used for this part of the project. However, constant re-evaluation will be necessary, and in-depth analysis of all the data received has yet to be completed.

  13. Integration of a knowledge-based system and a clinical documentation system via a data dictionary.

    PubMed

    Eich, H P; Ohmann, C; Keim, E; Lang, K

    1997-01-01

    This paper describes the design and realisation of a knowledge-based system and a clinical documentation system linked via a data dictionary. The software was developed as a shell with object oriented methods and C++ for IBM-compatible PC's and WINDOWS 3.1/95. The data dictionary covers terminology and document objects with relations to external classifications. It controls the terminology in the documentation program with form-based entry of clinical documents and in the knowledge-based system with scores and rules. The software was applied to the clinical field of acute abdominal pain by implementing a data dictionary with 580 terminology objects, 501 document objects, and 2136 links; a documentation module with 8 clinical documents and a knowledge-based system with 10 scores and 7 sets of rules.

  14. "It's Just Reflex Now": German Language Learners' Use of Online Resources

    ERIC Educational Resources Information Center

    Larson-Guenette, Julie

    2013-01-01

    This study examined how often and to what extent university learners of German use online resources (e.g., online dictionaries and translators) in relation to German coursework, their motivations for use, and their beliefs about online resources and language learning. Data for this study consisted of open-ended surveys ("n" = 71) and face-to-face…

  15. Resources for Topics in Architecture.

    ERIC Educational Resources Information Center

    Van Noate, Judith, Comp.

    This guide for conducting library research on topics in architecture or on the work of a particular architect presents suggestions for utilizing four categories of resources: books, dictionaries and encyclopedias, indexes, and a periodicals and series list (PASL). Two topics are researched as examples: the contemporary architect Richard Meier, and…

  16. Implementation and management of a biomedical observation dictionary in a large healthcare information system.

    PubMed

    Vandenbussche, Pierre-Yves; Cormont, Sylvie; André, Christophe; Daniel, Christel; Delahousse, Jean; Charlet, Jean; Lepage, Eric

    2013-01-01

    This study shows the evolution of a biomedical observation dictionary within the Assistance Publique Hôpitaux Paris (AP-HP), the largest European university hospital group. The different steps are detailed as follows: the dictionary creation, the mapping to logical observation identifier names and codes (LOINC), the integration into a multiterminological management platform and, finally, the implementation in the health information system. AP-HP decided to create a biomedical observation dictionary named AnaBio, to map it to LOINC and to maintain the mapping. A management platform based on methods used for knowledge engineering has been put in place. It aims at integrating AnaBio within the health information system and improving both the quality and stability of the dictionary. This new management platform is now active in AP-HP. The AnaBio dictionary is shared by 120 laboratories and currently includes 50 000 codes. The mapping implementation to LOINC reaches 40% of the AnaBio entries and uses 26% of LOINC records. The results of our work validate the choice made to develop a local dictionary aligned with LOINC. This work constitutes a first step towards a wider use of the platform. The next step will support the entire biomedical production chain, from the clinician prescription, through laboratory tests tracking in the laboratory information system to the communication of results and the use for decision support and biomedical research. In addition, the increase in the mapping implementation to LOINC ensures the interoperability allowing communication with other international health institutions.

  17. Data dictionary and formatting standard for dissemination of geotechnical data

    USGS Publications Warehouse

    Benoit, J.; Bobbitt, J.I.; Ponti, D.J.; Shimel, S.A.

    2004-01-01

    A pilot system for archiving and web dissemination of geotechnical data collected and stored by various agencies is currently under development. Part of the scope of this project, sponsored by the Consortium of Organizations for Strong-Motion Observation Systems (COSMOS) and by the Pacific Earthquake Engineering Research Center (PEER) Lifelines Program, is the development of a data dictionary and formatting standard. This paper presents the data model along with the basic structure of the data dictionary tables for this pilot system.

  18. Part 6: The Literature of Inorganic Chemistry, Revised.

    ERIC Educational Resources Information Center

    Douville, Judith A.

    2002-01-01

    Presents a list of resources on inorganic chemistry that includes general surveys, nomenclature, dictionaries, handbooks, compilations, and treatises. Selected for use by academic and student chemists. (DDR)

  19. A practical implementation for a data dictionary in an environment of diverse data sets

    USGS Publications Warehouse

    Sprenger, Karla K.; Larsen, Dana M.

    1993-01-01

    The need for a data dictionary database at the U.S. Geological Survey's EROS Data Center (EDC) was reinforced with the Earth Observing System Data and Information System (EOSDIS) requirement for consistent field definitions of data sets residing at more than one archive center. The EDC requirement addresses the existence of multiple sets with identical field definitions using various naming conventions. The EDC is developing a data dictionary database to accomplish the following goals: to standardize field names for ease in software development; to facilitate querying and updating of the data; and to generate ad hoc reports. The structure of the EDC electronic data dictionary database supports different metadata systems as well as many different data sets. A series of reports is used to keep consistency among data sets and various metadata systems.

  20. An Analysis of Data Dictionaries and Their Role in Information Resource Management.

    DTIC Science & Technology

    1984-09-01

    management system (DBMS). It manages data by utilizing software routines built into the data dictionary package and thus is not dependent on DBMS software ...described as having active or passive interfaces or a combination of the two. An interface is a series of commands which connect the data...carefully conceived examples in the dictionary’s reference manuals. A hierarchy of menus can reduce complex operations to a series of smaller

  1. Emo, love and god: making sense of Urban Dictionary, a crowd-sourced online dictionary.

    PubMed

    Nguyen, Dong; McGillivray, Barbara; Yasseri, Taha

    2018-05-01

    The Internet facilitates large-scale collaborative projects and the emergence of Web 2.0 platforms, where producers and consumers of content unify, has drastically changed the information market. On the one hand, the promise of the 'wisdom of the crowd' has inspired successful projects such as Wikipedia, which has become the primary source of crowd-based information in many languages. On the other hand, the decentralized and often unmonitored environment of such projects may make them susceptible to low-quality content. In this work, we focus on Urban Dictionary, a crowd-sourced online dictionary. We combine computational methods with qualitative annotation and shed light on the overall features of Urban Dictionary in terms of growth, coverage and types of content. We measure a high presence of opinion-focused entries, as opposed to the meaning-focused entries that we expect from traditional dictionaries. Furthermore, Urban Dictionary covers many informal, unfamiliar words as well as proper nouns. Urban Dictionary also contains offensive content, but highly offensive content tends to receive lower scores through the dictionary's voting system. The low threshold to include new material in Urban Dictionary enables quick recording of new words and new meanings, but the resulting heterogeneous content can pose challenges in using Urban Dictionary as a source to study language innovation.

  2. Chinese-English Aviation and Space Dictionary.

    ERIC Educational Resources Information Center

    Air Force Systems Command, Wright-Patterson AFB, OH. Foreign Technology Div.

    The Aviation and Space Dictionary is the second of a series of Chinese-English technical dictionaries under preparation by the Foreign Technology Division, United States Air Force Systems Command. The purpose of the series is to provide rapid reference tools for translators, abstracters, and research analysts concerned with scientific and…

  3. Chinese-Cantonese Dictionary of Common Chinese-Cantonese Characters.

    ERIC Educational Resources Information Center

    Defense Language Inst., Washington, DC.

    This dictionary contains 1,500 Chinese-Cantonese characters (selected from three frequency lists), and more than 6,000 Chinese-Cantonese terms (selected from three Cantonese-English dictionaries). The characters are arranged alphabetically according to the U.S. Army Language School System of Romanization, which is described in the…

  4. Chinese-English Electronics and Telecommunications Dictionary, Vol. 2.

    ERIC Educational Resources Information Center

    Air Force Systems Command, Wright-Patterson AFB, OH. Foreign Technology Div.

    This is the second volume of the Electronics and Telecommunications Dictionary, the third of the series of Chinese-English technical dictionaries under preparation by the Foreign Technology Division, United States Air Force Systems Command. The purpose of the series is to provide rapid reference tools for translators, abstracters, and research…

  5. Chinese-English Electronics and Telecommunications Dictionary. Vol. 1.

    ERIC Educational Resources Information Center

    Air Force Systems Command, Wright-Patterson AFB, OH. Foreign Technology Div.

    This is the first volume of the Electronics and Telecommunications Dictionary, the third of the series of Chinese-English technical dictionaries under preparation by the Foreign Technology Division, United States Air Force Systems Command. The purpose of the series is to provide rapid reference tools for translators, abstracters, and research…

  6. Developing a National-Level Concept Dictionary for EHR Implementations in Kenya.

    PubMed

    Keny, Aggrey; Wanyee, Steven; Kwaro, Daniel; Mulwa, Edwin; Were, Martin C

    2015-01-01

    The increasing adoption of Electronic Health Records (EHR) by developing countries comes with the need to develop common terminology standards to assure semantic interoperability. In Kenya, where the Ministry of Health has rolled out an EHR at 646 sites, several challenges have emerged including variable dictionaries across implementations, inability to easily share data across systems, lack of expertise in dictionary management, lack of central coordination and custody of a terminology service, inadequately defined policies and processes, insufficient infrastructure, among others. A Concept Working Group was constituted to address these challenges. The country settled on a common Kenya data dictionary, initially derived as a subset of the Columbia International eHealth Laboratory (CIEL)/Millennium Villages Project (MVP) dictionary. The initial dictionary scope largely focuses on clinical needs. Processes and policies around dictionary management are being guided by the framework developed by Bakhshi-Raiez et al. Technical and infrastructure-based approaches are also underway to streamline workflow for dictionary management and distribution across implementations. Kenya's approach on comprehensive common dictionary can serve as a model for other countries in similar settings.

  7. Automatic de-identification of textual documents in the electronic health record: a review of recent research

    PubMed Central

    2010-01-01

    Background In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Internal Review Board to use data for research purposes, but these requirements can be waived if data is de-identified. For clinical data to be considered de-identified, the HIPAA "Safe Harbor" technique requires 18 data elements (called PHI: Protected Health Information) to be removed. The de-identification of narrative text documents is often realized manually, and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here. Methods This review focuses on recently published research (after 1995), and includes relevant publications from bibliographic queries in PubMed, conference proceedings, the ACM Digital Library, and interesting publications referenced in already included papers. Results The literature search returned more than 200 publications. The majority focused only on structured data de-identification instead of narrative text, on image de-identification, or described manual de-identification, and were therefore excluded. Finally, 18 publications describing automated text de-identification were selected for detailed analysis of the architecture and methods used, the types of PHI detected and removed, the external resources used, and the types of clinical documents targeted. All text de-identification systems aimed to identify and remove person names, and many included other types of PHI. Most systems used only one or two specific clinical document types, and were mostly based on two different groups of methodologies: pattern matching and machine learning. 
    Many systems combined both approaches for different types of PHI, but the majority relied only on pattern matching, rules, and dictionaries. Conclusions In general, methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize. Methods based on machine learning tend to perform better, especially with PHI that is not mentioned in the dictionaries used. Finally, the issues of anonymization, sufficient performance, and "over-scrubbing" are discussed in this publication. PMID:20678228

  8. Automatic de-identification of textual documents in the electronic health record: a review of recent research.

    PubMed

    Meystre, Stephane M; Friedlin, F Jeffrey; South, Brett R; Shen, Shuying; Samore, Matthew H

    2010-08-02

    In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Internal Review Board to use data for research purposes, but these requirements can be waived if data is de-identified. For clinical data to be considered de-identified, the HIPAA "Safe Harbor" technique requires 18 data elements (called PHI: Protected Health Information) to be removed. The de-identification of narrative text documents is often realized manually, and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here. This review focuses on recently published research (after 1995), and includes relevant publications from bibliographic queries in PubMed, conference proceedings, the ACM Digital Library, and interesting publications referenced in already included papers. The literature search returned more than 200 publications. The majority focused only on structured data de-identification instead of narrative text, on image de-identification, or described manual de-identification, and were therefore excluded. Finally, 18 publications describing automated text de-identification were selected for detailed analysis of the architecture and methods used, the types of PHI detected and removed, the external resources used, and the types of clinical documents targeted. All text de-identification systems aimed to identify and remove person names, and many included other types of PHI. Most systems used only one or two specific clinical document types, and were mostly based on two different groups of methodologies: pattern matching and machine learning. 
    Many systems combined both approaches for different types of PHI, but the majority relied only on pattern matching, rules, and dictionaries. In general, methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize. Methods based on machine learning tend to perform better, especially with PHI that is not mentioned in the dictionaries used. Finally, the issues of anonymization, sufficient performance, and "over-scrubbing" are discussed in this publication.

  9. Basic Aerospace Education Library

    ERIC Educational Resources Information Center

    Journal of Aerospace Education, 1975

    1975-01-01

    Lists the most significant resource items on aerospace education which are presently available. Includes source books, bibliographies, directories, encyclopedias, dictionaries, audiovisuals, curriculum/planning guides, aerospace statistics, aerospace education statistics and newsletters. (BR)

  10. Using Social Media Data to Identify Potential Candidates for Drug Repurposing: A Feasibility Study.

    PubMed

    Rastegar-Mojarad, Majid; Liu, Hongfang; Nambisan, Priya

    2016-06-16

    Drug repurposing (defined as discovering new indications for existing drugs) could play a significant role in drug development, especially considering the declining success rates of developing novel drugs. Typically, new indications for existing medications are identified by accident. However, new technologies and a large number of available resources enable the development of systematic approaches to identify and validate drug-repurposing candidates. Patients today report their experiences with medications on social media and reveal side effects as well as beneficial effects of those medications. Our aim was to assess the feasibility of using patient reviews from social media to identify potential candidates for drug repurposing. We retrieved patient reviews of 180 medications from an online forum, WebMD. Using dictionary-based and machine learning approaches, we identified disease names in the reviews. Several publicly available resources were used to exclude comments containing known indications and adverse drug effects. After manually reviewing some of the remaining comments, we implemented a rule-based system to identify beneficial effects. The dictionary-based system and machine learning system identified 2178 and 6171 disease names respectively in 64,616 patient comments. We provided a list of 10 common patterns that patients used to report any beneficial effects or uses of medication. After manually reviewing the comments tagged by our rule-based system, we identified five potential drug repurposing candidates. To our knowledge, this is the first study to consider using social media data to identify drug-repurposing candidates. We found that even a rule-based system, with a limited number of rules, could identify beneficial effect mentions in patient comments. Our preliminary study shows that social media has the potential to be used in drug repurposing.

  11. Implementation and management of a biomedical observation dictionary in a large healthcare information system

    PubMed Central

    Vandenbussche, Pierre-Yves; Cormont, Sylvie; André, Christophe; Daniel, Christel; Delahousse, Jean; Charlet, Jean; Lepage, Eric

    2013-01-01

    Objective This study shows the evolution of a biomedical observation dictionary within the Assistance Publique Hôpitaux Paris (AP-HP), the largest European university hospital group. The different steps are detailed as follows: the dictionary creation, the mapping to logical observation identifier names and codes (LOINC), the integration into a multiterminological management platform and, finally, the implementation in the health information system. Methods AP-HP decided to create a biomedical observation dictionary named AnaBio, to map it to LOINC and to maintain the mapping. A management platform based on methods used for knowledge engineering has been put in place. It aims at integrating AnaBio within the health information system and improving both the quality and stability of the dictionary. Results This new management platform is now active in AP-HP. The AnaBio dictionary is shared by 120 laboratories and currently includes 50 000 codes. The mapping implementation to LOINC reaches 40% of the AnaBio entries and uses 26% of LOINC records. The results of our work validate the choice made to develop a local dictionary aligned with LOINC. Discussion and Conclusions This work constitutes a first step towards a wider use of the platform. The next step will support the entire biomedical production chain, from the clinician prescription, through laboratory tests tracking in the laboratory information system to the communication of results and the use for decision support and biomedical research. In addition, the increase in the mapping implementation to LOINC ensures the interoperability allowing communication with other international health institutions. PMID:23635601

  12. Boosting drug named entity recognition using an aggregate classifier.

    PubMed

    Korkontzelos, Ioannis; Piliouras, Dimitrios; Dowsey, Andrew W; Ananiadou, Sophia

    2015-10-01

    Drug named entity recognition (NER) is a critical step for complex biomedical NLP tasks such as the extraction of pharmacogenomic, pharmacodynamic and pharmacokinetic parameters. Large quantities of high quality training data are almost always a prerequisite for employing supervised machine-learning techniques to achieve high classification performance. However, the human labour needed to produce and maintain such resources is a significant limitation. In this study, we improve the performance of drug NER without relying exclusively on manual annotations. We perform drug NER using either a small gold-standard corpus (120 abstracts) or no corpus at all. In our approach, we develop a voting system to combine a number of heterogeneous models, based on dictionary knowledge, gold-standard corpora and silver annotations, to enhance performance. To improve recall, we employed genetic programming to evolve 11 regular-expression patterns that capture common drug suffixes and used them as an extra means for recognition. Our approach uses a dictionary of drug names, i.e. DrugBank, a small manually annotated corpus, i.e. the pharmacokinetic corpus, and a part of the UKPMC database, as raw biomedical text. Gold-standard and silver annotated data are used to train maximum entropy and multinomial logistic regression classifiers. Aggregating drug NER methods, based on gold-standard annotations, dictionary knowledge and patterns, improved the performance on models trained on gold-standard annotations, only, achieving a maximum F-score of 95%. In addition, combining models trained on silver annotations, dictionary knowledge and patterns are shown to achieve comparable performance to models trained exclusively on gold-standard data. The main reason appears to be the morphological similarities shared among drug names. We conclude that gold-standard data are not a hard requirement for drug NER. 
    Combining heterogeneous models built on dictionary knowledge can achieve similar or comparable classification performance with that of the best performing model trained on gold-standard annotations. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
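
    The aggregation idea in this abstract (dictionary lookup, suffix patterns and a trained model voting on each token) can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in: the three-word dictionary replaces DrugBank, the hand-written suffix regex replaces the 11 evolved patterns, and `context_vote` is a toy rule standing in for the trained classifiers. It is not the authors' implementation.

```python
import re

# Hypothetical toy resources (assumptions, not from the paper).
DRUG_DICTIONARY = {"aspirin", "ibuprofen", "warfarin"}
SUFFIX_PATTERN = re.compile(r"\w+(?:cillin|statin|prazole|olol)$")
DOSAGE_CUES = {"mg", "dose", "tablet"}


def dictionary_vote(tokens, i):
    """Vote yes when the token is a known drug name."""
    return tokens[i].lower() in DRUG_DICTIONARY


def suffix_vote(tokens, i):
    """Vote yes when the token ends in a common drug suffix."""
    return SUFFIX_PATTERN.match(tokens[i].lower()) is not None


def context_vote(tokens, i):
    """Toy stand-in for a trained model: a number or dosage cue follows."""
    following = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
    return following in DOSAGE_CUES or following.isdigit()


def tag_drugs(tokens, min_votes=2):
    """Return the tokens that at least `min_votes` voters mark as drugs."""
    voters = (dictionary_vote, suffix_vote, context_vote)
    return [
        token
        for i, token in enumerate(tokens)
        if sum(voter(tokens, i) for voter in voters) >= min_votes
    ]
```

    With `min_votes=2`, a name missing from the dictionary can still be tagged when its suffix and its context agree, which is the recall benefit the abstract attributes to adding patterns.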

  13. National Hydrocephalus Foundation

    MedlinePlus

    ... Types of Seizures About the Foundation Mission, History & Philosophy of NHF Treatment of Hydrocephalus What is a Shunt? Treatment Third Ventriculostomy Shunt Malfunction Prognosis and Research Medical Dictionary Resources Success Stories Blessing in Disguise ...

  14. Organizing the present, looking to the future: an online knowledge repository to facilitate collaboration.

    PubMed

    Burchill, C; Roos, L L; Fergusson, P; Jebamani, L; Turner, K; Dueck, S

    2000-01-01

    Comprehensive data available in the Canadian province of Manitoba since 1970 have aided study of the interaction between population health, health care utilization, and structural features of the health care system. Given a complex linked database and many ongoing projects, better organization of available epidemiological, institutional, and technical information was needed. The Manitoba Centre for Health Policy and Evaluation wished to develop a knowledge repository to handle data, document research methods, and facilitate both internal communication and collaboration with other sites. This evolving knowledge repository consists of both public and internal (restricted access) pages on the World Wide Web (WWW). Information can be accessed using an indexed logical format or queried to allow entry at user-defined points. The main topics are: Concept Dictionary, Research Definitions, Meta-Index, and Glossary. The Concept Dictionary operationalizes concepts used in health research using administrative data, outlining the creation of complex variables. Research Definitions specify the codes for common surgical procedures, tests, and diagnoses. The Meta-Index organizes concepts and definitions according to the Medical Sub-Heading (MeSH) system developed by the National Library of Medicine. The Glossary facilitates navigation through the research terms and abbreviations in the knowledge repository. An Education Resources heading presents a web-based graduate course using substantial amounts of material in the Concept Dictionary, a lecture in the Epidemiology Supercourse, and material for Manitoba's Regional Health Authorities. Confidential information (including Data Dictionaries) is available on the Centre's internal website. Use of the public pages has increased dramatically since January 1998, with almost 6,000 page hits from 250 different hosts in May 1999.
    More recently, the number of page hits has averaged around 4,000 per month, while the number of unique hosts has climbed to around 400. This knowledge repository promotes standardization and increases efficiency by placing concepts and associated programming in the Centre's collective memory. Collaboration and project management are facilitated.

  15. Organizing the Present, Looking to the Future: An Online Knowledge Repository to Facilitate Collaboration

    PubMed Central

    Burchill, Charles; Fergusson, Patricia; Jebamani, Laurel; Turner, Ken; Dueck, Stephen

    2000-01-01

    Background Comprehensive data available in the Canadian province of Manitoba since 1970 have aided study of the interaction between population health, health care utilization, and structural features of the health care system. Given a complex linked database and many ongoing projects, better organization of available epidemiological, institutional, and technical information was needed. Objective The Manitoba Centre for Health Policy and Evaluation wished to develop a knowledge repository to handle data, document research methods, and facilitate both internal communication and collaboration with other sites. Methods This evolving knowledge repository consists of both public and internal (restricted access) pages on the World Wide Web (WWW). Information can be accessed using an indexed logical format or queried to allow entry at user-defined points. The main topics are: Concept Dictionary, Research Definitions, Meta-Index, and Glossary. The Concept Dictionary operationalizes concepts used in health research using administrative data, outlining the creation of complex variables. Research Definitions specify the codes for common surgical procedures, tests, and diagnoses. The Meta-Index organizes concepts and definitions according to the Medical Sub-Heading (MeSH) system developed by the National Library of Medicine. The Glossary facilitates navigation through the research terms and abbreviations in the knowledge repository. An Education Resources heading presents a web-based graduate course using substantial amounts of material in the Concept Dictionary, a lecture in the Epidemiology Supercourse, and material for Manitoba's Regional Health Authorities. Confidential information (including Data Dictionaries) is available on the Centre's internal website. Results Use of the public pages has increased dramatically since January 1998, with almost 6,000 page hits from 250 different hosts in May 1999. 
    More recently, the number of page hits has averaged around 4,000 per month, while the number of unique hosts has climbed to around 400. Conclusions This knowledge repository promotes standardization and increases efficiency by placing concepts and associated programming in the Centre's collective memory. Collaboration and project management are facilitated. PMID:11720929

  16. 76 FR 10055 - Changes to the Public Housing Assessment System (PHAS): Physical Condition Scoring Notice

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-23

    ... Weights and Criticality Levels, and Dictionary of Deficiency Definitions The Item Weights and Criticality Levels tables and the Dictionary of Deficiency Definitions, currently in use, were published as... Dictionary of Deficiency Definitions is found at http://www.hud.gov/offices/reac/pdf/pass_dict2.3.pdf . V...

  17. Automatic Dictionary Expansion Using Non-parallel Corpora

    NASA Astrophysics Data System (ADS)

    Rapp, Reinhard; Zock, Michael

    Automatically generating bilingual dictionaries from parallel, manually translated texts is a well established technique that works well in practice. However, parallel texts are a scarce resource. Therefore, it is desirable also to be able to generate dictionaries from pairs of comparable monolingual corpora. For most languages, such corpora are much easier to acquire, and often in considerably larger quantities. In this paper we present the implementation of an algorithm which exploits such corpora with good success. Based on the assumption that the co-occurrence patterns between different languages are related, it expands a small base lexicon. For improved performance, it also realizes a novel interlingua approach. That is, if corpora of more than two languages are available, the translations from one language to another can be determined not only directly, but also indirectly via a pivot language.
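
    The indirect route via a pivot language mentioned in this abstract can be illustrated with a small sketch. It covers only the lookup-and-combine step, not the co-occurrence algorithm itself, and the function name and the toy German-English-French lexicons are assumptions made for illustration.

```python
from collections import Counter


def translate_via_pivot(word, source_to_pivot, pivot_to_target):
    """Rank target-language candidates reached through pivot translations.

    Each pivot translation of `word` contributes one vote to every target
    word it maps to; candidates supported by more pivot words rank higher.
    """
    votes = Counter()
    for pivot_word in source_to_pivot.get(word, []):
        for target_word in pivot_to_target.get(pivot_word, []):
            votes[target_word] += 1
    return [target for target, _ in votes.most_common()]


# Hypothetical toy lexicons: German -> English (pivot) -> French.
german_to_english = {"Haus": ["house", "home"]}
english_to_french = {"house": ["maison"], "home": ["maison", "foyer"]}

candidates = translate_via_pivot("Haus", german_to_english, english_to_french)
```

    Here `candidates` places "maison" ahead of "foyer" because both English pivot words support it. In a fuller system such indirect evidence would presumably be combined with the direct, co-occurrence-based scores rather than used alone.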

  18. Emo, love and god: making sense of Urban Dictionary, a crowd-sourced online dictionary

    PubMed Central

    McGillivray, Barbara

    2018-01-01

    The Internet facilitates large-scale collaborative projects and the emergence of Web 2.0 platforms, where producers and consumers of content unify, has drastically changed the information market. On the one hand, the promise of the ‘wisdom of the crowd’ has inspired successful projects such as Wikipedia, which has become the primary source of crowd-based information in many languages. On the other hand, the decentralized and often unmonitored environment of such projects may make them susceptible to low-quality content. In this work, we focus on Urban Dictionary, a crowd-sourced online dictionary. We combine computational methods with qualitative annotation and shed light on the overall features of Urban Dictionary in terms of growth, coverage and types of content. We measure a high presence of opinion-focused entries, as opposed to the meaning-focused entries that we expect from traditional dictionaries. Furthermore, Urban Dictionary covers many informal, unfamiliar words as well as proper nouns. Urban Dictionary also contains offensive content, but highly offensive content tends to receive lower scores through the dictionary’s voting system. The low threshold to include new material in Urban Dictionary enables quick recording of new words and new meanings, but the resulting heterogeneous content can pose challenges in using Urban Dictionary as a source to study language innovation. PMID:29892417

  19. Revealing topics and their evolution in biomedical literature using Bio-DTM: a case study of ginseng.

    PubMed

    Chen, Qian; Ai, Ni; Liao, Jie; Shao, Xin; Liu, Yufeng; Fan, Xiaohui

    2017-01-01

    Valuable scientific results on biomedicine are very rich, but they are widely scattered in the literature. Topic modeling enables researchers to discover themes from an unstructured collection of documents without any prior annotations or labels. In this paper, taking ginseng as an example, biological dynamic topic model (Bio-DTM) was proposed to conduct a retrospective study and interpret the temporal evolution of the research of ginseng. The system of Bio-DTM mainly includes four components, documents pre-processing, bio-dictionary construction, dynamic topic models, topics analysis and visualization. Scientific articles pertaining to ginseng were retrieved through text mining from PubMed. The bio-dictionary integrates MedTerms medical dictionary, the second edition of side effect resource, a dictionary of biology and HGNC database of human gene names (HGNC). A dynamic topic model, a text mining technique, was used to emphasize on capturing the development trends of topics in a sequentially collected documents. Besides the contents of topics taken on, the evolution of topics was visualized over time using ThemeRiver. From the topic 9, ginseng was used in dietary supplements and complementary and integrative health practices, and became very popular since the early twentieth century. Topic 6 reminded that the planting of ginseng is a major area of research and symbiosis and allelopathy of ginseng became a research hotspot in 2007. In addition, the Bio-DTM model gave an insight into the main pharmacologic effects of ginseng, such as anti-metabolic disorder effect, cardioprotective effect, anti-cancer effect, hepatoprotective effect, anti-thrombotic effect and neuroprotective effect. The Bio-DTM model not only discovers what ginseng's research involving in but also displays how these topics evolving over time. This approach can be applied to the biomedical field to conduct a retrospective study and guide future studies.

  20. Compressed sampling and dictionary learning framework for wavelength-division-multiplexing-based distributed fiber sensing.

    PubMed

    Weiss, Christian; Zoubir, Abdelhak M

    2017-05-01

    We propose a compressed sampling and dictionary learning framework for fiber-optic sensing using wavelength-tunable lasers. A redundant dictionary is generated from a model for the reflected sensor signal. Imperfect prior knowledge is considered in terms of uncertain local and global parameters. To estimate a sparse representation and the dictionary parameters, we present an alternating minimization algorithm that is equipped with a preprocessing routine to handle dictionary coherence. The support of the obtained sparse signal indicates the reflection delays, which can be used to measure impairments along the sensing fiber. The performance is evaluated by simulations and experimental data for a fiber sensor system with common core architecture.

  1. Definition and maintenance of a telemetry database dictionary

    NASA Technical Reports Server (NTRS)

    Knopf, William P. (Inventor)

    2007-01-01

    A telemetry dictionary database includes a component for receiving spreadsheet workbooks of telemetry data over a web-based interface from other computer devices. Another component routes the spreadsheet workbooks to a specified directory on the host processing device. A process then checks the received spreadsheet workbooks for errors, and if no errors are detected the spreadsheet workbooks are routed to another directory to await initiation of a remote database loading process. The loading process first converts the spreadsheet workbooks to comma separated value (CSV) files. Next, a network connection with the computer system that hosts the telemetry dictionary database is established and the CSV files are ported to the computer system that hosts the telemetry dictionary database. This is followed by a remote initiation of a database loading program. Upon completion of loading a flatfile generation program is manually initiated to generate a flatfile to be used in a mission operations environment by the core ground system.

  2. Foreign Objects in the Rectum

    MedlinePlus

    ... Resources In This Article Medical Dictionary Also of Interest (Quiz) Anal Fissure (News) Could a Common Blood Thinner Lower Cancer Risk? (News) Study Untangles Disparity in Colon Cancer Survival Rates (News) Poor Prognosis for Diabetic Foot Sores (News) ...

  3. Space platform expendables resupply concept definition study. Volume 3: Work breakdown structure and work breakdown structure dictionary

    NASA Technical Reports Server (NTRS)

    1984-01-01

    The work breakdown structure (WBS) for the Space Platform Expendables Resupply Concept Definition Study is described. The WBS consists of a list of WBS elements, a dictionary of element definitions, and an element logic diagram. The list and logic diagram identify the interrelationships of the elements. The dictionary defines the types of work that may be represented by or be classified under each specific element. The Space Platform Expendable Resupply WBS was selected mainly to support the program planning, scheduling, and costing performed in the programmatics task (task 3). The WBS is neither a statement-of-work nor a work authorization document. Rather, it is a framework around which to define requirements, plan effort, assign responsibilities, allocate and control resources, and report progress, expenditures, technical performance, and schedule performance. The WBS element definitions are independent of make-or-buy decisions, organizational structure, and activity locations unless exceptions are specifically stated.

  4. Barriers to Information Transfer and Approaches Toward Their Reduction, Conference Proceedings of the Technical Information Panel Specialists’ Meeting Held in Washington, DC on 23-24 September 1987.

    DTIC Science & Technology

    1988-03-01

    oriented expansion of dictionaries and systems. - Portability. Included essential criteria for evaluation are: - Quality of the raw (also called...hard to be made without having precise criteria for the decision. Because the amount of data in computerized dictionaries - on the long line of...development of MT and CAT systems - is the decisive component, the update of the (electronic) dictionary plays a substantial part in both alternatives

  5. Fast group matching for MR fingerprinting reconstruction.

    PubMed

    Cauley, Stephen F; Setsompop, Kawin; Ma, Dan; Jiang, Yun; Ye, Huihui; Adalsteinsson, Elfar; Griswold, Mark A; Wald, Lawrence L

    2015-08-01

    MR fingerprinting (MRF) is a technique for quantitative tissue mapping using pseudorandom measurements. To estimate tissue properties such as T1, T2, proton density, and B0, the rapidly acquired data are compared against a large dictionary of Bloch simulations. This matching process can be a very computationally demanding portion of MRF reconstruction. We introduce a fast group matching algorithm (GRM) that exploits inherent correlation within MRF dictionaries to create highly clustered groupings of the elements. During matching, a group specific signature is first used to remove poor matching possibilities. Group principal component analysis (PCA) is used to evaluate all remaining tissue types. In vivo 3 Tesla brain data were used to validate the accuracy of our approach. For a trueFISP sequence with over 196,000 dictionary elements, 1000 MRF samples, and image matrix of 128 × 128, GRM was able to map MR parameters within 2 s using standard vendor computational resources. This is an order of magnitude faster than global PCA and nearly two orders of magnitude faster than direct matching, with comparable accuracy (1-2% relative error). The proposed GRM method is a highly efficient model reduction technique for MRF matching and should enable clinically relevant reconstruction accuracy and time on standard vendor computational resources. © 2014 Wiley Periodicals, Inc.
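    For orientation, the direct matching baseline that GRM is compared against can be sketched as an exhaustive search for the dictionary entry with the largest normalized inner product. This is a minimal illustration, not the GRM algorithm: the signals are real-valued toy vectors (measured MRF signals are complex), and the (T1, T2) keys and values are hypothetical.

```python
import math


def normalize(vector):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def direct_match(signal, dictionary):
    """Return the (parameters, score) of the best-correlated dictionary entry.

    `dictionary` maps tissue parameters to a simulated signal evolution.
    Every entry is compared, which is what makes direct matching costly.
    """
    unit_signal = normalize(signal)
    best_params, best_score = None, -1.0
    for params, atom in dictionary.items():
        score = abs(sum(a * s for a, s in zip(normalize(atom), unit_signal)))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical two-entry dictionary keyed by (T1 ms, T2 ms).
toy_dictionary = {
    (800, 60): [1.0, 0.5, 0.25],
    (1200, 90): [1.0, 0.8, 0.64],
}
params, score = direct_match([2.0, 1.6, 1.3], toy_dictionary)
```

    Because both vectors are normalized, the overall scale of the measured signal does not affect which entry is selected.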

  6. Computer Science and Technology: A Survey of Eleven Government-Developed Data Element Dictionary/Directory Systems.

    ERIC Educational Resources Information Center

    National Bureau of Standards (DOC), Washington, DC. Inst. for Computer Sciences and Technology.

    This report presents the current state of the art of government developed Data Element Dictionary/Directory (DED/D) systems. DED/D's are software tools used for managing and controlling information and data. The introduction of the report includes a list of the government agency systems surveyed and a summary matrix presenting each system's…

  7. Maximally Expressive Modeling of Operations Tasks

    NASA Technical Reports Server (NTRS)

    Jaap, John; Richardson, Lea; Davis, Elizabeth

    2002-01-01

    Planning and scheduling systems organize "tasks" into a timeline or schedule. The tasks are defined within the scheduling system in logical containers called models. The dictionary might define a model of this type as "a system of things and relations satisfying a set of rules that, when applied to the things and relations, produce certainty about the tasks that are being modeled." One challenging domain for a planning and scheduling system is the operation of on-board experiments for the International Space Station. In these experiments, the equipment used is among the most complex hardware ever developed, the information sought is at the cutting edge of scientific endeavor, and the procedures are intricate and exacting. Scheduling is made more difficult by a scarcity of station resources. The models to be fed into the scheduler must describe both the complexity of the experiments and procedures (to ensure a valid schedule) and the flexibilities of the procedures and the equipment (to effectively utilize available resources). Clearly, scheduling International Space Station experiment operations calls for a "maximally expressive" modeling schema.

  8. Image fusion using sparse overcomplete feature dictionaries

    DOEpatents

    Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt

    2015-10-06

    Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
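    The effect of the local maximum pooling step can be shown on a one-dimensional toy example. This is a sketch under simplifying assumptions (non-overlapping windows, a single feature channel, made-up activation values); the patent abstract does not specify these details.

```python
def max_pool(activations, window):
    """Keep the largest activation in each non-overlapping window."""
    return [
        max(activations[start:start + window])
        for start in range(0, len(activations), window)
    ]


# A sparse activation and the same activation shifted by one position.
original = [0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0]

pooled_original = max_pool(original, window=4)
pooled_shifted = max_pool(shifted, window=4)
```

    Both inputs pool to the same vector, which is the sense in which the pooled representation tolerates small translations; a shift that crosses a window boundary would still change the output.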

  9. An Extensible, User-Modifiable Framework for Planning Activities

    NASA Technical Reports Server (NTRS)

    Joshing, Joseph C.; Abramyan, Lucy; Mickelson, Megan C.; Wallick, Michael N.; Kurien, James A.; Crockett, Thomasa M.; Powell, Mark W.; Pyrzak, Guy; Aghevli, Arash

    2013-01-01

    This software provides a development framework that allows planning activities for the Mars Science Laboratory rover to be altered at any time, based on changes of the Activity Dictionary. The Activity Dictionary contains the definition of all activities that can be carried out by a particular asset (robotic or human). These definitions (and combinations of these definitions) are used by mission planners to give a daily plan of what a mission should do. During the development and course of the mission, the Activity Dictionary and actions that are going to be carried out will often be changed. Previously, such changes would require a change to the software and redeployment. Now, the Activity Dictionary authors are able to customize activity definitions, parameters, and resource usage without requiring redeployment. This software provides developers and end users the ability to modify the behavior of automatically generated activities using a script. This allows changes to the software behavior without incurring the burden of redeployment. This software is currently being used for the Mars Science Laboratory, and is in the process of being integrated into the LADEE (Lunar Atmosphere and Dust Environment Explorer) mission, as well as the International Space Station.

  10. Evaluation of a data dictionary system. [information dissemination and computer systems programs

    NASA Technical Reports Server (NTRS)

    Driggers, W. G.

    1975-01-01

    The usefulness of a data dictionary/directory system was investigated for achieving optimum benefits from existing and planned investments in computer data files in the Data Systems Development Branch and the Institutional Data Systems Division. Potential applications of the data catalogue system are discussed along with an evaluation of the system. Other topics discussed include data description, data structure, programming aids, programming languages, program networks, and test data.

  11. Navajo-English Dictionary.

    ERIC Educational Resources Information Center

    Wall, Leon; Morgan, William

    A brief summary of the sound system of the Navajo language introduces this Navajo-English dictionary. Diacritical markings and an English definition are given for each Navajo word. Words are listed alphabetically by Navajo sound. (VM)

  12. Study of Tools for Command and Telemetry Dictionaries

    NASA Technical Reports Server (NTRS)

    Pires, Craig; Knudson, Matthew D.

    2017-01-01

    The Command and Telemetry Dictionary is at the heart of space missions. The C&T Dictionary represents all of the information that is exchanged between the various systems both in space and on the ground. Large amounts of ever-changing information have to be disseminated to all of the various systems and sub-systems throughout all phases of the mission. The typical approach of having each sub-system manage its own information flow results in a patchwork of methods within a mission. This leads to significant duplication of effort and potential errors. More centralized methods have been developed to manage this data flow. This presentation will compare two tools that have been developed for this purpose, CCDD and SCIMI, which were designed to work with the Core Flight System (cFS).

  13. The Evaluation and Systems Analysis of the SYSTRAN Machine Translation System

    DTIC Science & Technology

    1977-01-01

    DiFondi (IRDT) ... Machine Translation Evaluation, Stem Dictionary Update, S...ntic Expression Dictionary Update ... This report is the product of contractual effort to...translated and then corrected by a bilingual expert in each field. Two types of corrections were considered implementable, stem dictionary update and

  14. Using Dictionary Pair Learning for Seizure Detection.

    PubMed

    Ma, Xin; Yu, Nana; Zhou, Weidong

    2018-02-13

    Automatic seizure detection is extremely important in the monitoring and diagnosis of epilepsy. The paper presents a novel method based on dictionary pair learning (DPL) for seizure detection in long-term intracranial electroencephalogram (EEG) recordings. First, wavelet filtering and differential filtering are applied to the EEG data, and a kernel function is used to make the signal linearly separable. In DPL, the synthesis dictionary and analysis dictionary are learned jointly from the original training samples with an alternating minimization method, and sparse coefficients are obtained by linear projection instead of costly $\ell_0$-norm or $\ell_1$-norm optimization. Finally, the reconstructed residuals associated with the seizure and nonseizure sub-dictionary pairs are calculated as the decision values, and postprocessing is performed to improve the recognition rate and reduce the false detection rate of the system. A total of 530 h from 20 patients with 81 seizures were used to evaluate the system. Our proposed method has achieved an average segment-based sensitivity of 93.39%, specificity of 98.51%, and event-based sensitivity of 96.36% with a false detection rate of 0.236/h.
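    The residual-based decision described above can be sketched as follows: each class has a synthesis dictionary D and an analysis dictionary P, the coefficients are the linear projection P x, and the class whose pair reconstructs x with the smallest residual wins. This is a minimal sketch with hand-picked toy matrices; dictionary learning, the filtering and kernel steps, and the postprocessing are all omitted, and the function names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def residual(x, synthesis, analysis):
    """Euclidean norm of x - D (P x) for one class-specific dictionary pair."""
    coefficients = matvec(analysis, x)               # linear projection P x
    reconstruction = matvec(synthesis, coefficients)  # D (P x)
    return sum((a - b) ** 2 for a, b in zip(x, reconstruction)) ** 0.5


def classify(x, pairs):
    """Pick the label whose dictionary pair gives the smallest residual."""
    return min(pairs, key=lambda label: residual(x, *pairs[label]))


# Toy pairs (not learned): each class reconstructs one coordinate only.
pairs = {
    "seizure": ([[1.0], [0.0]], [[1.0, 0.0]]),
    "nonseizure": ([[0.0], [1.0]], [[0.0, 1.0]]),
}
label = classify([3.0, 0.5], pairs)
```

    The sample [3.0, 0.5] lies mostly along the first coordinate, so the "seizure" pair leaves a residual of 0.5 against 3.0 for the other pair.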

  15. TUNS user guide supplement: Data dictionary

    NASA Technical Reports Server (NTRS)

    1988-01-01

    Provided is a data dictionary for the Technology Utilization Network System (TUNS) providing for each element name the long name, data type, data size, descriptive name and description, data of PRI clause, legal values, and location used.

  16. Occupations, U. S. A.

    ERIC Educational Resources Information Center

    Geneva Area City Schools, OH.

    The booklet divides job titles, selected from the Dictionary of Occupational Titles, into 15 career clusters: agribusiness and natural resources, business and office education, communication and media, construction, consumer and home economics, fine arts and humanities, health occupations, hospitality and recreation, manufacturing, marine science,…

  17. 76 FR 59533 - Mandatory Reporting of Greenhouse Gases: Petroleum and Natural Gas Systems: Revisions to Best...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-09-27

    ... final rule, but which might not necessarily be ``extreme'' in practice. The Merriam-Webster dictionary...'', ``ultimate'', ``outermost.'' According to the Merriam-Webster dictionary, the term ``unique'' can refer to...

  18. POLARIS: Helping Managers Get Answers Fast!

    NASA Technical Reports Server (NTRS)

    Corcoran, Patricia M.; Webster, Jeffery

    2007-01-01

    This viewgraph presentation reviews the Project Online Library and Resource Information System (POLARIS). It is a NASA-wide, web-based system providing access to information related to Program and Project Management. It will provide a one-stop shop for access to: a searchable, sortable database of all requirements for all product lines, project life cycle diagrams with reviews, project review definitions with products review information from NPR 7123.1, NASA Systems Engineering Processes and Requirements, templates and examples of products, project standard WBSs with dictionaries, and requirements for implementation and approval, information from NASA's Metadata Manager (MdM): Attributes of Missions, Themes, Programs & Projects, NPR 7120.5 waiver form and instructions and much more. The presentation reviews the plans and timelines for future revisions and modifications.

  19. A Selected Bibliography of Educational Sources.

    ERIC Educational Resources Information Center

    Campbell, Janet; And Others

    Focusing on materials available at the California State University at Long Beach Library, this annotated bibliography lists resources in seven subject categories pertaining to education: (1) guides to the professional educational literature; (2) books about education research methodology; (3) encyclopedias and dictionaries; (4) tests and…

  20. Using TEI for an Endangered Language Lexical Resource: The Nxaʔamxcín Database-Dictionary Project

    ERIC Educational Resources Information Center

    Czaykowska-Higgins, Ewa; Holmes, Martin D.; Kell, Sarah M.

    2014-01-01

    This paper describes the evolution of a lexical resource project for Nxaʔamxcín, an endangered Salish language, from the project's inception in the 1990s, based on legacy materials recorded in the 1960s and 1970s, to its current form as an online database that is transformable into various print and web-based formats for varying uses. We…

  1. Dictionary of Microscopy

    NASA Astrophysics Data System (ADS)

    Heath, Julian

    2005-10-01

    The past decade has seen huge advances in the application of microscopy in all areas of science. This welcome development in microscopy has been paralleled by an expansion of the vocabulary of technical terms used in microscopy: terms have been coined for new instruments and techniques and, as microscopes reach even higher resolution, the use of terms that relate to the optical and physical principles underpinning microscopy is now commonplace. The Dictionary of Microscopy was compiled to meet this challenge and provides concise definitions of over 2,500 terms used in the fields of light microscopy, electron microscopy, scanning probe microscopy, x-ray microscopy and related techniques. Written by Dr Julian P. Heath, Editor of Microscopy and Analysis, the dictionary is intended to provide easy navigation through the microscopy terminology and to be a first point of reference for definitions of new and established terms. The Dictionary of Microscopy is an essential, accessible resource for: students who are new to the field and are learning about microscopes; equipment purchasers who want an explanation of the terms used in manufacturers' literature; scientists who are considering using a new microscopical technique; experienced microscopists, as an aide mémoire or quick source of reference; and librarians, the press and marketing personnel who require definitions for technical reports.

  2. A feature dictionary supporting a multi-domain medical knowledge base.

    PubMed

    Naeymi-Rad, F

    1989-01-01

    Because different terminology is used by physicians of different specialties in different locations to refer to the same feature (signs, symptoms, test results), it is essential that our knowledge development tools provide a means to access a common pool of terms. This paper discusses the design of an online medical dictionary that provides a solution to this problem for developers of multi-domain knowledge bases for MEDAS (Medical Emergency Decision Assistance System). Our Feature Dictionary supports phrase equivalents for features, feature interactions, feature classifications, and translations to the binary features generated by the expert during knowledge creation. It is also used in the conversion of a domain knowledge to the database used by the MEDAS inference diagnostic sessions. The Feature Dictionary also provides capabilities for complex queries across multiple domains using the supported relations. The Feature Dictionary supports three methods for feature representation: (1) for binary features, (2) for continuous valued features, and (3) for derived features.

  3. Resources into Higher Education. Bibliographic Series No. 31.

    ERIC Educational Resources Information Center

    Ahrens, Joan

    Varied information sources on higher education held by the Arkansas University library are listed. The publications, which are organized by type of material include abstracts and indexes, sources for dissertations and theses, directories of educational personnel and institutions, dictionaries, encyclopedias, handbooks, statistical sources, and…

  4. CIDR

    Science.gov Websites

    Related Links & Resources Access and Applications Access Applications Example Applications Project Research and Related (R&R) forms and the SF424 (R&R) Application Guide. Access to the CIDR Program Guidelines Example Applications All applications must include a Data Dictionary of phenotypic measures to be

  5. Library Information Resource Book For Staff.

    ERIC Educational Resources Information Center

    Potts, Ken; And Others

    This guide is the Northern Illinois University (NIU) Libraries' quick reference tool for providing information about its collections, facilities, and services. The articles are arranged in an alphabetic, dictionary format with numerous cross-references, and highlight information on the following: administrative offices; company annual reports;…

  6. English for Specific Purposes. Information Guide 2.

    ERIC Educational Resources Information Center

    British Council, London (England). English-Teaching Information Centre.

    This bibliography of materials for teachers of English for specific purposes lists textbooks, technical readers, articles, resource books, reports, dictionaries, reference books, bibliographies, word frequency lists, catalogues of teaching aids, games and activities, current research in Britain, documents available in the archives of the English…

  7. Tracking state deployments of commercial vehicle information systems and networks : 1998 Washington State report

    DOT National Transportation Integrated Search

    1999-12-01

    Volume III of the Logical Architecture contract deliverable documents the Data Dictionary. This formatted version of the Teamwork model data dictionary is mechanically produced from the Teamwork CDIF (Case Data Interchange Format) output file. It is ...

  8. Satellite Power Systems (SPS) concept definition study. Volume 2, part 2: System engineering. [cost and programmatics

    NASA Technical Reports Server (NTRS)

    Hanley, G. M.

    1980-01-01

    The latest technical and programmatic developments are considered as well as expansions of the Rockwell SPS cost model covering each phase of the program through the year 2030. Comparative cost/economic analyses cover elements of the satellite, construction system, space transportation vehicles and operations, and the ground receiving station. System plans to define time phased costs and planning requirements that support major milestones through the year 2000. A special analysis is included on natural resources required to build the SPS reference configuration. An appendix contains the SPS Work Breakdown Structure and dictionary along with detail cost data sheet on each system and main element of the program. Over 200 line items address DDT&E, theoretical first unit, investment cost per satellite, and operations charges for replacement capital and normal operations and maintenance costs.

  9. Automatic vs. manual curation of a multi-source chemical dictionary: the impact on text mining.

    PubMed

    Hettne, Kristina M; Williams, Antony J; van Mulligen, Erik M; Kleinjans, Jos; Tkachenko, Valery; Kors, Jan A

    2010-03-23

    Previously, we developed a combined dictionary dubbed Chemlist for the identification of small molecules and drugs in text based on a number of publicly available databases and tested it on an annotated corpus. To achieve an acceptable recall and precision we used a number of automatic and semi-automatic processing steps together with disambiguation rules. However, it remained to be investigated which impact an extensive manual curation of a multi-source chemical dictionary would have on chemical term identification in text. ChemSpider is a chemical database that has undergone extensive manual curation aimed at establishing valid chemical name-to-structure relationships. We acquired the component of ChemSpider containing only manually curated names and synonyms. Rule-based term filtering, semi-automatic manual curation, and disambiguation rules were applied. We tested the dictionary from ChemSpider on an annotated corpus and compared the results with those for the Chemlist dictionary. The ChemSpider dictionary of ca. 80 k names was only one-third to one-quarter the size of Chemlist at around 300 k. The ChemSpider dictionary had a precision of 0.43 and a recall of 0.19 before the application of filtering and disambiguation and a precision of 0.87 and a recall of 0.19 after filtering and disambiguation. The Chemlist dictionary had a precision of 0.20 and a recall of 0.47 before the application of filtering and disambiguation and a precision of 0.67 and a recall of 0.40 after filtering and disambiguation. We conclude the following: (1) The ChemSpider dictionary achieved the best precision but the Chemlist dictionary had a higher recall and the best F-score; (2) Rule-based filtering and disambiguation is necessary to achieve a high precision for both the automatically generated and the manually curated dictionary. ChemSpider is available as a web service at http://www.chemspider.com/ and the Chemlist dictionary is freely available as an XML file in Simple Knowledge Organization System format on the web at http://www.biosemantics.org/chemlist.
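    The F-score comparison can be checked from the reported precision and recall values. The abstract does not say which F-measure was used; the sketch below assumes the balanced F1 score, the harmonic mean of precision and recall, and the variable names are illustrative only.

```python
def f1_score(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract.
reported = {
    ("ChemSpider", "before"): (0.43, 0.19),
    ("ChemSpider", "after"): (0.87, 0.19),
    ("Chemlist", "before"): (0.20, 0.47),
    ("Chemlist", "after"): (0.67, 0.40),
}

f1 = {key: f1_score(p, r) for key, (p, r) in reported.items()}
```

    Under this assumption, after filtering and disambiguation ChemSpider scores about 0.31 and Chemlist about 0.50, consistent with the conclusion that Chemlist has the best F-score despite its lower precision.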

  10. Automatic vs. manual curation of a multi-source chemical dictionary: the impact on text mining

    PubMed Central

    2010-01-01

    Background Previously, we developed a combined dictionary dubbed Chemlist for the identification of small molecules and drugs in text based on a number of publicly available databases and tested it on an annotated corpus. To achieve an acceptable recall and precision we used a number of automatic and semi-automatic processing steps together with disambiguation rules. However, it remained to be investigated what impact an extensive manual curation of a multi-source chemical dictionary would have on chemical term identification in text. ChemSpider is a chemical database that has undergone extensive manual curation aimed at establishing valid chemical name-to-structure relationships. Results We acquired the component of ChemSpider containing only manually curated names and synonyms. Rule-based term filtering, semi-automatic manual curation, and disambiguation rules were applied. We tested the dictionary from ChemSpider on an annotated corpus and compared the results with those for the Chemlist dictionary. The ChemSpider dictionary of ca. 80 k names was only one-third to one-quarter the size of Chemlist at around 300 k. The ChemSpider dictionary had a precision of 0.43 and a recall of 0.19 before the application of filtering and disambiguation and a precision of 0.87 and a recall of 0.19 after filtering and disambiguation. The Chemlist dictionary had a precision of 0.20 and a recall of 0.47 before the application of filtering and disambiguation and a precision of 0.67 and a recall of 0.40 after filtering and disambiguation. Conclusions We conclude the following: (1) The ChemSpider dictionary achieved the best precision but the Chemlist dictionary had a higher recall and the best F-score; (2) Rule-based filtering and disambiguation are necessary to achieve a high precision for both the automatically generated and the manually curated dictionary. ChemSpider is available as a web service at http://www.chemspider.com/ and the Chemlist dictionary is freely available as an XML file in Simple Knowledge Organization System format on the web at http://www.biosemantics.org/chemlist. PMID:20331846
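
    The F-score comparison in the conclusions can be checked from the precision and recall values quoted in the abstract. The sketch below assumes the balanced F-score (F1, the harmonic mean of precision and recall); the abstract does not say which F-score variant was used, and the function and variable names are illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) after filtering and disambiguation, as quoted in the abstract.
results = {
    "ChemSpider": (0.87, 0.19),
    "Chemlist": (0.67, 0.40),
}

# Assumption: F1 is the intended F-score.
scores = {name: f_score(p, r) for name, (p, r) in results.items()}
for name, score in scores.items():
    print(f"{name}: F1 = {score:.2f}")
```

    Under this assumption the values come out at roughly 0.31 for ChemSpider and 0.50 for Chemlist, which is consistent with the statement that Chemlist had the best F-score despite its lower precision.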

  11. Unsupervised method for automatic construction of a disease dictionary from a large free text collection.

    PubMed

    Xu, Rong; Supekar, Kaustubh; Morgan, Alex; Das, Amar; Garber, Alan

    2008-11-06

    Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting contextual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35-88%) over available, manually created disease terminologies.

  12. Unsupervised Method for Automatic Construction of a Disease Dictionary from a Large Free Text Collection

    PubMed Central

    Xu, Rong; Supekar, Kaustubh; Morgan, Alex; Das, Amar; Garber, Alan

    2008-01-01

    Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting contextual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35–88%) over available, manually created disease terminologies. PMID:18999169

  13. Relational Database Design in Information Science Education.

    ERIC Educational Resources Information Center

    Brooks, Terrence A.

    1985-01-01

    Reports on database management system (dbms) applications designed by library school students for university community at University of Iowa. Three dbms design issues are examined: synthesis of relations, analysis of relations (normalization procedure), and data dictionary usage. Database planning prior to automation using data dictionary approach…

  14. Home Performance XML to Real Estate Standards Organization Data Dictionary Translator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2015-10-27

    This translator takes fields from the HPXML and translates them into RESO’s Data Dictionary, which is used in MLS systems for real estate transactions across the country. The purpose is to get energy efficiency data into the real estate transaction.

  15. Learning Essential Terms and Concepts in Statistics and Accounting

    ERIC Educational Resources Information Center

    Peters, Pam; Smith, Adam; Middledorp, Jenny; Karpin, Anne; Sin, Samantha; Kilgore, Alan

    2014-01-01

    This paper describes a terminological approach to the teaching and learning of fundamental concepts in foundation tertiary units in Statistics and Accounting, using an online dictionary-style resource (TermFinder) with customised "termbanks" for each discipline. Designed for independent learning, the termbanks support inquiring students…

  16. DTU BCI speller: an SSVEP-based spelling system with dictionary support.

    PubMed

    Vilic, Adnan; Kjaer, Troels W; Thomsen, Carsten E; Puthusserypady, S; Sorensen, Helge B D

    2013-01-01

    In this paper, a new brain computer interface (BCI) speller, named DTU BCI speller, is introduced. It is based on the steady-state visual evoked potential (SSVEP) and features dictionary support. The system focuses on simplicity and user friendliness by using a single electrode for the signal acquisition and displays stimuli on a liquid crystal display (LCD). Nine healthy subjects participated in writing full sentences after a five-minute introduction to the system, and obtained an information transfer rate (ITR) of 21.94 ± 15.63 bits/min. The average number of characters written per minute (CPM) is 4.90 ± 3.84 with a best case of 8.74 CPM. All subjects reported systematically on different user friendliness measures, and the overall results indicated the potential of the DTU BCI Speller system. For subjects with high classification accuracies, the introduced dictionary approach greatly reduced the time it took to write full sentences.

  17. Creating a medical English-Swedish dictionary using interactive word alignment.

    PubMed

    Nyström, Mikael; Merkel, Magnus; Ahrenberg, Lars; Zweigenbaum, Pierre; Petersson, Håkan; Ahlfeldt, Hans

    2006-10-12

    This paper reports on a parallel collection of rubrics from the medical terminology systems ICD-10, ICF, MeSH, NCSP and KSH97-P and its use for semi-automatic creation of an English-Swedish dictionary of medical terminology. The methods presented are relevant for many other West European language pairs than English-Swedish. The medical terminology systems were collected in electronic format in both English and Swedish and the rubrics were extracted in parallel language pairs. Initially, interactive word alignment was used to create training data from a sample. Then the training data were utilised in automatic word alignment in order to generate candidate term pairs. The last step was manual verification of the term pair candidates. A dictionary of 31,000 verified entries has been created in less than three man weeks, thus with considerably less time and effort needed compared to a manual approach, and without compromising quality. As a side effect of our work we found 40 different translation problems in the terminology systems and these results indicate the power of the method for finding inconsistencies in terminology translations. We also report on some factors that may contribute to making the process of dictionary creation with similar tools even more expedient. Finally, the contribution is discussed in relation to other ongoing efforts in constructing medical lexicons for non-English languages. In three man weeks we were able to produce a medical English-Swedish dictionary consisting of 31,000 entries and also found hidden translation errors in the utilized medical terminology systems.

  18. Creating a medical English-Swedish dictionary using interactive word alignment

    PubMed Central

    Nyström, Mikael; Merkel, Magnus; Ahrenberg, Lars; Zweigenbaum, Pierre; Petersson, Håkan; Åhlfeldt, Hans

    2006-01-01

    Background This paper reports on a parallel collection of rubrics from the medical terminology systems ICD-10, ICF, MeSH, NCSP and KSH97-P and its use for semi-automatic creation of an English-Swedish dictionary of medical terminology. The methods presented are relevant for many other West European language pairs than English-Swedish. Methods The medical terminology systems were collected in electronic format in both English and Swedish and the rubrics were extracted in parallel language pairs. Initially, interactive word alignment was used to create training data from a sample. Then the training data were utilised in automatic word alignment in order to generate candidate term pairs. The last step was manual verification of the term pair candidates. Results A dictionary of 31,000 verified entries has been created in less than three man weeks, thus with considerably less time and effort needed compared to a manual approach, and without compromising quality. As a side effect of our work we found 40 different translation problems in the terminology systems and these results indicate the power of the method for finding inconsistencies in terminology translations. We also report on some factors that may contribute to making the process of dictionary creation with similar tools even more expedient. Finally, the contribution is discussed in relation to other ongoing efforts in constructing medical lexicons for non-English languages. Conclusion In three man weeks we were able to produce a medical English-Swedish dictionary consisting of 31,000 entries and also found hidden translation errors in the utilized medical terminology systems. PMID:17034649

  19. The Language Grid: supporting intercultural collaboration

    NASA Astrophysics Data System (ADS)

    Ishida, T.

    2018-03-01

    A variety of language resources already exist online. Unfortunately, since many language resources have usage restrictions, it is virtually impossible for each user to negotiate with every language resource provider when combining several resources to achieve the intended purpose. To increase the accessibility and usability of language resources (dictionaries, parallel texts, part-of-speech taggers, machine translators, etc.), we proposed the Language Grid [1]; it wraps existing language resources as atomic services and enables users to create new services by combining the atomic services, and reduces the negotiation costs related to intellectual property rights [4]. Our slogan is “language services from language resources.” We believe that modularization with recombination is the key to creating a full range of customized language environments for various user communities.

  20. Expanding Academic Vocabulary with an Interactive On-Line Database

    ERIC Educational Resources Information Center

    Horst, Marlise; Cobb, Tom; Nicolae, Ioana

    2005-01-01

    University students used a set of existing and purpose-built on-line tools for vocabulary learning in an experimental ESL course. The resources included concordance, dictionary, cloze-builder, hypertext, and a database with interactive self-quizzing feature (all freely available at www.lextutor.ca). The vocabulary targeted for learning consisted…

  1. Thumbs Up: High-Quality, Low-Cost Teaching Aids.

    ERIC Educational Resources Information Center

    Paine, Carolyn

    1982-01-01

    Exemplary teaching aids--games, workbooks, student and teacher resource books, reading materials, and records--are recommended by subject area and grade level. Materials include an ice cream cone game for mathematics, a "Life Skills Reading" book on telephone usage, a "Dictionary of Recent American History," and many other items. (PP)

  2. ArtMARC Sourcebook: Cataloging Art, Architecture, and Their Visual Images.

    ERIC Educational Resources Information Center

    McRae, Linda, Ed.; White, Lynda S., Ed.

    Profiling the proven cataloging methods of experts from libraries, art galleries, museums, and other institutions, this sourcebook outlines cataloging techniques for a wide variety of resources from ancient artifacts to architectural drawings. A data dictionary of relevant MARC fields is also included, along with data conversion comments. A…

  3. Student Outcomes 2009: Data Dictionary. Support Document

    ERIC Educational Resources Information Center

    National Centre for Vocational Education Research (NCVER), 2009

    2009-01-01

    This document was produced as an added resource for the report "Outcomes from the Productivity Places Program, 2009." The study reported the outcomes for students who completed their vocational education and training (VET) under the Productivity Places Program (PPP) during 2008. This document presents an alphabetical arrangement of the…

  4. ARBA Guide to Biographical Resources 1986-1997.

    ERIC Educational Resources Information Center

    Wick, Robert L., Ed.; Mood, Terry Ann, Ed.

    This guide provides a representative selection of biographical dictionaries and related works useful to the reference and collection development processes in all types of libraries. Three criteria were used in selection: (1) each item included was published within the past 12 years; (2) each item has been included in American Reference Books…

  5. Hupa Natural Resources Dictionary.

    ERIC Educational Resources Information Center

    Bennett, Ruth, Ed.; And Others

    Created by children in grades 5-8 who were enrolled in a year-long Hupa language class, this computer-generated, bilingual book contains descriptions and illustrations of local animals, birds, and fish. The introduction explains that students worked on a Macintosh computer able to print the Unifon alphabet used in writing the Hupa language.…

  6. Resources for Teaching Word Identification.

    ERIC Educational Resources Information Center

    Schell, Leo M., Ed.; And Others

    Only materials specifically designed to teach one or more of the following word identification skills were included in this booklet: sight words, context clues, phonic analysis, structural analysis, and dictionary skills. Materials for grades one through six are stressed, although a few materials suitable for secondary school students are listed.…

  7. Linguistic Resource Creation for Research and Technology Development: A Recent Experiment

    DTIC Science & Technology

    2003-01-01

    inflectional morphology. For instance, Tzeltal (a Mayan language of Mexico), Swahili (east Africa), and Shuar (a Jivaroan language of Ecuador) all ... or they can come from dictionaries, as was the case for Tzeltal and Shuar. In the case of inflectionally rich languages, when the word forms are

  8. Holikachuk Noun Dictionary (Preliminary).

    ERIC Educational Resources Information Center

    Kari, James, Comp.; And Others

    This dictionary contains lists of nouns in the Holikachuk Athabaskan language as spoken by about twenty people, most of whom live in the village of Grayling, Alaska. The Holikachuk alphabet and sound system are presented. The nouns with English equivalents are listed according to the following categories: animals, fish, insects, birds, plants,…

  9. The Language of Biotechnology: A Dictionary of Terms.

    ERIC Educational Resources Information Center

    Walker, John M.; Cox, Michael

    This dictionary attempts to define routinely used specialized language in the various areas of biotechnology, and remain suitable for use by scientists involved in unrelated disciplines. Viewing biotechnology as the practical application of biological systems to the manufacturing and service industries, and to the management of the environment,…

  10. Advanced transportation system study: Manned launch vehicle concepts for two way transportation system payloads to LEO. Work breakdown structure and work breakdown structure dictionary

    NASA Technical Reports Server (NTRS)

    Duffy, James B.

    1992-01-01

    The report describes the work breakdown structure (WBS) and its associated WBS dictionary for task area 1 of contract NAS8-39207, advanced transportation system studies (ATSS). This WBS format is consistent with the preliminary design level of detail employed by both task area 1 and task area 4 in the ATSS study and is intended to provide an estimating structure for parametric cost estimates.

  11. Working More Productively: Tools for Administrative Data

    PubMed Central

    Roos, Leslie L; Soodeen, Ruth-Ann; Bond, Ruth; Burchill, Charles

    2003-01-01

    Objective This paper describes a web-based resource () that contains a series of tools for working with administrative data. This work in knowledge management represents an effort to document, find, and transfer concepts and techniques, both within the local research group and to a more broadly defined user community. Concepts and associated computer programs are made as “modular” as possible to facilitate easy transfer from one project to another. Study Setting/Data Sources Tools to work with a registry, longitudinal administrative data, and special files (survey and clinical) from the Province of Manitoba, Canada in the 1990–2003 period. Data Collection Literature review and analyses of web site utilization were used to generate the findings. Principal Findings The Internet-based Concept Dictionary and SAS macros developed in Manitoba are being used in a growing number of research centers. Nearly 32,000 hits from more than 10,200 hosts in a recent month demonstrate broad interest in the Concept Dictionary. Conclusions The tools, taken together, make up a knowledge repository and research production system that aid local work and have great potential internationally. Modular software provides considerable efficiency. The merging of documentation and researcher-to-researcher dissemination keeps costs manageable. PMID:14596394

  12. MD-CTS: An integrated terminology reference of clinical and translational medicine.

    PubMed

    Ray, Will; Finamore, Joe; Rastegar-Mojarad, Majid; Kadolph, Chris; Ye, Zhan; Bohne, Jacquie; Xu, Yin; Burish, Dan; Sondelski, Joshua; Easker, Melissa; Finnegan, Brian; Bartkowiak, Barbara; Smith, Catherine Arnott; Tachinardi, Umberto; Mendonca, Eneida A; Weichelt, Bryan; Lin, Simon M

    2016-01-01

    New vocabularies are rapidly evolving in the literature relative to the practice of clinical medicine and translational research. To provide integrated access to new terms, we developed a mobile and desktop online reference, the Marshfield Dictionary of Clinical and Translational Science (MD-CTS). It is the first public resource that comprehensively integrates Wiktionary (word definition), BioPortal (ontology), Wiki (image reference), and Medline abstract (word usage) information. MD-CTS is accessible at http://spellchecker.mfldclin.edu/. The website provides a broadened capacity for the wider clinical and translational science community to keep pace with newly emerging scientific vocabulary. An initial evaluation using 63 randomly selected biomedical words suggests that online references generally provided better coverage (73%-95%) than paper-based dictionaries (57%-71%).

  13. Booksearch: What Dictionary (General or Specialized) Do You Find Useful or Interesting for Students?

    ERIC Educational Resources Information Center

    English Journal, 1988

    1988-01-01

    Presents classroom teachers' recommendations for a variety of dictionaries that may heighten students' interest in language: a reverse dictionary, a visual dictionary, WEIGHTY WORD BOOK, a collegiate desk dictionary, OXFORD ENGLISH DICTIONARY, DICTIONARY OF AMERICAN REGIONAL ENGLISH, and a dictionary of idioms. (ARH)

  14. Data Element Dictionary: Finance. A Technical Report Concerning Finance Related Data Elements in the WICHE Management Information Systems Program. First Edition.

    ERIC Educational Resources Information Center

    Thomas, Charles R.

    This document is one of the 5 sections of the Data Element Dictionary developed as part of the WICHE Management Information Systems (MIS) Program. The elements in this section apply to both the current and historical data concerning finance. The purpose of the WICHE MIS Program is to make it possible to derive data which will be truly comparable…

  15. Kinship in Mongolian Sign Language

    ERIC Educational Resources Information Center

    Geer, Leah

    2011-01-01

    Information and research on Mongolian Sign Language is scant. To date, only one dictionary is available in the United States (Badnaa and Boll 1995), and even that dictionary presents only a subset of the signs employed in Mongolia. The present study describes the kinship system used in Mongolian Sign Language (MSL) based on data elicited from…

  16. Proposal of Network-Based Multilingual Space Dictionary Database System

    NASA Astrophysics Data System (ADS)

    Yoshimitsu, T.; Hashimoto, T.; Ninomiya, K.

    2002-01-01

    The International Academy of Astronautics (IAA) is now constructing a multilingual dictionary database system of space-friendly terms. The database consists of a lexicon and dictionaries of multiple languages. The lexicon is a table which relates corresponding terminology in different languages. Each language has a dictionary which contains terms and their definitions. The database is designed for use over the internet: terms and definitions are updated and searched via the network, and the database is maintained through international cooperation. New words arise day by day, so easy input of new words and their definitions is required for the long-term success of the system. The main key of the database is an English term which is approved at the table held once or twice with the working group members. Each language has at least one working group member who is responsible for assigning the corresponding term and its definition in his/her native language. Terms and their definitions can be input and updated via the internet from the office of each member, which may be located in his/her native country. The system is built on a freely distributed database server program running on the Linux operating system, which will be installed at the head office of IAA. Once it is installed, it will be open to all IAA members, who can search the terms via the internet. Currently the authors are constructing the prototype system which is described in this paper.
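
    The structure described in this abstract (a lexicon table keyed by an approved English term, plus one dictionary of terms and definitions per language) might be modelled roughly as follows. This is an illustrative sketch only: the class, method names, language codes, and sample entries are hypothetical, and the actual IAA system is a networked database server, not an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class SpaceDictionary:
    # Lexicon: approved English key term -> {language code: corresponding term}.
    lexicon: dict = field(default_factory=dict)
    # Per-language dictionaries: language code -> {term: definition}.
    dictionaries: dict = field(default_factory=dict)

    def add_entry(self, english_key, language, term, definition):
        """Register a term and its definition under the English main key."""
        self.lexicon.setdefault(english_key, {})[language] = term
        self.dictionaries.setdefault(language, {})[term] = definition

    def lookup(self, english_key, language):
        """Return (term, definition) in the given language, or None if absent."""
        term = self.lexicon.get(english_key, {}).get(language)
        if term is None:
            return None
        return term, self.dictionaries[language][term]


# Hypothetical sample entries, for illustration only.
db = SpaceDictionary()
db.add_entry("orbit", "en", "orbit", "Path of a body around another body.")
db.add_entry("orbit", "fr", "orbite", "Trajectoire d'un corps autour d'un autre.")

print(db.lookup("orbit", "fr"))
print(db.lookup("orbit", "ja"))  # no Japanese entry has been added yet
```

    Keeping the lexicon separate from the per-language dictionaries mirrors the abstract's description: a working group member for one language can add or revise a term and definition without touching the entries of any other language.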

  17. Applications of the generalized information processing system (GIPSY)

    USGS Publications Warehouse

    Moody, D.W.; Kays, Olaf

    1972-01-01

    The Generalized Information Processing System (GIPSY) stores and retrieves variable-field, variable-length records consisting of numeric data, textual data, or codes. A particularly noteworthy feature of GIPSY is its ability to search records for words, word stems, prefixes, and suffixes as well as for numeric values. Moreover, retrieved records may be printed on pre-defined formats or formatted as fixed-field, fixed-length records for direct input to other programs, which facilitates the exchange of data with other systems. At present there are some 22 applications of GIPSY falling in the general areas of bibliography, natural resources information, and management science. This report presents a description of each application including a sample input form, dictionary, and a typical formatted record. It is hoped that these examples will stimulate others to experiment with innovative uses of computer technology.
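
    The kind of retrieval described here (matching words, prefixes, and suffixes in text fields, and ranges in numeric fields, over records that do not all share the same fields) can be illustrated with a small sketch. It is not GIPSY's actual query language or storage format; the function names, field names, and sample records are hypothetical, and word-stem search is approximated here by prefix matching.

```python
import re


def tokens(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_text(record, field, pattern, mode="word"):
    """Test a text field for a whole word, a prefix, or a suffix."""
    words = tokens(str(record.get(field, "")))
    pattern = pattern.lower()
    if mode == "word":
        return pattern in words
    if mode == "prefix":
        return any(w.startswith(pattern) for w in words)
    if mode == "suffix":
        return any(w.endswith(pattern) for w in words)
    raise ValueError(f"unknown mode: {mode}")


def matches_number(record, field, low, high):
    """Test a numeric field against an inclusive range."""
    value = record.get(field)
    return isinstance(value, (int, float)) and low <= value <= high


# Variable-field records: not every record carries every field (hypothetical data).
records = [
    {"title": "Groundwater hydrology of the basin", "year": 1969},
    {"title": "Hydrologic data management", "year": 1971, "code": "MGT"},
    {"title": "Mineral resources bibliography"},
]

hydro = [r for r in records if matches_text(r, "title", "hydro", mode="prefix")]
recent = [r for r in records if matches_number(r, "year", 1970, 1972)]
print(len(hydro), len(recent))
```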

  18. From Afar to Zulu: A Dictionary of African Cultures.

    ERIC Educational Resources Information Center

    Haskins, Jim; Biondi, Joann

    This resource provides information on over 30 of Africa's most populous and well-known ethnic groups. The text concisely describes the history, traditions, environment, social structure, religion, and daily lifestyles of these diverse cultures. Each entry opens with a map outlining the area populated by the group and a list of key data regarding…

  19. Good Planets Are Hard To Find! Ecology Action Workbook and Dictionary.

    ERIC Educational Resources Information Center

    Bazar, Ronald M.; Dehr, Roma

    Even though the condition of the planet is a serious subject, becoming ecologically aware and active can be fun. This workbook provides ecologically conscious-raising activities that children can do at home or with others. A series of worksheets guides students through eight activities including: (1) assessing community resources and environmental…

  20. Spanish Sign in the Americas.

    ERIC Educational Resources Information Center

    Schein, Jerome D.

    1995-01-01

    Spanish Sign Language (SSL) is now the second most used sign language. This article introduces resources for the study of SSL, including three SSL dictionaries--two from Argentina and one from Puerto Rico. Differences in SSL between and within the two countries are noted. Implications for deaf educators in North America are drawn. (Author/DB)

  1. Aerospell Supplemental Spell Check File

    NASA Technical Reports Server (NTRS)

    2000-01-01

    Aerospell is a supplemental spell check file that can be used as a resource for researchers, writers, editors, students, and others who compose scientific and technical texts. The file extends the general spell check dictionaries of word processors by adding more than 13,000 words used in a broad range of aerospace and related disciplines.

  2. Learners' Dictionaries: State of the Art. Anthology Series 23.

    ERIC Educational Resources Information Center

    Tickoo, Makhan L., Ed.

    A collection of articles on dictionaries for advanced second language learners includes essays on the past, present, and future of learners' dictionaries; alternative dictionaries; dictionary construction; and dictionaries and their users. Titles include: "Idle Thoughts of an Idle Fellow; or Vaticinations on the Learners' Dictionary"…

  3. Noise-aware dictionary-learning-based sparse representation framework for detection and removal of single and combined noises from ECG signal

    PubMed Central

    Satija, Udit; Ramkumar, Barathram; Sabarimalai Manikandan, M.

    2017-01-01

    Automatic electrocardiogram (ECG) signal enhancement has become a crucial pre-processing step in most ECG signal analysis applications. In this Letter, the authors propose an automated noise-aware dictionary learning-based generalised ECG signal enhancement framework which can automatically learn the dictionaries based on the ECG noise type for effective representation of ECG signal and noises, and can reduce the computational load of sparse representation-based ECG enhancement system. The proposed framework consists of noise detection and identification, noise-aware dictionary learning, sparse signal decomposition and reconstruction. The noise detection and identification is performed based on the moving average filter, first-order difference, and temporal features such as number of turning points, maximum absolute amplitude, zero crossings, and autocorrelation features. The representation dictionary is learned based on the type of noise identified in the previous stage. The proposed framework is evaluated using noise-free and noisy ECG signals. Results demonstrate that the proposed method can significantly reduce computational load as compared with conventional dictionary learning-based ECG denoising approaches. Further, comparative results show that the method outperforms existing methods in automatically removing noises such as baseline wanders, power-line interference, muscle artefacts and their combinations without distorting the morphological content of local waves of ECG signal. PMID:28529758
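
    Several of the temporal features named in this abstract have simple textbook definitions, sketched below for a short sampled segment. This only illustrates the feature computations; the abstract does not give the authors' thresholds, window lengths, or the rules that map features to noise types, so none are assumed here. The function names and the sample segment are hypothetical, and autocorrelation features are omitted.

```python
def first_difference(x):
    """First-order difference: x[n+1] - x[n]."""
    return [b - a for a, b in zip(x, x[1:])]


def moving_average(x, width):
    """Simple moving average over a sliding window of the given width."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]


def turning_points(x):
    """Count samples where the slope changes sign (local maxima or minima)."""
    d = first_difference(x)
    return sum(1 for a, b in zip(d, d[1:]) if a * b < 0)


def zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def max_abs_amplitude(x):
    """Largest absolute sample value in the segment."""
    return max(abs(v) for v in x)


# Hypothetical segment, not real ECG data.
segment = [0.0, 1.0, -1.0, 2.0, 1.0, -0.5]
features = {
    "turning_points": turning_points(segment),
    "zero_crossings": zero_crossings(segment),
    "max_abs_amplitude": max_abs_amplitude(segment),
}
print(features)
```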

  4. Bearing fault diagnosis using a whale optimization algorithm-optimized orthogonal matching pursuit with a combined time-frequency atom dictionary

    NASA Astrophysics Data System (ADS)

    Zhang, Xin; Liu, Zhiwen; Miao, Qiang; Wang, Lei

    2018-07-01

    Condition monitoring and fault diagnosis of rolling element bearings are essential for guaranteeing the reliability and functionality of a mechanical system, production efficiency, and plant safety. However, this is almost invariably a formidable challenge because the fault features are often buried by strong background noises and other unstable interference components. To satisfactorily extract the bearing fault features, a whale optimization algorithm (WOA)-optimized orthogonal matching pursuit (OMP) with a combined time-frequency atom dictionary is proposed in this paper. Firstly, a combined time-frequency atom dictionary whose atom is a combination of Fourier dictionary atom and impact time-frequency dictionary atom is designed according to the properties of bearing fault vibration signal. Furthermore, to improve the efficiency and accuracy of signal sparse representation, the WOA is introduced into the OMP algorithm to optimize the atom parameters for best approximating the original signal with the dictionary atoms. The proposed method is validated through analyzing the bearing fault simulation signal and the real vibration signals collected from an experimental bearing and a wheelset bearing of high-speed trains. Comparisons with the state of the art in the field are presented in detail and highlight the advantages of the proposed method.
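
    For readers unfamiliar with orthogonal matching pursuit, the plain greedy algorithm can be sketched as follows. This is generic OMP over a small fixed dictionary, assuming unit-norm atoms; it does not include the whale optimization step or the combined time-frequency atoms proposed in the paper, and the function names and toy dictionary are illustrative only.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def omp(signal, atoms, n_nonzero, tol=1e-9):
    """Greedy OMP. Returns (indices of selected atoms, final residual)."""
    residual = list(signal)
    basis = []      # orthonormal basis spanning the selected atoms
    selected = []
    for _ in range(n_nonzero):
        if math.sqrt(dot(residual, residual)) <= tol:
            break
        candidates = [i for i in range(len(atoms)) if i not in selected]
        if not candidates:
            break
        # Pick the unselected atom most correlated with the current residual.
        best = max(candidates, key=lambda i: abs(dot(residual, atoms[i])))
        # Gram-Schmidt: orthogonalise the new atom against those already chosen.
        q = list(atoms[best])
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        norm = math.sqrt(dot(q, q))
        if norm <= tol:
            break  # atom is linearly dependent on the current selection
        q = [qi / norm for qi in q]
        basis.append(q)
        selected.append(best)
        # Remove the component along q, so the residual stays orthogonal
        # to every selected atom (the "orthogonal" part of OMP).
        c = dot(residual, q)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return selected, residual


# Toy dictionary of unit-norm atoms in R^3 (hypothetical).
s = 1 / math.sqrt(2)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
signal = [2.0, 0.0, 3.0]  # equals 2 * atoms[0] + 3 * atoms[2]

selected, residual = omp(signal, atoms, n_nonzero=2)
print(selected, residual)
```

    In this toy case the signal is an exact combination of two atoms, so two iterations recover both and leave a zero residual. With a parameterised dictionary such as the one in the paper, the atom search at each iteration is the expensive step, which is where an optimizer such as WOA would be substituted for the exhaustive search.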

  5. Noise-aware dictionary-learning-based sparse representation framework for detection and removal of single and combined noises from ECG signal.

    PubMed

    Satija, Udit; Ramkumar, Barathram; Sabarimalai Manikandan, M

    2017-02-01

    Automatic electrocardiogram (ECG) signal enhancement has become a crucial pre-processing step in most ECG signal analysis applications. In this Letter, the authors propose an automated noise-aware dictionary learning-based generalised ECG signal enhancement framework which can automatically learn the dictionaries based on the ECG noise type for effective representation of ECG signal and noises, and can reduce the computational load of sparse representation-based ECG enhancement system. The proposed framework consists of noise detection and identification, noise-aware dictionary learning, sparse signal decomposition and reconstruction. The noise detection and identification is performed based on the moving average filter, first-order difference, and temporal features such as number of turning points, maximum absolute amplitude, zero crossings, and autocorrelation features. The representation dictionary is learned based on the type of noise identified in the previous stage. The proposed framework is evaluated using noise-free and noisy ECG signals. Results demonstrate that the proposed method can significantly reduce computational load as compared with conventional dictionary learning-based ECG denoising approaches. Further, comparative results show that the method outperforms existing methods in automatically removing noises such as baseline wanders, power-line interference, muscle artefacts and their combinations without distorting the morphological content of local waves of ECG signal.

  6. Dictionaries: British and American. The Language Library.

    ERIC Educational Resources Information Center

    Hulbert, James Root

    An account of the dictionaries, great and small, of the English-speaking world is given in this book. Subjects covered include the origin of English dictionaries, early dictionaries, Noah Webster and his successors to the present, abridged dictionaries, "The Oxford English Dictionary" and later dictionaries patterned after it, the…

  7. The semantics of Chemical Markup Language (CML): dictionaries and conventions.

    PubMed

    Murray-Rust, Peter; Townsend, Joe A; Adams, Sam E; Phadungsukanan, Weerapong; Thomas, Jens

    2011-10-14

    The semantic architecture of CML consists of conventions, dictionaries and units. The conventions conform to a top-level specification and each convention can constrain compliant documents through machine-processing (validation). Dictionaries conform to a dictionary specification which also imposes machine validation on the dictionaries. Each dictionary can also be used to validate data in a CML document, and provide human-readable descriptions. An additional set of conventions and dictionaries are used to support scientific units. All conventions, dictionaries and dictionary elements are identifiable and addressable through unique URIs.

  8. The semantics of Chemical Markup Language (CML): dictionaries and conventions

    PubMed Central

    2011-01-01

    The semantic architecture of CML consists of conventions, dictionaries and units. The conventions conform to a top-level specification and each convention can constrain compliant documents through machine-processing (validation). Dictionaries conform to a dictionary specification which also imposes machine validation on the dictionaries. Each dictionary can also be used to validate data in a CML document, and provide human-readable descriptions. An additional set of conventions and dictionaries are used to support scientific units. All conventions, dictionaries and dictionary elements are identifiable and addressable through unique URIs. PMID:21999509

  9. Dual Language = Saad Ahaah Sinil. A Navajo-English Dictionary. Revised Edition.

    ERIC Educational Resources Information Center

    Austin, Martha, Ed.; Lynch, Regina, Ed.

    A dual-language Navajo-English dictionary provides a chart of the Navajo kinship system, a two-page map of the Navajo Nation, and English equivalents for Navajo words in 46 linguistic and cultural categories. Included are words for: races (Indian and other ethnic groups); Navajo clans; age groups; Navajo ceremonies; body parts; sickness; clothing;…

  10. Comprehensive human transcription factor binding site map for combinatory binding motifs discovery.

    PubMed

    Müller-Molina, Arnoldo J; Schöler, Hans R; Araúzo-Bravo, Marcos J

    2012-01-01

    To know the map between transcription factors (TFs) and their binding sites is essential to reverse engineer the regulation process. Only about 10%-20% of the transcription factor binding motifs (TFBMs) have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory "DNA words." From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%, far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of "DNA words," newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters.

  11. Comprehensive Human Transcription Factor Binding Site Map for Combinatory Binding Motifs Discovery

    PubMed Central

    Müller-Molina, Arnoldo J.; Schöler, Hans R.; Araúzo-Bravo, Marcos J.

    2012-01-01

    To know the map between transcription factors (TFs) and their binding sites is essential to reverse engineer the regulation process. Only about 10%–20% of the transcription factor binding motifs (TFBMs) have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory “DNA words.” From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%—far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of “DNA words,” newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters. PMID:23209563

  12. Mobile-Based Dictionary of Information and Communication Technology

    NASA Astrophysics Data System (ADS)

    Liando, O. E. S.; Mewengkang, A.; Kaseger, D.; Sangkop, F. I.; Rantung, V. P.; Rorimpandey, G. C.

    2018-02-01

    This study aims to design and build a mobile-based dictionary application for information and communication technology that provides access to a glossary of terms used in that field. The application was built on the Android platform with an SQLite database. The research uses the prototype development model, which covers the stages of Communication, Quick Plan, Quick Design Modeling, Construction of Prototype, Deployment Delivery & Feedback, and Full System Transformation. The application is designed to help users learn and understand new terms or vocabulary encountered in the field of information and communication technology. The resulting mobile-based dictionary can be an alternative to learning literature. In its simplest form, the application meets the need for a comprehensive and accurate dictionary of information and communication technology.
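
    As an illustration of the kind of SQLite-backed glossary lookup the abstract describes, the following is a minimal sketch in Python using the standard `sqlite3` module. The table name `glossary`, its columns, the helper functions and the sample entries are all assumptions made for this example; the abstract does not give the application's actual schema, and the original application targets Android rather than Python.

```python
import sqlite3


def build_glossary(entries):
    """Create an in-memory glossary table (hypothetical schema) and fill it."""
    conn = sqlite3.connect(":memory:")
    # COLLATE NOCASE makes comparisons on `term` case-insensitive for ASCII.
    conn.execute(
        "CREATE TABLE glossary ("
        "term TEXT PRIMARY KEY COLLATE NOCASE, "
        "definition TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO glossary (term, definition) VALUES (?, ?)", entries
    )
    conn.commit()
    return conn


def lookup(conn, term):
    """Return the definition of an exact term, or None if it is not listed."""
    row = conn.execute(
        "SELECT definition FROM glossary WHERE term = ?", (term,)
    ).fetchone()
    return row[0] if row else None


def search_prefix(conn, prefix):
    """Return all terms starting with `prefix`, in alphabetical order."""
    rows = conn.execute(
        "SELECT term FROM glossary WHERE term LIKE ? ORDER BY term",
        (prefix + "%",),
    )
    return [term for (term,) in rows]


# Sample entries are illustrative only.
SAMPLE_ENTRIES = [
    ("Bandwidth", "Maximum rate of data transfer across a link."),
    ("Bit", "The basic unit of information, with value 0 or 1."),
    ("Router", "A device that forwards packets between networks."),
]

conn = build_glossary(SAMPLE_ENTRIES)
router_definition = lookup(conn, "router")
b_terms = search_prefix(conn, "b")
```

    Parameterised queries (`?` placeholders) keep user-typed search terms out of the SQL text, which matters for any dictionary application that accepts free-text input.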

  13. Integrated Workforce Planning Model: A Proof of Concept

    NASA Technical Reports Server (NTRS)

    Guruvadoo, Eranna K.

    2001-01-01

    Recently, the Workforce and Diversity Management Office at KSC has launched a major initiative to develop and implement a competency/skill approach to Human Resource management. As the competency/skill dictionary is being elaborated, the need for a competency-based workforce-planning model is recognized. A proof of concept for such a model is presented using a multidimensional data model that can provide the data infrastructure necessary to drive intelligent decision support systems for workforce planning. The components of the competency-driven workforce-planning model are explained. The data model is presented, along with several schemes that would support the workforce-planning model. Some directions and recommendations for future work are given.

  14. [Comments on "A practical dictionary of Chinese medicine" by Wiseman].

    PubMed

    Lan, Feng-li

    2006-02-01

    At least 24 Chinese-English dictionaries of Chinese Medicine have been published in China during the recent 24 years (1984-2003). This thesis comments on "A Practical Dictionary of Chinese Medicine" by Wiseman, agreeing on its establishing principles, sources and formation methods of the English system of Chinese medical terminology, and pointing out the defect. The author holds that study on the origin and development of TCM terms, standardization of Chinese medical terms in different layers, i.e. Chinese medical in classic, in commonly used modern TCM terms, and integrative medical texts, are prerequisites to the standardization of English translation of Chinese medical terms.

  15. The ADAMS interactive interpreter

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rietscha, E.R.

    1990-12-17

    The ADAMS (Advanced DAta Management System) project is exploring next generation database technology. Database management does not follow the usual programming paradigm. Instead, the database dictionary provides an additional name space environment that should be interactively created and tested before writing application code. This document describes the implementation and operation of the ADAMS Interpreter, an interactive interface to the ADAMS data dictionary and runtime system. The Interpreter executes individual statements of the ADAMS Interface Language, providing a fast, interactive mechanism to define and access persistent databases. 5 refs.

  16. Natural Language Processing Systems Evaluation Workshop Held in Berkeley, California on 18 June 1991

    DTIC Science & Technology

    1991-12-01

    regarded as a fairly complete dictionary contains about 18,000 itemsw at solution to the domain-restricted task at translating present, and will be... dictionary access and so on, with an article. Unfortunately, the Weidner system did but as time goes on, one might imagine functionality not know that...superfast type. looped tht it A31l be built with taste by peo. writer ought to be possible in the monolingual case pie who understand languages and

  17. The Role of Dictionaries in Language Learning.

    ERIC Educational Resources Information Center

    White, Philip A.

    1997-01-01

    Examines assumptions about dictionaries, especially the bilingual dictionary, and suggests ways of integrating the monolingual dictionary into the second-language instructional process. Findings indicate that the monolingual dictionary can coexist with bilingual dictionaries within a foreign-language course if the latter are appropriately used as…

  18. Extended dynamic mode decomposition with dictionary learning: A data-driven adaptive spectral decomposition of the Koopman operator

    NASA Astrophysics Data System (ADS)

    Li, Qianxiao; Dietrich, Felix; Bollt, Erik M.; Kevrekidis, Ioannis G.

    2017-10-01

    Numerical approximation methods for the Koopman operator have advanced considerably in the last few years. In particular, data-driven approaches such as dynamic mode decomposition (DMD) and its generalization, the extended-DMD (EDMD), are becoming increasingly popular in practical applications. The EDMD improves upon the classical DMD by the inclusion of a flexible choice of dictionary of observables which spans a finite dimensional subspace on which the Koopman operator can be approximated. This enhances the accuracy of the solution reconstruction and broadens the applicability of the Koopman formalism. Although the convergence of the EDMD has been established, applying the method in practice requires a careful choice of the observables to improve convergence with just a finite number of terms. This is especially difficult for high dimensional and highly nonlinear systems. In this paper, we employ ideas from machine learning to improve upon the EDMD method. We develop an iterative approximation algorithm which couples the EDMD with a trainable dictionary represented by an artificial neural network. Using the Duffing oscillator and the Kuramoto-Sivashinsky partial differential equation as examples, we show that our algorithm can effectively and efficiently adapt the trainable dictionary to the problem at hand to achieve good reconstruction accuracy without the need to choose a fixed dictionary a priori. Furthermore, to obtain a given accuracy, we require fewer dictionary terms than EDMD with fixed dictionaries. This alleviates an important shortcoming of the EDMD algorithm and enhances the applicability of the Koopman framework to practical problems.
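
    To make the role of the dictionary of observables concrete, here is a minimal sketch of EDMD with a fixed, hand-picked dictionary; it is not the trainable neural-network dictionary the paper develops. Everything in it is an assumption chosen for illustration: the scalar linear map x -> 0.9 x, the two-element dictionary psi(x) = [x, x^2], the sample points and the function names. For this system the dictionary spans a Koopman-invariant subspace, so the least-squares Koopman matrix should come out close to diag(0.9, 0.81).

```python
def dictionary(x):
    """Fixed dictionary of observables (assumed for this sketch): [x, x**2]."""
    return [x, x * x]


def edmd_2x2(xs, ys):
    """Least-squares Koopman matrix K = G^{-1} A for a two-observable dictionary.

    G = sum_i psi(x_i) psi(x_i)^T and A = sum_i psi(x_i) psi(y_i)^T,
    where y_i is the state one time step after x_i.
    """
    g = [[0.0, 0.0], [0.0, 0.0]]
    a = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        px, py = dictionary(x), dictionary(y)
        for i in range(2):
            for j in range(2):
                g[i][j] += px[i] * px[j]
                a[i][j] += px[i] * py[j]

    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    if abs(det) < 1e-12:
        raise ValueError("Gram matrix is singular; use more varied samples")
    g_inv = [
        [g[1][1] / det, -g[0][1] / det],
        [-g[1][0] / det, g[0][0] / det],
    ]
    # Plain 2x2 matrix product G^{-1} A.
    return [
        [sum(g_inv[i][m] * a[m][j] for m in range(2)) for j in range(2)]
        for i in range(2)
    ]


RATE = 0.9                      # hypothetical contraction rate of the toy map
xs = [0.5, 1.0, 1.5, 2.0]       # sampled states
ys = [RATE * x for x in xs]     # states one step later
K = edmd_2x2(xs, ys)
```

    With a poorly chosen fixed dictionary the subspace is generally not invariant and `K` is only an approximation, which is the shortcoming the adaptive dictionary in the abstract is meant to address.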

  19. Extended dynamic mode decomposition with dictionary learning: A data-driven adaptive spectral decomposition of the Koopman operator.

    PubMed

    Li, Qianxiao; Dietrich, Felix; Bollt, Erik M; Kevrekidis, Ioannis G

    2017-10-01

    Numerical approximation methods for the Koopman operator have advanced considerably in the last few years. In particular, data-driven approaches such as dynamic mode decomposition (DMD) and its generalization, the extended-DMD (EDMD), are becoming increasingly popular in practical applications. The EDMD improves upon the classical DMD by the inclusion of a flexible choice of dictionary of observables which spans a finite dimensional subspace on which the Koopman operator can be approximated. This enhances the accuracy of the solution reconstruction and broadens the applicability of the Koopman formalism. Although the convergence of the EDMD has been established, applying the method in practice requires a careful choice of the observables to improve convergence with just a finite number of terms. This is especially difficult for high dimensional and highly nonlinear systems. In this paper, we employ ideas from machine learning to improve upon the EDMD method. We develop an iterative approximation algorithm which couples the EDMD with a trainable dictionary represented by an artificial neural network. Using the Duffing oscillator and the Kuramoto-Sivashinsky partial differential equation as examples, we show that our algorithm can effectively and efficiently adapt the trainable dictionary to the problem at hand to achieve good reconstruction accuracy without the need to choose a fixed dictionary a priori. Furthermore, to obtain a given accuracy, we require fewer dictionary terms than EDMD with fixed dictionaries. This alleviates an important shortcoming of the EDMD algorithm and enhances the applicability of the Koopman framework to practical problems.

  20. Training L2 Writers to Reference Corpora as a Self-Correction Tool

    ERIC Educational Resources Information Center

    Quinn, Cynthia

    2015-01-01

    Corpora have the potential to support the L2 writing process at the discourse level in contrast to the isolated dictionary entries that many intermediate writers rely on. To take advantage of this resource, learners need to be trained, which involves practising corpus research and referencing skills as well as learning to make data-based…

  1. Mass type-specific sparse representation for mass classification in computer-aided detection on mammograms

    PubMed Central

    2013-01-01

    Background: Breast cancer is the leading cancer among women in both incidence and mortality. For this reason, much research effort has been devoted to developing Computer-Aided Detection (CAD) systems for early detection of breast cancers on mammograms. In this paper, we propose a novel dictionary configuration underpinning sparse representation based classification (SRC). The key idea of the proposed algorithm is to improve the sparsity in terms of mass margins for the purpose of improving classification performance in CAD systems. Methods: The aim of the proposed SRC framework is to construct separate dictionaries according to the types of mass margins. The underlying idea behind our method is that the separated dictionaries can enhance the sparsity of mass class (true-positive), leading to an improved performance for differentiating mammographic masses from normal tissues (false-positive). When a mass sample is given for classification, the sparse solutions based on corresponding dictionaries are separately solved and combined at score level. Experiments have been performed on both the Digital Database for Screening Mammography (DDSM) and clinical Full Field Digital Mammogram (FFDM) databases (DBs). In our experiments, sparsity concentration in the true class (SCTC) and area under the Receiver operating characteristic (ROC) curve (AUC) were measured for the comparison between the proposed method and a conventional single dictionary based approach. In addition, a support vector machine (SVM) was used for comparing our method with a state-of-the-art classifier extensively used for mass classification. Results: Compared with the conventional single dictionary configuration, the proposed approach is able to improve SCTC by up to 13.9% and 23.6% on DDSM and FFDM DBs, respectively. Moreover, the proposed method is able to improve AUC by 8.2% and 22.1% on DDSM and FFDM DBs, respectively. Compared with the SVM classifier, the proposed method improves AUC by 2.9% and 11.6% on DDSM and FFDM DBs, respectively. Conclusions: The proposed dictionary configuration is found to improve the sparsity of dictionaries, resulting in an enhanced classification performance. Moreover, the results show that the proposed method is better than the conventional SVM classifier for classifying breast masses subject to various margins from normal tissues. PMID:24564973

  2. Learning Category-Specific Dictionary and Shared Dictionary for Fine-Grained Image Categorization.

    PubMed

    Gao, Shenghua; Tsang, Ivor Wai-Hung; Ma, Yi

    2014-02-01

    This paper targets fine-grained image categorization by learning a category-specific dictionary for each category and a shared dictionary for all the categories. Such category-specific dictionaries encode subtle visual differences among different categories, while the shared dictionary encodes common visual patterns among all the categories. To this end, we impose incoherence constraints among the different dictionaries in the objective of feature coding. In addition, to make the learnt dictionary stable, we also impose the constraint that each dictionary should be self-incoherent. Our proposed dictionary learning formulation not only applies to fine-grained classification, but also improves conventional basic-level object categorization and other tasks such as event recognition. Experimental results on five data sets show that our method can outperform the state-of-the-art fine-grained image categorization frameworks as well as sparse coding based dictionary learning frameworks. All these results demonstrate the effectiveness of our method.

  3. Which Dictionary? A Review of the Leading Learners' Dictionaries.

    ERIC Educational Resources Information Center

    Nesi, Hilary

    Three major dictionaries designed for learners of English as a second language are reviewed, their elements and approaches compared and evaluated, their usefulness for different learners discussed, and recommendations for future dictionary improvement made. The dictionaries in question are the "Oxford Advanced Learner's Dictionary," the…

  4. French Dictionaries. Series: Specialised Bibliographies.

    ERIC Educational Resources Information Center

    Klaar, R. M.

    This is a list of French monolingual, French-English and English-French dictionaries available in December 1975. Dictionaries of etymology, phonetics, place names, proper names, and slang are included, as well as dictionaries for children and dictionaries of Belgian, Canadian, and Swiss French. Most other specialized dictionaries, encyclopedias,…

  5. Mapping the Future: Optimizing Joint Geospatial Engineering Support

    DTIC Science & Technology

    2006-05-16

    Environment. Maxwell Air Force Base, AL.: Air University, 1990. Babbage, Ross and Desmond Ball. Geographic Information Systems: Defence Applications...Joint Pub 4-04. Washington, DC: 27 September 2001. Wertz, Charles J. The Data Dictionary, Concepts and Uses. Wellesley, MA: QED Information...Force Defense Mapping for Future Operations, Washington, DC: September 1995, 1-7. 18 Charles J. Wertz, The Data Dictionary, Concepts and Uses

  6. A hospital-wide clinical findings dictionary based on an extension of the International Classification of Diseases (ICD).

    PubMed

    Bréant, C; Borst, F; Campi, D; Griesser, V; Momjian, S

    1999-01-01

    The use of a controlled vocabulary set in a hospital-wide clinical information system is of crucial importance for many departmental database systems to communicate and exchange information. In the absence of an internationally recognized clinical controlled vocabulary set, a new extension of the International statistical Classification of Diseases (ICD) is proposed. It expands the scope of the standard ICD beyond diagnosis and procedures to clinical terminology. In addition, the common Clinical Findings Dictionary (CFD) further records the definition of clinical entities. The construction of the vocabulary set and the CFD is incremental and manual. Tools have been implemented to facilitate the tasks of defining/maintaining/publishing dictionary versions. The design of database applications in the integrated clinical information system is driven by the CFD which is part of the Medical Questionnaire Designer tool. Several integrated clinical database applications in the field of diabetes and neuro-surgery have been developed at the HUG.

  7. Using UMLS to construct a generalized hierarchical concept-based dictionary of brain functions for information extraction from the fMRI literature.

    PubMed

    Hsiao, Mei-Yu; Chen, Chien-Chung; Chen, Jyh-Horng

    2009-10-01

    With rapid progress in the field, a great many fMRI studies are published every year, to the extent that it is now becoming difficult for researchers to keep up with the literature, since reading papers is extremely time-consuming and labor-intensive. Thus, automatic information extraction has become an important issue. In this study, we used the Unified Medical Language System (UMLS) to construct a hierarchical concept-based dictionary of brain functions. To the best of our knowledge, this is the first generalized dictionary of this kind. We also developed an information extraction system for recognizing, mapping and classifying terms relevant to human brain study. The precision and recall of our system were on a par with those of human experts in term recognition, term mapping and term classification. The approach presented in this paper offers an alternative to the more laborious, manual entry approach to information extraction.

  8. A hospital-wide clinical findings dictionary based on an extension of the International Classification of Diseases (ICD).

    PubMed Central

    Bréant, C.; Borst, F.; Campi, D.; Griesser, V.; Momjian, S.

    1999-01-01

    The use of a controlled vocabulary set in a hospital-wide clinical information system is of crucial importance for many departmental database systems to communicate and exchange information. In the absence of an internationally recognized clinical controlled vocabulary set, a new extension of the International statistical Classification of Diseases (ICD) is proposed. It expands the scope of the standard ICD beyond diagnosis and procedures to clinical terminology. In addition, the common Clinical Findings Dictionary (CFD) further records the definition of clinical entities. The construction of the vocabulary set and the CFD is incremental and manual. Tools have been implemented to facilitate the tasks of defining/maintaining/publishing dictionary versions. The design of database applications in the integrated clinical information system is driven by the CFD which is part of the Medical Questionnaire Designer tool. Several integrated clinical database applications in the field of diabetes and neuro-surgery have been developed at the HUG. Images Figure 1 PMID:10566451

  9. Which Desk Dictionary Is Best for Foreign Students of English?

    ERIC Educational Resources Information Center

    Yorkey, Richard

    1969-01-01

    "The American College Dictionary," "Funk and Wagnalls Standard College Dictionary," "Webster's New World Dictionary of the American Language," "The Random House Dictionary of the English Language," and "Webster's Seventh New Collegiate Dictionary" are analyzed and ranked as to their usefulness for the foreign learner of English. (FWB)

  10. DISEASES: text mining and data integration of disease-gene associations.

    PubMed

    Pletscher-Frankild, Sune; Pallejà, Albert; Tsafou, Kalliopi; Binder, Janos X; Jensen, Lars Juhl

    2015-03-01

    Text mining is a flexible technology that can be applied to numerous different tasks in biology and medicine. We present a system for extracting disease-gene associations from biomedical abstracts. The system consists of a highly efficient dictionary-based tagger for named entity recognition of human genes and diseases, which we combine with a scoring scheme that takes into account co-occurrences both within and between sentences. We show that this approach is able to extract half of all manually curated associations with a false positive rate of only 0.16%. Nonetheless, text mining should not stand alone, but be combined with other types of evidence. For this reason, we have developed the DISEASES resource, which integrates the results from text mining with manually curated disease-gene associations, cancer mutation data, and genome-wide association studies from existing databases. The DISEASES resource is accessible through a web interface at http://diseases.jensenlab.org/, where the text-mining software and all associations are also freely available for download. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
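
    The two ingredients named in the abstract, dictionary-based tagging and a co-occurrence score that distinguishes same-sentence from cross-sentence mentions, can be sketched as follows. This is a toy illustration, not the DISEASES tagger or its scoring scheme: the lexicons, the identifiers, the weights (1.0 and 0.5), the sentence splitter and the function names are all assumptions made for this example.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical mini-lexicons mapping lowercase surface forms to identifiers.
GENES = {"brca1": "GENE:BRCA1", "tp53": "GENE:TP53"}
DISEASES = {"breast cancer": "DIS:breast_cancer"}


def tag(sentence, lexicon):
    """Return identifiers of lexicon terms occurring as whole words."""
    lowered = sentence.lower()
    found = set()
    for term, identifier in lexicon.items():
        # Lookarounds stop "tp53" from matching inside a longer token.
        if re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", lowered):
            found.add(identifier)
    return found


def score_pairs(abstract, same_sentence_weight=1.0, same_abstract_weight=0.5):
    """Score gene-disease pairs from co-occurrence in one abstract.

    Every pair mentioned anywhere in the abstract gets the abstract-level
    weight once; each sentence containing both adds the sentence weight.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", abstract) if s]
    gene_hits = [tag(s, GENES) for s in sentences]
    disease_hits = [tag(s, DISEASES) for s in sentences]

    scores = Counter()
    all_genes = set().union(*gene_hits)
    all_diseases = set().union(*disease_hits)
    for pair in product(all_genes, all_diseases):
        scores[pair] += same_abstract_weight
    for genes, diseases in zip(gene_hits, disease_hits):
        for pair in product(genes, diseases):
            scores[pair] += same_sentence_weight
    return scores


EXAMPLE = "BRCA1 mutations raise the risk of breast cancer. TP53 was also sequenced."
scores = score_pairs(EXAMPLE)
```

    In this toy example the BRCA1 pair shares a sentence with the disease and so scores higher than the TP53 pair, which only shares the abstract. A production tagger would also need to handle synonyms, ambiguous names and very large lexicons efficiently.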

  11. When Cancer Returns

    MedlinePlus

  12. Coping with Advanced Cancer

    MedlinePlus

  13. Thinking about Complementary and Alternative Medicine

    MedlinePlus

  14. Caring for the Caregiver

    MedlinePlus

  15. The Use of Monolingual Mobile Dictionaries in the Context of Reading by Intermediate Cantonese EFL Learners in Hong Kong

    ERIC Educational Resources Information Center

    Zou, Di; Xie, Haoran; Wang, Fu Lee

    2015-01-01

    Previous studies on dictionary consultation investigated mainly online dictionaries or simple pocket electronic dictionaries as they were commonly used among learners back then, yet the more updated mobile dictionaries were superficially investigated though they have already replaced the pocket electronic dictionaries. These studies are also…

  16. Requirements and design aspects of a data model for a data dictionary in paediatric oncology.

    PubMed

    Merzweiler, A; Knaup, P; Creutzig, U; Ehlerding, H; Haux, R; Mludek, V; Schilling, F H; Weber, R; Wiedemann, T

    2000-01-01

    German children suffering from cancer are mostly treated within the framework of multicentre clinical trials. An important task of conducting these trials is an extensive information and knowledge exchange, which has to be based on a standardised documentation. To support this effort, it is the aim of a nationwide project to define a standardised terminology that should be used by clinical trials for therapy documentation. In order to support terminology maintenance we are currently developing a data dictionary. In this paper we describe requirements and design aspects of the data model used for the data dictionary as first results of our research. We compare it with other terminology systems.

  17. Preliminary Classification of Army and Navy Entry-Level Occupations by the Holland Coding System.

    DTIC Science & Technology

    1986-12-01

    Dictionary of Holland Occupational Codes (DHOC; see Gottfredson, Holland, & Ogawa, 1982) either directly or through expert judgment. Results...publications: The Dictionary of Holland Occupational Codes (DHOC; Gottfredson, Holland, & Ogawa, 1982) and The Occupations Finder (Holland, 1978). The...occupational categories (Gottfredson et al., 1982). The agreement between the first-letter codes obtained from the 1977 Occupations Finder and the

  18. Sparse Representation for Infrared Dim Target Detection via a Discriminative Over-Complete Dictionary Learned Online

    PubMed Central

    Li, Zheng-Zhou; Chen, Jing; Hou, Qian; Fu, Hong-Xia; Dai, Zhen; Jin, Gang; Li, Ru-Zhang; Liu, Chang-Ju

    2014-01-01

    It is difficult for structural over-complete dictionaries such as the Gabor function and discriminative over-complete dictionary, which are learned offline and classified manually, to represent natural images with the goal of ideal sparseness and to enhance the difference between background clutter and target signals. This paper proposes an infrared dim target detection approach based on sparse representation on a discriminative over-complete dictionary. An adaptive morphological over-complete dictionary is trained and constructed online according to the content of infrared image by K-singular value decomposition (K-SVD) algorithm. Then the adaptive morphological over-complete dictionary is divided automatically into a target over-complete dictionary describing target signals, and a background over-complete dictionary embedding background by the criteria that the atoms in the target over-complete dictionary could be decomposed more sparsely based on a Gaussian over-complete dictionary than the one in the background over-complete dictionary. This discriminative over-complete dictionary can not only capture significant features of background clutter and dim targets better than a structural over-complete dictionary, but also strengthens the sparse feature difference between background and target more efficiently than a discriminative over-complete dictionary learned offline and classified manually. The target and background clutter can be sparsely decomposed over their corresponding over-complete dictionaries, yet couldn't be sparsely decomposed based on their opposite over-complete dictionary, so their residuals after reconstruction by the prescribed number of target and background atoms differ very visibly. Some experiments are included and the results show that this proposed approach could not only improve the sparsity more efficiently, but also enhance the performance of small target detection more effectively. PMID:24871988

  19. Sparse representation for infrared Dim target detection via a discriminative over-complete dictionary learned online.

    PubMed

    Li, Zheng-Zhou; Chen, Jing; Hou, Qian; Fu, Hong-Xia; Dai, Zhen; Jin, Gang; Li, Ru-Zhang; Liu, Chang-Ju

    2014-05-27

    It is difficult for structural over-complete dictionaries such as the Gabor function and discriminative over-complete dictionary, which are learned offline and classified manually, to represent natural images with the goal of ideal sparseness and to enhance the difference between background clutter and target signals. This paper proposes an infrared dim target detection approach based on sparse representation on a discriminative over-complete dictionary. An adaptive morphological over-complete dictionary is trained and constructed online according to the content of infrared image by K-singular value decomposition (K-SVD) algorithm. Then the adaptive morphological over-complete dictionary is divided automatically into a target over-complete dictionary describing target signals, and a background over-complete dictionary embedding background by the criteria that the atoms in the target over-complete dictionary could be decomposed more sparsely based on a Gaussian over-complete dictionary than the one in the background over-complete dictionary. This discriminative over-complete dictionary can not only capture significant features of background clutter and dim targets better than a structural over-complete dictionary, but also strengthens the sparse feature difference between background and target more efficiently than a discriminative over-complete dictionary learned offline and classified manually. The target and background clutter can be sparsely decomposed over their corresponding over-complete dictionaries, yet couldn't be sparsely decomposed based on their opposite over-complete dictionary, so their residuals after reconstruction by the prescribed number of target and background atoms differ very visibly. Some experiments are included and the results show that this proposed approach could not only improve the sparsity more efficiently, but also enhance the performance of small target detection more effectively.

  20. Cancer Information Summaries: Screening/Detection

    MedlinePlus


  1. Children with Cancer: A Guide for Parents

    MedlinePlus


  2. Cell line name recognition in support of the identification of synthetic lethality in cancer from text

    PubMed Central

    Kaewphan, Suwisa; Van Landeghem, Sofie; Ohta, Tomoko; Van de Peer, Yves; Ginter, Filip; Pyysalo, Sampo

    2016-01-01

    Motivation: The recognition and normalization of cell line names in text is an important task in biomedical text mining research, facilitating for instance the identification of synthetically lethal genes from the literature. While several tools have previously been developed to address cell line recognition, it is unclear whether available systems can perform sufficiently well in realistic and broad-coverage applications such as extracting synthetically lethal genes from the cancer literature. In this study, we revisit the cell line name recognition task, evaluating both available systems and newly introduced methods on various resources to obtain a reliable tagger not tied to any specific subdomain. In support of this task, we introduce two text collections manually annotated for cell line names: the broad-coverage corpus Gellus and CLL, a focused target domain corpus. Results: We find that the best performance is achieved using NERsuite, a machine learning system based on Conditional Random Fields, trained on the Gellus corpus and supported with a dictionary of cell line names. The system achieves an F-score of 88.46% on the test set of Gellus and 85.98% on the independently annotated CLL corpus. It was further applied at large scale to 24 302 102 unannotated articles, resulting in the identification of 5 181 342 cell line mentions, normalized to 11 755 unique cell line database identifiers. Availability and implementation: The manually annotated datasets, the cell line dictionary, derived corpora, NERsuite models and the results of the large-scale run on unannotated texts are available under open licenses at http://turkunlp.github.io/Cell-line-recognition/. Contact: sukaew@utu.fi PMID:26428294

  3. Usage Notes in the Oxford American Dictionary.

    ERIC Educational Resources Information Center

    Berner, R. Thomas

    1981-01-01

    Compares the "Oxford American Dictionary" with the "American Heritage Dictionary." Examines the dictionaries' differences in philosophies of language, introductory essays, and usage notes. Concludes that the "Oxford American Dictionary" is too conservative, paternalistic, and dogmatic for the 1980s. (DMM)

  4. Treatment Choices for Men with Early-Stage Prostate Cancer

    MedlinePlus


  5. Pain Control: Support for People with Cancer

    MedlinePlus


  6. Chemotherapy and You: Support for People with Cancer

    MedlinePlus


  7. Facing Forward Series: Life After Cancer Treatment

    MedlinePlus


  8. Eating Hints: Before, During, and After Cancer Treatment

    MedlinePlus


  9. Taking Time: Support for People with Cancer

    MedlinePlus


  10. Radiation Therapy and You: Support for People with Cancer

    MedlinePlus


  11. A Simple and Practical Dictionary-based Approach for Identification of Proteins in Medline Abstracts

    PubMed Central

    Egorov, Sergei; Yuryev, Anton; Daraselia, Nikolai

    2004-01-01

    Objective: The aim of this study was to develop a practical and efficient protein identification system for biomedical corpora. Design: The developed system, called ProtScan, utilizes a carefully constructed dictionary of mammalian proteins in conjunction with a specialized tokenization algorithm to identify and tag protein name occurrences in biomedical texts and also takes advantage of Medline “Name-of-Substance” (NOS) annotation. The dictionaries for ProtScan were constructed in a semi-automatic way from various public-domain sequence databases followed by an intensive expert curation step. Measurements: The recall and precision of the system have been determined using 1,000 randomly selected and hand-tagged Medline abstracts. Results: The developed system is capable of identifying protein occurrences in Medline abstracts with a 98% precision and 88% recall. It was also found to be capable of processing approximately 300 abstracts per second. Without utilization of NOS annotation, precision and recall were found to be 98.5% and 84%, respectively. Conclusion: The developed system appears to be well suited for protein-based Medline indexing and can help to improve biomedical information retrieval. Further approaches to ProtScan's recall improvement also are discussed. PMID:14764613
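
    The abstract above does not describe ProtScan's tokenization algorithm or dictionary format, so the following is only a minimal Python sketch of generic dictionary-based tagging, assuming a simple alphanumeric tokenizer and greedy longest-match lookup. The names `tokenize`, `build_index`, `tag`, and `precision_recall` are hypothetical and are not taken from ProtScan.

```python
import re


def tokenize(text):
    """Lowercased alphanumeric tokens with (start, end) character offsets."""
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"[A-Za-z0-9]+", text)]


def build_index(names):
    """Map each dictionary name, as a tuple of tokens, to its canonical form."""
    index = {}
    for name in names:
        key = tuple(tok for tok, _, _ in tokenize(name))
        if key:
            index[key] = name
    return index


def tag(text, index):
    """Greedy longest-match tagging; returns (start, end, canonical_name) hits."""
    tokens = tokenize(text)
    max_len = max((len(key) for key in index), default=0)
    hits, i = [], 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tok for tok, _, _ in tokens[i:i + n])
            if key in index:
                hits.append((tokens[i][1], tokens[i + n - 1][2], index[key]))
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return hits


def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of annotations."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall
```

    Trying the longest span first means a name such as "tumor necrosis factor alpha" is preferred over its shorter prefix "tumor necrosis factor" when both are in the dictionary. Evaluation against hand-tagged abstracts, as in the Measurements step, then reduces to comparing the predicted and gold annotation sets.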

  12. Combining dictionary techniques with extensible markup language (XML)--requirements to a new approach towards flexible and standardized documentation.

    PubMed Central

    Altmann, U.; Tafazzoli, A. G.; Noelle, G.; Huybrechts, T.; Schweiger, R.; Wächter, W.; Dudeck, J. W.

    1999-01-01

    In oncology various international and national standards exist for the documentation of different aspects of a disease. Since elements of these standards are repeated in different contexts, a common data dictionary could support consistent representation in any context. For the construction of such a dictionary existing documents have to be worked up in a complex procedure, that considers aspects of hierarchical decomposition of documents and of domain control as well as aspects of user presentation and models of the underlying model of patient data. In contrast to other thesauri, text chunks like definitions or explanations are very important and have to be preserved, since oncologic documentation often means coding and classification on an aggregate level and the safe use of coding systems is an important precondition for comparability of data. This paper discusses the potentials of the use of XML in combination with a dictionary for the promotion and development of standard conformable applications for tumor documentation. PMID:10566311

  13. Defining datasets and creating data dictionaries for quality improvement and research in chronic disease using routinely collected data: an ontology-driven approach.

    PubMed

    de Lusignan, Simon; Liaw, Siaw-Teng; Michalakidis, Georgios; Jones, Simon

    2011-01-01

    The burden of chronic disease is increasing, and research and quality improvement will be less effective if case finding strategies are suboptimal. To describe an ontology-driven approach to case finding in chronic disease and how this approach can be used to create a data dictionary and make the codes used in case finding transparent. A five-step process: (1) identifying a reference coding system or terminology; (2) using an ontology-driven approach to identify cases; (3) developing metadata that can be used to identify the extracted data; (4) mapping the extracted data to the reference terminology; and (5) creating the data dictionary. Hypertension is presented as an exemplar. A patient with hypertension can be represented by a range of codes including diagnostic, history and administrative. Metadata can link the coding system and data extraction queries to the correct data mapping and translation tool, which then maps it to the equivalent code in the reference terminology. The code extracted, the term, its domain and subdomain, and the name of the data extraction query can then be automatically grouped and published online as a readily searchable data dictionary. An exemplar online is: www.clininf.eu/qickd-data-dictionary.html. Adopting an ontology-driven approach to case finding could improve the quality of disease registers and of research based on routine data. It would offer considerable advantages over using limited datasets to define cases. This approach should be considered by those involved in research and quality improvement projects which utilise routine data.

  14. Highly undersampled MR image reconstruction using an improved dual-dictionary learning method with self-adaptive dictionaries.

    PubMed

    Li, Jiansen; Song, Ying; Zhu, Zhen; Zhao, Jun

    2017-05-01

    Dual-dictionary learning (Dual-DL) method utilizes both a low-resolution dictionary and a high-resolution dictionary, which are co-trained for sparse coding and image updating, respectively. It can effectively exploit a priori knowledge regarding the typical structures, specific features, and local details of training sets images. The prior knowledge helps to improve the reconstruction quality greatly. This method has been successfully applied in magnetic resonance (MR) image reconstruction. However, it relies heavily on the training sets, and dictionaries are fixed and nonadaptive. In this research, we improve Dual-DL by using self-adaptive dictionaries. The low- and high-resolution dictionaries are updated correspondingly along with the image updating stage to ensure their self-adaptivity. The updated dictionaries incorporate both the prior information of the training sets and the test image directly. Both dictionaries feature improved adaptability. Experimental results demonstrate that the proposed method can efficiently and significantly improve the quality and robustness of MR image reconstruction.

  15. What Dictionary to Use? A Closer Look at the "Oxford Advanced Learner's Dictionary," the "Longman Dictionary of Contemporary English" and the "Longman Lexicon of Contemporary English."

    ERIC Educational Resources Information Center

    Shaw, A. M.

    1983-01-01

    Three dictionaries are compared for their usefulness to teachers of English as a foreign language, teachers in training, students, and other users of English as a foreign language. The issue of monolingual versus bilingual dictionary format is discussed, and a previous analysis of the two bilingual dictionaries is summarized. Pronunciation…

  16. A Microcomputer E-Book—A Database System for Patient Care Experience Using A Personalized Data Dictionary

    PubMed Central

    Hepler, Kevin M.

    1983-01-01

    This paper is a description of a computerized E-book system for maintaining a record of patient care experience. It uses a microcomputer and a specially-written file management program. Its features include a dictionary that is developed by the user to permit easy data entry and retrieval while maintaining compatibility with standard reporting codes. The author of this paper has used this system to maintain a list of more than 3,500 patient contacts during a three year family practice residency at the University of Missouri-Columbia and has found it useful in his education.

  17. DICTIONARIES AND LANGUAGE CHANGE.

    ERIC Educational Resources Information Center

    POOLEY, ROBERT C.

    TWO VIEWS OF A DICTIONARY'S PURPOSE CAME INTO SHARP CONFLICT UPON THE PUBLICATION OF WEBSTER'S "THIRD NEW INTERNATIONAL UNABRIDGED DICTIONARY." THE FIRST VIEW IS THAT A DICTIONARY IS A REFERENCE BOOK ON LANGUAGE ETIQUETTE, AN AUTHORITY FOR MAINTAINING THE PURITY OF THE ENGLISH LANGUAGE. THE SECOND IS THAT A DICTIONARY IS A SCIENTIFIC…

  18. Do Dictionaries Help Students Write?

    ERIC Educational Resources Information Center

    Nesi, Hilary

    Examples are given of real lexical errors made by learner writers, and consideration is given to the way in which three learners' dictionaries could deal with the lexical items that were misused. The dictionaries were the "Oxford Advanced Learner's Dictionary," the "Longman Dictionary of Contemporary English," and the "Chambers Universal Learners'…

  19. Information on Quantifiers and Argument Structure in English Learner's Dictionaries.

    ERIC Educational Resources Information Center

    Lee, Thomas Hun-tak

    1993-01-01

    Lexicographers have been arguing for the inclusion of abstract and complex grammatical information in dictionaries. This paper examines the extent to which information about quantifiers and the argument structure of verbs is encoded in English learner's dictionaries. The Oxford Advanced Learner's Dictionary (1989), the Longman Dictionary of…

  20. Students' Understanding of Dictionary Entries: A Study with Respect to Four Learners' Dictionaries.

    ERIC Educational Resources Information Center

    Jana, Abhra; Amritavalli, Vijaya; Amritavalli, R.

    2003-01-01

    Investigates the effects of definitional information in the form of dictionary entries, on second language learners' vocabulary learning in an instructed setting. Indian students (Native Hindi speakers) of English received monolingual English dictionary entries of five previously unknown words from four different learner's dictionaries. Results…

  1. Seismic classification through sparse filter dictionaries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hickmann, Kyle Scott; Srinivasan, Gowri

    We tackle a multi-label classification problem involving the relation between acoustic-profile features and the measured seismogram. To isolate components of the seismograms unique to each class of acoustic profile we build dictionaries of convolutional filters. The convolutional-filter dictionaries for the individual classes are then combined into a large dictionary for the entire seismogram set. A given seismogram is classified by computing its representation in the large dictionary and then comparing reconstruction accuracy with this representation using each of the sub-dictionaries. The sub-dictionary with the minimal reconstruction error identifies the seismogram class.
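
    The classification rule in this abstract (code the signal in the combined dictionary, then pick the sub-dictionary whose atoms reconstruct it best) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' method: it uses plain vectors as atoms instead of convolutional filters, assumes unit-norm atoms, and substitutes greedy matching pursuit for whatever sparse solver the report uses. The names `matching_pursuit` and `classify` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding; returns one coefficient per atom.

    Assumes every atom has unit Euclidean norm.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        scores = [dot(residual, atom) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[k] += scores[k]
        residual = [r - scores[k] * x for r, x in zip(residual, atoms[k])]
    return coeffs


def classify(signal, sub_dicts, n_nonzero=2):
    """Return (label, errors) using per-class reconstruction error."""
    # Combine the class dictionaries, remembering which class owns each atom.
    atoms, owner = [], []
    for label, class_atoms in sub_dicts.items():
        atoms.extend(class_atoms)
        owner.extend([label] * len(class_atoms))

    # One representation in the combined dictionary.
    coeffs = matching_pursuit(signal, atoms, n_nonzero)

    # Reconstruct with each class's share of the coefficients only.
    errors = {}
    for label in sub_dicts:
        recon = [0.0] * len(signal)
        for c, atom, o in zip(coeffs, atoms, owner):
            if o == label:
                recon = [r + c * x for r, x in zip(recon, atom)]
        errors[label] = math.sqrt(
            sum((s - r) ** 2 for s, r in zip(signal, recon)))
    return min(errors, key=errors.get), errors
```

    For example, with `sub_dicts = {"A": [[1, 0, 0, 0], [0, 1, 0, 0]], "B": [[0, 0, 1, 0], [0, 0, 0, 1]]}`, a signal whose energy lies mostly in the first two coordinates is assigned to class "A", because the class "A" atoms alone leave the smaller residual.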

  2. Adaptive structured dictionary learning for image fusion based on group-sparse-representation

    NASA Astrophysics Data System (ADS)

    Yang, Jiajie; Sun, Bin; Luo, Chengwei; Wu, Yuzhong; Xu, Limei

    2018-04-01

    Dictionary learning is the key process of sparse representation, which is one of the most widely used image representation theories in image fusion. The existing dictionary learning method does not use the group structure information and the sparse coefficients well. In this paper, we propose a new adaptive structured dictionary learning algorithm and an l1-norm maximum fusion rule that innovatively utilizes grouped sparse coefficients to merge the images. In the dictionary learning algorithm, we do not need prior knowledge about any group structure of the dictionary. By using the characteristics of the dictionary in expressing the signal, our algorithm can automatically find the desired potential structure information that is hidden in the dictionary. The fusion rule takes the physical meaning of the group structure dictionary, and makes activity-level judgement on the structure information when the images are being merged. Therefore, the fused image can retain more significant information. Comparisons have been made with several state-of-the-art dictionary learning methods and fusion rules. The experimental results demonstrate that the dictionary learning algorithm and the fusion rule both outperform others in terms of several objective evaluation metrics.

  3. Process and methodology of developing Cassini G and C Telemetry Dictionary

    NASA Technical Reports Server (NTRS)

    Kan, Edwin P.

    1994-01-01

    While the Cassini spacecraft telemetry design had taken on the new approach of 'packetized telemetry', the AACS (Attitude and Articulation Subsystem) had further extended into the design of 'mini-packets' in its telemetry system. Such telemetry packet and mini-packet design produced the AACS Telemetry Dictionary; iterations of the latter in turn provided changes to the former. The ultimate goals were to achieve maximum telemetry packing density, optimize the 'freshness' of more time-critical data, and to effect flexibility, i.e., multiple AACS data collection schemes, without needing to change the overall spacecraft telemetry mode. This paper describes such a systematic process and methodology, evidenced by various design products related to, or as part of, the AACS Telemetry Dictionary.

  4. Coupled dictionary learning for joint MR image restoration and segmentation

    NASA Astrophysics Data System (ADS)

    Yang, Xuesong; Fan, Yong

    2018-03-01

    To achieve better segmentation of MR images, image restoration is typically used as a preprocessing step, especially for low-quality MR images. Recent studies have demonstrated that dictionary learning methods could achieve promising performance for both image restoration and image segmentation. These methods typically learn paired dictionaries of image patches from different sources and use a common sparse representation to characterize paired image patches, such as low-quality image patches and their corresponding high quality counterparts for the image restoration, and image patches and their corresponding segmentation labels for the image segmentation. Since learning these dictionaries jointly in a unified framework may improve the image restoration and segmentation simultaneously, we propose a coupled dictionary learning method to concurrently learn dictionaries for joint image restoration and image segmentation based on sparse representations in a multi-atlas image segmentation framework. Particularly, three dictionaries, including a dictionary of low quality image patches, a dictionary of high quality image patches, and a dictionary of segmentation label patches, are learned in a unified framework so that the learned dictionaries of image restoration and segmentation can benefit each other. Our method has been evaluated for segmenting the hippocampus in MR T1 images collected with scanners of different magnetic field strengths. The experimental results have demonstrated that our method achieved better image restoration and segmentation performance than state of the art dictionary learning and sparse representation based image restoration and image segmentation methods.

  5. Tensor Dictionary Learning for Positive Definite Matrices.

    PubMed

    Sivalingam, Ravishankar; Boley, Daniel; Morellas, Vassilios; Papanikolopoulos, Nikolaos

    2015-11-01

    Sparse models have proven to be extremely successful in image processing and computer vision. However, a majority of the effort has been focused on sparse representation of vectors and low-rank models for general matrices. The success of sparse modeling, along with popularity of region covariances, has inspired the development of sparse coding approaches for these positive definite descriptors. While in earlier work, the dictionary was formed from all, or a random subset of, the training signals, it is clearly advantageous to learn a concise dictionary from the entire training set. In this paper, we propose a novel approach for dictionary learning over positive definite matrices. The dictionary is learned by alternating minimization between sparse coding and dictionary update stages, and different atom update methods are described. A discriminative version of the dictionary learning approach is also proposed, which simultaneously learns dictionaries for different classes in classification or clustering. Experimental results demonstrate the advantage of learning dictionaries from data both from reconstruction and classification viewpoints. Finally, a software library is presented comprising C++ binaries for all the positive definite sparse coding and dictionary learning approaches presented here.

  6. Consolidated Environmental Resource Database Information Process (CERDIP)

    DTIC Science & Technology

    2015-11-19

    Secretary of the Army for Installations, Energy and Environment [OASA(IE&E)] ESOH 5850 21st Street, Bldg 211, Second Floor Fort Belvoir, VA 22060-5938...Elizabeth J. Keysar 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) National Defense Center for Energy and Environment Operated by Concurrent...Markup Language NDCEE National Defense Center for Energy and Environment NFDD National Geospatial–Intelligence Agency Feature Data Dictionary

  7. Medical and dermatology dictionaries: an examination of unstructured definitions and a proposal for the future.

    PubMed

    DeVries, David Todd; Papier, Art; Byrnes, Jennifer; Goldsmith, Lowell A

    2004-01-01

    Medical dictionaries serve to describe and clarify the term set used by medical professionals. In this commentary, we analyze a representative set of skin disease definitions from 2 prominent medical dictionaries, Stedman's Medical Dictionary and Dorland's Illustrated Medical Dictionary. We find that there is an apparent lack of stylistic standards with regard to content and form. We advocate a new standard form for the definition of medical terminology, a standard to complement the easy-to-read yet unstructured style of the traditional dictionary entry. This new form offers a reproducible structure, paving the way for the development of a computer readable "dictionary" of medical terminology. Such a dictionary offers immediate update capability and a fundamental improvement in the ability to search for relationships between terms.

  8. Data-Dictionary-Editing Program

    NASA Technical Reports Server (NTRS)

    Cumming, A. P.

    1989-01-01

    Access to data-dictionary relations and attributes made more convenient. Data Dictionary Editor (DDE) application program provides more convenient read/write access to data-dictionary table ("descriptions table") via data screen using SMARTQUERY function keys. Provides three main advantages: (1) User works with table names and field names rather than with table numbers and field numbers, (2) Provides online access to definitions of data-dictionary keys, and (3) Provides displayed summary list that shows, for each datum, which data-dictionary entries currently exist for any specific relation or attribute. Computer program developed to give developers of data bases more convenient access to the OMNIBASE VAX/IDM data-dictionary relations and attributes.

  9. Dictionary Learning Algorithms for Sparse Representation

    PubMed Central

    Kreutz-Delgado, Kenneth; Murray, Joseph F.; Rao, Bhaskar D.; Engan, Kjersti; Lee, Te-Won; Sejnowski, Terrence J.

    2010-01-01

    Algorithms for data-driven learning of domain-specific overcomplete dictionaries are developed to obtain maximum likelihood and maximum a posteriori dictionary estimates based on the use of Bayesian models with concave/Schur-concave (CSC) negative log priors. Such priors are appropriate for obtaining sparse representations of environmental signals within an appropriately chosen (environmentally matched) dictionary. The elements of the dictionary can be interpreted as concepts, features, or words capable of succinct expression of events encountered in the environment (the source of the measured signals). This is a generalization of vector quantization in that one is interested in a description involving a few dictionary entries (the proverbial “25 words or less”), but not necessarily as succinct as one entry. To learn an environmentally adapted dictionary capable of concise expression of signals generated by the environment, we develop algorithms that iterate between a representative set of sparse representations found by variants of FOCUSS and an update of the dictionary using these sparse representations. Experiments were performed using synthetic data and natural images. For complete dictionaries, we demonstrate that our algorithms have improved performance over other independent component analysis (ICA) methods, measured in terms of signal-to-noise ratios of separated sources. In the overcomplete case, we show that the true underlying dictionary and sparse sources can be accurately recovered. In tests with natural images, learned overcomplete dictionaries are shown to have higher coding efficiency than complete dictionaries; that is, images encoded with an over-complete dictionary have both higher compression (fewer bits per pixel) and higher accuracy (lower mean square error). PMID:12590811

  10. Nonparametric, Coupled, Bayesian, Dictionary, and Classifier Learning for Hyperspectral Classification.

    PubMed

    Akhtar, Naveed; Mian, Ajmal

    2017-10-03

    We present a principled approach to learn a discriminative dictionary along a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which, we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.

  11. Fast Low-Rank Shared Dictionary Learning for Image Classification.

    PubMed

    Tiep Huu Vu; Monga, Vishal

    2017-11-01

    Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. This observation has been exploited partially in a recently proposed dictionary learning framework by separating the particularity and the commonality (COPAR). Inspired by this, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification with more intuitive constraints. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e., claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Furthermore, we develop new fast and accurate algorithms to solve the subproblems in the learning step, accelerating its convergence. The said algorithms could also be applied to FDDL and its extensions. The efficiencies of these algorithms are theoretically and experimentally verified by comparing their complexities and running time with those of other well-known dictionary learning methods. Experimental results on widely used image data sets establish the advantages of our method over the state-of-the-art dictionary learning methods.

  12. Fast Low-Rank Shared Dictionary Learning for Image Classification

    NASA Astrophysics Data System (ADS)

    Vu, Tiep Huu; Monga, Vishal

    2017-11-01

    Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. This observation has been exploited partially in a recently proposed dictionary learning framework by separating the particularity and the commonality (COPAR). Inspired by this, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification with more intuitive constraints. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e. claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Further, we develop new fast and accurate algorithms to solve the subproblems in the learning step, accelerating its convergence. The said algorithms could also be applied to FDDL and its extensions. The efficiencies of these algorithms are theoretically and experimentally verified by comparing their complexities and running time with those of other well-known dictionary learning methods. Experimental results on widely used image datasets establish the advantages of our method over state-of-the-art dictionary learning methods.

  13. A Study of Comparatively Low Achievement Students' Bilingualized Dictionary Use and Their English Learning

    ERIC Educational Resources Information Center

    Chen, Szu-An

    2016-01-01

    This study investigates bilingualized dictionary use of Taiwanese university students. It aims to examine EFL learners' overall dictionary use behavior and their perspectives on book dictionary as well as the necessity of advance guidance in using dictionaries. Data was collected through questionnaires and analyzed by SPSS 15.0. Findings indicate…

  14. Using Different Types of Dictionaries for Improving EFL Reading Comprehension and Vocabulary Learning

    ERIC Educational Resources Information Center

    Alharbi, Majed A.

    2016-01-01

    This study investigated the effects of monolingual book dictionaries, popup dictionaries, and type-in dictionaries on improving reading comprehension and vocabulary learning in an EFL program. An experimental design involving four groups and a post-test was chosen for the experiment: (1) pop-up dictionary (experimental group 1); (2) type-in…

  15. Students Working with an English Learners' Dictionary on CD-ROM.

    ERIC Educational Resources Information Center

    Winkler, Birgit

    This paper examines the growing literature on pedagogical lexicography and the growing focus on how well the learner uses the dictionary in second language learning. Dictionaries are becoming more user-friendly. This study used the writing task to reveal new insights into how students use a CD-ROM dictionary. It found a lack of dictionary-using…

  16. The Effects of Dictionary Use on the Vocabulary Learning Strategies Used by Language Learners of Spanish.

    ERIC Educational Resources Information Center

    Hsien-jen, Chin

    This study investigated the effects of dictionary use on the vocabulary learning strategies used by intermediate college-level Spanish learners to understand new vocabulary items in a reading test. Participants were randomly assigned to one of three groups: control (without a dictionary), bilingual dictionary (using a Spanish-English dictionary),…

  17. An object-oriented design for automated navigation of semantic networks inside a medical data dictionary.

    PubMed

    Ruan, W; Bürkle, T; Dudeck, J

    2000-01-01

    In this paper we present a data dictionary server for the automated navigation of information sources. The underlying knowledge is represented within a medical data dictionary. The mapping between medical terms and information sources is based on a semantic network. The key aspect of implementing the dictionary server is how to represent the semantic network in a way that is easier to navigate and to operate, i.e. how to abstract the semantic network and to represent it in memory for various operations. This paper describes an object-oriented design based on Java that represents the semantic network in terms of a group of objects. A node and its relationships to its neighbors are encapsulated in one object. Based on such a representation model, several operations have been implemented. They comprise the extraction of parts of the semantic network which can be reached from a given node as well as finding all paths between a start node and a predefined destination node. This solution is independent of any given layout of the semantic structure. Therefore the module, called the Giessen Data Dictionary Server, can act independently of a specific clinical information system. The dictionary server will be used to present clinical information, e.g. treatment guidelines or drug information sources, to the clinician in an appropriate working context. The server is invoked from clinical documentation applications which contain an infobutton. Automated navigation will guide the user to all the information relevant to her/his topic that is currently available inside our closed clinical network.
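
    The two operations named in this abstract (extracting the part of the network reachable from a node, and finding all paths between two nodes) can be illustrated with a minimal sketch. It is written in Python rather than the Java of the original work, and the class name Node, the relation labels and the example terms are all hypothetical; it is not the Giessen Data Dictionary Server's actual design.

```python
class Node:
    """One term of the network together with its outgoing relationships."""

    def __init__(self, name):
        self.name = name
        # relation label -> list of neighbouring Node objects (labels are made up)
        self.links = {}

    def connect(self, relation, other):
        """Add a directed, labelled edge from this node to `other`."""
        self.links.setdefault(relation, []).append(other)

    def neighbours(self):
        """Yield every node that is one edge away, across all relations."""
        for targets in self.links.values():
            yield from targets


def reachable(start):
    """Return the names of all nodes reachable from `start`, including itself."""
    seen = {start.name}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in node.neighbours():
            if nxt.name not in seen:
                seen.add(nxt.name)
                stack.append(nxt)
    return seen


def all_paths(start, goal, _path=()):
    """Enumerate every simple (cycle-free) path from `start` to `goal`."""
    path = _path + (start.name,)
    if start is goal:
        return [list(path)]
    found = []
    for nxt in start.neighbours():
        if nxt.name not in path:  # never revisit a node on the current path
            found.extend(all_paths(nxt, goal, path))
    return found


# Hypothetical example network: a symptom, a drug, a guideline and a monograph.
headache = Node("headache")
aspirin = Node("aspirin")
guideline = Node("pain guideline")
monograph = Node("drug monograph")
headache.connect("treated_by", aspirin)
headache.connect("covered_by", guideline)
guideline.connect("recommends", aspirin)
aspirin.connect("described_in", monograph)

print(sorted(reachable(headache)))
for p in all_paths(headache, monograph):
    print(" -> ".join(p))
```

    Keeping each node's edges inside the node object means both traversals only need a start node, which is one way to stay independent of the overall layout of the network.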

  18. Improving the Incoherence of a Learned Dictionary via Rank Shrinkage.

    PubMed

    Ubaru, Shashanka; Seghouane, Abd-Krim; Saad, Yousef

    2017-01-01

    This letter considers the problem of dictionary learning for sparse signal representation whose atoms have low mutual coherence. To learn such dictionaries, at each step, we first update the dictionary using the method of optimal directions (MOD) and then apply a dictionary rank shrinkage step to decrease its mutual coherence. In the rank shrinkage step, we first compute a rank 1 decomposition of the column-normalized least squares estimate of the dictionary obtained from the MOD step. We then shrink the rank of this learned dictionary by transforming the problem of reducing the rank to a nonnegative garrotte estimation problem and solving it using a path-wise coordinate descent approach. We establish theoretical results that show that the rank shrinkage step included will reduce the coherence of the dictionary, which is further validated by experimental results. Numerical experiments illustrating the performance of the proposed algorithm in comparison to various other well-known dictionary learning algorithms are also presented.
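
    The quantity this abstract aims to reduce, mutual coherence, is commonly defined as the largest absolute inner product between two distinct unit-norm atoms. The sketch below computes only that measure for a small dictionary; it does not implement the MOD update or the rank shrinkage step, and the function names are illustrative.

```python
import math
from itertools import combinations


def normalize(atom):
    """Scale an atom (a list of floats) to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in atom))
    if norm == 0.0:
        raise ValueError("zero atom cannot be normalized")
    return [v / norm for v in atom]


def mutual_coherence(atoms):
    """Largest |<d_i, d_j>| over all pairs of distinct, normalized atoms."""
    unit = [normalize(a) for a in atoms]
    return max(
        abs(sum(p * q for p, q in zip(a, b)))
        for a, b in combinations(unit, 2)
    )


# An orthogonal pair is perfectly incoherent; adding a diagonal atom raises the value.
print(mutual_coherence([[1.0, 0.0], [0.0, 1.0]]))
print(mutual_coherence([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
```

    A value near 0 means the atoms are close to orthogonal, while a value near 1 means at least two atoms are almost parallel.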

  19. The Making of the "Oxford English Dictionary."

    ERIC Educational Resources Information Center

    Winchester, Simon

    2003-01-01

    Summarizes remarks made to open the Gallaudet University conference on Dictionaries and the Standardization of languages. It concerns the making of what is arguably the world's greatest dictionary, "The Oxford English Dictionary." (VWL)

  20. A novel structured dictionary for fast processing of 3D medical images, with application to computed tomography restoration and denoising

    NASA Astrophysics Data System (ADS)

    Karimi, Davood; Ward, Rabab K.

    2016-03-01

    Sparse representation of signals in learned overcomplete dictionaries has proven to be a powerful tool with applications in denoising, restoration, compression, reconstruction, and more. Recent research has shown that learned overcomplete dictionaries can lead to better results than analytical dictionaries such as wavelets in almost all image processing applications. However, a major disadvantage of these dictionaries is that their learning and usage is very computationally intensive. In particular, finding the sparse representation of a signal in these dictionaries requires solving an optimization problem that leads to very long computational times, especially in 3D image processing. Moreover, the sparse representation found by greedy algorithms is usually sub-optimal. In this paper, we propose a novel two-level dictionary structure that improves the performance and the speed of standard greedy sparse coding methods. The first (i.e., the top) level in our dictionary is a fixed orthonormal basis, whereas the second level includes the atoms that are learned from the training data. We explain how such a dictionary can be learned from the training data and how the sparse representation of a new signal in this dictionary can be computed. As an application, we use the proposed dictionary structure for removing the noise and artifacts in 3D computed tomography (CT) images. Our experiments with real CT images show that the proposed method achieves results that are comparable with standard dictionary-based methods while substantially reducing the computational time.

  1. MiDas: Automatic Extraction of a Common Domain of Discourse in Sleep Medicine for Multi-center Data Integration

    PubMed Central

    Sahoo, Satya S.; Ogbuji, Chimezie; Luo, Lingyun; Dong, Xiao; Cui, Licong; Redline, Susan S.; Zhang, Guo-Qiang

    2011-01-01

    Clinical studies often use data dictionaries with controlled sets of terms to facilitate data collection, limited interoperability and sharing at a local site. Multi-center retrospective clinical studies require that these data dictionaries, originating from individual participating centers, be harmonized in preparation for the integration of the corresponding clinical research data. Domain ontologies are often used to facilitate multi-center data integration by modeling terms from data dictionaries in a logic-based language, but interoperability among domain ontologies (using automated techniques) is an unresolved issue. Although many upper-level reference ontologies have been proposed to address this challenge, our experience in integrating multi-center sleep medicine data highlights the need for an upper level ontology that models a common set of terms at multiple-levels of abstraction, which is not covered by the existing upper-level ontologies. We introduce a methodology underpinned by a Minimal Domain of Discourse (MiDas) algorithm to automatically extract a minimal common domain of discourse (upper-domain ontology) from an existing domain ontology. Using the Multi-Modality, Multi-Resource Environment for Physiological and Clinical Research (Physio-MIMI) multi-center project in sleep medicine as a use case, we demonstrate the use of MiDas in extracting a minimal domain of discourse for sleep medicine, from Physio-MIMI’s Sleep Domain Ontology (SDO). We then extend the resulting domain of discourse with terms from the data dictionary of the Sleep Heart and Health Study (SHHS) to validate MiDas. To illustrate the wider applicability of MiDas, we automatically extract the respective domains of discourse from 6 sample domain ontologies from the National Center for Biomedical Ontologies (NCBO) and the OBO Foundry. PMID:22195180

  2. MiDas: automatic extraction of a common domain of discourse in sleep medicine for multi-center data integration.

    PubMed

    Sahoo, Satya S; Ogbuji, Chimezie; Luo, Lingyun; Dong, Xiao; Cui, Licong; Redline, Susan S; Zhang, Guo-Qiang

    2011-01-01

    Clinical studies often use data dictionaries with controlled sets of terms to facilitate data collection, limited interoperability and sharing at a local site. Multi-center retrospective clinical studies require that these data dictionaries, originating from individual participating centers, be harmonized in preparation for the integration of the corresponding clinical research data. Domain ontologies are often used to facilitate multi-center data integration by modeling terms from data dictionaries in a logic-based language, but interoperability among domain ontologies (using automated techniques) is an unresolved issue. Although many upper-level reference ontologies have been proposed to address this challenge, our experience in integrating multi-center sleep medicine data highlights the need for an upper level ontology that models a common set of terms at multiple-levels of abstraction, which is not covered by the existing upper-level ontologies. We introduce a methodology underpinned by a Minimal Domain of Discourse (MiDas) algorithm to automatically extract a minimal common domain of discourse (upper-domain ontology) from an existing domain ontology. Using the Multi-Modality, Multi-Resource Environment for Physiological and Clinical Research (Physio-MIMI) multi-center project in sleep medicine as a use case, we demonstrate the use of MiDas in extracting a minimal domain of discourse for sleep medicine, from Physio-MIMI's Sleep Domain Ontology (SDO). We then extend the resulting domain of discourse with terms from the data dictionary of the Sleep Heart and Health Study (SHHS) to validate MiDas. To illustrate the wider applicability of MiDas, we automatically extract the respective domains of discourse from 6 sample domain ontologies from the National Center for Biomedical Ontologies (NCBO) and the OBO Foundry.

  3. Dictionary Approaches to Image Compression and Reconstruction

    NASA Technical Reports Server (NTRS)

    Ziyad, Nigel A.; Gilmore, Erwin T.; Chouikha, Mohamed F.

    1998-01-01

    This paper proposes using a collection of parameterized waveforms, known as a dictionary, for the purpose of medical image compression. These waveforms, denoted as phi(sub gamma), are discrete time signals, where gamma represents the dictionary index. A dictionary with a collection of these waveforms is typically complete or overcomplete. Given such a dictionary, the goal is to obtain a representation image based on the dictionary. We examine the effectiveness of applying Basis Pursuit (BP), Best Orthogonal Basis (BOB), Matching Pursuits (MP), and the Method of Frames (MOF) methods for the compression of digitized radiological images with a wavelet-packet dictionary. The performance of these algorithms is studied for medical images with and without additive noise.
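
    Of the four methods listed, Matching Pursuit is the simplest to sketch: at each step it picks the atom most correlated with the current residual and subtracts that atom's contribution. The version below is a generic textbook form that assumes unit-norm atoms stored as plain lists; it uses a toy two-dimensional dictionary rather than the wavelet-packet dictionary of the paper, and the parameter names n_iter and tol are illustrative.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter=10, tol=1e-10):
    """Greedy decomposition of `signal` over unit-norm `atoms`.

    Returns (coefficients, residual). Assumes every atom has unit norm.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Correlate the residual with every atom and pick the strongest match.
        corr = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        if abs(corr[k]) < tol:
            break  # nothing left that any atom can explain
        coeffs[k] += corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


# Toy overcomplete dictionary in 2-D: two axis atoms plus one diagonal atom.
s = 1 / math.sqrt(2)
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]
coeffs, residual = matching_pursuit([2.0, 2.0], atoms)
print(coeffs, residual)
```

    Because the dictionary is overcomplete, the signal [2, 2] could also be written with the two axis atoms; the greedy rule prefers the single diagonal atom because it has the largest correlation.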

  4. Polarimetric SAR image classification based on discriminative dictionary learning model

    NASA Astrophysics Data System (ADS)

    Sang, Cheng Wei; Sun, Hong

    2018-03-01

    Polarimetric SAR (PolSAR) image classification is one of the important applications of PolSAR remote sensing. It is a difficult high-dimensional nonlinear mapping problem, and sparse representations based on a learned overcomplete dictionary have shown great potential for solving it. The overcomplete dictionary plays an important role in PolSAR image classification; however, in complex PolSAR scenes, features shared by different classes weaken the discrimination of the learned dictionary and so degrade classification performance. In this paper, we propose a novel overcomplete dictionary learning model to enhance the discrimination of the dictionary. The overcomplete dictionary learned by the proposed model is more discriminative and very suitable for PolSAR classification.

  5. Dictionary Approaches to Image Compression and Reconstruction

    NASA Technical Reports Server (NTRS)

    Ziyad, Nigel A.; Gilmore, Erwin T.; Chouikha, Mohamed F.

    1998-01-01

    This paper proposes using a collection of parameterized waveforms, known as a dictionary, for the purpose of medical image compression. These waveforms, denoted as phi(sub gamma), are discrete time signals, where gamma represents the dictionary index. A dictionary with a collection of these waveforms is typically complete or overcomplete. Given such a dictionary, the goal is to obtain a representation image based on the dictionary. We examine the effectiveness of applying Basis Pursuit (BP), Best Orthogonal Basis (BOB), Matching Pursuits (MP), and the Method of Frames (MOF) methods for the compression of digitized radiological images with a wavelet-packet dictionary. The performance of these algorithms is studied for medical images with and without additive noise.

  6. Evaluating Online Dictionaries From Faculty Prospective: A Case Study Performed On English Faculty Members At King Saud University--Wadi Aldawaser Branch

    ERIC Educational Resources Information Center

    Abouserie, Hossam Eldin Mohamed Refaat

    2010-01-01

    The purpose of this study was to evaluate online dictionaries from faculty prospective. The study tried to obtain in depth information about various forms of dictionaries the faculty used; degree of awareness and accessing online dictionaries; types of online dictionaries accessed; basic features of information provided; major benefits gained…

  7. USGS national surveys and analysis projects: Preliminary compilation of integrated geological datasets for the United States

    USGS Publications Warehouse

    Nicholson, Suzanne W.; Stoeser, Douglas B.; Wilson, Frederic H.; Dicken, Connie L.; Ludington, Steve

    2007-01-01

    The growth in the use of Geographic Information Systems (GIS) has highlighted the need for regional and national digital geologic maps attributed with age and rock type information. Such spatial data can be conveniently used to generate derivative maps for purposes that include mineral-resource assessment, metallogenic studies, tectonic studies, human health and environmental research. In 1997, the United States Geological Survey’s Mineral Resources Program initiated an effort to develop national digital databases for use in mineral resource and environmental assessments. One primary activity of this effort was to compile a national digital geologic map database, utilizing state geologic maps, to support mineral resource studies in the range of 1:250,000- to 1:1,000,000-scale. Over the course of the past decade, state databases were prepared using a common standard for the database structure, fields, attributes, and data dictionaries. As of late 2006, standardized geological map databases for all conterminous (CONUS) states have been available on-line as USGS Open-File Reports. For Alaska and Hawaii, new state maps are being prepared, and the preliminary work for Alaska is being released as a series of 1:500,000-scale regional compilations. See below for a list of all published databases.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Q; Han, H; Xing, L

    Purpose: Dictionary learning based methods have attracted increasing attention in low-dose CT because of their superior performance in suppressing noise and preserving structural details. Considering that structures and noise vary from region to region within one imaging object, we propose a region-specific dictionary learning method to improve low-dose CT reconstruction. Methods: A set of normal-dose images was used for dictionary learning. Segmentations were performed on these images so that the training patch sets corresponding to different regions could be extracted. After that, region-specific dictionaries were learned from these training sets. For the low-dose CT reconstruction, a conventional reconstruction, such as filtered back-projection (FBP), was performed first, followed by segmentation of the image into different regions. Sparsity constraints of each region based on its dictionary were used as regularization terms. The regularization parameters were selected adaptively according to different regions. A low-dose human thorax dataset was used to evaluate the proposed method. The single dictionary based method was performed for comparison. Results: Since the lung region is very different from the other parts of the thorax, two dictionaries, corresponding to the lung region and the rest of the thorax respectively, were learned to better express the structural details and avoid artifacts. With only one dictionary, some artifacts appeared in the body region, caused by the spot atoms corresponding to structures in the lung region. In addition, some structures in the lung region could not be recovered well with only one dictionary. The quantitative indices of the result obtained by the proposed method were also slightly improved compared with the single dictionary based method. Conclusion: A region-specific dictionary can make the dictionary more adaptive to different region characteristics, which is highly desirable for enhancing the performance of dictionary learning based methods.

  9. Improving imbalanced scientific text classification using sampling strategies and dictionaries.

    PubMed

    Borrajo, L; Romero, R; Iglesias, E L; Redondo Marey, C M

    2011-09-15

    Many real applications have the imbalanced class distribution problem, where one of the classes is represented by a very small number of cases compared to the other classes. One of the systems affected are those related to the recovery and classification of scientific documentation. Sampling strategies such as Oversampling and Subsampling are popular in tackling the problem of class imbalance. In this work, we study their effects on three types of classifiers (Knn, SVM and Naive-Bayes) when they are applied to search on the PubMed scientific database. Another purpose of this paper is to study the use of dictionaries in the classification of biomedical texts. Experiments are conducted with three different dictionaries (BioCreative, NLPBA, and an ad-hoc subset of the UniProt database named Protein) using the mentioned classifiers and sampling strategies. Best results were obtained with NLPBA and Protein dictionaries and the SVM classifier using the Subsampling balancing technique. These results were compared with those obtained by other authors using the TREC Genomics 2005 public corpus. Copyright 2011 The Author(s). Published by Journal of Integrative Bioinformatics.
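
    The two balancing strategies mentioned here can be sketched in a few lines. The version below is a generic random resampler, assuming that "Subsampling" means discarding majority-class items and "Oversampling" means duplicating minority-class items at random; the function name resample, the mode strings and the seed parameter are illustrative and do not come from the paper.

```python
import random
from collections import Counter, defaultdict


def resample(samples, labels, mode="under", seed=0):
    """Balance classes by random subsampling ("under") or oversampling ("over").

    Returns a shuffled list of (sample, label) pairs with equal class counts.
    """
    if mode not in ("under", "over"):
        raise ValueError("mode must be 'under' or 'over'")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    sizes = [len(items) for items in by_class.values()]
    target = min(sizes) if mode == "under" else max(sizes)
    balanced = []
    for y, items in by_class.items():
        if mode == "under":
            chosen = rng.sample(items, target)  # drop surplus items
        else:
            extra = [rng.choice(items) for _ in range(target - len(items))]
            chosen = items + extra  # duplicate items with replacement
        balanced.extend((x, y) for x in chosen)
    rng.shuffle(balanced)
    return balanced


# Hypothetical corpus: 8 irrelevant documents and 2 relevant ones.
docs = list(range(10))
tags = ["neg"] * 8 + ["pos"] * 2
print(Counter(y for _, y in resample(docs, tags, "under")))
print(Counter(y for _, y in resample(docs, tags, "over")))
```

    Subsampling keeps training sets small but throws data away, whereas oversampling keeps every document at the cost of repeated minority examples.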

  10. Implementation of a platform dedicated to the biomedical analysis terminologies management

    PubMed Central

    Cormont, Sylvie; Vandenbussche, Pierre-Yves; Buemi, Antoine; Delahousse, Jean; Lepage, Eric; Charlet, Jean

    2011-01-01

    Background and objectives. Assistance Publique - Hôpitaux de Paris (AP-HP) is implementing a new laboratory management system (LMS) common to the 12 hospital groups. The first step in this process was to acquire a biological analysis dictionary. This dictionary is interfaced with the international nomenclature LOINC, and has been developed in collaboration with experts from all biological disciplines. In this paper we describe in three steps (modeling, data migration and integration/verification) the implementation of a platform for publishing and maintaining the AP-HP laboratory data dictionary (AnaBio). Material and Methods. Due to data complexity and volume, setting up a platform dedicated to the terminology management was a key requirement. This is an enhancement tackling identified weaknesses of the previous spreadsheet tool. Our core model allows interoperability regarding data exchange standards and dictionary evolution. Results. We completed our goals within one year. In addition, structuring data representation has led to a significant data quality improvement (impacting more than 10% of data). The platform is active in the 21 hospitals of the institution spread into 165 laboratories. PMID:22195205

  11. Building a semantic web-based metadata repository for facilitating detailed clinical modeling in cancer genome studies.

    PubMed

    Sharma, Deepak K; Solbrig, Harold R; Tao, Cui; Weng, Chunhua; Chute, Christopher G; Jiang, Guoqian

    2017-06-05

    Detailed Clinical Models (DCMs) have been regarded as the basis for retaining computable meaning when data are exchanged between heterogeneous computer systems. To better support clinical cancer data capturing and reporting, there is an emerging need to develop informatics solutions for standards-based clinical models in cancer study domains. The objective of the study is to develop and evaluate a cancer genome study metadata management system that serves as a key infrastructure in supporting clinical information modeling in cancer genome study domains. We leveraged a Semantic Web-based metadata repository enhanced with both ISO11179 metadata standard and Clinical Information Modeling Initiative (CIMI) Reference Model. We used the common data elements (CDEs) defined in The Cancer Genome Atlas (TCGA) data dictionary, and extracted the metadata of the CDEs using the NCI Cancer Data Standards Repository (caDSR) CDE dataset rendered in the Resource Description Framework (RDF). The ITEM/ITEM_GROUP pattern defined in the latest CIMI Reference Model is used to represent reusable model elements (mini-Archetypes). We produced a metadata repository with 38 clinical cancer genome study domains, comprising a rich collection of mini-Archetype pattern instances. We performed a case study of the domain "clinical pharmaceutical" in the TCGA data dictionary and demonstrated enriched data elements in the metadata repository are very useful in support of building detailed clinical models. Our informatics approach leveraging Semantic Web technologies provides an effective way to build a CIMI-compliant metadata repository that would facilitate the detailed clinical modeling to support use cases beyond TCGA in clinical cancer study domains.

  12. Standardized Representation of Clinical Study Data Dictionaries with CIMI Archetypes.

    PubMed

    Sharma, Deepak K; Solbrig, Harold R; Prud'hommeaux, Eric; Pathak, Jyotishman; Jiang, Guoqian

    2016-01-01

    Researchers commonly use a tabular format to describe and represent clinical study data. The lack of standardization of data dictionary's metadata elements presents challenges for their harmonization for similar studies and impedes interoperability outside the local context. We propose that representing data dictionaries in the form of standardized archetypes can help to overcome this problem. The Archetype Modeling Language (AML) as developed by the Clinical Information Modeling Initiative (CIMI) can serve as a common format for the representation of data dictionary models. We mapped three different data dictionaries (identified from dbGAP, PheKB and TCGA) onto AML archetypes by aligning dictionary variable definitions with the AML archetype elements. The near complete alignment of data dictionaries helped map them into valid AML models that captured all data dictionary model metadata. The outcome of the work would help subject matter experts harmonize data models for quality, semantic interoperability and better downstream data integration.

  13. An Online Dictionary Learning-Based Compressive Data Gathering Algorithm in Wireless Sensor Networks

    PubMed Central

    Wang, Donghao; Wan, Jiangwen; Chen, Junying; Zhang, Qiang

    2016-01-01

    To adapt to sense signals of enormous diversities and dynamics, and to decrease the reconstruction errors caused by ambient noise, a novel online dictionary learning method-based compressive data gathering (ODL-CDG) algorithm is proposed. The proposed dictionary is learned from a two-stage iterative procedure, alternately changing between a sparse coding step and a dictionary update step. The self-coherence of the learned dictionary is introduced as a penalty term during the dictionary update procedure. The dictionary is also constrained with sparse structure. It’s theoretically demonstrated that the sensing matrix satisfies the restricted isometry property (RIP) with high probability. In addition, the lower bound of necessary number of measurements for compressive sensing (CS) reconstruction is given. Simulation results show that the proposed ODL-CDG algorithm can enhance the recovery accuracy in the presence of noise, and reduce the energy consumption in comparison with other dictionary based data gathering methods. PMID:27669250

  14. An Online Dictionary Learning-Based Compressive Data Gathering Algorithm in Wireless Sensor Networks.

    PubMed

    Wang, Donghao; Wan, Jiangwen; Chen, Junying; Zhang, Qiang

    2016-09-22

    To adapt to sense signals of enormous diversities and dynamics, and to decrease the reconstruction errors caused by ambient noise, a novel online dictionary learning method-based compressive data gathering (ODL-CDG) algorithm is proposed. The proposed dictionary is learned from a two-stage iterative procedure, alternately changing between a sparse coding step and a dictionary update step. The self-coherence of the learned dictionary is introduced as a penalty term during the dictionary update procedure. The dictionary is also constrained with sparse structure. It's theoretically demonstrated that the sensing matrix satisfies the restricted isometry property (RIP) with high probability. In addition, the lower bound of necessary number of measurements for compressive sensing (CS) reconstruction is given. Simulation results show that the proposed ODL-CDG algorithm can enhance the recovery accuracy in the presence of noise, and reduce the energy consumption in comparison with other dictionary based data gathering methods.

  15. The architecture of a distributed medical dictionary.

    PubMed

    Fowler, J; Buffone, G; Moreau, D

    1995-01-01

    Exploiting high-speed computer networks to provide a national medical information infrastructure is a goal for medical informatics. The Distributed Medical Dictionary under development at Baylor College of Medicine is a model for an architecture that supports collaborative development of a distributed online medical terminology knowledge-base. A prototype is described that illustrates the concept. Issues that must be addressed by such a system include high availability, acceptable response time, support for local idiom, and control of vocabulary.

  16. A dictionary based informational genome analysis

    PubMed Central

    2012-01-01

    Background In the post-genomic era several methods of computational genomics are emerging to understand how the whole information is structured within genomes. Literature of last five years accounts for several alignment-free methods, arisen as alternative metrics for dissimilarity of biological sequences. Among the others, recent approaches are based on empirical frequencies of DNA k-mers in whole genomes. Results Any set of words (factors) occurring in a genome provides a genomic dictionary. About sixty genomes were analyzed by means of informational indexes based on genomic dictionaries, where a systemic view replaces a local sequence analysis. A software prototype applying a methodology here outlined carried out some computations on genomic data. We computed informational indexes, built the genomic dictionaries with different sizes, along with frequency distributions. The software performed three main tasks: computation of informational indexes, storage of these in a database, index analysis and visualization. The validation was done by investigating genomes of various organisms. A systematic analysis of genomic repeats of several lengths, which is of vivid interest in biology (for example to compute excessively represented functional sequences, such as promoters), was discussed, and suggested a method to define synthetic genetic networks. Conclusions We introduced a methodology based on dictionaries, and an efficient motif-finding software application for comparative genomics. This approach could be extended along many investigation lines, namely exported in other contexts of computational genomics, as a basis for discrimination of genomic pathologies. PMID:22985068
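
    As a small illustration of a genomic dictionary, the sketch below collects all words of length k (k-mers) occurring in a sequence together with their frequencies, and computes the empirical entropy of that frequency distribution. Entropy is used here only as one plausible example of an informational index; the abstract does not say which indexes the authors computed, and the function names are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(genome, k):
    """Count every overlapping word of length k in the sequence."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def empirical_entropy(counts):
    """Shannon entropy (in bits) of the empirical k-mer distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy sequence; a real genome would be read from a FASTA file instead.
counts = kmer_counts("ACGTACGT", 2)
dictionary = set(counts)  # the k-mer dictionary itself
print(sorted(dictionary), dict(counts))
print(empirical_entropy(counts))
```

    Comparing such dictionaries and indexes across genomes needs no sequence alignment, which is the appeal of the alignment-free methods the abstract refers to.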

  17. Bayesian nonparametric dictionary learning for compressed sensing MRI.

    PubMed

    Huang, Yue; Paisley, John; Lin, Qin; Ding, Xinghao; Fu, Xueyang; Zhang, Xiao-Ping

    2014-12-01

    We develop a Bayesian nonparametric model for reconstructing magnetic resonance images (MRIs) from highly undersampled k-space data. We perform dictionary learning as part of the image reconstruction process. To this end, we use the beta process as a nonparametric dictionary learning prior for representing an image patch as a sparse combination of dictionary elements. The size of the dictionary and patch-specific sparsity pattern are inferred from the data, in addition to other dictionary learning variables. Dictionary learning is performed directly on the compressed image, and so is tailored to the MRI being considered. In addition, we investigate a total variation penalty term in combination with the dictionary learning model, and show how the denoising property of dictionary learning removes dependence on regularization parameters in the noisy setting. We derive a stochastic optimization algorithm based on Markov chain Monte Carlo for the Bayesian model, and use the alternating direction method of multipliers for efficiently performing total variation minimization. We present empirical results on several MRI, which show that the proposed regularization framework can improve reconstruction accuracy over other methods.

  18. The Text Retrieval Conferences (TRECs)

    DTIC Science & Technology

    1998-10-01

    perform a monolingual run in the target language to act as a baseline. Thirteen groups participated in the TREC-6 CLIR track. Three major...language; the use of machine-readable bilingual dictionaries or other existing linguistic resources; and the use of corpus resources to train or...formance for each method. In general, the best cross-language performance was between 50%-75% as effective as a quality monolingual run. The TREC-7

  19. Sparsity-promoting orthogonal dictionary updating for image reconstruction from highly undersampled magnetic resonance data.

    PubMed

    Huang, Jinhong; Guo, Li; Feng, Qianjin; Chen, Wufan; Feng, Yanqiu

    2015-07-21

    Image reconstruction from undersampled k-space data accelerates magnetic resonance imaging (MRI) by exploiting image sparseness in certain transform domains. Employing image patch representation over a learned dictionary has the advantage of being adaptive to local image structures and thus can better sparsify images than using fixed transforms (e.g. wavelets and total variations). Dictionary learning methods have recently been introduced to MRI reconstruction, and these methods demonstrate significantly reduced reconstruction errors compared to sparse MRI reconstruction using fixed transforms. However, the synthesis sparse coding problem in dictionary learning is NP-hard and computationally expensive. In this paper, we present a novel sparsity-promoting orthogonal dictionary updating method for efficient image reconstruction from highly undersampled MRI data. The orthogonality imposed on the learned dictionary enables the minimization problem in the reconstruction to be solved by an efficient optimization algorithm which alternately updates representation coefficients, orthogonal dictionary, and missing k-space data. Moreover, both sparsity level and sparse representation contribution using updated dictionaries gradually increase during iterations to recover more details, assuming the progressively improved quality of the dictionary. Simulation and real data experimental results both demonstrate that the proposed method is approximately 10 to 100 times faster than the K-SVD-based dictionary learning MRI method and simultaneously improves reconstruction accuracy.
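
    One reason an orthogonality constraint makes sparse coding cheap is a standard linear-algebra fact: for an orthonormal dictionary, the best s-sparse approximation of a signal is obtained by projecting onto every atom and keeping the s largest-magnitude coefficients. The sketch below shows only that step under those assumptions; it is not the authors' full algorithm (which also updates the dictionary and the missing k-space data), and the function names are illustrative.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sparse_code_orthonormal(x, atoms, s):
    """Best s-sparse code of x over an orthonormal set of atoms.

    Projects x on every atom, then zeroes all but the s largest coefficients.
    """
    coeffs = [dot(a, x) for a in atoms]
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:s])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def reconstruct(code, atoms):
    """Synthesize the signal as the coefficient-weighted sum of atoms."""
    return [sum(c * a[j] for c, a in zip(code, atoms)) for j in range(len(atoms[0]))]


# Orthonormal 2-D basis (sum and difference directions), assumed for the demo.
r = 1 / math.sqrt(2)
basis = [[r, r], [r, -r]]
code = sparse_code_orthonormal([3.0, 1.0], basis, s=1)
print(code, reconstruct(code, basis))
```

    With a general overcomplete dictionary the same problem is NP-hard, as the abstract notes, so no such closed-form thresholding is available there.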

  20. Manifold optimization-based analysis dictionary learning with an ℓ1∕2-norm regularizer.

    PubMed

    Li, Zhenni; Ding, Shuxue; Li, Yujie; Yang, Zuyuan; Xie, Shengli; Chen, Wuhui

    2018-02-01

    Analysis dictionary learning has recently attracted increasing attention. In analysis dictionary learning, it is an open problem to obtain strong sparsity-promoting solutions efficiently while simultaneously avoiding trivial solutions of the dictionary. In this paper, to obtain strong sparsity-promoting solutions, we employ the ℓ1∕2 norm as a regularizer. A very recent study on ℓ1∕2 norm regularization theory in compressive sensing shows that its solutions can be sparser than those obtained with the ℓ1 norm. We transform a complex nonconvex optimization into a number of one-dimensional minimization problems, for which closed-form solutions can be obtained efficiently. To avoid trivial solutions, we apply manifold optimization to update the dictionary directly on the manifold satisfying the orthonormality constraint, so that the dictionary avoids trivial solutions while simultaneously capturing its intrinsic properties. Experiments with synthetic and real-world data verify that the proposed algorithm for analysis dictionary learning can not only obtain strong sparsity-promoting solutions efficiently, but also learn a more accurate dictionary, in terms of dictionary recovery and image processing, than state-of-the-art algorithms. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Alzheimer's disease detection via automatic 3D caudate nucleus segmentation using coupled dictionary learning with level set formulation.

    PubMed

    Al-Shaikhli, Saif Dawood Salman; Yang, Michael Ying; Rosenhahn, Bodo

    2016-12-01

    This paper presents a novel method for Alzheimer's disease classification via an automatic 3D caudate nucleus segmentation. The proposed method consists of segmentation and classification steps. In the segmentation step, we propose a novel level set cost function. The proposed cost function is constrained by a sparse representation of local image features using a dictionary learning method. We present coupled dictionaries: a feature dictionary of a grayscale brain image and a label dictionary of a caudate nucleus label image. Using online dictionary learning, the coupled dictionaries are learned from the training data. The learned coupled dictionaries are embedded into a level set function. In the classification step, a region-based feature dictionary is built. The region-based feature dictionary is learned from shape features of the caudate nucleus in the training data. The classification is based on the measure of the similarity between the sparse representation of region-based shape features of the segmented caudate in the test image and the region-based feature dictionary. The experimental results demonstrate the superiority of our method over the state-of-the-art methods by achieving a high segmentation (91.5%) and classification (92.5%) accuracy. In this paper, we find that the study of the caudate nucleus atrophy gives an advantage over the study of whole brain structure atrophy to detect Alzheimer's disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  2. The efficacy of dictionary use while reading for learning new words.

    PubMed

    Hamilton, Harley

    2012-01-01

    The researcher investigated the use of three types of dictionaries while reading by high school students with severe to profound hearing loss. The objective of the study was to determine the effectiveness of each type of dictionary for acquiring the meanings of unknown vocabulary in text. The three types of dictionaries were (a) an online bilingual multimedia English-American Sign Language (ASL) dictionary (OBMEAD), (b) a paper English-ASL dictionary (PBEAD), and (c) an online monolingual English dictionary (OMED). It was found that for immediate recall of target words, the OBMEAD was superior to both the PBEAD and the OMED. For later recall, no significant difference appeared between the OBMEAD and the PBEAD. For both of these, recall was statistically superior to recall for words learned via the OMED.

  3. What Online Traditional Medicine Dictionaries Bring To English Speakers Now? Concepts or Equivalents?

    PubMed Central

    Fang, Lu

    2018-01-01

    Chinese medicine is increasingly practiced around the world, and popularizing it has become an urgent task. To meet this requirement, an increasing number of Chinese-English traditional medicine dictionaries have been produced at home and abroad in recent decades. Nevertheless, users still struggle to find the information they need in these dictionaries. What traditional medicine dictionaries do English speakers need now? To identify an entry model for online TCM dictionaries, I compared the entries in five printed traditional medicine dictionaries and two online ones. Based on this comparison, I tentatively put forward two sample entries, “阳经 (yángjīng)” and “阴经 (yīn jīng)”, which focus on conveying concepts, for online Chinese-English TCM dictionaries. PMID:29875861

  4. Space transfer vehicle concepts and requirements study. Volume 3: Program cost estimates. Book 2: WBS and dictionary

    NASA Technical Reports Server (NTRS)

    Peffley, A. F.

    1991-01-01

    This document describes the products and services to be developed, tested, produced, and operated for the Space Transfer Vehicle (STV) Program. The Work Breakdown Structure (WBS) and WBS Dictionary are program management tools used to catalog, account by task, and summarize work packages of a space system program. The products or services to be delivered or accomplished during the STV C/D phase are the primary focus of this work breakdown structure document.

  5. The Impact of Changing International Relations on the Scientific and Technical Community (Incidence sur la Communaute Scientifique et Technique des Transformations en cours dans les Relations Internationales).

    DTIC Science & Technology

    1993-04-01

    Processing of special formats (diagrams, tables) - Determination of words to be Pre-analysis Programs added to system dictionary Dictionary ...language Lexicon combinations, via a standardized interface (MIR) which maps METAL operates on both monolingual lexicons and one the results of analyses...in uniform ways. transfer lexicon for each language pair. The monolingual lexi- Unfortunately, there is at present no linguistic theory which cons

  6. DOCU-TEXT: A tool before the data dictionary

    NASA Technical Reports Server (NTRS)

    Carter, B.

    1983-01-01

    DOCU-TEXT, a proprietary software package that aids in the production of documentation for a data processing organization and can be installed and operated only on IBM computers, is discussed. In organizing information that ultimately will reside in a data dictionary, DOCU-TEXT proved to be a useful documentation tool for extracting information from existing production jobs, procedure libraries, system catalogs, control data sets and related files. DOCU-TEXT reads these files to derive data that is useful at the system level. The output of DOCU-TEXT is a series of user-selectable reports. These reports can reflect the interactions within a single job stream, a complete system, or all the systems in an installation. Any single report, or group of reports, can be generated in an independent documentation pass.

  7. A Remote Sensing Image Fusion Method based on adaptive dictionary learning

    NASA Astrophysics Data System (ADS)

    He, Tongdi; Che, Zongxi

    2018-01-01

    This paper discusses a remote sensing fusion method, based on adaptive sparse representation (ASP), that provides improved spectral information, reduces data redundancy and decreases system complexity. First, the training sample set is formed by taking random blocks from the images to be fused, the dictionary is then constructed using the training samples, and the remaining terms are clustered to obtain the complete dictionary by iterated processing at each step. Second, the self-adaptive weighted coefficient rule of regional energy is used to select the feature fusion coefficients and complete the reconstruction of the image blocks. Finally, the reconstructed image blocks are rearranged and an average is taken to obtain the final fused images. Experimental results show that the proposed method is superior to other traditional remote sensing image fusion methods in both spectral information preservation and spatial resolution.
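
    The final step in this abstract (rearranging reconstructed blocks and averaging) can be illustrated with a minimal 1-D sketch. It assumes that the blocks overlap and that the average is taken per position over every block covering that position; the abstract does not spell this out, and the function name, block values, and offsets are hypothetical.

```python
def average_overlapping_blocks(blocks, starts, length):
    """Place 1-D blocks at the given start offsets and average where they overlap."""
    total = [0.0] * length
    count = [0] * length
    for block, start in zip(blocks, starts):
        for offset, value in enumerate(block):
            total[start + offset] += value
            count[start + offset] += 1
    # Positions covered by no block are left at 0.0.
    return [t / c if c else 0.0 for t, c in zip(total, count)]

# Two hypothetical reconstructed blocks of length 3 that overlap at index 2.
blocks = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
fused = average_overlapping_blocks(blocks, starts=[0, 2], length=5)
# fused is [1.0, 2.0, 3.5, 5.0, 6.0]: index 2 is the mean of 3.0 and 4.0
```

    For images the same idea applies in two dimensions, with a per-pixel count of how many patches cover each pixel.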

  8. Gapped Spectral Dictionaries and Their Applications for Database Searches of Tandem Mass Spectra

    PubMed Central

    Jeong, Kyowon; Kim, Sangtae; Bandeira, Nuno; Pevzner, Pavel A.

    2011-01-01

    Generating all plausible de novo interpretations of a peptide tandem mass (MS/MS) spectrum (Spectral Dictionary) and quickly matching them against the database represents a recently emerged alternative approach to peptide identification. However, the sizes of the Spectral Dictionaries grow quickly with the peptide length, making their generation impractical for long peptides. We introduce Gapped Spectral Dictionaries (all plausible de novo interpretations with gaps) that can be easily generated for any peptide length, thus addressing the limitation of the Spectral Dictionary approach. We show that Gapped Spectral Dictionaries are small, thus opening a possibility of using them to speed up MS/MS searches. Our MS-GappedDictionary algorithm (based on Gapped Spectral Dictionaries) enables proteogenomics applications (such as searches in the six-frame translation of the human genome) that are prohibitively time consuming with existing approaches. MS-GappedDictionary generates gapped peptides that occupy a niche between accurate but short peptide sequence tags and long but inaccurate full-length peptide reconstructions. We show that, contrary to conventional wisdom, some high-quality spectra do not have good peptide sequence tags and introduce gapped tags that have advantages over the conventional peptide sequence tags in MS/MS database searches. PMID:21444829

  9. The Environment-Power System Analysis Tool development program. [for spacecraft power supplies]

    NASA Technical Reports Server (NTRS)

    Jongeward, Gary A.; Kuharski, Robert A.; Kennedy, Eric M.; Wilcox, Katherine G.; Stevens, N. John; Putnam, Rand M.; Roche, James C.

    1989-01-01

    The Environment Power System Analysis Tool (EPSAT) is being developed to provide engineers with the ability to assess the effects of a broad range of environmental interactions on space power systems. A unique user-interface-data-dictionary code architecture oversees a collection of existing and future environmental modeling codes (e.g., neutral density) and physical interaction models (e.g., sheath ionization). The user-interface presents the engineer with tables, graphs, and plots which, under supervision of the data dictionary, are automatically updated in response to parameter change. EPSAT thus provides the engineer with a comprehensive and responsive environmental assessment tool and the scientist with a framework into which new environmental or physical models can be easily incorporated.

  10. Mandarin Chinese Dictionary: English-Chinese.

    ERIC Educational Resources Information Center

    Wang, Fred Fangyu

    This dictionary is a companion volume to the "Mandarin Chinese Dictionary (Chinese-English)" published in 1967 by Seton Hall University. The purpose of the dictionary is to help English-speaking students produce Chinese sentences in certain cultural situations by looking up the English expressions. Natural, spoken Chinese expressions within the…

  11. Intertwining thesauri and dictionaries

    NASA Technical Reports Server (NTRS)

    Buchan, R. L.

    1989-01-01

    The use of dictionaries and thesauri in information retrieval is discussed. The structure and functions of thesauri and dictionaries are described. Particular attention is given to the format of the NASA Thesaurus. The relationship between thesauri and dictionaries, the need to regularize terminology, and the capitalization of words are examined.

  12. MEANING DISCRIMINATION IN BILINGUAL DICTIONARIES.

    ERIC Educational Resources Information Center

    IANNUCCI, JAMES E.

    SEMANTIC DISCRIMINATION OF POLYSEMOUS ENTRY WORDS IN BILINGUAL DICTIONARIES WAS DISCUSSED IN THE PAPER. HANDICAPS OF PRESENT BILINGUAL DICTIONARIES AND BARRIERS TO THEIR FULL UTILIZATION WERE ENUMERATED. THE AUTHOR CONCLUDED THAT (1) A BILINGUAL DICTIONARY SHOULD HAVE A DISCRIMINATION FOR EVERY TRANSLATION OF AN ENTRY WORD WHICH HAS SEVERAL…

  13. The Use of Hyper-Reference and Conventional Dictionaries.

    ERIC Educational Resources Information Center

    Aust, Ronald; And Others

    1993-01-01

    Describes a study of 80 undergraduate foreign language learners that compared the use of a hyper-reference source incorporating an electronic dictionary and a conventional paper dictionary. Measures of consultation frequency, study time, efficiency, and comprehension are examined; bilingual and monolingual dictionary use is compared; and further…

  14. TSAR (Theater Simulation of Airbase Resources) Database Dictionary F-4G.

    DTIC Science & Technology

    1987-06-05

    REC/TRANS RT-1159 388 71ZDO CNTL UNIT C- 10062 /A 391 71320 CNTL ARN-127 393 723A0 REC/TRANS RT-689 394 723B0 INDIC, HEIGHT TOTAL NUMBER OF PART REPAIR...PROCEDURES - 2 LOU " 4"p" IW qg t~l pb i.go stanc E lm FIGURE 98 111-260 S.. RESOURCE REQUIREMENTS 111.1.5.25 LIU’S #45 - #49 - *LRU PART TIME...RT-1159 AG 386 71ZBO ADAPTER MX9577 AG 387 71ZCO MOUNT (REC/TRANS) AG 388 71ZDO CNTL UNIT C- 10062 /A AG 389 71ZEO MOUNT (DIG TO ANALOG CONVERTER) AG

  15. Toward better public health reporting using existing off the shelf approaches: The value of medical dictionaries in automated cancer detection using plaintext medical data.

    PubMed

    Kasthurirathne, Suranga N; Dixon, Brian E; Gichoya, Judy; Xu, Huiping; Xia, Yuni; Mamlin, Burke; Grannis, Shaun J

    2017-05-01

    Existing approaches to derive decision models from plaintext clinical data frequently depend on medical dictionaries as the sources of potential features. Prior research suggests that decision models developed using non-dictionary based feature sourcing approaches and "off the shelf" tools could predict cancer with performance metrics between 80% and 90%. We sought to compare non-dictionary based models to models built using features derived from medical dictionaries. We evaluated the detection of cancer cases from free text pathology reports using decision models built with combinations of dictionary or non-dictionary based feature sourcing approaches, 4 feature subset sizes, and 5 classification algorithms. Each decision model was evaluated using the following performance metrics: sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. Decision models parameterized using dictionary and non-dictionary feature sourcing approaches produced performance metrics between 70 and 90%. The source of features and feature subset size had no impact on the performance of a decision model. Our study suggests there is little value in leveraging medical dictionaries for extracting features for decision model building. Decision models built using features extracted from the plaintext reports themselves achieve comparable results to those built using medical dictionaries. Overall, this suggests that existing "off the shelf" approaches can be leveraged to perform accurate cancer detection using less complex Named Entity Recognition (NER) based feature extraction, automated feature selection and modeling approaches. Copyright © 2017 Elsevier Inc. All rights reserved.
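
    Four of the five performance metrics listed in this abstract follow directly from the counts in a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not data from the study. The area under the ROC curve is left out because it needs ranked classifier scores, not just counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return confusion-matrix metrics as fractions between 0 and 1.

    tp/fn: positive cases classified correctly/incorrectly
    tn/fp: negative cases classified correctly/incorrectly
    """
    return {
        "sensitivity": tp / (tp + fn),               # share of true cases found
        "specificity": tn / (tn + fp),               # share of non-cases cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                       # positive predictive value
    }

# Hypothetical counts: 100 positive reports and 100 negative reports.
metrics = binary_metrics(tp=85, fp=10, tn=90, fn=15)
```

    With these made-up counts the sensitivity is 0.85, the specificity 0.90, and the accuracy 0.875. The function does not guard against empty classes, which would raise a ZeroDivisionError.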

  16. The Oxford English Dictionary: A Brief History.

    ERIC Educational Resources Information Center

    Fritze, Ronald H.

    1989-01-01

    Reviews the development of English dictionaries in general and the Oxford English Dictionary (OED) in particular. The discussion covers the decision by the Philological Society to create the dictionary, the principles that guided its development, the involvement of James Augustus Henry Murray, the magnitude and progress of the project, and the…

  17. Dictionary Making: A Case of Kiswahili Dictionaries.

    ERIC Educational Resources Information Center

    Mohamed, Mohamed A.

    Two Swahili dictionaries and two bilingual dictionaries by the same author (one English-Swahili and one Swahili-English) are evaluated for their form and content, with illustrations offered from each. Aspects examined include: the compilation of headwords, including their meanings with relation to basic and extended meanings; treatment of…

  18. Buying and Selling Words: What Every Good Librarian Should Know about the Dictionary Business.

    ERIC Educational Resources Information Center

    Kister, Ken

    1993-01-01

    Discusses features to consider when selecting dictionaries. Topics addressed include the publishing industry; the dictionary market; profits from dictionaries; pricing; competitive marketing tactics, including similar titles, claims to numbers of entries and numbers of definitions, and similar physical appearance; a trademark infringement case;…

  19. The New Unabridged English-Persian Dictionary.

    ERIC Educational Resources Information Center

    Aryanpur, Abbas; Saleh, Jahan Shah

    This five-volume English-Persian dictionary is based on Webster's International Dictionary (1960 and 1961) and The Shorter Oxford English Dictionary (1959); it attempts to provide Persian equivalents of all the words of Oxford and all the key-words of Webster. Pronunciation keys for the English phonetic transcription and for the difficult Persian…

  20. Evaluating L2 Readers' Vocabulary Strategies and Dictionary Use

    ERIC Educational Resources Information Center

    Prichard, Caleb

    2008-01-01

    A review of the relevant literature concerning second language dictionary use while reading suggests that selective dictionary use may lead to improved comprehension and efficient vocabulary development. This study aims to examine the dictionary use of Japanese university students to determine just how selective they are when reading nonfiction…

  1. Online English-English Learner Dictionaries Boost Word Learning

    ERIC Educational Resources Information Center

    Nurmukhamedov, Ulugbek

    2012-01-01

    Learners of English might be familiar with several online monolingual dictionaries that are not necessarily the best choices for the English as Second/Foreign Language (ESL/EFL) context. Although these monolingual online dictionaries contain definitions, pronunciation guides, and other elements normally found in general-use dictionaries, they are…

  2. Research Timeline: Dictionary Use by English Language Learners

    ERIC Educational Resources Information Center

    Nesi, Hilary

    2014-01-01

    The history of research into dictionary use tends to be characterised by small-scale studies undertaken in a variety of different contexts, rather than larger-scale, longer-term funded projects. The research conducted by dictionary publishers is not generally made public, because of its commercial sensitivity, yet because dictionary production is…

  3. The Dictionary and Vocabulary Behavior: A Single Word or a Handful?

    ERIC Educational Resources Information Center

    Baxter, James

    1980-01-01

    To provide a context for dictionary selection, the vocabulary behavior of students is examined. Distinguishing between written and spoken English, the relation between dictionary use, classroom vocabulary behavior, and students' success in meeting their communicative needs is discussed. The choice of a monolingual English learners' dictionary is…

  4. Determining building interior structures using compressive sensing

    NASA Astrophysics Data System (ADS)

    Lagunas, Eva; Amin, Moeness G.; Ahmad, Fauzia; Nájar, Montse

    2013-04-01

    We consider imaging of the building interior structures using compressive sensing (CS) with applications to through-the-wall imaging and urban sensing. We consider a monostatic synthetic aperture radar imaging system employing stepped frequency waveform. The proposed approach exploits prior information of building construction practices to form an appropriate sparse representation of the building interior layout. We devise a dictionary of possible wall locations, which is consistent with the fact that interior walls are typically parallel or perpendicular to the front wall. The dictionary accounts for the dominant normal angle reflections from exterior and interior walls for the monostatic imaging system. CS is applied to a reduced set of observations to recover the true positions of the walls. Additional information about interior walls can be obtained using a dictionary of possible corner reflectors, which is the response of the junction of two walls. Supporting results based on simulation and laboratory experiments are provided. It is shown that the proposed sparsifying basis outperforms the conventional through-the-wall CS model, the wavelet sparsifying basis, and the block sparse model for building interior layout detection.

  5. Advanced Traffic Management Systems (ATMS) research analysis database system

    DOT National Transportation Integrated Search

    2001-06-01

    The ATMS Research Analysis Database Systems (ARADS) consists of a Traffic Software Data Dictionary (TSDD) and a Traffic Software Object Model (TSOM) for application to microscopic traffic simulation and signal optimization domains. The purpose of thi...

  6. Program Your Computer to Make Tough Decisions Easy.

    ERIC Educational Resources Information Center

    DiGiammarino, Frank P.

    1981-01-01

    Describes the data management and analysis system of the Lexington (Massachusetts) Public Schools. Discusses the system's database, data dictionary, and end user language and gives examples of the system's use in answering questions about school closings. (RW)

  7. Multivariate temporal dictionary learning for EEG.

    PubMed

    Barthélemy, Q; Gouy-Pailler, C; Isaac, Y; Souloumiac, A; Larue, A; Mars, J I

    2013-04-30

    This article addresses the issue of representing electroencephalographic (EEG) signals in an efficient way. While classical approaches use a fixed Gabor dictionary to analyze EEG signals, this article proposes a data-driven method to obtain an adapted dictionary. To achieve efficient dictionary learning, appropriate spatial and temporal modeling is required. Inter-channel links are taken into account in the spatial multivariate model, and shift-invariance is used for the temporal model. Multivariate learned kernels are informative (a few atoms code plentiful energy) and interpretable (the atoms can have a physiological meaning). Using real EEG data, the proposed method is shown to outperform the classical multichannel matching pursuit used with a Gabor dictionary, as measured by the representative power of the learned dictionary and its spatial flexibility. Moreover, dictionary learning can capture interpretable patterns: this ability is illustrated on real data, learning a P300 evoked potential. Copyright © 2013 Elsevier B.V. All rights reserved.
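
    The baseline mentioned in this abstract, matching pursuit, is a greedy decomposition of a signal over a dictionary of unit-norm atoms. The following is a minimal single-channel sketch of plain matching pursuit, not the multichannel or shift-invariant variants the article discusses; the three atoms and the signal are hypothetical and are not Gabor atoms.

```python
import math

def dot(u, v):
    """Inner product of two equally sized sequences."""
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition of signal over unit-norm atoms.

    Returns (coefficients, residual)."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Pick the atom most correlated with what is still unexplained.
        correlations = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(correlations[k]))
        coeffs[best] += correlations[best]
        # Remove that atom's contribution from the residual.
        residual = [r - correlations[best] * a
                    for r, a in zip(residual, atoms[best])]
    return coeffs, residual

# Hypothetical dictionary: three unit-norm atoms in R^3.
s = math.sqrt(0.5)
atoms = [[1.0, 0.0, 0.0], [0.0, s, s], [0.0, s, -s]]
signal = [3.0, 1.0, 1.0]  # equals 3*atoms[0] + sqrt(2)*atoms[1]
coeffs, residual = matching_pursuit(signal, atoms, n_iter=2)
energy_left = dot(residual, residual)
```

    In this toy case two atoms carry essentially all of the signal energy, which is the sense in which "a few atoms code plentiful energy". A learned dictionary aims to make that true for real signals.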

  8. Standardized Representation of Clinical Study Data Dictionaries with CIMI Archetypes

    PubMed Central

    Sharma, Deepak K.; Solbrig, Harold R.; Prud’hommeaux, Eric; Pathak, Jyotishman; Jiang, Guoqian

    2016-01-01

    Researchers commonly use a tabular format to describe and represent clinical study data. The lack of standardization of data dictionary metadata elements presents challenges for their harmonization across similar studies and impedes interoperability outside the local context. We propose that representing data dictionaries in the form of standardized archetypes can help to overcome this problem. The Archetype Modeling Language (AML) as developed by the Clinical Information Modeling Initiative (CIMI) can serve as a common format for the representation of data dictionary models. We mapped three different data dictionaries (identified from dbGAP, PheKB and TCGA) onto AML archetypes by aligning dictionary variable definitions with the AML archetype elements. The near-complete alignment of data dictionaries helped map them into valid AML models that captured all data dictionary model metadata. The outcome of the work would help subject matter experts harmonize data models for quality, semantic interoperability and better downstream data integration. PMID:28269909

  9. Weighted Discriminative Dictionary Learning based on Low-rank Representation

    NASA Astrophysics Data System (ADS)

    Chang, Heyou; Zheng, Hao

    2017-01-01

    Low-rank representation has been widely used in the field of pattern classification, especially when both training and testing images are corrupted with large noise. The dictionary plays an important role in low-rank representation. With respect to the semantic dictionary, the optimal representation matrix should be block-diagonal. However, traditional low-rank representation based dictionary learning methods cannot effectively exploit the discriminative information between data and dictionary. To address this problem, this paper proposes weighted discriminative dictionary learning based on low-rank representation, where a weighted representation regularization term is constructed. The regularization associates label information of both training samples and dictionary atoms, and encourages the generation of a discriminative representation with class-wise block-diagonal structure, which can further improve the classification performance when both training and testing images are corrupted with large noise. Experimental results demonstrate advantages of the proposed method over the state-of-the-art methods.

  10. The HLA dictionary 2008: a summary of HLA-A, -B, -C, -DRB1/3/4/5, and -DQB1 alleles and their association with serologically defined HLA-A, -B, -C, -DR, and -DQ antigens.

    PubMed

    Holdsworth, R; Hurley, C K; Marsh, S G E; Lau, M; Noreen, H J; Kempenich, J H; Setterholm, M; Maiers, M

    2009-02-01

    The 2008 report of the human leukocyte antigen (HLA) data dictionary presents serologic equivalents of HLA-A, -B, -C, -DRB1, -DRB3, -DRB4, -DRB5, and -DQB1 alleles. The dictionary is an update of the one published in 2004. The data summarize equivalents obtained by the World Health Organization Nomenclature Committee for Factors of the HLA System, the International Cell Exchange, UCLA, the National Marrow Donor Program, recent publications, and individual laboratories. The 2008 edition includes information on 832 new alleles (685 class I and 147 class II) and updated information on 766 previously listed alleles (577 class I and 189 class II). The tables list the alleles with remarks on the serologic patterns and the equivalents. The serological equivalents are listed as expert assigned types, and the data are useful for identifying potential stem cell donors who were typed by either serology or DNA-based methods. The tables with HLA equivalents are available as a searchable form on the IMGT/HLA database Web site (http://www.ebi.ac.uk/imgt/hla/dictionary.html).

  11. Evaluation of a Hyperlinked Consumer Health Dictionary for reading EHR notes.

    PubMed

    Slaughter, Laura; Oyri, Karl; Fosse, Erik

    2011-01-01

    In this paper, we report on a pilot study conducted to test the usefulness and understandability of definitions in a Consumer Health Dictionary (IVS-CHD). Our two main goals for this study were to evaluate functionality of the dictionary when embedded in electronic health records (EHR) and determine the methodology for our larger-scale project to iteratively develop the IVS-CHD. The hyperlinked IVS-CHD was made available to thoracic surgery patients reading their own EHR. We asked patients to rate definitions on two 5-level Likert items measuring perceived usefulness and understandability. We also captured the terms that patients wanted defined, but that were not included in the IVS-CHD. Preliminary results indicate the types of problems that must be avoided when creating definitions, for example, that patients prefer detailed explanations that include medical outcomes, and that do not use "unfamiliar" terms they must also look up. We also have gained insight into the types of terms that patients want defined from their EHR notes, especially certain abbreviations. Patients further commented on the experience of reading EHR notes directly from the same system used by healthcare personnel and the help strategy of linking the contents to a hyperlinked dictionary.

  12. NHEXAS PHASE I ARIZONA STUDY--STANDARD OPERATING PROCEDURE FOR THE GENERATION AND OPERATION OF DATA DICTIONARIES (UA-D-4.0)

    EPA Science Inventory

    The purpose of this SOP is to provide a standard method for the writing of data dictionaries. This procedure applies to the dictionaries used during the Arizona NHEXAS project and the "Border" study. Keywords: guidelines; data dictionaries.

    The National Human Exposure Assessme...

  13. Should Dictionaries Be Used in Translation Tests and Examinations?

    ERIC Educational Resources Information Center

    Mahmoud, Abdulmoneim

    2017-01-01

    Motivated by the conflicting views regarding the use of the dictionary in translation tests and examinations, this study was intended to verify the dictionary-free vs dictionary-based translation hypotheses. The subjects were 135 Arabic-speaking male and female EFL third-year university students. A group consisting of 62 students translated a text…

  14. Corpora and Collocations in Chinese-English Dictionaries for Chinese Users

    ERIC Educational Resources Information Center

    Xia, Lixin

    2015-01-01

    The paper identifies the major problems of the Chinese-English dictionary in representing collocational information after an extensive survey of nine dictionaries popular among Chinese users. It is found that the Chinese-English dictionary only provides the collocation types of "v+n" and "v+n," but completely ignores those of…

  15. The Creation of Learner-Centred Dictionaries for Endangered Languages: A Rotuman Example

    ERIC Educational Resources Information Center

    Vamarasi, M.

    2014-01-01

    This article examines the creation of dictionaries for endangered languages (ELs). Though each dictionary is uniquely prepared for its users, all dictionaries should be based on sound principles of vocabulary learning, including the importance of lexical chunks, as emphasised by Michael Lewis in his "Lexical Approach." Many of the…

  16. Evaluating Bilingual and Monolingual Dictionaries for L2 Learners.

    ERIC Educational Resources Information Center

    Hunt, Alan

    1997-01-01

    A discussion of dictionaries and their use for second language (L2) learning suggests that lack of computerized modern language corpora can adversely affect bilingual dictionaries, commonly used by L2 learners, and shows how use of such corpora has benefitted two contemporary monolingual L2 learner dictionaries (1995 editions of the Longman…

  17. Discriminative Bayesian Dictionary Learning for Classification.

    PubMed

    Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal

    2016-12-01

    We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of the Beta Process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along with the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition, and for object and scene-category classification, using five public datasets and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.

  18. Sparse dictionary for synthetic transmit aperture medical ultrasound imaging.

    PubMed

    Wang, Ping; Jiang, Jin-Yang; Li, Na; Luo, Han-Wu; Li, Fang; Cui, Shi-Gang

    2017-07-01

    It is possible to recover a signal below the Nyquist sampling limit using a compressive sensing technique in ultrasound imaging. However, the reconstruction enabled by common sparse transform approaches does not achieve satisfactory results. Considering the ultrasound echo signal's features of attenuation, repetition, and superposition, a sparse dictionary with the emission pulse signal is proposed. Sparse coefficients in the proposed dictionary have high sparsity. Images reconstructed with this dictionary were compared with those obtained with the three other common transforms, namely, discrete Fourier transform, discrete cosine transform, and discrete wavelet transform. The performance of the proposed dictionary was analyzed via a simulation and experimental data. The mean absolute error (MAE) was used to quantify the quality of the reconstructions. Experimental results indicate that the MAE associated with the proposed dictionary was always the smallest, the reconstruction time required was the shortest, and the lateral resolution and contrast of the reconstructed images were also the closest to the original images. The proposed sparse dictionary performed better than the other three sparse transforms. With the same sampling rate, the proposed dictionary achieved excellent reconstruction quality.

  19. Robust Visual Tracking via Online Discriminative and Low-Rank Dictionary Learning.

    PubMed

    Zhou, Tao; Liu, Fanghui; Bhaskar, Harish; Yang, Jie

    2017-09-12

    In this paper, we propose a novel and robust tracking framework based on online discriminative and low-rank dictionary learning. The primary aim of this paper is to obtain compact and low-rank dictionaries that can provide good discriminative representations of both target and background. We accomplish this by exploiting the recovery ability of low-rank matrices. That is if we assume that the data from the same class are linearly correlated, then the corresponding basis vectors learned from the training set of each class shall render the dictionary to become approximately low-rank. The proposed dictionary learning technique incorporates a reconstruction error that improves the reliability of classification. Also, a multiconstraint objective function is designed to enable active learning of a discriminative and robust dictionary. Further, an optimal solution is obtained by iteratively computing the dictionary, coefficients, and by simultaneously learning the classifier parameters. Finally, a simple yet effective likelihood function is implemented to estimate the optimal state of the target during tracking. Moreover, to make the dictionary adaptive to the variations of the target and background during tracking, an online update criterion is employed while learning the new dictionary. Experimental results on a publicly available benchmark dataset have demonstrated that the proposed tracking algorithm performs better than other state-of-the-art trackers.

  20. Efficient Sum of Outer Products Dictionary Learning (SOUP-DIL) and Its Application to Inverse Problems.

    PubMed

    Ravishankar, Saiprasad; Nadakuditi, Raj Rao; Fessler, Jeffrey A

    2017-12-01

    The sparsity of signals in a transform domain or dictionary has been exploited in applications such as compression, denoising and inverse problems. More recently, data-driven adaptation of synthesis dictionaries has shown promise compared to analytical dictionary models. However, dictionary learning problems are typically non-convex and NP-hard, and the usual alternating minimization approaches for these problems are often computationally expensive, with the computations dominated by the NP-hard synthesis sparse coding step. This paper exploits the ideas that drive algorithms such as K-SVD, and investigates in detail efficient methods for aggregate sparsity penalized dictionary learning by first approximating the data with a sum of sparse rank-one matrices (outer products) and then using a block coordinate descent approach to estimate the unknowns. The resulting block coordinate descent algorithms involve efficient closed-form solutions. Furthermore, we consider the problem of dictionary-blind image reconstruction, and propose novel and efficient algorithms for adaptive image reconstruction using block coordinate descent and sum of outer products methodologies. We provide a convergence study of the algorithms for dictionary learning and dictionary-blind image reconstruction. Our numerical experiments show the promising performance and speedups provided by the proposed methods over previous schemes in sparse data representation and compressed sensing-based image reconstruction.
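
    The block coordinate descent idea in the abstract above can be illustrated with a small sketch. It approximates the data matrix Y by a sum of outer products d_j c_j^T and updates one (c_j, d_j) pair at a time in closed form. The hard-thresholding rule for c_j, the unit-norm constraint on d_j, the function names, and the toy data are all assumptions made for illustration; the abstract does not spell out these details, so this should not be read as the authors' exact algorithm.

```python
import math

def residual_without(Y, D, C, j):
    """E_j = Y - sum over k != j of d_k c_k^T, returned as a list of rows."""
    n, N = len(Y), len(Y[0])
    E = [row[:] for row in Y]
    for k, (d, c) in enumerate(zip(D, C)):
        if k == j:
            continue
        for i in range(n):
            for t in range(N):
                E[i][t] -= d[i] * c[t]
    return E

def objective(Y, D, C, lam):
    """Squared fitting error plus lam^2 times the number of nonzero coefficients."""
    E = residual_without(Y, D, C, j=-1)  # j = -1 matches no atom, so all are subtracted
    fit = sum(e * e for row in E for e in row)
    nnz = sum(1 for c in C for v in c if v != 0.0)
    return fit + lam ** 2 * nnz

def soup_sweep(Y, D, C, lam):
    """One pass of block coordinate descent over all (c_j, d_j) pairs."""
    n, N = len(Y), len(Y[0])
    for j in range(len(D)):
        E = residual_without(Y, D, C, j)
        # Sparse coding step: hard-threshold the correlations E_j^T d_j.
        b = [sum(E[i][t] * D[j][i] for i in range(n)) for t in range(N)]
        C[j] = [v if abs(v) >= lam else 0.0 for v in b]
        # Atom update step: d_j = E_j c_j / ||E_j c_j|| (atom kept if c_j is all zero).
        h = [sum(E[i][t] * C[j][t] for t in range(N)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in h))
        if norm > 0:
            D[j] = [v / norm for v in h]
    return D, C

# Hypothetical toy data: 4 signals in R^2, 2 atoms, all coefficients initially zero.
Y = [[2.0, 1.0, 0.1, 0.0],
     [2.0, 1.0, -0.1, 3.0]]
D = [[1.0, 0.0], [0.0, 1.0]]
C = [[0.0] * 4 for _ in D]
lam = 0.5
history = [objective(Y, D, C, lam)]
for _ in range(5):
    D, C = soup_sweep(Y, D, C, lam)
    history.append(objective(Y, D, C, lam))
```

    Because each update solves its subproblem exactly, the values in `history` should never increase from one sweep to the next, which is a cheap sanity check when experimenting with this kind of scheme.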

  1. Cross-View Action Recognition via Transferable Dictionary Learning.

    PubMed

    Zheng, Jingjing; Jiang, Zhuolin; Chellappa, Rama

    2016-05-01

    Discriminative appearance features are effective for recognizing actions in a fixed view, but may not generalize well to a new view. In this paper, we present two effective approaches to learn dictionaries for robust action recognition across views. In the first approach, we learn a set of view-specific dictionaries where each dictionary corresponds to one camera view. These dictionaries are learned simultaneously from the sets of correspondence videos taken at different views with the aim of encouraging each video in the set to have the same sparse representation. In the second approach, we additionally learn a common dictionary shared by different views to model view-shared features. This approach represents the videos in each view using a view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos taken from the different views of the same action to have the similar sparse representations. The learned common dictionary not only has the capability to represent actions from unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labeled videos exist in the target view. The extensive experiments using three public datasets demonstrate that the proposed approach outperforms recently developed approaches for cross-view action recognition.

  2. Orthogonal Procrustes Analysis for Dictionary Learning in Sparse Linear Representation.

    PubMed

    Grossi, Giuliano; Lanzarotti, Raffaella; Lin, Jianyi

    2017-01-01

    In the sparse representation model, the design of overcomplete dictionaries plays a key role in effectiveness and applicability across different domains. Recent research has produced several dictionary learning approaches, and it has been shown that dictionaries learnt from data examples significantly outperform structured ones, e.g. wavelet transforms. In this context, learning consists of adapting the dictionary atoms to a set of training signals in order to promote a sparse representation that minimizes the reconstruction error. Finding the best fitting dictionary remains a very difficult task, leaving the question still open. A well-established heuristic method for tackling this problem is an iterative alternating scheme, adopted for instance in the well-known K-SVD algorithm. Essentially, it consists of repeating two stages; the former promotes sparse coding of the training set and the latter adapts the dictionary to reduce the error. In this paper we present R-SVD, a new method that, while maintaining the alternating scheme, adopts the Orthogonal Procrustes analysis to update the dictionary atoms suitably arranged into groups. Comparative experiments on synthetic data prove the effectiveness of R-SVD with respect to well-known dictionary learning algorithms such as K-SVD, ILS-DLA and the online method OSDL. Moreover, experiments on natural data such as ECG compression, EEG sparse representation, and image modeling confirm R-SVD's robustness and wide applicability.

  3. Online Multi-Modal Robust Non-Negative Dictionary Learning for Visual Tracking

    PubMed Central

    Zhang, Xiang; Guan, Naiyang; Tao, Dacheng; Qiu, Xiaogang; Luo, Zhigang

    2015-01-01

    Dictionary learning is a method of acquiring a collection of atoms for subsequent signal representation. Due to its excellent representation ability, dictionary learning has been widely applied in multimedia and computer vision. However, conventional dictionary learning algorithms fail to deal with multi-modal datasets. In this paper, we propose an online multi-modal robust non-negative dictionary learning (OMRNDL) algorithm to overcome this deficiency. Notably, OMRNDL casts visual tracking as a dictionary learning problem under the particle filter framework and captures the intrinsic knowledge about the target from multiple visual modalities, e.g., pixel intensity and texture information. To this end, OMRNDL adaptively learns an individual dictionary, i.e., template, for each modality from available frames, and then represents new particles over all the learned dictionaries by minimizing the fitting loss of data based on M-estimation. The resultant representation coefficient can be viewed as the common semantic representation of particles across multiple modalities, and can be utilized to track the target. OMRNDL incrementally learns the dictionary and the coefficient of each particle by using multiplicative update rules to respectively guarantee their non-negativity constraints. Experimental results on a popular challenging video benchmark validate the effectiveness of OMRNDL for visual tracking in both quantity and quality. PMID:25961715

  4. Efficient Sum of Outer Products Dictionary Learning (SOUP-DIL) and Its Application to Inverse Problems

    PubMed Central

    Ravishankar, Saiprasad; Nadakuditi, Raj Rao; Fessler, Jeffrey A.

    2017-01-01

    The sparsity of signals in a transform domain or dictionary has been exploited in applications such as compression, denoising and inverse problems. More recently, data-driven adaptation of synthesis dictionaries has shown promise compared to analytical dictionary models. However, dictionary learning problems are typically non-convex and NP-hard, and the usual alternating minimization approaches for these problems are often computationally expensive, with the computations dominated by the NP-hard synthesis sparse coding step. This paper exploits the ideas that drive algorithms such as K-SVD, and investigates in detail efficient methods for aggregate sparsity penalized dictionary learning by first approximating the data with a sum of sparse rank-one matrices (outer products) and then using a block coordinate descent approach to estimate the unknowns. The resulting block coordinate descent algorithms involve efficient closed-form solutions. Furthermore, we consider the problem of dictionary-blind image reconstruction, and propose novel and efficient algorithms for adaptive image reconstruction using block coordinate descent and sum of outer products methodologies. We provide a convergence study of the algorithms for dictionary learning and dictionary-blind image reconstruction. Our numerical experiments show the promising performance and speedups provided by the proposed methods over previous schemes in sparse data representation and compressed sensing-based image reconstruction. PMID:29376111

  5. Online multi-modal robust non-negative dictionary learning for visual tracking.

    PubMed

    Zhang, Xiang; Guan, Naiyang; Tao, Dacheng; Qiu, Xiaogang; Luo, Zhigang

    2015-01-01

    Dictionary learning is a method of acquiring a collection of atoms for subsequent signal representation. Due to its excellent representation ability, dictionary learning has been widely applied in multimedia and computer vision. However, conventional dictionary learning algorithms fail to deal with multi-modal datasets. In this paper, we propose an online multi-modal robust non-negative dictionary learning (OMRNDL) algorithm to overcome this deficiency. Notably, OMRNDL casts visual tracking as a dictionary learning problem under the particle filter framework and captures the intrinsic knowledge about the target from multiple visual modalities, e.g., pixel intensity and texture information. To this end, OMRNDL adaptively learns an individual dictionary, i.e., template, for each modality from available frames, and then represents new particles over all the learned dictionaries by minimizing the fitting loss of data based on M-estimation. The resultant representation coefficient can be viewed as the common semantic representation of particles across multiple modalities, and can be utilized to track the target. OMRNDL incrementally learns the dictionary and the coefficient of each particle by using multiplicative update rules to respectively guarantee their non-negativity constraints. Experimental results on a popular challenging video benchmark validate the effectiveness of OMRNDL for visual tracking in both quantity and quality.

  6. Reconstruction of magnetic resonance imaging by three-dimensional dual-dictionary learning.

    PubMed

    Song, Ying; Zhu, Zhen; Lu, Yang; Liu, Qiegen; Zhao, Jun

    2014-03-01

    To improve the magnetic resonance imaging (MRI) data acquisition speed while maintaining the reconstruction quality, a novel method is proposed for multislice MRI reconstruction from undersampled k-space data based on compressed-sensing theory using dictionary learning. There are two aspects to improve the reconstruction quality. One is that spatial correlation among slices is used by extending the atoms in dictionary learning from patches to blocks. The other is that the dictionary-learning scheme is used at two resolution levels; i.e., a low-resolution dictionary is used for sparse coding and a high-resolution dictionary is used for image updating. Numerical experiments are carried out on in vivo 3D MR images of brains and abdomens with a variety of undersampling schemes and ratios. The proposed method (dual-DLMRI) achieves better reconstruction quality than conventional reconstruction methods, with the peak signal-to-noise ratio being 7 dB higher. The advantages of the dual dictionaries are obvious compared with the single dictionary. Parameter variations ranging from 50% to 200% only bias the image quality within 15% in terms of the peak signal-to-noise ratio. Dual-DLMRI effectively uses the a priori information in the dual-dictionary scheme and provides dramatically improved reconstruction quality. Copyright © 2013 Wiley Periodicals, Inc.
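
    The abstract above reports quality in terms of peak signal-to-noise ratio (PSNR). As a reminder of what a dB difference means, here is a minimal sketch of the standard PSNR definition, 10·log10(peak² / MSE). The function name, the flat-list image representation, and the toy pixel values are assumptions for illustration and are not taken from the paper.

```python
import math

def psnr(reference, reconstruction, peak=1.0):
    """PSNR in dB between two equally sized images given as flat lists of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical 4-pixel "image" with a uniform error of 0.1 per pixel.
reference = [0.0, 0.5, 1.0, 0.5]
reconstruction = [0.1, 0.6, 0.9, 0.4]
value_db = psnr(reference, reconstruction)

# A gain of g dB corresponds to the MSE shrinking by a factor of 10 ** (g / 10).
mse_ratio_for_7_db = 10 ** (7 / 10)
```

    Under this definition, a 7 dB PSNR gain corresponds to roughly a fivefold reduction in mean squared error.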

  7. English/Russian terminology on radiometric calibration of space-borne optoelectronic sensors

    NASA Astrophysics Data System (ADS)

    Privalsky, V.; Zakharenkov, V.; Humpherys, T.; Sapritsky, V.; Datla, R.

    The efficient use of data acquired through exo-atmospheric observations of the Earth within the framework of existing and newly planned programs requires a unique understanding of respective terms and definitions. Yet, the last large-scale document on the subject - The International Electrotechnical Vocabulary - had been published 18 years ago. This lack of a proper document, which would reflect the changes that had occurred in the area since that time, is especially detrimental to the developing international efforts aimed at global observations of the Earth from space such as the Global Earth Observations Program proposed by the U.S.A. at the 2003 WMO Congress. To cover this gap at least partially, a bi-lingual explanatory dictionary of terms and definitions in the area of radiometric calibration of space-borne IR sensors is developed. The objectives are to produce a uniform terminology for the global space-borne observations of the Earth, establish a unique understanding of terms and definitions by the radiometric communities, including a correspondence between the Russian and American terms and definitions, and to develop a formal English/Russian reference dictionary for use by scientists and engineers involved in radiometric observations of the Earth from space. The dictionary includes close to 400 items covering basic concepts of geometric, wave and corpuscular optics, remote sensing technologies, and ground-based calibration as well as more detailed treatment of terms and definitions in the areas of radiometric quantities, symbols and units, optical phenomena and optical properties of objects and media, and radiometric systems and their properties. The dictionary contains six chapters: Basic Concepts; Quantities, Symbols, and Units; Optical Phenomena; Optical Characteristics of Surfaces and Media; Components of Radiometric Systems; and Characteristics of Radiometric System Components, plus English/Russian and Russian/English indices.

  8. Alternatively Constrained Dictionary Learning For Image Superresolution.

    PubMed

    Lu, Xiaoqiang; Yuan, Yuan; Yan, Pingkun

    2014-03-01

    Dictionaries are crucial in sparse coding-based algorithm for image superresolution. Sparse coding is a typical unsupervised learning method to study the relationship between the patches of high-and low-resolution images. However, most of the sparse coding methods for image superresolution fail to simultaneously consider the geometrical structure of the dictionary and the corresponding coefficients, which may result in noticeable superresolution reconstruction artifacts. In other words, when a low-resolution image and its corresponding high-resolution image are represented in their feature spaces, the two sets of dictionaries and the obtained coefficients have intrinsic links, which has not yet been well studied. Motivated by the development on nonlocal self-similarity and manifold learning, a novel sparse coding method is reported to preserve the geometrical structure of the dictionary and the sparse coefficients of the data. Moreover, the proposed method can preserve the incoherence of dictionary entries and provide the sparse coefficients and learned dictionary from a new perspective, which have both reconstruction and discrimination properties to enhance the learning performance. Furthermore, to utilize the model of the proposed method more effectively for single-image superresolution, this paper also proposes a novel dictionary-pair learning method, which is named as two-stage dictionary training. Extensive experiments are carried out on a large set of images comparing with other popular algorithms for the same purpose, and the results clearly demonstrate the effectiveness of the proposed sparse representation model and the corresponding dictionary learning algorithm.

  9. Sensor-Based Vibration Signal Feature Extraction Using an Improved Composite Dictionary Matching Pursuit Algorithm

    PubMed Central

    Cui, Lingli; Wu, Na; Wang, Wenjing; Kang, Chenhui

    2014-01-01

    This paper presents a new method for a composite dictionary matching pursuit algorithm, which is applied to vibration sensor signal feature extraction and fault diagnosis of a gearbox. Three advantages are highlighted in the new method. First, the composite dictionary in the algorithm has been changed from multi-atom matching to single-atom matching. Compared to non-composite dictionary single-atom matching, the original composite dictionary multi-atom matching pursuit (CD-MaMP) algorithm can achieve noise reduction in the reconstruction stage, but it cannot dramatically reduce the computational cost and improve the efficiency in the decomposition stage. Therefore, the optimized composite dictionary single-atom matching algorithm (CD-SaMP) is proposed. Second, the termination condition of iteration based on the attenuation coefficient is put forward to improve the sparsity and efficiency of the algorithm, which adjusts the parameters of the termination condition constantly in the process of decomposition to avoid noise. Third, composite dictionaries are enriched with the modulation dictionary, which is one of the important structural characteristics of gear fault signals. Meanwhile, the termination condition of iteration settings, sub-feature dictionary selections and operation efficiency between CD-MaMP and CD-SaMP are discussed, aiming at gear simulation vibration signals with noise. The simulation sensor-based vibration signal results show that the termination condition of iteration based on the attenuation coefficient enhances decomposition sparsity greatly and achieves a good effect of noise reduction. Furthermore, the modulation dictionary achieves a better matching effect compared to the Fourier dictionary, and CD-SaMP has a great advantage of sparsity and efficiency compared with the CD-MaMP. 
    The sensor-based vibration signals measured from practical engineering gearbox analyses have further shown that the CD-SaMP decomposition and reconstruction algorithm is feasible and effective. PMID:25207870

  10. Sensor-based vibration signal feature extraction using an improved composite dictionary matching pursuit algorithm.

    PubMed

    Cui, Lingli; Wu, Na; Wang, Wenjing; Kang, Chenhui

    2014-09-09

    This paper presents a new method for a composite dictionary matching pursuit algorithm, which is applied to vibration sensor signal feature extraction and fault diagnosis of a gearbox. Three advantages are highlighted in the new method. First, the composite dictionary in the algorithm has been changed from multi-atom matching to single-atom matching. Compared to non-composite dictionary single-atom matching, the original composite dictionary multi-atom matching pursuit (CD-MaMP) algorithm can achieve noise reduction in the reconstruction stage, but it cannot dramatically reduce the computational cost and improve the efficiency in the decomposition stage. Therefore, the optimized composite dictionary single-atom matching algorithm (CD-SaMP) is proposed. Second, the termination condition of iteration based on the attenuation coefficient is put forward to improve the sparsity and efficiency of the algorithm, which adjusts the parameters of the termination condition constantly in the process of decomposition to avoid noise. Third, composite dictionaries are enriched with the modulation dictionary, which is one of the important structural characteristics of gear fault signals. Meanwhile, the termination condition of iteration settings, sub-feature dictionary selections and operation efficiency between CD-MaMP and CD-SaMP are discussed, aiming at gear simulation vibration signals with noise. The simulation sensor-based vibration signal results show that the termination condition of iteration based on the attenuation coefficient enhances decomposition sparsity greatly and achieves a good effect of noise reduction. Furthermore, the modulation dictionary achieves a better matching effect compared to the Fourier dictionary, and CD-SaMP has a great advantage of sparsity and efficiency compared with the CD-MaMP. 
    The sensor-based vibration signals measured from practical engineering gearbox analyses have further shown that the CD-SaMP decomposition and reconstruction algorithm is feasible and effective.
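
    For readers unfamiliar with matching pursuit, the single-atom matching step the abstract refers to can be sketched as a plain greedy loop: at each iteration, pick the one atom most correlated with the residual and subtract its contribution. This is the generic textbook procedure, not the CD-SaMP algorithm itself; the composite dictionary, the attenuation-based termination condition, and the modulation atoms are not reproduced here, and the function names and toy dictionary are assumptions for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, atoms, max_iter=10, tol=1e-9):
    """Greedy single-atom matching pursuit over unit-norm atoms.

    Returns (coeffs, residual), where coeffs maps atom index -> accumulated coefficient.
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(max_iter):
        # Correlate the current residual with every atom and keep the best match.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) < tol:
            break  # simple stopping rule: no atom explains the residual any further
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual

# Hypothetical toy dictionary: three unit-norm atoms in R^3.
s = 1 / math.sqrt(2)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [s, s, 0.0]]
signal = [3.0, 3.0, 0.0]
coeffs, residual = matching_pursuit(signal, atoms)
```

    In this toy case the signal is a multiple of the third atom, so a single iteration should capture it and leave a residual that is zero up to rounding.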

  11. Using Bilingual Dictionaries.

    ERIC Educational Resources Information Center

    Thompson, Geoff

    1987-01-01

    Monolingual dictionaries have serious disadvantages in many language teaching situations; bilingual dictionaries are potentially more efficient and more motivating sources of information for language learners. (Author/CB)

  12. Patient-Specific Seizure Detection in Long-Term EEG Using Signal-Derived Empirical Mode Decomposition (EMD)-based Dictionary Approach.

    PubMed

    Kaleem, Muhammad; Gurve, Dharmendra; Guergachi, Aziz; Krishnan, Sridhar

    2018-06-25

    The objective of the work described in this paper is development of a computationally efficient methodology for patient-specific automatic seizure detection in long-term multi-channel EEG recordings. Approach: A novel patient-specific seizure detection approach based on signal-derived Empirical Mode Decomposition (EMD)-based dictionary approach is proposed. For this purpose, we use an empirical framework for EMD-based dictionary creation and learning, inspired by traditional dictionary learning methods, in which the EMD-based dictionary is learned from the multi-channel EEG data being analyzed for automatic seizure detection. We present the algorithm for dictionary creation and learning, whose purpose is to learn dictionaries with a small number of atoms. Using training signals belonging to seizure and non-seizure classes, an initial dictionary, termed as the raw dictionary, is formed. The atoms of the raw dictionary are composed of intrinsic mode functions obtained after decomposition of the training signals using the empirical mode decomposition algorithm. The raw dictionary is then trained using a learning algorithm, resulting in a substantial decrease in the number of atoms in the trained dictionary. The trained dictionary is then used for automatic seizure detection, such that coefficients of orthogonal projections of test signals against the trained dictionary form the features used for classification of test signals into seizure and non-seizure classes. Thus no hand-engineered features have to be extracted from the data as in traditional seizure detection approaches. Main results: The performance of the proposed approach is validated using the CHB-MIT benchmark database, and averaged accuracy, sensitivity and specificity values of 92.9%, 94.3% and 91.5%, respectively, are obtained using support vector machine classifier and five-fold cross-validation method. 
    These results are compared with other approaches using the same database, and the suitability of the approach for seizure detection in long-term multi-channel EEG recordings is discussed. Significance: The proposed approach describes a computationally efficient method for automatic seizure detection in long-term multi-channel EEG recordings. The method does not rely on hand-engineered features, as are required in traditional approaches. Furthermore, the approach is suitable for scenarios where the dictionary once formed and trained can be used for automatic seizure detection of newly recorded data, making the approach suitable for long-term multi-channel EEG recordings. © 2018 IOP Publishing Ltd.

  13. Blind Linguistic Steganalysis against Translation Based Steganography

    NASA Astrophysics Data System (ADS)

    Chen, Zhili; Huang, Liusheng; Meng, Peng; Yang, Wei; Miao, Haibo

    Translation-based steganography (TBS) is a relatively new and secure kind of linguistic steganography. It takes advantage of the "noise" created by automatic translation of natural language text to encode the secret information. To date, there is little research on steganalysis against this kind of linguistic steganography. In this paper, a blind steganalytic method, named natural frequency zoned word distribution analysis (NFZ-WDA), is presented. This method improves on a previously proposed linguistic steganalysis method based on word distribution, which targets the detection of linguistic steganography such as nicetext and texto. The new method aims to detect the application of TBS and uses none of the related information about TBS; its only resource is a word frequency dictionary obtained from a large corpus, a so-called natural frequency dictionary, so it is totally blind. To verify the effectiveness of NFZ-WDA, two experiments with two-class and multi-class SVM classifiers, respectively, are carried out. The experimental results show that the steganalytic method is pretty promising.

  14. System technology analysis of aeroassisted orbital transfer vehicles: Moderate lift/drag (0.75-1.5). Volume 3: Cost estimates and work breakdown structure/dictionary, phase 1 and 2

    NASA Technical Reports Server (NTRS)

    1985-01-01

    Technology payoffs of representative ground based (Phase 1) and space based (Phase 2) mid lift/drag ratio aeroassisted orbit transfer vehicles (AOTV) were assessed and prioritized. A narrative summary of the cost estimates and work breakdown structure/dictionary for both study phases is presented. Costs were estimated using the Grumman Space Programs Algorithm for Cost Estimating (SPACE) computer program and results are given for four AOTV configurations. The work breakdown structure follows the standard of the joint government/industry Space Systems Cost Analysis Group (SSCAG). A table is provided which shows cost estimates for each work breakdown structure element.

  15. Continuous Speech Recognition for Clinicians

    PubMed Central

    Zafar, Atif; Overhage, J. Marc; McDonald, Clement J.

    1999-01-01

    The current generation of continuous speech recognition systems claims to offer high accuracy (greater than 95 percent) speech recognition at natural speech rates (150 words per minute) on low-cost (under $2000) platforms. This paper presents a state-of-the-technology summary, along with insights the authors have gained through testing one such product extensively and other products superficially. The authors have identified a number of issues that are important in managing accuracy and usability. First, for efficient recognition users must start with a dictionary containing the phonetic spellings of all words they anticipate using. The authors dictated 50 discharge summaries using one inexpensive internal medicine dictionary ($30) and found that they needed to add an additional 400 terms to get recognition rates of 98 percent. However, if they used either of two more expensive and extensive commercial medical vocabularies ($349 and $695), they did not need to add terms to get a 98 percent recognition rate. Second, users must speak clearly and continuously, distinctly pronouncing all syllables. Users must also correct errors as they occur, because accuracy improves with error correction by at least 5 percent over two weeks. Users may find it difficult to train the system to recognize certain terms, regardless of the amount of training, and appropriate substitutions must be created. For example, the authors had to substitute “twice a day” for “bid” when using the less expensive dictionary, but not when using the other two dictionaries. From trials they conducted in settings ranging from an emergency room to hospital wards and clinicians' offices, they learned that ambient noise has minimal effect. Finally, they found that a minimal “usable” hardware configuration (which keeps up with dictation) comprises a 300-MHz Pentium processor with 128 MB of RAM and a “speech quality” sound card (e.g., SoundBlaster, $99). 
    Anything less powerful will result in the system lagging behind the speaking rate. The authors obtained 97 percent accuracy with just 30 minutes of training when using the latest edition of one of the speech recognition systems supplemented by a commercial medical dictionary. This technology has advanced considerably in recent years and is now a serious contender to replace some or all of the increasingly expensive alternative methods of dictation with human transcription. PMID:10332653

  16. A Spectral Reconstruction Algorithm of Miniature Spectrometer Based on Sparse Optimization and Dictionary Learning.

    PubMed

    Zhang, Shang; Dong, Yuhan; Fu, Hongyan; Huang, Shao-Lun; Zhang, Lin

    2018-02-22

    The miniaturization of spectrometer can broaden the application area of spectrometry, which has huge academic and industrial value. Among various miniaturization approaches, filter-based miniaturization is a promising implementation by utilizing broadband filters with distinct transmission functions. Mathematically, filter-based spectral reconstruction can be modeled as solving a system of linear equations. In this paper, we propose an algorithm of spectral reconstruction based on sparse optimization and dictionary learning. To verify the feasibility of the reconstruction algorithm, we design and implement a simple prototype of a filter-based miniature spectrometer. The experimental results demonstrate that sparse optimization is well applicable to spectral reconstruction whether the spectra are directly sparse or not. As for the non-directly sparse spectra, their sparsity can be enhanced by dictionary learning. In conclusion, the proposed approach has a bright application prospect in fabricating a practical miniature spectrometer.
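
    The abstract above models filter-based reconstruction as a linear system solved with sparse optimization. A minimal sketch of that idea follows, using the iterative shrinkage-thresholding algorithm (ISTA) to minimize 0.5·||Ax − y||² + λ·||x||₁. ISTA is chosen here only as a simple, well-known solver; the abstract does not name the solver used, and the matrix `A` (two hypothetical filters over three spectral bins), the measurements `y`, and the value of `lam` are made-up toy values.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of the l1 penalty)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(A, y, lam=0.1, n_iter=5000):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by proximal gradient steps."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm is a safe upper bound on the gradient's Lipschitz constant.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(n_iter):
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(x[j] - g[j] / L, lam / L) for j in range(n)]
    return x

# Hypothetical setup: 2 filter readings, 3 unknown spectral bins (underdetermined).
A = [[1.0, 0.5, 0.2],
     [0.3, 1.0, 0.6]]
x_true = [0.0, 2.0, 0.0]  # sparse spectrum used to generate the readings
y = [sum(a * v for a, v in zip(row, x_true)) for row in A]
x_hat = ista(A, y)
```

    With these toy values the estimate should concentrate on the second bin, slightly shrunk below 2.0 by the l1 penalty. That bias is the usual price of this formulation, and a smaller `lam` reduces it at the cost of slower convergence.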

  17. A Spectral Reconstruction Algorithm of Miniature Spectrometer Based on Sparse Optimization and Dictionary Learning

    PubMed Central

    Zhang, Shang; Fu, Hongyan; Huang, Shao-Lun; Zhang, Lin

    2018-01-01

    The miniaturization of spectrometers can broaden the application area of spectrometry, which has huge academic and industrial value. Among various miniaturization approaches, filter-based miniaturization is a promising implementation that utilizes broadband filters with distinct transmission functions. Mathematically, filter-based spectral reconstruction can be modeled as solving a system of linear equations. In this paper, we propose an algorithm of spectral reconstruction based on sparse optimization and dictionary learning. To verify the feasibility of the reconstruction algorithm, we design and implement a simple prototype of a filter-based miniature spectrometer. The experimental results demonstrate that sparse optimization is applicable to spectral reconstruction whether the spectra are directly sparse or not. For spectra that are not directly sparse, sparsity can be enhanced by dictionary learning. In conclusion, the proposed approach has bright application prospects for fabricating a practical miniature spectrometer. PMID:29470406

  18. A Study on the Use of Mobile Dictionaries in Vocabulary Teaching

    ERIC Educational Resources Information Center

    Aslan, Erdinç

    2016-01-01

    In recent years, rapid developments in technology have placed books and notebooks into mobile phones and tablets, and dictionaries into these small boxes as well. Giant dictionaries, which we once barely managed to carry, have been replaced by mobile dictionaries through which we can reach any word we want with only a few touches. Mobile…

  19. Letters to a Dictionary: Competing Views of Language in the Reception of "Webster's Third New International Dictionary"

    ERIC Educational Resources Information Center

    Bello, Anne Pence

    2013-01-01

    The publication of "Webster's Third New International Dictionary" in September 1961 set off a national controversy about dictionaries and language that ultimately included issues related to linguistics and English education. The negative reviews published in the press about the "Third" have shaped beliefs about the nature of…

  20. Effects of Printed, Pocket Electronic, and Online Dictionaries on High School Students' English Vocabulary Retention

    ERIC Educational Resources Information Center

    Chiu, Li-Ling; Liu, Gi-Zen

    2013-01-01

    This study obtained empirical evidence regarding the effects of using printed dictionaries (PD), pocket electronic dictionaries (PED), and online type-in dictionaries (OTID) on English vocabulary retention at a junior high school. A mixed-methods research methodology was adopted in this study. Thirty-three seventh graders were asked to use all…

  1. The Efficacy of Dictionary Use while Reading for Learning New Words

    ERIC Educational Resources Information Center

    Hamilton, Harley

    2012-01-01

    The researcher investigated the use of three types of dictionaries while reading by high school students with severe to profound hearing loss. The objective of the study was to determine the effectiveness of each type of dictionary for acquiring the meanings of unknown vocabulary in text. The three types of dictionaries were (a) an online…

  2. A Selected Bibliography of Dictionaries. General Information Series, No. 9. Indochinese Refugee Education Guides. Revised.

    ERIC Educational Resources Information Center

    Center for Applied Linguistics, Arlington, VA.

    The purpose of this bulletin is to provide the American teacher or sponsor with information on the use, limitations and availability of dictionaries that can be used by Indochinese refugees. The introductory material contains descriptions of both monolingual and bilingual dictionaries, a discussion of the inadequacies of bilingual dictionaries in…

  3. Dictionaries Can Help Writing--If Students Know How To Use Them.

    ERIC Educational Resources Information Center

    Jacobs, George M.

    A study investigated whether instruction in how to use a dictionary led to improved second language performance and greater dictionary use among English majors (N=54) in a reading and writing course at a Thai university. One of three participating classes was instructed in the use of a monolingual learner's dictionary. A passage correction test…

  4. Dictionary-Based Tensor Canonical Polyadic Decomposition

    NASA Astrophysics Data System (ADS)

    Cohen, Jeremy Emile; Gillis, Nicolas

    2018-04-01

    To ensure interpretability of extracted sources in tensor decomposition, we introduce in this paper a dictionary-based tensor canonical polyadic decomposition which enforces one factor to belong exactly to a known dictionary. A new formulation of sparse coding is proposed which enables high dimensional tensors dictionary-based canonical polyadic decomposition. The benefits of using a dictionary in tensor decomposition models are explored both in terms of parameter identifiability and estimation accuracy. Performances of the proposed algorithms are evaluated on the decomposition of simulated data and the unmixing of hyperspectral images.

  5. Ensuring the relocatability of programs in the operational system DOS YeS

    NASA Technical Reports Server (NTRS)

    Novoseltsev, S. K.; Orlov, I. G.; Chesalin, A. S.

    1979-01-01

    Specific modifications in the Disk Operational System Unified Series to ensure the relocatability of programs stored permanently in the core image library are described. A self-relocating method for loading programs into the working memory with re-editing all the programs recorded in the core image library is presented. The modified linkage editor can be included in a relocation dictionary containing data about each address constant at the assembly stage at the request of the programmer. The relocation dictionary increases the dimension of the RL-phase in comparison with the dimension of this same phase when edited by the standard method, making possible the creation of multiphase program complexes. Generation and use of the modified system using Assembly language is described. An example of the use of the system is given, and limitations of the use of the relocatable programs in the modified system are outlined.
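
    The general idea of a relocation dictionary can be sketched as follows. This is a generic illustration, not the DOS YeS mechanism itself: the word-addressed `image`, the offsets in `relocation_dictionary`, and both base addresses are hypothetical.

```python
def relocate(image, relocation_dictionary, link_base, load_base):
    """Return a copy of `image` with every listed address constant rebased."""
    delta = load_base - link_base
    relocated = list(image)
    for offset in relocation_dictionary:
        # Only words recorded in the dictionary hold addresses; all other
        # words are plain data or opcodes and must stay untouched.
        relocated[offset] += delta
    return relocated


# Hypothetical 4-word program linked at address 1000.
# Words 1 and 3 hold address constants; words 0 and 2 are ordinary data.
image = [7, 1002, 42, 1000]
relocation_dictionary = [1, 3]
loaded = relocate(image, relocation_dictionary, link_base=1000, load_base=5000)
```

    Keeping the dictionary alongside the stored program is what lets a loader place it at any base address, at the cost of the extra storage the abstract mentions.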

  6. Development Of International Data Standards For The COSMOS/PEER-LL Virtual Data Center

    NASA Astrophysics Data System (ADS)

    Swift, J. N.

    2005-12-01

    The COSMOS-PEER Lifelines Project 2L02 completed a Pilot Geotechnical Virtual Data Center (GVDC) system capable of both archiving geotechnical data and of disseminating data from multiple linked geotechnical databases. The Pilot GVDC system links geotechnical databases of four organizations: the California Geological Survey, Caltrans, PG&E, and the U.S. Geological Survey. The system was presented and reviewed in the COSMOS-PEER Lifelines workshop on June 21 - 23, 2004, which was co-sponsored by the Federal Highway Administration (FHWA) and included participation by the United Kingdom Highways Agency (UKHA), the Association of Geotechnical and Geoenvironmental Specialists in the United Kingdom (AGS), the United States Army Corps of Engineers (USACOE), Caltrans, United States Geological Survey (USGS), California Geological Survey (CGS), a number of state Departments of Transportation (DOTs), county building code officials, and representatives of academic institutions and private sector geotechnical companies. As of February 2005, COSMOS-PEER Lifelines Project 2L03 is funded to accomplish the following tasks: 1) expand the Pilot GVDC Geotechnical Data Dictionary and XML Schema to include data definitions and structures to describe in-situ measurements such as shear wave velocity profiles, and additional laboratory geotechnical test types; 2) participate in an international cooperative working group developing a single geotechnical data exchange standard that has broad international acceptance; and 3) upgrade the GVDC system to support corresponding exchange standard data dictionary and schema improvements. The new geophysical data structures being developed will include PS-logs, downhole geophysical logs, cross-hole velocity data, and velocity profiles derived using surface waves. 
A COSMOS-PEER Lifelines Geophysical Data Dictionary Working Committee, composed of experts in the development of data dictionary standards and experts in the specific data to be captured, is presently working on this task. The international geotechnical data dictionary and schema development is a highly collaborative effort funded by a pooled fund study coordinated by state DOTs and FHWA. The technical development of the standards called DIGGS (Data Interchange for Geotechnical and Geoenvironmental Specialists) is led by a team consisting of representatives from the University of Florida, Department of Civil Engineering (UF), AGS, Construction Industry Research and Information Association (CIRIA), UKHA, Ohio DOT, and COSMOS. The first draft of DIGGS is currently in preparation. A Geotechnical Management System Group (GMS group), composed of representatives from 13 State DOTs, FHWA, US EPA, USACOE, USGS and UKHA, oversees and approves the development of the standards. The ultimate goal of both COSMOS-PEER Lifelines Project 2L03 and the international GMS working group is to produce open and flexible, GML-compliant XML schema-based data structures and data dictionaries for review and approval by DOTs, other public agencies, and the international engineering and geoenvironmental community at large, leading to adoption of internationally accepted geotechnical and geophysical data transfer standards. Establishment of these standards is intended to significantly facilitate the accessibility and exchange of geotechnical information worldwide.

  7. The role of local terminologies in electronic health records. The HEGP experience.

    PubMed

    Daniel-Le Bozec, Christel; Steichen, Olivier; Dart, Thierry; Jaulent, Marie-Christine

    2007-01-01

    Despite decades of work, there is no universally accepted standard medical terminology and no generally usable terminological tools have yet emerged. The local dictionary of concepts of the Georges Pompidou European Hospital (HEGP) is a Terminological System (TS) designed to support clinical data entry. It covers 93 data entry forms and contains definitions and synonyms of more than 5000 concepts, sometimes linked to reference terminologies such as ICD-10. In this article, we evaluate to what extent SNOMED CT could fully replace, or rather be mapped to, the local terminology system. We first describe the local dictionary of concepts of HEGP according to a published TS characterization framework. Then we discuss the specific role that a local terminology system plays with regard to reference terminologies.

  8. Development of the system of reactor thermophysical data on the basis of ontological modelling

    NASA Astrophysics Data System (ADS)

    Chusov, I. A.; Kirillov, P. L.; Bogoslovskaya, G. P.; Yunusov, L. K.; Obysov, N. A.; Novikov, G. E.; Pronyaev, V. G.; Erkimbaev, A. O.; Zitserman, V. Yu; Kobzev, G. A.; Trachtengerts, M. S.; Fokin, L. R.

    2017-11-01

    Compilation and processing of thermophysical data has always been an important task for the nuclear industry. The difficulties of the present stage of this activity are explained by a sharp increase in the data volume and the number of new materials, as well as by increased requirements on the reliability of the data used in the nuclear industry. The general trend in data-oriented fields (materials science, chemistry and others) is a transition to a common infrastructure that integrates separate databases, Web portals and other resources. This infrastructure provides interoperability and procedures for data exchange, storage and dissemination. A key element of this infrastructure is a domain-specific ontology, which provides a single information model and dictionary for semantic definitions. By formalizing the subject area, the ontology adapts the definitions to the different database schemes and supports the integration of heterogeneous data. An important property of ontologies is that they can be continually extended with new definitions, e.g. lists of materials and properties. The extension of the thermophysical data ontology to reactor materials includes the creation of taxonomic dictionaries for thermophysical properties; models for presenting data and their uncertainties; the inclusion, along with the state parameters, of additional factors such as material porosity, burnup rate, irradiation rate and others; and axiomatics of the properties applicable to the given class of materials.

  9. Orthogonal Procrustes Analysis for Dictionary Learning in Sparse Linear Representation

    PubMed Central

    Grossi, Giuliano; Lin, Jianyi

    2017-01-01

    In the sparse representation model, the design of overcomplete dictionaries plays a key role for the effectiveness and applicability in different domains. Recent research has produced several dictionary learning approaches and has shown that dictionaries learnt from data examples significantly outperform structured ones, e.g. wavelet transforms. In this context, learning consists in adapting the dictionary atoms to a set of training signals in order to promote a sparse representation that minimizes the reconstruction error. Finding the best fitting dictionary remains a very difficult task, leaving the question still open. A well-established heuristic method for tackling this problem is an iterative alternating scheme, adopted for instance in the well-known K-SVD algorithm. Essentially, it consists in repeating two stages; the former promotes sparse coding of the training set and the latter adapts the dictionary to reduce the error. In this paper we present R-SVD, a new method that, while maintaining the alternating scheme, adopts the Orthogonal Procrustes analysis to update the dictionary atoms suitably arranged into groups. Comparative experiments on synthetic data prove the effectiveness of R-SVD with respect to well known dictionary learning algorithms such as K-SVD, ILS-DLA and the online method OSDL. Moreover, experiments on natural data such as ECG compression, EEG sparse representation, and image modeling confirm R-SVD’s robustness and wide applicability. PMID:28103283
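
    For reference, the classical Orthogonal Procrustes problem named in this abstract has a closed-form solution. The statement below is the textbook form; the symbols $Y$, $X$ and $R$ are generic, and how R-SVD arranges atoms into groups before applying it is not specified in the abstract.

```latex
% Orthogonal Procrustes problem: best orthogonal map taking X onto Y
\min_{R \,:\, R^{\top} R = I} \; \lVert Y - R X \rVert_F^2

% Expanding the norm (and using R^T R = I) leaves only one term that depends on R:
\lVert Y - R X \rVert_F^2
  = \lVert Y \rVert_F^2 + \lVert X \rVert_F^2 - 2\,\operatorname{tr}\!\left( R\, X Y^{\top} \right)

% With the singular value decomposition Y X^T = U \Sigma V^T,
% the trace is maximised, and the error minimised, by
R^{\star} = U V^{\top}
```

    In a dictionary-update setting, $X$ would hold a group of current atoms and $Y$ the target they should be rotated toward; that reading is an assumption made for illustration.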

  10. Personalized Age Progression with Bi-Level Aging Dictionary Learning.

    PubMed

    Shu, Xiangbo; Tang, Jinhui; Li, Zechao; Lai, Hanjiang; Zhang, Liyan; Yan, Shuicheng

    2018-04-01

    Age progression is defined as aesthetically re-rendering the aging face at any future age for an individual face. In this work, we aim to automatically render aging faces in a personalized way. Basically, for each age group, we learn an aging dictionary to reveal its aging characteristics (e.g., wrinkles), where the dictionary bases corresponding to the same index yet from two neighboring aging dictionaries form a particular aging pattern across these two age groups, and a linear combination of all these patterns expresses a particular personalized aging process. Moreover, two factors are taken into consideration in the dictionary learning process. First, beyond the aging dictionaries, each person may have extra personalized facial characteristics, e.g., mole, which are invariant in the aging process. Second, it is challenging or even impossible to collect faces of all age groups for a particular person, yet much easier and more practical to get face pairs from neighboring age groups. To this end, we propose a novel Bi-level Dictionary Learning based Personalized Age Progression (BDL-PAP) method. Here, bi-level dictionary learning is formulated to learn the aging dictionaries based on face pairs from neighboring age groups. Extensive experiments demonstrate the advantages of the proposed BDL-PAP over other state-of-the-art methods in terms of personalized age progression, as well as the performance gain for cross-age face verification by synthesizing aging faces.

  11. U.S.-MEXICO BORDER PROGRAM ARIZONA BORDER STUDY--STANDARD OPERATING PROCEDURE FOR THE GENERATION AND OPERATION OF DATA DICTIONARIES (UA-D-4.0)

    EPA Science Inventory

    The purpose of this SOP is to provide a standard method for the writing of data dictionaries. This procedure applies to the dictionaries used during the Arizona NHEXAS project and the Border study. Keywords: guidelines; data dictionaries.

    The U.S.-Mexico Border Program is spon...

  12. Multimodal Task-Driven Dictionary Learning for Image Classification

    DTIC Science & Technology

    2015-12-18

    Multimodal Task-Driven Dictionary Learning for Image Classification Soheil Bahrampour, Student Member, IEEE, Nasser M. Nasrabadi, Fellow, IEEE...Asok Ray, Fellow, IEEE, and W. Kenneth Jenkins, Life Fellow, IEEE Abstract: Dictionary learning algorithms have been successfully used for both...reconstructive and discriminative tasks, where an input signal is represented with a sparse linear combination of dictionary atoms. While these methods are

  13. A Study of the Relationship between Type of Dictionary Used and Lexical Proficiency in Writings of Iranian EFL Students

    ERIC Educational Resources Information Center

    Vahdany, Fereidoon; Abdollahzadeh, Milad; Gholami, Shokoufeh; Ghanipoor, Mahmood

    2014-01-01

    This study aimed at investigating the relationship between types of dictionaries used and lexical proficiency in writing. Eighty TOEFL students took part in responding to two Questionnaires collecting information about their dictionary type preferences and habits of dictionary use, along with an interview for further in-depth responses. They were…

  14. English-Chinese Cross-Language IR Using Bilingual Dictionaries

    DTIC Science & Technology

    2006-01-01

    specialized dictionaries together contain about two million entries [6]. 4 Monolingual Experiment The Chinese documents and the Chinese translations of... monolingual performance. The main performance-limiting factor is the limited coverage of the dictionary used in query translation. Some of the key con...English-Chinese Cross-Language IR using Bilingual Dictionaries Aitao Chen, Hailing Jiang, and Fredric Gey School of Information Management

  15. Accelerating the reconstruction of magnetic resonance imaging by three-dimensional dual-dictionary learning using CUDA.

    PubMed

    Jiansen Li; Jianqi Sun; Ying Song; Yanran Xu; Jun Zhao

    2014-01-01

    An effective way to improve the data acquisition speed of magnetic resonance imaging (MRI) is to use under-sampled k-space data, and a dictionary learning method can be used to maintain the reconstruction quality. A three-dimensional dictionary trains its atoms in the form of blocks, which can exploit the spatial correlation among slices. The dual-dictionary learning method includes a low-resolution dictionary and a high-resolution dictionary, for sparse coding and image updating respectively. However, the amount of data is huge for three-dimensional reconstruction, especially when the number of slices is large. Thus, the procedure is time-consuming. In this paper, we first utilize the NVIDIA Corporation's compute unified device architecture (CUDA) programming model to design the parallel algorithms on graphics processing unit (GPU) to accelerate the reconstruction procedure. The main optimizations operate in the dictionary learning algorithm and the image updating part, such as the orthogonal matching pursuit (OMP) algorithm and the k-singular value decomposition (K-SVD) algorithm. Then we develop another version of CUDA code with algorithmic optimization. Experimental results show that a speedup of more than 324 times is achieved compared with the CPU-only codes when the number of MRI slices is 24.
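
    The OMP step named in this abstract can be sketched on the CPU in a few lines. This is a plain, unoptimised illustration of the generic algorithm, not the authors' CUDA implementation; the dictionary `D`, the signal `y`, and the helper names are hypothetical, and the atoms are assumed to be unit-norm and linearly independent.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(G, b):
    """Solve the small system G c = b by Gauss-Jordan elimination with pivoting."""
    k = len(b)
    M = [G[i][:] + [b[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


def omp(D, y, n_nonzero):
    """Greedy sparse coding of y over the atoms in D (each atom is a list)."""
    residual, support, coeffs = list(y), [], []
    for _ in range(n_nonzero):
        # 1. Pick the unused atom most correlated with the current residual.
        scores = [0.0 if j in support else abs(dot(D[j], residual))
                  for j in range(len(D))]
        support.append(max(range(len(D)), key=scores.__getitem__))
        # 2. Re-fit all selected coefficients by least squares
        #    (normal equations on the selected atoms).
        G = [[dot(D[i], D[j]) for j in support] for i in support]
        coeffs = solve(G, [dot(D[i], y) for i in support])
        # 3. Update the residual against the new approximation.
        approx = [sum(c * D[j][t] for c, j in zip(coeffs, support))
                  for t in range(len(y))]
        residual = [a - b for a, b in zip(y, approx)]
    return dict(zip(support, coeffs))


# Hypothetical dictionary of four unit-norm atoms in R^3.
D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]
y = [1.0, 0.0, 2.0]
code = omp(D, y, n_nonzero=2)   # maps atom index -> coefficient
```

    Each image block is coded independently of the others, which is why this step parallelises naturally on a GPU.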

  16. Discriminative object tracking via sparse representation and online dictionary learning.

    PubMed

    Xie, Yuan; Zhang, Wensheng; Li, Cuihua; Lin, Shuyang; Qu, Yanyun; Zhang, Yinghua

    2014-04-01

    We propose a robust tracking algorithm based on local sparse coding with discriminative dictionary learning and a new keypoint matching schema. This algorithm consists of two parts: the local sparse coding with online updated discriminative dictionary for tracking (SOD part), and the keypoint matching refinement for enhancing the tracking performance (KP part). In the SOD part, the local image patches of the target object and background are represented by their sparse codes using an over-complete discriminative dictionary. Such a discriminative dictionary, which encodes the information of both the foreground and the background, may provide more discriminative power. Furthermore, in order to adapt the dictionary to the variation of the foreground and background during the tracking, an online learning method is employed to update the dictionary. The KP part utilizes a refined keypoint matching schema to improve the performance of the SOD. With the help of sparse representation and the online updated discriminative dictionary, the KP part is more robust than the traditional method at rejecting incorrect matches and eliminating outliers. The proposed method is embedded into a Bayesian inference framework for visual tracking. Experimental results on several challenging video sequences demonstrate the effectiveness and robustness of our approach.

  17. Basis Expansion Approaches for Regularized Sequential Dictionary Learning Algorithms With Enforced Sparsity for fMRI Data Analysis.

    PubMed

    Seghouane, Abd-Krim; Iqbal, Asif

    2017-09-01

    Sequential dictionary learning algorithms have been successfully applied to functional magnetic resonance imaging (fMRI) data analysis. fMRI data sets are, however, structured data matrices with the notion of temporal smoothness in the column direction. This prior information, which can be converted into a constraint of smoothness on the learned dictionary atoms, has seldom been included in classical dictionary learning algorithms when applied to fMRI data analysis. In this paper, we tackle this problem by proposing two new sequential dictionary learning algorithms dedicated to fMRI data analysis by accounting for this prior information. These algorithms differ from the existing ones in their dictionary update stage. The steps of this stage are derived as a variant of the power method for computing the SVD. The proposed algorithms generate regularized dictionary atoms via the solution of a left regularized rank-one matrix approximation problem where temporal smoothness is enforced via regularization through basis expansion and sparse basis expansion in the dictionary update stage. Applications on synthetic data experiments and real fMRI data sets illustrating the performance of the proposed algorithms are provided.
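
    The plain power method that this abstract takes as its starting point can be sketched as below. The sketch computes an unregularized rank-one approximation only; the smoothness regularization through basis expansion that the authors add is not reproduced, and the matrix `E` is a hypothetical stand-in for a residual matrix.

```python
import math


def rank_one_power(E, n_iter=100):
    """Leading singular triplet (u, s, v) of E by alternating power iterations.

    Assumes E is nonzero and the start vector is not orthogonal to the
    leading right singular vector.
    """
    m, n = len(E), len(E[0])
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(n_iter):
        # u <- E v, normalised
        u = [sum(E[i][j] * v[j] for j in range(n)) for i in range(m)]
        norm_u = math.sqrt(sum(a * a for a in u))
        u = [a / norm_u for a in u]
        # v <- E^T u; its norm estimates the singular value
        v = [sum(E[i][j] * u[i] for i in range(m)) for j in range(n)]
        s = math.sqrt(sum(a * a for a in v))
        v = [a / s for a in v]
    return u, s, v


# Hypothetical 2x2 residual matrix with singular values 3 and 1.
E = [[3.0, 0.0], [0.0, 1.0]]
u, s, v = rank_one_power(E)
```

    In a sequential dictionary update, `u` would play the role of the new atom and `s * v` the matching row of coefficients; a regularized variant would modify the `u` step.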

  18. Conversion of environmental data to a digital-spatial database, Puget Sound area, Washington

    USGS Publications Warehouse

    Uhrich, M.A.; McGrath, T.S.

    1997-01-01

    Data and maps from the Puget Sound Environmental Atlas, compiled for the U.S. Environmental Protection Agency, the Puget Sound Water Quality Authority, and the U.S. Army Corps of Engineers, have been converted into a digital-spatial database using a geographic information system. Environmental data for the Puget Sound area, collected from sources other than the Puget Sound Environmental Atlas by different Federal, State, and local agencies, also have been converted into this digital-spatial database. Background on the geographic-information-system planning process, the design and implementation of the geographic-information-system database, and the reasons for conversion to this digital-spatial database are included in this report. The Puget Sound Environmental Atlas data layers include information about seabird nesting areas, eelgrass and kelp habitat, marine mammal and fish areas, and shellfish resources and bed certification. Data layers, from sources other than the Puget Sound Environmental Atlas, include the Puget Sound shoreline, the water-body system, shellfish growing areas, recreational shellfish beaches, sewage-treatment outfalls, upland hydrography, watershed and political boundaries, and geographic names. The sources of data, descriptions of the data layers, and the steps and errors of processing associated with conversion to a digital-spatial database used in development of the Puget Sound Geographic Information System also are included in this report. The appendixes contain data dictionaries for each of the resource layers and error values for the conversion of Puget Sound Environmental Atlas data.

  19. Ambiguity and variability of database and software names in bioinformatics.

    PubMed

    Duck, Geraint; Kovacevic, Aleksandar; Robertson, David L; Stevens, Robert; Nenadic, Goran

    2015-01-01

    There are numerous options available to achieve various tasks in bioinformatics, but until recently, there were no tools that could systematically identify mentions of databases and tools within the literature. In this paper we explore the variability and ambiguity of database and software name mentions and compare dictionary and machine learning approaches to their identification. Through the development and analysis of a corpus of 60 full-text documents manually annotated at the mention level, we report high variability and ambiguity in database and software mentions. On a test set of 25 full-text documents, a baseline dictionary look-up achieved an F-score of 46 %, highlighting not only variability and ambiguity but also the extensive number of new resources introduced. A machine learning approach achieved an F-score of 63 % (with precision of 74 %) and 70 % (with precision of 83 %) for strict and lenient matching respectively. We characterise the issues with various mention types and propose potential ways of capturing additional database and software mentions in the literature. Our analyses show that identification of mentions of databases and tools is a challenging task that cannot be achieved by relying on current manually-curated resource repositories. Although machine learning shows improvement and promise (primarily in precision), more contextual information needs to be taken into account to achieve a good degree of accuracy.
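
    The precision and F-score figures quoted above imply a recall value that the abstract does not state. The sketch below recovers it under the assumption that the reported F-score is the balanced F1 (harmonic mean of precision and recall); because the inputs are rounded percentages, the results are approximate.

```python
def f1(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(f_score, precision):
    """Solve F = 2PR / (P + R) for R; requires 2P > F."""
    return f_score * precision / (2 * precision - f_score)


strict_recall = recall_from_f1(0.63, 0.74)    # roughly 0.55
lenient_recall = recall_from_f1(0.70, 0.83)   # roughly 0.61
```

    Under that assumption, recall sits well below precision in both settings, which is consistent with the abstract's remark that the improvement is primarily in precision.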

  20. High-recall protein entity recognition using a dictionary

    PubMed Central

    Kou, Zhenzhen; Cohen, William W.; Murphy, Robert F.

    2010-01-01

    Protein name extraction is an important step in mining biological literature. We describe two new methods for this task: semiCRFs and dictionary HMMs. SemiCRFs are a recently-proposed extension to conditional random fields that enables more effective use of dictionary information as features. Dictionary HMMs are a technique in which a dictionary is converted to a large HMM that recognizes phrases from the dictionary, as well as variations of these phrases. Standard training methods for HMMs can be used to learn which variants should be recognized. We compared the performance of our new approaches to that of Maximum Entropy (Max-Ent) and normal CRFs on three datasets, and improvement was obtained for all four methods over the best published results for two of the datasets. CRFs and semiCRFs achieved the highest overall performance according to the widely-used F-measure, while the dictionary HMMs performed the best at finding entities that actually appear in the dictionary—the measure of most interest in our intended application. PMID:15961466

  1. On the Development of Speech Resources for the Mixtec Language

    PubMed Central

    2013-01-01

    The Mixtec language is one of the main native languages in Mexico. In general, due to urbanization, discrimination, and limited attempts to promote the culture, the native languages are disappearing. Most of the information available about the Mixtec language is in written form as in dictionaries which, although including examples about how to pronounce the Mixtec words, are not as reliable as listening to the correct pronunciation from a native speaker. Formal acoustic resources, as speech corpora, are almost non-existent for the Mixtec, and no speech technologies are known to have been developed for it. This paper presents the development of the following resources for the Mixtec language: (1) a speech database of traditional narratives of the Mixtec culture spoken by a native speaker (labelled at the phonetic and orthographic levels by means of spectral analysis) and (2) a native speaker-adaptive automatic speech recognition (ASR) system (trained with the speech database) integrated with a Mixtec-to-Spanish/Spanish-to-Mixtec text translator. The speech database, although small and limited to a single variant, was reliable enough to build the multiuser speech application which presented a mean recognition/translation performance up to 94.36% in experiments with non-native speakers (the target users). PMID:23710134

  2. Environmental Health and Toxicology Resources of the United States National Library of Medicine

    PubMed Central

    Hochstein, Colette; Arnesen, Stacey; Goshorn, Jeanne

    2009-01-01

    For over 40 years, the National Library of Medicine’s (NLM) Toxicology and Environmental Health Information Program (TEHIP) has worked to organize and to provide access to an extensive array of environmental health and toxicology resources. During these years, the TEHIP program has evolved from a handful of databases developed primarily for researchers to a broad range of products and services that also serve industry, students, and the general public. TEHIP’s resources include TOXNET®, a collection of databases, including online handbooks, bibliographic references, information on the release of chemicals in the environment, and a chemical dictionary. TEHIP also produces several resources aimed towards the general public, such as the Household Products Database, which helps users explore chemicals often found in common household products, and Tox Town®, an interactive guide to commonly encountered toxic substances, health, and the environment. This paper introduces some of NLM’s environmental health and toxicology resources. PMID:17915629

  3. Hierarchical Simulation to Assess Hardware and Software Dependability

    NASA Technical Reports Server (NTRS)

    Ries, Gregory Lawrence

    1997-01-01

    This thesis presents a method for conducting hierarchical simulations to assess system hardware and software dependability. The method is intended to model embedded microprocessor systems. A key contribution of the thesis is the idea of using fault dictionaries to propagate fault effects upward from the level of abstraction where a fault model is assumed to the system level where the ultimate impact of the fault is observed. A second important contribution is the analysis of the software behavior under faults as well as the hardware behavior. The simulation method is demonstrated and validated in four case studies analyzing Myrinet, a commercial, high-speed networking system. One key result from the case studies shows that the simulation method predicts the same fault impact 87.5% of the time as is obtained by similar fault injections into a real Myrinet system. Reasons for the remaining discrepancy are examined in the thesis. A second key result shows the reduction in the number of simulations needed due to the fault dictionary method. In one case study, 500 faults were injected at the chip level, but only 255 propagated to the system level. Of these 255 faults, 110 shared identical fault dictionary entries at the system level and so did not need to be resimulated. The necessary number of system-level simulations was therefore reduced from 500 to 145. Finally, the case studies show how the simulation method can be used to improve the dependability of the target system. The simulation analysis was used to add recovery to the target software for the most common fault propagation mechanisms that would cause the software to hang. After the modification, the number of hangs was reduced by 60% for fault injections into the real system.
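
    The saving described above (500 injected faults, 255 propagated, 110 sharing an existing entry, hence 145 system-level runs) follows from treating the fault dictionary as a cache. The sketch below shows only that bookkeeping; the function names and the synthetic fault model are hypothetical, chosen so that the counts match the figures quoted in the abstract.

```python
def run_campaign(faults, chip_level_effect, system_level_sim):
    """Simulate each distinct fault-dictionary entry once at system level."""
    fault_dictionary = {}   # entry -> system-level outcome
    outcomes = {}           # fault -> outcome
    runs = 0
    for fault in faults:
        entry = chip_level_effect(fault)
        if entry is None:               # fault never reaches the system level
            outcomes[fault] = "masked"
            continue
        if entry not in fault_dictionary:
            fault_dictionary[entry] = system_level_sim(entry)
            runs += 1                   # only new entries cost a simulation
        outcomes[fault] = fault_dictionary[entry]
    return outcomes, runs


# Synthetic stand-ins: 255 of 500 faults propagate, giving 145 distinct entries.
def chip_level_effect(fault):
    return fault % 145 if fault < 255 else None


def system_level_sim(entry):
    return "hang" if entry % 2 else "ok"


outcomes, runs = run_campaign(range(500), chip_level_effect, system_level_sim)
```

    With these stand-ins, `runs` equals 255 - 110 = 145, the reduced count reported in the abstract.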

  4. Relaxations to Sparse Optimization Problems and Applications

    NASA Astrophysics Data System (ADS)

    Skau, Erik West

    Parsimony is a fundamental property that is applied to many characteristics in a variety of fields. Of particular interest are optimization problems that apply rank, dimensionality, or support in a parsimonious manner. In this thesis we study some optimization problems and their relaxations, and focus on properties and qualities of the solutions of these problems. The Gramian tensor decomposition problem attempts to decompose a symmetric tensor as a sum of rank one tensors. We approach the Gramian tensor decomposition problem with a relaxation to a semidefinite program. We study conditions which ensure that the solution of the relaxed semidefinite problem gives the minimal Gramian rank decomposition. Sparse representations with learned dictionaries are one of the leading image modeling techniques for image restoration. When learning these dictionaries from a set of training images, the sparsity parameter of the dictionary learning algorithm strongly influences the content of the dictionary atoms. We describe geometrically the content of trained dictionaries and how it changes with the sparsity parameter. We use statistical analysis to characterize how the different content is used in sparse representations. Finally, a method to control the structure of the dictionaries is demonstrated, allowing us to learn a dictionary which can later be tailored for specific applications. Variations of dictionary learning can be broadly applied to a variety of applications. We explore a pansharpening problem with a triple factorization variant of coupled dictionary learning. Another application of dictionary learning is computer vision. Computer vision relies heavily on object detection, which we explore with a hierarchical convolutional dictionary learning model. Data fusion of disparate modalities is a growing topic of interest. We do a case study to demonstrate the benefit of using social media data with satellite imagery to estimate hazard extents. In this case study analysis we apply a maximum entropy model, guided by the social media data, to estimate the flooded regions during a 2013 flood in Boulder, CO and show that the results are comparable to those obtained using expert information.

  5. Users guide for the Water Resources Division bibliographic retrieval and report generation system

    USGS Publications Warehouse

    Tamberg, Nora

    1983-01-01

    The WRDBIB Retrieval and Report-generation system has been developed by applying Multitrieve (CSD 1980, Reston) software to bibliographic data files. The WRDBIB data base includes some 9,000 records containing bibliographic citations and descriptors of WRD reports released for publication during 1968-1982. The data base is resident in the Reston Multics computer and may be accessed by registered Multics users in the field. The WRDBIB Users Guide provides detailed procedures on how to run retrieval programs using WRDBIB library files, and how to prepare custom bibliographic reports and author indexes. Users may search the WRDBIB data base on the following variable fields as described in the Data Dictionary: Authors, organizational source, title, citation, publication year, descriptors, and the WRSIC (accession) number. The Users Guide provides ample examples of program runs illustrating various retrieval and report generation aspects. Appendices include Multics access and file manipulation procedures; a 'Glossary of Selected Terms'; and a complete 'Retrieval Session' with step-by-step outlines. (USGS)

  6. miRPathDB: a new dictionary on microRNAs and target pathways.

    PubMed

    Backes, Christina; Kehl, Tim; Stöckel, Daniel; Fehlmann, Tobias; Schneider, Lara; Meese, Eckart; Lenhof, Hans-Peter; Keller, Andreas

    2017-01-04

    In the last decade, miRNAs and their regulatory mechanisms have been intensively studied and many tools for the analysis of miRNAs and their targets have been developed. We previously presented a dictionary on single miRNAs and their putative target pathways. Since then, the number of miRNAs has tripled and the knowledge on miRNAs and targets has grown substantially. This, along with changes in pathway resources such as KEGG, leads to an improved understanding of miRNAs, their target genes and related pathways. Here, we introduce the miRNA Pathway Dictionary Database (miRPathDB), freely accessible at https://mpd.bioinf.uni-sb.de/. With the database we aim to complement available target pathway web-servers by providing researchers easy access to information on which pathways are regulated by a miRNA, which miRNAs target a pathway and how specific these regulations are. The database contains a large number of miRNAs (2595 human miRNAs), different miRNA target sets (14 773 experimentally validated target genes as well as 19 281 predicted target genes) and a broad selection of functional biochemical categories (KEGG-, WikiPathways-, BioCarta-, SMPDB-, PID-, Reactome pathways, functional categories from gene ontology (GO), protein families from Pfam and chromosomal locations totaling 12 875 categories). In addition to Homo sapiens, Mus musculus data are also stored and can be compared to human target pathways. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Developing a distributed data dictionary service

    NASA Technical Reports Server (NTRS)

    U'Ren, J.

    2000-01-01

    This paper will explore the use of the Lightweight Directory Access Protocol (LDAP) using the ISO 11179 Data Dictionary Schema as a mechanism for standardizing the structure and communication links between data dictionaries.

  8. Multiple Sparse Representations Classification

    PubMed Central

    Plenge, Esben; Klein, Stefan S.; Niessen, Wiro J.; Meijering, Erik

    2015-01-01

    Sparse representations classification (SRC) is a powerful technique for pixelwise classification of images and it is increasingly being used for a wide variety of image analysis tasks. The method uses sparse representation and learned redundant dictionaries to classify image pixels. In this empirical study we propose to further leverage the redundancy of the learned dictionaries to achieve a more accurate classifier. In conventional SRC, each image pixel is associated with a small patch surrounding it. Using these patches, a dictionary is trained for each class in a supervised fashion. Commonly, redundant/overcomplete dictionaries are trained and image patches are sparsely represented by a linear combination of only a few of the dictionary elements. Given a set of trained dictionaries, a new patch is sparse coded using each of them, and subsequently assigned to the class whose dictionary yields the minimum residual energy. We propose a generalization of this scheme. The method, which we call multiple sparse representations classification (mSRC), is based on the observation that an overcomplete, class specific dictionary is capable of generating multiple accurate and independent estimates of a patch belonging to the class. So instead of finding a single sparse representation of a patch for each dictionary, we find multiple, and the corresponding residual energies provide an enhanced statistic which is used to improve classification. We demonstrate the efficacy of mSRC for three example applications: pixelwise classification of texture images, lumen segmentation in carotid artery magnetic resonance imaging (MRI), and bifurcation point detection in carotid artery MRI. We compare our method with conventional SRC, K-nearest neighbor, and support vector machine classifiers. The results show that mSRC outperforms SRC and the other reference methods. In addition, we present an extensive evaluation of the effect of the main mSRC parameters: patch size, dictionary size, and sparsity level. PMID:26177106
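
    The conventional SRC decision rule described above (sparse code the patch with each class dictionary, then pick the class with the minimum residual energy) might be sketched as follows. This is only an illustration under stated assumptions: orthogonal matching pursuit is swapped in as the sparse coder because the abstract does not name one, the two tiny dictionaries are hand-built toy data rather than learned ones, and the mSRC extension (multiple representations per dictionary) is not implemented.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: approximate x with at most k atoms of D."""
    residual = x.astype(float)
    support = []
    for _ in range(k):
        idx = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with residual
        if idx in support:                             # no new atom helps; stop early
            break
        support.append(idx)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    return residual

def src_classify(dictionaries, x, k):
    """Return the index of the dictionary with the smallest residual energy."""
    energies = [float(np.sum(omp(D, x, k) ** 2)) for D in dictionaries]
    return int(np.argmin(energies)), energies

def unit_columns(A):
    A = np.asarray(A, dtype=float)
    return A / np.linalg.norm(A, axis=0)

# Toy class dictionaries (hypothetical): class 0 atoms live in the first two
# coordinates, class 1 atoms in the last two.
D0 = unit_columns([[1, 0, 1], [0, 1, 1], [0, 0, 0], [0, 0, 0]])
D1 = unit_columns([[0, 0, 0], [0, 0, 0], [1, 0, 1], [0, 1, -1]])

patch = np.array([0.0, 0.0, 2.0, 1.0])
label, energies = src_classify([D0, D1], patch, k=2)
print(label, energies)
```

    In this toy case the patch lies entirely in the subspace spanned by the class 1 atoms, so its residual energy under D1 is essentially zero while D0 cannot explain it at all.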

  9. A multistage gene normalization system integrating multiple effective methods.

    PubMed

    Li, Lishuang; Liu, Shanshan; Li, Lihua; Fan, Wenting; Huang, Degen; Zhou, Huiwei

    2013-01-01

    Gene/protein recognition and normalization is an important preliminary step for many biological text mining tasks. In this paper, we present a multistage gene normalization system which consists of four major subtasks: pre-processing, dictionary matching, ambiguity resolution and filtering. For the first subtask, we apply the gene mention tagger developed in our earlier work, which achieves an F-score of 88.42% on the BioCreative II GM testing set. In the stage of dictionary matching, the exact matching and approximate matching between gene names and the EntrezGene lexicon have been combined. For the ambiguity resolution subtask, we propose a semantic similarity disambiguation method based on Munkres' Assignment Algorithm. At the last step, a filter based on Wikipedia has been built to remove the false positives. Experimental results show that the presented system can achieve an F-score of 90.1%, outperforming most of the state-of-the-art systems.
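
    The dictionary matching stage, which combines exact and approximate matching against a lexicon, could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the toy lexicon and its identifiers are hypothetical (not EntrezGene records), the normalization rule is a guess, and Python's standard `difflib` similarity is swapped in for whatever approximate matcher the authors used.

```python
import difflib

def normalize(name):
    """Lowercase and drop punctuation/whitespace so 'TP-53' and 'tp53' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def build_index(lexicon):
    """Map each normalized name to the set of identifiers that carry it."""
    index = {}
    for gene_id, names in lexicon.items():
        for name in names:
            index.setdefault(normalize(name), set()).add(gene_id)
    return index

def match_gene(mention, index, cutoff=0.85):
    """Try an exact lookup first, then fall back to the closest approximate match."""
    key = normalize(mention)
    if key in index:
        return sorted(index[key]), "exact"
    close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    if close:
        return sorted(index[close[0]]), "approximate"
    return [], "none"

# Hypothetical toy lexicon: identifier -> known names and synonyms.
lexicon = {
    "GENE:1": ["TP53", "p53", "tumor protein p53"],
    "GENE:2": ["BRCA1", "breast cancer 1"],
}
index = build_index(lexicon)

print(match_gene("TP-53", index))
print(match_gene("tumour protein p53", index))
print(match_gene("xyz", index))
```

    A mention that maps to more than one identifier would be passed on to the ambiguity resolution stage, which this sketch does not cover.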

  10. Space station data system analysis/architecture study. Task 4: System definition report. Appendix

    NASA Technical Reports Server (NTRS)

    1985-01-01

    Appendices to the systems definition study for the space station Data System are compiled. Supplemental information on external interface specification, simulation and modeling, and function design characteristics is presented along with data flow diagrams, a data dictionary, and function allocation matrices.

  11. Sparse Coding of Natural Human Motion Yields Eigenmotions Consistent Across People

    NASA Astrophysics Data System (ADS)

    Thomik, Andreas; Faisal, A. Aldo

    2015-03-01

    Providing a precise mathematical description of the structure of natural human movement is a challenging problem. We use a data-driven approach to seek a generative model of movement capturing the underlying simplicity of spatial and temporal structure of behaviour observed in daily life. In perception, the analysis of natural scenes has shown that sparse codes of such scenes are information theoretic efficient descriptors with direct neuronal correlates. Translating from perception to action, we identify a generative model of movement generation by the human motor system. Using wearable full-hand motion capture, we measure the digit movement of the human hand in daily life. We learn a dictionary of "eigenmotions" which we use for sparse encoding of the movement data. We show that the dictionaries are generally well preserved across subjects with small deviations accounting for individuality of the person and variability in tasks. Further, the dictionary elements represent motions which can naturally describe hand movements. Our findings suggest the motor system can compose complex movement behaviours out of the spatially and temporally sparse activation of "eigenmotion" neurons, and is consistent with data on grasp-type specificity of specialised neurons in the premotor cortex. Andreas is supported by the Luxemburg Research Fund (1229297).

  12. Developing a data dictionary for the Irish nursing minimum dataset.

    PubMed

    Henry, Pamela; Mac Neela, Pádraig; Clinton, Gerard; Scott, Anne; Treacy, Pearl; Butler, Michelle; Hyde, Abbey; Morris, Roisin; Irving, Kate; Byrne, Anne

    2006-01-01

    One of the challenges in health care in Ireland is the relatively slow acceptance of standardised clinical information systems. Yet the national Irish health reform programme indicates that an Electronic Health Care Record (EHCR) will be implemented on a phased basis [3-5]. While nursing has a key role in ensuring the quality and comparability of health information, the so-called 'invisibility' of some nursing activities makes this a challenging aim to achieve [3-5]. Any integrated health care system requires the adoption of uniform standards for electronic data exchange [1-2]. One of the pre-requisites for uniform standards is the composition of a data dictionary. Inadequate definition of data elements in a particular dataset hinders the development of an integrated data depository or electronic health care record (EHCR). This paper outlines how work on the data dictionary for the Irish Nursing Minimum Dataset (INMDS) has addressed this issue. Data set elements were devised on the basis of a large scale empirical research programme. ISO 18104, the reference terminology for nursing [6], was used to cross-map the data set elements with semantic domains, categories and links and data set items were dissected.

  13. Field-testing the new DECtalk PC system for medical applications

    NASA Technical Reports Server (NTRS)

    Grams, R. R.; Smillov, A.; Li, B.

    1992-01-01

    Synthesized human speech has now reached a new level of performance. With the introduction of DEC's new DECtalk PC, the small system developer will have a very powerful tool for creative design. It has been our privilege to be involved in the beta-testing of this new device and to add a medical dictionary which covers a wide range of medical terminology. With the inherent board level understanding of speech synthesis and the medical dictionary, it is now possible to provide full digital speech output for all medical files and terms. The application of these tools will cover a wide range of options for the future and allow a new dimension in dealing with the complex user interface experienced in medical practice.

  14. Sparsity and Nullity: Paradigm for Analysis Dictionary Learning

    DTIC Science & Technology

    2016-08-09

    Sparse models in dictionary learning have been successfully applied in a wide variety of machine learning and...we investigate the relation between the SNS problem and the analysis dictionary learning problem, and show that the SNS problem plays a central role...and may be utilized to solve dictionary learning problems.

  15. Readers' opinions of romantic poetry are consistent with emotional measures based on the Dictionary of Affect in Language.

    PubMed

    Whissell, Cynthia

    2003-06-01

    A principal components analysis of 68 volunteers' subjective ratings of 20 excerpts of Romantic poetry and of Dictionary of Affect scores for the same excerpts produced four components representing Pleasantness, Activation, Romanticism, and Nature. Dictionary measures and subjective ratings of the same constructs loaded on the same factor. Results are interpreted as providing construct validity for the Dictionary of Affect.

  16. Fast dictionary-based reconstruction for diffusion spectrum imaging.

    PubMed

    Bilgic, Berkin; Chatnuntawech, Itthi; Setsompop, Kawin; Cauley, Stephen F; Yendiki, Anastasia; Wald, Lawrence L; Adalsteinsson, Elfar

    2013-11-01

    Diffusion spectrum imaging reveals detailed local diffusion properties at the expense of substantially long imaging times. It is possible to accelerate acquisition by undersampling in q-space, followed by image reconstruction that exploits prior knowledge on the diffusion probability density functions (pdfs). Previously proposed methods impose this prior in the form of sparsity under wavelet and total variation transforms, or under adaptive dictionaries that are trained on example datasets to maximize the sparsity of the representation. These compressed sensing (CS) methods require full-brain processing times on the order of hours using MATLAB running on a workstation. This work presents two dictionary-based reconstruction techniques that use analytical solutions, and are two orders of magnitude faster than the previously proposed dictionary-based CS approach. The first method generates a dictionary from the training data using principal component analysis (PCA), and performs the reconstruction in the PCA space. The second proposed method applies reconstruction using pseudoinverse with Tikhonov regularization with respect to a dictionary. This dictionary can either be obtained using the K-SVD algorithm, or it can simply be the training dataset of pdfs without any training. All of the proposed methods achieve reconstruction times on the order of seconds per imaging slice, and have reconstruction quality comparable to that of dictionary-based CS algorithm.
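
    The second method above, a pseudoinverse with Tikhonov regularization with respect to a dictionary, can be illustrated with a small numerical sketch. The setup is an assumption chosen for brevity: a one-dimensional signal, a plain DFT standing in for the transform between the pdf and q-space, a random matrix standing in for the dictionary of training pdfs, and an arbitrary regularization weight `lam`. The closed form used is c = (AᴴA + λI)⁻¹Aᴴq with A the sampled transform applied to the dictionary.

```python
import numpy as np

def tikhonov_reconstruct(D, mask, q, lam):
    """Estimate a signal p = D c from undersampled Fourier samples q.

    D    -- (n, m) dictionary whose columns are example signals
    mask -- indices of the sampled Fourier (q-space) points
    q    -- measured samples at those indices
    lam  -- Tikhonov regularization weight
    """
    n, m = D.shape
    F = np.fft.fft(np.eye(n))            # DFT matrix; stand-in for the q-space transform
    A = F[mask, :] @ D                   # dictionary coefficients -> sampled measurements
    lhs = A.conj().T @ A + lam * np.eye(m)
    c = np.linalg.solve(lhs, A.conj().T @ q)
    return (D @ c).real

# Synthetic check: a signal that truly lies in the span of the dictionary.
rng = np.random.default_rng(0)
n, m = 32, 8
D = rng.standard_normal((n, m))
p_true = D @ rng.standard_normal(m)
mask = np.sort(rng.choice(n, size=16, replace=False))   # keep half of the samples
q = np.fft.fft(p_true)[mask]

p_hat = tikhonov_reconstruct(D, mask, q, lam=1e-8)
print("max abs error:", np.max(np.abs(p_hat - p_true)))
```

    Because the solution is a single linear solve, the matrix (AᴴA + λI)⁻¹Aᴴ can be computed once and reused for every voxel, which is consistent with the speed advantage the abstract reports over iterative CS methods.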

  17. Weakly supervised visual dictionary learning by harnessing image attributes.

    PubMed

    Gao, Yue; Ji, Rongrong; Liu, Wei; Dai, Qionghai; Hua, Gang

    2014-12-01

    Bag-of-features (BoF) representation has been extensively applied to deal with various computer vision applications. To extract discriminative and descriptive BoF, one important step is to learn a good dictionary to minimize the quantization loss between local features and codewords. While most existing visual dictionary learning approaches are engaged with unsupervised feature quantization, the latest trend has turned to supervised learning by harnessing the semantic labels of images or regions. However, such labels are typically too expensive to acquire, which restricts the scalability of supervised dictionary learning approaches. In this paper, we propose to leverage image attributes to weakly supervise the dictionary learning procedure without requiring any actual labels. As a key contribution, our approach establishes a generative hidden Markov random field (HMRF), which models the quantized codewords as the observed states and the image attributes as the hidden states, respectively. Dictionary learning is then performed by supervised grouping of the observed states, where the supervisory information stems from the hidden states of the HMRF. In such a way, the proposed dictionary learning approach incorporates the image attributes to learn a semantic-preserving BoF representation without any genuine supervision. Experiments in large-scale image retrieval and classification tasks corroborate that our approach significantly outperforms the state-of-the-art unsupervised dictionary learning approaches.

  18. Password-only authenticated three-party key exchange proven secure against insider dictionary attacks.

    PubMed

    Nam, Junghyun; Choo, Kim-Kwang Raymond; Paik, Juryon; Won, Dongho

    2014-01-01

    While a number of protocols for password-only authenticated key exchange (PAKE) in the 3-party setting have been proposed, it still remains a challenging task to prove the security of a 3-party PAKE protocol against insider dictionary attacks. To the best of our knowledge, there is no 3-party PAKE protocol that carries a formal proof, or even definition, of security against insider dictionary attacks. In this paper, we present the first 3-party PAKE protocol proven secure against both online and offline dictionary attacks as well as insider and outsider dictionary attacks. Our construct can be viewed as a protocol compiler that transforms any 2-party PAKE protocol into a 3-party PAKE protocol with 2 additional rounds of communication. We also present a simple and intuitive approach of formally modelling dictionary attacks in the password-only 3-party setting, which significantly reduces the complexity of proving the security of 3-party PAKE protocols against dictionary attacks. In addition, we investigate the security of the well-known 3-party PAKE protocol, called GPAKE, due to Abdalla et al. (2005, 2006), and demonstrate that the security of GPAKE against online dictionary attacks depends heavily on the composition of its two building blocks, namely a 2-party PAKE protocol and a 3-party key distribution protocol.

  19. Adaptive Greedy Dictionary Selection for Web Media Summarization.

    PubMed

    Cong, Yang; Liu, Ji; Sun, Gan; You, Quanzeng; Li, Yuncheng; Luo, Jiebo

    2017-01-01

    Initializing an effective dictionary is an indispensable step for sparse representation. In this paper, we focus on the dictionary selection problem with the objective to select a compact subset of basis from original training data instead of learning a new dictionary matrix as dictionary learning models do. We first design a new dictionary selection model via the ℓ2,0 norm. For model optimization, we propose two methods: one is the standard forward-backward greedy algorithm, which is not suitable for large-scale problems; the other is based on the gradient cues at each forward iteration and speeds up the process dramatically. In comparison with the state-of-the-art dictionary selection models, our model is not only more effective and efficient, but also can control the sparsity. To evaluate the performance of our new model, we select two practical web media summarization problems: 1) we build a new data set consisting of around 500 users, 3000 albums, and 1 million images, and achieve effective assisted albuming based on our model and 2) by formulating the video summarization problem as a dictionary selection issue, we employ our model to extract keyframes from a video sequence in a more flexible way. Generally, our model outperforms the state-of-the-art methods in both tasks.

  20. Multi-level discriminative dictionary learning with application to large scale image classification.

    PubMed

    Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua

    2015-10-01

    The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of the task (such as discrimination for a classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computational complexity when dealing with a large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.

  1. Fast Dictionary-Based Reconstruction for Diffusion Spectrum Imaging

    PubMed Central

    Bilgic, Berkin; Chatnuntawech, Itthi; Setsompop, Kawin; Cauley, Stephen F.; Yendiki, Anastasia; Wald, Lawrence L.; Adalsteinsson, Elfar

    2015-01-01

    Diffusion Spectrum Imaging (DSI) reveals detailed local diffusion properties at the expense of substantially long imaging times. It is possible to accelerate acquisition by undersampling in q-space, followed by image reconstruction that exploits prior knowledge on the diffusion probability density functions (pdfs). Previously proposed methods impose this prior in the form of sparsity under wavelet and total variation (TV) transforms, or under adaptive dictionaries that are trained on example datasets to maximize the sparsity of the representation. These compressed sensing (CS) methods require full-brain processing times on the order of hours using Matlab running on a workstation. This work presents two dictionary-based reconstruction techniques that use analytical solutions, and are two orders of magnitude faster than the previously proposed dictionary-based CS approach. The first method generates a dictionary from the training data using Principal Component Analysis (PCA), and performs the reconstruction in the PCA space. The second proposed method applies reconstruction using pseudoinverse with Tikhonov regularization with respect to a dictionary. This dictionary can either be obtained using the K-SVD algorithm, or it can simply be the training dataset of pdfs without any training. All of the proposed methods achieve reconstruction times on the order of seconds per imaging slice, and have reconstruction quality comparable to that of dictionary-based CS algorithm. PMID:23846466
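
    The first method above (build a dictionary from training data with PCA and reconstruct in the PCA space) can be sketched in the same simplified setting. The assumptions are again illustrative only: one-dimensional signals, a plain DFT as the sampling transform, synthetic training data that lie in a three-dimensional affine subspace, and hypothetical function names. The reconstruction solves a small least-squares problem for the PCA coefficients and maps them back to signal space.

```python
import numpy as np

def pca_dictionary(training, n_components):
    """Return the mean signal and the leading principal directions of the training set.

    training -- (n, N) array with one training signal per column
    """
    mean = training.mean(axis=1)
    U, _, _ = np.linalg.svd(training - mean[:, None], full_matrices=False)
    return mean, U[:, :n_components]

def pca_reconstruct(mean, Q, mask, q):
    """Fit PCA coefficients to undersampled Fourier samples, then rebuild the signal."""
    F = np.fft.fft(np.eye(mean.size))    # DFT matrix; stand-in for the q-space transform
    A = F[mask, :] @ Q                   # PCA coefficients -> sampled measurements
    c, *_ = np.linalg.lstsq(A, q - F[mask, :] @ mean, rcond=None)
    return (mean + Q @ c).real

# Synthetic training set: 50 signals of length 32 in a 3-dimensional affine subspace.
rng = np.random.default_rng(1)
n, rank = 32, 3
offset = rng.standard_normal(n)
basis = rng.standard_normal((n, rank))
training = offset[:, None] + basis @ rng.standard_normal((rank, 50))

mean, Q = pca_dictionary(training, n_components=rank)

# A new signal from the same subspace, observed at 12 of 32 Fourier points.
signal = offset + basis @ rng.standard_normal(rank)
mask = np.sort(rng.choice(n, size=12, replace=False))
q = np.fft.fft(signal)[mask]

estimate = pca_reconstruct(mean, Q, mask, q)
print("max abs error:", np.max(np.abs(estimate - signal)))
```

    With only a handful of PCA coefficients to estimate, the least-squares system stays small no matter how many training signals were used, which is the practical appeal of working in the PCA space.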

  2. Protein Data Bank Japan (PDBj): maintaining a structural data archive and resource description framework format

    PubMed Central

    Kinjo, Akira R.; Suzuki, Hirofumi; Yamashita, Reiko; Ikegawa, Yasuyo; Kudou, Takahiro; Igarashi, Reiko; Kengaku, Yumiko; Cho, Hasumi; Standley, Daron M.; Nakagawa, Atsushi; Nakamura, Haruki

    2012-01-01

    The Protein Data Bank Japan (PDBj, http://pdbj.org) is a member of the worldwide Protein Data Bank (wwPDB) and accepts and processes the deposited data of experimentally determined macromolecular structures. While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins, which are summarized in this article. To enhance the interoperability of the PDB data, we have recently developed PDB/RDF, PDB data in the Resource Description Framework (RDF) format, along with its ontology in the Web Ontology Language (OWL) based on the PDB mmCIF Exchange Dictionary. Being in the standard format for the Semantic Web, the PDB/RDF data provide a means to integrate the PDB with other biological information resources. PMID:21976737

  3. The Pocket Dictionary: A Textbook for Spelling.

    ERIC Educational Resources Information Center

    Doggett, Maran

    1982-01-01

    Reports on a productive approach to secondary-school spelling instruction--one that emphasizes how and when to use the dictionary. Describes two of the many class activities that cultivate student use of the dictionary. (RL)

  4. Cheap Words: A Paperback Dictionary Roundup.

    ERIC Educational Resources Information Center

    Kister, Ken

    1979-01-01

    Surveys currently available paperback editions in three classes of dictionaries: collegiate, abridged, and pocket. A general discussion distinguishes among the classes and offers seven consumer tips, followed by an annotated listing of dictionaries now available. (SW)

  5. On A Nonlinear Generalization of Sparse Coding and Dictionary Learning.

    PubMed

    Xie, Yuchen; Ho, Jeffrey; Vemuri, Baba

    2013-01-01

    Existing dictionary learning algorithms are based on the assumption that the data are vectors in a Euclidean vector space ℝ^d, and the dictionary is learned from the training data using the vector space structure of ℝ^d and its Euclidean L2-metric. However, in many applications, features and data often originate from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis.

  6. On A Nonlinear Generalization of Sparse Coding and Dictionary Learning

    PubMed Central

    Xie, Yuchen; Ho, Jeffrey; Vemuri, Baba

    2013-01-01

    Existing dictionary learning algorithms are based on the assumption that the data are vectors in a Euclidean vector space ℝ^d, and the dictionary is learned from the training data using the vector space structure of ℝ^d and its Euclidean L2-metric. However, in many applications, features and data often originate from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis. PMID:24129583

  7. An analysis dictionary learning algorithm under a noisy data model with orthogonality constraint.

    PubMed

    Zhang, Ye; Yu, Tenglong; Wang, Wenwu

    2014-01-01

    Two common problems are often encountered in analysis dictionary learning (ADL) algorithms. The first one is that the original clean signals for learning the dictionary are assumed to be known, which otherwise need to be estimated from noisy measurements. This, however, renders a computationally slow optimization process and potentially unreliable estimation (if the noise level is high), as represented by the Analysis K-SVD (AK-SVD) algorithm. The other problem is the trivial solution to the dictionary, for example, the null dictionary matrix that may be given by a dictionary learning algorithm, as discussed in the learning overcomplete sparsifying transform (LOST) algorithm. Here we propose a novel optimization model and an iterative algorithm to learn the analysis dictionary, where we directly employ the observed data to compute the approximate analysis sparse representation of the original signals (leading to a fast optimization procedure) and enforce an orthogonality constraint on the optimization criterion to avoid the trivial solutions. Experiments demonstrate the competitive performance of the proposed algorithm as compared with three baselines, namely, the AK-SVD, LOST, and NAAOLA algorithms.

  8. DES-ncRNA: A knowledgebase for exploring information about human micro and long noncoding RNAs based on literature-mining.

    PubMed

    Salhi, Adil; Essack, Magbubah; Alam, Tanvir; Bajic, Vladan P; Ma, Lina; Radovanovic, Aleksandar; Marchand, Benoit; Schmeier, Sebastian; Zhang, Zhang; Bajic, Vladimir B

    2017-07-03

    Noncoding RNAs (ncRNAs), particularly microRNAs (miRNAs) and long ncRNAs (lncRNAs), are important players in diseases and emerge as novel drug targets. Thus, unraveling the relationships between ncRNAs and other biomedical entities in cells is critical for better understanding ncRNA roles that may eventually help develop their use in medicine. To support ncRNA research and facilitate retrieval of relevant information regarding miRNAs and lncRNAs from the plethora of published ncRNA-related research, we developed DES-ncRNA (www.cbrc.kaust.edu.sa/des_ncrna). DES-ncRNA is a knowledgebase containing text- and data-mined information from public scientific literature and other public resources. Exploration of mined information is enabled through terms and pairs of terms from 19 topic-specific dictionaries including, for example, antibiotics, toxins, drugs, enzymes, mutations, pathways, human genes and proteins, drug indications and side effects, mutations, diseases, etc. DES-ncRNA contains approximately 878,000 associations of terms from these dictionaries of which 36,222 (5,373) are with regards to miRNAs (lncRNAs). We provide several ways to explore information regarding ncRNAs to users including controlled generation of association networks as well as hypotheses generation. We show an example of how DES-ncRNA can aid research on Alzheimer disease and suggest a potential therapeutic role for Fasudil. DES-ncRNA is a powerful tool that can be used on its own or as a complement to the existing resources, to support research in human ncRNA. To our knowledge, this is the only knowledgebase dedicated to human miRNAs and lncRNAs derived primarily through literature-mining enabling exploration of a broad spectrum of associated biomedical entities, not paralleled by any other resource.

  9. DES-ncRNA: A knowledgebase for exploring information about human micro and long noncoding RNAs based on literature-mining

    PubMed Central

    Salhi, Adil; Essack, Magbubah; Alam, Tanvir; Bajic, Vladan P.; Ma, Lina; Radovanovic, Aleksandar; Marchand, Benoit; Zhang, Zhang; Bajic, Vladimir B.

    2017-01-01

    Noncoding RNAs (ncRNAs), particularly microRNAs (miRNAs) and long ncRNAs (lncRNAs), are important players in diseases and are emerging as novel drug targets. Thus, unraveling the relationships between ncRNAs and other biomedical entities in cells is critical for better understanding ncRNA roles, which may eventually help develop their use in medicine. To support ncRNA research and facilitate retrieval of relevant information regarding miRNAs and lncRNAs from the plethora of published ncRNA-related research, we developed DES-ncRNA (www.cbrc.kaust.edu.sa/des_ncrna). DES-ncRNA is a knowledgebase containing text- and data-mined information from public scientific literature and other public resources. Exploration of mined information is enabled through terms and pairs of terms from 19 topic-specific dictionaries including, for example, antibiotics, toxins, drugs, enzymes, mutations, pathways, human genes and proteins, drug indications and side effects, diseases, etc. DES-ncRNA contains approximately 878,000 associations of terms from these dictionaries, of which 36,222 (5,373) concern miRNAs (lncRNAs). We provide users with several ways to explore information regarding ncRNAs, including controlled generation of association networks as well as hypothesis generation. We show an example of how DES-ncRNA can aid research on Alzheimer disease and suggest a potential therapeutic role for Fasudil. DES-ncRNA is a powerful tool that can be used on its own or as a complement to the existing resources, to support research in human ncRNA. To our knowledge, this is the only knowledgebase dedicated to human miRNAs and lncRNAs derived primarily through literature-mining enabling exploration of a broad spectrum of associated biomedical entities, not paralleled by any other resource. PMID:28387604

  10. A selective annotated bibliography for clinical audiology (1988-2008): reference works.

    PubMed

    Ferrer-Vinent, Susan T; Ferrer-Vinent, Ignacio J

    2009-06-01

    This is the 1st in a series of 3 planned companion articles that present a selected, annotated, and indexed bibliography of clinical audiology publications from 1988 to 2008. Research and preparation of the bibliography were based on published guidelines, professional audiology experience, and professional librarian experience. This article presents reference works (dictionaries, encyclopedias, handbooks, and manuals). The future planned articles will cover other monographs, periodicals, and online resources. Audiologists and librarians can use these lists as a guide when seeking clinical audiology literature.

  11. Sexting--it's in the dictionary.

    PubMed

    Mattey, Beth; Diliberto, Gail Mattey

    2013-03-01

    Sexting has become commonplace in our vocabulary, as commonplace as technology use is to our youth. The role of the school nurse necessitates awareness of issues surrounding sexting along with the capability to proactively educate students, staff and parents on the dangers of sexting. Students are empowered when provided the knowledge that only they control their own image. This article explores current terminology, incidence of sexting among today's youth, legal implications, as well as strategies and resources for schools to assist in dealing with sexting.

  12. SaRAD: a Simple and Robust Abbreviation Dictionary.

    PubMed

    Adar, Eytan

    2004-03-01

    Due to recent interest in the use of textual material to augment traditional experiments it has become necessary to automatically cluster, classify and filter natural language information. The Simple and Robust Abbreviation Dictionary (SaRAD) provides an easy to implement, high performance tool for the construction of a biomedical symbol dictionary. The algorithms, applied to the MEDLINE document set, result in a high quality dictionary and toolset to disambiguate abbreviation symbols automatically.

  13. University of Glasgow at TREC 2008: Experiments in Blog, Enterprise, and Relevance Feedback Tracks with Terrier

    DTIC Science & Technology

    2008-11-01

    improves our TREC 2007 dictionary-based approach by automatically building an internal opinion dictionary from the collection itself. We measure the opin...detecting opinionated documents. The first approach improves our TREC 2007 dictionary-based approach by automatically building an internal opinion... dictionary from the collection itself. The second approach is based on the OpinionFinder tool, which identifies subjective sentences in text. In particular

  14. The Effect of Bilingual Term List Size on Dictionary-Based Cross-Language Information Retrieval

    DTIC Science & Technology

    2006-01-01

    The Effect of Bilingual Term List Size on Dictionary-Based Cross-Language Information Retrieval Dina Demner-Fushman Department of Computer Science... dictionary-based Cross-Language Information Retrieval (CLIR), in which the goal is to find documents written in one natural language based on queries that...in which the documents are written. In dictionary-based CLIR techniques, the principal source of translation knowledge is a translation lexicon
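
    The record above describes dictionary-based CLIR, where a translation lexicon supplies the translations of query terms. A minimal sketch of that idea follows, assuming a toy English-to-French term list; the names translate_query, coverage, and TERM_LIST are hypothetical and are not taken from the report.

```python
from typing import Dict, List

def translate_query(query: str, term_list: Dict[str, List[str]]) -> List[str]:
    """Replace each query term with all of its listed translations.

    Terms missing from the term list are kept untranslated.
    """
    translated: List[str] = []
    for term in query.lower().split():
        translated.extend(term_list.get(term, [term]))
    return translated

def coverage(query: str, term_list: Dict[str, List[str]]) -> float:
    """Fraction of query terms that have at least one entry in the term list."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    return sum(term in term_list for term in terms) / len(terms)

# Hypothetical toy term list (illustration only).
TERM_LIST = {
    "dictionary": ["dictionnaire"],
    "size": ["taille", "dimension"],
}

translated = translate_query("Dictionary size effect", TERM_LIST)
covered = coverage("Dictionary size effect", TERM_LIST)
```

    A larger term list raises coverage, which is the quantity one would expect to interact with retrieval quality in a study of term list size.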

  15. Robust Multimodal Dictionary Learning

    PubMed Central

    Cao, Tian; Jojic, Vladimir; Modla, Shannon; Powell, Debbie; Czymmek, Kirk; Niethammer, Marc

    2014-01-01

    We propose a robust multimodal dictionary learning method for multimodal images. Joint dictionary learning for both modalities may be impaired by lack of correspondence between image modalities in training data, for example due to areas of low quality in one of the modalities. Dictionaries learned with such non-corresponding data will induce uncertainty about image representation. In this paper, we propose a probabilistic model that accounts for image areas that are poorly corresponding between the image modalities. We cast the problem of learning a dictionary in the presence of problematic image patches as a likelihood maximization problem and solve it with a variant of the EM algorithm. Our algorithm iterates identification of poorly corresponding patches and refinements of the dictionary. We tested our method on synthetic and real data. We show improvements in image prediction quality and alignment accuracy when using the method for multimodal image registration. PMID:24505674

  16. Evaluation of techniques for increasing recall in a dictionary approach to gene and protein name identification.

    PubMed

    Schuemie, Martijn J; Mons, Barend; Weeber, Marc; Kors, Jan A

    2007-06-01

    Gene and protein name identification in text requires a dictionary approach to relate synonyms to the same gene or protein, and to link names to external databases. However, existing dictionaries are incomplete. We investigate two complementary methods for automatic generation of a comprehensive dictionary: combination of information from existing gene and protein databases and rule-based generation of spelling variations. Both methods have been reported in literature before, but have hitherto not been combined and evaluated systematically. We combined gene and protein names from several existing databases of four different organisms. The combined dictionaries showed a substantial increase in recall on three different test sets, as compared to any single database. Application of 23 spelling variation rules to the combined dictionaries further increased recall. However, many rules appeared to have no effect and some appear to have a detrimental effect on precision.
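
    The abstract above mentions rule-based generation of spelling variations to raise recall. The sketch below illustrates the general technique with three made-up rules (hyphen to space, hyphen removal, space between letters and digits) plus case folding; these are not the 23 rules evaluated in the study, and the identifier "GENE:0001" is hypothetical.

```python
import re
from typing import Dict, List, Set

def spelling_variants(name: str) -> Set[str]:
    """Generate lower-cased spelling variants of a gene or protein name."""
    variants = {name}
    variants.add(name.replace("-", " "))   # "IL-2" -> "IL 2"
    variants.add(name.replace("-", ""))    # "IL-2" -> "IL2"
    # Insert a space where a letter is directly followed by a digit: "IL2" -> "IL 2"
    variants.add(re.sub(r"(?<=[A-Za-z])(?=\d)", " ", name))
    return {v.lower() for v in variants}

def build_lookup(dictionary: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map every variant to the set of identifiers it may refer to.

    A set is used because one variant can be ambiguous between entries.
    """
    lookup: Dict[str, Set[str]] = {}
    for identifier, names in dictionary.items():
        for name in names:
            for variant in spelling_variants(name):
                lookup.setdefault(variant, set()).add(identifier)
    return lookup

# Hypothetical dictionary entry combining synonyms for one gene.
lookup = build_lookup({"GENE:0001": ["IL-2", "interleukin-2"]})
```

    Each added rule enlarges the lookup table, which is why a rule can raise recall while also lowering precision when a generated variant collides with an ordinary word or another entry.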

  17. Dictionary Learning on the Manifold of Square Root Densities and Application to Reconstruction of Diffusion Propagator Fields*

    PubMed Central

    Sun, Jiaqi; Xie, Yuchen; Ye, Wenxing; Ho, Jeffrey; Entezari, Alireza; Blackband, Stephen J.

    2013-01-01

    In this paper, we present a novel dictionary learning framework for data lying on the manifold of square root densities and apply it to the reconstruction of diffusion propagator (DP) fields given a multi-shell diffusion MRI data set. Unlike most of the existing dictionary learning algorithms which rely on the assumption that the data points are vectors in some Euclidean space, our dictionary learning algorithm is designed to incorporate the intrinsic geometric structure of manifolds and performs better than traditional dictionary learning approaches when applied to data lying on the manifold of square root densities. Non-negativity as well as smoothness across the whole field of the reconstructed DPs is guaranteed in our approach. We demonstrate the advantage of our approach by comparing it with an existing dictionary based reconstruction method on synthetic and real multi-shell MRI data. PMID:24684004

  18. Plasma Dictionary Website

    NASA Astrophysics Data System (ADS)

    Correll, Don; Heeter, Robert; Alvarez, Mitch

    2000-10-01

    In response to many inquiries for a list of plasma terms, a database driven Plasma Dictionary website (plasmadictionary.llnl.gov) was created that allows users to submit new terms, search for specific terms or browse alphabetic listings. The Plasma Dictionary website contents began with the Fusion & Plasma Glossary terms available at the Fusion Energy Educational website (fusedweb.llnl.gov). Plasma researchers are encouraged to add terms and definitions. By clarifying the meanings of specific plasma terms, it is envisioned that the primary use of the Plasma Dictionary website will be by students, teachers, researchers, and writers for (1) Enhancing literacy in plasma science, (2) Serving as an educational aid, (3) Providing practical information, and (4) Helping clarify plasma writings. The Plasma Dictionary website has already proved useful in responding to a request from the CRC Press (www.crcpress.com) to add plasma terms to its CRC physics dictionary project (members.aol.com/physdict/).

  19. The Research on Denoising of SAR Image Based on Improved K-SVD Algorithm

    NASA Astrophysics Data System (ADS)

    Tan, Linglong; Li, Changkai; Wang, Yueqin

    2018-04-01

    SAR images are often corrupted by noise during acquisition and transmission, which can greatly reduce image quality and cause great difficulties for image processing. The existing complete DCT dictionary algorithm is fast, but its denoising effect is poor. In this paper, to address the problem of poor denoising, the K-SVD (K-means and singular value decomposition) algorithm is applied to image noise suppression. First, the sparse dictionary structure is introduced in detail. The dictionary has a compact representation and can effectively train the image signal. Then, the sparse dictionary is trained by the K-SVD algorithm according to the sparse representation of the dictionary. The algorithm has more advantages in high-dimensional data processing. Experimental results show that the proposed algorithm removes speckle noise more effectively than the complete DCT dictionary and retains edge details better.

  20. Learning overcomplete representations from distributed data: a brief review

    NASA Astrophysics Data System (ADS)

    Raja, Haroon; Bajwa, Waheed U.

    2016-05-01

    Most of the research on dictionary learning has focused on developing algorithms under the assumption that data is available at a centralized location. But often the data is not available at a centralized location due to practical constraints like data aggregation costs, privacy concerns, etc. Using centralized dictionary learning algorithms may not be the optimal choice in such settings. This motivates the design of dictionary learning algorithms that consider distributed nature of data as one of the problem variables. Just like centralized settings, distributed dictionary learning problem can be posed in more than one way depending on the problem setup. Most notable distinguishing features are the online versus batch nature of data and the representative versus discriminative nature of the dictionaries. In this paper, several distributed dictionary learning algorithms that are designed to tackle different problem setups are reviewed. One of these algorithms is cloud K-SVD, which solves the dictionary learning problem for batch data in distributed settings. One distinguishing feature of cloud K-SVD is that it has been shown to converge to its centralized counterpart, namely, the K-SVD solution. On the other hand, no such guarantees are provided for other distributed dictionary learning algorithms. Convergence of cloud K-SVD to the centralized K-SVD solution means problems that are solvable by K-SVD in centralized settings can now be solved in distributed settings with similar performance. Finally, cloud K-SVD is used as an example to show the advantages that are attainable by deploying distributed dictionary algorithms for real world distributed datasets.

  1. Dictionnaires et encyclopedies: cuvee 89 (Dictionaries and Encyclopedias: Vintage 89).

    ERIC Educational Resources Information Center

    Ibrahim, Amr Helmy

    1989-01-01

    For the first time since its initial publication in 1905, the much-imitated "Petit Larousse" dictionary/reference book has a true competitor in Hachette's "Le Dictionnaire de notre temps", a new dictionary reflecting modern French usage. (MSE)

  2. Using dictionaries to study the mental lexicon.

    PubMed

    Anshen, F; Aronoff, M

    The notion of a mental lexicon has its historical roots in practical reference dictionaries. The distributional analysis of dictionaries provides one means of investigating the structure of the mental lexicon. We review our earlier work with dictionaries, based on a three-way horserace model of lexical access and production, and then present the most recent results of our ongoing analysis of the Oxford English Dictionary, Second Edition on CD-ROM, which traces changes in productivity over time of the English suffixes -ment and -ity, both of which originate in French borrowings. Our results lead us to question the validity of automatic analogy from a set of existing words as the driving force behind morphological productivity. Copyright 1999 Academic Press.

  3. Sensitivity computation of the ell1 minimization problem and its application to dictionary design of ill-posed problems

    NASA Astrophysics Data System (ADS)

    Horesh, L.; Haber, E.

    2009-09-01

    The ell1 minimization problem has been studied extensively in the past few years. Recently, there has been a growing interest in its application for inverse problems. Most studies have concentrated in devising ways for sparse representation of a solution using a given prototype dictionary. Very few studies have addressed the more challenging problem of optimal dictionary construction, and even these were primarily devoted to the simplistic sparse coding application. In this paper, sensitivity analysis of the inverse solution with respect to the dictionary is presented. This analysis reveals some of the salient features and intrinsic difficulties which are associated with the dictionary design problem. Equipped with these insights, we propose an optimization strategy that alleviates these hurdles while utilizing the derived sensitivity relations for the design of a locally optimal dictionary. Our optimality criterion is based on local minimization of the Bayesian risk, given a set of training models. We present a mathematical formulation and an algorithmic framework to achieve this goal. The proposed framework offers the design of dictionaries for inverse problems that incorporate non-trivial, non-injective observation operators, where the data and the recovered parameters may reside in different spaces. We test our algorithm and show that it yields improved dictionaries for a diverse set of inverse problems in geophysics and medical imaging.

  4. Password-Only Authenticated Three-Party Key Exchange Proven Secure against Insider Dictionary Attacks

    PubMed Central

    Nam, Junghyun; Choo, Kim-Kwang Raymond

    2014-01-01

    While a number of protocols for password-only authenticated key exchange (PAKE) in the 3-party setting have been proposed, it still remains a challenging task to prove the security of a 3-party PAKE protocol against insider dictionary attacks. To the best of our knowledge, there is no 3-party PAKE protocol that carries a formal proof, or even definition, of security against insider dictionary attacks. In this paper, we present the first 3-party PAKE protocol proven secure against both online and offline dictionary attacks as well as insider and outsider dictionary attacks. Our construct can be viewed as a protocol compiler that transforms any 2-party PAKE protocol into a 3-party PAKE protocol with 2 additional rounds of communication. We also present a simple and intuitive approach of formally modelling dictionary attacks in the password-only 3-party setting, which significantly reduces the complexity of proving the security of 3-party PAKE protocols against dictionary attacks. In addition, we investigate the security of the well-known 3-party PAKE protocol, called GPAKE, due to Abdalla et al. (2005, 2006), and demonstrate that the security of GPAKE against online dictionary attacks depends heavily on the composition of its two building blocks, namely a 2-party PAKE protocol and a 3-party key distribution protocol. PMID:25309956

  5. An Improved Sparse Representation over Learned Dictionary Method for Seizure Detection.

    PubMed

    Li, Junhui; Zhou, Weidong; Yuan, Shasha; Zhang, Yanli; Li, Chengcheng; Wu, Qi

    2016-02-01

    Automatic seizure detection has played an important role in the monitoring, diagnosis and treatment of epilepsy. In this paper, a patient specific method is proposed for seizure detection in the long-term intracranial electroencephalogram (EEG) recordings. This seizure detection method is based on sparse representation with online dictionary learning and elastic net constraint. The online learned dictionary could sparsely represent the testing samples more accurately, and the elastic net constraint which combines the l1-norm and l2-norm not only makes the coefficients sparse but also avoids over-fitting problem. First, the EEG signals are preprocessed using wavelet filtering and differential filtering, and the kernel function is applied to make the samples closer to linearly separable. Then the dictionaries of seizure and nonseizure are respectively learned from original ictal and interictal training samples with online dictionary optimization algorithm to compose the training dictionary. After that, the test samples are sparsely coded over the learned dictionary and the residuals associated with ictal and interictal sub-dictionary are calculated, respectively. Eventually, the test samples are classified as two distinct categories, seizure or nonseizure, by comparing the reconstructed residuals. The average segment-based sensitivity of 95.45%, specificity of 99.08%, and event-based sensitivity of 94.44% with false detection rate of 0.23/h and average latency of -5.14 s have been achieved with our proposed method.

  6. Image fusion via nonlocal sparse K-SVD dictionary learning.

    PubMed

    Li, Ying; Li, Fangyi; Bai, Bendu; Shen, Qiang

    2016-03-01

    Image fusion aims to merge two or more images captured via various sensors of the same scene to construct a more informative image by integrating their details. Generally, such integration is achieved through the manipulation of the representations of the images concerned. Sparse representation plays an important role in the effective description of images, offering a great potential in a variety of image processing tasks, including image fusion. Supported by sparse representation, in this paper, an approach for image fusion by the use of a novel dictionary learning scheme is proposed. The nonlocal self-similarity property of the images is exploited, not only at the stage of learning the underlying description dictionary but during the process of image fusion. In particular, the property of nonlocal self-similarity is combined with the traditional sparse dictionary. This results in an improved learned dictionary, hereafter referred to as the nonlocal sparse K-SVD dictionary (where K-SVD stands for the K times singular value decomposition that is commonly used in the literature), and abbreviated to NL_SK_SVD. The NL_SK_SVD dictionary is applied to image fusion using simultaneous orthogonal matching pursuit. The proposed approach is evaluated with different types of images, and compared with a number of alternative image fusion techniques. The resultant superior fused images using the present approach demonstrate the efficacy of the NL_SK_SVD dictionary in sparse image representation.

  7. A Robust Shape Reconstruction Method for Facial Feature Point Detection.

    PubMed

    Tan, Shuqiu; Chen, Dongyi; Guo, Chenggang; Huang, Zhiqi

    2017-01-01

    Facial feature point detection has been receiving great research advances in recent years. Numerous methods have been developed and applied in practical face analysis systems. However, it is still a quite challenging task because of the large variability in expression and gestures and the existence of occlusions in real-world photo shoot. In this paper, we present a robust sparse reconstruction method for the face alignment problems. Instead of a direct regression between the feature space and the shape space, the concept of shape increment reconstruction is introduced. Moreover, a set of coupled overcomplete dictionaries termed the shape increment dictionary and the local appearance dictionary are learned in a regressive manner to select robust features and fit shape increments. Additionally, to make the learned model more generalized, we select the best matched parameter set through extensive validation tests. Experimental results on three public datasets demonstrate that the proposed method achieves a better robustness over the state-of-the-art methods.

  8. Deviation of Zipf's and Heaps' Laws in Human Languages with Limited Dictionary Sizes

    PubMed Central

    Lü, Linyuan; Zhang, Zi-Ke; Zhou, Tao

    2013-01-01

    Zipf's law on word frequency and Heaps' law on the growth of distinct words are observed in the Indo-European language family, but they do not hold for languages like Chinese, Japanese and Korean. These languages consist of characters, and are of very limited dictionary sizes. Extensive experiments show that: (i) The character frequency distribution follows a power law with exponent close to one, at which the corresponding Zipf's exponent diverges. Indeed, the character frequency decays exponentially in the Zipf's plot. (ii) The number of distinct characters grows with the text length in three stages: It grows linearly in the beginning, then turns to a logarithmical form, and eventually saturates. A theoretical model for writing process is proposed, which embodies the rich-get-richer mechanism and the effects of limited dictionary size. Experiments, simulations and analytical solutions agree well with each other. This work refines the understanding about Zipf's and Heaps' laws in human language systems. PMID:23378896
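
    The two quantities discussed above can be computed directly from a token sequence: the rank-frequency list underlying a Zipf plot, and the count of distinct tokens as the text grows, which underlies a Heaps plot. The following is a minimal sketch with hypothetical function names and a toy English sentence; for character-based languages the tokens would be individual characters, for example list(text).

```python
from collections import Counter
from typing import List, Sequence, Tuple

def rank_frequency(tokens: Sequence[str]) -> List[Tuple[int, int]]:
    """Return (rank, frequency) pairs, most frequent token first (Zipf plot data)."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(counts, start=1))

def vocabulary_growth(tokens: Sequence[str]) -> List[int]:
    """Return the number of distinct tokens seen after each position (Heaps plot data)."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth

tokens = "the cat and the dog and the bird".split()
zipf_data = rank_frequency(tokens)
heaps_data = vocabulary_growth(tokens)
```

    With a limited dictionary size, the growth curve must eventually flatten, because the set of distinct tokens cannot exceed the size of the character inventory.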

  9. Dictionary as Database.

    ERIC Educational Resources Information Center

    Painter, Derrick

    1996-01-01

    Discussion of dictionaries as databases focuses on the digitizing of The Oxford English dictionary (OED) and the use of Standard Generalized Mark-Up Language (SGML). Topics include the creation of a consortium to digitize the OED, document structure, relational databases, text forms, sequence, and discourse. (LRW)

  10. Intelligent Transportation Systems (ITS) logical architecture : volume 3 : data dictionary

    DOT National Transportation Integrated Search

    1982-01-01

    A Guide to Reporting Highway Statistics is a principal part of Federal Highway Administration's comprehensive highway information collection effort. This Guide has two objectives: 1) To serve as a reference to the reporting system that the Federal Hi...

  11. Regularized spherical polar fourier diffusion MRI with optimal dictionary learning.

    PubMed

    Cheng, Jian; Jiang, Tianzi; Deriche, Rachid; Shen, Dinggang; Yap, Pew-Thian

    2013-01-01

    Compressed Sensing (CS) takes advantage of signal sparsity or compressibility and allows superb signal reconstruction from relatively few measurements. Based on CS theory, a suitable dictionary for sparse representation of the signal is required. In diffusion MRI (dMRI), CS methods proposed for reconstruction of diffusion-weighted signal and the Ensemble Average Propagator (EAP) utilize two kinds of Dictionary Learning (DL) methods: 1) Discrete Representation DL (DR-DL), and 2) Continuous Representation DL (CR-DL). DR-DL is susceptible to numerical inaccuracy owing to interpolation and regridding errors in a discretized q-space. In this paper, we propose a novel CR-DL approach, called Dictionary Learning - Spherical Polar Fourier Imaging (DL-SPFI) for effective compressed-sensing reconstruction of the q-space diffusion-weighted signal and the EAP. In DL-SPFI, a dictionary that sparsifies the signal is learned from the space of continuous Gaussian diffusion signals. The learned dictionary is then adaptively applied to different voxels using a weighted LASSO framework for robust signal reconstruction. Compared with the state-of-the-art CR-DL and DR-DL methods proposed by Merlet et al. and Bilgic et al., respectively, our work offers the following advantages. First, the learned dictionary is proved to be optimal for Gaussian diffusion signals. Second, to our knowledge, this is the first work to learn a voxel-adaptive dictionary. The importance of the adaptive dictionary in EAP reconstruction will be demonstrated theoretically and empirically. Third, optimization in DL-SPFI is only performed in a small subspace resided by the SPF coefficients, as opposed to the q-space approach utilized by Merlet et al. We experimentally evaluated DL-SPFI with respect to L1-norm regularized SPFI (L1-SPFI), which uses the original SPF basis, and the DR-DL method proposed by Bilgic et al. The experiment results on synthetic and real data indicate that the learned dictionary produces sparser coefficients than the original SPF basis and results in significantly lower reconstruction error than Bilgic et al.'s method.

  12. Brain tumor classification and segmentation using sparse coding and dictionary learning.

    PubMed

    Salman Al-Shaikhli, Saif Dawood; Yang, Michael Ying; Rosenhahn, Bodo

    2016-08-01

    This paper presents a novel fully automatic framework for multi-class brain tumor classification and segmentation using a sparse coding and dictionary learning method. The proposed framework consists of two steps: classification and segmentation. The classification of the brain tumors is based on brain topology and texture. The segmentation is based on voxel values of the image data. Using K-SVD, two types of dictionaries are learned from the training data and their associated ground truth segmentation: feature dictionary and voxel-wise coupled dictionaries. The feature dictionary consists of global image features (topological and texture features). The coupled dictionaries consist of coupled information: gray scale voxel values of the training image data and their associated label voxel values of the ground truth segmentation of the training data. For quantitative evaluation, the proposed framework is evaluated using different metrics. The segmentation results of the brain tumor segmentation (MICCAI-BraTS-2013) database are evaluated using five different metric scores, which are computed using the online evaluation tool provided by the BraTS-2013 challenge organizers. Experimental results demonstrate that the proposed approach achieves an accurate brain tumor classification and segmentation and outperforms the state-of-the-art methods.

  13. Measurement of negativity bias in personal narratives using corpus-based emotion dictionaries.

    PubMed

    Cohen, Shuki J

    2011-04-01

    This study presents a novel methodology for the measurement of negativity bias using positive and negative dictionaries of emotion words applied to autobiographical narratives. At odds with the cognitive theory of mood dysregulation, previous text-analytical studies have failed to find significant correlation between emotion dictionaries and negative affectivity or dysphoria. In the present study, an a priori list dictionary of emotion words was refined based on the actual use of these words in personal narratives collected from close to 500 college students. Half of the corpus was used to construct, via concordance analysis, the grammatical structures associated with the words in their emotional sense. The second half of the corpus served as a validation corpus. The resulting dictionary ignores words that are not used in their intended emotional sense, including negated emotions, homophones, frozen idioms etc. Correlations of the resulting corpus-based negative and positive emotion dictionaries with self-report measures of negative affectivity were in the expected direction, and were statistically significant, with medium effect size. The potential use of these dictionaries as implicit measures of negativity bias and in the analysis of psychotherapy transcripts is discussed.
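
    The abstract above notes that the refined dictionary ignores emotion words that are negated. A much simpler stand-in for that idea is sketched below: it skips a lexicon word whenever the immediately preceding token is a negator. The word lists, the one-token negation window, and the function name emotion_count are assumptions for illustration; the study itself derived grammatical structures via concordance analysis.

```python
import re
from typing import Set

# Hypothetical negators and emotion word lists (illustration only).
NEGATORS = {"not", "no", "never"}
NEGATIVE = {"sad", "afraid", "angry"}
POSITIVE = {"happy", "glad"}

def emotion_count(text: str, lexicon: Set[str]) -> int:
    """Count lexicon words in text, skipping those directly preceded by a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    count = 0
    for i, token in enumerate(tokens):
        negated = i > 0 and tokens[i - 1] in NEGATORS
        if token in lexicon and not negated:
            count += 1
    return count

narrative = "I was not happy that day. I felt sad and afraid, never glad."
negative_hits = emotion_count(narrative, NEGATIVE)
positive_hits = emotion_count(narrative, POSITIVE)
```

    A plain word count without the negation check would score "happy" and "glad" as positive here, which is the kind of misclassification a corpus-based refinement is meant to reduce.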

  14. Joint seismic data denoising and interpolation with double-sparsity dictionary learning

    NASA Astrophysics Data System (ADS)

    Zhu, Lingchen; Liu, Entao; McClellan, James H.

    2017-08-01

    Seismic data quality is vital to geophysical applications, so that methods of data recovery, including denoising and interpolation, are common initial steps in the seismic data processing flow. We present a method to perform simultaneous interpolation and denoising, which is based on double-sparsity dictionary learning. This extends previous work that was for denoising only. The original double-sparsity dictionary learning algorithm is modified to track the traces with missing data by defining a masking operator that is integrated into the sparse representation of the dictionary. A weighted low-rank approximation algorithm is adopted to handle the dictionary updating as a sparse recovery optimization problem constrained by the masking operator. Compared to traditional sparse transforms with fixed dictionaries that lack the ability to adapt to complex data structures, the double-sparsity dictionary learning method learns the signal adaptively from selected patches of the corrupted seismic data, while preserving compact forward and inverse transform operators. Numerical experiments on synthetic seismic data indicate that this new method preserves more subtle features in the data set without introducing pseudo-Gibbs artifacts when compared to other directional multi-scale transform methods such as curvelets.

  15. Intelligent Diagnosis Method for Rotating Machinery Using Dictionary Learning and Singular Value Decomposition.

    PubMed

    Han, Te; Jiang, Dongxiang; Zhang, Xiaochen; Sun, Yankui

    2017-03-27

    Rotating machinery is widely used in industrial applications. With the trend towards more precise and more critical operating conditions, mechanical failures may easily occur. Condition monitoring and fault diagnosis (CMFD) technology is an effective tool to enhance the reliability and security of rotating machinery. In this paper, an intelligent fault diagnosis method based on dictionary learning and singular value decomposition (SVD) is proposed. First, the dictionary learning scheme is capable of generating an adaptive dictionary whose atoms reveal the underlying structure of raw signals. Essentially, dictionary learning is employed as an adaptive feature extraction method regardless of any prior knowledge. Second, the singular value sequence of the learned dictionary matrix is used to form the feature vector. Since this vector is generally of high dimensionality, a simple and practical principal component analysis (PCA) is applied to reduce dimensionality. Finally, the K-nearest neighbor (KNN) algorithm is adopted for identification and classification of fault patterns automatically. Two experimental case studies are investigated to corroborate the effectiveness of the proposed method in intelligent diagnosis of rotating machinery faults. The comparison analysis validates that the dictionary learning-based matrix construction approach outperforms the mode decomposition-based methods in terms of capacity and adaptability for feature extraction.

  16. Bilevel Model-Based Discriminative Dictionary Learning for Recognition.

    PubMed

    Zhou, Pan; Zhang, Chao; Lin, Zhouchen

    2017-03-01

    Most supervised dictionary learning methods optimize the combinations of reconstruction error, sparsity prior, and discriminative terms. Thus, the learnt dictionaries may not be optimal for recognition tasks. Also, the sparse codes learning models in the training and the testing phases are inconsistent. Besides, without utilizing the intrinsic data structure, many dictionary learning methods only employ the l0 or l1 norm to encode each datum independently, limiting the performance of the learnt dictionaries. We present a novel bilevel model-based discriminative dictionary learning method for recognition tasks. The upper level directly minimizes the classification error, while the lower level uses the sparsity term and the Laplacian term to characterize the intrinsic data structure. The lower level is subordinate to the upper level. Therefore, our model achieves an overall optimality for recognition in that the learnt dictionary is directly tailored for recognition. Moreover, the sparse codes learning models in the training and the testing phases can be the same. We further propose a novel method to solve our bilevel optimization problem. It first replaces the lower level with its Karush-Kuhn-Tucker conditions and then applies the alternating direction method of multipliers to solve the equivalent problem. Extensive experiments demonstrate the effectiveness and robustness of our method.

  17. Compressive sensing of electrocardiogram signals by promoting sparsity on the second-order difference and by using dictionary learning.

    PubMed

    Pant, Jeevan K; Krishnan, Sridhar

    2014-04-01

    A new algorithm for the reconstruction of electrocardiogram (ECG) signals and a dictionary learning algorithm for the enhancement of its reconstruction performance for a class of signals are proposed. The signal reconstruction algorithm is based on minimizing the lp pseudo-norm of the second-order difference of the signal, called the lp(2d) pseudo-norm. The optimization involved is carried out using a sequential conjugate-gradient algorithm. The dictionary learning algorithm uses an iterative procedure wherein a signal reconstruction step and a dictionary update step are repeated until a convergence criterion is satisfied. The signal reconstruction step is implemented by using the proposed signal reconstruction algorithm and the dictionary update step is implemented by using the linear least-squares method. Extensive simulation results demonstrate that the proposed algorithm yields improved reconstruction performance for temporally correlated ECG signals relative to the state-of-the-art lp(1d)-regularized least-squares and Bayesian learning based algorithms. Also for a known class of signals, the reconstruction performance of the proposed algorithm can be improved by applying it in conjunction with a dictionary obtained using the proposed dictionary learning algorithm.
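
    To make the quantity being minimized more concrete, the sketch below evaluates a plain lp pseudo-norm of the second-order difference of a sampled signal. It is an assumption-laden illustration: the abstract does not give the exact definition (for example, whether a smoothing term or a p-th root is used), so this version simply sums |d_i|^p over the second differences d_i = x[i+2] - 2x[i+1] + x[i]. The function names are hypothetical.

```python
def second_order_difference(signal):
    """Return d_i = x[i+2] - 2*x[i+1] + x[i] for every valid index i."""
    return [signal[i + 2] - 2 * signal[i + 1] + signal[i]
            for i in range(len(signal) - 2)]

def lp_2d_pseudo_norm(signal, p=0.5):
    """Sum of |d_i|**p over the second-order differences (no p-th root taken)."""
    return sum(abs(d) ** p for d in second_order_difference(signal))

# A linear ramp has zero second-order difference, so the measure vanishes.
ramp_value = lp_2d_pseudo_norm([0.0, 1.0, 2.0, 3.0], p=0.5)

# An isolated spike produces second differences 1, -2, 1.
spike_value = lp_2d_pseudo_norm([0.0, 0.0, 1.0, 0.0, 0.0], p=1.0)
```

    Smooth, piecewise-linear segments contribute nothing to this measure, which is one way to see why penalizing it favors temporally correlated signals.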

  18. Trying Out a New Dictionary.

    ERIC Educational Resources Information Center

    Benson, Morton; Benson, Evelyn

    1988-01-01

    Describes the BBI Combinatory Dictionary of English and demonstrates its usefulness for advanced learners of English by administering a monolingual completion test, first without a dictionary and then with the BBI, to Hungarian and Russian English teachers. Both groups' scores improved dramatically on the posttest. (LMO)

  19. Induced lexico-syntactic patterns improve information extraction from online medical forums.

    PubMed

    Gupta, Sonal; MacLean, Diana L; Heer, Jeffrey; Manning, Christopher D

    2014-01-01

    To reliably extract two entity types, symptoms and conditions (SCs), and drugs and treatments (DTs), from patient-authored text (PAT) by learning lexico-syntactic patterns from data annotated with seed dictionaries. Despite the increasing quantity of PAT (eg, online discussion threads), tools for identifying medical entities in PAT are limited. When applied to PAT, existing tools either fail to identify specific entity types or perform poorly. Identification of SC and DT terms in PAT would enable exploration of efficacy and side effects for not only pharmaceutical drugs, but also for home remedies and components of daily care. We use SC and DT term dictionaries compiled from online sources to label several discussion forums from MedHelp (http://www.medhelp.org). We then iteratively induce lexico-syntactic patterns corresponding strongly to each entity type to extract new SC and DT terms. Our system is able to extract symptom descriptions and treatments absent from our original dictionaries, such as 'LADA', 'stabbing pain', and 'cinnamon pills'. Our system extracts DT terms with 58-70% F1 score and SC terms with 66-76% F1 score on two forums from MedHelp. We show improvements over MetaMap, OBA, a conditional random field-based classifier, and a previous pattern learning approach. Our entity extractor based on lexico-syntactic patterns is a successful and preferable technique for identifying specific entity types in PAT. To the best of our knowledge, this is the first paper to extract SC and DT entities from PAT. We exhibit learning of informal terms often used in PAT but missing from typical dictionaries.

  20. Psoriasis image representation using patch-based dictionary learning for erythema severity scoring.

    PubMed

    George, Yasmeen; Aldeen, Mohammad; Garnavi, Rahil

    2018-06-01

    Psoriasis is a chronic skin disease which can be life-threatening. Accurate severity scoring helps dermatologists to decide on the treatment. In this paper, we present a semi-supervised computer-aided system for automatic erythema severity scoring in psoriasis images. Firstly, the unsupervised stage includes a novel image representation method. We construct a dictionary, which is then used in the sparse representation for local feature extraction. To acquire the final image representation vector, an aggregation method is exploited over the local features. Secondly, the supervised phase is where various multi-class machine learning (ML) classifiers are trained for erythema severity scoring. Finally, we compare the proposed system with two popular unsupervised feature extractor methods, namely the bag of visual words model (BoVWs) and the AlexNet pretrained model. Root mean square error (RMSE) and F1 score are used as performance measures for the learned dictionaries and the trained ML models, respectively. A psoriasis image set consisting of 676 images is used in this study. Experimental results demonstrate that the use of the proposed procedure can provide a setup where erythema scoring is accurate and consistent. Also, it is revealed that dictionaries with a large number of atoms and small patch sizes yield the best representative erythema severity features. Further, random forest (RF) outperforms other classifiers with F1 score 0.71, followed by support vector machine (SVM) and boosting with 0.66 and 0.64 scores, respectively. Furthermore, the conducted comparative studies confirm the effectiveness of the proposed approach with improvement of 9% and 12% over BoVWs and AlexNet based features, respectively.

  1. The Planetary Data System (PDS) Data Dictionary Tool (LDDTool)

    NASA Astrophysics Data System (ADS)

    Raugh, Anne C.; Hughes, John S.

    2017-10-01

    One of the major design goals of the PDS4 development effort was to provide an avenue for discipline specialists and large data preparers such as mission archivists to extend the core PDS4 Information Model (IM) to include metadata definitions specific to their own contexts. This capability is critical for the Planetary Data System - an archive that deals with a data collection that is diverse along virtually every conceivable axis. Amid such diversity, it is in the best interests of the PDS archive and its users that all extensions to the core IM follow the same design techniques, conventions, and restrictions as the core implementation itself. Notwithstanding, expecting all mission and discipline archivists seeking to define metadata for a new context to acquire expertise in information modeling, model-driven design, ontology, schema formulation, and PDS4 design conventions and philosophy is unrealistic, to say the least. To bridge that expertise gap, the PDS Engineering Node has developed the data dictionary creation tool known as “LDDTool”. This tool incorporates the same software used to maintain and extend the core IM, packaged with an interface that enables a developer to create his contextual information model using the same, open standards-based metadata framework PDS itself uses. Through this interface, the novice dictionary developer has immediate access to the common set of data types and unit classes for defining attributes, and a straightforward method for constructing classes. The more experienced developer, using the same tool, has access to more sophisticated modeling methods like abstraction and extension, and can define very sophisticated validation rules. We present the key features of the PDS Local Data Dictionary Tool, which both supports the development of extensions to the PDS4 IM, and ensures their compatibility with the IM.

  2. Machine-Assisted Indexing of Scientific Research Summaries

    ERIC Educational Resources Information Center

    And Others; Hunt, Bernard L.

    1975-01-01

    At the Smithsonian Science Information Exchange, a computer system indexes word combinations in research summaries, according to a Classifying Dictionary, prior to review by the professional staff. (Author/PF)

  3. Travel time tomography with local image regularization by sparsity constrained dictionary learning

    NASA Astrophysics Data System (ADS)

    Bianco, M.; Gerstoft, P.

    2017-12-01

    We propose a regularization approach for 2D seismic travel time tomography which models small rectangular groups of slowness pixels, within an overall or 'global' slowness image, as sparse linear combinations of atoms from a dictionary. The groups of slowness pixels are referred to as patches and a dictionary corresponds to a collection of functions or 'atoms' describing the slowness in each patch. These functions could for example be wavelets. The patch regularization is incorporated into the global slowness image. The global image models the broad features, while the local patch images incorporate prior information from the dictionary. Further, high resolution slowness within patches is permitted if the travel times from the global estimates support it. The proposed approach is formulated as an algorithm, which is repeated until convergence is achieved: 1) From travel times, find the global slowness image with a minimum energy constraint on the pixel variance relative to a reference. 2) Find the patch level solutions to fit the global estimate as a sparse linear combination of dictionary atoms. 3) Update the reference as the weighted average of the patch level solutions. This approach relies on the redundancy of the patches in the seismic image. Redundancy means that the patches are repetitions of a finite number of patterns, which are described by the dictionary atoms. Redundancy in the earth's structure was demonstrated in previous works in seismics where dictionaries of wavelet functions regularized inversion. We further exploit redundancy of the patches by using dictionary learning algorithms, a form of unsupervised machine learning, to estimate optimal dictionaries from the data in parallel with the inversion. We demonstrate our approach on densely, but irregularly sampled synthetic seismic images.
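
    Step 3 of the algorithm above (updating the reference as a weighted average of patch-level solutions) can be sketched in one dimension. This is only an illustration under simplifying assumptions: the paper works with 2D rectangular patches, and its weighting scheme is not given in the abstract, so uniform weights are used by default. The helper names `extract_patches` and `average_patches` and the sample values are hypothetical.

```python
def extract_patches(signal, width, step=1):
    """Return (start, patch) pairs covering `signal` with overlapping windows."""
    return [(start, signal[start:start + width])
            for start in range(0, len(signal) - width + 1, step)]

def average_patches(patches, length, weights=None):
    """Rebuild a signal of `length` samples as the weighted average of patches."""
    if weights is None:
        weights = [1.0] * len(patches)  # assumed uniform weighting
    total = [0.0] * length
    weight_sum = [0.0] * length
    for (start, patch), weight in zip(patches, weights):
        for offset, value in enumerate(patch):
            total[start + offset] += weight * value
            weight_sum[start + offset] += weight
    # Samples not covered by any patch are left at zero.
    return [t / w if w > 0 else 0.0 for t, w in zip(total, weight_sum)]

# Hypothetical 1D slowness profile standing in for a slowness image.
slowness = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
patches = extract_patches(slowness, width=3)
reference = average_patches(patches, len(slowness))
```

    When the patches are unmodified, averaging reproduces the original profile exactly; in the described method the patches would first be replaced by their sparse dictionary approximations (step 2), so the averaged result becomes a regularized reference.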

  4. The chemical component dictionary: complete descriptions of constituent molecules in experimentally determined 3D macromolecules in the Protein Data Bank

    PubMed Central

    Westbrook, John D.; Shao, Chenghua; Feng, Zukang; Zhuravleva, Marina; Velankar, Sameer; Young, Jasmine

    2015-01-01

    Summary: The Chemical Component Dictionary (CCD) is a chemical reference data resource that describes all residue and small molecule components found in Protein Data Bank (PDB) entries. The CCD contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands and solvent molecules. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, chemical descriptors, systematic chemical names and idealized coordinates. The content, preparation, validation and distribution of this CCD chemical reference dataset are described. Availability and implementation: The CCD is updated regularly in conjunction with the scheduled weekly release of new PDB structure data. The CCD and amino acid variant reference datasets are hosted in the public PDB ftp repository at ftp://ftp.wwpdb.org/pub/pdb/data/monomers/components.cif.gz, ftp://ftp.wwpdb.org/pub/pdb/data/monomers/aa-variants-v1.cif.gz, and its mirror sites, and can be accessed from http://wwpdb.org. Contact: jwest@rcsb.rutgers.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25540181

  5. Pattern matching techniques for correcting low-confidence OCR words in a known context

    NASA Astrophysics Data System (ADS)

    Ford, Glenn; Hauser, Susan E.; Le, Daniel X.; Thoma, George R.

    2000-12-01

    A commercial OCR system is a key component of a system developed at the National Library of Medicine for the automated extraction of bibliographic fields from biomedical journals. This 5-engine OCR system, while exhibiting high performance overall, does not reliably convert very small characters, especially those that are in italics. As a result, the 'affiliations' field that typically contains such characters in most journals, is not captured accurately, and requires a disproportionately high manual input. To correct this problem, dictionaries have been created from words occurring in this field (e.g., university, department, street addresses, names of cities, etc.) from 230,000 articles already processed. The OCR output corresponding to the affiliation field is then matched against these dictionary entries by approximate string-matching techniques, and the ranked matches are presented to operators for verification. This paper outlines the techniques employed and the results of a comparative evaluation.
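
    The matching step described above, in which OCR output is compared against dictionary entries and ranked candidates are shown to an operator, can be sketched with Python's standard `difflib` module. This is a minimal illustration and not the technique evaluated in the paper, whose specific string-matching methods are not named in the abstract. The word list, the function name `rank_candidates`, and the cutoff value are assumptions.

```python
import difflib

def rank_candidates(ocr_word, dictionary, max_candidates=3, cutoff=0.6):
    """Return dictionary entries similar to `ocr_word`, best match first."""
    # Compare case-insensitively, but return entries in their original form.
    lowered = {entry.lower(): entry for entry in dictionary}
    matches = difflib.get_close_matches(
        ocr_word.lower(), list(lowered), n=max_candidates, cutoff=cutoff
    )
    return [lowered[match] for match in matches]

# Hypothetical affiliation-field vocabulary.
affiliation_words = ["University", "Department", "Medicine", "Bethesda", "Maryland"]

# "Univcrsity" imitates an OCR confusion between 'e' and 'c'.
suggestions = rank_candidates("Univcrsity", affiliation_words)
```

    Returning a ranked list rather than a single replacement mirrors the workflow in the abstract, where operators verify the proposed matches.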

  6. Model-based semantic dictionaries for medical language understanding.

    PubMed Central

    Rassinoux, A. M.; Baud, R. H.; Ruch, P.; Trombert-Paviot, B.; Rodrigues, J. M.

    1999-01-01

    Semantic dictionaries are emerging as a major cornerstone towards achieving sound natural language understanding. Indeed, they constitute the main bridge between words and conceptual entities that reflect their meanings. Nowadays, more and more wide-coverage lexical dictionaries are electronically available in the public domain. However, associating a semantic content with lexical entries is not a straightforward task as it is subordinate to the existence of a fine-grained concept model of the treated domain. This paper presents the benefits and pitfalls in building and maintaining multilingual dictionaries, the semantics of which is directly established on an existing concept model. Concrete cases, handled through the GALEN-IN-USE project, illustrate the use of such semantic dictionaries for the analysis and generation of multilingual surgical procedures. PMID:10566333

  7. Talking Shop with Moira Runcie.

    ERIC Educational Resources Information Center

    Bowers, Rogers

    1998-01-01

    Presents an interview with Moira Runcie, Editorial Director for ELT (English Language Teaching) dictionaries at Oxford University Press. The interview focuses on the work of A.S. Hornby in creating the first learners dictionary of English and shows how modern dictionaries draw on his work. (Author/JL)

  8. Cross-label Suppression: a Discriminative and Fast Dictionary Learning with Group Regularization.

    PubMed

    Wang, Xiudong; Gu, Yuantao

    2017-05-10

    This paper addresses image classification through learning a compact and discriminative dictionary efficiently. Given a structured dictionary with each atom (a column in the dictionary matrix) related to some label, we propose a cross-label suppression constraint to enlarge the difference among representations for different classes. Meanwhile, we introduce group regularization to enforce representations to preserve label properties of original samples, meaning the representations for the same class are encouraged to be similar. With the cross-label suppression, we do not resort to the frequently used ℓ0-norm or ℓ1-norm for coding, and obtain computational efficiency without losing the discriminative power for categorization. Moreover, two simple classification schemes are also developed to take full advantage of the learnt dictionary. Extensive experiments on six data sets including face recognition, object categorization, scene classification, texture recognition and sport action categorization are conducted, and the results show that the proposed approach can outperform lots of recently presented dictionary algorithms on both recognition accuracy and computational efficiency.

  9. A Locality-Constrained and Label Embedding Dictionary Learning Algorithm for Image Classification.

    PubMed

    Zhengming Li; Zhihui Lai; Yong Xu; Jian Yang; Zhang, David

    2017-02-01

    Locality and label information of training samples play an important role in image classification. However, previous dictionary learning algorithms do not take the locality and label information of atoms into account together in the learning process, and thus their performance is limited. In this paper, a discriminative dictionary learning algorithm, called the locality-constrained and label embedding dictionary learning (LCLE-DL) algorithm, was proposed for image classification. First, the locality information was preserved using the graph Laplacian matrix of the learned dictionary instead of the conventional one derived from the training samples. Then, the label embedding term was constructed using the label information of atoms instead of the classification error term, which contained discriminating information of the learned dictionary. The optimal coding coefficients derived by the locality-based and label-based reconstruction were effective for image classification. Experimental results demonstrated that the LCLE-DL algorithm can achieve better performance than some state-of-the-art algorithms.

  10. Metacognitive factors that impact student nurse use of point of care technology in clinical settings.

    PubMed

    Kuiper, RuthAnne

    2010-01-01

    The utility of personal digital assistants (PDA) as a point of care resource in health care practice and education presents new challenges for nursing faculty. While there is a plethora of PDA resources available, little is known about the variables that affect student learning and technology adoption. In this study nursing students used PDA software programs which included a drug guide, medical dictionary, laboratory manual and nursing diagnosis manual during acute care clinical experiences. Analyses of comparative reflective statements in student journals about the PDA as an adjunct to other available resources in clinical practice are presented. The benefits of having a PDA included readily available data, validation of thinking processes, and facilitation of care plan re-evaluation. Students reported increased frequency of use and independence. Significant correlations between user perceptions and computer self-efficacy suggested greater confidence in abilities with technology resulting in increased self-awareness and achievement of learning outcomes.

  11. USEEIO Satellite Tables

    EPA Pesticide Factsheets

    These files contain the environmental data as particular emissions or resources associated with BEA sectors that are used in the USEEIO model. They are organized by the emission or resource type, as described in the manuscript. The main files (without SI) show the final satellite tables in the 'Exchanges' sheet, which have emissions or resource use per USD for 2013. The other sheets in these files provide metadata for the creation of the tables, including general information, sources, etc. The 'export' sheet is used for saving the satellite table for csv export. The data dictionary describes the fields in this sheet. The supporting files provide all the details of data transformation and organization for the development of the satellite tables. This dataset is associated with the following publication: Yang, Y., W. Ingwersen, T. Hawkins, and D. Meyer. USEEIO: a New and Transparent United States Environmentally Extended Input-Output Model. JOURNAL OF CLEANER PRODUCTION. Elsevier Science Ltd, New York, NY, USA,

  12. AdjScales: Visualizing Differences between Adjectives for Language Learners

    NASA Astrophysics Data System (ADS)

    Sheinman, Vera; Tokunaga, Takenobu

    In this study we introduce AdjScales, a method for scaling similar adjectives by their strength. It combines existing Web-based computational linguistic techniques in order to automatically differentiate between similar adjectives that describe the same property by strength. Though this kind of information is rarely present in most of the lexical resources and dictionaries, it may be useful for language learners that try to distinguish between similar words. Additionally, learners might gain from a simple visualization of these differences using unidimensional scales. The method is evaluated by comparison with annotation on a subset of adjectives from WordNet by four native English speakers. It is also compared against two non-native speakers of English. The collected annotation is an interesting resource in its own right. This work is a first step toward automatic differentiation of meaning between similar words for language learners. AdjScales can be useful for lexical resource enhancement.

  13. MR PROSTATE SEGMENTATION VIA DISTRIBUTED DISCRIMINATIVE DICTIONARY (DDD) LEARNING.

    PubMed

    Guo, Yanrong; Zhan, Yiqiang; Gao, Yaozong; Jiang, Jianguo; Shen, Dinggang

    2013-01-01

    Segmenting prostate from MR images is important yet challenging. Due to non-Gaussian distribution of prostate appearances in MR images, the popular active appearance model (AAM) has limited performance. Although the newly developed sparse dictionary learning method [1, 2] can model the image appearance in a non-parametric fashion, the learned dictionaries still lack the discriminative power between prostate and non-prostate tissues, which is critical for accurate prostate segmentation. In this paper, we propose to integrate the deformable model with a novel learning scheme, namely the Distributed Discriminative Dictionary (DDD) learning, which can capture image appearance in a non-parametric and discriminative fashion. In particular, three strategies are designed to boost the tissue discriminative power of DDD. First, minimum Redundancy Maximum Relevance (mRMR) feature selection is performed to constrain the dictionary learning in a discriminative feature space. Second, linear discriminant analysis (LDA) is employed to assemble residuals from different dictionaries for optimal separation between prostate and non-prostate tissues. Third, instead of learning the global dictionaries, we learn a set of local dictionaries for the local regions (each with small appearance variations) along the prostate boundary, thus achieving better tissue differentiation locally. In the application stage, DDDs will provide the appearance cues to robustly drive the deformable model onto the prostate boundary. Experiments on 50 MR prostate images show that our method can yield a Dice Ratio of 88% compared to the manual segmentations, and a 7% improvement over the conventional AAM.

  14. Low-rank and Adaptive Sparse Signal (LASSI) Models for Highly Accelerated Dynamic Imaging

    PubMed Central

    Ravishankar, Saiprasad; Moore, Brian E.; Nadakuditi, Raj Rao; Fessler, Jeffrey A.

    2017-01-01

    Sparsity-based approaches have been popular in many applications in image processing and imaging. Compressed sensing exploits the sparsity of images in a transform domain or dictionary to improve image recovery from undersampled measurements. In the context of inverse problems in dynamic imaging, recent research has demonstrated the promise of sparsity and low-rank techniques. For example, the patches of the underlying data are modeled as sparse in an adaptive dictionary domain, and the resulting image and dictionary estimation from undersampled measurements is called dictionary-blind compressed sensing, or the dynamic image sequence is modeled as a sum of low-rank and sparse (in some transform domain) components (L+S model) that are estimated from limited measurements. In this work, we investigate a data-adaptive extension of the L+S model, dubbed LASSI, where the temporal image sequence is decomposed into a low-rank component and a component whose spatiotemporal (3D) patches are sparse in some adaptive dictionary domain. We investigate various formulations and efficient methods for jointly estimating the underlying dynamic signal components and the spatiotemporal dictionary from limited measurements. We also obtain efficient sparsity penalized dictionary-blind compressed sensing methods as special cases of our LASSI approaches. Our numerical experiments demonstrate the promising performance of LASSI schemes for dynamic magnetic resonance image reconstruction from limited k-t space data compared to recent methods such as k-t SLR and L+S, and compared to the proposed dictionary-blind compressed sensing method. PMID:28092528

  15. Low-Rank and Adaptive Sparse Signal (LASSI) Models for Highly Accelerated Dynamic Imaging.

    PubMed

    Ravishankar, Saiprasad; Moore, Brian E; Nadakuditi, Raj Rao; Fessler, Jeffrey A

    2017-05-01

    Sparsity-based approaches have been popular in many applications in image processing and imaging. Compressed sensing exploits the sparsity of images in a transform domain or dictionary to improve image recovery from undersampled measurements. In the context of inverse problems in dynamic imaging, recent research has demonstrated the promise of sparsity and low-rank techniques. For example, the patches of the underlying data are modeled as sparse in an adaptive dictionary domain, and the resulting image and dictionary estimation from undersampled measurements is called dictionary-blind compressed sensing, or the dynamic image sequence is modeled as a sum of low-rank and sparse (in some transform domain) components (L+S model) that are estimated from limited measurements. In this work, we investigate a data-adaptive extension of the L+S model, dubbed LASSI, where the temporal image sequence is decomposed into a low-rank component and a component whose spatiotemporal (3D) patches are sparse in some adaptive dictionary domain. We investigate various formulations and efficient methods for jointly estimating the underlying dynamic signal components and the spatiotemporal dictionary from limited measurements. We also obtain efficient sparsity penalized dictionary-blind compressed sensing methods as special cases of our LASSI approaches. Our numerical experiments demonstrate the promising performance of LASSI schemes for dynamic magnetic resonance image reconstruction from limited k-t space data compared to recent methods such as k-t SLR and L+S, and compared to the proposed dictionary-blind compressed sensing method.

  16. 49 CFR Appendix B to Part 604 - Reasons for Removal

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... honest mistake. Black's Law Dictionary, Revised Fourth Edition, West Publishing Company, St. Paul, Minn... performing it. Black's Law Dictionary, Revised Fourth Edition, West Publishing Company, St. Paul, Minn., 1968... force. In addition, no other policy of insurance has taken its place. Black's Law Dictionary, Revised...

  17. 49 CFR Appendix B to Part 604 - Reasons for Removal

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... honest mistake. Black's Law Dictionary, Revised Fourth Edition, West Publishing Company, St. Paul, Minn... performing it. Black's Law Dictionary, Revised Fourth Edition, West Publishing Company, St. Paul, Minn., 1968... force. In addition, no other policy of insurance has taken its place. Black's Law Dictionary, Revised...

  18. 49 CFR Appendix B to Part 604 - Reasons for Removal

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... honest mistake. Black's Law Dictionary, Revised Fourth Edition, West Publishing Company, St. Paul, Minn... performing it. Black's Law Dictionary, Revised Fourth Edition, West Publishing Company, St. Paul, Minn., 1968... force. In addition, no other policy of insurance has taken its place. Black's Law Dictionary, Revised...

  19. Ahtna Athabaskan Dictionary.

    ERIC Educational Resources Information Center

    Kari, James, Ed.

    This dictionary of Ahtna, a dialect of the Athabaskan language family, is the first to integrate all morphemes into a single alphabetically arranged section of main entries, with verbs arranged according to a theory of Ahtna (and Athabascan) verb theme categories. An introductory section details dictionary format conventions used, presents a brief…

  20. A Novel Approach to Creating Disambiguated Multilingual Dictionaries

    ERIC Educational Resources Information Center

    Boguslavsky, Igor; Cardenosa, Jesus; Gallardo, Carolina

    2009-01-01

    Multilingual lexicons are needed in various applications, such as cross-lingual information retrieval, machine translation, and some others. Often, these applications suffer from the ambiguity of dictionary items, especially when an intermediate natural language is involved in the process of the dictionary construction, since this language adds…

  1. 49 CFR Appendix B to Part 604 - Reasons for Removal

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... honest mistake. Black's Law Dictionary, Revised Fourth Edition, West Publishing Company, St. Paul, Minn... performing it. Black's Law Dictionary, Revised Fourth Edition, West Publishing Company, St. Paul, Minn., 1968... force. In addition, no other policy of insurance has taken its place. Black's Law Dictionary, Revised...

  2. The Lexicographic Treatment of Color Terms

    ERIC Educational Resources Information Center

    Williams, Krista

    2014-01-01

    This dissertation explores the main question, "What are the issues involved in the definition and translation of color terms in dictionaries?" To answer this question, I examined color term definitions in monolingual dictionaries of French and English, and color term translations in bilingual dictionaries of French paired with nine…

  3. 21 CFR 701.3 - Designation of ingredients.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ....) Cosmetic Ingredient Dictionary, Second Ed., 1977 (available from the Cosmetic, Toiletry and Fragrance... revised monographs are published in supplements to this dictionary edition by July 18, 1980. Acid Black 2.../federal_register/code_of_federal_regulations/ibr_locations.html. (v) USAN and the USP dictionary of drug...

  4. Overcoming complexities: Damage detection using dictionary learning framework

    NASA Astrophysics Data System (ADS)

    Alguri, K. Supreet; Melville, Joseph; Deemer, Chris; Harley, Joel B.

    2018-04-01

    For in situ damage detection, guided wave structural health monitoring systems have been widely researched due to their ability to evaluate large areas and to detect many types of damage. These systems often evaluate structural health by recording initial baseline measurements from a pristine (i.e., undamaged) test structure and then comparing later measurements with that baseline. Yet, it is not always feasible to have a pristine baseline. As an alternative, substituting the baseline with data from a surrogate (nearly identical and pristine) structure is a logical option. While effective in some circumstances, surrogate data is often still a poor substitute for pristine baseline measurements due to minor differences between the structures. To overcome this challenge, we present a dictionary learning framework to adapt surrogate baseline data to better represent an undamaged test structure. We compare the performance of our framework with two other surrogate-based damage detection strategies: (1) using raw surrogate data for comparison and (2) using sparse wavenumber analysis, a precursor to our framework for improving the surrogate data. We apply our framework to guided wave data from two 108 mm by 108 mm aluminum plates. With 20 measurements, we show that our dictionary learning framework achieves a 98% accuracy, raw surrogate data achieves a 92% accuracy, and sparse wavenumber analysis achieves a 57% accuracy.

  5. Expert system for automatically correcting OCR output

    NASA Astrophysics Data System (ADS)

    Taghva, Kazem; Borsack, Julie; Condit, Allen

    1994-03-01

    This paper describes a new expert system for automatically correcting errors made by optical character recognition (OCR) devices. The system, which we call the post-processing system, is designed to improve the quality of text produced by an OCR device in preparation for subsequent retrieval from an information system. The system is composed of numerous parts: an information retrieval system, an English dictionary, a domain-specific dictionary, and a collection of algorithms and heuristics designed to correct as many OCR errors as possible. For the remaining errors that cannot be corrected, the system passes them on to a user-level editing program. This post-processing system can be viewed as part of a larger system that would streamline the steps of taking a document from its hard copy form to its usable electronic form, or it can be considered a stand-alone system for OCR error correction. An earlier version of this system has been used to process approximately 10,000 pages of OCR-generated text. Among the OCR errors discovered by this version, about 87% were corrected. We implement numerous new parts of the system, test this new version, and present the results.
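    The dictionary lookup at the core of such a post-processor can be sketched as follows. This is a minimal illustration, not the authors' system: the two small lexicons, the function name `correct_tokens`, and the similarity cutoff are all assumptions, and the standard-library `difflib` matcher stands in for the paper's unspecified algorithms and heuristics. Tokens with no close dictionary entry are collected separately, mirroring the hand-off to a user-level editor.

```python
import difflib

def correct_tokens(tokens, lexicon, cutoff=0.8):
    """Replace each out-of-lexicon token with its closest lexicon entry.

    Returns the corrected token list and the tokens that could not be fixed
    (these would be passed on to a human editor).
    """
    corrected, unresolved = [], []
    for token in tokens:
        word = token.lower()
        if word in lexicon:
            corrected.append(token)  # already a known word, keep as-is
            continue
        matches = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
        if matches:
            corrected.append(matches[0])  # best dictionary candidate
        else:
            corrected.append(token)  # leave untouched, flag for review
            unresolved.append(token)
    return corrected, unresolved

# Hypothetical lexicons: a general English list plus a domain-specific list.
ENGLISH = {"the", "system", "corrects", "errors", "in", "text"}
DOMAIN = {"ocr", "retrieval"}
LEXICON = sorted(ENGLISH | DOMAIN)

tokens = "the systcm corrects errors in OCR text qzxv".split()
corrected, unresolved = correct_tokens(tokens, LEXICON)
print(corrected)
print(unresolved)
```

    Keeping the unresolved tokens in a separate list, rather than guessing, reflects the abstract's division of labor between automatic correction and manual editing.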

  6. The Use of Electronic Dictionaries for Pronunciation Practice by University EFL Students

    ERIC Educational Resources Information Center

    Metruk, Rastislav

    2017-01-01

    This paper attempts to explore how Slovak learners of English use electronic dictionaries with regard to pronunciation practice and improvement. A total of 24 Slovak university students (subjects) completed a questionnaire which contained pronunciation-related questions in connection with the use of electronic dictionaries. The questions primarily…

  7. Dictionary of Multicultural Education.

    ERIC Educational Resources Information Center

    Grant, Carl A., Ed.; Ladson-Billings, Gloria, Ed.

    The focus of this dictionary is the meanings and perspectives of various terms that are used in multicultural education. Contributors have often addressed the literal meanings of words and terms as well as contextual meanings and examples that helped create those meanings. Like other dictionaries, this one is arranged alphabetically, but it goes…

  8. Sparse Representation Based Classification with Structure Preserving Dimension Reduction

    DTIC Science & Technology

    2014-03-13

    dictionary learning [39] used stochastic approximations to update dictionary with a large data set. Laplacian score dictionary ( LSD ) [58], which is based on...vol. 4. 2003. p. 864–7. 47. Shaw B, Jebara T. Structure preserving embedding. In: The 26th annual international conference on machine learning, ICML

  9. Dictionary of Marketing Terms.

    ERIC Educational Resources Information Center

    Everhardt, Richard M.

    A listing of words and definitions compiled from more than 10 college and high school textbooks is presented in this dictionary of marketing terms. Over 1,200 entries of terms used in retailing, wholesaling, economics, and investments are included. This dictionary was designed to aid both instructors and students to better understand the…

  10. EFL Students' "Yahoo!" Online Bilingual Dictionary Use Behavior

    ERIC Educational Resources Information Center

    Tseng, Fan-ping

    2009-01-01

    This study examined 38 EFL senior high school students' "Yahoo!" online dictionary look-up behavior. In a language laboratory, the participants read an article on a reading sheet, underlined any words they did not know, looked up their unknown words in "Yahoo!" online bilingual dictionary, and wrote down the definitions of…

  11. Binukid Dictionary.

    ERIC Educational Resources Information Center

    Otanes, Fe T., Ed.; Wrigglesworth, Hazel

    1992-01-01

    The dictionary of Binukid, a language spoken in the Bukidnon province of the Philippines, is intended as a tool for students of Binukid and for native Binukid-speakers interested in learning English. A single dialect was chosen for this work. The dictionary is introduced by notes on Binukid grammar, including basic information about phonology and…

  12. Learning the Language of Difference: The Dictionary in the High School.

    ERIC Educational Resources Information Center

    Willinsky, John

    1987-01-01

    Reports on dictionaries' power to misrepresent gender. Examines the definitions of three terms (clitoris, penis, and vagina) in eight leading high school dictionaries. Concludes that the absence of certain female gender-related terms represents another instance of institutionalized silence about the experience of women. (MM)

  13. 75 FR 22805 - Federal Travel Regulation; Relocation Allowances; Standard Data Dictionary for Collection of...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-30

    ... GENERAL SERVICES ADMINISTRATION [Proposed GSA Bulletin FTR 10-XXX; Docket 2010-0009; Sequence 1] Federal Travel Regulation; Relocation Allowances; Standard Data Dictionary for Collection of Transaction... GSA is posting online a proposed FTR bulletin that contains the data dictionary that large Federal...

  14. Defining Moments \\ di-'fi-ning 'mo-mnts \\

    ERIC Educational Resources Information Center

    Kilman, Carrie

    2012-01-01

    Children encounter new words every day. Although dictionaries designed for young readers can help students explore and experiment with language, it turns out many mainstream children's dictionaries fail to accurately describe the world in which many students live. The challenges to children's dictionary publishers can be steep. First, there is the…

  15. Getting the Most out of the Dictionary

    ERIC Educational Resources Information Center

    Marckwardt, Albert H.

    2012-01-01

    The usefulness of the dictionary as a reliable source of information for word meanings, spelling, and pronunciation is widely recognized. But even in these obvious matters, the information that the dictionary has to offer is not always accurately interpreted. With respect to pronunciation there seem to be two general pitfalls: (1) the…

  16. A dictionary of commonly used terms and terminologies in nonwovens

    USDA-ARS's Scientific Manuscript database

    A need for a comprehensive dictionary of cotton was assessed by the International Cotton Advisory Committee (ICAC), Washington, DC. The ICAC has selected the topics (from the fiber to fabric) to be covered in the dictionary. The ICAC has invited researchers/scientists from across the globe, to compi...

  17. 78 FR 68343 - Homeownership Counseling Organizations Lists Interpretive Rule

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-14

    ... into their definitional meanings, according to the Data Dictionary,\\7\\ to ensure clarity. This will be... dictionary for the Field ``Services'' can be found at http://data.hud.gov/Housing_Counselor/getServices , and a data dictionary for ``Languages'' can be found at http://data.hud.gov/Housing_Counselor/get...

  18. Measurement of Negativity Bias in Personal Narratives Using Corpus-Based Emotion Dictionaries

    ERIC Educational Resources Information Center

    Cohen, Shuki J.

    2011-01-01

    This study presents a novel methodology for the measurement of negativity bias using positive and negative dictionaries of emotion words applied to autobiographical narratives. At odds with the cognitive theory of mood dysregulation, previous text-analytical studies have failed to find significant correlation between emotion dictionaries and…

  19. A Proposal To Develop the Axiological Aspect in Onomasiological Dictionaries.

    ERIC Educational Resources Information Center

    Felices Lago, Angel Miguel

    It is argued that English dictionaries currently provide evaluative information in addition to descriptive information about the words they contain, and that this aspect of dictionaries should be developed and expanded on. First, the historical background and distribution of the axiological parameter in English-language onomasiological…

  20. Earliest English Definitions of Anaisthesia and Anaesthesia.

    PubMed

    Haridas, Rajesh P

    2017-11-01

    The earliest identified English definition of the word anaisthesia was discovered in the first edition (1684) of A Physical Dictionary, an English translation of Steven Blankaart's medical dictionary, Lexicon Medicum Graeco-Latinum. This definition was almost certainly the source of the definition of anaesthesia which appeared in Dictionarium Anglo-Britannicum (1708), a general-purpose English dictionary compiled by the lexicographer John Kersey. The words anaisthesia and anaesthesia have not been identified in English medical or surgical publications that antedate the earliest English dictionaries in which they are known to have been defined.

  1. Histopathological Image Classification using Discriminative Feature-oriented Dictionary Learning

    PubMed Central

    Vu, Tiep Huu; Mousavi, Hojjat Seyed; Monga, Vishal; Rao, Ganesh; Rao, UK Arvind

    2016-01-01

    In histopathological image analysis, feature extraction for classification is a challenging task due to the diversity of histology features suitable for each problem as well as the presence of rich geometrical structures. In this paper, we propose an automatic feature discovery framework via learning class-specific dictionaries and present a low-complexity method for classification and disease grading in histopathology. Essentially, our Discriminative Feature-oriented Dictionary Learning (DFDL) method learns class-specific dictionaries such that under a sparsity constraint, the learned dictionaries allow representing a new image sample parsimoniously via the dictionary corresponding to the class identity of the sample. At the same time, the dictionary is designed to be poorly capable of representing samples from other classes. Experiments on three challenging real-world image databases: 1) histopathological images of intraductal breast lesions, 2) mammalian kidney, lung and spleen images provided by the Animal Diagnostics Lab (ADL) at Pennsylvania State University, and 3) brain tumor images from The Cancer Genome Atlas (TCGA) database, reveal the merits of our proposal over state-of-the-art alternatives. Moreover, we demonstrate that DFDL exhibits a more graceful decay in classification accuracy against the number of training images, which is highly desirable in practice where generous training is often not available. PMID:26513781

  2. MR fingerprinting reconstruction with Kalman filter.

    PubMed

    Zhang, Xiaodi; Zhou, Zechen; Chen, Shiyang; Chen, Shuo; Li, Rui; Hu, Xiaoping

    2017-09-01

    Magnetic resonance fingerprinting (MR fingerprinting or MRF) is a newly introduced quantitative magnetic resonance imaging technique, which enables simultaneous multi-parameter mapping in a single acquisition with improved time efficiency. The current MRF reconstruction method is based on dictionary matching, which may be limited by the discrete and finite nature of the dictionary and the computational cost associated with dictionary construction, storage and matching. In this paper, we describe a reconstruction method based on a Kalman filter for MRF, which avoids the use of a dictionary to obtain continuous MR parameter measurements. With this Kalman filter framework, the Bloch equation of the inversion-recovery balanced steady state free-precession (IR-bSSFP) MRF sequence was derived to predict signal evolution, and the acquired signal was entered to update the prediction. The algorithm can gradually estimate the accurate MR parameters during the recursive calculation. Single pixel and numeric brain phantom simulations were implemented with the Kalman filter and the results were compared with those from the dictionary matching reconstruction algorithm to demonstrate the feasibility and assess the performance of the Kalman filter algorithm. The results demonstrated that the Kalman filter algorithm is applicable for MRF reconstruction, eliminating the need for a predefined dictionary and obtaining continuous MR parameters in contrast to the dictionary matching algorithm. Copyright © 2017 Elsevier Inc. All rights reserved.
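    The predict-then-update recursion that the abstract describes can be illustrated with a scalar Kalman filter. This is only a sketch under strong assumptions: it estimates a single constant parameter from noisy direct measurements with a linear model, whereas the paper predicts signal evolution from the Bloch equation of an IR-bSSFP sequence. The function name, noise levels, and the simulated "true" value are hypothetical.

```python
import random

def kalman_constant(measurements, meas_var, init_est=0.0, init_var=1e3,
                    process_var=0.0):
    """Recursively estimate a constant scalar from noisy measurements."""
    est, var = init_est, init_var
    history = []
    for z in measurements:
        # Predict: the parameter is modeled as constant, so only the
        # uncertainty changes (it grows by the process noise, if any).
        var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = var / (var + meas_var)
        est += gain * (z - est)
        var *= (1.0 - gain)
        history.append(est)
    return est, var, history

# Simulated data (assumed values): a true parameter observed with Gaussian noise.
rng = random.Random(0)
true_value = 1.2
meas_var = 0.04
measurements = [true_value + rng.gauss(0.0, meas_var ** 0.5) for _ in range(200)]

estimate, variance, _ = kalman_constant(measurements, meas_var)
print(estimate, variance)
```

    Because the estimate is refined continuously with each measurement, the result is not restricted to a grid of precomputed values, which is the property the abstract contrasts with dictionary matching.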

  3. Weakly Supervised Dictionary Learning

    NASA Astrophysics Data System (ADS)

    You, Zeyu; Raich, Raviv; Fern, Xiaoli Z.; Kim, Jinsub

    2018-05-01

    We present a probabilistic modeling and inference framework for discriminative analysis dictionary learning under a weak supervision setting. Dictionary learning approaches have been widely used for tasks such as low-level signal denoising and restoration as well as high-level classification tasks, which can be applied to audio and image analysis. Synthesis dictionary learning aims at jointly learning a dictionary and corresponding sparse coefficients to provide accurate data representation. This approach is useful for denoising and signal restoration, but may lead to sub-optimal classification performance. By contrast, analysis dictionary learning provides a transform that maps data to a sparse discriminative representation suitable for classification. We consider the problem of analysis dictionary learning for time-series data under a weak supervision setting in which signals are assigned with a global label instead of an instantaneous label signal. We propose a discriminative probabilistic model that incorporates both label information and sparsity constraints on the underlying latent instantaneous label signal using cardinality control. We present the expectation maximization (EM) procedure for maximum likelihood estimation (MLE) of the proposed model. To facilitate a computationally efficient E-step, we propose both a chain and a novel tree graph reformulation of the graphical model. The performance of the proposed model is demonstrated on both synthetic and real-world data.

  4. Intelligent Diagnosis Method for Rotating Machinery Using Dictionary Learning and Singular Value Decomposition

    PubMed Central

    Han, Te; Jiang, Dongxiang; Zhang, Xiaochen; Sun, Yankui

    2017-01-01

    Rotating machinery is widely used in industrial applications. With the trend towards more precise and more critical operating conditions, mechanical failures may easily occur. Condition monitoring and fault diagnosis (CMFD) technology is an effective tool to enhance the reliability and security of rotating machinery. In this paper, an intelligent fault diagnosis method based on dictionary learning and singular value decomposition (SVD) is proposed. First, the dictionary learning scheme is capable of generating an adaptive dictionary whose atoms reveal the underlying structure of raw signals. Essentially, dictionary learning is employed as an adaptive feature extraction method that requires no prior knowledge. Second, the singular value sequence of the learned dictionary matrix is used to extract the feature vector. Generally, since the vector is of high dimensionality, a simple and practical principal component analysis (PCA) is applied to reduce dimensionality. Finally, the K-nearest neighbor (KNN) algorithm is adopted for automatic identification and classification of fault patterns. Two experimental case studies are investigated to corroborate the effectiveness of the proposed method in intelligent diagnosis of rotating machinery faults. The comparison analysis validates that the dictionary learning-based matrix construction approach outperforms the mode decomposition-based methods in terms of capacity and adaptability for feature extraction. PMID:28346385
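    The final classification stage of this pipeline can be sketched with a plain K-nearest-neighbor vote. The sketch assumes the earlier stages (dictionary learning, SVD, PCA) have already produced low-dimensional feature vectors; the two-dimensional features, the labels, and the function name `knn_predict` below are hypothetical and are not taken from the paper.

```python
from collections import Counter
import math

def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical reduced feature vectors (e.g. two principal components).
train_features = [
    (0.90, 0.10), (1.00, 0.20), (0.80, 0.15),  # healthy machine
    (0.20, 0.90), (0.10, 1.00), (0.15, 0.80),  # faulty bearing
]
train_labels = ["normal"] * 3 + ["bearing_fault"] * 3

print(knn_predict(train_features, train_labels, (0.85, 0.10)))
print(knn_predict(train_features, train_labels, (0.20, 0.95)))
```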

  5. Supervised dictionary learning for inferring concurrent brain networks.

    PubMed

    Zhao, Shijie; Han, Junwei; Lv, Jinglei; Jiang, Xi; Hu, Xintao; Zhao, Yu; Ge, Bao; Guo, Lei; Liu, Tianming

    2015-10-01

    Task-based fMRI (tfMRI) has been widely used to explore functional brain networks via a predefined stimulus paradigm in the fMRI scan. Traditionally, the general linear model (GLM) has been a dominant approach to detect task-evoked networks. However, GLM focuses on task-evoked or event-evoked brain responses and possibly ignores the intrinsic brain functions. In comparison, dictionary learning and sparse coding methods have attracted much attention recently, and these methods have shown the promise of automatically and systematically decomposing fMRI signals into meaningful task-evoked and intrinsic concurrent networks. Nevertheless, two notable limitations of current data-driven dictionary learning methods are that the prior knowledge of the task paradigm is not sufficiently utilized and that the establishment of correspondences among dictionary atoms in different brains has been challenging. In this paper, we propose a novel supervised dictionary learning and sparse coding method for inferring functional networks from tfMRI data, which combines the advantages of model-driven and data-driven methods. The basic idea is to fix the task stimulus curves as predefined model-driven dictionary atoms and only optimize the other portion of data-driven dictionary atoms. Application of this novel methodology to the publicly available Human Connectome Project (HCP) tfMRI datasets has achieved promising results.

  6. Blind compressive sensing dynamic MRI

    PubMed Central

    Lingala, Sajan Goud; Jacob, Mathews

    2013-01-01

    We propose a novel blind compressive sensing (BCS) framework to recover dynamic magnetic resonance images from undersampled measurements. This scheme models the dynamic signal as a sparse linear combination of temporal basis functions, chosen from a large dictionary. In contrast to classical compressed sensing, the BCS scheme simultaneously estimates the dictionary and the sparse coefficients from the undersampled measurements. Apart from the sparsity of the coefficients, the key difference between the BCS scheme and current low-rank methods is the non-orthogonal nature of the dictionary basis functions. Since the number of degrees of freedom of the BCS model is smaller than that of the low-rank methods, it provides improved reconstructions at high acceleration rates. We formulate the reconstruction as a constrained optimization problem; the objective function is the linear combination of a data consistency term and a sparsity-promoting ℓ1 prior on the coefficients. The Frobenius norm dictionary constraint is used to avoid scale ambiguity. We introduce a simple and efficient majorize-minimize algorithm, which decouples the original criterion into three simpler subproblems. An alternating minimization strategy is used, where we cycle through the minimization of three simpler problems. This algorithm is seen to be considerably faster than approaches that alternate between sparse coding and dictionary estimation, as well as the extension of the K-SVD dictionary learning scheme. The use of the ℓ1 penalty and Frobenius norm dictionary constraint enables the attenuation of insignificant basis functions compared to the ℓ0 norm and column norm constraint assumed in most dictionary learning algorithms; this is especially important since the number of basis functions that can be reliably estimated is restricted by the available measurements. We also observe that the proposed scheme is more robust to local minima compared to the K-SVD method, which relies on greedy sparse coding. Our phase transition experiments demonstrate that the BCS scheme provides much better recovery rates than classical Fourier-based CS schemes, while being only marginally worse than the dictionary-aware setting. Since the overhead in additionally estimating the dictionary is low, this method can be very useful in dynamic MRI applications, where the signal is not sparse in known dictionaries. We demonstrate the utility of the BCS scheme in accelerating contrast-enhanced dynamic data. We observe superior reconstruction performance with the BCS scheme in comparison to existing low-rank and compressed sensing schemes. PMID:23542951

  7. Managing Vocabulary Mapping Services

    PubMed Central

    Che, Chengjian; Monson, Kent; Poon, Kasey B.; Shakib, Shaun C.; Lau, Lee Min

    2005-01-01

    The efficient management and maintenance of large-scale and high-quality vocabulary mapping is an operational challenge. The 3M Health Information Systems (HIS) Healthcare Data Dictionary (HDD) group developed an information management system to provide controlled mapping services, resulting in improved efficiency and quality maintenance. PMID:16779203

  8. Applications for the environment : real-time information synthesis (AERIS) eco-signal operations : operational concept.

    DOT National Transportation Integrated Search

    2002-04-01

    The Logical Architecture is based on a Computer Aided Systems Engineering (CASE) model of the requirements for the flow of data and control through the various functions included in Intelligent Transportation Systems (ITS). Data Dictionary is the com...

  9. Oak Ridge Environmental Information System (OREIS) functional system design document

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Birchfield, T.E.; Brown, M.O.; Coleman, P.R.

    1994-03-01

    The OREIS Functional System Design document provides a detailed functional description of the Oak Ridge Environmental Information System (OREIS). It expands the system requirements defined in the OREIS Phase 1-System Definition Document (ES/ER/TM-34). Documentation of OREIS development is based on the Automated Data Processing System Development Methodology, a Martin Marietta Energy Systems, Inc., procedure written to assist in developing scientific and technical computer systems. This document focuses on the development of the functional design of the user interface, which includes the integration of commercial applications software. The data model and data dictionary are summarized briefly; however, the Data Management Plan for OREIS (ES/ER/TM-39), a companion document to the Functional System Design document, provides the complete data dictionary and detailed descriptions of the requirements for the data base structure. The OREIS system will provide the following functions, which are executed from a Menu Manager: (1) preferences, (2) view manager, (3) macro manager, (4) data analysis (assisted analysis and unassisted analysis), and (5) spatial analysis/map generation (assisted ARC/INFO and unassisted ARC/INFO). Additional functionality includes interprocess communications, which handle background operations of OREIS.

  10. Evaluation of the SYSTRAN Automatic Translation System. Report No. 5.

    ERIC Educational Resources Information Center

    Chaumier, Jacques; And Others

    The Commission of the European Communities has acquired an automatic translation system (SYSTRAN), which has been put into operation on an experimental basis. The system covers translation of English into French and comprises a dictionary for food science and technology containing 25,000 words or inflections and 4,500 expressions. This report…

  11. Review of "A Dictionary of Global Huayu"

    ERIC Educational Resources Information Center

    Li, Rui

    2016-01-01

    As the first Huayu dictionary published by the Commercial Press, "A Dictionary of Global Huayu" (Chinese Language) did pioneering work in many respects. It expanded the influence of Chinese and provided Chinese speakers abroad with a valuable reference book for study and communication. Nevertheless, there are still some demerits. First of all,…

  12. Variant Spellings in Modern American Dictionaries.

    ERIC Educational Resources Information Center

    Emery, Donald W.

    A record of how present-day desk dictionaries are recognizing the existence of variant or secondary spellings for many common English words, this reference list can be used by teachers of English and authors of spelling lists. Originally published in 1958, this revised edition uses two dictionaries not in existence then and the revised editions of…

  13. A Survey of Meaning Discrimination in Selected English/Spanish Dictionaries.

    ERIC Educational Resources Information Center

    Powers, Michael D.

    1985-01-01

    Examines the treatment of sense discrimination in eight Spanish/English English/Spanish bilingual dictionaries and one specialized dictionary. Does this by analyzing 30 words that Torrents des Prats determined have at least nine different sense discriminations from English into Spanish. Larousse was found to be far superior to the others. (SED)

  14. Aleut Dictionary (Unangam Tunudgusii). An Unabridged Lexicon of the Aleutian, Pribilof, and Commander Islands Aleut Language.

    ERIC Educational Resources Information Center

    Bergsland, Knut, Comp.

    This comprehensive dictionary draws on ethnographic and linguistic work of the Aleut language and culture dating to 1745. An introductory section explains the dictionary's format, offers a brief historical survey, and contains notes on Aleut phonology and orthography, dialectal differences and developments, Eskimo-Aleut phonological…

  15. Usage and Efficacy of Electronic Dictionaries for a Language without Word Boundaries

    ERIC Educational Resources Information Center

    Toyoda, Etsuko

    2016-01-01

    There is cumulative evidence suggesting that hyper-glossing facilitates lower-level processing and enhances reading comprehension. There are plentiful studies on electronic dictionaries for English. However, research on e-dictionaries for languages with no boundaries between words is still scarce. The main aim for the current study is to…

  16. Chinese-English Technical Dictionaries. Volume 1, Aviation and Space.

    ERIC Educational Resources Information Center

    Library of Congress, Washington, DC. Aerospace Technology Div.

    The present dictionary is the first of a series of Chinese-English technical dictionaries under preparation by the Aerospace Technology Division of the Library of Congress. The purpose of the series is to provide rapid reference tools for translators, abstractors, and research analysts concerned with scientific and technical materials published in…

  17. Chinese-English and English-Chinese Dictionaries in the Library of Congress. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Dunn, Robert, Comp.

    An annotated bibliography of the Library of Congress' Chinese-English holdings on all subjects, as well as certain polyglot and multilingual dictionaries with English and Chinese entries. Included are general, encyclopaedic and comprehensive dictionaries; vocabularies; word lists; syllabaries; lists of place names, personal names, nomenclature,…

  18. The New Oxford Picture Dictionary, English/Navajo Edition.

    ERIC Educational Resources Information Center

    Parnwell, E. C.

    This picture dictionary illustrates over 2,400 words. The dictionary is organized thematically, beginning with topics most useful for the survival needs of students in an English speaking country. However, teachers may adapt the order to reflect the needs of their students. Verbs are included on separate pages, but within topic areas in which they…

  19. The Oxford Picture Dictionary. Beginning Workbook.

    ERIC Educational Resources Information Center

    Fuchs, Marjorie

    The beginning workbook of the Oxford Picture Dictionary is in full color and offers vocabulary reinforcement activities that correspond page for page with the dictionary. Clear and simple instructions with examples make it suitable for independent use in the classroom or at home. The workbook has up-to-date art and graphics, explaining over 3700…

  20. English-Dari Dictionary.

    ERIC Educational Resources Information Center

    Peace Corps, Washington, DC.

    This 7,000-word dictionary is designed for English speakers learning Dari. The dictionary consists of two parts, the first a reference to find words easily translatable from one language to the other, the second a list of idioms and short phrases commonly used in everyday conversation, yet not readily translatable. Many of these entries have no…

  1. Linguistic and Cultural Strategies in ELT Dictionaries

    ERIC Educational Resources Information Center

    Corrius, Montse; Pujol, Didac

    2010-01-01

    There are three main types of ELT dictionaries: monolingual, bilingual, and bilingualized. Each type of dictionary, while having its own advantages, also hinders the learning of English as a foreign language and culture in so far as it is written from a homogenizing (linguistic- and culture-centric) perspective. This paper presents a new type of…

  2. Dictionaries of African Sign Languages: An Overview

    ERIC Educational Resources Information Center

    Schmaling, Constanze H.

    2012-01-01

    This article gives an overview of dictionaries of African sign languages that have been published to date, most of which have not been widely distributed. After an introduction to the field of sign language lexicography and a discussion of some of the obstacles that authors of sign language dictionaries face in general, I will show problems…

  3. Supporting Social Studies Reading Comprehension with an Electronic Pop-Up Dictionary

    ERIC Educational Resources Information Center

    Fry, Sara Winstead; Gosky, Ross

    2008-01-01

    This study investigated how middle school students' comprehension was impacted by reading social studies texts online with a pop-up dictionary function for every word in the text. A quantitative counterbalance design was used to determine how 129 middle school students' reading comprehension test scores for the pop-up dictionary reading differed…

  4. A Dictionary of Hindi Verbal Expressions (Hindi-English). Final Report.

    ERIC Educational Resources Information Center

    Bahl, Kali Charan, Comp.

    This dictionary covers approximately 28,277 verbal expressions in modern standard Hindi and their rendered English equivalents. The study lists longer verbal expressions which are generally matched by single verbs in English. The lexicographer notes that the majority of entries in this dictionary do not appear in their present form in most other…

  5. Aspects of Sentence Retrieval

    DTIC Science & Technology

    2006-09-01

    English-to-Arabic-to-English Lexicon ... 6.2.4 A WordNet Probabilistic Dictionary ... 4.1 Examples of “translations” of the terms “zebra” and “galileo” from a translation dictionary trained... 6.13 Comparing the use of WordNet as a translation table, and as a dictionary during the training of a translation table

  6. Bilingualised Dictionaries: How Learners Really Use Them.

    ERIC Educational Resources Information Center

    Laufer, Batia; Kimmel, Michal

    1997-01-01

    Seventy native Hebrew-speaking English-as-a-Second-Language students participated in a study that investigated what part of an entry second-language learners read when they look up an unfamiliar word in a bilingualised dictionary: the monolingual, the bilingual, or both. Results suggest the bilingualised dictionary is very effective because it is…

  7. Dictionnaires du francais langue etrangere (Dictionaries for French as a Second Language).

    ERIC Educational Resources Information Center

    Gross, Gaston; Ibrahim, Amr

    1981-01-01

    Examines the purposes served by native language dictionaries as an introduction to the review of three monolingual French dictionaries for foreigners. Devotes particular attention to the most recent, the "Dictionnaire du francais langue etrangere", published by Larousse. Stresses the characteristics that are considered desirable for this type of…

  8. Teaching WP and DP with CP/M-Based Microcomputers.

    ERIC Educational Resources Information Center

    Bartholome, Lloyd W.

    1982-01-01

    The use of CP/M (Control Program Monitor)-based microcomputers in teaching word processing and data processing is explored. The system's advantages, variations, dictionary software, and future are all discussed. (CT)

  9. Aveiro method in reproducing kernel Hilbert spaces under complete dictionary

    NASA Astrophysics Data System (ADS)

    Mai, Weixiong; Qian, Tao

    2017-12-01

    The Aveiro Method is a sparse representation method in reproducing kernel Hilbert spaces (RKHS) that gives orthogonal projections as linear combinations of reproducing kernels over uniqueness sets. It suffers, however, from the need to determine uniqueness sets in the underlying RKHS. In general spaces, uniqueness sets are not easy to identify, and the convergence speed of the Aveiro Method is a further difficulty. To avoid those difficulties we propose a new Aveiro Method based on a dictionary and the matching pursuit idea. In fact, we do more: the new Aveiro Method is related to the recently proposed, so-called Pre-Orthogonal Greedy Algorithm (P-OGA), which involves completion of a given dictionary. The new method is called Aveiro Method Under Complete Dictionary (AMUCD). The complete dictionary consists of all directional derivatives of the underlying reproducing kernels. We show that, under the boundary vanishing condition, which is available for the classical Hardy and Paley-Wiener spaces, the complete dictionary enables an efficient expansion of any given element in the Hilbert space. The proposed method reveals new and advanced aspects in both the Aveiro Method and the greedy algorithm.

  10. Assigning categorical information to Japanese medical terms using MeSH and MEDLINE.

    PubMed

    Onogi, Yuzo

    2007-01-01

    This paper reports on the assignment of MeSH (Medical Subject Headings) categories to Japanese terms in an English-Japanese dictionary using the titles and abstracts of articles indexed in MEDLINE. In a previous study, 30,000 of 80,000 terms in the dictionary were mapped to MeSH terms by normalized comparison. It was reasoned that if the remaining dictionary terms appeared in MEDLINE-indexed articles that are indexed using MeSH terms, then relevancies between the dictionary terms and MeSH terms could be calculated, and thus MeSH categories assigned. This study compares two approaches for calculating the weight matrix. One is the TF*IDF method and the other uses the inner product of two weight matrices. About 20,000 additional dictionary terms were identified in MEDLINE-indexed articles published between 2000 and 2004. The precision and recall of these algorithms were evaluated separately for MeSH terms and non-MeSH terms. Unfortunately, the precision and recall of the algorithms were not good, but this method will help with manual assignment of MeSH categories to dictionary terms.
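
    The abstract above names TF*IDF as one of the two weighting schemes but does not give its exact form. The following is a minimal sketch of one common variant (relative term frequency times the log of inverse document frequency). The function name `tf_idf` and the toy token lists are illustrative assumptions, not data or code from the study.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical toy corpus standing in for tokenized titles and abstracts.
docs = [
    ["liver", "biopsy", "liver"],
    ["liver", "transplant"],
    ["kidney", "biopsy"],
]
weights = tf_idf(docs)
```

    In this sketch a term that occurs in every document receives weight zero, which is why rare terms dominate the ranking.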

  11. Double-dictionary matching pursuit for fault extent evaluation of rolling bearing based on the Lempel-Ziv complexity

    NASA Astrophysics Data System (ADS)

    Cui, Lingli; Gong, Xiangyang; Zhang, Jianyu; Wang, Huaqing

    2016-12-01

    The quantitative diagnosis of rolling bearing fault severity is particularly crucial to making a proper maintenance decision. Targeting the fault features of rolling bearings, this paper proposes a novel double-dictionary matching pursuit (DDMP) for fault extent evaluation based on the Lempel-Ziv complexity (LZC) index. To match the features of rolling bearing faults, an impulse time-frequency dictionary and a modulation dictionary are constructed from parameterized function models to form the double-dictionary. A novel matching pursuit method is then proposed based on the new double-dictionary. Rolling bearing vibration signals with different fault sizes are decomposed and reconstructed by the DDMP. After the noise is reduced and the signals are reconstructed, the LZC index is introduced to evaluate the fault extent. Applications of this method to experimental fault signals from bearing outer and inner races with different degrees of damage show that the proposed method can effectively evaluate the fault extent.
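
    The abstract does not say how the LZC index is computed. As a rough illustration only, the sketch below counts phrases in the classic Lempel-Ziv (1976) parsing of a binary string, after binarizing a signal around its mean. Both the thresholding rule and the function names `binarize` and `lempel_ziv_complexity` are assumptions made for this example, not details from the paper.

```python
def binarize(signal):
    """Map a numeric signal to a '0'/'1' string by thresholding at its mean (assumed rule)."""
    mean = sum(signal) / len(signal)
    return "".join("1" if x > mean else "0" for x in signal)

def lempel_ziv_complexity(sequence):
    """Count the phrases in a left-to-right Lempel-Ziv (1976) parsing of a string."""
    n = len(sequence)
    i, count = 0, 0
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from the text
        # that precedes its last symbol (overlap with itself is allowed).
        while i + length <= n and sequence[i:i + length] in sequence[:i + length - 1]:
            length += 1
        count += 1      # the phrase ends at the first symbol that breaks the copy
        i += length
    return count

print(lempel_ziv_complexity("0001101001000101"))  # 6 phrases: 0|001|10|100|1000|101
```

    A constant or strictly periodic sequence yields a very small count, while an irregular sequence yields a larger one, which is the property a severity index of this kind relies on.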

  12. Improving the dictionary lookup approach for disease normalization using enhanced dictionary and query expansion

    PubMed Central

    Jonnagaddala, Jitendra; Jue, Toni Rose; Chang, Nai-Wen; Dai, Hong-Jie

    2016-01-01

    The rapidly growing biomedical literature calls for an automatic approach to the recognition and normalization of disease mentions in order to increase the precision and effectiveness of disease-based information retrieval. A variety of methods have been proposed to deal with the problem of disease named entity recognition and normalization. Among all the proposed methods, conditional random fields (CRFs) and the dictionary lookup method are widely used for named entity recognition and normalization, respectively. We herein developed a CRF-based model to allow automated recognition of disease mentions, and studied the effect of various techniques in improving the normalization results based on the dictionary lookup approach. The dataset from the BioCreative V CDR track was used to report the performance of the developed normalization methods and to compare them with other existing dictionary lookup based normalization methods. The best configuration achieved an F-measure of 0.77 for disease normalization, which outperformed the best dictionary lookup based baseline method studied in this work by an F-measure of 0.13. Database URL: https://github.com/TCRNBioinformatics/DiseaseExtract PMID:27504009
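
    The general idea of dictionary lookup with query expansion can be sketched as follows. This is not the authors' implementation: the dictionary entries, concept identifiers, abbreviation table, and synonym table are all made up for illustration, and the real system's expansion techniques are not described in the abstract.

```python
import re

# Hypothetical toy dictionary: normalized disease name -> made-up concept ID.
DISEASE_DICT = {
    "hypertension": "ID_HTN",
    "myocardial infarction": "ID_MI",
    "renal failure": "ID_RF",
}
# Hypothetical expansion resources.
ABBREVIATIONS = {"mi": "myocardial infarction", "htn": "hypertension"}
SYNONYMS = {"kidney": "renal", "heart attack": "myocardial infarction"}

def clean(mention):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", mention.lower()).split())

def expand(query):
    """Yield the query itself, then abbreviation and synonym rewrites of it."""
    yield query
    if query in ABBREVIATIONS:
        yield ABBREVIATIONS[query]
    for source, target in SYNONYMS.items():
        pattern = rf"\b{re.escape(source)}\b"
        if re.search(pattern, query):
            yield re.sub(pattern, target, query)

def lookup(mention):
    """Return the concept ID of the first variant found in the dictionary, else None."""
    for variant in expand(clean(mention)):
        if variant in DISEASE_DICT:
            return DISEASE_DICT[variant]
    return None
```

    For example, `lookup("Kidney failure.")` misses on the exact string but succeeds once the synonym rewrite produces "renal failure".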

  13. A dictionary without definitions: romanticist science in the production and presentation of the Grimm brothers' German dictionary, 1838-1863.

    PubMed

    Kistner, Kelly

    2014-12-01

    Between 1838 and 1863 the Grimm brothers led a collaborative research project to create a new kind of dictionary documenting the history of the German language. They imagined the work would present a scientific account of linguistic cohesiveness and strengthen German unity. However, their dictionary volumes (most of which were arranged and written by Jacob Grimm) would be variously criticized for their idiosyncratic character and ultimately seen as a poor, and even prejudicial, piece of scholarship. This paper argues that such criticisms may reflect a misunderstanding of the dictionary. I claim it can be best understood as an artifact of romanticist science and its epistemological privileging of subjective perception coupled with a deeply-held faith in inter-subjective congruence. Thus situated, it is a rare and detailed case of Romantic ideas and ideals applied to the scientific study of social artifacts. Moreover, the dictionary's organization, reception, and legacy provide insights into the changing landscape of scientific practice in Germany, showcasing the difficulties of implementing a romanticist vision of science amidst widening gaps between the public and professionals, generalists and specialists.

  14. A data dictionary approach to multilingual documentation and decision support for the diagnosis of acute abdominal pain. (COPERNICUS 555, an European concerted action).

    PubMed

    Ohmann, C; Eich, H P; Sippel, H

    1998-01-01

    This paper describes the design and development of a multilingual documentation and decision support system for the diagnosis of acute abdominal pain. The work was performed within a multi-national COPERNICUS European concerted action dealing with information technology for quality assurance in acute abdominal pain in Europe (EURO-AAP, 555). The software engineering was based on object-oriented analysis, design, and programming. The program covers three modules: a data dictionary, a documentation program, and a knowledge-based system. National versions of the software were provided and introduced into 16 centers from Central and Eastern Europe. A prospective data collection was performed in which 4020 patients were recruited. The software design has proven to be very efficient and useful for the development of multilingual software.

  15. Earth Observatory Satellite system definition study. Report no. 3: Design/cost tradeoff studies. Appendix A: EOS program WBS dictionary. Appendix B: EOS mission functional analysis

    NASA Technical Reports Server (NTRS)

    1974-01-01

    The work breakdown structure (WBS) dictionary for the Earth Observatory Satellite (EOS) is defined. The various elements of the EOS program are examined to include the aggregate of hardware, computer software, services, and data required to develop, produce, test, support, and operate the space vehicle and the companion ground data management system. A functional analysis of the EOS mission is developed. The operations for three typical EOS missions (Delta-, Titan-, and Shuttle-launched) are considered. The functions were determined for the top program elements, and the mission operations, function 2.0, was expanded to level one functions. Selection of ten level one functions for further analysis to level two and three functions was based on concern for the EOS operations and associated interfaces.

  16. Classification of multispectral or hyperspectral satellite imagery using clustering of sparse approximations on sparse representations in learned dictionaries obtained using efficient convolutional sparse coding

    DOEpatents

    Moody, Daniela; Wohlberg, Brendt

    2018-01-02

    An approach for land cover classification, seasonal and yearly change detection and monitoring, and identification of changes in man-made features may use a clustering of sparse approximations (CoSA) on sparse representations in learned dictionaries. The learned dictionaries may be derived using efficient convolutional sparse coding to build multispectral or hyperspectral, multiresolution dictionaries that are adapted to regional satellite image data. Sparse image representations of images over the learned dictionaries may be used to perform unsupervised k-means clustering into land cover categories. The clustering process behaves as a classifier in detecting real variability. This approach may combine spectral and spatial textural characteristics to detect geologic, vegetative, hydrologic, and man-made features, as well as changes in these features over time.

  17. Machine-Aided Indexing at NASA.

    ERIC Educational Resources Information Center

    Silvester, June P.; And Others

    1994-01-01

    Describes the National Aeronautics and Space Administration (NASA) Lexical Dictionary (NLD), a machine-aided indexing system used online at the NASA Center for AeroSpace Information (CASI). The functions of NLD system components are described in detail, and production and quality benefits resulting from machine-aided indexing at CASI are…

  18. Reference architecture and interoperability model for data mining and fusion in scientific cross-domain infrastructures

    NASA Astrophysics Data System (ADS)

    Haener, Rainer; Waechter, Joachim; Grellet, Sylvain; Robida, Francois

    2017-04-01

    Interoperability is the key factor in establishing scientific research environments and infrastructures, as well as in bringing together heterogeneous, geographically distributed risk management, monitoring, and early warning systems. Based on developments within the European Plate Observing System (EPOS), a reference architecture has been devised that comprises architectural blueprints and interoperability models regarding the specification of business processes and logic as well as the encoding of data, metadata, and semantics. The architectural blueprint is developed on the basis of the so-called service-oriented architecture (SOA) 2.0 paradigm, which combines the intelligence and proactiveness of event-driven architectures with service-oriented architectures. SOA 2.0 supports analysing (Data Mining) both static and real-time data in order to find correlations of disparate information that do not at first appear to be intuitively obvious: Analysed data (e.g., seismological monitoring) can be enhanced with relationships discovered by associating them (Data Fusion) with other data (e.g., creepmeter monitoring), with digital models of geological structures, or with the simulation of geological processes. The interoperability model describes the information, communication (conversations) and the interactions (choreographies) of all participants involved as well as the processes for registering, providing, and retrieving information. It is based on the principles of functional integration, implemented via dedicated services, communicating via service-oriented and message-driven infrastructures. The services provide their functionality via standardised interfaces: Instead of requesting data directly, users share data via services that are built upon specific adapters. This approach replaces the tight coupling at data level by a flexible dependency on loosely coupled services.
The main component of the interoperability model is the comprehensive semantic description of the information, business logic and processes on the basis of a minimal set of well-known, established standards. It implements the representation of knowledge with the application of domain-controlled vocabularies to statements about resources, information, facts, and complex matters (ontologies). Seismic experts, for example, would be interested in geological models or borehole measurements at a certain depth, based on which it is possible to correlate and verify seismic profiles. The entire model is built upon standards from the Open Geospatial Consortium (Dictionaries, Service Layer), the International Organisation for Standardisation (Registries, Metadata), and the World Wide Web Consortium (Resource Description Framework, Spatial Data on the Web Best Practices). It has to be emphasised that this approach is scalable to the greatest possible extent: All information necessary in the context of cross-domain infrastructures is referenced via vocabularies and knowledge bases containing statements that provide either the information itself or the resources (service endpoints) from which the information can be retrieved. The entire infrastructure communication is subject to a broker-based business logic integration platform where the information exchanged between involved participants is managed on the basis of standardised dictionaries, repositories, and registries. This approach also enables the development of Systems-of-Systems (SoS), which allow the collaboration of autonomous, large-scale, concurrent, and distributed systems that nevertheless interact cooperatively as a collective in a common environment.

  19. Accurate classification of brain gliomas by discriminate dictionary learning based on projective dictionary pair learning of proton magnetic resonance spectra.

    PubMed

    Adebileje, Sikiru Afolabi; Ghasemi, Keyvan; Aiyelabegan, Hammed Tanimowo; Saligheh Rad, Hamidreza

    2017-04-01

    Proton magnetic resonance spectroscopy is a powerful noninvasive technique that complements the structural images of cMRI and aids biomedical and clinical research by identifying and visualizing the composition of various metabolites within the tissues of interest. However, accurate classification of proton magnetic resonance spectroscopy data is still a challenging issue in clinics due to low signal-to-noise ratio, overlapping peaks of metabolites, and the presence of background macromolecules. This paper evaluates the performance of a discriminate dictionary learning classifier based on the projective dictionary pair learning method for the classification of brain glioma proton magnetic resonance spectroscopy spectra, and the results were compared with those of the sub-dictionary learning methods. The proton magnetic resonance spectroscopy data contain a total of 150 spectra (74 healthy, 23 grade II, 23 grade III, and 30 grade IV) from two databases. The datasets from both databases were first coupled together, followed by column normalization. The Kennard-Stone algorithm was used to split the datasets into training and test sets. Performance comparison based on the overall accuracy, sensitivity, specificity, and precision was conducted. Based on the overall accuracy of our classification scheme, the dictionary pair learning method was found to outperform the sub-dictionary learning methods (97.78% compared with 68.89%). Copyright © 2016 John Wiley & Sons, Ltd.
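
    The Kennard-Stone split mentioned above can be sketched in a few lines. This is a generic textbook version using Euclidean distance, not the authors' code; the function name `kennard_stone_split`, the toy one-dimensional samples, and the training-set size are assumptions, since the abstract does not state the split sizes or the distance measure.

```python
import math

def kennard_stone_split(samples, n_train):
    """Return (train_indices, test_indices) chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(samples)")
    # Pairwise Euclidean distances (assumed metric).
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Seed the training set with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_train:
        # Add the candidate whose nearest already-selected sample is farthest away.
        best = max(remaining, key=lambda i: min(dist[i][j] for j in selected))
        selected.append(best)
        remaining.remove(best)
    return selected, remaining

# Hypothetical one-dimensional samples, for illustration only.
samples = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]
train, test = kennard_stone_split(samples, n_train=3)
```

    The selection is deterministic and spreads the training samples across the feature space, leaving the more interior samples for testing.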

  20. Exploiting Attribute Correlations: A Novel Trace Lasso-Based Weakly Supervised Dictionary Learning Method.

    PubMed

    Wu, Lin; Wang, Yang; Pan, Shirui

    2017-12-01

    It is now well established that sparse representation models work effectively for many visual recognition tasks, and have pushed forward the success of dictionary learning therein. Recent studies of dictionary learning focus on learning discriminative atoms instead of purely reconstructive ones. However, the existence of intraclass diversities (i.e., data objects within the same category that exhibit large visual dissimilarities) and interclass similarities (i.e., data objects from distinct classes that share many visual similarities) makes it challenging to learn effective recognition models. To this end, a large number of labeled data objects are required to learn models which can effectively characterize these subtle differences. However, access to labeled data objects is always limited, making it difficult to learn a monolithic dictionary that is discriminative enough. To address the above limitations, in this paper we propose a weakly-supervised dictionary learning method to automatically learn a discriminative dictionary by fully exploiting visual attribute correlations rather than label priors. In particular, the intrinsic attribute correlations are deployed as a critical cue to guide the process of object categorization, and then a set of subdictionaries is jointly learned with respect to each category. The resulting dictionary is highly discriminative and leads to intraclass-diversity-aware sparse representations. Extensive experiments on image classification and object recognition are conducted to show the effectiveness of our approach.
