Sample records for clinical document corpus

  1. Building a comprehensive syntactic and semantic corpus of Chinese clinical texts.

    PubMed

    He, Bin; Dong, Bin; Guan, Yi; Yang, Jinfeng; Jiang, Zhipeng; Yu, Qiubin; Cheng, Jianyi; Qu, Chunyan

    2017-05-01

    To build a comprehensive corpus covering syntactic and semantic annotations of Chinese clinical texts with corresponding annotation guidelines and methods as well as to develop tools trained on the annotated corpus, which supplies baselines for research on Chinese texts in the clinical domain. An iterative annotation method was proposed to train annotators and to develop annotation guidelines. Then, by using annotation quality assurance measures, a comprehensive corpus was built, containing annotations of part-of-speech (POS) tags, syntactic tags, entities, assertions, and relations. Inter-annotator agreement (IAA) was calculated to evaluate the annotation quality and a Chinese clinical text processing and information extraction system (CCTPIES) was developed based on our annotated corpus. The syntactic corpus consists of 138 Chinese clinical documents with 47,426 tokens and 2612 full parsing trees, while the semantic corpus includes 992 documents that annotated 39,511 entities with their assertions and 7693 relations. IAA evaluation shows that this comprehensive corpus is of good quality, and the system modules are effective. The annotated corpus makes a considerable contribution to natural language processing (NLP) research into Chinese texts in the clinical domain. However, this corpus has a number of limitations. Some additional types of clinical text should be introduced to improve corpus coverage and active learning methods should be utilized to promote annotation efficiency. In this study, several annotation guidelines and an annotation method for Chinese clinical texts were proposed, and a comprehensive corpus with its NLP modules was constructed, providing a foundation for further study of applying NLP techniques to Chinese texts in the clinical domain. Copyright © 2017. Published by Elsevier Inc.

  2. ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus.

    PubMed

    Afzal, Zubair; Pons, Ewoud; Kang, Ning; Sturkenboom, Miriam C J M; Schuemie, Martijn J; Kors, Jan A

    2014-11-29

    In order to extract meaningful information from electronic medical records, such as signs and symptoms, diagnoses, and treatments, it is important to take into account the contextual properties of the identified information: negation, temporality, and experiencer. Most work on automatic identification of these contextual properties has been done on English clinical text. This study presents ContextD, an adaptation of the English ConText algorithm to the Dutch language, and a Dutch clinical corpus. We created a Dutch clinical corpus containing four types of anonymized clinical documents: entries from general practitioners, specialists' letters, radiology reports, and discharge letters. Using a Dutch list of medical terms extracted from the Unified Medical Language System, we identified medical terms in the corpus with exact matching. The identified terms were annotated for negation, temporality, and experiencer properties. To adapt the ConText algorithm, we translated English trigger terms to Dutch and added several general and document-specific enhancements, such as negation rules for general practitioners' entries and a regular-expression-based temporality module. The ContextD algorithm utilized 41 unique triggers to identify the contextual properties in the clinical corpus. For the negation property, the algorithm obtained an F-score from 87% to 93% for the different document types. For the experiencer property, the F-score was 99% to 100%. For the historical and hypothetical values of the temporality property, F-scores ranged from 26% to 54% and from 13% to 44%, respectively. ContextD showed good performance in identifying negation and experiencer property values across all Dutch clinical document types. Accurate identification of the temporality property proved to be difficult and requires further work. The anonymized and annotated Dutch clinical corpus can serve as a useful resource for further algorithm development.
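
    The trigger-based idea behind ConText-style negation detection can be illustrated with a small sketch. This is not the ContextD implementation: the three Dutch triggers, the forward-only scope of five tokens, and the function name `is_negated` are all assumptions made for illustration, and the real algorithm also handles other trigger types and properties.

```python
import re

# Hypothetical trigger list; the 41 triggers used by ContextD are not given in the abstract.
NEGATION_TRIGGERS = ("geen", "niet", "zonder")
# Assumed scope: a trigger affects at most this many following tokens.
WINDOW = 5


def is_negated(sentence: str, term: str) -> bool:
    """Return True if `term` occurs within WINDOW tokens after a negation trigger."""
    tokens = re.findall(r"\w+", sentence.lower())
    term_tokens = re.findall(r"\w+", term.lower())
    if not term_tokens:
        return False
    n = len(term_tokens)
    for i, tok in enumerate(tokens):
        if tok in NEGATION_TRIGGERS:
            scope = tokens[i + 1 : i + 1 + WINDOW]
            # Look for the (possibly multi-word) term inside the trigger's scope.
            for j in range(len(scope) - n + 1):
                if scope[j : j + n] == term_tokens:
                    return True
    return False


print(is_negated("Patient heeft geen koorts", "koorts"))  # True
print(is_negated("Patient heeft koorts", "koorts"))       # False
```

    A fixed token window is the simplest possible scope rule; it is used here only to keep the sketch short.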

  3. Methods and Techniques for Clinical Text Modeling and Analytics

    ERIC Educational Resources Information Center

    Ling, Yuan

    2017-01-01

    This study focuses on developing and applying methods/techniques in different aspects of the system for clinical text understanding, at both corpus and document level. We deal with two major research questions: First, we explore the question of "How to model the underlying relationships from clinical notes at corpus level?" Documents…

  4. Named Entity Recognition in Chinese Clinical Text Using Deep Neural Network.

    PubMed

    Wu, Yonghui; Jiang, Min; Lei, Jianbo; Xu, Hua

    2015-01-01

    Rapid growth in electronic health records (EHRs) use has led to an unprecedented expansion of available clinical data in electronic formats. However, much of the important healthcare information is locked in the narrative documents. Therefore, Natural Language Processing (NLP) technologies, e.g., Named Entity Recognition that identifies boundaries and types of entities, have been extensively studied to unlock important clinical information in free text. In this study, we investigated a novel deep learning method to recognize clinical entities in Chinese clinical documents using the minimal feature engineering approach. We developed a deep neural network (DNN) to generate word embeddings from a large unlabeled corpus through unsupervised learning and another DNN for the NER task. The experimental results showed that the DNN with word embeddings trained from the large unlabeled corpus outperformed the state-of-the-art CRF model in the minimal feature engineering setting, achieving the highest F1-score of 0.9280. Further analysis showed that word embeddings derived through unsupervised learning from a large unlabeled corpus remarkably improved the DNN with randomized embedding, denoting the usefulness of unsupervised feature learning.

  5. Development and Evaluation of a Clinical Note Section Header Terminology

    PubMed Central

    Denny, Joshua C.; Miller, Randolph A.; Johnson, Kevin B.; Spickard, Anderson

    2008-01-01

    Clinical documentation is often expressed in natural language text, yet providers often use common organizations that segment these notes in sections, such as “history of present illness” or “physical examination.” We developed a hierarchical section header terminology, supporting mappings to LOINC and other vocabularies; it contained 1109 concepts and 4332 synonyms. Physicians evaluated it compared to LOINC and the Evaluation and Management billing schema using a randomly selected corpus of history and physical notes. Evaluated documents contained a median of 54 sections and 27 “major sections.” There were 16,196 total sections in the evaluation note corpus. The terminology contained 99.9% of the clinical sections; LOINC matched 77% of section header concepts and 20% of section header strings in those documents. The section terminology may enable better clinical note understanding and interoperability. Future development and integration into natural language processing systems is needed. PMID:18999303

  6. Computation of term dominance in text documents

    DOEpatents

    Bauer, Travis L [Albuquerque, NM]; Benz, Zachary O [Albuquerque, NM]; Verzi, Stephen J [Albuquerque, NM]

    2012-04-24

    An improved entropy-based term dominance metric is useful for characterizing a corpus of text documents and for comparing the term dominance metrics of a first corpus of documents to a second corpus having a different number of documents.

  7. Dense Annotation of Free-Text Critical Care Discharge Summaries from an Indian Hospital and Associated Performance of a Clinical NLP Annotator.

    PubMed

    Ramanan, S V; Radhakrishna, Kedar; Waghmare, Abijeet; Raj, Tony; Nathan, Senthil P; Sreerama, Sai Madhukar; Sampath, Sriram

    2016-08-01

    Electronic Health Record (EHR) use in India is generally poor, and structured clinical information is mostly lacking. This work is the first attempt aimed at evaluating unstructured text mining for extracting relevant clinical information from Indian clinical records. We annotated a corpus of 250 discharge summaries from an Intensive Care Unit (ICU) in India, with markups for diseases, procedures, and lab parameters, their attributes, as well as key demographic information and administrative variables such as patient outcomes. In this process, we have constructed guidelines for an annotation scheme useful to clinicians in the Indian context. We evaluated the performance of an NLP engine, Cocoa, on a cohort of these Indian clinical records. We have produced an annotated corpus of roughly 90 thousand words, which to our knowledge is the first tagged clinical corpus from India. Cocoa was evaluated on a test corpus of 50 documents. The overlap F-scores across the major categories, namely disease/symptoms, procedures, laboratory parameters and outcomes, are 0.856, 0.834, 0.961 and 0.872 respectively. These results are competitive with results from recent shared tasks based on US records. The annotated corpus and associated results from the Cocoa engine indicate that unstructured text mining is a viable method for cohort analysis in the Indian clinical context, where structured EHR records are largely absent.

  8. Evaluating current automatic de-identification methods with Veteran's health administration clinical documents.

    PubMed

    Ferrández, Oscar; South, Brett R; Shen, Shuying; Friedlin, F Jeffrey; Samore, Matthew H; Meystre, Stéphane M

    2012-07-27

    The increased use and adoption of Electronic Health Records (EHR) causes a tremendous growth in digital information useful for clinicians, researchers and many other operational purposes. However, this information is rich in Protected Health Information (PHI), which severely restricts its access and possible uses. A number of investigators have developed methods for automatically de-identifying EHR documents by removing PHI, as specified in the Health Insurance Portability and Accountability Act "Safe Harbor" method. This study focuses on the evaluation of existing automated text de-identification methods and tools, as applied to Veterans Health Administration (VHA) clinical documents, to assess which methods perform better with each category of PHI found in our clinical notes; and when new methods are needed to improve performance. We installed and evaluated five text de-identification systems "out-of-the-box" using a corpus of VHA clinical documents. The systems based on machine learning methods were trained with the 2006 i2b2 de-identification corpora and evaluated with our VHA corpus, and also evaluated with a ten-fold cross-validation experiment using our VHA corpus. We counted exact, partial, and fully contained matches with reference annotations, considering each PHI type separately, or only one unique 'PHI' category. Performance of the systems was assessed using recall (equivalent to sensitivity) and precision (equivalent to positive predictive value) metrics, as well as the F(2)-measure. Overall, systems based on rules and pattern matching achieved better recall, and precision was always better with systems based on machine learning approaches. The highest "out-of-the-box" F(2)-measure was 67% for partial matches; the best precision and recall were 95% and 78%, respectively. Finally, the ten-fold cross validation experiment allowed for an increase of the F(2)-measure to 79% with partial matches. The "out-of-the-box" evaluation of text de-identification systems provided us with compelling insight about the best methods for de-identification of VHA clinical documents. The errors analysis demonstrated an important need for customization to PHI formats specific to VHA documents. This study informed the planning and development of a "best-of-breed" automatic de-identification application for VHA clinical text.
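
    For readers unfamiliar with the F(2)-measure, the sketch below shows the standard F-beta formula, which with beta = 2 weights recall more heavily than precision. The function name `f_beta` and the example values are illustrative assumptions; they are not figures from the study, whose best precision and recall are not stated to come from the same system.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was found correctly
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only: high precision cannot compensate for low recall under F2.
print(round(f_beta(0.9, 0.6), 3))            # 0.643
print(round(f_beta(0.9, 0.6, beta=1.0), 3))  # 0.72 (balanced F1, for comparison)
```

    Favoring recall fits de-identification, where a missed PHI instance is usually more costly than an unnecessary redaction.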

  9. BoB, a best-of-breed automated text de-identification system for VHA clinical documents.

    PubMed

    Ferrández, Oscar; South, Brett R; Shen, Shuying; Friedlin, F Jeffrey; Samore, Matthew H; Meystre, Stéphane M

    2013-01-01

    De-identification allows faster and more collaborative clinical research while protecting patient confidentiality. Clinical narrative de-identification is a tedious process that can be alleviated by automated natural language processing methods. The goal of this research is the development of an automated text de-identification system for Veterans Health Administration (VHA) clinical documents. We devised a novel stepwise hybrid approach designed to improve the current strategies used for text de-identification. The proposed system is based on a previous study on the best de-identification methods for VHA documents. This best-of-breed automated clinical text de-identification system (aka BoB) tackles the problem as two separate tasks: (1) maximize patient confidentiality by redacting as much protected health information (PHI) as possible; and (2) leave de-identified documents in a usable state preserving as much clinical information as possible. We evaluated BoB with a manually annotated corpus of a variety of VHA clinical notes, as well as with the 2006 i2b2 de-identification challenge corpus. We present evaluations at the instance- and token-level, with detailed results for BoB's main components. Moreover, an existing text de-identification system was also included in our evaluation. BoB's design efficiently takes advantage of the methods implemented in its pipeline, resulting in high sensitivity values (especially for sensitive PHI categories) and a limited number of false positives. Our system successfully addressed VHA clinical document de-identification, and its hybrid stepwise design demonstrates robustness and efficiency, prioritizing patient confidentiality while leaving most clinical information intact.

  10. Technique for information retrieval using enhanced latent semantic analysis generating rank approximation matrix by factorizing the weighted morpheme-by-document matrix

    DOEpatents

    Chew, Peter A; Bader, Brett W

    2012-10-16

    A technique for information retrieval includes parsing a corpus to identify a number of wordform instances within each document of the corpus. A weighted morpheme-by-document matrix is generated based at least in part on the number of wordform instances within each document of the corpus and based at least in part on a weighting function. The weighted morpheme-by-document matrix separately enumerates instances of stems and affixes. Additionally or alternatively, a term-by-term alignment matrix may be generated based at least in part on the number of wordform instances within each document of the corpus. At least one lower rank approximation matrix is generated by factorizing the weighted morpheme-by-document matrix and/or the term-by-term alignment matrix.

  11. Assessing the Representation of Occupation Information in Free-Text Clinical Documents Across Multiple Sources

    PubMed Central

    Lindemann, Elizabeth A.; Chen, Elizabeth S.; Rajamani, Sripriya; Manohar, Nivedha; Wang, Yan; Melton, Genevieve B.

    2017-01-01

    There has been increasing recognition of the key role of social determinants like occupation on health. Given the relatively poor understanding of occupation information in electronic health records (EHRs), we sought to characterize occupation information within free-text clinical document sources. From six distinct clinical sources, 868 total occupation-related sentences were identified for the study corpus. Building off approaches from previous studies, refined annotation guidelines were created using the National Institute for Occupational Safety and Health Occupational Data for Health data model with elements added to increase granularity. Our corpus generated 2,005 total annotations representing 39 of 41 entity types from the enhanced data model. Highest frequency entities were: Occupation Description (17.7%); Employment Status – Not Specified (12.5%); Employer Name (11.0%); Subject (9.8%); Industry Description (6.2%). Our findings support the value for standardizing entry of EHR occupation information to improve data quality for improved patient care and secondary uses of this information. PMID:29295142

  12. Human Rights Texts: Converting Human Rights Primary Source Documents into Data.

    PubMed

    Fariss, Christopher J; Linder, Fridolin J; Jones, Zachary M; Crabtree, Charles D; Biek, Megan A; Ross, Ana-Sophia M; Kaur, Taranamol; Tsai, Michael

    2015-01-01

    We introduce and make publicly available a large corpus of digitized primary source human rights documents which are published annually by monitoring agencies that include Amnesty International, Human Rights Watch, the Lawyers Committee for Human Rights, and the United States Department of State. In addition to the digitized text, we also make available and describe document-term matrices, which are datasets that systematically organize the word counts from each unique document by each unique term within the corpus of human rights documents. To contextualize the importance of this corpus, we describe the development of coding procedures in the human rights community and several existing categorical indicators that have been created by human coding of the human rights documents contained in the corpus. We then discuss how the new human rights corpus and the existing human rights datasets can be used with a variety of statistical analyses and machine learning algorithms to help scholars understand how human rights practices and reporting have evolved over time. We close with a discussion of our plans for dataset maintenance, updating, and availability.
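
    A document-term matrix of the kind described above can be sketched in a few lines. This is a minimal illustration under assumptions (lowercasing, a letters-only tokenizer, and the function name `document_term_matrix`); it does not reproduce the authors' preprocessing.

```python
import re
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    # Assumed tokenizer: lowercase runs of ASCII letters.
    counts = [Counter(re.findall(r"[a-z]+", doc.lower())) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


vocab, matrix = document_term_matrix(["Rights report", "Report, report"])
print(vocab)   # ['report', 'rights']
print(matrix)  # [[1, 1], [2, 0]]
```

    Each row is one document and each column one unique term, which is the layout statistical and machine learning tools typically expect.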

  13. Human Rights Texts: Converting Human Rights Primary Source Documents into Data

    PubMed Central

    Fariss, Christopher J.; Linder, Fridolin J.; Jones, Zachary M.; Crabtree, Charles D.; Biek, Megan A.; Ross, Ana-Sophia M.; Kaur, Taranamol; Tsai, Michael

    2015-01-01

    We introduce and make publicly available a large corpus of digitized primary source human rights documents which are published annually by monitoring agencies that include Amnesty International, Human Rights Watch, the Lawyers Committee for Human Rights, and the United States Department of State. In addition to the digitized text, we also make available and describe document-term matrices, which are datasets that systematically organize the word counts from each unique document by each unique term within the corpus of human rights documents. To contextualize the importance of this corpus, we describe the development of coding procedures in the human rights community and several existing categorical indicators that have been created by human coding of the human rights documents contained in the corpus. We then discuss how the new human rights corpus and the existing human rights datasets can be used with a variety of statistical analyses and machine learning algorithms to help scholars understand how human rights practices and reporting have evolved over time. We close with a discussion of our plans for dataset maintenance, updating, and availability. PMID:26418817

  14. On the creation of a clinical gold standard corpus in Spanish: Mining adverse drug reactions.

    PubMed

    Oronoz, Maite; Gojenola, Koldo; Pérez, Alicia; de Ilarraza, Arantza Díaz; Casillas, Arantza

    2015-08-01

    The advances achieved in Natural Language Processing make it possible to automatically mine information from electronically created documents. Many Natural Language Processing methods that extract information from texts make use of annotated corpora, but these are scarce in the clinical domain due to legal and ethical issues. In this paper we present the creation of the IxaMed-GS gold standard composed of real electronic health records written in Spanish and manually annotated by experts in pharmacology and pharmacovigilance. The experts mainly annotated entities related to diseases and drugs, but also relationships between entities indicating adverse drug reaction events. To help the experts in the annotation task, we adapted a general corpus linguistic analyzer to the medical domain. The quality of the annotation process in the IxaMed-GS corpus has been assessed by measuring the inter-annotator agreement, which was 90.53% for entities and 82.86% for events. In addition, the corpus has been used for the automatic extraction of adverse drug reaction events using machine learning. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. Automatic de-identification of French clinical records: comparison of rule-based and machine-learning approaches.

    PubMed

    Grouin, Cyril; Zweigenbaum, Pierre

    2013-01-01

    In this paper, we present a comparison of two approaches to automatically de-identify medical records written in French: a rule-based system and a machine-learning based system using a conditional random fields (CRF) formalism. Both systems have been designed to process nine identifiers in a corpus of medical records in cardiology. We performed two evaluations: first, on 62 documents in cardiology, and second, on 10 documents in foetopathology - produced by optical character recognition (OCR) - to evaluate the robustness of our systems. We achieved a 0.843 (rule-based) and 0.883 (machine-learning) exact match overall F-measure in cardiology. While the rule-based system allowed us to achieve good results on nominative (first and last names) and numerical data (dates, phone numbers, and zip codes), the machine-learning approach performed best on more complex categories (postal addresses, hospital names, medical devices, and towns). On the foetopathology corpus, although our systems have not been designed for this corpus and despite OCR character recognition errors, we obtained promising results: a 0.681 (rule-based) and 0.638 (machine-learning) exact-match overall F-measure. This demonstrates that existing tools can be applied to process new documents of lower quality.

  16. Is the Juice Worth the Squeeze? Costs and Benefits of Multiple Human Annotators for Clinical Text De-identification.

    PubMed

    Carrell, David S; Cronkite, David J; Malin, Bradley A; Aberdeen, John S; Hirschman, Lynette

    2016-08-05

    Clinical text contains valuable information but must be de-identified before it can be used for secondary purposes. Accurate annotation of personally identifiable information (PII) is essential to the development of automated de-identification systems and to manual redaction of PII. Yet the accuracy of annotations may vary considerably across individual annotators and annotation is costly. As such, the marginal benefit of incorporating additional annotators has not been well characterized. This study models the costs and benefits of incorporating increasing numbers of independent human annotators to identify the instances of PII in a corpus. We used a corpus with gold standard annotations to evaluate the performance of teams of annotators of increasing size. Four annotators independently identified PII in a 100-document corpus consisting of randomly selected clinical notes from Family Practice clinics in a large integrated health care system. These annotations were pooled and validated to generate a gold standard corpus for evaluation. Recall rates for all PII types ranged from 0.90 to 0.98 for individual annotators to 0.998 to 1.0 for teams of three, when measured against the gold standard. Median cost per PII instance discovered during corpus annotation ranged from $0.71 for an individual annotator to $377 for annotations discovered only by a fourth annotator. Incorporating a second annotator into a PII annotation process reduces unredacted PII and improves the quality of annotations to 0.99 recall, yielding clear benefit at reasonable cost; the cost advantages of annotation teams larger than two diminish rapidly.
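
    The notion of team recall can be made concrete with a small sketch: a team's findings are taken as the union of its members' annotations and compared with the gold standard. Pooling by union is an assumption consistent with the abstract's description, and the annotation identifiers below are toy data, not the study's.

```python
def recall(found: set, gold: set) -> float:
    """Fraction of gold-standard PII instances that were found."""
    if not gold:
        raise ValueError("gold standard must not be empty")
    return len(found & gold) / len(gold)


def team_recall(annotator_sets, gold: set) -> float:
    """Recall of a team, assuming an instance counts if any member marked it."""
    pooled = set().union(*annotator_sets)
    return recall(pooled, gold)


# Toy example: each integer stands for one PII instance in the corpus.
gold = {1, 2, 3, 4}
annotator_a = {1, 2, 3}
annotator_b = {2, 3, 4}
print(recall(annotator_a, gold))                      # 0.75
print(team_recall([annotator_a, annotator_b], gold))  # 1.0
```

    Because members tend to overlap on the easy instances, each added annotator contributes fewer new finds, which matches the diminishing returns reported above.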

  17. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    PubMed

    Oellrich, Anika; Collier, Nigel; Smedley, Damian; Groza, Tudor

    2015-01-01

    Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the ShARe/CLEF (https://sites.google.com/site/shareclefehealth/data) and i2b2 (https://i2b2.org/NLP/DataSets/) corpora needs to be requested with the individual corpus providers.

  18. Automatic Keyword Extraction from Individual Documents

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rose, Stuart J.; Engel, David W.; Cramer, Nicholas O.

    2010-05-03

    This paper introduces a novel and domain-independent method for automatically extracting keywords, as sequences of one or more words, from individual documents. We describe the method’s configuration parameters and algorithm, and present an evaluation on a benchmark corpus of technical abstracts. We also present a method for generating lists of stop words for specific corpora and domains, and evaluate its ability to improve keyword extraction on the benchmark corpus. Finally, we apply our method of automatic keyword extraction to a corpus of news articles and define metrics for characterizing the exclusivity, essentiality, and generality of extracted keywords within a corpus.

  19. Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions.

    PubMed

    Sohn, Sunghwan; Wang, Yanshan; Wi, Chung-Il; Krusemark, Elizabeth A; Ryu, Euijung; Ali, Mir H; Juhn, Young J; Liu, Hongfang

    2017-11-30

    To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. Birth cohorts from Mayo Clinic and Sanford Children's Hospital (SCH) were used in this study (n = 298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in various aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement. There exist notable lexical variations (word-level similarity = 0.669) and process variations (differences in major note types containing asthma-related concepts). However, semantic-level corpora were relatively homogeneous (topic similarity = 0.944, asthma-related concept similarity = 0.971). The NLP system for asthma ascertainment had an F-score of 0.937 at Mayo, and produced 0.813 (prototype) and 0.908 (refinement) when applied at SCH. The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important to estimate NLP system portability. As the Mayo Clinic and SCH corpora were relatively homogeneous at a semantic level, the NLP system, developed at Mayo Clinic, was imported to SCH successfully with proper adjustments to deal with the intrinsic corpus heterogeneity. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.

  20. Teaching Specific Purpose Translation: Utilization of Bilingual Contract Document as Parallel Corpus

    ERIC Educational Resources Information Center

    Siregar, Roswani

    2017-01-01

    This study introduced the specific purpose translation teaching to Indonesian undergraduate students at Universitas Al-Azhar Medan, Indonesia. The courses were attended by the Business and Economics students who are new to translation. As parallel corpus, bilingual contract documents in Indonesian and English were chosen to help the students to…

  1. An Evaluation of the UMLS in Representing Corpus Derived Clinical Concepts

    PubMed Central

    Friedlin, Jeff; Overhage, Marc

    2011-01-01

    We performed an evaluation of the Unified Medical Language System (UMLS) in representing concepts derived from medical narrative documents from three domains: chest x-ray reports, discharge summaries and admission notes. We detected concepts in these documents by identifying noun phrases (NPs) and N-grams, including unigrams (single words), bigrams (word pairs) and trigrams (word triples). After removing NPs and N-grams that did not represent discrete clinical concepts, we processed the remaining with the UMLS MetaMap program. We manually reviewed the results of MetaMap processing to determine whether MetaMap found full, partial or no representation of the concept. For full representations, we determined whether post-coordination was required. Our results showed that a large portion of concepts found in clinical narrative documents are either unrepresented or poorly represented in the current version of the UMLS Metathesaurus and that post-coordination was often required in order to fully represent a concept. PMID:22195097
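
    The N-gram step described above (unigrams, bigrams and trigrams) can be sketched as follows. Whitespace tokenization and the function name `ngrams` are assumptions for illustration; noun-phrase detection and the MetaMap processing are not shown.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples, in order of appearance."""
    return [tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]


# Assumed tokenization: simple whitespace split of a toy report phrase.
tokens = "no acute cardiopulmonary disease".split()
print(ngrams(tokens, 1))  # [('no',), ('acute',), ('cardiopulmonary',), ('disease',)]
print(ngrams(tokens, 2))  # [('no', 'acute'), ('acute', 'cardiopulmonary'), ('cardiopulmonary', 'disease')]
print(ngrams(tokens, 3))  # [('no', 'acute', 'cardiopulmonary'), ('acute', 'cardiopulmonary', 'disease')]
```

    Candidate sequences produced this way would still need filtering, since many N-grams do not correspond to a discrete clinical concept.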

  2. Machine learning-based coreference resolution of concepts in clinical documents

    PubMed Central

    Ware, Henry; Mullett, Charles J; El-Rawas, Oussama

    2012-01-01

    Objective: Coreference resolution of concepts, although a very active area in the natural language processing community, has not yet been widely applied to clinical documents. Accordingly, the 2011 i2b2 competition focusing on this area is a timely and useful challenge. The objective of this research was to collate coreferent chains of concepts from a corpus of clinical documents. These concepts are in the categories of person, problems, treatments, and tests. Design: A machine learning approach based on graphical models was employed to cluster coreferent concepts. Features selected were divided into domain independent and domain specific sets. Training was done with the i2b2 provided training set of 489 documents with 6949 chains. Testing was done on 322 documents. Results: The learning engine, using the un-weighted average of three different measurement schemes, resulted in an F measure of 0.8423 where no domain specific features were included and 0.8483 where the feature set included both domain independent and domain specific features. Conclusion: Our machine learning approach is a promising solution for recognizing coreferent concepts, which in turn is useful for practical applications such as the assembly of problem and medication lists from clinical documents. PMID:22582205

  3. Text de-identification for privacy protection: a study of its impact on clinical text information content.

    PubMed

    Meystre, Stéphane M; Ferrández, Óscar; Friedlin, F Jeffrey; South, Brett R; Shen, Shuying; Samore, Matthew H

    2014-08-01

    As more and more electronic clinical information is becoming easier to access for secondary uses such as clinical research, approaches that enable faster and more collaborative research while protecting patient privacy and confidentiality are becoming more important. Clinical text de-identification offers such advantages but is typically a tedious manual process. Automated Natural Language Processing (NLP) methods can alleviate this process, but their impact on subsequent uses of the automatically de-identified clinical narratives has only barely been investigated. In the context of a larger project to develop and investigate automated text de-identification for Veterans Health Administration (VHA) clinical notes, we studied the impact of automated text de-identification on clinical information in a stepwise manner. Our approach started with a high-level assessment of clinical note informativeness and formatting, and ended with a detailed study of the overlap of select clinical information types and Protected Health Information (PHI). To investigate the informativeness (i.e., document type information, select clinical data types, and interpretation or conclusion) of VHA clinical notes, we used five different existing text de-identification systems. The informativeness was only minimally altered by these systems while formatting was only modified by one system. To examine the impact of de-identification on clinical information extraction, we compared counts of SNOMED-CT concepts found by an open source information extraction application in the original (i.e., not de-identified) version of a corpus of VHA clinical notes, and in the same corpus after de-identification. Only about 1.2-3% fewer SNOMED-CT concepts were found in de-identified versions of our corpus, and many of these concepts were PHI that was erroneously identified as clinical information. To study this impact in more detail and assess how generalizable our findings were, we examined the overlap between select clinical information annotated in the 2010 i2b2 NLP challenge corpus and automatic PHI annotations from our best-of-breed VHA clinical text de-identification system (nicknamed 'BoB'). Overall, only 0.81% of the clinical information exactly overlapped with PHI, and 1.78% partly overlapped. We conclude that automated text de-identification's impact on clinical information is small, but not negligible, and that improved disambiguation of clinical acronyms and eponyms could significantly reduce this impact. Copyright © 2014 Elsevier Inc. All rights reserved.

  4. VisIRR: A Visual Analytics System for Information Retrieval and Recommendation for Large-Scale Document Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Choo, Jaegul; Kim, Hannah; Clarkson, Edward

    In this paper, we present an interactive visual information retrieval and recommendation system, called VisIRR, for large-scale document discovery. VisIRR effectively combines the paradigms of (1) a passive pull through query processes for retrieval and (2) an active push that recommends items of potential interest to users based on their preferences. Equipped with an efficient dynamic query interface against a large-scale corpus, VisIRR organizes the retrieved documents into high-level topics and visualizes them in a 2D space, representing the relationships among the topics along with their keyword summary. In addition, based on interactive personalized preference feedback with regard to documents, VisIRR provides document recommendations from the entire corpus, which are beyond the retrieved sets. Such recommended documents are visualized in the same space as the retrieved documents, so that users can seamlessly analyze both existing and newly recommended ones. This article presents novel computational methods, which make these integrated representations and fast interactions possible for a large-scale document corpus. We illustrate how the system works by providing detailed usage scenarios. Finally, we present preliminary user study results for evaluating the effectiveness of the system.

  5. VisIRR: A Visual Analytics System for Information Retrieval and Recommendation for Large-Scale Document Data

    DOE PAGES

    Choo, Jaegul; Kim, Hannah; Clarkson, Edward; ...

    2018-01-31

    In this paper, we present an interactive visual information retrieval and recommendation system, called VisIRR, for large-scale document discovery. VisIRR effectively combines the paradigms of (1) a passive pull through query processes for retrieval and (2) an active push that recommends items of potential interest to users based on their preferences. Equipped with an efficient dynamic query interface against a large-scale corpus, VisIRR organizes the retrieved documents into high-level topics and visualizes them in a 2D space, representing the relationships among the topics along with their keyword summary. In addition, based on interactive personalized preference feedback with regard to documents, VisIRR provides document recommendations from the entire corpus, which are beyond the retrieved sets. Such recommended documents are visualized in the same space as the retrieved documents, so that users can seamlessly analyze both existing and newly recommended ones. This article presents novel computational methods, which make these integrated representations and fast interactions possible for a large-scale document corpus. We illustrate how the system works by providing detailed usage scenarios. Finally, we present preliminary user study results for evaluating the effectiveness of the system.

  6. Developing a disease outbreak event corpus.

    PubMed

    Conway, Mike; Kawazoe, Ai; Chanlekha, Hutchatai; Collier, Nigel

    2010-09-28

    In recent years, there has been a growth in work on the use of information extraction technologies for tracking disease outbreaks from online news texts, yet publicly available evaluation standards (and associated resources) for this new area of research have been noticeably lacking. This study seeks to create a "gold standard" data set against which to test how accurately disease outbreak information extraction systems can identify the semantics of disease outbreak events. Additionally, we hope that the provision of an annotation scheme (and associated corpus) to the community will encourage open evaluation in this new and growing application area. We developed an annotation scheme for identifying infectious disease outbreak events in news texts. An event--in the context of our annotation scheme--consists minimally of geographical (eg, country and province) and disease name information. However, the scheme also allows for the rich encoding of other domain salient concepts (eg, international travel, species, and food contamination). The work resulted in a 200-document corpus of event-annotated disease outbreak reports that can be used to evaluate the accuracy of event detection algorithms (in this case, for the BioCaster biosurveillance online news information extraction system). In the 200 documents, 394 distinct events were identified (mean 1.97 events per document, range 0-25 events per document). We also provide a download script and graphical user interface (GUI)-based event browsing software to facilitate corpus exploration. In summary, we present an annotation scheme and corpus that can be used in the evaluation of disease outbreak event extraction algorithms. The annotation scheme and corpus were designed both with the particular evaluation requirements of the BioCaster system in mind as well as the wider need for further evaluation resources in this growing research area.

  7. “Hybrid Topics” -- Facilitating the Interpretation of Topics Through the Addition of MeSH Descriptors to Bags of Words

    PubMed Central

    Yu, Zhiguo; Nguyen, Thang; Dhombres, Ferdinand; Johnson, Todd; Bodenreider, Olivier

    2018-01-01

    Extracting and understanding information, themes and relationships from large collections of documents is an important task for biomedical researchers. Latent Dirichlet Allocation is an unsupervised topic modeling technique using the bag-of-words assumption that has been applied extensively to unveil hidden thematic information within large sets of documents. In this paper, we added MeSH descriptors to the bag-of-words assumption to generate ‘hybrid topics’, which are mixed vectors of words and descriptors. We evaluated this approach on the quality and interpretability of topics in both a general corpus and a specialized corpus. Our results demonstrated that the coherence of ‘hybrid topics’ is higher than that of regular bag-of-words topics in the specialized corpus. We also found that the proportion of topics that are not associated with MeSH descriptors is higher in the specialized corpus than in the general corpus. PMID:29295179

  8. Hippocrates on Pediatric Dermatology.

    PubMed

    Sgantzos, Markos; Tsoucalas, Gregory; Karamanou, Marianna; Giatsiou, Styliani; Tsoukalas, Ioannis; Androutsos, George

    2015-01-01

    Hippocrates of Kos is well known in medicine, but his contributions to pediatric dermatology have not previously been examined. A systematic study of Corpus Hippocraticum was undertaken to document references of clinical and historical importance of pediatric dermatology. In Corpus Hippocraticum, a variety of skin diseases are described, along with proposed treatments. Hippocrates rejected the theory of the punishment of the Greek gods and supported the concept that dermatologic diseases resulted from a loss of balance in the body humors. Many of the terms that Hippocrates and his pupils used are still being used today. Moreover, he probably provided one of the first descriptions of skin findings in smallpox, Henoch-Schönlein purpura (also known as anaphylactoid purpura, purpura rheumatica, allergic purpura), and meningococcal septicemia. © 2015 Wiley Periodicals, Inc.

  9. Assisted annotation of medical free text using RapTAT

    PubMed Central

    Gobbel, Glenn T; Garvin, Jennifer; Reeves, Ruth; Cronin, Robert M; Heavirland, Julia; Williams, Jenifer; Weaver, Allison; Jayaramaraja, Shrimalini; Giuse, Dario; Speroff, Theodore; Brown, Steven H; Xu, Hua; Matheny, Michael E

    2014-01-01

    Objective To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias. Materials and methods A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes either manually or using RapTAT assistance for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19–21 documents for iterative annotation and training. Results The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ∼50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5 to 0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85). Discussion The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias. Conclusions Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%. PMID:24431336

  10. Validating a strategy for psychosocial phenotyping using a large corpus of clinical text.

    PubMed

    Gundlapalli, Adi V; Redd, Andrew; Carter, Marjorie; Divita, Guy; Shen, Shuying; Palmer, Miland; Samore, Matthew H

    2013-12-01

    To develop algorithms to improve efficiency of patient phenotyping using natural language processing (NLP) on text data. Of a large number of note titles available in our database, we sought to determine those with highest yield and precision for psychosocial concepts. From a database of over 1 billion documents from US Department of Veterans Affairs medical facilities, a random sample of 1500 documents from each of 218 enterprise note titles were chosen. Psychosocial concepts were extracted using a UIMA-AS-based NLP pipeline (v3NLP), using a lexicon of relevant concepts with negation and template format annotators. Human reviewers evaluated a subset of documents for false positives and sensitivity. High-yield documents were identified by hit rate and precision. Reasons for false positivity were characterized. A total of 58 707 psychosocial concepts were identified from 316 355 documents for an overall hit rate of 0.2 concepts per document (median 0.1, range 1.6-0). Of 6031 concepts reviewed from a high-yield set of note titles, the overall precision for all concept categories was 80%, with variability among note titles and concept categories. Reasons for false positivity included templating, negation, context, and alternate meaning of words. The sensitivity of the NLP system was noted to be 49% (95% CI 43% to 55%). Phenotyping using NLP need not involve the entire document corpus. Our methods offer a generalizable strategy for scaling NLP pipelines to large free text corpora with complex linguistic annotations in attempts to identify patients of a certain phenotype.

  11. Validating a strategy for psychosocial phenotyping using a large corpus of clinical text

    PubMed Central

    Gundlapalli, Adi V; Redd, Andrew; Carter, Marjorie; Divita, Guy; Shen, Shuying; Palmer, Miland; Samore, Matthew H

    2013-01-01

    Objective To develop algorithms to improve efficiency of patient phenotyping using natural language processing (NLP) on text data. Of a large number of note titles available in our database, we sought to determine those with highest yield and precision for psychosocial concepts. Materials and methods From a database of over 1 billion documents from US Department of Veterans Affairs medical facilities, a random sample of 1500 documents from each of 218 enterprise note titles were chosen. Psychosocial concepts were extracted using a UIMA-AS-based NLP pipeline (v3NLP), using a lexicon of relevant concepts with negation and template format annotators. Human reviewers evaluated a subset of documents for false positives and sensitivity. High-yield documents were identified by hit rate and precision. Reasons for false positivity were characterized. Results A total of 58 707 psychosocial concepts were identified from 316 355 documents for an overall hit rate of 0.2 concepts per document (median 0.1, range 1.6–0). Of 6031 concepts reviewed from a high-yield set of note titles, the overall precision for all concept categories was 80%, with variability among note titles and concept categories. Reasons for false positivity included templating, negation, context, and alternate meaning of words. The sensitivity of the NLP system was noted to be 49% (95% CI 43% to 55%). Conclusions Phenotyping using NLP need not involve the entire document corpus. Our methods offer a generalizable strategy for scaling NLP pipelines to large free text corpora with complex linguistic annotations in attempts to identify patients of a certain phenotype. PMID:24169276
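
    The overall hit rate reported above is simply concepts found divided by documents processed. A minimal sketch of that arithmetic, using the two counts from the abstract (the function name `hit_rate` is an assumption for illustration):

```python
def hit_rate(concepts_found, documents_processed):
    """Average number of extracted concepts per processed document."""
    if documents_processed <= 0:
        raise ValueError("documents_processed must be positive")
    return concepts_found / documents_processed


# Counts taken from the abstract: 58 707 concepts in 316 355 documents.
rate = hit_rate(58_707, 316_355)
rate_rounded = round(rate, 1)  # 0.2, matching the reported overall hit rate
```

    The same ratio computed per note title would be one plausible way to rank titles by yield, although the abstract does not spell out the exact ranking rule.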

  12. Distributed EDLSI, BM25, and Power Norm at TREC 2008

    DTIC Science & Technology

    2008-11-01

    LSI would work on the IIT Complex Document Information Processing (IIT CDIP) test collection, which contains approximately 7 million documents (57 GB...requirements, specifically the memory, for EDLSI are reduced over LSI, they are still significant, especially for a corpus the size of IIT CDIP. After...data, for training purposes. Initially we ran the 2006 and 2007 queries against the IIT CDIP corpus and developed a pseudo submission file containing

  13. Cross-language information retrieval using PARAFAC2.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bader, Brett William; Chew, Peter; Abdelali, Ahmed

    A standard approach to cross-language information retrieval (CLIR) uses Latent Semantic Analysis (LSA) in conjunction with a multilingual parallel aligned corpus. This approach has been shown to be successful in identifying similar documents across languages - or more precisely, retrieving the most similar document in one language to a query in another language. However, the approach has severe drawbacks when applied to a related task, that of clustering documents 'language-independently', so that documents about similar topics end up closest to one another in the semantic space regardless of their language. The problem is that documents are generally more similar to other documents in the same language than they are to documents in a different language, but on the same topic. As a result, when using multilingual LSA, documents will in practice cluster by language, not by topic. We propose a novel application of PARAFAC2 (which is a variant of PARAFAC, a multi-way generalization of the singular value decomposition [SVD]) to overcome this problem. Instead of forming a single multilingual term-by-document matrix which, under LSA, is subjected to SVD, we form an irregular three-way array, each slice of which is a separate term-by-document matrix for a single language in the parallel corpus. The goal is to compute an SVD for each language such that V (the matrix of right singular vectors) is the same across all languages. Effectively, PARAFAC2 imposes the constraint, not present in standard LSA, that the 'concepts' in all documents in the parallel corpus are the same regardless of language. Intuitively, this constraint makes sense, since the whole purpose of using a parallel corpus is that exactly the same concepts are expressed in the translations. We tested this approach by comparing the performance of PARAFAC2 with standard LSA in solving a particular CLIR problem. From our results, we conclude that PARAFAC2 offers a very promising alternative to LSA not only for multilingual document clustering, but also for solving other problems in cross-language information retrieval.

  14. Factors associated with birth defects in the region of Corpus Christi, Texas

    EPA Science Inventory

    In recent years, the Birth Defects Epidemiology & Surveillance Branch of the Texas Department of State Health Services (DSHS) has documented a high prevalence of certain birth defects in the Corpus Christi, TX region. We conducted a case-control study to evaluate associations...

  15. Functional and magnetic resonance imaging correlates of corpus callosum in normal pressure hydrocephalus before and after shunting

    PubMed Central

    Mataró, Maria; Matarín, Mar; Poca, Maria Antonia; Pueyo, Roser; Sahuquillo, Juan; Barrios, Maite; Junqué, Carme

    2007-01-01

    Background Normal pressure hydrocephalus (NPH) is associated with corpus callosum abnormalities. Objectives To study the clinical and neuropsychological effect of callosal thinning in 18 patients with idiopathic NPH and to investigate the postsurgical callosal changes in 14 patients. Methods Global corpus callosum size and seven callosal subdivisions were measured. Neuropsychological assessment included an extensive battery assessing memory, psychomotor speed, visuospatial and frontal lobe functioning. Results After surgery, patients showed improvements in memory, visuospatial and frontal lobe functions, and psychomotor speed. Two frontal corpus callosum areas, the genu and the rostral body, were the regions most related to the clinical and neuropsychological dysfunction. After surgery, total corpus callosum and four of the seven subdivisions presented a significant increase in size, which was related to poorer neuropsychological and clinical outcome. Conclusion The postsurgical corpus callosum increase might be the result of decompression, re‐expansion and increase of interstitial fluid, although it may also be caused by differences in shape due to cerebral reorganisation. PMID:17056634

  16. Terminology model discovery using natural language processing and visualization techniques.

    PubMed

    Zhou, Li; Tao, Ying; Cimino, James J; Chen, Elizabeth S; Liu, Hongfang; Lussier, Yves A; Hripcsak, George; Friedman, Carol

    2006-12-01

    Medical terminologies are important for unambiguous encoding and exchange of clinical information. The traditional manual method of developing terminology models is time-consuming and limited in the number of phrases that a human developer can examine. In this paper, we present an automated method for developing medical terminology models based on natural language processing (NLP) and information visualization techniques. Surgical pathology reports were selected as the testing corpus for developing a pathology procedure terminology model. The use of a general NLP processor for the medical domain, MedLEE, provides an automated method for acquiring semantic structures from a free text corpus and sheds light on a new high-throughput method of medical terminology model development. The use of an information visualization technique supports the summarization and visualization of the large quantity of semantic structures generated from medical documents. We believe that a general method based on NLP and information visualization will facilitate the modeling of medical terminologies.

  17. Reconciling disparate information in continuity of care documents: Piloting a system to consolidate structured clinical documents.

    PubMed

    Hosseini, Masoud; Jones, Josette; Faiola, Anthony; Vreeman, Daniel J; Wu, Huanmei; Dixon, Brian E

    2017-10-01

    Due to the nature of information generation in health care, clinical documents contain duplicate and sometimes conflicting information. Recent implementation of Health Information Exchange (HIE) mechanisms in which clinical summary documents are exchanged among disparate health care organizations can proliferate duplicate and conflicting information. To reduce information overload, a system to automatically consolidate information across multiple clinical summary documents was developed for an HIE network. The system receives any number of Continuity of Care Documents (CCDs) and outputs a single, consolidated record. To test the system, a randomly sampled corpus of 522 CCDs representing 50 unique patients was extracted from a large HIE network. The automated methods were compared to manual consolidation of information for three key sections of the CCD: problems, allergies, and medications. Manual consolidation of 11,631 entries was completed in approximately 150h. The same data were automatically consolidated in 3.3min. The system successfully consolidated 99.1% of problems, 87.0% of allergies, and 91.7% of medications. Almost all of the inaccuracies were caused by issues involving the use of standardized terminologies within the documents to represent individual information entries. This study represents a novel, tested tool for de-duplication and consolidation of CDA documents, which is a major step toward improving information access and the interoperability among information systems. While more work is necessary, automated systems like the one evaluated in this study will be necessary to meet the informatics needs of providers and health systems in the future. Copyright © 2017 Elsevier Inc. All rights reserved.
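
    To put the reported timings in perspective, the figures in the abstract (about 150 h manual, 3.3 min automated, 11,631 entries) can be converted to a common unit. This is only a worked calculation on the published numbers; the variable names are illustrative.

```python
# Figures quoted in the abstract.
manual_hours = 150.0
automated_minutes = 3.3
entries = 11_631

manual_seconds = manual_hours * 3600
automated_seconds = automated_minutes * 60

# How many times faster the automated run was (roughly 2727x).
speedup = manual_seconds / automated_seconds

# Average manual effort per entry (roughly 46.4 seconds).
manual_seconds_per_entry = manual_seconds / entries

# Automated throughput (roughly 58.7 entries per second).
automated_entries_per_second = entries / automated_seconds
```

    Because the manual time is stated as approximate, these derived values should be read as orders of magnitude rather than precise measurements.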

  18. Vietnamese Document Representation and Classification

    NASA Astrophysics Data System (ADS)

    Nguyen, Giang-Son; Gao, Xiaoying; Andreae, Peter

    Vietnamese is very different from English and little research has been done on Vietnamese document classification, or indeed, on any kind of Vietnamese language processing, and only a few small corpora are available for research. We created a large Vietnamese text corpus with about 18000 documents, and manually classified them based on different criteria such as topics and styles, giving several classification tasks of different difficulty levels. This paper introduces a new syllable-based document representation at the morphological level of the language for efficient classification. We tested the representation on our corpus with different classification tasks using six classification algorithms and two feature selection techniques. Our experiments show that the new representation is effective for Vietnamese categorization, and suggest that best performance can be achieved using syllable-pair document representation, an SVM with a polynomial kernel as the learning algorithm, and using Information gain and an external dictionary for feature selection.

  19. Automated Classification of Selected Data Elements from Free-text Diagnostic Reports for Clinical Research.

    PubMed

    Löpprich, Martin; Krauss, Felix; Ganzinger, Matthias; Senghas, Karsten; Riezler, Stefan; Knaup, Petra

    2016-08-05

    In the Multiple Myeloma clinical registry at Heidelberg University Hospital, most data are extracted from discharge letters. Our aim was to analyze if it is possible to make the manual documentation process more efficient by using methods of natural language processing for multiclass classification of free-text diagnostic reports to automatically document the diagnosis and state of disease of myeloma patients. The first objective was to create a corpus consisting of free-text diagnosis paragraphs of patients with multiple myeloma from German diagnostic reports, and its manual annotation of relevant data elements by documentation specialists. The second objective was to construct and evaluate a framework using different NLP methods to enable automatic multiclass classification of relevant data elements from free-text diagnostic reports. The main diagnoses paragraph was extracted from the clinical report of one third randomly selected patients of the multiple myeloma research database from Heidelberg University Hospital (in total 737 selected patients). An EDC system was set up and two data entry specialists independently performed a manual documentation of at least nine specific data elements for multiple myeloma characterization. Both data entries were compared and assessed by a third specialist and an annotated text corpus was created. A framework was constructed, consisting of a self-developed package to split multiple diagnosis sequences into several subsequences, four different preprocessing steps to normalize the input data and two classifiers: a maximum entropy classifier (MEC) and a support vector machine (SVM). In total 15 different pipelines were examined and assessed by a ten-fold cross-validation, reiterated 100 times. For quality indication the average error rate and the average F1-score were computed. For significance testing the approximate randomization test was used. The created annotated corpus consists of 737 different diagnoses paragraphs with a total number of 865 coded diagnoses. The dataset is publicly available in the supplementary online files for training and testing of further NLP methods. Both classifiers showed low average error rates (MEC: 1.05; SVM: 0.84) and high F1-scores (MEC: 0.89; SVM: 0.92). However, the results varied widely depending on the classified data element. Preprocessing methods increased this effect and had significant impact on the classification, both positive and negative. The automatic diagnosis splitter increased the average error rate significantly, even if the F1-score decreased only slightly. The low average error rates and high average F1-scores of each pipeline demonstrate the suitability of the investigated NLP methods. However, it was also shown that there is no best practice for an automatic classification of data elements from free-text diagnostic reports.

  20. Patient, Physician, and Nurse Factors Associated With Entry Onto Clinical Trials and Finishing Treatment in Patients With Primary or Recurrent Uterine, Endometrial, or Cervical Cancer

    ClinicalTrials.gov

    2018-04-11

    Recurrent Cervical Carcinoma; Recurrent Uterine Corpus Carcinoma; Recurrent Uterine Corpus Sarcoma; Stage I Uterine Corpus Cancer; Stage I Uterine Sarcoma; Stage IA Cervical Cancer; Stage IB Cervical Cancer; Stage II Uterine Corpus Cancer; Stage II Uterine Sarcoma; Stage IIA Cervical Cancer; Stage IIB Cervical Cancer; Stage III Cervical Cancer; Stage III Uterine Corpus Cancer; Stage III Uterine Sarcoma; Stage IV Uterine Corpus Cancer; Stage IV Uterine Sarcoma; Stage IVA Cervical Cancer; Stage IVB Cervical Cancer

  1. Postnatal Microstructural Developmental Trajectory of Corpus Callosum Subregions and Relationship to Clinical Factors in Very Preterm Infants.

    PubMed

    Teli, Radhika; Hay, Margaret; Hershey, Alexa; Kumar, Manoj; Yin, Han; Parikh, Nehal A

    2018-05-15

    Our objectives were to define the microstructural developmental trajectory of six corpus callosum subregions and identify perinatal clinical factors that influence early development of these subregions in very preterm infants. We performed a longitudinal cohort study of very preterm infants (32 weeks gestational age or younger) (N = 36) who underwent structural MRI and diffusion tensor imaging serially at four time points - before 32, 32, 38, and 52 weeks postmenstrual age. We divided the corpus callosum into six subregions, performed probabilistic tractography, and used linear mixed effects models to evaluate the influence of antecedent clinical factors on its microstructural growth trajectory. The genu and splenium demonstrated the most rapid developmental maturation, exhibited by a steep increase in fractional anisotropy. We identified several factors that favored greater corpus callosum microstructural development, including advancing postmenstrual age, higher birth weight, and college level or higher maternal education. Bronchopulmonary dysplasia, low 5-minute Apgar scores, caffeine therapy/apnea of prematurity and male sex were associated with reduced corpus callosum microstructural integrity/development over the first six months after very preterm birth. We identified a unique postnatal microstructural growth trajectory and associated clinical factor profile for each of the six corpus callosum subregions that is consistent with the heterogeneous functional role of these white matter subregions.

  2. Generation of an annotated reference standard for vaccine adverse event reports.

    PubMed

    Foster, Matthew; Pandey, Abhishek; Kreimeyer, Kory; Botsis, Taxiarchis

    2018-07-05

    As part of a collaborative project between the US Food and Drug Administration (FDA) and the Centers for Disease Control and Prevention for the development of a web-based natural language processing (NLP) workbench, we created a corpus of 1000 Vaccine Adverse Event Reporting System (VAERS) reports annotated for 36,726 clinical features, 13,365 temporal features, and 22,395 clinical-temporal links. This paper describes the final corpus, as well as the methodology used to create it, so that clinical NLP researchers outside FDA can evaluate the utility of the corpus to aid their own work. The creation of this standard went through four phases: pre-training, pre-production, production-clinical feature annotation, and production-temporal annotation. The pre-production phase used a double annotation followed by adjudication strategy to refine and finalize the annotation model while the production phases followed a single annotation strategy to maximize the number of reports in the corpus. An analysis of 30 reports randomly selected as part of a quality control assessment yielded accuracies of 0.97, 0.96, and 0.83 for clinical features, temporal features, and clinical-temporal associations, respectively and speaks to the quality of the corpus. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. Deontic Modals in RP-US Visiting Forces Agreement (VFA): A Corpus-Based Analysis

    ERIC Educational Resources Information Center

    Dela Rosa, John Paul Obillos

    2017-01-01

    The marriage between language and the law is apparent in any legal document of whatever purpose. Hence, at present, studies on the language of the law are definitely in vogue. Grounded on Quirk et al. (1985) and Matulewska's (2010) description of deontic modality, this corpus-based linguistic study aimed at analyzing the use of deontic modals in…

  4. Automatic generation of stop word lists for information retrieval and analysis

    DOEpatents

    Rose, Stuart J

    2013-01-08

    Methods and systems for automatically generating lists of stop words for information retrieval and analysis. Generation of the stop words can include providing a corpus of documents and a plurality of keywords. From the corpus of documents, a term list of all terms is constructed and both a keyword adjacency frequency and a keyword frequency are determined. If a ratio of the keyword adjacency frequency to the keyword frequency for a particular term on the term list is less than a predetermined value, then that term is excluded from the term list. The resulting term list is truncated based on predetermined criteria to form a stop word list.
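
    The filtering rule in this abstract can be sketched in a few lines of Python. This is an illustration only, not the patented method: it assumes that "keyword frequency" counts occurrences of a term inside a keyword and that "keyword adjacency frequency" counts occurrences immediately next to a keyword, and it assumes truncation by overall term frequency. The function names and the parameters min_ratio and max_size are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mark_keyword_tokens(tokens, keywords):
    """Return a list of booleans flagging tokens that lie inside a keyword."""
    inside = [False] * len(tokens)
    for phrase in keywords:
        words = tokenize(phrase)
        n = len(words)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words:
                inside[i:i + n] = [True] * n
    return inside


def generate_stop_words(documents, min_ratio=1.0, max_size=10):
    """documents: list of (text, keywords) pairs; returns a stop word list.

    Assumed definitions (not stated in the abstract):
      keyword frequency   = occurrences of a term inside a keyword
      adjacency frequency = occurrences of a term right next to a keyword
    """
    term_freq = Counter()
    keyword_freq = Counter()
    adjacency_freq = Counter()
    for text, keywords in documents:
        tokens = tokenize(text)
        inside = mark_keyword_tokens(tokens, keywords)
        term_freq.update(tokens)
        for i, token in enumerate(tokens):
            if inside[i]:
                keyword_freq[token] += 1
                continue
            left = i > 0 and inside[i - 1]
            right = i + 1 < len(tokens) and inside[i + 1]
            if left or right:
                adjacency_freq[token] += 1

    # Exclude terms whose adjacency/keyword ratio is below min_ratio.
    # Terms never seen next to a keyword are also excluded (an assumption).
    kept = [
        term for term in term_freq
        if adjacency_freq[term] > 0
        and adjacency_freq[term] >= min_ratio * keyword_freq[term]
    ]
    # Truncate: keep the most frequent terms (assumed truncation criterion).
    kept.sort(key=lambda term: (-term_freq[term], term))
    return kept[:max_size]
```

    Comparing adjacency_freq against min_ratio * keyword_freq avoids dividing by zero for terms that never occur inside a keyword; such terms are treated as having an unbounded ratio and are kept when they are adjacent to at least one keyword.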

  5. Supporting the annotation of chronic obstructive pulmonary disease (COPD) phenotypes with text mining workflows.

    PubMed

    Fu, Xiao; Batista-Navarro, Riza; Rak, Rafal; Ananiadou, Sophia

    2015-01-01

    Chronic obstructive pulmonary disease (COPD) is a life-threatening lung disorder whose recent prevalence has led to an increasing burden on public healthcare. Phenotypic information in electronic clinical records is essential in providing suitable personalised treatment to patients with COPD. However, as phenotypes are often "hidden" within free text in clinical records, clinicians could benefit from text mining systems that facilitate their prompt recognition. This paper reports on a semi-automatic methodology for producing a corpus that can ultimately support the development of text mining tools that, in turn, will expedite the process of identifying groups of COPD patients. A corpus of 30 full-text papers was formed based on selection criteria informed by the expertise of COPD specialists. We developed an annotation scheme that is aimed at producing fine-grained, expressive and computable COPD annotations without burdening our curators with a highly complicated task. This was implemented in the Argo platform by means of a semi-automatic annotation workflow that integrates several text mining tools, including a graphical user interface for marking up documents. When evaluated using gold standard (i.e., manually validated) annotations, the semi-automatic workflow was shown to obtain a micro-averaged F-score of 45.70% (with relaxed matching). Utilising the gold standard data to train new concept recognisers, we demonstrated that our corpus, although still a work in progress, can foster the development of significantly better performing COPD phenotype extractors. We describe in this work the means by which we aim to eventually support the process of COPD phenotype curation, i.e., by the application of various text mining tools integrated into an annotation workflow. 
    Although the corpus being described is still under development, our results thus far are encouraging and show great potential in stimulating the development of further automatic COPD phenotype extractors.

  6. Reaction Time Is Negatively Associated with Corpus Callosum Area in the Early Stages of CADASIL.

    PubMed

    Delorme, S; De Guio, F; Reyes, S; Jabouley, A; Chabriat, H; Jouvent, E

    2017-11-01

    Reaction time was recently recognized as a marker of subtle cognitive and behavioral alterations in the early clinical stages of CADASIL, a monogenic cerebral small-vessel disease. In unselected patients with CADASIL, brain atrophy and lacunes are the main imaging correlates of disease severity, but MR imaging correlates of reaction time in mildly affected patients are unknown. We hypothesized that reaction time is independently associated with the corpus callosum area in the early clinical stages of CADASIL. Twenty-six patients with CADASIL without dementia (Mini-Mental State Examination score > 24 and no cognitive symptoms) and without disability (modified Rankin Scale score ≤ 1) were compared with 29 age- and sex-matched controls. Corpus callosum area was determined on 3D-T1 MR imaging sequences with validated methodology. Between-group comparisons were performed with t tests or χ² tests when appropriate. Relationships between reaction time and corpus callosum area were tested using linear regression modeling. Reaction time was significantly related to corpus callosum area in patients (estimate = -7.4 × 10³, standard error = 3.3 × 10³, P = .03) even after adjustment for age, sex, level of education, and scores of depression and apathy (estimate = -12.2 × 10³, standard error = 3.8 × 10³, P = .005). No significant relationship was observed in controls. Corpus callosum area, a simple and robust imaging parameter, appears to be an independent correlate of reaction time at the early clinical stages of CADASIL. Further studies will determine whether corpus callosum area can be used as an outcome in future clinical trials in CADASIL or in more prevalent small-vessel diseases. © 2017 by American Journal of Neuroradiology.

  7. Comparison of Data Development Tools for Populating Cognitive Models in Social Simulation

    DTIC Science & Technology

    2011-09-01

    world surveys. STANLEY was evaluated by scoring sentiment in a document corpus and attempting to correlate those scores to a real world issue. Results of the study indicate that the survey data tool generated case files of...

  8. 77 FR 2448 - Special Local Regulation; HITS Triathlon; Corpus Christi Bayfront, Corpus Christi, TX

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-01-18

    ..., 2012. This regulation will be enforced on February 18, 2012 from 6:45 a.m. to 8:15 a.m. and 11:45 a.m. to 1:15 p.m., and on February 19, 2012 from 6:45 a.m. to 9:45 a.m. ADDRESSES: Documents indicated in... clicking "Search." They are also available for inspection or copying at the Docket Management Facility (M...

  9. Developing a corpus of clinical notes manually annotated for part-of-speech.

    PubMed

    Pakhomov, Serguei V; Coden, Anni; Chute, Christopher G

    2006-06-01

    This paper presents a project whose main goal is to construct a corpus of clinical text manually annotated for part-of-speech (POS) information. We describe and discuss the process of training three domain experts to perform linguistic annotation. Three domain experts were trained to perform manual annotation of a corpus of clinical notes. A part of this corpus was combined with the Penn Treebank corpus of general purpose English text and another part was set aside for testing. The corpora were then used for training and testing statistical part-of-speech taggers. We list some of the challenges as well as encouraging results pertaining to inter-rater agreement and consistency of annotation. We used the Trigrams'n'Tags (TnT) [T. Brants, TnT-a statistical part-of-speech tagger, In: Proceedings of NAACL/ANLP-2000 Symposium, 2000] tagger trained on general English data to achieve 89.79% correctness. The same tagger trained on a portion of the medical data annotated for this project improved the performance to 94.69%. Furthermore, we find that discriminating between different types of discourse represented by different sections of clinical text may be very beneficial to improve correctness of POS tagging. Our preliminary experimental results indicate the necessity for adapting state-of-the-art POS taggers to the sublanguage domain of clinical text.

  10. It’s about This and That: A Description of Anaphoric Expressions in Clinical Text

    PubMed Central

    Wang, Yan; Melton, Genevieve B.; Pakhomov, Serguei

    2011-01-01

    Although anaphoric expressions are very common in biomedical and clinical documents, little work has been done to systematically characterize their use in clinical text. Samples of ‘it’, ‘this’, and ‘that’ expressions occurring in inpatient clinical notes from four metropolitan hospitals were analyzed using a combination of semi-automated and manual annotation techniques. We developed a rule-based approach to filter potential non-referential expressions. A physician then manually annotated 1000 potential referential instances to determine referent status and the antecedent of each referent expression. A distributional analysis of the three referring expressions in the entire corpus of notes demonstrates a high prevalence of anaphora and large variance in distributions of referential expressions with different notes. Our results confirm that anaphoric expressions are common in clinical texts. Effective co-reference resolution with anaphoric expressions remains an important challenge in medical natural language processing research. PMID:22195211

  11. Biomedical literature classification using encyclopedic knowledge: a Wikipedia-based bag-of-concepts approach.

    PubMed

    Mouriño García, Marcos Antonio; Pérez Rodríguez, Roberto; Anido Rifón, Luis E

    2015-01-01

    Automatic classification of text documents into a set of categories has many applications. Among them, the automatic classification of biomedical literature stands out as an important application of automatic document classification strategies. Biomedical staff and researchers have to deal with a large volume of literature in their daily activities, so a system that gives simple and effective access to documents of interest would be useful; such a system requires that the documents be sorted according to some criteria, that is to say, classified. Documents to classify are usually represented following the bag-of-words (BoW) paradigm. Features are words in the text (and thus suffer from synonymy and polysemy), and their weights are based only on their frequency of occurrence. This paper presents an empirical study of the efficiency of a classifier that leverages encyclopedic background knowledge (concretely, Wikipedia) to create bag-of-concepts (BoC) representations of documents, understanding a concept as a "unit of meaning", and thus tackling synonymy and polysemy. In addition, the weighting of concepts is based on their semantic relevance in the text. For the evaluation of the proposal, empirical experiments were conducted with one of the commonly used corpora for evaluating classification and retrieval of biomedical information, OHSUMED, and also with a purpose-built corpus of MEDLINE biomedical abstracts, UVigoMED. The results show that the Wikipedia-based bag-of-concepts representation outperforms the classical bag-of-words representation by up to 157% in the single-label classification problem and by up to 100% in the multi-label problem for the OHSUMED corpus, and by up to 122% in the single-label classification problem and by up to 155% in the multi-label problem for the UVigoMED corpus.
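
    The contrast between the two representations can be illustrated with a toy Python sketch. It is not the authors' system: the CONCEPTS table below is a hypothetical stand-in for a Wikipedia-based annotator, and plain counts stand in for the semantic-relevance weights mentioned in the abstract.

```python
from collections import Counter

# Hypothetical surface-form -> concept table standing in for a
# Wikipedia-based semantic annotator (an assumption for illustration).
CONCEPTS = {
    "heart attack": "Myocardial infarction",
    "myocardial infarction": "Myocardial infarction",
    "surgery": "Surgery",
}


def bag_of_words(text):
    """Bag-of-words: each distinct word is a feature weighted by its count."""
    return Counter(text.lower().split())


def bag_of_concepts(text, concepts=CONCEPTS, max_len=2):
    """Bag-of-concepts: map the longest matching phrases to concept labels."""
    tokens = text.lower().split()
    bag = Counter()
    i = 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            if i + n > len(tokens):
                continue
            phrase = " ".join(tokens[i:i + n])
            if phrase in concepts:
                bag[concepts[phrase]] += 1
                i += n
                break
        else:
            # No concept starts at this token; skip it.
            i += 1
    return bag
```

    Under these assumptions, "heart attack after surgery" and "myocardial infarction after surgery" yield different bags of words but the same bag of concepts, which is the synonymy effect the abstract refers to.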

  12. Biomedical literature classification using encyclopedic knowledge: a Wikipedia-based bag-of-concepts approach

    PubMed Central

    Pérez Rodríguez, Roberto; Anido Rifón, Luis E.

    2015-01-01

    Automatic classification of text documents into a set of categories has many applications. Among them, the automatic classification of biomedical literature stands out as an important application of automatic document classification strategies. Biomedical staff and researchers have to deal with a large volume of literature in their daily activities, so a system that gives simple and effective access to documents of interest would be useful; such a system requires that the documents be sorted according to some criteria, that is to say, classified. Documents to classify are usually represented following the bag-of-words (BoW) paradigm. Features are words in the text (and thus suffer from synonymy and polysemy), and their weights are based only on their frequency of occurrence. This paper presents an empirical study of the efficiency of a classifier that leverages encyclopedic background knowledge (concretely, Wikipedia) to create bag-of-concepts (BoC) representations of documents, understanding a concept as a “unit of meaning”, and thus tackling synonymy and polysemy. In addition, the weighting of concepts is based on their semantic relevance in the text. For the evaluation of the proposal, empirical experiments were conducted with one of the commonly used corpora for evaluating classification and retrieval of biomedical information, OHSUMED, and also with a purpose-built corpus of MEDLINE biomedical abstracts, UVigoMED. The results show that the Wikipedia-based bag-of-concepts representation outperforms the classical bag-of-words representation by up to 157% in the single-label classification problem and by up to 100% in the multi-label problem for the OHSUMED corpus, and by up to 122% in the single-label classification problem and by up to 155% in the multi-label problem for the UVigoMED corpus. PMID:26468436

  13. Mapping annotations with textual evidence using an scLDA model.

    PubMed

    Jin, Bo; Chen, Vicky; Chen, Lujia; Lu, Xinghua

    2011-01-01

    Most of the knowledge regarding genes and proteins is stored in biomedical literature as free text. Extracting information from complex biomedical texts demands techniques capable of inferring biological concepts from local text regions and mapping them to controlled vocabularies. To this end, we present a sentence-based correspondence latent Dirichlet allocation (scLDA) model which, when trained with a corpus of PubMed documents with known GO annotations, performs the following tasks: 1) learning major biological concepts from the corpus, 2) inferring the biological concepts existing within text regions (sentences), and 3) identifying the text regions in a document that provide evidence for the observed annotations. When applied to new gene-related documents, a trained scLDA model is capable of predicting GO annotations and identifying text regions as textual evidence supporting the predicted annotations. This study uses GO annotation data as a testbed; the approach can be generalized to other annotated data, such as MeSH and MEDLINE documents.

  14. Developing a clinical hypermedia corpus: experiences from the use of a practice-centered method.

    PubMed Central

    Timpka, T.; Nyce, J. M.; Sjöberg, C.; Hedblom, P.; Lindblom, P.

    1992-01-01

    This paper outlines a practice-centered method for creation of a hypermedia corpus. It also describes experiences with creating such a corpus of information to support interprofessional work at a Primary Healthcare Center. From these experiences, a number of basic issues regarding information systems development within medical informatics will be discussed. PMID:1482924

  15. Towards Phenotyping of Clinical Trial Eligibility Criteria.

    PubMed

    Löbe, Matthias; Stäubert, Sebastian; Goldberg, Colleen; Haffner, Ivonne; Winter, Alfred

    2018-01-01

    Medical plaintext documents contain important facts about patients, but they are rarely available for structured queries. The provision of structured information from natural language texts in addition to the existing structured data can significantly speed up the search for fulfilled inclusion criteria and thus improve the recruitment rate. This work is aimed at supporting clinical trial recruitment with text mining techniques to identify suitable subjects in hospitals. Based on the inclusion/exclusion criteria of 5 sample studies and a text corpus consisting of 212 doctor's letters and medical follow-up documentation from a university cancer center, a prototype was developed and technically evaluated using NLP procedures (UIMA) for the extraction of facts from medical free texts. It was found that although the extracted entities are not always correct (precision between 23% and 96%), they provide a decisive indication as to which patient file should be read preferentially. The prototype presented here demonstrates the technical feasibility. In order to find available, lucrative phenotypes, an in-depth evaluation is required.

  16. Myriad presentations of penile fracture: report of three cases and review of literature

    PubMed Central

    Faridi, M. S.; Agarwal, Nitin; Saini, Pradeep; Kaur, Navneet; Gupta, Arun

    2015-01-01

    Penile fracture is an uncommon, though not rare, condition that is underreported. It is classically defined as disruption of the tunica albuginea with rupture of the corpus cavernosum. Clinically, penile fracture can be confused with rupture of the corpus spongiosum. We therefore present three cases that illustrate its varied clinical presentation and management. In the first patient, there was a tear in the corpus spongiosum and a partial tear in the ventral urethra. Both defects were repaired with interrupted sutures. In the second patient, there was a rupture of the corpus cavernosum, which was primarily repaired. One year after the primary surgery, the patient returned with similar complaints, and scar dehiscence was diagnosed. The patient was treated conservatively, with satisfactory results on follow-up. The third patient presented with a 1-week history. Intraoperative findings revealed only a hematoma, without any defect in the corpora cavernosa, corpus spongiosum, or urethra. Only evacuation of the hematoma was performed. Early surgical treatment of penile fracture is advantageous. In recurrent penile fracture without penile deformity or any reasonable clinical and radiological evidence, conservative management is advocated. Even when presentation is delayed by up to 1 week, operative management has shown good results. PMID:25949981

  17. Myriad presentations of penile fracture: report of three cases and review of literature.

    PubMed

    Faridi, M S; Agarwal, Nitin; Saini, Pradeep; Kaur, Navneet; Gupta, Arun

    2015-01-01

    Penile fracture is an uncommon, though not rare, condition that is underreported. It is classically defined as disruption of the tunica albuginea with rupture of the corpus cavernosum. Clinically, penile fracture can be confused with rupture of the corpus spongiosum. We therefore present three cases that illustrate its varied clinical presentation and management. In the first patient, there was a tear in the corpus spongiosum and a partial tear in the ventral urethra. Both defects were repaired with interrupted sutures. In the second patient, there was a rupture of the corpus cavernosum, which was primarily repaired. One year after the primary surgery, the patient returned with similar complaints, and scar dehiscence was diagnosed. The patient was treated conservatively, with satisfactory results on follow-up. The third patient presented with a 1-week history. Intraoperative findings revealed only a hematoma, without any defect in the corpora cavernosa, corpus spongiosum, or urethra. Only evacuation of the hematoma was performed. Early surgical treatment of penile fracture is advantageous. In recurrent penile fracture without penile deformity or any reasonable clinical and radiological evidence, conservative management is advocated. Even when presentation is delayed by up to 1 week, operative management has shown good results.

  18. DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1

    NASA Astrophysics Data System (ADS)

    Garofolo, J. S.; Lamel, L. F.; Fisher, W. M.; Fiscus, J. G.; Pallett, D. S.

    1993-02-01

    The Texas Instruments/Massachusetts Institute of Technology (TIMIT) corpus of read speech has been designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems. TIMIT contains speech from 630 speakers representing 8 major dialect divisions of American English, each speaking 10 phonetically-rich sentences. The TIMIT corpus includes time-aligned orthographic, phonetic, and word transcriptions, as well as speech waveform data for each spoken sentence. The release of TIMIT contains several improvements over the Prototype CD-ROM released in December, 1988: (1) full 630-speaker corpus, (2) checked and corrected transcriptions, (3) word-alignment transcriptions, (4) NIST SPHERE-headered waveform files and header manipulation software, (5) phonemic dictionary, (6) new test and training subsets balanced for dialectal and phonetic coverage, and (7) more extensive documentation.

  19. Classification of health webpages as expert and non expert with a reduced set of cross-language features.

    PubMed

    Grabar, Natalia; Krivine, Sonia; Jaulent, Marie-Christine

    2007-10-11

    Making the distinction between expert and non expert health documents can help users to select the information which is more suitable for them, according to whether they are familiar or not with medical terminology. This issue is particularly important for the information retrieval area. In our work we address this purpose through stylistic corpus analysis and the application of machine learning algorithms. Our hypothesis is that this distinction can be performed on the basis of a small number of features and that such features can be language and domain independent. The features were acquired on a source corpus (Russian language, diabetes topic) and then tested on target (French language, pneumology topic) and source corpora. These cross-language features show 90% precision and 93% recall with non expert documents in source language; and 85% precision and 74% recall with expert documents in target language.

  20. UCD IIRG at TREC 2012 Medical Track

    DTIC Science & Technology

    2012-11-01

    documents. For example, the query “shakespeare.author” would ensure that documents matching shakespeare in the author field are returned. On the...corpus side, field extents are identified using XML-like markup, e.g. <author> shakespeare </author>. 3 System Background & Motivation This section outlines

  1. Conjunctive Cohesion in English Language EU Documents--A Corpus-Based Analysis and Its Implications

    ERIC Educational Resources Information Center

    Trebits, Anna

    2009-01-01

    This paper reports the findings of a study which forms part of a larger-scale research project investigating the use of English in the documents of the European Union (EU). The documents of the EU show various features of texts written for legal, business and other specific purposes. Moreover, the translation services of the EU institutions often…

  2. A unified approach for development of Urdu Corpus for OCR and demographic purpose

    NASA Astrophysics Data System (ADS)

    Choudhary, Prakash; Nain, Neeta; Ahmed, Mushtaq

    2015-02-01

    This paper presents a methodology for the development of an Urdu handwritten text image corpus and the application of corpus linguistics in the field of OCR and information retrieval from handwritten documents. Compared to other language scripts, Urdu script is somewhat complicated for data entry: entering a single character requires a combination of multiple keystrokes. Here, a mixed approach is proposed and demonstrated for building an Urdu corpus for OCR and demographic data collection. The demographic part of the database could be used to train a system to fetch the data automatically, which would help simplify the existing manual data-processing tasks involved in data collection from input forms such as Passport, Ration Card, Voting Card, AADHAR, Driving licence, Indian Railway Reservation, Census data, etc. This would increase the participation of the Urdu language community in understanding and benefiting from Government schemes. To make the database available and applicable across a wide area of corpus linguistics, we propose a methodology for data collection, mark-up, digital transcription, and XML metadata information for benchmarking.

  3. A pilot study of a heuristic algorithm for novel template identification from VA electronic medical record text.

    PubMed

    Redd, Andrew M; Gundlapalli, Adi V; Divita, Guy; Carter, Marjorie E; Tran, Le-Thuy; Samore, Matthew H

    2017-07-01

    Templates in text notes pose challenges for automated information extraction algorithms. We propose a method that identifies novel templates in plain text medical notes. The identification can then be used to either include or exclude templates when processing notes for information extraction. The two-module method is based on the framework of information foraging and addresses the hypothesis that documents containing templates and the templates within those documents can be identified by common features. The first module takes documents from the corpus and groups those with common templates. This is accomplished through a binned word count hierarchical clustering algorithm. The second module extracts the templates. It uses the groupings and performs a longest common subsequence (LCS) algorithm to obtain the constituent parts of the templates. The method was developed and tested on a random document corpus of 750 notes derived from a large database of US Department of Veterans Affairs (VA) electronic medical notes. The grouping module, using hierarchical clustering, identified 23 groups with 3 documents or more, consisting of 120 documents from the 750 documents in our test corpus. Of these, 18 groups had at least one common template that was present in all documents in the group for a positive predictive value of 78%. The LCS extraction module performed with 100% positive predictive value, 94% sensitivity, and 83% negative predictive value. The human review determined that in 4 groups the template covered the entire document, with the remaining 14 groups containing a common section template. Among documents with templates, the number of templates per document ranged from 1 to 14. The mean and median number of templates per group was 5.9 and 5, respectively. The grouping method was successful in finding like documents containing templates. 
    Of the groups of documents containing templates, the LCS module was successful in deciphering text belonging to the template and text that was extraneous. Major obstacles to improved performance included documents composed of multiple templates, templates that included other templates embedded within them, and variants of templates. We demonstrate proof of concept of the grouping and extraction method of identifying templates in electronic medical records in this pilot study and propose methods to improve performance and scaling up. Published by Elsevier Inc.
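
    The extraction step described in this abstract relies on a longest common subsequence (LCS) computation. The Python sketch below shows one way such a step could look; it is not the authors' implementation. It assumes whitespace tokenization and folds a pairwise LCS across the notes in a group, which is a common heuristic rather than an exact multi-sequence LCS. The function names are hypothetical.

```python
from functools import reduce


def lcs(a, b):
    """Return one longest common subsequence of two token lists."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover the subsequence.
    result, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


def template_tokens(notes):
    """Approximate the shared template of a non-empty group of notes by
    folding pairwise LCS over their token lists (a heuristic)."""
    token_lists = [note.split() for note in notes]
    return reduce(lcs, token_lists)
```

    For a group of notes that share the labels "Blood pressure:", "Pulse:" and "Notes:" but differ in the recorded values, template_tokens returns the shared label tokens and drops the values.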

  4. Detecting people of interest from internet data sources

    NASA Astrophysics Data System (ADS)

    Cardillo, Raymond A.; Salerno, John J.

    2006-04-01

    In previous papers, we have documented success in determining the key people of interest from a large corpus of real-world evidence. Our recent efforts focus on exploring additional domains and data sources. Internet data sources such as email, web pages, and news feeds make it easier to gather a large corpus of documents for various domains, but detecting people of interest in these sources introduces new challenges. Analyzing these massive sources magnifies entity resolution problems, and demands a storage management strategy that supports efficient algorithmic analysis and visualization techniques. This paper discusses the techniques we used in order to analyze the ENRON email repository, which are also applicable to analyzing web pages returned from our "Buddy" meta-search engine.

  5. Perspectives on Linguistic Documentation from Sociolinguistic Research on Dialects

    ERIC Educational Resources Information Center

    Tagliamonte, Sali A.

    2017-01-01

    The goal of the paper is to demonstrate how sociolinguistic research can be applied to endangered language documentation field linguistics. It first provides an overview of the techniques and practices of sociolinguistic fieldwork and the ensuing corpus compilation methods. The discussion is framed with examples from research projects focused on…

  6. Temporal Annotation in the Clinical Domain

    PubMed Central

    Styler, William F.; Bethard, Steven; Finan, Sean; Palmer, Martha; Pradhan, Sameer; de Groen, Piet C; Erickson, Brad; Miller, Timothy; Lin, Chen; Savova, Guergana; Pustejovsky, James

    2014-01-01

    This article discusses the requirements of a formal specification for the annotation of temporal information in clinical narratives. We discuss the implementation and extension of ISO-TimeML for annotating a corpus of clinical notes, known as the THYME corpus. To reflect the information task and the heavily inference-based reasoning demands in the domain, a new annotation guideline has been developed, “the THYME Guidelines to ISO-TimeML (THYME-TimeML)”. To clarify what relations merit annotation, we distinguish between linguistically-derived and inferentially-derived temporal orderings in the text. We also apply a top performing TempEval 2013 system against this new resource to measure the difficulty of adapting systems to the clinical domain. The corpus is available to the community and has been proposed for use in a SemEval 2015 task. PMID:29082229

  7. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus

    PubMed Central

    2015-01-01

    Background: Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. Methods: To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Results: Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. Conclusions: PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types.
    Although the scope of our annotation is currently limited to a single disease, the promising results achieved can stimulate further work into the extraction of phenotypic information for other diseases. The PhenoCHF annotation guidelines and annotations are publicly available at https://code.google.com/p/phenochf-corpus. PMID:26099853

  8. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus.

    PubMed

    Alnazzawi, Noha; Thompson, Paul; Batista-Navarro, Riza; Ananiadou, Sophia

    2015-01-01

    Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. 
Although the scope of our annotation is currently limited to a single disease, the promising results achieved can stimulate further work into the extraction of phenotypic information for other diseases. The PhenoCHF annotation guidelines and annotations are publicly available at https://code.google.com/p/phenochf-corpus.

  9. TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations

    PubMed Central

    Alvaro, Nestor; Miyao, Yusuke; Collier, Nigel

    2017-01-01

    Background Work on pharmacovigilance systems using texts from PubMed and Twitter typically targets different elements and uses different annotation guidelines, resulting in a scenario where there is no comparable set of documents from both Twitter and PubMed annotated in the same manner. Objective This study aimed to provide a comparable corpus of texts from PubMed and Twitter that can be used to study drug reports from these two sources of information, allowing researchers in the area of pharmacovigilance using natural language processing (NLP) to perform experiments to better understand the similarities and differences between drug reports in Twitter and PubMed. Methods We produced a corpus comprising 1000 tweets and 1000 PubMed sentences selected using the same strategy and annotated at entity level by the same experts (pharmacists) using the same set of guidelines. Results The resulting corpus, annotated by two pharmacists, comprises semantically correct annotations for a set of drugs, diseases, and symptoms. This corpus contains the annotations for 3144 entities, 2749 relations, and 5003 attributes. Conclusions We present a corpus that is unique in its characteristics as this is the first corpus for pharmacovigilance curated from Twitter messages and PubMed sentences using the same data selection and annotation strategies. We believe this corpus will be of particular interest for researchers wishing to compare results from pharmacovigilance systems (eg, classifiers and named entity recognition systems) when using data from Twitter and from PubMed. We hope that given the comprehensive set of drug names and the annotated entities and relations, this corpus becomes a standard resource to compare results from different pharmacovigilance studies in the area of NLP. PMID:28468748

  10. TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations.

    PubMed

    Alvaro, Nestor; Miyao, Yusuke; Collier, Nigel

    2017-05-03

    Work on pharmacovigilance systems using texts from PubMed and Twitter typically targets different elements and uses different annotation guidelines, resulting in a scenario where there is no comparable set of documents from both Twitter and PubMed annotated in the same manner. This study aimed to provide a comparable corpus of texts from PubMed and Twitter that can be used to study drug reports from these two sources of information, allowing researchers in the area of pharmacovigilance using natural language processing (NLP) to perform experiments to better understand the similarities and differences between drug reports in Twitter and PubMed. We produced a corpus comprising 1000 tweets and 1000 PubMed sentences selected using the same strategy and annotated at entity level by the same experts (pharmacists) using the same set of guidelines. The resulting corpus, annotated by two pharmacists, comprises semantically correct annotations for a set of drugs, diseases, and symptoms. This corpus contains the annotations for 3144 entities, 2749 relations, and 5003 attributes. We present a corpus that is unique in its characteristics as this is the first corpus for pharmacovigilance curated from Twitter messages and PubMed sentences using the same data selection and annotation strategies. We believe this corpus will be of particular interest for researchers wishing to compare results from pharmacovigilance systems (eg, classifiers and named entity recognition systems) when using data from Twitter and from PubMed. We hope that given the comprehensive set of drug names and the annotated entities and relations, this corpus becomes a standard resource to compare results from different pharmacovigilance studies in the area of NLP. ©Nestor Alvaro, Yusuke Miyao, Nigel Collier. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 03.05.2017.

  11. The CHEMDNER corpus of chemicals and drugs and its annotation principles.

    PubMed

    Krallinger, Martin; Rabal, Obdulia; Leitner, Florian; Vazquez, Miguel; Salgado, David; Lu, Zhiyong; Leaman, Robert; Lu, Yanan; Ji, Donghong; Lowe, Daniel M; Sayle, Roger A; Batista-Navarro, Riza Theresa; Rak, Rafal; Huber, Torsten; Rocktäschel, Tim; Matos, Sérgio; Campos, David; Tang, Buzhou; Xu, Hua; Munkhdalai, Tsendsuren; Ryu, Keun Ho; Ramanan, S V; Nathan, Senthil; Žitnik, Slavko; Bajec, Marko; Weber, Lutz; Irmer, Matthias; Akhondi, Saber A; Kors, Jan A; Xu, Shuo; An, Xin; Sikdar, Utpal Kumar; Ekbal, Asif; Yoshioka, Masaharu; Dieb, Thaer M; Choi, Miji; Verspoor, Karin; Khabsa, Madian; Giles, C Lee; Liu, Hongfang; Ravikumar, Komandur Elayavilli; Lamurias, Andre; Couto, Francisco M; Dai, Hong-Jie; Tsai, Richard Tzong-Han; Ata, Caglar; Can, Tolga; Usié, Anabel; Alves, Rui; Segura-Bedmar, Isabel; Martínez, Paloma; Oyarzabal, Julen; Valencia, Alfonso

    2015-01-01

    The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative for all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text was measured using an agreement study between annotators, obtaining a percentage agreement of 91. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/.

  12. The CHEMDNER corpus of chemicals and drugs and its annotation principles

    PubMed Central

    2015-01-01

    The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative for all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text was measured using an agreement study between annotators, obtaining a percentage agreement of 91. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/ PMID:25810773

  13. Recent Experiments with INQUERY

    DTIC Science & Technology

    1995-11-01

    were conducted with version of the INQUERY information retrieval system. INQUERY is based on the Bayesian inference network retrieval model. It is...corpus based query expansion. For TREC a subset of the adhoc document set was used to build the InFinder database. None of the...experiments that showed significant improvements in retrieval effectiveness when document rankings based on the entire document text are combined with

  14. CUILESS2016: a clinical corpus applying compositional normalization of text mentions.

    PubMed

    Osborne, John D; Neu, Matthew B; Danila, Maria I; Solorio, Thamar; Bethard, Steven J

    2018-01-10

    Traditionally text mention normalization corpora have normalized concepts to single ontology identifiers ("pre-coordinated concepts"). Less frequently, normalization corpora have used concepts with multiple identifiers ("post-coordinated concepts") but the additional identifiers have been restricted to a defined set of relationships to the core concept. This approach limits the ability of the normalization process to express semantic meaning. We generated a freely available corpus using post-coordinated concepts without a defined set of relationships that we term "compositional concepts" to evaluate their use in clinical text. We annotated 5397 disorder mentions from the ShARe corpus to SNOMED CT that were previously normalized as "CUI-less" in the "SemEval-2015 Task 14" shared task because they lacked a pre-coordinated mapping. Unlike the previous normalization method, we do not restrict concept mappings to a particular set of the Unified Medical Language System (UMLS) semantic types and allow normalization to occur to multiple UMLS Concept Unique Identifiers (CUIs). We computed annotator agreement and assessed semantic coverage with this method. We generated the largest clinical text normalization corpus to date with mappings to multiple identifiers and made it freely available. All but 8 of the 5397 disorder mentions were normalized using this methodology. Annotator agreement ranged from 52.4% using the strictest metric (exact matching) to 78.2% using a hierarchical agreement that measures the overlap of shared ancestral nodes. Our results provide evidence that compositional concepts can increase semantic coverage in clinical text. To our knowledge we provide the first freely available corpus of compositional concept annotation in clinical text.

  15. Novel case of paternal paracentric inversion causing partial trisomy 13 and review of the literature.

    PubMed

    Douglas, Chad; Smith, Stephen A; Rohena, Luis

    2017-06-01

    Partial trisomies have often been reported secondary to inversion mutations. These occurrences are most frequently associated with pericentric inversions. In this report, we describe the first documented case of partial trisomy 13 secondary to a parental paracentric inversion, in this case a paternal paracentric 13q inversion. Our patient exhibits a variety of clinical findings including global developmental delay with intellectual disability, sensorineural hearing loss, bilateral congenital polar cataracts with associated foveal and optic nerve hypoplasia, right retinal detachment, atrial septal defect, absence of corpus callosum, celiac disease, microcephaly, as well as other dysmorphic features. © 2017 Wiley Periodicals, Inc.

  16. Networks of genetic loci and the scientific literature

    NASA Astrophysics Data System (ADS)

    Semeiks, J. R.; Grate, L. R.; Mian, I. S.

    This work considers biological information graphs, networks in which nodes correspond to genetic loci (or "genes") and an (undirected) edge signifies that two genes are discussed in the same article(s) in the scientific literature ("documents"). Operations that utilize the topology of these graphs can assist researchers in the scientific discovery process. For example, a shortest path between two nodes defines an ordered series of genes and documents that can be used to explore the relationship(s) between genes of interest. This work (i) describes how topologies in which edges are likely to reflect genuine relationship(s) can be constructed from human-curated corpora of genes annotated with documents (or vice versa), and (ii) illustrates the potential of biological information graphs in synthesizing knowledge in order to formulate new hypotheses and generate novel predictions for subsequent experimental study. In particular, the well-known LocusLink corpus is used to construct a biological information graph consisting of 10,297 nodes and 21,910 edges. The large-scale statistical properties of this gene-document network suggest that it is a new example of a power-law network. The segregation of genes on the basis of species and encoded protein molecular function indicates the presence of assortativity, the preference for nodes with similar attributes to be neighbors in a network. The practical utility of a gene-document network is illustrated by using measures such as shortest paths and centrality to analyze a subset of nodes corresponding to genes implicated in aging. Each release of a curated biomedical corpus defines a particular static graph. The topology of a gene-document network changes over time as curators add and/or remove nodes and/or edges. Such a dynamic, evolving corpus provides both the foundation for analyzing the growth and behavior of large complex networks and a substrate for examining trends in biological research.
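
    The graph construction described in this abstract can be illustrated with a minimal sketch. It assumes a toy mapping from document identifiers to gene names; the names `build_gene_graph`, `shortest_path`, `toy_corpus`, and the `GENE_*` labels are hypothetical and are not taken from the paper or from LocusLink. Genes become nodes, two genes are linked when they share a document, and a breadth-first search returns one shortest path between two genes.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_gene_graph(doc_to_genes):
    """Build an undirected graph: an edge links two genes that share a document."""
    graph = defaultdict(set)
    for genes in doc_to_genes.values():
        for a, b in combinations(sorted(set(genes)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of genes, or None if unreachable."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                if neighbor == goal:
                    # Walk back through the parents to rebuild the path.
                    path = [neighbor]
                    while parents[path[-1]] is not None:
                        path.append(parents[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None


# Hypothetical toy corpus (illustrative only, not real curated data).
toy_corpus = {
    "doc1": ["GENE_A", "GENE_B"],
    "doc2": ["GENE_B", "GENE_C"],
    "doc3": ["GENE_C", "GENE_D"],
}
graph = build_gene_graph(toy_corpus)
path = shortest_path(graph, "GENE_A", "GENE_D")
print(path)
```

    A plain adjacency-set dictionary is used here only to keep the sketch free of dependencies; the abstract does not state which data structures or libraries the authors used.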

  17. Unreported links between trial registrations and published articles were identified using document similarity measures in a cross-sectional analysis of ClinicalTrials.gov.

    PubMed

    Dunn, Adam G; Coiera, Enrico; Bourgeois, Florence T

    2018-03-01

    Trial registries can be used to measure reporting biases and support systematic reviews, but 45% of registrations do not provide a link to the article reporting on the trial. We evaluated the use of document similarity methods to identify unreported links between ClinicalTrials.gov and PubMed. We extracted terms and concepts from a data set of 72,469 ClinicalTrials.gov registrations and 276,307 PubMed articles and tested methods for ranking articles across 16,005 reported links and 90 manually identified unreported links. Performance was measured by the median rank of matching articles and the proportion of unreported links that could be found by screening ranked candidate articles in order. The best-performing concept-based representation produced a median rank of 3 (interquartile range [IQR] 1-21) for reported links and 3 (IQR 1-19) for the manually identified unreported links, and term-based representations produced a median rank of 2 (1-20) for reported links and 2 (IQR 1-12) in unreported links. The matching article was ranked first for 40% of registrations, and screening 50 candidate articles per registration identified 86% of the unreported links. Leveraging the growth in the corpus of reported links between ClinicalTrials.gov and PubMed, we found that document similarity methods can assist in the identification of unreported links between trial registrations and corresponding articles. Copyright © 2017 Elsevier Inc. All rights reserved.
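
    The evaluation idea in this abstract (rank candidate articles by similarity to a registration, then summarize where the matching article falls) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the abstract does not specify the similarity measure, so cosine similarity over raw term counts is assumed, and the functions `tokenize`, `cosine`, `rank_of_match` and the toy texts are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for this sketch)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_of_match(registration, articles, true_id):
    """1-based rank of the known matching article, most similar first."""
    reg_vec = Counter(tokenize(registration))
    scored = sorted(
        articles.items(),
        # Sort by descending similarity; break ties by article id.
        key=lambda item: (-cosine(reg_vec, Counter(tokenize(item[1]))), item[0]),
    )
    ids = [article_id for article_id, _ in scored]
    return ids.index(true_id) + 1


# Hypothetical toy data (not from ClinicalTrials.gov or PubMed).
articles = {
    "a1": "aspirin trial for headache in adults",
    "a2": "statin therapy and cardiovascular outcomes",
    "a3": "randomized trial of statin therapy in older adults",
}
registrations = [
    ("aspirin for headache", "a1"),
    ("statin therapy cardiovascular outcomes", "a2"),
    ("trial in adults", "a3"),
]

ranks = [rank_of_match(text, articles, true_id) for text, true_id in registrations]
median_rank = median(ranks)
# Share of links that would be found by screening only the top candidate.
share_within_top1 = sum(r <= 1 for r in ranks) / len(ranks)
print(ranks, median_rank, share_within_top1)
```

    The two summary values mirror the performance measures named in the abstract: the median rank of the matching article, and the proportion of links recovered when screening a fixed number of top-ranked candidates.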

  18. A Corpus Investigation on the Journal of Social Sciences of the Turkic World

    ERIC Educational Resources Information Center

    Yilmaz, Isa

    2018-01-01

    In recent years, a rapid development in computer technologies has been witnessed and feasibility of data access has been increased. In today's world, restoring documents, or data in general, and transferring them to interested parties are ordinary tasks. The amount of restored documents has also increased expeditiously and this development has…

  19. Handprinted Forms and Characters

    National Institute of Standards and Technology Data Gateway

    NIST Handprinted Forms and Characters (Web, free access)   NIST Special Database 19 contains NIST's entire corpus of training materials for handprinted document and character recognition. It supersedes NIST Special Databases 3 and 7.

  20. Users’ Manual and Validation of the Automated Grading System (AGS): Improving the Quality of Intelligence Summaries Using Feedback from an Unsupervised Model of Semantics

    DTIC Science & Technology

    2012-12-01

    trajectories in space, and are therefore very highly similar, and a cosine of 0 indicates that the two vectors are unrelated. The vector of a good summary...topic. The effectiveness of the AGS’s ability to automatically grade student assignment is completely dependent on a good match between this corpus...students to summarise “User Documents” that focused on fishing, then a good corpus would contain documents about the various types of fishing

  1. Boomerang sign: Clinical significance of transient lesion in splenium of corpus callosum.

    PubMed

    Malhotra, Hardeep Singh; Garg, Ravindra Kumar; Vidhate, Mukund R; Sharma, Pawan Kumar

    2012-04-01

    Transient signal abnormality in the splenium of corpus callosum on magnetic resonance imaging (MRI) is occasionally encountered in clinical practice. It has been reported in various clinical conditions apart from patients with epilepsy. We describe 4 patients with different etiologies presenting with signal changes in the splenium of corpus callosum. They were diagnosed as having progressive myoclonic epilepsy (case 1), localization-related epilepsy (case 2), hemicrania continua (case 3), and postinfectious parkinsonism (case 4). While three patients had complete involvement of the splenium on diffusion-weighted image ("boomerang sign"), the patient having hemicrania continua showed semilunar involvement ("mini-boomerang") on T2-weighted and FLAIR image. All the cases had noncontiguous involvement of the splenium. We herein, discuss these cases with transient splenial involvement and stress that such patients do not need aggressive diagnostic and therapeutic interventions. An attempt has been made to review the literature regarding the pathophysiology, etiology, and outcome of such lesions.

  2. An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition.

    PubMed

    Luo, Ling; Yang, Zhihao; Yang, Pei; Zhang, Yin; Wang, Lei; Lin, Hongfei; Wang, Jian

    2018-04-15

    In biomedical research, chemicals are an important class of entities, and chemical named entity recognition (NER) is an important task in the field of biomedical information extraction. However, most popular chemical NER methods are based on traditional machine learning and their performance is heavily dependent on feature engineering. Moreover, these methods are sentence-level ones which have the tagging inconsistency problem. In this paper, we propose a neural network approach, i.e. attention-based bidirectional Long Short-Term Memory with a conditional random field layer (Att-BiLSTM-CRF), to document-level chemical NER. The approach leverages document-level global information obtained by attention mechanism to enforce tagging consistency across multiple instances of the same token in a document. It achieves better performance with little feature engineering than other state-of-the-art methods on the BioCreative IV chemical compound and drug name recognition (CHEMDNER) corpus and the BioCreative V chemical-disease relation (CDR) task corpus (F-scores of 91.14% and 92.57%, respectively). Data and code are available at https://github.com/lingluodlut/Att-ChemdNER. Contact: yangzh@dlut.edu.cn or wangleibihami@gmail.com. Supplementary data are available at Bioinformatics online.

  3. Visualizing the Topical Structure of the Medical Sciences: A Self-Organizing Map Approach

    PubMed Central

    Skupin, André; Biberstine, Joseph R.; Börner, Katy

    2013-01-01

    Background We implement a high-resolution visualization of the medical knowledge domain using the self-organizing map (SOM) method, based on a corpus of over two million publications. While self-organizing maps have been used for document visualization for some time, (1) little is known about how to deal with truly large document collections in conjunction with a large number of SOM neurons, (2) post-training geometric and semiotic transformations of the SOM tend to be limited, and (3) no user studies have been conducted with domain experts to validate the utility and readability of the resulting visualizations. Our study makes key contributions to all of these issues. Methodology Documents extracted from Medline and Scopus are analyzed on the basis of indexer-assigned MeSH terms. Initial dimensionality is reduced to include only the top 10% most frequent terms and the resulting document vectors are then used to train a large SOM consisting of over 75,000 neurons. The resulting two-dimensional model of the high-dimensional input space is then transformed into a large-format map by using geographic information system (GIS) techniques and cartographic design principles. This map is then annotated and evaluated by ten experts stemming from the biomedical and other domains. Conclusions Study results demonstrate that it is possible to transform a very large document corpus into a map that is visually engaging and conceptually stimulating to subject experts from both inside and outside of the particular knowledge domain. The challenges of dealing with a truly large corpus come to the fore and require embracing parallelization and use of supercomputing resources to solve otherwise intractable computational tasks. Among the envisaged future efforts are the creation of a highly interactive interface and the elaboration of the notion of this map of medicine acting as a base map, onto which other knowledge artifacts could be overlaid. PMID:23554924

  4. Clinical, cognitive, and behavioural correlates of white matter damage in progressive supranuclear palsy.

    PubMed

    Agosta, Federica; Galantucci, Sebastiano; Svetel, Marina; Lukić, Milica Ječmenica; Copetti, Massimiliano; Davidovic, Kristina; Tomić, Aleksandra; Spinelli, Edoardo G; Kostić, Vladimir S; Filippi, Massimo

    2014-05-01

    White matter (WM) tract alterations were assessed in patients with progressive supranuclear palsy (PSP) relative to healthy controls and patients with idiopathic Parkinson's disease (PD) to explore the relationship of WM tract damage with clinical disease severity, performance on cognitive tests, and apathy. 37 PSP patients, 41 PD patients, and 34 healthy controls underwent an MRI scan and clinical testing to evaluate physical disability, cognitive impairment, and apathy. In PSP, the contribution of WM tract damage to global disease severity and cognitive and behavioural disturbances was assessed using Random Forest analysis. Relative to controls, PSP patients showed diffusion tensor (DT) MRI abnormalities of the corpus callosum, superior cerebellar peduncle (SCP), cingulum and uncinate fasciculus bilaterally, and right inferior longitudinal fasciculus. Corpus callosum and SCP DT MRI measures distinguished PSP from PD patients with high accuracy (area under the curve ranging from 0.89 to 0.72). In PSP, DT MRI metrics of the corpus callosum and superior cerebellar peduncles were the best predictors of global disease severity scale scores. DT MRI metrics of the corpus callosum, right superior longitudinal and inferior longitudinal fasciculus, and left uncinate were the best predictors of executive dysfunction. In PSP, apathy severity was related to the damage to the corpus callosum, right superior longitudinal, and uncinate fasciculi. In conclusion, WM tract damage contributes to the motor, cognitive, and behavioural deficits in PSP. DT MRI offers markers for PSP diagnosis, assessment, and monitoring.

  5. Identifying biological concepts from a protein-related corpus with a probabilistic topic model

    PubMed Central

    Zheng, Bin; McLean, David C; Lu, Xinghua

    2006-01-01

    Background Biomedical literature, e.g., MEDLINE, contains a wealth of knowledge regarding functions of proteins. Major recurring biological concepts within such text corpora represent the domains of this body of knowledge. The goal of this research is to identify the major biological topics/concepts from a corpus of protein-related MEDLINE© titles and abstracts by applying a probabilistic topic model. Results The latent Dirichlet allocation (LDA) model was applied to the corpus. Based on the Bayesian model selection, 300 major topics were extracted from the corpus. The majority of identified topics/concepts was found to be semantically coherent and most represented biological objects or concepts. The identified topics/concepts were further mapped to the controlled vocabulary of the Gene Ontology (GO) terms based on mutual information. Conclusion The major and recurring biological concepts within a collection of MEDLINE documents can be extracted by the LDA model. The identified topics/concepts provide parsimonious and semantically-enriched representation of the texts in a semantic space with reduced dimensionality and can be used to index text. PMID:16466569

  6. Global tissue engineering trends. A scientometric and evolutive study.

    PubMed

    Santisteban-Espejo, Antonio; Campos, Fernando; Martin-Piedra, Laura; Durand-Herrera, Daniel; Moral-Munoz, Jose A; Campos, Antonio; Martin-Piedra, Miguel Angel

    2018-04-24

    Tissue engineering is defined as a multidisciplinary scientific discipline whose main objective is to develop artificial bioengineered living tissues in order to regenerate damaged or lost tissues. Since its appearance in 1988, tissue engineering has spread globally in order to improve current therapeutic approaches, entailing a revolution in clinical practice. The aim of this study is to analyze global research trends in tissue engineering publications in order to characterize the state of tissue engineering research from 1991 to 2016 by using document retrieval from the Web of Science database and bibliometric analysis. Document type, language, source title, authorship, countries, affiliation centers, and citation count were evaluated in 31,859 documents. The results suggest a strongly multidisciplinary role for tissue engineering, given the wide spectrum (up to 51) of scientific research areas identified in the corpus of literature; technological disciplines such as Material Sciences or Engineering predominate, followed by biological and biomedical areas such as Cell Biology, Biotechnology, or Biochemistry. The distribution of authorship, journals, and countries revealed a clear imbalance in which a minority is responsible for a majority of documents. This imbalance is most notable in authorship, where 0.3% of authors are involved in half of the whole production.

  7. Citgo Refining and Chemicals Company, Corpus Christi West, 2007 Petition for Objection to Title V Permit

    EPA Pesticide Factsheets

    This document may be of assistance in applying the Title V air operating permit regulations. This document is part of the Title V Petition Database available at www2.epa.gov/title-v-operating-permits/title-v-petition-database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.

  8. Perspectives on Dichotic Listening and the Corpus Callosum

    ERIC Educational Resources Information Center

    Musiek, Frank E.; Weihing, Jeffrey

    2011-01-01

    The present review summarizes historic and recent research which has investigated the role of the corpus callosum in dichotic processing within the context of audiology. Examination of performance by certain clinical groups, including split brain patients, multiple sclerosis cases, and other types of neurological lesions is included. Maturational,…

  9. Automatic document classification of biological literature

    PubMed Central

    Chen, David; Müller, Hans-Michael; Sternberg, Paul W

    2006-01-01

    Background Document classification is a wide-spread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, a text-mining system for biological literature, which marks up full text according to a shallow ontology that includes terms of biological interest. This project investigates document classification in the context of biological literature, making use of the Textpresso markup of a corpus of Caenorhabditis elegans literature. Results We present a two-step text categorization algorithm to classify a corpus of C. elegans papers. Our classification method first uses a support vector machine-trained classifier, followed by a novel, phrase-based clustering algorithm. This clustering step autonomously creates cluster labels that are descriptive and understandable by humans. This clustering engine performed better on a standard test-set (Reuters 21578) compared to previously published results (F-value of 0.55 vs. 0.49), while producing cluster descriptions that appear more useful. A web interface allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept. Conclusion We have demonstrated a simple method to classify biological documents that embodies an improvement over current methods. While the classification results are currently optimized for Caenorhabditis elegans papers by human-created rules, the classification engine can be adapted to different types of documents. We have demonstrated this by presenting a web interface that allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept. PMID:16893465

  10. Chemical-induced disease relation extraction via convolutional neural network.

    PubMed

    Gu, Jinghang; Sun, Fuqing; Qian, Longhua; Zhou, Guodong

    2017-01-01

    This article describes our work on the BioCreative-V chemical-disease relation (CDR) extraction task, which employed a maximum entropy (ME) model and a convolutional neural network model for relation extraction at inter- and intra-sentence level, respectively. In our work, relation extraction between entity concepts in documents was simplified to relation extraction between entity mentions. We first constructed pairs of chemical and disease mentions as relation instances for training and testing stages, then we trained and applied the ME model and the convolutional neural network model for inter- and intra-sentence level, respectively. Finally, we merged the classification results from mention level to document level to acquire the final relations between chemical and disease concepts. The evaluation on the BioCreative-V CDR corpus shows the effectiveness of our proposed approach. http://www.biocreative.org/resources/corpora/biocreative-v-cdr-corpus/. © The Author(s) 2017. Published by Oxford University Press.
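
    The final merging step described above, from mention-level classifications to document-level relations, can be sketched roughly as follows. This is only an illustration: the abstract does not state the merging rule, so the sketch assumes that a chemical-disease concept pair is kept for a document if at least one of its mention pairs was classified positive. The function name, the tuple layout, and the concept identifiers are hypothetical.

```python
from collections import defaultdict


def merge_to_document_level(mention_predictions):
    """Collapse mention-pair predictions into document-level concept pairs.

    mention_predictions: iterable of (doc_id, chemical_id, disease_id, is_positive).
    Assumed rule (not taken from the paper): a concept pair is reported for a
    document if any of its mention pairs was predicted positive.
    """
    relations = defaultdict(set)
    for doc_id, chemical_id, disease_id, is_positive in mention_predictions:
        if is_positive:
            relations[doc_id].add((chemical_id, disease_id))
    # Sort the pairs so the output is deterministic.
    return {doc_id: sorted(pairs) for doc_id, pairs in relations.items()}


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    predictions = [
        ("doc1", "chem_A", "dis_X", False),  # inter-sentence pair, rejected
        ("doc1", "chem_A", "dis_X", True),   # intra-sentence pair, accepted
        ("doc1", "chem_B", "dis_X", False),
        ("doc2", "chem_A", "dis_Y", True),
    ]
    print(merge_to_document_level(predictions))
```

    Other merging rules (for example, majority voting over mention pairs) would fit the same interface.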

  11. Microstructural Corpus Callosum Anomalies in Children With Prenatal Alcohol Exposure: An Extension of Previous Diffusion Tensor Imaging Findings

    PubMed Central

    Wozniak, Jeffrey R.; Muetzel, Ryan L.; Mueller, Bryon A.; McGee, Christie L.; Freerks, Melesa A.; Ward, Erin E.; Nelson, Miranda L.; Chang, Pi-Nian; Lim, Kelvin O.

    2010-01-01

    Background Several studies have now shown corpus callosum abnormalities using diffusion tensor imaging (DTI) in children with fetal alcohol spectrum disorders (FASD) in comparison with nonexposed controls. The data suggest that posterior regions of the callosum may be disproportionately affected. The current study builds on previous efforts, including our own work, and moves beyond midline corpus callosum to probe major inter-hemispheric white matter pathways with an improved DTI tractographic method. This study also expands on our prior work by evaluating a larger sample and by incorporating children with a broader range of clinical effects including full-criteria fetal alcohol syndrome (FAS). Methods Participants included 33 children with FASD (8 FAS, 23 partial FAS, 2 static encephalopathy) and 19 nonexposed controls between the ages of 10 and 17 years. Participants underwent DTI scans and intelligence testing. Groups (FASD vs. controls) were compared on fractional anisotropy (FA) and mean diffusivity (MD) in 6 white matter tracts projected through the corpus callosum. Exploratory analyses were also conducted examining the relationships between DTI measures in the corpus callosum and measures of intellectual functioning and facial dysmorphology. Results In comparison with the control group, the FASD group had significantly lower FA in 3 posterior tracts of the corpus callosum: the posterior mid-body, the isthmus, and the splenium. A trend-level finding also suggested lower FA in the genu. Measures of white matter integrity and cognition were correlated and suggest some regional specificity, in that only posterior regions of the corpus callosum were associated with visual-perceptual skills. Correlations between measures of facial dysmorphology and posterior regions of the corpus callosum were nonsignificant. 
    Conclusions Consistent with previous DTI studies, these results suggest that microstructural posterior corpus callosum abnormalities are present in children with prenatal alcohol exposure and cognitive impairment. These abnormalities are clinically relevant because they are associated with cognitive deficits and appear to provide evidence of abnormalities associated with prenatal alcohol exposure independent of dysmorphic features. As such, they may yield important diagnostic and prognostic information not provided by the traditional facial characteristics. PMID:19645729

  12. Domain Adaption of Parsing for Operative Notes

    PubMed Central

    Wang, Yan; Pakhomov, Serguei; Ryan, James O.; Melton, Genevieve B.

    2016-01-01

    Background Full syntactic parsing of clinical text as a part of clinical natural language processing (NLP) is critical for a wide range of applications, such as identification of adverse drug reactions, patient cohort identification, and gene interaction extraction. Several robust syntactic parsers are publicly available to produce linguistic representations for sentences. However, these existing parsers are mostly trained on general English text and often require adaptation for optimal performance on clinical text. Our objective was to adapt an existing general English parser for the clinical text of operative reports via lexicon augmentation, statistics adjusting, and grammar rules modification based on a set of biomedical text. Method The Stanford unlexicalized probabilistic context-free grammar (PCFG) parser lexicon was expanded with SPECIALIST lexicon along with statistics collected from a limited set of operative notes tagged with two POS taggers (GENIA tagger and MedPost). The most frequently occurring verb entries of the SPECIALIST lexicon were adjusted based on manual review of verb usage in operative notes. Stanford parser grammar production rules were also modified based on linguistic features of operative reports. An analogous approach was then applied to the GENIA corpus to test the generalizability of this approach to biomedical text. Results The new unlexicalized PCFG parser extended with the extra lexicon from SPECIALIST along with accurate statistics collected from an operative note corpus tagged with GENIA POS tagger improved the parser performance by 2.26% from 87.64% to 89.90%. There was a progressive improvement with the addition of multiple approaches. Most of the improvement occurred with lexicon augmentation combined with statistics from the operative notes corpus.
    Application of this approach on the GENIA corpus showed that parsing performance was boosted by 3.81% with a simple new grammar and the addition of the GENIA corpus lexicon. Conclusion Using statistics collected from clinical text tagged with POS taggers along with proper modification of grammars and lexicons of an unlexicalized PCFG parser can improve parsing performance. PMID:25661593

  13. Semantic concept-enriched dependence model for medical information retrieval.

    PubMed

    Choi, Sungbin; Choi, Jinwook; Yoo, Sooyoung; Kim, Heechun; Lee, Youngho

    2014-02-01

    In medical information retrieval research, semantic resources have been mostly used by expanding the original query terms or estimating the concept importance weight. However, implicit term-dependency information contained in semantic concept terms has been overlooked or at least underused in most previous studies. In this study, we incorporate a semantic concept-based term-dependence feature into a formal retrieval model to improve its ranking performance. Standardized medical concept terms used by medical professionals were assumed to have implicit dependency within the same concept. We hypothesized that, by elaborately revising the ranking algorithms to favor documents that preserve those implicit dependencies, the ranking performance could be improved. The implicit dependence features are harvested from the original query using MetaMap. These semantic concept-based dependence features were incorporated into a semantic concept-enriched dependence model (SCDM). We designed four different variants of the model, with each variant having distinct characteristics in the feature formulation method. We performed leave-one-out cross validations on both a clinical document corpus (TREC Medical records track) and a medical literature corpus (OHSUMED), which are representative test collections in medical information retrieval research. Our semantic concept-enriched dependence model consistently outperformed other state-of-the-art retrieval methods. Analysis shows that the performance gain has occurred independently of the concept's explicit importance in the query. By capturing implicit knowledge with regard to the query term relationships and incorporating them into a ranking model, we could build a more robust and effective retrieval model, independent of the concept importance. Copyright © 2013 Elsevier Inc. All rights reserved.

  14. Emerging Role of Probiotics in the Management of Helicobacter pylori Infection: Histopathologic Perspectives.

    PubMed

    Emara, Mohamed H; Elhawari, Soha A; Yousef, Salem; Radwan, Mohamed I; Abdel-Aziz, Hesham R

    2016-02-01

    There is growing evidence from preclinical and clinical studies that emphasizes the efficacy of probiotics in the management of Helicobacter pylori (H. pylori) infection: probiotics increased the eradication rate, improved patients' clinical manifestations, and lowered treatment-associated side effects. In this review we document the potential ability of probiotics to ameliorate H. pylori-induced histological features. We searched the available literature for full-length articles focusing on the role of probiotics in H. pylori-induced gastritis from a histologic perspective. Probiotics lowered H. pylori density at the luminal side of the epithelium and improved histological inflammatory and activity scores in both the gastric corpus and antrum. This effect persists for a long period of time after discontinuation of probiotic supplementation, probably through an immune mechanism. The current evidence supports the promising role of probiotics in improving H. pylori-induced histopathological features in both the gastric antrum and corpus, and for long periods of time. Because increased density of H. pylori on the gastric mucosa is linked to more severe gastritis and increased incidence of peptic ulcers, we can infer that a reduction of the density might help to decrease the risk of developing pathologies, probably the progression toward atrophic gastritis and gastric adenocarcinoma. These effects, together with improved H. pylori eradication rates and amelioration of treatment-related side effects, might open the door for probiotics to be added to H. pylori eradication regimens. © 2015 John Wiley & Sons Ltd.

  15. Method and system of filtering and recommending documents

    DOEpatents

    Patton, Robert M.; Potok, Thomas E.

    2016-02-09

    Disclosed is a method and system for discovering documents using a computer and providing a small set of the most relevant documents to the attention of a human observer. Using the method, the computer obtains a seed document from the user and generates a seed document vector using term frequency-inverse corpus frequency weighting. A keyword index for a plurality of source documents can be compared with the weighted terms of the seed document vector. The comparison is then filtered to reduce the number of documents, which define an initial subset of the source documents. Initial subset vectors are generated and compared to the seed document vector to obtain a similarity value for each comparison. Based on the similarity value, the method then recommends one or more of the source documents.
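
    The pipeline described above (seed vector, keyword-index filter, similarity ranking) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the weighting formula log(1 + tf) * log((N + 1) / (n_t + 1)) computed against a separate reference corpus is one common reading of term frequency-inverse corpus frequency, and the tokenizer, the function names, and the `top_terms` and `top_k` parameters are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer that keeps alphabetic tokens only (assumption).
    return [w for w in text.lower().split() if w.isalpha()]


def corpus_frequencies(reference_docs):
    # Document frequencies from a fixed reference corpus, not from the sources.
    df = Counter()
    for doc in reference_docs:
        df.update(set(tokenize(doc)))
    return len(reference_docs), df


def weight_vector(text, n_ref, ref_df):
    # Assumed TF-ICF form: log(1 + tf) * log((N + 1) / (n_t + 1)).
    tf = Counter(tokenize(text))
    return {t: math.log(1 + c) * math.log((n_ref + 1) / (ref_df.get(t, 0) + 1))
            for t, c in tf.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(seed, sources, reference_docs, top_terms=5, top_k=3):
    n_ref, ref_df = corpus_frequencies(reference_docs)
    seed_vec = weight_vector(seed, n_ref, ref_df)
    # Highest-weighted seed terms act as the query against the keyword index.
    keywords = sorted(seed_vec, key=seed_vec.get, reverse=True)[:top_terms]
    index = defaultdict(set)
    for doc_id, text in enumerate(sources):
        for term in set(tokenize(text)):
            index[term].add(doc_id)
    # Filter step: keep only sources that share at least one seed keyword.
    candidates = set().union(*(index.get(t, set()) for t in keywords))
    scored = [(d, cosine(seed_vec, weight_vector(sources[d], n_ref, ref_df)))
              for d in sorted(candidates)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

    Because the inverse frequencies come from a fixed reference corpus, a new source document can be weighted without recomputing statistics over the whole collection, which is the usual motivation for this weighting.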

  16. Boomerang sign: Clinical significance of transient lesion in splenium of corpus callosum

    PubMed Central

    Malhotra, Hardeep Singh; Garg, Ravindra Kumar; Vidhate, Mukund R.; Sharma, Pawan Kumar

    2012-01-01

    Transient signal abnormality in the splenium of corpus callosum on magnetic resonance imaging (MRI) is occasionally encountered in clinical practice. It has been reported in various clinical conditions apart from patients with epilepsy. We describe 4 patients with different etiologies presenting with signal changes in the splenium of corpus callosum. They were diagnosed as having progressive myoclonic epilepsy (case 1), localization-related epilepsy (case 2), hemicrania continua (case 3), and postinfectious parkinsonism (case 4). While three patients had complete involvement of the splenium on diffusion-weighted image (“boomerang sign”), the patient having hemicrania continua showed semilunar involvement (“mini-boomerang”) on T2-weighted and FLAIR image. All the cases had noncontiguous involvement of the splenium. We herein discuss these cases with transient splenial involvement and stress that such patients do not need aggressive diagnostic and therapeutic interventions. An attempt has been made to review the literature regarding the pathophysiology, etiology, and outcome of such lesions. PMID:22566735

  17. Abnormal brain MRI signals in the splenium of the corpus callosum, basal ganglia and internal capsule in a suspected case with tuberculous meningitis.

    PubMed

    Hirotani, Makoto; Yabe, Ichiro; Hamada, Shinsuke; Tsuji, Sachiko; Kikuchi, Seiji; Sasaki, Hidenao

    2007-01-01

    A 34-year-old man visited the hospital with chief complaints of headache, fever, and disturbance of consciousness. In view of his clinical condition, the course of the disease, and results of examination, he was diagnosed with viral meningitis and treated accordingly. However, his clinical condition worsened, and MRI revealed abnormal signals in the splenium of the corpus callosum, in the basal ganglia and in the internal capsule, as well as the presence of severe inflammation in the base of the brain. Since he had a high ADA level in the cerebrospinal fluid and was consequently suspected to have tuberculous meningitis, he was placed on antitubercular agents. Then, his clinical condition began to improve. Additional steroid pulse therapy further improved his condition, and abnormal signals in the splenium of the corpus callosum and the basal ganglia resolved. This valuable case suggests that an immune mechanism contributed to the occurrence of central nervous system symptoms associated with tuberculous meningitis.

  18. Adult severe encephalitis/encephalopathy with a reversible splenial lesion of the corpus callosum: A case report.

    PubMed

    Mao, Xi-Jing; Zhu, Bo-Chi; Yu, Ting-Min; Yao, Gang

    2018-06-01

    Clinically mild encephalitis/encephalopathy with a reversible splenial lesion of the corpus callosum (MERS) is a recently identified clinically and radiologically distinct syndrome. Clinical symptoms and lesions on the magnetic resonance imaging (MRI) often disappear in 1 week or a few weeks. However, MERS manifesting as a severe clinical course with significant sequela has not yet been reported. A 42-year-old male presented with a 3-day history of headache, fever, and irrational speech. Physical examination showed a body temperature of 39.5°C, dysarthria, dyscalculia, recent memory disturbance, and otherwise normal vital signs. The patient developed status epilepticus and progressive consciousness disturbance. MRI showed abnormal patchy signals in the splenium of the corpus callosum. The clinical feature and the characteristic of MRI are mostly consistent with MERS. At the same time, we made a differential diagnosis by testing the NMDARAb, AMPA1Ab, AMPA2Ab, LG1Ab, CASPR2Ab, GABABRAb in CSF and serum. The subject was treated with ganciclovir, antiepileptic, and antipyretic therapy. The subject was living a virtually normal life with persistent mild memory disturbance. MRI showed that the abnormal signals in the splenium of the corpus callosum had disappeared, but hyperintensity on T2-weighted and FLAIR imaging was noted in the centrum semiovale. MERS is a rare clinicoradiological syndrome, which can manifest as severe symptoms as well. Early diagnosis and treatment should be emphasized, and the diagnostic value of MRI is highlighted. Clinicians should be alert to the potential sequela.

  19. Reversible splenial lesion syndrome associated with lobar pneumonia

    PubMed Central

    Li, Chunrong; Wu, Xiujuan; Qi, Hehe; Cheng, Yanwei; Zhang, Bing; Zhou, Hongwei; Lv, Xiaohong; Liu, Kangding; Zhang, Hong-Liang

    2016-01-01

    Background: Reversible splenial lesion syndrome (RESLES) is a rare clinico-radiological disorder with unclear pathophysiology. Clinically, RESLES is defined as reversible isolated splenial lesions in the corpus callosum, which can be readily identified by magnetic resonance imaging (MRI) and usually resolve completely over a period of time. RESLES could be typically triggered by infection, antiepileptic drugs (AEDs), poisoning, etc. More factors are increasingly recognized. Methods and results: We reported herein an 18-year-old female patient with lobar pneumonia who developed mental abnormalities during hospitalization. An isolated splenial lesion in the corpus callosum was found by head MRI and the lesion disappeared 15 days later. Based on her clinical manifestations and radiological findings, she was diagnosed with lobar pneumonia associated RESLES. We further summarize the up-to-date knowledge about the etiology, possible pathogenesis, clinical manifestations, radiological features, treatment, and prognosis of RESLES. Conclusion: This report contributes to the clinical understanding of RESLES which may present with mental abnormalities after infection. The characteristic imaging of reversible isolated splenial lesions in the corpus callosum was confirmed in this report. The clinical manifestations and lesions on MRI could disappear naturally after 1 month without special treatment. PMID:27684805

  20. [Correlation between growth rate of corpus callosum and neuromotor development in preterm infants].

    PubMed

    Liu, Rui-Ke; Sun, Jie; Hu, Li-Yan; Liu, Fang

    2015-08-01

    To investigate the growth rate of corpus callosum by cranial ultrasound in very low birth weight preterm infants and to provide a reference for early evaluation and improvement of brain development. A total of 120 preterm infants under 33 weeks' gestation were recruited and divided into 26-29(+6) weeks group (n=64) and 30-32(+6) weeks group (n=56) according to the gestational age. The growth rate of corpus callosum was compared between the two groups. The correlation between the corpus callosum length and the cerebellar vermis length and the relationship of the growth rate of corpus callosum with clinical factors and the neuromotor development were analyzed. The growth rate of corpus callosum in preterm infants declined since 2 weeks after birth. Compared with the 30-32(+6) weeks group, the 26-29(+6) weeks group had a significantly lower growth rate of corpus callosum at 3-4 weeks after birth, at 5-6 weeks after birth, and from 7 weeks after birth to 40 weeks of corrected gestational age. There was a positive linear correlation between the corpus callosum length and the cerebellar vermis length. Small-for-gestational age infants had a low growth rate of corpus callosum at 2 weeks after birth. The 12 preterm infants with severe abnormal intellectual development had a lower growth rate of corpus callosum compared with the 108 preterm infants with non-severe abnormal intellectual development at 3-6 weeks after birth. The 5 preterm infants with severe abnormal motor development had a significantly lower growth rate of corpus callosum compared with the 115 preterm infants with non-severe abnormal motor development at 3-6 weeks after birth. The decline of growth rate of corpus callosum in preterm infants at 2-6 weeks after birth can increase the risk of severe abnormal neuromotor development.

  1. Towards comprehensive syntactic and semantic annotations of the clinical narrative

    PubMed Central

    Albright, Daniel; Lanfranchi, Arrick; Fredriksen, Anwen; Styler, William F; Warner, Colin; Hwang, Jena D; Choi, Jinho D; Dligach, Dmitriy; Nielsen, Rodney D; Martin, James; Ward, Wayne; Palmer, Martha; Savova, Guergana K

    2013-01-01

    Objective To create annotated clinical narratives with layers of syntactic and semantic labels to facilitate advances in clinical natural language processing (NLP). To develop NLP algorithms and open source components. Methods Manual annotation of a clinical narrative corpus of 127 606 tokens following the Treebank schema for syntactic information, PropBank schema for predicate-argument structures, and the Unified Medical Language System (UMLS) schema for semantic information. NLP components were developed. Results The final corpus consists of 13 091 sentences containing 1772 distinct predicate lemmas. Of the 766 newly created PropBank frames, 74 are verbs. There are 28 539 named entity (NE) annotations spread over 15 UMLS semantic groups, one UMLS semantic type, and the Person semantic category. The most frequent annotations belong to the UMLS semantic groups of Procedures (15.71%), Disorders (14.74%), Concepts and Ideas (15.10%), Anatomy (12.80%), Chemicals and Drugs (7.49%), and the UMLS semantic type of Sign or Symptom (12.46%). Inter-annotator agreement results: Treebank (0.926), PropBank (0.891–0.931), NE (0.697–0.750). The part-of-speech tagger, constituency parser, dependency parser, and semantic role labeler are built from the corpus and released open source. A significant limitation uncovered by this project is the need for the NLP community to develop a widely agreed-upon schema for the annotation of clinical concepts and their relations. Conclusions This project takes a foundational step towards bringing the field of clinical NLP up to par with NLP in the general domain. The corpus creation and NLP components provide a resource for research and application development that would have been previously impossible. PMID:23355458

  2. Citgo Refining and Chemicals, West Plant, Corpus Christi, Texas, Order Granting in Part and Denying in Part Petition for Objection to the Title V Permit

    EPA Pesticide Factsheets

    This document may be of assistance in applying the Title V air operating permit regulations. This document is part of the Title V Petition Database available at www2.epa.gov/title-v-operating-permits/title-v-petition-database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.

  3. An Adaptive Tutor for Improving Visual Diagnosis

    DTIC Science & Technology

    2017-10-01

    designed to inform the design of the adaptive tutor including a) focus groups to develop a relative “importance” ranking, b) pairwise comparisons by...Goal – Assemble case library X Focus group to verify controlled vocabulary for diagnosis and importance ranking X Assembled corpus of 80,000 cases and...policy or decision unless so designated by other documentation. REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 Public reporting burden

  4. English "in the Context of" European Integration: A Corpus-Driven Analysis of Lexical Bundles in English EU Documents

    ERIC Educational Resources Information Center

    Jablonkai, Reka

    2010-01-01

    This study extends research into the use of English as a lingua franca in the European context by investigating the most frequent word combinations in English documents issued by EU institutions. As there is little research on the use of the English language within the European Union for ESP pedagogic purposes, as part of a larger scale analysis,…

  5. USI: a fast and accurate approach for conceptual document annotation.

    PubMed

    Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent

    2015-03-14

    Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high-level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage from the features of the document. In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via usual metrics and semantic similarity. By only relying on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet, it provides better results than the other approaches by giving a consistent annotation scored with a global criterion - instead of one score per concept.

  6. Partial segmental thrombosis of the corpus cavernosum: imaging findings.

    PubMed

    Moya-Sánchez, E; Medina-Benítez, A; Medina-Salas, V; Fernández-Navarro, L

    2018-03-05

    Partial segmental thrombosis of the corpus cavernosum is an unusual clinical condition of unknown origin that mainly affects young males, whose characteristic presentation is the appearance of unexplained perineal pain associated with a palpable perineal mass. This entity consists of thrombosis in the perineal portion of the corpus cavernosum, usually unilateral and it is associated with underlying malignant pathologies and predisposing factors such as microtrauma. After the adequate adherence to conservative treatment, the appearance of complications such as erectile dysfunction is very uncommon. Copyright © 2018 SERAM. Publicado por Elsevier España, S.L.U. All rights reserved.

  7. The Islamic State Battle Plan: Press Release Natural Language Processing

    DTIC Science & Technology

    2016-06-01

    Processing, text mining, corpus, generalized linear model, cascade, R Shiny, leaflet, data visualization 15. NUMBER OF PAGES 83 16. PRICE CODE...Terrorism and Responses to Terrorism TDM Term Document Matrix TF Term Frequency TF-IDF Term Frequency-Inverse Document Frequency tm text mining (R...package=leaflet. Feinerer I, Hornik K (2015) Text Mining Package “tm,” Version 0.6-2. (Jul 3) https://cran.r-project.org/web/packages/tm/tm.pdf

  8. Information extraction and knowledge graph construction from geoscience literature

    NASA Astrophysics Data System (ADS)

    Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo; Chen, Jingwen

    2018-03-01

    Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there are limited works on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies for that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining the generic and geology terms from geology dictionaries to train Chinese word segmentation rules of the Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to get a corpus constituted of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected the chord and bigram graphs to visualize the content-words and their links as nodes and edges in a knowledge graph, respectively. The resulting graph presents a clear overview of key information in an unstructured document. This study proves the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
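
    The second and third steps of the workflow above (stop-word removal, then linking content-words as nodes and edges) can be sketched roughly as follows. This is an illustration only: it assumes the documents have already been segmented into words (the CRF-based segmentation is not reproduced), it uses plain adjacent-bigram counts as a stand-in for the unspecified statistical method, and the function names, the `min_count` parameter, and the English sample tokens are hypothetical.

```python
from collections import Counter


def content_words(tokens, stop_words):
    # Drop stop-words from an already segmented document.
    return [t for t in tokens if t not in stop_words]


def bigram_graph(segmented_docs, stop_words, min_count=1):
    """Return (nodes, edges) built from adjacent content-word pairs.

    edges maps (word, next_word) to its count across all documents;
    pairs seen fewer than min_count times are discarded.
    """
    counts = Counter()
    for tokens in segmented_docs:
        words = content_words(tokens, stop_words)
        counts.update(zip(words, words[1:]))
    edges = {pair: n for pair, n in counts.items() if n >= min_count}
    nodes = sorted({word for pair in edges for word in pair})
    return nodes, edges


if __name__ == "__main__":
    # Hypothetical pre-segmented documents, for illustration only.
    docs = [
        ["the", "gold", "deposit", "in", "the", "granite"],
        ["gold", "deposit", "and", "granite"],
    ]
    nodes, edges = bigram_graph(docs, stop_words={"the", "in", "and"})
    print(nodes)
    print(edges)
```

    Note that removing stop-words before pairing makes words adjacent that were separated in the original text; whether that is desirable depends on how the links are meant to be interpreted.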

  9. Opening the black box: why we need a PBL talkbank database.

    PubMed

    Koschmann, T; MacWhinney, B

    2001-01-01

    Interest runs high these days in developing "evidence-based" reviews to provide guidelines for instructional practice. However, we lack careful documentation of the ways in which the practices of problem-based learning (PBL) vary across groups and across implementations. A necessary starting point for developing any sweeping conclusions about the efficacy of PBL as an instructional innovation, therefore, is that we begin to become more articulate about what it is that people do when they say they are doing PBL. A proposal is offered for a new initiative in medical education research, one focused on documenting the range of practices employed in different implementations of PBL. A vital facet of this initiative would be the development of a shared corpus of video recordings referred to here as the "PBL TalkBank database." We propose that medical educators adopt the tradition employed in linguistics and communication studies of creating shared data corpora. The corpus in this case would consist of recordings, transcripts, and research notes documenting PBL practices in different PBL curricula. Preliminary work has been undertaken to develop such a database, and we invite the participation of other researchers.

  10. Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure.

    PubMed

    Zhang, Wen; Xiao, Fan; Li, Bin; Zhang, Siguang

    2016-01-01

    Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) was proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is usually criticized as having low discriminative power for representing documents, although it has been validated as having good representative quality. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. Firstly, we make a survey of existing linear algebra methods for LSI, including both SVD based methods and non-SVD based methods. Secondly, we propose SVD on clusters for LSI and theoretically explain that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold in new documents and terms in a decomposed matrix by SVD on clusters. Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performances of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of interdocument similarity measure in comparison with other SVD based LSI methods.

  11. Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure

    PubMed Central

    Xiao, Fan; Li, Bin; Zhang, Siguang

    2016-01-01

    Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) was proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is usually criticized as having low discriminative power for representing documents, although it has been validated as having good representative quality. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. Firstly, we make a survey of existing linear algebra methods for LSI, including both SVD based methods and non-SVD based methods. Secondly, we propose SVD on clusters for LSI and theoretically explain that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold in new documents and terms in a decomposed matrix by SVD on clusters. Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performances of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of interdocument similarity measure in comparison with other SVD based LSI methods. PMID:27579031

  12. Desiderata for ontologies to be used in semantic annotation of biomedical documents.

    PubMed

    Bada, Michael; Hunter, Lawrence

    2011-02-01

    A wealth of knowledge valuable to the translational research scientist is contained within the vast biomedical literature, but this knowledge is typically in the form of natural language. Sophisticated natural-language-processing systems are needed to translate text into unambiguous formal representations grounded in high-quality consensus ontologies, and these systems in turn rely on gold-standard corpora of annotated documents for training and testing. To this end, we are constructing the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-text biomedical journal articles that are being manually annotated with the entire sets of terms from select vocabularies, predominantly from the Open Biomedical Ontologies (OBO) library. Our efforts in building this corpus have illuminated infelicities of these ontologies with respect to the semantic annotation of biomedical documents, and we propose desiderata whose implementation could substantially improve their utility in this task; these include the integration of overlapping terms across OBOs, the resolution of OBO-specific ambiguities, the integration of the BFO with the OBOs and the use of mid-level ontologies, the inclusion of noncanonical instances, and the expansion of relations and realizable entities. Copyright © 2010 Elsevier Inc. All rights reserved.

  13. Diffusion tensor imaging and myelin composition analysis reveal abnormal myelination in corpus callosum of canine mucopolysaccharidosis I

    PubMed Central

    Provenzale, James M.; Nestrasil, Igor; Chen, Steven; Kan, Shih-hsin; Le, Steven Q.; Jens, Jacqueline K.; Snella, Elizabeth M.; Vondrak, Kristen N.; Yee, Jennifer K.; Vite, Charles H.; Elashoff, David; Duan, Lewei; Wang, Raymond Y.; Ellinwood, N. Matthew; Guzman, Miguel A.; Shapiro, Elsa G.; Dickson, Patricia I.

    2015-01-01

    Children with mucopolysaccharidosis I (MPS I) develop hyperintense white matter foci on T2-weighted brain magnetic resonance (MR) imaging that are associated clinically with cognitive impairment. We report here a diffusion tensor imaging (DTI) and tissue evaluation of white matter in a canine model of MPS I. We found that two DTI parameters, fractional anisotropy (a measure of white matter integrity) and radial diffusivity (which reflects degree of myelination) were abnormal in the corpus callosum of MPS I dogs compared to carrier controls. Tissue studies of the corpus callosum showed reduced expression of myelin-related genes and an abnormal composition of myelin in MPS I dogs. We treated MPS I dogs with recombinant alpha-l-iduronidase, which is the enzyme that is deficient in MPS I disease. The recombinant alpha-l-iduronidase was administered by intrathecal injection into the cisterna magna. Treated dogs showed partial correction of corpus callosum myelination. Our findings suggest that abnormal myelination occurs in the canine MPS I brain, that it may underlie clinically-relevant brain imaging findings in human MPS I patients, and that it may respond to treatment. PMID:26222335

  14. Term Familiarity to indicate Perceived and Actual Difficulty of Text in Medical Digital Libraries.

    PubMed

    Leroy, Gondy; Endicott, James E

    2011-10-01

    With increasing text digitization, digital libraries can personalize materials for individuals with different education levels and language skills. To this end, documents need meta-information describing their difficulty level. Previous attempts at such labeling used readability formulas but the formulas have not been validated with modern texts and their outcome is seldom associated with actual difficulty. We focus on medical texts and are developing new, evidence-based meta-tags that are associated with perceived and actual text difficulty. This work describes a first tag, term familiarity, which is based on term frequency in the Google corpus. We evaluated its feasibility to serve as a tag by looking at a document corpus (N=1,073) and found that terms in blogs or journal articles displayed unexpected but significantly different scores. Term familiarity was then applied to texts and results from a previous user study (N=86) and could better explain differences for perceived and actual difficulty.
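
    The record does not give the formula behind the term familiarity tag, so the following is only a sketch under stated assumptions: familiarity is taken here as the mean log10 corpus frequency of a text's words, and the function name `term_familiarity` and the frequency table are hypothetical stand-ins rather than the authors' method or real Google corpus counts.

```python
import math


def term_familiarity(text, corpus_counts):
    """Mean log10 corpus frequency of the words in text.

    Assumption for this sketch: words missing from corpus_counts count as
    frequency 1, which contributes log10(1) = 0 to the mean.
    """
    words = [w.strip(".,;:()").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(math.log10(corpus_counts.get(w, 1)) for w in words) / len(words)


# Hypothetical frequency table, invented for illustration only.
counts = {
    "heart": 1_000_000,
    "attack": 1_000_000,
    "myocardial": 1_000,
    "infarction": 1_000,
}

lay = term_familiarity("heart attack", counts)
clinical = term_familiarity("myocardial infarction", counts)
```

    With these invented counts the lay phrasing scores higher than the clinical phrasing, which is the direction a familiarity tag would be expected to move.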

  15. Treatment of eyelid injuries during period 2008-2010 at eye clinic of clinical center University of Prishtina.

    PubMed

    Shoshi, Mire; Shoshi, Avdyl; Xhafa, Agim; Kastrati, Fetije; Shoshi, Fitore; Shoshi, Fjolla

    2012-01-01

    During 2008-2010 we treated 446 patients in the Eye Clinic, of whom 252 were hospitalized while 184 were not. Treated patients were 1-85 years old. The aim of this study is to present our experience in the treatment and reconstruction of eyelid injuries. In patients treated in the Eye Clinic, we applied surgical methods, anti-tetanic protection, and local and general medical therapy. The 252 hospitalized patients also had other eye injuries, such as: rupture of bulbus, traumatic cataract, prolapsus iris, hyphaema in CA, prolapsus CV, VLC perforanc cornea et corpus alieni in CA, hyphaema totalis, VLC sclera, corpus alieni intrabulbares. Patients who were not hospitalized were 5-10 years old, 25-35 years old and 20-25 years old. Hospitalized patients were 5-10 years old, 20-25 years old and 30-35 years old. From this we can conclude that there was no significant difference based on the patients' age. In the hospitalized cases with eyelid injuries, the most frequent eye injuries were: VLC perforanc cornea cum prolapsus iridea, corpus alieni in CA et hyphaema.

  16. Semi-automated ontology generation and evolution

    NASA Astrophysics Data System (ADS)

    Stirtzinger, Anthony P.; Anken, Craig S.

    2009-05-01

    Extending the notion of data models or object models, ontology can provide rich semantic definition not only to the meta-data but also to the instance data of domain knowledge, making these semantic definitions available in machine readable form. However, the generation of an effective ontology is a difficult task involving considerable labor and skill. This paper discusses an Ontology Generation and Evolution Processor (OGEP) aimed at automating this process, only requesting user input when un-resolvable ambiguous situations occur. OGEP directly attacks the main barrier which prevents automated (or self learning) ontology generation: the ability to understand the meaning of artifacts and the relationships the artifacts have to the domain space. OGEP leverages existing lexical to ontological mappings in the form of WordNet, and Suggested Upper Merged Ontology (SUMO) integrated with a semantic pattern-based structure referred to as the Semantic Grounding Mechanism (SGM) and implemented as a Corpus Reasoner. The OGEP processing is initiated by a Corpus Parser performing a lexical analysis of the corpus, reading in a document (or corpus) and preparing it for processing by annotating words and phrases. After the Corpus Parser is done, the Corpus Reasoner uses the parts of speech output to determine the semantic meaning of a word or phrase. The Corpus Reasoner is the crux of the OGEP system, analyzing, extrapolating, and evolving data from free text into cohesive semantic relationships. The Semantic Grounding Mechanism provides a basis for identifying and mapping semantic relationships. By blending together the WordNet lexicon and SUMO ontological layout, the SGM is given breadth and depth in its ability to extrapolate semantic relationships between domain entities. The combination of all these components results in an innovative approach to user assisted semantic-based ontology generation. This paper will describe the OGEP technology in the context of the architectural components referenced above and identify a potential technology transition path to Scott AFB's Tanker Airlift Control Center (TACC) which serves as the Air Operations Center (AOC) for the Air Mobility Command (AMC).

  17. A knowledge-driven approach to biomedical document conceptualization.

    PubMed

    Zheng, Hai-Tao; Borchert, Charles; Jiang, Yong

    2010-06-01

    Biomedical document conceptualization is the process of clustering biomedical documents based on ontology-represented domain knowledge. The result of this process is the representation of the biomedical documents by a set of key concepts and their relationships. Most clustering methods cluster documents based on invariant domain knowledge. The objective of this work is to develop an effective method to cluster biomedical documents based on various user-specified ontologies, so that users can exploit the concept structures of documents more effectively. We develop a flexible framework to allow users to specify the knowledge bases, in the form of ontologies. Based on the user-specified ontologies, we develop a key concept induction algorithm, which uses latent semantic analysis to identify key concepts and cluster documents. A corpus-related ontology generation algorithm is developed to generate the concept structures of documents. Based on two biomedical datasets, we evaluate the proposed method and five other clustering algorithms. The clustering results of the proposed method outperform the five other algorithms in terms of key concept identification. With respect to the first biomedical dataset, our method achieves F-measure values of 0.7294 and 0.5294 based on the MeSH ontology and gene ontology (GO), respectively. With respect to the second biomedical dataset, our method achieves F-measure values of 0.6751 and 0.6746 based on the MeSH ontology and GO, respectively. Both results outperform the five other algorithms in terms of F-measure. Based on the MeSH ontology and GO, the generated corpus-related ontologies show informative conceptual structures. The proposed method enables users to specify the domain knowledge to exploit the conceptual structures of biomedical document collections. In addition, the proposed method is able to extract the key concepts and cluster the documents with a relatively high precision. Copyright 2010 Elsevier B.V. All rights reserved.

  18. Corpus callosum atrophy as a marker of clinically meaningful cognitive decline in secondary progressive multiple sclerosis. Impact on employment status.

    PubMed

    Papathanasiou, Athanasios; Messinis, Lambros; Zampakis, Petros; Papathanasopoulos, Panagiotis

    2017-09-01

    Cognitive impairment in Multiple Sclerosis (MS) is more frequent and pronounced in secondary progressive MS (SPMS). Cognitive decline is an important predictor of employment status in patients with MS. Magnetic Resonance Imaging (MRI) markers have been used to associate tissue damage with cognitive dysfunction. The aim of the study was to designate the MRI marker that predicts cognitive decline in SPMS and explore its effect on employment status. 30 SPMS patients and 30 healthy participants underwent neuropsychological assessment using the Trail Making Test (TMT) parts A and B, semantic and phonological verbal fluency task and a computerized cognitive screening battery (Central Nervous System Vital Signs). Employment status was obtained as a quality of life measure. Brain MRI was performed in all participants. We measured total lesion volume, third ventricle width, thalamic and corpus callosum atrophy. The frequency of cognitive decline for our SPMS patients was 80%. SPMS patients differed significantly from controls in all neuropsychological measures. Corpus callosum area was correlated with cognitive flexibility, processing speed, composite memory, executive functions, psychomotor speed, reaction time and phonological verbal fluency task. Processing speed and composite memory were the most sensitive markers for predicting employment status. Corpus callosum area was the most sensitive MRI marker for memory and processing speed. Corpus callosum atrophy predicts a clinically meaningful cognitive decline, affecting employment status in our SPMS patients. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. A System for Automated Extraction of Metadata from Scanned Documents using Layout Recognition and String Pattern Search Models.

    PubMed

    Misra, Dharitri; Chen, Siyuan; Thoma, George R

    2009-01-01

    One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U.S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system.
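
    To make the idea of a rule-based string pattern search concrete, here is a minimal sketch using regular expressions. It does not reproduce the AME system: the field names, the patterns in `FIELD_PATTERNS`, and the sample text are all hypothetical, and the layout recognition step (SVM and HMM) is assumed to have already isolated the text to search.

```python
import re

# Hypothetical field patterns; a real rule set would be derived from the
# recognized layouts of the target collection.
FIELD_PATTERNS = {
    "case_number": re.compile(r"Case\s+No\.?\s*[:\-]?\s*(\d+)", re.IGNORECASE),
    "date": re.compile(r"Dated?\s*[:\-]?\s*([A-Z][a-z]+ \d{1,2}, \d{4})"),
}


def extract_metadata(text):
    """Return the first match of each field pattern found in text."""
    found = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[field] = match.group(1)
    return found


# Invented sample record, for illustration only.
sample = "Case No. 1234\nDated: March 5, 1912\nMisbranding of syrup."
metadata = extract_metadata(sample)
```

    Fields whose pattern does not match are simply absent from the result, which leaves room for a later manual review step.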

  20. Magnetic resonance imaging findings and prognosis of gastric-type mucinous adenocarcinoma (minimal deviation adenocarcinoma or adenoma malignum) of the uterine corpus: Two case reports.

    PubMed

    Hino, Mayo; Yamaguchi, Ken; Abiko, Kaoru; Yoshioka, Yumiko; Hamanishi, Junzo; Kondoh, Eiji; Koshiyama, Masafumi; Baba, Tsukasa; Matsumura, Noriomi; Minamiguchi, Sachiko; Kido, Aki; Konishi, Ikuo

    2016-05-01

    Our group previously documented the first, very rare case of primary gastric-type mucinous adenocarcinoma of the uterine corpus. Although this type of endometrial cancer appears to be similar to the gastric-type adenocarcinoma of the uterine cervix, its main symptoms, appearance on magnetic resonance imaging (MRI) and prognosis have not been fully elucidated due to its rarity. We herein describe an additional case of gastric-type mucinous adenocarcinoma of the endometrium and review the relevant literature. The two cases at our institution (Kyoto University Hospital, Kyoto, Japan) involved postmenopausal women with a primary complaint of abnormal genital bleeding. Microscopic examination of the hysterectomy specimens indicated a highly differentiated mucinous adenocarcinoma with a desmoplastic stromal reaction. Immunohistochemistry for HIK1083 and/or MUC6 was positive in both cases, suggesting a gastric phenotype. Both patients were diagnosed at an advanced stage, they relapsed or recurred immediately after adjuvant chemotherapy, and eventually succumbed to the disease. The main symptom of gastric-type mucinous adenocarcinoma of the uterine cervix is watery discharge, whereas abnormal genital bleeding in addition to watery discharge is mainly observed in the mucinous type of endometrial adenocarcinoma. Cystic cavities in the tumor are present on MRI in cases of endometrial origin, and prognosis is very poor due to resistance to chemotherapy. Thus, gastric-type mucinous adenocarcinoma of the uterine endometrium exhibits a clinical behavior that is similar to tumors originating from the uterine cervix, but is associated with distinguishing clinical symptoms. The incidence of gastric-type endometrial adenocarcinoma may be higher than expected.

  1. Different Pattern of Inflammatory and Atrophic Changes in the Gastric Mucosa of the Greater and Lesser Curvature.

    PubMed

    Isajevs, Sergejs; Liepniece-Karele, Inta; Svirina, Darja; Santare, Daiga; Kaidaks, Sandris; Sivins, Armands; Vikmanis, Uldis; Leja, Marcis

    2015-12-01

    Appropriate biopsy sampling is important for the classification of gastritis, yet the extent of inflammation and atrophy of different regions of the stomach with chronic gastritis have been addressed only in a few studies. The aim of our study was to analyze the inflammatory, atrophic and metaplastic changes in the greater and lesser curvature of the antrum and corpus mucosa. 420 patients undergoing upper endoscopy were enrolled in the study. Four expert gastrointestinal pathologists graded biopsy specimens according to the updated Sydney classification. The obtained results showed that the mononuclear and granulocyte inflammatory cells were more prominent in the corpus lesser curvature compared to the corpus greater curvature (p=0.01 and p=0.0001, respectively). In addition, the extent and degree of atrophy and intestinal metaplasia were more prominent in the corpus lesser compared to the greater curvature (p=0.002 and p=0.0065, respectively). The frequency of distribution of H. pylori did not differ throughout both the corpus and antrum greater and lesser curvature. However, the degree of H. pylori colonization in the corpus was higher in the lesser than in the greater curvature. The interobserver agreement was significantly higher for corpus atrophy compared to antrum atrophy. These findings demonstrated that the more severe atrophic, metaplastic and inflammatory changes were observed in the lesser compared to the greater curvature of the stomach. In routine clinical settings, corpus and antral biopsies should be obtained from both lesser and greater curvature. Analysis of the incisura biopsy is also important.

  2. Diffuse axonal injury by assault.

    PubMed

    Imajo, T; Challener, R C; Roessmann, U

    1987-09-01

    A case of diffuse axonal injury (DAI) by assault is reported. The majority of DAI cases documented have been due to traffic accidents and some due to falls from height. DAI is caused by angular or rotational acceleration of the victim's head. The condition is common and is the second most important head injury after subdural hematoma with regard to death. Its clinical picture is characterized by immediate and prolonged coma or demented state. Because of the subtle nature of histological changes in DAI, awareness and intentional search for the lesion is essential. The triad of DAI is as follows: focal lesions (hemorrhages and/or lacerations) in the corpus callosum and brain stem, and microscopic demonstration of axonal damage--retraction balls. The concept of DAI will elucidate and enhance the understanding of many head trauma cases.

  3. The function of the corpus luteum of pregnancy in ovulatory dysfunction and luteal phase deficiency.

    PubMed

    Soules, M R; Hughes, C L; Aksel, S; Tyrey, L; Hammond, C B

    1981-07-01

    Relatively little knowledge exists of corpus luteum function in early pregnancy after the successful treatment of ovulatory dysfunction or luteal phase deficiency. To assess the activity of the corpus luteum of such patients, human chorionic gonadotropin (hCG) and 17-hydroxyprogesterone (17-OH-P) levels were determined in serum samples obtained from normal women (44 patients), women with ovulatory dysfunction (10 patients), and women with luteal phase deficiency (7 patients); all determinations were made during conceptive cycles, and sampling continued into the first trimester of pregnancy. There were no statistically significant abnormalities of hCG levels when infertility patients were compared with control patients. According to the premise that 17-OH-P levels reflect corpus luteal function, there appeared to be adequate function in pregnancies after progesterone treatment of luteal phase deficiency. In pregnancies following ovulation induction with clomiphene, the corpus luteum function, on the basis of 17-OH-P levels, was significantly increased in magnitude and duration. These results have clinical implications with regard to supplemental hormone therapy in early pregnancy.

  4. [Structural change of the corpus callosum fibers in toddlers with autism spectrum disorder: two-year follow-up].

    PubMed

    Chang, C; Qiu, N N; Xiao, T; Xiao, X; Chu, K K; Li, Y; Wu, Q R; Fang, H; Ke, X Y

    2017-12-02

    Objective: To conduct a follow-up investigation of structural changes of the corpus callosum fibers of toddlers (2 to 5 years of age) with autism spectrum disorder (ASD) and to explore the associations with clinical symptoms. Method: In this prospective randomized controlled study, ASD children who were diagnosed in the Child Mental Health Research Center, Nanjing Brain Hospital Affiliated to Nanjing Medical University from May 2011 to November 2012 were included in the ASD group, and developmentally delayed children were included in the control group (DD group). Diffusion tensor imaging (DTI) data from the two groups were obtained at two age levels: 2-3 years of age, and 4-5 years of age. Region of interest analysis was applied to assess characteristic values of total area and sub-regions of corpus callosum: the fractional anisotropy (FA), the mean diffusivity (MD), the radial diffusivity (RD) and the axial diffusivity (AD). All children were assessed using the Autism Diagnostic Interview-Revised (ADI-R) and Autism Treatment Evaluation Checklist (ATEC). The characteristic values of total area and sub-regions of corpus callosum of ASD group at two age levels were analyzed by paired sample t test; the characteristic values of total area and sub-regions of corpus callosum of ASD group and DD group were analyzed by independent-sample t test; the correlations between FA values of the total area and sub-regions of corpus callosum and ADI-R or ATEC scores were analyzed by Pearson correlation analysis. Result: Forty cases meeting inclusion criteria were enrolled in ASD group, and 31 eligible cases were enrolled in the control group. Four children in the ASD group were lost to follow-up, and 5 children in the control group were lost to follow-up. Longitudinal comparison between the two age subgroups of ASD patients showed that the FA values of the total corpus callosum increased (0.499 55±0.027 59 vs. 0.505 83±0.086 64, t= 4.88, P <0.05), but MD values, RD values and AD values of the total corpus callosum area decreased (0.000 89±0.000 03 vs. 0.000 81±0.000 14, 0.000 61±0.000 04 vs. 0.000 55±0.000 09, 0.001 43±0.000 03 vs. 0.001 38±0.000 03, t= 9.31, 7.90, 8.66, P <0.05 for all comparisons). In the area of corpus callosum genu, FA and AD values increased ( t= 5.59, 8.48, P <0.05 for both comparisons), but MD and RD values decreased ( t= 12.67, 11.28, P <0.05 for both comparisons). In the area of corpus callosum body, FA and RD values increased ( t= 5.46, 8.48, P <0.05 for both comparisons), but MD and AD values decreased ( t= 8.08, 6.22, P <0.05 for both comparisons). In the area of corpus callosum splenium, MD, RD and AD values decreased ( t= 6.81, 4.44, 5.51, P < 0.05 for all comparisons). Among the participants 2 to 3 years of age, there were no significant differences in FA values of total area and sub-regions of corpus callosum between ASD group and the DD group ( P > 0.05 for all comparisons); as compared with the DD group, ASD group had higher AD values of total area and splenium of corpus callosum (0.001 43±0.000 03 vs. 0.001 40±0.000 04, 0.001 34±0.000 03 vs. 0.001 32±0.000 04, t= 1.56, 1.14, P < 0.05 for both comparisons); ASD group had lower AD values but higher RD and MD values of corpus callosum genu ( t= 0.07, 0.55, 0.07, P < 0.05 for all comparisons); ASD group had lower RD values of corpus callosum body ( t= 0.07, P < 0.05). Among the participants 4 to 5 years of age, as compared with the DD group, ASD group had higher FA value of total corpus callosum area (0.505 83±0.086 64 vs. 0.483 77±0.099 30, t= 8.56, P < 0.05), lower RD value of total corpus callosum (0.000 55±0.000 09 vs. 0.000 56±0.000 12, t= 14.44, P < 0.05), lower RD values of corpus callosum body ( t= 2.20, P < 0.05), higher FA values ( t= 3.35, P < 0.05) but lower AD values of corpus callosum splenium ( t= 2.20, P < 0.05). A correlation analysis between FA values of total area and sub-regions of corpus callosum and clinical variables showed that the FA values of total area and splenium of corpus callosum in ASD group at 2 to 3 years of age were negatively correlated with the scores of language skills in ATEC ( r= -0.35, -0.36, P < 0.05 for both comparisons). After two years, FA values of total corpus callosum were positively correlated with the scores of social communication in ATEC ( r= 0.34, P < 0.05). There was no significant correlation between FA values of sub-regions of corpus callosum and the scores of ATEC ( P > 0.05 for all comparisons). There was no significant correlation between FA values of total area and sub-regions of corpus callosum and the scores of ADI-R ( P > 0.05 for all comparisons). Conclusion: The fiber structure of corpus callosum was still in the process of maturing during the age of 2 to 5 years; however, compared with DD group, ASD group had more extensive structural abnormalities of the corpus callosum fibers as age increased, and the structural abnormalities were correlated with the core symptoms of ASD. Trial registration: Chinese Clinical Trial Registry, ChiCTR-OPC-17011995.

  5. Clustering XML Documents Using Frequent Subtrees

    NASA Astrophysics Data System (ADS)

    Kutty, Sangeetha; Tran, Tien; Nayak, Richi; Li, Yuefeng

    This paper presents an experimental study conducted over the INEX 2008 Document Mining Challenge corpus using both the structure and the content of XML documents for clustering them. The concise common substructures known as the closed frequent subtrees are generated using the structural information of the XML documents. The closed frequent subtrees are then used to extract the constrained content from the documents. A matrix containing the term distribution of the documents in the dataset is developed using the extracted constrained content. The k-way clustering algorithm is applied to the matrix to obtain the required clusters. In spite of the large number of documents in the INEX 2008 Wikipedia dataset, the proposed frequent subtree-based clustering approach was successful in clustering the documents. This approach significantly reduces the dimensionality of the terms used for clustering without much loss in accuracy.

  6. Chi-square-based scoring function for categorization of MEDLINE citations.

    PubMed

    Kastrin, A; Peterlin, B; Hristovski, D

    2010-01-01

    Text categorization has been used in biomedical informatics for identifying documents containing relevant topics of interest. We developed a simple method that uses a chi-square-based scoring function to determine the likelihood of MEDLINE citations containing genetically relevant topics. Our procedure requires construction of a genetic and a nongenetic domain document corpus. We used MeSH descriptors assigned to MEDLINE citations for this categorization task. We compared frequencies of MeSH descriptors between the two corpora by applying the chi-square test. A MeSH descriptor was considered to be a positive indicator if its relative observed frequency in the genetic domain corpus was greater than its relative observed frequency in the nongenetic domain corpus. The output of the proposed method is a list of scores for all the citations, with the highest score given to those citations containing MeSH descriptors typical for the genetic domain. Validation was done on a set of 734 manually annotated MEDLINE citations. It achieved predictive accuracy of 0.87 with 0.69 recall and 0.64 precision. We evaluated the method by comparing it to three machine-learning algorithms (support vector machines, decision trees, naïve Bayes). Although the differences were not statistically significant, results showed that our chi-square scoring performs as well as the compared machine-learning algorithms. We suggest that the chi-square scoring is an effective solution to help categorize MEDLINE citations. The algorithm is implemented in the BITOLA literature-based discovery support system as a preprocessor for gene symbol disambiguation process.
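
    The scoring idea in this record can be sketched as follows. This is an illustration under assumptions, not the BITOLA implementation: each descriptor gets a Pearson chi-square statistic from a 2x2 table of citation counts, the statistic is given a positive sign when the descriptor is relatively more frequent in the genetic corpus (as the record describes), and a citation's score is taken to be the sum of its descriptors' signed statistics. The summation rule, the function names, and the counts are hypothetical.

```python
def chi_square(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def descriptor_scores(genetic_counts, nongenetic_counts, n_genetic, n_nongenetic):
    """Signed chi-square per descriptor; positive means typical of the genetic corpus."""
    scores = {}
    for desc in set(genetic_counts) | set(nongenetic_counts):
        g = genetic_counts.get(desc, 0)
        ng = nongenetic_counts.get(desc, 0)
        stat = chi_square(g, n_genetic - g, ng, n_nongenetic - ng)
        sign = 1 if g / n_genetic > ng / n_nongenetic else -1
        scores[desc] = sign * stat
    return scores


def score_citation(descriptors, scores):
    """Assumed aggregation for this sketch: sum of signed descriptor statistics."""
    return sum(scores.get(d, 0.0) for d in descriptors)


# Invented counts: number of citations (out of 100 per corpus) carrying each descriptor.
genetic = {"Mutation": 80, "Humans": 90}
nongenetic = {"Mutation": 10, "Humans": 90, "Surgery": 40}
scores = descriptor_scores(genetic, nongenetic, 100, 100)

genetic_like = score_citation(["Mutation", "Humans"], scores)
clinical_like = score_citation(["Surgery", "Humans"], scores)
```

    A descriptor that is equally common in both corpora, such as "Humans" in this toy data, contributes nothing to the score, so only discriminating descriptors move a citation up or down the ranking.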

  7. Topic Modeling of NASA Space System Problem Reports: Research in Practice

    NASA Technical Reports Server (NTRS)

    Layman, Lucas; Nikora, Allen P.; Meek, Joshua; Menzies, Tim

    2016-01-01

    Problem reports at NASA are similar to bug reports: they capture defects found during test, post-launch operational anomalies, and document the investigation and corrective action of the issue. These artifacts are a rich source of lessons learned for NASA, but are expensive to analyze since problem reports are comprised primarily of natural language text. We apply topic modeling to a corpus of NASA problem reports to extract trends in testing and operational failures. We collected 16,669 problem reports from six NASA space flight missions and applied Latent Dirichlet Allocation topic modeling to the document corpus. We analyze the most popular topics within and across missions, and how popular topics changed over the lifetime of a mission. We find that hardware material and flight software issues are common during the integration and testing phase, while ground station software and equipment issues are more common during the operations phase. We identify a number of challenges in topic modeling for trend analysis: 1) the process of selecting the topic modeling parameters lacks definitive guidance, 2) defining semantically meaningful topic labels requires nontrivial effort and domain expertise, 3) topic models derived from the combined corpus of the six missions were biased toward the larger missions, and 4) topics must be semantically distinct as well as cohesive to be useful. Nonetheless, topic modeling can identify problem themes within missions and across mission lifetimes, providing useful feedback to engineers and project managers.

  8. Seckel's syndrome and malformations of cortical development: report of three new cases and review of the literature.

    PubMed

    Capovilla, G; Lorenzetti, M E; Montagnini, A; Borgatti, R; Piccinelli, P; Giordano, L; Accorsi, P; Caudana, R

    2001-05-01

    Seckel's syndrome is a rare form of primordial dwarfism, characterized by peculiar facial appearance. In the past, this condition was overdiagnosed, and most attention was given to the facial and skeletal features to define more precise diagnostic criteria. The presence of mental retardation and neurologic signs is one of the peculiar features of this syndrome, but only recently were rare cases of malformation of cortical development described, as documented by magnetic resonance imaging (MRI). Here, we present three new cases of Seckel's syndrome showing different malformations of cortical development (one gyral hypoplasia, one macrogyria and partial corpus callosum agenesis, and one bilateral opercular macrogyria). We hypothesize that the different types of clinical expression of our patients could be explained by different malformation of cortical development types. We think that MRI studies could be performed in malformative syndromes because of the possible correlations between type and extent of the lesion and the clinical picture of any individual case.

  9. Comparison of Human and Latent Semantic Analysis (LSA) Judgements of Pairwise Document Similarities for a News Corpus

    DTIC Science & Technology

    2004-09-01

    University. Miro Kraetzl critically assessed the manuscript before it was sent for review. References Allan, J., Callan, J., Croft, W.B., Ballesteros, L...Conference (TREC 6). NIST Special Publication 500-240. Baayen,R.H. (2001). Word Frequency Distributions. Kluwer Academic Publishers, P.O. Box 322 , 3300

  10. Punishing Kids: The Rise of the "Boot Camp"

    ERIC Educational Resources Information Center

    Mills, Martin; Pini, Barbara

    2015-01-01

    This paper is concerned with the rise of 'the boot camp' as a means of addressing "the problem of troubled youth" in contemporary industrialised nations such as Australia and the UK. Drawing on a corpus of publicly available material including press releases and policy documents, media reports, and programme websites, the paper explores…

  11. Linguistic, Cognitive, and Social Constraints on Lexical Entrenchment

    ERIC Educational Resources Information Center

    Chesley, Paula

    2011-01-01

    How do new words become established in a speech community? This dissertation documents linguistic, cognitive, and social factors that are hypothesized to affect "lexical entrenchment," the extent to which a new word becomes part of the lexicon of a speech community. First, in a longitudinal corpus study, I find that linguistic properties such as…

  12. Optimal Weight Assignment for a Chinese Signature File.

    ERIC Educational Resources Information Center

    Liang, Tyne; And Others

    1996-01-01

    Investigates the performance of a character-based Chinese text retrieval scheme in which monogram keys and bigram keys are encoded into document signatures. Tests and verifies the theoretical predictions of the optimal weight assignments and the minimal false hit rate in experiments using a real Chinese corpus for disyllabic queries of different…

  13. An Integrated System for Managing the Andalusian Parliament's Digital Library

    ERIC Educational Resources Information Center

    de Campos, Luis M.; Fernandez-Luna, Juan M.; Huete, Juan F.; Martin-Dancausa, Carlos J.; Tagua-Jimenez, Antonio; Tur-Vigil, Carmen

    2009-01-01

    Purpose: The purpose of this paper is to present an overview of the reorganisation of the Andalusian Parliament's digital library to improve the electronic representation and access of its official corpus by taking advantage of a document's internal organisation. Video recordings of the parliamentary sessions have also been integrated with their…

  14. Rapid Exploitation and Analysis of Documents

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buttler, D J; Andrzejewski, D; Stevens, K D

    Analysts are overwhelmed with information. They have large archives of historical data, both structured and unstructured, and continuous streams of relevant messages and documents that they need to match to current tasks, digest, and incorporate into their analysis. The purpose of the READ project is to develop technologies to make it easier to catalog, classify, and locate relevant information. We approached this task from multiple angles. First, we tackle the issue of processing large quantities of information in reasonable time. Second, we provide mechanisms that allow users to customize their queries based on latent topics exposed from corpus statistics. Third, we assist users in organizing query results, adding localized expert structure over results. Fourth, we use word sense disambiguation techniques to increase the precision of matching user-generated keyword lists with terms and concepts in the corpus. Fifth, we enhance co-occurrence statistics with latent topic attribution, to aid entity relationship discovery. Finally, we quantitatively analyze the quality of three popular latent modeling techniques to examine under which circumstances each is useful.

  15. A System for Automated Extraction of Metadata from Scanned Documents using Layout Recognition and String Pattern Search Models

    PubMed Central

    Misra, Dharitri; Chen, Siyuan; Thoma, George R.

    2010-01-01

    One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U. S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system. PMID:21179386

  16. Shape analysis of corpus callosum in autism subtype using planar conformal mapping

    NASA Astrophysics Data System (ADS)

    He, Qing; Duan, Ye; Yin, Xiaotian; Gu, Xianfeng; Karsch, Kevin; Miles, Judith

    2009-02-01

    A number of studies have documented that autism has a neurobiological basis, but the anatomical extent of these neurobiological abnormalities is largely unknown. In this study, we aimed at analyzing highly localized shape abnormalities of the corpus callosum in a homogeneous group of children with autism. Thirty patients with essential autism and twenty-four controls participated in this study. 2D contours of the corpus callosum were extracted from MR images by a semiautomatic segmentation method, and the 3D model was constructed by stacking the contours. The resulting 3D model had two openings at the ends, thus a new conformal parameterization for high genus surfaces was applied in our shape analysis work, which mapped each surface onto a planar domain. Surface matching among different individual meshes was achieved by re-triangulating each mesh according to a template surface. Statistical shape analysis was used to compare the 3D shapes point by point between patients with autism and their controls. The results revealed significant abnormalities in the anterior-most and anterior body regions in the essential autism group.

  17. Boomerang sign on MRI.

    PubMed

    Hirsch, Karen G; Hoesch, Robert E

    2012-06-01

    Altered mental status and more subtle cognitive and personality changes after traumatic brain injury (TBI) are pervasive problems in patients who survive initial injury. MRI is not necessarily part of the diagnostic evaluation of these patients. Case report with relevant image and review of the literature. Injury to the corpus callosum is commonly described in traumatic brain injury; however, extensive lesions in the splenium are not well described. This image shows an important pattern of brain injury and demonstrates a common clinical syndrome seen in patients with corpus callosum pathology. Injury to the splenium of the corpus callosum due to trauma may be extensive and can cause significant neurologic deficits. MRI is important in the diagnostic evaluation of patients with cognitive changes after TBI.

  18. A Bayesian network coding scheme for annotating biomedical information presented to genetic counseling clients.

    PubMed

    Green, Nancy

    2005-04-01

    We developed a Bayesian network coding scheme for annotating biomedical content in layperson-oriented clinical genetics documents. The coding scheme supports the representation of probabilistic and causal relationships among concepts in this domain, at a high enough level of abstraction to capture commonalities among genetic processes and their relationship to health. We are using the coding scheme to annotate a corpus of genetic counseling patient letters as part of the requirements analysis and knowledge acquisition phase of a natural language generation project. This paper describes the coding scheme and presents an evaluation of intercoder reliability for its tag set. In addition to giving examples of use of the coding scheme for analysis of discourse and linguistic features in this genre, we suggest other uses for it in analysis of layperson-oriented text and dialogue in medical communication.

  19. Discourse of 'transformational leadership' in infection control.

    PubMed

    Koteyko, Nelya; Carter, Ronald

    2008-10-01

    The article explores the impact of the 'transformational leadership' style in the role of modern matron with regard to infection control practices. Policy and guidance on the modern matron role suggest that it is distinctive in its combination of management and clinical components, and in its reliance on transformational leadership. Senior nurses are therefore expected to motivate staff by creating high expectations, modelling appropriate behaviour, and providing personal attention to followers by giving respect and responsibility. In this article, we draw on policy documents and interview data to explore the potential impact of this new management style on infection control practices. Combining the techniques of discourse analysis and corpus linguistics, we identify examples where matrons appear to disassociate themselves from the role of 'an empowered manager' who has control over human and financial resources to resolve problems in infection control efficiently.

  20. Standard Information Models for Representing Adverse Sensitivity Information in Clinical Documents.

    PubMed

    Topaz, M; Seger, D L; Goss, F; Lai, K; Slight, S P; Lau, J J; Nandigam, H; Zhou, L

    2016-01-01

    Adverse sensitivity (e.g., allergy and intolerance) information is a critical component of any electronic health record system. While several standards exist for structured entry of adverse sensitivity information, many clinicians record this data as free text. This study aimed to 1) identify and compare the existing common adverse sensitivity information models, and 2) evaluate the coverage of the adverse sensitivity information models for representing allergy information on a subset of inpatient and outpatient adverse sensitivity clinical notes. We compared four common adverse sensitivity information models: Health Level 7 Allergy and Intolerance Domain Analysis Model, HL7-DAM; the Fast Healthcare Interoperability Resources, FHIR; the Consolidated Continuity of Care Document, C-CDA; and OpenEHR, and evaluated their coverage on a corpus of inpatient and outpatient notes (n = 120). We found that allergy specialists' notes had the highest frequency of adverse sensitivity attributes per note, whereas emergency department notes had the fewest attributes. Overall, the models had many similarities in the central attributes which covered between 75% and 95% of adverse sensitivity information contained within the notes. However, representations of some attributes (especially the value-sets) were not well aligned between the models, which is likely to present an obstacle for achieving data interoperability. Also, adverse sensitivity exceptions were not well represented among the information models. Although we found that common adverse sensitivity models cover a significant portion of relevant information in the clinical notes, our results highlight areas that need to be reconciled between the standards for data interoperability.

  1. Magnetic resonance imaging findings and prognosis of gastric-type mucinous adenocarcinoma (minimal deviation adenocarcinoma or adenoma malignum) of the uterine corpus: Two case reports

    PubMed Central

    HINO, MAYO; YAMAGUCHI, KEN; ABIKO, KAORU; YOSHIOKA, YUMIKO; HAMANISHI, JUNZO; KONDOH, EIJI; KOSHIYAMA, MASAFUMI; BABA, TSUKASA; MATSUMURA, NORIOMI; MINAMIGUCHI, SACHIKO; KIDO, AKI; KONISHI, IKUO

    2016-01-01

    Our group previously documented the first, very rare case of primary gastric-type mucinous adenocarcinoma of the uterine corpus. Although this type of endometrial cancer appears to be similar to the gastric-type adenocarcinoma of the uterine cervix, its main symptoms, appearance on magnetic resonance imaging (MRI) and prognosis have not been fully elucidated due to its rarity. We herein describe an additional case of gastric-type mucinous adenocarcinoma of the endometrium and review the relevant literature. The two cases at our institution (Kyoto University Hospital, Kyoto, Japan) involved postmenopausal women with a primary complaint of abnormal genital bleeding. Microscopic examination of the hysterectomy specimens indicated a highly differentiated mucinous adenocarcinoma with a desmoplastic stromal reaction. Immunohistochemistry for HIK1083 and/or MUC6 was positive in both cases, suggesting a gastric phenotype. Both patients were diagnosed at an advanced stage, they relapsed or recurred immediately after adjuvant chemotherapy, and eventually succumbed to the disease. The main symptom of gastric-type mucinous adenocarcinoma of the uterine cervix is watery discharge, whereas abnormal genital bleeding in addition to watery discharge is mainly observed in the mucinous type of endometrial adenocarcinoma. Cystic cavities in the tumor are present on MRI in cases of endometrial origin, and prognosis is very poor due to resistance to chemotherapy. Thus, gastric-type mucinous adenocarcinoma of the uterine endometrium exhibits a clinical behavior that is similar to tumors originating from the uterine cervix, but is associated with distinguishing clinical symptoms. The incidence of gastric-type endometrial adenocarcinoma may be higher than expected. PMID:27123265

  2. Semantic Role Labeling of Clinical Text: Comparing Syntactic Parsers and Features

    PubMed Central

    Zhang, Yaoyun; Jiang, Min; Wang, Jingqi; Xu, Hua

    2016-01-01

    Semantic role labeling (SRL), which extracts shallow semantic relation representation from different surface textual forms of free text sentences, is important for understanding clinical narratives. Since semantic roles are formed by syntactic constituents in the sentence, an effective parser, as well as an effective syntactic feature set are essential to build a practical SRL system. Our study initiates a formal evaluation and comparison of SRL performance on a clinical text corpus MiPACQ, using three state-of-the-art parsers, the Stanford parser, the Berkeley parser, and the Charniak parser. First, the original parsers trained on the open domain syntactic corpus Penn Treebank were employed. Next, those parsers were retrained on the clinical Treebank of MiPACQ for further comparison. Additionally, state-of-the-art syntactic features from open domain SRL were also examined for clinical text. Experimental results showed that retraining the parsers on clinical Treebank improved the performance significantly, with an optimal F1 measure of 71.41% achieved by the Berkeley parser. PMID:28269926

  3. Corpus callosal atrophy and associations with cognitive impairment in Parkinson disease

    PubMed Central

    Bledsoe, Ian O.; Merkitch, Doug; Dinh, Vy; Bernard, Bryan; Stebbins, Glenn T.

    2017-01-01

    Objective: To investigate atrophy of the corpus callosum on MRI in Parkinson disease (PD) and its relationship to cognitive impairment. Methods: One hundred patients with PD and 24 healthy control participants underwent clinical and neuropsychological evaluations and structural MRI brain scans. Participants with PD were classified as cognitively normal (PD-NC; n = 28), having mild cognitive impairment (PD-MCI; n = 47), or having dementia (PDD; n = 25) by Movement Disorder Society criteria. Cognitive domain (attention/working memory, executive function, memory, language, visuospatial function) z scores were calculated. With the use of FreeSurfer image processing, volumes for total corpus callosum and its subsections (anterior, midanterior, central, midposterior, posterior) were computed and normalized by total intracranial volume. Callosal volumes were compared between participants with PD and controls and among PD cognitive groups, covarying for age, sex, and PD duration and with multiple comparison corrections. Regression analyses were performed to evaluate relationships between callosal volumes and performance in cognitive domains. Results: Participants with PD had reduced corpus callosum volumes in midanterior and central regions compared to healthy controls. Participants with PDD demonstrated decreased callosal volumes involving multiple subsections spanning anterior to posterior compared to participants with PD-MCI and PD-NC. Regional callosal atrophy predicted cognitive domain performance such that central volumes were associated with the attention/working memory domain; midposterior volumes with executive function, language, and memory domains; and posterior volumes with memory and visuospatial domains. Conclusions: Notable volume loss occurs in the corpus callosum in PD, with specific neuroanatomic distributions in PDD and relationships of regional atrophy to different cognitive domains. Callosal volume loss may contribute to clinical manifestations of PD cognitive impairment. PMID:28235816

  4. Challenges for automatically extracting molecular interactions from full-text articles.

    PubMed

    McIntosh, Tara; Curran, James R

    2009-09-24

    The increasing availability of full-text biomedical articles will allow more biomedical knowledge to be extracted automatically with greater reliability. However, most Information Retrieval (IR) and Extraction (IE) tools currently process only abstracts. The lack of corpora has limited the development of tools that are capable of exploiting the knowledge in full-text articles. As a result, there has been little investigation into the advantages of full-text document structure, and the challenges developers will face in processing full-text articles. We manually annotated passages from full-text articles that describe interactions summarised in a Molecular Interaction Map (MIM). Our corpus tracks the process of identifying facts to form the MIM summaries and captures any factual dependencies that must be resolved to extract the fact completely. For example, a fact in the results section may require a synonym defined in the introduction. The passages are also annotated with negated and coreference expressions that must be resolved. We describe the guidelines for identifying relevant passages and possible dependencies. The corpus includes 2162 sentences from 78 full-text articles. Our corpus analysis demonstrates the necessity of full-text processing; identifies the article sections where interactions are most commonly stated; and quantifies the proportion of interaction statements requiring coherent dependencies. Further, it allows us to report on the relative importance of identifying synonyms and resolving negated expressions. We also experiment with an oracle sentence retrieval system using the corpus as a gold-standard evaluation set. We introduce the MIM corpus, a unique resource that maps interaction facts in a MIM to annotated passages within full-text articles. It is an invaluable case study providing guidance to developers of biomedical IR and IE systems, and can be used as a gold-standard evaluation set for full-text IR tasks.

  5. Estimation of the Mean Axon Diameter and Intra-axonal Space Volume Fraction of the Human Corpus Callosum: Diffusion q-space Imaging with Low q-values.

    PubMed

    Suzuki, Yuriko; Hori, Masaaki; Kamiya, Kouhei; Fukunaga, Issei; Aoki, Shigeki; Van Cauteren, Marc

    2016-01-01

    Q-space imaging (QSI) is a diffusion-weighted imaging (DWI) technique that enables investigation of tissue microstructure. However, for sufficient displacement resolution to measure the microstructure, QSI requires high q-values that are usually difficult to achieve with a clinical scanner. The recently introduced "low q-value method" fits the echo attenuation to only low q-values to extract the root mean square displacement. We investigated the clinical feasibility of the low q-value method for estimating the microstructure of the human corpus callosum using a 3.0-tesla clinical scanner within a clinically feasible scan time. We performed a simulation to explore the acceptable range of maximum q-values for the low q-value method. We simulated echo attenuations caused by restricted diffusion in the intra-axonal space (IAS) and hindered diffusion in the extra-axonal space (EAS) assuming 100,000 cylinders with various diameters, and we estimated mean axon diameter, IAS volume fraction, and EAS diffusivity by fitting echo attenuations with different maximum q-values. Furthermore, we scanned the corpus callosum of 7 healthy volunteers and estimated the mean axon diameter and IAS volume fraction. Good agreement between estimated and defined values in the simulation study with maximum q-values of 700 and 800 cm⁻¹ suggested that the maximum q-value used in the in vivo experiment, 737 cm⁻¹, was reasonable. In the in vivo experiment, the mean axon diameter was larger in the body of the corpus callosum and smaller in the genu and splenium, and this anterior-to-posterior trend is consistent with previously reported histology, although our mean axon diameter seems larger in size. On the other hand, we found an opposite anterior-to-posterior trend, with high IAS volume fraction in the genu and splenium and a lower fraction in the body, which is similar to the fiber density reported in the histology study. The low q-value method may provide insights into tissue microstructure using a 3T clinical scanner within clinically feasible scan time.

  6. Playing with Word Endings: Morphological Variation in the Learning of Russian Noun Inflections

    ERIC Educational Resources Information Center

    Kempe, Vera; Brooks, Patricia J.; Mironova, Natalija; Pershukova, Angelina; Fedorova, Olga

    2007-01-01

    This paper documents the occurrence of form variability through diminutive "wordplay", and examines whether this variability facilitates or hinders morphology acquisition in a richly inflected language. First, in a longitudinal speech corpus of eight Russian mothers conversing with their children (1.6-3.6), and with an adult, the use of diminutive…

  7. Linguistic Prescriptivism in Letters to the Editor

    ERIC Educational Resources Information Center

    Lukac, Morana

    2016-01-01

    The public's concern with the fate of the standard language has been well documented in the history of the complaint tradition. The print media have for centuries featured letters to the editor on questions of language use. This study examines a corpus of 258 language-related letters to the editor published in the English-speaking print media. By…

  8. Three modality image registration of brain SPECT/CT and MR images for quantitative analysis of dopamine transporter imaging

    NASA Astrophysics Data System (ADS)

    Yamaguchi, Yuzuho; Takeda, Yuta; Hara, Takeshi; Zhou, Xiangrong; Matsusako, Masaki; Tanaka, Yuki; Hosoya, Kazuhiko; Nihei, Tsutomu; Katafuchi, Tetsuro; Fujita, Hiroshi

    2016-03-01

    Important features of Parkinson's disease (PD) are the degeneration and loss of dopamine neurons in the corpus striatum. 123I-FP-CIT can visualize the activity of the dopamine neurons. The activity ratio of background to corpus striatum is used for the diagnosis of PD and dementia with Lewy bodies (DLB). The specific activity can be observed in the corpus striatum on SPECT images, but the location and shape of the corpus striatum are often lost on SPECT images alone because of the low uptake. In contrast, MR images can visualize the location of the corpus striatum. The purpose of this study was to realize a quantitative image analysis of SPECT images by using an image registration technique with brain MR images that can determine the region of the corpus striatum. In this study, an image fusion technique was used to fuse SPECT and MR images via the intervening CT image acquired by SPECT/CT. Mutual information (MI) was used for the registration between CT and MR images. Six SPECT/CT and four MR scans of phantom materials were taken while changing the direction. As a result of the image registrations, 16 of 24 combinations were registered within 1.3 mm. When the approach was applied to 32 clinical SPECT/CT and MR cases, all of the cases were registered within 0.86 mm. In conclusion, our registration method has potential for superimposing MR images on SPECT images.
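
    The abstract above names mutual information (MI) as the similarity measure for CT-to-MR registration but does not describe its computation. The following is a minimal sketch of histogram-based MI, assuming 8-bit intensities and a fixed number of bins; the function names, the bin count, and the 1D circular-shift search are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=256):
    """Histogram-based mutual information (in bits) of two intensity lists.

    Illustrative only: `bins` and `max_value` are assumed settings.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    width = max_value / bins
    # Quantize intensities into histogram bins.
    qa = [min(int(v / width), bins - 1) for v in a]
    qb = [min(int(v / width), bins - 1) for v in b]
    n = len(qa)
    joint = Counter(zip(qa, qb))   # joint histogram
    count_a = Counter(qa)          # marginal histogram of a
    count_b = Counter(qb)          # marginal histogram of b
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) )
        mi += (c / n) * math.log2(c * n / (count_a[i] * count_b[j]))
    return mi


def best_shift(fixed, moving, max_shift):
    """Return the circular shift of `moving` that maximizes MI with `fixed`.

    A toy 1D stand-in for a registration search (hypothetical helper).
    """
    def shifted(s):
        return moving[s:] + moving[:s]

    return max(range(-max_shift, max_shift + 1),
               key=lambda s: mutual_information(fixed, shifted(s)))


# Toy "two-modality" signals: `moving` shows the same structure as `fixed`
# with remapped intensities, displaced by one sample.
fixed = [0, 0, 0, 200, 200, 0, 0, 0]
moving = [255, 255, 255, 255, 50, 50, 255, 255]
print(best_shift(fixed, moving, 3))
```

    MI depends only on how consistently intensities co-occur, not on their actual values, which is why it is commonly chosen when the two images come from different modalities.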

  9. Biallelic PMS2 Mutation and Heterozygous DICER1 Mutation Presenting as Constitutional Mismatch Repair Deficiency With Corpus Callosum Agenesis: Case Report and Review of Literature.

    PubMed

    Cheyuo, Cletus; Radwan, Walid; Ahn, Janice; Gyure, Kymberly; Qaiser, Rabia; Tomboc, Patrick

    2017-10-01

    Constitutional mismatch repair deficiency syndrome is a cancer predisposition syndrome caused by autosomal recessive biallelic (homozygous) germline mutations in the mismatch repair genes (MLH1, MSH2, MSH6, and PMS2). The clinical spectrum includes neoplastic and non-neoplastic manifestations. We present the case of a 7-year-old boy who presented with T-lymphoblastic lymphoma and glioblastoma, together with non-neoplastic manifestations including corpus callosum agenesis, arachnoid cyst, developmental venous anomaly, and hydrocephalus. Gene mutation analysis revealed pathogenic biallelic mutations of PMS2 and heterozygous DICER1 variant predicted to be pathogenic. This report is the first to allude to a possible interaction of the mismatch repair system with DICER1 to cause corpus callosum agenesis.

  10. Incorporating Semantics into Data Driven Workflows for Content Based Analysis

    NASA Astrophysics Data System (ADS)

    Argüello, M.; Fernandez-Prieto, M. J.

    Finding meaningful associations between text elements and knowledge structures within clinical narratives in a highly verbal domain, such as psychiatry, is a challenging goal. The research presented here uses a small corpus of case histories and brings into play pre-existing knowledge, and therefore complements other approaches that use large corpora (millions of words) and no pre-existing knowledge. The paper describes a variety of experiments for content-based analysis: Linguistic Analysis using NLP-oriented approaches, Sentiment Analysis, and Semantically Meaningful Analysis. Although it is not standard practice, the paper advocates providing automatic support to annotate the functionality as well as the data for each experiment by performing semantic annotation that uses OWL and OWL-S. Lessons learnt can be transmitted to legacy clinical databases facing the conversion of clinical narratives according to prominent Electronic Health Records standards.

  11. MorphoSaurus--design and evaluation of an interlingua-based, cross-language document retrieval engine for the medical domain.

    PubMed

    Markó, K; Schulz, S; Hahn, U

    2005-01-01

    We propose an interlingua-based indexing approach to account for the particular challenges that arise in the design and implementation of cross-language document retrieval systems for the medical domain. Documents, as well as queries, are mapped to a language-independent conceptual layer on which retrieval operations are performed. We contrast this approach with the direct translation of German queries to English ones which, subsequently, are matched against English documents. We evaluate both approaches, interlingua-based and direct translation, on a large medical document collection, the OHSUMED corpus. A substantial benefit for interlingua-based document retrieval using German queries on English texts is found, which amounts to 93% of the (monolingual) English baseline. Most state-of-the-art cross-language information retrieval systems translate user queries to the language(s) of the target documents. In contra-distinction to this approach, translating both documents and user queries into a language-independent, concept-like representation format is more beneficial to enhance cross-language retrieval performance.

  12. Automatic extraction of angiogenesis bioprocess from text

    PubMed Central

    Wang, Xinglong; McKendrick, Iain; Barrett, Ian; Dix, Ian; French, Tim; Tsujii, Jun'ichi; Ananiadou, Sophia

    2011-01-01

    Motivation: Understanding key biological processes (bioprocesses) and their relationships with constituent biological entities and pharmaceutical agents is crucial for drug design and discovery. One way to harvest such information is searching the literature. However, bioprocesses are difficult to capture because they may occur in text in a variety of textual expressions. Moreover, a bioprocess is often composed of a series of bioevents, where a bioevent denotes changes to one or a group of cells involved in the bioprocess. Such bioevents are often used to refer to bioprocesses in text, which current techniques, relying solely on specialized lexicons, struggle to find. Results: This article presents a range of methods for finding bioprocess terms and events. To facilitate the study, we built a gold standard corpus in which terms and events related to angiogenesis, a key biological process of the growth of new blood vessels, were annotated. Statistics of the annotated corpus revealed that over 36% of the text expressions that referred to angiogenesis appeared as events. The proposed methods respectively employed domain-specific vocabularies, a manually annotated corpus and unstructured domain-specific documents. Evaluation results showed that, while a supervised machine-learning model yielded the best precision, recall and F1 scores, the other methods achieved reasonable performance and less cost to develop. Availability: The angiogenesis vocabularies, gold standard corpus, annotation guidelines and software described in this article are available at http://text0.mib.man.ac.uk/~mbassxw2/angiogenesis/ Contact: xinglong.wang@gmail.com PMID:21821664

  13. SPECTRa-T: machine-based data extraction and semantic searching of chemistry e-theses.

    PubMed

    Downing, Jim; Harvey, Matt J; Morgan, Peter B; Murray-Rust, Peter; Rzepa, Henry S; Stewart, Diana C; Tonge, Alan P; Townsend, Joe A

    2010-02-22

    The SPECTRa-T project has developed text-mining tools to extract named chemical entities (NCEs), such as chemical names and terms, and chemical objects (COs), e.g., experimental spectral assignments and physical chemistry properties, from electronic theses (e-theses). Although NCEs were readily identified within the two major document formats studied, only the use of structured documents enabled identification of chemical objects and their association with the relevant chemical entity (e.g., systematic chemical name). A corpus of theses was analyzed and it is shown that a high degree of semantic information can be extracted from structured documents. This integrated information has been deposited in a persistent Resource Description Framework (RDF) triple-store that allows users to conduct semantic searches. The strengths and weaknesses of several document formats are reviewed.

  14. Automatic Processing of Metallurgical Abstracts for the Purpose of Information Retrieval. Final Report.

    ERIC Educational Resources Information Center

    Melton, Jessica S.

    Objectives of this project were to develop and test a method for automatically processing the text of abstracts for a document retrieval system. The test corpus consisted of 768 abstracts from the metallurgical section of Chemical Abstracts (CA). The system, based on a subject indexing rationale, had two components: (1) a stored dictionary of words…

  15. How well does multiple OCR error correction generalize?

    NASA Astrophysics Data System (ADS)

    Lund, William B.; Ringger, Eric K.; Walker, Daniel D.

    2013-12-01

    As the digitization of historical documents, such as newspapers, becomes more common, the need of the archive patron for accurate digital text from those documents increases. Building on our earlier work, the contributions of this paper are: 1. in demonstrating the applicability of novel methods for correcting optical character recognition (OCR) on disparate data sets, including a new synthetic training set, 2. enhancing the correction algorithm with novel features, and 3. assessing the data requirements of the correction learning method. First, we correct errors using conditional random fields (CRF) trained on synthetic training data sets in order to demonstrate the applicability of the methodology to unrelated test sets. Second, we show the strength of lexical features from the training sets on two unrelated test sets, yielding a relative reduction in word error rate on the test sets of 6.52%. New features capture the recurrence of hypothesis tokens and yield an additional relative reduction in WER of 2.30%. Further, we show that only 2.0% of the full training corpus of over 500,000 feature cases is needed to achieve correction results comparable to those using the entire training corpus, effectively reducing both the complexity of the training process and the learned correction model.
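
    The relative reductions in word error rate (WER) quoted above can be illustrated with a small calculation. The sketch below is not the authors' CRF-based correction method; it only shows one common way to compute WER (word-level edit distance divided by reference length) and a relative reduction between two WER values. The function names and the example strings are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def relative_reduction(baseline, corrected):
    """Relative reduction of an error rate, as a fraction of the baseline."""
    return (baseline - corrected) / baseline


# Hypothetical OCR output with one misspelled word and one dropped word.
wer = word_error_rate("the quick brown fox", "the quikc brown")
print(wer)
```

    Under this definition, a relative reduction of 6.52% means the corrected WER is 93.48% of the baseline WER, not that the WER fell by 6.52 percentage points.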

  16. Exploring dangerous neighborhoods: Latent Semantic Analysis and computing beyond the bounds of the familiar

    PubMed Central

    Cohen, Trevor; Blatter, Brett; Patel, Vimla

    2005-01-01

    Certain applications require computer systems to approximate intended human meaning. This is achievable in constrained domains with a finite number of concepts. Areas such as psychiatry, however, draw on concepts from the world-at-large. A knowledge structure with broad scope is required to comprehend such domains. Latent Semantic Analysis (LSA) is an unsupervised corpus-based statistical method that derives quantitative estimates of the similarity between words and documents from their contextual usage statistics. The aim of this research was to evaluate the ability of LSA to derive meaningful associations between concepts relevant to the assessment of dangerousness in psychiatry. An expert reference model of dangerousness was used to guide the construction of a relevant corpus. Derived associations between words in the corpus were evaluated qualitatively. A similarity-based scoring function was used to assign dangerousness categories to discharge summaries. LSA was shown to derive intuitive relationships between concepts and correlated significantly better than random with human categorization of psychiatric discharge summaries according to dangerousness. The use of LSA to derive a simulated knowledge structure can extend the scope of computer systems beyond the boundaries of constrained conceptual domains. PMID:16779020

  17. Neuroaxonal ion dyshomeostasis of the normal-appearing corpus callosum in experimental autoimmune encephalomyelitis.

    PubMed

    Chen, Chiao-Chi V; Zechariah, Anil; Hsu, Yi-Hua; Chen, Hsiao-Wen; Yang, Li-Chuan; Chang, Chen

    2008-04-01

    Atrophy of the corpus callosum (CC) is a well-documented observation in clinically definite multiple sclerosis (MS) patients. One recent hypothesis for the neurodegeneration that occurs in MS is that ion dyshomeostasis leads to neuroaxonal damage. To examine whether ion dyshomeostasis occurs in the CC during MS onset, experimental autoimmune encephalomyelitis (EAE) was utilized as an animal MS model to induce autoimmunity-mediated responses. To date, in vivo investigations of neuronal ion homeostasis have not been feasible using traditional neuroscience techniques. Therefore, the current study employed an emerging MRI method, called Mn2+-enhanced MRI (MEMRI). Mn2+ dynamics is closely associated with important neuronal activity events, and is also considered to be a Ca2+ surrogate. Furthermore, when injected intracranially, Mn2+ can be used as a multisynaptic tracer. These features enable MEMRI to detect neuronal ion homeostasis within a multisynaptic circuit that is connected to the injection site. Mn2+ was injected into the visual cortex to trace the CC, and T1-weighted imaging was utilized to observe temporal changes in Mn2+-induced signals in the traced pathways. The results showed that neuroaxonal functional changes associated with ion dyshomeostasis occurred in the CC during an acute EAE attack. In addition, the pathway appeared normal, although EAE-induced immune-cell infiltration was visible around the CC. The findings suggest that ion dyshomeostasis is a major neuronal aberration underlying the deterioration of normal-appearing brain tissues in MS, supporting its involvement in neuroaxonal functioning in MS.

  18. Hereditary motor and sensory neuropathy with agenesis of the corpus callosum.

    PubMed

    Dupré, Nicolas; Howard, Heidi C; Mathieu, Jean; Karpati, George; Vanasse, Michel; Bouchard, Jean-Pierre; Carpenter, Stirling; Rouleau, Guy A

    2003-07-01

    Hereditary motor and sensory neuropathy associated with agenesis of the corpus callosum (OMIM 218000) is an autosomal recessive disease of early onset characterized by a delay in developmental milestones, a severe sensory-motor polyneuropathy with areflexia, a variable degree of agenesis of the corpus callosum, amyotrophy, hypotonia, and cognitive impairment. Although this disorder has rarely been reported worldwide, it has a high prevalence in the Saguenay-Lac-St-Jean region of the province of Quebec (Canada) predominantly because of a founder effect. The gene defect responsible for this disorder recently has been identified, and it is a protein-truncating mutation in the SLC12A6 gene, which codes for a cotransporter protein known as KCC3. Herein, we provide the first extensive review of this disorder, covering epidemiological, clinical, and molecular genetic studies.

  19. Similarity-Based Recommendation of New Concepts to a Terminology

    PubMed Central

    Chandar, Praveen; Yaman, Anil; Hoxha, Julia; He, Zhe; Weng, Chunhua

    2015-01-01

    Terminologies can suffer from poor concept coverage due to delays in addition of new concepts. This study tests a similarity-based approach to recommending concepts from a text corpus to a terminology. Our approach involves extraction of candidate concepts from a given text corpus, which are represented using a set of features. The model learns the important features to characterize a concept and recommends new concepts to a terminology. Further, we propose a cost-effective evaluation methodology to estimate the effectiveness of terminology enrichment methods. To test our methodology, we use the clinical trial eligibility criteria free-text as an example text corpus to recommend concepts for SNOMED CT. We computed precision at various rank intervals to measure the performance of the methods. Results indicate that our automated algorithm is an effective method for concept recommendation. PMID:26958170
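
    The evaluation described above reports precision at various rank intervals. As a minimal sketch of that metric (not the authors' evaluation code), precision at rank k is the fraction of the top k recommended candidates that are judged relevant. The function name, the candidate terms and the set of accepted concepts below are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    # Dividing by k (not by the number of returned items) penalizes short lists.
    return hits / k


# Hypothetical ranked candidate concepts and the subset a reviewer accepted.
ranked_candidates = [
    "stent thrombosis",
    "informed consent",
    "ejection fraction",
    "study drug",
    "renal impairment",
]
accepted = {"stent thrombosis", "ejection fraction", "renal impairment"}

for k in (1, 3, 5):
    print(k, precision_at_k(ranked_candidates, accepted, k))
```

    Reporting the value at several cutoffs, as the abstract describes, shows how quickly recommendation quality degrades further down the ranked list.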

  20. An Infinite Mixture Model for Coreference Resolution in Clinical Notes

    PubMed Central

    Liu, Sijia; Liu, Hongfang; Chaudhary, Vipin; Li, Dingcheng

    2016-01-01

    It is widely acknowledged that natural language processing is indispensable to process electronic health records (EHRs). However, poor performance in relation detection tasks, such as coreference (linguistic expressions pertaining to the same entity/event) may affect the quality of EHR processing. Hence, there is a critical need to advance the research for relation detection from EHRs. Most of the clinical coreference resolution systems are based on either supervised machine learning or rule-based methods. The need for a manually annotated corpus hampers the use of such systems at large scale. In this paper, we present an infinite mixture model method using definite sampling to resolve coreferent relations among mentions in clinical notes. A similarity measure function is proposed to determine the coreferent relations. Our system achieved a 0.847 F-measure for i2b2 2011 coreference corpus. These promising results and the unsupervised nature of the method make it possible to apply the system in big-data clinical settings. PMID:27595047

  1. SureChEMBL: a large-scale, chemically annotated patent document database.

    PubMed

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P

    2016-01-04

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. SureChEMBL: a large-scale, chemically annotated patent document database

    PubMed Central

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A.; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P.

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  3. Diabetes insipidus with impaired osmotic regulation in septo-optic dysplasia and agenesis of the corpus callosum.

    PubMed Central

    Masera, N; Grant, D B; Stanhope, R; Preece, M A

    1994-01-01

    The clinical and endocrinological findings in 24 children with septo-optic dysplasia and/or agenesis of the corpus callosum are described with particular reference to posterior pituitary function. Nine had diabetes insipidus. The prevalence of diabetes insipidus was similar in children with complete and incomplete forms of septo-optic dysplasia. Maintenance of normal osmotic balance was very difficult in six of these children, even after the introduction of treatment with vasopressin, either as desmopressin, or lysine vasopressin spray in one of the early cases. PMID:8110009

  4. Physical Activity Behavioral Intervention in Obese Endometrial Cancer Survivors

    ClinicalTrials.gov

    2015-10-14

    Stage IA Uterine Corpus Cancer; Stage IB Uterine Corpus Cancer; Stage II Uterine Corpus Cancer; Stage IIIA Uterine Corpus Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  5. Sentence-Level Attachment Prediction

    NASA Astrophysics Data System (ADS)

    Albakour, M.-Dyaa; Kruschwitz, Udo; Lucas, Simon

    Attachment prediction is the task of automatically identifying email messages that should contain an attachment. This can be useful to tackle the problem of sending out emails but forgetting to include the relevant attachment (something that happens all too often). A common Information Retrieval (IR) approach in analyzing documents such as emails is to treat the entire document as a bag of words. Here we propose a finer-grained analysis to address the problem. We aim at identifying individual sentences within an email that refer to an attachment. If we detect any such sentence, we predict that the email should have an attachment. Using part of the Enron corpus for evaluation we find that our finer-grained approach outperforms previously reported document-level attachment prediction in similar evaluation settings.

  6. Radiation Therapy, Paclitaxel, and Carboplatin in Treating Patients With High-Risk Endometrial Cancer

    ClinicalTrials.gov

    2016-01-11

    Endometrial Adenocarcinoma; Stage IA Uterine Corpus Cancer; Stage IB Uterine Corpus Cancer; Stage II Uterine Corpus Cancer; Stage IIIA Uterine Corpus Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  7. Cabozantinib and Nivolumab in Treating Patients With Advanced, Recurrent or Metastatic Endometrial Cancer

    ClinicalTrials.gov

    2018-06-13

    Recurrent Uterine Corpus Carcinoma; Stage III Uterine Corpus Cancer AJCC v7; Stage IIIA Uterine Corpus Cancer AJCC v7; Stage IIIB Uterine Corpus Cancer AJCC v7; Stage IIIC Uterine Corpus Cancer AJCC v7; Stage IIIC1 Uterine Corpus Cancer AJCC v7; Stage IIIC2 Uterine Corpus Cancer AJCC v7; Stage IV Uterine Corpus Cancer AJCC v7; Stage IVA Uterine Corpus Cancer AJCC v7; Stage IVB Uterine Corpus Cancer AJCC v7

  8. Integrity of the corpus callosum in patients with periventricular nodular heterotopia related epilepsy by FLNA mutation.

    PubMed

    Liu, Wenyu; An, Dongmei; Niu, Running; Gong, Qiyong; Zhou, Dong

    2018-01-01

    To investigate the quantitative diffusion properties of the corpus callosum (CC) in a large group of patients with periventricular nodular heterotopia (PNH) related epilepsy and to further investigate the effect of Filamin A (FLNA) mutation on these properties. Patients with PNH (n = 34), subdivided into FLNA-mutated (n = 11) and FLNA-nonmutated patients (n = 23) and healthy controls (n = 34), underwent 3.0 T structural MRI and diffusion imaging scan (64 direction). Fractional anisotropy (FA) and mean diffusivity (MD) were measured in the three major subdivisions of the CC (genu, body and splenium). Correlations between DTI metric changes and clinical parameters were also evaluated. Furthermore, the effect of FLNA mutation on structural integrity of the corpus callosum was examined. Patients with PNH and epilepsy had significant reductions in FA for the genu and splenium of the CC, accompanied by increases in MD for the splenium, as compared to healthy controls. There were no correlations between clinical parameters of epilepsy and MD. The FA value in the splenium negatively correlated with epilepsy duration. Interestingly, FLNA-mutated patients showed significantly decreased FA for all three major subdivisions of the CC, and increased MD for the genu and splenium, as compared to HCs and FLNA-nonmutated patients. These findings support the conclusion that patients with epilepsy secondary to PNH present widespread microstructural changes found in the corpus callosum that extend beyond the macroscopic MRI-visible lesions. This study also indicates that FLNA may affect white matter integrity in this disorder.

  9. Cueing musical emotions: An empirical analysis of 24-piece sets by Bach and Chopin documents parallels with emotional speech.

    PubMed

    Poon, Matthew; Schutz, Michael

    2015-01-01

    Acoustic cues such as pitch height and timing are effective at communicating emotion in both music and speech. Numerous experiments altering musical passages have shown that higher and faster melodies generally sound "happier" than lower and slower melodies, findings consistent with corpus analyses of emotional speech. However, equivalent corpus analyses of complex time-varying cues in music are less common, due in part to the challenges of assembling an appropriate corpus. Here, we describe a novel, score-based exploration of the use of pitch height and timing in a set of "balanced" major and minor key compositions. Our analysis included all 24 Preludes and 24 Fugues from Bach's Well-Tempered Clavier (book 1), as well as all 24 of Chopin's Preludes for piano. These three sets are balanced with respect to both modality (major/minor) and key chroma ("A," "B," "C," etc.). Consistent with predictions derived from speech, we found major-key (nominally "happy") pieces to be two semitones higher in pitch height and 29% faster than minor-key (nominally "sad") pieces. This demonstrates that our balanced corpus of major and minor key pieces uses low-level acoustic cues for emotion in a manner consistent with speech. A series of post hoc analyses illustrate interesting trade-offs, with sets featuring greater emphasis on timing distinctions between modalities exhibiting the least pitch distinction, and vice-versa. We discuss these findings in the broader context of speech-music research, as well as recent scholarship exploring the historical evolution of cue use in Western music.

  10. Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus.

    PubMed

    Stubbs, Amber; Uzuner, Özlem

    2015-12-01

    The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on the de-identification of longitudinal medical records. For this track, we de-identified a set of 1304 longitudinal medical records describing 296 patients. This corpus was de-identified under a broad interpretation of the HIPAA guidelines using double-annotation followed by arbitration, rounds of sanity checking, and proofreading. The average token-based F1 measure for the annotators compared to the gold standard was 0.927. The resulting annotations were used both to de-identify the data and to set the gold standard for the de-identification track of the 2014 i2b2/UTHealth shared task. All annotated private health information was replaced with realistic surrogates automatically and then read over and corrected manually. The resulting corpus is the first of its kind made available for de-identification research. This corpus was first used for the 2014 i2b2/UTHealth shared task, during which the systems achieved a mean F-measure of 0.872 and a maximum F-measure of 0.964 using entity-based micro-averaged evaluations. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Handedness and corpus callosal morphology in Williams syndrome.

    PubMed

    Martens, Marilee A; Wilson, Sarah J; Chen, Jian; Wood, Amanda G; Reutens, David C

    2013-02-01

    Williams syndrome is a neurodevelopmental genetic disorder caused by a hemizygous deletion on chromosome 7q11.23, resulting in atypical brain structure and function, including abnormal morphology of the corpus callosum. An influence of handedness on the size of the corpus callosum has been observed in studies of typical individuals, but handedness has not been taken into account in studies of callosal morphology in Williams syndrome. We hypothesized that callosal area is smaller and the size of the splenium and isthmus is reduced in individuals with Williams syndrome compared to healthy controls, and examined age, sex, and handedness effects on corpus callosal area. Structural magnetic resonance imaging scans were obtained on 25 individuals with Williams syndrome (18 right-handed, 7 left-handed) and 25 matched controls. We found that callosal thickness was significantly reduced in the splenium of Williams syndrome individuals compared to controls. We also found novel evidence that the callosal area was smaller in left-handed participants with Williams syndrome than their right-handed counterparts, with opposite findings observed in the control group. This novel finding may be associated with LIM-kinase hemizygosity, a characteristic of Williams syndrome. The findings may have significant clinical implications in future explorations of the Williams syndrome cognitive phenotype.

  12. Acute alcohol intoxication, diffuse axonal injury and intraventricular bleeding in patients with isolated blunt traumatic brain injury.

    PubMed

    Matsukawa, Hidetoshi; Shinoda, Masaki; Fujii, Motoharu; Takahashi, Osamu; Murakata, Atsushi; Yamamoto, Daisuke

    2013-01-01

    The influence of blood alcohol level (BAL) on outcome remains unclear. This study investigated the relationships between BAL, type and number of diffuse axonal injury (DAI), intraventricular bleeding (IVB) and 6-month outcome. This study reviewed 419 patients with isolated blunt traumatic brain injury. First, it compared clinical and radiological characteristics between patients with good recovery and disability. Second, it compared BAL among DAI lesions. Third, it evaluated the correlation between the BAL and severity of IVB, number of DAI and corpus callosum injury lesions. Regardless of BAL, older age, male gender, severe Glasgow Coma Scale score (<9), abnormal pupil, IVB and lesion on genu of corpus callosum were significantly related to disability. There were no significant differences between the BAL and lesions of DAI. Simple regression analysis revealed that there was no significant correlation between BAL and severity of IVB, number of DAI and corpus callosum injury lesions. Acute alcohol intoxication was not associated with type and number of DAI lesion, IVB and disability. This study suggested that a specific type of traumatic lesion, specifically lesion on genu of corpus callosum and IVB, might be more vital for outcome.

  13. [Penile injury caused by a Moulinette. Result of autoerotic self-mutilation].

    PubMed

    Lehsnau, M

    2007-07-01

    Autoerotic manipulations of external male genitals resulting in mutilation with different degrees of severity are rare. We report the clinical case of a 12-year-old boy who injured his glans, left corpus cavernosum and corpus spongiosum with opened urethra as a consequence of autoerotic genital self-mutilation. According to our knowledge of the current literature this is the first description of autoerotic genital self-mutilation with a Moulinette. A Moulinette is a kitchen tool with an electric engine and an extremely fast rotary double knife, which is used to reduce food into small pieces, especially vegetables and fruits.

  14. 21 CFR 522.995 - Fluprostenol.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... intramuscular injection. (2) Indications for use. For use in mares for its luteolytic effect to control the timing of estrus in estrous cycling and in clinically anestrous mares that have a corpus luteum. (3...

  15. Temsirolimus With or Without Megestrol Acetate and Tamoxifen Citrate in Treating Patients With Advanced, Persistent, or Recurrent Endometrial Cancer

    ClinicalTrials.gov

    2017-04-11

    Endometrial Carcinoma; Recurrent Uterine Corpus Carcinoma; Stage IIIA Uterine Corpus Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC1 Uterine Corpus Cancer; Stage IIIC2 Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  16. Practice guidelines for management of uterine corpus cancer in Korea: a Korean Society of Gynecologic Oncology Consensus Statement

    PubMed Central

    Hong, Dae Gy; Shin, So-Jin; Ju, Woong; Cho, Hanbyoul; Lee, Chulmin; Kim, Hyun-Jung; Bae, Duk-Soo

    2017-01-01

    Clinical practice guidelines for gynecologic cancers have been developed by many organizations. Although these guidelines have much in common in terms of the practice of standard of care for uterine corpus cancer, practice guidelines that reflect the characteristics of patients and healthcare and insurance systems are needed for each country. The Korean Society of Gynecologic Oncology (KSGO) published the first edition of practice guidelines for gynecologic cancer treatment in late 2006; the second edition was released in July 2010 as an evidence-based recommendation. The Guidelines Revision Committee was established in 2015 and decided to produce the third edition of the guidelines as an advanced form based on evidence-based medicine, considering up-to-date clinical trials and abundant qualified Korean data. These guidelines cover screening, surgery, adjuvant treatment, and advanced and recurrent disease with respect to endometrial carcinoma and uterine sarcoma. The committee members and many gynecologic oncologists derived key questions from the discussion, and a number of relevant scientific literatures were reviewed in advance. Recommendations for each specific question were developed by the consensus conference, and they are summarized here, together with other details. The objective of these practice guidelines is to establish standard policies on issues in clinical areas related to the management of uterine corpus cancer based on the findings in published papers to date and the consensus of experts as a KSGO Consensus Statement. PMID:27894165

  17. Scalable ranked retrieval using document images

    NASA Astrophysics Data System (ADS)

    Jain, Rajiv; Oard, Douglas W.; Doermann, David

    2013-12-01

    Despite the explosion of text on the Internet, hard copy documents that have been scanned as images still play a significant role for some tasks. The best method to perform ranked retrieval on a large corpus of document images, however, remains an open research question. The most common approach has been to perform text retrieval using terms generated by optical character recognition. This paper, by contrast, examines whether a scalable segmentation-free image retrieval algorithm, which matches sub-images containing text or graphical objects, can provide additional benefit in satisfying a user's information needs on a large, real world dataset. Results on 7 million scanned pages from the CDIP v1.0 test collection show that content based image retrieval finds a substantial number of documents that text retrieval misses, and that when used as a basis for relevance feedback can yield improvements in retrieval effectiveness.

  18. Inferring Group Processes from Computer-Mediated Affective Text Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schryver, Jack C; Begoli, Edmon; Jose, Ajith

    2011-02-01

    Political communications in the form of unstructured text convey rich connotative meaning that can reveal underlying group social processes. Previous research has focused on sentiment analysis at the document level, but we extend this analysis to sub-document levels through a detailed analysis of affective relationships between entities extracted from a document. Instead of pure sentiment analysis, which is just positive or negative, we explore nuances of affective meaning in 22 affect categories. Our affect propagation algorithm automatically calculates and displays extracted affective relationships among entities in graphical form in our prototype (TEAMSTER), starting with seed lists of affect terms. Several useful metrics are defined to infer underlying group processes by aggregating affective relationships discovered in a text. Our approach has been validated with annotated documents from the MPQA corpus, achieving a performance gain of 74% over comparable random guessers.

  19. Comparison of Grouping Methods for Template Extraction from VA Medical Record Text.

    PubMed

    Redd, Andrew M; Gundlapalli, Adi V; Divita, Guy; Tran, Le-Thuy; Pettey, Warren B P; Samore, Matthew H

    2017-01-01

    We investigate options for grouping templates for the purpose of template identification and extraction from electronic medical records. We sampled a corpus of 1000 documents originating from the Veterans Health Administration (VA) electronic medical record. We grouped documents through hashing and binning tokens (Hashed) as well as by the top 5% of tokens identified as important through the term frequency inverse document frequency metric (TF-IDF). We then compared the approaches on the number of groups with 3 or more and the resulting longest common subsequences (LCSs) common to all documents in the group. We found that the Hashed method had a higher success rate for finding LCSs, and longer LCSs, than the TF-IDF method. The TF-IDF approach, however, found more groups than the Hashed method and consequently more long sequences, although the average length of its LCSs was lower. In conclusion, each algorithm appears to have areas where it appears to be superior.
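
    The longest common subsequence (LCS) used above to recover template text can be sketched with the standard dynamic-programming algorithm over tokens. This is an illustration for two documents only, not the authors' pipeline, which computes LCSs common to all documents in a group; the function name and the two sample notes are hypothetical.

```python
def lcs_tokens(a, b):
    """Return one longest common subsequence of two token lists."""
    m, n = len(a), len(b)
    # dp[i][j] = length of the LCS of a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the start to recover the shared tokens in order.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


# Two hypothetical notes generated from the same template.
note_1 = "Chief complaint : chest pain . Vitals : stable".split()
note_2 = "Chief complaint : headache . Vitals : not recorded".split()
print(" ".join(lcs_tokens(note_1, note_2)))
```

    The tokens that survive in the LCS are the candidate template text, while the tokens that drop out are the patient-specific fill-ins.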

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crain, Steven P.; Yang, Shuang-Hong; Zha, Hongyuan

    Access to health information by consumers is hampered by a fundamental language gap. Current attempts to close the gap leverage consumer oriented health information, which does not, however, have good coverage of slang medical terminology. In this paper, we present a Bayesian model to automatically align documents with different dialects (slang, common and technical) while extracting their semantic topics. The proposed diaTM model enables effective information retrieval, even when the query contains slang words, by explicitly modeling the mixtures of dialects in documents and the joint influence of dialects and topics on word selection. Simulations using consumer questions to retrieve medical information from a corpus of medical documents show that diaTM achieves a 25% improvement in information retrieval relevance by nDCG@5 over an LDA baseline.
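
    The nDCG@5 figure above is a ranking-quality metric. The sketch below shows one common formulation (graded gain divided by log2 of rank plus one, normalized by the ideal ordering); the abstract does not say which variant the authors used. The function names and gain values are hypothetical, and as a simplifying assumption the ideal ranking is built only from the gains of the retrieved list.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k graded relevance values."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains, k):
    """DCG normalized by the DCG of the same gains sorted best-first."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades (0-3) of the top results for one query.
print(ndcg_at_k([0, 1, 3, 2, 0], 5))
print(ndcg_at_k([3, 2, 1, 0, 0], 5))
```

    A result list that places the most relevant documents first scores 1.0, so an improvement in nDCG@5 means relevant documents moved closer to the top five positions.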

  1. Helicobacter pylori eradication and reflux disease onset: did gastric acid get "crazy"?

    PubMed

    Zullo, Angelo; Hassan, Cesare; Repici, Alessandro; Bruzzese, Vincenzo

    2013-02-14

    Gastroesophageal reflux disease (GORD) is highly prevalent in the general population. In the last decade, a potential relationship between Helicobacter pylori (H. pylori) eradication and GORD onset has been claimed. The main putative mechanism is the gastric acid hypersecretion that develops after bacterial cure in those patients with corpus-predominant gastritis. We performed a critical reappraisal of the intricate pathogenesis and clinical data available in this field. Oesophagitis onset after H. pylori eradication in duodenal ulcer patients has been ascribed to a gastric acid hypersecretion, which could develop following body gastritis healing. However, the absence of an acid hypersecretive status in these patients is documented by both pathophysiology and clinical studies. Indeed, duodenal ulcer recurrence is virtually abolished following H. pylori eradication. In addition, intra-oesophageal pH recording studies failed to demonstrate increased acid reflux following bacterial eradication. Moreover, oesophageal manometric studies suggest that H. pylori eradication would reduce--rather than favor--acid reflux into the oesophagus. Finally, data of clinical studies would suggest that H. pylori eradication is not significantly associated with either reflux symptoms or erosive oesophagitis onset, some data suggesting also an advantage in curing the infection when oesophagitis is already present. Therefore, the legend of "crazy acid" remains--as all the others--a fascinating, but imaginary tale.

  2. Neonatal case studies using active leptospermum honey.

    PubMed

    Mohr, Lynn D; Reyna, Roxana; Amaya, Rene

    2014-01-01

    Treatment of the neonatal patient with clinically complex wounds creates a challenge due to the safety and efficacy issues associated with the use of many advanced wound care products. The purpose of this case series was to present outcomes of 3 neonates with wounds of differing etiologies managed by Active Leptospermum Honey (ALH). Clinical case series. Clinical experiences with 3 neonates, 1 male and 2 females, are described. These premature infants received care at Rush University Medical Center, Houston, Texas, or Driscoll Children's Hospital, Corpus Christi, Texas. Each neonate presented with dissimilar wounds and differing treatment goals. For a premature infant with left foot ischemia, ALH dressings allowed for removal of nonviable tissue and facilitated the granulation of the open wounds. This removal of nonviable tissue coupled with the facilitation of granulation tissue enabled the premature infant's toe tips to be salvaged without requiring aggressive surgical intervention. For the 2 preterm infants with extravasation of intravenous solutions, ALH dressings allowed healing and increased tissue granulation without any noted toxicity to the wound bed. Further, the method of action of ALH includes an osmotic pull effect that reduced periwound erythema and edema. Although the use of ALH has been well documented in adult care, these case studies demonstrate its potential use in different wound etiologies in 3 neonatal patients.

  3. Citation Sentiment Analysis in Clinical Trial Papers

    PubMed Central

    Xu, Jun; Zhang, Yaoyun; Wu, Yonghui; Wang, Jingqi; Dong, Xiao; Xu, Hua

    2015-01-01

    In scientific writing, positive credits and negative criticisms can often be seen in the text mentioning the cited papers, providing useful information about whether a study can be reproduced or not. In this study, we focus on citation sentiment analysis, which aims to determine the sentiment polarity that the citation context carries towards the cited paper. A citation sentiment corpus was first annotated on clinical trial papers. The effectiveness of n-gram features, sentiment lexicon features, and problem-specific structure features for citation sentiment analysis was then examined using the annotated corpus. The combined features from the word n-grams, the sentiment lexicons and the structure information achieved the highest Micro-F score of 0.860 and Macro-F score of 0.719, indicating that it is feasible to use machine learning methods for citation sentiment analysis in biomedical publications. A comprehensive comparison between citation sentiment analysis of clinical trial papers and that of other general domains was conducted, which additionally highlights the unique challenges within this domain. PMID:26958274
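
    The gap between the Micro-F and Macro-F scores reported in this record is typical of imbalanced label sets. The sketch below illustrates how the two averages differ; the label names and toy predictions are hypothetical and are not drawn from the paper's corpus.

```python
from collections import Counter

def micro_macro_f1(gold, pred):
    """Return (micro-F1, macro-F1) for single-label multiclass predictions."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[g] += 1  # gold label gains a false negative

    def f1(t, false_pos, false_neg):
        denom = 2 * t + false_pos + false_neg
        return 2 * t / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro

# Hypothetical imbalanced data: most citation contexts are neutral.
gold = ["neutral"] * 8 + ["positive", "negative"]
pred = ["neutral"] * 10
micro, macro = micro_macro_f1(gold, pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

    Because the macro average weights every class equally, missing the rare positive and negative classes pulls it well below the micro average.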

  4. Carboplatin and Paclitaxel With or Without Cisplatin and Radiation Therapy in Treating Patients With Stage I, Stage II, Stage III, or Stage IVA Endometrial Cancer

    ClinicalTrials.gov

    2018-01-09

    Endometrial Clear Cell Adenocarcinoma; Endometrial Serous Adenocarcinoma; Stage IA Uterine Corpus Cancer; Stage IB Uterine Corpus Cancer; Stage II Uterine Corpus Cancer; Stage IIIA Uterine Corpus Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer

  5. Medroxyprogesterone in Treating Patients With Endometrioid Adenocarcinoma of the Uterine Corpus

    ClinicalTrials.gov

    2016-03-17

    Endometrial Adenocarcinoma; Endometrial Adenosquamous Carcinoma; Endometrial Endometrioid Adenocarcinoma, Variant With Squamous Differentiation; Recurrent Uterine Corpus Carcinoma; Stage I Uterine Corpus Cancer; Stage II Uterine Corpus Cancer; Stage III Uterine Corpus Cancer; Stage IV Uterine Corpus Cancer

  6. Primary diffuse large B cell lymphoma arising from a leiomyoma of the uterine corpus.

    PubMed

    Zhao, Lianhua; Ma, Qiang; Wang, Qiushi; Zeng, Ying; Luo, Qingya; Xiao, Hualiang

    2016-01-20

    Primary diffuse large B cell lymphoma (DLBCL) of the uterus is rare, and primary DLBCL arising from a uterine leiomyoma (collision tumor) has not been reported in the literature. We describe the clinical, histological, immunohistochemical, and molecular features of primary DLBCL arising from a leiomyoma in the uterine corpus. A 73-year-old female patient had a uterine mass for 23 years. An ultrasound scan revealed marked enlargement of the uterus, measuring 18.2 × 13 × 16.3 cm, with a 17.6 × 10.9 × 11.6 cm hypoechoic mass in the uterine corpus. The tumors consisted of medium- to large-sized cells exhibiting a diffuse pattern of growth with a well-circumscribed leiomyoma. The neoplastic cells strongly expressed CD79α, CD20 and PAX5. Molecular analyses indicated clonal B-cell receptor gene rearrangement. To the best of our knowledge, no previous cases of primary DLBCL arising from a leiomyoma have been reported. It is necessary to differentiate a diagnosis of primary DLBCL arising from a leiomyoma from that of leiomyoma with florid reactive lymphocytic infiltration (lymphoma-like lesion). Careful analysis of clinical, histological, immunophenotypic, and genetic features is required to establish the correct diagnosis.

  7. Doxorubicin Hydrochloride, Cisplatin, and Paclitaxel or Carboplatin and Paclitaxel in Treating Patients With Stage III-IV or Recurrent Endometrial Cancer

    ClinicalTrials.gov

    2018-03-23

    Recurrent Uterine Corpus Carcinoma; Stage IIIA Uterine Corpus Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  8. The impact of human immune deficiency virus and hepatitis C coinfection on white matter microstructural integrity.

    PubMed

    Heaps-Woodruff, J M; Wright, P W; Ances, B M; Clifford, D; Paul, R H

    2016-06-01

    The purpose of the present study is to examine the integrity of white matter microstructure among individuals coinfected with HIV and HCV using diffusion tensor imaging (DTI). Twenty-five HIV+ patients, 21 HIV+/HCV+ patients, and 25 HIV- controls were included in this study. All HIV+ individuals were stable on combination antiretroviral therapy (cART; ≥3 months). All participants completed MRI and neuropsychological measures. Clinical variables including liver function, HIV-viral load, and CD4 count were collected from the patient groups. DTI metrics including mean diffusivity (MD), axial diffusivity (AD), radial diffusivity (RD), and fractional anisotropy (FA) from five subregions of the corpus callosum were compared across groups. The HIV+/HCV+ group and HIV+ group were similar in terms of HIV clinical variables. None of the participants met criteria for cirrhosis or fibrosis. Within the anterior corpus callosum, significant differences were observed between both HIV+ groups compared to HIV- controls on DTI measures. HIV+ and HIV+/HCV+ groups had significantly lower FA values and higher MD and RD values compared to HIV- controls; however, no differences were present between the HIV+ and HIV+/HCV+ groups. Duration of HIV infection was significantly related to DTI metrics in total corpus callosum FA only, but not other markers of HIV disease burden or neurocognitive function. Both HIV+ and HIV+/HCV+ individuals had significant alterations in white matter integrity within the corpus callosum; however, there was no evidence for an additive effect of HCV coinfection. The association between DTI metrics and duration of HIV infection suggests that HIV may continue to negatively impact white matter integrity even in well-controlled disease.

  9. Short Course Vaginal Cuff Brachytherapy in Treating Patients With Stage I-II Endometrial Cancer

    ClinicalTrials.gov

    2018-04-17

    Endometrial Clear Cell Adenocarcinoma; Endometrial Endometrioid Adenocarcinoma; Endometrial Serous Adenocarcinoma; Stage I Uterine Corpus Cancer; Stage IA Uterine Corpus Cancer; Stage IB Uterine Corpus Cancer; Stage II Uterine Corpus Cancer; Uterine Corpus Carcinosarcoma; Uterine Corpus Sarcoma

  10. Paclitaxel and Carboplatin With or Without Metformin Hydrochloride in Treating Patients With Stage III, IV, or Recurrent Endometrial Cancer

    ClinicalTrials.gov

    2018-03-07

    Endometrial Adenocarcinoma; Endometrial Clear Cell Adenocarcinoma; Endometrial Serous Adenocarcinoma; Endometrial Undifferentiated Carcinoma; Recurrent Uterine Corpus Carcinoma; Stage III Uterine Corpus Cancer AJCC v7; Stage IIIA Uterine Corpus Cancer AJCC v7; Stage IIIB Uterine Corpus Cancer AJCC v7; Stage IIIC Uterine Corpus Cancer AJCC v7; Stage IV Uterine Corpus Cancer AJCC v7; Stage IVA Uterine Corpus Cancer AJCC v7; Stage IVB Uterine Corpus Cancer AJCC v7

  11. Redundancy in electronic health record corpora: analysis, impact on text mining performance and mitigation strategies.

    PubMed

    Cohen, Raphael; Elhadad, Michael; Elhadad, Noémie

    2013-01-16

    The increasing availability of Electronic Health Record (EHR) data and specifically free-text patient notes presents opportunities for phenotype extraction. Text-mining methods in particular can help disease modeling by mapping named-entities mentions to terminologies and clustering semantically related terms. EHR corpora, however, exhibit specific statistical and linguistic characteristics when compared with corpora in the biomedical literature domain. We focus on copy-and-paste redundancy: clinicians typically copy and paste information from previous notes when documenting a current patient encounter. Thus, within a longitudinal patient record, one expects to observe heavy redundancy. In this paper, we ask three research questions: (i) How can redundancy be quantified in large-scale text corpora? (ii) Conventional wisdom is that larger corpora yield better results in text mining. But how does the observed EHR redundancy affect text mining? Does such redundancy introduce a bias that distorts learned models? Or does the redundancy introduce benefits by highlighting stable and important subsets of the corpus? (iii) How can one mitigate the impact of redundancy on text mining? We analyze a large-scale EHR corpus and quantify redundancy both in terms of word and semantic concept repetition. We observe redundancy levels of about 30% and non-standard distribution of both words and concepts. We measure the impact of redundancy on two standard text-mining applications: collocation identification and topic modeling. We compare the results of these methods on synthetic data with controlled levels of redundancy and observe significant performance variation. Finally, we compare two mitigation strategies to avoid redundancy-induced bias: (i) a baseline strategy, keeping only the last note for each patient in the corpus; (ii) removing redundant notes with an efficient fingerprinting-based algorithm. 
    For text mining, preprocessing the EHR corpus with fingerprinting yields significantly better results. Before applying text-mining techniques, one must pay careful attention to the structure of the analyzed corpora. While the importance of data cleaning has been known for low-level text characteristics (e.g., encoding and spelling), high-level and difficult-to-quantify corpus characteristics, such as naturally occurring redundancy, can also hurt text mining. Fingerprinting enables text-mining techniques to leverage available data in the EHR corpus, while avoiding the bias introduced by redundancy.
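
    To make the idea of fingerprint-based redundancy removal concrete, here is a minimal sketch. It is not the algorithm from the paper: the word n-gram shingling, the SHA-1 hashing, the 0.5 overlap threshold, and all function names are assumptions chosen only for illustration.

```python
import hashlib

def fingerprints(text, n=5):
    """Set of hashes of all word n-grams (shingles) in a note."""
    words = text.lower().split()
    # Notes shorter than n words yield a single shingle covering the whole note.
    shingles = (" ".join(words[i:i + n]) for i in range(max(len(words) - n + 1, 1)))
    return {hashlib.sha1(s.encode("utf-8")).hexdigest() for s in shingles}

def drop_redundant(notes, n=5, max_overlap=0.5):
    """Keep a note only if at most max_overlap of its fingerprints were already seen."""
    seen, kept = set(), []
    for note in notes:
        fp = fingerprints(note, n)
        overlap = len(fp & seen) / len(fp)
        if overlap <= max_overlap:
            kept.append(note)
            seen |= fp
    return kept

# Hypothetical notes: the second is a copy-and-paste of the first with one added word.
notes = [
    "patient admitted with chest pain and shortness of breath on exertion",
    "patient admitted with chest pain and shortness of breath on exertion today",
    "follow up visit for diabetes management with stable glucose readings reported",
]
deduplicated = drop_redundant(notes)
print(len(deduplicated), "of", len(notes), "notes kept")
```

    In this toy example the copied note shares seven of its eight shingles with the first note, so it is dropped, while the unrelated follow-up note is kept.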

  12. Paclitaxel and Intraperitoneal Carboplatin Followed by Radiation Therapy in Treating Patients With Stage IIIC-IV Uterine Cancer

    ClinicalTrials.gov

    2015-02-10

    Endometrial Serous Adenocarcinoma; Stage IIIA Uterine Corpus Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC1 Uterine Corpus Cancer; Stage IIIC2 Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  13. Cueing musical emotions: An empirical analysis of 24-piece sets by Bach and Chopin documents parallels with emotional speech

    PubMed Central

    Poon, Matthew; Schutz, Michael

    2015-01-01

    Acoustic cues such as pitch height and timing are effective at communicating emotion in both music and speech. Numerous experiments altering musical passages have shown that higher and faster melodies generally sound “happier” than lower and slower melodies, findings consistent with corpus analyses of emotional speech. However, equivalent corpus analyses of complex time-varying cues in music are less common, due in part to the challenges of assembling an appropriate corpus. Here, we describe a novel, score-based exploration of the use of pitch height and timing in a set of “balanced” major and minor key compositions. Our analysis included all 24 Preludes and 24 Fugues from Bach’s Well-Tempered Clavier (book 1), as well as all 24 of Chopin’s Preludes for piano. These three sets are balanced with respect to both modality (major/minor) and key chroma (“A,” “B,” “C,” etc.). Consistent with predictions derived from speech, we found major-key (nominally “happy”) pieces to be two semitones higher in pitch height and 29% faster than minor-key (nominally “sad”) pieces. This demonstrates that our balanced corpus of major and minor key pieces uses low-level acoustic cues for emotion in a manner consistent with speech. A series of post hoc analyses illustrate interesting trade-offs, with sets featuring greater emphasis on timing distinctions between modalities exhibiting the least pitch distinction, and vice-versa. We discuss these findings in the broader context of speech-music research, as well as recent scholarship exploring the historical evolution of cue use in Western music. PMID:26578990

  14. Experimental acute thrombotic stroke in baboons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Del Zoppo, G.J.; Copeland, B.R.; Harker, L.A.

    1986-11-01

    To study the effects of antithrombotic therapy in experimental stroke, we have characterized a baboon model of acute cerebrovascular thrombosis. In this model an inflatable silastic balloon cuff has been implanted by transorbital approach around the right middle cerebral artery (MCA), proximal to the take-off of the lenticulostriate arteries (LSA). Inflation of the balloon for 3 hours in six animals produced a stereotypic sustained stroke syndrome characterized by contralateral hemiparesis. An infarction volume of 3.2 +/- 1.5 cm3 in the ipsilateral corpus striatum was documented by computerized tomographic (CT) scanning at 10 days following stroke induction and 3.9 +/- 1.9 cm3 (n = 4) at 14 days by morphometric neuropathologic determinations of brain specimens fixed in situ by pressure-perfusion with 10% buffered formalin. Immediate pressure-perfusion fixation following deflation of the balloon was performed in 16 additional animals given Evans blue dye intravenously prior to the 3 hour MCA balloon occlusion. Light microscopy and transmission electron microscopy consistently confirmed the presence of thrombotic material occluding microcirculatory branches of the right LSA in the region of Evans blue stain, but not those of the contralateral corpus striatum. When autologous 111In-platelets were infused intravenously in four animals from the above group prior to the transient 3 hour occlusion of the right MCA, gamma scintillation camera imaging of each perfused-fixed whole brain demonstrated the presence of a single residual focus of 111In-platelet activity involving only the Evans blue-stained right corpus striatum. Focal right hemispheric activity was equivalent to 0.55 +/- 0.49 ml of whole blood, and the occlusion score derived from histologic examination of the microcirculation of the Evans blue-stained corpus striatum averaged 34.8 +/- 2.8.

  15. Phenotype Instance Verification and Evaluation Tool (PIVET): A Scaled Phenotype Evidence Generation Framework Using Web-Based Medical Literature.

    PubMed

    Henderson, Jette; Ke, Junyuan; Ho, Joyce C; Ghosh, Joydeep; Wallace, Byron C

    2018-05-04

    Researchers are developing methods to automatically extract clinically relevant and useful patient characteristics from raw healthcare datasets. These characteristics, often capturing essential properties of patients with common medical conditions, are called computational phenotypes. Being generated by automated or semiautomated, data-driven methods, such potential phenotypes need to be validated as clinically meaningful (or not) before they are acceptable for use in decision making. The objective of this study was to present Phenotype Instance Verification and Evaluation Tool (PIVET), a framework that uses co-occurrence analysis on an online corpus of publicly available medical journal articles to build clinical relevance evidence sets for user-supplied phenotypes. PIVET adopts a conceptual framework similar to the pioneering prototype tool PheKnow-Cloud that was developed for the phenotype validation task. PIVET completely refactors each part of the PheKnow-Cloud pipeline to deliver vast improvements in speed without sacrificing the quality of the insights PheKnow-Cloud achieved. PIVET leverages indexing in NoSQL databases to efficiently generate evidence sets. Specifically, PIVET uses a succinct representation of the phenotypes that corresponds to the index on the corpus database and an optimized co-occurrence algorithm inspired by the Aho-Corasick algorithm. We compare PIVET's phenotype representation with PheKnow-Cloud's by using PheKnow-Cloud's experimental setup. In PIVET's framework, we also introduce a statistical model trained on domain expert-verified phenotypes to automatically classify phenotypes as clinically relevant or not. Additionally, we show how the classification model can be used to examine user-supplied phenotypes in an online, rather than batch, manner. PIVET maintains the discriminative power of PheKnow-Cloud in terms of identifying clinically relevant phenotypes for the same corpus with which PheKnow-Cloud was originally developed, but PIVET's analysis is an order of magnitude faster than that of PheKnow-Cloud. Not only is PIVET much faster, it can be scaled to a larger corpus and still retain speed. We evaluated multiple classification models on top of the PIVET framework and found ridge regression to perform best, realizing an average F1 score of 0.91 when predicting clinically relevant phenotypes. Our study shows that PIVET improves on the most notable existing computational tool for phenotype validation in terms of speed and automation and is comparable in terms of accuracy. ©Jette Henderson, Junyuan Ke, Joyce C Ho, Joydeep Ghosh, Byron C Wallace. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 04.05.2018.

  16. Phenotype Instance Verification and Evaluation Tool (PIVET): A Scaled Phenotype Evidence Generation Framework Using Web-Based Medical Literature

    PubMed Central

    Ke, Junyuan; Ho, Joyce C; Ghosh, Joydeep; Wallace, Byron C

    2018-01-01

    Background: Researchers are developing methods to automatically extract clinically relevant and useful patient characteristics from raw healthcare datasets. These characteristics, often capturing essential properties of patients with common medical conditions, are called computational phenotypes. Being generated by automated or semiautomated, data-driven methods, such potential phenotypes need to be validated as clinically meaningful (or not) before they are acceptable for use in decision making. Objective: The objective of this study was to present Phenotype Instance Verification and Evaluation Tool (PIVET), a framework that uses co-occurrence analysis on an online corpus of publicly available medical journal articles to build clinical relevance evidence sets for user-supplied phenotypes. PIVET adopts a conceptual framework similar to the pioneering prototype tool PheKnow-Cloud that was developed for the phenotype validation task. PIVET completely refactors each part of the PheKnow-Cloud pipeline to deliver vast improvements in speed without sacrificing the quality of the insights PheKnow-Cloud achieved. Methods: PIVET leverages indexing in NoSQL databases to efficiently generate evidence sets. Specifically, PIVET uses a succinct representation of the phenotypes that corresponds to the index on the corpus database and an optimized co-occurrence algorithm inspired by the Aho-Corasick algorithm. We compare PIVET’s phenotype representation with PheKnow-Cloud’s by using PheKnow-Cloud’s experimental setup. In PIVET’s framework, we also introduce a statistical model trained on domain expert–verified phenotypes to automatically classify phenotypes as clinically relevant or not. Additionally, we show how the classification model can be used to examine user-supplied phenotypes in an online, rather than batch, manner. Results: PIVET maintains the discriminative power of PheKnow-Cloud in terms of identifying clinically relevant phenotypes for the same corpus with which PheKnow-Cloud was originally developed, but PIVET’s analysis is an order of magnitude faster than that of PheKnow-Cloud. Not only is PIVET much faster, it can be scaled to a larger corpus and still retain speed. We evaluated multiple classification models on top of the PIVET framework and found ridge regression to perform best, realizing an average F1 score of 0.91 when predicting clinically relevant phenotypes. Conclusions: Our study shows that PIVET improves on the most notable existing computational tool for phenotype validation in terms of speed and automation and is comparable in terms of accuracy. PMID:29728351

  17. A Neuropsychological Profile for Agenesis of the Corpus Callosum? Cognitive, Academic, Executive, Social, and Behavioral Functioning in School-Age Children.

    PubMed

    Siffredi, Vanessa; Anderson, Vicki; McIlroy, Alissandra; Wood, Amanda G; Leventer, Richard J; Spencer-Smith, Megan M

    2018-05-01

    Agenesis of the corpus callosum (AgCC), characterized by developmental absence of the corpus callosum, is one of the most common congenital brain malformations. To date, there are limited data on the neuropsychological consequences of AgCC and factors that modulate different outcomes, especially in children. This study aimed to describe general intellectual, academic, executive, social and behavioral functioning in a cohort of school-aged children presenting for clinical services to a hospital and diagnosed with AgCC. The influences of age, social risk and neurological factors were examined. Twenty-eight school-aged children (8 to 17 years) diagnosed with AgCC completed tests of general intelligence (IQ) and academic functioning. Executive, social and behavioral functioning in daily life, and social risk, were estimated from parent- and teacher-rated questionnaires. MRI findings reviewed by a pediatric neurologist confirmed diagnosis and identified brain characteristics. Clinical details including the presence of epilepsy and diagnosed genetic condition were obtained from medical records. In our cohort, ~50% of children experienced general intellectual, academic, executive, social and/or behavioral difficulties and ~20% were functioning at a level comparable to typically developing children. Social risk was important for understanding variability in neuropsychological outcomes. Brain anomalies and complete AgCC were associated with lower mathematics performance and poorer executive functioning. This is the first comprehensive report of general intellectual, academic, executive, social and behavioral consequences of AgCC in school-aged children. The findings have important clinical implications, suggesting that support to families and targeted intervention could promote positive neuropsychological functioning in children with AgCC who come to clinical attention. (JINS, 2018, 24, 445-455).

  18. Drug Prevention, Rehabilitation, Interdiction, and Law Enforcement (Corpus Christi, TX). Hearing before the Select Committee on Narcotics Abuse and Control. House of Representatives, Ninety-Eighth Congress, First Session (December 12 and 13, 1983).

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. House Select Committee on Narcotics Abuse and Control.

    This document provides transcripts of two consecutive days of Congressional hearings on narcotics abuse and control. Opening statements from Representatives Benjamin A. Gilman, Kent Hance, and Solomon P. Ortiz are presented. Testimony and prepared statements of 61 counselors and administrators in the field of substance abuse, public officials, law…

  19. Automatic Title Generation for Spoken Broadcast News

    DTIC Science & Technology

    2001-01-01

    degrades much less with speech-recognized transcripts. Meanwhile, even though KNN performance not as well as TF.IDF and NBL in terms of F1 metric, it...test corpus of 1006 broadcast news documents, comparing the results over manual transcription to the results over automatically recognized speech. We...use both F1 and the average number of correct title words in the correct order as metric. Overall, the results show that title generation for speech

  20. Dasatinib, Paclitaxel, and Carboplatin in Treating Patients With Stage III-IV or Recurrent Endometrial Cancer

    ClinicalTrials.gov

    2018-04-04

    Endometrial Adenocarcinoma; Endometrial Clear Cell Adenocarcinoma; Endometrial Mucinous Adenocarcinoma; Endometrial Serous Adenocarcinoma; Endometrial Squamous Cell Carcinoma; Endometrial Transitional Cell Carcinoma; Endometrial Undifferentiated Carcinoma; Endometrioid Adenocarcinoma; Recurrent Uterine Corpus Carcinoma; Stage III Uterine Corpus Cancer AJCC v7; Stage IIIA Uterine Corpus Cancer AJCC v7; Stage IIIB Uterine Corpus Cancer AJCC v7; Stage IIIC Uterine Corpus Cancer AJCC v7; Stage IV Uterine Corpus Cancer AJCC v7; Stage IVA Uterine Corpus Cancer AJCC v7; Stage IVB Uterine Corpus Cancer AJCC v7

  1. Intensity-Modulated Radiation Therapy, Cisplatin, and Bevacizumab Followed by Carboplatin and Paclitaxel in Treating Patients Who Have Undergone Surgery for Endometrial Cancer

    ClinicalTrials.gov

    2018-02-15

    Endometrial Adenocarcinoma; Endometrial Adenosquamous Carcinoma; Endometrial Clear Cell Adenocarcinoma; Endometrial Serous Adenocarcinoma; Stage IA Uterine Corpus Cancer AJCC v7; Stage IB Uterine Corpus Cancer AJCC v7; Stage II Uterine Corpus Cancer AJCC v7; Stage IIIA Uterine Corpus Cancer AJCC v7; Stage IIIB Uterine Corpus Cancer AJCC v7; Stage IIIC Uterine Corpus Cancer AJCC v7; Stage IVA Uterine Corpus Cancer AJCC v7; Stage IVB Uterine Corpus Cancer AJCC v7

  2. A case of the corpus callosum and alien hand syndrome from a discrete paracallosal lesion.

    PubMed

    Faber, Raymond; Azad, Alvi; Reinsvold, Richard

    2010-08-01

    Here we present a patient with an isolated paracallosal brain lesion who exhibited behavioral changes associated with the corpus callosum syndrome (CCS) including features of the alien hand syndrome (AHS). The CCS is also known as the split-brain syndrome, the syndrome of hemisphere disconnection, the syndrome of brain bisection and the syndrome of the cerebral commissures. Because most reported cases of CCS were caused by tumors which extended beyond the corpus callosum (CC) and did not always induce a complete disconnection, there was much controversy about the role of the CC and the existence of a specific CCS. Aside from surgically based cases, the full complement of the CCS is infrequently clinically encountered. The patient described has a classic CCS from natural causes. This case report is unique in exhibiting a complete CCS with AHS secondary to an ischemic event affecting the left pericallosal region. To our knowledge this is the first case report of such a combination.

  3. Rubinstein–Taybi syndrome with agenesis of corpus callosum

    PubMed Central

    Mishra, Shubhankar; Agarwalla, Sunil Kumar; Potpalle, Dnyaneshwar Ramesh; Dash, Nishant Nilotpal

    2015-01-01

    Rubinstein–Taybi syndrome (RSTS) is a rare genetic disorder with characteristic morphological anomalies. Our patient was a 4.5-year-old girl who presented with features such as broad thumbs, downward-slanting palpebral fissures and mental retardation. Systemic abnormalities such as repeated infection and seizures with developmental delay were also present. She had head-banging behavior, abnormal slurred speech, and incoordination while transferring things from one hand to the other. The constellation of clinical features and the magnetic resonance imaging report helped to clinch the diagnosis of “RSTS with corpus callosal agenesis”, which to the best of our knowledge has never before been reported from India. PMID:26167229

  4. Agenesis of the corpus callosum: symptoms consistent with developmental disability in two siblings.

    PubMed

    Cavalari, Rachel N S; Donovick, Peter J

    2015-02-01

    Agenesis of the corpus callosum (AgCC) is a congenital disorder that disrupts the development of neurological structures connecting the right and left hemispheres of the brain. In addition to neurological symptoms, many individuals with AgCC demonstrate marked deficits in social, communication, and adaptive skills. This paper presents two case studies of congenital AgCC in siblings with socioemotional and behavioral symptoms consistent with developmental disability, but with notably different symptom presentations and clinical needs. Conclusions from these cases suggest that unique symptom profiles of individuals with AgCC warrant careful consideration for referral to appropriate academic and habilitative services.

  5. The VPAC2 agonist peptide histidine isoleucine (PHI) up-regulates glutamate transport in the corpus callosum of a rat model of amyotrophic lateral sclerosis (hSOD1G93A) by inhibiting caspase-3 mediated inactivation of GLT-1a.

    PubMed

    Goursaud, Stéphanie; Focant, Marylène C; Berger, Julie V; Nizet, Yannick; Maloteaux, Jean-Marie; Hermans, Emmanuel

    2011-10-01

    Degeneration of corpus callosum appears in patients with amyotrophic lateral sclerosis (ALS) before clinical signs of upper motor neuron death. Considering the ALS-associated impairment of astrocytic glutamate uptake, we have characterized the expression and activity of the glutamate transporter isoforms GLT-1a and GLT-1b in the corpus callosum of transgenic rats expressing a mutated form of the human superoxide dismutase 1 (hSOD1(G93A)). We have also studied the effect of peptide histidine isoleucine (PHI), a vasoactive intestinal peptide (VIP)/pituitary adenylate cyclase-activating polypeptide (PACAP) receptor 2 (VPAC(2)) agonist on glutamate transporters both in vivo and in callosal astrocytes. Before the onset of motor symptoms, the expression of both transporter isoforms was correlated with a constitutive activity of caspase-3. This enzyme participates in the down-regulation of GLT-1 in ALS, and here we demonstrated its involvement in the selective degradation of GLT-1a in the white matter. A single stereotactic injection of PHI into the corpus callosum of symptomatic rats decreased caspase-3 activity and promoted GLT-1a expression and uptake activity. Together, with evidence for a reduced expression of prepro-VIP/PHI mRNA in the corpus callosum of transgenic animals, these data shed light on the modulatory role of the VIP/PHI system on the glutamatergic transmission in ALS.

  6. Paclitaxel, Carboplatin, and Bevacizumab or Paclitaxel, Carboplatin, and Temsirolimus or Ixabepilone, Carboplatin, and Bevacizumab in Treating Patients With Stage III, Stage IV, or Recurrent Endometrial Cancer

    ClinicalTrials.gov

    2018-01-29

    Endometrial Adenocarcinoma; Endometrial Adenosquamous Carcinoma; Endometrial Clear Cell Adenocarcinoma; Endometrial Serous Adenocarcinoma; Recurrent Uterine Corpus Carcinoma; Stage IIIA Uterine Corpus Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  7. Male non-insulin users with type 2 diabetes mellitus are predisposed to gastric corpus-predominant inflammation after H. pylori infection.

    PubMed

    Yang, Yao-Jong; Wu, Chung-Tai; Ou, Horng-Yih; Lin, Chin-Han; Cheng, Hsiu-Chi; Chang, Wei-Lun; Chen, Wei-Ying; Yang, Hsiao-Bai; Lu, Cheng-Chan; Sheu, Bor-Shyang

    2017-10-30

    Both H. pylori infection and diabetes increase the risk of gastric cancer. This study investigated whether patients with type 2 diabetes mellitus (T2DM) and H. pylori infection had more severe corpus gastric inflammation and higher prevalence of precancerous lesions than non-diabetic controls. A total of 797 patients with type 2 diabetes mellitus were screened for H. pylori, of whom 264 had H. pylori infection. Of these patients, 129 received esophagogastroduodenoscopy to obtain topographic gastric specimens for gastric histology according to the modified Updated Sydney System, corpus-predominant gastritis index (CGI), Operative Link on Gastritis Assessment, and Operative Link on Gastric Intestinal Metaplasia Assessment. Non-diabetic dyspeptic patients who had H. pylori infection confirmed by esophagogastroduodenoscopy were enrolled as controls. The male as well as total T2DM patients had higher acute/chronic inflammatory and lymphoid follicle scores in the corpus than non-diabetic controls (p < 0.05). In contrast, the female T2DM patients had higher chronic inflammatory scores in the antrum than the controls (p < 0.05). In T2DM patients, the males had significantly higher rates of CGI than the females (p < 0.05). Multivariate logistic regression analysis showed that male patients (odds ratio: 2.28, 95% confidence interval: 1.11-4.69, p = 0.025) and non-insulin users (odds ratio: 0.33, 95% confidence interval: 0.15-0.74, p = 0.007) were independent factors for the presence of CGI in the H. pylori-infected patients with type 2 diabetes mellitus. Patients with type 2 diabetes mellitus and H. pylori infection had more severe corpus gastric inflammation than non-diabetic controls. Moreover, among T2DM patients, males and non-insulin users were predisposed to corpus-predominant gastritis after H. pylori infection. ClinicalTrial: NCT02466919, retrospectively registered May 17, 2015.

  8. Subcallosal artery stroke: infarction of the fornix and the genu of the corpus callosum. The importance of the anterior communicating artery complex. Case series and review of the literature.

    PubMed

    Meila, Dan; Saliou, Guillaume; Krings, Timo

    2015-01-01

    Despite the variable anatomy of the anterior communicating artery (AcoA) complex, three main perforating branches can typically be identified, the largest of which is the subcallosal artery (ScA). We present a case series of infarction in the vascular territory of the ScA to highlight the anatomy, the clinical symptomatology, and the presumed pathophysiology as it pertains to endovascular and surgical management of vascular pathology in this region. In this retrospective multicenter case series study of patients who were diagnosed with symptomatic ScA stroke, we analyzed all available clinical records, MRI, and angiographic details. Additionally, a review of the literature is provided. We identified five different cases of ScA stroke, leading to a subsequent infarction of the fornix and the genu of the corpus callosum. The presumed pathophysiology in non-iatrogenic cases is microangiopathy, rather than embolic events; iatrogenic ScA occlusion can present after both surgical and endovascular treatment of AcoA aneurysms that may occur with or without occlusion of the AcoA. Stroke in the vascular territory of the ScA leads to a characteristic imaging and clinical pattern. Ischemia involves the anterior columns of the fornix and the genu of the corpus callosum, and patients present with a Korsakoff's syndrome including disturbances of short-term memory and cognitive changes. We conclude that despite its small size, the ScA is an important artery to watch out for during surgical or endovascular treatment of AcoA aneurysms.

  9. 76 FR 18395 - Safety Zone; Naval Air Station Corpus Christi Air Show, Oso Bay, Corpus Christi, TX

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-04

    ...-AA00 Safety Zone; Naval Air Station Corpus Christi Air Show, Oso Bay, Corpus Christi, TX AGENCY: Coast... zone on the navigable waters of Oso Bay in Corpus Christi, Texas in support of the 2011 Naval Air Station Corpus Christi Air Show. This temporary safety zone is necessary to provide for the safety of...

  10. Swedish Religious Education at the End of the 1960s: Classroom Observations, Early Video Ethnography and the National Curriculum of 1962

    ERIC Educational Resources Information Center

    Flensner, K. Kittelmann; Larsson, G.

    2014-01-01

    The aim of this article is to present a unique corpus of film-recorded classroom observations of sixth-grade classes (age 12-13) in the Swedish cities of Gothenburg, Partille and Trollhättan in the late 1960s. The material documents how RE could be taught in Swedish schools in line with the curriculum of Lgr 62 which internationally was an early…

  11. Contribution to terminology internationalization by word alignment in parallel corpora.

    PubMed

    Deléger, Louise; Merkel, Magnus; Zweigenbaum, Pierre

    2006-01-01

    Creating a complete translation of a large vocabulary is a time-consuming task, which requires skilled and knowledgeable medical translators. Our goal is to examine to which extent such a task can be alleviated by a specific natural language processing technique, word alignment in parallel corpora. We experiment with translation from English to French. Build a large corpus of parallel, English-French documents, and automatically align it at the document, sentence and word levels using state-of-the-art alignment methods and tools. Then project English terms from existing controlled vocabularies to the aligned word pairs, and examine the number and quality of the putative French translations obtained thereby. We considered three American vocabularies present in the UMLS with three different translation statuses: the MeSH, SNOMED CT, and the MedlinePlus Health Topics. We obtained several thousand new translations of our input terms, this number being closely linked to the number of terms in the input vocabularies. Our study shows that alignment methods can extract a number of new term translations from large bodies of text with a moderate human reviewing effort, and thus contribute to help a human translator obtain better translation coverage of an input vocabulary. Short-term perspectives include their application to a corpus 20 times larger than that used here, together with more focused methods for term extraction.

  12. Contribution to Terminology Internationalization by Word Alignment in Parallel Corpora

    PubMed Central

    Deléger, Louise; Merkel, Magnus; Zweigenbaum, Pierre

    2006-01-01

    Background and objectives Creating a complete translation of a large vocabulary is a time-consuming task, which requires skilled and knowledgeable medical translators. Our goal is to examine to which extent such a task can be alleviated by a specific natural language processing technique, word alignment in parallel corpora. We experiment with translation from English to French. Methods Build a large corpus of parallel, English-French documents, and automatically align it at the document, sentence and word levels using state-of-the-art alignment methods and tools. Then project English terms from existing controlled vocabularies to the aligned word pairs, and examine the number and quality of the putative French translations obtained thereby. We considered three American vocabularies present in the UMLS with three different translation statuses: the MeSH, SNOMED CT, and the MedlinePlus Health Topics. Results We obtained several thousand new translations of our input terms, this number being closely linked to the number of terms in the input vocabularies. Conclusion Our study shows that alignment methods can extract a number of new term translations from large bodies of text with a moderate human reviewing effort, and thus contribute to help a human translator obtain better translation coverage of an input vocabulary. Short-term perspectives include their application to a corpus 20 times larger than that used here, together with more focused methods for term extraction. PMID:17238328

  13. OrganismTagger: detection, normalization and grounding of organism entities in biomedical documents.

    PubMed

    Naderi, Nona; Kappler, Thomas; Baker, Christopher J O; Witte, René

    2011-10-01

    Semantic tagging of organism mentions in full-text articles is an important part of literature mining and semantic enrichment solutions. Tagged organism mentions also play a pivotal role in disambiguating other entities in a text, such as proteins. A high-precision organism tagging system must be able to detect the numerous forms of organism mentions, including common names as well as the traditional taxonomic groups: genus, species and strains. In addition, such a system must resolve abbreviations and acronyms, assign the scientific name and if possible link the detected mention to the NCBI Taxonomy database for further semantic queries and literature navigation. We present the OrganismTagger, a hybrid rule-based/machine learning system to extract organism mentions from the literature. It includes tools for automatically generating lexical and ontological resources from a copy of the NCBI Taxonomy database, thereby facilitating system updates by end users. Its novel ontology-based resources can also be reused in other semantic mining and linked data tasks. Each detected organism mention is normalized to a canonical name through the resolution of acronyms and abbreviations and subsequently grounded with an NCBI Taxonomy database ID. In particular, our system combines a novel machine-learning approach with rule-based and lexical methods for detecting strain mentions in documents. On our manually annotated OT corpus, the OrganismTagger achieves a precision of 95%, a recall of 94% and a grounding accuracy of 97.5%. On the manually annotated corpus of Linnaeus-100, the results show a precision of 99%, recall of 97% and grounding accuracy of 97.4%. The OrganismTagger, including supporting tools, resources, training data and manual annotations, as well as end user and developer documentation, is freely available under an open-source license at http://www.semanticsoftware.info/organism-tagger. witte@semanticsoftware.info.

  14. Brief Report: Acrocallosal Syndrome and Autism

    ERIC Educational Resources Information Center

    Steiner, Carlos Eduardo; Guerreiro, Marilisa Mantovani; Marques-de-Faria, Antonia Paula

    2004-01-01

    The authors describe a boy presenting with acrocallosal syndrome and autism. Clinical features included craniofacial dysmorphisms, polydactyly, and mental retardation, besides behavioral symptoms compatible with autism. Neuroimaging revealed hypoplasia of the corpus callosum and cerebellar abnormalities. The role of this entity and other…

  15. Detecting functional magnetic resonance imaging activation in white matter: Interhemispheric transfer across the corpus callosum

    PubMed Central

    Mazerolle, Erin L; D'Arcy, Ryan CN; Beyea, Steven D

    2008-01-01

    Background It is generally believed that activation in functional magnetic resonance imaging (fMRI) is restricted to gray matter. Despite this, a number of studies have reported white matter activation, particularly when the corpus callosum is targeted using interhemispheric transfer tasks. These findings suggest that fMRI signals may not be neatly confined to gray matter tissue. In the current experiment, 4 T fMRI was employed to evaluate whether it is possible to detect white matter activation. We used an interhemispheric transfer task modelled after neurological studies of callosal disconnection. It was hypothesized that white matter activation could be detected using fMRI. Results Both group and individual data were considered. At liberal statistical thresholds (p < 0.005, uncorrected), group level activation was detected in the isthmus of the corpus callosum. This region connects the superior parietal cortices, which have been implicated previously in interhemispheric transfer. At the individual level, five of the 24 subjects (21%) had activation clusters that were located primarily within the corpus callosum. Consistent with the group results, the clusters of all five subjects were located in posterior callosal regions. The signal time courses for these clusters were comparable to those observed for task related gray matter activation. Conclusion The findings support the idea that, despite the inherent challenges, fMRI activation can be detected in the corpus callosum at the individual level. Future work is needed to determine whether the detection of this activation can be improved by utilizing higher spatial resolution, optimizing acquisition parameters, and analyzing the data with tissue specific models of the hemodynamic response. The ability to detect white matter fMRI activation expands the scope of basic and clinical brain mapping research, and provides a new approach for understanding brain connectivity. PMID:18789154

  16. Progressive Learning of Topic Modeling Parameters: A Visual Analytics Framework.

    PubMed

    El-Assady, Mennatallah; Sevastjanova, Rita; Sperrle, Fabian; Keim, Daniel; Collins, Christopher

    2018-01-01

    Topic modeling algorithms are widely used to analyze the thematic composition of text corpora but remain difficult to interpret and adjust. Addressing these limitations, we present a modular visual analytics framework, tackling the understandability and adaptability of topic models through a user-driven reinforcement learning process which does not require a deep understanding of the underlying topic modeling algorithms. Given a document corpus, our approach initializes two algorithm configurations based on a parameter space analysis that enhances document separability. We abstract the model complexity in an interactive visual workspace for exploring the automatic matching results of two models, investigating topic summaries, analyzing parameter distributions, and reviewing documents. The main contribution of our work is an iterative decision-making technique in which users provide a document-based relevance feedback that allows the framework to converge to a user-endorsed topic distribution. We also report feedback from a two-stage study which shows that our technique results in topic model quality improvements on two independent measures.

  17. Analyzing Document Retrievability in Patent Retrieval Settings

    NASA Astrophysics Data System (ADS)

    Bashir, Shariq; Rauber, Andreas

    Most information retrieval settings, such as web search, are typically precision-oriented, i.e. they focus on retrieving a small number of highly relevant documents. However, in specific domains, such as patent retrieval or law, recall becomes more relevant than precision: in these cases the goal is to find all relevant documents, requiring algorithms to be tuned more towards recall at the cost of precision. This raises important questions with respect to retrievability and search engine bias: depending on how the similarity between a query and documents is measured, certain documents may be more or less retrievable in certain systems, up to some documents not being retrievable at all within common threshold settings. Biases may be oriented towards popularity of documents (increasing weight of references), towards length of documents, favour the use of rare or common words; rely on structural information such as metadata or headings, etc. Existing accessibility measurement techniques are limited as they measure retrievability with respect to all possible queries. In this paper, we improve accessibility measurement by considering sets of relevant and irrelevant queries for each document. This simulates how recall oriented users create their queries when searching for relevant information. We evaluate retrievability scores using a corpus of patents from US Patent and Trademark Office.

  18. Health and Recovery Program in Increasing Physical Activity Level in Stage IA-IIIA Endometrial Cancer Survivors

    ClinicalTrials.gov

    2018-03-05

    Cancer Survivor; Endometrial Carcinoma; Stage I Uterine Corpus Cancer AJCC v7; Stage IA Uterine Corpus Cancer AJCC v7; Stage IB Uterine Corpus Cancer AJCC v7; Stage II Uterine Corpus Cancer AJCC v7; Stage IIIA Uterine Corpus Cancer AJCC v7

  19. The Corpus Callosum and Reading: An MRI Volumetric Study

    ERIC Educational Resources Information Center

    Fine, Jodene Goldenring

    2006-01-01

    Researchers have long been interested in the role of the corpus callosum in reading disorder, but existing studies have yielded inconsistent results. Some have found larger corpus callosa in those with reading disorder, others have found smaller corpus callosa, and some have found no differences in the corpus callosa of persons with and without…

  20. Natural brain-information interfaces: Recommending information by relevance inferred from human brain signals

    PubMed Central

    Eugster, Manuel J. A.; Ruotsalo, Tuukka; Spapé, Michiel M.; Barral, Oswald; Ravaja, Niklas; Jacucci, Giulio; Kaski, Samuel

    2016-01-01

    Finding relevant information from large document collections such as the World Wide Web is a common task in our daily lives. Estimation of a user’s interest or search intention is necessary to recommend and retrieve relevant information from these collections. We introduce a brain-information interface used for recommending information by relevance inferred directly from brain signals. In experiments, participants were asked to read Wikipedia documents about a selection of topics while their EEG was recorded. Based on the prediction of word relevance, the individual’s search intent was modeled and successfully used for retrieving new relevant documents from the whole English Wikipedia corpus. The results show that the users’ interests toward digital content can be modeled from the brain signals evoked by reading. The introduced brain-relevance paradigm enables the recommendation of information without any explicit user interaction and may be applied across diverse information-intensive applications. PMID:27929077

  1. Natural brain-information interfaces: Recommending information by relevance inferred from human brain signals

    NASA Astrophysics Data System (ADS)

    Eugster, Manuel J. A.; Ruotsalo, Tuukka; Spapé, Michiel M.; Barral, Oswald; Ravaja, Niklas; Jacucci, Giulio; Kaski, Samuel

    2016-12-01

    Finding relevant information from large document collections such as the World Wide Web is a common task in our daily lives. Estimation of a user’s interest or search intention is necessary to recommend and retrieve relevant information from these collections. We introduce a brain-information interface used for recommending information by relevance inferred directly from brain signals. In experiments, participants were asked to read Wikipedia documents about a selection of topics while their EEG was recorded. Based on the prediction of word relevance, the individual’s search intent was modeled and successfully used for retrieving new relevant documents from the whole English Wikipedia corpus. The results show that the users’ interests toward digital content can be modeled from the brain signals evoked by reading. The introduced brain-relevance paradigm enables the recommendation of information without any explicit user interaction and may be applied across diverse information-intensive applications.

  2. Using Distinct Sectors in Media Sampling and Full Media Analysis to Detect Presence of Documents from a Corpus

    DTIC Science & Technology

    2012-09-01

    ...relative performance of several conventional SQL and NoSQL databases with a set of one billion file block hashes. Digital Forensics, Sector Hashing, Full...

  3. Comprehensive Patient Questionnaires in Predicting Complications in Older Patients With Gynecologic Cancer Undergoing Surgery

    ClinicalTrials.gov

    2018-02-14

    Endometrial Serous Adenocarcinoma; Fallopian Tube Carcinoma; Ovarian Carcinoma; Primary Peritoneal Carcinoma; Stage IIIA Uterine Corpus Cancer AJCC v7; Stage IIIB Uterine Corpus Cancer AJCC v7; Stage IIIC Uterine Corpus Cancer AJCC v7; Stage IVA Uterine Corpus Cancer AJCC v7; Stage IVB Uterine Corpus Cancer AJCC v7

  4. Scaling-up NLP Pipelines to Process Large Corpora of Clinical Notes.

    PubMed

    Divita, G; Carter, M; Redd, A; Zeng, Q; Gupta, K; Trautner, B; Samore, M; Gundlapalli, A

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". This paper describes the scale-up efforts at the VA Salt Lake City Health Care System to address processing large corpora of clinical notes through a natural language processing (NLP) pipeline. The use case described is a current project focused on detecting the presence of an indwelling urinary catheter in hospitalized patients and subsequent catheter-associated urinary tract infections. An NLP algorithm using v3NLP was developed to detect the presence of an indwelling urinary catheter in hospitalized patients. The algorithm was tested on a small corpus of notes on patients for whom the presence or absence of a catheter was already known (reference standard). In planning for a scale-up, we estimated that the original algorithm would have taken 2.4 days to run on a larger corpus of notes for this project (550,000 notes), and 27 days for a corpus of 6 million records representative of a national sample of notes. We approached scaling-up NLP pipelines through three techniques: pipeline replication via multi-threading, intra-annotator threading for tasks that can be further decomposed, and remote annotator services which enable annotator scale-out. The scale-up resulted in reducing the average time to process a record from 206 milliseconds to 17 milliseconds, or a 12-fold increase in performance, when applied to a corpus of 550,000 notes. Purposely simplistic in nature, these scale-up efforts are the straightforward evolution from small scale NLP processing to larger scale extraction without incurring associated complexities that are inherited by the use of the underlying UIMA framework. These efforts represent generalizable and widely applicable techniques that will aid other computationally complex NLP pipelines that need to be scaled out for processing and analyzing big data.

  5. Diffuse corpus callosum infarction - Rare vascular entity with differing etiology.

    PubMed

    Mahale, Rohan; Mehta, Anish; Buddaraju, Kiran; John, Aju Abraham; Javali, Mahendra; Srinivasa, Rangasetty

    2016-01-15

    Infarctions of the corpus callosum are rare vascular events. The corpus callosum is relatively immune to vascular insult because of its rich vascular supply from the anterior and posterior circulations of the brain. We report 3 patients with largely diffuse acute corpus callosum infarction. The 3 patients were studied, and each had a different aetiology. The 3 aetiologies of largely diffuse acute corpus callosum infarction were cardioembolism, tuberculous arteritis and Takayasu arteritis. Diffuse corpus callosum infarcts are rare events. This case series describes three different aetiologies of diffuse acute corpus callosum infarction, which is a rare vascular event. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. 77 FR 34034 - Corpus Christi Liquefaction, LLC; Cheniere Corpus Christi Pipeline, L.P.; Notice of Intent To...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-08

    ... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PF12-3-000] Corpus Christi Liquefaction, LLC; Cheniere Corpus Christi Pipeline, L.P.; Notice of Intent To Prepare an Environmental Assessment for the Planned Corpus Christi LNG Terminal and Pipeline Project, Request for Comments on Environmental Issues, and Notice of Public...

  7. [Behavioral and cognitive profile of corpus callosum agenesia - Review].

    PubMed

    Lábadi, Beatrix; Beke, Anna Maria

    2016-11-30

    Agenesis of the corpus callosum is a relatively frequent congenital cerebral malformation that includes dysplasia and total or partial absence of the corpus callosum. Agenesis of the corpus callosum can occur in isolated form, without accompanying somatic or central nervous system abnormalities, or it can be associated with other central nervous system malformations. The behavioral and cognitive outcome is more favorable for patients with isolated agenesis of the corpus callosum than for those with the syndromic form. The aim of this study is to review recent research on behavioral and social-cognitive functions in individuals with agenesis of the corpus callosum. Developmental delay is common, especially in higher-order cognitive and social functions. An internet database search was performed to identify publications on the subject. Fifty-five publications in English met the criteria. These studies reported deficits in language, social cognition and emotions in individuals with agenesis of the corpus callosum, which is known as primary corpus callosum syndrome. The results indicate that individuals with agenesis of the corpus callosum have deficiencies in the social-cognitive domain (recognition of emotions, weakness in paralinguistic aspects of language and mentalizing abilities). The impaired social cognition can manifest in behavioral problems such as autism and attention deficit hyperactivity disorder.

  8. Leveraging Wikipedia knowledge to classify multilingual biomedical documents.

    PubMed

    Antonio Mouriño García, Marcos; Pérez Rodríguez, Roberto; Anido Rifón, Luis

    2018-05-02

    This article presents a classifier that leverages Wikipedia knowledge to represent documents as vectors of concept weights, and analyses its suitability for classifying biomedical documents written in any language when it is trained only with English documents. We propose the cross-language concept matching technique, which relies on Wikipedia interlanguage links to convert concept vectors between languages. The performance of the classifier is compared to a classifier based on machine translation, and two classifiers based on MetaMap. To perform the experiments, we created two multilingual corpora. The first one, Multi-Lingual UVigoMED (ML-UVigoMED), is composed of 23,647 Wikipedia documents about biomedical topics written in English, German, French, Spanish, Italian, Galician, Romanian, and Icelandic. The second one, English-French-Spanish-German UVigoMED (EFSG-UVigoMED), is composed of 19,210 biomedical abstracts extracted from MEDLINE written in English, French, Spanish, and German. The performance of the proposed approach is superior to that of any of the state-of-the-art classifiers in the benchmark. We conclude that leveraging Wikipedia knowledge is of great advantage in tasks of multilingual classification of biomedical documents. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. [Erectile function and ablative surgery of penile tumors].

    PubMed

    Pisani, E; Austoni, E; Trinchieri, A; Ceresoli, A; Mantovani, F; Colombo, F; Mastromarino, G; Vecchio, D; Canclini, L; Fenice, O

    1994-02-01

    The authors try to show that radical excision can be combined with minimal invasiveness in the surgery of penile cancer. The focal point of every therapeutic decision is correct clinical staging. Unfortunately, there is some confusion in the two international staging systems (TNM and Jackson's classification). In fact, the anatomical difference between epithelioma of the glans infiltrating the corpus spongiosum and subcoronary epithelioma of the shaft infiltrating the corpora cavernosa is not clear. It is obvious that infiltration of the corpora cavernosa is a far more aggressive oncological manifestation than a tumour infiltrating the corpus spongiosum. So we consider Jackson's classification more congenial. In terms of surgery, this anatomical independence makes it easy to consider the corpora cavernosa as a distinct entity, so they remain perfectly functional when separated from the glandulo-spongio-urethral unit with its vasculo-nervous bundle. This makes conservation of erectile function possible when clinical staging shows that the tumour is not infiltrating the corpora cavernosa. The authors show their results, which seem to be rather good.

  10. New directions in biomedical text annotation: definitions, guidelines and corpus construction

    PubMed Central

    Wilbur, W John; Rzhetsky, Andrey; Shatkay, Hagit

    2006-01-01

    Background While biomedical text mining is emerging as an important research area, practical results have proven difficult to achieve. We believe that an important first step towards more accurate text-mining lies in the ability to identify and characterize text that satisfies various types of information needs. We report here the results of our inquiry into properties of scientific text that have sufficient generality to transcend the confines of a narrow subject area, while supporting practical mining of text for factual information. Our ultimate goal is to annotate a significant corpus of biomedical text and train machine learning methods to automatically categorize such text along certain dimensions that we have defined. Results We have identified five qualitative dimensions that we believe characterize a broad range of scientific sentences, and are therefore useful for supporting a general approach to text-mining: focus, polarity, certainty, evidence, and directionality. We define these dimensions and describe the guidelines we have developed for annotating text with regard to them. To examine the effectiveness of the guidelines, twelve annotators independently annotated the same set of 101 sentences that were randomly selected from current biomedical periodicals. Analysis of these annotations shows 70–80% inter-annotator agreement, suggesting that our guidelines indeed present a well-defined, executable and reproducible task. Conclusion We present our guidelines defining a text annotation task, along with annotation results from multiple independently produced annotations, demonstrating the feasibility of the task. The annotation of a very large corpus of documents along these guidelines is currently ongoing. These annotations form the basis for the categorization of text along multiple dimensions, to support viable text mining for experimental results, methodology statements, and other forms of information. We are currently developing machine learning methods, to be trained and tested on the annotated corpus, that would allow for the automatic categorization of biomedical text along the general dimensions that we have presented. The guidelines in full detail, along with annotated examples, are publicly available. PMID:16867190

  11. Construction of an annotated corpus to support biomedical information extraction

    PubMed Central

    Thompson, Paul; Iqbal, Syed A; McNaught, John; Ananiadou, Sophia

    2009-01-01

    Background Information Extraction (IE) is a component of text mining that facilitates knowledge discovery by automatically locating instances of interesting biomedical events from huge document collections. As events are usually centred on verbs and nominalised verbs, understanding the syntactic and semantic behaviour of these words is highly important. Corpora annotated with information concerning this behaviour can constitute a valuable resource in the training of IE components and resources. Results We have defined a new scheme for annotating sentence-bound gene regulation events, centred on both verbs and nominalised verbs. For each event instance, all participants (arguments) in the same sentence are identified and assigned a semantic role from a rich set of 13 roles tailored to biomedical research articles, together with a biological concept type linked to the Gene Regulation Ontology. To our knowledge, our scheme is unique within the biomedical field in terms of the range of event arguments identified. Using the scheme, we have created the Gene Regulation Event Corpus (GREC), consisting of 240 MEDLINE abstracts, in which events relating to gene regulation and expression have been annotated by biologists. A novel method of evaluating various different facets of the annotation task showed that average inter-annotator agreement rates fall within the range of 66% - 90%. Conclusion The GREC is a unique resource within the biomedical field, in that it annotates not only core relationships between entities, but also a range of other important details about these relationships, e.g., location, temporal, manner and environmental conditions. As such, it is specifically designed to support bio-specific tool and resource development. It has already been used to acquire semantic frames for inclusion within the BioLexicon (a lexical, terminological resource to aid biomedical text mining). 
    Initial experiments have also shown that the corpus may viably be used to train IE components, such as semantic role labellers. The corpus and annotation guidelines are freely available for academic purposes. PMID:19852798

  12. Corpus-based Customization for an Ontology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2010-09-14

    CCAT scans a corpus of text for terms, and computes lexical similarity between corpus terms and taxonomy terms. Based on a set of metrics and a learning algorithm, the system inserts corpus terms into the taxonomy. Conversely, terms from the taxonomy are disambiguated based on the text in the corpus. Unused terms are discarded, and infrequently used senses of terms are collapsed to make the taxonomy more manageable.
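
    The matching step described above can be sketched as follows. This is only an illustration: the record does not say which metrics or learning algorithm CCAT uses, so the character-level ratio from Python's standard `difflib`, the `threshold` value, and the sample terms are all assumptions.

```python
from difflib import SequenceMatcher


def lexical_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two terms (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_taxonomy_match(term, taxonomy_terms, threshold=0.8):
    """Return (taxonomy_term, score) for the closest term, or None if below threshold.

    The threshold is a hypothetical cut-off, not a value taken from CCAT.
    """
    scored = [(t, lexical_similarity(term, t)) for t in taxonomy_terms]
    best = max(scored, key=lambda pair: pair[1])
    return best if best[1] >= threshold else None


# Hypothetical taxonomy and corpus terms, for illustration only.
taxonomy = ["reactor vessel", "coolant pump", "fuel rod"]
match_close = best_taxonomy_match("coolant pumps", taxonomy)
match_none = best_taxonomy_match("tax policy", taxonomy)
```

    A corpus term with a close match would be attached near that taxonomy term; one with no match above the threshold would need another metric or manual review.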

  13. Massive hemoperitoneum due to a ruptured corpus luteum cyst in a patient with congenital hypofibrinogenemia

    PubMed Central

    Kim, Jong-Hyun; Jeong, So-Young

    2015-01-01

    Congenital afibrinogenemia/hypofibrinogenemia is a rare inherited hematologic disorder in which a patient lacks or has an insufficient level of fibrinogen, blood coagulation factor I. The incidence of this uncommon disease is 1 to 2 per 1 million individuals. Hence, massive hemoperitoneum caused by ovulation in a woman with congenital afibrinogenemia is also a very rare clinical condition. Massive hemoperitoneum usually presents as acute abdominal pain with potential findings of peritonitis including abdominal distention, hypotension and tachycardia with critical consequences. We performed emergent endoscopic surgery for hemoperitoneum caused by a ruptured corpus luteum cyst in a patient with congenital hypofibrinogenemia. To the best of our knowledge, this was the first case report of such treatment in Korea. PMID:26430672

  14. Blindness, dancing extremities, and corpus callosum and brain stem involvement: an unusual presentation of fulminant subacute sclerosing panencephalitis.

    PubMed

    Singhi, Pratibha; Saini, Arushi Gahlot; Sankhyan, Naveen; Gupta, Pankaj; Vyas, Sameer

    2015-01-01

    A 4-year-old girl presented with acute visual loss followed 2 weeks later by loss of speech and audition, fulminant neuroregression, and choreo-athetoid movements of the extremities. Fundus examination showed bilateral chorioretinitis. Electroencephalography showed periodic complexes. Measles antibody titers were elevated in both serum and cerebrospinal fluid, consistent with subacute sclerosing panencephalitis. Neuroimaging showed discontiguous involvement of the splenium of the corpus callosum and ventral pons with sparing of cortical white matter. Our case highlights the atypical clinical and radiologic presentations of subacute sclerosing panencephalitis. Pediatricians need to be aware that necrotizing chorioretinitis in a child and/or atypical brain stem changes could be the heralding feature of this condition in endemic countries. © The Author(s) 2014.

  15. Phenotype and management of Aicardi syndrome: new findings from a survey of 69 children

    USDA-ARS's Scientific Manuscript database

    Aicardi syndrome is a rare neurodevelopmental disorder characterized by agenesis of the corpus callosum, other developmental brain abnormalities, chorioretinal lacunae, and severe seizures. Current clinical knowledge is derived from small series that focus on these major defects. The authors perform...

  16. Trial of Cisplatin Plus Radiation Followed by Carbo and Taxol Vs. Sandwich Therapy of Carbo and Taxol Followed Radiation Then Further Carbo and Taxol

    ClinicalTrials.gov

    2017-10-30

    Endometrial Clear Cell Adenocarcinoma; Endometrial Serous Adenocarcinoma; Stage IIIA Uterine Corpus Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer

  17. Thyrotropin receptor autoantibodies and early miscarriages in patients with Hashimoto thyroiditis: a case-control study.

    PubMed

    Toulis, Konstantinos A; Goulis, Dimitrios G; Tsolakidou, Konstantina; Hilidis, Ilias; Fragkos, Marios; Polyzos, Stergios A; Gerofotis, Antonios; Kita, Marina; Bili, Helen; Vavilis, Dimitrios; Daniilidis, Michail; Tarlatzis, Basil C; Papadimas, Ioannis

    2013-08-01

    We have previously hypothesized that early miscarriage in women with Hashimoto thyroiditis might be the result of a cross-reactivity process, in which blocking autoantibodies against thyrotropin receptor (TSHr-Ab) antagonize hCG action on its receptor on the corpus luteum. To test this hypothesis from the clinical perspective, we investigated the presence of TSHr-Ab in Hashimoto thyroiditis patients with apparently unexplained, first-trimester recurrent miscarriages compared to that in Hashimoto thyroiditis patients with documented normal fertility. A total of 86 subjects (43 cases and 43 age-matched controls) were finally included in a case-control study. No difference in the prevalence of TSHr-Ab positivity was detected between cases and controls (Fisher's exact test, p value = 1.00). In patients with recurrent miscarriages, TSHr-Ab concentrations did not predict the number of miscarriages (univariate linear regression, p value = 0.08). These results were robust in sensitivity analyses, including only cases with full investigation or those with three or more miscarriages. We conclude that no role could be advocated for TSHr-Ab in the aetiology of recurrent miscarriages in women with Hashimoto thyroiditis.

  18. Carevive Survivor Care Planning System in Improving Quality of Life in Breast Cancer Survivors

    ClinicalTrials.gov

    2018-02-20

    Stage I Breast Cancer; Stage I Cervical Cancer; Stage I Ovarian Cancer; Stage I Uterine Corpus Cancer; Stage IA Breast Cancer; Stage IA Cervical Cancer; Stage IA Ovarian Cancer; Stage IA Uterine Corpus Cancer; Stage IB Breast Cancer; Stage IB Cervical Cancer; Stage IB Ovarian Cancer; Stage IB Uterine Corpus Cancer; Stage IC Ovarian Cancer; Stage II Breast Cancer; Stage II Cervical Cancer; Stage II Ovarian Cancer; Stage II Uterine Corpus Cancer; Stage IIA Breast Cancer; Stage IIA Cervical Cancer; Stage IIA Ovarian Cancer; Stage IIB Breast Cancer; Stage IIB Cervical Cancer; Stage IIB Ovarian Cancer; Stage IIC Ovarian Cancer; Stage III Breast Cancer; Stage III Cervical Cancer; Stage III Ovarian Cancer; Stage III Uterine Corpus Cancer; Stage IIIA Breast Cancer; Stage IIIA Cervical Cancer; Stage IIIA Ovarian Cancer; Stage IIIA Uterine Corpus Cancer; Stage IIIB Breast Cancer; Stage IIIB Cervical Cancer; Stage IIIB Ovarian Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC Breast Cancer; Stage IIIC Ovarian Cancer; Stage IIIC Uterine Corpus Cancer

  19. Graph-based word sense disambiguation of biomedical documents.

    PubMed

    Agirre, Eneko; Soroa, Aitor; Stevenson, Mark

    2010-11-15

    Word Sense Disambiguation (WSD), automatically identifying the meaning of ambiguous words in context, is an important stage of text processing. This article presents a graph-based approach to WSD in the biomedical domain. The method is unsupervised and does not require any labeled training data. It makes use of knowledge from the Unified Medical Language System (UMLS) Metathesaurus which is represented as a graph. A state-of-the-art algorithm, Personalized PageRank, is used to perform WSD. When evaluated on the NLM-WSD dataset, the algorithm outperforms other methods that rely on the UMLS Metathesaurus alone. The WSD system is open source licensed and available from http://ixa2.si.ehu.es/ukb/. The UMLS, MetaMap program and NLM-WSD corpus are available from the National Library of Medicine https://www.nlm.nih.gov/research/umls/, http://mmtx.nlm.nih.gov and http://wsd.nlm.nih.gov. Software to convert the NLM-WSD corpus into a format that can be used by our WSD system is available from http://www.dcs.shef.ac.uk/~marks/biomedical_wsd under open source license.
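
    As a rough illustration of how Personalized PageRank can rank candidate senses, the sketch below runs a plain power iteration in which teleportation is restricted to the concepts of the context words. It is not the authors' UKB implementation: the five-node graph, the node names, and the damping and iteration settings are hypothetical stand-ins for the UMLS graph.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for PageRank with teleportation restricted to `seeds`.

    graph: dict mapping node -> list of neighbours (undirected edges listed both ways).
    seeds: context-word concepts; teleport mass is spread uniformly over them.
    Assumes every node has at least one neighbour (no dangling nodes).
    """
    nodes = list(graph)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            share = damping * rank[n] / len(graph[n])
            for m in graph[n]:
                new_rank[m] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: the ambiguous word "cold" has two candidate senses.
toy_graph = {
    "cold_disease": ["virus", "fever"],
    "cold_temperature": ["temperature"],
    "virus": ["cold_disease"],
    "fever": ["cold_disease", "temperature"],
    "temperature": ["fever", "cold_temperature"],
}
context = {"virus", "fever"}  # concepts of the words surrounding "cold"
ranks = personalized_pagerank(toy_graph, context)
best_sense = max(["cold_disease", "cold_temperature"], key=ranks.get)
```

    With the context concepts as seeds, the sense that is better connected to them accumulates more rank and is selected.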

  20. Using clustering and a modified classification algorithm for automatic text summarization

    NASA Astrophysics Data System (ADS)

    Aries, Abdelkrime; Oufaida, Houda; Nouali, Omar

    2013-01-01

    In this paper we describe a modified classification method intended for extractive summarization. The classification in this method doesn't need a learning corpus; it uses the input text instead. First, we cluster the document sentences to exploit the diversity of topics, then we use a learning algorithm (here we used Naive Bayes) on each cluster, considering it as a class. After obtaining the classification model, we calculate the score of a sentence in each class, using a scoring model derived from the classification algorithm. These scores are then used to reorder the sentences and extract the first ones as the output summary. We conducted some experiments using a corpus of scientific papers, and we have compared our results to another summarization system called UNIS. Also, we examine the impact of clustering threshold tuning on the resulting summary, as well as the impact of adding more features to the classifier. We found that this method is interesting and gives good performance, and that the addition of new features (which is simple using this method) can improve the summary's accuracy.
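
    The pipeline in this abstract (cluster the sentences, treat each cluster as a class, score each sentence against every class) might look roughly like the sketch below. The abstract does not give the clustering procedure, the features, or the scoring formula, so the greedy cosine clustering, the bag-of-words multinomial Naive Bayes with Laplace smoothing, and the length-normalised score summed over classes are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lower-case whitespace tokens with surrounding punctuation stripped."""
    return [w for w in (t.strip(".,;:!?").lower() for t in sentence.split()) if w]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_sentences(bags, threshold):
    """Greedy single pass: join the first cluster whose word counts are similar enough."""
    clusters = []  # each entry is (member indices, summed word counts)
    for i, bag in enumerate(bags):
        for members, counts in clusters:
            if cosine(bag, counts) >= threshold:
                members.append(i)
                counts.update(bag)
                break
        else:
            clusters.append(([i], Counter(bag)))
    return clusters


def summarize(sentences, threshold=0.2, size=2):
    """Return the `size` highest-scoring sentences (all sentences must be non-empty)."""
    bags = [Counter(tokenize(s)) for s in sentences]
    clusters = cluster_sentences(bags, threshold)
    vocab_size = len({w for bag in bags for w in bag})
    scores = []
    for bag in bags:
        length = sum(bag.values())
        score = 0.0
        for members, counts in clusters:
            # Multinomial Naive Bayes with Laplace smoothing; the cluster is the class.
            class_total = sum(counts.values())
            log_prior = math.log(len(members) / len(sentences))
            log_lik = sum(n * math.log((counts[w] + 1) / (class_total + vocab_size))
                          for w, n in bag.items())
            # Assumed scoring rule: length-normalised class score, summed over classes.
            score += math.exp((log_prior + log_lik) / length)
        scores.append(score)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in ranked[:size]]


# Hypothetical input text, for illustration only.
document = [
    "Cats chase mice.",
    "Cats chase birds.",
    "Stocks fell sharply.",
    "Stocks rose again after the report.",
]
summary = summarize(document, size=2)
```

    Lowering `threshold` merges more sentences into fewer clusters, which is one way to explore the threshold tuning the abstract mentions.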

  1. Developing an International Corpus of Creative English

    ERIC Educational Resources Information Center

    Hassall, Peter John

    2006-01-01

    This paper proposes an International Corpus of Creative English (ICCE) as a worldwide corpus particularly suitable for implementation in countries which have tertiary institutions with well-defined populations of students possessing similar cultural and/or linguistic backgrounds. The ICCE is contextualized as a world Englishes corpus with…

  2. The Hebrew CHILDES corpus: transcription and morphological analysis

    PubMed Central

    Albert, Aviad; MacWhinney, Brian; Nir, Bracha

    2014-01-01

    We present a corpus of transcribed spoken Hebrew that reflects spoken interactions between children and adults. The corpus is an integral part of the CHILDES database, which distributes similar corpora for over 25 languages. We introduce a dedicated transcription scheme for the spoken Hebrew data that is sensitive to both the phonology and the standard orthography of the language. We also introduce a morphological analyzer that was specifically developed for this corpus. The analyzer adequately covers the entire corpus, producing detailed correct analyses for all tokens. Evaluation on a new corpus reveals high coverage as well. Finally, we describe a morphological disambiguation module that selects the correct analysis of each token in context. The result is a high-quality morphologically-annotated CHILDES corpus of Hebrew, along with a set of tools that can be applied to new corpora. PMID:25419199

  3. 75 FR 43886 - Proposed Amendment of Class E Airspace; Corpus Christi, TX

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-27

    ...-0404; Airspace Docket No. 10-ASW-7] Proposed Amendment of Class E Airspace; Corpus Christi, TX AGENCY... action proposes to amend Class E airspace in the Corpus Christi, TX area. Additional controlled airspace is necessary to accommodate new Standard Instrument Approach Procedures (SIAPs) at Corpus Christi...

  4. 75 FR 31677 - Amendment of Class E Airspace; Corpus Christi, TX

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-04

    ...-0089; Airspace Docket No. 10-ASW-1] Amendment of Class E Airspace; Corpus Christi, TX AGENCY: Federal... the Corpus Christi, TX area. Additional controlled airspace is necessary to accommodate new Standard... E airspace for the Corpus Christi, TX area, reconfiguring controlled airspace at Aransas County...

  5. 75 FR 66301 - Amendment of Class E Airspace; Corpus Christi, TX

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-28

    ... the Corpus Christi, TX, area. Additional controlled airspace is necessary to accommodate new Standard Instrument Approach Procedures (SIAP) at Corpus Christi International Airport, Corpus Christi, TX. The FAA is taking this action to enhance the safety and management of Instrument Flight Rule (IFR) operations at the...

  6. VSV-hIFNbeta-NIS in Treating Patients With Stage IV or Recurrent Endometrial Cancer

    ClinicalTrials.gov

    2018-05-09

    Endometrial Clear Cell Adenocarcinoma; Endometrial Mixed Adenocarcinoma; Endometrial Serous Adenocarcinoma; Endometrial Undifferentiated Carcinoma; Metastatic Endometrioid Adenocarcinoma; Ovarian Endometrioid Adenocarcinoma; Recurrent Endometrial Serous Adenocarcinoma; Recurrent Uterine Corpus Carcinoma; Stage IV Uterine Corpus Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  7. The Effect of Corpus-Based Instruction on Pragmatic Routines

    ERIC Educational Resources Information Center

    Bardovi-Harlig, Kathleen; Mossman, Sabrina; Su, Yunwen

    2017-01-01

    This study compares the effect of using corpus-based materials and activities for the instruction of pragmatic routines under two conditions: implementing direct corpus searches by learners during classroom instruction and working with teacher-developed corpus-based materials. The outcome is compared to a repeated-test control group. Pragmatic…

  8. An Abstraction-Based Data Model for Information Retrieval

    NASA Astrophysics Data System (ADS)

    McAllister, Richard A.; Angryk, Rafal A.

    Language ontologies provide an avenue for automated lexical analysis that may be used to supplement existing information retrieval methods. This paper presents a method of information retrieval that takes advantage of WordNet, a lexical database, to generate paths of abstraction, and uses them as the basis for an inverted index structure to be used in the retrieval of documents from an indexed corpus. We present this method as an entrée to a line of research on using ontologies to perform word-sense disambiguation and improve the precision of existing information retrieval techniques.
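
    One way to picture an inverted index keyed on paths of abstraction is the sketch below. It is a simplified assumption rather than the authors' data model: a small hand-written hypernym table stands in for WordNet, each word has a single parent, and the sample documents are invented.

```python
from collections import defaultdict

# Hypothetical child -> parent hypernym links standing in for WordNet.
HYPERNYMS = {
    "beagle": "dog", "dog": "canine", "canine": "mammal",
    "tabby": "cat", "cat": "feline", "feline": "mammal",
    "mammal": "animal",
}


def abstraction_path(word):
    """Follow hypernym links upward from a word to its most abstract ancestor."""
    path = [word]
    while path[-1] in HYPERNYMS:
        path.append(HYPERNYMS[path[-1]])
    return path


def build_index(documents):
    """Map every concept on a word's abstraction path to the documents containing the word."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            for concept in abstraction_path(word):
                index[concept].add(doc_id)
    return index


# Invented sample documents, for illustration only.
docs = {1: "the beagle barked", 2: "a tabby slept", 3: "the report was late"}
index = build_index(docs)
mammal_docs = index["mammal"]
```

    A query for an abstract term such as "mammal" then retrieves documents that only mention more specific words, which is the kind of recall gain an abstraction-based index aims for.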

  9. Translation lexicon acquisition from bilingual dictionaries

    NASA Astrophysics Data System (ADS)

    Doermann, David S.; Ma, Huanfeng; Karagol-Ayan, Burcu; Oard, Douglas W.

    2001-12-01

    Bilingual dictionaries hold great potential as a source of lexical resources for training automated systems for optical character recognition, machine translation and cross-language information retrieval. In this work we describe a system for extracting term lexicons from printed copies of bilingual dictionaries. We describe our approach to page and definition segmentation and entry parsing. We have used the approach to parse a number of dictionaries and demonstrate the results for retrieval using a French-English Dictionary to generate a translation lexicon and a corpus of English queries applied to French documents to evaluate cross-language IR.

  10. Incremental Ontology-Based Extraction and Alignment in Semi-structured Documents

    NASA Astrophysics Data System (ADS)

    Thiam, Mouhamadou; Bennacer, Nacéra; Pernelle, Nathalie; Lô, Moussa

    SHIRI is an ontology-based system for integration of semi-structured documents related to a specific domain. The system’s purpose is to allow users to access relevant parts of documents as answers to their queries. SHIRI uses RDF/OWL for representation of resources and SPARQL for their querying. It relies on an automatic, unsupervised and ontology-driven approach for extraction, alignment and semantic annotation of tagged elements of documents. In this paper, we focus on the Extract-Align algorithm which exploits a set of named entity and term patterns to extract term candidates to be aligned with the ontology. It proceeds in an incremental manner in order to populate the ontology with terms describing instances of the domain and to reduce the access to external resources such as the Web. We evaluate it on an HTML corpus of calls for papers in computer science, and the results that we obtain are very promising. These results show how the incremental behaviour of the Extract-Align algorithm enriches the ontology and how the number of terms (or named entities) aligned directly with the ontology increases.

  11. [Reconstruction of penile function with tissue engineering techniques].

    PubMed

    Song, Lu-jie; Pan, Lian-jun; Xu, Yue-min

    2007-04-01

    Tissue engineering techniques, with their potential applied value for penile reconstruction, are of special interest for andrologists. The purpose of this review is to appraise the recent development and publications in this field. In the past few years, great efforts have been made to develop corpus cavernosum tissues by combining smooth muscle and endothelial cells seeded on biodegradable polyglycolic acid polymer (PGA) or acellular corporal collagen matrix scaffolds. Animal experiments demonstrated that the engineered corpus cavernosum achieved adequate structural and functional parameters. Engineered cartilage rods as an alternative for the current clinical standard of semirigid or inflatable penile implants could be created by seeding chondrocytes onto cylindrical PGA. A series of studies showed that, compared to commercially available silicone implants, the engineered rods were flexible, elastic and stable. Besides, a variety of decellularized biological materials have been used as grafts not only for substitution of tunica albuginea but also for penile enhancement, with promising results. For treating erectile dysfunction, a new approach to recovering erectile function by cell-based therapy could be the injection of functional cells into corpus cavernosum, which seemed to be promising when combined with cell manipulation by gene therapy prior to cell transfer.

  12. Surgery and Chemotherapy With or Without Chemotherapy After Surgery in Treating Patients With Ovarian, Fallopian Tube, Uterine, or Peritoneal Cancer

    ClinicalTrials.gov

    2018-04-26

    Recurrent Uterine Corpus Cancer; Recurrent Fallopian Tube Cancer; Recurrent Ovarian Cancer; Recurrent Primary Peritoneal Cancer; Stage IIIA Uterine Corpus Cancer; Stage IIIA Fallopian Tube Cancer; Stage IIIA Ovarian Cancer; Stage IIIA Primary Peritoneal Cavity Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIB Fallopian Tube Cancer; Stage IIIB Ovarian Cancer; Stage IIIB Primary Peritoneal Cavity Cancer; Stage IIIC Uterine Corpus Cancer; Stage IIIC Fallopian Tube Cancer; Stage IIIC Ovarian Cancer; Stage IIIC Primary Peritoneal Cavity Cancer; Stage IV Fallopian Tube Cancer; Stage IV Ovarian Cancer; Stage IV Primary Peritoneal Cavity Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  13. De novo interstitial duplication of the 15q11.2-q14 PWS/AS region of maternal origin: Clinical description, array CGH analysis, and review of the literature.

    PubMed

    Kitsiou-Tzeli, Sophia; Tzetis, Maria; Sofocleous, Christalena; Vrettou, Christina; Xaidara, Athena; Giannikou, Krinio; Pampanos, Andreas; Mavrou, Ariadne; Kanavakis, E

    2010-08-01

    The 15q11-q13 PWS/AS critical region involves genes that are characterized by genomic imprinting. Multiple repeat elements within the region mediate rearrangements, including interstitial duplications, interstitial triplications, and supernumerary isodicentric marker chromosomes, as well as the deletions that cause Prader-Willi syndrome (PWS) and Angelman syndrome (AS). Recently, duplications of maternal origin concerning the same critical region have been implicated in autism spectrum disorders (ASD). We present a 6-month-old girl carrying a de novo duplication of maternal origin of the 15q11.2-q14 PWS/AS region (17.73 Mb in size) [46,XX,dup(15)(q11.2-q14)] detected with a high-resolution microarray-based comparative genomic hybridization (array-CGH). The patient is characterized by severe hypotonia, obesity, microstomia, long eyelashes, hirsutism, microretrognathia, short nose, severe psychomotor retardation, and multiple episodes of drug-resistant epileptic seizures, while her brain magnetic resonance imaging (MRI) documented partial corpus callosum dysplasia. In our patient the duplicated region is quite large extending beyond the Prader-Willi-Angelman critical region (PWACR), containing a number of genes that have been shown to be involved in ASD, exhibiting a severe phenotype, beyond the typical PWS/AS clinical manifestations. Reporting of similar well-characterized clinical cases with clearly delineated breakpoints of the duplicated region will clarify the contribution of specific genes to the phenotype.

  14. Correlation between morphological MRI findings and specific diagnostic categories in fetal alcohol spectrum disorders.

    PubMed

    Boronat, S; Sánchez-Montañez, A; Gómez-Barros, N; Jacas, C; Martínez-Ribot, L; Vázquez, E; Del Campo, M

    2017-01-01

    Fetal alcohol spectrum disorders (FASD) include physical and neurodevelopmental abnormalities related to prenatal alcohol exposure. Some neuroimaging findings have been clearly related to FASD, including corpus callosum and cerebellar anomalies. However, detailed studies correlating these findings with specific FASD categories, that is, the fetal alcohol syndrome (FAS), partial FAS (pFAS) and alcohol related neurodevelopmental disorders (ARND), are lacking. We prospectively performed clinical assessment and brain MR imaging in 72 patients with suspected FASD, and diagnosis was confirmed in 62. The most frequent findings were hypoplasia of the corpus callosum and/or of the cerebellar vermis. Additional findings were vascular anomalies, gliosis, prominent perivascular spaces, occipito-cervical junction and cervical vertebral anomalies, pituitary hypoplasia, arachnoid cysts, and cavum septum pellucidum. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  15. Evaluating Corpus Literacy Training for Pre-Service Language Teachers: Six Case Studies

    ERIC Educational Resources Information Center

    Heather, Julian; Helt, Marie

    2012-01-01

    Corpus literacy is the ability to use corpora--large, principled databases of spoken and written language--for language analysis and instruction. While linguists have emphasized the importance of corpus training in teacher preparation programs, few studies have investigated the process of initiating teachers into corpus literacy with the result…

  16. 75 FR 13453 - Proposed Amendment of Class E Airspace; Corpus Christi, TX

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-03-22

    ...-0089; Airspace Docket No. 10-ASW-1] Proposed Amendment of Class E Airspace; Corpus Christi, TX AGENCY... action proposes to amend Class E airspace in the Corpus Christi, TX area. Additional controlled airspace... adding additional Class E airspace extending upward from 700 feet above the surface in the Corpus Christi...

  17. Corpus-Based Research and Pedagogy in EAP: From Lexis to Genre

    ERIC Educational Resources Information Center

    Flowerdew, Lynne

    2015-01-01

    This plenary paper showcases current corpus-based research on written academic English, illustrating the tight links that exist between corpus research and pedagogic applications. I first explicate Sinclair's concept of the "lexical approach", which underpins much corpus research and pedagogy. I then discuss studies which focus on…

  18. Corpus Callosum Area in Children and Adults with Autism

    ERIC Educational Resources Information Center

    Prigge, Molly B. D.; Lange, Nicholas; Bigler, Erin D.; Merkley, Tricia L.; Neeley, E. Shannon; Abildskov, Tracy J.; Froehlich, Alyson L.; Nielsen, Jared A.; Cooperrider, Jason R.; Cariello, Annahir N.; Ravichandran, Caitlin; Alexander, Andrew L.; Lainhart, Janet E.

    2013-01-01

    Despite repeated findings of abnormal corpus callosum structure in autism, the developmental trajectories of corpus callosum growth in the disorder have not yet been reported. In this study, we examined corpus callosum size from a developmental perspective across a 30-year age range in a large cross-sectional sample of individuals with autism…

  19. EFL Students' Perceptions of Corpus-Tools as Writing References

    ERIC Educational Resources Information Center

    Lai, Shu-Li

    2015-01-01

    A number of studies have suggested the potential of corpus tools in vocabulary learning. However, there are still some concerns. Corpus tools might be too complicated to use; example sentences retrieved from corpus tools might be too difficult to understand; processing a large number of sample sentences could be challenging and time-consuming;…

  20. Adavosertib, External Beam Radiation Therapy, and Cisplatin in Treating Patients With Cervical, Vaginal, or Uterine Cancer

    ClinicalTrials.gov

    2018-06-06

    Endometrioid Adenocarcinoma; Recurrent Cervical Carcinoma; Stage I Uterine Corpus Cancer AJCC v7; Stage I Vaginal Cancer AJCC v6 and v7; Stage IA Uterine Corpus Cancer AJCC v7; Stage IB Cervical Cancer AJCC v6 and v7; Stage IB Uterine Corpus Cancer AJCC v7; Stage IB2 Cervical Cancer AJCC v6 and v7; Stage II Cervical Cancer AJCC v7; Stage II Uterine Corpus Cancer AJCC v7; Stage II Vaginal Cancer AJCC v6 and v7; Stage IIA Cervical Cancer AJCC v7; Stage IIB Cervical Cancer AJCC v6 and v7; Stage III Cervical Cancer AJCC v6 and v7; Stage III Uterine Corpus Cancer AJCC v7; Stage III Vaginal Cancer AJCC v6 and v7; Stage IIIA Cervical Cancer AJCC v6 and v7; Stage IIIA Uterine Corpus Cancer AJCC v7; Stage IIIB Cervical Cancer AJCC v6 and v7; Stage IIIB Uterine Corpus Cancer AJCC v7; Stage IIIC Uterine Corpus Cancer AJCC v7

  1. Mid-callosal plane determination using preferred directions from diffusion tensor images

    NASA Astrophysics Data System (ADS)

    Costa, André L.; Rittner, Letícia; Lotufo, Roberto A.; Appenzeller, Simone

    2015-03-01

    The corpus callosum is the major brain structure responsible for inter-hemispheric communication between neurons. Many studies seek to relate corpus callosum attributes to patient characteristics, cerebral diseases and psychological disorders. Most of those studies rely on 2D analysis of the corpus callosum in the mid-sagittal plane. However, it is common to find conflicting results among studies, since many ignore methodological issues and define the mid-sagittal plane based on precarious or invalid criteria with respect to the corpus callosum. In this work we propose a novel method to determine the mid-callosal plane using the corpus callosum internal preferred diffusion directions obtained from diffusion tensor images. This plane is analogous to the mid-sagittal plane, but intended to serve exclusively as the corpus callosum reference. Our method elucidates the great potential that the directional information of the corpus callosum fibers has to indicate its own reference frame. Results from experiments with five image pairs from distinct subjects, obtained under the same conditions, demonstrate the method's effectiveness in finding the corpus callosum symmetric axis relative to the axial plane.

  2. Corpus Linguistics and the Design of a Response Message

    NASA Astrophysics Data System (ADS)

    Atwell, E.

    2002-01-01

    Most research related to SETI, the Search for Extra-Terrestrial Intelligence, is focussed on techniques for detection of possible incoming signals from extra-terrestrial intelligent sources (e.g. Turnbull et al. 1999), and algorithms for analysis of these signals to identify intelligent language-like characteristics (e.g. Elliott and Atwell 1999, 2000). However, another issue for research and debate is the nature of our response, should a signal arrive and be detected. The design of potentially the most significant communicative act in history should not be decided solely by astrophysicists; the Corpus Linguistics research community has a contribution to make to what is essentially a Corpus design and implementation project. Vakoch (1998) advocated that the message constructed to transmit to extraterrestrials should include a broad, representative collection of perspectives rather than a single viewpoint or genre; this should strike a chord with Corpus Linguists for whom a central principle is that a corpus must be "balanced" to be representative (Meyer 2001). One idea favoured by SETI researchers is to transmit an encyclopaedia summarising human knowledge, such as the Encyclopaedia Britannica, to give ET communicators an overview and "training set" key to analysis of subsequent messages. Furthermore, this should be sent in several versions in parallel: the text; page-images, to include illustrations left out of the text-file; and perhaps some sort of abstract linguistic representation of the text, using a functional or logic language (Ollongren 1999, Freudenthal 1960). The idea of "enriching" the message corpus with annotations at several levels should also strike a chord with Corpus Linguists who have long known that natural language exhibits highly complex multi-layered sequencing, structural and functional patterns, as difficult to model as sequences and structures found in more traditional physical and biological sciences.
    Some corpora have been annotated with several levels or layers of linguistic knowledge, for example the SEC corpus (Taylor and Knowles 1988) and the ISLE corpus (Menzel et al. 2000). Tagged and parsed corpora can be used by corpus linguists as a testbed to guide their development of grammars (e.g. Souter and Atwell 1994); and they can be used to train Natural Language Learning or data-mining models of complex sequence data (e.g. Brill 1993, Hughes 1993, Atwell 1996). Corpus linguists have a range of standards and tools for design and annotation of representative corpus resources, and experience of which annotation types are more amenable to Natural Language Learning algorithms. An Advisory panel of corpus linguists could help design and implement an extended Multi-annotated Interstellar Corpus of English, incorporating ideas from Corpus Linguistics such as:
    - Augment the Encyclopaedia Britannica with a collection of samples representing the diversity of language in real use.
    - As an additional "key", transmit a dictionary aimed at language learners which has also been a rich source for NLP
    - Supply our ET communicators with several levels of linguistic annotation, to give them a richer training set for their
    - Add translations of the English text into other human languages: Humanity should not be represented by English alone,
    This calls for a large-scale corpus annotation project, requiring an Interstellar Corpus Advisory Panel, analogous to the BNC or MATE advisory panels, to include experts in English grammar and semantics, English language learning, computational Natural Language Learning algorithms, and corpus design, implementation, annotation, standardisation, and analysis.

  3. Magnetic resonance imaging and clinical findings in a miniature Schnauzer with hypodipsic hypernatremia.

    PubMed

    Shimokawa Miyama, Takako; Iwamoto, Emiko; Umeki, Saori; Nakaichi, Munekazu; Okuda, Masaru; Mizuno, Takuya

    2009-10-01

    A 6-month-old miniature Schnauzer presented with hypernatremia and clinical signs of vomiting, diarrhea, inappetence, and lethargy. The dog did not consume water on its own. Hypernatremia and the related clinical signs were resolved by fluid administration. Endocrinological investigations and urinalysis excluded the possibility of diabetes insipidus and hyperaldosteronism. Therefore, the dog was diagnosed with hypodipsic hypernatremia. Magnetic resonance imaging revealed dysgenesis of the corpus callosum and other forebrain structures. On the basis of these findings, congenital brain malformation associated with failure of the osmoreceptor system was suspected.

  4. Has Corpus-Based Instruction Reached a Tipping Point? Practical Applications and Pointers for Teachers

    ERIC Educational Resources Information Center

    Huang, Li-Shih

    2017-01-01

    This article provides an easy introduction into corpus-based instruction by explaining what the approach entails. It also presents key terms and discusses key theoretical concepts drawn from the literature; from these, practical applications and pointers are offered for those practitioners wishing to use corpus data or implement corpus-based…

  5. Motivating College Students' Learning English for Specific Purposes Courses through Corpus Building

    ERIC Educational Resources Information Center

    Wu, Lin-Fang

    2014-01-01

    This study was conducted to determine how to motivate technical college students to learn English for specific purposes (ESP) courses through corpus building and enhance their language proficiency during the coursework for their majors. This study explores corpus building skills, how to simplify ESP courses by corpus building for English as second…

  6. Corpus Linguistics and Language Testing: Navigating Uncharted Waters

    ERIC Educational Resources Information Center

    Egbert, Jesse

    2017-01-01

    The use of corpora and corpus linguistic methods in language testing research is increasing at an accelerated pace. The growing body of language testing research that uses corpus linguistic data is a testament to their utility in test development and validation. Although there are many reasons to be optimistic about the future of using corpus data…

  7. 33 CFR 165.808 - Corpus Christi Ship Channel, Corpus Christi, TX, safety zone.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 33 Navigation and Navigable Waters 2 2010-07-01 2010-07-01 false Corpus Christi Ship Channel, Corpus Christi, TX, safety zone. 165.808 Section 165.808 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) PORTS AND WATERWAYS SAFETY REGULATED NAVIGATION AREAS AND LIMITED ACCESS AREAS Specific Regulated Navigatio...

  8. Corpus Based Authenicity Analysis of Language Teaching Course Books

    ERIC Educational Resources Information Center

    Peksoy, Emrah; Harmaoglu, Özhan

    2017-01-01

    In this study, the resemblance of the language learning course books used in Turkey to authentic language spoken by native speakers is explored by using a corpus-based approach. For this, the 10-million-word spoken part of the British National Corpus was selected as reference corpus. After that, all language learning course books used in high…

  9. Using a Corpus in a 300-Level Spanish Grammar Course

    ERIC Educational Resources Information Center

    Benavides, Carlos

    2015-01-01

    The present study examined the use and effectiveness of a large corpus--the Corpus del Español (Davies, 2002)--in a 300-level Spanish grammar university course. Students conducted hands-on corpus searches with the goal of finding concordances containing particular types of collocations (combinations of words that tend to co-occur) and tokens (any…

  10. Computer-assisted Lemmatisation of a Cornish Text Corpus for Lexicographical Purposes

    ERIC Educational Resources Information Center

    Mills, Jon

    2002-01-01

    This project sets out to discover and develop techniques for the lemmatisation of a historical corpus of the Cornish language in order that a lemmatised dictionary macrostructure can be generated from the corpus. The system should be capable of uniquely identifying every lexical item that is attested in the corpus. A survey of published and…

  11. Effects of aging and gender on interhemispheric function.

    PubMed

    Bellis, T J; Wilber, L A

    2001-04-01

    The ability of the two hemispheres of the brain to communicate with one another via the corpus callosum is important for a wide variety of sensory, motor, and cognitive functions, many of them communication related. Anatomical evidence suggests that aging results in structural changes in the corpus callosum and that the course over time of age-related changes in corpus callosum structure may depend on the gender of the individual. Further, it has been hypothesized that age- and gender-related changes in corpus callosum structure may result in concomitant decreased performance on tasks that are reliant on interhemispheric integrity. The purpose of this study was to investigate the effects of age and gender on auditory behavioral and visuomotor temporal indices of interhemispheric function across the life span of the normal adult. Results from 120 consistently right-handed adults from age 20 to 75 years revealed that interhemispheric integrity, as measured by dichotic listening, auditory temporal patterning, and visuomotor interhemispheric transfer time tasks, decreases relatively early in the adult life span (i.e., between the ages of 40 and 55 years) and shows no further decrease thereafter. In addition, the course over time of interhemispheric decline is different for men compared to women for some tasks. These findings suggest that decreased interhemispheric function may be a possible factor contributing to auditory and communication difficulties experienced by aging adults. In addition, results of this study hold implications for the clinical assessment of interhemispheric function in aging adults and for future research into the functional ramifications of decreased multimodality interhemispheric transfer.

  12. Cytotoxic Lesions of the Corpus Callosum That Show Restricted Diffusion: Mechanisms, Causes, and Manifestations.

    PubMed

    Starkey, Jay; Kobayashi, Nobuo; Numaguchi, Yuji; Moritani, Toshio

    2017-01-01

    Cytotoxic lesions of the corpus callosum (CLOCCs) are secondary lesions associated with various entities. CLOCCs have been found in association with drug therapy, malignancy, infection, subarachnoid hemorrhage, metabolic disorders, trauma, and other entities. In all of these conditions, cell-cytokine interactions lead to markedly increased levels of cytokines and extracellular glutamate. Ultimately, this cascade can lead to dysfunction of the callosal neurons and microglia. Cytotoxic edema develops as water becomes trapped in these cells. On diffusion-weighted magnetic resonance (MR) images, CLOCCs manifest as areas of low diffusion. CLOCCs lack enhancement on contrast material-enhanced images, tend to be midline, and are relatively symmetric. The involvement of the corpus callosum typically shows one of three patterns: (a) a small round or oval lesion located in the center of the splenium, (b) a lesion centered in the splenium but extending through the callosal fibers laterally into the adjacent white matter, or (c) a lesion centered posteriorly but extending into the anterior corpus callosum. CLOCCs are frequently but not invariably reversible. Their pathologic mechanisms are discussed, the typical MR imaging findings are described, and typical cases of CLOCCs are presented. Although CLOCCs are nonspecific with regard to the underlying cause, additional imaging findings and the clinical findings can aid in making a specific diagnosis. Radiologists should be familiar with the imaging appearance of CLOCCs to avoid a misdiagnosis of ischemia. When CLOCCs are found, the underlying cause of the lesion should be sought and addressed. © RSNA, 2017 An earlier incorrect version of this article appeared online. This article was corrected on February 13, 2017.

  13. Metastatic Adenoid Cystic Carcinoma Mimicking Butterfly Glioblastoma: A Rare Presentation in the Splenium of the Corpus Callosum.

    PubMed

    Garber, Sarah T; Khoury, Laith; Bell, Diana; Schomer, Donald F; Janku, Filip; McCutcheon, Ian E

    2016-11-01

    Intracranial spread of an adenoid cystic carcinoma (ACC) of the parotid gland is rare, and metastatic ACC to the splenium of the corpus callosum mimicking butterfly glioblastoma (GBM) has not been reported previously. We report a rare case of metastasis to the splenium of the corpus callosum from ACC of the parotid gland. The tumor occupied the splenium and mimicked the presentation of a butterfly glioma. The patient had undergone parotidectomy 5 years before presentation with this intracranial lesion. On magnetic resonance imaging, the lesion was separate from the pineal gland and displaced the internal cerebral veins downward. Ventricular obstruction and increased cellularity were also suggested, and multiple fluid-filled cystic spaces were observed. The tumor was partially resected, because the extreme lateral boundary could not be visualized. Histological analysis with anti-c-kit antibody showed strong expression of the epithelial component; immunohistochemistry with anti-p63 antibody revealed nests of positive tumor cells, highlighting the myoepithelial component. The tumor also stained positive for anti-Myb antibody. The treatment for this lesion is surgical debulking followed by radiation therapy; however, the overall prognosis remains grim because of limited chemotherapy options and a propensity for recurrence in both local and distant fashions. When a tumor with adenoid histological features and a "butterfly" phenotype grows in the corpus callosum in a patient with known parotid ACC, both metastasis and adenoid variant GBM should be considered. Careful clinical and radiological correlation is required to diagnose and treat this rare lesion. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. Mild encephalopathy/encephalitis with a reversible splenial lesion (MERS): A report of five neonatal cases.

    PubMed

    Sun, Dan; Chen, Wen-Hong; Baralc, Suraj; Wang, Juan; Liu, Zhi-Sheng; Xia, Yuan-Peng; Chen, Lei

    2017-06-01

    Mild encephalopathy/encephalitis with a reversible splenial (MERS) lesion is a clinico-radiological entity. The clinical features of MERS in neonates have not yet been systematically reported. This paper presents five cases of MERS, and the up-to-date reviews of previously reported cases were collected and analyzed in the literature. Here we describe five cases clinically diagnosed with MERS. All of them were neonates and the average age was about 4 days. They were admitted for the common neurological symptoms such as hyperspasmia, poor reactivity and delirium. Auxiliary examinations during hospitalization also exhibited features in common. In this report, we reached the following conclusions. Firstly, magnetic resonance imaging revealed solitary or comprehensive lesions in the splenium of the corpus callosum, some of them extending to almost the whole corpus callosum. The lesions showed low intensity signal on T1-weighted images, homogeneously hyperintense signal on T2-weighted images, fluid-attenuated inversion recovery and diffusion-weighted images, and exhibited an obvious reduced diffusion on apparent diffusion coefficient map. Moreover, the lesions in the magnetic resonance imaging disappeared very quickly even prior to the clinical recovery. Secondly, all the cases depicted here suffered electrolyte disturbances, especially hyponatremia, which could be easily corrected. Lastly, all of the cases recovered quickly over one week to one month and the majority of them exhibited signs of infections and normal electroencephalography.

  15. Synonym extraction and abbreviation expansion with ensembles of semantic spaces.

    PubMed

    Henriksson, Aron; Moen, Hans; Skeppstedt, Maria; Daudaravičius, Vidas; Duneld, Martin

    2014-02-05

    Terminologies that account for variation in language use by linking synonyms and abbreviations to their corresponding concept are important enablers of high-quality information extraction from medical texts. Due to the use of specialized sub-languages in the medical domain, manual construction of semantic resources that accurately reflect language use is both costly and challenging, often resulting in low coverage. Although models of distributional semantics applied to large corpora provide a potential means of supporting development of such resources, their ability to isolate synonymy from other semantic relations is limited. Their application in the clinical domain has also only recently begun to be explored. Combining distributional models and applying them to different types of corpora may lead to enhanced performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. A combination of two distributional models - Random Indexing and Random Permutation - employed in conjunction with a single corpus outperforms using either of the models in isolation. Furthermore, combining semantic spaces induced from different types of corpora - a corpus of clinical text and a corpus of medical journal articles - further improves results, outperforming a combination of semantic spaces induced from a single source, as well as a single semantic space induced from the conjoint corpus. A combination strategy that simply sums the cosine similarity scores of candidate terms is generally the most profitable out of the ones explored. Finally, applying simple post-processing filtering rules yields substantial performance gains on the tasks of extracting abbreviation-expansion pairs, but not synonyms. The best results, measured as recall in a list of ten candidate terms, for the three tasks are: 0.39 for abbreviations to long forms, 0.33 for long forms to abbreviations, and 0.47 for synonyms. 
This study demonstrates that ensembles of semantic spaces can yield improved performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. This notion, which merits further exploration, allows different distributional models - with different model parameters - and different types of corpora to be combined, potentially allowing enhanced performance to be obtained on a wide range of natural language processing tasks.
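
    The combination strategy described in this record (summing the cosine similarity scores of candidate terms across semantic spaces) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the toy vectors, and the choice to score only terms present in every space. It is not the authors' implementation, and the vectors are not induced by Random Indexing or Random Permutation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(query, spaces, top_k=10):
    """Rank candidate terms by the SUM of their cosine similarities to
    `query` across several semantic spaces.

    `spaces` is a list of dicts mapping term -> vector. The spaces may have
    different dimensionalities. Only terms found in every space are scored
    (an assumption of this sketch).
    """
    shared = set.intersection(*(set(space) for space in spaces))
    if query not in shared:
        return []
    scores = {
        term: sum(cosine(space[query], space[term]) for space in spaces)
        for term in shared
        if term != query
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hand-made toy vectors standing in for two induced semantic spaces
# (hypothetical data, e.g. one clinical corpus and one journal corpus).
clinical_space = {
    "mi": [1.0, 0.0],
    "myocardial infarction": [0.9, 0.1],
    "fracture": [0.0, 1.0],
}
journal_space = {
    "mi": [1.0, 0.0, 0.0],
    "myocardial infarction": [0.8, 0.2, 0.0],
    "fracture": [0.0, 0.0, 1.0],
}

ranking = rank_candidates("mi", [clinical_space, journal_space])
```

    Summing keeps each space's evidence on the same cosine scale, so a candidate that is moderately similar in both spaces can outrank one that is similar in only one of them.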

  16. Synonym extraction and abbreviation expansion with ensembles of semantic spaces

    PubMed Central

    2014-01-01

    Background: Terminologies that account for variation in language use by linking synonyms and abbreviations to their corresponding concept are important enablers of high-quality information extraction from medical texts. Due to the use of specialized sub-languages in the medical domain, manual construction of semantic resources that accurately reflect language use is both costly and challenging, often resulting in low coverage. Although models of distributional semantics applied to large corpora provide a potential means of supporting development of such resources, their ability to isolate synonymy from other semantic relations is limited. Their application in the clinical domain has also only recently begun to be explored. Combining distributional models and applying them to different types of corpora may lead to enhanced performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. Results: A combination of two distributional models – Random Indexing and Random Permutation – employed in conjunction with a single corpus outperforms using either of the models in isolation. Furthermore, combining semantic spaces induced from different types of corpora – a corpus of clinical text and a corpus of medical journal articles – further improves results, outperforming a combination of semantic spaces induced from a single source, as well as a single semantic space induced from the conjoint corpus. A combination strategy that simply sums the cosine similarity scores of candidate terms is generally the most profitable out of the ones explored. Finally, applying simple post-processing filtering rules yields substantial performance gains on the tasks of extracting abbreviation-expansion pairs, but not synonyms. The best results, measured as recall in a list of ten candidate terms, for the three tasks are: 0.39 for abbreviations to long forms, 0.33 for long forms to abbreviations, and 0.47 for synonyms. Conclusions: This study demonstrates that ensembles of semantic spaces can yield improved performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. This notion, which merits further exploration, allows different distributional models – with different model parameters – and different types of corpora to be combined, potentially allowing enhanced performance to be obtained on a wide range of natural language processing tasks. PMID:24499679
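
    The evaluation measure mentioned in this record, recall in a list of ten candidate terms, can be sketched as follows. The abstract does not say how scores are averaged, so this sketch assumes a simple mean over query terms; the function name and the toy data are hypothetical.

```python
def recall_at_k(ranked, gold, k=10):
    """Mean, over query terms, of the share of gold-standard terms that
    appear among the first k ranked suggestions.

    ranked: dict mapping query term -> ranked list of suggested terms
    gold:   dict mapping query term -> set of correct terms
    """
    per_query = []
    for query, correct in gold.items():
        if not correct:
            continue  # skip queries that have no gold-standard answer
        top = set(ranked.get(query, [])[:k])
        per_query.append(len(top & correct) / len(correct))
    return sum(per_query) / len(per_query) if per_query else 0.0


# Made-up example: two abbreviations with their gold-standard long forms.
gold = {
    "mi": {"myocardial infarction", "heart attack"},
    "bp": {"blood pressure"},
}
ranked = {
    "mi": ["myocardial infarction", "migraine"],
    "bp": ["blood pressure", "biopsy"],
}

score = recall_at_k(ranked, gold)
```

    In this toy case one of the two gold terms for "mi" and the single gold term for "bp" are retrieved, so the per-query recalls are 0.5 and 1.0.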

  17. 31 CFR 358.19 - Who is responsible for any loss resulting from the conversion of a bearer corpus missing callable...

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...

  18. 31 CFR 358.19 - Who is responsible for any loss resulting from the conversion of a bearer corpus missing callable...

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...

  19. 31 CFR 358.19 - Who is responsible for any loss resulting from the conversion of a bearer corpus missing callable...

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...

  20. 31 CFR 358.19 - Who is responsible for any loss resulting from the conversion of a bearer corpus missing callable...

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...

  1. 31 CFR 358.19 - Who is responsible for any loss resulting from the conversion of a bearer corpus missing callable...

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...

  2. Separating Fact and Fiction: The Real Story of Corpus Use in Language Teaching

    ERIC Educational Resources Information Center

    Boulton, Alex

    2013-01-01

    This paper investigates uses of corpora in language learning ("data-driven learning") through analysis of a 600K-word corpus of empirical research papers in the field. The corpus can tell us much--the authors and the countries the studies are conducted in, the types of publication, and so on. The corpus investigation itself starts with…

  3. Recent Developments in Corpus Linguistics and Corpus-Based Research/Department of Linguistics and Modern Language Studies at the Hong Kong Institute of Education

    ERIC Educational Resources Information Center

    Xie, Qin

    2015-01-01

    Corpus linguistics has transformed the landscape of empirical research on languages in recent decades. The proliferation of corpus technology has enabled researchers worldwide to conduct research in their own geographical locations with few hindrances. It has become increasingly commonplace for researchers to compile their own corpora for specific…

  4. The Effects of Utilizing Corpus Resources to Correct Collocation Errors in L2 Writing--Students' Performance, Corpus Use and Perceptions

    ERIC Educational Resources Information Center

    Wu, Yi-ju

    2016-01-01

    Data-Driven Learning (DDL), in which learners "confront [themselves] directly with the corpus data" (Johns, 2002, p. 108), has been shown to be effective in collocation learning in L2 writing. Nevertheless, there have been only a few research studies of this type examining the relationship between English proficiency and corpus consultation.…

  5. Contrast radiographic study of venous drainage of the corpus cavernosum and the corpus spongiosum of the cat penis.

    PubMed

    Amiri, Ali Akbar; Gilanpour, Hassan; Veshkini, Abbas

    2014-01-01

    The aim of this study was to determine the drainage routes of the corpus cavernosum penis and the corpus spongiosum penis in the cat using contrast cavernosography. Five male cats, 1.5-2.5 years old, weighing between 4.5 and 5.5 kg were investigated. The cats were anesthetized and the root and the proximal part of the penis were exposed by an incision on the perineum reaching the scrotum. Each cat was radiographed in lateral and dorsal recumbency before and during injection of contrast medium into the erectile bodies. The corpus spongiosum penis was injected at the bulb of the penis and the corpus cavernosum penis at the root. Injection of contrast media into the cavernous bodies showed that both the external and internal iliac veins drain the erectile bodies into the caudal vena cava. Drainage from the corpus spongiosum penis was from the bulb for the proximal part and from the glans for the distal part. The corpus cavernosum penis was drained only proximally, from the crura. There was a network of veins above the pelvic symphysis, and the drainage of the erectile bodies was through various routes into the internal and external iliac veins.

  6. A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools

    PubMed Central

    2012-01-01

    Background: We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results: Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions: The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications. PMID:22901054

  7. Acceptance and Commitment Therapy in Improving Well-Being in Patients With Stage III-IV Cancer and Their Partners

    ClinicalTrials.gov

    2018-02-06

    Malignant Female Reproductive System Neoplasm; Malignant Hepatobiliary Neoplasm; Partner; Stage III Breast Cancer; Stage III Cervical Cancer; Stage III Colorectal Cancer; Stage III Lung Cancer; Stage III Prostate Cancer; Stage III Skin Melanoma; Stage III Uterine Corpus Cancer; Stage IIIA Breast Cancer; Stage IIIA Cervical Cancer; Stage IIIA Colorectal Cancer; Stage IIIA Lung Carcinoma; Stage IIIA Skin Melanoma; Stage IIIA Uterine Corpus Cancer; Stage IIIB Breast Cancer; Stage IIIB Cervical Cancer; Stage IIIB Colorectal Cancer; Stage IIIB Lung Carcinoma; Stage IIIB Skin Melanoma; Stage IIIB Uterine Corpus Cancer; Stage IIIC Breast Cancer; Stage IIIC Colorectal Cancer; Stage IIIC Skin Melanoma; Stage IIIC Uterine Corpus Cancer; Stage IV Breast Cancer; Stage IV Cervical Cancer; Stage IV Colorectal Cancer; Stage IV Lung Cancer; Stage IV Prostate Cancer; Stage IV Skin Melanoma; Stage IV Uterine Corpus Cancer; Stage IVA Cervical Cancer; Stage IVA Colorectal Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Cervical Cancer; Stage IVB Colorectal Cancer; Stage IVB Uterine Corpus Cancer

  8. A corpus and a concordancer of academic journal articles.

    PubMed

    Kwary, Deny A

    2018-02-01

    This data article presents a corpus (i.e. a large collection of words in electronic form) and a concordancer (i.e. a tool to show a word in its context of use) of academic journal articles. As the title suggests, the data were collected from research articles published in academic journals. The corpus contains 5,686,428 words selected from 895 journal articles published by Elsevier in 2011-2015. The corpus is classified into four subject areas: Health Sciences, Life Sciences, Physical Sciences, and Social Sciences, following the classifications of Scopus, which is the largest abstract and citation database of peer-reviewed scientific journals, books and conference proceedings. To ease the access and utilization of the corpus, a program to produce the key word in context (KWIC) and word frequency was created and placed on the website: corpus.kwary.net. The corpus is a valuable resource for researchers, teachers, and translators working on academic English.
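
    The two outputs named in this record, key word in context (KWIC) and word frequency, can be sketched in a few lines. This is an illustrative sketch only, not the program hosted at corpus.kwary.net; the tokenizer, the function names, and the sample sentence are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every occurrence
    of `keyword`, with up to `width` tokens of context on each side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Made-up sample text standing in for a corpus file.
sample = (
    "The corpus is a valuable resource. "
    "A concordancer shows each word of the corpus in context."
)
tokens = tokenize(sample)
concordance = kwic(tokens, "corpus")
frequencies = Counter(tokens)

for left, key, right in concordance:
    print(f"{left:>20}  [{key}]  {right}")
```

    A real concordancer would usually index the corpus once instead of scanning the token list for each query, but the linear scan keeps the idea visible.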

  9. Hemoperitoneum from corpus luteum rupture in patients with aplastic anemia.

    PubMed

    Wang, Huaquan; Guo, Lifang; Shao, Zonghong

    2015-01-01

    Aplastic anemia is a rare hematopoietic stem-cell disorder that results in pancytopenia and hypocellular bone marrow. Women with aplastic anemia usually are at increased risk of corpus luteum rupture due to thrombocytopenia and infection. Here we report two cases of hemoperitoneum from corpus luteum rupture in patients with aplastic anemia at our center. Case 1 involved two episodes of hemoperitoneum resulting from rupture of the corpus luteum in a 23-year-old unmarried female with severe aplastic anemia. This patient was managed conservatively with platelet and packed red cell transfusion. Case 2 involved two episodes of hemoperitoneum resulting from rupture of the corpus luteum in a 33-year-old married patient with aplastic anemia. Emergency laparoscopy revealed massive hemoperitoneum. Bilateral salpingo-oophorectomy were performed successively with platelet and packed red cell transfusion. Hemoperitoneum resulting from a ruptured corpus luteum is a life-threatening condition in patients with aplastic anemia. Prompt and appropriate evaluation of corpus luteum rupture and emergent therapy are needed.

  10. Creating a Gold Standard for the Readability Measurement of Health Texts

    PubMed Central

    Kandula, Sasikiran; Zeng-Treitler, Qing

    2008-01-01

    Developing easy-to-read health texts for consumers continues to be a challenge in health communication. Though readability formulae such as Flesch-Kincaid Grade Level have been used in many studies, they were found to be inadequate to estimate the difficulty of some types of health texts. One impediment to the development of new readability assessment techniques is the absence of a gold standard that can be used to validate them. To overcome this deficiency, we have compiled a corpus of 324 health documents consisting of six different types of texts. These documents were manually reviewed and assigned a readability level (1-7 Likert scale) by a panel of five health literacy experts. The expert assigned ratings were found to be highly correlated with a patient representative’s readability ratings (r = 0.81, p<0.0001). PMID:18999150
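
    For reference, the Flesch-Kincaid Grade Level formula mentioned in this record can be written as a short function. The coefficients are those of the standard published formula; the function name and the example counts are illustrative, and counting syllables automatically (which needs a heuristic or a pronunciation dictionary) is deliberately left out of this sketch.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid Grade Level computed from raw counts.

    FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Made-up counts for a short passage: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(100, 5, 150)
```

    Because the formula only sees sentence length and syllable counts, two health texts with the same score can differ widely in vocabulary familiarity, which is consistent with the record's point that such formulae can misjudge some types of health text.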

  11. [Gastric cancer risk estimate in patients with chronic gastritis associated with Helicobacter pylori infection in a clinical setting].

    PubMed

    Arismendi-Morillo, G; Hernández, I; Mengual, E; Abreu, N; Molero, N; Fuenmayor, A; Romero, G; Lizarzábal, M

    2013-01-01

    Severity of chronic gastritis associated with Helicobacter pylori infection (CGAHpI) could play a role in evaluating the potential risk to develop gastric cancer. Our aim was to estimate the risk for gastric cancer in a clinical setting, according to histopathologic criteria, by applying the gastric cancer risk index (GCRI). METHODS: Histopathologic study of the gastric biopsies (corpus-antrum) from consecutive adult patients that underwent gastroesophageal duodenoscopy was carried out, and the GCRI was applied in patients presenting with CGAHpI. One hundred eleven patients (77% female) with a mean age of 38.6±13.1 years were included. Active Helicobacter pylori infection (aHpi) was diagnosed in 77 cases (69.40%). In 45% of the cases with aHpi, pangastritis (23%) or corpus-predominant gastritis (22%) was diagnosed. Nine cases were diagnosed with intestinal metaplasia (8%), 7 of which (77.70%) were in the aHpi group. Twenty-one percent of the patients with aHpi had a GCRI of 2 (18.10%) or 3 (2.50%) points (high risk index), while 79.10% accumulated a GCRI of 0 or 1 points (low risk index). Of the patients with no aHpi, none of them had 3 points (p=0.001). Of the 18 patients that accumulated 2 or 3 points, 6 (33.30%) presented with intestinal metaplasia (all with pangastritis and corpus-predominant gastritis), of which 4 cases (66.60%) had aHpi. The estimated gastric cancer risk in patients with CGAHpI in the clinical setting studied was relatively low and 5% of the patients had a histopathologic phenotype associated with an elevated risk for developing gastric cancer. Copyright © 2012 Asociación Mexicana de Gastroenterología. Published by Masson Doyma México S.A. All rights reserved.

  12. Reduced ventral cingulum integrity and increased behavioral problems in children with isolated optic nerve hypoplasia and mild to moderate or no visual impairment.

    PubMed

    Webb, Emma A; O'Reilly, Michelle A; Clayden, Jonathan D; Seunarine, Kiran K; Dale, Naomi; Salt, Alison; Clark, Chris A; Dattani, Mehul T

    2013-01-01

    To assess the prevalence of behavioral problems in children with isolated optic nerve hypoplasia, mild to moderate or no visual impairment, and no developmental delay. To identify white matter abnormalities that may provide neural correlates for any behavioral abnormalities identified. Eleven children with isolated optic nerve hypoplasia (mean age 5.9 years) underwent behavioral assessment and brain diffusion tensor imaging. Twenty-four controls with isolated short stature (mean age 6.4 years) underwent MRI, 11 of whom also completed behavioral assessments. Fractional anisotropy images were processed using tract-based spatial statistics. Partial correlation between ventral cingulum, corpus callosum and optic radiation fractional anisotropy, and child behavioral checklist scores (controlled for age at scan and sex) was performed. Children with optic nerve hypoplasia had significantly higher scores on the child behavioral checklist (p<0.05) than controls (4 had scores in the clinically significant range). Ventral cingulum, corpus callosum and optic radiation fractional anisotropy were significantly reduced in children with optic nerve hypoplasia. Right ventral cingulum fractional anisotropy correlated with total and externalising child behavioral checklist scores (r = -0.52, p<0.02, r = -0.46, p<0.049 respectively). There were no significant correlations between left ventral cingulum, corpus callosum or optic radiation fractional anisotropy and behavioral scores. Our findings suggest that children with optic nerve hypoplasia and mild to moderate or no visual impairment require behavioral assessment to determine the presence of clinically significant behavioral problems. Reduced structural integrity of the ventral cingulum correlated with behavioral scores, suggesting that these white matter abnormalities may be clinically significant. 
The presence of reduced fractional anisotropy in the optic radiations of children with mild to moderate or no visual impairment raises questions as to the pathogenesis of these changes which will need to be addressed by future studies.
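
    The partial correlation used in this record (a correlation between two measures after controlling for covariates such as age at scan and sex) can be sketched with the standard recursive formula. This is a generic illustration, not the study's analysis pipeline; the function names and the toy numbers are assumptions, and the toy data are not study data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Partial correlation of x and y given a list of covariate sequences.

    Uses the recursion
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    removing one covariate at a time.
    """
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y both rise with z, but move in opposite directions
# once the shared dependence on z is removed.
z = [1, 2, 3, 4]
x = [2, 1, 2, 5]
y = [0, 3, 4, 3]

raw = pearson(x, y)
controlled = partial_corr(x, y, [z])
```

    In a setting like the one described, `covariates` would hold one sequence per control variable (for example age at scan and a 0/1 coding of sex), and the raw and controlled coefficients can differ in sign, as they do for this toy data.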

  13. Flexitouch® Home Maintenance Therapy or Standard Home Maintenance Therapy in Treating Patients With Lower-Extremity Lymphedema Caused by Treatment for Cervical Cancer, Vulvar Cancer, or Endometrial Cancer

    ClinicalTrials.gov

    2014-12-29

    Lymphedema; Stage 0 Cervical Cancer; Stage 0 Uterine Corpus Cancer; Stage 0 Vulvar Cancer; Stage I Uterine Corpus Cancer; Stage I Vulvar Cancer; Stage IA Cervical Cancer; Stage IB Cervical Cancer; Stage II Uterine Corpus Cancer; Stage II Vulvar Cancer; Stage IIA Cervical Cancer; Stage IIB Cervical Cancer; Stage III Cervical Cancer; Stage III Uterine Corpus Cancer; Stage III Vulvar Cancer; Stage IV Uterine Corpus Cancer; Stage IVA Cervical Cancer; Stage IVB Cervical Cancer; Stage IVB Vulvar Cancer

  14. The Wildcat Corpus of Native- and Foreign-Accented English: Communicative Efficiency across Conversational Dyads with Varying Language Alignment Profiles

    ERIC Educational Resources Information Center

    Van Engen, Kristin J.; Baese-Berk, Melissa; Baker, Rachel E.; Choi, Arim; Kim, Midam; Bradlow, Ann R.

    2010-01-01

    This paper describes the development of the Wildcat Corpus of native- and foreign-accented English, a corpus containing scripted and spontaneous speech recordings from 24 native speakers of American English and 52 non-native speakers of English. The core element of this corpus is a set of spontaneous speech recordings, for which a new method of…

  15. An Analysis of the Application of Wikipedia Corpus on the Lexical Learning in the Second Language Acquisition

    ERIC Educational Resources Information Center

    Shi, Jing

    2015-01-01

    Corpus linguistics has transformed linguistic research but has had only a moderate impact on ESL teaching and learning. The Wikipedia Corpus, designed by Mark Davies, is introduced in this essay. The corpus allows teachers to search Wikipedia in a powerful way: they can search by word, phrase, part of speech, and synonyms. Teachers can also find…

  16. Training ESP Students in Corpus Use--Challenges of Using Corpus-Based Exercises with Students of Non-Philological Studies

    ERIC Educational Resources Information Center

    Marinov, Sanja

    2013-01-01

    This paper focuses on planning a series of activities to train learners of undergraduate, non-philological studies in using a small specialised ad hoc corpus and the results they achieved in doing them. The procedure discussed in this paper is a part of a larger project which investigates the possibility of using a small specialised corpus with…

  17. Effects of citicoline therapy on the network connectivity of the corpus callosum in patients with leukoaraiosis

    PubMed Central

    Feng, Liang; Jiang, Hong; Li, Yunxia; Teng, Fei; He, Yusheng

    2017-01-01

    This study aimed to investigate the effects of citicoline therapy on the network connectivity of the corpus callosum in patients with leukoaraiosis (LA) by diffusion tensor imaging (DTI). A total of 30 LA patients with a Fazekas score of 2 to 3 were voluntarily assigned to a citicoline group (n = 14) or a control group (n = 16). In the citicoline group, citicoline was administered at 0.6 g/d for 1 year. In the control group, central nervous system drugs were not to be used, except for sleeping pills and antidepressants. Interventions for pre-existing diseases were to be conducted in both groups. During the periods of citicoline therapy and post-treatment follow-up, cranial magnetic resonance imaging and DTI were routinely performed in these patients, and the genu, body, and splenium of the corpus callosum were selected as the regions of interest (ROIs). The fractional anisotropy (FA) and mean diffusivity (MD) of each ROI were determined with PANDA software. On recruitment, there were no significant differences in the general characteristics, blood biochemical results, cognitive function, or the FA and MD of the corpus callosum between the 2 groups (P > 0.05). After 1 year of treatment, the FA of the corpus callosum reduced gradually, but the MD of the corpus callosum tended to increase in both groups, although significant differences were not observed. However, the reductions in FA of the genu and splenium of the corpus callosum in the citicoline group were significantly lower than in the control group (P < 0.05); the reductions in MD of the genu, body, and splenium of the corpus callosum in the citicoline group were significantly lower than in the control group (P < 0.05). In LA patients, the disruption of the network connectivity of the corpus callosum deteriorates over time. Citicoline treatment may delay the reduction in FA of the corpus callosum, which might be beneficial for the improvement of network connectivity of the corpus callosum. PMID:28121935

  18. Modifications of Erectile Tissue Components in the Penis during the Fetal Period

    PubMed Central

    Gallo, Carla B. M.; Costa, Waldemar S.; Furriel, Angelica; Bastos, Ana L.; Sampaio, Francisco J. B.

    2014-01-01

    Background: The penile erectile tissue has a complex microscopic anatomy with important functions in the mechanism of penile erection. The knowledge of such structures is necessary for understanding the normal physiology of the adult penis. Therefore, it is important to know the changes of these penile structures during fetal development. This study aims to analyze the development of the main components of the erectile tissue, such as collagen, smooth muscle fibers and elastic system fibers, in human fetuses. Methodology/Principal Findings: We studied the penises of 56 human fetuses aged 13 to 36 weeks post-conception (WPC). We used histochemical and immunohistochemical staining, as well as morphometric techniques to analyze the collagen, smooth muscle fibers and elastic system fibers in the corpus cavernosum and in the corpus spongiosum. These elements were identified and quantified as percentages by using the Image J software (NIH, Bethesda, USA). From 13 to 36 WPC, in the corpus cavernosum, the amount of collagen, smooth muscle fibers and elastic system fibers varied from 19.88% to 36.60%, from 4.39% to 29.76% and from 1.91% to 8.92%, respectively. In the corpus spongiosum, the amount of collagen, smooth muscle fibers and elastic system fibers varied from 34.65% to 45.89%, from 0.60% to 11.90% and from 3.22% to 11.93%, respectively. Conclusions: We found a strong correlation between the elements analyzed and fetal age, both in the corpus cavernosum and in the corpus spongiosum. The growth rate of these elements was more intense during the second trimester (13 to 24 WPC) of gestation, both in the corpus cavernosum and in the corpus spongiosum. There is a proportionally greater amount of collagen in the corpus spongiosum than in the corpus cavernosum throughout the fetal period. In the corpus spongiosum, there is about four times more collagen than smooth muscle fibers and elastic system fibers throughout the fetal period studied. PMID:25170760

  19. Effects of castration on penile extracellular matrix morphology in domestic cats.

    PubMed

    Borges, Nathalia Cs; Pereira-Sampaio, Marco A; Pereira, Vivian Alves; Abidu-Figueiredo, Marcelo; Chagas, Maurício Alves

    2017-12-01

    Objectives: This study was undertaken to verify the possible modifications caused by hormonal deprivation in the extracellular matrix in the penises of neutered cats. Methods: Twenty-seven penises from domestic shorthair cats were collected: 14 samples from intact cats and 13 from neutered cats. Sections were stained with Weigert's resorcin-fuchsin, hematoxylin and eosin, and picrosirius red. Histomorphometric analysis was performed using light microscopy and image analysis software. The following parameters were analyzed: density of the elastic fibers and collagen fibers in the corpus spongiosum; density of the elastic fibers in the tunica albuginea of the corpus cavernosum and the tunica albuginea of the corpus spongiosum; luminal area of the urethra; area of the corpus spongiosum; area of the corpus cavernosum; and thickness of the urethral epithelium. The data were analyzed using the Shapiro-Wilk test to verify the normal distribution, and groups were compared using Student's t-test; P <0.05 indicated statistically significant differences. Results: Significant differences were observed between intact cats and neutered cats in the density of elastic fibers in the tunica albuginea of the corpus cavernosum (8.13% ± 1.38% vs 3.11% ± 0.66%), tunica albuginea of the corpus spongiosum (4.37% ± 1.08% vs 3.30% ± 1.01%) and corpus spongiosum (6.28% ± 3.03% vs 4.10% ± 2.19%), and density of collagen fibers in the corpus spongiosum (34.11% ± 10.86% vs 44.21% ± 12.72%). Conclusions and relevance: The results show a significant decrease in the density of the elastic fibers and a significant increase of the density of the collagen fibers in the corpus spongiosum in neutered animals. This suggests that the compliance of the periurethral region is reduced, and these changes could be a predisposing factor for urethral obstructive disease.

  20. EARS2 mutations cause fatal neonatal lactic acidosis, recurrent hypoglycemia and agenesis of corpus callosum.

    PubMed

    Danhauser, Katharina; Haack, Tobias B; Alhaddad, Bader; Melcher, Marlen; Seibt, Annette; Strom, Tim M; Meitinger, Thomas; Klee, Dirk; Mayatepek, Ertan; Prokisch, Holger; Distelmaier, Felix

    2016-06-01

    Mitochondrial aminoacyl tRNA synthetases are essential for organelle protein synthesis. Genetic defects affecting the function of these enzymes may cause pediatric mitochondrial disease. Here, we report on a child with fatal neonatal lactic acidosis and recurrent hypoglycemia caused by mutations in EARS2, encoding mitochondrial glutamyl-tRNA synthetase 2. Brain ultrasound revealed agenesis of corpus callosum. Studies on patient-derived skin fibroblasts showed severely decreased EARS2 protein levels, elevated reactive oxygen species (ROS) production, and altered mitochondrial morphology. Our report further illustrates the clinical spectrum of the severe neonatal-onset form of EARS2 mutations. Moreover, in this case the live-cell parameters appeared to be more sensitive to mitochondrial dysfunction compared to standard diagnostics, which indicates the potential relevance of fibroblast studies in children with mitochondrial diseases.

  1. Sexual dimorphism and handedness in the human corpus callosum based on magnetic resonance imaging.

    PubMed

    Tuncer, M C; Hatipoğlu, E S; Ozateş, M

    2005-08-01

    The corpus callosum (CC) is a major anatomical and functional commissure linking the two cerebral hemispheres. With MR imaging in the sagittal plane, the corpus callosum can be depicted in great detail. Mid-sagittal magnetic resonance images of 80 normal individuals were analyzed to assess whether or not the morphology of the corpus callosum and its parts are related to sex and handedness. The subjects were 40 males (20 right-handers and 20 left-handers) and 40 females (20 right-handers and 20 left-handers). The midsagittal area of the corpus callosum was divided into seven sub-areas using Witelson's method. The most striking morphological changes concerned left-handers, who had larger areas of the anterior body, posterior body and isthmus than right-handers. In addition, right-handed males had larger rostrums and isthmuses than right-handed females. These significantly increased areas were related to handedness in right-handed males. However, left-handed males had larger anterior and posterior bodies than right-handed males. In contrast, there was no significant difference between left-handers and right-handers in females. The areas of the rostrum and posterior body of the corpus callosum increased significantly with sex in males. Moreover, there were no significant age-related changes in the total corpus callosum and sub-areas of the corpus callosum. In conclusion, these anatomical changes in corpus callosum morphology require taking the sexual definition and dominant handedness into consideration.

  2. Vaccine Therapy With or Without Sirolimus in Treating Patients With NY-ESO-1 Expressing Solid Tumors

    ClinicalTrials.gov

    2016-10-03

    Anaplastic Astrocytoma; Anaplastic Oligoastrocytoma; Anaplastic Oligodendroglioma; Estrogen Receptor Negative; Estrogen Receptor Positive; Glioblastoma; Hormone-Resistant Prostate Cancer; Metastatic Prostate Carcinoma; Metastatic Renal Cell Cancer; Recurrent Adult Brain Neoplasm; Recurrent Bladder Carcinoma; Recurrent Breast Carcinoma; Recurrent Colorectal Carcinoma; Recurrent Esophageal Carcinoma; Recurrent Gastric Carcinoma; Recurrent Hepatocellular Carcinoma; Recurrent Lung Carcinoma; Recurrent Melanoma; Recurrent Ovarian Carcinoma; Recurrent Prostate Carcinoma; Recurrent Renal Cell Carcinoma; Recurrent Uterine Corpus Carcinoma; Resectable Hepatocellular Carcinoma; Sarcoma; Stage IA Breast Cancer; Stage IA Ovarian Cancer; Stage IA Uterine Corpus Cancer; Stage IB Breast Cancer; Stage IB Ovarian Cancer; Stage IB Uterine Corpus Cancer; Stage IC Ovarian Cancer; Stage II Uterine Corpus Cancer; Stage IIA Breast Cancer; Stage IIA Lung Carcinoma; Stage IIA Ovarian Cancer; Stage IIB Breast Cancer; Stage IIB Esophageal Cancer; Stage IIB Lung Carcinoma; Stage IIB Ovarian Cancer; Stage IIB Skin Melanoma; Stage IIC Ovarian Cancer; Stage IIC Skin Melanoma; Stage IIIA Breast Cancer; Stage IIIA Esophageal Cancer; Stage IIIA Lung Carcinoma; Stage IIIA Ovarian Cancer; Stage IIIA Skin Melanoma; Stage IIIA Uterine Corpus Cancer; Stage IIIB Breast Cancer; Stage IIIB Esophageal Cancer; Stage IIIB Ovarian Cancer; Stage IIIB Skin Melanoma; Stage IIIB Uterine Corpus Cancer; Stage IIIC Breast Cancer; Stage IIIC Esophageal Cancer; Stage IIIC Ovarian Cancer; Stage IIIC Skin Melanoma; Stage IIIC Uterine Corpus Cancer; Stage IV Bladder Urothelial Carcinoma; Stage IV Esophageal Cancer; Stage IV Ovarian Cancer; Stage IV Prostate Cancer; Stage IV Skin Melanoma; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer

  3. Query-oriented evidence extraction to support evidence-based medicine practice.

    PubMed

    Sarker, Abeed; Mollá, Diego; Paris, Cecile

    2016-02-01

    Evidence-based medicine practice requires medical practitioners to rely on the best available evidence, in addition to their expertise, when making clinical decisions. The medical domain boasts a large amount of published medical research data, indexed in various medical databases such as MEDLINE. As the size of this data grows, practitioners increasingly face the problem of information overload, and past research has established the time-associated obstacles faced by evidence-based medicine practitioners. In this paper, we focus on the problem of automatic text summarisation to help practitioners quickly find query-focused information from relevant documents. We utilise an annotated corpus that is specialised for the task of evidence-based summarisation of text. In contrast to past summarisation approaches, which mostly rely on surface level features to identify salient pieces of texts that form the summaries, our approach focuses on the use of corpus-based statistics, and domain-specific lexical knowledge for the identification of summary contents. We also apply a target-sentence-specific summarisation technique that reduces the problem of underfitting that persists in generic summarisation models. In automatic evaluations run over a large number of annotated summaries, our extractive summarisation technique statistically outperforms various baseline and benchmark summarisation models with a percentile rank of 96.8%. A manual evaluation shows that our extractive summarisation approach is capable of selecting content with high recall and precision, and may thus be used to generate bottom-line answers to practitioners' queries. Our research shows that the incorporation of specialised data and domain-specific knowledge can significantly improve text summarisation performance in the medical domain. Due to the vast amounts of medical text available, and the high growth of this form of data, we suspect that such summarisation techniques will address the time-related obstacles associated with evidence-based medicine. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Insights from a Learner Corpus as Opposed to a Native Corpus about Cohesive Devices in an Academic Writing Context

    ERIC Educational Resources Information Center

    Ersanli, Ceylan Yangin

    2015-01-01

    This study reports on the insights from an EFL learner corpus (a total of 151 essays and 49,690 words) generated from essays collected over the years in a Turkish state university from freshmen students enrolling in the Advanced Writing course. The comparison of cohesive devices in the non-native corpus (NNC) with those in a native corpus (NC)…

  5. The clinical spectrum of mutations in L1, a neuronal cell adhesion molecule

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fransen, E.; Vits, L.; Van Camp, G.

    1996-07-12

    Mutations in the gene encoding the neuronal cell adhesion molecule L1 are responsible for several syndromes with clinical overlap, including X-linked hydrocephalus (XLH, HSAS), MASA (mental retardation, aphasia, shuffling gait, adducted thumbs) syndrome, complicated X-linked spastic paraplegia (SP 1), X-linked mental retardation-clasped thumb (MR-CT) syndrome, and some forms of X-linked agenesis of the corpus callosum (ACC). We review 34 L1 mutations in patients with these phenotypes. 22 refs., 3 figs., 4 tabs.

  6. Identifying Issue Frames in Text

    PubMed Central

    Sagi, Eyal; Diermeier, Daniel; Kaufmann, Stefan

    2013-01-01

    Framing, the effect of context on cognitive processes, is a prominent topic of research in psychology and public opinion research. Research on framing has traditionally relied on controlled experiments and manually annotated document collections. In this paper we present a method that allows for quantifying the relative strengths of competing linguistic frames based on corpus analysis. This method requires little human intervention and can therefore be efficiently applied to large bodies of text. We demonstrate its effectiveness by tracking changes in the framing of terror over time and comparing the framing of abortion by Democrats and Republicans in the U.S. PMID:23874909

  7. Effective Web and Desktop Retrieval with Enhanced Semantic Spaces

    NASA Astrophysics Data System (ADS)

    Daoud, Amjad M.

    We describe the design and implementation of the NETBOOK prototype system for collecting, structuring and efficiently creating semantic vectors for concepts, noun phrases, and documents from a corpus of free full text ebooks available on the World Wide Web. Automatic generation of concept maps from correlated index terms and extracted noun phrases is used to build a powerful conceptual index of individual pages. To ensure scalability of our system, dimension reduction is performed using Random Projection [13]. Furthermore, we present a complete evaluation of the relative effectiveness of the NETBOOK system versus the Google Desktop [8].
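
    The dimension-reduction step mentioned in this abstract can be illustrated with a minimal sketch of Gaussian random projection. This is not the NETBOOK implementation, which the abstract does not describe; the function names, the Gaussian matrix, and the toy vector are assumptions made only for illustration.

```python
import math
import random


def random_projection_matrix(n_features, n_components, seed=0):
    """Build an n_components x n_features matrix of scaled Gaussian entries.

    Hypothetical helper: entries are drawn from N(0, 1) and scaled by
    1/sqrt(n_components), a common choice for random projection.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
        for _ in range(n_components)
    ]


def project(vector, matrix):
    """Map a high-dimensional vector to the lower-dimensional space."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Toy term-frequency vector with 1000 features, reduced to 50 dimensions.
doc_vector = [float(i % 7) for i in range(1000)]
projection = random_projection_matrix(n_features=1000, n_components=50, seed=42)
reduced = project(doc_vector, projection)
```

    The same matrix has to be reused for every vector so that distances between projected vectors remain comparable.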

  8. Endoscopic features of lymphoid follicles in Helicobacter pylori-associated chronic gastritis.

    PubMed

    Hayashi, Seishu; Imamura, Jun; Kimura, Kiminori; Saeki, Shunichi; Hishima, Tsunekazu

    2015-01-01

    Small, round, yellowish-white nodules (YWN) are frequently observed in Helicobacter pylori-associated gastritis. The aim of the present study was to investigate the clinical significance of these YWN. Participants comprised 211 patients with H. pylori-associated gastritis, ranging in age from 23 to 86 years. YWN were detected in 23% of participants, more frequently in women (33%) than in men (12%; P < 0.01). YWN were observed on the antral mucosa in 4.7% of cases, lesser curvature of the corpus mucosa in 20%, greater curvature of the corpus mucosa in 0.9%, and fundic mucosa in 12%. Most YWN located on the antral mucosa showed nodular type, and most YWN located on the corpus mucosa and fundic mucosa showed flat type. On magnifying endoscopy with narrow-band imaging, YWN appeared as round whitish lesions with radial or branching microvessels on the surface and hypovascular globe structures just beneath the surface of the mucosa. Targeted biopsies of YWN revealed lymphoid follicles with lymphocyte infiltration or intense inflammatory cell infiltration. The endoscopic finding of YWN could be observed at any site of the gastric mucosa in H. pylori-associated gastritis, and represented histological lymphoid follicles. © 2014 The Authors. Digestive Endoscopy © 2014 Japan Gastroenterological Endoscopy Society.

  9. Diffusion tensor imaging of cingulum bundle and corpus callosum in schizophrenia vs. bipolar disorder.

    PubMed

    Nenadić, Igor; Hoof, Anna; Dietzek, Maren; Langbein, Kerstin; Reichenbach, Jürgen R; Sauer, Heinrich; Güllmar, Daniel

    2017-08-30

    Both schizophrenia and bipolar disorder show abnormalities of white matter, as seen in diffusion tensor imaging (DTI) analyses of major brain fibre bundles. While studies in each of the two conditions have indicated possible overlap in anatomical location, there are few direct comparisons between the disorders. Also, it is unclear whether phenotypically similar subgroups (e.g. patients with bipolar disorder and psychotic features) might share white matter pathologies or be rather similar. Using region-of-interest (ROI) analysis of white matter with DTI at 3 T, we analysed fractional anisotropy (FA), radial diffusivity (RD), and apparent diffusion coefficient (ADC) of the corpus callosum and cingulum bundle in 33 schizophrenia patients, 17 euthymic (previously psychotic) bipolar disorder patients, and 36 healthy controls. ANOVA analysis showed significant main effects of group for RD and ADC (both elevated in schizophrenia). Across the corpus callosum ROIs, there was no group effect on FA, but there were group effects on RD (elevated in schizophrenia, lower in bipolar disorder) and ADC (higher in schizophrenia, intermediate in bipolar disorder). Our findings show similarities and differences (some gradual) across regions of the two major fibre tracts implicated in these disorders, which would be consistent with a neurobiological overlap of similar clinical phenotypes. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.

  10. Usability-driven pruning of large ontologies: the case of SNOMED CT.

    PubMed

    López-García, Pablo; Boeker, Martin; Illarramendi, Arantza; Schulz, Stefan

    2012-06-01

    To study ontology modularization techniques when applied to SNOMED CT in a scenario in which no previous corpus of information exists and to examine if frequency-based filtering using MEDLINE can reduce subset size without discarding relevant concepts. Subsets were first extracted using four graph-traversal heuristics and one logic-based technique, and were subsequently filtered with frequency information from MEDLINE. Twenty manually coded discharge summaries from cardiology patients were used as signatures and test sets. The coverage, size, and precision of extracted subsets were measured. Graph-traversal heuristics provided high coverage (71-96% of terms in the test sets of discharge summaries) at the expense of subset size (17-51% of the size of SNOMED CT). Pre-computed subsets and logic-based techniques extracted small subsets (1%), but coverage was limited (24-55%). Filtering reduced the size of large subsets to 10% while still providing 80% coverage. Extracting subsets to annotate discharge summaries is challenging when no previous corpus exists. Ontology modularization provides valuable techniques, but the resulting modules grow as signatures spread across subhierarchies, yielding a very low precision. Graph-traversal strategies and frequency data from an authoritative source can prune large biomedical ontologies and produce useful subsets that still exhibit acceptable coverage. However, a clinical corpus closer to the specific use case is preferred when available.
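
    The three quantities measured in this abstract (coverage, size, and precision of an extracted subset) and the frequency-based filtering step can be sketched as follows. The exact definitions used by the authors are not given in the abstract, so the formulas below are assumptions: coverage is the share of test-set concepts found in the subset, precision is the share of subset concepts that occur in the test set, and size is relative to the full ontology. All concept identifiers and counts are invented toy data.

```python
def subset_metrics(subset, test_concepts, ontology_size):
    """Return (coverage, precision, relative_size) for an extracted subset."""
    subset = set(subset)
    test_concepts = set(test_concepts)
    hits = subset & test_concepts
    coverage = len(hits) / len(test_concepts) if test_concepts else 0.0
    precision = len(hits) / len(subset) if subset else 0.0
    relative_size = len(subset) / ontology_size
    return coverage, precision, relative_size


def filter_by_frequency(subset, frequency, min_count):
    """Keep only concepts whose reference-corpus frequency reaches min_count."""
    return {c for c in subset if frequency.get(c, 0) >= min_count}


# Toy data: a 20-concept "ontology", an extracted subset, and a test set.
ontology_size = 20
subset = {"C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8"}
test_concepts = {"C1", "C2", "C3", "C9", "C10"}
frequency = {"C1": 50, "C2": 30, "C3": 2, "C4": 40}  # hypothetical corpus counts

before = subset_metrics(subset, test_concepts, ontology_size)
filtered = filter_by_frequency(subset, frequency, min_count=10)
after = subset_metrics(filtered, test_concepts, ontology_size)
```

    In this toy run the filter shrinks the subset and raises precision, but it also drops a relevant low-frequency concept, which is the trade-off between subset size and coverage that the abstract reports.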

  11. Algorithms and Results of Eye Tissues Differentiation Based on RF Ultrasound

    PubMed Central

    Jurkonis, R.; Janušauskas, A.; Marozas, V.; Jegelevičius, D.; Daukantas, S.; Patašius, M.; Paunksnis, A.; Lukoševičius, A.

    2012-01-01

    Algorithms and software were developed for analysis of B-scan ultrasonic signals acquired from a commercial diagnostic ultrasound system. The algorithms process raw ultrasonic signals in the backscattered spectrum domain, which is obtained using two time-frequency methods: short-time Fourier and Hilbert-Huang transformations. The signals from selected regions of eye tissues are characterized by the following parameters: B-scan envelope amplitude, approximated spectral slope, approximated spectral intercept, mean instantaneous frequency, mean instantaneous bandwidth, and parameters of the Nakagami distribution characterizing the Hilbert-Huang transformation output. The backscattered ultrasound signal parameters characterizing intraocular and orbit tissues were processed by a decision tree data mining algorithm. The pilot trial proved that the applied methods are able to correctly classify signals from corpus vitreum blood, extraocular muscle, and orbit tissues. In 26 cases of ocular tissue classification, one error occurred when tissues were classified into the classes of corpus vitreum blood, extraocular muscle, and orbit tissue. In this pilot classification, the spectral intercept and the Nakagami parameter for the instantaneous frequency distribution of the 1st intrinsic mode function were found to be specific for corpus vitreum blood, orbit and extraocular muscle tissues. We conclude that ultrasound data should be further collected in a clinical database to establish the background for a decision support system for noninvasive ocular tissue differentiation. PMID:22654643

  12. Genetic and phenotypic dissection of 1q43q44 microdeletion syndrome and neurodevelopmental phenotypes associated with mutations in ZBTB18 and HNRNPU.

    PubMed

    Depienne, Christel; Nava, Caroline; Keren, Boris; Heide, Solveig; Rastetter, Agnès; Passemard, Sandrine; Chantot-Bastaraud, Sandra; Moutard, Marie-Laure; Agrawal, Pankaj B; VanNoy, Grace; Stoler, Joan M; Amor, David J; Billette de Villemeur, Thierry; Doummar, Diane; Alby, Caroline; Cormier-Daire, Valérie; Garel, Catherine; Marzin, Pauline; Scheidecker, Sophie; de Saint-Martin, Anne; Hirsch, Edouard; Korff, Christian; Bottani, Armand; Faivre, Laurence; Verloes, Alain; Orzechowski, Christine; Burglen, Lydie; Leheup, Bruno; Roume, Joelle; Andrieux, Joris; Sheth, Frenny; Datar, Chaitanya; Parker, Michael J; Pasquier, Laurent; Odent, Sylvie; Naudion, Sophie; Delrue, Marie-Ange; Le Caignec, Cédric; Vincent, Marie; Isidor, Bertrand; Renaldo, Florence; Stewart, Fiona; Toutain, Annick; Koehler, Udo; Häckl, Birgit; von Stülpnagel, Celina; Kluger, Gerhard; Møller, Rikke S; Pal, Deb; Jonson, Tord; Soller, Maria; Verbeek, Nienke E; van Haelst, Mieke M; de Kovel, Carolien; Koeleman, Bobby; Monroe, Glen; van Haaften, Gijs; Attié-Bitach, Tania; Boutaud, Lucile; Héron, Delphine; Mignot, Cyril

    2017-04-01

    Subtelomeric 1q43q44 microdeletions cause a syndrome associating intellectual disability, microcephaly, seizures and anomalies of the corpus callosum. Despite several previous studies assessing genotype-phenotype correlations, the contribution of genes located in this region to the specific features of this syndrome remains uncertain. Among those, three genes, AKT3, HNRNPU and ZBTB18 are highly expressed in the brain and point mutations in these genes have been recently identified in children with neurodevelopmental phenotypes. In this study, we report the clinical and molecular data from 17 patients with 1q43q44 microdeletions, four with ZBTB18 mutations and seven with HNRNPU mutations, and review additional data from 37 previously published patients with 1q43q44 microdeletions. We compare clinical data of patients with 1q43q44 microdeletions with those of patients with point mutations in HNRNPU and ZBTB18 to assess the contribution of each gene as well as the possibility of epistasis between genes. Our study demonstrates that AKT3 haploinsufficiency is the main driver for microcephaly, whereas HNRNPU alteration mostly drives epilepsy and determines the degree of intellectual disability. ZBTB18 deletions or mutations are associated with variable corpus callosum anomalies with an incomplete penetrance. ZBTB18 may also contribute to microcephaly and HNRNPU to thin corpus callosum, but with a lower penetrance. Co-deletion of contiguous genes has additive effects. Our results confirm and refine the complex genotype-phenotype correlations existing in the 1qter microdeletion syndrome and define more precisely the neurodevelopmental phenotypes associated with genetic alterations of AKT3, ZBTB18 and HNRNPU in humans.

  13. The pearls of using real-world evidence to discover social groups

    NASA Astrophysics Data System (ADS)

    Cardillo, Raymond A.; Salerno, John J.

    2005-03-01

    In previous work, we introduced a new paradigm called Uni-Party Data Community Generation (UDCG) and a new methodology to discover social groups (a.k.a., community models) called Link Discovery based on Correlation Analysis (LDCA). We further advanced this work by experimenting with a corpus of evidence obtained from a Ponzi scheme investigation. That work identified several UDCG algorithms, developed what we called "Importance Measures" to compare the accuracy of the algorithms based on ground truth, and presented a Concept of Operations (CONOPS) that criminal investigators could use to discover social groups. However, that work used a rather small random sample of manually edited documents because the evidence contained far too many OCR and other extraction errors. Deferring the evidence extraction errors allowed us to continue experimenting with UDCG algorithms, but only used a small fraction of the available evidence. In an attempt to discover techniques that are more practical in the near-term, our most recent work focuses on being able to use an entire corpus of real-world evidence to discover social groups. This paper discusses the complications of extracting evidence, suggests a method of performing name resolution, presents a new UDCG algorithm, and discusses our future direction in this area.

  14. It will be a disaster! How people protest against things which have not yet happened.

    PubMed

    Quet, Mathieu

    2015-02-01

    In the field of science and technology studies, recent works have analyzed the multiplication of promises and predictions as a major evolution of science management. The authors involved in this "sociology of technical expectations" have documented the role played by promises in the elaboration of scientific projects and their impact on the social reception of scientific issues. Yet, little attention has been paid to the predictions regarding undesirable technological futures. This article proposes therefore to analyze the discursive and argumentative practices through which journalists, scientists, and politicians denounce and propose to counter a public issue "which does not exist yet": gene doping (no case of gene doping has been recorded to date). After a literature review of the field of the sociology of technological expectations and a presentation of the corpus, the article describes the structure of predictions and analyzes the discursive strategies according to which social actors predict a disaster in the making. The analysis is based on the study of media discourses about gene doping, in a corpus of 163 French language articles from European newspapers, published between 1998 and 2012. © The Author(s) 2014.

  15. Frequency Analysis of the Words in the Academic Word List (AWL) and Non-AWL Content Words in Applied Linguistics Research Papers

    ERIC Educational Resources Information Center

    Vongpumivitch, Viphavee; Huang, Ju-yu; Chang, Yu-Chia

    2009-01-01

    This study is a corpus-based lexical study that aims to explore the use of words in Coxhead's (2000) Academic Word List (AWL) in journal articles in the field of applied linguistics. A 1.5 million-word corpus called the Applied Linguistics Research Articles Corpus (ALC) was created for this study. The corpus consists of 200 research articles that…

  16. The Ubuntu Chat Corpus for Multiparticipant Chat Analysis

    DTIC Science & Technology

    2013-03-01

    Intelligence (www.aaai.org). All rights reserved. the #LINUX corpus (Elsner and Charniak 2010), and the #IPHONE/#PHYSICS/#PYTHON corpus (Adams 2008). For many...made publicly available, making it difficult to comparatively evaluate different techniques. Corpus Description Ubuntu, a Linux-based operating...Kubuntu (Ubuntu with KDE) support #ubuntu-devel 2 112 074 12 140 53.7 2004-10-01 Developmental team coordination #ubuntu+1 1 621 680 26 805 52.6 2007-04-04

  17. Improving Terminology Mapping in Clinical Text with Context-Sensitive Spelling Correction.

    PubMed

    Dziadek, Juliusz; Henriksson, Aron; Duneld, Martin

    2017-01-01

    The mapping of unstructured clinical text to an ontology facilitates meaningful secondary use of health records but is non-trivial due to lexical variation and the abundance of misspellings in hurriedly produced notes. Here, we apply several spelling correction methods to Swedish medical text and evaluate their impact on SNOMED CT mapping; first in a controlled evaluation using medical literature text with induced errors, followed by a partial evaluation on clinical notes. It is shown that the best-performing method is context-sensitive, taking into account trigram frequencies and utilizing a corpus-based dictionary.

  18. Formalized Conflicts Detection Based on the Analysis of Multiple Emails: An Approach Combining Statistics and Ontologies

    NASA Astrophysics Data System (ADS)

    Zakaria, Chahnez; Curé, Olivier; Salzano, Gabriella; Smaïli, Kamel

    In Computer Supported Cooperative Work (CSCW), it is crucial for project leaders to detect conflicting situations as early as possible. Generally, this task is performed manually by studying a set of documents exchanged between team members. In this paper, we propose a full-fledged automatic solution that identifies documents, subjects and actors involved in relational conflicts. Our approach detects conflicts in emails, probably the most popular type of documents in CSCW, but the methods used can handle other text-based documents. These methods rely on the combination of statistical and ontological operations. The proposed solution is decomposed into several steps: (i) we enrich a simple negative emotion ontology with terms occurring in the corpus of emails, (ii) we categorize each conflicting email according to the concepts of this ontology and (iii) we identify emails, subjects and team members involved in conflicting emails using possibilistic description logic and a set of proposed measures. Each of these steps is evaluated and validated on concrete examples. Moreover, this approach's framework is generic and can be easily adapted to domains other than conflicts, e.g. security issues, and extended with operations making use of our proposed set of measures.

  19. 76 FR 18391 - Safety Zone; Texas International Boat Show Power Boat Races; Corpus Christi Marina, Corpus...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-04

    ... temporary safety zone in the Corpus Christi, Texas for North American Tri-Hull Championship scheduled to take place during the Texas International Boat Show. The North American Tri-Hull Championship will...

  20. Eribulin Mesylate and Gemcitabine Hydrochloride in Treating Patients With Metastatic Solid Tumors or Solid Tumors That Cannot be Removed by Surgery

    ClinicalTrials.gov

    2017-09-19

    Adult Solid Neoplasm; Recurrent Ovarian Carcinoma; Recurrent Uterine Corpus Carcinoma; Stage III Ovarian Cancer; Stage III Uterine Corpus Cancer; Stage IV Ovarian Cancer; Stage IV Uterine Corpus Cancer

  1. A Method for Extracting Important Segments from Documents Using Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Suzuki, Daisuke; Utsumi, Akira

    In this paper we propose an extraction-based method for automatic summarization. The proposed method consists of two processes: important segment extraction and sentence compaction. The process of important segment extraction classifies each segment in a document as important or not by Support Vector Machines (SVMs). The process of sentence compaction then determines grammatically appropriate portions of a sentence for a summary according to its dependency structure and the classification result by SVMs. To test the performance of our method, we conducted an evaluation experiment using the Text Summarization Challenge (TSC-1) corpus of human-prepared summaries. The result was that our method achieved better performance than a segment-extraction-only method and the Lead method, especially for sentences only a part of which was included in human summaries. Further analysis of the experimental results suggests that a hybrid method that integrates sentence extraction with segment extraction may generate better summaries.

  2. Clinical characteristics associated with the intracranial dissemination of gliomas.

    PubMed

    Cai, Xu; Qin, Jun-Jie; Hao, Shu-Yu; Li, Huan; Zeng, Chun; Sun, Sheng-Jun; Yu, Lan-Bing; Gao, Zhi-Xian; Xie, Jian

    2018-03-01

    Glioma is the most common malignant tumor of the brain and the intracranial dissemination of gliomas is the late stage of the development of the tumor. However, there is little research in the literature on the occurrence of intracranial dissemination of gliomas. In order to provide a reference for clinical work, we carried out this study on intracranial dissemination of glioma. A total of 629 patients with gliomas who received tumor resection by the same surgeon from August 2010 to September 2015 were included in this study. The authors performed a retrospective review of the patients and the information regarding clinical features, histopathological results, molecular pathologic results and clinical outcomes was collected and analyzed. In this retrospective study, we found that the intracranial dissemination phenomenon occurred in 53 patients (8.43%). We analyzed the clinical characteristics of patients and found that the age at diagnosis (P = 0.011), WHO grade of the tumor (P < 0.001), and involvement of the corpus callosum (P = 0.010) were associated with the occurrence of dissemination. The higher the grade of the tumor, the more prone it was to disseminate. Deletion of 1p/19q had no significant correlation with the intracranial dissemination. MMP9, Ki-67, and EGFR were highly expressed in tumor cells that caused dissemination, and the level of Ki-67 expression was statistically significant (P < 0.01). In our study, older age (>40 years), high pathological grade, invasion of the corpus callosum and high levels of Ki-67 expression were risk factors associated with the intracranial dissemination of gliomas. Copyright © 2018 Elsevier B.V. All rights reserved.

  3. The tolerance of feline corpus and cauda spermatozoa to cryostress.

    PubMed

    Kunkitti, Panisara; Bergqvist, Ann-Sofi; Sjunnesson, Ylva; Johannisson, Anders; Axnér, Eva

    2016-02-01

    Epididymal sperm preservation can be used to avoid the total loss of genetic material in threatened species. Spermatozoa from the corpus, as from the cauda, are motile and can undergo capacitation. Thus, they can potentially be preserved for assisted reproductive technologies. However, cryopreservation of spermatozoa has a direct detrimental effect on sperm quality. The aim of this study was to compare the chromatin stability and the survival rate of spermatozoa from the corpus and cauda epididymis after cryopreservation. Epididymal spermatozoa were collected and cryopreserved from the corpus and cauda of 12 domestic cats. Sperm motility, progressive motility, membrane integrity, acrosome integrity, and DNA integrity were evaluated before and after freezing-thawing. The average total number of spermatozoa collected from the corpus was lower (10.2 × 10^6 ± 7.4) than that from the cauda epididymis (24.9 × 10^6 ± 14.4; P = 0.005). The percentage of spermatozoa with intact DNA did not differ significantly whether it was collected from the corpus or cauda regions and did not decrease after freezing-thawing in either region. However, motility of spermatozoa from both regions was affected by the freezing-thawing process with a significant decline in motility after thaw compared with fresh spermatozoa. A significant difference in the percentage of motile sperm between the corpus and cauda was observed after the freezing-thawing process (P < 0.001). Although sperm motility was lower in postthaw spermatozoa from the corpus epididymidis than from the cauda, the rate of the reduction did not differ between regions. This study indicates that the cryopreservation process does not have a negative effect on chromatin stability of feline epididymal spermatozoa. Spermatozoa from the corpus region have a similar freezability as spermatozoa from the cauda region. Therefore, preservation of spermatozoa from the corpus and the cauda epididymidis might be of value in preserving genetic material from endangered or valuable felids. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. NegBio: a high-performance tool for negation and uncertainty detection in radiology reports.

    PubMed

    Peng, Yifan; Wang, Xiaosong; Lu, Le; Bagheri, Mohammadhadi; Summers, Ronald; Lu, Zhiyong

    2018-01-01

    Negative and uncertain medical findings are frequent in radiology reports, but discriminating them from positive findings remains challenging for information extraction. Here, we propose a new algorithm, NegBio, to detect negative and uncertain findings in radiology reports. Unlike previous rule-based methods, NegBio utilizes patterns on universal dependencies to identify the scope of triggers that are indicative of negation or uncertainty. We evaluated NegBio on four datasets, including two public benchmarking corpora of radiology reports, a new radiology corpus that we annotated for this work, and a public corpus of general clinical texts. Evaluation on these datasets demonstrates that NegBio is highly accurate for detecting negative and uncertain findings and compares favorably to a widely-used state-of-the-art system NegEx (an average of 9.5% improvement in precision and 5.1% in F1-score). https://github.com/ncbi-nlp/NegBio.

  5. Subluxation and semantics: a corpus linguistics study.

    PubMed

    Budgell, Brian

    2016-06-01

    The purpose of this study was to analyze the curriculum of one chiropractic college in order to discover if there were any implicit consensus definitions of the term subluxation. Using the software WordSmith Tools, the corpus of an undergraduate chiropractic curriculum was analyzed by reviewing collocated terms and through discourse analysis of text blocks containing words based on the root 'sublux.' It was possible to identify 3 distinct concepts which were each referred to as 'subluxation:' i) an acute or instantaneous injurious event; ii) a clinical syndrome which manifested post-injury; iii) a physical lesion, i.e. an anatomical or physiological derangement which in most instances acted as a pain generator. In fact, coherent implicit definitions of subluxation exist and may enjoy broad but subconscious acceptance. However, confusion likely arises from failure to distinguish which concept an author or speaker is referring to when they employ the term subluxation.

  6. Hyperlexia and ambient echolalia in a case of cerebral infarction of the left anterior cingulate cortex and corpus callosum.

    PubMed

    Suzuki, Tadashi; Itoh, Shouichi; Hayashi, Mototaka; Kouno, Masako; Takeda, Katsuhiko

    2009-10-01

    We report the case of a 69-year-old woman with cerebral infarction in the left anterior cingulate cortex and corpus callosum. She showed hyperlexia, which was a distinctive reading phenomenon, as well as ambient echolalia. Clinical features also included complex disorders such as visual groping, compulsive manipulation of tools, and callosal disconnection syndrome. She read words written on the cover of a book and repeated words emanating from unrelated conversations around her or from hospital announcements. The combination of these two features due to a focal lesion has never been reported previously. The supplementary motor area may control the execution of established subroutines according to external and internal inputs. Hyperlexia as well as the compulsive manipulation of tools could be interpreted as faulty inhibition of preexisting essentially intact motor subroutines by damage to the anterior cingulate cortex reciprocally interconnected with the supplementary motor area.

  7. Well, Now, Okey Dokey: English Discourse Markers in Spanish Language Medical Consultations

    PubMed Central

    Vickers, Caroline H.; Goble, Ryan

    2013-01-01

    The purpose of this paper is to examine use of English discourse markers in otherwise Spanish language consultations. Data is derived from an audio-recorded corpus of Spanish language consultations that took place in a small community clinic in the United States as well as post-consultation interviews with patients and providers. Through quantification of the use of discourse markers in the corpus and discourse analysis of transcripts, we demonstrate that English-speaking dominant medical providers use English discourse markers more frequently and with a broader range of functions than do Spanish-speaking dominant medical providers and patients. We argue that such use of English discourse markers serves to exacerbate the power relationship between providers and patients even though the use of English discourse markers does not cause overt miscommunication in the ongoing interaction. Implications for providers who use a second language in their medical consultations are discussed. PMID:24347670

  8. The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research

    PubMed Central

    Estivalet, Gustavo L.; Meunier, Fanny

    2015-01-01

    In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development, the specific characteristics on the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional’s corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output result presents all entries found in the corpus with the criteria specified in the input search and can be downloaded as a .csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior. PMID:26630138

  9. The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research.

    PubMed

    Estivalet, Gustavo L; Meunier, Fanny

    2015-01-01

    In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development, the specific characteristics on the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional's corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output result presents all entries found in the corpus with the criteria specified in the input search and can be downloaded as a .csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior.

  10. Agenesis of the corpus callosum and autism: a comprehensive comparison.

    PubMed

    Paul, Lynn K; Corsello, Christina; Kennedy, Daniel P; Adolphs, Ralph

    2014-06-01

    The corpus callosum, with its ∼200 million axons, remains enigmatic in its contribution to cognition and behaviour. Agenesis of the corpus callosum is a congenital condition in which the corpus callosum fails to develop; such individuals exhibit localized deficits in non-literal language comprehension, humour, theory of mind and social reasoning. These findings together with parent reports suggest that behavioural and cognitive impairments in subjects with callosal agenesis may overlap with the profile of autism spectrum disorders, particularly with respect to impairments in social interaction and communication. To provide a comprehensive test of this hypothesis, we directly compared a group of 26 adults with callosal agenesis to a group of 28 adults with a diagnosis of autism spectrum disorder but no neurological abnormality. All participants had full-scale intelligence quotient scores >78 and groups were matched on age, handedness, and gender ratio. Using the Autism Diagnostic Observation Schedule together with current clinical presentation to assess autistic symptomatology, we found that 8/26 (about a third) of agenesis subjects presented with autism. However, more formal diagnosis additionally involving recollective parent-report measures regarding childhood behaviour showed that only 3/22 met complete formal criteria for an autism spectrum disorder (parent reports were unavailable for four subjects). We found no relationship between intelligence quotient and autism symptomatology in callosal agenesis, nor evidence that the presence of any residual corpus callosum differentiated those who exhibited current autism spectrum symptoms from those who did not. Relative to the autism spectrum comparison group, parent ratings of childhood behaviour indicated children with agenesis were less likely to meet diagnostic criteria for autism, even for those who met autism spectrum criteria as adults, and even though there was no group difference in parent report of current behaviours. The findings suggest two broad conclusions. First, they support the hypothesis that congenital disruption of the corpus callosum constitutes a major risk factor for developing autism. Second, they quantify specific features that distinguish autistic behaviour associated with callosal agenesis from autism more generally. Taken together, these two findings also leverage specific questions for future investigation: what are the distal causes (genetic and environmental) determining both callosal agenesis and its autistic features, and what are the proximal mechanisms by which absence of the callosum might generate autistic symptomatology? © The Author (2014). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  11. Information extraction from Italian medical reports: An ontology-driven approach.

    PubMed

    Viani, Natalia; Larizza, Cristiana; Tibollo, Valentina; Napolitano, Carlo; Priori, Silvia G; Bellazzi, Riccardo; Sacchi, Lucia

    2018-03-01

    In this work, we propose an ontology-driven approach to identify events and their attributes from episodes of care included in medical reports written in Italian. For this language, shared resources for clinical information extraction are not easily accessible. The corpus considered in this work includes 5432 non-annotated medical reports belonging to patients with rare arrhythmias. To guide the information extraction process, we built a domain-specific ontology that includes the events and the attributes to be extracted, with related regular expressions. The ontology and the annotation system were constructed on a development set, while the performance was evaluated on an independent test set. As a gold standard, we considered a manually curated hospital database named TRIAD, which stores most of the information written in reports. The proposed approach performs well on the considered Italian medical corpus, with a percentage of correct annotations above 90% for most considered clinical events. We also assessed the possibility to adapt the system to the analysis of another language (i.e., English), with promising results. Our annotation system relies on a domain ontology to extract and link information in clinical text. We developed an ontology that can be easily enriched and translated, and the system performs well on the considered task. In the future, it could be successfully used to automatically populate the TRIAD database. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Extracting PICO Sentences from Clinical Trial Reports using Supervised Distant Supervision

    PubMed Central

    Wallace, Byron C.; Kuiper, Joël; Sharma, Aakash; Zhu, Mingxi (Brian); Marshall, Iain J.

    2016-01-01

    Systematic reviews underpin Evidence Based Medicine (EBM) by addressing precise clinical questions via comprehensive synthesis of all relevant published evidence. Authors of systematic reviews typically define a Population/Problem, Intervention, Comparator, and Outcome (a PICO criteria) of interest, and then retrieve, appraise and synthesize results from all reports of clinical trials that meet these criteria. Identifying PICO elements in the full-texts of trial reports is thus a critical yet time-consuming step in the systematic review process. We seek to expedite evidence synthesis by developing machine learning models to automatically extract sentences from articles relevant to PICO elements. Collecting a large corpus of training data for this task would be prohibitively expensive. Therefore, we derive distant supervision (DS) with which to train models using previously conducted reviews. DS entails heuristically deriving ‘soft’ labels from an available structured resource. However, we have access only to unstructured, free-text summaries of PICO elements for corresponding articles; we must derive from these the desired sentence-level annotations. To this end, we propose a novel method – supervised distant supervision (SDS) – that uses a small amount of direct supervision to better exploit a large corpus of distantly labeled instances by learning to pseudo-annotate articles using the available DS. We show that this approach tends to outperform existing methods with respect to automated PICO extraction. PMID:27746703

  13. Use of "Google Scholar" in Corpus-Driven EAP Research

    ERIC Educational Resources Information Center

    Brezina, Vaclav

    2012-01-01

    This primarily methodological article makes a proposition for linguistic exploration of textual resources available through the "Google Scholar" search engine. These resources ("Google Scholar virtual corpus") are significantly larger than any existing corpus of academic writing. "Google Scholar", however, was not designed for linguistic searches…

  14. Corpus Approaches to Language Ideology

    ERIC Educational Resources Information Center

    Vessey, Rachelle

    2017-01-01

    This paper outlines how corpus linguistics--and more specifically the corpus-assisted discourse studies approach--can add useful dimensions to studies of language ideology. First, it is argued that the identification of words of high, low, and statistically significant frequency can help in the identification and exploration of language ideologies…

  15. Classifying medical relations in clinical text via convolutional neural networks.

    PubMed

    He, Bin; Guan, Yi; Dai, Rui

    2018-05-16

    Deep learning research on relation classification has achieved solid performance in the general domain. This study proposes a convolutional neural network (CNN) architecture with a multi-pooling operation for medical relation classification on clinical records and explores a loss function with a category-level constraint matrix. Experiments using the 2010 i2b2/VA relation corpus demonstrate these models, which do not depend on any external features, outperform previous single-model methods and our best model is competitive with the existing ensemble-based method. Copyright © 2018. Published by Elsevier B.V.

  16. 32 CFR 516.20 - Habeas Corpus.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 32 National Defense 3 2010-07-01 2010-07-01 true Habeas Corpus. 516.20 Section 516.20 National Defense Department of Defense (Continued) DEPARTMENT OF THE ARMY AID OF CIVIL AUTHORITIES AND PUBLIC RELATIONS LITIGATION Reporting Legal Proceedings to HQDA § 516.20 Habeas Corpus. (a) General. A soldier may...

  17. 33 CFR 110.75 - Corpus Christi Bay, Tex.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 33 Navigation and Navigable Waters 1 2011-07-01 2011-07-01 false Corpus Christi Bay, Tex. 110.75 Section 110.75 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND SECURITY ANCHORAGES ANCHORAGE REGULATIONS Special Anchorage Areas § 110.75 Corpus Christi Bay, Tex. (a) South area. Southward of...

  18. 33 CFR 110.75 - Corpus Christi Bay, Tex.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 33 Navigation and Navigable Waters 1 2010-07-01 2010-07-01 false Corpus Christi Bay, Tex. 110.75 Section 110.75 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND SECURITY ANCHORAGES ANCHORAGE REGULATIONS Special Anchorage Areas § 110.75 Corpus Christi Bay, Tex. (a) South area. Southward of...

  19. 33 CFR 110.75 - Corpus Christi Bay, Tex.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 33 Navigation and Navigable Waters 1 2014-07-01 2014-07-01 false Corpus Christi Bay, Tex. 110.75 Section 110.75 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND SECURITY ANCHORAGES ANCHORAGE REGULATIONS Special Anchorage Areas § 110.75 Corpus Christi Bay, Tex. (a) South area. Southward of...

  20. 33 CFR 110.75 - Corpus Christi Bay, Tex.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 33 Navigation and Navigable Waters 1 2013-07-01 2013-07-01 false Corpus Christi Bay, Tex. 110.75 Section 110.75 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND SECURITY ANCHORAGES ANCHORAGE REGULATIONS Special Anchorage Areas § 110.75 Corpus Christi Bay, Tex. (a) South area. Southward of...

  1. 33 CFR 110.75 - Corpus Christi Bay, Tex.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 33 Navigation and Navigable Waters 1 2012-07-01 2012-07-01 false Corpus Christi Bay, Tex. 110.75 Section 110.75 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND SECURITY ANCHORAGES ANCHORAGE REGULATIONS Special Anchorage Areas § 110.75 Corpus Christi Bay, Tex. (a) South area. Southward of...

  2. Marking Importance in Lectures: Interactive and Textual Orientation

    ERIC Educational Resources Information Center

    Deroey, Katrien L. B.

    2015-01-01

    This paper provides a comprehensive overview of lexicogrammatical markers of important lecture points and proposes a classification in terms of their interactive and textual orientation. The importance markers were extracted from the British Academic Spoken English corpus using corpus-driven and corpus-based methods. The classification is based on…

  3. Corpus-Based Investigations of Language Use.

    ERIC Educational Resources Information Center

    Biber, Douglas; And Others

    1996-01-01

    Examines a representative text corpus to gain insights into language structure and use and to open new areas of linguistic inquiry. Various illustrations are presented that provide a glimpse into the value of corpus-based investigations for increasing one's understanding of language use and imparting insights important for designing effective…

  4. The relaxant actions of ethanolic extract of Tridax procumbens (Linn.) on rat corpus cavernosum smooth muscle contraction.

    PubMed

    Salahdeen, Hussein M; Idowu, Gbolahan O; Yemitan, Omoniyi K; Murtala, Babatunde A; Alada, Abdul Rasak A

    2015-03-01

    The effect of Tridax procumbens aqueous ethanolic extract on the rat corpus cavernosum smooth muscles was evaluated in the present study. Corpus cavernosum strips obtained from healthy, young, adult male Wistar albino rats (250-300 g) were precontracted with phenylephrine (10^-7 M) or KCl (60 mM) and then treated with various concentrations of T. procumbens extract (0.15-1.05 mg/mL). The change in corpus cavernosum strip tension was recorded. The interactions between T. procumbens extract with acetylcholine and with sodium nitroprusside were also evaluated. The results indicated that corpus cavernosum strip relaxation induced by T. procumbens extract was concentration-dependent and this was significant (p<0.5). Pre-treatment with a nitric oxide synthase (NOS) inhibitor (N(1) nitro-L-arginine-methyl ester, l-NAME) did not completely inhibit the relaxation. However, T. procumbens extract (0.6 mg/mL) significantly (p<0.5) enhanced both acetylcholine- and sodium nitroprusside-induced corpus cavernosum strip relaxation. The results suggest that T. procumbens extract has a concentration-dependent relaxant effect on the isolated rat corpus cavernosum. The mechanism of action of T. procumbens extract is complex. A part of its relaxing effect is mediated directly by the release of NO from endothelium which may improve erectile dysfunction.

  5. Usability-driven pruning of large ontologies: the case of SNOMED CT

    PubMed Central

    Boeker, Martin; Illarramendi, Arantza; Schulz, Stefan

    2012-01-01

    Objectives To study ontology modularization techniques when applied to SNOMED CT in a scenario in which no previous corpus of information exists and to examine if frequency-based filtering using MEDLINE can reduce subset size without discarding relevant concepts. Materials and Methods Subsets were first extracted using four graph-traversal heuristics and one logic-based technique, and were subsequently filtered with frequency information from MEDLINE. Twenty manually coded discharge summaries from cardiology patients were used as signatures and test sets. The coverage, size, and precision of extracted subsets were measured. Results Graph-traversal heuristics provided high coverage (71–96% of terms in the test sets of discharge summaries) at the expense of subset size (17–51% of the size of SNOMED CT). Pre-computed subsets and logic-based techniques extracted small subsets (1%), but coverage was limited (24–55%). Filtering reduced the size of large subsets to 10% while still providing 80% coverage. Discussion Extracting subsets to annotate discharge summaries is challenging when no previous corpus exists. Ontology modularization provides valuable techniques, but the resulting modules grow as signatures spread across subhierarchies, yielding a very low precision. Conclusion Graph-traversal strategies and frequency data from an authoritative source can prune large biomedical ontologies and produce useful subsets that still exhibit acceptable coverage. However, a clinical corpus closer to the specific use case is preferred when available. PMID:22268217
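The measures named in this abstract (coverage, size, and precision of an extracted subset) can be sketched with plain Python sets. This is a minimal illustration only: the function name `subset_metrics`, the exact definition of precision used here (share of subset concepts that occur in the test set), and the toy concept identifiers are assumptions, not the authors' evaluation code.

```python
def subset_metrics(subset, test_concepts, ontology_size):
    """Return (coverage, relative_size, precision) for an extracted subset.

    coverage      : share of test-set concepts that the subset contains
    relative_size : subset size as a fraction of the full ontology
    precision     : share of subset concepts that occur in the test set
                    (an assumed definition, for illustration only)
    """
    subset = set(subset)
    test_concepts = set(test_concepts)
    hits = subset & test_concepts
    coverage = len(hits) / len(test_concepts) if test_concepts else 0.0
    relative_size = len(subset) / ontology_size
    precision = len(hits) / len(subset) if subset else 0.0
    return coverage, relative_size, precision


# Toy, made-up identifiers (not real SNOMED CT codes).
subset = {"c1", "c2", "c3", "c4", "c5"}
test_concepts = {"c1", "c2", "c9", "c10"}
coverage, relative_size, precision = subset_metrics(
    subset, test_concepts, ontology_size=50
)
print(coverage, relative_size, precision)
```

A large subset tends to raise coverage while lowering precision, which is the trade-off the abstract describes for graph-traversal heuristics.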

  6. Magnetic resonance findings of the corpus callosum in canine and feline lysosomal storage diseases.

    PubMed

    Hasegawa, Daisuke; Tamura, Shinji; Nakamoto, Yuya; Matsuki, Naoaki; Takahashi, Kimimasa; Fujita, Michio; Uchida, Kazuyuki; Yamato, Osamu

    2013-01-01

    Several reports have described magnetic resonance (MR) findings in canine and feline lysosomal storage diseases such as gangliosidoses and neuronal ceroid lipofuscinosis. Although most of those studies described the signal intensities of white matter in the cerebrum, findings of the corpus callosum were not described in detail. A retrospective study was conducted on MR findings of the corpus callosum as well as the rostral commissure and the fornix in 18 cases of canine and feline lysosomal storage diseases. This included 6 Shiba Inu dogs and 2 domestic shorthair cats with GM1 gangliosidosis; 2 domestic shorthair cats, 2 familial toy poodles, and a golden retriever with GM2 gangliosidosis; and 2 border collies and 3 chihuahuas with neuronal ceroid lipofuscinoses, to determine whether changes of the corpus callosum are an imaging indicator of those diseases. The corpus callosum and the rostral commissure were difficult to recognize in all cases of juvenile-onset gangliosidoses (GM1 gangliosidosis in Shiba Inu dogs and domestic shorthair cats and GM2 gangliosidosis in domestic shorthair cats) and GM2 gangliosidosis in toy poodles with late juvenile-onset. In contrast, the corpus callosum and the rostral commissure were confirmed in cases of GM2 gangliosidosis in a golden retriever and canine neuronal ceroid lipofuscinoses with late juvenile- to early adult-onset, but were extremely thin. Abnormal findings of the corpus callosum on midline sagittal images may be a useful imaging indicator for suspecting lysosomal storage diseases, especially hypoplasia (underdevelopment) of the corpus callosum in juvenile-onset gangliosidoses.

  7. Magnetic Resonance Findings of the Corpus Callosum in Canine and Feline Lysosomal Storage Diseases

    PubMed Central

    Hasegawa, Daisuke; Tamura, Shinji; Nakamoto, Yuya; Matsuki, Naoaki; Takahashi, Kimimasa; Fujita, Michio; Uchida, Kazuyuki; Yamato, Osamu

    2013-01-01

    Several reports have described magnetic resonance (MR) findings in canine and feline lysosomal storage diseases such as gangliosidoses and neuronal ceroid lipofuscinosis. Although most of those studies described the signal intensities of white matter in the cerebrum, findings of the corpus callosum were not described in detail. A retrospective study was conducted on MR findings of the corpus callosum as well as the rostral commissure and the fornix in 18 cases of canine and feline lysosomal storage diseases. This included 6 Shiba Inu dogs and 2 domestic shorthair cats with GM1 gangliosidosis; 2 domestic shorthair cats, 2 familial toy poodles, and a golden retriever with GM2 gangliosidosis; and 2 border collies and 3 chihuahuas with neuronal ceroid lipofuscinoses, to determine whether changes of the corpus callosum are an imaging indicator of those diseases. The corpus callosum and the rostral commissure were difficult to recognize in all cases of juvenile-onset gangliosidoses (GM1 gangliosidosis in Shiba Inu dogs and domestic shorthair cats and GM2 gangliosidosis in domestic shorthair cats) and GM2 gangliosidosis in toy poodles with late juvenile-onset. In contrast, the corpus callosum and the rostral commissure were confirmed in cases of GM2 gangliosidosis in a golden retriever and canine neuronal ceroid lipofuscinoses with late juvenile- to early adult-onset, but were extremely thin. Abnormal findings of the corpus callosum on midline sagittal images may be a useful imaging indicator for suspecting lysosomal storage diseases, especially hypoplasia (underdevelopment) of the corpus callosum in juvenile-onset gangliosidoses. PMID:24386203

  8. The Use of a Corpus in Contrastive Studies.

    ERIC Educational Resources Information Center

    Filipovic, Rudolf

    1973-01-01

    Before beginning the Serbocroatian-English Contrastive Project, it was necessary to determine whether to base the analysis on a corpus or on native intuitions. It seemed that the best method would combine the theoretical and the empirical. A translation method based on a corpus of text was adopted. The Brown University "Standard Sample of…

  9. A Corpus-Based Study on English Prepositions of Place, "In" and "On"

    ERIC Educational Resources Information Center

    Arjan, Asmeza; Abdullah, Noor Hayati; Roslim, Norwati

    2013-01-01

    This corpus-based study examined the usage, mastery and developmental pattern (Norwati, 2004) of English prepositions of place, "in" and "on" across three different academic levels namely Form 4, Form 5 and College students. The Malaysian Corpus of Students Argumentative Writing (MCSAW) was used as the source of data in…

  10. Network Analysis with the Enron Email Corpus

    ERIC Educational Resources Information Center

    Hardin, J. S.; Sarkis, G.; URC, P.

    2015-01-01

    We use the Enron email corpus to study relationships in a network by applying six different measures of centrality. Our results came out of an in-semester undergraduate research seminar. The Enron corpus is well suited to statistical analyses at all levels of undergraduate education. Through this article's focus on centrality, students can explore…

  11. What Does Corpus Linguistics Have to Offer to Language Assessment?

    ERIC Educational Resources Information Center

    Xi, Xiaoming

    2017-01-01

    In recent years, continuing advances in technology have increased the capacity to automate the extraction of a range of linguistic features of texts and thus have provided the impetus for the substantial growth of corpus linguistics. While corpus linguistic tools and methods have been used extensively in second language learning research, they…

  12. Lexical Properties of Slovene Sign Language: A Corpus-Based Study

    ERIC Educational Resources Information Center

    Vintar, Špela

    2015-01-01

    Slovene Sign Language (SZJ) has as yet received little attention from linguists. This article presents some basic facts about SZJ, its history, current status, and a description of the Slovene Sign Language Corpus and Pilot Grammar (SIGNOR) project, which compiled and annotated a representative corpus of SZJ. Finally, selected quantitative data…

  13. English Collocation Learning through Corpus Data: On-Line Concordance and Statistical Information

    ERIC Educational Resources Information Center

    Ohtake, Hiroshi; Fujita, Nobuyuki; Kawamoto, Takeshi; Morren, Brian; Ugawa, Yoshihiro; Kaneko, Shuji

    2012-01-01

    We developed an English Collocations On Demand system offering on-line corpus and concordance information to help Japanese researchers acquire a better command of English collocation patterns. The Life Science Dictionary Corpus consists of approximately 90,000,000 words collected from life science related research papers published in academic…

  14. Pattern and Meaning across Genres and Disciplines: An Exploratory Study

    ERIC Educational Resources Information Center

    Groom, Nicholas

    2005-01-01

    Work in corpus linguistics has led to the development of a theory of language as "phraseology" [Hunston, S., & Francis, G. (1999). "Pattern grammar: A corpus-driven approach to the lexical grammar of English." Amsterdam: John Benjamins. Sinclair, J. M. (1991). "Corpus, concordance, collocation." Oxford: Oxford University Press. Sinclair, J. M.…

  15. Corpus Callosum Volume and Neurocognition in Autism

    ERIC Educational Resources Information Center

    Keary, Christopher J.; Minshew, Nancy J.; Bansal, Rahul; Goradia, Dhruman; Fedorov, Serguei; Keshavan, Matcheri S.; Hardan, Antonio Y.

    2009-01-01

    The corpus callosum has recently been considered as an index of interhemispheric connectivity. This study applied a novel volumetric method to examine the size of the corpus callosum in 32 individuals with autism and 34 age-, gender- and IQ-matched controls and to investigate the relationship between this structure and cognitive measures linked to…

  16. Developing Corpus-Based Materials to Teach Pragmatic Routines

    ERIC Educational Resources Information Center

    Bardovi-Harlig, Kathleen; Mossman, Sabrina; Vellenga, Heidi E.

    2015-01-01

    This article describes how to develop teaching materials for pragmatics based on authentic language by using a spoken corpus. The authors show how to use the corpus in conjunction with textbooks to identify pragmatic routines for speech acts and how to extract appropriate language samples and adapt them for classroom use. They demonstrate how to…

  17. A Corpus-Based Approach to Online Materials Development for Writing Research Articles

    ERIC Educational Resources Information Center

    Chang, Ching-Fen; Kuo, Chih-Hua

    2011-01-01

    There has been increasing interest in the possible applications of corpora to both linguistic research and pedagogy. This study takes a corpus-based, genre-analytic approach to discipline-specific materials development. Combining corpus analysis with genre analysis makes it possible to develop teaching materials that are not only authentic but…

  18. The Effectiveness of Using Corpus-Based Materials in Vocabulary Teaching

    ERIC Educational Resources Information Center

    Paker, Turan; Özcan, Yeliz Ergül

    2017-01-01

    Our study aimed at finding out the effectiveness of corpus-based vocabulary teaching activities as well as students' attitudes towards concordance-based materials when corpus-based tasks in English vocabulary learning are used. The study was conducted in a preparatory school in a private university. The participants were 28 intermediate level…

  19. Lexical Borrowing from Chinese Languages in Malaysian English

    ERIC Educational Resources Information Center

    Imm, Tan Siew

    2009-01-01

    This paper explores how contact between English and Chinese has resulted in the incorporation of Chinese borrowings into the lexicon of Malaysian English (ME). Using a corpus-based approach, this study analyses a comprehensive range of borrowed features extracted from the Malaysian English Newspaper Corpus (MEN Corpus). Based on the contexts of…

  20. Lexical bundles in an advanced INTOCSU writing class and engineering texts: A functional analysis

    NASA Astrophysics Data System (ADS)

    Alquraishi, Mohammed Abdulrahman

    The purpose of this study is to investigate the functions of lexical bundles in two corpora: a corpus of engineering academic texts and a corpus of IEP advanced writing class texts. This study is concerned with the nature of formulaic language in Pathway IEPs and engineering texts, and whether those types of texts show similar or distinctive formulaic functions. Moreover, the study looked into lexical bundles found in an engineering 1.26 million-word corpus and an ESL 65000-word corpus using a concordancing program. The study then analyzed the functions of those lexical bundles and compared them statistically using chi-square tests. Additionally, the results of this investigation showed 236 unique frequent lexical bundles in the engineering corpus and 37 bundles in the pathway corpus. Also, the study identified several differences between the density and functions of lexical bundles in the two corpora. These differences were evident in the distribution of functions of lexical bundles and the minimal overlap of lexical bundles found in the two corpora. The results of this study call for more attention to formulaic language at ESP and EAP programs.
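The chi-square comparison mentioned in this abstract can be illustrated with a small Pearson chi-square computation over a contingency table of bundle functions per corpus. This is a sketch under stated assumptions: the three function categories and all counts below are invented toy values, not the study's data, and the helper name `chi_square` is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts.
    Assumes every expected count is non-zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Toy counts only: rows are two corpora, columns are three
# hypothetical functional categories of lexical bundles.
table = [
    [10, 20, 30],
    [20, 20, 20],
]
stat, dof = chi_square(table)
print(round(stat, 3), dof)
```

The statistic is then compared against the chi-square critical value for the given degrees of freedom (5.991 at the 0.05 level for 2 degrees of freedom) to decide whether the two distributions differ.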

  1. Evaluation of pore-water samplers at a drainage ditch, Installation Restoration Site 4, Naval Air Station Corpus Christi, Corpus Christi, Texas, 2005–06

    USGS Publications Warehouse

    Vroblesky, Don A.; Casey, Clifton C.

    2007-01-01

    The U.S. Geological Survey, in cooperation with the Naval Facilities Engineering Command Southeast, used innovative sampling methods to investigate ground-water contamination by chlorobenzenes beneath a drainage ditch on the southwestern side of Installation Restoration Site 4, Naval Air Station Corpus Christi, Corpus Christi, Texas, during 2005-06. The drainage ditch, which is a potential receptor for ground-water contaminants from Installation Restoration Site 4, intermittently discharges water to Corpus Christi Bay. This report evaluates a new type of pore-water sampler developed for this investigation to examine the subsurface contamination beneath the drainage ditch. The new type of pore-water sampler appears to be an effective approach for long-term monitoring of ground water in the sand and organic-rich mud beneath the drainage ditch.

  2. Tashkeela: Novel corpus of Arabic vocalized texts, data for auto-diacritization systems.

    PubMed

    Zerrouki, Taha; Balla, Amar

    2017-04-01

    Arabic diacritics are often omitted in Arabic script. This is a handicap for new learners reading Arabic, for text-to-speech conversion systems, and for the reading and semantic analysis of Arabic texts. Automatic diacritization systems are the best solution to this issue, but such automation needs resources such as diacritized texts to train and evaluate these systems. In this paper, we describe our corpus of Arabic diacritized texts, called Tashkeela. It can be used as a linguistic resource for natural language processing tasks such as automatic diacritization, disambiguation, and feature and data extraction. The corpus is freely available; it contains 75 million fully vocalized words, mainly from 97 books in classical and modern Arabic. The corpus was collected from manually vocalized texts using a web crawling process.

  3. Parsing clinical text: how good are the state-of-the-art parsers?

    PubMed

    Jiang, Min; Huang, Yang; Fan, Jung-wei; Tang, Buzhou; Denny, Josh; Xu, Hua

    2015-01-01

    Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which is developed based on pathology notes and clinical notes, containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using the 10-fold cross validation method. Finally we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Our results showed that the original parsers achieved lower performance in clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebank, all parsers achieved better performance, with the best performance from the Stanford parser that reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank were used, of the three parsers, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measure of 84.15% on the MiPACQ corpus. Our study demonstrates that re-training using clinical Treebanks is critical for improving general English parsers' performance on clinical text, and combining clinical and open domain corpora might achieve optimal performance for parsing clinical text.
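The Bracketing F-measure reported in this abstract is the harmonic mean of bracketing precision and recall over labeled constituent spans. The sketch below is a simplified illustration, assuming each constituent is represented as a `(label, start, end)` tuple; the function name and the toy parse are hypothetical, and standard scoring tools apply extra conventions (for example around punctuation) that are not modeled here.

```python
from collections import Counter


def bracketing_f_measure(gold, predicted):
    """Return (precision, recall, F) over labeled brackets.

    Brackets are (label, start, end) tuples. Counters are used so that
    a bracket repeated in one tree is matched at most as many times as
    it occurs in the other.
    """
    matched = sum((Counter(gold) & Counter(predicted)).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy example: the predicted tree mislabels one constituent (PP vs NP).
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
predicted = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
precision, recall, f_measure = bracketing_f_measure(gold, predicted)
print(precision, recall, f_measure)
```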

  4. A Bag of Concepts Approach for Biomedical Document Classification Using Wikipedia Knowledge.

    PubMed

    Mouriño-García, Marcos A; Pérez-Rodríguez, Roberto; Anido-Rifón, Luis E

    2017-01-01

    The ability to efficiently review the existing literature is essential for the rapid progress of research. This paper describes a classifier of text documents, represented as vectors in spaces of Wikipedia concepts, and analyses its suitability for classification of Spanish biomedical documents when only English documents are available for training. We propose the cross-language concept matching (CLCM) technique, which relies on Wikipedia interlanguage links to convert concept vectors from the Spanish to the English space. The performance of the classifier is compared to several baselines: a classifier based on machine translation, a classifier that represents documents after performing Explicit Semantic Analysis (ESA), and a classifier that uses a domain-specific semantic annotator (MetaMap). The corpus used for the experiments (Cross-Language UVigoMED) was purpose-built for this study, and it is composed of 12,832 English and 2,184 Spanish MEDLINE abstracts. The performance of our approach is superior to any other state-of-the-art classifier in the benchmark, with performance increases up to: 124% over classical machine translation, 332% over MetaMap, and 60 times over the classifier based on ESA. The results have statistical significance, showing p-values < 0.0001. Using knowledge mined from Wikipedia to represent documents as vectors in a space of Wikipedia concepts and translating vectors between language-specific concept spaces, a cross-language classifier can be built, and it performs better than several state-of-the-art classifiers. Schattauer GmbH.
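The idea of converting a concept vector from the Spanish to the English space through interlanguage links can be sketched as a dictionary lookup. This is only an illustration of the general idea, not the authors' CLCM implementation: the function name, the choice to drop concepts without a link, the summing of weights that map to the same target, and the toy link table are all assumptions.

```python
def translate_concept_vector(vector, interlanguage_links):
    """Map a {concept: weight} vector into the target-language concept
    space using a {source_concept: target_concept} link table.

    Concepts without a link are dropped (an assumed policy); weights of
    source concepts that map to the same target concept are summed.
    """
    translated = {}
    for concept, weight in vector.items():
        target = interlanguage_links.get(concept)
        if target is None:
            continue  # no interlanguage link available for this concept
        translated[target] = translated.get(target, 0.0) + weight
    return translated


# Toy link table and document vector (made-up entries and weights).
links = {"Corazón": "Heart", "Insulina": "Insulin"}
spanish_vector = {"Corazón": 2.0, "Insulina": 1.0, "Concepto sin enlace": 3.0}
english_vector = translate_concept_vector(spanish_vector, links)
print(english_vector)
```

Once translated, the vector lives in the same concept space as the English training documents, so a classifier trained on English data can be applied to it.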

  5. A Bag of Concepts Approach for Biomedical Document Classification Using Wikipedia Knowledge*. Spanish-English Cross-language Case Study.

    PubMed

    Mouriño-García, Marcos A; Pérez-Rodríguez, Roberto; Anido-Rifón, Luis E

    2017-10-26

    The ability to efficiently review the existing literature is essential for the rapid progress of research. This paper describes a classifier of text documents, represented as vectors in spaces of Wikipedia concepts, and analyses its suitability for classification of Spanish biomedical documents when only English documents are available for training. We propose the cross-language concept matching (CLCM) technique, which relies on Wikipedia interlanguage links to convert concept vectors from the Spanish to the English space. The performance of the classifier is compared to several baselines: a classifier based on machine translation, a classifier that represents documents after performing Explicit Semantic Analysis (ESA), and a classifier that uses a domain-specific semantic annotator (MetaMap). The corpus used for the experiments (Cross-Language UVigoMED) was purpose-built for this study, and it is composed of 12,832 English and 2,184 Spanish MEDLINE abstracts. The performance of our approach is superior to any other state-of-the-art classifier in the benchmark, with performance increases up to: 124% over classical machine translation, 332% over MetaMap, and 60 times over the classifier based on ESA. The results have statistical significance, showing p-values < 0.0001. Using knowledge mined from Wikipedia to represent documents as vectors in a space of Wikipedia concepts and translating vectors between language-specific concept spaces, a cross-language classifier can be built, and it performs better than several state-of-the-art classifiers.

  6. Biblio-MetReS: A bibliometric network reconstruction application and server

    PubMed Central

    2011-01-01

    Background Reconstruction of genes and/or protein networks from automated analysis of the literature is one of the current targets of text mining in biomedical research. Some user-friendly tools already perform this analysis on precompiled databases of abstracts of scientific papers. Other tools allow expert users to elaborate and analyze the full content of a corpus of scientific documents. However, to our knowledge, no user friendly tool that simultaneously analyzes the latest set of scientific documents available on line and reconstructs the set of genes referenced in those documents is available. Results This article presents such a tool, Biblio-MetReS, and compares its functioning and results to those of other user-friendly applications (iHOP, STRING) that are widely used. Under similar conditions, Biblio-MetReS creates networks that are comparable to those of other user friendly tools. Furthermore, analysis of full text documents provides more complete reconstructions than those that result from using only the abstract of the document. Conclusions Literature-based automated network reconstruction is still far from providing complete reconstructions of molecular networks. However, its value as an auxiliary tool is high and it will increase as standards for reporting biological entities and relationships become more widely accepted and enforced. Biblio-MetReS is an application that can be downloaded from http://metres.udl.cat/. It provides an easy to use environment for researchers to reconstruct their networks of interest from an always up to date set of scientific documents. PMID:21975133

  7. Neural analysis of bovine ovaries ultrasound images in the identification process of the corpus luteum

    NASA Astrophysics Data System (ADS)

    Górna, K.; Jaśkowski, B. M.; Okoń, P.; Czechlowski, M.; Koszela, K.; Zaborowicz, M.; Idziaszek, P.

    2017-07-01

    The aim of the paper is to show that neural image analysis is a useful method for identifying the development stage of the domestic bovine corpus luteum on digital USG (ultrasonography) images. The corpus luteum (CL) is a transient endocrine gland that develops after ovulation from the follicle secretory cells. The function of the CL is the production of progesterone, which regulates many reproductive functions. In the presented studies, identification of the corpus luteum was carried out on the basis of information contained in digital ultrasound images. The development stage of the corpus luteum was considered in two aspects: just before and in the middle of the domination phase, and the luteolysis and degradation phase. Prior to classification, the ultrasound images were processed using a GLCM (Gray Level Co-occurrence Matrix). To generate a classification model, the Neural Networks module implemented in STATISTICA was used. Five representative parameters describing the ultrasound image were used as learner variables. The output of the artificial neural network was information about the development stage of the corpus luteum. The results of this study indicate that neural image analysis combined with GLCM texture analysis may be a useful tool for identifying the bovine corpus luteum in the context of its development phase. The best generated artificial neural network model had the MLP (Multi Layer Perceptron) structure 5:5-17-1:1.
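The GLCM step named in this abstract counts how often pairs of gray levels occur at a fixed pixel offset, and texture features are then derived from those counts. The sketch below is a minimal pure-Python illustration: the horizontal offset, the three gray levels, the toy image, and the choice of contrast as the example feature are assumptions, since the abstract does not say which five parameters were used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Gray-level co-occurrence matrix for a 2D list of integer gray
    levels in range(levels), counting pairs (pixel, neighbor) where the
    neighbor lies at column offset dx and row offset dy."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: squared gray-level difference weighted by the
    normalized co-occurrence counts. Assumes at least one counted pair."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Toy 3x3 image with gray levels 0..2 (not real ultrasound data).
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
matrix = glcm(image, levels=3)
texture_contrast = contrast(matrix)
print(matrix, texture_contrast)
```

Features of this kind, computed per image, would form the input vector of a classifier such as the multilayer perceptron described in the abstract.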

  8. Reflective Random Indexing and indirect inference: a scalable method for discovery of implicit connections.

    PubMed

    Cohen, Trevor; Schvaneveldt, Roger; Widdows, Dominic

    2010-04-01

    The discovery of implicit connections between terms that do not occur together in any scientific document underlies the model of literature-based knowledge discovery first proposed by Swanson. Corpus-derived statistical models of semantic distance such as Latent Semantic Analysis (LSA) have been evaluated previously as methods for the discovery of such implicit connections. However, LSA in particular is dependent on a computationally demanding method of dimension reduction as a means to obtain meaningful indirect inference, limiting its ability to scale to large text corpora. In this paper, we evaluate the ability of Random Indexing (RI), a scalable distributional model of word associations, to draw meaningful implicit relationships between terms in general and biomedical language. Proponents of this method have achieved comparable performance to LSA on several cognitive tasks while using a simpler and less computationally demanding method of dimension reduction than LSA employs. In this paper, we demonstrate that the original implementation of RI is ineffective at inferring meaningful indirect connections, and evaluate Reflective Random Indexing (RRI), an iterative variant of the method that is better able to perform indirect inference. RRI is shown to lead to more clearly related indirect connections and to outperform existing RI implementations in the prediction of future direct co-occurrence in the MEDLINE corpus. 2009 Elsevier Inc. All rights reserved.

  9. An Analysis of Stative Verbs Used with the Progressive Aspect in Corpus-Informed Textbooks

    ERIC Educational Resources Information Center

    Belli, Serap Atasever

    2018-01-01

    This study was designed to investigate whether contemporary corpus-informed grammar textbooks written for English language learners and teachers presented the progressive use of stative verbs and if yes, which stative verbs were presented to occur with the progressive aspect and for which functions they took this aspect. A corpus of six electronic…

  10. Exploring Learner Language through Corpora: Comparing and Interpreting Corpus Frequency Information

    ERIC Educational Resources Information Center

    Gablasova, Dana; Brezina, Vaclav; McEnery, Tony

    2017-01-01

    This article contributes to the debate about the appropriate use of corpus data in language learning research. It focuses on frequencies of linguistic features in language use and their comparison across corpora. The majority of corpus-based second language acquisition studies employ a comparative design in which either one or more second language…

  11. A Corpus-Based Analysis of the Most Frequent Adjectives in Academic Texts

    ERIC Educational Resources Information Center

    Kartal, Galip

    2017-01-01

    Based on a mega corpus, The Corpus of Contemporary American English (COCA), this study aims to determine the most frequent adjectives used in academic texts and to investigate whether these adjectives differ in frequency and function in social sciences, technology, and medical sciences. It also identifies evaluative adjectives from a list of a…

  12. An Individual Subjectivist Critique of the Use of Corpus Linguistics to Inform Pedagogical Materials

    ERIC Educational Resources Information Center

    Richards, Kendall; Pilcher, Nick

    2016-01-01

    Corpus linguistics, or the gathering together of language into a body for analysis and development of materials, is claimed to be an assured, established method (or field) that valuably informs pedagogical materials and knowledge of language (e.g. Ädel 2010; Gardner & Nesi, 2013). The fundamental validity of corpus linguistics is rarely, if…

  13. Assessing the Lexico-Grammatical Characteristics of a Corpus of College-Level Statistics Textbooks: Implications for Instruction and Practice

    ERIC Educational Resources Information Center

    Wagler, Amy E.; Lesser, Lawrence M.; González, Ariel I.; Leal, Luis

    2015-01-01

    A corpus of current editions of statistics textbooks was assessed to compare aspects and levels of readability for the topics of "measures of center," "line of fit," "regression analysis," and "regression inference." Analysis with lexical software of these text selections revealed that the large corpus can…

  14. Analysing Culture and Interculture in Saudi EFL Textbooks: A Corpus Linguistic Approach

    ERIC Educational Resources Information Center

    Almujaiwel, Sultan

    2018-01-01

    This paper combines corpus processing tools to investigate the cultural elements of Saudi education of English as a foreign language (EFL). The latest Saudi EFL textbooks (2016 onwards) are available in researchable PDF formats. This helps process them through corpus search software tools. The method adopted is based on analysing 20 cultural…

  15. Corpus-Supported Academic Writing: How Can Technology Help?

    ERIC Educational Resources Information Center

    Chitez, Madalina; Rapp, Christian; Kruse, Otto

    2015-01-01

    Phraseology has long been used in L2 teaching of academic writing, and corpus linguistics has played a major role in the compilation and assessment of academic phrases. However, there are only a few interactive academic writing tools in which corpus methodology is implemented in a real-time design to support formulation processes. In this paper,…

  16. The Pedagogical Mediation of a Developmental Learner Corpus for Classroom-Based Language Instruction

    ERIC Educational Resources Information Center

    Belz, Julie A.; Vyatkina, Nina

    2008-01-01

    Although corpora have been used in language teaching for some time, few empirical studies explore their impact on learning outcomes. We provide a microgenetic account of learners' responses to corpus-driven instructional units for German modal particles and pronominal "da"-compounds. The units are based on developmental corpus data produced by…

  17. Applying Corpus-Based Findings to Form-Focused Instruction: The Case of Reported Speech

    ERIC Educational Resources Information Center

    Barbieri, Federica; Eckhardt, Suzanne E. B.

    2007-01-01

    Arguing that the introduction of corpus linguistics in teaching materials and the language classroom should be informed by theories and principles of SLA, this paper presents a case study illustrating how corpus-based findings on reported speech can be integrated into a form-focused model of instruction. After overviewing previous work which…

  18. Investigating L2 Spoken English through the Role Play Learner Corpus

    ERIC Educational Resources Information Center

    Nava, Andrea; Pedrazzini, Luciana

    2011-01-01

    We describe an exploratory study carried out within the University of Milan, Department of English the aim of which was to analyse features of the spoken English of first-year Modern Languages undergraduates. We compiled a learner corpus, the "Role Play" corpus, which consisted of 69 role-play interactions in English carried out by…

  19. Formulaic Language and Collocations in German Essays: From Corpus-Driven Data to Corpus-Based Materials

    ERIC Educational Resources Information Center

    Krummes, Cedric; Ensslin, Astrid

    2015-01-01

    Whereas there exists a plethora of research on collocations and formulaic language in English, this article contributes towards a somewhat less developed area: the understanding and teaching of formulaic language in German as a foreign language. It analyses formulaic sequences and collocations in German writing (corpus-driven) and provides modern…

  20. A Corpus-Based View of Lexical Gender in Written Business English

    ERIC Educational Resources Information Center

    Fuertes-Olivera, Pedro A.

    2007-01-01

    This article investigates lexical gender in specialized communication. The key method of analysis is that of forms of address, professional titles, and "generic man" in a 10 million word corpus of written Business English. After a brief introduction and literature review on both gender in specialized communication and similar corpus-based views of…

  1. Frequent Collocates and Major Senses of Two Prepositions in ESL and ENL Corpora

    ERIC Educational Resources Information Center

    Nkemleke, Daniel

    2009-01-01

    This contribution assesses in quantitative terms frequent collocates and major senses of "between" and "through" in the corpus of Cameroonian English (CCE), the corpus of East-African (Kenya and Tanzania) English which is part of the International Corpus of English (ICE) project (ICE-EA), and the Lancaster-Oslo/Bergen (LOB) corpus…

  2. US News Media Portrayal of Islam and Muslims: A Corpus-Assisted Critical Discourse Analysis

    ERIC Educational Resources Information Center

    Samaie, Mahmoud; Malmir, Bahareh

    2017-01-01

    This article exploits the synergy of critical discourse studies and Corpus Linguistics to study the pervasive representation of Islam and Muslims in an approximate 670,000-word corpus of US news media stories published between 2001 and 2015. Following collocation and concordance analysis of the most frequent topics or categories which revolve…

  3. Effects of hypo- and hyperthyroidism on proliferation, angiogenesis, apoptosis and expression of COX-2 in the corpus luteum of female rats.

    PubMed

    Silva, J F; Ocarino, N M; Vieira, A L S; Nascimento, E F; Serakides, R

    2013-08-01

    Although thyroid dysfunction occurs frequently in humans and some animal species, the mechanisms by which hypo- and hyperthyroidism affect the corpus luteum have not been thoroughly elucidated. This study evaluated the levels of proliferative activity, angiogenesis, apoptosis and expression of cyclooxygenase-2 in the corpus luteum of female rats with thyroid dysfunction. These processes may be important in understanding the reproductive changes caused by thyroid dysfunction. A total of 18 adult female rats were divided into three groups (control, hypothyroid and hyperthyroid) with six animals per group. Three months after treatment to induce thyroid dysfunction, the rats were euthanized in the dioestrus phase. The ovaries were collected and immunohistochemically analysed for expression of the cell proliferation marker CDC-47, vascular endothelial growth factor (VEGF), VEGF receptor Flk-1 and cyclooxygenase-2 (COX-2). Apoptosis was evaluated using the TUNEL assay. Hypothyroidism reduced the intensity and area of COX-2 expression in the corpus luteum (p < 0.05), while hyperthyroidism did not alter COX-2 expression in the dioestrus phase. Hypothyroidism significantly reduced the expression of CDC-47 in endothelial cells and pericytes in the corpus luteum, whereas hyperthyroidism did not induce a detectable change in CDC-47 expression (p > 0.05). Hypothyroidism reduced the level of apoptosis in luteal cells (p < 0.05) and increased VEGF expression in the corpus luteum. In contrast, hyperthyroidism increased the level of apoptosis in the corpus luteum (p < 0.05). In conclusion, thyroid dysfunction differentially affects the levels of proliferative activity, angiogenesis and apoptosis and COX-2 expression in the corpus luteum of female rats. © 2013 Blackwell Verlag GmbH.

  4. Characterizing the Google Books Corpus: Strong Limits to Inferences of Socio-Cultural and Linguistic Evolution

    PubMed Central

    Pechenick, Eitan Adam; Danforth, Christopher M.; Dodds, Peter Sheridan

    2015-01-01

    It is tempting to treat frequency trends from the Google Books data sets as indicators of the “true” popularity of various words and phrases. Doing so allows us to draw quantitatively strong conclusions about the evolution of cultural perception of a given topic, such as time or gender. However, the Google Books corpus suffers from a number of limitations which make it an obscure mask of cultural popularity. A primary issue is that the corpus is in effect a library, containing one of each book. A single, prolific author is thereby able to noticeably insert new phrases into the Google Books lexicon, whether the author is widely read or not. With this understood, the Google Books corpus remains an important data set to be considered more lexicon-like than text-like. Here, we show that a distinct problematic feature arises from the inclusion of scientific texts, which have become an increasingly substantive portion of the corpus throughout the 1900s. The result is a surge of phrases typical to academic articles but less common in general, such as references to time in the form of citations. We use information theoretic methods to highlight these dynamics by examining and comparing major contributions via a divergence measure of English data sets between decades in the period 1800–2000. We find that only the English Fiction data set from the second version of the corpus is not heavily affected by professional texts. Overall, our findings call into question the vast majority of existing claims drawn from the Google Books corpus, and point to the need to fully characterize the dynamics of the corpus before using these data sets to draw broad conclusions about cultural and linguistic evolution. PMID:26445406

  5. A corpus for plant-chemical relationships in the biomedical domain.

    PubMed

    Choi, Wonjun; Kim, Baeksoo; Cho, Hyejin; Lee, Doheon; Lee, Hyunju

    2016-09-20

    Plants are natural products that humans consume in various ways including food and medicine. They have a long empirical history of treating diseases with relatively few side effects. Based on these strengths, many studies have been performed to verify the effectiveness of plants in treating diseases. It is crucial to understand the chemicals contained in plants because these chemicals can regulate activities of proteins that are key factors in causing diseases. With the accumulation of a large volume of biomedical literature in various databases such as PubMed, it is possible to automatically extract relationships between plants and chemicals in a large-scale way if we apply a text mining approach. A cornerstone of achieving this task is a corpus of relationships between plants and chemicals. In this study, we first constructed a corpus for plant and chemical entities and for the relationships between them. The corpus contains 267 plant entities, 475 chemical entities, and 1,007 plant-chemical relationships (550 and 457 positive and negative relationships, respectively), which are drawn from 377 sentences in 245 PubMed abstracts. Inter-annotator agreement scores for the corpus among three annotators were measured. The simple percent agreement scores for entities and trigger words for the relationships were 99.6% and 94.8%, respectively, and the overall kappa score for the classification of positive and negative relationships was 79.8%. We also developed a rule-based model to automatically extract such plant-chemical relationships. When we evaluated the rule-based model using the corpus and randomly selected biomedical articles, overall F-scores of 68.0% and 61.8% were achieved, respectively. We expect that the corpus for plant-chemical relationships will be a useful resource for enhancing plant research. The corpus is available at http://combio.gist.ac.kr/plantchemicalcorpus.

  6. Reliability of recording uterine cancer in death certification in France and age-specific proportions of deaths from cervix and corpus uteri.

    PubMed

    Rogel, Agnès; Belot, Aurélien; Suzan, Florence; Bossard, Nadine; Boussac, Marjorie; Arveux, Patrick; Buémi, Antoine; Colonna, Marc; Danzon, Arlette; Ganry, Olivier; Guizard, Anne-Valérie; Grosclaude, Pascale; Velten, Michel; Jougla, Eric; Iwaz, Jean; Estève, Jacques; Chérié-Challine, Laurence; Remontet, Laurent

    2011-06-01

    French uterine cancer recordings in death certificates include 60% of "uterine cancer, Not Otherwise Specified (NOS)"; this hampers the estimation of mortalities from cervix and corpus uteri cancers. The aims of this work were to study the reliability of uterine cancer recordings in death certificates using a case matching with cancer registries and estimate age-specific proportions of deaths from cervix and corpus uteri cancers among all uterine cancer deaths by a statistical approach that uses incidence and survival data. Deaths from uterine cancer between 1989 and 2001 were extracted from the French National database of causes of death and case-to-case matched to women diagnosed with uterine cancer between 1989 and 1997 in 8 cancer registries. Registry data were considered as "gold-standard". Among the 1825 matched deaths, cancer registries recorded 830 cervix and 995 corpus uteri cancers. In death certificates, 5% and 40% of "true" cervix cancers were respectively coded "corpus" and "uterus, NOS" and 5% and 59% of "true" corpus cancers respectively coded "cervix" and "uterus, NOS". Miscoding cervix cancers was more frequent at advanced ages at death and in deaths at home or in small urban areas. Miscoding corpus cancers was more frequent in deaths at home or in small urban areas. From the statistical method, the estimated proportion of deaths from cervix cancer among all uterine cancer deaths was higher than 95% in women aged 30-40 years old but declined to 35% in women older than 70 years. The study clarifies the reason for poor encoding of uterus cancer mortality and refines the estimation of mortalities from cervix and corpus uteri cancers allowing future studies on the efficacy of cervical cancer screening. Copyright © 2010 Elsevier Ltd. All rights reserved.

  7. Characterizing the Google Books Corpus: Strong Limits to Inferences of Socio-Cultural and Linguistic Evolution.

    PubMed

    Pechenick, Eitan Adam; Danforth, Christopher M; Dodds, Peter Sheridan

    2015-01-01

    It is tempting to treat frequency trends from the Google Books data sets as indicators of the "true" popularity of various words and phrases. Doing so allows us to draw quantitatively strong conclusions about the evolution of cultural perception of a given topic, such as time or gender. However, the Google Books corpus suffers from a number of limitations which make it an obscure mask of cultural popularity. A primary issue is that the corpus is in effect a library, containing one of each book. A single, prolific author is thereby able to noticeably insert new phrases into the Google Books lexicon, whether the author is widely read or not. With this understood, the Google Books corpus remains an important data set to be considered more lexicon-like than text-like. Here, we show that a distinct problematic feature arises from the inclusion of scientific texts, which have become an increasingly substantive portion of the corpus throughout the 1900s. The result is a surge of phrases typical to academic articles but less common in general, such as references to time in the form of citations. We use information theoretic methods to highlight these dynamics by examining and comparing major contributions via a divergence measure of English data sets between decades in the period 1800-2000. We find that only the English Fiction data set from the second version of the corpus is not heavily affected by professional texts. Overall, our findings call into question the vast majority of existing claims drawn from the Google Books corpus, and point to the need to fully characterize the dynamics of the corpus before using these data sets to draw broad conclusions about cultural and linguistic evolution.

  8. Approaching the Linguistic Complexity

    NASA Astrophysics Data System (ADS)

    Drożdż, Stanisław; Kwapień, Jarosław; Orczyk, Adam

    We analyze the rank-frequency distributions of words in selected English and Polish texts. We compare scaling properties of these distributions in both languages. We also study a few small corpora of Polish literary texts and find that for a corpus consisting of texts written by different authors the basic scaling regime is broken more strongly than in the case of a comparable corpus consisting of texts written by the same author. Similarly, for a corpus consisting of texts translated into Polish from other languages the scaling regime is broken more strongly than for a comparable corpus of native Polish texts. Moreover, based on the British National Corpus, we consider the rank-frequency distributions of the grammatically basic forms of words (lemmas) tagged with their proper part of speech. We find that these distributions do not scale if each part of speech is analyzed separately. The only part of speech that independently develops a trace of scaling is verbs.

  9. Using Google as a Super Corpus to Drive Written Language Learning: A Comparison with the British National Corpus

    ERIC Educational Resources Information Center

    Sha, Guoquan

    2010-01-01

    Data-driven learning (DDL), or corpus-based language learning, involves the learner in an exploratory task to discover appropriate expressions or collocates regarding his writing. However, the problematic units of meaning in each learner's writing are so diverse that conventional corpora often prove futile. The search engine Google with the…

  10. Language with Character: A Stratified Corpus Comparison of Individual Differences in E-Mail Communication

    ERIC Educational Resources Information Center

    Oberlander, Jon; Gill, Alastair J.

    2006-01-01

    To what extent does the wording and syntactic form of people's writing reflect their personalities? Using a bottom-up stratified corpus comparison, rather than the top-down content analysis techniques that have been used before, we examine a corpus of e-mail messages elicited from individuals of known personality, as measured by the Eysenck…

  11. Interface Conditions on Postverbal Subjects: A Corpus Study of L2 English

    ERIC Educational Resources Information Center

    Lozano, Cristobal; Mendikoetxea, Amaya

    2010-01-01

    This paper investigates how syntactic knowledge interfaces with other cognitive systems by analysing the production of postverbal subjects, V(erb)-S(ubject) order, in an L1 Spanish-L2 English corpus and a comparable English native corpus. VS order in both native and L2 English is shown to be constrained by properties operating at three interfaces:…

  12. A Quantitative Corpus-Based Approach to English Spatial Particles: Conceptual Symmetry and Its Pedagogical Implications

    ERIC Educational Resources Information Center

    Chen, Alvin Cheng-Hsien

    2014-01-01

    The present study aims to investigate how conceptual symmetry plays a role in the use of spatial particles in English and to further examine its pedagogical implications via a corpus-based evaluation of the course books in senior high schools in Taiwan. More specifically, we adopt a quantitative corpus-based approach to investigate whether bipolar…

  13. Sketching Muslims: A Corpus Driven Analysis of Representations around the Word "Muslim" in the British Press 1998-2009

    ERIC Educational Resources Information Center

    Baker, Paul; Gabrielatos, Costas; McEnery, Tony

    2013-01-01

    This article uses methods from corpus linguistics and critical discourse analysis to examine patterns of representation around the word "Muslim" in a 143 million word corpus of British newspaper articles published between 1998 and 2009. Using the analysis tool Sketch Engine, an analysis of noun collocates of "Muslim" found that the following…

  14. A Corpus-Based Discourse Analysis of the Vision and Mission Statements of Universities in Turkey

    ERIC Educational Resources Information Center

    Efe, Ibrahim; Ozer, Omer

    2015-01-01

    This article presents findings from a corpus-assisted discourse analysis of mission and vision statements of 105 state and 66 private/foundation universities in Turkey. The paper combines a corpus-based approach with critical discourse analysis to interpret the data in relation to its institutional as well as socio-political context. It argues…

  15. Lymphedema After Surgery in Patients With Endometrial Cancer, Cervical Cancer, or Vulvar Cancer

    ClinicalTrials.gov

    2017-05-03

    Lymphedema; Stage IA Cervical Cancer; Stage IA Uterine Corpus Cancer; Stage IA Vulvar Cancer; Stage IB Cervical Cancer; Stage IB Uterine Corpus Cancer; Stage IB Vulvar Cancer; Stage II Uterine Corpus Cancer; Stage II Vulvar Cancer; Stage IIA Cervical Cancer; Stage IIIA Vulvar Cancer; Stage IIIB Vulvar Cancer; Stage IIIC Vulvar Cancer; Stage IVB Vulvar Cancer

  16. A Combined Corpus and Systemic-Functional Analysis of the Problem-Solution Pattern in a Student and Professional Corpus of Technical Writing.

    ERIC Educational Resources Information Center

    Flowerdew, Lynne

    2003-01-01

    Reports on research describing similarities and differences between expert and novice writing in the problem-solution pattern, a frequent rhetorical pattern of technical academic writing. A corpus of undergraduate student writing and one containing professional writing consisted of 80 and 60 recommendation reports, respectively, with each corpus…

  17. Using Edit Distance to Analyse Errors in a Natural Language to Logic Translation Corpus

    ERIC Educational Resources Information Center

    Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex

    2012-01-01

    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of…

  18. Linking Adverbials in First-Year Korean University EFL Learners' Writing: A Corpus-Informed Analysis

    ERIC Educational Resources Information Center

    Ha, Myung-Jeong

    2016-01-01

    This study examines the frequency and usage patterns of linking adverbials in Korean students' essay writing in comparison with native English writing. The learner corpus used in the present study is composed of 105 essays that were produced by first-year university students in Korea. The control corpus was taken from the American LOCNESS…

  19. The catalogue of the Ripley Corpus: alchemical writings attributed to George Ripley (d. ca. 1490).

    PubMed

    Rampling, Jennifer M

    2010-07-01

    The period 1471 to 1700 saw the accretion of a large corpus of alchemical works associated with the famous English alchemist George Ripley, Canon of Bridlington (d. ca. 1490). Evaluation of Ripley's alchemy is hampered by uncertainty over the composition of the corpus, the dating and provenance of individual texts, and the difficulty of separating genuine from spurious attributions. The Catalogue of the Ripley Corpus (CRC) provides a first step in ordering these diverse materials: a descriptive catalogue of approximately forty-five alchemical treatises, recipes and poems attributed to Ripley, with an index of all known manuscript copies.

  20. Temsirolimus and Bevacizumab in Treating Patients With Advanced Endometrial, Ovarian, Liver, Carcinoid, or Islet Cell Cancer

    ClinicalTrials.gov

    2017-07-10

    Adult Hepatocellular Carcinoma; Advanced Adult Hepatocellular Carcinoma; Endometrial Serous Adenocarcinoma; Localized Non-Resectable Adult Liver Carcinoma; Lung Carcinoid Tumor; Malignant Pancreatic Gastrinoma; Malignant Pancreatic Glucagonoma; Malignant Pancreatic Insulinoma; Malignant Pancreatic Somatostatinoma; Metastatic Digestive System Neuroendocrine Tumor G1; Ovarian Carcinosarcoma; Ovarian Endometrioid Adenocarcinoma; Ovarian Seromucinous Carcinoma; Ovarian Serous Surface Papillary Adenocarcinoma; Pancreatic Alpha Cell Adenoma; Pancreatic Beta Cell Adenoma; Pancreatic Delta Cell Adenoma; Pancreatic G-Cell Adenoma; Pancreatic Polypeptide Tumor; Recurrent Adult Liver Carcinoma; Recurrent Digestive System Neuroendocrine Tumor G1; Recurrent Fallopian Tube Carcinoma; Recurrent Ovarian Carcinoma; Recurrent Pancreatic Neuroendocrine Carcinoma; Recurrent Primary Peritoneal Carcinoma; Recurrent Uterine Corpus Carcinoma; Regional Digestive System Neuroendocrine Tumor G1; Stage IIIA Fallopian Tube Cancer; Stage IIIA Ovarian Cancer; Stage IIIA Primary Peritoneal Cancer; Stage IIIA Uterine Corpus Cancer; Stage IIIB Fallopian Tube Cancer; Stage IIIB Ovarian Cancer; Stage IIIB Primary Peritoneal Cancer; Stage IIIB Uterine Corpus Cancer; Stage IIIC Fallopian Tube Cancer; Stage IIIC Ovarian Cancer; Stage IIIC Primary Peritoneal Cancer; Stage IIIC Uterine Corpus Cancer; Stage IV Fallopian Tube Cancer; Stage IV Ovarian Cancer; Stage IV Primary Peritoneal Cancer; Stage IVA Uterine Corpus Cancer; Stage IVB Uterine Corpus Cancer; Uterine Carcinosarcoma

  1. Penile erection responses of Nigella sativa seed extract on isolated rat corpus cavernosum

    NASA Astrophysics Data System (ADS)

    Aminyoto, M.; Ismail, S.

    2018-04-01

    Nigella sativa L. (NS), from the Ranunculaceae family, is known as black cumin in Indonesia. The seed has been used as an aphrodisiac in ethnobotanical studies and is reported to have pharmacological activities, such as an antihypertensive effect through relaxation of vascular smooth muscle, but its direct effect on the blood vessels of the corpus cavernosum is still unknown. The purpose of this study was to examine the response of NS seed extract on penile erection in vitro. NS seeds were macerated in ethanol solvent for three days at room temperature, and the maceration was repeated twice. Penile erection responses were assessed using isolated rat corpus cavernosum in Krebs-Henseleit solution, temperature 37°C, pH 7.4, aerated with carbogen gas. After acclimation, the corpus cavernosum was contracted with a phenylephrine solution. Ethanolic extract of NS seeds or control solution was given after reaching the plateau phase of the highest contraction. This study showed that the contraction response of the corpus cavernosum decreased after addition of NS extract and that this effect increased with the extract concentration. This study concluded that NS seed ethanol extract affects the penile erection response directly through the relaxation of blood vessels in the corpus cavernosum.

  2. Ovum transmigration after salpingectomy for ectopic pregnancy.

    PubMed

    Ross, Jackie A; Davison, Amelia Z; Sana, Yasmin; Appiah, Adjoa; Johns, Jemma; Lee, Christopher T

    2013-04-01

    What proportion of pregnancies are a result of ovum transmigration after salpingectomy for ectopic pregnancy? Approximately one-third of spontaneously conceived pregnancies are a result of pick-up of the ovum from the ovary contralateral to the remaining tube in women with a history of salpingectomy. The corpus luteum has been found contralateral to tubal ectopic pregnancies in 32% of reported cases. The rate of contralateral ovum pick-up in intrauterine pregnancies is not known. We conducted a retrospective cohort study of clinical and ultrasound records collected over a 12-year period 1999-2010. Ten per cent of cases identified were excluded from the final analysis due to incomplete data or bilateral corpora lutea. Included were 842 pregnancies in 707 women with a history of unilateral salpingectomy for ectopic pregnancy and subsequent spontaneous pregnancy. The study was set in the Early Pregnancy Unit of a large UK inner city teaching hospital. The outcome measure was the side of the corpus luteum in relation to the side of the remaining tube. The corpus luteum was located in the ovary contralateral to the remaining tube in 266/842 pregnancies (31.6%; 95% CI 28.5-34.8%). There was no significant difference in this proportion between intrauterine and ectopic pregnancies [246/769 (32.0%) versus 21/73 (28.8%), P = 0.60]. This was a retrospective study and so did not address the conception rate according to the laterality of ovulation. Our findings were very similar to the frequency of ectopic pregnancies found contralateral to the corpus luteum described in previous studies. Ovum pick-up from the cul-de-sac probably occurs reasonably frequently and is unlikely to have a causative role in the pathogenesis of ectopic pregnancy. It is not known how often this phenomenon occurs in women with intact Fallopian tubes. No specific funding was obtained. The authors have no conflicts of interest to declare.

  3. Major clinical research advances in gynecologic cancer in 2014.

    PubMed

    Suh, Dong Hoon; Lee, Kyung Hun; Kim, Kidong; Kang, Sokbom; Kim, Jae Weon

    2015-04-01

    In 2014, 9 topics were selected as major advances in clinical research for gynecologic oncology: 2 each in cervical and corpus cancer, 4 in ovarian cancer, and 1 in breast cancer. For cervical cancer, several therapeutic agents showed viable antitumor clinical response in recurrent and metastatic disease: bevacizumab, cediranib, and immunotherapies including human papillomavirus (HPV)-tumor infiltrating lymphocytes and Z-100. The HPV test received FDA approval as the primary screening tool of cervical cancer in women aged 25 and older, based on the results of the ATHENA trial, which suggested that the HPV test was a more sensitive and efficient strategy for cervical cancer screening than methods based solely on cytology. For corpus cancers, results of a phase III Gynecologic Oncology Group (GOG) 249 study of early-stage endometrial cancer with high-intermediate risk factors are followed by the controversial topic of uterine power morcellation in minimally invasive gynecologic surgery. Promising results of phase II studies regarding the effectiveness of olaparib in various ovarian cancer settings are summarized. After a brief review of results from a phase III study on pazopanib maintenance therapy in advanced ovarian cancer, 2 outstanding 2014 ASCO presentations cover the topic of using molecular subtypes in predicting response to bevacizumab. A review of the use of opportunistic bilateral salpingectomy as an ovarian cancer preventive strategy in the general population is presented. Two remarkable studies that discussed the effectiveness of adjuvant ovarian suppression in premenopausal early breast cancer have been selected as the last topics covered in this review.

  4. [An autopsy case of neonatal lactic acidosis].

    PubMed

    Giordano, G; Corradi, D; D'Adda, T; Melissari, M

    2001-02-01

    Defects in mitochondrial enzymes, such as pyruvate dehydrogenase and cytochrome oxidase, cause hereditary disorders which lead to modifications in cellular pH due to the accumulation of pyruvate and lactic acid. Mitochondrial diseases include severe neonatal diseases and less severe forms of adult diseases. We report the case of lactic acidosis in a newborn girl who was delivered at 36 weeks of gestation and who died 3 months after birth. Her family history revealed a relative with tetraparesis and mental retardation. Her clinical findings, such as tonic-clonic convulsions and accumulation of pyruvate and lactic acid in blood, urine and cerebrospinal fluid, were refractory to treatment and developed soon after birth. Ultrasound scans of the brain some days before death revealed cerebral atrophy with ventricular dilatation and thinning of the corpus callosum and septum pellucidum. The clinical diagnosis of metabolic lactic acidosis was confirmed by macroscopic, microscopic and ultrastructural findings seen at autopsy. On macroscopic examination, the heart was hypertrophic, and the brain was atrophic with ventricular dilatation and thinning of corpus callosum. Small cystic lesions were present in the basal ganglia. On microscopic examination, the latter were characterized by loss of neurons, gliosis and capillary proliferation. Ultrastructural examination of the heart and skeletal muscle showed lysis of myofibrils, mitochondrial pleomorphism and hyperplasia, and crystalline inclusion in mitochondria and in the matrix compartment. In reporting this case, we emphasize the importance of accurate postmortem examination and clinical data for the diagnosis of metabolic lactic acidosis.

  5. Agenesis of Corpus Callosum and Emotional Information Processing in Schizophrenia

    PubMed Central

    Lungu, Ovidiu; Stip, Emmanuel

    2012-01-01

    Corpus callosum (CC) is essential in providing the integration of information related to perception and action within a subcortico-cortical network, thus supporting the generation of a unified experience about and reaction to changes in the environment. Its role in schizophrenia is yet to be fully elucidated, but there is accumulating evidence that there could be differences between patients and healthy controls regarding the morphology and function of CC, especially when individuals face emotionally laden information. Here, we report a case study of a patient with partial agenesis of corpus callosum (agCC patient with agenesis of the anterior aspect, above the genu) and we provide a direct comparison with a group of patients with no apparent callosal damage (CC group) regarding the brain activity during the processing of emotionally laden information. We found that although the visual cortex activation in response to visual stimuli regardless of their emotional content was comparable in agCC patient and CC group both in terms of localization and intensity of activation, we observed a very large, non-specific and non-lateralized cerebral activation in the agCC patient, in contrast with the CC group, which showed a more lateralized and spatially localized activation, when the emotional content of the stimuli was considered. Further analysis of brain activity in the regions obtained in the CC group revealed that the agCC patient actually had an opposite activation pattern relative to most participants with no CC agenesis, indicating a dysfunctional response to this kind of stimuli, consistent with the clinical presentation of this particular patient. Our results seem to give support to the disconnection hypothesis which posits that the core symptoms of schizophrenia are related to aberrant connectivity between distinct brain areas, especially when faced with emotional stimuli, a fact consistent with the clinical tableau of this particular patient. PMID:22347194

  6. Agenesis of corpus callosum and emotional information processing in schizophrenia.

    PubMed

    Lungu, Ovidiu; Stip, Emmanuel

    2012-01-01

    Corpus callosum (CC) is essential in providing the integration of information related to perception and action within a subcortico-cortical network, thus supporting the generation of a unified experience about and reaction to changes in the environment. Its role in schizophrenia is yet to be fully elucidated, but there is accumulating evidence that there could be differences between patients and healthy controls regarding the morphology and function of CC, especially when individuals face emotionally laden information. Here, we report a case study of a patient with partial agenesis of corpus callosum (agCC patient with agenesis of the anterior aspect, above the genu) and we provide a direct comparison with a group of patients with no apparent callosal damage (CC group) regarding the brain activity during the processing of emotionally laden information. We found that although the visual cortex activation in response to visual stimuli regardless of their emotional content was comparable in agCC patient and CC group both in terms of localization and intensity of activation, we observed a very large, non-specific and non-lateralized cerebral activation in the agCC patient, in contrast with the CC group, which showed a more lateralized and spatially localized activation, when the emotional content of the stimuli was considered. Further analysis of brain activity in the regions obtained in the CC group revealed that the agCC patient actually had an opposite activation pattern relative to most participants with no CC agenesis, indicating a dysfunctional response to this kind of stimuli, consistent with the clinical presentation of this particular patient. Our results seem to give support to the disconnection hypothesis which posits that the core symptoms of schizophrenia are related to aberrant connectivity between distinct brain areas, especially when faced with emotional stimuli, a fact consistent with the clinical tableau of this particular patient.

  7. Clinical correlates to assist with chronic traumatic encephalopathy diagnosis: Insights from a novel rodent repeat concussion model.

    PubMed

    Thomsen, Gretchen M; Ko, Ara; Harada, Megan Y; Ma, Annie; Wyss, Livia; Haro, Patricia; Vit, Jean-Philippe; Avalos, Pablo; Dhillon, Navpreet K; Cho, Noell; Shelest, Oksana; Ley, Eric J

    2017-06-01

    Chronic traumatic encephalopathy (CTE) is a neurodegenerative disease linked to repetitive head injuries. Chronic traumatic encephalopathy symptoms include changes in mood, behavior, cognition, and motor function; however, CTE is currently diagnosed only postmortem. Using a rat model of recurrent traumatic brain injury (TBI), we demonstrate rodent deficits that predict the severity of CTE-like brain pathology. Bilateral, closed-skull, mild TBI was administered once per week to 35 wild-type rats; eight rats received two injuries (2×TBI), 27 rats received five injuries (5×TBI), and 13 rats were sham controls. To determine clinical correlates for CTE diagnosis, TBI rats were separated based on the severity of rotarod deficits and classified as "mild" or "severe" and further separated into "acute," "short," and "long" based on age at euthanasia (90, 144, and 235 days, respectively). Brain atrophy, phosphorylated tau, and inflammation were assessed. All eight 2×TBI cases had mild rotarod deficiency, 11 5×TBI cases had mild deficiency, and 16 cases had severe deficiency. In one cohort of rats, tested at approximately 235 days of age, balance, rearing, and grip strength were significantly worse in the severe group relative to both sham and mild groups. At the acute time period, cortical thinning, phosphorylated tau, and inflammation were not observed in either TBI group, whereas corpus callosum thinning was observed in both TBI groups. At later time points, atrophy, tau pathology, and inflammation were increased in mild and severe TBI groups in the cortex and corpus callosum, relative to sham controls. These injury effects were exacerbated over time in the severe TBI group in the corpus callosum. Our model of repeat mild TBI suggests that permanent deficits in specific motor function tests correlate with CTE-like brain pathology. Assessing balance and motor coordination over time may predict CTE diagnosis.

  8. Intraoperative determination of the extent of corpus callosotomy for epilepsy: two simple techniques.

    PubMed

    Awad, I A; Wyllie, E; Luders, H; Ahl, J

    1990-01-01

    There is increasing interest in staged corpus callosotomy for intractable generalized epilepsy. At the first procedure, a portion (usually the anterior two-thirds) of the corpus callosum is sectioned. If seizures persist, completion of callosotomy or alternative treatment approaches can be considered. It is obviously important to ascertain that the desired extent of callosotomy was in fact accomplished at the time of initial operation. Our experience and the published literature indicate that the surgeon's impression at operation can be erroneous. We describe a technique of determining extent of corpus callosotomy during the procedure. The magnetic resonance imaging (MRI) scan in the midsagittal plane is used to select the desired extent of callosotomy. That point on the corpus callosum is characterized using simple planar geometry in relation to three anatomic landmarks in that same plane: the glabella, the inion, and the bregma (midline intersection of the coronal suture). The same point along the corpus callosum can then be located on a lateral skull x-ray using these same three anatomic landmarks. At surgery, an intraoperative lateral skull x-ray is obtained with a marking clip, thereby verifying the actual extent of callosotomy. We have verified the reliability of this scheme in 5 callosotomy procedures and have used this technique for intraoperative localization of midline and parasagittal targets in another 7 cases (3 tumors, 2 aneurysms, and 2 placements of interhemispheric subdural grids). In addition, we reviewed corpus callosum topography on 25 randomly selected MRI scans.(ABSTRACT TRUNCATED AT 250 WORDS)

  9. Callosal disconnection syndrome in a left-handed patient due to infarction of the total length of the corpus callosum.

    PubMed

    Lausberg, H; Göttert, R; Münssinger, U; Boegner, F; Marx, P

    1999-03-01

    We report on a left-handed patient with an ischemic infarction affecting exclusively the total length of the corpus callosum. This lesion clinically correlated with an almost complete callosal disconnection syndrome as described in callosotomy subjects, including unilateral verbal anosmia, hemialexia, unilateral ideomotor apraxia, unilateral agraphia, unilateral tactile anomia, unilateral constructional apraxia, lack of somesthetic transfer and dissociative phenomena. Despite the patient's left-handedness, his pattern of deficits was similar to the disconnection syndrome found in right-handers. Our report focusses on motor dominance and praxis. We followed up the improvement in left apraxia and investigated the ability to initiate and learn a new visuo-motor skill. The results permit two tentative assumptions: (1) that the improvement in left apraxia was due to a compensatory increase in ipsilateral proximal muscle control, and (2) that motor dominance, i.e. the competence to initiate and learn a new movement pattern, was hemispherically dissociable from manual dominance in the sense of praxis control.

  10. Molecular genetics in fetal neurology.

    PubMed

    Huang, Jin; Wah, Isabella Y M; Pooh, Ritsuko K; Choy, Kwong Wai

    2012-12-01

    Brain malformations, particularly those related to early brain development, are a clinically and genetically heterogeneous group of fetal neurological disorders. Fetal cerebral malformation, predominantly of impaired prosencephalic development, namely agenesis of the corpus callosum and septo-optic dysplasia, is the main pathological feature in the fetus; it causes prominent neurodevelopmental retardation and is associated with congenital facial anomalies and visual disorders. Differential diagnosis of brain malformations can be extremely difficult even with magnetic resonance imaging. Advances in genomic and molecular genetics technologies have led to the identification of the sonic hedgehog pathways and genes critical to normal brain development. Molecular cytogenetic and genetic studies have identified numeric and structural chromosomal abnormalities as well as mutations in genes important for the etiology of fetal neurological disorders. In this review, we update the molecular genetics findings of three common fetal neurological abnormalities, holoprosencephaly, lissencephaly and agenesis of the corpus callosum, in an attempt to assist in perinatal and prenatal diagnosis. Copyright © 2012 Elsevier Ltd. All rights reserved.

  11. The Economic Impact of Texas A&M University-Corpus Christi on the Corpus Christi Metropolitan Statistical Areas: 1998 Update.

    ERIC Educational Resources Information Center

    Texas A and M Univ., Corpus Christi.

    A study was conducted to examine the socioeconomic impact of Texas A&M University-Corpus Christi (TAMU-CC) on the surrounding community. This study was a follow-up to a previous examination of the economic relationship between the university and the community. The current study examined the short-term measurable economic impact of university…

  12. A Comparative Study of the Figures of Speech between Top 50 English and Persian Pop Song Lyrics

    ERIC Educational Resources Information Center

    Ashtiani, Farshid Tayari; Derakhshesh, Ali

    2015-01-01

    This paper is a corpus-based comparative discourse analysis of top fifty pop English and Persian song lyrics in 2014 to investigate the production of four figures of speech including metaphor, simile, personification, and hyperbole. The English corpus was compiled from the End-Year 2014 Chart of Billboard and the Persian corpus was compiled from…

  13. Variation in Citational Practice in a Corpus of Student Biology Papers: From Parenthetical Plonking to Intertextual Storytelling

    ERIC Educational Resources Information Center

    Swales, John M.

    2014-01-01

    This is a corpus-based study of a key aspect of academic writing in one discipline (biology) by final-year undergraduates and first-, second-, and third-year graduate students. The papers come from the Michigan Corpus of Upper-level Student Papers, a freely available electronic database. The principal aim of the study is to examine the extent of…

  14. An Investigation of Language Teachers' Explorations of the Use of Corpus Tools in the English for Academic Purposes (EAP) Class

    ERIC Educational Resources Information Center

    Bunting, John David

    2013-01-01

    Despite claims that the use of corpus tools can have a major impact in language classrooms (e.g., Conrad, 2000, 2004; Davies, 2004; O'Keefe, McCarthy, & Carter, 2007; Sinclair, 2004b; Tsui, 2004), many language teachers express apparent apathy or even resistance towards adding corpus tools to their repertoire (Cortes, 2013b). This study…

  15. Corpus Callosum Area and Brain Volume in Autism Spectrum Disorder: Quantitative Analysis of Structural MRI from the ABIDE Database

    ERIC Educational Resources Information Center

    Kucharsky Hiess, R.; Alter, R.; Sojoudi, S.; Ardekani, B. A.; Kuzniecky, R.; Pardoe, H. R.

    2015-01-01

    Reduced corpus callosum area and increased brain volume are two commonly reported findings in autism spectrum disorder (ASD). We investigated these two correlates in ASD and healthy controls using T1-weighted MRI scans from the Autism Brain Imaging Data Exchange (ABIDE). Automated methods were used to segment the corpus callosum and intracranial…

  16. TopicLens: Efficient Multi-Level Visual Topic Exploration of Large-Scale Document Collections.

    PubMed

    Kim, Minjeong; Kang, Kyeongpil; Park, Deokgun; Choo, Jaegul; Elmqvist, Niklas

    2017-01-01

    Topic modeling, which reveals underlying topics of a document corpus, has been actively adopted in visual analytics for large-scale document collections. However, due to its significant processing time and non-interactive nature, topic modeling has so far not been tightly integrated into a visual analytics workflow. Instead, most such systems are limited to utilizing a fixed, initial set of topics. Motivated by this gap in the literature, we propose a novel interaction technique called TopicLens that allows a user to dynamically explore data through a lens interface where topic modeling and the corresponding 2D embedding are efficiently computed on the fly. To support this interaction in real time while maintaining view consistency, we propose a novel efficient topic modeling method and a semi-supervised 2D embedding algorithm. Our work is based on improving state-of-the-art methods such as nonnegative matrix factorization and t-distributed stochastic neighbor embedding. Furthermore, we have built a web-based visual analytics system integrated with TopicLens. We use this system to measure the performance and the visualization quality of our proposed methods. We provide several scenarios showcasing the capability of TopicLens using real-world datasets.

  17. Assessing semantic similarity of texts - Methods and algorithms

    NASA Astrophysics Data System (ADS)

    Rozeva, Anna; Zerkova, Silvia

    2017-12-01

    Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which yield a structured, representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text-mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and better captures the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space as well as similarity calculation are presented.
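
    As a compact illustration of the LSA step this abstract describes, the usual textbook formulation can be written as follows. The notation is an assumption introduced here, not taken from the paper: X is the m × n term-document matrix, k is the number of latent concepts retained, and e_j is the j-th standard basis vector.

```latex
% Rank-k truncated singular value decomposition of the term-document matrix
X \;\approx\; X_k \;=\; U_k \,\Sigma_k\, V_k^{\top},
\qquad U_k \in \mathbb{R}^{m \times k},\;
\Sigma_k \in \mathbb{R}^{k \times k},\;
V_k \in \mathbb{R}^{n \times k}

% Document j in the reduced space, and cosine similarity between documents
\hat{d}_j \;=\; \Sigma_k V_k^{\top} e_j,
\qquad
\operatorname{sim}(d_i, d_j) \;=\;
\frac{\hat{d}_i \cdot \hat{d}_j}{\lVert \hat{d}_i \rVert \,\lVert \hat{d}_j \rVert}
```

    In this formulation, the rows of U_k Σ_k give the corresponding word vectors, so words and documents can be compared in the same k-dimensional concept space.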

  18. Ontology construction and application in practice case study of health tourism in Thailand.

    PubMed

    Chantrapornchai, Chantana; Choksuchat, Chidchanok

    2016-01-01

    Ontology is one of the key components in semantic webs. It contains the core knowledge for an effective search. However, building an ontology requires carefully collected knowledge, which is very domain-sensitive. In this work, we present the practice of ontology construction for a case study of health tourism in Thailand. The whole process follows the METHONTOLOGY approach, which consists of the following phases: information gathering, corpus study, ontology engineering, evaluation, publishing, and application construction. Different sources of data, such as structured web documents (e.g., HTML) and other documents, are acquired in the information gathering process. The tourism corpora from various tourism texts and standards are explored. The ontology is evaluated in two ways: by automatic reasoning using Pellet and RacerPro, and by questionnaires completed by tourism domain experts and ontology experts. The ontology usability is demonstrated via the semantic web application and via example axioms. The developed ontology is the first health tourism ontology in Thailand with a published application.

  19. Biomedical information retrieval across languages.

    PubMed

    Daumke, Philipp; Markó, Kornél; Poprat, Michael; Schulz, Stefan; Klar, Rüdiger

    2007-06-01

    This work presents a new dictionary-based approach to biomedical cross-language information retrieval (CLIR) that addresses many of the general and domain-specific challenges in current CLIR research. Our method is based on a multilingual lexicon that was generated partly manually and partly automatically, and currently covers six European languages. It contains morphologically meaningful word fragments, termed subwords. Using subwords instead of entire words significantly reduces the number of lexical entries necessary to sufficiently cover a specific language and domain. Mediation between queries and documents is based on these subwords as well as on lists of word-n-grams that are generated from large monolingual corpora and constitute possible translation units. The translations are then sent to a standard Internet search engine. This process makes our approach an effective tool for searching the biomedical content of the World Wide Web in different languages. We evaluate this approach using the OHSUMED corpus, a large medical document collection, within a cross-language retrieval setting.
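
    The subword idea in this abstract can be sketched in a few lines. This is a minimal illustration only: the toy lexicon, the concept identifiers, and the greedy longest-match strategy are all assumptions made here, not the authors' lexicon or algorithm. It shows how words from two languages could be reduced to the same language-independent units before matching queries against documents.

```python
from typing import Dict, List

# Hypothetical toy lexicon: subword -> language-independent concept ID.
# All entries and IDs are invented for illustration; an empty ID marks a
# fragment (linking vowel, suffix) that carries no meaning of its own.
LEXICON: Dict[str, str] = {
    "gastr": "#stomach",     # English/Latin stem
    "magen": "#stomach",     # German stem
    "enter": "#intestine",
    "itis": "#inflamm",
    "entzuend": "#inflamm",  # German stem (umlaut transliterated)
    "o": "",
    "ung": "",
}


def segment(word: str, lexicon: Dict[str, str]) -> List[str]:
    """Split a word into lexicon subwords by greedy longest match.

    Characters that start no known subword are skipped.
    """
    word = word.lower()
    pieces: List[str] = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shrink.
        for j in range(len(word), i, -1):
            if word[i:j] in lexicon:
                pieces.append(word[i:j])
                i = j
                break
        else:
            i += 1  # no subword starts here; move on one character
    return pieces


def to_concepts(text: str, lexicon: Dict[str, str]) -> List[str]:
    """Map free text to the sequence of concept IDs of its subwords."""
    concepts: List[str] = []
    for word in text.split():
        for piece in segment(word, lexicon):
            if lexicon[piece]:
                concepts.append(lexicon[piece])
    return concepts


if __name__ == "__main__":
    print(segment("gastroenteritis", LEXICON))
    print(to_concepts("Gastritis", LEXICON))
    print(to_concepts("Magenentzuendung", LEXICON))
```

    Under these toy assumptions, the English and German terms map to the same concept sequence, which is the property that lets a query in one language retrieve documents in another.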

  20. On the seamless, harmonized use of ISO/IEEE11073 and openEHR.

    PubMed

    Trigo, Jesús D; Kohl, Christian D; Eguzkiza, Aitor; Martínez-Espronceda, Miguel; Alesanco, Álvaro; Serrano, Luis; García, José; Knaup, Petra

    2014-05-01

    Standardized exchange of clinical information is a key factor in the provision of high quality health care systems. In this context, the openEHR specification facilitates the management of health data in electronic health records (EHRs), while the ISO/IEEE11073 (also referred to as X73PHD) family of standards provides a reference framework for medical device interoperability. Hospitals and health care providers using openEHR require flawless integration of data coming from external sources, such as X73PHD. Hence, a harmonization process is crucial for achieving a seamless, coherent use of those specifications in real scenarios. Such harmonization is the aim of this paper. Thus, the classes and attributes of a representative number of X73PHD specializations for medical devices--weight, temperature, blood pressure, pulse and heart rate, oximetry, and electrocardiograph--along with the X73PHD core document--ISO/IEEE11073-20601--have been analyzed and mapped to openEHR archetypes. The proposed methodology reuses the existing archetypes when possible and suggests new ones--or appropriate modifications--otherwise. As a result, this paper analyzes the inconsistencies found and the implications thereof in the coordinated use of these two standards. The procedure has also shown how existing standards are able to influence the archetype development process, enhancing the existing archetype corpus.

  1. Magnetic resonance features of cerebral malaria.

    PubMed

    Yadav, P; Sharma, R; Kumar, S; Kumar, U

    2008-06-01

    Cerebral malaria is a major health hazard, with a high incidence of mortality. The disease is endemic in many developing countries, but with a greater increase in tourism, occasional cases may be detected in countries where the disease is not prevalent. Early diagnosis and evaluation of cerebral involvement in malaria utilizing modern imaging modalities have an impact on the treatment and clinical outcome. To evaluate the magnetic resonance (MR) features of patients with cerebral malaria presenting with altered sensorium. We present the findings in three patients with cerebral malaria presenting with altered sensorium. MR imaging using a 1.5-Tesla unit was carried out. The sequences performed were 5-mm-thick T1-weighted, T2-weighted, fluid-attenuated inversion-recovery (FLAIR), and T2-weighted gradient-echo axial sequences, and sagittal and coronal FLAIR. Diffusion-weighted imaging was performed with b values of 0 and 1000 s/mm², and apparent diffusion coefficient (ADC) maps were obtained. Focal hyperintensities in the bilateral periventricular white matter, corpus callosum, occipital subcortex, and bilateral thalami were noticed on T2-weighted and FLAIR sequences. The lesions were more marked in the splenium of the corpus callosum. No enhancement on postcontrast T1-weighted MR images was observed. There was no evidence of restricted diffusion on the diffusion-weighted sequence and ADC map. MR is a sensitive imaging modality, with a role in the assessment of cerebral lesions in malaria. Focal white matter and corpus callosal lesions without any restricted diffusion were the key findings in our patients.

  2. Unexpected recovery of function after severe traumatic brain injury: the limits of early neuroimaging-based outcome prediction.

    PubMed

    Edlow, Brian L; Giacino, Joseph T; Hirschberg, Ronald E; Gerrard, Jason; Wu, Ona; Hochberg, Leigh R

    2013-12-01

    Prognostication in the early stage of traumatic coma is a common challenge in the neuro-intensive care unit. We report the unexpected recovery of functional milestones (i.e., consciousness, communication, and community reintegration) in a 19-year-old man who sustained a severe traumatic brain injury. The early magnetic resonance imaging (MRI) findings, at the time, suggested a poor prognosis. During the first year of the patient's recovery, MRI with diffusion tensor imaging and T2*-weighted imaging was performed on day 8 (coma), day 44 (minimally conscious state), day 198 (post-traumatic confusional state), and day 366 (community reintegration). Mean apparent diffusion coefficient (ADC) and fractional anisotropy values in the corpus callosum, cerebral hemispheric white matter, and thalamus were compared with clinical assessments using the Disability Rating Scale (DRS). Extensive diffusion restriction in the corpus callosum and bihemispheric white matter was observed on day 8, with ADC values in a range typically associated with neurotoxic injury (230-400 × 10⁻⁶ mm²/s). T2*-weighted MRI revealed widespread hemorrhagic axonal injury in the cerebral hemispheres, corpus callosum, and brainstem. Despite the presence of severe axonal injury on early MRI, the patient regained the ability to communicate and perform activities of daily living independently at 1 year post-injury (DRS = 8). MRI data should be interpreted with caution when prognosticating for patients in traumatic coma. Recovery of consciousness and community reintegration are possible even when extensive traumatic axonal injury is demonstrated by early MRI.

  3. Metronidazole-induced encephalopathy: not always a reversible situation.

    PubMed

    Hobbs, Kyle; Stern-Nezer, Sara; Buckwalter, Marion S; Fischbein, Nancy; Finley Caulfield, Anna

    2015-06-01

    Metronidazole is a nitroimidazole antimicrobial drug prescribed to treat infections caused by anaerobic bacteria and protozoa. Uncommonly, it causes central nervous system (CNS) toxicity manifesting as metronidazole-induced encephalopathy (MIE). Case report. A 65-year-old woman with hepatitis B cirrhosis (Child-Pugh class C, MELD 21) developed progressive encephalopathy to GCS 4 during a 3-week course of metronidazole for cholecystitis. Initial MRI was consistent with CNS metronidazole toxicity, with symmetrical T2 hyperintensity and generally restricted diffusion in bilateral dentate nuclei, corpus callosum, midbrain, superior cerebellar peduncles, internal capsules, and cerebral white matter. Laboratory values did not demonstrate significant electrolyte shifts, and continuous EEG was without seizure. High-dose thiamine was empirically administered. Lumbar puncture was not performed due to coagulopathy and thrombocytopenia. Despite discontinuation of metronidazole and keeping ammonia levels near normal, the patient did not improve. MRI was repeated 1 week after discontinuation of metronidazole. Although there was decreased DWI hyperintensity in the dentate nuclei, diffuse T2 hyperintensity persisted and even progressed in the brainstem, basal ganglia, and subcortical white matter. Petechial hemorrhages developed in bilateral corticospinal tracts and subcortical white matter. T1 hypointensity appeared in the corpus callosum. She was transitioned to comfort measures only and died 12 days later. MIE is an uncommon adverse effect of treatment with metronidazole that characteristically affects the dentate nuclei but may also involve the brainstem, corpus callosum, subcortical white matter, and basal ganglia. While the clinical symptoms and neuroimaging changes are usually reversible, persistent encephalopathy with poor outcome may occur.

  4. [Limiting a Medline/PubMed query to the "best" articles using the JCR relative impact factor].

    PubMed

    Avillach, P; Kerdelhué, G; Devos, P; Maisonneuve, H; Darmoni, S J

    2014-12-01

    Medline/PubMed is the most frequently used medical bibliographic research database. The aim of this study was to propose a new generic method to limit any Medline/PubMed query based on the relative impact factor and the A & B categories of the SIGAPS score. The entire PubMed corpus was used for the feasibility study, then ten frequent diseases in terms of PubMed indexing and the citations of four Nobel prize winners. The relative impact factor (RIF) was calculated by medical specialty defined in Journal Citation Reports. The two queries, which included all the journals in category A (or A OR B), were added to any Medline/PubMed query as a central point of the feasibility study. Limitation using the SIGAPS category A was larger than when using the Core Clinical Journals (CCJ): 15.65% of the PubMed corpus vs 8.64% for CCJ. The response time of this limit applied to the entire PubMed corpus was less than two seconds. For five diseases out of ten, limiting the citations with the RIF was more effective than with the CCJ. For the four Nobel prize winners, limiting the citations with the RIF was more effective than with the CCJ. The feasibility study to apply a new filter based on the relative impact factor on any Medline/PubMed query was positive. Copyright © 2014 Elsevier Masson SAS. All rights reserved.

  5. Linguistic measures of chemical diversity and the "keywords" of molecular collections.

    PubMed

    Woźniak, Michał; Wołos, Agnieszka; Modrzyk, Urszula; Górski, Rafał L; Winkowski, Jan; Bajczyk, Michał; Szymkuć, Sara; Grzybowski, Bartosz A; Eder, Maciej

    2018-05-15

    Computerized linguistic analyses have proven of immense value in comparing and searching through large text collections ("corpora"), including those deposited on the Internet - indeed, it would nowadays be hard to imagine browsing the Web without, for instance, search algorithms extracting most appropriate keywords from documents. This paper describes how such corpus-linguistic concepts can be extended to chemistry based on characteristic "chemical words" that span more than traditional functional groups and, instead, look at common structural fragments molecules share. Using these words, it is possible to quantify the diversity of chemical collections/databases in new ways and to define molecular "keywords" by which such collections are best characterized and annotated.

  6. Detecting Protected Health Information in Heterogeneous Clinical Notes.

    PubMed

    Henriksson, Aron; Kvist, Maria; Dalianis, Hercules

    2017-01-01

    To enable secondary use of healthcare data in a privacy-preserving manner, there is a need for methods capable of automatically identifying protected health information (PHI) in clinical text. To that end, learning predictive models from labeled examples has emerged as a promising alternative to rule-based systems. However, little is known about differences with respect to PHI prevalence in different types of clinical notes and how potential domain differences may affect the performance of predictive models trained on one particular type of note and applied to another. In this study, we analyze the performance of a predictive model trained on an existing PHI corpus of Swedish clinical notes and applied to a variety of clinical notes: written (i) in different clinical specialties, (ii) under different headings, and (iii) by persons in different professions. The results indicate that domain adaption is needed for effective detection of PHI in heterogeneous clinical notes.

  7. Crossed aphasia following an infarction in the right corpus callosum.

    PubMed

    Ishizaki, Masatoshi; Ueyama, Hidetsugu; Nishida, Yasuto; Imamura, Shigehiro; Hirano, Teruyuki; Uchino, Makoto

    2012-02-01

    A 68-year-old right-handed woman with no history of brain damage or familial left-handedness was admitted to our hospital due to the acute onset of speech difficulty; her speech was nonfluent. Literal and phonological paraphasias, agrammatism and paragrammatism were observed. Brain MRI revealed an acute infarction in the right anterior cerebral artery territory, involving the right corpus callosum. Moreover, cerebral blood flow was decreased not only in the area of the right corpus callosum but also in the left fronto-temporal lobe, suggesting crossed diaschisis. This is a rare case of crossed aphasia following an infarction in the right corpus callosum. Copyright © 2011 Elsevier B.V. All rights reserved.

  8. Old document image segmentation using the autocorrelation function and multiresolution analysis

    NASA Astrophysics Data System (ADS)

    Mehri, Maroua; Gomez-Krämer, Petra; Héroux, Pierre; Mullot, Rémy

    2013-01-01

    Recent progress in the digitization of heterogeneous collections of ancient documents has rekindled new challenges in information retrieval in digital libraries and document layout analysis. Therefore, in order to control the quality of historical document image digitization and to meet the need of a characterization of their content using intermediate level metadata (between image and document structure), we propose a fast automatic layout segmentation of old document images based on five descriptors. Those descriptors, based on the autocorrelation function, are obtained by multiresolution analysis and used afterwards in a specific clustering method. The method proposed in this article has the advantage that it is performed without any hypothesis on the document structure, either about the document model (physical structure), or the typographical parameters (logical structure). It is also parameter-free since it automatically adapts to the image content. In this paper, firstly, we detail our proposal to characterize the content of old documents by extracting the autocorrelation features in the different areas of a page and at several resolutions. Then, we show that it is possible to automatically find the homogeneous regions defined by similar indices of autocorrelation without knowledge about the number of clusters using adapted hierarchical ascendant classification and consensus clustering approaches. To assess our method, we apply our algorithm on 316 old document images, which encompass six centuries (1200-1900) of French history, in order to demonstrate the performance of our proposal in terms of segmentation and characterization of heterogeneous corpus content. Moreover, we define a new evaluation metric, the homogeneity measure, which aims at evaluating the segmentation and characterization accuracy of our methodology. We obtain a mean homogeneity accuracy of 85%. These results help to represent a document by a hierarchy of layout structure and content, and to define one or more signatures for each page, on the basis of a hierarchical representation of homogeneous blocks and their topology.

  9. Parenting, corpus callosum, and executive function in preschool children.

    PubMed

    Kok, Rianne; Lucassen, Nicole; Bakermans-Kranenburg, Marian J; van IJzendoorn, Marinus H; Ghassabian, Akhgar; Roza, Sabine J; Govaert, Paul; Jaddoe, Vincent W; Hofman, Albert; Verhulst, Frank C; Tiemeier, Henning

    2014-01-01

    In this longitudinal population-based study (N = 544), we investigated whether early parenting and corpus callosum length predict child executive function abilities at 4 years of age. The length of the corpus callosum in infancy was measured using postnatal cranial ultrasounds at 6 weeks of age. At 3 years, two aspects of parenting were observed: maternal sensitivity during a teaching task and maternal discipline style during a discipline task. Parents rated executive function problems at 4 years of age in five domains of inhibition, shifting, emotional control, working memory, and planning/organizing, using the Behavior Rating Inventory of Executive Function-Preschool Version. Maternal sensitivity predicted fewer executive function problems at preschool age. A significant interaction was found between corpus callosum length in infancy and maternal use of positive discipline to determine child inhibition problems: The association between a relatively shorter corpus callosum in infancy and child inhibition problems was reduced in children who experienced more positive discipline. Our results point to the buffering potential of positive parenting for children with biological vulnerability.

  10. Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application

    PubMed Central

    French, Leon; Liu, Po; Marais, Olivia; Koreman, Tianna; Tseng, Lucia; Lai, Artemis; Pavlidis, Paul

    2015-01-01

    We describe the WhiteText project, and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 67% of connectivity statements at 51% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the over 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/. PMID:26052282
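
    The precision and recall figures quoted in this abstract can be combined into a single F1 score. The abstract itself does not report F1, so the value computed below is a derived illustration, not a result stated by the authors; the function name is ours.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Figures quoted in the abstract: 67% recall at 51% precision.
    f1 = f1_score(precision=0.51, recall=0.67)
    print(f"F1 = {f1:.3f}")  # prints "F1 = 0.579"
```

    Because the harmonic mean is pulled towards the lower of the two values, the combined score sits closer to the 51% precision than to the 67% recall.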

  11. Recommending Education Materials for Diabetic Questions Using Information Retrieval Approaches

    PubMed Central

    Wang, Yanshan; Shen, Feichen; Liu, Sijia; Rastegar-Mojarad, Majid; Wang, Liwei

    2017-01-01

    Background Self-management is crucial to diabetes care and providing expert-vetted content for answering patients’ questions is crucial in facilitating patient self-management. Objective The aim is to investigate the use of information retrieval techniques in recommending patient education materials for diabetic questions of patients. Methods We compared two retrieval algorithms, one based on Latent Dirichlet Allocation topic modeling (topic modeling-based model) and one based on semantic group (semantic group-based model), with the baseline retrieval models, vector space model (VSM), in recommending diabetic patient education materials to diabetic questions posted on the TuDiabetes forum. The evaluation was based on a gold standard dataset consisting of 50 randomly selected diabetic questions where the relevancy of diabetic education materials to the questions was manually assigned by two experts. The performance was assessed using precision of top-ranked documents. Results We retrieved 7510 diabetic questions on the forum and 144 diabetic patient educational materials from the patient education database at Mayo Clinic. The mapping rate of words in each corpus mapped to the Unified Medical Language System (UMLS) was significantly different (P<.001). The topic modeling-based model outperformed the other retrieval algorithms. For example, for the top-retrieved document, the precision of the topic modeling-based, semantic group-based, and VSM models was 67.0%, 62.8%, and 54.3%, respectively. Conclusions This study demonstrated that topic modeling can mitigate the vocabulary difference and it achieved the best performance in recommending education materials for answering patients’ questions. One direction for future work is to assess the generalizability of our findings and to extend our study to other disease areas, other patient education material resources, and online forums. PMID:29038097

  12. Autoimmune gastritis presenting as iron deficiency anemia in childhood.

    PubMed

    Gonçalves, Cristina; Oliveira, Maria Emília; Palha, Ana M; Ferrão, Anabela; Morais, Anabela; Lopes, Ana Isabel

    2014-11-14

    To characterize clinical, laboratorial, and histological profile of pediatric autoimmune gastritis in the setting of unexplained iron deficiency anemia investigation. A descriptive, observational study including pediatric patients with a diagnosis of autoimmune gastritis (positive parietal cell antibody and gastric corpus atrophy) established in a 6 year period (2006-2011) in the setting of refractory iron deficiency anemia (refractoriness to oral iron therapy for at least 6 mo and requirement for intravenous iron therapy) investigation, after exclusion of other potentially contributing causes of anemia. Helicobacter pylori (H. pylori) infection and anti-secretory therapy were also excluded. Data were retrospectively collected from clinical files, including: demographic data (age, gender, and ethnic background), past medical history, gastrointestinal symptoms, familial history, laboratorial evaluation (Hb, serum ferritin, serum gastrin, pepsinogen I/ pepsinogen II, B12 vitamin, intrinsic factor autoantibodies, thyroid autoantibodies, and anti-transglutaminase antibodies), and endoscopic and histological findings (HE, Periodic Acid-Schiff/Alcian blue, gastrin, chromogranin A and immunochemistry analysis for CD3, CD20 and CD68). Descriptive statistical analysis was performed (mean, median, and standard deviation). We report a case-series concerning 3 girls and 2 boys with a mean age of 13.6 ± 2.8 years (3 Caucasian and 2 African). One girl had type I diabetes. Familial history was positive in 4/5 cases, respectively for autoimmune thyroiditis (2/5), sarcoidosis (1/5) and multiple myeloma (1/5). Laboratorial evaluation on admission included: Hb: 9.5 ± 0.7 g/dL; serum ferritin: 4.0 ± 0.9 ng/mL; serum gastrin: 393 ± 286 pg/mL; low pepsinogen I/ pepsinogen II ratio in 1/5 patients; normal vitamin B12 levels (analyzed in 3 patients). 
    Endoscopy findings included: duodenal nodularity (2/5) and gastric fold softening (2/5), and histological evaluation showed corpus atrophic gastritis with lymphocytic infiltration (5/5), patchy oxyntic gland mononuclear cell infiltration (5/5), intestinal and/or pseudo-pyloric metaplasia in corpus mucosa (4/5), and enterochromaffin cell hyperplasia (4/5). Immunochemistry for gastrin on corpus biopsies was negative in all cases. Duodenal histology was normal. All biopsies were negative for H. pylori (Giemsa staining and cultural examination). We highlight autoimmune gastritis as a diagnosis to be considered when investigating refractory iron deficiency anemia in children, particularly in the setting of a personal/familial history of autoimmune disease, as well as the diagnostic contribution of a careful immunohistological evaluation.

  13. Spontaneous Speech Collection for the CSR Corpus

    DTIC Science & Technology

    1992-01-01

    Menlo Park, California 94025 1. ABSTRACT As part of a pilot data collection for DARPA’s Continuous Speech Recognition (CSR) speech corpus, SRI...International experimented with the collection of spontaneous speech material. The bulk of the CSR pilot data was read versions of news articles from...variable. 2. INTRODUCTION The CSR (Continuous Speech Recognition) Corpus collection can be considered the successor to the Resource Management

  14. Working Together: Contributions of Corpus Analyses and Experimental Psycholinguistics to Understanding Conversation.

    PubMed

    Meyer, Antje S; Alday, Phillip M; Decuyper, Caitlin; Knudsen, Birgit

    2018-01-01

    As conversation is the most important way of using language, linguists and psychologists should combine forces to investigate how interlocutors deal with the cognitive demands arising during conversation. Linguistic analyses of corpora of conversation are needed to understand the structure of conversations, and experimental work is indispensable for understanding the underlying cognitive processes. We argue that joint consideration of corpus and experimental data is most informative when the utterances elicited in a lab experiment match those extracted from a corpus in relevant ways. This requirement to compare like with like seems obvious but is not trivial to achieve. To illustrate this approach, we report two experiments where responses to polar (yes/no) questions were elicited in the lab and the response latencies were compared to gaps between polar questions and answers in a corpus of conversational speech. We found, as expected, that responses were given faster when they were easy to plan and planning could be initiated earlier than when they were harder to plan and planning was initiated later. Overall, in all but one condition, the latencies were longer than one would expect based on the analyses of corpus data. We discuss the implication of this partial match between the data sets and more generally how corpus and experimental data can best be combined in studies of conversation.

  15. Unique Cellular Lineage Composition of the First Gland of the Mouse Gastric Corpus.

    PubMed

    O'Neil, Andrew; Petersen, Christine P; Choi, Eunyoung; Engevik, Amy C; Goldenring, James R

    2017-01-01

    The glandular stomach has two major zones: the acid secreting corpus and the gastrin cell-containing antrum. Nevertheless, a single gland lies at the transition between the forestomach and corpus in the mouse stomach. We have sought to define the lineages that make up this gland unit at the squamocolumnar junction. The first gland in mice showed a notable absence of characteristic corpus lineages, including parietal cells and chief cells. In contrast, the gland showed strong staining of Griffonia simplicifolia-II (GSII)-lectin-positive mucous cells at the bases of glands, which were also positive for CD44 variant 9 and Clusterin. Prominent numbers of doublecortin-like kinase 1 (DCLK1) positive tuft cells were present in the first gland. The first gland contained Lgr5-expressing putative progenitor cells, and a large proportion of the cells were positive for Sox2. The cells of the first gland stained strongly for MUC4 and EpCAM, but both were absent in the normal corpus mucosa. The present studies indicate that the first gland in the corpus represents a unique anatomic entity. The presence of a concentration of progenitor cells and sensory tuft cells in this gland suggests that it may represent a source of reserve reparative cells for adapting to severe mucosal damage.

  16. UTD at TREC 2014: Query Expansion for Clinical Decision Support

    DTIC Science & Technology

    2014-11-01

    Description: A 62-year-old man sees a neurologist for progressive memory loss and jerking movements of the lower extremities. Neurologic examination confirms...infiltration. Summary: 62-year-old man with progressive memory loss and involuntary leg movements. Brain MRI reveals cortical atrophy, and cortical...latent topics produced by the Latent Dirichlet Allocation (LDA) on the TREC-CDS corpus of scientific articles. The position of words “loss” and “memory

  17. Natural language processing to extract symptoms of severe mental illness from clinical text: the Clinical Record Interactive Search Comprehensive Data Extraction (CRIS-CODE) project

    PubMed Central

    Jayatilleke, Nishamali; Kolliakou, Anna; Ball, Michael; Gorrell, Genevieve; Roberts, Angus; Stewart, Robert

    2017-01-01

    Objectives We sought to use natural language processing to develop a suite of language models to capture key symptoms of severe mental illness (SMI) from clinical text, to facilitate the secondary use of mental healthcare data in research. Design Development and validation of information extraction applications for ascertaining symptoms of SMI in routine mental health records using the Clinical Record Interactive Search (CRIS) data resource; description of their distribution in a corpus of discharge summaries. Setting Electronic records from a large mental healthcare provider serving a geographic catchment of 1.2 million residents in four boroughs of south London, UK. Participants The distribution of derived symptoms was described in 23 128 discharge summaries from 7962 patients who had received an SMI diagnosis, and 13 496 discharge summaries from 7575 patients who had received a non-SMI diagnosis. Outcome measures Fifty SMI symptoms were identified by a team of psychiatrists for extraction based on salience and linguistic consistency in records, broadly categorised under positive, negative, disorganisation, manic and catatonic subgroups. Text models for each symptom were generated using the TextHunter tool and the CRIS database. Results We extracted data for 46 symptoms with a median F1 score of 0.88. Four symptom models performed poorly and were excluded. From the corpus of discharge summaries, it was possible to extract symptomatology in 87% of patients with SMI and 60% of patients with non-SMI diagnosis. Conclusions This work demonstrates the possibility of automatically extracting a broad range of SMI symptoms from English text discharge summaries for patients with an SMI diagnosis. Descriptive data also indicated that most symptoms cut across diagnoses, rather than being restricted to particular groups. PMID:28096249

  18. Extracting Hot spots of Topics from Time Stamped Documents

    PubMed Central

    Chen, Wei; Chundi, Parvathi

    2011-01-01

    Identifying time periods with a burst of activities related to a topic has been an important problem in analyzing time-stamped documents. In this paper, we propose an approach to extract a hot spot of a given topic in a time-stamped document set. Topics can be basic, containing a simple list of keywords, or complex. Logical relationships such as and, or, and not are used to build complex topics from basic topics. A concept of presence measure of a topic based on fuzzy set theory is introduced to compute the amount of information related to the topic in the document set. Each interval in the time period of the document set is associated with a numeric value which we call the discrepancy score. A high discrepancy score indicates that the documents in the time interval are more focused on the topic than those outside of the time interval. A hot spot of a given topic is defined as a time interval with the highest discrepancy score. We first describe a naive implementation for extracting hot spots. We then construct an algorithm called EHE (Efficient Hot Spot Extraction) using several efficient strategies to improve performance. We also introduce the notion of a topic DAG to facilitate an efficient computation of presence measures of complex topics. The proposed approach is illustrated by several experiments on a subset of the TDT-Pilot Corpus and DBLP conference data set. The experiments show that the proposed EHE algorithm significantly outperforms the naive one, and the extracted hot spots of given topics are meaningful. PMID:21765568
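
    Reader's note: the naive hot spot search described above can be sketched as an exhaustive scan over all time intervals. The abstract does not give the formula for the discrepancy score, so this sketch assumes a simple stand-in (mean presence inside the interval minus mean presence outside it). The function name and the sample data are hypothetical, and this is not the EHE algorithm.

```python
from typing import List, Tuple


def naive_hot_spot(presence: List[float]) -> Tuple[int, int, float]:
    """Return (start, end, score) for the interval with the highest discrepancy.

    presence[i] is the topic's presence measure in time bucket i.
    ASSUMPTION: discrepancy = mean presence inside the interval minus
    mean presence outside it; the paper's actual score may differ.
    """
    n = len(presence)
    if n < 2:
        raise ValueError("need at least two time buckets")
    total = sum(presence)
    best = (0, 0, float("-inf"))
    for start in range(n):
        inside = 0.0
        for end in range(start, n):
            inside += presence[end]
            length = end - start + 1
            if length == n:
                continue  # the full range has no 'outside' to compare against
            score = inside / length - (total - inside) / (n - length)
            if score > best[2]:
                best = (start, end, score)
    return best


# Hypothetical presence values for six consecutive time buckets.
weekly_presence = [0.1, 0.2, 0.9, 0.8, 0.1, 0.0]
hot = naive_hot_spot(weekly_presence)
print(hot[0], hot[1])
```

    The scan examines every interval, so its cost grows quadratically with the number of time buckets, which is the kind of expense an optimized algorithm would aim to reduce.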

  19. A method for integrating and ranking the evidence for biochemical pathways by mining reactions from text

    PubMed Central

    Miwa, Makoto; Ohta, Tomoko; Rak, Rafal; Rowley, Andrew; Kell, Douglas B.; Pyysalo, Sampo; Ananiadou, Sophia

    2013-01-01

    Motivation: To create, verify and maintain pathway models, curators must discover and assess knowledge distributed over the vast body of biological literature. Methods supporting these tasks must understand both the pathway model representations and the natural language in the literature. These methods should identify and order documents by relevance to any given pathway reaction. No existing system has addressed all aspects of this challenge. Method: We present novel methods for associating pathway model reactions with relevant publications. Our approach extracts the reactions directly from the models and then turns them into queries for three text mining-based MEDLINE literature search systems. These queries are executed, and the resulting documents are combined and ranked according to their relevance to the reactions of interest. We manually annotate document-reaction pairs with the relevance of the document to the reaction and use this annotation to study several ranking methods, using various heuristic and machine-learning approaches. Results: Our evaluation shows that the annotated document-reaction pairs can be used to create a rule-based document ranking system, and that machine learning can be used to rank documents by their relevance to pathway reactions. We find that a Support Vector Machine-based system outperforms several baselines and matches the performance of the rule-based system. The success of the query extraction and ranking methods are used to update our existing pathway search system, PathText. Availability: An online demonstration of PathText 2 and the annotated corpus are available for research purposes at http://www.nactem.ac.uk/pathtext2/. Contact: makoto.miwa@manchester.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23813008

  20. A Corpus Based Study on the Use of Preposition of Time "On" and "At" in Argumentative Essays of Form 4 and Form 5 Malaysian Students

    ERIC Educational Resources Information Center

    Loke, Darina Lokeman; Ali, Juliana; Anthony, Norin Norain Zulkifli

    2013-01-01

    This article presents a corpus-based investigation on English prepositions of time presented in the argumentative essays of Form 4 and Form 5 Malaysian secondary students in the MCSAW corpus. The aims were to find out the distribution patterns and the common errors in the use of preposition of time, "on" and "at". This corpus…

  1. Volumetric brain analysis as a predictor of a worse cognitive outcome in Parkinson's disease.

    PubMed

    Vasconcellos, Luiz Felipe; Pereira, João Santos; Adachi, Marcelo; Greca, Denise; Cruz, Manuela; Malak, Ana Lara; Charchat-Fichman, Helenice

    2018-04-27

    Cognitive impairment in Parkinson's disease (PD) results in significant morbidity and mortality, making early diagnosis essential. Identification of patients who are at higher risk of developing cognitive impairment based only on clinical data is not sufficient. To this end, magnetic resonance imaging (MRI) with automatic segmentation, such as FreeSurfer, could be a useful tool with high accuracy because it has histological validation. The objective of this study was to evaluate clinical, neuropsychological and FreeSurfer variables that may be related to worse cognitive outcomes over 18 months in PD patients compared with controls. PD patients were recruited according to established inclusion and exclusion criteria as well as individuals without any neurological or psychiatric diagnosis and were submitted to the same protocol: neurological, neuropsychological and neuroimaging evaluations. After 18 months, the study subjects were reassessed by neurological and neuropsychological evaluations. Of 171 individuals selected for first evaluation, 96 concluded the study during 18-month follow-up. The PD group presented worse performance in the neuropsychological assessment during both the initial and final evaluations. The results obtained by FreeSurfer revealed a significant reduction (unilateral or bilateral) in the volume of thalamus, caudate nucleus, putamen, hippocampus, amygdala, accumbens, corpus callosum and cerebral gray matter in the PD group. A worse cognitive outcome was more prevalent in the PD group. Worse cognitive performance documented by neuropsychological assessment in the PD group was correlated with reduced volume of several structures by FreeSurfer analysis and may be a biomarker of cognitive decline. Copyright © 2018. Published by Elsevier Ltd.

  2. Textpresso Central: a customizable platform for searching, text mining, viewing, and curating biomedical literature.

    PubMed

    Müller, H-M; Van Auken, K M; Li, Y; Sternberg, P W

    2018-03-09

    The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. We describe the next generation of the Textpresso information retrieval system, Textpresso Central (TPC). TPC builds on the strengths of the original system by expanding the full text corpus to include the PubMed Central Open Access Subset (PMC OA), as well as the WormBase C. elegans bibliography. In addition, TPC allows users to create a customized corpus by uploading and processing documents of their choosing. TPC is UIMA compliant, to facilitate compatibility with external processing modules, and takes advantage of Lucene indexing and search technology for efficient handling of millions of full text documents. Like Textpresso, TPC searches can be performed using keywords and/or categories (semantically related groups of terms), but to provide better context for interpreting and validating queries, search results may now be viewed as highlighted passages in the context of full text. To facilitate biocuration efforts, TPC also allows users to select text spans from the full text and annotate them, create customized curation forms for any data type, and send resulting annotations to external curation databases. As an example of such a curation form, we describe integration of TPC with the Noctua curation tool developed by the Gene Ontology (GO) Consortium. Textpresso Central is an online literature search and curation platform that enables biocurators and biomedical researchers to search and mine the full text of literature by integrating keyword and category searches with viewing search results in the context of the full text. 
    It also allows users to create customized curation interfaces, use those interfaces to make annotations linked to supporting evidence statements, and then send those annotations to any database in the world. Textpresso Central URL: http://www.textpresso.org/tpc.

  3. Parsing clinical text: how good are the state-of-the-art parsers?

    PubMed Central

    2015-01-01

    Background Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. Methods In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using following two datasets: (1) A Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which is developed based on pathology notes and clinical notes, containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using the 10-fold cross validation method. Finally we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Results Our results showed that the original parsers achieved lower performance in clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebank, all parsers achieved better performance, with the best performance from the Stanford parser that reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. 
    When the combined clinical Treebanks and Penn Treebank were used, of the three parsers, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measure of 84.15% on the MiPACQ corpus. Conclusions Our study demonstrates that re-training using clinical Treebanks is critical for improving general English parsers' performance on clinical text, and combining clinical and open domain corpora might achieve optimal performance for parsing clinical text. PMID:26045009
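
    Reader's note: a bracketing F-measure of the kind quoted above is conventionally computed by comparing the labeled constituent spans of a predicted parse with those of the gold parse. The sketch below shows that basic idea only. The abstract does not say which scoring tool or conventions were used, so the span representation, the function name, and the toy trees are assumptions, and refinements such as punctuation handling are not modeled.

```python
from collections import Counter
from typing import Iterable, Tuple

# A labeled bracket: (constituent label, start token index, end token index).
Bracket = Tuple[str, int, int]


def bracketing_f_measure(gold: Iterable[Bracket], predicted: Iterable[Bracket]) -> float:
    """F-measure over labeled brackets, treating each parse as a multiset of spans."""
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection counts brackets that appear in both parses.
    matched = sum((gold_counts & pred_counts).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(pred_counts.values())
    recall = matched / sum(gold_counts.values())
    return 2 * precision * recall / (precision + recall)


# Toy example: the predicted parse mislabels one of four constituents.
gold_parse = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
predicted_parse = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
toy_f = bracketing_f_measure(gold_parse, predicted_parse)
print(toy_f)
```

    In the toy example three of the four brackets match on both sides, so precision, recall, and the F-measure all come out at 0.75.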

  4. NCBI disease corpus: a resource for disease name recognition and concept normalization.

    PubMed

    Doğan, Rezarta Islamaj; Leaman, Robert; Lu, Zhiyong

    2014-02-01

    Information encoded in natural language in biomedical literature publications is only useful if efficient and reliable ways of accessing and analyzing that information are available. Natural language processing and text mining tools are therefore essential for extracting valuable information, however, the development of powerful, highly effective tools to automatically detect central biomedical concepts such as diseases is conditional on the availability of annotated corpora. This paper presents the disease name and concept annotations of the NCBI disease corpus, a collection of 793 PubMed abstracts fully annotated at the mention and concept level to serve as a research resource for the biomedical natural language processing community. Each PubMed abstract was manually annotated by two annotators with disease mentions and their corresponding concepts in Medical Subject Headings (MeSH®) or Online Mendelian Inheritance in Man (OMIM®). Manual curation was performed using PubTator, which allowed the use of pre-annotations as a pre-step to manual annotations. Fourteen annotators were randomly paired and differing annotations were discussed for reaching a consensus in two annotation phases. In this setting, a high inter-annotator agreement was observed. Finally, all results were checked against annotations of the rest of the corpus to assure corpus-wide consistency. The public release of the NCBI disease corpus contains 6892 disease mentions, which are mapped to 790 unique disease concepts. Of these, 88% link to a MeSH identifier, while the rest contain an OMIM identifier. We were able to link 91% of the mentions to a single disease concept, while the rest are described as a combination of concepts. In order to help researchers use the corpus to design and test disease identification methods, we have prepared the corpus as training, testing and development sets. 
    To demonstrate its utility, we conducted a benchmarking experiment where we compared three different knowledge-based disease normalization methods with a best performance in F-measure of 63.7%. These results show that the NCBI disease corpus has the potential to significantly improve the state-of-the-art in disease name recognition and normalization research, by providing a high-quality gold standard thus enabling the development of machine-learning based approaches for such tasks. The NCBI disease corpus, guidelines and other associated resources are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Dogan/DISEASE/. Published by Elsevier Inc.

  5. Gastric atrophy and intestinal metaplasia before and after Helicobacter pylori eradication: a meta-analysis.

    PubMed

    Wang, Jin; Xu, Lijuan; Shi, Ruihua; Huang, Xiayue; Li, Simon Wing Heng; Huang, Zuhu; Zhang, Guoxin

    2011-01-01

    Whether gastric atrophy (GA) and intestinal metaplasia (IM) are reversible after the eradication of Helicobacter pylori remains controversial. The purpose of this meta-analysis was to systematically review histological alterations in GA and IM by comparing histological scores before and after H. pylori eradication. English-language articles in the medical literature containing information about the association between infection with H. pylori and gastric premalignant lesions (i.e. GA and IM) were identified by searching the Medline, PubMed, and EMBASE databases with suitable key words up to December 2009. Review Manager 4.2.8 was used for the meta-analysis. Twelve studies containing a total of 2,658 patients were included in the first meta-analysis. Before treatment, 2,648 patients had antrum GA, 2,401 patients had corpus GA, 2,582 patients had antrum IM, and 2,460 patients had corpus IM. Comparing the histological alterations before and after H. pylori eradication, the pooled weighted mean difference (WMD) with 95% CI for antral GA was 0.12 (0.00-0.23), p = 0.06. For corpus GA, the pooled WMD was 0.32 (0.09-0.54), p = 0.006. For antral IM, the pooled WMD was 0.02 (-0.12-0.16), p = 0.76, and for corpus IM, the pooled WMD was -0.02 (-0.05-0.02), p = 0.42. Our study shows that eradication of H. pylori results in significant improvement in GA in the corpus but not in the antrum; it also does not improve gastric mucous IM. Consequently, all patients with GA in the corpus should be tested for H. pylori infection, and eradication therapy should be prescribed for H. pylori-positive patients in those with GA in corpus. Copyright © 2011 S. Karger AG, Basel.

  6. Saw palmetto extract enhances erectile responses by inhibition of phosphodiesterase 5 activity and increase in inducible nitric oxide synthase messenger ribonucleic acid expression in rat and rabbit corpus cavernosum.

    PubMed

    Yang, Surong; Chen, Changrui; Li, Yiying; Ren, Zhenghua; Zhang, Yungang; Wu, Gantong; Wang, Hao; Hu, Zhenzhen; Yao, Minghui

    2013-06-01

    To evaluate whether saw palmetto extract (SPE) relaxes corpus cavernosum and explore the underlying mechanisms. Forty Sprague-Dawley rats and 30 New Zealand rabbits were randomly allocated into 3 SPE-treated groups (low-, middle-, and high-dose) and 1 saline-treated control group. SPE was administered intragastrically for 7 consecutive days. Another 23 rats treated with sildenafil were used to appraise the erectile response to electrical stimulation of nerves in the corpus cavernosum. The erectile functions of rats and rabbits were evaluated 24 hours after the last SPE administration or 15 minutes after intragastric sildenafil. Outcome measures included corpus cavernosum electrical activity recording, phosphodiesterase 5 (PDE5) activity detected by the colorimetric quantitative method, and messenger ribonucleic acid (mRNA) expression level for PDE5 and inducible nitric oxide synthase (iNOS) determined using real-time polymerase chain reaction. In the SPE-treated animals, the relaxant response to electrical stimulation of nerves in the corpus cavernosum, reflected by the amplitude of the electrical activity within the cavernosum, was significantly and dose-dependently augmented. Similar effects were observed in the sildenafil-treated rats. PDE5 activity in rat and rabbit corpus cavernosum tissues was significantly and dose-dependently inhibited in SPE-treated animals, whereas the iNOS mRNA level increased compared with the saline group. PDE5 mRNA, however, was only significantly enhanced in the rats treated with the middle dose of SPE. The results suggest that SPE may have potential application value for the prevention or treatment of erectile dysfunction through an increase in iNOS mRNA expression and inhibition of PDE5 activity in corpus cavernosum smooth muscles. Copyright © 2013 Elsevier Inc. All rights reserved.

  7. Polyethylene glycol restores axonal conduction after corpus callosum transection.

    PubMed

    Bamba, Ravinder; Riley, D Colton; Boyer, Richard B; Pollins, Alonda C; Shack, R Bruce; Thayer, Wesley P

    2017-05-01

    Polyethylene glycol (PEG) has been shown to restore axonal continuity after peripheral nerve transection in animal models. We hypothesized that PEG can also restore axonal continuity in the central nervous system. In this current experiment, coronal sectioning of the brains of Sprague-Dawley rats was performed after animal sacrifice. 3Brain high-resolution microelectrode arrays (MEA) were used to measure mean firing rate (MFR) and peak amplitude across the corpus callosum of the ex-vivo brain slices. The corpus callosum was subsequently transected and repeated measurements were performed. The cut ends of the corpus callosum were still apposite at this time. A PEG solution was applied to the injury site and repeated measurements were performed. MEA measurements showed that PEG was capable of restoring electrophysiology signaling after transection of central nerves. Before injury, the average MFRs at the ipsilateral, midline, and contralateral corpus callosum were 0.76, 0.66, and 0.65 spikes/second, respectively, and the average peak amplitudes were 69.79, 58.68, and 49.60 μV, respectively. After injury, the average MFRs were 0.71, 0.14, and 0.25 spikes/second, respectively and peak amplitudes were 52.11, 8.98, and 16.09 μV, respectively. After application of PEG, there were spikes in MFR and peak amplitude at the injury site and contralaterally. The average MFRs were 0.75, 0.55, and 0.47 spikes/second at the ipsilateral, midline, and contralateral corpus callosum, respectively and peak amplitudes were 59.44, 45.33, 40.02 μV, respectively. There were statistically significant differences in the average MFRs and peak amplitudes between the midline and non-midline corpus callosum groups (P < 0.01, P < 0.05). These findings suggest that PEG restores axonal conduction between severed central nerves, potentially representing axonal fusion.

  8. Polyethylene glycol restores axonal conduction after corpus callosum transection

    PubMed Central

    Bamba, Ravinder; Riley, D. Colton; Boyer, Richard B.; Pollins, Alonda C.; Shack, R. Bruce; Thayer, Wesley P.

    2017-01-01

    Polyethylene glycol (PEG) has been shown to restore axonal continuity after peripheral nerve transection in animal models. We hypothesized that PEG can also restore axonal continuity in the central nervous system. In this current experiment, coronal sectioning of the brains of Sprague-Dawley rats was performed after animal sacrifice. 3Brain high-resolution microelectrode arrays (MEA) were used to measure mean firing rate (MFR) and peak amplitude across the corpus callosum of the ex-vivo brain slices. The corpus callosum was subsequently transected and repeated measurements were performed. The cut ends of the corpus callosum were still apposite at this time. A PEG solution was applied to the injury site and repeated measurements were performed. MEA measurements showed that PEG was capable of restoring electrophysiological signaling after transection of central nerves. Before injury, the average MFRs at the ipsilateral, midline, and contralateral corpus callosum were 0.76, 0.66, and 0.65 spikes/second, respectively, and the average peak amplitudes were 69.79, 58.68, and 49.60 μV, respectively. After injury, the average MFRs were 0.71, 0.14, and 0.25 spikes/second, respectively, and peak amplitudes were 52.11, 8.98, and 16.09 μV, respectively. After application of PEG, there were spikes in MFR and peak amplitude at the injury site and contralaterally. The average MFRs were 0.75, 0.55, and 0.47 spikes/second at the ipsilateral, midline, and contralateral corpus callosum, respectively, and peak amplitudes were 59.44, 45.33, and 40.02 μV, respectively. There were statistically significant differences in the average MFRs and peak amplitudes between the midline and non-midline corpus callosum groups (P < 0.01, P < 0.05). These findings suggest that PEG restores axonal conduction between severed central nerves, potentially representing axonal fusion. PMID:28616031
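
    The reported means allow a quick back-of-envelope check of how much conduction was lost and regained at each site. The sketch below expresses the post-injury and post-PEG MFR as a percentage of the pre-injury value. The percentages are derived here and are not reported in the abstract; the names SITES, MFR, and percent_of_baseline are illustrative.

```python
# Mean firing rates (spikes/second) copied from the abstract, ordered as
# ipsilateral, midline, contralateral.
SITES = ("ipsilateral", "midline", "contralateral")
MFR = {
    "baseline": (0.76, 0.66, 0.65),
    "injury": (0.71, 0.14, 0.25),
    "peg": (0.75, 0.55, 0.47),
}


def percent_of_baseline(condition):
    """Return each site's MFR as a percentage of its pre-injury value."""
    return {
        site: round(100.0 * value / base, 1)
        for site, value, base in zip(SITES, MFR[condition], MFR["baseline"])
    }


if __name__ == "__main__":
    for condition in ("injury", "peg"):
        print(condition, percent_of_baseline(condition))
```

    On these numbers the midline MFR falls to roughly a fifth of baseline after transection and returns to roughly four fifths after PEG, while the ipsilateral site changes little.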

  9. Fatty acid composition of the postmortem corpus callosum of patients with schizophrenia, bipolar disorder, or major depressive disorder.

    PubMed

    Hamazaki, K; Maekawa, M; Toyota, T; Dean, B; Hamazaki, T; Yoshikawa, T

    2017-01-01

    Studies investigating the relationship between n-3 polyunsaturated fatty acid (PUFA) levels and psychiatric disorders have thus far focused mainly on analyzing gray matter, rather than white matter, in the postmortem brain. In this study, we investigated whether PUFA levels showed abnormalities in the corpus callosum, the largest area of white matter, in the postmortem brain tissue of patients with schizophrenia, bipolar disorder, or major depressive disorder. Fatty acids in the phospholipids of the postmortem corpus callosum were evaluated by thin-layer chromatography and gas chromatography. Specimens were evaluated for patients with schizophrenia (n=15), bipolar disorder (n=15), or major depressive disorder (n=15) and compared with unaffected controls (n=15). In contrast to some previous studies, no significant differences were found in the levels of PUFAs or other fatty acids in the corpus callosum between patients and controls. A subanalysis by sex gave the same results. No significant differences were found in any PUFAs between suicide completers and non-suicide cases regardless of psychiatric disorder diagnosis. Patients with psychiatric disorders did not exhibit n-3 PUFAs deficits in the postmortem corpus callosum relative to the unaffected controls, and the corpus callosum might not be involved in abnormalities of PUFA metabolism. This area of research is still at an early stage and requires further investigation. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  10. Effects of Vitamin D Restricted Diet Administered during Perinatal and Postnatal Periods on the Penis of Wistar Rats

    PubMed Central

    Fernandes-Lima, Flávia; Gregório, Bianca M.; Nascimento, Fernanda A. M.; Costa, Waldemar S.; Sampaio, Francisco J. B.

    2018-01-01

    Vitamin D deficiency is common in pregnant women and infants. The present study aimed to investigate the effects of vitamin D restricted diet on the Wistar rats offspring penis morphology. Mother rats received either standard diet (SC) or vitamin D restricted (VitD) diet. At birth, offspring were divided into SC/SC (from SC mothers, fed with SC diet) and VitD/VitD (from VitD mothers, fed with VitD diet). After euthanasia the penises were processed for histomorphometric analysis. The VitD/VitD offspring displayed metabolic changes and reduction in the cross-sectional area of the penis, corpus cavernosum, tunica albuginea, and increased area of the corpus spongiosum. The connective tissue, smooth muscle, and cell proliferation percentages were greater in the corpus cavernosum and corpus spongiosum in the VitD/VitD offspring. The percentages of sinusoidal spaces and elastic fibers in the corpus cavernosum decreased. The elastic fibers in the tunica albuginea of the corpus spongiosum in the VitD/VitD offspring were reduced. Vitamin D restriction during perinatal and postnatal periods induced metabolic and structural changes and represented important risk factors for erectile dysfunction in the penis of the adult offspring. These findings suggest that vitamin D is an important micronutrient in maintaining the cytoarchitecture of the penis. PMID:29850540

  11. Working Together: Contributions of Corpus Analyses and Experimental Psycholinguistics to Understanding Conversation

    PubMed Central

    Meyer, Antje S.; Alday, Phillip M.; Decuyper, Caitlin; Knudsen, Birgit

    2018-01-01

    As conversation is the most important way of using language, linguists and psychologists should combine forces to investigate how interlocutors deal with the cognitive demands arising during conversation. Linguistic analyses of corpora of conversation are needed to understand the structure of conversations, and experimental work is indispensable for understanding the underlying cognitive processes. We argue that joint consideration of corpus and experimental data is most informative when the utterances elicited in a lab experiment match those extracted from a corpus in relevant ways. This requirement to compare like with like seems obvious but is not trivial to achieve. To illustrate this approach, we report two experiments where responses to polar (yes/no) questions were elicited in the lab and the response latencies were compared to gaps between polar questions and answers in a corpus of conversational speech. We found, as expected, that responses were given faster when they were easy to plan and planning could be initiated earlier than when they were harder to plan and planning was initiated later. Overall, in all but one condition, the latencies were longer than one would expect based on the analyses of corpus data. We discuss the implication of this partial match between the data sets and more generally how corpus and experimental data can best be combined in studies of conversation. PMID:29706919

  12. Qcorp: an annotated classification corpus of Chinese health questions.

    PubMed

    Guo, Haihong; Na, Xu; Li, Jiao

    2018-03-22

    Health question-answering (QA) systems have become a typical application scenario of Artificial Intelligence (AI). An annotated question corpus is a prerequisite for training machines to understand the health information needs of users. Thus, we aimed to develop an annotated classification corpus of Chinese health questions (Qcorp) and make it openly accessible. We developed a two-layered classification schema and corresponding annotation rules on the basis of our previous work. Using the schema, we annotated 5000 questions that were randomly selected from 5 Chinese health websites within 6 broad sections. Eight annotators participated in the annotation task, and the inter-annotator agreement was evaluated to ensure the corpus quality. Furthermore, the distribution and relationship of the annotated tags were measured by descriptive statistics and a social network map. The questions were annotated using 7101 tags that cover 29 topic categories in the two-layered schema. In our released corpus, the distribution of questions across the top-layer categories was treatment 64.22%, diagnosis 37.14%, epidemiology 14.96%, healthy lifestyle 10.38%, and health provider choice 4.54%. Both the annotated health questions and the annotation schema are openly accessible on the Qcorp website. Users can download the annotated Chinese questions in CSV, XML, and HTML format. We developed a Chinese health question corpus including 5000 manually annotated questions. It is openly accessible and would contribute to the intelligent health QA system development.

  13. SPG11 Presenting with Tremor

    PubMed Central

    Schneider, Susanne A.; Mummery, Catherine J.; Mehrabian, Mohadeseh; Houlden, Henry; Bain, Peter G.

    2012-01-01

    Background: Hereditary spastic paraplegias (HSPs) are a clinically and genetically heterogeneous group of neurological diseases, which typically present with progressive lower extremity weakness and spasticity causing progressive walking difficulties. Complicating neurological or extraneurological features may be present. Case Report: We describe a 19-year-old male who was referred because of an action tremor of the hands; he later developed walking difficulties. Callosal atrophy was present on his cerebral magnetic resonance imaging scan, prompting genetic testing for SPG11, which revealed homozygous mutations. Discussion: The clinical features, differential diagnosis and management of SPG11, the most common form of autosomal recessive complicated HSP with a thin corpus callosum are discussed. PMID:23439843

  14. Magnetic resonance imaging spectrum of succinate dehydrogenase-related infantile leukoencephalopathy.

    PubMed

    Helman, Guy; Caldovic, Ljubica; Whitehead, Matthew T; Simons, Cas; Brockmann, Knut; Edvardson, Simon; Bai, Renkui; Moroni, Isabella; Taylor, J Michael; Van Haren, Keith; Taft, Ryan J; Vanderver, Adeline; van der Knaap, Marjo S

    2016-03-01

    Succinate dehydrogenase-deficient leukoencephalopathy is a complex II-related mitochondrial disorder for which the clinical phenotype, neuroimaging pattern, and genetic findings have not been comprehensively reviewed. Nineteen individuals with succinate dehydrogenase deficiency-related leukoencephalopathy were reviewed for neuroradiological, clinical, and genetic findings as part of institutional review board-approved studies at Children's National Health System (Washington, DC) and VU University Medical Center (Amsterdam, the Netherlands). All individuals had signal abnormalities in the central corticospinal tracts and spinal cord where imaging was available. Other typical findings were involvement of the cerebral hemispheric white matter with sparing of the U fibers, the corpus callosum with sparing of the outer blades, the basis pontis, middle cerebellar peduncles, and cerebellar white matter, and elevated succinate on magnetic resonance spectroscopy (MRS). The thalamus was involved in most studies, with a predilection for the anterior nucleus, pulvinar, and geniculate bodies. Clinically, infantile onset neurological regression with partial recovery and subsequent stabilization was typical. All individuals had mutations in SDHA, SDHB, or SDHAF1, or proven biochemical defect. Succinate dehydrogenase deficiency is a rare leukoencephalopathy, for which improved recognition by magnetic resonance imaging (MRI) in combination with advanced sequencing technologies allows noninvasive diagnostic confirmation. The MRI pattern is characterized by cerebral hemispheric white matter abnormalities with sparing of the U fibers, corpus callosum involvement with sparing of the outer blades, and involvement of corticospinal tracts, thalami, and spinal cord. In individuals with infantile regression and this pattern of MRI abnormalities, the differential diagnosis should include succinate dehydrogenase deficiency, in particular if MRS shows elevated succinate. © 2016 American Neurological Association.

  15. On feature augmentation for semantic argument classification of the Quran English translation using support vector machine

    NASA Astrophysics Data System (ADS)

    Khaira Batubara, Dina; Arif Bijaksana, Moch; Adiwijaya

    2018-03-01

    Research on semantic argument classification requires large amounts of semantically labeled data, called a corpus. Because building a corpus is costly and time-consuming, many recent studies have used an existing corpus as training data for semantic argument classification in a new domain. However, previous studies have shown that performance decreases significantly when the training and testing data come from different domains. The main problem arises when an argument found in the testing data does not occur in the training data. This research performs semantic argument classification in a new domain, the English translation of the Quran, using the Propbank corpus as training data. To recognize such new arguments, this research proposes four new features that extend the argument features in the training data. Using a linear SVM, the experiments show that augmenting the baseline system with certain combinations of the proposed features improves the performance of semantic argument classification on the Quran data using the Propbank corpus as training data.

  16. Two episodes of hemoperitoneum from luteal cysts rupture in a patient with congenital factor X deficiency.

    PubMed

    Dafopoulos, Konstantinos; Galazios, Georgios; Georgadakis, Georgios; Boulbou, Maria; Koutsoyiannis, Dimitrios; Plakopoulos, Apostolos; Anastasiadis, Panagiotis

    2003-01-01

    The clinical manifestation of two episodes of hemoperitoneum from ruptured corpus luteum cysts, during the luteal phase of the cycle in a young patient with the rare congenital factor X deficiency, is reported for the first time in literature. The correct diagnosis of the underlying disorder, the gynecological management and the regular follow-up can minimize the risks of this potentially life-threatening hematological disorder. Copyright 2003 S. Karger AG, Basel

  17. Automatic corpus callosum segmentation for standardized MR brain scanning

    NASA Astrophysics Data System (ADS)

    Xu, Qing; Chen, Hong; Zhang, Li; Novak, Carol L.

    2007-03-01

    Magnetic Resonance (MR) brain scanning is often planned manually with the goal of aligning the imaging plane with key anatomic landmarks. The planning is time-consuming and subject to inter- and intra- operator variability. An automatic and standardized planning of brain scans is highly useful for clinical applications, and for maximum utility should work on patients of all ages. In this study, we propose a method for fully automatic planning that utilizes the landmarks from two orthogonal images to define the geometry of the third scanning plane. The corpus callosum (CC) is segmented in sagittal images by an active shape model (ASM), and the result is further improved by weighting the boundary movement with confidence scores and incorporating region based refinement. Based on the extracted contour of the CC, several important landmarks are located and then combined with landmarks from the coronal or transverse plane to define the geometry of the third plane. Our automatic method is tested on 54 MR images from 24 patients and 3 healthy volunteers, with ages ranging from 4 months to 70 years old. The average accuracy with respect to two manually labeled points on the CC is 3.54 mm and 4.19 mm, and differed by an average of 2.48 degrees from the orientation of the line connecting them, demonstrating that our method is sufficiently accurate for clinical use.

  18. Aicardi syndrome

    MedlinePlus

    ... called the corpus callosum) is partly or completely missing. Nearly all known cases occur in people with ... criteria: Corpus callosum that is partly or completely missing Female sex Seizures (typically beginning as infantile spasms) ...

  19. Corpus gastritis and erosive esophagitis: a report from the Middle East.

    PubMed

    Contractor, Qais Qutub; ul Haque, Imran; Saka, Hala; Contractor, Tasneem Qais

    2006-01-01

    To assess whether corpus gastritis due to Helicobacter pylori protects against erosive esophagitis in an area with high prevalence of H. pylori infection. Biopsies obtained from gastric corpus and antrum in 151 patients with symptoms of gastroesophageal reflux disease were studied for presence of H. pylori and endoscopic evidence of gastritis. Presence and grade of esophagitis at endoscopy was recorded. Fifty-four (36%) patients had endoscopic esophagitis. Patients with severe esophagitis (≥ grade II) less often had active gastritis (15/45 vs. 55/98; p=0.02) and had a lower density of H. pylori (p=0.0003) than those without esophagitis. Active corpus gastritis due to H. pylori infection may protect against erosive esophagitis in patients with gastroesophageal reflux disease in the Middle East.
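
    The comparison 15/45 vs. 55/98 can be checked from the counts alone. The abstract does not say which test produced p=0.02, so the sketch below assumes a chi-square test with Yates' continuity correction on the 2x2 table; the function name yates_chi_square is illustrative.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square test with Yates' continuity correction for a 2x2 table
    [[a, b], [c, d]]; returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2.0, 0.0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = numerator / denominator
    # For 1 degree of freedom the chi-square survival function reduces to
    # erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Rows: severe esophagitis, no esophagitis.
# Columns: active gastritis present, absent (15/45 vs. 55/98 in the abstract).
stat, p = yates_chi_square(15, 45 - 15, 55, 98 - 55)
print(f"chi2 = {stat:.2f}, p = {p:.3f}")
```

    Under this assumed test the p-value comes out close to 0.02, in line with the reported figure; an uncorrected chi-square or Fisher's exact test would give a somewhat different value.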

  20. Chinese Grand Strategy: How Anti-Access/Area Denial (A2/AD) Fits in China’s Plan

    DTIC Science & Technology

    2014-04-01

    Qiao and Wang, Unrestricted Warfare, 142. 6. Corpus, “America’s Acupuncture Points” Asia Times Online, (Part 2, Section 5). 7. Ibid. 8. Stokes...Corpus, “America’s Acupuncture Points”, (Part 1, Section 1). 46. Qiao and Wang, Unrestricted Warfare, 93. 47. Ibid. 48. Military Factory, “American War...Employment Concepts in the 21st Century. RAND Report FA7014-06-C-0001. Santa Monica, CA: RAND, 2011. Corpus, Victor N. “America’s Acupuncture Points

  1. MRI and MR spectroscopy findings of a case of subacute sclerosing panencephalitis affecting the corpus callosum

    PubMed Central

    Öztürk, Mehmet; Sığırcı, Ahmet; Yakıncı, Cengiz

    2015-01-01

    Subacute sclerosing panencephalitis (SSPE) is a rare, slowly progressive, fatal, inflammatory and neurodegenerative disease that is seen mostly in children and young adolescents, and primarily affects the parieto-occipital lobes. The corpus callosum, cerebellum and basal ganglia are less frequently involved. MR spectroscopy (MRS) may illustrate the pathophysiological features of SSPE. To the best of our knowledge, this is the second report of MRS findings of corpus callosum involvement in a stage 3 SSPE case. PMID:26163552

  2. Smoking habit and gastritis histology.

    PubMed

    Namiot, A; Kemona, A; Namiot, Z

    2007-01-01

    Long-term cigarette smoking may increase the risk of digestive tract pathologies; however, the influence of smoking habit on gastric mucosa histology is still poorly understood. The aim of the study was to compare histological evaluation of gastritis in smoker and non-smoker groups. A total of 236 patients of various H. pylori status (109 infected, 127 non-infected), clinical diagnosis (107 duodenal ulcer disease, 129 dyspepsia), and smoking habit (92 smokers, 144 non-smokers) were included. Subjects were classified as smokers if they smoked 5 or more cigarettes per day for at least 3 years. A histological examination of endoscopically obtained samples was performed by two experienced pathomorphologists blinded to the diagnoses and smoking habit. Microscopic slices of the gastric mucosa were stained with hematoxylin-eosin and Giemsa. Apart from histological diagnosis, H. pylori status was additionally confirmed by a urease test (CLO-test) in at least one of two gastric locations (antrum or corpus). In the H. pylori infected population, H. pylori density and neutrophil and mononuclear cell infiltration in the gastric corpus mucosa were lower in smokers than in non-smokers, while in the antrum the differences were not significant. In the non-infected population, no significant differences in neutrophil and mononuclear cell infiltration between smokers and non-smokers were found. Since significant differences in the studied parameters of chronic gastritis between smokers and non-smokers were found in the corpus mucosa of H. pylori infected subjects, smoking should be taken into account when a histological evaluation of the gastric mucosa in the H. pylori infected population is performed.

  3. Evidence for impaired plasticity after traumatic brain injury in the developing brain.

    PubMed

    Li, Nan; Yang, Ya; Glover, David P; Zhang, Jiangyang; Saraswati, Manda; Robertson, Courtney; Pelled, Galit

    2014-02-15

    The robustness of plasticity mechanisms during brain development is essential for synaptic formation and has a beneficial outcome after sensory deprivation. However, the role of plasticity in recovery after acute brain injury in children has not been well defined. Traumatic brain injury (TBI) is the leading cause of death and disability among children, and long-term disability from pediatric TBI can be particularly devastating. We investigated the altered cortical plasticity 2-3 weeks after injury in a pediatric rat model of TBI. Significant decreases in neurophysiological responses across the depth of the noninjured, primary somatosensory cortex (S1) in TBI rats, compared to age-matched controls, were detected with electrophysiological measurements of multi-unit activity (86.4% decrease), local field potential (75.3% decrease), and functional magnetic resonance imaging (77.6% decrease). Because the corpus callosum is a clinically important white matter tract that was shown to be consistently involved in post-traumatic axonal injury, we investigated its anatomical and functional characteristics after TBI. Indeed, corpus callosum abnormalities in TBI rats were detected with diffusion tensor imaging (9.3% decrease in fractional anisotropy) and histopathological analysis (14% myelination volume decreases). Whole-cell patch clamp recordings further revealed that TBI results in significant decreases in spontaneous firing rate (57% decrease) and the potential to induce long-term potentiation in neurons located in layer V of the noninjured S1 by stimulation of the corpus callosum (82% decrease). The results suggest that post-TBI plasticity can translate into inappropriate neuronal connections and dramatic changes in the function of neuronal networks.

  4. Level statistics of words: Finding keywords in literary texts and symbolic sequences

    NASA Astrophysics Data System (ADS)

    Carpena, P.; Bernaola-Galván, P.; Hackenberg, M.; Coronado, A. V.; Oliver, J. L.

    2009-03-01

    Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to extract automatically keywords in literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered (i.e., they self-attract each other), while irrelevant words are distributed randomly in the text. Since a reference corpus is not needed, our approach is especially suitable for single documents for which no a priori information is available. In addition, we show that our method works also in generic symbolic sequences (continuous texts without spaces), thus suggesting its general applicability.
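
    The idea that relevant words cluster while irrelevant words spread out randomly can be illustrated with a simple spacing statistic. The sketch below is a simplified stand-in, not the authors' exact measure: it assumes a score equal to the standard deviation of the gaps between successive occurrences of a word divided by their mean, and the names clustering_scores and min_count are illustrative.

```python
import statistics
from collections import defaultdict


def clustering_scores(tokens, min_count=5):
    """Score each word by the normalized spread of its inter-occurrence gaps.

    sigma = stdev(gaps) / mean(gaps). Values well above 1 indicate that the
    occurrences bunch together; values near or below 1 indicate a roughly
    random or regular spread. Words seen fewer than min_count times are skipped.
    """
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    scores = {}
    for word, where in positions.items():
        if len(where) < min_count:
            continue
        gaps = [b - a for a, b in zip(where, where[1:])]
        scores[word] = statistics.pstdev(gaps) / statistics.mean(gaps)
    return scores


# Toy text: "quark" appears in two tight bursts, "the" at regular intervals.
tokens = []
for i in range(200):
    tokens.append("the" if i % 4 == 0 else f"w{i}")
    if 20 <= i < 25 or 150 <= i < 155:
        tokens.append("quark")

scores = clustering_scores(tokens)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

    In the toy text the bursty word scores well above 1 and the evenly spaced function word scores well below 1, using only the text itself and no reference corpus.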

  5. Faciobrachial dystonic seizures result from fronto-temporo-basalganglial network involvement.

    PubMed

    Iyer, Rajesh Shankar; Ramakrishnan, T C R; Karunakaran; Shinto, Ajit; Kamaleshwaran, Koramadai Karuppuswamy

    2017-01-01

    • Faciobrachial dystonic seizures (FBDS) are caused by autoantibodies to leucine-rich glioma-inactivated 1 proteins, a component of the voltage-gated potassium channel complex (VGKC-complex) and precede the clinical presentation of limbic encephalitis.
    • The exact pathophysiology of FBDS is not known and whether they are seizures or movement disorder is still debated.
    • We suggest the fronto-temporo-basal ganglia network involving the medial frontal and temporal regions along with the corpus striatum and substantia nigra being responsible for the clinical phenomenon of FBDS.
    • The varied clinical, electrical and imaging features of FBDS in our cases and in the literature are best explained by involvement of this network.
    • Entrainment from any part of this network will result in similar clinical expression of FBDS, whereas other electro-clinical associations and duration depends on the extent of involvement of the network.

  6. Traumatic axonal injury: the prognostic value of lesion load in corpus callosum, brain stem, and thalamus in different magnetic resonance imaging sequences.

    PubMed

    Moen, Kent G; Brezova, Veronika; Skandsen, Toril; Håberg, Asta K; Folvik, Mari; Vik, Anne

    2014-09-01

    The aim of this study was to explore the prognostic value of visible traumatic axonal injury (TAI) loads in different MRI sequences from the early phase after adjusting for established prognostic factors. Likewise, we sought to explore the prognostic role of early apparent diffusion coefficient (ADC) values in normal-appearing corpus callosum. In this prospective study, 128 patients (mean age, 33.9 years; range, 11-69) with moderate (n = 64) and severe traumatic brain injury (TBI) were examined with MRI at a median of 8 days (range, 0-28) postinjury. TAI lesions in fluid-attenuated inversion recovery (FLAIR), diffusion-weighted imaging (DWI), and T2*-weighted gradient echo (T2*GRE) sequences were counted and FLAIR lesion volumes estimated. In patients and 47 healthy controls, mean ADC values were computed in 10 regions of interests in the normal-appearing corpus callosum. Outcome measure was the Glasgow Outcome Scale-Extended (GOS-E) at 12 months. In patients with severe TBI, number of DWI lesions and volume of FLAIR lesions in the corpus callosum, brain stem, and thalamus predicted outcome in analyses with adjustment for age, Glasgow Coma Scale score, and pupillary dilation (odds ratio, 1.3-6.9; p = <0.001-0.017). The addition of Rotterdam CT score and DWI lesions in the corpus callosum yielded the highest R2 (0.24), compared to all other MRI variables, including brain stem lesions. For patients with moderate TBI only the number of cortical contusions (p = 0.089) and Rotterdam CT score (p = 0.065) tended to predict outcome. Numbers of T2*GRE lesions did not affect outcome. Mean ADC values in the normal-appearing corpus callosum did not differ from controls. In conclusion, the loads of visible TAI lesions in the corpus callosum, brain stem, and thalamus in DWI and FLAIR were independent prognostic factors in patients with severe TBI. DWI lesions in the corpus callosum were the most important predictive MRI variable. Interestingly, number of cortical contusions in MRI and CT findings seemed more important for patients with moderate TBI.

  7. Data from clinical notes: a perspective on the tension between structure and flexible documentation

    PubMed Central

    Denny, Joshua C; Xu, Hua; Lorenzi, Nancy; Stead, William W; Johnson, Kevin B

    2011-01-01

    Clinical documentation is central to patient care. The success of electronic health record system adoption may depend on how well such systems support clinical documentation. A major goal of integrating clinical documentation into electronic health record systems is to generate reusable data. As a result, there has been an emphasis on deploying computer-based documentation systems that prioritize direct structured documentation. Research has demonstrated that healthcare providers value different factors when writing clinical notes, such as narrative expressivity, amenability to the existing workflow, and usability. The authors explore the tension between expressivity and structured clinical documentation, review methods for obtaining reusable data from clinical notes, and recommend that healthcare providers be able to choose how to document patient care based on workflow and note content needs. When reusable data are needed from notes, providers can use structured documentation or rely on post-hoc text processing to produce structured data, as appropriate. PMID:21233086

  8. [Medicine and astrology in Arnau's corpus].

    PubMed

    Giralt, Sebastià

    2006-01-01

    The role of astrology in Arnau de Vilanova's medical work is revisited with special attention to the problems of authorship posed by the astrological writings of Arnau's corpus and to their hypothetical chronology.

  9. Improved EEG Event Classification Using Differential Energy.

    PubMed

    Harati, A; Golmohammadi, M; Lopez, S; Obeid, I; Picone, J

    2015-12-01

    Feature extraction for automatic classification of EEG signals typically relies on time frequency representations of the signal. Techniques such as cepstral-based filter banks or wavelets are popular analysis techniques in many signal processing applications including EEG classification. In this paper, we present a comparison of a variety of approaches to estimating and postprocessing features. To further aid in discrimination of periodic signals from aperiodic signals, we add a differential energy term. We evaluate our approaches on the TUH EEG Corpus, which is the largest publicly available EEG corpus and an exceedingly challenging task due to the clinical nature of the data. We demonstrate that a variant of a standard filter bank-based approach, coupled with first and second derivatives, provides a substantial reduction in the overall error rate. The combination of differential energy and derivatives produces a 24 % absolute reduction in the error rate and improves our ability to discriminate between signal events and background noise. This relatively simple approach proves to be comparable to other popular feature extraction approaches such as wavelets, but is much more computationally efficient.
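
    The abstract names two feature additions, a differential energy term and first and second derivatives, without defining them. The sketch below is one plausible reading under stated assumptions: differential energy is taken as the maximum minus the minimum frame energy inside a sliding window, and derivatives are simple central differences. The window length and both definitions are assumptions for illustration, not the authors' specification.

```python
def differential_energy(energy, half_window=4):
    """Max minus min of frame energy in a window centered on each frame.

    Assumption: the abstract does not define the term; this range-based
    definition and the window length are illustrative choices.
    """
    out = []
    for i in range(len(energy)):
        lo = max(0, i - half_window)
        hi = min(len(energy), i + half_window + 1)
        window = energy[lo:hi]
        out.append(max(window) - min(window))
    return out


def delta(values):
    """First derivative by central differences, with edge frames repeated."""
    padded = [values[0]] + list(values) + [values[-1]]
    return [(padded[i + 2] - padded[i]) / 2.0 for i in range(len(values))]


energy = [1.0, 1.1, 0.9, 1.0, 6.0, 1.0, 1.1, 0.9, 1.0]  # one transient spike
d_energy = differential_energy(energy, half_window=1)
first = delta(energy)
second = delta(first)  # second derivative as the delta of the delta
print(d_energy, first, second)
```

    A range-based term like this is large around a transient and small over steady background, which is the kind of contrast the abstract describes between signal events and background noise.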

  10. Endoscopical determination of gastric mucosal blood flow by the crossed thermocouple method.

    PubMed

    Hiramatsu, A; Watanabe, T; Okuhira, M; Uchiyama, S; Mizuno, T; Sameshima, Y

    1984-06-01

    A crossed thermocouple method in combination with endoscopy was applied to determine the blood flow rate of the human gastric mucosa. Determination was carried out with 11 healthy control subjects at 8 sites of the stomach. The blood flow rates at all sites in the corpus were found to be higher than those at the antrum. In subjects less than 50 years old the blood flow rate in the corpus was higher than in older subjects. These results agreed well with those obtained by the hydrogen gas clearance method, which is widely adopted clinically. The crossed thermocouple method is easily applicable to all sites in the gastric mucosa and the time required for the assay is very short. This method does not require the inhalation of hydrogen gas, which is necessary for the hydrogen gas clearance method and which is possibly harmful to humans. Although the values obtained by the crossed thermocouple method are relative to the value at a certain fixed site, this method holds great potential for the determination of gastric mucosal blood flow rate.

  11. A Case of ADEM Mimicking Cerebral Adrenoleukodystrophy Based on Supratentorial MRI Findings

    PubMed Central

    BEYAZAL, Mehmet; ÜNAL, Özkan; YILMAZ, Sanem; BORA, Aydın

    2014-01-01

    A 9-year-old male admitted for syncope also complained of pain and numbness in his legs and frequent falls. He had a history of an upper respiratory tract infection 10 days earlier. On neurologic examination, paraparesis and a tendency to fall asleep were identified. On magnetic resonance imaging, symmetric signal increases were seen in the biparieto-occipital white matter extending to the corpus callosum on T2-weighted sequences, and cytotoxic edema was seen on diffusion-weighted images. Heterogeneous contrast enhancement was seen in these areas. In addition, at the C7-Th5 vertebral levels, the spinal cord showed diffusely increased signal intensity and contrast enhancement. Acute disseminated encephalomyelitis was considered on the basis of the clinical and radiological findings. Steroid therapy was started, and significant improvement was seen after treatment. At 2-year follow-up, there was no recurrence. In conclusion, it must be kept in mind that acute disseminated encephalomyelitis can rarely present with biparieto-occipital involvement that extends to the corpus callosum and can mimic adrenoleukodystrophy. In the differential diagnosis, butterfly glioma, tumefactive demyelinating lesions, and multiple sclerosis should be considered. PMID:28360602

  12. Watershed-based segmentation of the corpus callosum in diffusion MRI

    NASA Astrophysics Data System (ADS)

    Freitas, Pedro; Rittner, Leticia; Appenzeller, Simone; Lapa, Aline; Lotufo, Roberto

    2012-02-01

    The corpus callosum (CC) is one of the most important white matter structures of the brain, interconnecting the two cerebral hemispheres, and is related to several neurodegenerative diseases. Since segmentation is usually the first step for studies in this structure, and manual volumetric segmentation is a very time-consuming task, it is important to have a robust automatic method for CC segmentation. We propose here an approach for fully automatic 3D segmentation of the CC in the magnetic resonance diffusion tensor images. The method uses the watershed transform and is performed on the fractional anisotropy (FA) map weighted by the projection of the principal eigenvector in the left-right direction. The section of the CC in the midsagittal slice is used as seed for the volumetric segmentation. Experiments with real diffusion MRI data showed that the proposed method is able to quickly segment the CC without any user intervention, with great results when compared to manual segmentation. Since it is simple, fast and does not require parameter settings, the proposed method is well suited for clinical applications.

  13. Realization of Chinese word segmentation based on deep learning method

    NASA Astrophysics Data System (ADS)

    Wang, Xuefei; Wang, Mingjiang; Zhang, Qiquan

    2017-08-01

    In recent years, with its rapid development, deep learning has been widely used in the field of natural language processing. In this paper, deep learning is used to perform Chinese word segmentation on a large-scale corpus, eliminating the need to construct additional hand-crafted features. The first step is to process the corpus and use word2vec to obtain embeddings for it, with each character represented by 50 dimensions. The embedding features are then fed to a bidirectional LSTM, a linear layer is added on the hidden-layer output, and a CRF is added on top to obtain the model implemented in this paper. Experimental results show that the method achieves satisfactory accuracy on the 2014 People's Daily corpus.
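
    A character-level tagger of this kind emits one label per character, which then has to be turned back into words. The abstract does not name its tag set; the sketch below assumes the common B/M/E/S scheme (begin, middle, end, single), and the function name `tags_to_words` is hypothetical. It illustrates only the final decoding step, not the BiLSTM-CRF model itself.

```python
def tags_to_words(chars, tags):
    """Rebuild words from per-character B/M/E/S tags (assumed tag scheme)."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "S":            # single-character word
            if current:           # flush an unfinished word, if any
                words.append(current)
                current = ""
            words.append(ch)
        elif tag == "B":          # start of a multi-character word
            if current:
                words.append(current)
            current = ch
        elif tag == "M":          # middle of the current word
            current += ch
        elif tag == "E":          # end of the current word
            words.append(current + ch)
            current = ""
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    if current:                   # tolerate a sequence that ends mid-word
        words.append(current)
    return words


print(tags_to_words("我爱北京", ["S", "S", "B", "E"]))
```

    Malformed sequences (for example a B that is never closed by an E) are flushed as they stand rather than rejected, which is one possible design choice for noisy tagger output.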

  14. Age-Specific Dynamics of Corpus Callosum Development in Children and its Peculiarities in Infantile Cerebral Palsy.

    PubMed

    Krasnoshchekova, E I; Zykin, P A; Tkachenko, L A; Aleksandrov, T A; Sereda, V M; Yalfimov, A N

    2016-10-01

    The age dynamics of corpus callosum development was studied on magnetic resonance images of the brain in children aged 2-11 years without neurological abnormalities and with infantile cerebral palsy. The areas of the total corpus callosum and its segments are compared in the midsagittal images. Analysis is carried out with the use of an original formula: proportion of areas of the anterior (genu, CC2; and anterior part, CC3) and posterior (isthmus, CC6 and splenium, CC7) segments: kCC=(CC2+CC3)×CC6/CC7. The results characterize age-specific dynamics of the corpus callosum development and can be used for differentiation, with high confidence, of the brain of children without neurological abnormalities from the brain of patients with infantile cerebral palsy.
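
    The coefficient can be evaluated directly from the four segment areas. The sketch below follows the formula exactly as printed, kCC=(CC2+CC3)×CC6/CC7, evaluated left to right; the function name `k_cc` and the example areas are hypothetical and not taken from the study.

```python
def k_cc(cc2, cc3, cc6, cc7):
    """kCC = (CC2 + CC3) * CC6 / CC7, with areas measured on the midsagittal image."""
    if cc7 == 0:
        raise ValueError("splenium area CC7 must be non-zero")
    return (cc2 + cc3) * cc6 / cc7


# Illustrative areas only (arbitrary units, not study data).
print(k_cc(cc2=100.0, cc3=80.0, cc6=60.0, cc7=120.0))
```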

  15. Trends in cancer incidence in female breast, cervix uteri, corpus uteri, and ovary in India.

    PubMed

    Yeole, Balkrishna B

    2008-01-01

    Trends in breast, cervix uteri, corpus uteri and ovarian cancers in six population-based cancer registries (Mumbai, Bangalore, Chennai, Delhi, Bhopal, and Barshi) were evaluated over the last two decades. To study trends, we fitted the model Y = AB^x to the data; its logarithm is a linear regression model. This approach showed a decreasing trend for cancer of the cervix and increasing trends for cancers of breast, ovary and corpus uteri throughout the entire period of observation in most of the registries. The four cancers, breast, cervix, corpus uteri and ovary, constitute more than 50% of total cancers in women. As all these cancers are increasing, to understand their etiology in depth, analytic epidemiology studies should be planned in the near future on a priority basis.
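
    Reading the trend model as Y = A·B^x (an assumption about the flattened formula in the abstract), taking logarithms gives log Y = log A + x·log B, which can be fitted by ordinary least squares. The sketch below shows that fit in plain Python; the function name and the sample rates are hypothetical, not registry data.

```python
import math


def fit_exponential_trend(xs, ys):
    """Fit Y = A * B**x by least squares on log(Y); returns (A, B)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    logs = [math.log(y) for y in ys]          # requires y > 0
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs)) / sxx
    intercept = mean_log - slope * mean_x
    return math.exp(intercept), math.exp(slope)


# Hypothetical incidence rates growing 3% per year from a base of 20.
years = [0, 1, 2, 3, 4]
rates = [20 * 1.03 ** t for t in years]
a, b = fit_exponential_trend(years, rates)
print(round(a, 3), round(b, 3))
```

    Under this reading, B > 1 corresponds to an increasing trend and B < 1 to a decreasing one.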

  16. The proximal gastric corpus is the most responsive site of motilin-induced contractions in the stomach of the Asian house shrew.

    PubMed

    Dudani, Amrita; Aizawa, Sayaka; Zhi, Gong; Tanaka, Toru; Jogahara, Takamichi; Sakata, Ichiro; Sakai, Takafumi

    2016-07-01

    The migrating motor complex (MMC) is responsible for emptying the stomach during the interdigestive period, in preparation for the next meal. It is known that gastric phase III of MMC starts from the proximal stomach and propagates the contraction downwards. We hypothesized that a certain region of the stomach must be more responsive to motilin than others, and that motilin-induced strong gastric contractions propagate from that site. Stomachs of the Suncus or Asian house shrew, a small insectivorous mammal, were dissected and the fundus, proximal corpus, distal corpus, and antrum were examined to study the effect of motilin using an organ bath experiment. Motilin-induced contractions differed in different parts of the stomach. Only the proximal corpus induced gastric contraction even at motilin 10(-10) M, and strong contraction was induced by motilin 10(-9) M in all parts of the stomach. The GPR38 mRNA expression was also higher in the proximal corpus than in the other sections, and the lowest expression was observed in the antrum. GPR38 mRNA expression varied with low expression in the mucosal layer and high expression in the muscle layer. Additionally, motilin-induced contractions in each dissected part of the stomach were inhibited by tetrodotoxin and atropine pretreatment. These results suggest that motilin reactivity is not consistent throughout the stomach, and an area of the proximal corpus including the cardia is the most sensitive to motilin.

  17. Jointly learning word embeddings using a corpus and a knowledge base

    PubMed Central

    Bollegala, Danushka; Maehara, Takanori; Kawarabayashi, Ken-ichi

    2018-01-01

    Methods for representing the meaning of words in vector spaces purely using the information distributed in text corpora have proved to be very valuable in various text mining and natural language processing (NLP) tasks. However, these methods still disregard the valuable semantic relational structure between words in co-occurring contexts. These beneficial semantic relational structures are contained in manually-created knowledge bases (KBs) such as ontologies and semantic lexicons, where the meanings of words are represented by defining the various relationships that exist among those words. We combine the knowledge in both a corpus and a KB to learn better word embeddings. Specifically, we propose a joint word representation learning method that uses the knowledge in the KBs, and simultaneously predicts the co-occurrences of two words in a corpus context. In particular, we use the corpus to define our objective function subject to the relational constraints derived from the KB. We further utilise the corpus co-occurrence statistics to propose two novel approaches, Nearest Neighbour Expansion (NNE) and Hedged Nearest Neighbour Expansion (HNE), that dynamically expand the KB and therefore derive more constraints that guide the optimisation process. Our experimental results over a wide range of benchmark tasks demonstrate that the proposed method statistically significantly improves the accuracy of the word embeddings learnt. It outperforms a corpus-only baseline and reports an improvement over a number of previously proposed methods that incorporate corpora and KBs in both semantic similarity prediction and word analogy detection tasks. PMID:29529052

  18. Computing symmetrical strength of N-grams: a two pass filtering approach in automatic classification of text documents.

    PubMed

    Agnihotri, Deepak; Verma, Kesari; Tripathi, Priyanka

    2016-01-01

    The contiguous sequences of the terms (N-grams) in the documents are symmetrically distributed among different classes. The symmetrical distribution of the N-grams raises uncertainty about which class an N-gram belongs to. In this paper, we focus on selecting the most discriminating N-grams by reducing the effects of symmetrical distribution. In this context, a new text feature selection method, the symmetrical strength of the N-grams (SSNG), is proposed using a two-pass filtering based feature selection (TPF) approach. In the first pass of the TPF, the SSNG method chooses various informative N-grams from all the N-grams extracted from the corpus. Subsequently, in the second pass, the well-known Chi Square (χ²) method is used to select the few most informative N-grams. Further, to classify the documents, two standard classifiers, Multinomial Naive Bayes and Linear Support Vector Machine, were applied to ten standard text data sets. On most of the data sets, the experimental results show that the performance and success rate of the SSNG method using the TPF approach is superior to state-of-the-art methods, viz. Mutual Information, Information Gain, Odds Ratio, Discriminating Feature Selection and χ².
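
    The second pass relies on the standard chi-square statistic, which can be computed for one N-gram and one class from a 2×2 document-count table. The sketch below covers only that step for bigrams; the SSNG first pass is not specified in the abstract and is not reproduced here. The helper names and the toy documents are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square for a 2x2 table.

    a: documents in the class that contain the n-gram
    b: documents outside the class that contain it
    c: documents in the class that lack it
    d: documents outside the class that lack it
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:                 # a margin is empty: no evidence either way
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def bigrams(tokens):
    """Set of contiguous two-token sequences in one document."""
    return {" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)}


def score_ngram(docs, ngram, target):
    """Chi-square of `ngram` for class `target` over (tokens, label) pairs."""
    a = b = c = d = 0
    for tokens, label in docs:
        present = ngram in bigrams(tokens)
        in_class = label == target
        if present and in_class:
            a += 1
        elif present:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    return chi_square(a, b, c, d)


# Toy corpus: "chest pain" separates the classes, "pain today" does not.
docs = [
    (["chest", "pain", "today"], "cardio"),
    (["severe", "chest", "pain"], "cardio"),
    (["knee", "pain", "today"], "ortho"),
    (["severe", "knee", "pain"], "ortho"),
]
print(score_ngram(docs, "chest pain", "cardio"))
print(score_ngram(docs, "pain today", "cardio"))
```

    An N-gram spread evenly over the classes scores zero, which is the kind of symmetrically distributed feature the abstract aims to filter out.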

  19. Measurement of negativity bias in personal narratives using corpus-based emotion dictionaries.

    PubMed

    Cohen, Shuki J

    2011-04-01

    This study presents a novel methodology for the measurement of negativity bias using positive and negative dictionaries of emotion words applied to autobiographical narratives. At odds with the cognitive theory of mood dysregulation, previous text-analytical studies have failed to find significant correlation between emotion dictionaries and negative affectivity or dysphoria. In the present study, an a priori list dictionary of emotion words was refined based on the actual use of these words in personal narratives collected from close to 500 college students. Half of the corpus was used to construct, via concordance analysis, the grammatical structures associated with the words in their emotional sense. The second half of the corpus served as a validation corpus. The resulting dictionary ignores words that are not used in their intended emotional sense, including negated emotions, homophones, frozen idioms etc. Correlations of the resulting corpus-based negative and positive emotion dictionaries with self-report measures of negative affectivity were in the expected direction, and were statistically significant, with medium effect size. The potential use of these dictionaries as implicit measures of negativity bias and in the analysis of psychotherapy transcripts is discussed.

  20. Ancient voices on tinnitus: the pathology and treatment of tinnitus in Celsus and the Hippocratic Corpus compared and contrasted.

    PubMed

    Maltby, Maryanne Tate

    2012-01-01

    The object of the paper is to analyse the treatment of tinnitus in two ancient works, Celsus De Medicina and the Greek Hippocratic Corpus. Whilst reviews of historical references to tinnitus have identified this material, this is the first detailed treatment of the subject in these authors. The paper considers the material relating to tinnitus and suggested treatments in the Roman medical writer Celsus (mid first century AD) in contrast with those found in the Greek Hippocratic Corpus (late fifth, early fourth century BC). The lifestyle change, diet and pharmacological treatments suggested by Celsus are analysed and shown as likely to be effective. Celsus is shown to be remarkably modern in his understanding of the aetiology of the disease and his suggested dietary and pharmacological treatments appear to be soundly based. Celsus' pharmacological approach differs from the more theoretical stance of the Hippocratic Corpus based on humoural theory. The Hippocratic Corpus is more detailed in its descriptions of otological pathology and more concerned with a humoural explanation of the disease, but offers useful advice on diet and regimen and also provides the first detailed description of what appears to be Ménière's Syndrome.

  1. An annotated corpus with nanomedicine and pharmacokinetic parameters

    PubMed Central

    Lewinski, Nastassja A; Jimenez, Ivan; McInnes, Bridget T

    2017-01-01

    A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration’s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided. PMID:29066897

  2. Genetics Home Reference: Andermann syndrome

    MedlinePlus

    ... callosum with neuronopathy agenesis of corpus callosum with peripheral neuropathy agenesis of corpus callosum with polyneuropathy Charlevoix disease ... Organization for Rare Disorders (NORD) The Foundation for Peripheral Neuropathy GeneReviews (1 link) Hereditary Motor and Sensory Neuropathy ...

  3. 76 FR 1173 - Draft Guidance for Industry on Electronic Source Documentation in Clinical Investigations...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-07

    ...] Draft Guidance for Industry on Electronic Source Documentation in Clinical Investigations; Availability... Documentation in Clinical Investigations.'' This document provides guidance to sponsors, contract research organizations (CROs), data management centers, and clinical investigators on capturing, using, and archiving...

  4. Knowledge-rich temporal relation identification and classification in clinical notes

    PubMed Central

    D’Souza, Jennifer; Ng, Vincent

    2014-01-01

    Motivation: We examine the task of temporal relation classification for the clinical domain. Our approach to this task departs from existing ones in that it is (i) ‘knowledge-rich’, employing sophisticated knowledge derived from discourse relations as well as both domain-independent and domain-dependent semantic relations, and (ii) ‘hybrid’, combining the strengths of rule-based and learning-based approaches. Evaluation results on the i2b2 Clinical Temporal Relations Challenge corpus show that our approach yields a 17–24% and 8–14% relative reduction in error over a state-of-the-art learning-based baseline system when gold-standard and automatically identified temporal relations are used, respectively. Database URL: http://www.hlt.utdallas.edu/~jld082000/temporal-relations/ PMID:25414383

  5. Recurrent hemorrhage from corpus luteum during anticoagulant therapy.

    PubMed Central

    Wong, K. P.; Gillett, P. G.

    1977-01-01

    A 43-year old woman had recurrent massive intraperitoneal hemorrhage from rupture of a hemorrhagic corpus luteum in two successive menstrual cycles while receiving anticoagulant therapy. Left oophorectomy was performed on the first occasion and right salpingo-oophorectomy with left salpingectomy on the second. While the precise incidence cannot be determined, rupture from a hemorrhagic corpus luteum appears to be a rare but potentially catastrophic complication of anticoagulant therapy. Hence possible ovarian hemorrhage should be considered in women of reproductive age receiving heparin or sodium warfarin therapy. PMID:844024

  6. MRI and MR spectroscopy findings of a case of subacute sclerosing panencephalitis affecting the corpus callosum.

    PubMed

    Öztürk, Mehmet; Sığırcı, Ahmet; Yakıncı, Cengiz

    2015-07-10

    Subacute sclerosing panencephalitis (SSPE) is a rare, slowly progressive, fatal, inflammatory and neurodegenerative disease that is seen mostly in children and young adolescents, and primarily affects the parieto-occipital lobes. The corpus callosum, cerebellum and basal ganglia are less frequently involved. MR spectroscopy (MRS) may illustrate the pathophysiological features of SSPE. To the best of our knowledge, this is the second report of MRS findings of corpus callosum involvement in a stage 3 SSPE case. 2015 BMJ Publishing Group Ltd.

  7. UTOPIAN: user-driven topic modeling based on interactive nonnegative matrix factorization.

    PubMed

    Choo, Jaegul; Lee, Changhyun; Reddy, Chandan K; Park, Haesun

    2013-12-01

    Topic modeling has been widely used for analyzing text document collections. Recently, there have been significant advancements in various topic modeling techniques, particularly in the form of probabilistic graphical modeling. State-of-the-art techniques such as Latent Dirichlet Allocation (LDA) have been successfully applied in visual text analytics. However, most of the widely-used methods based on probabilistic modeling have drawbacks in terms of consistency from multiple runs and empirical convergence. Furthermore, due to the complexity of its formulation and algorithm, LDA cannot easily incorporate various types of user feedback. To tackle this problem, we propose a reliable and flexible visual analytics system for topic modeling called UTOPIAN (User-driven Topic modeling based on Interactive Nonnegative Matrix Factorization). Centered around its semi-supervised formulation, UTOPIAN enables users to interact with the topic modeling method and steer the result in a user-driven manner. We demonstrate the capability of UTOPIAN via several usage scenarios with real-world document corpora such as the InfoVis/VAST paper data set and product review data sets.

  8. Evaluation of a simple method for the automatic assignment of MeSH descriptors to health resources in a French online catalogue.

    PubMed

    Névéol, Aurélie; Pereira, Suzanne; Kerdelhué, Gaetan; Dahamna, Badisse; Joubert, Michel; Darmoni, Stéfan J

    2007-01-01

    The growing number of resources to be indexed in the catalogue of online health resources in French (CISMeF) calls for curating strategies involving automatic indexing tools while maintaining the catalogue's high indexing quality standards. The objective was to develop a simple automatic tool that retrieves MeSH descriptors from document titles. In parallel to research on advanced indexing methods, a bag-of-words tool was developed for timely inclusion in CISMeF's maintenance system. An evaluation was carried out on a corpus of 99 documents. The indexing sets retrieved by the automatic tool were compared to manual indexing based on the title and on the full text of resources. 58% of the major main headings were retrieved by the bag-of-words algorithm and the precision on main heading retrieval was 69%. Bag-of-words indexing has effectively been used on selected resources to be included in CISMeF since August 2006. Meanwhile, ongoing work aims at improving the current version of the tool.
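
    A bag-of-words indexer of this general kind, together with the precision and recall figures used to evaluate it, can be sketched in a few lines. This is a generic illustration, not the CISMeF tool: the matching rule (every word of a descriptor appears in the title), the function names, and the tiny descriptor list are all assumptions.

```python
import re


def tokenize(text):
    """Lower-cased set of word tokens (Unicode-aware, so accents survive)."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve_descriptors(title, descriptors):
    """Descriptors whose words all occur in the title, in any order."""
    words = tokenize(title)
    return {d for d in descriptors if tokenize(d) <= words}


def precision_recall(retrieved, reference):
    """Precision and recall of retrieved descriptors against manual indexing."""
    hits = len(retrieved & reference)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical descriptor list and manual indexing for one title.
descriptors = {"asthma", "child", "diabetes mellitus"}
retrieved = retrieve_descriptors("Asthma management in the child", descriptors)
manual = {"asthma", "child", "adolescent"}
print(sorted(retrieved), precision_recall(retrieved, manual))
```

    Ignoring word order keeps the matcher simple, at the cost of missing inflected or synonymous forms that a fuller indexing method would need to handle.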

  9. Estrogen receptor ESR1 mediates activation of ERK1/2, CREB, and ELK1 in the corpus of the epididymis.

    PubMed

    Cavalcanti, Fernanda N; Lucas, Thais F G; Lazari, Maria Fatima M; Porto, Catarina S

    2015-06-01

    Expression of the estrogen receptor ESR1 is higher in the corpus than it is in the initial segment/caput and cauda of the epididymis. ESR1 immunostaining in the corpus has been localized not only in the nuclei but also in the cytoplasm and apical membrane, which indicates that ESR1 plays a role in membrane-initiated signaling. The present study investigated whether ESR1 mediates the activation of rapid signaling pathways by estradiol (E2) in the epididymis. We investigated the effect of E2 and the ESR1-selective agonist 4,4',4''-(4-propyl-(1H)-pyrazole-1,3,5-triyl)trisphenol (PPT) on the activation of extracellular signal-regulated protein kinases (ERK1/2), CREB protein, and ETS oncogene-related protein (ELK1). Treatment with PPT did not affect ERK1/2 phosphorylation in the cauda, but it rapidly increased ERK1/2 phosphorylation in the initial segment/caput and corpus of the epididymis. PPT also activated CREB and ELK1 in the corpus of the epididymis. The PPT-induced phosphorylation of ERK1/2, CREB, and ELK1 was blocked by the ESR1-selective antagonist MPP and by pretreatment with a non-receptor tyrosine kinase SRC inhibitor, an EGFR kinase inhibitor, an MEK1/2 inhibitor, and a phosphatidylinositol-3-kinase inhibitor. In conclusion, these results indicate that the corpus, which is a region with high expression of the estrogen receptor ESR1, is a major target in the epididymis for the activation of rapid signaling by E2. The sequence of events that follow E2 interaction with ESR1 includes the SRC-mediated transactivation of EGFR and the phosphorylation of ERK1/2, CREB, and ELK1. This rapid estrogen signaling may modulate gene expression in the corpus of the epididymis, and it may play a role in the dynamic microenvironment of the epididymal lumen. © 2015 Society for Endocrinology.

  10. A method for named entity normalization in biomedical articles: application to diseases and plants.

    PubMed

    Cho, Hyejin; Choi, Wonjun; Lee, Hyunju

    2017-10-13

    In biomedical articles, a named entity recognition (NER) technique that identifies entity names from texts is an important element for extracting biological knowledge from articles. After NER is applied to articles, the next step is to normalize the identified names into standard concepts (i.e., disease names are mapped to the National Library of Medicine's Medical Subject Headings disease terms). In biomedical articles, many entity normalization methods rely on domain-specific dictionaries for resolving synonyms and abbreviations. However, the dictionaries are not comprehensive except for some entities such as genes. In recent years, biomedical articles have accumulated rapidly, and neural network-based algorithms that incorporate a large amount of unlabeled data have shown considerable success in several natural language processing problems. In this study, we propose an approach for normalizing biological entities, such as disease names and plant names, by using word embeddings to represent semantic spaces. For diseases, training data from the National Center for Biotechnology Information (NCBI) disease corpus and unlabeled data from PubMed abstracts were used to construct word representations. For plants, a training corpus that we manually constructed and unlabeled PubMed abstracts were used to represent word vectors. We showed that the proposed approach performed better than the use of only the training corpus or only the unlabeled data and showed that the normalization accuracy was improved by using our model even when the dictionaries were not comprehensive. We obtained F-scores of 0.808 and 0.690 for normalizing the NCBI disease corpus and manually constructed plant corpus, respectively. We further evaluated our approach using a data set in the disease normalization task of the BioCreative V challenge. When only the disease corpus was used as a dictionary, our approach significantly outperformed the best system of the task. 
    The proposed approach shows robust performance for normalizing biological entities. The manually constructed plant corpus and the proposed model are available at http://gcancer.org/plant and http://gcancer.org/normalization , respectively.

  11. Medial posterior choroidal artery territory infarction associated with tumor removal in the pineal/tectum/thalamus region through the occipital transtentorial approach.

    PubMed

    Saito, Ryuta; Kumabe, Toshihiro; Kanamori, Masayuki; Sonoda, Yukihiko; Mugikura, Shunji; Takahashi, Shoki; Tominaga, Teiji

    2013-08-01

    Damage to the deep venous system, occipital lobe, and/or corpus callosum is well known to cause complications associated with the occipital transtentorial approach (OTA), but ischemic complications are not well documented. The authors investigated the high incidences of ischemic complications associated with removal of pineal/tectal/thalamic tumors through the OTA. Clinical records of 29 patients who underwent 31 surgeries using the OTA from December 2001 to May 2011 were retrospectively studied. Tumor locations were the pineal/tectal/thalamic region for 19, cerebellum for 7, and medial temporal lobe for 3. Postoperative diffusion-weighted magnetic resonance images obtained within 72 h after surgery detected infarction in the tectal/splenial/thalamic region, presumably representing the medial posterior choroidal artery (MPChA) territory, in 10 patients. All these patients had tumor in the pineal/tectal/thalamic region. Deteriorated or newly developed eye symptoms including vertical gaze palsy tended to persist in these patients compared to those without ischemic complications. A relatively high incidence of MPChA territory infarction was associated with removal of tumors in the pineal/tectal/thalamic region through the OTA. Eye symptoms often occurred post-surgery and tended to persist in these patients. Neurosurgeons must be aware of the possibility of MPChA territory infarction to further increase the safety of the OTA. Copyright © 2012 Elsevier B.V. All rights reserved.

  12. Palliative Care in Improving Quality of Life in Patients With High Risk Primary or Recurrent Gynecologic Malignancies

    ClinicalTrials.gov

    2015-10-15

    Cervical Carcinoma; Ovarian Carcinoma; Primary Peritoneal Carcinoma; Recurrent Cervical Carcinoma; Recurrent Ovarian Carcinoma; Recurrent Uterine Corpus Carcinoma; Recurrent Vulvar Carcinoma; Uterine Corpus Cancer; Vulvar Carcinoma; Peritoneal Neoplasms

  13. EFFECTS OF CORPUS CHRISTI BAY SEDIMENTS ON SURVIVAL, GROWTH AND REPRODUCTION OF THE MYSID, MYSIDOPSIS BAHIA

    EPA Science Inventory

    The study described here examined effects on mortality, growth, reproduction, and behavior of Americamysis bahia exposed under extended static conditions to bedded sediments from Corpus Christi Bay.

  14. Nintedanib in Treating Patients With Recurrent or Persistent Endometrial Cancer

    ClinicalTrials.gov

    2017-09-08

    Endometrial Adenocarcinoma; Endometrial Clear Cell Adenocarcinoma; Endometrial Mucinous Adenocarcinoma; Endometrial Serous Adenocarcinoma; Endometrial Squamous Cell Carcinoma; Endometrial Transitional Cell Carcinoma; Endometrial Undifferentiated Carcinoma; Malignant Uterine Corpus Mixed Epithelial and Mesenchymal Neoplasm; Recurrent Uterine Corpus Carcinoma

  15. Gemcitabine Hydrochloride and Docetaxel With or Without Bevacizumab in Treating Patients With Advanced or Recurrent Uterine Leiomyosarcoma

    ClinicalTrials.gov

    2017-07-13

    Recurrent Uterine Corpus Sarcoma; Stage IIIA Uterine Sarcoma; Stage IIIB Uterine Sarcoma; Stage IIIC Uterine Sarcoma; Stage IVA Uterine Sarcoma; Stage IVB Uterine Sarcoma; Uterine Corpus Leiomyosarcoma

  16. 26 CFR 25.2518-3 - Disclaimer of less than an entire interest.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ..., support and happiness and a testamentary power of appointment over the corpus. In the absence of the... power to invade corpus for H's health, maintenance, support and happiness. Because H retained the...

  17. 26 CFR 25.2518-3 - Disclaimer of less than an entire interest.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ..., support and happiness and a testamentary power of appointment over the corpus. In the absence of the... power to invade corpus for H's health, maintenance, support and happiness. Because H retained the...

  18. 26 CFR 25.2518-3 - Disclaimer of less than an entire interest.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ..., support and happiness and a testamentary power of appointment over the corpus. In the absence of the... power to invade corpus for H's health, maintenance, support and happiness. Because H retained the...

  19. 26 CFR 25.2518-3 - Disclaimer of less than an entire interest.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ..., support and happiness and a testamentary power of appointment over the corpus. In the absence of the... power to invade corpus for H's health, maintenance, support and happiness. Because H retained the...

  20. 26 CFR 25.2518-3 - Disclaimer of less than an entire interest.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ..., support and happiness and a testamentary power of appointment over the corpus. In the absence of the... power to invade corpus for H's health, maintenance, support and happiness. Because H retained the...
