Science.gov

Sample records for coding sequence incompleteness

  1. Cellulases and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2001-02-20

    The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.

  2. Cellulases and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2001-01-01

    The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.

  3. Lichenase and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2000-08-15

    The present invention provides a fungal lichenase, i.e., an endo-1,3-1,4-β-D-glucanohydrolase, its coding sequence, recombinant DNA molecules comprising the lichenase coding sequences, recombinant host cells and methods for producing same. The present lichenase is from Orpinomyces PC-2.

  4. Physics and numerics of the tensor code (incomplete preliminary documentation)

    SciTech Connect

    Burton, D.E.; Lettis, L.A. Jr.; Bryan, J.B.; Frary, N.R.

    1982-07-15

    The present TENSOR code is a descendant of a code originally conceived by Maenchen and Sack and later adapted by Cherry. Originally, the code was a two-dimensional Lagrangian explicit finite difference code which solved the equations of continuum mechanics. Since then, implicit and arbitrary Lagrange-Euler (ALE) algorithms have been added. The code has been used principally to solve problems involving the propagation of stress waves through earth materials, and considerable development of rock and soil constitutive relations has been done. The code has been applied extensively to the containment of underground nuclear tests, nuclear and high explosive surface and subsurface cratering, and energy and resource recovery. TENSOR is supported by a substantial array of ancillary routines. The initial conditions are set up by a generator code, TENGEN. ZON is a multipurpose code which can be used for zoning, rezoning, overlaying, and linking from other codes. Linking from some codes is facilitated by another code, RADTEN. TENPLT is a fixed-time graphics code which provides a wide variety of plotting options and output devices, and which is capable of producing computer movies by postprocessing problem dumps. Time history graphics are provided by the TIMPLT code from temporal dumps produced during production runs. While TENSOR can be run as a stand-alone controllee, a special controller code, TCON, is available to better interface the code with the LLNL computer system during production jobs. In order to standardize compilation procedures and provide quality control, a special compiler code, BC, is used. A number of equation-of-state generators are available, among them ROC and PMUGEN.

  5. Numerical classification of coding sequences

    NASA Technical Reports Server (NTRS)

    Collins, D. W.; Liu, C. C.; Jukes, T. H.

    1992-01-01

    DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
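
The two descriptors named in this abstract, a base count such as A76C158G121T74 and a codon table such as (AAA)0(AAC)1 ... (TTT)0, can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' software: the function names and the toy reading frame are hypothetical, and the codon ordering (alphabetical over A, C, G, T) is an assumption inferred from the example in the abstract.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"  # assumed ordering, matching the A..C..G..T example in the abstract


def base_count_descriptor(seq):
    """Return a base-count string such as 'A7C0G3T5' for a DNA sequence."""
    counts = Counter(seq.upper())
    return "".join(f"{base}{counts.get(base, 0)}" for base in BASES)


def codon_table(seq):
    """Count all 64 codons in frame 0, including codons that never occur."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    all_codons = ("".join(c) for c in product(BASES, repeat=3))
    return {codon: counts.get(codon, 0) for codon in all_codons}


def codon_descriptor(seq):
    """Render the codon table as '(AAA)0(AAC)0(AAG)2...(TTT)1'."""
    table = codon_table(seq)
    return "".join(f"({codon}){n}" for codon, n in sorted(table.items()))


orf = "ATGAAGAAGTTTTAA"  # hypothetical toy reading frame
print(base_count_descriptor(orf))  # -> A7C0G3T5
```

Because both descriptors are plain strings, two database entries can be compared by string equality before any sequence alignment is attempted, which is the cross-referencing use the abstract proposes.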

  6. HIFI: a computer code for projectile fragmentation accompanied by incomplete fusion

    SciTech Connect

    Wu, J.R.

    1980-07-01

    A brief summary of a model proposed to describe projectile fragmentation accompanied by incomplete fusion and the instructions for the use of the computer code HIFI are given. The code HIFI calculates single inclusive spectra, coincident spectra and excitation functions resulting from particle-induced reactions. It is a multipurpose program which can calculate any type of coincident spectra as long as the reaction is assumed to take place in two steps.

  7. SOME CODES WHICH ARE INVARIANT UNDER A DOUBLY-TRANSITIVE PERMUTATION GROUP AND THEIR CONNECTION WITH BALANCED INCOMPLETE BLOCK DESIGNS

    DTIC Science & Technology

    If a binary code is invariant under a doubly-transitive permutation group, then the set of all code words of weight j forms a balanced incomplete...codes are properly arranged, and if the first digit is omitted, then all Reed-Muller codes are cyclic.

  8. Nonspatial Sequence Coding in CA1 Neurons

    PubMed Central

    Allen, Timothy A.; Salz, Daniel M.; McKenzie, Sam

    2016-01-01

    The hippocampus is critical to the memory for sequences of events, a defining feature of episodic memory. However, the fundamental neuronal mechanisms underlying this capacity remain elusive. While considerable research indicates hippocampal neurons can represent sequences of locations, direct evidence of coding for the memory of sequential relationships among nonspatial events remains lacking. To address this important issue, we recorded neural activity in CA1 as rats performed a hippocampus-dependent sequence-memory task. Briefly, the task involves the presentation of repeated sequences of odors at a single port and requires rats to identify each item as “in sequence” or “out of sequence”. We report that, while the animals' location and behavior remained constant, hippocampal activity differed depending on the temporal context of items—in this case, whether they were presented in or out of sequence. Some neurons showed this effect across items or sequence positions (general sequence cells), while others exhibited selectivity for specific conjunctions of item and sequence position information (conjunctive sequence cells) or for specific probe types (probe-specific sequence cells). We also found that the temporal context of individual trials could be accurately decoded from the activity of neuronal ensembles, that sequence coding at the single-cell and ensemble level was linked to sequence memory performance, and that slow-gamma oscillations (20–40 Hz) were more strongly modulated by temporal context and performance than theta oscillations (4–12 Hz). These findings provide compelling evidence that sequence coding extends beyond the domain of spatial trajectories and is thus a fundamental function of the hippocampus. SIGNIFICANCE STATEMENT The ability to remember the order of life events depends on the hippocampus, but the underlying neural mechanisms remain poorly understood. Here we addressed this issue by recording neural activity in hippocampal

  9. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    SciTech Connect

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in

  10. High-quality draft genome sequence of the Thermus amyloliquefaciens type strain YIM 77409(T) with an incomplete denitrification pathway.

    PubMed

    Zhou, En-Min; Murugapiran, Senthil K; Mefferd, Chrisabelle C; Liu, Lan; Xian, Wen-Dong; Yin, Yi-Rui; Ming, Hong; Yu, Tian-Tian; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T B K; Ngan, Chew Yee; Daum, Chris; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Spunde, Alexander; Kyrpides, Nikos; Woyke, Tanja; Li, Wen-Jun; Hedlund, Brian P

    2016-01-01

    Thermus amyloliquefaciens type strain YIM 77409(T) is a thermophilic, Gram-negative, non-motile and rod-shaped bacterium isolated from Niujie Hot Spring in Eryuan County, Yunnan Province, southwest China. In the present study we describe the features of strain YIM 77409(T) together with its genome sequence and annotation. The genome is 2,160,855 bp long and consists of 6 scaffolds with 67.4 % average GC content. A total of 2,313 genes were predicted, comprising 2,257 protein-coding and 56 RNA genes. The genome is predicted to encode a complete glycolysis, pentose phosphate pathway, and tricarboxylic acid cycle. Additionally, a large number of transporters and enzymes for heterotrophy highlight the broad heterotrophic lifestyle of this organism. A denitrification gene cluster included genes predicted to encode enzymes for the sequential reduction of nitrate to nitrous oxide, consistent with the incomplete denitrification phenotype of this strain.

  11. High compression image and image sequence coding

    NASA Technical Reports Server (NTRS)

    Kunt, Murat

    1989-01-01

    The digital representation of an image requires a very large number of bits. This number is even larger for an image sequence. The goal of image coding is to reduce this number, as much as possible, and reconstruct a faithful duplicate of the original picture or image sequence. Early efforts in image coding, solely guided by information theory, led to a plethora of methods. The compression ratio reached a plateau around 10:1 a couple of years ago. Recent progress in the study of the brain mechanism of vision and scene analysis has opened new vistas in picture coding. Directional sensitivity of the neurones in the visual pathway combined with the separate processing of contours and textures has led to a new class of coding methods capable of achieving compression ratios as high as 100:1 for images and around 300:1 for image sequences. Recent progress on some of the main avenues of object-based methods is presented. These second generation techniques make use of contour-texture modeling, new results in neurophysiology and psychophysics and scene analysis.

  12. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    DTIC Science & Technology

    2008-07-01

    ...coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci sequences. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization. Dates covered: 6 Jul 08 – 11 Jul 08.

  13. Program generator for the Incomplete Cholesky Conjugate Gradient (ICCG) method with a symmetrizing preprocessor. [GENIC code package]

    SciTech Connect

    Kuo-Petravic, G.; Petravic, M.

    1980-03-01

    This paper is an extension of the previous paper, A Program Generator for the Incomplete LU-Decomposition-Conjugate Gradient (ILUCG) Method, which appeared in Computer Physics Communications. In that paper a generator program was presented which produced a code package to solve the system of equations $A\mathbf{x} = \mathbf{b}$, where $A$ is an arbitrary nonsingular matrix, by the ILUCG method. In the present paper an alternative generator program is offered which produces a code package applicable to the case where $A$ is symmetric and positive definite. The numerical algorithm used is the Incomplete Cholesky Conjugate Gradient (ICCG) method of Meijerink and Van der Vorst, which executes approximately twice as fast per iteration as the ILUCG method. In addition, an optional preprocessor is provided to treat the case of a not diagonally dominant nonsymmetric and nonsingular matrix $A$ by solving the equation $A^{T}A\mathbf{x} = A^{T}\mathbf{b}$.
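
The symmetrizing step described in this abstract can be illustrated with a small Python sketch: a nonsymmetric, nonsingular system is turned into the symmetric positive definite normal equations (A transpose times A) x = (A transpose) b, which conjugate gradients can then solve. This is a sketch under stated assumptions, not the GENIC package: the function names are hypothetical, the matrix is a toy example, and the incomplete Cholesky preconditioner that gives ICCG its name is deliberately omitted, so what is shown is plain conjugate gradients.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(apply_M, rhs, tol=1e-12, max_iter=1000):
    """Solve M x = rhs, where apply_M(v) returns M v and M is symmetric positive definite."""
    x = [0.0] * len(rhs)
    r = list(rhs)          # residual for the zero initial guess
    p = list(r)            # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Mp = apply_M(p)
        alpha = rr / dot(p, Mp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * mi for ri, mi in zip(r, Mp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def solve_via_normal_equations(A, b):
    """Symmetrize A x = b as (A^T A) x = A^T b, then apply conjugate gradients."""
    At = transpose(A)
    return conjugate_gradient(lambda v: matvec(At, matvec(A, v)), matvec(At, b))


A = [[2.0, 1.0], [0.0, 3.0]]  # toy nonsymmetric, nonsingular matrix (assumption)
b = [5.0, 6.0]
x = solve_via_normal_equations(A, b)
print(x)  # close to [1.5, 2.0]
```

Forming the normal equations squares the condition number of the system, which is one reason a preconditioner such as incomplete Cholesky is normally paired with this approach.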

  14. Efficient Quantum Private Communication Based on Dynamic Control Code Sequence

    NASA Astrophysics Data System (ADS)

    Cao, Zheng-Wen; Feng, Xiao-Yi; Peng, Jin-Ye; Zeng, Gui-Hua; Qi, Jin

    2017-04-01

    Based on chaos and quantum properties, we propose a quantum private communication scheme with dynamic control code sequence. The initial sequence is obtained via chaotic systems, and the control code sequence is derived by grouping, XOR and extracting. A shift cycle algorithm is designed to enable the dynamic change of control code sequence. Analysis shows that transmission efficiency could reach 100 % with high dynamics and security.

  15. Efficient Quantum Private Communication Based on Dynamic Control Code Sequence

    NASA Astrophysics Data System (ADS)

    Cao, Zheng-Wen; Feng, Xiao-Yi; Peng, Jin-Ye; Zeng, Gui-Hua; Qi, Jin

    2016-12-01

    Based on chaos and quantum properties, we propose a quantum private communication scheme with dynamic control code sequence. The initial sequence is obtained via chaotic systems, and the control code sequence is derived by grouping, XOR and extracting. A shift cycle algorithm is designed to enable the dynamic change of control code sequence. Analysis shows that transmission efficiency could reach 100 % with high dynamics and security.

  16. High-quality draft genome sequence of the Thermus amyloliquefaciens type strain YIM 77409T with an incomplete denitrification pathway

    DOE PAGES

    Zhou, En-Min; Murugapiran, Senthil K.; Mefferd, Chrisabelle C.; ...

    2016-02-27

    Thermus amyloliquefaciens type strain YIM 77409T is a thermophilic, Gram-negative, non-motile and rod-shaped bacterium isolated from Niujie Hot Spring in Eryuan County, Yunnan Province, southwest China. In the present study we describe the features of strain YIM 77409T together with its genome sequence and annotation. The genome is 2,160,855 bp long and consists of 6 scaffolds with 67.4 % average GC content. A total of 2,313 genes were predicted, comprising 2,257 protein-coding and 56 RNA genes. The genome is predicted to encode a complete glycolysis, pentose phosphate pathway, and tricarboxylic acid cycle. Additionally, a large number of transporters and enzymes for heterotrophy highlight the broad heterotrophic lifestyle of this organism. Furthermore, a denitrification gene cluster included genes predicted to encode enzymes for the sequential reduction of nitrate to nitrous oxide, consistent with the incomplete denitrification phenotype of this strain.

  17. Orpinomyces cellulase celf protein and coding sequences

    DOEpatents

    Li, Xin-Liang; Chen, Huizhong; Ljungdahl, Lars G.

    2000-09-05

    A cDNA (1,520 bp), designated celF, consisting of an open reading frame (ORF) encoding a polypeptide (CelF) of 432 amino acids was isolated from a cDNA library of the anaerobic rumen fungus Orpinomyces PC-2 constructed in Escherichia coli. Analysis of the deduced amino acid sequence showed that, starting from the N-terminus, CelF consists of a signal peptide and a cellulose binding domain (CBD), followed by an extremely Asn-rich linker region which separates the CBD from the catalytic domain. The latter is located at the C-terminus. The catalytic domain of CelF is highly homologous to CelA and CelC of Orpinomyces PC-2, to CelA of Neocallimastix patriciarum and also to cellobiohydrolase IIs (CBHIIs) from aerobic fungi. However, like CelA of Neocallimastix patriciarum, CelF does not have the noncatalytic repeated peptide domain (NCRPD) found in CelA and CelC from the same organism. The recombinant protein CelF hydrolyzes cellooligosaccharides in the pattern of CBHII, yielding only cellobiose as product with cellotetraose as the substrate. The genomic celF is interrupted by a 111 bp intron, located within the region coding for the CBD. The intron of celF has features in common with genes from aerobic filamentous fungi.

  18. Full-length HLA-DRB1 coding sequences generated by a hemizygous RNA-SBT approach.

    PubMed

    Gerritsen, K E H; Groeneweg, M; Meertens, C M H; Voorter, C E M; Tilanus, M G J

    2015-11-01

    Currently 1582 HLA-DRB1 alleles have been identified in the IMGT/HLA database (v3.18). Among those alleles, more than 90% have incomplete allele sequences, which complicates the analysis of the functional relevance of polymorphism beyond exon 2. The polymorphic index of each individual exon of the currently known allele sequences, shows that polymorphism is present in all exons, albeit not equally abundant. Full-length HLA-DRB1 RNA sequencing identifies polymorphism of the complete coding region. Here we describe a hemizygous full-length RNA sequence-based typing (SBT) approach based on group-specific HLA-DRB1 amplification and subsequent sequencing. RNA full-length sequences can easily be accessed because of the short amplicon length (801 bp). The RNA-SBT approach was successfully validated on a panel of DRB1 alleles having fully known coding sequences according to the IMGT/HLA database, and cover all serological equivalents. Subsequently, the approach was applied on a panel of 54 alleles with incomplete allele sequences, resulting in full-length coding sequences and the identification of one new and one corrected allele. This study shows the universal applicability of the RNA-based sequencing approach to identify full-length coding sequences and to define the polymorphic content of HLA-DRB1 alleles.

  19. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  20. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.

  1. Three Ingredients for Improved Global Aftershock Forecasts: Tectonic Region, Time-Dependent Catalog Incompleteness, and Inter-Sequence Variability

    NASA Astrophysics Data System (ADS)

    Page, M. T.; Hardebeck, J.; Felzer, K. R.; Michael, A. J.; van der Elst, N.

    2015-12-01

    Following a large earthquake, seismic hazard can be orders of magnitude higher than the long-term average as a result of aftershock triggering. Due to this heightened hazard, there is a demand from emergency managers and the public for rapid, authoritative, and reliable aftershock forecasts. In the past, USGS aftershock forecasts following large, global earthquakes have been released on an ad-hoc basis with inconsistent methods, and in some cases, aftershock parameters adapted from California. To remedy this, we are currently developing an automated aftershock product that will generate more accurate forecasts based on the Reasenberg and Jones (Science, 1989) method. To better capture spatial variations in aftershock productivity and decay, we estimate regional aftershock parameters for sequences within the Garcia et al. (BSSA, 2012) tectonic regions. We find that regional variations for mean aftershock productivity exceed a factor of 10. The Reasenberg and Jones method combines modified-Omori aftershock decay, Utsu productivity scaling, and the Gutenberg-Richter magnitude distribution. We additionally account for a time-dependent magnitude of completeness following large events in the catalog. We generalize the Helmstetter et al. (2005) equation for short-term aftershock incompleteness and solve for incompleteness levels in the global NEIC catalog following large mainshocks. In addition to estimating average sequence parameters within regions, we quantify the inter-sequence parameter variability. This allows for a more complete quantification of the forecast uncertainties and Bayesian updating of the forecast as sequence-specific information becomes available.
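
The Reasenberg and Jones rate that this abstract builds on combines modified-Omori decay with Gutenberg-Richter scaling: the rate of aftershocks of magnitude M or larger at time t after a mainshock of magnitude Mm is 10^(a + b(Mm - M)) (t + c)^(-p). A minimal Python sketch follows. The function names are hypothetical, the parameter values in the example are placeholders assumed for illustration (the abstract's point is that they should be estimated per tectonic region), the Poisson conversion from expected count to probability is an assumption of this sketch, and the time-dependent catalog incompleteness correction discussed in the abstract is not modeled.

```python
import math


def rj_rate(t_days, mag, mainshock_mag, a, b, c, p):
    """Rate (events per day) of aftershocks >= mag at t_days after the mainshock."""
    return 10 ** (a + b * (mainshock_mag - mag)) * (t_days + c) ** (-p)


def rj_expected_count(t1, t2, mag, mainshock_mag, a, b, c, p):
    """Expected number of aftershocks >= mag between t1 and t2 days (rate integrated in closed form)."""
    productivity = 10 ** (a + b * (mainshock_mag - mag))
    if abs(p - 1.0) < 1e-12:
        integral = math.log((t2 + c) / (t1 + c))
    else:
        integral = ((t2 + c) ** (1 - p) - (t1 + c) ** (1 - p)) / (1 - p)
    return productivity * integral


def prob_one_or_more(expected_count):
    """Probability of at least one event, assuming Poisson occurrence."""
    return 1.0 - math.exp(-expected_count)


# Placeholder parameters, assumed for illustration only.
params = dict(a=-1.67, b=0.91, c=0.05, p=1.08)
n = rj_expected_count(0.0, 7.0, mag=5.0, mainshock_mag=7.0, **params)
print(f"expected M>=5 aftershocks in first week: {n:.2f}")
print(f"probability of at least one: {prob_one_or_more(n):.2f}")
```

Quantifying inter-sequence variability, as the abstract proposes, would amount to treating `a` (and possibly `p`) as distributions rather than the single values used here.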

  2. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  3. The Coding and Inter-Manual Transfer of Movement Sequences

    PubMed Central

    Shea, Charles H.; Kovacs, Attila J.; Panzer, Stefan

    2011-01-01

    The manuscript reviews recent experiments that use inter-manual transfer and inter-manual practice paradigms to determine the coordinate system (visual–spatial or motor) used in the coding of movement sequences during physical and observational practice. The results indicated that multi-element movement sequences are more effectively coded in visual–spatial coordinates even following extended practice, while very early in practice movement sequences with only a few movement elements and relatively short durations are coded in motor coordinates. Likewise, inter-manual practice of relatively simple movement sequences show benefits of right and left limb practice that involves the same motor coordinates while the opposite is true for more complex sequences. The results suggest that the coordinate system used to code the sequence information is linked to both the task characteristics and the control processes used to produce the sequence. These findings have the potential to greatly enhance our understanding of why in some conditions participants following practice with one limb or observation of one limb practice can effectively perform the task with the contralateral limb while in other (often similar) conditions cannot. PMID:21716583

  4. Nucleotide sequence alignment using sparse coding and belief propagation.

    PubMed

    Roozgard, Aminmohammad; Barzigar, Nafise; Wang, Shuang; Jiang, Xiaoqian; Ohno-Machado, Lucila; Cheng, Samuel

    2013-01-01

    Advances in DNA information extraction techniques have led to huge sequenced genomes from organisms spanning the tree of life. This increasing amount of genomic information requires tools for comparison of the nucleotide sequences. In this paper, we propose a novel nucleotide sequence alignment method based on sparse coding and belief propagation to compare the similarity of the nucleotide sequences. We used the neighbors of each nucleotide as features, and then we employed sparse coding to find a set of candidate nucleotides. To select optimum matches, belief propagation was subsequently applied to these candidate nucleotides. Experimental results show that the proposed approach is able to robustly align nucleotide sequences and is competitive with SOAPaligner [1] and BWA [2].

  5. Improving mRNA 5' coding sequence determination in the mouse genome.

    PubMed

    Piovesan, Allison; Caracausi, Maria; Pelleri, Maria Chiara; Vitale, Lorenza; Martini, Silvia; Bassani, Chiara; Gurioli, Annalisa; Casadei, Raffaella; Soldà, Giulia; Strippoli, Pierluigi

    2014-04-01

    The incomplete determination of the mRNA 5' end sequence may lead to the incorrect assignment of the first AUG codon and to errors in the prediction of the encoded protein product. Due to the significance of the mouse as a model organism in biomedical research, we performed a systematic identification of coding regions at the 5' end of all known mouse mRNAs, using an automated expressed sequence tag (EST)-based approach which we have previously described. By parsing almost 4 million BLAT alignments we found 351 mouse loci, out of 20,221 analyzed, in which an extension of the mRNA 5' coding region was identified. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for Apc2 and Mknk2 cDNAs. We also generated a list of 16,330 mouse mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5' end in the current form. Systematic searches in the main mouse genome databases and genome browsers showed that 82% of our results are original and have not been identified by their annotation pipelines. Moreover, the same information is not easily derivable from RNA-Seq data, due to short sequence length and laboriousness in building full-length transcript structures. In conclusion, our results improve the determination of full-length 5' coding sequences and might be useful in order to reduce errors when studying mouse gene structure and function in biomedical research.
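
The completeness criterion used in this abstract, an in-frame stop codon upstream of the annotated start codon, is simple to express in code. The following Python sketch is an illustration only, not the authors' EST-based pipeline: the function name and the toy sequences are hypothetical, and the input is assumed to be an mRNA or cDNA string with a known 0-based start codon position.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_upstream_inframe_stop(mrna, start_index):
    """Return True if a stop codon lies upstream of, and in frame with, the start codon.

    mrna: transcript sequence (DNA or RNA alphabet).
    start_index: 0-based position of the annotated start codon.
    """
    seq = mrna.upper().replace("U", "T")
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("start_index does not point at an ATG/AUG codon")
    # Walk backwards from the start codon one codon at a time.
    for i in range(start_index - 3, -1, -3):
        if seq[i:i + 3] in STOP_CODONS:
            return True
    return False


# Toy examples (hypothetical sequences).
print(has_upstream_inframe_stop("TAGCCCATGAAA", 6))  # TAG at 0 is in frame
print(has_upstream_inframe_stop("TAGCCATGAAA", 5))   # TAG at 0 is out of frame
```

A True result means no longer open reading frame can extend 5' of the annotated start within the known sequence; a False result leaves open the possibility that the true start codon lies further upstream, which is the situation the abstract's 351 extended loci represent.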

  6. Streamlined Genome Sequence Compression using Distributed Source Coding

    PubMed Central

    Wang, Shuang; Jiang, Xiaoqian; Chen, Feng; Cui, Lijuan; Cheng, Samuel

    2014-01-01

    We aim at developing a streamlined genome sequence compression algorithm to support alternative miniaturized sequencing devices, which have limited communication, storage, and computation power. Existing techniques that require heavy client (encoder side) cannot be applied. To tackle this challenge, we carefully examined distributed source coding theory and developed a customized reference-based genome compression protocol to meet the low-complexity need at the client side. Based on the variation between source and reference, our protocol will pick adaptively either syndrome coding or hash coding to compress subsequences of changing code length. Our experimental results showed promising performance of the proposed method when compared with the state-of-the-art algorithm (GRS). PMID:25520552

  7. Mixed hidden Markov quantile regression models for longitudinal data with possibly incomplete sequences.

    PubMed

    Marino, Maria Francesca; Tzavidis, Nikos; Alfò, Marco

    2016-01-01

    Quantile regression provides a detailed and robust picture of the distribution of a response variable, conditional on a set of observed covariates. Recently, it has been extended to the analysis of longitudinal continuous outcomes using either time-constant or time-varying random parameters. However, in real-life data, we frequently observe both temporal shocks in the overall trend and individual-specific heterogeneity in model parameters. A benchmark dataset on HIV progression gives a clear example. Here, the evolution of the CD4 log counts exhibits both sudden temporal changes in the overall trend and heterogeneity in the effect of the time since seroconversion on the response dynamics. To accommodate such situations, we propose a quantile regression model, where time-varying and time-constant random coefficients are jointly considered. Since observed data may be incomplete due to early drop-out, we also extend the proposed model in a pattern mixture perspective. We assess the performance of the proposals via a large-scale simulation study and the analysis of the CD4 count data.

  8. Transcriptome Sequencing Reveals the Character of Incomplete Dosage Compensation across Multiple Tissues in Flycatchers

    PubMed Central

    Uebbing, Severin; Künstner, Axel; Mäkinen, Hannu; Ellegren, Hans

    2013-01-01

    Sex chromosome divergence, which follows the cessation of recombination and degeneration of the sex-limited chromosome, can cause a reduction in expression level for sex-linked genes in the heterozygous sex, unless some mechanism of dosage compensation develops to counter the reduction in gene dose. Because large-scale perturbations in expression levels arising from changes in gene dose might have strong deleterious effects, the evolutionary response should be strong. However, in birds and in at least some other female heterogametic organisms, wholesale sex chromosome dosage compensation does not seem to occur. Using RNA-seq of multiple tissues and individuals, we investigated male and female expression levels of Z-linked and autosomal genes in the collared flycatcher, a bird for which a draft genome sequence recently has been reported. We found that male expression of Z-linked genes was on average 50% higher than female expression, although there was considerable variation in the male-to-female ratio among genes. The ratio for individual genes was well correlated among tissues and there was also a correlation in the extent of compensation between flycatcher and chicken orthologs. The relative excess of male expression was positively correlated with expression breadth, expression level, and number of interacting proteins (protein connectivity), and negatively correlated with variance in expression. These observations lead to a model of compensation occurring on a gene-by-gene basis, supported by an absence of clustering of genes on the Z chromosome with respect to the extent of compensation. Equal mean expression level of autosomal and Z-linked genes in males, and 50% higher expression of autosomal than Z-linked genes in females, is compatible with partial compensation being achieved by hypertranscription from females’ single Z chromosome. A comparison with male-to-female expression ratios in orthologous Z-linked genes of ostriches, where Z–W recombination

  9. RNAcentral: A comprehensive database of non-coding RNA sequences

    DOE PAGES

    Williams, Kelly Porter; Lau, Britney Yan

    2016-10-28

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.

  10. RNAcentral: a comprehensive database of non-coding RNA sequences

    PubMed Central

    2017-01-01

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/. PMID:27794554

  11. RNAcentral: A comprehensive database of non-coding RNA sequences

    SciTech Connect

    Williams, Kelly Porter; Lau, Britney Yan

    2016-10-28

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.

  12. Multifractal detrended cross-correlation analysis of coding and non-coding DNA sequences through chaos-game representation

    NASA Astrophysics Data System (ADS)

    Pal, Mayukha; Satish, B.; Srinivas, K.; Rao, P. Madhusudana; Manimaran, P.

    2015-10-01

    We propose a new approach combining the chaos game representation and the two dimensional multifractal detrended cross correlation analysis methods to examine multifractal behavior in power law cross correlation between any pair of nucleotide sequences of unequal lengths. In this work, we analyzed the characteristic behavior of coding and non-coding DNA sequences of eight prokaryotes. The results show the presence of strong multifractal nature between coding and non-coding sequences of all data sets. We found that this integrative approach helps us to consider complete DNA sequences for characterization, and further it may be useful for classification, clustering, identification of class affiliation of nucleotide sequences etc. with high precision.
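
    The chaos game representation mentioned above can be illustrated with a minimal sketch. It assumes one common corner assignment (A, C, G, T at the corners of the unit square) and a start at the centre; the function name is hypothetical and this is not the authors' implementation, which additionally applies a two-dimensional multifractal analysis to the resulting image.

```python
def chaos_game_representation(sequence):
    """Map a DNA string to a list of (x, y) points in the unit square."""
    # Assumed corner convention; other assignments are also in use.
    corners = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in sequence.upper():
        if base not in corners:
            continue  # skip ambiguous symbols such as N
        cx, cy = corners[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points
```

    Because every sequence is mapped into the same unit square, sequences of unequal lengths can be compared as images of the same size, which is the property the abstract relies on.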

  13. High-quality draft genome sequence of the Thermus amyloliquefaciens type strain YIM 77409T with an incomplete denitrification pathway

    SciTech Connect

    Zhou, En-Min; Murugapiran, Senthil K.; Mefferd, Chrisabelle C.; Liu, Lan; Xian, Wen-Dong; Yin, Yi-Rui; Ming, Hong; Yu, Tian-Tian; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T. B. K.; Ngan, Chew Yee; Daum, Chris; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Spunde, Alexander; Kyrpides, Nikos; Woyke, Tanja; Li, Wen-Jun; Hedlund, Brian P.

    2016-02-27

    Thermus amyloliquefaciens type strain YIM 77409T is a thermophilic, Gram-negative, non-motile and rod-shaped bacterium isolated from Niujie Hot Spring in Eryuan County, Yunnan Province, southwest China. In the present study we describe the features of strain YIM 77409T together with its genome sequence and annotation. The genome is 2,160,855 bp long and consists of 6 scaffolds with 67.4 % average GC content. A total of 2,313 genes were predicted, comprising 2,257 protein-coding and 56 RNA genes. The genome is predicted to encode a complete glycolysis, pentose phosphate pathway, and tricarboxylic acid cycle. Additionally, a large number of transporters and enzymes for heterotrophy highlight the broad heterotrophic lifestyle of this organism. Furthermore, a denitrification gene cluster included genes predicted to encode enzymes for the sequential reduction of nitrate to nitrous oxide, consistent with the incomplete denitrification phenotype of this strain.
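
    The average GC content quoted above is a simple ratio. A minimal sketch, assuming ambiguous symbols such as N are excluded from the denominator (the abstract does not say how they were handled) and using a hypothetical function name:

```python
def gc_content(sequence):
    """Return the GC content of a DNA string as a percentage."""
    # Only unambiguous bases are counted; this is an assumption.
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)
```

    For a multi-scaffold assembly such as this one, the scaffolds would be concatenated (or their counts summed) before taking the ratio.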

  14. Sequence and Structural Analyses for Functional Non-coding RNAs

    NASA Astrophysics Data System (ADS)

    Sakakibara, Yasubumi; Sato, Kengo

    Analysis and detection of functional RNAs are currently important topics in both molecular biology and bioinformatics research. Several computational methods based on stochastic context-free grammars (SCFGs) have been developed for modeling and analysing functional RNA sequences. These grammatical methods have succeeded in modeling typical secondary structures of RNAs and are used for structural alignments of RNA sequences. Such stochastic models, however, are not sufficient to discriminate member sequences of an RNA family from non-members, and hence to detect non-coding RNA regions from genome sequences. Recently, the support vector machine (SVM) and kernel function techniques have been actively studied and proposed as a solution to various problems in bioinformatics. SVMs are trained from positive and negative samples and have strong, accurate discrimination abilities, and hence are more appropriate for the discrimination tasks. A few kernel functions that extend the string kernel to measure the similarity of two RNA sequences from the viewpoint of secondary structures have been proposed. In this article, we give an overview of recent progress in SCFG-based methods for RNA sequence analysis and novel kernel functions tailored to measure the similarity of two RNA sequences and developed for use with support vector machines (SVM) in discriminating members of an RNA family from non-members.

  15. Coding Deficits in Noise-Induced Hidden Hearing Loss May Stem from Incomplete Repair of Ribbon Synapses in the Cochlea

    PubMed Central

    Shi, Lijuan; Chang, Yin; Li, Xiaowei; Aiken, Steven J.; Liu, Lijie; Wang, Jian

    2016-01-01

    Recent evidence has shown that noise-induced damage to the synapse between inner hair cells (IHCs) and type I afferent auditory nerve fibers (ANFs) may occur in the absence of permanent threshold shift (PTS), and that synapses connecting IHCs with low spontaneous rate (SR) ANFs are disproportionately affected. Due to the functional importance of low-SR ANF units for temporal processing and signal coding in noisy backgrounds, deficits in cochlear coding associated with noise-induced damage may result in significant difficulties with temporal processing and hearing in noise (i.e., “hidden hearing loss”). However, significant noise-induced coding deficits have not been reported at the single unit level following the loss of low-SR units. We have found evidence to suggest that some aspects of neural coding are not significantly changed with the initial loss of low-SR ANFs, and that further coding deficits arise in association with the subsequent reestablishment of the synapses. This suggests that synaptopathy in hidden hearing loss may be the result of insufficient repair of disrupted synapses, and not simply due to the loss of low-SR units. These coding deficits include decreases in driven spike rate for intensity coding as well as several aspects of temporal coding: spike latency, peak-to-sustained spike ratio and the recovery of spike rate as a function of click-interval. PMID:27252621

  16. Genetic algorithms with permutation coding for multiple sequence alignment.

    PubMed

    Ben Othman, Mohamed Tahar; Abdel-Azim, Gamil

    2013-08-01

    Multiple sequence alignment (MSA) is one of the topics of bioinformatics that has been seriously researched. It is known to be an NP-complete problem. It is also considered one of the most important and daunting tasks in computational biology. Accordingly, a large number of heuristic algorithms have been proposed to find an optimal alignment. Among these heuristic algorithms are genetic algorithms (GA). The GA has two major weaknesses: it is time consuming and can become trapped in local minima. One of the significant aspects in the GA process in MSA is to maximize the similarities between sequences by adding and shuffling the gaps of Solution Coding (SC). Several ways for SC have been introduced. One of them is the Permutation Coding (PC). We propose a hybrid algorithm based on genetic algorithms (GAs) with a PC and 2-opt algorithm. The PC helps to code the MSA solution which maximizes the gain of resources, reliability and diversity of GA. The use of the PC opens the area by applying all functions over permutations for MSA. Thus, we suggest an algorithm to calculate the scoring function for multiple alignments based on PC, which is used as fitness function. The time complexity of the GA is reduced by using this algorithm. Our GA is implemented with different selection strategies and different crossovers. The probability of crossover and mutation is set as one strategy. Relevant patents have been probed in the topic.

  17. Ribosomal S27a coding sequences upstream of ubiquitin coding sequences in the genome of a pestivirus.

    PubMed

    Becher, P; Orlich, M; Thiel, H J

    1998-11-01

    Molecular characterization of cytopathogenic (cp) bovine viral diarrhea virus (BVDV) strain CP Rit, a temperature-sensitive strain widely used for vaccination, revealed that the viral genomic RNA is about 15.2 kb long, which is about 2.9 kb longer than the one of noncytopathogenic (noncp) BVDV strains. Molecular cloning and nucleotide sequencing of parts of the genome resulted in the identification of a duplication of the genomic region encoding nonstructural proteins NS3, NS4A, and part of NS4B. In addition, a nonviral sequence was found directly upstream of the second copy of the NS3 gene. The 3' part of this inserted sequence encodes an N-terminally truncated ubiquitin monomer. This is remarkable since all described cp BVDV strains with ubiquitin coding sequences contain at least one complete ubiquitin monomer. The 5' region of the nonviral sequence did not show any homology to cellular sequences identified thus far in cp BVDV strains. Databank searches revealed that this second cellular insertion encodes part of ribosomal protein S27a. Further analyses included molecular cloning and nucleotide sequencing of the cellular recombination partner. Sequence comparisons strongly suggest that the S27a and the ubiquitin coding sequences found in the genome of CP Rit were both derived from a bovine mRNA encoding a hybrid protein with the structure NH2-ubiquitin-S27a-COOH. Polyprotein processing in the genomic region encoding the N-terminal part of NS4B, the two cellular insertions, and NS3 was studied by a transient-expression assay. The respective analyses showed that the S27a-derived polypeptide, together with the truncated ubiquitin, served as processing signal to yield NS3, whereas the truncated ubiquitin alone was not capable of mediating the cleavage. Since the expression of NS3 is strictly correlated with the cp phenotype of BVDV, the altered genome organization leading to expression of NS3 most probably represents the genetic basis of cytopathogenicity of CP Rit.

  18. Code-Time Diversity for Direct Sequence Spread Spectrum Systems

    PubMed Central

    Hassan, A. Y.

    2014-01-01

    Time diversity is achieved in direct sequence spread spectrum by receiving different faded delayed copies of the transmitted symbols from different uncorrelated channel paths when the transmission signal bandwidth is greater than the coherence bandwidth of the channel. In this paper, a new time diversity scheme is proposed for spread spectrum systems. It is called code-time diversity. In this new scheme, N spreading codes are used to transmit one data symbol over N successive symbol intervals. The diversity order in the proposed scheme equals the number of spreading codes used, N, multiplied by the number of uncorrelated paths of the channel, L. The paper presents the transmitted signal model. Two demodulator structures are proposed based on the received signal models from Rayleigh flat and frequency selective fading channels. Probability of error in the proposed diversity scheme is also calculated for the same two fading channels. Finally, simulation results are presented and compared with those of maximal ratio combiner (MRC) and multiple-input and multiple-output (MIMO) systems. PMID:24982925

  19. Optimal coding of vectorcardiographic sequences using spatial prediction.

    PubMed

    Augustyniak, Piotr

    2007-05-01

    This paper discusses principles, implementation details, and advantages of a sequence coding algorithm applied to the compression of vectorcardiograms (VCG). The main novelty of the proposed method is the automatic management of distortion distribution controlled by the local signal contents in both technical and medical aspects. As in clinical practice, the VCG loops representing P, QRS, and T waves in the three-dimensional (3-D) space are considered here as three simultaneous sequences of objects. Because of the similarity of neighboring loops, encoding the values of prediction error significantly reduces the data set volume. The residual values are de-correlated with the discrete cosine transform (DCT) and truncated at a certain energy threshold. The presented method is based on the irregular temporal distribution of medical data in the signal and takes advantage of variable sampling frequency for automatically detected VCG loops. The features of the proposed algorithm are confirmed by the results of the numerical experiment carried out for a wide range of real records. The average data reduction ratio reaches a value of 8.15 while the percent root-mean-square difference (PRD) distortion ratio for the most important sections of signal does not exceed 1.1%.
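
    The PRD figure quoted above can be made concrete with a short sketch. It assumes the plain definition of the percent root-mean-square difference, without subtracting the signal mean or a baseline; some studies use a mean-corrected variant, and the abstract does not say which was applied. The function name is hypothetical.

```python
import math

def prd(original, reconstructed):
    """Percent root-mean-square difference between two equal-length signals."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    energy = sum(x * x for x in original)
    if energy == 0:
        raise ValueError("original signal has zero energy")
    # Squared reconstruction error relative to the energy of the original.
    error = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    return 100.0 * math.sqrt(error / energy)
```

    A lossless reconstruction gives a PRD of 0, and larger values indicate more distortion.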

  20. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  1. Image sequence coding using 3D scene models

    NASA Astrophysics Data System (ADS)

    Girod, Bernd

    1994-09-01

    The implicit and explicit use of 3D models for image sequence coding is discussed. For implicit use, a 3D model can be incorporated into motion compensating prediction. A scheme that estimates the displacement vector field with a rigid body motion constraint by recovering epipolar lines from an unconstrained displacement estimate and then repeating block matching along the epipolar line is proposed. Experimental results show that an improved displacement vector field can be obtained with a rigid body motion constraint. As an example for explicit use, various results with a facial animation model for videotelephony are discussed. A 13 X 16 B-spline mask can be adapted automatically to individual faces and is used to generate facial expressions based on FACS. A depth-from-defocus range camera suitable for real-time facial motion tracking is described. Finally, the real-time facial animation system `Traugott' is presented that has been used to generate several hours of broadcast video. Experiments suggest that a videophone system based on facial animation might require a transmission bitrate of 1 kbit/s or below.

  2. In search of coding and non-coding regions of DNA sequences based on balanced estimation of diffusion entropy.

    PubMed

    Zhang, Jin; Zhang, Wenqing; Yang, Huijie

    2016-01-01

    Identification of coding regions in DNA sequences remains challenging. Various methods have been proposed, but these are limited by species-dependence and the need for adequate training sets. The elements in DNA coding regions are known to be distributed in a quasi-random way, while those in non-coding regions have typical similar structures. For short sequences, these statistical characteristics cannot be extracted correctly and cannot even be detected. This paper introduces a new way to solve the problem: balanced estimation of diffusion entropy (BEDE).

  3. Non-extensive trends in the size distribution of coding and non-coding DNA sequences in the human genome

    NASA Astrophysics Data System (ADS)

    Oikonomou, Th.; Provata, A.

    2006-03-01

    We study the primary DNA structure of four of the most completely sequenced human chromosomes (including chromosome 19 which is the most dense in coding), using non-extensive statistics. We show that the exponents governing the spatial decay of the coding size distributions vary between 5.2 ≤ r ≤ 5.7 for the short scales and 1.45 ≤ q ≤ 1.50 for the large scales. On the contrary, the exponents governing the spatial decay of the non-coding size distributions in these four chromosomes take the values 2.4 ≤ r ≤ 3.2 for the short scales and 1.50 ≤ q ≤ 1.72 for the large scales. These results, in particular the values of the tail exponent q, indicate the existence of correlations in the coding and non-coding size distributions with tendency for higher correlations in the non-coding DNA.

  4. A convolutional code-based sequence analysis model and its application.

    PubMed

    Liu, Xiao; Geng, Xiaoli

    2013-04-16

    A new approach for encoding DNA sequences as input for DNA sequence analysis is proposed using the error correction coding theory of communication engineering. The encoder was designed as a convolutional code model whose generator matrix is designed based on the degeneracy of codons, with a codon treated in the model as an informational unit. The utility of the proposed model was demonstrated through the analysis of twelve prokaryote and nine eukaryote DNA sequences having different GC contents. Distinct differences in code distances were observed near the initiation and termination sites in the open reading frame, which provided a well-regulated characterization of the DNA sequences. Clearly distinguished period-3 features appeared in the coding regions, and the characteristic average code distances of the analyzed sequences were approximately proportional to their GC contents, particularly in the selected prokaryotic organisms, presenting the potential utility as an added taxonomic characteristic for use in studying the relationships of living organisms.

  5. Polymorphism, shared functions and convergent evolution of genes with sequences coding for polyalanine domains.

    PubMed

    Lavoie, Hugo; Debeane, Francois; Trinh, Quoc-Dien; Turcotte, Jean-Francois; Corbeil-Girard, Louis-Philippe; Dicaire, Marie-Josée; Saint-Denis, Anik; Pagé, Martin; Rouleau, Guy A; Brais, Bernard

    2003-11-15

    Mutations causing expansions of polyalanine domains are responsible for nine hereditary diseases. Other GC-rich sequences coding for some polyalanine domains were found to be polymorphic in human. These observations prompted us to identify all sequences in the human genome coding for polyalanine stretches longer than four alanines and establish their degree of polymorphism. We identified 494 annotated human proteins containing 604 polyalanine domains. Thirty-two percent (31/98) of tested sequences coding for more than seven alanines were polymorphic. The length of the polyalanine-coding sequence and its GCG or GCC repeat content are the major predictors of polymorphism. GCG codons are over-represented in human polyalanine coding sequences. Our data suggest that GCG and GCC codons play a key role in polyalanine-coding sequence appearance and polymorphism. The grouping by shared function of polyalanine-containing proteins in Homo sapiens, Drosophila melanogaster and Caenorhabditis elegans shows that the majority are involved in transcriptional regulation. Phylogenetic analyses of HOX, GATA and EVX protein families demonstrate that polyalanine domains arose independently in different members of these families, suggesting that convergent molecular evolution may have played a role. Finally polyalanine domains in vertebrates are conserved between mammals and are rarer and shorter in Gallus gallus and Danio rerio. Together our results show that the polymorphic nature of sequences coding for polyalanine domains makes them prime candidates for mutations in hereditary diseases and suggests that they have appeared in many different protein families through convergent evolution.
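
    The search criterion described above (polyalanine stretches longer than four alanines) can be sketched as a scan over a protein sequence in one-letter code. The function name and the default threshold of five residues are illustrative assumptions; this is not the authors' pipeline, which worked from annotated human proteins and their coding sequences.

```python
import re

def find_polyalanine_runs(protein, min_length=5):
    """Return (start, length) for each run of at least min_length alanines."""
    # "Longer than four alanines" is read here as five or more.
    pattern = re.compile("A{%d,}" % min_length)
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(protein.upper())]
```

    Raising `min_length` to 8 would select the "more than seven alanines" subset that the abstract reports as tested for polymorphism.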

  6. Nucleotide sequence from the coding region of rabbit β-globin messenger RNA

    PubMed Central

    Proudfoot, N.J.

    1976-01-01

    A sequence of 89 nucleotides from rabbit β-globin mRNA has been determined and is shown to code for residues 107 to 137 of the β-globin protein. In addition, a sequence heterogeneity has been identified within this 89 nucleotide long sequence which corresponds to a known polymorphic variant of rabbit β-globin. PMID:61580

  7. Direct sequence CDMA power control, interleaving, and coding

    NASA Astrophysics Data System (ADS)

    Simpson, Floyd; Holtzman, Jack M.

    1993-09-01

    We develop and analyze models of power control working with other aspects of CDMA systems, such as interleaving and coding on the land/mobile radio channel. Our orientation is that a power control scheme is keeping the received powers at the base station 'almost equal', and we will be quantifying the performance degradation incurred if the powers are not exactly equal. In doing so, we consider the performance implications of control latency and a maximum speech delay constraint. It turns out that because of positive correlations between the fading channel amplitudes, the effectiveness of the combination of interleaving and coding in combating the effects of power variations due to slow Rayleigh fading is reduced. It is shown, however, that power control and interleaving/coding are most effective in complementary parameter regions, thus providing a degree of robustness for both fast and slow Rayleigh fading.

  8. Sequences encoding identical peptides for the analysis and manipulation of coding DNA

    PubMed Central

    Sánchez, Joaquín

    2013-01-01

    The use of sequences encoding identical peptides (SEIP) for the in silico analysis of coding DNA from different species has not been reported; the study of such sequences could directly reveal properties of coding DNA that are independent of peptide sequences. For practical purposes SEIP might also be manipulated for e.g. heterologous protein expression. We extracted 1,551 SEIP from human and E. coli and 2,631 SEIP from human and D. melanogaster. We then analyzed codon usage and intercodon dinucleotide tendencies and found differences in both, with more conspicuous disparities between human and E. coli than between human and D. melanogaster. We also briefly manipulated SEIP to find out if they could be used to create new coding sequences. We hence attempted replacement of human by E. coli codons via dicodon exchange but found that full replacement was not possible, this indicated robust species-specific dicodon tendencies. To test another form of codon replacement we isolated SEIP from human and the jellyfish green fluorescent protein (GFP) and we then re-constructed the GFP coding DNA with human tetra-peptide-coding sequences. Results provide proof-of-principle that SEIP may be used to reveal differences in the properties of coding DNA and to reconstruct in pieces a protein coding DNA with sequences from a different organism, the latter might be exploited in heterologous protein expression. PMID:23861567
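
    The two statistics compared above, codon usage and intercodon dinucleotide tendencies, can be sketched as simple counts over an in-frame coding sequence. The function names are hypothetical, the input is assumed to start at the first base of a codon, and any trailing partial codon is ignored; this is an illustration rather than the author's analysis.

```python
from collections import Counter

def codon_usage(coding_dna):
    """Count in-frame codons, ignoring a trailing partial codon."""
    seq = coding_dna.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))

def intercodon_dinucleotides(coding_dna):
    """Count dinucleotides formed by the last base of a codon and the first base of the next."""
    seq = coding_dna.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i + 2:i + 4] for i in range(0, usable - 3, 3))
```

    Applied to two sequences that encode the same peptide, differences between the resulting counters reflect properties of the coding DNA rather than of the peptide, which is the premise of the SEIP comparison.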

  9. Sequences encoding identical peptides for the analysis and manipulation of coding DNA.

    PubMed

    Sánchez, Joaquín

    2013-01-01

    The use of sequences encoding identical peptides (SEIP) for the in silico analysis of coding DNA from different species has not been reported; the study of such sequences could directly reveal properties of coding DNA that are independent of peptide sequences. For practical purposes SEIP might also be manipulated for e.g. heterologous protein expression. We extracted 1,551 SEIP from human and E. coli and 2,631 SEIP from human and D. melanogaster. We then analyzed codon usage and intercodon dinucleotide tendencies and found differences in both, with more conspicuous disparities between human and E. coli than between human and D. melanogaster. We also briefly manipulated SEIP to find out if they could be used to create new coding sequences. We hence attempted replacement of human by E. coli codons via dicodon exchange but found that full replacement was not possible, this indicated robust species-specific dicodon tendencies. To test another form of codon replacement we isolated SEIP from human and the jellyfish green fluorescent protein (GFP) and we then re-constructed the GFP coding DNA with human tetra-peptide-coding sequences. Results provide proof-of-principle that SEIP may be used to reveal differences in the properties of coding DNA and to reconstruct in pieces a protein coding DNA with sequences from a different organism, the latter might be exploited in heterologous protein expression.

  10. Revisiting the Physico-Chemical Hypothesis of Code Origin: An Analysis Based on Code-Sequence Coevolution in a Finite Population

    NASA Astrophysics Data System (ADS)

    Bandhu, Ashutosh Vishwa; Aggarwal, Neha; Sengupta, Supratim

    2013-12-01

    The origin of the genetic code marked a major transition from a plausible RNA world to the world of DNA and proteins and is an important milestone in our understanding of the origin of life. We examine the efficacy of the physico-chemical hypothesis of code origin by carrying out simulations of code-sequence coevolution in finite populations in stages, leading first to the emergence of ten amino acid code(s) and subsequently to 14 amino acid code(s). We explore two different scenarios of primordial code evolution. In one scenario, competition occurs between populations of equilibrated code-sequence sets, while in another scenario, new codes compete with existing codes as they are gradually introduced into the population with a finite probability. In either case, we find that natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. The code whose structure is most consistent with the standard genetic code is often not among the codes that have a high fixation probability. However, we find that the composition of the code population affects the code fixation probability. A physico-chemically optimized code gets fixed with a significantly higher probability if it competes against a set of randomly generated codes. Our results suggest that physico-chemical optimization may not be the sole driving force in ensuring the emergence of the standard genetic code.

  11. Three ingredients for Improved global aftershock forecasts: Tectonic region, time-dependent catalog incompleteness, and inter-sequence variability

    USGS Publications Warehouse

    Page, Morgan T.; Van Der Elst, Nicholas; Hardebeck, Jeanne L.; Felzer, Karen; Michael, Andrew J.

    2016-01-01

    Following a large earthquake, seismic hazard can be orders of magnitude higher than the long‐term average as a result of aftershock triggering. Because of this heightened hazard, emergency managers and the public demand rapid, authoritative, and reliable aftershock forecasts. In the past, U.S. Geological Survey (USGS) aftershock forecasts following large global earthquakes have been released on an ad hoc basis with inconsistent methods, and in some cases aftershock parameters adapted from California. To remedy this, the USGS is currently developing an automated aftershock product based on the Reasenberg and Jones (1989) method that will generate more accurate forecasts. To better capture spatial variations in aftershock productivity and decay, we estimate regional aftershock parameters for sequences within the García et al. (2012) tectonic regions. We find that regional variations for mean aftershock productivity reach almost a factor of 10. We also develop a method to account for the time‐dependent magnitude of completeness following large events in the catalog. In addition to estimating average sequence parameters within regions, we develop an inverse method to estimate the intersequence parameter variability. This allows for a more complete quantification of the forecast uncertainties and Bayesian updating of the forecast as sequence‐specific information becomes available.
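
    The Reasenberg and Jones (1989) method named above is usually written as a rate of aftershocks at or above magnitude M at time t after a mainshock of magnitude Mm: rate = 10^(a + b(Mm - M)) * (t + c)^(-p). The sketch below assumes that standard form; the function and parameter names are illustrative, no parameter values are taken from this abstract, and the regional estimation and Bayesian updating it describes are not shown.

```python
def aftershock_rate(t_days, magnitude, mainshock_magnitude, a, b, c, p):
    """Expected rate (per day) of aftershocks at or above `magnitude` at time t_days."""
    if t_days + c <= 0:
        raise ValueError("t_days + c must be positive")
    # Productivity term: scales with the mainshock-aftershock magnitude difference.
    productivity = 10.0 ** (a + b * (mainshock_magnitude - magnitude))
    # Omori-type decay of the rate with time since the mainshock.
    return productivity * (t_days + c) ** (-p)
```

    In this form, a shift of 1 in the productivity parameter `a` corresponds to the factor-of-10 regional variation in mean aftershock productivity that the abstract reports.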

  12. Indoor Mobile Positioning Based on Lidar Data and Coded Sequence Pattern

    NASA Astrophysics Data System (ADS)

    Wang, Z.; Dong, B.; Chen, D.

    2016-10-01

    This paper proposes a coded sequence pattern for automatic matching of LiDAR point data. The methods include SIFT features, Otsu segmentation, and the fast Hough transform for identifying, positioning, and interpreting the coded sequence patterns, and the POSIT model for fast computation of the translation and rotation parameters of LiDAR point data, so as to achieve fast matching of LiDAR point data and automatic 3D mapping of indoor shafts and tunnels.

  13. Coherent direct sequence optical code multiple access encoding-decoding efficiency versus wavelength detuning.

    PubMed

    Pastor, D; Amaya, W; García-Olcina, R; Sales, S

    2007-07-01

    We present a simple theoretical model of and the experimental verification for vanishing of the autocorrelation peak due to wavelength detuning on the coding-decoding process of coherent direct sequence optical code multiple access systems based on a superstructured fiber Bragg grating. Moreover, the detuning vanishing effect has been explored to take advantage of this effect and to provide an additional degree of multiplexing and/or optical code tuning.

  14. Correcting sequencing errors in DNA coding regions using a dynamic programming approach.

    PubMed

    Xu, Y; Mural, R J; Uberbacher, E C

    1995-04-01

    This paper presents an algorithm for detecting and 'correcting' sequencing errors that occur in DNA coding regions. The types of sequencing errors addressed are insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. This would permit improved sequencing efficiency and reduce genome sequencing costs. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of 'neutral' bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. We have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. Preliminary test results have shown the usefulness of this algorithm and also exhibited some of its weakness, providing possible directions for further improvement. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false message on the 'corrected' sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the 'corrupted' sequences using standard GRAIL II method (version 1.2).(ABSTRACT TRUNCATED AT 250 WORDS)
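
    The correction step described above, inserting neutral bases at a perceived frame transition point, can be illustrated in isolation. The sketch below assumes the transition position and the net frame offset are already known; it does not reproduce the statistical frame detection or the dynamic programming of the paper, and the function names are hypothetical.

```python
def codons(seq):
    """Split a sequence into complete codons, dropping a trailing partial one."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def pad_for_frame(seq, transition_pos, frame_offset, neutral="N"):
    """Insert neutral bases at transition_pos so that downstream codons are
    read in the original frame again.

    frame_offset is the net number of bases gained (+) or lost (-) upstream
    of the transition; one lost base needs one neutral base, one gained base
    needs two.
    """
    n_pad = (-frame_offset) % 3
    return seq[:transition_pos] + neutral * n_pad + seq[transition_pos:]

# Usage: a single-base deletion at index 4 shifts every downstream codon.
original = "ATGGCCAAATTT"
deleted = original[:4] + original[5:]
repaired = pad_for_frame(deleted, 4, -1)
```

    After padding, the codons downstream of the transition match those of the original sequence, while the codon containing the neutral base stays ambiguous.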

  15. The primordial sequence, ribosomes, and the genetic code.

    NASA Technical Reports Server (NTRS)

    Fox, S. W.; Yuki, A.; Waehneldt, T. V.; Lacey, J. C., Jr.

    1971-01-01

    Experimental investigation of the key question of the origin of life concerning the chronological order in the primordial sequence of nucleic acid, protein, and cell. It is pointed out that, when viewed against the background of experiments on the selective reaction of basic homopolyamino acids with mononucleotides (Lacey and Pruitt, 1969; Woese, 1968), the experiments made help to establish a basis for understanding how information originally flowed from proteins to nucleic acids.

  16. Correcting sequencing errors in DNA coding regions using a dynamic programming approach

    SciTech Connect

    Xu, Y.; Mural, R.J.; Uberbacher, E.C.

    1994-12-01

    This paper presents an algorithm for detecting and "correcting" sequencing errors that occur in DNA coding regions. The types of sequencing error addressed include insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of "neutral" bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. The authors have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false message on the "corrected" sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the "corrupted" sequences using standard GRAIL II method. The method uses a dynamic programming algorithm, and runs in time and space linear to the size of the input sequence.

  17. Incomplete invention of drugs.

    PubMed

    Hisa, Tomoyuki

    2007-02-01

    Scientists seldom know the differences between "rejected invention", "non-invention", "incomplete invention", "invention yet to be completed" and "defective invention". The Japanese Supreme Court appointed me as a specialist member (Article 92-2, Code of Civil Procedure) of the intellectual property division for medical and biological patents. Herein, I introduce scientists to these differences and explain which of these inventions are patentable. To avoid having their noblesse oblige taken for granted by clever business administrations, scientists must know the borderline between patentable and non-patentable inventions.

  18. Identification of a Polyketide Synthase Coding Sequence Specific for Anatoxin-a-Producing Oscillatoria Cyanobacteria

    PubMed Central

    Cadel-Six, Sabrina; Iteman, Isabelle; Peyraud-Thomas, Caroline; Mann, Stéphane; Ploux, Olivier; Méjean, Annick

    2009-01-01

    We report the identification of a sequence from the genome of Oscillatoria sp. strain PCC 6506 coding for a polyketide synthase. Using 50 axenic cyanobacteria, we found this sequence only in the genomes of Oscillatoria strains producing anatoxin-a or homoanatoxin-a, indicating its likely involvement in the biosynthesis of these toxins. PMID:19447947

  19. Protection of the genome and central protein-coding sequences by non-coding DNA against DNA damage from radiation.

    PubMed

    Qiu, Guo-Hua

    2015-01-01

    Non-coding DNA comprises a very large proportion of the total genomic content in higher organisms, but its function remains largely unclear. Non-coding DNA sequences constitute the majority of peripheral heterochromatin, which has been hypothesized to be the genome's 'bodyguard' against DNA damage from chemicals and radiation for almost four decades. The bodyguard protective function of peripheral heterochromatin in genome defense has been strengthened by the results from numerous recent studies, which are summarized in this review. These data have suggested that cells and/or organisms with a higher level of heterochromatin and more non-coding DNA sequences, including longer telomeric DNA and rDNAs, exhibit a lower frequency of DNA damage, higher radioresistance and longer lifespan after IR exposure. In addition, the majority of heterochromatin is peripherally located in the three-dimensional structure of genome organization. Therefore, the peripheral heterochromatin with non-coding DNA could play a protective role in genome defense against DNA damage from ionizing radiation by both absorbing the radicals from water radiolysis in the cytosol and reducing the energy of IR. However, the bodyguard protection by heterochromatin has been challenged by the observation that DNA damage is less frequently detected in peripheral heterochromatin than in euchromatin, which is inconsistent with the expectation and simulation results. Previous studies have also shown that the DNA damage in peripheral heterochromatin is rarely repaired and moves more quickly, broadly and outwardly to approach the nuclear pore complex (NPC). Additionally, it has been shown that extrachromosomal circular DNAs (eccDNAs) are formed in the nucleus, highly detectable in the cytoplasm (particularly under stress conditions) and shuttle between the nucleus and the cytoplasm. Based on these studies, this review speculates that the sites of DNA damage in peripheral heterochromatin could occur more

  20. Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein‐Coding Regions

    PubMed Central

    Lelieveld, Stefan H.; Spielmann, Malte; Mundlos, Stefan; Veltman, Joris A.

    2015-01-01

    For next‐generation sequencing technologies, sufficient base‐pair coverage is the foremost requirement for the reliable detection of genomic variants. We investigated whether whole‐genome sequencing (WGS) platforms offer improved coverage of coding regions compared with whole‐exome sequencing (WES) platforms, and compared single‐base coverage for a large set of exome and genome samples. We find that WES platforms have improved considerably in the last years, but at comparable sequencing depth, WGS outperforms WES in terms of covered coding regions. At higher sequencing depth (95x–160x), WES successfully captures 95% of the coding regions with a minimal coverage of 20x, compared with 98% for WGS at 87‐fold coverage. Three different assessments of sequence coverage bias showed consistent biases for WES but not for WGS. We found no clear differences for the technologies concerning their ability to achieve complete coverage of 2,759 clinically relevant genes. We show that WES performs comparable to WGS in terms of covered bases if sequenced at two to three times higher coverage. This does, however, go at the cost of substantially more sequencing biases in WES approaches. Our findings will guide laboratories to make an informed decision on which sequencing platform and coverage to choose. PMID:25973577

  1. Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein-Coding Regions.

    PubMed

    Lelieveld, Stefan H; Spielmann, Malte; Mundlos, Stefan; Veltman, Joris A; Gilissen, Christian

    2015-08-01

    For next-generation sequencing technologies, sufficient base-pair coverage is the foremost requirement for the reliable detection of genomic variants. We investigated whether whole-genome sequencing (WGS) platforms offer improved coverage of coding regions compared with whole-exome sequencing (WES) platforms, and compared single-base coverage for a large set of exome and genome samples. We find that WES platforms have improved considerably in the last years, but at comparable sequencing depth, WGS outperforms WES in terms of covered coding regions. At higher sequencing depth (95x-160x), WES successfully captures 95% of the coding regions with a minimal coverage of 20x, compared with 98% for WGS at 87-fold coverage. Three different assessments of sequence coverage bias showed consistent biases for WES but not for WGS. We found no clear differences for the technologies concerning their ability to achieve complete coverage of 2,759 clinically relevant genes. We show that WES performs comparable to WGS in terms of covered bases if sequenced at two to three times higher coverage. This does, however, go at the cost of substantially more sequencing biases in WES approaches. Our findings will guide laboratories to make an informed decision on which sequencing platform and coverage to choose.
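
    The statistic quoted above, the share of coding bases reaching a minimal coverage of 20x, reduces to a simple fraction once per-base depths are available. A minimal sketch, assuming the depths have already been extracted into a list of integers (the function name and input layout are hypothetical):

```python
def fraction_covered(depths, min_depth=20):
    """Fraction of positions whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)

# Usage: four coding positions, two of which reach 20x.
example = fraction_covered([0, 10, 20, 30])
```

    Comparing platforms fairly would also require matching the target regions and sequencing depth, which this sketch leaves to the caller.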

  2. Evaluation of correlation property of linear-frequency-modulated signals coded by maximum-length sequences

    NASA Astrophysics Data System (ADS)

    Yamanaka, Kota; Hirata, Shinnosuke; Hachiya, Hiroyuki

    2016-07-01

    Ultrasonic distance measurement for obstacles has been recently applied in automobiles. The pulse-echo method based on the transmission of an ultrasonic pulse and time-of-flight (TOF) determination of the reflected echo is one of the typical methods of ultrasonic distance measurement. Improvement of the signal-to-noise ratio (SNR) of the echo and the avoidance of crosstalk between ultrasonic sensors in the pulse-echo method are required in automotive measurement. The SNR of the reflected echo and the resolution of the TOF are improved by the employment of pulse compression using a maximum-length sequence (M-sequence), which is one of the binary pseudorandom sequences generated from a linear feedback shift register (LFSR). Crosstalk is avoided by using transmitted signals coded by different M-sequences generated from different LFSRs. In the case of lower-order M-sequences, however, the number of measurement channels corresponding to the pattern of the LFSR is not enough. In this paper, pulse compression using linear-frequency-modulated (LFM) signals coded by M-sequences has been proposed. The coding of LFM signals by the same M-sequence can produce different transmitted signals and increase the number of measurement channels. In the proposed method, however, the truncation noise in autocorrelation functions and the interference noise in cross-correlation functions degrade the SNRs of received echoes. Therefore, autocorrelation properties and cross-correlation properties in all patterns of combinations of coded LFM signals are evaluated.
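
    The M-sequences mentioned above come from a linear feedback shift register, and their sharp periodic autocorrelation is what makes pulse compression work. The sketch below generates one sequence and evaluates that autocorrelation; the tap choice (3, 2) for a 3-stage register is an illustrative assumption, and the LFM coding proposed in the paper is not modeled.

```python
def m_sequence(taps, order, seed=1):
    """Generate one period (2**order - 1 bits) from a Fibonacci LFSR.

    taps are 1-indexed register stages XORed together to form the feedback
    bit; the sequence is maximum-length only for suitable tap choices.
    """
    state = [(seed >> i) & 1 for i in range(order)]
    out = []
    for _ in range(2 ** order - 1):
        out.append(state[-1])            # output the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]  # shift and insert the feedback bit
    return out

def periodic_autocorrelation(bits, lag):
    """Periodic autocorrelation of the bipolar (+1/-1) version of bits."""
    bipolar = [1 - 2 * b for b in bits]
    n = len(bipolar)
    return sum(bipolar[i] * bipolar[(i + lag) % n] for i in range(n))

seq = m_sequence((3, 2), 3)
```

    For a maximum-length sequence of period N, the periodic autocorrelation is N at zero lag and -1 at every other lag, which is the property the test below checks for N = 7.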

  3. The nucleotide sequence of the human int-1 mammary oncogene; evolutionary conservation of coding and non-coding sequences.

    PubMed Central

    van Ooyen, A; Kwee, V; Nusse, R

    1985-01-01

    The mouse mammary tumor virus can induce mammary tumors in mice by proviral activation of an evolutionarily conserved cellular oncogene called int-1. Here we present the nucleotide sequence of the human homologue of int-1, and compare it with the mouse gene. Like the mouse gene, the human homologue contains a reading frame of 370 amino acids, with only four substitutions. The amino acid changes are all in the hydrophobic leader domain of the int-1 encoded protein, and do not significantly alter its hydropathic index. The conservation between the mouse and the human int-1 genes is not restricted to exons; extensive parts of the introns are also homologous. Thus, int-1 ranks among the most conserved genes known, a property shared with other oncogenes. PMID:2998762

  4. Coding-complete sequencing classifies parrot bornavirus 5 into a novel virus species.

    PubMed

    Marton, Szilvia; Bányai, Krisztián; Gál, János; Ihász, Katalin; Kugler, Renáta; Lengyel, György; Jakab, Ferenc; Bakonyi, Tamás; Farkas, Szilvia L

    2015-11-01

    In this study, we determined the sequence of the coding region of an avian bornavirus detected in a blue-and-yellow macaw (Ara ararauna) with pathological/histopathological changes characteristic of proventricular dilatation disease. The genomic organization of the macaw bornavirus is similar to that of other bornaviruses, and its nucleotide sequence is nearly identical to the available partial parrot bornavirus 5 (PaBV-5) sequences. Phylogenetic analysis showed that these strains formed a monophyletic group distinct from other mammalian and avian bornaviruses and in calculations performed with matrix protein coding sequences, the PaBV-5 and PaBV-6 genotypes formed a common cluster, suggesting that according to the recently accepted classification system for bornaviruses, these two genotypes may belong to a new species, provisionally named Psittaciform 2 bornavirus.

  5. Synthetic neomycin-kanamycin phosphotransferase, type II coding sequence for gene targeting in mammalian cells.

    PubMed

    Jin, Seung-Gi; Mann, Jeffrey R

    2005-07-01

    The bacterial neomycin-kanamycin phosphotransferase, type II enzyme is encoded by the neo gene and confers resistance to aminoglycoside drugs such as neomycin and kanamycin (bacterial selection) and G418 (eukaryotic cell selection). Although widely used in gene targeting in mouse embryonic stem cells, the neo coding sequence contains numerous cryptic splice sites and has a high CpG content. At least the former can cause unwanted effects in cis at the targeted locus. We describe a synthetic sequence, sneo, which encodes the same protein as that encoded by neo. This synthetic sequence has no predicted splice sites in either strand, low CpG content, and increased mammalian codon usage. In mouse embryonic stem cells sneo expressability is similar to neo. The use of sneo in gene targeting experiments should substantially reduce the probability of unwanted effects in cis due to splicing, and perhaps CpG methylation, within the coding sequence of the selectable marker.

  6. Severe accident source term characteristics for selected Peach Bottom sequences predicted by the MELCOR Code

    SciTech Connect

    Carbajo, J.J.

    1993-09-01

    The purpose of this report is to compare in-containment source terms developed for NUREG-1159, which used the Source Term Code Package (STCP), with those generated by MELCOR to identify significant differences. For this comparison, two short-term depressurized station blackout sequences (with a dry cavity and with a flooded cavity) and a Loss-of-Coolant Accident (LOCA) concurrent with complete loss of the Emergency Core Cooling System (ECCS) were analyzed for the Peach Bottom Atomic Power Station (a BWR-4 with a Mark I containment). The results indicate that for the sequences analyzed, the two codes predict similar total in-containment release fractions for each of the element groups. However, the MELCOR/CORBH Package predicts significantly longer times for vessel failure and reduced energy of the released material for the station blackout sequences (when compared to the STCP results). MELCOR also calculated smaller releases into the environment than STCP for the station blackout sequences.

  7. A distributed coding approach for stereo sequences in the tree structured Haar transform domain

    NASA Astrophysics Data System (ADS)

    Cancellaro, M.; Carli, M.; Neri, A.

    2009-02-01

    In this contribution, a novel method for distributed video coding for stereo sequences is proposed. The system encodes independently the left and right frames of the stereoscopic sequence. The decoder exploits the side information to achieve the best reconstruction of the correlated video streams. In particular, a syndrome coder approach based on a lifted Tree Structured Haar wavelet scheme has been adopted. The experimental results show the effectiveness of the proposed scheme.

  8. Complete coding sequences of European brown hare syndrome virus (EBHSV) strains isolated in 1982 in Sweden.

    PubMed

    Lopes, Ana M; Gavier-Widén, Dolores; Le Gall-Reculé, Ghislaine; Esteves, Pedro J; Abrantes, Joana

    2013-10-01

    European brown hare syndrome (EBHS) is characterised by high mortality of European brown hares (Lepus europaeus) and mountain hares (Lepus timidus). European brown hare syndrome virus (EBHSV) and the closely related rabbit haemorrhagic disease virus (RHDV) comprise the genus Lagovirus, family Caliciviridae. In contrast to RHDV, which is well studied, with more than 30 complete genome sequences available, the only complete genome sequence available for EBHSV was obtained from a strain isolated in 1989 in France. EBHS was originally diagnosed in Sweden in 1980. Here, we report the complete coding sequences of two EBHSV strains isolated from European brown hares that died with liver lesions characteristic of EBHS in Sweden in 1982. These sequences represent the oldest complete coding sequences of EBHSV isolated from the original area of virus diagnosis. The genomic organisation is similar to that of the published French sequence. Comparison with this sequence revealed several nucleotide substitutions, corresponding to 6 % divergence. At the amino acid level, the Swedish strains are 2 % different from the French strain. Most amino acid substitutions were located within the major capsid protein VP60, but when considering the amino acid sequence length of each protein, VP10 is the protein with the highest percentage of amino acid differences. The same result was obtained when Swedish strains were compared. This evolutionary pattern has not been described previously for members of the genus Lagovirus.

  9. Genetic characterization of three novel chicken parvovirus strains based on analysis of their coding sequences.

    PubMed

    Koo, Bon-Sang; Lee, Hae-Rim; Jeon, Eun-Ok; Han, Moo-Sung; Min, Kyeong-Cheol; Lee, Seung-Baek; Bae, Yeon-Ji; Cho, Sun-Hyung; Mo, Jong-Suk; Kwon, Hyuk Moo; Sung, Haan Woo; Kim, Jong-Nyeo; Mo, In-Pil

    2015-01-01

    Chicken parvovirus (ChPV) is one of the causative agents of viral enteritis. Recently, the genome of the ABU-P1 strain of ChPV was fully sequenced and determined to have a distinct genomic composition compared with that of vertebrate parvoviruses. However, no comparative sequence analysis of coding regions of ChPVs was possible because of the lack of other sequence information. In this study, we obtained the nucleotide sequences of all genomic coding regions of three ChPVs by polymerase chain reaction using 13 primer sets, and deduced the amino acid sequences from the nucleotide sequences. The non-structural protein 1 (NS1) gene of the three ChPVs showed 95.0 to 95.5% nucleotide sequence identity and 96.5 to 98.1% amino acid sequence identity to those of NS1 from the ABU-P1 strain, respectively, and even higher nucleotide and amino acid similarities to one another. The viral proteins (VP) gene was more divergent between the three ChPV Korean strains and ABU-P1, with 88.1 to 88.3% nucleotide identity and 93.0% amino acid identity. Analysis of the putative tertiary structure of the ChPV VP2 protein showed that variable regions with less than 80% nucleotide similarity between the three Korean strains and ABU-P1 occurred in large loops of the VP2 protein believed to be involved in antigenicity, pathogenicity, and tissue tropism in other parvoviruses. Based on our analysis of full-length coding sequences, we discovered greater variation in ChPV strains than reported previously, especially in partial regions of the VP2 protein.
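
    Identity percentages such as those reported above are computed over aligned positions. A minimal sketch, assuming the two sequences are already aligned to equal length and that columns containing a gap are skipped (one of several conventions in use, and not necessarily the one used in the study):

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns where both sequences agree.

    Columns where either sequence has a gap are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

    The same function works for nucleotide and amino acid alignments, since it only compares characters column by column.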

  10. Complete Coding Sequence of Zika Virus from a French Polynesia Outbreak in 2013

    PubMed Central

    Piorkowski, Géraldine; Charrel, Rémi N.; Boubis, Laetitia; Leparc-Goffart, Isabelle; de Lamballerie, Xavier

    2014-01-01

    Zika virus is an arthropod-borne Flavivirus member of the Spondweni serocomplex, transmitted by Aedes mosquitoes. We report here the complete coding sequence of a Zika virus strain belonging to the Asian lineage, isolated from an infected patient returning from French Polynesia, an epidemic area in 2013/2014. PMID:24903869

  11. Complete Coding Sequences of Six Toscana Virus Strains Isolated from Human Patients in France

    PubMed Central

    Leparc-Goffart, Isabelle; Piorkowski, Geraldine; Coutard, Bruno; Papageorgiou, Nicolas; De Lamballerie, Xavier

    2016-01-01

    Toscana virus (TOSV) is an arthropod-borne phlebovirus belonging to the Sandfly fever Naples virus species (genus Phlebovirus, family Bunyaviridae). Here, we report the complete coding sequences of six TOSV strains isolated from human patients having acquired the infection in southeastern France during a 12-year period. PMID:27231377

  12. Successful Recovery of Nuclear Protein-Coding Genes from Small Insects in Museums Using Illumina Sequencing

    PubMed Central

    Dasenko, Mark A.

    2015-01-01

    In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles

  13. Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Matsa, M. E.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of the GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05) which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10^(-10). We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
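
    Detrended fluctuation analysis, named above, integrates the sequence, removes a local linear trend in each box, and measures the remaining fluctuation F(n) as a function of box size n. The sketch below computes F(n) for a single box size; the purine/pyrimidine mapping to +1/-1 is an assumed encoding, the function names are hypothetical, and estimating a scaling exponent would require repeating this over many box sizes.

```python
import math

def purine_pyrimidine_steps(seq):
    """Map purines (A, G) to +1 and pyrimidines (C, T) to -1 (assumed encoding)."""
    return [1 if base in "AG" else -1 for base in seq.upper()]

def dfa_fluctuation(values, box):
    """RMS fluctuation of the integrated series after subtracting a
    least-squares line within each non-overlapping box of length box."""
    profile = []
    running = 0.0
    for v in values:
        running += v
        profile.append(running)
    n_boxes = len(profile) // box
    if box < 2 or n_boxes < 1:
        raise ValueError("box must be >= 2 and fit in the series at least once")
    xs = range(box)
    x_mean = (box - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    squared = 0.0
    for b in range(n_boxes):
        seg = profile[b * box:(b + 1) * box]
        y_mean = sum(seg) / box
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, seg)) / sxx
        squared += sum((y - (y_mean + slope * (x - x_mean))) ** 2
                       for x, y in zip(xs, seg))
    return math.sqrt(squared / (n_boxes * box))
```

    A series whose integrated profile is exactly linear has zero fluctuation, which gives a simple sanity check for the detrending step.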

  14. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Technical Reports Server (NTRS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
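
    The n-gram entropy measurement referred to above can be sketched with overlapping n-tuples and the Shannon formula. The redundancy shown here, one minus the entropy divided by its maximum of n*log2(4) bits, is one common definition and is an assumption of this sketch rather than necessarily the exact quantity used in the paper.

```python
import math
from collections import Counter

def ngram_entropy(seq, n):
    """Shannon entropy (bits) of the overlapping n-gram distribution of seq."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than n")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def ngram_redundancy(seq, n, alphabet_size=4):
    """Redundancy relative to the maximum entropy n * log2(alphabet_size)."""
    h_max = n * math.log2(alphabet_size)
    return 1.0 - ngram_entropy(seq, n) / h_max
```

    On short sequences the estimate is biased downward because many possible n-grams never appear, which is one of the intrinsic limitations such analyses have to account for.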

  15. Translational resistivity/conductivity of coding sequences during exponential growth of Escherichia coli.

    PubMed

    Takai, Kazuyuki

    2017-01-21

    Codon adaptation index (CAI) has been widely used for prediction of expression of recombinant genes in Escherichia coli and other organisms. However, CAI has no mechanistic basis that rationalizes its application to estimation of translational efficiency. Here, I propose a model based on which we could consider how codon usage is related to the level of expression during exponential growth of bacteria. In this model, translation of a gene is considered as an analog of electric current, and an analog of electric resistance corresponding to each gene is considered. "Translational resistance" is dependent on the steady-state concentration and the sequence of the mRNA species, and "translational resistivity" is dependent only on the mRNA sequence. The latter is the sum of two parts: one is the resistivity for the elongation reaction (coding sequence resistivity), and the other comes from all of the other steps of the decoding reaction. This electric circuit model clearly shows that some conditions should be met for codon composition of a coding sequence to correlate well with its expression level. On the other hand, I calculated relative frequency of each of the 61 sense codon triplets translated during exponential growth of E. coli from a proteomic dataset covering over 2600 proteins. A tentative method for estimating relative coding sequence resistivity based on the data is presented.
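
    For context, the codon adaptation index that the abstract contrasts with its resistivity model is conventionally the geometric mean of per-codon relative adaptiveness weights. The sketch below follows that conventional definition; the function names, the reference counts, and the synonymous families passed in are hypothetical inputs, and the electric circuit model of the paper is not implemented.

```python
import math

def relative_adaptiveness(ref_counts, synonymous_families):
    """Weight of each codon: its reference count divided by the count of the
    most used codon in the same synonymous family."""
    weights = {}
    for family in synonymous_families:
        top = max(ref_counts.get(codon, 0) for codon in family)
        if top == 0:
            continue  # family absent from the reference set
        for codon in family:
            weights[codon] = ref_counts.get(codon, 0) / top
    return weights

def cai(cds, weights):
    """Geometric mean of the weights of the codons in cds.

    Codons without a positive weight are skipped in this sketch.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    if not logs:
        raise ValueError("no codons with a usable weight")
    return math.exp(sum(logs) / len(logs))
```

    A sequence built only from the most used codon of each family scores 1.0, and any less-used codon pulls the geometric mean below that.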

  16. Key for protein coding sequences identification: computer analysis of codon strategy.

    PubMed Central

    Rodier, F; Gabarro-Arpa, J; Ehrlich, R; Reiss, C

    1982-01-01

    The signal qualifying an AUG or GUG as an initiator in mRNAs processed by E. coli ribosomes is not found to be a systematic, literal homology sequence. In contrast, stability analysis reveals that initiators always occur within nucleic acid domains of low stability, for which a high A/U content is observed. Since no aminoacid selection pressure can be detected at N-termini of the proteins, the A/U enrichment results from a biased usage of the code degeneracy. A computer analysis is presented which allows easy detection of the codon strategy. N-terminal codons carry rather systematically A or U in third position, which suggests a mechanism for translation initiation and helps to detect protein coding sequences in sequenced DNA. PMID:7038623
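
    The observation above, that N-terminal codons tend to carry A or U in the third position, suggests a simple statistic to compute on a candidate reading frame. A minimal sketch, assuming the sequence starts at the putative initiator and that the number of leading codons to inspect is chosen by the caller (both hypothetical choices, not the published analysis):

```python
def third_position_au_fraction(mrna, n_codons):
    """Fraction of the first n_codons complete codons ending in A or U.

    DNA input is accepted; T is treated as U.
    """
    seq = mrna.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, 3 * n_codons, 3)]
    codons = [c for c in codons if len(c) == 3]
    if not codons:
        raise ValueError("no complete codons to inspect")
    hits = sum(1 for c in codons if c[2] in "AU")
    return hits / len(codons)
```

    Comparing this fraction across the three frames of a sequenced region is one way such a bias could be used as a hint for locating coding sequences.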

  17. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

    SciTech Connect

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.; Lamson, Jacob S.; He, Jennifer; Hoover, Cindi A.; Blow, Matthew J.; Bristow, James; Butland, Gareth; Arkin, Adam P.; Deutschbauer, Adam

    2015-05-12

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes
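
    Mutant fitness from bar code counts is commonly expressed as a log2 ratio of abundance after growth to abundance at the start. The sketch below shows only that core idea; the pseudocount, the plain averaging over strains, and the function names are assumptions, and the normalization used in the published RB-TnSeq analysis is not reproduced.

```python
import math

def strain_fitness(count_start, count_end, pseudocount=1.0):
    """log2 change in bar code abundance for one mutant strain."""
    return math.log2((count_end + pseudocount) / (count_start + pseudocount))

def gene_fitness(strain_counts, pseudocount=1.0):
    """Unweighted mean of strain fitness over the strains with insertions
    in one gene; strain_counts is a list of (count_start, count_end)."""
    if not strain_counts:
        raise ValueError("need at least one strain")
    values = [strain_fitness(s, e, pseudocount) for s, e in strain_counts]
    return sum(values) / len(values)
```

    Negative values indicate strains that dropped in relative abundance under the tested condition, so a strongly negative gene fitness points to a gene that matters for growth there.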

  18. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

    DOE PAGES

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.; ...

    2015-05-12

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes. However, the current strategies for TnSeq are

  19. Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences

    PubMed Central

    Ivanov, Ivaylo P.; Firth, Andrew E.; Michel, Audrey M.; Atkins, John F.; Baranov, Pavel V.

    2011-01-01

    In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5′ cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized—both for increased coding capacity and potentially also for novel regulatory mechanisms—remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5′ untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data. PMID:21266472
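
    As an illustration of the near-cognate start codons named in the abstract above, the following Python sketch enumerates every codon that differs from AUG by a single nucleotide and scans a 5′ UTR for in-frame occurrences upstream of the annotated start. It is not the authors' phylogenetic method. The function names, the example UTR, and the rule of stopping at the first in-frame stop codon are assumptions made for illustration only.

```python
BASES = "ACGU"
STOP_CODONS = {"UAA", "UAG", "UGA"}


def near_cognate_codons(start="AUG"):
    """Return all codons that differ from `start` at exactly one position."""
    result = []
    for pos in range(3):
        for base in BASES:
            if base != start[pos]:
                result.append(start[:pos] + base + start[pos + 1:])
    return sorted(result)


def upstream_in_frame_candidates(utr5, candidates):
    """Hypothetical helper: list (index, codon) hits in a 5' UTR.

    Assumes `utr5` ends immediately before the annotated AUG, so the
    reading frame is anchored at the 3' end of the UTR. The walk moves
    upstream one codon at a time and stops at the first in-frame stop
    codon, since an extension cannot read through it.
    """
    hits = []
    for i in range(len(utr5) - 3, -1, -3):
        codon = utr5[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon in candidates:
            hits.append((i, codon))
    return hits


codons = near_cognate_codons()
example_utr = "UAACUGGCCACGAAA"  # made-up sequence, not from the paper
hits = upstream_in_frame_candidates(example_utr, set(codons))
```

    Under these assumptions, `codons` holds the nine single-nucleotide variants of AUG, which include the six codons named in the abstract (CUG, UUG, GUG, ACG, AUA and AUU), and `hits` lists candidate extension starts ordered from the annotated start outwards.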

  20. Large-scale coding sequence change underlies the evolution of postdevelopmental novelty in honey bees.

    PubMed

    Jasper, William Cameron; Linksvayer, Timothy A; Atallah, Joel; Friedman, Daniel; Chiu, Joanna C; Johnson, Brian R

    2015-02-01

    Whether coding or regulatory sequence change is more important to the evolution of phenotypic novelty is one of biology's major unresolved questions. The field of evo-devo has shown that in early development changes to regulatory regions are the dominant mode of genetic change, but whether this extends to the evolution of novel phenotypes in the adult organism is unclear. Here, we conduct ten RNA-Seq experiments across both novel and conserved tissues in the honey bee to determine to what extent postdevelopmental novelty is based on changes to the coding regions of genes. We make several discoveries. First, we show that with respect to novel physiological functions in the adult animal, positively selected tissue-specific genes of high expression underlie novelty by conferring specialized cellular functions. Such genes are often, but not always taxonomically restricted genes (TRGs). We further show that positively selected genes, whether TRGs or conserved genes, are the least connected genes within gene expression networks. Overall, this work suggests that the evo-devo paradigm is limited, and that the evolution of novelty, postdevelopment, follows additional rules. Specifically, evo-devo stresses that high network connectedness (repeated use of the same gene in many contexts) constrains coding sequence change as it would lead to negative pleiotropic effects. Here, we show that in the adult animal, the converse is true: Genes with low network connectedness (TRGs and tissue-specific conserved genes) underlie novel phenotypes by rapidly changing coding sequence to perform new-specialized functions.

  1. Error probability bounds for trellis coded modulation over sequence dependent channels

    NASA Astrophysics Data System (ADS)

    Oka, Ikuo; Biglieri, Ezio

    1989-04-01

    A technique for obtaining an upper bound to the error event probability of trellis-coded modulation in sequence-dependent channels is derived. The technique is based on the transfer function of a state diagram which has N + 1 nodes and whose branch labels are N x N error matrices. Some methods for simplifying the computation of bit error probability at the price of a looser bound are proposed. Numerical results show the applicability of the techniques presented here to trellis-coded 16-QAM with two-symbol intersymbol interference.

  2. Coding and decoding libraries of sequence-defined functional copolymers synthesized via photoligation

    NASA Astrophysics Data System (ADS)

    Zydziak, Nicolas; Konrad, Waldemar; Feist, Florian; Afonin, Sergii; Weidner, Steffen; Barner-Kowollik, Christopher

    2016-11-01

    Designing artificial macromolecules with absolute sequence order represents a considerable challenge. Here we report an advanced light-induced avenue to monodisperse sequence-defined functional linear macromolecules up to decamers via a unique photochemical approach. The versatility of the synthetic strategy--combining sequential and modular concepts--enables the synthesis of perfect macromolecules varying in chemical constitution and topology. Specific functions are placed at arbitrary positions along the chain via the successive addition of monomer units and blocks, leading to a library of functional homopolymers, alternating copolymers and block copolymers. The in-depth characterization of each sequence-defined chain confirms the precision nature of the macromolecules. Decoding of the functional information contained in the molecular structure is achieved via tandem mass spectrometry without recourse to their synthetic history, showing that the sequence information can be read. We submit that the presented photochemical strategy is a viable and advanced concept for coding individual monomer units along a macromolecular chain.

  3. SinEx DB: a database for single exon coding sequences in mammalian genomes.

    PubMed

    Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S

    2016-01-01

    Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33 and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) via an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl.

  4. SinEx DB: a database for single exon coding sequences in mammalian genomes

    PubMed Central

    Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F.; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S.

    2016-01-01

    Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as ‘single exon genes’ (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33 and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) via an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl PMID:27278816

  5. Incorporation of the influenza A virus NA segment into virions does not require cognate non-coding sequences

    PubMed Central

    Crescenzo-Chaigne, Bernadette; Barbezange, Cyril V. S.; Léandri, Stéphane; Roquin, Camille; Berthault, Camille; van der Werf, Sylvie

    2017-01-01

    For each influenza virus genome segment, the coding sequence is flanked by non-coding (NC) regions comprising shared, conserved sequences and specific, non-conserved sequences. The latter and adjacent parts of the coding sequence are involved in genome packaging, but the precise role of the non-conserved NC sequences is still unclear. The aim of this study is to better understand the role of the non-conserved non-coding sequences in the incorporation of the viral segments into virions. The NA-segment NC sequences were systematically replaced by those of the seven other segments. Recombinant viruses harbouring two segments with identical NC sequences were successfully rescued. Virus growth kinetics and serial passages were performed, and incorporation of the viral segments was tested by real-time RT-PCR. An initial virus growth deficiency correlated to a specific defect in NA segment incorporation. Upon serial passages, growth properties were restored. Sequencing revealed that the replacing 5′NC sequence length drove the type of mutations obtained. With sequences longer than the original, point mutations in the coding region with or without substitutions in the 3′NC region were detected. With shorter sequences, insertions were observed in the 5′NC region. Restoration of viral fitness was linked to restoration of the NA segment incorporation. PMID:28240311

  6. Coupled enhancer and coding sequence evolution of a homeobox gene shaped leaf diversity

    PubMed Central

    Vuolo, Francesco; Mentink, Remco A.; Hajheidari, Mohsen; Bailey, C. Donovan; Filatov, Dmitry A.; Tsiantis, Miltos

    2016-01-01

    Here we investigate mechanisms underlying the diversification of biological forms using crucifer leaf shape as an example. We show that evolution of an enhancer element in the homeobox gene REDUCED COMPLEXITY (RCO) altered leaf shape by changing gene expression from the distal leaf blade to its base. A single amino acid substitution evolved together with this regulatory change, which reduced RCO protein stability, preventing pleiotropic effects caused by its altered gene expression. We detected hallmarks of positive selection in these evolved regulatory and coding sequence variants and showed that modulating RCO activity can improve plant physiological performance. Therefore, interplay between enhancer and coding sequence evolution created a potentially adaptive path for morphological evolution. PMID:27852629

  7. Mutation analysis of the coding sequence of the MECP2 gene in infantile autism.

    PubMed

    Beyer, Kim S; Blasi, Francesca; Bacchelli, Elena; Klauck, Sabine M; Maestrini, Elena; Poustka, Annemarie

    2002-10-01

    Mutations in the coding region of the methyl-CpG-binding protein 2 (MECP2) gene cause Rett syndrome and have also been reported in a number of X-linked mental retardation syndromes. Furthermore, such mutations have recently been described in a few autistic patients. In this study, a large sample of individuals with autism was screened in order to elucidate systematically whether specific mutations in MECP2 play a role in autism. The mutation analysis of the coding sequence of the gene was performed by denaturing high-pressure liquid chromatography and direct sequencing. Taken together, 14 sequence variants were identified in 152 autistic patients from 134 German families and 50 unrelated patients from the International Molecular Genetic Study of Autism Consortium affected relative-pair sample. Eleven of these variants were excluded from having an aetiological role as they were either silent mutations, did not cosegregate with autism in the pedigrees of the patients or represented known polymorphisms. The relevance of the three remaining mutations towards the aetiology of autism could not be ruled out, although they were not localised within functional domains of MeCP2 and may be rare polymorphisms. Taking into account the large size of our sample, we conclude that mutations in the coding region of MECP2 do not play a major role in autism susceptibility. Therefore, infantile autism and Rett syndrome probably represent two distinct entities at the molecular genetic level.

  8. A direct sequence spread spectrum code acquisition circuit for wireless sensor networks

    NASA Astrophysics Data System (ADS)

    Ghaisari, Jafar; Ferdosi, Arash

    2011-06-01

    Narrow band (NB), spread spectrum (SS), and ultra wide band (UWB) are three physical layer bandwidth types used in wireless sensor networks (WSN). SS and UWB technologies have many advantages over NB, which make them preferable for WSN. Synchronisation of different nodes in a WSN is an important task that is necessary to improve cooperation and lifetime of nodes. Code acquisition is the main step of a node's time synchronisation. In this article, a pseudo noise code generator and a code acquisition circuit are proposed, designed and tested using the direct sequence SS technique. To investigate the properties of the designed circuits, simulations are carried out via Xilinx Foundation Series software in the real mode. The results demonstrate excellent performance of the proposed algorithms and circuits in all realistic conditions. The code acquisition circuit uses an adaptive testing window for the single-dwell serial search method. The code acquisition circuit is a clock-phase-free approach, thus the clock coherency step is cancelled. Moreover, the clock phase difference between transmitter and receiver nodes largely does not affect the acquisition and thus the synchronisation time.

  9. A novel DNA sequence similarity calculation based on simplified pulse-coupled neural network and Huffman coding

    NASA Astrophysics Data System (ADS)

    Jin, Xin; Nie, Rencan; Zhou, Dongming; Yao, Shaowen; Chen, Yanyan; Yu, Jiefu; Wang, Quan

    2016-11-01

    A novel method for the calculation of DNA sequence similarity is proposed based on simplified pulse-coupled neural network (S-PCNN) and Huffman coding. In this study, we propose a coding method based on Huffman coding, where the triplet code was used as a code bit to transform DNA sequence into numerical sequence. The proposed method uses the firing characters of S-PCNN neurons in DNA sequence to extract features. Besides, the proposed method can deal with different lengths of DNA sequences. First, according to the characteristics of S-PCNN and the DNA primary sequence, the latter is encoded using Huffman coding method, and then using the former, the oscillation time sequence (OTS) of the encoded DNA sequence is extracted. Simultaneously, relevant features are obtained, and finally the similarities or dissimilarities of the DNA sequences are determined by Euclidean distance. In order to verify the accuracy of this method, different data sets were used for testing. The experimental results show that the proposed method is effective.
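
    The Huffman encoding step over triplets described above can be illustrated with a short Python sketch. It covers only that step: it builds a Huffman code from the triplet frequencies of one sequence and omits the S-PCNN and Euclidean-distance stages. The abstract does not say how the authors turn the codes into a numerical sequence, so the non-overlapping triplet split, the function names, and the example sequence are all assumptions.

```python
import heapq
from collections import Counter


def triplets(seq):
    """Split a DNA sequence into non-overlapping triplets, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def huffman_codes(symbols):
    """Build a Huffman code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    # The unique integer tie-breaker keeps the dicts from being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1 and merge them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


example = "ATGATGATGCCCATGGGA"  # made-up sequence, not from the paper
units = triplets(example)
codes = huffman_codes(units)
encoded = "".join(codes[u] for u in units)
```

    In this example the most frequent triplet (ATG) receives the shortest code, which is the property a frequency-based encoding of the DNA primary sequence relies on.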

  10. The signal sequence coding region promotes nuclear export of mRNA.

    PubMed

    Palazzo, Alexander F; Springer, Michael; Shibata, Yoko; Lee, Chung-Sheng; Dias, Anusha P; Rapoport, Tom A

    2007-12-01

    In eukaryotic cells, most mRNAs are exported from the nucleus by the transcription export (TREX) complex, which is loaded onto mRNAs after their splicing and capping. We have studied in mammalian cells the nuclear export of mRNAs that code for secretory proteins, which are targeted to the endoplasmic reticulum membrane by hydrophobic signal sequences. The mRNAs were injected into the nucleus or synthesized from injected or transfected DNA, and their export was followed by fluorescent in situ hybridization. We made the surprising observation that the signal sequence coding region (SSCR) can serve as a nuclear export signal of an mRNA that lacks an intron or functional cap. Even the export of an intron-containing natural mRNA was enhanced by its SSCR. Like conventional export, the SSCR-dependent pathway required the factor TAP, but depletion of the TREX components had only moderate effects. The SSCR export signal appears to be characterized in vertebrates by a low content of adenines, as demonstrated by genome-wide sequence analysis and by the inhibitory effect of silent adenine mutations in SSCRs. The discovery of an SSCR-mediated pathway explains the previously noted amino acid bias in signal sequences and suggests a link between nuclear export and membrane targeting of mRNAs.

  11. Nucleosomal signatures impose nucleosome positioning in coding and noncoding sequences in the genome

    PubMed Central

    González, Sara; García, Alicia; Vázquez, Enrique; Serrano, Rebeca; Sánchez, Mar; Quintales, Luis; Antequera, Francisco

    2016-01-01

    In the yeast genome, a large proportion of nucleosomes occupy well-defined and stable positions. While the contribution of chromatin remodelers and DNA binding proteins to maintain this organization is well established, the relevance of the DNA sequence to nucleosome positioning in the genome remains controversial. Through quantitative analysis of nucleosome positioning, we show that sequence changes distort the nucleosomal pattern at the level of individual nucleosomes in three species of Schizosaccharomyces and in Saccharomyces cerevisiae. This effect is equally detected in transcribed and nontranscribed regions, suggesting the existence of sequence elements that contribute to positioning. To identify such elements, we incorporated information from nucleosomal signatures into artificial synthetic DNA molecules and found that they generated regular nucleosomal arrays indistinguishable from those of endogenous sequences. Strikingly, this information is species-specific and can be combined with coding information through the use of synonymous codons such that genes from one species can be engineered to adopt the nucleosomal organization of another. These findings open the possibility of designing coding and noncoding DNA molecules capable of directing their own nucleosomal organization. PMID:27662899

  12. Short pulse acquisition by low sampling rate with phase-coded sequence in lidar system

    NASA Astrophysics Data System (ADS)

    Wu, Long; Xu, Jiajia; Lv, Wentao; Yang, Xiaocheng

    2016-11-01

    The requirement of high range resolution results in impractical collection of every returned laser pulse due to the limited response speed of imaging detectors. This paper proposes a phase coded sequence acquisition method for signal preprocessing. The system employs an m-sequence with N bits for demonstration with the detector controlled to accumulate N+1 bits of the echo signals to deduce one single returned laser pulse. An indoor experiment achieved 2 μs resolution with the sampling period of 28 μs by employing a 15-bit m-sequence. This method shows the potential to improve the detection capabilities of narrow laser pulses with the detectors at a low frame rate, especially for the imaging lidar systems. Meanwhile, the lidar system is able to improve the range resolution with available detectors of restricted performance.
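
    For readers unfamiliar with m-sequences, the following Python sketch generates one period of a 15-bit m-sequence and checks its two-valued periodic autocorrelation, the property that lets a single pulse be recovered from accumulated coded echoes. The abstract does not state which generator the authors used; the recurrence a[n] = a[n-3] XOR a[n-4] (primitive polynomial x^4 + x + 1), the seed, and the function names are assumptions, and the sketch does not model the detector accumulation itself.

```python
def m_sequence_15():
    """One period of a 15-bit m-sequence from a[n] = a[n-3] XOR a[n-4]."""
    bits = [1, 0, 0, 0]  # any non-zero 4-bit seed gives a shifted copy
    while len(bits) < 15:
        bits.append(bits[-3] ^ bits[-4])
    return bits


def periodic_autocorrelation(bits, shift):
    """Periodic autocorrelation of the sequence mapped to +1/-1 chips."""
    chips = [1 - 2 * b for b in bits]  # 0 -> +1, 1 -> -1
    n = len(chips)
    return sum(chips[i] * chips[(i + shift) % n] for i in range(n))


seq = m_sequence_15()
peak = periodic_autocorrelation(seq, 0)
sidelobes = [periodic_autocorrelation(seq, s) for s in range(1, len(seq))]
```

    Here `peak` equals the sequence length N = 15 and every entry of `sidelobes` equals -1, so correlating against the code isolates a single shift.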

  13. Episodic sequence memory is supported by a theta-gamma phase code

    PubMed Central

    Heusser, Andrew C.; Poeppel, David; Ezzyat, Youssef; Davachi, Lila

    2016-01-01

    The meaning we derive from our experiences is not a simple static extraction of the elements, but is largely based on the order in which those elements occur. Models propose that sequence encoding is supported by interactions between high and low frequency oscillations, such that elements within an experience are represented by neural cell assemblies firing at higher frequencies (i.e. gamma) and sequential order is coded by the specific timing of firing with respect to a lower frequency oscillation (i.e. theta). During episodic sequence memory formation in humans, we provide evidence that items in different sequence positions exhibit relatively greater gamma power along distinct phases of a theta oscillation. Furthermore, this segregation is related to successful temporal order memory. These results provide compelling evidence that memory for order, a core component of an episodic memory, capitalizes on the ubiquitous physiological mechanism of theta-gamma phase-amplitude coupling. PMID:27571010

  14. MIMO Radar System for Respiratory Monitoring Using Tx and Rx Modulation with M-Sequence Codes

    NASA Astrophysics Data System (ADS)

    Miwa, Takashi; Ogiwara, Shun; Yamakoshi, Yoshiki

    The importance of respiratory monitoring systems during sleep has increased due to early diagnosis of sleep apnea syndrome (SAS) in the home. This paper presents a simple respiratory monitoring system suitable for home use having 3D ranging of targets. The range resolution and azimuth resolution are obtained by a stepped frequency transmitting signal and MIMO arrays with preferred pair M-sequence codes doubly modulating in transmission and reception, respectively. Due to the use of these codes, Gold sequence codes corresponding to all the antenna combinations are equivalently modulated in the receiver. The signal to interchannel interference ratio of the reconstructed image is evaluated by numerical simulations. The results of experiments on a developed prototype 3D-MIMO radar system show that this system can extract only the motion of respiration of a human subject 2 m apart from a metallic rotatable reflector. Moreover, it is found that this system can successfully measure the respiration information of sleeping human subjects for 96.6 percent of the whole measurement time except for instances of large posture change.

  15. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer

    PubMed Central

    Timofeeva, Maria N.; Kinnersley, Ben; Farrington, Susan M.; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J.; Harris, Sarah E.; Northwood, Emma L.; Barrett, Jennifer H.; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J.; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D. Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G.; Houlston, Richard S.

    2015-01-01

    Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10^-7), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts; rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10^-7); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10^-7 and OR = 1.09, P = 7.4 × 10^-8); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10^-9), all reaching exome-wide significance levels. Gene based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10^-6). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10^-4) and DNA mismatch repair genes (P = 6.1 × 10^-4) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC. PMID:26553438

  16. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer.

    PubMed

    Timofeeva, Maria N; Kinnersley, Ben; Farrington, Susan M; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J; Harris, Sarah E; Northwood, Emma L; Barrett, Jennifer H; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G; Houlston, Richard S

    2015-11-10

    Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10(-7)), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts; rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10(-7)); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10(-7) and OR = 1.09, P = 7.4 × 10(-8)); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10(-9)), all reaching exome-wide significance levels. Gene based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10(-6)). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10(-4)) and DNA mismatch repair genes (P = 6.1 × 10(-4)) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC.

  17. An improved and validated RNA HLA class I SBT approach for obtaining full length coding sequences.

    PubMed

    Gerritsen, K E H; Olieslagers, T I; Groeneweg, M; Voorter, C E M; Tilanus, M G J

    2014-11-01

    The functional relevance of human leukocyte antigen (HLA) class I allele polymorphism beyond exons 2 and 3 is difficult to address because more than 70% of the HLA class I alleles are defined by exons 2 and 3 sequences only. For routine application on clinical samples we improved and validated the HLA sequence-based typing (SBT) approach based on RNA templates, using either a single locus-specific or two overlapping group-specific polymerase chain reaction (PCR) amplifications, with three forward and three reverse sequencing reactions for full length sequencing. Locus-specific HLA typing with RNA SBT of a reference panel, representing the major antigen groups, showed identical results compared to DNA SBT typing. Alleles encountered with unknown exons in the IMGT/HLA database and three samples, two with Null and one with a Low expressed allele, have been addressed by the group-specific RNA SBT approach to obtain full length coding sequences. This RNA SBT approach has proven its value in our routine full length definition of alleles.

  18. Cloning and sequencing of the gene coding for the large subunit of methylamine dehydrogenase from Thiobacillus versutus.

    PubMed Central

    Huitema, F; van Beeumen, J; van Driessche, G; Duine, J A; Canters, G W

    1993-01-01

    The gene that codes for the alpha-subunit of methylamine dehydrogenase from Thiobacillus versutus, madA, was cloned and sequenced. It codes for a protein of 395 amino acids preceded by a leader sequence of 31 amino acids. The derived amino acid sequence was confirmed by partial amino acid sequencing. The start of the mature protein could not be determined by direct sequencing, since the N terminus appeared to be blocked. Instead, it was determined by electrospray mass spectrometry. Confirmation of the results was obtained by sequencing the N terminus after pyroglutamate aminopeptidase digestion. The sequence is homologous to the Paracoccus denitrificans nucleotide sequence. A second open reading frame, called open reading frame 3, is located immediately downstream of madA. PMID:8407797

  19. Integrated genome analysis suggests that most conserved non-coding sequences are regulatory factor binding sites

    PubMed Central

    Hemberg, Martin; Gray, Jesse M.; Cloonan, Nicole; Kuersten, Scott; Grimmond, Sean; Greenberg, Michael E.; Kreiman, Gabriel

    2012-01-01

    More than 98% of a typical vertebrate genome does not code for proteins. Although non-coding regions are sprinkled with short (<200 bp) islands of evolutionarily conserved sequences, the function of most of these unannotated conserved islands remains unknown. One possibility is that unannotated conserved islands could encode non-coding RNAs (ncRNAs); alternatively, unannotated conserved islands could serve as promoter-distal regulatory factor binding sites (RFBSs) like enhancers. Here we assess these possibilities by comparing unannotated conserved islands in the human and mouse genomes to transcribed regions and to RFBSs, relying on a detailed case study of one human and one mouse cell type. We define transcribed regions by applying a novel transcript-calling algorithm to RNA-Seq data obtained from total cellular RNA, and we define RFBSs using ChIP-Seq and DNAse-hypersensitivity assays. We find that unannotated conserved islands are four times more likely to coincide with RFBSs than with unannotated ncRNAs. Thousands of conserved RFBSs can be categorized as insulators based on the presence of CTCF or as enhancers based on the presence of p300/CBP and H3K4me1. While many unannotated conserved RFBSs are transcriptionally active to some extent, the transcripts produced tend to be unspliced, non-polyadenylated and expressed at levels 10 to 100-fold lower than annotated coding or ncRNAs. Extending these findings across multiple cell types and tissues, we propose that most conserved non-coding genomic DNA in vertebrate genomes corresponds to promoter-distal regulatory elements. PMID:22684627

  20. RT-PCR amplification of the complete NF1 coding sequence

    SciTech Connect

    Ming Hong Shen; Meena Upadhyaya

    1994-09-01

    Neurofibromatosis type 1 (NF1) is a common autosomal dominant disorder. The NF1 gene is a large gene, 350 kb in size, with at least 51 exons. It has proved hard to detect mutations in the gene by examining genomic DNA due to the high mutation rate and the large size of the gene. Since the cloning of the gene, only 45 causative mutations have been reported from over 500 unrelated NF1 patients screened. The coding sequence of the NF1 gene is approximately 3% of the genomic sequence; it will therefore be easier to search for unknown mutations by the study of mRNA. We describe a simple RT-PCR-based strategy to amplify the total coding sequence of the NF1 transcript from peripheral blood lymphocyte RNA. This strategy involves an initial cDNA synthesis step utilizing a set of random hexamers, followed by two consecutive rounds of PCR amplification. The first round of amplification was performed using four NF1-specific nested primer pairs. This amplification allows the construction of overlapping fragments which span an 8694 bp cDNA sequence of the gene. For mutation analysis, the amplified products or their digests were subjected to electrophoresis on Hydrolink gels. Two disease-causing mutations, a 3 bp deletion in exon 17 and a 10 bp deletion in exon 44, originally detected in the genomic DNA from two unrelated NF1 patients, have been confirmed at the RNA level. The combination of this strategy with other established techniques such as SSCP, chemical cleavage of mismatch, protein truncation test (PTT) and quantitative PCR should greatly facilitate mutation and expression analyses in the NF1 gene.
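
    The "approximately 3%" figure in the abstract above can be checked with a short calculation. The sketch below assumes that the 8694 bp cDNA span stands in for the coding sequence length and that the gene is exactly 350,000 bp; both are simplifications of the rounded values in the abstract, and the function name is hypothetical.

```python
def coding_fraction(coding_bp: int, genomic_bp: int) -> float:
    """Return the coding length as a fraction of the genomic length."""
    if genomic_bp <= 0:
        raise ValueError("genomic_bp must be positive")
    return coding_bp / genomic_bp


# Values taken from the abstract; treating them as exact is an assumption.
NF1_CDNA_BP = 8694       # cDNA span covered by the overlapping fragments
NF1_GENE_BP = 350_000    # stated gene size of 350 kb

fraction = coding_fraction(NF1_CDNA_BP, NF1_GENE_BP)
print(f"{fraction:.1%}")  # prints 2.5%
```

    Under these assumptions the ratio comes out near 2.5%, which is of the same order as the roughly 3% quoted in the abstract.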

  1. Whole-Exome Sequencing Identifies Rare and Low-Frequency Coding Variants Associated with LDL Cholesterol

    PubMed Central

    Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. 
Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. 
Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter; Pankow, James S.; Pankratz, Nathan D.; Paul, Shom; Perez, Marco; Person, Sharina D.; Polak, Joseph; Post, Wendy S.; Psaty, Bruce M.; Quinlan, Aaron R.; Raffel, Leslie J.; Ramachandran, Vasan S.; Reiner, Alexander P.; Rice, Kenneth; Rotter, Jerome I.; Sanders, Jill P.; Schreiner, Pamela; Seshadri, Sudha; Shea, Steve; Sidney, Stephen; Silverstein, Kevin; Smith, Nicholas L.; Sotoodehnia, Nona; Srinivasan, Asoke; Taylor, Herman A.; Taylor, Kent; Thomas, Fridtjof; Tracy, Russell P.; Tsai, Michael Y.; Volcik, Kelly A.; Wassel, Chrstina L.; Watson, Karol; Wei, Gina; White, Wendy; Wiggins, Kerri L.; Wilk, Jemma B.; Williams, O. 
Dale; Wilson, Gregory; Wilson, James G.; Wolf, Phillip; Zakai, Neil A.; Hardy, John; Meschia, James F.; Nalls, Michael; Singleton, Andrew; Worrall, Brad; Bamshad, Michael J.; Barnes, Kathleen C.; Abdulhamid, Ibrahim; Accurso, Frank; Anbar, Ran; Beaty, Terri; Bigham, Abigail; Black, Phillip; Bleecker, Eugene; Buckingham, Kati; Cairns, Anne Marie; Caplan, Daniel; Chatfield, Barbara; Chidekel, Aaron; Cho, Michael; Christiani, David C.; Crapo, James D.; Crouch, Julia; Daley, Denise; Dang, Anthony; Dang, Hong; De Paula, Alicia; DeCelie-Germana, Joan; Drumm, Allen DozorMitch; Dyson, Maynard; Emerson, Julia; Emond, Mary J.; Ferkol, Thomas; Fink, Robert; Foster, Cassandra; Froh, Deborah; Gao, Li; Gershan, William; Gibson, Ronald L.; Godwin, Elizabeth; Gondor, Magdalen; Gutierrez, Hector; Hansel, Nadia N.; Hassoun, Paul M.; Hiatt, Peter; Hokanson, John E.; Howenstine, Michelle; Hummer, Laura K.; Kanga, Jamshed; Kim, Yoonhee; Knowles, Michael R.; Konstan, Michael; Lahiri, Thomas; Laird, Nan; Lange, Christoph; Lin, Lin; Lin, Xihong; Louie, Tin L.; Lynch, David; Make, Barry; Martin, Thomas R.; Mathai, Steve C.; Mathias, Rasika A.; McNamara, John; McNamara, Sharon; Meyers, Deborah; Millard, Susan; Mogayzel, Peter; Moss, Richard; Murray, Tanda; Nielson, Dennis; Noyes, Blakeslee; O’Neal, Wanda; Orenstein, David; O’Sullivan, Brian; Pace, Rhonda; Pare, Peter; Parker, H. 
Worth; Passero, Mary Ann; Perkett, Elizabeth; Prestridge, Adrienne; Rafaels, Nicholas M.; Ramsey, Bonnie; Regan, Elizabeth; Ren, Clement; Retsch-Bogart, George; Rock, Michael; Rosen, Antony; Rosenfeld, Margaret; Ruczinski, Ingo; Sanford, Andrew; Schaeffer, David; Sell, Cindy; Sheehan, Daniel; Silverman, Edwin K.; Sin, Don; Spencer, Terry; Stonebraker, Jackie; Tabor, Holly K.; Varlotta, Laurie; Vergara, Candelaria I.; Weiss, Robert; Wigley, Fred; Wise, Robert A.; Wright, Fred A.; Wurfel, Mark M.; Zanni, Robert; Zou, Fei; Nickerson, Deborah A.; Rieder, Mark J.; Green, Phil; Shendure, Jay; Akey, Joshua M.; Bustamante, Carlos D.; Crosslin, David R.; Eichler, Evan E.; Fox, P. Keolu; Fu, Wenqing; Gordon, Adam; Gravel, Simon; Jarvik, Gail P.; Johnsen, Jill M.; Kan, Mengyuan; Kenny, Eimear E.; Kidd, Jeffrey M.; Lara-Garduno, Fremiet; Leal, Suzanne M.; Liu, Dajiang J.; McGee, Sean; O’Connor, Timothy D.; Paeper, Bryan; Robertson, Peggy D.; Smith, Joshua D.; Staples, Jeffrey C.; Tennessen, Jacob A.; Turner, Emily H.; Wang, Gao; Yi, Qian; Jackson, Rebecca; Peters, Ulrike; Carlson, Christopher S.; Anderson, Garnet; Anton-Culver, Hoda; Assimes, Themistocles L.; Auer, Paul L.; Beresford, Shirley; Bizon, Chris; Black, Henry; Brunner, Robert; Brzyski, Robert; Burwen, Dale; Caan, Bette; Carty, Cara L.; Chlebowski, Rowan; Cummings, Steven; Curb, J. 
David; Eaton, Charles B.; Ford, Leslie; Franceschini, Nora; Fullerton, Stephanie M.; Gass, Margery; Geller, Nancy; Heiss, Gerardo; Howard, Barbara V.; Hsu, Li; Hutter, Carolyn M.; Ioannidis, John; Jiao, Shuo; Johnson, Karen C.; Kooperberg, Charles; Kuller, Lewis; LaCroix, Andrea; Lakshminarayan, Kamakshi; Lane, Dorothy; Lasser, Norman; LeBlanc, Erin; Li, Kuo-Ping; Limacher, Marian; Lin, Dan-Yu; Logsdon, Benjamin A.; Ludlam, Shari; Manson, JoAnn E.; Margolis, Karen; Martin, Lisa; McGowan, Joan; Monda, Keri L.; Kotchen, Jane Morley; Nathan, Lauren; Ockene, Judith; O’Sullivan, Mary Jo; Phillips, Lawrence S.; Prentice, Ross L.; Robbins, John; Robinson, Jennifer G.; Rossouw, Jacques E.; Sangi-Haghpeykar, Haleh; Sarto, Gloria E.; Shumaker, Sally; Simon, Michael S.; Stefanick, Marcia L.; Stein, Evan; Tang, Hua; Taylor, Kira C.; Thomson, Cynthia A.; Thornton, Timothy A.; Van Horn, Linda; Vitolins, Mara; Wactawski-Wende, Jean; Wallace, Robert; Wassertheil-Smoller, Sylvia; Zeng, Donglin; Applebaum-Bowden, Deborah; Feolo, Michael; Gan, Weiniu; Paltoo, Dina N.; Sholinsky, Phyliss; Sturcke, Anne

    2014-01-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775

  2. Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol.

    PubMed

    Lange, Leslie A; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M; Smith, Joshua D; Turner, Emily H; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-Ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A; Holmen, Oddgeir L; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C; Correa, Adolfo; Griswold, Michael E; Jakobsdottir, Johanna; Smith, Albert V; Schreiner, Pamela J; Feitosa, Mary F; Zhang, Qunyuan; Huffman, Jennifer E; Crosby, Jacy; Wassel, Christina L; Do, Ron; Franceschini, Nora; Martin, Lisa W; Robinson, Jennifer G; Assimes, Themistocles L; Crosslin, David R; Rosenthal, Elisabeth A; Tsai, Michael; Rieder, Mark J; Farlow, Deborah N; Folsom, Aaron R; Lumley, Thomas; Fox, Ervin R; Carlson, Christopher S; Peters, Ulrike; Jackson, Rebecca D; van Duijn, Cornelia M; Uitterlinden, André G; Levy, Daniel; Rotter, Jerome I; Taylor, Herman A; Gudnason, Vilmundur; Siscovick, David S; Fornage, Myriam; Borecki, Ingrid B; Hayward, Caroline; Rudan, Igor; Chen, Y Eugene; Bottinger, Erwin P; Loos, Ruth J F; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M; Gabriel, Stacey B; O'Donnell, Christopher J; Post, Wendy S; North, Kari E; Reiner, Alexander P; Boerwinkle, Eric; Psaty, Bruce M; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P; Cupples, L Adrienne; Kooperberg, Charles; Wilson, James G; Nickerson, Deborah A; Abecasis, Goncalo R; Rich, Stephen S; Tracy, Russell P; Willer, Cristen J

    2014-02-06

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments.

  3. Coding and decoding libraries of sequence-defined functional copolymers synthesized via photoligation

    PubMed Central

    Zydziak, Nicolas; Konrad, Waldemar; Feist, Florian; Afonin, Sergii; Weidner, Steffen; Barner-Kowollik, Christopher

    2016-01-01

    Designing artificial macromolecules with absolute sequence order represents a considerable challenge. Here we report an advanced light-induced avenue to monodisperse sequence-defined functional linear macromolecules up to decamers via a unique photochemical approach. The versatility of the synthetic strategy—combining sequential and modular concepts—enables the synthesis of perfect macromolecules varying in chemical constitution and topology. Specific functions are placed at arbitrary positions along the chain via the successive addition of monomer units and blocks, leading to a library of functional homopolymers, alternating copolymers and block copolymers. The in-depth characterization of each sequence-defined chain confirms the precision nature of the macromolecules. Decoding of the functional information contained in the molecular structure is achieved via tandem mass spectrometry without recourse to their synthetic history, showing that the sequence information can be read. We submit that the presented photochemical strategy is a viable and advanced concept for coding individual monomer units along a macromolecular chain. PMID:27901024

  4. A probabilistic coding based quantum genetic algorithm for multiple sequence alignment.

    PubMed

    Huo, Hongwei; Xie, Qiaoluan; Shen, Xubang; Stojkovic, Vojislav

    2008-01-01

    This paper presents an original Quantum Genetic algorithm for Multiple sequence ALIGNment (QGMALIGN) that combines a genetic algorithm and a quantum algorithm. A quantum probabilistic coding is designed for representing the multiple sequence alignment. A quantum rotation gate is used as a mutation operator to guide the quantum state evolution. Six genetic operators are designed on the coding basis to improve the solution during the evolutionary process. The features of implicit parallelism and state superposition in quantum mechanics and the global search capability of the genetic algorithm are exploited to achieve efficient computation. A set of well-known test cases from BAliBASE2.0 is used as a reference to evaluate the efficiency of the QGMALIGN optimization. The QGMALIGN results have been compared with those of the most popular methods (CLUSTALX, SAGA, DIALIGN, SB_PIMA, and QGMALIGN). The results show that QGMALIGN performs well on the biological data presented. The addition of genetic operators to the quantum algorithm lowers the overall running-time cost.
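
    As an illustration of the rotation-gate idea mentioned in the abstract above, the following is a minimal sketch of the generic rotation update used in quantum-inspired genetic algorithms. It is not taken from the QGMALIGN paper: the abstract does not state how the alignment is encoded or how the rotation angle is chosen, so the function name, the starting amplitudes and the angle below are all assumptions.

```python
import math
from typing import Tuple


def rotate_qubit(alpha: float, beta: float, theta: float) -> Tuple[float, float]:
    """Apply the 2x2 rotation gate [[cos, -sin], [sin, cos]] to (alpha, beta).

    alpha**2 is read as the probability of observing 0 and beta**2 as the
    probability of observing 1; a rotation keeps alpha**2 + beta**2 == 1.
    """
    new_alpha = math.cos(theta) * alpha - math.sin(theta) * beta
    new_beta = math.sin(theta) * alpha + math.cos(theta) * beta
    return new_alpha, new_beta


# Start from an unbiased qubit (equal chance of 0 and 1).
alpha = beta = 1.0 / math.sqrt(2.0)

# A small positive angle (hypothetical value) nudges the qubit towards 1.
new_alpha, new_beta = rotate_qubit(alpha, beta, 0.05 * math.pi)
prob_one = new_beta ** 2
print(f"P(1) after rotation: {prob_one:.3f}")
```

    In a quantum-inspired genetic algorithm the sign and size of the angle would typically depend on how the current candidate compares with the best solution found so far; that selection rule is omitted here.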

  5. Adaptive three-dimensional motion-compensated wavelet transform for image sequence coding

    NASA Astrophysics Data System (ADS)

    Leduc, Jean-Pierre

    1994-09-01

    This paper describes a 3D spatio-temporal coding algorithm for the bit-rate compression of digital-image sequences. The coding scheme is based on three specific features, namely a motion representation with a four-parameter affine model, a motion-adapted temporal wavelet decomposition along the motion trajectories and a signal-adapted spatial wavelet transform. The motion estimation is performed on the basis of four-parameter affine transformation models, also called similitudes. This transformation takes into account translations, rotations and scalings. The temporal wavelet filter bank exploits bi-orthogonal linear-phase dyadic decompositions. The 2D spatial decomposition is based on dyadic signal-adaptive filter banks with either para-unitary or bi-orthogonal bases. The adaptive filtering is carried out according to a performance criterion to be optimized under constraints in order to eventually maximize the compression ratio at the expense of graceful degradations of the subjective image quality. The major principles of the present technique are, in the analysis process, to extract and to separate the motion contained in the sequences from the spatio-temporal redundancy and, in the compression process, to take into account the rate-distortion function on the basis of the spatio-temporal psycho-visual properties to achieve the most graceful degradations. To complete this description of the coding scheme, the compression procedure is composed of scalar quantizers, which exploit the spatio-temporal 3D psycho-visual properties of the Human Visual System, and of entropy coders, which finalize the bit-rate compression.
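
    The four-parameter motion model named in the abstract above (translation, rotation and scaling) can be sketched as a plain 2D similarity transform. This is only the standard textbook form of such a transform, not the paper's estimator; the parameterization by scale, angle and two offsets, and the function name, are assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def apply_similitude(point: Point, scale: float, theta: float,
                     tx: float, ty: float) -> Point:
    """Map (x, y) through a uniform scaling, a rotation and a translation.

    The four parameters are scale, theta (radians), tx and ty.
    """
    x, y = point
    a = scale * math.cos(theta)
    b = scale * math.sin(theta)
    return (a * x - b * y + tx, b * x + a * y + ty)


# Rotate (1, 0) by 90 degrees, double its length, then shift by (1, 1).
moved = apply_similitude((1.0, 0.0), scale=2.0, theta=math.pi / 2,
                         tx=1.0, ty=1.0)
print(moved)  # approximately (1.0, 3.0), up to floating-point rounding
```

    A motion-compensated coder would estimate these four values per region between frames; that estimation step is not shown.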

  6. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    SciTech Connect

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng; Kurz, Thorsten; Dubchak, Inna; Frazer, Kelly A.; Ober, Carole

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influences asthma and atopic phenotypes in diverse populations.

  7. Code-Switching to Know a TL Equivalent of an L1 Word: Request-Provision-Acknowledgement (RPA) Sequence

    ERIC Educational Resources Information Center

    Lucero, Edgar

    2011-01-01

    This article focuses on the learner's use of Code-switching to learn the TL (Target Language) equivalent of an L1 word. The interactional pattern that this situation creates defines the Request-Provision-Acknowledgement (RPA) sequence. The article explains each of the turns of the sequence under the combination of the Ethnomethodological…

  8. Two lamprey Hedgehog genes share non-coding regulatory sequences and expression patterns with gnathostome Hedgehogs.

    PubMed

    Kano, Shungo; Xiao, Jin-Hua; Osório, Joana; Ekker, Marc; Hadzhiev, Yavor; Müller, Ferenc; Casane, Didier; Magdelenat, Ghislaine; Rétaux, Sylvie

    2010-10-13

    Hedgehog (Hh) genes play major roles in animal development and studies of their evolution, expression and function point to major differences among chordates. Here we focused on Hh genes in lampreys in order to characterize the evolution of Hh signalling at the emergence of vertebrates. Screening of a cosmid library of the river lamprey Lampetra fluviatilis and searching the preliminary genome assembly of the sea lamprey Petromyzon marinus indicate that lampreys have two Hh genes, named Hha and Hhb. Phylogenetic analyses suggest that Hha and Hhb are lamprey-specific paralogs closely related to Sonic/Indian Hh genes. Expression analysis indicates that Hha and Hhb are expressed in a Sonic Hh-like pattern. The two transcripts are expressed in largely overlapping but not identical domains in the lamprey embryonic brain, including a newly-described expression domain in the nasohypophyseal placode. Global alignments of genomic sequences and local alignment with known gnathostome regulatory motifs show that lamprey Hhs share conserved non-coding elements (CNE) with gnathostome Hhs albeit with sequences that have significantly diverged and dispersed. Functional assays using zebrafish embryos demonstrate gnathostome-like midline enhancer activity for CNEs contained in intron2. We conclude that lamprey Hh genes are gnathostome Shh-like in terms of expression and regulation. In addition, they show some lamprey-specific features, including duplication and structural (but not functional) changes in the intronic/regulatory sequences.

  9. A molecular code dictates sequence-specific DNA recognition by homeodomains.

    PubMed Central

    Damante, G; Pellizzari, L; Esposito, G; Fogolari, F; Viglino, P; Fabbro, D; Tell, G; Formisano, S; Di Lauro, R

    1996-01-01

    Most homeodomains bind to DNA sequences containing the motif 5'-TAAT-3'. The homeodomain of thyroid transcription factor 1 (TTF-1HD) binds to sequences containing a 5'-CAAG-3' core motif, delineating a new mechanism for differential DNA recognition by homeodomains. We investigated the molecular basis of the DNA binding specificity of TTF-1HD by both structural and functional approaches. As already suggested by the three-dimensional structure of TTF-1HD, the DNA binding specificities of the TTF-1, Antennapedia and Engrailed homeodomains, either wild-type or mutants, indicated that the amino acid residue in position 54 is involved in the recognition of the nucleotide at the 3' end of the core motif 5'-NAAN-3'. The nucleotide at the 5' position of this core sequence is recognized by the amino acids located in positions 6, 7 and 8 of the TTF-1 and Antennapedia homeodomains. These data, together with previous suggestions on the role of amino acids in position 50, indicate that the DNA binding specificity of homeodomains can be determined by a combinatorial molecular code. We also show that some specific combinations of the key amino acid residues involved in DNA recognition do not follow a simple, additive rule. PMID:8890172

  10. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison.

    PubMed

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-07-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features.

  11. A quantum-inspired genetic algorithm based on probabilistic coding for multiple sequence alignment.

    PubMed

    Huo, Hong-Wei; Stojkovic, Vojislav; Xie, Qiao-Luan

    2010-02-01

    Quantum parallelism arises from the ability of a quantum memory register to exist in a superposition of base states. Since the number of possible base states is 2^n, where n is the number of qubits in the quantum memory register, one operation on a quantum computer performs what an exponential number of operations on a classical computer performs. The power of quantum algorithms comes from taking advantage of quantum parallelism. Quantum algorithms are exponentially faster than classical algorithms. Genetic optimization algorithms are stochastic search algorithms which are used to search large, nonlinear spaces where expert knowledge is lacking or difficult to encode. QGMALIGN, a probabilistic-coding-based quantum-inspired genetic algorithm for multiple sequence alignment, is presented. A quantum rotation gate is used as a mutation operator to guide the quantum state evolution. Six genetic operators are designed on the coding basis to improve the solution during the evolutionary process. The experimental results show that QGMALIGN can compete with the popular methods, such as CLUSTALX and SAGA, and performs well on the biological data presented. Moreover, the addition of genetic operators to the quantum-inspired algorithm lowers the overall running-time cost.

  12. EzEditor: a versatile sequence alignment editor for both rRNA- and protein-coding genes.

    PubMed

    Jeon, Yoon-Seong; Lee, Kihyun; Park, Sang-Cheol; Kim, Bong-Soo; Cho, Yong-Joon; Ha, Sung-Min; Chun, Jongsik

    2014-02-01

    EzEditor is a Java-based molecular sequence editor allowing manipulation of both DNA and protein sequence alignments for phylogenetic analysis. It has multiple features optimized to connect initial computer-generated multiple alignment and subsequent phylogenetic analysis by providing manual editing with reference to biological information specific to the genes under consideration. It provides various functionalities for editing rRNA alignments using secondary structure information. In addition, it supports simultaneous editing of both DNA sequences and their translated protein sequences for protein-coding genes. EzEditor is, to our knowledge, the first sequence editing software designed for both rRNA- and protein-coding genes with the visualization of biologically relevant information and should be useful in molecular phylogenetic studies. EzEditor is based on Java, can be run on all major computer operating systems and is freely available from http://sw.ezbiocloud.net/ezeditor/.

  13. Natural selection on coding and noncoding DNA sequences is associated with virulence genes in a plant pathogenic fungus.

    PubMed

    Rech, Gabriel E; Sanz-Martín, José M; Anisimova, Maria; Sukno, Serenella A; Thon, Michael R

    2014-09-04

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5' untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen.

  14. Real Time PCR to detect hazelnut allergen coding sequences in processed foods.

    PubMed

    Iniesto, Elisa; Jiménez, Ana; Prieto, Nuria; Cabanillas, Beatriz; Burbano, Carmen; Pedrosa, Mercedes M; Rodríguez, Julia; Muzquiz, Mercedes; Crespo, Jesús F; Cuadrado, Carmen; Linacero, Rosario

    2013-06-01

    A quantitative real-time PCR (RT-PCR) method, employing novel primer sets designed on Cor a 9, Cor a 11 and Cor a 13 allergen-coding sequences, has been set up and validated. Its specificity, sensitivity and applicability have been compared. The effect of processing on detectability of these hazelnut targets in complex food matrices was also studied. The DNA extraction method based on CTAB-phenol-chloroform was the best for hazelnut. RT-PCR using primers for Cor a 9, 11 and 13 allowed a specific and accurate amplification of these sequences. The limit of detection was 1 ppm of raw hazelnut. The method sensitivity and robustness were confirmed with spiked samples. Thermal treatments (roasting and autoclaving) reduced the yield and amplifiability of hazelnut DNA; however, high hydrostatic pressure did not affect them. Compared with an ELISA assay, this RT-PCR showed higher sensitivity in detecting hazelnut traces in commercial foodstuffs. The RT-PCR method described is the most sensitive of those reported for the detection of hazelnut traces in processed foods.

  15. Detection by real time PCR of walnut allergen coding sequences in processed foods.

    PubMed

    Linacero, Rosario; Ballesteros, Isabel; Sanchiz, Africa; Prieto, Nuria; Iniesto, Elisa; Martinez, Yolanda; Pedrosa, Mercedes M; Muzquiz, Mercedes; Cabanillas, Beatriz; Rovira, Mercè; Burbano, Carmen; Cuadrado, Carmen

    2016-07-01

    A quantitative real-time PCR (RT-PCR) method, employing novel primer sets designed on Jug r 1, Jug r 3, and Jug r 4 allergen-coding sequences, was set up and validated. Its specificity, sensitivity, and applicability were evaluated. The DNA extraction method based on CTAB-phenol-chloroform was best for walnut. RT-PCR allowed a specific and accurate amplification of the allergen sequences, and the limit of detection was 2.5 pg of walnut DNA. The method sensitivity and robustness were confirmed with spiked samples, and Jug r 3 primers detected up to 100 mg/kg of raw walnut (LOD 0.01%, LOQ 0.05%). Thermal treatment combined with pressure (autoclaving) reduced yield and amplification (integrity and quality) of walnut DNA. High hydrostatic pressure (HHP) did not produce any effect on the walnut DNA amplification. This RT-PCR method showed greater sensitivity and reliability in the detection of walnut traces in commercial foodstuffs compared with ELISA assays.

  16. Recombination regulator PRDM9 influences the instability of its own coding sequence in humans.

    PubMed

    Jeffreys, Alec J; Cotton, Victoria E; Neumann, Rita; Lam, Kwan-Wood Gabriel

    2013-01-08

    PRDM9 plays a key role in specifying meiotic recombination hotspot locations in humans and mice via recognition of hotspot sequence motifs by a variable tandem-repeat zinc finger domain in the protein. We now explore germ-line instability of this domain in humans. We show that repeat turnover is driven by mitotic and meiotic mutation pathways, the latter frequently resulting in substantial remodeling of zinc fingers. Turnover dynamics predict frequent allele switches in populations with correspondingly fast changes of the recombination landscape, fully consistent with the known rapid evolution of hotspot locations. We found variation in meiotic instability between men that correlated with PRDM9 status. One particular "destabilizer" variant caused hyperinstability not only of itself but also of otherwise-stable alleles in heterozygotes. PRDM9 protein thus appears to regulate the instability of its own coding sequence. However, destabilizer variants are strongly self-limiting in populations and probably have little impact on the evolution of the recombination landscape.

  17. NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents

    PubMed Central

    Liu, Sophia S.; Hockenberry, Adam J.; Lancichinetti, Andrea; Jewett, Michael C.

    2016-01-01

    The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a Python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs, which will lead to a better understanding of biological processes as well as more effective engineering of biological systems. PMID:27835644
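
    The idea of sampling random coding sequences under amino acid and GC constraints can be sketched as follows. This is not the NullSeq package or its API; it is a minimal illustration that assumes a fixed protein sequence and a hypothetical tilt parameter `lam`. Weighting each synonymous codon by exp(lam * GC count) is the general form a maximum-entropy distribution takes under a constraint on mean GC content; a real tool would solve for the value of `lam` that hits a target GC fraction, which this sketch does not do.

```python
import math
import random

BASES = "TCAG"
# Standard genetic code in TCAG order ('*' marks stop codons).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

def gc_count(codon):
    """Number of G or C bases in a codon (0 to 3)."""
    return sum(base in "GC" for base in codon)

def translate(cds):
    """Translate an in-frame coding sequence with the standard code."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def random_cds(protein, lam, rng):
    """Pick one synonymous codon per residue, weighted by exp(lam * GC count)."""
    picked = []
    for aa in protein:
        options = SYNONYMS[aa]
        weights = [math.exp(lam * gc_count(c)) for c in options]
        picked.append(rng.choices(options, weights=weights, k=1)[0])
    return "".join(picked)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
cds = random_cds("MKWLV", lam=0.5, rng=rng)
```

    With `lam = 0` every synonymous codon is equally likely; positive values push the sample toward GC-rich codons and negative values toward AT-rich ones, while the encoded protein is unchanged.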

  18. FOURTH SEMINAR TO THE MEMORY OF D.N. KLYSHKO: Algebraic solution of the synthesis problem for coded sequences

    NASA Astrophysics Data System (ADS)

    Leukhin, Anatolii N.

    2005-08-01

    The algebraic solution of a 'complex' problem of synthesis of phase-coded (PC) sequences with the zero level of side lobes of the cyclic autocorrelation function (ACF) is proposed. It is shown that the solution of the synthesis problem is connected with the existence of difference sets for a given code dimension. The problem of estimating the number of possible code combinations for a given code dimension is solved. It is pointed out that the problem of synthesis of PC sequences is related to the fundamental problems of discrete mathematics and, first of all, to a number of combinatorial problems, which can be solved, as the number factorisation problem, by algebraic methods by using the theory of Galois fields and groups.
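
    To make "zero level of side lobes of the cyclic ACF" concrete, the sketch below computes the cyclic autocorrelation of a phase-coded sequence and checks it on a Zadoff-Chu sequence, a well-known polyphase family with this property. The Zadoff-Chu sequence is swapped in here purely as a familiar example; it is not the algebraic construction proposed in the paper, and the function names are illustrative.

```python
import cmath

def cyclic_acf(seq):
    """Cyclic (periodic) autocorrelation of a complex sequence at every shift."""
    n = len(seq)
    return [
        sum(seq[k] * seq[(k + shift) % n].conjugate() for k in range(n))
        for shift in range(n)
    ]

def zadoff_chu(n, root=1):
    """Odd-length Zadoff-Chu sequence; root must be coprime to n."""
    return [cmath.exp(-1j * cmath.pi * root * k * (k + 1) / n) for k in range(n)]

acf = cyclic_acf(zadoff_chu(7))
main_lobe = abs(acf[0])                      # equals the sequence length
max_side_lobe = max(abs(v) for v in acf[1:])  # numerically zero
```

    Every element has unit magnitude, so only the phases carry the code; the main lobe equals the length and all other shifts cancel.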

  19. [Learning of reproduction of random sequences by the right and the left hand movements: coding of positions or movements].

    PubMed

    Bobrova, E V; Liakhovetskiĭ, V A; Skopin, G N

    2012-01-01

    Positional and movement errors during reproduction of memorized sequences of six random hand movements were analyzed. The task was performed by two groups of subjects: for six days with one hand (right/left) and for the next six days with the other hand (left/right). Mean accuracy errors decreased during learning only in the group that began with the right hand. The number of transposition errors depended on the type of error: positional or movement. Subjects transposed positions of the right hand more often when the right hand began the task, and transposed movements of the left hand more often when the left hand began the task. The results support the hypothesis of two types of movement coding, positional and vector coding (coding of positions or of changes in position), specific to the right and left hemispheres, and suggest that learning to reproduce movement sequences is provided by vector coding.

  20. Sequence-Based Analysis Uncovers an Abundance of Non-Coding RNA in the Total Transcriptome of Mycobacterium tuberculosis

    PubMed Central

    Arnvig, Kristine B.; Comas, Iñaki; Thomson, Nicholas R.; Houghton, Joanna; Boshoff, Helena I.; Croucher, Nicholas J.; Rose, Graham; Perkins, Timothy T.; Parkhill, Julian; Dougan, Gordon; Young, Douglas B.

    2011-01-01

    RNA sequencing provides a new perspective on the genome of Mycobacterium tuberculosis by revealing an extensive presence of non-coding RNA, including long 5’ and 3’ untranslated regions, antisense transcripts, and intergenic small RNA (sRNA) molecules. More than a quarter of all sequence reads mapping outside of ribosomal RNA genes represent non-coding RNA, and the density of reads mapping to intergenic regions was more than two-fold higher than that mapping to annotated coding sequences. Selected sRNAs were found at increased abundance in stationary phase cultures and accumulated to remarkably high levels in the lungs of chronically infected mice, indicating a potential contribution to pathogenesis. The ability of tubercle bacilli to adapt to changing environments within the host is critical to their ability to cause disease and to persist during drug treatment; it is likely that novel post-transcriptional regulatory networks will play an important role in these adaptive responses. PMID:22072964

  1. Human beta-hexosaminidase alpha chain: coding sequence and homology with the beta chain.

    PubMed Central

    Myerowitz, R; Piekarz, R; Neufeld, E F; Shows, T B; Suzuki, K

    1985-01-01

    We have isolated a cDNA clone, p beta H alpha-5, from an adult human liver library that contains the entire coding sequence of the alpha chain of beta-hexosaminidase. The cDNA insert of p beta H alpha-5 is 1944 base pairs long and contains a 168-base-pair 5' untranslated region, a 186-base-pair 3' untranslated region, and an open reading frame of 1587 base pairs corresponding to 529 amino acids (Mr, 60,697). The first 17-22 amino acids satisfy the requirements of a signal sequence. A striking sequence homology with a published partial amino acid sequence for the beta chain [O'Dowd, B. F., Quan, F., Willard, H. F., Lamhonwah, A. M., Korneluk, R. G., Lowden, J. A., Gravel, R. A. & Mahuran, D. J. (1985) Proc. Natl. Acad. Sci. USA 82, 1184-1188] suggests that both chains may have evolved from a common ancestor. A shorter alpha-chain cDNA was found to hybridize to the long arm of chromosome 15, the known location for the alpha-chain gene. In addition, we isolated another alpha-chain cDNA clone, p beta H alpha-4, from a simian virus 40-transformed human fibroblast library that contained an extra 453-base-pair piece at its 3' end. A probe consisting of this additional sequence hybridized exclusively to a single mRNA species (2.6 kilobases) in mRNA preparations from cultured human fibroblasts. In contrast, p beta H alpha-5 hybridized to both a 2.1-kilobase major and a 2.6-kilobase minor mRNA species in these same mRNA preparations, indicating the presence of two distinct alpha-chain mRNA species differing at the 3' end. Fibroblasts from an Ashkenazi Jewish patient with classic Tay-Sachs disease were deficient in both species of mRNA, confirming their genetic relationship. PMID:2933746
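
    The lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. The interpretation of the 3 bp left over is an assumption: it is consistent with a stop codon that is not counted in the 1587 bp open reading frame, but the abstract does not say so explicitly.

```python
# Lengths in base pairs, as quoted in the abstract.
insert_len = 1944
utr5_len = 168
orf_len = 1587
utr3_len = 186

# An in-frame ORF must be a whole number of codons.
codons = orf_len // 3
in_frame = orf_len % 3 == 0

# Base pairs not accounted for by the two UTRs plus the ORF.
# Assumption: this remainder is the stop codon.
remainder = insert_len - (utr5_len + orf_len + utr3_len)
```

    The codon count matches the 529 amino acids reported, and the remainder is exactly one codon long.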

  2. Biased Gene Conversion and GC-Content Evolution in the Coding Sequences of Reptiles and Vertebrates

    PubMed Central

    Figuet, Emeric; Ballenghien, Marion; Romiguier, Jonathan; Galtier, Nicolas

    2015-01-01

    Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamics of coding sequence GC-content are largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins. PMID:25527834
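
    GC3, the statistic this study is built on, is simply the fraction of G or C bases at third-codon positions. A minimal sketch, assuming an in-frame coding sequence given as a plain string (the function name is illustrative and not from the paper):

```python
def gc3(cds):
    """Fraction of G or C bases at third-codon positions of an in-frame CDS."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("expected a non-empty sequence whose length is a multiple of 3")
    third_positions = cds[2::3]  # every third base, starting at index 2
    return sum(base in "GC" for base in third_positions) / len(third_positions)

# Codons ATG GCC AAA TTC have third bases G, C, A, C.
example = gc3("ATGGCCAAATTC")
```

    Restricting the count to third positions is what makes the measure largely insensitive to the encoded protein, since most third-position changes are synonymous.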

  3. Biased gene conversion and GC-content evolution in the coding sequences of reptiles and vertebrates.

    PubMed

    Figuet, Emeric; Ballenghien, Marion; Romiguier, Jonathan; Galtier, Nicolas

    2014-12-19

    Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamics of coding sequence GC-content are largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.

  4. RAMICS: trainable, high-speed and biologically relevant alignment of high-throughput sequencing reads to coding DNA

    PubMed Central

    Wright, Imogen A.; Travers, Simon A.

    2014-01-01

    The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance. PMID:24861618
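
    The distinction between codon-sized indels and frameshifts comes down to whether a gap length is a multiple of three. RAMICS makes this call inside a profile hidden Markov model; the sketch below shows only the underlying length rule, with a hypothetical function name, and ignores where in the codon the gap starts.

```python
def classify_indel(gap_length):
    """Label an insertion or deletion by its effect on the reading frame."""
    if gap_length <= 0:
        raise ValueError("gap length must be a positive number of bases")
    # A multiple of three adds or removes whole codons and keeps the frame.
    return "codon-sized" if gap_length % 3 == 0 else "frameshift"

labels = [classify_indel(n) for n in (1, 3, 4, 6)]
```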

  5. The vicilin gene family of pea (Pisum sativum L.): a complete cDNA coding sequence for preprovicilin.

    PubMed Central

    Lycett, G W; Delauney, A J; Gatehouse, J A; Gilroy, J; Croy, R R; Boulter, D

    1983-01-01

    A cDNA plasmid bank has been constructed using mRNA from developing pea seeds and three cDNAs coding for vicilin polypeptides have been selected. These cDNAs have been sequenced and between them cover the whole of the coding sequence plus part of the 5' and 3' untranslated regions. Comparison with amino acid sequence data from the protein indicates that vicilin is synthesised as preprovicilin with subsequent removal of a signal peptide and a C-terminal peptide as well as post translational endo-proteolytic cleavage. The cDNAs represent two different classes of vicilin genes whilst amino acid data show that there are at least three major classes of vicilin polypeptide. The vicilin sequences show extensive homology with conglycinin and phaseolin except in the regions of the internal proteolytic cleavages. The evolutionary significance of this relationship is discussed. PMID:6687941

  6. Capsid coding sequences of foot-and-mouth disease viruses are determinants of pathogenicity in pigs.

    PubMed

    Lohse, Louise; Jackson, Terry; Bøtner, Anette; Belsham, Graham J

    2012-05-24

    The surface exposed capsid proteins, VP1, VP2 and VP3, of foot-and-mouth disease virus (FMDV) determine its antigenicity and the ability of the virus to interact with host-cell receptors. Hence, modification of these structural proteins may alter the properties of the virus. In the present study we compared the pathogenicity of different FMDVs in young pigs. In total 32 pigs, 7 weeks old, were exposed to virus, either by direct inoculation or through contact with inoculated pigs, using cell culture adapted (O1K B64), chimeric (O1K/A-TUR and O1K/O-UKG) or field strain (O-UKG/34/2001) viruses. The O1K B64 virus and the two chimeric viruses are identical to each other except for the capsid coding region. Animals exposed to O1K B64 did not exhibit signs of disease, while pigs exposed to each of the other viruses showed typical clinical signs of foot-and-mouth disease (FMD). All pigs infected with the O1K/O-UKG chimera or the field strain (O-UKG/34/2001) developed fulminant disease. Furthermore, 3 of 4 in-contact pigs exposed to the O1K/O-UKG virus died in the acute phase of infection, likely from myocardial infection. However, in the group exposed to the O1K/A-TUR chimeric virus, only 1 pig showed symptoms of disease within the time frame of the experiment (10 days). All pigs that developed clinical disease showed a high level of viral RNA in serum, and infected pigs that survived the acute phase of infection developed a serotype-specific antibody response. It is concluded that the capsid coding sequences are determinants of FMDV pathogenicity in pigs.

  7. [Transposition errors during learning to reproduce a sequence by the right- and the left-hand movements: simulation of positional and movement coding].

    PubMed

    Liakhovetskiĭ, V A; Bobrova, E V; Skopin, G N

    2012-01-01

    Transposition errors during the reproduction of a hand movement sequence provide important information on the internal representation of this sequence in motor working memory. Analysis of such errors showed that learning to reproduce sequences of left-hand movements improves the system of positional coding (coding of positions), while learning of right-hand movements improves the system of vector coding (coding of movements). Learning of right-hand movements after left-hand performance involved the system of positional coding "imposed" by the left hand. Learning of left-hand movements after right-hand performance activated the system of vector coding. Transposition errors during learning to reproduce movement sequences can be explained by a neural network using either vector coding or both vector and positional coding.

  8. Classifier assessment and feature selection for recognizing short coding sequences of human genes.

    PubMed

    Song, Kai; Zhang, Ze; Tong, Tuo-Peng; Wu, Fang

    2012-03-01

    With the ever-increasing pace of genome sequencing, there is a great need for fast and accurate computational tools to automatically identify genes in these genomes. Although great progress has been made in the development of gene-finding algorithms during the past decades, there is still room for further improvement. In particular, the issue of recognizing short exons in eukaryotes is still not solved satisfactorily. This article is devoted to assessing various linear and kernel-based classification algorithms and selecting the best combination of Z-curve features for further improvement on this issue. Eight state-of-the-art linear and kernel-based supervised pattern recognition techniques were used to identify the short (21-192 bp) coding sequences of human genes. By measuring the prediction accuracy, the tradeoff between sensitivity and specificity, and the time consumption, partial least squares (PLS) and kernel partial least squares (KPLS) algorithms were found to be the best linear and kernel-based classifiers, respectively. A surprising result was that, by making good use of the interpretability of the PLS and the Z-curve methods, 93 Z-curve features were shown to be the best selective combination. Using them, the average recognition accuracy was improved by as much as 7.7% by means of KPLS when compared with what was obtained by the Fisher discriminant analysis using 189 Z-curve variables (Gao and Zhang, 2004). The codes used are freely available from the following sources (implemented in MATLAB and supported on Linux and MS Windows): (1) SVM: http://www.support-vector-machines.org/SVM_soft.html. (2) GP: http://www.gaussianprocess.org. (3) KPLS and KFDA: Taylor, J.S., and Cristianini, N. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge, UK. (4) PLS: Wise, B.M., and Gallagher, N.B. 2011. PLS-Toolbox for use with MATLAB: ver 1.5.2. Eigenvector Technologies, Manson, WA. Supplementary Material for this article is

  9. Empirical Transition Probability Indexing Sparse-Coding Belief Propagation (ETPI-SCoBeP) Genome Sequence Alignment

    PubMed Central

    Roozgard, Aminmohammad; Barzigar, Nafise; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel

    2014-01-01

    The advance in human genome sequencing technology has significantly reduced the cost of data generation and overwhelms the computing capability of sequence analysis. Efficiency, efficacy, and scalability remain challenging in sequence alignment, which is an important and foundational operation for genome data analysis. In this paper, we propose a two-stage approach to tackle this problem. In the preprocessing step, we match blocks of reference and target sequences based on the similarities between their empirical transition probability distributions using belief propagation. We then conduct a refined match using our recently published sparse-coding belief propagation (SCoBeP) technique. Our experimental results demonstrated robustness in nucleotide sequence alignment, and our results are competitive to those of the SOAP aligner and the BWA algorithm. Moreover, compared to SCoBeP alignment, the proposed technique can handle sequences of much longer lengths. PMID:25983537

  10. Reasoning from Incomplete Knowledge.

    ERIC Educational Resources Information Center

    Collins, Allan M.; And Others

    People use a variety of plausible, but uncertain inferences to answer questions about which their knowledge is incomplete. Such inferential thinking and reasoning is being incorporated into the SCHOLAR computer-assisted instruction (CAI) system. Socratic tutorial techniques in CAI systems such as SCHOLAR are described, and examples of their…

  11. Mice carrying a complete deletion of the talin2 coding sequence are viable and fertile

    SciTech Connect

    Debrand, Emmanuel; Conti, Francesco J.; Bate, Neil; Spence, Lorraine; Mazzeo, Daniela; Pritchard, Catrin A.; Monkley, Susan J.; Critchley, David R.

    2012-09-21

    Highlights: Mice lacking talin2 are viable and fertile with only a mildly dystrophic phenotype. Talin2 null fibroblasts show no major defects in proliferation, adhesion or migration. Maintaining a colony of talin2 null mice is difficult, indicating an underlying defect. Abstract: Mice homozygous for several Tln2 gene targeted alleles are viable and fertile. Here we show that although the expression of talin2 protein is drastically reduced in muscle from these mice, other tissues continue to express talin2 albeit at reduced levels. We therefore generated a Tln2 allele lacking the entire coding sequence (Tln2^cd). Tln2^cd/cd mice were viable and fertile, and the genotypes of Tln2^cd/+ intercrosses were at the expected Mendelian ratio. Tln2^cd/cd mice showed no major difference in body mass or the weight of the major organs compared to wild-type, although they displayed a mildly dystrophic phenotype. Moreover, Tln2^cd/cd mouse embryo fibroblasts showed no obvious defects in cell adhesion, migration or proliferation. However, the number of Tln2^cd/cd pups surviving to adulthood was variable, suggesting that such mice have an underlying defect.

  12. Unstable microsatellite repeats facilitate rapid evolution of coding and regulatory sequences.

    PubMed

    Jansen, A; Gemayel, R; Verstrepen, K J

    2012-01-01

    Tandem repeats are intrinsically highly variable sequences since repeat units are often lost or gained during replication or following unequal recombination events. Because of their low complexity and their instability, these repeats, which are also called satellite repeats, are often considered to be useless 'junk' DNA. However, recent findings show that tandem repeats are frequently found within promoters of stress-induced genes and within the coding regions of genes encoding cell-surface and regulatory proteins. Interestingly, frequent changes in these repeats often confer phenotypic variability. Examples include variation in the microbial cell surface, rapid tuning of internal molecular clocks in flies, and enhanced morphological plasticity in mammals. This suggests that instead of being useless junk DNA, some variable tandem repeats are useful functional elements that confer 'evolvability', facilitating swift evolution and rapid adaptation to changing environments. Since changes in repeats are frequent and reversible, repeats provide a unique type of mutation that bridges the gap between rare genetic mutations, such as single nucleotide polymorphisms, and highly unstable but reversible epigenetic inheritance.

  13. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3′ UTRs and coding sequences

    PubMed Central

    Šulc, Miroslav; Marín, Ray M.; Robins, Harlan S.; Vaníček, Jiří

    2015-01-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3′ untranslated regions (3′ UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3′ UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA–mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA–mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. PMID:25948580

  14. Multiple Distinct Splicing Enhancers in the Protein-Coding Sequences of a Constitutively Spliced Pre-mRNA

    PubMed Central

    Schaal, Thomas D.; Maniatis, Tom

    1999-01-01

    We have identified multiple distinct splicing enhancer elements within protein-coding sequences of the constitutively spliced human β-globin pre-mRNA. Each of these highly conserved sequences is sufficient to activate the splicing of a heterologous enhancer-dependent pre-mRNA. One of these enhancers is activated by and binds to the SR protein SC35, whereas at least two others are activated by the SR protein SF2/ASF. A single base mutation within another enhancer element inactivates the enhancer but does not change the encoded amino acid. Thus, overlapping protein coding and RNA recognition elements may be coselected during evolution. These studies provide the first direct evidence that SR protein-specific splicing enhancers are located within the coding regions of constitutively spliced pre-mRNAs. We propose that these enhancers function as multisite splicing enhancers to specify 3′ splice-site selection. PMID:9858550

  15. ICRPfinder: a fast pattern design algorithm for coding sequences and its application in finding potential restriction enzyme recognition sites

    PubMed Central

    Li, Chao; Li, Yuhua; Zhang, Xiangmin; Stafford, Phillip; Dinu, Valentin

    2009-01-01

    Background: Restriction enzymes can produce easily definable segments from DNA sequences by using a variety of cut patterns. There are, however, no software tools that can aid in gene building, that is, modifying wild-type DNA sequences to express the same wild-type amino acid sequences but with enhanced codons, specific cut sites, unique post-translational modifications, and other engineered-in components for recombinant applications. A fast DNA pattern design algorithm, ICRPfinder, is provided in this paper and applied to find or create potential recognition sites in target coding sequences. Results: ICRPfinder is applied to find or create restriction enzyme recognition sites by introducing silent mutations. The algorithm is shown to be capable of mapping existing cut-sites, but importantly it can also generate specified new unique cut-sites within a specified region that are guaranteed not to be present elsewhere in the DNA sequence. Conclusion: ICRPfinder is a powerful tool for finding or creating specific DNA patterns in a given target coding sequence. ICRPfinder finds or creates patterns, which can include restriction enzyme recognition sites, without changing the translated protein sequence. ICRPfinder is a browser-based JavaScript application and it can run on any platform, in on-line or off-line mode. PMID:19747395
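
    Creating a recognition site by silent mutation can be sketched with a brute-force search: overwrite each window of the coding sequence with the site and keep the result only if the translated protein is unchanged. This is not ICRPfinder's algorithm (the tool itself is a JavaScript application and is described as fast); the function names are hypothetical, the sketch handles only fully specified sites, and it does not check that the new site is unique in the sequence.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order ('*' marks stop codons).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

def translate(cds):
    """Translate an in-frame coding sequence with the standard code."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def silent_site_insertions(cds, site):
    """Return (offset, mutated_cds) pairs where writing `site` at `offset`
    changes the DNA but leaves the translated protein unchanged."""
    protein = translate(cds)
    hits = []
    for offset in range(len(cds) - len(site) + 1):
        candidate = cds[:offset] + site + cds[offset + len(site):]
        # Skip windows where the site is already present (no mutation needed).
        if candidate != cds and translate(candidate) == protein:
            hits.append((offset, candidate))
    return hits

# GAATTC is the EcoRI recognition sequence; the short CDS is made up.
hits = silent_site_insertions("ATGGAGTTTTAA", "GAATTC")
```

    In the example, codons GAG TTT (Glu, Phe) become GAA TTC, which still encode Glu and Phe, so the site is created without altering the protein.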

  16. A novel all-optical label processing based on multiple optical orthogonal codes sequences for optical packet switching networks

    NASA Astrophysics Data System (ADS)

    Zhang, Chongfu; Qiu, Kun; Xu, Bo; Ling, Yun

    2008-05-01

    This paper proposes an all-optical label processing scheme that uses the multiple optical orthogonal codes sequences (MOOCS)-based optical label for optical packet switching (OPS) (MOOCS-OPS) networks. In this scheme, each MOOCS is a permutation or combination of the multiple optical orthogonal codes (MOOC) selected from the multiple-groups optical orthogonal codes (MGOOC). Following a comparison of different optical label processing (OLP) schemes, the principles of MOOCS-OPS network are given and analyzed. Firstly, theoretical analyses are used to prove that MOOCS is able to greatly enlarge the number of available optical labels when compared to the previous single optical orthogonal code (SOOC) for OPS (SOOC-OPS) network. Then, the key units of the MOOCS-based optical label packets, including optical packet generation, optical label erasing, optical label extraction and optical label rewriting etc., are given and studied. These results are used to verify that the proposed MOOCS-OPS scheme is feasible.

  17. Nucleotide sequence of the capsid protein gene and 3' non-coding region of papaya mosaic virus RNA.

    PubMed

    Abouhaidar, M G

    1988-01-01

    The nucleotide sequences of cDNA clones corresponding to the 3' OH end of papaya mosaic virus RNA have been determined. The 3'-terminal sequence obtained was 900 nucleotides in length, excluding the poly(A) tail, and contained an open reading frame capable of giving rise to a protein of 214 amino acid residues with an Mr of 22930. This protein was identified as the viral capsid protein. The 3' non-coding region of PMV genome RNA was about 121 nucleotides long [excluding the poly(A) tail] and homologous to the complementary sequence of the non-coding region at the 5' end of PMV RNA. A long open reading frame was also found in the predicted 5' end region of the negative strand.

  18. Code optimization of the subroutine to remove near identical matches in the sequence database homology search tool PSI-BLAST.

    PubMed

    Aspnäs, Mats; Mattila, Kimmo; Osowski, Kristoffer; Westerholm, Jan

    2010-06-01

    A central task in protein sequence characterization is the use of a sequence database homology search tool to find similar protein sequences in other individuals or species. PSI-BLAST is a widely used module of the BLAST package that calculates a position-specific score matrix from the best matching sequences and performs iterated searches using a method to avoid many similar sequences for the score. For some queries and parameter settings, PSI-BLAST may find many similar high-scoring matches, and therefore up to 80% of the total run time may be spent in this procedure. In this article, we present code optimizations that improve the cache utilization and the overall performance of this procedure. Measurements show that, for queries where the number of similar matches is high, the optimized PSI-BLAST program may be as much as 2.9 times faster than the original program.

  19. HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment

    PubMed Central

    Johnson, Matthew G.; Gardner, Elliot M.; Liu, Yang; Medina, Rafael; Goffinet, Bernard; Shaw, A. Jonathan; Zerega, Nyree J. C.; Wickett, Norman J.

    2016-01-01

    Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper. PMID:27437175

  20. New genes from non-coding sequence: the role of de novo protein-coding genes in eukaryotic evolutionary innovation

    PubMed Central

    McLysaght, Aoife; Guerzoni, Daniele

    2015-01-01

    The origin of novel protein-coding genes de novo was once considered so improbable as to be impossible. In less than a decade, and especially in the last five years, this view has been overturned by extensive evidence from diverse eukaryotic lineages. There is now evidence that this mechanism has contributed a significant number of genes to genomes of organisms as diverse as Saccharomyces, Drosophila, Plasmodium, Arabidopsis and human. From simple beginnings, these genes have in some instances acquired complex structure, regulated expression and important functional roles. New genes are often thought of as dispensable late additions; however, some recent de novo genes in human can play a role in disease. Rather than an extremely rare occurrence, it is now evident that there is a relatively constant trickle of proto-genes released into the testing ground of natural selection. It is currently unknown whether de novo genes arise primarily through an ‘RNA-first’ or ‘ORF-first’ pathway. Either way, evolutionary tinkering with this pool of genetic potential may have been a significant player in the origins of lineage-specific traits and adaptations. PMID:26323763

  1. Cloning and nucleotide sequence of the gene coding for aspartokinase II from a thermophilic methylotrophic Bacillus sp.

    PubMed Central

    Schendel, F J; Flickinger, M C

    1992-01-01

    The structural gene coding for the lysine-sensitive aspartokinase II of the methylotrophic thermotolerant Bacillus sp. strain MGA3 was cloned from a genomic library by complementation of an Escherichia coli auxotrophic mutant lacking all three aspartokinase isozymes. The nucleotide sequence of the entire 2.2-kb PstI fragment was determined, and a single open reading frame coding for the aspartokinase II enzyme was found. Aspartokinase II was shown to be an alpha 2 beta 2 tetramer (M(r) 122,000) with the beta subunit (M(r) 18,000) encoded within the alpha subunit (M(r) 45,000) in the same reading frame. The enzyme was purified, and the N-terminal sequences of the alpha and beta subunits were identical with those predicted from the gene sequences. The predicted amino acid sequence was 76% identical with the sequence of the Bacillus subtilis aspartokinase II. The transcription initiation site was located approximately 350 bp upstream of the translation start site, and putative promoter regions at -10 (TATGCT) and -35 (ATGACA) were identified. A 300-nucleotide intervening sequence between the transcription initiation and translational start sites suggests a possible attenuation mechanism for the regulation of transcription of this enzyme in the presence of lysine. PMID:1444390

  2. Resolving arthropod phylogeny: exploring phylogenetic signal within 41 kb of protein-coding nuclear gene sequence.

    PubMed

    Regier, Jerome C; Shultz, Jeffrey W; Ganley, Austen R D; Hussey, April; Shi, Diane; Ball, Bernard; Zwick, Andreas; Stajich, Jason E; Cummings, Michael P; Martin, Joel W; Cunningham, Clifford W

    2008-12-01

    This study attempts to resolve relationships among and within the four basal arthropod lineages (Pancrustacea, Myriapoda, Euchelicerata, Pycnogonida) and to assess the widespread expectation that remaining phylogenetic problems will yield to increasing amounts of sequence data. Sixty-eight regions of 62 protein-coding nuclear genes (approximately 41 kilobases (kb)/taxon) were sequenced for 12 taxonomically diverse arthropod taxa and a tardigrade outgroup. Parsimony, likelihood, and Bayesian analyses of total nucleotide data generally strongly supported the monophyly of each of the basal lineages represented by more than one species. Other relationships within the Arthropoda were also supported, with support levels depending on method of analysis and inclusion/exclusion of synonymous changes. Removing third codon positions, where the assumption of base compositional homogeneity was rejected, altered the results. Removing the final class of synonymous mutations--first codon positions encoding leucine and arginine, which were also compositionally heterogeneous--yielded a data set that was consistent with a hypothesis of base compositional homogeneity. Furthermore, under such a data-exclusion regime, all 68 gene regions individually were consistent with base compositional homogeneity. Restricting likelihood analyses to nonsynonymous change recovered trees with strong support for the basal lineages but not for other groups that were variably supported with more inclusive data sets. In a further effort to increase phylogenetic signal, three types of data exploration were undertaken. (1) Individual genes were ranked by their average rate of nonsynonymous change, and three rate categories were assigned--fast, intermediate, and slow. Then, bootstrap analysis of each gene was performed separately to see which taxonomic groups received strong support. Five taxonomic groups were strongly supported independently by two or more genes, and these genes mostly belonged to the slow

  3. Characterization of EBV Promoters and Coding Regions by Sequencing PCR-Amplified DNA Fragments.

    PubMed

    Szenthe, Kalman; Bánáti, Ferenc

    2017-01-01

    DNA sequencing approaches originally developed in two directions, the chemical degradation method and the chain-termination method. The latter one became more widespread and a huge amount of sequencing data including whole genome sequences accumulated, based on the use of capillary sequencer systems and the application of a modified chain-termination method which proved to be relatively easy, fast, and reliable. In addition, relatively long, up to 1000 bp sequences could be obtained with a single read with high per-base accuracy. Although the recent appearance of next-generation DNA sequencing (NGS) technologies enabled high-throughput and low cost analysis of DNA, the modified chain-terminating methods are often applied in research until now. In the following, we shall present the application of capillary sequencing for the sequence characterization of viral genomes in case of partial and whole genome sequencing, and demonstrate it on the BARF1 promoter of Epstein Barr virus (EBV).

  4. SHARAKU: an algorithm for aligning and clustering read mapping profiles of deep sequencing in non-coding RNA processing

    PubMed Central

    Tsuchiya, Mariko; Amano, Kojiro; Abe, Masaya; Seki, Misato; Hase, Sumitaka; Sato, Kengo; Sakakibara, Yasubumi

    2016-01-01

    Motivation: Deep sequencing of the transcripts of regulatory non-coding RNA generates footprints of post-transcriptional processes. After obtaining sequence reads, the short reads are mapped to a reference genome, and specific mapping patterns can be detected called read mapping profiles, which are distinct from random non-functional degradation patterns. These patterns reflect the maturation processes that lead to the production of shorter RNA sequences. Recent next-generation sequencing studies have revealed not only the typical maturation process of miRNAs but also the various processing mechanisms of small RNAs derived from tRNAs and snoRNAs. Results: We developed an algorithm termed SHARAKU to align two read mapping profiles of next-generation sequencing outputs for non-coding RNAs. In contrast with previous work, SHARAKU incorporates the primary and secondary sequence structures into an alignment of read mapping profiles to allow for the detection of common processing patterns. Using a benchmark simulated dataset, SHARAKU exhibited superior performance to previous methods for correctly clustering the read mapping profiles with respect to 5′-end processing and 3′-end processing from degradation patterns and in detecting similar processing patterns in deriving the shorter RNAs. Further, using experimental data of small RNA sequencing for the common marmoset brain, SHARAKU succeeded in identifying the significant clusters of read mapping profiles for similar processing patterns of small derived RNA families expressed in the brain. Availability and Implementation: The source code of our program SHARAKU is available at http://www.dna.bio.keio.ac.jp/sharaku/, and the simulated dataset used in this work is available at the same link. Accession code: The sequence data from the whole RNA transcripts in the hippocampus of the left brain used in this work is available from the DNA DataBank of Japan (DDBJ) Sequence Read Archive (DRA) under the accession number DRA

  5. ENAM Mutations with Incomplete Penetrance

    PubMed Central

    Seymen, F.; Lee, K.-E.; Koruyucu, M.; Gencay, K.; Bayram, M.; Tuna, E.B.; Lee, Z.H.; Kim, J.-W.

    2014-01-01

    Amelogenesis imperfecta (AI) is a genetic disease affecting tooth enamel formation. AI can be an isolated entity or a phenotype of syndromes. To date, more than 10 genes have been associated with various forms of AI. We have identified 2 unrelated Turkish families with hypoplastic AI and performed mutational analysis. Whole-exome sequencing identified 2 novel heterozygous nonsense mutations in the ENAM gene (c.454G>T p.Glu152* in family 1, c.358C>T p.Gln120* in family 2) in the probands. Affected individuals were heterozygous for the mutation in each family. Segregation analysis within each family revealed individuals with incomplete penetrance or an extremely mild enamel phenotype, in spite of having the same mutation as the other affected individuals. We believe that these findings will broaden our understanding of the clinical phenotype of AI caused by ENAM mutations. PMID:25143514

  6. Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

    PubMed Central

    Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

    1993-01-01

    A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829

  7. Nucleotide sequence of the melA gene, coding for alpha-galactosidase in Escherichia coli K-12.

    PubMed Central

    Liljeström, P L; Liljeström, P

    1987-01-01

    Melibiose uptake and hydrolysis in E. coli are performed by the MelB and MelA proteins, respectively. We report the cloning and sequencing of the melA gene. The nucleotide sequence data showed that melA codes for a 450 amino acid long protein with a molecular weight of 50.6 kd. The sequence data also supported the assumption that the mel locus forms an operon with melA in proximal position. A comparison of MelA with alpha-galactosidase proteins from yeast and human origin showed that these proteins have only limited homology, the yeast and human proteins being more related. However, regions common to all three proteins were found, indicating sequences that might comprise the active site of alpha-galactosidase. PMID:3031590

  8. Melting temperature highlights functionally important RNA structure and sequence elements in yeast mRNA coding regions.

    PubMed

    Qi, Fei; Frishman, Dmitrij

    2017-03-07

    Secondary structure elements in the coding regions of mRNAs play an important role in gene expression and regulation, but distinguishing functional from non-functional structures remains challenging. Here we investigate the dependence of sequence-structure relationships in the coding regions on temperature based on the recent PARTE data by Wan et al. Our main finding is that the regions with high and low thermostability (high Tm and low Tm regions) are under evolutionary pressure to preserve RNA secondary structure and primary sequence, respectively. Sequences of low Tm regions display a higher degree of evolutionary conservation compared to high Tm regions. Low Tm regions are under strong synonymous constraint, while high Tm regions are not. These findings imply that high Tm regions contain thermo-stable functionally important RNA structures, which impose relaxed evolutionary constraint on sequence as long as the base-pairing patterns remain intact. By contrast, low thermostability regions contain single-stranded functionally important conserved RNA sequence elements accessible for binding by other molecules. We also find that theoretically predicted structures of paralogous mRNA pairs become more similar with growing temperature, while experimentally measured structures tend to diverge, which implies that the melting pathways of RNA structures cannot be fully captured by current computational approaches.

  9. Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

    PubMed Central

    Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

    2014-01-01

    The continuously prolonged human lifespan is accompanied by an increase in the incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domains and microRNA binding annotations. We applied the novel workflow on RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post- Deep Brain Stimulation (DBS) treatment and compared to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected control brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia

  10. Analysis of the multi-copied genes and the impact of the redundant protein coding sequences on gene annotation in prokaryotic genomes.

    PubMed

    Yu, Jia-Feng; Chen, Qing-Li; Ren, Jing; Yang, Yan-Ling; Wang, Ji-Hua; Sun, Xiao

    2015-07-07

    The important roles of duplicated genes in the evolutionary process have been recognized in bacteria, archaebacteria and eukaryotes, while there has been very little study of the multi-copied protein coding genes that share sequence identity of 100%. In this paper, the multi-copied protein coding genes in a number of prokaryotic genomes are first comprehensively analyzed. The results show that 0-15.93% of the protein coding genes in each genome are multi-copied genes and 0-16.49% of the protein coding genes in each genome are highly similar, with sequence identity ≥ 80%. Function and COG (Clusters of Orthologous Groups of proteins) analysis shows that 64.64% of multi-copied genes concentrate on the function of transposase and 86.28% of the COG assigned multi-copied genes concentrate on the COG code of 'L'. Furthermore, the impact of redundant protein coding sequences on the gene prediction results is studied. The results show that the problem of protein coding sequence redundancies cannot be ignored and that the consistency of the gene annotation results before and after excluding the redundant sequences is negatively related to the degree of sequence redundancy of the protein coding sequences in the training set.

  11. An Incomplete Paradigm

    ERIC Educational Resources Information Center

    Boulding, Kenneth E.

    1978-01-01

    Examines the role of sociobiology in explaining human behavior. Recommends that sociobiologists consider both biogenetics (DNA and information coded in the genes) and noogenetics (process by which learned structures are transmitted from one generation to the next). (Author/DB)

  12. Range sidelobe elimination in maximal sequence phase coded C.W. radars

    NASA Astrophysics Data System (ADS)

    Metaxas, D. G.; Aitchison, C. S.

    Elimination of the range sidelobe of the autocorrelation of the periodic binary maximal sequence has been achieved. A new property of the m-sequence is defined. As a result the ambiguity function of an m-sequence PSK CW radar coincides with the ambiguity function of the equivalent pulse radar as far as the range and velocity resolution are concerned. The signal to noise deterioration due to the post-correlation implementation of the new property is insignificant.
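
    For context, the sidelobe in question comes from a standard property of maximal-length sequences: in ±1 form, the periodic autocorrelation equals the sequence length at zero lag and -1 at every other lag. The sketch below only demonstrates that baseline property for a 15-chip sequence; it does not reproduce the "new property" or the post-correlation scheme of the record, which the abstract does not specify. The recurrence (from the primitive polynomial x^4 + x + 1), the seed, and the function names are choices made for illustration.

```python
def m_sequence(n_chips=15):
    """Generate one period of a length-15 m-sequence with the recurrence
    a[n+4] = a[n+1] XOR a[n], i.e. the primitive polynomial x^4 + x + 1."""
    bits = [1, 0, 0, 0]  # any non-zero seed gives a shift of the same sequence
    while len(bits) < n_chips:
        bits.append(bits[-3] ^ bits[-4])
    return bits[:n_chips]


def periodic_autocorrelation(bits):
    """Circular autocorrelation of the sequence mapped to +1/-1 chips."""
    chips = [1 - 2 * b for b in bits]  # 0 -> +1, 1 -> -1
    n = len(chips)
    return [
        sum(chips[i] * chips[(i + lag) % n] for i in range(n))
        for lag in range(n)
    ]


if __name__ == "__main__":
    seq = m_sequence()
    print(seq)
    print(periodic_autocorrelation(seq))  # peak of 15, then -1 at all other lags
```

    The constant -1 at non-zero lags is the residual range sidelobe that distinguishes the m-sequence CW waveform from an ideal pulse.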

  13. [Influence of "prehistory" of sequential movements of the right and the left hand on reproduction: coding of positions, movements and sequence structure].

    PubMed

    Bobrova, E V; Liakhovetskiĭ, V A; Borshchevskaia, E R

    2011-01-01

    The dependence of errors during reproduction of a sequence of hand movements without visual feedback on the previous right- and left-hand performance ("prehistory") and on positions in space of sequence elements (random or ordered by the explicit rule) was analyzed. It was shown that the preceding information about the ordered positions of the sequence elements was used during right-hand movements, whereas left-hand movements were performed with involvement of the information about the random sequence. The data testify to a central mechanism of the analysis of spatial structure of sequence elements. This mechanism activates movement coding specific for the left hemisphere (vector coding) in case of an ordered sequence structure and positional coding specific for the right hemisphere in case of a random sequence structure.

  14. DNA sequence-based "bar codes" for tracking the origins of expressed sequence tags from a maize cDNA library constructed using multiple mRNA sources.

    PubMed

    Qiu, Fang; Guo, Ling; Wen, Tsui-Jung; Liu, Feng; Ashlock, Daniel A; Schnable, Patrick S

    2003-10-01

    To enhance gene discovery, expressed sequence tag (EST) projects often make use of cDNA libraries produced using diverse mixtures of mRNAs. As such, expression data are lost because the origins of the resulting ESTs cannot be determined. Alternatively, multiple libraries can be prepared, each from a more restricted source of mRNAs. Although this approach allows the origins of ESTs to be determined, it requires the production of multiple libraries. A hybrid approach is reported here. A cDNA library was prepared using 21 different pools of maize (Zea mays) mRNAs. DNA sequence "bar codes" were added during first-strand cDNA synthesis to uniquely identify the mRNA source pool from which individual cDNAs were derived. Using a decoding algorithm that included error correction, it was possible to identify the source mRNA pool of more than 97% of the ESTs. The frequency at which a bar code is represented in an EST contig should be proportional to the abundance of the corresponding mRNA in the source pool. Consistent with this, all ESTs derived from several genes (zein and adh1) that are known to be exclusively expressed in kernels or preferentially expressed under anaerobic conditions, respectively, were exclusively tagged with bar codes associated with mRNA pools prepared from kernel and anaerobically treated seedlings, respectively. Hence, by allowing for the retention of expression data, the bar coding of cDNA libraries can enhance the value of EST projects.
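
    A decoding step with error correction can be sketched as nearest-barcode matching by Hamming distance. The abstract does not give the actual bar code sequences, their length, or the decoding algorithm used, so the codes, pool names, and the one-mismatch tolerance below are all hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(tag, barcodes, max_errors=1):
    """Return the pool whose bar code lies within `max_errors` mismatches of
    `tag`, or None when no pool or more than one pool qualifies."""
    hits = [
        pool
        for pool, code in barcodes.items()
        if len(code) == len(tag) and hamming(code, tag) <= max_errors
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 6-base bar codes; every pair differs at all six positions,
# so any single sequencing error can still be assigned unambiguously.
BARCODES = {
    "kernel": "ACGTAC",
    "anaerobic_seedling": "TGCATG",
    "root": "GATCGA",
}

if __name__ == "__main__":
    print(decode("ACGTAC", BARCODES))  # exact match
    print(decode("ACGTAA", BARCODES))  # one mismatch, still assigned
    print(decode("AAAAAA", BARCODES))  # too far from every code
```

    Tags that cannot be assigned are returned as None rather than guessed, which matches the abstract's report that most, but not all, ESTs could be traced to a source pool.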

  15. Complete Coding Sequence of Usutu Virus Strain Gracula religiosa/U1609393/Belgium/2016 Obtained from the Brain Tissue of an Infected Captive Common Hill Myna (Gracula religiosa)

    PubMed Central

    Lambrecht, Bénédicte; Vandenbussche, Frank; Steensels, Mieke

    2017-01-01

    The complete and annotated coding sequence and partial noncoding sequence of an Usutu virus genome were sequenced from RNA extracted from a clinical brain tissue sample obtained from a common hill myna (Gracula religiosa), demonstrating close homology with Usutu viruses circulating in Europe. PMID:28336592

  16. An ultra-sparse code underlies the generation of neural sequences in a songbird

    NASA Astrophysics Data System (ADS)

    Hahnloser, Richard H. R.; Kozhevnikov, Alexay A.; Fee, Michale S.

    2002-09-01

    Sequences of motor activity are encoded in many vertebrate brains by complex spatio-temporal patterns of neural activity; however, the neural circuit mechanisms underlying the generation of these pre-motor patterns are poorly understood. In songbirds, one prominent site of pre-motor activity is the forebrain robust nucleus of the archistriatum (RA), which generates stereotyped sequences of spike bursts during song and recapitulates these sequences during sleep. We show that the stereotyped sequences in RA are driven from nucleus HVC (high vocal centre), the principal pre-motor input to RA. Recordings of identified HVC neurons in sleeping and singing birds show that individual HVC neurons projecting onto RA neurons produce bursts sparsely, at a single, precise time during the RA sequence. These HVC neurons burst sequentially with respect to one another. We suggest that at each time in the RA sequence, the ensemble of active RA neurons is driven by a subpopulation of RA-projecting HVC neurons that is active only at that time. As a population, these HVC neurons may form an explicit representation of time in the sequence. Such a sparse representation, a temporal analogue of the `grandmother cell' concept for object recognition, eliminates the problem of temporal interference during sequence generation and learning attributed to more distributed representations.

  17. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    PubMed

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-11-20

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
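
    The N50 statistic mentioned above is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch follows; the function name and the example lengths are illustrative and are not taken from the marmoset assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("N50 is undefined for an empty assembly")


if __name__ == "__main__":
    # Made-up contig lengths (arbitrary units); total is 32, so the two
    # longest contigs (8 + 8 = 16) already reach half of the assembly.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

    Because N50 depends only on the length distribution, merging unplaced short contigs into longer chromosomal ones raises it even when the total number of bases is unchanged.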

  18. Next-Generation Sequencing of Protein-Coding and Long Non-protein-Coding RNAs in Two Types of Exosomes Derived from Human Whole Saliva.

    PubMed

    Ogawa, Yuko; Tsujimoto, Masafumi; Yanoshita, Ryohei

    2016-01-01

    Exosomes are small extracellular vesicles containing microRNAs and mRNAs that are produced by various types of cells. We previously used ultrafiltration and size-exclusion chromatography to isolate two types of human salivary exosomes (exosomes I, II) that are different in size and proteomes. We showed that salivary exosomes contain large repertoires of small RNAs. However, precise information regarding long RNAs in salivary exosomes has not been fully determined. In this study, we investigated the compositions of protein-coding RNAs (pcRNAs) and long non-protein-coding RNAs (lncRNAs) of exosome I, exosome II and whole saliva (WS) by next-generation sequencing technology. Although 11% of all RNAs were commonly detected among the three samples, the compositions of reads mapping to known RNAs were similar. The most abundant pcRNA is ribosomal RNA protein, and pcRNAs of some salivary proteins such as S100 calcium-binding protein A8 (protein S100-A8) were present in salivary exosomes. Interestingly, lncRNAs of pseudogenes (presumably, processed pseudogenes) were abundant in exosome I, exosome II and WS. Translationally controlled tumor protein gene, which plays an important role in cell proliferation, cell death and immune responses, was highly expressed as pcRNA and pseudogenes in salivary exosomes. Our results show that salivary exosomes contain various types of RNAs such as pseudogenes and small RNAs, and may mediate intercellular communication by transferring these RNAs to target cells as gene expression regulators.

  19. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    PubMed Central

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-01-01

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260+1604A>G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations. PMID:27999334

  20. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene.

    PubMed

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-12-17

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260+1604A>G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  1. HLA-F coding and regulatory segments variability determined by massively parallel sequencing procedures in a Brazilian population sample.

    PubMed

    Lima, Thálitta Hetamaro Ayala; Buttura, Renato Vidal; Donadi, Eduardo Antônio; Veiga-Castelli, Luciana Caricati; Mendes-Junior, Celso Teixeira; Castelli, Erick C

    2016-10-01

    Human Leucocyte Antigen F (HLA-F) is a non-classical HLA class I gene distinguished from its classical counterparts by low allelic polymorphism and distinctive expression patterns. Its exact function remains unknown. It is believed that HLA-F has tolerogenic and immune modulatory properties. Currently, there is little information regarding the HLA-F allelic variation among human populations, and the available studies have evaluated only a fraction of the HLA-F gene segment and/or have searched for known alleles only. Here we present a strategy to evaluate the complete HLA-F variability, including its 5' upstream, coding and 3' downstream segments, by using massively parallel sequencing procedures. HLA-F variability was surveyed in 196 individuals from the Brazilian Southeast. The results indicate that the HLA-F gene is indeed conserved at the protein level, where thirty coding haplotypes or coding alleles were detected, encoding only four different HLA-F full-length protein molecules. Moreover, the same protein molecule is encoded by 82.45% of all coding alleles detected in this Brazilian population sample. However, the HLA-F nucleotide and haplotype variability is much higher than previously recognized, both in Brazilians and in the 1000 Genomes Project data. This protein conservation is probably a consequence of the key role of HLA-F in the immune system physiology.

  2. Incomplete periacetabular acetabuloplasty

    PubMed Central

    2014-01-01

    Background Residual acetabular dysplasia is one of the most common complications after treatment for developmental dysplasia of the hip. The acetabular growth response after reduction of a dislocated hip varies. The options are to wait and add a redirectional osteotomy as a secondary procedure at an older age, or to perform a primary acetabuloplasty at the time of the open reduction to stimulate acetabular development. We present the early results of such a procedure—open reduction and an incomplete periacetabular acetabuloplasty—as a one-stop procedure for developmental dysplasia of the hip. Patients and methods We retrospectively reviewed the results obtained with 55 hips (in 48 patients, 43 of them girls) treated between September 2004 and February 2011. This cohort included late presentations and failures of nonoperative treatment and excluded unsuccessful previous surgical treatment (including closed reductions), neuromuscular disease, and other teratological conditions. Patients were treated once the ossific nucleus was present or when they reached one year of age. 31 cases were late presentations while 17 represented failures of nonoperative treatment. The mean age of the patients at surgery was 1.3 (0.6–2.6) years. The mean follow-up period was 4 (2–8) years. According to the IHDI classification, 1 was grade I, 9 were grade II, 13 were grade III, and 32 were grade IV. Results The mean acetabular index fell from 38 (23–49) preoperatively to 21 (10–27) at the last follow-up. There were no infections, nerve palsies, or graft extrusions. None of the cases required secondary surgery for residual acetabular dysplasia. 8 patients developed avascular necrosis (AVN) of grade II or more. The incidence of AVN was significantly associated with previous, failed nonoperative treatment. 1 patient developed coxa magna requiring shelf arthroplasty 4 years after the index procedure and 1 patient with lateral growth arrest required medial screw epiphysiodesis

  3. Beta-glucosidase coding sequences and protein from Orpinomyces PC-2

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong; Ximenes, Eduardo A.

    2001-02-06

    Provided is a novel β-glucosidase from Orpinomyces sp. PC2, nucleotide sequences encoding the mature protein and the precursor protein, and methods for recombinant production of this β-glucosidase.

  4. Functional Divergence of APETALA1 and FRUITFULL is due to Changes in both Regulation and Coding Sequence

    PubMed Central

    McCarthy, Elizabeth W.; Mohamed, Abeer; Litt, Amy

    2015-01-01

    Gene duplications are prevalent in plants, and functional divergence subsequent to duplication may be linked with the occurrence of novel phenotypes in plant evolution. Here, we examine the functional divergence of Arabidopsis thaliana APETALA1 (AP1) and FRUITFULL (FUL), which arose via a duplication correlated with the origin of the core eudicots. Both AP1 and FUL play a role in floral meristem identity, but AP1 is required for the formation of sepals and petals whereas FUL is involved in cauline leaf and fruit development. AP1 and FUL are expressed in mutually exclusive domains but also differ in sequence, with unique conserved motifs in the C-terminal domains of the proteins that suggest functional differentiation. To determine whether the functional divergence of AP1 and FUL is due to changes in regulation or changes in coding sequence, we performed promoter swap experiments, in which FUL was expressed in the AP1 domain in the ap1 mutant and vice versa. Our results show that FUL can partially substitute for AP1, and AP1 can partially substitute for FUL; thus, the functional divergence between AP1 and FUL is due to changes in both regulation and coding sequence. We also mutated AP1 and FUL conserved motifs to determine if they are required for protein function and tested the ability of these mutated proteins to interact in yeast with known partners. We found that these motifs appear to play at best a minor role in protein function and dimerization capability, despite being strongly conserved. Our results suggest that the functional differentiation of these two paralogous key transcriptional regulators involves both differences in regulation and in sequence; however, sequence changes in the form of unique conserved motifs do not explain the differences observed. PMID:26697035

  5. The complete coding region sequence of river buffalo (Bubalus bubalis) SRY gene.

    PubMed

    Parma, Pietro; Feligini, Maria; Greppi, Gianfranco; Enne, Giuseppe

    2004-02-01

    The Y-linked SRY gene is responsible for testis determination in mammals. Mutations in this gene can lead to XY Gonadal Dysgenesis, an abnormal sexual phenotype described in humans, cattle, horses and river buffalo. We report here the complete river buffalo SRY sequence in order to enable the genetic diagnosis of this disease. The SRY sequence was also used to confirm the evolutionary divergence time between cattle and river buffalo 10 million years ago.

  6. The PRC2-binding long non-coding RNAs in human and mouse genomes are associated with predictive sequence features

    PubMed Central

    Tu, Shiqi; Yuan, Guo-Cheng; Shao, Zhen

    2017-01-01

    Recently, long non-coding RNAs (lncRNAs) have emerged as an important class of molecules involved in many cellular processes. One of their primary functions is to shape the epigenetic landscape through interactions with chromatin modifying proteins. However, mechanisms contributing to the specificity of such interactions remain poorly understood. Here we took the human and mouse lncRNAs that were experimentally determined to have physical interactions with Polycomb repressive complex 2 (PRC2), and systematically investigated the sequence features of these lncRNAs by developing a new computational pipeline for sequence composition analysis, in which each sequence is considered as a series of transitions between adjacent nucleotides. Through that, PRC2-binding lncRNAs were found to be associated with a set of distinctive and evolutionarily conserved sequence features, which can be utilized to distinguish them from the others with considerable accuracy. We further identified fragments of PRC2-binding lncRNAs that are enriched with these sequence features, and found they show strong PRC2-binding signals and are more highly conserved across species than the other parts, implying their functional importance. PMID:28139710
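
    Treating a sequence as a series of transitions between adjacent nucleotides can be illustrated with a simple feature extractor. The sketch below is an assumption about what such a representation might look like (relative frequencies of the 16 possible adjacent pairs); it is not the pipeline described in the abstract, and the function name is hypothetical.

```python
from collections import Counter

NUCLEOTIDES = "ACGU"


def transition_frequencies(seq):
    """Return the relative frequency of each adjacent-nucleotide transition."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA spelling
    # Keep only pairs in which both symbols are unambiguous nucleotides.
    pairs = [
        (a, b)
        for a, b in zip(seq, seq[1:])
        if a in NUCLEOTIDES and b in NUCLEOTIDES
    ]
    counts = Counter(pairs)
    total = len(pairs)
    return {
        (a, b): (counts[(a, b)] / total if total else 0.0)
        for a in NUCLEOTIDES
        for b in NUCLEOTIDES
    }


features = transition_frequencies("ACGACG")
print(features[("A", "C")], features[("G", "A")])
```

    The resulting 16-value vector could then be fed to any standard classifier. How the published method weights or selects these features is not stated in the abstract.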

  7. The PRC2-binding long non-coding RNAs in human and mouse genomes are associated with predictive sequence features

    NASA Astrophysics Data System (ADS)

    Tu, Shiqi; Yuan, Guo-Cheng; Shao, Zhen

    2017-01-01

    Recently, long non-coding RNAs (lncRNAs) have emerged as an important class of molecules involved in many cellular processes. One of their primary functions is to shape the epigenetic landscape through interactions with chromatin modifying proteins. However, mechanisms contributing to the specificity of such interactions remain poorly understood. Here we took the human and mouse lncRNAs that were experimentally determined to have physical interactions with Polycomb repressive complex 2 (PRC2), and systematically investigated the sequence features of these lncRNAs by developing a new computational pipeline for sequence composition analysis, in which each sequence is considered as a series of transitions between adjacent nucleotides. Through that, PRC2-binding lncRNAs were found to be associated with a set of distinctive and evolutionarily conserved sequence features, which can be utilized to distinguish them from the others with considerable accuracy. We further identified fragments of PRC2-binding lncRNAs that are enriched with these sequence features, and found they show strong PRC2-binding signals and are more highly conserved across species than the other parts, implying their functional importance.

  8. The bioinformatics of nucleotide sequence coding for proteins requiring metal coenzymes and proteins embedded with metals

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Cheung, E.; Holden, T.; Sullivan, R.; Nguyen, A.; Lieberman, D.; Cheung, T.

    2015-09-01

    All metallo-proteins need post-translation metal incorporation. In fact, the isotope ratios of Fe, Cu, and Zn in physiology and oncology have emerged as an important tool. The nickel-containing F430 is the prosthetic group of the enzyme methyl coenzyme M reductase, which catalyzes the release of methane in the final step of methanogenesis, a prime energy metabolism candidate for life-exploration space missions in the solar system. The 3.5 Gyr early life sulfite reductase as a life switch energy metabolism had Fe-Mo clusters. The nitrogenase for nitrogen fixation 3 billion years ago had Mo. The early life arsenite oxidase needed for anoxygenic photosynthesis energy metabolism 2.8 billion years ago had Mo and Fe. The selection pressure in metal incorporation inside a protein would be quantifiable in terms of the related nucleotide sequence complexity with fractal dimension and entropy values. A simulation model showed that the studied metal-required energy metabolism sequences had at least ten times more selection pressure in comparison to the horizontally transferred sequences in Mealybug, guided by the outcome histogram of the correlation R-sq values. The metal energy metabolism sequence group was compared to the circadian clock KaiC sequence group using magnesium atomic level bond shifting mechanism in the protein, and the simulation model would suggest a much higher selection pressure for the energy life switch sequence group. The possibility of using Kepler 444 as an example of ancient life in the Galaxy with the associated exoplanets has been proposed and is further discussed in this report. Examples of arsenic metal bonding shift probed by Synchrotron-based X-ray spectroscopy data and Zn controlled FOXP2 regulated pathways in human and chimp brain studied tissue samples are studied in relationship to the sequence bioinformatics. The analysis results suggest that relatively large metal bonding shift amount is associated with low probability correlation R

  9. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    PubMed

    Gordon, Kacy L; Arthur, Robert K; Ruvinsky, Ilya

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  10. Short Time-Scale Sensory Coding in S1 during Discrimination of Whisker Vibrotactile Sequences

    PubMed Central

    Miyashita, Toshio; Lee, Daniel J.; Smith, Katherine A.; Feldman, Daniel E.

    2016-01-01

    Rodent whisker input consists of dense microvibration sequences that are often temporally integrated for perceptual discrimination. Whether primary somatosensory cortex (S1) participates in temporal integration is unknown. We trained rats to discriminate whisker impulse sequences that varied in single-impulse kinematics (5–20-ms time scale) and mean speed (150-ms time scale). Rats appeared to use the integrated feature, mean speed, to guide discrimination in this task, consistent with similar prior studies. Despite this, 52% of S1 units, including 73% of units in L4 and L2/3, encoded sequences at fast time scales (≤20 ms, mostly 5–10 ms), accurately reflecting single impulse kinematics. 17% of units, mostly in L5, showed weaker impulse responses and a slow firing rate increase during sequences. However, these units did not effectively integrate whisker impulses, but instead combined weak impulse responses with a distinct, slow signal correlated to behavioral choice. A neural decoder could identify sequences from fast unit spike trains and behavioral choice from slow units. Thus, S1 encoded fast time scale whisker input without substantial temporal integration across whisker impulses. PMID:27574970

  11. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  12. Next-gen sequencing identifies non-coding variation disrupting miRNA-binding sites in neurological disorders.

    PubMed

    Devanna, P; Chen, X S; Ho, J; Gajewski, D; Smith, S D; Gialluisi, A; Francks, C; Fisher, S E; Newbury, D F; Vernes, S C

    2017-03-14

    Understanding the genetic factors underlying neurodevelopmental and neuropsychiatric disorders is a major challenge given their prevalence and potential severity for quality of life. While large-scale genomic screens have made major advances in this area, for many disorders the genetic underpinnings are complex and poorly understood. To date the field has focused predominantly on protein coding variation, but given the importance of tightly controlled gene expression for normal brain development and disorder, variation that affects non-coding regulatory regions of the genome is likely to play an important role in these phenotypes. Herein we show the importance of 3 prime untranslated region (3'UTR) non-coding regulatory variants across neurodevelopmental and neuropsychiatric disorders. We devised a pipeline for identifying and functionally validating putatively pathogenic variants from next generation sequencing (NGS) data. We applied this pipeline to a cohort of children with severe specific language impairment (SLI) and identified a functional, SLI-associated variant affecting gene regulation in cells and post-mortem human brain. This variant and the affected gene (ARHGEF39) represent new putative risk factors for SLI. Furthermore, we identified 3'UTR regulatory variants across autism, schizophrenia and bipolar disorder NGS cohorts demonstrating their impact on neurodevelopmental and neuropsychiatric disorders. Our findings show the importance of investigating non-coding regulatory variants when determining risk factors contributing to neurodevelopmental and neuropsychiatric disorders. In the future, integration of such regulatory variation with protein coding changes will be essential for uncovering the genetic causes of complex neurological disorders and the fundamental mechanisms underlying health and disease. Molecular Psychiatry advance online publication, 14 March 2017; doi:10.1038/mp.2017.30.

  13. Cracking the Code of Human Diseases Using Next-Generation Sequencing: Applications, Challenges, and Perspectives

    PubMed Central

    Precone, Vincenza; Del Monaco, Valentina; Esposito, Maria Valeria; De Palma, Fatima Domenica Elisa; Ruocco, Anna; D'Argenio, Valeria

    2015-01-01

    Next-generation sequencing (NGS) technologies have greatly impacted every field of molecular research, mainly because they reduce costs and increase throughput of DNA sequencing. These features, together with the technology's flexibility, have opened the way to a variety of applications including the study of the molecular basis of human diseases. Several analytical approaches have been developed to selectively enrich regions of interest from the whole genome in order to identify germinal and/or somatic sequence variants and to study DNA methylation. These approaches are now widely used in research, and they are already being used in routine molecular diagnostics. However, some issues are still controversial, namely, standardization of methods, data analysis and storage, and ethical aspects. Besides providing an overview of the NGS-based approaches most frequently used to study the molecular basis of human diseases at DNA level, we discuss the principal challenges and applications of NGS in the field of human genomics. PMID:26665001

  14. Quantifying unobserved protein-coding variants in human populations provides a roadmap for large-scale sequencing projects

    PubMed Central

    Zou, James; Valiant, Gregory; Valiant, Paul; Karczewski, Konrad; Chan, Siu On; Samocha, Kaitlin; Lek, Monkol; Sunyaev, Shamil; Daly, Mark; MacArthur, Daniel G.

    2016-01-01

    As new proposals aim to sequence ever larger collections of humans, it is critical to have a quantitative framework to evaluate the statistical power of these projects. We developed a new algorithm, UnseenEst, and applied it to the exomes of 60,706 individuals to estimate the frequency distribution of all protein-coding variants, including rare variants that have not been observed yet in the current cohorts. Our results quantified the number of new variants that we expect to identify as sequencing cohorts reach hundreds of thousands of individuals. With 500K individuals, we find that we expect to capture 7.5% of all possible loss-of-function variants and 12% of all possible missense variants. We also estimate that 2,900 genes have loss-of-function frequency of <0.00001 in healthy humans, consistent with very strong intolerance to gene inactivation. PMID:27796292

  15. Analysis of mutations in the entire coding sequence of the factor VIII gene

    SciTech Connect

    Bidichadani, S.I.; Lanyon, W.G.; Connor, J.M.

    1994-09-01

    Hemophilia A is a common X-linked recessive disorder of bleeding caused by deleterious mutations in the gene for clotting factor VIII. The large size of the factor VIII gene, the high frequency of de novo mutations and its tissue-specific expression complicate the detection of mutations. We have used a combination of RT-PCR of ectopic factor VIII transcripts and genomic DNA-PCRs to amplify the entire essential sequence of the factor VIII gene. This is followed by chemical mismatch cleavage analysis and direct sequencing in order to facilitate a comprehensive search for mutations. We describe the characterization of nine potentially pathogenic mutations, six of which are novel. In each case, a correlation of the genotype with the observed phenotype is presented. In order to evaluate the pathogenicity of the five missense mutations detected, we have analyzed them for evolutionary sequence conservation and for their involvement of sequence motifs catalogued in the PROSITE database of protein sites and patterns.

  16. Molecular phylogenetic analysis in Hammondia-like organisms based on partial Hsp70 coding sequences

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The 70-kDa heat shock proteins (Hsp70) are considered among the most conserved proteins in all domains of life, from Archaea to eukaryotes. Hammondia heydorni, H. hammondi, Toxoplasma gondii, Neospora hughesi and N. caninum (Hammondia-like organisms) are closely related tissue cyst-forming c...

  17. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is the core of biological activities, and significant attention has been given to identifying the various factors contributing to gene expression at genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  18. Polarity effects in the hisG gene of Salmonella require a site within the coding sequence.

    PubMed

    Ciampi, M S; Roth, J R

    1988-02-01

    A single site in the middle of the coding sequence of the hisG gene of Salmonella is required for most of the polar effect of mutations in this gene. Nonsense and insertion mutations mapping upstream of this point in the hisG gene all have strong polar effects on expression of downstream genes in the operon; mutations mapping promoter-distal to this site have little or no polar effect. Two previously known hisG mutations, mapping in the region of the polarity site, abolish the polarity effect of insertion mutations mapping upstream of this region. New polarity site mutations have been selected which have lost the polar effect of upstream nonsense mutations. All mutations abolishing the function of the site are small deletions; three are identical 28-bp deletions which have arisen independently. A fourth mutation is a deletion of 16 base pairs internal to the larger deletion. Several point mutations within this 16-bp region have no effect on the function of the polarity site. We believe that a small number of polarity sites of this type are responsible for polarity in all genes. The site in the hisG gene is more easily detected than most because it appears to be the only such site in the hisG gene and because it maps in the center of the coding sequence.

  19. The genomic nucleotide sequences of two differentially expressed actin-coding genes from the sea star Pisaster ochraceus.

    PubMed

    Kowbel, D J; Smith, M J

    1989-04-30

    The genomic sequences of two differentially expressed actin genes from the sea star Pisaster ochraceus are reported. The cytoplasmic actin gene (Cy) is expressed in eggs and early development. The muscle actin gene (M) is expressed in tube feet and testes. Both genes contain an 1125-nucleotide coding region interrupted by three introns at codons 41, 121 and 204. Gene M contains two additional introns at codons 150 and 267. The intron position at codon 150, although present in higher vertebrate actins, has not been reported in actin genes from invertebrates. The M gene coding region has 89.5% nucleotide homology to the Cy gene, and differs from the Cy actin gene in 13 of 375 amino acids (aa), 11 of which are found in the C-terminal half of the gene. The C-terminal half of the M gene contains a significant number of muscle isotype codons. Even though there is only 1 aa change in the first 150 codons, there have been limited substitutions at many four-fold degenerate sites which may indicate selection pressure upon the secondary structure of the mRNA and/or a biased codon usage. Variant CCAAT, TATA, and poly(A)-addition signals have been identified in the 5' and 3' flanking regions. The presence of 5' and 3' splice junction sequences in the 5' flanking region of the Cy gene suggests the potential for an intron there.

  20. Exome-wide association analysis reveals novel coding sequence variants associated with lipid traits in Chinese

    PubMed Central

    Tang, Clara S.; Zhang, He; Cheung, Chloe Y. Y.; Xu, Ming; Ho, Jenny C. Y.; Zhou, Wei; Cherny, Stacey S.; Zhang, Yan; Holmen, Oddgeir; Au, Ka-Wing; Yu, Haiyi; Xu, Lin; Jia, Jia; Porsch, Robert M.; Sun, Lijie; Xu, Weixian; Zheng, Huiping; Wong, Lai-Yung; Mu, Yiming; Dou, Jingtao; Fong, Carol H. Y.; Wang, Shuyu; Hong, Xueyu; Dong, Liguang; Liao, Yanhua; Wang, Jiansong; Lam, Levina S. M.; Su, Xi; Yan, Hua; Yang, Min-Lee; Chen, Jin; Siu, Chung-Wah; Xie, Gaoqiang; Woo, Yu-Cho; Wu, Yangfeng; Tan, Kathryn C. B.; Hveem, Kristian; Cheung, Bernard M. Y.; Zöllner, Sebastian; Xu, Aimin; Eugene Chen, Y; Jiang, Chao Qiang; Zhang, Youyi; Lam, Tai-Hing; Ganesh, Santhi K.; Huo, Yong; Sham, Pak C.; Lam, Karen S. L.; Willer, Cristen J.; Tse, Hung-Fat; Gao, Wei

    2015-01-01

    Blood lipids are important risk factors for coronary artery disease (CAD). Here we perform an exome-wide association study by genotyping 12,685 Chinese, using a custom Illumina HumanExome BeadChip, to identify additional loci influencing lipid levels. Single-variant association analysis on 65,671 single nucleotide polymorphisms reveals 19 loci associated with lipids at exome-wide significance (P < 2.69 × 10⁻⁷), including three Asian-specific coding variants in known genes (CETP p.Asp459Gly, PCSK9 p.Arg93Cys and LDLR p.Arg257Trp). Furthermore, missense variants at two novel loci—PNPLA3 p.Ile148Met and PKD1L3 p.Thr429Ser—also influence levels of triglycerides and low-density lipoprotein cholesterol, respectively. Another novel gene, TEAD2, is found to be associated with high-density lipoprotein cholesterol through gene-based association analysis. Most of these newly identified coding variants show suggestive association (P < 0.05) with CAD. These findings demonstrate that exome-wide genotyping on samples of non-European ancestry can identify additional population-specific possible causal variants, shedding light on novel lipid biology and CAD. PMID:26690388

  1. Mutation-selection models of coding sequence evolution with site-heterogeneous amino acid fitness profiles

    PubMed Central

    Rodrigue, Nicolas; Philippe, Hervé; Lartillot, Nicolas

    2010-01-01

    Modeling the interplay between mutation and selection at the molecular level is key to evolutionary studies. To this end, codon-based evolutionary models have been proposed as pertinent means of studying long-range evolutionary patterns and are widely used. However, these approaches have not yet consolidated results from amino acid level phylogenetic studies showing that selection acting on proteins displays strong site-specific effects, which translate into heterogeneous amino acid propensities across the columns of alignments; related codon-level studies have instead focused on either modeling a single selective context for all codon columns, or a separate selective context for each codon column, with the former strategy deemed too simplistic and the latter deemed overparameterized. Here, we integrate recent developments in nonparametric statistical approaches to propose a probabilistic model that accounts for the heterogeneity of amino acid fitness profiles across the coding positions of a gene. We apply the model to a dozen real protein-coding gene alignments and find it to produce biologically plausible inferences, for instance, as pertaining to site-specific amino acid constraints, as well as distributions of scaled selection coefficients. In their account of mutational features as well as the heterogeneous regimes of selection at the amino acid level, the modeling approaches studied here can form a backdrop for several extensions, accounting for other selective features, for variable population size, or for subtleties of mutational features, all with parameterizations couched within population-genetic theory. PMID:20176949
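
    Mutation-selection codon models of this kind are commonly written so that the substitution rate between two codons is the mutation rate multiplied by a relative fixation factor S / (1 - e^(-S)), where S is the scaled selection coefficient (the difference in scaled fitness between the two encoded amino acids). The sketch below assumes that commonly used form; the formula is not given in the abstract, and the function name is hypothetical.

```python
import math


def relative_fixation_factor(s):
    """Return S / (1 - exp(-S)), the assumed fixation factor for scaled coefficient S."""
    if abs(s) < 1e-12:
        return 1.0  # neutral limit: the factor tends to 1 as S -> 0
    # -expm1(-s) equals 1 - exp(-s) and stays accurate for small |s|.
    return s / -math.expm1(-s)


for s in (-2.0, 0.0, 2.0):
    print(s, relative_fixation_factor(s))
```

    Under this form, favorable changes (S > 0) are accelerated relative to neutrality and unfavorable ones (S < 0) are slowed, and the ratio of the forward to the reverse factor equals e^S.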

  2. A unified mathematical framework for coding time, space, and sequences in the hippocampal region.

    PubMed

    Howard, Marc W; MacDonald, Christopher J; Tiganj, Zoran; Shankar, Karthik H; Du, Qian; Hasselmo, Michael E; Eichenbaum, Howard

    2014-03-26

    The medial temporal lobe (MTL) is believed to support episodic memory, vivid recollection of a specific event situated in a particular place at a particular time. There is ample neurophysiological evidence that the MTL computes location in allocentric space and more recent evidence that the MTL also codes for time. Space and time represent a similar computational challenge; both are variables that cannot be simply calculated from the immediately available sensory information. We introduce a simple mathematical framework that computes functions of both spatial location and time as special cases of a more general computation. In this framework, experience unfolding in time is encoded via a set of leaky integrators. These leaky integrators encode the Laplace transform of their input. The information contained in the transform can be recovered using an approximation to the inverse Laplace transform. In the temporal domain, the resulting representation reconstructs the temporal history. By integrating movements, the equations give rise to a representation of the path taken to arrive at the present location. By modulating the transform with information about allocentric velocity, the equations code for position of a landmark. Simulated cells show a close correspondence to neurons observed in various regions for all three cases. In the temporal domain, novel secondary analyses of hippocampal time cells verified several qualitative predictions of the model. An integrated representation of spatiotemporal context can be computed by taking conjunctions of these elemental inputs, leading to a correspondence with conjunctive neural representations observed in dorsal CA1.

  3. A Unified Mathematical Framework for Coding Time, Space, and Sequences in the Hippocampal Region

    PubMed Central

    MacDonald, Christopher J.; Tiganj, Zoran; Shankar, Karthik H.; Du, Qian; Hasselmo, Michael E.; Eichenbaum, Howard

    2014-01-01

    The medial temporal lobe (MTL) is believed to support episodic memory, vivid recollection of a specific event situated in a particular place at a particular time. There is ample neurophysiological evidence that the MTL computes location in allocentric space and more recent evidence that the MTL also codes for time. Space and time represent a similar computational challenge; both are variables that cannot be simply calculated from the immediately available sensory information. We introduce a simple mathematical framework that computes functions of both spatial location and time as special cases of a more general computation. In this framework, experience unfolding in time is encoded via a set of leaky integrators. These leaky integrators encode the Laplace transform of their input. The information contained in the transform can be recovered using an approximation to the inverse Laplace transform. In the temporal domain, the resulting representation reconstructs the temporal history. By integrating movements, the equations give rise to a representation of the path taken to arrive at the present location. By modulating the transform with information about allocentric velocity, the equations code for position of a landmark. Simulated cells show a close correspondence to neurons observed in various regions for all three cases. In the temporal domain, novel secondary analyses of hippocampal time cells verified several qualitative predictions of the model. An integrated representation of spatiotemporal context can be computed by taking conjunctions of these elemental inputs, leading to a correspondence with conjunctive neural representations observed in dorsal CA1. PMID:24672015
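
    The abstract's claim that a bank of leaky integrators encodes the Laplace transform of its input can be illustrated with a minimal sketch. It assumes a generic first-order leaky integrator, dF/dt = -s*F + f(t), which is not necessarily the exact form used by the authors; the function names, the pulse input, and the decay rates below are hypothetical choices made only for illustration.

```python
import math

def run_leaky_integrators(signal, rates, dt):
    """Integrate dF/dt = -s*F + f(t) for each decay rate s, one step per sample."""
    state = [0.0] * len(rates)
    for f_t in signal:
        # Exact exponential decay over one time step, then add the new input sample.
        state = [F * math.exp(-s * dt) + f_t * dt for F, s in zip(state, rates)]
    return state

def direct_laplace_of_history(signal, s, dt):
    """Riemann sum of f(tau) * exp(-s * (t_now - tau)) over the stored history."""
    last = len(signal) - 1
    return sum(f * math.exp(-s * (last - k) * dt) * dt for k, f in enumerate(signal))

dt = 0.01
signal = [1.0] * 50 + [0.0] * 450   # hypothetical input: brief pulse, then silence
rates = [0.5, 1.0, 2.0, 4.0]        # hypothetical decay rates s

# Online state of the integrators versus the transform computed from the full history.
online = run_leaky_integrators(signal, rates, dt)
offline = [direct_laplace_of_history(signal, s, dt) for s in rates]
```

    In this sketch the integrators never store the past input, yet their current state matches the exponentially weighted sum over the whole history, which is the sense in which they hold a (real-valued) Laplace transform of what came before.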

  4. Acoustic radiation force impulse (ARFI) imaging of zebrafish embryo by high-frequency coded excitation sequence.

    PubMed

    Park, Jinhyoung; Lee, Jungwoo; Lau, Sien Ting; Lee, Changyang; Huang, Ying; Lien, Ching-Ling; Kirk Shung, K

    2012-04-01

    Acoustic radiation force impulse (ARFI) imaging has been developed as a non-invasive method for quantitative illustration of tissue stiffness or displacement. Conventional ARFI imaging (2-10 MHz) has been implemented in commercial scanners for illustrating elastic properties of several organs. The image resolution, however, is too coarse to study mechanical properties of micro-sized objects such as cells. This article thus presents a high-frequency coded excitation ARFI technique, with the ultimate goal of displaying elastic characteristics of cellular structures. Tissue mimicking phantoms and zebrafish embryos are imaged with a 100-MHz lithium niobate (LiNbO₃) transducer, by cross-correlating tracked RF echoes with the reference. The phantom results show that the contrast of the ARFI image (14 dB) with coded excitation is better than that of the conventional ARFI image (9 dB). The depths of penetration are 2.6 and 2.2 mm, respectively. The stiffness data of the zebrafish demonstrate that the envelope is harder than the embryo region. The temporal displacement change at the embryo and the chorion is as large as 36 and 3.6 μm, respectively. Consequently, this high-frequency ARFI approach may serve as a remote palpation imaging tool that reveals viscoelastic properties of small biological samples.

  5. Natural image sequences constrain dynamic receptive fields and imply a sparse code.

    PubMed

    Häusler, Chris; Susemihl, Alex; Nawrot, Martin P

    2013-11-06

    In their natural environment, animals experience a complex and dynamic visual scenery. Under such natural stimulus conditions, neurons in the visual cortex employ a spatially and temporally sparse code. For the input scenario of natural still images, previous work demonstrated that unsupervised feature learning combined with the constraint of sparse coding can predict physiologically measured receptive fields of simple cells in the primary visual cortex. This convincingly indicated that the mammalian visual system is adapted to the natural spatial input statistics. Here, we extend this approach to the time domain in order to predict dynamic receptive fields that can account for both spatial and temporal sparse activation in biological neurons. We rely on temporal restricted Boltzmann machines and suggest a novel temporal autoencoding training procedure. When tested on a dynamic multi-variate benchmark dataset this method outperformed existing models of this class. Learning features on a large dataset of natural movies allowed us to model spatio-temporal receptive fields for single neurons. They resemble temporally smooth transformations of previously obtained static receptive fields and are thus consistent with existing theories. A neuronal spike response model demonstrates how the dynamic receptive field facilitates temporal and population sparseness. We discuss the potential mechanisms and benefits of a spatially and temporally sparse representation of natural visual input.

  6. Composition and phylogenetic analysis of vitellogenin coding sequences in the Indonesian coelacanth Latimeria menadoensis.

    PubMed

    Canapa, Adriana; Olmo, Ettore; Forconi, Mariko; Pallavicini, Alberto; Makapedua, Monica Daisy; Biscotti, Maria Assunta; Barucca, Marco

    2012-07-01

    The coelacanth Latimeria menadoensis, a living fossil, occupies a key phylogenetic position to explore the changes that have affected the genomes of the aquatic vertebrates that colonized dry land. This is the first study to isolate and analyze L. menadoensis mRNA. Three different vitellogenin transcripts were identified and their inferred amino acid sequences compared to those of other known vertebrates. The phylogenetic data suggest that the evolutionary history of this gene family in coelacanths was characterized by a different duplication event than those which occurred in teleosts, amniotes, and amphibia. Comparison of the three sequences highlighted differences in functional sites. Moreover, despite the presence of conserved sites compared with the other oviparous vertebrates, some sites were seen to have changed, others to be similar only to those of teleosts, and others still to resemble only those of tetrapods.

  7. Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences.

    PubMed

    Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

    2016-07-12

    Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions.

  8. Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences

    PubMed Central

    Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

    2016-01-01

    Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. PMID:27289096

  9. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway.

  10. Complete Coding Sequences of One H9 and Three H7 Low-Pathogenic Influenza Viruses Circulating in Wild Birds in Belgium, 2009 to 2012

    PubMed Central

    Rosseel, Toon; Marché, Sylvie; Steensels, Mieke; Vangeluwe, Didier; Linden, Annick; van den Berg, Thierry; Lambrecht, Bénédicte

    2016-01-01

    The complete coding sequences of four avian influenza A viruses (two H7N7, one H7N1, and one H9N2) circulating in wild waterfowl in Belgium from 2009 to 2012 were determined using Illumina sequencing. All viral genome segments represent viruses circulating in the Eurasian wild bird population. PMID:27284153

  11. Cloning, sequencing, and expression of the gene coding for an antigenic 120-kilodalton protein of Rickettsia conorii.

    PubMed Central

    Schuenke, K W; Walker, D H

    1994-01-01

    Several high-molecular-mass (above 100 kDa) antigens are recognized by sera from humans infected with spotted fever group rickettsiae and may be important stimulators of the host immune response. Molecular cloning techniques were used to make genomic Rickettsia conorii (Malish 7 strain) libraries in expression vector lambda gt11. The 120-kDa R. conorii antigen was identified by monospecific antibodies to the recombinant protein expressed on construct lambda 4-7. The entire gene DNA sequence was obtained by using this construct and two other overlapping constructs. An open reading frame of 3,068 bp with a calculated molecular mass of approximately 112 kDa was identified. Promoters and a ribosome-binding site were identified on the basis of their DNA sequence homology to other rickettsial genes and their relative positions in the sequence. The DNA coding region shares no significant homology with other spotted fever group rickettsial antigen genes (i.e., the R. rickettsii 190-, 135-, and 17-kDa antigen-encoding genes). The PCR technique was used to amplify the gene from eight species of spotted fever group rickettsiae. A 75-kDa portion of the 120-kDa antigen was overexpressed in and purified from Escherichia coli. This polypeptide was recognized by antirickettsial antibodies and may be a useful diagnostic reagent for spotted fever group rickettsioses. PMID:8112862

  12. Utility of selected non-coding chloroplast DNA sequences for lineage assessment of Musa interspecific hybrids.

    PubMed

    Swangpol, Sasivimon; Volkaert, Hugo; Sotto, Rachel C; Seelanan, Tosak

    2007-07-31

    Single-copy chloroplast loci are used widely to infer phylogenetic relationships at different taxonomic levels among various groups of plants. To test the utility of chloroplast loci and to provide additional data applicable to hybrid evolution in Musa, we sequenced two introns, rpl16 and ndhA, and two intergenic spacers, psaA-ycf3 and petA-psbJ-psbL-psbF, and combined these data. Using these four regions, Musa acuminata Colla (A)- and M. balbisiana Colla (B)-containing genomes were clearly distinguished. Some triploid interspecific hybrids contain A-type chloroplasts (the AAB/ABB) while others contain B-type chloroplasts (the BBA/BBB). The chloroplasts of all cultivars in the 'Namwa' (BBA) group came from the same wild maternal origin, but the specific parents are still unrevealed. Although the average sequence divergence in each region was low (less than 2%), we propose that the petA-psbJ intergenic spacer could be developed for diversity assessment within each genome. This segment contains three single nucleotide polymorphisms (SNPs) and two indels which could distinguish diversity within the A genome, whereas this same region also contains one SNP and an indel which could categorize the B genome. However, an inverted repeat region which could form a hairpin structure was detected in this spacer and was therefore omitted from the analyses because of its incongruence with other regions. Until it is thoroughly identified in other members of the Musaceae and the Zingiberales clade, caution is advised in using this inverted repeat as a phylogenetic marker in these taxa.

  13. The analysis of incomplete data.

    NASA Technical Reports Server (NTRS)

    Hartley, H. O.; Hocking, R. R.

    1971-01-01

    In this paper, we attempt to provide a simple taxonomy for incomplete-data problems and at the same time develop unified methods of analysis. The emphasis is on techniques which are natural extensions of the complete-data analysis and which will handle rather general classes of incomplete-data problems as opposed to custom-made techniques for special problems. The principle of estimation is either maximum likelihood or is at least based on maximum likelihood.

  14. Deconstruction of archaeal genome depict strategic consensus in core pathways coding sequence assembly.

    PubMed

    Pal, Ayon; Banerjee, Rachana; Mondal, Uttam K; Mukhopadhyay, Subhasis; Bothra, Asim K

    2015-01-01

    A comprehensive in silico analysis of 71 species representing the different taxonomic classes and physiological genre of the domain Archaea was performed. These organisms differed in their physiological attributes, particularly oxygen tolerance and energy metabolism. We explored the diversity and similarity in the codon usage pattern in the genes and genomes of these organisms, with emphasis on their core cellular pathways. Our thrust was to figure out whether there is any underlying similarity in the design of core pathways within these organisms. Analyses of codon utilization pattern, construction of hierarchical linear models of codon usage, expression pattern and codon pair preference pointed to the fact that, in the archaea, there is a trend towards biased use of synonymous codons in the core cellular pathways, and the Nc-plots appeared to display the physiological variations present within the different species. Our analyses revealed that aerobic species of archaea possessed a larger degree of freedom in regulating expression levels than could be accounted for by codon usage bias alone. This feature might be a consequence of their enhanced metabolic activities as a result of their adaptation to the relatively O2-rich environment. Species of archaea, which are related from the taxonomical viewpoint, were found to have striking similarities in their ORF structuring pattern. In the anaerobic species of archaea, codon bias was found to be a major determinant of gene expression. We have also detected a significant difference in the codon pair usage pattern between the whole genome and the genes related to vital cellular pathways, and it was not only species-specific but pathway-specific too. This hints towards the structuring of ORFs with better decoding accuracy during translation. Finally, a codon-pathway interaction in shaping the codon design of pathways was observed where the transcription pathway exhibited a significantly different coding frequency signature.

  15. Deletion of 5'-coding sequences of the cellular p53 gene in mouse erythroleukemia: a novel mechanism of oncogene regulation.

    PubMed Central

    Rovinski, B; Munroe, D; Peacock, J; Mowat, M; Bernstein, A; Benchimol, S

    1987-01-01

    The p53 gene is rearranged in an erythroleukemic cell line (DP15-2) transformed by Friend retrovirus. Here, we characterize the mutation and identify a deletion of approximately 3.0 kilobases that removes exon 2 coding sequences. The gene is expressed in DP15-2 cells and results in synthesis of a 44,000-dalton protein that is missing the N-terminal amino acid residues of p53. The truncated protein is unusually stable and accumulates to high levels intracellularly. Moreover, it appears to have undergone a change in conformation as revealed by epitope mapping studies. This study represents the first description of an altered p53 gene product arising by mutation during neoplastic progression and identifies a region in the p53 protein molecule that plays a role in determining p53 stability in vivo. PMID:3547084

  16. An Interpretation of the Ancestral Codon from Miller’s Amino Acids and Nucleotide Correlations in Modern Coding Sequences

    PubMed Central

    Carels, Nicolas; de Leon, Miguel Ponce

    2015-01-01

    Purine bias, which is usually referred to as an “ancestral codon”, is known to result in short-range correlations between nucleotides in coding sequences, and it is common in all species. We demonstrate that RWY is a more appropriate pattern than the classical RNY, and purine bias (Rrr) is the product of a network of nucleotide compensations induced by functional constraints on the physicochemical properties of proteins. Through deductions from universal correlation properties, we also demonstrate that amino acids from Miller’s spark discharge experiment are compatible with functional primeval proteins at the dawn of living cell radiation on earth. These amino acids match the hydropathy and secondary structures of modern proteins. PMID:25922573

  17. Phenotypes of murine leukemia virus-induced tumors: influence of 3' viral coding sequences.

    PubMed Central

    Ott, D E; Keller, J; Sill, K; Rein, A

    1992-01-01

    Murine leukemia viruses (MuLVs) induce leukemias and lymphomas in mice. We have used fluorescence-activated cell sorter analysis to determine the hematopoietic phenotypes of tumor cells induced by a number of MuLVs. Tumor cells induced by ecotropic Moloney, amphotropic 4070A, and 10A1 MuLVs and by two chimeric MuLVs, Mo(4070A) and Mo(10A1), were examined with antibodies to 13 lineage-specific cell surface markers found on myeloid cell, T-cell, and B-cell lineages. The chimeric Mo(4070A) and Mo(10A1) MuLVs, consisting of Moloney MuLV with the carboxy half of the Pol region and nearly all of the Env region of 4070A and 10A1, respectively, were constructed to examine the possible influence of these sequences on Moloney MuLV-induced tumor cell phenotypes. In some instances, these phenotypic analyses were supplemented by Southern blot analysis for lymphoid cell-specific genomic DNA rearrangements at the immunoglobulin heavy-chain, the T-cell receptor gamma, and the T-cell receptor beta loci. The results of our analysis showed that Moloney MuLV, 4070A, Mo(4070A), and Mo(10A1) induced mostly T-cell tumors. Moloney MuLV and Mo(4070A) induced a wide variety of T-cell phenotypes, ranging from immature to mature phenotypes, while 4070A induced mostly prothymocyte and double-negative (CD4- CD8-) T-cell tumors. The tumor phenotypes obtained with 10A1 and Mo(10A1) were each less variable than those obtained with the other MuLVs tested. 10A1 uniformly induced a tumor consisting of lineage marker-negative cells that lack lymphoid cell-specific DNA rearrangements and histologically appear to be early undifferentiated erythroid cell-like precursors. The Mo(10A1) chimera consistently induced an intermediate T-cell tumor. The chimeric constructions demonstrated that while 4070A 3' pol and env sequences apparently did not influence the observed tumor cell phenotypes, the 10A1 half of pol and env had a strong effect on the phenotypes induced by Mo(10A1) that resulted in a phenotypic

  18. A base-sequence-modulated Golay code improves the excitation and measurement of ultrasonic guided waves in long bones.

    PubMed

    Song, Xiaojun; Ta, Dean; Wang, Weiqi

    2012-11-01

    Researchers are interested in using ultrasonic guided waves (GWs) to assess long bones. However, GWs suffer high attenuation when they propagate in long bones, resulting in a low SNR. To overcome this limitation, this paper introduces a base-sequence-modulated Golay code (BSGC) to produce larger amplitude and improve the SNR in the ultrasound evaluation of long bones. A 16-bit Golay code was used for excitation in computer simulation. The decoded GWs and the traditional GWs, which were generated by a single pulse, agreed well after decoding the received signals, and the SNR was improved by 26.12 dB. In the experiments using bovine bones, the BSGC excitation produced the amplitudes which were at least 237 times greater than those produced by a single pulse excitation. The BSGC excitation also allowed the GWs to be received over a longer distance between two transducers. The results suggest the BSGC excitation has the potential to measure GWs and assess long bones.
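
    The SNR gain from Golay-coded excitation rests on a property of complementary Golay pairs: the autocorrelation sidelobes of the two sequences cancel when summed, leaving a single peak. The following minimal sketch checks that property for a 16-element pair. It assumes the standard recursive pair construction and does not reproduce the base-sequence modulation described in the abstract; all function names are hypothetical.

```python
def golay_pair(n_doublings):
    """Build a complementary Golay pair of length 2**n_doublings (entries +1/-1)."""
    a, b = [1], [1]
    for _ in range(n_doublings):
        # Standard recursion: (a, b) -> (a|b, a|-b) preserves complementarity.
        a, b = a + b, a + [-x for x in b]
    return a, b

def autocorrelation(seq):
    """Aperiodic autocorrelation for non-negative lags 0 .. len(seq)-1."""
    n = len(seq)
    return [sum(seq[i] * seq[i + lag] for i in range(n - lag)) for lag in range(n)]

a, b = golay_pair(4)  # two 16-element sequences

# Sum the autocorrelations of the two sequences lag by lag.
summed = [ra + rb for ra, rb in zip(autocorrelation(a), autocorrelation(b))]
```

    Here the summed autocorrelation is 2N = 32 at zero lag and zero at every other lag, so decoding concentrates the energy of the long coded transmission into one sharp peak.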

  19. Structural sequences are conserved in the genes coding for the alpha, alpha' and beta-subunits of the soybean 7S seed storage protein.

    PubMed Central

    Schuler, M A; Ladin, B F; Pollaco, J C; Freyer, G; Beachy, R N

    1982-01-01

    Cloned DNAs encoding four different proteins have been isolated from recombinant cDNA libraries constructed with Glycine max seed mRNAs. Two cloned DNAs code for the alpha and alpha'-subunits of the 7S seed storage protein (conglycinin). The other cloned cDNAs code for proteins which are synthesized in vitro as 68,000 d., 60,000 d. or 53,000 d. polypeptides. Hybrid selection experiments indicate that, under low stringency hybridization conditions, all four cDNAs hybridize with mRNAs for the alpha and alpha'-subunits and the 68,000 d., 60,000 d. and 53,000 d. in vitro translation products. Within three of the mRNAs, there is a conserved sequence of 155 nucleotides which is responsible for this hybridization. The conserved nucleotides in the alpha and alpha'-subunit cDNAs and the 68,000 d. polypeptide cDNAs span both coding and noncoding sequences. The differences in the coding nucleotides outside the conserved region are extensive. This suggests that selective pressure to maintain the 155 conserved nucleotides has been influenced by the structure of the seed mRNA. RNA blot hybridizations demonstrate that mRNA encoding the other major subunit (beta) of the 7S seed storage protein also shares sequence homology with the conserved 155 nucleotide sequence of the alpha and alpha'-subunit mRNAs, but not with other coding sequences. PMID:6897678

  20. The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

    PubMed

    Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

    2016-01-01

    Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensable for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA, pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species.

  1. Variations in Opsin Coding Sequences Cause X-Linked Cone Dysfunction Syndrome with Myopia and Dichromacy

    PubMed Central

    McClements, Michelle; Davies, Wayne I. L.; Michaelides, Michel; Young, Terri; Neitz, Maureen; MacLaren, Robert E.; Moore, Anthony T.; Hunt, David M.

    2013-01-01

    Purpose. To determine the role of variant L opsin haplotypes in seven families with Bornholm Eye Disease (BED), a cone dysfunction syndrome with dichromacy and myopia. Methods. Analysis of the opsin genes within the L/M opsin array at Xq28 included cloning and sequencing of an exon 3-5 gene fragment, long range PCR to establish gene order, and quantitative PCR to establish gene copy number. In vitro expression of normal and variant opsins was performed to examine cellular trafficking and spectral sensitivity of pigments. Results. All except one of the BED families possessed L opsin genes that contained a rare exon 3 haplotype. The exception was a family with the deleterious Cys203Arg substitution. Two rare exon 3 haplotypes were found and, where determined, these variant opsin genes were in the first position in the array. In vitro expression in transfected cultured neuronal cells showed that the variant opsins formed functional pigments, which trafficked to the cell membranes. The variant opsins were, however, less stable than wild type. Conclusions. It is concluded that the variant L opsin haplotypes underlie BED. The reduction in the amount of variant opsin produced in vitro compared with wild type indicates a possible disease mechanism. Alternatively, the recently identified defective splicing of exon 3 of the variant opsin transcript may be involved. Both mechanisms explain the presence of dichromacy and cone dystrophy. Abnormal pigment may also underlie the myopia that is invariably present in BED subjects. PMID:23322568

  2. Nonsense mutation in the glycoprotein Ibα coding sequence associated with Bernard-Soulier syndrome

    SciTech Connect

    Ware, J.; Russell, S.R.; Vicente, V.; Scharf, R.E.; Tomer, A.; McMillian, R.; Ruggeri, Z.M.

    1990-03-01

    Three distinct gene products, the α and β chains of glycoprotein (GP) Ib and GP IX, constitute the platelet membrane GP Ib-IX complex, a receptor for von Willebrand factor and thrombin involved in platelet adhesion and aggregation. Defective function of the GP Ib-IX complex is the hallmark of a rare congenital bleeding disorder of still undefined pathogenesis, the Bernard-Soulier syndrome. The authors have analyzed the molecular basis of the disease in one patient in whom immunoblotting of solubilized platelets demonstrated absence of normal GP Ibα but presence of a smaller immunoreactive species. The truncated polypeptide was also present, along with normal protein, in platelets from the patient's mother and two of his four children. Genetic characterization identified a nucleotide transition changing the Trp-343 codon (TGG) to a nonsense codon (TGA). Such a mutation explains the origin of the smaller GP Ibα, which by lacking half of the sequence on the carboxyl-terminal side, including the transmembrane domain, cannot be properly inserted in the platelet membrane. Both normal and mutant codons were found in the patient, suggesting that he is a compound heterozygote with a still unidentified defect in the other GP Ibα allele. Nonsense mutation and truncated GP Ibα polypeptide were found to cosegregate in four individuals through three generations and were associated with either Bernard-Soulier syndrome or carrier state phenotype. The molecular abnormality demonstrated in this family provides evidence that defective synthesis of GP Ibα alters the membrane expression of the GP Ib-IX complex and may be responsible for Bernard-Soulier syndrome.
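
    The codon change reported above (TGG to TGA) can be checked with a short sketch that classifies a single-base substitution as a transition or transversion and tests whether the resulting codon is a stop codon in the standard genetic code. The helper names are hypothetical and the sketch is illustrative only; it is not the authors' analysis.

```python
PURINES = {"A", "G"}
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code (DNA alphabet)

def substitution_type(ref_base, alt_base):
    """Transition = purine<->purine or pyrimidine<->pyrimidine; otherwise transversion."""
    if ref_base == alt_base:
        raise ValueError("bases are identical; no substitution")
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    return "transition" if same_class else "transversion"

def describe_point_mutation(ref_codon, alt_codon):
    """Return (codon position, substitution type, creates-stop flag) for a one-base change."""
    diffs = [i for i in range(3) if ref_codon[i] != alt_codon[i]]
    if len(diffs) != 1:
        raise ValueError("expected exactly one differing base")
    pos = diffs[0]
    kind = substitution_type(ref_codon[pos], alt_codon[pos])
    creates_stop = alt_codon in STOP_CODONS and ref_codon not in STOP_CODONS
    return pos + 1, kind, creates_stop

# Codons taken from the abstract: Trp (TGG) -> stop (TGA).
result = describe_point_mutation("TGG", "TGA")
```

    For the codons given in the abstract, the change is a G-to-A transition at the third codon position that converts a tryptophan codon into a stop codon.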

  3. OrthoMaM v8: a database of orthologous exons and coding sequences for comparative genomics in mammals.

    PubMed

    Douzery, Emmanuel J P; Scornavacca, Celine; Romiguier, Jonathan; Belkhir, Khalid; Galtier, Nicolas; Delsuc, Frédéric; Ranwez, Vincent

    2014-07-01

    Comparative genomic studies extensively rely on alignments of orthologous sequences. Yet, selecting, gathering, and aligning orthologous exons and protein-coding sequences (CDS) that are relevant for a given evolutionary analysis can be a difficult and time-consuming task. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. This eighth release integrates the 40 complete mammalian genomes available in Ensembl v73 and provides alignments, phylogenies, evolutionary descriptor information, and functional annotations for 13,404 single-copy orthologous CDS and 6,953 long exons. The graphical interface makes it easy to explore OrthoMaM and identify markers with specific characteristics (e.g., taxa availability, alignment size, %G+C, evolutionary rate, chromosome location). It hence provides an efficient solution to sample preprocessed markers adapted to user-specific needs. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics and evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies. OrthoMaM is available for browsing, query and complete or filtered downloads at http://www.orthomam.univ-montp2.fr/.

  4. Genomic integration of the full-length dystrophin coding sequence in Duchenne muscular dystrophy induced pluripotent stem cells.

    PubMed

    Farruggio, Alfonso P; Bhakta, Mital S; du Bois, Haley; Ma, Julia; Calos, Michele P

    2017-04-01

    Plasmid vectors that express the full-length human dystrophin coding sequence in human cells were developed. Dystrophin, the protein mutated in Duchenne muscular dystrophy, is extraordinarily large, providing challenges for cloning and plasmid production in Escherichia coli. The authors expressed dystrophin from the strong, widely expressed CAG promoter, along with co-transcribed luciferase and mCherry marker genes useful for tracking plasmid expression. Introns were added at the 3' and 5' ends of the dystrophin sequence to prevent translation in E. coli, resulting in improved plasmid yield. Stability and yield were further improved by employing a lower-copy number plasmid origin of replication. The dystrophin plasmids also carried an attB site recognized by phage phiC31 integrase, enabling the plasmids to be integrated into the human genome at preferred locations by phiC31 integrase. The authors demonstrated single-copy integration of plasmid DNA into the genome and production of human dystrophin in the human 293 cell line, as well as in induced pluripotent stem cells derived from a patient with Duchenne muscular dystrophy. Plasmid-mediated dystrophin expression was also demonstrated in mouse muscle. The dystrophin expression plasmids described here will be useful in cell and gene therapy studies aimed at ameliorating Duchenne muscular dystrophy.

  5. Detecting Selection in the Blue Crab, Callinectes sapidus, Using DNA Sequence Data from Multiple Nuclear Protein-Coding Genes

    PubMed Central

    Yednock, Bree K.; Neigel, Joseph E.

    2014-01-01

    The identification of genes involved in the adaptive evolution of non-model organisms with uncharacterized genomes constitutes a major challenge. This study employed a rigorous and targeted candidate gene approach to test for positive selection on protein-coding genes of the blue crab, Callinectes sapidus. Four genes with putative roles in physiological adaptation to environmental stress were chosen as candidates. A fifth gene not expected to play a role in environmental adaptation was used as a control. Large samples (n>800) of DNA sequences from C. sapidus were used in tests of selective neutrality based on sequence polymorphisms. In combination with these, sequences from the congener C. similis were used in neutrality tests based on interspecific divergence. In multiple tests, significant departures from neutral expectations, indicative of positive selection, were found for the candidate gene trehalose 6-phosphate synthase (tps). These departures could not be explained by any of the historical population expansion or bottleneck scenarios that were evaluated in coalescent simulations. Evidence was also found for balancing selection at ATP-synthase subunit 9 (atps) using a maximum likelihood version of the Hudson, Kreitman, and Aguadé test, and positive selection favoring amino acid replacements within ATP/ADP translocase (ant) was detected using the McDonald-Kreitman test. In contrast, test statistics for the control gene, ribosomal protein L12 (rpl), which presumably has experienced the same demographic effects as the candidate loci, were not significantly different from neutral expectations and could readily be explained by demographic effects. Together, these findings demonstrate the utility of the candidate gene approach for investigating adaptation at the molecular level in a marine invertebrate for which extensive genomic resources are not available. PMID:24896825

  6. Detecting selection in the blue crab, Callinectes sapidus, using DNA sequence data from multiple nuclear protein-coding genes.

    PubMed

    Yednock, Bree K; Neigel, Joseph E

    2014-01-01

    The identification of genes involved in the adaptive evolution of non-model organisms with uncharacterized genomes constitutes a major challenge. This study employed a rigorous and targeted candidate gene approach to test for positive selection on protein-coding genes of the blue crab, Callinectes sapidus. Four genes with putative roles in physiological adaptation to environmental stress were chosen as candidates. A fifth gene not expected to play a role in environmental adaptation was used as a control. Large samples (n>800) of DNA sequences from C. sapidus were used in tests of selective neutrality based on sequence polymorphisms. In combination with these, sequences from the congener C. similis were used in neutrality tests based on interspecific divergence. In multiple tests, significant departures from neutral expectations, indicative of positive selection, were found for the candidate gene trehalose 6-phosphate synthase (tps). These departures could not be explained by any of the historical population expansion or bottleneck scenarios that were evaluated in coalescent simulations. Evidence was also found for balancing selection at ATP-synthase subunit 9 (atps) using a maximum likelihood version of the Hudson, Kreitman, and Aguadé test, and positive selection favoring amino acid replacements within ATP/ADP translocase (ant) was detected using the McDonald-Kreitman test. In contrast, test statistics for the control gene, ribosomal protein L12 (rpl), which presumably has experienced the same demographic effects as the candidate loci, were not significantly different from neutral expectations and could readily be explained by demographic effects. Together, these findings demonstrate the utility of the candidate gene approach for investigating adaptation at the molecular level in a marine invertebrate for which extensive genomic resources are not available.

  7. A common class of transcripts with 5′-intron depletion, distinct early coding sequence features, and N1-methyladenosine modification

    PubMed Central

    Cenik, Can; Chua, Hon Nian; Singh, Guramrit; Akef, Abdalla; Snyder, Michael P.; Palazzo, Alexander F.

    2017-01-01

    Introns are found in 5′ untranslated regions (5′UTRs) for 35% of all human transcripts. These 5′UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5′UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5′UTR intron status, we developed a classifier that can predict 5′UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5′ proximal-intron-minus-like-coding regions (“5IM” transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5′ cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5′ proximal positions. Finally, N1-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ∼20% of human transcripts. This class is defined by depletion of 5′ proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N1-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC. PMID:27994090

  8. C.U.R.R.F. (Codon Usage regarding Restriction Finder): a free Java(®)-based tool to detect potential restriction sites in both coding and non-coding DNA sequences.

    PubMed

    Gatter, Michael; Gatter, Thomas; Matthäus, Falk

    2012-10-01

    The synthesis of complete genes is becoming a more and more popular approach in heterologous gene expression. Reasons for this are the decreasing prices and the numerous advantages in comparison to classic molecular cloning methods. Two of these advantages are the possibility to adapt the codon usage to the host organism and the option to introduce restriction enzyme target sites of choice. C.U.R.R.F. (Codon Usage regarding Restriction Finder) is a free Java(®)-based software program which is able to detect possible restriction sites in both coding and non-coding DNA sequences by introducing multiple silent or non-silent mutations, respectively. The deviation of an alternative sequence containing a desired restriction motive from the sequence with the optimal codon usage is considered during the search of potential restriction sites in coding DNA and mRNA sequences as well as protein sequences. C.U.R.R.F is available at http://www.zvm.tu-dresden.de/die_tu_dresden/fakultaeten/fakultaet_mathematik_und_naturwissenschaften/fachrichtung_biologie/mikrobiologie/allgemeine_mikrobiologie/currf.

  9. Mutations in the TSC2 gene: analysis of the complete coding sequence using the protein truncation test (PTT).

    PubMed

    van Bakel, I; Sepp, T; Ward, S; Yates, J R; Green, A J

    1997-09-01

    Mutations in the TSC2 gene on chromosome 16p13.3 are responsible for approximately 50% of familial tuberous sclerosis (TSC). The gene has 41 small exons spanning 45 kb of genomic DNA and encoding a 5.5 kb mRNA. Large germline deletions of TSC2 occur in <5% of cases, and a number of small intragenic mutations have been described. We analysed mRNA from 18 unrelated cases of TSC for TSC2 mutations using the protein truncation test (PTT). Three cases were predicted to be TSC2 mutations on the basis of linkage analysis or because a hamartoma from the patient showed loss of heterozygosity for 16p13.3 markers. Three overlapping PCR products, covering the complete coding sequence of mRNA, were generated from lymphoblastoid cell lines, translated into 35S-methionine labelled protein, and analysed by SDS-PAGE. PCR products showing PTT shifts were directly sequenced, and mutations confirmed by restriction enzyme digestion where possible. Six PTT shifts were identified. Five of these were caused by mutations predicted to produce a truncated protein: (i) a sporadic case showed a 32 bp deletion in exon 11, and a mutant mRNA without exon 11 was produced; the normal exon 10 was also spliced out; (ii) a sporadic case had a 1 bp deletion in exon 12 (1634delT); (iii) a TSC2-linked mother and daughter pair had a G→T transversion in exon 23 (G2715T) introducing a cryptic splice site causing a 29 bp truncation of mRNA from exon 23; (iv) a sporadic case showed a 2 bp deletion in exon 36; (v) a sporadic case showed a 1 bp insertion disrupting the donor splice site of exon 37 (5007+2insA), resulting in the use of an upstream exonic cryptic splice site to cause a 29 bp truncation of mRNA from exon 37. In one case, the PTT shift was explained by in-frame splicing out of exon 10, in the presence of a normal exon 10 genomic sequence. Alternative splicing of exon 10 of the TSC2 gene may be a normal variant. Three 3rd base substitution polymorphisms were also detected during direct sequencing.

  10. Intraclonal diversity in follicular lymphoma analyzed by quantitative ultra-deep sequencing of non-coding regions

    PubMed Central

    Spence, Janice M.; Abumoussa, Andrew; Spence, John P.; Burack, W. Richard

    2014-01-01

    Cancers are characterized by genomic instability and the resulting intra-clonal diversity is a prerequisite for tumor evolution. Therefore, metrics of tumor heterogeneity may prove to be clinically meaningful. Intra-clonal heterogeneity in follicular lymphoma (FL) is apparent from studies of somatic hypermutation (SHM) caused by Activation Induced Deaminase (AID) in IGH. Aberrant SHM (aSHM), defined as AID activity outside of the IG loci, predominantly targets non-coding regions causing numerous “passenger” mutations but has the potential to generate rare significant “driver” mutations. The quantitative relationship between SHM and aSHM has not been defined. To measure SHM and aSHM, ultradeep sequencing (>20,000 fold coverage) was performed on IGH (∼1650nt) and 9 other non-coding regions potentially targeted by AID (combined 9411nt), including the 5′UTR of BCL2. Single nucleotide variants (SNV) were found in 12/12 FL specimens (median 136 SHM and 53 aSHM). The aSHM SNVs were associated with AID-motifs (p<0.0001). The number of SNVs at BCL2 varied widely among specimens and correlated with the number of SNVs at 8 other potential aSHM sites. In contrast SHM at IGH was not predictive of aSHM. Tumor heterogeneity is apparent from SNVs at low variant allele frequencies (VAF); the relative number of SNVs with VAF<5% varied with clinical grade indicating that tumor heterogeneity based on aSHM reflects a clinically meaningful parameter. These data suggest that genome-wide aSHM may be estimated from aSHM of BCL2 but not SHM of IGH. The results demonstrate a practical approach to the quantification of intra-tumoral genetic heterogeneity for clinical specimens. PMID:25311808

  11. Sequencing the GRHL3 Coding Region Reveals Rare Truncating Mutations and a Common Susceptibility Variant for Nonsyndromic Cleft Palate

    PubMed Central

    Mangold, Elisabeth; Böhmer, Anne C.; Ishorst, Nina; Hoebel, Ann-Kathrin; Gültepe, Pinar; Schuenke, Hannah; Klamt, Johanna; Hofmann, Andrea; Gölz, Lina; Raff, Ruth; Tessmann, Peter; Nowak, Stefanie; Reutter, Heiko; Hemprich, Alexander; Kreusch, Thomas; Kramer, Franz-Josef; Braumann, Bert; Reich, Rudolf; Schmidt, Gül; Jäger, Andreas; Reiter, Rudolf; Brosch, Sibylle; Stavusis, Janis; Ishida, Miho; Seselgyte, Rimante; Moore, Gudrun E.; Nöthen, Markus M.; Borck, Guntram; Aldhorae, Khalid A.; Lace, Baiba; Stanier, Philip; Knapp, Michael; Ludwig, Kerstin U.

    2016-01-01

    Nonsyndromic cleft lip with/without cleft palate (nsCL/P) and nonsyndromic cleft palate only (nsCPO) are the most frequent subphenotypes of orofacial clefts. A common syndromic form of orofacial clefting is Van der Woude syndrome (VWS) where individuals have CL/P or CPO, often but not always associated with lower lip pits. Recently, ∼5% of VWS-affected individuals were identified with mutations in the grainy head-like 3 gene (GRHL3). To investigate GRHL3 in nonsyndromic clefting, we sequenced its coding region in 576 Europeans with nsCL/P and 96 with nsCPO. Most strikingly, nsCPO-affected individuals had a higher minor allele frequency for rs41268753 (0.099) than control subjects (0.049; p = 1.24 × 10⁻²). This association was replicated in nsCPO/control cohorts from Latvia, Yemen, and the UK (p_combined = 2.63 × 10⁻⁵; OR_allelic = 2.46 [95% CI 1.6–3.7]) and reached genome-wide significance in combination with imputed data from a GWAS in nsCPO triads (p = 2.73 × 10⁻⁹). Notably, rs41268753 is not associated with nsCL/P (p = 0.45). rs41268753 encodes the highly conserved p.Thr454Met (c.1361C>T) (GERP = 5.3), which prediction programs denote as deleterious, has a CADD score of 29.6, and increases protein binding capacity in silico. Sequencing also revealed four novel truncating GRHL3 mutations including two that were de novo in four families, where all nine individuals harboring mutations had nsCPO. This is important for genetic counseling: given that VWS is rare compared to nsCPO, our data suggest that dominant GRHL3 mutations are more likely to cause nonsyndromic than syndromic CPO. Thus, with rare dominant mutations and a common risk variant in the coding region, we have identified an important contribution for GRHL3 in nsCPO. PMID:27018475
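
    As a rough sanity check on the allele frequencies quoted in this record, an allelic odds ratio can be derived directly from the two minor allele frequencies (0.099 in nsCPO cases, 0.049 in controls). The sketch below is an assumption-laden illustration: the function name is hypothetical, and the result describes only the initial sequencing comparison, so it is not expected to equal the combined replication estimate of 2.46 reported in the abstract.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Odds of carrying the minor allele in cases divided by the odds in controls."""
    for freq in (maf_cases, maf_controls):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# Frequencies for rs41268753 as quoted in the abstract.
or_discovery = allelic_odds_ratio(0.099, 0.049)
print(f"allelic odds ratio from quoted frequencies: {or_discovery:.2f}")
```

    The value comes out a little above 2, the same order as the replicated estimate, which is all this back-of-the-envelope calculation is meant to show.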

  12. TRENDS (Transport and Retention of Nuclides in Dominant Sequences): A code for modeling iodine behavior in containment during severe accidents

    SciTech Connect

    Weber, C.F.; Beahm, E.C.; Kress, T.S.; Daish, S.R.; Shockley, W.E.

    1989-01-01

    The ultimate aim of a description of iodine behavior in severe LWR accidents is a time-dependent accounting of iodine species released into containment and to the environment. Factors involved in the behavior of iodine can be conveniently divided into four general categories: (1) initial release into containment, (2) interaction of iodine species in containment not directly involving water pools, (3) interaction of iodine species in, or with, water pools, and (4) interaction with special systems such as ice condensers or gas treatment systems. To fill the large gaps in knowledge and to provide a means for assaying the iodine source term, this program has proceeded along two paths: (1) Experimental studies of the chemical behavior of iodine under containment conditions. (2) Development of TRENDS (Transport and Retention of Nuclides in Dominant Sequences), a computer code for modeling the behavior of iodine in containment and its release from containment. The main body of this report consists of a description of TRENDS. These two parts to the program are complementary in that models within TRENDS use data that were produced in the experimental program; therefore, these models are supported by experimental evidence that was obtained under conditions expected in severe accidents. 7 refs., 1 fig., 2 tabs.

  13. Association of low-frequency and rare coding-sequence variants with blood lipids and Coronary Heart Disease in 56,000 whites and blacks

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Low-frequency coding DNA sequence variants in the proprotein convertase subtilisin/kexin type 9 gene (PCSK9) lower plasma low-density lipoprotein cholesterol (LDL-C), protect against risk of coronary heart disease (CHD), and have prompted the development of a new class of therapeutics. It is uncerta...

  14. Full-length coding sequence for 12 bovine viral diarrhea virus isolates from persistently infected cattle in a feedyard in Kansas

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report here the full-length coding sequence of 12 bovine viral diarrhea virus (BVDV) isolates from persistently infected cattle from a feedyard in southwest Kansas, USA. These 12 genomes represent the three major genotypes (BVDV 1a, 1b, and 2a) of BVDV currently circulating in the United States....

  15. DNA polymorphism in morels: complete sequences of the internal transcribed spacer of genes coding for rRNA in Morchella esculenta (yellow morel) and Morchella conica (black morel).

    PubMed

    Wipf, D; Munch, J C; Botton, B; Buscot, F

    1996-09-01

    The internal transcribed spacer (ITS) of the gene coding for rRNA was sequenced in both directions with the gene walking technique in a black morel (Morchella conica) and a yellow morel (M. esculenta) to elucidate the ITS length discrepancy between the two species groups (750-bp ITS in black morels and 1,150-bp ITS in yellow morels).

  16. Hfq assists small RNAs in binding to the coding sequence of ompD mRNA and in rearranging its structure

    PubMed Central

    Wroblewska, Zuzanna; Olejniczak, Mikolaj

    2016-01-01

    The bacterial protein Hfq participates in the regulation of translation by small noncoding RNAs (sRNAs). Several mechanisms have been proposed to explain the role of Hfq in the regulation by sRNAs binding to the 5′-untranslated mRNA regions. However, it remains unknown how Hfq affects those sRNAs that target the coding sequence. Here, the contribution of Hfq to the annealing of three sRNAs, RybB, SdsR, and MicC, to the coding sequence of Salmonella ompD mRNA was investigated. Hfq bound to ompD mRNA with tight, subnanomolar affinity. Moreover, Hfq strongly accelerated the rates of annealing of RybB and MicC sRNAs to this mRNA, and it also had a small effect on the annealing of SdsR. The experiments using truncated RNAs revealed that the contributions of Hfq to the annealing of each sRNA were individually adjusted depending on the structures of interacting RNAs. In agreement with that, the mRNA structure probing revealed different structural contexts of each sRNA binding site. Additionally, the annealing of RybB and MicC sRNAs induced specific conformational changes in ompD mRNA consistent with local unfolding of mRNA secondary structure. Finally, the mutation analysis showed that the long AU-rich sequence in the 5′-untranslated mRNA region served as an Hfq binding site essential for the annealing of sRNAs to the coding sequence. Overall, the data showed that the functional specificity of Hfq in the annealing of each sRNA to the ompD mRNA coding sequence was determined by the sequence and structure of the interacting RNAs. PMID:27154968

  17. Incomplete intestinal absorption of fructose.

    PubMed Central

    Kneepkens, C M; Vonk, R J; Fernandes, J

    1984-01-01

    Intestinal D-fructose absorption in 31 children was investigated using measurements of breath hydrogen. Twenty five children had no abdominal symptoms and six had functional bowel disorders. After ingestion of fructose (2 g/kg bodyweight), 22 children (71%) showed a breath hydrogen increase of more than 10 ppm over basal values, indicating incomplete absorption: the increase averaged 53 ppm, range 12 to 250 ppm. Four of these children experienced abdominal symptoms. Three of the six children with bowel disorders showed incomplete absorption. Seven children were tested again with an equal amount of glucose, and in three of them also of galactose, added to the fructose. The mean maximum breath hydrogen increases were 5 and 10 ppm, respectively, compared with 103 ppm after fructose alone. In one boy several tests were performed with various sugars; fructose was the only sugar incompletely absorbed, and the effect of glucose on fructose absorption was shown to be dependent on the amount added. It is concluded that children have a limited absorptive capacity for fructose. We speculate that the enhancing effect of glucose and galactose on fructose absorption may be due to activation of the fructose carrier. Apple juice in particular contains fructose in excess of glucose and could lead to abdominal symptoms in susceptible children. PMID:6476870

  18. Profile Likelihood and Incomplete Data.

    PubMed

    Zhang, Zhiwei

    2010-04-01

    According to the law of likelihood, statistical evidence is represented by likelihood functions and its strength measured by likelihood ratios. This point of view has led to a likelihood paradigm for interpreting statistical evidence, which carefully distinguishes evidence about a parameter from error probabilities and personal belief. Like other paradigms of statistics, the likelihood paradigm faces challenges when data are observed incompletely, due to non-response or censoring, for instance. Standard methods to generate likelihood functions in such circumstances generally require assumptions about the mechanism that governs the incomplete observation of data, assumptions that usually rely on external information and cannot be validated with the observed data. Without reliable external information, the use of untestable assumptions driven by convenience could potentially compromise the interpretability of the resulting likelihood as an objective representation of the observed evidence. This paper proposes a profile likelihood approach for representing and interpreting statistical evidence with incomplete data without imposing untestable assumptions. The proposed approach is based on partial identification and is illustrated with several statistical problems involving missing data or censored data. Numerical examples based on real data are presented to demonstrate the feasibility of the approach.

  19. Deep sequencing of RNA from immune cell-derived vesicles uncovers the selective incorporation of small non-coding RNA biotypes with potential regulatory functions

    PubMed Central

    Nolte-’t Hoen, Esther N. M.; Buermans, Henk P. J.; Waasdorp, Maaike; Stoorvogel, Willem; Wauben, Marca H. M.; ’t Hoen, Peter A. C.

    2012-01-01

    Cells release RNA-carrying vesicles and membrane-free RNA/protein complexes into the extracellular milieu. Horizontal vesicle-mediated transfer of such shuttle RNA between cells allows dissemination of genetically encoded messages, which may modify the function of target cells. Other studies used array analysis to establish the presence of microRNAs and mRNA in cell-derived vesicles from many sources. Here, we used an unbiased approach by deep sequencing of small RNA released by immune cells. We found a large variety of small non-coding RNA species representing pervasive transcripts or RNA cleavage products overlapping with protein coding regions, repeat sequences or structural RNAs. Many of these RNAs were enriched relative to cellular RNA, indicating that cells destine specific RNAs for extracellular release. Among the most abundant small RNAs in shuttle RNA were sequences derived from vault RNA, Y-RNA and specific tRNAs. Many of the highly abundant small non-coding transcripts in shuttle RNA are evolutionarily well conserved and have previously been associated with gene regulatory functions. These findings allude to a wider range of biological effects that could be mediated by shuttle RNA than previously expected. Moreover, the data present leads for unraveling how cells modify the function of other cells via transfer of specific non-coding RNA species. PMID:22821563

  20. Individual variation of human S1P₁ coding sequence leads to heterogeneity in receptor function and drug interactions.

    PubMed

    Obinata, Hideru; Gutkind, Sarah; Stitham, Jeremiah; Okuno, Toshiaki; Yokomizo, Takehiko; Hwa, John; Hla, Timothy

    2014-12-01

    Sphingosine 1-phosphate receptor 1 (S1P₁), an abundantly-expressed G protein-coupled receptor which regulates key vascular and immune responses, is a therapeutic target in autoimmune diseases. Fingolimod/Gilenya (FTY720), an oral medication for relapsing-remitting multiple sclerosis, targets S1P₁ receptors on immune and neural cells to suppress neuroinflammation. However, suppression of endothelial S1P₁ receptors is associated with cardiac and vascular adverse effects. Here we report the genetic variations of the S1P₁ coding region from exon sequencing of >12,000 individuals and their functional consequences. We conducted functional analyses of 14 nonsynonymous single nucleotide polymorphisms (SNPs) of the S1PR1 gene. One SNP mutant (Arg¹²⁰ to Pro) failed to transmit sphingosine 1-phosphate (S1P)-induced intracellular signals such as calcium increase and activation of p44/42 MAPK and Akt. Two other mutants (Ile⁴⁵ to Thr and Gly³⁰⁵ to Cys) showed normal intracellular signals but impaired S1P-induced endocytosis, which made the receptor resistant to FTY720-induced degradation. Another SNP mutant (Arg¹³ to Gly) demonstrated protection from coronary artery disease in a high cardiovascular risk population. Individuals with this mutation showed a significantly lower percentage of multi-vessel coronary obstruction in a risk factor-matched case-control study. This study suggests that individual genetic variations of S1P₁ can influence receptor function and, therefore, infer differential disease risks and interaction with S1P₁-targeted therapeutics.

  1. Rate-dependent incompleteness of earthquake catalogs

    NASA Astrophysics Data System (ADS)

    Hainzl, Sebastian

    2016-04-01

    Important information about the earthquake generation process can be gained from instrumental earthquake catalogs, but this requires complete recordings to avoid biased results. The local completeness magnitude Mc is known to depend on general conditions such as the seismographic network and the environmental noise, which generally limit the possibility to detect small events. The detectability can be additionally reduced by an earthquake-induced increase of the noise-level leading to short-term variations of Mc, which cannot be resolved by traditional methods relying on the analysis of the frequency-magnitude distribution. Based on simple assumptions, I propose a new method to estimate such temporal excursions of Mc solely based on the estimation of the earthquake rate resulting in a high temporal resolution of Mc. The approach is shown to be in agreement with the apparent decrease of the estimated Gutenberg-Richter b-value in high-activity phases of recorded data sets and the observed incompleteness periods after mainshocks. Furthermore, an algorithm to estimate temporal changes of Mc is introduced and applied to empirical aftershock and swarm sequences from California and central Europe, indicating that observed b-value fluctuations are often related to rate-dependent incompleteness of the earthquake catalogs.
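
    The link between catalog incompleteness and an apparently lower b-value can be illustrated with synthetic data. The sketch below does not reproduce the method proposed in this record. It swaps in the standard Aki-type maximum-likelihood estimator, b = log10(e) / (mean(M) - (Mc - ΔM/2)), and the detection function, magnitudes, seed, and all names are hypothetical choices made only for the demonstration.

```python
import math
import random


def aki_b_value(magnitudes, mc, bin_width=0.0):
    """Aki-type maximum-likelihood b-value for events with magnitude >= mc."""
    sample = [m for m in magnitudes if m >= mc]
    if not sample:
        raise ValueError("no events at or above mc")
    mean_excess = sum(sample) / len(sample) - (mc - bin_width / 2.0)
    if mean_excess <= 0:
        raise ValueError("mean magnitude must exceed mc")
    return math.log10(math.e) / mean_excess


rng = random.Random(42)  # fixed seed so the sketch is reproducible
TRUE_B = 1.0             # assumed Gutenberg-Richter b-value of the synthetic catalog
MC = 2.0                 # assumed completeness magnitude

# Gutenberg-Richter magnitudes above MC follow an exponential law with rate b * ln(10).
complete = [MC + rng.expovariate(TRUE_B * math.log(10)) for _ in range(5000)]


def detected(magnitude):
    """Hypothetical detection model: events below M 3.0 are increasingly likely to be missed."""
    return rng.random() < min(1.0, 10 ** (magnitude - 3.0))


incomplete = [m for m in complete if detected(m)]

b_complete = aki_b_value(complete, MC)
b_incomplete = aki_b_value(incomplete, MC)
print(f"complete catalog:   b = {b_complete:.2f}")
print(f"incomplete catalog: b = {b_incomplete:.2f}")
```

    Because small events are preferentially lost, the mean magnitude of the thinned catalog rises and the estimate falls when Mc is left at its nominal value, which is the kind of apparent b-value decrease the abstract attributes to rate-dependent incompleteness.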

  2. A statistical approach for distinguishing hybridization and incomplete lineage sorting.

    PubMed

    Joly, Simon; McLenachan, Patricia A; Lockhart, Peter J

    2009-08-01

    The extent and evolutionary significance of hybridization is difficult to evaluate because of the difficulty in distinguishing hybridization from incomplete lineage sorting. Here we present a novel parametric approach for statistically distinguishing hybridization from incomplete lineage sorting based on minimum genetic distances of a nonrecombining locus. It is based on the idea that the expected minimum genetic distance between sequences from two species is smaller for some hybridization events than for incomplete lineage sorting scenarios. When applied to empirical data sets, distributions can be generated for the minimum interspecies distances expected under incomplete lineage sorting using coalescent simulations. If the observed distance between sequences from two species is smaller than its predicted distribution, incomplete lineage sorting can be rejected and hybridization inferred. We demonstrate the power of the method using simulations and illustrate its application on New Zealand alpine buttercups (Ranunculus). The method is robust and complements existing approaches. Thus it should allow biologists to assess with greater accuracy the importance of hybridization in evolution.
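
    The decision rule described in this record, comparing an observed minimum interspecies distance against the distribution expected under incomplete lineage sorting, amounts to a lower-tail test. The sketch below assumes the null distribution has already been produced by coalescent simulations; the distances listed are invented placeholders, and the function name and the 0.05 threshold are illustrative assumptions rather than details from the paper.

```python
def lower_tail_p_value(observed, simulated):
    """Fraction of simulated minimum distances that are <= the observed distance."""
    if not simulated:
        raise ValueError("need at least one simulated distance")
    n_at_or_below = sum(1 for d in simulated if d <= observed)
    return n_at_or_below / len(simulated)


# Placeholder null distribution: minimum distances under incomplete lineage sorting only.
simulated_min_distances = [0.021, 0.018, 0.025, 0.030, 0.016,
                           0.027, 0.022, 0.019, 0.024, 0.028]

ALPHA = 0.05  # assumed significance threshold

for observed in (0.004, 0.022):
    p = lower_tail_p_value(observed, simulated_min_distances)
    verdict = "reject ILS, infer hybridization" if p < ALPHA else "ILS not rejected"
    print(f"observed minimum distance {observed}: p = {p:.2f} -> {verdict}")

p_small = lower_tail_p_value(0.004, simulated_min_distances)
p_typical = lower_tail_p_value(0.022, simulated_min_distances)
```

    In practice far more than ten replicates would be needed for a stable tail estimate; the short list here only keeps the example readable.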

  3. Combinatorial variation in coding and promoter sequences of genes at the Tri locus in Pisum sativum accounts for variation in trypsin inhibitor activity in seeds.

    PubMed

    Page, D; Aubert, G; Duc, G; Welham, T; Domoney, C

    2002-05-01

    Cultivars of Pisum sativum that differ with respect to the quantitative expression of trypsin/chymotrypsin inhibitor proteins in seeds have been examined in terms of the structure of the corresponding genes. The patterns of divergence in the promoter and coding sequences are described, and the divergence among these exploited for the development of facile DNA-based assays to distinguish genotypes. Quantitative effects on gene expression may be attributed to the overall gene complement and to particular promoter/coding sequence combinations, as well as to the existence of distinct active-site variants that ultimately influence protein activity. Electronic supplementary material to this paper can be obtained by using the Springer LINK server located at http://dx.doi.org/10.1007/s00438-002-0667-4.

  4. Molecular cloning and sequence analysis of the gene coding for the 57kDa soluble antigen of the salmonid fish pathogen Renibacterium salmoninarum

    USGS Publications Warehouse

    Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.

    1992-01-01

    The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat, the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
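
    The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes the 1671-nucleotide figure excludes the stop codon (consistent with 1671 / 3 = 557) and that the mature protein is the precursor minus the 26-residue signal peptide; the function names are illustrative.

```python
def residues_from_orf(orf_nucleotides: int) -> int:
    """Number of encoded residues for an ORF length given without the stop codon."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return orf_nucleotides // 3


def inclusive_span(first: int, last: int) -> int:
    """Number of residues from position `first` to `last`, both included."""
    return last - first + 1


if __name__ == "__main__":
    precursor = residues_from_orf(1671)  # precursor length in residues
    signal_peptide = 26
    mature = precursor - signal_peptide  # derived here, not stated in the abstract
    print(precursor, mature, inclusive_span(27, 61))
```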

  5. Application of MELCOR Code to a French PWR 900 MWe Severe Accident Sequence and Evaluation of Models Performance Focusing on In-Vessel Thermal Hydraulic Results

    SciTech Connect

    De Rosa, Felice

    2006-07-01

    In the ambit of the Severe Accident Network of Excellence Project (SARNET), funded by the European Union, 6th FISA (Fission Safety) Programme, one of the main tasks is the development and validation of the European Accident Source Term Evaluation Code (ASTEC Code). One of the reference codes used to compare ASTEC results, coming from experimental and Reactor Plant applications, is MELCOR. ENEA is a SARNET member and also an ASTEC and MELCOR user. During the first 18 months of this project, we performed a series of MELCOR and ASTEC calculations referring to a French PWR 900 MWe and to the accident sequence of 'Loss of Steam Generator (SG) Feedwater' (known as H2 sequence in the French classification). H2 is an accident sequence substantially equivalent to a Station Blackout scenario, like a TMLB accident, with the only difference that in the H2 sequence the scram is forced to occur with a delay of 28 seconds. The main events during the accident sequence are a loss of normal and auxiliary SG feedwater (0 s), followed by a scram when the water level in the SG is equal to or less than 0.7 m (after 28 seconds). There is also a main coolant pumps trip when ΔTsat < 10 °C, a total opening of the three relief valves when Tric (core maximal outlet temperature) is above 603 K (330 °C) and accumulators isolation when primary pressure goes below 1.5 MPa (15 bar). Among many other points, it is worth noting that this was the first time that a MELCOR 1.8.5 input deck was available for a French PWR 900. The main ENEA effort in this period was devoted to preparing the MELCOR input deck using the code version v.1.8.5 (build QZ Oct 2000 with the latest patch 185003 Oct 2001). The input deck, completely new, was prepared taking into account the structure, data and same conditions as those found inside ASTEC input decks. The main goal of the work presented in this paper is to highlight where and when MELCOR provides good enough results and why, in some cases mainly referring to its

  6. Item Calibration in Incomplete Testing Designs

    ERIC Educational Resources Information Center

    Eggen, Theo J. H. M.; Verhelst, Norman D.

    2011-01-01

    This study discusses the justifiability of item parameter estimation in incomplete testing designs in item response theory. Marginal maximum likelihood (MML) as well as conditional maximum likelihood (CML) procedures are considered in three commonly used incomplete designs: random incomplete, multistage testing and targeted testing designs.…

  7. Identification of a cDNA clone that contains the complete coding sequence for a 140-kD rat NCAM polypeptide

    PubMed Central

    1987-01-01

    Neural cell adhesion molecules (NCAMs) are cell surface glycoproteins that appear to mediate cell-cell adhesion. In vertebrates NCAMs exist in at least three different polypeptide forms of apparent molecular masses 180, 140, and 120 kD. The 180- and 140-kD forms span the plasma membrane whereas the 120-kD form lacks a transmembrane region. In this study, we report the isolation of NCAM clones from an adult rat brain cDNA library. Sequence analysis indicated that the longest isolate, pR18, contains a 2,574 nucleotide open reading frame flanked by 208 bases of 5' and 409 bases of 3' untranslated sequence. The predicted polypeptide encoded by clone pR18 contains a single membrane-spanning region and a small cytoplasmic domain (120 amino acids), suggesting that it codes for a full-length 140-kD NCAM form. In Northern analysis, probes derived from 5' sequences of pR18, which presumably code for extracellular portions of the molecule, hybridized to five discrete mRNA size classes (7.4, 6.7, 5.2, 4.3, and 2.9 kb) in adult rat brain but not to liver or muscle RNA. However, the 5.2- and 2.9-kb mRNA size classes did not hybridize to either a large restriction fragment or three oligonucleotides derived from the putative transmembrane coding region and regions that lie 3' to it. The 3' probes did hybridize to the 7.4-, 6.7-, and 4.3-kb message size classes. These combined results indicate that clone pR18 is derived from either the 7.4-, 6.7-, or 4.3-kb adult rat brain RNA size class. Comparison with chicken and mouse NCAM cDNA sequences suggests that pR18 represents the amino acid coding region of the 6.7- or 4.3-kb mRNA. The isolation of pR18, the first cDNA that contains the complete coding sequence of an NCAM polypeptide, unambiguously demonstrates the predicted linear amino acid sequence of this probable rat 140-kD polypeptide. This cDNA also contains a 30-base pair segment not found in NCAM cDNAs isolated from other species. The significance of this segment and other

  8. Simulated data supporting inbreeding rate estimates from incomplete pedigrees

    USGS Publications Warehouse

    Miller, Mark P.

    2017-01-01

    This data release includes: (1) The data from simulations used to illustrate the behavior of inbreeding rate estimators. Estimating inbreeding rates is particularly difficult for natural populations because parentage information for many individuals may be incomplete. Our analyses illustrate the behavior of a newly-described inbreeding rate estimator that outperforms previously described approaches in the scientific literature. (2) Python source code ("analytical expressions", "computer simulations", and "empricial data set") that can be used to analyze these data.

  9. Semantic Borders and Incomplete Understanding.

    PubMed

    Silva-Filho, Waldomiro J; Dazzani, Maria Virgínia

    2016-03-01

    In this article, we explore a fundamental issue of Cultural Psychology, that is our "capacity to make meaning", by investigating a thesis from contemporary philosophical semantics, namely, that there is a decisive relationship between language and rationality. Many philosophers think that for a person to be described as a rational agent he must understand the semantic content and meaning of the words he uses to express his intentional mental states, e.g., his beliefs and thoughts. Our argument seeks to investigate the thesis developed by Tyler Burge, according to which our mastery or understanding of the semantic content of the terms which form our beliefs and thoughts is an "incomplete understanding". To do this, we discuss, on the one hand, the general lines of anti-individualism or semantic externalism and, on the other, criticisms of the Burgean notion of incomplete understanding - one radical and the other moderate. We defend our understanding that the content of our beliefs must be described in the light of the limits and natural contingencies of our cognitive capacities and the normative nature of our rationality. At heart, anti-individualism leads us to think about the fact that we are social creatures, living in contingent situations, with important, but limited, cognitive capacities, and that we receive the main, and most important, portion of our knowledge simply from what others tell us. Finally, we conclude that this discussion may contribute to the current debate about the notion of borders.

  10. Combining DGE and RNA-sequencing data to identify new polyA+ non-coding transcripts in the human genome

    PubMed Central

    Philippe, Nicolas; Bou Samra, Elias; Boureux, Anthony; Mancheron, Alban; Rufflé, Florence; Bai, Qiang; De Vos, John; Rivals, Eric; Commes, Thérèse

    2014-01-01

    Recent sequencing technologies that allow massive parallel production of short reads are the method of choice for transcriptome analysis. Particularly, digital gene expression (DGE) technologies produce a large dynamic range of expression data by generating short tag signatures for each cell transcript. These tags can be mapped back to a reference genome to identify new transcribed regions that can be further covered by RNA-sequencing (RNA-Seq) reads. Here, we applied an integrated bioinformatics approach that combines DGE tags, RNA-Seq, tiling array expression data and species-comparison to explore new transcriptional regions and their specific biological features, particularly tissue expression or conservation. We analysed tags from a large DGE data set (designated as ‘TranscriRef’). We then annotated 750 000 tags that were uniquely mapped to the human genome according to Ensembl. We retained transcripts originating from both DNA strands and categorized tags corresponding to protein-coding genes, antisense, intronic- or intergenic-transcribed regions and computed their overlap with annotated non-coding transcripts. Using this bioinformatics approach, we identified ∼34 000 novel transcribed regions located outside the boundaries of known protein-coding genes. As demonstrated using sequencing data from human pluripotent stem cells for biological validation, the method could be easily applied for the selection of tissue-specific candidate transcripts. DigitagCT is available at http://cractools.gforge.inria.fr/softwares/digitagct. PMID:24357408

  11. Cloning, genetic analysis, and nucleotide sequence of a determinant coding for a 19-kilodalton peptidoglycan-associated protein (Ppl) of Legionella pneumophila.

    PubMed Central

    Ludwig, B; Schmid, A; Marre, R; Hacker, J

    1991-01-01

    A genomic library of Legionella pneumophila, the causative agent of Legionnaires disease in humans, was constructed in Escherichia coli K-12, and the recombinant clones were screened by immuno-colony blots with an antiserum raised against heat-killed L. pneumophila. Twenty-three clones coding for a Legionella-specific protein of 19 kDa were isolated. The 19-kDa protein, which represents an outer membrane protein, was found to be associated with the peptidoglycan layer both in L. pneumophila and in the recombinant E. coli clones. This was shown by electrophoresis and Western immunoblot analysis of bacterial cell membrane fractions with a monospecific polyclonal 19-kDa protein-specific antiserum. The protein was termed peptidoglycan-associated protein of L. pneumophila (Ppl). The corresponding genetic determinant, ppl, was subcloned on a 1.8-kb ClaI fragment. DNA sequence studies revealed that two open reading frames, pplA and pplB, coding for putative proteins of 18.9 and 16.8 kDa, respectively, were located on the ClaI fragment. Exonuclease III digestion studies confirmed that pplA is the gene coding for the peptidoglycan-associated 19-kDa protein of L. pneumophila. The amino acid sequence of PplA exhibits a high degree of homology to the sequences of the Pal lipoproteins of E. coli K-12 and Haemophilus influenzae. PMID:1855972

  12. A computer program for estimation from incomplete multinomial data

    NASA Technical Reports Server (NTRS)

    Credeur, K. R.

    1978-01-01

    Coding is given for maximum likelihood and Bayesian estimation of the vector p of multinomial cell probabilities from incomplete data. Also included is coding to calculate and approximate elements of the posterior mean and covariance matrices. The program is written in FORTRAN 4 language for the Control Data CYBER 170 series digital computer system with network operating system (NOS) 1.1. The program requires approximately 44000 octal locations of core storage. A typical case requires from 72 seconds to 92 seconds on CYBER 175 depending on the value of the prior parameter.

  13. Sequence Diversity of the oprI Gene, Coding for Major Outer Membrane Lipoprotein I, among rRNA Group I Pseudomonads

    PubMed Central

    De Vos, Daniel; Bouton, Christiane; Sarniguet, Alain; De Vos, Paul; Vauterin, Marc; Cornelis, Pierre

    1998-01-01

    The sequence of oprI, the gene coding for the major outer membrane lipoprotein I, was determined by PCR sequencing for representatives of 17 species of rRNA group I pseudomonads, with a special emphasis on Pseudomonas aeruginosa and Pseudomonas fluorescens. Within the P. aeruginosa species, oprI sequences for 25 independent isolates were found to be identical, except for one silent substitution at position 96. The oprI sequences diverged more for the other rRNA group I pseudomonads (85 to 91% similarity with P. aeruginosa oprI). An accumulation of silent and also (but to a much lesser extent) nonsilent substitutions in the different sequences was found. A clustering according to the respective presence and/or positions of the HaeIII, PvuII, and SphI sites could also be obtained. A sequence cluster analysis showed a rather widespread distribution of P. fluorescens isolates. All other rRNA group I pseudomonads clustered in a manner that was in agreement with other studies, showing that the oprI gene can be useful as a complementary phylogenetic marker for classification of rRNA group I pseudomonads. PMID:9851998

  14. A bacterial genetic screen identifies functional coding sequences of the insect mariner transposable element Famar1 amplified from the genome of the earwig, Forficula auricularia.

    PubMed Central

    Barry, Elizabeth G; Witherspoon, David J; Lampe, David J

    2004-01-01

    Transposons of the mariner family are widespread in animal genomes and have apparently infected them by horizontal transfer. Most species carry only old defective copies of particular mariner transposons that have diverged greatly from their active horizontally transferred ancestor, while a few contain young, very similar, and active copies. We report here the use of a whole-genome screen in bacteria to isolate somewhat diverged Famar1 copies from the European earwig, Forficula auricularia, that encode functional transposases. Functional and nonfunctional coding sequences of Famar1 and nonfunctional copies of Ammar1 from the European honey bee, Apis mellifera, were sequenced to examine their molecular evolution. No selection for sequence conservation was detected in any clade of a tree derived from these sequences, not even on branches leading to functional copies. This agrees with the current model for mariner transposon evolution that expects neutral evolution within particular hosts, with selection for function occurring only upon horizontal transfer to a new host. Our results further suggest that mariners are not finely tuned genetic entities and that a greater amount of sequence diversification than had previously been appreciated can occur in functional copies in a single host lineage. Finally, this method of isolating active copies can be used to isolate other novel active transposons without resorting to reconstruction of ancestral sequences. PMID:15020471

  15. Assembly of the Complete Sitka Spruce Chloroplast Genome Using 10X Genomics’ GemCode Sequencing Data

    PubMed Central

    Coombe, Lauren; Jackman, Shaun D.; Yang, Chen; Vandervalk, Benjamin P.; Moore, Richard A.; Pleasance, Stephen; Coope, Robin J.; Bohlmann, Joerg; Holt, Robert A.; Jones, Steven J. M.; Birol, Inanc

    2016-01-01

    The linked read sequencing library preparation platform by 10X Genomics produces barcoded sequencing libraries, which are subsequently sequenced using the Illumina short read sequencing technology. In this new approach, long fragments of DNA are partitioned into separate micro-reactions, where the same index sequence is incorporated into each of the sequencing fragment inserts derived from a given long fragment. In this study, we exploited this property by using reads from index sequences associated with a large number of reads, to assemble the chloroplast genome of the Sitka spruce tree (Picea sitchensis). Here we report on the first Sitka spruce chloroplast genome assembled exclusively from P. sitchensis genomic libraries prepared using the 10X Genomics protocol. We show that the resulting 124,049 base pair long genome shares high sequence similarity with the related white spruce and Norway spruce chloroplast genomes, but diverges substantially from a previously published P. sitchensis-P. thunbergii chimeric genome. The use of reads from high-frequency indices enabled separation of the nuclear genome reads from those of the chloroplast, which resulted in the simplification of the de Bruijn graphs used at the various stages of assembly. PMID:27632164

  16. Allowing for missing outcome data and incomplete uptake of randomised interventions, with application to an Internet-based alcohol trial.

    PubMed

    White, Ian R; Kalaitzaki, Eleftheria; Thompson, Simon G

    2011-11-30

    Missing outcome data and incomplete uptake of randomised interventions are common problems, which complicate the analysis and interpretation of randomised controlled trials, and are rarely addressed well in practice. To promote the implementation of recent methodological developments, we describe sequences of randomisation-based analyses that can be used to explore both issues. We illustrate these in an Internet-based trial evaluating the use of a new interactive website for those seeking help to reduce their alcohol consumption, in which the primary outcome was available for less than half of the participants and uptake of the intervention was limited. For missing outcome data, we first employ data on intermediate outcomes and intervention use to make a missing at random assumption more plausible, with analyses based on general estimating equations, mixed models and multiple imputation. We then use data on the ease of obtaining outcome data and sensitivity analyses to explore departures from the missing at random assumption. For incomplete uptake of randomised interventions, we estimate structural mean models by using instrumental variable methods. In the alcohol trial, there is no evidence of benefit unless rather extreme assumptions are made about the missing data nor an important benefit in more extensive users of the intervention. These findings considerably aid the interpretation of the trial's results. More generally, the analyses proposed are applicable to many trials with missing outcome data or incomplete intervention uptake. To facilitate use by others, Stata code is provided for all methods.

  17. Association of low-frequency and rare coding-sequence variants with blood lipids and coronary heart disease in 56,000 whites and blacks.

    PubMed

    Peloso, Gina M; Auer, Paul L; Bis, Joshua C; Voorman, Arend; Morrison, Alanna C; Stitziel, Nathan O; Brody, Jennifer A; Khetarpal, Sumeet A; Crosby, Jacy R; Fornage, Myriam; Isaacs, Aaron; Jakobsdottir, Johanna; Feitosa, Mary F; Davies, Gail; Huffman, Jennifer E; Manichaikul, Ani; Davis, Brian; Lohman, Kurt; Joon, Aron Y; Smith, Albert V; Grove, Megan L; Zanoni, Paolo; Redon, Valeska; Demissie, Serkalem; Lawson, Kim; Peters, Ulrike; Carlson, Christopher; Jackson, Rebecca D; Ryckman, Kelli K; Mackey, Rachel H; Robinson, Jennifer G; Siscovick, David S; Schreiner, Pamela J; Mychaleckyj, Josyf C; Pankow, James S; Hofman, Albert; Uitterlinden, Andre G; Harris, Tamara B; Taylor, Kent D; Stafford, Jeanette M; Reynolds, Lindsay M; Marioni, Riccardo E; Dehghan, Abbas; Franco, Oscar H; Patel, Aniruddh P; Lu, Yingchang; Hindy, George; Gottesman, Omri; Bottinger, Erwin P; Melander, Olle; Orho-Melander, Marju; Loos, Ruth J F; Duga, Stefano; Merlini, Piera Angelica; Farrall, Martin; Goel, Anuj; Asselta, Rosanna; Girelli, Domenico; Martinelli, Nicola; Shah, Svati H; Kraus, William E; Li, Mingyao; Rader, Daniel J; Reilly, Muredach P; McPherson, Ruth; Watkins, Hugh; Ardissino, Diego; Zhang, Qunyuan; Wang, Judy; Tsai, Michael Y; Taylor, Herman A; Correa, Adolfo; Griswold, Michael E; Lange, Leslie A; Starr, John M; Rudan, Igor; Eiriksdottir, Gudny; Launer, Lenore J; Ordovas, Jose M; Levy, Daniel; Chen, Y-D Ida; Reiner, Alexander P; Hayward, Caroline; Polasek, Ozren; Deary, Ian J; Borecki, Ingrid B; Liu, Yongmei; Gudnason, Vilmundur; Wilson, James G; van Duijn, Cornelia M; Kooperberg, Charles; Rich, Stephen S; Psaty, Bruce M; Rotter, Jerome I; O'Donnell, Christopher J; Rice, Kenneth; Boerwinkle, Eric; Kathiresan, Sekar; Cupples, L Adrienne

    2014-02-06

    Low-frequency coding DNA sequence variants in the proprotein convertase subtilisin/kexin type 9 gene (PCSK9) lower plasma low-density lipoprotein cholesterol (LDL-C), protect against risk of coronary heart disease (CHD), and have prompted the development of a new class of therapeutics. It is uncertain whether the PCSK9 example represents a paradigm or an isolated exception. We used the "Exome Array" to genotype >200,000 low-frequency and rare coding sequence variants across the genome in 56,538 individuals (42,208 European ancestry [EA] and 14,330 African ancestry [AA]) and tested these variants for association with LDL-C, high-density lipoprotein cholesterol (HDL-C), and triglycerides. Although we did not identify new genes associated with LDL-C, we did identify four low-frequency (frequencies between 0.1% and 2%) variants (ANGPTL8 rs145464906 [c.361C>T; p.Gln121*], PAFAH1B2 rs186808413 [c.482C>T; p.Ser161Leu], COL18A1 rs114139997 [c.331G>A; p.Gly111Arg], and PCSK7 rs142953140 [c.1511G>A; p.Arg504His]) with large effects on HDL-C and/or triglycerides. None of these four variants was associated with risk for CHD, suggesting that examples of low-frequency coding variants with robust effects on both lipids and CHD will be limited.

  18. Major breeding plumage color differences of male ruffs (Philomachus pugnax) are not associated with coding sequence variation in the MC1R gene.

    PubMed

    Farrell, Lindsay L; Küpper, Clemens; Burke, Terry; Lank, David B

    2015-01-01

    Sequence variation in the melanocortin-1 receptor (MC1R) gene explains color morph variation in several species of birds and mammals. Ruffs (Philomachus pugnax) exhibit major dark/light color differences in melanin-based male breeding plumage which is closely associated with alternative reproductive behavior. A previous study identified a microsatellite marker (Ppu020) near the MC1R locus associated with the presence/absence of ornamental plumage. We investigated whether coding sequence variation in the MC1R gene explains major dark/light plumage color variation and/or the presence/absence of ornamental plumage in ruffs. Among 821 bp of the MC1R coding region from 44 male ruffs we found 3 single nucleotide polymorphisms, representing 1 nonsynonymous and 2 synonymous amino acid substitutions. None were associated with major dark/light color differences or the presence/absence of ornamental plumage. At all amino acid sites known to be functionally important in other avian species with dark/light plumage color variation, ruffs were either monomorphic or the shared polymorphism did not coincide with color morph. Neither ornamental plumage color differences nor the presence/absence of ornamental plumage in ruffs are likely to be caused entirely by amino acid variation within the coding regions of the MC1R locus. Regulatory elements and structural variation at other loci may be involved in melanin expression and contribute to the extreme plumage polymorphism observed in this species.

  19. Incompletely compacted equilibrated ordinary chondrites

    SciTech Connect

    Sasso, M.R.; Macke, R.J.; Boesenberg, J.S.; Britt, D.T.; Rovers, M.L.; Ebel, D.S.; Friedrich, J.M.

    2010-01-22

    We document the size distributions and locations of voids present within five highly porous equilibrated ordinary chondrites using high-resolution synchrotron X-ray microtomography (μCT) and helium pycnometry. We found total porosities ranging from ~10 to 20% within these chondrites, and with μCT we show that up to 64% of the void space is located within intergranular voids within the rock. Given the low (S1-S2) shock stages of the samples and the large voids between mineral grains, we conclude that these samples experienced unusually low amounts of compaction and shock loading throughout their entire post accretionary history. With Fe metal and FeS metal abundances and grain size distributions, we show that these chondrites formed naturally with greater than average porosities prior to parent body metamorphism. These materials were not 'fluffed' on their parent body by impact-related regolith gardening or events caused by seismic vibrations. Samples of all three chemical types of ordinary chondrites (LL, L, H) are represented in this study and we conclude that incomplete compaction is common within the asteroid belt.

  20. Computational performance of SequenceL coding of the lattice Boltzmann method for multi-particle flow simulations

    NASA Astrophysics Data System (ADS)

    Başağaoğlu, Hakan; Blount, Justin; Blount, Jarred; Nelson, Bryant; Succi, Sauro; Westhart, Phil M.; Harwell, John R.

    2017-04-01

    This paper reports, for the first time, the computational performance of SequenceL for mesoscale simulations of large numbers of particles in a microfluidic device via the lattice-Boltzmann method. The performance of SequenceL simulations was assessed against the optimized serial and parallelized (via OpenMP directives) FORTRAN90 simulations. At present, OpenMP directives were not included in inter-particle and particle-wall repulsive (steric) interaction calculations due to difficulties that arose from inter-iteration dependencies between consecutive iterations of the do-loops. SequenceL simulations, on the other hand, relied on built-in automatic parallelism. Under these conditions, numerical simulations revealed that the parallelized FORTRAN90 outran the performance of SequenceL by a factor of 2.5 or more when the number of particles was 100 or less. SequenceL, however, outran the performance of the parallelized FORTRAN90 by a factor of 1.3 when the number of particles was 300. Our results show that when the number of particles increased by 30-fold, the computational time of SequenceL simulations increased linearly by a factor of 1.5, as compared to a 3.2-fold increase in serial and a 7.7-fold increase in parallelized FORTRAN90 simulations. Considering SequenceL's efficient built-in parallelism that led to a relatively small increase in computational time with increased number of particles, it could be a promising programming language for computationally-efficient mesoscale simulations of large numbers of particles in microfluidic experiments.

  1. Diversity of coding sequences and gene structures of the antifungal peptide mytimycin (MytM) from the Mediterranean mussel, Mytilus galloprovincialis.

    PubMed

    Sonthi, Molruedee; Toubiana, Mylène; Pallavicini, Alberto; Venier, Paola; Roch, Philippe

    2011-10-01

    Knowledge of antifungal biomolecules is limited compared to that of antibacterial peptides. A strictly antifungal peptide from the blue mussel, Mytilus edulis, named mytimycin (MytM), was reported in 1996 as a partial NH2-terminal sequence of 33 amino acids. Using back-translations of the previous sequence, MytM-related nucleotide sequences were identified from a normalized Mytilus galloprovincialis expressed sequence tag library. Primers designed from a consensus sequence were used to obtain a fragment of 560 nucleotides, including the complete coding sequence of 456 nucleotides. The precursor consists of a signal peptide of 23 amino acids, followed by MytM of 54 amino acids (6.2-6.3 kDa, 12 cysteines) and a C-terminal extension of 75 amino acids. Only two major amino acid precursor sequences emerged, one shared by M. galloprovincialis from Venice and Vigo, the other belonging to M. galloprovincialis from Palavas, with nine amino acid differences between the two MytM. Predicted disulfide bonds suggested the presence of two constrained domains joined by the amino acid track NIFG. Intriguingly, a conserved canonical EF hand-motif is located in the C-terminal extension of the precursor. The MytM gene was found to be interrupted by two introns. Intron 2 existed in two forms, a long (1,112 nucleotides) and a short (716 nucleotides) one resulting from the removal of the central part of the long one. Both the short (GenBank FJ804479) and the long (GenBank FJ804478) genes are simultaneously present in the mussel genome.

  2. Identification of an androgen-repressed mRNA in rat ventral prostate as coding for sulphated glycoprotein 2 by cDNA cloning and sequence analysis.

    PubMed Central

    Bettuzzi, S; Hiipakka, R A; Gilna, P; Liao, S T

    1989-01-01

    The concentrations of a small number of mRNAs in the rat ventral prostate increase after castration and then decrease upon androgen treatment. Since the repression of specific gene expression may be important in the regulation of organ growth, we have cloned a cDNA for an androgen-repressed mRNA, the concentration of which increased 17-fold 4 days after castration, and this increase was reversed rapidly by androgen treatment. By sequence analysis the androgen-repressed mRNA was identified as that coding for sulphated glycoprotein 2. PMID:2920020

  3. Nucleotide sequence of the GDH gene coding for the NADP-specific glutamate dehydrogenase of Saccharomyces cerevisiae.

    PubMed

    Nagasu, T; Hall, B D

    1985-01-01

    The isolation of the Saccharomyces cerevisiae gene for NADP-dependent glutamate dehydrogenase (NADP-GDH) by cross hybridization to the Neurospora crassa am gene, known to encode NADP-GDH, is described. Two DNA fragments selected from a yeast genomic library in phage lambda gt11 were shown by restriction analysis to share 2.5 kb of common sequence. A yeast shuttle vector (CV13) carrying either of the cloned fragments complements the gdh- strain of S. cerevisiae and directs substantial overproduction of NADP-GDH. One of the cloned fragments was sequenced, and the deduced amino acid (aa) sequence of the yeast NADP-GDH is 64% homologous to N. crassa, 51% to Escherichia coli and 24% to bovine NADP-GDHs.

  4. Analyzing incomplete longitudinal clinical trial data.

    PubMed

    Molenberghs, Geert; Thijs, Herbert; Jansen, Ivy; Beunckens, Caroline; Kenward, Michael G; Mallinckrodt, Craig; Carroll, Raymond J

    2004-07-01

    Using standard missing data taxonomy, due to Rubin and co-workers, and simple algebraic derivations, it is argued that some simple but commonly used methods to handle incomplete longitudinal clinical trial data, such as complete case analyses and methods based on last observation carried forward, require restrictive assumptions and stand on a weaker theoretical foundation than likelihood-based methods developed under the missing at random (MAR) framework. Given the availability of flexible software for analyzing longitudinal sequences of unequal length, implementation of likelihood-based MAR analyses is not limited by computational considerations. While such analyses are valid under the comparatively weak assumption of MAR, the possibility of data missing not at random (MNAR) is difficult to rule out. It is argued, however, that MNAR analyses are, themselves, surrounded with problems and therefore, rather than ignoring MNAR analyses altogether or blindly shifting to them, their optimal place is within sensitivity analysis. The concepts developed here are illustrated using data from three clinical trials, where it is shown that the analysis method may have an impact on the conclusions of the study.
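
The two simple methods that this abstract criticizes can be sketched in a few lines. The code below is a minimal illustration with hypothetical function names and made-up visit scores; it does not implement the likelihood-based MAR analysis that the authors recommend.

```python
from typing import List, Optional

Sequence = List[Optional[float]]


def locf(visits: Sequence) -> Sequence:
    """Last observation carried forward: replace each missing value (None)
    with the most recent observed value. Leading gaps stay missing."""
    filled: Sequence = []
    last: Optional[float] = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def complete_cases(subjects: List[Sequence]) -> List[Sequence]:
    """Complete case analysis: keep only subjects with no missing visit."""
    return [s for s in subjects if all(v is not None for v in s)]


if __name__ == "__main__":
    # Made-up scores for three subjects over four visits.
    subjects = [
        [20.0, 18.0, 15.0, 12.0],  # completer
        [22.0, 21.0, None, None],  # dropped out after visit 2
        [19.0, None, None, None],  # dropped out after visit 1
    ]
    print("LOCF:", [locf(s) for s in subjects])
    print("Complete cases:", complete_cases(subjects))
```

LOCF freezes a dropout's trajectory at the last observed value, and complete case analysis discards dropouts entirely. Both are strong assumptions of the kind the abstract describes as restrictive.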

  5. [Cloning of full-length coding sequence of tree shrew CD4 and prediction of its molecular characteristics].

    PubMed

    Tian, Wei-Wei; Gao, Yue-Dong; Guo, Yan; Huang, Jing-Fei; Xiao, Chang; Li, Zuo-Sheng; Zhang, Hua-Tang

    2012-02-01

    The tree shrew, an animal model that has received extensive attention in human disease research, requires essential research tools, in particular cellular markers and monoclonal antibodies for immunological studies. In this paper, the 1,365 bp full-length CD4 cDNA coding sequence was cloned from total RNA of tree shrew peripheral blood. The sequence fills two gaps of unknown sequence in the predicted tree shrew CD4 cDNA in the GenBank database, and its molecular characteristics were analyzed and compared with those of other mammals using software such as Clustal W 2.0. The results showed that the extracellular and intracellular domains of the tree shrew CD4 amino acid sequence are conserved. The tree shrew CD4 amino acid sequence showed a close genetic relationship with those of Homo sapiens and Macaca mulatta. Most regions of the tree shrew CD4 molecule surface carry positive charges, as in humans. However, compared with the human CD4 extracellular domain D1, the tree shrew CD4 D1 surface showed more negative charges and two more N-glycosylation sites, which may affect antibody binding. This study provides a theoretical basis for the preparation and functional study of CD4 monoclonal antibodies.

  6. Cloning, nucleotide sequence, mutagenesis, and mapping of the Bacillus subtilis pbpD gene, which codes for penicillin-binding protein 4.

    PubMed Central

    Popham, D L; Setlow, P

    1994-01-01

    The gene encoding penicillin-binding protein 4 (PBP 4) of Bacillus subtilis, pbpD, was cloned by two independent methods. PBP 4 was purified, and the amino acid sequence of a cyanogen bromide digestion product was used to design an oligonucleotide probe for identification of the gene. An oligonucleotide probe designed to hybridize to genes encoding class A high-molecular-weight PBPs also identified this gene. DNA sequence analysis of the cloned DNA revealed that (i) the amino acid sequence of PBP 4 was similar to those of other class A high-molecular-weight PBPs and (ii) pbpD appeared to be cotranscribed with a downstream gene (termed orf2) of unknown function. The orf2 gene is followed by an apparent non-protein-coding region which exhibits nucleotide sequence similarity with at least two other regions of the chromosome and which has a high potential for secondary structure formation. Mutations in pbpD resulted in the disappearance of PBP 4 but had no obvious effect on growth, cell division, sporulation, spore heat resistance, or spore germination. Expression of a transcriptional fusion of pbpD to lacZ increased throughout growth, decreased during sporulation, and was induced approximately 45 min into spore germination. A single transcription start site was detected just upstream of pbpD. The pbpD locus was mapped to the 275 to 280 degrees region of the chromosomal genetic map. PMID:7961491

  7. Isolation and sequencing of cDNA clones coding for the catalytic unit of glucose-6-phosphatase from two haplochromine cichlid fishes.

    PubMed

    Nagl, S; Mayer, W E; Klein, J

    1999-01-01

    Complementary DNA clones coding for the catalytic unit of the enzyme glucose-6-phosphatase (G6Pase) were obtained from Haplochromis nubilus and Haplochromis xenognathus, two cichlid fish species from Lake Victoria. The translated sequence of these two cDNAs identifies a polypeptide consisting of 352 amino acid residues and showing a 54.4% similarity to the human form of G6Pase. The amino acid sequences of the two fish species are identical. The comparison of the fish amino acid sequence with the corresponding sequences of rat, mouse, and human G6Pase revealed that the amino acid residues, which are involved in G6Pase catalysis in humans, are also conserved in fish G6Pase. Northern blot analysis showed that G6Pase is expressed at the same level in 6- and 10-day-old fish. A three base pair insertion/deletion polymorphism was found in the 3'-untranslated region of the fish G6Pase gene. The polymorphism will be a useful marker in a phylogenetic study of Lake Victoria cichlids.

  8. Development of a universal RT-PCR for amplifying and sequencing the leader and capsid-coding region of foot-and-mouth disease virus.

    PubMed

    Xu, Lizhe; Hurtle, William; Rowland, Jessica M; Casteran, Karissa A; Bucko, Stacey M; Grau, Fred R; Valdazo-González, Begoña; Knowles, Nick J; King, Donald P; Beckham, Tammy R; McIntosh, Michael T

    2013-04-01

    Foot-and-mouth disease (FMD) is a highly infectious viral disease of cloven-hoofed animals with debilitating and devastating consequences for livestock industries throughout the world. Key antigenic determinants of the causative agent, FMD virus (FMDV), reside within the surface-exposed proteins of the viral capsid. Therefore, characterization of the sequence that encodes the capsid (P1) is important for tracking the emergence or spread of FMD and for selection and development of new vaccines. Reliable methods to generate sequence for this region are challenging due to the high inter-serotypic variability between different strains of FMDV. This study describes the development and optimization of a novel, robust and universal RT-PCR method that may be used to amplify and sequence a 3kilobase (kb) fragment encompassing the leader proteinase (L) and capsid-coding portions (P1) of the FMDV genome. This new RT-PCR method was evaluated in two laboratories using RNA extracted from 134 clinical samples collected from different countries and representing a range of topotypes and lineages within each of the seven FMDV serotypes. Sequence analysis assisted in the reiterative design of primers that are suitable for routine sequencing of these RT-PCR fragments. Using this method, sequence analysis was undertaken for 49 FMD viruses collected from outbreaks in the field. This approach provides a robust tool that can be used for rapid antigenic characterization of FMDV and phylogenetic analyses and has utility for inclusion in laboratory response programs as an aid to vaccine matching or selection in the event of FMD outbreaks.

  9. 32 CFR 651.44 - Incomplete information.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 32 National Defense 4 2013-07-01 2013-07-01 false Incomplete information. 651.44 Section 651.44 National Defense Department of Defense (Continued) DEPARTMENT OF THE ARMY (CONTINUED) ENVIRONMENTAL QUALITY ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) Environmental Impact Statement § 651.44 Incomplete...

  10. 32 CFR 651.44 - Incomplete information.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 32 National Defense 4 2014-07-01 2013-07-01 true Incomplete information. 651.44 Section 651.44 National Defense Department of Defense (Continued) DEPARTMENT OF THE ARMY (CONTINUED) ENVIRONMENTAL QUALITY ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) Environmental Impact Statement § 651.44 Incomplete...

  11. 32 CFR 651.44 - Incomplete information.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 32 National Defense 4 2012-07-01 2011-07-01 true Incomplete information. 651.44 Section 651.44 National Defense Department of Defense (Continued) DEPARTMENT OF THE ARMY (CONTINUED) ENVIRONMENTAL QUALITY ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) Environmental Impact Statement § 651.44 Incomplete...

  12. Automation of a primer design and evaluation pipeline for subsequent sequencing of the coding regions of all human Refseq genes

    PubMed Central

    Lai, Daniel; Love, Donald R

    2012-01-01

    Screening for mutations in human disease-causing genes in a molecular diagnostic environment demands simplicity with a view to allowing high throughput approaches. In order to advance these requirements, we have developed and applied a primer design program, termed BatchPD, to achieve the PCR amplification of coding exons of all known human Refseq genes. Primer design, in silico PCR checks and formatted primer information for subsequent web-based interrogation are queried from existing online tools. BatchPD acts as an intermediate to automate queries and results processing and provides exon-specific information that is summarised in a spreadsheet format. PMID:22570517

  13. Tumor suppressor miR-375 regulates MYC expression via repression of CIP2A coding sequence through multiple miRNA-mRNA interactions.

    PubMed

    Jung, Hyun Min; Patel, Rushi S; Phillips, Brittany L; Wang, Hai; Cohen, Donald M; Reinhold, William C; Chang, Lung-Ji; Yang, Li-Jun; Chan, Edward K L

    2013-06-01

    MicroRNAs (miRNAs) are small, noncoding RNAs involved in posttranscriptional regulation of protein-coding genes in various biological processes. In our preliminary miRNA microarray analysis, miR-375 was identified as the most underexpressed in human oral tumor versus controls. The purpose of the present study is to examine the function of miR-375 as a candidate tumor suppressor miRNA in oral cancer. Cancerous inhibitor of PP2A (CIP2A), a guardian of oncoprotein MYC, is identified as a candidate miR-375 target based on bioinformatics. Luciferase assay accompanied by target sequence mutagenesis elucidates five functional miR-375-binding sites clustered in the CIP2A coding sequence close to the C-terminal domain. Overexpression of CIP2A is clearly demonstrated in oral cancers, and inverse correlation between miR-375 and CIP2A is observed in the tumors, as well as in NCI-60 cell lines, indicating the potential generalized involvement of the miR-375-CIP2A relationship in many other cancers. Transient transfection of miR-375 in oral cancer cells reduces the expression of CIP2A, resulting in decrease of MYC protein levels and leading to reduced proliferation, colony formation, migration, and invasion. Therefore this study shows that underexpression of tumor suppressor miR-375 could lead to uncontrolled CIP2A expression and extended stability of MYC, which contributes to promoting cancerous phenotypes.

  14. Variation in seed fatty acid composition and sequence divergence in the FAD2 gene coding region between wild and cultivated sesame.

    PubMed

    Chen, Zhenbang; Tonnis, Brandon; Morris, Brad; Wang, Richard B; Zhang, Amy L; Pinnow, David; Wang, Ming Li

    2014-12-03

    Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examination of fatty acid composition. The coding region of the FAD2 gene for fatty acid desaturase (FAD) in these accessions was also sequenced. Cultivated sesame accessions flowered and matured earlier than the wild species. The cultivated sesame seeds contained a significantly higher percentage of oleic acid (40.4%) than the seeds of the wild species (26.1%). Nucleotide polymorphisms were identified in the FAD2 gene coding region between wild and cultivated species. Some nucleotide polymorphisms led to amino acid changes, one of which was located in the enzyme active site and may contribute to the altered fatty acid composition. Based on the morphology observation, chemical analysis, and sequence analysis, it was determined that two accessions were misnamed and need to be reclassified. The results obtained from this study are useful for sesame improvement in molecular breeding programs.

  15. Second-generation sequencing of entire mitochondrial coding-regions (∼15.4 kb) holds promise for study of the phylogeny and taxonomy of human body lice and head lice.

    PubMed

    Xiong, H; Campelo, D; Pollack, R J; Raoult, D; Shao, R; Alem, M; Ali, J; Bilcha, K; Barker, S C

    2014-08-01

    The Illumina Hiseq platform was used to sequence the entire mitochondrial coding-regions of 20 body lice, Pediculus humanus Linnaeus, and head lice, P. capitis De Geer (Phthiraptera: Pediculidae), from eight towns and cities in five countries: Ethiopia, France, China, Australia and the U.S.A. These data (∼310 kb) were used to see how much more informative entire mitochondrial coding-region sequences were than partial mitochondrial coding-region sequences, and thus to guide the design of future studies of the phylogeny, origin, evolution and taxonomy of body lice and head lice. Phylogenies were compared from entire coding-region sequences (∼15.4 kb), entire cox1 (∼1.5 kb), partial cox1 (∼700 bp) and partial cytb (∼600 bp) sequences. On the one hand, phylogenies from entire mitochondrial coding-region sequences (∼15.4 kb) were much more informative than phylogenies from entire cox1 sequences (∼1.5 kb) and partial gene sequences (∼600 to ∼700 bp). For example, 19 branches had > 95% bootstrap support in our maximum likelihood tree from the entire mitochondrial coding-regions (∼15.4 kb) whereas the tree from 700 bp cox1 had only two branches with bootstrap support > 95%. Yet, by contrast, partial cytb (∼600 bp) and partial cox1 (∼486 bp) sequences were sufficient to genotype lice to Clade A, B or C. The sequences of the mitochondrial genomes of the P. humanus, P. capitis and P. schaeffi Fahrenholz studied are in NCBI GenBank under the accession numbers KC660761-800, KC685631-6330, KC241882-97, EU219988-95, HM241895-8 and JX080388-407.

  16. Simulation of Loss of RHRS Sequences in the Almaraz NPP during Mid-loop Operation using TRACE Code

    SciTech Connect

    Queral, Cesar; Gonzalez, Isaac; Exposito, Antonio

    2006-07-01

    In the framework of several international and national projects sponsored by the Spanish nuclear regulatory body, Consejo de Seguridad Nuclear, and the energy industry of Spain, UNESA, one of the most important objectives is the maintenance and development of Spanish NPP models for different codes, such as RELAP5 and TRACE. In this context, and because of the risk importance of the loss of RHRS at mid-loop conditions, our group has developed a mid-loop model of Almaraz NPP with the TRACE code. During this kind of transient, reflux condensation is one of the cooling mechanisms anticipated in the abnormal operating procedure for loss of RHRS at mid-loop level. Several simulations of loss of the RHRS are therefore being performed for different plant states, such as primary closed or open (different vent paths were considered), availability of steam generators, power levels, primary inventory and different secondary conditions. These parametric analyses allow us to check the capability of this cooling mechanism in different plant configurations and to apply the results to the success criteria of the reflux condensation mechanism. (authors)

  17. Cloning, sequencing, and expression of the gene coding for the human platelet α₂-adrenergic receptor

    SciTech Connect

    Kobilka, B.K.; Matsui, H.; Kobilka, T.S.; Yang-Feng, T.L.; Francke, U.; Caron, M.G.; Lefkowitz, R.J.; Regan, J.W.

    1987-10-30

    The gene for the human platelet α₂-adrenergic receptor has been cloned with oligonucleotides corresponding to the partial amino acid sequence of the purified receptor. The identity of this gene has been confirmed by the binding of α₂-adrenergic ligands to the cloned receptor expressed in Xenopus laevis oocytes. The deduced amino acid sequence is most similar to the recently cloned human β₂- and β₁-adrenergic receptors; however, similarities to the muscarinic cholinergic receptors are also evident. Two related genes have been identified by low stringency Southern blot analysis. These genes may represent additional α₂-adrenergic receptor subtypes.

  18. A Single Point Mutation within the Coding Sequence of Cholera Toxin B Subunit Will Increase Its Expression Yield

    PubMed Central

    Bakhshi, Bita; Boustanshenas, Mina; Ghorbani, Masoud

    2014-01-01

    Background: Cholera toxin B subunit (CTB) has been extensively considered as an immunogenic and adjuvant protein, but its expression yield is unsatisfactory in many studies. The aim of this study was to compare the expression of native and mutant recombinant CTB (rCTB) in the pQE vector. Methods: The ctxB fragment from Vibrio cholerae O1 ATCC14035, and a mutant ctxB carrying the amino acid substitution S128T, were amplified by PCR and cloned in the pGEM-T Easy vector. The constructs were then transformed into E. coli Top 10F' and cultured on LB agar plates containing ampicillin. Sequence analysis confirmed the mature ctxB gene sequence and the mutant one in the two constructs, which were further subcloned into the pQE-30 vector. Both constructs were subsequently transformed into E. coli M15 (pREP4) for expression of mature and mutant rCTB. Results: SDS-PAGE analysis showed maximum expression of rCTB in both systems at 5 hours after induction, and Western blot analysis confirmed the presence of rCTB on the blotting membranes. The expression of mutant rCTB was much higher than that of mature rCTB, which may be the result of the serine-to-threonine substitution at position 128 of the mature rCTB amino acid sequence created by PCR mutagenesis. The mutant rCTB retained pentameric stability and its ability to bind anti-cholera toxin IgG antibodies. Conclusion: A point mutation in the ctxB sequence resulted in over-expression of rCTB, probably because of increased solubility of the rCTB produced. Consequently, this expression system can be used to produce rCTB in high yield. PMID:24842138

  19. Humans and chimpanzees differ in their cellular response to DNA damage and non-coding sequence elements of DNA repair-associated genes.

    PubMed

    Weis, E; Galetzka, D; Herlyn, H; Schneider, E; Haaf, T

    2008-01-01

    Compared to humans, chimpanzees appear to be less susceptible to many types of cancer. Because DNA repair defects lead to accumulation of gene and chromosomal mutations, species differences in DNA repair are one plausible explanation. Here we analyzed the repair kinetics of human and chimpanzee cells after cisplatin treatment and irradiation. Dot blots for the quantification of single-stranded (ss) DNA repair intermediates revealed a biphasic response of human and chimpanzee lymphoblasts to cisplatin-induced damage. The early phase of DNA repair was identical in both species with a peak of ssDNA intermediates at 1 h after DNA damage induction. However, the late phase differed between species. Human cells showed a second peak of ssDNA intermediates at 6 h, chimpanzee cells at 5 h. One of four analyzed DNA repair-associated genes, UBE2A, was differentially expressed in human and chimpanzee cells at 5 h after cisplatin treatment. Immunofluorescent staining of gammaH2AX foci demonstrated equally high numbers of DNA strand breaks in human and chimpanzee cells at 30 min after irradiation and equally low numbers at 2 h. However, at 1 h chimpanzee cells had significantly fewer DNA breaks than human cells. Comparative sequence analyses of approximately 100 DNA repair-associated genes in human and chimpanzee revealed 13% and 32% of genes, respectively, with evidence for accelerated evolution in promoter regions and introns. This is in striking contrast to the 3% of DNA repair-associated genes with positive selection in the coding sequence. Compared to the rhesus macaque as an outgroup, chimpanzees show more accelerated evolution in non-coding sequences than humans. The TRF1-interacting, ankyrin-related ADP-ribose polymerase (TNKS) gene showed an accelerated intraspecific evolution among humans. Our results are consistent with the view that chimpanzee cells repair different types of DNA damage faster than human cells, whereas the overall repair capacity is similar in

  20. Using machine learning and high-throughput RNA sequencing to classify the precursors of small non-coding RNAs.

    PubMed

    Ryvkin, Paul; Leung, Yuk Yee; Ungar, Lyle H; Gregory, Brian D; Wang, Li-San

    2014-05-01

    Recent advances in high-throughput sequencing allow researchers to examine the transcriptome in more detail than ever before. Using a method known as high-throughput small RNA-sequencing, we can now profile the expression of small regulatory RNAs such as microRNAs and small interfering RNAs (siRNAs) with a great deal of sensitivity. However, there are many other types of small RNAs (<50nt) present in the cell, including fragments derived from snoRNAs (small nucleolar RNAs), snRNAs (small nuclear RNAs), scRNAs (small cytoplasmic RNAs), tRNAs (transfer RNAs), and transposon-derived RNAs. Here, we present a user's guide for CoRAL (Classification of RNAs by Analysis of Length), a computational method for discriminating between different classes of RNA using high-throughput small RNA-sequencing data. Not only can CoRAL distinguish between RNA classes with high accuracy, but it also uses features that are relevant to small RNA biogenesis pathways. By doing so, CoRAL can give biologists a glimpse into the characteristics of different RNA processing pathways and how these might differ between tissue types, biological conditions, or even different species. CoRAL is available at http://wanglab.pcbi.upenn.edu/coral/.
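
As a rough illustration of how read length alone can serve as a classification feature, the sketch below builds a normalized read-length profile for the reads mapped to one locus. It is not CoRAL's actual feature set or API: the function name, the length bounds and the toy reads are all assumptions made for this example.

```python
from collections import Counter
from typing import Iterable, List


def length_profile(reads: Iterable[str], min_len: int = 15, max_len: int = 49) -> List[float]:
    """Fraction of reads at each length from min_len to max_len inclusive.

    Reads outside the range are ignored; an empty input gives all zeros.
    """
    counts = Counter(len(r) for r in reads if min_len <= len(r) <= max_len)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (max_len - min_len + 1)
    return [counts.get(n, 0) / total for n in range(min_len, max_len + 1)]


if __name__ == "__main__":
    # Toy reads: three of length 21, one of length 22 (hypothetical data).
    reads = ["A" * 21, "C" * 21, "G" * 21, "T" * 22]
    print(length_profile(reads, min_len=20, max_len=23))
```

A vector like this could be passed to any standard classifier; which features CoRAL itself uses is described in the tool's own documentation rather than in this abstract.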

  1. [Analysis of the molecular characteristics and cloning of full-length coding sequence of interleukin-2 in tree shrews].

    PubMed

    Huang, Xiao-Yan; Li, Ming-Li; Xu, Juan; Gao, Yue-Dong; Wang, Wen-Guang; Yin, An-Guo; Li, Xiao-Fei; Sun, Xiao-Mei; Xia, Xue-Shan; Dai, Jie-Jie

    2013-04-01

    Although the tree shrew (Tupaia belangeri chinensis) is an excellent animal model for studying the mechanisms of human diseases, few studies have examined interleukin-2 (IL-2), an important immune factor in disease model evaluation. In this study, the 465 bp full-length IL-2 cDNA coding sequence was cloned from the RNA of tree shrew spleen lymphocytes that had been cultivated and stimulated with concanavalin A (ConA). Clustal W 2.0 was used to compare and analyze the sequence and molecular characteristics, and to establish the similarity of the overall structure of IL-2 between tree shrews and other mammals. The homology of the IL-2 nucleotide sequence between tree shrews and humans was 93%, and the amino acid homology was 80%. The phylogenetic tree, derived by the Neighbour-Joining method using MEGA5.0, indicated a close genetic relationship between tree shrews, Homo sapiens, and Macaca mulatta. The three-dimensional structure analysis showed that the surface charges in most regions of tree shrew IL-2 were similar to those of human IL-2; however, the N-glycosylation sites and local structures were different, which may affect antibody binding. These results provide a fundamental basis for the future study of IL-2 monoclonal antibodies in tree shrews, thereby improving their utility as a model.

  2. Increasing the Yield in Targeted Next-Generation Sequencing by Implicating CNV Analysis, Non-Coding Exons and the Overall Variant Load: The Example of Retinal Dystrophies

    PubMed Central

    Eisenberger, Tobias; Neuhaus, Christine; Khan, Arif O.; Decker, Christian; Preising, Markus N.; Friedburg, Christoph; Bieg, Anika; Gliem, Martin; Issa, Peter Charbel; Holz, Frank G.; Baig, Shahid M.; Hellenbroich, Yorck; Galvez, Alberto; Platzer, Konrad; Wollnik, Bernd; Laddach, Nadja; Ghaffari, Saeed Reza; Rafati, Maryam; Botzenhart, Elke; Tinschert, Sigrid; Börger, Doris; Bohring, Axel; Schreml, Julia; Körtge-Jung, Stefani; Schell-Apacik, Chayim; Bakur, Khadijah; Al-Aama, Jumana Y.; Neuhann, Teresa; Herkenrath, Peter; Nürnberg, Gudrun; Nürnberg, Peter; Davis, John S.; Gal, Andreas; Bergmann, Carsten; Lorenz, Birgit; Bolz, Hanno J.

    2013-01-01

    Retinitis pigmentosa (RP) and Leber congenital amaurosis (LCA) are major causes of blindness. They result from mutations in many genes which has long hampered comprehensive genetic analysis. Recently, targeted next-generation sequencing (NGS) has proven useful to overcome this limitation. To uncover “hidden mutations” such as copy number variations (CNVs) and mutations in non-coding regions, we extended the use of NGS data by quantitative readout for the exons of 55 RP and LCA genes in 126 patients, and by including non-coding 5′ exons. We detected several causative CNVs which were key to the diagnosis in hitherto unsolved constellations, e.g. hemizygous point mutations in consanguineous families, and CNVs complemented apparently monoallelic recessive alleles. Mutations of non-coding exon 1 of EYS revealed its contribution to disease. In view of the high carrier frequency for retinal disease gene mutations in the general population, we considered the overall variant load in each patient to assess if a mutation was causative or reflected accidental carriership in patients with mutations in several genes or with single recessive alleles. For example, truncating mutations in RP1, a gene implicated in both recessive and dominant RP, were causative in biallelic constellations, unrelated to disease when heterozygous on a biallelic mutation background of another gene, or even non-pathogenic if close to the C-terminus. Patients with mutations in several loci were common, but without evidence for di- or oligogenic inheritance. Although the number of targeted genes was low compared to previous studies, the mutation detection rate was highest (70%) which likely results from completeness and depth of coverage, and quantitative data analysis. CNV analysis should routinely be applied in targeted NGS, and mutations in non-coding exons give reason to systematically include 5′-UTRs in disease gene or exome panels. Consideration of all variants is indispensable because even

  3. Increasing the yield in targeted next-generation sequencing by implicating CNV analysis, non-coding exons and the overall variant load: the example of retinal dystrophies.

    PubMed

    Eisenberger, Tobias; Neuhaus, Christine; Khan, Arif O; Decker, Christian; Preising, Markus N; Friedburg, Christoph; Bieg, Anika; Gliem, Martin; Charbel Issa, Peter; Holz, Frank G; Baig, Shahid M; Hellenbroich, Yorck; Galvez, Alberto; Platzer, Konrad; Wollnik, Bernd; Laddach, Nadja; Ghaffari, Saeed Reza; Rafati, Maryam; Botzenhart, Elke; Tinschert, Sigrid; Börger, Doris; Bohring, Axel; Schreml, Julia; Körtge-Jung, Stefani; Schell-Apacik, Chayim; Bakur, Khadijah; Al-Aama, Jumana Y; Neuhann, Teresa; Herkenrath, Peter; Nürnberg, Gudrun; Nürnberg, Peter; Davis, John S; Gal, Andreas; Bergmann, Carsten; Lorenz, Birgit; Bolz, Hanno J

    2013-01-01

    Retinitis pigmentosa (RP) and Leber congenital amaurosis (LCA) are major causes of blindness. They result from mutations in many genes which has long hampered comprehensive genetic analysis. Recently, targeted next-generation sequencing (NGS) has proven useful to overcome this limitation. To uncover "hidden mutations" such as copy number variations (CNVs) and mutations in non-coding regions, we extended the use of NGS data by quantitative readout for the exons of 55 RP and LCA genes in 126 patients, and by including non-coding 5' exons. We detected several causative CNVs which were key to the diagnosis in hitherto unsolved constellations, e.g. hemizygous point mutations in consanguineous families, and CNVs complemented apparently monoallelic recessive alleles. Mutations of non-coding exon 1 of EYS revealed its contribution to disease. In view of the high carrier frequency for retinal disease gene mutations in the general population, we considered the overall variant load in each patient to assess if a mutation was causative or reflected accidental carriership in patients with mutations in several genes or with single recessive alleles. For example, truncating mutations in RP1, a gene implicated in both recessive and dominant RP, were causative in biallelic constellations, unrelated to disease when heterozygous on a biallelic mutation background of another gene, or even non-pathogenic if close to the C-terminus. Patients with mutations in several loci were common, but without evidence for di- or oligogenic inheritance. Although the number of targeted genes was low compared to previous studies, the mutation detection rate was highest (70%) which likely results from completeness and depth of coverage, and quantitative data analysis. CNV analysis should routinely be applied in targeted NGS, and mutations in non-coding exons give reason to systematically include 5'-UTRs in disease gene or exome panels. Consideration of all variants is indispensable because even

  4. Analysis of coding variants identified from exome sequencing resources for association with diabetic and non-diabetic nephropathy in African Americans.

    PubMed

    Cooke Bailey, Jessica N; Palmer, Nicholette D; Ng, Maggie C Y; Bonomo, Jason A; Hicks, Pamela J; Hester, Jessica M; Langefeld, Carl D; Freedman, Barry I; Bowden, Donald W

    2014-06-01

    Prior studies have identified common genetic variants influencing diabetic and non-diabetic nephropathy, diseases which disproportionately affect African Americans. Recently, exome sequencing techniques have facilitated identification of coding variants on a genome-wide basis in large samples. Exonic variants in known or suspected end-stage kidney disease (ESKD) or nephropathy genes can be tested for their ability to identify association either singly or in combination with known associated common variants. Coding variants in genes with prior evidence for association with ESKD or nephropathy were identified in the NHLBI-ESP GO database and genotyped in 5,045 African Americans (3,324 cases with type 2 diabetes associated nephropathy [T2D-ESKD] or non-T2D ESKD, and 1,721 controls) and 1,465 European Americans (568 T2D-ESKD cases and 897 controls). Logistic regression analyses were performed to assess association, with admixture and APOL1 risk status incorporated as covariates. Ten of 31 SNPs were associated in African Americans; four replicated in European Americans. In African Americans, SNPs in OR2L8, OR2AK2, C6orf167 (MMS22L), LIMK2, APOL3, APOL2, and APOL1 were nominally associated (P = 1.8 × 10⁻⁴ to 0.044). Haplotype analysis of common and coding variants increased evidence of association at the OR2L13 and APOL1 loci (P = 6.2 × 10⁻⁵ and 4.6 × 10⁻⁵, respectively). SNPs replicating in European Americans were in OR2AK2, LIMK2, and APOL2 (P = 0.0010 to 0.037). Meta-analyses highlighted four SNPs associated in T2D-ESKD and all-cause ESKD. Results from this study suggest a role for coding variants in the development of diabetic, non-diabetic, and/or all-cause ESKD in African Americans and/or European Americans.

  5. Single-tube, non-isotopic, multiplex PCR/OLA assay and sequence-coded separation for simultaneous screening of 31 cystic fibrosis mutations

    SciTech Connect

    Brinson, E.C.; Adriano, T.; Bloch, W.

    1994-09-01

    We have developed a rapid, single-tube, non-isotopic assay that screens a patient sample for the presence of 31 cystic fibrosis (CF) mutations. This assay can identify these mutations in a single reaction tube and a single electrophoresis run. Sample preparation is a simple, boil-and-go procedure, completed in less than an hour. The assay is composed of a 15-plex PCR, followed by a 61-plex oligonucleotide ligation assay (OLA), and incorporates a novel detection scheme, Sequence Coded Separation. Initially, the multiplex PCR amplifies 15 relevant segments of the CFTR gene, simultaneously. These PCR amplicons serve as templates for the multiplex OLA, which detects the normal or mutant allele at all loci, simultaneously. Each polymorphic site is interrogated by three oligonucleotide probes, a common probe and two allele-specific probes. Each common probe is tagged with a fluorescent dye, and the competing normal and mutant allelic probes incorporate different, non-nucleotide, mobility modifiers. These modifiers are composed of hexaethylene oxide (HEO) units, incorporated as HEO phosphoramidite monomers during automated DNA synthesis. The OLA is based on both probe hybridization and the ability of DNA ligase to discriminate single base mismatches at the junction between paired probes. Each single tube assay is electrophoresed in a single gel lane of a 4-color fluorescent DNA sequencer (Applied Biosystems, Model 373A). Each of the ligation products is identified by its unique combination of electrophoretic mobility and one of three colors. The fourth color is reserved for the in-lane size standard, used by GENESCAN™ software (Applied Biosystems) to size the OLA electrophoresis products. The Genotyper™ software (Applied Biosystems) decodes these Sequence-Coded-Separation data to create a patient summary report for all loci tested.

  6. Triple trans-splicing adeno-associated virus vectors capable of transferring the coding sequence for full-length dystrophin protein into dystrophic mice.

    PubMed

    Koo, Taeyoung; Popplewell, Linda; Athanasopoulos, Takis; Dickson, George

    2014-02-01

    Recombinant adeno-associated virus (rAAV) vectors have been shown to permit very efficient widespread transgene expression in skeletal muscle after systemic delivery, making these increasingly attractive as vectors for Duchenne muscular dystrophy (DMD) gene therapy. DMD is a severe muscle-wasting disorder caused by DMD gene mutations leading to complete loss of dystrophin protein. One of the major issues associated with delivery of the DMD gene, as a therapeutic approach for DMD, is its large open reading frame (ORF; 11.1 kb). A series of truncated microdystrophin cDNAs (delivered via a single AAV) and minidystrophin cDNAs (delivered via dual-AAV trans-spliced/overlapping reconstitution) have thus been extensively tested in DMD animal models. However, critical rod and hinge domains of dystrophin required for interaction with components of the dystrophin-associated protein complex, such as neuronal nitric oxide synthase, syntrophin, and dystrobrevin, are missing; these dystrophin domains may still need to be incorporated to increase dystrophin functionality and stabilize membrane rigidity. Full-length DMD gene delivery using AAV vectors remains elusive because of the limited single-AAV packaging capacity (4.7 kb). Here we developed a novel method for the delivery of the full-length DMD coding sequence to skeletal muscles in dystrophic mdx mice using a triple-AAV trans-splicing vector system. We report for the first time that three independent AAV vectors carrying "in tandem" sequential exonic parts of the human DMD coding sequence enable the expression of the full-length protein as a result of trans-splicing events cojoining three vectors via their inverted terminal repeat sequences. This method of triple-AAV-mediated trans-splicing could be applicable to the delivery of any large therapeutic gene (≥11 kb ORF) into postmitotic tissues (muscles or neurons) for the treatment of various inherited metabolic and genetic diseases.

  7. RNA sequencing and functional analysis implicate the regulatory role of long non-coding RNAs in tomato fruit ripening.

    PubMed

    Zhu, Benzhong; Yang, Yongfang; Li, Ran; Fu, Daqi; Wen, Liwei; Luo, Yunbo; Zhu, Hongliang

    2015-08-01

    Recently, long non-coding RNAs (lncRNAs) have been shown to play critical regulatory roles in model plants, such as Arabidopsis, rice, and maize. However, the presence of lncRNAs and how they function in fleshy fruit ripening are still largely unknown because fleshy fruit ripening is not present in the above model plants. Tomato is the model system for fruit ripening studies due to its dramatic ripening process. To investigate further the role of lncRNAs in fruit ripening, it is necessary and urgent to discover and identify novel lncRNAs and understand the function of lncRNAs in tomato fruit ripening. Here it is reported that 3679 lncRNAs were discovered from wild-type tomato and ripening mutant fruit. The lncRNAs are transcribed from all tomato chromosomes, 85.1% of which came from intergenic regions. Tomato lncRNAs are shorter and have fewer exons than protein-coding genes, a situation reminiscent of lncRNAs from other model plants. It was also observed that 490 lncRNAs were significantly up-regulated in ripening mutant fruits, and 187 lncRNAs were down-regulated, indicating that lncRNAs could be involved in the regulation of fruit ripening. In line with this, silencing of two novel tomato intergenic lncRNAs, lncRNA1459 and lncRNA1840, resulted in an obvious delay of ripening of wild-type fruit. Overall, the results indicated that lncRNAs might be essential regulators of tomato fruit ripening, which sheds new light on the regulation of fruit ripening.

  8. RNA sequencing and functional analysis implicate the regulatory role of long non-coding RNAs in tomato fruit ripening

    PubMed Central

    Zhu, Benzhong; Yang, Yongfang; Li, Ran; Fu, Daqi; Wen, Liwei; Luo, Yunbo; Zhu, Hongliang

    2015-01-01

    Recently, long non-coding RNAs (lncRNAs) have been shown to play critical regulatory roles in model plants, such as Arabidopsis, rice, and maize. However, the presence of lncRNAs and how they function in fleshy fruit ripening are still largely unknown because fleshy fruit ripening is not present in the above model plants. Tomato is the model system for fruit ripening studies due to its dramatic ripening process. To investigate further the role of lncRNAs in fruit ripening, it is necessary and urgent to discover and identify novel lncRNAs and understand the function of lncRNAs in tomato fruit ripening. Here it is reported that 3679 lncRNAs were discovered from wild-type tomato and ripening mutant fruit. The lncRNAs are transcribed from all tomato chromosomes, 85.1% of which came from intergenic regions. Tomato lncRNAs are shorter and have fewer exons than protein-coding genes, a situation reminiscent of lncRNAs from other model plants. It was also observed that 490 lncRNAs were significantly up-regulated in ripening mutant fruits, and 187 lncRNAs were down-regulated, indicating that lncRNAs could be involved in the regulation of fruit ripening. In line with this, silencing of two novel tomato intergenic lncRNAs, lncRNA1459 and lncRNA1840, resulted in an obvious delay of ripening of wild-type fruit. Overall, the results indicated that lncRNAs might be essential regulators of tomato fruit ripening, which sheds new light on the regulation of fruit ripening. PMID:25948705

  9. Deep sequencing for de novo construction of a marine fish (Sparus aurata) transcriptome database with a large coverage of protein-coding transcripts

    PubMed Central

    2013-01-01

    Background: The gilthead sea bream (Sparus aurata) is the main fish species cultured in the Mediterranean area and constitutes an interesting model of research. Nevertheless, transcriptomic and genomic data are still scarce for this highly valuable species. A transcriptome database was constructed by de novo assembly of gilthead sea bream sequences derived from public repositories of mRNA and collections of expressed sequence tags together with new high-quality reads from five cDNA 454 normalized libraries of skeletal muscle (1), intestine (1), head kidney (2) and blood (1). Results: Sequencing of the new 454 normalized libraries produced 2,945,914 high-quality reads and the de novo global assembly yielded 125,263 unique sequences with an average length of 727 nt. Blast analysis directed to protein and nucleotide databases annotated 63,880 sequences encoding for 21,384 gene descriptions, that were curated for redundancies and frameshifting at the homopolymer regions of open reading frames, and hosted at http://www.nutrigroup-iats.org/seabreamdb. Among the annotated gene descriptions, 16,177 were mapped in the Ingenuity Pathway Analysis (IPA) database, and 10,899 were eligible for functional analysis with a representation in 341 out of 372 IPA canonical pathways. The high representation of randomly selected stickleback transcripts by Blast search in the nucleotide gilthead sea bream database evidenced its high coverage of protein-coding transcripts. Conclusions: The newly assembled gilthead sea bream transcriptome represents a progress in genomic resources for this species, as it probably contains more than 75% of actively transcribed genes, constituting a valuable tool to assist studies on functional genomics and future genome projects. PMID:23497320

  10. The walnut (Juglans regia) genome sequence reveals diversity in genes coding for the biosynthesis of non-structural polyphenols.

    PubMed

    Martínez-García, Pedro J; Crepeau, Marc W; Puiu, Daniela; Gonzalez-Ibeas, Daniel; Whalen, Jeanne; Stevens, Kristian A; Paul, Robin; Butterfield, Timothy S; Britton, Monica T; Reagan, Russell L; Chakraborty, Sandeep; Walawage, Sriema L; Vasquez-Gross, Hans A; Cardeno, Charis; Famula, Randi A; Pratt, Kevin; Kuruganti, Sowmya; Aradhya, Mallikarjuna K; Leslie, Charles A; Dandekar, Abhaya M; Salzberg, Steven L; Wegrzyn, Jill L; Langley, Charles H; Neale, David B

    2016-09-01

    The Persian walnut (Juglans regia L.), a diploid species native to the mountainous regions of Central Asia, is the major walnut species cultivated for nut production and is one of the most widespread tree nut species in the world. The high nutritional value of J. regia nuts is associated with a rich array of polyphenolic compounds, whose complete biosynthetic pathways are still unknown. A J. regia genome sequence was obtained from the cultivar 'Chandler' to discover target genes and additional unknown genes. The 667-Mbp genome was assembled using two different methods (SOAPdenovo2 and MaSuRCA), with an N50 scaffold size of 464 955 bp (based on a genome size of 606 Mbp), 221 640 contigs and a GC content of 37%. Annotation with MAKER-P and other genomic resources yielded 32 498 gene models. Previous studies in walnut relying on tissue-specific methods have only identified a single polyphenol oxidase (PPO) gene (JrPPO1). Enabled by the J. regia genome sequence, a second homolog of PPO (JrPPO2) was discovered. In addition, about 130 genes in the large gallate 1-β-glucosyltransferase (GGT) superfamily were detected. Specifically, two genes, JrGGT1 and JrGGT2, were significantly homologous to the GGT from Quercus robur (QrGGT), which is involved in the synthesis of 1-O-galloyl-β-d-glucose, a precursor for the synthesis of hydrolysable tannins. The reference genome for J. regia provides meaningful insight into the complex pathways required for the synthesis of polyphenols. The walnut genome sequence provides important tools and methods to accelerate breeding and to facilitate the genetic dissection of complex traits.
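
The N50 scaffold size quoted in this abstract is a standard assembly statistic: the length L such that scaffolds of length L or longer together cover at least half of a reference total. The Python sketch below is illustrative only and is not taken from the walnut study; the function name `n50`, its `genome_size` parameter, and the toy lengths are assumptions. Passing `genome_size` mirrors the abstract's wording "based on a genome size of 606 Mbp", where the reference total is an assumed genome size rather than the summed assembly length.

```python
def n50(lengths, genome_size=None):
    """Return the N50 of a collection of sequence lengths.

    If genome_size is given, half of that value is used as the target
    instead of half of the summed lengths (hypothetical parameter).
    Returns None when the sequences cover less than half of genome_size.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return None


# Toy scaffold lengths, not real data.
print(n50([2, 3, 4, 5, 6]))                 # 5
print(n50([10, 20, 30], genome_size=100))   # 20
```

Using an assumed genome size rather than the assembly length makes values comparable between assemblies of the same genome that differ in total size.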

  11. Cloning and nucleotide sequence of the gene coding for enzymatically active fragments of the Bacillus polymyxa beta-amylase.

    PubMed

    Kawazu, T; Nakanishi, Y; Uozumi, N; Sasaki, T; Yamagata, H; Tsukagoshi, N; Udaka, S

    1987-04-01

    The gene encoding beta-amylase was cloned from Bacillus polymyxa 72 into Escherichia coli HB101 by inserting HindIII-generated DNA fragments into the HindIII site of pBR322. The 4.8-kilobase insert was shown to direct the synthesis of beta-amylase. A 1.8-kilobase AccI-AccI fragment of the donor strain DNA was sufficient for the beta-amylase synthesis. Homologous DNA was found by Southern blot analysis to be present only in B. polymyxa 72 and not in other bacteria such as E. coli or B. subtilis. B. polymyxa, as well as E. coli harboring the cloned DNA, was found to produce enzymatically active fragments of beta-amylases (70,000, 56,000, or 58,000, and 42,000 daltons), which were detected in situ by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Nucleotide sequence analysis of the cloned 3.1-kilobase DNA revealed that it contains one open reading frame of 2,808 nucleotides without a translational stop codon. The deduced amino acid sequence for these 2,808 nucleotides encoding a secretory precursor of the beta-amylase protein is 936 amino acids including a signal peptide of 33 or 35 residues at its amino-terminal end. The existence of a beta-amylase of larger than 100,000 daltons, which was predicted on the basis of the results of nucleotide sequence analysis of the gene, was confirmed by examining culture supernatants after various cultivation periods. It existed only transiently during cultivation, but the multiform beta-amylases described above existed for a long time. The large beta-amylase (approximately 160,000 daltons) existed for longer in the presence of a protease inhibitor such as chymostatin, suggesting that proteolytic cleavage is the cause of the formation of multiform beta-amylases.
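
The figures in this abstract can be checked with simple arithmetic: an open reading frame of 2,808 nucleotides with no stop codon corresponds to 2,808 / 3 = 936 codons, hence 936 amino acids. The Python sketch below is a hedged illustration; the function names and the average residue mass of 110 Da are assumptions (a common rule of thumb, not a value from the abstract), so the mass is only a rough estimate.

```python
AVG_RESIDUE_MASS_DA = 110  # rule-of-thumb average; an assumption, not from the abstract


def residues_from_orf(orf_nt, includes_stop=False):
    """Number of amino acids encoded by an ORF of orf_nt nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # A terminal stop codon, if counted in the length, encodes no residue.
    return codons - 1 if includes_stop else codons


def approx_mass_da(n_residues, avg_mass=AVG_RESIDUE_MASS_DA):
    """Rough protein mass estimate in daltons."""
    return n_residues * avg_mass


n = residues_from_orf(2808)   # ORF length reported in the abstract
print(n, approx_mass_da(n))   # 936 102960
```

Under that assumed average mass, the estimate of roughly 103,000 daltons is consistent with the abstract's prediction of a beta-amylase larger than 100,000 daltons.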

  12. Reticulate evolution and incomplete lineage sorting among the ponderosa pines.

    PubMed

    Willyard, Ann; Cronn, Richard; Liston, Aaron

    2009-08-01

    Interspecific gene flow via hybridization may play a major role in evolution by creating reticulate rather than hierarchical lineages in plant species. Occasional diploid pine hybrids indicate the potential for introgression, but reticulation is hard to detect because ancestral polymorphism is still shared across many groups of pine species. Nucleotide sequences for 53 accessions from 17 species in subsection Ponderosae (Pinus) provide evidence for reticulate evolution. Two discordant patterns among independent low-copy nuclear gene trees and a chloroplast haplotype are better explained by introgression than incomplete lineage sorting or other causes of incongruence. Conflicting resolution of three monophyletic Pinus coulteri accessions is best explained by ancient introgression followed by a genetic bottleneck. More recent hybridization transferred a chloroplast from P. jeffreyi to a sympatric P. washoensis individual. We conclude that incomplete lineage sorting could account for other examples of non-monophyly, and caution against any analysis based on single-accession or single-locus sampling in Pinus.

  13. Molecular cloning of the goose ACSL3 and ACSL5 coding domain sequences and their expression characteristics during goose fatty liver development.

    PubMed

    He, H; Liu, H H; Wang, J W; Lv, J; Li, L; Pan, Z X

    2014-01-01

    It has been demonstrated that ACSL3 and ACSL5 play important roles in fat metabolism. To investigate the primary functions of ACSL3 and ACSL5 and to evaluate their expression levels during goose fatty liver development, we cloned the ACSL3 and ACSL5 coding domain sequences (CDSs) of geese using RT-PCR and analyzed their expression characteristics under different conditions using qRT-PCR. The results showed that the goose ACSL3 (JX511975) and ACSL5 (JX511976) sequences have high similarities with the chicken sequences both at the nucleotide and amino acid levels. Both ACSL3 and ACSL5 have high expression levels in goose liver. The expression levels of ACSL3 and ACSL5 in goose liver and hepatocytes can be changed by overfeeding geese and by treatment with unsaturated fatty acids, respectively. Together, these results indicate that ACSL3 and ACSL5 play important roles during fatty liver development. The different expression characteristics of goose ACSL3 and ACSL5 suggest that these two genes may be responsible for specific functions.

  14. Investigation of complete and incomplete fusion in 20Ne + 51V system using recoil range measurement

    NASA Astrophysics Data System (ADS)

    Ali, Sabir; Ahmad, Tauseeef; Kumar, Kamal; Rizvi, I. A.; Agarwal, Avinash; Ghugre, S. S.; Sinha, A. K.; Chaubey, A. K.

    2015-01-01

    Recoil range distributions of evaporation residues populated in the 20Ne + 51V reaction at Elab ≈ 145 MeV have been studied to determine the degree of momentum transferred in the complete and incomplete fusion reactions. Evaporation residues (ERs) populated through the complete and incomplete fusion reactions have been identified on the basis of their recoil range in the Al catcher medium. Measured recoil ranges of the evaporation residues have been compared with theoretical values calculated using the code SRIM. Range-integrated cross sections of the observed ERs have been compared with the values predicted by the statistical model code PACE4.

  15. New Insights into Flavivirus Evolution, Taxonomy and Biogeographic History, Extended by Analysis of Canonical and Alternative Coding Sequences

    PubMed Central

    Moureau, Gregory; Cook, Shelley; Lemey, Philippe; Nougairede, Antoine; Forrester, Naomi L.; Khasnatinov, Maxim; Charrel, Remi N.; Firth, Andrew E.; Gould, Ernest A.; de Lamballerie, Xavier

    2015-01-01

    To generate the most diverse phylogenetic dataset for the flaviviruses to date, we determined the genomic sequences and phylogenetic relationships of 14 flaviviruses, of which 10 are primarily associated with Culex spp. mosquitoes. We analyze these data, in conjunction with a comprehensive collection of flavivirus genomes, to characterize flavivirus evolutionary and biogeographic history in unprecedented detail and breadth. Based on the presumed introduction of yellow fever virus into the Americas via the transatlantic slave trade, we extrapolated a timescale for a relevant subset of flaviviruses whose evolutionary history shows that different Culex-spp. associated flaviviruses have been introduced from the Old World to the New World on at least five separate occasions, with 2 different sets of factors likely to have contributed to the dispersal of the different viruses. We also discuss the significance of programmed ribosomal frameshifting in a central region of the polyprotein open reading frame in some mosquito-associated flaviviruses. PMID:25719412

  16. Coding sequences and levels of expression of Hsc70t are identical in mice with different Orch-1 alleles

    SciTech Connect

    Snoek, M.; Vugt, H. van; Olavesen, M.G.; Milner, C.M.; Campbell, R.D.; Teuscher, C.

    1994-12-31

    Experimental allergic orchitis (EAO) is an autoimmune disease of the testis that is controlled by multiple genes. The use of recombinant mouse strains has defined the map position of the H-2-associated locus controlling disease susceptibility, Orch-1, within the H-2S/H-2D interval. Over the last few years the definition of the structural organization of the C4-H-2D segment and identification of the recombination sites of the various intra-H-2 recombinations has reduced the map position of Orch-1 to the Hsp70.1-G7 interval. Three Hsp70 genes, Hsp70.1, Hsp70.3, and Hsc70t, and the genes G7b and G7a are located in this segment of DNA. In order to investigate whether Hsc70t is a suitable candidate for Orch-1 we have compared the sequence of the gene from a susceptible and a resistant haplotype.

  17. New insights into flavivirus evolution, taxonomy and biogeographic history, extended by analysis of canonical and alternative coding sequences.

    PubMed

    Moureau, Gregory; Cook, Shelley; Lemey, Philippe; Nougairede, Antoine; Forrester, Naomi L; Khasnatinov, Maxim; Charrel, Remi N; Firth, Andrew E; Gould, Ernest A; de Lamballerie, Xavier

    2015-01-01

    To generate the most diverse phylogenetic dataset for the flaviviruses to date, we determined the genomic sequences and phylogenetic relationships of 14 flaviviruses, of which 10 are primarily associated with Culex spp. mosquitoes. We analyze these data, in conjunction with a comprehensive collection of flavivirus genomes, to characterize flavivirus evolutionary and biogeographic history in unprecedented detail and breadth. Based on the presumed introduction of yellow fever virus into the Americas via the transatlantic slave trade, we extrapolated a timescale for a relevant subset of flaviviruses whose evolutionary history shows that different Culex-spp. associated flaviviruses have been introduced from the Old World to the New World on at least five separate occasions, with 2 different sets of factors likely to have contributed to the dispersal of the different viruses. We also discuss the significance of programmed ribosomal frameshifting in a central region of the polyprotein open reading frame in some mosquito-associated flaviviruses.

  18. Sequence Diversity in Coding Regions of Candidate Genes in the Glycoalkaloid Biosynthetic Pathway of Wild Potato Species

    PubMed Central

    Manrique-Carpintero, Norma C.; Tokuhisa, James G.; Ginzberg, Idit; Holliday, Jason A.; Veilleux, Richard E.

    2013-01-01

    Natural variation in five candidate genes of the steroidal glycoalkaloid (SGA) metabolic pathway and whole-genome single nucleotide polymorphism (SNP) genotyping were studied in six wild [Solanum chacoense (chc 80-1), S. commersonii, S. demissum, S. sparsipilum, S. spegazzinii, S. stoloniferum] and cultivated S. tuberosum Group Phureja (phu DH) potato species with contrasting levels of SGAs. Amplicons were sequenced for five candidate genes: 3-hydroxy-3-methylglutaryl coenzyme A reductase 1 and 2 (HMG1, HMG2) and 2.3-squalene epoxidase (SQE) of primary metabolism, and solanidine galactosyltransferase (SGT1), and glucosyltransferase (SGT2) of secondary metabolism. SNPs (n = 337) producing 354 variations were detected within 3.7 kb of sequenced DNA. More polymorphisms were found in introns than exons and in genes of secondary compared to primary metabolism. Although no significant deviation from neutrality was found, dN/dS ratios < 1 and negative values of Tajima’s D test suggested purifying selection and genetic hitchhiking in the gene fragments. In addition, patterns of dN/dS ratios across the SGA pathway suggested constraint by natural selection. Comparison of nucleotide diversity estimates and dN/dS ratios showed stronger selective constraints for genes of primary rather than secondary metabolism. SNPs (n = 24) with an exclusive genotype for either phu DH (low SGA) or chc 80-1 (high SGA) were identified for HMG2, SQE, SGT1 and SGT2. The SolCAP 8303 Illumina Potato SNP chip genotyping revealed eight informative SNPs on six pseudochromosomes, with homozygous and heterozygous genotypes that discriminated high, intermediate and low levels of SGA accumulation. These results can be used to evaluate SGA accumulation in segregating or association mapping populations. PMID:23853090

  19. 7 CFR 763.8 - Incomplete applications.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... days of receipt of an incomplete application, the Agency will provide the seller and buyer written notice of any additional information that must be provided. The seller or buyer, as applicable,...

  20. Analysis of five presumptive protein-coding sequences clustered between the primosome genes, 41 and 61, of bacteriophages T4, T2, and T6.

    PubMed Central

    Selick, H E; Stormo, G D; Dyson, R L; Alberts, B M

    1993-01-01

    In bacteriophage T4, there is a strong tendency for genes that encode interacting proteins to be clustered on the chromosome. There is 1.6 kb of DNA between the DNA helicase (gene 41) and the DNA primase (gene 61) genes of this virus. The DNA sequence of this region suggests that it contains five genes, designated as open reading frames (ORFs) 61.1 to 61.5, predicted to encode proteins ranging in size from 5.94 to 22.88 kDa. Are these ORFs actually genes? As one test, we compared the DNA sequence of this region in bacteriophages T2, T4, and T6 and found that ORFs 61.1, 61.3, 61.4, and 61.5 are highly conserved among the three closely related viruses. In contrast, ORF 61.2 is conserved between phages T4 and T6 yet is absent from phage T2, where it is replaced by another ORF, T2 ORF 61.2, which is not found in the T4 and T6 genomes. As a second, independent test for coding sequences, we calculated the codon base position preferences for all ORFs in this region that could encode proteins that contain at least 30 amino acids. Both the T4/T6 and T2 versions of ORF 61.2, as well as the other ORFs, have codon base position preferences that are indistinguishable from those of known T4 genes (coefficients of 0.81 to 0.94); the six other possible ORFs of at least 90 bp in this region are ruled out as genes by this test (coefficients less than zero). Thus, both evolutionary conservation and codon usage patterns lead us to conclude that ORFs 61.1 to 61.5 represent important protein-coding sequences for this family of bacteriophages. Because they are located between the genes that encode the two interacting proteins of the T4 primosome (DNA helicase plus DNA primase), one or more may function in DNA replication by modulating primosome function. PMID:8383243
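
The length threshold in this abstract (ORFs able to encode at least 30 amino acids, i.e. at least 90 bp) can be illustrated with a minimal ORF scan. The Python sketch below is not the authors' method: it only enumerates candidate ORFs and does not compute codon base position preferences. The function name `find_orfs`, the restriction to ATG starts and the forward strand, and the toy sequences are all simplifying assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons


def find_orfs(seq, min_coding_bp=90):
    """Return (start, end, frame) for forward-strand ATG..stop ORFs.

    start is the 0-based index of the ATG, end is one past the stop codon.
    min_coding_bp counts the coding part only (stop codon excluded), so the
    default of 90 bp corresponds to at least 30 codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if i - start >= min_coding_bp:
                    orfs.append((start, i + 3, frame))
                start = None  # resume searching after the stop codon
    return orfs


# Toy sequences, not phage data: ATG plus 29 alanine codons, then a stop.
long_orf = "ATG" + "GCT" * 29 + "TAA"
short_orf = "ATG" + "GCT" * 5 + "TAA"
print(find_orfs(long_orf))    # [(0, 93, 0)]
print(find_orfs(short_orf))   # []
```

A fuller analysis would also scan the reverse complement and consider alternative start codons; those are omitted here to keep the sketch short.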

  1. Student Use of Physics to Make Sense of Incomplete but Functional VPython Programs in a Lab Setting

    NASA Astrophysics Data System (ADS)

    Weatherford, Shawn A.

    2011-12-01

    Computational activities in Matter & Interactions, an introductory calculus-based physics course, have the instructional goal of providing students with the experience of applying the same set of a small number of fundamental principles to model a wide range of physical systems. However, there are significant instructional challenges for students to build computer programs under limited time constraints, especially for students who are unfamiliar with programming languages and concepts. Prior attempts at designing effective computational activities were successful at having students ultimately build working VPython programs under the tutelage of experienced teaching assistants in a studio lab setting. A pilot study revealed that students who completed these computational activities had significant difficulty repeating the exact same tasks and further, had difficulty predicting the animation that would be produced by the example program after interpreting the program code. This study explores the interpretation and prediction tasks as part of an instructional sequence where students are asked to read and comprehend a functional, but incomplete program. Rather than asking students to begin their computational tasks with modifying program code, we explicitly ask students to interpret an existing program that is missing key lines of code. The missing lines of code correspond to the algebraic form of fundamental physics principles or the calculation of forces which would exist between analogous physical objects in the natural world. Students are then asked to draw a prediction of what they would see in the simulation produced by the VPython program and ultimately run the program to evaluate the students' prediction. This study specifically looks at how the participants use physics while interpreting the program code and creating a whiteboard prediction. This study also examines how students evaluate their understanding of the program and modification goals at the

  2. Human transforming growth factor type α coding sequence is not a direct-acting oncogene when overexpressed in NIH 3T3 cells

    SciTech Connect

    Finzi, E.; Fleming, T.; Segatto, O.; Pennington, C.Y.; Bringman, T.S.; Derynck, R.; Aaronson, S.A.

    1987-06-01

    A peptide secreted by some tumor cells in vitro imparts anchorage-independent growth to normal rat kidney (NRK) cells and has been termed transforming growth factor type α (TGF-α). To directly investigate the transforming properties of this factor, the human sequence coding for TGF-α was placed under the control of either a metallothionein promoter or a retroviral long terminal repeat. These constructs failed to induce morphological transformation upon transfection of NIH 3T3 cells, whereas viral oncogenes encoding a truncated form of its cognate receptor, the EGF receptor, or another growth factor, sis/platelet-derived growth factor 2, efficiently induced transformed foci. Binding assays were done using (125I)-EGF. When NIH 3T3 clonal sublines were selected by transfection of TGF-α expression vectors in the presence of a dominant selectable marker, they were shown to secrete large amounts of TGF-α into the medium, to have downregulated EGF receptors, and to be inhibited in growth by TGF-α monoclonal antibody. These results indicated that secreted TGF-α interacts with its receptor at a cell surface location. Single cell-derived TGF-α-expressing sublines grew to high saturation density in culture. These and other results imply that TGF-α exerts a growth-promoting effect on the entire NIH 3T3 cell population after secretion into the medium but little, if any, effect on the individual cell synthesizing this factor. It is concluded that the normal coding sequence for TGF-α is not a direct-acting oncogene when overexpressed in NIH 3T3 cells.

  3. The human transforming growth factor type alpha coding sequence is not a direct-acting oncogene when overexpressed in NIH 3T3 cells.

    PubMed Central

    Finzi, E; Fleming, T; Segatto, O; Pennington, C Y; Bringman, T S; Derynck, R; Aaronson, S A

    1987-01-01

    A peptide secreted by some tumor cells in vitro imparts anchorage-independent growth to normal rat kidney (NRK) cells and has been termed transforming growth factor type alpha (TGF-alpha). To directly investigate the transforming properties of this factor, the human sequence coding for TGF-alpha was placed under the control of either a metallothionein promoter or a retroviral long terminal repeat. These constructs failed to induce morphological transformation upon transfection of NIH 3T3 cells, whereas viral oncogenes encoding a truncated form of its cognate receptor, the EGF receptor, or another growth factor, sis/platelet-derived growth factor 2, efficiently induced transformed foci. When NIH 3T3 clonal sublines were selected by transfection of TGF-alpha expression vectors in the presence of a dominant selectable marker, they were shown to secrete large amounts of TGF-alpha into the medium, to have downregulated EGF receptors, and to be inhibited in growth by TGF-alpha monoclonal antibody. These results indicated that secreted TGF-alpha interacts with its receptor at a cell surface location. Single cell-derived TGF-alpha-expressing sublines grew to high saturation density in culture. However, when plated as single cells on contact-inhibited monolayers of NIH 3T3 cells, they failed to form colonies, whereas v-sis- and v-erbB-transfected cells formed transformed colonies under the same conditions. Moreover, TGF-alpha-expressing sublines were not tumorigenic in nude mice. These and other results imply that TGF-alpha exerts a growth-promoting effect on the entire NIH 3T3 cell population after secretion into the medium but little, if any, effect on the individual cell synthesizing this factor. It is concluded that the normal coding sequence for TGF-alpha is not a direct-acting oncogene when overexpressed in NIH 3T3 cells. PMID:3035551

  4. Cloning, sequence analysis, and expression in Escherichia coli of a gene coding for a beta-mannanase from the extremely thermophilic bacterium "Caldocellum saccharolyticum".

    PubMed Central

    Lüthi, E; Jasmat, N B; Grayling, R A; Love, D R; Bergquist, P L

    1991-01-01

    A lambda recombinant phage expressing beta-mannanase activity in Escherichia coli has been isolated from a genomic library of the extremely thermophilic anaerobe "Caldocellum saccharolyticum." The gene was cloned into pBR322 on a 5-kb BamHI fragment, and its location was obtained by deletion analysis. The sequence of a 2.1-kb fragment containing the mannanase gene has been determined. One open reading frame was found which could code for a protein of Mr 38,904. The mannanase gene (manA) was overexpressed in E. coli by cloning the gene downstream from the lacZ promoter of pUC18. The enzyme was most active at pH 6 and 80 degrees C and degraded locust bean gum, guar gum, Pinus radiata glucomannan, and konjak glucomannan. The noncoding region downstream from the mannanase gene showed strong homology to celB, a gene coding for a cellulase from the same organism, suggesting that the manA gene might have been inserted into its present position on the "C. saccharolyticum" genome by homologous recombination. PMID:2039230

  5. Plasmid- and chromosome-coded aerobactin synthesis in enteric bacteria: insertion sequences flank operon in plasmid-mediated systems.

    PubMed Central

    McDougall, S; Neilands, J B

    1984-01-01

    Large plasmids were detected in two aerobactin-producing enteric bacterial species (Aerobacter aerogenes 62-I, Salmonella arizona SA1, and S. arizona SL5301) and designated pSMN1, pSMN2, and pSMN3, respectively. Other Salmonella spp., namely, S. arizona SL5302, S. arizona SLS, Salmonella austin, and Salmonella memphis, formed aerobactin but contained no detectable large plasmids. S. arizona SL5283 made no aerobactin. A probe consisting of the aerobactin biosynthetic genes cloned on plasmid pABN5 hybridized to a HindIII digest of pSMN1 but not to digests of pSMN2 or pSMN3. A larger probe, the insert of pABN1 containing the complete aerobactin operon, hybridized to four fragments in HindIII digests of the parent plasmid, pColV-K30. A 2.0-kilobase PvuII fragment responsible for this multiple-hybridization pattern was cloned into vector pUC9 to form pSMN30. The latter was mapped and shown to correspond to either IS1 or to a closely related insertion sequence. PMID:6330037

  6. Incomplete Lineage Sorting Is Common in Extant Gibbon Genera

    PubMed Central

    Luca, Francesca; Carbone, Lucia; Mootnick, Alan R.; de Jong, Pieter J.; Di Rienzo, Anna

    2013-01-01

    We sequenced reduced representation libraries by means of Illumina technology to generate over 1.5 Mb of orthologous sequence from a representative of each of the four extant gibbon genera (Nomascus, Hylobates, Symphalangus, and Hoolock). We used these data to assess the evolutionary relationships between the genera by evaluating the likelihoods of all possible bifurcating trees involving the four taxa. Our analyses provide weak support for a tree with Nomascus and Hylobates as sister taxa and with Hoolock and Symphalangus as sister taxa, though bootstrap resampling suggests that other phylogenetic scenarios are also possible. This uncertainty is due to short internal branch lengths and extensive incomplete lineage sorting across taxa. The true phylogenetic relationships among gibbon genera will likely require a more extensive whole-genome sequence analysis. PMID:23341974
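
    As a rough illustration of what "all possible bifurcating trees involving the four taxa" can mean, the sketch below enumerates the three unrooted topologies for a quartet. It assumes unrooted trees (the abstract does not say whether rooted trees were also evaluated), the function names are hypothetical, and no likelihoods are computed.

```python
def quartet_topologies(taxa):
    """Return the three unrooted bifurcating topologies for four taxa.

    Each topology is a pair of sister pairs, e.g. (("A", "B"), ("C", "D")).
    """
    taxa = list(taxa)
    if len(set(taxa)) != 4:
        raise ValueError("exactly four distinct taxa are required")
    first, rest = taxa[0], taxa[1:]
    topologies = []
    for partner in rest:
        # Pair the first taxon with each possible sister; the other two
        # taxa then form the second sister pair.
        pair_a = (first, partner)
        pair_b = tuple(t for t in rest if t != partner)
        topologies.append((pair_a, pair_b))
    return topologies


def count_unrooted_trees(n):
    """Number of unrooted bifurcating trees on n >= 3 labelled taxa: (2n - 5)!!"""
    if n < 3:
        raise ValueError("n must be at least 3")
    count = 1
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    genera = ["Nomascus", "Hylobates", "Symphalangus", "Hoolock"]
    for pair_a, pair_b in quartet_topologies(genera):
        print(f"({pair_a[0]},{pair_a[1]}) | ({pair_b[0]},{pair_b[1]})")
```

    The first topology listed pairs Nomascus with Hylobates and Symphalangus with Hoolock, which is the arrangement the abstract reports as weakly supported; a likelihood comparison would score each of the three candidates against the sequence data.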

  7. Incomplete lineage sorting is common in extant gibbon genera.

    PubMed

    Wall, Jeffrey D; Kim, Sung K; Luca, Francesca; Carbone, Lucia; Mootnick, Alan R; de Jong, Pieter J; Di Rienzo, Anna

    2013-01-01

    We sequenced reduced representation libraries by means of Illumina technology to generate over 1.5 Mb of orthologous sequence from a representative of each of the four extant gibbon genera (Nomascus, Hylobates, Symphalangus, and Hoolock). We used these data to assess the evolutionary relationships between the genera by evaluating the likelihoods of all possible bifurcating trees involving the four taxa. Our analyses provide weak support for a tree with Nomascus and Hylobates as sister taxa and with Hoolock and Symphalangus as sister taxa, though bootstrap resampling suggests that other phylogenetic scenarios are also possible. This uncertainty is due to short internal branch lengths and extensive incomplete lineage sorting across taxa. The true phylogenetic relationships among gibbon genera will likely require a more extensive whole-genome sequence analysis.

  8. Sequence analysis of coding DNA fragments of pfcrt and pfmdr-1 genes in Plasmodium falciparum isolates from Odisha, India.

    PubMed

    Sutar, Sasmita Kumari Das; Gupta, Bhavna; Ranjit, Manoranjan; Kar, Shantanu Kumar; Das, Aparup

    2011-02-01

    The global emergence and spread of malaria parasites resistant to antimalarial drugs is the major problem in malaria control. The genetic basis of the parasite's resistance to the antimalarial drug chloroquine (CQ) is well-documented, allowing for the analysis of field isolates of malaria parasites to address evolutionary questions concerning the origin and spread of CQ-resistance. Here, we present DNA sequence analyses of both the second exon of the Plasmodium falciparum CQ-resistance transporter (pfcrt) gene and the 5' end of the P. falciparum multidrug-resistance 1 (pfmdr-1) gene in 40 P. falciparum field isolates collected from eight different localities of Odisha, India. First, we genotyped the samples for the pfcrt K76T and pfmdr-1 N86Y mutations in these two genes, which are the mutations primarily implicated in CQ-resistance. We further analyzed amino acid changes in codons 72-76 of the pfcrt haplotypes. Interestingly, both the K76T and N86Y mutations were found to co-exist in 32 out of the total 40 isolates, which were of either the CVIET or SVMNT haplotype, while the remaining eight isolates were of the CVMNK haplotype. In total, eight nonsynonymous single nucleotide polymorphisms (SNPs) were observed, six in the pfcrt gene and two in the pfmdr-1 gene. One poorly studied SNP in the pfcrt gene (A97T) was found at a high frequency in many P. falciparum samples. Using population genetics to analyze these two gene fragments, we revealed comparatively higher nucleotide diversity in the pfcrt gene than in the pfmdr-1 gene. Furthermore, linkage disequilibrium was found to be tight between closely spaced SNPs of the pfcrt gene. Finally, both the pfcrt and the pfmdr-1 genes were found to evolve under the standard neutral model of molecular evolution.
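
    The nucleotide diversity comparison mentioned above can be illustrated with a minimal sketch of the usual definition: the average number of pairwise differences per site. The toy sequences are hypothetical and are not data from the study; the abstract does not state which estimator or software was used, and gaps or missing data are not handled here.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site for equal-length aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    toy_alignment = ["ACGT", "ACGA", "ACGT"]
    print(nucleotide_diversity(toy_alignment))
```

    Comparing this value between two gene fragments sampled from the same isolates is one simple way to express "higher nucleotide diversity in one gene than in another".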

  9. Genome-Wide Detection of Predicted Non-coding RNAs Related to the Adhesion Process in Vibrio alginolyticus Using High-Throughput Sequencing

    PubMed Central

    Huang, Lixing; Hu, Jiao; Su, Yongquan; Qin, Yingxue; Kong, Wendi; Zhao, Lingmin; Ma, Ying; Xu, Xiaojin; Lin, Mao; Zheng, Jiang; Yan, Qingpi

    2016-01-01

    The ability of bacteria to adhere to fish mucus can be affected by environmental conditions and is considered to be a key virulence factor of Vibrio alginolyticus. However, the molecular mechanism underlying this ability remains unclear. Our previous study showed that stress conditions such as exposure to Cu, Pb, Hg, and low pH are capable of reducing the adhesion ability of V. alginolyticus. Non-coding RNAs (ncRNAs) play a crucial role in the intricate regulation of bacterial gene expression, thereby affecting bacterial pathogenicity. Thus, we hypothesized that ncRNAs play a key role in the V. alginolyticus adhesion process. To validate this, we combined high-throughput sequencing with computational techniques to detect ncRNA dynamics in samples after stress treatments. The expression of randomly selected novel ncRNAs was confirmed by QPCR. Among the significantly altered ncRNAs, 30 were up-regulated and 2 down-regulated by all stress treatments. The QPCR results reinforced the reliability of the sequencing data. Target prediction and KEGG pathway analysis indicated that these ncRNAs are closely related to pathways associated with in vitro adhesion, and our results indicated that chemical stress-induced reductions in the adhesion ability of V. alginolyticus might be due to the perturbation of ncRNA expression. Our findings provide important information for further functional characterization of ncRNAs during the adhesion process of V. alginolyticus. PMID:27199948

  10. The Human CCHC-type Zinc Finger Nucleic Acid-Binding Protein Binds G-Rich Elements in Target mRNA Coding Sequences and Promotes Translation.

    PubMed

    Benhalevy, Daniel; Gupta, Sanjay K; Danan, Charles H; Ghosal, Suman; Sun, Hong-Wei; Kazemier, Hinke G; Paeschke, Katrin; Hafner, Markus; Juranek, Stefan A

    2017-03-21

    The CCHC-type zinc finger nucleic acid-binding protein (CNBP/ZNF9) is conserved in eukaryotes and is essential for embryonic development in mammals. It has been implicated in transcriptional, as well as post-transcriptional, gene regulation; however, its nucleic acid ligands and molecular function remain elusive. Here, we use multiple systems-wide approaches to identify CNBP targets and function. We used photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) to identify 8,420 CNBP binding sites on 4,178 mRNAs. CNBP preferentially bound G-rich elements in the target mRNA coding sequences, most of which were previously found to form G-quadruplex and other stable structures in vitro. Functional analyses, including RNA sequencing, ribosome profiling, and quantitative mass spectrometry, revealed that CNBP binding did not influence target mRNA abundance but rather increased their translational efficiency. Considering that CNBP binding prevented G-quadruplex structure formation in vitro, we hypothesize that CNBP is supporting translation by resolving stable structures on mRNAs.

  11. Functional Anthology of Intrinsic Disorder. II. Cellular Components, Domains, Technical Terms, Developmental Processes and Coding Sequence Diversities Correlated with Long Disordered Regions

    PubMed Central

    Vucetic, Slobodan; Xie, Hongbo; Iakoucheva, Lilia M.; Oldfield, Christopher J.; Dunker, A. Keith; Obradovic, Zoran; Uversky, Vladimir N.

    2008-01-01

    Biologically active proteins without stable ordered structure (i.e., intrinsically disordered proteins) are attracting increased attention. Functional repertoires of ordered and disordered proteins are very different, and the ability to differentiate whether a given function is associated with intrinsic disorder or with a well-folded protein is crucial for modern protein science. However, there is a large gap between the number of proteins experimentally confirmed to be disordered and their actual number in nature. As a result, studies of functional properties of confirmed disordered proteins, while helpful in revealing the functional diversity of protein disorder, provide only a limited view. To overcome this problem, a bioinformatics approach for comprehensive study of functional roles of protein disorder was proposed in the first paper of this series (Xie H., Vucetic S., Iakoucheva L.M., Oldfield C.J., Dunker A.K., Obradovic Z., Uversky V.N. (2006) Functional anthology of intrinsic disorder. I. Biological processes and functions of proteins with long disordered regions. J. Proteome Res.). Applying this novel approach to Swiss-Prot sequences and functional keywords, we found over 238 and 302 keywords to be strongly positively or negatively correlated, respectively, with long intrinsically disordered regions. This paper describes ~90 Swiss-Prot keywords attributed to the cellular components, domains, technical terms, developmental processes and coding sequence diversities possessing strong positive and negative correlation with long disordered regions. PMID:17391015

  12. SHAPE Analysis of the RNA Secondary Structure of the Mouse Hepatitis Virus 5′ Untranslated Region and N-Terminal Nsp1 Coding Sequences

    PubMed Central

    Yang, Dong; Liu, Pinghua; Wudeck, Elyse V.; Giedroc, David P.; Leibowitz, Julian L.

    2014-01-01

    SHAPE technology was used to analyze RNA secondary structure of the 5′ most 474 nts of the MHV-A59 genome encompassing the minimal 5′ cis-acting region required for defective interfering RNA replication. The structures generated were in agreement with previous characterizations of SL1 through SL4 and two recently predicted secondary structure elements, S5 and SL5A. SHAPE provided biochemical support for four additional stem-loops not previously functionally investigated in MHV. Secondary structure predictions for 5′ regions of MHV-A59, BCoV and SARS-CoV were similar despite high sequence divergence. The patterns of SHAPE reactivity of in virio genomic RNA, ex virio genomic RNA, and in vitro synthesized RNA were similar, suggesting that binding of N protein or other proteins to virion RNA fails to protect the RNA from reaction with lipid permeable SHAPE reagent. Reverse genetic experiments suggested that SL5C and SL6 within the nsp1 coding sequence are not required for viral replication. PMID:25462342

  13. HLA-E coding and 3' untranslated region variability determined by next-generation sequencing in two West-African population samples.

    PubMed

    Castelli, Erick C; Mendes-Junior, Celso T; Sabbagh, Audrey; Porto, Iane O P; Garcia, André; Ramalho, Jaqueline; Lima, Thálitta H A; Massaro, Juliana D; Dias, Fabrício C; Collares, Cristhianna V A; Jamonneau, Vincent; Bucheton, Bruno; Camara, Mamadou; Donadi, Eduardo A

    2015-12-01

    HLA-E is a non-classical Human Leucocyte Antigen class I gene with immunomodulatory properties. Whereas HLA-E expression usually occurs at low levels, it is widely distributed amongst human tissues, has the ability to bind self and non-self antigens and to interact with NK cells and T lymphocytes, being important for immunosurveillance and also for fighting against infections. HLA-E is usually the most conserved locus among all class I genes. However, most of the previous studies evaluating HLA-E variability sequenced only a few exons or genotyped known polymorphisms. Here we report a strategy to evaluate HLA-E variability by next-generation sequencing (NGS) that might be applied to other HLA loci and present the HLA-E haplotype diversity considering the segment encoding the entire HLA-E mRNA (including 5'UTR, introns and the 3'UTR) in two African population samples, Susu from Guinea-Conakry and Lobi from Burkina Faso. Our results indicate that (a) the HLA-E gene is indeed conserved, encoding mainly two different protein molecules; (b) Africans do present several unknown HLA-E alleles presenting synonymous mutations; (c) the HLA-E 3'UTR is quite polymorphic and (d) haplotypes in the HLA-E 3'UTR are in close association with HLA-E coding alleles. NGS has proved to be an important tool for data generation in future studies evaluating variability in non-classical MHC genes.

  14. Identification of Potential Key Long Non-Coding RNAs and Target Genes Associated with Pneumonia Using Long Non-Coding RNA Sequencing (lncRNA-Seq): A Preliminary Study

    PubMed Central

    Huang, Sai; Feng, Cong; Chen, Li; Huang, Zhi; Zhou, Xuan; Li, Bei; Wang, Li-li; Chen, Wei; Lv, Fa-qin; Li, Tan-shi

    2016-01-01

    Background This study aimed to identify the potential key long non-coding RNAs (lncRNAs) and target genes associated with pneumonia using lncRNA sequencing (lncRNA-seq). Material/Methods A total of 9 peripheral blood samples from patients with mild pneumonia (n=3) and severe pneumonia (n=3), as well as volunteers without pneumonia (n=3), were received for lncRNA-seq. Based on the sequencing data, differentially expressed lncRNAs (DE-lncRNAs) were identified by the limma package. After the functional enrichment analysis, target genes of DE-lncRNAs were predicted, and the regulatory network was constructed. Results In total, 99 DE-lncRNAs (14 upregulated and 85 downregulated ones) were identified in the mild pneumonia group and 85 (72 upregulated and 13 downregulated ones) in the severe pneumonia group, compared with the control group. Among these DE-lncRNAs, 9 lncRNAs were upregulated in both the mild and severe pneumonia groups. A set of 868 genes were predicted to be targeted by these 9 DE-lncRNAs. In the network, RP11-248E9.5 and RP11-456D7.1 targeted the majority of genes. RP11-248E9.5 regulated several genes together with CTD-2300H10.2, such as QRFP and EPS8. Both upregulated RP11-456D7.1 and RP11-96C23.9 regulated several genes, such as PDK2. RP11-456D7.1 also positively regulated CCL21. Conclusions These novel lncRNAs and their target genes may be closely associated with the progression of pneumonia. PMID:27663962

  15. Enhanced Gene Expression Rather than Natural Polymorphism in Coding Sequence of the OsbZIP23 Determines Drought Tolerance and Yield Improvement in Rice Genotypes

    PubMed Central

    Dey, Avishek; Samanta, Milan Kumar; Gayen, Srimonta; Sen, Soumitra K.; Maiti, Mrinal K.

    2016-01-01

    Drought is one of the major limiting factors for productivity of crops including rice (Oryza sativa L.). Understanding the role of allelic variations of key regulatory genes involved in stress-tolerance is essential for developing an effective strategy to combat drought. The bZIP transcription factors play a crucial role in abiotic-stress adaptation in plants via abscisic acid (ABA) signaling pathway. The present study aimed to search for allelic polymorphism in the OsbZIP23 gene across selected drought-tolerant and drought-sensitive rice genotypes, and to characterize the new allele through overexpression (OE) and gene-silencing (RNAi). Analyses of the coding DNA sequence (CDS) of the cloned OsbZIP23 gene revealed single nucleotide polymorphism at four places and a 15-nucleotide deletion at one place. The single-copy OsbZIP23 gene is expressed at relatively higher level in leaf tissues of drought-tolerant genotypes, and its abundance is more in reproductive stage. Cloning and sequence analyses of the OsbZIP23-promoter from drought-tolerant O. rufipogon and drought-sensitive IR20 cultivar showed variation in the number of stress-responsive cis-elements and a 35-nucleotide deletion at 5’-UTR in IR20. Analysis of the GFP reporter gene function revealed that the promoter activity of O. rufipogon is comparatively higher than that of IR20. The overexpression of any of the two polymorphic forms (1083 bp and 1068 bp CDS) of OsbZIP23 improved drought tolerance and yield-related traits significantly by retaining higher content of cellular water, soluble sugar and proline; and exhibited decrease in membrane lipid peroxidation in comparison to RNAi lines and non-transgenic plants. The OE lines showed higher expression of target genes-OsRab16B, OsRab21 and OsLEA3-1 and increased ABA sensitivity; indicating that OsbZIP23 is a positive transcriptional-regulator of the ABA-signaling pathway. Taken together, the present study concludes that the enhanced gene expression rather

  16. Enhanced Gene Expression Rather than Natural Polymorphism in Coding Sequence of the OsbZIP23 Determines Drought Tolerance and Yield Improvement in Rice Genotypes.

    PubMed

    Dey, Avishek; Samanta, Milan Kumar; Gayen, Srimonta; Sen, Soumitra K; Maiti, Mrinal K

    2016-01-01

    Drought is one of the major limiting factors for productivity of crops including rice (Oryza sativa L.). Understanding the role of allelic variations of key regulatory genes involved in stress-tolerance is essential for developing an effective strategy to combat drought. The bZIP transcription factors play a crucial role in abiotic-stress adaptation in plants via abscisic acid (ABA) signaling pathway. The present study aimed to search for allelic polymorphism in the OsbZIP23 gene across selected drought-tolerant and drought-sensitive rice genotypes, and to characterize the new allele through overexpression (OE) and gene-silencing (RNAi). Analyses of the coding DNA sequence (CDS) of the cloned OsbZIP23 gene revealed single nucleotide polymorphism at four places and a 15-nucleotide deletion at one place. The single-copy OsbZIP23 gene is expressed at relatively higher level in leaf tissues of drought-tolerant genotypes, and its abundance is more in reproductive stage. Cloning and sequence analyses of the OsbZIP23-promoter from drought-tolerant O. rufipogon and drought-sensitive IR20 cultivar showed variation in the number of stress-responsive cis-elements and a 35-nucleotide deletion at 5'-UTR in IR20. Analysis of the GFP reporter gene function revealed that the promoter activity of O. rufipogon is comparatively higher than that of IR20. The overexpression of any of the two polymorphic forms (1083 bp and 1068 bp CDS) of OsbZIP23 improved drought tolerance and yield-related traits significantly by retaining higher content of cellular water, soluble sugar and proline; and exhibited decrease in membrane lipid peroxidation in comparison to RNAi lines and non-transgenic plants. The OE lines showed higher expression of target genes-OsRab16B, OsRab21 and OsLEA3-1 and increased ABA sensitivity; indicating that OsbZIP23 is a positive transcriptional-regulator of the ABA-signaling pathway. Taken together, the present study concludes that the enhanced gene expression rather than

  17. Sequencing of the coding exons of the LRP1 and LDLR genes on individual DNA samples reveals novel mutations in both genes.

    PubMed

    Van Leuven, F; Thiry, E; Lambrechts, M; Stas, L; Boon, T; Bruynseels, K; Muls, E; Descamps, O

    2001-02-15

    Five coding polymorphisms in the LRP1 gene, i.e. A217V, A775P, D2080N, D2632E and G4379S, were discovered by sequencing its 89 exons in three test-groups of 22 healthy individuals, 29 Alzheimer patients and 18 individuals with different clinical and molecularly uncharacterized lipid metabolism problems. No genetic defect was evident in the LRP1 gene of any of the Alzheimer's disease (AD) patients, further excluding LRP1 as a major genetic problem in AD. Lipoprotein receptor related protein (LRP) A217V (exon 6) was clearly present in all groups as a polymorphism, while D2632E was observed only once in a healthy volunteer. On the other hand, LRP1 alleles A775P, D2080N, and G4379S were encountered only in patients with familial hypercholesterolemia (FH) or with undefined problems of lipid metabolism. This finding prompted analysis of the LDL receptor (LDLR) gene as well, for which a method was devised to sequence the entire region comprising LDLR exons 2-18. The resulting sequence contig of 33567 nucleotides finally yielded an exact physical map that corrects published and listed LDLR gene maps in many positions. In addition, next to known mutations in LDLR that cause FH, four novel LDLR defects were defined, i.e. del e7-10, exon 9 mutation N407T, a 20 bp insertion in exon 4, and a double mutation C292W/K290R in exon 6. No evidence for pathology connected to the LRP1 'mutations' was obtained by subsequent screening for the five LRP1 variants in larger groups of 110 FH patients and 118 patients with molecularly undefined, clinical problems of cholesterol and/or lipid metabolism. In three individuals with a mutant LDLR gene a variant LRP1 allele was also present, but without direct, obvious clinical compound effects, indicating that the variant LRP1 alleles must, for the present, be considered polymorphisms.

  18. Color differences among feral pigeons (Columba livia) are not attributable to sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R)

    PubMed Central

    2013-01-01

    Background Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type of melanin, eumelanin or pheomelanin, and in the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitutions, but none of them was associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence, but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680

  19. Determination of the upper and lower limits of the mechanistic stoichiometry of incompletely coupled fluxes. Stoichiometry of incompletely coupled reactions.

    PubMed

    Beavis, A D; Lehninger, A L

    1986-07-15

    A rationale is formulated for the design of experiments to determine the upper and lower limits of the mechanistic stoichiometry of any two incompletely coupled fluxes J1 and J2. Incomplete coupling results when there is a branch at some point in the sequence of reactions or processes coupling the two fluxes. The upper limit of the mechanistic stoichiometry is given by the minimum value of dJ2/dJ1 obtained when the fluxes are systematically varied by changes in steps after the branch point. The lower limit is given by the maximum value of dJ2/dJ1 obtained when the fluxes are varied by changes in steps prior to the branch point. The rationale for determining these limits is developed from both a simple kinetic model and from a linear nonequilibrium thermodynamic treatment of coupled fluxes, using the mechanistic approach [Westerhoff, H. V. & van Dam, K. (1979) Curr. Top. Bioenerg. 9, 1-62]. The phenomenological stoichiometry, the flux ratio at level flow and the affinity ratio at static head of incompletely coupled fluxes are defined in terms of mechanistic conductances and their relationship to the mechanistic stoichiometry is discussed. From the rationale developed, experimental approaches to determine the mechanistic stoichiometry of mitochondrial oxidative phosphorylation are outlined. The principles employed do not require knowledge of the pathway or the rate of transmembrane leaks or slippage and may also be applied to analysis of the stoichiometry of other incompletely coupled systems, including vectorial H+/O and K+/O translocation coupled to mitochondrial electron transport.
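
    The two limits stated above can be written compactly. The symbol n for the mechanistic stoichiometry is not used in the abstract and is introduced here only for illustration; the subscripts mark whether the fluxes were varied by changing steps before or after the branch point.

```latex
\max\left(\frac{dJ_2}{dJ_1}\right)_{\text{steps before branch}}
\;\le\; n \;\le\;
\min\left(\frac{dJ_2}{dJ_1}\right)_{\text{steps after branch}}
```

    Read this way, each family of experiments constrains n from one side only, so both kinds of perturbation are needed to bracket the mechanistic stoichiometry.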

  20. CYP1B1 gene mutations with incomplete penetrance in a Chinese pedigree with primary congenital glaucoma: a case report and review of literatures

    PubMed Central

    Chen, Ling; Huang, Lina; Zeng, Aineng; He, Jing

    2015-01-01

    To investigate cytochrome P4501B1 (CYP1B1) mutations in a three-generation Chinese Han family with primary congenital glaucoma (PCG), coding exons 2 and 3 of the CYP1B1 gene were amplified by PCR and directly sequenced using bidirectional Sanger sequencing. The mutation c.517 G>A p.E173K was detected in all affected individuals (who showed the homozygous AA genotype) and in none of the unaffected individuals except one. The mutation c.517 G>A p.E173K is associated with the disease in this pedigree, and the likely mode of inheritance is recessive. One apparently unaffected individual had mutations and haplotypes identical to those of her affected sibs, suggesting incomplete penetrance in this pedigree. PMID:26550445

  1. Sequence of a novel cytochrome CYP2B cDNA coding for a protein which is expressed in a sebaceous gland, but not in the liver.

    PubMed Central

    Friedberg, T; Grassow, M A; Bartlomowicz-Oesch, B; Siegert, P; Arand, M; Adesnik, M; Oesch, F

    1992-01-01

    The major phenobarbital-inducible rat hepatic cytochromes P-450, CYP2B1 and CYP2B2, are the paradigmatic members of a cytochrome P-450 gene subfamily that contains at least seven additional members. Specific oligonucleotide probes for these genomic members of the CYP2B subfamily were used to assess their tissue-specific expression. In Northern-blot analysis a probe specific to gene 4 (which is designated now as CYP2B12) hybridized to a single mRNA present in the preputial gland, an organ which is used as a model for sebaceous glands, but did not hybridize to mRNA isolated from the liver or from five other tissues of untreated or Aroclor 1254-treated rats. The cDNA sequence for the CYP2B12 RNA was determined from overlapping cDNA clones and contained a long open reading frame of 1476 bp. The nucleotide sequence of the CYP2B12 cDNA was 85% similar to the sequence of the CYP2B1 cDNA in its coding region and was different from any CYP2B cDNA characterized until now. The cDNA-derived primary structure of the CYP2B12 protein contains a signal sequence for its insertion into the endoplasmic reticulum and the putative haem-binding site characteristic of cytochromes P-450. A part of the potential haem pocket of CYP2B12 was identical with a similar structure in a bacterial protocatechuate dioxygenase. In immunoblot analysis of preputial-gland microsomes, antibodies against CYP2B1 recognized a single abundant protein with a lower apparent molecular mass than that of CYP2B1. Our results demonstrate that the CYP2B12 protein has the potential to be enzymically active and are the first demonstration that a member of the CYP2B subfamily is expressed exclusively and at high levels in an extrahepatic organ. PMID:1445240

  2. CIMGS: An incomplete orthogonal factorization preconditioner

    SciTech Connect

    Wang, X.; Bramley, R.; Gallivan, K.

    1994-12-31

    This paper introduces, analyzes, and tests a preconditioning method for conjugate gradient (CG) type iterative methods. The authors start by examining incomplete Gram-Schmidt factorization (IGS) methods in order to motivate the new preconditioner. They show that the IGS family is more stable than incomplete Cholesky (IC) factorization and that IGS methods successfully factor any full rank matrix. Furthermore, IGS preconditioners are at least as effective as the IC preconditioner in accelerating convergence of CG type iterative methods. The drawback of IGS methods is their high cost of factorization. This motivates finding a new algorithm, CIMGS, which can generate the same factor in a more efficient way.
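
    To make the idea of an incomplete Gram-Schmidt factorization concrete, the following is a minimal pure-Python sketch of modified Gram-Schmidt in which off-diagonal entries of R are kept only on a preset index pattern. The dropping rule, the function names, and the toy matrix are assumptions made for illustration; this is not the CIMGS algorithm itself, whose details the abstract does not give.

```python
import math


def incomplete_mgs(columns, keep):
    """Incomplete modified Gram-Schmidt on a list of column vectors.

    columns: n linearly independent column vectors (lists of floats).
    keep:    set of (k, j) pairs with k < j whose R entries are retained;
             all other off-diagonal entries are dropped (left at zero).
    Returns the upper-triangular factor R as a list of rows, so that
    R^T R approximates A^T A.
    """
    cols = [list(c) for c in columns]
    n = len(cols)
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        norm = math.sqrt(sum(x * x for x in cols[k]))
        if norm == 0.0:
            raise ValueError("zero column encountered; matrix is rank deficient")
        R[k][k] = norm
        q = [x / norm for x in cols[k]]
        for j in range(k + 1, n):
            if (k, j) not in keep:
                continue  # dropped entry: column j is not orthogonalized against q
            r = sum(qi * aj for qi, aj in zip(q, cols[j]))
            R[k][j] = r
            cols[j] = [aj - r * qi for aj, qi in zip(cols[j], q)]
    return R


def gram(R):
    """Return R^T R, the matrix a preconditioner built from R would represent."""
    n = len(R)
    return [
        [sum(R[k][i] * R[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    A_columns = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]  # hypothetical 3x2 matrix
    print(gram(incomplete_mgs(A_columns, {(0, 1)})))  # full pattern
    print(gram(incomplete_mgs(A_columns, set())))     # all off-diagonals dropped
```

    With the full pattern the product R^T R reproduces A^T A, and with every off-diagonal entry dropped it reduces to the diagonal of A^T A; practical patterns sit between these two extremes, trading factorization cost against preconditioner quality.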

  3. Adaptive schemes for incomplete quantum process tomography

    SciTech Connect

    Teo, Yong Siah; Englert, Berthold-Georg; Rehacek, Jaroslav; Hradil, Zdenek

    2011-12-15

    We propose an iterative algorithm for incomplete quantum process tomography with the help of quantum state estimation. The algorithm, which is based on the combined principles of maximum likelihood and maximum entropy, yields a unique estimator for an unknown quantum process when one has less than a complete set of linearly independent measurement data to specify the quantum process uniquely. We apply this iterative algorithm adaptively in various situations and so optimize the amount of resources required to estimate a quantum process with incomplete data.

  4. Cochlear implant in incomplete partition type I.

    PubMed

    Berrettini, S; Forli, F; De Vito, A; Bruschini, L; Quaranta, N

    2013-02-01

    In this investigation, we report on 4 patients affected by incomplete partition type I who underwent cochlear implantation at our institutions. Preoperative, surgical, mapping and follow-up issues as well as results in cases with this complex malformation are described. The cases reported in the present study confirm that cochlear implantation in patients with incomplete partition type I may be challenging for cochlear implant teams. The results are variable, but in many cases satisfactory, and are mainly related to the surgical placement of the electrode and residual nerve fibres. Moreover, in some cases the association of cochlear nerve abnormalities and other disabilities may significantly affect results.

  5. Influence of incomplete fusion on complete fusion: Observation of a large incomplete fusion fraction at E ≈ 5-7 MeV/nucleon

    SciTech Connect

    Singh, Pushpendra P.; Singh, B. P.; Sharma, Manoj Kumar; Unnati; Singh, Devendra P.; Prasad, R.; Kumar, Rakesh; Golda, K. S.

    2008-01-15

    Experiments have been carried out to explore the reaction dynamics leading to incomplete fusion of heavy ions at moderate excitation energies. Excitation functions for ¹⁶⁸Luᵐ, ¹⁶⁷Lu, ¹⁶⁷Yb, ¹⁶⁶Tm, ¹⁷⁹Re, ¹⁷⁷Re, ¹⁷⁷W, ¹⁷⁸Ta, and ¹⁷⁷Hf radionuclides populated via complete and/or incomplete fusion of ¹⁶O with ¹⁵⁹Tb and ¹⁶⁹Tm have been studied over the wide projectile energy range E_proj ≈ 75-95 MeV. Recoil-catcher technique followed by off-line γ-spectrometry has been employed in the present measurements. Experimental data have been compared with the predictions of theoretical model code PACE2. The experimentally measured production cross sections of α-emitting channels were found to be larger than the theoretical model predictions and may be attributed to incomplete fusion at these energies. During the analysis of experimental data, incomplete fusion has been found to be competing with complete fusion. As such, an attempt has been made to estimate the incomplete fusion fraction for both systems, and this fraction has been found to be sensitive to the projectile energy and the mass asymmetry of the interacting partners.

  6. The Bacteriophage Carrier State of Campylobacter jejuni Features Changes in Host Non-coding RNAs and the Acquisition of New Host-derived CRISPR Spacer Sequences

    PubMed Central

    Hooton, Steven P. T.; Brathwaite, Kelly J.; Connerton, Ian F.

    2016-01-01

    Incorporation of self-derived CRISPR DNA protospacers in Campylobacter jejuni PT14 occurs in the presence of bacteriophages encoding a CRISPR-like Cas4 protein. This phenomenon was evident in carrier state infections where both bacteriophages and host are maintained for seemingly indefinite periods as stable populations following serial passage. Carrier state cultures of C. jejuni PT14 have greater aerotolerance in nutrient limited conditions, and may have arisen as an evolutionary response to selective pressures imposed during periods in the extra-intestinal environment. A consequence of this is that bacteriophage and host remain associated and able to survive transition periods where the chances of replicative success are greatly diminished. The majority of the bacteriophage population do not commit to lytic infection, and conversely the bacterial population tolerates low-level bacteriophage replication. We recently examined the effects of Campylobacter bacteriophage/C. jejuni PT14 CRISPR spacer acquisition using deep sequencing strategies of DNA and RNA-Seq to analyze carrier state cultures. This approach identified de novo spacer acquisition in C. jejuni PT14 associated with Class III Campylobacter phages CP8/CP30A but spacer acquisition was oriented toward the capture of host DNA. In the absence of bacteriophage predation the CRISPR spacers in uninfected C. jejuni PT14 cultures remain unchanged. A distinct preference was observed for incorporation of self-derived protospacers into the third spacer position of the C. jejuni PT14 CRISPR array, with the first and second spacers remaining fixed. RNA-Seq also revealed the variation in the synthesis of non-coding RNAs with the potential to bind bacteriophage genes and/or transcript sequences. PMID:27047470

  7. 32 CFR 651.44 - Incomplete information.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... National Defense Department of Defense (Continued) DEPARTMENT OF THE ARMY (CONTINUED) ENVIRONMENTAL QUALITY ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) Environmental Impact Statement § 651.44 Incomplete information. When the proposed action will have significant adverse effects on the human environment, and there...

  8. 40 CFR 725.33 - Incomplete submissions.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... ACT REPORTING REQUIREMENTS AND REVIEW PROCESSES FOR MICROORGANISMS Administrative Procedures § 725.33 Incomplete submissions. (a) A submission under this part is not complete, and the review period does not... attachments are not in English, except for published scientific literature. (4) The submitter does not...

  9. Higher Education's Grade for Data: "Incomplete"

    ERIC Educational Resources Information Center

    Hebel, Sara

    2008-01-01

    The latest national report card on higher education, as in the past, handed out a lot of "incompletes." Like the student who keeps forgetting to turn in that lab report, the grades can't be computed without all the data. There's a lot policy makers don't know about their states' higher-education performance, and the gaps in information limit the…

  10. A comparison of AAV strategies distinguishes overlapping vectors for efficient systemic delivery of the 6.2 kb Dysferlin coding sequence

    PubMed Central

    Pryadkina, Marina; Lostal, William; Bourg, Nathalie; Charton, Karine; Roudaut, Carinne; Hirsch, Matthew L; Richard, Isabelle

    2015-01-01

    Recombinant adeno-associated virus (rAAV) is currently the best vector for gene delivery into the skeletal muscle. However, the 5-kb packaging size of this virus is a major obstacle for large gene transfer. This past decade, many different strategies were developed to circumvent this issue (concatemerization-splicing, overlapping vectors, hybrid dual or fragmented AAV). Loss of function mutations in the DYSF gene whose coding sequence is 6.2kb lead to progressive muscular dystrophies (LGMD2B: OMIM_253601; MM: OMIM_254130; DMAT: OMIM_606768). In this study, we compared large gene transfer techniques to deliver the DYSF gene into the skeletal muscle. After rAAV8s intramuscular injection into dysferlin deficient mice, we showed that the overlap strategy is the most effective approach to reconstitute a full-length messenger. After systemic administration, the level of dysferlin obtained on different muscles corresponded to 0.5- to 2-fold compared to the normal level. We further demonstrated that the overlapping vector set was efficient to correct the histopathology, resistance to eccentric contractions and whole body force in the dysferlin deficient mice. Altogether, these data indicate that using overlapping vectors could be a promising approach for a potential clinical treatment of dysferlinopathies. PMID:26029720

  11. Common and rare von Willebrand factor (VWF) coding variants, VWF levels, and factor VIII levels in African Americans: the NHLBI Exome Sequencing Project.

    PubMed

    Johnsen, Jill M; Auer, Paul L; Morrison, Alanna C; Jiao, Shuo; Wei, Peng; Haessler, Jeffrey; Fox, Keolu; McGee, Sean R; Smith, Joshua D; Carlson, Christopher S; Smith, Nicholas; Boerwinkle, Eric; Kooperberg, Charles; Nickerson, Deborah A; Rich, Stephen S; Green, David; Peters, Ulrike; Cushman, Mary; Reiner, Alex P

    2013-07-25

    Several rare European von Willebrand disease missense variants of VWF (including p.Arg2185Gln and p.His817Gln) were recently reported to be common in apparently healthy African Americans (AAs). Using data from the NHLBI Exome Sequencing Project, we assessed the association of these and other VWF coding variants with von Willebrand factor (VWF) and factor VIII (FVIII) levels in 4468 AAs. Of 30 nonsynonymous VWF variants, 6 were significantly and independently associated (P < .001) with levels of VWF and/or FVIII. Each additional copy of the common VWF variants encoding p.Thr789Ala or p.Asp1472His was associated with 6 to 8 IU/dL higher VWF levels. The VWF variant encoding p.Arg2185Gln was associated with 7 to 13 IU/dL lower VWF and FVIII levels. The type 2N-related VWF variant encoding p.His817Gln was associated with 17 IU/dL lower FVIII level but normal VWF level. A novel, rare missense VWF variant that predicts disruption of an O-glycosylation site (p.Ser1486Leu) and a rare variant encoding p.Arg2287Trp were each associated with 30 to 40 IU/dL lower VWF level (P < .001). In summary, several common and rare VWF missense variants contribute to phenotypic differences in VWF and FVIII among AAs.

  12. Corrected profile likelihood confidence interval for binomial paired incomplete data.

    PubMed

    Pradhan, Vivek; Menon, Sandeep; Das, Ujjwal

    2013-01-01

    Clinical trials often use paired binomial data as their clinical endpoint. The confidence interval is frequently used to estimate the treatment performance. Tang et al. (2009) have proposed exact and approximate unconditional methods for constructing a confidence interval in the presence of incomplete paired binary data. The approach proposed by Tang et al. can be overly conservative with large expected confidence interval width (ECIW) in some situations. We propose a profile likelihood-based method with a Jeffreys' prior correction to construct the confidence interval. This approach generates confidence interval with a much better coverage probability and shorter ECIWs. The performances of the method along with the corrections are demonstrated through extensive simulation. Finally, three real world data sets are analyzed by all the methods. Statistical Analysis System (SAS) codes to execute the profile likelihood-based methods are also presented.

  13. Methods for incomplete Bessel function evaluation

    NASA Astrophysics Data System (ADS)

    Harris, Frank E.; Fripiat, J. G.

    Presented here are detailed methods for evaluating the incomplete Bessel functions arising when Gaussian-type orbitals are used for systems periodic in one spatial dimension. The scheme is designed to yield these incomplete Bessel functions with an absolute accuracy of ±1 × 10⁻¹⁰, for the range of integer orders 0 ≤ n ≤ 12 [a range sufficient for a basis whose members have angular momenta of up to three units (s, p, d, or f atomic functions)]. To reach this accuracy level within acceptable computation times, new rational approximations were developed to compute the special functions involved, namely, the exponential integral E₁(x) and the modified Bessel functions K₀(x) and K₁(x), to absolute accuracy ±1 × 10⁻¹⁵.

  14. Past incompleteness of a bouncing multiverse

    SciTech Connect

    Vilenkin, Alexander; Zhang, Jun

    2014-06-01

    According to classical GR, Anti-de Sitter (AdS) bubbles in the multiverse terminate in big crunch singularities. It has been conjectured, however, that the fundamental theory may resolve these singularities and replace them by nonsingular bounces. This may have important implications for the beginning of the multiverse. Geodesics in cosmological spacetimes are known to be past-incomplete, as long as the average expansion rate along the geodesic is positive, but it is not clear that the latter condition is satisfied if the geodesic repeatedly passes through crunching AdS bubbles. We investigate this issue in a simple multiverse model, where the spacetime consists of a patchwork of FRW regions. The conclusion is that the spacetime is still past-incomplete, even in the presence of AdS bounces.

  15. Advanced incomplete factorization algorithms for Stieltjes matrices

    SciTech Connect

    Il`in, V.P.

    1996-12-31

    The modern numerical methods for solving the linear algebraic systems Au = f with high order sparse matrices A, which arise in grid approximations of multidimensional boundary value problems, are based mainly on accelerated iterative processes with easily invertible preconditioning matrices presented in the form of approximate (incomplete) factorization of the original matrix A. We consider some recent algorithmic approaches, theoretical foundations, experimental data and open questions for incomplete factorization of Stieltjes matrices, which are "the best" ones in the sense that they have the most advanced results. Special attention is given to solving elliptic differential equations with strongly variable coefficients, and singularly perturbed diffusion-convection and parabolic equations.

  16. The undirected incomplete perfect phylogeny problem.

    PubMed

    Satya, Ravi Vijaya; Mukherjee, Amar

    2008-01-01

    The incomplete perfect phylogeny (IPP) problem and the incomplete perfect phylogeny haplotyping (IPPH) problem deal with constructing a phylogeny for a given set of haplotypes or genotypes with missing entries. The earlier approaches for both of these problems dealt with restricted versions of the problems, where the root is either available or can be trivially re-constructed from the data, or certain assumptions were made about the data. In this paper, we deal with the unrestricted versions of the problems, where the root of the phylogeny is neither available nor trivially recoverable from the data. Both IPP and IPPH problems have previously been proven to be NP-complete. Here, we present efficient enumerative algorithms that can handle practical instances of the problem. Empirical analysis on simulated data shows that the algorithms perform very well both in terms of speed and in terms of accuracy of the recovered data.
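
    For orientation, the sketch below shows a simple necessary check on a binary haplotype matrix with missing entries, based on the standard four-gamete condition for undirected perfect phylogeny. It is not the authors' enumerative algorithm. The names `MISSING` and `conflicting_pairs` and the toy matrix are assumptions chosen for illustration.

```python
from itertools import combinations

MISSING = None  # hypothetical marker for an unknown entry


def conflicting_pairs(matrix):
    """Return column pairs whose known entries already show all four gametes.

    A complete binary matrix admits an undirected perfect phylogeny only if
    no pair of columns contains 00, 01, 10 and 11. Rows with a missing entry
    in either column are skipped, so a reported pair is a definite conflict
    no matter how the missing entries are later filled in.
    """
    n_cols = len(matrix[0])
    conflicts = []
    for a, b in combinations(range(n_cols), 2):
        gametes = {
            (row[a], row[b])
            for row in matrix
            if row[a] is not MISSING and row[b] is not MISSING
        }
        if len(gametes) == 4:
            conflicts.append((a, b))
    return conflicts


haplotypes = [
    [0, 0, 1],
    [0, 1, MISSING],
    [1, 0, 0],
    [1, 1, 0],
]
conflicts = conflicting_pairs(haplotypes)
```

    An empty result does not prove that a valid completion exists. Choosing values for the missing entries so that every column pair stays conflict-free is the hard part of the problem.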

  17. Stochastic approximation boosting for incomplete data problems.

    PubMed

    Sexton, Joseph; Laake, Petter

    2009-12-01

    Boosting is a powerful approach to fitting regression models. This article describes a boosting algorithm for likelihood-based estimation with incomplete data. The algorithm combines boosting with a variant of stochastic approximation that uses Markov chain Monte Carlo to deal with the missing data. Applications to fitting generalized linear and additive models with missing covariates are given. The method is applied to the Pima Indians Diabetes Data where over half of the cases contain missing values.

  18. Co-expressed differentially expressed genes and long non-coding RNAs involved in the celecoxib treatment of gastric cancer: An RNA sequencing analysis

    PubMed Central

    Song, Bin; Du, Juan; Feng, Ye; Gao, Yong-Jian; Zhao, Ji-Sheng

    2016-01-01

    The aim of the present study was to investigate the mechanisms of long non-coding RNAs (lncRNAs) in a gastric cancer cell line treated with celecoxib. The human gastric carcinoma cell line NCI-N87 was treated with 15 µM celecoxib for 72 h (celecoxib group) and an equal volume of dimethylsulfoxide (control group), respectively. Libraries were constructed by NEBNext Ultra RNA Library Prep kit for Illumina. Paired-end RNA sequencing reads were aligned to a human hg19 reference genome using TopHat2. Differentially expressed genes (DEGs) and lncRNAs were identified using Cuffdiff. Enrichment analysis was performed using GO-function package and KEGG profile in Bioconductor. A protein-protein interaction network was constructed using STRING database and module analysis was performed using ClusterONE plugin of Cytoscape. ATP5G1, ATP5G3, COX8A, CYC1, NDUFS3, UQCRC1, UQCRC2 and UQCRFS1 were enriched in the oxidative phosphorylation pathway. CXCL1, CXCL3, CXCL5 and CXCL8 were enriched in the chemokine signaling and cytokine-cytokine receptor interaction pathways. ITGA3, ITGA6, ITGB4, ITGB5, ITGB6 and ITGB8 were enriched in the integrin-mediated signaling pathway. DEGs co-expressed with lnc-SCD-1:13, lnc-LRR1-1:2, lnc-PTMS-1:3, lnc-S100P-3:1, lnc-AP000974.1-1:1 and lnc-RAB3IL1-2:1 were enriched in the pathways associated with cancer, such as the basal cell carcinoma pathway in cancer. In conclusion, these DEGs and differentially expressed lncRNAs may be important in the celecoxib treatment of gastric cancer. PMID:27698747

  19. Evolutionary Changes in Gene Expression, Coding Sequence and Copy-Number at the Cyp6g1 Locus Contribute to Resistance to Multiple Insecticides in Drosophila

    PubMed Central

    Harrop, Thomas W. R.; Sztal, Tamar; Lumb, Christopher; Good, Robert T.; Daborn, Phillip J.; Batterham, Philip; Chung, Henry

    2014-01-01

    Widespread use of insecticides has led to insecticide resistance in many populations of insects. In some populations, resistance has evolved to multiple pesticides. In Drosophila melanogaster, resistance to multiple classes of insecticide is due to the overexpression of a single cytochrome P450 gene, Cyp6g1. Overexpression of Cyp6g1 appears to have evolved in parallel in Drosophila simulans, a sibling species of D. melanogaster, where it is also associated with insecticide resistance. However, it is not known whether the ability of the CYP6G1 enzyme to provide resistance to multiple insecticides evolved recently in D. melanogaster or if this function is present in all Drosophila species. Here we show that duplication of the Cyp6g1 gene occurred at least four times during the evolution of different Drosophila species, and the ability of CYP6G1 to confer resistance to multiple insecticides exists in D. melanogaster and D. simulans but not in Drosophila willistoni or Drosophila virilis. In D. virilis, which has multiple copies of Cyp6g1, one copy confers resistance to DDT and another to nitenpyram, suggesting that the divergence of protein sequence between copies subsequent to the duplication affected the activity of the enzyme. All orthologs tested conferred resistance to one or more insecticides, suggesting that CYP6G1 had the capacity to provide resistance to anthropogenic chemicals before they existed. Finally, we show that expression of Cyp6g1 in the Malpighian tubules, which contributes to DDT resistance in D. melanogaster, is specific to the D. melanogaster–D. simulans lineage. Our results suggest that a combination of gene duplication, regulatory changes and protein coding changes has taken place at the Cyp6g1 locus during evolution and this locus may play a role in providing resistance to different environmental toxins in different Drosophila species. PMID:24416303

  20. ‘Default’ generated neonatal regulatory T cells are hypomethylated at conserved non-coding sequence 2 and promote long-term cardiac allograft survival

    PubMed Central

    Cheng, Chao; Wang, Sihua; Ye, Ping; Huang, Xiaofan; Liu, Zheng; Wu, Jie; Sun, Yuan; Xie, Aini; Wang, Guohua; Xia, Jiahong

    2014-01-01

    Regulatory T (Treg) cells play an important role in the maintenance of immune self-tolerance and homeostasis. We previously reported that neonatal CD4+ T cells have an intrinsic ‘default’ mechanism to become Treg (neoTreg) cells in response to T-cell receptor (TCR) stimulation. However, the underlying mechanisms are unclear and the effects of neoTreg cells on regulating immune responses remain unknown. Due to their involvement in Foxp3 regulation, we examined the role of DNA methyltransferase 1 (DNMT1) and DNMT3b during the induction of neoTreg cells in the Foxp3gfp mice. The function of neoTreg cells was assessed in an acute allograft rejection model established in RAG2−/− mice with allograft cardiac transplantation and transferred with syngeneic CD4+ effector T cells. Following ex vivo TCR stimulation, the DNMT activity was increased threefold in adult CD4+ T cells, but not significantly increased in neonatal cells. However, adoptively transferred neoTreg cells significantly prolonged cardiac allograft survival (mean survival time 47 days, P < 0·001) and maintained Foxp3 expression similar to natural Treg cells. The neoTreg cells were hypomethylated at the conserved non-coding DNA sequence 2 locus of Foxp3 compared with adult Treg cells. The DNMT antagonist 5-aza-2′-deoxycytidine (5-Aza) induced increased Foxp3 expression in mature CD4+ T cells. 5-Aza-inducible Treg cells combined with continuous 5-Aza treatment prolonged graft survival. These results indicate that the ‘default’ pathway of neoTreg cell differentiation is associated with reduced DNMT1 and DNMT3b response to TCR stimulus. The neoTreg cells may be a strategy to alleviate acute allograft rejection. PMID:24944101

  1. Sequence analysis of the clpG gene, which codes for surface antigen CS31A subunit: evidence of an evolutionary relationship between CS31A, K88, and F41 subunit genes.

    PubMed Central

    Girardeau, J P; Bertin, Y; Martin, C; Der Vartanian, M; Boeuf, C

    1991-01-01

    The clpG gene coding for the CS31A subunit was localized on a 0.9-kb SphI fragment from the recombinant plasmid pAG315. This was established by testing the ability of subclones to hybridize with a 17-meric oligonucleotide probe obtained from N-terminal analysis of the CS31A subunit. The nucleotide sequence of the region coding for CS31A was determined. From primer extension analysis, two initiation translation start sites were detected. Two possible promoterlike sequences were identified; the ribosome binding site and the translation terminator are proposed. Inverted repeat sequences leading to the formation of possible hairpin structures of the transcripts were found on the 5' untranslated region of clpG. The deduced amino acid composition was in close agreement with the chemical amino acid composition and sequence match with the first 25 N-terminal amino acids from the published N-terminal sequence of the purified CS31A subunit. The clpG gene codes for a mature protein of 257 amino acids with a molecular size of 26,777 Da. An obvious homology was observed when the amino acid sequence of CS31A was compared with those of K88 and F41. This homology includes five different conserved sequences of up to 19 identical amino acids, which is associated with conserved proline. An extensive change in the CS31A region homologous to that identified to contain the K88 receptor binding site might be responsible for the functional divergence between CS31A and K88. PMID:1938963

  2. Polar Codes

    DTIC Science & Technology

    2014-12-01

    density parity check (LDPC) code, a Reed–Solomon code, and three convolutional codes. iii CONTENTS EXECUTIVE SUMMARY...the most common. Many civilian systems use low density parity check (LDPC) FEC codes, and the Navy is planning to use LDPC for some future systems...other forward error correction methods: a turbo code, a low density parity check (LDPC) code, a Reed–Solomon code, and three convolutional codes

  3. Classification and data acquisition with incomplete data

    NASA Astrophysics Data System (ADS)

    Williams, David P.

    In remote-sensing applications, incomplete data can result when only a subset of sensors (e.g., radar, infrared, acoustic) are deployed at certain regions. The limitations of single sensor systems have spurred interest in employing multiple sensor modalities simultaneously. For example, in land mine detection tasks, different sensor modalities are better-suited to capture different aspects of the underlying physics of the mines. Synthetic aperture radar sensors may be better at detecting surface mines, while infrared sensors may be better at detecting buried mines. By employing multiple sensor modalities to address the detection task, the strengths of the disparate sensors can be exploited in a synergistic manner to improve performance beyond that which would be achievable with either single sensor alone. When multi-sensor approaches are employed, however, incomplete data can be manifested. If each sensor is located on a separate platform (e.g., aircraft), each sensor may interrogate---and hence collect data over---only partially overlapping areas of land. As a result, some data points may be characterized by data (i.e., features) from only a subset of the possible sensors employed in the task. Equivalently, this scenario implies that some data points will be missing features. Increasing focus in the future on using---and fusing data from---multiple sensors will make such incomplete-data problems commonplace. In many applications involving incomplete data, it is possible to acquire the missing data at a cost. In multi-sensor remote-sensing applications, data is acquired by deploying sensors to data points. Acquiring data is usually an expensive, time-consuming task, a fact that necessitates an intelligent data acquisition process. Incomplete data is not limited to remote-sensing applications, but rather, can arise in virtually any data set. In this dissertation, we address the general problem of classification when faced with incomplete data. We also address the

  4. Coding Acoustic Metasurfaces.

    PubMed

    Xie, Boyang; Tang, Kun; Cheng, Hua; Liu, Zhengyou; Chen, Shuqi; Tian, Jianguo

    2017-02-01

    Coding acoustic metasurfaces can combine simple logical bits to acquire sophisticated functions in wave control. The acoustic logical bits can achieve a phase difference of exactly π and a perfect match of the amplitudes for the transmitted waves. By programming the coding sequences, acoustic metasurfaces with various functions, including creating peculiar antenna patterns and waves focusing, have been demonstrated.
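
    To illustrate how a coding sequence shapes the radiated pattern, the sketch below evaluates a textbook one-dimensional array factor in which bit 0 contributes phase 0 and bit 1 contributes phase π, with equal amplitudes, as the abstract describes. The array-factor model, the half-wavelength element spacing, and the function name `array_factor` are assumptions, not the authors' design procedure.

```python
import cmath
import math


def array_factor(bits, theta, spacing_over_wavelength=0.5):
    """Far-field sum of unit-amplitude elements with 0 or pi phase per bit.

    bits: sequence of 0/1 coding elements along one axis.
    theta: observation angle from the surface normal, in radians.
    spacing_over_wavelength: element pitch d divided by wavelength (assumed).
    """
    kd = 2.0 * math.pi * spacing_over_wavelength
    return sum(
        cmath.exp(1j * (math.pi * bit + kd * n * math.sin(theta)))
        for n, bit in enumerate(bits)
    )


uniform = [0] * 8
striped = [0, 0, 1, 1, 0, 0, 1, 1]

broadside_uniform = abs(array_factor(uniform, 0.0))            # all in phase
broadside_striped = abs(array_factor(striped, 0.0))            # bits cancel
steered_striped = abs(array_factor(striped, math.radians(30)))  # split beam
```

    With this pitch the `0011` pattern has a period of two wavelengths, so the normal beam cancels and energy appears near ±30 degrees. Changing the sequence changes where the lobes fall.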

  5. Multirate control with incomplete information over Profibus-DP network

    NASA Astrophysics Data System (ADS)

    Salt, J.; Casanova, V.; Cuenca, A.; Pizá, R.

    2014-07-01

    When a process field bus-decentralized peripherals (Profibus-DP) network is used in an industrial environment, a deterministic behaviour is usually claimed. However, due to some concerns such as bandwidth limitations, lack of synchronisation among different clocks and existence of time-varying delays, a more complex problem must be faced. This problem implies the transmission of irregular and, even, random sequences of incomplete information. The main consequence of this issue is the appearance of different sampling periods at different network devices. In this paper, this aspect is checked by means of a detailed Profibus-DP timescale study. In addition, in order to deal with the different periods, a delay-dependent dual-rate proportional-integral-derivative control is introduced. Stability for the proposed control system is analysed in terms of linear matrix inequalities.

  6. Identification by sequencing based typing and complete coding region analysis of three new HLA class II alleles: DRB3*0210, DRB3*0211 and DQB1*0310.

    PubMed

    Balas, A; Santos, S; Aviles, M J; Garcia-Sanchez, F; Lillo, R; Vicario, J L

    2000-10-01

    The study of HLA class II polymorphism by direct exon 2 DNA sequencing analysis has been established to be a reliable and accurate high-resolution typing procedure. This approach shows some advantages in relation to previous methods, polymerase chain reaction using sequence-specific oligonucleotides (PCR-SSO) and sequence-specific primers (PCR-SSP), basically due to the capability of analysis for the complete sequenced genomic region, including non-polymorphic motifs. DRB3 and DQB1 sequencing based typing (SBT) in unrelated bone marrow donor searching allowed us to detect three new alleles. The complete coding region sequences were characterised from cDNA. Two new DRB3 alleles, DRB3*0210 and DRB3*0211, were described in two Caucasian bone marrow donors. Both sequences showed single point mutations regarding DRB3*0202, producing amino acid replacements at positions 51 (Asp to Thr) and 67 (Leu to Ile), respectively. These two point mutations can be found in other DRB alleles, and suggest that gene conversion would be involved in the origin of both alleles. A new DQB1 sequence was found in a Spanish patient that showed two nucleotide differences, positions 134 and 141, with regard to its close similar DQB1*03011 allele. Only substitution at position 134 provoked amino acid replacement at residue 45, Glu to Gly. This single amino acid change would be involved in the lack of serologic recognition of this new molecule by DQ7-specific reagents.

  7. Essays on incomplete contracts in regulatory activities

    NASA Astrophysics Data System (ADS)

    Saavedra, Eduardo Humberto

    This dissertation consists of three essays. The first essay, The Hold-Up Problem in Public Infrastructure Franchising, characterizes the equilibria of the investment decisions in public infrastructure franchising under incomplete contracting and ex-post renegotiation. The parties (government and a firm) are unable to credibly commit to the contracted investment plan, so that a second step investment is renegotiated by the parties at the revision stage. As expected, the possibility of renegotiation affects initial non-verifiable investments. The main conclusion of this essay is that not only underinvestment but also overinvestment in infrastructure may arise in equilibrium, compared to the complete contracting case. The second essay, Alternative Institutional Arrangements in Network Utilities: An Incomplete Contracting Approach, presents a theoretical assessment of the efficiency implications of privatizing natural monopolies which are vertically related to potential competitive firms. Based on the incomplete contracts and asymmetric information paradigm, I develop a model that analyzes the relative advantages of different institutional arrangements---alternative ownership and market structures in the industry---in terms of their allocative and productive efficiencies. The main policy conclusion of this essay is that both ownership and the existence of conglomerates in network industries matter. Among other conclusions, this essay provides an economic rationale for a mixed economy in which the network is public and vertical separation of the industry when the natural monopoly is under private ownership. The last essay, Opportunistic Behavior and Legal Disputes in the Chilean Electricity Sector, analyzes post-contractual disputes in this newly privatized industry. It discusses the presumption that opportunistic behavior and disputes arise due to inadequate market design, ambiguous regulation, and institutional weaknesses. This chapter also assesses the presumption

  8. Incomplete figure perception and invisible masking.

    PubMed

    Chikhman, Valery; Shelepin, Yuri; Foreman, Nigel; Merkuljev, Aleksey; Pronin, Sergey

    2006-01-01

    The Gollin test (measuring recognition thresholds for fragmented line drawings of everyday objects and animals) has traditionally been regarded as a test of incomplete figure perception or 'closure', though there is a debate about how such closure is achieved. Here, figural incompleteness is considered to be the result of masking, such that absence of contour elements of a fragmented figure is the result of the influence of an 'invisible' mask. It is as though the figure is partly obscured by a mask having parameters identical to those of the background. This mask is 'invisible' only consciously, but for the early stages of visual processing it is real and has properties of multiplicative noise. Incomplete Gollin figures were modeled as the figure covered by the mask with randomly distributed transparent and opaque patches. We adjusted the statistical characteristics of the contour image and empty noise patches and processed those using spatial and spatial-frequency measures. Across 73 figures, despite inter-subject variability, mean recognition threshold was always approximately 15% of total contour in naive observers. Recognition worsened with increasing spectral similarity between the figure and the 'invisible' mask. Near threshold, the spectrum of the fragmented image was equally similar to that of the 'invisible' mask and complete image. The correlation between spectral parameters of figures at threshold and complete figures was greatest for figures that were most easily recognised. Across test sessions, thresholds reduced when either figure or mask parameters were familiar. We argue that recognition thresholds for Gollin stimuli in part reflect the extraction of signal from noise.

  9. Dynamic pattern matcher using incomplete data

    NASA Technical Reports Server (NTRS)

    Johnson, Gordon G. (Inventor); Wang, Lui (Inventor)

    1993-01-01

    This invention relates generally to pattern matching systems, and more particularly to a method for dynamically adapting the system to enhance the effectiveness of a pattern match. Apparatus and methods for calculating the similarity between patterns are known. There is considerable interest, however, in the storage and retrieval of data, particularly, when the search is called or initiated by incomplete information. For many search algorithms, a query initiating a data search requires exact information, and the data file is searched for an exact match. Inability to find an exact match thus results in a failure of the system or method.
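    The failure mode described here (an exact-match search returning nothing for an incomplete query) can be contrasted with a similarity ranking. The sketch below is a generic illustration, not the patented method; the record layout, the use of `None` for unknown fields, and the scoring rule are all assumptions.

```python
def rank_by_partial_match(query, records):
    """Rank records by the fraction of known query fields they match.

    query: dict in which unknown fields are set to None (assumed convention).
    records: list of dicts with the same keys.
    """
    known = {k: v for k, v in query.items() if v is not None}
    scored = []
    for record in records:
        hits = sum(1 for k, v in known.items() if record.get(k) == v)
        score = hits / len(known) if known else 0.0
        scored.append((score, record))
    # The sort is stable, so ties keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

    An exact-match search would reject every record that differs in any field; here the best partial candidates are still returned, with a score that shows how complete each match is.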

  10. Incomplete Dirac reduction of constrained Hamiltonian systems

    SciTech Connect

    Chandre, C.

    2015-10-15

    First-class constraints constitute a potential obstacle to the computation of a Poisson bracket in Dirac’s theory of constrained Hamiltonian systems. Using the pseudoinverse instead of the inverse of the matrix defined by the Poisson brackets between the constraints, we show that a Dirac–Poisson bracket can be constructed, even if it corresponds to an incomplete reduction of the original Hamiltonian system. The uniqueness of Dirac brackets is discussed. The relevance of this procedure for infinite dimensional Hamiltonian systems is exemplified.
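    For orientation, the construction described in this abstract can be written schematically. The notation below (constraints \phi_\alpha, matrix C, pseudoinverse C^{+}) is assumed for illustration and may differ from the paper's own; the first line is the standard Dirac bracket with the inverse of C replaced by its Moore-Penrose pseudoinverse, as the abstract states.

```latex
\{F, G\}_{*} \;=\; \{F, G\} \;-\; \sum_{\alpha,\beta} \{F, \phi_\alpha\}\,\bigl(C^{+}\bigr)_{\alpha\beta}\,\{\phi_\beta, G\},
\qquad
C_{\alpha\beta} \;=\; \{\phi_\alpha, \phi_\beta\},
\qquad
C\,C^{+}\,C \;=\; C .
```

    When C is invertible, C^{+} = C^{-1} and the expression reduces to the usual Dirac bracket. First-class constraints make C singular, which is where the pseudoinverse is needed.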

  11. Catalytic combustion with incompletely vaporized residual fuel

    NASA Technical Reports Server (NTRS)

    Rosfjord, T. J.

    1981-01-01

    Catalytic combustion of fuel-lean mixtures of incompletely vaporized residual fuel and air was investigated. The 7.6 cm diameter, graded cell reactor was constructed from zirconia spinel substrate and catalyzed with a noble metal catalyst. Streams of luminous particles exited the reactor as a result of fuel deposition and carbonization on the substrate. Similar results were obtained with blends of No. 6 and No. 2 oil. Blends of shale residual oil and No. 2 oil resulted in stable operation. In shale oil blends the combustor performance degraded with a reduced degree of fuel vaporization. In tests performed with No. 2 oil a similar effect was observed.

  12. Clinical coding. Code breakers.

    PubMed

    Mathieson, Steve

    2005-02-24

    --The advent of payment by results has seen the role of the clinical coder pushed to the fore in England. --Examinations for a clinical coding qualification began in 1999. In 2004, approximately 200 people took the qualification. --Trusts are attracting people to the role by offering training from scratch or through modern apprenticeships.

  13. 49 CFR 529.4 - Requirements for incomplete automobile manufacturers.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 49 Transportation 6 2010-10-01 2010-10-01 false Requirements for incomplete automobile... AUTOMOBILES § 529.4 Requirements for incomplete automobile manufacturers. (a) Except as provided in paragraph (c) of this section, §§ 529.5 and 529.6, each incomplete automobile manufacturer is considered,...

  14. 49 CFR 529.4 - Requirements for incomplete automobile manufacturers.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 49 Transportation 6 2014-10-01 2014-10-01 false Requirements for incomplete automobile... AUTOMOBILES § 529.4 Requirements for incomplete automobile manufacturers. (a) Except as provided in paragraph (c) of this section, §§ 529.5 and 529.6, each incomplete automobile manufacturer is considered,...

  15. 49 CFR 529.4 - Requirements for incomplete automobile manufacturers.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 49 Transportation 6 2011-10-01 2011-10-01 false Requirements for incomplete automobile... AUTOMOBILES § 529.4 Requirements for incomplete automobile manufacturers. (a) Except as provided in paragraph (c) of this section, §§ 529.5 and 529.6, each incomplete automobile manufacturer is considered,...

  16. 49 CFR 529.4 - Requirements for incomplete automobile manufacturers.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 49 Transportation 6 2012-10-01 2012-10-01 false Requirements for incomplete automobile... AUTOMOBILES § 529.4 Requirements for incomplete automobile manufacturers. (a) Except as provided in paragraph (c) of this section, §§ 529.5 and 529.6, each incomplete automobile manufacturer is considered,...

  17. 49 CFR 529.4 - Requirements for incomplete automobile manufacturers.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 49 Transportation 6 2013-10-01 2013-10-01 false Requirements for incomplete automobile... AUTOMOBILES § 529.4 Requirements for incomplete automobile manufacturers. (a) Except as provided in paragraph (c) of this section, §§ 529.5 and 529.6, each incomplete automobile manufacturer is considered,...

  18. Robust pulmonary lobe segmentation against incomplete fissures

    NASA Astrophysics Data System (ADS)

    Gu, Suicheng; Zheng, Qingfeng; Siegfried, Jill; Pu, Jiantao

    2012-03-01

    Accurate segmentation of the lobes, which are important anatomical landmarks of the human lung, may be useful for characterizing specific lung diseases (e.g., inflammatory, granulomatous, and neoplastic diseases). A number of investigations have shown that pulmonary fissures are often incomplete in image depiction, which makes the computerized identification of individual lobes a challenging task. Our purpose is to develop a fully automated algorithm for accurate identification of individual lobes regardless of the integrity of pulmonary fissures. The underlying idea of the developed lobe segmentation scheme is to use piecewise planes to approximate the detected fissures. After a rotation and a global smoothing, a number of small planes are fitted using local fissure points. The local surfaces are finally combined for lobe segmentation using a quadratic B-spline weighting strategy to ensure that the segmentation is smooth. The performance of the developed scheme was assessed by comparison with a manually created reference standard on a dataset of 30 lung CT examinations. These examinations covered a number of lung diseases and were selected from a large chronic obstructive pulmonary disease (COPD) dataset. The results indicate that our lobe segmentation scheme is efficient and accurate in the presence of incomplete fissures.

  19. Bereavement: an incomplete rite of passage.

    PubMed

    Hunter, Jennifer

    A bereavement ritual observed during anthropological fieldwork in Peru gives basis to this article which asserts that bereavement has become an incomplete rite of passage. The article reviews the role of ritual and rites of passage, examines other anthropologic examples of death and bereavement rituals, and identifies the lack of post-funeral ritual for many bereaved individuals in the United States. While funerary rituals which end with the funeral and burial of the dead are helpful in providing immediate structure for the bereaved, they are not congruent with the long-term emotional needs and reconstruction of meaning within grief. The author acknowledges value of both private ritual and reunions of the community of mourners, and recommends that bereavement counselors and/or the funeral industry offer to help bereaved construct a "ritual of remembrance and new meaning" after time has allowed them to move along in meaning reconstruction processes of making sense, finding benefits, and identity change.

  20. Shape reconstruction methods with incomplete data

    NASA Astrophysics Data System (ADS)

    Nakahata, K.; Kitahara, M.

    2000-05-01

    Linearized inverse scattering methods are applied to the shape reconstruction of defects in elastic solids. The linearized methods are based on the Born approximation in the low frequency range and the Kirchhoff approximation in the high frequency range. Experimental measurements are performed to collect the scattering data from defects. The processed data from the measurements are fed into the linearized methods and the shape of the defect is reconstructed by the two linearized methods. The importance of scattering data in the low frequency range is pointed out not only for Born inversion but also for Kirchhoff inversion. In ultrasonic measurement of a real structure, the access points of the sensor may be limited to one side of the structural surfaces and to a part of the surface. From the viewpoint of application, incomplete scattering data are used as inputs for the shape reconstruction methods and the effect of the sensing points is discussed.

  1. Inflaton dark matter from incomplete decay

    NASA Astrophysics Data System (ADS)

    Bastero-Gil, Mar; Cerezo, Rafael; Rosa, João G.

    2016-05-01

    We show that the decay of the inflaton field may be incomplete, while nevertheless successfully reheating the Universe and leaving a stable remnant that accounts for the present dark matter abundance. We note, in particular, that since the mass of the inflaton decay products is field dependent, one can construct models, endowed with an appropriate discrete symmetry, where inflaton decay is kinematically forbidden at late times and only occurs during the initial stages of field oscillations after inflation. We show that this is sufficient to ensure the transition to a radiation-dominated era and that inflaton particles typically thermalize in the process. They eventually decouple and freeze out, yielding a thermal dark matter relic. We discuss possible implementations of this generic mechanism within consistent cosmological and particle physics scenarios, for both single-field and hybrid inflation.

  2. Regulatory perspective on incomplete control rod insertions

    SciTech Connect

    Chatterton, M.

    1997-01-01

    The incomplete control rod insertions experienced at South Texas Unit 1 and Wolf Creek are of safety concern to the NRC staff because they represent potential precursors to loss of shutdown margin. Even before it was determined if these events were caused by the control rods or by the fuel there was an apparent correlation of the problem with high burnup fuel. It was determined that there was also a correlation between high burnup and high drag forces as well as with rod drop time histories and lack of rod recoil. The NRC staff initial actions were aimed at getting a perspective on the magnitude of the problem as far as the number of plants and the amount of fuel that could be involved, as well as the safety significance in terms of shutdown margin. As tests have been performed and data has been analyzed the focus has shifted more toward understanding the problem and the ways to eliminate it. At this time the staff's understanding of the phenomena is that it was a combination of factors including burnup, power history and temperature. The problem appears to be very sensitive to these factors, the interaction of which is not clearly understood. The model developed by Westinghouse provides a possible explanation but there is not sufficient data to establish confidence levels and sensitivity studies involving the key parameters have not been done. While several fixes to the problem have been discussed, no definitive fixes have been proposed. Without complete understanding of the phenomena, or fixes that clearly eliminate the problem, the safety concern remains. The safety significance depends on the amount of shutdown margin lost due to incomplete insertion of the control rods. Were the control rods to stick high in the core, the reactor could not be shut down by the control rods and other means such as emergency boration would be required.

  3. Projectile - Mass asymmetry systematics for low energy incomplete fusion

    NASA Astrophysics Data System (ADS)

    Singh, Pushpendra P.; Yadav, Abhishek; Sharma, Vijay R.; Sharma, Manoj K.; Kumar, Pawan; Sahoo, Rudra N.; Kumar, R.; Singh, R. P.; Muralithar, S.; Singh, B. P.; Bhowmik, R. K.; Prasad, R.

    2015-06-01

    In the present work, low energy incomplete fusion (ICF), in which only a part of the projectile fuses with the target nucleus, has been investigated in terms of various entrance channel parameters. The ICF strength function has been extracted from the analysis of experimental excitation functions (EFs) measured for different projectile-target combinations from near- to well-above-barrier energies in 12C,16O(from 1.02Vb to 1.64Vb)+169Tm systems. Experimental EFs have been analysed in the framework of the statistical model code PACE4, based on the idea of equilibrated compound nucleus decay. It has been found that the value of the ICF fraction (FICF) increases with incident projectile energy. A substantial fraction of ICF (FICF ≈ 7 %) has been accounted for even at an energy as low as ≈ 7.5% above the barrier (at relative velocity νrel ≈ 0.027) in the 12C+169Tm system, and FICF ≈ 10 % at νrel ≈ 0.014 in the 16O+169Tm system. The probability of ICF is discussed in light of Morgenstern's mass-asymmetry systematics. The value of FICF for 16O+169Tm systems is found to be 18.3 % higher than that observed for 12C+169Tm systems. The present results, together with the re-analysis of existing data for nearby systems, conclusively demonstrate strong competition of ICF with CF even at slightly above-barrier energies, and a strong projectile dependence that seems to supplement Morgenstern's systematics.

  4. Sequence analysis of a 10 kb DNA fragment from yeast chromosome VII reveals a novel member of the DnaJ family.

    PubMed

    Rodriguez-Belmonte, E; Rodriguez-Torres, A M; Tizon, B; Cadahia, J L; Gonzalez-Siso, I; Ramil, E; Becerra, M; Gonzalez-Dominguez, M; Cerdan, E

    1996-02-01

    We report the sequence analysis of a 10 kb DNA fragment of Saccharomyces cerevisiae chromosome VII. This sequence contains four complete open reading frames (ORFs) of greater than 100 amino acids. There are also two incomplete ORFs flanking the extremes: one of these, G2868, is the 5' part of the SCS3 gene (Hosaka et al., 1994). ORFs G2853 and G2856 correspond to the genes CEG1, coding for the alpha subunit of the mRNA guanylyl transferase and a 3' gene of unknown function previously sequenced (Shibagaki et al., 1992). G2864 is identical to SOH1 also reported (Fan and Klein, 1994).
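    The distinction between complete and incomplete ORFs can be made concrete with a minimal forward-strand scan. This is a sketch under simplifying assumptions (ATG start, standard stop codons, one strand only, ORFs truncated at the 3' end of the fragment flagged as incomplete); it is not the annotation pipeline used in the paper, and the function name and return format are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=100):
    """Return (start, end, frame, complete) tuples for forward-strand ORFs.

    An ORF runs from an ATG to the next in-frame stop codon. If the fragment
    ends before a stop is found, the ORF is reported with complete=False.
    min_codons counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        last = frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            last = i + 3
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame, True))
                start = None
        # ORF still open at the end of the fragment: truncated at the 3' end.
        if start is not None and (last - start) // 3 >= min_codons:
            orfs.append((start, last, frame, False))
    return orfs
```

    With `min_codons=100` the scan keeps only ORFs of at least 100 codons, close to the "greater than 100 amino acids" criterion in the abstract. ORFs on the reverse strand, or truncated at the 5' end, would need an additional pass.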

  5. Specific amplification of the HLA-DRB4 gene from c-DNA. Complete coding sequence of the HLA alleles DRB4*0103101 and DRB4*01033.

    PubMed

    De Pablo, R; Solís, R; Balas, A; Vilches, C

    2002-01-01

    We present the complete coding sequence of the HLA alleles DRB4*0103101 and DRB4*01033 derived from the lymphoblastoid cell line G081, established from an individual of Spanish Gypsy ethnic origin. This cell was typed by PCR-SSP and reverse SSO as DRB4*0103101 but further characterization of the DRB4 gene by sequence-based typing (SBT) demonstrated heterozygosity at codon 78 (TAC, TAT). With the aim of confirming this polymorphism, RNA isolated from G081 was subjected to RT-PCR using primers designed to recognize specifically the 5' and 3' UT regions of HLA-DRB4 and the product was cloned and sequenced. Nucleotide sequences derived from seven clones confirmed the heterozygosity of G081, as they corresponded to two open reading frames of 801 nucleotides that matched either DRB4*0103101 or the recently described DRB4*01033, for which a partial sequence, spanning exons 2 and 3, has been reported. The phenotype of G081 (A*01; B*0702, *1302/1303; Cw*0602, *07; DRB1*0403, *0701; DRB4*0103101, *01033; DQB1*0202, *0302; DQA1*0201, *0301) is consistent with a proposed association of DRB4*01033 with DRB1*0403 and DQB1*0302.

  6. Structure of Freund's complete and incomplete adjuvants

    PubMed Central

    Dvorak, Ann M.; Dvorak, H. F.

    1974-01-01

    Emulsions of complete (CFA) and incomplete (IFA) Freund's adjuvants were examined in the light and electron microscopes, and the resulting morphological findings were correlated with the effectiveness of the emulsions as immunological adjuvants. Thick (viscous) emulsions of both IFA and CFA consisted of highly stable, three-dimensional meshworks composed of interconnecting strands of antigen-containing water droplets interspersed in oil phase. Included mycobacteria were confined to this meshwork and were coated with an adherent surface layer of water droplets. Thin Freund's adjuvants were less stable, relatively coarse emulsions, but even in such preparations mycobacteria showed a striking affinity for the surface of water droplets when these contained low concentrations of antigens such as human serum albumin (HSA). The characteristic adjuvant effect of CFA was observed only when associations between mycobacteria and water droplets took place. Thus, no adjuvant effect occurred with oil-in-water (o/w) emulsions, nor when antigen and mycobacteria-in-oil were injected into separate foot pads. Further, a good adjuvant effect was observed even with thin emulsions when mycobacteria-water droplet associations were abundant. These morphological and immunological data suggest that CFA is a device for bringing extrinsic, water-soluble antigens into intimate, stable contact with mycobacteria, thereby conferring on them the ability to elicit an immunological response qualitatively similar to that induced by mycobacteria-in-oil to the intrinsic antigen, tuberculin. PMID:4605156

  7. The Treatment of the Incompletely Descended Testis

    PubMed Central

    Wilson, D. S. Poole

    1939-01-01

    (1) Under three years of age the diagnosis of the incompletely descended testis is uncertain. (2) The policy of awaiting spontaneous descent may be pursued until 10 years of age but, unless the testis lies in the superior scrotal position, this policy should not be persisted in thereafter. (3) Hormonal therapy may be employed before operative treatment as a means of determining testes which will descend spontaneously. It should only be used in the prepuberty period. (4) Operative treatment may be safely carried out at any age after 3 years and should be completed before puberty. The optimum period is between 8 and 11 years. The Bevan operation may be successful when the testis is very mobile but the most consistent results are obtained by the septal transposition or Keetley-Torek operations. PMID:19991991

  8. Deep community detection in topologically incomplete networks

    NASA Astrophysics Data System (ADS)

    Xin, Xin; Wang, Chaokun; Ying, Xiang; Wang, Boyang

    2017-03-01

    In this paper, we consider the problem of detecting communities in topologically incomplete networks (TIN), which are usually observed from real-world networks and in which some edges are missing. Existing approaches to community detection always treat the input network as connected. In real-world applications such as protein-protein interaction networks, however, some edges, and occasionally nearly all of them, are missing. Effectively detecting communities in these observed TIN is therefore a major challenge. First, we put forward a simple but useful method to address the problem. Then, we design a structured deep convolutional neural network (CNN) model to better detect communities in TIN. By gradually removing edges from real-world networks, we show the effectiveness and robustness of our structured deep model on a variety of real-world networks. Moreover, we find that an appropriate choice of hop counts can improve the performance of our deep model to some degree. Finally, experimental results on synthetic data sets also show the good performance of our proposed deep CNN model.

  9. WILLIAM SEAL REJECTING AN INCOMPLETE OR IMPROPERLY SET BEARDSLEY AND ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    WILLIAM SEAL REJECTING AN INCOMPLETE OR IMPROPERLY SET BEARDSLEY AND PIPER ROTOMOLD CORMATIC CORE. - Southern Ductile Casting Company, Core Making, 2217 Carolina Avenue, Bessemer, Jefferson County, AL

  10. A Composite-Likelihood Method for Detecting Incomplete Selective Sweep from Population Genomic Data.

    PubMed

    Vy, Ha My T; Kim, Yuseob

    2015-06-01

    Adaptive evolution occurs as beneficial mutations arise and then increase in frequency by positive natural selection. How, when, and where in the genome such evolutionary events occur is a fundamental question in evolutionary biology. It is possible to detect ongoing positive selection or an incomplete selective sweep in species with sexual reproduction because, when a beneficial mutation is on the way to fixation, homologous chromosomes in the population are divided into two groups: one carrying the beneficial allele with very low polymorphism at nearby linked loci and the other carrying the ancestral allele with a normal pattern of sequence variation. Previous studies developed long-range haplotype tests to capture this difference between two groups as the signal of an incomplete selective sweep. In this study, we propose a composite-likelihood-ratio (CLR) test for detecting incomplete selective sweeps based on the joint sampling probabilities for allele frequencies of two groups as a function of strength of selection and recombination rate. Tested against simulated data, this method yielded statistical power and accuracy in parameter estimation that are higher than the iHS test and comparable to the more recently developed nSL test. This procedure was also applied to African Drosophila melanogaster population genomic data to detect candidate genes under ongoing positive selection. Upon visual inspection of sequence polymorphism, candidates detected by our CLR method exhibited clear haplotype structures predicted under incomplete selective sweeps. Our results suggest that different methods capture different aspects of genetic information regarding incomplete sweeps and thus are partially complementary to each other.

  11. CROSS-DISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY: The structural analysis of protein sequences based on the quasi-amino acids code

    NASA Astrophysics Data System (ADS)

    Zhu, Ping; Tang, Xu-Qing; Xu, Zhen-Yuan

    2009-01-01

    Proteomics is the study of proteins and their interactions in a cell. The successful completion of the Human Genome Project has ushered in the postgenome era, in which proteomics technology is emerging. This paper studies protein molecules from an algebraic point of view. The algebraic system (Σ, +, *) is introduced, where Σ is the set of 64 codons. Based on the characteristics of (Σ, +, *), a novel quasi-amino acids code classification method is introduced and the corresponding algebraic operation table over the set ZU of the 16 kinds of quasi-amino acids is established. The internal relations among the quasi-amino acids are revealed. The results show that there are some very close correlations between the properties of the quasi-amino acids and the codons. These correlations may play an important part in establishing the logical relationship between codons and the quasi-amino acids during the origin of life. According to Ma F et al (2003 J. Anhui Agricultural University 30 439), the corresponding relation and the excellent properties of the amino acids code are very difficult to observe. The present paper shows that (ZU, ⊕, ⊗) is a field. Furthermore, the operational results show that the codon tga has different properties from the other stop codons. In fact, in the human and ox mitochondrial codes, tga codes for tryptophan and is not a stop codon as it is in other genetic codes, as reported by Chen W C et al (2002 Acta Biophysica Sinica 18(1) 87). The present theory avoids some inexplicable features of the code for the 20 kinds of amino acids; in other words, it solves the problem that 'the 64 codon assignments of mRNA to amino acids is probably completely wrong' proposed by Yang (2006 Progress in Modern Biomedicine 6 3).

  12. Complete genome sequence of a Chinese isolate of pepper vein yellows virus and evolutionary analysis based on the CP, MP and RdRp coding regions.

    PubMed

    Liu, Maoyan; Liu, Xiangning; Li, Xun; Zhang, Deyong; Dai, Liangyin; Tang, Qianjun

    2016-03-01

    The genome sequence of pepper vein yellows virus (PeVYV) (PeVYV-HN, accession number KP326573), isolated from pepper plants (Capsicum annuum L.) grown at the Hunan Vegetables Institute (Changsha, Hunan, China), was determined by deep sequencing of small RNAs. The PeVYV-HN genome consists of 6244 nucleotides, contains six open reading frames (ORFs), and is similar to that of an isolate (AB594828) from Japan. Its genomic organization is similar to that of members of the genus Polerovirus. Sequence analysis revealed that PeVYV-HN shared 92% sequence identity with the Japanese PeVYV genome at both the nucleotide and amino acid levels. Evolutionary analysis based on the coat protein (CP), movement protein (MP), and RNA-dependent RNA polymerase (RdRP) showed that PeVYV could be divided into two major lineages corresponding to their geographical origins. The Asian isolates have a higher population expansion frequency than the African isolates. Negative selection and genetic drift (founder effect) were found to be the potential drivers of the molecular evolution of PeVYV. Moreover, recombination was not the distinct cause of PeVYV evolution. This is the first report of a complete genomic sequence of PeVYV in China.

  13. Evolving genetic code

    PubMed Central

    OHAMA, Takeshi; INAGAKI, Yuji; BESSHO, Yoshitaka; OSAWA, Syozo

    2008-01-01

    In 1985, we reported that a bacterium, Mycoplasma capricolum, used a deviant genetic code in which UGA, a “universal” stop codon, was read as tryptophan. This finding, together with the deviant nuclear genetic codes in not a few organisms and a number of mitochondria, shows that the genetic code is not universal, and is in a state of evolution. To account for the changes in codon meanings, we proposed the codon capture theory, which states that all the code changes are non-disruptive, without accompanying changes in the amino acid sequences of proteins. Supporting evidence for the theory is presented in this review. A possible evolutionary process from the ancient to the present-day genetic code is also discussed. PMID:18941287
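    The effect of reassigning UGA can be seen by translating the same message with two codon tables. The sketch below builds the standard table from the conventional compact amino-acid string (bases ordered U, C, A, G) and derives a variant that differs only in reading UGA as tryptophan; the variable names and the example message are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Variant table: identical except that UGA is read as tryptophan (W).
UGA_AS_TRP = {**STANDARD, "UGA": "W"}


def translate(mrna, table):
    """Translate an mRNA string codon by codon, stopping at the first stop."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


message = "AUGUGAUUUUAA"  # AUG UGA UUU UAA (made-up example)
print(translate(message, STANDARD))    # M
print(translate(message, UGA_AS_TRP))  # MWF
```

    Under the standard table the message terminates at UGA after a single residue, whereas the variant table reads through it.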

  14. A new PCR primer for the identification of Paracoccidioides brasiliensis based on rRNA sequences coding the internal transcribed spacers (ITS) and 5.8S regions.

    PubMed

    Imai, T; Sano, A; Mikami, Y; Watanabe, K; Aoki, F H; Branchini, M L; Negroni, R; Nishimura, K; Miyaji, M

    2000-08-01

    Internal transcribed spacer (ITS) genes including the 5.8S ribosomal (r)RNA of Paracoccidioides brasiliensis were amplified and the DNA sequences were determined. Based on a comparison of the sequence information, a new polymerase chain reaction (PCR) primer pair was designed for specific amplification of DNA for P. brasiliensis. This primer pair amplified a 418-bp DNA sequence and was 100% successful in identifying 29 strains of P. brasiliensis (including the reference strains) isolated from the regions of Brazil, Costa Rica, Japan, Argentina or from different sources. The results of specificity tests of these primers to compare the fungus with those of Aspergillus fumigatus, Blastomyces dermatitidis, Candida albicans, Cryptococcus neoformans, Histoplasma capsulatum and Penicillium marneffei are also reported.

  15. Phylogenetic footprinting of non-coding RNA: hammerhead ribozyme sequences in a satellite DNA family of Dolichopoda cave crickets (Orthoptera, Rhaphidophoridae)

    PubMed Central

    2010-01-01

    Background The great variety in sequence, length, complexity, and abundance of satellite DNA has made it difficult to ascribe any function to this genome component. Recent studies have shown that satellite DNA can be transcribed and be involved in regulation of chromatin structure and gene expression. Some satellite DNAs, such as the pDo500 sequence family in Dolichopoda cave crickets, have a catalytic hammerhead (HH) ribozyme structure and activity embedded within each repeat. Results We assessed the phylogenetic footprints of the HH ribozyme within the pDo500 sequences from 38 different populations representing 12 species of Dolichopoda. The HH region was significantly more conserved than the non-hammerhead (NHH) region of the pDo500 repeat. In addition, stems were more conserved than loops. In stems, several compensatory mutations were detected that maintain base pairing. The core region of the HH ribozyme was affected by very few nucleotide substitutions and the cleavage position was altered only once among 198 sequences. RNA folding of the HH sequences revealed that a potentially active HH ribozyme can be found in most of the Dolichopoda populations and species. Conclusions The phylogenetic footprints suggest that the HH region of the pDo500 sequence family is selected for function in Dolichopoda cave crickets. However, the functional role of HH ribozymes in eukaryotic organisms is unclear. The possible functions have been related to trans cleavage of an RNA target by a ribonucleoprotein and regulation of gene expression. Whether the HH ribozyme in Dolichopoda is involved in similar functions remains to be investigated. Future studies need to demonstrate how the observed nucleotide changes and evolutionary constraint have affected the catalytic efficiency of the hammerhead. PMID:20047671

  16. Analysis of the coding-complete genomic sequence of groundnut ringspot virus suggests a common ancestor with tomato chlorotic spot virus.

    PubMed

    de Breuil, Soledad; Cañizares, Joaquín; Blanca, José Miguel; Bejerman, Nicolás; Trucco, Verónica; Giolitti, Fabián; Ziarsolo, Peio; Lenardon, Sergio

    2016-08-01

    Groundnut ringspot virus (GRSV) and tomato chlorotic spot virus (TCSV) share biological and serological properties, so their identification is carried out by molecular methods. Their genomes consist of three segmented RNAs: L, M and S. The finding of a reassortant between these two viruses may complicate correct virus identification and requires the characterization of the complete genome. Therefore, we present for the first time the complete sequences of all the genes encoded by a GRSV isolate. The high level of sequence similarity between GRSV and TCSV (over 90 % identity) observed in the genes and proteins encoded in the M RNA supports previous results indicating that these viruses probably have a common ancestor.

  17. Auditory Compensation for Head Rotation Is Incomplete

    PubMed Central

    2016-01-01

    Hearing is confronted by a similar problem to vision when the observer moves. The image motion that is created remains ambiguous until the observer knows the velocity of eye and/or head. One way the visual system solves this problem is to use motor commands, proprioception, and vestibular information. These “extraretinal signals” compensate for self-movement, converting image motion into head-centered coordinates, although not always perfectly. We investigated whether the auditory system also transforms coordinates by examining the degree of compensation for head rotation when judging a moving sound. Real-time recordings of head motion were used to change the “movement gain” relating head movement to source movement across a loudspeaker array. We then determined psychophysically the gain that corresponded to a perceptually stationary source. Experiment 1 showed that the gain was small and positive for a wide range of trained head speeds. Hence, listeners perceived a stationary source as moving slightly opposite to the head rotation, in much the same way that observers see stationary visual objects move against a smooth pursuit eye movement. Experiment 2 showed the degree of compensation remained the same for sounds presented at different azimuths, although the precision of performance declined when the sound was eccentric. We discuss two possible explanations for incomplete compensation, one based on differences in the accuracy of signals encoding image motion and self-movement and one concerning statistical optimization that sacrifices accuracy for precision. We then consider the degree to which such explanations can be applied to auditory motion perception in moving listeners. PMID:27841453

  18. Spectral Regularization Algorithms for Learning Large Incomplete Matrices.

    PubMed

    Mazumder, Rahul; Hastie, Trevor; Tibshirani, Robert

    2010-03-01

    We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example it can obtain a rank-80 approximation of a 10^6 × 10^6 incomplete matrix with 10^5 observed entries in 2.5 hours, and can fit a rank-40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the-art techniques.
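
    The iteration this abstract describes (fill the missing entries from the current estimate, then soft-threshold the singular values) can be sketched as follows. This is a minimal illustration, assuming NumPy and a dense full SVD rather than the structured low-rank SVD the abstract mentions. The name `soft_impute`, its parameters, and the stopping rule are hypothetical choices and are not taken from the paper.

```python
import numpy as np

def soft_impute(X, mask, lam, n_iters=100, tol=1e-5):
    """Illustrative Soft-Impute loop (not the authors' implementation).

    X    : 2-D array; values at unobserved positions are ignored.
    mask : boolean array, True where X is observed.
    lam  : soft-threshold applied to the singular values.
    """
    Z = np.zeros_like(X, dtype=float)  # current low-rank estimate
    for _ in range(n_iters):
        # Keep observed entries, fill unobserved ones from the estimate.
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-threshold the singular values: shrink by lam, floor at zero.
        s_thr = np.maximum(s - lam, 0.0)
        Z_new = (U * s_thr) @ Vt
        # Relative squared change in the estimate (assumed stopping rule).
        change = np.linalg.norm(Z_new - Z) ** 2 / max(np.linalg.norm(Z) ** 2, 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z
```

    A regularization path could be explored by calling the function over a decreasing grid of `lam` values. This sketch restarts from zeros on each call, so it does not reproduce the warm starts that the abstract credits for efficiency.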

  19. Synesthesia in twins: incomplete concordance in monozygotes suggests extragenic factors.

    PubMed

    Bosley, Hannah G; Eagleman, David M

    2015-06-01

    Colored-sequence synesthesia (CSS) is a neurological condition in which sequential stimuli such as letters, numbers, or days of the week trigger simultaneous, involuntary color perception. Although the condition appears to run in families and several studies have sought a genetic link, the genetic contribution to synesthesia remains unclear. We conducted the first comparative twin study of CSS and found that CSS has a pairwise concordance of 73.9% in monozygotic twins, and a pairwise concordance of 36.4% in dizygotic twins. In line with previous studies, our results suggest a heritable element of synesthesia. However, consonant with the findings of previous single-pair case studies, our large sample size verifies that synesthesia is not completely conferred by genetics; if it were, monozygotic twins should have 100% concordance. These findings implicate a genetic mechanism of CSS that may work differently than previously thought: collectively, our data suggest that synesthesia is a heritable condition with incomplete penetrance that is substantially influenced by epigenetic and environmental factors.

  20. Analysis of partial sequences of genes coding for 16S rRNA of actinomycetes isolated from Casuarina equisetifolia nodules in Mexico.

    PubMed Central

    Niner, B M; Brandt, J P; Villegas, M; Marshall, C R; Hirsch, A M; Valdés, M

    1996-01-01

    Filamentous bacteria isolated from surface-sterilized nodules of Casuarina equisetifolia trees in México were capable of reducing acetylene, a diagnostic test for nitrogenase, but were unable to nodulate their host. Analysis of partial 16S rRNA gene sequences suggests that the Mexican isolates are not Frankia strains but members of a novel clade. PMID:8702297

  1. Sequence variability in HC-Pro coding regions of Korean Soybean mosaic virus isolates is associated with differences in RNA silencing suppression

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Soybean mosaic virus (SMV), a member of the family Potyviridae, is an important viral pathogen affecting soybean production in Korea. The variability in helper component proteinase (HC-Pro) sequences and pathogenicity of SMV tissue samples from seven Korean provinces was investigated and compared wi...

  2. Ethical coding.

    PubMed

    Resnik, Barry I

    2009-01-01

    It is ethical, legal, and proper for a dermatologist to maximize income through proper coding of patient encounters and procedures. The overzealous physician can misinterpret reimbursement requirements or receive bad advice from other physicians and cross the line from aggressive coding to coding fraud. Several of the more common problem areas are discussed.

  3. Sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R) is not associated with plumage variation in the blue-crowned manakin (Lepidothrix coronata).

    PubMed

    Cheviron, Z A; Hackett, Shannon J; Brumfield, Robb T

    2006-07-07

    Avian plumage traits are the targets of both natural and sexual selection. Consequently, genetic changes resulting in plumage variation among closely related taxa might represent important evolutionary events. The molecular basis of such differences, however, is unknown in most cases. Sequence variation in the melanocortin-1 receptor gene (MC1R) is associated with melanistic phenotypes in many vertebrate taxa, including several avian species. The blue-crowned manakin (Lepidothrix coronata), a widespread, sexually dichromatic passerine, exhibits striking geographic variation in male plumage colour across its range in southern Central America and western Amazonia. Northern males are black with brilliant blue crowns whereas southern males are green with lighter blue crowns. We sequenced 810 bp of the MC1R coding region in 23 individuals spanning the range of male plumage variation. The only variable sites we detected among L. coronata sequences were four synonymous substitutions, none of which were strictly associated with either plumage type. Similarly, comparative analyses showed that L. coronata sequences were monomorphic at the three amino acid sites hypothesized to be functionally important in other birds. These results demonstrate that genes other than MC1R underlie melanic plumage polymorphism in blue-crowned manakins.

  4. NemaFootPrinter: a web based software for the identification of conserved non-coding genome sequence regions between C. elegans and C. briggsae

    PubMed Central

    Rambaldi, Davide; Guffanti, Alessandro; Morandi, Paolo; Cassata, Giuseppe

    2005-01-01

    Background: NemaFootPrinter (Nematode Transcription Factor Scan Through Phylogenetic Footprinting) is web-based software for interactive identification of conserved, non-exonic DNA segments in the genomes of C. elegans and C. briggsae. It has been implemented according to the following project specifications: a) Automated identification of orthologous gene pairs. b) Interactive selection of the boundaries of the genes to be compared. c) Pairwise sequence comparison with a range of different methods. d) Identification of putative transcription factor binding sites on conserved, non-exonic DNA segments. Results: Starting from a C. elegans or C. briggsae gene name or identifier, the software identifies the putative ortholog (if any), based on information derived from public nematode genome annotation databases. The investigator can then retrieve the genome DNA sequences of the two orthologous genes; visualize graphically the genes' intron/exon structure and the surrounding DNA regions; and select, through an interactive graphical user interface, subsequences of the two gene regions. Using a bioinformatics toolbox (Blast2seq, Dotmatcher, Ssearch and connection to the rVista database), the investigator is able at the end of the procedure to identify and analyze significant sequence similarities, detecting the presence of transcription factor binding sites corresponding to the conserved segments. The software automatically masks exons. Discussion: This software is intended as a practical and intuitive tool for researchers interested in the identification of non-exonic conserved sequence segments between C. elegans and C. briggsae. These sequences may contain regulatory transcriptional elements since they are conserved between two related, but rapidly evolving genomes. This software also highlights the power of genome annotation databases when they are conceived as an open resource and the possibilities offered by seamless integration of different web services via the http

  5. A Method for the Annotation of Functional Similarities of Coding DNA Sequences: the Case of a Populated Cluster of Transmembrane Proteins.

    PubMed

    Fuertes, Miguel Angel; Rodrigo, José Ramón; Alonso, Carlos

    2017-01-01

    The analysis of a large number of human and mouse genes coding for a populated cluster of transmembrane proteins revealed that some of the genes vary significantly in their primary nucleotide sequence, both between and within species. Despite that divergence, and given that all these genes share a common parental function, we asked whether at the DNA level they have some kind of common compositional structure that is not evident from the analysis of their primary nucleotide sequence. To reveal the existence of gene clusters not based on primary sequence relationships, we analyzed 13574 human and 14047 mouse genes by the composon-clustering methodology. The data presented show that most of the genes from each sample are distributed in 18 clusters, with common compositional features shared between the corresponding human and mouse clusters. In addition, large variations in gene population were detected between particular human and mouse clusters having similar composon profiles, an indication that a significant number of orthologs between the two species differ in compositional features. A gene cluster containing exclusively genes coding for transmembrane proteins, an important fraction of which belong to the Rhodopsin G-protein coupled receptor superfamily, was also detected. This indicates that even though some of them display low sequence similarity, all of them, in both species, share similar compositional features in terms of composons. We conclude that in this family of transmembrane proteins in general, and in the Rhodopsin G-protein coupled receptors in particular, composon clustering reveals a type of common compositional structure underlying the primary nucleotide sequence that is closely correlated with function.

  6. 40 CFR 86.085-20 - Incomplete vehicles, classification.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... and Heavy-Duty Engines, and for 1985 and Later Model Year New Gasoline Fueled, Natural Gas-Fueled, Liquefied Petroleum Gas-Fueled and Methanol-Fueled Heavy-Duty Vehicles § 86.085-20 Incomplete vehicles... 40 Protection of Environment 19 2014-07-01 2014-07-01 false Incomplete vehicles,...

  7. Calculating Balanced Incomplete Block Design for Educational Assessments.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Carlson, James E.

    A popular design in large-scale educational assessments is the balanced incomplete block design. The design assumes that the item pool is split into a set of blocks of items that are assigned to assessment booklets. This paper shows how the technique of 0-1 linear programming can be used to calculate a balanced incomplete block design. Several…
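
    The balance conditions behind such a design can be checked directly once booklets are written as sets of item blocks. The sketch below is only a verifier, not the 0-1 linear programming method the paper describes. The function name `bibd_parameters` and the example booklets (the seven-point Fano plane) are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def bibd_parameters(booklets):
    """Return (v, b, r, k, lam) if the booklets form a balanced incomplete
    block design over item blocks, otherwise None.

    v: number of item blocks, b: number of booklets,
    r: booklets per item block, k: item blocks per booklet,
    lam: booklets in which each pair of item blocks meets.
    """
    booklets = [frozenset(bk) for bk in booklets]
    blocks = sorted(set().union(*booklets))
    sizes = {len(bk) for bk in booklets}
    if len(sizes) != 1:
        return None  # booklets must all hold the same number of item blocks
    k = sizes.pop()
    # How often each item block, and each pair of item blocks, appears.
    block_counts = Counter(x for bk in booklets for x in bk)
    pair_counts = Counter(p for bk in booklets for p in combinations(sorted(bk), 2))
    r_values = set(block_counts.values())
    lam_values = {pair_counts.get(p, 0) for p in combinations(blocks, 2)}
    if len(r_values) != 1 or len(lam_values) != 1:
        return None  # replication or pairing is unbalanced
    return len(blocks), len(booklets), r_values.pop(), k, lam_values.pop()

# Hypothetical example: seven item blocks spread over seven booklets.
fano = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
print(bibd_parameters(fano))
```

    In this example every item block appears in three booklets and every pair of item blocks shares exactly one booklet, so the check succeeds.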

  8. Loss of Information in Estimating Item Parameters in Incomplete Designs

    ERIC Educational Resources Information Center

    Eggen, Theo J. H. M.; Verelst, Norman D.

    2006-01-01

    In this paper, the efficiency of conditional maximum likelihood (CML) and marginal maximum likelihood (MML) estimation of the item parameters of the Rasch model in incomplete designs is investigated. The use of the concept of F-information (Eggen, 2000) is generalized to incomplete testing designs. The scaled determinant of the F-information…

  9. 49 CFR 630.6 - Late and incomplete reports.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 49 Transportation 7 2012-10-01 2012-10-01 false Late and incomplete reports. 630.6 Section 630.6 Transportation Other Regulations Relating to Transportation (Continued) FEDERAL TRANSIT ADMINISTRATION, DEPARTMENT OF TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.6 Late and incomplete reports. (a) Late...

  10. 49 CFR 630.6 - Late and incomplete reports.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 49 Transportation 7 2013-10-01 2013-10-01 false Late and incomplete reports. 630.6 Section 630.6 Transportation Other Regulations Relating to Transportation (Continued) FEDERAL TRANSIT ADMINISTRATION, DEPARTMENT OF TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.6 Late and incomplete reports. (a) Late...

  11. 49 CFR 630.6 - Late and incomplete reports.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 49 Transportation 7 2011-10-01 2011-10-01 false Late and incomplete reports. 630.6 Section 630.6 Transportation Other Regulations Relating to Transportation (Continued) FEDERAL TRANSIT ADMINISTRATION, DEPARTMENT OF TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.6 Late and incomplete reports. (a) Late...

  12. 49 CFR 630.6 - Late and incomplete reports.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 49 Transportation 7 2014-10-01 2014-10-01 false Late and incomplete reports. 630.6 Section 630.6 Transportation Other Regulations Relating to Transportation (Continued) FEDERAL TRANSIT ADMINISTRATION, DEPARTMENT OF TRANSPORTATION NATIONAL TRANSIT DATABASE § 630.6 Late and incomplete reports. (a) Late...

  13. Treatment of Intravenous Leiomyomatosis with Cardiac Extension following Incomplete Resection

    PubMed Central

    Doyle, Mathew P.; Li, Annette; Villanueva, Claudia I.; Peeceeyen, Sheen C. S.; Cooper, Michael G.; Hanel, Kevin C.; Fermanis, Gary G.; Robertson, Greg

    2015-01-01

    Aim. Intravenous leiomyomatosis (IVL) with cardiac extension (CE) is a rare variant of benign uterine leiomyoma. Incomplete resection has a recurrence rate of over 30%. Different hormonal treatments have been described following incomplete resection; however, no standard therapy currently exists. We review the literature for medical treatment options following incomplete resection of IVL with CE. Methods. Electronic databases were searched for all studies reporting IVL with CE. These studies were then searched for reports of patients with inoperable or incomplete resection and any further medical treatments. Our database was searched for patients with medical therapy following incomplete resection of IVL with CE and their results were included. Results. All studies were either case reports or case series. Five literature reviews confirm that surgery is the only treatment to achieve cure. The uses of progesterone, estrogen modulation, gonadotropin-releasing hormone antagonism, and aromatase inhibition have been described following incomplete resection. Currently no studies have reviewed the outcomes of these treatments. Conclusions. Complete surgical resection is the only means of cure for IVL with CE, while multiple hormonal therapies have been used with varying results following incomplete resection. Aromatase inhibitors are the only reported treatment to prevent tumor progression or recurrence in patients with incompletely resected IVL with CE. PMID:26783463

  14. A reflection of the coding of meaning in patient-physician interaction: Jürgen Habermas' theory of communication applied to sequence analysis.

    PubMed

    Skirbekk, Helge

    2004-08-01

    This paper introduces parts of Jürgen Habermas' theory of communication in an attempt to understand how meaning is coded in patient-physician communication. By having a closer look at how patients and physicians make assertions with their utterances, light will be shed on difficult aspects of reaching understanding in the clinical encounter. Habermas' theory will be used to differentiate assertions into validity claims referring to truth, truthfulness and rightness. An analysis of hypothetical physician-replies to a patient suffering from back pains will substantiate the necessity for such a theory.

  15. Omnipotent decoding potential resides in eukaryotic translation termination factor eRF1 of variant-code organisms and is modulated by the interactions of amino acid sequences within domain 1.

    PubMed

    Ito, Koichi; Frolova, Ludmila; Seit-Nebi, Alim; Karamyshev, Andrey; Kisselev, Lev; Nakamura, Yoshikazu

    2002-06-25

    In eukaryotes, a single translational release factor, eRF1, deciphers three stop codons, although its decoding mechanism remains puzzling. In the ciliate Tetrahymena thermophila, UAA and UAG codons are reassigned to Gln codons. A yeast eRF1-domain swap containing Tetrahymena domain 1 responded only to UGA in vitro and failed to complement a defect in yeast eRF1 in vivo at 37 degrees C. This finding demonstrates that decoding specificity of eRF1 from variant code organisms resides at domain 1. However, the wild-type eRF1 hybrid fully restored the growth of eRF1-deficient yeast at 30 degrees C. Tetrahymena eRF1 contains a variant sequence, KATNIKD, at the tip of domain 1. The TASNIKD variant of hybrid eRF1 rendered the eRF1-nullified yeast viable, although in an in vitro assay, the same hybrid eRF1 responded only to UGA. Nevertheless, the yeast eRF1 bearing the KATNIKD motif instead of the TASNIKS heptapeptide present in higher eukaryotes remains omnipotent in vivo. Collectively, these data suggest that variant genetic code organisms like Tetrahymena have an intrinsic potential to decode three stop codons in vivo, and that interaction within domain 1 between the KAT tripeptide and other sequences modulates the decoding specificity of Tetrahymena eRF1.

  16. Sequence of the intron/exon junctions of the coding region of the human androgen receptor gene and identification of a point mutation in a family with complete androgen insensitivity.

    PubMed

    Lubahn, D B; Brown, T R; Simental, J A; Higgs, H N; Migeon, C J; Wilson, E M; French, F S

    1989-12-01

    Androgens act through a receptor protein (AR) to mediate sex differentiation and development of the male phenotype. We have isolated the eight exons in the amino acid coding region of the AR gene from a human X chromosome library. Nucleotide sequences of the AR gene intron/exon boundaries were determined for use in designing synthetic oligonucleotide primers to bracket coding exons for amplification by the polymerase chain reaction. Genomic DNA was amplified from 46,XY phenotypic female siblings with complete androgen insensitivity syndrome. AR binding affinity for dihydrotestosterone in the affected siblings was lower than in normal males, but the binding capacity was normal. Sequence analysis of amplified exons demonstrated within the AR steroid-binding domain (exon G) a single guanine to adenine mutation, resulting in replacement of valine with methionine at amino acid residue 866. As expected, the carrier mother had both normal and mutant AR genes. Thus, a single point mutation in the steroid-binding domain of the AR gene correlated with the expression of an AR protein ineffective in stimulating male sexual development.

  17. Incomplete suppression of distractor-related activity in the frontal eye field results in curved saccades.

    PubMed

    McPeek, Robert M

    2006-11-01

    Saccades in the presence of distractors show significant trajectory curvature. Based on previous work in the superior colliculus (SC), we speculated that curvature arises when a movement is initiated before competition between the target and distractor goals has been fully resolved. To test this hypothesis, we recorded frontal eye field (FEF) activity for curved and straight saccades in search. In contrast to the SC, activity in FEF is normally poorly correlated with saccade dynamics. However, the FEF, like the SC, is involved in target selection. Thus if curvature is caused by incomplete target selection, we expect to see its neural correlates in the FEF. We found that saccades that curve toward a distractor are accompanied by an increase in perisaccadic activity of FEF neurons coding the distractor location, and saccades that curve away are accompanied by a decrease in activity. In contrast, for FEF neurons coding the target location, there is no significant difference in activity between curved and straight saccades. To establish that the distractor-related activity is causally related to saccade curvature, we applied microstimulation to sites in the FEF before saccades to targets presented without distractors. The stimulation was subthreshold for evoking saccades and the temporal structure of the stimulation train resembled the activity recorded for curved saccades. The resulting movements curved toward the location coded by the stimulation site. These results support the idea that saccade curvature results from incomplete suppression of distractor-related activity during target selection.

  18. Direct resequencing of the complete ERBB2 coding sequence reveals an absence of activating mutations in ERBB2 amplified breast cancer.

    PubMed

    Zito, Christina I; Riches, David; Kolmakova, Julia; Simons, Jan; Egholm, Michael; Stern, David F

    2008-07-01

    Gene amplification is among the most common genetic abnormalities that cause cancer. One of the most clinically important gene amplifications in human cancer causes extensive reduplication of ERBB2. A variety of cancers also occasionally harbor somatic mutations in ERBB2. Gene amplification and activating mutations both have predictive value for clinical response to targeted inhibitors. Since the number of gene copies in an amplicon may exceed 100, and since amplicons may encompass multiple genes, high-resolution analysis of gene amplifications poses considerable technical challenges. We have overcome this obstacle by using emulsion-based resequencing to determine the sequence of many independently-amplified individual DNA molecules in parallel. We used this high throughput sequencing technology to analyze ERBB2 mutational status in five ERBB2 amplified cell lines (four breast, one ovarian) and two breast tumors. Genomic DNA was isolated and the 28 exons of ERBB2 were independently amplified. Amplicons were then pooled at equimolar ratios, subjected to emulsion PCR (emPCR) and finally to picotiter plate pyrosequencing. High-quality sequence data were obtained for all amplicons analyzed and no activating mutations within ERBB2 were identified. Although we did not find activating mutations within the multiple copies of ERBB2 in these samples, the results establish the utility of this technology as a feasible and cost-effective approach for high resolution analysis of amplified genes.

  19. Characterization of tissue expression and full-length coding sequence of a novel human gene mapping at 3q12.1 and transcribed in oligodendrocytes.

    PubMed

    Fayein, Nicole-Adeline; Stankoff, Bruno; Auffray, Charles; Devignes, Marie-Dominique

    2002-05-01

    Macro-array differential hybridization of a collection of 5058 human gene transcripts represented in an IMAGE infant brain cDNA library has led to the identification of transcripts displaying preferential or specific expression in brain (Genome Res. 9 (1999) 195; http://idefix.upr420.vjf.cnrs.fr/IMAGE). Most of these genes correspond to as yet undescribed functions. Detailed characterization of the expression, sequence, and genome assignment of one of these genes named C3orf4, is reported here. The full-length sequence of the transcript was obtained by 5' extension RT-PCR. The gene transcript (2.8 kb) encodes a 253 amino acid long protein, with four transmembrane domains. The position of the C3orf4 gene was determined at 3q12.1 thanks to the draft sequence of the human genome. It is composed of five exons spanning more than 7 kb. No TATAA box but a CpG island was found upstream of the beginning of the gene. Northern blot analysis and in situ hybridization revealed a predominant expression in myelinated structures such as corpus callosum and spinal cord. RT-PCR showed expression of the C3orf4 gene in rat optic nerve and cultured oligodendrocytes, the myelinating cells of the central nervous system, but not in astrocytes. This work supports further investigations aimed at determining the role of the C3orf4 gene in myelinating cells.

  20. De Novo Assembly of Coding Sequences of the Mangrove Palm (Nypa fruticans) Using RNA-Seq and Discovery of Whole-Genome Duplications in the Ancestor of Palms.

    PubMed

    He, Ziwen; Zhang, Zhang; Guo, Wuxia; Zhang, Ying; Zhou, Renchao; Shi, Suhua

    2015-01-01

    Nypa fruticans (Arecaceae) is the only monocot species of true mangroves. This species represents the earliest mangrove fossil recorded. How N. fruticans adapts to the harsh and unstable intertidal zone is an interesting question. However, the 60 gene segments deposited in NCBI are insufficient for solving this question. In this study, we sequenced, assembled and annotated the transcriptome of N. fruticans using next-generation sequencing technology. A total of 19,918,800 clean paired-end reads were de novo assembled into 45,368 unigenes with an N50 length of 1,096 bp. A total of 41.35% of the unigenes were functionally annotated using Blast2GO. Many genes annotated to "response to stress" and 15 putative positively selected genes were identified. Simple sequence repeats were identified and compared with other palms. The divergence time between N. fruticans and other palms was estimated at 75 million years ago using the genomic data, which is consistent with the fossil record. After calculating the synonymous substitution rate between paralogs, we found that two whole-genome duplication events were shared by N. fruticans and other palms. These duplication events provided a large amount of raw material for the more than 2,000 later speciation events in Arecaceae. This study provides a high-quality resource for further functional and evolutionary studies of N. fruticans and palms in general.
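
    The N50 length quoted for the assembly is a summary statistic of the unigene length distribution. A minimal sketch of the usual definition follows, under the assumption that N50 is the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly. The function name `n50` and the sample lengths are illustrative and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First sequence at which the cumulative length covers half the total.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")

# Illustrative lengths only (total 54, so half is 27).
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```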

  1. Novel features in the genetic code and codon reading patterns in Neurospora crassa mitochondria based on sequences of six mitochondrial tRNAs

    PubMed Central

    Heckman, Joyce E.; Sarnoff, Joshua; Alzner-DeWeerd, Birgit; Yin, Samuel; RajBhandary, Uttam L.

    1980-01-01

    We report the sequences of Neurospora crassa mitochondrial alanine, leucine1, leucine2, threonine, tryptophan, and valine tRNAs. On the basis of the anticodon sequences of these tRNAs and of a glutamine tRNA, whose sequence analysis is nearly complete, we infer the following: (i) The N. crassa mitochondrial tRNA species for alanine, leucine2, threonine, and valine, amino acids that belong to four-codon families (GCN, CUN, ACN, and GUN, respectively; N = U, C, A, or G) all contain an unmodified U in the first position of the anticodon. In contrast, tRNA species for glutamine, leucine1, and tryptophan, amino acids that use codons ending in purines (CA(A/G), UU(A/G), and UG(A/G), respectively) contain a modified U derivative in the same position. These findings and the fact that we have not detected any other isoacceptor tRNAs for these amino acids suggest that N. crassa mitochondrial tRNAs containing U in the first position of the anticodon are capable of reading all four codons of a four-codon family whereas those containing a modified U are restricted to reading codons ending in A or G. Such an expanded codon-reading ability of certain mitochondrial tRNAs will explain how the mitochondrial protein-synthesizing system operates with a much lower number of tRNA species than do systems present in prokaryotes or in eukaryotic cytoplasm. (ii) The anticodon sequence of the N. crassa mitochondrial tryptophan tRNA is U*CA and not CCA or CmCA as is the case with tryptophan tRNAs from prokaryotes or from eukaryotic cytoplasm. Because a tRNA with U*CA in the anticodon would be expected to read the codon UGA, as well as the normal tryptophan codon UGG, this suggests that in N. crassa mitochondria, as in yeast and in human mitochondria, UGA is a codon for tryptophan and not a signal for chain termination. (iii) The anticodon sequences of the two leucine tRNAs indicate that N. crassa mitochondria use both families of leucine codons (UU(A/G) and CUN; N = U, C, A, or G) for leucine, in
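
    The codon-reading rule inferred in point (i) of this abstract can be written as a small lookup. This is a sketch under stated assumptions: anticodons are given 5' to 3' with U in the first (wobble) position, the other two positions pair by standard Watson-Crick complementarity, and the alanine anticodon UGC is derived here from the GCN family rather than quoted from the abstract. The name `codons_read` is hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codons_read(anticodon, modified_u=False):
    """List the codons (5'->3') read by a tRNA whose anticodon (5'->3')
    starts with U, following the rule stated in the abstract:
    unmodified U reads all four third-position bases, modified U only A or G.
    """
    if len(anticodon) != 3 or anticodon[0] != "U":
        raise ValueError("expects a 3-base anticodon with U in the first position")
    # Codon positions 1 and 2 pair with anticodon positions 3 and 2.
    first = COMPLEMENT[anticodon[2]]
    second = COMPLEMENT[anticodon[1]]
    third_options = "AG" if modified_u else "UCAG"
    return [first + second + third for third in third_options]

# Tryptophan tRNA with U*CA (modified U), as reported in the abstract.
print(codons_read("UCA", modified_u=True))  # ['UGA', 'UGG']
# Alanine tRNA with unmodified U; anticodon UGC is an assumed example.
print(codons_read("UGC"))                   # ['GCU', 'GCC', 'GCA', 'GCG']
```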

  2. The sequence of sequencers: The history of sequencing DNA.

    PubMed

    Heather, James M; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way.

  3. Autocatalysis, information and coding.

    PubMed

    Wills, P R

    2001-01-01

    Autocatalytic self-construction in macromolecular systems requires the existence of a reflexive relationship between structural components and the functional operations they perform to synthesise themselves. The possibility of reflexivity depends on formal, semiotic features of the catalytic structure-function relationship, that is, the embedding of catalytic functions in the space of polymeric structures. Reflexivity is a semiotic property of some genetic sequences. Such sequences may serve as the basis for the evolution of coding as a result of autocatalytic self-organisation in a population of assignment catalysts. Autocatalytic selection is a mechanism whereby matter becomes differentiated in primitive biochemical systems. In the case of coding self-organisation, it corresponds to the creation of symbolic information. Prions are present-day entities whose replication through autocatalysis reflects aspects of biological semiotics less obvious than genetic coding.

  4. Complete genome sequence of Desulfohalobium retbaense type strain (HR100).

    PubMed

    Spring, Stefan; Nolan, Matt; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Land, Miriam; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Munk, Christine; Kiss, Hajnalka; Chain, Patrick; Han, Cliff; Brettin, Thomas; Detter, John C; Schüler, Esther; Göker, Markus; Rohde, Manfred; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-01-28

    Desulfohalobium retbaense (Ollivier et al. 1991) is the type species of the polyphyletic genus Desulfohalobium, which comprises, at the time of writing, two species and represents the family Desulfohalobiaceae within the Deltaproteobacteria. D. retbaense is a moderately halophilic sulfate-reducing bacterium, which can utilize H2 and a limited range of organic substrates, which are incompletely oxidized to acetate and CO2, for growth. The type strain HR100^T was isolated from sediments of the hypersaline Retba Lake in Senegal. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the family Desulfohalobiaceae. The 2,909,567 bp genome (one chromosome and a 45,263 bp plasmid) with its 2,552 protein-coding and 57 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Community structure of non-coding RNA interaction network.

    PubMed

    Nacher, Jose C

    2013-04-02

    Rapid technological advances have shown that the ratio of non-protein coding genes rises to 98.5% in humans, suggesting that current knowledge on genetic information processing might be largely incomplete. It implies that protein-coding sequences only represent a small fraction of cellular transcriptional information. Here, we examine the community structure of the network defined by functional interactions between non-coding RNAs (ncRNAs) and proteins related bio-macromolecules (PRMs) using a two-fold approach: modularity in bipartite network and k-clique community detection. First, the high modularity scores as well as the distribution of community sizes showing a scaling-law revealed manifestly non-random features. Second, the k-clique sub-graphs and overlaps show that the identified communities of the ncRNA molecules of H. sapiens can potentially be associated with certain functions. These findings highlight the complex modular structure of ncRNA interactions and its possible regulatory roles in the cell.

  6. Sharing code.

    PubMed

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing.

  7. The complete sequence of mitochondrial genome of Wuzhishan pig (Sus Scrofa).

    PubMed

    Chai, Yu-Lan; Xu, Dong; Ma, Hai-Ming

    2016-01-01

    In the present study, we sequenced the complete mitochondrial genome of Wuzhishan pig, which was 16,741 bp in size and had an A+T content of 60.46%. The genome consisted of a major non-coding control region (D-loop region) and 37 genes, including 2 ribosomal RNA (rRNA) genes, 13 protein-coding genes (PCGs), and 22 transfer RNA (tRNA) genes. The genes in the mitochondrial genome of Wuzhishan pig used three kinds of initiation codons (ATA, ATG, and GTG) and four kinds of termination codons (TAA, AGA, TAG, and an incomplete termination codon, T-). The complete mitochondrial genome sequence of Wuzhishan pig provides an important data set for further study of genetic mechanisms.
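
    The quantities reported in this abstract (A+T content, initiation codons, complete and incomplete termination codons) can be illustrated with a minimal sketch. The codon sets below are copied from the abstract; the function names and any sequences passed to them are hypothetical and are not taken from the Wuzhishan pig genome or from the authors' pipeline.

```python
# Codon sets as listed in the abstract; everything else is illustrative.
START_CODONS = {"ATA", "ATG", "GTG"}
STOP_CODONS = {"TAA", "AGA", "TAG"}


def at_content(seq: str) -> float:
    """Return the fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def has_known_start(cds: str) -> bool:
    """True if the coding sequence begins with one of the listed start codons."""
    return cds.upper()[:3] in START_CODONS


def classify_termination(cds: str) -> str:
    """Classify how a coding sequence ends.

    Returns the stop codon when the final full codon is one of the listed
    stops, 'T-' when a lone T follows the last full codon (the incomplete
    stop named in the abstract), and 'none' otherwise.
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        last = cds[-3:]
        return last if last in STOP_CODONS else "none"
    tail = cds[-remainder:]
    # Only the single-base 'T-' case is handled in this sketch.
    return "T-" if tail == "T" else "none"
```

    For example, a made-up sequence such as "ATGAAAT" would be reported as having a known start codon and an incomplete "T-" termination.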

  8. gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes

    PubMed Central

    Nakagawa, So; Takahashi, Mahoko Ueda

    2016-01-01

    In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp PMID:27242033

  9. gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes.

    PubMed

    Nakagawa, So; Takahashi, Mahoko Ueda

    2016-01-01

    In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp.

  10. The topology of integrable systems with incomplete fields

    SciTech Connect

    Aleshkin, K R

    2014-09-30

    Liouville's theorem holds for Hamiltonian systems with complete Hamiltonian fields which possess a complete involutive system of first integrals; such systems are called Liouville-integrable. In this paper integrable systems with incomplete Hamiltonian fields are investigated. It is shown that Liouville's theorem remains valid in the case of a single incomplete field, while if the number of incomplete fields is greater, a certain analogue of the theorem holds. An integrable system on the algebra sl(3) is taken as an example. Bibliography: 11 titles.

  11. Double Incomplete Internal Biliary Fistula: Coexisting Cholecystogastric and Cholecystoduodenal Fistula

    PubMed Central

    Beksac, Kemal; Erkan, Arman; Kaynaroglu, Volkan

    2016-01-01

    Internal biliary fistula is a rare complication of a common surgical disease, cholelithiasis. It is seen in 0.74% of all biliary tract surgeries and is thought to be a result of repeated inflammatory periods of the gallbladder. In this report we present a case of incomplete cholecystogastric and cholecystoduodenal fistulae in a single patient missed by ultrasonography and endoscopic retrograde cholangiopancreatography and diagnosed intraoperatively. In the literature there is only one report of an incomplete cholecystogastric fistula. To our knowledge this is the first case of double incomplete internal biliary fistulae. PMID:26904348

  12. Survival analysis with incomplete genetic data.

    PubMed

    Lin, D Y

    2014-01-01

    Genetic data are now collected frequently in clinical studies and epidemiological cohort studies. For a large study, it may be prohibitively expensive to genotype all study subjects, especially with the next-generation sequencing technology. Two-phase sampling, such as case-cohort and nested case-control sampling, is cost-effective in such settings but entails considerable analysis challenges, especially if efficient estimators are desired. Another type of missing data arises when the investigators are interested in the haplotypes or the genetic markers that are not on the genotyping platform used for the current study. Valid and efficient analysis of such missing data is also interesting and challenging. This article provides an overview of these issues and outlines some directions for future research.

  13. Report number codes

    SciTech Connect

    Nelson, R.N.

    1985-05-01

    This publication lists all report number codes processed by the Office of Scientific and Technical Information. The report codes are substantially based on the American National Standards Institute, Standard Technical Report Number (STRN)-Format and Creation Z39.23-1983. The Standard Technical Report Number (STRN) provides one of the primary methods of identifying a specific technical report. The STRN consists of two parts: The report code and the sequential number. The report code identifies the issuing organization, a specific program, or a type of document. The sequential number, which is assigned in sequence by each report issuing entity, is not included in this publication. Part I of this compilation is alphabetized by report codes followed by issuing installations. Part II lists the issuing organization followed by the assigned report code(s). In both Parts I and II, the names of issuing organizations appear for the most part in the form used at the time the reports were issued. However, for some of the more prolific installations which have had name changes, all entries have been merged under the current name.

  14. Mutational analysis of the connexin 36 gene (CX36) and exclusion of the coding sequence as a candidate region for catatonic schizophrenia in a large pedigree.

    PubMed

    Meyer, Jobst; Mai, Marion; Ortega, Gabriela; Mössner, Rainald; Lesch, Klaus-Peter

    2002-11-01

    The murine connexin 36 gene (Cx36) encodes a gap-junction channel protein which is preferentially expressed in brain and retina. The human orthologue CX36 is located on chromosome 15q14, a region recently shown to contain a susceptibility gene for hereditary catatonic schizophrenia. Therefore, CX36 was considered as a positional candidate for mutational analysis. Three polymorphic sites within CX36 were found by sequencing the two exons, the intron-exon boundaries and the putative promoter region of the gene derived from patients and control subjects. No variant exclusively cosegregates with the disease in a large pedigree that mainly supports the chromosome 15q14 locus, providing evidence that CX36 is not causative for the pathogenesis of catatonic schizophrenia in this family.

  15. The positive regulatory function of the 5'-proximal open reading frames in GCN4 mRNA can be mimicked by heterologous, short coding sequences.

    PubMed Central

    Williams, N P; Mueller, P P; Hinnebusch, A G

    1988-01-01

    Translational control of GCN4 expression in the yeast Saccharomyces cerevisiae is mediated by multiple AUG codons present in the leader of GCN4 mRNA, each of which initiates a short open reading frame of only two or three codons. Upstream AUG codons 3 and 4 are required to repress GCN4 expression in normal growth conditions; AUG codons 1 and 2 are needed to overcome this repression in amino acid starvation conditions. We show that the regulatory function of AUG codons 1 and 2 can be qualitatively mimicked by the AUG codons of two heterologous upstream open reading frames (URFs) containing the initiation regions of the yeast genes PGK and TRP1. These AUG codons inhibit GCN4 expression when present singly in the mRNA leader; however, they stimulate GCN4 expression in derepressing conditions when inserted upstream from AUG codons 3 and 4. This finding supports the idea that AUG codons 1 and 2 function in the control mechanism as translation initiation sites and further suggests that suppression of the inhibitory effects of AUG codons 3 and 4 is a general consequence of the translation of URF 1 and 2 sequences upstream. Several observations suggest that AUG codons 3 and 4 are efficient initiation sites; however, these sequences do not act as positive regulatory elements when placed upstream from URF 1. This result suggests that efficient translation is only one of the important properties of the 5' proximal URFs in GCN4 mRNA. We propose that a second property is the ability to permit reinitiation following termination of translation and that URF 1 is optimized for this regulatory function. PMID:3065626

  16. Efficient and accurate computation of the incomplete Airy functions

    NASA Technical Reports Server (NTRS)

    Constantinides, E. D.; Marhefka, R. J.

    1993-01-01

    The incomplete Airy integrals serve as canonical functions for the uniform ray optical solutions to several high-frequency scattering and diffraction problems that involve a class of integrals characterized by two stationary points that are arbitrarily close to one another or to an integration endpoint. Integrals with such analytical properties describe transition region phenomena associated with composite shadow boundaries. An efficient and accurate method for computing the incomplete Airy functions would make the solutions to such problems useful for engineering purposes. In this paper a convergent series solution for the incomplete Airy functions is derived. Asymptotic expansions involving several terms are also developed and serve as large argument approximations. The combination of the series solution with the asymptotic formulae provides for an efficient and accurate computation of the incomplete Airy functions. Validation of accuracy is accomplished using direct numerical integration data.

  17. The 5'-flanking region of the RP58 coding sequence shows prominent promoter activity in multipolar cells in the subventricular zone during corticogenesis.

    PubMed

    Ohtaka-Maruyama, C; Hirai, S; Miwa, A; Takahashi, A; Okado, H

    2012-01-10

    Pyramidal neurons of the neocortex are produced from progenitor cells located in the neocortical ventricular zone (VZ) and subventricular zone (SVZ) during embryogenesis. RP58 is a transcriptional repressor that is strongly expressed in the developing brain and plays an essential role in corticogenesis. The expression of RP58 is strictly regulated in a time-dependent and spatially restricted manner. It is maximally expressed in E15-16 embryonic cerebral cortex, localized specifically to the cortical plate and SVZ of the neocortex, hippocampus, and parts of amygdala during brain development, and found in glutamatergic but not GABAergic neurons. Identification of the promoter activity underlying specific expression patterns provides important clues to their mechanisms of action. Here, we show that the RP58 gene promoter is activated prominently in multipolar migrating cells, the first in vivo analysis of RP58 promoter activity in the brain. The 5.3 kb 5'-flanking genomic DNA of the RP58 coding region demonstrates promoter activity in neurons both in vitro and in vivo. This promoter is highly responsive to the transcription factor neurogenin2 (Ngn2), which is a direct upstream activator of RP58 expression. Using in utero electroporation, we demonstrate that RP58 gene promoter activity is first detected in a subpopulation of pin-like VZ cells, then prominently activated in migrating multipolar cells in the multipolar cell accumulation zone (MAZ) located just above the VZ. In dissociated primary cultured cortical neurons, RP58 promoter activity mimics in vivo expression patterns from a molecular standpoint that RP58 is expressed in a fraction of Sox2-positive progenitor cells, Ngn2-positive neuronal committed cells, and Tuj1-positive young neurons, but not in Dlx2-positive GABAergic neurons. 
    Finally, we show that Cre recombinase expression under the control of the RP58 gene promoter is a feasible tool for conditional gene switching in post-mitotic multipolar migrating

  18. A region of the polyoma virus genome between the replication origin and late protein coding sequences is required in cis for both early gene expression and viral DNA replication.

    PubMed Central

    Tyndall, C; La Mantia, G; Thacker, C M; Favaloro, J; Kamen, R

    1981-01-01

    Deletion mutants within the Py DNA region between the replication origin and the beginning of late protein coding sequences have been constructed and analysed for viability, early gene expression and viral DNA replication. Assay of replicative competence was facilitated by the use of Py transformed mouse cells (COP lines) which express functional large T-protein but contain no free viral DNA. Viable mutants defined three new nonessential regions of the genome. Certain deletions spanning the PvuII site at nt 5130 (67.4 mu) were unable to express early genes and had a cis-acting defect in DNA replication. Other mutants had intermediate phenotypes. Relevance of these results to eucaryotic "enhancer" elements is discussed. PMID:6275353

  19. Investigation of incomplete fusion dynamics at energy 4-8 MeV/nucleon

    NASA Astrophysics Data System (ADS)

    Kumar, Harish; Tali, Suhail A.; Ansari, M. Afzal; Singh, D.; Ali, Rahbar; Kumar, Kamal; Sathik, N. P. M.; Parashari, Siddharth; Ali, Asif; Dubey, R.; Bala, Indu; Kumar, Rakesh; Singh, R. P.; Muralithar, S.

    2017-04-01

    The recoil-catcher activation technique followed by offline γ-ray spectroscopy has been adopted for the excitation function measurement of residues populated in 12,13C-induced reactions with a 175Lu target at lower projectile energies ≈ 4-8 MeV/nucleon. The independent cross-sections for some of the populated residues have been estimated by subtracting the contributions of higher-charge precursor isobars from the measured cumulative cross-sections. The measured excitation functions are compared with theoretical predictions based on the statistical model code PACE-4. This comparison reveals that the complete fusion process alone contributes to the formation of xn-pxn channels, and that an enhancement in the measured cross-sections of α-emitting channels over the theoretical predictions may be attributed to the incomplete fusion process. The incomplete fusion probability is found to be higher for 12C than for the one-neutron-richer projectile 13C throughout the incident energy region. The present findings for the 12,13C + 175Lu systems have been compared with information extracted from previously studied systems, and projectile structure is found to strongly affect the incomplete fusion dynamics in terms of the projectile α-Q-value along with the projectile-target mass-asymmetry. Moreover, it may be pointed out that Morgenstern's mass-asymmetry systematic is probably projectile-structure dependent. A substantial contribution to incomplete fusion coming from collision trajectories with ℓ ≤ ℓcrit is also observed, contrary to the SUMRULE model assumptions.

  20. A class of constacyclic BCH codes and new quantum codes

    NASA Astrophysics Data System (ADS)

    Liu, Yang; Li, Ruihu; Lv, Liangdong; Ma, Yuena

    2017-03-01

    Constacyclic BCH codes have been widely studied in the literature and have been used to construct quantum codes in recent years. However, for the class of quantum codes of length n=q^{2m}+1 over F_{q^2} with q an odd prime power, only codes of distance δ ≤ 2q^2 have been obtained in the literature. In this paper, by a detailed analysis of the properties of q^2-ary cyclotomic cosets, the maximum designed distance δ_{max} of a class of Hermitian dual-containing constacyclic BCH codes of length n=q^{2m}+1 is determined; this class of constacyclic codes has some characteristics analogous to those of primitive BCH codes over F_{q^2}. We then obtain a sequence of dual-containing constacyclic codes of designed distances 2q^2 < δ ≤ δ_{max}. Consequently, new quantum codes with distance d > 2q^2 can be constructed from these dual-containing codes via the Hermitian construction. These newly obtained quantum codes have better code rates than those constructed from primitive BCH codes.
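
    The analysis in this abstract rests on q^2-ary cyclotomic cosets. As a rough illustration of that basic object only, the sketch below partitions the residues modulo n = q^{2m}+1 into cosets under multiplication by q^2. The function names and the choice q = 3, m = 1 are assumptions made for illustration. The paper's constacyclic construction would presumably use a modulus that also involves the order of the constant, which is not modelled here.

```python
from math import gcd


def cyclotomic_coset(x: int, base: int, modulus: int) -> list[int]:
    """Return the sorted coset {x * base**j mod modulus : j >= 0}."""
    if gcd(base, modulus) != 1:
        raise ValueError("base and modulus must be coprime")
    coset = []
    y = x % modulus
    # Multiplication by a unit permutes the residues, so the orbit
    # eventually returns to its starting element.
    while y not in coset:
        coset.append(y)
        y = (y * base) % modulus
    return sorted(coset)


def all_cosets(base: int, modulus: int) -> list[list[int]]:
    """Partition the residues 0..modulus-1 into cyclotomic cosets."""
    seen: set[int] = set()
    cosets = []
    for x in range(modulus):
        if x not in seen:
            coset = cyclotomic_coset(x, base, modulus)
            seen.update(coset)
            cosets.append(coset)
    return cosets


q, m = 3, 1            # illustrative values: q an odd prime power
n = q ** (2 * m) + 1   # length n = q^{2m} + 1, here 10
print(all_cosets(q * q, n))
# prints [[0], [1, 9], [2, 8], [3, 7], [4, 6], [5]]
```

    Because q^{2m} is congruent to -1 modulo n, each coset in this small case pairs a residue with its negative.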

  1. Colour cyclic code for Brillouin distributed sensors

    NASA Astrophysics Data System (ADS)

    Le Floch, Sébastien; Sauser, Florian; Llera, Miguel; Rochat, Etienne

    2015-09-01

    For the first time, a colour cyclic coding (CCC) is theoretically and experimentally demonstrated for Brillouin optical time-domain analysis (BOTDA) distributed sensors. Compared to traditional intensity-modulated cyclic codes, the code presents an additional gain of √2 while keeping the same number of sequences as for a colour coding. A comparison with a standard BOTDA sensor is realized and validates the theoretical coding gain.

  2. Impacts of Model Building Energy Codes

    SciTech Connect

    Athalye, Rahul A.; Sivaraman, Deepak; Elliott, Douglas B.; Liu, Bing; Bartlett, Rosemarie

    2016-10-31

    The U.S. Department of Energy (DOE) Building Energy Codes Program (BECP) periodically evaluates national and state-level impacts associated with energy codes in residential and commercial buildings. Pacific Northwest National Laboratory (PNNL), funded by DOE, conducted an assessment of the prospective impacts of national model building energy codes from 2010 through 2040. A previous PNNL study evaluated the impact of the Building Energy Codes Program; this study looked more broadly at overall code impacts. This report describes the methodology used for the assessment and presents the impacts in terms of energy savings, consumer cost savings, and reduced CO2 emissions at the state level and at aggregated levels. This analysis does not represent all potential savings from energy codes in the U.S. because it excludes several states which have codes which are fundamentally different from the national model energy codes or which do not have state-wide codes. Energy codes follow a three-phase cycle that starts with the development of a new model code, proceeds with the adoption of the new code by states and local jurisdictions, and finishes when buildings comply with the code. The development of new model code editions creates the potential for increased energy savings. After a new model code is adopted, potential savings are realized in the field when new buildings (or additions and alterations) are constructed to comply with the new code. Delayed adoption of a model code and incomplete compliance with the code’s requirements erode potential savings. The contributions of all three phases are crucial to the overall impact of codes, and are considered in this assessment.

  3. The internal gene duplication and interrupted coding sequences in the MmpL genes of Mycobacterium tuberculosis: Towards understanding the multidrug transport in an evolutionary perspective.

    PubMed

    Sandhu, Padmani; Akhter, Yusuf

    2015-05-01

    Multidrug resistance has emerged as a major problem in the treatment of many infectious diseases. Tuberculosis (TB), caused by Mycobacterium tuberculosis, is one such disease. Short-term chemotherapy is available to treat the infection, but the main hurdle is the development of resistance to antibiotics. This resistance is primarily due to the impermeable, mycolic acid-rich cell wall of the bacteria and to other factors such as efflux of antibiotics from the bacterial cell. Preliminary reports indicate that the MmpL (Mycobacterial Membrane Protein Large) proteins of mycobacteria are involved in lipid transport and antibiotic efflux. We present here a comprehensive comparative sequence and structural analysis, which revealed topological signatures shared by the MmpL proteins and RND (Resistance Nodulation Division) multidrug efflux transporters. This provides evidence in support of the notion that they belong to the extended RND permease superfamily. In silico modelled tertiary structures are homologous to an integral membrane component present in all of the RND efflux pumps. We document internal gene duplication and gene splitting events that occurred in the MmpL genes, which further elucidate the molecular functions of these putative transporters from an evolutionary perspective.

  4. Tandem repeat sequence variation as causative cis-eQTLs for protein-coding gene expression variation: the case of CSTB.

    PubMed

    Borel, Christelle; Migliavacca, Eugenia; Letourneau, Audrey; Gagnebin, Maryline; Béna, Frédérique; Sailani, M Reza; Dermitzakis, Emmanouil T; Sharp, Andrew J; Antonarakis, Stylianos E

    2012-08-01

    Association studies have revealed expression quantitative trait loci (eQTLs) for a large number of genes. However, the causative variants that regulate gene expression levels are generally unknown. We hypothesized that copy-number variation of sequence repeats contribute to the expression variation of some genes. Our laboratory has previously identified that the rare expansion of a repeat c.-174CGGGGCGGGGCG in the promoter region of the CSTB gene causes a silencing of the gene, resulting in progressive myoclonus epilepsy. Here, we genotyped the repeat length and quantified CSTB expression by quantitative real-time polymerase chain reaction in 173 lymphoblastoid cell lines (LCLs) and fibroblast samples from the GenCord collection. The majority of alleles contain either two or three copies of this repeat. Independent analysis revealed that the c.-174CGGGGCGGGGCG repeat length is strongly associated with CSTB expression (P = 3.14 × 10(-11)) in LCLs only. Examination of both genotyped and imputed single-nucleotide polymorphisms (SNPs) within 2 Mb of CSTB revealed that the dodecamer repeat represents the strongest cis-eQTL for CSTB in LCLs. We conclude that the common two or three copy variation is likely the causative cis-eQTL for CSTB expression variation. More broadly, we propose that polymorphic tandem repeats may represent the causative variation of a fraction of cis-eQTLs in the genome.

  5. Sharing code

    PubMed Central

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing. PMID:25165519

  6. Incomplete Relaxation between Beats after Myocardial Hypoxia and Ischemia

    PubMed Central

    Weisfeldt, Myron L.; Armstrong, Paul; Scully, Hugh E.; Sanders, Charles A.; Daggett, Willard M.

    1974-01-01

    Recovery from hypoxia has been shown to prolong cardiac muscle contraction, particularly the relaxation phase. The present studies were designed to examine whether incomplete relaxation between beats can result from this prolongation of contraction and relaxation in isolated muscle after hypoxia and in the canine heart after both hypoxia and acute ischemia. The relationship between heart rate and the extent of incomplete relaxation is emphasized in view of the known enhancement of the velocity of contraction caused by increasing heart rate. The extent of incomplete relaxation during 10-s periods of pacing at increasing rates was examined before and after hypoxia in isometric cat right ventricular papillary muscle (12-120 beats/min) and in the canine isovolumic left ventricle (120-180 beats/min). Incomplete relaxation was quantified by measuring the difference between the lowest diastolic tension or pressure during pacing and the true resting tension or pressure determined by interruption of pacing at each rate. In eight cat papillary muscles (29°C), there was significantly greater incomplete relaxation 5 min after hypoxia at rates of 96 and 120 beats/min (P < 0.02 vs. before hypoxia). In seven canine isovolumic left ventricles, recovery from hypoxia and higher heart rates also resulted in incomplete relaxation. Incomplete relaxation before hypoxia at a rate of 180 beats/min was 0.8±0.5 cm H2O and at 5 min of recovery from hypoxia was 12.6±3.5 cm H2O (P < 0.01). 12 hearts were subjected to a 1.5-3-min period of acute ischemia and fibrillation. There was significant incomplete relaxation at a rate of 140 beats/min for 5 min after defibrillation and reperfusion. These data indicate that incomplete relaxation is an important determinant of diastolic hemodynamics during recovery from ischemia or hypoxia. The extent of incomplete relaxation appears to be a function of the rate of normalization of the velocity of relaxation and tension development after ischemia or

  7. Handling incomplete correlated continuous and binary outcomes in meta-analysis of individual participant data.

    PubMed

    Gomes, Manuel; Hatfield, Laura; Normand, Sharon-Lise

    2016-09-20

    Meta-analysis of individual participant data (IPD) is increasingly utilised to improve the estimation of treatment effects, particularly among different participant subgroups. An important concern in IPD meta-analysis relates to partially or completely missing outcomes for some studies, a problem exacerbated when interest is on multiple discrete and continuous outcomes. When leveraging information from incomplete correlated outcomes across studies, the fully observed outcomes may provide important information about the incompleteness of the other outcomes. In this paper, we compare two models for handling incomplete continuous and binary outcomes in IPD meta-analysis: a joint hierarchical model and a sequence of full conditional mixed models. We illustrate how these approaches incorporate the correlation across the multiple outcomes and the between-study heterogeneity when addressing the missing data. Simulations characterise the performance of the methods across a range of scenarios which differ according to the proportion and type of missingness, strength of correlation between outcomes and the number of studies. The joint model provided confidence interval coverage consistently closer to nominal levels and lower mean squared error compared with the fully conditional approach across the scenarios considered. Methods are illustrated in a meta-analysis of randomised controlled trials comparing the effectiveness of implantable cardioverter-defibrillator devices alone to implantable cardioverter-defibrillator combined with cardiac resynchronisation therapy for treating patients with chronic heart failure. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  8. DNA sequences encoding osteoinductive products

    SciTech Connect

    Wang, E.A.; Wozney, J.M.; Rosen, V.

    1991-05-07

    This patent describes an isolated DNA sequence encoding an osteoinductive protein, the DNA sequence comprising a coding sequence. It comprises: nucleotide No.1 through nucleotide No.387, nucleotide No.356 through nucleotide No.1543, nucleotide No.402 through nucleotide No.1626, naturally occurring allelic sequences and equivalent degenerative codon sequences and sequences which hybridize to any of the sequences under stringent hybridization conditions; and encode a protein characterized by the ability to induce the formation of bone and/or cartilage.

  9. Incomplete caries removal: a systematic review and meta-analysis.

    PubMed

    Schwendicke, F; Dörfer, C E; Paris, S

    2013-04-01

    Increasing numbers of clinical trials have demonstrated the benefits of incomplete caries removal, in particular in the treatment of deep caries. This study systematically reviewed randomized controlled trials investigating one- or two-step incomplete compared with complete caries removal. Studies treating primary and permanent teeth with primary caries lesions requiring a restoration were analyzed. The following primary and secondary outcomes were investigated: risk of pulpal exposure, post-operative pulpal symptoms, overall failure, and caries progression. Electronic databases were screened for studies from 1967 to 2012. Cross-referencing was used to identify further articles. Odds ratios (OR) as effect estimates were calculated in a random-effects model. From 364 screened articles, 10 studies representing 1,257 patients were included. Meta-analysis showed risk reduction for both pulpal exposure (OR [95% CI] 0.31 [0.19-0.49]) and pulpal symptoms (OR 0.58 [0.31-1.10]) for teeth treated with one- or two-step incomplete excavation. Risk of failure seemed to be similar for both complete and incomplete excavation, but data for this outcome were of limited quality and inconclusive (OR 0.97 [0.64-1.46]). Based on reviewed studies, incomplete caries removal seems advantageous compared with complete excavation, especially in proximity to the pulp. However, evidence levels are currently insufficient for definitive conclusions because of high risk of bias within studies.
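
    The effect estimates in this abstract are odds ratios with 95% confidence intervals. The sketch below shows how such an interval can be computed for a single 2x2 table using the standard log-odds (Wald) approximation. It does not reproduce the random-effects pooling used in the review, and the function name and argument layout are assumptions for illustration.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval for one 2x2 table.

    a = events in the treatment group, b = non-events in the treatment group,
    c = events in the control group,   d = non-events in the control group.
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        # No continuity correction is applied in this sketch.
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR).
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

    An interval that includes 1, such as the one the abstract reports for pulpal symptoms (0.31-1.10), indicates that the data are compatible with no difference in odds between the groups.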

  10. Handling incomplete smoking history data in survival analysis.

    PubMed

    Furukawa, Kyoji; Preston, Dale L; Misumi, Munechika; Cullings, Harry M

    2014-10-26

    While data are unavoidably missing or incomplete in most observational studies, consequences of mishandling such incompleteness in analysis are often overlooked. When time-varying information is collected irregularly and infrequently over a long period, even precisely obtained data may implicitly involve substantial incompleteness. Motivated by an analysis to quantitatively evaluate the effects of smoking and radiation on lung cancer risks among Japanese atomic-bomb survivors, we provide a unique application of multiple imputation to incompletely observed smoking histories under the assumption of missing at random. Predicting missing values for the age of smoking initiation and, given initiation, smoking intensity and cessation age, analyses can be based on complete, though partially imputed, smoking histories. A simulation study shows that multiple imputation appropriately conditioned on the outcome and other relevant variables can produce consistent estimates when data are missing at random. Our approach is particularly appealing in large cohort studies where a considerable amount of time-varying information is incomplete under a mechanism depending in a complex manner on other variables. In application to the motivating example, this approach is expected to reduce estimation bias that might be unavoidable in naive analyses, while keeping efficiency by retaining known information.

  11. Evaluating summary statistics used to test for incomplete lineage sorting: mito-nuclear discordance in the reef sponge Callyspongia vaginalis.

    PubMed

    Debiasse, Melissa B; Nelson, Bradley J; Hellberg, Michael E

    2014-01-01

    Conflicting patterns of population differentiation between the mitochondrial and nuclear genomes (mito-nuclear discordance) have become increasingly evident as multilocus data sets have become easier to generate. Incomplete lineage sorting (ILS) of nucDNA is often implicated as the cause of such discordance, stemming from the large effective population size of nucDNA relative to mtDNA. However, selection, sex-biased dispersal and historical demography can also lead to mito-nuclear discordance. Here, we compare patterns of genetic diversity and subdivision for six nuclear protein-coding gene regions to those for mtDNA in a common Caribbean coral reef sponge, Callyspongia vaginalis, along the Florida reef tract. We also evaluated a suite of summary statistics to determine which are effective metrics for comparing empirical and simulated data when testing drivers of mito-nuclear discordance in a statistical framework. While earlier work revealed three divergent and geographically subdivided mtDNA COI haplotypes separated by 2.4% sequence divergence, nuclear alleles were admixed with respect to mitochondrial clade and geography. Bayesian analysis showed that substitution rates for the nuclear loci were up to 7 times faster than for mitochondrial COI. Coalescent simulations and neutrality tests suggested that mito-nuclear discordance in C. vaginalis is not the result of ILS in the nucDNA or selection on the mtDNA but is more likely caused by changes in population size. Sperm-mediated gene flow may also influence patterns of population subdivision in the nucDNA.

  12. Circular codes, symmetries and transformations.

    PubMed

    Fimmel, Elena; Giannerini, Simone; Gonzalez, Diego Luis; Strüngmann, Lutz

    2015-06-01

    Circular codes, putative remnants of primeval comma-free codes, have gained considerable attention in recent years. In fact, they represent a second kind of genetic code potentially involved in detecting and maintaining the normal reading frame in protein coding sequences. The discovery of a universal code across species raised many theoretical and experimental questions. However, a key aspect that relates circular codes to symmetries and transformations remains largely unexplored. In this article we address the issue by studying the symmetries and transformations that connect different circular codes. The main result is that the class of 216 C3 maximal self-complementary codes can be partitioned into 27 equivalence classes defined by a particular set of transformations. We show that such transformations can be put in a group theoretic framework with an intuitive geometric interpretation. More general mathematical results about symmetry transformations which are valid for any kind of circular code are also presented. Our results pave the way to the study of the biological consequences of the mathematical structure behind circular codes and help shed light on the evolutionary steps that led to the observed symmetries of present codes.
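The self-complementarity property mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual definition (a code is self-complementary when it contains the reverse complement of each of its words); the function names and the toy two-word codes are hypothetical and are not taken from the article, and the sketch does not test circularity or maximality.

```python
from typing import Iterable, Set

# Watson-Crick pairing over the DNA alphabet.
_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a nucleotide word."""
    return "".join(_COMPLEMENT[base] for base in reversed(word.upper()))


def is_self_complementary(code: Iterable[str]) -> bool:
    """True if the code contains the reverse complement of every word."""
    words: Set[str] = {w.upper() for w in code}
    return all(reverse_complement(w) in words for w in words)


if __name__ == "__main__":
    # Toy codes chosen only for illustration.
    print(is_self_complementary({"AAC", "GTT"}))
    print(is_self_complementary({"AAC"}))
```

The check is purely set-based, so it applies unchanged to dinucleotide or longer words.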

  13. Role of tunnelling in complete and incomplete fusion induced by 9Be on 169Tm and 187Re targets at around barrier energies

    NASA Astrophysics Data System (ADS)

    Kharab, Rajesh; Chahal, Rajiv; Kumar, Rajiv

    2017-04-01

    We have analyzed the complete and incomplete fusion excitation functions for 9Be + 169Tm, 187Re reactions at around barrier energies using the code PLATYPUS, which is based on a classical dynamical model. A quantum mechanical tunnelling correction is incorporated at near and sub barrier energies, which significantly improves the agreement between the data and the predictions.

  14. Marks of Change in Sequences

    NASA Astrophysics Data System (ADS)

    Jürgensen, H.

    2011-12-01

    Given a sequence of events, how does one recognize that a change has occurred? We explore potential definitions of the concept of change in a sequence and propose that words in relativized solid codes might serve as indicators of change.

  15. Speech coding

    SciTech Connect

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long haul transmissions which use repeaters to compensate for the loss in signal strength on transmission links also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence the end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link. Hence, from a transmission point of view, digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques and is often used interchangeably with speech coding is the term voice coding. This term is more generic in the sense that the

  16. Incomplete fuzzy data processing systems using artificial neural network

    NASA Technical Reports Server (NTRS)

    Patyra, Marek J.

    1992-01-01

    In this paper, the implementation of a fuzzy data processing system using an artificial neural network (ANN) is discussed. The binary representation of fuzzy data is assumed, where the universe of discourse is discretized into n equal intervals. The value of a membership function is represented by a binary number. It is proposed that incomplete fuzzy data processing be performed in two stages. The first stage performs the 'retrieval' of incomplete fuzzy data, and the second stage performs the desired operation on the retrieved data. The method of incomplete fuzzy data retrieval is proposed based on the linear approximation of missing values of the membership function. The ANN implementation of the proposed system is presented. The system was computationally verified and showed a relatively small total error.
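The 'retrieval' stage described in this abstract, linear approximation of missing membership values, might look roughly like the sketch below. This is an assumption-laden illustration and not the paper's ANN implementation: the function name, the use of `None` to mark missing samples, and the choice to copy the nearest known value at the edges are all hypothetical.

```python
from typing import List, Optional


def fill_missing_membership(values: List[Optional[float]]) -> List[float]:
    """Fill None entries by linear interpolation between known neighbours.

    `values` holds membership grades sampled on n equal intervals of the
    universe of discourse; None marks a missing sample. Gaps at either end
    are filled with the nearest known value (an assumption of this sketch).
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("at least one membership value must be known")

    filled: List[float] = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(v)
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(values[right])      # leading gap
        elif right is None:
            filled.append(values[left])       # trailing gap
        else:
            t = (i - left) / (right - left)   # position inside the gap
            filled.append(values[left] + t * (values[right] - values[left]))
    return filled


if __name__ == "__main__":
    print(fill_missing_membership([0.0, None, 1.0, None, None, 0.25]))
```

The second stage (the desired fuzzy operation) would then run on the completed list.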

  17. A comparison of incomplete-data methods for categorical data.

    PubMed

    van der Palm, Daniël W; van der Ark, L Andries; Vermunt, Jeroen K

    2016-04-01

    We studied four methods for handling incomplete categorical data in statistical modeling: (1) maximum likelihood estimation of the statistical model with incomplete data, (2) multiple imputation using a loglinear model, (3) multiple imputation using a latent class model, and (4) multivariate imputation by chained equations. Each method has advantages and disadvantages, and it is unknown which method should be recommended to practitioners. We reviewed the merits of each method and investigated their effect on the bias and stability of parameter estimates and bias of the standard errors. We found that multiple imputation using a latent class model with many latent classes was the most promising method for handling incomplete categorical data, especially when the number of variables used in the imputation model is large.

  18. Algodystrophy: complex regional pain syndrome and incomplete forms

    PubMed Central

    Giannotti, Stefano; Bottai, Vanna; Dell’Osso, Giacomo; Bugelli, Giulia; Celli, Fabio; Cazzella, Niki; Guido, Giulio

    2016-01-01

    Summary: Algodystrophy, also known as complex regional pain syndrome (CRPS), is a painful disease characterized by erythema, edema, functional impairment, and sensory and vasomotor disturbance. The diagnosis of CRPS is based solely on clinical signs and symptoms and on the exclusion of other forms of chronic pain. There is no specific diagnostic procedure; careful clinical evaluation and additional tests should lead to an accurate diagnosis. There are similar forms of chronic pain, known as bone marrow edema syndrome, in which the history of trauma or triggering events, the dystrophic skin changes, and the vasomotor alterations are absent. These incomplete forms are self-limited, and surgical treatment is generally not needed. It is still controversial whether these forms represent a distinct self-limiting entity or an incomplete variant of CRPS. In painful unexplained conditions such as frozen shoulder, post-operative stiff shoulder or painful knee prosthesis, algodystrophy, especially in its incomplete forms, could be the cause. PMID:27252736

  19. A Novel Method to Assess Incompleteness of Mammography Reports

    PubMed Central

    Gimenez, Francisco J.; Wu, Yirong; Burnside, Elizabeth S.; Rubin, Daniel L.

    2014-01-01

    Mammography has been shown to improve outcomes of women with breast cancer, but it is subject to inter-reader variability. One well-documented source of such variability is in the content of mammography reports. The mammography report is of crucial importance, since it documents the radiologist’s imaging observations, interpretation of those observations in terms of likelihood of malignancy, and suggested patient management. In this paper, we define an incompleteness score to measure how incomplete the information content is in the mammography report and provide an algorithm to calculate this metric. We then show that the incompleteness score can be used to predict errors in interpretation. This method has 82.6% accuracy at predicting errors in interpretation and can possibly reduce total diagnostic errors by up to 21.7%. Such a method can easily be modified to suit other domains that depend on quality reporting. PMID:25954448

  20. A novel method to assess incompleteness of mammography reports.

    PubMed

    Gimenez, Francisco J; Wu, Yirong; Burnside, Elizabeth S; Rubin, Daniel L

    2014-01-01

    Mammography has been shown to improve outcomes of women with breast cancer, but it is subject to inter-reader variability. One well-documented source of such variability is in the content of mammography reports. The mammography report is of crucial importance, since it documents the radiologist's imaging observations, interpretation of those observations in terms of likelihood of malignancy, and suggested patient management. In this paper, we define an incompleteness score to measure how incomplete the information content is in the mammography report and provide an algorithm to calculate this metric. We then show that the incompleteness score can be used to predict errors in interpretation. This method has 82.6% accuracy at predicting errors in interpretation and can possibly reduce total diagnostic errors by up to 21.7%. Such a method can easily be modified to suit other domains that depend on quality reporting.

  1. Group-specific amplification of cDNA from DRB1 genes. Complete coding sequences of partially defined alleles and identification of the new alleles DRB1*040602, DRB1*111102, DRB1*080103, and DRB1*0113.

    PubMed

    Balas, Antonio; Vilches, Carlos; Rodríguez, Miguel A; Fernández, Begoña; Martinez, Maria Paz; de Pablo, Rosario; García-Sánchez, Félix; Vicario, Jose L

    2006-12-01

    We present here the complete coding sequences, previously unavailable, of the DRB1 alleles DRB1*030102, *0306, *040701, *0408, *1327, *1356, *1411, *1446, *1503, *1504, *0806, *0813, and *0818. For cDNA isolation, new group-specific primers located at the 5'UT and 3'UT regions were used to carry out allele-specific amplification and a convenient method for determining full-length sequences for DRB1 alleles. Complete coding sequencing of samples previously typed as DRB1*0406, DRB1*080101, and DRB1*1111 revealed new alleles with noncoding nucleotide changes at exons 1 and 3. In addition, we found a novel allele, DRB1*0113, whose second exon carries a sequence motif characteristic of DRB1*07 alleles. The predicted class II haplotypic associations of all alleles are reported and discussed.

  2. Quantum Stackelberg duopoly with incomplete information [rapid communication]

    NASA Astrophysics Data System (ADS)

    Lo, C.-F.; Kiang, D.

    2005-10-01

    We investigate the quantum version of the Stackelberg duopoly with incomplete information, especially how quantum entanglement affects the first-mover advantage in the classical form. It is found that while positive entanglement enhances the first-mover advantage beyond the classical limit, the advantage is dramatically suppressed by negative entanglement. Moreover, although positive quantum entanglement improves the first-mover's tolerance for the informational incompleteness, the quantum effect does not change the basic fact that Firm A's lack of complete information about Firm B's unit cost eradicates the first-mover advantage.

  3. Incomplete use of condoms: the importance of sexual arousal.

    PubMed

    Graham, Cynthia A; Crosby, Richard A; Milhausen, Robin R; Sanders, Stephanie A; Yarber, William L

    2011-10-01

    The purpose of this study was to identify associations between incomplete condom use (not using condoms from start to finish of sex) and sexual arousal variables. A convenience sample of heterosexual men (n = 761) completed a web-based questionnaire. Men who scored higher on sexual arousability were more likely to put a condom on after sex had begun (AOR = 1.58). Men who reported difficulty reaching orgasm were more likely to report removing condoms before sex was over (AOR = 2.08). These findings suggest that sexual arousal may be an important, and under-studied, factor associated with incomplete use of condoms.

  4. Incomplete and False Identification Distributions: Group Screening Models.

    DTIC Science & Technology

    1981-05-01

    AD-A099 310 MARYLAND UNIV COLLEGE PARK F/G 12/1 INCOMPLETE AND FALSE IDENTIFICATION DISTRIBUTIONS: GROUP SCREEN--ETC(U) MAY 81 S KOTZ, N L JOHNSON...present paper, we extend some of these results to the case of screening sampling schemes. Key Words and Phrases: group screening; binomial distribution...Johnson and Kotz (1981a). (1a) Incomplete identification. Consider a sample of size n without replacement from a lot of size N containing X defective (or

  5. SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments

    PubMed Central

    Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic

    2001-01-01

    Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202

  6. QR Codes

    ERIC Educational Resources Information Center

    Lai, Hsin-Chih; Chang, Chun-Yen; Li, Wen-Shiane; Fan, Yu-Lin; Wu, Ying-Tien

    2013-01-01

    This study presents an m-learning method that incorporates Integrated Quick Response (QR) codes. This learning method not only achieves the objectives of outdoor education, but it also increases applications of Cognitive Theory of Multimedia Learning (CTML) (Mayer, 2001) in m-learning for practical use in a diverse range of outdoor locations. When…

  7. Dinucleotide circular codes and bijective transformations.

    PubMed

    Fimmel, Elena; Giannerini, Simone; Gonzalez, Diego Luis; Strüngmann, Lutz

    2015-12-07

    The presence of circular codes in mRNA coding sequences is postulated to be involved in informational mechanisms aimed at detecting and maintaining the normal reading frame during protein synthesis. Most of the recent research is focused on trinucleotide circular codes. However, dinucleotide circular codes are also important, since dinucleotides are ubiquitous in genomes and associated with important biological functions. In this work we adopt the group theoretic approach used for trinucleotide codes in Fimmel et al. (2015) to study dinucleotide circular codes and highlight their symmetry properties. Moreover, we characterize such codes in terms of n-circularity and provide a graph representation that allows them to be visualized geometrically. The results establish a theoretical framework for the study of the biological implications of dinucleotide circular codes in genomic sequences.

  8. Complete mitochondrial genome of Bactrocera arecae (Insecta: Tephritidae) by next-generation sequencing and molecular phylogeny of Dacini tribe

    PubMed Central

    Yong, Hoi-Sen; Song, Sze-Looi; Lim, Phaik-Eem; Chan, Kok-Gan; Chow, Wan-Loo; Eamsobhana, Praphathip

    2015-01-01

    The whole mitochondrial genome of the pest fruit fly Bactrocera arecae was obtained from next-generation sequencing of genomic DNA. It had a total length of 15,900 bp, consisting of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a non-coding region (A + T-rich control region). The control region (952 bp) was flanked by rrnS and trnI genes. The start codons included 6 ATG, 3 ATT and 1 each of ATA, ATC, GTG and TCG. Eight TAA, two TAG, one incomplete TA and two incomplete T stop codons were represented in the protein-coding genes. The cloverleaf structure for trnS1 lacked the D-loop, and that of trnN and trnF lacked the TΨC-loop. Molecular phylogeny based on 13 protein-coding genes was concordant with 37 mitochondrial genes, with B. arecae having closest genetic affinity to B. tryoni. The subgenus Bactrocera of Dacini tribe and the Dacinae subfamily (Dacini and Ceratitidini tribes) were monophyletic. The whole mitogenome of B. arecae will serve as a useful dataset for studying the genetics, systematics and phylogenetic relationships of the many species of Bactrocera genus in particular, and tephritid fruit flies in general. PMID:26472633
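The distinction this abstract draws between complete (TAA, TAG) and incomplete (TA, T) stop codons can be sketched as a simple length-and-suffix check. This is only an illustration: the function name and the toy sequences are hypothetical, the stop set {TAA, TAG} is assumed from the abstract, and real annotation would also need the gene boundaries and flanking tRNA positions.

```python
# Complete stop codons reported in the abstract for the protein-coding genes.
COMPLETE_STOPS = {"TAA", "TAG"}


def classify_stop(cds: str) -> str:
    """Classify the 3' end of a coding sequence read in frame from its start.

    A length divisible by 3 that ends in TAA/TAG is a complete stop codon.
    A leftover of two bases 'TA' or one base 'T' is treated as an incomplete
    stop codon (in mitochondria these are generally thought to be completed
    to TAA by polyadenylation of the transcript).
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0 and cds[-3:] in COMPLETE_STOPS:
        return "complete " + cds[-3:]
    if remainder == 2 and cds.endswith("TA"):
        return "incomplete TA"
    if remainder == 1 and cds.endswith("T"):
        return "incomplete T"
    return "no recognizable stop"


if __name__ == "__main__":
    # Toy sequences, not taken from the B. arecae mitogenome.
    for seq in ("ATGAAATAA", "ATGAAATAG", "ATGAAATA", "ATGAAAT", "ATGAAACCC"):
        print(seq, "->", classify_stop(seq))
```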

  9. Program Synthesizes UML Sequence Diagrams

    NASA Technical Reports Server (NTRS)

    Barry, Matthew R.; Osborne, Richard N.

    2006-01-01

    A computer program called "Rational Sequence" generates Unified Modeling Language (UML) sequence diagrams of a target Java program running on a Java virtual machine (JVM). Rational Sequence thereby performs a reverse engineering function that aids in the design documentation of the target Java program. Whereas the construction of sequence diagrams was previously a tedious manual process, Rational Sequence generates UML sequence diagrams automatically from the running Java code.

  10. Endodontic and restorative management of incompletely fractured molar teeth.

    PubMed

    Gutmann, J L; Rakusin, H

    1994-11-01

    The treatment of fractured teeth poses significant problems for the practitioner. However, once the treatment planning decision has been made to attempt to retain the tooth, various practical regimens are available to effect this goal. This paper addresses the specific use of glass ionomer in the restorative management of incompletely, vertically fractured molar teeth integrated with specific root canal treatment techniques.

  11. Robustness of shape descriptors to incomplete contour representations.

    PubMed

    Ghosh, Anarta; Petkov, Nicolai

    2005-11-01

    With inspiration from psychophysical research on the human visual system, we propose a novel aspect and a method for performance evaluation of contour-based shape recognition algorithms regarding their robustness to incompleteness of contours. We use complete contour representations of objects as a reference (training) set. Incomplete contour representations of the same objects are used as a test set. The performance of an algorithm is reported using the recognition rate as a function of the percentage of contour retained. We call this evaluation procedure the ICR test. We consider three types of contour incompleteness, viz. segment-wise contour deletion, occlusion, and random pixel depletion. As an illustration, the robustness of two shape recognition algorithms to contour incompleteness is evaluated. These algorithms use a shape context and a distance multiset as local shape descriptors. Qualitatively, both algorithms mimic human visual perception in the sense that recognition performance monotonically increases with the degree of completeness and that they perform best in the case of random depletion and worst in the case of occluded contours. The distance multiset method performs better than the shape context method in this test framework.
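One way to picture the ICR test for the random pixel depletion case is the sketch below: contours are thinned to a given retained fraction and the recognition rate is recorded at that fraction. Everything here is a hypothetical stand-in (the function names, the `(label, contour)` test-set layout, and the `recognize` callable); the shape context and distance multiset descriptors from the abstract are not implemented.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, int]


def random_depletion(contour: Sequence[Point], retained: float,
                     rng: random.Random) -> List[Point]:
    """Keep a random subset of contour points, preserving their order."""
    if not 0.0 <= retained <= 1.0:
        raise ValueError("retained must be between 0 and 1")
    k = round(retained * len(contour))
    keep = sorted(rng.sample(range(len(contour)), k))
    return [contour[i] for i in keep]


def recognition_rate(test_set: Sequence[Tuple[str, Sequence[Point]]],
                     recognize: Callable[[List[Point]], str],
                     retained: float, rng: random.Random) -> float:
    """Fraction of depleted test contours that the recognizer labels correctly."""
    hits = sum(
        recognize(random_depletion(contour, retained, rng)) == label
        for label, contour in test_set
    )
    return hits / len(test_set)


if __name__ == "__main__":
    rng = random.Random(0)
    line = [(x, 0) for x in range(10)]
    # Trivial stand-in recognizer used only to exercise the loop.
    recognize = lambda pts: "line" if all(y == 0 for _, y in pts) else "other"
    for pct in (25, 50, 75, 100):
        print(pct, recognition_rate([("line", line)], recognize, pct / 100, rng))
```

Segment-wise deletion and occlusion would replace `random_depletion` with a function that removes contiguous runs of points instead of a random subset.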

  12. 19 CFR 122.74 - Incomplete (pro forma) manifest.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 1 2011-04-01 2011-04-01 false Incomplete (pro forma) manifest. 122.74 Section 122.74 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT...; Electronic Manifest Requirements for Passengers, Crew Members, and Non-Crew Members Onboard...

  13. Computer Simulation of Incomplete-Data Interpretation Exercise.

    ERIC Educational Resources Information Center

    Robertson, Douglas Frederick

    1987-01-01

    Described is a computer simulation that was used to help general education students enrolled in a large introductory geology course. The purpose of the simulation is to learn to interpret incomplete data. Students design a plan to collect bathymetric data for an area of the ocean. Procedures used by the students and instructor are included.…

  14. Gender under Incomplete Acquisition: Heritage Speakers' Knowledge of Noun Categorization

    ERIC Educational Resources Information Center

    Polinsky, Maria

    2008-01-01

    The author discusses a study of gender assignment (noun categorization) in heritage Russian and presents issues in the methodology of heritage language study. To anticipate the conclusions of this article, the gender assignment data presented argue for the systematicity of what emerges under incomplete acquisition. The system is different from its…

  15. Root cause of incomplete control rod insertions at Westinghouse reactors

    SciTech Connect

    Ray, S.

    1997-01-01

    Within the past year, incomplete RCCA insertions have been observed on high burnup fuel assemblies at two Westinghouse PWRs. Initial tests at the Wolf Creek site indicated that the direct cause of the incomplete insertions observed at Wolf Creek was excessive fuel assembly thimble tube distortion. Westinghouse committed to the NRC to perform a root cause analysis by the end of August, 1996. The root cause analysis process used by Westinghouse included testing at ten sites to obtain drag, growth and other characteristics of high burnup fuel assemblies. It also included testing at the Westinghouse hot cell of two of the Wolf Creek incomplete insertion assemblies. A mechanical model was developed to calculate the response of fuel assemblies when subjected to compressive loads. Detailed manufacturing reviews were conducted to determine if this was a manufacturing related issue. In addition, a review of available worldwide experience was performed. Based on the above, it was concluded that the thimble tube distortion observed on the Wolf Creek incomplete insertion assemblies was caused by unusual fuel assembly growth over and above what would typically be expected as a result of irradiation exposure. It was determined that the unusual growth component is a combination of growth due to oxide accumulation and accelerated growth, and would only be expected in high temperature plants on fuel assemblies that see long residence times and high power duties.

  16. Nonallelic heterogeneity in autosomal dominant retinitis pigmentosa with incomplete penetrance

    SciTech Connect

    Kim, S.K.; Berson, E.L.; Dryja, T.P.

    1994-08-01

    Retinitis pigmentosa is a group of retinal diseases in which photoreceptor cells throughout the retina degenerate. Although there is considerable genetic heterogeneity (autosomal dominant, autosomal recessive, and X-linked forms exist), there is a possibility that some clinically defined subtypes of the disease may be the result of mutations at the same locus. One possible clinically defined subtype is that of autosomal dominant retinitis pigmentosa (ADRP) with incomplete penetrance. Whereas in most families with ADRP, carriers can be clearly identified because of visual loss, ophthalmological findings, or abnormal electroretinograms (ERGs), in occasional families some obligate carriers are asymptomatic and have normal or nearly normal ERGs even late in life. A recent paper reported the mapping of the disease locus in one pedigree (designated adRP7) with ADRP with incomplete penetrance to chromosome 7p. To test the idea that ADRP with incomplete penetrance may be genetically homogeneous, we have evaluated whether a different family with incomplete penetrance also has a disease gene linked to the same region. 4 refs., 1 fig., 1 tab.

  17. Limit Pricing with Incomplete Information: Answers to Frequently Asked Questions

    ERIC Educational Resources Information Center

    Sorenson, Timothy L.

    2004-01-01

    Strategic pricing is an important and exciting topic in industrial organization and the economics of strategy. A wide range of texts use what has become a standard version of the Milgrom and Roberts (1982a) limit-pricing model to convey the essential ideas of strategic pricing under incomplete information. In addition to providing a formal, but…

  18. An Interactive Approach to Analyzing Incomplete Multivariate Data.

    ERIC Educational Resources Information Center

    Raymond, Mark R.

    This paper examines some of the problems that arise when conducting multivariate analyses with incomplete data. The literature on the effectiveness of several missing data procedures (MDP) is summarized. The most widely used MDPs are: (1) listwise deletion; (2) pairwise deletion; (3) variable mean; (4) correlational methods. No MDP should be used…

  19. 40 CFR 86.085-20 - Incomplete vehicles, classification.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 18 2010-07-01 2010-07-01 false Incomplete vehicles, classification... PROGRAMS (CONTINUED) CONTROL OF EMISSIONS FROM NEW AND IN-USE HIGHWAY VEHICLES AND ENGINES General Provisions for Emission Regulations for 1977 and Later Model Year New Light-Duty Vehicles, Light-Duty...

  20. Detection of 98.5% of the mutations in 200 Belgian cystic fibrosis alleles by reverse dot-blot and sequencing of the complete coding region and exon/intron junctions of the CFTR gene

    SciTech Connect

    Cuppens, H.; Marynen, P.; Cassiman, J.J.; De Boeck, C.

    1993-12-01

    The authors have previously shown that about 85% of the mutations in 194 Belgian cystic fibrosis alleles could be detected by a reverse dot-blot assay. In the present study, 50 Belgian chromosomes were analyzed for mutations in the cystic fibrosis transmembrane conductance regulator gene by means of direct solid phase automatic sequencing of PCR products of individual exons. Twenty-six disease mutations and 14 polymorphisms were found. Twelve of these mutations and 3 polymorphisms were not described before. With the exception of one mutant allele carrying two mutations, these mutations were the only mutations found in the complete coding region and their exon/intron boundaries. The total sensitivity of mutant CF alleles that could be identified was 98.5%. Given the heterogeneity of these mutations, most of them very rare, CFTR mutation screening still remains rather complex in the population, and population screening, whether desirable or not, does not appear to be technically feasible with the methods currently available. 24 refs., 1 fig., 2 tabs.

  1. Molecular weight abnormalities of the CTCF transcription factor: CTCF migrates aberrantly in SDS-PAGE and the size of the expressed protein is affected by the UTRs and sequences within the coding region of the CTCF gene.

    PubMed

    Klenova, E M; Nicolas, R H; U, S; Carne, A F; Lee, R E; Lobanenkov, V V; Goodwin, G H

    1997-02-01

    CTCF belongs to the Zn finger transcription factor family and binds to the promoter region of c-myc. CTCF is highly conserved between species, ubiquitous and localised in nuclei. The endogenous CTCF migrates as a 130 kDa (CTCF-130) protein on SDS-PAGE; however, the open reading frame (ORF) of the CTCF cDNA encodes only an 82 kDa protein (CTCF-82). In the present study we investigate this phenomenon and show with mass-spectra analysis that this occurs due to aberrant mobility of the CTCF protein. Another paradox is that our original cDNA, composed of the ORF and 3'-untranslated region (3'-UTR), produces a protein with the apparent molecular weight of 70 kDa (CTCF-70). This paradox has been found to be an effect of the UTRs and sequences within the coding region of the CTCF gene resulting in C-terminal truncation of CTCF-130. The potential attenuator has been identified and point-mutated. This restored the electrophoretic mobility of the CTCF protein to 130 kDa. CTCF-70, the aberrantly migrating CTCF N-terminus per se, is also detected in some cell types and therefore may have some biological implications. In particular, CTCF-70 interferes with CTCF-130 normal function, enhancing transactivation induced by CTCF-130 in COS6 cells. The mechanism of CTCF-70 action and other possible functions of CTCF-70 are discussed.

  2. Molecular weight abnormalities of the CTCF transcription factor: CTCF migrates aberrantly in SDS-PAGE and the size of the expressed protein is affected by the UTRs and sequences within the coding region of the CTCF gene.

    PubMed Central

    Klenova, E M; Nicolas, R H; U, S; Carne, A F; Lee, R E; Lobanenkov, V V; Goodwin, G H

    1997-01-01

    CTCF belongs to the Zn finger transcription factor family and binds to the promoter region of c-myc. CTCF is highly conserved between species, ubiquitous and localised in nuclei. The endogenous CTCF migrates as a 130 kDa (CTCF-130) protein on SDS-PAGE; however, the open reading frame (ORF) of the CTCF cDNA encodes only an 82 kDa protein (CTCF-82). In the present study we investigate this phenomenon and show with mass-spectra analysis that this occurs due to aberrant mobility of the CTCF protein. Another paradox is that our original cDNA, composed of the ORF and 3'-untranslated region (3'-UTR), produces a protein with the apparent molecular weight of 70 kDa (CTCF-70). This paradox has been found to be an effect of the UTRs and sequences within the coding region of the CTCF gene resulting in C-terminal truncation of CTCF-130. The potential attenuator has been identified and point-mutated. This restored the electrophoretic mobility of the CTCF protein to 130 kDa. CTCF-70, the aberrantly migrating CTCF N-terminus per se, is also detected in some cell types and therefore may have some biological implications. In particular, CTCF-70 interferes with CTCF-130 normal function, enhancing transactivation induced by CTCF-130 in COS6 cells. The mechanism of CTCF-70 action and other possible functions of CTCF-70 are discussed. PMID:9016583

  3. Investigation of complete and incomplete fusion dynamics of ²⁰Ne induced reactions at energies above the Coulomb barrier

    SciTech Connect

    Singh, D.; Ali, R.; Kumar, Harish; Ansari, M. Afzal; Rashid, M. H.; Guin, R.

    2014-08-14

    An experiment has been performed to explore complete and incomplete fusion dynamics in heavy-ion collisions using the stacked foil activation technique. Excitation functions of the evaporation residues produced in the ²⁰Ne+¹⁶⁵Ho system have been measured at projectile energies of ≈ 4-8 MeV/nucleon. Measured cumulative and direct cross-sections have been compared with the theoretical model code PACE-2, which takes into account only the complete fusion process. The analysis indicates the presence of contributions from incomplete fusion processes in some α-emission channels following the break-up of the projectile ²⁰Ne in the nuclear field of the target nucleus ¹⁶⁵Ho.

  4. The sequence of sequencers: The history of sequencing DNA

    PubMed Central

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  5. Identification of a putative methylenetetrahydrofolate reductase by sequence analysis of a 6.8 kb DNA fragment of yeast chromosome VII.

    PubMed

    Tizon, B; Rodríguez-Torres, M; Rodríguez-Belmonte, E; Cadahia, J L; Cerdan, E

    1996-09-01

    We report the sequence analysis of a 6.8 kb DNA fragment from Saccharomyces cerevisiae chromosome VII. This sequence contains five open reading frames (ORFs) greater than 100 amino acids. There is also an incomplete ORF, G2868, flanking one end of the fragment; it is the 3' end of the SCS3 gene (Hosaka et al., 1994). The translated sequence of ORF G2882 shows similarity to the human methylenetetrahydrofolate reductase (Goyette et al., 1994). ORF G2889 shows no significant homology with the sequences compiled in databases. ORF G2893 corresponds to the gene SUP44, coding for the yeast ribosomal protein S4 (All-Robin et al., 1990). G2873 and G2896 are internal ORFs.

  6. Using Incomplete Trios to Boost Confidence in Family Based Association Studies.

    PubMed

    Dhankani, Varsha; Gibbs, David L; Knijnenburg, Theo; Kramer, Roger; Vockley, Joseph; Niederhuber, John; Shmulevich, Ilya; Bernard, Brady

    2016-01-01

    Most currently available family based association tests are designed to account only for nuclear families with complete genotypes for parents as well as offspring. As whole genome sequencing becomes less expensive, genetic studies are able to collect data for more families and from large family cohorts with the goal of improving statistical power. However, due to missing genotypes, many families are not included in the family based association tests, negating the benefits of large scale sequencing data. Here, we present the CIFBAT method, which uses incomplete families in the Family Based Association Test (FBAT) to evaluate robustness against missing data. CIFBAT uses quantile intervals of the FBAT statistic obtained by randomly choosing valid completions of incomplete family genotypes based on Mendelian inheritance rules. By considering all valid completions equally likely and computing quantile intervals over many randomized iterations, CIFBAT avoids assuming a homogeneous population structure or any particular missingness pattern in the data. Using simulated data, we show that the quantile intervals computed by CIFBAT are useful in validating robustness of the FBAT statistic against missing data and in identifying genomic markers with higher precision. We also propose a novel set of candidate genomic markers for uterine related abnormalities from analysis of familial whole genome sequences, and provide validation for a previously established set of candidate markers for Type 1 diabetes. We have provided a software package that incorporates TDT, robustTDT, FBAT, and CIFBAT. The data format proposed for the software uses half the memory space that the standard FBAT format (PED) files use, making it efficient for large scale genome wide association studies.
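
    The idea of sampling valid completions can be illustrated with a minimal sketch. It is not the CIFBAT software: it assumes a single biallelic marker, parent-parent-affected-child trios, genotypes coded as the number of A alleles (0, 1 or 2, with None for a missing parent), and it substitutes the simpler TDT statistic for the FBAT statistic. All function names and the example trios are hypothetical.

```python
import random

# Possible gametes (0 = allele a, 1 = allele A) for a genotype coded as
# the number of A alleles it carries (0, 1 or 2).
GAMETES = {0: (0,), 1: (0, 1), 2: (1,)}

def is_mendelian(father, mother, child):
    """True if the child genotype can arise from the two parental genotypes."""
    return any(f + m == child for f in GAMETES[father] for m in GAMETES[mother])

def valid_completions(father, mother, child):
    """List every (father, mother) pair consistent with the observed data.
    A missing parental genotype is passed as None."""
    f_opts = (0, 1, 2) if father is None else (father,)
    m_opts = (0, 1, 2) if mother is None else (mother,)
    return [(f, m) for f in f_opts for m in m_opts if is_mendelian(f, m, child)]

def transmissions(father, mother, child):
    """Count (b, c): how often heterozygous parents passed on A (b) or a (c).
    Assumes the trio is Mendelian-consistent."""
    n_het = (father == 1) + (mother == 1)
    if n_het == 2:
        return child, 2 - child
    if n_het == 1:
        homozygous = mother if father == 1 else father
        from_het = child - homozygous // 2  # allele sent by the heterozygous parent
        return from_het, 1 - from_het
    return 0, 0

def tdt_statistic(b, c):
    """McNemar-type TDT statistic; taken as 0 when no transmission is informative."""
    return (b - c) ** 2 / (b + c) if b + c else 0.0

def quantile_interval(trios, n_iter=1000, lo=0.025, hi=0.975, seed=0):
    """Quantile interval of the statistic over random, equally likely completions."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_iter):
        b = c = 0
        for father, mother, child in trios:
            f, m = rng.choice(valid_completions(father, mother, child))
            tb, tc = transmissions(f, m, child)
            b, c = b + tb, c + tc
        stats.append(tdt_statistic(b, c))
    stats.sort()
    return stats[int(lo * (n_iter - 1))], stats[int(hi * (n_iter - 1))]

# Hypothetical trios: (father, mother, affected child); None marks a missing genotype.
trios = [(1, 1, 2), (1, 0, 1), (None, 2, 2), (1, None, 0), (None, None, 1)]
print(quantile_interval(trios))
```

    A narrow interval suggests that the statistic is insensitive to how the missing genotypes are filled in; with fully genotyped trios the interval collapses to a single value.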

  7. Widespread Discordance of Gene Trees with Species Tree in Drosophila: Evidence for Incomplete Lineage Sorting

    SciTech Connect

    Pollard, Daniel A.; Iyer, Venky N.; Moses, Alan M.; Eisen, Michael B.

    2006-08-28

    The phylogenetic relationship of the now fully sequenced species Drosophila erecta and D. yakuba with respect to the D. melanogaster species complex has been a subject of controversy. All three possible groupings of the species have been reported in the past, though recent multi-gene studies suggest that D. erecta and D. yakuba are sister species. Using the whole genomes of each of these species as well as the four other fully sequenced species in the subgenus Sophophora, we set out to investigate the placement of D. erecta and D. yakuba in the D. melanogaster species group and to understand the cause of the past incongruence. Though we find that the phylogeny grouping D. erecta and D. yakuba together is the best supported, we also find widespread incongruence in nucleotide and amino acid substitutions, insertions and deletions, and gene trees. The time inferred to span the two key speciation events is short enough that under the coalescent model, the incongruence could be the result of incomplete lineage sorting. Consistent with the lineage-sorting hypothesis, substitutions supporting the same tree were spatially clustered. Support for the different trees was found to be linked to recombination such that adjacent genes support the same tree most often in regions of low recombination and substitutions supporting the same tree are most enriched roughly on the same scale as linkage disequilibrium, also consistent with lineage sorting. The incongruence was found to be statistically significant and robust to model and species choice. No systematic biases were found. We conclude that phylogenetic incongruence in the D. melanogaster species complex is the result, at least in part, of incomplete lineage sorting. Incomplete lineage sorting will likely cause phylogenetic incongruence in many comparative genomics datasets. Methods to infer the correct species tree, the history of every base in the genome, and comparative methods that control for and/or utilize this information will be

  8. Genetic code, Hamming distance and stochastic matrices.

    PubMed

    He, Matthew X; Petoukhov, Sergei V; Ricci, Paolo E

    2004-09-01

    In this paper we use the Gray code representation of the genetic code C=00, U=10, G=11 and A=01 (C pairs with G, A pairs with U) to generate a sequence of genetic code-based matrices. In connection with these code-based matrices, we use the Hamming distance to generate a sequence of numerical matrices. We then further investigate the properties of the numerical matrices and show that they are doubly stochastic and symmetric. We determine the frequency distributions of the Hamming distances, building blocks of the matrices, decomposition and iterations of matrices. We present an explicit decomposition formula for the genetic code-based matrix in terms of permutation matrices, which provides a hypercube representation of the genetic code. It is also observed that there is a Hamiltonian cycle in a genetic code-based hypercube.
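
    A minimal sketch of the Hamming-distance step, assuming the simplest reading of the abstract: each nucleotide word is replaced by the concatenation of its 2-bit Gray codes, and the matrix entry is the Hamming distance between two such bit strings. The paper's exact matrix construction may differ; the function names are hypothetical. Under this reading every row has the same sum, so dividing by that sum gives a symmetric, doubly stochastic matrix.

```python
from itertools import product

# Gray code assignment quoted in the abstract.
GRAY = {"C": "00", "U": "10", "G": "11", "A": "01"}

def encode(word):
    """Concatenate the 2-bit codes of each nucleotide in `word`."""
    return "".join(GRAY[base] for base in word)

def hamming(x, y):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(x, y))

def distance_matrix(n):
    """Pairwise Hamming distances between all nucleotide words of length n."""
    words = ["".join(w) for w in product("CUGA", repeat=n)]
    codes = [encode(w) for w in words]
    return words, [[hamming(a, b) for b in codes] for a in codes]

def normalise(matrix):
    """Divide by the (constant) row sum so that rows and columns each sum to 1."""
    total = sum(matrix[0])
    return [[d / total for d in row] for row in matrix]

words, dist = distance_matrix(1)
for word, row in zip(words, dist):
    print(word, row)
# C [0, 1, 2, 1]
# U [1, 0, 1, 2]
# G [2, 1, 0, 1]
# A [1, 2, 1, 0]
```

    Calling distance_matrix(2) or distance_matrix(3) gives the 16 x 16 and 64 x 64 matrices for dinucleotides and codons under the same assumption.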

  9. Incomplete Stevens-Johnson syndrome secondary to atypical pneumonia.

    PubMed

    Ramasamy, Anantharaman; Patel, Chiraush; Conlon, Christopher

    2011-10-04

    Stevens-Johnson syndrome is a common condition characterised by erythematous target lesions on the skin and involvement of the oral mucosa, genitals and conjunctivae. It has been documented as one of the extra-pulmonary manifestations of Mycoplasma pneumoniae infection. Recently, there have been several reports of an incomplete presentation of this syndrome, without the typical rash but with mucosal, conjunctival and genital involvement. Our case illustrates that incomplete Stevens-Johnson syndrome may present with oral mucosal and conjunctival involvement alone, without skin or genital involvement. This important clinical diagnosis should not be missed due to its atypical presentation. Treatment of Stevens-Johnson syndrome remains supportive, along with treating the underlying infection if recognised.

  10. Investigations on the Incompletely Developed Plane Diagonal-Tension Field

    NASA Technical Reports Server (NTRS)

    Kuhn, Paul

    1940-01-01

    This report presents the results of an investigation on the incompletely developed diagonal-tension field. Actual diagonal-tension beams work in an intermediate stage between pure shear and pure diagonal tension; the theory developed by Wagner for diagonal tension is not directly applicable. The first part of the paper reviews the most essential items of the theory of pure diagonal tension as well as previous attempts to formulate a theory of incomplete diagonal tension. The second part of the paper describes strain measurements made by the N. A. C. A. to obtain the necessary coefficients for the proposed theory. The third part of the paper discusses the stress analysis of diagonal-tension beams by means of the proposed theory.

  11. Survey incompleteness and the evolution of the QSO luminosity function

    NASA Technical Reports Server (NTRS)

    Majewski, Steven R.; Munn, Jeffrey A.; Kron, Richard G.; Bershady, Matthew A.; Smetanka, John J.; Koo, David C.

    1993-01-01

    We concentrate on a type of QSO survey which depends on selecting QSO candidates based on combinations of colors. Since QSO's have emission lines and power-law continua, they are expected to yield broadband colors unlike those of stellar photospheres. Previously, the fraction of QSO's expected to be hiding (unselected) within the locus of stellar (U-J, J-F) colors was estimated at about 15 percent. We have now verified that the KK88 survey is at least 11 percent incomplete, but have determined that it may be as much as 34 percent incomplete. The 'missing' QSO's are expected to be predominantly at z ≤ 2.2. We have studied the proper motion and variability properties of all stellar objects with J ≤ 22.5 or F ≤ 21.5 in the SA 57 field which has previously been surveyed with a multicolor QSO search by KK88.

  12. A Markov model for longitudinal studies with incomplete dichotomous outcomes.

    PubMed

    Efthimiou, Orestis; Welton, Nicky; Samara, Myrto; Leucht, Stefan; Salanti, Georgia

    2017-03-01

    Missing outcome data constitute a serious threat to the validity and precision of inferences from randomized controlled trials. In this paper, we propose the use of a multistate Markov model for the analysis of incomplete individual patient data for a dichotomous outcome reported over a period of time. The model accounts for patients dropping out of the study and also for patients relapsing. The time of each observation is accounted for, and the model allows the estimation of time-dependent relative treatment effects. We apply our methods to data from a study comparing the effectiveness of 2 pharmacological treatments for schizophrenia. The model jointly estimates the relative efficacy and the dropout rate and also allows for a wide range of clinically interesting inferences to be made. Assumptions about the missingness mechanism and the unobserved outcomes of patients dropping out can be incorporated into the analysis. The presented method constitutes a viable candidate for analyzing longitudinal, incomplete binary data.
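
    The multistate idea can be illustrated with a minimal sketch, assuming a discrete-time chain with three states (no response, response, dropped out) and dropout treated as absorbing. This is a simplification for illustration only: the paper's actual model, its handling of observation times and its estimation procedure are not reproduced here, and every transition probability below is a made-up placeholder rather than an estimate from the study.

```python
# States: 0 = no response, 1 = response, 2 = dropped out (absorbing).
# Hypothetical per-visit transition probabilities for a single treatment arm.
P_ARM = [
    [0.70, 0.20, 0.10],  # non-responder: stays, responds, or drops out
    [0.15, 0.80, 0.05],  # responder: relapses, stays, or drops out
    [0.00, 0.00, 1.00],  # dropout is absorbing
]

def step(dist, P):
    """Advance the state distribution by one visit: dist' = dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def occupancy(P, start, n_visits):
    """State distribution at visit 0, 1, ..., n_visits."""
    history = [list(start)]
    for _ in range(n_visits):
        history.append(step(history[-1], P))
    return history

# Everyone starts as a non-responder who is still in the study.
for visit, dist in enumerate(occupancy(P_ARM, [1.0, 0.0, 0.0], 4)):
    print(visit, [round(p, 3) for p in dist])
```

    Running the same propagation with a second, arm-specific matrix would let response and dropout probabilities be compared between treatments at each visit, which is the kind of joint summary the abstract describes.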

  13. A Markov model for longitudinal studies with incomplete dichotomous outcomes

    PubMed Central

    Welton, Nicky; Samara, Myrto; Leucht, Stefan; Salanti, Georgia

    2016-01-01

    Missing outcome data constitute a serious threat to the validity and precision of inferences from randomized controlled trials. In this paper, we propose the use of a multistate Markov model for the analysis of incomplete individual patient data for a dichotomous outcome reported over a period of time. The model accounts for patients dropping out of the study and also for patients relapsing. The time of each observation is accounted for, and the model allows the estimation of time‐dependent relative treatment effects. We apply our methods to data from a study comparing the effectiveness of 2 pharmacological treatments for schizophrenia. The model jointly estimates the relative efficacy and the dropout rate and also allows for a wide range of clinically interesting inferences to be made. Assumptions about the missingness mechanism and the unobserved outcomes of patients dropping out can be incorporated into the analysis. The presented method constitutes a viable candidate for analyzing longitudinal, incomplete binary data. PMID:27917593

  14. Noise effects on conflicting interest quantum games with incomplete information

    NASA Astrophysics Data System (ADS)

    Situ, Haozhen; Huang, Zhiming; Zhang, Cai

    2016-09-01

    Noise effects can be harmful to quantum information systems. In the present paper, we study noise effects in the context of quantum games with incomplete information, which have a more complicated structure than quantum games with complete information. The effects of several paradigmatic noises on three newly proposed conflicting interest quantum games with incomplete information are studied using a numerical optimization method. Intuitively, noise should bring down the payoffs. However, we find that in some situations the outcomes of the games under the influence of noise are counter-intuitive. Sometimes stronger noise may lead to higher payoffs. Some properties of the game, like quantum advantage, fairness and equilibrium, are invulnerable to some kinds of noise.

  15. Bayesian Inference of Natural Rankings in Incomplete Competition Networks

    NASA Astrophysics Data System (ADS)

    Park, Juyong; Yook, Soon-Hyung

    2014-08-01

    Competition between a complex system's constituents and a corresponding reward mechanism based on it have a profound influence on the functioning, stability, and evolution of the system. But determining the dominance hierarchy or ranking among the constituent parts from the strongest to the weakest, essential in determining reward and penalty, is frequently an ambiguous task due to the incomplete (partially filled) nature of competition networks. Here we introduce the "Natural Ranking," an unambiguous ranking method applicable to a round robin tournament, and formulate an analytical model based on the Bayesian formula for inferring the expected mean and error of the natural ranking of nodes from an incomplete network. We investigate its potential and uses in resolving important issues of ranking by applying it to real-world competition networks.

  16. Einstein's Boxes: Incompleteness of Quantum Mechanics Without a Separation Principle

    NASA Astrophysics Data System (ADS)

    Held, Carsten

    2015-09-01

    Einstein made several attempts to argue for the incompleteness of quantum mechanics (QM), not all of them using a separation principle. One unpublished example, the box parable, has received increased attention in the recent literature. Though the example is tailor-made for applying a separation principle and Einstein indeed applies one, he begins his discussion without it. An analysis of this first part of the parable naturally leads to an argument for incompleteness not involving a separation principle. I discuss the argument and its systematic import. Though it should be kept in mind that the argument is not the one Einstein intends, I show how it suggests itself and leads to a conflict between QM's completeness and a physical principle more fundamental than the separation principle, i.e. a principle saying that QM should deliver probabilities for physical systems possessing properties at definite times.

  17. Contributions to the theory of incomplete tension bay

    NASA Technical Reports Server (NTRS)

    Schapitz, E

    1937-01-01

    The present report offers an approximate theory for the stress and deformation condition after buckling of the skin in reinforced panels and shells loaded in simple shear and compression and under combined stresses. The theory presents a unified scheme for stresses of these types. It is based upon the concept of a nonuniform stress distribution in the metal panel and its marked power of resistance against compressive stresses ("incomplete" tension bay).

  18. Distributed control systems with incomplete and uncertain information

    NASA Astrophysics Data System (ADS)

    Tang, Jingpeng

    Scientific and engineering advances in wireless communication, sensors, propulsion, and other areas are rapidly making it possible to develop unmanned air vehicles (UAVs) with sophisticated capabilities. UAVs have come to the forefront as tools for airborne reconnaissance to search for, detect, and destroy enemy targets in relatively complex environments. They potentially reduce risk to human life, are cost effective, and are superior to manned aircraft for certain types of missions. It is desirable for UAVs to have a high level of intelligent autonomy to carry out mission tasks with little external supervision and control. This raises important issues involving tradeoffs between centralized control and the associated potential to optimize mission plans, and decentralized control with great robustness and the potential to adapt to changing conditions. UAV capabilities have been extended several ways through armament (e.g., Hellfire missiles on Predator UAVs), increased endurance and altitude (e.g., Global Hawk), and greater autonomy. Some known barriers to full-scale implementation of UAVs are increased communication and control requirements as well as increased platform and system complexity. One of the key problems is how UAV systems can handle incomplete and uncertain information in dynamic environments. Especially when the system is composed of heterogeneous and distributed UAVs, the overall system complexity is increased under such conditions. Presented through the use of published papers, this dissertation lays the groundwork for the study of methodologies for handling incomplete and uncertain information for distributed control systems. An agent-based simulation framework is built to investigate mathematical approaches (optimization) and emergent intelligence approaches. The first paper provides a mathematical approach for systems of UAVs to handle incomplete and uncertain information. The second paper describes an emergent intelligence approach for UAVs

  19. Radiopaque Tagging Masks Caries Lesions following Incomplete Excavation in vitro.

    PubMed

    Schwendicke, F; Meyer-Lueckel, H; Schulz, M; Dörfer, C E; Paris, S

    2014-06-01

    One-step incomplete excavation seals caries-affected dentin under a restoration and appears to be advantageous in the treatment of deep lesions. However, it is impossible to discriminate radiographically between intentionally left, arrested lesions and overlooked or active lesions. This diagnostic uncertainty decreases the acceptance of minimally invasive excavation and might lead to unnecessary re-treatment of incompletely excavated teeth. Radiopaque tagging of sealed lesions might mask arrested lesions and assist in discrimination from progressing lesions. Therefore, we microradiographically screened 4 substances (SnCl2, AgNO3, CsF, CsCH3COO) for their effect on artificial lesions. Since water-dissolved tin chloride (SnCl2×Aq) was found to stably mask artificial lesions, we then investigated its radiographic effects on progressing lesions. Natural lesions were incompletely excavated and radiopaque tagging performed. Grey-value differences (ΔGV) between sound and carious dentin were determined and radiographs assessed by 20 dentists. While radiographic effects of SnCl2×Aq were stable for non-progressing lesions, they significantly decreased during a second demineralization (p < .001, t test). For natural lesions, tagging with SnCl2×Aq significantly reduced ΔGV (p < .001, Wilcoxon). Tagged lesions were detected significantly less often than untagged lesions (p < .001). SnCl2×Aq was suitable to mask caries-affected dentin and discriminate between arrested and progressing lesions in vitro. Radiopaque tagging could resolve diagnostic uncertainties associated with incomplete excavation.

  20. Reconstruction of Unilateral Incomplete Cryptophthalmos in Fraser Syndrome.

    PubMed

    Tran, Ann Q; Lee, Bradford W; Alameddine, Ramzi M; Korn, Bobby S; Kikkawa, Don O

    2015-03-25

    A full-term baby girl with Fraser syndrome was born with right incomplete cryptophthalmos. On examination, the globe was completely covered with skin, with partially formed eyelids laterally. At 3 years of age, she underwent an evisceration with orbital implant and reconstruction of the eyelids and fornices using the pre-existing scleral remnant. Custom ocular prosthetic fitting was performed 5 weeks postoperatively. At 4 years' follow-up, she continued to successfully retain an ocular prosthesis.

  1. 3D ultrasound image segmentation using multiple incomplete feature sets

    NASA Astrophysics Data System (ADS)

    Fan, Liexiang; Herrington, David M.; Santago, Peter, II

    1999-05-01

    We use three features (intensity, texture, and motion) to obtain robust results for segmentation of intracoronary ultrasound images. Using a parameterized equation to describe the lumen-plaque and media-adventitia boundaries, we formulate the segmentation as a parameter estimation through a cost functional based on the posterior probability, which can handle the incompleteness of the features in ultrasound images by employing outlier detection.

  2. Complete genome sequence of Desulfohalobium retbaense type strain (HR100T)

    PubMed Central

    Spring, Stefan; Nolan, Matt; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Land, Miriam; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Munk, Christine; Kiss, Hajnalka; Chain, Patrick; Han, Cliff; Brettin, Thomas; Detter, John C.; Schüler, Esther; Göker, Markus; Rohde, Manfred; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2010-01-01

    Desulfohalobium retbaense (Ollivier et al. 1991) is the type species of the polyphyletic genus Desulfohalobium, which comprises, at the time of writing, two species and represents the family Desulfohalobiaceae within the Deltaproteobacteria. D. retbaense is a moderately halophilic sulfate-reducing bacterium, which can utilize H2 and a limited range of organic substrates, which are incompletely oxidized to acetate and CO2, for growth. The type strain HR100T was isolated from sediments of the hypersaline Retba Lake in Senegal. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the family Desulfohalobiaceae. The 2,909,567 bp genome (one chromosome and a 45,263 bp plasmid) with its 2,552 protein-coding and 57 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304676

  3. Complete genome sequence of Desulfohalobium retbaense type strain (HR100T)

    SciTech Connect

    Spring, Stefan; Nolan, Matt; Lapidus, Alla L.; Glavina Del Rio, Tijana; Copeland, A; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Land, Miriam L; Chen, Feng; Bruce, David; Goodwin, Lynne A.; Pitluck, Sam; Ivanova, N; Mavromatis, K; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Hauser, Loren John; Chang, Yun-Juan; Jeffries, Cynthia; Munk, Christine; Kiss, Hajnalka; Chain, Patrick S. G.; Han, Cliff; Brettin, Thomas S; Detter, J. Chris; Schuler, Esther; Goker, Markus; Rohde, Manfred; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2010-01-01

    Desulfohalobium retbaense (Ollivier et al. 1991) is the type species of the polyphyletic genus Desulfohalobium, which comprises, at the time of writing, two species and represents the family Desulfohalobiaceae within the Deltaproteobacteria. D. retbaense is a moderately halophilic sulfate-reducing bacterium, which can utilize H2 and a limited range of organic substrates, which are incompletely oxidized to acetate and CO2, for growth. The type strain HR100T was isolated from sediments of the hypersaline Retba Lake in Senegal. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the family Desulfohalobiaceae. The 2,909,567 bp genome (one chromosome and a 45,263 bp plasmid) with its 2,552 protein-coding and 57 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  4. Early expression of a Trypanosoma brucei VSG gene duplicated from an incomplete basic copy.

    PubMed

    Aline, R F; Myler, P J; Gobright, E; Stuart, K D

    1994-01-01

    Intrachromosomal variant surface glycoprotein (VSG) genes in Trypanosoma brucei are expressed by a mechanism involving gene conversion. The 3' boundary of gene conversion is usually within the last 130 bp of the VSG gene, a region of partially conserved sequences. We report here the loss of the predominant telomeric A VSG gene in the cloned variant antigenic type (VAT) 5A3, leaving only an intrachromosomal A VSG gene (the A-B gene). The nucleotide sequence of the A-B VSG gene reveals that it lacks the normal VSG 3' sequence. Surprisingly, we find cells expressing this A-B VSG gene in relapse populations arising from VAT 5A3. Since the A VSG mRNAs from these cells have a normal 3' sequence, the incomplete A-B VSG gene must be expressed via a partial gene conversion that supplies the functional 3' end. Although the A-B VSG gene is no longer predominant like the telomeric A VSG gene, it is still expressed more frequently than other intrachromosomal VSG genes, suggesting that factors other than a telomeric location determine whether a VSG gene is expressed early in a serodeme.

  5. Detecting non-coding selective pressure in coding regions

    PubMed Central

    Chen, Hui; Blanchette, Mathieu

    2007-01-01

    Background: Comparative genomics approaches, where orthologous DNA regions are compared and inter-species conserved regions are identified, have proven extremely powerful for identifying non-coding regulatory regions located in intergenic or intronic regions. However, non-coding functional elements can also be located within coding regions, as is common for exonic splicing enhancers, some transcription factor binding sites, and RNA secondary structure elements affecting mRNA stability, localization, or translation. Since these functional elements are located in regions that are themselves highly conserved because they are coding for a protein, they have generally escaped detection by comparative genomics approaches. Results: We introduce a comparative genomics approach for detecting non-coding functional elements located within coding regions. Codon evolution is modeled as a mixture of codon substitution models, where each component of the mixture describes the evolution of codons under a specific type of coding selective pressure. We show how to compute the posterior distribution of the entropy and parsimony scores under this null model of codon evolution. The method is applied to a set of growth hormone 1 orthologous mRNA sequences and a known exonic splicing element is detected. The analysis of a set of CORTBP2 orthologous genes reveals a region of several hundred base pairs under strong non-coding selective pressure whose function remains unknown. Conclusion: Non-coding functional elements, in particular those involved in post-transcriptional regulation, are likely to be much more prevalent than is currently known. With the numerous genome sequencing projects underway, comparative genomics approaches like that proposed here are likely to become increasingly powerful at detecting such elements. PMID:17288582
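
    As a small illustration of the entropy score mentioned above, the sketch below computes the Shannon entropy of each column of a set of aligned sequences. It shows only the raw per-column score: the paper's null model (the mixture of codon substitution models) and the posterior distribution of the score are not reproduced. The function names and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2

def column_entropy(column):
    """Shannon entropy (in bits) of the symbols observed in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return sum(-(c / n) * log2(c / n) for c in counts.values())

def entropy_profile(aligned):
    """Entropy of every column of a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned)]

# Toy alignment of four made-up orthologous sequences (not real data).
aligned = [
    "AUGGCC",
    "AUGGCU",
    "AUGGCA",
    "AUGGCG",
]
print(entropy_profile(aligned))
```

    Columns that are identical across species score 0 bits and a column with four equally frequent bases scores 2 bits; a method like the one described would compare such observed scores with what coding constraints alone are expected to produce.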

  6. Gradual molecular evolution of a sex determination switch through incomplete penetrance of femaleness.

    PubMed

    Beye, Martin; Seelmann, Christine; Gempe, Tanja; Hasselmann, Martin; Vekemans, Xavier; Fondrk, M Kim; Page, Robert E

    2013-12-16

    Some genes regulate phenotypes that are either present or absent. They are often important regulators of developmental switches and are involved in morphological evolution. We have little understanding of the molecular mechanisms by which these absence/presence gene functions have evolved, because the phenotype and fitness of molecular intermediate forms are unknown. Here, we studied the sex-determining switch of 14 natural sequence variants of the csd gene among 76 genotypes of the honeybee (Apis mellifera). Heterozygous genotypes (different specificities) of the csd gene determine femaleness, while hemizygous genotypes (single specificity) determine maleness. Homozygous genotypes of the csd gene (same specificity) are lethal. We found that at least five amino acid differences and length variation between Csd specificities in the specifying domain (PSD) were sufficient to regularly induce femaleness. We estimated that, on average, six pairwise amino acid differences evolved under positive selection. We also identified a natural evolutionary intermediate that showed only three amino acid length differences in the PSD relative to its parental allele. This genotype showed an intermediate fitness because it implemented lethality regularly and induced femaleness infrequently (i.e., incomplete penetrance). We suggest incomplete penetrance as a mechanism through which new molecular switches can gradually and adaptively evolve.

  7. Video coding with dynamic background

    NASA Astrophysics Data System (ADS)

    Paul, Manoranjan; Lin, Weisi; Lau, Chiew Tong; Lee, Bu-Sung

    2013-12-01

    Motion estimation (ME) and motion compensation (MC) using variable block size, sub-pixel search, and multiple reference frames (MRFs) are the major reasons for the improved coding performance of the H.264 video coding standard over other contemporary coding standards. The concept of MRFs is suitable for repetitive motion, uncovered background, non-integer pixel displacement, lighting change, etc. The requirement of index codes for the reference frames, the computational time in ME and MC, and the memory buffer for coded frames limit the number of reference frames used in practical applications. In typical video sequences, the previous frame is used as the reference frame in 68-92% of cases. In this article, we propose a new video coding method using a reference frame [i.e., the most common frame in scene (McFIS)] generated by dynamic background modeling. McFIS is more effective in terms of rate-distortion and computational time performance than the MRF techniques. It also has an inherent capability for scene change detection (SCD), which supports adaptive group of picture (GOP) size determination. As a result, we integrate SCD (for GOP determination) with reference frame generation. The experimental results show that the proposed coding scheme outperforms H.264 video coding with five reference frames and the two relevant state-of-the-art algorithms by 0.5-2.0 dB with less computational time.

  8. Region-based fractal video coding

    NASA Astrophysics Data System (ADS)

    Zhu, Shiping; Belloulata, Kamel

    2008-10-01

    A novel video sequence compression scheme is proposed to realize efficient and economical transmission of video sequences, together with the region-based functionality of MPEG-4. The CPM and NCIM fractal coding scheme is applied to each region independently, using a prior image segmentation map (alpha plane) that is exactly the same as defined in MPEG-4. The first n frames of the video sequence are encoded as a "set" using Circular Prediction Mapping (CPM), and the remaining frames are encoded using Non Contractive Interframe Mapping (NCIM). The CPM and NCIM accomplish the motion estimation and compensation, which can exploit the high temporal correlations between adjacent frames of the video sequence. The experimental results with monocular video sequences show promising performance at low bit rates, such as in video conferencing applications. We believe the proposed fractal video codec will be a powerful and efficient technique for region-based video sequence coding.

  9. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
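
    The detrended fluctuation analysis named in this abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: the purine/pyrimidine step mapping, the box sizes, and all function names are choices made here for the example. The walk is integrated, a least-squares line is removed in each box of length n, and the exponent is the slope of log F(n) against log n. An uncorrelated sequence should give an exponent near 0.5.

```python
import math
import random


def linear_fit_residual_ss(ys):
    """Sum of squared residuals of a least-squares line through (0..n-1, ys)."""
    n = len(ys)
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return sum((y - y_mean - slope * (x - x_mean)) ** 2 for x, y in enumerate(ys))


def dfa_fluctuation(steps, box):
    """Root-mean-square detrended fluctuation F(box) of the integrated walk."""
    mean = sum(steps) / len(steps)
    walk, total = [], 0.0
    for s in steps:
        total += s - mean
        walk.append(total)
    n_boxes = len(walk) // box
    ss = sum(linear_fit_residual_ss(walk[i * box:(i + 1) * box]) for i in range(n_boxes))
    return math.sqrt(ss / (n_boxes * box))


def dfa_exponent(steps, boxes):
    """Least-squares slope of log F(n) versus log n."""
    xs = [math.log(b) for b in boxes]
    ys = [math.log(dfa_fluctuation(steps, b)) for b in boxes]
    xm, ym = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - xm) * (y - ym) for x, y in zip(xs, ys)) / sum((x - xm) ** 2 for x in xs)


def dna_to_steps(seq):
    # Assumed mapping for this sketch: pyrimidine (C, T) -> +1, purine (A, G) -> -1.
    return [1 if base in "CT" else -1 for base in seq.upper()]


random.seed(0)
seq = "".join(random.choice("ACGT") for _ in range(4000))  # synthetic, uncorrelated
alpha = dfa_exponent(dna_to_steps(seq), [8, 16, 32, 64, 128])
```

    The sequence here is synthetic; applying the same functions to real coding and non-coding sequences would be the analogue of the comparison described above.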

  10. Interface requirements to couple thermal hydraulics codes to severe accident codes: ICARE/CATHARE

    SciTech Connect

    Camous, F.; Jacq, F.; Chatelard, P.

    1997-07-01

    In order to describe with the same code the whole sequence of severe LWR accidents, up to the vessel failure, the Institute of Protection and Nuclear Safety has performed a coupling of the severe accident code ICARE2 to the thermalhydraulics code CATHARE2. The resulting code, ICARE/CATHARE, is designed to be as pertinent as possible in all the phases of the accident. This paper is mainly devoted to the description of the ICARE2-CATHARE2 coupling.

  11. Syndrome source coding and its universal generalization

    NASA Technical Reports Server (NTRS)

    Ancheta, T. C., Jr.

    1975-01-01

    A method of using error-correcting codes to obtain data compression, called syndrome-source-coding, is described in which the source sequence is treated as an error pattern whose syndrome forms the compressed data. It is shown that syndrome-source-coding can achieve arbitrarily small distortion with the number of compressed digits per source digit arbitrarily close to the entropy of a binary memoryless source. A universal generalization of syndrome-source-coding is formulated which provides robustly-effective, distortionless, coding of source ensembles.
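
    The idea of treating the source block as an error pattern can be illustrated with a small example. The sketch below assumes a Hamming (7,4) code, which is chosen here only for illustration and is not taken from the report; the function names are hypothetical. A 7-bit block is compressed to its 3-bit syndrome, and the decoder returns the minimum-weight pattern with that syndrome, so blocks containing at most one 1 are recovered exactly.

```python
def syndrome(block):
    """Compress a 7-bit block to the 3-bit Hamming (7,4) syndrome.

    Column j (1-based) of the parity-check matrix is the binary expansion
    of j, so the syndrome is the XOR of the positions that hold a 1.
    """
    s = 0
    for position, bit in enumerate(block, start=1):
        if bit:
            s ^= position
    return s


def reconstruct(s):
    """Return the minimum-weight 7-bit pattern whose syndrome is s."""
    block = [0] * 7
    if s:
        block[s - 1] = 1
    return block


sparse = [0, 0, 0, 0, 1, 0, 0]
compressed = syndrome(sparse)        # 3 bits stand in for 7
restored = reconstruct(compressed)   # equals `sparse` for weight <= 1

# A block with two 1s is mapped to a different, lighter pattern (distortion).
distorted = reconstruct(syndrome([1, 1, 0, 0, 0, 0, 0]))
```

    With a sparse binary source, most blocks have low weight, which is why the syndrome can serve as the compressed representation.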

  12. A necessary and sufficient symbolic condition for the existence of incomplete Cholesky factorization

    SciTech Connect

    Wang, Xiaoge; Bramley, R.; Gallivan, K.A.

    1996-12-31

    Incomplete Cholesky factorization (IC) is a widely known and effective method of accelerating the convergence of conjugate gradient (CG) iterative methods for solving symmetric positive definite (SPD) linear systems. A major weakness of IC is that it may break down due to nonpositive pivots. Methods of overcoming this problem can be divided into two classes: numerical and structural strategies. A numerical strategy uses numerical values generated during the factorization process to modify the factorization, as in the work. A structural strategy, as in the work, selects the sparsity pattern to ensure the completion of the IC process. Structural strategies are important for applications where a sequence of linear systems must be solved, each coefficient matrix having the same non-zero pattern. This occurs when solving linear programming problems using interior point methods, or when solving discretized nonlinear partial differential equations with a fixed mesh. Although the values can change from one step to another, the sparsity pattern is fixed.
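
    The breakdown described in this abstract can be reproduced on a small matrix. The following is a minimal sketch, assuming the simplest structural choice in which the factor keeps exactly the non-zero pattern of A (often written IC(0)); the function name, the `keep_fill` flag, and the 4x4 test matrix are illustrative choices made here and are not taken from the paper. The matrix is SPD, so the complete factorization succeeds, yet dropping fill outside the pattern produces a negative pivot in the last column.

```python
import math


def incomplete_cholesky(a, keep_fill=False):
    """Lower-triangular factor L with A ~ L L^T.

    With keep_fill=False, entries that are zero in A are forced to stay
    zero in L. Raises ArithmeticError on a nonpositive pivot.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ArithmeticError(f"nonpositive pivot {pivot:.3f} at column {j}")
        l[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            if a[i][j] == 0 and not keep_fill:
                continue  # structural drop: position lies outside the pattern
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l


A = [[3, -2, 0, 2],
     [-2, 3, -2, 0],
     [0, -2, 3, -2],
     [2, 0, -2, 3]]

full = incomplete_cholesky(A, keep_fill=True)  # complete factorization succeeds
try:
    incomplete_cholesky(A)                     # pattern-restricted factorization
    broke_down = False
except ArithmeticError:
    broke_down = True
```

    A structural strategy in the sense of the abstract would change which positions are kept so that the loop always completes; a numerical strategy would instead modify the pivot when it becomes nonpositive.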

  13. De-coding and re-coding RNA recognition by PUF and PPR repeat proteins.

    PubMed

    Hall, Traci M Tanaka

    2016-02-01

    PUF and PPR proteins are two families of α-helical repeat proteins that recognize single-stranded RNA sequences. Both protein families hold promise as scaffolds for designed RNA-binding domains. A modular protein RNA recognition code was apparent from the first crystal structures of a PUF protein in complex with RNA, and recent studies continue to advance our understanding of natural PUF protein recognition (de-coding) and our ability to engineer specificity (re-coding). Degenerate recognition motifs make de-coding specificity of individual PPR proteins challenging. Nevertheless, re-coding PPR protein specificity using a consensus recognition code has been successful.

  14. Social Interactions under Incomplete Information: Games, Equilibria, and Expectations

    NASA Astrophysics Data System (ADS)

    Yang, Chao

    My dissertation research investigates interactions of agents' behaviors through social networks when some information is not shared publicly, focusing on solutions to a series of challenging problems in empirical research, including heterogeneous expectations and multiple equilibria. The first chapter, "Social Interactions under Incomplete Information with Heterogeneous Expectations", extends the current literature in social interactions by devising econometric models and estimation tools with private information in not only the idiosyncratic shocks but also some exogenous covariates. For example, when analyzing peer effects in class performances, it was previously assumed that all control variables, including individual IQ and SAT scores, are known to the whole class, which is unrealistic. This chapter allows such exogenous variables to be private information and models agents' behaviors as outcomes of a Bayesian Nash Equilibrium in an incomplete information game. The distribution of equilibrium outcomes can be described by the equilibrium conditional expectations, which is unique when the parameters are within a reasonable range according to the contraction mapping theorem in function spaces. The equilibrium conditional expectations are heterogeneous in both exogenous characteristics and the private information, which makes estimation in this model more demanding than in previous ones. This problem is solved in a computationally efficient way by combining the quadrature method and the nested fixed point maximum likelihood estimation. In Monte Carlo experiments, if some exogenous characteristics are private information and the model is estimated under the mis-specified hypothesis that they are known to the public, estimates will be biased. Applying this model to municipal public spending in North Carolina, significant negative correlations between contiguous municipalities are found, showing free-riding effects. The Second chapter "A Tobit Model with Social

  15. Analytical Solution for Reactive Solute Transport Considering Incomplete Mixing

    NASA Astrophysics Data System (ADS)

    Bellin, A.; Chiogna, G.

    2013-12-01

    The laboratory experiments of Gramling et al. (2002) showed that incomplete mixing at the pore scale exerts a significant impact on transport of reactive solutes and that assuming complete mixing leads to overestimation of product concentration in bimolecular reactions. We consider here the family of equilibrium reactions for which the concentration of the reactants and the product can be expressed as a function of the mixing ratio, the concentration of a fictitious non reactive solute. For this type of reactions we propose, in agreement with previous studies, to model the effect of incomplete mixing at scales smaller than the Darcy scale assuming that the mixing ratio is distributed within an REV according to a Beta distribution. We compute the parameters of the Beta model by imposing that the mean concentration is equal to the value that the concentration assumes at the continuum Darcy scale, while the variance decays with time as a power law. We show that our model reproduces the concentration profiles of the reaction product measured in the Gramling et al. (2002) experiments using the transport parameters obtained from conservative experiments and an instantaneous reaction kinetic. The results are obtained applying analytical solutions both for conservative and for reactive solute transport, thereby providing a method to handle the effect of incomplete mixing on multispecies reactive solute transport, which is simpler than other previously developed methods. Gramling, C. M., C. F. Harvey, and L. C. Meigs (2002), Reactive transport in porous media: A comparison of model prediction with laboratory visualization, Environ. Sci. Technol., 36(11), 2508-2514.
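
    Fixing a Beta distribution from a prescribed mean and variance is a standard method-of-moments step, sketched below. This is an illustration only: the function names, the power-law form v0 * t**(-exponent), and the numerical values are assumptions made for the example and are not the parameters used in the study. For mean m and variance v with v < m(1 - m), the shape parameters are a = m * c and b = (1 - m) * c, where c = m(1 - m)/v - 1.

```python
def beta_parameters(mean, variance):
    """Shape parameters (a, b) of the Beta distribution with the given moments."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    limit = mean * (1.0 - mean)
    if not 0.0 < variance < limit:
        raise ValueError("variance must lie strictly between 0 and mean*(1-mean)")
    common = limit / variance - 1.0
    return mean * common, (1.0 - mean) * common


def power_law_variance(v0, t, exponent):
    """Hypothetical decay law for the variance of the mixing ratio (t > 0)."""
    return v0 * t ** (-exponent)


# Same mean mixing ratio at two times; the variance shrinks as time grows.
a_early, b_early = beta_parameters(0.3, power_law_variance(0.1, 1.0, 1.5))
a_late, b_late = beta_parameters(0.3, power_law_variance(0.1, 10.0, 1.5))
```

    As the variance decays, a + b grows and the distribution concentrates around the mean, which corresponds to the limit of complete mixing within the REV.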

  16. Shift register generators and applications to coding

    NASA Technical Reports Server (NTRS)

    Morakis, J. C.

    1968-01-01

    The most important properties of shift register generated sequences are presented. The application of shift registers as multiplication and division circuits leads to the generation of some error correcting and detecting codes.
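
    Both uses of a shift register mentioned here can be sketched briefly. The example below is an illustration under assumptions made for this listing, not material from the report: the feedback rule s[n+4] = s[n] XOR s[n+1] (polynomial x^4 + x + 1), the generator x^3 + x + 1, and the function names are all choices made here. The first function produces a maximal-length sequence of period 15; the second performs the modulo-2 polynomial division that a division circuit carries out, and its remainder supplies the check bits of a cyclic code word.

```python
def lfsr_sequence(seed, taps, length):
    """Output of a Fibonacci shift register.

    seed: initial register contents, oldest bit first.
    taps: register offsets that are XORed to form the next bit.
    """
    register = list(seed)
    out = []
    for _ in range(length):
        out.append(register[0])
        feedback = 0
        for t in taps:
            feedback ^= register[t]
        register = register[1:] + [feedback]
    return out


def gf2_remainder(bits, generator):
    """Remainder of polynomial division over GF(2), highest degree first."""
    work = list(bits)
    for i in range(len(work) - len(generator) + 1):
        if work[i]:
            for j, g in enumerate(generator):
                work[i + j] ^= g
    return work[-(len(generator) - 1):]


sequence = lfsr_sequence([1, 0, 0, 0], taps=[0, 1], length=30)

message = [1, 1, 0, 1]
generator = [1, 0, 1, 1]                       # x^3 + x + 1
check = gf2_remainder(message + [0, 0, 0], generator)
codeword = message + check                     # divisible by the generator
```

    A received word whose remainder is non-zero has been corrupted, which is the error-detecting use of the division circuit.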

  17. A robust coding scheme for packet video

    NASA Technical Reports Server (NTRS)

    Chen, Yun-Chung; Sayood, Khalid; Nelson, Don J.

    1992-01-01

    A layered packet video coding algorithm based on a progressive transmission scheme is presented. The algorithm provides good compression and can handle significant packet loss with graceful degradation in the reconstruction sequence. Simulation results for various conditions are presented.

  18. A Supernodal Approach to Incomplete LU Factorization with Partial Pivoting

    SciTech Connect

    Li, Xiaoye Sherry; Shao, Meiyue

    2009-06-25

    We present a new supernode-based incomplete LU factorization method to construct a preconditioner for solving sparse linear systems with iterative methods. The new algorithm is primarily based on the ILUTP approach by Saad, and we incorporate a number of techniques to improve the robustness and performance of the traditional ILUTP method. These include the new dropping strategies that accommodate the use of supernodal structures in the factored matrix. We present numerical experiments to demonstrate that our new method is competitive with the other ILU approaches and is well suited for today's high performance architectures.

  19. Bayesian model updating using incomplete modal data without mode matching

    NASA Astrophysics Data System (ADS)

    Sun, Hao; Büyüköztürk, Oral

    2016-04-01

    This study investigates a new probabilistic strategy for model updating using incomplete modal data. A hierarchical Bayesian inference is employed to model the updating problem. A Markov chain Monte Carlo technique with adaptive random-walk steps is used to draw parameter samples for uncertainty quantification. Mode matching between measured and predicted modal quantities is not required through model reduction. We employ an iterated improved reduced system technique for model reduction. The reduced model retains dynamic features as close as possible to those of the model before reduction. The proposed algorithm is finally validated by an experimental example.

  20. Conditioning Analysis of Incomplete Cholesky Factorizations with Orthogonal Dropping

    SciTech Connect

    Napov, Artem

    2013-08-01

    An analysis is presented of preconditioners based on incomplete Cholesky factorization in which the neglected (dropped) components are orthogonal to the approximations being kept. A general estimate for the condition number of the preconditioned system is given which depends only on the accuracy of the individual approximations. The estimate is further improved if, for instance, only the newly computed rows of the factor are modified during each approximation step. In this latter case it is further shown to be sharp. The analysis is illustrated with some existing factorizations in the context of discretized elliptic partial differential equations.

  1. Incomplete block factorization preconditioning for indefinite elliptic problems

    SciTech Connect

    Guo, Chun-Hua

    1996-12-31

    The application of the finite difference method to approximate the solution of an indefinite elliptic problem produces a linear system whose coefficient matrix is block tridiagonal and symmetric indefinite. Such a linear system can be solved efficiently by a conjugate residual method, particularly when combined with a good preconditioner. We show that a specific incomplete block factorization exists for the indefinite matrix if the mesh size is reasonably small, and that this factorization can serve as an efficient preconditioner. Some efforts are made to estimate the eigenvalues of the preconditioned matrix. Numerical results are also given.

  2. Incompletely-Condensed Fluorinated Silsesquioxane: Synthesis and Crystal Structure

    DTIC Science & Technology

    2011-11-29

    ...A recently developed sub-class of POSS, fluorinated polyhedral oligomeric silsesquioxane (F-POSS), consists of a Si-O core with a periphery of...incompletely-condensed silsesquioxane, (CF3(CF2)7CH2CH2)8Si8O11(OH)2, has been synthesized via a multi-step synthesis (52% yield). The structure was

  3. Incomplete block SSOR preconditionings for high order discretizations

    SciTech Connect

    Kolotilina, L.

    1994-12-31

    This paper considers the solution of linear algebraic systems Ax = b resulting from the p-version of the Finite Element Method (FEM) using PCG iterations. Contrary to the h-version, the p-version ensures the desired accuracy of a discretization not by refining an original finite element mesh but by introducing higher degree polynomials as additional basis functions, which makes it possible to reduce the size of the resulting linear system as compared with the h-version. The suggested preconditionings are the so-called Incomplete Block SSOR (IBSSOR) preconditionings.

  4. Chronic incomplete atrioventricular block induced by radiofrequency catheter ablation

    SciTech Connect

    Huang, S.K.; Bharati, S.; Graham, A.R.; Gorman, G.; Lev, M. )

    1989-10-01

    To determine if catheter ablation of the atrioventricular (AV) junction with radiofrequency energy can induce chronic incomplete (first- and second-degree) AV block to avoid the need for a permanent pacemaker, 20 closed-chest dogs were studied. Group 1 (10 dogs) received radiofrequency energy (750 kHz) with a fixed power setting (5 or 10 W) while increasing the pulse duration from 10 to 50 seconds for each application. Group 2 (10 dogs) received energy with a fixed pulse duration (20 or 30 seconds) while increasing the power setting from 5 to 10 W or from 10 to 20 W during each energy delivery. Radiofrequency energy was delivered between a chest-patch electrode and the distal electrode of a regular 7F tripolar His bundle catheter. For each application, the energy delivery was interrupted when (1) the PR interval prolonged (greater than 50%) or (2) second-degree or complete AV block occurred and persisted up to 5 seconds. The ablation procedure ended when there was (1) persistent PR prolongation (greater than 50%) or persistent second-degree AV block (lasting greater than 30 minutes) after ablation, (2) occurrence of two consecutive transient (less than 1 minute) complete AV blocks after each energy delivery, or (3) complete AV block (lasting greater than 2 minutes) after ablation. Of seven dogs in group 1 and five dogs in group 2 in which incomplete AV block was achieved 1 hour after the procedure, six in group 1 and five in group 2 remained in incomplete AV block 2-3 months after ablation. One dog in group 1 progressed into complete AV block. Of the remaining three dogs in group 1 and five dogs in group 2 in which complete AV block was initially achieved 1 hour after ablation, two in group 1 and four in group 2 continued to have complete AV block, whereas one in each group had AV conduction returned to incomplete at 1-2 months of follow-up.

  5. Uniform Asymptotic Expansion for the Incomplete Beta Function

    NASA Astrophysics Data System (ADS)

    Nemes, Gergő; Olde Daalhuis, Adri B.

    2016-10-01

    In [Temme N.M., Special functions. An introduction to the classical functions of mathematical physics, A Wiley-Interscience Publication, John Wiley & Sons, Inc., New York, 1996, Section 11.3.3.1] a uniform asymptotic expansion for the incomplete beta function was derived. It was not obvious from those results that the expansion is actually an asymptotic expansion. We derive a remainder estimate that clearly shows that the result indeed has an asymptotic property, and we also give a recurrence relation for the coefficients.

  6. Incomplete exponential sums and Diffie-Hellman triples

    NASA Astrophysics Data System (ADS)

    Banks, William D.; Friedlander, John B.; Konyagin, Sergei V.; Shparlinski, Igor E.

    2006-03-01

    Let $p$ be a prime and $\vartheta$ an integer of order $t$ in the multiplicative group modulo $p$. In this paper, we continue the study of the distribution of Diffie-Hellman triples $(\vartheta^x, \vartheta^y, \vartheta^{xy})$ by considering the closely related problem of estimating exponential sums formed from linear combinations of the entries in such triples. We show that the techniques developed earlier for complete sums can be combined, modified and developed further to treat incomplete sums as well. Our bounds imply uniformity of distribution results for Diffie-Hellman triples as the pair $(x,y)$ varies over small boxes.
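
    The objects in this abstract are easy to compute for a small prime. The sketch below is only a numerical illustration with values chosen here (p = 101, base 2, a 10 by 10 box, coefficients (1, 1, 1)); it does not reproduce the bounds of the paper, and the function names are hypothetical. It lists the Diffie-Hellman triples for exponent pairs in a box and evaluates the exponential sum of a linear combination of their entries.

```python
import cmath


def multiplicative_order(g, p):
    """Smallest t > 0 with g**t == 1 (mod p); assumes p is prime and does not divide g."""
    t, value = 1, g % p
    while value != 1:
        value = (value * g) % p
        t += 1
    return t


def dh_triples(g, p, xs, ys):
    """Triples (g^x, g^y, g^(xy)) mod p for every pair (x, y) with x in xs, y in ys."""
    return [(pow(g, x, p), pow(g, y, p), pow(g, x * y, p)) for x in xs for y in ys]


def exponential_sum(triples, coeffs, p):
    """Sum of exp(2*pi*i*(a*u + b*v + c*w)/p) over the triples (u, v, w)."""
    a, b, c = coeffs
    return sum(cmath.exp(2j * cmath.pi * ((a * u + b * v + c * w) % p) / p)
               for u, v, w in triples)


p, g = 101, 2
t = multiplicative_order(g, p)
box = dh_triples(g, p, range(1, 11), range(1, 11))
ratio = abs(exponential_sum(box, (1, 1, 1), p)) / len(box)
```

    The trivial bound is ratio <= 1; a value well below 1 indicates cancellation among the terms, which is the kind of behaviour that uniformity-of-distribution results quantify.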

  7. Incomplete optical shielding in cold sodium atom traps

    SciTech Connect

    Yurovsky, Vladimir; Ben-Reuven, Abraham

    1997-01-05

    A simple two-channel model, based on the semiclassical Landau-Zener (LZ) approximation, with averaging over angle-dependent exponents, is proposed as a fast means for accounting for the incomplete optical shielding of collisions, as observed in recent experiments conducted by Weiner and co-workers on ultracold sodium-atom traps, and its dependence on the laser polarization. The model yields a reasonably good agreement with the recent quantum close-coupling calculations of Julienne and co-workers. The remaining discrepancy between both theories and the data is qualitatively attributed to a partial overlap of the collision ranges at which loss processes and optical shielding occur.

  8. Error-correction coding

    NASA Technical Reports Server (NTRS)

    Hinds, Erold W. (Principal Investigator)

    1996-01-01

    This report describes the progress made towards the completion of a specific task on error-correcting coding. The proposed research consisted of investigating the use of modulation block codes as the inner code of a concatenated coding system in order to improve the overall space link communications performance. The study proposed to identify and analyze candidate codes that will complement the performance of the overall coding system which uses the interleaved RS (255,223) code as the outer code.

  9. MSLICE Sequencing

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas M.; Joswig, Joseph C.; Shams, Khawaja S.; Norris, Jeffrey S.; Morris, John R.

    2011-01-01

    MSLICE Sequencing is a graphical tool for writing sequences and integrating them into RML files, as well as for producing SCMF files for uplink. When operated in a testbed environment, it also supports uplinking these SCMF files to the testbed via Chill. This software features a free-form textual sequence editor featuring syntax coloring, automatic content assistance (including command and argument completion proposals), complete with types, value ranges, units, and descriptions from the command dictionary that appear as they are typed. The sequence editor also has a "field mode" that allows tabbing between arguments and displays type/range/units/description for each argument as it is edited. Color-coded error and warning annotations on problematic tokens are included, as well as indications of problems that are not visible in the current scroll range. "Quick Fix" suggestions are made for resolving problems, and all the features afforded by modern source editors are also included such as copy/cut/paste, undo/redo, and a sophisticated find-and-replace system optionally using regular expressions. The software offers a full XML editor for RML files, which features syntax coloring, content assistance and problem annotations as above. There is a form-based, "detail view" that allows structured editing of command arguments and sequence parameters when preferred. The "project view" shows the user's "workspace" as a tree of "resources" (projects, folders, and files) that can subsequently be opened in editors by double-clicking. Files can be added, deleted, dragged-dropped/copied-pasted between folders or projects, and these operations are undoable and redoable. A "problems view" contains a tabular list of all problems in the current workspace. Double-clicking on any row in the table opens an editor for the appropriate sequence, scrolling to the specific line with the problem, and highlighting the problematic characters. From there, one can invoke "quick fix" as described

  10. Experimental Study of Spatially Incomplete Structural System Identification

    DTIC Science & Technology

    1994-03-01

    procedure is the transfer of the data from the HP 3562A to the personal computer that will allow the data to be evaluated using MATLAB codes. 1. File...100-20 hz)* 8 rows/hz), columns I thru 50. These four files provide the data for the test system FRF in the MATLAB codes used to solve the localization...the file coding % End frequency (120+n*100) hz n=[1,2,3] % Data Files: - files must be in *.mat format % file location c:\\ matlab \\beamdata directory

  11. Sequential Syndrome Decoding of Convolutional Codes

    NASA Technical Reports Server (NTRS)

    Reed, I. S.; Truong, T. K.

    1984-01-01

    The algebraic structure of convolutional codes is reviewed and sequential syndrome decoding is applied to those codes. These concepts are then used to realize by example actual sequential decoding, using the stack algorithm. The Fano metric for use in sequential decoding is modified so that it can be utilized to sequentially find the minimum weight error sequence.

  12. Theory and Simulations of Incomplete Reconnection During Sawteeth Due to Diamagnetic Effects

    NASA Astrophysics Data System (ADS)

    Beidler, Matthew Thomas

    Tokamaks use magnetic fields to confine plasmas to achieve fusion; they are the leading approach proposed for the widespread production of fusion energy. The sawtooth crash in tokamaks limits the core temperature, adversely impacts confinement, and seeds disruptions. Adequate knowledge of the physics governing the sawtooth crash and a predictive capability of its ramifications have been elusive, including an understanding of incomplete reconnection, i.e., why sawteeth often cease prematurely before processing all available magnetic flux. In this dissertation, we introduce a model for incomplete reconnection in sawtooth crashes resulting from increasing diamagnetic effects in the nonlinear phase of magnetic reconnection. Physically, the reconnection inflow self-consistently convects the high pressure core of a tokamak toward the q=1 rational surface, thereby increasing the pressure gradient at the reconnection site. If the pressure gradient at the rational surface becomes large enough due to the self-consistent evolution, incomplete reconnection will occur due to diamagnetic effects becoming large enough to suppress reconnection. Predictions of this model are borne out in large-scale proof-of-principle two-fluid simulations of reconnection in a 2D slab geometry and are also consistent with data from the Mega Ampere Spherical Tokamak (MAST). Additionally, we present simulations from the 3D extended-MHD code M3D-C1 used to study the sawtooth crash in a 3D toroidal geometry for resistive-MHD and two-fluid models. This is the first study in a 3D tokamak geometry to show that the inclusion of two-fluid physics in the model equations is essential for recovering timescales more closely in line with experimental results than resistive-MHD, and to contrast the dynamics in the two models. We use a novel approach to sample the data in the plane of reconnection perpendicular to the (m,n)=(1,1) mode to carefully assess the reconnection physics. Using local measures of

  13. Dynamic Financial Constraints: Distinguishing Mechanism Design from Exogenously Incomplete Regimes*

    PubMed Central

    Karaivanov, Alexander; Townsend, Robert M.

    2014-01-01

    We formulate and solve a range of dynamic models of constrained credit/insurance that allow for moral hazard and limited commitment. We compare them to full insurance and exogenously incomplete financial regimes (autarky, saving only, borrowing and lending in a single asset). We develop computational methods based on mechanism design, linear programming, and maximum likelihood to estimate, compare, and statistically test these alternative dynamic models with financial/information constraints. Our methods can use both cross-sectional and panel data and allow for measurement error and unobserved heterogeneity. We estimate the models using data on Thai households running small businesses from two separate samples. We find that in the rural sample, the exogenously incomplete saving only and borrowing regimes provide the best fit using data on consumption, business assets, investment, and income. Family and other networks help consumption smoothing there, as in a moral hazard constrained regime. In contrast, in urban areas, we find mechanism design financial/information regimes that are decidedly less constrained, with the moral hazard model fitting best combined business and consumption data. We perform numerous robustness checks in both the Thai data and in Monte Carlo simulations and compare our maximum likelihood criterion with results from other metrics and data not used in the estimation. A prototypical counterfactual policy evaluation exercise using the estimation results is also featured. PMID:25246710

  14. An information propagation model considering incomplete reading behavior in microblog

    NASA Astrophysics Data System (ADS)

    Su, Qiang; Huang, Jiajia; Zhao, Xiande

    2015-02-01

    Microblog is one of the most popular communication channels on the Internet, and has already become the third largest source of news and public opinions in China. Although researchers have studied information propagation in microblog using epidemic models, previous studies have not considered the incomplete reading behavior among microblog users. Therefore, those models cannot fit real situations well. In this paper, we propose an improved model entitled Microblog-Susceptible-Infected-Removed (Mb-SIR) for information propagation by explicitly considering the user's incomplete reading behavior. We also test the effectiveness of the model using real data from Sina Microblog. We demonstrate that the newly proposed model is more accurate in describing information propagation in microblog. In addition, we investigate the effects of the critical model parameters, e.g., reading rate, spreading rate, and removed rate, through numerical simulations. The simulation results show that, compared with other parameters, reading rate plays the most influential role in the information propagation performance in microblog.

  15. Observable Priors: Limiting Biases in Estimated Parameters for Incomplete Orbits

    NASA Astrophysics Data System (ADS)

    Kosmo, Kelly; Martinez, Gregory; Hees, Aurelien; Witzel, Gunther; Ghez, Andrea M.; Do, Tuan; Sitarski, Breann; Chu, Devin; Dehghanfar, Arezu

    2017-01-01

    Over twenty years of monitoring stellar orbits at the Galactic center has provided an unprecedented opportunity to study the physics and astrophysics of the supermassive black hole (SMBH) at the center of the Milky Way Galaxy. In order to constrain the mass of and distance to the black hole, and to evaluate its gravitational influence on orbiting bodies, we use Bayesian statistics to infer black hole and stellar orbital parameters from astrometric and radial velocity measurements of stars orbiting the central SMBH. Unfortunately, most of the short period stars in the Galactic center have periods much longer than our twenty year time baseline of observations, resulting in incomplete orbital phase coverage--potentially biasing fitted parameters. Using the Bayesian statistical framework, we evaluate biases in the black hole and orbital parameters of stars with varying phase coverage, using various prior models to fit the data. We present evidence that incomplete phase coverage of an orbit causes prior assumptions to bias statistical quantities, and propose a solution to reduce these biases for orbits with low phase coverage. The explored solution assumes uniformity in the observables rather than in the inferred model parameters, as is the current standard method of orbit fitting. Of the cases tested, priors that assume uniform astrometric and radial velocity observables reduce the biases in the estimated parameters. The proposed method will not only improve orbital estimates of stars orbiting the central SMBH, but can also be extended to other orbiting bodies with low phase coverage such as visual binaries and exoplanets.

  16. Topological effects of data incompleteness of gene regulatory networks

    PubMed Central

    2012-01-01

    Background: The topological analysis of biological networks has been a prolific topic in network science during the last decade. A persistent problem with this approach is the inherent uncertainty and noisy nature of the data. One of the cases in which this situation is more marked is that of transcriptional regulatory networks (TRNs) in bacteria. The datasets are incomplete because regulatory pathways associated with a relevant fraction of bacterial genes remain unknown. Furthermore, direction, strengths and signs of the links are sometimes unknown or simply overlooked. Finally, the experimental approaches to infer the regulations are highly heterogeneous, in a way that induces the appearance of systematic experimental-topological correlations. And yet, the quality of the available data increases constantly. Results: In this work we capitalize on these advances to point out the influence of data (in)completeness and quality on some classical results on topological analysis of TRNs, especially regarding modularity at different levels. Conclusions: In doing so, we identify the most relevant factors affecting the validity of previous findings, highlighting important caveats for future topological analysis of prokaryotic TRNs. PMID:22920968

  17. Incomplete oxidation of ethylenediaminetetraacetic acid in chemical oxygen demand analysis.

    PubMed

    Anderson, James E; Mueller, Sherry A; Kim, Byung R

    2007-09-01

    Ethylenediaminetetraacetic acid (EDTA) was found to incompletely oxidize in chemical oxygen demand (COD) analysis, leading to incorrect COD values for water samples containing relatively large amounts of EDTA. The degree of oxidation depended on the oxidant used, its concentration, and the length of digestion. The COD concentrations measured using COD vials with a potassium dichromate concentration of 0.10 N (after dilution by sample and sulfuric acid) were near theoretical oxygen demand values. However, COD values measured with dichromate concentrations of 0.010 N and 0.0022 N were 30 to 40% lower than theoretical oxygen demand values. Similarly, lower COD values were observed with manganic sulfate as oxidant at 0.011 N. Extended digestion yielded somewhat higher COD values, suggesting incomplete and slower oxidation of EDTA as a result of lower oxidant concentrations. For wastewater in which EDTA is a large fraction of COD, accurate COD measurement may not be achieved with methods using dichromate concentrations less than 0.1 N.
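
    For orientation, the theoretical oxygen demand that the measured COD values are compared against can be estimated from stoichiometry. The sketch below assumes the free-acid form of EDTA (C10H16N2O8) and the common convention that organic nitrogen ends up as ammonia, so that C10H16N2O8 + 8.5 O2 -> 10 CO2 + 5 H2O + 2 NH3. The abstract does not say which form or convention the authors used, and the function name is illustrative.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def theoretical_oxygen_demand(c, h, o, n):
    """Return ThOD in g O2 per g of a compound CcHhOoNn.

    Assumes complete oxidation to CO2 and H2O with nitrogen released as NH3:
    mol O2 per mol compound = c + h/4 - o/2 - 3n/4.
    """
    mol_o2 = c + h / 4 - o / 2 - 3 * n / 4
    molar_mass = (c * ATOMIC_MASS["C"] + h * ATOMIC_MASS["H"]
                  + o * ATOMIC_MASS["O"] + n * ATOMIC_MASS["N"])
    return mol_o2 * 2 * ATOMIC_MASS["O"] / molar_mass


if __name__ == "__main__":
    # EDTA free acid, C10H16N2O8 (assumed form).
    thod = theoretical_oxygen_demand(c=10, h=16, o=8, n=2)
    print(f"ThOD of EDTA: {thod:.3f} g O2 per g")
    # The abstract reports COD 30 to 40% below theoretical at low dichromate strength.
    print(f"Implied measured range: {0.6 * thod:.3f} to {0.7 * thod:.3f} g O2 per g")
```

    Under these assumptions the theoretical value is about 0.93 g O2 per g of EDTA, so a reading 30 to 40% low would correspond to only about 0.56 to 0.65 g O2 per g.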

  18. Analysis of recurrent event data with incomplete observation gaps.

    PubMed

    Kim, Yang-Jin; Jhun, Myoungshic

    2008-03-30

    In the analysis of recurrent event data, recurrent events are not completely observed when the terminating event occurs before the end of a study. To make valid inferences about recurrent events, several methods have been suggested for accommodating the terminating event (Statist. Med. 1997; 16:911-924; Biometrics 2000; 56:554-562). In this paper, we consider a particular situation in which intermittent dropouts result in observation gaps during which no recurrent events are observed. In this situation, risk status varies over time and the usual definition of the risk variable is not applicable. In particular, we consider the case when information on the observation gap is incomplete, that is, the starting time of an intermittent dropout is known but its terminating time is not available. This incomplete information is modeled in terms of an interval-censored mechanism. Our proposed method is applied to the study of the Young Traffic Offenders Program on conviction rates, wherein a certain proportion of subjects experienced suspensions with intermittent dropouts during the study.

  19. Cannabinoids induce incomplete maturation of cultured human leukemia cells

    SciTech Connect

    Murison, G.; Chubb, C.B.H.; Maeda, S.; Gemmell, M.A.; Huberman, E.

    1987-08-01

    Monocyte maturation markers were induced in cultured human myeloblastic ML-2 leukemia cells after treatment for 1-6 days with 0.03-30 μM Δ⁹-tetrahydrocannabinol (THC), the major psychoactive component of marijuana. After a 2-day or longer treatment, 2- to 5-fold increases were found in the percentages of cells exhibiting reactivity with either the murine OKM1 monoclonal antibody or the Leu-M5 monoclonal antibody, staining positively for nonspecific esterase activity, and displaying a promonocyte morphology. The increases in these differentiation markers after treatment with 0.03-1 μM THC were dose dependent. At this dose range, THC did not cause an inhibition of cell growth. The THC-induced cell maturation was also characterized by specific changes in the patterns of newly synthesized proteins. The THC-induced differentiation did not, however, result in cells with a highly developed mature monocyte phenotype. However, treatment of these incompletely matured cells with either phorbol 12-myristate 13-acetate or 1α,25-dihydroxycholecalciferol, which are inducers of differentiation in myeloid leukemia cells (including ML-2 cells), produced cells with a mature monocyte morphology. The ML-2 cell system described here may be a useful tool for deciphering critical biochemical events that lead to the cannabinoid-induced incomplete cell differentiation of ML-2 cells and other related cell types. Findings obtained from this system may have important implications for studies of cannabinoid effects on normal human bone-marrow progenitor cells.

  20. Semiparametric pseudoscore for regression with multidimensional but incompletely observed regressor.

    PubMed

    Hu, Zonghui; Qin, Jing; Follmann, Dean

    2017-02-16

    We study the regression f_β(Y|X,Z), where Y is the response, Z ∈ R^d is a vector of fully observed regressors and X is the regressor with incomplete observation. To handle missing data, maximum likelihood estimation via expectation-maximisation (EM) is the most efficient but is sensitive to the specification of the distribution of X. Under a missing at random assumption, we propose an EM-type estimation via a semiparametric pseudoscore. As in EM, we derive the conditional expectation of the score function given Y and Z, or the mean score, over the incompletely observed units under a postulated distribution of X. Instead of directly using the mean score in the estimating equation, we use it as a working index to construct the semiparametric pseudoscore via nonparametric regression. Introduction of the semiparametric pseudoscore into the EM framework reduces sensitivity to the specified distribution of X. It also avoids the curse of dimensionality when Z is multidimensional. The resulting regression estimator is more than doubly robust: it is consistent if either the pattern of missingness in X is correctly specified or the working index is appropriately, but not necessarily correctly, specified. It attains optimal efficiency when both conditions are satisfied. Numerical performance is explored by Monte Carlo simulations and a study on treating hepatitis C patients with HIV coinfection. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.