linear sequence comparisons: Topics by Science.gov

Sample records for linear sequence comparisons

Criterion for estimation of stress-deformed state of SD-materials

NASA Astrophysics Data System (ADS)

Orekhov, Andrey V.

2018-05-01

A criterion is proposed that determines the moment when the growth pattern of the monotonic numerical sequence varies from the linear to the parabolic one. The criterion is based on the comparison of squares of errors for the linear and the incomplete quadratic approximation. The approximating functions are constructed locally, only at those points that are located near a possible change in nature of the increase in the sequence.
Defining Differential Genetic Signatures in CXCR4- and the CCR5-Utilizing HIV-1 Co-Linear Sequences

PubMed Central

Aiamkitsumrit, Benjamas; Dampier, Will; Martin-Garcia, Julio; Nonnemacher, Michael R.; Pirrone, Vanessa; Ivanova, Tatyana; Zhong, Wen; Kilareski, Evelyn; Aldigun, Hazeez; Frantz, Brian; Rimbey, Matthew; Wojno, Adam; Passic, Shendra; Williams, Jean W.; Shah, Sonia; Blakey, Brandon; Parikh, Nirzari; Jacobson, Jeffrey M.; Moldover, Brian; Wigdahl, Brian

2014-01-01

The adaptation of human immunodeficiency virus type-1 (HIV-1) to an array of physiologic niches is advantaged by the plasticity of the viral genome, encoded proteins, and promoter. CXCR4-utilizing (X4) viruses preferentially, but not universally, infect CD4+ T cells, generating high levels of virus within activated HIV-1-infected T cells that can be detected in regional lymph nodes and peripheral blood. By comparison, the CCR5-utilizing (R5) viruses have a greater preference for cells of the monocyte-macrophage lineage; however, while R5 viruses also display a propensity to enter and replicate in T cells, they infect a smaller percentage of CD4+ T cells in comparison to X4 viruses. Additionally, R5 viruses have been associated with viral transmission and CNS disease and are also more prevalent during HIV-1 disease. Specific adaptive changes associated with X4 and R5 viruses were identified in co-linear viral sequences beyond the Env-V3. The in silico position-specific scoring matrix (PSSM) algorithm was used to define distinct groups of X4 and R5 sequences based solely on sequences in Env-V3. Bioinformatic tools were used to identify genetic signatures involving specific protein domains or long terminal repeat (LTR) transcription factor sites within co-linear viral protein R (Vpr), trans-activator of transcription (Tat), or LTR sequences that were preferentially associated with X4 or R5 Env-V3 sequences. A number of differential amino acid and nucleotide changes were identified across the co-linear Vpr, Tat, and LTR sequences, suggesting the presence of specific genetic signatures that preferentially associate with X4 or R5 viruses. Investigation of the genetic relatedness between X4 and R5 viruses utilizing phylogenetic analyses of complete sequences could not be used to definitively and uniquely identify groups of R5 or X4 sequences; in contrast, differences in the genetic diversities between X4 and R5 were readily identified within these co-linear sequences in HIV-1-infected patients. PMID:25265194
Evidence for the Concerted Evolution between Short Linear Protein Motifs and Their Flanking Regions

PubMed Central

Chica, Claudia; Diella, Francesca; Gibson, Toby J.

2009-01-01

Background Linear motifs are short modules of protein sequences that play a crucial role in mediating and regulating many protein–protein interactions. The function of linear motifs strongly depends on the context, e.g. functional instances mainly occur inside flexible regions that are accessible for interaction. Sometimes linear motifs appear as isolated islands of conservation in multiple sequence alignments. However, they also occur in larger blocks of sequence conservation, suggesting an active role for the neighbouring amino acids. Results The evolution of regions flanking 116 functional linear motif instances was studied. The conservation of the amino acid sequence and order/disorder tendency of those regions was related to presence/absence of the instance. For the majority of the analysed instances, the pairs of sequences conserving the linear motif were also observed to maintain a similar local structural tendency and/or to have higher local sequence conservation when compared to pairs of sequences where one is missing the linear motif. Furthermore, those instances have a higher chance to co–evolve with the neighbouring residues in comparison to the distant ones. Those findings are supported by examples where the regulation of the linear motif–mediated interaction has been shown to depend on the modifications (e.g. phosphorylation) at neighbouring positions or is thought to benefit from the binding versatility of disordered regions. Conclusion The results suggest that flanking regions are relevant for linear motif–mediated interactions, both at the structural and sequence level. More interestingly, they indicate that the prediction of linear motif instances can be enriched with contextual information by performing a sequence analysis similar to the one presented here. This can facilitate the understanding of the role of these predicted instances in determining the protein function inside the broader context of the cellular network where they arise. PMID:19584925
Sequence information signal processor

DOEpatents

Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.

1999-01-01

An electronic circuit is used to compare two sequences, such as genetic sequences, to determine which alignment of the sequences produces the greatest similarity. The circuit includes a linear array of series-connected processors, each of which stores a single element from one of the sequences and compares that element with each successive element in the other sequence. For each comparison, the processor generates a scoring parameter that indicates which segment ending at those two elements produces the greatest degree of similarity between the sequences. The processor uses the scoring parameter to generate a similar scoring parameter for a comparison between the stored element and the next successive element from the other sequence. The processor also delivers the scoring parameter to the next processor in the array for use in generating a similar scoring parameter for another pair of elements. The electronic circuit determines which processor and alignment of the sequences produce the scoring parameter with the highest value.
Statistical method to compare massive parallel sequencing pipelines.

PubMed

Elsensohn, M H; Leblay, N; Dimassi, S; Campan-Fournier, A; Labalme, A; Roucher-Boulez, F; Sanlaville, D; Lesca, G; Bardel, C; Roy, P

2017-03-01

Today, sequencing is frequently carried out by Massive Parallel Sequencing (MPS) that cuts drastically sequencing time and expenses. Nevertheless, Sanger sequencing remains the main validation method to confirm the presence of variants. The analysis of MPS data involves the development of several bioinformatic tools, academic or commercial. We present here a statistical method to compare MPS pipelines and test it in a comparison between an academic (BWA-GATK) and a commercial pipeline (TMAP-NextGENe®), with and without reference to a gold standard (here, Sanger sequencing), on a panel of 41 genes in 43 epileptic patients. This method used the number of variants to fit log-linear models for pairwise agreements between pipelines. To assess the heterogeneity of the margins and the odds ratios of agreement, four log-linear models were used: a full model, a homogeneous-margin model, a model with single odds ratio for all patients, and a model with single intercept. Then a log-linear mixed model was fitted considering the biological variability as a random effect. Among the 390,339 base-pairs sequenced, TMAP-NextGENe® and BWA-GATK found, on average, 2253.49 and 1857.14 variants (single nucleotide variants and indels), respectively. Against the gold standard, the pipelines had similar sensitivities (63.47% vs. 63.42%) and close but significantly different specificities (99.57% vs. 99.65%; p < 0.001). Same-trend results were obtained when only single nucleotide variants were considered (99.98% specificity and 76.81% sensitivity for both pipelines). The method allows thus pipeline comparison and selection. It is generalizable to all types of MPS data and all pipelines.
Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

PubMed Central

Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

2018-01-01

Aims A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
Localization of Non-Linearly Modeled Autonomous Mobile Robots Using Out-of-Sequence Measurements

PubMed Central

Besada-Portas, Eva; Lopez-Orozco, Jose A.; Lanillos, Pablo; de la Cruz, Jesus M.

2012-01-01

This paper presents a state of the art of the estimation algorithms dealing with Out-of-Sequence (OOS) measurements for non-linearly modeled systems. The state of the art includes a critical analysis of the algorithm properties that takes into account the applicability of these techniques to autonomous mobile robot navigation based on the fusion of the measurements provided, delayed and OOS, by multiple sensors. Besides, it shows a representative example of the use of one of the most computationally efficient approaches in the localization module of the control software of a real robot (which has non-linear dynamics, and linear and non-linear sensors) and compares its performance against other approaches. The simulated results obtained with the selected OOS algorithm shows the computational requirements that each sensor of the robot imposes to it. The real experiments show how the inclusion of the selected OOS algorithm in the control software lets the robot successfully navigate in spite of receiving many OOS measurements. Finally, the comparison highlights that not only is the selected OOS algorithm among the best performing ones of the comparison, but it also has the lowest computational and memory cost. PMID:22736962
Localization of non-linearly modeled autonomous mobile robots using out-of-sequence measurements.

PubMed

Besada-Portas, Eva; Lopez-Orozco, Jose A; Lanillos, Pablo; de la Cruz, Jesus M

2012-01-01

This paper presents a state of the art of the estimation algorithms dealing with Out-of-Sequence (OOS) measurements for non-linearly modeled systems. The state of the art includes a critical analysis of the algorithm properties that takes into account the applicability of these techniques to autonomous mobile robot navigation based on the fusion of the measurements provided, delayed and OOS, by multiple sensors. Besides, it shows a representative example of the use of one of the most computationally efficient approaches in the localization module of the control software of a real robot (which has non-linear dynamics, and linear and non-linear sensors) and compares its performance against other approaches. The simulated results obtained with the selected OOS algorithm shows the computational requirements that each sensor of the robot imposes to it. The real experiments show how the inclusion of the selected OOS algorithm in the control software lets the robot successfully navigate in spite of receiving many OOS measurements. Finally, the comparison highlights that not only is the selected OOS algorithm among the best performing ones of the comparison, but it also has the lowest computational and memory cost.
Sequence information signal processor for local and global string comparisons

DOEpatents

Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.

1997-01-01

A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's compliment operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
Beyond Linear Sequence Comparisons: The use of genome-levelcharacters for phylogenetic reconstruction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boore, Jeffrey L.

2004-11-27

Although the phylogenetic relationships of many organisms have been convincingly resolved by the comparisons of nucleotide or amino acid sequences, others have remained equivocal despite great effort. Now that large-scale genome sequencing projects are sampling many lineages, it is becoming feasible to compare large data sets of genome-level features and to develop this as a tool for phylogenetic reconstruction that has advantages over conventional sequence comparisons. Although it is unlikely that these will address a large number of evolutionary branch points across the broad tree of life due to the infeasibility of such sampling, they have great potential for convincinglymore » resolving many critical, contested relationships for which no other data seems promising. However, it is important that we recognize potential pitfalls, establish reasonable standards for acceptance, and employ rigorous methodology to guard against a return to earlier days of scenario-driven evolutionary reconstructions.« less
Prediction of virus-host protein-protein interactions mediated by short linear motifs.

PubMed

Becerra, Andrés; Bucheli, Victor A; Moreno, Pedro A

2017-03-09

Short linear motifs in host organisms proteins can be mimicked by viruses to create protein-protein interactions that disable or control metabolic pathways. Given that viral linear motif instances of host motif regular expressions can be found by chance, it is necessary to develop filtering methods of functional linear motifs. We conduct a systematic comparison of linear motifs filtering methods to develop a computational approach for predicting motif-mediated protein-protein interactions between human and the human immunodeficiency virus 1 (HIV-1). We implemented three filtering methods to obtain linear motif sets: 1) conserved in viral proteins (C), 2) located in disordered regions (D) and 3) rare or scarce in a set of randomized viral sequences (R). The sets C,D,R are united and intersected. The resulting sets are compared by the number of protein-protein interactions correctly inferred with them - with experimental validation. The comparison is done with HIV-1 sequences and interactions from the National Institute of Allergy and Infectious Diseases (NIAID). The number of correctly inferred interactions allows to rank the interactions by the sets used to deduce them: D∪R and C. The ordering of the sets is descending on the probability of capturing functional interactions. With respect to HIV-1, the sets C∪R, D∪R, C∪D∪R infer all known interactions between HIV1 and human proteins mediated by linear motifs. We found that the majority of conserved linear motifs in the virus are located in disordered regions. We have developed a method for predicting protein-protein interactions mediated by linear motifs between HIV-1 and human proteins. The method only use protein sequences as inputs. We can extend the software developed to any other eukaryotic virus and host in order to find and rank candidate interactions. In future works we will use it to explore possible viral attack mechanisms based on linear motif mimicry.
Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

PubMed

Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

2018-02-01

A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R 2 ), using R 2 as the primary metric of assay agreement. However, the use of R 2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ)

PubMed Central

Mascher, Martin; Muehlbauer, Gary J; Rokhsar, Daniel S; Chapman, Jarrod; Schmutz, Jeremy; Barry, Kerrie; Muñoz-Amatriaín, María; Close, Timothy J; Wise, Roger P; Schulman, Alan H; Himmelbach, Axel; Mayer, Klaus FX; Scholz, Uwe; Poland, Jesse A; Stein, Nils; Waugh, Robbie

2013-01-01

Next-generation whole-genome shotgun assemblies of complex genomes are highly useful, but fail to link nearby sequence contigs with each other or provide a linear order of contigs along individual chromosomes. Here, we introduce a strategy based on sequencing progeny of a segregating population that allows de novo production of a genetically anchored linear assembly of the gene space of an organism. We demonstrate the power of the approach by reconstructing the chromosomal organization of the gene space of barley, a large, complex and highly repetitive 5.1 Gb genome. We evaluate the robustness of the new assembly by comparison to a recently released physical and genetic framework of the barley genome, and to various genetically ordered sequence-based genotypic datasets. The method is independent of the need for any prior sequence resources, and will enable rapid and cost-efficient establishment of powerful genomic information for many species. PMID:23998490
Sequence comparison alignment-free approach based on suffix tree and L-words frequency.

PubMed

Soares, Inês; Goios, Ana; Amorim, António

2012-01-01

The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts by a computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L-L-words--in each sequence is rapidly calculated. Based on the L-words frequency profile of each sequence, a pairwise standard Euclidean distance is then computed producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures to the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in Python language and is freely available on the web.
On the normalization of the minimum free energy of RNAs by sequence length.

PubMed

Trotta, Edoardo

2014-01-01

The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size.
On the Normalization of the Minimum Free Energy of RNAs by Sequence Length

PubMed Central

Trotta, Edoardo

2014-01-01

The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size. PMID:25405875
Breaking the computational barriers of pairwise genome comparison.

PubMed

Torreno, Oscar; Trelles, Oswaldo

2015-08-11

Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community. We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, being comparable or faster than the current state-of-the-art methods. We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility. Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.
CAMELOT: A machine learning approach for coarse-grained simulations of aggregation of block-copolymeric protein sequences

PubMed Central

Ruff, Kiersten M.; Harmon, Tyler S.; Pappu, Rohit V.

2015-01-01

We report the development and deployment of a coarse-graining method that is well suited for computer simulations of aggregation and phase separation of protein sequences with block-copolymeric architectures. Our algorithm, named CAMELOT for Coarse-grained simulations Aided by MachinE Learning Optimization and Training, leverages information from converged all atom simulations that is used to determine a suitable resolution and parameterize the coarse-grained model. To parameterize a system-specific coarse-grained model, we use a combination of Boltzmann inversion, non-linear regression, and a Gaussian process Bayesian optimization approach. The accuracy of the coarse-grained model is demonstrated through direct comparisons to results from all atom simulations. We demonstrate the utility of our coarse-graining approach using the block-copolymeric sequence from the exon 1 encoded sequence of the huntingtin protein. This sequence comprises of 17 residues from the N-terminal end of huntingtin (N17) followed by a polyglutamine (polyQ) tract. Simulations based on the CAMELOT approach are used to show that the adsorption and unfolding of the wild type N17 and its sequence variants on the surface of polyQ tracts engender a patchy colloid like architecture that promotes the formation of linear aggregates. These results provide a plausible explanation for experimental observations, which show that N17 accelerates the formation of linear aggregates in block-copolymeric N17-polyQ sequences. The CAMELOT approach is versatile and is generalizable for simulating the aggregation and phase behavior of a range of block-copolymeric protein sequences. PMID:26723608
Comparison of sequence of trunk and arm motions between short and long official distance groups in javelin throwing.

PubMed

Liu, Hui; Leigh, Steve; Yu, Bing

2014-03-01

The purpose of this study was to determine the effects of sequences of the trunk and arm angular motions on the performance of javelin throwing. In this study, 32 male and 30 female elite javelin throwers participated and were separated into a short official distance group or a long official distance group in each gender. Three-dimensional coordinates of 21 body landmarks and 3 marks on the javelin in the best trial were collected for each subject. Joint center linear velocities and selected trunk and arm segment and joint angles and angular velocities were calculated. The times of the initiations of the selected segment and joint angular motions and maximum angular velocities were determined. The sequences of the initiations of the selected segment and joint angular motions and maximum angular velocities were compared between short and long official distance groups and between genders. The results demonstrated that short and long official distance groups employed similar sequences of the trunk and arm motions. Male and female javelin throwers employed different sequences of the trunk and arm motions. The sequences of the trunk and arm motions were different from those of the maximal joint center linear velocities.
Zseq: An Approach for Preprocessing Next-Generation Sequencing Data.

PubMed

Alkhateeb, Abedalrhman; Rueda, Luis

2017-08-01

Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score and also takes into the account other factors such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters those with a z-score less than the user-defined threshold. Zseq algorithm is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, de novo assembled transcripts from the reads filtered by Zseq have longer genomic sequences than other tested methods. Estimating the threshold of the cutoff point is introduced using labeling rules with optimistic results.

Complete Genome Sequence of ER2796, a DNA Methyltransferase-Deficient Strain of Escherichia coli K-12.

PubMed

Anton, Brian P; Mongodin, Emmanuel F; Agrawal, Sonia; Fomenkov, Alexey; Byrd, Devon R; Roberts, Richard J; Raleigh, Elisabeth A

2015-01-01

We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems.
Complete Genome Sequence of ER2796, a DNA Methyltransferase-Deficient Strain of Escherichia coli K-12

PubMed Central

Anton, Brian P.; Mongodin, Emmanuel F.; Agrawal, Sonia; Fomenkov, Alexey; Byrd, Devon R.; Roberts, Richard J.; Raleigh, Elisabeth A.

2015-01-01

We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems. PMID:26010885
Identification and characterization of two bisabolene synthases from linear glandular trichomes of sunflower (Helianthus annuus L., Asteraceae).

PubMed

Aschenbrenner, Anna-Katharina; Kwon, Moonhyuk; Conrad, Jürgen; Ro, Dae-Kyun; Spring, Otmar

2016-04-01

Sunflower is known to produce a variety of bisabolene-type sesquiterpenes and accumulates these substances in trichomes of leaves, stems and flowering parts. A bioinformatics approach was used to identify the enzyme responsible for the initial step in the biosynthesis of these compounds from its precursor farnesyl pyrophosphate. Based on sequence similarity with a known bisabolene synthases from Arabidopsis thaliana AtTPS12, candidate genes of Helianthus were searched in EST-database and used to design specific primers. PCR experiments identified two candidates in the RNA pool of linear glandular trichomes of sunflower. Their sequences contained the typical motifs of sesquiterpene synthases and their expression in yeast functionally characterized them as bisabolene synthases. Spectroscopic analysis identified the stereochemistry of the product of both enzymes as (Z)-γ-bisabolene. The origin of the two sunflower bisabolene synthase genes from the transcripts of linear trichomes indicates that they may be involved in the synthesis of sesquiterpenes produced in these trichomes. Comparison of the amino acid sequences of the sunflower bisabolene synthases showed high similarity with sesquiterpene synthases from other Asteracean species and indicated putative evolutionary origin from a β-farnesene synthase. Copyright © 2016 Elsevier Ltd. All rights reserved.
A model-adaptivity method for the solution of Lennard-Jones based adhesive contact problems

NASA Astrophysics Data System (ADS)

Ben Dhia, Hachmi; Du, Shuimiao

2018-05-01

The surface micro-interaction model of Lennard-Jones (LJ) is used for adhesive contact problems (ACP). To address theoretical and numerical pitfalls of this model, a sequence of partitions of contact models is adaptively constructed to both extend and approximate the LJ model. It is formed by a combination of the LJ model with a sequence of shifted-Signorini (or, alternatively, -Linearized-LJ) models, indexed by a shift parameter field. For each model of this sequence, a weak formulation of the associated local ACP is developed. To track critical localized adhesive areas, a two-step strategy is developed: firstly, a macroscopic frictionless (as first approach) linear-elastic contact problem is solved once to detect contact separation zones. Secondly, at each shift-adaptive iteration, a micro-macro ACP is re-formulated and solved within the multiscale Arlequin framework, with significant reduction of computational costs. Comparison of our results with available analytical and numerical solutions shows the effectiveness of our global strategy.
Structure-preserving interpolation of temporal and spatial image sequences using an optical flow-based method.

PubMed

Ehrhardt, J; Säring, D; Handels, H

2007-01-01

Modern tomographic imaging devices enable the acquisition of spatial and temporal image sequences. But, the spatial and temporal resolution of such devices is limited and therefore image interpolation techniques are needed to represent images at a desired level of discretization. This paper presents a method for structure-preserving interpolation between neighboring slices in temporal or spatial image sequences. In a first step, the spatiotemporal velocity field between image slices is determined using an optical flow-based registration method in order to establish spatial correspondence between adjacent slices. An iterative algorithm is applied using the spatial and temporal image derivatives and a spatiotemporal smoothing step. Afterwards, the calculated velocity field is used to generate an interpolated image at the desired time by averaging intensities between corresponding points. Three quantitative measures are defined to evaluate the performance of the interpolation method. The behavior and capability of the algorithm is demonstrated by synthetic images. A population of 17 temporal and spatial image sequences are utilized to compare the optical flow-based interpolation method to linear and shape-based interpolation. The quantitative results show that the optical flow-based method outperforms the linear and shape-based interpolation statistically significantly. The interpolation method presented is able to generate image sequences with appropriate spatial or temporal resolution needed for image comparison, analysis or visualization tasks. Quantitative and qualitative measures extracted from synthetic phantoms and medical image data show that the new method definitely has advantages over linear and shape-based interpolation.
A 3D sequence-independent representation of the protein data bank.

PubMed

Fischer, D; Tsai, C J; Nussinov, R; Wolfson, H

1995-10-01

Here we address the following questions. How many structurally different entries are there in the Protein Data Bank (PDB)? How do the proteins populate the structural universe? To investigate these questions a structurally non-redundant set of representative entries was selected from the PDB. Construction of such a dataset is not trivial: (i) the considerable size of the PDB requires a large number of comparisons (there were more than 3250 structures of protein chains available in May 1994); (ii) the PDB is highly redundant, containing many structurally similar entries, not necessarily with significant sequence homology, and (iii) there is no clear-cut definition of structural similarity. The latter depend on the criteria and methods used. Here, we analyze structural similarity ignoring protein topology. To date, representative sets have been selected either by hand, by sequence comparison techniques which ignore the three-dimensional (3D) structures of the proteins or by using sequence comparisons followed by linear structural comparison (i.e. the topology, or the sequential order of the chains, is enforced in the structural comparison). Here we describe a 3D sequence-independent automated and efficient method to obtain a representative set of protein molecules from the PDB which contains all unique structures and which is structurally non-redundant. The method has two novel features. The first is the use of strictly structural criteria in the selection process without taking into account the sequence information. To this end we employ a fast structural comparison algorithm which requires on average approximately 2 s per pairwise comparison on a workstation. The second novel feature is the iterative application of a heuristic clustering algorithm that greatly reduces the number of comparisons required. We obtain a representative set of 220 chains with resolution better than 3.0 A, or 268 chains including lower resolution entries, NMR entries and models. The resulting set can serve as a basis for extensive structural classification and studies of 3D recurring motifs and of sequence-structure relationships. The clustering algorithm succeeds in classifying into the same structural family chains with no significant sequence homology, e.g. all the globins in one single group, all the trypsin-like serine proteases in another or all the immunoglobulin-like folds into a third. In addition, unexpected structural similarities of interest have been automatically detected between pairs of chains. A cluster analysis of the representative structures demonstrates the way the "structural universe' is populated.
Quantitative comparison between a multiecho sequence and a single-echo sequence for susceptibility-weighted phase imaging.

PubMed

Gilbert, Guillaume; Savard, Geneviève; Bard, Céline; Beaudoin, Gilles

2012-06-01

The aim of this study was to investigate the benefits arising from the use of a multiecho sequence for susceptibility-weighted phase imaging using a quantitative comparison with a standard single-echo acquisition. Four healthy adult volunteers were imaged on a clinical 3-T system using a protocol comprising two different three-dimensional susceptibility-weighted gradient-echo sequences: a standard single-echo sequence and a multiecho sequence. Both sequences were repeated twice in order to evaluate the local noise contribution by a subtraction of the two acquisitions. For the multiecho sequence, the phase information from each echo was independently unwrapped, and the background field contribution was removed using either homodyne filtering or the projection onto dipole fields method. The phase information from all echoes was then combined using a weighted linear regression. R2 maps were also calculated from the multiecho acquisitions. The noise standard deviation in the reconstructed phase images was evaluated for six manually segmented regions of interest (frontal white matter, posterior white matter, globus pallidus, putamen, caudate nucleus and lateral ventricle). The use of the multiecho sequence for susceptibility-weighted phase imaging led to a reduction of the noise standard deviation for all subjects and all regions of interest investigated in comparison to the reference single-echo acquisition. On average, the noise reduction ranged from 18.4% for the globus pallidus to 47.9% for the lateral ventricle. In addition, the amount of noise reduction was found to be strongly inversely correlated to the estimated R2 value (R=-0.92). In conclusion, the use of a multiecho sequence is an effective way to decrease the noise contribution in susceptibility-weighted phase images, while preserving both contrast and acquisition time. The proposed approach additionally permits the calculation of R2 maps. Copyright © 2012 Elsevier Inc. All rights reserved.
Zero entropy continuous interval maps and MMLS-MMA property

NASA Astrophysics Data System (ADS)

Jiang, Yunping

2018-06-01

We prove that the flow generated by any continuous interval map with zero topological entropy is minimally mean-attractable and minimally mean-L-stable. One of the consequences is that any oscillating sequence is linearly disjoint from all flows generated by all continuous interval maps with zero topological entropy. In particular, the Möbius function is linearly disjoint from all flows generated by all continuous interval maps with zero topological entropy (Sarnak’s conjecture for continuous interval maps). Another consequence is a non-trivial example of a flow having discrete spectrum. We also define a log-uniform oscillating sequence and show a result in ergodic theory for comparison. This material is based upon work supported by the National Science Foundation. It is also partially supported by a collaboration grant from the Simons Foundation (grant number 523341) and PSC-CUNY awards and a grant from NSFC (grant number 11571122).
Assessing pooled BAC and whole genome shotgun strategies for assembly of complex genomes.

PubMed

Haiminen, Niina; Feltus, F Alex; Parida, Laxmi

2011-04-15

We investigate if pooling BAC clones and sequencing the pools can provide for more accurate assembly of genome sequences than the "whole genome shotgun" (WGS) approach. Furthermore, we quantify this accuracy increase. We compare the pooled BAC and WGS approaches using in silico simulations. Standard measures of assembly quality focus on assembly size and fragmentation, which are desirable for large whole genome assemblies. We propose additional measures enabling easy and visual comparison of assembly quality, such as rearrangements and redundant sequence content, relative to the known target sequence. The best assembly quality scores were obtained using 454 coverage of 15× linear and 5× paired (3kb insert size) reads (15L-5P) on Arabidopsis. This regime gave similarly good results on four additional plant genomes of very different GC and repeat contents. BAC pooling improved assembly scores over WGS assembly, coverage and redundancy scores improving the most. BAC pooling works better than WGS, however, both require a physical map to order the scaffolds. Pool sizes up to 12Mbp work well, suggesting this pooling density to be effective in medium-scale re-sequencing applications such as targeted sequencing of QTL intervals for candidate gene discovery. Assuming the current Roche/454 Titanium sequencing limitations, a 12 Mbp region could be re-sequenced with a full plate of linear reads and a half plate of paired-end reads, yielding 15L-5P coverage after read pre-processing. Our simulation suggests that massively over-sequencing may not improve accuracy. Our scoring measures can be used generally to evaluate and compare results of simulated genome assemblies.
Accuracy for detection of simulated lesions: comparison of fluid-attenuated inversion-recovery, proton density--weighted, and T2-weighted synthetic brain MR imaging

NASA Technical Reports Server (NTRS)

Herskovits, E. H.; Itoh, R.; Melhem, E. R.

2001-01-01

OBJECTIVE: The objective of our study was to determine the effects of MR sequence (fluid-attenuated inversion-recovery [FLAIR], proton density--weighted, and T2-weighted) and of lesion location on sensitivity and specificity of lesion detection. MATERIALS AND METHODS: We generated FLAIR, proton density-weighted, and T2-weighted brain images with 3-mm lesions using published parameters for acute multiple sclerosis plaques. Each image contained from zero to five lesions that were distributed among cortical-subcortical, periventricular, and deep white matter regions; on either side; and anterior or posterior in position. We presented images of 540 lesions, distributed among 2592 image regions, to six neuroradiologists. We constructed a contingency table for image regions with lesions and another for image regions without lesions (normal). Each table included the following: the reviewer's number (1--6); the MR sequence; the side, position, and region of the lesion; and the reviewer's response (lesion present or absent [normal]). We performed chi-square and log-linear analyses. RESULTS: The FLAIR sequence yielded the highest true-positive rates (p < 0.001) and the highest true-negative rates (p < 0.001). Regions also differed in reviewers' true-positive rates (p < 0.001) and true-negative rates (p = 0.002). The true-positive rate model generated by log-linear analysis contained an additional sequence-location interaction. The true-negative rate model generated by log-linear analysis confirmed these associations, but no higher order interactions were added. CONCLUSION: We developed software with which we can generate brain images of a wide range of pulse sequences and that allows us to specify the location, size, shape, and intrinsic characteristics of simulated lesions. We found that the use of FLAIR sequences increases detection accuracy for cortical-subcortical and periventricular lesions over that associated with proton density- and T2-weighted sequences.
Vander Lugt correlation of DNA sequence data

NASA Astrophysics Data System (ADS)

Christens-Barry, William A.; Hawk, James F.; Martin, James C.

1990-12-01

DNA, the molecule containing the genetic code of an organism, is a linear chain of subunits. It is the sequence of subunits, of which there are four kinds, that constitutes the unique blueprint of an individual. This sequence is the focus of a large number of analyses performed by an army of geneticists, biologists, and computer scientists. Most of these analyses entail searches for specific subsequences within the larger set of sequence data. Thus, most analyses are essentially pattern recognition or correlation tasks. Yet, there are special features to such analysis that influence the strategy and methods of an optical pattern recognition approach. While the serial processing employed in digital electronic computers remains the main engine of sequence analyses, there is no fundamental reason that more efficient parallel methods cannot be used. We describe an approach using optical pattern recognition (OPR) techniques based on matched spatial filtering. This allows parallel comparison of large blocks of sequence data. In this study we have simulated a Vander Lugt1 architecture implementing our approach. Searches for specific target sequence strings within a block of DNA sequence from the Co/El plasmid2 are performed.
A Constraint Generation Approach to Learning Stable Linear Dynamical Systems

DTIC Science & Technology

2008-01-01

task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug-sales data. The constraint generation approach...previous methods in our experiments. One application of LDSs in computer vision is learning dynamic textures from video data [8]. An advantage of...over-the-counter (OTC) drug sales for biosurveillance , and sunspot numbers from the UCR archive [9]. Comparison to the best alternative methods [7, 10
Assessing pooled BAC and whole genome shotgun strategies for assembly of complex genomes

PubMed Central

2011-01-01

Background We investigate if pooling BAC clones and sequencing the pools can provide for more accurate assembly of genome sequences than the "whole genome shotgun" (WGS) approach. Furthermore, we quantify this accuracy increase. We compare the pooled BAC and WGS approaches using in silico simulations. Standard measures of assembly quality focus on assembly size and fragmentation, which are desirable for large whole genome assemblies. We propose additional measures enabling easy and visual comparison of assembly quality, such as rearrangements and redundant sequence content, relative to the known target sequence. Results The best assembly quality scores were obtained using 454 coverage of 15× linear and 5× paired (3kb insert size) reads (15L-5P) on Arabidopsis. This regime gave similarly good results on four additional plant genomes of very different GC and repeat contents. BAC pooling improved assembly scores over WGS assembly, coverage and redundancy scores improving the most. Conclusions BAC pooling works better than WGS, however, both require a physical map to order the scaffolds. Pool sizes up to 12Mbp work well, suggesting this pooling density to be effective in medium-scale re-sequencing applications such as targeted sequencing of QTL intervals for candidate gene discovery. Assuming the current Roche/454 Titanium sequencing limitations, a 12 Mbp region could be re-sequenced with a full plate of linear reads and a half plate of paired-end reads, yielding 15L-5P coverage after read pre-processing. Our simulation suggests that massively over-sequencing may not improve accuracy. Our scoring measures can be used generally to evaluate and compare results of simulated genome assemblies. PMID:21496274
Evaluating whole transcriptome amplification for gene profiling experiments using RNA-Seq.

PubMed

Faherty, Sheena L; Campbell, C Ryan; Larsen, Peter A; Yoder, Anne D

2015-07-30

RNA-Seq has enabled high-throughput gene expression profiling to provide insight into the functional link between genotype and phenotype. Low quantities of starting RNA can be a severe hindrance for studies that aim to utilize RNA-Seq. To mitigate this bottleneck, whole transcriptome amplification (WTA) technologies have been developed to generate sufficient sequencing targets from minute amounts of RNA. Successful WTA requires accurate replication of transcript abundance without the loss or distortion of specific mRNAs. Here, we test the efficacy of NuGEN's Ovation RNA-Seq V2 system, which uses linear isothermal amplification with a unique chimeric primer for amplification, using white adipose tissue from standard laboratory rats (Rattus norvegicus). Our goal was to investigate potential biological artifacts introduced through WTA approaches by establishing comparisons between matched raw and amplified RNA libraries derived from biological replicates. We found that 93% of expressed genes were identical between all unamplified versus matched amplified comparisons, also finding that gene density is similar across all comparisons. Our sequencing experiment and downstream bioinformatic analyses using the Tuxedo analysis pipeline resulted in the assembly of 25,543 high-quality transcripts. Libraries constructed from raw RNA and WTA samples averaged 15,298 and 15,253 expressed genes, respectively. Although significant differentially expressed genes (P < 0.05) were identified in all matched samples, each of these represents less than 0.15% of all shared genes for each comparison. Transcriptome amplification is efficient at maintaining relative transcript frequencies with no significant bias when using this NuGEN linear isothermal amplification kit under ideal laboratory conditions as presented in this study. This methodology has broad applications, from clinical and diagnostic, to field-based studies when sample acquisition, or sample preservation, methods prove challenging.
Computational model for simulation of sequences of helicity and angular momentum transfer in turbid tissue-like scattering medium (Conference Presentation)

NASA Astrophysics Data System (ADS)

Doronin, Alexander; Meglinski, Igor

2017-02-01

Current report considers development of a unified Monte Carlo (MC) -based computational model for simulation of propagation of Laguerre-Gaussian (LG) beams in turbid tissue-like scattering medium. With a primary goal to proof the concept of using complex light for tissue diagnosis we explore propagation of LG beams in comparison with Gaussian beams for both linear and circular polarization. MC simulations of radially and azimuthally polarized LG beams in turbid media have been performed, classic phenomena such as preservation of the orbital angular momentum, optical memory and helicity flip are observed, detailed comparison is presented and discussed.
Mum, why do you keep on growing? Impacts of environmental variability on optimal growth and reproduction allocation strategies of annual plants.

PubMed

De Lara, Michel

2006-05-01

In their 1990 paper Optimal reproductive efforts and the timing of reproduction of annual plants in randomly varying environments, Amir and Cohen considered stochastic environments consisting of i.i.d. sequences in an optimal allocation discrete-time model. We suppose here that the sequence of environmental factors is more generally described by a Markov chain. Moreover, we discuss the connection between the time interval of the discrete-time dynamic model and the ability of the plant to rebuild completely its vegetative body (from reserves). We formulate a stochastic optimization problem covering the so-called linear and logarithmic fitness (corresponding to variation within and between years), which yields optimal strategies. For "linear maximizers'', we analyse how optimal strategies depend upon the environmental variability type: constant, random stationary, random i.i.d., random monotonous. We provide general patterns in terms of targets and thresholds, including both determinate and indeterminate growth. We also provide a partial result on the comparison between ;"linear maximizers'' and "log maximizers''. Numerical simulations are provided, allowing to give a hint at the effect of different mathematical assumptions.
ELM server: a new resource for investigating short functional sites in modular eukaryotic proteins

PubMed Central

Puntervoll, Pål; Linding, Rune; Gemünd, Christine; Chabanis-Davidson, Sophie; Mattingsdal, Morten; Cameron, Scott; Martin, David M. A.; Ausiello, Gabriele; Brannetti, Barbara; Costantini, Anna; Ferrè, Fabrizio; Maselli, Vincenza; Via, Allegra; Cesareni, Gianni; Diella, Francesca; Superti-Furga, Giulio; Wyrwicz, Lucjan; Ramu, Chenna; McGuigan, Caroline; Gudavalli, Rambabu; Letunic, Ivica; Bork, Peer; Rychlewski, Leszek; Küster, Bernhard; Helmer-Citterich, Manuela; Hunter, William N.; Aasland, Rein; Gibson, Toby J.

2003-01-01

Multidomain proteins predominate in eukaryotic proteomes. Individual functions assigned to different sequence segments combine to create a complex function for the whole protein. While on-line resources are available for revealing globular domains in sequences, there has hitherto been no comprehensive collection of small functional sites/motifs comparable to the globular domain resources, yet these are as important for the function of multidomain proteins. Short linear peptide motifs are used for cell compartment targeting, protein–protein interaction, regulation by phosphorylation, acetylation, glycosylation and a host of other post-translational modifications. ELM, the Eukaryotic Linear Motif server at http://elm.eu.org/, is a new bioinformatics resource for investigating candidate short non-globular functional motifs in eukaryotic proteins, aiming to fill the void in bioinformatics tools. Sequence comparisons with short motifs are difficult to evaluate because the usual significance assessments are inappropriate. Therefore the server is implemented with several logical filters to eliminate false positives. Current filters are for cell compartment, globular domain clash and taxonomic range. In favourable cases, the filters can reduce the number of retained matches by an order of magnitude or more. PMID:12824381
Exploring representations of protein structure for automated remote homology detection and mapping of protein structure space

PubMed Central

2014-01-01

Background Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported from analysis in this paper that such maps preserve functional co-localization of the protein structure space. Methods Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership. Conclusions This work opens exciting venues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993
Comparison of impedimetric detection of DNA hybridization on the various biosensors based on modified glassy carbon electrodes with PANHS and nanomaterials of RGO and MWCNTs.

PubMed

Benvidi, Ali; Tezerjani, Marzieh Dehghan; Jahanbani, Shahriar; Mazloum Ardakani, Mohammad; Moshtaghioun, Seyed Mohammad

2016-01-15

In this research, we have developed lable free DNA biosensors based on modified glassy carbon electrodes (GCE) with reduced graphene oxide (RGO) and carbon nanotubes (MWCNTs) for detection of DNA sequences. This paper compares the detection of BRCA1 5382insC mutation using independent glassy carbon electrodes (GCE) modified with RGO and MWCNTs. A probe (BRCA1 5382insC mutation detection (ssDNA)) was then immobilized on the modified electrodes for a specific time. The immobilization of the probe and its hybridization with the target DNA (Complementary DNA) were performed under optimum conditions using different electrochemical techniques such as cyclic voltammetry (CV) and electrochemical impedance spectroscopy (EIS). The proposed biosensors were used for determination of complementary DNA sequences. The non-modified DNA biosensor (1-pyrenebutyric acid-N- hydroxysuccinimide ester (PANHS)/GCE), revealed a linear relationship between ∆Rct and logarithm of the complementary target DNA concentration ranging from 1.0×10(-16)molL(-1) to 1.0×10(-10)mol L(-1) with a correlation coefficient of 0.992, for DNA biosensors modified with multi-wall carbon nanotubes (MWCNTs) and reduced graphene oxide (RGO) wider linear range and lower detection limit were obtained. For ssDNA/PANHS/MWCNTs/GCE a linear range 1.0×10(-17)mol L(-1)-1.0×10(-10)mol L(-1) with a correlation coefficient of 0.993 and for ssDNA/PANHS/RGO/GCE a linear range from 1.0×10(-18)mol L(-1) to 1.0×10(-10)mol L(-1) with a correlation coefficient of 0.985 were obtained. In addition, the mentioned biosensors were satisfactorily applied for discriminating of complementary sequences from noncomplementary sequences, so the mentioned biosensors can be used for the detection of BRCA1-associated breast cancer. Copyright © 2015. Published by Elsevier B.V.
Infrared space observatory photometry of circumstellar dust in Vega-type systems

NASA Technical Reports Server (NTRS)

Fajardo-Acosta, S. B.; Stencel, R. E.; Backman, D. E.; Thakur, N.

1998-01-01

The ISOPHOT (Infrared Space Observatory Photometry) instrument onboard the Infrared Space Observatory (ISO) was used to obtain 3.6-90 micron photometry of Vega-type systems. Photometric data were calibrated with the ISOPHOT fine calibration source 1 (FCS1). Linear regression was used to derive transformations to make comparisons to ground-based and IRAS photometry systems possible. These transformations were applied to the photometry of 14 main-sequence stars. Details of these results are reported on.

An Inquiry-Oriented Approach to Span and Linear Independence: The Case of the Magic Carpet Ride Sequence

ERIC Educational Resources Information Center

Wawro, Megan; Rasmussen, Chris; Zandieh, Michelle; Sweeney, George Franklin; Larson, Christine

2012-01-01

In this paper we present an innovative instructional sequence for an introductory linear algebra course that supports students' reinvention of the concepts of span, linear dependence, and linear independence. Referred to as the Magic Carpet Ride sequence, the problems begin with an imaginary scenario that allows students to build rich imagery and…
[Application of second generation dual-source computed tomography dual-energy scan mode in detecting pancreatic adenocarcinoma].

PubMed

Xue, Hua-dan; Liu, Wei; Sun, Hao; Wang, Xuan; Chen, Yu; Su, Bai-yan; Sun, Zhao-yong; Chen, Fang; Jin, Zheng-yu

2010-12-01

To analyze the clinical value of multiple sequences derived from dual-source computed tomography (DSCT) dual-energy scan mode in detecting pancreatic adenocarcinoma. Totally 23 patients with clinically or pathologically diagnosed pancreatic cancer were enrolled in this retrospective study. DSCT (Definition Flash) was used and dual-energy scan mode was used in their pancreatic parenchyma phase scan (100kVp/230mAs and Sn140kVp/178mAs) . Mono-energetic 60kev, mono-energetic 80kev, mono-energetic 100kev, mono-energetic 120kev, linear blend image, non-linear blend image, and iodine map were acquired. pancreatic parenchyma-tumor CT value difference, ratio of tumor to pancreatic parenchyma, and pancreatic parenchyma-tumor contrast to noise ratio were calculated. One-way ANOVA was used for the comparison of diagnostic values of the above eight different dual-energy derived sequences for pancreatic cancer. The pancreatic parenchyma-tumor CT value difference, ratio of tumor to pancreatic parenchyma, and pancreatic parenchyma-tumor contrast to noise ratio were significantly different among eight sequences (P<0.05) . Mono-energetic 60kev image showed the largest parenchyma-tumor CT value [ (77.53 ± 23.42) HU] , and iodine map showed the lowest tumor/parenchyma enhancement ratio (0.39?0.12) and the largest contrast to noise ratio (4.08 ± 1.46) . Multiple sequences can be derived from dual-energy scan mode with DSCT via multiple post-processing methods. Integration of these sequences may further improve the sensitivity of the multislice spiral CT in the diagnosis of pancreatic cancer.
Several Families of Sequences with Low Correlation and Large Linear Span

NASA Astrophysics Data System (ADS)

Zeng, Fanxin; Zhang, Zhenyu

In DS-CDMA systems and DS-UWB radios, low correlation of spreading sequences can greatly help to minimize multiple access interference (MAI) and large linear span of spreading sequences can reduce their predictability. In this letter, new sequence sets with low correlation and large linear span are proposed. Based on the construction Trm1[Trnm(αbt+γiαdt)]r for generating p-ary sequences of period pn-1, where n=2m, d=upm±v, b=u±v, γi∈GF(pn), and p is an arbitrary prime number, several methods to choose the parameter d are provided. The obtained sequences with family size pn are of four-valued, five-valued, six-valued or seven-valued correlation and the maximum nontrivial correlation value is (u+v-1)pm-1. The simulation by a computer shows that the linear span of the new sequences is larger than that of the sequences with Niho-type and Welch-type decimations, and similar to that of [10].
Recognizing short coding sequences of prokaryotic genome using a novel iteratively adaptive sparse partial least squares algorithm

PubMed Central

2013-01-01

Background Significant efforts have been made to address the problem of identifying short genes in prokaryotic genomes. However, most known methods are not effective in detecting short genes. Because of the limited information contained in short DNA sequences, it is very difficult to accurately distinguish between protein coding and non-coding sequences in prokaryotic genomes. We have developed a new Iteratively Adaptive Sparse Partial Least Squares (IASPLS) algorithm as the classifier to improve the accuracy of the identification process. Results For testing, we chose the short coding and non-coding sequences from seven prokaryotic organisms. We used seven feature sets (including GC content, Z-curve, etc.) of short genes. In comparison with GeneMarkS, Metagene, Orphelia, and Heuristic Approachs methods, our model achieved the best prediction performance in identification of short prokaryotic genes. Even when we focused on the very short length group ([60–100 nt)), our model provided sensitivity as high as 83.44% and specificity as high as 92.8%. These values are two or three times higher than three of the other methods while Metagene fails to recognize genes in this length range. The experiments also proved that the IASPLS can improve the identification accuracy in comparison with other widely used classifiers, i.e. Logistic, Random Forest (RF) and K nearest neighbors (KNN). The accuracy in using IASPLS was improved 5.90% or more in comparison with the other methods. In addition to the improvements in accuracy, IASPLS required ten times less computer time than using KNN or RF. Conclusions It is conclusive that our method is preferable for application as an automated method of short gene classification. Its linearity and easily optimized parameters make it practicable for predicting short genes of newly-sequenced or under-studied species. Reviewers This article was reviewed by Alexey Kondrashov, Rajeev Azad (nominated by Dr J.Peter Gogarten) and Yuriy Fofanov (nominated by Dr Janet Siefert). PMID:24067167
On a comparison of two schemes in sequential data assimilation

NASA Astrophysics Data System (ADS)

Grishina, Anastasiia A.; Penenko, Alexey V.

2017-11-01

This paper is focused on variational data assimilation as an approach to mathematical modeling. Realization of the approach requires a sequence of connected inverse problems with different sets of observational data to be solved. Two variational data assimilation schemes, "implicit" and "explicit", are considered in the article. Their equivalence is shown and the numerical results are given on a basis of non-linear Robertson system. To avoid the "inverse problem crime" different schemes were used to produce synthetic measurement and to solve the data assimilation problem.
Sensitivity analysis of linear CROW gyroscopes and comparison to a single-resonator gyroscope

NASA Astrophysics Data System (ADS)

Zamani-Aghaie, Kiarash; Digonnet, Michel J. F.

2013-03-01

This study presents numerical simulations of the maximum sensitivity to absolute rotation of a number of coupled resonator optical waveguide (CROW) gyroscopes consisting of a linear array of coupled ring resonators. It examines in particular the impact on the maximum sensitivity of the number of rings, of the relative spatial orientation of the rings (folded and unfolded), of various sequences of coupling ratios between the rings and various sequences of ring dimensions, and of the number of input/output waveguides (one or two) used to inject and collect the light. In all configurations the sensitivity is maximized by proper selection of the coupling ratio(s) and phase bias, and compared to the maximum sensitivity of a resonant waveguide optical gyroscope (RWOG) utilizing a single ring-resonator waveguide with the same radius and loss as each ring in the CROW. Simulations show that although some configurations are more sensitive than others, in spite of numerous claims to the contrary made in the literature, in all configurations the maximum sensitivity is independent of the number of rings, and does not exceed the maximum sensitivity of an RWOG. There are no sensitivity benefits to utilizing any of these linear CROWs for absolute rotation sensing. For equal total footprint, an RWOG is √N times more sensitive, and it is easier to fabricate and stabilize.
Investigation and appreciation of optimal output feedback. Volume 1: A convergent algorithm for the stochastic infinite-time discrete optimal output feedback problem

NASA Technical Reports Server (NTRS)

Halyo, N.; Broussard, J. R.

1984-01-01

The stochastic, infinite time, discrete output feedback problem for time invariant linear systems is examined. Two sets of sufficient conditions for the existence of a stable, globally optimal solution are presented. An expression for the total change in the cost function due to a change in the feedback gain is obtained. This expression is used to show that a sequence of gains can be obtained by an algorithm, so that the corresponding cost sequence is monotonically decreasing and the corresponding sequence of the cost gradient converges to zero. The algorithm is guaranteed to obtain a critical point of the cost function. The computational steps necessary to implement the algorithm on a computer are presented. The results are applied to a digital outer loop flight control problem. The numerical results for this 13th order problem indicate a rate of convergence considerably faster than two other algorithms used for comparison.
Autonomous replication and addition of telomerelike sequences to DNA microinjected into Paramecium tetraurelia macronuclei.

PubMed Central

Gilley, D; Preer, J R; Aufderheide, K J; Polisky, B

1988-01-01

Paramecium tetraurelia can be transformed by microinjection of cloned serotype A gene sequences into the macronucleus. Transformants are detected by their ability to express serotype A surface antigen from the injected templates. After injection, the DNA is converted from a supercoiled form to a linear form by cleavage at nonrandom sites. The linear form appears to replicate autonomously as a unit-length molecule and is present in transformants at high copy number. The injected DNA is further processed by the addition of paramecium-type telomeric sequences to the termini of the linear DNA. To examine the fate of injected linear DNA molecules, plasmid pSA14SB DNA containing the A gene was cleaved into two linear pieces, a 14-kilobase (kb) piece containing the A gene and flanking sequences and a 2.2-kb piece consisting of the procaryotic vector. In transformants expressing the A gene, we observed that two linear DNA species were present which correspond to the two species injected. Both species had Paramecium telomerelike sequences added to their termini. For the 2.2-kb DNA, we show that the site of addition of the telomerelike sequences is directly at one terminus and within one nucleotide of the other terminus. These results indicate that injected procaryotic DNA is capable of autonomous replication in Paramecium macronuclei and that telomeric addition in the macronucleus does not require specific recognition sequences. Images PMID:3211128
Vibration-Rotation Bands of HF and DF

DTIC Science & Technology

1977-09-23

98 IZa. Comparison of Observed and Calculated Line Positions of HF, Av = I Sequence ........................... 99 f2b. Comparison of Observed and...Calculated Line Positions of HF, Av = 2 Sequence ........................... 102 12c. Comparison of Observed and Calculated Line Positions of HF, Av = 3...Sequence ........................... 107 i2d. Comparison of Observed and Calculated Line Positions ofHF, Av = 4 Sequence ........................... fi
Applications of a Sequence of Points in Teaching Linear Algebra, Numerical Methods and Discrete Mathematics

ERIC Educational Resources Information Center

Shi, Yixun

2009-01-01

Based on a sequence of points and a particular linear transformation generalized from this sequence, two recent papers (E. Mauch and Y. Shi, "Using a sequence of number pairs as an example in teaching mathematics". Math. Comput. Educ., 39 (2005), pp. 198-205; Y. Shi, "Case study projects for college mathematics courses based on a particular…
From Arithmetic Sequences to Linear Equations

ERIC Educational Resources Information Center

Matsuura, Ryota; Harless, Patrick

2012-01-01

The first part of the article focuses on deriving the essential properties of arithmetic sequences by appealing to students' sense making and reasoning. The second part describes how to guide students to translate their knowledge of arithmetic sequences into an understanding of linear equations. Ryota Matsuura originally wrote these lessons for…
GenColors: annotation and comparative genomics of prokaryotes made easy.

PubMed

Romualdi, Alessandro; Felder, Marius; Rose, Dominic; Gausmann, Ulrike; Schilhabel, Markus; Glöckner, Gernot; Platzer, Matthias; Sühnel, Jürgen

2007-01-01

GenColors (gencolors.fli-leibniz.de) is a new web-based software/database system aimed at an improved and accelerated annotation of prokaryotic genomes considering information on related genomes and making extensive use of genome comparison. It offers a seamless integration of data from ongoing sequencing projects and annotated genomic sequences obtained from GenBank. A variety of export/import filters manages an effective data flow from sequence assembly and manipulation programs (e.g., GAP4) to GenColors and back as well as to standard GenBank file(s). The genome comparison tools include best bidirectional hits, gene conservation, syntenies, and gene core sets. Precomputed UniProt matches allow annotation and analysis in an effective manner. In addition to these analysis options, base-specific quality data (coverage and confidence) can also be handled if available. The GenColors system can be used both for annotation purposes in ongoing genome projects and as an analysis tool for finished genomes. GenColors comes in two types, as dedicated genome browsers and as the Jena Prokaryotic Genome Viewer (JPGV). Dedicated genome browsers contain genomic information on a set of related genomes and offer a large number of options for genome comparison. The system has been efficiently used in the genomic sequencing of Borrelia garinii and is currently applied to various ongoing genome projects on Borrelia, Legionella, Escherichia, and Pseudomonas genomes. One of these dedicated browsers, the Spirochetes Genome Browser (sgb.fli-leibniz.de) with Borrelia, Leptospira, and Treponema genomes, is freely accessible. The others will be released after finalization of the corresponding genome projects. JPGV (jpgv.fli-leibniz.de) offers information on almost all finished bacterial genomes, as compared to the dedicated browsers with reduced genome comparison functionality, however. As of January 2006, this viewer includes 632 genomic elements (e.g., chromosomes and plasmids) of 293 species. The system provides versatile quick and advanced search options for all currently known prokaryotic genomes and generates circular and linear genome plots. Gene information sheets contain basic gene information, database search options, and links to external databases. GenColors is also available on request for local installation.
Comparison of System Identification Techniques for the Hydraulic Manipulator Test Bed (HMTB)

NASA Technical Reports Server (NTRS)

Morris, A. Terry

1996-01-01

In this thesis linear, dynamic, multivariable state-space models for three joints of the ground-based Hydraulic Manipulator Test Bed (HMTB) are identified. HMTB, housed at the NASA Langley Research Center, is a ground-based version of the Dexterous Orbital Servicing System (DOSS), a representative space station manipulator. The dynamic models of the HMTB manipulator will first be estimated by applying nonparametric identification methods to determine each joint's response characteristics using various input excitations. These excitations include sum of sinusoids, pseudorandom binary sequences (PRBS), bipolar ramping pulses, and chirp input signals. Next, two different parametric system identification techniques will be applied to identify the best dynamical description of the joints. The manipulator is localized about a representative space station orbital replacement unit (ORU) task allowing the use of linear system identification methods. Comparisons, observations, and results of both parametric system identification techniques are discussed. The thesis concludes by proposing a model reference control system to aid in astronaut ground tests. This approach would allow the identified models to mimic on-orbit dynamic characteristics of the actual flight manipulator thus providing astronauts with realistic on-orbit responses to perform space station tasks in a ground-based environment.
Linear and exponential TAIL-PCR: a method for efficient and quick amplification of flanking sequences adjacent to Tn5 transposon insertion sites.

PubMed

Jia, Xianbo; Lin, Xinjian; Chen, Jichen

2017-11-02

Current genome walking methods are very time consuming, and many produce non-specific amplification products. To amplify the flanking sequences that are adjacent to Tn5 transposon insertion sites in Serratia marcescens FZSF02, we developed a genome walking method based on TAIL-PCR. This PCR method added a 20-cycle linear amplification step before the exponential amplification step to increase the concentration of the target sequences. Products of the linear amplification and the exponential amplification were diluted 100-fold to decrease the concentration of the templates that cause non-specific amplification. Fast DNA polymerase with a high extension speed was used in this method, and an amplification program was used to rapidly amplify long specific sequences. With this linear and exponential TAIL-PCR (LETAIL-PCR), we successfully obtained products larger than 2 kb from Tn5 transposon insertion mutant strains within 3 h. This method can be widely used in genome walking studies to amplify unknown sequences that are adjacent to known sequences.
Approximation theory for LQG (Linear-Quadratic-Gaussian) optimal control of flexible structures

NASA Technical Reports Server (NTRS)

Gibson, J. S.; Adamian, A.

1988-01-01

An approximation theory is presented for the LQG (Linear-Quadratic-Gaussian) optimal control problem for flexible structures whose distributed models have bounded input and output operators. The main purpose of the theory is to guide the design of finite dimensional compensators that approximate closely the optimal compensator. The optimal LQG problem separates into an optimal linear-quadratic regulator problem and an optimal state estimation problem. The solution of the former problem lies in the solution to an infinite dimensional Riccati operator equation. The approximation scheme approximates the infinite dimensional LQG problem with a sequence of finite dimensional LQG problems defined for a sequence of finite dimensional, usually finite element or modal, approximations of the distributed model of the structure. Two Riccati matrix equations determine the solution to each approximating problem. The finite dimensional equations for numerical approximation are developed, including formulas for converting matrix control and estimator gains to their functional representation to allow comparison of gains based on different orders of approximation. Convergence of the approximating control and estimator gains and of the corresponding finite dimensional compensators is studied. Also, convergence and stability of the closed-loop systems produced with the finite dimensional compensators are discussed. The convergence theory is based on the convergence of the solutions of the finite dimensional Riccati equations to the solutions of the infinite dimensional Riccati equations. A numerical example with a flexible beam, a rotating rigid body, and a lumped mass is given.
[Sensitivity and specificity of nested PCR pyrosequencing in hepatitis B virus drug resistance gene testing].

PubMed

Sun, Shumei; Zhou, Hao; Zhou, Bin; Hu, Ziyou; Hou, Jinlin; Sun, Jian

2012-05-01

To evaluate the sensitivity and specificity of nested PCR combined with pyrosequencing in the detection of HBV drug-resistance gene. RtM204I (ATT) mutant and rtM204 (ATG) nonmutant plasmids mixed at different ratios were detected for mutations using nested-PCR combined with pyrosequencing, and the results were compared with those by conventional PCR pyrosequencing to analyze the linearity and consistency of the two methods. Clinical specimens with different viral loads were examined for drug-resistant mutations using nested PCR pyrosequencing and nested PCR combined with dideoxy sequencing (Sanger) for comparison of the detection sensitivity and specificity. The fitting curves demonstrated good linearity of both conventional PCR pyrosequencing and nested PCR pyrosequencing (R(2)>0.99, P<0.05). Nested PCR showed a better consistency with the predicted value than conventional PCR, and was superior to conventional PCR for detection of samples containing 90% mutant plasmid. In the detection of clinical specimens, Sanger sequencing had a significantly lower sensitivity than nested PCR pyrosequencing (92% vs 100%, P<0.01). The detection sensitivity of Sanger sequencing varied with the viral loads, especially in samples with low viral copies (HBV DNA ≤3log10 copies/ml), where the sensitivity was 78%, significantly lower than that of pyrosequencing (100%, P<0.01). Neither of the two methods yielded positive results for the negative control samples, suggesting their good specificity. Compared with nested PCR and Sanger sequencing method, nested PCR pyrosequencing has a higher sensitivity especially in clinical specimens with low viral copies, which can be important for early detection of HBV mutant strains and hence more effective clinical management.
Hepatic fat quantification using automated six-point Dixon: Comparison with conventional chemical shift based sequences and computed tomography.

PubMed

Shimizu, Kie; Namimoto, Tomohiro; Nakagawa, Masataka; Morita, Kosuke; Oda, Seitaro; Nakaura, Takeshi; Utsunomiya, Daisuke; Yamashita, Yasuyuki

To compare automated six-point Dixon (6-p-Dixon) MRI comparing with dual-echo chemical-shift-imaging (CSI) and CT for hepatic fat fraction in phantoms and clinical study. Phantoms and fifty-nine patients were examined both MRI and CT for quantitative fat measurements. In phantom study, linear regression between fat concentration and 6-p-Dixon showed good agreement. In clinical study, linear regression between 6-p-Dixon and dual-echo CSI showed good agreement. CT attenuation value was strongly correlated with 6-p-Dixon (R 2 =0.852; P<0.001) and dual-echo CSI (R 2 =0.812; P<0.001). Automated 6-p-Dixon and dual-echo CSI were accurate correlation with CT attenuation value of liver parenchyma. 6-p-Dixon has the potential for automated hepatic fat quantification. Copyright © 2017 Elsevier Inc. All rights reserved.
Specter: linear deconvolution for targeted analysis of data-independent acquisition mass spectrometry proteomics.

PubMed

Peckner, Ryan; Myers, Samuel A; Jacome, Alvaro Sebastian Vaca; Egertson, Jarrett D; Abelin, Jennifer G; MacCoss, Michael J; Carr, Steven A; Jaffe, Jacob D

2018-05-01

Mass spectrometry with data-independent acquisition (DIA) is a promising method to improve the comprehensiveness and reproducibility of targeted and discovery proteomics, in theory by systematically measuring all peptide precursors in a biological sample. However, the analytical challenges involved in discriminating between peptides with similar sequences in convoluted spectra have limited its applicability in important cases, such as the detection of single-nucleotide polymorphisms (SNPs) and alternative site localizations in phosphoproteomics data. We report Specter (https://github.com/rpeckner-broad/Specter), an open-source software tool that uses linear algebra to deconvolute DIA mixture spectra directly through comparison to a spectral library, thus circumventing the problems associated with typical fragment-correlation-based approaches. We validate the sensitivity of Specter and its performance relative to that of other methods, and show that Specter is able to successfully analyze cases involving highly similar peptides that are typically challenging for DIA analysis methods.
Newton's method: A link between continuous and discrete solutions of nonlinear problems

NASA Technical Reports Server (NTRS)

Thurston, G. A.

1980-01-01

Newton's method for nonlinear mechanics problems replaces the governing nonlinear equations by an iterative sequence of linear equations. When the linear equations are linear differential equations, the equations are usually solved by numerical methods. The iterative sequence in Newton's method can exhibit poor convergence properties when the nonlinear problem has multiple solutions for a fixed set of parameters, unless the iterative sequences are aimed at solving for each solution separately. The theory of the linear differential operators is often a better guide for solution strategies in applying Newton's method than the theory of linear algebra associated with the numerical analogs of the differential operators. In fact, the theory for the differential operators can suggest the choice of numerical linear operators. In this paper the method of variation of parameters from the theory of linear ordinary differential equations is examined in detail in the context of Newton's method to demonstrate how it might be used as a guide for numerical solutions.
DNA Sequencing Using capillary Electrophoresis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dr. Barry Karger

2011-05-09

The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence for the first time the Human Genome. Our program was part of the Human Genome Project. In this work, we were highly successful and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBase multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by separating by capillary electrophoresis double strand oligonucleotides using cross-linkedmore » polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult with poor reproducibility, and even more important, the columns were not very stable. We improved stability by using non-cross linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix as it was the most hydrophilic polymer available. Under our DOE program, we demonstrated initially the success of the linear polyacrylamide to separate double strand DNA. We note that the method is used even today to assay purity of double stranded DNA fragments. Our focus, of course, was on the separation of single stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other application papers of sequencing up to this level were also published in the mid 1990's. A major interest of the sequencing community has always been read length. The longer the sequence read per run the more efficient the process as well as the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecule weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single strand DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDA) was important to allow the long single strand DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to prepare reproducibility linear polyacrylamide polymer with an average MWT of 9MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentration of salts and dideoxynucleotide remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reading read length. In two papers, we examined the role of individual components from the sequencing reaction and then developed a protocol to reduce the deleterious salts. We demonstrated a robust method for achieving long read length DNA sequencing. Continuing our advances, we next demonstrated the achievement of over 1000 bases in less than one hour with a base calling accuracy of between 98 and 99%. In this work, we implemented energy transfer dyes which allowed for cleaner differentiation of the 4 dye labeled terminal nucleotides. In addition, we developed improved base calling software to help read sequencing when the separation was only minimal as occurs at long read lengths. Another critical parameter we studied was column temperature. We demonstrated that read lengths improved as the column temperature was increased from room temperature to 60 C or 70 C. The higher temperature relaxed the DNA chains under the influence of the high electric field.« less

Bit error rate tester using fast parallel generation of linear recurring sequences

DOEpatents

Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.

2003-05-06

A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k).sup.th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
First complete mitochondrial genome sequence from a box jellyfish reveals a highly fragmented linear architecture and insights into telomere evolution.

PubMed

Smith, David Roy; Kayal, Ehsan; Yanagihara, Angel A; Collins, Allen G; Pirro, Stacy; Keeling, Patrick J

2012-01-01

Animal mitochondrial DNAs (mtDNAs) are typically single circular chromosomes, with the exception of those from medusozoan cnidarians (jellyfish and hydroids), which are linear and sometimes fragmented. Most medusozoans have linear monomeric or linear bipartite mitochondrial genomes, but preliminary data have suggested that box jellyfish (cubozoans) have mtDNAs that consist of many linear chromosomes. Here, we present the complete mtDNA sequence from the winged box jellyfish Alatina moseri (the first from a cubozoan). This genome contains unprecedented levels of fragmentation: 18 unique genes distributed over eight 2.9- to 4.6-kb linear chromosomes. The telomeres are identical within and between chromosomes, and recombination between subtelomeric sequences has led to many genes initiating or terminating with sequences from other genes (the most extreme case being 150 nt of a ribosomal RNA containing the 5' end of nad2), providing evidence for a gene conversion-based model of telomere evolution. The silent-site nucleotide variation within the A. moseri mtDNA is among the highest observed from a eukaryotic genome and may be associated with elevated rates of recombination.
High-Frame-Rate Doppler Ultrasound Using a Repeated Transmit Sequence

PubMed Central

Podkowa, Anthony S.; Oelze, Michael L.; Ketterling, Jeffrey A.

2018-01-01

The maximum detectable velocity of high-frame-rate color flow Doppler ultrasound is limited by the imaging frame rate when using coherent compounding techniques. Traditionally, high quality ultrasonic images are produced at a high frame rate via coherent compounding of steered plane wave reconstructions. However, this compounding operation results in an effective downsampling of the slow-time signal, thereby artificially reducing the frame rate. To alleviate this effect, a new transmit sequence is introduced where each transmit angle is repeated in succession. This transmit sequence allows for direct comparison between low resolution, pre-compounded frames at a short time interval in ways that are resistent to sidelobe motion. Use of this transmit sequence increases the maximum detectable velocity by a scale factor of the transmit sequence length. The performance of this new transmit sequence was evaluated using a rotating cylindrical phantom and compared with traditional methods using a 15-MHz linear array transducer. Axial velocity estimates were recorded for a range of ±300 mm/s and compared to the known ground truth. Using these new techniques, the root mean square error was reduced from over 400 mm/s to below 50 mm/s in the high-velocity regime compared to traditional techniques. The standard deviation of the velocity estimate in the same velocity range was reduced from 250 mm/s to 30 mm/s. This result demonstrates the viability of the repeated transmit sequence methods in detecting and quantifying high-velocity flow. PMID:29910966
A powerful and flexible approach to the analysis of RNA sequence count data.

PubMed

Zhou, Yi-Hui; Xia, Kai; Wright, Fred A

2011-10-01

A number of penalization and shrinkage approaches have been proposed for the analysis of microarray gene expression data. Similar techniques are now routinely applied to RNA sequence transcriptional count data, although the value of such shrinkage has not been conclusively established. If penalization is desired, the explicit modeling of mean-variance relationships provides a flexible testing regimen that 'borrows' information across genes, while easily incorporating design effects and additional covariates. We describe BBSeq, which incorporates two approaches: (i) a simple beta-binomial generalized linear model, which has not been extensively tested for RNA-Seq data and (ii) an extension of an expression mean-variance modeling approach to RNA-Seq data, involving modeling of the overdispersion as a function of the mean. Our approaches are flexible, allowing for general handling of discrete experimental factors and continuous covariates. We report comparisons with other alternate methods to handle RNA-Seq data. Although penalized methods have advantages for very small sample sizes, the beta-binomial generalized linear model, combined with simple outlier detection and testing approaches, appears to have favorable characteristics in power and flexibility. An R package containing examples and sample datasets is available at http://www.bios.unc.edu/research/genomic_software/BBSeq yzhou@bios.unc.edu; fwright@bios.unc.edu Supplementary data are available at Bioinformatics online.
Isograde mapping and mineral identification on the island of Naxos, Greece, using DAIS 7915 hyperspectral data

NASA Astrophysics Data System (ADS)

Echtler, Helmut; Segl, Karl; Dickerhof, Corinna; Chabrillat, Sabine; Kaufmann, Hermann J.

2003-03-01

The ESF-LSF 1997 flight campaign conducted by the German Aerospace Center (DLR) recorded several transects across the island of Naxos using the airborne hyperspectral scanner DAIS. The geological targets cover all major litho-tectonic units of a metamorphic dome with the transition of metamorphic zonations from the outer meta-sedimentary greenschist envelope to the gneissic amphibolite facies and migmatitic core. Mineral identification of alternating marble-dolomite sequences and interlayered schists bearing muscovite and biotite has been accomplished using the airborne hyperspectral DAIS 7915 sensor. Data have been noise filtered based on maximum noise fraction (MNF) and fast Fourier transform (FFT) and converted from radiance to reflectance. For mineral identification, constrained linear spectral unmixing and spectral angle mapper (SAM) algorithms were tested. Due to their unsatisfying results a new approach was developed which consists of a linear mixture modeling and spectral feature fitting. This approach provides more detailed and accurate information. Results are discussed in comparison with detailed geological mapping and additional information. Calcites are clearly separated from dolomites as well as the mica-schist sequences by a good resolution of the mineral muscovite. Thereon an outstanding result represents the very good resolution of the chlorite/mica (muscovite, biotite)-transition defining a metamorphic isograde.
Relationship between magnetic field strength and magnetic-resonance-related acoustic noise levels.

PubMed

Moelker, Adriaan; Wielopolski, Piotr A; Pattynama, Peter M T

2003-02-01

The need for better signal-to-noise ratios and resolution has pushed magnetic resonance imaging (MRI) towards high-field MR-scanners for which only little data on MR-related acoustic noise production have been published. The purpose of this study was to validate the theoretical relationship of sound pressure level (SPL) and static magnetic field strength. This is relevant for allowing adequate comparisons of acoustic data of MR systems at various magnetic field strengths. Acoustic data were acquired during various pulse sequences at field strengths of 0.5, 1.0, 1.5 and 2.0 Tesla using the same MRI unit by means of a Helicon rampable magnet. Continuous-equivalent, i.e. time-averaged, linear SPLs and 1/3-octave band frequencies were recorded. Ramping from 0.5 to 1.0 Tesla and from 1.0 to 2.0 Tesla resulted in an SPL increase of 5.7 and 5.2 dB(L), respectively, when averaged over the various pulse sequences. Most of the acoustic energy was in the 1-kHz frequency band, irrespective of magnetic field strength. The relation between field strength and SPL was slightly non-linear, i.e. a slightly less increase at higher field strengths, presumably caused by the elastic properties of the gradient coil encasings.
Host Cell Virus Entry Mediated by Australian Bat Lyssavirus Envelope G glycoprotein

DTIC Science & Technology

2013-10-24

39 Figure 7. Comparison of the amino acid sequences of Saccolaimus and Pteropus ABLV G mature protein... sequence analysis revealed that the PCR products were identical. Sequence comparisons of the ABLV N and other lyssavirus N proteins showed that ABLV...Saccolaimus flaviventris) (129). Nucleoprotein sequence comparisons revealed that the Saccolaimus N protein shared 96% amino acid homology with the Pteropus
Comparison of hybrid (68)Ga-PSMA PET/MRI and (68)Ga-PSMA PET/CT in the evaluation of lymph node and bone metastases of prostate cancer.

PubMed

Freitag, Martin T; Radtke, Jan P; Hadaschik, Boris A; Kopp-Schneider, A; Eder, Matthias; Kopka, Klaus; Haberkorn, Uwe; Roethke, Matthias; Schlemmer, Heinz-Peter; Afshar-Oromieh, Ali

2016-01-01

To evaluate the reproducibility of the combination of hybrid PET/MRI and the (68)Ga-PSMA-11 tracer in depicting lymph node (LN) and bone metastases of prostate cancer (PC) in comparison with that of PET/CT. A retrospective analysis of 26 patients who were subjected to (68)Ga-PSMA PET/CTlow-dose (1 h after injection) followed by PET/MRI (3 h after injection) was performed. MRI sequences included T1-w native, T1-w contrast-enhanced, T2-w fat-saturated and diffusion-weighted sequences (DWIb800). Discordant PET-positive and morphological findings were evaluated. Standardized uptake values (SUV) of PET-positive LNs and bone lesions were quantified and their morphological size and conspicuity determined. Comparing the PET components, the proportion of discordant PSMA-positive suspicious findings was very low (98.5 % of 64 LNs concordant, 100 % of 28 bone lesions concordant). Two PET-positive bone metastases could not be confirmed morphologically using CTlow-dose, but could be confirmed using MRI. In 12 of 20 patients, 47 PET-positive LNs (71.9 %) were smaller than 1 cm in short axis diameter. There were significant linear correlations between PET/MRI SUVs and PET/CT SUVs in the 64 LN metastases (p < 0.0001) and in the 28 osseous metastases (p < 0.0001) for SUVmean and SUVmax, respectively. The LN SUVs were significantly higher on PET/MRI than on PET/CT (p SUVmax < 0.0001; p SUVmean < 0.0001) but there was no significant difference between the bone lesion SUVs (p SUVmax = 0.495; p SUVmean = 0.381). Visibility of LNs was significantly higher on MRI using the T1-w contrast-enhanced fat-saturated sequence (p = 0.013), the T2-w fat-saturated sequence (p < 0.0001) and the DWI sequence (p < 0.0001) compared with CTlow-dose. For bone lesions, only the overall conspicuity was higher on MRI compared with CTlow-dose (p < 0.006). Nodal and osseous metastases of PC are accurately and reliably depicted by hybrid PET/MRI using (68)Ga-PSMA-11 with very low discordance compared with PET/CT including PET-positive LNs of normal size. The correlation between PET/MRI SUVs and PET/CT SUVs was linear in LN and bone metastases but was significantly lower in control (non-metastatic) tissue.
Variational Dirac-Hartree-Fock calculation of the Breit interaction

NASA Astrophysics Data System (ADS)

Goldman, S. P.

1988-04-01

The calculation of the retarded version of the Breit interaction in the context of the VDHF method is discussed. With the use of Slater-type basis functions, all the terms involved can be calculated in closed form. The results are expressed as an expansion in powers of one-electron energy differences and linear combinations of hypergeometric functions. Convergence is fast and high accuracy is obtained with a small number of terms in the expansion even for high values of the nuclear charge. An added advantage is that the lowest order cancellations occurring in the retardation terms are accounted for exactly a priori. A comparison of the number of terms in the total expansion needed for an accuracy of 12 significant digits in the total energy, as well as a comparison of the results with an without retardation and in the local potential approximation, are presented for the carbon isoelectronic sequence.
A comparison of mixed-integer linear programming models for workforce scheduling with position-dependent processing times

NASA Astrophysics Data System (ADS)

Moreno-Camacho, Carlos A.; Montoya-Torres, Jairo R.; Vélez-Gallego, Mario C.

2018-06-01

Only a few studies in the available scientific literature address the problem of having a group of workers that do not share identical levels of productivity during the planning horizon. This study considers a workforce scheduling problem in which the actual processing time is a function of the scheduling sequence to represent the decline in workers' performance, evaluating two classical performance measures separately: makespan and maximum tardiness. Several mathematical models are compared with each other to highlight the advantages of each approach. The mathematical models are tested with randomly generated instances available from a public e-library.
Complete mitochondrial genome of the moon jellyfish, Aurelia sp. nov. (Cnidaria, Scyphozoa).

PubMed

Hwang, Dae-Sik; Park, Eunji; Won, Yong-Jin; Lee, Jae-Seong

2014-02-01

We sequenced 16,971 bp of the linear mitochondrial DNA of the moon jellyfish Aurelia sp. nov. and characterized it by comparing with Aurelia aurita. They had 13 protein-coding genes (PCGs), 16S rRNA and 12S rRNA with three tRNAs (tRNA-Leu, tRNA-Ser(TGA), tRNA-Met). Both have another two PCGs, orf969 and orf324 with telomeres at both ends. After comparison of Aurelia sp. nov. with Aurelia aurita, we found low-protein similarity of orf969 (59%) and orf324 (75%), respectively, while the other 13 PCGs showed 80% to 98% protein similarities.
3D volumetry comparison using 3T magnetic resonance imaging between normal and adenoma-containing pituitary glands.

PubMed

Roldan-Valadez, Ernesto; Garcia-Ulloa, Ana Cristina; Gonzalez-Gutierrez, Omar; Martinez-Lopez, Manuel

2011-01-01

Computed-assisted three-dimensional data (3D) allows for an accurate evaluation of volumes compared with traditional measurements. An in vitro method comparison between geometric volume and 3D volumetry to obtain reference data for pituitary volumes in normal pituitary glands (PGs) and PGs containing adenomas. Prospective, transverse, analytical study. Forty-eight subjects underwent brain magnetic resonance imaging (MRI) with 3D sequencing for computer-aided volumetry. PG phantom volumes by both methods were compared. Using the best volumetric method, volumes of normal PGs and PGs with adenoma were compared. Statistical analysis used the Bland-Altman method, t-statistics, effect size and linear regression analysis. Method comparison between 3D volumetry and geometric volume revealed a lower bias and precision for 3D volumetry. A total of 27 patients exhibited normal PGs (mean age, 42.07 ± 16.17 years), although length, height, width, geometric volume and 3D volumetry were greater in women than in men. A total of 21 patients exhibited adenomas (mean age 39.62 ± 10.79 years), and length, height, width, geometric volume and 3D volumetry were greater in men than in women, with significant volumetric differences. Age did not influence pituitary volumes on linear regression analysis. Results from the present study showed that 3D volumetry was more accurate than the geometric method. In addition, the upper normal limits of PGs overlapped with lower volume limits during early stage microadenomas.
Sequenced drive for rotary valves

DOEpatents

Mittell, Larry C.

1981-01-01

A sequenced drive for rotary valves which provides the benefits of applying rotary and linear motions to the movable sealing element of the valve. The sequenced drive provides a close approximation of linear motion while engaging or disengaging the movable element with the seat minimizing wear and damage due to scrubbing action. The rotary motion of the drive swings the movable element out of the flowpath thus eliminating obstruction to flow through the valve.
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.

PubMed

Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai

2015-12-01

The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
Optical Processing Techniques For Pseudorandom Sequence Prediction

NASA Astrophysics Data System (ADS)

Gustafson, Steven C.

1983-11-01

Pseudorandom sequences are series of apparently random numbers generated, for example, by linear or nonlinear feedback shift registers. An important application of these sequences is in spread spectrum communication systems, in which, for example, the transmitted carrier phase is digitally modulated rapidly and pseudorandomly and in which the information to be transmitted is incorporated as a slow modulation in the pseudorandom sequence. In this case the transmitted information can be extracted only by a receiver that uses for demodulation the same pseudorandom sequence used by the transmitter, and thus this type of communication system has a very high immunity to third-party interference. However, if a third party can predict in real time the probable future course of the transmitted pseudorandom sequence given past samples of this sequence, then interference immunity can be significantly reduced.. In this application effective pseudorandom sequence prediction techniques should be (1) applicable in real time to rapid (e.g., megahertz) sequence generation rates, (2) applicable to both linear and nonlinear pseudorandom sequence generation processes, and (3) applicable to error-prone past sequence samples of limited number and continuity. Certain optical processing techniques that may meet these requirements are discussed in this paper. In particular, techniques based on incoherent optical processors that perform general linear transforms or (more specifically) matrix-vector multiplications are considered. Computer simulation examples are presented which indicate that significant prediction accuracy can be obtained using these transforms for simple pseudorandom sequences. However, the useful prediction of more complex pseudorandom sequences will probably require the application of more sophisticated optical processing techniques.
Conservative secondary structure motifs already present in early-stage folding (in silico) as found in serpines family.

PubMed

Brylinski, Michal; Konieczny, Leszek; Kononowicz, Andrzej; Roterman, Irena

2008-03-21

The well-known procedure implemented in ClustalW oriented on the sequence comparison was applied to structure comparison. The consensus sequence as well as consensus structure has been defined for proteins belonging to serpine family. The structure of early stage intermediate was the object for similarity search. The high values of W(sequence) appeared to be accordant with high values of W(structure) making possible structure comparison using common criteria for sequence and structure comparison. Since the early stage structural form has been created according to limited conformational sub-space which does not include the beta-structure (this structure is mediated by C7eq structural form), is particularly important to see, that the C7eq structural form may be treated as the seed for beta-structure present in the final native structure of protein. The applicability of ClustalW procedure to structure comparison makes these two comparisons unified.
Prediction of optimum sorption isotherm: comparison of linear and non-linear method.

PubMed

Kumar, K Vasanth; Sivanesan, S

2005-11-11

Equilibrium parameters for Bismarck brown onto rice husk were estimated by linear least square and a trial and error non-linear method using Freundlich, Langmuir and Redlich-Peterson isotherms. A comparison between linear and non-linear method of estimating the isotherm parameters was reported. The best fitting isotherm was Langmuir isotherm and Redlich-Peterson isotherm equation. The results show that non-linear method could be a better way to obtain the parameters. Redlich-Peterson isotherm is a special case of Langmuir isotherm when the Redlich-Peterson isotherm constant g was unity.
A Mixed Integer Linear Program for Solving a Multiple Route Taxi Scheduling Problem

NASA Technical Reports Server (NTRS)

Montoya, Justin Vincent; Wood, Zachary Paul; Rathinam, Sivakumar; Malik, Waqar Ahmad

2010-01-01

Aircraft movements on taxiways at busy airports often create bottlenecks. This paper introduces a mixed integer linear program to solve a Multiple Route Aircraft Taxi Scheduling Problem. The outputs of the model are in the form of optimal taxi schedules, which include routing decisions for taxiing aircraft. The model extends an existing single route formulation to include routing decisions. An efficient comparison framework compares the multi-route formulation and the single route formulation. The multi-route model is exercised for east side airport surface traffic at Dallas/Fort Worth International Airport to determine if any arrival taxi time savings can be achieved by allowing arrivals to have two taxi routes: a route that crosses an active departure runway and a perimeter route that avoids the crossing. Results indicate that the multi-route formulation yields reduced arrival taxi times over the single route formulation only when a perimeter taxiway is used. In conditions where the departure aircraft are given an optimal and fixed takeoff sequence, accumulative arrival taxi time savings in the multi-route formulation can be as high as 3.6 hours more than the single route formulation. If the departure sequence is not optimal, the multi-route formulation results in less taxi time savings made over the single route formulation, but the average arrival taxi time is significantly decreased.
A powerful and flexible approach to the analysis of RNA sequence count data

PubMed Central

Zhou, Yi-Hui; Xia, Kai; Wright, Fred A.

2011-01-01

Motivation: A number of penalization and shrinkage approaches have been proposed for the analysis of microarray gene expression data. Similar techniques are now routinely applied to RNA sequence transcriptional count data, although the value of such shrinkage has not been conclusively established. If penalization is desired, the explicit modeling of mean–variance relationships provides a flexible testing regimen that ‘borrows’ information across genes, while easily incorporating design effects and additional covariates. Results: We describe BBSeq, which incorporates two approaches: (i) a simple beta-binomial generalized linear model, which has not been extensively tested for RNA-Seq data and (ii) an extension of an expression mean–variance modeling approach to RNA-Seq data, involving modeling of the overdispersion as a function of the mean. Our approaches are flexible, allowing for general handling of discrete experimental factors and continuous covariates. We report comparisons with other alternate methods to handle RNA-Seq data. Although penalized methods have advantages for very small sample sizes, the beta-binomial generalized linear model, combined with simple outlier detection and testing approaches, appears to have favorable characteristics in power and flexibility. Availability: An R package containing examples and sample datasets is available at http://www.bios.unc.edu/research/genomic_software/BBSeq Contact: yzhou@bios.unc.edu; fwright@bios.unc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21810900
Population entropies estimates of proteins

NASA Astrophysics Data System (ADS)

Low, Wai Yee

2017-05-01

The Shannon entropy equation provides a way to estimate variability of amino acids sequences in a multiple sequence alignment of proteins. Knowledge of protein variability is useful in many areas such as vaccine design, identification of antibody binding sites, and exploration of protein 3D structural properties. In cases where the population entropies of a protein are of interest but only a small sample size can be obtained, a method based on linear regression and random subsampling can be used to estimate the population entropy. This method is useful for comparisons of entropies where the actual sequence counts differ and thus, correction for alignment size bias is needed. In the current work, an R based package named EntropyCorrect that enables estimation of population entropy is presented and an empirical study on how well this new algorithm performs on simulated dataset of various combinations of population and sample sizes is discussed. The package is available at https://github.com/lloydlow/EntropyCorrect. This article, which was originally published online on 12 May 2017, contained an error in Eq. (1), where the summation sign was missing. The corrected equation appears in the Corrigendum attached to the pdf.

Random number generators tested on quantum Monte Carlo simulations.

PubMed

Hongo, Kenta; Maezono, Ryo; Miura, Kenichi

2010-08-01

We have tested and compared several (pseudo) random number generators (RNGs) applied to a practical application, ground state energy calculations of molecules using variational and diffusion Monte Carlo metheds. A new multiple recursive generator with 8th-order recursion (MRG8) and the Mersenne twister generator (MT19937) are tested and compared with the RANLUX generator with five luxury levels (RANLUX-[0-4]). Both MRG8 and MT19937 are proven to give the same total energy as that evaluated with RANLUX-4 (highest luxury level) within the statistical error bars with less computational cost to generate the sequence. We also tested the notorious implementation of linear congruential generator (LCG), RANDU, for comparison. (c) 2010 Wiley Periodicals, Inc.
Definition and Analysis of a System for the Automated Comparison of Curriculum Sequencing Algorithms in Adaptive Distance Learning

ERIC Educational Resources Information Center

Limongelli, Carla; Sciarrone, Filippo; Temperini, Marco; Vaste, Giulia

2011-01-01

LS-Lab provides automatic support to comparison/evaluation of the Learning Object Sequences produced by different Curriculum Sequencing Algorithms. Through this framework a teacher can verify the correspondence between the behaviour of different sequencing algorithms and her pedagogical preferences. In fact the teacher can compare algorithms…
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.

PubMed

Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark

2003-07-04

The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
A 100-Year Review: Methods and impact of genetic selection in dairy cattle-From daughter-dam comparisons to deep learning algorithms.

PubMed

Weigel, K A; VanRaden, P M; Norman, H D; Grosu, H

2017-12-01

In the early 1900s, breed society herdbooks had been established and milk-recording programs were in their infancy. Farmers wanted to improve the productivity of their cattle, but the foundations of population genetics, quantitative genetics, and animal breeding had not been laid. Early animal breeders struggled to identify genetically superior families using performance records that were influenced by local environmental conditions and herd-specific management practices. Daughter-dam comparisons were used for more than 30 yr and, although genetic progress was minimal, the attention given to performance recording, genetic theory, and statistical methods paid off in future years. Contemporary (herdmate) comparison methods allowed more accurate accounting for environmental factors and genetic progress began to accelerate when these methods were coupled with artificial insemination and progeny testing. Advances in computing facilitated the implementation of mixed linear models that used pedigree and performance data optimally and enabled accurate selection decisions. Sequencing of the bovine genome led to a revolution in dairy cattle breeding, and the pace of scientific discovery and genetic progress accelerated rapidly. Pedigree-based models have given way to whole-genome prediction, and Bayesian regression models and machine learning algorithms have joined mixed linear models in the toolbox of modern animal breeders. Future developments will likely include elucidation of the mechanisms of genetic inheritance and epigenetic modification in key biological pathways, and genomic data will be used with data from on-farm sensors to facilitate precision management on modern dairy farms. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Application of Logic to Integer Sequences: A Survey

NASA Astrophysics Data System (ADS)

Makowsky, Johann A.

Chomsky and Schützenberger showed in 1963 that the sequence d L (n), which counts the number of words of a given length n in a regular language L, satisfies a linear recurrence relation with constant coefficients for n, or equivalently, the generating function g_L(x)=sumn d_L(n) x^n is a rational function. In this talk we survey results concerning sequences a(n) of natural numbers which satisfy linear recurrence relations over ℤ or ℤ m , and
Second-order kinetic model for the sorption of cadmium onto tree fern: a comparison of linear and non-linear methods.

PubMed

Ho, Yuh-Shan

2006-01-01

A comparison was made of the linear least-squares method and a trial-and-error non-linear method of the widely used pseudo-second-order kinetic model for the sorption of cadmium onto ground-up tree fern. Four pseudo-second-order kinetic linear equations are discussed. Kinetic parameters obtained from the four kinetic linear equations using the linear method differed but they were the same when using the non-linear method. A type 1 pseudo-second-order linear kinetic model has the highest coefficient of determination. Results show that the non-linear method may be a better way to obtain the desired parameters.
Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors

DTIC Science & Technology

2015-07-15

Long-term effects on cancer survivors’ quality of life of physical training versus physical training combined with cognitive-behavioral therapy ...COMPARISON OF NEURAL NETWORK AND LINEAR REGRESSION MODELS IN STATISTICALLY PREDICTING MENTAL AND PHYSICAL HEALTH STATUS OF BREAST...34Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors
An extended sequence specificity for UV-induced DNA damage.

PubMed

Chung, Long H; Murray, Vincent

2018-01-01

The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantages of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
On relative distortion in fingerprint comparison.

PubMed

Kalka, Nathan D; Hicklin, R Austin

2014-11-01

When fingerprints are deposited, non-uniform pressure in conjunction with the inherent elasticity of friction ridge skin often causes linear and non-linear distortions in the ridge and valley structure. The effects of these distortions must be considered during analysis of fingerprint images. Even when individual prints are not notably distorted, relative distortion between two prints can have a serious impact on comparison. In this paper we discuss several metrics for quantifying and visualizing linear and non-linear fingerprint deformations, and software tools to assist examiners in accounting for distortion in fingerprint comparisons. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Recognition of Yeast Species from Gene Sequence Comparisons

USDA-ARS?s Scientific Manuscript database

This review discusses recognition of yeast species from gene sequence comparisons, which have been responsible for doubling the number of known yeasts over the past decade. The resolution provided by various single gene sequences is examined for both ascomycetous and basidiomycetous species, and th...
The limits of protein sequence comparison?

PubMed Central

Pearson, William R; Sierk, Michael L

2010-01-01

Modern sequence alignment algorithms are used routinely to identify homologous proteins, proteins that share a common ancestor. Homologous proteins always share similar structures and often have similar functions. Over the past 20 years, sequence comparison has become both more sensitive, largely because of profile-based methods, and more reliable, because of more accurate statistical estimates. As sequence and structure databases become larger, and comparison methods become more powerful, reliable statistical estimates will become even more important for distinguishing similarities that are due to homology from those that are due to analogy (convergence). The newest sequence alignment methods are more sensitive than older methods, but more accurate statistical estimates are needed for their full power to be realized. PMID:15919194
A Modified LS+AR Model to Improve the Accuracy of the Short-term Polar Motion Prediction

NASA Astrophysics Data System (ADS)

Wang, Z. W.; Wang, Q. X.; Ding, Y. Q.; Zhang, J. J.; Liu, S. S.

2017-03-01

There are two problems of the LS (Least Squares)+AR (AutoRegressive) model in polar motion forecast: the inner residual value of LS fitting is reasonable, but the residual value of LS extrapolation is poor; and the LS fitting residual sequence is non-linear. It is unsuitable to establish an AR model for the residual sequence to be forecasted, based on the residual sequence before forecast epoch. In this paper, we make solution to those two problems with two steps. First, restrictions are added to the two endpoints of LS fitting data to fix them on the LS fitting curve. Therefore, the fitting values next to the two endpoints are very close to the observation values. Secondly, we select the interpolation residual sequence of an inward LS fitting curve, which has a similar variation trend as the LS extrapolation residual sequence, as the modeling object of AR for the residual forecast. Calculation examples show that this solution can effectively improve the short-term polar motion prediction accuracy by the LS+AR model. In addition, the comparison results of the forecast models of RLS (Robustified Least Squares)+AR, RLS+ARIMA (AutoRegressive Integrated Moving Average), and LS+ANN (Artificial Neural Network) confirm the feasibility and effectiveness of the solution for the polar motion forecast. The results, especially for the polar motion forecast in the 1-10 days, show that the forecast accuracy of the proposed model can reach the world level.
Transformation of Saccharomyces cerevisiae and Schizosaccharomyces pombe with linear plasmids containing 2 micron sequences.

PubMed Central

Guerrini, A M; Ascenzioni, F; Tribioli, C; Donini, P

1985-01-01

Linear plasmids were constructed by adding telomeres prepared from Tetrahymena pyriformis rDNA to a circular hybrid Escherichia coli-yeast vector and transforming Saccharomyces cerevisiae. The parental vector contained the entire 2 mu yeast circle and the LEU gene from S. cerevisiae. Three transformed clones were shown to contain linear plasmids which were characterized by restriction analysis and shown to be rearranged versions of the desired linear plasmids. The plasmids obtained were imperfect palindromes: part of the parental vector was present in duplicated form, part as unique sequences and part was absent. The sequences that had been lost included a large portion of the 2 mu circle. The telomeres were approximately 450 bp longer than those of T. pyriformis. DNA prepared from transformed S. cerevisiae clones was used to transform Schizosaccharomyces pombe. The transformed S. pombe clones contained linear plasmids identical in structure to their linear parents in S. cerevisiae. No structural re-arrangements or integration into S. pombe was observed. Little or no telomere growth had occurred after transfer from S. cerevisiae to S. pombe. A model is proposed to explain the genesis of the plasmids. Images Fig. 1. Fig. 2. Fig. 4. PMID:3896773
Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase.

PubMed

Shao, Zhiyong; Graf, Shannon; Chaga, Oleg Y; Lavrov, Dennis V

2006-10-15

The 16,937-nuceotide sequence of the linear mitochondrial DNA (mt-DNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa) - the first mtDNA sequence from the class Scypozoa and the first sequence of a linear mtDNA from Metazoa - has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities; transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
Protein alignment algorithms with an efficient backtracking routine on multiple GPUs.

PubMed

Blazewicz, Jacek; Frohmberg, Wojciech; Kierzynka, Michal; Pesch, Erwin; Wojciechowski, Pawel

2011-05-20

Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the nearest future. To overcome this challenge several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show a great potential of a GPU platform but in most cases address the problem of sequence database scanning and computing only the alignment score whereas the alignment itself is omitted. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch, and Smith-Waterman algorithms with a backtracking procedure which is needed to construct the alignment. In this paper we present the solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Performed tests show that the implementation, with performance up to 6.3 GCUPS on a single GPU for affine gap penalties, is very efficient in comparison to other CPU and GPU-based solutions. Moreover, multiple GPUs support with load balancing makes the application very scalable. The article shows that the backtracking procedure of the sequence alignment algorithms may be designed to fit in with the GPU architecture. Therefore, our algorithm, apart from scores, is able to compute pairwise alignments. This opens a wide range of new possibilities, allowing other methods from the area of molecular biology to take advantage of the new computational architecture. Performed tests show that the efficiency of the implementation is excellent. Moreover, the speed of our GPU-based algorithms can be almost linearly increased when using more than one graphics card.
MMASS: an optimized array-based method for assessing CpG island methylation.

PubMed

Ibrahim, Ashraf E K; Thorne, Natalie P; Baird, Katie; Barbosa-Morais, Nuno L; Tavaré, Simon; Collins, V Peter; Wyllie, Andrew H; Arends, Mark J; Brenton, James D

2006-01-01

We describe an optimized microarray method for identifying genome-wide CpG island methylation called microarray-based methylation assessment of single samples (MMASS) which directly compares methylated to unmethylated sequences within a single sample. To improve previous methods we used bioinformatic analysis to predict an optimized combination of methylation-sensitive enzymes that had the highest utility for CpG-island probes and different methods to produce unmethylated representations of test DNA for more sensitive detection of differential methylation by hybridization. Subtraction or methylation-dependent digestion with McrBC was used with optimized (MMASS-v2) or previously described (MMASS-v1, MMASS-sub) methylation-sensitive enzyme combinations and compared with a published McrBC method. Comparison was performed using DNA from the cell line HCT116. We show that the distribution of methylation microarray data is inherently skewed and requires exogenous spiked controls for normalization and that analysis of digestion of methylated and unmethylated control sequences together with linear fit models of replicate data showed superior statistical power for the MMASS-v2 method. Comparison with previous methylation data for HCT116 and validation of CpG islands from PXMP4, SFRP2, DCC, RARB and TSEN2 confirmed the accuracy of MMASS-v2 results. The MMASS-v2 method offers improved sensitivity and statistical power for high-throughput microarray identification of differential methylation.
A Multiple-Sequence Variant of the Multiple-Baseline Design: A Strategy for Analysis of Sequence Effects and Treatment Comparison.

ERIC Educational Resources Information Center

Noell, George H.; Gresham, Frank M.

2001-01-01

Describes design logic and potential uses of a variant of the multiple-baseline design. The multiple-baseline multiple-sequence (MBL-MS) consists of multiple-baseline designs that are interlaced with one another and include all possible sequences of treatments. The MBL-MS design appears to be primarily useful for comparison of treatments taking…
Comparison results on preconditioned SOR-type iterative method for Z-matrices linear systems

NASA Astrophysics Data System (ADS)

Wang, Xue-Zhong; Huang, Ting-Zhu; Fu, Ying-Ding

2007-09-01

In this paper, we present some comparison theorems on preconditioned iterative method for solving Z-matrices linear systems, Comparison results show that the rate of convergence of the Gauss-Seidel-type method is faster than the rate of convergence of the SOR-type iterative method.
Subject-level reliability analysis of fast fMRI with application to epilepsy.

PubMed

Hao, Yongfu; Khoo, Hui Ming; von Ellenrieder, Nicolas; Gotman, Jean

2017-07-01

Recent studies have applied the new magnetic resonance encephalography (MREG) sequence to the study of interictal epileptic discharges (IEDs) in the electroencephalogram (EEG) of epileptic patients. However, there are no criteria to quantitatively evaluate different processing methods, to properly use the new sequence. We evaluated different processing steps of this new sequence under the common generalized linear model (GLM) framework by assessing the reliability of results. A bootstrap sampling technique was first used to generate multiple replicated data sets; a GLM with different processing steps was then applied to obtain activation maps, and the reliability of these maps was assessed. We applied our analysis in an event-related GLM related to IEDs. A higher reliability was achieved by using a GLM with head motion confound regressor with 24 components rather than the usual 6, with an autoregressive model of order 5 and with a canonical hemodynamic response function (HRF) rather than variable latency or patient-specific HRFs. Comparison of activation with IED field also favored the canonical HRF, consistent with the reliability analysis. The reliability analysis helps to optimize the processing methods for this fast fMRI sequence, in a context in which we do not know the ground truth of activation areas. Magn Reson Med 78:370-382, 2017. © 2016 International Society for Magnetic Resonance in Medicine. © 2016 International Society for Magnetic Resonance in Medicine.
The complete genome sequencing of Prevotella intermedia strain OMA14 and a subsequent fine-scale, intra-species genomic comparison reveal an unusual amplification of conjugative and mobile transposons and identify a novel Prevotella-lineage-specific repeat

PubMed Central

Naito, Mariko; Ogura, Yoshitoshi; Itoh, Takehiko; Shoji, Mikio; Okamoto, Masaaki; Hayashi, Tetsuya; Nakayama, Koji

2016-01-01

Prevotella intermedia is a pathogenic bacterium involved in periodontal diseases. Here, we present the complete genome sequence of a clinical strain, OMA14, of this bacterium along with the results of comparative genome analysis with strain 17 of the same species whose genome has also been sequenced, but not fully analysed yet. The genomes of both strains consist of two circular chromosomes: the larger chromosomes are similar in size and exhibit a high overall linearity of gene organizations, whereas the smaller chromosomes show a significant size variation and have undergone remarkable genome rearrangements. Unique features of the Pre. intermedia genomes are the presence of a remarkable number of essential genes on the second chromosomes and the abundance of conjugative and mobilizable transposons (CTns and MTns). The CTns/MTns are particularly abundant in the second chromosomes, involved in its extensive genome rearrangement, and have introduced a number of strain-specific genes into each strain. We also found a novel 188-bp repeat sequence that has been highly amplified in Pre. intermedia and are specifically distributed among the Pre. intermedia-related species. These findings expand our understanding of the genetic features of Pre. intermedia and the roles of CTns and MTns in the evolution of bacteria. PMID:26645327

Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.

PubMed

Chemes, Lucía Beatriz; de Prat-Gay, Gonzalo; Sánchez, Ignacio Enrique

2015-06-01

Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment. Copyright © 2015 Elsevier Ltd. All rights reserved.
UFO: a web server for ultra-fast functional profiling of whole genome protein sequences.

PubMed

Meinicke, Peter

2009-09-02

Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.
A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

PubMed Central

Freschi, Valerio; Bogliolo, Alessandro

2012-01-01

In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
On Asymptotically Good Ramp Secret Sharing Schemes

NASA Astrophysics Data System (ADS)

Geil, Olav; Martin, Stefano; Martínez-Peñas, Umberto; Matsumoto, Ryutaroh; Ruano, Diego

Asymptotically good sequences of linear ramp secret sharing schemes have been intensively studied by Cramer et al. in terms of sequences of pairs of nested algebraic geometric codes. In those works the focus is on full privacy and full reconstruction. In this paper we analyze additional parameters describing the asymptotic behavior of partial information leakage and possibly also partial reconstruction giving a more complete picture of the access structure for sequences of linear ramp secret sharing schemes. Our study involves a detailed treatment of the (relative) generalized Hamming weights of the considered codes.
Linear Chromosome-generating System of Agrobacterium tumefaciens C58: Protelomerase Generates and Protects Hairpin Ends

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huang, Wai Mun; DaGloria, Jeanne; Fox, Heather

2012-09-05

Agrobacterium tumefaciens C58, the pathogenic bacteria that causes crown gall disease in plants, harbors one circular and one linear chromosome and two circular plasmids. The telomeres of its unusual linear chromosome are covalently closed hairpins. The circular and linear chromosomes co-segregate and are stably maintained in the organism. We have determined the sequence of the two ends of the linear chromosome thus completing the previously published genome sequence of A. tumefaciens C58. We found that the telomeres carry nearly identical 25-bp sequences at the hairpin ends that are related by dyad symmetry. We further showed that its Atu2523 gene encodesmore » a protelomerase (resolvase) and that the purified enzyme can generate the linear chromosomal closed hairpin ends in a sequence-specific manner. Agrobacterium protelomerase, whose presence is apparently limited to biovar 1 strains, acts via a cleavage-and-religation mechanism by making a pair of transient staggered nicks invariably at 6-bp spacing as the reaction intermediate. The enzyme can be significantly shortened at both the N and C termini and still maintain its enzymatic activity. Although the full-length enzyme can uniquely bind to its product telomeres, the N-terminal truncations cannot. The target site can also be shortened from the native 50-bp inverted repeat to 26 bp; thus, the Agrobacterium hairpin-generating system represents the most compact activity of all hairpin linear chromosome- and plasmid-generating systems to date. The biochemical analyses of the protelomerase reactions further revealed that the tip of the hairpin telomere may be unusually polymorphically capable of accommodating any nucleotide.« less
New powerful statistics for alignment-free sequence comparison under a pattern transfer model.

PubMed

Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S; Sun, Fengzhu

2011-09-07

Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D*2 and D(s)2 showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D*2 and D(s)2 by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. Copyright © 2011 Elsevier Ltd. All rights reserved.
New Powerful Statistics for Alignment-free Sequence Comparison Under a Pattern Transfer Model

PubMed Central

Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S.; Sun, Fengzhu

2011-01-01

Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2∗ and D2s showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2∗ and D2s by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. PMID:21723298
Multiple alignment-free sequence comparison

PubMed Central

Ren, Jie; Song, Kai; Sun, Fengzhu; Deng, Minghua; Reinert, Gesine

2013-01-01

Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, and , extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, , and , averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data as well as cis-regulatory module data where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data, all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics. Availability: Our implementation of the five statistics is available as R package named ‘multiAlignFree’ at be http://www-rcf.usc.edu/∼fsun/Programs/multiAlignFree/multiAlignFreemain.html. Contact: reinert@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23990418
Genetic stability of gene targeted immunoglobulin loci. I. Heavy chain isotype exchange induced by a universal gene replacement vector.

PubMed Central

Kardinal, C; Selmayr, M; Mocikat, R

1996-01-01

Gene targeting at the immunoglobulin loci of B cells is an efficient tool for studying immunoglobulin expression or generating chimeric antibodies. We have shown that vector integration induced by human immunoglobulin G1 (IgG1) insertion vectors results in subsequent vector excision mediated by the duplicated target sequence, whereas replacement events which could be induced by the same constructs remain stable. We could demonstrate that the distribution of the vector homology strongly influences the genetic stability obtained. To this end we developed a novel type of a heavy chain replacement vector making use of the heavy chain class switch recombination sequence. Despite the presence of a two-sided homology this construct is universally applicable irrespective of the constant gene region utilized by the B cell. In comparison to an integration vector the frequency of stable incorporation was strongly increased, but we still observed vector excision, although at a markedly reduced rate. The latter events even occurred with circular constructs. Linearization of the construct at various sites and the comparison with an integration vector that carries the identical homology sequence, but differs in the distribution of homology, revealed the following features of homologous recombination of immunoglobulin genes: (i) the integration frequency is only determined by the length of the homology flank where the cross-over takes place; (ii) a 5' flank that does not meet the minimum requirement of homology length cannot be complemented by a sufficient 3' flank; (iii) free vector ends play a role for integration as well as for replacement targeting; (iv) truncating recombination events are suppressed in the presence of two flanks. Furthermore, we show that the switch region that was used as 3' flank is non-functional in an inverted orientation. Images Figure 2 PMID:8958041
Sequencing a piece of history: complete genome sequence of the original Escherichia coli strain

PubMed Central

Dunne, Karl A; Chaudhuri, Roy R; Rossiter, Amanda E; Beriotto, Irene; Browning, Douglas F; Squire, Derrick; Cunningham, Adam F; Cole, Jeffrey A; Loman, Nicholas

2017-01-01

In 1885, Theodor Escherich first described the Bacillus coli commune, which was subsequently renamed Escherichia coli. We report the complete genome sequence of this original strain (NCTC 86). The 5 144 392 bp circular chromosome encodes the genes for 4805 proteins, which include antigens, virulence factors, antimicrobial-resistance factors and secretion systems, of a commensal organism from the pre-antibiotic era. It is located in the E. coli A subgroup and is closely related to E. coli K-12 MG1655. E. coli strain NCTC 86 and the non-pathogenic K-12, C, B and HS strains share a common backbone that is largely co-linear. The exception is a large 2 803 932 bp inversion that spans the replication terminus from gmhB to clpB. Comparison with E. coli K-12 reveals 41 regions of difference (577 351 bp) distributed across the chromosome. For example, and contrary to current dogma, E. coli NCTC 86 includes a nine gene sil locus that encodes a silver-resistance efflux pump acquired before the current widespread use of silver nanoparticles as an antibacterial agent, possibly resulting from the widespread use of silver utensils and currency in Germany in the 1800s. In summary, phylogenetic comparisons with other E. coli strains confirmed that the original strain isolated by Escherich is most closely related to the non-pathogenic commensal strains. It is more distant from the root than the pathogenic organisms E. coli 042 and O157 : H7; therefore, it is not an ancestral state for the species. PMID:28663823
Genetic stability of gene targeted immunoglobulin loci. I. Heavy chain isotype exchange induced by a universal gene replacement vector.

PubMed

Kardinal, C; Selmayr, M; Mocikat, R

1996-11-01

Gene targeting at the immunoglobulin loci of B cells is an efficient tool for studying immunoglobulin expression or generating chimeric antibodies. We have shown that vector integration induced by human immunoglobulin G1 (IgG1) insertion vectors results in subsequent vector excision mediated by the duplicated target sequence, whereas replacement events which could be induced by the same constructs remain stable. We could demonstrate that the distribution of the vector homology strongly influences the genetic stability obtained. To this end we developed a novel type of a heavy chain replacement vector making use of the heavy chain class switch recombination sequence. Despite the presence of a two-sided homology this construct is universally applicable irrespective of the constant gene region utilized by the B cell. In comparison to an integration vector the frequency of stable incorporation was strongly increased, but we still observed vector excision, although at a markedly reduced rate. The latter events even occurred with circular constructs. Linearization of the construct at various sites and the comparison with an integration vector that carries the identical homology sequence, but differs in the distribution of homology, revealed the following features of homologous recombination of immunoglobulin genes: (i) the integration frequency is only determined by the length of the homology flank where the cross-over takes place; (ii) a 5' flank that does not meet the minimum requirement of homology length cannot be complemented by a sufficient 3' flank; (iii) free vector ends play a role for integration as well as for replacement targeting; (iv) truncating recombination events are suppressed in the presence of two flanks. Furthermore, we show that the switch region that was used as 3' flank is non-functional in an inverted orientation.
When syntax meets action: Brain potential evidence of overlapping between language and motor sequencing.

PubMed

Casado, Pilar; Martín-Loeches, Manuel; León, Inmaculada; Hernández-Gutiérrez, David; Espuny, Javier; Muñoz, Francisco; Jiménez-Ortega, Laura; Fondevila, Sabela; de Vega, Manuel

2018-03-01

This study aims to extend the embodied cognition approach to syntactic processing. The hypothesis is that the brain resources to plan and perform motor sequences are also involved in syntactic processing. To test this hypothesis, Event-Related brain Potentials (ERPs) were recorded while participants read sentences with embedded relative clauses, judging for their acceptability (half of the sentences contained a subject-verb morphosyntactic disagreement). The sentences, previously divided into three segments, were self-administered segment-by-segment in two different sequential manners: linear or non-linear. Linear self-administration consisted of successively pressing three buttons with three consecutive fingers in the right hand, while non-linear self-administration implied the substitution of the finger in the middle position by the right foot. Our aim was to test whether syntactic processing could be affected by the manner the sentences were self-administered. Main results revealed that the ERPs LAN component vanished whereas the P600 component increased in response to incorrect verbs, for non-linear relative to linear self-administration. The LAN and P600 components reflect early and late syntactic processing, respectively. Our results convey evidence that language syntactic processing and performing non-linguistic motor sequences may share resources in the human brain. Copyright © 2017 Elsevier Ltd. All rights reserved.
Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie

2009-11-20

RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR)more » shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.« less
Phylogenetic shadowing of primate sequences to find functional regions of the human genome.

PubMed

Boffelli, Dario; McAuliffe, Jon; Ovcharenko, Dmitriy; Lewis, Keith D; Ovcharenko, Ivan; Pachter, Lior; Rubin, Edward M

2003-02-28

Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies. We used sequence comparisons of an extensive set of Old World and New World monkeys and hominoids to identify functional regions in the human genome. Analysis of these data enabled the discovery of primate-specific gene regulatory elements and the demarcation of the exons of multiple genes. Much of the information content of the comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates. These results demonstrate the utility of intraprimate sequence comparisons to discover common mammalian as well as primate-specific functional elements in the human genome, which are unattainable through the evaluation of more evolutionarily distant species.
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

PubMed Central

2013-01-01

Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

PubMed

Nagar, Anurag; Hahsler, Michael

2013-01-01

Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons

PubMed Central

2011-01-01

Background Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. Results BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically. Conclusions There is a clear need for a user-friendly program that can produce genome comparisons for a large number of prokaryote genomes with an emphasis on rapidly utilising unfinished or unassembled genome data. Here we present BRIG, a cross-platform application that enables the interactive generation of comparative genomic images via a simple graphical-user interface. BRIG is freely available for all operating systems at http://sourceforge.net/projects/brig/. PMID:21824423
BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons.

PubMed

Alikhan, Nabil-Fareed; Petty, Nicola K; Ben Zakour, Nouri L; Beatson, Scott A

2011-08-08

Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically. There is a clear need for a user-friendly program that can produce genome comparisons for a large number of prokaryote genomes with an emphasis on rapidly utilising unfinished or unassembled genome data. Here we present BRIG, a cross-platform application that enables the interactive generation of comparative genomic images via a simple graphical-user interface. BRIG is freely available for all operating systems at http://sourceforge.net/projects/brig/.
123s and ABCs: developmental shifts in logarithmic-to-linear responding reflect fluency with sequence values.

PubMed

Hurst, Michelle; Monahan, K Leigh; Heller, Elizabeth; Cordes, Sara

2014-11-01

When placing numbers along a number line with endpoints 0 and 1000, children generally space numbers logarithmically until around the age of 7, when they shift to a predominantly linear pattern of responding. This developmental shift of responding on the number placement task has been argued to be indicative of a shift in the format of the underlying representation of number (Siegler & Opfer, ). In the current study, we provide evidence from both child and adult participants to suggest that performance on the number placement task may not reflect the structure of the mental number line, but instead is a function of the fluency (i.e. ease) with which the individual can work with the values in the sequence. In Experiment 1, adult participants respond logarithmically when placing numbers on a line with less familiar anchors (1639 to 2897), despite linear responding on control tasks with standard anchors involving a similar range (0 to 1287) and a similar numerical magnitude (2000 to 3000). In Experiment 2, we show a similar developmental shift in childhood from logarithmic to linear responding for a non-numerical sequence with no inherent magnitude (the alphabet). In conclusion, we argue that the developmental trend towards linear behavior on the number line task is a product of successful strategy use and mental fluency with the values of the sequence, resulting from familiarity with endpoints and increased knowledge about general ordering principles of the sequence.A video abstract of this article can be viewed at:http://www.youtube.com/watch?v=zg5Q2LIFk3M. © 2014 John Wiley & Sons Ltd.
A comparative analysis of the foamy and ortho virus capsid structures reveals an ancient domain duplication.

PubMed

Taylor, William R; Stoye, Jonathan P; Taylor, Ian A

2017-04-04

The Spumaretrovirinae (foamy viruses) and the Orthoretrovirinae (e.g. HIV) share many similarities both in genome structure and the sequences of the core viral encoded proteins, such as the aspartyl protease and reverse transcriptase. Similarity in the gag region of the genome is less obvious at the sequence level but has been illuminated by the recent solution of the foamy virus capsid (CA) structure. This revealed a clear structural similarity to the orthoretrovirus capsids but with marked differences that left uncertainty in the relationship between the two domains that comprise the structure. We have applied protein structure comparison methods in order to try and resolve this ambiguous relationship. These included both the DALI method and the SAP method, with rigorous statistical tests applied to the results of both methods. For this, we employed collections of artificial fold 'decoys' (generated from the pair of native structures being compared) to provide a customised background distribution for each comparison, thus allowing significance levels to be estimated. We have shown that the relationship of the two domains conforms to a simple linear correspondence rather than a domain transposition. These similarities suggest that the origin of both viral capsids was a common ancestor with a double domain structure. In addition, we show that there is also a significant structural similarity between the amino and carboxy domains in both the foamy and ortho viruses. These results indicate that, as well as the duplication of the double domain capsid, there may have been an even more ancient gene-duplication that preceded the double domain structure. In addition, our structure comparison methodology demonstrates a general approach to problems where the components have a high intrinsic level of similarity.

The alignment of enzymatic steps reveals similar metabolic pathways and probable recruitment events in Gammaproteobacteria.

PubMed

Poot-Hernandez, Augusto Cesar; Rodriguez-Vazquez, Katya; Perez-Rueda, Ernesto

2015-11-17

It is generally accepted that gene duplication followed by functional divergence is one of the main sources of metabolic diversity. In this regard, there is an increasing interest in the development of methods that allow the systematic identification of these evolutionary events in metabolism. Here, we used a method not based on biomolecular sequence analysis to compare and identify common and variable routes in the metabolism of 40 Gammaproteobacteria species. The metabolic maps deposited in the KEGG database were transformed into linear Enzymatic Step Sequences (ESS) by using the breadth-first search algorithm. These ESS represent subsequent enzymes linked to each other, where their catalytic activities are encoded in the Enzyme Commission numbers. The ESS were compared in an all-against-all (pairwise comparisons) approach by using a dynamic programming algorithm, leaving only a set of significant pairs. From these comparisons, we identified a set of functionally conserved enzymatic steps in different metabolic maps, in which cell wall components and fatty acid and lysine biosynthesis were included. In addition, we found that pathways associated with biosynthesis share a higher proportion of similar ESS than degradation pathways and secondary metabolism pathways. Also, maps associated with the metabolism of similar compounds contain a high proportion of similar ESS, such as those maps from nucleotide metabolism pathways, in particular the inosine monophosphate pathway. Furthermore, diverse ESS associated with the low part of the glycolysis pathway were identified as functionally similar to multiple metabolic pathways. In summary, our comparisons may help to identify similar reactions in different metabolic pathways and could reinforce the patchwork model in the evolution of metabolism in Gammaproteobacteria.
A new preconditioner update strategy for the solution of sequences of linear systems in structural mechanics: application to saddle point problems in elasticity

NASA Astrophysics Data System (ADS)

Mercier, Sylvain; Gratton, Serge; Tardieu, Nicolas; Vasseur, Xavier

2017-12-01

Many applications in structural mechanics require the numerical solution of sequences of linear systems typically issued from a finite element discretization of the governing equations on fine meshes. The method of Lagrange multipliers is often used to take into account mechanical constraints. The resulting matrices then exhibit a saddle point structure and the iterative solution of such preconditioned linear systems is considered as challenging. A popular strategy is then to combine preconditioning and deflation to yield an efficient method. We propose an alternative that is applicable to the general case and not only to matrices with a saddle point structure. In this approach, we consider to update an existing algebraic or application-based preconditioner, using specific available information exploiting the knowledge of an approximate invariant subspace or of matrix-vector products. The resulting preconditioner has the form of a limited memory quasi-Newton matrix and requires a small number of linearly independent vectors. Numerical experiments performed on three large-scale applications in elasticity highlight the relevance of the new approach. We show that the proposed method outperforms the deflation method when considering sequences of linear systems with varying matrices.
Gear Shifting of Quadriceps during Isometric Knee Extension Disclosed Using Ultrasonography.

PubMed

Zhang, Shu; Huang, Weijian; Zeng, Yu; Shi, Wenxiu; Diao, Xianfen; Wei, Xiguang; Ling, Shan

2018-01-01

Ultrasonography has been widely employed to estimate the morphological changes of muscle during contraction. To further investigate the motion pattern of quadriceps during isometric knee extensions, we studied the relative motion pattern between femur and quadriceps under ultrasonography. An interesting observation is that although the force of isometric knee extension can be controlled to change almost linearly, femur in the simultaneously captured ultrasound video sequences has several different piecewise moving patterns. This phenomenon is like quadriceps having several forward gear ratios like a car starting from rest towards maximal voluntary contraction (MVC) and then returning to rest. Therefore, to verify this assumption, we captured several ultrasound video sequences of isometric knee extension and collected the torque/force signal simultaneously. Then we extract the shapes of femur from these ultrasound video sequences using video processing techniques and study the motion pattern both qualitatively and quantitatively. The phenomenon can be seen easier via a comparison between the torque signal and relative spatial distance between femur and quadriceps. Furthermore, we use cluster analysis techniques to study the process and the clustering results also provided preliminary support to the conclusion that, during both ramp increasing and decreasing phases, quadriceps contraction may have several forward gear ratios relative to femur.
Noise and drift analysis of non-equally spaced timing data

NASA Technical Reports Server (NTRS)

Vernotte, F.; Zalamansky, G.; Lantz, E.

1994-01-01

Generally, it is possible to obtain equally spaced timing data from oscillators. The measurement of the drifts and noises affecting oscillators is then performed by using a variance (Allan variance, modified Allan variance, or time variance) or a system of several variances (multivariance method). However, in some cases, several samples, or even several sets of samples, are missing. In the case of millisecond pulsar timing data, for instance, observations are quite irregularly spaced in time. Nevertheless, since some observations are very close together (one minute) and since the timing data sequence is very long (more than ten years), information on both short-term and long-term stability is available. Unfortunately, a direct variance analysis is not possible without interpolating missing data. Different interpolation algorithms (linear interpolation, cubic spline) are used to calculate variances in order to verify that they neither lose information nor add erroneous information. A comparison of the results of the different algorithms is given. Finally, the multivariance method was adapted to the measurement sequence of the millisecond pulsar timing data: the responses of each variance of the system are calculated for each type of noise and drift, with the same missing samples as in the pulsar timing sequence. An estimation of precision, dynamics, and separability of this method is given.
Solution Structure of Acidocin B, a Circular Bacteriocin Produced by Lactobacillus acidophilus M46

PubMed Central

Acedo, Jeella Z.; van Belkum, Marco J.; Lohans, Christopher T.; McKay, Ryan T.; Miskolzie, Mark

2015-01-01

Acidocin B, a bacteriocin produced by Lactobacillus acidophilus M46, was originally reported to be a linear peptide composed of 59 amino acid residues. However, its high sequence similarity to gassericin A, a circular bacteriocin from Lactobacillus gasseri LA39, suggested that acidocin B might be circular as well. Acidocin B was purified from culture supernatant by a series of hydrophobic interaction chromatographic steps. Its circular nature was ascertained by matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) mass spectrometry and tandem mass spectrometry (MS/MS) sequencing. The peptide sequence was found to consist of 58 amino acids with a molecular mass of 5,621.5 Da. The sequence of the acidocin B biosynthetic gene cluster was also determined and showed high nucleotide sequence similarity to that of gassericin A. The nuclear magnetic resonance (NMR) solution structure of acidocin B in sodium dodecyl sulfate micelles was elucidated, revealing that it is composed of four α-helices of similar length that are folded to form a compact, globular bundle with a central pore. This is a three-dimensional structure for a member of subgroup II circular bacteriocins, which are classified based on their isoelectric points of ∼7 or lower. Comparison of acidocin B with carnocyclin A, a subgroup I circular bacteriocin with four α-helices and a pI of 10, revealed differences in the overall folding. The observed variations could be attributed to inherent diversity in their physical properties, which also required the use of different solvent systems for three-dimensional structural elucidation. PMID:25681186
On the auto and cross correlation of PN sequences

NASA Technical Reports Server (NTRS)

Morakis, J. C.

1969-01-01

The autocorrelation and crosscorrelation properties of pseudorandom (PN) sequences are analyzed by using some important properties of PN sequences. These properties make this discussion understandable without the need of linear algebraic approach. The analysis is followed by some experimental results.
Improved therapy-success prediction with GSS estimated from clinical HIV-1 sequences.

PubMed

Pironti, Alejandro; Pfeifer, Nico; Kaiser, Rolf; Walter, Hauke; Lengauer, Thomas

2014-01-01

Rules-based HIV-1 drug-resistance interpretation (DRI) systems disregard many amino-acid positions of the drug's target protein. The aims of this study are (1) the development of a drug-resistance interpretation system that is based on HIV-1 sequences from clinical practice rather than hard-to-get phenotypes, and (2) the assessment of the benefit of taking all available amino-acid positions into account for DRI. A dataset containing 34,934 therapy-naïve and 30,520 drug-exposed HIV-1 pol sequences with treatment history was extracted from the EuResist database and the Los Alamos National Laboratory database. 2,550 therapy-change-episode baseline sequences (TCEB) were assigned to test set A. Test set B contains 1,084 TCEB from the HIVdb TCE repository. Sequences from patients absent in the test sets were used to train three linear support vector machines to produce scores that predict drug exposure pertaining to each of 20 antiretrovirals: the first one uses the full amino-acid sequences (DEfull), the second one only considers IAS drug-resistance positions (DEonlyIAS), and the third one disregards IAS drug-resistance positions (DEnoIAS). For performance comparison, test sets A and B were evaluated with DEfull, DEnoIAS, DEonlyIAS, geno2pheno[resistance], HIVdb, ANRS, HIV-GRADE, and REGA. Clinically-validated cut-offs were used to convert the continuous output of the first four methods into susceptible-intermediate-resistant (SIR) predictions. With each method, a genetic susceptibility score (GSS) was calculated for each therapy episode in each test set by converting the SIR prediction for its compounds to integer: S=2, I=1, and R=0. The GSS were used to predict therapy success as defined by the EuResist standard datum definition. Statistical significance was assessed using a Wilcoxon signed-rank test. A comparison of the therapy-success prediction performances among the different interpretation systems for test set A can be found in Table 1, while those for test set B are found in Figure 1. Therapy-success prediction of first-line therapies with DEnoIAS performed better than DEonlyIAS (p<10-16). Therapy success prediction benefits from the consideration of all available mutations. The increase in performance was largest in first-line therapies with transmitted drug-resistance mutations.
Comparison of the relation between timing and force control during finger-tapping sequences by pianists and non pianists.

PubMed

Inui, N; Ichihara, T

2001-10-01

To examine the relation between timing and force control during finger taping sequences by both pianists and nonpianists, participants tapped a force plate connected to strain gauges. A series of finger tapping tasks consisted of 16 combinations of pace (intertap interval: 180, 200, 400, or 800 ms) and peak force (50, 100, 200, or 400 g). Analysis showed that, although movement timing was independent of force control under low or medium pace conditions, there were strong interactions between the 2 parameters under high pace conditions. The results indicate that participants adapted the movement by switching from separately controlling these parameters in the slow and moderate movement to coupling them in the fast movement. While variations in the intertap interval affected force production by nonpianists, they had little effect for pianists. The ratios of time-to-peak force to press duration increased linearly in pianists but varied irregularly in nonpianists, as the required force decreased. Thus, pianists regulate peak force by timing control of peak force to press duration, suggesting that training affects the relationship between the 2 parameters.
Self-expressive Dictionary Learning for Dynamic 3D Reconstruction.

PubMed

Zheng, Enliang; Ji, Dinghuang; Dunn, Enrique; Frahm, Jan-Michael

2017-08-22

We target the problem of sparse 3D reconstruction of dynamic objects observed by multiple unsynchronized video cameras with unknown temporal overlap. To this end, we develop a framework to recover the unknown structure without sequencing information across video sequences. Our proposed compressed sensing framework poses the estimation of 3D structure as the problem of dictionary learning, where the dictionary is defined as an aggregation of the temporally varying 3D structures. Given the smooth motion of dynamic objects, we observe any element in the dictionary can be well approximated by a sparse linear combination of other elements in the same dictionary (i.e. self-expression). Our formulation optimizes a biconvex cost function that leverages a compressed sensing formulation and enforces both structural dependency coherence across video streams, as well as motion smoothness across estimates from common video sources. We further analyze the reconstructability of our approach under different capture scenarios, and its comparison and relation to existing methods. Experimental results on large amounts of synthetic data as well as real imagery demonstrate the effectiveness of our approach.
K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

PubMed

Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

2018-05-15

Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Comparing to the other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length, or increasing dataset sizes. The K2 and K2* approaches are implemented in the R language as a package and is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). yueljiang@163.com. Supplementary data are available at Bioinformatics online.
Implementing situation-background-assessment-recommendation in an anaesthetic clinic and subsequent information retention among receivers: A prospective interventional study of postoperative handovers.

PubMed

Randmaa, Maria; Swenne, Christine L; Mårtensson, Gunilla; Högberg, Hans; Engström, Maria

2016-03-01

Communication errors cause clinical incidents and adverse events in relation to surgery. To ensure proper postoperative patient care, it is essential that personnel remember and recall information given during the handover from the operating theatre to the postanaesthesia care unit. Formalizing the handover may improve communication and aid memory, but research in this area is lacking. The objective of this study was to evaluate whether implementing the communication tool Situation-Background-Assessment-Recommendation (SBAR) affects receivers' information retention after postoperative handover. A prospective intervention study with an intervention group and comparison nonintervention group, with assessments before and after the intervention. The postanaesthesia care units of two hospitals in Sweden during 2011 and 2012. Staff involved in the handover between the operating theatre and the postanaesthesia care units within each hospital. Implementation of the communication tool SBAR in one hospital. The main outcome was the percentage of recalled information sequences among receivers after the handover. Data were collected using both audio-recordings and observations recorded on a study-specific protocol form. Preintervention, 73 handovers were observed (intervention group, n = 40; comparison group, n = 33) involving 72 personnel (intervention group, n = 40; comparison group, n = 32). Postintervention, 91 handovers were observed (intervention group, n = 44; comparison group, n = 47) involving 57 personnel (intervention group, n = 31; comparison group, n = 26). In the intervention group, the percentage of recalled information sequences by the receivers increased from 43.4% preintervention to 52.6% postintervention (P = 0.004) and the SBAR structure improved significantly (P = 0.028). In the comparison group, the corresponding figures were 51.3 and 52.6% (P = 0.725) with no difference in SBAR structure. When a linear regression generalised estimating equation model was used to account for confounding influences, we were unable to show a significant difference in the information recalled between the intervention group and the nonintervention group over time. Compared with the comparison group with no intervention, when SBAR was implemented in an anaesthetic clinic, we were unable to show any improvement in recalled information among receivers following postoperative handover. Current controlled trials http://www.controlled-trials.com Identifier: ISRCTN37251313.
Linear Lepidopteran ambidensovirus 1 sequences drive random integration of a reporter gene in transfected Spodoptera frugiperda cells.

PubMed

Rizk, Francine; Laverdure, Sylvain; d'Alençon, Emmanuelle; Bossin, Hervé; Dupressoir, Thierry

2018-01-01

The Lepidopteran ambidensovirus 1 isolated from Junonia coenia (hereafter JcDV) is an invertebrate parvovirus considered as a viral transduction vector as well as a potential tool for the biological control of insect pests. Previous works showed that JcDV-based circular plasmids experimentally integrate into insect cells genomic DNA. In order to approach the natural conditions of infection and possible integration, we generated linear JcDV- gfp based molecules which were transfected into non permissive Spodoptera frugiperda ( Sf9 ) cultured cells. Cells were monitored for the expression of green fluorescent protein (GFP) and DNA was analyzed for integration of transduced viral sequences. Non-structural protein modulation of the VP-gene cassette promoter activity was additionally assayed. We show that linear JcDV-derived molecules are capable of long term genomic integration and sustained transgene expression in Sf9 cells. As expected, only the deletion of both inverted terminal repeats (ITR) or the polyadenylation signals of NS and VP genes dramatically impairs the global transduction/expression efficiency. However, all the integrated viral sequences we characterized appear "scrambled" whatever the viral content of the transfected vector. Despite a strong GFP expression, we were unable to recover any full sequence of the original constructs and found rearranged viral and non-viral sequences as well. Cellular flanking sequences were identified as non-coding ones. On the other hand, the kinetics of GFP expression over time led us to investigate the apparent down-regulation by non-structural proteins of the VP-gene cassette promoter. Altogether, our results show that JcDV-derived sequences included in linear DNA molecules are able to drive efficiently the integration and expression of a foreign gene into the genome of insect cells, whatever their composition, provided that at least one ITR is present. However, the transfected sequences were extensively rearranged with cellular DNA during or after random integration in the host cell genome. Lastly, the non-structural proteins seem to participate in the regulation of p9 promoter activity rather than to the integration of viral sequences.
Dynamics of actin evolution in dinoflagellates.

PubMed

Kim, Sunju; Bachvaroff, Tsvetan R; Handy, Sara M; Delwiche, Charles F

2011-04-01

Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provides broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' UnTranslated Region (UTR), although there was still limited amino acid variation between most sequences. Several potential pseudogenes were found (approximately 10% of all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that actin may be too conserved to be useful for phylogenetic estimation of closely related species.
The complete genome sequencing of Prevotella intermedia strain OMA14 and a subsequent fine-scale, intra-species genomic comparison reveal an unusual amplification of conjugative and mobile transposons and identify a novel Prevotella-lineage-specific repeat.

PubMed

Naito, Mariko; Ogura, Yoshitoshi; Itoh, Takehiko; Shoji, Mikio; Okamoto, Masaaki; Hayashi, Tetsuya; Nakayama, Koji

2016-02-01

Prevotella intermedia is a pathogenic bacterium involved in periodontal diseases. Here, we present the complete genome sequence of a clinical strain, OMA14, of this bacterium along with the results of comparative genome analysis with strain 17 of the same species whose genome has also been sequenced, but not fully analysed yet. The genomes of both strains consist of two circular chromosomes: the larger chromosomes are similar in size and exhibit a high overall linearity of gene organizations, whereas the smaller chromosomes show a significant size variation and have undergone remarkable genome rearrangements. Unique features of the Pre. intermedia genomes are the presence of a remarkable number of essential genes on the second chromosomes and the abundance of conjugative and mobilizable transposons (CTns and MTns). The CTns/MTns are particularly abundant in the second chromosomes, involved in its extensive genome rearrangement, and have introduced a number of strain-specific genes into each strain. We also found a novel 188-bp repeat sequence that has been highly amplified in Pre. intermedia and are specifically distributed among the Pre. intermedia-related species. These findings expand our understanding of the genetic features of Pre. intermedia and the roles of CTns and MTns in the evolution of bacteria. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Molecular and Morphological Characterization of Fasciola spp. Isolated from Different Host Species in a Newly Emerging Focus of Human Fascioliasis in Iran

PubMed Central

Shafiei, Reza; Sarkari, Bahador; Sadjjadi, Seyed Mahmuod; Mowlavi, Gholam Reza; Moshfe, Abdolali

2014-01-01

The current study aimed to find out the morphometric and genotypic divergences of the flukes isolated from different hosts in a newly emerging focus of human fascioliasis in Iran. Adult Fasciola spp. were collected from 34 cattle, 13 sheep, and 11 goats from Kohgiluyeh and Boyer-Ahmad province, southwest of Iran. Genomic DNA was extracted from the flukes and PCR-RFLP was used to characterize the isolates. The ITS1, ITS2, and mitochondrial genes (mtDNA) of NDI and COI from individual liver flukes were amplified and the amplicons were sequenced. Genetic variation within and between the species was evaluated by comparing the sequences. Moreover, morphometric characteristics of flukes were measured through a computer image analysis system. Based on RFLP profile, from the total of 58 isolates, 41 isolates (from cattle, sheep, and goat) were identified as Fasciola hepatica, while 17 isolates from cattle were identified as Fasciola gigantica. Comparison of the ITS1 and ITS2 sequences showed six and seven single-base substitutions, resulting in segregation of the specimens into two different genotypes. The sequences of COI markers showed seven DNA polymorphic sites for F. hepatica and 35 DNA polymorphic sites for F. gigantica. Morphological diversity of the two species was observed in linear, ratios, and areas measurements. The findings have implications for studying the population genetics, epidemiology, and control of the disease. PMID:25018891
High-purity circular RNA isolation method (RPAD) reveals vast collection of intronic circRNAs.

PubMed

Panda, Amaresh C; De, Supriyo; Grammatikakis, Ioannis; Munk, Rachel; Yang, Xiaoling; Piao, Yulan; Dudekula, Dawood B; Abdelmohsen, Kotb; Gorospe, Myriam

2017-07-07

High-throughput RNA sequencing methods coupled with specialized bioinformatic analyses have recently uncovered tens of thousands of unique circular (circ)RNAs, but their complete sequences, genes of origin and functions are largely unknown. Given that circRNAs lack free ends and are thus relatively stable, their association with microRNAs (miRNAs) and RNA-binding proteins (RBPs) can influence gene expression programs. While exoribonuclease treatment is widely used to degrade linear RNAs and enrich circRNAs in RNA samples, it does not efficiently eliminate all linear RNAs. Here, we describe a novel method for the isolation of highly pure circRNA populations involving RNase R treatment followed by Polyadenylation and poly(A)+ RNA Depletion (RPAD), which removes linear RNA to near completion. High-throughput sequencing of RNA prepared using RPAD from human cervical carcinoma HeLa cells and mouse C2C12 myoblasts led to two surprising discoveries: (i) many exonic circRNA (EcircRNA) isoforms share an identical backsplice sequence but have different body sizes and sequences, and (ii) thousands of novel intronic circular RNAs (IcircRNAs) are expressed in cells. In sum, isolating high-purity circRNAs using the RPAD method can enable quantitative and qualitative analyses of circRNA types and sequence composition, paving the way for the elucidation of circRNA functions. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
High-purity circular RNA isolation method (RPAD) reveals vast collection of intronic circRNAs

PubMed Central

De, Supriyo; Grammatikakis, Ioannis; Munk, Rachel; Yang, Xiaoling; Piao, Yulan; Dudekula, Dawood B.; Gorospe, Myriam

2017-01-01

Abstract High-throughput RNA sequencing methods coupled with specialized bioinformatic analyses have recently uncovered tens of thousands of unique circular (circ)RNAs, but their complete sequences, genes of origin and functions are largely unknown. Given that circRNAs lack free ends and are thus relatively stable, their association with microRNAs (miRNAs) and RNA-binding proteins (RBPs) can influence gene expression programs. While exoribonuclease treatment is widely used to degrade linear RNAs and enrich circRNAs in RNA samples, it does not efficiently eliminate all linear RNAs. Here, we describe a novel method for the isolation of highly pure circRNA populations involving RNase R treatment followed by Polyadenylation and poly(A)+ RNA Depletion (RPAD), which removes linear RNA to near completion. High-throughput sequencing of RNA prepared using RPAD from human cervical carcinoma HeLa cells and mouse C2C12 myoblasts led to two surprising discoveries: (i) many exonic circRNA (EcircRNA) isoforms share an identical backsplice sequence but have different body sizes and sequences, and (ii) thousands of novel intronic circular RNAs (IcircRNAs) are expressed in cells. In sum, isolating high-purity circRNAs using the RPAD method can enable quantitative and qualitative analyses of circRNA types and sequence composition, paving the way for the elucidation of circRNA functions. PMID:28444238
Multiuser receiver for DS-CDMA signals in multipath channels: an enhanced multisurface method.

PubMed

Mahendra, Chetan; Puthusserypady, Sadasivan

2006-11-01

This paper deals with the problem of multiuser detection in direct-sequence code-division multiple-access (DS-CDMA) systems in multipath environments. The existing multiuser detectors can be divided into two categories: (1) low-complexity poor-performance linear detectors and (2) high-complexity good-performance nonlinear detectors. In particular, in channels where the orthogonality of the code sequences is destroyed by multipath, detectors with linear complexity perform much worse than the nonlinear detectors. In this paper, we propose an enhanced multisurface method (EMSM) for multiuser detection in multipath channels. EMSM is an intermediate piecewise linear detection scheme with a run-time complexity linear in the number of users. Its bit error rate performance is compared with existing linear detectors, a nonlinear radial basis function detector trained by the new support vector learning algorithm, and Verdu's optimal detector. Simulations in multipath channels, for both synchronous and asynchronous cases, indicate that it always outperforms all other linear detectors, performing nearly as well as nonlinear detectors.
PIPI: PTM-Invariant Peptide Identification Using Coding Method.

PubMed

Yu, Fengchao; Li, Ning; Yu, Weichuan

2016-12-02

In computational proteomics, the identification of peptides with an unlimited number of post-translational modification (PTM) types is a challenging task. The computational cost associated with database search increases exponentially with respect to the number of modified amino acids and linearly with respect to the number of potential PTM types at each amino acid. The problem becomes intractable very quickly if we want to enumerate all possible PTM patterns. To address this issue, one group of methods named restricted tools (including Mascot, Comet, and MS-GF+) only allow a small number of PTM types in database search process. Alternatively, the other group of methods named unrestricted tools (including MS-Alignment, ProteinProspector, and MODa) avoids enumerating PTM patterns with an alignment-based approach to localizing and characterizing modified amino acids. However, because of the large search space and PTM localization issue, the sensitivity of these unrestricted tools is low. This paper proposes a novel method named PIPI to achieve PTM-invariant peptide identification. PIPI belongs to the category of unrestricted tools. It first codes peptide sequences into Boolean vectors and codes experimental spectra into real-valued vectors. For each coded spectrum, it then searches the coded sequence database to find the top scored peptide sequences as candidates. After that, PIPI uses dynamic programming to localize and characterize modified amino acids in each candidate. We used simulation experiments and real data experiments to evaluate the performance in comparison with restricted tools (i.e., Mascot, Comet, and MS-GF+) and unrestricted tools (i.e., Mascot with error tolerant search, MS-Alignment, ProteinProspector, and MODa). Comparison with restricted tools shows that PIPI has a close sensitivity and running speed. Comparison with unrestricted tools shows that PIPI has the highest sensitivity except for Mascot with error tolerant search and ProteinProspector. These two tools simplify the task by only considering up to one modified amino acid in each peptide, which results in a higher sensitivity but has difficulty in dealing with multiple modified amino acids. The simulation experiments also show that PIPI has the lowest false discovery proportion, the highest PTM characterization accuracy, and the shortest running time among the unrestricted tools.
Control of the NASA Langley 16-Foot Transonic Tunnel with the Self-Organizing Feature Map

NASA Technical Reports Server (NTRS)

Motter, Mark A.

1998-01-01

A predictive, multiple model control strategy is developed based on an ensemble of local linear models of the nonlinear system dynamics for a transonic wind tunnel. The local linear models are estimated directly from the weights of a Self Organizing Feature Map (SOFM). Local linear modeling of nonlinear autonomous systems with the SOFM is extended to a control framework where the modeled system is nonautonomous, driven by an exogenous input. This extension to a control framework is based on the consideration of a finite number of subregions in the control space. Multiple self organizing feature maps collectively model the global response of the wind tunnel to a finite set of representative prototype controls. These prototype controls partition the control space and incorporate experimental knowledge gained from decades of operation. Each SOFM models the combination of the tunnel with one of the representative controls, over the entire range of operation. The SOFM based linear models are used to predict the tunnel response to a larger family of control sequences which are clustered on the representative prototypes. The control sequence which corresponds to the prediction that best satisfies the requirements on the system output is applied as the external driving signal. Each SOFM provides a codebook representation of the tunnel dynamics corresponding to a prototype control. Different dynamic regimes are organized into topological neighborhoods where the adjacent entries in the codebook represent the minimization of a similarity metric which is the essence of the self organizing feature of the map. Thus, the SOFM is additionally employed to identify the local dynamical regime, and consequently implements a switching scheme than selects the best available model for the applied control. Experimental results of controlling the wind tunnel, with the proposed method, during operational runs where strict research requirements on the control of the Mach number were met, are presented. Comparison to similar runs under the same conditions with the tunnel controlled by either the existing controller or an expert operator indicate the superiority of the method.

Linear reduction method for predictive and informative tag SNP selection.

PubMed

He, Jingwu; Westbrooks, Kelly; Zelikovsky, Alexander

2005-01-01

Constructing a complete human haplotype map is helpful when associating complex diseases with their related SNPs. Unfortunately, the number of SNPs is very large and it is costly to sequence many individuals. Therefore, it is desirable to reduce the number of SNPs that should be sequenced to a small number of informative representatives called tag SNPs. In this paper, we propose a new linear algebra-based method for selecting and using tag SNPs. We measure the quality of our tag SNP selection algorithm by comparing actual SNPs with SNPs predicted from selected linearly independent tag SNPs. Our experiments show that for sufficiently long haplotypes, knowing only 0.4% of all SNPs the proposed linear reduction method predicts an unknown haplotype with the error rate below 2% based on 10% of the population.
The World Health Organization Global Programme on AIDS proposal for standardization of HIV sequence nomenclature. WHO Network for HIV Isolation and Characterization.

PubMed

Korber, B T; Osmanov, S; Esparza, J; Myers, G

1994-11-01

The World Health Organization Global Programme on AIDS (WHO/GPA) is conducting a large-scale collaborative study of human immunodeficiency virus type 1 (HIV-1) variation, based in four potential vaccine-trial site countries: Brazil, Rwanda, Thailand, and Uganda. Through the course of this study, it was crucial to keep track of certain attributes of the samples from which the viral nucleotide sequences were derived (e.g., country of origin and viral culture characterization), so that meaningful sequence comparisons could be made. Here we describe a system developed in the context of the WHO/GPA study that summarizes such critical attributes by representing them as standardized characters directly incorporated into sequence names. This nomenclature allows linkage of clinical, phenotypic, and geographic information with molecular data. We propose that other investigators involved in human immunodeficiency virus (HIV) nucleotide sequencing efforts adopt a similar standardized sequence nomenclature to facilitate cross-study sequence comparison. HIV sequence data are being generated at an ever-increasing rate; directly coupled to this increase is our deepening understanding of biological parameters that influence or result from sequence variability. A standardized sequence nomenclature that includes relevant biological information would enable researchers to better utilize the growing body of sequence data, and enhance their ability to interpret the biological implications of their own data through facilitating comparisons with previously published work.
Comparison of linear and non-linear method in estimating the sorption isotherm parameters for safranin onto activated carbon.

PubMed

Kumar, K Vasanth; Sivanesan, S

2005-08-31

Comparison analysis of linear least square method and non-linear method for estimating the isotherm parameters was made using the experimental equilibrium data of safranin onto activated carbon at two different solution temperatures 305 and 313 K. Equilibrium data were fitted to Freundlich, Langmuir and Redlich-Peterson isotherm equations. All the three isotherm equations showed a better fit to the experimental equilibrium data. The results showed that non-linear method could be a better way to obtain the isotherm parameters. Redlich-Peterson isotherm is a special case of Langmuir isotherm when the Redlich-Peterson isotherm constant g was unity.
Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

PubMed

Dai, Qi; Yang, Yanchun; Wang, Tianming

2008-10-15

Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, Markov model or both. Motivated by adding k-word distributions to Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers the systematic and quantitative experimental assessment of our measures. Moreover, we compared our achievements with these based on alignment or alignment-free. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating curve) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences from a database and discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measure is used for phylogenetic analysis. The experimental assessment demonstrates that our similarity measures intending to incorporate k-word distributions into Markov model are more efficient.
Polarized radiance distribution measurements of skylight. I. System description and characterization.

PubMed

Voss, K J; Liu, Y

1997-08-20

A new system to measure the natural skylight polarized radiance distribution has been developed. The system is based on a fish-eye lens, CCD camera system, and filter changer. With this system sequences of images can be combined to determine the linear polarization components of the incident light field. Calibration steps to determine the system 's polarization characteristics are described. Comparisons of the radiance measurements of this system and a simple pointing radiometer were made in the field and agreed within 10 % for measurements at 560 and 670 nm and 25 % at 860 nm. Polarization tests were done in the laboratory. The accuracy of the intensity measurements is estimated to be 10 %, while the accuracy of measurements of elements of the Mueller matrix are estimated to be 2 %.
Modeling the dynamics of choice.

PubMed

Baum, William M; Davison, Michael

2009-06-01

A simple linear-operator model both describes and predicts the dynamics of choice that may underlie the matching relation. We measured inter-food choice within components of a schedule that presented seven different pairs of concurrent variable-interval schedules for 12 food deliveries each with no signals indicating which pair was in force. This measure of local choice was accurately described and predicted as obtained reinforcer sequences shifted it to favor one alternative or the other. The effect of a changeover delay was reflected in one parameter, the asymptote, whereas the effect of a difference in overall rate of food delivery was reflected in the other parameter, rate of approach to the asymptote. The model takes choice as a primary dependent variable, not derived by comparison between alternatives-an approach that agrees with the molar view of behaviour.
A confidence interval analysis of sampling effort, sequencing depth, and taxonomic resolution of fungal community ecology in the era of high-throughput sequencing.

PubMed

Oono, Ryoko

2017-01-01

High-throughput sequencing technology has helped microbial community ecologists explore ecological and evolutionary patterns at unprecedented scales. The benefits of a large sample size still typically outweigh that of greater sequencing depths per sample for accurate estimations of ecological inferences. However, excluding or not sequencing rare taxa may mislead the answers to the questions 'how and why are communities different?' This study evaluates the confidence intervals of ecological inferences from high-throughput sequencing data of foliar fungal endophytes as case studies through a range of sampling efforts, sequencing depths, and taxonomic resolutions to understand how technical and analytical practices may affect our interpretations. Increasing sampling size reliably decreased confidence intervals across multiple community comparisons. However, the effects of sequencing depths on confidence intervals depended on how rare taxa influenced the dissimilarity estimates among communities and did not significantly decrease confidence intervals for all community comparisons. A comparison of simulated communities under random drift suggests that sequencing depths are important in estimating dissimilarities between microbial communities under neutral selective processes. Confidence interval analyses reveal important biases as well as biological trends in microbial community studies that otherwise may be ignored when communities are only compared for statistically significant differences.
A confidence interval analysis of sampling effort, sequencing depth, and taxonomic resolution of fungal community ecology in the era of high-throughput sequencing

PubMed Central

2017-01-01

High-throughput sequencing technology has helped microbial community ecologists explore ecological and evolutionary patterns at unprecedented scales. The benefits of a large sample size still typically outweigh that of greater sequencing depths per sample for accurate estimations of ecological inferences. However, excluding or not sequencing rare taxa may mislead the answers to the questions ‘how and why are communities different?’ This study evaluates the confidence intervals of ecological inferences from high-throughput sequencing data of foliar fungal endophytes as case studies through a range of sampling efforts, sequencing depths, and taxonomic resolutions to understand how technical and analytical practices may affect our interpretations. Increasing sampling size reliably decreased confidence intervals across multiple community comparisons. However, the effects of sequencing depths on confidence intervals depended on how rare taxa influenced the dissimilarity estimates among communities and did not significantly decrease confidence intervals for all community comparisons. A comparison of simulated communities under random drift suggests that sequencing depths are important in estimating dissimilarities between microbial communities under neutral selective processes. Confidence interval analyses reveal important biases as well as biological trends in microbial community studies that otherwise may be ignored when communities are only compared for statistically significant differences. PMID:29253889
Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.

The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a frameworkmore » based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl. gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu« less
Reaction schemes visualized in network form: the syntheses of strychnine as an example.

PubMed

Proudfoot, John R

2013-05-24

Representation of synthesis sequences in a network form provides an effective method for the comparison of multiple reaction schemes and an opportunity to emphasize features such as reaction scale that are often relegated to experimental sections. An example of data formatting that allows construction of network maps in Cytoscape is presented, along with maps that illustrate the comparison of multiple reaction sequences, comparison of scaffold changes within sequences, and consolidation to highlight common key intermediates used across sequences. The 17 different synthetic routes reported for strychnine are used as an example basis set. The reaction maps presented required a significant data extraction and curation, and a standardized tabular format for reporting reaction information, if applied in a consistent way, could allow the automated combination of reaction information across different sources.
Alignment-free Transcriptomic and Metatranscriptomic Comparison Using Sequencing Signatures with Variable Length Markov Chains.

PubMed

Liao, Weinan; Ren, Jie; Wang, Kun; Wang, Shun; Zeng, Feng; Wang, Ying; Sun, Fengzhu

2016-11-23

The comparison between microbial sequencing data is critical to understand the dynamics of microbial communities. The alignment-based tools analyzing metagenomic datasets require reference sequences and read alignments. The available alignment-free dissimilarity approaches model the background sequences with Fixed Order Markov Chain (FOMC) yielding promising results for the comparison of microbial communities. However, in FOMC, the number of parameters grows exponentially with the increase of the order of Markov Chain (MC). Under a fixed high order of MC, the parameters might not be accurately estimated owing to the limitation of sequencing depth. In our study, we investigate an alternative to FOMC to model background sequences with the data-driven Variable Length Markov Chain (VLMC) in metatranscriptomic data. The VLMC originally designed for long sequences was extended to apply to high-throughput sequencing reads and the strategies to estimate the corresponding parameters were developed. The flexible number of parameters in VLMC avoids estimating the vast number of parameters of high-order MC under limited sequencing depth. Different from the manual selection in FOMC, VLMC determines the MC order adaptively. Several beta diversity measures based on VLMC were applied to compare the bacterial RNA-Seq and metatranscriptomic datasets. Experiments show that VLMC outperforms FOMC to model the background sequences in transcriptomic and metatranscriptomic samples. A software pipeline is available at https://d2vlmc.codeplex.com.
Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing

USDA-ARS?s Scientific Manuscript database

Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...
Association analysis for feet and legs disorders with whole-genome sequence variants in 3 dairy cattle breeds.

PubMed

Wu, Xiaoping; Guldbrandtsen, Bernt; Lund, Mogens Sandø; Sahana, Goutam

2016-09-01

Identification of genetic variants associated with feet and legs disorders (FLD) will aid in the genetic improvement of these traits by providing knowledge on genes that influence trait variations. In Denmark, FLD in cattle has been recorded since the 1990s. In this report, we used deregressed breeding values as response variables for a genome-wide association study. Bulls (5,334 Danish Holstein, 4,237 Nordic Red Dairy Cattle, and 1,180 Danish Jersey) with deregressed estimated breeding values were genotyped with the Illumina Bovine 54k single nucleotide polymorphism (SNP) genotyping array. Genotypes were imputed to whole-genome sequence variants, and then 22,751,039 SNP on 29 autosomes were used for an association analysis. A modified linear mixed-model approach (efficient mixed-model association eXpedited, EMMAX) and a linear mixed model were used for association analysis. We identified 5 (3,854 SNP), 3 (13,642 SNP), and 0 quantitative trait locus (QTL) regions associated with the FLD index in Danish Holstein, Nordic Red Dairy Cattle, and Danish Jersey populations, respectively. We did not identify any QTL that were common among the 3 breeds. In a meta-analysis of the 3 breeds, 4 QTL regions were significant, but no additional QTL region was identified compared with within-breed analyses. Comparison between top SNP locations within these QTL regions and known genes suggested that RASGRP1, LCORL, MOS, and MITF may be candidate genes for FLD in dairy cattle. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Identification of divergent protein domains by combining HMM-HMM comparisons and co-occurrence detection.

PubMed

Ghouila, Amel; Florent, Isabelle; Guerfali, Fatma Zahra; Terrapon, Nicolas; Laouini, Dhafer; Yahia, Sadok Ben; Gascuel, Olivier; Bréhélin, Laurent

2014-01-01

Identification of protein domains is a key step for understanding protein function. Hidden Markov Models (HMMs) have proved to be a powerful tool for this task. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in sequenced organisms. This is done via sequence/HMM comparisons. However, this approach may lack sensitivity when searching for domains in divergent species. Recently, methods for HMM/HMM comparisons have been proposed and proved to be more sensitive than sequence/HMM approaches in certain cases. However, these approaches are usually not used for protein domain discovery at a genome scale, and the benefit that could be expected from their utilization for this problem has not been investigated. Using proteins of P. falciparum and L. major as examples, we investigate the extent to which HMM/HMM comparisons can identify new domain occurrences not already identified by sequence/HMM approaches. We show that although HMM/HMM comparisons are much more sensitive than sequence/HMM comparisons, they are not sufficiently accurate to be used as a standalone complement of sequence/HMM approaches at the genome scale. Hence, we propose to use domain co-occurrence--the general domain tendency to preferentially appear along with some favorite domains in the proteins--to improve the accuracy of the approach. We show that the combination of HMM/HMM comparisons and co-occurrence domain detection boosts protein annotations. At an estimated False Discovery Rate of 5%, it revealed 901 and 1098 new domains in Plasmodium and Leishmania proteins, respectively. Manual inspection of part of these predictions shows that it contains several domain families that were missing in the two organisms. All new domain occurrences have been integrated in the EuPathDomains database, along with the GO annotations that can be deduced.
Identification of Divergent Protein Domains by Combining HMM-HMM Comparisons and Co-Occurrence Detection

PubMed Central

Ghouila, Amel; Florent, Isabelle; Guerfali, Fatma Zahra; Terrapon, Nicolas; Laouini, Dhafer; Yahia, Sadok Ben; Gascuel, Olivier; Bréhélin, Laurent

2014-01-01

Identification of protein domains is a key step for understanding protein function. Hidden Markov Models (HMMs) have proved to be a powerful tool for this task. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in sequenced organisms. This is done via sequence/HMM comparisons. However, this approach may lack sensitivity when searching for domains in divergent species. Recently, methods for HMM/HMM comparisons have been proposed and proved to be more sensitive than sequence/HMM approaches in certain cases. However, these approaches are usually not used for protein domain discovery at a genome scale, and the benefit that could be expected from their utilization for this problem has not been investigated. Using proteins of P. falciparum and L. major as examples, we investigate the extent to which HMM/HMM comparisons can identify new domain occurrences not already identified by sequence/HMM approaches. We show that although HMM/HMM comparisons are much more sensitive than sequence/HMM comparisons, they are not sufficiently accurate to be used as a standalone complement of sequence/HMM approaches at the genome scale. Hence, we propose to use domain co-occurrence — the general domain tendency to preferentially appear along with some favorite domains in the proteins — to improve the accuracy of the approach. We show that the combination of HMM/HMM comparisons and co-occurrence domain detection boosts protein annotations. At an estimated False Discovery Rate of 5%, it revealed 901 and 1098 new domains in Plasmodium and Leishmania proteins, respectively. Manual inspection of part of these predictions shows that it contains several domain families that were missing in the two organisms. All new domain occurrences have been integrated in the EuPathDomains database, along with the GO annotations that can be deduced. PMID:24901648
Progressive structure-based alignment of homologous proteins: Adopting sequence comparison strategies.

PubMed

Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G

2012-09-01

Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation as a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB based alignments are used to obtain a three dimensional fit of the structures followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underlines that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD, in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicate that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
Comparing light sensitivity, linearity and step response of electronic cameras for ophthalmology.

PubMed

Kopp, O; Markert, S; Tornow, R P

2002-01-01

To develop and test a procedure to measure and compare light sensitivity, linearity and step response of electronic cameras. The pixel value (PV) of digitized images as a function of light intensity (I) was measured. The sensitivity was calculated from the slope of the P(I) function, the linearity was estimated from the correlation coefficient of this function. To measure the step response, a short sequence of images was acquired. During acquisition, a light source was switched on and off using a fast shutter. The resulting PV was calculated for each video field of the sequence. A CCD camera optimized for the near-infrared (IR) spectrum showed the highest sensitivity for both, visible and IR light. There are little differences in linearity. The step response depends on the procedure of integration and read out.
Equivalent linearization for fatigue life estimates of a nonlinear structure

NASA Technical Reports Server (NTRS)

Miles, R. N.

1989-01-01

An analysis is presented of the suitability of the method of equivalent linearization for estimating the fatigue life of a nonlinear structure. Comparisons are made of the fatigue life of a nonlinear plate as predicted using conventional equivalent linearization and three other more accurate methods. The excitation of the plate is assumed to be Gaussian white noise and the plate response is modeled using a single resonant mode. The methods used for comparison consist of numerical simulation, a probabalistic formulation, and a modification of equivalent linearization which avoids the usual assumption that the response process is Gaussian. Remarkably close agreement is obtained between all four methods, even for cases where the response is significantly linear.
Understanding Linear Function: A Comparison of Selected Textbooks from England and Shanghai

ERIC Educational Resources Information Center

Wang, Yuqian; Barmby, Patrick; Bolden, David

2017-01-01

This study describes a comparison of how worked examples in selected textbooks from England and Shanghai presented possible learning trajectories towards understanding linear function. Six selected English textbooks and one Shanghai compulsory textbook were analysed with regards to the understanding required for pure mathematics knowledge in…
A De-Novo Genome Analysis Pipeline (DeNoGAP) for large-scale comparative prokaryotic genomics studies.

PubMed

Thakur, Shalabh; Guttman, David S

2016-06-30

Comparative analysis of whole genome sequence data from closely related prokaryotic species or strains is becoming an increasingly important and accessible approach for addressing both fundamental and applied biological questions. While there are number of excellent tools developed for performing this task, most scale poorly when faced with hundreds of genome sequences, and many require extensive manual curation. We have developed a de-novo genome analysis pipeline (DeNoGAP) for the automated, iterative and high-throughput analysis of data from comparative genomics projects involving hundreds of whole genome sequences. The pipeline is designed to perform reference-assisted and de novo gene prediction, homolog protein family assignment, ortholog prediction, functional annotation, and pan-genome analysis using a range of proven tools and databases. While most existing methods scale quadratically with the number of genomes since they rely on pairwise comparisons among predicted protein sequences, DeNoGAP scales linearly since the homology assignment is based on iteratively refined hidden Markov models. This iterative clustering strategy enables DeNoGAP to handle a very large number of genomes using minimal computational resources. Moreover, the modular structure of the pipeline permits easy updates as new analysis programs become available. DeNoGAP integrates bioinformatics tools and databases for comparative analysis of a large number of genomes. The pipeline offers tools and algorithms for annotation and analysis of completed and draft genome sequences. The pipeline is developed using Perl, BioPerl and SQLite on Ubuntu Linux version 12.04 LTS. Currently, the software package accompanies script for automated installation of necessary external programs on Ubuntu Linux; however, the pipeline should be also compatible with other Linux and Unix systems after necessary external programs are installed. DeNoGAP is freely available at https://sourceforge.net/projects/denogap/ .

Nucleotide sequence of the Varkud mitochondrial plasmid of Neurospora and synthesis of a hybrid transcript with a 5' leader derived from mitochondrial RNA.

PubMed

Akins, R A; Grant, D M; Stohl, L L; Bottorff, D A; Nargang, F E; Lambowitz, A M

1988-11-05

The Mauriceville and Varkud mitochondrial plasmids of Neurospora are closely related, closed circular DNAs (3.6 and 3.7 kb, respectively; 1 kb = 10(3) bases or base-pairs), whose characteristics suggest relationships to mitochondrial DNA introns and retrotransposons. Here, we characterized the structure of the Varkud plasmid, determined its complete nucleotide sequence and mapped its major transcripts. The Mauriceville and Varkud plasmids have more than 97% positional identity. Both plasmids contain a 710 amino acid open reading frame that encodes a reverse transcriptase-like protein. The amino acid sequence of this open reading frame is strongly conserved between the two plasmids (701/710 amino acids) as expected for a functionally important protein. Both plasmids have a 0.4 kb region that contains five PstI palindromes and a direct repeat of approximately 160 base-pairs. Comparison of sequences in this region suggests that the Varkud plasmid has diverged less from a common ancestor than has the Mauriceville plasmid. Two major transcripts of the Varkud plasmid were detected by Northern hybridization experiments: a full-length linear RNA of 3.7 kb and an additional prominent transcript of 4.9 kb, 1.2 kb longer than monomer plasmid. Remarkably, we find that the 4.9 kb transcript is a hybrid RNA consisting of the full-length 3.7 kb Varkud plasmid transcript plus a 5' leader of 1.2 kb that is derived from the 5' end of the mitochondrial small rRNA. This and other findings suggest that the Varkud plasmid, like certain RNA viruses, has a mechanism for joining heterologous RNAs to the 5' end of its major transcript, and that, under some circumstances, nucleotide sequences in mitochondria may be recombined at the RNA level.
Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches.

PubMed

Zhang, Yiming; Jin, Quan; Wang, Shuting; Ren, Ren

2011-05-01

The mobile behavior of 1481 peptides in ion mobility spectrometry (IMS), which are generated by protease digestion of the Drosophila melanogaster proteome, is modeled and predicted based on two different types of characterization methods, i.e. sequence-based approach and structure-based approach. In this procedure, the sequence-based approach considers both the amino acid composition of a peptide and the local environment profile of each amino acid in the peptide; the structure-based approach is performed with the CODESSA protocol, which regards a peptide as a common organic compound and generates more than 200 statistically significant variables to characterize the whole structure profile of a peptide molecule. Subsequently, the nonlinear support vector machine (SVM) and Gaussian process (GP) as well as linear partial least squares (PLS) regression is employed to correlate the structural parameters of the characterizations with the IMS drift times of these peptides. The obtained quantitative structure-spectrum relationship (QSSR) models are evaluated rigorously and investigated systematically via both one-deep and two-deep cross-validations as well as the rigorous Monte Carlo cross-validation (MCCV). We also give a comprehensive comparison on the resulting statistics arising from the different combinations of variable types with modeling methods and find that the sequence-based approach can give the QSSR models with better fitting ability and predictive power but worse interpretability than the structure-based approach. In addition, though the QSSR modeling using sequence-based approach is not needed for the preparation of the minimization structures of peptides before the modeling, it would be considerably efficient as compared to that using structure-based approach. Copyright © 2011 Elsevier Ltd. All rights reserved.
Molecular basis of branched peptides resistance to enzyme proteolysis.

PubMed

Falciani, Chiara; Lozzi, Luisa; Pini, Alessandro; Corti, Federico; Fabbrini, Monica; Bernini, Andrea; Lelli, Barbara; Niccolai, Neri; Bracci, Luisa

2007-03-01

We found that synthetic peptides in the form of dendrimers become resistant to proteolysis. To determine the molecular basis of this resistance, different bioactive peptides were synthesized in monomeric, two-branched and tetra-branched form and incubated with human plasma and serum. Proteolytic resistance of branched multimeric sequences was compared to that of the same peptides synthesized as multimeric linear molecules. Unmodified peptides and cleaved sequences were detected by high pressure liquid chromatography and mass spectrometry. An increase in peptide copies did not increase peptide resistance in linear multimeric sequences, whereas multimericity progressively enhanced proteolytic stability of branched multimeric peptides. A structure-based hypothesis of branched peptide resistance to proteolysis by metallopeptidases is presented.
BLAST and FASTA similarity searching for multiple sequence alignment.

PubMed

Pearson, William R

2014-01-01

BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry-homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies target for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data.

PubMed

Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A; Larsen, Martin Jakob

2016-01-01

Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths.
Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data

PubMed Central

Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A.; Larsen, Martin Jakob

2016-01-01

Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths. PMID:27002637
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.

PubMed

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-03-01

Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
Random-effects linear modeling and sample size tables for two special crossover designs of average bioequivalence studies: the four-period, two-sequence, two-formulation and six-period, three-sequence, three-formulation designs.

PubMed

Diaz, Francisco J; Berg, Michel J; Krebill, Ron; Welty, Timothy; Gidal, Barry E; Alloway, Rita; Privitera, Michael

2013-12-01

Due to concern and debate in the epilepsy medical community and to the current interest of the US Food and Drug Administration (FDA) in revising approaches to the approval of generic drugs, the FDA is currently supporting ongoing bioequivalence studies of antiepileptic drugs, the EQUIGEN studies. During the design of these crossover studies, the researchers could not find commercial or non-commercial statistical software that quickly allowed computation of sample sizes for their designs, particularly software implementing the FDA requirement of using random-effects linear models for the analyses of bioequivalence studies. This article presents tables for sample-size evaluations of average bioequivalence studies based on the two crossover designs used in the EQUIGEN studies: the four-period, two-sequence, two-formulation design, and the six-period, three-sequence, three-formulation design. Sample-size computations assume that random-effects linear models are used in bioequivalence analyses with crossover designs. Random-effects linear models have been traditionally viewed by many pharmacologists and clinical researchers as just mathematical devices to analyze repeated-measures data. In contrast, a modern view of these models attributes an important mathematical role in theoretical formulations in personalized medicine to them, because these models not only have parameters that represent average patients, but also have parameters that represent individual patients. Moreover, the notation and language of random-effects linear models have evolved over the years. Thus, another goal of this article is to provide a presentation of the statistical modeling of data from bioequivalence studies that highlights the modern view of these models, with special emphasis on power analyses and sample-size computations.
Rooting gene trees without outgroups: EP rooting.

PubMed

Sinsheimer, Janet S; Little, Roderick J A; Lake, James A

2012-01-01

Gene sequences are routinely used to determine the topologies of unrooted phylogenetic trees, but many of the most important questions in evolution require knowing both the topologies and the roots of trees. However, general algorithms for calculating rooted trees from gene and genomic sequences in the absence of gene paralogs are few. Using the principles of evolutionary parsimony (EP) (Lake JA. 1987a. A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony. Mol Biol Evol. 4:167-181) and its extensions (Cavender, J. 1989. Mechanized derivation of linear invariants. Mol Biol Evol. 6:301-316; Nguyen T, Speed TP. 1992. A derivation of all linear invariants for a nonbalanced transversion model. J Mol Evol. 35:60-76), we explicitly enumerate all linear invariants that solely contain rooting information and derive algorithms for rooting gene trees directly from gene and genomic sequences. These new EP linear rooting invariants allow one to determine rooted trees, even in the complete absence of outgroups and gene paralogs. EP rooting invariants are explicitly derived for three taxon trees, and rules for their extension to four or more taxa are provided. The method is demonstrated using 18S ribosomal DNA to illustrate how the new animal phylogeny (Aguinaldo AMA et al. 1997. Evidence for a clade of nematodes, arthropods, and other moulting animals. Nature 387:489-493; Lake JA. 1990. Origin of the metazoa. Proc Natl Acad Sci USA 87:763-766) may be rooted directly from sequences, even when they are short and paralogs are unavailable. These results are consistent with the current root (Philippe H et al. 2011. Acoelomorph flatworms are deuterostomes related to Xenoturbella. Nature 470:255-260).
Rooting Gene Trees without Outgroups: EP Rooting

PubMed Central

Sinsheimer, Janet S.; Little, Roderick J. A.; Lake, James A.

2012-01-01

Gene sequences are routinely used to determine the topologies of unrooted phylogenetic trees, but many of the most important questions in evolution require knowing both the topologies and the roots of trees. However, general algorithms for calculating rooted trees from gene and genomic sequences in the absence of gene paralogs are few. Using the principles of evolutionary parsimony (EP) (Lake JA. 1987a. A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony. Mol Biol Evol. 4:167–181) and its extensions (Cavender, J. 1989. Mechanized derivation of linear invariants. Mol Biol Evol. 6:301–316; Nguyen T, Speed TP. 1992. A derivation of all linear invariants for a nonbalanced transversion model. J Mol Evol. 35:60–76), we explicitly enumerate all linear invariants that solely contain rooting information and derive algorithms for rooting gene trees directly from gene and genomic sequences. These new EP linear rooting invariants allow one to determine rooted trees, even in the complete absence of outgroups and gene paralogs. EP rooting invariants are explicitly derived for three taxon trees, and rules for their extension to four or more taxa are provided. The method is demonstrated using 18S ribosomal DNA to illustrate how the new animal phylogeny (Aguinaldo AMA et al. 1997. Evidence for a clade of nematodes, arthropods, and other moulting animals. Nature 387:489–493; Lake JA. 1990. Origin of the metazoa. Proc Natl Acad Sci USA 87:763–766) may be rooted directly from sequences, even when they are short and paralogs are unavailable. These results are consistent with the current root (Philippe H et al. 2011. Acoelomorph flatworms are deuterostomes related to Xenoturbella. Nature 470:255–260). PMID:22593551
Isoelectronic studies of the 5s/sup 2/ /sup 1/S/sub 0/-5s5p/sup 1,3/P/sub J/ intervals in the Cd sequence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Curtis, L.J.

1986-02-01

The 5s/sup 2/ /sup 1/S/sub 0/-5s5p/sup 1,3/P/sub J/ energy intervals in the Cd isoelectronic sequence have been investigated through a semiempirical systematization of recent measurements and through the performance of ab initio multiconfiguration Dirac-Fock calculations. Screening-parameter reductions of the spin-orbit and exchange energies both for the observed data and for the theoretically computed values establish the existence of empirical linearities similar to those exploited earlier for the Be, Mg, and Zn sequences. This permits extrapolative isoelectronic predictions of the relative energies of the 5s5p levels, which can be connected to 5s/sup 2/ using intersinglet intervals obtained from empirically corrected abmore » initio calculations. These linearities have also been examined homologously for the Zn, Cd, and Hg sequences, and common relationships have been found that accurately describe all three of these sequences.« less
Dynamic ASXL1 Exon Skipping and Alternative Circular Splicing in Single Human Cells

PubMed Central

Natarajan, Sivaraman; Carter, Robert; Brown, Patrick O.

2016-01-01

Circular RNAs comprise a poorly understood new class of noncoding RNA. In this study, we used a combination of targeted deletion, high-resolution splicing detection, and single-cell sequencing to deeply probe ASXL1 circular splicing. We found that efficient circular splicing required the canonical transcriptional start site and inverted AluSx elements. Sequencing-based interrogation of isoforms after ASXL1 overexpression identified promiscuous linear splicing between all exons, with the two most abundant non-canonical linear products skipping the exons that produced the circular isoforms. Single-cell sequencing revealed a strong preference for either the linear or circular ASXL1 isoforms in each cell, and found the predominant exon skipping product is frequently co-expressed with its reciprocal circular isoform. Finally, absolute quantification of ASXL1 isoforms confirmed our findings and suggests that standard methods overestimate circRNA abundance. Taken together, these data reveal a dynamic new view of circRNA genesis, providing additional framework for studying their roles in cellular biology. PMID:27736885
IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS

EPA Science Inventory

Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...
Comparison of Nonlinear Random Response Using Equivalent Linearization and Numerical Simulation

NASA Technical Reports Server (NTRS)

Rizzi, Stephen A.; Muravyov, Alexander A.

2000-01-01

A recently developed finite-element-based equivalent linearization approach for the analysis of random vibrations of geometrically nonlinear multiple degree-of-freedom structures is validated. The validation is based on comparisons with results from a finite element based numerical simulation analysis using a numerical integration technique in physical coordinates. In particular, results for the case of a clamped-clamped beam are considered for an extensive load range to establish the limits of validity of the equivalent linearization approach.
The Multifaceted Variable Approach: Selection of Method in Solving Simple Linear Equations

ERIC Educational Resources Information Center

Tahir, Salma; Cavanagh, Michael

2010-01-01

This paper presents a comparison of the solution strategies used by two groups of Year 8 students as they solved linear equations. The experimental group studied algebra following a multifaceted variable approach, while the comparison group used a traditional approach. Students in the experimental group employed different solution strategies,…
Integrated Bayesian models of learning and decision making for saccadic eye movements.

PubMed

Brodersen, Kay H; Penny, Will D; Harrison, Lee M; Daunizeau, Jean; Ruff, Christian C; Duzel, Emrah; Friston, Karl J; Stephan, Klaas E

2008-11-01

The neurophysiology of eye movements has been studied extensively, and several computational models have been proposed for decision-making processes that underlie the generation of eye movements towards a visual stimulus in a situation of uncertainty. One class of models, known as linear rise-to-threshold models, provides an economical, yet broadly applicable, explanation for the observed variability in the latency between the onset of a peripheral visual target and the saccade towards it. So far, however, these models do not account for the dynamics of learning across a sequence of stimuli, and they do not apply to situations in which subjects are exposed to events with conditional probabilities. In this methodological paper, we extend the class of linear rise-to-threshold models to address these limitations. Specifically, we reformulate previous models in terms of a generative, hierarchical model, by combining two separate sub-models that account for the interplay between learning of target locations across trials and the decision-making process within trials. We derive a maximum-likelihood scheme for parameter estimation as well as model comparison on the basis of log likelihood ratios. The utility of the integrated model is demonstrated by applying it to empirical saccade data acquired from three healthy subjects. Model comparison is used (i) to show that eye movements do not only reflect marginal but also conditional probabilities of target locations, and (ii) to reveal subject-specific learning profiles over trials. These individual learning profiles are sufficiently distinct that test samples can be successfully mapped onto the correct subject by a naïve Bayes classifier. Altogether, our approach extends the class of linear rise-to-threshold models of saccadic decision making, overcomes some of their previous limitations, and enables statistical inference both about learning of target locations across trials and the decision-making process within trials.
Device and method for imaging of non-linear and linear properties of formations surrounding a borehole

DOEpatents

Johnson, Paul A; Tencate, James A; Le Bas, Pierre-Yves; Guyer, Robert; Vu, Cung Khac; Skelt, Christopher

2013-11-05

In some aspects of the disclosure, a method and an apparatus is disclosed for investigating material surrounding the borehole. The method includes generating a first low frequency acoustic wave within the borehole, wherein the first low frequency acoustic wave induces a linear and a nonlinear response in one or more features in the material that are substantially perpendicular to a radius of the borehole; directing a first sequence of high frequency pulses in a direction perpendicularly with respect to the longitudinal axis of the borehole into the material contemporaneously with the first acoustic wave; and receiving one or more second high frequency pulses at one or more receivers positionable in the borehole produced by an interaction between the first sequence of high frequency pulses and the one or more features undergoing linear and nonlinear elastic distortion due to the first low frequency acoustic wave to investigate the material surrounding the borehole.
A state-based probabilistic model for tumor respiratory motion prediction

NASA Astrophysics Data System (ADS)

Kalet, Alan; Sandison, George; Wu, Huanmei; Schmitz, Ruth

2010-12-01

This work proposes a new probabilistic mathematical model for predicting tumor motion and position based on a finite state representation using the natural breathing states of exhale, inhale and end of exhale. Tumor motion was broken down into linear breathing states and sequences of states. Breathing state sequences and the observables representing those sequences were analyzed using a hidden Markov model (HMM) to predict the future sequences and new observables. Velocities and other parameters were clustered using a k-means clustering algorithm to associate each state with a set of observables such that a prediction of state also enables a prediction of tumor velocity. A time average model with predictions based on average past state lengths was also computed. State sequences which are known a priori to fit the data were fed into the HMM algorithm to set a theoretical limit of the predictive power of the model. The effectiveness of the presented probabilistic model has been evaluated for gated radiation therapy based on previously tracked tumor motion in four lung cancer patients. Positional prediction accuracy is compared with actual position in terms of the overall RMS errors. Various system delays, ranging from 33 to 1000 ms, were tested. Previous studies have shown duty cycles for latencies of 33 and 200 ms at around 90% and 80%, respectively, for linear, no prediction, Kalman filter and ANN methods as averaged over multiple patients. At 1000 ms, the previously reported duty cycles range from approximately 62% (ANN) down to 34% (no prediction). Average duty cycle for the HMM method was found to be 100% and 91 ± 3% for 33 and 200 ms latency and around 40% for 1000 ms latency in three out of four breathing motion traces. RMS errors were found to be lower than linear and no prediction methods at latencies of 1000 ms. The results show that for system latencies longer than 400 ms, the time average HMM prediction outperforms linear, no prediction, and the more general HMM-type predictive models. RMS errors for the time average model approach the theoretical limit of the HMM, and predicted state sequences are well correlated with sequences known to fit the data.
A symmetric bipolar nebula around MWC 922.

PubMed

Tuthill, P G; Lloyd, J P

2007-04-13

We report regular and symmetric structure around dust-enshrouded Be star MWC 922 obtained with infrared imaging. Biconical lobes that appear nearly square in aspect, forming this "Red Square" nebula, are crossed by a series of rungs that terminate in bright knots or "vortices," and an equatorial dark band crossing the core delimits twin hyperbolic arcs. The intricate yet cleanly constructed forms that comprise the skeleton of the object argue for minimal perturbation from global turbulent or chaotic effects. We also report the presence of a linear comb structure, which may arise from optically projected shadows of a periodic feature in the inner regions, such as corrugations in the rim of a circumstellar disk. The sequence of nested polar rings draws comparison with the triple-ring system seen around the only naked-eye supernova in recent history: SN1987A.
Marker optimization for facial motion acquisition and deformation.

PubMed

Le, Binh H; Zhu, Mingyang; Deng, Zhigang

2013-11-01

A long-standing problem in marker-based facial motion capture is what are the optimal facial mocap marker layouts. Despite its wide range of potential applications, this problem has not yet been systematically explored to date. This paper describes an approach to compute optimized marker layouts for facial motion acquisition as optimization of characteristic control points from a set of high-resolution, ground-truth facial mesh sequences. Specifically, the thin-shell linear deformation model is imposed onto the example pose reconstruction process via optional hard constraints such as symmetry and multiresolution constraints. Through our experiments and comparisons, we validate the effectiveness, robustness, and accuracy of our approach. Besides guiding minimal yet effective placement of facial mocap markers, we also describe and demonstrate its two selected applications: marker-based facial mesh skinning and multiresolution facial performance capture.

[A delayed motor production of open chains of linear strokes presented visually in static and dynamic modes: a comparison between 9 to 11 years old children and adults].

PubMed

Antonova, A A; Absatova, K A; Korneev, A A; Kurgansky, A V

2015-01-01

The production of drawing movements was studied in 29 right-handed children of 9-to-11 years old. The movements were the sequences of horizontal and vertical linear stokes conjoined at right angle (open polygonal chains) referred to throughout the paper as trajectories. The length of a trajectory varied from 4 to 6. The trajectories were presented visually to a subject in static (linedrawing) and dynamic (moving cursor that leaves no trace) modes. The subjects were asked to draw (copy) a trajectory in response to delayed go-signal (short click) as fast as possible without lifting the pen. The production latency time, the average movement duration along a trajectory segment, and overall number of errors committed by a subject during trajectory production were analyzed. A comparison of children's data with similar data in adults (16 subjects) shows the following. First, a substantial reduction in error rate is observed in the age range between 9 and 11 years old for both static and dynamic modes of trajectory presentation, with children of 11 still committing more error than adults. Second, the averaged movement duration shortens with age while the latency time tends to increase. Third, unlike the adults, the children of 9-11 do not show any difference in latency time between static and dynamic modes of visual presentation of trajectories. The difference in trajectory production between adult and children is attributed to the predominant involvement of on-line programming in children and pre-programming in adults.
Terminations of DNA synthesis on 'proflavine and light'-treated phi X174 single-stranded DNA.

PubMed

Piette, J; Calberg-Bacq, C M; Lopez, M; van de Vorst, A

1984-04-05

Bacteriophage phi X174 single-stranded DNA molecules were primed with five different restriction fragments and irradiated with visible light in the presence of proflavine. This photodamaged DNA was used as template for the in vitro complementary chain synthesis by E. coli DNA polymerase I (Klenow fragment). Chain terminations were observed by polyacrylamide gel electrophoresis of the synthesized products and localized by comparison with standard sequencing performed simultaneously on the untreated template. 90% of the chain terminations occurred one nucleotide before a guanine residue in the template strand. More than 80% of the sequenced guanine residues were blocking lesions demonstrating the absence of 'hot-spots' for the photodamaging effect of proflavine. At a defined position, the chain termination frequency increased linearly with the irradiation time and was directly influenced by the proflavine concentration present. An important part of lesions resulted from the action of singlet oxygen produced by excited proflavine as shown by the effect that both NaN3 and 2H2O exerted on the reaction. The induced blocking lesions must be important in vivo since no complete replicative forms could be extracted from cell infected with bacteriophages inactivated by 'proflavine and light' treatment.
Comparison of methods for calculating conditional expectations of sufficient statistics for continuous time Markov chains.

PubMed

Tataru, Paula; Hobolth, Asger

2011-12-05

Continuous time Markov chains (CTMCs) is a widely used model for describing the evolution of DNA sequences on the nucleotide, amino acid or codon level. The sufficient statistics for CTMCs are the time spent in a state and the number of changes between any two states. In applications past evolutionary events (exact times and types of changes) are unaccessible and the past must be inferred from DNA sequence data observed in the present. We describe and implement three algorithms for computing linear combinations of expected values of the sufficient statistics, conditioned on the end-points of the chain, and compare their performance with respect to accuracy and running time. The first algorithm is based on an eigenvalue decomposition of the rate matrix (EVD), the second on uniformization (UNI), and the third on integrals of matrix exponentials (EXPM). The implementation in R of the algorithms is available at http://www.birc.au.dk/~paula/. We use two different models to analyze the accuracy and eight experiments to investigate the speed of the three algorithms. We find that they have similar accuracy and that EXPM is the slowest method. Furthermore we find that UNI is usually faster than EVD.
Sorting by Cuts, Joins, and Whole Chromosome Duplications.

PubMed

Zeira, Ron; Shamir, Ron

2017-02-01

Genome rearrangement problems have been extensively studied due to their importance in biology. Most studied models assumed a single copy per gene. However, in reality, duplicated genes are common, most notably in cancer. In this study, we make a step toward handling duplicated genes by considering a model that allows the atomic operations of cut, join, and whole chromosome duplication. Given two linear genomes, [Formula: see text] with one copy per gene and [Formula: see text] with two copies per gene, we give a linear time algorithm for computing a shortest sequence of operations transforming [Formula: see text] into [Formula: see text] such that all intermediate genomes are linear. We also show that computing an optimal sequence with fewest duplications is NP-hard.
Improved diagonal queue medical image steganography using Chaos theory, LFSR, and Rabin cryptosystem.

PubMed

Jain, Mamta; Kumar, Anil; Choudhary, Rishabh Charan

2017-06-01

In this article, we have proposed an improved diagonal queue medical image steganography for patient secret medical data transmission using chaotic standard map, linear feedback shift register, and Rabin cryptosystem, for improvement of previous technique (Jain and Lenka in Springer Brain Inform 3:39-51, 2016). The proposed algorithm comprises four stages, generation of pseudo-random sequences (pseudo-random sequences are generated by linear feedback shift register and standard chaotic map), permutation and XORing using pseudo-random sequences, encryption using Rabin cryptosystem, and steganography using the improved diagonal queues. Security analysis has been carried out. Performance analysis is observed using MSE, PSNR, maximum embedding capacity, as well as by histogram analysis between various Brain disease stego and cover images.
Using video-oriented instructions to speed up sequence comparison.

PubMed

Wozniak, A

1997-04-01

This document presents an implementation of the well-known Smith-Waterman algorithm for comparison of proteic and nucleic sequences, using specialized video instructions. These instructions, SIMD-like in their design, make possible parallelization of the algorithm at the instruction level. Benchmarks on an ULTRA SPARC running at 167 MHz show a speed-up factor of two compared to the same algorithm implemented with integer instructions on the same machine. Performance reaches over 18 million matrix cells per second on a single processor, giving to our knowledge the fastest implementation of the Smith-Waterman algorithm on a workstation. The accelerated procedure was introduced in LASSAP--a LArge Scale Sequence compArison Package software developed at INRIA--which handles parallelism at higher level. On a SUN Enterprise 6000 server with 12 processors, a speed of nearly 200 million matrix cells per second has been obtained. A sequence of length 300 amino acids is scanned against SWISSPROT R33 (1,8531,385 residues) in 29 s. This procedure is not restricted to databank scanning. It applies to all cases handled by LASSAP (intra- and inter-bank comparisons, Z-score computation, etc.
Digital signal processing methods for biosequence comparison.

PubMed Central

Benson, D C

1990-01-01

A method is discussed for DNA or protein sequence comparison using a finite field fast Fourier transform, a digital signal processing technique; and statistical methods are discussed for analyzing the output of this algorithm. This method compares two sequences of length N in computing time proportional to N log N compared to N2 for methods currently used. This method makes it feasible to compare very long sequences. An example is given to show that the method correctly identifies sites of known homology. PMID:2349096
A note on chaotic unimodal maps and applications.

PubMed

Zhou, C T; He, X T; Yu, M Y; Chew, L Y; Wang, X G

2006-09-01

Based on the word-lift technique of symbolic dynamics of one-dimensional unimodal maps, we investigate the relation between chaotic kneading sequences and linear maximum-length shift-register sequences. Theoretical and numerical evidence that the set of the maximum-length shift-register sequences is a subset of the set of the universal sequence of one-dimensional chaotic unimodal maps is given. By stabilizing unstable periodic orbits on superstable periodic orbits, we also develop techniques to control the generation of long binary sequences.
"Big Brown Dog" or "Brown Big Dog?" An Electrophysiological Study of Semantic Constraints on Prenominal Adjective Order

ERIC Educational Resources Information Center

Kemmerer, David; Weber-Fox, Christine; Price, Karen; Zdanczyk, Cynthia; Way, Heather

2007-01-01

Event-related brain potentials (ERPs) were recorded while participants read and made acceptability judgments about sentences containing three types of adjective sequences: (1) normal sequences--e.g., "Jennifer rode a huge gray elephant"; (2) reversed sequences that violate grammatical-semantic constraints on linear order--e.g., *"Jennifer rode a…
BACCardI--a tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison.

PubMed

Bartels, Daniela; Kespohl, Sebastian; Albaum, Stefan; Drüke, Tanja; Goesmann, Alexander; Herold, Julia; Kaiser, Olaf; Pühler, Alfred; Pfeiffer, Friedhelm; Raddatz, Günter; Stoye, Jens; Meyer, Folker; Schuster, Stephan C

2005-04-01

We provide the graphical tool BACCardI for the construction of virtual clone maps from standard assembler output files or BLAST based sequence comparisons. This new tool has been applied to numerous genome projects to solve various problems including (a) validation of whole genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages. Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read pair information of large insert libraries.
Comparison of internal transcribed spacers and intergenic spacer regions of five common Iranian sheep bursate nematodes.

PubMed

Nabavi, Reza; Conneely, Brendan; McCarthy, Elaine; Good, Barbara; Shayan, Parviz; DE Waal, Theo

2014-09-01

Accurate identification of sheep nematodes is a critical point in epidemiological studies and monitoring of drug resistance in flocks. However, due to a close morphological similarity between the eggs and larval stages of many of these nematodes, such identification is not a trivial task. There are a number of studies showing that molecular targets in ribosomal DNA (Internal transcribed spacer 1, 2 and Intergenic spacer) are suitable for accurate identification of sheep bursate nematodes. The objective of present study was to compare the ITS1, ITS2 and IGS regions of Iranian common bursate nematodes in order to choose best target for specific identification methods. The first and second internal transcribed spacers (ITS1and ITS2) and intergenic spacer (IGS) of the ribosomal DNA (rDNA) of 5 common Iranian bursate nematodes of sheep were sequenced. The sequences of some non-Iranian isolates were used for comparison in order to evaluate the variation in sequence homology between geographically different nematode populations. Comparison of the ITS1 and ITS2 sequences of Iranian nematodes showed greatest similarity among Teladorsagia circumcincta and Marshallagia marshalli of 94% and 88%, respectively. While Trichostrongylus colubriformis and M. marshalli showed the highest homology (99%) in the IGS sequences. Comparison of the spacer sequences of Iranian with non-Iranian isolates showed significantly higher variation in Haemonchus contortus compared to the other species. Both the ITS1 and ITS2 sequences are convenient targets to have species-specific identification of Iranian bursate nematodes. On the other hand the IGS region may be a less suitable molecular target.
Sequence of the tomato chloroplast DNA and evolutionary comparison of solanaceous plastid genomes.

PubMed

Kahlau, Sabine; Aspinall, Sue; Gray, John C; Bock, Ralph

2006-08-01

Tomato, Solanum lycopersicum (formerly Lycopersicon esculentum), has long been one of the classical model species of plant genetics. More recently, solanaceous species have become a model of evolutionary genomics, with several EST projects and a tomato genome project having been initiated. As a first contribution toward deciphering the genetic information of tomato, we present here the complete sequence of the tomato chloroplast genome (plastome). The size of this circular genome is 155,461 base pairs (bp), with an average AT content of 62.14%. It contains 114 genes and conserved open reading frames (ycfs). Comparison with the previously sequenced plastid DNAs of Nicotiana tabacum and Atropa belladonna reveals patterns of plastid genome evolution in the Solanaceae family and identifies varying degrees of conservation of individual plastid genes. In addition, we discovered several new sites of RNA editing by cytidine-to-uridine conversion. A detailed comparison of editing patterns in the three solanaceous species highlights the dynamics of RNA editing site evolution in chloroplasts. To assess the level of intraspecific plastome variation in tomato, the plastome of a second tomato cultivar was sequenced. Comparison of the two genotypes (IPA-6, bred in South America, and Ailsa Craig, bred in Europe) revealed no nucleotide differences, suggesting that the plastomes of modern tomato cultivars display very little, if any, sequence variation.
Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures

PubMed Central

2012-01-01

Background The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related—a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767
Temporal regularization of ultrasound-based liver motion estimation for image-guided radiation therapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

O’Shea, Tuathan P., E-mail: tuathan.oshea@icr.ac.uk; Bamber, Jeffrey C.; Harris, Emma J.

Purpose: Ultrasound-based motion estimation is an expanding subfield of image-guided radiation therapy. Although ultrasound can detect tissue motion that is a fraction of a millimeter, its accuracy is variable. For controlling linear accelerator tracking and gating, ultrasound motion estimates must remain highly accurate throughout the imaging sequence. This study presents a temporal regularization method for correlation-based template matching which aims to improve the accuracy of motion estimates. Methods: Liver ultrasound sequences (15–23 Hz imaging rate, 2.5–5.5 min length) from ten healthy volunteers under free breathing were used. Anatomical features (blood vessels) in each sequence were manually annotated for comparison withmore » normalized cross-correlation based template matching. Five sequences from a Siemens Acuson™ scanner were used for algorithm development (training set). Results from incremental tracking (IT) were compared with a temporal regularization method, which included a highly specific similarity metric and state observer, known as the α–β filter/similarity threshold (ABST). A further five sequences from an Elekta Clarity™ system were used for validation, without alteration of the tracking algorithm (validation set). Results: Overall, the ABST method produced marked improvements in vessel tracking accuracy. For the training set, the mean and 95th percentile (95%) errors (defined as the difference from manual annotations) were 1.6 and 1.4 mm, respectively (compared to 6.2 and 9.1 mm, respectively, for IT). For each sequence, the use of the state observer leads to improvement in the 95% error. For the validation set, the mean and 95% errors for the ABST method were 0.8 and 1.5 mm, respectively. Conclusions: Ultrasound-based motion estimation has potential to monitor liver translation over long time periods with high accuracy. Nonrigid motion (strain) and the quality of the ultrasound data are likely to have an impact on tracking performance. A future study will investigate spatial uniformity of motion and its effect on the motion estimation errors.« less
Multi-Maneuver Clohessy-Wiltshire Targeting

NASA Technical Reports Server (NTRS)

Dannemiller, David P.

2011-01-01

Orbital rendezvous involves execution of a sequence of maneuvers by a chaser vehicle to bring the chaser to a desired state relative to a target vehicle while meeting intermediate and final relative constraints. Intermediate and final relative constraints are necessary to meet a multitude of requirements such as to control approach direction, ensure relative position is adequate for operation of space-to-space communication systems and relative sensors, provide fail-safe trajectory features, and provide contingency hold points. The effect of maneuvers on constraints is often coupled, so the maneuvers must be solved for as a set. For example, maneuvers that affect orbital energy change both the chaser's height and downrange position relative to the target vehicle. Rendezvous designers use experience and rules-of-thumb to design a sequence of maneuvers and constraints. A non-iterative method is presented for targeting a rendezvous scenario that includes a sequence of maneuvers and relative constraints. This method is referred to as Multi-Maneuver Clohessy-Wiltshire Targeting (MM_CW_TGT). When a single maneuver is targeted to a single relative position, the classic CW targeting solution is obtained. The MM_CW_TGT method involves manipulation of the CW state transition matrix to form a linear system. As a starting point for forming the algorithm, the effects of a series of impulsive maneuvers on the state are derived. Simple and moderately complex examples are used to demonstrate the pattern of the resulting linear system. The general form of the pattern results in an algorithm for formation of the linear system. The resulting linear system relates the effect of maneuver components and initial conditions on relative constraints specified by the rendezvous designer. Solution of the linear system includes the straight-forward inverse of a square matrix. Inversion of the square matrix is assured if the designer poses a controllable scenario - a scenario where the the constraints can be met by the sequence of maneuvers. Matrices in the linear system are dependent on selection of maneuvers and constraints by the designer, but the matrices are independent of the chaser's initial conditions. For scenarios where the sequence of maneuvers and constraints are fixed, the linear system can be formed and the square matrix inverted prior to real-time operations. Example solutions are presented for several rendezvous scenarios to illustrate the utility of the method. The MM_CW_TGT method has been used during the preliminary design of rendezvous scenarios and is expected to be useful for iterative methods in the generation of an initial guess and corrections.
Fast alignment-free sequence comparison using spaced-word frequencies.

PubMed

Leimeister, Chris-Andre; Boden, Marcus; Horwege, Sebastian; Lindner, Sebastian; Morgenstern, Burkhard

2014-07-15

Alignment-free methods for sequence comparison are increasingly used for genome analysis and phylogeny reconstruction; they circumvent various difficulties of traditional alignment-based approaches. In particular, alignment-free methods are much faster than pairwise or multiple alignments. They are, however, less accurate than methods based on sequence alignment. Most alignment-free approaches work by comparing the word composition of sequences. A well-known problem with these methods is that neighbouring word matches are far from independent. To reduce the statistical dependency between adjacent word matches, we propose to use 'spaced words', defined by patterns of 'match' and 'don't care' positions, for alignment-free sequence comparison. We describe a fast implementation of this approach using recursive hashing and bit operations, and we show that further improvements can be achieved by using multiple patterns instead of single patterns. To evaluate our approach, we use spaced-word frequencies as a basis for fast phylogeny reconstruction. Using real-world and simulated sequence data, we demonstrate that our multiple-pattern approach produces better phylogenies than approaches relying on contiguous words. Our program is freely available at http://spaced.gobics.de/. © The Author 2014. Published by Oxford University Press.
Preliminary Classification of Novel Hemorrhagic Fever-Causing Viruses Using Sequence-Based PAirwise Sequence Comparison (PASC) Analysis.

PubMed

Bào, Yīmíng; Kuhn, Jens H

2018-01-01

During the last decade, genome sequence-based classification of viruses has become increasingly prominent. Viruses can be even classified based on coding-complete genome sequence data alone. Nevertheless, classification remains arduous as experts are required to establish phylogenetic trees to depict the evolutionary relationships of such sequences for preliminary taxonomic placement. Pairwise sequence comparison (PASC) of genomes is one of several novel methods for establishing relationships among viruses. This method, provided by the US National Center for Biotechnology Information as an open-access tool, circumvents phylogenetics, and yet PASC results are often in agreement with those of phylogenetic analyses. Computationally inexpensive, PASC can be easily performed by non-taxonomists. Here we describe how to use the PASC tool for the preliminary classification of novel viral hemorrhagic fever-causing viruses.
Insights from Human/Mouse genome comparisons

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pennacchio, Len A.

2003-03-30

Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high-priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestrymore » of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.« less
Intervening sequences in a plant gene-comparison of the partial sequence of cDNA and genomic DNA of French bean phaseolin

NASA Astrophysics Data System (ADS)

Sun, S. M.; Slightom, J. L.; Hall, T. C.

1981-01-01

A plant gene coding for the major storage protein (phaseolin, G1-globulin) of the French bean was isolated from a genomic library constructed in the phage vector Charon 24A. Comparison of the nucleotide sequence of part of the gene with that of the cloned messenger RNA (cDNA) revealed the presence of three intervening sequences, all beginning with GTand ending with AG. The 5' and 3' boundaries of intervening sequences TVS-A (88 base pairs) and IVS-B (124 base pairs) are similar to those described for animal and viral genes, but the 3' boundary of IVS-C (129 base pairs) shows some differences. A sequence of 185 amino acids deduced from the cloned DMAs represents about 40% of a phaseolin polypeptide.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes

PubMed Central

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-01-01

Motivation: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith–Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid™, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. Availability: The database can be accessed through http://proteinworlddb.org Contact: otto@fiocruz.br PMID:20089515

Streaming fragment assignment for real-time analysis of sequencing experiments

PubMed Central

Roberts, Adam; Pachter, Lior

2013-01-01

We present eXpress, a software package for highly efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time, and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data, showing greater efficiency than other quantification methods. PMID:23160280
Concatenated shift registers generating maximally spaced phase shifts of PN-sequences

NASA Technical Reports Server (NTRS)

Hurd, W. J.; Welch, L. R.

1977-01-01

A large class of linearly concatenated shift registers is shown to generate approximately maximally spaced phase shifts of pn-sequences, for use in pseudorandom number generation. A constructive method is presented for finding members of this class, for almost all degrees for which primitive trinomials exist. The sequences which result are not normally characterized by trinomial recursions, which is desirable since trinomial sequences can have some undesirable randomness properties.
Indigenous and introduced potyviruses of legumes and Passiflora spp. from Australia: biological properties and comparison of coat protein sequences

USDA-ARS?s Scientific Manuscript database

Coat protein sequences of 33 Potyvirus isolates from legume and Passiflora spp. were sequenced to determine the identity of infecting viruses. Phylogenetic analysis of the sequences revealed the presence of seven distinct virus species....
Quantum Private Comparison Protocol with Linear Optics

NASA Astrophysics Data System (ADS)

Luo, Qing-bin; Yang, Guo-wu; She, Kun; Li, Xiaoyu

2016-12-01

In this paper, we propose an innovative quantum private comparison(QPC) protocol based on partial Bell-state measurement from the view of linear optics, which enabling two parties to compare the equality of their private information with the help of a semi-honest third party. Partial Bell-state measurement has been realized by using only linear optical elements in experimental measurement-device-independent quantum key distribution(MDI-QKD) schemes, which makes us believe that our protocol can be realized in the near future. The security analysis shows that the participants will not leak their private information.
Genome-Wide Stochastic Adaptive DNA Amplification at Direct and Inverted DNA Repeats in the Parasite Leishmania

PubMed Central

Plourde, Marie; Gingras, Hélène; Roy, Gaétan; Lapointe, Andréanne; Leprohon, Philippe; Papadopoulou, Barbara; Corbeil, Jacques; Ouellette, Marc

2014-01-01

Gene amplification of specific loci has been described in all kingdoms of life. In the protozoan parasite Leishmania, the product of amplification is usually part of extrachromosomal circular or linear amplicons that are formed at the level of direct or inverted repeated sequences. A bioinformatics screen revealed that repeated sequences are widely distributed in the Leishmania genome and the repeats are chromosome-specific, conserved among species, and generally present in low copy number. Using sensitive PCR assays, we provide evidence that the Leishmania genome is continuously being rearranged at the level of these repeated sequences, which serve as a functional platform for constitutive and stochastic amplification (and deletion) of genomic segments in the population. This process is adaptive as the copy number of advantageous extrachromosomal circular or linear elements increases upon selective pressure and is reversible when selection is removed. We also provide mechanistic insights on the formation of circular and linear amplicons through RAD51 recombinase-dependent and -independent mechanisms, respectively. The whole genome of Leishmania is thus stochastically rearranged at the level of repeated sequences, and the selection of parasite subpopulations with changes in the copy number of specific loci is used as a strategy to respond to a changing environment. PMID:24844805
Genome Comparisons Reveal a Dominant Mechanism of Chromosome Number Reduction in Grasses and Accelerated Genome Evolution in Triticeae

USDA-ARS?s Scientific Manuscript database

Single nucleotide polymorphism was employed in the construction of a high-resolution, expressed sequence tag (EST) map of Aegilops tauschii, the diploid source of the wheat D genome. Comparison of the map with the rice and sorghum genome sequences revealed 50 inversions and translocations; 2, 8, and...
An improved model for whole genome phylogenetic analysis by Fourier transform.

PubMed

Yin, Changchuan; Yau, Stephen S-T

2015-10-07

DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences undergone rearrangements as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor and the 2D mapping reduces the nucleotide composition bias in distance measure, and thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space, then the Euclidean distances of full Fourier power spectra of the DNA sequences are used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.

PubMed

Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu

2015-06-01

High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

USDA-ARS?s Scientific Manuscript database

Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylat...
Sequence information gain based motif analysis.

PubMed

Maynou, Joan; Pairó, Erola; Marco, Santiago; Perera, Alexandre

2015-11-09

The detection of regulatory regions in candidate sequences is essential for the understanding of the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information theoretic metrics for finding regulatory sequences in promoter regions. This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such Qresiduals projections or information theoretic based detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show how, in 70% of the studied Transcription Factor Binding Sites, the SIGMA detector has a better performance and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of the cis-regulatory sequences detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.
Modeling coding-sequence evolution within the context of residue solvent accessibility.

PubMed

Scherrer, Michael P; Meyer, Austin G; Wilke, Claus O

2012-09-12

Protein structure mediates site-specific patterns of sequence divergence. In particular, residues in the core of a protein (solvent-inaccessible residues) tend to be more evolutionarily conserved than residues on the surface (solvent-accessible residues). Here, we present a model of sequence evolution that explicitly accounts for the relative solvent accessibility of each residue in a protein. Our model is a variant of the Goldman-Yang 1994 (GY94) model in which all model parameters can be functions of the relative solvent accessibility (RSA) of a residue. We apply this model to a data set comprised of nearly 600 yeast genes, and find that an evolutionary-rate ratio ω that varies linearly with RSA provides a better model fit than an RSA-independent ω or an ω that is estimated separately in individual RSA bins. We further show that the branch length t and the transition-transverion ratio κ also vary with RSA. The RSA-dependent GY94 model performs better than an RSA-dependent Muse-Gaut 1994 (MG94) model in which the synonymous and non-synonymous rates individually are linear functions of RSA. Finally, protein core size affects the slope of the linear relationship between ω and RSA, and gene expression level affects both the intercept and the slope. Structure-aware models of sequence evolution provide a significantly better fit than traditional models that neglect structure. The linear relationship between ω and RSA implies that genes are better characterized by their ω slope and intercept than by just their mean ω.
Young children make their gestural communication systems more language-like: segmentation and linearization of semantic elements in motion events.

PubMed

Clay, Zanna; Pople, Sally; Hood, Bruce; Kita, Sotaro

2014-08-01

Research on Nicaraguan Sign Language, created by deaf children, has suggested that young children use gestures to segment the semantic elements of events and linearize them in ways similar to those used in signed and spoken languages. However, it is unclear whether this is due to children's learning processes or to a more general effect of iterative learning. We investigated whether typically developing children, without iterative learning, segment and linearize information. Gestures produced in the absence of speech to express a motion event were examined in 4-year-olds, 12-year-olds, and adults (all native English speakers). We compared the proportions of gestural expressions that segmented semantic elements into linear sequences and that encoded them simultaneously. Compared with adolescents and adults, children reshaped the holistic stimuli by segmenting and recombining their semantic features into linearized sequences. A control task on recognition memory ruled out the possibility that this was due to different event perception or memory. Young children spontaneously bring fundamental properties of language into their communication system. © The Author(s) 2014.
Linearized aerodynamic and control law models of the X-29A airplane and comparison with flight data

NASA Technical Reports Server (NTRS)

Bosworth, John T.

1992-01-01

Flight control system design and analysis for aircraft rely on mathematical models of the vehicle dynamics. In addition to a six degree of freedom nonlinear simulation, the X-29A flight controls group developed a set of programs that calculate linear perturbation models throughout the X-29A flight envelope. The models include the aerodynamics as well as flight control system dynamics and were used for stability, controllability, and handling qualities analysis. These linear models were compared to flight test results to help provide a safe flight envelope expansion. A description is given of the linear models at three flight conditions and two flight control system modes. The models are presented with a level of detail that would allow the reader to reproduce the linear results if desired. Comparison between the response of the linear model and flight measured responses are presented to demonstrate the strengths and weaknesses of the linear models' ability to predict flight dynamics.
Circularized Chromosome with a Large Palindromic Structure in Streptomyces griseus Mutants

PubMed Central

Uchida, Tetsuya; Ishihara, Naoto; Zenitani, Hiroyuki; Hiratsu, Keiichiro; Kinashi, Haruyasu

2004-01-01

Streptomyces linear chromosomes display various types of rearrangements after telomere deletion, including circularization, arm replacement, and amplification. We analyzed the new chromosomal deletion mutants Streptomyces griseus 301-22-L and 301-22-M. In these mutants, chromosomal arm replacement resulted in long terminal inverted repeats (TIRs) at both ends; different sizes were deleted again and recombined inside the TIRs, resulting in a circular chromosome with an extremely large palindrome. Short palindromic sequences were found in parent strain 2247, and these sequences might have played a role in the formation of this unique structure. Dynamic structural changes of Streptomyces linear chromosomes shown by this and previous studies revealed extraordinary strategies of members of this genus to keep a functional chromosome, even if it is linear or circular. PMID:15150216
Implicit Sequence Learning in Dyslexia: A Within-Sequence Comparison of First- and Higher-Order Information

ERIC Educational Resources Information Center

Du, Wenchong; Kelly, Steve W.

2013-01-01

The present study examines implicit sequence learning in adult dyslexics with a focus on comparing sequence transitions with different statistical complexities. Learning of a 12-item deterministic sequence was assessed in 12 dyslexic and 12 non-dyslexic university students. Both groups showed equivalent standard reaction time increments when the…
Partial amino acid sequence of the branched chain amino acid aminotransferase (TmB) of E. coli JA199 pDU11

DOE Office of Scientific and Technical Information (OSTI.GOV)

Feild, M.J.; Armstrong, F.B.

1987-05-01

E. coli JA199 pDU11 harbors a multicopy plasmid containing the ilv GEDAY gene cluster of S. typhimurium. TmB, gene product of ilv E, was purified, crystallized, and subjected to Edman degradation using a gas phase sequencer. The intact protein yielded an amino terminal 31 residue sequence. Both carboxymethylated apoenzyme and (/sup 3/H)-NaBH-reduced holoenzyme were then subjected to digestion by trypsin. The digests were fractionated using reversed phase HPLC, and the peptides isolated were sequenced. The borohydride-treated holoenzyme was used to isolate the cofactor-binding peptide. The peptide is 27 residues long and a comparison with known sequences of other aminotransferases revealedmore » limited homology. Peptides accounting for 211 of 288 predicted residues have been sequenced, including 9 residues of the carboxyl terminus. Comparison of peptides with the inferred amino acid sequence of the E. coli K-12 enzyme has helped determine the sequence of the amino terminal 59 residues; only two differences between the sequences are noted in this region.« less
Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

PubMed Central

Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

1988-01-01

Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. Images Fig. 1. PMID:3052437
Linear Relation for Wind-blown Bubble Sizes of Main-sequence OB Stars in a Molecular Environment and Implication for Supernova Progenitors

NASA Astrophysics Data System (ADS)

Chen, Yang; Zhou, Ping; Chu, You-Hua

2013-05-01

We find a linear relationship between the size of a massive star's main-sequence bubble in a molecular environment and the star's initial mass: R b ≈ 1.22 M/M ⊙ - 9.16 pc, assuming a constant interclump pressure. Since stars in the mass range of 8 to 25-30 M ⊙ will end their evolution in the red supergiant phase without launching a Wolf-Rayet wind, the main-sequence wind-blown bubbles are mainly responsible for the extent of molecular gas cavities, while the effect of the photoionization is comparatively small. This linear relation can thus be used to infer the masses of the massive star progenitors of supernova remnants (SNRs) that are discovered to evolve in molecular cavities, while few other means are available for inferring the properties of SNR progenitors. We have used this method to estimate the initial masses of the progenitors of eight SNRs: Kes 69, Kes 75, Kes 78, 3C 396, 3C 397, HC 40, Vela, and RX J1713-3946.
Clustered regularly interspaced short palindromic repeats (CRISPRs) for the genotyping of bacterial pathogens.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2009-01-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) are DNA sequences composed of a succession of repeats (23- to 47-bp long) separated by unique sequences called spacers. Polymorphism can be observed in different strains of a species and may be used for genotyping. We describe protocols and bioinformatics tools that allow the identification of CRISPRs from sequenced genomes, their comparison, and their component determination (the direct repeats and the spacers). A schematic representation of the spacer organization can be produced, allowing an easy comparison between strains.
Multidisciplinary study on Wyoming test sites

NASA Technical Reports Server (NTRS)

Houston, R. S. (Principal Investigator); Marrs, R. W.; Borgman, L. E.

1975-01-01

The author has identified the following significant results. Ten EREP data passes over the Wyoming test site provided excellent S190A and S190B coverage and some useful S192 imagery. These data were employed in an evaluation of the EREP imaging sensors in several earth resources applications. Boysen Reservoir and Hyattsville were test areas for band to band comparison of the S190 and S192 sensors and for evaluation of the image data for geologic mapping. Contrast measurements were made from the S192 image data for typical sequence of sedimentary rocks. Histograms compiled from these measurements show that near infrared S192 bands provide the greatest amount of contrast between geologic units. Comparison was also made between LANDSAT imagery and S190B and aerial photography for regional land use mapping. The S190B photography was found far superior to the color composite LANDSAT imagery and was almost as effective as the 1:120,000 scale aerial photography. A map of linear elements prepared from LANDSAT and EREP imagery of the southwestern Bighorn Mountains provided an important aid in defining the relationship between fracture and ground water movement through the Madison aquifer.

rasbhari: Optimizing Spaced Seeds for Database Searching, Read Mapping and Alignment-Free Sequence Comparison.

PubMed

Hahn, Lars; Leimeister, Chris-André; Ounit, Rachid; Lonardi, Stefano; Morgenstern, Burkhard

2016-10-01

Many algorithms for sequence analysis rely on word matching or word statistics. Often, these approaches can be improved if binary patterns representing match and don't-care positions are used as a filter, such that only those positions of words are considered that correspond to the match positions of the patterns. The performance of these approaches, however, depends on the underlying patterns. Herein, we show that the overlap complexity of a pattern set that was introduced by Ilie and Ilie is closely related to the variance of the number of matches between two evolutionarily related sequences with respect to this pattern set. We propose a modified hill-climbing algorithm to optimize pattern sets for database searching, read mapping and alignment-free sequence comparison of nucleic-acid sequences; our implementation of this algorithm is called rasbhari. Depending on the application at hand, rasbhari can either minimize the overlap complexity of pattern sets, maximize their sensitivity in database searching or minimize the variance of the number of pattern-based matches in alignment-free sequence comparison. We show that, for database searching, rasbhari generates pattern sets with slightly higher sensitivity than existing approaches. In our Spaced Words approach to alignment-free sequence comparison, pattern sets calculated with rasbhari led to more accurate estimates of phylogenetic distances than the randomly generated pattern sets that we previously used. Finally, we used rasbhari to generate patterns for short read classification with CLARK-S. Here too, the sensitivity of the results could be improved, compared to the default patterns of the program. We integrated rasbhari into Spaced Words; the source code of rasbhari is freely available at http://rasbhari.gobics.de/.
Complete genome sequence of the plant pathogen Erwinia amylovora strain ATCC 49946

USDA-ARS?s Scientific Manuscript database

Erwinia amylovora causes the economically important disease fire blight that affects rosaceous plants, especially pear and apple. Here we report the complete genome sequence and annotation of strain ATCC 49946. The analysis of the sequence and its comparison with sequenced genomes of closely related...
AlignMe—a membrane protein sequence alignment web server

PubMed Central

Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.

2014-01-01

We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
Kinematics and spectra of planetary nebulae with O VI-sequence nuclei

NASA Technical Reports Server (NTRS)

Johnson, H. M.

1976-01-01

Spectral features of NGC 5189 and NGC 6905 are tabulated. Fabry-Perot profiles around H alpha and O III lambda 5007 of NGC 5189, NGC 6905, NGC 246, and NGC 1535, are illustrated. The latter planetary nebula is a non-O VI-sequence, comparison object of high excitation. The kinematics of the four planetary nebulae are simply analyzed. Discussion of these data is motivated by the possibility of collisional excitation by high-speed ejecta from broad-lined O VI-sequence nuclei, and by the opportunity to make a comparison with conditions in the supernova remnant or ring nebula, G2.4 + 1.4, which contains an O VI-sequence nucleus of Population I.
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment

ERIC Educational Resources Information Center

Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos

2013-01-01

In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…
VizieR Online Data Catalog: KIC 8462852 GTC spectra (Deeg+, 2018)

NASA Astrophysics Data System (ADS)

Deeg, H. J.; Alonso, R.; Nespral, D.; Boyajian, T.

2018-01-01

Spectra obtained in the follow-up of KIC 8462852 (Boyajian's star) with OSIRIS at the GTC telescope. These spectra have been reduced as described in the paper and are contained in two directories, for target and comparison spectra: sp_target contains spectra of the target star (KIC 8462852) sp_compar contains spectra of the comparison star (KIC 8462763) At each pointing of the GTC, a sequence of 10-45 spectra was generated. The individual spectra are named: tpXXYY.dat for the target spectra and cpXXYY.dat for the comparison spectra, where XX is the pointing number, and YY is a sequence number. The format of each spectrum file is a two-column ascii file: Wavelength (Angstrom) | Flux (arbitrary units)) The files times_pXX.dat correspond to each of the pointings and contain the times of mid-exposure of each spectrum, in the HJD_UTC-2400000 framework. These times apply to both target and comparison spectra and are ordered by increasing sequence number. There are a total of 516 spectra of the target and 516 spectra of the comparison. (19 data files).
Comparison of Linear Induction Motor Theories for the LIMRV and TLRV Motors

DOT National Transportation Integrated Search

1978-01-01

The Oberretl, Yamamura, and Mosebach theories of the linear induction motor are described and also applied to predict performance characteristics of the TLRV & LIMRV linear induction motors. The effect of finite motor width and length on performance ...
Spatio-temporal alignment of pedobarographic image sequences.

PubMed

Oliveira, Francisco P M; Sousa, Andreia; Santos, Rubim; Tavares, João Manuel R S

2011-07-01

This article presents a methodology to align plantar pressure image sequences simultaneously in time and space. The spatial position and orientation of a foot in a sequence are changed to match the foot represented in a second sequence. Simultaneously with the spatial alignment, the temporal scale of the first sequence is transformed with the aim of synchronizing the two input footsteps. Consequently, the spatial correspondence of the foot regions along the sequences as well as the temporal synchronizing is automatically attained, making the study easier and more straightforward. In terms of spatial alignment, the methodology can use one of four possible geometric transformation models: rigid, similarity, affine, or projective. In the temporal alignment, a polynomial transformation up to the 4th degree can be adopted in order to model linear and curved time behaviors. Suitable geometric and temporal transformations are found by minimizing the mean squared error (MSE) between the input sequences. The methodology was tested on a set of real image sequences acquired from a common pedobarographic device. When used in experimental cases generated by applying geometric and temporal control transformations, the methodology revealed high accuracy. In addition, the intra-subject alignment tests from real plantar pressure image sequences showed that the curved temporal models produced better MSE results (P < 0.001) than the linear temporal model. This article represents an important step forward in the alignment of pedobarographic image data, since previous methods can only be applied on static images.
Pleiotropy Analysis of Quantitative Traits at Gene Level by Multivariate Functional Linear Models

PubMed Central

Wang, Yifan; Liu, Aiyi; Mills, James L.; Boehnke, Michael; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Xiong, Momiao; Wu, Colin O.; Fan, Ruzong

2015-01-01

In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks’s Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. PMID:25809955
Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models.

PubMed

Wang, Yifan; Liu, Aiyi; Mills, James L; Boehnke, Michael; Wilson, Alexander F; Bailey-Wilson, Joan E; Xiong, Momiao; Wu, Colin O; Fan, Ruzong

2015-05-01

In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. © 2015 WILEY PERIODICALS, INC.
RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

PubMed

Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

2017-01-21

RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/ .
Alignment-free sequence comparison (II): theoretical power of comparison statistics.

PubMed

Wan, Lin; Reinert, Gesine; Sun, Fengzhu; Waterman, Michael S

2010-11-01

Rapid methods for alignment-free sequence comparison make large-scale comparisons between sequences increasingly feasible. Here we study the power of the statistic D2, which counts the number of matching k-tuples between two sequences, as well as D2*, which uses centralized counts, and D2S, which is a self-standardized version, both from a theoretical viewpoint and numerically, providing an easy to use program. The power is assessed under two alternative hidden Markov models; the first one assumes that the two sequences share a common motif, whereas the second model is a pattern transfer model; the null model is that the two sequences are composed of independent and identically distributed letters and they are independent. Under the first alternative model, the means of the tuple counts in the individual sequences change, whereas under the second alternative model, the marginal means are the same as under the null model. Using the limit distributions of the count statistics under the null and the alternative models, we find that generally, asymptotically D2S has the largest power, followed by D2*, whereas the power of D2 can even be zero in some cases. In contrast, even for sequences of length 140,000 bp, in simulations D2* generally has the largest power. Under the first alternative model of a shared motif, the power of D2*approaches 100% when sufficiently many motifs are shared, and we recommend the use of D2* for such practical applications. Under the second alternative model of pattern transfer,the power for all three count statistics does not increase with sequence length when the sequence is sufficiently long, and hence none of the three statistics under consideration canbe recommended in such a situation. We illustrate the approach on 323 transcription factor binding motifs with length at most 10 from JASPAR CORE (October 12, 2009 version),verifying that D2* is generally more powerful than D2. The program to calculate the power of D2, D2* and D2S can be downloaded from http://meta.cmb.usc.edu/d2. Supplementary Material is available at www.liebertonline.com/cmb.
Linear reduction methods for tag SNP selection.

PubMed

He, Jingwu; Zelikovsky, Alex

2004-01-01

It is widely hoped that constructing a complete human haplotype map will help to associate complex diseases with certain SNP's. Unfortunately, the number of SNP's is huge and it is very costly to sequence many individuals. Therefore, it is desirable to reduce the number of SNP's that should be sequenced to considerably small number of informative representatives, so called tag SNP's. In this paper, we propose a new linear algebra based method for selecting and using tag SNP's. Our method is purely combinatorial and can be combined with linkage disequilibrium (LD) and block based methods. We measure the quality of our tag SNP selection algorithm by comparing actual SNP's with SNP's linearly predicted from linearly chosen tag SNP's. We obtain an extremely good compression and prediction rates. For example, for long haplotypes (>25000 SNP's), knowing only 0.4% of all SNP's we predict the entire unknown haplotype with 2% accuracy while the prediction method is based on a 10% sample of the population.
Method and apparatus for biological sequence comparison

DOEpatents

Marr, T.G.; Chang, W.I.

1997-12-23

A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.
Method and apparatus for biological sequence comparison

DOEpatents

Marr, Thomas G.; Chang, William I-Wei

1997-01-01

A method and apparatus for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence.
A distributed model predictive control scheme for leader-follower multi-agent systems

NASA Astrophysics Data System (ADS)

Franzè, Giuseppe; Lucia, Walter; Tedesco, Francesco

2018-02-01

In this paper, we present a novel receding horizon control scheme for solving the formation problem of leader-follower configurations. The algorithm is based on set-theoretic ideas and is tuned for agents described by linear time-invariant (LTI) systems subject to input and state constraints. The novelty of the proposed framework relies on the capability to jointly use sequences of one-step controllable sets and polyhedral piecewise state-space partitions in order to online apply the 'better' control action in a distributed receding horizon fashion. Moreover, we prove that the design of both robust positively invariant sets and one-step-ahead controllable regions is achieved in a distributed sense. Simulations and numerical comparisons with respect to centralised and local-based strategies are finally performed on a group of mobile robots to demonstrate the effectiveness of the proposed control strategy.
An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens

PubMed Central

Rensing, Stefan A; Ick, Julia; Fawcett, Jeffrey A; Lang, Daniel; Zimmer, Andreas; Van de Peer, Yves; Reski, Ralf

2007-01-01

Background: Analyses of complete genomes and large collections of gene transcripts have shown that most, if not all seed plants have undergone one or more genome duplications in their evolutionary past. Results: In this study, based on a large collection of EST sequences, we provide evidence that the haploid moss Physcomitrella patens is a paleopolyploid as well. Based on the construction of linearized phylogenetic trees we infer the genome duplication to have occurred between 30 and 60 million years ago. Gene Ontology and pathway association of the duplicated genes in P. patens reveal different biases of gene retention compared with seed plants. Conclusion: Metabolic genes seem to have been retained in excess following the genome duplication in P. patens. This might, at least partly, explain the versatility of metabolism, as described for P. patens and other mosses, in comparison to other land plants. PMID:17683536
Comparison of Methods of Detection of Exceptional Sequences in Prokaryotic Genomes.

PubMed

Rusinov, I S; Ershova, A S; Karyagina, A S; Spirin, S A; Alexeevski, A V

2018-02-01

Many proteins need recognition of specific DNA sequences for functioning. The number of recognition sites and their distribution along the DNA might be of biological importance. For example, the number of restriction sites is often reduced in prokaryotic and phage genomes to decrease the probability of DNA cleavage by restriction endonucleases. We call a sequence an exceptional one if its frequency in a genome significantly differs from one predicted by some mathematical model. An exceptional sequence could be either under- or over-represented, depending on its frequency in comparison with the predicted one. Exceptional sequences could be considered biologically meaningful, for example, as targets of DNA-binding proteins or as parts of abundant repetitive elements. Several methods to predict frequency of a short sequence in a genome, based on actual frequencies of certain its subsequences, are used. The most popular are methods based on Markov chain models. But any rigorous comparison of the methods has not previously been performed. We compared three methods for the prediction of short sequence frequencies: the maximum-order Markov chain model-based method, the method that uses geometric mean of extended Markovian estimates, and the method that utilizes frequencies of all subsequences including discontiguous ones. We applied them to restriction sites in complete genomes of 2500 prokaryotic species and demonstrated that the results depend greatly on the method used: lists of 5% of the most under-represented sites differed by up to 50%. The method designed by Burge and coauthors in 1992, which utilizes all subsequences of the sequence, showed a higher precision than the other two methods both on prokaryotic genomes and randomly generated sequences after computational imitation of selective pressure. We propose this method as the first choice for detection of exceptional sequences in prokaryotic genomes.
Dali server update.

PubMed

Holm, Liisa; Laakso, Laura M

2016-07-08

The Dali server (http://ekhidna2.biocenter.helsinki.fi/dali) is a network service for comparing protein structures in 3D. In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all against all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against Uniprot. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Occurrence of a Sequence in Marine Cyanophages Similar to That of T4 g20 and Its Application to PCR-Based Detection and Quantification Techniques†

PubMed Central

Fuller, Nicholas J.; Wilson, William H.; Joint, Ian R.; Mann, Nicholas H.

1998-01-01

Viruses are ubiquitous components of marine ecosystems and are known to infect unicellular phycoerythrin-containing cyanobacteria belonging to the genus Synechococcus. A conserved region from the cyanophage genome was identified in three genetically distinct cyanomyoviruses, and a sequence analysis revealed that this region exhibited significant similarity to a gene encoding a capsid assembly protein (gp20) from the enteric coliphage T4. The results of a comparison of gene 20 sequences from three cyanomyoviruses and T4 allowed us to design two degenerate PCR primers, CPS1 and CPS2, which specifically amplified a 165-bp region from the majority of cyanomyoviruses tested. A competitive PCR (cPCR) analysis revealed that cyanomyovirus strains could be accurately enumerated, and it was demonstrated that quantification was log-linear over ca. 3 orders of magnitude. Different calibration curves were obtained for each of the three cyanomyovirus strains tested; consequently, cPCR performed with primers CPS1 and CPS2 could lead to substantial inaccuracies in estimates of phage abundance in natural assemblages. Further sequence analysis of cyanomyovirus gene 20 homologs would be necessary in order to design primers which do not exhibit phage-to-phage variability in priming efficiency. It was demonstrated that PCR products of the correct size could be amplified from seawater samples following 100× concentration and even directly without any prior concentration. Hence, the use of degenerate primers in PCR analyses of cyanophage populations should provide valuable data on the diversity of cyanophages in natural assemblages. Further optimization of procedures may ultimately lead to a sensitive assay which can be used to analyze natural cyanophage populations both quantitatively (by cPCR) and qualitatively following phylogenetic analysis of amplified products. PMID:9603813

Whole-genome sequence analysis of the Mycobacterium avium complex and proposal of the transfer of Mycobacterium yongonense to Mycobacterium intracellulare subsp. yongonense subsp. nov.

PubMed

Castejon, Maria; Menéndez, Maria Carmen; Comas, Iñaki; Vicente, Ana; Garcia, Maria J

2018-06-01

Bacterial whole-genome sequences contain informative features of their evolutionary pathways. Comparison of whole-genome sequences have become the method of choice for classification of prokaryotes, thus allowing the identification of bacteria from an evolutionary perspective, and providing data to resolve some current controversies. Currently, controversy exists about the assignment of members of the Mycobacterium avium complex, as is for the cases of Mycobacterium yongonense and 'Mycobacterium indicus pranii'. These two mycobacteria, closely related to Mycobacterium intracellulare on the basis of standard phenotypic and single gene-sequences comparisons, were not considered a member of such species on the basis on some particular differences displayed by a single strain. Whole-genome sequence comparison procedures, namely the average nucleotide identity and the genome distance, showed that those two mycobacteria should be considered members of the species M. intracellulare. The results were confirmed with other whole-genome comparison supplementary methods. According to the data provided, Mycobacterium yongonense and 'Mycobacterium indicus pranii' should be considered and renamed and included as members of M. intracellulare. This study highlights the problems caused when a novel species is accepted on the basis of a single strain, as was the case for M. yongonense. Based mainly on whole-genome sequence analysis, we conclude that M. yongonense should be reclassified as a subspecies of Mycobacterium intracellulareas Mycobacterium intracellularesubsp. yongonense and 'Mycobacterium indicus pranii' classified in the same subspecies as the type strain of Mycobacterium intracellulare and classified as Mycobacterium intracellularesubsp. intracellulare.
Non-Linear Finite Element Modeling of THUNDER Piezoelectric Actuators

NASA Technical Reports Server (NTRS)

Taleghani, Barmac K.; Campbell, Joel F.

1999-01-01

A NASTRAN non-linear finite element model has been developed for predicting the dome heights of THUNDER (THin Layer UNimorph Ferroelectric DrivER) piezoelectric actuators. To analytically validate the finite element model, a comparison was made with a non-linear plate solution using Von Karmen's approximation. A 500 volt input was used to examine the actuator deformation. The NASTRAN finite element model was also compared with experimental results. Four groups of specimens were fabricated and tested. Four different input voltages, which included 120, 160, 200, and 240 Vp-p with a 0 volts offset, were used for this comparison.
Exploring the Presence of microDNAs in Prostate Cancer Cell Lines, Tissue, and Sera of Prostate Cancer Patients and its Possible Application as Biomarker

DTIC Science & Technology

2015-08-01

Sequence tags were mapped on the human reference genome using the Novoalign software. Only those tags... the linear islands to create a novel junctional sequence that does not exist in the genome . Thus the PE- sequence of a fragment that breaks at or...identified in cancer cell lines. (b) Median percent GC content of microDNAs and the genomic sequences up- or downstream of the source loci are
Exploring the Presence of microDNAs in Prostate Cancer Cell Lines, Tissue, and Sera of Prostate Cancer Patients and its Possible Application as Biomarker

DTIC Science & Technology

2016-04-01

Sequence tags were mapped on the human reference genome using the Novoalign software. Only those...ends of the linear islands to create a novel junctional sequence that does not exist in the genome . Thus the PE- sequence of a fragment that breaks at... genome (Fig. 3b). Those PE-tags where one tag maps uniquely to an island and the other remains unmapped, but passes the sequence quality filter,
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.

PubMed

Wang, Chunlin; Lefkowitz, Elliot J

2004-10-28

Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Used together, QS-search and DS-BLAST provide a flexible solution to adapt sequential similarity searching applications in high performance computing environments. Their ease of use and their ability to wrap a variety of database search programs provide an analytical architecture to assist both the seasoned bioinformaticist and the wet-bench biologist.
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters

PubMed Central

Wang, Chunlin; Lefkowitz, Elliot J

2004-01-01

Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Conclusions Used together, QS-search and DS-BLAST provide a flexible solution to adapt sequential similarity searching applications in high performance computing environments. Their ease of use and their ability to wrap a variety of database search programs provide an analytical architecture to assist both the seasoned bioinformaticist and the wet-bench biologist. PMID:15511296
Methylene blue binding to DNA with alternating AT base sequence: minor groove binding is favored over intercalation.

PubMed

Rohs, Remo; Sklenar, Heinz

2004-04-01

The results presented in this paper on methylene blue (MB) binding to DNA with AT alternating base sequence complement the data obtained in two former modeling studies of MB binding to GC alternating DNA. In the light of the large amount of experimental data for both systems, this theoretical study is focused on a detailed energetic analysis and comparison in order to understand their different behavior. Since experimental high-resolution structures of the complexes are not available, the analysis is based on energy minimized structural models of the complexes in different binding modes. For both sequences, four different intercalation structures and two models for MB binding in the minor and major groove have been proposed. Solvent electrostatic effects were included in the energetic analysis by using electrostatic continuum theory, and the dependence of MB binding on salt concentration was investigated by solving the non-linear Poisson-Boltzmann equation. We find that the relative stability of the different complexes is similar for the two sequences, in agreement with the interpretation of spectroscopic data. Subtle differences, however, are seen in energy decompositions and can be attributed to the change from symmetric 5'-YpR-3' intercalation to minor groove binding with increasing salt concentration, which is experimentally observed for the AT sequence at lower salt concentration than for the GC sequence. According to our results, this difference is due to the significantly lower non-electrostatic energy for the minor groove complex with AT alternating DNA, whereas the slightly lower binding energy to this sequence is caused by a higher deformation energy of DNA. The energetic data are in agreement with the conclusions derived from different spectroscopic studies and can also be structurally interpreted on the basis of the modeled complexes. The simple static modeling technique and the neglect of entropy terms and of non-electrostatic solute-solvent interactions, which are assumed to be nearly constant for the compared complexes of MB with DNA, seem to be justified by the results.
Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.

1987-01-01

The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with (..cap alpha..-/sup 32/P)dCTP (3000 Ci/mmol) or (..cap alpha..-/sup 35/S)dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homologymore » (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.« less
Amino acid sequences of ribosomal proteins S11 from Bacillus stearothermophilus and S19 from Halobacterium marismortui. Comparison of the ribosomal protein S11 family.

PubMed

Kimura, M; Kimura, J; Hatakeyama, T

1988-11-21

The complete amino acid sequences of ribosomal proteins S11 from the Gram-positive eubacterium Bacillus stearothermophilus and of S19 from the archaebacterium Halobacterium marismortui have been determined. A search for homologous sequences of these proteins revealed that they belong to the ribosomal protein S11 family. Homologous proteins have previously been sequenced from Escherichia coli as well as from chloroplast, yeast and mammalian ribosomes. A pairwise comparison of the amino acid sequences showed that Bacillus protein S11 shares 68% identical residues with S11 from Escherichia coli and a slightly lower homology (52%) with the homologous chloroplast protein. The halophilic protein S19 is more related to the eukaryotic (45-49%) than to the eubacterial counterparts (35%).
The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae).

PubMed

Pan, Hong-Chun; Fang, Hong-Yan; Li, Shi-Wei; Liu, Jun-Hong; Wang, Ying; Wang, An-Tai

2014-12-01

The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae) is composed of two linear DNA molecules. The mitochondrial DNA (mtDNA) molecule 1 is 8010 bp long and contains six protein-coding genes, large subunit rRNA, methionine and tryptophan tRNAs, two pseudogenes consisting respectively of a partial copy of COI, and terminal sequences at two ends of the linear mtDNA, while the mtDNA molecule 2 is 7576 bp long and contains seven protein-coding genes, small subunit rRNA, methionine tRNA, a pseudogene consisting of a partial copy of COI and terminal sequences at two ends of the linear mtDNA. COI gene begins with GTG as start codon, whereas other 12 protein-coding genes start with a typical ATG initiation codon. In addition, all protein-coding genes are terminated with TAA as stop codon.
Incorporating Writing in an Integrated Calculus, Linear Algebra, and Differential Equations Sequence.

ERIC Educational Resources Information Center

Kelly, Susan E.; LeDocq, Rebecca Lewin

2001-01-01

Describes the specific courses in a sequence along with how the writing has been implemented in each course. Provides ideas for how to efficiently handle the additional paper load so students receive the necessary feedback while keeping the grading time reasonable. (Author/ASK)
Extracting Both Peptide Sequence and Glycan Structural Information by 157 nm Photodissociation of N-Linked Glycopeptides

PubMed Central

Zhang, Liangyi; Reilly, James P.

2009-01-01

157 nm photodissociation of N-linked glycopeptides was investigated in MALDI tandem time-of-flight (TOF) and linear ion trap mass spectrometers. Singly-charged glycopeptides yielded abundant peptide and glycan fragments. The peptide fragments included a series of x-, y-, v- and w- ions with the glycan remaining intact. These provide information about the peptide sequence and the glycosylation site. In addition to glycosidic fragments, abundant cross-ring glycan fragments that are not observed in low-energy CID were detected. These fragments provide insight into the glycan sequence and linkages. Doubly-charged glycopeptides generated by nanospray in the linear ion trap mass spectrometer also yielded peptide and glycan fragments. However, the former were dominated by low-energy fragments such as b- and y- type ions while glycan was primarily cleaved at glycosidic bonds. PMID:19113943
G-sequentially connectedness for topological groups with operations

NASA Astrophysics Data System (ADS)

Mucuk, Osman; Cakalli, Huseyin

2016-08-01

It is a well-known fact that for a Hausdorff topological group X, the limits of convergent sequences in X define a function denoted by lim from the set of all convergent sequences in X to X. This notion has been modified by Connor and Grosse-Erdmann for real functions by replacing lim with an arbitrary linear functional G defined on a linear subspace of the vector space of all real sequences. Recently some authors have extended the concept to the topological group setting and introduced the concepts of G-sequential continuity, G-sequential compactness and G-sequential connectedness. In this work, we present some results about G-sequentially closures, G-sequentially connectedness and fundamental system of G-sequentially open neighbourhoods for topological group with operations which include topological groups, topological rings without identity, R-modules, Lie algebras, Jordan algebras, and many others.
Prediction of siRNA potency using sparse logistic regression.

PubMed

Hu, Wei; Hu, John

2014-06-01

RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
Aligning the unalignable: bacteriophage whole genome alignments.

PubMed

Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

2016-01-13

In recent years, many studies focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plots diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on bitbucket (https://bitbucket.org/thekswenson/alpha).
Ole e 13 is the unique food allergen in olive: Structure-functional, substrates docking, and molecular allergenicity comparative analysis.

PubMed

Jimenez-Lopez, J C; Robles-Bolivar, P; Lopez-Valverde, F J; Lima-Cabello, E; Kotchoni, S O; Alché, J D

2016-05-01

Thaumatin-like proteins (TLPs) are enzymes with important functions in pathogens defense and in the response to biotic and abiotic stresses. Last identified olive allergen (Ole e 13) is a TLP, which may also importantly contribute to food allergy and cross-allergenicity to pollen allergen proteins. The goals of this study are the characterization of the structural-functionality of Ole e 13 with a focus in its catalytic mechanism, and its molecular allergenicity by extensive analysis using different molecular computer-aided approaches covering a) functional-regulatory motifs, b) comparative study of linear sequence, 2-D and 3D structural homology modeling, c) molecular docking with two different β-D-glucans, d) conservational and evolutionary analysis, e) catalytic mechanism modeling, and f) IgE-binding, B- and T-cell epitopes identification and comparison to other allergenic TLPs. Sequence comparison, structure-based features, and phylogenetic analysis identified Ole e 13 as a thaumatin-like protein. 3D structural characterization revealed a conserved overall folding among plants TLPs, with mayor differences in the acidic (catalytic) cleft. Molecular docking analysis using two β-(1,3)-glucans allowed to identify fundamental residues involved in the endo-1,3-β-glucanase activity, and defining E84 as one of the conserved residues of the TLPs responsible of the nucleophilic attack to initiate the enzymatic reaction and D107 as proton donor, thus proposing a catalytic mechanism for Ole e 13. Identification of IgE-binding, B- and T-cell epitopes may help designing strategies to improve diagnosis and immunotherapy to food allergy and cross-allergenic pollen TLPs. Copyright © 2016 Elsevier Inc. All rights reserved.
GPU-based cloud service for Smith-Waterman algorithm using frequency distance filtration scheme.

PubMed

Lee, Sheng-Ta; Lin, Chun-Yuan; Hung, Che Lun

2013-01-01

As the conventional means of analyzing the similarity between a query sequence and database sequences, the Smith-Waterman algorithm is feasible for a database search owing to its high sensitivity. However, this algorithm is still quite time consuming. CUDA programming can improve computations efficiently by using the computational power of massive computing hardware as graphics processing units (GPUs). This work presents a novel Smith-Waterman algorithm with a frequency-based filtration method on GPUs rather than merely accelerating the comparisons yet expending computational resources to handle such unnecessary comparisons. A user friendly interface is also designed for potential cloud server applications with GPUs. Additionally, two data sets, H1N1 protein sequences (query sequence set) and human protein database (database set), are selected, followed by a comparison of CUDA-SW and CUDA-SW with the filtration method, referred to herein as CUDA-SWf. Experimental results indicate that reducing unnecessary sequence alignments can improve the computational time by up to 41%. Importantly, by using CUDA-SWf as a cloud service, this application can be accessed from any computing environment of a device with an Internet connection without time constraints.
General methods for determining the linear stability of coronal magnetic fields

NASA Technical Reports Server (NTRS)

Craig, I. J. D.; Sneyd, A. D.; Mcclymont, A. N.

1988-01-01

A time integration of a linearized plasma equation of motion has been performed to calculate the ideal linear stability of arbitrary three-dimensional magnetic fields. The convergence rates of the explicit and implicit power methods employed are speeded up by using sequences of cyclic shifts. Growth rates are obtained for Gold-Hoyle force-free equilibria, and the corkscrew-kink instability is found to be very weak.
General methods for determining the linear stability of coronal magnetic fields

DOE Office of Scientific and Technical Information (OSTI.GOV)

Craig, I.J.D.; Sneyd, A.D.; McClymont, A.N.

1988-12-01

A time integration of a linearized plasma equation of motion has been performed to calculate the ideal linear stability of arbitrary three-dimensional magnetic fields. The convergence rates of the explicit and implicit power methods employed are speeded up by using sequences of cyclic shifts. Growth rates are obtained for Gold-Hoyle force-free equilibria, and the corkscrew-kink instability is found to be very weak. 19 references.
ArrayPitope: Automated Analysis of Amino Acid Substitutions for Peptide Microarray-Based Antibody Epitope Mapping.

PubMed

Hansen, Christian Skjødt; Østerbye, Thomas; Marcatili, Paolo; Lund, Ole; Buus, Søren; Nielsen, Morten

2017-01-01

Identification of epitopes targeted by antibodies (B cell epitopes) is of critical importance for the development of many diagnostic and therapeutic tools. For clinical usage, such epitopes must be extensively characterized in order to validate specificity and to document potential cross-reactivity. B cell epitopes are typically classified as either linear epitopes, i.e. short consecutive segments from the protein sequence or conformational epitopes adapted through native protein folding. Recent advances in high-density peptide microarrays enable high-throughput, high-resolution identification and characterization of linear B cell epitopes. Using exhaustive amino acid substitution analysis of peptides originating from target antigens, these microarrays can be used to address the specificity of polyclonal antibodies raised against such antigens containing hundreds of epitopes. However, the interpretation of the data provided in such large-scale screenings is far from trivial and in most cases it requires advanced computational and statistical skills. Here, we present an online application for automated identification of linear B cell epitopes, allowing the non-expert user to analyse peptide microarray data. The application takes as input quantitative peptide data of fully or partially substituted overlapping peptides from a given antigen sequence and identifies epitope residues (residues that are significantly affected by substitutions) and visualize the selectivity towards each residue by sequence logo plots. Demonstrating utility, the application was used to identify and address the antibody specificity of 18 linear epitope regions in Human Serum Albumin (HSA), using peptide microarray data consisting of fully substituted peptides spanning the entire sequence of HSA and incubated with polyclonal rabbit anti-HSA (and mouse anti-rabbit-Cy3). The application is made available at: www.cbs.dtu.dk/services/ArrayPitope.

ArrayPitope: Automated Analysis of Amino Acid Substitutions for Peptide Microarray-Based Antibody Epitope Mapping

PubMed Central

Hansen, Christian Skjødt; Østerbye, Thomas; Marcatili, Paolo; Lund, Ole; Buus, Søren

2017-01-01

Identification of epitopes targeted by antibodies (B cell epitopes) is of critical importance for the development of many diagnostic and therapeutic tools. For clinical usage, such epitopes must be extensively characterized in order to validate specificity and to document potential cross-reactivity. B cell epitopes are typically classified as either linear epitopes, i.e. short consecutive segments from the protein sequence or conformational epitopes adapted through native protein folding. Recent advances in high-density peptide microarrays enable high-throughput, high-resolution identification and characterization of linear B cell epitopes. Using exhaustive amino acid substitution analysis of peptides originating from target antigens, these microarrays can be used to address the specificity of polyclonal antibodies raised against such antigens containing hundreds of epitopes. However, the interpretation of the data provided in such large-scale screenings is far from trivial and in most cases it requires advanced computational and statistical skills. Here, we present an online application for automated identification of linear B cell epitopes, allowing the non-expert user to analyse peptide microarray data. The application takes as input quantitative peptide data of fully or partially substituted overlapping peptides from a given antigen sequence and identifies epitope residues (residues that are significantly affected by substitutions) and visualize the selectivity towards each residue by sequence logo plots. Demonstrating utility, the application was used to identify and address the antibody specificity of 18 linear epitope regions in Human Serum Albumin (HSA), using peptide microarray data consisting of fully substituted peptides spanning the entire sequence of HSA and incubated with polyclonal rabbit anti-HSA (and mouse anti-rabbit-Cy3). The application is made available at: www.cbs.dtu.dk/services/ArrayPitope. PMID:28095436
Phylogenetic mixtures and linear invariants for equal input models.

PubMed

Casanellas, Marta; Steel, Mike

2017-04-01

The reconstruction of phylogenetic trees from molecular sequence data relies on modelling site substitutions by a Markov process, or a mixture of such processes. In general, allowing mixed processes can result in different tree topologies becoming indistinguishable from the data, even for infinitely long sequences. However, when the underlying Markov process supports linear phylogenetic invariants, then provided these are sufficiently informative, the identifiability of the tree topology can be restored. In this paper, we investigate a class of processes that support linear invariants once the stationary distribution is fixed, the 'equal input model'. This model generalizes the 'Felsenstein 1981' model (and thereby the Jukes-Cantor model) from four states to an arbitrary number of states (finite or infinite), and it can also be described by a 'random cluster' process. We describe the structure and dimension of the vector spaces of phylogenetic mixtures and of linear invariants for any fixed phylogenetic tree (and for all trees-the so called 'model invariants'), on any number n of leaves. We also provide a precise description of the space of mixtures and linear invariants for the special case of [Formula: see text] leaves. By combining techniques from discrete random processes and (multi-) linear algebra, our results build on a classic result that was first established by James Lake (Mol Biol Evol 4:167-191, 1987).
NNAlign: a platform to construct and evaluate artificial neural network models of receptor-ligand interactions.

PubMed

Nielsen, Morten; Andreatta, Massimo

2017-07-03

Peptides are extensively used to characterize functional or (linear) structural aspects of receptor-ligand interactions in biological systems, e.g. SH2, SH3, PDZ peptide-recognition domains, the MHC membrane receptors and enzymes such as kinases and phosphatases. NNAlign is a method for the identification of such linear motifs in biological sequences. The algorithm aligns the amino acid or nucleotide sequences provided as training set, and generates a model of the sequence motif detected in the data. The webserver allows setting up cross-validation experiments to estimate the performance of the model, as well as evaluations on independent data. Many features of the training sequences can be encoded as input, and the network architecture is highly customizable. The results returned by the server include a graphical representation of the motif identified by the method, performance values and a downloadable model that can be applied to scan protein sequences for occurrence of the motif. While its performance for the characterization of peptide-MHC interactions is widely documented, we extended NNAlign to be applicable to other receptor-ligand systems as well. Version 2.0 supports alignments with insertions and deletions, encoding of receptor pseudo-sequences, and custom alphabets for the training sequences. The server is available at http://www.cbs.dtu.dk/services/NNAlign-2.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Complete Genome Sequence of EtG, the First Phage Sequenced from Erwinia tracheiphila.

PubMed

Andrade-Domínguez, Andrés; Kolter, Roberto; Shapiro, Lori R

2018-02-22

Erwinia tracheiphila is the causal agent of bacterial wilt of cucurbits. Here, we report the genome sequence of the temperate phage EtG, which was isolated from an E. tracheiphila -infected cucumber plant. Phage EtG has a linear 30,413-bp double-stranded DNA genome with cohesive ends and 45 predicted open reading frames. Copyright © 2018 Andrade-Domínguez et al.
AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

PubMed

Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V

2003-01-01

In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly use idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continue the alignment process ina bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of the progressive alignment with the 1-median (center) of the cluster. The 1-median of set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments which has been successfully applied to large size search problems in general metric spaces. In particular a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Our algorithm compared with Clustal W, a widely used tool to MSA, shows a better running time results with fully comparable alignment quality. A successful biological application showing high aminoacid conservation during evolution of Xenopus laevis SOD2 is also cited.
Domain similarity based orthology detection.

PubMed

Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich

2015-05-13

Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda .
Introduction of the hybcell-based compact sequencing technology and comparison to state-of-the-art methodologies for KRAS mutation detection.

PubMed

Zopf, Agnes; Raim, Roman; Danzer, Martin; Niklas, Norbert; Spilka, Rita; Pröll, Johannes; Gabriel, Christian; Nechansky, Andreas; Roucka, Markus

2015-03-01

The detection of KRAS mutations in codons 12 and 13 is critical for anti-EGFR therapy strategies; however, only those methodologies with high sensitivity, specificity, and accuracy as well as the best cost and turnaround balance are suitable for routine daily testing. Here we compared the performance of compact sequencing using the novel hybcell technology with 454 next-generation sequencing (454-NGS), Sanger sequencing, and pyrosequencing, using an evaluation panel of 35 specimens. A total of 32 mutations and 10 wild-type cases were reported using 454-NGS as the reference method. Specificity ranged from 100% for Sanger sequencing to 80% for pyrosequencing. Sanger sequencing and hybcell-based compact sequencing achieved a sensitivity of 96%, whereas pyrosequencing had a sensitivity of 88%. Accuracy was 97% for Sanger sequencing, 85% for pyrosequencing, and 94% for hybcell-based compact sequencing. Quantitative results were obtained for 454-NGS and hybcell-based compact sequencing data, resulting in a significant correlation (r = 0.914). Whereas pyrosequencing and Sanger sequencing were not able to detect multiple mutated cell clones within one tumor specimen, 454-NGS and the hybcell-based compact sequencing detected multiple mutations in two specimens. Our comparison shows that the hybcell-based compact sequencing is a valuable alternative to state-of-the-art methodologies used for detection of clinically relevant point mutations.
High-speed multiple sequence alignment on a reconfigurable platform.

PubMed

Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf

2006-01-01

Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.
Evaluation in a Research and Development Context.

ERIC Educational Resources Information Center

Cooley, William W.

Educational research and development (R&D) has often been characterized as a neat, linear sequence of discrete steps, moving from research through development to evaluation and dissemination. Although the inadequacies of such linear models of educational research and development have been pointed out previously, these models have been so much…
Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms.

PubMed

Ortegon, Patricia; Poot-Hernández, Augusto C; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

2015-01-01

In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case.
Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms

PubMed Central

Ortegon, Patricia; Poot-Hernández, Augusto C.; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

2015-01-01

In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143
Bacterial community comparisons by taxonomy-supervised analysis independent of sequence alignment and clustering

PubMed Central

Sul, Woo Jun; Cole, James R.; Jesus, Ederson da C.; Wang, Qiong; Farris, Ryan J.; Fish, Jordan A.; Tiedje, James M.

2011-01-01

High-throughput sequencing of 16S rRNA genes has increased our understanding of microbial community structure, but now even higher-throughput methods to the Illumina scale allow the creation of much larger datasets with more samples and orders-of-magnitude more sequences that swamp current analytic methods. We developed a method capable of handling these larger datasets on the basis of assignment of sequences into an existing taxonomy using a supervised learning approach (taxonomy-supervised analysis). We compared this method with a commonly used clustering approach based on sequence similarity (taxonomy-unsupervised analysis). We sampled 211 different bacterial communities from various habitats and obtained ∼1.3 million 16S rRNA sequences spanning the V4 hypervariable region by pyrosequencing. Both methodologies gave similar ecological conclusions in that β-diversity measures calculated by using these two types of matrices were significantly correlated to each other, as were the ordination configurations and hierarchical clustering dendrograms. In addition, our taxonomy-supervised analyses were also highly correlated with phylogenetic methods, such as UniFrac. The taxonomy-supervised analysis has the advantages that it is not limited by the exhaustive computation required for the alignment and clustering necessary for the taxonomy-unsupervised analysis, is more tolerant of sequencing errors, and allows comparisons when sequences are from different regions of the 16S rRNA gene. With the tremendous expansion in 16S rRNA data acquisition underway, the taxonomy-supervised approach offers the potential to provide more rapid and extensive community comparisons across habitats and samples. PMID:21873204
SWI or T2*: which MRI sequence to use in the detection of cerebral microbleeds? The Karolinska Imaging Dementia Study.

PubMed

Shams, S; Martola, J; Cavallin, L; Granberg, T; Shams, M; Aspelin, P; Wahlund, L O; Kristoffersen-Wiberg, M

2015-06-01

Cerebral microbleeds are thought to have potentially important clinical implications in dementia and stroke. However, the use of both T2* and SWI MR imaging sequences for microbleed detection has complicated the cross-comparison of study results. We aimed to determine the impact of microbleed sequences on microbleed detection and associated clinical parameters. Patients from our memory clinic (n = 246; 53% female; mean age, 62) prospectively underwent 3T MR imaging, with conventional thick-section T2*, thick-section SWI, and conventional thin-section SWI. Microbleeds were assessed separately on thick-section SWI, thin-section SWI, and T2* by 3 raters, with varying neuroradiologic experience. Clinical and radiologic parameters from the dementia investigation were analyzed in association with the number of microbleeds in negative binomial regression analyses. Prevalence and number of microbleeds were higher on thick-/thin-section SWI (20/21%) compared with T2*(17%). There was no difference in microbleed prevalence/number between thick- and thin-section SWI. Interrater agreement was excellent for all raters and sequences. Univariate comparisons of clinical parameters between patients with and without microbleeds yielded no difference across sequences. In the regression analysis, only minor differences in clinical associations with the number of microbleeds were noted across sequences. Due to the increased detection of microbleeds, we recommend SWI as the sequence of choice in microbleed detection. Microbleeds and their association with clinical parameters are robust to the effects of varying MR imaging sequences, suggesting that comparison of results across studies is possible, despite differing microbleed sequences. © 2015 by American Journal of Neuroradiology.
Comparison of Dixon Sequences for Estimation of Percent Breast Fibroglandular Tissue

PubMed Central

Ledger, Araminta E. W.; Scurr, Erica D.; Hughes, Julie; Macdonald, Alison; Wallace, Toni; Thomas, Karen; Wilson, Robin; Leach, Martin O.; Schmidt, Maria A.

2016-01-01

Objectives To evaluate sources of error in the Magnetic Resonance Imaging (MRI) measurement of percent fibroglandular tissue (%FGT) using two-point Dixon sequences for fat-water separation. Methods Ten female volunteers (median age: 31 yrs, range: 23–50 yrs) gave informed consent following Research Ethics Committee approval. Each volunteer was scanned twice following repositioning to enable an estimation of measurement repeatability from high-resolution gradient-echo (GRE) proton-density (PD)-weighted Dixon sequences. Differences in measures of %FGT attributable to resolution, T1 weighting and sequence type were assessed by comparison of this Dixon sequence with low-resolution GRE PD-weighted Dixon data, and against gradient-echo (GRE) or spin-echo (SE) based T1-weighted Dixon datasets, respectively. Results %FGT measurement from high-resolution PD-weighted Dixon sequences had a coefficient of repeatability of ±4.3%. There was no significant difference in %FGT between high-resolution and low-resolution PD-weighted data. Values of %FGT from GRE and SE T1-weighted data were strongly correlated with that derived from PD-weighted data (r = 0.995 and 0.96, respectively). However, both sequences exhibited higher mean %FGT by 2.9% (p < 0.0001) and 12.6% (p < 0.0001), respectively, in comparison with PD-weighted data; the increase in %FGT from the SE T1-weighted sequence was significantly larger at lower breast densities. Conclusion Although measurement of %FGT at low resolution is feasible, T1 weighting and sequence type impact on the accuracy of Dixon-based %FGT measurements; Dixon MRI protocols for %FGT measurement should be carefully considered, particularly for longitudinal or multi-centre studies. PMID:27011312
Improved CT Detection of Acute Herpes Simplex Virus Type 1 Encephalitis Based on a Frequency-Selective Nonlinear Blending: Comparison With MRI.

PubMed

Bongers, Malte Niklas; Bier, Georg; Ditt, Hendrik; Beck, Robert; Ernemann, Ulrike; Nikolaou, Konstantin; Horger, Marius

2016-11-01

The purpose of this study is to compare the diagnostic efficacy of a new CT postprocessing tool based on frequency-selective nonlinear blending (best-contrast CT) with that of standard linear blending of unenhanced head CT in patients with herpes simplex virus type 1 and herpes simplex virus encephalitis (HSE), using FLAIR MRI sequences as the standard of reference. Fifteen consecutive patients (six women and nine men; mean [± SD] age, 60 ± 19 years) with proven HSE (positive polymerase chain reaction results from CSF analysis and the presence of neurologic deficits) were retrospectively enrolled. All patients had undergone head CT and MRI (mean time interval, 2 ± 2 days). After standardized unenhanced head CT scans were read, presets of the best-contrast algorithm were determined (center, 30 HU; delta, 5 HU; slope, 5 nondimensional), and resulting images were analyzed. Contrast enhancement was objectively measured by ROI analysis, comparing contrast-to-noise ratios (CNRs) of unenhanced CT and best-contrast CT. FLAIR and DWI MRI sequences were analyzed, and FLAIR was considered as the standard of reference. For assessment of disease extent, a previously reported 50-point score (HSE score) was used. CNR values for unenhanced head CT (CNR, 5.42 ± 2.77) could be statistically significantly increased using best-contrast CT (CNR, 9.62 ± 4.28) (p = 0.003). FLAIR sequences yielded a median HSE score of 9.0 (range, 6-17) and DWI sequences yielded HSE scores of 6.0 (range, 5-17). By comparison, unenhanced head CT resulted in a median HSE score of 3.5 (range, 1-6). The median best-contrast CT HSE score was 7.5 (range, 6-10). Agreement between FLAIR and unenhanced CT was 54.44%, that between DWI and best-contrast CT was 95.36%, and that between FLAIR and best-contrast CT was 85.21%. The most frequently overseen findings were located at the level of the upper part of the mesencephalon and at the subthalamic or insular level. Frequency-selective nonlinear blending significantly increases contrast and detects brain parenchymal involvement in HSE more sensitively compared with unenhanced CT. The sensitivity of best-contrast CT seems to be equal to that of DWI and almost as good as that of FLAIR.
Linear Streptomyces plasmids form superhelical circles through interactions between their terminal proteins

PubMed Central

Tsai, Hsiu-Hui; Huang, Chih-Hung; Tessmer, Ingrid; Erie, Dorothy A.; Chen, Carton W.

2011-01-01

Linear chromosomes and linear plasmids of Streptomyces possess covalently bound terminal proteins (TPs) at the 5′ ends of their telomeres. These TPs are proposed to act as primers for DNA synthesis that patches the single-stranded gaps at the 3′ ends during replication. Most (‘archetypal’) Streptomyces TPs (designated Tpg) are highly conserved in size and sequence. In addition, there are a number of atypical TPs with heterologous sequences and sizes, one of which is Tpc that caps SCP1 plasmid of Streptomyces coelicolor. Interactions between the TPs on the linear Streptomyces replicons have been suggested by electrophoretic behaviors of TP-capped DNA and circular genetic maps of Streptomyces chromosomes. Using chemical cross-linking, we demonstrated intramolecular and intermolecular interactions in vivo between Tpgs, between Tpcs and between Tpg and Tpc. Interactions between the chromosomal and plasmid telomeres were also detected in vivo. The intramolecular telomere interactions produced negative superhelicity in the linear DNA, which was relaxed by topoisomerase I. Such intramolecular association between the TPs poses a post-replicational complication in the formation of a pseudo-dimeric structure that requires resolution by exchanging TPs or DNA. PMID:21109537
Molecular characterization of a novel orthomyxovirus from rainbow and steelhead trout (Oncorhynchus mykiss)

USGS Publications Warehouse

Batts, William N.; LaPatra, Scott E.; Katona, Ryan; Leis, Eric; Fei Fan Ng, Terry; Bruieuc, Marine S.O.; Breyta, Rachel; Purcell, Maureen; Waltzek, Thomas B.; Delwart, Eric; Winton, James

2017-01-01

A novel virus, rainbow trout orthomyxovirus (RbtOV), was isolated in 1997 and again in 2000 from commercially-reared rainbow trout (Oncorhynchus mykiss) in Idaho, USA. The virus grew optimally in the CHSE-214 cell line at 15°C producing a diffuse cytopathic effect; however, juvenile rainbow trout exposed to cell culture-grown virus showed no mortality or gross pathology. Electron microscopy of preparations from infected cell cultures revealed the presence of typical orthomyxovirus particles. The complete genome of RbtOV is comprised of eight linear segments of single-stranded, negative-sense RNA having highly conserved 5′ and 3′-terminal nucleotide sequences. Another virus isolated in 2014 from steelhead trout (also O. mykiss) in Wisconsin, USA, and designated SttOV was found to have eight genome segments with high amino acid sequence identities (89–99%) to the corresponding genes of RbtOV, suggesting these new viruses are isolates of the same virus species and may be more widespread than currently realized. The new isolates had the same genome segment order and the closest pairwise amino acid sequence identities of 16–42% with Infectious salmon anemia virus (ISAV), the type species and currently only member of the genus Isavirus in the family Orthomyxoviridae. However, pairwise comparisons of the predicted amino acid sequences of the 10 RbtOV and SttOV proteins with orthologs from representatives of the established orthomyxoviral genera and a phylogenetic analysis using the PB1 protein showed that while RbtOV and SttOV clustered most closely with ISAV, they diverged sufficiently to merit consideration as representatives of a novel genus. A set of PCR primers was designed using conserved regions of the PB1 gene to produce amplicons that may be sequenced for identification of similar fish orthomyxoviruses in the future.
SU-F-T-540: Comprehensive Fluence Delivery Optimization with Multileaf Collimation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weppler, S; Villarreal-Barajas, J; Department of Medical Physics, Tom Baker Cancer Center, Calgary, Alberta

2016-06-15

Purpose: Multileaf collimator (MLC) leaf sequencing is performed via commercial black-box implementations, on which a user has limited to no access. We have developed an explicit, generic MLC sequencing model to serve as a tool for future investigations of fluence map optimization, fluence delivery optimization, and rotational collimator delivery methods. Methods: We have developed a novel, comprehensive model to effectively account for a variety of transmission and penumbra effects previously treated on an ad hoc basis in the literature. As the model is capable of quantifying a variety of effects, we utilize the asymmetric leakage intensity across each leaf tomore » deliver fluence maps with pixel size smaller than the narrowest leaf width. Developed using linear programming and mixed integer programming formulations, the model is implemented using state of the art open-source solvers. To demonstrate the versatility of the algorithm, a graphical user interface (GUI) was developed in MATLAB capable of accepting custom leaf specifications and transmission parameters. As a preliminary proof-ofconcept, we have sequenced the leaves of a Varian 120 Leaf Millennium MLC for five prostate cancer patient fields and one head and neck field. Predetermined fluence maps have been processed by data smoothing methods to obtain pixel sizes of 2.5 cm{sup 2}. The quality of output was analyzed using computer simulations. Results: For the prostate fields, an average root mean squared error (RMSE) of 0.82 and gamma (0.5mm/0.5%) of 91.4% were observed compared to RMSE and gamma (0.5mm/0.5%) values of 7.04 and 34.0% when the leakage considerations were omitted. Similar results were observed for the head and neck case. Conclusion: A model to sequence MLC leaves to optimality has been proposed. Future work will involve extensive testing and evaluation of the method on clinical MLCs and comparison with black-box leaf sequencing algorithms currently used by commercial treatment planning systems.« less
Morphology and Small-Subunit Ribosomal DNA Sequence of Henneguya Adiposa (Myxosporea) From Ictalurus punctatus (Siluriformes)

USDA-ARS?s Scientific Manuscript database

The original description of Henneguya adiposa, a myxozoan parasitizing channel catfish Ictalurus punctatus, is supplemented with new data on spore morphology, including photomicrographs and line drawings, as well as 18S small-subunit (SSU) ribosomal DNA (rDNA) sequence. Elongate, translucent, linear...
Properties of Cordonnier, Perrin and Van der Laan Numbers

ERIC Educational Resources Information Center

Shannon, A. G.; Anderson, P. G.; Horadam, A. F.

2006-01-01

This paper aims to explore some properties of certain third-order linear sequences which have some properties analogous to the better known second-order sequences of Fibonacci and Lucas. Historical background issues are outlined. These, together with the number and combinatorial theoretical results, provide plenty of pedagogical opportunities for…

Microbial genome sequencing using optical mapping and Illumina sequencing

USDA-ARS?s Scientific Manuscript database

Introduction Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...
Mineral and Lithology Mapping of Drill Core Pulps Using Visible and Infrared Spectrometry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Taylor, G. R., E-mail: G.Taylor@unsw.edu.au

2000-12-15

A novel approach for using field spectrometry for determining both the mineralogy and the lithology of drill core pulps (powders) is developed and evaluated. The methodology is developed using material from a single drillhole through a mineralized sequence of rocks from central New South Wales. Mineral library spectra are used in linear unmixing routines to determine the mineral abundances in drill core pulps that represent between 1 m and 3 m of core. Comparison with X-Ray Diffraction (XRD) analyses shows that for most major constituents, spectrometry provides an estimate of quantitative mineralogy that is as reliable as that provided bymore » XRD. Confusion between the absorption features of calcite and those of chlorite causes the calcite contents determined by spectrometry to be unreliable. Convex geometry is used to recognize the spectra of those samples that are extreme and are representative of unique lithologies. Linear unmixing is used to determine the abundance of these lithologies in each drillhole sample and these abundances are used to interpret the geology of the drillhole. The interpreted geology agrees well with conventional drillhole logs of the visible geology and photographs of the split core. The methods developed provide a quick and cost-effective way of determining the lithology and alteration mineralogy of drill core pulps.« less
Validation of finite element computations for the quantitative prediction of underwater noise from impact pile driving.

PubMed

Zampolli, Mario; Nijhof, Marten J J; de Jong, Christ A F; Ainslie, Michael A; Jansen, Erwin H W; Quesson, Benoit A J

2013-01-01

The acoustic radiation from a pile being driven into the sediment by a sequence of hammer strikes is studied with a linear, axisymmetric, structural acoustic frequency domain finite element model. Each hammer strike results in an impulsive sound that is emitted from the pile and then propagated in the shallow water waveguide. Measurements from accelerometers mounted on the head of a test pile and from hydrophones deployed in the water are used to validate the model results. Transfer functions between the force input at the top of the anvil and field quantities, such as acceleration components in the structure or pressure in the fluid, are computed with the model. These transfer functions are validated using accelerometer or hydrophone measurements to infer the structural forcing. A modeled hammer forcing pulse is used in the successive step to produce quantitative predictions of sound exposure at the hydrophones. The comparison between the model and the measurements shows that, although several simplifying assumptions were made, useful predictions of noise levels based on linear structural acoustic models are possible. In the final part of the paper, the model is used to characterize the pile as an acoustic radiator by analyzing the flow of acoustic energy.
Teaching Formulaic Sequences in an English-Language Class: The Effects of Explicit Instruction versus Coursebook Instruction

ERIC Educational Resources Information Center

Le-Thi, Duyen; Rodgers, Michael P. H.; Pellicer-Sánchez, Ana

2017-01-01

This study investigates the relative effectiveness of different teaching approaches on the learning of formulaic sequences. Three comparisons were made in this study: the effects of explicit teaching of formulaic sequences versus teaching embedded in traditional coursebook instruction, the effects of the degree of salience of the sequences in the…
W-curve alignments for HIV-1 genomic comparisons.

PubMed

Cork, Douglas J; Lembark, Steven; Tovanabutra, Sodsai; Robb, Merlin L; Kim, Jerome H

2010-06-01

The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time-short enough to affect clinical choices for acute treatment. A description of the W-curve generation process, including a comparison technique of aligning extremes of the curves to effectively phase-shift them past the HIV-1 gap problem, is presented. Besides yielding similar neighbor-joining phenogram topologies, most Mother and Infant C2-V5 sequences in the cohort pairs geometrically map closest to each other, indicating that W-curve heuristics overcame any gap problem.
Common Amino Acid Subsequences in a Universal Proteome—Relevance for Food Science

PubMed Central

Minkiewicz, Piotr; Darewicz, Małgorzata; Iwaniak, Anna; Sokołowska, Jolanta; Starowicz, Piotr; Bucholska, Justyna; Hrynkiewicz, Monika

2015-01-01

A common subsequence is a fragment of the amino acid chain that occurs in more than one protein. Common subsequences may be an object of interest for food scientists as biologically active peptides, epitopes, and/or protein markers that are used in comparative proteomics. An individual bioactive fragment, in particular the shortest fragment containing two or three amino acid residues, may occur in many protein sequences. An individual linear epitope may also be present in multiple sequences of precursor proteins. Although recent recommendations for prediction of allergenicity and cross-reactivity include not only sequence identity, but also similarities in secondary and tertiary structures surrounding the common fragment, local sequence identity may be used to screen protein sequence databases for potential allergens in silico. The main weakness of the screening process is that it overlooks allergens and cross-reactivity cases without identical fragments corresponding to linear epitopes. A single peptide may also serve as a marker of a group of allergens that belong to the same family and, possibly, reveal cross-reactivity. This review article discusses the benefits for food scientists that follow from the common subsequences concept. PMID:26340620
Droplet-based pyrosequencing using digital microfluidics.

PubMed

Boles, Deborah J; Benton, Jonathan L; Siew, Germaine J; Levy, Miriam H; Thwar, Prasanna K; Sandahl, Melissa A; Rouse, Jeremy L; Perkins, Lisa C; Sudarsan, Arjun P; Jalili, Roxana; Pamula, Vamsee K; Srinivasan, Vijay; Fair, Richard B; Griffin, Peter B; Eckhardt, Allen E; Pollack, Michael G

2011-11-15

The feasibility of implementing pyrosequencing chemistry within droplets using electrowetting-based digital microfluidics is reported. An array of electrodes patterned on a printed-circuit board was used to control the formation, transportation, merging, mixing, and splitting of submicroliter-sized droplets contained within an oil-filled chamber. A three-enzyme pyrosequencing protocol was implemented in which individual droplets contained enzymes, deoxyribonucleotide triphosphates (dNTPs), and DNA templates. The DNA templates were anchored to magnetic beads which enabled them to be thoroughly washed between nucleotide additions. Reagents and protocols were optimized to maximize signal over background, linearity of response, cycle efficiency, and wash efficiency. As an initial demonstration of feasibility, a portion of a 229 bp Candida parapsilosis template was sequenced using both a de novo protocol and a resequencing protocol. The resequencing protocol generated over 60 bp of sequence with 100% sequence accuracy based on raw pyrogram levels. Excellent linearity was observed for all of the homopolymers (two, three, or four nucleotides) contained in the C. parapsilosis sequence. With improvements in microfluidic design it is expected that longer reads, higher throughput, and improved process integration (i.e., "sample-to-sequence" capability) could eventually be achieved using this low-cost platform.
Effective gene prediction by high resolution frequency estimator based on least-norm solution technique

PubMed Central

2014-01-01

Linear algebraic concept of subspace plays a significant role in the recent techniques of spectrum estimation. In this article, the authors have utilized the noise subspace concept for finding hidden periodicities in DNA sequence. With the vast growth of genomic sequences, the demand to identify accurately the protein-coding regions in DNA is increasingly rising. Several techniques of DNA feature extraction which involves various cross fields have come up in the recent past, among which application of digital signal processing tools is of prime importance. It is known that coding segments have a 3-base periodicity, while non-coding regions do not have this unique feature. One of the most important spectrum analysis techniques based on the concept of subspace is the least-norm method. The least-norm estimator developed in this paper shows sharp period-3 peaks in coding regions completely eliminating background noise. Comparison of proposed method with existing sliding discrete Fourier transform (SDFT) method popularly known as modified periodogram method has been drawn on several genes from various organisms and the results show that the proposed method has better as well as an effective approach towards gene prediction. Resolution, quality factor, sensitivity, specificity, miss rate, and wrong rate are used to establish superiority of least-norm gene prediction method over existing method. PMID:24386895
In-vivo spinal cord deformation in flexion

NASA Astrophysics Data System (ADS)

Yuan, Qing; Dougherty, Lawrence; Margulies, Susan S.

1997-05-01

Traumatic mechanical loading of the head-neck complex results cervical spinal cord injury when the distortion of the cord is sufficient to produce functional or structural failure of the cord's neural and/or vascular components. Characterizing cervical spinal cord deformation during physiological loading conditions is an important step to defining a comprehensive injury threshold associated with acute spinal cord injury. In this study, in vivo quasi- static deformation of the cervical spinal cord during flexion of the neck in human volunteers was measured using magnetic resonance (MR) imaging of motion with spatial modulation of magnetization (SPAMM). A custom-designed device was built to guide the motion of the neck and enhance more reproducibility. the SPAMM pulse sequence labeled the tissue with a series of parallel tagging lines. A single- shot gradient-recalled-echo sequence was used to acquire the mid-sagittal image of the cervical spine. A comparison of the tagged line pattern in each MR reference and deformed image pair revealed the distortion of the spinal cord. The results showed the cervical spinal cord elongates during head flexion. The elongation experienced by the spinal cord varies linearly with head flexion, with the posterior surface of the cord stretching more than the anterior surface. The maximal elongation of the cord is about 12 percent of its original length.
Unravelling Glucan Recognition Systems by Glycome Microarrays Using the Designer Approach and Mass Spectrometry*

PubMed Central

Palma, Angelina S.; Liu, Yan; Zhang, Hongtao; Zhang, Yibing; McCleary, Barry V.; Yu, Guangli; Huang, Qilin; Guidolin, Leticia S.; Ciocchini, Andres E.; Torosantucci, Antonella; Wang, Denong; Carvalho, Ana Luísa; Fontes, Carlos M. G. A.; Mulloy, Barbara; Childs, Robert A.; Feizi, Ten; Chai, Wengang

2015-01-01

Glucans are polymers of d-glucose with differing linkages in linear or branched sequences. They are constituents of microbial and plant cell-walls and involved in important bio-recognition processes, including immunomodulation, anticancer activities, pathogen virulence, and plant cell-wall biodegradation. Translational possibilities for these activities in medicine and biotechnology are considerable. High-throughput micro-methods are needed to screen proteins for recognition of specific glucan sequences as a lead to structure–function studies and their exploitation. We describe construction of a “glucome” microarray, the first sequence-defined glycome-scale microarray, using a “designer” approach from targeted ligand-bearing glucans in conjunction with a novel high-sensitivity mass spectrometric sequencing method, as a screening tool to assign glucan recognition motifs. The glucome microarray comprises 153 oligosaccharide probes with high purity, representing major sequences in glucans. Negative-ion electrospray tandem mass spectrometry with collision-induced dissociation was used for complete linkage analysis of gluco-oligosaccharides in linear “homo” and “hetero” and branched sequences. The system is validated using antibodies and carbohydrate-binding modules known to target α- or β-glucans in different biological contexts, extending knowledge on their specificities, and applied to reveal new information on glucan recognition by two signaling molecules of the immune system against pathogens: Dectin-1 and DC-SIGN. The sequencing of the glucan oligosaccharides by the MS method and their interrogation on the microarrays provides detailed information on linkage, sequence and chain length requirements of glucan-recognizing proteins, and are a sensitive means of revealing unsuspected sequences in the polysaccharides. PMID:25670804
Generation of linear dynamic models from a digital nonlinear simulation

NASA Technical Reports Server (NTRS)

Daniele, C. J.; Krosel, S. M.

1979-01-01

The results and methodology used to derive linear models from a nonlinear simulation are presented. It is shown that averaged positive and negative perturbations in the state variables can reduce numerical errors in finite difference, partial derivative approximations and, in the control inputs, can better approximate the system response in both directions about the operating point. Both explicit and implicit formulations are addressed. Linear models are derived for the F 100 engine, and comparisons of transients are made with the nonlinear simulation. The problem of startup transients in the nonlinear simulation in making these comparisons is addressed. Also, reduction of the linear models is investigated using the modal and normal techniques. Reduced-order models of the F 100 are derived and compared with the full-state models.
Identification of gyrB and rpoB gene mutations and differentially expressed proteins between a novobiocin-resistant Aeromonas hydrophila catfish vaccine strain and its virulent parent strain

USDA-ARS?s Scientific Manuscript database

Sequence comparison between the full-length 2412 bp DNA gyrase subunit B (gyrB) gene of a novobiocin resistant Aeromonas hydrophila AH11NOVO vaccine strain and that of its virulent parent strain AH11P revealed 10 missense mutations. Similarly, sequence comparison between the full-length 4092 bp RNA ...
Using variable rate models to identify genes under selection in sequence pairs: their validity and limitations for EST sequences.

PubMed

Church, Sheri A; Livingstone, Kevin; Lai, Zhao; Kozik, Alexander; Knapp, Steven J; Michelmore, Richard W; Rieseberg, Loren H

2007-02-01

Using likelihood-based variable selection models, we determined if positive selection was acting on 523 EST sequence pairs from two lineages of sunflower and lettuce. Variable rate models are generally not used for comparisons of sequence pairs due to the limited information and the inaccuracy of estimates of specific substitution rates. However, previous studies have shown that the likelihood ratio test (LRT) is reliable for detecting positive selection, even with low numbers of sequences. These analyses identified 56 genes that show a signature of selection, of which 75% were not identified by simpler models that average selection across codons. Subsequent mapping studies in sunflower show four of five of the positively selected genes identified by these methods mapped to domestication QTLs. We discuss the validity and limitations of using variable rate models for comparisons of sequence pairs, as well as the limitations of using ESTs for identification of positively selected genes.
Linear programming model to construct phylogenetic network for 16S rRNA sequences of photosynthetic organisms and influenza viruses.

PubMed

Mathur, Rinku; Adlakha, Neeru

2014-06-01

Phylogenetic trees give the information about the vertical relationships of ancestors and descendants but phylogenetic networks are used to visualize the horizontal relationships among the different organisms. In order to predict reticulate events there is a need to construct phylogenetic networks. Here, a Linear Programming (LP) model has been developed for the construction of phylogenetic network. The model is validated by using data sets of chloroplast of 16S rRNA sequences of photosynthetic organisms and Influenza A/H5N1 viruses. Results obtained are in agreement with those obtained by earlier researchers.
Optimal choice of word length when comparing two Markov sequences using a χ 2-statistic.

PubMed

Bai, Xin; Tang, Kujin; Ren, Jie; Waterman, Michael; Sun, Fengzhu

2017-10-03

Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains and the likelihood ratio test or the corresponding approximate χ 2 -statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies. We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r 1 and r 2 , respectively. We show through both simulations and theoretical studies that the optimal k= max(r 1 ,r 2 )+1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains. Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
Construction of Pseudomolecule Sequences of the aus Rice Cultivar Kasalath for Comparative Genomics of Asian Cultivated Rice

PubMed Central

Sakai, Hiroaki; Kanamori, Hiroyuki; Arai-Kichise, Yuko; Shibata-Hatta, Mari; Ebana, Kaworu; Oono, Youko; Kurita, Kanako; Fujisawa, Hiroko; Katagiri, Satoshi; Mukai, Yoshiyuki; Hamada, Masao; Itoh, Takeshi; Matsumoto, Takashi; Katayose, Yuichi; Wakasa, Kyo; Yano, Masahiro; Wu, Jianzhong

2014-01-01

Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone. PMID:24578372
Inferences from structural comparison: flexibility, secondary structure wobble and sequence alignment optimization.

PubMed

Zhang, Gaihua; Su, Zhen

2012-01-01

Work on protein structure prediction is very useful in biological research. To evaluate their accuracy, experimental protein structures or their derived data are used as the 'gold standard'. However, as proteins are dynamic molecular machines with structural flexibility such a standard may be unreliable. To investigate the influence of the structure flexibility, we analysed 3,652 protein structures of 137 unique sequences from 24 protein families. The results showed that (1) the three-dimensional (3D) protein structures were not rigid: the root-mean-square deviation (RMSD) of the backbone Cα of structures with identical sequences was relatively large, with the average of the maximum RMSD from each of the 137 sequences being 1.06 Å; (2) the derived data of the 3D structure was not constant, e.g. the highest ratio of the secondary structure wobble site was 60.69%, with the sequence alignments from structural comparisons of two proteins in the same family sometimes being completely different. Proteins may have several stable conformations and the data derived from resolved structures as a 'gold standard' should be optimized before being utilized as criteria to evaluate the prediction methods, e.g. sequence alignment from structural comparison. Helix/β-sheet transition exists in normal free proteins. The coil ratio of the 3D structure could affect its resolution as determined by X-ray crystallography.
A Comparison of Traditional Worksheet and Linear Programming Methods for Teaching Manure Application Planning.

ERIC Educational Resources Information Center

Schmitt, M. A.; And Others

1994-01-01

Compares traditional manure application planning techniques calculated to meet agronomic nutrient needs on a field-by-field basis with plans developed using computer-assisted linear programming optimization methods. Linear programming provided the most economical and environmentally sound manure application strategy. (Contains 15 references.) (MDH)
An Evidence-Based Systematic Review of Amplitude Compression in Hearing Aids for School-Age Children With Hearing Loss

PubMed Central

McCreery, Ryan W.; Venediktov, Rebecca A.; Coleman, Jaumeiko J.; Leech, Hillary M.

2013-01-01

Purpose Two clinical questions were developed: one addressing the comparison of linear amplification with compression limiting to linear amplification with peak clipping, and the second comparing wide dynamic range compression with linear amplification for outcomes of audibility, speech recognition, speech and language, and self- or parent report in children with hearing loss. Method Twenty-six databases were systematically searched for studies addressing a clinical question and meeting all inclusion criteria. Studies were evaluated for methodological quality, and effect sizes were reported or calculated when possible. Results The literature search resulted in the inclusion of 8 studies. All 8 studies included comparisons of wide dynamic range compression to linear amplification, and 2 of the 8 studies provided comparisons of compression limiting versus peak clipping. Conclusions Moderate evidence from the included studies demonstrated that audibility was improved and speech recognition was either maintained or improved with wide dynamic range compression as compared with linear amplification. No significant differences were observed between compression limiting and peak clipping on outcomes (i.e., speech recognition and self-/parent report) reported across the 2 studies. Preference ratings appear to be influenced by participant characteristics and environmental factors. Further research is needed before conclusions can confidently be drawn. PMID:22858616
Scrambled Sobol Sequences via Permutation

DTIC Science & Technology

2009-01-01

LCG LCG64 LFG MLFG PMLCG Sobol Scrambler PermutationScrambler LinearScrambler <<uses>> PermuationFactory StaticFactory DynamicFactory <<uses>> Figure 3...Phy., 19:252–256, 1979. [2] Emanouil I. Atanassov. A new efficient algorithm for generating the scrambled sobol ’ sequence. In NMA ’02: Revised Papers...Deidre W.Evan, and Micheal Mascagni. On the scrambled sobol sequence. In ICCS2005, pages 775–782, 2005. [7] Richard Durstenfeld. Algorithm 235: Random

Draft Genome Sequence of the Serratia rubidaea CIP 103234T Reference Strain, a Human-Opportunistic Pathogen.

PubMed

Bonnin, Rémy A; Girlich, Delphine; Imanci, Dilek; Dortet, Laurent; Naas, Thierry

2015-11-19

We provide here the first genome sequence of a Serratia rubidaea isolate, a human-opportunistic pathogen. This reference sequence will permit a comparison of this species with others of the Serratia genus. Copyright © 2015 Bonnin et al.
A clone-free, single molecule map of the domestic cow (Bos taurus) genome.

PubMed

Zhou, Shiguo; Goldstein, Steve; Place, Michael; Bechner, Michael; Patino, Diego; Potamousis, Konstantinos; Ravindran, Prabu; Pape, Louise; Rincon, Gonzalo; Hernandez-Ortiz, Juan; Medrano, Juan F; Schwartz, David C

2015-08-28

The cattle (Bos taurus) genome was originally selected for sequencing due to its economic importance and unique biology as a model organism for understanding other ruminants, or mammals. Currently, there are two cattle genome sequence assemblies (UMD3.1 and Btau4.6) from groups using dissimilar assembly algorithms, which were complemented by genetic and physical map resources. However, past comparisons between these assemblies revealed substantial differences. Consequently, such discordances have engendered ambiguities when using reference sequence data, impacting genomic studies in cattle and motivating construction of a new optical map resource--BtOM1.0--to guide comparisons and improvements to the current sequence builds. Accordingly, our comprehensive comparisons of BtOM1.0 against the UMD3.1 and Btau4.6 sequence builds tabulate large-to-immediate scale discordances requiring mediation. The optical map, BtOM1.0, spanning the B. taurus genome (Hereford breed, L1 Dominette 01449) was assembled from an optical map dataset consisting of 2,973,315 (439 X; raw dataset size before assembly) single molecule optical maps (Rmaps; 1 Rmap = 1 restriction mapped DNA molecule) generated by the Optical Mapping System. The BamHI map spans 2,575.30 Mb and comprises 78 optical contigs assembled by a combination of iterative (using the reference sequence: UMD3.1) and de novo assembly techniques. BtOM1.0 is a high-resolution physical map featuring an average restriction fragment size of 8.91 Kb. Comparisons of BtOM1.0 vs. UMD3.1, or Btau4.6, revealed that Btau4.6 presented far more discordances (7,463) vs. UMD3.1 (4,754). Overall, we found that Btau4.6 presented almost double the number of discordances than UMD3.1 across most of the 6 categories of sequence vs. map discrepancies, which are: COMPLEX (misassembly), DELs (extraneous sequences), INSs (missing sequences), ITs (Inverted/Translocated sequences), ECs (extra restriction cuts) and MCs (missing restriction cuts). Alignments of UMD3.1 and Btau4.6 to BtOM1.0 reveal discordances commensurate with previous reports, and affirm the NCBI's current designation of UMD3.1 sequence assembly as the "reference assembly" and the Btau4.6 as the "alternate assembly." The cattle genome optical map, BtOM1.0, when used as a comprehensive and largely independent guide, will greatly assist improvements to existing sequence builds, and later serve as an accurate physical scaffold for studies concerning the comparative genomics of cattle breeds.
Modular and configurable optimal sequence alignment software: Cola.

PubMed

Zamani, Neda; Sundström, Görel; Höppner, Marc P; Grabherr, Manfred G

2014-01-01

The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable. We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs. Cola is freely available under LPGL from https://github.com/nedaz/cola.
Non-Condon equilibrium Fermi’s golden rule electronic transition rate constants via the linearized semiclassical method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sun, Xiang; Geva, Eitan

2016-06-28

In this paper, we test the accuracy of the linearized semiclassical (LSC) expression for the equilibrium Fermi’s golden rule rate constant for electronic transitions in the presence of non-Condon effects. We do so by performing a comparison with the exact quantum-mechanical result for a model where the donor and acceptor potential energy surfaces are parabolic and identical except for shifts in the equilibrium energy and geometry, and the coupling between them is linear in the nuclear coordinates. Since non-Condon effects may or may not give rise to conical intersections, both possibilities are examined by considering: (1) A modified Garg-Onuchic-Ambegaokar modelmore » for charge transfer in the condensed phase, where the donor-acceptor coupling is linear in the primary mode coordinate, and for which non-Condon effects do not give rise to a conical intersection; (2) the linear vibronic coupling model for electronic transitions in gas phase molecules, where non-Condon effects give rise to conical intersections. We also present a comprehensive comparison between the linearized semiclassical expression and a progression of more approximate expressions. The comparison is performed over a wide range of frictions and temperatures for model (1) and over a wide range of temperatures for model (2). The linearized semiclassical method is found to reproduce the exact quantum-mechanical result remarkably well for both models over the entire range of parameters under consideration. In contrast, more approximate expressions are observed to deviate considerably from the exact result in some regions of parameter space.« less
eShadow: A tool for comparing closely related sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.

2004-01-15

Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualizationmore » of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/« less
RECOVIR Software for Identifying Viruses

NASA Technical Reports Server (NTRS)

Chakravarty, Sugoto; Fox, George E.; Zhu, Dianhui

2013-01-01

Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains with highly divergent capsid sequences. Determining the capsid residues or nucleotides that uniquely characterize these strains is critical in understanding the strain diversity of these viruses. RECOVIR (an acronym for "recognize viruses") software predicts the strains of some ssRNA viruses from their limited sequence data. Novel phylogenetic-tree-based databases of protein or nucleic acid residues that uniquely characterize these virus strains are created. Strains of input virus sequences (partial or complete) are predicted through residue-wise comparisons with the databases. RECOVIR uses unique characterizing residues to identify automatically strains of partial or complete capsid sequences of picorna and caliciviruses, two of the most highly diverse ssRNA virus families. Partition-wise comparisons of the database residues with the corresponding residues of more than 300 complete and partial sequences of these viruses resulted in correct strain identification for all of these sequences. This study shows the feasibility of creating databases of hitherto unknown residues uniquely characterizing the capsid sequences of two of the most highly divergent ssRNA virus families. These databases enable automated strain identification from partial or complete capsid sequences of these human and animal pathogens.
Matrix Transformations between Certain Sequence Spaces over the Non-Newtonian Complex Field

PubMed Central

Efe, Hakan

2014-01-01

In some cases, the most general linear operator between two sequence spaces is given by an infinite matrix. So the theory of matrix transformations has always been of great interest in the study of sequence spaces. In the present paper, we introduce the matrix transformations in sequence spaces over the field ℂ* and characterize some classes of infinite matrices with respect to the non-Newtonian calculus. Also we give the necessary and sufficient conditions on an infinite matrix transforming one of the classical sets over ℂ* to another one. Furthermore, the concept for sequence-to-sequence and series-to-series methods of summability is given with some illustrated examples. PMID:25110740
A Comparison of Linear versus Non-Linear Models of Aversive Self-Awareness, Dissociation, and Non-Suicidal Self-Injury among Young Adults

ERIC Educational Resources Information Center

Armey, Michael F.; Crowther, Janis H.

2008-01-01

Research has identified a significant increase in both the incidence and prevalence of non-suicidal self-injury (NSSI). The present study sought to test both linear and non-linear cusp catastrophe models by using aversive self-awareness, which was operationalized as a composite of aversive self-relevant affect and cognitions, and dissociation as…
Lines of Eigenvectors and Solutions to Systems of Linear Differential Equations

ERIC Educational Resources Information Center

Rasmussen, Chris; Keynes, Michael

2003-01-01

The purpose of this paper is to describe an instructional sequence where students invent a method for locating lines of eigenvectors and corresponding solutions to systems of two first order linear ordinary differential equations with constant coefficients. The significance of this paper is two-fold. First, it represents an innovative alternative…
Formation of template-switching artifacts by linear amplification.

PubMed

Chakravarti, Dhrubajyoti; Mailander, Paula C

2008-07-01

Linear amplification is a method of synthesizing single-stranded DNA from either a single-stranded DNA or one strand of a double-stranded DNA. In this protocol, molecules of a single primer DNA are extended by multiple rounds of DNA synthesis at high temperature using thermostable DNA polymerases. Although linear amplification generates the intended full-length single-stranded product, it is more efficient over single-stranded templates than double-stranded templates. We analyzed linear amplification over single- or double-stranded mouse H-ras DNA (exon 1-2 region). The single-stranded H-ras template yielded only the intended product. However, when the double-stranded template was used, additional artifact products were observed. Increasing the concentration of the double-stranded template produced relatively higher amounts of these artifact products. One of the artifact DNA bands could be mapped and analyzed by sequencing. It contained three template-switching products. These DNAs were formed by incomplete DNA strand extension over the template strand, followed by switching to the complementary strand at a specific Ade nucleotide within a putative hairpin sequence, from which DNA synthesis continued over the complementary strand.
Comparison of traditional phenotypic identification methods with partial 5' 16S rRNA gene sequencing for species-level identification of nonfermenting Gram-negative bacilli.

PubMed

Cloud, Joann L; Harmsen, Dag; Iwen, Peter C; Dunn, James J; Hall, Gerri; Lasala, Paul Rocco; Hoggan, Karen; Wilson, Deborah; Woods, Gail L; Mellmann, Alexander

2010-04-01

Correct identification of nonfermenting Gram-negative bacilli (NFB) is crucial for patient management. We compared phenotypic identifications of 96 clinical NFB isolates with identifications obtained by 5' 16S rRNA gene sequencing. Sequencing identified 88 isolates (91.7%) with >99% similarity to a sequence from the assigned species; 61.5% of sequencing results were concordant with phenotypic results, indicating the usability of sequencing to identify NFB.
Sequence Tolerance of a Highly Stable Single Domain Antibody: Comparison of Computational and Experimental Profiles

DTIC Science & Technology

2016-09-09

evaluating 18 mutants using either the A or B conformer is only r = ~ 0.2. Given the poor performance of approximating the observed experimental ...1 Sequence Tolerance of a Highly Stable Single Domain Antibody: Comparison of Computational and Experimental Profiles Mark A. Olson,1 Patricia...unusually high thermal stability is explored by a combined computational and experimental study. Starting with the crystallographic structure
Neurophysiological correlates of linearization in language production

PubMed Central

Habets, Boukje; Jansma, Bernadette M; Münte, Thomas F

2008-01-01

Background During speech production the planning of a description of several events requires, among other things, a verbal sequencing of these events. During this process, referred to as linearization during conceptualization, the speaker can choose between different types of temporal connectives such as 'Before' X did A, Y did B' or 'After' Y did B, X did A'. To capture the neural events of such linearization processes, event-related potentials (ERP) were measured in native speakers of German. Utterances were elicited by presenting a sequence of two pictures on a video screen. Each picture consists of an object that is associated with a particular action (e.g. book = reading). A coloured vocalization cue indicated to describe the sequence of two actions associated with the objects in chronological (e.g. red cue: 'After' I drove the car, I read a book) or reversed order (yellow cue). Results Brain potentials showed reliable differences between the two conditions from 180 ms after the onset of the vocalization prompt, with ERPs from the 'After' condition being more negative. This 'Before/After' difference showed a fronto-central distribution between 180 and 230 ms. From 300 ms onwards, a parietal distribution was observed. The latter effect is interpreted as an instance of the P300 response, which is known to be modulated by task difficulty. Conclusion ERPs preceding overt sentence production are sensitive to conceptual linearization. The observed early, more fronto-centrally distributed variation could be interpreted as involvement of working memory needed to order the events according to the instruction. The later parietal distributed variation relates to the complexity in linearization, with the non-chronological order being more demanding during the updating of the concepts in working memory. PMID:18681961
A Teaching and Learning Sequence about the Interplay of Chance and Determinism in Nonlinear Systems

ERIC Educational Resources Information Center

Stavrou, D.; Duit, R.; Komorek, M.

2008-01-01

A teaching and learning sequence aimed at introducing upper secondary school students to the interplay between chance and determinism in nonlinear systems is presented. Three experiments concerning nonlinear systems (deterministic chaos, self-organization and fractals) and one experiment concerning linear systems are introduced. Thirty upper…
Arduino-based automation of a DNA extraction system.

PubMed

Kim, Kyung-Won; Lee, Mi-So; Ryu, Mun-Ho; Kim, Jong-Won

2015-01-01

There have been many studies to detect infectious diseases with the molecular genetic method. This study presents an automation process for a DNA extraction system based on microfluidics and magnetic bead, which is part of a portable molecular genetic test system. This DNA extraction system consists of a cartridge with chambers, syringes, four linear stepper actuators, and a rotary stepper actuator. The actuators provide a sequence of steps in the DNA extraction process, such as transporting, mixing, and washing for the gene specimen, magnetic bead, and reagent solutions. The proposed automation system consists of a PC-based host application and an Arduino-based controller. The host application compiles a G code sequence file and interfaces with the controller to execute the compiled sequence. The controller executes stepper motor axis motion, time delay, and input-output manipulation. It drives the stepper motor with an open library, which provides a smooth linear acceleration profile. The controller also provides a homing sequence to establish the motor's reference position, and hard limit checking to prevent any over-travelling. The proposed system was implemented and its functionality was investigated, especially regarding positioning accuracy and velocity profile.
Genome Sequence of the Bacterium Streptomyces davawensis JCM 4913 and Heterologous Production of the Unique Antibiotic Roseoflavin

PubMed Central

Jankowitsch, Frank; Schwarz, Julia; Rückert, Christian; Gust, Bertolt; Szczepanowski, Rafael; Blom, Jochen; Pelzer, Stefan; Kalinowski, Jörn

2012-01-01

Streptomyces davawensis JCM 4913 synthesizes the antibiotic roseoflavin, a structural riboflavin (vitamin B2) analog. Here, we report the 9,466,619-bp linear chromosome of S. davawensis JCM 4913 and a 89,331-bp linear plasmid. The sequence has an average G+C content of 70.58% and contains six rRNA operons (16S-23S-5S) and 69 tRNA genes. The 8,616 predicted protein-coding sequences include 32 clusters coding for secondary metabolites, several of which are unique to S. davawensis. The chromosome contains long terminal inverted repeats of 33,255 bp each and atypical telomeres. Sequence analysis with regard to riboflavin biosynthesis revealed three different patterns of gene organization in Streptomyces species. Heterologous expression of a set of genes present on a subgenomic fragment of S. davawensis resulted in the production of roseoflavin by the host Streptomyces coelicolor M1152. Phylogenetic analysis revealed that S. davawensis is a close relative of Streptomyces cinnabarinus, and much to our surprise, we found that the latter bacterium is a roseoflavin producer as well. PMID:23043000
Genome sequence of Lactobacillus rhamnosus ATCC 8530.

PubMed

Pittet, Vanessa; Ewen, Emily; Bushell, Barry R; Ziola, Barry

2012-02-01

Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.
Scalable Kernel Methods and Algorithms for General Sequence Analysis

ERIC Educational Resources Information Center

Kuksa, Pavel

2011-01-01

Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…
Identification of food and beverage spoilage yeasts from DNA sequence analyses

USDA-ARS?s Scientific Manuscript database

Detection, identification, and classification of yeasts has undergone a major transformation in the last decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined DNA sequences from domains 1 and 2 (D1/D2) of th...
Sequenced sorghum mutant library- an efficient platform for discovery of causal gene mutations

USDA-ARS?s Scientific Manuscript database

Ethyl methanesulfonate (EMS) efficiently generates high-density mutations in genomes. We applied whole-genome sequencing to 256 phenotyped mutant lines of sorghum (Sorghum bicolor L. Moench) to 16x coverage. Comparisons with the reference sequence revealed >1.8 million canonical EMS-induced G/C to A...

Isotherm investigation for the sorption of fluoride onto Bio-F: comparison of linear and non-linear regression method

NASA Astrophysics Data System (ADS)

Yadav, Manish; Singh, Nitin Kumar

2017-12-01

A comparison of the linear and non-linear regression method in selecting the optimum isotherm among three most commonly used adsorption isotherms (Langmuir, Freundlich, and Redlich-Peterson) was made to the experimental data of fluoride (F) sorption onto Bio-F at a solution temperature of 30 ± 1 °C. The coefficient of correlation (r2) was used to select the best theoretical isotherm among the investigated ones. A total of four Langmuir linear equations were discussed and out of which linear form of most popular Langmuir-1 and Langmuir-2 showed the higher coefficient of determination (0.976 and 0.989) as compared to other Langmuir linear equations. Freundlich and Redlich-Peterson isotherms showed a better fit to the experimental data in linear least-square method, while in non-linear method Redlich-Peterson isotherm equations showed the best fit to the tested data set. The present study showed that the non-linear method could be a better way to obtain the isotherm parameters and represent the most suitable isotherm. Redlich-Peterson isotherm was found to be the best representative (r2 = 0.999) for this sorption system. It is also observed that the values of β are not close to unity, which means the isotherms are approaching the Freundlich but not the Langmuir isotherm.
First genome sequences of Achromobacter phages reveal new members of the N4 family.

PubMed

Wittmann, Johannes; Dreiseikelmann, Brigitte; Rohde, Manfred; Meier-Kolthoff, Jan P; Bunk, Boyke; Rohde, Christine

2014-01-27

Multi-resistant Achromobacter xylosoxidans has been recognized as an emerging pathogen causing nosocomially acquired infections during the last years. Phages as natural opponents could be an alternative to fight such infections. Bacteriophages against this opportunistic pathogen were isolated in a recent study. This study shows a molecular analysis of two podoviruses and reveals first insights into the genomic structure of Achromobacter phages so far. Growth curve experiments and adsorption kinetics were performed for both phages. Adsorption and propagation in cells were visualized by electron microscopy. Both phage genomes were sequenced with the PacBio RS II system based on single molecule, real-time (SMRT) technology and annotated with several bioinformatic tools. To further elucidate the evolutionary relationships between the phage genomes, a phylogenomic analysis was conducted using the genome Blast Distance Phylogeny approach (GBDP). In this study, we present the first detailed analysis of genome sequences of two Achromobacter phages so far. Phages JWAlpha and JWDelta were isolated from two different waste water treatment plants in Germany. Both phages belong to the Podoviridae and contain linear, double-stranded DNA with a length of 72329 bp and 73659 bp, respectively. 92 and 89 putative open reading frames were identified for JWAlpha and JWDelta, respectively, by bioinformatic analysis with several tools. The genomes have nearly the same organization and could be divided into different clusters for transcription, replication, host interaction, head and tail structure and lysis. Detailed annotation via protein comparisons with BLASTP revealed strong similarities to N4-like phages. Analysis of the genomes of Achromobacter phages JWAlpha and JWDelta and comparisons of different gene clusters with other phages revealed that they might be strongly related to other N4-like phages, especially of the Escherichia group. Although all these phages show a highly conserved genomic structure and partially strong similarities at the amino acid level, some differences could be identified. Those differences, e.g. the existence of specific genes for replication or host interaction in some N4-like phages, seem to be interesting targets for further examination of function and specific mechanisms, which might enlighten the mechanism of phage establishment in the host cell after infection.
The genome sequence of the biocontrol fungus Metarhizium anisopliae and comparative genomics of Metarhizium species.

PubMed

Pattemore, Julie A; Hane, James K; Williams, Angela H; Wilson, Bree A L; Stodart, Ben J; Ash, Gavin J

2014-08-07

Metarhizium anisopliae is an important fungal biocontrol agent of insect pests of agricultural crops. Genomics can aid the successful commercialization of biopesticides by identification of key genes differentiating closely related species, selection of virulent microbial isolates which are amenable to industrial scale production and formulation and through the reduction of phenotypic variability. The genome of Metarhizium isolate ARSEF23 was recently published as a model for M. anisopliae, however phylogenetic analysis has since re-classified this isolate as M. robertsii. We present a new annotated genome sequence of M. anisopliae (isolate Ma69) and whole genome comparison to M. robertsii (ARSEF23) and M. acridum (CQMa 102). Whole genome analysis of M. anisopliae indicates significant macrosynteny with M. robertsii but with some large genomic inversions. In comparison to M. acridum, the genome of M. anisopliae shares lower sequence homology. While alignments overall are co-linear, the genome of M. acridum is not contiguous enough to conclusively observe macrosynteny. Mating type gene analysis revealed both MAT1-1 and MAT1-2 genes present in M. anisopliae suggesting putative homothallism, despite having no known teleomorph, in contrast with the putatively heterothallic M. acridum isolate CQMa 102 (MAT1-2) and M. robertsii isolate ARSEF23 (altered MAT1-1). Repetitive DNA and RIP analysis revealed M. acridum to have twice the repetitive content of the other two species and M. anisopliae to be five times more RIP affected than M. robertsii. We also present an initial bioinformatic survey of candidate pathogenicity genes in M. anisopliae. The annotated genome of M. anisopliae is an important resource for the identification of virulence genes specific to M. anisopliae and development of species- and strain- specific assays. New insight into the possibility of homothallism and RIP affectedness has important implications for the development of M. anisopliae as a biopesticide as it may indicate the potential for greater inherent diversity in this species than the other species. This could present opportunities to select isolates with unique combinations of pathogenicity factors, or it may point to instability in the species, a negative attribute in a biopesticide.
Analytic reconstruction of magnetic resonance imaging signal obtained from a periodic encoding field.

PubMed

Rybicki, F J; Hrovat, M I; Patz, S

2000-09-01

We have proposed a two-dimensional PERiodic-Linear (PERL) magnetic encoding field geometry B(x,y) = g(y)y cos(q(x)x) and a magnetic resonance imaging pulse sequence which incorporates two fields to image a two-dimensional spin density: a standard linear gradient in the x dimension, and the PERL field. Because of its periodicity, the PERL field produces a signal where the phase of the two dimensions is functionally different. The x dimension is encoded linearly, but the y dimension appears as the argument of a sinusoidal phase term. Thus, the time-domain signal and image spin density are not related by a two-dimensional Fourier transform. They are related by a one-dimensional Fourier transform in the x dimension and a new Bessel function integral transform (the PERL transform) in the y dimension. The inverse of the PERL transform provides a reconstruction algorithm for the y dimension of the spin density from the signal space. To date, the inverse transform has been computed numerically by a Bessel function expansion over its basis functions. This numerical solution used a finite sum to approximate an infinite summation and thus introduced a truncation error. This work analytically determines the basis functions for the PERL transform and incorporates them into the reconstruction algorithm. The improved algorithm is demonstrated by (1) direct comparison between the numerically and analytically computed basis functions, and (2) reconstruction of a known spin density. The new solution for the basis functions also lends proof of the system function for the PERL transform under specific conditions.
De Novo Protein Structure Prediction

NASA Astrophysics Data System (ADS)

Hung, Ling-Hong; Ngan, Shing-Chung; Samudrala, Ram

An unparalleled amount of sequence data is being made available from large-scale genome sequencing efforts. The data provide a shortcut to the determination of the function of a gene of interest, as long as there is an existing sequenced gene with similar sequence and of known function. This has spurred structural genomic initiatives with the goal of determining as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose of this is twofold: First, the structure of a gene product can often lead to direct inference of its function. Second, since the function of a protein is dependent on its structure, direct comparison of the structures of gene products can be more sensitive than the comparison of sequences of genes for detecting homology. Presently, structural determination by crystallography and NMR techniques is still slow and expensive in terms of manpower and resources, despite attempts to automate the processes. Computer structure prediction algorithms, while not providing the accuracy of the traditional techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures which the structural genomic projects are attempting to solve, there would be a considerable gain even if the computer structure prediction approach were applicable to a subset of proteins.
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments

PubMed Central

Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic

2001-01-01

Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Interactive computer programs for the graphic analysis of nucleotide sequence data.

PubMed Central

Luckow, V A; Littlewood, R K; Rownd, R H

1984-01-01

A group of interactive computer programs have been developed which aid in the collection and graphical analysis of nucleotide and protein sequence data. The programs perform the following basic functions: a) enter, edit, list, and rearrange sequence data; b) permit automatic entry of nucleotide sequence data directly from an autoradiograph into the computer; c) search for restriction sites or other specified patterns and plot a linear or circular restriction map, or print their locations; d) plot base composition; e) analyze homology between sequences by plotting a two-dimensional graphic matrix; and f) aid in plotting predicted secondary structures of RNA molecules. PMID:6546437
Identification of novel antibody-reactive detection sites for comprehensive gluten monitoring.

PubMed

Röckendorf, Niels; Meckelein, Barbara; Scherf, Katharina A; Schalk, Kathrin; Koehler, Peter; Frey, Andreas

2017-01-01

Certain cereals like wheat, rye or barley contain gluten, a protein mixture that can trigger celiac disease (CD). To make gluten-free diets available for affected individuals the gluten content of foodstuff must be monitored. For this purpose, antibody-based assays exist which rely on the recognition of certain linear gluten sequence motifs. Yet, not all CD-active gluten constituents and fragments formed during food processing/fermentation may be covered by those tests. In this study, we therefore assayed the coverage of reportedly CD-active gluten components by currently available detection antibodies and determined the antibody-inducing capacity of wheat gluten constituents in order to provide novel diagnostic targets for comprehensive gluten quantitation. Immunizations of outbred mice with purified gliadins and glutenins were conducted and the linear target recognition profile of the sera was recorded using synthetic peptide arrays that covered the sequence space of gluten constituents present in those preparations. The resulting murine immunorecognition profile of gluten demonstrated that further linear binding sites beyond those recognized by the monoclonal antibodies α20, R5 and G12 exist and may be exploitable as diagnostic targets. We conclude that the safety of foodstuffs for CD patients can be further improved by complementing current tests with antibodies directed against additional CD-active gluten components. Currently unrepresented linear gluten detection sites in glutenins and α-gliadins suggest sequences QQQYPS, PQQSFP, QPGQGQQG and QQPPFS as novel targets for antibody generation.
Identification of novel antibody-reactive detection sites for comprehensive gluten monitoring

PubMed Central

Röckendorf, Niels; Meckelein, Barbara; Scherf, Katharina A.; Schalk, Kathrin; Koehler, Peter

2017-01-01

Certain cereals like wheat, rye or barley contain gluten, a protein mixture that can trigger celiac disease (CD). To make gluten-free diets available for affected individuals the gluten content of foodstuff must be monitored. For this purpose, antibody-based assays exist which rely on the recognition of certain linear gluten sequence motifs. Yet, not all CD-active gluten constituents and fragments formed during food processing/fermentation may be covered by those tests. In this study, we therefore assayed the coverage of reportedly CD-active gluten components by currently available detection antibodies and determined the antibody-inducing capacity of wheat gluten constituents in order to provide novel diagnostic targets for comprehensive gluten quantitation. Immunizations of outbred mice with purified gliadins and glutenins were conducted and the linear target recognition profile of the sera was recorded using synthetic peptide arrays that covered the sequence space of gluten constituents present in those preparations. The resulting murine immunorecognition profile of gluten demonstrated that further linear binding sites beyond those recognized by the monoclonal antibodies α20, R5 and G12 exist and may be exploitable as diagnostic targets. We conclude that the safety of foodstuffs for CD patients can be further improved by complementing current tests with antibodies directed against additional CD-active gluten components. Currently unrepresented linear gluten detection sites in glutenins and α-gliadins suggest sequences QQQYPS, PQQSFP, QPGQGQQG and QQPPFS as novel targets for antibody generation. PMID:28759621
Codification of scan path parameters and development of perimeter scan strategies for 3D bowl-shaped laser forming

NASA Astrophysics Data System (ADS)

Tavakoli, A.; Naeini, H. Moslemi; Roohi, Amir H.; Gollo, M. Hoseinpour; Shahabad, Sh. Imani

2018-01-01

In the 3D laser forming process, developing an appropriate laser scan pattern for producing specimens with high quality and uniformity is critical. This study presents certain principles for developing scan paths. Seven scan path parameters are considered, including: (1) combined linear or curved path; (2) type of combined linear path; (3) order of scan sequences; (4) the position of the start point in each scan; (5) continuous or discontinuous scan path; (6) direction of scan path; and (7) angular arrangement of combined linear scan paths. Regarding these path parameters, ten combined linear scan patterns are presented. Numerical simulations show continuous hexagonal, scan pattern, scanning from outer to inner path, is the optimized. In addition, it is observed the position of the start point and the angular arrangement of scan paths is the most effective path parameters. Also, further experimentations show four sequences due to creat symmetric condition enhance the height of the bowl-shaped products and uniformity. Finally, the optimized hexagonal pattern was compared with the similar circular one. In the hexagonal scan path, distortion value and standard deviation rather to edge height of formed specimen is very low, and the edge height despite of decreasing length of scan path increases significantly compared to the circular scan path. As a result, four-sequence hexagonal scan pattern is proposed as the optimized perimeter scan path to produce bowl-shaped product.
Molecular characterization of the virulent infectious hematopoietic necrosis virus (IHNV) strain 220-90

PubMed Central

2010-01-01

Background Infectious hematopoietic necrosis virus (IHNV) is the type species of the genus Novirhabdovirus, within the family Rhabdoviridae, infecting several species of wild and hatchery reared salmonids. Similar to other rhabdoviruses, IHNV has a linear single-stranded, negative-sense RNA genome of approximately 11,000 nucleotides. The IHNV genome encodes six genes; the nucleocapsid, phosphoprotein, matrix protein, glycoprotein, non-virion protein and polymerase protein genes, respectively. This study describes molecular characterization of the virulent IHNV strain 220-90, belonging to the M genogroup, and its phylogenetic relationships with available sequences of IHNV isolates worldwide. Results The complete genomic sequence of IHNV strain 220-90 was determined from the DNA of six overlapping clones obtained by RT-PCR amplification of genomic RNA. The complete genome sequence of 220-90 comprises 11,133 nucleotides (GenBank GQ413939) with the gene order of 3'-N-P-M-G-NV-L-5'. These genes are separated by conserved gene junctions, with di-nucleotide gene spacers. An additional uracil nucleotide was found at the end of the 5'-trailer region, which was not reported before in other IHNV strains. The first 15 of the 16 nucleotides at the 3'- and 5'-termini of the genome are complementary, and the first 4 nucleotides at 3'-ends of the IHNV are identical to other novirhadoviruses. Sequence homology and phylogenetic analysis of the glycoprotein genes show that 220-90 strain is 97% identical to most of the IHNV strains. Comparison of the virulent 220-90 genomic sequences with less virulent WRAC isolate shows more than 300 nucleotides changes in the genome, which doesn't allow one to speculate putative residues involved in the virulence of IHNV. Conclusion We have molecularly characterized one of the well studied IHNV isolates, 220-90 of genogroup M, which is virulent for rainbow trout, and compared phylogenetic relationship with North American and other strains. Determination of the complete nucleotide sequence is essential for future studies on pathogenesis of IHNV using a reverse genetics approach and developing efficient control strategies. PMID:20085652
Gradient-Induced Voltages on 12-Lead ECGs during High Duty-Cycle MRI Sequences and a Method for Their Removal considering Linear and Concomitant Gradient Terms

PubMed Central

Zhang, Shelley HuaLei; Ho Tse, Zion Tsz; Dumoulin, Charles L.; Kwong, Raymond Y.; Stevenson, William G.; Watkins, Ronald; Ward, Jay; Wang, Wei; Schmidt, Ehud J.

2015-01-01

Purpose To restore 12-lead ECG signal fidelity inside MRI by removing magnetic-field gradient induced-voltages during high gradient-duty-cycle sequences. Theory and Methods A theoretical equation was derived, providing first- and second-order electrical fields induced at individual ECG electrode as a function of gradient fields. Experiments were performed at 3T on healthy volunteers, using a customized acquisition system which captured full amplitude and frequency response of ECGs, or a commercial recording system. The 19 equation coefficients were derived by linear regression of data from accelerated sequences, and used to compute induced-voltages in real-time during full-resolution sequences to remove ECG artifacts. Restored traces were evaluated relative to ones acquired without imaging. Results Measured induced-voltages were 0.7V peak-to-peak during balanced Steady-State Free Precession (bSSFP) with heart at the isocenter. Applying the equation during gradient echo sequencing, three-dimensional fast spin echo and multi-slice bSSFP imaging restored nonsaturated traces and second-order concomitant terms showed larger contributions in electrodes farther from the magnet isocenter. Equation coefficients are evaluated with high repeatability (ρ = 0.996) and are subject, sequence, and slice-orientation dependent. Conclusion Close agreement between theoretical and measured gradient-induced voltages allowed for real-time removal. Prospective estimation of sequence-periods where large induced-voltages occur may allow hardware removal of these signals. PMID:26101951
The CNVrd2 package: measurement of copy number at complex loci using high-throughput sequencing data.

PubMed

Nguyen, Hoang T; Merriman, Tony R; Black, Michael A

2014-01-01

Recent advances in high-throughout sequencing technologies have made it possible to accurately assign copy number (CN) at CN variable loci. However, current analytic methods often perform poorly in regions in which complex CN variation is observed. Here we report the development of a read depth-based approach, CNVrd2, for investigation of CN variation using high-throughput sequencing data. This methodology was developed using data from the 1000 Genomes Project from the CCL3L1 locus, and tested using data from the DEFB103A locus. In both cases, samples were selected for which paralog ratio test data were also available for comparison. The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population. Then a linear regression model is applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model to cluster segmentation scores into groups for individual CN counts. The performance of CNVrd2 was compared to that of two other read depth-based methods (CNVnator, cn.mops) at the CCL3L1 and DEFB103A loci. The highest concordance with the paralog ratio test method was observed for CNVrd2 (77.8/90.4% for CNVrd2, 36.7/4.8% for cn.mops and 7.2/1% for CNVnator at CCL3L1 and DEF103A). CNVrd2 is available as an R package as part of the Bioconductor project: http://www.bioconductor.org/packages/release/bioc/html/CNVrd2.html.
Application of Linear and Non-Linear Harmonic Methods for Unsteady Transonic Flow

NASA Astrophysics Data System (ADS)

Gundevia, Rayomand

This thesis explores linear and non-linear computational methods for solving unsteady flow. The eventual goal is to apply these methods to two-dimensional and three-dimensional flutter predictions. In this study the quasi-one-dimensional nozzle is used as a framework for understanding these methods and their limitations. Subsonic and transonic cases are explored as the back-pressure is forced to oscillate with known amplitude and frequency. A steady harmonic approach is used to solve this unsteady problem for which perturbations are said to be small in comparison to the mean flow. The use of a linearized Euler equations (LEE) scheme is good at capturing the flow characteristics but is limited by accuracy to relatively small amplitude perturbations. The introduction of time-averaged second-order terms in the Non-Linear Harmonic (NLH) method means that a better approximation of the mean-valued solution, upon which the linearization is based, can be made. The nonlinear time-accurate Euler solutions are used for comparison and to establish the regimes of unsteadiness for which these schemes fails. The usefulness of the LEE and NLH methods lie in the gains in computational efficiency over the full equations.
Enzyme sequence similarity improves the reaction alignment method for cross-species pathway comparison

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ovacik, Meric A.; Androulakis, Ioannis P., E-mail: yannis@rci.rutgers.edu; Biomedical Engineering Department, Rutgers University, Piscataway, NJ 08854

2013-09-15

Pathway-based information has become an important source of information for both establishing evolutionary relationships and understanding the mode of action of a chemical or pharmaceutical among species. Cross-species comparison of pathways can address two broad questions: comparison in order to inform evolutionary relationships and to extrapolate species differences used in a number of different applications including drug and toxicity testing. Cross-species comparison of metabolic pathways is complex as there are multiple features of a pathway that can be modeled and compared. Among the various methods that have been proposed, reaction alignment has emerged as the most successful at predicting phylogeneticmore » relationships based on NCBI taxonomy. We propose an improvement of the reaction alignment method by accounting for sequence similarity in addition to reaction alignment method. Using nine species, including human and some model organisms and test species, we evaluate the standard and improved comparison methods by analyzing glycolysis and citrate cycle pathways conservation. In addition, we demonstrate how organism comparison can be conducted by accounting for the cumulative information retrieved from nine pathways in central metabolism as well as a more complete study involving 36 pathways common in all nine species. Our results indicate that reaction alignment with enzyme sequence similarity results in a more accurate representation of pathway specific cross-species similarities and differences based on NCBI taxonomy.« less
DNA sequence similarity recognition by hybridization to short oligomers

DOEpatents

Milosavljevic, Aleksandar

1999-01-01

Methods are disclosed for the comparison of nucleic acid sequences. Data is generated by hybridizing sets of oligomers with target nucleic acids. The data thus generated is manipulated simultaneously with respect to both (i) matching between oligomers and (ii) matching between oligomers and putative reference sequences available in databases. Using data compression methods to manipulate this mutual information, sequences for the target can be constructed.
Comparison of fMRI data from passive listening and active-response story processing tasks in children

PubMed Central

Vannest, Jennifer J.; Karunanayaka, Prasanna R.; Altaye, Mekibib; Schmithorst, Vincent J.; Plante, Elena M.; Eaton, Kenneth J.; Rasmussen, Jerod M.; Holland, Scott K.

2009-01-01

Purpose To use functional MRI methods to visualize a network of auditory and language-processing brain regions associated with processing an aurally-presented story. We compare a passive listening (PL) story paradigm to an active-response (AR) version including on-line performance monitoring and a sparse acquisition technique. Materials/Methods Twenty children (ages 11−13) completed PL and AR story processing tasks. The PL version presented alternating 30-second blocks of stories and tones; the AR version presented story segments, comprehension questions, and 5s tone sequences, with fMRI acquisitions between stimuli. fMRI data was analyzed using a general linear model approach and paired t-test identifying significant group activation. Results Both tasks activated in primary auditory cortex, superior temporal gyrus bilaterally, left inferior frontal gyrus. The AR task demonstrated more extensive activation, including dorsolateral prefrontal cortex and anterior/posterior cingulate cortex. Comparison of effect size in each paradigm showed a larger effect for the AR paradigm in a left inferior frontal ROI. Conclusion Activation patterns for story processing in children are similar in passive listening and active-response tasks. Increases in extent and magnitude of activation in the AR task are likely associated with memory and attention resources engaged across acquisition intervals. PMID:19306445
Comparison of fMRI data from passive listening and active-response story processing tasks in children.

PubMed

Vannest, Jennifer J; Karunanayaka, Prasanna R; Altaye, Mekibib; Schmithorst, Vincent J; Plante, Elena M; Eaton, Kenneth J; Rasmussen, Jerod M; Holland, Scott K

2009-04-01

To use functional MRI (fMRI) methods to visualize a network of auditory and language-processing brain regions associated with processing an aurally-presented story. We compare a passive listening (PL) story paradigm to an active-response (AR) version including online performance monitoring and a sparse acquisition technique. Twenty children (ages 11-13 years) completed PL and AR story processing tasks. The PL version presented alternating 30-second blocks of stories and tones; the AR version presented story segments, comprehension questions, and 5-second tone sequences, with fMRI acquisitions between stimuli. fMRI data was analyzed using a general linear model approach and paired t-test identifying significant group activation. Both tasks showed activation in the primary auditory cortex, superior temporal gyrus bilaterally, and left inferior frontal gyrus (IFG). The AR task demonstrated more extensive activation, including the dorsolateral prefrontal cortex and anterior/posterior cingulate cortex. Comparison of effect size in each paradigm showed a larger effect for the AR paradigm in a left inferior frontal region-of-interest (ROI). Activation patterns for story processing in children are similar in PL and AR tasks. Increases in extent and magnitude of activation in the AR task are likely associated with memory and attention resources engaged across acquisition intervals.
Conditions for Stabilizability of Linear Switched Systems

NASA Astrophysics Data System (ADS)

Minh, Vu Trieu

2011-06-01

This paper investigates some conditions that can provide stabilizability for linear switched systems with polytopic uncertainties via their closed loop linear quadratic state feedback regulator. The closed loop switched systems can stabilize unstable open loop systems or stable open loop systems but in which there is no solution for a common Lyapunov matrix. For continuous time switched linear systems, we show that if there exists solution in an associated Riccati equation for the closed loop systems sharing one common Lyapunov matrix, the switched linear systems are stable. For the discrete time switched systems, we derive a Linear Matrix Inequality (LMI) to calculate a common Lyapunov matrix and solution for the stable closed loop feedback systems. These closed loop linear quadratic state feedback regulators guarantee the global asymptotical stability for any switched linear systems with any switching signal sequence.
Mössbauer spectra linearity improvement by sine velocity waveform followed by linearization process

NASA Astrophysics Data System (ADS)

Kohout, Pavel; Frank, Tomas; Pechousek, Jiri; Kouril, Lukas

2018-05-01

This note reports the development of a new method for linearizing the Mössbauer spectra recorded with a sine drive velocity signal. Mössbauer spectra linearity is a critical parameter to determine Mössbauer spectrometer accuracy. Measuring spectra with a sine velocity axis and consecutive linearization increases the linearity of spectra in a wider frequency range of a drive signal, as generally harmonic movement is natural for velocity transducers. The obtained data demonstrate that linearized sine spectra have lower nonlinearity and line width parameters in comparison with those measured using a traditional triangle velocity signal.

Comparison of Linear and Non-linear Regression Analysis to Determine Pulmonary Pressure in Hyperthyroidism.

PubMed

Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan

2017-01-01

This study aimed at assessing the incidence of pulmonary hypertension (PH) at newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by linear or non-linear function. By applying the linear regression method described by a first-degree equation the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of AIC criterion (criterion 3) and use of F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known- level of free thyroxin, pulmonary vascular resistance, cardiac output; new factors identified in this study- pretreatment period, age, systolic blood pressure. According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola- type) is better than the linear one. The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.
Dimensional Reduction for the General Markov Model on Phylogenetic Trees.

PubMed

Sumner, Jeremy G

2017-03-01

We present a method of dimensional reduction for the general Markov model of sequence evolution on a phylogenetic tree. We show that taking certain linear combinations of the associated random variables (site pattern counts) reduces the dimensionality of the model from exponential in the number of extant taxa, to quadratic in the number of taxa, while retaining the ability to statistically identify phylogenetic divergence events. A key feature is the identification of an invariant subspace which depends only bilinearly on the model parameters, in contrast to the usual multi-linear dependence in the full space. We discuss potential applications including the computation of split (edge) weights on phylogenetic trees from observed sequence data.
Comparison of linear synchronous and induction motors

DOT National Transportation Integrated Search

2004-06-01

A propulsion prade study was conducted as part of the Colorado Maglev Project of FTA's Urban Maglev Technology Development Program to identify and evaluate prospective linear motor designs that could potentially meet the system performance requiremen...
Comparative analysis of linear and non-linear method of estimating the sorption isotherm parameters for malachite green onto activated carbon.

PubMed

Kumar, K Vasanth

2006-08-21

The experimental equilibrium data of malachite green onto activated carbon were fitted to the Freundlich, Langmuir and Redlich-Peterson isotherms by linear and non-linear method. A comparison between linear and non-linear of estimating the isotherm parameters was discussed. The four different linearized form of Langmuir isotherm were also discussed. The results confirmed that the non-linear method as a better way to obtain isotherm parameters. The best fitting isotherm was Langmuir and Redlich-Peterson isotherm. Redlich-Peterson is a special case of Langmuir when the Redlich-Peterson isotherm constant g was unity.
Comparison of base composition analysis and Sanger sequencing of mitochondrial DNA for four U.S. population groups.

PubMed

Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M

2014-01-01

A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.
Multiple pass laser amplifier system

DOEpatents

Brueckner, Keith A.; Jorna, Siebe; Moncur, N. Kent

1977-01-01

A laser amplification method for increasing the energy extraction efficiency from laser amplifiers while reducing the energy flux that passes through a flux limited system which includes apparatus for decomposing a linearly polarized light beam into multiple components, passing the components through an amplifier in delayed time sequence and recombining the amplified components into an in phase linearly polarized beam.
Comparing Linear and Nonlinear Delivery of Introductory Psychology Lectures: Improving Student Retention

ERIC Educational Resources Information Center

Cramer, Kenneth M.; Sands, Mandy

2016-01-01

As in most disciplines, the typical introductory class presents topics to students in a linear fashion, beginning (to use psychology as an example) with the history of the field, research methods, brain and neurons, sensation and perception, and so on. This study examined the impact of topic sequence on student achievement. The same professor…
Generating Nice Linear Systems for Matrix Gaussian Elimination

ERIC Educational Resources Information Center

Homewood, L. James

2004-01-01

In this article an augmented matrix that represents a system of linear equations is called nice if a sequence of elementary row operations that reduces the matrix to row-echelon form, through matrix Gaussian elimination, does so by restricting all entries to integers in every step. Many instructors wish to use the example of matrix Gaussian…
Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

PubMed

Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

2016-05-01

Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina Miseq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.
Comparison of ribosomal RNA removal methods for transcriptome sequencing workflows in teleost fish

USDA-ARS?s Scientific Manuscript database

RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. rRNAs are commonly removed from total RNA via either mRNA selection or rRNA depletion. These methods have...
Genome Sequence of Lactobacillus rhamnosus ATCC 8530

PubMed Central

Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.

2012-01-01

Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences. PMID:22247527
Nucleotide sequence analysis establishes the role of endogenous murine leukemia virus DNA segments in formation of recombinant mink cell focus-forming murine leukemia viruses.

PubMed Central

Khan, A S

1984-01-01

The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS -Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247 -specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247 , NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney-and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

DOE Office of Scientific and Technical Information (OSTI.GOV)

Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.

2006-01-20

Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnLtrnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would bemore » very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Eusterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. Later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at fine level and of RNA editing by comparison to available EST sequences. In addition, since both of these genomes are crop plants, their complete genome sequence will facilitate development of chloroplast genetic engineering technology, as in recent studies from Daniell's lab. Knowing the exact sequence from spacer regions is crucial for introducing transgenes into the chloroplast genome.« less
First Molecular Identification and Phylogeny of Moroccan Anopheles sergentii (Diptera: Culicidae) Based on Second Internal Transcribed Spencer (ITS2) and Cytochrome c Oxidase I (COI) Sequences.

PubMed

Benabdelkrim Filali, Oumama; Kabine, Mostafa; El Hamouchi, Adil; Lemrani, Meryem; Debboun, Mustapha; Sarih, M'hammed

2018-06-05

Anopheles sergentii known as the "oasis vector" or the "desert malaria vector" is considered the main vector of malaria in the southern parts of Morocco. Its presence in Morocco is confirmed for the first time through sequencing of mitochondrial DNA (mDNA) cytochrome c oxidase subunit I (COI) barcodes and nuclear ribosomal DNA (rDNA) second internal transcribed spacer (ITS2) sequences and direct comparison with specimens of A. sergentii of other countries. The DNA barcodes (n = 39) obtained from A. sergentii collected in 2015 and 2016 showed more diversity with 10 haplotypes, compared with 3 haplotypes obtained from ITS2 sequences (n = 59). Moreover, the comparison using the ITS2 sequences showed closer evolutionary relationship between the Moroccan and Egyptian strains than the Iranian strain. Nevertheless, genetic differences due to geographical segregation were also observed. This study provides the first report on the sequence of rDNA-ITS2 and mtDNA COI, which could be used to better understand the biodiversity of A. sergentii.
Synthetic internal control sequences to increase negative call veracity in multiplexed, quantitative PCR assays for Phakopsora pachyrhizi

USDA-ARS?s Scientific Manuscript database

Quantitative PCR (Q-PCR) utilizing specific primer sequences and a fluorogenic, 5’-exonuclease linear hydrolysis probe is well established as a detection and identification method for Phakopsora pachyrhizi, the soybean rust pathogen. Because of the extreme sensitivity of Q-PCR, the DNA of a single u...
Implementation of Objective PASC-Derived Taxon Demarcation Criteria for Official Classification of Filoviruses.

PubMed

Bào, Yīmíng; Amarasinghe, Gaya K; Basler, Christopher F; Bavari, Sina; Bukreyev, Alexander; Chandran, Kartik; Dolnik, Olga; Dye, John M; Ebihara, Hideki; Formenty, Pierre; Hewson, Roger; Kobinger, Gary P; Leroy, Eric M; Mühlberger, Elke; Netesov, Sergey V; Patterson, Jean L; Paweska, Janusz T; Smither, Sophie J; Takada, Ayato; Towner, Jonathan S; Volchkov, Viktor E; Wahl-Jensen, Victoria; Kuhn, Jens H

2017-05-11

The mononegaviral family Filoviridae has eight members assigned to three genera and seven species. Until now, genus and species demarcation were based on arbitrarily chosen filovirus genome sequence divergence values (≈50% for genera, ≈30% for species) and arbitrarily chosen phenotypic virus or virion characteristics. Here we report filovirus genome sequence-based taxon demarcation criteria using the publicly accessible PAirwise Sequencing Comparison (PASC) tool of the US National Center for Biotechnology Information (Bethesda, MD, USA). Comparison of all available filovirus genomes in GenBank using PASC revealed optimal genus demarcation at the 55-58% sequence diversity threshold range for genera and at the 23-36% sequence diversity threshold range for species. Because these thresholds do not change the current official filovirus classification, these values are now implemented as filovirus taxon demarcation criteria that may solely be used for filovirus classification in case additional data are absent. A near-complete, coding-complete, or complete filovirus genome sequence will now be required to allow official classification of any novel "filovirus." Classification of filoviruses into existing taxa or determining the need for novel taxa is now straightforward and could even become automated using a presented algorithm/flowchart rooted in RefSeq (type) sequences.
HIV drug resistance testing among patients failing second line antiretroviral therapy. Comparison of in-house and commercial sequencing.

PubMed

Chimukangara, Benjamin; Varyani, Bhavini; Shamu, Tinei; Mutsvangwa, Junior; Manasa, Justen; White, Elizabeth; Chimbetete, Cleophas; Luethy, Ruedi; Katzenstein, David

2017-05-01

HIV genotyping is often unavailable in low and middle-income countries due to infrastructure requirements and cost. We compared genotype resistance testing in patients with virologic failure, by amplification of HIV pol gene, followed by "in-house" sequencing and commercial sequencing. Remnant plasma samples from adults and children failing second-line ART were amplified and sequenced using in-house and commercial di-deoxysequencing, and analyzed in Harare, Zimbabwe and at Stanford, U.S.A, respectively. HIV drug resistance mutations were determined using the Stanford HIV drug resistance database. Twenty-six of 28 samples were amplified and 25 were successfully genotyped. Comparison of average percent nucleotide and amino acid identities between 23 pairs sequenced in both laboratories were 99.51 (±0.56) and 99.11 (±0.95), respectively. All pairs clustered together in phylogenetic analysis. Sequencing analysis identified 6/23 pairs with mutation discordances resulting in differences in phenotype, but these did not impact future regimens. The results demonstrate our ability to produce good quality drug resistance data in-house. Despite discordant mutations in some sequence pairs, the phenotypic predictions were not clinically significant. Copyright © 2016 Elsevier B.V. All rights reserved.
Reduced-Size Integer Linear Programming Models for String Selection Problems: Application to the Farthest String Problem.

PubMed

Zörnig, Peter

2015-08-01

We present integer programming models for some variants of the farthest string problem. The number of variables and constraints is substantially less than that of the integer linear programming models known in the literature. Moreover, the solution of the linear programming-relaxation contains only a small proportion of noninteger values, which considerably simplifies the rounding process. Numerical tests have shown excellent results, especially when a small set of long sequences is given.
Can Linear Superiorization Be Useful for Linear Optimization Problems?

PubMed Central

Censor, Yair

2017-01-01

Linear superiorization considers linear programming problems but instead of attempting to solve them with linear optimization methods it employs perturbation resilient feasibility-seeking algorithms and steers them toward reduced (not necessarily minimal) target function values. The two questions that we set out to explore experimentally are (i) Does linear superiorization provide a feasible point whose linear target function value is lower than that obtained by running the same feasibility-seeking algorithm without superiorization under identical conditions? and (ii) How does linear superiorization fare in comparison with the Simplex method for solving linear programming problems? Based on our computational experiments presented here, the answers to these two questions are: “yes” and “very well”, respectively. PMID:29335660
Comparison of magnetic resonance imaging sequences for depicting the subthalamic nucleus for deep brain stimulation.

PubMed

Nagahama, Hiroshi; Suzuki, Kengo; Shonai, Takaharu; Aratani, Kazuki; Sakurai, Yuuki; Nakamura, Manami; Sakata, Motomichi

2015-01-01

Electrodes are surgically implanted into the subthalamic nucleus (STN) of Parkinson's disease patients to provide deep brain stimulation. For ensuring correct positioning, the anatomic location of the STN must be determined preoperatively. Magnetic resonance imaging has been used for pinpointing the location of the STN. To identify the optimal imaging sequence for identifying the STN, we compared images produced with T2 star-weighted angiography (SWAN), gradient echo T2*-weighted imaging, and fast spin echo T2-weighted imaging in 6 healthy volunteers. Our comparison involved measurement of the contrast-to-noise ratio (CNR) for the STN and substantia nigra and a radiologist's interpretations of the images. Of the sequences examined, the CNR and qualitative scores were significantly higher on SWAN images than on other images (p < 0.01) for STN visualization. Kappa value (0.74) on SWAN images was the highest in three sequences for visualizing the STN. SWAN is the sequence best suited for identifying the STN at the present time.

Comparison of linear and non-linear models for the adsorption of fluoride onto geo-material: limonite.

PubMed

Sahin, Rubina; Tapadia, Kavita

2015-01-01

The three widely used isotherms Langmuir, Freundlich and Temkin were examined in an experiment using fluoride (F⁻) ion adsorption on a geo-material (limonite) at four different temperatures by linear and non-linear models. Comparison of linear and non-linear regression models were given in selecting the optimum isotherm for the experimental results. The coefficient of determination, r², was used to select the best theoretical isotherm. The four Langmuir linear equations (1, 2, 3, and 4) are discussed. Langmuir isotherm parameters obtained from the four Langmuir linear equations using the linear model differed but they were the same when using the nonlinear model. Langmuir-2 isotherm is one of the linear forms, and it had the highest coefficient of determination (r² = 0.99) compared to the other Langmuir linear equations (1, 3 and 4) in linear form, whereas, for non-linear, Langmuir-4 fitted best among all the isotherms because it had the highest coefficient of determination (r² = 0.99). The results showed that the non-linear model may be a better way to obtain the parameters. In the present work, the thermodynamic parameters show that the absorption of fluoride onto limonite is both spontaneous (ΔG < 0) and endothermic (ΔH > 0). Scanning electron microscope and X-ray diffraction images also confirm the adsorption of F⁻ ion onto limonite. The isotherm and kinetic study reveals that limonite can be used as an adsorbent for fluoride removal. In future we can develop new technology for fluoride removal in large scale by using limonite which is cost-effective, eco-friendly and is easily available in the study area.
A Comparison of Linear and Systems Thinking Approaches for Program Evaluation Illustrated Using the Indiana Interdisciplinary GK-12

ERIC Educational Resources Information Center

Dyehouse, Melissa; Bennett, Deborah; Harbor, Jon; Childress, Amy; Dark, Melissa

2009-01-01

Logic models are based on linear relationships between program resources, activities, and outcomes, and have been used widely to support both program development and evaluation. While useful in describing some programs, the linear nature of the logic model makes it difficult to capture the complex relationships within larger, multifaceted…
Analyzing Multilevel Data: An Empirical Comparison of Parameter Estimates of Hierarchical Linear Modeling and Ordinary Least Squares Regression

ERIC Educational Resources Information Center

Rocconi, Louis M.

2011-01-01

Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

ERIC Educational Resources Information Center

Haberman, Shelby J.; Sinharay, Sandip

2010-01-01

Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
Draft Genome Sequence of Streptomyces clavuligerus NRRL 3585, a Producer of Diverse Secondary Metabolites▿

PubMed Central

Song, Ju Yeon; Jeong, Haeyoung; Yu, Dong Su; Fischbach, Michael A.; Park, Hong-Seog; Kim, Jae Jong; Seo, Jeong-Sun; Jensen, Susan E.; Oh, Tae Kwang; Lee, Kye Joon; Kim, Jihyun F.

2010-01-01

Streptomyces clavuligerus is an important industrial strain that produces a number of antibiotics, including clavulanic acid and cephamycin C. A high-quality draft genome sequence of the S. clavuligerus NRRL 3585 strain was produced by employing a hybrid approach that involved Sanger sequencing, Roche/454 pyrosequencing, optical mapping, and partial finishing. Its genome, comprising four linear replicons, one chromosome, and four plasmids, carries numerous sets of genes involved in the biosynthesis of secondary metabolites, including a variety of antibiotics. PMID:20889745
Multilocus sequence typing of total-genome-sequenced bacteria.

PubMed

Larsen, Mette V; Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W; Aarestrup, Frank M; Lund, Ole

2012-04-01

Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST.
Demonstration of Nondeclarative Sequence Learning in Mice: Development of an Animal Analog of the Human Serial Reaction Time Task

ERIC Educational Resources Information Center

Christie, Michael A.; Hersch, Steven M.

2004-01-01

In this paper, we demonstrate nondeclarative sequence learning in mice using an animal analog of the human serial reaction time task (SRT) that uses a within-group comparison of behavior in response to a repeating sequence versus a random sequence. Ten female B6CBA mice performed eleven 96-trial sessions containing 24 repetitions of a 4-trial…
Structural and optical properties of self-assembled chains of plasmonic nanocubes

DOE PAGES

Klinkova, Anna; Gang, Oleg; Therien-Aubin, Heloise; ...

2014-10-10

Solution-based linear self-assembly of metal nanoparticles offers a powerful strategy for creating plasmonic polymers, which, so far, have been formed from spherical nanoparticles and nanorods. Here, we report linear solution-based self-assembly of metal nanocubes (NCs), examine the structural characteristics of the NC chains and demonstrate their advanced optical characteristics. Predominant face-to-face assembly of large NCs coated with short polymer ligands led to a larger volume of hot spots in the chains, a nearly uniform E-field enhancement in the gaps between co-linear NCs and a new coupling mode for NC chains, in comparison with chains of nanospheres with similar dimensions, compositionmore » and surface chemistry. The NC chains exhibited a stronger surface enhanced Raman scattering (SERS) signal, in comparison with linear assemblies of nanospheres. The experimental results were in agreement with finite difference time domain (FDTD) simulations.« less
Isolation of an insertion sequence (IS1051) from Xanthomonas campestris pv. dieffenbachiae with potential use for strain identification and characterization.

PubMed Central

Berthier, Y; Thierry, D; Lemattre, M; Guesdon, J L

1994-01-01

A new insertion sequence was isolated from Xanthomonas campestris pv. dieffenbachiae. Sequence analysis showed that this element is 1,158 bp long and has 15-bp inverted repeat ends containing two mismatches. Comparison of this sequence with sequences in data bases revealed significant homology with Escherichia coli IS5. IS1051, which detected multiple restriction fragment length polymorphisms, was used as a probe to characterize strains from the pathovar dieffenbachiae. Images PMID:7906933
Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.

PubMed

Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko

2016-03-01

In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
CoCoNUT: an efficient system for the comparison and analysis of genomes

PubMed Central

2008-01-01

Background Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system CoCoNUT (Computational Comparative geNomics Utility Toolkit) that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state of the art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large scale studies in comparative genomics. PMID:19014477
Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

2011-01-01

Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g.more » Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.« less
Phylogenetic Relationship of Necoclí Virus to Other South American Hantaviruses (Bunyaviridae: Hantavirus).

PubMed

Montoya-Ruiz, Carolina; Cajimat, Maria N B; Milazzo, Mary Louise; Diaz, Francisco J; Rodas, Juan David; Valbuena, Gustavo; Fulhorst, Charles F

2015-07-01

The results of a previous study suggested that Cherrie's cane rat (Zygodontomys cherriei) is the principal host of Necoclí virus (family Bunyaviridae, genus Hantavirus) in Colombia. Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences in this study confirmed that Necoclí virus is phylogenetically closely related to Maporal virus, which is principally associated with the delicate pygmy rice rat (Oligoryzomys delicatus) in western Venezuela. In pairwise comparisons, nonidentities between the complete amino acid sequence of the nucleocapsid protein of Necoclí virus and the complete amino acid sequences of the nucleocapsid proteins of other hantaviruses were ≥8.7%. Likewise, nonidentities between the complete amino acid sequence of the glycoprotein precursor of Necoclí virus and the complete amino acid sequences of the glycoprotein precursors of other hantaviruses were ≥11.7%. Collectively, the unique association of Necoclí virus with Z. cherriei in Colombia, results of the Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences, and results of the pairwise comparisons of amino acid sequences strongly support the notion that Necoclí virus represents a novel species in the genus Hantavirus. Further work is needed to determine whether Calabazo virus (a hantavirus associated with Z. brevicauda cherriei in Panama) and Necoclí virus are conspecific.
In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics.

PubMed

Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

2017-01-01

Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica . All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/.
Application of MLST and Pilus Gene Sequence Comparisons to Investigate the Population Structures of Actinomyces naeslundii and Actinomyces oris

PubMed Central

Henssge, Uta; Do, Thuy; Gilbert, Steven C.; Cox, Steven; Clark, Douglas; Wickström, Claes; Ligtenberg, A. J. M.; Radford, David R.; Beighton, David

2011-01-01

Actinomyces naeslundii and Actinomyces oris are members of the oral biofilm. Their identification using 16S rRNA sequencing is problematic and better achieved by comparison of metG partial sequences. A. oris is more abundant and more frequently isolated than A. naeslundii. We used a multi-locus sequence typing approach to investigate the genotypic diversity of these species and assigned A. naeslundii (n = 37) and A. oris (n = 68) isolates to 32 and 68 sequence types (ST), respectively. Neighbor-joining and ClonalFrame dendrograms derived from the concatenated partial sequences of 7 house-keeping genes identified at least 4 significant subclusters within A. oris and 3 within A. naeslundii. The strain collection we had investigated was an under-representation of the total population since at least 3 STs composed of single strains may represent discrete clusters of strains not well represented in the collection. The integrity of these sub-clusters was supported by the sequence analysis of fimP and fimA, genes coding for the type 1 and 2 fimbriae, respectively. An A. naeslundii subcluster was identified with both fimA and fimP genes and these strains were able to bind to MUC7 and statherin while all other A. naeslundii strains possessed only fimA and did not bind to statherin. An A. oris subcluster harboured a fimA gene similar to that of Actinomyces odontolyticus but no detectable fimP failed to bind significantly to either MUC7 or statherin. These data are evidence of extensive genotypic and phenotypic diversity within the species A. oris and A. naeslundii but the status of the subclusters identified here will require genome comparisons before their phylogenic position can be unequivocally established. PMID:21738661
Application of MLST and pilus gene sequence comparisons to investigate the population structures of Actinomyces naeslundii and Actinomyces oris.

PubMed

Henssge, Uta; Do, Thuy; Gilbert, Steven C; Cox, Steven; Clark, Douglas; Wickström, Claes; Ligtenberg, A J M; Radford, David R; Beighton, David

2011-01-01

Actinomyces naeslundii and Actinomyces oris are members of the oral biofilm. Their identification using 16S rRNA sequencing is problematic and better achieved by comparison of metG partial sequences. A. oris is more abundant and more frequently isolated than A. naeslundii. We used a multi-locus sequence typing approach to investigate the genotypic diversity of these species and assigned A. naeslundii (n = 37) and A. oris (n = 68) isolates to 32 and 68 sequence types (ST), respectively. Neighbor-joining and ClonalFrame dendrograms derived from the concatenated partial sequences of 7 house-keeping genes identified at least 4 significant subclusters within A. oris and 3 within A. naeslundii. The strain collection we had investigated was an under-representation of the total population since at least 3 STs composed of single strains may represent discrete clusters of strains not well represented in the collection. The integrity of these sub-clusters was supported by the sequence analysis of fimP and fimA, genes coding for the type 1 and 2 fimbriae, respectively. An A. naeslundii subcluster was identified with both fimA and fimP genes and these strains were able to bind to MUC7 and statherin while all other A. naeslundii strains possessed only fimA and did not bind to statherin. An A. oris subcluster harboured a fimA gene similar to that of Actinomyces odontolyticus but no detectable fimP failed to bind significantly to either MUC7 or statherin. These data are evidence of extensive genotypic and phenotypic diversity within the species A. oris and A. naeslundii but the status of the subclusters identified here will require genome comparisons before their phylogenic position can be unequivocally established.
In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics

PubMed Central

Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

2017-01-01

Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563
The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families.

PubMed

Yooseph, Shibu; Sutton, Granger; Rusch, Douglas B; Halpern, Aaron L; Williamson, Shannon J; Remington, Karin; Eisen, Jonathan A; Heidelberg, Karla B; Manning, Gerard; Li, Weizhong; Jaroszewski, Lukasz; Cieplak, Piotr; Miller, Christopher S; Li, Huiying; Mashiyama, Susan T; Joachimiak, Marcin P; van Belle, Christopher; Chandonia, John-Marc; Soergel, David A; Zhai, Yufeng; Natarajan, Kannan; Lee, Shaun; Raphael, Benjamin J; Bafna, Vineet; Friedman, Robert; Brenner, Steven E; Godzik, Adam; Eisenberg, David; Dixon, Jack E; Taylor, Susan S; Strausberg, Robert L; Frazier, Marvin; Venter, J Craig

2007-03-01

Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.
In silico comparison of genomic regions containing genes coding for enzymes and transcription factors for the phenylpropanoid pathway in Phaseolus vulgaris L. and Glycine max L. Merr

PubMed Central

Reinprecht, Yarmilla; Yadegari, Zeinab; Perry, Gregory E.; Siddiqua, Mahbuba; Wright, Lori C.; McClean, Phillip E.; Pauls, K. Peter

2013-01-01

Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. Indels identified in the comparison of Andean and Mesoamerican common bean 4CL gene sequences might be used to develop inter-pool phenylpropanoid pathway gene-based markers. We anticipate that the information obtained by this study will simplify and accelerate selections of common bean with specific phenylpropanoid pathway alleles to increase the contents of beneficial phenylpropanoids in common bean and other legumes. PMID:24046770
Comparison-based optical study on a point-line-coupling-focus system with linear Fresnel heliostats.

PubMed

Dai, Yanjun; Li, Xian; Zhou, Lingyu; Ma, Xuan; Wang, Ruzhu

2016-05-16

Concentrating the concept of a beam-down solar tower with linear Fresnel heliostat (PLCF) is one of the feasible choices and has great potential in reducing spot size and improving optical efficiency. Optical characteristics of a PLCF system with the hyperboloid reflector are introduced and investigated theoretically. Taking into account solar position and optical surface errors, a Monte Carlo ray-tracing (MCRT) analysis model for a PLCF system is developed and applied in a comparison-based study on the optical performance between the PLCF system and the conventional beam-down solar tower system with flat and spherical heliostats. The optimal square facet of linear Fresnel heliostat is also proposed for matching with the 3D-CPC receiver.

Title Sequences, Dress, Settings, and Such.

ERIC Educational Resources Information Center

Bell, John

Comparisons of television shows along genre lines suggest significant elements of aural/visual richness as well as valuable categories of comparison for future use in other comparisons. An examination of two sitcoms and two police shows produced roughly 25 years apart--"Make Room for Daddy" with "The Cosby Show" and "Naked…
Improved efficiency in amplification of Escherichia coli o-antigen gene clusters using genome-wide sequence comparison

USDA-ARS?s Scientific Manuscript database

Background: In many bacteria including E. coli, genes encoding O-antigens are clustered in the chromosome, with a 39-bp JUMPstart sequence and gnd gene located upstream and downstream of the cluster, respectively. For determining the DNA sequence of the E. coli O-antigen gene cluster, one set of P...
Comparison of Three Instructional Sequences for the Addition and Subtraction Algorithms. Technical Report 273.

ERIC Educational Resources Information Center

Wiles, Clyde A.

The study's purpose was to investigate the differential effects on the achievement of second-grade students that could be attributed to three instructional sequences for the learning of the addition and subtraction algorithms. One sequence presented the addition algorithm first (AS), the second presented the subtraction algorithm first (SA), and…
Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936.

PubMed

Durrens, Pascal; Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J; Noël, Thierry

2017-08-03

Clavispora lusitaniae , an environmental saprophytic yeast belonging to the CTG clade of Candida , can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. Copyright © 2017 Durrens et al.
Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936

PubMed Central

Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J.; Noël, Thierry

2017-01-01

ABSTRACT Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. PMID:28774979
[The dilemma of data flood - reducing costs and increasing quality control].

PubMed

Gassmann, B

2012-09-05

Digitization is found everywhere in sonography. Printing of ultrasound images using the videoprinter with special paper will be done in single cases. The documentation of sonography procedures is more and more done by saving image sequences instead of still frames. Echocardiography is routinely recorded in between with so called R-R-loops. Doing contrast enhanced ultrasound recording of sequences is necessary to get a deep impression of the vascular structure of interest. Working with this data flood in daily practice a specialized software is required. Comparison in follow up of stored and recent images/sequences is very helpful. Nevertheless quality control of the ultrasound system and the transducers is simple and safe - using a phantom for detail resolution and general image quality the stored images/sequences are comparable over the life cycle of the system. The comparison in follow up is showing decreased image quality and transducer defects immediately.
Resolution of the African hominoid trichotomy by use of a mitochondrial gene sequence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ruvolo, M.; Disotell, T.R.; Allard, M.W.

1991-02-15

Mitochondrial DNA sequences encoding the cytochrome oxidase subunit II gene have been determined for five primate species, siamang (Hylobates syndactylus), lowland gorilla (Gorilla gorilla), pygmy chimpanzee (Pan paniscus), crab-eating macaque (Macaca fascicularis), and green monkey (Cercopithecus aethiops), and compared with published sequences of other primate and nonprimate species. Comparisons of cytochrome oxidase subunit II gene sequences provide clear-cut evidence from the mitochondrial genome for the separation of the African ape trichotomy into two evolutionary lineages, one leading to gorillas and the other to humans and chimpanzees. Several different tree-building methods support this same phylogenetic tree topology. The comparisons also yieldmore » trees in which a substantial length separates the divergence point of gorillas from that of humans and chimpanzees, suggesting that the lineage most immediately ancestral to humans and chimpanzees may have been in existence for a relatively long time.« less
Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.

PubMed

Bauer, Markus; Klau, Gunnar W; Reinert, Knut

2007-07-27

The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from http://www.planet-lisa.net.
Design and analysis of linear cascade DNA hybridization chain reactions using DNA hairpins

NASA Astrophysics Data System (ADS)

Bui, Hieu; Garg, Sudhanshu; Miao, Vincent; Song, Tianqi; Mokhtar, Reem; Reif, John

2017-01-01

DNA self-assembly has been employed non-conventionally to construct nanoscale structures and dynamic nanoscale machines. The technique of hybridization chain reactions by triggered self-assembly has been shown to form various interesting nanoscale structures ranging from simple linear DNA oligomers to dendritic DNA structures. Inspired by earlier triggered self-assembly works, we present a system for controlled self-assembly of linear cascade DNA hybridization chain reactions using nine distinct DNA hairpins. NUPACK is employed to assist in designing DNA sequences and Matlab has been used to simulate DNA hairpin interactions. Gel electrophoresis and ensemble fluorescence reaction kinetics data indicate strong evidence of linear cascade DNA hybridization chain reactions. The half-time completion of the proposed linear cascade reactions indicates a linear dependency on the number of hairpins.
Construction of trypanosome artificial mini-chromosomes.

PubMed Central

Lee, M G; E, Y; Axelrod, N

1995-01-01

We report the preparation of two linear constructs which, when transformed into the procyclic form of Trypanosoma brucei, become stably inherited artificial mini-chromosomes. Both of the two constructs, one of 10 kb and the other of 13 kb, contain a T.brucei PARP promoter driving a chloramphenicol acetyltransferase (CAT) gene. In the 10 kb construct the CAT gene is followed by one hygromycin phosphotransferase (Hph) gene, and in the 13 kb construct the CAT gene is followed by three tandemly linked Hph genes. At each end of these linear molecules are telomere repeats and subtelomeric sequences. Electroporation of these linear DNA constructs into the procyclic form of T.brucei generated hygromycin-B resistant cell lines. In these cell lines, the input DNA remained linear and bounded by the telomere ends, but it increased in size. In the cell lines generated by the 10 kb construct, the input DNA increased in size to 20-50 kb. In the cell lines generated by the 13 kb constructs, two sizes of linear DNAs containing the input plasmid were detected: one of 40-50 kb and the other of 150 kb. The increase in size was not the result of in vivo tandem repetitions of the input plasmid, but represented the addition of new sequences. These Hph containing linear DNA molecules were maintained stably in cell lines for at least 20 generations in the absence of drug selection and were subsequently referred to as trypanosome artificial mini-chromosomes, or TACs. Images PMID:8532534
An experimental and analytical investigation on the response of GR/EP composite I-frames

NASA Technical Reports Server (NTRS)

Moas, E., Jr.; Boitnott, R. L.; Griffin, O. H., Jr.

1991-01-01

Six-foot diameter, semicircular graphite/epoxy specimens representative of generic aircraft frames were loaded quasi-statically to determine their load response and failure mechanisms for large deflections that occur in an airplane crash. These frame-skin specimens consisted of a cylindrical skin section cocured with a semicircular I-frame. Various frame laminate stacking sequences and geometries were evaluated by statically loading the specimen until multiple failures occurred. Two analytical methods were compared for modeling the frame-skin specimens: a two-dimensional branched-shell finite element analysis and a one-dimensional, closed-form, curved beam solution derived using an energy method. Excellent correlation was obtained between experimental results and the finite element predictions of the linear response of the frames prior to the initial failure. The beam solution was used for rapid parameter and design studies, and was found to be stiff in comparison with the finite element analysis. The specimens were found to be useful for evaluating composite frame designs.
Generalized Lagrangian Jacobi Gauss collocation method for solving unsteady isothermal gas through a micro-nano porous medium

NASA Astrophysics Data System (ADS)

Parand, Kourosh; Latifi, Sobhan; Delkhosh, Mehdi; Moayeri, Mohammad M.

2018-01-01

In the present paper, a new method based on the Generalized Lagrangian Jacobi Gauss (GLJG) collocation method is proposed. The nonlinear Kidder equation, which explains unsteady isothermal gas through a micro-nano porous medium, is a second-order two-point boundary value ordinary differential equation on the unbounded interval [0, ∞). Firstly, using the quasilinearization method, the equation is converted to a sequence of linear ordinary differential equations. Then, by using the GLJG collocation method, the problem is reduced to solving a system of algebraic equations. It must be mentioned that this equation is solved without domain truncation and variable changing. A comparison with some numerical solutions made and the obtained results indicate that the presented solution is highly accurate. The important value of the initial slope, y'(0), is obtained as -1.191790649719421734122828603800159364 for η = 0.5. Comparing to the best result obtained so far, it is accurate up to 36 decimal places.
A rapid, ratiometric, enzyme-free, and sensitive single-step miRNA detection using three-way junction based FRET probes

NASA Astrophysics Data System (ADS)

Luo, Qingying; Liu, Lin; Yang, Cai; Yuan, Jing; Feng, Hongtao; Chen, Yan; Zhao, Peng; Yu, Zhiqiang; Jin, Zongwen

2018-03-01

MicroRNAs (miRNAs) are single stranded endogenous molecules composed of only 18-24 nucleotides which are critical for gene expression regulating the translation of messenger RNAs. Conventional methods based on enzyme-assisted nucleic acid amplification techniques have many problems, such as easy contamination, high cost, susceptibility to false amplification, and tendency to have sequence mismatches. Here we report a rapid, ratiometric, enzyme-free, sensitive, and highly selective single-step miRNA detection using three-way junction assembled (or self-assembled) FRET probes. The developed strategy can be operated within the linear range from subnanomolar to hundred nanomolar concentrations of miRNAs. In comparison with the traditional approaches, our method showed high sensitivity for the miRNA detection and extreme selectivity for the efficient discrimination of single-base mismatches. The results reveal that the strategy paved a new avenue for the design of novel highly specific probes applicable in diagnostics and potentially in microscopic imaging of miRNAs in real biological environments.
Incorporating Vibration Test Results for the Advanced Stirling Convertor into the System Dynamic Model

NASA Technical Reports Server (NTRS)

Meer, David W.; Lewandowski, Edward J.

2010-01-01

The U.S. Department of Energy (DOE), Lockheed Martin Corporation (LM), and NASA Glenn Research Center (GRC) have been developing the Advanced Stirling Radioisotope Generator (ASRG) for use as a power system for space science missions. As part of the extended operation testing of this power system, the Advanced Stirling Convertors (ASC) at NASA GRC undergo a vibration test sequence intended to simulate the vibration history that an ASC would experience when used in an ASRG for a space mission. During these tests, a data system collects several performance-related parameters from the convertor under test for health monitoring and analysis. Recently, an additional sensor recorded the slip table position during vibration testing to qualification level. The System Dynamic Model (SDM) integrates Stirling cycle thermodynamics, heat flow, mechanical mass, spring, damper systems, and electrical characteristics of the linear alternator and controller. This Paper presents a comparison of the performance of the ASC when exposed to vibration to that predicted by the SDM when exposed to the same vibration.
Examining the Differences of Linear Systems between Finnish and Taiwanese Textbooks

ERIC Educational Resources Information Center

Yang, Der-Ching; Lin, Yung-Chi

2015-01-01

The purpose of this study was to examine the differences between Finnish and Taiwanese textbooks for grades 7 to 9 on the topic of solving systems of linear equations (simultaneous equations). The specific textbooks examined were TK in Taiwan and FL in Finland. The content analysis method was used to examine (a) the teaching sequence, (b)…
Recursive inversion of externally defined linear systems

NASA Technical Reports Server (NTRS)

Bach, Ralph E., Jr.; Baram, Yoram

1988-01-01

The approximate inversion of an internally unknown linear system, given by its impulse response sequence, by an inverse system having a finite impulse response, is considered. The recursive least squares procedure is shown to have an exact initialization, based on the triangular Toeplitz structure of the matrix involved. The proposed approach also suggests solutions to the problems of system identification and compensation.
Method of Conjugate Radii for Solving Linear and Nonlinear Systems

NASA Technical Reports Server (NTRS)

Nachtsheim, Philip R.

1999-01-01

This paper describes a method to solve a system of N linear equations in N steps. A quadratic form is developed involving the sum of the squares of the residuals of the equations. Equating the quadratic form to a constant yields a surface which is an ellipsoid. For different constants, a family of similar ellipsoids can be generated. Starting at an arbitrary point an orthogonal basis is constructed and the center of the family of similar ellipsoids is found in this basis by a sequence of projections. The coordinates of the center in this basis are the solution of linear system of equations. A quadratic form in N variables requires N projections. That is, the current method is an exact method. It is shown that the sequence of projections is equivalent to a special case of the Gram-Schmidt orthogonalization process. The current method enjoys an advantage not shared by the classic Method of Conjugate Gradients. The current method can be extended to nonlinear systems without modification. For nonlinear equations the Method of Conjugate Gradients has to be augmented with a line-search procedure. Results for linear and nonlinear problems are presented.
Peptidomic Identification of Cysteine-Rich Peptides from Plants.

PubMed

Hemu, Xinya; Serra, Aida; Darwis, Dina A; Cornvik, Tobias; Sze, Siu Kwan; Tam, James P

2018-01-01

Plant cysteine-rich peptides (CRPs) constitute a majority of plant-derived peptides with high molecular diversity. This protocol describes a rapid and efficient peptidomic approach to identify a whole spectrum of CRPs in a plant extract and decipher their molecular diversity and bioprocessing mechanism. Cyclotides from C. ternatea are used as the model CRPs to demonstrate our methodology. Cyclotides exist naturally in both cyclic and linear forms, although the linear forms (acyclotide) are generally present at much lower concentrations. Both cyclotides and acyclotides require linearization of their backbone prior to fragmentation and sequencing. A novel and practical three-step chemoenzymatic treatment was developed to linearize and distinguish both forms: (1) N-terminal acetylation that pre-labels the acyclotides; (2) conversion of Cys into pseudo-Lys through aziridine-mediated S-alkylation to reduce disulfide bonds and to increase the net charge of peptides; and (3) opening of cyclic backbones by the novel asparaginyl endopeptidase butelase 2 that cleaves at the native bioprocessing site. The treated peptides are subsequently analyzed by liquid chromatography coupled to mass spectrometry using electron transfer dissociation fragmentation and sequences are identified by matching the MS/MS spectra directly with the transcriptomic database.
Subcellular localization for Gram positive and Gram negative bacterial proteins using linear interpolation smoothing model.

PubMed

Saini, Harsh; Raicar, Gaurav; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok

2015-12-07

Protein subcellular localization is an important topic in proteomics since it is related to a protein׳s overall function, helps in the understanding of metabolic pathways, and in drug design and discovery. In this paper, a basic approximation technique from natural language processing called the linear interpolation smoothing model is applied for predicting protein subcellular localizations. The proposed approach extracts features from syntactical information in protein sequences to build probabilistic profiles using dependency models, which are used in linear interpolation to determine how likely is a sequence to belong to a particular subcellular location. This technique builds a statistical model based on maximum likelihood. It is able to deal effectively with high dimensionality that hinders other traditional classifiers such as Support Vector Machines or k-Nearest Neighbours without sacrificing performance. This approach has been evaluated by predicting subcellular localizations of Gram positive and Gram negative bacterial proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.
Linearized Poststall Aerodynamic and Control Law Models of the X-31A Aircraft and Comparison with Flight Data

NASA Technical Reports Server (NTRS)

Stoliker, Patrick C.; Bosworth, John T.; Georgie, Jennifer

1997-01-01

The X-31A aircraft has a unique configuration that uses thrust-vector vanes and aerodynamic control effectors to provide an operating envelope to a maximum 70 deg angle of attack, an inherently nonlinear portion of the flight envelope. This report presents linearized versions of the X-31A longitudinal and lateral-directional control systems, with aerodynamic models sufficient to evaluate characteristics in the poststall envelope at 30 deg, 45 deg, and 60 deg angle of attack. The models are presented with detail sufficient to allow the reader to reproduce the linear results or perform independent control studies. Comparisons between the responses of the linear models and flight data are presented in the time and frequency domains to demonstrate the strengths and weaknesses of the ability to predict high-angle-of-attack flight dynamics using linear models. The X-31A six-degree-of-freedom simulation contains a program that calculates linear perturbation models throughout the X-31A flight envelope. The models include aerodynamics and flight control system dynamics that are used for stability, controllability, and handling qualities analysis. The models presented in this report demonstrate the ability to provide reasonable linear representations in the poststall flight regime.

Characterization of a linear DNA plasmid from the filamentous fungal plant pathogen Glomerella musae [Anamorph: Colletotrichum musae (Berk. and Curt.) arx.

USGS Publications Warehouse

Freeman, S.; Redman, R.S.; Grantham, G.; Rodriguez, R.J.

1997-01-01

A 7.4-kilobase (kb) DNA plasmid was isolated from Glomerella musae isolate 927 and designated pGML1. Exonuclease treatments indicated that pGML1 was a linear plasmid with blocked 5' termini. Cell-fractionation experiments combined with sequence-specific PCR amplification revealed that pGML1 resided in mitochondria. The pGML1 plasmid hybridized to cesium chloride-fractionated nuclear DNA but not to A + T-rich mitochondrial DNA. An internal 7.0-kb section of pGML1 was cloned and did not hybridize with either nuclear or mitochondrial DNA from G. musae. Sequence analysis revealed identical terminal inverted repeats (TIR) of 520 bp at the ends of the cloned 7.0-kb section of pGML1. The occurrence of pGML1 did not correspond with the pathogenicity of G. musae on banana fruit. Four additional isolates of G. musae possessed extrachromosomal DNA fragments similar in size and sequence to pGML1.
Sequence, Structure, and Context Preferences of Human RNA Binding Proteins.

PubMed

Dominguez, Daniel; Freese, Peter; Alexis, Maria S; Su, Amanda; Hochman, Myles; Palden, Tsultrim; Bazile, Cassandra; Lambert, Nicole J; Van Nostrand, Eric L; Pratt, Gabriel A; Yeo, Gene W; Graveley, Brenton R; Burge, Christopher B

2018-06-07

RNA binding proteins (RBPs) orchestrate the production, processing, and function of mRNAs. Here, we present the affinity landscapes of 78 human RBPs using an unbiased assay that determines the sequence, structure, and context preferences of these proteins in vitro by deep sequencing of bound RNAs. These data enable construction of "RNA maps" of RBP activity without requiring crosslinking-based assays. We found an unexpectedly low diversity of RNA motifs, implying frequent convergence of binding specificity toward a relatively small set of RNA motifs, many with low compositional complexity. Offsetting this trend, however, we observed extensive preferences for contextual features distinct from short linear RNA motifs, including spaced "bipartite" motifs, biased flanking nucleotide composition, and bias away from or toward RNA structure. Our results emphasize the importance of contextual features in RNA recognition, which likely enable targeting of distinct subsets of transcripts by different RBPs that recognize the same linear motif. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
A large inversion in the linear chromosome of Streptomyces griseus caused by replicative transposition of a new Tn3 family transposon.

PubMed

Murata, M; Uchida, T; Yang, Y; Lezhava, A; Kinashi, H

2011-04-01

We have comprehensively analyzed the linear chromosomes of Streptomyces griseus mutants constructed and kept in our laboratory. During this study, macrorestriction analysis of AseI and DraI fragments of mutant 402-2 suggested a large chromosomal inversion. The junctions of chromosomal inversion were cloned and sequenced and compared with the corresponding target sequences in the parent strain 2247. Consequently, a transposon-involved mechanism was revealed. Namely, a transposon originally located at the left target site was replicatively transposed to the right target site in an inverted direction, which generated a second copy and at the same time caused a 2.5-Mb chromosomal inversion. The involved transposon named TnSGR was grouped into a new subfamily of the resolvase-encoding Tn3 family transposons based on its gene organization. At the end, terminal diversity of S. griseus chromosomes is discussed by comparing the sequences of strains 2247 and IFO13350.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuzmina, L.K.

The research deals with different aspects of mathematical modelling and the analysis of complex dynamic non-linear systems as a consequence of applied problems in mechanics (in particular those for gyrosystems, for stabilization and orientation systems, control systems of movable objects, including the aviation and aerospace systems) Non-linearity, multi-connectedness and high dimensionness of dynamical problems, that occur at the initial full statement lead to the need of the problem narrowing, and of the decomposition of the full model, but with safe-keeping of main properties and of qualitative equivalence. The elaboration of regular methods for modelling problems in dynamics, the generalization ofmore » reduction principle are the main aims of the investigations. Here, uniform methodology, based on Lyapunov`s methods, founded by N.G.Ohetayev, is developed. The objects of the investigations are considered with exclusive positions, as systems of singularly perturbed class, treated as ones with singular parametrical perturbations. It is the natural extension of the statements of N.G.Chetayev and P.A.Kuzmin for parametrical stability. In paper the systematical procedures for construction of correct simplified models (comparison ones) are developed, the validity conditions of the transition are determined the appraisals are received, the regular algorithms of engineering level are obtained. Applicabilitelly to the stabilization and orientation systems with the gyroscopic controlling subsystems, these methods enable to build the hierarchical sequence of admissible simplified models; to determine the conditions of their correctness.« less
Evidence of codon usage in the nearest neighbor spacing distribution of bases in bacterial genomes

NASA Astrophysics Data System (ADS)

Higareda, M. F.; Geiger, O.; Mendoza, L.; Méndez-Sánchez, R. A.

2012-02-01

Statistical analysis of whole genomic sequences usually assumes a homogeneous nucleotide density throughout the genome, an assumption that has been proved incorrect for several organisms since the nucleotide density is only locally homogeneous. To avoid giving a single numerical value to this variable property, we propose the use of spectral statistics, which characterizes the density of nucleotides as a function of its position in the genome. We show that the cumulative density of bases in bacterial genomes can be separated into an average (or secular) plus a fluctuating part. Bacterial genomes can be divided into two groups according to the qualitative description of their secular part: linear and piecewise linear. These two groups of genomes show different properties when their nucleotide spacing distribution is studied. In order to analyze genomes having a variable nucleotide density, statistically, the use of unfolding is necessary, i.e., to get a separation between the secular part and the fluctuations. The unfolding allows an adequate comparison with the statistical properties of other genomes. With this methodology, four genomes were analyzed Burkholderia, Bacillus, Clostridium and Corynebacterium. Interestingly, the nearest neighbor spacing distributions or detrended distance distributions are very similar for species within the same genus but they are very different for species from different genera. This difference can be attributed to the difference in the codon usage.
A synteny-based draft genome sequence of the forage grass Lolium perenne.

PubMed

Byrne, Stephen L; Nagy, Istvan; Pfeifer, Matthias; Armstead, Ian; Swain, Suresh; Studer, Bruno; Mayer, Klaus; Campbell, Jacqueline D; Czaban, Adrian; Hentrup, Stephan; Panitz, Frank; Bendixen, Christian; Hedegaard, Jakob; Caccamo, Mario; Asp, Torben

2015-11-01

Here we report the draft genome sequence of perennial ryegrass (Lolium perenne), an economically important forage and turf grass species that is widely cultivated in temperate regions worldwide. It is classified along with wheat, barley, oats and Brachypodium distachyon in the Pooideae sub-family of the grass family (Poaceae). Transcriptome data was used to identify 28,455 gene models, and we utilized macro-co-linearity between perennial ryegrass and barley, and synteny within the grass family, to establish a synteny-based linear gene order. The gametophytic self-incompatibility mechanism enables the pistil of a plant to reject self-pollen and therefore promote out-crossing. We have used the sequence assembly to characterize transcriptional changes in the stigma during pollination with both compatible and incompatible pollen. Characterization of the pollen transcriptome identified homologs to pollen allergens from a range of species, many of which were expressed to very high levels in mature pollen grains, and are potentially involved in the self-incompatibility mechanism. The genome sequence provides a valuable resource for future breeding efforts based on genomic prediction, and will accelerate the development of new varieties for more productive grasslands. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Droplet-Based Pyrosequencing Using Digital Microfluidics

PubMed Central

Boles, Deborah J.; Benton, Jonathan L.; Siew, Germaine J.; Levy, Miriam H.; Thwar, Prasanna K.; Sandahl, Melissa A.; Rouse, Jeremy L.; Perkins, Lisa C.; Sudarsan, Arjun P.; Jalili, Roxana; Pamula, Vamsee K.; Srinivasan, Vijay; Fair, Richard B.; Griffin, Peter B.; Eckhardt, Allen E.; Pollack, Michael G.

2013-01-01

The feasibility of implementing pyrosequencing chemistry within droplets using electrowetting-based digital microfluidics is reported. An array of electrodes patterned on a printed-circuit board was used to control the formation, transportation, merging, mixing, and splitting of submicroliter-sized droplets contained within an oil-filled chamber. A three-enzyme pyrosequencing protocol was implemented in which individual droplets contained enzymes, deoxyribonucleotide triphosphates (dNTPs), and DNA templates. The DNA templates were anchored to magnetic beads which enabled them to be thoroughly washed between nucleotide additions. Reagents and protocols were optimized to maximize signal over background, linearity of response, cycle efficiency, and wash efficiency. As an initial demonstration of feasibility, a portion of a 229 bp Candida parapsilosis template was sequenced using both a de novo protocol and a resequencing protocol. The resequencing protocol generated over 60 bp of sequence with 100% sequence accuracy based on raw pyrogram levels. Excellent linearity was observed for all of the homopolymers (two, three, or four nucleotides) contained in the C. parapsilosis sequence. With improvements in microfluidic design it is expected that longer reads, higher throughput, and improved process integration (i.e., “sample-to-sequence” capability) could eventually be achieved using this low-cost platform. PMID:21932784
Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70.

PubMed Central

Hunt, C; Morimoto, R I

1985-01-01

We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
Non-linear behavior of fiber composite laminates

NASA Technical Reports Server (NTRS)

Hashin, Z.; Bagchi, D.; Rosen, B. W.

1974-01-01

The non-linear behavior of fiber composite laminates which results from lamina non-linear characteristics was examined. The analysis uses a Ramberg-Osgood representation of the lamina transverse and shear stress strain curves in conjunction with deformation theory to describe the resultant laminate non-linear behavior. A laminate having an arbitrary number of oriented layers and subjected to a general state of membrane stress was treated. Parametric results and comparison with experimental data and prior theoretical results are presented.
Periodic binary sequence generators: VLSI circuits considerations

NASA Technical Reports Server (NTRS)

Perlman, M.

1984-01-01

Feedback shift registers are efficient periodic binary sequence generators. Polynomials of degree r over a Galois field characteristic 2(GF(2)) characterize the behavior of shift registers with linear logic feedback. The algorithmic determination of the trinomial of lowest degree, when it exists, that contains a given irreducible polynomial over GF(2) as a factor is presented. This corresponds to embedding the behavior of an r-stage shift register with linear logic feedback into that of an n-stage shift register with a single two-input modulo 2 summer (i.e., Exclusive-OR gate) in its feedback. This leads to Very Large Scale Integrated (VLSI) circuit architecture of maximal regularity (i.e., identical cells) with intercell communications serialized to a maximal degree.
On new classes of solutions of nonlinear partial differential equations in the form of convergent special series

NASA Astrophysics Data System (ADS)

Filimonov, M. Yu.

2017-12-01

The method of special series with recursively calculated coefficients is used to solve nonlinear partial differential equations. The recurrence of finding the coefficients of the series is achieved due to a special choice of functions, in powers of which the solution is expanded in a series. We obtain a sequence of linear partial differential equations to find the coefficients of the series constructed. In many cases, one can deal with a sequence of linear ordinary differential equations. We construct classes of solutions in the form of convergent series for a certain class of nonlinear evolution equations. A new class of solutions of generalized Boussinesque equation with an arbitrary function in the form of a convergent series is constructed.
Number of 24-Hour Diet Recalls Needed to Estimate Energy Intake

PubMed Central

MA, Yunsheng; Olendzki, Barbara C.; Pagoto, Sherry L.; Hurley, Thomas G.; Magner, Robert P.; Ockene, Ira S.; Schneider, Kristin L.; Merriam, Philip A.; Hébert, James R.

2009-01-01

Purpose Twenty-four-hour diet recall interviews (24HRs) are used to assess diet and to validate other diet assessment instruments. Therefore it is important to know how many 24HRs are required to describe an individual's intake. Method Seventy-nine middle-aged white women completed seven 24HRs over a 14-day period, during which energy expenditure (EE) was determined by the doubly labeled water method (DLW). Mean daily intakes were compared to DLW-derived EE using paired t tests. Linear mixed models were used to evaluate the effect of call sequence and day of the week on 24HR-derived energy intake while adjusting for education, relative body weight, social desirability, and an interaction between call sequence and social desirability. Results Mean EE from DLW was 2115 kcal/day. Adjusted 24HR-derived energy intake was lowest at call 1 (1501 kcal/day); significantly higher energy intake was observed at calls 2 and 3 (2246 and 2315 kcal/day, respectively). Energy intake on Friday was significantly lower than on Sunday. Averaging energy intake from the first two calls better approximated true energy expenditure than did the first call, and averaging the first three calls further improved the estimate (p = 0.02 for both comparisons). Additional calls did not improve estimation. Conclusions Energy intake is underreported on the first 24HR. Three 24HRs appear optimal for estimating energy intake. PMID:19576535
Number of 24-hour diet recalls needed to estimate energy intake.

PubMed

Ma, Yunsheng; Olendzki, Barbara C; Pagoto, Sherry L; Hurley, Thomas G; Magner, Robert P; Ockene, Ira S; Schneider, Kristin L; Merriam, Philip A; Hébert, James R

2009-08-01

Twenty-four-hour diet recall interviews (24HRs) are used to assess diet and to validate other diet assessment instruments. Therefore it is important to know how many 24HRs are required to describe an individual's intake. Seventy-nine middle-aged white women completed seven 24HRs over a 14-day period, during which energy expenditure (EE) was determined by the doubly labeled water method (DLW). Mean daily intakes were compared to DLW-derived EE using paired t tests. Linear mixed models were used to evaluate the effect of call sequence and day of the week on 24HR-derived energy intake while adjusting for education, relative body weight, social desirability, and an interaction between call sequence and social desirability. Mean EE from DLW was 2115 kcal/day. Adjusted 24HR-derived energy intake was lowest at call 1 (1501 kcal/day); significantly higher energy intake was observed at calls 2 and 3 (2246 and 2315 kcal/day, respectively). Energy intake on Friday was significantly lower than on Sunday. Averaging energy intake from the first two calls better approximated true energy expenditure than did the first call, and averaging the first three calls further improved the estimate (p=0.02 for both comparisons). Additional calls did not improve estimation. Energy intake is underreported on the first 24HR. Three 24HRs appear optimal for estimating energy intake.
A robust method to analyze copy number alterations of less than 100 kb in single cells using oligonucleotide array CGH.

PubMed

Möhlendick, Birte; Bartenhagen, Christoph; Behrens, Bianca; Honisch, Ellen; Raba, Katharina; Knoefel, Wolfram T; Stoecklein, Nikolas H

2013-01-01

Comprehensive genome wide analyses of single cells became increasingly important in cancer research, but remain to be a technically challenging task. Here, we provide a protocol for array comparative genomic hybridization (aCGH) of single cells. The protocol is based on an established adapter-linker PCR (WGAM) and allowed us to detect copy number alterations as small as 56 kb in single cells. In addition we report on factors influencing the success of single cell aCGH downstream of the amplification method, including the characteristics of the reference DNA, the labeling technique, the amount of input DNA, reamplification, the aCGH resolution, and data analysis. In comparison with two other commercially available non-linear single cell amplification methods, WGAM showed a very good performance in aCGH experiments. Finally, we demonstrate that cancer cells that were processed and identified by the CellSearch® System and that were subsequently isolated from the CellSearch® cartridge as single cells by fluorescence activated cell sorting (FACS) could be successfully analyzed using our WGAM-aCGH protocol. We believe that even in the era of next-generation sequencing, our single cell aCGH protocol will be a useful and (cost-) effective approach to study copy number alterations in single cells at resolution comparable to those reported currently for single cell digital karyotyping based on next generation sequencing data.
Virtual Northern analysis of the human genome.

PubMed

Hurowitz, Evan H; Drori, Iddo; Stodden, Victoria C; Donoho, David L; Brown, Patrick O

2007-05-23

We applied the Virtual Northern technique to human brain mRNA to systematically measure human mRNA transcript lengths on a genome-wide scale. We used separation by gel electrophoresis followed by hybridization to cDNA microarrays to measure 8,774 mRNA transcript lengths representing at least 6,238 genes at high (>90%) confidence. By comparing these transcript lengths to the Refseq and H-Invitational full-length cDNA databases, we found that nearly half of our measurements appeared to represent novel transcript variants. Comparison of length measurements determined by hybridization to different cDNAs derived from the same gene identified clones that potentially correspond to alternative transcript variants. We observed a close linear relationship between ORF and mRNA lengths in human mRNAs, identical in form to the relationship we had previously identified in yeast. Some functional classes of protein are encoded by mRNAs whose untranslated regions (UTRs) tend to be longer or shorter than average; these functional classes were similar in both human and yeast. Human transcript diversity is extensive and largely unannotated. Our length dataset can be used as a new criterion for judging the completeness of cDNAs and annotating mRNA sequences. Similar relationships between the lengths of the UTRs in human and yeast mRNAs and the functions of the proteins they encode suggest that UTR sequences serve an important regulatory role among eukaryotes.
Dynamic Textures Modeling via Joint Video Dictionary Learning.

PubMed

Wei, Xian; Li, Yuanxiang; Shen, Hao; Chen, Fang; Kleinsteuber, Martin; Wang, Zhongfeng

2017-04-06

Video representation is an important and challenging task in the computer vision community. In this paper, we consider the problem of modeling and classifying video sequences of dynamic scenes which could be modeled in a dynamic textures (DT) framework. At first, we assume that image frames of a moving scene can be modeled as a Markov random process. We propose a sparse coding framework, named joint video dictionary learning (JVDL), to model a video adaptively. By treating the sparse coefficients of image frames over a learned dictionary as the underlying "states", we learn an efficient and robust linear transition matrix between two adjacent frames of sparse events in time series. Hence, a dynamic scene sequence is represented by an appropriate transition matrix associated with a dictionary. In order to ensure the stability of JVDL, we impose several constraints on such transition matrix and dictionary. The developed framework is able to capture the dynamics of a moving scene by exploring both sparse properties and the temporal correlations of consecutive video frames. Moreover, such learned JVDL parameters can be used for various DT applications, such as DT synthesis and recognition. Experimental results demonstrate the strong competitiveness of the proposed JVDL approach in comparison with state-of-the-art video representation methods. Especially, it performs significantly better in dealing with DT synthesis and recognition on heavily corrupted data.
Children's discrimination of vowel sequences

NASA Astrophysics Data System (ADS)

Coady, Jeffry A.; Kluender, Keith R.; Evans, Julia

2003-10-01

Children's ability to discriminate sequences of steady-state vowels was investigated. Vowels (as in ``beet,'' ``bat,'' ``bought,'' and ``boot'') were synthesized at durations of 40, 80, 160, 320, 640, and 1280 ms. Four different vowel sequences were created by concatenating different orders of vowels for each duration, separated by 10-ms intervening silence. Thus, sequences differed in vowel order and duration (rate). Sequences were 12 s in duration, with amplitude ramped linearly over the first and last 2 s. Sequence pairs included both same (identical sequences) and different trials (sequences with vowels in different orders). Sequences with vowel of equal duration were presented on individual trials. Children aged 7;0 to 10;6 listened to pairs of sequences (with 100 ms between sequences) and responded whether sequences sounded the same or different. Results indicate that children are best able to discriminate sequences of intermediate-duration vowels, typical of conversational speaking rate. Children were less accurate with both shorter and longer vowels. Results are discussed in terms of auditory processing (shortest vowels) and memory (longest vowels). [Research supported by NIDCD DC-05263, DC-04072, and DC-005650.
Subject-specific bone attenuation correction for brain PET/MR: can ZTE-MRI substitute CT scan accurately?

PubMed

Khalifé, Maya; Fernandez, Brice; Jaubert, Olivier; Soussan, Michael; Brulon, Vincent; Buvat, Irène; Comtat, Claude

2017-09-21

In brain PET/MR applications, accurate attenuation maps are required for accurate PET image quantification. An implemented attenuation correction (AC) method for brain imaging is the single-atlas approach that estimates an AC map from an averaged CT template. As an alternative, we propose to use a zero echo time (ZTE) pulse sequence to segment bone, air and soft tissue. A linear relationship between histogram normalized ZTE intensity and measured CT density in Hounsfield units ([Formula: see text]) in bone has been established thanks to a CT-MR database of 16 patients. Continuous AC maps were computed based on the segmented ZTE by setting a fixed linear attenuation coefficient (LAC) to air and soft tissue and by using the linear relationship to generate continuous μ values for the bone. Additionally, for the purpose of comparison, four other AC maps were generated: a ZTE derived AC map with a fixed LAC for the bone, an AC map based on the single-atlas approach as provided by the PET/MR manufacturer, a soft-tissue only AC map and, finally, the CT derived attenuation map used as the gold standard (CTAC). All these AC maps were used with different levels of smoothing for PET image reconstruction with and without time-of-flight (TOF). The subject-specific AC map generated by combining ZTE-based segmentation and linear scaling of the normalized ZTE signal into [Formula: see text] was found to be a good substitute for the measured CTAC map in brain PET/MR when used with a Gaussian smoothing kernel of [Formula: see text] corresponding to the PET scanner intrinsic resolution. As expected TOF reduces AC error regardless of the AC method. The continuous ZTE-AC performed better than the other alternative MR derived AC methods, reducing the quantification error between the MRAC corrected PET image and the reference CTAC corrected PET image.
Comparison of buried sand ridges and regressive sand ridges on the outer shelf of the East China Sea

NASA Astrophysics Data System (ADS)

Wu, Ziyin; Jin, Xianglong; Zhou, Jieqiong; Zhao, Dineng; Shang, Jihong; Li, Shoujun; Cao, Zhenyi; Liang, Yuyang

2017-06-01

Based on multi-beam echo soundings and high-resolution single-channel seismic profiles, linear sand ridges in U14 and U2 on the East China Sea (ECS) shelf are identified and compared in detail. Linear sand ridges in U14 are buried sand ridges, which are 90 m below the seafloor. It is presumed that these buried sand ridges belong to the transgressive systems tract (TST) formed 320-200 ka ago and that their top interface is the maximal flooding surface (MFS). Linear sand ridges in U2 are regressive sand ridges. It is presumed that these buried sand ridges belong to the TST of the last glacial maximum (LGM) and that their top interface is the MFS of the LGM. Four sub-stage sand ridges of U2 are discerned from the high-resolution single-channel seismic profile and four strikes of regressive sand ridges are distinguished from the submarine topographic map based on the multi-beam echo soundings. These multi-stage and multi-strike linear sand ridges are the response of, and evidence for, the evolution of submarine topography with respect to sea-level fluctuations since the LGM. Although the difference in the age of formation between U14 and U2 is 200 ka and their sequences are 90 m apart, the general strikes of the sand ridges are similar. This indicates that the basic configuration of tidal waves on the ECS shelf has been stable for the last 200 ka. A basic evolutionary model of the strata of the ECS shelf is proposed, in which sea-level change is the controlling factor. During the sea-level change of about 100 ka, five to six strata are developed and the sand ridges develop in the TST. A similar story of the evolution of paleo-topography on the ECS shelf has been repeated during the last 300 ka.
Subject-specific bone attenuation correction for brain PET/MR: can ZTE-MRI substitute CT scan accurately?

NASA Astrophysics Data System (ADS)

Khalifé, Maya; Fernandez, Brice; Jaubert, Olivier; Soussan, Michael; Brulon, Vincent; Buvat, Irène; Comtat, Claude

2017-10-01

In brain PET/MR applications, accurate attenuation maps are required for accurate PET image quantification. An implemented attenuation correction (AC) method for brain imaging is the single-atlas approach that estimates an AC map from an averaged CT template. As an alternative, we propose to use a zero echo time (ZTE) pulse sequence to segment bone, air and soft tissue. A linear relationship between histogram normalized ZTE intensity and measured CT density in Hounsfield units (HU ) in bone has been established thanks to a CT-MR database of 16 patients. Continuous AC maps were computed based on the segmented ZTE by setting a fixed linear attenuation coefficient (LAC) to air and soft tissue and by using the linear relationship to generate continuous μ values for the bone. Additionally, for the purpose of comparison, four other AC maps were generated: a ZTE derived AC map with a fixed LAC for the bone, an AC map based on the single-atlas approach as provided by the PET/MR manufacturer, a soft-tissue only AC map and, finally, the CT derived attenuation map used as the gold standard (CTAC). All these AC maps were used with different levels of smoothing for PET image reconstruction with and without time-of-flight (TOF). The subject-specific AC map generated by combining ZTE-based segmentation and linear scaling of the normalized ZTE signal into HU was found to be a good substitute for the measured CTAC map in brain PET/MR when used with a Gaussian smoothing kernel of 4~mm corresponding to the PET scanner intrinsic resolution. As expected TOF reduces AC error regardless of the AC method. The continuous ZTE-AC performed better than the other alternative MR derived AC methods, reducing the quantification error between the MRAC corrected PET image and the reference CTAC corrected PET image.

SeqAPASS (Sequence Alignment to Predict Across Species Susceptibility) software and documentation

EPA Science Inventory

SeqAPASS is a software application facilitates rapid and streamlined, yet transparent, comparisons of the similarity of toxicologically-significant molecular targets across species. The present application facilitates analysis of primary amino acid sequence similarity (including ...
Filling Gaps in Biodiversity Knowledge for Macrofungi: Contributions and Assessment of an Herbarium Collection DNA Barcode Sequencing Project

PubMed Central

Osmundson, Todd W.; Robert, Vincent A.; Schoch, Conrad L.; Baker, Lydia J.; Smith, Amy; Robich, Giovanni; Mizzan, Luca; Garbelotto, Matteo M.

2013-01-01

Despite recent advances spearheaded by molecular approaches and novel technologies, species description and DNA sequence information are significantly lagging for fungi compared to many other groups of organisms. Large scale sequencing of vouchered herbarium material can aid in closing this gap. Here, we describe an effort to obtain broad ITS sequence coverage of the approximately 6000 macrofungal-species-rich herbarium of the Museum of Natural History in Venice, Italy. Our goals were to investigate issues related to large sequencing projects, develop heuristic methods for assessing the overall performance of such a project, and evaluate the prospects of such efforts to reduce the current gap in fungal biodiversity knowledge. The effort generated 1107 sequences submitted to GenBank, including 416 previously unrepresented taxa and 398 sequences exhibiting a best BLAST match to an unidentified environmental sequence. Specimen age and taxon affected sequencing success, and subsequent work on failed specimens showed that an ITS1 mini-barcode greatly increased sequencing success without greatly reducing the discriminating power of the barcode. Similarity comparisons and nonmetric multidimensional scaling ordinations based on pairwise distance matrices proved to be useful heuristic tools for validating the overall accuracy of specimen identifications, flagging potential misidentifications, and identifying taxa in need of additional species-level revision. Comparison of within- and among-species nucleotide variation showed a strong increase in species discriminating power at 1–2% dissimilarity, and identified potential barcoding issues (same sequence for different species and vice-versa). All sequences are linked to a vouchered specimen, and results from this study have already prompted revisions of species-sequence assignments in several taxa. PMID:23638077
Filling gaps in biodiversity knowledge for macrofungi: contributions and assessment of an herbarium collection DNA barcode sequencing project.

PubMed

Osmundson, Todd W; Robert, Vincent A; Schoch, Conrad L; Baker, Lydia J; Smith, Amy; Robich, Giovanni; Mizzan, Luca; Garbelotto, Matteo M

2013-01-01

Despite recent advances spearheaded by molecular approaches and novel technologies, species description and DNA sequence information are significantly lagging for fungi compared to many other groups of organisms. Large scale sequencing of vouchered herbarium material can aid in closing this gap. Here, we describe an effort to obtain broad ITS sequence coverage of the approximately 6000 macrofungal-species-rich herbarium of the Museum of Natural History in Venice, Italy. Our goals were to investigate issues related to large sequencing projects, develop heuristic methods for assessing the overall performance of such a project, and evaluate the prospects of such efforts to reduce the current gap in fungal biodiversity knowledge. The effort generated 1107 sequences submitted to GenBank, including 416 previously unrepresented taxa and 398 sequences exhibiting a best BLAST match to an unidentified environmental sequence. Specimen age and taxon affected sequencing success, and subsequent work on failed specimens showed that an ITS1 mini-barcode greatly increased sequencing success without greatly reducing the discriminating power of the barcode. Similarity comparisons and nonmetric multidimensional scaling ordinations based on pairwise distance matrices proved to be useful heuristic tools for validating the overall accuracy of specimen identifications, flagging potential misidentifications, and identifying taxa in need of additional species-level revision. Comparison of within- and among-species nucleotide variation showed a strong increase in species discriminating power at 1-2% dissimilarity, and identified potential barcoding issues (same sequence for different species and vice-versa). All sequences are linked to a vouchered specimen, and results from this study have already prompted revisions of species-sequence assignments in several taxa.
Hiding message into DNA sequence through DNA coding and chaotic maps.

PubMed

Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman

2014-09-01

The paper proposes an improved reversible substitution method to hide data into deoxyribonucleic acid (DNA) sequence, and four measures have been taken to enhance the robustness and enlarge the hiding capacity, such as encode the secret message by DNA coding, encrypt it by pseudo-random sequence, generate the relative hiding locations by piecewise linear chaotic map, and embed the encoded and encrypted message into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method has a better performance compared with the competing methods with respect to robustness and capacity.
Hidden Markov models of biological primary sequence information.

PubMed Central

Baldi, P; Chauvin, Y; Hunkapiller, T; McClure, M A

1994-01-01

Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN2) operations, linear in the number of sequences. PMID:8302831
Arbitrarily accurate twin composite π -pulse sequences

NASA Astrophysics Data System (ADS)

Torosov, Boyan T.; Vitanov, Nikolay V.

2018-04-01

We present three classes of symmetric broadband composite pulse sequences. The composite phases are given by analytic formulas (rational fractions of π ) valid for any number of constituent pulses. The transition probability is expressed by simple analytic formulas and the order of pulse area error compensation grows linearly with the number of pulses. Therefore, any desired compensation order can be produced by an appropriate composite sequence; in this sense, they are arbitrarily accurate. These composite pulses perform equally well as or better than previously published ones. Moreover, the current sequences are more flexible as they allow total pulse areas of arbitrary integer multiples of π .
Estimation of pairwise sequence similarity of mammalian enhancers with word neighbourhood counts.

PubMed

Göke, Jonathan; Schulz, Marcel H; Lasserre, Julia; Vingron, Martin

2012-03-01

The identity of cells and tissues is to a large degree governed by transcriptional regulation. A major part is accomplished by the combinatorial binding of transcription factors at regulatory sequences, such as enhancers. Even though binding of transcription factors is sequence-specific, estimating the sequence similarity of two functionally similar enhancers is very difficult. However, a similarity measure for regulatory sequences is crucial to detect and understand functional similarities between two enhancers and will facilitate large-scale analyses like clustering, prediction and classification of genome-wide datasets. We present the standardized alignment-free sequence similarity measure N2, a flexible framework that is defined for word neighbourhoods. We explore the usefulness of adding reverse complement words as well as words including mismatches into the neighbourhood. On simulated enhancer sequences as well as functional enhancers in mouse development, N2 is shown to outperform previous alignment-free measures. N2 is flexible, faster than competing methods and less susceptible to single sequence noise and the occurrence of repetitive sequences. Experiments on the mouse enhancers reveal that enhancers active in different tissues can be separated by pairwise comparison using N2. N2 represents an improvement over previous alignment-free similarity measures without compromising speed, which makes it a good candidate for large-scale sequence comparison of regulatory sequences. The software is part of the open-source C++ library SeqAn (www.seqan.de) and a compiled version can be downloaded at http://www.seqan.de/projects/alf.html. Supplementary data are available at Bioinformatics online.
Can mutational GC-pressure create new linear B-cell epitopes in herpes simplex virus type 1 glycoprotein B?

PubMed

Khrustalev, Vladislav Victorovich

2009-01-01

We showed that GC-content of nucleotide sequences coding for linear B-cell epitopes of herpes simplex virus type 1 (HSV1) glycoprotein B (gB) is higher than GC-content of sequences coding for epitope-free regions of this glycoprotein (G + C = 73 and 64%, respectively). Linear B-cell epitopes have been predicted in HSV1 gB by BepiPred algorithm ( www.cbs.dtu.dk/services/BepiPred ). Proline is an acrophilic amino acid residue (it is usually situated on the surface of protein globules, and so included in linear B-cell epitopes). Indeed, the level of proline is much higher in predicted epitopes of gB than in epitope-free regions (17.8% versus 1.8%). This amino acid is coded by GC-rich codons (CCX) that can be produced due to nucleotide substitutions caused by mutational GC-pressure. GC-pressure will also lead to disappearance of acrophobic phenylalanine, isoleucine, methionine and tyrosine coded by GC-poor codons. Results of our "in-silico directed mutagenesis" showed that single nonsynonymous substitutions in AT to GC direction in two long epitope-free regions of gB will cause formation of new linear epitopes or elongation of previously existing epitopes flanking these regions in 25% of 539 possible cases. The calculations of GC-content and amino acid content have been performed by CodonChanges algorithm ( www.barkovsky.hotmail.ru ).
Reagent for Evaluating Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) Performance in Bottom-Up Proteomic Experiments.

PubMed

Beri, Joshua; Rosenblatt, Michael M; Strauss, Ethan; Urh, Marjeta; Bereman, Michael S

2015-12-01

We present a novel proteomic standard for assessing liquid chromatography-tandem mass spectrometry (LC-MS/MS) instrument performance, in terms of chromatographic reproducibility and dynamic range within a single LC-MS/MS injection. The peptide mixture standard consists of six peptides that were specifically synthesized to cover a wide range of hydrophobicities (grand average hydropathy (GRAVY) scores of -0.6 to 1.9). A combination of stable isotope labeled amino acids ((13)C and (15)N) were inserted to create five isotopologues. By combining these isotopologues at different ratios, they span four orders of magnitude within each distinct peptide sequence. Each peptide, from lightest to heaviest, increases in abundance by a factor of 10. We evaluate several metrics on our quadrupole orbitrap instrument using the 6 × 5 LC-MS/MS reference mixture spiked into a complex lysate background as a function of dynamic range, including mass measurement accuracy (MMA) and the linear range of quantitation of MS1 and parallel reaction monitoring experiments. Detection and linearity of the instrument routinely spanned three orders of magnitude across the gradient (500 fmol to 0.5 fmol on column) and no systematic trend was observed for MMA of targeted peptides as a function of abundance by analysis of variance analysis (p = 0.17). Detection and linearity of the fifth isotopologue (i.e., 0.05 fmol on column) was dependent on the peptide and instrument scan type (MS1 vs PRM). We foresee that this standard will serve as a powerful method to conduct both intra-instrument performance monitoring/evaluation, technology development, and inter-instrument comparisons.
Complete Genome Sequence of a Street Rabies Virus Isolated from a Dog in Nigeria

PubMed Central

Zhou, Ming; Zhou, Zutao; Kia, Grace S. N.; Gnanadurai, Clement W.; Leyson, Christina M.; Umoh, Jarlath U.; Kwaga, Jacob P.; Kazeem, Haruna M.

2013-01-01

A canine rabies virus (RABV) was isolated from a trade dog in Nigeria. Its entire genome was sequenced and found to be closely related to canine RABVs circulating in Africa. Sequence comparison indicates that the virus is closely related to the Africa 2 RABV lineage. The virus is now termed DRV-NG11. PMID:23469344
Deep Sequencing Reveals a Divergent Ugandan cassava brown streak virus Isolate from Malawi

PubMed Central

Winter, Stephan; Mukasa, Settumba; Tairo, Fred; Sseruwagi, Peter; Ndunguru, Joseph; Duffy, Siobain

2017-01-01

ABSTRACT Illumina sequencing of RNA from a cassava cutting from northern Malawi produced a genome of Ugandan cassava brown streak virus (UCBSV-MW-NB7_2013). Sequence comparisons revealed stronger similarity to an isolate from nearby Tanzania (93.4% pairwise nucleotide identity) than to those previously reported from Malawi (86.9 to 87.0%). PMID:28818908
Comparison of Burrows-Wheeler transform-based mapping algorithms used in high-throughput whole-genome sequencing: application to Illumina data for livestock genomes

USDA-ARS?s Scientific Manuscript database

Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping sequence reads onto a reference genome is a fundamental step in the analysis of NGS data. Eff...
Characterizing the D2 statistic: word matches in biological sequences.

PubMed

Forêt, Sylvain; Wilson, Susan R; Burden, Conrad J

2009-01-01

Word matches are often used in sequence comparison methods, either as a measure of sequence similarity or in the first search steps of algorithms such as BLAST or BLAT. The D2 statistic is the number of matches of words of k letters between two sequences. Recent advances have been made in the characterization of this statistic and in the approximation of its distribution. Here, these results are extended to the case of approximate word matches. We compute the exact value of the variance of the D2 statistic for the case of a uniform letter distribution, and introduce a method to provide accurate approximations of the variance in the remaining cases. This enables the distribution of D2 to be approximated for typical situations arising in biological research. We apply these results to the identification of cis-regulatory modules, and show that this method detects such sequences with a high accuracy. The ability to approximate the distribution of D2 for both exact and approximate word matches will enable the use of this statistic in a more precise manner for sequence comparison, database searches, and identification of transcription factor binding sites.
A Refined Model for Radar Homing Intercepts.

DTIC Science & Technology

1983-10-27

Helge Toutenburq, Prior Information in Linear Models ,(Wiley, NY, 1982). 7. F. A. Graybill , Introduction to Matrices with Applications in StatisticF... linear target trajectory model z i = 0 + 1 r i + wi () where w i i=I,..., N is a sequence of uncorrelated zero-mean A noise, the general formula for...z i (i=l,..., N) at r. and a linear regression model 1 z i = a0 + a1 r i + w i =(Al) where wi is the corruption noise; the problem is to estimate a0
Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

PubMed

Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

2012-01-01

Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.
A Model of BGA Thermal Fatigue Life Prediction Considering Load Sequence Effects

PubMed Central

Hu, Weiwei; Li, Yaqiu; Sun, Yufeng; Mosleh, Ali

2016-01-01

Accurate testing history data is necessary for all fatigue life prediction approaches, but such data is always deficient especially for the microelectronic devices. Additionally, the sequence of the individual load cycle plays an important role in physical fatigue damage. However, most of the existing models based on the linear damage accumulation rule ignore the sequence effects. This paper proposes a thermal fatigue life prediction model for ball grid array (BGA) packages to take into consideration the load sequence effects. For the purpose of improving the availability and accessibility of testing data, a new failure criterion is discussed and verified by simulation and experimentation. The consequences for the fatigue underlying sequence load conditions are shown. PMID:28773980
Information capacity of nucleotide sequences and its applications.

PubMed

Sadovsky, M G

2006-05-01

The information capacity of nucleotide sequences is defined through the specific entropy of frequency dictionary of a sequence determined with respect to another one containing the most probable continuations of shorter strings. This measure distinguishes a sequence both from a random one, and from ordered entity. A comparison of sequences based on their information capacity is studied. An order within the genetic entities is found at the length scale ranged from 3 to 8. Some other applications of the developed methodology to genetics, bioinformatics, and molecular biology are discussed.
Genetic variability in Melipona quinquefasciata (Hymenoptera, Apidae, Meliponini) from northeastern Brazil determined using the first internal transcribed spacer (ITS1).

PubMed

Pereira, J O P; Freitas, B M; Jorge, D M M; Torres, D C; Soares, C E A; Grangeiro, T B

2009-01-01

Melipona quinquefasciata is a ground-nesting South American stingless bee whose geographic distribution was believed to comprise only the central and southern states of Brazil. We obtained partial sequences (about 500-570 bp) of first internal transcribed spacer (ITS1) nuclear ribosomal DNA from Melipona specimens putatively identified as M. quinquefasciata collected from different localities in northeastern Brazil. To confirm the taxonomic identity of the northeastern samples, specimens from the state of Goiás (Central region of Brazil) were included for comparison. All sequences were deposited in GenBank (accession numbers EU073751-EU073759). The mean nucleotide divergence (excluding sites with insertions/deletions) in the ITS1 sequences was only 1.4%, ranging from 0 to 4.1%. When the sites with insertions/deletions were also taken into account, sequence divergences varied from 0 to 5.3%. In all pairwise comparisons, the ITS1 sequence from the specimens collected in Goiás was most divergent compared to the ITS1 sequences of the bees from the other locations. However, neighbor-joining phylogenetic analysis showed that all ITS1 sequences from northeastern specimens along with the sample of Goiás were resolved in a single clade with a bootstrap support of 100%. The ITS1 sequencing data thus support the occurrence of M. quinquefasciata in northeast Brazil.
Dilations and the Equation of a Line

ERIC Educational Resources Information Center

Yopp, David A.

2016-01-01

Students engage in proportional reasoning when they use covariance and multiple comparisons. Without rich connections to proportional reasoning, students may develop inadequate understandings of linear relationships and the equations that model them. Teachers can improve students' understanding of linear relationships by focusing on realistic…
GMX approximation for the linear E ⊗ ɛ Jahn-Teller effect

NASA Astrophysics Data System (ADS)

Mancini, Jay D.; Fessatidis, Vassilios; Bowen, Samuel P.

2006-02-01

A newly developed generalized moments expansion (GMX) based on the t-expansion of Horn and Weinstein is applied to a linear E ⊗ ɛ Jahn-Teller system. Comparisons are made with other moments schemes as well a coupled cluster approximation.

Symbolic Solution of Linear Differential Equations

NASA Technical Reports Server (NTRS)

Feinberg, R. B.; Grooms, R. G.

1981-01-01

An algorithm for solving linear constant-coefficient ordinary differential equations is presented. The computational complexity of the algorithm is discussed and its implementation in the FORMAC system is described. A comparison is made between the algorithm and some classical algorithms for solving differential equations.
Using PacBio sequencing to investigate the bacterial microbiota of traditional Buryatian cottage cheese and comparison with Italian and Kazakhstan artisanal cheeses.

PubMed

Jin, Hao; Mo, Lanxin; Pan, Lin; Hou, Qaingchaun; Li, Chuanjuan; Darima, Iaptueva; Yu, Jie

2018-05-09

Traditional fermented dairy foods including cottage cheese have been major components of the Buryatia diet for centuries. Buryatian cheeses have maintained not only their unique taste and flavor but also their rich natural lactic acid bacteria (LAB) content. However, relatively few studies have described their microbial communities or explored their potential to serve as LAB resources. In this study, the bacterial microbiota community of 7 traditional artisan cheeses produced by local Buryatian families was investigated using single-molecule, real-time sequencing. In addition, we compared the bacterial microbiota of the Buryatian cheese samples with data sets of cheeses from Kazakhstan and Italy. Furthermore, we isolated and preserved several LAB samples from Buryatian cheese. A total of 62 LAB strains (belonging to 6 genera and 14 species or subspecies) were isolated from 7 samples of Buryatian cheese. Full-length 16S rRNA sequencing of the microbiota revealed 145 species of 82 bacterial genera, belonging to 7 phyla. The most dominant species was Lactococcus lactis (43.89%). Data sets of cheeses from Italy and Kazakhstan were retrieved from public databases. Principal component analysis and multivariate ANOVA showed marked differences in the structure of the microbiota communities in the cheese data sets from the 3 regions. Linear discriminant analyses of the effect size identified 48 discriminant bacterial clades among the 3 groups, which might have contributed to the observed structural differences. Our results indicate that the bacterial communities of traditional artisan cheeses vary depending on geographic origin. In addition, we isolated novel and valuable LAB resources for the improvement of cottage cheese production. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
On the Use of Equivalent Linearization for High-Cycle Fatigue Analysis of Geometrically Nonlinear Structures

NASA Technical Reports Server (NTRS)

Rizzi, Stephen A.

2003-01-01

The use of stress predictions from equivalent linearization analyses in the computation of high-cycle fatigue life is examined. Stresses so obtained differ in behavior from the fully nonlinear analysis in both spectral shape and amplitude. Consequently, fatigue life predictions made using this data will be affected. Comparisons of fatigue life predictions based upon the stress response obtained from equivalent linear and numerical simulation analyses are made to determine the range over which the equivalent linear analysis is applicable.
Can linear superiorization be useful for linear optimization problems?

NASA Astrophysics Data System (ADS)

Censor, Yair

2017-04-01

Linear superiorization (LinSup) considers linear programming problems but instead of attempting to solve them with linear optimization methods it employs perturbation resilient feasibility-seeking algorithms and steers them toward reduced (not necessarily minimal) target function values. The two questions that we set out to explore experimentally are: (i) does LinSup provide a feasible point whose linear target function value is lower than that obtained by running the same feasibility-seeking algorithm without superiorization under identical conditions? (ii) How does LinSup fare in comparison with the Simplex method for solving linear programming problems? Based on our computational experiments presented here, the answers to these two questions are: ‘yes’ and ‘very well’, respectively.
Permanent draft genome sequence of Comamonas testosteroni KF-1

PubMed Central

Weiss, Michael; Kesberg, Anna I.; LaButti, Kurt M.; Pitluck, Sam; Bruce, David; Hauser, Loren; Copeland, Alex; Woyke, Tanja; Lowry, Stephen; Lucas, Susan; Land, Miriam; Goodwin, Lynne; Kjelleberg, Staffan; Cook, Alasdair M.; Buhmann, Matthias; Thomas, Torsten; Schleheck, David

2013-01-01

Comamonas testosteroni KF-1 is a model organism for the elucidation of the novel biochemical degradation pathways for xenobiotic 4-sulfophenylcarboxylates (SPC) formed during biodegradation of synthetic 4-sulfophenylalkane surfactants (linear alkylbenzenesulfonates, LAS) by bacterial communities. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 6,026,527 bp long chromosome (one sequencing gap) exhibits an average G+C content of 61.79% and is predicted to encode 5,492 protein-coding genes and 114 RNA genes. PMID:23991256
The detection error of thermal test low-frequency cable based on M sequence correlation algorithm

NASA Astrophysics Data System (ADS)

Wu, Dongliang; Ge, Zheyang; Tong, Xin; Du, Chunlin

2018-04-01

The problem of low accuracy and low efficiency of off-line detecting on thermal test low-frequency cable faults could be solved by designing a cable fault detection system, based on FPGA export M sequence code(Linear feedback shift register sequence) as pulse signal source. The design principle of SSTDR (Spread spectrum time-domain reflectometry) reflection method and hardware on-line monitoring setup figure is discussed in this paper. Testing data show that, this detection error increases with fault location of thermal test low-frequency cable.
Effective connectivity between superior temporal gyrus and Heschl's gyrus during white noise listening: linear versus non-linear models.

PubMed

Hamid, Ka; Yusoff, An; Rahman, Mza; Mohamad, M; Hamid, Aia

2012-04-01

This fMRI study is about modelling the effective connectivity between Heschl's gyrus (HG) and the superior temporal gyrus (STG) in human primary auditory cortices. MATERIALS #ENTITYSTARTX00026; Ten healthy male participants were required to listen to white noise stimuli during functional magnetic resonance imaging (fMRI) scans. Statistical parametric mapping (SPM) was used to generate individual and group brain activation maps. For input region determination, two intrinsic connectivity models comprising bilateral HG and STG were constructed using dynamic causal modelling (DCM). The models were estimated and inferred using DCM while Bayesian Model Selection (BMS) for group studies was used for model comparison and selection. Based on the winning model, six linear and six non-linear causal models were derived and were again estimated, inferred, and compared to obtain a model that best represents the effective connectivity between HG and the STG, balancing accuracy and complexity. Group results indicated significant asymmetrical activation (p(uncorr) < 0.001) in bilateral HG and STG. Model comparison results showed strong evidence of STG as the input centre. The winning model is preferred by 6 out of 10 participants. The results were supported by BMS results for group studies with the expected posterior probability, r = 0.7830 and exceedance probability, ϕ = 0.9823. One-sample t-tests performed on connection values obtained from the winning model indicated that the valid connections for the winning model are the unidirectional parallel connections from STG to bilateral HG (p < 0.05). Subsequent model comparison between linear and non-linear models using BMS prefers non-linear connection (r = 0.9160, ϕ = 1.000) from which the connectivity between STG and the ipsi- and contralateral HG is gated by the activity in STG itself. We are able to demonstrate that the effective connectivity between HG and STG while listening to white noise for the respective participants can be explained by a non-linear dynamic causal model with the activity in STG influencing the STG-HG connectivity non-linearly.
Intelligent Distributed Systems

DTIC Science & Technology

2015-10-23

periodic gossiping algorithms by using convex combination rules rather than standard averaging rules. On a ring graph, we have discovered how to sequence...the gossips within a period to achieve the best possible convergence rate and we have related this optimal value to the classic edge coloring problem...consensus. There are three different approaches to distributed averaging: linear iterations, gossiping , and dou- ble linear iterations which are also known as
Recursive inversion of externally defined linear systems by FIR filters

NASA Technical Reports Server (NTRS)

Bach, Ralph E., Jr.; Baram, Yoram

1989-01-01

The approximate inversion of an internally unknown linear system, given by its impulse response sequence, by an inverse system having a finite impulse response, is considered. The recursive least-squares procedure is shown to have an exact initialization, based on the triangular Toeplitz structure of the matrix involved. The proposed approach also suggests solutions to the problem of system identification and compensation.
Development and validation of a general purpose linearization program for rigid aircraft models

NASA Technical Reports Server (NTRS)

Duke, E. L.; Antoniewicz, R. F.

1985-01-01

A FORTRAN program that provides the user with a powerful and flexible tool for the linearization of aircraft models is discussed. The program LINEAR numerically determines a linear systems model using nonlinear equations of motion and a user-supplied, nonlinear aerodynamic model. The system model determined by LINEAR consists of matrices for both the state and observation equations. The program has been designed to allow easy selection and definition of the state, control, and observation variables to be used in a particular model. Also, included in the report is a comparison of linear and nonlinear models for a high performance aircraft.
Comparison of Damage Models for Predicting the Non-Linear Response of Laminates Under Matrix Dominated Loading Conditions

NASA Technical Reports Server (NTRS)

Schuecker, Clara; Davila, Carlos G.; Rose, Cheryl A.

2010-01-01

Five models for matrix damage in fiber reinforced laminates are evaluated for matrix-dominated loading conditions under plane stress and are compared both qualitatively and quantitatively. The emphasis of this study is on a comparison of the response of embedded plies subjected to a homogeneous stress state. Three of the models are specifically designed for modeling the non-linear response due to distributed matrix cracking under homogeneous loading, and also account for non-linear (shear) behavior prior to the onset of cracking. The remaining two models are localized damage models intended for predicting local failure at stress concentrations. The modeling approaches of distributed vs. localized cracking as well as the different formulations of damage initiation and damage progression are compared and discussed.
Development of low-frequency kernel-function aerodynamics for comparison with time-dependent finite-difference methods

NASA Technical Reports Server (NTRS)

Bland, S. R.

1982-01-01

Finite difference methods for unsteady transonic flow frequency use simplified equations in which certain of the time dependent terms are omitted from the governing equations. Kernel functions are derived for two dimensional subsonic flow, and provide accurate solutions of the linearized potential equation with the same time dependent terms omitted. These solutions make possible a direct evaluation of the finite difference codes for the linear problem. Calculations with two of these low frequency kernel functions verify the accuracy of the LTRAN2 and HYTRAN2 finite difference codes. Comparisons of the low frequency kernel function results with the Possio kernel function solution of the complete linear equations indicate the adequacy of the HYTRAN approximation for frequencies in the range of interest for flutter calculations.
A String Number-Line Lesson Sequence to Promote Students' Relative Thinking and Understanding of Scale, Key Elements of Proportional Reasoning

ERIC Educational Resources Information Center

Hilton, Annette; Hilton, Geoff

2018-01-01

This article describes part of a study in which researchers designed lesson sequences based around using a string number line to help teachers support children's development of relative thinking and understanding of linear scale. In the first year of the study, eight teachers of Years 3-5 participated in four one-day professional development…
Monitoring Elementary Students' Writing Progress Using Curriculum-Based Measures: Grade and Gender Differences

ERIC Educational Resources Information Center

McMaster, Kristen L.; Shin, Jaehyun; Espin, Christine A.; Jung, Pyung-Gang; Wayman, Miya Miura; Deno, Stanley L.

2017-01-01

The purpose of this study was to examine slopes from curriculum-based measures of writing (CBM-W) as indicators of growth in writing. Responses to story prompts administered for 5 min to 89 students in Grades 2-5 were collected across 12 weeks and scored for correct word sequences (CWS) and correct minus incorrect sequences (CIWS). Linear mixed…
KRAS Mutation Test in Korean Patients with Colorectal Carcinomas: A Methodological Comparison between Sanger Sequencing and a Real-Time PCR-Based Assay.

PubMed

Lee, Sung Hak; Chung, Arthur Minwoo; Lee, Ahwon; Oh, Woo Jin; Choi, Yeong Jin; Lee, Youn-Soo; Jung, Eun Sun

2017-01-01

Mutations in the KRAS gene have been identified in approximately 50% of colorectal cancers (CRCs). KRAS mutations are well established biomarkers in anti-epidermal growth factor receptor therapy. Therefore, assessment of KRAS mutations is needed in CRC patients to ensure appropriate treatment. We compared the analytical performance of the cobas test to Sanger sequencing in 264 CRC cases. In addition, discordant specimens were evaluated by 454 pyrosequencing. KRAS mutations for codons 12/13 were detected in 43.2% of cases (114/264) by Sanger sequencing. Of 257 evaluable specimens for comparison, KRAS mutations were detected in 112 cases (43.6%) by Sanger sequencing and 118 cases (45.9%) by the cobas test. Concordance between the cobas test and Sanger sequencing for each lot was 93.8% positive percent agreement (PPA) and 91.0% negative percent agreement (NPA) for codons 12/13. Results from the cobas test and Sanger sequencing were discordant for 20 cases (7.8%). Twenty discrepant cases were subsequently subjected to 454 pyrosequencing. After comprehensive analysis of the results from combined Sanger sequencing-454 pyrosequencing and the cobas test, PPA was 97.5% and NPA was 100%. The cobas test is an accurate and sensitive test for detecting KRAS -activating mutations and has analytical power equivalent to Sanger sequencing. Prescreening using the cobas test with subsequent application of Sanger sequencing is the best strategy for routine detection of KRAS mutations in CRC.
ESTuber db: an online database for Tuber borchii EST sequences.

PubMed

Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo

2007-03-08

The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
Investigating Correlation between Protein Sequence Similarity and Semantic Similarity Using Gene Ontology Annotations.

PubMed

Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir

2018-01-01

Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (function) similarity is getting importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods, and in protein function prediction. In this research, we investigate the relationship between the two similarity methods. The results suggest absence of a strong correlation between sequence and semantic similarities. There is a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, the term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity have a distinct behavior. These findings are of significant effect for future works on protein comparison, and will help understand the semantic similarity between proteins in a better way.
Technical adequacy of bisulfite sequencing and pyrosequencing for detection of mitochondrial DNA methylation: Sources and avoidance of false-positive detection.

PubMed

Owa, Chie; Poulin, Matthew; Yan, Liying; Shioda, Toshi

2018-01-01

The existence of cytosine methylation in mammalian mitochondrial DNA (mtDNA) is a controversial subject. Because detection of DNA methylation depends on resistance of 5'-modified cytosines to bisulfite-catalyzed conversion to uracil, examined parameters that affect technical adequacy of mtDNA methylation analysis. Negative control amplicons (NCAs) devoid of cytosine methylation were amplified to cover the entire human or mouse mtDNA by long-range PCR. When the pyrosequencing template amplicons were gel-purified after bisulfite conversion, bisulfite pyrosequencing of NCAs did not detect significant levels of bisulfite-resistant cytosines (brCs) at ND1 (7 CpG sites) or CYTB (8 CpG sites) genes (CI95 = 0%-0.94%); without gel-purification, significant false-positive brCs were detected from NCAs (CI95 = 4.2%-6.8%). Bisulfite pyrosequencing of highly purified, linearized mtDNA isolated from human iPS cells or mouse liver detected significant brCs (~30%) in human ND1 gene when the sequencing primer was not selective in bisulfite-converted and unconverted templates. However, repeated experiments using a sequencing primer selective in bisulfite-converted templates almost completely (< 0.8%) suppressed brC detection, supporting the false-positive nature of brCs detected using the non-selective primer. Bisulfite-seq deep sequencing of linearized, gel-purified human mtDNA detected 9.4%-14.8% brCs for 9 CpG sites in ND1 gene. However, because all these brCs were associated with adjacent non-CpG brCs showing the same degrees of bisulfite resistance, DNA methylation in this mtDNA-encoded gene was not confirmed. Without linearization, data generated by bisulfite pyrosequencing or deep sequencing of purified mtDNA templates did not pass the quality control criteria. Shotgun bisulfite sequencing of human mtDNA detected extremely low levels of CpG methylation (<0.65%) over non-CpG methylation (<0.55%). Taken together, our study demonstrates that adequacy of mtDNA methylation analysis using methods dependent on bisulfite conversion needs to be established for each experiment, taking effects of incomplete bisulfite conversion and template impurity or topology into consideration.
The mitochondrial genome of Hydra oligactis (Cnidaria, Hydrozoa) sheds new light on animal mtDNA evolution and cnidarian phylogeny.

PubMed

Kayal, Ehsan; Lavrov, Dennis V

2008-02-29

The 16,314-nuceotide sequence of the linear mitochondrial DNA (mtDNA) molecule of Hydra oligactis (Cnidaria, Hydrozoa)--the first from the class Hydrozoa--has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs, as is typical for cnidarians. All genes have the same transcriptional orientation and their arrangement in the genome is similar to that of the jellyfish Aurelia aurita. In addition, a partial copy of cox1 is present at one end of the molecule in a transcriptional orientation opposite to the rest of the genes, forming a part of inverted terminal repeat characteristic of linear mtDNA and linear mitochondrial plasmids. The sequence close to at least one end of the molecule contains several homonucleotide runs as well as small inverted repeats that are able to form strong secondary structures and may be involved in mtDNA maintenance and expression. Phylogenetic analysis of mitochondrial genes of H. oligactis and other cnidarians supports the Medusozoa hypothesis but also suggests that Anthozoa may be paraphyletic, with octocorallians more closely related to the Medusozoa than to the Hexacorallia. The latter inference implies that Anthozoa is paraphyletic and that the polyp (rather than a medusa) is the ancestral body type in Cnidaria.
Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.

PubMed

Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H

2017-04-15

Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. Copyright © 2017 Sinclair et al.

Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

PubMed Central

Sinclair, Robert M.; Ravantti, Janne J.

2017-01-01

ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. PMID:28122979
Iteration with Spreadsheets.

ERIC Educational Resources Information Center

Smith, Michael

1990-01-01

Presents several examples of the iteration method using computer spreadsheets. Examples included are simple iterative sequences and the solution of equations using the Newton-Raphson formula, linear interpolation, and interval bisection. (YP)
Heat*seq: an interactive web tool for high-throughput sequencing experiment comparison with public data.

PubMed

Devailly, Guillaume; Mantsoki, Anna; Joshi, Anagha

2016-11-01

Better protocols and decreasing costs have made high-throughput sequencing experiments now accessible even to small experimental laboratories. However, comparing one or few experiments generated by an individual lab to the vast amount of relevant data freely available in the public domain might be limited due to lack of bioinformatics expertise. Though several tools, including genome browsers, allow such comparison at a single gene level, they do not provide a genome-wide view. We developed Heat*seq, a web-tool that allows genome scale comparison of high throughput experiments chromatin immuno-precipitation followed by sequencing, RNA-sequencing and Cap Analysis of Gene Expression) provided by a user, to the data in the public domain. Heat*seq currently contains over 12 000 experiments across diverse tissues and cell types in human, mouse and drosophila. Heat*seq displays interactive correlation heatmaps, with an ability to dynamically subset datasets to contextualize user experiments. High quality figures and tables are produced and can be downloaded in multiple formats. Web application: http://www.heatstarseq.roslin.ed.ac.uk/ Source code: https://github.com/gdevailly CONTACT: Guillaume.Devailly@roslin.ed.ac.uk or Anagha.Joshi@roslin.ed.ac.ukSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
The primary structures of ribosomal proteins S14 and S16 from the archaebacterium Halobacterium marismortui. Comparison with eubacterial and eukaryotic ribosomal proteins.

PubMed

Kimura, J; Kimura, M

1987-09-05

The amino acid sequences of two ribosomal proteins, S14 and S16, from the archaebacterium Halobacterium marismortui have been determined. Sequence data were obtained by the manual and solid-phase sequencing of peptides derived from enzymatic digestions with trypsin, chymotrypsin, pepsin, and Staphylococcus aureus protease as well as by chemical cleavage with cyanogen bromide. Proteins S14 and S16 contain 109 and 126 amino acid residues and have Mr values of 11,964 and 13,515, respectively. Comparison of the sequences with those of ribosomal proteins from other organisms demonstrates that S14 has a significant homology with the rat liver ribosomal protein S11 (36% identity) as well as with the Escherichia coli ribosomal protein S17 (37%), and that S16 is related to the yeast ribosomal protein YS22 (40%) and proteins S8 from E. coli (28%) and Bacillus stearothermophilus (30%). A comparison of the amino acid residues in the homologous regions of halophilic and nonhalophilic ribosomal proteins reveals that halophilic proteins have more glutamic acids, asparatic acids, prolines, and alanines, and less lysines, arginines, and isoleucines than their nonhalophilic counterparts. These amino acid substitutions probably contribute to the structural stability of halophilic ribosomal proteins.
Metavir 2: new tools for viral metagenome comparison and assembled virome analysis

PubMed Central

2014-01-01

Background Metagenomics, based on culture-independent sequencing, is a well-fitted approach to provide insights into the composition, structure and dynamics of environmental viral communities. Following recent advances in sequencing technologies, new challenges arise for existing bioinformatic tools dedicated to viral metagenome (i.e. virome) analysis as (i) the number of viromes is rapidly growing and (ii) large genomic fragments can now be obtained by assembling the huge amount of sequence data generated for each metagenome. Results To face these challenges, a new version of Metavir was developed. First, all Metavir tools have been adapted to support comparative analysis of viromes in order to improve the analysis of multiple datasets. In addition to the sequence comparison previously provided, viromes can now be compared through their k-mer frequencies, their taxonomic compositions, recruitment plots and phylogenetic trees containing sequences from different datasets. Second, a new section has been specifically designed to handle assembled viromes made of thousands of large genomic fragments (i.e. contigs). This section includes an annotation pipeline for uploaded viral contigs (gene prediction, similarity search against reference viral genomes and protein domains) and an extensive comparison between contigs and reference genomes. Contigs and their annotations can be explored on the website through specifically developed dynamic genomic maps and interactive networks. Conclusions The new features of Metavir 2 allow users to explore and analyze viromes composed of raw reads or assembled fragments through a set of adapted tools and a user-friendly interface. PMID:24646187
Identification of Clinical Coryneform Bacterial Isolates: Comparison of Biochemical Methods and Sequence Analysis of 16S rRNA and rpoB Genes▿

PubMed Central

Adderson, Elisabeth E.; Boudreaux, Jan W.; Cummings, Jessica R.; Pounds, Stanley; Wilson, Deborah A.; Procop, Gary W.; Hayden, Randall T.

2008-01-01

We compared the relative levels of effectiveness of three commercial identification kits and three nucleic acid amplification tests for the identification of coryneform bacteria by testing 50 diverse isolates, including 12 well-characterized control strains and 38 organisms obtained from pediatric oncology patients at our institution. Between 33.3 and 75.0% of control strains were correctly identified to the species level by phenotypic systems or nucleic acid amplification assays. The most sensitive tests were the API Coryne system and amplification and sequencing of the 16S rRNA gene using primers optimized for coryneform bacteria, which correctly identified 9 of 12 control isolates to the species level, and all strains with a high-confidence call were correctly identified. Organisms not correctly identified were species not included in the test kit databases or not producing a pattern of reactions included in kit databases or which could not be differentiated among several genospecies based on reaction patterns. Nucleic acid amplification assays had limited abilities to identify some bacteria to the species level, and comparison of sequence homologies was complicated by the inclusion of allele sequences obtained from uncultivated and uncharacterized strains in databases. The utility of rpoB genotyping was limited by the small number of representative gene sequences that are currently available for comparison. The correlation between identifications produced by different classification systems was poor, particularly for clinical isolates. PMID:18160450
Comparison of Electrochemical Methods for the Evaluation of Cast AZ91 Magnesium Alloy

PubMed Central

Tkacz, Jakub; Minda, Jozef; Fintová, Stanislava; Wasserbauer, Jaromír

2016-01-01

Linear polarization is a potentiodynamic method used for electrochemical characterization of materials. Obtained values of corrosion potential and corrosion current density offer information about material behavior in corrosion environments from the thermodynamic and kinetic points of view, respectively. The present study offers a comparison of applications of the linear polarization method (from −100 mV to +200 mV vs. EOCP), the cathodic polarization of the specimen (−100 mV vs. EOCP), and the anodic polarization of the specimen (+100 mV vs. EOCP), and a discussion of the differences in the obtained values of the electrochemical characteristics of cast AZ91 magnesium alloy. The corrosion current density obtained by cathodic polarization was similar to the corrosion current density obtained by linear polarization, while a lower value was obtained by anodic polarization. Signs of corrosion attack were observed only in the case of linear polarization including cathodic and anodic polarization of the specimen. PMID:28774046
Water-waves on linear shear currents. A comparison of experimental and numerical results.

NASA Astrophysics Data System (ADS)

Simon, Bruno; Seez, William; Touboul, Julien; Rey, Vincent; Abid, Malek; Kharif, Christian

2016-04-01

Propagation of water waves can be described for uniformly sheared current conditions. Indeed, some mathematical simplifications remain applicable in the study of waves whether there is no current or a linearly sheared current. However, the widespread use of mathematical wave theories including shear has rarely been backed by experimental studies of such flows. New experimental and numerical methods were both recently developed to study wave current interactions for constant vorticity. On one hand, the numerical code can simulate, in two dimensions, arbitrary non-linear waves. On the other hand, the experimental methods can be used to generate waves with various shear conditions. Taking advantage of the simplicity of the experimental protocol and versatility of the numerical code, comparisons between experimental and numerical data are discussed and compared with linear theory for validation of the methods. ACKNOWLEDGEMENTS The DGA (Direction Générale de l'Armement, France) is acknowledged for its financial support through the ANR grant N° ANR-13-ASTR-0007.
[Hepatitis C virus: sequence homology of a European isolate and divergence from the prototype].

PubMed

Seelig, R; Seelig, H P; Renz, M

1991-08-01

The polymerase chain reaction (PCR) detected specific hepatitis C viral (HCV) RNA sequences in liver biopsies from two patients with chronic hepatitis, in the tissue of a liver implantate, in plasma from four chronic non-A, non-B hepatitis (NANBH) patients and, for the first time, in an infectious anti-D-immunoglobulin preparation. A comparison of the viral sequences coding for a region for the nonstructural NS3 protein from the liver tissues revealed only a very small degree of sequence divergence on the cDNA as well as on the amino acid level (between 0 and 5%). The sequence similarities of the RNA isolated from plasma of the four chronic NANBH patients and the anti-D-immunoglobulin preparation were partly somewhat lower but altogether also high (between 90 and 100%). In contrast, all eight cDNA and amino acid sequences exhibited a significantly higher degree of divergence in comparison with the HCV prototype sequence (between 29 and 32%) than among themselves (between 0 and 10%). This unexpected high sequence similarity of the eight European isolates and their low homology to the Northamerican prototype sequence is indicative for the existence of different types of HCV. This will be important not only for epidemiological studies but also for the development of effective diagnostic procedures and vaccines. Concerning the pathogenesis of NANBH, a double infection or a helper mechanism has to be considered: in addition to the C virus, sequences of an other virus particle were found in the infectious IgG preparation as well as in the liver biopsies.
Clinical laboratory investigation of the Sanofi ACCESS CK-MB procedure and comparison to electrophoresis and Abbott IMx.

PubMed

Mao, G D; Adeli, K; Eisenbrey, A B; Artiss, J D

1996-07-01

This evaluation was undertaken to verify the application protocol for the CK-MB assay on the ACCESS Immunoassay Analyzer (Sanofi Diagnostics Pasteur, Chaska, MN). The results show that the ACCESS CK-MB assay total imprecision was 6.8% to 9.1%. Analytical linearity of the ACCESS CK-MB assay was excellent in the range of < 1-214 micrograms/L. A comparison of the ACCESS CK-MB assay with the IMx (Abbott Laboratories, Abbott Park, IL) method shows good correlation r = 0.990 (n = 108). Linear regression analysis yielded Y = 1.36X-0.3, Sx/y = 7.2. ACCESS CK-MB values also correlated well with CK-MB by electrophoresis with r = 0.968 (n = 132). The linear regression equation for this comparison was Y = 1.08X + 1.4, Sx/y = 14.1. The expected non-myocardial infarction range of CK-MB determined by the ACCESS system was 1.3-9.4 micrograms/L (mean = 4.0, n = 58). The ACCESS CK-MB assay would appear to be rapid, precise and clinically useful.
New insights into transcription fidelity: thermal stability of non-canonical structures in template DNA regulates transcriptional arrest, pause, and slippage.

PubMed

Tateishi-Karimata, Hisae; Isono, Noburu; Sugimoto, Naoki

2014-01-01

The thermal stability and topology of non-canonical structures of G-quadruplexes and hairpins in template DNA were investigated, and the effect of non-canonical structures on transcription fidelity was evaluated quantitatively. We designed ten template DNAs: A linear sequence that does not have significant higher-order structure, three sequences that form hairpin structures, and six sequences that form G-quadruplex structures with different stabilities. Templates with non-canonical structures induced the production of an arrested, a slipped, and a full-length transcript, whereas the linear sequence produced only a full-length transcript. The efficiency of production for run-off transcripts (full-length and slipped transcripts) from templates that formed the non-canonical structures was lower than that from the linear. G-quadruplex structures were more effective inhibitors of full-length product formation than were hairpin structure even when the stability of the G-quadruplex in an aqueous solution was the same as that of the hairpin. We considered that intra-polymerase conditions may differentially affect the stability of non-canonical structures. The values of transcription efficiencies of run-off or arrest transcripts were correlated with stabilities of non-canonical structures in the intra-polymerase condition mimicked by 20 wt% polyethylene glycol (PEG). Transcriptional arrest was induced when the stability of the G-quadruplex structure (-ΔG°37) in the presence of 20 wt% PEG was more than 8.2 kcal mol(-1). Thus, values of stability in the presence of 20 wt% PEG are an important indicator of transcription perturbation. Our results further our understanding of the impact of template structure on the transcription process and may guide logical design of transcription-regulating drugs.
Real-time volumetric relative dosimetry for magnetic resonance—image-guided radiation therapy (MR-IGRT)

NASA Astrophysics Data System (ADS)

Lee, Hannah J.; Kadbi, Mo; Bosco, Gary; Ibbott, Geoffrey S.

2018-02-01

The integration of magnetic resonance imaging (MRI) with linear accelerators (linac) has enabled the use of 3D MR-visible gel dosimeters for real-time verification of volumetric dose distributions. Several iron-based radiochromic 3D gels were created in-house then imaged and irradiated in a pre-clinical 1.5 T-7 MV MR-Linac. MR images were acquired using a range of balanced-fast field echo (b-FFE) sequences during irradiation to assess the contrast and dose response in irradiated regions and to minimize the presence of MR artifacts. Out of four radiochromic 3D gel formulations, the FOX 3D gel was found to provide superior MR contrast in the irradiated regions. The FOX gels responded linearly with respect to real-time dose and the signal remained stable post-irradiation for at least 20 min. The response of the FOX gel also was found to be unaffected by the radiofrequency and gradient fields created by the b-FFE sequence during irradiation. A reusable version of the FOX gel was used for b-FFE sequence optimization to reduce artifacts by increasing the number of averages at the expense of temporal resolution. Regardless of the real-time MR sequence used, the FOX 3D gels responded linearly to dose with minimal magnetic field effects due to the strong 1.5 T field or gradient fields present during imaging. These gels can easily be made in-house using non-reusable and reusable formulations depending on the needs of the clinic, and the results of this study encourage further applications of 3D gels for MR-IGRT applications.
New Insights into Transcription Fidelity: Thermal Stability of Non-Canonical Structures in Template DNA Regulates Transcriptional Arrest, Pause, and Slippage

PubMed Central

Tateishi-Karimata, Hisae; Isono, Noburu; Sugimoto, Naoki

2014-01-01

The thermal stability and topology of non-canonical structures of G-quadruplexes and hairpins in template DNA were investigated, and the effect of non-canonical structures on transcription fidelity was evaluated quantitatively. We designed ten template DNAs: A linear sequence that does not have significant higher-order structure, three sequences that form hairpin structures, and six sequences that form G-quadruplex structures with different stabilities. Templates with non-canonical structures induced the production of an arrested, a slipped, and a full-length transcript, whereas the linear sequence produced only a full-length transcript. The efficiency of production for run-off transcripts (full-length and slipped transcripts) from templates that formed the non-canonical structures was lower than that from the linear. G-quadruplex structures were more effective inhibitors of full-length product formation than were hairpin structure even when the stability of the G-quadruplex in an aqueous solution was the same as that of the hairpin. We considered that intra-polymerase conditions may differentially affect the stability of non-canonical structures. The values of transcription efficiencies of run-off or arrest transcripts were correlated with stabilities of non-canonical structures in the intra-polymerase condition mimicked by 20 wt% polyethylene glycol (PEG). Transcriptional arrest was induced when the stability of the G-quadruplex structure (−ΔGo 37) in the presence of 20 wt% PEG was more than 8.2 kcal mol−1. Thus, values of stability in the presence of 20 wt% PEG are an important indicator of transcription perturbation. Our results further our understanding of the impact of template structure on the transcription process and may guide logical design of transcription-regulating drugs. PMID:24594642
Genome Sequences of Multidrug-Resistant Salmonella enterica subsp. enterica Serovar Infantis Strains from Broiler Chicks in Hungary

PubMed Central

Wilk, Tímea; Szabó, Móni; Szmolka, Ama; Kiss, János; Barta, Endre; Nagy, Tibor

2016-01-01

Three strains of Salmonella enterica serovar Infantis isolated from healthy broiler chickens from 2012 to 2013 have been sequenced. Comparison of these and previously published S. Infantis genome sequences of broiler origin in 1996 and 2004 will provide new insight into the genome evolution and recent spread of S. Infantis in poultry. PMID:27979950
Characterization and complete genome sequence of a panicovirus from Bermuda grass by high-throughput sequencing.

PubMed

Tahir, Muhammad N; Lockhart, Ben; Grinstead, Samuel; Mollov, Dimitre

2017-04-01

Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high-throughput sequencing (HTS). The nearly full genome sequence of a panicovirus was identified from one HTS scaffold. Sanger sequencing was used to confirm the HTS results and complete the genome sequence of 4404 nt. This virus was provisionally named Bermuda grass latent virus (BGLV). Its predicted open reading frames follow the typical arrangement of the genus Panicovirus. Based on sequence comparisons and phylogenetic analyses BGLV differs from other viruses and therefore taxonomically it is a new member of the genus Panicovirus, family Tombusviridae.
A statistical physics perspective on alignment-independent protein sequence comparison.

PubMed

Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R

2015-08-01

Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-base similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.
A ruby-colored Pseudobaeospora species is described as new from material collected on the island of Hawaii.

PubMed

Desjardin, Dennis E; Hemmes, Don E; Perry, Brian A

2014-01-01

Pseudobaeospora wipapatiae is described as new based on material collected in alien wet habitats on the island of Hawaii. Unique features of this beautiful species include deep ruby-colored basidiomes with two-spored basidia, amyloid cheilocystidia and a hymeniderm pileipellis with abundant pileocystidia that is initially deep ruby in KOH then changes to lilac gray. Phylogenetic analysis of nuclear large ribosomal subunit sequence data suggest a close relationship between Pseudobaeospora and Tricholoma. BLAST comparisons of internal transcribed spacer and 5.8S nuclear ribosomal subunit regions sequence data reveal greatest similarity with existing sequences of Pseudobaeospora species. A comprehensive description, color photograph, illustrations of salient micromorphological features and comparisons with phenetically similar taxa are provided. © 2014 by The Mycological Society of America.
Assessment of computer techniques for processing digital LANDSAT MSS data for lithological discrimination of Serra do Ramalho, State of Bahia

NASA Technical Reports Server (NTRS)

Paradella, W. R. (Principal Investigator); Vitorello, I.; Monteiro, M. D.

1984-01-01

Enhancement techniques and thematic classifications were applied to the metasediments of Bambui Super Group (Upper Proterozoic) in the Region of Serra do Ramalho, SW of the state of Bahia. Linear contrast stretch, band-ratios with contrast stretch, and color-composites allow lithological discriminations. The effects of human activities and of vegetation cover mask and limit, in several ways, the lithological discrimination with digital MSS data. Principal component images and color composite of linear contrast stretch of these products, show lithological discrimination through tonal gradations. This set of products allows the delineations of several metasedimentary sequences to a level superior to reconnaissance mapping. Supervised (maximum likelihood classifier) and nonsupervised (K-Means classifier) classification of the limestone sequence, host to fluorite mineralization show satisfactory results.
Chromosome diversity and similarity within the Actinomycetales.

PubMed

Kirby, Ralph

2011-06-01

Many chromosomes from Actinomycetales, an order within the Actinobacteria, have been sequenced over the last 10 years and the pace is increasing. This group of Gram-positive and high G+C% bacteria is economically and medically important. However, this group of organisms also is just about the only order in the kingdom Bacteria to have a relatively high proportion of linear chromosomes. Chromosome topology varies within the order according to the genera. Streptomyces, Kitasatospora and Rhodococcus, at least as chromosome sequencing stands at present, have a very high proportion of linear chromosomes, whereas most other genera seem to have circular chromosomes. This review examines chromosome topology across the Actinomycetales and how this affects our concepts of chromosome evolution. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
FASMA: a service to format and analyze sequences in multiple alignments.

PubMed

Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

2007-12-01

Multiple sequence alignments are successfully applied in many studies for under- standing the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.

A space-efficient algorithm for local similarities.

PubMed

Huang, X Q; Hardison, R C; Miller, W

1990-10-01

Existing dynamic-programming algorithms for identifying similar regions of two sequences require time and space proportional to the product of the sequence lengths. Often this space requirement is more limiting than the time requirement. We describe a dynamic-programming local-similarity algorithm that needs only space proportional to the sum of the sequence lengths. The method can also find repeats within a single long sequence. To illustrate the algorithm's potential, we discuss comparison of a 73,360 nucleotide sequence containing the human beta-like globin gene cluster and a corresponding 44,594 nucleotide sequence for rabbit, a problem well beyond the capabilities of other dynamic-programming software.
Virtual Estimator for Piecewise Linear Systems Based on Observability Analysis

PubMed Central

Morales-Morales, Cornelio; Adam-Medina, Manuel; Cervantes, Ilse; Vela-Valdés and, Luis G.; García Beltrán, Carlos Daniel

2013-01-01

This article proposes a virtual sensor for piecewise linear systems based on observability analysis that is in function of a commutation law related with the system's outpu. This virtual sensor is also known as a state estimator. Besides, it presents a detector of active mode when the commutation sequences of each linear subsystem are arbitrary and unknown. For the previous, this article proposes a set of virtual estimators that discern the commutation paths of the system and allow estimating their output. In this work a methodology in order to test the observability for piecewise linear systems with discrete time is proposed. An academic example is presented to show the obtained results. PMID:23447007
Direct comparison of XAFS spectroscopy and sequential extraction for arsenic speciation in coal

USGS Publications Warehouse

Huggins, Frank E.; Huffman, G.P.; Kolker, A.; Mroczkowski, S.; Palmer, C.A.; Finkelman, R.B.

2000-01-01

The speciation of arsenic in an Ohio bituminous coal and a North Dakota lignite has been examined by the complementary methods of arsenic XAFS spectroscopy and sequential extraction by aqueous solutions of ammonium acetate, HCl, HF, and HNO3. In order to facilitate a more direct comparison of the two methods, the arsenic XAFS spectra were obtained from aliquots of the coal prepared after each stage of the leaching procedure. For the aliquots, approximately linear correlations (r2 > 0.98 for the Ohio coal, > 0.90 for the ND lignite) were observed between the height of the edge-step in the XAFS analysis and the concentration of arsenic measured by instrumental neutron activation analysis. Results from the leaching sequence indicate that there are two major arsenic forms present in both coals; one is removed by leaching with HCl and the other by HNO3. Whereas the XAFS spectral signatures of the arsenic leached by HCl are compatible with arsenate for both coals, the arsenic leached by HNO3 is identified as arsenic associated with pyrite for the Ohio coal and as an As3+ species for the North Dakota lignite. Minor arsenate forms persist in both coals after the final leaching with nitric acid. The arsenate forms extracted in HCl are believed to be oxidation products derived from the other major arsenic forms upon exposure of the pulverized coals to air.
Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man

PubMed Central

Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.

2000-01-01

The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionary conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409
Partial Shotgun Sequencing of the Boechera stricta Genome Reveals Extensive Microsynteny and Promoter Conservation with Arabidopsis1[W

PubMed Central

Windsor, Aaron J.; Schranz, M. Eric; Formanová, Nataša; Gebauer-Jung, Steffi; Bishop, John G.; Schnabelrauch, Domenica; Kroymann, Juergen; Mitchell-Olds, Thomas

2006-01-01

Comparative genomics provides insight into the evolutionary dynamics that shape discrete sequences as well as whole genomes. To advance comparative genomics within the Brassicaceae, we have end sequenced 23,136 medium-sized insert clones from Boechera stricta, a wild relative of Arabidopsis (Arabidopsis thaliana). A significant proportion of these sequences, 18,797, are nonredundant and display highly significant similarity (BLASTn e-value ≤ 10−30) to low copy number Arabidopsis genomic regions, including more than 9,000 annotated coding sequences. We have used this dataset to identify orthologous gene pairs in the two species and to perform a global comparison of DNA regions 5′ to annotated coding regions. On average, the 500 nucleotides upstream to coding sequences display 71.4% identity between the two species. In a similar analysis, 61.4% identity was observed between 5′ noncoding sequences of Brassica oleracea and Arabidopsis, indicating that regulatory regions are not as diverged among these lineages as previously anticipated. By mapping the B. stricta end sequences onto the Arabidopsis genome, we have identified nearly 2,000 conserved blocks of microsynteny (bracketing 26% of the Arabidopsis genome). A comparison of fully sequenced B. stricta inserts to their homologous Arabidopsis genomic regions indicates that indel polymorphisms >5 kb contribute substantially to the genome size difference observed between the two species. Further, we demonstrate that microsynteny inferred from end-sequence data can be applied to the rapid identification and cloning of genomic regions of interest from nonmodel species. These results suggest that among diploid relatives of Arabidopsis, small- to medium-scale shotgun sequencing approaches can provide rapid and cost-effective benefits to evolutionary and/or functional comparative genomic frameworks. PMID:16607030
Indexcov: fast coverage quality control for whole-genome sequencing.

PubMed

Pedersen, Brent S; Collins, Ryan L; Talkowski, Michael E; Quinlan, Aaron R

2017-11-01

The BAM and CRAM formats provide a supplementary linear index that facilitates rapid access to sequence alignments in arbitrary genomic regions. Comparing consecutive entries in a BAM or CRAM index allows one to infer the number of alignment records per genomic region for use as an effective proxy of sequence depth in each genomic region. Based on these properties, we have developed indexcov, an efficient estimator of whole-genome sequencing coverage to rapidly identify samples with aberrant coverage profiles, reveal large-scale chromosomal anomalies, recognize potential batch effects, and infer the sex of a sample. Indexcov is available at https://github.com/brentp/goleft under the MIT license. © The Authors 2017. Published by Oxford University Press.
Revisiting Isotherm Analyses Using R: Comparison of Linear, Non-linear, and Bayesian Techniques

EPA Science Inventory

Extensive adsorption isotherm data exist for an array of chemicals of concern on a variety of engineered and natural sorbents. Several isotherm models exist that can accurately describe these data from which the resultant fitting parameters may subsequently be used in numerical ...
The impact of targeting repetitive BamHI-W sequences on the sensitivity and precision of EBV DNA quantification.

PubMed

Sanosyan, Armen; Fayd'herbe de Maudave, Alexis; Bollore, Karine; Zimmermann, Valérie; Foulongne, Vincent; Van de Perre, Philippe; Tuaillon, Edouard

2017-01-01

Viral load monitoring and early Epstein-Barr virus (EBV) DNA detection are essential in routine laboratory testing, especially in preemptive management of Post-transplant Lymphoproliferative Disorder. Targeting the repetitive BamHI-W sequence was shown to increase the sensitivity of EBV DNA quantification, but the variability of BamHI-W reiterations was suggested to be a source of quantification bias. We aimed to assess the extent of variability associated with BamHI-W PCR and its impact on the sensitivity of EBV DNA quantification using the 1st WHO international standard, EBV strains and clinical samples. Repetitive BamHI-W- and LMP2 single- sequences were amplified by in-house qPCRs and BXLF-1 sequence by a commercial assay (EBV R-gene™, BioMerieux). Linearity and limits of detection of in-house methods were assessed. The impact of repeated versus single target sequences on EBV DNA quantification precision was tested on B95.8 and Raji cell lines, possessing 11 and 7 copies of the BamHI-W sequence, respectively, and on clinical samples. BamHI-W qPCR demonstrated a lower limit of detection compared to LMP2 qPCR (2.33 log10 versus 3.08 log10 IU/mL; P = 0.0002). BamHI-W qPCR underestimated the EBV DNA load on Raji strain which contained fewer BamHI-W copies than the WHO standard derived from the B95.8 EBV strain (mean bias: - 0.21 log10; 95% CI, -0.54 to 0.12). Comparison of BamHI-W qPCR versus LMP2 and BXLF-1 qPCR showed an acceptable variability between EBV DNA levels in clinical samples with the mean bias being within 0.5 log10 IU/mL EBV DNA, whereas a better quantitative concordance was observed between LMP2 and BXLF-1 assays. Targeting BamHI-W resulted to a higher sensitivity compared to LMP2 but the variable reiterations of BamHI-W segment are associated with higher quantification variability. BamHI-W can be considered for clinical and therapeutic monitoring to detect an early EBV DNA and a dynamic change in viral load.
Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data.

PubMed

Liu, Yang; Chiaromonte, Francesca; Ross, Howard; Malhotra, Raunaq; Elleder, Daniel; Poss, Mary

2015-06-30

Infection with feline immunodeficiency virus (FIV) causes an immunosuppressive disease whose consequences are less severe if cats are co-infected with an attenuated FIV strain (PLV). We use virus diversity measurements, which reflect replication ability and the virus response to various conditions, to test whether diversity of virulent FIV in lymphoid tissues is altered in the presence of PLV. Our data consisted of the 3' half of the FIV genome from three tissues of animals infected with FIV alone, or with FIV and PLV, sequenced by 454 technology. Since rare variants dominate virus populations, we had to carefully distinguish sequence variation from errors due to experimental protocols and sequencing. We considered an exponential-normal convolution model used for background correction of microarray data, and modified it to formulate an error correction approach for minor allele frequencies derived from high-throughput sequencing. Similar to accounting for over-dispersion in counts, this accounts for error-inflated variability in frequencies - and quite effectively reproduces empirically observed distributions. After obtaining error-corrected minor allele frequencies, we applied ANalysis Of VAriance (ANOVA) based on a linear mixed model and found that conserved sites and transition frequencies in FIV genes differ among tissues of dual and single infected cats. Furthermore, analysis of minor allele frequencies at individual FIV genome sites revealed 242 sites significantly affected by infection status (dual vs. single) or infection status by tissue interaction. All together, our results demonstrated a decrease in FIV diversity in bone marrow in the presence of PLV. Importantly, these effects were weakened or undetectable when error correction was performed with other approaches (thresholding of minor allele frequencies; probabilistic clustering of reads). We also queried the data for cytidine deaminase activity on the viral genome, which causes an asymmetric increase in G to A substitutions, but found no evidence for this host defense strategy. Our error correction approach for minor allele frequencies (more sensitive and computationally efficient than other algorithms) and our statistical treatment of variation (ANOVA) were critical for effective use of high-throughput sequencing data in understanding viral diversity. We found that co-infection with PLV shifts FIV diversity from bone marrow to lymph node and spleen.
The impact of targeting repetitive BamHI-W sequences on the sensitivity and precision of EBV DNA quantification

PubMed Central

Fayd’herbe de Maudave, Alexis; Bollore, Karine; Zimmermann, Valérie; Foulongne, Vincent; Van de Perre, Philippe; Tuaillon, Edouard

2017-01-01

Background Viral load monitoring and early Epstein-Barr virus (EBV) DNA detection are essential in routine laboratory testing, especially in preemptive management of Post-transplant Lymphoproliferative Disorder. Targeting the repetitive BamHI-W sequence was shown to increase the sensitivity of EBV DNA quantification, but the variability of BamHI-W reiterations was suggested to be a source of quantification bias. We aimed to assess the extent of variability associated with BamHI-W PCR and its impact on the sensitivity of EBV DNA quantification using the 1st WHO international standard, EBV strains and clinical samples. Methods Repetitive BamHI-W- and LMP2 single- sequences were amplified by in-house qPCRs and BXLF-1 sequence by a commercial assay (EBV R-gene™, BioMerieux). Linearity and limits of detection of in-house methods were assessed. The impact of repeated versus single target sequences on EBV DNA quantification precision was tested on B95.8 and Raji cell lines, possessing 11 and 7 copies of the BamHI-W sequence, respectively, and on clinical samples. Results BamHI-W qPCR demonstrated a lower limit of detection compared to LMP2 qPCR (2.33 log10 versus 3.08 log10 IU/mL; P = 0.0002). BamHI-W qPCR underestimated the EBV DNA load on Raji strain which contained fewer BamHI-W copies than the WHO standard derived from the B95.8 EBV strain (mean bias: - 0.21 log10; 95% CI, -0.54 to 0.12). Comparison of BamHI-W qPCR versus LMP2 and BXLF-1 qPCR showed an acceptable variability between EBV DNA levels in clinical samples with the mean bias being within 0.5 log10 IU/mL EBV DNA, whereas a better quantitative concordance was observed between LMP2 and BXLF-1 assays. Conclusions Targeting BamHI-W resulted to a higher sensitivity compared to LMP2 but the variable reiterations of BamHI-W segment are associated with higher quantification variability. BamHI-W can be considered for clinical and therapeutic monitoring to detect an early EBV DNA and a dynamic change in viral load. PMID:28850597
Switching Systems: Controllability and Control Design

DTIC Science & Technology

2009-04-25

controllable linear time invariant (LTI) systems ẋ = Ax+Bu are stabilizable and the stabilization can be always done by a...to control the system is bounded. As an application controllability conditions for a class of bimodal linear time invariant (LTI) systems are also...There exist a universal ( finite ) switching sequence σ such that the time varying system ẋ = A(σ)x+ B(σ)u is globally controllable . Proof: The
Complete genome sequence of a novel pararetrovirus isolated from soybean

USDA-ARS?s Scientific Manuscript database

We report the complete genome sequence of Soybean Putnam pararetrovirus (SPPRV), a new pararetrovirus isolated from a soybean field in Putnam County, Ohio, USA. Comparison of SPPRV with other plant-infecting pararetroviruses places it in the genus Caulimovirus of the family Caulimoviridae....
USE OF COMPETITIVE DNA HYBRIDIZATION TO IDENTIFY DIFFERENCES IN THE GENOMES OF TWO CLOSELY RELATED FECAL INDICATOR BACTERIA

EPA Science Inventory

Although recent technological advances in DNA sequencing and computational biology now allow scientists to compare entire microbial genomes, comparisons of closely related bacterial species and individual isolates by whole-genome sequencing approaches remains prohibitively expens...
The LINE-1 DNA sequences in four mammalian orders predict proteins that conserve homologies to retrovirus proteins.

PubMed Central

Fanning, T; Singer, M

1987-01-01

Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227
Primary structures of ribosomal proteins from the archaebacterium Halobacterium marismortui and the eubacterium Bacillus stearothermophilus.

PubMed

Arndt, E; Scholzen, T; Krömer, W; Hatakeyama, T; Kimura, M

1991-06-01

Approximately 40 ribosomal proteins from each Halobacterium marismortui and Bacillus stearothermophilus have been sequenced either by direct protein sequence analysis or by DNA sequence analysis of the appropriate genes. The comparison of the amino acid sequences from the archaebacterium H marismortui with the available ribosomal proteins from the eubacterial and eukaryotic kingdoms revealed four different groups of proteins: 24 proteins are related to both eubacterial as well as eukaryotic proteins. Eleven proteins are exclusively related to eukaryotic counterparts. For three proteins only eubacterial relatives-and for another three proteins no counterpart-could be found. The similarities of the halobacterial ribosomal proteins are in general somewhat higher to their eukaryotic than to their eubacterial counterparts. The comparison of B stearothermophilus proteins with their E coli homologues showed that the proteins evolved at different rates. Some proteins are highly conserved with 64-76% identity, others are poorly conserved with only 25-34% identical amino acid residues.
Computational complexity of algorithms for sequence comparison, short-read assembly and genome alignment.

PubMed

Baichoo, Shakuntala; Ouzounis, Christos A

A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been previously reported in original research papers, yet this often neglected property has not been reviewed previously in a systematic manner and for a wider audience. We provide a review of space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we will be facing challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality. Copyright © 2017 Elsevier B.V. All rights reserved.
Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment

PubMed Central

Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

2012-01-01

Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378
Fast Algorithms for Mining Co-evolving Time Series

DTIC Science & Technology

2011-09-01

Keogh et al., 2001, 2004] and (b) forecasting, like an autoregressive integrated moving average model ( ARIMA ) and related meth- ods [Box et al., 1994...computing hardware? We develop models to mine time series with missing values, to extract compact representation from time sequences, to segment the...sequences, and to do forecasting. For large scale data, we propose algorithms for learning time series models , in particular, including Linear Dynamical
Establishing homologies in protein sequences

NASA Technical Reports Server (NTRS)

Dayhoff, M. O.; Barker, W. C.; Hunt, L. T.

1983-01-01

Computer-based statistical techniques used to determine homologies between proteins occurring in different species are reviewed. The technique is based on comparison of two protein sequences, either by relating all segments of a given length in one sequence to all segments of the second or by finding the best alignment of the two sequences. Approaches discussed include selection using printed tabulations, identification of very similar sequences, and computer searches of a database. The use of the SEARCH, RELATE, and ALIGN programs (Dayhoff, 1979) is explained; sample data are presented in graphs, diagrams, and tables and the construction of scoring matrices is considered.
The non-linear power spectrum of the Lyman alpha forest

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arinyo-i-Prats, Andreu; Miralda-Escudé, Jordi; Viel, Matteo

2015-12-01

The Lyman alpha forest power spectrum has been measured on large scales by the BOSS survey in SDSS-III at z∼ 2.3, has been shown to agree well with linear theory predictions, and has provided the first measurement of Baryon Acoustic Oscillations at this redshift. However, the power at small scales, affected by non-linearities, has not been well examined so far. We present results from a variety of hydrodynamic simulations to predict the redshift space non-linear power spectrum of the Lyα transmission for several models, testing the dependence on resolution and box size. A new fitting formula is introduced to facilitate themore » comparison of our simulation results with observations and other simulations. The non-linear power spectrum has a generic shape determined by a transition scale from linear to non-linear anisotropy, and a Jeans scale below which the power drops rapidly. In addition, we predict the two linear bias factors of the Lyα forest and provide a better physical interpretation of their values and redshift evolution. The dependence of these bias factors and the non-linear power on the amplitude and slope of the primordial fluctuations power spectrum, the temperature-density relation of the intergalactic medium, and the mean Lyα transmission, as well as the redshift evolution, is investigated and discussed in detail. A preliminary comparison to the observations shows that the predicted redshift distortion parameter is in good agreement with the recent determination of Blomqvist et al., but the density bias factor is lower than observed. We make all our results publicly available in the form of tables of the non-linear power spectrum that is directly obtained from all our simulations, and parameters of our fitting formula.« less

Some links on this page may take you to non-federal websites. Their policies may differ from this site.