A DYNAMICAL SIGNATURE OF MULTIPLE STELLAR POPULATIONS IN 47 TUCANAE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Richer, Harvey B.; Heyl, Jeremy; Anderson, Jay
2013-07-01
Based on the width of its main sequence, and an actual observed split when viewed through particular filters, it is widely accepted that 47 Tucanae contains multiple stellar populations. In this contribution, we divide the main sequence of 47 Tuc into four color groups, which presumably represent stars of various chemical compositions. The kinematic properties of each of these groups are explored via proper motions, and a strong signal emerges of differing proper-motion anisotropies with differing main-sequence color; the bluest main-sequence stars exhibit the largest proper-motion anisotropy which becomes undetectable for the reddest stars. In addition, the bluest stars are also the most centrally concentrated. A similar analysis for Small Magellanic Cloud stars, which are located in the background of 47 Tuc on our frames, yields none of the anisotropy exhibited by the 47 Tuc stars. We discuss implications of these results for possible formation scenarios of the various populations.
Multiple Access Interference Reduction Using Received Response Code Sequence for DS-CDMA UWB System
NASA Astrophysics Data System (ADS)
Toh, Keat Beng; Tachikawa, Shin'ichi
This paper proposes a combination of a novel Received Response (RR) sequence at the transmitter and a Matched Filter-RAKE (MF-RAKE) combining scheme at the receiver for the Direct Sequence-Code Division Multiple Access Ultra Wideband (DS-CDMA UWB) multipath channel model. This paper also demonstrates the effectiveness of the RR sequence in Multiple Access Interference (MAI) reduction for the DS-CDMA UWB system. It suggests that using a conventional binary code sequence such as the M sequence or the Gold sequence can generate extra MAI in the UWB system. As a result, it is quite difficult to collect the energy efficiently even when the RAKE reception method is applied at the receiver. The main purpose of the proposed system is to overcome the performance degradation of UWB transmission caused by MAI during multiple access in the DS-CDMA UWB system. The proposed system improves RAKE reception performance by using the RR sequence, which can reduce the MAI effect significantly. Simulation results verify that significant improvement can be obtained by the proposed system in the UWB multipath channel models.
The Multiple Stellar Populations in the Ancient LMC Globular Clusters Hodge 11 and NGC 2210
NASA Astrophysics Data System (ADS)
Chaboyer, Brian; Gilligan, Christina; Wagner-Kaiser, Rachel; Mackey, Dougal; Sarajedini, Ata; Cummings, Jeffrey; Grocholski, Aaron; Geisler, Doug; Cohen, Roger; Villanova, Sandro; Yang, Soung-Chul; Parisi, Celeste
2018-01-01
Hubble Space Telescope images of the ancient LMC globular clusters Hodge 11 and NGC 2210 in the F336W, F606W and F814W filters were obtained between June 2016 and April 2017. These deep images have been analyzed with the Dolphot software package. High quality photometry has been obtained from three magnitudes brighter than the horizontal branch to about four magnitudes fainter than the main sequence turn-off. Both clusters show an excess of red main sequence stars in the F336W filter, indicating that multiple stellar populations exist in both clusters. Hodge 11 shows irregularities in its horizontal branch morphology, which is indicative of the presence of an approximately 0.1 dex internal helium abundance spread.
Limit cycles in piecewise-affine gene network models with multiple interaction loops
NASA Astrophysics Data System (ADS)
Farcot, Etienne; Gouzé, Jean-Luc
2010-01-01
In this article, we consider piecewise affine differential equations modelling gene networks. We work with arbitrary decay rates, and under a local hypothesis expressed as an alignment condition of successive focal points. The interaction graph of the system may be rather complex (multiple intricate loops of any sign, multiple thresholds, etc.). Our main result is an alternative theorem showing that if a sequence of regions is periodically visited by trajectories, then under our hypotheses, there exists either a unique stable periodic solution, or the origin attracts all trajectories in this sequence of regions. This result greatly extends our previous work on a single negative feedback loop. We give several examples and simulations illustrating different cases.
Zhang, Huimin; He, Hongkui; Yu, Xiujuan; Xu, Zhaohui; Zhang, Zhizhou
2016-11-01
It remains an unsolved problem to quantify a natural microbial community by rapidly and conveniently measuring multiple species with functional significance. The most widely used high-throughput next-generation sequencing methods mainly generate information for genus-level taxonomic identification and quantification, and detection of multiple species in a complex microbial community is still heavily dependent on approaches based on near full-length ribosomal RNA gene or genome sequence information. In this study, we used near full-length rRNA gene library sequencing plus Primer-Blast to design species-specific primers based on whole microbial genome sequences. The primers were intended to be specific at the species level within relevant microbial communities, i.e., a defined genomics background. The primers were tested with samples collected from the Daqu (also called fermentation starters) and pit mud of a traditional Chinese liquor production plant. Sixteen pairs of primers were found to be suitable for identification of individual species. Among them, seven pairs were chosen to measure the abundance of microbial species through quantitative PCR. The combination of near full-length ribosomal RNA gene library sequencing and Primer-Blast may represent a broadly useful protocol to quantify multiple species in complex microbial population samples with species-specific primers.
RBT-GA: a novel metaheuristic for solving the Multiple Sequence Alignment problem.
Taheri, Javid; Zomaya, Albert Y
2009-07-07
Multiple Sequence Alignment (MSA) has always been an active area of research in Bioinformatics. MSA is mainly focused on discovering biologically meaningful relationships among different sequences or proteins in order to investigate the underlying main characteristics/functions. This information is also used to generate phylogenetic trees. This paper presents a novel approach, namely RBT-GA, to solve the MSA problem using a hybrid solution methodology combining the Rubber Band Technique (RBT) and the Genetic Algorithm (GA) metaheuristic. RBT is inspired by the behavior of an elastic Rubber Band (RB) on a plate with several poles, which is analogous to locations in the input sequences that could potentially be biologically related. A GA attempts to mimic the evolutionary processes of life in order to locate optimal solutions in an often very complex landscape. RBT-GA is a population-based optimization algorithm designed to find the optimal alignment for a set of input protein sequences. In this novel technique, each alignment answer is modeled as a chromosome consisting of several poles in the RBT framework. These poles resemble locations in the input sequences that are most likely to be correlated and/or biologically related. A GA-based optimization process improves these chromosomes gradually, yielding a set of mostly optimal answers for the MSA problem. RBT-GA is tested with one of the well-known benchmark suites (BALiBASE 2.0) in this area. The obtained results show the superiority of the proposed technique even in the case of formidable sequences.
NASA Astrophysics Data System (ADS)
Milone, A. P.; Bedin, L. R.; Piotto, G.; Marino, A. F.; Cassisi, S.; Bellini, A.; Jerjen, H.; Pietrinferni, A.; Aparicio, A.; Rich, R. M.
2015-07-01
Recent studies have shown that the extended main-sequence turn-off (eMSTO) is a common feature of intermediate-age star clusters in the Magellanic Clouds (MCs). The simplest explanation is that these stellar systems harbour multiple generations of stars with an age difference of a few hundred million years. However, while an eMSTO has been detected in a large number of clusters with ages between ~1-2 Gyr, several studies of young clusters in both MCs and in nearby galaxies do not find any evidence for a prolonged star formation history, i.e. for multiple stellar generations. These results have suggested alternative interpretations of the eMSTOs observed in intermediate-age star clusters. The eMSTO could be due to stellar rotation mimicking an age spread or to interacting binaries. In these scenarios, intermediate-age MC clusters would be simple stellar populations, in close analogy with younger clusters. Here, we provide the first evidence for an eMSTO in a young stellar cluster. We exploit multiband Hubble Space Telescope photometry to study the ~300-Myr old star cluster NGC 1856 in the Large Magellanic Cloud and detect a broadened MSTO that is consistent with a prolonged star formation which had a duration of about 150 Myr. Below the turn-off, the main sequence (MS) of NGC 1856 is split into a red and blue component, hosting 33 ± 5 and 67 ± 5 per cent of the total number of MS stars, respectively. We discuss these findings in the context of multiple-stellar-generation, stellar-rotation, and interacting-binary hypotheses.
A VLT/NACO survey for triple and quadruple systems among visual pre-main sequence binaries
NASA Astrophysics Data System (ADS)
Correia, S.; Zinnecker, H.; Ratzka, Th.; Sterzik, M. F.
2006-12-01
Aims. This paper describes a systematic search for high-order multiplicity among wide visual Pre-Main Sequence (PMS) binaries. Methods. We conducted an Adaptive Optics survey of a sample of 58 PMS wide binaries from various star-forming regions, which include 52 T Tauri systems with mostly K- and M-type primaries, with the NIR instrument NACO at the VLT. Results. Of these 52 systems, 7 are found to be triple (2 new) and 7 quadruple (1 new). The new close companions are most likely physically bound based on their probability of chance projection and, for some of them, on their position on a color-color diagram. The corresponding degree of multiplicity among wide binaries (number of triples and quadruples divided by the number of systems) is 26.9 ± 7.2% in the projected separation range ~0.07''-12'', with the largest contribution from the Taurus-Auriga cloud. We also found that this degree of multiplicity is twice as high in Taurus as in Ophiuchus and Chamaeleon, for which the same number of sources are present in our sample. Considering a restricted sample composed of systems at distance 140-190 pc, the degree of multiplicity is 26.8 ± 8.1%, in the separation range 10/14 AU-1700/2300 AU (30 binaries, 5 triples, 6 quadruples). The observed frequency agrees with results from previous multiplicity surveys within the uncertainties, although a significant overabundance of quadruple systems compared to triple systems is apparent. Tentatively including the spectroscopic pairs in our restricted sample and comparing the multiplicity fractions to those measured for solar-type main-sequence stars in the solar neighborhood leads to the conclusion that both the ratio of triples to binaries and the ratio of quadruples to triples seem to be in excess among young stars.
Most of the current numerical simulations of multiple star formation, and especially smoothed particle hydrodynamics simulations, over-predict the fraction of high-order multiplicity when compared to our results. The circumstellar properties around the individual components of our high-order multiple systems tend to favor mixed systems (i.e. systems including components of wTTS and cTTS type), which is in general agreement with previous studies of disks in binaries, with the exception of Taurus, where we find a preponderance of similar types of components among the multiples studied.
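The degree of multiplicity quoted in the abstract above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, the count of 38 binaries in the full sample is inferred as 52 - 7 - 7, and the use of a Poisson error (square root of the number of higher-order systems divided by the total) is an assumption, since the abstract does not say how the uncertainties were derived. That assumption does happen to match the quoted 7.2% and 8.1%.

```python
import math


def multiplicity_degree(n_binaries, n_triples, n_quadruples):
    """Return (fraction, error) of triples plus quadruples among all systems.

    The error is a simple Poisson estimate, sqrt(N_high) / N_total.
    This is an assumption; the abstract does not state the method used.
    """
    n_total = n_binaries + n_triples + n_quadruples
    n_high = n_triples + n_quadruples
    fraction = n_high / n_total
    error = math.sqrt(n_high) / n_total
    return fraction, error


# Full T Tauri sample: 52 systems, 7 triples, 7 quadruples
# (38 binaries is inferred here, not stated in the abstract).
full = multiplicity_degree(38, 7, 7)

# Restricted sample at 140-190 pc: 30 binaries, 5 triples, 6 quadruples.
restricted = multiplicity_degree(30, 5, 6)

for label, (frac, err) in (("full", full), ("restricted", restricted)):
    print(f"{label}: {100 * frac:.1f} +/- {100 * err:.1f} %")
```

With these inputs the two samples give 14/52 and 11/41, which agree with the percentages stated in the abstract.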
DNAAlignEditor: DNA alignment editor tool
Sanchez-Villeda, Hector; Schroeder, Steven; Flint-Garcia, Sherry; Guill, Katherine E; Yamasaki, Masanori; McMullen, Michael D
2008-01-01
Background With advances in DNA re-sequencing methods and Next-Generation parallel sequencing approaches, there has been a large increase in genomic efforts to define and analyze the sequence variability present among individuals within a species. For very polymorphic species such as maize, this has led to a need for intuitive, user-friendly software that aids the biologist, often with naïve programming capability, in tracking, editing, displaying, and exporting multiple individual sequence alignments. To fill this need we have developed a novel DNA alignment editor. Results We have generated a nucleotide sequence alignment editor (DNAAlignEditor) that provides an intuitive, user-friendly interface for manual editing of multiple sequence alignments with functions for input, editing, and output of sequence alignments. The color-coding of nucleotide identity and the display of associated quality scores aid in the manual alignment editing process. DNAAlignEditor works as a client/server tool having two main components: a relational database that collects the processed alignments and a user interface connected to the database through universal data access connectivity drivers. DNAAlignEditor can be used either as a stand-alone application or as a network application with multiple users concurrently connected. Conclusion We anticipate that this software will be of general interest to biologists and population geneticists in editing DNA sequence alignments and analyzing natural sequence variation regardless of species, and will be particularly useful for manual alignment editing of sequences in species with high levels of polymorphism. PMID:18366684
Shahinyan, Grigor; Margaryan, Armine; Panosyan, Hovik; Trchounian, Armen
2017-05-02
Among the huge diversity of thermophilic bacteria, mainly bacilli have been reported as active thermostable lipase producers. Geothermal springs serve as the main source for isolation of thermostable lipase-producing bacilli. Thermostable lipolytic enzymes, functioning in harsh conditions, have promising applications in processing of organic chemicals, detergent formulation, synthesis of biosurfactants, pharmaceutical processing, etc. In order to study the distribution of lipase-producing thermophilic bacilli and their specific lipase protein primary structures, three lipase producers from different genera were isolated from mesothermal (27.5-70 °C) springs distributed on the territory of Armenia and Nagorno Karabakh. Based on phenotypic characteristics and 16S rRNA gene sequencing, the isolates were identified as Geobacillus sp., Bacillus licheniformis and Anoxibacillus flavithermus strains. The lipase genes of the isolates were sequenced by using initially designed primer sets. Multiple alignments generated from primary structures of the lipase proteins and annotated lipase protein sequences, conserved regions analysis and amino acid composition have illustrated the similarity (98-99%) of the lipases with true lipases (family I) and the GDSL esterase family (family II). A conserved sequence block that determines the thermostability has been identified in the multiple alignments of the lipase proteins. The results shed light on the distribution of lipase-producing bacilli in geothermal springs in Armenia and Nagorno Karabakh. Newly isolated bacilli strains could be a prospective source of thermostable lipases and their genes.
RBT-GA: a novel metaheuristic for solving the multiple sequence alignment problem
Taheri, Javid; Zomaya, Albert Y
2009-01-01
Background Multiple Sequence Alignment (MSA) has always been an active area of research in Bioinformatics. MSA is mainly focused on discovering biologically meaningful relationships among different sequences or proteins in order to investigate the underlying main characteristics/functions. This information is also used to generate phylogenetic trees. Results This paper presents a novel approach, namely RBT-GA, to solve the MSA problem using a hybrid solution methodology combining the Rubber Band Technique (RBT) and the Genetic Algorithm (GA) metaheuristic. RBT is inspired by the behavior of an elastic Rubber Band (RB) on a plate with several poles, which is analogous to locations in the input sequences that could potentially be biologically related. A GA attempts to mimic the evolutionary processes of life in order to locate optimal solutions in an often very complex landscape. RBT-GA is a population-based optimization algorithm designed to find the optimal alignment for a set of input protein sequences. In this novel technique, each alignment answer is modeled as a chromosome consisting of several poles in the RBT framework. These poles resemble locations in the input sequences that are most likely to be correlated and/or biologically related. A GA-based optimization process improves these chromosomes gradually, yielding a set of mostly optimal answers for the MSA problem. Conclusion RBT-GA is tested with one of the well-known benchmark suites (BALiBASE 2.0) in this area. The obtained results show the superiority of the proposed technique even in the case of formidable sequences. PMID:19594869
Bandeira, Nuno; Clauser, Karl R; Pevzner, Pavel A
2007-07-01
Despite significant advances in the identification of known proteins, the analysis of unknown proteins by MS/MS still remains a challenging open problem. Although Klaus Biemann recognized the potential of MS/MS for sequencing of unknown proteins in the 1980s, low throughput Edman degradation followed by cloning still remains the main method to sequence unknown proteins. The automated interpretation of MS/MS spectra has been limited by a focus on individual spectra and has not capitalized on the information contained in spectra of overlapping peptides. Indeed the powerful shotgun DNA sequencing strategies have not been extended to automated protein sequencing. We demonstrate, for the first time, the feasibility of automated shotgun protein sequencing of protein mixtures by utilizing MS/MS spectra of overlapping and possibly modified peptides generated via multiple proteases of different specificities. We validate this approach by generating highly accurate de novo reconstructions of multiple regions of various proteins in western diamondback rattlesnake venom. We further argue that shotgun protein sequencing has the potential to overcome the limitations of current protein sequencing approaches and thus catalyze the otherwise impractical applications of proteomics methodologies in studies of unknown proteins.
Screening for SNPs with Allele-Specific Methylation based on Next-Generation Sequencing Data.
Hu, Bo; Ji, Yuan; Xu, Yaomin; Ting, Angela H
2013-05-01
Allele-specific methylation (ASM) has long been studied but mainly documented in the context of genomic imprinting and X chromosome inactivation. Taking advantage of the next-generation sequencing technology, we conduct a high-throughput sequencing experiment with four prostate cell lines to survey the whole genome and identify single nucleotide polymorphisms (SNPs) with ASM. A Bayesian approach is proposed to model the counts of short reads for each SNP conditional on its genotypes of multiple subjects, leading to a posterior probability of ASM. We flag SNPs with high posterior probabilities of ASM by accounting for multiple comparisons based on posterior false discovery rates. Applying the Bayesian approach to the in-house prostate cell line data, we identify 269 SNPs as candidates of ASM. A simulation study is carried out to demonstrate the quantitative performance of the proposed approach.
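The flagging step described in the abstract above, selecting SNPs with high posterior probability while controlling a posterior false discovery rate, can be sketched generically. This is not the authors' model or code: it assumes the posterior probabilities of ASM are already available, and it uses one common rule (flag the largest set whose average posterior probability of being a false call stays at or below a target `alpha`). The function name, the example probabilities, and the value of `alpha` are all hypothetical.

```python
def flag_by_posterior_fdr(posteriors, alpha=0.05):
    """Return indices flagged so that the estimated posterior FDR <= alpha.

    posteriors[i] is the posterior probability that item i is a true signal.
    The estimated FDR of a flagged set is the mean of (1 - posterior) over it.
    Generic rule for illustration; not taken from the paper.
    """
    # Visit items from most to least probable.
    order = sorted(range(len(posteriors)),
                   key=lambda i: posteriors[i], reverse=True)
    flagged = []
    cumulative_error = 0.0
    for rank, i in enumerate(order, start=1):
        cumulative_error += 1.0 - posteriors[i]
        # The running mean never decreases, so stop at the first violation.
        if cumulative_error / rank > alpha:
            break
        flagged.append(i)
    return flagged


# Made-up posterior probabilities for five hypothetical SNPs.
example = [0.99, 0.40, 0.97, 0.90, 0.10]
flagged = flag_by_posterior_fdr(example, alpha=0.05)
print(flagged)
```

In this toy input the three most probable items are flagged: their mean posterior error is (0.01 + 0.03 + 0.10) / 3, about 0.047, and adding the fourth item would raise it well above 0.05.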
NASA Astrophysics Data System (ADS)
Gopinath, T.; Veglia, Gianluigi
2016-06-01
Conventional multidimensional magic angle spinning (MAS) solid-state NMR (ssNMR) experiments detect the signal arising from the decay of a single coherence transfer pathway (FID), resulting in one spectrum per acquisition time. Recently, we introduced two new strategies, namely DUMAS (DUal acquisition Magic Angle Spinning) and MEIOSIS (Multiple ExperIments via Orphan SpIn operatorS), that enable the simultaneous acquisition of multidimensional ssNMR experiments using multiple coherence transfer pathways. Here, we combined the main elements of DUMAS and MEIOSIS to harness both orphan spin operators and residual polarization and increase the number of simultaneous acquisitions. We show that it is possible to acquire up to eight two-dimensional experiments using four acquisition periods per scan. This new suite of pulse sequences, called MAeSTOSO for Multiple Acquisitions via Sequential Transfer of Orphan Spin pOlarization, relies on residual polarization of both 13C and 15N pathways and combines low- and high-sensitivity experiments into a single pulse sequence using one receiver and commercial ssNMR probes. The acquisition of multiple experiments does not affect the sensitivity of the main experiment; rather, it recovers coherences that would otherwise be discarded, resulting in a significant gain in experimental time. Both merits and limitations of this approach are discussed.
Oliani, L C; Lidani, K C F; Gabriel, J E
2015-10-16
MyoD and MyoG are transcription factors that have essential roles in myogenic lineage determination and muscle differentiation. The purpose of this study was to compare multiple amino acid sequences of myogenic regulatory proteins to infer evolutionary relationships among chordates. Protein sequences from mouse Mus musculus (P10085 and P12979), human Homo sapiens (P15172 and P15173), bovine Bos taurus (Q7YS82 and Q7YS81), wild pig Sus scrofa (P49811 and P49812), quail Coturnix coturnix (P21572 and P34060), chicken Gallus gallus (P16075 and P17920), rat Rattus norvegicus (Q02346 and P20428), domestic water buffalo Bubalus bubalis (D2SP11 and A7L034), and sheep Ovis aries (Q90477 and D3YKV7) were retrieved from the non-redundant protein sequence database UniProtKB/Swiss-Prot, and subsequently analyzed using the Mega6.0 software. MyoD evolutionary analyses revealed the presence of three main clusters with all mammals branched in one cluster, members of the order Rodentia (mouse and rat) in a second branch linked to the first, and birds of the order Galliformes (chicken and quail) remaining isolated in a third. MyoG evolutionary analyses aligned sequences in two main clusters, all mammalian specimens grouped in different sub-branches, and birds clustered in a second branch. These analyses suggest that the evolution of MyoD and MyoG was driven by different pathways.
Chamings, Anthony; Nelson, Tiffanie M; Vibin, Jessy; Wille, Michelle; Klaassen, Marcel; Alexandersen, Soren
2018-04-13
We evaluated the presence of coronaviruses by PCR in 918 Australian wild bird samples collected during 2016-17. Coronaviruses were detected in 141 samples (15.3%) from species of ducks, shorebirds and herons and from multiple sampling locations. Sequencing of selected positive samples found mainly gammacoronaviruses, but also some deltacoronaviruses. The detection rate of coronaviruses was improved by using multiple PCR assays, as no single assay could detect all coronavirus positive samples. Sequencing of the relatively conserved Orf1 PCR amplicons found that Australian duck gammacoronaviruses were similar to duck gammacoronaviruses around the world. Some sequenced shorebird gammacoronaviruses belonged to Charadriiformes lineages, but others were more closely related to duck gammacoronaviruses. Australian duck and heron deltacoronaviruses belonged to lineages with other duck and heron deltacoronaviruses, but were almost 20% different in nucleotide sequence to other deltacoronavirus sequences available. Deltacoronavirus sequences from shorebirds formed a lineage with a deltacoronavirus from a ruddy turnstone detected in the United States. Given that Australian duck gammacoronaviruses are highly similar to those found in other regions, and Australian ducks rarely come into contact with migratory Palearctic duck species, we hypothesise that migratory shorebirds are the important vector for moving wild bird coronaviruses into and out of Australia.
Resolution enhancement using a new multiple-pulse decoupling sequence for quadrupolar nuclei.
Delevoye, L; Trébosc, J; Gan, Z; Montagne, L; Amoureux, J-P
2007-05-01
A new decoupling composite pulse sequence is proposed to remove the broadening of spin S=1/2 magic-angle spinning (MAS) spectra arising from the scalar coupling with a quadrupolar nucleus I. It is illustrated on the (31)P spectrum of an aluminophosphate, AlPO(4)-14, which is broadened by the presence of (27)Al/(31)P scalar couplings. The multiple-pulse (MP) sequence has the advantage over continuous-wave (CW) irradiation of efficiently cancelling the scalar dephasing without reintroducing the dipolar interaction. The MP decoupling sequence is first described in a rotor-synchronised version (RS-MP) where only one parameter needs to be adjusted. It clearly avoids the dipolar recoupling and so achieves a better resolution than the CW sequence. In a second, improved version, the MP sequence is experimentally studied in the vicinity of the perfect rotor-synchronised conditions. The full width at half maximum (FWHM) of 65 Hz using (27)Al CW decoupling decreases to 48 Hz with RS-MP decoupling and to 30 Hz with rotor-asynchronised MP (RA-MP) decoupling. The main phenomena are explained using both experimental results and numerical simulations.
ASCA X-ray observations of pre-main-sequence stars
NASA Technical Reports Server (NTRS)
Skinner, S. L.; Walter, F. M.; Yamauchi, S.
1996-01-01
The results of recent Advanced Satellite for Cosmology and Astrophysics (ASCA) X-ray observations of two pre-main sequence stars are presented: the weak emission line T Tauri star HD 142361, and the Herbig Ae star HD 104237. The solid state imaging spectrometer spectra for HD 142361 show a clear emission line from H-like Mg XII, and spectral fits reveal a multiple temperature plasma with a hot component of at least 16 MK. The spectra of HD 104237 show a complex temperature structure with the hottest plasma at temperatures of greater than 30 MK. It is concluded that mechanisms that predict only soft X-ray emission can be dismissed for Herbig Ae stars.
Screening for SNPs with Allele-Specific Methylation based on Next-Generation Sequencing Data
Hu, Bo; Xu, Yaomin
2013-01-01
Allele-specific methylation (ASM) has long been studied but mainly documented in the context of genomic imprinting and X chromosome inactivation. Taking advantage of the next-generation sequencing technology, we conduct a high-throughput sequencing experiment with four prostate cell lines to survey the whole genome and identify single nucleotide polymorphisms (SNPs) with ASM. A Bayesian approach is proposed to model the counts of short reads for each SNP conditional on its genotypes of multiple subjects, leading to a posterior probability of ASM. We flag SNPs with high posterior probabilities of ASM by accounting for multiple comparisons based on posterior false discovery rates. Applying the Bayesian approach to the in-house prostate cell line data, we identify 269 SNPs as candidates of ASM. A simulation study is carried out to demonstrate the quantitative performance of the proposed approach. PMID:23710259
Bonizzoni, Paola; Rizzi, Raffaella; Pesole, Graziano
2005-10-05
Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems, hence the need to develop novel strategies. We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid the limitations of splice site prediction (mainly, over-predictions) due to independent single EST alignments to the genomic sequence, our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment, and it adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Extensive benchmarking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.
Stars caught in the braking stage in young Magellanic Cloud clusters
NASA Astrophysics Data System (ADS)
D'Antona, Francesca; Milone, Antonino P.; Tailo, Marco; Ventura, Paolo; Vesperini, Enrico; di Criscienzo, Marcella
2017-08-01
The colour-magnitude diagrams of many Magellanic Cloud clusters (with ages up to 2 billion years) display extended turnoff regions where the stars leave the main sequence, suggesting the presence of multiple stellar populations with ages that may differ even by hundreds of millions of years [1,2,3]. A strongly debated question is whether such an extended turnoff is instead due to populations with different stellar rotations [3,4,5,6]. The recent discovery of a 'split' main sequence in some younger clusters (~80-400 Myr) added another piece to this puzzle. The blue side of the main sequence is consistent with slowly rotating stellar models, and the red side consistent with rapidly rotating models [7,8,9,10]. However, a complete theoretical characterization of the observed colour-magnitude diagram also seemed to require an age spread [9]. We show here that, in the three clusters so far analysed, if the blue main-sequence stars are interpreted with models in which the stars have always been slowly rotating, they must be ~30% younger than the rest of the cluster. If they are instead interpreted as stars that were initially rapidly rotating but have later slowed down, the age difference disappears, and this 'braking' also helps to explain the apparent age differences of the extended turnoff. The age spreads in Magellanic Cloud clusters are thus a manifestation of rotational stellar evolution. Observational tests are suggested.
High-Resolution Spectroscopy of some very Active Southern Stars
NASA Technical Reports Server (NTRS)
Soderblom, David R.; King, Jeremy R.; Henry, Todd J.
1998-01-01
We have obtained high-resolution echelle spectra of 18 solar-type stars that an earlier survey showed to have very high levels of Ca II H and K emission. Most of these stars belong to close binary systems, but five remain as probable single stars or well-separated binaries that are younger than the Pleiades on the basis of their lithium abundances and Hα emission. Three of these probable single stars also lie more than 1 mag above the main sequence in a color-magnitude diagram, and appear to have ages of 10 to 15 Myr. Two of them, HD 202917 and HD 222259, also appear to have a kinematic association with the pre-main-sequence multiple system HD 98800.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Chengyuan; De Grijs, Richard; Deng, Licai, E-mail: joshuali@pku.edu.cn, E-mail: grijs@pku.edu.cn
2014-04-01
Using a combination of high-resolution Hubble Space Telescope/Wide-Field and Planetary Camera-2 observations, we explore the physical properties of the stellar populations in two intermediate-age star clusters, NGC 1831 and NGC 1868, in the Large Magellanic Cloud based on their color-magnitude diagrams. We show that both clusters exhibit extended main-sequence turn offs. To explain the observations, we consider variations in helium abundance, binarity, age dispersions, and the fast rotation of the clusters' member stars. The observed narrow main sequence excludes significant variations in helium abundance in both clusters. We first establish the clusters' main-sequence binary fractions using the bulk of the clusters' main-sequence stellar populations ≳ 1 mag below their turn-offs. The extent of the turn-off regions in color-magnitude space, corrected for the effects of binarity, implies that age spreads of order 300 Myr may be inferred for both clusters if the stellar distributions in color-magnitude space were entirely due to the presence of multiple populations characterized by an age range. Invoking rapid rotation of the population of cluster members characterized by a single age also allows us to match the observed data in detail. However, when taking into account the extent of the red clump in color-magnitude space, we encounter an apparent conflict for NGC 1831 between the age dispersion derived from that based on the extent of the main-sequence turn off and that implied by the compact red clump. We therefore conclude that, for this cluster, variations in stellar rotation rate are preferred over an age dispersion. For NGC 1868, both models perform equally well.
Searching for Partners of Cool Senior Citizens
NASA Astrophysics Data System (ADS)
Jao, Wei-Chun; Henry, T. J.
2012-01-01
Mass is one of the most fundamental parameters in stellar astronomy. In order to measure dynamical masses, one needs to find nearby binary systems that can be resolved and monitored, ideally with orbital periods that completely wrap in a reasonable amount of time. Many surveys have been made of nearby main sequence dwarfs, and their mass-luminosity relation is well established. As part of our Cool Subdwarf Investigations (CSI) program, we are searching for subdwarf binaries of spectral types K and M within 60 parsecs to measure their multiplicity rate and to reveal binaries appropriate for mass determinations. Here we present results of our CSI work using HST's Fine Guidance Sensors. When combined with previous CSI work and results in the literature, we find the multiplicity rate of subdwarfs, 21%, to be surprisingly low compared to that of similar main sequence K and M stars, 37%. This work has several implications, including that the star formation and/or evolution history of subdwarfs is different than for dwarfs, and that ideal systems for subdwarf mass determinations are difficult to find. This work is supported by HST grant GO-11943.
Influence of DNA sequence on the structure of minicircles under torsional stress
Wang, Qian; Irobalieva, Rossitza N.; Chiu, Wah; Schmid, Michael F.; Fogg, Jonathan M.; Zechiedrich, Lynn
2017-01-01
Abstract The sequence dependence of the conformational distribution of DNA under various levels of torsional stress is an important unsolved problem. Combining theory and coarse-grained simulations shows that the DNA sequence and a structural correlation due to topology constraints of a circle are the main factors that dictate the 3D structure of a 336 bp DNA minicircle under torsional stress. We found that DNA minicircle topoisomers can have multiple bend locations under high torsional stress and that the positions of these sharp bends are determined by the sequence, and by a positive mechanical correlation along the sequence. We showed that simulations and theory are able to provide sequence-specific information about individual DNA minicircles observed by cryo-electron tomography (cryo-ET). We provided a sequence-specific cryo-ET tomogram fitting of DNA minicircles, registering the sequence within the geometric features. Our results indicate that the conformational distribution of minicircles under torsional stress can be designed, which has important implications for using minicircle DNA for gene therapy. PMID:28609782
NASA Technical Reports Server (NTRS)
Reder, Leonard J.; Booth, Andrew; Hsieh, Jonathan; Summers, Kellee
2004-01-01
This paper presents a discussion of the evolution of a sequencer from a simple EPICS (Experimental Physics and Industrial Control System) based sequencer into a complex implementation designed utilizing UML (Unified Modeling Language) methodologies and a CASE (Computer Aided Software Engineering) tool approach. The main purpose of the sequencer (called the IF Sequencer) is to provide overall control of the Keck Interferometer to enable science operations to be carried out by a single operator (and/or observer). The interferometer links the two 10m telescopes of the W. M. Keck Observatory at Mauna Kea, Hawaii. The IF Sequencer is a high-level, multi-threaded, Harel finite state machine software program designed to orchestrate several lower-level hardware and software hard real-time subsystems that must perform their work in a specific and sequential order. The sequencing need not be done in hard real-time. Each state machine thread commands either a high-speed real-time multiple mode embedded controller via CORBA, or slower controllers via EPICS Channel Access interfaces. The overall operation of the system is simplified by the automation. The UML is discussed and our use of it to implement the sequencer is presented. The decision to use the Rhapsody product as our CASE tool is explained and reflected upon. Most importantly, a section on lessons learned is presented and the difficulty of integrating CASE tool automatically generated C++ code into a large control system consisting of multiple infrastructures is presented.
NASA Astrophysics Data System (ADS)
Reder, Leonard J.; Booth, Andrew; Hsieh, Jonathan; Summers, Kellee R.
2004-09-01
This paper presents a discussion of the evolution of a sequencer from a simple Experimental Physics and Industrial Control System (EPICS) based sequencer into a complex implementation designed utilizing UML (Unified Modeling Language) methodologies and a Computer Aided Software Engineering (CASE) tool approach. The main purpose of the Interferometer Sequencer (called the IF Sequencer) is to provide overall control of the Keck Interferometer to enable science operations to be carried out by a single operator (and/or observer). The interferometer links the two 10m telescopes of the W. M. Keck Observatory at Mauna Kea, Hawaii. The IF Sequencer is a high-level, multi-threaded, Harel finite state machine software program designed to orchestrate several lower-level hardware and software hard real-time subsystems that must perform their work in a specific and sequential order. The sequencing need not be done in hard real-time. Each state machine thread commands either a high-speed real-time multiple mode embedded controller via CORBA, or slower controllers via EPICS Channel Access interfaces. The overall operation of the system is simplified by the automation. The UML is discussed and our use of it to implement the sequencer is presented. The decision to use the Rhapsody product as our CASE tool is explained and reflected upon. Most importantly, a section on lessons learned is presented and the difficulty of integrating CASE tool automatically generated C++ code into a large control system consisting of multiple infrastructures is presented.
An unbiased study of debris discs around A-type stars with Herschel
NASA Astrophysics Data System (ADS)
Thureau, N. D.; Greaves, J. S.; Matthews, B. C.; Kennedy, G.; Phillips, N.; Booth, M.; Duchêne, G.; Horner, J.; Rodriguez, D. R.; Sibthorpe, B.; Wyatt, M. C.
2014-12-01
The Herschel DEBRIS (Disc Emission via a Bias-free Reconnaissance in the Infrared/Submillimetre) survey brings us a unique perspective on the study of debris discs around main-sequence A-type stars. Bias-free by design, the survey offers a remarkable data set with which to investigate the cold disc properties. The statistical analysis of the 100 and 160 μm data for 86 main-sequence A stars yields a lower than previously found debris disc rate. Considering better than 3σ excess sources, we find a detection rate ≥24 ± 5 per cent at 100 μm which is similar to the debris disc rate around main-sequence F/G/K-spectral type stars. While the 100 and 160 μm excesses slowly decline with time, debris discs with large excesses are found around some of the oldest A stars in our sample, evidence that the debris phenomenon can survive throughout the length of the main sequence (~1 Gyr). Debris discs are predominantly detected around the youngest and hottest stars in our sample. Stellar properties such as metallicity are found to have no effect on the debris disc incidence. Debris discs are found around A stars in single systems and multiple systems at similar rates. While tight and wide binaries (<1 and >100 au, respectively) host debris discs with a similar frequency and global properties, no intermediate separation debris systems were detected in our sample.
Typing and comparative genome analysis of Brucella melitensis isolated from Lebanon.
Abou Zaki, Natalia; Salloum, Tamara; Osman, Marwan; Rafei, Rayane; Hamze, Monzer; Tokajian, Sima
2017-10-16
Brucella melitensis is the main causative agent of the zoonotic disease brucellosis. This study aimed at typing and characterizing genetic variation in 33 Brucella isolates recovered from patients in Lebanon. Bruce-ladder multiplex PCR and PCR-RFLP of omp31, omp2a and omp2b were performed. Sixteen representative isolates were chosen for draft-genome sequencing and analyzed to determine variations in virulence, resistance, genomic islands, prophages and insertion sequences. Comparative whole-genome single nucleotide polymorphism analysis was also performed. The isolates were confirmed to be B. melitensis. Genome analysis revealed multiple virulence determinants and efflux pumps. Genome comparisons and single nucleotide polymorphisms divided the isolates based on geographical distribution but revealed high levels of similarity between the strains. Sequence divergence in B. melitensis was mainly due to lateral gene transfer of mobile elements. This is the first report of an in-depth genomic characterization of B. melitensis in Lebanon. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Paparini, Andrea; Yang, Rongchang; Chen, Linda; Tong, Kaising; Gibson-Kueh, Susan; Lymbery, Alan; Ryan, Una M
2017-11-01
Currently, the systematics, biology and epidemiology of piscine Cryptosporidium species are poorly understood. Here, we compared Sanger and next-generation sequencing (NGS) of piscine Cryptosporidium at the 18S rRNA and actin genes. The hosts comprised 11 ornamental fish species, spanning four orders and eight families. The objectives were to: (i) confirm the rich genetic diversity of the parasite and the high frequency of mixed infections; and (ii) explore the potential of NGS in the presence of complex genetic mixtures. By Sanger sequencing, four main genotypes were obtained at the actin locus, while for the 18S locus, seven genotypes were identified. At both loci, NGS revealed frequent mixed infections, consisting of one highly dominant variant plus substantially rarer genotypes. Both sequencing methods detected novel Cryptosporidium genotypes at both loci, including a novel and highly abundant actin genotype that was identified by both Sanger sequencing and NGS. Importantly, this genotype accounted for 68.9% of all NGS reads from all samples (249 585/362 372). The present study confirms that aquarium fish can harbour a large and unexplored Cryptosporidium genetic diversity. Although commonly used in molecular parasitology studies, nested PCR prevents quantitative comparisons and thwarts the advantages of NGS, when this latter approach is used to investigate multiple infections.
EXTENDED STAR FORMATION IN THE INTERMEDIATE-AGE LARGE MAGELLANIC CLOUD STAR CLUSTER NGC 2209
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keller, Stefan C.; Mackey, A. Dougal; Da Costa, Gary S.
2012-12-10
We present observations of the 1 Gyr old star cluster NGC 2209 in the Large Magellanic Cloud made with the GMOS imager on the Gemini South Telescope. These observations show that the cluster exhibits a main-sequence turnoff that spans a broader range in luminosity than can be explained by a single-aged stellar population. This places NGC 2209 amongst a growing list of intermediate-age (1-3 Gyr) clusters that show evidence for extended or multiple epochs of star formation of between 50 and 460 Myr in extent. The extended main-sequence turnoff observed in NGC 2209 is a confirmation of the prediction in Keller et al. made on the basis of the cluster's large core radius. We propose that secondary star formation is a defining feature of the evolution of massive star clusters. Dissolution of lower mass clusters through evaporation results in only clusters that have experienced secondary star formation surviving for a Hubble time, thus providing a natural connection between the extended main-sequence turnoff phenomenon and the ubiquitous light-element abundance ranges seen in the ancient Galactic globular clusters.
A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.
Luczak, Brian B; James, Benjamin T; Girgis, Hani Z
2017-12-06
Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. 
Published by Oxford University Press.
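The histogram-based comparison described in the abstract above can be illustrated with a minimal sketch. It assumes overlapping k-mers over plain strings and shows only two generic statistics (Manhattan distance and cosine similarity); the function names are hypothetical, and this is not the authors' benchmarking tool or their set of 33 statistics.

```python
from collections import Counter
import math


def kmer_histogram(seq, k):
    """Count every overlapping k-mer in seq (illustrative helper, not from the paper)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan(h1, h2):
    """Sum of absolute count differences over all k-mers seen in either histogram."""
    # Counter returns 0 for a missing key, so absent k-mers need no special case.
    return sum(abs(h1[key] - h2[key]) for key in set(h1) | set(h2))


def cosine_similarity(h1, h2):
    """Cosine of the angle between the two count vectors; 0.0 if either is empty."""
    dot = sum(h1[key] * h2[key] for key in h1.keys() & h2.keys())
    norm1 = math.sqrt(sum(v * v for v in h1.values()))
    norm2 = math.sqrt(sum(v * v for v in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


if __name__ == "__main__":
    a = kmer_histogram("ACGTACGT", 2)
    b = kmer_histogram("ACGTTCGT", 2)
    print(manhattan(a, b), round(cosine_similarity(a, b), 3))
```

Both statistics are computed in time linear in the number of distinct k-mers, which is the property that makes such measures usable as a fast pre-filter before a quadratic alignment.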
A Search for Strong Radio Emission from the Magnetic Interactions of Trappist-1 and its Satellites
NASA Astrophysics Data System (ADS)
Pineda, J. Sebastian; Hallinan, Gregg
2018-06-01
The first nearby very-low mass star planet-host discovered, Trappist-1, presents not only a unique opportunity for studying a system of multiple terrestrial planets, but a means to examine the possibility of significant star-planet magnetic interactions at the end of the main sequence. These very-low mass stars and brown dwarfs have been observationally confirmed as capable of generating strong radio emissions produced by the electron cyclotron maser instability as a consequence of currents coupling the magnetospheric environment to the stellar atmosphere. However, multiple electrodynamic mechanisms have been proposed to power these magnetospheric processes, including a potentially significant role for short-period satellites analogous to the auroral interactions between Jupiter and its moons or the Sun and the solar system planets. With multiple close in terrestrial satellites, the Trappist-1 system is an important test case of these potential theories. We present a search for these radio emissions from the seven-planet Trappist-1 system using the Karl G. Jansky Very Large Array, looking for both highly circularly polarized radio emission and persistent quiescent emissions at GHz frequencies. We place these observations in the context of the possible electrodynamic engines driving radio emissions in very-low mass stars and brown dwarfs, and their relation to magnetic field topology, with implications for future radio surveys of planet-hosts at the end of the main sequence.
A brief introduction to web-based genome browsers.
Wang, Jun; Kong, Lei; Gao, Ge; Luo, Jingchu
2013-03-01
A genome browser provides a graphical interface for users to browse, search, retrieve and analyze genomic sequence and annotation data. Web-based genome browsers can be classified into general genome browsers with multiple species and species-specific genome browsers. In this review, we attempt to give an overview of the main functions and features of web-based genome browsers, covering data visualization, retrieval, analysis and customization. To give a brief introduction to the multiple-species genome browser, we describe the user interface and main functions of the Ensembl and UCSC genome browsers using the human alpha-globin gene cluster as an example. We further use the MSU and the Rice-Map genome browsers to show some special features of species-specific genome browsers, taking a rice transcription factor gene OsSPL14 as an example.
ERIC Educational Resources Information Center
Noell, George H.; Gresham, Frank M.
2001-01-01
Describes design logic and potential uses of a variant of the multiple-baseline design. The multiple-baseline multiple-sequence (MBL-MS) consists of multiple-baseline designs that are interlaced with one another and include all possible sequences of treatments. The MBL-MS design appears to be primarily useful for comparison of treatments taking…
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
ERIC Educational Resources Information Center
Olander, Clas; Wickman, Per-Olof; Tytler, Russell; Ingerman, Åke
2018-01-01
The aim of this article is to investigate students' meaning-making processes of multiple representations during a teaching sequence about the human body in lower secondary school. Two main influences are brought together to accomplish the analysis: on the one hand, theories on signs and representations as scaffoldings for learning and, on the…
NASA Astrophysics Data System (ADS)
Calamida, A.; Strampelli, G.; Rest, A.; Bono, G.; Ferraro, I.; Saha, A.; Iannicola, G.; Scolnic, D.; James, D.; Smith, C.; Zenteno, A.
2017-04-01
We present a multi-band photometric catalog of ≈1.7 million cluster members for a field of view of ≈2° × 2° across ω Cen. Photometry is based on images collected with the Dark Energy Camera on the 4 m Blanco telescope and the Advanced Camera for Surveys on the Hubble Space Telescope. The unprecedented photometric accuracy and field coverage allowed us, for the first time, to investigate the spatial distribution of ω Cen multiple populations from the core to the tidal radius, confirming its very complex structure. We found that the frequency of blue main-sequence stars is increasing compared to red main-sequence stars starting from a distance of ≈25′ from the cluster center. Blue main-sequence stars also show a clumpy spatial distribution, with an excess in the northeast quadrant of the cluster pointing toward the direction of the Galactic center. Stars belonging to the reddest and faintest red-giant branch also show a more extended spatial distribution in the outskirts of ω Cen, a region never explored before. Both these stellar sub-populations, according to spectroscopic measurements, are more metal-rich compared to the cluster main stellar population. These findings, once confirmed, make ω Cen the only stellar system currently known where metal-rich stars have a more extended spatial distribution compared to metal-poor stars. Kinematic and chemical abundance measurements are now needed for stars in the external regions of ω Cen to better characterize the properties of these sub-populations. Based on observations made with the Dark Energy Camera (DECam) on the 4 m Blanco telescope (NOAO) under programs 2014A-0327, 2015A-0151, 2016A-0189, PIs: A. Calamida, A. Rest, and on observations made with the NASA/ESA Hubble Space Telescope, obtained by the Space Telescope Science Institute. STScI is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-26555.
Uptake, Results, and Outcomes of Germline Multiple-Gene Sequencing After Diagnosis of Breast Cancer.
Kurian, Allison W; Ward, Kevin C; Hamilton, Ann S; Deapen, Dennis M; Abrahamse, Paul; Bondarenko, Irina; Li, Yun; Hawley, Sarah T; Morrow, Monica; Jagsi, Reshma; Katz, Steven J
2018-05-10
Low-cost sequencing of multiple genes is increasingly available for cancer risk assessment. Little is known about uptake or outcomes of multiple-gene sequencing after breast cancer diagnosis in community practice. To examine the effect of multiple-gene sequencing on the experience and treatment outcomes for patients with breast cancer. For this population-based retrospective cohort study, patients with breast cancer diagnosed from January 2013 to December 2015 and accrued from SEER registries across Georgia and in Los Angeles, California, were surveyed (n = 5080, response rate = 70%). Responses were merged with SEER data and results of clinical genetic tests, either BRCA1 and BRCA2 (BRCA1/2) sequencing only or including additional other genes (multiple-gene sequencing), provided by 4 laboratories. Type of testing (multiple-gene sequencing vs BRCA1/2-only sequencing), test results (negative, variant of unknown significance, or pathogenic variant), patient experiences with testing (timing of testing, who discussed results), and treatment (strength of patient consideration of, and surgeon recommendation for, prophylactic mastectomy), and prophylactic mastectomy receipt. We defined a patient subgroup with higher pretest risk of carrying a pathogenic variant according to practice guidelines. Among 5026 patients (mean [SD] age, 59.9 [10.7]), 1316 (26.2%) were linked to genetic results from any laboratory. Multiple-gene sequencing increasingly replaced BRCA1/2-only testing over time: in 2013, the rate of multiple-gene sequencing was 25.6% and BRCA1/2-only testing, 74.4%;in 2015 the rate of multiple-gene sequencing was 66.5% and BRCA1/2-only testing, 33.5%. Multiple-gene sequencing was more often ordered by genetic counselors (multiple-gene sequencing, 25.5% and BRCA1/2-only testing, 15.3%) and delayed until after surgery (multiple-gene sequencing, 32.5% and BRCA1/2-only testing, 19.9%). 
Multiple-gene sequencing substantially increased rate of detection of any pathogenic variant (multiple-gene sequencing: higher-risk patients, 12%; average-risk patients, 4.2% and BRCA1/2-only testing: higher-risk patients, 7.8%; average-risk patients, 2.2%) and variants of uncertain significance, especially in minorities (multiple-gene sequencing: white patients, 23.7%; black patients, 44.5%; and Asian patients, 50.9% and BRCA1/2-only testing: white patients, 2.2%; black patients, 5.6%; and Asian patients, 0%). Multiple-gene sequencing was not associated with an increase in the rate of prophylactic mastectomy use, which was highest with pathogenic variants in BRCA1/2 (BRCA1/2, 79.0%; other pathogenic variant, 37.6%; variant of uncertain significance, 30.2%; negative, 35.3%). Multiple-gene sequencing rapidly replaced BRCA1/2-only testing for patients with breast cancer in the community and enabled 2-fold higher detection of clinically relevant pathogenic variants without an associated increase in prophylactic mastectomy. However, important targets for improvement in the clinical utility of multiple-gene sequencing include postsurgical delay and racial/ethnic disparity in variants of uncertain significance.
Discovery and characterization of 3000+ main-sequence binaries from APOGEE spectra
NASA Astrophysics Data System (ADS)
El-Badry, Kareem; Ting, Yuan-Sen; Rix, Hans-Walter; Quataert, Eliot; Weisz, Daniel R.; Cargile, Phillip; Conroy, Charlie; Hogg, David W.; Bergemann, Maria; Liu, Chao
2018-05-01
We develop a data-driven spectral model for identifying and characterizing spatially unresolved multiple-star systems and apply it to APOGEE DR13 spectra of main-sequence stars. Binaries and triples are identified as targets whose spectra can be significantly better fit by a superposition of two or three model spectra, drawn from the same isochrone, than any single-star model. From an initial sample of ~20 000 main-sequence targets, we identify ~2500 binaries in which both the primary and secondary stars contribute detectably to the spectrum, simultaneously fitting for the velocities and stellar parameters of both components. We additionally identify and fit ~200 triple systems, as well as ~700 velocity-variable systems in which the secondary does not contribute detectably to the spectrum. Our model simplifies the process of simultaneously fitting single- or multi-epoch spectra with composite models and does not depend on a velocity offset between the two components of a binary, making it sensitive to traditionally undetectable systems with periods of hundreds or thousands of years. In agreement with conventional expectations, almost all the spectrally identified binaries with measured parallaxes fall above the main sequence in the colour-magnitude diagram. We find excellent agreement between spectrally and dynamically inferred mass ratios for the ~600 binaries in which a dynamical mass ratio can be measured from multi-epoch radial velocities. We obtain full orbital solutions for 64 systems, including 14 close binaries within hierarchical triples. We make available catalogues of stellar parameters, abundances, mass ratios, and orbital parameters.
The V-band Empirical Mass-luminosity Relation for Main Sequence Stars
NASA Astrophysics Data System (ADS)
Xia, Fang; Fu, Yan-Ning
2010-07-01
Stellar mass is an indispensable parameter in the studies of stellar physics and stellar dynamics. On the one hand, the most reliable way to determine the stellar dynamical mass is via orbital determinations of binaries. On the other hand, however, most stellar masses have to be estimated by using the mass-luminosity relation (MLR). Therefore, it is important to obtain the empirical MLR through fitting the data of stellar dynamical mass and luminosity. The effect of metallicity can make this relation disperse in the V-band, but studies show that this is mainly limited to the case when the stellar mass is less than 0.6M⊙. Recently, many relevant data have been accumulated for main sequence stars with larger masses, which make it possible to significantly improve the corresponding MLR. Using a fitting method which can reasonably assign weights to the observational data including two quantities with different dimensions, we obtain a V-band MLR based on the dynamical masses and luminosities of 203 main sequence stars. In comparison with the previous work, the improved MLR is statistically significant, and the relative error of mass estimation reaches about 5%. Therefore, our MLR is useful not only in the studies of statistical nature, but also in the studies of concrete stellar systems, such as the long-term dynamical study and the short-term positioning study of a specific multiple star system.
The V Band Empirical Mass-Luminosity Relation for Main Sequence Stars
NASA Astrophysics Data System (ADS)
Xia, F.; Fu, Y. N.
2010-01-01
Stellar mass is an indispensable parameter in the studies of stellar physics and stellar dynamics. On the one hand, the most reliable way to determine the stellar dynamical mass is via orbital determination of binaries. On the other hand, however, most stellar masses have to be estimated by using the mass-luminosity relation (MLR). Therefore, it is important to obtain the empirical MLR through fitting the data of stellar dynamical mass and luminosity. The effect of metallicity can make this relation disperse in the V-band, but studies show that this is mainly limited to the case when the stellar mass is less than 0.6M⊙. Recently, many relevant data have been accumulated for main sequence stars with larger mass, which make it possible to significantly improve the corresponding MLR. Using a fitting method which can reasonably assign weight to the observational data including two quantities with different dimensions, we obtain a V-band MLR based on the dynamical masses and luminosities of 203 main sequence stars. Compared with the previous work, the improved MLR is statistically significant, and the relative error of mass estimation reaches about 5%. Therefore, our MLR is useful not only in studies of statistical nature, but also in studies of concrete stellar systems, such as the long-term dynamical study and the short-term positioning study of a specific multiple star system.
Brütting, Christine; Emmer, Alexander; Kornhuber, Malte; Staege, Martin S
2016-08-01
Although multiple sclerosis (MS) is one of the most common central nervous system diseases in young adults, little is known about its etiology. Several human endogenous retroviruses (ERVs) are considered to play a role in MS. We are interested in which ERVs can be identified in the vicinity of MS-associated genetic markers, in order to find potential initiators of MS. We analysed the chromosomal regions surrounding 58 single nucleotide polymorphisms (SNPs) that are associated with MS, identified in one of the last major genome-wide association studies. We scanned these regions for putative endogenous retrovirus sequences with large open reading frames (ORFs). We observed that more retrovirus-related putative ORFs exist in the relatively close vicinity of SNP marker indices in multiple sclerosis compared to control SNPs. We found very high homologies to HERV-K, HCML-ARV, XMRV, Galidia ERV, HERV-H/env62 and XMRV-like mouse endogenous retrovirus mERV-XL. The associated genes (CYP27B1, CD6, CD58, MPV17L2, IL12RB1, CXCR5, PTGER4, TAGAP, TYK2, ICAM3, CD86, GALC, GPR65 as well as the HLA DRB1*1501) are mainly involved in the immune system, but also in vitamin D regulation. The most frequently detected ERV sequences are related to the multiple sclerosis-associated retrovirus, the human immunodeficiency virus 1, HERV-K, and the Simian foamy virus. Our data show that there is a relation between MS-associated SNPs and the number of retroviral elements compared to controls. Our data identify new ERV sequences that have not so far been associated with MS.
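Scanning a genomic region for open reading frames, as mentioned in the abstract above, can be sketched as follows. This is a simplified illustration under stated assumptions: an ORF is taken to run from an ATG to the first in-frame stop codon, only the three forward-strand frames are searched, and the `find_orfs` name and `min_codons` threshold are hypothetical. The abstract does not specify the authors' ORF definition or software.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for forward-strand ORFs of at least min_codons.

    Assumption: an ORF starts at the first ATG in a frame and ends at the next
    in-frame stop codon (stop included in the count). Coordinates are 0-based,
    end-exclusive. The reverse strand is not searched in this sketch.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close it and look for the next ATG
    return orfs


if __name__ == "__main__":
    region = "CCATGAAATTTTAGGG"
    for start, end, frame in find_orfs(region):
        print(frame, start, end, region[start:end])
```

In a study like the one described, a much larger `min_codons` would be needed to select only "large" ORFs, and the reverse complement would also have to be scanned; both are left out here to keep the sketch short.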
Finding the target sites of RNA-binding proteins
Li, Xiao; Kazan, Hilal; Lipshitz, Howard D; Morris, Quaid D
2014-01-01
RNA–protein interactions differ from DNA–protein interactions because of the central role of RNA secondary structure. Some RNA-binding domains (RBDs) recognize their target sites mainly by their shape and geometry and others are sequence-specific but are sensitive to secondary structure context. A number of small- and large-scale experimental approaches have been developed to measure RNAs associated in vitro and in vivo with RNA-binding proteins (RBPs). Generalizing outside of the experimental conditions tested by these assays requires computational motif finding. Often RBP motif finding is done by adapting DNA motif finding methods; but modeling secondary structure context leads to better recovery of RBP-binding preferences. Genome-wide assessment of mRNA secondary structure has recently become possible, but these data must be combined with computational predictions of secondary structure before they add value in predicting in vivo binding. There are two main approaches to incorporating structural information into motif models: supplementing primary sequence motif models with preferred secondary structure contexts (e.g., MEMERIS and RNAcontext) and directly modeling secondary structure recognized by the RBP using stochastic context-free grammars (e.g., CMfinder and RNApromo). The former better reconstruct known binding preferences for sequence-specific RBPs but are not suitable for modeling RBPs that recognize shape and geometry of RNAs. Future work in RBP motif finding should incorporate interactions between multiple RBDs and multiple RBPs in binding to RNA. WIREs RNA 2014, 5:111–130. doi: 10.1002/wrna.1201 PMID:24217996
DOE Office of Scientific and Technical Information (OSTI.GOV)
Calamida, A.; Saha, A.; Strampelli, G.
2017-04-01
We present a multi-band photometric catalog of ≈1.7 million cluster members for a field of view of ≈2° × 2° across ω Cen. Photometry is based on images collected with the Dark Energy Camera on the 4 m Blanco telescope and the Advanced Camera for Surveys on the Hubble Space Telescope. The unprecedented photometric accuracy and field coverage allowed us, for the first time, to investigate the spatial distribution of ω Cen multiple populations from the core to the tidal radius, confirming its very complex structure. We found that the frequency of blue main-sequence stars is increasing compared to red main-sequence stars starting from a distance of ≈25′ from the cluster center. Blue main-sequence stars also show a clumpy spatial distribution, with an excess in the northeast quadrant of the cluster pointing toward the direction of the Galactic center. Stars belonging to the reddest and faintest red-giant branch also show a more extended spatial distribution in the outskirts of ω Cen, a region never explored before. Both these stellar sub-populations, according to spectroscopic measurements, are more metal-rich compared to the cluster main stellar population. These findings, once confirmed, make ω Cen the only stellar system currently known where metal-rich stars have a more extended spatial distribution compared to metal-poor stars. Kinematic and chemical abundance measurements are now needed for stars in the external regions of ω Cen to better characterize the properties of these sub-populations.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karaiskos, Pantelis, E-mail: pkaraisk@med.uoa.gr; Gamma Knife Department, Hygeia Hospital, Athens; Moutsatsos, Argyris
Purpose: To propose, verify, and implement a simple and efficient methodology for the improvement of total geometric accuracy in multiple brain metastases gamma knife (GK) radiation surgery. Methods and Materials: The proposed methodology exploits the directional dependence of magnetic resonance imaging (MRI)-related spatial distortions stemming from background field inhomogeneities, also known as sequence-dependent distortions, with respect to the read-gradient polarity during MRI acquisition. First, an extra MRI pulse sequence is acquired with the same imaging parameters as those used for routine patient imaging, aside from a reversal in the read-gradient polarity. Then, “average” image data are compounded from data acquired from the 2 MRI sequences and are used for treatment planning purposes. The method was applied and verified in a polymer gel phantom irradiated with multiple shots in an extended region of the GK stereotactic space. Its clinical impact in dose delivery accuracy was assessed in 15 patients with a total of 96 relatively small (<2 cm) metastases treated with GK radiation surgery. Results: Phantom study results showed that use of average MR images eliminates the effect of sequence-dependent distortions, leading to a total spatial uncertainty of less than 0.3 mm, attributed mainly to gradient nonlinearities. In brain metastases patients, non-eliminated sequence-dependent distortions lead to target localization uncertainties of up to 1.3 mm (mean: 0.51 ± 0.37 mm) with respect to the corresponding target locations in the “average” MRI series. Due to these uncertainties, a considerable underdosage (5%-32% of the prescription dose) was found in 33% of the studied targets. Conclusions: The proposed methodology is simple and straightforward in its implementation. Regarding multiple brain metastases applications, the suggested approach may substantially improve total GK dose delivery accuracy in smaller, outlying targets.
Spray combustion model improvement study, 1
NASA Technical Reports Server (NTRS)
Chen, C. P.; Kim, Y. M.; Shang, H. M.
1993-01-01
This study involves the development of numerical and physical modeling in spray combustion. These modeling efforts are mainly motivated to improve the physical submodels of turbulence, combustion, atomization, dense spray effects, and group vaporization. The present mathematical formulation can be easily implemented in any time-marching multiple pressure correction methodologies such as the MAST code. A sequence of validation cases includes the nonevaporating, evaporating, and burning dense sprays.
Lidz, Barbara H.; Hine, A.C.; Shinn, Eugene A.; Kindinger, Jack G.
1991-01-01
High-resolution seismic-reflection profiles off the lower Florida Keys reveal a multiple outlier-reef tract system ~0.5 to 1.5 km seaward of the bank margin. The system is characterized by a massive, outer main reef tract of high (28 m) unburied relief that parallels the margin and at least two narrower, discontinuous reef tracts of lower relief between the main tract and the shallow bank-margin reefs. The outer tract is ~0.5 to 1 km wide and extends a distance of ~57 km. A single pass divides the outer tract into two main reefs. The outlier reefs developed on antecedent, low-gradient to horizontal offbank surfaces, interpreted to be Pleistocene beaches that formed terracelike features. Radiocarbon dates of a coral core from the outer tract confirm a pre-Holocene age. These multiple outlier reefs represent a new windward-margin model that presents a significant, unique mechanism for progradation of carbonate platforms during periods of sea-level fluctuation. Infilling of the back-reef terrace basins would create new terraced promontories and would extend or "step" the platform seaward for hundreds of metres. Subsequent outlier-reef development would produce laterally accumulating sequences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spiegel, David S.; Madhusudhan, Nikku, E-mail: dave@ias.edu, E-mail: Nikku.Madhusudhan@yale.edu
When the Sun ascends the red giant branch (RGB), its luminosity will increase and all the planets will receive much greater irradiation than they do now. Jupiter, in particular, might end up more highly irradiated than the hot Neptune GJ 436b and, hence, could appropriately be termed a 'hot Jupiter'. When their stars go through the RGB or asymptotic giant branch stages, many of the currently known Jupiter-mass planets in several-AU orbits will receive levels of irradiation comparable to the hot Jupiters, which will transiently increase their atmospheric temperatures to ≈1000 K or more. Furthermore, massive planets around post-main-sequence stars could accrete a non-negligible amount of material from the enhanced stellar winds, thereby significantly altering their atmospheric chemistry as well as causing a significant accretion luminosity during the epochs of most intense stellar mass loss. Future generations of infrared observatories might be able to probe the thermal and chemical structure of such hot Jupiters' atmospheres. Finally, we argue that, unlike their main-sequence analogs (whose zonal winds are thought to be organized in only a few broad, planetary-scale jets), red-giant hot Jupiters should have multiple, narrow jets of zonal winds and efficient day-night redistribution.
Methods for magnetic resonance analysis using magic angle technique
Hu, Jian Zhi [Richland, WA; Wind, Robert A [Kennewick, WA; Minard, Kevin R [Kennewick, WA; Majors, Paul D [Kennewick, WA
2011-11-22
Methods of performing a magnetic resonance analysis of a biological object are disclosed that include placing the object in a main magnetic field (that has a static field direction) and in a radio frequency field; rotating the object at a frequency of less than about 100 Hz around an axis positioned at an angle of about 54°44' relative to the main magnetic static field direction; pulsing the radio frequency to provide a sequence that includes a phase-corrected magic angle turning pulse segment; and collecting data generated by the pulsed radio frequency. In particular embodiments the method includes pulsing the radio frequency to provide at least two of a spatially selective read pulse, a spatially selective phase pulse, and a spatially selective storage pulse. Further disclosed methods provide pulse sequences that provide extended imaging capabilities, such as chemical shift imaging or multiple-voxel data acquisition.
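The angle of about 54°44' quoted above is the magic angle, θ = arccos(1/√3), at which the factor 3cos²θ − 1 vanishes. A short Python check follows; the function name is chosen here for illustration and does not come from the patent.

```python
import math

def magic_angle_degrees():
    """Return the angle (in degrees) at which 3*cos(theta)**2 - 1 equals zero."""
    return math.degrees(math.acos(1.0 / math.sqrt(3.0)))

angle = magic_angle_degrees()
degrees = int(angle)                   # whole degrees
arcminutes = (angle - degrees) * 60.0  # remaining fraction, in arcminutes
print(f"{degrees} deg {arcminutes:.1f} arcmin")
```

The result is roughly 54.74°, which is consistent with the "about 54°44'" given in the abstract.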
Hockenberry, Adam J; Pah, Adam R; Jewett, Michael C; Amaral, Luís A N
2017-01-01
Studies dating back to the 1970s established that sequence complementarity between the anti-Shine-Dalgarno (aSD) sequence on prokaryotic ribosomes and the 5' untranslated region of mRNAs helps to facilitate translation initiation. The optimal location of aSD sequence binding relative to the start codon, the full extents of the aSD sequence and the functional form of the relationship between aSD sequence complementarity and translation efficiency have not been fully resolved. Here, we investigate these relationships by leveraging the sequence diversity of endogenous genes and recently available genome-wide estimates of translation efficiency. We show that, after accounting for predicted mRNA structure, aSD sequence complementarity increases the translation of endogenous mRNAs by roughly 50%. Further, we observe that this relationship is nonlinear, with translation efficiency maximized for mRNAs with intermediate levels of aSD sequence complementarity. The mechanistic insights that we observe are highly robust: we find nearly identical results in multiple datasets spanning three distantly related bacteria. Further, we verify our main conclusions by re-analysing a controlled experimental dataset. © 2017 The Authors.
Quantiprot - a Python package for quantitative analysis of protein sequences.
Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold
2017-07-17
The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
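As a rough illustration of one of the quantitative characteristics mentioned above, the n-gram distribution of a sequence can be computed in a few lines of plain Python. This sketch deliberately does not use the Quantiprot API; the function name and the toy peptide are hypothetical.

```python
from collections import Counter

def ngram_distribution(sequence, n=2):
    """Count overlapping n-grams in a sequence and return their relative frequencies."""
    if n < 1 or n > len(sequence):
        raise ValueError("n must be between 1 and the sequence length")
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}

# Hypothetical toy peptide, not taken from any real protein
freqs = ngram_distribution("ACACGA", n=2)
for gram, freq in sorted(freqs.items()):
    print(gram, freq)
```

Frequency vectors of this kind could serve as coordinates in the alignment-free feature space the abstract describes, although the package itself may normalize or encode them differently.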
Zhao, Ya-E; Wang, Zheng-Hang; Xu, Yang; Wu, Li-Ping; Hu, Li
2013-10-01
According to base pairing, the rRNA folds into corresponding secondary structures, which contain additional phylogenetic information. On the basis of sequencing for complete rDNA sequences (18S, ITS1, 5.8S, ITS2 and 28S rDNA) of Demodex, we predicted the secondary structure of the complete rDNA sequence (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, which was in concordance with that of the main arthropod lineages in past studies. And together with the sequence data from GenBank, we also predicted the secondary structures of divergent domains in SSU rRNA of 51 species and in LSU rRNA of 43 species from four superfamilies in Acari (Cheyletoidea, Tetranychoidea, Analgoidea and Ixodoidea). The multiple alignment among the four superfamilies in Acari showed that, insertions from Tetranychoidea SSU rRNA formed two newly proposed helixes, and helix c3-2b of LSU rRNA was absent in Demodex (Cheyletoidea) taxa. Generally speaking, LSU rRNA presented more remarkable differences than SSU rRNA did, mainly in D2, D3, D5, D7a, D7b, D8 and D10. Copyright © 2013 Elsevier Inc. All rights reserved.
Doddapaneni, Harshavardhan; Yao, Jiqiang; Lin, Hong; Walker, M Andrew; Civerolo, Edwin L
2006-01-01
Background The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10⁻² per base pair of DNA and the average INDEL frequency was 2.06 × 10⁻² per base pair of DNA. On average, 60.33% of the SNPs were synonymous while 39.67% were non-synonymous. The mutation frequency, primarily in the form of external INDELs, was the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The numbers of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A subset of the strain-specific genes showed significant differences in terms of their codon usage and GC composition from the native genes, suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage-related functions.
Conclusion INDELs and strain specific genes have been identified as the main source of variations among strains, with individual strains showing different rates of genome evolution. Based on these genome comparisons, it appears that the Pierce's disease strain Temecula-1 genome represents the ancestral genome of the X. fastidiosa. Results of this analysis are publicly available in the form of a web database. PMID:16948851
Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing.
Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan
2013-06-27
Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. 
Rainbow is available for third-party implementation and use, and can be downloaded from http://s3.amazonaws.com/jnj_rainbow/index.html.
Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing
2013-01-01
Background Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Results Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Conclusions Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. 
For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. Rainbow is available for third-party implementation and use, and can be downloaded from http://s3.amazonaws.com/jnj_rainbow/index.html. PMID:23802613
MANGO: a new approach to multiple sequence alignment.
Zhang, Zefeng; Lin, Hao; Li, Ming
2007-01-01
Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.
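To make the contrast between consecutive k-mers and spaced seeds concrete, here is a minimal Python sketch. A spaced seed is written as a pattern of 1s (positions that must match) and 0s (positions that are ignored). The pattern, function names, and toy sequences are illustrative assumptions; they are not MANGO's optimized seeds or its algorithms.

```python
def seed_keys(sequence, pattern):
    """Map each seed key (characters at the '1' positions) to its start offsets."""
    care = [i for i, flag in enumerate(pattern) if flag == "1"]
    keys = {}
    for start in range(len(sequence) - len(pattern) + 1):
        key = "".join(sequence[start + i] for i in care)
        keys.setdefault(key, []).append(start)
    return keys

def seed_hits(seq_a, seq_b, pattern):
    """Return (offset_a, offset_b) pairs where both sequences agree at every '1' position."""
    index_b = seed_keys(seq_b, pattern)
    hits = []
    for key, starts_a in seed_keys(seq_a, pattern).items():
        for a in starts_a:
            for b in index_b.get(key, []):
                hits.append((a, b))
    return sorted(hits)

# Toy sequences that differ by a single substitution at index 2 (G vs T)
a = "ACGTAC"
b = "ACTTAC"
contiguous = seed_hits(a, b, "1111")   # consecutive 4-mer
spaced = seed_hits(a, b, "110101")     # spaced seed with four '1' positions
print(contiguous, spaced)
```

In this toy case the consecutive 4-mer finds no shared word because every window overlaps the substitution, while the spaced seed with the same number of required matches still reports a hit at offset (0, 0), since its ignored position falls on the mismatch.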
Batstone, D J; Torrijos, M; Ruiz, C; Schmidt, J E
2004-01-01
The model structure in anaerobic digestion has been clarified following publication of the IWA Anaerobic Digestion Model No. 1 (ADM1). However, parameter values are not well known, and uncertainty and variability in the parameter values given is almost unknown. Additionally, platforms for identification of parameters, namely continuous-flow laboratory digesters, and batch tests suffer from disadvantages such as long run times, and difficulty in defining initial conditions, respectively. Anaerobic sequencing batch reactors (ASBRs) are sequenced into fill-react-settle-decant phases, and offer promising possibilities for estimation of parameters, as they are by nature, dynamic in behaviour, and allow repeatable behaviour to establish initial conditions, and evaluate parameters. In this study, we estimated parameters describing winery wastewater (most COD as ethanol) degradation using data from sequencing operation, and validated these parameters using unsequenced pulses of ethanol and acetate. The model used was the ADM1, with an extension for ethanol degradation. Parameter confidence spaces were found by non-linear, correlated analysis of the two main Monod parameters; maximum uptake rate (k(m)), and half saturation concentration (K(S)). These parameters could be estimated together using only the measured acetate concentration (20 points per cycle). From interpolating the single cycle acetate data to multiple cycles, we estimate that a practical "optimal" identifiability could be achieved after two cycles for the acetate parameters, and three cycles for the ethanol parameters. The parameters found performed well in the short term, and represented the pulses of acetate and ethanol (within 4 days of the winery-fed cycles) very well. The main discrepancy was poor prediction of pH dynamics, which could be due to an unidentified buffer with an overall influence the same as a weak base (possibly CaCO3). 
Based on this work, ASBR systems are effective for parameter estimation, especially for comparative wastewater characterisation. The main disadvantages are heavy computational requirements for multiple cycles, and difficulty in establishing the correct biomass concentration in the reactor, though the last is also a disadvantage for continuous fixed film reactors, and especially, batch tests.
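The two Monod parameters named above enter the substrate uptake rate as ρ = k_m · S / (K_S + S) · X, where S is the substrate concentration and X the biomass concentration. The sketch below evaluates that expression with inhibition terms omitted; the numerical values are placeholders chosen for illustration and are not the estimates obtained in this study.

```python
def monod_uptake_rate(k_m, K_S, S, X):
    """Monod uptake rate: k_m * S / (K_S + S) * X (inhibition terms omitted)."""
    if S < 0 or K_S <= 0:
        raise ValueError("S must be non-negative and K_S positive")
    return k_m * S / (K_S + S) * X

# Placeholder values (hypothetical, for illustration only)
k_m = 8.0    # maximum specific uptake rate
K_S = 0.15   # half-saturation concentration
X = 1.0      # biomass concentration

half_rate = monod_uptake_rate(k_m, K_S, S=K_S, X=X)       # S equal to K_S gives half of k_m * X
near_max = monod_uptake_rate(k_m, K_S, S=100 * K_S, X=X)  # large S approaches k_m * X
print(half_rate, near_max)
```

Because k_m and X appear only as a product in this expression, a poorly known biomass concentration translates directly into uncertainty in k_m, which is in line with the difficulty the abstract reports in establishing the correct biomass concentration.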
Neural mechanisms of sequence generation in songbirds
NASA Astrophysics Data System (ADS)
Langford, Bruce
Animal models in research are useful for studying more complex behavior. For example, motor sequence generation of actions requiring good muscle coordination such as writing with a pen, playing an instrument, or speaking, may involve the interaction of many areas in the brain, each a complex system in itself; thus it can be difficult to determine causal relationships between neural behavior and the behavior being studied. Birdsong, however, provides an excellent model behavior for motor sequence learning, memory, and generation. The song consists of learned sequences of notes that are spectrographically stereotyped over multiple renditions of the song, similar to syllables in human speech. The main areas of the songbird brain involved in singing are known; however, the mechanisms by which these systems store and produce song are not well understood. We used a custom-built, head-mounted, miniature motorized microdrive to chronically record the neural firing patterns of identified neurons in HVC, a pre-motor cortical nucleus which has been shown to be important in song timing. These recordings were made in Bengalese finches, which generate a song made up of stereotyped notes but variable note sequences. We observed song-related bursting in neurons projecting to Area X, a homologue to basal ganglia, and tonic firing in HVC interneurons. Interneurons had firing rate patterns that were consistent over multiple renditions of the same note sequence. We also designed and built a lightweight, low-powered wireless programmable neural stimulator using the Bluetooth Low Energy protocol. It was able to generate perturbations in the song when current pulses were administered to RA, which projects to the brainstem nucleus responsible for syringeal muscle control.
Unipro UGENE: a unified bioinformatics toolkit.
Okonechnikov, Konstantin; Golosova, Olga; Fursov, Mikhail
2012-04-15
Unipro UGENE is a multiplatform open-source software with the main goal of assisting molecular biologists without much expertise in bioinformatics to manage, analyze and visualize their data. UGENE integrates widely used bioinformatics tools within a common user interface. The toolkit supports multiple biological data formats and allows the retrieval of data from remote data sources. It provides visualization modules for biological objects such as annotated genome sequences, Next Generation Sequencing (NGS) assembly data, multiple sequence alignments, phylogenetic trees and 3D structures. Most of the integrated algorithms are tuned for maximum performance by the usage of multithreading and special processor instructions. UGENE includes a visual environment for creating reusable workflows that can be launched on local resources or in a High Performance Computing (HPC) environment. UGENE is written in C++ using the Qt framework. The built-in plugin system and structured UGENE API make it possible to extend the toolkit with new functionality. UGENE binaries are freely available for MS Windows, Linux and Mac OS X at http://ugene.unipro.ru/download.html. UGENE code is licensed under the GPLv2; the information about the code licensing and copyright of integrated tools can be found in the LICENSE.3rd_party file provided with the source bundle.
FASMA: a service to format and analyze sequences in multiple alignments.
Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M
2007-12-01
Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.
Rodas, Claudia; Klena, John D.; Nicklasson, Matilda; Iniguez, Volga; Sjöling, Åsa
2011-01-01
Background Enterotoxigenic Escherichia coli (ETEC) is a major cause of traveller's and infantile diarrhoea in the developing world. ETEC produces two toxins, a heat-stable toxin (known as ST) and a heat-labile toxin (LT) and colonization factors that help the bacteria to attach to epithelial cells. Methodology/Principal Findings In this study, we characterized a subset of ETEC clinical isolates recovered from Bolivian children under 5 years of age using a combination of multilocus sequence typing (MLST) analysis, virulence typing, serotyping and antimicrobial resistance test patterns in order to determine the genetic background of ETEC strains circulating in Bolivia. We found that strains expressing the heat-labile (LT) enterotoxin and colonization factor CS17 were common and belonged to several MLST sequence types but mainly to sequence type-423 and sequence type-443 (Achtman scheme). To further study the LT/CS17 strains we analysed the nucleotide sequence of the CS17 operon and compared the structure to LT/CS17 ETEC isolates from Bangladesh. Sequence analysis confirmed that all sequence type-423 strains from Bolivia had a single nucleotide polymorphism; SNPbol in the CS17 operon that was also found in some other MLST sequence types from Bolivia but not in strains recovered from Bangladeshi children. The dominant ETEC clone in Bolivia (sequence type-423/SNPbol) was found to persist over multiple years and was associated with severe diarrhoea but these strains were variable with respect to antimicrobial resistance patterns. Conclusion/Significance The results showed that although the LT/CS17 phenotype is common among ETEC strains in Bolivia, multiple clones, as determined by unique MLST sequence types, populate this phenotype. Our data also appear to suggest that acquisition and loss of antimicrobial resistance in LT-expressing CS17 ETEC clones is more dynamic than acquisition or loss of virulence factors. PMID:22140423
SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments
Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric
2014-01-01
This article introduces the SARA-Coffee web server, a service allowing the online computation of 3D structure based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences, SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner, with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. PMID:24972831
Splicing predictions reliably classify different types of alternative splicing
Busch, Anke; Hertel, Klemens J.
2015-01-01
Alternative splicing is a key player in the creation of complex mammalian transcriptomes and its misregulation is associated with many human diseases. Multiple mRNA isoforms are generated from most human genes, a process mediated by the interplay of various RNA signature elements and trans-acting factors that guide spliceosomal assembly and intron removal. Here, we introduce a splicing predictor that evaluates hundreds of RNA features simultaneously to successfully differentiate between exons that are constitutively spliced, exons that undergo alternative 5′ or 3′ splice-site selection, and alternative cassette-type exons. Surprisingly, the splicing predictor did not feature strong discriminatory contributions from binding sites for known splicing regulators. Rather, the ability of an exon to be involved in one or multiple types of alternative splicing is dictated by its immediate sequence context, mainly driven by the identity of the exon's splice sites, the conservation around them, and its exon/intron architecture. Thus, the splicing behavior of human exons can be reliably predicted based on basic RNA sequence elements. PMID:25805853
Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming
2016-07-08
The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bellini, A.; Anderson, J.; Marel, R. P. van der
2015-09-01
Numerous observational studies have revealed the ubiquitous presence of multiple stellar populations in globular clusters and cast many difficult challenges for the study of the formation and dynamical history of these stellar systems. In this Letter we present the results of a study of the kinematic properties of multiple populations in NGC 2808 based on high-precision Hubble Space Telescope proper-motion measurements. In a recent study, Milone et al. identified five distinct populations (A–E) in NGC 2808. Populations D and E coincide with the helium-enhanced populations in the middle and the blue main sequences (mMS and bMS) previously discovered by Piotto et al.; populations A–C correspond to the redder main sequence that, in Piotto et al., was associated with the primordial stellar population. Our analysis shows that, in the outermost regions probed (between about 1.5 and 2 times the cluster half-light radius), the velocity distribution of populations D and E is radially anisotropic (the deviation from an isotropic distribution is significant at the ∼3.5σ level). Stars of populations D and E have a smaller tangential velocity dispersion than those of populations A–C, while no significant differences are found in the radial velocity dispersion. We present the results of a numerical simulation showing that the observed differences between the kinematics of these stellar populations are consistent with the expected kinematic fingerprint of the diffusion toward the cluster outer regions of stellar populations initially more centrally concentrated.
Multiplexed fragaria chloroplast genome sequencing
W. Njuguna; A. Liston; R. Cronn; N.V. Bassil
2010-01-01
A method to sequence multiple chloroplast genomes using ultra high throughput sequencing technologies was recently described. Complete chloroplast genome sequences can resolve phylogenetic relationships at low taxonomic levels and identify informative point mutations and indels. The objective of this research was to sequence multiple Fragaria...
IRCI-Free MIMO-OFDM SAR Using Circularly Shifted Zadoff-Chu Sequences
NASA Astrophysics Data System (ADS)
Cao, Yun-He; Xia, Xiang-Gen
2015-05-01
Cyclic prefix (CP) based MIMO-OFDM radar has been recently proposed for distributed transmit antennas, where there is no inter-range-cell interference (IRCI). It can collect full spatial diversity and each transmitter transmits signals with the same frequency band, i.e., the range resolution is not reduced. However, it needs to transmit multiple OFDM pulses consecutively to obtain range profiles for a single swath, which may be too long in time for a reasonable swath width. In this letter, we propose a CP based MIMO-OFDM synthetic aperture radar (SAR) system, where each transmitter transmits only a single OFDM pulse to obtain range profiles for a swath and has the same frequency band, thus the range resolution is not reduced. It is IRCI free and can collect the full spatial diversity if the transmit antennas are distributed. Our main idea is to use circularly shifted Zadoff-Chu sequences as the weighting coefficients in the OFDM pulses for different transmit antennas and apply spatial filters with multiple receive antennas to divide the whole swath into multiple subswaths, and then each subswath is reconstructed/imaged using our proposed IRCI free range reconstruction method.
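The property that makes circularly shifted Zadoff-Chu sequences attractive as per-antenna weighting coefficients can be sketched in a few lines: an odd-length Zadoff-Chu sequence has constant amplitude, and its periodic autocorrelation vanishes at every nonzero shift, so circularly shifted copies are mutually orthogonal. The sketch below is only an illustration of that property; the length `N = 31`, the root `1`, the shift spacing and all function names are assumptions, not parameters from the letter.

```python
import cmath

def zadoff_chu(root, length):
    """Odd-length Zadoff-Chu sequence; root must be coprime with length."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]

def circular_shift(seq, shift):
    """Return seq rotated so that element `shift` comes first."""
    shift %= len(seq)
    return seq[shift:] + seq[:shift]

def inner_product(a, b):
    """Complex inner product sum(a[n] * conj(b[n]))."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

# Hypothetical parameters chosen for illustration only.
N = 31
base = zadoff_chu(root=1, length=N)

# One shifted copy per (hypothetical) transmit antenna.
antenna_weights = [circular_shift(base, 5 * k) for k in range(3)]

# Correlation of the base sequence with each of its circular shifts.
correlations = [abs(inner_product(circular_shift(base, s), base))
                for s in range(N)]
```

Here `correlations[0]` equals the sequence length, while every other entry is zero up to floating-point rounding, which is why distinct shifts can be separated at the receiver.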
NASA Astrophysics Data System (ADS)
Kim, Sookwan; De Santis, Laura; Böhm, Gualtiero; Kuk Hong, Jong; Jin, Young Keun; Geletti, Riccardo; Wardell, Nigel; Petronio, Lorenzo; Colizza, Ester
2014-05-01
The Ross Sea, located between Victoria Land and Marie Byrd Land in Antarctica, is one of the main drainage areas of the Antarctic Ice Sheet (AIS). Reflection seismic data acquired by many countries over several decades have provided insights into the history of the Ross Sea and the evolution of the AIS. However, the majority of the existing seismic data are concentrated in the shelf area, where hiatuses formed by multiple erosion events of the grounding ice sheet prevent reconstruction of the depositional evolution of the entire sedimentary sequence. On the outer shelf and upper slope, the sedimentary sequences are relatively well preserved. The main purpose of this study is to investigate the Cenozoic evolution of the Antarctic Ice Sheet through seismic sequence analysis of the outer shelf and slope of the Central Basin, in the Ross Sea. The data used are the new multi-channel seismic data, KSL12, acquired on the outer shelf and upper slope of the Central Basin in February 2013 by the Korea Polar Research Institute. Reflection seismic data previously collected by the Italian Antarctic Program (PNRA) and other data available from the Seismic Data Library System (SDLS) are also used for velocity tomography and seismic sequence mapping. The seismic data were processed with a conventional processing flow to produce the seismic profiles. Preliminary results show well-developed prograding wedges at the mouth of glacial troughs, eroded by a major glacial unconformity, the Ross Sea Unconformity 4 (RSU-4), correlated to a main event between the early and mid-Miocene. The velocity anomalies shown along KSL12-1 can be interpreted as indicating the occurrence of gas and fluids, diagenetic horizons and sediment compaction. The isopach maps of each sequence show the variation in sediment thickness and the shift of the depocenter. The seismic sequence stratigraphy and acoustic facies analysis provide information about different phases of ice sheet advance and retreat related to the Cenozoic dynamics of the AIS.
Clinical features of multiple organ failure in the elderly.
Wang, S W; Fan, L
1990-09-01
Multiple organ failure (MOF) in the elderly is a new syndrome evolved from multiple organ chronic diseases on the basis of multiple organ dysfunction in the aged. Its characteristics are clinically different from those of MOF due to serious trauma. 122 cases of MOF were analysed retrospectively and their clinical features discussed. MOF with a long course is the natural presentation in many of the elderly before death. Its main precipitating factors are pulmonary infection, metastatic carcinoma, cardiac attack, etc. The sequence of a failure in organs is heart, lung, kidney, liver, etc. The mortality is similar to that of MOF due to trauma. However, those suffering from 4-organ failure can still survive, and instead, the renal failure can be mostly fatal. More attention should be paid to the prevention of MOF in the elderly so as to shorten its developing course.
Embedding strategies for effective use of information from multiple sequence alignments.
Henikoff, S.; Henikoff, J. G.
1997-01-01
We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452
Simultaneous phylogeny reconstruction and multiple sequence alignment
Yue, Feng; Shi, Jian; Tang, Jijun
2009-01-01
Background: A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment, and experiments have shown that good guide trees can significantly improve the multiple alignment quality. Results: We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can take a simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method is compared with that of other popular multiple sequence alignment tools, including widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion: We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110
Single-cell genomic sequencing using Multiple Displacement Amplification.
Lasken, Roger S
2007-10-01
Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
Bellerophon: A program to detect chimeric sequences in multiple sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip
2003-12-23
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Birrer, Simone C; Dafforn, Katherine A; Simpson, Stuart L; Kelaher, Brendan P; Potts, Jaimie; Scanes, Peter; Johnston, Emma L
2018-05-15
Coastal waterways are increasingly exposed to multiple stressors, e.g. contaminants that can be delivered via pulse or press exposures. Therefore, it is crucial that ecological impacts can be differentiated among stressors to manage ecosystem threats. We investigated microbial community development in sediments exposed to press and pulse stressors. Press exposures were created with in situ mesocosm sediments containing a range of 'metal' concentrations (sediment contaminated with multiple metal(loid)s) and organic enrichment (fertiliser), while the pulse exposure was simulated by a single dose of organic fertiliser. All treatments and exposure concentrations were crossed in a fully factorial field experiment. We used amplicon sequencing to compare the sensitivity of the 1) total (DNA) and active (RNA) component of 2) bacterial (16S rRNA) and eukaryotic (18S rRNA) communities to contaminant exposures. Overall microbial community change was greater when exposed to press than pulse stressors, with the bacterial community responding more strongly than the eukaryotes. The total bacterial community represents a more time-integrated measure of change and proved to be more sensitive to multiple stressors than the active community. Metals and organic enrichment treatments interacted such that the effect of metals was weaker when the sediment was organically enriched. Taxa-level analyses revealed that press enrichment resulted in potential functional changes, mainly involving nitrogen cycling. Furthermore, enrichment generally reduced the abundance of active eukaryotes in the sediment. As well as demonstrating interactive impacts of metals and organic enrichment, this study highlights the sensitivity of next-generation sequencing for ecosystem biomonitoring of interacting stressors and identifies opportunities for more targeted application. Copyright © 2018 Elsevier B.V. All rights reserved.
Sun, Zhifu; Cunningham, Julie; Slager, Susan; Kocher, Jean-Pierre
2015-01-01
Bisulfite treatment-based methylation microarray (mainly Illumina 450K Infinium array) and next-generation sequencing (reduced representation bisulfite sequencing, Agilent SureSelect Human Methyl-Seq, NimbleGen SeqCap Epi CpGiant or whole-genome bisulfite sequencing) are commonly used for base resolution DNA methylome research. Although multiple tools and methods have been developed and used for the data preprocessing and analysis, confusion remains about these platforms, including how and whether the 450K array should be normalized; which platform should be used to better fit researchers’ needs; and which statistical models would be more appropriate for differential methylation analysis. This review presents the commonly used platforms and compares the pros and cons of each in methylome profiling. We then discuss approaches to study design, data normalization, bias correction and model selection for differentially methylated individual CpGs and regions. PMID:26366945
Convergent evolution of marine mammals is associated with distinct substitutions in common genes
Zhou, Xuming; Seim, Inge; Gladyshev, Vadim N.
2015-01-01
Phenotypic convergence is thought to be driven by parallel substitutions coupled with natural selection at the sequence level. Multiple independent evolutionary transitions of mammals to an aquatic environment offer an opportunity to test this thesis. Here, whole genome alignment of coding sequences identified widespread parallel amino acid substitutions in marine mammals; however, the majority of these changes were not unique to these animals. Conversely, we report that candidate aquatic adaptation genes, identified by signatures of likelihood convergence and/or elevated ratio of nonsynonymous to synonymous nucleotide substitution rate, are characterized by very few parallel substitutions and exhibit distinct sequence changes in each group. Moreover, no significant positive correlation was found between likelihood convergence and positive selection in all three marine lineages. These results suggest that convergence in protein coding genes associated with aquatic lifestyle is mainly characterized by independent substitutions and relaxed negative selection. PMID:26549748
Improving performance of DS-CDMA systems using chaotic complex Bernoulli spreading codes
NASA Astrophysics Data System (ADS)
Farzan Sabahi, Mohammad; Dehghanfard, Ali
2014-12-01
The most important goal of a spread spectrum communication system is to protect communication signals against interference and exploitation of information by unintended listeners. In fact, low probability of detection and low probability of intercept are two important parameters to increase the performance of the system. In Direct Sequence Code Division Multiple Access (DS-CDMA) systems, these properties are achieved by multiplying the data by spreading sequences. Chaotic sequences, with their particular properties, have numerous applications in constructing spreading codes. Using a one-dimensional Bernoulli chaotic sequence as a spreading code has been proposed previously in the literature. The main feature of this sequence is its negative auto-correlation at a lag of 1, which, with proper design, leads to an increase in the efficiency of communication systems based on these codes. On the other hand, employing complex chaotic sequences as spreading sequences has also been discussed in several papers. In this paper, the use of two-dimensional Bernoulli chaotic sequences as spreading codes is proposed. The performance of multi-user synchronous and asynchronous DS-CDMA systems is evaluated by applying these sequences under Additive White Gaussian Noise (AWGN) and fading channels. Simulation results indicate improved performance in comparison with conventional spreading codes such as Gold codes as well as similar complex chaotic spreading sequences. Like one-dimensional Bernoulli chaotic sequences, the proposed sequences also have negative auto-correlation. In addition, construction of complex sequences with lower average cross-correlation is possible with the proposed method.
A novel approach to multiple sequence alignment using hadoop data grids.
Sudha Sadasivam, G; Baktavatchalam, G
2010-01-01
Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time efficient approach to sequence alignment that also produces quality alignment. The dynamic nature of the algorithm coupled with data and computational parallelism of hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in hadoop coupled with its scalability facilitates alignment of very large sequences.
Analysis of the cytochrome c oxidase subunit II (COX2) gene in giant panda, Ailuropoda melanoleuca.
Ling, S S; Zhu, Y; Lan, D; Li, D S; Pang, H Z; Wang, Y; Li, D Y; Wei, R P; Zhang, H M; Wang, C D; Hu, Y D
2017-01-23
The giant panda, Ailuropoda melanoleuca (Ursidae), has a unique bamboo-based diet; however, this low-energy intake has been sufficient to maintain the metabolic processes of this species since the fourth ice age. As mitochondria are the main sites for energy metabolism in animals, the protein-coding genes involved in mitochondrial respiratory chains, particularly cytochrome c oxidase subunit II (COX2), which is the rate-limiting enzyme in electron transfer, could play an important role in giant panda metabolism. Therefore, the present study aimed to isolate, sequence, and analyze the COX2 DNA from individuals kept at the Giant Panda Protection and Research Center, China, and compare these sequences with those of the other Ursidae family members. Multiple sequence alignment showed that the COX2 gene had three point mutations that defined three haplotypes, with 60% of the sequences corresponding to haplotype I. The neutrality tests revealed that the COX2 gene was conserved throughout evolution, and the maximum likelihood phylogenetic analysis, using homologous sequences from other Ursidae species, showed clustering of the COX2 sequences of giant pandas, suggesting that this gene evolved differently in them.
A deep learning pipeline for Indian dance style classification
NASA Astrophysics Data System (ADS)
Dewan, Swati; Agarwal, Shubham; Singh, Navjyoti
2018-04-01
In this paper, we address the problem of dance style classification to classify Indian dance or any dance in general. We propose a 3-step deep learning pipeline. First, we extract 14 essential joint locations of the dancer from each video frame; this lets us derive any body region location within the frame, which we use in the second step, the main part of our pipeline. Here, we divide the dancer into regions of important motion in each video frame. We then extract patches centered at these regions. The main discriminative motion is captured in these patches. We stack the features from all such patches of a frame into a single vector and form our hierarchical dance pose descriptor. Finally, in the third step, we build a high-level representation of the dance video using the hierarchical descriptors and train it using a Recurrent Neural Network (RNN) for classification. Our novelty also lies in the way we use multiple representations for a single video. This helps us to: (1) overcome the RNN limitation of learning small sequences over big sequences such as dance; (2) extract more data from the available dataset for effective deep learning by training multiple representations. Our contributions in this paper are threefold: (1) we provide a deep learning pipeline for classification of any form of dance; (2) we prove that a segmented representation of a dance video works well with sequence learning techniques for recognition purposes; (3) we extend and refine the ICD dataset and provide a new dataset for evaluation of dance. Our model performs comparably to, or in some cases better than, the state of the art on action recognition benchmarks.
Sequence alignment visualization in HTML5 without Java.
Gille, Christoph; Birgit, Weyand; Gille, Andreas
2014-01-01
Java has been extensively used for the visualization of biological data on the web. However, the Java runtime environment is an additional layer of software with its own set of technical problems and security risks. HTML in its new version 5 provides features that for some tasks may render Java unnecessary. Alignment-To-HTML is the first HTML-based interactive visualization for annotated multiple sequence alignments. The server-side script interpreter can perform all tasks such as (i) sequence retrieval, (ii) alignment computation, (iii) rendering, (iv) identification of homologous structural models and (v) communication with BioDAS-servers. The rendered alignment can be included in web pages and is displayed in all browsers on all platforms including touch screen tablets. The functionality of the user interface is similar to legacy Java applets and includes color schemes, highlighting of conserved and variable alignment positions, row reordering by drag and drop, interlinked 3D visualization and sequence groups. Novel features are (i) support for multiple overlapping residue annotations, such as chemical modifications, single nucleotide polymorphisms and mutations, (ii) mechanisms to quickly hide residue annotations, (iii) export to MS-Word and (iv) sequence icons. Alignment-To-HTML, the first interactive alignment visualization that runs in web browsers without additional software, confirms that to some extent HTML5 is already sufficient to display complex biological data. The low speed at which programs are executed in browsers is still the main obstacle. Nevertheless, we envision an increased use of HTML and JavaScript for interactive biological software. Under GPL at: http://www.bioinformatics.org/strap/toHTML/.
PFAAT version 2.0: a tool for editing, annotating, and analyzing multiple sequence alignments.
Caffrey, Daniel R; Dana, Paul H; Mathur, Vidhya; Ocano, Marco; Hong, Eun-Jong; Wang, Yaoyu E; Somaroo, Shyamal; Caffrey, Brian E; Potluri, Shobha; Huang, Enoch S
2007-10-11
By virtue of their shared ancestry, homologous sequences are similar in their structure and function. Consequently, multiple sequence alignments are routinely used to identify trends that relate to function. This type of analysis is particularly productive when it is combined with structural and phylogenetic analysis. Here we describe the release of PFAAT version 2.0, a tool for editing, analyzing, and annotating multiple sequence alignments. Support for multiple annotations is a key component of this release as it provides a framework for most of the new functionalities. The sequence annotations are accessible from the alignment and tree, where they are typically used to label sequences or hyperlink them to related databases. Sequence annotations can be created manually or extracted automatically from UniProt entries. Once a multiple sequence alignment is populated with sequence annotations, sequences can be easily selected and sorted through a sophisticated search dialog. The selected sequences can be further analyzed using statistical methods that explicitly model relationships between the sequence annotations and residue properties. Residue annotations are accessible from the alignment viewer and are typically used to designate binding sites or properties for a particular residue. Residue annotations are also searchable, and allow one to quickly select alignment columns for further sequence analysis, e.g. computing percent identities. Other features include: novel algorithms to compute sequence conservation, mapping conservation scores to a 3D structure in Jmol, displaying secondary structure elements, and sorting sequences by residue composition. PFAAT provides a framework whereby end-users can specify knowledge for a protein family in the form of annotation. The annotations can be combined with sophisticated analysis to test hypotheses that relate to sequence, structure and function.
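The idea of selecting alignment columns through residue annotations and then computing percent identity over just those columns can be sketched as follows. This is not PFAAT's code or API: the function name, the gap character and the convention of skipping any column where either sequence has a gap are all assumptions made for illustration.

```python
def percent_identity(seq_a, seq_b, columns=None, gap="-"):
    """Percent identity between two aligned sequences.

    columns: optional iterable of 0-based alignment columns to restrict
    the comparison to (for example, columns carrying an annotation).
    Columns where either sequence has a gap are skipped (an assumed
    convention; other definitions of percent identity exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if columns is None:
        columns = range(len(seq_a))
    compared = 0
    matches = 0
    for i in columns:
        a, b = seq_a[i], seq_b[i]
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical aligned sequences and a hypothetical annotated region.
whole = percent_identity("ACDE-G", "ACDQFG")
annotated = percent_identity("ACDE-G", "ACDQFG", columns=[0, 1, 2])
```

Restricting the comparison to annotated columns can give a very different figure from the whole-alignment value, which is the kind of contrast such a selection is meant to expose.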
NASA Astrophysics Data System (ADS)
Miyatake, Teruhiko; Chiba, Kazuki; Hamamura, Masanori; Tachikawa, Shin'ichi
We propose a novel asynchronous direct-sequence code-division multiple access (DS-CDMA) using feedback-controlled spreading sequences (FCSSs) (FCSS/DS-CDMA). At the receiver of FCSS/DS-CDMA, the code-orthogonalizing filter (COF) produces a spreading sequence, and the receiver returns the spreading sequence to the transmitter. Then the transmitter uses the spreading sequence as its updated version. The performance of FCSS/DS-CDMA is evaluated over time-dispersive channels. The results indicate that FCSS/DS-CDMA greatly suppresses both the intersymbol interference (ISI) and multiple access interference (MAI) over time-invariant channels. FCSS/DS-CDMA is applicable to decentralized multiple access.
The dynamics of post-main sequence planetary systems
NASA Astrophysics Data System (ADS)
Mustill, Alexander James
2017-06-01
The study of planetary systems after their host stars have left the main sequence is of fundamental importance for exoplanet science, as the most direct determination of the compositions of extra-Solar planets, asteroids and comets is in fact made by an analysis of the elemental abundances of the remnants of these bodies accreted into the atmospheres of white dwarfs. To understand how the accreted bodies relate to the source populations in the planetary system, and to model their dynamical delivery to the white dwarf, it is necessary to understand the effects of stellar evolution on bodies' orbits. On the red giant branch (RGB) and asymptotic giant branch (AGB) prior to becoming a white dwarf, stars expand to a large size (>1 au) and are easily deformed by orbiting planets, leading to tidal energy dissipation and orbital decay. They also lose half or more of their mass, causing the expansion of bodies' orbits. This mass loss increases the planet:star mass ratio, so planetary systems orbiting white dwarfs can be much less stable than those orbiting their main-sequence progenitors. Finally, small bodies in the system experience strong non-gravitational forces during the RGB and AGB: aerodynamic drag from the mass shed by the star, and strong radiation forces as the stellar luminosity reaches several thousand Solar luminosities. I will review these effects, focusing on planet--star tidal interactions and planet--asteroid interactions, and I will discuss some of the numerical challenges in modelling systems over their entire lifetimes of multiple Gyr.
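The orbital expansion and the rise in planet:star mass ratio can be made concrete with a small sketch. It assumes the standard adiabatic limit, in which mass loss is isotropic and slow compared with the orbital period so that the product of semi-major axis and stellar mass stays constant, and it neglects the planet's mass, tides and drag. The function names and the example values are hypothetical and are not taken from the talk.

```python
def orbit_after_adiabatic_mass_loss(a_initial_au, m_star_initial, m_star_final):
    """Semi-major axis (au) after slow, isotropic stellar mass loss.

    Assumes a * M_star is conserved (adiabatic limit, planet mass neglected).
    """
    if not 0 < m_star_final <= m_star_initial:
        raise ValueError("final mass must be positive and not exceed initial mass")
    return a_initial_au * m_star_initial / m_star_final

def mass_ratio_growth(m_star_initial, m_star_final):
    """Factor by which the planet:star mass ratio increases."""
    return m_star_initial / m_star_final

# Hypothetical example: a star that sheds half its mass (2.0 -> 1.0 Solar masses).
expanded_orbit = orbit_after_adiabatic_mass_loss(5.0, 2.0, 1.0)
ratio_factor = mass_ratio_growth(2.0, 1.0)
```

Under these assumptions a star that loses half its mass doubles both the orbit size and the planet:star mass ratio, which is the destabilising combination the abstract describes.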
Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske
2007-02-14
The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5'-tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate <0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5'-primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products.
The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
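The tag-based assignment described in this abstract can be sketched as follows. This is a minimal illustration, assuming each read begins with an exact dinucleotide tag and that a lookup table from tag to specimen is available; the function name, tags and reads are hypothetical, and the sketch does not model the sequencing anomalies the authors account for.

```python
from collections import defaultdict


def assign_reads_by_tag(reads, tag_to_source, tag_length=2):
    """Group reads by their 5' tag; reads with unknown tags are set aside."""
    assigned = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, remainder = read[:tag_length], read[tag_length:]
        source = tag_to_source.get(tag)
        if source is None:
            unassigned.append(read)            # tag not in the lookup table
        else:
            assigned[source].append(remainder)  # store the read without its tag
    return dict(assigned), unassigned


# Hypothetical tags and reads, for illustration only.
tags = {"AC": "specimen_1", "GT": "specimen_2"}
reads = ["ACTTGACC", "GTTTGACC", "NNTTGACC"]
by_source, leftovers = assign_reads_by_tag(reads, tags)
print(by_source)   # {'specimen_1': ['TTGACC'], 'specimen_2': ['TTGACC']}
print(leftovers)   # ['NNTTGACC']
```

Counting the reads per tag in `by_source` is also one simple way to look for the kind of tag-dependent representation bias the abstract reports.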
Shrestha, Rima D; Grinberg, Alex; Dukkipati, Venkata S R; Pleydell, Eve J; Prattley, Deborah J; French, Nigel P
2014-05-28
Several Cryptosporidium species are known to infect cattle. However, the occurrence of mixed infections with more than one species and the impact of this phenomenon on animal and human health are poorly understood. Therefore, to detect the presence of mixed Cryptosporidium infections, 15 immunofluorescence-positive specimens obtained from 6-week-old calves' faeces (n=60) on one dairy farm were subjected to PCR-sequencing at multiple loci. DNA sequences of three Cryptosporidium species: C. parvum (15/15), C. bovis (3/15) and C. andersoni (1/15), and two new genetic variants were identified. There was evidence of mixed infections in five specimens. C. parvum, C. bovis and C. andersoni sequences were detected together in one specimen, C. parvum and C. bovis in two specimens, and C. parvum and C. parvum-like variants in the remaining two specimens. Sequencing of gp60 amplicons identified the IIaA19G4R1 (8/15) and IIaA18G3R1 (4/15) C. parvum subgenotypes. This study provides evidence of endemic mixed infections with the three main Cryptosporidium species of cattle and new genetic variants, in calves at the transition age of six weeks. The results add to the body of evidence describing Cryptosporidium isolates as genetically heterogeneous populations, and highlight the need for iterative genotyping to explore their genetic makeup. Copyright © 2014 Elsevier B.V. All rights reserved.
Verginelli, Fabio; Capelli, Cristian; Coia, Valentina; Musiani, Marco; Falchetti, Mario; Ottini, Laura; Palmirotta, Raffaele; Tagliacozzo, Antonio; De Grossi Mazzorin, Iacopo; Mariani-Costantini, Renato
2005-12-01
The question of the origins of the dog has been much debated. The dog is descended from the wolf that at the end of the last glaciation (the archaeologically hypothesized period of dog domestication) was one of the most widespread among Holarctic mammals. Scenarios provided by genetic studies range from multiple dog-founding events to a single origin in East Asia. The earliest fossil dogs, dated approximately 17-12,000 radiocarbon ((14)C) years ago (YA), were found in Europe and in the Middle East. Ancient DNA (a-DNA) evidence could contribute to the identification of dog-founder wolf populations. To gain insight into the relationships between ancient European wolves and dogs we analyzed a 262-bp mitochondrial DNA control region fragment retrieved from five prehistoric Italian canids ranging in age from approximately 15,000 to approximately 3,000 (14)C YA. These canids were compared to a worldwide sample of 547 purebred dogs and 341 wolves. The ancient sequences were highly diverse and joined the three major clades of extant dog sequences. Phylogenetic investigations highlighted relationships between the ancient sequences and geographically widespread extant dog matrilines and between the ancient sequences and extant wolf matrilines of mainly East European origin. The results provide a-DNA support for the involvement of European wolves in the origins of the three major dog clades. Genetic data also suggest multiple independent domestication events. East European wolves may still reflect the genetic variation of ancient dog-founder populations.
Using a Sequence of Earcons to Monitor Multiple Simulated Patients.
Hickling, Anna; Brecknell, Birgit; Loeb, Robert G; Sanderson, Penelope
2017-03-01
The aim of this study was to determine whether a sequence of earcons can effectively convey the status of multiple processes, such as the status of multiple patients in a clinical setting. Clinicians often monitor multiple patients. An auditory display that intermittently conveys the status of multiple patients may help. Nonclinician participants listened to sequences of 500-ms earcons that each represented the heart rate (HR) and oxygen saturation (SpO2) levels of a different simulated patient. In each sequence, one, two, or three patients had an abnormal level of HR and/or SpO2. In Experiment 1, participants reported which of nine patients in a sequence were abnormal. In Experiment 2, participants identified the vital signs of one, two, or three abnormal patients in sequences of one, five, or nine patients, where the interstimulus interval (ISI) between earcons was 150 ms. Experiment 3 used the five-sequence condition of Experiment 2, but the ISI was either 150 ms or 800 ms. Participants reported which patient(s) were abnormal with median 95% accuracy. Identification accuracy for vital signs decreased as the number of abnormal patients increased from one to three, p < .001, but accuracy was unaffected by number of patients in a sequence. Overall, identification accuracy was significantly higher with an ISI of 800 ms (89%) compared with an ISI of 150 ms (83%), p < .001. A multiple-patient display can be created by cycling through earcons that represent individual patients. The principles underlying the multiple-patient display can be extended to other vital signs, designs, and domains.
The 2016-2017 Central Italy Seismic Sequence: Source Complexity Inferred from Rupture Models.
NASA Astrophysics Data System (ADS)
Scognamiglio, L.; Tinti, E.; Casarotti, E.; Pucci, S.; Villani, F.; Cocco, M.; Magnoni, F.; Michelini, A.
2017-12-01
The Apennines have been struck by several seismic sequences in recent years, showing evidence of the activation of multiple segments of normal fault systems in a variable and relatively short time span, as in the case of the 1980 Irpinia earthquake (three shocks in 40 s), the 1997 Umbria-Marche sequence (four main shocks in 18 days) and the 2009 L'Aquila earthquake, which had three segments activated within a few weeks. The 2016-2017 central Apennines seismic sequence began on August 24th with a MW 6.0 earthquake, which struck the region between Amatrice and Accumoli causing 299 fatalities. This earthquake ruptured a nearly 20 km long normal fault and showed a quite heterogeneous slip distribution. On October 26th, another main shock (MW 5.9) occurred near Visso, extending the activated seismogenic area toward the NW. It was a double event rupturing contiguous patches on the fault segment of the normal fault system. Four days after the second main shock, on October 30th, a third earthquake (MW 6.5) occurred near Norcia, roughly midway between Accumoli and Visso. In this work we have inverted strong motion waveforms and GPS data to retrieve the source model of the MW 6.5 event with the aim of interpreting the rupture process in the framework of this complex sequence of moderate magnitude earthquakes. We noted that some preliminary attempts to model the slip distribution of the October 30th main shock using a single fault plane oriented along the Apennines did not provide convincing fits to the observed waveforms. In addition, the deformation pattern inferred from satellite observations suggested the activation of a multi-fault structure, which is consistent with the complexity and the extent of the geological surface deformation. 
We investigated the role of multi-fault ruptures and we found that this event revealed an extraordinary complexity of the rupture geometry and evolution: the coseismic rupture propagated almost simultaneously on a normal fault and on a blind fault, possibly inherited from compressional tectonics. These earthquakes raise serious concerns on our understanding of fault segmentation and seismicity evolution during sequences of normal faulting earthquakes. Finally, the retrieved rupture history has important implications on seismic hazard assessment and on the maximum expected magnitude in a given tectonic area.
NASA Astrophysics Data System (ADS)
Ruhl, C. J.; Abercrombie, R. E.; Smith, K. D.; Zaliapin, I.
2016-11-01
After approximately 2 months of swarm-like earthquakes in the Mogul neighborhood of west Reno, NV, seismicity rates and event magnitudes increased over several days culminating in an Mw 4.9 dextral strike-slip earthquake on 26 April 2008. Although very shallow, the Mw 4.9 main shock had a different sense of slip than locally mapped dip-slip surface faults. We relocate 7549 earthquakes, calculate 1082 focal mechanisms, and statistically cluster the relocated earthquake catalog to understand the character and interaction of active structures throughout the Mogul, NV earthquake sequence. Rapid temporary instrument deployment provides high-resolution coverage of microseismicity, enabling a detailed analysis of swarm behavior and faulting geometry. Relocations reveal an internally clustered sequence in which foreshocks evolved on multiple structures surrounding the eventual main shock rupture. The relocated seismicity defines a fault-fracture mesh and detailed fault structure from approximately 2-6 km depth on the previously unknown Mogul fault that may be an evolving incipient strike-slip fault zone. The seismicity volume expands before the main shock, consistent with pore pressure diffusion, and the aftershock volume is much larger than is typical for an Mw 4.9 earthquake. We group events into clusters using space-time-magnitude nearest-neighbor distances between events and develop a cluster criterion through randomization of the relocated catalog. Identified clusters are largely main shock-aftershock sequences, without evidence for migration, occurring within the diffuse background seismicity. The migration rate of the largest foreshock cluster and simultaneous background events is consistent with it having triggered, or having been triggered by, an aseismic slip event.
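For readers unfamiliar with space-time-magnitude nearest-neighbor distances, the following sketch shows the form commonly used in the nearest-neighbor clustering literature: the proximity of a later event to a candidate parent is the inter-event time multiplied by the epicentral distance raised to a fractal dimension and scaled down by the parent magnitude. The parameter values (`b`, `d_f`), the planar coordinates and the toy catalog are assumptions for illustration; the abstract does not state which values or distance measure the authors used.

```python
import math


def nn_proximity(parent, child, b=1.0, d_f=1.6):
    """eta = dt * r**d_f * 10**(-b * m_parent); infinite if parent is not earlier.

    Events are (time, x_km, y_km, magnitude) tuples in a flat local frame.
    """
    t_p, x_p, y_p, m_p = parent
    t_c, x_c, y_c, _ = child
    dt = t_c - t_p
    if dt <= 0:
        return math.inf
    r = math.hypot(x_c - x_p, y_c - y_p)
    return dt * r ** d_f * 10 ** (-b * m_p)


def nearest_parent(events, j, **params):
    """Index of the event with the smallest proximity to event j."""
    candidates = (i for i in range(len(events)) if i != j)
    return min(candidates, key=lambda i: nn_proximity(events[i], events[j], **params))


# Toy catalog (hypothetical values): a distant M4 event, then two nearby events.
catalog = [(0.0, 0.0, 0.0, 4.0), (1.0, 3.0, 4.0, 3.0), (2.0, 3.0, 4.5, 1.5)]
print(nearest_parent(catalog, 2))  # 1
```

Clusters are then typically formed by linking each event to its nearest parent whenever the proximity falls below a threshold; the abstract indicates that the authors derived their cluster criterion by randomizing the relocated catalog.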
Common Amino Acid Subsequences in a Universal Proteome—Relevance for Food Science
Minkiewicz, Piotr; Darewicz, Małgorzata; Iwaniak, Anna; Sokołowska, Jolanta; Starowicz, Piotr; Bucholska, Justyna; Hrynkiewicz, Monika
2015-01-01
A common subsequence is a fragment of the amino acid chain that occurs in more than one protein. Common subsequences may be an object of interest for food scientists as biologically active peptides, epitopes, and/or protein markers that are used in comparative proteomics. An individual bioactive fragment, in particular the shortest fragment containing two or three amino acid residues, may occur in many protein sequences. An individual linear epitope may also be present in multiple sequences of precursor proteins. Although recent recommendations for prediction of allergenicity and cross-reactivity include not only sequence identity, but also similarities in secondary and tertiary structures surrounding the common fragment, local sequence identity may be used to screen protein sequence databases for potential allergens in silico. The main weakness of the screening process is that it overlooks allergens and cross-reactivity cases without identical fragments corresponding to linear epitopes. A single peptide may also serve as a marker of a group of allergens that belong to the same family and, possibly, reveal cross-reactivity. This review article discusses the benefits for food scientists that follow from the common subsequences concept. PMID:26340620
Wang, Ruijia; Nambiar, Ram; Zheng, Dinghai
2018-01-01
PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which were limited in number and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. First, PASs are mapped by the 3′ region extraction and deep sequencing (3′READS) method, ensuring unequivocal PAS identification. Second, a large volume of data based on diverse biological samples increases PAS coverage by 3.5-fold over the EST-based version and provides PAS usage information. Third, strand-specific RNA-seq data are used to extend annotated 3′ ends of genes to obtain more thorough annotations of alternative polyadenylation (APA) sites. Fourth, conservation information of PAS across mammals sheds light on the significance of APA sites. The database (URL: http://www.polya-db.org/v3) currently holds PASs in human, mouse, rat and chicken, and has links to the UCSC genome browser for further visualization and for integration with other genomic data. PMID:29069441
Tettelin, Hervé; Masignani, Vega; Cieslewicz, Michael J.; Donati, Claudio; Medini, Duccio; Ward, Naomi L.; Angiuoli, Samuel V.; Crabtree, Jonathan; Jones, Amanda L.; Durkin, A. Scott; DeBoy, Robert T.; Davidsen, Tanja M.; Mora, Marirosa; Scarselli, Maria; Margarit y Ros, Immaculada; Peterson, Jeremy D.; Hauser, Christopher R.; Sundaram, Jaideep P.; Nelson, William C.; Madupu, Ramana; Brinkac, Lauren M.; Dodson, Robert J.; Rosovitz, Mary J.; Sullivan, Steven A.; Daugherty, Sean C.; Haft, Daniel H.; Selengut, Jeremy; Gwinn, Michelle L.; Zhou, Liwei; Zafar, Nikhat; Khouri, Hoda; Radune, Diana; Dimitrov, George; Watkins, Kisha; O'Connor, Kevin J. B.; Smith, Shannon; Utterback, Teresa R.; White, Owen; Rubens, Craig E.; Grandi, Guido; Madoff, Lawrence C.; Kasper, Dennis L.; Telford, John L.; Wessels, Michael R.; Rappuoli, Rino; Fraser, Claire M.
2005-01-01
The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for ≈80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes. PMID:16172379
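The pan-genome decomposition described in this abstract can be illustrated with plain set operations. This is only a sketch: it assumes genes have already been clustered into shared families with common identifiers, which is the hard part in practice, and the strain and gene names below are hypothetical.

```python
def pan_genome_partition(strain_genes):
    """Split the gene families of several strains into core and dispensable sets.

    strain_genes maps a strain name to the gene-family identifiers it carries.
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        raise ValueError("at least one strain is required")
    pan = set().union(*gene_sets)        # every gene seen in any strain
    core = set.intersection(*gene_sets)  # genes shared by all strains
    dispensable = pan - core             # partially shared or strain-specific
    return core, dispensable, pan


# Hypothetical strains and gene families, for illustration only.
strains = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g2"},
}
core, dispensable, pan = pan_genome_partition(strains)
print(sorted(core), sorted(dispensable), len(pan))  # ['g1', 'g2'] ['g3', 'g4'] 4
```

Recomputing `len(pan)` as strains are added one at a time gives the kind of accumulation curve on which an extrapolation like the one mentioned in the abstract would be based.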
Vakili Azghandi, Masoume; Nasiri, Mohammadreza; Shamsa, Ali; Jalali, Mohsen; Shariati, Mohammad Mahdi
2016-04-01
The SRY gene (SRY) provides instructions for making a transcription factor called the sex-determining region Y protein. The sex-determining region Y protein causes a fetus to develop as a male. In this study, SRY sequences of 15 species, including human, chimpanzee, dog, pig, rat, cattle, buffalo, goat, sheep, horse, zebra, frog, urial, dolphin and killer whale, were used to determine bioinformatic differences. Nucleotide sequences of SRY were retrieved from the NCBI databank. Bioinformatic analysis of SRY was performed with CLC Main Workbench version 5.5, ClustalW (http://www.ebi.ac.uk/clustalw/) and MEGA6 software. The multiple sequence alignment results indicated that SRY protein sequences from Orcinus orca (killer whale) and Tursiops aduncus (dolphin) have the least genetic distance, 0.33, among these 15 species and are 99.67% identical at the amino acid level. Homo sapiens and Pan troglodytes (chimpanzee) have the next lowest genetic distance, 1.35, and are 98.65% identical at the amino acid level. These findings indicate that the SRY proteins are conserved in the 15 species, and their evolutionary relationships are similar.
Carvalho, Natalia D. M.; Carmo, Edson; Neves, Rogerio O.; Schneider, Carlos Henrique; Gross, Maria Claudia
2016-01-01
Differences in heterochromatin distribution patterns and its composition were observed in Amazonian teiid species. Studies have shown repetitive DNA harbors heterochromatic blocks which are located in centromeric and telomeric regions in Ameiva ameiva (Linnaeus, 1758), Kentropyx calcarata (Spix, 1825), Kentropyx pelviceps (Cope, 1868), and Tupinambis teguixin (Linnaeus, 1758). In Cnemidophorus sp.1, repetitive DNA has multiple signals along all chromosomes. The aim of this study was to characterize moderately and highly repetitive DNA sequences by Cot1-DNA from Ameiva ameiva and Cnemidophorus sp.1 genomes through cloning and DNA sequencing, as well as mapping them chromosomally to better understand their organization and genome dynamics. The results of sequencing of DNA libraries obtained by Cot1-DNA showed that different microsatellites, transposons, retrotransposons, and some gene families also comprise the fraction of repetitive DNA in the teiid species. FISH using Cot1-DNA probes isolated from both Ameiva ameiva and Cnemidophorus sp.1 showed these sequences mainly located in heterochromatic centromeric and telomeric regions in Ameiva ameiva, Kentropyx calcarata, Kentropyx pelviceps, and Tupinambis teguixin chromosomes, indicating they play structural and functional roles in the genome of these species. In Cnemidophorus sp.1, Cot1-DNA probe isolated from Ameiva ameiva had multiple interstitial signals on chromosomes, whereas mapping of Cot1-DNA isolated from the Ameiva ameiva and Cnemidophorus sp.1 highlighted centromeric regions of some chromosomes. Thus, the data obtained showed that many repetitive DNA classes are part of the genome of Ameiva ameiva, Cnemidophorus sp.1, Kentropyx calcarata, Kentropyx pelviceps, and Tupinambis teguixin, and these sequences are shared among the analyzed teiid species, but they were not always allocated at the same chromosome position. PMID:27551343
Bellerophon: a program to detect chimeric sequences in multiple sequence alignments.
Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip
2004-09-22
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaptation of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments. Bellerophon is available as an interactive web server at http://foo.maths.uq.edu.au/~huber/bellerophon.pl
Badelita, S; Dobrea, C; Colita, A; Dogaru, M; Dragomir, M; Jardan, C; Coriu, D
2015-01-01
Multiple myeloma and JAK2 positive chronic myeloproliferative neoplasms are hematologic malignancies with a completely different cellular origin. Two cases of simultaneous occurrence of multiple myeloma, one with primary myelofibrosis and another one with essential thrombocythemia, are reported in this article. In such cases, an accurate diagnosis requires molecular testing, including gene sequencing and differential diagnosis of pancytosis associated with splenic amyloidosis. In general, in such cases of two coexisting malignant hematologic diseases, treatment of the most aggressive one is recommended. For our two cases, it was decided to start a Velcade based therapy. The main concern was the medullar toxicity, especially when multiple myeloma was associated with primary myelofibrosis. Abbreviations: JAK2 = Janus kinase 2 gene, PMF = primary myelofibrosis, MPNs = myeloproliferative neoplasms, ET = essential thrombocythemia, PV = polycythemia vera, MM = multiple myeloma, WBC = white blood cells, Hb = haemoglobin, Ht = haematocrit, Plt = platelets, BMB = bone marrow biopsy, CBC = blood cell count, CT = computerized tomography, LAP = leukocyte alkaline phosphatase, MGUS = monoclonal gammopathy of undetermined significance. PMID:25914740
DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method.
Balech, Bachir; Monaco, Alfonso; Perniola, Michele; Santamaria, Monica; Donvito, Giacinto; Vicario, Saverio; Maggi, Giorgio; Pesole, Graziano
2018-01-01
Multiple sequence alignment (MSA) is a fundamental component in many DNA sequence analyses including metagenomics studies and phylogeny inference. When guided by protein profiles, DNA multiple alignments assume a higher precision and robustness. Here we present details of the use of the upgraded version of MSA-PAD (2.0), which is a DNA multiple sequence alignment framework able to align DNA sequences coding for single/multiple protein domains guided by PFAM or user-defined annotations. MSA-PAD has two alignment strategies, called "Gene" and "Genome," accounting for coding domains order and genomic rearrangements, respectively. Novel options were added to the present version, where the MSA can be guided by protein profiles provided by the user. This allows MSA-PAD 2.0 to run faster and to add custom protein profiles sometimes not present in PFAM database according to the user's interest. MSA-PAD 2.0 is currently freely available as a Web application at https://recasgateway.cloud.ba.infn.it/ .
VizieR Online Data Catalog: NGC 6802 dwarf cluster members and non-members (Tang+, 2017)
NASA Astrophysics Data System (ADS)
Tang, B.; Geisler, D.; Friel, E.; Villanova, S.; Smiljanic, R.; Casey, A. R.; Randich, S.; Magrini, L.; San, Roman I.; Munoz, C.; Cohen, R. E.; Mauro, F.; Bragaglia, A.; Donati, P.; Tautvaisiene, G.; Drazdauskas, A.; Zenoviene, R.; Snaith, O.; Sousa, S.; Adibekyan, V.; Costado, M. T.; Blanco-Cuaresma, S.; Jimenez-Esteban, F.; Carraro, G.; Zwitter, T.; Francois, P.; Jofre, P.; Sordo, R.; Gilmore, G.; Flaccomio, E.; Koposov, S.; Korn, A. J.; Lanzafame, A. C.; Pancino, E.; Bayo, A.; Damiani, F.; Franciosini, E.; Hourihane, A.; Lardo, C.; Lewis, J.; Monaco, L.; Morbidelli, L.; Prisinzano, L.; Sacco, G.; Worley, C. C.; Zaggia, S.
2016-11-01
The dwarf stars in NGC 6802 observed by GIRAFFE spectrograph are separated into four tables: 1. cluster members in the lower main sequence; 2. cluster members in the upper main sequence; 3. non-member dwarfs in the lower main sequence; 4. non-member dwarfs in the upper main sequence. The star coordinates, V band magnitude, V-I color, and radial velocity are given. (4 data files).
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment
2013-01-01
Background: Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so-called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results: In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion: Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.
Nagar, Anurag; Hahsler, Michael
2013-01-01
Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
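The idea of approximating edit distance with p-mer frequency counts can be sketched as below. This is a minimal illustration and not the authors' implementation: it assumes a Manhattan distance between p-mer count profiles (the abstract does not say which distance is used), it omits the data stream clustering step, and the function names and sequences are hypothetical.

```python
from collections import Counter


def pmer_counts(segment, p=3):
    """Count every overlapping substring of length p in the segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def profile_distance(seg_a, seg_b, p=3):
    """Manhattan distance between the p-mer count profiles of two segments."""
    counts_a, counts_b = pmer_counts(seg_a, p), pmer_counts(seg_b, p)
    all_pmers = counts_a.keys() | counts_b.keys()
    return sum(abs(counts_a[k] - counts_b[k]) for k in all_pmers)


print(profile_distance("ACGTACGT", "ACGTACGT"))  # 0
print(profile_distance("ACGTACGT", "ACGTTCGT"))  # 6
```

Each profile is built in a single pass over the segment, so comparing segments this way avoids the quadratic cost of a pairwise alignment, which is consistent with the linear scaling the abstract reports.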
3D Micro-tomography on Aggregates from the 2014-2015 Eruption of Hunga Tonga-Hunga Ha'apai Volcano
NASA Astrophysics Data System (ADS)
Colombier, M.; Scheu, B.; Cronin, S. J.; Tost, M.; Dobson, K. J.; Dingwell, D. B.
2016-12-01
In December 2014-January 2015, a surtseyan eruption at Hunga Tonga-Hunga Ha'apai volcano (Tonga) formed a new island. Three main eruptive phases were distinguished by observation and deposits: (i) mound and cone construction, involving collapse of 300-600 m-high wet tephra jets, grain flows, slope-remobilisation and energetic surges, with little or no convective plume; (ii) the upper cone-building phase with lower jets (mainly <300 m) but greater ash production (weak, steam-rich plumes to 6 km) and weak surges; and (iii) a final phase with weak surge, fall and ballistic deposits with more vesicular pyroclasts producing proximal capping deposits. Most sampled deposits contain ash, lapilli and bombs, and lapilli-sized aggregates are ubiquitous. We used high-resolution 3D X-ray microcomputed tomography (XCT) to quantify the grain size distribution (GSD) and porosity by sampling multiple stratigraphic units within the main eruptive sequences. We visualized and quantified the internal structure of the aggregates to understand the evolution of this surtseyan eruption. We present here an overview of the textural information: porosity, vesicle size distribution and morphology as well as the variability of the aggregation features. Aggregates from the fall deposits of the early wet phase are mostly loosely packed, poorly-structured ash clusters. Aggregates from the early surge sequence and the main cone building phase dominantly exhibit a central particle coated by ash cluster material. Vesicles in the particles from the early fall deposits tend to be smaller and more isolated than in the particles from the surge sequence and the main cone building phase. The GSD of aggregates obtained by XCT is highly valuable to correct the total GSD of volcaniclastic deposits. 
The strong variations in the aggregation features across the eruption suggest a range of different formation and deposition mechanisms related to varying degrees of magma-water-interaction, which changed the morphology and textural properties of the individual particles.
DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors.
Schmollinger, Martin; Nieselt, Kay; Kaufmann, Michael; Morgenstern, Burkhard
2004-09-09
Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. By distributing sub-routines to multiple processors, the running time of DIALIGN can be substantially reduced. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
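Point (a) above can be illustrated with a minimal Python sketch. It is not DIALIGN's code: `pair_score` is a hypothetical stand-in for a real pairwise alignment (it only counts identical positions), and `score_all_pairs` and `workers` are illustrative names. The sketch shows only that independent pairwise tasks can be handed to a worker pool and that the result does not depend on how many workers are used.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def pair_score(pair):
    """Stand-in for a pairwise alignment (assumption): count identical positions."""
    a, b = pair
    return sum(x == y for x, y in zip(a, b))


def score_all_pairs(seqs, workers=4):
    """Score every unordered sequence pair; each pair is an independent task."""
    index_pairs = list(combinations(range(len(seqs)), 2))
    jobs = [(seqs[i], seqs[j]) for i, j in index_pairs]
    # pool.map preserves input order, so scores line up with index_pairs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(pair_score, jobs))
    return dict(zip(index_pairs, scores))


seqs = ["ACGT", "ACGA", "TCGA"]
result = score_all_pairs(seqs, workers=3)
```

A thread pool is used here only to keep the sketch portable; for CPU-bound alignment work one would more likely substitute `concurrent.futures.ProcessPoolExecutor`, which offers the same `map` interface.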
Computer-aided visualization and analysis system for sequence evaluation
Chee, M.S.
1998-08-18
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device. 27 figs.
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.
2004-05-11
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.
1998-08-18
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.
2003-08-19
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro
2012-01-01
Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.
eShadow: A tool for comparing closely related sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.
2004-01-15
Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualization of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/
NASA Astrophysics Data System (ADS)
Tamburini, Fabrizio; Licata, Ignazio
2017-09-01
The search for dark matter (DM) is one of the most active and challenging areas of current research. Possible DM candidates are ultralight fields such as axions and weakly interacting massive particles (WIMPs). Axions piled up in the center of stars are supposed to generate matter/DM configurations with oscillating geometries at a very rapid frequency, which is a multiple of the axion mass m_B (Brito et al (2015); Brito et al (2016)). Borra and Trottier (2016) recently found peculiar ultrafast periodic spectral modulations in 236 main sequence stars in the sample of 2.5 million spectra of galactic halo stars of the Sloan Digital Sky Survey (~1% of main sequence stars in the F-K spectral range) that were interpreted as optical signals from extraterrestrial civilizations, suggesting them as possible candidates for the search for extraterrestrial intelligence (SETI) program. We argue, instead, that this could be the first indirect evidence of bosonic axion-like DM fields inside main sequence stars, with a stable radiative nucleus, where a stable DM core can be hosted. These oscillations were not observed in earlier stellar spectral classes probably because of the impossibility of starting a stable oscillatory regime due to the presence of chaotic motions in their convective nuclei. The axion mass values, (50 < m_B < 2.4 × 10^3) μeV, obtained from the frequency range observed by Borra and Trottier, (0.6070 < f < 0.6077) THz, agree with the recent theoretical results from high-temperature lattice quantum chromodynamics (Borsanyi et al (2016); Borsanyi et al (2016b)).
X-Raying the Coronae of HD 155555
NASA Technical Reports Server (NTRS)
Lalitha, S.; Singh, K.P.; Drake, S. A.; Kashyap, V.
2015-01-01
We present an analysis of the high-resolution Chandra observation of the multiple system HD 155555 (an RS CVn type binary system, HD 155555 AB, and its spatially resolved low-mass companion HD 155555 C). This is an intriguing system which shows properties of both an active pre-main sequence star and a synchronised (main sequence) binary. We obtain the emission measure distribution, temperature structures, plasma densities, and abundances of this system and compare them with the coronal properties of other young/active stars. HD 155555 AB and HD 155555 C produce copious X-ray emission with log L_x of 30.54 and 29.30, respectively, in the 0.3-6.0 kiloelectronvolt energy band. The light curves of individual stars show variability on timescales of a few minutes to hours. We analyse the dispersed spectra and reconstruct the emission measure distribution using spectral line analysis. The resulting elemental abundances exhibit an inverse first ionisation potential effect in both cases. An analysis of He-like triplets yields a range of coronal electron densities of 10^10 - 10^13 per cubic centimeter. Since HD 155555 AB is classified both as an RS CVn and a PMS star, we compare our results with those of other slightly older active main-sequence stars and T Tauri stars, which indicates that the coronal properties of HD 155555 AB closely resemble those of an older RS CVn binary rather than a younger PMS star. Our results also suggest that the properties of HD 155555 C are very similar to those of other active M dwarfs.
A NEAR-INFRARED STUDY OF THE STAR-FORMING REGION RCW 34
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van der Walt, D. J.; De Villiers, H. M.; Czanik, R. J.
2012-07-15
We report the results of a near-infrared imaging study of a 7.8 × 7.8 arcmin^2 region centered on the 6.7 GHz methanol maser associated with the RCW 34 star-forming region using the 1.4 m IRSF telescope at Sutherland. A total of 1283 objects were detected simultaneously in J, H, and K for an exposure time of 10,800 s. The J - H, H - K two-color diagram revealed a strong concentration of more than 700 objects with colors similar to what is expected of reddened classical T Tauri stars. The distribution of the objects on the K versus J - K color-magnitude diagram is also suggestive that a significant fraction of the 1283 objects is made up of lower mass pre-main-sequence stars. We also present the luminosity function for the subset of about 700 pre-main-sequence stars and show that it suggests ongoing star formation activity for about 10^7 years. An examination of the spatial distribution of the pre-main-sequence stars shows that the fainter (older) part of the population is more dispersed over the observed region and the brighter (younger) subset is more concentrated around the position of the O8.5V star. This suggests that the physical effects of the O8.5V star and the two early B-type stars on the remainder of the cloud out of which they formed could have played a role in the onset of the more recent episode of star formation in RCW 34.
2013-01-01
Background Wheat gluten has unique nutritional and technological characteristics, but is also a major trigger of allergies and intolerances. One of the most severe diseases caused by gluten is coeliac disease. The peptides produced in the digestive tract by the incomplete digestion of gluten proteins trigger the disease. The majority of the epitopes responsible reside in the gliadin fraction of gluten. The location of the multiple gliadin genes in blocks has to date complicated their elimination by classical breeding techniques or by the use of biotechnological tools. As an approach to silence multiple gliadin genes we have produced 38 transgenic lines of bread wheat containing combinations of two endosperm-specific promoters and three different inverted repeat sequences to silence three fractions of gliadins by RNA interference. Results The effects of the RNA interference constructs on the content of the gluten proteins, total protein and starch, thousand seed weights and SDSS quality tests of flour were analyzed in these transgenic lines in two consecutive years. The characteristics of the inverted repeat sequences were the main factor that determined the efficiency of silencing. The promoter used had less influence on silencing, although a synergy in silencing efficiency was observed when the two promoters were used simultaneously. Genotype and the environment also influenced silencing efficiency. Conclusions We conclude that to obtain wheat lines with an optimum reduction of toxic gluten epitopes one needs to take into account the factors of inverted repeat sequences design, promoter choice and also the wheat background used. PMID:24044767
Houseknecht, D.W.; Bird, K.J.
2004-01-01
Beaufortian strata (Jurassic-Lower Cretaceous) in the National Petroleum Reserve in Alaska (NPRA) are a focus of exploration since the 1994 discovery of the nearby Alpine oil field (>400 MMBO). These strata include the Kingak Shale, a succession of depositional sequences influenced by rift opening of the Arctic Ocean Basin. Interpretation of sequence stratigraphy and depositional facies from a regional two-dimensional seismic grid and well data allows the definition of four sequence sets that each displays unique stratal geometries and thickness trends across NPRA. A Lower to Middle Jurassic sequence set includes numerous transgressive-regressive sequences that collectively built a clastic shelf in north-central NPRA. Along the south-facing, lobate shelf margin, condensed shales in transgressive systems tracts downlap and coalesce into a basinal condensed section that is likely an important hydrocarbon source rock. An Oxfordian-Kimmeridgian sequence set, deposited during pulses of uplift on the Barrow arch, includes multiple transgressive-regressive sequences that locally contain well-winnowed, shoreface sandstones at the base of transgressive systems tracts. These shoreface sandstones and overlying shales, deposited during maximum flooding, form stratigraphic traps that are the main objective of exploration in the Alpine play in NPRA. A Valanginian sequence set includes at least two transgressive-regressive sequences that display relatively distal characteristics, suggesting high relative sea level. An important exception is the presence of a basal transgressive systems tract that locally contains shoreface sandstones of reservoir quality. A Hauterivian sequence set includes two transgressive-regressive sequences that constitute a shelf-margin wedge developed as the result of tectonic uplift along the Barrow arch during rift opening of the Arctic Ocean Basin. 
This sequence set displays stratal geometries suggesting incision and synsedimentary collapse of the shelf margin. © 2004. The American Association of Petroleum Geologists. All rights reserved.
A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band
NASA Astrophysics Data System (ADS)
Zou, Quan; Shan, Xiao; Jiang, Yi
Multiple sequence alignment is one of the most important topics in computational biology, but existing methods cannot yet handle large data sets. With the development of copy-number variant (CNV) and single nucleotide polymorphism (SNP) research, many researchers want to align large numbers of similar sequences to detect CNVs and SNPs. In this paper, we propose a novel multiple sequence alignment algorithm based on affine gap penalty and k-band. It aligns more quickly and accurately, which will be helpful for mining CNVs and SNPs. Experiments demonstrate the performance of our algorithm.
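The two ingredients named in the abstract, an affine gap penalty and a k-band, can be sketched for a single pair of sequences as follows. This is a generic textbook-style illustration (three-state dynamic programming restricted to cells with |i - j| <= k), not the authors' center star algorithm. The function name and the scoring values (match 1, mismatch -1, gap open -2, gap extend -1) are assumptions chosen for the example.

```python
NEG = float("-inf")


def banded_affine_score(a, b, k, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Global alignment score with affine gaps, computed only inside a band.

    A gap of length L costs gap_open + (L - 1) * gap_extend.
    Cells with |i - j| > k are never filled and stay at -infinity.
    """
    n, m = len(a), len(b)
    if k < abs(n - m):
        raise ValueError("band too narrow to reach the end cell")
    # M: last column is a (mis)match; X: a[i-1] against a gap; Y: b[j-1] against a gap.
    # Full matrices are allocated for readability; a memory-saving version
    # would store only the 2k + 1 band cells per row.
    M = [[NEG] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, min(n, k) + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, min(m, k) + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(max(1, i - k), min(m, i + k) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] + gap_open, X[i - 1][j] + gap_extend, Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open, Y[i][j - 1] + gap_extend, X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])
```

With the assumed scores, `banded_affine_score("ACGT", "AGT", 1)` aligns three matching columns and opens one gap. The band restricts the inner loop to at most 2k + 1 cells per row, which is where the speed-up for similar sequences comes from.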
Did A Planet Survive A Post-Main Sequence Evolutionary Event?
NASA Astrophysics Data System (ADS)
Sorber, Rebecca; Jang-Condell, Hannah; Zimmerman, Mara
2018-06-01
GL 86 is a star system approximately 10 pc away with a main sequence K-type ~0.77 M⊙ star (GL 86A) and a white dwarf ~0.49 M⊙ companion (GL86 B). The system has a ~18.4 AU semi-major axis, an orbital period of ~353 yrs, and an eccentricity of ~0.39. A 4.5 MJ planet orbits the main sequence star with a semi-major axis of 0.113 AU and an orbital period of 15.76 days, in a near-circular orbit with an eccentricity of 0.046. If we assume that this planet was formed during the time when the white dwarf was a main sequence star, it would be difficult for the planet to have remained in a stable orbit during the post-main sequence evolution of GL86 B. The post-main sequence evolution with planet survival will be examined by modeling using the program Mercury (Chambers 1999). Using the model, we examine the origins of the planet: whether it formed before or after the post-main sequence evolution of GL86B. The modeling will give us insight into the dynamical evolution of not only the binary star system but also the planet’s life cycle.
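As a quick plausibility check on the planet's quoted orbit, Kepler's third law in solar units (P in years, a in AU, M in solar masses) can be evaluated directly. The sketch below is illustrative and not part of the abstract: it assumes the planet's mass is negligible next to the 0.77 M⊙ host and uses a 365.25-day year. The function name is hypothetical.

```python
import math


def orbital_period_days(a_au, m_total_msun):
    """Kepler's third law in solar units: P[yr]^2 = a[AU]^3 / M[Msun]."""
    period_years = math.sqrt(a_au ** 3 / m_total_msun)
    return period_years * 365.25


# Planet around the K-type primary: a = 0.113 AU, host mass ~0.77 Msun.
planet_period = orbital_period_days(0.113, 0.77)
```

Under these assumptions the result lands within about a tenth of a day of the 15.76 days quoted above.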
Dynamical mass and multiplicity constraints on co-orbital bodies around stars
NASA Astrophysics Data System (ADS)
Veras, Dimitri; Marsh, Thomas R.; Gänsicke, Boris T.
2016-09-01
Objects transiting near or within the disruption radius of both main-sequence (e.g. KOI 1843) and white dwarf (WD 1145+017) stars are now known. Upon fragmentation or disintegration, these planets or asteroids may produce co-orbital configurations of nearly equal mass objects. However, as evidenced by the co-orbital objects detected by transit photometry in the WD 1145+017 system, these bodies are largely unconstrained in size, mass, and total number (multiplicity). Motivated by potential future similar discoveries, we perform N-body simulations to demonstrate if and how debris masses and multiplicity may be bounded due to second-to-minute deviations and the resulting accumulated phase shifts in the osculating orbital period amongst multiple co-orbital equal point masses. We establish robust lower and upper mass bounds as a function of orbital period deviation, but find the constraints on multiplicity to be weak. We also quantify the fuzzy instability boundary, and show that mutual collisions occur in less than 5, 10, and 20 per cent of our simulations for masses of 10^21, 10^22, and 10^23 kg. Our results may provide useful initial rough constraints on other stellar systems with multiple co-orbital bodies.
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.
1999-10-26
A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.
2001-06-05
A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).
Mink, S; Härtig, E; Jennewein, P; Doppler, W; Cato, A C
1992-01-01
Mouse mammary tumor virus (MMTV) is a milk-transmitted retrovirus involved in the neoplastic transformation of mouse mammary gland cells. The expression of this virus is regulated by mammary cell type-specific factors, steroid hormones, and polypeptide growth factors. Sequences for mammary cell-specific expression are located in an enhancer element in the extreme 5' end of the long terminal repeat region of this virus. This enhancer, when cloned in front of the herpes simplex thymidine kinase promoter, endows the promoter with mammary cell-specific response. Using functional and DNA-protein-binding studies with constructs mutated in the MMTV long terminal repeat enhancer, we have identified two main regulatory elements necessary for the mammary cell-specific response. These elements consist of binding sites for a transcription factor in the family of CTF/NFI proteins and the transcription factor mammary cell-activating factor (MAF) that recognizes the sequence G Pu Pu G C/G A A G G/T. Combinations of CTF/NFI- and MAF-binding sites or multiple copies of either one of these binding sites but not solitary binding sites mediate mammary cell-specific expression. The functional activities of these two regulatory elements are enhanced by another factor that binds to the core sequence ACAAAG. Interdigitated binding sites for CTF/NFI, MAF, and/or the ACAAAG factor are also found in the 5' upstream regions of genes encoding whey milk proteins from different species. These findings suggest that mammary cell-specific regulation is achieved by a concerted action of factors binding to multiple regulatory sites. PMID:1328867
Lim, Shu Yong; Yap, Kien-Pong; Teh, Cindy Shuan Ju; Jabar, Kartini Abdul; Thong, Kwai Lin
2017-04-01
Enterococcus faecium is both a commensal of the human intestinal tract and an opportunistic pathogen. The increasing incidence of enterococcal infections is mainly due to the ability of this organism to develop resistance to multiple antibiotics, including vancomycin. The aim of this study was to perform comparative genome analyses on four vancomycin-resistant Enterococcus faecium (VREfm) strains isolated from two fatal cases in a tertiary hospital in Malaysia. Two sequence types, ST80 and ST203, were identified which belong to the clinically important clonal complex (CC) 17. This is the first report on the emergence of ST80 strains in Malaysia. Three of the studied strains (VREr5, VREr6, VREr7) were each isolated from different body sites of a single patient (patient Y) and had different PFGE patterns. While VREr6 and VREr7 were phenotypically and genotypically similar, the initial isolate, VREr5, was found to be more similar to VRE2 isolated from another patient (patient X), in terms of the genome contents, sequence types and phylogenomic relationship. Both the clinical records and genome sequence data suggested that patient Y was infected by multiple strains from different clones and that the strain that infected patient Y could have derived from the same clone as that from patient X. These multidrug-resistant strains harbored a number of virulence genes such as the epa locus and pilus-associated genes which could enhance their persistence. Apart from that, a homolog of the E. faecalis bee locus was identified in VREr5 which might be involved in biofilm formation. Overall, our comparative genomic analyses provided insight into the genetic relatedness, as well as the virulence potential, of the four clinical strains. Copyright © 2016 Elsevier B.V. All rights reserved.
Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment
2011-01-01
Background Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510
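The vertical-decomposition step described above (cut the sequences into vertical parts, align each part separately, then join the aligned parts) can be sketched in a few lines of Python. This is only an illustration of the data flow, not VDGA itself: the genetic algorithm and the guide-tree aligner are omitted, and `pad_align` is a hypothetical stand-in that merely right-pads each block with gaps. All function names are assumptions.

```python
def split_vertically(seqs, parts):
    """Cut every sequence into `parts` contiguous pieces at proportional positions."""
    blocks = [[] for _ in range(parts)]
    for s in seqs:
        cuts = [round(len(s) * p / parts) for p in range(parts + 1)]
        for p in range(parts):
            blocks[p].append(s[cuts[p]:cuts[p + 1]])
    return blocks


def pad_align(block):
    """Stand-in aligner (assumption): right-pad with gaps to a common width."""
    width = max(len(s) for s in block)
    return [s.ljust(width, "-") for s in block]


def vertical_decomposition_align(seqs, parts, align_block=pad_align):
    """Align each vertical block independently, then concatenate the blocks row by row."""
    aligned_blocks = [align_block(b) for b in split_vertically(seqs, parts)]
    return ["".join(block[i] for block in aligned_blocks) for i in range(len(seqs))]


seqs = ["ACGTAC", "ACGT", "AGTACGG"]
aligned = vertical_decomposition_align(seqs, parts=2)
```

Any real block aligner could be passed as `align_block`, provided it returns equal-length rows in the input order. The concatenated rows then form a valid alignment of the full sequences.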
Dynamical investigations of the multiple stars
NASA Astrophysics Data System (ADS)
Kiyaeva, Olga V.; Zhuchkov, Roman Ya.
2017-11-01
Two multiple stars, the quadruple star - Bootis (ADS 9173) and the triple star T Tauri, were investigated. The visual double star - Bootis was studied on the basis of observations made with the Pulkovo 26-inch refractor in 1982-2013. An invisible satellite of component A was discovered thanks to a long-term uniform series of observations. Its orbital period is 20 ± 2 years. The known invisible satellite of component B, with a period of nearly 5 years, was confirmed by high-precision CCD observations. The astrometric orbits of both components were calculated. The orbits of the inner and outer pairs of the pre-main sequence binary T Tauri were calculated on the basis of high-precision observations with the VLT and the Keck II Telescope. This weakly hierarchical triple system is stable with a probability of more than 70%.
Novel genomic findings in multiple myeloma identified through routine diagnostic sequencing.
Ryland, Georgina L; Jones, Kate; Chin, Melody; Markham, John; Aydogan, Elle; Kankanige, Yamuna; Caruso, Marisa; Guinto, Jerick; Dickinson, Michael; Prince, H Miles; Yong, Kwee; Blombery, Piers
2018-05-14
Multiple myeloma is a genomically complex haematological malignancy with many genomic alterations recognised as important in diagnosis, prognosis and therapeutic decision making. Here, we provide a summary of genomic findings identified through routine diagnostic next-generation sequencing at our centre. A cohort of 86 patients with multiple myeloma underwent diagnostic sequencing using a custom hybridisation-based panel targeting 104 genes. Sequence variants, genome-wide copy number changes and structural rearrangements were detected using a bioinformatics pipeline developed in-house. At least one mutation was found in 69 (80%) patients. Frequently mutated genes included TP53 (36%), KRAS (22.1%), NRAS (15.1%), FAM46C/DIS3 (8.1%) and TET2/FGFR3 (5.8%), including multiple mutations not previously described in myeloma. Importantly, we observed TP53 mutations in the absence of a 17p deletion in 8% of the cohort, highlighting the need for sequencing-based assessment in addition to cytogenetics to identify these high-risk patients. Multiple novel copy number changes and immunoglobulin heavy chain translocations are also discussed. Our results demonstrate that many clinically relevant genomic findings remain in multiple myeloma which have not yet been identified through large-scale sequencing efforts, and provide important mechanistic insights into plasma cell pathobiology. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
NASA Technical Reports Server (NTRS)
Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris
2008-01-01
The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there is also a considerable time and effort savings. With the use of The Sequence History Update Tool what previously took minutes is now done in less than 30 seconds, and now provides a more accurate archival record of the sequence commanding for MRO.
Differential evolution-simulated annealing for multiple sequence alignment
NASA Astrophysics Data System (ADS)
Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.
2017-10-01
Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.
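Two of the three objectives named above, the non-gaps percentage and the number of totally conserved columns, are simple to compute for a given alignment. The Python sketch below shows one plausible reading of those terms. The exact definitions used in DESA-MSA are not given in the abstract, so the function names and the choice to exclude all-gap columns from "conserved" are assumptions.

```python
def non_gap_percentage(alignment, gap="-"):
    """Percentage of alignment cells that hold a residue rather than a gap."""
    total = sum(len(row) for row in alignment)
    gaps = sum(row.count(gap) for row in alignment)
    return 100.0 * (total - gaps) / total


def totally_conserved_columns(alignment, gap="-"):
    """Count columns in which every row carries the same non-gap residue."""
    return sum(
        1
        for col in zip(*alignment)
        if len(set(col)) == 1 and col[0] != gap
    )


aln = ["AC-GT", "ACAGT", "AC-GA"]
pct = non_gap_percentage(aln)            # 13 of 15 cells are residues
conserved = totally_conserved_columns(aln)
```

In a multi-objective setting such as the one described, values like these would be evaluated for each candidate alignment and traded off against one another rather than collapsed into a single score.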
The 2016 Kumamoto earthquake sequence.
Kato, Aitaro; Nakamura, Kouji; Hiyama, Yohei
2016-01-01
Beginning in April 2016, a series of shallow, moderate to large earthquakes with associated strong aftershocks struck the Kumamoto area of Kyushu, SW Japan. An Mj 7.3 mainshock occurred on 16 April 2016, close to the epicenter of an Mj 6.5 foreshock that occurred about 28 hours earlier. The intense seismicity released the accumulated elastic energy by right-lateral strike slip, mainly along two known, active faults. The mainshock rupture propagated along multiple fault segments with different geometries. The faulting style is reasonably consistent with regional deformation observed on geologic timescales and with the stress field estimated from seismic observations. One striking feature of this sequence is intense seismic activity, including a dynamically triggered earthquake in the Oita region. Following the mainshock rupture, postseismic deformation has been observed, as well as expansion of the seismicity front toward the southwest and northwest.
The 2016 Kumamoto earthquake sequence
KATO, Aitaro; NAKAMURA, Kouji; HIYAMA, Yohei
2016-01-01
Beginning in April 2016, a series of shallow, moderate to large earthquakes with associated strong aftershocks struck the Kumamoto area of Kyushu, SW Japan. An Mj 7.3 mainshock occurred on 16 April 2016, close to the epicenter of an Mj 6.5 foreshock that occurred about 28 hours earlier. The intense seismicity released the accumulated elastic energy by right-lateral strike slip, mainly along two known, active faults. The mainshock rupture propagated along multiple fault segments with different geometries. The faulting style is reasonably consistent with regional deformation observed on geologic timescales and with the stress field estimated from seismic observations. One striking feature of this sequence is intense seismic activity, including a dynamically triggered earthquake in the Oita region. Following the mainshock rupture, postseismic deformation has been observed, as well as expansion of the seismicity front toward the southwest and northwest. PMID:27725474
Plastome data reveal multiple geographic origins of Quercus Group Ilex
Grimm, Guido W.; Papini, Alessio; Vessella, Federico; Cardoni, Simone; Tordoni, Enrico; Piredda, Roberta; Franc, Alain; Denk, Thomas
2016-01-01
Nucleotide sequences from the plastome are currently the main source for assessing taxonomic and phylogenetic relationships in flowering plants and their historical biogeography at all hierarchical levels. One major exception is the large and economically important genus Quercus (oaks). Whereas differentiation patterns of the nuclear genome are in agreement with morphology and the fossil record, diversity patterns in the plastome are at odds with established taxonomic and phylogenetic relationships. However, the extent and evolutionary implications of this incongruence have yet to be fully uncovered. The DNA sequence divergence of four Euro-Mediterranean Group Ilex oak species (Quercus ilex L., Q. coccifera L., Q. aucheri Jaub. & Spach., Q. alnifolia Poech.) was explored at three chloroplast markers (rbcL, trnK/matK, trnH-psbA). Phylogenetic relationships were reconstructed including worldwide members of an additional 55 species representing all Quercus subgeneric groups. Family and order sequence data were harvested from gene banks to better frame the observed divergence in larger taxonomic contexts. We found a strong geographic sorting in the focal group and the genus in general that is entirely decoupled from species boundaries. High plastid divergence in members of Quercus Group Ilex, including haplotypes shared with related, but long isolated oak lineages, points towards multiple geographic origins of this group of oaks. The results suggest that incomplete lineage sorting and repeated phases of asymmetrical introgression among ancestral lineages of Group Ilex and two other main Groups of Eurasian oaks (Cyclobalanopsis and Cerris) caused this complex pattern. Comparison with the current phylogenetic synthesis also suggests an initial high- versus mid-latitude biogeographic split within Quercus. 
High plastome plasticity of Group Ilex reflects geographic area disruptions, possibly linked with high tectonic activity of past and modern distribution ranges, that did not leave imprints in the nuclear genome of modern species and infrageneric lineages. PMID:27123376
Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach
NASA Astrophysics Data System (ADS)
Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.
2012-10-01
In spite of the medical advances in recent years, the world is in need of different sources to counter certain health issues. Ribosome Inactivating Proteins (RIPs) were found to be one among them. To provide easy access to information about RIPs, they need to be analysed with a view to constructing a database on RIPs. Also, multiple sequence alignment was done to screen for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and were further analysed using pairwise and multiple sequence alignment. Analysis shows that 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. Information on the sequence length of the various RIPs, and on whether the full or only a partial sequence is available, was also collected. The multiple sequence alignment of 37 type I RIPs using the online server Multalin indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups, namely Group I and Group II, were carried out and the consensus level was found to be 98%, 98% and 90% respectively.
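The idea of counting conserved residues in a multiple sequence alignment can be sketched as below. This is only an illustrative Python sketch, not the procedure used by Multalin or by the authors: the function name `conserved_columns` and the toy alignment are hypothetical, and "conserved" is assumed here to mean that every sequence carries the same residue in a column, with no gaps.

```python
def conserved_columns(alignment):
    """Return 0-based indices of columns where all sequences share one residue.

    Assumption: a column containing a gap ('-') is never counted as conserved.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in alignment}  # distinct symbols in column i
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Toy alignment (made-up sequences, not real RIPs).
toy_alignment = ["MKV-LA", "MRVGLA", "MKVGLS"]
print(conserved_columns(toy_alignment))  # [0, 2, 4]
```

Real alignment servers typically report consensus at a configurable threshold rather than strict identity, so the count from a sketch like this would not necessarily match the 20 residues reported above.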
Whole-exome sequencing of primary plasma cell leukemia discloses heterogeneous mutational patterns.
Cifola, Ingrid; Lionetti, Marta; Pinatel, Eva; Todoerti, Katia; Mangano, Eleonora; Pietrelli, Alessandro; Fabris, Sonia; Mosca, Laura; Simeon, Vittorio; Petrucci, Maria Teresa; Morabito, Fortunato; Offidani, Massimo; Di Raimondo, Francesco; Falcone, Antonietta; Caravita, Tommaso; Battaglia, Cristina; De Bellis, Gianluca; Palumbo, Antonio; Musto, Pellegrino; Neri, Antonino
2015-07-10
Primary plasma cell leukemia (pPCL) is a rare and aggressive form of plasma cell dyscrasia and may represent a valid model for high-risk multiple myeloma (MM). To provide novel information concerning the mutational profile of this disease, we performed the whole-exome sequencing of a prospective series of 12 pPCL cases included in a Phase II multicenter clinical trial and previously characterized at clinical and molecular levels. We identified 1,928 coding somatic non-silent variants on 1,643 genes, with a mean of 166 variants per sample, and only a few variants and genes recurrent in two or more samples. An excess of C > T transitions and the presence of two main mutational signatures (related to APOBEC over-activity and aging) occurring in different translocation groups were observed. We identified 14 candidate cancer driver genes, mainly involved in cell-matrix adhesion, cell cycle, genome stability, RNA metabolism and protein folding. Furthermore, integration of mutation data with copy number alteration profiles evidenced biallelically disrupted genes with potential tumor suppressor functions. Globally, cadherin/Wnt signaling, extracellular matrix and cell cycle checkpoint were the most affected functional pathways. Sequencing results were finally combined with gene expression data to better elucidate the biological relevance of mutated genes. This study represents the first whole-exome sequencing screen of pPCL and evidenced a remarkable genetic heterogeneity of mutational patterns. This may provide a contribution to the comprehension of the pathogenetic mechanisms associated with this aggressive form of PC dyscrasia and potentially with high-risk MM.
AlignMe—a membrane protein sequence alignment web server
Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.
2014-01-01
We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
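The HP-mode step of converting a multiple sequence alignment into a family-averaged hydrophobicity profile can be illustrated with a small Python sketch. It is not AlignMe's implementation: the choice of the Kyte-Doolittle scale, the handling of gaps (skipped when averaging) and the function name `family_profile` are assumptions made for illustration, and any smoothing the server may apply is omitted.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def family_profile(alignment):
    """Average hydropathy per alignment column over all homologs.

    Gaps and unknown symbols are ignored; a column with no scorable
    residue yields None.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for i in range(length):
        values = [KYTE_DOOLITTLE[seq[i]] for seq in alignment
                  if seq[i] in KYTE_DOOLITTLE]
        profile.append(sum(values) / len(values) if values else None)
    return profile


# Toy alignment of two made-up homologs.
print(family_profile(["AI-", "AL-"]))
```

Two such profiles, one per input alignment, would then be aligned against each other, which is the part of HP mode this sketch does not cover.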
Enhanced sequencing coverage with digital droplet multiple displacement amplification
Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.
2016-01-01
Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978
Tillmar, Andreas O.; Dell'Amico, Barbara; Welander, Jenny; Holmlund, Gunilla
2013-01-01
Species identification can be interesting in a wide range of areas, for example, in forensic applications, food monitoring and in archeology. The vast majority of existing DNA typing methods developed for species determination focus on a single species source. There are, however, many instances where all species from mixed sources need to be determined, even when the minority species constitutes less than 1% of the sample. The introduction of next generation sequencing opens new possibilities for such challenging samples. In this study we present a universal deep sequencing method using 454 GS Junior sequencing of a target on the mitochondrial gene 16S rRNA. The method was designed through phylogenetic analyses of DNA reference sequences from more than 300 mammal species. Experiments were performed on artificial species-species mixture samples in order to verify the method’s robustness and its ability to detect all species within a mixture. The method was also tested on samples from authentic forensic casework. The results were promising: the method discriminated over 99.9% of mammal species and was able to detect multiple donors within a mixture, as well as minor components as low as 1% of a mixed sample. PMID:24358309
Kim, Tae Hoon; Dekker, Job
2018-05-01
ChIP-chip can be used to analyze protein-DNA interactions in a region-wide and genome-wide manner. DNA microarrays contain PCR products or oligonucleotide probes that are designed to represent genomic sequences. Identification of genomic sites that interact with a specific protein is based on competitive hybridization of the ChIP-enriched DNA and the input DNA to DNA microarrays. The ChIP-chip protocol can be divided into two main sections: Amplification of ChIP DNA and hybridization of ChIP DNA to arrays. A large amount of DNA is required to hybridize to DNA arrays, and hybridization to a set of multiple commercial arrays that represent the entire human genome requires two rounds of PCR amplifications. The relative hybridization intensity of ChIP DNA and that of the input DNA is used to determine whether the probe sequence is a potential site of protein-DNA interaction. Resolution of actual genomic sites bound by the protein is dependent on the size of the chromatin and on the genomic distance between the probes on the array. As with expression profiling using gene chips, ChIP-chip experiments require multiple replicates for reliable statistical measure of protein-DNA interactions. © 2018 Cold Spring Harbor Laboratory Press.
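The comparison of ChIP and input hybridization intensities described above is commonly expressed as a per-probe log2 ratio. The following Python sketch shows that calculation under stated assumptions: the pseudocount, the threshold of 1.0 (a twofold enrichment) and the probe names are hypothetical, and the replicate handling and statistics the protocol calls for are left out.

```python
import math


def log2_enrichment(chip, input_dna, pseudocount=1.0):
    """Per-probe log2(ChIP / input) with a pseudocount to avoid division by zero."""
    return {
        probe: math.log2((chip[probe] + pseudocount)
                         / (input_dna[probe] + pseudocount))
        for probe in chip
    }


def candidate_sites(chip, input_dna, threshold=1.0):
    """Probes whose log2 enrichment reaches the (assumed) threshold."""
    ratios = log2_enrichment(chip, input_dna)
    return sorted(p for p, r in ratios.items() if r >= threshold)


# Made-up intensities for three hypothetical probes.
chip_signal = {"probe1": 799, "probe2": 99, "probe3": 199}
input_signal = {"probe1": 99, "probe2": 99, "probe3": 99}
print(candidate_sites(chip_signal, input_signal))  # ['probe1', 'probe3']
```

In practice the ratios from several replicate arrays would be combined before calling a probe a potential binding site.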
Vd’ačný, Peter; Bourland, William A.; Orsi, William; Epstein, Slava S.; Foissner, Wilhelm
2012-01-01
The class Litostomatea is a highly diverse ciliate taxon comprising hundreds of free-living and endocommensal species. However, their traditional morphology-based classification conflicts with 18S rRNA gene phylogenies indicating (1) a deep bifurcation of the Litostomatea into Rhynchostomatia and Haptoria + Trichostomatia, and (2) body polarization and simplification of the oral apparatus as main evolutionary trends in the Litostomatea. To test whether 18S rRNA molecules provide a suitable proxy for litostomatean evolutionary history, we used eighteen new ITS1-5.8S rRNA-ITS2 region sequences from various free-living litostomatean orders. These single- and multiple-locus analyses are in agreement with previous 18S rRNA gene phylogenies, supporting that both 18S rRNA gene and ITS region sequences are effective tools for resolving phylogenetic relationships among the litostomateans. Despite insertions, deletions and mutational saturations in the ITS region, the present study shows that ITS1 and ITS2 molecules can be used to infer phylogenetic relationships not only at species level but also at higher taxonomic ranks when their secondary structure information is utilized to aid alignment. PMID:22789763
Vd'ačný, Peter; Bourland, William A; Orsi, William; Epstein, Slava S; Foissner, Wilhelm
2012-11-01
The class Litostomatea is a highly diverse ciliate taxon comprising hundreds of free-living and endocommensal species. However, their traditional morphology-based classification conflicts with 18S rRNA gene phylogenies indicating (1) a deep bifurcation of the Litostomatea into Rhynchostomatia and Haptoria+Trichostomatia, and (2) body polarization and simplification of the oral apparatus as main evolutionary trends in the Litostomatea. To test whether 18S rRNA molecules provide a suitable proxy for litostomatean evolutionary history, we used eighteen new ITS1-5.8S rRNA-ITS2 region sequences from various free-living litostomatean orders. These single- and multiple-locus analyses are in agreement with previous 18S rRNA gene phylogenies, supporting that both 18S rRNA gene and ITS region sequences are effective tools for resolving phylogenetic relationships among the litostomateans. Despite insertions, deletions and mutational saturations in the ITS region, the present study shows that ITS1 and ITS2 molecules can be used to infer phylogenetic relationships not only at species level but also at higher taxonomic ranks when their secondary structure information is utilized to aid alignment. Copyright © 2012 Elsevier Inc. All rights reserved.
Riley, Matthew C; Wilkes, Rebecca P
2015-12-18
Recent outbreaks of canine distemper have prompted examination of strains from clinical samples submitted to the University of Tennessee College of Veterinary Medicine (UTCVM) Clinical Virology Lab. We previously described a new strain of CDV that significantly diverged from all genotypes reported to date including America 2, the genotype proposed to be the main lineage currently circulating in the US. The aim of this study was to determine when this new strain appeared and how widespread it is in animal populations, given that it has also been detected in fully vaccinated adult dogs. Additionally, we sequenced complete viral genomes to characterize the strain and determine if variation is confined to known variable regions of the genome or if the changes are also present in more conserved regions. Archived clinical samples were genotyped using real-time RT-PCR amplification and sequencing. The genomes of two unrelated viruses from a dog and fox each from a different state were sequenced and aligned with previously published genomes. Phylogenetic analysis was performed using coding, non-coding and genome-length sequences. Virus neutralization assays were used to evaluate potential antigenic differences between this strain and a vaccine strain and mixed ANOVA test was used to compare the titers. Genotyping revealed this strain first appeared in 2011 and was detected in dogs from multiple states in the Southeast region of the United States. It was the main strain detected among the clinical samples that were typed from 2011-2013, including wildlife submissions. Genome sequencing demonstrated that it is highly conserved within a new lineage and preliminary serologic testing showed significant differences in neutralizing antibody titers between this strain and the strain commonly used in vaccines. 
This new strain represents an emerging CDV in domestic dogs in the US, may be associated with a stable reservoir in the wildlife population, and could facilitate vaccine escape.
Bhore, Subhash J; Kassim, Amelia; Loh, Chye Ying; Shah, Farida H
2010-01-01
It is well known that the nutritional quality of the American oil-palm (Elaeis oleifera) mesocarp oil is superior to that of African oil-palm (Elaeis guineensis Jacq. Tenera) mesocarp oil. Therefore, it is important to identify the genetic features underlying its superior value. This could be achieved through the genome sequencing of the oil-palm. However, the genome sequence is not available in the public domain due to commercial secrecy. Hence, we constructed a cDNA library and generated expressed sequence tags (3,205) from the mesocarp tissue of the American oil-palm. We continued to annotate each of these cDNAs after submitting them to GenBank/DDBJ/EMBL. A rough analysis turned our attention to the cDNA encoding the beta-carotene hydroxylase (Chyb) enzyme. We then fully sequenced both strands of the cDNA clone using M13 forward and reverse primers. The full nucleotide and protein sequences were further analyzed and annotated using various bioinformatics tools. The analysis showed the presence of a fatty acid hydroxylase superfamily domain in the protein sequence. The multiple sequence alignment of selected Chyb amino acid sequences from other plant species and algal members with E. oleifera Chyb using ClustalW, and its phylogenetic analysis, suggest that Chyb from the monocotyledonous plant species Lilium hybrid, Crocus sativus and Zea mays are the most closely related evolutionarily to E. oleifera Chyb. This study reports the annotation of E. oleifera Chyb. Abbreviations: ESTs - expressed sequence tags, EoChyb - Elaeis oleifera beta-carotene hydroxylase, MC - main cluster PMID:21364789
Marck, C
1988-01-01
DNA Strider is a new integrated DNA and Protein sequence analysis program written with the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated such as graphic restriction map, hydrophobicity profile and the CAI plot- codon adaptation index according to Sharp and Li. The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can be therefore computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
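A restriction-site search of the kind described above can be sketched in a few lines of Python. This is a plain substring scan and not the hexamer look-ahead algorithm of DNA Strider, whose details are not given in the abstract; the function name `find_sites` and the three-enzyme library are illustrative assumptions (the recognition sequences themselves are the standard ones for EcoRI, BamHI and HindIII).

```python
# Small illustrative enzyme library: name -> recognition sequence.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(sequence, enzymes, circular=False):
    """Return {enzyme: [0-based start positions]} for each recognition site.

    For circular molecules the sequence is extended by (site length - 1)
    bases so that sites spanning the origin are also found.
    """
    seq = sequence.upper()
    hits = {}
    for name, site in enzymes.items():
        search = seq + seq[:len(site) - 1] if circular else seq
        positions = []
        start = search.find(site)
        while start != -1:
            positions.append(start)
            start = search.find(site, start + 1)  # allow overlapping hits
        hits[name] = positions
    return hits


print(find_sites("ttGAATTCaaGGATCCgaattc", ENZYMES))
```

Only the top strand is scanned here, which is sufficient for palindromic sites such as these three; non-palindromic recognition sequences would also need a reverse-complement pass.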
Protein contact prediction using patterns of correlation.
Hamilton, Nicholas; Burrage, Kevin; Ragan, Mark A; Huber, Thomas
2004-09-01
We describe a new method for using neural networks to predict residue contact pairs in a protein. The main inputs to the neural network are a set of 25 measures of correlated mutation between all pairs of residues in two "windows" of size 5 centered on the residues of interest. While the individual pair-wise correlations are a relatively weak predictor of contact, by training the network on windows of correlation the accuracy of prediction is significantly improved. The neural network is trained on a set of 100 proteins and then tested on a disjoint set of 1033 proteins of known structure. An average predictive accuracy of 21.7% is obtained taking the best L/2 predictions for each protein, where L is the sequence length. Taking the best L/10 predictions gives an average accuracy of 30.7%. The predictor is also tested on a set of 59 proteins from the CASP5 experiment. The accuracy is found to be relatively consistent across different sequence lengths, but to vary widely according to the secondary structure. Predictive accuracy is also found to improve by using multiple sequence alignments containing many sequences to calculate the correlations. Copyright 2004 Wiley-Liss, Inc.
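The two quantitative ideas in this abstract, the 5 x 5 window of pairwise correlations used as network input and the accuracy of the best L/2 predictions, can be sketched in Python as follows. The function names, the zero-padding at sequence ends and the data layout (a square list-of-lists correlation matrix, a dict of scored pairs) are assumptions for illustration; the abstract does not specify how the authors handled these details.

```python
def correlation_window(corr, i, j, half_width=2):
    """Collect correlations between residues near i and residues near j.

    With half_width=2 this gives 5 x 5 = 25 values. Positions that fall
    outside the sequence are padded with 0.0 (an assumption of this sketch).
    """
    n = len(corr)
    features = []
    for a in range(i - half_width, i + half_width + 1):
        for b in range(j - half_width, j + half_width + 1):
            inside = 0 <= a < n and 0 <= b < n
            features.append(corr[a][b] if inside else 0.0)
    return features


def top_fraction_accuracy(scores, contacts, seq_len, divisor=2):
    """Fraction of true contacts among the best seq_len // divisor predictions.

    scores: dict mapping (i, j) -> predicted contact score
    contacts: set of (i, j) pairs that are true contacts
    """
    k = max(1, seq_len // divisor)
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    return sum(pair in contacts for pair in ranked) / len(ranked)


# Toy data: 6-residue "protein" with a made-up correlation matrix.
toy_corr = [[float(10 * a + b) for b in range(6)] for a in range(6)]
print(len(correlation_window(toy_corr, 2, 3)))  # 25

toy_scores = {(0, 5): 0.9, (1, 6): 0.8, (2, 7): 0.1, (3, 8): 0.7}
toy_contacts = {(0, 5), (3, 8)}
print(top_fraction_accuracy(toy_scores, toy_contacts, seq_len=4))  # 0.5
```

Setting `divisor=10` gives the stricter L/10 measure quoted in the abstract.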
Using comparative genome analysis to identify problems in annotated microbial genomes.
Poptsova, Maria S; Gogarten, J Peter
2010-07-01
Genome annotation is a tedious task that is mostly done by automated methods; however, the accuracy of these approaches has been questioned since the beginning of the sequencing era. Genome annotation is a multilevel process, and errors can emerge at different stages: during sequencing, as a result of gene-calling procedures, and in the process of assigning gene functions. Missed or wrongly annotated genes differentially impact different types of analyses. Here we discuss and demonstrate how the methods of comparative genome analysis can refine annotations by locating missing orthologues. We also discuss possible reasons for errors and show that the second-generation annotation systems, which combine multiple gene-calling programs with similarity-based methods, perform much better than the first annotation tools. Since old errors may propagate to the newly sequenced genomes, we emphasize that the problem of continuously updating popular public databases is an urgent and unresolved one. Due to the progress in genome-sequencing technologies, automated annotation techniques will remain the main approach in the future. Researchers need to be aware of the existing errors in the annotation of even well-studied genomes, such as Escherichia coli, and consider additional quality control for their results.
A Study of Two Instructional Sequences Informed by Alternative Learning Progressions in Genetics
NASA Astrophysics Data System (ADS)
Duncan, Ravit Golan; Choi, Jinnie; Castro-Faix, Moraima; Cavera, Veronica L.
2017-12-01
Learning progressions (LPs) are hypothetical models of how learning in a domain develops over time with appropriate instruction. In the domain of genetics, there are two independently developed alternative LPs. The main difference between the two progressions hinges on their assumptions regarding the accessibility of classical (Mendelian) versus molecular genetics and the order in which they should be taught. In order to determine the relative difficulty of the different genetic ideas included in the two progressions, and to test which one is a better fit with students' actual learning, we developed two modules in classical and molecular genetics and alternated their sequence in an implementation study with 11th grade students studying biology. We developed a set of 56 ordered multiple-choice items that collectively assessed both molecular and classical genetic ideas. We found significant gains in students' learning in both molecular and classical genetics, with the largest gain relating to understanding the informational content of genes and the smallest gain in understanding modes of inheritance. Using multidimensional item response modeling, we found no statistically significant differences between the two instructional sequences. However, there was a trend of slightly higher gains for the molecular-first sequence for all genetic ideas.
Development of Overarm Throwing Technique Reflects Throwing Ability during Childhood
KASUYAMA, Tatsuya; MUTOU, Ikuo; SASAMOTO, Hitoshi
2016-01-01
Background: It is important to acquire fundamental movement skills during childhood. Throwing is a representative manipulative skill required for various intrinsic factors. However, the relationship between intrinsic factors and throwing ability in childhood is unclear. The purpose of this study was to investigate intrinsic factors related to the ball throwing distance of Japanese elementary school children. Methods: Japanese elementary school children from grades 1-6 (aged 6-12 years; n=112) participated in this study. The main outcome was throwing ability, which was measured as the ball throwing distance. We measured five general anthropometric parameters, seven physical fitness parameters, and the Roberton's developmental sequence for all subjects. The relationships between the throwing ability and the 13 parameters were analysed. Results: The Roberton's developmental sequence was the best predictor of ball throwing distance (r=0.80, p≤0.01). The best multiple regression model, which included sex, handgrip strength, shuttle run test, and the Roberton's developmental sequence, accounted for 81% of the total variance. Conclusions: The development of correct throwing technique reflects throwing abilities in childhood. In addition to the throwing sequence, enhancement of grip strength and aerobic capacity are also required for children's throwing ability. PMID:28289578
Prefiltering Model for Homology Detection Algorithms on GPU.
Retamosa, Germán; de Pedro, Luis; González, Ivan; Tamames, Javier
2016-01-01
Homology detection has evolved over time from heavy algorithms based on dynamic programming approaches to lightweight alternatives based on different heuristic models. However, the main problem with these algorithms is that they use complex statistical models, which makes it difficult to achieve a relevant speedup and find exact matches with the original results. Thus, their acceleration is essential. The aim of this article was to prefilter a sequence database. To this end, we have implemented a groundbreaking heuristic model based on NVIDIA's graphics processing units (GPUs) and multicore processors. Depending on the sensitivity settings, this makes it possible to quickly reduce the size of the sequence database by between 50% and 95%, while rejecting no significant sequences. Furthermore, this prefiltering application can be used together with multiple homology detection algorithms as a part of a next-generation sequencing system. Extensive performance and accuracy tests have been carried out in the Spanish National Centre for Biotechnology (NCB). The results show that GPU hardware can accelerate the execution times of former homology detection applications, such as National Centre for Biotechnology Information (NCBI), Basic Local Alignment Search Tool for Proteins (BLASTP), up to a factor of 4.
[The human variome project and its progress].
Gao, Shan; Zhang, Ning; Zhang, Lei; Duan, Guang-You; Zhang, Tao
2010-11-01
The main goal of post-genomics is to explain how the genome, the map of which was constructed in the Human Genome Project, affects the activities of life. This has led to multiple "omics": structural genomics, functional genomics, proteomics, metabonomics, etc. In June 2006, in Melbourne, Australia, the Human Genome Variation Society (HGVS) initiated the Human Variome Project (HVP) to collect all sequence variation and polymorphism data worldwide. The HVP aims to search for and determine mutations related to human diseases through genome-scale association studies between genotype and phenotype, and through other methods. The results will be translated into clinical application. Considering the potential effects of this project on human health, this paper introduces its origin and main content in detail and discusses its meaning and prospects.
Morrison, Heather; Roscoe, Eileen M; Atwell, Amy
2011-01-01
We evaluated antecedent exercise for treating the automatically reinforced problem behavior of 4 individuals with autism. We conducted preference assessments to identify leisure and exercise items that were associated with high levels of engagement and low levels of problem behavior. Next, we conducted three 3-component multiple-schedule sequences: an antecedent-exercise test sequence, a noncontingent leisure-item control sequence, and a social-interaction control sequence. Within each sequence, we used a 3-component multiple schedule to evaluate preintervention, intervention, and postintervention effects. Problem behavior decreased during the postintervention component relative to the preintervention component for 3 of the 4 participants during the exercise-item assessment; however, the effects could not be attributed solely to exercise for 1 of these participants. PMID:21941383
ERIC Educational Resources Information Center
Rau, M. A.; Aleven, V.; Rummel, N.; Pardos, Z.
2014-01-01
Providing learners with multiple representations of learning content has been shown to enhance learning outcomes. When multiple representations are presented across consecutive problems, we have to decide in what sequence to present them. Prior research has demonstrated that interleaving "tasks types" (as opposed to blocking them) can…
Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng
2012-01-01
To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
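The non-redundancy figure quoted above follows directly from the two counts in the abstract (3320 unigenes from 3875 sequenced clones). A short Python check, with the function name `non_redundancy` chosen here purely for illustration:

```python
def non_redundancy(unigenes, clones):
    """Share of sequenced clones that represent distinct unigenes."""
    if clones <= 0:
        raise ValueError("clones must be positive")
    return unigenes / clones


rate = non_redundancy(3320, 3875)
print(f"{rate:.1%}")  # 85.7%
```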
NASA Astrophysics Data System (ADS)
Thipboon, Ritthichai; Kaewrakmuk, Metichai; Surina, Farung; Sanguansak, Nuanwan
2017-09-01
Recurrent novae (RNe) are novae with multiple recorded outbursts powered by a thermonuclear runaway. The outburst occurs on the surface of a white dwarf that is accompanied by a late-type main-sequence or giant secondary star transferring material onto the white dwarf primary. RNe resemble classical novae (CNe) in outburst, but only RNe have more than one recorded outburst. RNe play an important role as one of the suspected progenitor systems of Type Ia supernovae (SNe), which are used as primary distance indicators in cosmology. Thus, it is important to investigate the outburst type of CNe and RNe and finally ascertain the population of objects that might ultimately be candidates for Type Ia SNe explosions. The proposal that RNe occupy a region separated from CNe in an outburst amplitude versus speed class diagram was adopted. Since the low amplitude results from the existence of an evolved secondary and/or high mass transfer rate in the quiescent system, RNe candidates should accordingly have low amplitude. We selected 3 preliminary targets: T Pyx, BT Mon and V574 Pup. Their amplitudes are not especially low, but they are the lowest that can be observed with the Thai National Telescope (TNT). We obtained their magnitudes at quiescence using the ULTRASPEC camera on the 2.4-m TNT. The positions of the three targets on optical and near-infrared color-magnitude diagrams suggest that all three should have main-sequence secondary stars. This is true for T Pyx, whose secondary star has been confirmed by spectroscopy to be a main-sequence star, but it is not yet confirmed for BT Mon and V574 Pup.
Mazuet, Christelle; Legeay, Christine; Sautereau, Jean; Ma, Laurence; Bouchier, Christiane; Bouvet, Philippe; Popoff, Michel R.
2016-01-01
In France, human botulism is mainly a food-borne intoxication, whereas infant botulism is rare. A total of 99 group I and II Clostridium botulinum strains including 59 type A (12 historical isolates [1947–1961], 43 from France [1986–2013], 3 from other countries, and 1 collection strain), 31 type B (3 historical, 23 recent isolates, 4 from other countries, and 1 collection strain), and 9 type E (5 historical, 3 isolates, and 1 collection strain) were investigated by botulinum locus gene sequencing and multilocus sequence typing analysis. Historical C. botulinum A strains mainly belonged to subtype A1 and sequence type (ST) 1, whereas recent strains exhibited a wide genetic diversity: subtype A1 in orfX or ha locus, A1(B), A1(F), A2, A2b2, A5(B2′), A5(B3′), as well as the recently identified A7 and A8 subtypes, and were distributed into 25 STs. Clostridium botulinum A1(B) was the most frequent subtype from food-borne botulism and food. Group I C. botulinum type B in France were mainly subtype B2 (14 out of 20 historical and recent strains) and were divided into 19 STs. Food-borne botulism resulting from ham consumption during the recent period was due to group II C. botulinum B4. Type E botulism is rare in France; 5 historical strains and 1 recent strain were subtype E3. A subtype E12 was recently identified from an unusual ham contamination. Clostridium botulinum strains from human botulism in France showed a wide genetic diversity and seem to result not from a single evolutionary lineage but from multiple and independent genetic rearrangements. PMID:27189984
The Effects of Rotation on the Main-sequence Turnoff of Intermediate-age Massive Star Clusters
NASA Astrophysics Data System (ADS)
Yang, Wuming; Bi, Shaolan; Meng, Xiangcun; Liu, Zhie
2013-10-01
The double or extended main-sequence turnoffs (MSTOs) in the color-magnitude diagram (CMD) of intermediate-age massive star clusters in the Large Magellanic Cloud are generally interpreted as age spreads of a few hundred Myr. However, such age spreads do not exist in younger clusters (i.e., 40-300 Myr), which challenges this interpretation. The effects of rotation on the MSTOs of star clusters have been studied in previous works, but the results obtained are conflicting. Compared with previous works, we consider the effects of rotation on the main-sequence lifetime of stars. Our calculations show that rotating models have a fainter and redder MSTO with respect to non-rotating counterparts with ages between about 0.8 and 2.2 Gyr, but have a brighter and bluer MSTO when age is larger than 2.4 Gyr. The spread of the MSTO caused by a typical rotation rate is equivalent to the effect of an age spread of about 200 Myr. Rotation could lead to the double or extended MSTOs in the CMD of the star clusters with ages between about 0.8 and 2.2 Gyr. However, the extension is not significant, and it does not even exist in younger clusters. If the efficiency of the mixing were high enough, the effects of the mixing would counteract the effect of the centrifugal support in the late stage of evolution, and the rotationally induced extension would disappear in the old intermediate-age star clusters, but younger clusters would have an extended MSTO. Moreover, the effects of rotation might aid in understanding the formation of some "multiple populations" in globular clusters.
Reconstructing evolutionary trees in parallel for massive sequences.
Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam
2017-12-14
Building evolutionary trees for massive unaligned DNA sequences is challenging and crucial. However, reconstructing an evolutionary tree for ultra-large sequences is hard. Massive multiple sequence alignment is also challenging and time/space consuming. Hadoop and Spark, which were developed recently, bring new hope for classical computational biology problems. In this paper, we tried to solve multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on files larger than 1 GB, and gets better performance than other evolutionary reconstruction tools. Users could use HPTree for reconstructing evolutionary trees on computer clusters or a cloud platform (e.g., Amazon Cloud). HPTree could help with population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platforms and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. The neighbour-joining model was employed for the evolutionary tree building. We have released our software together with source code via http://lab.malab.cn/soft/HPtree/ .
Multiple DNA and protein sequence alignment on a workstation and a supercomputer.
Tajima, K
1988-11-01
This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.
Interferometric capability for the Magellan Project
NASA Astrophysics Data System (ADS)
Carleton, Nathaniel P.; Traub, Wesley A.; Angel, J. Roger P.
1998-07-01
The Magellan Project is building two 6.5-m telescopes, 60 m apart, at the Las Campanas Observatory in Chile. There are on-going plans to combine the beams of the two main telescopes, and of smaller auxiliary telescopes, for interferometric measurements. In this paper we consider the array of auxiliary telescopes as a stand-alone instrument, recognizing that it will operate as such for some large fraction of the time. Our interest is sharpened by the availability of six 1.8-m optical systems, retired from the Smithsonian-Arizona Multiple-Mirror Telescope in preparation for the installation of a single-mirror 6.5-m system. We have completed a design for a 1.8-m telescope, in which the MMT components are supported on a proven tripod mount. The optics-support uses steel for stiffness, and low-thermal-expansion rods for passive stability. This array will be a powerful tool for the investigation of stellar limb darkening, surface features, and changes of diameter in pulsations, as well as dust disks, shells, and binary companions. The 1.8-m telescopes on good sites such as Magellan's should be able to operate at full aperture for interferometry at 2.2 micrometers. They should therefore be able to reach to magnitude K = 10 or so, and thus to cover substantial samples of both main-sequence and pre-main-sequence stars, and of fully evolved stars as well.
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as good as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
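The pairwise comparison described in this abstract can be illustrated with a minimal sketch. It assumes the query has already been pairwise-aligned to each reference, and it uses an uncorrected p-distance that skips gapped columns; the function names `p_distance` and `closest_reference` are hypothetical and are not taken from TaxI, whose alignment step and evolutionary models are not reproduced here.

```python
def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Uncorrected divergence over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # indel columns are ignored in this simple measure
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def closest_reference(pairwise_alignments: dict) -> tuple:
    """Return (reference name, divergence) for the least divergent reference.

    pairwise_alignments maps a reference name to the tuple
    (aligned query, aligned reference) from a separate pairwise alignment.
    """
    scored = {
        name: p_distance(query, ref)
        for name, (query, ref) in pairwise_alignments.items()
    }
    best = min(scored, key=scored.get)
    return best, scored[best]
```

Because every reference is compared through its own pairwise alignment, an indel-rich marker such as 16S rRNA never has to be forced into one large multiple alignment.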
Characterization of HIV Transmission in South-East Austria
Kessler, Harald H.; Haas, Bernhard; Stelzl, Evelyn; Weninger, Karin; Little, Susan J.; Mehta, Sanjay R.
2016-01-01
To gain deeper insight into the epidemiology of HIV-1 transmission in South-East Austria we performed a retrospective analysis of 259 HIV-1 partial pol sequences obtained from unique individuals newly diagnosed with HIV infection in South-East Austria from 2008 through 2014. After quality filtering, putative transmission linkages were inferred when two sequences were ≤1.5% genetically different. Multiple linkages were resolved into putative transmission clusters. Further phylogenetic analyses were performed using BEAST v1.8.1. Finally, we investigated putative links between the 259 sequences from South-East Austria and all publicly available HIV polymerase sequences in the Los Alamos National Laboratory HIV sequence database. We found that 45.6% (118/259) of the sampled sequences were genetically linked with at least one other sequence from South-East Austria forming putative transmission clusters. Clustering individuals were more likely to be men who have sex with men (MSM; p<0.001), infected with subtype B (p<0.001) or subtype F (p = 0.02). Among clustered males who reported only heterosexual (HSX) sex as an HIV risk, 47% clustered closely with MSM (either as pairs or within larger MSM clusters). One hundred and seven of the 259 sequences (41.3%) from South-East Austria had at least one putative inferred linkage with sequences from a total of 69 other countries. In conclusion, analysis of HIV-1 sequences from newly diagnosed individuals residing in South-East Austria revealed a high degree of national and international clustering mainly within MSM. Interestingly, we found that a high number of heterosexual males clustered within MSM networks, suggesting either linkage between risk groups or misrepresentation of sexual risk behaviors by subjects. PMID:26967154
Characterization of HIV Transmission in South-East Austria.
Hoenigl, Martin; Chaillon, Antoine; Kessler, Harald H; Haas, Bernhard; Stelzl, Evelyn; Weninger, Karin; Little, Susan J; Mehta, Sanjay R
2016-01-01
To gain deeper insight into the epidemiology of HIV-1 transmission in South-East Austria we performed a retrospective analysis of 259 HIV-1 partial pol sequences obtained from unique individuals newly diagnosed with HIV infection in South-East Austria from 2008 through 2014. After quality filtering, putative transmission linkages were inferred when two sequences were ≤1.5% genetically different. Multiple linkages were resolved into putative transmission clusters. Further phylogenetic analyses were performed using BEAST v1.8.1. Finally, we investigated putative links between the 259 sequences from South-East Austria and all publicly available HIV polymerase sequences in the Los Alamos National Laboratory HIV sequence database. We found that 45.6% (118/259) of the sampled sequences were genetically linked with at least one other sequence from South-East Austria forming putative transmission clusters. Clustering individuals were more likely to be men who have sex with men (MSM; p<0.001), infected with subtype B (p<0.001) or subtype F (p = 0.02). Among clustered males who reported only heterosexual (HSX) sex as an HIV risk, 47% clustered closely with MSM (either as pairs or within larger MSM clusters). One hundred and seven of the 259 sequences (41.3%) from South-East Austria had at least one putative inferred linkage with sequences from a total of 69 other countries. In conclusion, analysis of HIV-1 sequences from newly diagnosed individuals residing in South-East Austria revealed a high degree of national and international clustering mainly within MSM. Interestingly, we found that a high number of heterosexual males clustered within MSM networks, suggesting either linkage between risk groups or misrepresentation of sexual risk behaviors by subjects.
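The linkage rule in this abstract (two sequences are linked when they differ by at most 1.5%, and multiple linkages are resolved into clusters) can be sketched as connected components over a thresholded distance table. This is an illustrative sketch only: the function name `transmission_clusters` and the input layout are assumptions, and the distance calculation the authors used is not reproduced.

```python
def transmission_clusters(ids, pairwise, threshold=0.015):
    """Group sequence ids into putative clusters.

    ids       -- iterable of sequence identifiers
    pairwise  -- dict mapping (id_a, id_b) to a genetic distance as a fraction
    threshold -- link two sequences when distance <= threshold (1.5% here)
    Returns a sorted list of sorted clusters with at least two members.
    """
    ids = list(ids)
    parent = {i: i for i in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), distance in pairwise.items():
        if distance <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), set()).add(i)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

Note that membership is transitive under this rule: a sequence joins a cluster if it is within the threshold of any one member, even when its distance to other members is larger.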
Stellar Parameters in an Instant with Machine Learning. Application to Kepler LEGACY Targets
NASA Astrophysics Data System (ADS)
Bellinger, Earl P.; Angelou, George C.; Hekker, Saskia; Basu, Sarbani; Ball, Warrick H.; Guggenberger, Elisabet
2017-10-01
With the advent of dedicated photometric space missions, the ability to rapidly process huge catalogues of stars has become paramount. Bellinger and Angelou et al. [1] recently introduced a new method based on machine learning for inferring the stellar parameters of main-sequence stars exhibiting solar-like oscillations. The method makes precise predictions that are consistent with other methods, but with the advantages of being able to explore many more parameters while costing practically no time. Here we apply the method to 52 so-called "LEGACY" main-sequence stars observed by the Kepler space mission. For each star, we present estimates and uncertainties of mass, age, radius, luminosity, core hydrogen abundance, surface helium abundance, surface gravity, initial helium abundance, and initial metallicity as well as estimates of their evolutionary model parameters of mixing length, overshooting coefficient, and diffusion multiplication factor. We obtain median uncertainties in stellar age, mass, and radius of 14.8%, 3.6%, and 1.7%, respectively. The source code for all analyses and for all figures appearing in this manuscript can be found electronically at
NASA Astrophysics Data System (ADS)
Principe, David; Huenemoerder, David P.; Schulz, Norbert; Kastner, Joel H.; Weintraub, David; Preibisch, Thomas
2018-01-01
We present Chandra High Energy Transmission Grating (HETG) observations of the ∼3 Myr old pre-main sequence (pre-MS) stellar cluster IC 348. With 400-500 cluster members at a distance of ∼300 pc, IC 348 is an ideal target to observe a large number of X-ray sources in a single pointing and is thus an extremely efficient use of Chandra-HETG. High resolution X-ray spectroscopy offers a means to investigate detailed spectral characteristic of X-ray emitting plasmas and their surrounding environments. We present preliminary results where we compare X-ray spectral signatures (e.g., luminosity, temperature, column density, abundance) of the X-ray brightest pre-MS stars in IC 348 with spectral type, multiwavelength signatures of accretion, and the presence of circumstellar disks at multiple stages of pre-MS stellar evolution. Assuming all IC 348 members formed from the same primordial molecular cloud, any disparity between coronal abundances of individual members, as constrained by the identification and strength of emission lines, will constrain the source(s) of coronal chemical evolution at a stage of pre-MS evolution vital to the formation of planets.
Yoshikawa, Miho; Zhang, Ming; Kurisu, Futoshi; Toyota, Koki
2017-01-01
Most bioremediation studies on volatile organic compounds (VOCs) have focused on a single contaminant or its derived compounds, and degraders have been identified under single-contaminant conditions. Bioremediation of multiple contaminants remains a challenging issue. To identify a bacterial consortium that degrades multiple VOCs (dichloromethane (DCM), benzene, and toluene), we applied DNA-stable isotope probing. For individual tests, we combined a 13C-labeled VOC with the other two unlabeled VOCs, and prepared three unlabeled VOCs as a reference. Over 11 days, DNA was periodically extracted from the consortia, and the bacterial community was evaluated by next-generation sequencing of bacterial 16S rRNA gene amplicons. Density gradient fractions of the DNA extracts were amplified by universal bacterial primers for the 16S rRNA gene sequences, and the amplicons were analyzed by terminal restriction fragment length polymorphism (T-RFLP) using the restriction enzymes HhaI and MspI. The T-RFLP fragments were identified by 16S rRNA gene cloning and sequencing. Under all test conditions, the consortia were dominated by Rhodanobacter, Bradyrhizobium/Afipia, Rhizobium, and Hyphomicrobium. DNA derived from Hyphomicrobium and Propioniferax shifted toward heavier fractions when 13C-DCM and 13C-benzene, respectively, were added, compared with the reference, but no shifts were induced by 13C-toluene addition. This implies that Hyphomicrobium and Propioniferax were the main DCM and benzene degraders, respectively, under the coexisting condition. The known benzene degrader Pseudomonas sp. was present but not actively involved in the degradation.
Wan, Shixiang; Zou, Quan
2017-01-01
Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing data has resulted in a shortage of efficient ultra-large biological sequence alignment approaches that cope with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files larger than 1 GB, showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
NASA Astrophysics Data System (ADS)
Niaz, Mansoor
The main objectives of this study are: (1) to elaborate a framework based on a rational reconstruction of developments that led to the formulation of the laws of definite and multiple proportions; (2) to ascertain students' views of the two laws; (3) to formulate criteria based on the framework for evaluating chemistry textbooks' treatment of the two laws; and (4) to provide a rationale for chemistry teachers to respond to the question: Can we teach chemistry without the laws of definite and multiple proportions? Results obtained show that most of the textbooks present the laws of definite and multiple proportions within an inductivist perspective, characterized by the following sequence: experimental findings showed that chemical elements combined in fixed/multiple proportions, followed by the formulation of the laws of definite and multiple proportions, and finally Dalton's atomic theory was postulated to explain the laws. Students were found to be reluctant to question the laws that they learnt as the building blocks of chemistry. It is concluded that by emphasizing the laws of definite and multiple proportions, textbooks inevitably endorse the dichotomy between theories and laws, which is questioned by philosophers of science (Lakatos 1970; Giere 1995a, b). An alternative approach is presented which shows that we can teach chemistry without the laws of definite and multiple proportions.
The Young Visual Binary Survey
NASA Astrophysics Data System (ADS)
Prato, Lisa; Avilez, Ian; Lindstrom, Kyle; Graham, Sean; Sullivan, Kendall; Biddle, Lauren; Skiff, Brian; Nofi, Larissa; Schaefer, Gail; Simon, Michal
2018-01-01
Differences in the stellar and circumstellar properties of the components of young binaries provide key information about star and disk formation and evolution processes. Because objects with separations of a few to a few hundred astronomical units share a common environment and composition, multiple systems allow us to control for some of the factors which play into star formation. We are completing analysis of a rich sample of about 100 pre-main sequence binaries and higher order multiples, primarily located in the Taurus and Ophiuchus star forming regions. This poster will highlight some of our recent, exciting results. All reduced spectra and the results of our analysis will be publicly available to the community at http://jumar.lowell.edu/BinaryStars/. Support for this research was provided in part by NSF award AST-1313399 and by NASA Keck KPDA funding.
Using reconfigurable hardware to accelerate multiple sequence alignment with ClustalW.
Oliver, Tim; Schmidt, Bertil; Nathan, Darran; Clemens, Ralf; Maskell, Douglas
2005-08-15
Aligning hundreds of sequences using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. We present a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware. This results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA.
Mango: multiple alignment with N gapped oligos.
Zhang, Zefeng; Lin, Hao; Li, Ming
2008-06-01
Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.
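The spaced-seed idea mentioned in this abstract can be illustrated with a small sketch. A seed such as `1101` requires matches at the positions marked `1` and ignores the position marked `0`, so it can still anchor two sequences across a single substitution where a contiguous k-mer cannot. The function names `seed_keys` and `seed_hits` are hypothetical; this shows only the general seeding technique, not MANGO's optimized seeds or its alignment algorithm.

```python
from collections import defaultdict


def seed_keys(seq: str, seed: str):
    """Yield (start, key) for every window of seq, keeping only '1' positions."""
    if not seed or set(seed) - {"0", "1"}:
        raise ValueError("seed must be a non-empty string of 0s and 1s")
    care = [i for i, flag in enumerate(seed) if flag == "1"]
    for start in range(len(seq) - len(seed) + 1):
        yield start, "".join(seq[start + i] for i in care)


def seed_hits(a: str, b: str, seed: str):
    """Return (pos_in_a, pos_in_b) pairs whose windows agree at all '1' positions."""
    index = defaultdict(list)
    for pos, key in seed_keys(a, seed):
        index[key].append(pos)
    return [
        (pos_a, pos_b)
        for pos_b, key in seed_keys(b, seed)
        for pos_a in index.get(key, [])
    ]
```

For example, `seed_hits("ACGTAC", "ACTTAC", "1101")` reports a hit at offset 0 in both sequences, while the contiguous seed `1111` finds none, since the two strings differ at the third base of that window.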
Optimized scheduling technique of null subcarriers for peak power control in 3GPP LTE downlink.
Cho, Soobum; Park, Sang Kyu
2014-01-01
Orthogonal frequency division multiple access (OFDMA) is a key multiple access technique for the long term evolution (LTE) downlink. However, high peak-to-average power ratio (PAPR) can cause the degradation of power efficiency. The well-known PAPR reduction technique, dummy sequence insertion (DSI), can be a realistic solution because of its structural simplicity. However, the large usage of subcarriers for the dummy sequences may decrease the transmitted data rate in the DSI scheme. In this paper, a novel DSI scheme is applied to the LTE system. Firstly, we obtain the null subcarriers in single-input single-output (SISO) and multiple-input multiple-output (MIMO) systems, respectively; then, optimized dummy sequences are inserted into the obtained null subcarrier. Simulation results show that Walsh-Hadamard transform (WHT) sequence is the best for the dummy sequence and the ratio of 16 to 20 for the WHT and randomly generated sequences has the maximum PAPR reduction performance. The number of near optimal iteration is derived to prevent exhausted iterations. It is also shown that there is no bit error rate (BER) degradation with the proposed technique in LTE downlink system.
Optimized Scheduling Technique of Null Subcarriers for Peak Power Control in 3GPP LTE Downlink
Park, Sang Kyu
2014-01-01
Orthogonal frequency division multiple access (OFDMA) is a key multiple access technique for the long term evolution (LTE) downlink. However, high peak-to-average power ratio (PAPR) can cause the degradation of power efficiency. The well-known PAPR reduction technique, dummy sequence insertion (DSI), can be a realistic solution because of its structural simplicity. However, the large usage of subcarriers for the dummy sequences may decrease the transmitted data rate in the DSI scheme. In this paper, a novel DSI scheme is applied to the LTE system. Firstly, we obtain the null subcarriers in single-input single-output (SISO) and multiple-input multiple-output (MIMO) systems, respectively; then, optimized dummy sequences are inserted into the obtained null subcarrier. Simulation results show that Walsh-Hadamard transform (WHT) sequence is the best for the dummy sequence and the ratio of 16 to 20 for the WHT and randomly generated sequences has the maximum PAPR reduction performance. The number of near optimal iteration is derived to prevent exhausted iterations. It is also shown that there is no bit error rate (BER) degradation with the proposed technique in LTE downlink system. PMID:24883376
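The quantity being minimized in this abstract, the peak-to-average power ratio, can be made concrete with a short sketch using only the standard library. It computes the PAPR of one OFDM symbol from its subcarrier values and then picks, from a list of candidate dummy sequences placed on the null subcarriers, the one giving the lowest PAPR. The names `papr_db` and `best_dummy` and the candidate lists are assumptions for illustration; the paper's WHT-based sequences, iteration limit, and LTE subcarrier mapping are not reproduced.

```python
import cmath
import math


def idft(freq):
    """Inverse DFT of a list of complex subcarrier values (no oversampling)."""
    n = len(freq)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * t / n) for k, x in enumerate(freq)) / n
        for t in range(n)
    ]


def papr_db(freq):
    """PAPR in dB: peak instantaneous power over mean power of the time signal."""
    power = [abs(v) ** 2 for v in idft(freq)]
    mean = sum(power) / len(power)
    if mean == 0:
        raise ValueError("signal has zero power")
    return 10 * math.log10(max(power) / mean)


def best_dummy(data, null_idx, candidates):
    """Try each candidate dummy sequence on the null subcarriers.

    Returns (candidate, papr_in_db) for the lowest PAPR, or None if no
    candidates are given. The data subcarriers are never modified.
    """
    best = None
    for cand in candidates:
        trial = list(data)
        for idx, val in zip(null_idx, cand):
            trial[idx] = val
        score = papr_db(trial)
        if best is None or score < best[1]:
            best = (tuple(cand), score)
    return best
```

Because only the null subcarriers carry the dummy values, the receiver can discard them without side information, which is consistent with the abstract's claim that the scheme does not degrade BER.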
VizieR Online Data Catalog: VFTS. O-type stellar content of 30 Dor (Walborn+, 2014)
NASA Astrophysics Data System (ADS)
Walborn, N. R.; Sana, H.; Simon-Diaz, S.; Maiz Apellaniz, J.; Taylor, W. D.; Evans, C. J.; Markova, N.; Lennon, D. J.; de Koter, A.
2014-06-01
Detailed spectral classifications are presented for 352 O-B0 stars in the VLT-FLAMES Tarantula Survey ESO Large Programme, of which 213 O-type are judged of sufficiently high quality for further morphological analysis. Among them, six subcategories of special interest are distinguished. (1) Several new examples of the earliest spectral types O2-O3 have been found, while a previously known example has been determined to belong to the nitrogen-rich ON2 class. (2) A group of extremely rapidly rotating main-sequence objects has been isolated, including the largest vsini values known, the spatial and radial-velocity distributions of which suggest ejection from the two principal ionizing clusters NGC 2070 and NGC 2060. (3) Several new examples of the evolved, rapidly rotating Onfp class show similar evidence, although at least some of them are spectroscopic binaries. (4) No fewer than 48 members of the Vz category, hypothesized to be on or near the zero-age main sequence, are found in this sample; in contrast to the rapid rotators, they are strongly concentrated to the ionizing clusters and a newly recognized region of current and recent star formation to the north, supporting their interpretation as very young objects, as do their relatively faint absolute magnitudes. (5) A surprisingly large fraction of the main-sequence spectra belong to the recently recognized V((fc)) class, with CIII emission lines of similar strength to the usual NIII in V((f)) spectra, although a comparable number of the latter are also present, as well as six objects with very high-quality data but no trace of either emission feature, presenting new challenges to physical interpretations. (6) Two mid-O Vz and three late-O giant/supergiant spectra with morphologically enhanced nitrogen lines have been detected. 
Absolute visual magnitudes have been derived for each star with individual extinction laws, and composite Hertzsprung-Russell diagrams provide evidence of the multiple generations present in this field. Spectroscopic binaries, resolved visual multiples, and possible associations with X-ray sources are noted. Astrophysical and dynamical analyses of this unique dataset underway will provide new insights into the evolution of massive stars and starburst clusters. (2 data files).
Zepeda-Mendoza, Marie Lisandra; Bohmann, Kristine; Carmona Baez, Aldo; Gilbert, M Thomas P
2016-05-03
DNA metabarcoding is an approach for identifying multiple taxa in an environmental sample using specific genetic loci and taxa-specific primers. When combined with high-throughput sequencing it enables the taxonomic characterization of large numbers of samples in a relatively time- and cost-efficient manner. One recent laboratory development is the addition of 5'-nucleotide tags to both primers producing double-tagged amplicons and the use of multiple PCR replicates to filter erroneous sequences. However, there is currently no available toolkit for the straightforward analysis of datasets produced in this way. We present DAMe, a toolkit for the processing of datasets generated by double-tagged amplicons from multiple PCR replicates derived from an unlimited number of samples. Specifically, DAMe can be used to (i) sort amplicons by tag combination, (ii) evaluate PCR replicates dissimilarity, and (iii) filter sequences derived from sequencing/PCR errors, chimeras, and contamination. This is attained by calculating the following parameters: (i) sequence content similarity between the PCR replicates from each sample, (ii) reproducibility of each unique sequence across the PCR replicates, and (iii) copy number of the unique sequences in each PCR replicate. We showcase the insights that can be obtained using DAMe prior to taxonomic assignment, by applying it to two real datasets that vary in their complexity regarding number of samples, sequencing libraries, PCR replicates, and used tag combinations. Finally, we use a third mock dataset to demonstrate the impact and importance of filtering the sequences with DAMe. DAMe allows the user-friendly manipulation of amplicons derived from multiple samples with PCR replicates built in a single or multiple sequencing libraries. 
It allows the user to: (i) collapse amplicons into unique sequences and sort them by tag combination while retaining the sample identifier and copy number information, (ii) identify sequences carrying unused tag combinations, (iii) evaluate the comparability of PCR replicates of the same sample, and (iv) filter tagged amplicons from a number of PCR replicates using parameters of minimum length, copy number, and reproducibility across the PCR replicates. This enables an efficient analysis of complex datasets, and ultimately increases the ease of handling datasets from large-scale studies.
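The filtering criteria listed in this abstract (minimum length, copy number, and reproducibility across PCR replicates) can be sketched as follows. This is a minimal illustration, not DAMe's implementation or command-line interface: the function name `filter_by_replicates`, its parameter names, and the choice to report copies summed over the passing replicates are all assumptions.

```python
from collections import Counter


def filter_by_replicates(replicates, min_copies=2, min_replicates=2, min_length=0):
    """Keep sequences that are reproducible across PCR replicates.

    replicates     -- list of Counter objects, one per PCR replicate,
                      mapping a unique sequence to its copy number
    min_copies     -- copies needed within a replicate for it to count
    min_replicates -- number of replicates that must meet min_copies
    min_length     -- sequences shorter than this are discarded
    Returns a dict mapping each kept sequence to its copies summed over
    the replicates that met min_copies.
    """
    kept = {}
    all_sequences = set().union(*replicates)
    for seq in all_sequences:
        if len(seq) < min_length:
            continue
        passing = [rep[seq] for rep in replicates if rep[seq] >= min_copies]
        if len(passing) >= min_replicates:
            kept[seq] = sum(passing)
    return kept
```

A sequence that appears with many copies in only one replicate is dropped under these settings, which reflects the premise that PCR or sequencing errors and chimeras are unlikely to recur independently across replicates.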
CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.
Chen, Xi; Wang, Chen; Tang, Shanjiang; Yu, Ce; Zou, Quan
2017-06-24
Multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time at an acceptable level. Although there is a lot of work on MSA problems, existing approaches are either insufficient or contain some implicit assumptions that limit the generality of usage. First, the information about users' sequences, including the sizes of datasets and the lengths of sequences, can take arbitrary values and is generally unknown before submission, which is unfortunately ignored by previous work. Second, the center star strategy is suited for aligning similar sequences, but its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, given the heterogeneous CPU/GPU platform, prior studies consider MSA parallelization on GPU devices only, leaving the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of the computing resources by enabling the workload computation on both CPU and GPU simultaneously. This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn^2) to O(mn). The experimental results show that CMSA achieves an up to 11× speedup and outperforms the state-of-the-art software. CMSA focuses on the multiple similar RNA/DNA sequence alignment and proposes a novel bitmap based algorithm to improve the center star strategy.
We can conclude that harvesting the high performance of modern GPU is a promising approach to accelerate multiple sequence alignment. Besides, adopting the co-run computation model can maximize the entire system utilization significantly. The source code is available at https://github.com/wangvsa/CMSA .
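For orientation, the center sequence selection step that the abstract describes as time-consuming can be sketched as follows. This is a minimal baseline, assuming plain edit distance as the pairwise measure; it is not CMSA's bitmap-based algorithm, and the function names and toy sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def select_center(seqs: list[str]) -> int:
    """Index of the sequence with the smallest summed distance to all others.

    Naive baseline: every pair is compared, which is the cost that an
    improved selection step would try to avoid.
    """
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=totals.__getitem__)


# Toy input (assumed for illustration only).
seqs = ["ACGTACGT", "ACGTTCGT", "ACGAACGT", "TCGTACGA"]
center = select_center(seqs)
```

In a center star alignment, the remaining sequences would then each be aligned pairwise against `seqs[center]` and merged.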
EdiPy: a resource to simulate the evolution of plant mitochondrial genes under the RNA editing.
Picardi, Ernesto; Quagliariello, Carla
2006-02-01
EdiPy is an online resource appropriately designed to simulate the evolution of plant mitochondrial genes in a biologically realistic fashion. EdiPy takes into account the presence of sites subjected to RNA editing and provides multiple artificial alignments corresponding to both genomic and cDNA sequences. Each artificial data set can successively be submitted to main and widespread evolutionary and phylogenetic software packages such as PAUP, Phyml, PAML and Phylip. As an online bioinformatic resource, EdiPy is available at the following web page: http://biologia.unical.it/py_script/index.html.
NASA Astrophysics Data System (ADS)
Yu, Jinchen; Peng, Mingshu
2016-10-01
In this paper, a Kaldor-Kalecki model of business cycle with both discrete and distributed delays is considered. With the corresponding characteristic equation analyzed, the local stability of the positive equilibrium is investigated. It is found that there exist Hopf bifurcations when the discrete time delay passes a sequence of critical values. By applying the method of multiple scales, the explicit formulae which determine the direction of Hopf bifurcation and the stability of bifurcating periodic solutions are derived. Finally, numerical simulations are carried out to illustrate our main results.
Mostert, Lizel; Groenewald, Johannes Z.; Summerbell, Richard C.; Robert, Vincent; Sutton, Deanna A.; Padhye, Arvind A.; Crous, Pedro W.
2005-01-01
To date, three species of Phaeoacremonium have been associated with phaeohyphomycosis. These are P. parasiticum (formerly Phialophora parasitica), P. inflatipes, and P. rubrigenum. Numerous unknown isolates resembling Phaeoacremonium spp. have in recent years been isolated from human patients as well as from woody plants that appear to be the main environmental source of these fungi. Nine new Phaeoacremonium species, of which six were obtained as etiologic agents of human opportunistic infection, are reported. They can be identified based on their cultural and morphological characters, and the identifications are strongly supported in phylogenetic analyses of partial sequences of the actin, β-tubulin, and calmodulin genes. A multiple-entry electronic key based on morphological, cultural, and β-tubulin sequence data was developed to facilitate routine species identification. Reexamination of all isolates of P. inflatipes associated with human disease showed them to be misidentified and to belong to the new taxa described here. PMID:15814996
Phocine Distemper Virus in Seals, East Coast, United States, 2006
Earle, J.A. Philip; Melia, Mary M.; Doherty, Nadine V.; Nielsen, Ole
2011-01-01
In 2006 and 2007, elevated numbers of deaths among seals, constituting an unusual mortality event, occurred off the coasts of Maine and Massachusetts, United States. We isolated a virus from seal tissue and confirmed it as phocine distemper virus (PDV). We compared the viral hemagglutinin, phosphoprotein, and fusion (F) and matrix (M) protein gene sequences with those of viruses from the 1988 and 2002 PDV epizootics. The virus showed highest similarity with a PDV 1988 Netherlands virus, which raises the possibility that the 2006 isolate from the United States might have emerged independently from 2002 PDVs and that multiple lineages of PDV might be circulating among enzootically infected North American seals. Evidence from comparison of sequences derived from different tissues suggested that mutations in the F and M genes occur in brain tissue that are not present in lung, liver, or blood, which suggests virus persistence in the central nervous system. PMID:21291591
Aftershocks driven by afterslip and fluid pressure sweeping through a fault-fracture mesh
Ross, Zachary E.; Rollins, Christopher; Cochran, Elizabeth S.; Hauksson, Egill; Avouac, Jean-Philippe; Ben-Zion, Yehuda
2017-01-01
A variety of physical mechanisms are thought to be responsible for the triggering and spatiotemporal evolution of aftershocks. Here we analyze a vigorous aftershock sequence and postseismic geodetic strain that occurred in the Yuha Desert following the 2010 Mw 7.2 El Mayor-Cucapah earthquake. About 155,000 detected aftershocks occurred in a network of orthogonal faults and exhibit features of two distinct mechanisms for aftershock triggering. The earliest aftershocks were likely driven by afterslip that spread away from the main shock with the logarithm of time. A later pulse of aftershocks swept again across the Yuha Desert with square root time dependence and swarm-like behavior; together with local geological evidence for hydrothermalism, these features suggest that the events were driven by fluid diffusion. The observations illustrate how multiple driving mechanisms and the underlying fault structure jointly control the evolution of an aftershock sequence.
Learning of goal-relevant and -irrelevant complex visual sequences in human V1.
Rosenthal, Clive R; Mallik, Indira; Caballero-Gaudes, Cesar; Sereno, Martin I; Soto, David
2018-06-12
Learning and memory are supported by a network involving the medial temporal lobe and linked neocortical regions. Emerging evidence indicates that primary visual cortex (i.e., V1) may contribute to recognition memory, but this has been tested only with a single visuospatial sequence as the target memorandum. The present study used functional magnetic resonance imaging to investigate whether human V1 can support the learning of multiple, concurrent complex visual sequences involving discontinuous (second-order) associations. Two peripheral, goal-irrelevant but structured sequences of orientated gratings appeared simultaneously in fixed locations of the right and left visual fields alongside a central, goal-relevant sequence that was in the focus of spatial attention. Pseudorandom sequences were introduced at multiple intervals during the presentation of the three structured visual sequences to provide an online measure of sequence-specific knowledge at each retinotopic location. We found that a network involving the precuneus and V1 was involved in learning the structured sequence presented at central fixation, whereas right V1 was modulated by repeated exposure to the concurrent structured sequence presented in the left visual field. The same result was not found in left V1. These results indicate for the first time that human V1 can support the learning of multiple concurrent sequences involving complex discontinuous inter-item associations, even peripheral sequences that are goal-irrelevant. Copyright © 2018. Published by Elsevier Inc.
Shih, Arthur Chun-Chieh; Lee, DT; Peng, Chin-Lin; Wu, Yu-Wei
2007-01-01
Background: When aligning several hundreds or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, to reconstruct the epidemiological history or their phylogenies, how to analyze and visualize the alignment results of many sequences has become a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences. Results: A multiple-logo alignment visualization tool, called Phylo-mLogo, is presented in this paper. Phylo-mLogo calculates the variabilities and homogeneities of alignment sequences by base frequencies or entropies. Unlike traditional representations of sequence logos, Phylo-mLogo not only displays the global logo patterns of the whole alignment of multiple sequences, but also shows their local homologous logos for each clade hierarchically. In addition, Phylo-mLogo allows the user to focus only on the analysis of some important, structurally or functionally constrained sites in the alignment, selected by the user or by built-in automatic calculation. Conclusion: With Phylo-mLogo, the user can symbolically and hierarchically visualize hundreds of aligned sequences simultaneously and easily check the changes of their amino acid sites when analyzing many homologous/orthologous or influenza virus sequences. More information about Phylo-mLogo can be found at URL . PMID:17319966
Cavalcante, Manoella Gemaque; Bastos, Carlos Eduardo Matos Carvalho; Nagamachi, Cleusa Yoshiko; Pieczarka, Julio Cesar; Vicari, Marcelo Ricardo; Noronha, Renata Coelho Rodrigues
2018-01-01
Cytogenetic studies show that there is great karyotypic diversity in order Testudines (2n = 26–68), and that this may be mainly attributed to the presence/absence of microchromosomes. Members of the Podocnemididae family have the smallest diploid numbers of this order (2n = 26–28), which may be a derived condition of the group. Diverse studies suggest that repetitive-DNA-rich sites generally act as hotspots for double-strand breaks and chromosomal reorganization. In this context, we used fluorescent in situ hybridization (FISH) to map telomeric sequences (TTAGGG)n, 45S rDNA, and the genes encoding histones H1 and H3 in two species of genus Podocnemis. We also observed conservation of the 45S rDNA and H1 histone sequences (probable case of conserved synteny), but multiple conserved and non-conserved clusters of H3 genes, which colocalized with the interstitial telomeric sequences in the Podocnemis genome. Our results suggest that fusions have occurred between macro and microchromosomes or between microchromosomes, leading to the observed reduction in diploid number in the family Podocnemididae. PMID:29813087
MouSensor: A Versatile Genetic Platform to Create Super Sniffer Mice for Studying Human Odor Coding.
D'Hulst, Charlotte; Mina, Raena B; Gershon, Zachary; Jamet, Sophie; Cerullo, Antonio; Tomoiaga, Delia; Bai, Li; Belluscio, Leonardo; Rogers, Matthew E; Sirotin, Yevgeniy; Feinstein, Paul
2016-07-26
Typically, ∼0.1% of the total number of olfactory sensory neurons (OSNs) in the main olfactory epithelium express the same odorant receptor (OR) in a singular fashion and their axons coalesce into homotypic glomeruli in the olfactory bulb. Here, we have dramatically increased the total number of OSNs expressing specific cloned OR coding sequences by multimerizing a 21-bp sequence encompassing the predicted homeodomain binding site sequence, TAATGA, known to be essential in OR gene choice. Singular gene choice is maintained in these "MouSensors." In vivo synaptopHluorin imaging of odor-induced responses by known M71 ligands shows functional glomerular activation in an M71 MouSensor. Moreover, a behavioral avoidance task demonstrates that specific odor detection thresholds are significantly decreased in multiple transgenic lines, expressing mouse or human ORs. We have developed a versatile platform to study gene choice and axon identity, to create biosensors with great translational potential, and to finally decode human olfaction. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Long-range barcode labeling-sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Feng; Zhang, Tao; Singh, Kanwar K.
Methods for sequencing single large DNA molecules by clonal multiple displacement amplification using barcoded primers. Sequences are binned based on barcode sequences and sequenced using a microdroplet-based method for sequencing large polynucleotide templates to enable assembly of haplotype-resolved complex genomes and metagenomes.
High-speed multiple sequence alignment on a reconfigurable platform.
Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf
2006-01-01
Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.
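The reason a linear systolic array suits this dynamic program is that all cells on one anti-diagonal of the matrix depend only on the two previous anti-diagonals, so they can be updated in the same step. The sketch below illustrates that ordering in software, assuming unit-cost edit distance as the pairwise measure (the abstract does not specify the scoring); it is not the authors' FPGA design.

```python
def edit_distance_wavefront(a: str, b: str) -> int:
    """Edit distance computed by sweeping anti-diagonals (i + j = d)."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j
    for d in range(2, n + m + 1):
        # Every (i, j) visited in this inner loop lies on the same
        # anti-diagonal and reads only diagonals d-1 and d-2, so a
        # systolic array could evaluate them all in parallel.
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            cost = 0 if a[i - 1] == b[j - 1] else 1
            D[i][j] = min(D[i - 1][j] + 1,
                          D[i][j - 1] + 1,
                          D[i - 1][j - 1] + cost)
    return D[n][m]
```

With one processing element per row, the sweep needs n + m - 1 steps instead of the n × m steps of a sequential row-by-row fill.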
NASA Astrophysics Data System (ADS)
Marziani, Paola; Sulentic, J. W.; Dultzin, D.; Negrete, A.; del Olmo, A.; Martínez-Carballo, M. A.; Stirpe, G. M.; D'Onofrio, M.; Perea, J.
2016-10-01
The 4D eigenvector 1 parameter space defined by Sulentic et al. may be seen as a surrogate H-R diagram for quasars. As in the stellar H-R diagram, a source sequence can be easily identified. In the case of quasars, the main sequence appears to be mainly driven by Eddington ratio. A transition Eddington ratio may in part explain the striking observational differences between quasars at opposite ends of the main sequence. The eigenvector-1 approach opens the door towards properly contextualized models of quasar physics, geometry and kinematics. We review some of the progress that has been made over the past 15 years, and point out still unsolved issues.
System, method and apparatus for generating phrases from a database
NASA Technical Reports Server (NTRS)
McGreevy, Michael W. (Inventor)
2004-01-01
Phrase generation is a method of generating sequences of terms, such as phrases, that may occur within a database of subsets containing sequences of terms, such as text. A database is provided and a relational model of the database is created. A query is then input. The query includes a term or a sequence of terms or multiple individual terms or multiple sequences of terms or combinations thereof. Next, several sequences of terms that are contextually related to the query are assembled from contextual relations in the model of the database. The sequences of terms are then sorted and output. Phrase generation can also be an iterative process used to produce sequences of terms from a relational model of a database.
Foltz, T M; Welsh, B M
1999-01-01
This paper uses the fact that the discrete Fourier transform diagonalizes a circulant matrix to provide an alternate derivation of the symmetric convolution-multiplication property for discrete trigonometric transforms. Derived in this manner, the symmetric convolution-multiplication property extends easily to multiple dimensions using the notion of block circulant matrices and generalizes to multidimensional asymmetric sequences. The symmetric convolution of multidimensional asymmetric sequences can then be accomplished by taking the product of the trigonometric transforms of the sequences and then applying an inverse trigonometric transform to the result. An example is given of how this theory can be used for applying a two-dimensional (2-D) finite impulse response (FIR) filter with nonlinear phase which models atmospheric turbulence.
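The starting fact, that the discrete Fourier transform diagonalizes a circulant matrix, can be checked numerically in a few lines. The sketch below covers only the ordinary DFT and circular convolution in one dimension, not the symmetric convolution of the trigonometric transforms derived in the paper; the vectors `c` and `x` are arbitrary test values.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]


def circulant(c):
    """Circulant matrix whose first column is c: C[j][k] = c[(j - k) mod n]."""
    n = len(c)
    return [[c[(j - k) % n] for k in range(n)] for j in range(n)]


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


c = [1.0, 2.0, 0.0, -1.0]
x = [3.0, 0.5, -2.0, 4.0]

# Multiplying by the circulant matrix is a circular convolution of c and x;
# in the Fourier domain it becomes an element-wise product.
lhs = dft(matvec(circulant(c), x))
rhs = [p * q for p, q in zip(dft(c), dft(x))]
max_err = max(abs(u - v) for u, v in zip(lhs, rhs))
```

Because the eigenvalues of the circulant matrix are exactly `dft(c)`, `max_err` should sit at floating-point rounding level.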
Identification of Prostate Cancer-Specific microDNAs
2016-02-01
circular DNA by rolling circle amplification (RCA) and then amplified DNA fragments were subject to deep sequencing. Deep sequencing of the...demonstrate the existence of microDNAs in prostate cancer. We adopted multiple displacement amplification (MDA) with random primers for enriched...prostate cancer cells through multiple displacement amplification and next generation sequencing.
Applying Agrep to r-NSA to solve multiple sequences approximate matching.
Ni, Bing; Wong, Man-Hon; Lam, Chi-Fai David; Leung, Kwong-Sak
2014-01-01
This paper addresses the approximate matching problem in a database consisting of multiple DNA sequences, where the proposed approach applies Agrep to a new truncated suffix array, r-NSA. The construction time of the structure is linear in the database size, and indexing a substring in the structure takes constant time. The number of characters processed in applying Agrep is analysed theoretically, and the theoretical upper bound closely approximates the empirical number of characters, which is obtained by enumerating the characters in the actual structure built. Experiments are carried out using (synthetic) random DNA sequences, as well as (real) genome sequences including Hepatitis-B Virus and X-chromosome. Experimental results show that, compared to the straightforward approach that applies Agrep to multiple sequences individually, the proposed approach solves the matching problem in much shorter time. The speed-up of our approach depends on the sequence patterns, and for highly similar homologous genome sequences, which are the common case in real-life genomes, it can be up to several orders of magnitude.
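To make the underlying problem concrete, the sketch below finds every position in a text where a pattern ends with at most k edit errors. It uses the classical column-wise dynamic program rather than Agrep's bit-parallel method, and it does not build r-NSA; the function name and inputs are hypothetical.

```python
def approx_find(pattern: str, text: str, k: int) -> list[int]:
    """1-based end positions in text where pattern matches with <= k errors."""
    m = len(pattern)
    # col[i] = fewest edits turning pattern[:i] into some suffix of the text
    # read so far; col[0] stays 0 because a match may start anywhere.
    col = list(range(m + 1))
    hits = []
    for j, tc in enumerate(text, start=1):
        prev_diag = col[0]
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == tc else 1
            col[i] = min(old + 1,           # extra text character
                         col[i - 1] + 1,    # missing pattern character
                         prev_diag + cost)  # match or substitution
            prev_diag = old
        if col[m] <= k:
            hits.append(j)
    return hits
```

The straightforward multi-sequence baseline mentioned in the abstract corresponds to calling such a routine once per database sequence, which repeats work on regions the sequences share.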
Introducing difference recurrence relations for faster semi-global alignment of long sequences.
Suzuki, Hajime; Kasahara, Masahiro
2018-02-19
The read length of single-molecule DNA sequencers is reaching 1 Mb. Popular alignment software tools widely used for analyzing such long reads often take advantage of single-instruction multiple-data (SIMD) operations to accelerate calculation of dynamic programming (DP) matrices in the Smith-Waterman-Gotoh (SWG) algorithm with a fixed alignment start position at the origin. Nonetheless, 16-bit or 32-bit integers are necessary for storing the values in a DP matrix when sequences to be aligned are long; this situation hampers the use of the full SIMD width of modern processors. We proposed a faster semi-global alignment algorithm, "difference recurrence relations," that runs more rapidly than the state-of-the-art algorithm by a factor of 2.1. Instead of calculating and storing all the values in a DP matrix directly, our algorithm computes and stores mainly the differences between the values of adjacent cells in the matrix. Although the SWG algorithm and our algorithm can output exactly the same result, our algorithm mainly involves 8-bit integer operations, enabling us to exploit the full width of SIMD operations (e.g., 32) on modern processors. We also developed a library, libgaba, so that developers can easily integrate our algorithm into alignment programs. Our novel algorithm and optimized library implementation will facilitate accelerating nucleotide long-read analysis algorithms that use pairwise alignment stages. The library is implemented in the C programming language and available at https://github.com/ocxtal/libgaba .
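The key observation, that differences between adjacent cells stay small even when the cell values themselves grow with sequence length, can be checked on a toy case. The sketch below is a simplification under stated assumptions: a linear gap penalty and a global recurrence anchored at the origin, rather than the affine-gap SWG recurrence and the SIMD implementation in libgaba. The scoring parameters and test sequences are arbitrary.

```python
def global_dp(a, b, match=1, mismatch=1, gap=1):
    """Full DP matrix with linear gap cost, alignment anchored at (0, 0)."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = -gap * i
    for j in range(1, m + 1):
        H[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else -mismatch
            H[i][j] = max(H[i - 1][j - 1] + s,
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap)
    return H


def adjacent_differences(H):
    """All horizontal and vertical differences between neighbouring cells."""
    horiz = [row[j] - row[j - 1] for row in H for j in range(1, len(row))]
    vert = [H[i][j] - H[i - 1][j]
            for i in range(1, len(H)) for j in range(len(H[0]))]
    return horiz + vert


a = "ACGTTGCAACGT" * 20   # 240 bases, toy data
b = "ACGTAGCATCGA" * 20
H = global_dp(a, b)
diffs = adjacent_differences(H)
lowest_cell = min(min(row) for row in H)
```

For this linear-gap recurrence every difference lies in [-gap, match + gap], here [-1, 2], while `lowest_cell` already falls outside the signed 8-bit range; that gap between the two ranges is what makes narrow integer lanes usable.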
Jaschob, Daniel; Davis, Trisha N; Riffle, Michael
2014-07-23
As high throughput sequencing continues to grow more commonplace, the need to disseminate the resulting data via web applications continues to grow. Particularly, there is a need to disseminate multiple versions of related gene and protein sequences simultaneously--whether they represent alleles present in a single species, variations of the same gene among different strains, or homologs among separate species. Often this is accomplished by displaying all versions of the sequence at once in a manner that is not intuitive or space-efficient and does not facilitate human understanding of the data. Web-based applications needing to disseminate multiple versions of sequences would benefit from a drop-in module designed to effectively disseminate these data. SnipViz is a client-side software tool designed to disseminate multiple versions of related gene and protein sequences on web sites. SnipViz has a space-efficient, interactive, and dynamic interface for navigating, analyzing and visualizing sequence data. It is written using standard World Wide Web technologies (HTML, Javascript, and CSS) and is compatible with most web browsers. SnipViz is designed as a modular client-side web component and may be incorporated into virtually any web site and be implemented without any programming. SnipViz is a drop-in client-side module for web sites designed to efficiently visualize and disseminate gene and protein sequences. SnipViz is open source and is freely available at https://github.com/yeastrc/snipviz.
Generating Models of Surgical Procedures using UMLS Concepts and Multiple Sequence Alignment
Meng, Frank; D’Avolio, Leonard W.; Chen, Andrew A.; Taira, Ricky K.; Kangarloo, Hooshang
2005-01-01
Surgical procedures can be viewed as a process composed of a sequence of steps performed on, by, or with the patient’s anatomy. This sequence is typically the pattern followed by surgeons when generating surgical report narratives for documenting surgical procedures. This paper describes a methodology for semi-automatically deriving a model of conducted surgeries, utilizing a sequence of derived Unified Medical Language System (UMLS) concepts for representing surgical procedures. A multiple sequence alignment was computed from a collection of such sequences and was used for generating the model. These models have the potential of being useful in a variety of informatics applications such as information retrieval and automatic document generation. PMID:16779094
Jenista, Elizabeth R; Stokes, Ashley M; Branca, Rosa Tamara; Warren, Warren S
2009-11-28
A recent quantum computing paper (G. S. Uhrig, Phys. Rev. Lett. 98, 100504 (2007)) analytically derived optimal pulse spacings for a multiple spin echo sequence designed to remove decoherence in a two-level system coupled to a bath. The spacings in what has been called a "Uhrig dynamic decoupling (UDD) sequence" differ dramatically from the conventional, equal pulse spacing of a Carr-Purcell-Meiboom-Gill (CPMG) multiple spin echo sequence. The UDD sequence was derived for a model that is unrelated to magnetic resonance, but was recently shown theoretically to be more general. Here we show that the UDD sequence has theoretical advantages for magnetic resonance imaging of structured materials such as tissue, where diffusion in compartmentalized and microstructured environments leads to fluctuating fields on a range of different time scales. We also show experimentally, both in excised tissue and in a live mouse tumor model, that optimal UDD sequences produce different T(2)-weighted contrast than do CPMG sequences with the same number of pulses and total delay, with substantial enhancements in most regions. This permits improved characterization of low-frequency spectral density functions in a wide range of applications.
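For reference, the two pulse schedules being compared can be written down directly. In the cited Uhrig paper, the j-th of n refocusing pulses in a sequence of total duration T falls at T·sin²(πj/(2n+2)), whereas CPMG places it at T·(2j−1)/(2n). The sketch below only tabulates these timings; the function names are hypothetical and nothing here models the imaging experiments.

```python
import math


def udd_times(n: int, total: float) -> list[float]:
    """Pulse times for an n-pulse Uhrig (UDD) sequence of duration total."""
    return [total * math.sin(math.pi * j / (2 * n + 2)) ** 2
            for j in range(1, n + 1)]


def cpmg_times(n: int, total: float) -> list[float]:
    """Pulse times for an n-pulse CPMG sequence: equal spacing between
    pulses, with a half interval before the first and after the last."""
    return [total * (2 * j - 1) / (2 * n) for j in range(1, n + 1)]


udd4 = udd_times(4, 1.0)
cpmg4 = cpmg_times(4, 1.0)
```

The two schedules coincide for one and two pulses and diverge from three pulses on, with UDD pulses crowding toward the start and end of the interval.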
Texture analysis of common renal masses in multiple MR sequences for prediction of pathology
NASA Astrophysics Data System (ADS)
Hoang, Uyen N.; Malayeri, Ashkan A.; Lay, Nathan S.; Summers, Ronald M.; Yao, Jianhua
2017-03-01
This pilot study performs texture analysis on multiple magnetic resonance (MR) images of common renal masses for differentiation of renal cell carcinoma (RCC). Bounding boxes are drawn around each mass on one axial slice in T1 delayed sequence to use for feature extraction and classification. All sequences (T1 delayed, venous, arterial, pre-contrast phases, T2, and T2 fat saturated sequences) are co-registered and texture features are extracted from each sequence simultaneously. Random forest is used to construct models to classify lesions on 96 normal regions, 87 clear cell RCCs, 8 papillary RCCs, and 21 renal oncocytomas; ground truths are verified through pathology reports. The highest performance is seen in random forest model when data from all sequences are used in conjunction, achieving an overall classification accuracy of 83.7%. When using data from one single sequence, the overall accuracies achieved for T1 delayed, venous, arterial, and pre-contrast phase, T2, and T2 fat saturated were 79.1%, 70.5%, 56.2%, 61.0%, 60.0%, and 44.8%, respectively. This demonstrates promising results of utilizing intensity information from multiple MR sequences for accurate classification of renal masses.
Cui, Zhihua; Zhang, Yi
2014-02-01
As a promising and innovative research field, bioinformatics has attracted increasing attention recently. Among the enormous number of open problems in this field, one fundamental issue is the need for accurate and efficient computational methodology that can deal with tremendous amounts of data. In this paper, we survey some applications of swarm intelligence to discovering patterns in multiple sequences. To provide deeper insight, ant colony optimization, particle swarm optimization, artificial bee colony and artificial fish swarm algorithms are selected, and their applications to the multiple sequence alignment and motif detection problems are discussed.
Choy, G.L.; Bowman, J.R.
1990-01-01
On January 22, 1988, three large intraplate earthquakes (with MS 6.3, 6.4 and 6.7) occurred within a 12-hour period near Tennant Creek, Australia. Broadband displacement and velocity records of body waves from teleseismically recorded data are analyzed to determine source mechanisms, depths, and complexity of rupture of each of the three main shocks. Hypocenters of an additional 150 foreshocks and aftershocks constrained by local arrival time data and field observations of surface rupture are used to complement the source characteristics of the main shocks. The interpretation of the combined data sets suggests that the overall rupture process involved unusually complicated stress release. Rupture characteristics suggest that substantial slow slip occurred on each of the three fault interfaces that was not accompanied by major energy release. Variation of focal depth and the strong increase of moment and radiated energy with each main shock imply that lateral variations of strength were more important than vertical gradients of shear stress in controlling the progression of rupture.
Score distributions of gapped multiple sequence alignments down to the low-probability tail
NASA Astrophysics Data System (ADS)
Fieth, Pascal; Hartmann, Alexander K.
2016-08-01
Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain results for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10^-160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignment differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.
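The sum-of-pair score mentioned above adds up, column by column, the pairwise scores of every pair of rows in the alignment. The sketch below shows that bookkeeping for a fixed alignment; the scoring values (match +1, mismatch −1, residue against gap −2, gap against gap 0) are assumptions chosen for illustration and are not taken from the paper, which also optimizes over alignments rather than scoring a given one.

```python
from itertools import combinations


def pair_score(x: str, y: str, match=1, mismatch=-1, gap=-2) -> int:
    """Score one aligned pair of symbols; '-' denotes a gap."""
    if x == "-" and y == "-":
        return 0
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sum_of_pairs(alignment: list[str]) -> int:
    """Sum-of-pair score of an alignment given as equal-length rows."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows must have the same length")
    return sum(pair_score(x, y)
               for column in zip(*alignment)
               for x, y in combinations(column, 2))


alignment = ["ACG-T",
             "AC-GT",
             "ACGGT"]
score = sum_of_pairs(alignment)
```

Here the three fully conserved columns contribute +3 each and the two columns containing a gap contribute −3 each, giving a total of 3.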
Nakano, Michiharu; Shimada, Takehiko; Endo, Tomoko; Fujii, Hiroshi; Nesumi, Hirohisa; Kita, Masayuki; Ebina, Masumi; Shimizu, Tokurou; Omura, Mitsuo
2012-02-01
Polyembryony, in which multiple somatic nucellar cell-derived embryos develop in addition to the zygotic embryo in a seed, is common in the genus Citrus. Previous genetic studies indicated polyembryony is mainly determined by a single locus, but the underlying molecular mechanism is still unclear. As a step towards identification and characterization of the gene or genes responsible for nucellar embryogenesis in Citrus, haplotype-specific physical maps around the polyembryony locus were constructed. By sequencing three BAC clones aligned on the polyembryony haplotype, a single contiguous draft sequence consisting of 380 kb containing 70 predicted open reading frames (ORFs) was reconstructed. Single nucleotide polymorphism genotypes detected in the sequenced genomic region showed strong association with embryo type in Citrus, indicating a common polyembryony locus is shared among widely diverse Citrus cultivars and species. The arrangement of the predicted ORFs in the characterized genomic region showed high collinearity to the genomic sequence of chromosome 4 of Vitis vinifera and linkage group VI of Populus trichocarpa, suggesting that the syntenic relationship among these species is conserved even though V. vinifera and P. trichocarpa are non-apomictic species. This is the first study to characterize in detail the genomic structure of an apomixis locus determining adventitious embryony. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Tickling the retina: integration of subthreshold electrical pulses can activate retinal neurons
NASA Astrophysics Data System (ADS)
Sekhar, S.; Jalligampala, A.; Zrenner, E.; Rathbun, D. L.
2016-08-01
Objective. The field of retinal prosthetics has made major progress over the last decade, restoring visual percepts to people suffering from retinitis pigmentosa. The stimulation pulses used by present implants are suprathreshold, meaning individual pulses are designed to activate the retina. In this paper we explore subthreshold pulse sequences as an alternate stimulation paradigm. Subthreshold pulses have the potential to address important open problems such as fading of visual percepts when patients are stimulated at moderate pulse repetition rates and the difficulty in preferentially stimulating different retinal pathways. Approach. As a first step in addressing these issues we used Gaussian white noise electrical stimulation combined with spike-triggered averaging to interrogate whether a subthreshold sequence of pulses can be used to activate the mouse retina. Main results. We demonstrate that the retinal network can integrate multiple subthreshold electrical stimuli under an experimental paradigm immediately relevant to retinal prostheses. Furthermore, these characteristic stimulus sequences varied in their shape and integration window length across the population of retinal ganglion cells. Significance. Because the subthreshold sequences activate the retina at stimulation rates that would typically induce strong fading (25 Hz), such retinal ‘tickling’ has the potential to minimize the fading problem. Furthermore, the diversity found across the cell population in characteristic pulse sequences suggests that these sequences could be used to selectively address the different retinal pathways (e.g. ON versus OFF). Both of these outcomes may significantly improve visual perception in retinal implant patients.
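Spike-triggered averaging, the analysis named in the Approach, averages the stimulus segments that immediately precede each recorded spike. The sketch below shows the computation on toy numbers, assuming a stimulus sampled on a regular grid and spike times given as sample indices; it is not the authors' analysis code and the names are hypothetical.

```python
def spike_triggered_average(stimulus, spike_indices, window):
    """Mean of the `window` stimulus samples that precede each spike.

    Spikes occurring before a full window is available are skipped.
    """
    segments = [stimulus[t - window:t] for t in spike_indices if t >= window]
    if not segments:
        raise ValueError("no spike has a complete preceding window")
    return [sum(samples) / len(segments) for samples in zip(*segments)]


# Toy data (assumed): a ramp stimulus and two spikes.
stimulus = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
sta = spike_triggered_average(stimulus, spike_indices=[3, 6], window=2)
```

With Gaussian white noise as the stimulus, a non-flat average of this kind is what indicates that a particular sequence of pulses tends to precede spiking.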
Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas
2016-01-01
Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid. PMID:26840129
NASA Astrophysics Data System (ADS)
Milone, A. P.; Marino, A. F.; Di Criscienzo, M.; D'Antona, F.; Bedin, L. R.; Da Costa, G.; Piotto, G.; Tailo, M.; Dotter, A.; Angeloni, R.; Anderson, J.; Jerjen, H.; Li, C.; Dupree, A.; Granata, V.; Lagioia, E. P.; Mackey, A. D.; Nardiello, D.; Vesperini, E.
2018-06-01
The split main sequences (MSs) and extended MS turnoffs (eMSTOs) detected in a few young clusters have demonstrated that these stellar systems host multiple populations differing in a number of properties such as rotation and, possibly, age. We analyse Hubble Space Telescope photometry for 13 clusters with ages between ~40 and ~1000 Myr and of different masses. Our goal is to investigate for the first time the occurrence of multiple populations in a large sample of young clusters. We find that all the clusters exhibit the eMSTO phenomenon and that MS stars more massive than ~1.6 M⊙ define a blue and a red MS, with the latter hosting the majority of MS stars. The comparison between the observations and isochrones suggests that the blue MSs are made of slow-rotating stars, while the red MSs host stars with rotational velocities close to the breakup value. About half of the bright MS stars in the youngest clusters are Hα emitters. These Be stars populate the red MS and the reddest part of the eMSTO, thus supporting the idea that the red MS is made of fast rotators. We conclude that the split MS and the eMSTO are a common feature of young clusters in both Magellanic Clouds. The phenomena of a split MS and an eMSTO occur for stars that are more massive than a specific threshold, which is independent of the host-cluster mass. As a by-product, we report the serendipitous discovery of a young Small Magellanic Cloud cluster, GALFOR 1.
BlockLogo: visualization of peptide and sequence motif conservation
Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian; Sun, Jing; Schönbach, Christian; Reinherz, Ellis L.; Zhang, Guang Lan; Brusic, Vladimir
2013-01-01
BlockLogo is a web-server application for visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, selection of motif positions, type of sequence, and output format definition. The output has BlockLogo along with the sequence logo, and a table of motif frequencies. We deployed BlockLogo as an online application and have demonstrated its utility through examples that show visualization of T-cell epitopes and B-cell epitopes (both continuous and discontinuous). Our additional example shows a visualization and analysis of structural motifs that determine specificity of peptide binding to HLA-DR molecules. The BlockLogo server also employs selected experimentally validated prediction algorithms to enable on-the-fly prediction of MHC binding affinity to 15 common HLA class I and class II alleles as well as visual analysis of discontinuous epitopes from multiple sequence alignments. It enables the visualization and analysis of structural and functional motifs that are usually described as regular expressions. It provides a compact view of discontinuous motifs composed of distant positions within biological sequences. BlockLogo is available at: http://research4.dfci.harvard.edu/cvc/blocklogo/ and http://methilab.bu.edu/blocklogo/ PMID:24001880
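The block entropy idea mentioned in the abstract above can be illustrated with a short Python sketch. It treats the residues found at a set of selected alignment columns as one "block" per sequence and computes the Shannon entropy of the block frequencies. The function name `block_entropy`, the toy alignment, and the choice of base-2 entropy are assumptions made for illustration only; this is not the BlockLogo server's own code.

```python
import math
from collections import Counter

def block_entropy(alignment, positions):
    """Shannon entropy (in bits) of the blocks formed by the chosen columns.

    alignment: list of equal-length aligned sequences (toy data here).
    positions: 0-based column indices; they need not be adjacent, which is
               how a discontinuous motif would be described.
    """
    # One block per sequence: the residues at the selected columns, joined.
    blocks = ["".join(seq[i] for i in positions) for seq in alignment]
    counts = Counter(blocks)
    total = len(blocks)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical four-sequence alignment.
alignment = ["ACDEF", "ACDEY", "GCDEF", "GCDKF"]

# Columns 1-2 are fully conserved ("CD" in every sequence).
contiguous = block_entropy(alignment, [1, 2])

# Columns 0 and 4 give the blocks AF, AY, GF, GF.
discontinuous = block_entropy(alignment, [0, 4])
```

A fully conserved block has zero entropy, while a block with several variants has higher entropy. A table of motif frequencies like the one the abstract describes corresponds to the `counts` object inside the function.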
Li, Ruichao; Xie, Miaomiao; Dong, Ning; Lin, Dachuan; Yang, Xuemei; Wong, Marcus Ho Yin; Chan, Edward Wai-Chi; Chen, Sheng
2018-03-01
Multidrug resistance (MDR)-encoding plasmids are considered major molecular vehicles responsible for transmission of antibiotic resistance genes among bacteria of the same or different species. Delineating the complete sequences of such plasmids could provide valuable insight into the evolution and transmission mechanisms underlying bacterial antibiotic resistance development. However, due to the presence of multiple repeats of mobile elements, complete sequencing of MDR plasmids remains technically complicated, expensive, and time-consuming. Here, we demonstrate a rapid and efficient approach to obtaining multiple MDR plasmid sequences through the use of the MinION nanopore sequencing platform, which is incorporated in a portable device. By assembling the long sequencing reads generated by a single MinION run according to a rapid barcoding sequencing protocol, we obtained the complete sequences of 20 plasmids harbored by multiple bacterial strains. Importantly, single long reads covering a plasmid end-to-end were recorded, indicating that de novo assembly may be unnecessary if the single reads exhibit high accuracy. This workflow represents a convenient and cost-effective approach for systematic assessment of MDR plasmids responsible for treatment failure of bacterial infections, offering the opportunity to perform detailed molecular epidemiological studies to probe the evolutionary and transmission mechanisms of MDR-encoding elements.
Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.
The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu
A core microbiome associated with the peritoneal tumors of pseudomyxoma peritonei
2013-01-01
Background Pseudomyxoma peritonei (PMP) is a malignancy characterized by dissemination of mucus-secreting cells throughout the peritoneum. This disease is associated with significant morbidity and mortality and despite effective treatment options for early-stage disease, patients with PMP often relapse. Thus, there is a need for additional treatment options to reduce relapse rate and increase long-term survival. A previous study identified the presence of both typed and non-culturable bacteria associated with PMP tissue and determined that increased bacterial density was associated with more severe disease. These findings highlighted the possible role for bacteria in PMP disease. Methods To more clearly define the bacterial communities associated with PMP disease, we employed a sequence-based analysis to profile the bacterial populations found in PMP tumor and mucin tissue in 11 patients. Sequencing data were confirmed by in situ hybridization at multiple taxonomic depths and by culturing. A pilot clinical study was initiated to determine whether the addition of antibiotic therapy affected PMP patient outcome. Main results We determined that the types of bacteria present are highly conserved in all PMP patients; the dominant phyla are the Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes. A core set of taxon-specific sequences were found in all 11 patients; many of these sequences were classified into taxonomic groups that also contain known human pathogens. In situ hybridization directly confirmed the presence of bacteria in PMP at multiple taxonomic depths and supported our sequence-based analysis. Furthermore, culturing of PMP tissue samples allowed us to isolate 11 different bacterial strains from eight independent patients, and in vitro analysis of a subset of these isolates suggests that at least some of these strains may interact with the PMP-associated mucin MUC2.
Finally, we provide evidence suggesting that targeting these bacteria with antibiotic treatment may increase the survival of PMP patients. Conclusions Using 16S amplicon-based sequencing, direct in situ hybridization analysis and culturing methods, we have identified numerous bacterial taxa that are consistently present in all PMP patients tested. Combined with data from a pilot clinical study, these data support the hypothesis that adding antimicrobials to the standard PMP treatment could improve PMP patient survival. PMID:23844722
Chi, Hongshu; Taik, Patricia; Foley, Emily J; Racicot, Alycia C; Gray, Hilary M; Guzzetta, Katherine E; Lin, Hsin-Yun; Song, Yen-Ling; Tung, Che-Huang; Zenke, Kosuke; Yoshinaga, Tomoyoshi; Cheng, Chao-Yin; Chang, Wei-Jen; Gong, Hui
2017-07-01
The ciliate protozoan Cryptocaryon irritans parasitizes marine fish and causes lethal white spot disease. Sporadic infections as well as large-scale outbreaks have been reported globally and the parasite's broad host range poses a particular threat to the aquaculture and ornamental fish markets. In order to better understand C. irritans' population structure, we sequenced and compared mitochondrial cox-1, SSU rRNA, and ITS-1 sequences from 8 new isolates of C. irritans collected in China, Japan, and Taiwan. We detected two SSU rRNA haplotypes, which differ at three positions, separating the isolates into two main groups (I and II). Cox-1 sequences also support the division into two groups, and the cox-1 divergence between these two groups is unexpectedly high (9.28% for 1582 nucleotide positions). The divergence is much greater than that detected in Ichthyophthirius multifiliis, the ciliate protozoan causing freshwater white spot disease in fish, where intraspecies divergence in the cox-1 sequence is only 1.95%. ITS-1 sequences derived from these eight isolates and from all other C. irritans isolates (deposited in GenBank) not only support the two groups, but further suggest the presence of a third group with even greater sequence divergence. Finally, a small Ka/Ks ratio estimated from cox-1 sequences suggests that this gene in C. irritans remains under strong purifying selection. Taken together, the C. irritans species may consist of many subspecies and/or syngens. Further work is needed to determine if there is reproductive isolation between the groups we have defined. Copyright © 2017 Elsevier Inc. All rights reserved.
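To make the notion of percent sequence divergence concrete, here is a minimal Python sketch of an uncorrected pairwise distance (p-distance) between two aligned sequences. The abstract does not state which distance measure the authors used, so the uncorrected p-distance, the function name `p_distance`, the gap-skipping rule, and the toy sequences are all assumptions made for illustration.

```python
def p_distance(seq_a, seq_b):
    """Fraction of compared positions at which two aligned sequences differ.

    Positions where either sequence has a gap ('-') are skipped, which is
    one common convention and an assumption here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differing / compared

# Hypothetical toy fragments, not real cox-1 data.
toy_a = "ATGGCATTAC"
toy_b = "ATGACAT-AT"
d = p_distance(toy_a, toy_b)  # 2 differences over 9 compared positions
```

Under this measure, a divergence of 9.28% over 1582 positions would correspond to roughly 147 differing sites, although model-corrected distances would give somewhat different numbers.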
On the Detection and Characterization of Polluted White Dwarfs
NASA Astrophysics Data System (ADS)
Steele, Amy; Debes, John H.; Deming, Drake
2017-06-01
There is evidence of circumstellar material around main sequence, giant, and white dwarf stars. What happens to this material after the main sequence? With this work, we focus on the characterization of the material around WD 1145+017. The goals are to monitor the white dwarf, which has a transiting, disintegrating planetesimal, and to determine the composition of the evaporated material for that same white dwarf by looking at high-resolution spectra. We also present preliminary results of follow-up photometric observations of known polluted WDs. If rocky bodies survive red giant branch evolution, then the material raining down on a WD atmosphere is a direct probe of main sequence cosmochemistry. If rocky bodies do not survive the evolution, then this informs the degree of post-main-sequence processing. These case studies will provide the community with further insight about debris disk modeling, the degree of post-main-sequence processing of circumstellar material, and the composition of a disintegrating planetesimal.
The set of triple-resonance sequences with a multiple quantum coherence evolution period
NASA Astrophysics Data System (ADS)
Koźmiński, Wiktor; Zhukov, Igor
2004-12-01
A new pulse sequence building block that relies on the evolution of heteronuclear multiple quantum coherences is proposed. The particular chemical shifts are obtained in multiple quadrature, using linear combinations of frequencies taken from spectra measured at different quantum levels. The pulse sequences designed in this way consist of a small number of RF pulses, are as short as possible, and could be applied for determination of coupling constants. The examples presented involve 2D correlations HNCO, HNCA, HN(CO)CA, and H(N)COCA via heteronuclear zero and double coherences, as well as a 2D HNCOCA technique with simultaneous evolution of triple and three distinct single quantum coherences. Applications of the new sequences are presented for 13C, 15N-labeled ubiquitin.
Mercado, Francisco; Almanza, Angélica; Rubio, Nazario; Soto, Enrique
2018-06-11
Multiple sclerosis (MS) is a high-prevalence degenerative disease characterized at the cellular level by glial and neuronal cell death. The causes of cell death during the disease course are not fully understood. In this work we demonstrate that in an MS model induced by Theiler's murine encephalomyelitis virus (TMEV) infection, the inward rectifier (Kir) 4.1 potassium channel subunit is overexpressed in astrocytes. In voltage clamp experiments the inward current density from TMEV-infected astrocytes was significantly larger than in mock-infected ones. The cRNA hybridization analysis from mock- and TMEV-infected cells showed an upregulation of a potassium transport channel coding sequence. We validated this mRNA increase by RT-PCR and quantitative PCR using Kir 4.1 specific primers. Western blotting experiments confirmed the upregulation of Kir 4.1, and alignment between sequences provided the demonstration that the over-expressed gene encodes a Kir family member. Flow cytometry showed that the Kir 4.1 protein is located mainly in the cell membrane in mock- and TMEV-infected astrocytes. Our results demonstrate an increase in K+ inward current in TMEV-infected glial cells; this increment may reduce the neuronal depolarization, contributing to cell resilience mechanisms. Copyright © 2018 Elsevier B.V. All rights reserved.
Introduction to bioinformatics.
Can, Tolga
2014-01-01
Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
Fuchs, Sabine A; Harakalova, Magdalena; van Haaften, Gijs; van Hasselt, Peter M; Cuppen, Edwin; Houwen, Roderick H J
2012-07-01
The genetic defect in a number of rare disorders of metal metabolism remains elusive. The limited number of patients with these disorders impedes the identification of the causative gene through positional cloning, which requires numerous families with multiple affected individuals. However, with next-generation sequencing all coding DNA (exomes) or whole genomes of patients can be sequenced to identify genes that are consistently mutated in patients. With this strategy only a limited number of patients and/or pedigrees is needed, bringing the elucidation of the genetic cause of even very rare diseases within reach. The main challenge associated with whole exome sequencing is the identification of the disease-causing mutation(s) among abundant genetic candidate variants. We describe several strategies to manage this data wealth, including comparison with control databases, increasing the number of patients and controls, and reducing the genomic region under investigation through homozygosity mapping. In this review we introduce a number of rare disorders of copper metabolism, with a suspected but yet unknown monogenetic cause, as an attractive target for this strategy. We anticipate that use of these novel techniques will identify the basic defect in the disorders described in this review, as well as in other genetic disorders of metal metabolism, in the next few years.
A catalog of aftershock sequences in Greece (1971-1997): Their spatial and temporal characteristics
NASA Astrophysics Data System (ADS)
Drakatos, George; Latoussakis, John
A complete catalog of aftershock sequences is provided for main earthquakes with ML 5.0 that occurred in Greece and surrounding regions over the last twenty-seven years. The Monthly Bulletins of the Institute of Geodynamics (National Observatory of Athens) have been used as the data source. In order to obtain a homogeneous catalog, several selection criteria have been applied, and a catalog of 44 aftershock sequences has been compiled. The relations between the duration of the sequence, the number of aftershocks, the magnitude of the largest aftershock and its delay time from the main shock, as well as the subsurface rupture length versus the magnitude of the main shock, are calculated. The results show that linearity exists between the subsurface rupture length and the magnitude of the main shock independent of the slip type, as well as between the magnitude of the main shock (M) and its largest aftershock (Ma). The mean difference M-Ma is almost one unit. In 40% of the analyzed sequences, the largest aftershock occurred within one day after the main shock. The fact that the aftershock sequences show the same behavior for earthquakes that occur in the same region supports the theory that the spatial and temporal characteristics are strongly related to the stress distribution of the fault area.
Multiplicity of the Galactic Senior Citizens: A high-resolution search for cool subdwarf companions
NASA Astrophysics Data System (ADS)
Ziegler, Carl; Law, Nicholas M.
2015-01-01
Cool subdwarfs, with spectral types late K and M, are the oldest members of the low-mass stellar population. Mostly present in the galactic halo, subdwarfs are characterized by their low metallicity and high proper motions. Understanding their binary fraction could give key insights into the star formation process early in the Milky Way's history. However, because of their low luminosity and relative rarity in the solar neighborhood, binary surveys of cool subdwarfs have suffered from small sample sizes and large incompleteness gaps. It appears, however, that the binary fraction of red subdwarfs is much lower than for their main-sequence cousins. Using the highly efficient Robo-AO system, we present the largest high-resolution survey of subdwarfs yet. We find that, of 349 target cool subdwarfs, 39 are in multiple systems, 13 of them newly discovered, for a binary fraction of 11 ± 1.8%.
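The quoted binary fraction can be checked with a few lines of Python arithmetic. The abstract does not say how the uncertainty was derived, so the two estimates below (a Poisson counting error and a binomial standard error) are assumptions shown for comparison, and the variable names are illustrative.

```python
import math

n_targets = 349    # cool subdwarfs observed (from the abstract)
n_multiple = 39    # targets found in multiple systems (from the abstract)

fraction = n_multiple / n_targets

# Assumed error model 1: Poisson counting error on the number of multiples.
poisson_err = math.sqrt(n_multiple) / n_targets

# Assumed error model 2: binomial standard error on the fraction.
binomial_err = math.sqrt(fraction * (1 - fraction) / n_targets)

print(f"fraction = {100 * fraction:.1f}%")
print(f"Poisson  = {100 * poisson_err:.1f}%")
print(f"binomial = {100 * binomial_err:.1f}%")
```

The fraction comes out at about 11.2%, with a Poisson error of about 1.8% and a binomial error of about 1.7%. The Poisson estimate is the one that matches the quoted uncertainty, though this agreement does not confirm which method the authors actually used.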
SIG: a general-purpose signal processing program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lager, D.; Azevedo, S.
1986-02-01
SIG is a general-purpose signal processing, analysis, and display program. Its main purpose is to perform manipulations on time- and frequency-domain signals. It also accommodates other representations for data such as transfer function polynomials. Signal processing operations include digital filtering, auto/cross spectral density, transfer function/impulse response, convolution, Fourier transform, and inverse Fourier transform. Graphical operations provide display of signals and spectra, including plotting, cursor zoom, families of curves, and multiple viewport plots. SIG provides two user interfaces with a menu mode for occasional users and a command mode for more experienced users. Capability exists for multiple commands per line, command files with arguments, commenting lines, defining commands, automatic execution for each item in a repeat sequence, etc. SIG is presently available for VAX (VMS), VAX (BERKELEY 4.2 UNIX), SUN (BERKELEY 4.2 UNIX), DEC-20 (TOPS-20), LSI-11/23 (TSX), and DEC PRO 350 (TSX). 4 refs., 2 figs.
Murillo, Gabriel H; You, Na; Su, Xiaoquan; Cui, Wei; Reilly, Muredach P; Li, Mingyao; Ning, Kang; Cui, Xinping
2016-05-15
Single nucleotide variant (SNV) detection procedures are being utilized as never before to analyze the recent abundance of high-throughput DNA sequencing data, both on single and multiple sample datasets. Building on previously published work with the single sample SNV caller genotype model selection (GeMS), a multiple sample version of GeMS (MultiGeMS) is introduced. Unlike other popular multiple sample SNV callers, the MultiGeMS statistical model accounts for enzymatic substitution sequencing errors. It also addresses the multiple testing problem endemic to multiple sample SNV calling and utilizes high performance computing (HPC) techniques. A simulation study demonstrates that MultiGeMS ranks highest in precision among a selection of popular multiple sample SNV callers, while showing exceptional recall in calling common SNVs. Further, both simulation studies and real data analyses indicate that MultiGeMS is robust to low-quality data. We also demonstrate that accounting for enzymatic substitution sequencing errors not only improves SNV call precision at low mapping quality regions, but also improves recall at reference allele-dominated sites with high mapping quality. The MultiGeMS package can be downloaded from https://github.com/cui-lab/multigems. Contact: xinping.cui@ucr.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.
Columbia/Einstein observations of galactic X-ray sources
NASA Technical Reports Server (NTRS)
Long, K. S.
1979-01-01
The imaging observations of galactic clusters are presented. These fall into three categories: pre-main-sequence stars in the Orion nebula, isolated main-sequence and post-main-sequence stars, and supernova remnants (SNRs). In addition to SNRs, approximately 30 sources were detected.
Kwarciak, Kamil; Radom, Marcin; Formanowicz, Piotr
2016-04-01
Classical sequencing by hybridization takes into account only binary information about sequence composition: a given element of an oligonucleotide library either is or is not a part of the target sequence. However, DNA chip technology has developed to the point where it can provide partial information about the multiplicity of each oligonucleotide that the analyzed sequence consists of. Currently, it is not possible to obtain exact data of this type, but even partial information should be very useful. Two realistic multiplicity information models are considered in this paper. The first one, called "one and many", assumes that it is possible to determine whether a given oligonucleotide occurs in a reconstructed sequence once or more than once. According to the second model, called "one, two and many", the biochemical experiment reveals whether a given oligonucleotide is present in the analyzed sequence once, twice, or at least three times. An ant colony optimization algorithm has been implemented to verify the above models and to compare with existing algorithms for sequencing by hybridization which utilize the additional information. The proposed algorithm solves the problem with any kind of hybridization errors. Computational experiment results confirm that using even partial information about multiplicity leads to increased quality of reconstructed sequences. Moreover, they also show that the more precise model yields better solutions and that the ant colony optimization algorithm outperforms the existing ones. Test data sets and the proposed ant colony optimization algorithm are available at http://bioserver.cs.put.poznan.pl/download/ACO4mSBH.zip. Copyright © 2016 Elsevier Ltd. All rights reserved.
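The two multiplicity models described above can be sketched in Python by building an idealized, error-free spectrum of a known sequence and coarsening the exact oligonucleotide counts into the labels each model would report. The function name `multiplicity_spectrum`, the label strings, and the toy target sequence are assumptions for illustration; this sketch only shows the input information, not the ant colony optimization algorithm from the paper.

```python
from collections import Counter

def multiplicity_spectrum(sequence, k, model):
    """Map each k-mer of `sequence` to the coarse multiplicity label
    that the chosen information model would provide."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

    if model == "one and many":
        # Only distinguishes a single occurrence from repeated occurrences.
        def label(n):
            return "one" if n == 1 else "many"
    elif model == "one, two and many":
        # Distinguishes once, twice, and at least three times.
        def label(n):
            return {1: "one", 2: "two"}.get(n, "many")
    else:
        raise ValueError(f"unknown model: {model}")

    return {oligo: label(n) for oligo, n in counts.items()}

# Hypothetical target: AAA occurs 3 times, CGC and GCG twice, AAC and ACG once.
target = "AAAAACGCGCG"
coarse = multiplicity_spectrum(target, 3, "one and many")
fine = multiplicity_spectrum(target, 3, "one, two and many")
```

In the coarser model `CGC` and `AAA` are indistinguishable (both "many"), whereas the finer model separates them ("two" versus "many"). That difference is the extra information the abstract reports as improving reconstruction quality.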
Identifying Multiple Populations in M71 using CN
NASA Astrophysics Data System (ADS)
Gerber, Jeffrey M.; Friel, Eileen D.; Vesperini, Enrico
2018-01-01
It is now well established that globular clusters (GCs) host multiple stellar populations characterized by differences in several light elements. While these populations have been found in nearly all GCs, we still lack an entirely successful model to explain their formation. A key constraint to these models is the detailed pattern of light element abundances seen among the populations; different techniques for identifying these populations probe different elements and do not always yield the same results. We study a large sample of stars in the GC M71 for light elements C and N, using the CN and CH band strength to identify multiple populations. Our measurements come from low-resolution spectroscopy obtained with the WIYN-3.5m telescope for ~150 stars from the tip of the red-giant branch down to the main-sequence turn-off. The large number of stars and broad spatial coverage of our sample (out to ~3.5 half-light radii) allows us to carry out a comprehensive characterization of the multiple populations in M71. We use a combination of the various spectroscopic and photometric indicators to draw a more complete picture of the properties of the populations and to investigate the consistency of classifications using different techniques.
Adaptive Digital Signature Design and Short-Data-Record Adaptive Filtering
2008-04-01
rate; BPSK: binary phase shift keying; CA-CFAR: cell averaging constant false alarm rate; CDMA: code-division multiple-access; CFAR: constant false... Cotae, “Spreading sequence design for multiple cell synchronous DS-CDMA systems under total weighted squared correlation criterion,” EURASIP Journal... 415-428, Mar. 2002. [6] P. Cotae, “Spreading sequence design for multiple cell synchronous DS-CDMA systems under total weighted squared correlation
Bernsen, M R; Dijkman, H B; de Vries, E; Figdor, C G; Ruiter, D J; Adema, G J; van Muijen, G N
1998-10-01
Molecular analysis of small tissue samples has become increasingly important in biomedical studies. Using a laser dissection microscope and modified nucleic acid isolation protocols, we demonstrate that multiple mRNA as well as DNA sequences can be identified from a single-cell sample. In addition, we show that the specificity of procurement of tissue samples is not compromised by smear contamination resulting from scraping of the microtome knife during sectioning of lesions. The procedures described herein thus allow for efficient RT-PCR or PCR analysis of multiple nucleic acid sequences from small tissue samples obtained by laser-assisted microdissection.
Sun, Mingjun; Jing, Zhigang; Di, Dongdong; Yan, Hao; Zhang, Zhicheng; Xu, Quangang; Zhang, Xiyue; Wang, Xun; Ni, Bo; Sun, Xiangxiang; Yan, Chengxu; Yang, Zhen; Tian, Lili; Li, Jinping; Fan, Weixing
2017-01-01
Brucellosis is a worldwide zoonotic disease caused by Brucella spp. In China, brucellosis is recognized as a reemerging disease mainly caused by the species Brucella melitensis. To better understand the currently endemic B. melitensis strains in China, three Brucella genotyping methods were applied to 110 B. melitensis strains obtained in the past several years. By MLVA genotyping, five MLVA-8 genotypes were identified, among which genotype 42 (1-5-3-13-2-2-3-2) was recognized as the predominant genotype, while genotype 63 (1-5-3-13-2-3-3-2) and a novel genotype, 1-5-3-13-2-4-3-2, were the second most frequently observed. MLVA-16 discerned a total of 57 MLVA-16 genotypes among these Brucella strains, with 41 genotypes detected for the first time and the other 16 genotypes previously reported. By BruMLSA21 typing, six sequence types (STs) were identified; among them, ST8 is the most frequently seen in China, while the other five STs were detected for the first time and designated ST137, ST138, ST139, ST140, and ST141 by the international multilocus sequence typing database. Whole-genome sequence (WGS)-single-nucleotide polymorphism (SNP)-based typing and phylogenetic analysis resolved Chinese B. melitensis strains into five clusters, reflecting the existence of multiple lineages among these Chinese B. melitensis strains. In phylogeny, Chinese lineages are more closely related to strains collected from East Mediterranean and Middle East countries, such as Turkey, Kuwait, and Iraq. In the next few years, MLVA typing will certainly remain an important epidemiological tool for Brucella infection analysis, as it displays a high discriminatory ability and achieves results largely in agreement with WGS-SNP-based typing. However, WGS-SNP-based typing is found to be the most powerful and reliable method for discerning Brucella strains and will be widely used in the future.
Viral Diagnostics in Plants Using Next Generation Sequencing: Computational Analysis in Practice.
Jones, Susan; Baizan-Edge, Amanda; MacFarlane, Stuart; Torrance, Lesley
2017-01-01
Viruses cause significant yield and quality losses in a wide variety of cultivated crops. Hence, the detection and identification of viruses is a crucial facet of successful crop production and of great significance in terms of world food security. Whilst the adoption of molecular techniques such as RT-PCR has increased the speed and accuracy of viral diagnostics, such techniques only allow the detection of known viruses, i.e., each test is specific to one or a small number of related viruses. Therefore, unknown viruses can be missed and testing can be slow and expensive if molecular tests are unavailable. Methods for simultaneous detection of multiple viruses have been developed, and next generation sequencing (NGS) is now a principal focus of this area, as it enables unbiased and hypothesis-free testing of plant samples. The development of NGS protocols capable of detecting multiple known and emergent viruses present in infected material is proving to be a major advance for crops, nuclear stocks or imported plants and germplasm, in which disease symptoms are absent, unspecific or only triggered by multiple viruses. Researchers want to answer the question "how many different viruses are present in this crop plant?" without knowing what they are looking for: RNA-sequencing (RNA-seq) of plant material allows this question to be addressed. As well as needing efficient nucleic acid extraction and enrichment protocols, virus detection using RNA-seq requires fast and robust bioinformatics methods to enable host sequence removal and virus classification. In this review recent studies that use RNA-seq for virus detection in a variety of crop plants are discussed with specific emphasis on the computational methods implemented. The main features of a number of specific bioinformatics workflows developed for virus detection from NGS data are also outlined and possible reasons why these have not yet been widely adopted are discussed.
The review concludes by discussing the future directions of this field, including the use of bioinformatics tools for virus detection deployed in analytical environments using cloud computing.
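The two computational steps named in this abstract, host sequence removal and virus classification, can be illustrated with a deliberately simplified sketch. It substitutes exact k-mer matching for the read aligners and similarity searches that real workflows would normally use; the function names, the default k, and the toy sequences are all assumptions made for illustration and do not come from the review.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def remove_host_reads(reads, host_sequence, k=8):
    """Drop every read that shares at least one k-mer with the host sequence."""
    host_kmers = kmers(host_sequence, k)
    return [read for read in reads if not (kmers(read, k) & host_kmers)]


def classify_read(read, virus_references, k=8):
    """Return the reference name sharing the most k-mers with the read.

    Returns None when the read shares no k-mer with any reference.
    """
    read_kmers = kmers(read, k)
    best_name, best_shared = None, 0
    for name, reference in virus_references.items():
        shared = len(read_kmers & kmers(reference, k))
        if shared > best_shared:
            best_name, best_shared = name, shared
    return best_name


if __name__ == "__main__":
    host = "AAAACCCCGGGG"                       # toy host sequence
    references = {"virusA": "GGTTGACTGACC",     # toy virus references
                  "virusB": "ACACACACAC"}
    reads = ["AACCCC", "TTGACTGA"]
    non_host = remove_host_reads(reads, host, k=4)
    print(non_host)
    print([classify_read(read, references, k=4) for read in non_host])
```

Exact k-mer matching is used here only because it keeps the example self-contained; it is far less tolerant of sequence divergence than the alignment-based tools discussed in the review.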
Generation of 2A-linked multicistronic cassettes by recombinant PCR.
Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A
2012-02-01
The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. It is now possible to express multiple proteins from a single open reading frame (ORF) using 2A peptide-linked multicistronic vectors. These small sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector. This protocol describes the use of recombinant polymerase chain reaction (PCR) to connect multiple 2A-linked protein sequences. The final construct is subcloned into an expression vector.
Notredame, Cedric
2018-05-02
Cedric Notredame from the Centre for Genomic Regulation gives a presentation on New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era at the JGI/Argonne HPC Workshop on January 26, 2010.
USDA-ARS's Scientific Manuscript database
The Spodoptera littoralis multiple nucleopolyhedrovirus (SpliMNPV), a pathogen of the Egyptian cotton leaf worm Spodoptera littoralis, was subjected to sequencing of its entire DNA genome and bioassay analysis comparing its virulence to that of other baculoviruses. The annotated SpliMNPV genome of...
Applications of Single-Cell Sequencing for Multiomics.
Xu, Yungang; Zhou, Xiaobo
2018-01-01
Single-cell sequencing interrogates the sequence or chromatin information from individual cells with advanced next-generation sequencing technologies. It provides a higher resolution of cellular differences and a better understanding of the underlying genetic and epigenetic mechanisms of an individual cell in the context of its survival and adaptation to microenvironment. However, it is more challenging to perform single-cell sequencing and downstream data analysis, owing to the minimal amount of starting materials, sample loss, and contamination. In addition, due to the picogram level of the amount of nucleic acids used, heavy amplification is often needed during sample preparation of single-cell sequencing, resulting in the uneven coverage, noise, and inaccurate quantification of sequencing data. All these unique properties raise challenges in and thus high demands for computational methods that specifically fit single-cell sequencing data. We here comprehensively survey the current strategies and challenges for multiple single-cell sequencing, including single-cell transcriptome, genome, and epigenome, beginning with a brief introduction to multiple sequencing techniques for single cells.
Eye movement sequence generation in humans: Motor or goal updating?
Quaia, Christian; Joiner, Wilsaan M.; FitzGibbon, Edmond J.; Optican, Lance M.; Smith, Maurice A.
2011-01-01
Saccadic eye movements are often grouped in pre-programmed sequences. The mechanism underlying the generation of each saccade in a sequence is currently poorly understood. Broadly speaking, two alternative schemes are possible: first, after each saccade the retinotopic location of the next target could be estimated, and an appropriate saccade could be generated. We call this the goal updating hypothesis. Alternatively, multiple motor plans could be pre-computed, and they could then be updated after each movement. We call this the motor updating hypothesis. We used McLaughlin’s intra-saccadic step paradigm to artificially create a condition under which these two hypotheses make discriminable predictions. We found that in human subjects, when sequences of two saccades are planned, the motor updating hypothesis predicts the landing position of the second saccade in two-saccade sequences much better than the goal updating hypothesis. This finding suggests that the human saccadic system is capable of executing sequences of saccades to multiple targets by planning multiple motor commands, which are then updated by serial subtraction of ongoing motor output. PMID:21191134
The proximal-to-distal sequence in upper-limb motions on multiple levels and time scales.
Serrien, Ben; Baeyens, Jean-Pierre
2017-10-01
The proximal-to-distal sequence is a phenomenon that can be observed in a large variety of motions of the upper limbs in both humans and other mammals. The mechanisms behind this sequence are not completely understood and motor control theories able to explain this phenomenon are currently incomplete. The aim of this narrative review is to take a theoretical constraints-led approach to the proximal-to-distal sequence and provide a broad multidisciplinary overview of relevant literature. This sequence exists at multiple levels (brain, spine, muscles, kinetics and kinematics) and on multiple time scales (motion, motor learning and development, growth and possibly even evolution). We hypothesize that the proximodistal spatiotemporal direction on each time scale and level provides part of the organismic constraints that guide the dynamics at the other levels and time scales. The constraint-led approach in this review may serve as a first onset towards integration of evidence and a framework for further experimentation to reveal the dynamics of the proximal-to-distal sequence. Copyright © 2017 Elsevier B.V. All rights reserved.
Prof. Hayashi's work on the pre-main sequence evolution and brown dwarfs
NASA Astrophysics Data System (ADS)
Nakano, Takenori
2012-09-01
Prof. Hayashi's work on the evolution of stars in the pre-main sequence stage is reviewed. The historical background and the process of finding the Hayashi phase are mentioned. The work on the evolution of low-mass stars is also reviewed including the determination of the bottom of the main sequence and evolution of brown dwarfs, and comparison is made with the other works in the same period.
Wang, Yi; Wang, Yan; Zhang, Lu; Liu, Dongxin; Luo, Lijuan; Li, Hua; Cao, Xiaolong; Liu, Kai; Xu, Jianguo; Ye, Changyun
2016-01-01
We have devised a novel isothermal amplification technology, termed endonuclease restriction-mediated real-time multiple cross displacement amplification (ET-MCDA), which facilitated multiplex, rapid, specific and sensitive detection of nucleic-acid sequences at a constant temperature. The ET-MCDA integrated multiple cross displacement amplification strategy, restriction endonuclease cleavage and real-time fluorescence detection technique. In the ET-MCDA system, the functional cross primer E-CP1 or E-CP2 was constructed by adding a short sequence at the 5' end of CP1 or CP2, respectively, and the new E-CP1 or E-CP2 primer was labeled at the 5' end with a fluorophore and in the middle with a dark quencher. The restriction endonuclease Nb.BsrDI specifically recognized the short sequence and digested the newly synthesized double-stranded terminal sequences (5' end short sequences and their complementary sequences), which released the quenching, resulting in a gain of fluorescence signal. Thus, the ET-MCDA allowed real-time detection of single or multiple targets in a single reaction, and positive results were observed in as little as 12 min, detecting down to 3.125 fg of genomic DNA per tube. Moreover, the analytical specificity and the practical application of the ET-MCDA were also successfully evaluated in this study. Here, we provided the details on the novel ET-MCDA technique and expounded the basic ET-MCDA amplification mechanism.
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status at single-nucleotide resolution. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files and also adds extra bases belonging to the plasmids to the sequence, which will cause problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job. CpG PatternFinder was developed for this purpose. The main functions of CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to the Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population.
Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
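Functions (i) and (iii) described in the abstract above can be sketched as follows. This is not CpG PatternFinder's code: it is a minimal illustration that assumes the sample has already been aligned to the reference (equal length, gaps written as "-") and uses the standard bisulfite reading, in which a C retained in the sample indicates methylation and a C read as T indicates an unmethylated, converted cytosine. All function names are hypothetical.

```python
def cpg_positions(reference):
    """0-based positions of cytosines that are followed by a guanine (CpG)."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i] == "C" and ref[i + 1] == "G"]


def non_cpg_c_positions(reference):
    """0-based positions of cytosines that are not in a CpG context."""
    ref = reference.upper()
    return [i for i, base in enumerate(ref)
            if base == "C" and ref[i + 1:i + 2] != "G"]


def analyze_sample(reference, sample):
    """Return (CpG status by position, bisulfite conversion efficiency).

    The conversion efficiency is the fraction of non-CpG cytosines read as T,
    counting only positions where the sample shows C or T. It is None when no
    such position exists.
    """
    if len(reference) != len(sample):
        raise ValueError("reference and sample must be aligned to equal length")
    sample = sample.upper()
    calls = {"C": "methylated", "T": "unmethylated"}
    status = {pos: calls.get(sample[pos], "unknown")
              for pos in cpg_positions(reference)}
    non_cpg = non_cpg_c_positions(reference)
    converted = sum(1 for pos in non_cpg if sample[pos] == "T")
    informative = sum(1 for pos in non_cpg if sample[pos] in ("C", "T"))
    efficiency = converted / informative if informative else None
    return status, efficiency


if __name__ == "__main__":
    print(analyze_sample("ACGTCCGATC", "ACGTTTGATT"))
```

In the usage example the reference has CpG cytosines at positions 1 and 5 and non-CpG cytosines at positions 4 and 9, so the sample is reported as methylated at position 1, unmethylated at position 5, and fully converted at the non-CpG sites.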
Normal and compound poisson approximations for pattern occurrences in NGS reads.
Zhai, Zhiyuan; Reinert, Gesine; Song, Kai; Waterman, Michael S; Luan, Yihui; Sun, Fengzhu
2012-06-01
Next generation sequencing (NGS) technologies are now widely used in many biological studies. In NGS, sequence reads are randomly sampled from the genome sequence of interest. Most computational approaches for NGS data first map the reads to the genome and then analyze the data based on the mapped reads. Since many organisms have unknown genome sequences and many reads cannot be uniquely mapped to the genomes even if the genome sequences are known, alternative analytical methods are needed for the study of NGS data. Here we suggest using word patterns to analyze NGS data. Word pattern counting (the study of the probabilistic distribution of the number of occurrences of word patterns in one or multiple long sequences) has played an important role in molecular sequence analysis. However, no studies are available on the distribution of the number of occurrences of word patterns in NGS reads. In this article, we build probabilistic models for the background sequence and the sampling process of the sequence reads from the genome. Based on the models, we provide normal and compound Poisson approximations for the number of occurrences of word patterns from the sequence reads, with bounds on the approximation error. The main challenge is to consider the randomness in generating the long background sequence, as well as in the sampling of the reads using NGS. We show the accuracy of these approximations under a variety of conditions for different patterns with various characteristics. Under realistic assumptions, the compound Poisson approximation seems to outperform the normal approximation in most situations. These approximate distributions can be used to evaluate the statistical significance of the occurrence of patterns from NGS data. The theory and the computational algorithm for calculating the approximate distributions are then used to analyze ChIP-Seq data using transcription factor GABP. 
Software is available online (www-rcf.usc.edu/∼fsun/Programs/NGS_motif_power/NGS_motif_power.html). In addition, Supplementary Material can be found online (www.liebertonline.com/cmb).
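To make the quantity studied in this abstract concrete, the sketch below counts the occurrences of a word pattern across a set of reads and compares the count with its expectation under a very simple background model. The independent, identically distributed base model used here is an assumption chosen for brevity; it is not the background and read-sampling model of the paper, and it does not reproduce the normal or compound Poisson approximations.

```python
from math import prod


def count_word(reads, word):
    """Total number of (possibly overlapping) occurrences of word in all reads."""
    k = len(word)
    return sum(1
               for read in reads
               for i in range(len(read) - k + 1)
               if read[i:i + k] == word)


def expected_count_iid(reads, word, base_probs):
    """Expected count when every base is drawn independently from base_probs."""
    k = len(word)
    p_word = prod(base_probs[base] for base in word)
    windows = sum(max(len(read) - k + 1, 0) for read in reads)
    return windows * p_word


if __name__ == "__main__":
    reads = ["ACGACG", "TTACGT"]
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    print(count_word(reads, "ACG"))                   # observed count
    print(expected_count_iid(reads, "ACG", uniform))  # expectation under the toy model
```

Judging whether an observed count is significant additionally requires the distribution of the count, which is what the approximations in the paper provide.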
Vuillemin, Aurèle; Horn, Fabian; Alawi, Mashal; Henny, Cynthia; Wagner, Dirk; Crowe, Sean A.; Kallmeyer, Jens
2017-01-01
Extracellular DNA is ubiquitous in soil and sediment and constitutes a dominant fraction of environmental DNA in aquatic systems. In theory, extracellular DNA is composed of genomic elements persisting at different degrees of preservation produced by processes occurring on land, in the water column and sediment. Extracellular DNA can be taken up as a nutrient source, excreted or degraded by microorganisms, or adsorbed onto mineral matrices, thus potentially preserving information from past environments. To test whether extracellular DNA records lacustrine conditions, we sequentially extracted extracellular and intracellular DNA from anoxic sediments of ferruginous Lake Towuti, Indonesia. We applied 16S rRNA gene Illumina sequencing on both fractions to discriminate exogenous from endogenous sources of extracellular DNA in the sediment. Environmental sequences exclusively found as extracellular DNA in the sediment originated from multiple sources. For instance, Actinobacteria, Verrucomicrobia, and Acidobacteria derived from soils in the catchment. Limited primary productivity in the water column resulted in few sequences of Cyanobacteria in the oxic photic zone, whereas stratification of the water body mainly led to secondary production by aerobic and anaerobic heterotrophs. Chloroflexi and Planctomycetes, the main degraders of sinking organic matter and planktonic sequences at the water-sediment interface, were preferentially preserved during the initial phase of burial. To trace endogenous sources of extracellular DNA, we used relative abundances of taxa in the intracellular DNA to define which microbial populations grow, decline or persist at low density with sediment depth. Cell lysis became an important additional source of extracellular DNA, gradually covering previous genetic assemblages as other microbial genera became more abundant with depth. 
The use of extracellular DNA as nutrient by active microorganisms led to selective removal of sequences with lowest GC contents. We conclude that extracellular DNA preserved in shallow lacustrine sediments reflects the initial environmental context, but is gradually modified and thereby shifts from its stratigraphic context. Discrimination of exogenous and endogenous sources of extracellular DNA allows simultaneously addressing in-lake and post-depositional processes. In deeper sediments, the accumulation of resting stages and sequences from cell lysis would require stringent extraction and specific primers if ancient DNA is targeted. PMID:28798742
Li, Man; Ling, Cheng; Xu, Qi; Gao, Jingyang
2018-02-01
Sequence classification is crucial in predicting the function of newly discovered sequences. In recent years, predicting the function of an ever larger and more diverse set of sequences has relied heavily on machine-learning algorithms. To improve prediction accuracy, these algorithms must confront the key challenge of extracting valuable features. In this work, we propose a feature-enhanced protein classification approach that combines multiple sequence alignment algorithms, an N-gram probabilistic language model and deep learning. The idea behind the proposed method is that if each group of sequences can be represented by one feature sequence composed of homologous sites, then rebuilding that feature sequence after adding a sequence to the group should incur less loss the more relevant the added sequence is. On this basis, predicting whether a query sequence belongs to a group of sequences becomes a matter of calculating the probability that the new feature sequence evolves from the original one. The proposed work focuses on the hierarchical classification of G-protein Coupled Receptors (GPCRs), which begins by extracting the feature sequences from the multiple sequence alignment results of the GPCR sub-subfamilies. The N-gram model is then applied to construct the input vectors. Finally, these vectors are fed into a convolutional neural network to make a prediction. The experimental results show that the proposed method provides significant performance improvements. The classification error rate of the proposed method is reduced by at least 4.67% (family level I) and 5.75% (family level II), in comparison with the current state-of-the-art methods. The implementation program of the proposed work is freely available at: https://github.com/alanFchina/CNN .
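The step in which an N-gram model turns a sequence into an input vector can be illustrated with a generic sketch. It builds a fixed-length vector of N-gram relative frequencies over the 20 standard amino acids; this is only one plausible encoding, chosen as an assumption for illustration, and it is not taken from the authors' implementation.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def ngram_vector(sequence, n=2, alphabet=AMINO_ACIDS):
    """Relative frequency of every possible n-gram, in a fixed vocabulary order.

    N-grams containing symbols outside the alphabet are ignored.
    """
    vocabulary = ["".join(symbols) for symbols in product(alphabet, repeat=n)]
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    total = sum(counts[gram] for gram in vocabulary)
    return [counts[gram] / total if total else 0.0 for gram in vocabulary]


if __name__ == "__main__":
    vector = ngram_vector("MKTAYIAKQR", n=2)
    print(len(vector), sum(vector))
```

With n = 2 the vector has 400 entries regardless of sequence length, which is what makes it usable as a fixed-size network input.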
The size evolution of star-forming and quenched galaxies in the IllustrisTNG simulation
NASA Astrophysics Data System (ADS)
Genel, Shy; Nelson, Dylan; Pillepich, Annalisa; Springel, Volker; Pakmor, Rüdiger; Weinberger, Rainer; Hernquist, Lars; Naiman, Jill; Vogelsberger, Mark; Marinacci, Federico; Torrey, Paul
2018-03-01
We analyse scaling relations and evolution histories of galaxy sizes in TNG100, part of the IllustrisTNG simulation suite. Observational qualitative trends of size with stellar mass, star formation rate and redshift are reproduced, and a quantitative comparison of projected r band sizes at 0 ≲ z ≲ 2 shows agreement to much better than 0.25 dex. We follow populations of z = 0 galaxies with a range of masses backwards in time along their main progenitor branches, distinguishing between main-sequence and quenched galaxies. Our main findings are as follows. (i) At M*,z=0 ≳ 10^9.5 M⊙, the evolution of the median main progenitor differs, with quenched galaxies hardly growing in median size before quenching, whereas main-sequence galaxies grow their median size continuously, thus opening a gap from the progenitors of quenched galaxies. This is partly because the main-sequence high-redshift progenitors of quenched z = 0 galaxies are drawn from the lower end of the size distribution of the overall population of main-sequence high-redshift galaxies. (ii) Quenched galaxies with M*,z=0 ≳ 10^9.5 M⊙ experience a steep size growth on the size-mass plane after their quenching time, but with the exception of galaxies with M*,z=0 ≳ 10^11 M⊙, the size growth after quenching is small in absolute terms, such that most of the size (and mass) growth of quenched galaxies (and its variation among them) occurs while they are still on the main sequence. After they become quenched, the size growth rate of quenched galaxies as a function of time, as opposed to versus mass, is similar to that of main-sequence galaxies. Hence, the size gap is retained down to z = 0.
NASA Astrophysics Data System (ADS)
Furrer, Julien; Kramer, Frank; Marino, John P.; Glaser, Steffen J.; Luy, Burkhard
2004-01-01
Homonuclear Hartmann-Hahn transfer is one of the most important building blocks in modern high-resolution NMR. It constitutes a very efficient transfer element for the assignment of proteins, nucleic acids, and oligosaccharides. Nevertheless, in macromolecules exceeding ~10 kDa TOCSY-experiments can show decreasing sensitivity due to fast transverse relaxation processes that are active during the mixing periods. In this article we propose the MOCCA-XY16 multiple pulse sequence, originally developed for efficient TOCSY transfer through residual dipolar couplings, as a homonuclear Hartmann-Hahn sequence with improved relaxation properties. A theoretical analysis of the coherence transfer via scalar couplings and its relaxation behavior as well as experimental transfer curves for MOCCA-XY16 relative to the well-characterized DIPSI-2 multiple pulse sequence are given.
Furrer, Julien; Kramer, Frank; Marino, John P; Glaser, Steffen J; Luy, Burkhard
2004-01-01
Homonuclear Hartmann-Hahn transfer is one of the most important building blocks in modern high-resolution NMR. It constitutes a very efficient transfer element for the assignment of proteins, nucleic acids, and oligosaccharides. Nevertheless, in macromolecules exceeding approximately 10 kDa TOCSY-experiments can show decreasing sensitivity due to fast transverse relaxation processes that are active during the mixing periods. In this article we propose the MOCCA-XY16 multiple pulse sequence, originally developed for efficient TOCSY transfer through residual dipolar couplings, as a homonuclear Hartmann-Hahn sequence with improved relaxation properties. A theoretical analysis of the coherence transfer via scalar couplings and its relaxation behavior as well as experimental transfer curves for MOCCA-XY16 relative to the well-characterized DIPSI-2 multiple pulse sequence are given.
Chen, Guiqian; Qiu, Yuan; Zhuang, Qingye; Wang, Suchun; Wang, Tong; Chen, Jiming; Wang, Kaicheng
2018-05-09
Next generation sequencing (NGS) is a powerful tool for the characterization, discovery, and molecular identification of RNA viruses. Multiple NGS library preparation methods have been published for strand-specific RNA-seq, but some of them are not suitable for identifying and characterizing RNA viruses. In this study, we report an NGS library preparation method to identify RNA viruses using the Ion Torrent PGM platform. The NGS sequencing adapters were directly inserted into the sequencing library through reverse transcription and polymerase chain reaction, without fragmentation and ligation of nucleic acids. The results show that this method is simple to perform and able to identify multiple species of RNA viruses in clinical samples.
Genetic diversity among isolates of Autographa californica multiple nucleopolyhedrovirus
USDA-ARS's Scientific Manuscript database
Our knowledge of genetic variation at the nucleotide sequence level of Autographa californica multiple nucleopolyhedrovirus (AcMNPV; Baculoviridae: Alphabaculovirus) derives from complete genome sequences of the C6 clonal isolate of AcMNPV and the R1 and CL3 clonal isolates of AcMNPV variants Rachip...
USDA-ARS's Scientific Manuscript database
The Agrotis ipsilon multiple nucleopolyhedrovirus (AgipMNPV) is a group II nucleopolyhedrovirus (NPV) from the black cutworm, A. ipsilon, with potential as a biopesticide to control infestations of cutworm larvae. The genome of the Illinois strain of AgipMNPV was completely sequenced. The AgipMNPV...
USDA-ARS's Scientific Manuscript database
Geographic isolates of Lymantria dispar multiple nucleopolyhedrovirus: Genome sequence analysis and pathogenicity against European and Asian gypsy moth strains. To evaluate the genetic diversity of Lymantria dispar nucleopolyhedrovirus (LdMNPV) at the genomic level, the genomes of three isolates of...
EUGENE'HOM: A generic similarity-based gene finder using multiple homologous sequences.
Foissac, Sylvain; Bardou, Philippe; Moisan, Annick; Cros, Marie-Josée; Schiex, Thomas
2003-07-01
EUGENE'HOM is a gene prediction software for eukaryotic organisms based on comparative analysis. EUGENE'HOM is able to take into account multiple homologous sequences from more or less closely related organisms. It integrates the results of TBLASTX analysis, splice site and start codon prediction and a robust coding/non-coding probabilistic model which allows EUGENE'HOM to handle sequences from a variety of organisms. The current target of EUGENE'HOM is plant sequences. The EUGENE'HOM web site is available at http://genopole.toulouse.inra.fr/bioinfo/eugene/EuGeneHom/cgi-bin/EuGeneHom.pl.
Narad, Priyanka; Kumar, Abhishek; Chakraborty, Amlan; Patni, Pranav; Sengupta, Abhishek; Wadhwa, Gulshan; Upadhyaya, K C
2017-09-01
Transcription factors are trans-acting proteins that interact with specific nucleotide sequences known as transcription factor binding sites (TFBS), and these interactions are implicated in the regulation of gene expression. Regulation of transcriptional activation of a gene often involves multiple interactions of transcription factors with various sequence elements. Identification of these sequence elements is the first step in understanding the underlying molecular mechanism(s) that regulate gene expression. For in silico identification of these sequence elements, we have developed an online computational tool named transcription factor information system (TFIS) for detecting TFBS, for the first time using a collection of JAVA programs; it is mainly based on TFBS detection using position weight matrices (PWM). The databases used for obtaining position frequency matrices (PFM) are JASPAR and HOCOMOCO, which are open-access databases of transcription factor binding profiles. Pseudo-counts are used while converting PFM to PWM, and TFBS detection is carried out on the basis of a percent score taken as the threshold value. TFIS is equipped with advanced features such as direct sequence retrieval from the NCBI database using gene identification number and accession number, detection of binding sites for a common TF in a batch of gene sequences, and TFBS detection after generating a PWM from known raw binding sequences, in addition to general detection methods. TFIS can detect the presence of potential TFBSs in both directions at the same time. This feature increases its efficiency, and the results of this dual detection are presented in different colors specific to the orientation of the binding site.
Results obtained by TFIS are more detailed and specific to the detected TFs because informative links to various related web servers, such as Gene Ontology, PAZAR database and Transcription Factor Encyclopedia in addition to NCBI and UniProt, are integrated into the result pages. Common TFs like SP1, AP1 and NF-KB of the Amyloid beta precursor gene are easily detected using TFIS, along with multiple binding sites. In another scenario, the embryonic developmental process, TFs of the FOX family (FOXL1 and FOXC1) were also identified. TFIS is platform-independent and publicly available, along with its support and documentation, at http://tfistool.appspot.com and http://www.bioinfoplus.com/tfis/ . TFIS is licensed under the GNU General Public License, version 3 (GPL-3.0).
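The detection scheme summarized above (PFM to PWM conversion with pseudo-counts, a percent-score threshold, and scanning in both directions) can be sketched generically as follows. This is not TFIS code, which is described as a collection of Java programs; the pseudo-count formula, the uniform background of 0.25, the min-max definition of the percent score, and every function name are assumptions made for illustration. The sketch expects upper-case sequences containing only A, C, G and T.

```python
from math import log2

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Convert a position frequency matrix into a log-odds weight matrix.

    pfm maps each base to a list of counts, one per motif position.
    Assumed formula: p = (count + pseudocount) / (N + 4 * pseudocount).
    """
    length = len(pfm["A"])
    pwm = []
    for i in range(length):
        total = sum(pfm[base][i] for base in BASES) + pseudocount * len(BASES)
        pwm.append({base: log2(((pfm[base][i] + pseudocount) / total) / background)
                    for base in BASES})
    return pwm


def score(pwm, site):
    """Sum of the log-odds weights of a candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


def scan(sequence, pwm, threshold=0.8):
    """Return (start, strand, relative score) for hits on both strands.

    The relative score rescales the raw score to the range 0..1 between the
    minimum and maximum scores the matrix can produce.
    """
    k = len(pwm)
    lowest = sum(min(column.values()) for column in pwm)
    highest = sum(max(column.values()) for column in pwm)
    span = highest - lowest
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        reverse_complement = window.translate(COMPLEMENT)[::-1]
        for strand, site in (("+", window), ("-", reverse_complement)):
            relative = (score(pwm, site) - lowest) / span if span else 0.0
            if relative >= threshold:
                hits.append((start, strand, relative))
    return hits


if __name__ == "__main__":
    # Toy matrix for the motif GATA, built from ten identical binding sites.
    gata = {"A": [0, 10, 0, 10], "C": [0, 0, 0, 0],
            "G": [10, 0, 0, 0], "T": [0, 0, 10, 0]}
    pwm = pfm_to_pwm(gata)
    print(scan("CCGATACC", pwm, threshold=0.99))
    print(scan("CCTATCGG", pwm, threshold=0.99))
```

Scanning the reverse complement of each window, rather than building a second matrix, is one simple way to report hits in both orientations with coordinates on the input strand.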
Responses of stream microbes to multiple anthropogenic stressors in a mesocosm study.
Nuy, Julia K; Lange, Anja; Beermann, Arne J; Jensen, Manfred; Elbrecht, Vasco; Röhl, Oliver; Peršoh, Derek; Begerow, Dominik; Leese, Florian; Boenigk, Jens
2018-08-15
Stream ecosystems are affected by multiple anthropogenic stressors worldwide. Even though effects of many single stressors are comparatively well studied, the effects of multiple stressors are difficult to predict. In particular bacteria and protists, which are responsible for the majority of ecosystem respiration and element flows, are infrequently studied with respect to multiple-stressor responses. We conducted a stream mesocosm experiment to characterize the responses of microbiota to single and multiple stressors. Two functionally important stream habitats, leaf litter and benthic phototrophic rock biofilms, were exposed to three stressors in a full factorial design: fine sediment deposition, increased chloride concentration (salinization) and reduced flow velocity. We analyzed the microbial composition in the two habitat types of the mesocosms using an amplicon sequencing approach. Community analysis on different taxonomic levels as well as principal coordinate analyses (PCoAs) based on relative abundances of operational taxonomic units (OTUs) showed treatment-specific shifts in the eukaryotic biofilm community. Analysis of variance (ANOVA) revealed that Bacillariophyta responded positively to salinity and sediment increase, while the relative read abundance of chlorophyte taxa decreased. The combined effects of multiple stressors were mainly antagonistic. Therefore, the community composition in multiply stressed environments resembled the composition of the unstressed control community in terms of OTU occurrence and relative abundances. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Binary Sequences for Spread-Spectrum Multiple-Access Communication
1977-08-01
Massey, J. L., and Uhran, J. J., Jr., "Sub-baud coding," Proceedings of the Thirteenth Annual Allerton Conference on Circuit and System Theory, pp. 539...sequences in a multiple access environment," Proceedings of the Thirteenth Annual Allerton Conference on Circuit and System Theory, pp. 21-27, October..." Proceedings of the Thirteenth Annual Allerton Conference on Circuit and System Theory, pp. 548-559, October 1975. Yao, K., "Performance bounds on
NASA Astrophysics Data System (ADS)
Ellison, Sara L.; Sánchez, Sebastian F.; Ibarra-Medel, Hector; Antonio, Braulio; Mendel, J. Trevor; Barrera-Ballesteros, Jorge
2018-02-01
The tight correlation between total galaxy stellar mass and star formation rate (SFR) has become known as the star-forming main sequence. Using ~487 000 spaxels from galaxies observed as part of the Sloan Digital Sky Survey Mapping Nearby Galaxies at Apache Point Observatory (MaNGA) survey, we confirm previous results that a correlation also exists between the surface densities of star formation (ΣSFR) and stellar mass (Σ⋆) on kpc scales, representing a 'resolved' main sequence. Using a new metric (ΔΣSFR), which measures the relative enhancement or deficit of star formation on a spaxel-by-spaxel basis relative to the resolved main sequence, we investigate the SFR profiles of 864 galaxies as a function of their position relative to the global star-forming main sequence (ΔSFR). For galaxies above the global main sequence (positive ΔSFR) ΔΣSFR is elevated throughout the galaxy, but the greatest enhancement in star formation occurs at small radii (<3 kpc, or 0.5Re). Moreover, galaxies that are at least a factor of 3 above the main sequence show diluted gas phase metallicities out to 2Re, indicative of metal-poor gas inflows accompanying the starbursts. For quiescent/passive galaxies that lie at least a factor of 10 below the star-forming main sequence, there is an analogous deficit of star formation throughout the galaxy with the lowest values of ΔΣSFR in the central 3 kpc. Our results are in qualitative agreement with the 'compaction' scenario in which a central starburst leads to mass growth in the bulge and may ultimately precede galactic quenching from the inside-out.
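The idea of an offset from the resolved main sequence can be made concrete with a small sketch. The abstract does not state how the reference relation is constructed, so the code below assumes, purely for illustration, a straight-line fit in log ΣSFR versus log Σ⋆ and defines the offset as the logarithmic residual in dex. The function names and the toy numbers are hypothetical.

```python
from math import log10


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def delta_sigma_sfr(sigma_star, sigma_sfr, slope, intercept):
    """Offset in dex of one spaxel from the assumed linear resolved main sequence."""
    expected_log_sfr = slope * log10(sigma_star) + intercept
    return log10(sigma_sfr) - expected_log_sfr


if __name__ == "__main__":
    # Toy relation: log(Sigma_SFR) = 1.0 * log(Sigma_star) - 10.0
    slope, intercept = 1.0, -10.0
    print(delta_sigma_sfr(1e8, 3e-2, slope, intercept))  # a factor of 3 above
    print(delta_sigma_sfr(1e8, 1e-3, slope, intercept))  # a factor of 10 below
```

On a logarithmic scale, "a factor of 3 above" the relation corresponds to an offset of log10(3), about +0.48 dex, and "a factor of 10 below" corresponds to -1 dex.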
Reefing Line Tension in CPAS Main Parachute Clusters
NASA Technical Reports Server (NTRS)
Ray, Eric S.
2013-01-01
Reefing lines are an essential feature to manage inflation loads. During each Engineering Development Unit (EDU) test of the Capsule Parachute Assembly System (CPAS), a chase aircraft is staged to be level with the cluster of Main ringsail parachutes during the initial inflation and reefed stages. This allows for capturing high-quality still photographs of the reefed skirt, suspension line, and canopy geometry. The over-inflation angles are synchronized with measured loads data in order to compute the tension force in the reefing line. The traditional reefing tension equation assumes radial symmetry, but cluster effects cause the reefed skirt of each parachute to elongate to a more elliptical shape. This effect was considered in evaluating multiple parachutes to estimate the semi-major and semi-minor axes. Three flight tests are assessed, including one with a skipped first stage, which had peak reefing line tension over three times higher than the nominal parachute disreef sequence.
UV observations of blue stragglers and population 2 K dwarfs
NASA Technical Reports Server (NTRS)
Carney, B. W.; Bond, H. E.
1986-01-01
Blue stragglers are stars, found usually in either open or globular clusters, that appear to lie on the main sequence, but are brighter and bluer than the cluster turn-off. Currently, two rival models are invoked to explain this apparently pathological behavior: internal mixing (so that fresh fuel is brought into the stellar core); and mass transfer (by which a normal main sequence star acquires mass from an evolving nearby companion and so moves up the main sequence). The latter model predicts that in the absence of complete mass transfer (i.e., coalescence), blue stragglers should be binary systems with the fainter star in a post-main sequence evolutionary state. It is important to ascertain the cause of this phenomenon since stellar evolution models of main sequence stars play such a vital role in astronomy. If mass transfer is involved, one may easily exclude binaries from age determinations of clusters, but if mixing is the cause, our age determinations will be much less accurate unless we can determine whether all stars or only some mix, and what causes the mixing to occur at all.
The Role Of Rejuvenation In Shaping The High-Mass End Of The Main Sequence
NASA Astrophysics Data System (ADS)
Mancini, Chiara
2017-06-01
We investigate the nature of star forming galaxies with reduced specific SFRs and high stellar masses, those that seemingly cause the so-called bending of the main sequence. The fact that such objects host large bulges recently led some to suggest that the internal formation of the bulges, via compaction or disk instabilities, was the late event that induced sSFRs of massive galaxies to drop in a slow downfall and thus the main sequence to bend. We have studied in detail a sample of 16 galaxies at 0.5
Roca, Alberto I
2014-01-01
The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.
Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly, which included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
NASA Astrophysics Data System (ADS)
Hiebert, R. S.; Bekker, A.; Houlé, M. G.; Wing, B. A.; Rouxel, O. J.
2016-10-01
Assimilation by mafic to ultramafic magmas of sulfur-bearing country rocks is considered an important contributing factor to reach sulfide saturation and form magmatic Ni-Cu-platinum group element (PGE) sulfide deposits. Sulfur-bearing sedimentary rocks in the Archean are generally characterized by mass-independent fractionation of sulfur isotopes that is a result of atmospheric photochemical reactions, which produce isotopically distinct pools of sulfur. Likewise, low-temperature processing of iron, through biological and abiotic redox cycling, produces a range of Fe isotope values in Archean sedimentary rocks that is distinct from the range of the mantle and magmatic Fe isotope values. Both of these signals can be used to identify potential country rock assimilants and their contribution to magmatic sulfide deposits. We use multiple S and Fe isotopes to characterize the composition of the potential iron and sulfur sources for the sulfide liquids that formed the Hart deposit in the Shaw Dome area within the Abitibi greenstone belt in Ontario (Canada). The Hart deposit is composed of two zones with komatiite-associated Ni-Cu-PGE mineralization; the main zone consists of a massive sulfide deposit at the base of the basal flow in the komatiite sequence, whereas the eastern extension consists of a semi-massive sulfide zone located 12 to 25 m above the base of the second flow in the komatiite sequence. Low δ56Fe values and non-zero δ34S and Δ33S values of the komatiitic rocks and associated mineralization at the Hart deposit are best explained by mixing and isotope exchange with crustal materials, such as exhalite and graphitic argillite, rather than intrinsic fractionation within the komatiite. This approach allows tracing the extent of crustal contamination away from the deposit and the degree of mixing between the sulfide and komatiite melts.
The exhalite and graphitic argillite were the dominant contaminants for the main zone of mineralization and the eastern extension zone of the Hart deposit, respectively. Critically, the extent of contamination, as revealed by multiple S and Fe isotope systematics, is greatest within the deposit and decreases away from it within the komatiite flow. This pattern points to a local source of crustal contamination for the mantle-derived komatiitic melt and a low degree of homogenization between the mineralization and the surrounding lava flow. Coupled S and Fe isotope patterns like those identified at the Hart deposit may provide a useful tool for assessing the potential of a komatiitic sequence to host Ni-Cu-(PGE).
Koo, Eung Seo; Kim, Man Su; Choi, Yong Seon; Park, Kwon-Sam; Jeong, Yong Seok
2017-01-01
Human norovirus (HNoV), a positive-sense RNA virus, is the main causative agent of acute viral gastroenteritis. Multiple pandemic variants of the genogroup II genotype 4 (GII.4) of NoV have attracted great attention from researchers worldwide. However, novel variants of GII.17 have been overtaking those pandemic variants in some areas of East Asia. To investigate the environmental occurrence of GII in South Korea, we collected water samples from coastal streams and a neighboring waste water treatment plant in North Jeolla province (in March, July, and December of 2015). Based on capsid gene region C analysis, four different genotypes (GII.4, GII.13, GII.17, and GII.21) were detected, with much higher prevalence of GII.17 than of GII.4. Additional sequence analyses of the ORF1-ORF2 junction and ORF2 from the water samples revealed that the GII.17 sequences in this study were closely related to the novel strains of GII.P17-GII.17, the main causative variants of the 2014–2015 HNoV outbreak in China and Japan. In addition, the GII.P21-GII.21 variants were identified in this study and they had new amino acid sequence variations in the blockade epitopes of the P2 domain. From these results, we present two important findings: 1) the novel GII.P17-GII.17 variants appeared to be predominant in the study area, and 2) new GII.21 variants have emerged in South Korea. PMID:28199388
Demographics of Star-forming Galaxies since z ∼ 2.5. I. The UVJ Diagram in CANDELS
NASA Astrophysics Data System (ADS)
Fang, Jerome J.; Faber, S. M.; Koo, David C.; Rodríguez-Puebla, Aldo; Guo, Yicheng; Barro, Guillermo; Behroozi, Peter; Brammer, Gabriel; Chen, Zhu; Dekel, Avishai; Ferguson, Henry C.; Gawiser, Eric; Giavalisco, Mauro; Kartaltepe, Jeyhan; Kocevski, Dale D.; Koekemoer, Anton M.; McGrath, Elizabeth J.; McIntosh, Daniel; Newman, Jeffrey A.; Pacifici, Camilla; Pandya, Viraj; Pérez-González, Pablo G.; Primack, Joel R.; Salmon, Brett; Trump, Jonathan R.; Weiner, Benjamin; Willner, S. P.; Acquaviva, Viviana; Dahlen, Tomas; Finkelstein, Steven L.; Finlator, Kristian; Fontana, Adriano; Galametz, Audrey; Grogin, Norman A.; Gruetzbauch, Ruth; Johnson, Seth; Mobasher, Bahram; Papovich, Casey J.; Pforr, Janine; Salvato, Mara; Santini, P.; van der Wel, Arjen; Wiklind, Tommy; Wuyts, Stijn
2018-05-01
This is the first in a series of papers examining the demographics of star-forming (SF) galaxies at 0.2 < z < 2.5 in CANDELS. We study 9100 galaxies from GOODS-S and UDS, having published values of redshifts, masses, star formation rates (SFRs), and dust attenuation (A_V) derived from UV–optical spectral energy distribution fitting. In agreement with previous works, we find that the UVJ colors of a galaxy are closely correlated with its specific star formation rate (SSFR) and A_V. We define rotated UVJ coordinate axes, termed S_SED and C_SED, that are parallel and perpendicular to the SF sequence and derive a quantitative calibration that predicts SSFR from C_SED with an accuracy of ∼0.2 dex. SFRs from UV–optical fitting and from UV+IR values based on Spitzer/MIPS 24 μm agree well overall, but systematic differences of order 0.2 dex exist at high and low redshifts. A novel plotting scheme conveys the evolution of multiple galaxy properties simultaneously, and dust growth, as well as star formation decline and quenching, exhibit “mass-accelerated evolution” (“downsizing”). A population of transition galaxies below the SF main sequence is identified. These objects are located between SF and quiescent galaxies in UVJ space, and have lower A_V and smaller radii than galaxies on the main sequence. Their properties are consistent with their being in transit between the two regions. The relative numbers of quenched, transition, and SF galaxies are given as a function of mass and redshift.
Delling, Bo; Palm, Stefan; Palkopoulou, Eleftheria; Prestegaard, Tore
2014-01-01
Presence of sympatric populations may reflect local diversification or secondary contact of already distinct forms. The Baltic cisco (Coregonus albula) normally spawns in late autumn, but in a few lakes in Northern Europe sympatric autumn and spring- or winter-spawners have been described. So far, the evolutionary relationships and taxonomic status of these main life history forms have remained largely unclear. With microsatellites and mtDNA sequences, we analyzed extant and extinct spring- and autumn-spawners from a total of 23 Swedish localities, including sympatric populations. Published sequences from Baltic ciscoes in Germany and Finland, and Coregonus sardinella from North America were also included together with novel mtDNA sequences from Siberian C. sardinella. A clear genetic structure within Sweden was found that included two population assemblages markedly differentiated at microsatellites and apparently fixed for mtDNA haplotypes from two distinct clades. All sympatric Swedish populations belonged to the same assemblage, suggesting parallel evolution of spring-spawning rather than secondary contact. The pattern observed further suggests that postglacial immigration to Northern Europe occurred from at least two different refugia. Previous results showing that mtDNA in Baltic cisco is paraphyletic with respect to North American C. sardinella were confirmed. However, the inclusion of Siberian C. sardinella revealed a more complicated pattern, as these novel haplotypes were found within one of the two main C. albula clades and were clearly distinct from those in North American C. sardinella. The evolutionary history of Northern Hemisphere ciscoes thus seems to be more complex than previously recognized. PMID:25540695
Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements
Tharakaraman, Kannan; Mariño-Ramírez, Leonardo; Sheetlin, Sergey L; Landsman, David; Spouge, John L
2006-01-01
Background: Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set. Results: We describe an improvement to the A-GLAM computer program, which predicts regulatory elements within DNA sequences with Gibbs sampling. The improvement adds an optional "scanning step" after Gibbs sampling. Gibbs sampling produces a position specific scoring matrix (PSSM). The new scanning step resembles an iterative PSI-BLAST search based on the PSSM. First, it assigns an "individual score" to each subsequence of appropriate length within the input sequences using the initial PSSM. Second, it computes an E-value from each individual score, to assess the agreement between the corresponding subsequence and the PSSM. Third, it permits subsequences with E-values falling below a threshold to contribute to the underlying PSSM, which is then updated using the Bayesian calculus. A-GLAM iterates its scanning step to convergence, at which point no new subsequences contribute to the PSSM. After convergence, A-GLAM reports predicted regulatory elements within each sequence in order of increasing E-values, so users have a statistical evaluation of the predicted elements in a convenient presentation. Thus, although the Gibbs sampling step in A-GLAM finds at most one regulatory element per input sequence, the scanning step can now rapidly locate further instances of the element in each sequence. Conclusion: Datasets from experiments determining the binding sites of transcription factors were used to evaluate the improvement to A-GLAM. Typically, the datasets included several sequences containing multiple instances of a regulatory motif. The improvements to A-GLAM permitted it to predict the multiple instances. PMID:16961919
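The iterate-to-convergence scanning loop described above can be sketched roughly as follows. This is not A-GLAM's code: a plain log-odds score threshold stands in for the E-value test, simple pseudocounts stand in for the Bayesian update, and every function name, parameter, and sequence here is a hypothetical chosen for illustration.

```python
import math


def build_pssm(sites, alphabet="ACGT", pseudocount=1.0, background=0.25):
    """Log-odds PSSM from equal-length sites; pseudocounts are an assumed stand-in for a prior."""
    width = len(sites[0])
    pssm = []
    for col in range(width):
        counts = {a: pseudocount for a in alphabet}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pssm.append({a: math.log2(counts[a] / total / background) for a in alphabet})
    return pssm


def score(pssm, word):
    """Individual score of one subsequence: sum of per-column log-odds."""
    return sum(column[ch] for column, ch in zip(pssm, word))


def scan_to_convergence(sequences, seed_sites, threshold, max_rounds=50):
    """Rescan until no new subsequence passes the threshold.

    Returns sorted (sequence_index, start) pairs of accepted subsequences.
    """
    width = len(seed_sites[0])
    accepted = set()
    sites = list(seed_sites)
    for _ in range(max_rounds):
        pssm = build_pssm(sites)
        hits = {
            (i, j)
            for i, seq in enumerate(sequences)
            for j in range(len(seq) - width + 1)
            if score(pssm, seq[j:j + width]) >= threshold
        }
        if hits <= accepted:  # converged: nothing new contributes to the PSSM
            break
        accepted |= hits
        # Accepted subsequences now contribute to the updated matrix.
        sites = list(seed_sites) + [sequences[i][j:j + width] for i, j in sorted(accepted)]
    return sorted(accepted)


# Hypothetical input: the first sequence carries two copies of the motif.
found = scan_to_convergence(
    ["TTGACATTTTGACA", "CCCCCCCC"],
    seed_sites=["TGACA", "TGACA", "TGACT"],
    threshold=5.0,
)
```

In this toy input both copies of the motif in the first sequence are accepted in the first pass, and the second pass adds nothing, so the loop stops.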
Dynamic modeling of normal faults of the 2016 Central Italy earthquake sequence
NASA Astrophysics Data System (ADS)
Aochi, Hideo
2017-04-01
The 2016 Central Italy earthquake sequence is characterized mainly by the Mw 6.0 event of 24 August, the Mw 5.9 event of 26 October, and the Mw 6.4 event of 30 October, as well as two Mw 5.4 earthquakes (24 August and 26 October) (INGV catalogue). They all show normal faulting mechanisms consistent with the tectonics of the Apennines. They are aligned roughly along a NNW-SSE axis, and they may not lie on a single continuous fault plane. Therefore, dynamic rupture modeling of the sequence should be carried out assuming multiple co-planar normal-fault segments. We apply a Boundary Domain Method (BDM; Goto and Bielak, GJI, 2008) that couples a boundary integral equation method with a domain-based method, namely a finite difference method in this study. The Mw 6.0 24 August earthquake is modeled. We use the basic information on hypocenter position, focal mechanism, and potential rupture dimension from the INGV catalogue and from Tinti et al. (GRL, 2016), and begin with a simple condition (homogeneous boundary condition). Our preliminary simulations show that a uniformly extended rupture model does not fit the near-field ground motions and that localized heterogeneity would be required.
Characterization of Proteoforms with Unknown Post-translational Modifications Using the MIScore
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kou, Qiang; Zhu, Binhai; Wu, Si
Various proteoforms may be generated from a single gene due to primary structure alterations (PSAs) such as genetic variations, alternative splicing, and post-translational modifications (PTMs). Top-down mass spectrometry is capable of analyzing intact proteins and identifying patterns of multiple PSAs, making it the method of choice for studying complex proteoforms. In top-down proteomics, proteoform identification is often performed by searching tandem mass spectra against a protein sequence database that contains only one reference protein sequence for each gene or transcript variant in a proteome. Because of the incompleteness of the protein database, an identified proteoform may contain unknown PSAs compared with the reference sequence. Proteoform characterization is to identify and localize PSAs in a proteoform. Although many software tools have been proposed for proteoform identification by top-down mass spectrometry, the characterization of proteoforms in identified proteoform-spectrum matches still relies mainly on manual annotation. We propose to use the Modification Identification Score (MIScore), which is based on Bayesian models, to automatically identify and localize PTMs in proteoforms. Experiments showed that the MIScore is accurate in identifying and localizing one or two modifications.
Hao, Pei; Zheng, Huajun; Yu, Yao; Ding, Guohui; Gu, Wenyi; Chen, Shuting; Yu, Zhonghao; Ren, Shuangxi; Oda, Munehiro; Konno, Tomonobu; Wang, Shengyue; Li, Xuan; Ji, Zai-Si; Zhao, Guoping
2011-01-17
Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus) is an important species of Lactic Acid Bacteria (LAB) used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermediate metabolism, unique EPS synthesis and efficient DNA restriction/modification systems, are all different from those of the collection strains that might benefit the industrial production of yogurt. Other common features shared by Lb. bulgaricus strains, such as efficient protocooperation with Streptococcus thermophilus and lactate production as well as well-equipped stress tolerance mechanisms may account for it being selected originally for yogurt fermentation industry. Multiple lines of evidence suggested that Lb. bulgaricus 2038 was genetically closer to the common ancestor of the subspecies than the other two sequenced collection strains, probably due to a strict industrial maintenance process for strain 2038 that might have halted its genome decay and sustained a gene network suitable for large scale yogurt production. PMID:21264216
Subbotin, S A; Vierstraete, A; De Ley, P; Rowe, J; Waeyenberge, L; Moens, M; Vanfleteren, J R
2001-10-01
The ITS1, ITS2, and 5.8S gene sequences of nuclear ribosomal DNA from 40 taxa of the family Heteroderidae (including the genera Afenestrata, Cactodera, Heterodera, Globodera, Punctodera, Meloidodera, Cryphodera, and Thecavermiculatus) were sequenced and analyzed. The ITS regions displayed high levels of sequence divergence within Heteroderinae and compared to outgroup taxa. Unlike recent findings in root knot nematodes, ITS sequence polymorphism does not appear to complicate phylogenetic analysis of cyst nematodes. Phylogenetic analyses with maximum-parsimony, minimum-evolution, and maximum-likelihood methods were performed with a range of computer alignments, including elision and culled alignments. All multiple alignments and phylogenetic methods yielded similar basic structure for phylogenetic relationships of Heteroderidae. The cyst-forming nematodes are represented by six main clades corresponding to morphological characters and host specialization, with certain clades assuming different positions depending on alignment procedure and/or method of phylogenetic inference. Hypotheses of monophyly of Punctoderinae and Heteroderinae are, respectively, strongly and moderately supported by the ITS data across most alignments. Close relationships were revealed between the Avenae and the Sacchari groups and between the Humuli group and the species H. salixophila within Heteroderinae. The Goettingiana group occupies a basal position within this subfamily. The validity of the genera Afenestrata and Bidera was tested and is discussed based on molecular data. We conclude that ITS sequence data are appropriate for studies of relationships within the different species groups and less so for recovery of more ancient speciations within Heteroderidae. Copyright 2001 Academic Press.
No Metallicity Correlation Associated with the Kepler Dichotomy
NASA Astrophysics Data System (ADS)
Munoz Romero, Carlos Eduardo; Kempton, Eliza
2018-01-01
NASA’s Kepler mission has discovered thousands of planetary systems, ∼ 20% of which are found to host multiple transiting planets. This relative paucity (compared to the high fraction of single transiting systems) is postulated to result from a distinction in the architecture between multi-transiting systems and those hosting a single transiting planet: a phenomenon usually referred to as the Kepler dichotomy. We investigate the hypothesis that external giant planets are the main cause behind the over-abundance of single- relative to multi-transiting systems, which would be signaled by higher metallicities in the former sample. To this end, we perform a statistical analysis on the stellar metallicity distribution with respect to planet multiplicity in the Kepler data. We perform our analysis on a variety of samples taken from a population of 1062 Kepler main sequence planetary hosts, using precisely determined metallicities from the California-Kepler survey. Contrary to some predictions, we do not find a significant difference between the stellar metallicities of the single- and multiple-transiting planet systems. However, we do find a 43% upper bound for systems with a single non-giant planet that could also host a hidden giant planet, based on metallicity considerations. While the presence of external giant planets might be one factor behind the Kepler dichotomy, our results also favor alternative explanations. We suggest that additional radial velocity and direct imaging measurements are necessary to constrain the presence of gas giants in systems with a single transiting planet.
The Star-forming Main Sequence of Dwarf Low Surface Brightness Galaxies
NASA Astrophysics Data System (ADS)
McGaugh, Stacy S.; Schombert, James M.; Lelli, Federico
2017-12-01
We explore the star-forming properties of late-type, low surface brightness (LSB) galaxies. The star-forming main sequence (SFR-M*) of LSB dwarfs has a steep slope, indistinguishable from unity (1.04 ± 0.06). They form a distinct sequence from more massive spirals, which exhibit a shallower slope. The break occurs around M* ≈ 10^10 M⊙, and can also be seen in the gas mass-stellar mass plane. The global Kennicutt-Schmidt law (SFR-Mg) has a slope of 1.47 ± 0.11 without the break seen in the main sequence. There is an ample supply of gas in LSB galaxies, which have gas depletion times well in excess of a Hubble time, and often tens of Hubble times. Only ∼3% of this cold gas needs be in the form of molecular gas to sustain the observed star formation. In analogy with the faint, long-lived stars of the lower stellar main sequence, it may be appropriate to consider the main sequence of star-forming galaxies to be defined by thriving dwarfs (with M* < 10^10 M⊙), while massive spirals (with M* > 10^10 M⊙) are weary giants that constitute more of a turn-off population.
Hybrid spread spectrum radio system
Smith, Stephen F [London, TN]; Dress, William B [Camas, WA]
2010-02-09
Systems and methods are described for hybrid spread spectrum radio systems. A method includes receiving a hybrid spread spectrum signal, including fast frequency hopping demodulating and direct sequence demodulating a direct sequence spread spectrum signal, wherein multiple frequency hops occur within a single data-bit time and each bit is represented by chip transmissions at multiple frequencies.
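A toy, noise-free sketch may help picture how one bit can be carried by chips sent on several hop frequencies. It works purely at the symbol level: each bit is multiplied by a ±1 chip code and each chip is tagged with a hop-frequency index. The chip code, hop pattern, and function names are illustrative assumptions and are not taken from the patent.

```python
def hybrid_spread(bits, code, hop_pattern):
    """Spread each bit over the chip code, tagging every chip with a hop frequency.

    Bit 1 maps to +1 and bit 0 to -1; the hop pattern repeats within each bit,
    so several frequency hops occur inside a single bit time.
    """
    tx = []
    for bit in bits:
        symbol = 1 if bit else -1
        for k, chip in enumerate(code):
            tx.append((hop_pattern[k % len(hop_pattern)], symbol * chip))
    return tx


def hybrid_despread(rx, code, hop_pattern):
    """Dehop, then correlate with the chip code to recover each bit."""
    bits = []
    n = len(code)
    for start in range(0, len(rx), n):
        correlation = 0
        for k, (freq, chip) in enumerate(rx[start:start + n]):
            # Dehopping: only chips on the expected frequency are counted.
            if freq == hop_pattern[k % len(hop_pattern)]:
                correlation += chip * code[k]
        bits.append(1 if correlation > 0 else 0)
    return bits


# Hypothetical 7-chip code and 4-frequency hop pattern.
code = [1, -1, 1, 1, -1, 1, -1]
hop_pattern = [3, 0, 2, 1]
data = [1, 0, 1, 1]

signal = hybrid_spread(data, code, hop_pattern)
recovered = hybrid_despread(signal, code, hop_pattern)
frequencies_in_first_bit = {freq for freq, _ in signal[:len(code)]}
```

A receiver that does not know the hop pattern discards most chips in this model, which is the intuition behind combining the two spreading methods.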
USDA-ARS's Scientific Manuscript database
Little is known about genetic variation of Lymantria dispar multiple nucleopolyhedrovirus (LdMNPV; Baculoviridae: Alphabaculovirus) at the nucleotide sequence level. To obtain a more comprehensive view of genetic diversity among isolates of LdMNPV, partial sequences of the lef-8 gene were generated...
Draft genome of the living fossil Ginkgo biloba.
Guan, Rui; Zhao, Yunpeng; Zhang, He; Fan, Guangyi; Liu, Xin; Zhou, Wenbin; Shi, Chengcheng; Wang, Jiahao; Liu, Weiqing; Liang, Xinming; Fu, Yuanyuan; Ma, Kailong; Zhao, Lijun; Zhang, Fumin; Lu, Zuhong; Lee, Simon Ming-Yuen; Xu, Xun; Wang, Jian; Yang, Huanming; Fu, Chengxin; Ge, Song; Chen, Wenbin
2016-11-21
Ginkgo biloba L. (Ginkgoaceae) is one of the most distinctive plants. It possesses a suite of fascinating characteristics including a large genome, outstanding resistance/tolerance to abiotic and biotic stresses, and dioecious reproduction, making it an ideal model species for biological studies. However, the lack of a high-quality genome sequence has been an impediment to our understanding of its biology and evolution. The 10.61 Gb genome sequence containing 41,840 annotated genes was assembled in the present study. Repetitive sequences account for 76.58% of the assembled sequence, and long terminal repeat retrotransposons (LTR-RTs) are particularly prevalent. The diversity and abundance of LTR-RTs is due to their gradual accumulation and a remarkable amplification between 16 and 24 million years ago, and they contribute to the long introns and large genome. Whole genome duplication (WGD) may have occurred twice, with an ancient WGD consistent with that shown to occur in other seed plants, and a more recent event specific to ginkgo. Abundant gene clusters from tandem duplication were also evident, and enrichment of expanded gene families indicates a remarkable array of chemical and antibacterial defense pathways. The ginkgo genome consists mainly of LTR-RTs resulting from ancient gradual accumulation and two WGD events. The multiple defense mechanisms underlying the characteristic resilience of ginkgo are fostered by a remarkable enrichment in ancient duplicated and ginkgo-specific gene clusters. The present study sheds light on sequencing large genomes, and opens an avenue for further genetic and evolutionary research.
Prediction of β-turns in proteins from multiple alignment using neural network
Kaur, Harpreet; Raghava, Gajendra Pal Singh
2003-01-01
A neural network-based method has been developed for the prediction of β-turns in proteins by using multiple sequence alignment. Two feed-forward back-propagation networks with a single hidden layer are used, where the first sequence-structure network is trained with the multiple sequence alignment in the form of PSI-BLAST–generated position-specific scoring matrices. The initial predictions from the first network and PSIPRED-predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. A significant improvement in prediction accuracy has been achieved by using evolutionary information contained in the multiple sequence alignment. The final network yields an overall prediction accuracy of 75.5% when tested by sevenfold cross-validation on a set of 426 nonhomologous protein chains. The corresponding Qpred, Qobs, and Matthews correlation coefficient values are 49.8%, 72.3%, and 0.43, respectively, and are the best among all the previously published β-turn prediction methods. The Web server BetaTPred2 (http://www.imtech.res.in/raghava/betatpred2/) has been developed based on this approach. PMID:12592033
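For readers unfamiliar with the quality measures quoted above, the sketch below computes them from per-residue confusion counts. It assumes the definitions commonly used for β-turn prediction: overall accuracy as the percentage of correctly classified residues, Qpred as the percentage of predicted turns that are correct, Qobs as the percentage of observed turns that are predicted, and the standard Matthews correlation coefficient. The counts in the example are made up and do not reproduce the paper's results.

```python
import math


def turn_prediction_metrics(tp, tn, fp, fn):
    """Return (overall accuracy %, Qpred %, Qobs %, MCC) from confusion counts.

    tp/fn count observed turn residues predicted as turn/non-turn;
    tn/fp count observed non-turn residues predicted as non-turn/turn.
    """
    total = tp + tn + fp + fn
    q_total = 100.0 * (tp + tn) / total
    q_pred = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    q_obs = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return q_total, q_pred, q_obs, mcc


# Made-up counts for illustration only.
q_total, q_pred, q_obs, mcc = turn_prediction_metrics(tp=50, tn=30, fp=10, fn=10)
```

Because turn residues are a minority class, the correlation coefficient is usually a more informative summary than overall accuracy alone.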
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gomez de Castro, Ana Ines; Lopez-Santiago, Javier; Talavera, Antonio
2013-03-20
AK Sco stands out among pre-main-sequence binaries because of its prominent ultraviolet excess, the high eccentricity of its orbit, and the strong tides driven by it. AK Sco consists of two F5-type stars that get as close as 11 R_* at periastron passage. The presence of a dense (n_e ≈ 10^11 cm^-3) extended envelope has been unveiled recently. In this article, we report the results from an XMM-Newton-based monitoring of the system. We show that at periastron, X-ray and UV fluxes are enhanced by a factor of ≈3 with respect to the apastron values. The X-ray radiation is produced in an optically thin plasma with T ≈ 6.4 × 10^6 K and it is found that the N_H column density rises from 0.35 × 10^21 cm^-2 at periastron to 1.11 × 10^21 cm^-2 at apastron, in good agreement with previous polarimetric observations. The UV emission detected in the Optical Monitor band seems to be caused by the reprocessing of the high-energy magnetospheric radiation on the circumstellar material. Further evidence of the strong magnetospheric disturbances is provided by the detection of line broadening of 278.7 km s^-1 in the N V line with Hubble Space Telescope/Space Telescope Imaging Spectrograph. Numerical simulations of the mass flow from the circumbinary disk to the components have been carried out. They provide a consistent scenario with which to interpret AK Sco observations. We show that the eccentric orbit acts like a gravitational piston. At apastron, matter is dragged efficiently from the inner disk border, filling the inner gap and producing accretion streams that end as ring-like structures around each component of the system. At periastron, the ring-like structures come into contact, leading to angular momentum loss, and thus producing an accretion outburst.
Determination of dipole coupling constants using heteronuclear multiple quantum NMR
NASA Astrophysics Data System (ADS)
Weitekamp, D. P.; Garbow, J. R.; Pines, A.
1982-09-01
The problem of extracting dipole couplings from a system of N spins I = 1/2 and one spin S by NMR techniques is analyzed. The resolution attainable using a variety of single quantum methods is reviewed. The theory of heteronuclear multiple quantum (HMQ) NMR is developed, with particular emphasis being placed on the superior resolution available in HMQ spectra. Several novel pulse sequences are introduced, including a two-step method for the excitation of HMQ coherence. Experiments on partially oriented [1-13C] benzene demonstrate the excitation of the necessary HMQ coherence and illustrate the calculation of relative line intensities. Spectra of high order HMQ coherence under several different effective Hamiltonians achievable by multiple pulse sequences are discussed. A new effective Hamiltonian, scalar heteronuclear recoupled interactions by multiple pulse (SHRIMP), achieved by the simultaneous irradiation of both spin species with the same multiple pulse sequence, is introduced. Experiments are described which allow heteronuclear couplings to be correlated with an S-spin spreading parameter in spectra free of inhomogeneous broadening.
Sequence harmony: detecting functional specificity from alignments
Feenstra, K. Anton; Pirovano, Walter; Krab, Klaas; Heringa, Jaap
2007-01-01
Multiple sequence alignments are often used for the identification of key specificity-determining residues within protein families. We present a web server implementation of the Sequence Harmony (SH) method previously introduced. SH accurately detects subfamily specific positions from a multiple alignment by scoring compositional differences between subfamilies, without imposing conservation. The SH web server allows a quick selection of subtype specific sites from a multiple alignment given a subfamily grouping. In addition, it allows the predicted sites to be directly mapped onto a protein structure and displayed. We demonstrate the use of the SH server using the family of plant mitochondrial alternative oxidases (AOX). In addition, we illustrate the usefulness of combining sequence and structural information by showing that the predicted sites are clustered into a few distinct regions in an AOX homology model. The SH web server can be accessed at www.ibi.vu.nl/programs/seqharmwww. PMID:17584793
Solving the problem of comparing whole bacterial genomes across different sequencing platforms.
Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole
2014-01-01
Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far, studies have focused on using one technology because each technology has a systematic bias, making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. We conclude that the success of these new procedures is due to validation of all informative sites that are included in the analysis. The procedures are available as web tools.
Kordes, Sebastian; Kössl, Manfred
2017-01-01
For the purpose of orientation, echolocating bats emit highly repetitive and spatially directed sonar calls. Echoes arising from call reflections are used to create an acoustic image of the environment. The inferior colliculus (IC) represents an important auditory stage for initial processing of echolocation signals. The present study addresses the following questions: (1) how does the temporal context of an echolocation sequence mimicking an approach flight of an animal affect neuronal processing of distance information to echo delays? (2) how does the IC process complex echolocation sequences containing echo information from multiple objects (multiobject sequence)? Here, we conducted neurophysiological recordings from the IC of ketamine-anaesthetized bats of the species Carollia perspicillata and compared the results from the IC with the ones from the auditory cortex (AC). Neuronal responses to an echolocation sequence were suppressed when compared to the responses to temporally isolated and randomized segments of the sequence. The neuronal suppression was weaker in the IC than in the AC. In contrast to the cortex, the time course of the acoustic events is reflected by IC activity. In the IC, suppression sharpens the neuronal tuning to specific call-echo elements and increases the signal-to-noise ratio in the units’ responses. When presenting multiple-object sequences, despite collicular suppression, the neurons responded to each object-specific echo. The latter allows parallel processing of multiple echolocation streams at the IC level. Altogether, our data suggest that temporally precise neuronal responses in the IC could allow fast and parallel processing of multiple acoustic streams. PMID:29242823
Beetz, M Jerome; Kordes, Sebastian; García-Rosales, Francisco; Kössl, Manfred; Hechavarría, Julio C
2017-01-01
For the purpose of orientation, echolocating bats emit highly repetitive and spatially directed sonar calls. Echoes arising from call reflections are used to create an acoustic image of the environment. The inferior colliculus (IC) represents an important auditory stage for initial processing of echolocation signals. The present study addresses the following questions: (1) how does the temporal context of an echolocation sequence mimicking an approach flight of an animal affect neuronal processing of distance information to echo delays? (2) how does the IC process complex echolocation sequences containing echo information from multiple objects (multiobject sequence)? Here, we conducted neurophysiological recordings from the IC of ketamine-anaesthetized bats of the species Carollia perspicillata and compared the results from the IC with the ones from the auditory cortex (AC). Neuronal responses to an echolocation sequence were suppressed when compared to the responses to temporally isolated and randomized segments of the sequence. The neuronal suppression was weaker in the IC than in the AC. In contrast to the cortex, the time course of the acoustic events is reflected by IC activity. In the IC, suppression sharpens the neuronal tuning to specific call-echo elements and increases the signal-to-noise ratio in the units' responses. When presenting multiple-object sequences, despite collicular suppression, the neurons responded to each object-specific echo. The latter allows parallel processing of multiple echolocation streams at the IC level. Altogether, our data suggest that temporally precise neuronal responses in the IC could allow fast and parallel processing of multiple acoustic streams.
Evolutionary distances in the twilight zone--a rational kernel approach.
Schwarz, Roland F; Fletcher, William; Förster, Frank; Merget, Benjamin; Wolf, Matthias; Schultz, Jörg; Markowetz, Florian
2010-12-31
Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.
DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.
Kelly, Steven; Maini, Philip K
2013-01-01
The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.
Photometric binary stars in Praesepe and the search for globular cluster binaries
NASA Technical Reports Server (NTRS)
Bolte, Michael
1991-01-01
A radial velocity study of the stars which are located on a second sequence above the single-star zero-age main sequence at a given color in the color-magnitude diagram of the open cluster Praesepe (NGC 2632) shows that 10, and possibly 11, of 17 are binary systems. Of the binary systems, five have full amplitudes for their velocity variations that are greater than 50 km/s. To the extent that they can be applied to globular clusters, these results suggest that (1) observations of 'second-sequence' stars in globular clusters would be an efficient way of finding main-sequence binary systems in globulars, and (2) current instrumentation on large telescopes is sufficient for establishing unambiguously the existence of main-sequence binary systems in nearby globular clusters.
Stellar Variability at the Main-sequence Turnoff of the Intermediate-age LMC Cluster NGC 1846
NASA Astrophysics Data System (ADS)
Salinas, R.; Pajkos, M. A.; Vivas, A. K.; Strader, J.; Contreras Ramos, R.
2018-04-01
Intermediate-age (IA) star clusters in the Large Magellanic Cloud (LMC) present extended main-sequence turn-offs (MSTO) that have been attributed to either multiple stellar populations or an effect of stellar rotation. Recently it has been proposed that these extended main sequences can also be produced by ill-characterized stellar variability. Here we present Gemini-S/Gemini Multi-Object Spectrometer (GMOS) time series observations of the IA cluster NGC 1846. Using differential image analysis, we identified 73 new variable stars, with 55 of those being of the Delta Scuti type, that is, pulsating variables close to the MSTO for the cluster age. Considering completeness and background contamination effects, we estimate the number of δ Sct belonging to the cluster between 40 and 60 members, although this number is based on the detection of a single δ Sct within the cluster half-light radius. This amount of variable stars at the MSTO level will not produce significant broadening of the MSTO, albeit higher-resolution imaging will be needed to rule out variable stars as a major contributor to the extended MSTO phenomenon. Though modest, this amount of δ Sct makes NGC 1846 the star cluster with the highest number of these variables ever discovered. Lastly, our results present a cautionary tale about the adequacy of shallow variability surveys in the LMC (like OGLE) to derive properties of its δ Sct population. Based on observations obtained at the Gemini Observatory, which is operated by the Association of Universities for Research in Astronomy, Inc., under a cooperative agreement with the NSF on behalf of the Gemini partnership: the National Science Foundation (United States), the National Research Council (Canada), CONICYT (Chile), Ministerio de Ciencia, Tecnología e Innovación Productiva (Argentina), and Ministério da Ciência, Tecnologia e Inovação (Brazil).
Mazuet, Christelle; Legeay, Christine; Sautereau, Jean; Ma, Laurence; Bouchier, Christiane; Bouvet, Philippe; Popoff, Michel R
2016-06-13
In France, human botulism is mainly food-borne intoxication, whereas infant botulism is rare. A total of 99 group I and II Clostridium botulinum strains including 59 type A (12 historical isolates [1947-1961], 43 from France [1986-2013], 3 from other countries, and 1 collection strain), 31 type B (3 historical, 23 recent isolates, 4 from other countries, and 1 collection strain), and 9 type E (5 historical, 3 isolates, and 1 collection strain) were investigated by botulinum locus gene sequencing and multilocus sequence typing analysis. Historical C. botulinum A strains mainly belonged to subtype A1 and sequence type (ST) 1, whereas recent strains exhibited a wide genetic diversity: subtype A1 in orfX or ha locus, A1(B), A1(F), A2, A2b2, A5(B2'), A5(B3'), as well as the recently identified A7 and A8 subtypes, and were distributed into 25 STs. Clostridium botulinum A1(B) was the most frequent subtype from food-borne botulism and food. Group I C. botulinum type B in France were mainly subtype B2 (14 out of 20 historical and recent strains) and were divided into 19 STs. Food-borne botulism resulting from ham consumption during the recent period was due to group II C. botulinum B4. Type E botulism is rare in France; 5 historical and 1 recent strains were subtype E3. A subtype E12 was recently identified from an unusual ham contamination. Clostridium botulinum strains from human botulism in France showed a wide genetic diversity and seem to result not from a single evolutionary lineage but from multiple and independent genetic rearrangements. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Fine-scale population structure and the era of next-generation sequencing.
Henn, Brenna M; Gravel, Simon; Moreno-Estrada, Andres; Acevedo-Acevedo, Suehelay; Bustamante, Carlos D
2010-10-15
Fine-scale population structure characterizes most continents and is especially pronounced in non-cosmopolitan populations. Roughly half of the world's population remains non-cosmopolitan and even populations within cities often assort along ethnic and linguistic categories. Barriers to random mating can be ecologically extreme, such as the Sahara Desert, or cultural, such as the Indian caste system. In either case, subpopulations accumulate genetic differences if the barrier is maintained over multiple generations. Genome-wide polymorphism data, initially with only a few hundred autosomal microsatellites, have clearly established differences in allele frequency not only among continental regions, but also within continents and within countries. We review recent evidence from the analysis of genome-wide polymorphism data for genetic boundaries delineating human population structure and the main demographic and genomic processes shaping variation, and discuss the implications of population structure for the distribution and discovery of disease-causing genetic variants, in the light of the imminent availability of sequencing data for a multitude of diverse human genomes.
ExoLocator--an online view into genetic makeup of vertebrate proteins.
Khoo, Aik Aun; Ogrizek-Tomas, Mario; Bulovic, Ana; Korpar, Matija; Gürler, Ece; Slijepcevic, Ivan; Šikic, Mile; Mihalek, Ivana
2014-01-01
ExoLocator (http://exolocator.eopsf.org) collects in a single place information needed for comparative analysis of protein-coding exons from vertebrate species. The main source of data--the genomic sequences, and the existing exon and homology annotation--is the ENSEMBL database of completed vertebrate genomes. To these, ExoLocator adds the search for ostensibly missing exons in orthologous protein pairs across species, using an extensive computational pipeline to narrow down the search region for the candidate exons and find a suitable template in the other species, as well as state-of-the-art implementations of pairwise alignment algorithms. The resulting complements of exons are organized in a way currently unique to ExoLocator: multiple sequence alignments, both on the nucleotide and on the peptide levels, clearly indicating the exon boundaries. The alignments can be inspected in the web-embedded viewer, downloaded or used on the spot to produce an estimate of conservation within orthologous sets, or functional divergence across paralogues.
DNA Persistence in a Sink Drain Environment
Winder, Eric M.; Bonheyo, George T.
2015-07-31
Biofilms are organized structures composed mainly of cells and extracellular polymeric substances produced by the constituent microorganisms. Ubiquitous in nature, biofilms have an innate ability to capture and retain passing material and may therefore act as natural collectors of contaminants or signatures of upstream activities. To determine the persistence and detectability of DNA passing through a sink drain environment, Bacillus anthracis strain Ames35 was cultured (6.35 × 10^7 CFU/mL), sterilized, and disposed of by addition to a sink drain apparatus with an established biofilm. The sink drain apparatus was sampled before and for several days after the addition of the sterilized B. anthracis culture to detect the presence of B. anthracis DNA. Multiple PCR primer pairs were used to screen for chromosomal and plasmid DNA with primers targeting shorter sequences showing greater amplification efficiency and success. PCR amplification and detection of target sequences indicate persistence of chromosomal DNA and plasmid DNA in the biofilm for 5 or more and 14 or more days, respectively.
Gene Expression Profiling in Fish Toxicology: A Review.
Kumar, Girish; Denslow, Nancy D
In this review, we present an overview of transcriptomic responses to chemical exposures in a variety of fish species. We discuss the use of several molecular approaches such as northern blotting, differential display reverse transcription-polymerase chain reaction (DDRT-PCR), suppression subtractive hybridization (SSH), real time quantitative PCR (RT-qPCR), microarrays, and next-generation sequencing (NGS) for measuring gene expression. These techniques have been mainly used to measure the toxic effects of single compounds or simple mixtures in laboratory conditions. In addition, only a few studies have been conducted to examine the biological significance of differentially expressed gene sets following chemical exposure. Therefore, future studies should focus more on field conditions using a multidisciplinary approach (genomics, proteomics and metabolomics) to understand the synergetic effects of multiple environmental stressors and to determine the functional significance of differentially expressed genes. Nevertheless, recent developments in NGS technologies and decreasing costs of sequencing hold the promise to uncover the complexity of anthropogenic impacts and biological effects in wild fish populations.
DNA Persistence in a Sink Drain Environment
Winder, Eric M.; Bonheyo, George T.
2015-01-01
Biofilms are organized structures composed mainly of cells and extracellular polymeric substances produced by the constituent microorganisms. Ubiquitous in nature, biofilms have an innate ability to capture and retain passing material and may therefore act as natural collectors of contaminants or signatures of upstream activities. To determine the persistence and detectability of DNA passing through a sink drain environment, Bacillus anthracis strain Ames35 was cultured (6.35 × 10^7 CFU/mL), sterilized, and disposed of by addition to a sink drain apparatus with an established biofilm. The sink drain apparatus was sampled before and for several days after the addition of the sterilized B. anthracis culture to detect the presence of B. anthracis DNA. Multiple PCR primer pairs were used to screen for chromosomal and plasmid DNA with primers targeting shorter sequences showing greater amplification efficiency and success. PCR amplification and detection of target sequences indicate persistence of chromosomal DNA and plasmid DNA in the biofilm for 5 or more and 14 or more days, respectively. PMID:26230525
DNA Persistence in a Sink Drain Environment.
Winder, Eric M; Bonheyo, George T
2015-01-01
Biofilms are organized structures composed mainly of cells and extracellular polymeric substances produced by the constituent microorganisms. Ubiquitous in nature, biofilms have an innate ability to capture and retain passing material and may therefore act as natural collectors of contaminants or signatures of upstream activities. To determine the persistence and detectability of DNA passing through a sink drain environment, Bacillus anthracis strain Ames35 was cultured (6.35 × 10^7 CFU/mL), sterilized, and disposed of by addition to a sink drain apparatus with an established biofilm. The sink drain apparatus was sampled before and for several days after the addition of the sterilized B. anthracis culture to detect the presence of B. anthracis DNA. Multiple PCR primer pairs were used to screen for chromosomal and plasmid DNA with primers targeting shorter sequences showing greater amplification efficiency and success. PCR amplification and detection of target sequences indicate persistence of chromosomal DNA and plasmid DNA in the biofilm for 5 or more and 14 or more days, respectively.
Hidden Markov models of biological primary sequence information.
Baldi, P; Chauvin, Y; Hunkapiller, T; McClure, M A
1994-01-01
Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN^2) operations, linear in the number of sequences. PMID:8302831
Gemi: PCR Primers Prediction from Multiple Alignments
Sobhy, Haitham; Colson, Philippe
2012-01-01
Designing primers and probes for polymerase chain reaction (PCR) is a preliminary and critical step that requires the identification of highly conserved regions in a given set of sequences. This task can be challenging if the targeted sequences display a high level of diversity, as frequently encountered in microbiologic studies. We developed Gemi, an automated, fast, and easy-to-use bioinformatics tool with a user-friendly interface to design primers and probes based on multiple aligned sequences. This tool can be used for the purpose of real-time and conventional PCR and can deal efficiently with large sets of sequences of a large size. PMID:23316117
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.
Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy
2015-05-01
We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.
EUGÈNE'HOM: a generic similarity-based gene finder using multiple homologous sequences
Foissac, Sylvain; Bardou, Philippe; Moisan, Annick; Cros, Marie-Josée; Schiex, Thomas
2003-01-01
EUGÈNE'HOM is a gene prediction software for eukaryotic organisms based on comparative analysis. EUGÈNE'HOM is able to take into account multiple homologous sequences from more or less closely related organisms. It integrates the results of TBLASTX analysis, splice site and start codon prediction and a robust coding/non-coding probabilistic model which allows EUGÈNE'HOM to handle sequences from a variety of organisms. The current target of EUGÈNE'HOM is plant sequences. The EUGÈNE'HOM web site is available at http://genopole.toulouse.inra.fr/bioinfo/eugene/EuGeneHom/cgi-bin/EuGeneHom.pl. PMID:12824408
FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation.
Bolleman, Jerven T; Mungall, Christopher J; Strozzi, Francesco; Baran, Joachim; Dumontier, Michel; Bonnal, Raoul J P; Buels, Robert; Hoehndorf, Robert; Fujisawa, Takatomo; Katayama, Toshiaki; Cock, Peter J A
2016-06-13
Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned "omics" areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe - and potentially merge - sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation
Bolleman, Jerven T.; Mungall, Christopher J.; Strozzi, Francesco; ...
2016-06-13
Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. In this paper, we have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe – and potentially merge – sequence annotations from multiple sources. Finally, data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
Fast alignment-free sequence comparison using spaced-word frequencies.
Leimeister, Chris-Andre; Boden, Marcus; Horwege, Sebastian; Lindner, Sebastian; Morgenstern, Burkhard
2014-07-15
Alignment-free methods for sequence comparison are increasingly used for genome analysis and phylogeny reconstruction; they circumvent various difficulties of traditional alignment-based approaches. In particular, alignment-free methods are much faster than pairwise or multiple alignments. They are, however, less accurate than methods based on sequence alignment. Most alignment-free approaches work by comparing the word composition of sequences. A well-known problem with these methods is that neighbouring word matches are far from independent. To reduce the statistical dependency between adjacent word matches, we propose to use 'spaced words', defined by patterns of 'match' and 'don't care' positions, for alignment-free sequence comparison. We describe a fast implementation of this approach using recursive hashing and bit operations, and we show that further improvements can be achieved by using multiple patterns instead of single patterns. To evaluate our approach, we use spaced-word frequencies as a basis for fast phylogeny reconstruction. Using real-world and simulated sequence data, we demonstrate that our multiple-pattern approach produces better phylogenies than approaches relying on contiguous words. Our program is freely available at http://spaced.gobics.de/. © The Author 2014. Published by Oxford University Press.
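The spaced-word idea described in this abstract can be sketched in a few lines. This is a naive illustration only: it does not use the recursive hashing or bit operations the authors mention, the pattern encoding ('1' for a match position, '0' for a don't-care position), the function names, and the Euclidean distance on relative frequencies are all assumptions made for the example.

```python
from collections import Counter
import math


def spaced_word_counts(seq, pattern):
    """Count spaced words in seq for one pattern of '1' (match) / '0' (don't care)."""
    keep = [i for i, flag in enumerate(pattern) if flag == "1"]
    span = len(pattern)
    counts = Counter()
    for start in range(len(seq) - span + 1):
        window = seq[start:start + span]
        # Only the characters at match positions contribute to the word.
        counts["".join(window[i] for i in keep)] += 1
    return counts


def frequency_distance(seq_a, seq_b, patterns):
    """Euclidean distance between relative spaced-word frequencies,
    averaged over several patterns (the multiple-pattern variant)."""
    total = 0.0
    for pattern in patterns:
        a = spaced_word_counts(seq_a, pattern)
        b = spaced_word_counts(seq_b, pattern)
        na, nb = sum(a.values()) or 1, sum(b.values()) or 1
        words = set(a) | set(b)
        total += math.sqrt(sum((a[w] / na - b[w] / nb) ** 2 for w in words))
    return total / len(patterns)


counts = spaced_word_counts("ACGTAC", "101")
dist_same = frequency_distance("ACGTACGT", "ACGTACGT", ["101", "1101"])
```

With the pattern `101`, the windows ACG, CGT, GTA and TAC contribute the words AG, CT, GA and TC; the middle character of each window is ignored, which is what decouples neighbouring word matches.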
An intuitive graphical webserver for multiple-choice protein sequence search.
Banky, Daniel; Szalkai, Balazs; Grolmusz, Vince
2014-04-10
Every day tens of thousands of sequence searches and sequence alignment queries are submitted to webservers. The capitalized word "BLAST" becomes a verb, describing the act of performing sequence search and alignment. However, if one needs to search for sequences that contain, for example, two hydrophobic and three polar residues at five given positions, the query formation on the most frequently used webservers will be difficult. Some servers support the formation of queries with regular expressions, but most of the users are unfamiliar with their syntax. Here we present an intuitive, easily applicable webserver, the Protein Sequence Analysis server, that allows the formation of multiple choice queries by simply drawing the residues to their positions; if more than one residue is drawn to the same position, then they will be nicely stacked on the user interface, indicating the multiple choice at the given position. This computer-game-like interface is natural and intuitive, and the coloring of the residues makes it possible to form queries requiring not just certain amino acids in the given positions, but also small nonpolar, negatively charged, hydrophobic, positively charged, or polar ones. The webserver is available at http://psa.pitgroup.org. Copyright © 2014 Elsevier B.V. All rights reserved.
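A multiple-choice query of the kind described above is equivalent to a regular expression with one character class per constrained position. The sketch below shows that equivalence only; it says nothing about how the webserver is built. The function names and the example residue choices are hypothetical.

```python
import re

def build_query(choices, length):
    """Compile a fixed-length pattern from per-position residue choices.

    choices maps a 0-based position to a string of allowed residues;
    positions that are not listed accept any residue.
    """
    parts = []
    for i in range(length):
        allowed = choices.get(i)
        if allowed:
            parts.append("[" + "".join(sorted(set(allowed))) + "]")
        else:
            parts.append(".")
    return re.compile("".join(parts))

def find_matches(seq, choices, length):
    """Return every 0-based start where the window of given length matches."""
    query = build_query(choices, length)
    return [i for i in range(len(seq) - length + 1)
            if query.fullmatch(seq, i, i + length)]
```

For example, `find_matches("MAKSVLT", {0: "AV", 2: "ST"}, 3)` asks for A or V, then any residue, then S or T. Checking each window separately means overlapping hits are reported as well.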
Ajawatanawong, Pravech; Atkinson, Gemma C; Watson-Haigh, Nathan S; Mackenzie, Bryony; Baldauf, Sandra L
2012-07-01
Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.
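To make the two ingredients named in the abstract above concrete, the following sketch locates gap-containing column runs in a protein alignment and scores columns by Shannon entropy. It is an illustration under stated assumptions and not SeqFIRE's actual procedure: the abstract does not give its thresholds or its exact indel definition, so the rule used here (any column with at least one `-` belongs to an indel region) and the function names are hypothetical.

```python
from collections import Counter
from math import log2

def column_entropy(column):
    """Shannon entropy (bits) of the residues in one column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    return -sum((k / n) * log2(k / n) for k in Counter(residues).values())

def indel_regions(alignment):
    """Return half-open (start, end) column ranges that contain any gap.

    alignment is a list of equal-length aligned sequences.
    """
    width = len(alignment[0])
    regions, start = [], None
    for col in range(width):
        has_gap = any(seq[col] == "-" for seq in alignment)
        if has_gap and start is None:
            start = col                      # a gap run opens here
        elif not has_gap and start is not None:
            regions.append((start, col))     # the run closed before this column
            start = None
    if start is not None:
        regions.append((start, width))
    return regions
```

Columns with entropy near zero are candidates for conserved blocks, while high-entropy columns are candidates for fast-evolving sites; where to draw either cutoff is left to the user, as the abstract notes for SeqFIRE's own adjustable variables.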
FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bolleman, Jerven T.; Mungall, Christopher J.; Strozzi, Francesco
Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. In this paper, we have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe – and potentially merge – sequence annotations from multiple sources. Finally, data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
Analysis on the use of Multi-Sequence MRI Series for Segmentation of Abdominal Organs
NASA Astrophysics Data System (ADS)
Selver, M. A.; Selvi, E.; Kavur, E.; Dicle, O.
2015-01-01
Segmentation of abdominal organs from MRI data sets is a challenging task due to various limitations and artefacts. During the routine clinical practice, radiologists use multiple MR sequences in order to analyze different anatomical properties. These sequences have different characteristics in terms of acquisition parameters (such as contrast mechanisms and pulse sequence designs) and image properties (such as pixel spacing, slice thicknesses and dynamic range). For a complete understanding of the data, computational techniques should combine the information coming from these various MRI sequences. These sequences are not acquired in parallel but in a sequential manner (one after another). Therefore, patient movements and respiratory motions change the position and shape of the abdominal organs. In this study, the amount of these effects is measured using three different symmetric surface distance metrics applied to three-dimensional data acquired from various MRI sequences. The results are compared to intra- and inter-observer differences, and discussions on using multiple MRI sequences for segmentation and the necessities for registration are presented.
Bardy, Fabrice; Dillon, Harvey; Van Dun, Bram
2014-04-01
Rapid presentation of stimuli in an evoked response paradigm can lead to overlap of multiple responses and consequently difficulties interpreting waveform morphology. This paper presents a deconvolution method allowing overlapping multiple responses to be disentangled. The deconvolution technique uses a least-squared error approach. A methodology is proposed to optimize the stimulus sequence associated with the deconvolution technique under low-jitter conditions. It controls the condition number of the matrices involved in recovering the responses. Simulations were performed using the proposed deconvolution technique. Multiple overlapping responses can be recovered perfectly in noiseless conditions. In the presence of noise, the amount of error introduced by the technique can be controlled a priori by the condition number of the matrix associated with the used stimulus sequence. The simulation results indicate the need for a minimum amount of jitter, as well as a sufficient number of overlap combinations to obtain optimum results. An aperiodic model is recommended to improve reconstruction. We propose a deconvolution technique allowing multiple overlapping responses to be extracted and a method of choosing the stimulus sequence optimal for response recovery. This technique may allow audiologists, psychologists, and electrophysiologists to optimize their experimental designs involving rapidly presented stimuli, and to recover evoked overlapping responses. Copyright © 2013 International Federation of Clinical Neurophysiology. All rights reserved.
NASA Technical Reports Server (NTRS)
Wang, C.-W.; Stark, W.
2005-01-01
This article considers a quaternary direct-sequence code-division multiple-access (DS-CDMA) communication system with asymmetric quadrature phase-shift-keying (AQPSK) modulation for unequal error protection (UEP) capability. Both time synchronous and asynchronous cases are investigated. An expression for the probability distribution of the multiple-access interference is derived. The exact bit-error performance and the approximate performance using a Gaussian approximation and random signature sequences are evaluated by extending the techniques used for uniform quadrature phase-shift-keying (QPSK) and binary phase-shift-keying (BPSK) DS-CDMA systems. Finally, a general system model with unequal user power and the near-far problem is considered and analyzed. The results show that, for a system with UEP capability, the less protected data bits are more sensitive to the near-far effect that occurs in a multiple-access environment than are the more protected bits.
DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.
Eernisse, D J
1992-04-01
DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert documented GenBank or EMBL sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data is compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.
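Translation under alternate genetic codes, mentioned in the abstract above, can be sketched in a few lines. This is a modern Python illustration and has no connection to the HyperCard stacks themselves. The two code strings follow the standard and vertebrate mitochondrial genetic codes in TCAG codon order; the function names are hypothetical, and ambiguous nucleotides are handled in the simplest way, by emitting `X` for any codon that is not fully resolved.

```python
from itertools import product

BASES = "TCAG"  # codon order used by the code strings below

# One amino-acid letter per codon, TTT, TTC, TTA, TTG, TCT, ... GGG.
CODES = {
    "standard":
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    "vertebrate_mito":
        "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
}

def codon_table(code="standard"):
    """Build a codon -> amino acid dictionary for the named genetic code."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, CODES[code]))

def translate(dna, code="standard"):
    """Translate reading frame 1; unresolved codons become 'X', stops '*'."""
    table = codon_table(code)
    dna = dna.upper().replace("U", "T")
    return "".join(table.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))
```

The same nine bases, `ATGTGAATA`, read as Met-stop-Ile under the standard code and as Met-Trp-Met under the vertebrate mitochondrial code, which is why the choice of code matters for the organelle genomes listed in the abstract.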
Can you sequence ecology? Metagenomics of adaptive diversification.
Marx, Christopher J
2013-01-01
Few areas of science have benefited more from the expansion in sequencing capability than the study of microbial communities. Can sequence data, besides providing hypotheses of the functions the members possess, detect the evolutionary and ecological processes that are occurring? For example, can we determine if a species is adapting to one niche, or if it is diversifying into multiple specialists that inhabit distinct niches? Fortunately, adaptation of populations in the laboratory can serve as a model to test our ability to make such inferences about evolution and ecology from sequencing. Even adaptation to a single niche can give rise to complex temporal dynamics due to the transient presence of multiple competing lineages. If there are multiple niches, this complexity is augmented by segmentation of the population into multiple specialists that can each continue to evolve within their own niche. For a known example of parallel diversification that occurred in the laboratory, sequencing data gave surprisingly few obvious, unambiguous signs of the ecological complexity present. Whereas experimental systems are open to direct experimentation to test hypotheses of selection or ecological interaction, the difficulty in "seeing ecology" from sequencing for even such a simple system suggests translation to communities like the human microbiome will be quite challenging. This will require both improved empirical methods to enhance the depth and time resolution for the relevant polymorphisms and novel statistical approaches to rigorously examine time-series data for signs of various evolutionary and ecological phenomena within and between species.
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.
Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias
2011-01-01
The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
Garcia-Hermoso, Dea; Criscuolo, Alexis; Lee, Soo Chan; Legrand, Matthieu; Chaouat, Marc; Denis, Blandine; Lafaurie, Matthieu; Rouveau, Martine; Soler, Charles; Schaal, Jean-Vivien; Mimoun, Maurice; Mebazaa, Alexandre; Heitman, Joseph; Dromer, Françoise; Brisse, Sylvain; Bretagne, Stéphane; Alanio, Alexandre
2018-04-24
Mucorales are ubiquitous environmental molds responsible for mucormycosis in diabetic, immunocompromised, and severely burned patients. Small outbreaks of invasive wound mucormycosis (IWM) have already been reported in burn units without extensive microbiological investigations. We faced an outbreak of IWM in our center and investigated the clinical isolates with whole-genome sequencing (WGS) analysis. We analyzed M. circinelloides isolates from patients in our burn unit (BU1, Hôpital Saint-Louis, Paris, France) together with nonoutbreak isolates from Burn Unit 2 (BU2, Paris area) and from France over a 2-year period (2013 to 2015). A total of 21 isolates, including 14 isolates from six BU1 patients, were analyzed by whole-genome sequencing (WGS). Phylogenetic classification based on de novo assembly and assembly free approaches showed that the clinical isolates clustered in four highly divergent clades. Clade 1 contained at least one of the strains from the six epidemiologically linked BU1 patients. The clinical isolates were specific to each patient. Two patients were infected with more than two strains from different clades, suggesting that an environmental reservoir of clonally unrelated isolates was the source of contamination. Only two patients from BU1 shared one strain, which could correspond to direct transmission or contamination with the same environmental source. In conclusion, WGS of several isolates per patients coupled with precise epidemiological data revealed a complex situation combining potential cross-transmission between patients and multiple contaminations with a heterogeneous pool of strains from a cryptic environmental reservoir. IMPORTANCE Invasive wound mucormycosis (IWM) is a severe infection due to environmental molds belonging to the order Mucorales. Severely burned patients are particularly at risk for IWM. 
Here, we used whole-genome sequencing (WGS) analysis to resolve an outbreak of IWM due to Mucor circinelloides that occurred in our hospital (BU1). We sequenced 21 clinical isolates, including 14 from BU1 and 7 unrelated isolates, and compared them to the reference genome (1006PhL). This analysis revealed that the outbreak was mainly due to multiple strains that seemed patient specific, suggesting that the patients were more likely infected from a pool of diverse strains from the environment rather than from direct transmission among them. This study revealed the complexity of a Mucorales outbreak in the settings of IWM in burn patients, which has been highlighted based on WGS combined with careful sampling. Copyright © 2018 Garcia-Hermoso et al.
Characterisation of a rare, reassortant human G10P[14] rotavirus strain detected in Honduras
Quaye, Osbourne; Roy, Sunando; Rungsrisuriyachai, Kunchala; Esona, Mathew D; Xu, Ziqian; Tam, Ka Ian; Banegas, Dina J Castro; Rey-Benito, Gloria; Bowen, Michael D
2018-01-01
BACKGROUND Although first detected in animals, the rare rotavirus strain G10P[14] has been sporadically detected in humans in Slovenia, Thailand, United Kingdom and Australia among other countries. Earlier studies suggest that the strains found in humans resulted from interspecies transmission and reassortment between human and bovine rotavirus strains. OBJECTIVES In this study, a G10P[14] rotavirus genotype detected in a human stool sample in Honduras during the 2010-2011 rotavirus season, from an unvaccinated 30-month old boy who reported at the hospital with severe diarrhea and vomiting, was characterised to determine the possible evolutionary origin of the rare strain. METHODS For the sample detected as G10P[14], 10% suspension was prepared and used for RNA extraction and sequence independent amplification. The amplicons were sequenced by next-generation sequencing using the Illumina MiSeq 150 paired end method. The sequence reads were analysed using CLC Genomics Workbench 6.0 and phylogenetic trees were constructed using PhyML version 3.0. FINDINGS The next generation sequencing and phylogenetic analyses of the 11-segmented genome of the G10P[14] strain allowed classification as G10-P[14]-I2-R2-C2-M2-A3-N2-T6-E2-H3. Six of the genes (VP1, VP2, VP3, VP6, NSP2 and NSP4) were DS-1-like. NSP1 and NSP5 were AU-1-like and NSP3 was T6, which suggests that multiple reassortment events occurred in the evolution of the strain. The phylogenetic analyses and genetic distance calculations showed that the VP7, VP4, VP6, VP1, VP3, NSP1, NSP3 and NSP4 genes clustered predominantly with bovine strains. NSP2 and VP2 genes were most closely related to simian and human strains, respectively, and NSP5 was most closely related to a rhesus strain. MAIN CONCLUSIONS The genetic characterisation of the G10P[14] strain from Honduras suggests that its genome resulted from multiple reassortment events which were possibly mediated through interspecies transmissions. PMID:29211103
Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin
2017-01-21
RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. 
The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/ .
Bouwman, Aniek C; Veerkamp, Roel F
2014-10-03
The aim of this study was to determine the consequences of splitting sequencing effort over multiple breeds for imputation accuracy from a high-density SNP chip towards whole-genome sequence. Such information would assist for instance numerically smaller cattle breeds, but also pig and chicken breeders, who have to choose wisely how to spend their sequencing efforts over all the breeds or lines they evaluate. Sequence data from cattle breeds was used, because there are currently relatively many individuals from several breeds sequenced within the 1,000 Bull Genomes project. The advantage of whole-genome sequence data is that it carries the causal mutations, but the question is whether it is possible to impute the causal variants accurately. This study therefore focussed on imputation accuracy of variants with low minor allele frequency and breed specific variants. Imputation accuracy was assessed for chromosome 1 and 29 as the correlation between observed and imputed genotypes. For chromosome 1, the average imputation accuracy was 0.70 with a reference population of 20 Holstein, and increased to 0.83 when the reference population was increased by including 3 other dairy breeds with 20 animals each. When the same number of animals from the Holstein breed were added the accuracy improved to 0.88, while adding the 3 other breeds to the reference population of 80 Holstein improved the average imputation accuracy marginally to 0.89. For chromosome 29, the average imputation accuracy was lower. Some variants benefitted from the inclusion of other breeds in the reference population, initially determined by the MAF of the variant in each breed, but even Holstein specific variants did gain imputation accuracy from the multi-breed reference population. 
This study shows that splitting sequencing effort over multiple breeds and combining the reference populations is a good strategy for imputation from high-density SNP panels towards whole-genome sequence when reference populations are small and sequencing effort is limiting. When sequencing effort is limiting and interest lies in multiple breeds or lines this provides imputation of each breed.
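The accuracy measure used in the abstract above, the correlation between observed and imputed genotypes, can be written out as a short sketch. Two details are assumptions rather than facts from the abstract: genotypes are coded as allele dosages (0, 1 or 2), and the correlation is taken per variant across animals and then averaged. The function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists.

    Returns NaN when either list has zero variance, as happens for a
    variant that is monomorphic in the validation animals.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

def mean_imputation_accuracy(observed, imputed):
    """Average per-variant correlation, skipping undefined (NaN) variants.

    observed and imputed are lists of per-variant dosage lists.
    """
    scores = [pearson(o, i) for o, i in zip(observed, imputed)]
    valid = [s for s in scores if s == s]  # NaN is the only value != itself
    return sum(valid) / len(valid) if valid else float("nan")
```

Correlation is often preferred over simple concordance for rare variants, because calling every animal homozygous for the major allele gives high concordance but no correlation.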
Gardner, Shea N; Wagner, Mark C
2005-01-01
Background Microbial forensics is important in tracking the source of a pathogen, whether the disease is a naturally occurring outbreak or part of a criminal investigation. Results A method and SPR Opt (SNP and PCR-RFLP Optimization) software to perform a comprehensive, whole-genome analysis to forensically discriminate multiple sequences is presented. Tools for the optimization of forensic typing using Single Nucleotide Polymorphism (SNP) and PCR-Restriction Fragment Length Polymorphism (PCR-RFLP) analyses across multiple isolate sequences of a species are described. The PCR-RFLP analysis includes prediction and selection of optimal primers and restriction enzymes to enable maximum isolate discrimination based on sequence information. SPR Opt calculates all SNP or PCR-RFLP variations present in the sequences, groups them into haplotypes according to their co-segregation across those sequences, and performs combinatoric analyses to determine which sets of haplotypes provide maximal discrimination among all the input sequences. Those set combinations requiring that membership in the fewest haplotypes be queried (i.e. the fewest assays be performed) are found. These analyses highlight variable regions based on existing sequence data. These markers may be heterogeneous among unsequenced isolates as well, and thus may be useful for characterizing the relationships among unsequenced as well as sequenced isolates. The predictions are multi-locus. Analyses of mumps and SARS viruses are summarized. Phylogenetic trees created based on SNPs, PCR-RFLPs, and full genomes are compared for SARS virus, illustrating that purported phylogenies based only on SNP or PCR-RFLP variations do not match those based on multiple sequence alignment of the full genomes. Conclusion This is the first software to optimize the selection of forensic markers to maximize information gained from the fewest assays, accepting whole or partial genome sequence data as input. 
As more sequence data becomes available for multiple strains and isolates of a species, automated, computational approaches such as those described here will be essential to make sense of large amounts of information, and to guide and optimize efforts in the laboratory. The software and source code for SPR Opt is publicly available and free for non-profit use at . PMID:15904493
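Choosing the fewest markers that still tell every pair of isolates apart is a set-cover style problem. The sketch below uses a simple greedy heuristic to show what is being optimized; it is a swapped-in technique and not the combinatoric analysis that SPR Opt performs, which the abstract does not describe in detail. The function name, the data layout, and the marker names in the usage note are hypothetical.

```python
from itertools import combinations

def select_markers(markers, isolates):
    """Greedily pick markers until all isolate pairs are distinguished.

    markers maps a marker name to a dict {isolate: allele}.
    Returns (chosen marker names, set of pairs no marker can separate).
    """
    unresolved = set(combinations(sorted(isolates), 2))
    chosen = []
    while unresolved:
        def gain(name):
            calls = markers[name]
            # number of still-unresolved pairs this marker would separate
            return sum(1 for a, b in unresolved if calls[a] != calls[b])

        best = max(sorted(markers), key=gain)  # ties go to the first name
        if gain(best) == 0:
            break                              # remaining pairs are identical
        chosen.append(best)
        calls = markers[best]
        unresolved = {(a, b) for a, b in unresolved if calls[a] == calls[b]}
    return chosen, unresolved
```

With four isolates and three hypothetical markers, where `m1` separates {A, B} from {C, D} and `m2` separates {A, C} from {B, D}, the two together resolve all six pairs, so a third marker that only singles out D is never needed. A greedy choice is not guaranteed to be minimal in general, which is one reason an exhaustive combinatoric search may be preferred when the number of haplotypes is small.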
Somatic mosaicism of a CDKL5 mutation identified by next-generation sequencing.
Kato, Takeshi; Morisada, Naoya; Nagase, Hiroaki; Nishiyama, Masahiro; Toyoshima, Daisaku; Nakagawa, Taku; Maruyama, Azusa; Fu, Xue Jun; Nozu, Kandai; Wada, Hiroko; Takada, Satoshi; Iijima, Kazumoto
2015-10-01
CDKL5-related encephalopathy is an X-linked dominantly inherited disorder that is characterized by early infantile epileptic encephalopathy or atypical Rett syndrome. We describe a 5-year-old Japanese boy with intractable epilepsy, severe developmental delay, and Rett syndrome-like features. Onset was at 2 months, when his electroencephalogram showed sporadic single poly spikes and diffuse irregular poly spikes. We conducted a genetic analysis using an Illumina® TruSight™ One sequencing panel on a next-generation sequencer. We identified two epilepsy-associated single nucleotide variants in our case: CDKL5 p.Ala40Val and KCNQ2 p.Glu515Asp. CDKL5 p.Ala40Val has been previously reported to be responsible for early infantile epileptic encephalopathy. In our case, the CDKL5 heterozygous mutation showed somatic mosaicism because the boy's karyotype was 46,XY. The KCNQ2 variant p.Glu515Asp is known to cause benign familial neonatal seizures-1, and this variant showed paternal inheritance. Although we believe that the somatic mosaic CDKL5 mutation is mainly responsible for the neurological phenotype in the patient, the KCNQ2 variant might have some neurological effect. Genetic analysis by next-generation sequencing is capable of identifying multiple variants in a patient. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.
Álvarez-Pérez, Sergio; de Vega, Clara; Herrera, Carlos M.
2013-01-01
The genetic and evolutionary relationships among floral nectar-dwelling Pseudomonas ‘sensu stricto’ isolates associated to South African and Mediterranean plants were investigated by multilocus sequence analysis (MLSA) of four core housekeeping genes (rrs, gyrB, rpoB and rpoD). A total of 35 different sequence types were found for the 38 nectar bacterial isolates characterised. Phylogenetic analyses resulted in the identification of three main clades [nectar groups (NGs) 1, 2 and 3] of nectar pseudomonads, which were closely related to five intrageneric groups: Pseudomonas oryzihabitans (NG 1); P. fluorescens, P. lutea and P. syringae (NG 2); and P. rhizosphaerae (NG 3). Linkage disequilibrium analysis pointed to a mostly clonal population structure, even when the analysis was restricted to isolates from the same floristic region or belonging to the same NG. Nevertheless, signatures of recombination were observed for NG 3, which exclusively included isolates retrieved from the floral nectar of insect-pollinated Mediterranean plants. In contrast, the other two NGs comprised both South African and Mediterranean isolates. Analyses relating diversification to floristic region and pollinator type revealed that there has been more unique evolution of the nectar pseudomonads within the Mediterranean region than would be expected by chance. This is the first work analysing the sequence of multiple loci to reveal geno- and ecotypes of nectar bacteria. PMID:24116076
Sharma, Amit K; Gohel, Sangeeta; Singh, Satya P
2012-01-01
Actinobase is a relational database of molecular diversity, phylogeny and biocatalytic potential of haloalkaliphilic actinomycetes. The main objective of this database is to provide easy access to a range of information, together with data storage, comparison and analysis, while reducing data redundancy and the costs of data entry, storage and retrieval, and improving data security. Information related to habitat, cell morphology, Gram reaction, biochemical characterization and molecular features would help researchers understand the identification and stress adaptation of existing and new candidates belonging to salt tolerant alkaliphilic actinomycetes. The PHP front end helps to add nucleotide and protein sequences of reported entries, which directly helps researchers to obtain the required details. Analysis of the genus-wise status of the salt tolerant alkaliphilic actinomycetes indicated 6 different genera among the 40 classified entries. The results represented widespread occurrence of salt tolerant alkaliphilic actinomycetes belonging to diverse taxonomic positions. Entries and information related to actinomycetes in the database are publicly accessible at http://www.actinobase.in. On clustalW/X multiple sequence alignment of the alkaline protease gene sequences, different clusters emerged among the groups. The narrow search and limit options of the constructed database provided comparable information. The user-friendly PHP front end facilitates addition of sequences of reported entries. The database is available for free at http://www.actinobase.in.
The B chromosomes in Brachycome.
Leach, C R; Houben, A; Timmis, J N
2004-01-01
This review presents a historical account of studies of B chromosomes in the genus Brachycome Cass. (synonym: Brachyscome) from the earliest cytological investigations carried out in the late 1960s through to the most recent molecular analyses. Molecular analyses provide insights into the origin and evolution of the B chromosomes (Bs) of Brachycome dichromosomatica, a species which has Bs of two different sizes. The larger Bs are somatically stable whereas the smaller, or micro, Bs are somatically unstable. Both B types contain clusters of ribosomal RNA genes that have been shown unequivocally to be inactive in the case of the larger Bs. The large Bs carry a family of tandem repeat sequences (Bd49) that are located mainly at the centromere. Multiple copies of sequences related to this repeat are present on the A chromosomes (As) of related species, whereas only a few copies exist in the A chromosomes of B. dichromosomatica. The micro Bs share DNA sequences with the As and the larger Bs, and they also have B-specific repeats (Bdm29 and Bdm54). In some cases repeat sequences on the micro Bs have been shown to occur as clusters on the A chromosomes in a proportion of individuals within a population. It is clear that none of these B types originated by simple excision of segments from the A chromosomes. Copyright 2004 S. Karger AG, Basel
ERIC Educational Resources Information Center
Hodson, D.
1984-01-01
Investigated the effect on student performance of changes in question structure and sequence on a GCE O-level multiple-choice chemistry test. One finding noted is that there was virtually no change in test reliability on reducing the number of options (from five to per test item). (JN)
USDA-ARS's Scientific Manuscript database
Reconstructing the phylogeny of Pyrus has been difficult due to the wide distribution of the genus and lack of informative data. In this study, we collected 110 accessions representing 25 Pyrus species and constructed both phylogenetic trees and phylogenetic networks based on multiple DNA sequence d...
Planets, Planetary Nebulae, and Intermediate Luminosity Optical Transients (ILOTs)
NASA Astrophysics Data System (ADS)
Soker, Noam
2018-05-01
I review some aspects related to the influence of planets on the evolution of stars before and beyond the main sequence. Some processes include the tidal destruction of a planet on to a very young main sequence star, on to a low mass main sequence star, and on to a brown dwarf. This process releases gravitational energy that might be observed as a faint intermediate luminosity optical transient (ILOT) event. I then summarize the view that some elliptical planetary nebulae are shaped by planets. When the planet interacts with a low mass upper asymptotic giant branch (AGB) star it both enhances the mass loss rate and shapes the wind to form an elliptical planetary nebula, mainly by spinning up the envelope and by exciting waves in the envelope. If no interaction with a companion, stellar or sub-stellar, takes place beyond the main sequence, the star is termed a Jsolated star, and its mass loss rates on the giant branches are likely to be much lower than what is traditionally assumed.
Iterative pass optimization of sequence data
NASA Technical Reports Server (NTRS)
Wheeler, Ward C.
2003-01-01
The problem of determining the minimum-cost hypothetical ancestral sequences for a given cladogram is known to be NP-complete. This "tree alignment" problem has motivated the considerable effort placed in multiple sequence alignment procedures. Wheeler in 1996 proposed a heuristic method, direct optimization, to calculate cladogram costs without the intervention of multiple sequence alignment. This method, though more efficient in time and more effective in cladogram length than many alignment-based procedures, greedily optimizes nodes based on descendent information only. In their proposal of an exact multiple alignment solution, Sankoff et al. in 1976 described a heuristic procedure--the iterative improvement method--to create alignments at internal nodes by solving a series of median problems. The combination of a three-sequence direct optimization with iterative improvement and a branch-length-based cladogram cost procedure, provides an algorithm that frequently results in superior (i.e., lower) cladogram costs. This iterative pass optimization is both computation and memory intensive, but economies can be made to reduce this burden. An example in arthropod systematics is discussed. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.
Evidence for the Concerted Evolution between Short Linear Protein Motifs and Their Flanking Regions
Chica, Claudia; Diella, Francesca; Gibson, Toby J.
2009-01-01
Background Linear motifs are short modules of protein sequences that play a crucial role in mediating and regulating many protein–protein interactions. The function of linear motifs strongly depends on the context, e.g. functional instances mainly occur inside flexible regions that are accessible for interaction. Sometimes linear motifs appear as isolated islands of conservation in multiple sequence alignments. However, they also occur in larger blocks of sequence conservation, suggesting an active role for the neighbouring amino acids. Results The evolution of regions flanking 116 functional linear motif instances was studied. The conservation of the amino acid sequence and order/disorder tendency of those regions was related to presence/absence of the instance. For the majority of the analysed instances, the pairs of sequences conserving the linear motif were also observed to maintain a similar local structural tendency and/or to have higher local sequence conservation when compared to pairs of sequences where one is missing the linear motif. Furthermore, those instances have a higher chance to co–evolve with the neighbouring residues in comparison to the distant ones. Those findings are supported by examples where the regulation of the linear motif–mediated interaction has been shown to depend on the modifications (e.g. phosphorylation) at neighbouring positions or is thought to benefit from the binding versatility of disordered regions. Conclusion The results suggest that flanking regions are relevant for linear motif–mediated interactions, both at the structural and sequence level. More interestingly, they indicate that the prediction of linear motif instances can be enriched with contextual information by performing a sequence analysis similar to the one presented here. This can facilitate the understanding of the role of these predicted instances in determining the protein function inside the broader context of the cellular network where they arise. 
PMID:19584925
Zhou, Carol L Ecale
2015-01-01
In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign was demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository.
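The one-to-many merge described in the abstract above can be illustrated with a minimal Python sketch. This is not CombAlign's source code: the function name `merge_pairwise`, the input layout (a list of `(aligned_reference, aligned_other)` string pairs) and the choice to left-justify insertion blocks are all assumptions made for illustration. The idea is that wherever any pairwise alignment places residues opposite a gap in the reference, a gap of matching width is opened in the reference row and in every other row.

```python
def merge_pairwise(pairs):
    """Merge pairwise alignments that share one reference into a gapped MSA.

    pairs: list of (aligned_reference, aligned_other) strings of equal length.
    Returns a list of rows; row 0 is the (possibly gapped) reference.
    """
    ref = pairs[0][0].replace("-", "")
    n = len(ref)
    parsed = []
    for aligned_ref, aligned_other in pairs:
        if len(aligned_ref) != len(aligned_other):
            raise ValueError("each pairwise alignment must have equal-length rows")
        if aligned_ref.replace("-", "") != ref:
            raise ValueError("all pairs must use the same reference sequence")
        inserts = [""] * (n + 1)  # inserts[i]: residues placed before reference position i
        matched = []              # matched[i]: residue (or '-') aligned to reference position i
        for r, o in zip(aligned_ref, aligned_other):
            if r == "-":
                if o != "-":
                    inserts[len(matched)] += o
            else:
                matched.append(o)
        parsed.append((inserts, matched))

    # Width of each insertion slot is set by the longest insertion seen there.
    widths = [max(len(ins[i]) for ins, _ in parsed) for i in range(n + 1)]

    rows = ["".join("-" * widths[i] + ref[i] for i in range(n)) + "-" * widths[n]]
    for inserts, matched in parsed:
        body = "".join(inserts[i].ljust(widths[i], "-") + matched[i] for i in range(n))
        rows.append(body + inserts[n].ljust(widths[n], "-"))
    return rows


if __name__ == "__main__":
    for row in merge_pairwise([("ACDE", "AC-E"), ("AC-DE", "ACGDE")]):
        print(row)
```

Residues inside an insertion block are only stacked, not aligned to one another, so columns opposite reference gaps should not be read as residue-residue correspondences.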
GASP: Gapped Ancestral Sequence Prediction for proteins
Edwards, Richard J; Shields, Denis C
2004-01-01
Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199
NASA Astrophysics Data System (ADS)
Boumehrez, Farouk; Brai, Radhia; Doghmane, Noureddine; Mansouri, Khaled
2018-01-01
Recently, video streaming has attracted much attention and interest due to its capability to process and transmit large data. We propose a quality of experience (QoE) model relying on a high efficiency video coding (HEVC) encoder adaptation scheme, in turn based on multiple description coding (MDC) for video streaming. The main contributions of the paper are (1) a performance evaluation of the new and emerging video coding standard HEVC/H.265, based on varying the quantization parameter (QP) for different video contents to deduce its influence on the sequence to be transmitted; (2) an investigation of QoE support for multimedia applications in wireless networks, in which we inspect the impact of packet loss on the QoE of transmitted video sequences; and (3) a model of the MDC-based HEVC encoder parameter adaptation scheme built from the encoder parameter and an objective QoE model. A comparative study revealed that the proposed MDC approach is effective for improving the transmission with a peak signal-to-noise ratio (PSNR) gain of about 2 to 3 dB. Results show that a good choice of QP value can compensate for transmission channel effects and improve received video quality, although HEVC/H.265 is also sensitive to packet loss. The obtained results show the efficiency of our proposed method in terms of PSNR and mean-opinion-score.
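For readers unfamiliar with the PSNR figure quoted in the abstract above, the following minimal Python sketch computes it from the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr` and the flat-list frame representation are assumptions for illustration and are not taken from the paper. Under this definition, halving the mean squared error corresponds to a gain of about 3 dB.

```python
import math


def psnr(reference, received, max_value=255):
    """PSNR in dB between two equally sized frames given as flat lists of pixel values."""
    if len(reference) != len(received) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ref = [10, 10, 10, 10]
    low_error = [12, 10, 10, 10]   # MSE = 1
    high_error = [12, 12, 10, 10]  # MSE = 2
    print(psnr(ref, low_error) - psnr(ref, high_error))  # gain from halving the MSE
```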
Qin, Y; Duquette, P; Zhang, Y; Talbot, P; Poole, R; Antel, J
1998-01-01
The cerebrospinal fluid (CSF) of multiple sclerosis (MS) patients is characterized by increased concentrations of immunoglobulin (Ig), which on electrophoretic analysis shows restricted heterogeneity (oligoclonal bands). CSF Ig is composed of both serum and intrathecally produced components. To examine the properties of intrathecal antibody-producing B cells, we analyzed Ig heavy-chain variable (V(H)) region genes of B cells recovered from the CSF of 12 MS patients and 15 patients with other neurological diseases (OND). Using a PCR technique, we could detect rearrangements of Ig V(H) genes in all samples. Sequence analysis of complementarity-determining region 3 (CDR3) of rearranged VDJ genes revealed expansion of a dominant clone or clones in 10 of the 12 MS patients. B cell clonal expansion was identified in 3 of 15 OND. The nucleotide sequences of V(H) genes from clonally expanded CSF B cells in MS patients demonstrated the preferential usage of the V(H) IV family. There were numerous somatic mutations, mainly in the CDRs, with a high replacement-to-silent ratio; the mutations were distributed in a way suggesting that these B cells had been positively selected through their antigen receptor. Our results demonstrate that in MS CSF, there is a high frequency of clonally expanded B cells that have properties of postgerminal center memory or antibody-forming lymphocytes. PMID:9727074
Huy, Nguyen Tien; Hang, Le Thi Thuy; Boamah, Daniel; Lan, Nguyen Thi Phuong; Van Thanh, Phan; Watanabe, Kiwao; Huong, Vu Thi Thu; Kikuchi, Mihoko; Ariyoshi, Koya; Morita, Kouichi; Hirayama, Kenji
2012-12-01
Several loop-mediated isothermal amplification (LAMP) assays have been developed to detect common causative pathogens of bacterial meningitis (BM). However, no LAMP assay is reported to detect Streptococcus agalactiae and Streptococcus suis, which are also among the common pathogens of BM. Moreover, performing multiple reactions on each sample to detect a bacterial pathogen is laborious and expensive. Thus, we aimed to design and develop a single-tube LAMP assay capable of detecting multiple bacterial species, based on the nucleotide sequences of the 16S rRNA genes of the bacteria. The nucleotide sequences of the 16S rRNA genes of the main pathogens involved in BM were aligned to identify conserved regions, which were further used to design broad range specific LAMP assay primers. We successfully designed a set of broad range specific LAMP assay primers for simultaneous detection of four species including Staphylococcus aureus, Streptococcus pneumoniae, S. suis and S. agalactiae. The broad range LAMP assay was highly specific without cross-reactivity with other bacteria including Haemophilus influenzae, Neisseria meningitidis and Escherichia coli. The sensitivity of our LAMP assay was 100-1000 times higher compared with the conventional PCR assay. The bacterial species could be identified after digestion of the LAMP products with restriction endonuclease DdeI and HaeIII. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.; ...
2017-07-18
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
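GC content is one of the gap-sequence properties listed in the abstract above. As a minimal sketch of how such a property might be computed for a gap sequence, the following Python functions report the overall GC fraction and the GC fraction in fixed-size windows. The function names, the window parameters and the decision to ignore non-ACGT symbols are assumptions for illustration; the study's own scripts are not described in the abstract.

```python
def gc_content(seq):
    """Fraction of G/C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0  # assumption: report 0.0 when no unambiguous base is present
    return sum(b in "GC" for b in bases) / len(bases)


def windowed_gc(seq, window, step):
    """GC fraction for each full window of length `window`, advancing by `step`."""
    return [gc_content(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    gap = "GGGGAAAA"
    print(gc_content(gap), windowed_gc(gap, window=4, step=4))
```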
Sequence Diversity Diagram for comparative analysis of multiple sequence alignments.
Sakai, Ryo; Aerts, Jan
2014-01-01
The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.
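The "amount of information present at every position" that a sequence logo displays can be sketched in a few lines of Python. The example below uses the usual definition, log2 of the alphabet size minus the Shannon entropy of the column, and omits the small-sample correction that logo tools commonly apply. The function names are illustrative assumptions, and this is not the authors' Processing code.

```python
import math
from collections import Counter


def column_information(column, alphabet_size):
    """Information content in bits of one alignment column (no small-sample correction)."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def alignment_information(sequences, alphabet_size=4):
    """Per-column information for equal-length aligned sequences (default: DNA)."""
    return [column_information(col, alphabet_size) for col in zip(*sequences)]


if __name__ == "__main__":
    aligned = ["ACGT", "ACGA", "ACCC", "ATTG"]
    print(alignment_information(aligned))
```

A fully conserved DNA column carries 2 bits and a column with all four bases equally frequent carries 0 bits, which is why a logo alone says little about how two sets of sequences differ at a variable position.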
2014-01-01
Background The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393
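The matrix underlying a ProfileGrid, as described above, is the residue frequency at every alignment position. A minimal Python sketch of that matrix follows; the function name `profile_grid` and the list-of-dicts layout are assumptions for illustration and do not reflect JProfileGrid's internals. Gap characters are counted like any other symbol here.

```python
from collections import Counter


def profile_grid(sequences):
    """Return one {residue: frequency} dict per column of an alignment."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal aligned length")
    n = len(sequences)
    return [{res: count / n for res, count in Counter(col).items()}
            for col in zip(*sequences)]


if __name__ == "__main__":
    for position, freqs in enumerate(profile_grid(["MKV", "MRV", "MKI", "MK-"]), start=1):
        print(position, freqs)
```

A viewer would then map each frequency to a cell color, with rows for residue types and columns for alignment positions.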
Aftershocks of the 2014 South Napa, California, Earthquake: Complex faulting on secondary faults
Hardebeck, Jeanne L.; Shelly, David R.
2016-01-01
We investigate the aftershock sequence of the 2014 MW6.0 South Napa, California, earthquake. Low-magnitude aftershocks missing from the network catalog are detected by applying a matched-filter approach to continuous seismic data, with the catalog earthquakes serving as the waveform templates. We measure precise differential arrival times between events, which we use for double-difference event relocation in a 3D seismic velocity model. Most aftershocks are deeper than the mainshock slip, and most occur west of the mapped surface rupture. While the mainshock coseismic and postseismic slip appears to have occurred on the near-vertical, strike-slip West Napa fault, many of the aftershocks occur in a complex zone of secondary faulting. Earthquake locations in the main aftershock zone, near the mainshock hypocenter, delineate multiple dipping secondary faults. Composite focal mechanisms indicate strike-slip and oblique-reverse faulting on the secondary features. The secondary faults were moved towards failure by Coulomb stress changes from the mainshock slip. Clusters of aftershocks north and south of the main aftershock zone exhibit vertical strike-slip faulting more consistent with the West Napa Fault. The northern aftershocks correspond to the area of largest mainshock coseismic slip, while the main aftershock zone is adjacent to the fault area that has primarily slipped postseismically. Unlike most creeping faults, the zone of postseismic slip does not appear to contain embedded stick-slip patches that would have produced on-fault aftershocks. The lack of stick-slip patches along this portion of the fault may contribute to the low productivity of the South Napa aftershock sequence.
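The matched-filter detection step mentioned above can be sketched as a sliding normalized cross-correlation between a template waveform and continuous data. The pure-Python example below is only an illustration under stated assumptions: single-channel data held in lists, a fixed correlation threshold of 0.8, and hypothetical function names. The study's actual processing (multi-station stacking, filtering, threshold selection) is not described in the abstract and is not reproduced here.

```python
import math


def normalized_cross_correlation(template, data):
    """Correlation coefficient between template and each same-length window of data."""
    m = len(template)
    t_mean = sum(template) / m
    t_dev = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(x * x for x in t_dev))
    values = []
    for lag in range(len(data) - m + 1):
        window = data[lag:lag + m]
        w_mean = sum(window) / m
        w_dev = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w_dev))
        if t_norm == 0 or w_norm == 0:
            values.append(0.0)  # flat window or flat template: define correlation as 0
        else:
            values.append(sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm))
    return values


def detect(template, data, threshold=0.8):
    """Sample offsets at which the correlation reaches the threshold."""
    cc = normalized_cross_correlation(template, data)
    return [lag for lag, value in enumerate(cc) if value >= threshold]


if __name__ == "__main__":
    template = [1.0, -2.0, 3.0, -1.0]
    # A half-amplitude copy of the template hidden at offset 5 in otherwise quiet data.
    data = [0.0] * 5 + [0.5 * x for x in template] + [0.0] * 5
    print(detect(template, data))
```

Because the correlation is normalized, a small event with the same waveform shape as the template scores as highly as a large one, which is what lets the method recover low-magnitude events.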
Simple chained guide trees give high-quality protein multiple sequence alignments
Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.
2014-01-01
Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random. PMID:25002495
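A chained guide tree of the kind discussed above is a fully imbalanced ("caterpillar") tree in which each sequence is joined, one at a time, to the group built so far. The following minimal Python sketch writes such a tree in Newick format from a list of sequence names; the function name and the use of a seeded shuffle for the random ordering are illustrative assumptions. How a particular alignment package reads a user-supplied guide tree is package-specific and is not shown.

```python
import random


def chained_guide_tree(names):
    """Newick string for a chained tree: (((n0,n1),n2),n3)... in the given order."""
    if len(names) < 2:
        raise ValueError("need at least two sequences")
    tree = f"({names[0]},{names[1]})"
    for name in names[2:]:
        tree = f"({tree},{name})"
    return tree + ";"


if __name__ == "__main__":
    names = ["s1", "s2", "s3", "s4", "s5"]
    print(chained_guide_tree(names))

    # The abstract reports the effect holds even for a random sequence order.
    shuffled = names[:]
    random.Random(0).shuffle(shuffled)
    print(chained_guide_tree(shuffled))
```

Building this tree needs no pairwise distances at all, which is why it is the cheapest guide tree to construct.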
Jossinet, Fabrice; Westhof, Eric
2005-08-01
Efficient RNA sequence manipulations (such as multiple alignments) need to be constrained by rules of RNA structure folding. The structural knowledge has increased dramatically in recent years with the accumulation of several large RNA structures similar to those of the bacterial ribosome subunits. However, no tool in the RNA community provides an easy way to link and integrate progress made at the sequence level using the available three-dimensional information. Sequence to Structure (S2S) proposes a framework in which a user can easily display, manipulate and interconnect heterogeneous RNA data, such as multiple sequence alignments, secondary and tertiary structures. S2S has been implemented using the Java language and has been developed and tested under UNIX systems, such as Linux and MacOSX. S2S is available at http://bioinformatics.org/S2S/.
Badr, Eman; ElHefnawi, Mahmoud; Heath, Lenwood S
2016-01-01
Alternative splicing is a vital process for regulating gene expression and promoting proteomic diversity. It plays a key role in tissue-specific expressed genes. This specificity is mainly regulated by splicing factors that bind to specific sequences called splicing regulatory elements (SREs). Here, we report a genome-wide analysis to study alternative splicing on multiple tissues, including brain, heart, liver, and muscle. We propose a pipeline to identify differential exons across tissues and hence tissue-specific SREs. In our pipeline, we utilize the DEXSeq package along with our previously reported algorithms. Utilizing the publicly available RNA-Seq data set from the Human BodyMap project, we identified 28,100 differentially used exons across the four tissues. We identified tissue-specific exonic splicing enhancers that overlap with various previously published experimental and computational databases. A complicated exonic enhancer regulatory network was revealed, where multiple exonic enhancers were found across multiple tissues while some were found only in specific tissues. Putative combinatorial exonic enhancers and silencers were discovered as well, which may be responsible for exon inclusion or exclusion across tissues. Some of the exonic enhancers are found to be co-occurring with multiple exonic silencers and vice versa, which demonstrates a complicated relationship between tissue-specific exonic enhancers and silencers.
NASA Astrophysics Data System (ADS)
Sahu, Sunil Kumar; Singh, Reena; Kathiresan, Kandasamy
2016-12-01
Mangroves are a taxonomically diverse group of salt-tolerant, mainly arboreal, flowering plants that grow in tropical and sub-tropical regions and have adapted themselves to thrive in such obdurate surroundings. While evolution is often understood exclusively in terms of adaptation, innovation often begins when a feature adapted for one function is co-opted for a different purpose; such co-opted features are called exaptations. Thus, one of the fundamental issues is which features of mangroves have evolved through exaptation. We attempt to address this question through a molecular phylogenetic approach using chloroplast and nuclear markers. First, we determined whether these mangrove-specific traits have evolved multiple times in the phylogeny. Once the multiple origins were established, we then looked at related non-mangrove species for characters that could have been co-opted by mangrove species. We also assessed the efficacy of these molecular sequences in distinguishing mangroves at the species level. This study revealed the multiple origins of mangroves, shed light on the ancestral characters that might have led certain lineages of plants to adapt to estuarine conditions, and traced the evolutionary history of mangroves and the hitherto unexplained theory that mangrove traits (aerial roots and viviparous propagules) evolved as a result of exaptation rather than adaptation to saline habitats.
Future of human mitochondrial DNA editing technologies.
Verechshagina, N; Nikitchina, N; Yamada, Y; Harashima, H; Tanaka, M; Orishchenko, K; Mazunin, I
2018-05-15
ATP and other metabolites, which are necessary for the development, maintenance, and functioning of bodily cells, are all synthesized in the mitochondria. Multiple copies of the genome, present within the mitochondria, together with its maternal inheritance, determine the clinical manifestation and spreading of mutations in mitochondrial DNA (mtDNA). The main obstacle to a thorough understanding of mitochondrial biology and to the development of gene therapy methods for mitochondrial diseases is the absence of systems that allow the mtDNA sequence to be changed directly. Here, we discuss existing methods of manipulating the level of mtDNA heteroplasmy, as well as the latest systems that could be used in the future as tools for human mitochondrial genome editing.
3-D model-based tracking for UAV indoor localization.
Teulière, Céline; Marchand, Eric; Eck, Laurent
2015-05-01
This paper proposes a novel model-based tracking approach for 3-D localization. One main difficulty of standard model-based approach lies in the presence of low-level ambiguities between different edges. In this paper, given a 3-D model of the edges of the environment, we derive a multiple hypotheses tracker which retrieves the potential poses of the camera from the observations in the image. We also show how these candidate poses can be integrated into a particle filtering framework to guide the particle set toward the peaks of the distribution. Motivated by the UAV indoor localization problem where GPS signal is not available, we validate the algorithm on real image sequences from UAV flights.
Lai, Yen-Ting; Cheng, Chao-Sheng; Liu, Yu-Nan; Liu, Yaw-Jen; Lyu, Ping-Chiang
2008-09-01
Plant nonspecific lipid transfer proteins (nsLTPs) are small, basic proteins constituted mainly of alpha-helices and stabilized by four conserved disulfide bridges. They are characterized by the presence of a tunnel-like hydrophobic cavity, capable of transferring various lipid molecules between lipid bilayers in vitro. In this study, molecular dynamics (MD) simulations were performed at room temperature to investigate the effects of lipid binding on the dynamic properties of rice nsLTP1. Rice nsLTP1, either in the free form or complexed with one or two lipids was subjected to MD simulations. The C-terminal loop was very flexible both before and after lipid binding, as revealed by calculating the root-mean-square fluctuation. After lipid binding, the flexibility of some residues that were not in direct contact with lipid molecules increased significantly, indicating an increase of entropy in the region distal from the binding site. Essential dynamics analysis revealed clear differences in motion between unliganded and liganded rice nsLTP1s. In the free form of rice nsLTP1, loop1 exhibited the largest directional motion. This specific essential motion mode diminished after binding one or two lipid molecules. To verify the origin of the essential motion observed in the free form of rice nsLTP1, we performed multiple sequence alignments to probe the intrinsic motion encoded in the primary sequence. We found that the amino acid sequence of loop1 is highly conserved among plant nsLTP1s, thus revealing its functional importance during evolution. Furthermore, the sequence of loop1 is composed mainly of amino acids with short side chains. In this study, we show that MD simulations, together with essential dynamics analysis, can be used to determine structural and dynamic differences of rice nsLTP1 upon lipid binding. 2008 Wiley-Liss, Inc.
ISOL@: an Italian SOLAnaceae genomics resource.
Chiusano, Maria Luisa; D'Agostino, Nunzio; Traini, Alessandra; Licciardello, Concetta; Raimondo, Enrico; Aversano, Mario; Frusciante, Luigi; Monti, Luigi
2008-03-26
Present-day '-omics' technologies produce overwhelming amounts of data which include genome sequences, information on gene expression (transcripts and proteins) and on cell metabolic status. These data represent multiple aspects of a biological system and need to be investigated as a whole to shed light on the mechanisms which underpin the system functionality. The gathering and convergence of data generated by high-throughput technologies, the effective integration of different data-sources and the analysis of the information content based on comparative approaches are key methods for meaningful biological interpretations. In the frame of the International Solanaceae Genome Project, we propose here ISOLA, an Italian SOLAnaceae genomics resource. ISOLA (available at http://biosrv.cab.unina.it/isola) represents a trial platform and it is conceived as a multi-level computational environment. ISOLA currently consists of two main levels: the genome and the expression level. The cornerstone of the genome level is represented by the Solanum lycopersicum genome draft sequences generated by the International Tomato Genome Sequencing Consortium. Instead, the basic element of the expression level is the transcriptome information from different Solanaceae species, mainly in the form of species-specific comprehensive collections of Expressed Sequence Tags (ESTs). The cross-talk between the genome and the expression levels is based on data source sharing and on tools that enhance data quality, that extract information content from the levels' under parts and produce value-added biological knowledge. ISOLA is the result of a bioinformatics effort that addresses the challenges of the post-genomics era. It is designed to exploit '-omics' data based on effective integration to acquire biological knowledge and to approach a systems biology view.
Beyond providing experimental biologists with a preliminary annotation of the tomato genome, this effort aims to produce a trial computational environment where different aspects and details are maintained as they are relevant for the analysis of the organization, the functionality and the evolution of the Solanaceae family.
Sequence Segmentation with changeptGUI.
Tasker, Edward; Keith, Jonathan M
2017-01-01
Many biological sequences have a segmental structure that can provide valuable clues to their content, structure, and function. The program changept is a tool for investigating the segmental structure of a sequence, and can also be applied to multiple sequences in parallel to identify a common segmental structure, thus providing a method for integrating multiple data types to identify functional elements in genomes. In the previous edition of this book, a command line interface for changept is described. Here we present a graphical user interface for this package, called changeptGUI. This interface also includes tools for pre- and post-processing of data and results to facilitate investigation of the number and characteristics of segment classes.
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences
Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong
2015-01-01
We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate, slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory. PMID:25549288
Battig, Mark R; Soontornworajit, Boonchoy; Wang, Yong
2012-08-01
Polymeric delivery systems have been extensively studied to achieve localized and controlled release of protein drugs. However, it is still challenging to control the release of multiple protein drugs in distinct stages according to the progress of disease or treatment. This study successfully demonstrates that multiple protein drugs can be released from aptamer-functionalized hydrogels with adjustable release rates at predetermined time points using complementary sequences (CSs) as biomolecular triggers. Because both aptamer-protein interactions and aptamer-CS hybridization are sequence-specific, aptamer-functionalized hydrogels constitute a promising polymeric delivery system for the programmable release of multiple protein drugs to treat complex human diseases.
Chen, DaYang; Zhen, HeFu; Qiu, Yong; Liu, Ping; Zeng, Peng; Xia, Jun; Shi, QianYu; Xie, Lin; Zhu, Zhu; Gao, Ya; Huang, GuoDong; Wang, Jian; Yang, HuanMing; Chen, Fang
2018-03-21
Research based on a strategy of single-cell low-coverage whole genome sequencing (SLWGS) has enabled better reproducibility and accuracy for detection of copy number variations (CNVs). The whole genome amplification (WGA) method and sequencing platform are critical factors for successful SLWGS (<0.1 × coverage). In this study, we compared single cell and multiple cells sequencing data produced by the HiSeq2000 and Ion Proton platforms using two WGA kits and then comprehensively evaluated the GC-bias, reproducibility, uniformity and CNV detection among different experimental combinations. Our analysis demonstrated that, on the HiSeq2000 platform, the PicoPLEX WGA Kit resulted in higher reproducibility and lower sequencing error frequency but more GC-bias than the GenomePlex Single Cell WGA Kit (WGA4 kit), independent of the cell number. On the Ion Proton platform, by contrast, the WGA4 kit (both single cell and multiple cells) had higher uniformity and less GC-bias but lower reproducibility than the PicoPLEX WGA Kit. Moreover, on these two sequencing platforms, the two WGA kits differed in both sensitivity and specificity of CNV detection, depending on cell number. The results can help researchers who plan to use SLWGS on single or multiple cells to select appropriate experimental conditions for their applications.
Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H
2015-08-19
Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.
A Multiple-Track Nursing Sequence: Supplement to Research Report No. 1.
ERIC Educational Resources Information Center
Gilpatrick, Eleanor
Following a survey of 2,361 practical nurses in New York City municipal hospitals in 1968, a specific multiple-track nursing sequence was developed to meet manpower shortages and upgrade licensed practical nurses (LPN's) to registered nurses (RN's) and nurse's aides (NA's) to LPN's. The two models designed were for use in New York City but it is…
NASA Astrophysics Data System (ADS)
Qiu, Junchao; Zhang, Lin; Li, Diyang; Liu, Xingcheng
2016-06-01
Chaotic sequences can be applied to realize multiple user access and improve the system security for a visible light communication (VLC) system. However, since the map patterns of chaotic sequences are usually well known, eavesdroppers can possibly derive the key parameters of chaotic sequences and subsequently retrieve the information. We design an advanced encryption standard (AES) interleaving aided multiple user access scheme to enhance the security of a chaotic code division multiple access-based visible light communication (C-CDMA-VLC) system. We propose to spread the information with chaotic sequences, and then the spread information is interleaved by an AES algorithm and transmitted over VLC channels. Since the computation complexity of performing inverse operations to deinterleave the information is high, the eavesdroppers in a high speed VLC system cannot retrieve the information in real time; thus, the system security will be enhanced. Moreover, we build a mathematical model for the AES-aided VLC system and derive the theoretical information leakage to analyze the system security. The simulations are performed over VLC channels, and the results demonstrate the effectiveness and high security of our presented AES interleaving aided chaotic CDMA-VLC system.
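The spreading step described in this abstract can be illustrated with a minimal sketch. It assumes a logistic map as the chaotic generator and a simple sign threshold to obtain chips; neither choice is stated in the abstract, and the AES interleaving stage and the VLC channel are deliberately left out. All function names and parameter values are hypothetical.

```python
from typing import List


def logistic_chips(x0: float, n: int, r: float = 3.99) -> List[int]:
    """Generate n bipolar chips (+1/-1) from a logistic map (assumed generator)."""
    chips = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)              # logistic map iteration
        chips.append(1 if x >= 0.5 else -1)  # threshold to a bipolar chip
    return chips


def spread(bits: List[int], chips: List[int]) -> List[int]:
    """Multiply each data bit (mapped to +1/-1) by the whole chip sequence."""
    return [(1 if b else -1) * c for b in bits for c in chips]


def despread(signal: List[int], chips: List[int]) -> List[int]:
    """Correlate each chip-length block with the user's chips and decide the bit."""
    n = len(chips)
    out = []
    for k in range(0, len(signal), n):
        corr = sum(s * c for s, c in zip(signal[k:k + n], chips))
        out.append(1 if corr > 0 else 0)
    return out


user_chips = logistic_chips(x0=0.3, n=16)   # x0 plays the role of a per-user key
data = [1, 0, 1, 1, 0]
recovered = despread(spread(data, user_chips), user_chips)
```

In a noiseless single-user setting the correlation equals plus or minus the chip count, so the receiver that knows the initial value recovers the bits exactly. The sketch does not model multi-user interference or the security analysis reported by the authors.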
Chaw, R. Crystal; Collin, Matthew; Wimmer, Marjorie; Helmrick, Kara-Leigh; Hayashi, Cheryl Y.
2017-01-01
Spiders swath their eggs with silk to protect developing embryos and hatchlings. Egg case silks, like other fibrous spider silks, are primarily composed of proteins called spidroins (spidroin = spider-fibroin). Silks, and thus spidroins, are important throughout the lives of spiders, yet the evolution of spidroin genes has been relatively understudied. Spidroin genes are notoriously difficult to sequence because they are typically very long (≥ 10 kb of coding sequence) and highly repetitive. Here, we investigate the evolution of spider silk genes through long-read sequencing of Bacterial Artificial Chromosome (BAC) clones. We demonstrate that the silver garden spider Argiope argentata has multiple egg case spidroin loci with a loss of function at one locus. We also use degenerate PCR primers to search the genomic DNA of congeneric species and find evidence for multiple egg case spidroin loci in other Argiope spiders. Comparative analyses show that these multiple loci are more similar at the nucleotide level within a species than between species. This pattern is consistent with concerted evolution homogenizing gene copies within a genome. More complicated explanations include convergent evolution or recent independent gene duplications within each species. PMID:29127108
Dessimoz, Christophe; Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro
2011-09-01
Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references.
Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro
2011-01-01
Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references. PMID:21712341
GeneSilico protein structure prediction meta-server.
Kurowski, Michal A; Bujnicki, Janusz M
2003-07-01
Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta.
GeneSilico protein structure prediction meta-server
Kurowski, Michal A.; Bujnicki, Janusz M.
2003-01-01
Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313
Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V
2003-01-01
In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly used idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments, which has been successfully applied to large-size search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running time with fully comparable alignment quality. A successful biological application showing high amino acid conservation during evolution of Xenopus laevis SOD2 is also cited.
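The 1-median definition given in this abstract can be made concrete with a small sketch of the exact, quadratic-time computation. This is not the randomized-tournament approximation the authors use; the Hamming distance and the toy sequences are assumptions chosen only to keep the example short.

```python
from typing import Callable, Sequence


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def one_median(seqs: Sequence[str],
               dist: Callable[[str, str], int] = hamming) -> str:
    """Return the element of seqs with the smallest total distance to the others.

    Minimizing the total distance is equivalent to minimizing the average.
    Every pair is compared, so the cost is quadratic in len(seqs).
    """
    if not seqs:
        raise ValueError("an empty set has no 1-median")
    return min(seqs, key=lambda s: sum(dist(s, t) for t in seqs))


cluster = ["ACGT", "ACGA", "ACCA", "TCGA"]  # toy cluster, not real data
center = one_median(cluster)                # "ACGA": total distance 3, others 5
```

The quadratic cost of this exact version is what motivates an approximate linear-time 1-median for large clusters.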
Mahoney, J. Matthew; Titiz, Ali S.; Hernan, Amanda E.; Scott, Rod C.
2016-01-01
Hippocampal neural systems consolidate multiple complex behaviors into memory. However, the temporal structure of neural firing supporting complex memory consolidation is unknown. Replay of hippocampal place cells during sleep supports the view that a simple repetitive behavior modifies sleep firing dynamics, but does not explain how multiple episodes could be integrated into associative networks for recollection during future cognition. Here we decode sequential firing structure within spike avalanches of all pyramidal cells recorded in sleeping rats after running in a circular track. We find that short sequences that combine into multiple long sequences capture the majority of the sequential structure during sleep, including replay of hippocampal place cells. The ensemble, however, is not optimized for maximally producing the behavior-enriched episode. Thus behavioral programming of sequential correlations occurs at the level of short-range interactions, not whole behavioral sequences and these short sequences are assembled into a large and complex milieu that could support complex memory consolidation. PMID:26866597
Zhang, L J; Dong, W X; Guo, S M; Wang, Y X; Wang, A D; Lu, X J
2015-11-19
This study aims to explore the roles of somatic embryogenesis receptor-like kinase (SERK) in Malus hupehensis (Pingyi Tiancha). The full-length sequences of SERK1 in triploid Pingyi Tiancha (3n) and a tetraploid hybrid strain 33# (4n) were cloned, sequenced, and designated as MhSERK1 and MhdSERK1, respectively. Multiple alignments of amino acid sequences were conducted to identify similarity between MhSERK1 and MhdSERK1 and SERK sequences in other species, and a neighbor-joining phylogenetic tree was constructed to elucidate their phylogenetic relations. Expression levels of MhSERK1 and MhdSERK1 in different tissues and developmental stages were investigated using quantitative real-time PCR. The coding sequence lengths of MhSERK1 and MhdSERK1 were 1899 bp (encoding 632 amino acids) and 1881 bp (encoding 626 amino acids), respectively. Sequence analysis demonstrated that MhSERK1 and MhdSERK1 display high similarity to SERKs in other species, with a conserved intron/exon structure that is unique to members of the SERK family. Additionally, the phylogenetic tree showed that MhSERK1 and MhdSERK1 clustered with orange CitSERK (93%). Furthermore, MhSERK1 and MhdSERK1 were mainly expressed in the reproductive organs, in particular the ovary. Their expression levels were highest in young flowers and they differed among different tissues and organs. Our results suggest that MhSERK1 and MhdSERK1 are related to plant reproduction, and that MhSERK1 is related to apomixis in triploid Pingyi Tiancha.
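The reported coding sequence lengths and protein lengths can be cross-checked with simple codon arithmetic. The sketch below assumes that each quoted length is a complete coding sequence that includes one terminal stop codon, which the abstract does not state explicitly.

```python
def protein_length(cds_bp: int) -> int:
    """Amino acids encoded by a complete CDS that ends in one stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")
    return cds_bp // 3 - 1  # the final codon is the stop codon, not a residue


mhserk1_aa = protein_length(1899)   # 633 codons -> 632 amino acids
mhdserk1_aa = protein_length(1881)  # 627 codons -> 626 amino acids
```

Under that assumption both values agree with the 632 and 626 amino acids quoted in the abstract.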
Adhikari, Badri; Hou, Jie; Cheng, Jianlin
2018-03-01
In this study, we report the evaluation of the residue-residue contacts predicted by our three different methods in the CASP12 experiment, focusing on studying the impact of multiple sequence alignment, residue coevolution, and machine learning on contact prediction. The first method (MULTICOM-NOVEL) uses only traditional features (sequence profile, secondary structure, and solvent accessibility) with deep learning to predict contacts and serves as a baseline. The second method (MULTICOM-CONSTRUCT) uses our new alignment algorithm to generate deep multiple sequence alignment to derive coevolution-based features, which are integrated by a neural network method to predict contacts. The third method (MULTICOM-CLUSTER) is a consensus combination of the predictions of the first two methods. We evaluated our methods on 94 CASP12 domains. On a subset of 38 free-modeling domains, our methods achieved an average precision of up to 41.7% for top L/5 long-range contact predictions. The comparison of the three methods shows that the quality and effective depth of multiple sequence alignments, coevolution-based features, and machine learning integration of coevolution-based features and traditional features drive the quality of predicted protein contacts. On the full CASP12 dataset, the coevolution-based features alone can improve the average precision from 28.4% to 41.6%, and the machine learning integration of all the features further raises the precision to 56.3%, when top L/5 predicted long-range contacts are evaluated. And the correlation between the precision of contact prediction and the logarithm of the number of effective sequences in alignments is 0.66. © 2017 Wiley Periodicals, Inc.
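The "top L/5 long-range" precision quoted in this abstract can be sketched as follows. The sketch assumes the usual CASP convention that long-range means a sequence separation of at least 24 residues, and it divides by the number of predictions actually selected; the function name and the toy data are hypothetical and are not taken from the MULTICOM software.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[int, int]


def top_l5_long_range_precision(
    predictions: Iterable[Tuple[int, int, float]],
    true_contacts: Set[Pair],
    seq_len: int,
    min_separation: int = 24,   # assumed long-range threshold
) -> float:
    """Precision of the L/5 most confident long-range predicted contacts."""
    # Keep only residue pairs that are far apart along the sequence.
    long_range = [(i, j, p) for i, j, p in predictions
                  if abs(i - j) >= min_separation]
    # Rank by predicted probability, most confident first.
    long_range.sort(key=lambda t: t[2], reverse=True)
    top = long_range[:max(1, seq_len // 5)]
    if not top:
        return 0.0
    hits = sum((min(i, j), max(i, j)) in true_contacts for i, j, _ in top)
    return hits / len(top)


preds = [(1, 30, 0.9), (2, 40, 0.8), (3, 5, 0.99), (4, 50, 0.1)]
truth = {(1, 30), (4, 50)}
precision = top_l5_long_range_precision(preds, truth, seq_len=10)
```

In the toy case the short-range pair (3, 5) is discarded despite its high score, the top two long-range pairs are kept, and one of them is a true contact.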
MetaMeta: integrating metagenome analysis tools to improve taxonomic profiling.
Piro, Vitor C; Matschkowski, Marcel; Renard, Bernhard Y
2017-08-14
Many metagenome analysis tools are presently available to classify sequences and profile environmental samples. In particular, taxonomic profiling and binning methods are commonly used for such tasks. Tools available among these two categories make use of several techniques, e.g., read mapping, k-mer alignment, and composition analysis. Variations on the construction of the corresponding reference sequence databases are also common. In addition, different tools provide good results in different datasets and configurations. All this variation makes it complicated for researchers to decide which methods to use. Installation, configuration and execution can also be difficult, especially when dealing with multiple datasets and tools. We propose MetaMeta: a pipeline to execute and integrate results from metagenome analysis tools. MetaMeta provides an easy workflow to run multiple tools with multiple samples, producing a single enhanced output profile for each sample. MetaMeta includes database generation, pre-processing, execution, and integration steps, allowing easy execution and parallelization. The integration relies on the co-occurrence of organisms from different methods as the main feature to improve community profiling while accounting for differences in their databases. In a controlled case with simulated and real data, we show that the integrated profiles of MetaMeta outperform the best single profile. Using the same input data, it provides more sensitive and reliable results, with the presence of each organism being supported by several methods. MetaMeta uses Snakemake and has six pre-configured tools, all available through the BioConda channel for easy installation (conda install -c bioconda metameta). The MetaMeta pipeline is open-source and can be downloaded at: https://gitlab.com/rki_bioinformatics .
A Statistical Study of Brown Dwarf Companions from the SDSS-III MARVELS Survey
NASA Astrophysics Data System (ADS)
Grieves, Nolan; Ge, Jian; Thomas, Neil; Ma, Bo; De Lee, Nathan M.; Lee, Brian L.; Fleming, Scott W.; Sithajan, Sirinrat; Varosi, Frank; Liu, Jian; Zhao, Bo; Li, Rui; Agol, Eric; MARVELS Team
2016-01-01
We present 23 new Brown Dwarf (BD) candidates from the Multi-object APO Radial-Velocity Exoplanet Large-Area Survey (MARVELS) of the Sloan Digital Sky Survey III (SDSS-III). The BD candidates were selected from the processed MARVELS data using the latest University of Florida 2D pipeline, which shows significant improvement and reduction of systematic errors over the 1D pipeline results included in the SDSS Data Release 12. This sample is the largest BD yield from a single radial velocity survey. Of the 23 candidates, 18 are around main sequence stars and 5 are around giant stars. Given a giant contamination rate of ~24% for the MARVELS survey, we find a BD occurrence rate around main sequence stars of ~0.7%, which agrees with previous studies and confirms the BD desert, while the BD occurrence rate around the MARVELS giant stars is ~0.6%. Preliminary results show that our new candidates around solar type stars support a two population hypothesis, where BDs are divided at a mass of ~42.5 MJup. BDs less massive than 42.5 MJup have eccentricity distributions consistent with planet-planet scattering models, where BDs more massive than 42.5 MJup have both period and eccentricity distributions similar to that of stellar binaries. Special Brown Dwarf systems such as multiple BD systems and highly eccentric BDs will also be presented.
A Photometric Search for Planets in the Open Cluster NGC 7086
NASA Astrophysics Data System (ADS)
Rosvick, Joanne M.; Robb, Russell
2006-12-01
In an attempt to discover short-period, Jupiter-mass planets orbiting solar-type stars in open clusters, we searched for planetary transits in the populous and relatively unstudied open cluster NGC 7086. A color-magnitude diagram constructed from new B and V photometry is presented, along with revised estimates of the cluster's color excess, distance modulus, and age. Several turnoff stars were observed spectroscopically in order to determine a color excess of E(B-V)=0.83+/-0.02. Empirically fitting the main sequences of two young open clusters and the semiempirical zero-age main sequence of Vandenberg and Poll yielded a distance modulus of (V-MV)=13.4+/-0.3 mag. This corresponds to a true distance modulus of (m-M)0=10.8 mag or a distance of 1.5 kpc to NGC 7086. These values were used with isochrones from the Padova group to obtain a cluster age of 100 Myr. Eleven nights of R-band photometry were used to search for planetary transits. Differential magnitudes were constructed for each star in the cluster. Light curves for each star were produced on a night-to-night basis and inspected for variability. No planetary transits were apparent; however, some interesting variable stars were discovered: a pulsating variable that appears to be a member of the γ Dor class and four possible eclipsing binary stars, one of which actually may be a multiple system.
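The step from the apparent distance modulus to the true distance modulus and the distance can be reproduced with a short calculation. The sketch assumes the standard ratio of total to selective extinction R_V = 3.1, which the abstract does not quote, so the result is only expected to match the published values to within rounding.

```python
def true_distance_modulus(apparent_modulus: float, e_bv: float,
                          r_v: float = 3.1) -> float:
    """Remove visual extinction A_V = R_V * E(B-V) from the apparent modulus."""
    return apparent_modulus - r_v * e_bv


def distance_pc(true_modulus: float) -> float:
    """Invert (m - M)_0 = 5 log10(d / 10 pc) to get the distance in parsecs."""
    return 10 ** (true_modulus / 5 + 1)


mu0 = true_distance_modulus(13.4, 0.83)   # about 10.8 mag
d_kpc = distance_pc(mu0) / 1000           # about 1.5 kpc
```

With these inputs the extinction is roughly 2.6 mag, which is consistent with the true distance modulus of 10.8 mag and the distance of 1.5 kpc given in the abstract.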
Machado, Milla de Andrade; Cardoso, Adauto Lima; Milhomem-Paixão, Susana Suely Rodrigues; Pieczarka, Julio Cesar; Nagamachi, Cleusa Yoshiko
2017-10-01
Gymnotus coatesi is a small and rare species of banded knife fish that was originally described by LaMonte in 1935, found along the main stretch of the Amazon River. There is no described cytogenetic data on this species. We analyzed the karyotype of five specimens of G. coatesi collected from Cururutuia Stream in Bragança, Pará, Brazil. The obtained diploid number is 50 and the karyotypic formula is 24 m/sm +26 st/a. The constitutive heterochromatin is DAPI positive and distributed mainly in the centromeric and pericentromeric regions of the chromosomes. Ag-nucleolus organizer regions staining showed nine active sites. The 5S rDNA probe hybridized chromosome pair 17 in the interstitial part of the long arm. Fluorescence in situ hybridization (FISH) with telomeric probes revealed signals only at terminal regions of the chromosomes. The 18S rDNA probe hybridized to 21 sites, and these signals colocalized with the telomeric sequences. This relatively high number of 18S rDNA sites may reflect gene duplication mediated by transposable elements. These results indicate that although the diploid number of G. coatesi is within the range previously observed for other members of the genus, various karyotypic characteristics distinguish G. coatesi from the other species of the genus and members of the Gymnotiform order.
Common Warm Dust Temperatures Around Main Sequence Stars
NASA Technical Reports Server (NTRS)
Morales, Farisa; Rieke, George; Werner, Michael; Stapelfeldt, Karl; Bryden, Geoffrey; Su, Kate
2011-01-01
We compare the properties of warm dust emission from a sample of main-sequence A-type stars (B8-A7) to those of dust around solar-type stars (F5-K0) with similar Spitzer Space Telescope Infrared Spectrograph/MIPS data and similar ages. Both samples include stars with sources with infrared spectral energy distributions that show evidence of multiple components. Over the range of stellar types considered, we obtain nearly the same characteristic dust temperatures (∼190 K and ∼60 K for the inner and outer dust components, respectively), slightly above the ice evaporation temperature for the inner belts. The warm inner dust temperature is readily explained if populations of small grains are being released by sublimation of ice from icy planetesimals. Evaporation of low-eccentricity icy bodies at ∼150 K can deposit particles into an inner/warm belt, where the small grains are heated to dust temperatures of ∼190 K. Alternatively, enhanced collisional processing of an asteroid belt-like system of parent planetesimals just interior to the snow line may account for the observed uniformity in dust temperature. The similarity in temperature of the warmer dust across our B8-K0 stellar sample strongly suggests that dust-producing planetesimals are not found at similar radial locations around all stars, but that dust production is favored at a characteristic temperature horizon.
Zhang, Shuai; Lun, Zhao-Rong; Wu, Zhong-Dao; Fan, Chia-Kwung; Brown, Christopher L.; Cheng, Po-Ching; Peng, Shih-Yi; Yang, Ting-Bao
2017-01-01
Angiostrongylus cantonensis is of increasing public health importance as the main zoonotic pathogen causing eosinophilic meningitis or meningoencephalitis, which has been documented all over the world. However, there are very limited studies about its phylogeography and spread pattern. In the present study, the phylogeography of A. cantonensis in southern China (including Taiwan) and partial areas of Southeast Asia were studied based on the sequences of complete mitochondrial cytochrome b (Cytb) gene. A total of 520 individuals of A. cantonensis obtained from 13 localities were sequenced for the analyses and grouped into 42 defined haplotypes. The phylogenetic tree (NJ tree and BI tree) revealed a characteristic distribution pattern of the four main lineages, with detectable geographic structure. Genetic differentiation among populations was significant, but demographic expansion could not be detected by either neutrality tests or mismatch distribution analysis, which implied a low gene flow among the local populations in different regions where the samples were collected. Two unique lineages of the A. cantonensis population in Taiwan were detected, which suggests its multiple origin in the island. Populations in Hekou (China) and Laos showed the highest genetic diversities, which were supported by both genetic diversity indices and AMOVA. These results together infer that the area around Thailand or Hekou in Yunnan province, China are the most likely origins of Angiostrongylus cantonensis. PMID:28827809
Peng, Jian; He, Zhang-Ping; Zhang, Shuai; Lun, Zhao-Rong; Wu, Zhong-Dao; Fan, Chia-Kwung; Brown, Christopher L; Cheng, Po-Ching; Peng, Shih-Yi; Yang, Ting-Bao
2017-08-01
Angiostrongylus cantonensis is of increasing public health importance as the main zoonotic pathogen causing eosinophilic meningitis or meningoencephalitis, which has been documented all over the world. However, there are very limited studies about its phylogeography and spread pattern. In the present study, the phylogeography of A. cantonensis in southern China (including Taiwan) and partial areas of Southeast Asia were studied based on the sequences of complete mitochondrial cytochrome b (Cytb) gene. A total of 520 individuals of A. cantonensis obtained from 13 localities were sequenced for the analyses and grouped into 42 defined haplotypes. The phylogenetic tree (NJ tree and BI tree) revealed a characteristic distribution pattern of the four main lineages, with detectable geographic structure. Genetic differentiation among populations was significant, but demographic expansion could not be detected by either neutrality tests or mismatch distribution analysis, which implied a low gene flow among the local populations in different regions where the samples were collected. Two unique lineages of the A. cantonensis population in Taiwan were detected, which suggests its multiple origin in the island. Populations in Hekou (China) and Laos showed the highest genetic diversities, which were supported by both genetic diversity indices and AMOVA. These results together infer that the area around Thailand or Hekou in Yunnan province, China are the most likely origins of Angiostrongylus cantonensis.
Mori, J.
1991-01-01
Event record sections, which are constructed by plotting seismograms from many closely spaced earthquakes recorded on a few stations, show multiple free-surface reflections (PP, PPP, PPPP) of the P wave in the Imperial Valley. The relative timing of these arrivals is used to estimate the strength of the P-wave velocity gradient within the upper 5 km of the sediment layer. Consistent with previous studies, a velocity model with a value of 1.8 km/sec at the surface increasing linearly to 5.8 km/sec at a depth of 5.5 km fits the data well. The relative amplitudes of the P and PP arrivals are used to estimate the source depth for the aftershock distributions of the Elmore Ranch and Superstition Hills main shocks. Although the depth determination has large uncertainties, both the Elmore Ranch and Superstition Hills aftershock sequences appear to have similar depth distributions in the range of 4 to 10 km.
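The linear velocity model quoted in this abstract can be written out numerically. The sketch below is only an illustration, not the author's code: it takes the three numbers given in the abstract (1.8 km/sec at the surface, 5.8 km/sec at 5.5 km depth) and, as an added assumption, evaluates the one-way vertical travel time by integrating dz / v(z) for a linear gradient. The function names are hypothetical.

```python
import math

# Values taken from the abstract.
V_SURFACE = 1.8   # km/s at the surface
V_BOTTOM = 5.8    # km/s at the base of the gradient
Z_BOTTOM = 5.5    # km, depth at which V_BOTTOM is reached

# Velocity gradient k in (km/s)/km, i.e. 1/s.
GRADIENT = (V_BOTTOM - V_SURFACE) / Z_BOTTOM


def velocity(depth_km):
    """P-wave velocity (km/s) at a given depth for the linear model."""
    return V_SURFACE + GRADIENT * depth_km


def vertical_travel_time(depth_km):
    """One-way vertical travel time (s) from the surface to depth_km.

    For v(z) = v0 + k*z, the integral of dz / v(z) is ln(v(z) / v0) / k.
    Vertical incidence is an assumption made here for simplicity.
    """
    return math.log(velocity(depth_km) / V_SURFACE) / GRADIENT


if __name__ == "__main__":
    print(f"gradient: {GRADIENT:.3f} 1/s")
    print(f"v(2.0 km): {velocity(2.0):.2f} km/s")
    print(f"t(5.5 km): {vertical_travel_time(5.5):.2f} s")
```

The gradient works out to roughly 0.73 km/sec per km; real PP and PPP timing would require ray tracing at oblique incidence, which this sketch does not attempt.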
A resettable and reprogrammable DNA-based security system to identify multiple users with hierarchy.
Li, Hailong; Hong, Wei; Dong, Shaojun; Liu, Yaqing; Wang, Erkang
2014-03-25
Molecular-level security devices have raised ever-increasing interest in recent years as a way to protect data and information from illegal invasion. Prior molecular keypad locks have an output signal that depends not only on the appropriate combination but also on the exact sequence of inputs, but they cannot be reset or reprogrammed. Here, a DNA-based security system with a reset function and a previously unreported reprogramming function is successfully developed as a proof of principle, with which one can change the password if the system is cracked. The previous password becomes invalid in the reprogrammed security system. Interestingly, more than one password is designed to permit access by multiple users. By harnessing the intrinsic merit of the different passwords, the system can distinguish which user is endowed with prior authority. The intelligent device is addressed on a solid support and facilitates electronic processes, avoiding chemical accumulation in the system by simple removal of the electrode from the input solution and indicating a main avenue for its further development.
Real-time vehicle matching for multi-camera tunnel surveillance
NASA Astrophysics Data System (ADS)
Jelača, Vedran; Niño Castañeda, Jorge Oswaldo; Frías-Velázquez, Andrés; Pižurica, Aleksandra; Philips, Wilfried
2011-03-01
Tracking multiple vehicles with multiple cameras is a challenging problem of great importance in tunnel surveillance. One of the main challenges is accurate vehicle matching across cameras with non-overlapping fields of view. Since systems dedicated to this task can contain hundreds of cameras, each observing dozens of vehicles, computational efficiency is essential for real-time performance. In this paper, we propose a low-complexity yet highly accurate method for vehicle matching using vehicle signatures composed of Radon-transform-like projection profiles of the vehicle image. The proposed signatures can be calculated by a simple scan-line algorithm in the camera software itself and transmitted to the central server or to the other cameras in a smart camera environment. The amount of data is drastically reduced compared to the whole image, which relaxes the data link capacity requirements. Experiments on real vehicle images, extracted from video sequences recorded in a tunnel by two distant security cameras, validate our approach.
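The idea of a projection-profile signature can be illustrated with a short sketch. This is not the authors' method: it assumes only horizontal and vertical sum profiles (the paper's Radon-transform-like profiles may use other directions), assumes both images have the same size, and uses a zero-mean normalized correlation as the matching score. All function names are hypothetical.

```python
import math


def _normalize(profile):
    """Return a zero-mean, unit-length copy of a 1-D profile."""
    mean = sum(profile) / len(profile)
    centered = [p - mean for p in profile]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered] if norm > 0 else centered


def projection_signature(image):
    """Compute row-sum and column-sum profiles of a 2-D intensity image.

    `image` is a list of equal-length rows. Each profile needs a single
    scan over the pixels, and the signature is far smaller than the image.
    """
    row_profile = [sum(row) for row in image]
    col_profile = [sum(col) for col in zip(*image)]
    return _normalize(row_profile), _normalize(col_profile)


def similarity(sig_a, sig_b):
    """Mean correlation of the two profile pairs (1.0 means identical shape)."""
    scores = [sum(x * y for x, y in zip(a, b)) for a, b in zip(sig_a, sig_b)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    vehicle = [[0, 1, 0], [1, 5, 1], [0, 1, 0]]
    other = [[5, 1, 0], [1, 0, 0], [0, 0, 1]]
    sig_vehicle = projection_signature(vehicle)
    print(similarity(sig_vehicle, sig_vehicle))
    print(similarity(sig_vehicle, projection_signature(other)))
```

For an H x W image the signature holds H + W values instead of H x W pixels, which is the kind of data reduction the abstract refers to.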
Surface immobilized azomethine for multiple component exchange.
Lerond, Michael; Bélanger, Daniel; Skene, W G
2017-09-27
Diazonium chemistry concomitant with in situ electrochemical reduction was used to graft an aryl aldehyde to indium-tin oxide (ITO) coated glass substrates. This served as an anchor for preparing electroactive azomethines that were covalently bonded to the transparent electrode. The immobilized azomethines could undergo multiple step-wise component exchanges with different arylamines. The write-erase-write sequences were electrochemically confirmed. The azomethines could also be reversibly hydrolyzed. This was exploited for multiple azomethine-hydrolysis cycles resulting in discrete electroactive immobilized azomethines. The erase-rewrite sequences were also electrochemically confirmed.
Gomez-Smith, C Kimloi; LaPara, Timothy M; Hozalski, Raymond M
2015-07-21
The quantity and composition of bacterial biofilms growing on 10 water mains from a full-scale chloraminated water distribution system were analyzed using real-time PCR targeting the 16S rRNA gene and next-generation, high-throughput Illumina sequencing. Water mains with corrosion tubercles supported the greatest amount of bacterial biomass (n = 25; geometric mean = 2.5 × 10^7 copies cm^-2), which was significantly higher (P = 0.04) than cement-lined cast-iron mains (n = 6; geometric mean = 2.0 × 10^6 copies cm^-2). Despite spatial variation of community composition and bacterial abundance in water main biofilms, the communities on the interior main surfaces were surprisingly similar, containing a core group of operational taxonomic units (OTUs) assigned to only 17 different genera. Bacteria from the genus Mycobacterium dominated all communities at the main wall-bulk water interface (25-78% of the community), regardless of main age, estimated water age, main material, and the presence of corrosion products. Further sequencing of the mycobacterial heat shock protein gene (hsp65) provided species-level taxonomic resolution of mycobacteria. The two dominant Mycobacteria present, M. frederiksbergense (arithmetic mean = 85.7% of hsp65 sequences) and M. aurum (arithmetic mean = 6.5% of hsp65 sequences), are generally considered to be nonpathogenic. Two opportunistic pathogens, however, were detected at low numbers: M. hemophilum (arithmetic mean = 1.5% of hsp65 sequences) and M. abscessus (arithmetic mean = 0.006% of hsp65 sequences). Sulfate-reducing bacteria from the genus Desulfovibrio, which have been implicated in microbially influenced corrosion, dominated all communities located underneath corrosion tubercles (arithmetic mean = 67.5% of the community). This research provides novel insights into the quantity and composition of biofilms in full-scale drinking water distribution systems, which is critical for assessing the risks to public health and to the water supply infrastructure.
SEAN: SNP prediction and display program utilizing EST sequence clusters.
Huntley, Derek; Baldo, Angela; Johri, Saurabh; Sergot, Marek
2006-02-15
SEAN is an application that predicts single nucleotide polymorphisms (SNPs) using multiple sequence alignments produced from expressed sequence tag (EST) clusters. The algorithm uses rules of sequence identity and SNP abundance to determine the quality of the prediction. A Java viewer is provided to display the EST alignments and predicted SNPs.
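A rule of the kind described here (sequence identity plus SNP abundance) can be sketched on a toy alignment. The code below is not SEAN's algorithm; it is a minimal illustration that assumes a hypothetical abundance rule: a column is reported as a candidate SNP only if the second most common base appears in at least `min_minor_count` sequences. The function name, parameter name, and threshold are all assumptions.

```python
from collections import Counter


def predict_snps(alignment, min_minor_count=2, gap="-"):
    """Report candidate SNP columns in a multiple sequence alignment.

    `alignment` is a list of equal-length aligned strings (for example,
    ESTs from one cluster). Returns a list of tuples:
    (column index, major base, minor base, minor base count).
    """
    candidates = []
    for position, column in enumerate(zip(*alignment)):
        counts = Counter(base for base in column if base != gap)
        if len(counts) < 2:
            continue  # column is fully conserved (or all gaps)
        ranked = counts.most_common()
        (major_base, _), (minor_base, minor_count) = ranked[0], ranked[1]
        # Abundance rule: a variant seen only once is treated as a likely
        # sequencing error rather than a polymorphism.
        if minor_count >= min_minor_count:
            candidates.append((position, major_base, minor_base, minor_count))
    return candidates


if __name__ == "__main__":
    ests = ["ACGT", "ACGT", "ATGT", "ATGT", "ACGA"]
    print(predict_snps(ests))
```

In this toy input, column 1 (C in three reads, T in two) is reported, while the single A in the last column falls below the threshold and is ignored.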
Genomic Sequencing: Assessing The Health Care System, Policy, And Big-Data Implications
Phillips, Kathryn A.; Trosman, Julia; Kelley, Robin K.; Pletcher, Mark J.; Douglas, Michael P.; Weldon, Christine B.
2014-01-01
New genomic sequencing technologies enable the high-speed analysis of multiple genes simultaneously, including all of those in a person's genome. Sequencing is a prominent example of a “big data” technology because of the massive amount of information it produces and its complexity, diversity, and timeliness. Our objective in this article is to provide a policy primer on sequencing and illustrate how it can affect health care system and policy issues. Toward this end, we developed an easily applied classification of sequencing based on inputs, methods, and outputs. We used it to examine the implications of sequencing for three health care system and policy issues: making care more patient-centered, developing coverage and reimbursement policies, and assessing economic value. We conclude that sequencing has great promise but that policy challenges include how to optimize patient engagement as well as privacy, develop coverage policies that distinguish research from clinical uses and account for bioinformatics costs, and determine the economic value of sequencing through complex economic models that take into account multiple findings and downstream costs. PMID:25006153
Genomic sequencing: assessing the health care system, policy, and big-data implications.
Phillips, Kathryn A; Trosman, Julia R; Kelley, Robin K; Pletcher, Mark J; Douglas, Michael P; Weldon, Christine B
2014-07-01
New genomic sequencing technologies enable the high-speed analysis of multiple genes simultaneously, including all of those in a person's genome. Sequencing is a prominent example of a "big data" technology because of the massive amount of information it produces and its complexity, diversity, and timeliness. Our objective in this article is to provide a policy primer on sequencing and illustrate how it can affect health care system and policy issues. Toward this end, we developed an easily applied classification of sequencing based on inputs, methods, and outputs. We used it to examine the implications of sequencing for three health care system and policy issues: making care more patient-centered, developing coverage and reimbursement policies, and assessing economic value. We conclude that sequencing has great promise but that policy challenges include how to optimize patient engagement as well as privacy, develop coverage policies that distinguish research from clinical uses and account for bioinformatics costs, and determine the economic value of sequencing through complex economic models that take into account multiple findings and downstream costs. Project HOPE—The People-to-People Health Foundation, Inc.
Coupling detrended fluctuation analysis for multiple warehouse-out behavioral sequences
NASA Astrophysics Data System (ADS)
Yao, Can-Zhong; Lin, Ji-Nan; Zheng, Xu-Zhou
2017-01-01
Interaction patterns among different warehouses could make the warehouse-out behavioral sequences less predictable. We first apply a coupling detrended fluctuation analysis to the warehouse-out quantity and find that the multivariate sequences exhibit significant coupling multifractal characteristics regardless of the type of steel product. Secondly, we trace the sources of the multifractal warehouse-out sequences by shuffling and surrogating the original ones, and we find that the fat-tailed distribution contributes more to the multifractal features than the long-term memory, regardless of the type of steel product. From the perspective of warehouse contribution, some warehouses steadily contribute more to multifractality than others. Finally, based on multiscale multifractal analysis, we propose a Hurst surface structure to investigate coupling multifractality, and show that multiple behavioral sequences exhibit significant coupling multifractal features that emerge in, and are usually restricted to, relatively larger time scale intervals.
WEB-server for search of a periodicity in amino acid and nucleotide sequences
NASA Astrophysics Data System (ADS)
Frenkel, F. E.; Skryabin, K. G.; Korotkov, E. V.
2017-12-01
A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server operation is based upon a new mathematical method of searching for multiple alignments, which is founded on the optimization of position weight matrices as well as on the implementation of two-dimensional dynamic programming. This approach allows the construction of multiple alignments of weakly similar amino acid and nucleotide sequences that have accumulated more than 1.5 substitutions per amino acid or nucleotide, without performing pairwise comparisons of the sequences. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as the information that can be obtained using the web server.
Oh, Jeongsu; Choi, Chi-Hwan; Park, Min-Kyu; Kim, Byung Kwon; Hwang, Kyuin; Lee, Sang-Heon; Hong, Soon Gyu; Nasir, Arshan; Cho, Wan-Sup; Kim, Kyung Mo
2016-01-01
High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to different organisms present in the environmental samples. Typically, analysis of microbial diversity in bioinformatics starts from pre-processing followed by clustering 16S rRNA reads into relatively fewer operational taxonomic units (OTUs). The OTUs are reliable indicators of microbial diversity and greatly accelerate the downstream analysis time. However, existing hierarchical clustering algorithms that are generally more accurate than greedy heuristic algorithms struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, which is the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology-a distributed data structure to store all data in the main memory of multiple computing nodes. The IMDG technology helps CLUSTOM-CLOUD to enhance both its capability of handling larger datasets and its computational scalability better than its ancestor, CLUSTOM, while maintaining high accuracy. Clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using the small laboratory cluster (10 nodes) and under the Amazon EC2 cloud-computing environments. Under the laboratory environment, it required only ~3 hours to process dataset of size 200 K reads regardless of the complexity of the human microbiome data. In turn, one million reads were processed in approximately 20, 14, and 11 hours when utilizing 20, 30, and 40 nodes on the Amazon EC2 cloud-computing environment. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is also a scalable distributed processing system. The comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm. 
CLUSTOM-CLOUD is written in JAVA and is freely available at http://clustomcloud.kopri.re.kr. PMID:26954507
AK Sco: a tidally induced atmospheric dynamo in a pre-main sequence binary?
NASA Astrophysics Data System (ADS)
Gómez de Castro, A. I.
2009-02-01
AK Sco is a unique source: a 10-30 Myr old pre-main sequence spectroscopic binary composed of two nearly equal F5 stars that are separated by barely eleven stellar radii at periastron, so the stellar magnetospheres fill the Roche lobe at periastron. The orbit is not yet circularized (e = 0.47) and very strong tides are expected. This makes AK Sco an ideal laboratory to study the effect of gravitational tides on the build-up of the stellar magnetic field during pre-main sequence evolution. Evidence of this effect is reported in this contribution.
The Winds of Main Sequence B Stars in NGC 6231, Evidence for Shocks in Weak Winds.
NASA Astrophysics Data System (ADS)
Massa, Derck
1996-07-01
Because the main sequence B stars in NGC 6231 have abnormally strong C IV wind lines, they are the only main sequence B stars with distinct edge velocities. Although the underlying cause for the strong lines remains unknown, these stars do provide an opportunity to test two important ideas concerning B star winds: 1) that the driving ions in the winds of stars with low mass loss rates decouple from the general flow, and 2) that shocks deep in the winds of main sequence B stars are responsible for their observed X-rays. In both of these models, the wind accelerates toward a terminal velocity, v_infty, far greater than the observed value, shocking or decoupling well before it can attain the high v_infty. As a result, the observable wind accelerates very rapidly, leading to wind flushing times of less than 30 minutes. If these conjectures are correct, then the winds of main sequence B stars should be highly variable on time scales of minutes. Model fitting of available IUE data is consistent with the general notion of a rapidly accelerating wind, shocking well before its actual v_infty. However, these are 5 hour exposures, so the fits are to ill-defined mean wind flows. The new GHRS observations will provide adequate spectral and temporal resolution to observe the expected variability and, thereby, verify the existence of two important astrophysical processes.
Chen, Sunlu; Zheng, Huizhen; Kishima, Yuji
2017-06-01
The interplay of different virus species in a host cell after infection can affect the adaptation of each virus. Endogenous viral elements, such as endogenous pararetroviruses (PRVs), have arisen from vertical inheritance of viral sequences integrated into host germline genomes. As viral genomic fossils, these sequences can thus serve as valuable paleogenomic data to study the long-term evolutionary dynamics of virus-virus interactions, but they have rarely been applied for this purpose. All extant PRVs have been considered autonomous species in their parasitic life cycle in host cells. Here, we provide evidence for multiple non-autonomous PRV species with structural defects in viral activity that have frequently infected ancient grass hosts and adapted through interplay between viruses. Our paleogenomic analyses using endogenous PRVs in grass genomes revealed that these non-autonomous PRV species have participated in interplay with autonomous PRVs in a possible commensal partnership, or, alternatively, with one another in a possible mutualistic partnership. These partnerships, which have been established by the sharing of noncoding regulatory sequences (NRSs) in intergenic regions between two partner viruses, have been further maintained and altered by the sequence homogenization of NRSs between partners. Strikingly, we found that frequent region-specific recombination, rather than mutation selection, is the main causative mechanism of NRS homogenization. Our results, obtained from ancient DNA records of viruses, suggest that adaptation of PRVs has occurred by concerted evolution of NRSs between different virus species in the same host. Our findings further imply that evaluation of within-host NRS interactions within and between populations of viral pathogens may be important.
Genetic alterations in seborrheic keratoses
Heidenreich, Barbara; Denisova, Evygenia; Rachakonda, Sivaramakrishna; Sanmartin, Onofre; Dereani, Timo; Hosen, Ismail; Nagore, Eduardo; Kumar, Rajiv
2017-01-01
Seborrheic keratoses are common benign epidermal lesions that are associated with increased age and sun exposure. Despite harboring multiple somatic alterations, those lesions, in contrast to malignant tumors, appear to be genetically stable. In order to investigate and characterize the presence of recurrent mutations, we performed exome sequencing on DNA from one seborrheic keratosis lesion and corresponding blood cells from the same patient, with follow-up investigation of the alterations identified by exome sequencing in 24 additional lesions from as many patients. In addition, we investigated alterations in all lesions at specific gene loci that included FGFR3, PIK3CA, HRAS, BRAF, CDKN2A, and the TERT and DPH3 promoters. The exome sequencing data indicated three mutations per Mb of the targeted sequence. The mutational pattern showed a typical UV signature, with the majority of alterations being C>T and CC>TT base changes at dipyrimidinic sites. The FGFR3 mutations were the most frequent, detected in 12 of 25 (48%) lesions, followed by the PIK3CA (32%), TERT promoter (24%) and DPH3 promoter mutations (24%). TERT promoter mutations were associated with increased age and were present mainly in the lesions excised from the head and neck. Three lesions also carried alterations in CDKN2A. FGFR3, TERT and DPH3 expression did not correlate with mutations in the respective genes and promoters; however, increased FGFR3 transcript levels were associated with increased FOXN1 levels, a suggested positive feedback loop that stalls malignant progression. Thus, in this study we report the overall mutation rate through exome sequencing and show the most frequent mutations in seborrheic keratosis. PMID:28410231
Oshiki, Mamoru; Segawa, Takahiro; Ishii, Satoshi
2018-02-02
Various microorganisms play key roles in the Nitrogen (N) cycle. Quantitative PCR (qPCR) and PCR-amplicon sequencing of the N cycle functional genes allow us to analyze the abundance and diversity of microbes responsible in the N transforming reactions in various environmental samples. However, analysis of multiple target genes can be cumbersome and expensive. PCR-independent analysis, such as metagenomics and metatranscriptomics, is useful but expensive especially when we analyze multiple samples and try to detect N cycle functional genes present at relatively low abundance. Here, we present the application of microfluidic qPCR chip technology to simultaneously quantify and prepare amplicon sequence libraries for multiple N cycle functional genes as well as taxon-specific 16S rRNA gene markers for many samples. This approach, named as N cycle evaluation (NiCE) chip, was evaluated by using DNA from pure and artificially mixed bacterial cultures and by comparing the results with those obtained by conventional qPCR and amplicon sequencing methods. Quantitative results obtained by the NiCE chip were comparable to those obtained by conventional qPCR. In addition, the NiCE chip was successfully applied to examine abundance and diversity of N cycle functional genes in wastewater samples. Although non-specific amplification was detected on the NiCE chip, this could be overcome by optimizing the primer sequences in the future. As the NiCE chip can provide high-throughput format to quantify and prepare sequence libraries for multiple N cycle functional genes, this tool should advance our ability to explore N cycling in various samples. Importance. We report a novel approach, namely Nitrogen Cycle Evaluation (NiCE) chip by using microfluidic qPCR chip technology. By sequencing the amplicons recovered from the NiCE chip, we can assess diversities of the N cycle functional genes. 
The NiCE chip technology can be applied to analyze the temporal dynamics of N cycle gene transcription in wastewater treatment bioreactors. The NiCE chip can provide a high-throughput format to quantify and prepare sequence libraries for multiple N cycle functional genes. While there is room for future improvement, this tool should significantly advance our ability to explore the N cycle in various environmental samples. Copyright © 2018 American Society for Microbiology.
Biclustering as a method for RNA local multiple sequence alignment.
Wang, Shu; Gutell, Robin R; Miranker, Daniel P
2007-12-15
Biclustering is a clustering method that simultaneously clusters both the domain and range of a relation. A challenge in multiple sequence alignment (MSA) is that the alignment of sequences is often intended to reveal groups of conserved functional subsequences. Simultaneously, the grouping of the sequences can impact the alignment; this is precisely the kind of dual situation biclustering is intended to address. We define a representation of the MSA problem enabling the application of biclustering algorithms. We develop a computer program for local MSA, BlockMSA, that combines biclustering with divide-and-conquer. BlockMSA simultaneously finds groups of similar sequences and locally aligns subsequences within them. Further alignment is accomplished by dividing both the set of sequences and their contents. The net result is both a multiple sequence alignment and a hierarchical clustering of the sequences. BlockMSA was tested on the subsets of the BRAliBase 2.1 benchmark suite that display high variability and on an extension to that suite to larger problem sizes. Also, alignments were evaluated of two large datasets of current biological interest, T box sequences and Group IC1 Introns. The results were compared with alignments computed by the ClustalW, MAFFT, MUSCLE and PROBCONS alignment programs using Sum of Pairs (SPS) and Consensus Count. Results for the benchmark suite are sensitive to problem size. On problems of 15 or more sequences, BlockMSA is consistently the best. On none of the problems in the test suite are there appreciable differences in scores among BlockMSA, MAFFT and PROBCONS. On the T box sequences, BlockMSA does the most faithful job of reproducing known annotations. MAFFT and PROBCONS do not. On the Intron sequences, BlockMSA, MAFFT and MUSCLE are comparable at identifying conserved regions. BlockMSA is implemented in Java. Source code and supplementary datasets are available at http://aug.csres.utexas.edu/msa/
Loeza-Quintana, Tzitziki; Adamowicz, Sarah J
2018-02-01
During the past 50 years, the molecular clock has become one of the main tools for providing a time scale for the history of life. In the era of robust molecular evolutionary analysis, clock calibration is still one of the most basic steps needing attention. When fossil records are limited, well-dated geological events are the main resource for calibration. However, biogeographic calibrations have often been used in a simplistic manner, for example assuming simultaneous vicariant divergence of multiple sister lineages. Here, we propose a novel iterative calibration approach to define the most appropriate calibration date by seeking congruence between the dates assigned to multiple allopatric divergences and the geological history. Exploring patterns of molecular divergence in 16 trans-Bering sister clades of echinoderms, we demonstrate that the iterative calibration is predominantly advantageous when using complex geological or climatological events, such as the opening/reclosure of the Bering Strait, providing a powerful tool for clock dating that can be applied to other biogeographic calibration systems and further taxa. Using Bayesian analysis, we observed that evolutionary rate variability in the COI-5P gene is generally distributed in a clock-like fashion for Northern echinoderms. The results reveal a large range of genetic divergences, consistent with multiple pulses of trans-Bering migrations. A resulting rate of 2.8% pairwise Kimura-2-parameter sequence divergence per million years is suggested for the COI-5P gene in Northern echinoderms. Given that molecular rates may vary across latitudes and taxa, this study provides a new context for dating the evolutionary history of Arctic marine life.
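To show how a rate such as the 2.8% figure would be used, the sketch below computes a pairwise Kimura-2-parameter (K2P) distance and converts it to a divergence time. It is an illustration only, not the authors' iterative calibration procedure: the K2P formula is the standard one, the rate is taken from the abstract, and the function names and the strict-clock conversion (time = distance / rate) are assumptions made here.

```python
import math

# Pairwise K2P divergence per million years, as suggested in the abstract.
RATE_PER_MYR = 0.028

PURINES = {"A", "G"}


def k2p_distance(seq_a, seq_b):
    """Kimura-2-parameter distance between two aligned DNA sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions. Sites containing
    anything other than A, C, G, T in either sequence are skipped.
    """
    transitions = transversions = sites = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def divergence_time_myr(distance, rate=RATE_PER_MYR):
    """Divergence time in Myr under a strict clock (an assumption here)."""
    return distance / rate


if __name__ == "__main__":
    d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
    print(f"K2P distance: {d:.4f}; time: {divergence_time_myr(d):.2f} Myr")
```

Because the rate is expressed as pairwise divergence, the distance is divided by the rate directly rather than by twice a per-lineage rate.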
A Code Division Multiple Access Communication System for the Low Frequency Band.
1983-04-01
Keywords: frequency channels; spread-spectrum communication; complex sequences; orthogonal codes; impulsive noise. Abstract (excerpt): ...their transmissions with signature sequences. Our LF/CDMA scheme is different in that each user's signature sequence set consists of M orthogonal sequences and thus log_2 M
Zopf, Agnes; Raim, Roman; Danzer, Martin; Niklas, Norbert; Spilka, Rita; Pröll, Johannes; Gabriel, Christian; Nechansky, Andreas; Roucka, Markus
2015-03-01
The detection of KRAS mutations in codons 12 and 13 is critical for anti-EGFR therapy strategies; however, only those methodologies with high sensitivity, specificity, and accuracy as well as the best cost and turnaround balance are suitable for routine daily testing. Here we compared the performance of compact sequencing using the novel hybcell technology with 454 next-generation sequencing (454-NGS), Sanger sequencing, and pyrosequencing, using an evaluation panel of 35 specimens. A total of 32 mutations and 10 wild-type cases were reported using 454-NGS as the reference method. Specificity ranged from 100% for Sanger sequencing to 80% for pyrosequencing. Sanger sequencing and hybcell-based compact sequencing achieved a sensitivity of 96%, whereas pyrosequencing had a sensitivity of 88%. Accuracy was 97% for Sanger sequencing, 85% for pyrosequencing, and 94% for hybcell-based compact sequencing. Quantitative results were obtained for 454-NGS and hybcell-based compact sequencing data, resulting in a significant correlation (r = 0.914). Whereas pyrosequencing and Sanger sequencing were not able to detect multiple mutated cell clones within one tumor specimen, 454-NGS and the hybcell-based compact sequencing detected multiple mutations in two specimens. Our comparison shows that the hybcell-based compact sequencing is a valuable alternative to state-of-the-art methodologies used for detection of clinically relevant point mutations.
Prediction of pork quality parameters by applying fractals and data mining on MRI.
Caballero, Daniel; Pérez-Palacios, Trinidad; Caro, Andrés; Amigo, José Manuel; Dahl, Anders B; Ersbøll, Bjarne K; Antequera, Teresa
2017-09-01
This work investigates, for the first time, the use of MRI, fractal algorithms and data mining techniques to determine pork quality parameters non-destructively. The main objective was to evaluate the capability of fractal algorithms (Classical Fractal Algorithm, CFA; Fractal Texture Algorithm, FTA; and One Point Fractal Texture Algorithm, OPFTA) to analyse MRI in order to predict quality parameters of loin. In addition, the effects of the MRI acquisition sequence (Gradient echo, GE; Spin echo, SE; and Turbo 3D, T3D) and of the predictive data mining technique (Isotonic regression, IR; and Multiple linear regression, MLR) were analysed. Both FTA and OPFTA are appropriate for analysing MRI of loins. The acquisition sequence, the fractal algorithm and the data mining technique all seem to influence the prediction results. For most physico-chemical parameters, prediction equations with moderate to excellent correlation coefficients were achieved by using the following combinations of MRI acquisition sequences, fractal algorithms and data mining techniques: SE-FTA-MLR, SE-OPFTA-IR, GE-OPFTA-MLR and SE-OPFTA-MLR, with the last one offering the best prediction results. Thus, SE-OPFTA-MLR could be proposed as an alternative technique to determine physico-chemical traits of fresh and dry-cured loins in a non-destructive way with high accuracy. Copyright © 2017. Published by Elsevier Ltd.
Substrates of Peltigera Lichens as a Potential Source of Cyanobionts.
Zúñiga, Catalina; Leiva, Diego; Carú, Margarita; Orlando, Julieta
2017-10-01
Photobiont availability is one of the main factors determining the success of the lichenization process. Although multiple sources of photobionts have been proposed, there is no substantial evidence confirming that the substrates on which lichens grow are one of them. In this work, we obtained cyanobacterial 16S ribosomal RNA gene sequences from the substrates underlying 186 terricolous Peltigera cyanolichens from localities in Southern Chile and maritime Antarctica and compared them with the sequences of the cyanobionts of these lichens, in order to determine if cyanobacteria potentially available for lichenization were present in the substrates. A phylogenetic analysis of the sequences showed that Nostoc phylotypes dominated the cyanobacterial communities of the substrates in all sites. Among them, an overlap was observed between the phylotypes of the lichen cyanobionts and those of the cyanobacteria present in their substrates, suggesting that they could be a possible source of lichen photobionts. Also, in most cases, higher Nostoc diversity was observed in the lichens than in the substrates from each site. A better understanding of cyanobacterial diversity in lichen substrates and their relatives in the lichens would bring insights into mycobiont selection and the distribution patterns of lichens, providing a background for hypothesis testing and theory development for future studies of the lichenization process.
Fast parallel tandem mass spectral library searching using GPU hardware acceleration.
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K; Martin, Daniel B
2011-06-03
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.
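The vector multiplication described above can be illustrated on the CPU. The sketch below is not the FastPaSS or SpectraST code; it assumes spectra have already been binned into equal-length intensity vectors, and the names `normalized_dot`, `best_match` and the toy library entries are hypothetical.

```python
import math


def normalized_dot(query, reference):
    """Normalized dot product of two equal-length binned intensity vectors."""
    if len(query) != len(reference):
        raise ValueError("spectra must be binned to the same length")
    dot = sum(q * r for q, r in zip(query, reference))
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(r * r for r in reference))
    return dot / norm if norm else 0.0


def best_match(query, library):
    """Return (name, score) of the highest-scoring library spectrum."""
    scored = ((name, normalized_dot(query, spec)) for name, spec in library.items())
    return max(scored, key=lambda item: item[1])


# Toy library with made-up peptide names and intensities.
library = {"pepA": [0, 1, 0, 2], "pepB": [3, 0, 1, 0]}
name, score = best_match([0, 2, 0, 4], library)
print(name, round(score, 3))
```

Each query-versus-library comparison is independent of the others, which is the property that makes the search a natural fit for one-thread-per-comparison execution on a GPU.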
Viral Diagnostics in Plants Using Next Generation Sequencing: Computational Analysis in Practice
Jones, Susan; Baizan-Edge, Amanda; MacFarlane, Stuart; Torrance, Lesley
2017-01-01
Viruses cause significant yield and quality losses in a wide variety of cultivated crops. Hence, the detection and identification of viruses is a crucial facet of successful crop production and of great significance in terms of world food security. Whilst the adoption of molecular techniques such as RT-PCR has increased the speed and accuracy of viral diagnostics, such techniques only allow the detection of known viruses, i.e., each test is specific to one or a small number of related viruses. Therefore, unknown viruses can be missed and testing can be slow and expensive if molecular tests are unavailable. Methods for simultaneous detection of multiple viruses have been developed, and next-generation sequencing (NGS) is now a principal focus of this area, as it enables unbiased and hypothesis-free testing of plant samples. The development of NGS protocols capable of detecting multiple known and emergent viruses present in infected material is proving to be a major advance for crops, nuclear stocks or imported plants and germplasm, in which disease symptoms are absent, unspecific or only triggered by multiple viruses. Researchers want to answer the question “how many different viruses are present in this crop plant?” without knowing what they are looking for: RNA-sequencing (RNA-seq) of plant material allows this question to be addressed. As well as needing efficient nucleic acid extraction and enrichment protocols, virus detection using RNA-seq requires fast and robust bioinformatics methods to enable host sequence removal and virus classification. In this review, recent studies that use RNA-seq for virus detection in a variety of crop plants are discussed, with specific emphasis on the computational methods implemented. The main features of a number of specific bioinformatics workflows developed for virus detection from NGS data are also outlined, and possible reasons why these have not yet been widely adopted are discussed. The review concludes by discussing the future directions of this field, including the use of bioinformatics tools for virus detection deployed in analytical environments using cloud computing. PMID:29123534
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davidge, T. J.
2012-12-20
The stellar contents of the open clusters King 12, NGC 7788, and NGC 7790 are investigated using MegaCam images. Comparisons with isochrones yield an age <20 Myr for King 12, 20-40 Myr for NGC 7788, and 60-80 Myr for NGC 7790 based on the properties of stars near the main-sequence turnoff (MSTO) in each cluster. The reddening of NGC 7788 is much larger than previously estimated. The luminosity functions (LFs) of King 12 and NGC 7788 show breaks that are attributed to the onset of pre-main-sequence (PMS) objects, and comparisons with models of PMS evolution yield ages that are consistent with those measured from stars near the MSTO. In contrast, the r' LF of main-sequence stars in NGC 7790 is matched to r' = 20 by a model that is based on the solar neighborhood mass function. The structural properties of all three clusters are investigated by examining the two-point angular correlation function of blue main-sequence stars. King 12 and NGC 7788 are each surrounded by a stellar halo that extends out to a radius of 5 arcmin (~3.4 pc). It is suggested that these halos form in response to large-scale mass ejection early in the evolution of the clusters, as predicted by models. In contrast, blue main-sequence stars in NGC 7790 are traced out to a radius of ~7.5 arcmin (~5.5 pc), with no evidence of a halo. It is suggested that all three clusters may have originated in the same star-forming complex, but not in the same giant molecular cloud.
Pre-main-sequence isochrones - II. Revising star and planet formation time-scales
NASA Astrophysics Data System (ADS)
Bell, Cameron P. M.; Naylor, Tim; Mayne, N. J.; Jeffries, R. D.; Littlefair, S. P.
2013-09-01
We have derived ages for 13 young (<30 Myr) star-forming regions and find that they are up to a factor of 2 older than the ages typically adopted in the literature. This result has wide-ranging implications, including that circumstellar discs survive longer (≃10-12 Myr) and that the average Class I lifetime is greater (≃1 Myr) than currently believed. For each star-forming region, we derived two ages from colour-magnitude diagrams. First, we fitted models of the evolution between the zero-age main sequence and terminal-age main sequence to derive a homogeneous set of main-sequence ages, distances and reddenings with statistically meaningful uncertainties. Our second age for each star-forming region was derived by fitting pre-main-sequence stars to new semi-empirical model isochrones. For the first time (for a set of clusters younger than 50 Myr), we find broad agreement between these two ages, and since these are derived from two distinct mass regimes that rely on different aspects of stellar physics, it gives us confidence in the new age scale. This agreement is largely due to our adoption of empirical colour-Teff relations and bolometric corrections for pre-main-sequence stars cooler than 4000 K. The revised ages for the star-forming regions in our sample are: ~2 Myr for NGC 6611 (Eagle Nebula; M 16), IC 5146 (Cocoon Nebula), NGC 6530 (Lagoon Nebula; M 8) and NGC 2244 (Rosette Nebula); ~6 Myr for σ Ori, Cep OB3b and IC 348; ≃10 Myr for λ Ori (Collinder 69); ≃11 Myr for NGC 2169; ≃12 Myr for NGC 2362; ≃13 Myr for NGC 7160; ≃14 Myr for χ Per (NGC 884); and ≃20 Myr for NGC 1960 (M 36).
Song, Jiangning; Yuan, Zheng; Tan, Hao; Huber, Thomas; Burrage, Kevin
2007-12-01
Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmatic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications. We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide
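To give a sense of the search space a connectivity predictor has to rank, the sketch below enumerates every possible pairing of bonded cysteines. It is only an illustration of the combinatorics, not the authors' SVR method; it assumes every listed cysteine takes part in exactly one bond, and the function name and residue positions are hypothetical.

```python
def connectivity_patterns(cysteines):
    """Yield every way to pair up a list of bonded cysteine positions."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for pattern in connectivity_patterns(remaining):
            yield [(first, partner)] + pattern


# Hypothetical residue positions for a protein with two disulfide bridges.
two_bridges = list(connectivity_patterns([5, 12, 30, 41]))
print(two_bridges)

# Number of candidate patterns for two to five bridges.
counts = {n: sum(1 for _ in connectivity_patterns(list(range(2 * n)))) for n in range(2, 6)}
print(counts)
```

The count grows as (2n-1)!! for n bridges, so the number of candidate patterns rises quickly across the two-to-five-bridge range evaluated in the abstract.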
Recapitulating phylogenies using k-mers: from trees to networks.
Bernard, Guillaume; Ragan, Mark A; Chan, Cheong Xin
2016-01-01
Ernst Haeckel based his landmark Tree of Life on the supposed ontogenic recapitulation of phylogeny, i.e. that successive embryonic stages during the development of an organism re-trace the morphological forms of its ancestors over the course of evolution. Much of this idea has since been discredited. Today, phylogenies are often based on families of molecular sequences. The standard approach starts with a multiple sequence alignment, in which the sequences are arranged relative to each other in a way that maximises a measure of similarity position-by-position along their entire length. A tree (or sometimes a network) is then inferred. Rigorous multiple sequence alignment is computationally demanding, and evolutionary processes that shape the genomes of many microbes (bacteria, archaea and some morphologically simple eukaryotes) can add further complications. In particular, recombination, genome rearrangement and lateral genetic transfer undermine the assumptions that underlie multiple sequence alignment, and imply that a tree-like structure may be too simplistic. Here, using genome sequences of 143 bacterial and archaeal genomes, we construct a network of phylogenetic relatedness based on the number of shared k-mers (subsequences of fixed length k). Our findings suggest that the network captures not only key aspects of microbial genome evolution as inferred from a tree, but also features that are not treelike. The method is highly scalable, allowing for investigation of genome evolution across a large number of genomes. Instead of using specific regions or sequences from genome sequences, or indeed Haeckel's idea of ontogeny, we argue that genome phylogenies can be inferred using k-mers from whole-genome sequences. Representing these networks dynamically allows biological questions of interest to be formulated and addressed quickly and in a visually intuitive manner.
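The idea of linking genomes by shared k-mers can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' pipeline: it counts distinct shared k-mers (one of several possible definitions), ignores reverse complements, and uses hypothetical names (`shared_kmer_count`, `relatedness_edges`) and toy sequences.

```python
def kmers(seq, k):
    """Set of distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_count(a, b, k):
    """Number of distinct k-mers present in both sequences."""
    return len(kmers(a, k) & kmers(b, k))


def relatedness_edges(genomes, k, min_shared):
    """Edges (name1, name2, shared) for genome pairs sharing >= min_shared k-mers."""
    names = sorted(genomes)
    edges = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            n = shared_kmer_count(genomes[x], genomes[y], k)
            if n >= min_shared:
                edges.append((x, y, n))
    return edges


# Toy sequences standing in for whole genomes.
genomes = {"g1": "ACGTAC", "g2": "GTACGG", "g3": "TTTTTT"}
edges = relatedness_edges(genomes, k=3, min_shared=1)
print(edges)
```

Raising or lowering `min_shared` changes which edges appear, which is one simple way a relatedness network can be explored at different thresholds without computing an alignment.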
Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio
2013-09-01
Multiple sequence alignments (MSAs) are widely used in bioinformatics as a basis for other tasks such as structure prediction, biological function analysis or phylogenetic modeling. However, current tools usually provide only partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was assessed on the BAliBASE benchmark, with significance established by the Kruskal-Wallis test (P < 0.01). This algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P < 0.05), whereas its results are not significantly different from those of 3D-COFFEE (P > 0.05), with the advantage of being able to use fewer structures. Structural information is included within the objective function to evaluate the obtained alignments more accurately. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.
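The core step of non-dominated sorting, finding the candidates that no other candidate beats on all objectives, can be sketched as follows. This is not the MO-SAStrE source; it shows only the first Pareto front (a full NSGA-style algorithm also ranks later fronts and applies diversity preservation), it assumes all three objectives are maximized, and the score tuples are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_front(scores):
    """Indices of candidates that are not dominated by any other candidate."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (STRIKE score, non-gap percentage, conserved columns) per alignment.
scores = [(2.1, 90, 5), (2.0, 85, 4), (1.8, 95, 6), (2.1, 90, 4)]
front = non_dominated_front(scores)
print(front)
```

Candidates on the front represent different trade-offs between the three objectives, which is why a multiobjective aligner returns a set of alignments and leaves the final choice to a separate selection criterion.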
NASA Astrophysics Data System (ADS)
Martocchia, S.; Niederhofer, F.; Dalessandro, E.; Bastian, N.; Kacharov, N.; Usher, C.; Cabrera-Ziri, I.; Lardo, C.; Cassisi, S.; Geisler, D.; Hilker, M.; Hollyhead, K.; Kozhurina-Platais, V.; Larsen, S.; Mackey, D.; Mucciarelli, A.; Platais, I.; Salaris, M.
2018-04-01
We have recently shown that the ~2 Gyr old Large Magellanic Cloud star cluster NGC 1978 hosts multiple populations in terms of star-to-star abundance variations in [N/Fe]. These can be seen as a splitting or spread in the sub-giant and red giant branches (SGB and RGB) when certain photometric filter combinations are used. Due to its relative youth, NGC 1978 can be used to place stringent limits on whether multiple bursts of star-formation have taken place within the cluster, as predicted by some models for the origin of multiple populations. We carry out two distinct analyses to test whether multiple star-formation epochs have occurred within NGC 1978. First, we use UV CMDs to select stars from the first and second population along the SGB, and then compare their positions in optical CMDs, where the morphology is dominantly controlled by age as opposed to multiple population effects. We find that the two populations are indistinguishable, with age differences of 1 ± 20 Myr between them. This is in tension with predictions from the AGB scenario for the origin of multiple populations. Second, we estimate the broadness of the main sequence turnoff (MSTO) of NGC 1978 and we report that it is consistent with the observational errors. We find an upper limit of ~65 Myr on the age spread in the MSTO of NGC 1978. This finding is in conflict with the age spread scenario as origin of the extended MSTO in intermediate age clusters, while it fully supports predictions from the stellar rotation model.
NASA Astrophysics Data System (ADS)
Martocchia, S.; Niederhofer, F.; Dalessandro, E.; Bastian, N.; Kacharov, N.; Usher, C.; Cabrera-Ziri, I.; Lardo, C.; Cassisi, S.; Geisler, D.; Hilker, M.; Hollyhead, K.; Kozhurina-Platais, V.; Larsen, S.; Mackey, D.; Mucciarelli, A.; Platais, I.; Salaris, M.
2018-07-01
We have recently shown that the ~2 Gyr old Large Magellanic Cloud star cluster NGC 1978 hosts multiple populations in terms of star-to-star abundance variations in [N/Fe]. These can be seen as a splitting or spread in the subgiant and red giant branches (SGB and RGB) when certain photometric filter combinations are used. Because of its relative youth, NGC 1978 can be used to place stringent limits on whether multiple bursts of star formation have taken place within the cluster, as predicted by some models for the origin of multiple populations. We carry out two distinct analyses to test whether multiple star formation epochs have occurred within NGC 1978. First, we use ultraviolet colour-magnitude diagrams (CMDs) to select stars from the first and second population along the SGB, and then compare their positions in optical CMDs, where the morphology is dominantly controlled by age as opposed to multiple population effects. We find that the two populations are indistinguishable, with age differences of 1 ± 20 Myr between them. This is in tension with predictions from the asymptotic giant branch scenario for the origin of multiple populations. Second, we estimate the broadness of the main-sequence turn-off (MSTO) of NGC 1978 and we report that it is consistent with the observational errors. We find an upper limit of ~65 Myr on the age spread in the MSTO of NGC 1978. This finding is in conflict with the age spread scenario as origin of the extended MSTO in intermediate-age clusters, while it fully supports predictions from the stellar rotation model.
Automatic detection of pelvic lymph nodes using multiple MR sequences
NASA Astrophysics Data System (ADS)
Yan, Michelle; Lu, Yue; Lu, Renzhi; Requardt, Martin; Moeller, Thomas; Takahashi, Satoru; Barentsz, Jelle
2007-03-01
A system for automatic detection of pelvic lymph nodes is developed by incorporating complementary information extracted from multiple MR sequences. A single MR sequence lacks sufficient diagnostic information for lymph node localization and staging. Correct diagnosis often requires input from multiple complementary sequences, which makes manual detection of lymph nodes very labor intensive. Small lymph nodes are often missed even by highly trained radiologists. The proposed system is aimed at assisting radiologists in finding lymph nodes faster and more accurately. To the best of our knowledge, this is the first such system reported in the literature. A 3-dimensional (3D) MR angiography (MRA) image is employed for extracting blood vessels that serve as a guide in searching for pelvic lymph nodes. Segmentation, shape and location analysis of potential lymph nodes are then performed using a high-resolution 3D T1-weighted VIBE (T1-vibe) MR sequence acquired by a Siemens 3T scanner. An optional contrast-agent enhanced MR image, such as a post ferumoxtran-10 T2*-weighted MEDIC sequence, can also be incorporated to further improve detection accuracy of malignant nodes. The system outputs a list of potential lymph node locations that are overlaid onto the corresponding MR sequences and presents them to users with associated confidence levels as well as their sizes and lengths in each axis. Preliminary studies demonstrate the feasibility of automatic lymph node detection and scenarios in which this system may be used to assist radiologists in diagnosis and reporting.
NASA Astrophysics Data System (ADS)
Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue
2017-08-01
Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relationships among Amomum tsao-ko samples by using simple sequence repeat (SSR) markers. Cluster analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) consisted of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas, and also to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.
W134: A new pre-main-sequence double-lined spectroscopic binary
NASA Technical Reports Server (NTRS)
Padgett, Deborah L.; Stapelfeldt, Karl R.
1994-01-01
We report the discovery that the pre-main-sequence star Walker 134 in the young cluster NGC 2264 is a double-lined spectroscopic binary. Both components are G stars with strong Li I 6708 Å absorption lines. Twenty radial velocity measurements have been used to determine the orbital elements of this system. The orbit has a period of 6.3532 ± 0.0012 days and is circular within the limits of our velocity resolution (e < 0.01). The total system mass is M sin^3 i = 3.16 solar masses, with a mass ratio of 1.04. Estimates for the orbit inclination angle and stellar radii place the system near the threshold for eclipse observability; however, no decrease in brightness was seen during two attempts at photometric monitoring. The circular orbit of W 134 fills an important gap in the period distribution of pre-main-sequence binaries and thereby constrains the effectiveness of tidal orbital circularization during the pre-main sequence.
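The quoted orbital elements are enough for a short worked calculation. The sketch below is an illustration rather than part of the original analysis: it assumes the mass ratio of 1.04 means primary over secondary, uses Kepler's third law in solar units (a^3 in AU^3 equals M in solar masses times P^2 in years squared), and the function names are hypothetical. Because the input mass is M sin^3 i, the outputs are the minimum masses and the projected separation a sin i.

```python
def component_masses(total_mass, mass_ratio):
    """Split a total mass into (primary, secondary) for q = primary / secondary."""
    secondary = total_mass / (1.0 + mass_ratio)
    return total_mass - secondary, secondary


def separation_au(total_mass_msun, period_days):
    """Kepler's third law in solar units: a^3 [AU^3] = M [Msun] * P^2 [yr^2]."""
    period_years = period_days / 365.25
    return (total_mass_msun * period_years ** 2) ** (1.0 / 3.0)


# Values quoted in the abstract: M sin^3 i = 3.16 Msun, q = 1.04, P = 6.3532 d.
m1, m2 = component_masses(3.16, 1.04)
a = separation_au(3.16, 6.3532)
print(f"m1 sin^3 i = {m1:.2f} Msun, m2 sin^3 i = {m2:.2f} Msun, a sin i = {a:.3f} AU")
```

The projected separation comes out at roughly a tenth of an AU, which illustrates why the eclipse geometry depends so sensitively on the inclination and the stellar radii.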
NASA Astrophysics Data System (ADS)
Kopytova, Taisiya G.; Brandner, Wolfgang; Tognelli, Emanuele; Prada Moroni, Pier Giorgio; Da Rio, Nicola; Röser, Siegfried; Schilbach, Elena
2016-01-01
Context. Age and mass determinations for isolated stellar objects remain model-dependent. While stellar interior and atmospheric theoretical models are rapidly evolving, we need a powerful tool to test them. Open clusters are good candidates for this role. Aims: We aim to create a fiducial sequence of stellar objects for testing stellar and atmospheric models. Methods: We complement previous studies on the Hyades multiplicity by Lucky Imaging observations with the AstraLux Norte camera. This allows us to exclude possible binary and multiple systems with companions outside a 2-7 AU separation and to create a single-star sequence for the Hyades. The sequence encompasses 250 main-sequence stars ranging from A5V to M6V. Using the Tool for Astrophysical Data Analysis (TA-DA), we create various theoretical isochrones applying different combinations of interior and atmospheric models. We compare the isochrones with the observed Hyades single-star sequence on J vs. J-Ks, J vs. J-H, and Ks vs. H-Ks color-magnitude diagrams. As a reference we also compute absolute fluxes and magnitudes for all stars from X-ray to mid-infrared based on photometric measurements available in the literature (ROSAT X-ray, GALEX UV, APASS gri, 2MASS JHKs, and WISE W1 to W4). Results: We find that combinations of both PISA and DARTMOUTH stellar interior models with BT-Settl 2010 atmospheric models describe the observed sequence well. We use PISA in combination with BT-Settl 2010 models to derive theoretical predictions for physical parameters (Teff, mass, log g) of 250 single stars in the Hyades. The full sequence covers the mass range of 0.13-2.30 M⊙, and effective temperatures between 3060 K and 8200 K. Conclusions: Within the measurement uncertainties, the current generation of models agrees well with the single-star sequence. The primary limitations are the uncertainties in the measurement of the distances to individual Hyades members, and uncertainties in the photometry. Gaia parallaxes, photometry, and spectroscopy will greatly reduce the uncertainties, in particular at the lowest mass range, and will enable us to test model predictions with greater confidence. Additionally, a small (~0.05 mag) systematic offset can be noted in J vs. J-K and K vs. H-K diagrams: the observed sequence is shifted to redder colors than the theoretical predictions. Based on observations collected at the Centro Astronómico Hispano Alemán (CAHA) at Calar Alto, operated jointly by the Max-Planck Institut für Astronomie and the Instituto de Astrofísica de Andalucía (CSIC). Full Table 2 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/585/A7
McGarr, Arthur F.; Barbour, Andrew
2017-01-01
Each of the three earthquake sequences in Oklahoma in 2016—Fairview, Pawnee, and Cushing—appears to have been induced by high-volume wastewater disposal within 10 km. The Fairview M5.1 main shock was part of a 2 year sequence of more than 150 events of M3, or greater; the main shock accounted for about half of the total moment. The foreshocks and aftershocks of the M5.8 Pawnee earthquake were too small and too few to contribute significantly to the cumulative moment; instead, nearly all of the moment induced by wastewater injection was focused on the main shock. The M5.0 Cushing event is part of a sequence that includes 48 earthquakes of M3, or greater, that are mostly foreshocks. The cumulative moment for each of the three sequences during 2016, as well as that for the 2011 Prague, Oklahoma, and nine other sequences representing a broad range of injected volume, are all limited by the total volumes of wastewater injected locally.
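Statements about how much of a sequence's moment sits in the main shock rest on converting magnitudes to seismic moment. The sketch below uses the standard relation M0 = 10^(1.5 Mw + 9.1) in N·m and assumes the quoted magnitudes can be treated as moment magnitudes; the function names and the example event lists are hypothetical and are not the catalogs analysed in the study.

```python
def moment_from_magnitude(mw):
    """Seismic moment in N*m from moment magnitude: M0 = 10 ** (1.5 * Mw + 9.1)."""
    return 10 ** (1.5 * mw + 9.1)


def main_shock_fraction(main_mw, other_mws):
    """Fraction of a sequence's cumulative moment released by the main shock."""
    main = moment_from_magnitude(main_mw)
    total = main + sum(moment_from_magnitude(m) for m in other_mws)
    return main / total


# Two magnitude units correspond to a factor of 10**3 in moment.
ratio = moment_from_magnitude(5.0) / moment_from_magnitude(3.0)

# Hypothetical sequences: one equal-sized companion event, and no other events.
half = main_shock_fraction(5.0, [5.0])
alone = main_shock_fraction(5.8, [])
print(ratio, half, alone)
```

Because moment grows by a factor of about 32 per magnitude unit, even many M3-class events add little to the cumulative moment of a sequence that contains an M5 or larger main shock.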
Worley, K C; Wiese, B A; Smith, R F
1995-09-01
BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. 
BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL: http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html).
Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.
Li, Guang; Wang, Yadong; Su, Xiaohong
2012-10-01
When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
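Global pairwise alignment, which the study substitutes for multiple sequence alignment, is commonly computed with Needleman-Wunsch dynamic programming. The sketch below returns only the optimal score and is not the authors' implementation; the scoring values (match +1, mismatch -1, linear gap -2) and the function name are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global (end-to-end) alignment score of a and b, linear gap penalty."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "ACGT"))
print(global_alignment_score("ACGT", "AGT"))
```

Keeping only two rows of the table keeps memory linear in the shorter dimension, while time remains proportional to the product of the two sequence lengths for each pair.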
Kono, H; Saven, J G
2001-02-23
Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiments. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.
Identification of Prostate Cancer-Specific microDNAs
2014-12-01
displacement amplification (MDA). 2 adopted multiple displacement amplification (MDA) with random primers for enriched circular DNA by rolling circle ... amplification (RCA) (Fig. 1) and then amplified DNA fragments were subject to deep sequencing. Sequence NO of Reads seq 1 184 seq 2 133 seq 3 2407 seq...prostate cancer cells through multiple displacement amplification . Clone #7 is the top candidate which has been cloned in an expression vector and it
MACSIMS : multiple alignment of complete sequences information management system
Thompson, Julie D; Muller, Arnaud; Waterhouse, Andrew; Procter, Jim; Barton, Geoffrey J; Plewniak, Frédéric; Poch, Olivier
2006-01-01
Background In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. Results MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIM results, while a graphical display using the JalView program allows manual analysis. Conclusion MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at . PMID:16792820
Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G
2012-09-01
Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB based alignments are used to obtain a three dimensional fit of the structures followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD, in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples.
Malekpour, Seyed Amir; Pezeshk, Hamid; Sadeghi, Mehdi
2018-03-05
Currently a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for the CNV detection based on multiple samples, the majority of the current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and also a Poisson distribution for emitting the read counts, in each genomic position. MSeq-CNV is applied on simulated data and also on real data of six HapMap individuals with high-coverage sequencing, in 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. Ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied for detecting CNVs in two samples with low-coverage sequencing in 1000 Genomes Project and six samples from the Simons Genome Diversity Project.
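The mixture density described in this abstract can be illustrated with a minimal sketch. The component weights, Poisson rates and Binomial probabilities below are hypothetical values chosen for illustration, and treating the two emissions as independent within a component is an assumption; the abstract does not give MSeq-CNV's actual parameterization or fitting procedure.

```python
import math

def poisson_logpmf(k, lam):
    """Log-probability of observing k reads under a Poisson with rate lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def binom_logpmf(k, n, p):
    """Log-probability of k aberrant mate pairs out of n, with 0 < p < 1."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

def component_logterms(read_count, n_pairs, n_aberrant, components):
    """Per-component log(weight * Poisson * Binomial) at one genomic position.

    Assumption: the two emissions are independent given the component.
    """
    return [
        math.log(w) + poisson_logpmf(read_count, lam) + binom_logpmf(n_aberrant, n_pairs, p)
        for w, lam, p in components
    ]

def posterior(read_count, n_pairs, n_aberrant, components):
    """Posterior probability of each component (log-sum-exp for stability)."""
    terms = component_logterms(read_count, n_pairs, n_aberrant, components)
    top = max(terms)
    norm = top + math.log(sum(math.exp(t - top) for t in terms))
    return [math.exp(t - norm) for t in terms]

# Hypothetical components: (weight, expected read depth, P(aberrant insert size))
COMPONENTS = [
    (0.1, 15.0, 0.30),  # deletion-like state
    (0.8, 30.0, 0.02),  # normal copy number
    (0.1, 45.0, 0.30),  # duplication-like state
]

post = posterior(read_count=14, n_pairs=20, n_aberrant=7, components=COMPONENTS)
best_state = post.index(max(post))
```

With low read depth and many aberrant mate pairs, the posterior mass in this toy setup concentrates on the deletion-like component, which is the kind of joint evidence the abstract argues depth-only tools ignore.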
CoSMoS: Conserved Sequence Motif Search in the proteome
Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I
2006-01-01
Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
Pre-main Sequence Evolution and the Hydrogen-Burning Minimum Mass
NASA Astrophysics Data System (ADS)
Nakano, Takenori
There is a lower limit to the mass of main-sequence stars (the hydrogen-burning minimum mass) below which stars cannot replenish the energy lost from their surfaces with the energy released by hydrogen burning in their cores. This is caused by the electron degeneracy in the stars, which suppresses the increase of the central temperature with contraction. To determine this lower limit we need accurate knowledge of the pre-main sequence evolution of very low-mass stars in which the effect of electron degeneracy is important. We review how Hayashi and Nakano (1963) carried out the first determination of this limit.
Reconciling mass functions with the star-forming main sequence via mergers
NASA Astrophysics Data System (ADS)
Steinhardt, Charles L.; Yurk, Dominic; Capak, Peter
2017-06-01
We combine star formation along the 'main sequence', quiescence and clustering and merging to produce an empirical model for the evolution of individual galaxies. Main-sequence star formation alone would significantly steepen the stellar mass function towards low redshift, in sharp conflict with observation. However, a combination of star formation and merging produces a consistent result for correct choice of the merger rate function. As a result, we are motivated to propose a model in which hierarchical merging is disconnected from environmentally independent star formation. This model can be tested via correlation functions and would produce new constraints on clustering and merging.
Zemali, El-Amine; Boukra, Abdelmadjid
2015-08-01
Multiple sequence alignment (MSA) is one of the most challenging problems in bioinformatics; it involves discovering similarity between a set of protein or DNA sequences. This paper introduces a new method for the MSA problem called biogeography-based optimization with multiple populations (BBOMP). It is based on a recent metaheuristic inspired by the mathematics of biogeography, named biogeography-based optimization (BBO). To improve the exploration ability of BBO, we have introduced a new concept allowing better exploration of the search space. It consists of manipulating multiple populations, each having its own parameters. These parameters are used to build up progressive alignments allowing more diversity. At each iteration, the best solution found is injected into each population. Moreover, to improve solution quality, six operators are defined. These operators are selected with a dynamic probability which changes according to the operators' efficiency. In order to test the performance of the proposed approach, we have considered a set of datasets from Balibase 2.0 and compared it with many recent algorithms such as GAPAM, MSA-GA, QEAMSA and RBT-GA. The results show that the proposed approach achieves a better average score than the previously cited methods.
NASA Technical Reports Server (NTRS)
Wheeler, Ward C.
2003-01-01
A method to align sequence data based on parsimonious synapomorphy schemes generated by direct optimization (DO; earlier termed optimization alignment) is proposed. DO directly diagnoses sequence data on cladograms without an intervening multiple-alignment step, thereby creating topology-specific, dynamic homology statements. Hence, no multiple-alignment is required to generate cladograms. Unlike general and globally optimal multiple-alignment procedures, the method described here, implied alignment (IA), takes these dynamic homologies and traces them back through a single cladogram, linking the unaligned sequence positions in the terminal taxa via DO transformation series. These "lines of correspondence" link ancestor-descendent states and, when displayed as linearly arrayed columns without hypothetical ancestors, are largely indistinguishable from standard multiple alignment. Since this method is based on synapomorphy, the treatment of certain classes of insertion-deletion (indel) events may be different from that of other alignment procedures. As with all alignment methods, results are dependent on parameter assumptions such as indel cost and transversion:transition ratios. Such an IA could be used as a basis for phylogenetic search, but this would be questionable since the homologies derived from the implied alignment depend on its natal cladogram and any variance, between DO and IA + Search, due to heuristic approach. The utility of this procedure in heuristic cladogram searches using DO and the improvement of heuristic cladogram cost calculations are discussed. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.
NASA Technical Reports Server (NTRS)
Strom, Stephen E.; Edwards, Suzan; Strom, Karen M.
1991-01-01
The following topics were discussed: (1) current observation evidence for the presence of circumstellar disks associated with solar type pre-main sequence (PMS) stars; (2) the properties of such disks; and (3) the disk environment.
The evolution of the lithium abundances of solar-type stars. II - The Ursa Major Group
NASA Technical Reports Server (NTRS)
Soderblom, David R.; Pilachowski, Catherine A.; Fedele, Stephen B.; Jones, Burton F.
1993-01-01
We draw upon a recent study of the membership of the Ursa Major Group (UMaG) to examine lithium among 0.3 Gyr old solar-type stars. For most G and K dwarfs, Li confirms the conclusions about membership in UMaG reached on the basis of kinematics and chromospheric activity. G and K dwarfs in UMaG have less Li than comparable stars in the Pleiades. This indicates that G and K dwarfs undergo Li depletion while they are on the main sequence, in addition to any pre-main-sequence depletion they may have experienced. Moreover, the Li abundances of the Pleiades K dwarfs cannot be attributed to main-sequence depletion alone, demonstrating that pre-main-sequence depletion of Li also takes place. The sun's Li abundance implies that the main-sequence mechanism becomes less effective with age. The hottest stars in UMaG have Li abundances like those of hot stars in the Pleiades and Hyades and in T Tauris, and the two genuine UMaG members with temperatures near Boesgaard's Li chasm have Li abundances consistent with that chasm developing fully by 0.3 Gyr for stars with UMaG's metallicity. We see differences in the abundance of Li between UMaG members of the same spectral types, indicating that a real spread in the lithium abundance exists within this group.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramirez, Ramses M.; Kaltenegger, Lisa
We calculate the pre-main-sequence habitable zone (HZ) for stars of spectral classes F-M. The spatial distribution of liquid water and its change during the pre-main-sequence phase of protoplanetary systems is important for understanding how planets become habitable. Such worlds are interesting targets for future missions because the coolest stars could provide habitable conditions for up to 2.5 billion years post-accretion. Moreover, for a given star type, planetary systems are more easily resolved because of higher pre-main-sequence stellar luminosities, resulting in larger planet-star separation for cool stars than is the case for the traditional main-sequence (MS) HZ. We use one-dimensional radiative-convective climate and stellar evolutionary models to calculate pre-main-sequence HZ distances for F1-M8 stellar types. We also show that accreting planets that are later located in the traditional MS HZ orbiting stars cooler than a K5 (including the full range of M stars) receive stellar fluxes that exceed the runaway greenhouse threshold, and thus may lose substantial amounts of water initially delivered to them. We predict that M-star planets need to initially accrete more water than Earth did, or, alternatively, have additional water delivered later during the long pre-MS phase to remain habitable. Our findings are also consistent with recent claims that Venus lost its water during accretion.
Simulation of spatial and temporal properties of aftershocks by means of the fiber bundle model
NASA Astrophysics Data System (ADS)
Monterrubio-Velasco, Marisol; Zúñiga, F. R.; Márquez-Ramírez, Victor Hugo; Figueroa-Soto, Angel
2017-11-01
The rupture processes of any heterogeneous material constitute a complex physical problem. Earthquake aftershocks show temporal and spatial behaviors which are a consequence of the heterogeneous stress distribution and multiple rupturing following the main shock. This process is difficult to model deterministically due to the number of parameters and physical conditions, which are largely unknown. In order to shed light on the minimum requirements for the generation of aftershock clusters, in this study, we perform a simulation of the main features of such a complex process by means of a fiber bundle (FB) type model. The FB model has been widely used to analyze the fracture process in heterogeneous materials. It is a simple but powerful tool that allows modeling the main characteristics of a medium such as the brittle shallow crust of the earth. In this work, we incorporate spatial properties, such as the Coulomb stress change pattern, which help simulate observed characteristics of aftershock sequences. In particular, we introduce a parameter (P) that controls the probability of spatial distribution of initial loads. Also, we use a "conservation" parameter (π), which accounts for the load dissipation of the system, and demonstrate its influence on the simulated spatio-temporal patterns. Based on numerical results, we find that P has to be in the range 0.06 < P < 0.30, whilst π needs to be limited to a very narrow range (0.60 < π < 0.66) in order to reproduce aftershock pattern characteristics which resemble those of observed sequences. This means that the system requires a small difference in the spatial distribution of initial stress, and a very particular fraction of load transfer in order to generate realistic aftershocks.
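The role of the conservation parameter π can be pictured with a single load-transfer step. This is only a sketch under stated assumptions: the abstract does not give the model's lattice or transfer rule, so the one-dimensional ring, the equal split between the two nearest neighbours, and the function name `transfer_load` are all hypothetical.

```python
def transfer_load(loads, failed, pi):
    """Redistribute the load of one failed fiber on a 1D ring (assumed geometry).

    A fraction `pi` of the failed fiber's load is shared equally by its two
    nearest neighbours; the remaining fraction (1 - pi) is dissipated.
    Assumes at least three fibers so the two neighbours are distinct.
    """
    n = len(loads)
    new_loads = list(loads)
    released = new_loads[failed]
    new_loads[failed] = 0.0
    share = pi * released / 2.0
    new_loads[(failed - 1) % n] += share
    new_loads[(failed + 1) % n] += share
    return new_loads

# pi = 0.64 lies inside the 0.60 < pi < 0.66 range quoted in the abstract.
before = [1.0, 2.0, 1.0, 1.0]
after = transfer_load(before, failed=1, pi=0.64)
dissipated = sum(before) - sum(after)
```

In this toy step the total load drops by (1 - π) times the failed fiber's load, which is the sense in which π measures how much of the load is conserved rather than dissipated.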
Armero, Alix; Baudouin, Luc; Bocs, Stéphanie; This, Dominique
2017-01-01
The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).
Thomas, W. Kelley; Vida, J. T.; Frisse, Linda M.; Mundo, Manuel; Baldwin, James G.
1997-01-01
To effectively integrate DNA sequence analysis and classical nematode taxonomy, we must be able to obtain DNA sequences from formalin-fixed specimens. Microdissected sections of nematodes were removed from specimens fixed in formalin, using standard protocols and without destroying morphological features. The fixed sections provided sufficient template for multiple polymerase chain reaction-based DNA sequence analyses. PMID:19274156
Multiple tag labeling method for DNA sequencing
Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.
1995-01-01
A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.
Panwar, Priyankar; Verma, A K; Dubey, Ashutosh
2018-05-01
Barnyard (Echinochloa frumentacea) and finger (Eleusine coracana) millet growing in the northwestern Himalaya were explored for the α-amylase inhibitor (α-AI). The mature seeds of barnyard millet variety PRJ1 had the maximum α-AI activity, which increased over the developmental stages. α-AI was purified up to 22.25-fold from barnyard millet variety PRJ1. Semi-quantitative PCR of different developmental stages of barnyard millet seeds showed increased levels of the transcript from 7 to 28 days. Sequence analysis revealed that it contained a 315 bp nucleotide sequence which encodes a 104 amino acid sequence with a molecular weight of 10.72 kDa. The predicted 3D structure of α-AI was 86.73% similar to a bifunctional inhibitor of ragi. In silico analysis of 71 α-AI protein sequences was carried out for biochemical features, homology search, multiple sequence alignment, phylogenetic tree construction, motif, and superfamily distribution of protein sequences. Analysis of the multiple sequence alignment revealed the existence of the conserved region NPLP[S/G]CRWYVV[S/Q][Q/R]TCG[V/I] throughout the sequences. Superfam analysis revealed that α-AI protein sequences were distributed among seven different superfamilies.
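The conserved region quoted in this abstract uses a bracket-and-slash notation for alternative residues, which maps directly onto a regular-expression character class. The sketch below shows one way to scan a protein sequence for it; the helper names and the test sequence are hypothetical and are not taken from the study.

```python
import re

# Conserved region exactly as written in the abstract.
MOTIF = "NPLP[S/G]CRWYVV[S/Q][Q/R]TCG[V/I]"

def motif_to_regex(motif):
    """Turn '[S/G]'-style alternatives into regex character classes '[SG]'."""
    def strip_slashes(match):
        return "[" + match.group(1).replace("/", "") + "]"
    return re.compile(re.sub(r"\[([A-Z/]+)\]", strip_slashes, motif))

def find_motif(sequence, pattern):
    """Return (start index, matched text) for every non-overlapping hit."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]

pattern = motif_to_regex(MOTIF)

# Hypothetical sequence built for illustration only.
toy_sequence = "MKNPLPGCRWYVVSRTCGIAA"
hits = find_motif(toy_sequence, pattern)
```

Here `hits` holds one match starting at index 2; a sequence lacking the region returns an empty list.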
Gene order in rosid phylogeny, inferred from pairwise syntenies among extant genomes
2012-01-01
Background Ancestral gene order reconstruction for flowering plants has lagged behind developments in yeasts, insects and higher animals, because of the recency of widespread plant genome sequencing, sequencers' embargoes on public data use, paralogies due to whole genome duplication (WGD) and fractionation of undeleted duplicates, extensive paralogy from other sources, and the computational cost of existing methods. Results We address these problems, using the gene order of four core eudicot genomes (cacao, castor bean, papaya and grapevine) that have escaped any recent WGD events, and two others (poplar and cucumber) that descend from independent WGDs, in inferring the ancestral gene order of the rosid clade and those of its main subgroups, the fabids and malvids. We improve and adapt techniques including the OMG method for extracting large, paralogy-free, multiple orthologies from conflated pairwise synteny data among the six genomes and the PATHGROUPS approach for ancestral gene order reconstruction in a given phylogeny, where some genomes may be descendants of WGD events. We use the gene order evidence to evaluate the hypothesis that the order Malpighiales belongs to the malvids rather than as traditionally assigned to the fabids. Conclusions Gene orders of ancestral eudicot species, involving 10,000 or more genes can be reconstructed in an efficient, parsimonious and consistent way, despite paralogies due to WGD and other processes. Pairwise genomic syntenies provide appropriate input to a parameter-free procedure of multiple ortholog identification followed by gene-order reconstruction in solving instances of the "small phylogeny" problem. PMID:22759433
de Knegt, Leonardo V; Pires, Sara M; Löfström, Charlotta; Sørensen, Gitte; Pedersen, Karl; Torpdahl, Mia; Nielsen, Eva M; Hald, Tine
2016-03-01
Salmonella is an important cause of bacterial foodborne infections in Denmark. To identify the main animal-food sources of human salmonellosis, risk managers have relied on a routine application of a microbial subtyping-based source attribution model since 1995. In 2013, multiple locus variable number tandem repeat analysis (MLVA) substituted phage typing as the subtyping method for surveillance of S. Enteritidis and S. Typhimurium isolated from animals, food, and humans in Denmark. The purpose of this study was to develop a modeling approach applying a combination of serovars, MLVA types, and antibiotic resistance profiles for the Salmonella source attribution, and assess the utility of the results for the food safety decisionmakers. Full and simplified MLVA schemes from surveillance data were tested, and model fit and consistency of results were assessed using statistical measures. We conclude that loci schemes STTR5/STTR10/STTR3 for S. Typhimurium and SE9/SE5/SE2/SE1/SE3 for S. Enteritidis can be used in microbial subtyping-based source attribution models. Based on the results, we discuss that an adjustment of the discriminatory level of the subtyping method applied often will be required to fit the purpose of the study and the available data. The issues discussed are also considered highly relevant when applying, e.g., extended multi-locus sequence typing or next-generation sequencing techniques. © 2015 Society for Risk Analysis.
Miklós, István
2009-01-01
Homologous genes originate from a common ancestor through vertical inheritance, duplication, or horizontal gene transfer. Entire homolog families spawned by a single ancestral gene can be identified across multiple genomes based on protein sequence similarity. The sequences, however, do not always reveal conclusively the history of large families. To study the evolution of complete gene repertoires, we propose here a mathematical framework that does not rely on resolved gene family histories. We show that so-called phylogenetic profiles, formed by family sizes across multiple genomes, are sufficient to infer principal evolutionary trends. The main novelty in our approach is an efficient algorithm to compute the likelihood of a phylogenetic profile in a model of birth-and-death processes acting on a phylogeny. We examine known gene families in 28 archaeal genomes using a probabilistic model that involves lineage- and family-specific components of gene acquisition, duplication, and loss. The model enables us to consider all possible histories when inferring statistics about archaeal evolution. According to our reconstruction, most lineages are characterized by a net loss of gene families. Major increases in gene repertoire have occurred only a few times. Our reconstruction underlines the importance of persistent streamlining processes in shaping genome composition in Archaea. It also suggests that early archaeal genomes were as complex as typical modern ones, and even show signs, in the case of the methanogenic ancestor, of an extremely large gene repertoire. PMID:19570746
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tai, Lin-Ru; Chou, Chang-Wei; Lee, I-Fang
In this study, we used a multiple copy (EGFP)₃ reporter system to establish a numeric nuclear index system to assess the degree of nuclear import. The system was first validated by a FRAP assay, and then was applied to evaluate the essential and multifaceted nature of basic amino acid clusters during the nuclear import of ribosomal protein L7. The results indicate that the sequence context of the basic cluster determines the degree of nuclear import, and that the number of basic residues in the cluster is irrelevant; rather the position of the pertinent basic residues is crucial. Moreover, it was also found that the type of carrier protein used by the basic cluster has a great impact on the degree of nuclear import. In the case of L7, importin β2 or importin β3 are preferentially used by clusters with a high import efficiency, notwithstanding that other importins are also used by clusters with a weaker level of nuclear import. Such a preferential usage of multiple basic clusters and importins to gain nuclear entry would seem to be a common practice among ribosomal proteins in order to ensure their full participation in high rate ribosome synthesis. - Highlights: ► We introduce a numeric index system that represents the degree of nuclear import. ► The rate of nuclear import is dictated by the sequence context of the basic cluster. ► Importin β2 and β3 were mainly responsible for the N4 mediated nuclear import.
Xiong, Kan; Asher, Sanford A
2010-01-01
We used CD and UV resonance Raman spectroscopy to study the impact of alcohols on the conformational equilibria and relative Gibbs free energy landscapes along the Ramachandran Ψ-coordinate of a mainly poly-ala peptide, AP, of sequence AAAAA(AAARA)₃A. 2,2,2-trifluoroethanol (TFE) most stabilizes the α-helical-like conformations, followed by ethanol, methanol and pure water. The π-bulge conformation is stabilized more than the α-helix, while the 3₁₀-helix is destabilized due to the alcohol-increased hydrophobicity. Turns are also stabilized by alcohols. We also found that while TFE induces more α-helices, it favors multiple, shorter helix segments. PMID:20225890
Gaia's view of the λ Boo star puzzle
NASA Astrophysics Data System (ADS)
Murphy, Simon J.; Paunzen, Ernst
2017-04-01
The evolutionary status of the chemically peculiar class of λ Boo stars has been intensely debated. It is now agreed that the λ Boo phenomenon affects A stars of all ages, from star formation to the terminal age main sequence, but the cause of the chemical peculiarity is still a puzzle. We revisit the debate of their ages and temperatures in order to shed light on the phenomenon, using the new parallaxes in Gaia Data Release 1 with existing Hipparcos parallaxes and multicolour photometry. We find that no single formation mechanism is able to explain all the observations, and suggest that there are multiple channels producing λ Boo spectra. The relative importance of these channels varies with age, temperature and environment.
A pipelined FPGA implementation of an encryption algorithm based on genetic algorithm
NASA Astrophysics Data System (ADS)
Thirer, Nonel
2013-05-01
With the evolution of digital data storage and exchange, it is essential to protect confidential information from unauthorized access. High performance encryption algorithms have been developed and implemented in software and hardware. Many methods to attack the cipher text have also been developed. In recent years, the genetic algorithm has gained much interest in the cryptanalysis of cipher texts and also in encryption ciphers. This paper analyses the possibility of using the genetic algorithm as a multiple key sequence generator for an AES (Advanced Encryption Standard) cryptographic system, and also of using a three-stage pipeline (with four main blocks: Input data, AES Core, Key generator, Output data) to provide fast encryption and storage/transmission of a large amount of data.
Reaction schemes visualized in network form: the syntheses of strychnine as an example.
Proudfoot, John R
2013-05-24
Representation of synthesis sequences in a network form provides an effective method for the comparison of multiple reaction schemes and an opportunity to emphasize features such as reaction scale that are often relegated to experimental sections. An example of data formatting that allows construction of network maps in Cytoscape is presented, along with maps that illustrate the comparison of multiple reaction sequences, comparison of scaffold changes within sequences, and consolidation to highlight common key intermediates used across sequences. The 17 different synthetic routes reported for strychnine are used as an example basis set. The reaction maps presented required significant data extraction and curation, and a standardized tabular format for reporting reaction information, if applied in a consistent way, could allow the automated combination of reaction information across different sources.
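The idea of a tabular format that can be turned into a network map can be sketched as a source/target edge table with a route label per row. The routes and compound names below are hypothetical placeholders, not actual strychnine chemistry, and the column layout is an assumption rather than the format used in the paper.

```python
from collections import defaultdict

# Hypothetical routes: ordered lists of placeholder compound names.
ROUTES = {
    "route_A": ["start_1", "int_X", "int_K", "target"],
    "route_B": ["start_2", "int_Y", "int_K", "target"],
}

def edge_table(routes):
    """One (source, target, route) row per reaction step."""
    rows = []
    for name, steps in routes.items():
        for source, target in zip(steps, steps[1:]):
            rows.append((source, target, name))
    return rows

def shared_intermediates(routes):
    """Intermediates (excluding first and last compound) used by 2+ routes."""
    used_by = defaultdict(set)
    for name, steps in routes.items():
        for compound in steps[1:-1]:
            used_by[compound].add(name)
    return sorted(c for c, names in used_by.items() if len(names) > 1)

edges = edge_table(ROUTES)
common = shared_intermediates(ROUTES)

# Tab-delimited text of the kind a network tool could import as an edge list.
table_text = "\n".join("\t".join(row) for row in [("source", "target", "route")] + edges)
```

Keeping the route name as an edge attribute lets routes that converge on the same node stay distinguishable, which is what makes common key intermediates stand out once the table is drawn as a graph.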
Skeleton-based human action recognition using multiple sequence alignment
NASA Astrophysics Data System (ADS)
Ding, Wenwen; Liu, Kai; Cheng, Fei; Zhang, Jin; Li, YunSong
2015-05-01
Human action recognition and analysis has been an active research topic in computer vision for many years. This paper presents a method to represent human actions based on trajectories consisting of 3D joint positions. This method first decomposes an action into a sequence of meaningful atomic actions (actionlets), and then labels the actionlets with letters of the English alphabet according to the Davies-Bouldin index value. Therefore, an action can be represented using a sequence of actionlet symbols, which preserves the temporal order of occurrence of each of the actionlets. Finally, we employ sequence comparison to classify multiple actions using a string matching algorithm (Needleman-Wunsch). The effectiveness of the proposed method is evaluated on datasets captured by commodity depth cameras. Experiments of the proposed method on three challenging 3D action datasets show promising results.
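The classification step can be illustrated with a minimal Needleman-Wunsch global alignment score over actionlet strings. The scoring values (+1 match, -1 mismatch, -1 gap), the nearest-template rule and the template strings are assumptions made for illustration; the abstract does not specify the scoring scheme or classifier details.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of strings a and b (dynamic programming)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]

def classify(query, templates):
    """Label of the template string with the highest alignment score."""
    return max(templates, key=lambda label: needleman_wunsch(query, templates[label]))

# Hypothetical actionlet strings, one template per action class.
TEMPLATES = {"wave": "ABAB", "sit": "CCDD"}
predicted = classify("ABB", TEMPLATES)
```

Because the alignment allows gaps, a query that drops or repeats an actionlet can still score well against the right template, which suits actions performed at different speeds.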
Andersson, P; Klein, M; Lilliebridge, R A; Giffard, P M
2013-09-01
Ultra-deep Illumina sequencing was performed on whole genome amplified DNA derived from a Chlamydia trachomatis-positive vaginal swab. Alignment of reads with reference genomes allowed robust SNP identification from the C. trachomatis chromosome and plasmid. This revealed that the C. trachomatis in the specimen was very closely related to the sequenced urogenital, serovar F, clade T1 isolate F-SW4. In addition, high genome-wide coverage was obtained for Prevotella melaninogenica, Gardnerella vaginalis, Clostridiales genomosp. BVAB3 and Mycoplasma hominis. This illustrates the potential of metagenome data to provide high resolution bacterial typing data from multiple taxa in a diagnostic specimen. ©2013 The Authors Clinical Microbiology and Infection ©2013 European Society of Clinical Microbiology and Infectious Diseases.
NASA Astrophysics Data System (ADS)
Omar, M. A.; Parvataneni, R.; Zhou, Y.
2010-09-01
The manuscript describes the implementation of a two-step processing procedure composed of self-referencing and Principal Component Thermography (PCT). The combined approach enables the processing of thermograms from transient (flash), steady (halogen) and selective (induction) thermal perturbations. First, the research discusses the three basic processing schemes typically applied in thermography, namely mathematical-transformation-based processing, curve-fitting processing, and direct contrast-based calculations. The proposed algorithm uses the self-referencing scheme to create a sub-sequence that contains the maximum contrast information and to compute the anomalies' depth values. Principal Component Thermography then operates on the sub-sequence frames by rearranging their data content (pixel values) spatially and temporally and highlighting the data variance. The PCT is mainly used as a mathematical means to enhance the defects' contrast, thus enabling retrieval of their shape and size. The results show that the proposed combined scheme is effective in processing defects of multiple sizes in a sandwich steel structure in real time (<30 Hz) and with full spatial coverage, without the need for an a priori defect-free area.
Olsen, Anne Berit; Gulla, Snorre; Steinum, Terje; Colquhoun, Duncan J; Nilsen, Hanne K; Duchaud, Eric
2017-06-01
Skin ulcer development in sea-reared salmonids, commonly associated with Tenacibaculum spp., is a significant fish welfare and economic problem in Norwegian aquaculture. A collection of 89 Tenacibaculum isolates was subjected to multilocus sequence analysis (MLSA). The isolates were retrieved from outbreaks of clinical disease in farms spread along the Norwegian coastline, from seven different fish species, over a period of 19 years. MLSA reveals considerable genetic diversity, but allows identification of four main clades. One clade encompasses isolates belonging to the species T. dicentrarchi, whereas three clades encompass bacteria that likely represent novel, as yet undescribed species. The study identified T. maritimum in lumpsucker and T. ovolyticum in halibut, and has extended the host and geographic range for T. soleae, isolated from wrasse. The overall lack of clonality and host specificity, with some indication of geographical range restriction, argues for local epidemics involving multiple strains. The diversity of Tenacibaculum isolates from fish displaying ulcerative disease may complicate vaccine development. Copyright © 2017 Elsevier B.V. All rights reserved.
Efficient genome editing of differentiated renal epithelial cells.
Hofherr, Alexis; Busch, Tilman; Huber, Nora; Nold, Andreas; Bohn, Albert; Viau, Amandine; Bienaimé, Frank; Kuehn, E Wolfgang; Arnold, Sebastian J; Köttgen, Michael
2017-02-01
Recent advances in genome editing technologies have enabled the rapid and precise manipulation of genomes, including the targeted introduction, alteration, and removal of genomic sequences. However, respective methods have been described mainly in non-differentiated or haploid cell types. Genome editing of well-differentiated renal epithelial cells has been hampered by a range of technological issues, including optimal design, efficient expression of multiple genome editing constructs, attainable mutation rates, and best screening strategies. Here, we present an easily implementable workflow for the rapid generation of targeted heterozygous and homozygous genomic sequence alterations in renal cells using transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeat (CRISPR) system. We demonstrate the versatility of established protocols by generating novel cellular models for studying autosomal dominant polycystic kidney disease (ADPKD). Furthermore, we show that cell culture-validated genetic modifications can be readily applied to mouse embryonic stem cells (mESCs) for the generation of corresponding mouse models. The described procedure for efficient genome editing can be applied to any cell type to study physiological and pathophysiological functions in the context of precisely engineered genotypes.
Scaling similarities of multiple fracturing of solid materials
NASA Astrophysics Data System (ADS)
Kapiris, P. G.; Balasis, G. T.; Kopanas, J. A.; Antonopoulos, G. N.; Peratzakis, A. S.; Eftaxias, K. A.
2004-02-01
It has recently been reported that electromagnetic flashes of low-energy gamma-rays emitted during multi-fracturing on a neutron star, and electromagnetic pulses emitted in the laboratory by a disordered material subjected to an increasing external load, share distinctive statistical properties with earthquakes, such as power-law energy distributions (Cheng et al., 1996; Kossobokov et al., 2000; Rabinovitch et al., 2001; Sornette and Helmstetter, 2002). The neutron starquakes may release strain energies up to 10^46 erg, while the fractures in laboratory samples release strain energies of approximately a fraction of an erg. An earthquake fault region can build up strain energy up to approximately 10^26 erg for the strongest earthquakes. Clear sequences of kilohertz-megahertz electromagnetic avalanches have been detected from a few days up to a few hours prior to recent destructive earthquakes in Greece. A question that naturally arises is whether the pre-seismic electromagnetic fluctuations also share the same statistical properties. Our study justifies a positive answer. Our analysis also reveals "symptoms" of a transition to the main rupture common with earthquake sequences and acoustic emission pulses observed during laboratory experiments (Maes et al., 1998).
Probabilistic models of eukaryotic evolution: time for integration
Lartillot, Nicolas
2015-01-01
In spite of substantial work and recent progress, a global and fully resolved picture of the macroevolutionary history of eukaryotes is still under construction. This concerns not only the phylogenetic relations among major groups, but also the general characteristics of the underlying macroevolutionary processes, including the patterns of gene family evolution associated with endosymbioses, as well as their impact on the sequence evolutionary process. All these questions raise formidable methodological challenges, calling for a more powerful statistical paradigm. In this direction, model-based probabilistic approaches have played an increasingly important role. In particular, improved models of sequence evolution accounting for heterogeneities across sites and across lineages have led to significant, although insufficient, improvement in phylogenetic accuracy. More recently, one main trend has been to move away from simple parametric models and stepwise approaches, towards integrative models explicitly considering the intricate interplay between multiple levels of macroevolutionary processes. Such integrative models are in their infancy, and their application to the phylogeny of eukaryotes still requires substantial improvement of the underlying models, as well as additional computational developments. PMID:26323768
Stellar Multiplicity Meets Stellar Evolution and Metallicity: The APOGEE View
NASA Astrophysics Data System (ADS)
Badenes, Carles; Mazzola, Christine; Thompson, Todd A.; Covey, Kevin; Freeman, Peter E.; Walker, Matthew G.; Moe, Maxwell; Troup, Nicholas; Nidever, David; Allende Prieto, Carlos; Andrews, Brett; Barbá, Rodolfo H.; Beers, Timothy C.; Bovy, Jo; Carlberg, Joleen K.; De Lee, Nathan; Johnson, Jennifer; Lewis, Hannah; Majewski, Steven R.; Pinsonneault, Marc; Sobeck, Jennifer; Stassun, Keivan G.; Stringfellow, Guy S.; Zasowski, Gail
2018-02-01
We use the multi-epoch radial velocities acquired by the Apache Point Observatory Galactic Evolution Experiment (APOGEE) survey to perform a large-scale statistical study of stellar multiplicity for field stars in the Milky Way, spanning the evolutionary phases between the main sequence (MS) and the red clump. We show that the distribution of maximum radial velocity shifts (ΔRVmax) for APOGEE targets is a strong function of log g, with MS stars showing ΔRVmax as high as ∼300 km s⁻¹, and steadily dropping down to ∼30 km s⁻¹ for log g ∼ 0, as stars climb up the red giant branch (RGB). Red clump stars show a distribution of ΔRVmax values comparable to that of stars at the tip of the RGB, implying they have similar multiplicity characteristics. The observed attrition of high ΔRVmax systems in the RGB is consistent with a lognormal period distribution in the MS and a multiplicity fraction of 0.35, which is truncated at an increasing period as stars become physically larger and undergo mass transfer after Roche Lobe overflow during H-shell burning. The ΔRVmax distributions also show that the multiplicity characteristics of field stars are metallicity-dependent, with metal-poor ([Fe/H] ≲ −0.5) stars having a multiplicity fraction a factor of 2–3 higher than metal-rich ([Fe/H] ≳ 0.0) stars. This has profound implications for the formation rates of interacting binaries observed by astronomical transient surveys and gravitational wave detectors, as well as the habitability of circumbinary planets.
An RNAi in silico approach to find an optimal shRNA cocktail against HIV-1
2010-01-01
Background HIV-1 can be inhibited by RNA interference in vitro through the expression of short hairpin RNAs (shRNAs) that target conserved genome sequences. In silico shRNA design for HIV has lacked a detailed study of virus variability constituting a possible breaking point in a clinical setting. We designed shRNAs against HIV-1 considering the variability observed in naïve and drug-resistant isolates available at public databases. Methods A Bioperl-based algorithm was developed to automatically scan multiple sequence alignments of HIV, while evaluating the possibility of identifying dominant and subdominant viral variants that could be used as efficient silencing molecules. Student t-test and Bonferroni Dunn correction test were used to assess statistical significance of our findings. Results Our in silico approach identified the most common viral variants within highly conserved genome regions, with a calculated free energy of ≥ -6.6 kcal/mol. This is crucial for strand loading to RISC complex and for a predicted silencing efficiency score, which could be used in combination for achieving over 90% silencing. Resistant and naïve isolate variability revealed that the most frequent shRNA per region targets a maximum of 85% of viral sequences. Adding more divergent sequences maintained this percentage. Specific sequence features that have been found to be related with higher silencing efficiency were hardly accomplished in conserved regions, even when lower entropy values correlated with better scores. We identified a conserved region among most HIV-1 genomes, which meets as many sequence features for efficient silencing. Conclusions HIV-1 variability is an obstacle to achieving absolute silencing using shRNAs designed against a consensus sequence, mainly because there are many functional viral variants. Our shRNA cocktail could be truly effective at silencing dominant and subdominant naïve viral variants. 
Additionally, resistant isolates might be targeted under specific antiretroviral selective pressure, but in both cases these should be tested exhaustively prior to clinical use. PMID:21172023
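The scan over an alignment described in the Methods can be sketched roughly as follows. The original work used a Bioperl-based algorithm; this Python version is only an illustration under stated assumptions: sequences are already aligned to equal length, and the function names (`column_entropy`, `window_variants`, `coverage`) and the toy alignment are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the characters in one alignment column."""
    total = len(column)
    counts = Counter(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def window_variants(alignment, start, length):
    """Count the distinct target-site variants found in one alignment window."""
    return Counter(seq[start:start + length] for seq in alignment)


def coverage(alignment, start, length, top=1):
    """Fraction of sequences matched exactly by the `top` most frequent variants."""
    variants = window_variants(alignment, start, length)
    covered = sum(count for _, count in variants.most_common(top))
    return covered / len(alignment)


# Toy alignment: three sequences share the dominant variant, one differs.
alignment = ["ACGT", "ACGT", "ACGA", "ACGT"]
dominant_only = coverage(alignment, 0, 4, top=1)   # dominant variant alone
with_subdominant = coverage(alignment, 0, 4, top=2)  # dominant plus subdominant
```

Raising `top` mimics adding subdominant variants to a cocktail: coverage can only stay the same or increase, while low column entropy flags the conserved windows worth scanning first.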
Citrus and Prunus copia-like retrotransposons.
Asíns, M J; Monforte, A J; Mestre, P F; Carbonell, E A
1999-08-01
Many of the world's most important citrus cultivars ("Washington Navel", satsumas, clementines) have arisen through somatic mutation. This phenomenon occurs fairly often in the various species and varieties of the genus. The presence of copia-like retrotransposons has been investigated in fruit trees, especially citrus, by using a PCR assay designed to detect copia-like reverse transcriptase (RT) sequences. Amplification products from a genotype of each of the following species, Citrus sinensis, Citrus grandis, Citrus clementina, Prunus armeniaca and Prunus amygdalus, were cloned and some of them sequenced. Southern-blot hybridization using RT clones as probes showed that multiple copies are integrated throughout the citrus genome, while only 1-3 copies are detected in the P. armeniaca genome, which is in accordance with the Citrus and Prunus genome sizes. Sequence analysis of RT clones allowed a search for homologous sequences within three gene banks. The most similar ones correspond to RT domains of copia-like retrotransposons from unrelated plant species. Cluster analysis of these sequences has shown great heterogeneity among RT domains cloned from the same genotype. This finding supports the hypothesis that horizontal transmission of retrotransposons has occurred in the past. The species presenting an RT sequence most similar to citrus RT clones is Gnetum montanum, a gymnosperm whose distribution area coincides with two of the main centers of origin of Citrus spp. A new C-methylated restriction DNA fragment containing an RT sequence is present in navel sweet oranges, but not in the Valencia oranges from which the former originated, suggesting that retrotransposon activity might be, at least in part, involved in the genetic variability among sweet orange cultivars. 
Given that retrotransposons are quite abundant throughout the citrus genome, their activity should be investigated thoroughly before commercializing any transgenic citrus plant where the transgene(s) is part of a viral genome in order to avoid its possible recombination with an active retroelement. Focusing on other strategies to control virus diseases is recommended in citrus.
Ranwez, Vincent
2016-01-01
Multiple sequence alignment (MSA) is a crucial step in many molecular analyses, and many MSA tools have been developed. Most of them use a greedy approach to construct a first alignment that is then refined by optimizing the sum-of-pairs score (SP-score). The SP-score estimation is thus a bottleneck for most MSA tools, since it is repeatedly required and is time consuming. Given an alignment of n sequences and L sites, I introduce here optimized solutions, which are easy to implement, that reach O(nL) time complexity for affine gap cost instead of O(n^2 L).
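The gap between O(n^2 L) and O(nL) can be illustrated with a deliberately simplified sketch. It assumes a flat match/mismatch score and a fixed per-position gap penalty rather than the affine gap cost treated in the paper, so it is not the author's algorithm; the function names and scoring values are hypothetical. The idea shown is only that per-column character counts let the pairwise sum be computed without enumerating pairs.

```python
from collections import Counter
from itertools import combinations


def sp_score_naive(aln, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score by enumerating every pair of rows: O(n^2 L)."""
    total = 0
    for s, t in combinations(aln, 2):
        for x, y in zip(s, t):
            if x == "-" and y == "-":
                continue  # gap-gap pairs are ignored
            if x == "-" or y == "-":
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


def sp_score_counts(aln, match=1, mismatch=-1, gap=-2):
    """Same score from per-column counts: O(nL) for a fixed alphabet."""
    n = len(aln)
    total = 0
    for column in zip(*aln):
        counts = Counter(column)
        gaps = counts.pop("-", 0)
        residues = n - gaps
        # Pairs of identical residues, then all remaining residue pairs.
        same = sum(c * (c - 1) // 2 for c in counts.values())
        diff = residues * (residues - 1) // 2 - same
        total += match * same + mismatch * diff + gap * gaps * residues
    return total


aln = ["AC-T", "ACGT", "A-GA"]
score = sp_score_counts(aln)
```

Both functions return the same value on any alignment with equal-length rows; the counting version touches each cell once, which is what makes repeated scoring during refinement affordable.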
Reneker, Jeff; Shyu, Chi-Ren; Zeng, Peiyu; Polacco, Joseph C.; Gassmann, Walter
2004-01-01
We have developed a web server for the life sciences community to use to search for short repeats of DNA sequence of length between 3 and 10 000 bases within multiple species. This search employs a unique and fast hash function approach. Our system also applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. Furthermore, we have incorporated a part of the Gene Ontology database into our information retrieval algorithms to broaden the coverage of the search. Our web server and tutorial can be found at http://acmes.rnet.missouri.edu. PMID:15215469
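The abstract does not specify the hash function used by the server, so the following is only a generic sketch of hash-based repeat finding: every substring of length k is used as a dictionary key, and keys seen at more than one position are reported. The function name `find_repeats` and its parameters are hypothetical.

```python
from collections import defaultdict


def find_repeats(seq, k, min_count=2):
    """Map each length-k substring occurring at least min_count times to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        # Python hashes the substring to locate its bucket in the dictionary.
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_count}


repeats = find_repeats("ACGTACGTTT", 4)
```

Running the same function on several genomes and intersecting the returned keys would give a crude view of repeats conserved across species, in the spirit of the cross-species search the abstract describes.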
ERIC Educational Resources Information Center
Fogarty, Ian; Geelan, David
2013-01-01
Students in 4 Canadian high school physics classes completed instructional sequences in two key physics topics related to motion--Straight Line Motion and Newton's First Law. Different sequences of laboratory investigation, teacher explanation (lecture) and the use of computer-based scientific visualizations (animations and simulations) were…
High-Throughput resequencing of maize landraces at genomic regions associated with flowering time
USDA-ARS's Scientific Manuscript database
Despite the reduction in the price of sequencing, it remains expensive to sequence and assemble whole, complex genomes of multiple samples for population studies, particularly for large genomes like those of many crop species. Enrichment of target genome regions coupled with next generation sequenci...
Multiplicity among Solar-type Stars
NASA Astrophysics Data System (ADS)
Fuhrmann, K.; Chini, R.; Kaderhandt, L.; Chen, Z.
2017-02-01
We present a multiplicity census for a volume-complete all-sky survey of 422 stars with distances less than 25 pc and primary main-sequence effective temperatures T eff ≥ 5300 K. Very similar to previous results that have been presented for various subsets of this survey, we confirm the positive correlation of the stellar multiplicities with primary mass. We find for the F- and G-type Population I stars that 58% are non-single and 21% are in triple or higher level systems. For the old intermediate-disk and Population II stars—virtually all of G type and less massive—even two out of three sources prove to be non-single. These numbers being lower limits because of the continuous flow of new discoveries, the unbiased survey clearly demonstrates that the standard case for solar-type field stars is a hydrogen-burning source with at least one ordinary or degenerate stellar companion, and a surprisingly large number of stars are organized in multiple systems. A principal consequence is that orbital evolution, including the formation of blue straggler stars, is a potentially important issue on all spatial scales and timescales for a significant percentage of the stellar systems, in particular among Population II stars. We discuss a number of recent observations of known or suspected companions in the local survey, including a new detection of a double-lined Ba-Bb subsystem to the visual binary HR 8635.
Pastor, D; Amaya, W; García-Olcina, R; Sales, S
2007-07-01
We present a simple theoretical model, with experimental verification, of the vanishing of the autocorrelation peak caused by wavelength detuning in the coding-decoding process of coherent direct sequence optical code multiple access systems based on a superstructured fiber Bragg grating. Moreover, the detuning-induced vanishing effect has been explored as a means of providing an additional degree of multiplexing and/or optical code tuning.
Initial genome sequencing and analysis of multiple myeloma
Chapman, Michael A.; Lawrence, Michael S.; Keats, Jonathan J.; Cibulskis, Kristian; Sougnez, Carrie; Schinzel, Anna C.; Harview, Christina L.; Brunet, Jean-Philippe; Ahmann, Gregory J.; Adli, Mazhar; Anderson, Kenneth C.; Ardlie, Kristin G.; Auclair, Daniel; Baker, Angela; Bergsagel, P. Leif; Bernstein, Bradley E.; Drier, Yotam; Fonseca, Rafael; Gabriel, Stacey B.; Hofmeister, Craig C.; Jagannath, Sundar; Jakubowiak, Andrzej J.; Krishnan, Amrita; Levy, Joan; Liefeld, Ted; Lonial, Sagar; Mahan, Scott; Mfuko, Bunmi; Monti, Stefano; Perkins, Louise M.; Onofrio, Robb; Pugh, Trevor J.; Vincent Rajkumar, S.; Ramos, Alex H.; Siegel, David S.; Sivachenko, Andrey; Trudel, Suzanne; Vij, Ravi; Voet, Douglas; Winckler, Wendy; Zimmerman, Todd; Carpten, John; Trent, Jeff; Hahn, William C.; Garraway, Levi A.; Meyerson, Matthew; Lander, Eric S.; Getz, Gad; Golub, Todd R.
2013-01-01
Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumor genomes and their comparison to matched normal DNAs. Several new and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the dataset. These include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB signaling was suggested by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will yield new insights into cancer not anticipated by existing knowledge. PMID:21430775
Inferring human population size and separation history from multiple genome sequences.
Schiffels, Stephan; Durbin, Richard
2014-08-01
The availability of complete human genome sequences from populations across the world has given rise to new population genetic inference methods that explicitly model ancestral relationships under recombination and mutation. So far, application of these methods to evolutionary history more recent than 20,000-30,000 years ago and to population separations has been limited. Here we present a new method that overcomes these shortcomings. The multiple sequentially Markovian coalescent (MSMC) analyzes the observed pattern of mutations in multiple individuals, focusing on the first coalescence between any two individuals. Results from applying MSMC to genome sequences from nine populations across the world suggest that the genetic separation of non-African ancestors from African Yoruban ancestors started long before 50,000 years ago and give information about human population history as recent as 2,000 years ago, including the bottleneck in the peopling of the Americas and separations within Africa, East Asia and Europe.
The 2016 Mihoub (north-central Algeria) earthquake sequence: Seismological and tectonic aspects
NASA Astrophysics Data System (ADS)
Khelif, M. F.; Yelles-Chaouche, A.; Benaissa, Z.; Semmane, F.; Beldjoudi, H.; Haned, A.; Issaadi, A.; Chami, A.; Chimouni, R.; Harbi, A.; Maouche, S.; Dabbouz, G.; Aidi, C.; Kherroubi, A.
2018-06-01
On 28 May 2016 at 23:54 (UTC), an Mw5.4 earthquake occurred in Mihoub village, Algeria, 60 km southeast of Algiers. This earthquake was the largest event in a sequence recorded from 10 April to 15 July 2016. In addition to the permanent national network, a temporary network was installed in the epicentral region after this shock. Recorded event locations allow us to give a general overview of the sequence and reveal the existence of two main fault segments. The first segment, on which the first event in the sequence was located, is near-vertical and trends E-W. The second fault plane, on which the largest event of the sequence was located, dips to the southeast and strikes NE-SW. A total of 46 well-constrained focal mechanisms were calculated. The events located on the E-W-striking fault segment show mainly right-lateral strike-slip (strike N70°E, dip 77° to the SSE, rake 150°). The events located on the NE-SW-striking segment show mainly reverse faulting (strike N60°E, dip 70° to the SE, rake 130°). We calculated the static stress change caused by the first event (Md4.9) of the sequence; the result shows that the fault plane of the largest event in the sequence (Mw5.4) and most of the aftershocks occurred within an area of increased Coulomb stress. Moreover, using the focal mechanisms calculated in this work, we estimated the orientations of the main axes of the local stress tensor ellipsoid. The results confirm previous findings that the general stress field in this area shows orientations aligned NNW-SSE to NW-SE. The 2016 Mihoub earthquake sequence study thus improves our understanding of seismic hazard in north-central Algeria.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seaver, L.H.; Grimes, J.; Erickson, R.P.
1994-05-15
46,XX female pseudohermaphrodites have been previously described with nearly complete masculinization of the external genitalia and no apparent source of testosterone. Multiple malformations of the internal genital, urinary, and gastrointestinal tracts are associated. We have evaluated four such infants with female pseudohermaphroditism and multiple caudal anomalies. Three cases had apparently normal chromosomes (46,XX); one had a 46,XX,del(10)(q25.3→qter) chromosome constitution. The chromosome breakpoint is in the region of PAX2, a developmentally important paired box gene which is expressed in urogenital tissue. Using the polymerase chain reaction, we screened for the presence of multiple Y-specific sequences, including SRY (sex determining region, Y chromosome), that could explain masculinization of the external genitalia. All were negative for Y centromeric sequences, ZFY (Zinc finger Y), and SRY. Furthermore, there was no evidence for adrenal or other sources of testosterone. We suggest that the masculinization in these cases is the result of abnormal expression of genes which would normally be regulated by testosterone. 32 refs., 1 fig., 2 tabs.
Alnajar, Seema; Gupta, Radhey S
2017-10-01
The family Enterobacteriaceae harbors many important pathogens; however, it has proven difficult to reliably distinguish different members of this family or discern their interrelationships. To understand the interrelationships among the Enterobacteriaceae species, we have constructed two comprehensive phylogenetic trees for 78 genome-sequenced Enterobacteriaceae species, one based on 2487 core genome proteins and the other on a set of 118 conserved proteins. The genome sequences of Enterobacteriaceae species were also analyzed for genetic relatedness based on average amino acid identity and 16S rRNA sequence similarity. In parallel, comparative genomic studies on protein sequences from the Enterobacteriaceae have identified 88 molecular markers in the form of conserved signature indels (CSIs) that are uniquely shared by specific members of the family. These multiple lines of investigation provide consistent evidence that most of the species/genera within the family can be assigned to 6 different subfamily-level clades, which are designated as the "Escherichia clade", "Klebsiella clade", "Enterobacter clade", "Kosakonia clade", "Cronobacter clade" and "Cedecea clade". The members of the six described clades, in addition to their distinct branching in phylogenetic trees, can now be reliably demarcated in molecular terms on the basis of multiple identified CSIs that are exclusively shared by the group members. Several additional CSIs identified in this work that are either specific for individual genera (viz. Kosakonia, Kluyvera and Escherichia-Shigella), or are present at various taxonomic depths, offer information regarding the interrelationships among the different clades. The described molecular markers provide novel means for diagnostic as well as genetic and biochemical studies on the Enterobacteriaceae species and for resolving the polyphyly of several of its genera, viz. Escherichia, Enterobacter and Kluyvera. 
On the basis of our results, we propose the reclassification of Escherichia vulneris and Enterobacter massiliensis into two novel genera, viz. Pseudescherichia gen. nov. and Metakosakonia gen. nov., respectively. Additionally, our results support the transfer of "Enterobacter lignolyticus" and "Kluyvera intestini" to the genera Pluralibacter and Metakosakonia, respectively. Copyright © 2017 Elsevier B.V. All rights reserved.
Multiple tag labeling method for DNA sequencing
Mathies, R.A.; Huang, X.C.; Quesada, M.A.
1995-07-25
A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.
Finding the Onset of Convection in Main Sequence Stars
NASA Technical Reports Server (NTRS)
Simon, Theodore
2003-01-01
The primary goal of the work performed under this grant was to locate, if possible, the onset of subphotospheric convection zones in normal main sequence stars by using the presence of emission in high temperature lines in far ultraviolet spectra from the FUSE spacecraft as a proxy for convection. The change in stellar structure represented by this boundary between radiative and convective stars has always been difficult to find by other empirical means. A search was conducted through observations of a sample of A-type stars, which were somewhat hotter and more massive than the Sun, and which were carefully chosen to bridge the theoretically expected radiative/convective boundary line along the main sequence.
Lithium abundances among solar-type pre-main-sequence stars
NASA Technical Reports Server (NTRS)
Strom, Karen M.; Wilkin, Francis P.; Strom, Stephen E.; Seaman, Robert L.
1989-01-01
Measurements of Li I 6707 A line strengths were carried out for two samples of pre-main-sequence (PMS) stars (L 1641 and Taurus-Auriga), and the Li abundances estimated for PMS stars are compared with those deduced from observations of Li line strengths for main-sequence stars in the Alpha Persei cluster. It was found that the maximum Li abundances among the PMS stars with masses greater than 1.0 solar mass exceed the maximum abundances for Alpha Per stars by at least 0.3 dex. Some PMS stars, including a few apparently young stars, showed large (greater than 1.0 dex) Li depletion, and some apparently old PMS stars showed little or no depletion.
Habitable zones around main sequence stars
NASA Technical Reports Server (NTRS)
Kasting, James F.; Whitmire, Daniel P.; Reynolds, Ray T.
1993-01-01
A mechanism for stabilizing climate on the earth and other earthlike planets is described, and the physical processes that define the inner and outer boundaries of the habitable zone (HZ) around the sun and main sequence stars are discussed. Physical constraints on the HZ obtained from Venus and Mars are taken into account. A 1D climate model is used to estimate the width of the HZ and the continuously habitable zone around the sun, and the analysis is extended to other main sequence stars. Whether other stars have planets and where such planets might be located with respect to the HZ is addressed. The implications of the findings for NASA's SETI project are considered.
Multiple-rotor-cycle 2D PASS experiments with applications to (207)Pb NMR spectroscopy.
Vogt, F G; Gibson, J M; Aurentz, D J; Mueller, K T; Benesi, A J
2000-03-01
The two-dimensional phase-adjusted spinning sidebands (2D PASS) experiment is a useful technique for simplifying magic-angle spinning (MAS) NMR spectra that contain overlapping or complicated spinning sideband manifolds. The pulse sequence separates spinning sidebands by their order in a two-dimensional experiment. The result is an isotropic/anisotropic correlation experiment, in which a sheared projection of the 2D spectrum effectively yields an isotropic spectrum with no sidebands. The original 2D PASS experiment works best at lower MAS speeds (1-5 kHz). At higher spinning speeds (8-12 kHz) the experiment requires higher RF power levels so that the pulses do not overlap. In the case of nuclei such as (207)Pb, a large chemical shift anisotropy often yields too many spinning sidebands to be handled by a reasonable 2D PASS experiment unless higher spinning speeds are used. Performing the experiment at these speeds requires fewer 2D rows and a correspondingly shorter experimental time. Therefore, we have implemented PASS pulse sequences that occupy multiple MAS rotor cycles, thereby avoiding pulse overlap. These multiple-rotor-cycle 2D PASS sequences are intended for use in high-speed MAS situations such as those required by (207)Pb. A version of the multiple-rotor-cycle 2D PASS sequence that uses composite pulses to suppress spectral artifacts is also presented. These sequences are demonstrated on (207)Pb test samples, including lead zirconate, a perovskite-phase compound that is representative of a large class of interesting materials. Copyright 2000 Academic Press.
Mallatt, Jon; Craig, Catherine Waggoner; Yoder, Matthew J
2010-04-01
This study (1) uses nearly complete rRNA-gene sequences from across Metazoa (197 taxa) to reconstruct animal phylogeny; (2) presents a highly annotated, manual alignment of these sequences with special reference to rRNA features including paired sites (http://purl.oclc.org/NET/rRNA/Metazoan_alignment) and (3) tests, after eliminating as few disruptive, rogue sequences as possible, if a likelihood framework can recover the main metazoan clades. We found that systematic elimination of approximately 6% of the sequences, including the divergent or unstably placed sequences of cephalopods, arrowworm, symphylan and pauropod myriapods, and of myzostomid and nemertodermatid worms, led to a tree that supported Ecdysozoa, Lophotrochozoa, Protostomia, and Bilateria. Deuterostomia, however, was never recovered, because the rRNA of urochordates goes (nonsignificantly) near the base of the Bilateria. Counterintuitively, when we modeled the evolution of the paired sites, phylogenetic resolution was not increased over traditional tree-building models that assume all sites in rRNA evolve independently. The rRNA genes of non-bilaterians contain a higher % AT than do those of most bilaterians. The rRNA genes of Acoela and Myzostomida were found to be secondarily shortened, AT-enriched, and highly modified, throwing some doubt on the location of these worms at the base of Bilateria in the rRNA tree--especially myzostomids, which other evidence suggests are annelids instead. Other findings are marsupial-with-placental mammals, arrowworms in Ecdysozoa (well supported here but contradicted by morphology), and Placozoa as sister to Cnidaria. Finally, despite the difficulties, the rRNA-gene trees are in strong concordance with trees derived from multiple protein-coding genes in supporting the new animal phylogeny. (c) 2009 Elsevier Inc. All rights reserved.
Phylogenetic and environmental diversity of DsrAB-type dissimilatory (bi)sulfite reductases
Müller, Albert Leopold; Kjeldsen, Kasper Urup; Rattei, Thomas; Pester, Michael; Loy, Alexander
2015-01-01
The energy metabolism of essential microbial guilds in the biogeochemical sulfur cycle is based on a DsrAB-type dissimilatory (bi)sulfite reductase that either catalyzes the reduction of sulfite to sulfide during anaerobic respiration of sulfate, sulfite and organosulfonates, or acts in reverse during sulfur oxidation. Common use of dsrAB as a functional marker showed that dsrAB richness in many environments is dominated by novel sequence variants and collectively represents an extensive, largely uncharted sequence assemblage. Here, we established a comprehensive, manually curated dsrAB/DsrAB database and used it to categorize the known dsrAB diversity, reanalyze the evolutionary history of dsrAB and evaluate the coverage of published dsrAB-targeted primers. Based on a DsrAB consensus phylogeny, we introduce an operational classification system for environmental dsrAB sequences that integrates established taxonomic groups with operational taxonomic units (OTUs) at multiple phylogenetic levels, ranging from DsrAB enzyme families that reflect reductive or oxidative DsrAB types of bacterial or archaeal origin, superclusters, uncultured family-level lineages to species-level OTUs. Environmental dsrAB sequences constituted at least 13 stable family-level lineages without any cultivated representatives, suggesting that major taxa of sulfite/sulfate-reducing microorganisms have not yet been identified. Three of these uncultured lineages occur mainly in marine environments, while specific habitat preferences are not evident for members of the other 10 uncultured lineages. In summary, our publicly available dsrAB/DsrAB database, the phylogenetic framework, the multilevel classification system and a set of recommended primers provide a necessary foundation for large-scale dsrAB ecology studies with next-generation sequencing methods. PMID:25343514
2011-01-01
Background Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results We report the integration of the whole genome sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were identified to be either in identity by descent (IBD) or in copy number variation (CNV) with results from SNP array genotyping. Coding insertions and deletions (indels) were found to be enriched for size in multiples of 3 and were located near the N- and C-termini of proteins. For larger indels, a combination of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found to be instrumental for calculating resequencing SNP accuracy, while SNP detection within CNVs tended to be less reliable. CNV discovery was affected dramatically by platform resolution and coverage biases. The combined data for this study showed that at a moderate level of sequencing coverage, an ensemble of platforms and tools can be applied together to maximize the accurate detection of sequence and structural variants. PMID:22082336
Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A
2009-01-01
Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. 
Conclusion The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species. PMID:19747386
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-08
.... Sponsor: Maine Chapter, Multiple Sclerosis Society. Date: August 21, 2010. Time: 11 am to 2 pm. Location... Sailboat Race. Sponsor: Maine Chapter, Multiple Sclerosis Society. Date: August 21, 2010. Time: 10 am to 4... Tugboat Muster. Event Type: Power Boat Race. Sponsor: Maine Chapter, National Multiple Sclerosis Society...
Lithium in lower-main-sequence stars of the Alpha Persei cluster
NASA Technical Reports Server (NTRS)
Balachandran, Suchitra; Lambert, David L.; Stauffer, John R.
1988-01-01
Lithium abundances are presented for main-sequence stars of spectral types F, G, and K in the young open cluster Alpha Per. For 46 cluster members, a correlation between Li abundance and projected rotational velocity v sin i is found: all of the Li-poor stars are slow rotators. Two explanations are proposed to account for the correlation: (1) that the Li depletion is introduced following a rapid spin-down phase experienced by young low-mass stars, and that this episode of Li depletion may be the dominant one determining the spread of Li abundances among young low-mass main-sequence stars, and (2) that star formation has occurred over a finite period such that the older stars have undergone a spin-down and depletion of Li by a means that may or may not depend on rotation. The Li abundance in the warm and rapidly rotating stars appears to be undepleted, as is predicted by recent models of pre-main-sequence stars. The depletion observed in the cool stars exceeds the level predicted by these models.
A sample of potential disk hosting first ascent red giants
NASA Astrophysics Data System (ADS)
Steele, Amy; Debes, John
2018-01-01
Observations of (sub)giants with planets and disks provide the first set of proof that disks can survive the first stages of post-main-sequence evolution, even though the disks are expected to dissipate by this time. The infrared (IR) excesses present around a number of post-main-sequence (PMS) stars could be due to a traditional debris disk with planets (e.g. kappa CrB), some remnant of enhanced mass loss (e.g. the shell-like structure of R Sculptoris), and/or background contamination. We present a sample of potential disk hosting first ascent red giants. These stars all have infrared excesses at 22 microns, and possibly host circumstellar debris. We summarize the characteristics of the sample to better inform the incidence rates of thermally emitting material around giant stars. A thorough follow-up study of these candidates would serve as the first step in probing the composition of the dust in these systems that have left the main sequence, providing clues to the degree of disk processing that occurs beyond the main-sequence.
78 FR 17613 - Special Local Regulations and Safety Zones; Recurring Events in Northern New England
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-22
... Multiple Sclerosis Event Type: Regatta and Sailboat Regatta. Race Sponsor: Maine Chapter, Multiple...]13'51'' W 8.7 Multiple Sclerosis Event Type: Power Boat Race Harborfest Lobster Boat/ Sponsor: Maine Chapter, National Tugboat Races. Multiple Sclerosis Society [[Page 17619
Presupernova Evolution of Differentially Rotating Massive Stars Including Magnetic Fields
NASA Astrophysics Data System (ADS)
Heger, A.; Woosley, S. E.; Spruit, H. C.
2005-06-01
As a massive star evolves through multiple stages of nuclear burning on its way to becoming a supernova, a complex, differentially rotating structure is set up. Angular momentum is transported by a variety of classic instabilities and also by magnetic torques from fields generated by the differential rotation. We present the first stellar evolution calculations to follow the evolution of rotating massive stars including, at least approximately, all these effects, magnetic and nonmagnetic, from the zero-age main sequence until the onset of iron-core collapse. The evolution and action of the magnetic fields is as described by Spruit in 2002, and a range of uncertain parameters is explored. In general, we find that magnetic torques decrease the final rotation rate of the collapsing iron core by about a factor of 30-50 when compared with the nonmagnetic counterparts. Angular momentum in that part of the presupernova star destined to become a neutron star is an increasing function of main-sequence mass. That is, pulsars derived from more massive stars rotate faster and rotation plays a more important role in the star's explosion. The final angular momentum of the core has been determined, to within a factor of 2, by the time the star ignites carbon burning. For the lighter stars studied, around 15 solar masses, we predict pulsar periods at birth near 15 ms, though a factor of 2 range is easily tolerated by the uncertainties. Several mechanisms for additional braking in a young neutron star, especially by fallback, are explored.
On the Statistical Properties of the Lower Main Sequence
NASA Astrophysics Data System (ADS)
Angelou, George C.; Bellinger, Earl P.; Hekker, Saskia; Basu, Sarbani
2017-04-01
Astronomy is in an era where all-sky surveys are mapping the Galaxy. The plethora of photometric, spectroscopic, asteroseismic, and astrometric data allows us to characterize the comprising stars in detail. Here we quantify to what extent precise stellar observations reveal information about the properties of a star, including properties that are unobserved, or even unobservable. We analyze the diagnostic potential of classical and asteroseismic observations for inferring stellar parameters such as age, mass, and radius from evolutionary tracks of solar-like oscillators on the lower main sequence. We perform rank correlation tests in order to determine the capacity of each observable quantity to probe structural components of stars and infer their evolutionary histories. We also analyze the principal components of classic and asteroseismic observables to highlight the degree of redundancy present in the measured quantities and demonstrate the extent to which information of the model parameters can be extracted. We perform multiple regression using combinations of observable quantities in a grid of evolutionary simulations and appraise the predictive utility of each combination in determining the properties of stars. We identify the combinations that are useful and provide limits to where each type of observable quantity can reveal information about a star. We investigate the accuracy with which targets in the upcoming TESS and PLATO missions can be characterized. We demonstrate that the combination of observations from GAIA and PLATO will allow us to tightly constrain stellar masses, ages, and radii with machine learning for the purposes of Galactic and planetary studies.
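The rank correlation test mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a plain-Python Spearman coefficient, the function names `average_ranks` and `spearman_rho` are hypothetical, and the input numbers are made-up toy values rather than data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy values only (assumption): a model parameter and a hypothetical
# observable that rises monotonically, but not linearly, with it.
toy_parameter = [0.8, 0.9, 1.0, 1.1, 1.2]
toy_observable = [0.64, 0.81, 1.0, 1.21, 1.44]
print(spearman_rho(toy_parameter, toy_observable))
```

Because the coefficient depends only on ranks, it captures any monotonic relation between an observable and a model parameter, which is presumably why a rank-based test suits quantities that scale non-linearly with one another.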
DOE Office of Scientific and Technical Information (OSTI.GOV)
Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len
2009-09-22
Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels, such as ethanol, and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single nucleotide variants, 15 small deletions or insertions and 18 larger deletions leading to the loss of more than 100 kb of genomic DNA. From these events we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild type strain QM6a. Our analysis provides the first genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus, and suggests new areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production.
Horn, T; Chang, C A; Urdea, M S
1997-12-01
The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265
Preparation of 13C/15N-labeled oligomers using the polymerase chain reaction
Chen, Xian; Gupta, Goutam; Bradbury, E. Morton
2001-01-01
Preparation of 13C/15N-labeled DNA oligomers using the polymerase chain reaction (PCR). A PCR based method for uniform (13C/15N)-labeling of DNA duplexes is described. Multiple copies of a blunt-ended duplex are cloned into a plasmid, each copy containing the sequence of interest and restriction Hinc II sequences at both the 5' and 3' ends. PCR using bi-directional primers and uniformly 13C/15N-labeled dNTP precursors generates labeled DNA duplexes containing multiple copies of the sequence of interest. Twenty-four cycles of PCR, followed by restriction and purification, gave the uniformly 13C/15N-labeled duplex sequence with a 30% yield. Such labeled duplexes find significant applications in multinuclear magnetic resonance spectroscopy.
R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server
Cannone, Jamie J.; Sweeney, Blake A.; Petrov, Anton I.; Gutell, Robin R.; Zirbel, Craig L.; Leontis, Neocles
2015-01-01
The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. PMID:26048960
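The mapping step described in the abstract above, from nucleotide numbers in a 3D structure to columns of an alignment, can be sketched as follows. This is only an illustration under stated assumptions: it does not call or reproduce the R3D-2-MSA interface, the function names and the toy alignment are hypothetical, and it assumes that structure numbering simply counts the ungapped positions of one reference row.

```python
import json


def nucleotide_to_column(aligned_reference, first_number=1):
    """Map each nucleotide number of the reference to its alignment column.

    Assumption: nucleotides are numbered consecutively from `first_number`
    and gaps in the aligned reference row are written as '-'.
    """
    mapping = {}
    number = first_number
    for column, symbol in enumerate(aligned_reference):
        if symbol != "-":
            mapping[number] = column
            number += 1
    return mapping


def extract_range(alignment, reference_id, start, stop):
    """Return, for every sequence, the columns matching nucleotides start..stop."""
    mapping = nucleotide_to_column(alignment[reference_id])
    columns = [mapping[n] for n in range(start, stop + 1)]
    return {
        name: "".join(sequence[c] for c in columns)
        for name, sequence in alignment.items()
    }


# Hypothetical three-row alignment; "ref" stands in for the sequence
# of the representative 3D structure.
toy_alignment = {
    "ref": "GC-AUG-C",
    "seqA": "GCUAU-AC",
    "seqB": "GU-ACGAC",
}

result = extract_range(toy_alignment, "ref", 3, 5)
print(json.dumps(result, indent=2))
```

Note that this sketch returns only the columns occupied by the requested reference nucleotides; columns where the reference has a gap are skipped, which is one possible design choice among several.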
AlexSys: a knowledge-based expert system for multiple sequence alignment construction and analysis
Aniba, Mohamed Radhouene; Poch, Olivier; Marchler-Bauer, Aron; Thompson, Julie Dawn
2010-01-01
Multiple sequence alignment (MSA) is a cornerstone of modern molecular biology and represents a unique means of investigating the patterns of conservation and diversity in complex biological systems. Many different algorithms have been developed to construct MSAs, but previous studies have shown that no single aligner consistently outperforms the rest. This has led to the development of a number of ‘meta-methods’ that systematically run several aligners and merge the output into one single solution. Although these methods generally produce more accurate alignments, they are inefficient because all the aligners need to be run first and the choice of the best solution is made a posteriori. Here, we describe the development of a new expert system, AlexSys, for the multiple alignment of protein sequences. AlexSys incorporates an intelligent inference engine to automatically select an appropriate aligner a priori, depending only on the nature of the input sequences. The inference engine was trained on a large set of reference multiple alignments, using a novel machine learning approach. Applying AlexSys to a test set of 178 alignments, we show that the expert system represents a good compromise between alignment quality and running time, making it suitable for high throughput projects. AlexSys is freely available from http://alnitak.u-strasbg.fr/~aniba/alexsys. PMID:20530533
EXors and the stellar birthline
NASA Astrophysics Data System (ADS)
Moody, Mackenzie S. L.; Stahler, Steven W.
2017-04-01
We assess the evolutionary status of EXors. These low-mass, pre-main-sequence stars repeatedly undergo sharp luminosity increases, each a year or so in duration. We place into the HR diagram all EXors that have documented quiescent luminosities and effective temperatures, and thus determine their masses and ages. Two alternate sets of pre-main-sequence tracks are used, and yield similar results. Roughly half of EXors are embedded objects, i.e., they appear observationally as Class I or flat-spectrum infrared sources. We find that these are relatively young and are located close to the stellar birthline in the HR diagram. Optically visible EXors, on the other hand, are situated well below the birthline. They have ages of several Myr, typical of classical T Tauri stars. Judging from the limited data at hand, we find no evidence that binary companions trigger EXor eruptions; this issue merits further investigation. We draw several general conclusions. First, repetitive luminosity outbursts do not occur in all pre-main-sequence stars, and are not in themselves a sign of extreme youth. They persist, along with other signs of activity, in a relatively small subset of these objects. Second, the very existence of embedded EXors demonstrates that at least some Class I infrared sources are not true protostars, but very young pre-main-sequence objects still enshrouded in dusty gas. Finally, we believe that the embedded pre-main-sequence phase is of observational and theoretical significance, and should be included in a more complete account of early stellar evolution.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koyama, Yusei; Kodama, Tadayuki; Tadaki, Ken-ichi
2014-07-01
We report the discovery of a strong over-density of galaxies in the field of a radio galaxy at z = 1.52 (4C 65.22) based on our broadband and narrow-band (Hα) photometry with the Subaru Telescope. We find that Hα emitters are located in the outskirts of the density peak (cluster core) dominated by passive red-sequence galaxies. This resembles the situation in lower-redshift clusters, suggesting that the newly discovered structure is a well-evolved rich galaxy cluster at z = 1.5. Our data suggest that the color-density and stellar mass-density relations are already in place at z ∼ 1.5, mostly driven by the passive red massive galaxies residing within r_c ≲ 200 kpc from the cluster core. These environmental trends almost disappear when we consider only star-forming (SF) galaxies. We do not find SFR-density or SSFR-density relations amongst SF galaxies, and the location of the SF main sequence does not significantly change with environment. Nevertheless, we find a tentative hint that star-bursting galaxies (up-scattered objects from the main sequence) are preferentially located in a small group at ∼1 Mpc away from the main body of the cluster. We also argue that the scatter of the SF main sequence could be dependent on the distance to the nearest neighboring galaxy.
The SAMI Galaxy Survey: spatially resolving the main sequence of star formation
NASA Astrophysics Data System (ADS)
Medling, Anne M.; Cortese, Luca; Croom, Scott M.; Green, Andrew W.; Groves, Brent; Hampton, Elise; Ho, I.-Ting; Davies, Luke J. M.; Kewley, Lisa J.; Moffett, Amanda J.; Schaefer, Adam L.; Taylor, Edward; Zafar, Tayyaba; Bekki, Kenji; Bland-Hawthorn, Joss; Bloom, Jessica V.; Brough, Sarah; Bryant, Julia J.; Catinella, Barbara; Cecil, Gerald; Colless, Matthew; Couch, Warrick J.; Drinkwater, Michael J.; Driver, Simon P.; Federrath, Christoph; Foster, Caroline; Goldstein, Gregory; Goodwin, Michael; Hopkins, Andrew; Lawrence, J. S.; Leslie, Sarah K.; Lewis, Geraint F.; Lorente, Nuria P. F.; Owers, Matt S.; McDermid, Richard; Richards, Samuel N.; Sharp, Robert; Scott, Nicholas; Sweet, Sarah M.; Taranu, Dan S.; Tescari, Edoardo; Tonini, Chiara; van de Sande, Jesse; Walcher, C. Jakob; Wright, Angus
2018-04-01
We present the ~800 star formation rate maps for the Sydney-AAO Multi-object Integral field spectrograph (SAMI) Galaxy Survey based on Hα emission maps, corrected for dust attenuation via the Balmer decrement, that are included in the SAMI Public Data Release 1. We mask out spaxels contaminated by non-stellar emission using the [O III]/Hβ, [N II]/Hα, [S II]/Hα, and [O I]/Hα line ratios. Using these maps, we examine the global and resolved star-forming main sequences of SAMI galaxies as a function of morphology, environmental density, and stellar mass. Galaxies further below the star-forming main sequence are more likely to have flatter star formation profiles. Early-type galaxies split into two populations with similar stellar masses and central stellar mass surface densities. The main-sequence population has centrally concentrated star formation similar to late-type galaxies, while galaxies >3σ below the main sequence show significantly reduced star formation most strikingly in the nuclear regions. The split populations support a two-step quenching mechanism, wherein halo mass first cuts off the gas supply and remaining gas continues to form stars until the local stellar mass surface density can stabilize the reduced remaining fuel against further star formation. Across all morphologies, galaxies in denser environments show a decreased specific star formation rate from the outside in, supporting an environmental cause for quenching, such as ram-pressure stripping or galaxy interactions.
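A dust correction via the Balmer decrement, as mentioned in the abstract above, is commonly written as E(B-V) = 2.5 / (k_Hβ - k_Hα) × log10[(Hα/Hβ)_observed / (Hα/Hβ)_intrinsic], followed by a flux correction of 10^(0.4 k_Hα E(B-V)). The sketch below illustrates that standard recipe only. The abstract does not state which extinction curve or intrinsic ratio the survey adopted, so the values used here (an intrinsic ratio of 2.86 and curve coefficients of roughly 2.53 and 3.61 at Hα and Hβ) are assumptions, as are the function names.

```python
from math import log10

# Assumed constants, not taken from the abstract:
INTRINSIC_BALMER_RATIO = 2.86  # commonly adopted Case B Halpha/Hbeta ratio
K_HALPHA = 2.53                # assumed extinction-curve value at Halpha
K_HBETA = 3.61                 # assumed extinction-curve value at Hbeta


def halpha_attenuation_mag(f_halpha, f_hbeta, k_ha=K_HALPHA, k_hb=K_HBETA,
                           intrinsic=INTRINSIC_BALMER_RATIO):
    """Attenuation at Halpha, in magnitudes, from the observed Balmer decrement."""
    observed_ratio = f_halpha / f_hbeta
    ebv = 2.5 / (k_hb - k_ha) * log10(observed_ratio / intrinsic)
    # A ratio below the intrinsic value would imply negative dust; clamp to zero.
    ebv = max(ebv, 0.0)
    return k_ha * ebv


def correct_halpha_flux(f_halpha, f_hbeta, **curve):
    """Return the dust-corrected Halpha flux in the same units as the input."""
    a_ha = halpha_attenuation_mag(f_halpha, f_hbeta, **curve)
    return f_halpha * 10 ** (0.4 * a_ha)


# Toy fluxes in arbitrary units (assumption): an observed ratio of 4.0.
print(halpha_attenuation_mag(4.0, 1.0))
print(correct_halpha_flux(4.0, 1.0))
```

In a resolved map the same two functions would be applied spaxel by spaxel, after masking spaxels whose line ratios indicate non-stellar emission.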
Circumstellar Material on and off the Main Sequence
NASA Astrophysics Data System (ADS)
Steele, Amy; Debes, John H.; Deming, Drake
2017-06-01
There is evidence of circumstellar material around main sequence, giant, and white dwarf stars that originates from the small-body population of planetary systems. These bodies tell us something about the chemistry and evolution of protoplanetary disks and the planetary systems they form. What happens to this material as its host star evolves off the main sequence, and how does that inform our understanding of the typical chemistry of rocky bodies in planetary systems? In this talk, I will discuss the composition(s) of circumstellar material on and off the main sequence to begin to answer the question, “Is Earth normal?” In particular, I look at three types of debris disks to understand the typical chemistry of planetary systems—young debris disks, debris disks around giant stars, and dust around white dwarfs. I will review the current understanding on how to infer dust composition for each class of disk, and present new work on constraining dust composition from infrared excesses around main sequence and giant stars. Finally, dusty and polluted white dwarfs hold a unique key to our understanding of the composition of rocky bodies around other stars. In particular, I will discuss WD1145+017, which has a transiting, disintegrating planetesimal. I will review what we know about this system through high speed photometry and spectroscopy and present new work on understanding the complex interplay of physics that creates white dwarf pollution from the disintegration of rocky bodies.
NASA Astrophysics Data System (ADS)
Oelkers, Ryan J.; Macri, Lucas M.; Marshall, Jennifer L.; DePoy, Darren L.; Lambas, Diego G.; Colazo, Carlos; Stringer, Katelyn
2016-09-01
The past two decades have seen a significant advancement in the detection, classification, and understanding of exoplanets and binaries. This is due, in large part, to the increase in use of small-aperture telescopes (<20 cm) to survey large areas of the sky to milli-mag precision with rapid cadence. The vast majority of the planetary and binary systems studied to date consists of main-sequence or evolved objects, leading to a dearth of knowledge of properties at early times (<50 Myr). Only a dozen binaries and one candidate transiting Hot Jupiter are known among pre-main-sequence objects, yet these are the systems that can provide the best constraints on stellar formation and planetary migration models. The deficiency in the number of well characterized systems is driven by the inherent and aperiodic variability found in pre-main-sequence objects, which can mask and mimic eclipse signals. Hence, a dramatic increase in the number of young systems with high-quality observations is highly desirable to guide further theoretical developments. We have recently completed a photometric survey of three nearby (<150 pc) and young (<50 Myr) moving groups with a small-aperture telescope. While our survey reached the requisite photometric precision, the temporal coverage was insufficient to detect Hot Jupiters. Nevertheless, we discovered 346 pre-main-sequence binary candidates, including 74 high-priority objects for further study. This paper includes data taken at The McDonald Observatory of The University of Texas at Austin.
Design and construction of 2A peptide-linked multicistronic vectors.
Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A
2012-02-01
The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. This article describes the design and construction of 2A peptide-linked multicistronic vectors, which can be used to express multiple proteins from a single open reading frame (ORF). The small 2A peptide sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector.
Low-pass sequencing for microbial comparative genomics
Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Victor Ng, Wailap; Hood, Leroy
2004-01-01
Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. 
marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich genome of H. sp. NRC-1. Identification of multiple TBP and TFB homologs in these four halophiles is consistent with the hypothesis that different types of complex transcriptional regulation may occur through multiple TBP-TFB combinations in response to rapidly changing environmental conditions. Low-pass shotgun sequence analyses of genomes permit extensive and diverse analyses, and should be generally useful for comparative microbial genomics. PMID:14718067
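As a rough illustration of the GC-content comparison mentioned in this abstract, the following Python sketch computes the pooled G+C fraction of a set of single-pass reads. The function name `gc_fraction` and the example reads are hypothetical and are not taken from the study; skipping ambiguous symbols such as N is an assumption.

```python
def gc_fraction(reads):
    """Return the pooled G+C fraction over all unambiguous bases in the reads."""
    gc = 0
    total = 0
    for read in reads:
        for base in read.upper():
            if base in "GC":
                gc += 1
                total += 1
            elif base in "AT":
                total += 1
            # Any other symbol (for example N) is skipped; this is an assumption.
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return gc / total


# Hypothetical reads, for illustration only.
example_reads = ["GCGCGGATCC", "ccgnGCAT"]
print(f"GC fraction: {gc_fraction(example_reads):.3f}")
```

Pooling over reads weights long reads more heavily than short ones; a per-read average would be an equally defensible choice.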
Ma, Lijun; Lee, Letitia; Barani, Igor; Hwang, Andrew; Fogh, Shannon; Nakamura, Jean; McDermott, Michael; Sneed, Penny; Larson, David A; Sahgal, Arjun
2011-11-21
Rapid delivery of multiple shots or isocenters is one of the hallmarks of Gamma Knife radiosurgery. In this study, we investigated whether the temporal order of shots delivered with Gamma Knife Perfexion would significantly influence the biological equivalent dose for complex multi-isocenter treatments. Twenty single-target cases were selected for analysis. For each case, 3D dose matrices of individual shots were extracted and single-fraction equivalent uniform dose (sEUD) values were determined for all possible shot delivery sequences, corresponding to different patterns of temporal dose delivery within the target. We found significant variations in the sEUD values among these sequences exceeding 15% for certain cases. However, the sequences for the actual treatment delivery were found to agree (<3%) and to correlate (R² = 0.98) excellently with the sequences yielding the maximum sEUD values for all studied cases. This result is applicable for both fast and slow growing tumors with α/β values of 2 to 20 according to the linear-quadratic model. In conclusion, despite large potential variations in different shot sequences for multi-isocenter Gamma Knife treatments, current clinical delivery sequences exhibited consistent biological target dosing that approached that maximally achievable for all studied cases.
Sockeye: A 3D Environment for Comparative Genomics
Montgomery, Stephen B.; Astakhova, Tamara; Bilenky, Mikhail; Birney, Ewan; Fu, Tony; Hassel, Maik; Melsopp, Craig; Rak, Marcin; Robertson, A. Gordon; Sleumer, Monica; Siddiqui, Asim S.; Jones, Steven J.M.
2004-01-01
Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses three-dimensional (3D) graphics technology to facilitate the visualization of annotation and conservation across multiple sequences. This software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. A user can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization. PMID:15123592
CodonLogo: a sequence logo-based viewer for codon patterns.
Sharma, Virag; Murphy, David P; Provan, Gregory; Baranov, Pavel V
2012-07-15
Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as stacks of symbol(s) where the height of a stack is proportional to its informational content, whereas the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons. Yet no tool is available for visualization of conserved codon patterns. We present the first application which allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/.
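The stack and symbol heights described in this abstract can be sketched for a codon alphabet as follows. This is a minimal illustration, not the CodonLogo or WebLogo3 implementation: it assumes gap-free, in-frame aligned sequences, uses the plain Shannon information content with no small-sample correction, and the function names `codon_columns` and `column_heights` are hypothetical.

```python
import math
from collections import Counter


def codon_columns(aligned_seqs):
    """Split in-frame aligned nucleotide sequences into columns of codons."""
    length = len(aligned_seqs[0])
    usable = length - length % 3  # ignore a trailing partial codon
    return [[seq[start:start + 3] for seq in aligned_seqs]
            for start in range(0, usable, 3)]


def column_heights(codons, alphabet_size=64):
    """Return (stack height in bits, per-codon symbol heights) for one column."""
    counts = Counter(codons)
    total = sum(counts.values())
    freqs = {codon: n / total for codon, n in counts.items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    info = math.log2(alphabet_size) - entropy  # at most 6 bits for 64 codons
    return info, {codon: f * info for codon, f in freqs.items()}


# Hypothetical alignment, for illustration only.
alignment = ["ATGAAA", "ATGAAG", "ATGAAA", "ATGAAG"]
for index, column in enumerate(codon_columns(alignment)):
    info, heights = column_heights(column)
    print(index, round(info, 3), heights)
```

Treating each codon as one symbol is what lets a column distinguish a conserved codon from three independently conserved nucleotides.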
Deng, Yi-Mo; Spirason, Natalie; Iannello, Pina; Jelley, Lauren; Lau, Hilda; Barr, Ian G
2015-07-01
Full genome sequencing of influenza A viruses (IAV), including those that arise from annual influenza epidemics, is undertaken to determine if reassorting has occurred or if other pathogenic traits are present. Traditionally IAV sequencing has been biased toward the major surface glycoproteins haemagglutinin and neuraminidase, while the internal genes are often ignored. Despite the development of next generation sequencing (NGS), many laboratories are still reliant on conventional Sanger sequencing to sequence IAV. To develop a minimal and robust set of primers for Sanger sequencing of the full genome of IAV currently circulating in humans. A set of 13 primer pairs was designed that enabled amplification of the six internal genes of multiple human IAV subtypes including the recent avian influenza A(H7N9) virus from China. Specific primers were designed to amplify the HA and NA genes of each IAV subtype of interest. Each of the primers also incorporated a binding site at its 5'-end for either a forward or reverse M13 primer, such that only two M13 primers were required for all subsequent sequencing reactions. This minimal set of primers was suitable for sequencing the six internal genes of all currently circulating human seasonal influenza A subtypes as well as the avian A(H7N9) viruses that have infected humans in China. This streamlined Sanger sequencing protocol could be used to generate full genome sequence data more rapidly and easily than existing influenza genome sequencing protocols. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Effects of Main-Sequence Mass Loss on Stellar and Galactic Chemical Evolution.
NASA Astrophysics Data System (ADS)
Guzik, Joyce Ann
1988-06-01
L. A. Willson, G. H. Bowen and C. Struck-Marcell have proposed that 1 to 3 solar mass stars may experience evolutionarily significant mass loss during the early part of their main-sequence phase. The suggested mass-loss mechanism is pulsation, facilitated by rapid rotation. Initial mass-loss rates may be as large as several times 10^{-9} M☉/yr, diminishing over several times 10^8 years. We attempted to test this hypothesis by comparing some theoretical implications with observations. Three areas are addressed: solar models, cluster HR diagrams, and galactic chemical evolution. Mass-losing solar models were evolved that match the Sun's luminosity and radius at its present age. The most extreme viable models have initial mass 2.0 M☉, and mass-loss rates decreasing exponentially over 2-3 times 10^8 years. Compared to a constant-mass model, these models require a reduced initial ^4He abundance, have deeper envelope convection zones and higher ^8B neutrino fluxes. Early processing of present surface layers at higher interior temperatures increases the surface ^3He abundance, destroys Li, Be and B, and decreases the surface C/N ratio following first dredge-up. Evolution calculations incorporating main-sequence mass loss were completed for a grid of models with initial masses 1.25 to 2.0 M☉ and mass-loss timescales 0.2 to 2.0 Gyr. Cluster HR diagrams synthesized with these models confirm the potential for the hypothesis to explain observed spreads or bifurcations in the upper main sequence, blue stragglers, anomalous giants, and poor fits of main-sequence turnoffs by standard isochrones. Simple closed galactic chemical evolution models were used to test the effects of main-sequence mass loss on the F and G dwarf distribution. Stars between 3.0 M☉ and a metallicity-dependent lower mass are assumed to lose mass. The models produce a 30 to 60% increase in the stars to stars-plus-remnants ratio, with fewer early-F dwarfs and many more late-F dwarfs remaining on the main sequence to the present. The ratio of stars to stellar remnants and the white dwarf age distribution may prove valuable in distinguishing between explanations for the observed bimodal present-day stellar mass function.
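The numbers quoted for the most extreme solar model can be checked for mutual consistency with a short worked estimate. The assumptions here are not stated in the abstract: the mass-loss rate is taken to be a pure exponential with e-folding time \tau, the model is taken to start at 2.0 solar masses and end near 1 solar mass, and the present age is taken to be much longer than \tau so the integral can run to infinity.

```latex
\dot{M}(t) = \dot{M}_0 \, e^{-t/\tau}, \qquad
\Delta M = \int_0^{\infty} \dot{M}(t)\,dt = \dot{M}_0 \, \tau
\;\Longrightarrow\;
\dot{M}_0 = \frac{\Delta M}{\tau}
\approx \frac{1\,M_\odot}{(2\text{--}3)\times 10^{8}\,\mathrm{yr}}
\approx (3.3\text{--}5)\times 10^{-9}\,M_\odot\,\mathrm{yr}^{-1}
```

Under these assumptions the initial rate comes out at a few times 10^{-9} solar masses per year, which is in line with the "several times 10^{-9}" figure quoted in the abstract.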
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-17
...-0001; Sequence 21] General Services Administration Acquisition Regulation: Modifications (Multiple... Modifications (Multiple Award Schedule). DATES: Submit comments on or before: February 15, 2013. FOR FURTHER INFORMATION CONTACT: Ms. Dana Munson, General Services Acquisition Policy Division, GSA, (202) 357-9652 or...
NASA Astrophysics Data System (ADS)
Olree, E.; Robinson, D. M.; McQuarrie, N.; Ghoshal, S.; Olsen, J.
2016-12-01
Using balanced cross sections, one can visualize a valid and admissible interpretation of the surface and subsurface data. Khanal (2014) and Cross (2014) produced two valid and admissible cross sections along the Marsyandi River in central Nepal. However, thermochronologic data adds another dimension that must be adhered to when producing valid and admissible balanced cross sections. Since the previous cross sections were produced, additional zircon-helium (ZHe) cooling ages along the Marsyandi River show ages of 1 Ma near the Main Central thrust in the hinterland to 4 Ma near the Main Boundary thrust closer to the foreland. This distribution of cooling ages requires recent uplift in the hinterland, which is not present in the cross sections. Although a restored version of the Khanal (2014) cross section is sequentially deformed using 2D Move, the kinematic sequence implied in the cross section is inconsistent with the ZHe age distribution. The hinterland dipping duplex proposed by Khanal would require cooling ages that are oldest near the Main Central thrust and young southwards toward the active ramp located 80 km north of the Main Frontal thrust. Instead, the 4 Ma age near the Main Boundary thrust and the increasingly younger ages to the north could be produced by either a foreland-dipping Lesser Himalayan duplex, which would keep active uplift in the north, or by translation of the hinterland dipping duplex southward over the ramp, moving the active thrust ramp northward. To address this problem, a new balanced cross section was produced using both new mapping through the region and the ZHe age distribution as additional constraints. The section was then restored and sequentially deformed in 2D Move. This study illustrates that multiple cross sections can be viable and admissible; however, they can still be incorrect. Thermochronology places additional constraints on the permissible geometries, and thus increases our ability to predict subsurface geometries. 
The next step of this project is to link the uplift and erosion implied by the kinematic sequence of the new cross section to the measured cooling history by importing the cross section kinematics into advection diffusion modeling software that predicts cooling ages.
IVisTMSA: Interactive Visual Tools for Multiple Sequence Alignments.
Pervez, Muhammad Tariq; Babar, Masroor Ellahi; Nadeem, Asif; Aslam, Naeem; Naveed, Nasir; Ahmad, Sarfraz; Muhammad, Shah; Qadri, Salman; Shahid, Muhammad; Hussain, Tanveer; Javed, Maryam
2015-01-01
IVisTMSA is a software package of seven graphical tools for multiple sequence alignments. MSApad is an editing and analysis tool. It can load 409% more data than Jalview, STRAP, CINEMA, and Base-by-Base. MSA comparator allows the user to visualize consistent and inconsistent regions of reference and test alignments of more than 21-MB size in less than 12 seconds. MSA comparator is 5,200% efficient and more than 40% efficient as compared to the BALiBASE C program and FastSP, respectively. MSA reconstruction tool provides graphical user interfaces for four popular aligners and allows the user to load several sequence files at a time. FASTA generator converts seven formats of alignments of unlimited size into FASTA format in a few seconds. MSA ID calculator calculates the identity matrix of more than 11,000 sequences with a sequence length of 2,696 base pairs in less than 100 seconds. Tree and Distance Matrix calculation tools generate a phylogenetic tree and a distance matrix, respectively, using neighbor joining, % identity, and the BLOSUM 62 matrix.
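To make the idea of an identity matrix concrete, here is a minimal Python sketch of pairwise percent identity over an alignment. It is not the IVisTMSA code. The choice of denominator (only columns where neither sequence has a gap) is an assumption, since several conventions exist, and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap or y == gap:
            continue  # gapped columns are excluded (an assumption)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned_seqs):
    """Symmetric matrix of pairwise percent identities."""
    n = len(aligned_seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = percent_identity(aligned_seqs[i], aligned_seqs[j])
            matrix[i][j] = value
            matrix[j][i] = value
    return matrix


# Hypothetical alignment, for illustration only.
demo = ["ACGT", "ACGA", "A-GT"]
for row in identity_matrix(demo):
    print([round(v, 2) for v in row])
```

The double loop is quadratic in the number of sequences, so a matrix for thousands of sequences would need a faster representation than this sketch uses.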
Nanoliter reactors improve multiple displacement amplification of genomes from single cells.
Marcy, Yann; Ishoey, Thomas; Lasken, Roger S; Stockwell, Timothy B; Walenz, Brian P; Halpern, Aaron L; Beeson, Karen Y; Goldberg, Susanne M D; Quake, Stephen R
2007-09-01
Since only a small fraction of environmental bacteria are amenable to laboratory culture, there is great interest in genomic sequencing directly from single cells. Sufficient DNA for sequencing can be obtained from one cell by the Multiple Displacement Amplification (MDA) method, thereby eliminating the need to develop culture methods. Here we used a microfluidic device to isolate individual Escherichia coli and amplify genomic DNA by MDA in 60-nl reactions. Our results confirm a report that reduced MDA reaction volume lowers nonspecific synthesis that can result from contaminant DNA templates and unfavourable interaction between primers. The quality of the genome amplification was assessed by qPCR and compared favourably to single-cell amplifications performed in standard 50-µl volumes. Amplification bias was greatly reduced in nanoliter volumes, thereby providing a more even representation of all sequences. Single-cell amplicons from both microliter and nanoliter volumes provided high-quality sequence data by high-throughput pyrosequencing, thereby demonstrating a straightforward route to sequencing genomes from single cells.
Covariance Matrix Estimation for Massive MIMO
NASA Astrophysics Data System (ADS)
Upadhya, Karthik; Vorobyov, Sergiy A.
2018-04-01
We propose a novel pilot structure for covariance matrix estimation in massive multiple-input multiple-output (MIMO) systems in which each user transmits two pilot sequences, with the second pilot sequence multiplied by a random phase-shift. The covariance matrix of a particular user is obtained by computing the sample cross-correlation of the channel estimates obtained from the two pilot sequences. This approach relaxes the requirement that all the users transmit their uplink pilots over the same set of symbols. We derive expressions for the achievable rate and the mean-squared error of the covariance matrix estimate when the proposed method is used with staggered pilots. The performance of the proposed method is compared with existing methods through simulations.
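One reason a cross-correlation of two channel estimates is attractive can be shown with a toy simulation. The sketch below is not the estimator from the paper: it ignores the random phase shifts, pilot contamination, and staggered pilots, and assumes only that the two estimates of the same channel carry independent noise. All names and parameter values are hypothetical.

```python
import math
import random


def complex_gaussian(variance, rng):
    """Draw one circularly symmetric complex Gaussian sample with the given variance."""
    sigma = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def sample_cross_correlation(est_a, est_b):
    """Return R[i][j] = (1/N) * sum_n est_a[n][i] * conj(est_b[n][j])."""
    size = len(est_a[0])
    acc = [[0j] * size for _ in range(size)]
    for a, b in zip(est_a, est_b):
        for i in range(size):
            for j in range(size):
                acc[i][j] += a[i] * b[j].conjugate()
    return [[value / len(est_a) for value in row] for row in acc]


def simulate(num_samples=20000, channel_var=(2.0, 0.5), noise_var=1.0, seed=1):
    """Two noisy estimates of the same channel, with independent noise in each."""
    rng = random.Random(seed)
    est_a, est_b = [], []
    for _ in range(num_samples):
        h = [complex_gaussian(v, rng) for v in channel_var]
        est_a.append([x + complex_gaussian(noise_var, rng) for x in h])
        est_b.append([x + complex_gaussian(noise_var, rng) for x in h])
    return est_a, est_b


est_a, est_b = simulate()
cross = sample_cross_correlation(est_a, est_b)  # noise terms average out
auto = sample_cross_correlation(est_a, est_a)   # biased upward by the noise variance
print("cross diagonal:", [round(cross[i][i].real, 2) for i in range(2)])
print("auto diagonal: ", [round(auto[i][i].real, 2) for i in range(2)])
```

In this toy setting the auto-correlation diagonal is inflated by the noise variance, while the cross-correlation diagonal stays close to the true channel variances.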
Alachiotis, Nikolaos; Vogiatzi, Emmanouella; Pavlidis, Pavlos; Stamatakis, Alexandros
2013-01-01
Automated DNA sequencers generate chromatograms that contain raw sequencing data. They also generate data that translates the chromatograms into molecular sequences of A, C, G, T, or N (undetermined) characters. Since chromatogram translation programs frequently introduce errors, a manual inspection of the generated sequence data is required. As sequence numbers and lengths increase, visual inspection and manual correction of chromatograms and corresponding sequences on a per-peak and per-nucleotide basis becomes an error-prone, time-consuming, and tedious process. Here, we introduce ChromatoGate (CG), an open-source software that accelerates and partially automates the inspection of chromatograms and the detection of sequencing errors for bidirectional sequencing runs. To provide users full control over the error correction process, a fully automated error correction algorithm has not been implemented. Initially, the program scans a given multiple sequence alignment (MSA) for potential sequencing errors, assuming that each polymorphic site in the alignment may be attributed to a sequencing error with a certain probability. The guided MSA assembly procedure in ChromatoGate detects chromatogram peaks of all characters in an alignment that lead to polymorphic sites, given a user-defined threshold. The threshold value represents the sensitivity of the sequencing error detection mechanism. After this pre-filtering, the user only needs to inspect a small number of peaks in every chromatogram to correct sequencing errors. Finally, we show that correcting sequencing errors is important, because population genetic and phylogenetic inferences can be misled by MSAs with uncorrected mis-calls. Our experiments indicate that estimates of population mutation rates can be affected two- to three-fold by uncorrected errors. PMID:24688709
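The pre-filtering step described in this abstract can be illustrated with a simple scan for polymorphic alignment columns. This is a sketch under stated assumptions and not ChromatoGate's algorithm: it flags a column when a minority base occurs at or below a user-chosen fraction, treats only uppercase A, C, G, T as informative, and uses the hypothetical names `suspicious_sites` and `max_minor_fraction`.

```python
from collections import Counter


def suspicious_sites(alignment, max_minor_fraction=0.2):
    """Return (column, majority base, rare bases) for columns with rare variants."""
    flagged = []
    for col in range(len(alignment[0])):
        # Count only unambiguous bases; N and gaps are ignored (an assumption).
        counts = Counter(seq[col] for seq in alignment if seq[col] in "ACGT")
        if len(counts) < 2:
            continue  # monomorphic column, nothing to inspect
        total = sum(counts.values())
        major = counts.most_common(1)[0][0]
        rare = sorted(base for base, n in counts.items()
                      if base != major and n / total <= max_minor_fraction)
        if rare:
            flagged.append((col, major, rare))
    return flagged


# Hypothetical alignment: one sequence differs at column 2.
demo_alignment = ["ACGTAC"] * 9 + ["ACTTAC"]
print(suspicious_sites(demo_alignment))
```

A lower threshold flags fewer columns for manual inspection, at the cost of possibly missing errors that happen to recur in several reads.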
A statistical method for the detection of variants from next-generation resequencing of DNA pools.
Bansal, Vikas
2010-06-15
Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing. We describe a novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms (SNPs) from Pooled sequencing] that is able to identify both rare and common variants by using two approaches: (i) comparing the distribution of allele counts across multiple pools using contingency tables and (ii) evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone. Information about the distribution of reads between the forward and reverse strands and the size of the pools is also incorporated within this framework to filter out false variants. Validation of CRISP on two separate pooled sequencing datasets generated using the Illumina Genome Analyzer demonstrates that it can detect 80-85% of SNPs identified using individual sequencing while achieving a low false discovery rate (3-5%). Comparison with previous methods for pooled SNP detection demonstrates the significantly lower false positive and false negative rates for CRISP. Implementation of this method is available at http://polymorphism.scripps.edu/~vbansal/software/CRISP/.
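The second ingredient mentioned in this abstract, the probability of seeing several non-reference base calls from sequencing errors alone, can be illustrated with a binomial tail. This is a simplified sketch and not the CRISP model: it assumes a single uniform per-base error rate and independent reads, and the function name and numbers are hypothetical.

```python
import math


def prob_at_least(k, n, error_rate):
    """P(X >= k) for X ~ Binomial(n, error_rate): k or more erroneous calls in n reads."""
    if k <= 0:
        return 1.0
    below = sum(math.comb(n, i) * error_rate ** i * (1.0 - error_rate) ** (n - i)
                for i in range(k))
    return max(0.0, 1.0 - below)


# Hypothetical pool: 100 reads cover a site, 5 show a non-reference base,
# and the assumed per-base error rate is 1%.
p_value = prob_at_least(5, 100, 0.01)
print(f"P(>=5 errors in 100 reads) = {p_value:.4f}")
```

A small tail probability suggests the non-reference calls are unlikely to be errors alone; a real caller would also need to account for quality scores and strand distribution.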
Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.
2015-01-01
This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
NASA Technical Reports Server (NTRS)
Springer, E.; Sachs, M. S.; Woese, C. R.; Boone, D. R.
1995-01-01
Representatives of the family Methanosarcinaceae were analyzed phylogenetically by comparing partial sequences of their methyl-coenzyme M reductase (mcrI) genes. A 490-bp fragment from the A subunit of the gene was selected, amplified by the PCR, cloned, and sequenced for each of 25 strains belonging to the Methanosarcinaceae. The sequences obtained were aligned with the corresponding portions of five previously published sequences, and all of the sequences were compared to determine phylogenetic distances by Fitch distance matrix methods. We prepared analogous trees based on 16S rRNA sequences; these trees corresponded closely to the mcrI trees, although the mcrI sequences of pairs of organisms had 3.01 +/- 0.541 times more changes than the respective pairs of 16S rRNA sequences, suggesting that the mcrI fragment evolved about three times more rapidly than the 16S rRNA gene. The qualitative similarity of the mcrI and 16S rRNA trees suggests that transfer of genetic information between dissimilar organisms has not significantly affected these sequences, although we found inconsistencies between some mcrI distances that we measured and previously published DNA reassociation data. It is unlikely that multiple mcrI isogenes were present in the organisms that we examined, because we found no major discrepancies in multiple determinations of mcrI sequences from the same organism. Our primers for the PCR also match analogous sites in the previously published mcrII sequences, but all of the sequences that we obtained from members of the Methanosarcinaceae were more closely related to mcrI sequences than to mcrII sequences, suggesting that members of the Methanosarcinaceae do not have distinct mcrII genes.
Non-redundant patent sequence databases with value-added annotations at two levels
Li, Weizhong; McWilliam, Hamish; de la Torre, Ana Richart; Grodowski, Adam; Benediktovich, Irina; Goujon, Mickael; Nauche, Stephane; Lopez, Rodrigo
2010-01-01
The European Bioinformatics Institute (EMBL-EBI) provides public access to patent data, including abstracts, chemical compounds and sequences. Sequences can appear multiple times due to the filing of the same invention with multiple patent offices, or the use of the same sequence by different inventors in different contexts. Information relating to the source invention may be incomplete, and biological information available in patent documents elsewhere may not be reflected in the annotation of the sequence. Search and analysis of these data have become increasingly challenging for both the scientific and intellectual-property communities. Here, we report a collection of non-redundant patent sequence databases, which cover the EMBL-Bank nucleotides patent class and the patent protein databases and contain value-added annotations from patent documents. The databases were created at two levels by the use of sequence MD5 checksums. Sequences within a level-1 cluster are 100% identical over their whole length. Level-2 clusters were defined by sub-grouping level-1 clusters based on patent family information. Value-added annotations, such as publication number corrections, earliest publication dates and feature collations, significantly enhance the quality of the data, allowing for better tracking and cross-referencing. The databases are available at http://www.ebi.ac.uk/patentdata/nr/. PMID:19884134
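The two-level grouping described in this abstract can be sketched with Python's standard `hashlib` module. This is an illustration only and not the EMBL-EBI pipeline: the record fields (`id`, `sequence`, `family`) are hypothetical, and upper-casing sequences before hashing is an assumed normalization.

```python
import hashlib
from collections import defaultdict


def sequence_checksum(sequence):
    """MD5 checksum of a sequence; upper-casing first is an assumed normalization."""
    return hashlib.md5(sequence.upper().encode("ascii")).hexdigest()


def cluster_records(records):
    """Group records by checksum (level 1), then by checksum and patent family (level 2)."""
    level1 = defaultdict(list)
    level2 = defaultdict(list)
    for record in records:
        checksum = sequence_checksum(record["sequence"])
        level1[checksum].append(record["id"])
        level2[(checksum, record["family"])].append(record["id"])
    return dict(level1), dict(level2)


# Hypothetical records, for illustration only.
demo_records = [
    {"id": "r1", "sequence": "ACGT", "family": "F1"},
    {"id": "r2", "sequence": "acgt", "family": "F1"},
    {"id": "r3", "sequence": "ACGT", "family": "F2"},
    {"id": "r4", "sequence": "ACGA", "family": "F3"},
]
level1, level2 = cluster_records(demo_records)
print(len(level1), "level-1 clusters;", len(level2), "level-2 clusters")
```

Hashing makes the identity test a dictionary lookup, so each record is processed once regardless of how many sequences are already clustered.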
Prediction of protein secondary structure content for the twilight zone sequences.
Homaeian, Leila; Kurgan, Lukasz A; Ruan, Jishou; Cios, Krzysztof J; Chen, Ke
2007-11-15
Secondary protein structure carries information about local structural arrangements, which include three major conformations: alpha-helices, beta-strands, and coils. The significant majority of successful methods for secondary structure prediction are based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, when it is characterized by low (<30%) homology. To this end, we propose a novel method for prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict the amount of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight zone sequences. The results indicate that our method provides better predictions for both helix and strand content. The PSSC-core is shown to provide statistically significantly better results when compared with the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, frequency of tetra-peptides associated with helical and strand conformations, various property-based groups like exchange groups, chemical groups of the side chains and hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses, hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting the secondary structure content that can be used to validate and constrain results of other structure prediction methods. At the same time, it also provides useful insight into the design of successful protein sequence representations that can be used in developing new methods related to prediction of different aspects of the secondary protein structure. (c) 2007 Wiley-Liss, Inc.
Berthier, Y; Thierry, D; Lemattre, M; Guesdon, J L
1994-01-01
A new insertion sequence was isolated from Xanthomonas campestris pv. dieffenbachiae. Sequence analysis showed that this element is 1,158 bp long and has 15-bp inverted repeat ends containing two mismatches. Comparison of this sequence with sequences in data bases revealed significant homology with Escherichia coli IS5. IS1051, which detected multiple restriction fragment length polymorphisms, was used as a probe to characterize strains from the pathovar dieffenbachiae. PMID:7906933
2013-11-21
Fanconi Anemia; Autosomal or Sex Linked Recessive Genetic Disease; Bone Marrow Hematopoiesis Failure, Multiple Congenital Abnormalities, and Susceptibility to Neoplastic Diseases; Hematopoiesis Maintenance.
Fractional Programming for Communication Systems—Part I: Power Control and Beamforming
NASA Astrophysics Data System (ADS)
Shen, Kaiming; Yu, Wei
2018-05-01
This two-part paper explores the use of fractional programming (FP) in the design and optimization of communication systems. Part I of this paper focuses on FP theory and on solving continuous problems. The main theoretical contribution is a novel quadratic transform technique for tackling the multiple-ratio concave-convex FP problem, in contrast to conventional FP techniques, which mostly can deal only with the single-ratio or the max-min-ratio case. Multiple-ratio FP problems are important for the optimization of communication networks, because system-level design often involves multiple signal-to-interference-plus-noise ratio terms. This paper considers the applications of FP to solving continuous problems in communication system design, particularly for power control, beamforming, and energy efficiency maximization. These application cases illustrate that the proposed quadratic transform can greatly facilitate optimization involving ratios by recasting the original nonconvex problem as a sequence of convex problems. This FP-based problem reformulation gives rise to an efficient iterative optimization algorithm with provable convergence to a stationary point. The paper further demonstrates close connections between the proposed FP approach and other well-known algorithms in the literature, such as the fixed-point iteration and the weighted minimum mean-square-error beamforming. The optimization of discrete problems is discussed in Part II of this paper.
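The idea of recasting a sum-of-ratios problem as a sequence of convex problems can be sketched on a toy power-control instance. For maximizing a sum of ratios A_i(x)/B_i(x), the quadratic transform introduces auxiliary variables y_i and works with the surrogate sum of 2*y_i*sqrt(A_i(x)) - y_i^2*B_i(x), alternating between y_i = sqrt(A_i)/B_i and a concave maximization over x. The Python sketch below applies this to a sum of signal-to-interference-plus-noise ratios under a box power constraint. The gain matrix, noise level, power limit, and function names are all hypothetical choices for illustration, and the sum-of-SINR objective is a simplified stand-in rather than the specific problems treated in the paper.

```python
import math


def sum_of_ratios(p, g, noise):
    """Sum over links i of g[i][i]*p[i] / (noise + sum_{j != i} g[i][j]*p[j])."""
    n = len(p)
    total = 0.0
    for i in range(n):
        interference = noise + sum(g[i][j] * p[j] for j in range(n) if j != i)
        total += g[i][i] * p[i] / interference
    return total


def quadratic_transform_power_control(g, noise, p_max, iters=30):
    """Alternate the two quadratic-transform updates on a toy sum-of-SINR problem.

    g[i][j] is the (assumed) gain from transmitter j to receiver i.
    Returns the final powers and the objective value after each iteration.
    """
    n = len(g)
    p = [p_max] * n
    history = [sum_of_ratios(p, g, noise)]
    for _ in range(iters):
        # y-update: y_i = sqrt(A_i) / B_i with the powers held fixed.
        y = []
        for i in range(n):
            b = noise + sum(g[i][j] * p[j] for j in range(n) if j != i)
            y.append(math.sqrt(g[i][i] * p[i]) / b)
        # p-update: with y fixed, the surrogate is concave and separable in p_k:
        #   2*y_k*sqrt(g_kk*p_k) - p_k * sum_{i != k} y_i^2 * g_ik
        # so each p_k has a closed-form maximizer, clipped to [0, p_max].
        new_p = []
        for k in range(n):
            cost = sum(y[i] ** 2 * g[i][k] for i in range(n) if i != k)
            if cost > 0.0:
                new_p.append(min(p_max, (y[k] * math.sqrt(g[k][k]) / cost) ** 2))
            elif y[k] > 0.0:
                new_p.append(p_max)  # no interference penalty: use full power
            else:
                new_p.append(p[k])   # surrogate does not depend on p_k
        p = new_p
        history.append(sum_of_ratios(p, g, noise))
    return p, history


# Hypothetical 3-link channel gains (rows: receivers, columns: transmitters).
GAINS = [
    [1.0, 0.1, 0.2],
    [0.2, 1.0, 0.1],
    [0.1, 0.3, 1.0],
]
powers, history = quadratic_transform_power_control(GAINS, noise=0.1, p_max=1.0)
```

Because the surrogate never exceeds the true objective and matches it at the updated y, the recorded objective values should be nondecreasing from one iteration to the next, which is the monotonicity behind the convergence claim in the abstract.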
Abundant aftershock sequence of the 2015 Mw7.5 Hindu Kush intermediate-depth earthquake
NASA Astrophysics Data System (ADS)
Li, Chenyu; Peng, Zhigang; Yao, Dongdong; Guo, Hao; Zhan, Zhongwen; Zhang, Haijiang
2018-05-01
The 2015 Mw7.5 Hindu Kush earthquake occurred at a depth of 213 km beneath the Hindu Kush region of Afghanistan. While many early aftershocks were missing from the global earthquake catalogues, this sequence was recorded continuously by eight broad-band stations within 500 km. Here we use a waveform matching technique to systematically detect earthquakes around the main shock. More than 3000 events are detected within 35 days after the main shock, as compared with 42 listed in the Advanced National Seismic System catalogue (or 196 in the International Seismological Centre catalogue). The aftershock sequence generally follows Omori's law with a decay constant p = 0.92. We also apply the recently developed double-pair double-difference technique to relocate all detected aftershocks. Most of them are located to the west of the hypocentre of the main shock, consistent with the westward propagation of the main-shock rupture. The aftershocks outline a nearly vertical, southward-dipping plane, which matches well with one of the nodal planes of the main shock. We conclude that the aftershock sequence of this intermediate-depth earthquake shares many similarities with those of shallow earthquakes and infer that some common mechanisms are responsible for shallow and intermediate-depth earthquakes.
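The decay behaviour reported above can be made concrete with the modified Omori (Omori-Utsu) form, in which the aftershock rate at time t after the main shock is n(t) = K / (t + c)^p. The Python sketch below evaluates that rate and its closed-form time integral. Only p = 0.92 comes from the abstract; the productivity K and the time offset c are hypothetical placeholders, since the abstract does not report them.

```python
import math


def omori_rate(t, K, c, p):
    """Aftershock rate (events per day) at t days after the main shock."""
    return K / (t + c) ** p


def omori_cumulative(t, K, c, p):
    """Expected number of aftershocks between 0 and t days (integral of the rate)."""
    if abs(p - 1.0) < 1e-12:
        return K * math.log((t + c) / c)
    return K * ((t + c) ** (1.0 - p) - c ** (1.0 - p)) / (1.0 - p)


P_DECAY = 0.92   # decay constant reported in the abstract
K_ASSUMED = 100  # hypothetical productivity, not from the source
C_ASSUMED = 0.05 # hypothetical time offset in days, not from the source

rate_day_1 = omori_rate(1.0, K_ASSUMED, C_ASSUMED, P_DECAY)
rate_day_35 = omori_rate(35.0, K_ASSUMED, C_ASSUMED, P_DECAY)
count_35_days = omori_cumulative(35.0, K_ASSUMED, C_ASSUMED, P_DECAY)
```

With p slightly below 1, the rate falls off a little more slowly than a pure 1/t decay, so the cumulative count keeps growing as a small positive power of time rather than logarithmically.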
NASA Astrophysics Data System (ADS)
Ando, R.; Aoki, Y.; Uchide, T.; Imanishi, K.; Matsumoto, S.; Nishimura, T.
2016-12-01
A couple of interesting earthquake rupture phenomena were observed during the 2016 Kumamoto, Japan, earthquake sequence. The sequence includes the April 15, 2016, Mw 7.0 mainshock, which was preceded by multiple M6-class foreshocks. The mainshock mainly broke the Futagawa fault segment, which strikes NE-SW and extends over 50 km, and it further triggered an M6-class earthquake more than 50 km to the northeast (Uchide et al., 2016, submitted), where an active volcano is situated. Compiling seismic and InSAR data, we presumed that this dynamically triggered event occurred on an active fault known as the Yufuin fault (Ando et al., 2016, JPGU general assembly). It is also reported that the coseismic slip was significantly large on a shallow portion of the Futagawa fault near Aso volcano. Since the seismogenic depth becomes significantly shallower in these two areas, we presume that the geothermal anomaly plays a role alongside the elastodynamic processes associated with the coseismic rupture. In this study, we conducted a set of fully dynamic simulations of the earthquake rupture process, assuming the inferred 3D fault geometry and the regional stress field obtained from stress tensor inversion. As a result, we showed that the dynamic rupture process was mainly controlled by the irregularity of the fault geometry subjected to the gently varying regional stress field. The foreshock ruptures were arrested at the junction of the branch faults. We also show that the dynamic triggering of M6-class earthquakes occurred along the Yufuin fault segment (located 50 km NE) because of the strong stress transient of up to a few hundred kPa due to the rupture directivity effect of the M7 event. 
It is also shown that the geothermal condition may increase susceptibility to dynamic triggering, as indicated by considering the plastic shear zone on the down-dip extension of the Yufuin segment, which is situated in the vicinity of an active volcano.
NASA Astrophysics Data System (ADS)
Breitfeld, H. T.; Hennig, J.; BouDagher-Fadel, M.; Hall, R.
2017-12-01
The offshore Sarawak Basin NW of North Sarawak is a major hydrocarbon province in SE Asia. A very thick sedimentary sequence of Oligocene to ?Early Miocene age, named Cycle 1, is an important hydrocarbon source and reservoir. Despite numerous wells, the stratigraphy and tectonic history are not well understood. The Nyalau Formation of onshore North Sarawak is the supposed equivalent of the offshore Cycle 1 sequence. The Nyalau Formation is a thick sedimentary sequence of mainly tidal to deltaic deposits. The formation is dominated by well-bedded sandstone-mudstone alternations and thicker sandstones with abundant bioturbation. The sandstones are predominantly arenaceous. Various lithic fragments and feldspar indicate multiple sources and fresh input from igneous and metamorphic rocks. Interbedded thin limestone beds and marls yielded Early Miocene foraminifera for the upper part of the succession. Zircons separated from the sandstones yielded mainly Cretaceous and Triassic ages. The Triassic is the dominant age population. The Nyalau Formation conformably overlies the Buan Shale and the Tatau Formation, and in places unconformably overlies the Belaga Formation. The Belaga Formation is part of the Rajang Group, which represents remnants of a large submarine fan deposited in the Late Cretaceous to Eocene in Central Sarawak. In contrast to the Nyalau Formation, the majority of zircons from the Rajang Group have Cretaceous ages. This marks an important change in provenance at the major unconformity separating the Belaga and Nyalau Formations. This unconformity was previously interpreted as the result of an orogeny in the Late Eocene. However, there is no evidence for a subduction or collision event at this time in Sarawak. We interpret it to mark plate reorganisation in the Middle Eocene and name it the Rajang Unconformity. Borneo is the principal source of Cretaceous zircons, which were derived from the Schwaner Mountains and West Sarawak. 
The dominant Triassic zircon age population in the Nyalau Formation indicates either major input from the Malay Peninsula (Malay-Thai Tin belt) or Indochina (SE Vietnam). It also suggests that Borneo supplied little or no sediment to Sarawak in the Oligocene to Early Miocene.
New pharmacotherapy options for multiple myeloma.
Mina, Roberto; Cerrato, Chiara; Bernardini, Annalisa; Aghemo, Elena; Palumbo, Antonio
2016-01-01
Novel agents and the availability of autologous stem-cell transplantation have revolutionized the treatment of patients with multiple myeloma. First-generation novel agents, namely thalidomide, lenalidomide, and bortezomib, have significantly improved response and survival of patients. Second-generation novel agents such as pomalidomide, carfilzomib, and monoclonal antibodies are being tested in both the newly diagnosed and relapse settings, and results are promising. In this review article, the main results derived from Phase III trials with thalidomide, lenalidomide, and bortezomib for the treatment of myeloma patients, both at diagnosis and at relapse, are summarized. Data about second-generation novel agents such as pomalidomide and carfilzomib are also reported. Newer effective drugs currently under investigation and the promising results with monoclonal antibodies are described. The availability of new effective drugs has considerably increased the treatment options for myeloma patients. A sequential approach including induction, transplantation (when possible), consolidation, and maintenance is an optimal strategy to achieve disease control and prolong survival. Despite these improvements, the best combination, the optimal sequence, and the proper target of newer drugs still need to be defined.