Articles
-
Citation: Algorithms for Molecular Biology 2023 18:3
-
Pangenomic genotyping with the marker array
We present a new method and software tool called rowbowt that applies a pangenome index to the problem of inferring genotypes from short-read sequencing data. The method uses a novel indexing structure called the...
Citation: Algorithms for Molecular Biology 2023 18:2 -
All galls are divided into three or more parts: recursive enumeration of labeled histories for galled trees
In mathematical phylogenetics, a labeled rooted binary tree topology can possess any of a number of labeled histories, each of which represents a possible temporal ordering of its coalescences. Labeled histori...
Citation: Algorithms for Molecular Biology 2023 18:1 -
Correction: Heuristic shortest hyperpaths in cell signaling hypergraphs
Citation: Algorithms for Molecular Biology 2022 17:17 -
On a greedy approach for genome scaffolding
Scaffolding is a bioinformatics problem aimed at completing the contig assembly process by determining the relative position and orientation of these contigs. It can be seen as a paths and cycles cover problem...
Citation: Algorithms for Molecular Biology 2022 17:16 -
Treewidth-based algorithms for the small parsimony problem on networks
Phylogenetic reconstruction is one of the paramount challenges of contemporary bioinformatics. A subtask of existing tree reconstruction algorithms is modeled by the Small Parsimony problem: given a tree T and an...
Citation: Algorithms for Molecular Biology 2022 17:15 -
Binning long reads in metagenomics datasets using composition and coverage information
Advancements in metagenomics sequencing allow the study of microbial communities directly from their environments. Metagenomics binning is a key step in the species characterisation of microbial communities. N...
Citation: Algorithms for Molecular Biology 2022 17:14 -
Two metrics on rooted unordered trees with labels
The early development of a zygote can be mathematically described by a developmental tree. To compare developmental trees of different species, we need to define distances on trees. If child cells after a d...
Citation: Algorithms for Molecular Biology 2022 17:13 -
Heuristic shortest hyperpaths in cell signaling hypergraphs
Cell signaling pathways, which are a series of reactions that start at receptors and end at transcription factors, are basic to systems biology. Properly modeling the reactions in such pathways requires directed ...
Citation: Algorithms for Molecular Biology 2022 17:12 -
Embedding gene trees into phylogenetic networks by conflict resolution algorithms
Phylogenetic networks are mathematical models of evolutionary processes involving reticulate events such as hybridization, recombination, or horizontal gene transfer. One of the crucial notions in phylogenetic...
Citation: Algorithms for Molecular Biology 2022 17:11 -
Bi-alignments with affine gaps costs
Commonly, sequence and structure elements are assumed to evolve congruently, such that homologous sequence positions correspond to homologous structural features. Assuming congruent evolution, alignments based on...
Citation: Algorithms for Molecular Biology 2022 17:10 -
Efficient privacy-preserving variable-length substring match for genome sequence
The development of a privacy-preserving technology is important for accelerating genome data sharing. This study proposes an algorithm that securely searches a variable-length substring match between a query a...
Citation: Algorithms for Molecular Biology 2022 17:9 -
Tree diet: reducing the treewidth to unlock FPT algorithms in RNA bioinformatics
Hard graph problems are ubiquitous in Bioinformatics, inspiring the design of specialized Fixed-Parameter Tractable algorithms, many of which rely on a combination of tree-decomposition and dynamic programming...
Citation: Algorithms for Molecular Biology 2022 17:8 -
Adding hydrogen atoms to molecular models via fragment superimposition
Most experimentally determined structures of biomolecules lack annotated hydrogen positions due to their low electron density. However, thorough structure analysis and simulations require knowledge about the p...
Citation: Algorithms for Molecular Biology 2022 17:7 -
Perplexity: evaluating transcript abundance estimation in the absence of ground truth
There has been rapid development of probabilistic models and inference methods for transcript abundance estimation from RNA-seq data. These models aim to accurately estimate transcript-level abundances, to acc...
Citation: Algorithms for Molecular Biology 2022 17:6 -
Space-efficient representation of genomic k-mer count tables
k-mer counting is a common task in bioinformatic pipelines, with many dedicated tools available. Many of these tools output k-mer count tables containing both k-mers and counts, easily reaching tens of...
Citation: Algorithms for Molecular Biology 2022 17:5 -
Fast characterization of segmental duplication structure in multiple genome assemblies
The increasing availability of high-quality genome assemblies raised interest in the characterization of genomic architecture. Major architectural elements, such as common repeats and segmental duplications (S...
Citation: Algorithms for Molecular Biology 2022 17:4 -
Parsimonious Clone Tree Integration in cancer
Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to...
Citation: Algorithms for Molecular Biology 2022 17:3 -
Efficiently sparse listing of classes of optimal cophylogeny reconciliations
Cophylogeny reconciliation is a powerful method for analyzing host-parasite (or host-symbiont) co-evolution. It models co-evolution as an optimization problem where the set of all optimal solutions may represe...
Citation: Algorithms for Molecular Biology 2022 17:2 -
A new 1.375-approximation algorithm for sorting by transpositions
Citation: Algorithms for Molecular Biology 2022 17:1 -
An optimized FM-index library for nucleotide and amino acid search
Pattern matching is a key step in a variety of biological sequence analysis pipelines. The FM-index is a compressed data structure for pattern matching, with search run time that is independent of the length o...
Citation: Algorithms for Molecular Biology 2021 16:25 -
An improved approximation algorithm for the reversal and transposition distance considering gene order and intergenic sizes
In the comparative genomics field, one of the goals is to estimate a sequence of genetic changes capable of transforming a genome into another. Genome rearrangement events are mutations that can alter the gene...
Citation: Algorithms for Molecular Biology 2021 16:24 -
A simpler linear-time algorithm for the common refinement of rooted phylogenetic trees on a common leaf set
The supertree problem, i.e., the task of finding a common refinement of a set of rooted trees, is an important topic in mathematical phylogenetics. The special case of a common leaf set L is known to be solvable i...
Citation: Algorithms for Molecular Biology 2021 16:23 -
Testing the agreement of trees with internal labels
A semi-labeled tree is a tree where all leaves as well as, possibly, some internal nodes are labeled with taxa. Semi-labeled trees encompass ordinary phylogenetic trees and taxonomies. Suppose we are given a c...
Citation: Algorithms for Molecular Biology 2021 16:22 -
Approximation algorithm for rearrangement distances considering repeated genes and intergenic regions
The rearrangement distance is a method to compare genomes of different species. Such distance is the number of rearrangement events necessary to transform one genome into another. Two commonly studied events a...
Citation: Algorithms for Molecular Biology 2021 16:21 -
DeepGRP: engineering a software tool for predicting genomic repetitive elements using Recurrent Neural Networks with attention
Repetitive elements contribute a large part of eukaryotic genomes. For example, about 40 to 50% of human, mouse and rat genomes are repetitive. So identifying and classifying repeats is an important step in ge...
Citation: Algorithms for Molecular Biology 2021 16:20 -
Heuristic algorithms for best match graph editing
Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics as a representation of the pairwise most closely related genes among multiple species. An arc connec...
Citation: Algorithms for Molecular Biology 2021 16:19 -
A novel method for inference of acyclic chemical compounds with bounded branch-height based on artificial neural networks and integer programming
Analysis of chemical graphs is becoming a major research topic in computational molecular biology due to its potential applications to drug design. One of the major approaches in such a study is inverse quant...
Citation: Algorithms for Molecular Biology 2021 16:18 -
INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis
Prediction of drug resistance and identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving this problem requires a ...
Citation: Algorithms for Molecular Biology 2021 16:17 -
Approximate search for known gene clusters in new genomes using PQ-trees
Gene clusters are groups of genes that are co-locally conserved across various genomes, not necessarily in the same order. Their discovery and analysis is valuable in tasks such as gene annotation and predicti...
Citation: Algorithms for Molecular Biology 2021 16:16 -
Shape decomposition algorithms for laser capture microdissection
In the context of biomarker discovery and molecular characterization of diseases, laser capture microdissection is a highly effective approach to extract disease-specific regions from complex, heterogeneous ti...
Citation: Algorithms for Molecular Biology 2021 16:15 -
Distinguishing linear and branched evolution given single-cell DNA sequencing data of tumors
Cancer arises from an evolutionary process where somatic mutations give rise to clonal expansions. Reconstructing this evolutionary process is useful for treatment decision-making as well as understanding evol...
Citation: Algorithms for Molecular Biology 2021 16:14 -
Bayesian optimization with evolutionary and structure-based regularization for directed protein evolution
Directed evolution (DE) is a technique for protein engineering that involves iterative rounds of mutagenesis and screening to search for sequences that optimize a given property, such as binding affinity to a ...
Citation: Algorithms for Molecular Biology 2021 16:13 -
Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life i...
Citation: Algorithms for Molecular Biology 2021 16:12 -
Using the longest run subsequence problem within homology-based scaffolding
Genome assembly is one of the most important problems in computational genomics. Here, we suggest addressing an issue that arises in homology-based scaffolding, that is, when linking and ordering contigs to ob...
Citation: Algorithms for Molecular Biology 2021 16:11 -
Disk compression of k-mer sets
K-mer based methods have become prevalent in many areas of bioinformatics. In applications such as database search, they often work with large multi-terabyte-sized datasets. Storing such large datasets is a de...
Citation: Algorithms for Molecular Biology 2021 16:10 -
The Bourque distances for mutation trees of cancers
Mutation trees are rooted trees in which nodes are of arbitrary degree and labeled with a mutation set. These trees, also referred to as clonal trees, are used in computational oncology to represent the mutati...
Citation: Algorithms for Molecular Biology 2021 16:9 -
LazyB: fast and cheap genome assembly
Advances in genome sequencing over the last years have led to a fundamental paradigm shift in the field. With steadily decreasing sequencing costs, genome projects are no longer limited by the cost of raw seq...
Citation: Algorithms for Molecular Biology 2021 16:8 -
The energy-spectrum of bicompatible sequences
Genotype-phenotype maps provide a meaningful filtration of sequence space and RNA secondary structures are particular such phenotypes. Compatible sequences, which satisfy the base-pairing constraints of a give...
Citation: Algorithms for Molecular Biology 2021 16:7 -
Fast and efficient Rmap assembly using the Bi-labelled de Bruijn graph
Genome wide optical maps are high resolution restriction maps that give a unique numeric representation to a genome. They are produced by assembling hundreds of thousands of single molecule optical maps, which...
Citation: Algorithms for Molecular Biology 2021 16:6 -
Exact transcript quantification over splice graphs
The probability of sequencing a set of RNA-seq reads can be directly modeled using the abundances of splice junctions in splice graphs instead of the abundances of a list of transcripts. We call this model gra...
Citation: Algorithms for Molecular Biology 2021 16:5 -
Natural family-free genomic distance
A classical problem in comparative genomics is to compute the rearrangement distance, that is the minimum number of large-scale rearrangements required to transform a given genome into another given genome. Th...
Citation: Algorithms for Molecular Biology 2021 16:4 -
Improving metagenomic binning results with overlapped bins using assembly graphs
Metagenomic sequencing allows us to study the structure, diversity and ecology in microbial communities without the necessity of obtaining pure cultures. In many metagenomics studies, the reads obtained from m...
Citation: Algorithms for Molecular Biology 2021 16:3 -
Fast lightweight accurate xenograft sorting
With an increasing number of patient-derived xenograft (PDX) models being created and subsequently sequenced to study tumor heterogeneity and to guide therapy decisions, there is a similarly increasing need fo...
Citation: Algorithms for Molecular Biology 2021 16:2 -
Quantifying steric hindrance and topological obstruction to protein structure superposition
In computational structural biology, structure comparison is fundamental for our understanding of proteins. Structure comparison is, e.g., algorithmically the starting point for computational studies of struct...
Citation: Algorithms for Molecular Biology 2021 16:1 -
Fast and accurate structure probability estimation for simultaneous alignment and folding of RNAs with Markov chains
Simultaneous alignment and folding (SA&F) of RNAs is the indispensable gold standard for inferring the structure of non-coding RNAs and their general analysis. The original algorithm, proposed by Sankoff, solv...
Citation: Algorithms for Molecular Biology 2020 15:19 -
gsufsort: constructing suffix arrays, LCP arrays and BWTs for string collections
The construction of a suffix array for a collection of strings is a fundamental task in Bioinformatics and in many other applications that process strings. Related data structures, such as the Longest Common Prefix...
Citation: Algorithms for Molecular Biology 2020 15:18 -
A linear-time algorithm that avoids inverses and computes Jackknife (leave-one-out) products like convolutions or other operators in commutative semigroups
Data about herpesvirus microRNA motifs on human circular RNAs suggested the following statistical question. Consider independent random counts, not necessarily identically distributed. Conditioned on the sum, ...
Citation: Algorithms for Molecular Biology 2020 15:17 -
Reconstruction of time-consistent species trees
The history of gene families—which are equivalent to event-labeled gene trees—can to some extent be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralo...
Citation: Algorithms for Molecular Biology 2020 15:16 -
On an enhancement of RNA probing data using information theory
Identifying the secondary structure of an RNA is crucial for understanding its diverse regulatory functions. This paper focuses on how to enhance target identification in a Boltzmann ensemble of structures via...
Citation: Algorithms for Molecular Biology 2020 15:15
- ISSN: 1748-7188 (electronic)