Most of my software can be found at my github page. Below you can find a list of my main projects.

eBWTclust: positional clustering of the Burrows-Wheeler Transform

eBWTclust finds clusters corresponding to the same genome position in the eBWT of a set of DNA reads of two individuals and, by analyzing these clusters, it finds SNPs between the two individuals (reference-free, alignment-free).

The paper describing the theory behind the tool (eBWT positional clustering) can be found here:

The run-length Compressed Burrows-Wheeler transform index (r-index)

The r-index is the first full-text index of size O(r), r being the number of BWT runs of the input text (of size n), supporting near-optimal locate of pattern occurrences. The r-index employs a novel suffix array sampling of size 2r; in classical FM-indexes, this sampling would result in a locate time of Omega(n/r) per occurrence. The r-index, on the other hand, reduces this time to O(log(n/r)).

DYNAMIC: Dynamic succinct and compressed data structures

DYNAMIC is a C++ library offering space- and time-efficient implementations of some basic succinct/compressed dynamic data structures such as succinct dynamic partial sums, bitvectors, strings.

ERNE: The Extended Randomized Numerical alignEr

This tool, developed in collaboration with the university of Udine and the Institute of Applied Genomics, includes:

  • A DNA short-reads aligner.
  • A bisulfite-treated short-reads aligner for methylation analysis.
  • A read filtering tool.
  • A methylation calling tool.

ERNE is based on a novel succinct hash data structure combining the concepts of Burrows-Wheeler transform and Hamming-Aware hash functions.