Releases: RasmussenLab/vamb

v5.0.4

29 Apr 13:02
c53e541

This version fixes two critical bugs which made TaxVamb and Taxometer run slower and with much lower accuracy.

Bugfix: Fix bugs when parsing taxonomy TSV files, where the trailing newline would be considered part of a clade name, and where a missing annotation was considered to be the annotation "".
Bugfix: Previous v5 releases erroneously used the VAEVAE training options for Taxometer when running TaxVamb. Taxometer now uses its own training settings.
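
The two parsing pitfalls named in the first bugfix can be illustrated with a minimal sketch. This is not Vamb's actual parser: the function name parse_taxonomy_line and the line layout (contig name, a tab, then a semicolon-separated lineage) are assumptions made for illustration only.

```python
from typing import Optional


def parse_taxonomy_line(line: str) -> tuple[str, Optional[list[str]]]:
    """Parse one 'contig<TAB>lineage' line (hypothetical layout, not Vamb's code)."""
    # Remove the line terminator first, so it cannot end up in the last clade name.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 2:
        raise ValueError(f"Expected 2 tab-separated fields, got {len(fields)}")
    contig, lineage = fields
    # An empty lineage means "no annotation", not a clade named "".
    if not lineage:
        return contig, None
    return contig, lineage.split(";")


print(parse_taxonomy_line("contig_1\tBacteria;Bacillota\n"))
print(parse_taxonomy_line("contig_2\t\n"))
```

Returning None for a missing annotation keeps "unannotated" distinct from any real clade name, which is the distinction the bugfix describes.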

v5.0.3

08 Apr 11:39
7ce46fa

Bugfix: Allow the taxonomy file to contain sequences that were present in the FASTA file, but have been filtered away for falling below the minimum length.
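
The intended behaviour can be sketched as follows. This is an illustration under assumptions, not Vamb's implementation: the function check_taxonomy_names and its three arguments are hypothetical. Names that were in the FASTA file but removed by the length filter are dropped quietly, while names that never appeared in the FASTA file are still an error.

```python
def check_taxonomy_names(taxonomy_names, fasta_names, kept_names):
    """Return the taxonomy names that survive length filtering (hypothetical helper)."""
    fasta = set(fasta_names)  # every sequence name seen in the FASTA file
    kept = set(kept_names)    # names that passed the minimum-length filter
    unknown = [name for name in taxonomy_names if name not in fasta]
    if unknown:
        # A name that was never in the FASTA file is a genuine mismatch.
        raise ValueError(f"Taxonomy names not found in FASTA: {unknown}")
    # Names filtered away for being too short are skipped rather than rejected.
    return [name for name in taxonomy_names if name in kept]


print(check_taxonomy_names(["a", "b"], ["a", "b", "c"], ["a", "c"]))
```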

v5.0.2

28 Mar 11:20
db35c11

This release includes various bugfixes.

  • Improve various error messages
  • Make Vamb more robust when handling a low number of sequences
  • Switch the backend from a Cython package to a Rust package

Test input data

28 Mar 14:38
db35c11

This input data can be used to test that Vamb runs, and to run the benchmarking process. Note that the data is quite artificial and has several limitations, so it is not a good dataset for benchmarking the accuracy of Vamb (or any other binner).

v5.0.1

05 Mar 13:09

Version 5 is a major release that includes several breaking changes to the API,
as well as new types of models, improved binning accuracy, and improved
user-friendliness.

Added

  • Added the TaxVamb binner - a semi-supervised model that can augment binning
    using taxonomic assignment from e.g. mmseqs2 of some of the input contigs.
    TaxVamb is state-of-the-art, and significantly outperforms all other Vamb
    models when the taxonomic assignment is reasonably good.
    TaxVamb is available from the command line using vamb bin taxvamb
  • Added the Taxometer annotation refiner. This program enhances taxonomic
    assignment of metagenomic contigs using composition and abundance.
    TaxVamb will automatically run Taxometer to increase accuracy.
    Taxometer is available from the command line using vamb taxometer
  • [EXPERIMENTAL] Added reclustering functionality, which reclusters an existing
    binning using single-copy genes, using a technique inspired by the SemiBin2
    binner. This improves bacterial bins.
    We may remove this feature in future versions of Vamb.

Breaking changes

  • The command-line interface of Vamb has been changed, such that the different
    functionality is now used through subcommands. For example, the binners in
    Vamb are accessible through vamb bin.
    Also, a few command-line flags have been removed.
  • All output files ending in .tsv are now actually in TSV format. Previously,
    Vamb omitted the header line that the TSV format requires.
    In version 5, the header is included.
  • The file mask.npz is no longer output, because the encoder no longer masks
    any sequences.
  • The names of the output cluster files have been changed. When binsplitting is
    used, Vamb now outputs both the split and the unsplit clusters.
    The names of the output files are now:
    • vae_clusters_split.tsv
    • vae_clusters_unsplit.tsv
    The same pattern applies to other models, e.g. vaevae_clusters_split.tsv.
    When binsplitting is not used, only the unsplit clusters are output.
  • The benchmark module of Vamb has been removed, as it is superseded by our
    new benchmarking tool https://github.com/jakobnissen/BinBencher.jl
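
Because .tsv outputs now start with a header line, scripts written for the older headerless files need to skip it. The following is a minimal sketch of reading a two-column cluster file. The function name read_clusters is hypothetical, and the column order (cluster name, then contig name) and the header names in the example are assumptions; check them against your own output files.

```python
import csv
import io
from collections import defaultdict


def read_clusters(handle) -> dict[str, list[str]]:
    """Read a two-column cluster TSV with a header line into {cluster: [contigs]}."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line that version 5 files include
    clusters = defaultdict(list)
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        cluster, contig = row  # assumed column order: cluster name, contig name
        clusters[cluster].append(contig)
    return dict(clusters)


# Assumed header names, used here for illustration only.
example = io.StringIO("clustername\tcontigname\n1\tS1C10\n1\tS1C11\n2\tS2C7\n")
print(read_clusters(example))
```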

Other changes

  • Several details of the clustering algorithm have been overhauled.
    It now returns more accurate clusters and may be faster in some circumstances.
    However, GPU clustering may be significantly slower. (#198)
  • Vamb now uses both relative and absolute abundances in the encoder, compared
    to only the relative ones before. This improves binning, especially when using
    a low number of samples (#210)
  • Vamb now binsplits with -o C by default.
    • To disable binsplitting, pass -o without an argument
  • Vamb now supports passing abundances in TSV format. This TSV can be created
    very efficiently using the strobealign aligner with the --aemb flag.
  • If passing abundances in BAM format, it is now recommended to pass in a
    directory with all the BAM files using the --bamdir flag, instead of using
    the old --bamfiles flag.
  • Vamb no longer errors when the batch size is too large.
  • Several errors and warnings have been improved:
    • The user is warned if any sequences are filtered away for falling below
      the contig size cutoff (flag -m).
    • Improved the error message when the FASTA and BAM headers do not match.
    • Vamb now errors early if the binsplit separator (flag -o) is not found
      in the parsed contig identifiers.
      If the binsplit separator is not set explicitly and defaults to -o C,
      Vamb will instead warn the user and disable binsplitting.
  • Vamb now writes its log to both stderr and to the logfile. Every line in the
    log is now timestamped, and formatted better.
  • Vamb now outputs metadata about the unsplit clusters in the output TSV file
    vae_clusters_metadata.tsv.
  • Vamb now correctly uses a random seed on each invocation (#213)
  • Fixed various bugs and undoubtedly introduced some fresh ones.
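
The binsplitting behaviour described above can be sketched roughly as follows. This is an illustration, not Vamb's implementation: the function binsplit here is hypothetical, and it assumes contig names have the form sample, separator, contig identifier (for example S1C10 with the separator C). The naming of the split clusters is likewise an assumption.

```python
from collections import defaultdict


def binsplit(clusters: dict[str, list[str]], separator: str = "C") -> dict[str, list[str]]:
    """Split each cluster by the sample prefix of its contig names (illustrative only)."""
    split = defaultdict(list)
    for cluster, contigs in clusters.items():
        for contig in contigs:
            # Everything before the first separator is taken as the sample name.
            sample, found, _identifier = contig.partition(separator)
            if not found:
                # Mirrors the "error early" behaviour for an explicitly set separator.
                raise ValueError(f"Separator {separator!r} not found in {contig!r}")
            split[f"{sample}{separator}{cluster}"].append(contig)
    return dict(split)


print(binsplit({"1": ["S1C10", "S2C3", "S1C11"]}))
```

In this sketch a missing separator always raises. Per the notes above, Vamb itself only warns and disables binsplitting when the separator was left at its default.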

v4.1.3

02 Jun 11:10

  • Fix a bug that resulted in poor clustering results (#179)

v4.1.2

28 May 13:35

  • Fix a bug in src/create_fasta.py
  • Bugfix: Make seeding the RNG work from command line
  • Bump compatible Cython version

v4.1.1

28 Apr 10:41

  • Create tmp directory in parsebam if needed for pycoverm (issue #167)

v4.1.0

21 Apr 09:43

  • Allow setting the RNG seed from command line
  • Fix typo in output AAE_Z cluster names. They are now called e.g. "aae_z_1"
    instead of "aae_z1"
  • Clean up the directory structure of Avamb workflow.
  • Fix the CheckM2 dependencies to allow CheckM2 to be installed
  • Allow the Avamb workflow to be run on Slurm clusters
  • Fix issue #161: Mismatched refhash when spaces in FASTA headers

v4.0.1

30 Mar 13:13

  • Fix Random.choice for Tensor on Python 3.11. See issue #148