Reimplement NN ensemble using PyTorch instead of TensorFlow #895

@osma

The current NN ensemble backend is implemented using TensorFlow, and it is the only part of the Annif codebase that depends on TensorFlow. I think it would make sense to try to get rid of the TensorFlow dependency, which means reimplementing the NN ensemble backend using PyTorch.

Reasons:

  1. PyTorch and TensorFlow provide very similar functionality. Both are quite large libraries (hundreds of megabytes for a CPU-only variant; several GB for CUDA or ROCm variants). I'm not sure if they can be used at the same time from the same Python process. Using just one of them would make installation and dependency management easier.

  2. The proposed PECOS / X(R)-Transformer backend in PR #798 (Add Xtransformer to backend) is based on PyTorch. We are also looking for a new implementation of fastText (see #795, Need for maintained fastText project), and one of the candidates is built on PyTorch. Also, DNB is working on a new Embedding Based Matching backend (see #855, Development of an Embedding-based matching backend), which uses PyTorch. So it looks like Annif will soon need to depend on PyTorch anyway.
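
On the open question in reason 1 (whether both libraries can be loaded in the same Python process), a quick way to gather evidence is to try importing them together in a given environment. The following is a minimal sketch using only the standard library. The helper name `probe_frameworks` is hypothetical and not part of Annif, and a successful import only shows that the two packages can be loaded side by side, not that they work correctly together at runtime.

```python
import importlib
import importlib.util


def probe_frameworks(names=("torch", "tensorflow")):
    """Try to import each named top-level module in the current process.

    Returns a dict mapping each name to "missing", "imported",
    or "failed: <error>". (Hypothetical helper, for illustration only.)
    """
    results = {}
    for name in names:
        # find_spec checks whether the package is installed without importing it.
        if importlib.util.find_spec(name) is None:
            results[name] = "missing"
            continue
        try:
            importlib.import_module(name)
            results[name] = "imported"
        except Exception as exc:  # import-time errors differ between libraries
            results[name] = f"failed: {type(exc).__name__}: {exc}"
    return results
```

Calling `probe_frameworks()` with its defaults in an environment that has both packages installed reports, per library, whether the import succeeded in that one process. Any result other than "imported" for both would support the case for depending on only one of them.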
