Hi!
After seeing some issues related to OOM errors caused by long prompts, I was wondering whether an option for generating sequences with Evo from longer prompts (>1 kb and so on) would be to decrease the float precision, rather than sharding across GPUs.
- I believe it is currently set to `bfloat16` (as in `model.backbone = model.backbone.to(torch.bfloat16)` from the `generation_to_folding.py` script), but would `float8` be an option (is it compatible with Evo at all)? A runnable sketch of what I mean is below.
- If so, do you expect a big drop in generation quality, or do you already have data comparing precision vs. performance?
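For context, here is a minimal sketch of the kind of cast I have in mind. The `_DummyModel` below is just a stand-in so the snippet runs without the Evo checkpoint; `torch.float8_e4m3fn` is an experimental dtype in recent PyTorch versions, and I'm not sure whether Evo's kernels can actually compute in it:

```python
import torch
import torch.nn as nn

# Stand-in for the Evo backbone, only so this sketch is runnable;
# in generation_to_folding.py, model.backbone comes from the Evo checkpoint.
class _DummyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)

model = _DummyModel()

# What the script does today: cast the backbone to bfloat16,
# halving parameter memory relative to float32.
model.backbone = model.backbone.to(torch.bfloat16)
print(next(model.backbone.parameters()).dtype)  # torch.bfloat16

# A naive float8 cast. Even where the cast itself succeeds, most kernels
# cannot compute in float8, so a forward pass would likely fail; practical
# float8 setups usually store weights in fp8 but compute in bf16/fp16
# (e.g. via a quantization library such as torchao).
try:
    fp8_backbone = model.backbone.to(torch.float8_e4m3fn)
    print(next(fp8_backbone.parameters()).dtype)
except (RuntimeError, AttributeError) as err:
    print(f"plain float8 cast not supported here: {err}")
```

If a plain `.to()` cast isn't viable, maybe weight-only quantization would achieve a similar memory saving?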
Thanks so much!