Hi, thank you for your great work and the open-source release!
How do I evaluate the model's performance on each dataset?
Could you please provide some scripts or commands for the evaluation?
Alternatively, could you share the code repository you used for evaluation?
Thanks!