RuntimeError: size mismatch.

Hi, thanks for sharing the code.


I am trying to replicate the results you showed in the paper by running eval.

But I got a size mismatch error:

[rank0]:     raw_reward = per_token_logps - ref_per_token_logps if ref_setup == "w/ ref" else per_token_logps
[rank0]: RuntimeError: The size of tensor a (624) must match the size of tensor b (645) at non-singleton dimension 1