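Also: I tried to reproduce PRIME training starting from Eurus-2-7B-SFT, but could not reproduce the expected results. The metrics are quite far off. Training has currently reached step 399.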
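As the title says,
but at the moment I can't work out what the specific differences are between these similar aspects and R1.
Could anyone with more expertise analyze and discuss this?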