@kilinchange kilinchange commented Dec 27, 2025

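  1. DDP does not need the chunk-related methods and is not coupled with PP.
  2. The three modules GPT2Chunk, GPT2FirstStage, and GPT2LastStage are kept and are used to construct the GPT2 module. The GPT2 module's Forward runs only when PP is disabled; otherwise the schedule runs the Forward of those three modules.
  3. The module interface stays the same as before; the BuildChunks and ForwardChunk methods are not needed.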
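
To illustrate point 2, here is a minimal sketch of a GPT2 module built from the three stage modules. Only the class names GPT2, GPT2Chunk, GPT2FirstStage, and GPT2LastStage come from this PR. The `Module` base class, the `Activations` alias, the constructor signature, the first-stage, chunks, last-stage ordering, and the placeholder arithmetic are all assumptions made for illustration and are not the project's actual code.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for a tensor (assumption; real code would use the project's tensor type).
using Activations = std::vector<float>;

// Hypothetical common interface with a simplified Forward signature.
struct Module {
    virtual ~Module() = default;
    virtual Activations Forward(const Activations &input) = 0;
};

// Placeholder stage: stands in for the work done before the transformer blocks.
struct GPT2FirstStage : Module {
    Activations Forward(const Activations &input) override {
        Activations out = input;
        for (float &v : out) { v += 1.0f; }
        return out;
    }
};

// Placeholder stage: stands in for a group of transformer blocks.
struct GPT2Chunk : Module {
    Activations Forward(const Activations &input) override {
        Activations out = input;
        for (float &v : out) { v *= 2.0f; }
        return out;
    }
};

// Placeholder stage: stands in for the work done after the transformer blocks.
struct GPT2LastStage : Module {
    Activations Forward(const Activations &input) override {
        Activations out = input;
        for (float &v : out) { v -= 1.0f; }
        return out;
    }
};

// GPT2 only composes the three stage modules. Its Forward is the non-PP path;
// with PP enabled, a schedule would call the stage modules directly instead.
class GPT2 : public Module {
public:
    GPT2(std::shared_ptr<GPT2FirstStage> first,
         std::vector<std::shared_ptr<GPT2Chunk>> chunks,
         std::shared_ptr<GPT2LastStage> last)
        : first_(std::move(first)), chunks_(std::move(chunks)), last_(std::move(last)) {
        assert(first_ && last_ && "first and last stages must be provided");
    }

    Activations Forward(const Activations &input) override {
        Activations x = first_->Forward(input);
        for (const auto &chunk : chunks_) { x = chunk->Forward(x); }
        return last_->Forward(x);
    }

private:
    std::shared_ptr<GPT2FirstStage> first_;
    std::vector<std::shared_ptr<GPT2Chunk>> chunks_;
    std::shared_ptr<GPT2LastStage> last_;
};
```

With this shape, callers that do not use PP keep calling `Forward` on the GPT2 module as before, which is why no BuildChunks-style method is needed on the module interface in this sketch.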

    for (auto &[name, module] : modules_) {
        if (name.starts_with("__pp")) {
            continue;
        }

@kilinchange kilinchange Dec 27, 2025

What is really needed here is deduplication of the modules or parameters. @Chamberlain0w0, could you take a look and help add it?
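
One possible shape for that deduplication, as a rough sketch: track already-visited modules and parameters by pointer identity while walking the module tree, so that an object registered under several names is collected once without relying on a name prefix. The `Tensor` and `Module` structs, the `parameters_` member, and the `UniqueParameters` helper below are hypothetical stand-ins; only `modules_` and the `__pp` prefix appear in the reviewed code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins, not the project's real Tensor/Module classes.
struct Tensor {
    std::string label;
};

struct Module {
    std::unordered_map<std::string, std::shared_ptr<Tensor>> parameters_;
    std::unordered_map<std::string, std::shared_ptr<Module>> modules_;
};

// Walks the module tree and appends each distinct parameter to `out` once.
// Identity is the object's address, so aliases under other names are skipped.
void CollectUniqueParameters(const Module &module,
                             std::unordered_set<const Module *> &seen_modules,
                             std::unordered_set<const Tensor *> &seen_params,
                             std::vector<std::shared_ptr<Tensor>> &out) {
    if (!seen_modules.insert(&module).second) {
        return; // this module was already reached through another name
    }
    for (const auto &[name, param] : module.parameters_) {
        assert(param && "parameters are expected to be non-null");
        if (seen_params.insert(param.get()).second) {
            out.push_back(param);
        }
    }
    for (const auto &[name, child] : module.modules_) {
        assert(child && "submodules are expected to be non-null");
        CollectUniqueParameters(*child, seen_modules, seen_params, out);
    }
}

// Convenience wrapper (hypothetical name) that returns the deduplicated list.
std::vector<std::shared_ptr<Tensor>> UniqueParameters(const Module &root) {
    std::unordered_set<const Module *> seen_modules;
    std::unordered_set<const Tensor *> seen_params;
    std::vector<std::shared_ptr<Tensor>> out;
    CollectUniqueParameters(root, seen_modules, seen_params, out);
    return out;
}
```

Because the maps here are unordered, the order of the returned parameters is unspecified; an ordered container would be needed if a stable order matters, for example for optimizer state.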
