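import bmtrain as bmt
from transformers import LlamaForCausalLM, LlamaTokenizer

def get_model_tokenizer(args):
    # Every process (one per GPU) runs this and loads the full checkpoint.
    model = LlamaForCausalLM.from_pretrained(args.model_name_or_path)
    tokenizer = LlamaTokenizer.from_pretrained(args.model_name_or_path)
    tokenizer.add_special_tokens({'pad_token': ""})
    # Grow the embedding matrix to cover the newly added pad token.
    model.resize_token_embeddings(len(tokenizer))
    model = bmt.BMTrainModelWrapper(model)
    return model, tokenizer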
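Suppose I fine-tune the UltraChat 65B model on a single server with 8 GPUs. Will this run out of memory (OOM)? Every GPU's process executes model = LlamaForCausalLM.from_pretrained(args.model_name_or_path) and loads its own full copy of the model. Even if the weights are held only in CPU memory, a 65B model needs roughly 130 GB, so 8 processes need about 1 TB in total, and the server's total memory is also only about 1 TB.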
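To make the estimate above concrete, here is a minimal sketch of the arithmetic. It assumes 2 bytes per parameter (fp16/bf16 weights), 1 GB = 10^9 bytes, and that each of the 8 processes holds a complete copy in CPU memory at the same time. The helper name `host_memory_gb` is made up for illustration and is not part of BMTrain or transformers.

```python
def host_memory_gb(num_params: float, bytes_per_param: int, num_processes: int) -> float:
    """Estimate total CPU memory in GB (1 GB = 1e9 bytes), assuming every
    process keeps a full copy of the weights in host memory at once."""
    return num_params * bytes_per_param * num_processes / 1e9


PARAMS_65B = 65e9      # 65 billion parameters
BYTES_FP16 = 2         # assumption: weights stored as fp16/bf16
NUM_PROCESSES = 8      # one process per GPU on a single 8-GPU server

per_copy = host_memory_gb(PARAMS_65B, BYTES_FP16, 1)
total = host_memory_gb(PARAMS_65B, BYTES_FP16, NUM_PROCESSES)

print(f"one copy:  {per_copy:.0f} GB")   # 130 GB
print(f"8 copies:  {total:.0f} GB")      # 1040 GB
```

If the weights were instead materialized as fp32 (4 bytes per parameter), the same calculation would double both figures, so the real peak depends on the dtype used when loading.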