In the code train_model.py:
model = AutoModelForSeq2SeqLM.from_pretrained(
model_args.model_name_or_path,
from_tf=bool(".ckpt" in model_args.model_name_or_path),
config=config,
cache_dir=model_args.cache_dir,
revision=model_args.model_revision,
use_auth_token=True if model_args.use_auth_token else None,
)
if anyone could not use this to initial a flan-t5 model from AutoModelForSeq2SeqLM, then you need to add the following in params:
unk_token="",
bos_token="",
eos_token=""
Thanks!
In the code train_model.py:
model = AutoModelForSeq2SeqLM.from_pretrained(
model_args.model_name_or_path,
from_tf=bool(".ckpt" in model_args.model_name_or_path),
config=config,
cache_dir=model_args.cache_dir,
revision=model_args.model_revision,
use_auth_token=True if model_args.use_auth_token else None,
)
if anyone could not use this to initial a flan-t5 model from AutoModelForSeq2SeqLM, then you need to add the following in params:
unk_token="",
bos_token="
","eos_token="
Thanks!