Pre-train mT5 on Custom Dataset #49
Is it possible to pre-train mT5 on a custom dataset? If yes:
- Should the data be provided as a TFDS dataset, or are other file formats supported?
- Should the data be pre-processed into a tab-separated "inputs \t targets" format before calling the command below, where ${TASK} is the custom task? (A rough sketch of how such a task might be registered is given after the command.)
python -m t5.models.mesh_transformer_main \
  --tpu="${TPU}" \
  --gcp_project="${PROJECT}" \
  --tpu_zone="${ZONE}" \
  --model_dir="${MODEL_DIR}" \
  --gin_file="models/t5.1.1.large.gin" \
  --gin_param="MIXTURE_NAME = '${TASK}'" \
  --gin_param="utils.run.sequence_length = {'inputs': 1024, 'targets': 256}" \
  --gin_param="utils.run.batch_size = ('tokens_per_batch', 1048576)" \
  --gin_param="utils.run.learning_rate_schedule=@learning_rate_schedules.rsqrt_no_ramp_down" \
  --gin_param="run.train_steps = 1000000" \
  --gin_param="utils.tpu_mesh_shape.model_parallelism = 1" \
  --gin_param="utils.tpu_mesh_shape.tpu_topology = 'v3-256'" \
  --eval_mode="perplexity_eval" \
  --eval_gin_param="mesh_eval_dataset_fn.num_eval_examples = 10000" \
  --t5_tfds_data_dir="${BUCKET}/t5-tfds" \
  --module_import="multilingual_t5.tasks"
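For reference, here is a rough, untested sketch of how a custom tab-separated dataset might be registered as a task. It follows the generic task-registration pattern from the t5 library (t5.data.TaskRegistry / t5.data.TextLineTask), not anything specific to this repo. The task name "my_custom_task", the gs:// file paths, and the module name my_tasks.py are placeholders I made up; the vocabulary path is the mC4 SentencePiece model that multilingual_t5.tasks appears to use, but please double-check it against tasks.py for the version you are running.

# my_tasks.py -- hypothetical module; make it importable and pass it via --module_import.
import functools

import t5
import t5.data
from t5.data import preprocessors

# mC4 SentencePiece vocabulary used by mT5 (see multilingual_t5/tasks.py);
# verify the path against the repo version you are using.
MT5_SPM_PATH = "gs://t5-data/vocabs/mc4.250000.100extra/sentencepiece.model"
MT5_VOCAB = t5.data.SentencePieceVocabulary(MT5_SPM_PATH)

OUTPUT_FEATURES = {
    "inputs": t5.data.Feature(vocabulary=MT5_VOCAB, add_eos=True),
    "targets": t5.data.Feature(vocabulary=MT5_VOCAB, add_eos=True),
}

# Placeholder paths -- replace with your own TSV files, one
# "inputs<TAB>targets" example per line.
split_to_filepattern = {
    "train": "gs://my-bucket/my_data/train.tsv",
    "validation": "gs://my-bucket/my_data/validation.tsv",
}

t5.data.TaskRegistry.add(
    "my_custom_task",  # the name to pass as ${TASK} / MIXTURE_NAME
    t5.data.TextLineTask,
    split_to_filepattern=split_to_filepattern,
    # Parse each TSV line into {"inputs": ..., "targets": ...}.
    text_preprocessor=[
        functools.partial(
            preprocessors.parse_tsv,
            field_names=["inputs", "targets"]),
    ],
    output_features=OUTPUT_FEATURES,
    metric_fns=[])

The registered name would then be supplied as ${TASK} (via the MIXTURE_NAME gin param) and the defining module passed with --module_import="my_tasks" in place of (or in addition to) multilingual_t5.tasks. Note that for pre-training with the span-corruption objective, the data would more typically be plain text run through the span-corruption preprocessors, as the mC4 task in multilingual_t5/tasks.py does, rather than explicit inputs/targets pairs.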