
Max Input and Output length #22

@shashank140195

Description


In finetune_for_summarization.py, why are max_source_length, train_max_target_length, and eval_max_target_length set to a default of 510? Is 510 the maximum input length BioMedLM can take, and the maximum number of tokens it can generate? As soon as I increase any of these values above the default, I get the error below.

max_source_length: Optional[int] = field(
    default=510, metadata={"help": "the max source length of summarization data. "}
)
train_max_target_length: Optional[int] = field(
    default=510, metadata={"help": "the max target length for training data. "}
)
eval_max_target_length: Optional[int] = field(
    default=510, metadata={"help": "the max target length for dev data. "}
)
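
For reference: BioMedLM is a GPT-2-style decoder-only model, so its context window is fixed by the size of its learned position-embedding table. For summarization fine-tuning, source and target are concatenated into a single sequence, so max_source_length + train_max_target_length plus any separator/special tokens must fit inside that window; 510 + 510 leaves just enough room, which is presumably why these defaults were chosen. A minimal sketch to check the window yourself (this assumes the Hugging Face checkpoint id "stanford-crfm/BioMedLM" is the one in use):

    from transformers import AutoConfig

    # GPT-2-style configs expose the context window as n_positions.
    config = AutoConfig.from_pretrained("stanford-crfm/BioMedLM")
    print(config.n_positions)
    # Any training sequence (source + target + special tokens) longer than
    # this indexes past the end of the wpe position-embedding table.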

Error:
Traceback (most recent call last):
  File "finetune_for_summarization.py", line 168, in <module>
    finetune()
  File "finetune_for_summarization.py", line 162, in finetune
    trainer.train()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1534, in train
    return inner_training_loop(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1807, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/trainer.py", line 2649, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/trainer.py", line 2674, in compute_loss
    outputs = model(**inputs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1769, in forward
    loss = self.module(*inputs, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 1075, in forward
    transformer_outputs = self.transformer(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 843, in forward
    position_embeds = self.wpe(position_ids)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
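
Note that the failing frame is the position-embedding lookup (self.wpe(position_ids)): a device-side assert here is the CUDA form of an out-of-range index into nn.Embedding. A standalone sketch of the failing lookup, using hypothetical sizes typical of a GPT-2-style table:

    import torch
    import torch.nn as nn

    # GPT-2-style position-embedding table: one row per position.
    wpe = nn.Embedding(1024, 768)
    wpe(torch.arange(1024))    # positions 0..1023: fine
    wpe(torch.tensor([1024]))  # IndexError on CPU; on a CUDA tensor this
                               # surfaces as "device-side assert triggered"

Running the same lookup on CPU turns the opaque CUDA assert into a plain IndexError, which is a handy way to confirm that overly long sequences are the cause.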
