[Bug]: PyTorch Quickstart - avg_trainloss scales with local-epochs > 1 #6333

@jun-simons

Description

Describe the bug

In quickstart-pytorch, the reported avg_trainloss increases linearly with local-epochs. A search shows this issue has also been raised in the discussion forum.

The fix is simply to include the number of epochs in the normalization of the accumulated loss.
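A minimal sketch of the bug, assuming (as in the quickstart) that running_loss accumulates loss.item() for every batch of every local epoch but is divided only by len(trainloader). The per-batch losses below are hypothetical constants standing in for real training losses:

```python
# Hypothetical per-batch losses for one pass over the trainloader.
batch_losses = [2.3, 2.2, 2.1, 2.0]
num_batches = len(batch_losses)

for epochs in (1, 2, 4):
    # running_loss after `epochs` local epochs
    # (losses held constant across epochs for clarity).
    running_loss = sum(batch_losses) * epochs

    buggy = running_loss / num_batches             # grows linearly with epochs
    fixed = running_loss / (epochs * num_batches)  # stays constant

    print(f"epochs={epochs}: buggy={buggy:.2f}, fixed={fixed:.2f}")
```

With the buggy normalization the reported value doubles as epochs double, exactly matching the observed scaling; dividing by epochs * num_batches yields a true per-batch average.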

Steps/Code to Reproduce

From the Quickstart, run with different numbers of local epochs:
flwr run . --run-config "num-server-rounds=1 local-epochs=1" : train_loss = 2.21
flwr run . --run-config "num-server-rounds=1 local-epochs=2" : train_loss = 4.34
flwr run . --run-config "num-server-rounds=1 local-epochs=4" : train_loss = 8.57

Expected Results

The average loss should not increase linearly with the number of local epochs; it should be normalized by the total number of batches across all epochs.

Actual Results

1 epoch: train_loss = 2.21
2 epochs: train_loss = 4.34 (~x2)
4 epochs: train_loss = 8.57 (~x4)
