PyTorch Lightning 0.9: Synced BatchNorm, DataModules and final API
Overview
The newest PyTorch Lightning release includes final API clean-up with better data decoupling and shorter logging syntax.
We're happy to release PyTorch Lightning 0.9 today, which contains many great new features and more bug fixes than any release we've ever had, but most importantly it introduces our mostly-final API changes! Lightning is being adopted by top researchers and AI labs around the world, and we are working hard to make sure we provide a smooth experience and support for all the latest best practices.
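The centerpiece of the improved data decoupling is the new `LightningDataModule`, which packages downloading, splitting and dataloader creation into a single reusable object that the Trainer can consume directly. Below is a minimal sketch; the MNIST dataset, directory, batch size and split sizes are purely illustrative, and the exact hook signatures may differ slightly from your installed version.

```python
import pytorch_lightning as pl
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms


class MNISTDataModule(pl.LightningDataModule):
    """Groups all data-related logic so it can be swapped independently of the model."""

    def __init__(self, data_dir="./data", batch_size=32):
        super().__init__()
        self.data_dir = data_dir
        self.batch_size = batch_size

    def prepare_data(self):
        # Called once on a single process: download only, assign no state here.
        datasets.MNIST(self.data_dir, train=True, download=True)
        datasets.MNIST(self.data_dir, train=False, download=True)

    def setup(self, stage=None):
        # Called on every process: build and split the datasets.
        transform = transforms.ToTensor()
        if stage in (None, "fit"):
            full = datasets.MNIST(self.data_dir, train=True, transform=transform)
            self.train_set, self.val_set = random_split(full, [55000, 5000])
        if stage in (None, "test"):
            self.test_set = datasets.MNIST(self.data_dir, train=False, transform=transform)

    def train_dataloader(self):
        return DataLoader(self.train_set, batch_size=self.batch_size, shuffle=True)

    def val_dataloader(self):
        return DataLoader(self.val_set, batch_size=self.batch_size)

    def test_dataloader(self):
        return DataLoader(self.test_set, batch_size=self.batch_size)


# DataModule hooks are called implicitly by the trainer (#2755):
# dm = MNISTDataModule()
# trainer = pl.Trainer(gpus=1)
# trainer.fit(model, datamodule=dm)
```

Because the trainer now calls the DataModule hooks implicitly, the same data pipeline can be reused across different models without extra glue code.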
Detailed changes
Added
- Added SyncBN for DDP (#2801, #2838) (example below)
- Added basic `CSVLogger` (#2721) (example below)
- Added SSIM metrics (#2671)
- Added BLEU metrics (#2535)
- Added support to export a model to ONNX format (#2596) (example below)
- Added support for `Trainer(num_sanity_val_steps=-1)` to check all validation data before training (#2246) (example below)
- Added structured output
- Added class `LightningDataModule` (#2668)
- Added support for PyTorch 1.6 (#2745)
- Added calling DataModule hooks implicitly in the trainer (#2755)
- Added support for Mean in DDP Sync (#2568)
- Added remaining `sklearn` metrics: `AveragePrecision`, `BalancedAccuracy`, `CohenKappaScore`, `DCG`, `Hamming`, `Hinge`, `Jaccard`, `MeanAbsoluteError`, `MeanSquaredError`, `MeanSquaredLogError`, `MedianAbsoluteError`, `R2Score`, `MeanPoissonDeviance`, `MeanGammaDeviance`, `MeanTweedieDeviance`, `ExplainedVariance` (#2562)
- Added support for `limit_{mode}_batches (int)` to work with an infinite dataloader (IterableDataset) (#2840)
- Added support for returning Python scalars in DP (#1935)
- Added support to the TensorBoard logger for OmegaConf `hparams` (#2846)
- Added tracking of basic states in `Trainer` (#2541)
- Tracks all outputs including TBPTT and multiple optimizers (#2890)
- Added GPU Usage Logger (#2932)
- Added `strict=False` for `load_from_checkpoint` (#2819) (example below)
- Added saving test predictions on multiple GPUs (#2926)
- Auto log the computational graph for loggers that support this (#3003)
- Added warning when changing monitor and using results obj (#3014)
- Added a hook `transfer_batch_to_device` to the `LightningDataModule` (#3038) (example below)
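Several of the additions above surface directly as `Trainer` arguments or loggers. A hedged sketch of how they could be combined, assuming the synced-BatchNorm flag is spelled `sync_batchnorm` and that `CSVLogger` is importable from `pytorch_lightning.loggers`:

```python
import pytorch_lightning as pl
from pytorch_lightning.loggers import CSVLogger

trainer = pl.Trainer(
    gpus=2,
    distributed_backend="ddp",     # SyncBN is only meaningful with a DDP-style backend
    sync_batchnorm=True,           # convert BatchNorm layers to synced BatchNorm (#2801, #2838)
    num_sanity_val_steps=-1,       # sanity-check on *all* validation batches before training (#2246)
    logger=CSVLogger(save_dir="logs", name="my_experiment"),  # basic CSV logging (#2721)
)
# trainer.fit(model, datamodule=dm)
```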
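ONNX export (#2596) is exposed on the model itself. A minimal sketch, assuming the method is named `to_onnx` and accepts a sample input for tracing; the file name and input shape are illustrative:

```python
import torch

# "model" is any trained LightningModule; adjust the input shape to your network.
sample = torch.randn(1, 28 * 28)
model.to_onnx("model.onnx", input_sample=sample)

# The exported file can then be served with any ONNX runtime, for example:
# import onnxruntime
# session = onnxruntime.InferenceSession("model.onnx")
```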
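`strict=False` for `load_from_checkpoint` (#2819) mirrors the behaviour of `torch.nn.Module.load_state_dict(strict=False)`: mismatched keys no longer raise an error. A short sketch, where the model class and checkpoint path are placeholders:

```python
# Load weights even if the checkpoint's state_dict does not exactly match the
# current model, e.g. after adding a new head to the architecture.
model = MyLightningModel.load_from_checkpoint(
    "checkpoints/last.ckpt",
    strict=False,  # skip missing / unexpected keys instead of raising
)
```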
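The new `transfer_batch_to_device` hook on `LightningDataModule` (#3038) lets you control how non-standard batch objects are moved to the target device. A hedged sketch, assuming the hook receives the batch and the device; the `CustomBatch` container is made up for illustration:

```python
from dataclasses import dataclass

import torch
import pytorch_lightning as pl


@dataclass
class CustomBatch:
    # Illustrative container produced by a custom collate_fn.
    inputs: torch.Tensor
    targets: torch.Tensor


class MyDataModule(pl.LightningDataModule):
    def transfer_batch_to_device(self, batch, device):
        # Lightning moves plain tensors, lists and dicts automatically; a custom
        # object has to be moved field by field.
        if isinstance(batch, CustomBatch):
            batch.inputs = batch.inputs.to(device)
            batch.targets = batch.targets.to(device)
        return batch
```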
Changed
- Truncated long version numbers in progress bar (#2594)
- Enabled disabling of the val/test loop (#2692)
- Refactored into `accelerator` module
- Using `.comet.config` file for `CometLogger` (#1913)
- Updated hooks arguments - breaking for `setup` and `teardown` (#2850)
- Using `gfile` to support remote directories (#2164)
- Moved optimizer creation after device placement for DDP backends (#2904)
- Support for `**DictConfig` for `hparam` serialization (#2519)
- Removed callback metrics from test results obj (#2994)
- Re-enabled naming metrics in ckpt name (#3060)
- Changed progress bar epoch counting to start from 0 (#3061)
Deprecated
- Deprecated Trainer attribute `ckpt_path`, which will now be set by `weights_save_path` (#2681)
Removed
- Removed deprecated: (#2760)
  - core decorator `data_loader`
  - Module hook `on_sanity_check_start` and loading `load_from_metrics`
  - package `pytorch_lightning.logging`
  - Trainer arguments: `show_progress_bar`, `num_tpu_cores`, `use_amp`, `print_nan_grads`
  - LR Finder argument `num_accumulation_steps`
Fixed
- Fixed `accumulate_grad_batches` for last batch (#2853)
- Fixed setup call while testing (#2624)
- Fixed local rank zero casting (#2640)
- Fixed single scalar return from training (#2587)
- Fixed Horovod backend to scale LR schedulers with the optimizer (#2626)
- Fixed `dtype` and `device` properties not getting updated in submodules (#2657)
- Fixed `fast_dev_run` to run for all dataloaders (#2581)
- Fixed `save_dir` in loggers getting ignored by the default value of `weights_save_path` when the user did not specify `weights_save_path` (#2681)
- Fixed `weights_save_path` getting ignored when `logger=False` is passed to the Trainer (#2681)
- Fixed TPU multi-core and Float16 (#2632)
- Fixed test metrics not being logged with `LoggerCollection` (#2723)
- Fixed data transfer to device when using `torchtext.data.Field` and `include_lengths is True` (#2689)
- Fixed shuffle argument for the distributed sampler (#2789)
- Fixed logging interval (#2694)
- Fixed wrong loss value in the progress bar when `accumulate_grad_batches > 1` (#2738)
- Fixed using the correct CWD for DDP sub-processes when using Hydra (#2719)
- Fixed selecting GPUs using `CUDA_VISIBLE_DEVICES` (#2739, #2796)
- Fixed false `num_classes` warning in metrics (#2781)
- Fixed shell injection vulnerability in subprocess call (#2786)
- Fixed LR finder and `hparams` compatibility (#2821)
- Fixed `ModelCheckpoint` not saving the latest information when `save_last=True` (#2881)
- Fixed ImageNet example: learning rate scheduler, number of workers and batch size when using DDP (#2889)
- Fixed apex gradient clipping (#2829)
- Fixed save apex scaler states (#2828)
- Fixed a model loading issue with inheritance and variable positional arguments (#2911)
- Fixed passing `non_blocking=True` when transferring a batch object that does not support it (#2910)
- Fixed checkpointing to remote file paths (#2925)
- Fixed adding `val_step` argument to metrics (#2986)
- Fixed an issue that caused `Trainer.test()` to stall in DDP mode (#2997)
- Fixed gathering of results with tensors of varying shape (#3020)
- Fixed batch size auto-scaling feature to set the new value on the correct model attribute (#3043)
- Fixed automatic batch scaling not working with half-precision (#3045)
- Fixed setting device to root GPU (#3042)
Contributors
@ananthsub, @ananyahjha93, @awaelchli, @bkhakshoor, @Borda, @ethanwharris, @f4hy, @groadabike, @ibeltagy, @justusschock, @lezwon, @nateraw, @neighthan, @nsarang, @PhilJd, @pwwang, @rohitgr7, @romesco, @ruotianluo, @shijianjian, @SkafteNicki, @tgaddair, @thschaaf, @williamFalcon, @xmotli02, @ydcjeff, @yukw777, @zerogerc
If we forgot someone due to not matching commit email with GitHub account, let us know :]