
Conversation

clairecyq

Summary: Use the same device for constant tensor as input tensor to avoid concatenating tensors on separate devices

Reviewed By: czxttkl

Differential Revision: D39020305
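The pattern this change enforces can be sketched in a few lines (the helper name is illustrative, not part of the ReAgent API):

```python
import torch

def cat_with_bias(x: torch.Tensor) -> torch.Tensor:
    # Create the constant on the same device (and dtype) as the input;
    # torch.cat raises a RuntimeError if its inputs live on different devices.
    ones = torch.ones(x.shape[0], 1, device=x.device, dtype=x.dtype)
    return torch.cat([x, ones], dim=1)

x = torch.randn(4, 3)
print(cat_with_bias(x).shape)  # torch.Size([4, 4])
```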

czxttkl and others added 30 commits July 1, 2021 00:32
Summary:
Pull Request resolved: facebookresearch#498

Add some assertions to make sure end users can use algorithms correctly.

Reviewed By: bankawas

Differential Revision: D29481662

fbshipit-source-id: 0332d990df7d3eca61e1f7bd205136d32f04a7b2
Summary:
Pull Request resolved: facebookresearch#499

Remove Seq2SlateDifferentiableRewardTrainer because it's not tested and wouldn't be used.

Reviewed By: kittipatv

Differential Revision: D29522083

fbshipit-source-id: 9cd7e0d6d1d10c17cc174a54d77a4b37b0f279b7
Summary:
Pull Request resolved: facebookresearch#500

Migrate the regular seq2slate to PyTorch Lightning, which includes one model manager `Seq2SlateTransformer` and three trainers `Seq2SlateTrainer`, `Seq2SlateSimulationTrainer` and `Seq2SlateTeacherForcingTrainer`. Manual optimization (https://pytorch-lightning.readthedocs.io/en/latest/common/optimizers.html#manual-optimization) is used to handle the sophisticated usage of optimizers during training.

Model manager `Seq2SlatePairwiseAttn` and trainer `Seq2SlatePairwiseAttnTrainer` are not migrated in this diff. But to make them compatible with the changes, the setting of `minibatch_size` is also moved from `trainer_params` to `reader_options`.

Reviewed By: czxttkl

Differential Revision: D29436608

fbshipit-source-id: 612a1de4923eb7d138fcb6cb4715be6e4d05b424
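In plain PyTorch terms, Lightning's manual optimization means the training step itself drives the optimizer rather than returning a loss to an automatic loop; a minimal sketch (not ReAgent's trainer code):

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(2, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

def training_step(batch):
    # With manual optimization, the step calls zero_grad / backward / step
    # itself (in Lightning: opt = self.optimizers();
    # self.manual_backward(loss); opt.step()).
    x, y = batch
    loss = torch.nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

batch = (torch.randn(8, 2), torch.randn(8, 1))
first = training_step(batch)
later = min(training_step(batch) for _ in range(20))
```

This gives full control over when and how often each optimizer steps, which is why it suits the multi-optimizer Seq2Slate trainers.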
Summary: AutoDataModule yields dictionary of tensors. Therefore, we need to manually type the input

Reviewed By: czxttkl

Differential Revision: D29479986

fbshipit-source-id: ab135bb869d8f0eb1fba1813aebf5af6d5ca3401
Differential Revision: D29573192

fbshipit-source-id: 65dc670d1777dd1d6b86c9228a198cd16f504c6e
…h#489)

Summary: Pull Request resolved: facebookresearch#489

Reviewed By: czxttkl

Differential Revision: D29144000

fbshipit-source-id: b72401ee3bb69f4973c32914a440e571d56241f6
…ebookresearch#502)

Summary:
Pull Request resolved: facebookresearch#502

Use transformers to learn the return decomposition model.
1) customized attention layers that feed positional encoding to Key & Query but not V.
2) residual connections that learn meaningful embeddings.

Reviewed By: czxttkl

Differential Revision: D29346526

fbshipit-source-id: c6e642548d4d2b0bcc7f089c08d9144c6f96f8e0
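A hedged sketch of point 1, single-head for brevity (the module and all names are illustrative, not the diff's implementation):

```python
import torch
from torch import nn

class PosQKAttention(nn.Module):
    """Positional encoding enters the Query and Key projections,
    while the Value projection sees only the content embeddings."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, pos: torch.Tensor) -> torch.Tensor:
        q = self.q(x + pos)   # position-aware
        k = self.k(x + pos)   # position-aware
        v = self.v(x)         # content only
        attn = torch.softmax(q @ k.transpose(-2, -1) / x.shape[-1] ** 0.5, dim=-1)
        return attn @ v

layer = PosQKAttention(16)
out = layer(torch.randn(2, 5, 16), torch.randn(1, 5, 16))
print(out.shape)  # torch.Size([2, 5, 16])
```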
Reviewed By: zertosh

Differential Revision: D29656934

fbshipit-source-id: c40bbc8e4512b145050ee47db2c8dc781f3c36e9
…search#501)

Summary:
Pull Request resolved: facebookresearch#501

Migrate model manager `Seq2SlatePairwiseAttn` and trainer `Seq2SlatePairwiseAttnTrainer` to PyTorch Lightning.

This diff marks the completeness of the migration to PyTorch Lightning for the entire reagent codebase. `train_and_evaluate_generic` is removed. Only `train_eval_lightning` from now on!

Reviewed By: kittipatv, czxttkl

Differential Revision: D29545053

fbshipit-source-id: 71d115c07354b297d3b56d9bfcd13854cd60cb34
Summary:
Pull Request resolved: facebookresearch#503

(1) Entropy regularization is added to CRR to test whether it can help improve training stability.

(2) Modification in rl_offline_analysis: extract the `dqn` manifold path from CRR outputs.

Reviewed By: czxttkl

Differential Revision: D29469826

fbshipit-source-id: 705ee9069edff9a2b2ff5362d3c4ff464b5a27bd
Summary: Several modules in the ReAgent library set the logger level in library code, overriding the level set by the library client and resulting in very verbose stdout. This diff removes the places in the library where the logger level is set, so that the client's setting is always maintained.

Reviewed By: bankawas

Differential Revision: D29673661

fbshipit-source-id: 8f6db342571d4524768f75d6d6bf4416bad8ad1c
Summary: Delete old style trainer classes

Reviewed By: czxttkl

Differential Revision: D29700788

fbshipit-source-id: 2f4448d9a7cb8d31d11b25bf35184e1f8c1ce9f6
Differential Revision: D29738340

fbshipit-source-id: 97c83cea89c46c469cdc967cce2ac7ce281c55fc
Summary: Pull Request resolved: facebookresearch#508

Reviewed By: czxttkl

Differential Revision: D29805519

fbshipit-source-id: dbcde11f8292eb167a0b7a66384e0d1d723b38e4
Summary:
Pull Request resolved: facebookresearch#506

Use `ExtraData.from_dict`

Reviewed By: czxttkl

Differential Revision: D29768249

fbshipit-source-id: de0056420ab71a79c4f9821cf451328949256037
Summary: Implement data module for Seq2Slate

Reviewed By: czxttkl

Differential Revision: D29717416

fbshipit-source-id: 424e3c025d73f691c8b0880f853f8d4dca0db584
Summary:
Pull Request resolved: facebookresearch#507

Previously the SlateQ trainer only supported SARSA on-policy training. This diff implements an off-policy training approach based on Q-learning.

Changes are:

1. Introduced a new `slate_opt_parameters` to specify which slate optimization method to use (top_k, greedy, or exact, based on the SlateQ paper). Currently only the top_k approach is implemented;
2. When choosing the next action, instead of directly using `training_batch.next_action`, we first calculate the Q-value for each next candidate, rank the candidates by doc value * Q-value, and choose the indices of the top-k items as the next action.

Reviewed By: kittipatv

Differential Revision: D29660887

fbshipit-source-id: 9b15de4cba41ad5e34f1ca4553f90c53399052c4
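A minimal sketch of step 2 (the helper name is hypothetical):

```python
import torch

def topk_next_action(next_q: torch.Tensor, doc_value: torch.Tensor, k: int) -> torch.Tensor:
    """Rank each next candidate by doc value * Q-value and return the
    indices of the top-k items as the next slate."""
    scores = doc_value * next_q          # (batch, num_candidates)
    return scores.topk(k, dim=1).indices

q = torch.tensor([[0.1, 0.9, 0.5]])
v = torch.tensor([[1.0, 0.5, 1.0]])
print(topk_next_action(q, v, k=2))  # tensor([[2, 1]])
```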
Summary:
Pull Request resolved: facebookresearch#511

n/a

Reviewed By: igfox

Differential Revision: D29820857

fbshipit-source-id: 7389785f20e1a503c5eea3221c5ad68ca1f79b31
Summary:
Pull Request resolved: facebookresearch#510

Currently QR-DQN is not tested offline. This diff adds an integration test to open_ai_gym_offline and cogwheel. It also corrects an issue with QR-DQN's CPE (the optimizers were in the wrong order) and modifies our model registration to work outside of fblearner flow environments.

Reviewed By: kittipatv

Differential Revision: D29800557

fbshipit-source-id: ae324c0323a9e644524a228ab296c412923c5336
Summary:
Pull Request resolved: facebookresearch#512

- Removing the opt from `manual_backward` call
- Pin Lightning version to same version as in fbcode

Reviewed By: igfox

Differential Revision: D29828482

fbshipit-source-id: 26a52d71362a9a6fd1ea995d854f4a0268d5cce6
Summary: Currently if logged action prob is 0 NaNs can propagate to the actor loss (even with entropy set to 0) and mess up training (f287261291). This diff removes entropy calculation if entropy_coeff <= 0 and raises an error if entropy calculation is on while a logged action has probability 0.

Reviewed By: czxttkl

Differential Revision: D29861744

fbshipit-source-id: 2fae30e7108145139851d0767d7bbe18f6dd388a
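A hypothetical sketch of the guard described above (the helper name is illustrative):

```python
import torch

def entropy_term(logged_prob: torch.Tensor, entropy_coeff: float) -> torch.Tensor:
    """Drop the entropy term entirely when entropy_coeff <= 0, and fail
    fast if any logged action probability is 0 (log(0) would poison the
    actor loss with NaN/inf)."""
    if entropy_coeff <= 0:
        return torch.zeros(())  # no log() of 0 can occur
    if (logged_prob == 0).any():
        raise ValueError("logged action probability is 0; entropy would be NaN")
    return -entropy_coeff * torch.log(logged_prob).mean()

print(entropy_term(torch.tensor([0.0, 0.5]), entropy_coeff=0.0))  # tensor(0.)
```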
Summary: Black/whitelist are in the process of being removed from all FB code (https://fb.workplace.com/groups/e/permalink/3320810064641820/). This diff replaces all instances of black/whitelist with block/allowlist in the reagent codebase.

Reviewed By: kittipatv

Differential Revision: D29881070

fbshipit-source-id: 3d2e63eff5f4371f994ba4ae37586e3ef33c2fb7
…fbe63) to github/third-party/PyTorchLightning/pytorch-lightning

Summary:
# Manual Changes
- Migrate callsites of `_extract_batch_size` to `extract_batch_size` (as per https://fburl.com/code/4q7n8fs9).
- Remove unnecessary unit tests in `test_hive_writing_callback.py`

### New commit log messages
  000fbe63 Expose `extract_batch_size` method and add corresponding tests. (#8357)

Reviewed By: yifuwang

Differential Revision: D29834484

fbshipit-source-id: 219a3d40401d9b2c35d3a74b75f2394c4f57d61b
Differential Revision: D29929083

fbshipit-source-id: 66bae2de6f4c7ac658de98475b00f81215ef6b0e
Summary:
Pull Request resolved: facebookresearch#513

The previous approach used the fixed slate_size, which includes padded items, so it didn't give us the actual average over valid Q-value estimations.

This diff fixes the issue by calculating the actual slate_size, summing the item mask (1 if an item is valid) over each slate.

Reviewed By: czxttkl

Differential Revision: D29848923

fbshipit-source-id: 2a3fea30cdaa46b85b72fe5b5d054d7b78755a5b
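A minimal sketch of the masked average (helper name illustrative):

```python
import torch

def mean_valid_q(q_values: torch.Tensor, item_mask: torch.Tensor) -> torch.Tensor:
    """Average Q over valid items only: the denominator is the per-slate
    mask sum, not the fixed (padded) slate_size."""
    slate_size = item_mask.sum(dim=1)                 # actual size per slate
    return (q_values * item_mask).sum(dim=1) / slate_size

q = torch.tensor([[1.0, 3.0, 100.0]])
mask = torch.tensor([[1.0, 1.0, 0.0]])                # third item is padding
print(mean_valid_q(q, mask))  # tensor([2.])
```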
Summary:
Pull Request resolved: facebookresearch#514

As titled. RBF kernel is used in Eqn. 10 in https://jgillenw.com/cikm2018.pdf.

Reviewed By: Strideradu

Differential Revision: D29894690

fbshipit-source-id: 46681ca4e0b5091434834d7f86d9d87c7228da64
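The RBF kernel has the standard form k(x, y) = exp(-||x - y||² / (2σ²)); a sketch, with `sigma` as an assumed bandwidth parameter (how the diff parameterizes it is not shown here):

```python
import torch

def rbf_kernel(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Pairwise RBF kernel between rows of x and rows of y."""
    sq_dist = torch.cdist(x, y) ** 2
    return torch.exp(-sq_dist / (2 * sigma ** 2))

x = torch.zeros(1, 2)
print(rbf_kernel(x, x))  # tensor([[1.]])  -- identical points give k = 1
```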
Summary:
Our current PathManager is based on fvcore, which is in the process of being deprecated and replaced by an open-source solution, iopath.

This diff is the result of running the provided codemod script on reagent, rl, and rl_exp, followed by a round of autodeps and a fbgs search for 'fvcore'.

https://fb.workplace.com/groups/939200583171018/permalink/1022439911513751/

Reviewed By: czxttkl

Differential Revision: D29974786

fbshipit-source-id: 397fd69ef94d43a7ca07c963c2a46bbbdcf78599
Summary:
Pull Request resolved: facebookresearch#515

title

Reviewed By: czxttkl

Differential Revision: D29922558

fbshipit-source-id: b5ad7863d5c5b15363a5e9daf237800b79c260f2
Summary:
Pull Request resolved: facebookresearch#516

See title. Also:
- modifies the CRR constructor to use the standard target_model format
- fixes a bug with delayed_policy_update
- adds some todos

Reviewed By: czxttkl

Differential Revision: D29970711

fbshipit-source-id: 8d3add660b865d96b365cbda9baf0aa7ea13e879
Summary: Pull Request resolved: facebookresearch#519

Reviewed By: igfox

Differential Revision: D30047652

fbshipit-source-id: b2a4b2542455e43798a8c0b4606be88bcb00f823
czxttkl and others added 24 commits July 27, 2022 09:40
Summary:
To avoid test failures caused by importing torchrec, I finally decided on the following import rules:
For GPU machines, import torchrec (GPU, stable version).
For CPU machines, import torchrec-nightly-cpu.

Reviewed By: speedystream

Differential Revision: D38185498

fbshipit-source-id: 7988695f827cfd04d53f6d63630ac843eb6c23ee
Summary:
Pull Request resolved: facebookresearch#657

1. add docstrings
2. test if models are torch.jit.trace-able

Reviewed By: dkorenkevych

Differential Revision: D38067071

fbshipit-source-id: 7863e0e1f3f618ee7fe46c6fa076fec1dd6fd48a
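A minimal traceability check of the kind item 2 describes, using `torch.jit.trace` (the model here is illustrative):

```python
import torch

class TinyModel(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x) + 1.0

model = TinyModel()
example = torch.randn(2, 3)
traced = torch.jit.trace(model, example)  # raises if the model isn't traceable
assert torch.equal(traced(example), model(example))
```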
Summary:
Pull Request resolved: facebookresearch#653

Add test_dqn_base.py file with unit tests
for the methods in DQNTrainerBaseLightning class.

Reviewed By: czxttkl

Differential Revision: D37673366

fbshipit-source-id: 43482dde9be06a0df1e8dd3bb16e92d508bc8a13
…ebookresearch#654)

Summary:
Pull Request resolved: facebookresearch#654

Add docstrings to DQNTrainer and DQNTrainerBaseLightning classes and their methods.

Reviewed By: czxttkl

Differential Revision: D37875900

fbshipit-source-id: 52e9947f1c84f099bedb79a696de94a05c631f5c
Summary:
Pull Request resolved: facebookresearch#663

This is the first part of the diff. I have moved the model into reagent. Moreover, I have done some refactoring.

Reviewed By: czxttkl

Differential Revision: D38011818

fbshipit-source-id: cb76646ed0e3149887180cbe642b1035afaace9b
Differential Revision: D38447983

fbshipit-source-id: 03d4384d075a57bfbd9a76c23730307fc5255c90
Summary:
Pull Request resolved: facebookresearch#665

We need to unfold embeddings from different sparse features.

Differential Revision: D38556778

fbshipit-source-id: 8eb646105991c0307d981bb3198c48e850cededa
)

Summary:
Pull Request resolved: facebookresearch#666

ReAgent trainer inputs usually inherit from `TensorDataClass`. In GPU training, trainer inputs are moved to CUDA automatically by PyTorch Lightning strategies. We need to make sure `TensorDataClass.to(torch.device("cuda"))` can move all fields to CUDA.

Before this change, a KeyedJaggedTensor in a TensorDataClass could not be moved to CUDA properly.

Reviewed By: alexnikulkov

Differential Revision: D38685842

fbshipit-source-id: 96c35a8171e3249a9cbcdd561f38d64078b3e9a6
Summary:
Pull Request resolved: facebookresearch#664

Implement Reagent DeepRepresentLinUCB.

It is a multi-layer regression model that outputs a UCB score.
The first N layers are trainable by a torch optimizer.
The last layer is the traditional LinUCB layer; it is not updated by the optimizer, but is still updated by matrix computations.

Reviewed By: alexnikulkov

Differential Revision: D38268001

fbshipit-source-id: d739e6af0cfecd891681d1e736d3191547441c92
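A hedged sketch of the architecture described above (class, buffer names, and shapes are illustrative, not the ReAgent implementation):

```python
import torch

class DeepRepresentLinUCBSketch(torch.nn.Module):
    """Trainable MLP layers produce an embedding phi(x); a LinUCB head
    scores it, with its A/b statistics updated by matrix computations
    rather than by the optimizer."""
    def __init__(self, in_dim: int, emb_dim: int, alpha: float = 1.0):
        super().__init__()
        self.mlp = torch.nn.Sequential(torch.nn.Linear(in_dim, emb_dim), torch.nn.ReLU())
        self.alpha = alpha
        # LinUCB statistics: buffers, so the optimizer never touches them.
        self.register_buffer("A", torch.eye(emb_dim))
        self.register_buffer("b", torch.zeros(emb_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        phi = self.mlp(x)                      # (batch, emb_dim)
        inv_A = torch.linalg.inv(self.A)
        theta = inv_A @ self.b                 # LinUCB coefficients
        mean = phi @ theta
        bonus = self.alpha * torch.sqrt((phi @ inv_A * phi).sum(dim=1))
        return mean + bonus                    # UCB score

model = DeepRepresentLinUCBSketch(in_dim=4, emb_dim=8)
print(model(torch.randn(2, 4)).shape)  # torch.Size([2])
```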
Summary:
Pull Request resolved: facebookresearch#668

Previously, if we did `super().__init__(automatic_optimization=automatic_optimization)` (which we'd better do), the CI complained for some unknown reason.

This quick fix uses `**kwargs` to include `automatic_optimization`.

Reviewed By: alexnikulkov

Differential Revision: D38771283

fbshipit-source-id: 89062959ee0de2d2bca9caa961c146d16708a633
…gument in pytorch (facebookresearch#670)

Summary:
Pull Request resolved: facebookresearch#670

See title

Reviewed By: czxttkl

Differential Revision: D38839005

fbshipit-source-id: f720555dc9caf840c9354eca4ef8d86101ae7fe3
Summary:
Pull Request resolved: facebookresearch#669

Add `DeepRepresentLinUCBTrainerParameters`.

Reviewed By: alexnikulkov

Differential Revision: D38813086

fbshipit-source-id: e1cbe01ed100b9a8328fe6cd63e9672e1df269f1
Summary:
Pull Request resolved: facebookresearch#671

CircleCI tests started failing. Example: https://app.circleci.com/pipelines/github/facebookresearch/ReAgent/2490/workflows/efbceb68-9a01-4889-8d82-3d4167af4643/jobs/24780
The root cause is a significant jump in protobuf version from 3.19.4 to 4.21.5, caused by an updated dependency on protobuf in grpcio-tools.
I'm adding an upper limit on the grpcio-tools version to prevent this error.

Reviewed By: BerenLuthien, czxttkl

Differential Revision: D38870120

fbshipit-source-id: 10d2e619aa10c2b03272b8305681792feefd06f4
Summary:
Pull Request resolved: facebookresearch#672

The test was failing in CircleCI: https://app.circleci.com/pipelines/github/facebookresearch/ReAgent/2491/workflows/e153e4b9-795d-4c30-a3ff-6e4411b7dbf2/jobs/24796
I removed unnecessary parts which were breaking the test

Reviewed By: czxttkl

Differential Revision: D38874439

fbshipit-source-id: 74fe760014b5152d8a59f676beb4c8fa96e15798
Summary:
Code design doc:
- First part of https://docs.google.com/document/d/1eiyJpLSiSDufPc4h2nIRrHh4yllBtZ9DEsFaHMtl6GM/edit#

Main difference from existing LinUCB model in the codebase:
- Input is not a batch of data, but a List[batch] of data. Each element of the input is the corresponding batch for an arm.
- Similarly:
  - covariance matrices A and inv_A have dimension (num_arms, d, d)
  - model coef: (num_arms, d)

Reviewed By: alexnikulkov

Differential Revision: D38374089

fbshipit-source-id: c60b990d8cb7d75ab334ceb24beebf4f5b78fc8a
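Under the shape conventions above, a minimal sketch of per-arm statistics with a batched coefficient solve (variable names are illustrative):

```python
import torch

num_arms, d = 3, 4
A = torch.eye(d).expand(num_arms, d, d).clone()  # (num_arms, d, d) covariance
b = torch.zeros(num_arms, d)                      # per-arm reward vectors

# Update arm 0 with a batch of (feature, reward) observations.
x = torch.randn(5, d)
r = torch.randn(5)
A[0] += x.T @ x
b[0] += x.T @ r

# Solve A @ coefs = b for every arm at once (batched over dim 0).
coefs = torch.linalg.solve(A, b.unsqueeze(-1)).squeeze(-1)  # (num_arms, d)
print(coefs.shape)  # torch.Size([3, 4])
```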
Summary:
Pull Request resolved: facebookresearch#661

Make `coefs` a property, instead of an attribute. This property automatically checks if the currently saved coefficient value is valid (computed using the current values of `A` and `b`) and uses it if so. If the current saved coefficient value is invalid, we compute the new value as `coefs = A**(-1) * b` and save it

Reviewed By: czxttkl

Differential Revision: D38283370

fbshipit-source-id: aa78828e6a8cf8a7971ccee734a2cbe71856ccd1
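A minimal sketch of the caching-property pattern described above (not the ReAgent class itself):

```python
import torch

class LinUCBStats:
    """`coefs` is a property: recomputed from A and b only when the
    cached value has been invalidated by an update."""
    def __init__(self, d: int):
        self.A = torch.eye(d)
        self.b = torch.zeros(d)
        self._coefs = None
        self._coefs_valid = False

    def update(self, x: torch.Tensor, r: torch.Tensor) -> None:
        self.A += x.T @ x
        self.b += x.T @ r
        self._coefs_valid = False  # invalidate the cache

    @property
    def coefs(self) -> torch.Tensor:
        if not self._coefs_valid:
            self._coefs = torch.linalg.solve(self.A, self.b)  # A^(-1) b
            self._coefs_valid = True
        return self._coefs

stats = LinUCBStats(d=2)
stats.update(torch.ones(1, 2), torch.ones(1))
print(stats.coefs)  # recomputed once here, then served from cache
```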
Summary:
Pull Request resolved: facebookresearch#662

`predict_ucb` wasn't doing anything that can't be done by `ucb_alpha`, so I removed the `predict_ucb` and simplified the logic

Reviewed By: czxttkl

Differential Revision: D38305542

fbshipit-source-id: 72a4c4e240492c7abf61ece93d1a3725779b50ca
Summary:
Pull Request resolved: facebookresearch#674

FX tracing is a pre-requisite of GMS. The current dense-only RL model is not FX traceable. In this diff the model is tweaked to enable FX tracing without any potential behavior change.

Reviewed By: czxttkl

Differential Revision: D38835020

fbshipit-source-id: 96fd835b44bf42bff42459f0d37654a802bd537d
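A minimal example of what FX-traceable means here (the model is illustrative):

```python
import torch
import torch.fx

class DenseModel(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Plain tensor ops only -- data-dependent control flow
        # would break symbolic tracing.
        return torch.relu(x) * 2.0

graph_module = torch.fx.symbolic_trace(DenseModel())  # raises if not FX-traceable
x = torch.randn(2, 3)
print(torch.equal(graph_module(x), torch.relu(x) * 2.0))  # True
```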
Summary:
The unit tests use torch.randn to generate examples, so they have some randomness and might fail due to floating-point issues. Proposed fixes are:
- set the torch seed
- increase tolerance to 10^-3.

Reviewed By: alexnikulkov

Differential Revision: D38950945

fbshipit-source-id: e300bd601720bd5671b4729e3406dbe7200f08f3
Summary:
Pull Request resolved: facebookresearch#677

Enable distributed training by reducing the values of `A` and `b` across all trainer instances.
For non-distributed training nothing changes, because `sync_ddp_if_available` returns its input unchanged when distributed training isn't used.

Differential Revision: D38294567

fbshipit-source-id: af1973c764f85fbc1c27b0e1395ed726c5193649
…ch#675)

Summary:
Pull Request resolved: facebookresearch#675

Add dtype conversion to reagent CB batch preprocessor
Without this, some elements were `double` instead of `float`.

Reviewed By: czxttkl

Differential Revision: D38807573

fbshipit-source-id: d5120eb0555cba85e5e3a1d9d13ed9c8d43d8363
…cebookresearch#678)

Summary:
Pull Request resolved: facebookresearch#678

Update reagent fully_connected_network.py to comply with fx trace

Reviewed By: czxttkl

Differential Revision: D38748880

fbshipit-source-id: 3f8f5a85504591e0d7c1bbfbb1bff418502d3bb7
Summary:
Pull Request resolved: facebookresearch#676

Interaction features weren't used by any of the first 3 users of LinUCB, so I'm removing them for simplicity

Reviewed By: czxttkl

Differential Revision: D38807570

fbshipit-source-id: 2360fae3e931847e46121cb9d51cad7c25b13f21
Summary: Use the same device for constant tensor as input tensor to avoid concatenating tensors on separate devices

Reviewed By: czxttkl

Differential Revision: D39020305

fbshipit-source-id: 35ad9d41521b92914717c28f6c0eebc106a09abc
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D39020305

@codecov-commenter

codecov-commenter commented Aug 25, 2022

Codecov Report

Merging #679 (c7f673d) into main (e8fe386) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #679   +/-   ##
=======================================
  Coverage   87.48%   87.48%           
=======================================
  Files         366      366           
  Lines       23398    23398           
  Branches       44       44           
=======================================
  Hits        20470    20470           
  Misses       2902     2902           
  Partials       26       26           
Impacted Files                 Coverage Δ
reagent/core/torch_utils.py    92.39% <ø> (ø)


@facebook-github-bot

Hi @clairecyq!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
