Skip to content

Add TBE data configuration reporter to TBE forward (v3) (#4455) #4672

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

gchalump
Copy link
Contributor

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/1516

Re-land attempt of D75462895

Add TBE data configuration reporter to TBE forward call.

The reporter reports TBE data configuration at the SplitTableBatchedEmbeddingBagsCodegen forward call. The output is a TBEDataConfig object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

Just Knobs for enablement

Environment Variables


The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  • FBGEMM_REPORT_INPUT_PARAMS_INTERVAL:

    • Description: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
    • Example Value: 1 (report every iteration)
  • FBGEMM_REPORT_INPUT_PARAMS_ITER_START:

    • *Description: Specifies the start of the iteration range to capture reports. Default 0.
    • *Example Value: 0 (start reporting from the first iteration)
-   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
  -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
  -   ***Example Value**: `-1` (report until the last iteration)
  • FBGEMM_REPORT_INPUT_PARAMS_BUCKET:

    • Description: Specifies the name of the Manifold bucket where the report data will be saved.
    • Example Value: tlparse_reports
  • FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX:

    • Description: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    • Example Value: tree/tests/

Use Cases

  • FileStore
    • General
      • Auto-create output directories if not exist.
    • fb-internal:
      • Only export to manifold.
      • Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
    • OSS
      • Will use local FileStore to store the output

Example Usage


Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2

Explanation

The above setting will report iter 3 and iter 5

  • FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2: The reporter will generate a report every 2 iterations.
  • FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0: The reporter will start generating reports from the first iteration.
  • FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default): The reporter will continue to generate reports until the last iteration interval.
  • FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports: The reports will be saved in the tlparse_reports bucket.
  • FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/: The reports will be stored with the path prefix tree/tests/. For Manifold make sure all folders within the path exist.

Note on Benchmark example

Note that with the --iters 2 option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.



Other includes changes in this Diff:

  • Updates build dependency of tbe_data_config* files
  • Remove shutil and numpy.random lib as it cause uncompatiblity error.
  • Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603

Copy link

netlify bot commented Aug 11, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 02d1bbc
🔍 Latest deploy log https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68a57fd534c95f0008d09e2d
😎 Deploy Preview https://deploy-preview-4672--pytorch-fbgemm-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@meta-cla meta-cla bot added the cla signed label Aug 11, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 12, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 12, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 12, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 12, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 14, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 14, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 14, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 14, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 18, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 18, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

@gchalump gchalump force-pushed the export-D79758603 branch 2 times, most recently from 0cdeb28 to 357780d Compare August 19, 2025 14:45
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 19, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 19, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 19, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@gchalump gchalump force-pushed the export-D79758603 branch 2 times, most recently from 2e1a09e to 6fd52d6 Compare August 19, 2025 18:30
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 19, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 19, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 19, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 19, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
Summary:
X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

The diff is splitted between benchmark related with necessary refactoring from the integration of the reporter to tbe.
This diff contains changes on the benchmark related with necessary refactoring.
 ---
## Changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold
  - Adding just knob for enablement

Differential Revision: D76992907

Reviewed By: spcyppt
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 20, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 20, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703


X-link: facebookresearch/FBGEMM#1516


Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```


## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output


## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.


---
---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

gchalump added a commit to gchalump/FBGEMM that referenced this pull request Aug 20, 2025
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
Summary:
X-link: facebookresearch/FBGEMM#1703

Pull Request resolved: pytorch#4672

X-link: facebookresearch/FBGEMM#1516

Pull Request resolved: pytorch#4455

Re-land attempt of D75462895

# Add TBE data configuration reporter to TBE forward call.
The reporter reports TBE data configuration at the `SplitTableBatchedEmbeddingBagsCodegen` ***forward*** call. The output is a `TBEDataConfig` object, which is written to a JSON file(s). The configuration of its environment variables and an example of its usage is described below.

## Just Knobs for enablement
 - fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
    - Default is set to `False`, enable this flag to enable reporter.
    - To enable it locally use:
       ```
       jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
       ```

## Environment Variables
---------------------

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

  -  **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL**:
      -  **Description**: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
      -   **Example Value**: `1` (report every iteration)

   -  **FBGEMM_REPORT_INPUT_PARAMS_ITER_START**:
      -   ***Description**: Specifies the start of the iteration range to capture reports. Default 0.
      -   ***Example Value**: `0` (start reporting from the first iteration)

    -   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END**:
      -   ***Description**: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
      -   ***Example Value**: `-1` (report until the last iteration)

-   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET**:
    *   **Description**: Specifies the name of the Manifold bucket where the report data will be saved.
    *   **Example Value**: `tlparse_reports`

-  **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX**:
    -  **Description**: Defines the path prefix where the report files will be stored. Path will be created if not exist.
    -   **Example Value**: `tree/tests/`

## Use Cases
- FileStore
   - General
       - Auto-create output directories if not exist.
   - fb-internal:
     - Only export to manifold.
     - Assert error, if the flag is set but failed to initialize manifold connection. (missing backend or manifold bucket is not exist)
  - OSS
    - Will use local FileStore to store the output

## Example Usage
-------------

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

```
FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2  FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2
```

**Explanation**

The above setting will report `iter 3` and `iter 5`

*   **FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2**: The reporter will generate a report every 2 iterations.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_START=0**: The reporter will start generating reports from the first iteration.
*   **FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default)**: The reporter will continue to generate reports until the last iteration interval.
*   **FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports**: The reports will be saved in the `tlparse_reports` bucket.
*   **FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/**: The reports will be stored with the path prefix `tree/tests/`. For Manifold make sure all folders within the path exist.

**Note on Benchmark example**

Note that with the `--iters 2` option, the benchmark will execute 6 forward calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration starts from 0.

 ---
 ---
## Other includes changes in this Diff:
  - Updates build dependency of tbe_data_config* files
  - Remove `shutil` and `numpy.random`  lib as it cause uncompatiblity error.
  - Add non-OSS test, writing extracted config data json file to Manifold

Differential Revision: D79758603
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D79758603

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants