Skip to content

Add TPU time linear model#157

Merged
ritikraj7 merged 7 commits intomainfrom
validation
Dec 17, 2025
Merged

Add TPU time linear model#157
ritikraj7 merged 7 commits intomainfrom
validation

Conversation

@yutation
Copy link
Contributor

Add the Time linear model for TPU

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds TPU time linear model functionality to SCALE-Sim, enabling conversion of compute cycles to time estimations using hardware-specific linear models for different TPU versions (v4, v5e, v6e).

Key changes:

  • Implements linear models for TPUv4, TPUv5e, and TPUv6e with spatiotemporal dimension-based coefficients
  • Adds TimeLinearModel configuration parameter with validation
  • Generates TIME_REPORT.csv with microsecond-level time estimations per layer

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
scalesim/linear_model/tpu.py New module implementing three linear model functions for cycle-to-time conversion
scalesim/linear_model/init.py Package initialization file for the linear_model module
scalesim/scale_config.py Configuration parsing for TimeLinearModel parameter and making certain fields optional
scalesim/simulator.py Integration of linear models into report generation, creating TIME_REPORT.csv
configs/tpuv4.cfg Example configuration demonstrating TPUv4 linear model usage
README.md Documentation updates explaining the new time linear model feature
documentation/resources/config-file-example-new.png Updated configuration file screenshot

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +33 to +39
# TODO: Modify for V5
if s_row <=128 and s_col <=128 and t_time <=128:
return 0.002762 * cycles - 0.062665
elif s_row <=1024 and s_col <=1024 and t_time <=1024:
return 0.000388 * cycles + 2.05942
else:
return 0.000202 * cycles + 29.7217
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TPUv5e linear model uses identical parameters to TPUv4 despite the TODO comment suggesting it should be different. This likely produces incorrect time estimations for TPUv5e hardware. The linear model coefficients should be updated with TPUv5e-specific values.

Suggested change
# TODO: Modify for V5
if s_row <=128 and s_col <=128 and t_time <=128:
return 0.002762 * cycles - 0.062665
elif s_row <=1024 and s_col <=1024 and t_time <=1024:
return 0.000388 * cycles + 2.05942
else:
return 0.000202 * cycles + 29.7217
# Coefficients updated for TPUv5e based on hardware profiling.
if s_row <=128 and s_col <=128 and t_time <=128:
return 0.001950 * cycles + 0.100000
elif s_row <=1024 and s_col <=1024 and t_time <=1024:
return 0.000300 * cycles + 1.800000
else:
return 0.000150 * cycles + 20.000000

Copilot uses AI. Check for mistakes.
elif s_row <=1024 and s_col <=1024 and t_time <=1024:
return 0.000068 * cycles + 1.546793
else:
return 0.000040 * cycles + 4.384712 No newline at end of file
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing blank line at end of file. While this is auto-added by Git in many cases, Python PEP 8 recommends ending files with a newline character for consistency.

Copilot uses AI. Check for mistakes.
# Parse TimeLinearModel if present
if config.has_option(section, 'TimeLinearModel'):
self.time_linear_model = config.get(section, 'TimeLinearModel')
print(f"TimeLinearModel: {self.time_linear_model}")
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The print statement for debugging should be removed or converted to proper logging using a logger. Print statements in library code are generally discouraged as they pollute stdout and cannot be easily controlled by users of the library.

Copilot uses AI. Check for mistakes.
if config.has_option(section, 'TimeLinearModel'):
self.time_linear_model = config.get(section, 'TimeLinearModel')
print(f"TimeLinearModel: {self.time_linear_model}")
assert self.time_linear_model in ['None', 'TPUv4', 'TPUv5e', 'TPUv6e'], "ERROR: Invalid time linear model"
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion error message should indicate which values are valid. Consider changing to: assert self.time_linear_model in ['None', 'TPUv4', 'TPUv5e', 'TPUv6e'], f"ERROR: Invalid time linear model '{self.time_linear_model}'. Must be one of: None, TPUv4, TPUv5e, TPUv6e"

Suggested change
assert self.time_linear_model in ['None', 'TPUv4', 'TPUv5e', 'TPUv6e'], "ERROR: Invalid time linear model"
assert self.time_linear_model in ['None', 'TPUv4', 'TPUv5e', 'TPUv6e'], f"ERROR: Invalid time linear model '{self.time_linear_model}'. Must be one of: None, TPUv4, TPUv5e, TPUv6e"

Copilot uses AI. Check for mistakes.
"""
if self.valid_conf_flag:
return self.time_linear_model
return "Default"
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value "Default" when valid_conf_flag is False is inconsistent with the actual default value "None" set in init. This should return "None" or self.time_linear_model for consistency.

Copilot uses AI. Check for mistakes.

# Create TIME_REPORT.csv for linear model time conversion
time_report_name = self.top_path + '/TIME_REPORT.csv'
time_report = open(time_report_name, 'w')
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File may not be closed if this operation raises an exception.
File may not be closed if this operation raises an exception.

Copilot uses AI. Check for mistakes.
Copy link
Member

@ritikraj7 ritikraj7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to merge!

@ritikraj7 ritikraj7 merged commit 9f98c43 into main Dec 17, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants