@cmdr64 commented Oct 10, 2023

Added the argument low_cpu_mem_usage=True to the model load, which drastically reduces load time. Before this change, loading Llama took more than a minute; now it loads in seconds.

Tested on an NVIDIA GeForce RTX 4090 Founders Edition.
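
For reviewers who want to reproduce the timing, here is a minimal sketch of the kind of call this change affects. It assumes the model is loaded through Hugging Face Transformers' `from_pretrained` (the actual call site is not shown in this PR) and that the `accelerate` package is installed, which `low_cpu_mem_usage=True` requires. The helper names `load_kwargs` and `timed_load` and the model path in the usage comment are hypothetical, chosen only for illustration.

```python
import time


def load_kwargs(low_cpu_mem_usage: bool = True) -> dict:
    """Build the keyword arguments forwarded to from_pretrained.

    Hypothetical helper: it only isolates the flag this PR adds.
    """
    return {"low_cpu_mem_usage": low_cpu_mem_usage}


def timed_load(model_id: str, **kwargs):
    """Load a causal LM and return (model, seconds elapsed).

    Assumes Hugging Face Transformers is the loader; the import is done
    lazily so this module can be imported without transformers installed.
    """
    from transformers import AutoModelForCausalLM

    start = time.perf_counter()
    model = AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
    return model, time.perf_counter() - start


# Usage (the model path is a placeholder, not taken from this PR):
#   _, slow = timed_load("path/to/llama", **load_kwargs(False))
#   _, fast = timed_load("path/to/llama", **load_kwargs(True))
#   print(f"default: {slow:.1f}s, low_cpu_mem_usage=True: {fast:.1f}s")
```

With the flag set, Transformers skips the random initialization of the weights before the checkpoint is loaded, which is the likely source of the speedup reported above. Actual timings will depend on disk speed, model size, and library versions.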

…mem_usage=True, drastically speeding up load time (cuts more than a minute off of load)
@cmdr64 force-pushed the load-llama-faster branch from 17a1e35 to 5ab62fe on October 11, 2023 00:15