Deprecate seed in favor of random #2888
Conversation
this is orders of magnitude faster if you want to sample only a single item from a list.
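To make the single-item case concrete, here is a small sketch (mine, not from the PR) of three ways to draw one element. `rng.choice` on a plain Python list converts the whole list to an array on every call, which is why the stdlib call, or drawing a single index with `rng.integers`, is far cheaper for a single draw:

```python
import random

import numpy as np

population = list(range(1_000))
rng = np.random.default_rng(42)

# stdlib: direct pick from the list, no array machinery involved
a = random.choice(population)

# numpy: converts the list to an ndarray on every call, which
# dominates the cost when you only need a single item
b = rng.choice(population)

# cheaper numpy route for one item: draw a single index instead
c = population[int(rng.integers(len(population)))]
```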
Performance benchmarks:
As can be seen in the performance benchmarks, shifting from stdlib `random` to `numpy.random.default_rng` comes with a performance hit. I dug into this a bit more. The two main random calls made in these benchmark models are […]. More broadly, […]. Thoughts are welcome: in your opinion, is the improved quality of the random number generators, and the fact that you get local state with numpy, worth the performance hit?
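A minimal way to reproduce the scalar-draw comparison under discussion (my own sketch, not the PR's benchmark code). On most machines the numpy scalar call carries noticeably more per-call overhead, which matches the performance hit reported above, but the exact numbers will vary:

```python
import random
import timeit

import numpy as np

rng = np.random.default_rng(42)

# time 100,000 single scalar draws from each generator
t_stdlib = timeit.timeit(random.random, number=100_000)
t_numpy = timeit.timeit(rng.random, number=100_000)

print(f"stdlib random.random(): {t_stdlib:.4f} s")
print(f"numpy   rng.random():   {t_numpy:.4f} s")
```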
Given how common it is to draw a single random number (i.e., […]), I experimented with generating values in batches and yielding them one at a time:

```python
def test_rand(rng, initial_size=100):
    initial_values = rng.random(size=initial_size)
    while True:
        for entry in initial_values:
            yield entry
        # refill the batch once it is exhausted
        initial_values = rng.random(size=initial_size)


def draw():
    return next(a)


a = test_rand(rng, 200)
```

And then I timed these. So, a naive shift from […]
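The elided timings can be reproduced with something like the following sketch (mine, under the assumption that the comparison is batched draws via the generator versus one-off scalar draws from the same `Generator`):

```python
import timeit

import numpy as np

rng = np.random.default_rng(42)


def batched(rng, size=1_000):
    # yield pre-generated values; refill the batch when exhausted
    while True:
        for entry in rng.random(size=size):
            yield entry


gen = batched(rng)

t_batched = timeit.timeit(lambda: next(gen), number=100_000)
t_scalar = timeit.timeit(rng.random, number=100_000)

print(f"batched draws: {t_batched:.4f} s")
print(f"scalar draws:  {t_scalar:.4f} s")
```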
Performance benchmarks:
I don't know if it's a good idea, but: can you pre-generate an array (say, 100 or 1000 numbers), use those, and generate a new one every time the array is used up? Basically, generate random numbers in batches?
yes, that is exactly what I show here: #2888 (comment)
@quaquel Your stuff is always great and this is no exception. I don't think I have anything substantive to add. From my perspective, it is better to improve the rng even if it has a performance hit. The next question then becomes: are there ways to mitigate the performance hit while ensuring optimal randomness? Two options on the table: […]

I think 1 is the best bet. Subclassing in Python to a generator using C extensions may create unexpected challenges that could have unintended impacts on user simulations. At this point I would recommend 1, but I'm always open to discussion!
Can we add a Model-level keyword argument that defaults to using the NumPy RNG but allows you to use stdlib `random` if you need the performance?
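A minimal sketch of what such a switch could look like; the `use_numpy_rng` name and the single `self.rng` attribute are my assumptions, not anything from the PR. Note that both objects happen to expose a `random()` method, but the rest of their APIs differ (`integers`/`choice` vs `randint`/`choice` with different semantics), so the switch would leak into model code:

```python
import random

import numpy as np


class Model:
    # `use_numpy_rng` is a hypothetical name for the proposed switch
    def __init__(self, seed=None, use_numpy_rng=True):
        if use_numpy_rng:
            self.rng = np.random.default_rng(seed)  # PCG64, local state
        else:
            self.rng = random.Random(seed)  # Mersenne Twister, faster scalars


fast = Model(seed=42, use_numpy_rng=False)
good = Model(seed=42)
```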
I'll try to take another look at this somewhere this week. As a first step, I'll add the wrapper idea to the model class so you can just do […]. @EwoutH, nothing prevents you from using stdlib `random` in your own models in the future. If we decide to make […]
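One way the wrapper idea could look: keep both generators on the model, seeded consistently from one value, so users can pick stdlib for hot scalar draws and numpy where vectorized draws pay off. This is a sketch with assumed attribute names, not the PR's implementation:

```python
import random

import numpy as np


class Model:
    # sketch: both generators seeded from the same value, so runs
    # stay reproducible regardless of which one a model uses
    def __init__(self, seed=None):
        self.random = random.Random(seed)      # stdlib generator
        self.rng = np.random.default_rng(seed)  # numpy generator


m = Model(seed=7)
x = m.random.random()  # stdlib scalar draw
y = m.rng.random()     # numpy scalar draw
```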
Performance benchmarks:
This PR addresses #2884.

It does several things:

- […]
- Raises a `FutureWarning` when `seed` or `random` is used instead of the new preferred `rng` kwarg. I use `FutureWarning` because it is intended for end users (see https://docs.python.org/3/library/warnings.html).

Reminder: we need to fix the `set_seed` widget on the solara side as well!
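A sketch of how the deprecation warning could be raised (the parameter handling is assumed, not copied from the PR). `FutureWarning` is the right category here because, unlike `DeprecationWarning`, it is shown by default to end users rather than only to developers:

```python
import warnings

import numpy as np


class Model:
    def __init__(self, seed=None, rng=None):
        if seed is not None:
            # FutureWarning targets end users; DeprecationWarning would be
            # hidden by default outside __main__ and test runners
            warnings.warn(
                "the `seed` keyword argument is deprecated, use `rng` instead",
                FutureWarning,
                stacklevel=2,
            )
            rng = seed
        self.rng = np.random.default_rng(rng)
```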