@ShindeShivam ShindeShivam commented Sep 6, 2025

Chapter: 18

Cells: Fixed Q-Value Targets, Double DQN, Dueling Double DQN

play_one_step() returns (next_state, reward, done, truncated, info), but the training loops in these cells unpacked it as (obs, reward, done, info, truncated), which swaps truncated and info.

Fixed by using the correct order:

obs, reward, done, truncated, info = play_one_step(env, obs, epsilon)
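
To illustrate why the order matters, here is a minimal sketch. It uses a hypothetical stub, `play_one_step_stub()`, in place of the notebook's play_one_step() (no environment or replay buffer involved), and it assumes the loop checks `done or truncated` to decide when to stop. With the swapped order, the name `truncated` is bound to the info dict, so an empty dict would silently mask a real truncation:

```python
def play_one_step_stub():
    """Hypothetical stand-in for play_one_step(): returns a 5-tuple in the
    order (next_state, reward, done, truncated, info)."""
    next_state = [0.0, 0.0, 0.0, 0.0]
    reward = 1.0
    done = False
    truncated = True   # assume the episode was cut off by a time limit
    info = {}          # assume an empty info dict
    return next_state, reward, done, truncated, info

# Swapped order: `info` receives the bool and `truncated` receives the dict.
obs, reward, done, info, truncated = play_one_step_stub()
wrong_stop = bool(done or truncated)   # {} is falsy, so the truncation is missed

# Correct order, matching the returned tuple.
obs, reward, done, truncated, info = play_one_step_stub()
right_stop = bool(done or truncated)   # the truncation is detected

print(wrong_stop, right_stop)
```

Python raises no error in either case, because both unpackings take five values; only the meaning of the last two names changes.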

@ageron ageron merged commit 8391321 into ageron:main Oct 13, 2025
ageron commented Oct 13, 2025

Great catch @ShindeShivam, thanks a lot for the PR. 👍
