Upgrade gymnasium to 1.0.0 #502

Draft: wants to merge 10 commits into master

Collaborator

@sdpkjc sdpkjc commented Mar 2, 2025

Description

Upgrade gymnasium to 1.0.0

  • gymnasium classic control

    • c51.py
    • c51_jax.py
    • dqn.py
    • dqn_jax.py
    • ppo.py
    • pqn.py
  • gymnasium mujoco

    • ddpg_continuous_action.py
    • ddpg_continuous_action_jax.py
    • td3_continuous_action.py
    • td3_continuous_action_jax.py
    • sac_continuous_action.py
    • ppo_continuous_action.py
    • rpo_continuous_action.py
  • gymnasium atari (EpisodicLifeEnv conflicts with gymnasium v1.0.0's RecordEpisodeStatistics and will be fixed later.)

    • c51_atari.py
    • c51_atari_jax.py
    • dqn_atari.py
    • dqn_atari_jax.py
    • qdagger_dqn_atari_impalacnn.py
    • qdagger_dqn_atari_jax_impalacnn.py
    • sac_atari.py
    • ppo_atari.py
    • ppo_atari_lstm.py
    • ppo_atari_multigpu.py
  • envpool

    • ppo_rnd_envpool.py
    • pqn_atari_envpool_lstm.py
    • pqn_atari_envpool.py
    • ppo_atari_envpool.py
    • ppo_atari_envpool_xla_jax.py
    • ppo_atari_envpool_xla_jax_scan.py
  • other

    • ppg_procgen.py
    • ppo_pettingzoo_ma_atari.py
    • ppo_procgen.py
    • ppo_trxl.py
    • ppo_continuous_action_isaacgym.py

Types of changes

  • Bug fix
  • New feature
  • New algorithm
  • Documentation

Checklist:

  • I've read the CONTRIBUTION guide (required).
  • I have ensured pre-commit run --all-files passes (required).
  • I have updated the tests accordingly (if applicable).
  • I have updated the documentation and previewed the changes via mkdocs serve.
    • I have explained note-worthy implementation details.
    • I have explained the logged metrics.
    • I have added links to the original paper and related papers.

If you need to run benchmark experiments for performance-impacting changes:

  • I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team.
  • I have used the benchmark utility to submit the tracked experiments to the openrlbenchmark/cleanrl W&B project, optionally with --capture_video.
  • I have performed RLops with python -m openrlbenchmark.rlops.
    • For new feature or bug fix:
      • I have used the RLops utility to understand the performance impact of the changes and confirmed there is no regression.
    • For new algorithm:
      • I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
    • I have added the learning curves generated by the python -m openrlbenchmark.rlops utility to the documentation.
    • I have added links to the tracked experiments in W&B, generated by python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.

vercel bot commented Mar 2, 2025

cleanrl: ✅ Ready (updated Mar 4, 2025 9:24am UTC)

Collaborator

@pseudo-rnd-thoughts pseudo-rnd-thoughts left a comment

Thanks for starting this @sdpkjc

Ale-py should be updated to v0.10.1

The autoreset mode of the vector environments should also be updated.
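
For readers unfamiliar with the autoreset change, here is a minimal sketch of the two behaviours under discussion, using a single hypothetical toy environment rather than gymnasium itself. `CounterEnv`, `rollout_next_step`, `rollout_same_step`, and the `final_obs` info key are illustrative names (assumptions), not code from this PR. As far as I understand, gymnasium v1.0.0 vector environments reset a finished sub-environment on the following `step` call (next-step), whereas earlier versions reset within the same `step` call and report the terminal observation through `info` (same-step).

```python
class CounterEnv:
    """Hypothetical toy env: the observation is the step count, episode ends at `horizon`."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self):
        self.t += 1
        return self.t, self.t >= self.horizon  # (observation, done)


def rollout_next_step(env, n_steps):
    """Next-step autoreset: the terminal obs is returned as-is; the reset happens on the next call."""
    env.reset()
    needs_reset = False
    trace = []
    for _ in range(n_steps):
        if needs_reset:
            # This call only resets; no environment transition is taken.
            obs, done = env.reset(), False
            needs_reset = False
        else:
            obs, done = env.step()
            needs_reset = done
        trace.append((obs, done))
    return trace


def rollout_same_step(env, n_steps):
    """Same-step autoreset: reset immediately and stash the terminal obs in `info`."""
    env.reset()
    trace = []
    for _ in range(n_steps):
        obs, done = env.step()
        info = {}
        if done:
            info["final_obs"] = obs  # illustrative key name (assumption)
            obs = env.reset()
        trace.append((obs, done, info))
    return trace


next_trace = rollout_next_step(CounterEnv(horizon=3), 5)
same_trace = rollout_same_step(CounterEnv(horizon=3), 5)
print(next_trace)
print(same_trace)
```

The practical consequence for training loops is that, in next-step mode, the transition recorded right after a `done` is a reset rather than a real step, so buffer and bootstrap logic written for same-step semantics needs to account for it.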

pyproject.toml Outdated
-stable-baselines3 = "2.0.0"
-gymnasium = ">=0.28.1"
+stable-baselines3 = ">=2.4.0"
+gymnasium = ">=1.0.0"
Collaborator

Can this be specified as v1.1.0? If sb3 is the limitation, I can see whether I can update it.

Collaborator Author

Yes, sb3 depends on gymnasium <=1.0.0.

Collaborator

pseudo-rnd-thoughts commented Mar 3, 2025

If I remember correctly, SB3 is used for the replay buffer and the Atari wrappers. IMO, those features can probably be moved into cleanrl_utils directly; however, this should happen in a separate, later PR.

@sdpkjc changed the title from "Upgrade gymnasium to 1.1.0" to "Upgrade gymnasium to 1.0.0" on Mar 4, 2025


# Only for gymnasium v1.0.0
class SameModelSyncVectorEnv(gym.vector.SyncVectorEnv):
Collaborator

This should probably be called SameStepModeSyncVectorEnv or we just shift to gymnasium v1.1.0
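
To illustrate the "shift to gymnasium v1.1.0" option, here is a hedged sketch of how a script might request same-step autoreset directly instead of subclassing `SyncVectorEnv`. It assumes that gymnasium v1.1.0 exposes `gym.vector.AutoresetMode` and an `autoreset_mode` argument on `SyncVectorEnv`, which matches my reading of that release. The helper names `parse_version`, `supports_autoreset_mode`, and `make_same_step_envs` are hypothetical and not part of this PR.

```python
import re


def parse_version(version):
    """Parse up to three leading numeric components, e.g. '1.1.0' -> (1, 1, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def supports_autoreset_mode(gymnasium_version):
    """Assumption: the `autoreset_mode` argument is available from gymnasium v1.1.0 onwards."""
    return parse_version(gymnasium_version) >= (1, 1)


def make_same_step_envs(env_id, num_envs):
    """Build a SyncVectorEnv that resets finished sub-environments within the same step."""
    import gymnasium as gym  # imported lazily so the helpers above work without gymnasium

    if not supports_autoreset_mode(gym.__version__):
        raise RuntimeError(
            f"gymnasium {gym.__version__} has no autoreset_mode argument; "
            "a custom SyncVectorEnv subclass would be needed instead"
        )
    env_fns = [lambda: gym.make(env_id) for _ in range(num_envs)]
    return gym.vector.SyncVectorEnv(
        env_fns, autoreset_mode=gym.vector.AutoresetMode.SAME_STEP
    )
```

With gymnasium v1.1.0 or later installed, `make_same_step_envs("CartPole-v1", 4)` would be the intended call; on v1.0.0 it raises, which is the case the class under review is meant to cover.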

@@ -12,13 +12,13 @@ license="MIT"
 readme = "README.md"
 
 [tool.poetry.dependencies]
-python = ">=3.8,<3.11"
+python = ">=3.9,<3.11"
Collaborator

What is limiting us from increasing this?

 tensorboard = "^2.10.0"
 wandb = "^0.13.11"
 gym = "0.23.1"
 torch = ">=1.12.1"
-stable-baselines3 = "2.0.0"
-gymnasium = ">=0.28.1"
+stable-baselines3 = "^2.4.0"
Collaborator

I suspect we will have a new release of sb3 with support for gymnasium v1.1.0 as no changes seem to be required on their end (DLR-RM/stable-baselines3#2095)

2 participants