Refactor RL system to integrate with existing grSim simulator and reuse shared constants #2498

Draft
Copilot wants to merge 7 commits into ros2 from copilot/develop-reinforcement-learning-system

Copilot AI commented Feb 23, 2026

Description

The RL system was fully standalone: it had its own physics engine and duplicated geometry/physics constants from the C++ codebase. This PR refactors it to use the existing grSim simulator as its physics backend and to source constants from a single shared module that mirrors rj_constants/constants.hpp.

Core changes:

  • constants.py (new) — Single Python source of truth for physical constants, mirroring constants.hpp and network.hpp (robot/ball dimensions, field geometry, sim ports, physics params)
  • sim_client.py (new) — grSim UDP protobuf client using the same RobotControl/SimulatorCommand/SSL_WrapperPacket protocol as sim_radio.cpp
  • config.py — Removed FieldConfig and PhysicsConfig (duplicated geometry); only RL-specific config remains (RewardConfig, TrainingConfig, EnvConfig)
  • env.py — Replaced built-in physics with grSim communication (--use-sim); retains lightweight fallback using shared constants for CI/testing
  • state.py, action.py, reward.py — Now reference constants module instead of duplicated config values
  • proto_gen/ — Generated Python protobuf bindings from existing src/rj_protos/protos/ssl_simulation_*.proto
# Before: duplicated constants
class FieldConfig:
    length: float = 9.0      # duplicates constants.hpp
    robot_radius: float = 0.09  # duplicates constants.hpp

# After: single source of truth
from rj_rl import constants
constants.FIELD_LENGTH      # mirrors rj_constants/constants.hpp
constants.ROBOT_RADIUS      # mirrors rj_constants/constants.hpp
constants.SIM_COMMAND_PORT  # mirrors rj_common/network.hpp
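
The sim_client.py item above describes UDP datagrams carrying serialized protobuf messages. A minimal sketch of the transport layer only might look like the following. The names `UdpSimClient`, `send_command`, and `receive` are hypothetical, the payload is treated as opaque bytes standing in for a serialized command message, and nothing here is taken from the actual sim_client.py in this PR.

```python
import socket
from typing import Optional


class UdpSimClient:
    """Hypothetical UDP transport wrapper; not the PR's actual sim_client.py."""

    def __init__(self, host: str, command_port: int, timeout_s: float = 0.1) -> None:
        # In the PR the port would presumably come from constants.SIM_COMMAND_PORT.
        self._addr = (host, command_port)
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self._sock.settimeout(timeout_s)

    def send_command(self, payload: bytes) -> int:
        """Send one datagram and return the number of bytes sent.

        `payload` stands in for a serialized protobuf command message.
        """
        return self._sock.sendto(payload, self._addr)

    def receive(self, max_bytes: int = 65535) -> Optional[bytes]:
        """Return one reply datagram, or None if none arrives before the timeout."""
        try:
            data, _ = self._sock.recvfrom(max_bytes)
            return data
        except socket.timeout:
            return None

    def close(self) -> None:
        self._sock.close()
```

Returning `None` on timeout rather than raising lets a training loop skip a missed frame without special error handling; whether the real client does this is not stated in the PR.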

Associated Issue

RL system architecture review feedback — eliminate standalone physics and geometry duplication.

Steps to test

  1. cd src/rj_rl && pip install -e . && python -m pytest test/ -v — 52 tests pass
  2. python -m scripts.train --timesteps 500 --seed 42 — fallback mode training works
  3. Start grSim, then python -m scripts.train --timesteps 5000 --use-sim — simulator-backed training

Expected result: All tests pass; training runs in both fallback and grSim modes; no duplicated physics constants remain in config.py.
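
For reference, the three flags used in the steps above (`--timesteps`, `--seed`, `--use-sim`) could be parsed roughly as follows. This is a sketch, not the contents of scripts/train.py: the function name `parse_args`, the defaults, and the help strings are assumptions.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the training flags named in the test steps (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Train the RL agent (illustrative sketch).")
    parser.add_argument("--timesteps", type=int, default=5000,
                        help="total environment steps to train for")
    parser.add_argument("--seed", type=int, default=None,
                        help="random seed for reproducible runs")
    parser.add_argument("--use-sim", action="store_true",
                        help="use grSim as the physics backend instead of the fallback")
    return parser.parse_args(argv)
```

With `store_true`, omitting `--use-sim` selects the fallback mode, which matches step 2 running without a simulator.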


Copilot AI and others added 2 commits February 23, 2026 01:29
Create src/rj_rl/ package with:
- Gym-compatible 2D environment with SSL-like physics
- State encoding for observation space
- Discrete action space (move, shoot, pass, defend, position, idle)
- Configurable reward functions (goals, possession, progress)
- Numpy-based MLP neural network (actor-critic)
- PPO agent with GAE advantage estimation
- Training loop with logging and checkpointing
- Training script entry point
- 50 unit tests (all passing)
- README documentation
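
The commit mentions GAE advantage estimation for the PPO agent. As an illustration of the standard GAE recursion (not the code in this commit, which is described as numpy-based), a pure-Python sketch could look like this. The function name `compute_gae` and the default `gamma` and `lam` values are assumptions.

```python
from typing import List, Sequence


def compute_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float,
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Generalized Advantage Estimation over one rollout.

    values[t] is the critic's estimate for the state at step t, and
    last_value is the estimate for the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk backwards so each step can reuse the accumulated future advantage.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0  # no bootstrapping across episode ends
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages
```

Setting `lam=0` reduces each advantage to the one-step TD error, while `lam=1` gives the full discounted return minus the value baseline.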

Co-authored-by: sanatd33 <53443682+sanatd33@users.noreply.github.com>
Copilot AI changed the title [WIP] Add reinforcement learning system for strategy improvement Add reinforcement learning framework for RoboCup SSL strategy Feb 23, 2026
Copilot AI requested a review from sanatd33 February 23, 2026 01:37
Copilot AI and others added 2 commits February 23, 2026 01:54
- Remove FieldConfig and PhysicsConfig (duplicated geometry/physics)
- Create constants.py mirroring rj_constants/constants.hpp
- Create sim_client.py for grSim UDP protobuf communication
- Refactor env.py to use grSim (with fallback for testing)
- Update state.py, action.py, reward.py to use shared constants
- Generate Python protobuf bindings from existing proto files
- Update all tests (52 passing)
- Update README with grSim integration documentation
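
The "grSim with fallback" arrangement in this commit can be pictured as choosing a step function at environment construction time. The sketch below is an assumption-laden illustration, not the env.py from this PR: `fallback_step`, `make_step_fn`, and the `FIELD_WIDTH` value are hypothetical, and the clamping rule is only one possible lightweight stand-in for real physics.

```python
from typing import Callable, Optional, Tuple

Vec2 = Tuple[float, float]
StepFn = Callable[[Vec2, Vec2, float], Vec2]

FIELD_LENGTH = 9.0   # value shown in the PR description
ROBOT_RADIUS = 0.09  # value shown in the PR description
FIELD_WIDTH = 6.0    # assumed for illustration; not stated in the PR


def fallback_step(pos: Vec2, vel: Vec2, dt: float) -> Vec2:
    """Integrate position for one step and clamp the robot body inside the field."""
    half_x = FIELD_LENGTH / 2 - ROBOT_RADIUS
    half_y = FIELD_WIDTH / 2 - ROBOT_RADIUS
    x = min(max(pos[0] + vel[0] * dt, -half_x), half_x)
    y = min(max(pos[1] + vel[1] * dt, -half_y), half_y)
    return (x, y)


def make_step_fn(use_sim: bool, sim_step: Optional[StepFn] = None) -> StepFn:
    """Pick the simulator-backed step when requested, else the fallback."""
    if use_sim:
        if sim_step is None:
            raise ValueError("use_sim=True requires a simulator step function")
        return sim_step
    return fallback_step
```

Keeping both paths behind the same call signature means tests and CI can run on the fallback without a running simulator.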

Co-authored-by: sanatd33 <53443682+sanatd33@users.noreply.github.com>
…aces

Co-authored-by: sanatd33 <53443682+sanatd33@users.noreply.github.com>
Copilot AI changed the title Add reinforcement learning framework for RoboCup SSL strategy Refactor RL system to integrate with existing grSim simulator and reuse shared constants Feb 23, 2026
Had an AI implement the trained model in modifying our strategy.
automated style fixes

Co-authored-by: CameronLyon <CameronLyon@users.noreply.github.com>
3 participants