pragmatic_policy 2025-06-10 23:16:10
```
/atropos-rock-paper-scissors-main.zip
```

repository structure:

```
atropos-rock-paper-scissors-main/
├── atropos/
│   ├── __init__.py
│   └── env.py
├── atropos_rock_paper_scissors/
│   ├── __init__.py
│   └── env.py
├── pyproject.toml
└── readme.md
```

### `pyproject.toml`
```toml
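[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "atropos-rock-paper-scissors"
version = "0.1.0"
description = "a rock-paper-scissors environment for the atropos rl library."
dependencies = [
    "gymnasium>=0.29.0",
    "numpy>=1.22.0",
]

# two top-level packages in a flat layout must be listed explicitly for setuptools
[tool.setuptools]
packages = ["atropos", "atropos_rock_paper_scissors"]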
```

### `readme.md`
````markdown
install the project and its dependencies:

```bash
git clone https://your-repo-url/atropos-rock-paper-scissors.git
cd atropos-rock-paper-scissors
pip install -e .
```

use the environment:

```python
from atropos_rock_paper_scissors.env import rockpaperscissorenv

env = rockpaperscissorenv()
observation, info = env.reset()

for _ in range(5):
    random_action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(random_action)
    print(f"reward: {reward}")
    # every round ends the episode, so reset before playing again
    if terminated or truncated:
        obs, info = env.reset()
```
````

### `atropos/env.py`
```python
import gymnasium as gym
# alias the gymnasium base class so environments can import it from atropos
baseenv = gym.Env
```

### `atropos_rock_paper_scissors/env.py`
```python
from gymnasium import spaces

from atropos.env import baseenv

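class rockpaperscissorenv(baseenv):
    """single-round rock-paper-scissors against a uniformly random opponent."""

    metadata = {}

    def __init__(self):
        super().__init__()
        # three possible moves, encoded as 0, 1, 2
        self.action_space = spaces.Discrete(3)
        # observation is the opponent's last move; 3 means "no move yet"
        self.observation_space = spaces.Discrete(4)
        self._opponent_last_action = 3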

    def reset(self, seed=None, options=None):
        # seeds self.np_random when a seed is given
        super().reset(seed=seed)
        self._opponent_last_action = 3
        return self._get_obs(), self._get_info()

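    def step(self, action):
        # draw from the env's seeded generator so reset(seed=...) is reproducible;
        # action_space.sample() uses a separate generator that reset() does not seed
        opponent_action = int(self.np_random.integers(3))
        if action == opponent_action:
            reward = 0  # draw
        elif (action - opponent_action) % 3 == 1:
            reward = 1  # the agent's move beats the opponent's
        else:
            reward = -1  # the opponent's move beats the agent's
        self._opponent_last_action = opponent_action
        # one round per episode
        terminated = True
        return self._get_obs(), reward, terminated, False, self._get_info()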

    def _get_obs(self):
        return self._opponent_last_action

    def _get_info(self):
        return {"opponent_action": self._opponent_last_action}

    def close(self):
        pass
```
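
the modular reward rule in `step` can be checked on its own. below is a minimal standalone sketch, not part of the repository: `rps_reward`, `payoff_matrix`, and `MOVES` are hypothetical names, and the encoding 0 = rock, 1 = paper, 2 = scissors is an assumption, since the source only fixes the `(action - opponent_action) % 3 == 1` win condition.

```python
# standalone sketch of the reward rule used in `step`; not part of the repository.
# assumed encoding (not stated in the source): 0 = rock, 1 = paper, 2 = scissors.
MOVES = ("rock", "paper", "scissors")


def rps_reward(action: int, opponent_action: int) -> int:
    """Return +1 for a win, 0 for a draw, -1 for a loss, from the agent's view."""
    if action == opponent_action:
        return 0
    # a move wins when it is exactly one step ahead of the opponent's move, modulo 3
    return 1 if (action - opponent_action) % 3 == 1 else -1


def payoff_matrix() -> list[list[int]]:
    """Rows are the agent's move, columns are the opponent's move."""
    return [[rps_reward(a, o) for o in range(3)] for a in range(3)]


if __name__ == "__main__":
    for move, row in zip(MOVES, payoff_matrix()):
        print(f"{move:>8}: {row}")
```

each row of the matrix sums to zero, so against a uniformly random opponent every fixed action has an expected reward of 0.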