Stable-Baselines3 (Claude Skill)

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with a scikit-learn-like API.


Type	Claude Skill
Supplier	K-Dense Inc. (community OSS)
Availability	GA — part of the actively maintained K-Dense `scientific-agent-skills` collection
Pricing	Free / OSS (MIT)
Capabilities	Read/Write — Claude runs the skill’s Python locally (Bash), not as an MCP tool

How to install

Claude Code / Claude.ai — Skills CLI (recommended):
```
npx skills add K-Dense-AI/scientific-agent-skills
```
Installs the K-Dense collection; enable the stable-baselines3 skill when prompted. Works across Claude Code, Cursor, and Codex via the Agent Skills spec (requires Node ≥ 18).
Claude Code / Claude Desktop — manual clone:
```
git clone https://github.com/K-Dense-AI/scientific-agent-skills
cp -r scientific-agent-skills/skills/stable-baselines3 ~/.claude/skills/
```
Project-scoped alternative: copy into .claude/skills/ instead of ~/.claude/skills/. The skill declares its own Python dependencies in its SKILL.md; install them (the K-Dense skills generally use uv / pip) when prompted on first use.

What it does

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with a scikit-learn-like API. Use it for standard RL experiments, quick prototyping, and well-documented algorithm implementations. It is best suited to single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.
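As a rough illustration of what the scikit-learn-like API looks like, the sketch below constructs a model, calls learn(), and then evaluates it on a Gymnasium environment. It is written against the upstream stable-baselines3 and gymnasium packages, not taken from the skill's own SKILL.md. The helper names (SUPPORTED_ACTIONS, algorithms_for, train_and_evaluate), the CartPole-v1 environment, and the timestep budget are illustrative assumptions. The action-space table reflects the upstream library's documented support for plain discrete and continuous spaces.

```python
# Hypothetical helper table: which listed algorithms handle which action-space kind.
SUPPORTED_ACTIONS = {
    "PPO": {"discrete", "continuous"},
    "A2C": {"discrete", "continuous"},
    "DQN": {"discrete"},
    "SAC": {"continuous"},
    "TD3": {"continuous"},
    "DDPG": {"continuous"},
}


def algorithms_for(action_kind):
    """Return the sorted algorithm names that can handle the given action-space kind."""
    return sorted(name for name, kinds in SUPPORTED_ACTIONS.items() if action_kind in kinds)


def train_and_evaluate(algo_name="PPO", env_id="CartPole-v1",
                       total_timesteps=10_000, n_eval_episodes=5, seed=0):
    """Train one algorithm on a Gymnasium environment; return (model, mean, std)."""
    # Imported here so the helper above stays usable without SB3 installed.
    import gymnasium as gym
    import stable_baselines3 as sb3
    from stable_baselines3.common.evaluation import evaluate_policy

    algo_cls = getattr(sb3, algo_name)  # e.g. sb3.PPO
    env = gym.make(env_id)

    # Construct, then fit: the same two-step shape as a scikit-learn estimator.
    model = algo_cls("MlpPolicy", env, seed=seed, verbose=0)
    model.learn(total_timesteps=total_timesteps)

    # Evaluate on the model's own wrapped environment.
    mean_reward, std_reward = evaluate_policy(
        model, model.get_env(), n_eval_episodes=n_eval_episodes
    )
    env.close()
    return model, mean_reward, std_reward


if __name__ == "__main__":
    print("Discrete-action candidates:", algorithms_for("discrete"))
    _, mean_reward, std_reward = train_and_evaluate()
    print(f"mean reward: {mean_reward:.1f} +/- {std_reward:.1f}")
```

CartPole-v1 has a discrete action space, so SAC, TD3, or DDPG would be rejected there. A continuous-control environment such as Pendulum-v1 would be the usual choice for trying those.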


Notes

Distributed as a SKILL.md (plus code examples) in the K-Dense collection — Claude executes it locally via Bash/Python rather than as an MCP server. Upstream license: MIT. The skill name to enable after install is stable-baselines3.
