Stable-Baselines3 (Claude Skill)
Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with a scikit-learn-like API.
| Type | Claude Skill |
| Supplier | K-Dense Inc. (community OSS) |
| Availability | GA — part of the actively maintained K-Dense scientific-agent-skills collection |
| Pricing | Free / OSS (MIT) |
| Capabilities | Read/Write — Claude runs the skill’s Python locally (Bash), not as an MCP tool |
How to install
- Claude Code / Claude.ai, via the Skills CLI (recommended):
    npx skills add K-Dense-AI/scientific-agent-skills
  Installs the K-Dense collection; enable the stable-baselines3 skill when prompted. Works across Claude Code, Cursor, and Codex via the Agent Skills spec (requires Node ≥ 18).
- Claude Code / Claude Desktop, via manual clone:
    git clone https://github.com/K-Dense-AI/scientific-agent-skills
    cp -r scientific-agent-skills/skills/stable-baselines3 ~/.claude/skills/
  Project-scoped alternative: copy into .claude/skills/ instead of ~/.claude/skills/. The skill declares its own Python dependencies in its SKILL.md; install them (the K-Dense skills generally use uv/pip) when prompted on first use.
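To confirm that a manual copy landed where Claude looks for it, a small check along these lines may help. It is only a sketch: `find_skill` is a hypothetical helper name, and it assumes the skill directory is named stable-baselines3 and contains a SKILL.md, as described above.

```python
from pathlib import Path


def find_skill(name, roots=None):
    """Return each `<root>/<name>` directory that contains a SKILL.md file.

    `roots` defaults to the user-level and project-level skill folders
    mentioned in the install steps (an assumption about your layout).
    """
    if roots is None:
        roots = [
            Path.home() / ".claude" / "skills",  # user-scoped install
            Path.cwd() / ".claude" / "skills",   # project-scoped install
        ]
    found = []
    for root in roots:
        skill_dir = Path(root) / name
        # A skill counts as installed only if its SKILL.md is present.
        if (skill_dir / "SKILL.md").is_file():
            found.append(skill_dir)
    return found
```

Calling `find_skill("stable-baselines3")` returns an empty list when neither location holds the skill, which usually means the `cp -r` step targeted a different directory.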
What it does
Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with a scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.
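As an illustration of the scikit-learn-like workflow, the sketch below trains PPO on a Gymnasium environment and evaluates it. It is not taken from the skill itself: `train_and_evaluate` is a hypothetical wrapper, and the environment ID, timestep budget, and episode count are assumed defaults chosen for a quick run. It requires `stable-baselines3` and `gymnasium` to be installed.

```python
def train_and_evaluate(env_id="CartPole-v1", total_timesteps=10_000,
                       n_eval_episodes=10, seed=0):
    """Train PPO on a single-agent Gymnasium env and return (mean, std) reward."""
    # Imported here so the function can be defined without SB3 installed.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    # Construct the model much like a scikit-learn estimator, then fit it.
    train_env = gym.make(env_id)
    model = PPO("MlpPolicy", train_env, seed=seed, verbose=0)
    model.learn(total_timesteps=total_timesteps)

    # Evaluate on a separate, Monitor-wrapped env so episode returns are
    # recorded from the unmodified environment.
    eval_env = Monitor(gym.make(env_id))
    mean_reward, std_reward = evaluate_policy(
        model, eval_env, n_eval_episodes=n_eval_episodes, deterministic=True
    )

    train_env.close()
    eval_env.close()
    return mean_reward, std_reward


# Example call (trains for a few seconds on CPU):
# mean_reward, std_reward = train_and_evaluate()
```

Swapping in another algorithm from the list (for example `A2C` or `DQN` on a discrete-action environment) should mostly be a matter of changing the imported class, since the algorithms share the same `learn` interface.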
Notes
Distributed as a SKILL.md (plus code examples) in the K-Dense collection — Claude executes it locally via Bash/Python rather than as an MCP server. Upstream license: MIT. The skill name to enable after install is stable-baselines3.