Datamol (Claude Skill)

Claude skill providing Python recipes for Datamol, a molecular-manipulation library built on RDKit and geared toward drug-discovery pipelines (standardization, tautomer/stereoisomer enumeration, featurization, and pandas-friendly batch operations).

   
Type: Claude Skill
Supplier: K-Dense Inc. (community OSS)
Availability: GA (actively maintained, 2025–2026)
Pricing: Free / OSS skill (MIT collection); Datamol itself is Apache-2.0
Capabilities: Read/Write (Claude executes Datamol via the Bash/Python tool)

How to install

  • Alternative packaging: the SciAgent-Skills collection (jaechang-hits; community OSS, CC BY 4.0) also includes this skill. Clone jaechang-hits/SciAgent-Skills and run /plugin install sciagent-skills in Claude Code, or copy skills/structural-biology-drug-discovery/datamol-cheminformatics into ~/.claude/skills/.
  • Claude Code / Claude.ai — Skills CLI (recommended):
    npx skills add K-Dense-AI/scientific-agent-skills
    

    Installs the K-Dense collection; enable the datamol skill when prompted (also works in Cursor/Codex via the Agent Skills spec; requires Node ≥ 18).

  • Claude Code / Claude Desktop — manual clone:
    git clone https://github.com/K-Dense-AI/scientific-agent-skills
    cp -r scientific-agent-skills/skills/datamol ~/.claude/skills/
    pip install datamol
    

Project-scoped alternative: copy into .claude/skills/ instead of ~/.claude/skills/.

What it does

SKILL.md with recipes for:

  • Molecular I/O (SMILES / SDF / MOL) and dataframe round-tripping
  • Standardization and sanitization (charge, tautomer, stereochemistry)
  • Molecular transformations (tautomer / stereoisomer enumeration)
  • Featurization — descriptors, fingerprints, and graph representations
  • Parallel processing for large compound libraries
  • Reference datasets bundled in references/reactions_data.md (CDK2 kinase inhibitors, FreeSolv hydration free energies, RDKit solubility train/test splits)

Primary use cases: Compound-library cleanup and standardization for ML pipelines, analog generation in lead optimisation, large-scale molecular preprocessing, similarity searching.

Notes

The skill is documentation plus Python recipes; Claude executes Datamol locally via Bash/Python. It sits alongside the catalog's rdkit-skill page in the same K-Dense marketplace: Datamol layers higher-level workflows (caching, dataframe integration, batch ops) on top of raw RDKit calls.
